Command that produces this log:

python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4

Initialized slot model with checkpoint at logs/slot/slot-model.mdl.lang-chinese
----------------------------------------------------------------------------------------------------
> trainable params:

Embeddings:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024])
>>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024])
>>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024])
>>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024])
>>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024])

Encoder (layers 0-23; all 24 layers repeat the same shapes, shown once here for layer N):
>>> xlmr.encoder.layer.N.attention.self.query.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.query.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.self.key.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.key.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.self.value.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.value.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.dense.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.intermediate.dense.weight: torch.Size([4096, 1024])
>>> xlmr.encoder.layer.N.intermediate.dense.bias: torch.Size([4096])
>>> xlmr.encoder.layer.N.output.dense.weight: torch.Size([1024, 4096])
>>> xlmr.encoder.layer.N.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.N.output.LayerNorm.bias: torch.Size([1024])

Pooler:
>>> xlmr.pooler.dense.weight: torch.Size([1024, 1024])
>>> xlmr.pooler.dense.bias: torch.Size([1024])

GCN (each of T_T, T_E, E_T, E_E has three sub-layers 0-2 with identical shapes):
>>> basic_gcn.{T_T,T_E,E_T,E_E}.{0,1,2}.weight: torch.Size([1024, 1024])
>>> basic_gcn.{T_T,T_E,E_T,E_E}.{0,1,2}.bias: torch.Size([1024])
>>> basic_gcn.f_t.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_t.0.bias: torch.Size([1024])
>>> basic_gcn.f_e.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_e.0.bias: torch.Size([1024])

Slot classifiers (one two-layer FFN per slot; every head repeats the same shapes, shown once for <slot>):
>>> name2classifier.<slot>-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.<slot>-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.<slot>-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.<slot>-ffn.layers.1.bias: torch.Size([2])

Slots: occupy, outcome, protest-event, when, where, who, protest-against, protest-for, organizer, wounded, arrested, killed, imprisoned, corrupt-event, judicial-actions, charged-with, prison-term, fine, npi-events.

(The log is truncated after name2classifier.npi-events-ffn.layers.0.weight.)
>>> name2classifier.npi-events-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.npi-events-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.npi-events-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.disease-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.disease-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.disease-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.disease-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.infected-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.infected-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.infected-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.infected-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.killed-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.killed-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.killed-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.killed-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.killed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.killed-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.killed-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.killed-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.outbreak-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.outbreak-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.outbreak-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.outbreak-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.killed-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.killed-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.killed-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.killed-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.hospitalized-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.hospitalized-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.hospitalized-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.hospitalized-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.hospitalized-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.hospitalized-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.hospitalized-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.hospitalized-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.infected-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.infected-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.infected-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.infected-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.tested-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.tested-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.tested-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.tested-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.infected-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.infected-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.infected-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.infected-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.vaccinated-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.vaccinated-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.vaccinated-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.vaccinated-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.exposed-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.exposed-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.exposed-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.exposed-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.recovered-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.recovered-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.recovered-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.recovered-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.tested-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.tested-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.tested-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.tested-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.tested-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.tested-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.tested-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.tested-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.recovered-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.recovered-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.recovered-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.recovered-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.exposed-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.exposed-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.exposed-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.exposed-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.vaccinated-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.vaccinated-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.vaccinated-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.vaccinated-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.vaccinated-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.vaccinated-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.vaccinated-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.vaccinated-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.exposed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.exposed-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.exposed-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.exposed-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.hospitalized-cumulative-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.hospitalized-cumulative-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.hospitalized-cumulative-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.hospitalized-cumulative-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.recovered-individuals-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.recovered-individuals-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.recovered-individuals-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.recovered-individuals-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.blamed-by-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.blamed-by-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.blamed-by-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.blamed-by-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.claimed-by-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.claimed-by-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.claimed-by-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.claimed-by-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.terror-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.terror-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.terror-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.terror-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.kidnapped-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.kidnapped-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.kidnapped-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.kidnapped-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.named-perp-org-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.named-perp-org-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.named-perp-org-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.named-perp-org-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.target-physical-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.target-physical-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.target-physical-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.target-physical-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.named-perp-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.named-perp-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.named-perp-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.named-perp-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.perp-killed-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.perp-killed-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.perp-killed-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.perp-killed-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.target-human-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.target-human-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.target-human-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.target-human-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.perp-captured-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.perp-captured-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.perp-captured-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.perp-captured-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.perp-objective-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.perp-objective-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.perp-objective-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.perp-objective-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.weapon-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.weapon-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.weapon-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.weapon-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.named-organizer-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.named-organizer-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.named-organizer-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.named-organizer-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.affected-cumulative-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.affected-cumulative-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.affected-cumulative-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.affected-cumulative-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.damage-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.damage-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.damage-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.damage-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.human-displacement-events-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.human-displacement-events-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.human-displacement-events-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.human-displacement-events-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.major-disaster-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.major-disaster-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.major-disaster-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.major-disaster-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.related-natural-phenomena-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.related-natural-phenomena-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.related-natural-phenomena-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.related-natural-phenomena-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.responders-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.responders-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.responders-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.responders-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.assistance-provided-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.assistance-provided-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.assistance-provided-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.assistance-provided-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.rescue-events-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.rescue-events-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.rescue-events-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.rescue-events-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.individuals-affected-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.individuals-affected-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.individuals-affected-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.individuals-affected-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.missing-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.missing-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.missing-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.missing-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.injured-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.injured-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.injured-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.injured-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.assistance-needed-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.assistance-needed-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.assistance-needed-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.assistance-needed-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.rescued-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.rescued-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.rescued-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.rescued-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.repair-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.repair-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.repair-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.repair-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.declare-emergency-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.declare-emergency-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.declare-emergency-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.declare-emergency-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.announce-disaster-warnings-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.announce-disaster-warnings-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.announce-disaster-warnings-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.announce-disaster-warnings-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.disease-outbreak-events-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.disease-outbreak-events-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.disease-outbreak-events-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.disease-outbreak-events-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.current-location-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.current-location-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.current-location-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.current-location-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.event-or-soa-at-origin-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.event-or-soa-at-origin-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.event-or-soa-at-origin-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.event-or-soa-at-origin-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.group-identity-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.group-identity-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.group-identity-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.group-identity-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.human-displacement-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.human-displacement-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.human-displacement-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.human-displacement-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.origin-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.origin-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.origin-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.origin-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.total-displaced-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.total-displaced-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.total-displaced-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.total-displaced-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.transitory-events-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.transitory-events-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.transitory-events-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.transitory-events-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.destination-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.destination-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.destination-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.destination-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.transiting-location-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.transiting-location-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.transiting-location-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.transiting-location-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.detained-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.detained-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.detained-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.detained-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.blocked-migration-count-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.blocked-migration-count-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.blocked-migration-count-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.blocked-migration-count-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.cybercrime-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.cybercrime-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.cybercrime-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.cybercrime-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.perpetrator-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.perpetrator-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.perpetrator-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.perpetrator-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.victim-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.victim-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.victim-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.victim-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.response-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.response-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.response-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.response-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.information-stolen-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.information-stolen-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.information-stolen-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.information-stolen-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.related-crimes-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.related-crimes-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.related-crimes-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.related-crimes-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.victim-impact-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.victim-impact-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.victim-impact-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.victim-impact-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.contract-amount-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.contract-amount-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.contract-amount-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.contract-amount-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.etip-event-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.etip-event-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.etip-event-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.etip-event-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.project-location-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.project-location-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.project-location-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.project-location-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.project-name-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.project-name-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.project-name-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.project-name-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.signatories-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.signatories-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.signatories-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.signatories-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.contract-awardee-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.contract-awardee-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.contract-awardee-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.contract-awardee-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.overall-project-value-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.overall-project-value-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.overall-project-value-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.overall-project-value-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.funding-amount-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.funding-amount-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.funding-amount-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.funding-amount-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.funding-recipient-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.funding-recipient-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.funding-recipient-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.funding-recipient-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.funding-source-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.funding-source-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.funding-source-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.funding-source-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.contract-awarder-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.contract-awarder-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.contract-awarder-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.contract-awarder-ffn.layers.1.bias: torch.Size([2])
>>> name2classifier.agreement-length-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.agreement-length-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.agreement-length-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.agreement-length-ffn.layers.1.bias: torch.Size([2])
>>> irrealis_classifier.layers.0.weight: torch.Size([350, 1128])
>>> irrealis_classifier.layers.0.bias: torch.Size([350])
>>> irrealis_classifier.layers.1.weight: torch.Size([7, 350])
>>> irrealis_classifier.layers.1.bias: torch.Size([7])
n_trainable_params: 614103147, n_nontrainable_params: 0
----------------------------------------------------------------------------------------------------
****************************** Epoch: 0
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:27:15.355141: step: 2/463, loss: 0.7110377550125122
2023-01-22 14:27:15.989436: step: 4/463, loss: 1.9634547233581543
2023-01-22 14:27:16.636335: step: 6/463, loss: 1.5215129852294922
2023-01-22 14:27:17.223242: step: 8/463, loss: 0.5526178479194641
2023-01-22 14:27:17.842902: step: 10/463, loss: 1.2928082942962646
2023-01-22 14:27:18.447579: step: 12/463, loss: 0.5128095149993896
2023-01-22 14:27:19.121427: step: 14/463, loss: 0.3281662166118622
2023-01-22 14:27:19.810715: step: 16/463, loss: 0.4295336902141571
2023-01-22 14:27:20.399208: step: 18/463, loss: 0.30979084968566895
2023-01-22 14:27:21.122020: step: 20/463, loss: 0.4251997768878937
2023-01-22 14:27:21.716456: step: 22/463, loss: 0.26424384117126465
2023-01-22 14:27:22.334817: step: 24/463, loss: 0.5383137464523315
2023-01-22 14:27:22.945994: step: 26/463, loss: 0.32456859946250916
2023-01-22 14:27:23.611610: step: 28/463, loss: 2.2164814472198486
2023-01-22 14:27:24.231932: step: 30/463, loss: 0.5305370092391968
2023-01-22 14:27:24.846273: step: 32/463, loss: 0.5779165029525757
2023-01-22 14:27:25.420754: step: 34/463, loss: 0.24728155136108398
2023-01-22 14:27:26.034734: step: 36/463, loss: 0.17951759696006775
2023-01-22 14:27:26.625656: step: 38/463, loss: 1.298595666885376
2023-01-22 14:27:27.284742: step: 40/463, loss: 1.1850732564926147
2023-01-22 14:27:27.847737: step: 42/463, loss: 0.45125308632850647
2023-01-22 14:27:28.449211: step: 44/463, loss: 1.7101916074752808
2023-01-22 14:27:29.133104: step: 46/463, loss: 0.7005334496498108
2023-01-22 14:27:29.757962: step: 48/463, loss: 0.7047523856163025
2023-01-22 14:27:30.323839: step: 50/463, loss: 0.24437357485294342
2023-01-22 14:27:30.912355: step: 52/463, loss: 0.48864373564720154
2023-01-22 14:27:31.534663: step: 54/463, loss: 0.4767221212387085
2023-01-22 14:27:32.178646: step: 56/463, loss: 0.524865984916687
2023-01-22 14:27:32.835252: step: 58/463, loss: 0.6094954609870911
2023-01-22 14:27:33.462646: step: 60/463, loss: 1.1059892177581787
2023-01-22 14:27:34.097705: step: 62/463, loss: 0.4665301442146301
2023-01-22 14:27:34.694685: step: 64/463, loss: 0.48152926564216614
2023-01-22 14:27:35.300640: step: 66/463, loss: 0.25296494364738464
2023-01-22 14:27:35.941907: step: 68/463, loss: 1.4228724241256714
2023-01-22 14:27:36.541225: step: 70/463, loss: 1.5477015972137451
2023-01-22 14:27:37.229155: step: 72/463, loss: 2.4783730506896973
2023-01-22 14:27:37.866836: step: 74/463, loss: 1.2154754400253296
2023-01-22 14:27:38.493926: step: 76/463, loss: 1.2590984106063843
2023-01-22 14:27:39.187405: step: 78/463, loss: 4.090399265289307
2023-01-22 14:27:39.845984: step: 80/463, loss: 0.8656907677650452
2023-01-22 14:27:40.476581: step: 82/463, loss: 0.9034408926963806
2023-01-22 14:27:41.034191: step: 84/463, loss: 0.6368462443351746
2023-01-22 14:27:41.659086: step: 86/463, loss: 2.118135929107666
2023-01-22 14:27:42.289133: step: 88/463, loss: 1.1363718509674072
2023-01-22 14:27:42.921633: step: 90/463, loss: 0.2635005712509155
2023-01-22 14:27:43.577653: step: 92/463, loss: 0.9445956945419312
2023-01-22 14:27:44.181471: step: 94/463, loss: 1.0266847610473633
2023-01-22 14:27:44.732370: step: 96/463, loss: 0.49595534801483154
2023-01-22 14:27:45.347412: step: 98/463, loss: 0.7541561722755432
2023-01-22 14:27:45.963014: step: 100/463, loss: 0.25146907567977905
2023-01-22 14:27:46.582545: step: 102/463, loss: 0.7391803860664368
2023-01-22 14:27:47.191605: step: 104/463, loss: 0.2965460419654846
2023-01-22 14:27:47.937273: step: 106/463, loss: 1.5200207233428955
2023-01-22 14:27:48.636167: step: 108/463, loss: 0.4561516344547272
2023-01-22 14:27:49.264528: step: 110/463, loss: 0.6982751488685608
2023-01-22 14:27:49.934245: step: 112/463, loss: 0.9412851929664612
2023-01-22 14:27:50.512050: step: 114/463, loss: 0.47166991233825684
2023-01-22 14:27:51.088031: step: 116/463, loss: 0.6609206795692444
2023-01-22 14:27:51.661534: step: 118/463, loss: 0.3279365301132202
2023-01-22 14:27:52.263754: step: 120/463, loss: 0.19368818402290344
2023-01-22 14:27:52.893034: step: 122/463, loss: 0.6561187505722046
2023-01-22 14:27:53.538081: step: 124/463, loss: 0.48703479766845703
2023-01-22 14:27:54.172946: step: 126/463, loss: 0.5891143083572388
2023-01-22 14:27:54.799737: step: 128/463, loss: 0.24704664945602417
2023-01-22 14:27:55.406012: step: 130/463, loss: 0.5358189940452576
2023-01-22 14:27:55.966194: step: 132/463, loss: 1.0203373432159424
2023-01-22 14:27:56.610575: step: 134/463, loss: 0.8915371894836426
2023-01-22 14:27:57.258251: step: 136/463, loss: 0.43279144167900085
2023-01-22 14:27:57.873770: step: 138/463, loss: 7.46989631652832
2023-01-22 14:27:58.482766: step: 140/463, loss: 0.26247867941856384
2023-01-22 14:27:59.082259: step: 142/463, loss: 0.28327253460884094
2023-01-22 14:27:59.681113: step: 144/463, loss: 0.6575441360473633
2023-01-22 14:28:00.289720: step: 146/463, loss: 0.36621129512786865
2023-01-22 14:28:00.867882: step: 148/463, loss: 0.5917226076126099
2023-01-22 14:28:01.486598: step: 150/463, loss: 1.09660005569458
2023-01-22 14:28:02.126074: step: 152/463, loss: 1.1894365549087524
2023-01-22 14:28:02.795782: step: 154/463, loss: 4.09426212310791
2023-01-22 14:28:03.406863: step: 156/463, loss: 0.6460101008415222
2023-01-22 14:28:04.059969: step: 158/463, loss: 0.8001412749290466
2023-01-22 14:28:04.677761: step: 160/463, loss: 0.585970938205719
2023-01-22 14:28:05.316544: step: 162/463, loss: 0.9949851036071777
2023-01-22 14:28:05.916313: step: 164/463, loss: 0.2851049602031708
2023-01-22 14:28:06.560350: step: 166/463, loss: 1.9504828453063965
2023-01-22 14:28:07.200426: step: 168/463, loss: 0.4471861720085144
2023-01-22 14:28:07.788726: step: 170/463, loss: 0.688849151134491
2023-01-22 14:28:08.444552: step: 172/463, loss: 2.9592556953430176
2023-01-22 14:28:09.076830: step: 174/463, loss: 1.0243690013885498
2023-01-22 14:28:09.713041: step: 176/463, loss: 0.2237897515296936
2023-01-22 14:28:10.333650: step: 178/463, loss: 0.46969369053840637
2023-01-22 14:28:10.973663: step: 180/463, loss: 5.125553607940674
2023-01-22 14:28:11.602101: step: 182/463, loss: 0.7905886769294739
2023-01-22 14:28:12.225396: step: 184/463, loss: 0.8888781666755676
2023-01-22 14:28:12.805723: step: 186/463, loss: 0.5473672747612
2023-01-22 14:28:13.497389: step: 188/463, loss: 1.7068771123886108
2023-01-22 14:28:14.101542: step: 190/463, loss: 1.7497769594192505
2023-01-22 14:28:14.672779: step: 192/463, loss: 0.13810977339744568
2023-01-22 14:28:15.271188: step: 194/463, loss: 0.21714536845684052
2023-01-22 14:28:15.845348: step: 196/463, loss: 2.6487720012664795
2023-01-22 14:28:16.468928: step: 198/463, loss: 0.1875929832458496
2023-01-22 14:28:17.001306: step: 200/463, loss: 0.4878803491592407
2023-01-22 14:28:17.662834: step: 202/463, loss: 0.1977538913488388
2023-01-22 14:28:18.292819: step: 204/463, loss: 0.6322788000106812
2023-01-22 14:28:18.904907: step: 206/463, loss: 0.3756943941116333
2023-01-22 14:28:19.610994: step: 208/463, loss: 0.422389417886734
2023-01-22 14:28:20.203750: step: 210/463, loss: 2.5367040634155273
2023-01-22 14:28:20.815925: step: 212/463, loss: 0.488334983587265
2023-01-22 14:28:21.430175: step: 214/463, loss: 1.513160228729248
2023-01-22 14:28:22.107642: step: 216/463, loss: 1.0832053422927856
2023-01-22 14:28:22.847871: step: 218/463, loss: 0.8703013062477112
2023-01-22 14:28:23.492302: step: 220/463, loss: 0.16551199555397034
2023-01-22 14:28:24.149975: step: 222/463, loss: 0.9340542554855347
2023-01-22 14:28:24.752804: step: 224/463, loss: 0.417271226644516
2023-01-22 14:28:25.332128: step: 226/463, loss: 1.1746236085891724
2023-01-22 14:28:26.065016: step: 228/463, loss: 1.1418590545654297
2023-01-22 14:28:26.633817: step: 230/463, loss: 0.6291463375091553
2023-01-22 14:28:27.200499: step: 232/463, loss: 0.4290115237236023
2023-01-22 14:28:27.772905: step: 234/463, loss: 0.9395275712013245
2023-01-22 14:28:28.379390: step: 236/463, loss: 0.4954184591770172
2023-01-22 14:28:28.984106: step: 238/463, loss: 0.24520249664783478
2023-01-22 14:28:29.641827: step: 240/463, loss: 0.30184486508369446
2023-01-22 14:28:30.231493: step: 242/463, loss: 0.5127584934234619
2023-01-22 14:28:30.925716: step: 244/463, loss: 0.3230479657649994
2023-01-22 14:28:31.522767: step: 246/463, loss: 0.2729988694190979
2023-01-22 14:28:32.235293: step: 248/463, loss: 0.8001754879951477
2023-01-22 14:28:32.873271: step: 250/463, loss: 0.42000246047973633
2023-01-22 14:28:33.532610: step: 252/463, loss: 2.3957343101501465
2023-01-22 14:28:34.150059: step: 254/463, loss: 0.13441751897335052
2023-01-22 14:28:34.803339: step: 256/463, loss: 4.072413444519043
2023-01-22 14:28:35.449414: step: 258/463, loss: 0.46325889229774475
2023-01-22 14:28:36.054295: step: 260/463, loss: 0.8921032547950745
2023-01-22 14:28:36.608132: step: 262/463, loss: 0.3637198507785797
2023-01-22 14:28:37.223154: step: 264/463, loss: 1.1094224452972412
2023-01-22 14:28:37.809929: step: 266/463, loss: 0.41182786226272583
2023-01-22 14:28:38.431986: step: 268/463, loss: 0.35808348655700684
2023-01-22 14:28:39.021140: step: 270/463, loss: 0.1960856020450592
2023-01-22 14:28:39.572624: step: 272/463, loss: 0.6238852143287659
2023-01-22 14:28:40.218690:
step: 274/463, loss: 0.8750197291374207 2023-01-22 14:28:40.796490: step: 276/463, loss: 0.6028929948806763 2023-01-22 14:28:41.360440: step: 278/463, loss: 0.5123562216758728 2023-01-22 14:28:41.990557: step: 280/463, loss: 0.36594900488853455 2023-01-22 14:28:42.582832: step: 282/463, loss: 0.5846149921417236 2023-01-22 14:28:43.250232: step: 284/463, loss: 0.1833552122116089 2023-01-22 14:28:43.849060: step: 286/463, loss: 1.3880512714385986 2023-01-22 14:28:44.510969: step: 288/463, loss: 0.8846035003662109 2023-01-22 14:28:45.122703: step: 290/463, loss: 0.3541041314601898 2023-01-22 14:28:45.745569: step: 292/463, loss: 0.9688112735748291 2023-01-22 14:28:46.383361: step: 294/463, loss: 0.2690090239048004 2023-01-22 14:28:46.979111: step: 296/463, loss: 1.0178827047348022 2023-01-22 14:28:47.588520: step: 298/463, loss: 0.3885231018066406 2023-01-22 14:28:48.171129: step: 300/463, loss: 2.072477102279663 2023-01-22 14:28:48.738601: step: 302/463, loss: 0.6431088447570801 2023-01-22 14:28:49.395360: step: 304/463, loss: 0.34189966320991516 2023-01-22 14:28:50.042022: step: 306/463, loss: 0.6046356558799744 2023-01-22 14:28:50.784493: step: 308/463, loss: 0.7071553468704224 2023-01-22 14:28:51.363302: step: 310/463, loss: 1.2107521295547485 2023-01-22 14:28:51.979003: step: 312/463, loss: 0.2931959331035614 2023-01-22 14:28:52.521505: step: 314/463, loss: 1.391312599182129 2023-01-22 14:28:53.095716: step: 316/463, loss: 0.3702233135700226 2023-01-22 14:28:53.654003: step: 318/463, loss: 0.9324830770492554 2023-01-22 14:28:54.291351: step: 320/463, loss: 0.3153139054775238 2023-01-22 14:28:54.891439: step: 322/463, loss: 0.45497894287109375 2023-01-22 14:28:55.483469: step: 324/463, loss: 1.026483416557312 2023-01-22 14:28:56.066737: step: 326/463, loss: 1.3770110607147217 2023-01-22 14:28:56.678627: step: 328/463, loss: 0.18883921205997467 2023-01-22 14:28:57.386730: step: 330/463, loss: 0.5676251649856567 2023-01-22 14:28:57.979908: step: 332/463, loss: 
0.2552279531955719 2023-01-22 14:28:58.555890: step: 334/463, loss: 0.15682950615882874 2023-01-22 14:28:59.184629: step: 336/463, loss: 5.483316898345947 2023-01-22 14:28:59.933512: step: 338/463, loss: 0.26740601658821106 2023-01-22 14:29:00.604370: step: 340/463, loss: 0.34555819630622864 2023-01-22 14:29:01.188881: step: 342/463, loss: 0.6372495293617249 2023-01-22 14:29:01.831381: step: 344/463, loss: 0.18789167702198029 2023-01-22 14:29:02.556805: step: 346/463, loss: 0.3237600028514862 2023-01-22 14:29:03.159696: step: 348/463, loss: 1.2242016792297363 2023-01-22 14:29:03.761291: step: 350/463, loss: 0.3657453954219818 2023-01-22 14:29:04.395380: step: 352/463, loss: 0.4055376648902893 2023-01-22 14:29:05.006995: step: 354/463, loss: 0.4057251811027527 2023-01-22 14:29:05.565810: step: 356/463, loss: 0.4648597538471222 2023-01-22 14:29:06.200193: step: 358/463, loss: 0.19327548146247864 2023-01-22 14:29:06.803940: step: 360/463, loss: 0.4659730792045593 2023-01-22 14:29:07.404870: step: 362/463, loss: 0.34436526894569397 2023-01-22 14:29:08.034341: step: 364/463, loss: 1.7228577136993408 2023-01-22 14:29:08.693360: step: 366/463, loss: 0.9704595804214478 2023-01-22 14:29:09.328292: step: 368/463, loss: 0.5101957321166992 2023-01-22 14:29:09.925548: step: 370/463, loss: 0.2810487151145935 2023-01-22 14:29:10.506244: step: 372/463, loss: 0.5631650686264038 2023-01-22 14:29:11.211053: step: 374/463, loss: 0.48112404346466064 2023-01-22 14:29:11.822513: step: 376/463, loss: 0.3883575201034546 2023-01-22 14:29:12.403530: step: 378/463, loss: 1.0454851388931274 2023-01-22 14:29:13.139648: step: 380/463, loss: 0.32958826422691345 2023-01-22 14:29:13.783144: step: 382/463, loss: 1.251512885093689 2023-01-22 14:29:14.395319: step: 384/463, loss: 0.6470736265182495 2023-01-22 14:29:15.030396: step: 386/463, loss: 0.45478492975234985 2023-01-22 14:29:15.668217: step: 388/463, loss: 1.3127366304397583 2023-01-22 14:29:16.338689: step: 390/463, loss: 0.4562246799468994 
2023-01-22 14:29:16.947340: step: 392/463, loss: 1.1095129251480103 2023-01-22 14:29:17.580042: step: 394/463, loss: 0.3238731324672699 2023-01-22 14:29:18.229635: step: 396/463, loss: 5.099658012390137 2023-01-22 14:29:18.813909: step: 398/463, loss: 1.319354772567749 2023-01-22 14:29:19.414607: step: 400/463, loss: 0.256930410861969 2023-01-22 14:29:20.043192: step: 402/463, loss: 0.19138945639133453 2023-01-22 14:29:20.658598: step: 404/463, loss: 0.17694461345672607 2023-01-22 14:29:21.219668: step: 406/463, loss: 0.05967296287417412 2023-01-22 14:29:21.837114: step: 408/463, loss: 0.2454880326986313 2023-01-22 14:29:22.424891: step: 410/463, loss: 0.7378798127174377 2023-01-22 14:29:23.010280: step: 412/463, loss: 0.4065646231174469 2023-01-22 14:29:23.632099: step: 414/463, loss: 0.5879358649253845 2023-01-22 14:29:24.264421: step: 416/463, loss: 0.45791909098625183 2023-01-22 14:29:24.881657: step: 418/463, loss: 0.5145699381828308 2023-01-22 14:29:25.488965: step: 420/463, loss: 0.5762168765068054 2023-01-22 14:29:26.189624: step: 422/463, loss: 0.32217931747436523 2023-01-22 14:29:26.833935: step: 424/463, loss: 0.2534725069999695 2023-01-22 14:29:27.420714: step: 426/463, loss: 1.5495320558547974 2023-01-22 14:29:28.008073: step: 428/463, loss: 0.4527604579925537 2023-01-22 14:29:28.635682: step: 430/463, loss: 0.38663479685783386 2023-01-22 14:29:29.244578: step: 432/463, loss: 1.7005350589752197 2023-01-22 14:29:29.882418: step: 434/463, loss: 1.6915276050567627 2023-01-22 14:29:30.516040: step: 436/463, loss: 0.6575692892074585 2023-01-22 14:29:31.063468: step: 438/463, loss: 0.3307230770587921 2023-01-22 14:29:31.684316: step: 440/463, loss: 0.3964828848838806 2023-01-22 14:29:32.310962: step: 442/463, loss: 0.2246091365814209 2023-01-22 14:29:32.942805: step: 444/463, loss: 0.6495397686958313 2023-01-22 14:29:33.547465: step: 446/463, loss: 0.5014159679412842 2023-01-22 14:29:34.159893: step: 448/463, loss: 0.7837904691696167 2023-01-22 
14:29:34.779319: step: 450/463, loss: 0.5788853764533997 2023-01-22 14:29:35.376963: step: 452/463, loss: 0.15913105010986328 2023-01-22 14:29:35.996132: step: 454/463, loss: 0.43566659092903137 2023-01-22 14:29:36.565559: step: 456/463, loss: 0.2578667402267456 2023-01-22 14:29:37.169096: step: 458/463, loss: 0.27921104431152344 2023-01-22 14:29:37.800465: step: 460/463, loss: 0.8514171838760376 2023-01-22 14:29:38.480558: step: 462/463, loss: 0.3492605686187744 2023-01-22 14:29:39.051200: step: 464/463, loss: 2.1347439289093018 2023-01-22 14:29:39.588876: step: 466/463, loss: 4.5686445236206055 2023-01-22 14:29:40.221456: step: 468/463, loss: 0.6787347793579102 2023-01-22 14:29:40.815033: step: 470/463, loss: 0.5877057909965515 2023-01-22 14:29:41.390975: step: 472/463, loss: 1.0290591716766357 2023-01-22 14:29:41.966312: step: 474/463, loss: 0.9100293517112732 2023-01-22 14:29:42.621323: step: 476/463, loss: 1.161621332168579 2023-01-22 14:29:43.204922: step: 478/463, loss: 0.7352535128593445 2023-01-22 14:29:43.821604: step: 480/463, loss: 0.4720044732093811 2023-01-22 14:29:44.403273: step: 482/463, loss: 0.32759010791778564 2023-01-22 14:29:44.950352: step: 484/463, loss: 0.3483801484107971 2023-01-22 14:29:45.516341: step: 486/463, loss: 0.6938784122467041 2023-01-22 14:29:46.237221: step: 488/463, loss: 1.6254940032958984 2023-01-22 14:29:46.878789: step: 490/463, loss: 0.3006121516227722 2023-01-22 14:29:47.519909: step: 492/463, loss: 0.2562975287437439 2023-01-22 14:29:48.132220: step: 494/463, loss: 0.4081796109676361 2023-01-22 14:29:48.885885: step: 496/463, loss: 2.9090559482574463 2023-01-22 14:29:49.541796: step: 498/463, loss: 0.8642628192901611 2023-01-22 14:29:50.215721: step: 500/463, loss: 0.5731088519096375 2023-01-22 14:29:50.803010: step: 502/463, loss: 0.8631112575531006 2023-01-22 14:29:51.394942: step: 504/463, loss: 0.6537529826164246 2023-01-22 14:29:51.989358: step: 506/463, loss: 0.5805056095123291 2023-01-22 14:29:52.613668: step: 
508/463, loss: 0.9434651136398315 2023-01-22 14:29:53.182373: step: 510/463, loss: 0.45128804445266724 2023-01-22 14:29:53.789431: step: 512/463, loss: 0.492315411567688 2023-01-22 14:29:54.384695: step: 514/463, loss: 0.1113784909248352 2023-01-22 14:29:54.976214: step: 516/463, loss: 0.09925766289234161 2023-01-22 14:29:55.584176: step: 518/463, loss: 0.3642551600933075 2023-01-22 14:29:56.162876: step: 520/463, loss: 0.5471283793449402 2023-01-22 14:29:56.807121: step: 522/463, loss: 0.3618479073047638 2023-01-22 14:29:57.397231: step: 524/463, loss: 0.4802892208099365 2023-01-22 14:29:58.017647: step: 526/463, loss: 0.28477156162261963 2023-01-22 14:29:58.669446: step: 528/463, loss: 2.115710973739624 2023-01-22 14:29:59.260815: step: 530/463, loss: 1.0982928276062012 2023-01-22 14:29:59.937425: step: 532/463, loss: 0.9339340925216675 2023-01-22 14:30:00.559234: step: 534/463, loss: 1.1053259372711182 2023-01-22 14:30:01.207423: step: 536/463, loss: 0.5319801568984985 2023-01-22 14:30:01.920893: step: 538/463, loss: 0.13508951663970947 2023-01-22 14:30:02.519885: step: 540/463, loss: 0.32389169931411743 2023-01-22 14:30:03.160470: step: 542/463, loss: 1.4359307289123535 2023-01-22 14:30:03.828755: step: 544/463, loss: 0.1961343139410019 2023-01-22 14:30:04.466708: step: 546/463, loss: 0.1485653668642044 2023-01-22 14:30:05.051101: step: 548/463, loss: 0.2372693121433258 2023-01-22 14:30:05.659646: step: 550/463, loss: 1.4439091682434082 2023-01-22 14:30:06.261227: step: 552/463, loss: 0.36783695220947266 2023-01-22 14:30:06.855342: step: 554/463, loss: 0.20392726361751556 2023-01-22 14:30:07.474727: step: 556/463, loss: 0.2454075962305069 2023-01-22 14:30:08.028275: step: 558/463, loss: 0.2509396970272064 2023-01-22 14:30:08.565855: step: 560/463, loss: 0.21449409425258636 2023-01-22 14:30:09.150873: step: 562/463, loss: 0.4912894070148468 2023-01-22 14:30:09.781515: step: 564/463, loss: 0.545263409614563 2023-01-22 14:30:10.373586: step: 566/463, loss: 
0.47198545932769775 2023-01-22 14:30:11.057816: step: 568/463, loss: 0.6080930829048157 2023-01-22 14:30:11.681945: step: 570/463, loss: 1.7562438249588013 2023-01-22 14:30:12.272738: step: 572/463, loss: 0.2110171914100647 2023-01-22 14:30:12.871688: step: 574/463, loss: 0.5106478333473206 2023-01-22 14:30:13.486583: step: 576/463, loss: 0.2748633325099945 2023-01-22 14:30:14.097929: step: 578/463, loss: 0.3755010664463043 2023-01-22 14:30:14.648721: step: 580/463, loss: 0.8457111716270447 2023-01-22 14:30:15.308043: step: 582/463, loss: 0.7980345487594604 2023-01-22 14:30:15.920525: step: 584/463, loss: 0.9062277674674988 2023-01-22 14:30:16.558948: step: 586/463, loss: 1.4902383089065552 2023-01-22 14:30:17.114801: step: 588/463, loss: 1.0061204433441162 2023-01-22 14:30:17.829321: step: 590/463, loss: 0.22058585286140442 2023-01-22 14:30:18.408328: step: 592/463, loss: 1.013270378112793 2023-01-22 14:30:18.961252: step: 594/463, loss: 0.8260298371315002 2023-01-22 14:30:19.543136: step: 596/463, loss: 0.5032271146774292 2023-01-22 14:30:20.251004: step: 598/463, loss: 0.8600624203681946 2023-01-22 14:30:20.881946: step: 600/463, loss: 0.2312360256910324 2023-01-22 14:30:21.397942: step: 602/463, loss: 0.30232805013656616 2023-01-22 14:30:21.961535: step: 604/463, loss: 0.3175065219402313 2023-01-22 14:30:22.590277: step: 606/463, loss: 0.761849582195282 2023-01-22 14:30:23.200699: step: 608/463, loss: 1.4511946439743042 2023-01-22 14:30:23.852009: step: 610/463, loss: 0.44240328669548035 2023-01-22 14:30:24.457757: step: 612/463, loss: 0.40805041790008545 2023-01-22 14:30:25.053485: step: 614/463, loss: 0.3180646598339081 2023-01-22 14:30:25.635986: step: 616/463, loss: 0.9752008318901062 2023-01-22 14:30:26.254646: step: 618/463, loss: 0.7122431993484497 2023-01-22 14:30:26.945330: step: 620/463, loss: 0.5516524910926819 2023-01-22 14:30:27.593413: step: 622/463, loss: 1.1317731142044067 2023-01-22 14:30:28.256773: step: 624/463, loss: 0.4803718030452728 
2023-01-22 14:30:28.867684: step: 626/463, loss: 0.2919762432575226 2023-01-22 14:30:29.467170: step: 628/463, loss: 0.2965410053730011 2023-01-22 14:30:30.039998: step: 630/463, loss: 1.000588297843933 2023-01-22 14:30:30.622731: step: 632/463, loss: 0.38177481293678284 2023-01-22 14:30:31.212255: step: 634/463, loss: 1.1366404294967651 2023-01-22 14:30:31.792376: step: 636/463, loss: 0.27113187313079834 2023-01-22 14:30:32.376449: step: 638/463, loss: 0.38027048110961914 2023-01-22 14:30:33.005330: step: 640/463, loss: 0.40984851121902466 2023-01-22 14:30:33.645996: step: 642/463, loss: 0.31813642382621765 2023-01-22 14:30:34.334808: step: 644/463, loss: 0.2989621162414551 2023-01-22 14:30:34.939133: step: 646/463, loss: 0.23246346414089203 2023-01-22 14:30:35.584866: step: 648/463, loss: 0.6159422397613525 2023-01-22 14:30:36.161784: step: 650/463, loss: 0.7953891158103943 2023-01-22 14:30:36.736806: step: 652/463, loss: 0.17942120134830475 2023-01-22 14:30:37.385303: step: 654/463, loss: 0.7475681304931641 2023-01-22 14:30:38.083400: step: 656/463, loss: 0.48117488622665405 2023-01-22 14:30:38.679223: step: 658/463, loss: 0.4192470908164978 2023-01-22 14:30:39.310850: step: 660/463, loss: 0.42249608039855957 2023-01-22 14:30:39.938086: step: 662/463, loss: 1.3255600929260254 2023-01-22 14:30:40.520566: step: 664/463, loss: 0.37822043895721436 2023-01-22 14:30:41.136202: step: 666/463, loss: 0.44286683201789856 2023-01-22 14:30:41.825693: step: 668/463, loss: 0.8132822513580322 2023-01-22 14:30:42.411230: step: 670/463, loss: 0.4119279682636261 2023-01-22 14:30:42.989972: step: 672/463, loss: 0.19651460647583008 2023-01-22 14:30:43.568529: step: 674/463, loss: 0.321584016084671 2023-01-22 14:30:44.183737: step: 676/463, loss: 2.1282129287719727 2023-01-22 14:30:44.821728: step: 678/463, loss: 0.5800396203994751 2023-01-22 14:30:45.391728: step: 680/463, loss: 0.28494784235954285 2023-01-22 14:30:46.093403: step: 682/463, loss: 0.9049884080886841 2023-01-22 
14:30:46.665438: step: 684/463, loss: 0.1805637925863266 2023-01-22 14:30:47.229586: step: 686/463, loss: 0.9251638650894165 2023-01-22 14:30:47.817013: step: 688/463, loss: 0.22899910807609558 2023-01-22 14:30:48.379818: step: 690/463, loss: 0.7915898561477661 2023-01-22 14:30:49.046727: step: 692/463, loss: 0.7407082319259644 2023-01-22 14:30:49.665317: step: 694/463, loss: 0.6815155148506165 2023-01-22 14:30:50.251744: step: 696/463, loss: 0.19524085521697998 2023-01-22 14:30:50.875373: step: 698/463, loss: 0.932349443435669 2023-01-22 14:30:51.497431: step: 700/463, loss: 0.25380977988243103 2023-01-22 14:30:52.139841: step: 702/463, loss: 1.0239704847335815 2023-01-22 14:30:52.752482: step: 704/463, loss: 0.6593723297119141 2023-01-22 14:30:53.323926: step: 706/463, loss: 0.2932918965816498 2023-01-22 14:30:53.950352: step: 708/463, loss: 0.2701033055782318 2023-01-22 14:30:54.527161: step: 710/463, loss: 0.42596882581710815 2023-01-22 14:30:55.302723: step: 712/463, loss: 0.7785442471504211 2023-01-22 14:30:55.897633: step: 714/463, loss: 0.3493736684322357 2023-01-22 14:30:56.565921: step: 716/463, loss: 0.6795763969421387 2023-01-22 14:30:57.159817: step: 718/463, loss: 0.13511766493320465 2023-01-22 14:30:57.774694: step: 720/463, loss: 0.7000515460968018 2023-01-22 14:30:58.343396: step: 722/463, loss: 0.6480221748352051 2023-01-22 14:30:58.987013: step: 724/463, loss: 0.33398693799972534 2023-01-22 14:30:59.602533: step: 726/463, loss: 0.5859184861183167 2023-01-22 14:31:00.208304: step: 728/463, loss: 0.311678409576416 2023-01-22 14:31:00.783103: step: 730/463, loss: 0.2859085500240326 2023-01-22 14:31:01.367215: step: 732/463, loss: 0.28451016545295715 2023-01-22 14:31:02.001352: step: 734/463, loss: 0.264586478471756 2023-01-22 14:31:02.616408: step: 736/463, loss: 1.0608325004577637 2023-01-22 14:31:03.242343: step: 738/463, loss: 0.5518170595169067 2023-01-22 14:31:03.879201: step: 740/463, loss: 0.5363950133323669 2023-01-22 14:31:04.486607: step: 
742/463, loss: 0.33050817251205444 2023-01-22 14:31:05.080176: step: 744/463, loss: 0.3545205593109131 2023-01-22 14:31:05.688108: step: 746/463, loss: 0.3218936026096344 2023-01-22 14:31:06.316307: step: 748/463, loss: 0.33383825421333313 2023-01-22 14:31:06.973028: step: 750/463, loss: 0.2966766357421875 2023-01-22 14:31:07.610278: step: 752/463, loss: 0.36890727281570435 2023-01-22 14:31:08.264833: step: 754/463, loss: 0.7295394539833069 2023-01-22 14:31:08.829965: step: 756/463, loss: 0.09646491706371307 2023-01-22 14:31:09.443121: step: 758/463, loss: 0.18695104122161865 2023-01-22 14:31:09.980233: step: 760/463, loss: 0.30426305532455444 2023-01-22 14:31:10.546100: step: 762/463, loss: 0.3962453007698059 2023-01-22 14:31:11.141116: step: 764/463, loss: 0.506095826625824 2023-01-22 14:31:11.791263: step: 766/463, loss: 0.47268959879875183 2023-01-22 14:31:12.459944: step: 768/463, loss: 0.6120697259902954 2023-01-22 14:31:13.075742: step: 770/463, loss: 1.1976311206817627 2023-01-22 14:31:13.701594: step: 772/463, loss: 0.9530817866325378 2023-01-22 14:31:14.278943: step: 774/463, loss: 0.2800822854042053 2023-01-22 14:31:14.887370: step: 776/463, loss: 1.0232794284820557 2023-01-22 14:31:15.553206: step: 778/463, loss: 0.22213663160800934 2023-01-22 14:31:16.214314: step: 780/463, loss: 0.449619859457016 2023-01-22 14:31:16.836843: step: 782/463, loss: 0.41700172424316406 2023-01-22 14:31:17.421281: step: 784/463, loss: 0.14007307589054108 2023-01-22 14:31:18.073226: step: 786/463, loss: 1.3294371366500854 2023-01-22 14:31:18.623472: step: 788/463, loss: 0.6568621397018433 2023-01-22 14:31:19.286027: step: 790/463, loss: 0.8767426013946533 2023-01-22 14:31:19.927818: step: 792/463, loss: 0.5056803822517395 2023-01-22 14:31:20.614566: step: 794/463, loss: 0.25756436586380005 2023-01-22 14:31:21.235162: step: 796/463, loss: 0.580981433391571 2023-01-22 14:31:21.829616: step: 798/463, loss: 0.5176219344139099 2023-01-22 14:31:22.446529: step: 800/463, loss: 
0.36559242010116577 2023-01-22 14:31:23.002314: step: 802/463, loss: 0.1727662980556488 2023-01-22 14:31:23.625923: step: 804/463, loss: 0.6708003282546997 2023-01-22 14:31:24.285588: step: 806/463, loss: 0.7026484608650208 2023-01-22 14:31:24.934106: step: 808/463, loss: 0.5069759488105774 2023-01-22 14:31:25.535214: step: 810/463, loss: 1.4748495817184448 2023-01-22 14:31:26.138795: step: 812/463, loss: 0.233711838722229 2023-01-22 14:31:26.700204: step: 814/463, loss: 0.7495606541633606 2023-01-22 14:31:27.271920: step: 816/463, loss: 2.9529035091400146 2023-01-22 14:31:27.868569: step: 818/463, loss: 0.4902271628379822 2023-01-22 14:31:28.514661: step: 820/463, loss: 0.5887980461120605 2023-01-22 14:31:29.122044: step: 822/463, loss: 0.39090418815612793 2023-01-22 14:31:29.676766: step: 824/463, loss: 0.39508742094039917 2023-01-22 14:31:30.228615: step: 826/463, loss: 0.3593222200870514 2023-01-22 14:31:30.793290: step: 828/463, loss: 0.4631485939025879 2023-01-22 14:31:31.443661: step: 830/463, loss: 0.2933242619037628 2023-01-22 14:31:32.070071: step: 832/463, loss: 0.2546429932117462 2023-01-22 14:31:32.683071: step: 834/463, loss: 0.2525102198123932 2023-01-22 14:31:33.331928: step: 836/463, loss: 0.3825846314430237 2023-01-22 14:31:34.078943: step: 838/463, loss: 0.8450503349304199 2023-01-22 14:31:34.746564: step: 840/463, loss: 1.405557632446289 2023-01-22 14:31:35.373146: step: 842/463, loss: 4.327742576599121 2023-01-22 14:31:36.048581: step: 844/463, loss: 0.39875566959381104 2023-01-22 14:31:36.653968: step: 846/463, loss: 0.2450835257768631 2023-01-22 14:31:37.232759: step: 848/463, loss: 0.2649916112422943 2023-01-22 14:31:37.836840: step: 850/463, loss: 0.7916967868804932 2023-01-22 14:31:38.450997: step: 852/463, loss: 0.9974624514579773 2023-01-22 14:31:39.032754: step: 854/463, loss: 0.7380846738815308 2023-01-22 14:31:39.672643: step: 856/463, loss: 0.14538098871707916 2023-01-22 14:31:40.268878: step: 858/463, loss: 0.9558815956115723 
2023-01-22 14:31:40.912079: step: 860/463, loss: 0.48988571763038635 2023-01-22 14:31:41.476528: step: 862/463, loss: 0.20142319798469543 2023-01-22 14:31:42.112839: step: 864/463, loss: 0.6636270880699158 2023-01-22 14:31:42.702514: step: 866/463, loss: 0.21880663931369781 2023-01-22 14:31:43.319614: step: 868/463, loss: 0.49093732237815857 2023-01-22 14:31:43.905678: step: 870/463, loss: 0.4691268801689148 2023-01-22 14:31:44.486996: step: 872/463, loss: 2.876213312149048 2023-01-22 14:31:45.086348: step: 874/463, loss: 0.42662736773490906 2023-01-22 14:31:45.715057: step: 876/463, loss: 0.15170860290527344 2023-01-22 14:31:46.290800: step: 878/463, loss: 0.39958634972572327 2023-01-22 14:31:46.918071: step: 880/463, loss: 0.4473869800567627 2023-01-22 14:31:47.547081: step: 882/463, loss: 0.4297513961791992 2023-01-22 14:31:48.145218: step: 884/463, loss: 0.4243835210800171 2023-01-22 14:31:48.813957: step: 886/463, loss: 0.3453958034515381 2023-01-22 14:31:49.495999: step: 888/463, loss: 0.3606170415878296 2023-01-22 14:31:50.139340: step: 890/463, loss: 2.015164613723755 2023-01-22 14:31:50.785488: step: 892/463, loss: 0.6192336678504944 2023-01-22 14:31:51.395940: step: 894/463, loss: 0.22592231631278992 2023-01-22 14:31:51.978129: step: 896/463, loss: 0.4592258334159851 2023-01-22 14:31:52.639496: step: 898/463, loss: 0.40885621309280396 2023-01-22 14:31:53.284006: step: 900/463, loss: 1.7040338516235352 2023-01-22 14:31:53.885278: step: 902/463, loss: 0.4968821406364441 2023-01-22 14:31:54.467812: step: 904/463, loss: 0.4340237081050873 2023-01-22 14:31:55.121307: step: 906/463, loss: 0.6314359307289124 2023-01-22 14:31:55.736085: step: 908/463, loss: 0.9903269410133362 2023-01-22 14:31:56.396987: step: 910/463, loss: 0.32997357845306396 2023-01-22 14:31:56.953843: step: 912/463, loss: 0.6933586001396179 2023-01-22 14:31:57.588803: step: 914/463, loss: 1.1111772060394287 2023-01-22 14:31:58.157015: step: 916/463, loss: 0.7083145976066589 2023-01-22 
14:31:58.808425: step: 918/463, loss: 0.5266413688659668
2023-01-22 14:31:59.384118: step: 920/463, loss: 0.6543651819229126
2023-01-22 14:31:59.970813: step: 922/463, loss: 0.2363874912261963
2023-01-22 14:32:00.524857: step: 924/463, loss: 0.5578360557556152
2023-01-22 14:32:01.131991: step: 926/463, loss: 0.5949733257293701
==================================================
Loss: 0.744
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28520174708818635, 'r': 0.3252490512333966, 'f1': 0.3039117907801418}, 'combined': 0.22393500373273606, 'epoch': 0}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.37191344751505184, 'r': 0.2850904475725136, 'f1': 0.32276512210379255}, 'combined': 0.22707094017352242, 'epoch': 0}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2903846153846154, 'r': 0.3223434535104364, 'f1': 0.30553057553956836}, 'combined': 0.22512779250283982, 'epoch': 0}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36121212786452517, 'r': 0.27562876655864466, 'f1': 0.31266978657047834}, 'combined': 0.2219955484650396, 'epoch': 0}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22598870056497172, 'r': 0.38095238095238093, 'f1': 0.2836879432624113}, 'combined': 0.1891252955082742, 'epoch': 0}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31666666666666665, 'r': 0.41304347826086957, 'f1': 0.3584905660377358}, 'combined': 0.1792452830188679, 'epoch': 0}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28520174708818635, 'r': 0.3252490512333966, 'f1': 0.3039117907801418}, 'combined': 0.22393500373273606, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.37191344751505184, 'r': 0.2850904475725136, 'f1': 0.32276512210379255}, 'combined': 0.22707094017352242, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22598870056497172, 'r': 0.38095238095238093, 'f1': 0.2836879432624113}, 'combined': 0.1891252955082742, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2903846153846154, 'r': 0.3223434535104364, 'f1': 0.30553057553956836}, 'combined': 0.22512779250283982, 'epoch': 0}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36121212786452517, 'r': 0.27562876655864466, 'f1': 0.31266978657047834}, 'combined': 0.2219955484650396, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31666666666666665, 'r': 0.41304347826086957, 'f1': 0.3584905660377358}, 'combined': 0.1792452830188679, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
******************************
Epoch: 1 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:34:55.569501: step: 2/463, loss: 0.19585728645324707 2023-01-22 14:34:56.179611: step: 4/463, loss: 0.39597535133361816 2023-01-22 14:34:56.790202: step: 6/463, loss: 0.7657061815261841 2023-01-22 14:34:57.382950: step: 8/463, loss: 0.31791043281555176 2023-01-22 14:34:58.031077: step: 10/463, loss: 0.5949783325195312 2023-01-22 14:34:58.700020: step: 12/463, loss: 0.5473995804786682 2023-01-22 14:34:59.261756: step: 14/463, loss: 0.8219742774963379 2023-01-22 14:34:59.902261: step: 16/463, loss: 0.3287069797515869 2023-01-22 14:35:00.571930: step: 18/463, loss: 0.4599754214286804 2023-01-22 14:35:01.245321: step: 20/463, loss: 0.18107396364212036 2023-01-22 14:35:01.891411: step: 22/463, loss: 0.616373598575592 2023-01-22 14:35:02.518185: step: 24/463, loss: 0.1342749446630478 2023-01-22 14:35:03.064367: step: 26/463, loss: 0.3912769854068756 2023-01-22 14:35:03.693476: step: 28/463, loss: 0.6757115125656128 2023-01-22 14:35:04.307263: step: 30/463, loss: 0.16425196826457977 2023-01-22 14:35:04.907471: step: 32/463, loss: 0.35391271114349365 2023-01-22 14:35:05.514834: step: 34/463, loss: 1.077383041381836 2023-01-22 14:35:06.091820: step: 36/463, loss:
0.21433678269386292 2023-01-22 14:35:06.711523: step: 38/463, loss: 1.0441455841064453 2023-01-22 14:35:07.336838: step: 40/463, loss: 0.4132791757583618 2023-01-22 14:35:07.940149: step: 42/463, loss: 0.3133901357650757 2023-01-22 14:35:08.564729: step: 44/463, loss: 0.5324940085411072 2023-01-22 14:35:09.209586: step: 46/463, loss: 0.28842881321907043 2023-01-22 14:35:09.799690: step: 48/463, loss: 0.38999855518341064 2023-01-22 14:35:10.393083: step: 50/463, loss: 0.8913331627845764 2023-01-22 14:35:11.003416: step: 52/463, loss: 0.7144319415092468 2023-01-22 14:35:11.564871: step: 54/463, loss: 0.9332661032676697 2023-01-22 14:35:12.207759: step: 56/463, loss: 0.5182240605354309 2023-01-22 14:35:12.827097: step: 58/463, loss: 0.3184746503829956 2023-01-22 14:35:13.454842: step: 60/463, loss: 0.5256998538970947 2023-01-22 14:35:14.073495: step: 62/463, loss: 0.5230571031570435 2023-01-22 14:35:14.727842: step: 64/463, loss: 0.21986477077007294 2023-01-22 14:35:15.299812: step: 66/463, loss: 0.43045324087142944 2023-01-22 14:35:15.875330: step: 68/463, loss: 0.2414759397506714 2023-01-22 14:35:16.481681: step: 70/463, loss: 0.4536994695663452 2023-01-22 14:35:17.117063: step: 72/463, loss: 0.5503613352775574 2023-01-22 14:35:17.723510: step: 74/463, loss: 0.927859902381897 2023-01-22 14:35:18.307153: step: 76/463, loss: 0.7228865623474121 2023-01-22 14:35:18.949380: step: 78/463, loss: 0.23453272879123688 2023-01-22 14:35:19.630824: step: 80/463, loss: 0.4681675136089325 2023-01-22 14:35:20.169287: step: 82/463, loss: 0.6745116114616394 2023-01-22 14:35:20.764374: step: 84/463, loss: 0.33498528599739075 2023-01-22 14:35:21.361062: step: 86/463, loss: 0.09371042996644974 2023-01-22 14:35:21.952911: step: 88/463, loss: 0.20578297972679138 2023-01-22 14:35:22.552554: step: 90/463, loss: 0.25540396571159363 2023-01-22 14:35:23.099794: step: 92/463, loss: 0.19388741254806519 2023-01-22 14:35:23.685112: step: 94/463, loss: 0.7285294532775879 2023-01-22 14:35:24.362075: 
step: 96/463, loss: 0.9432556629180908 2023-01-22 14:35:24.954997: step: 98/463, loss: 0.2498428225517273 2023-01-22 14:35:25.536369: step: 100/463, loss: 0.3710387945175171 2023-01-22 14:35:26.201775: step: 102/463, loss: 0.21482504904270172 2023-01-22 14:35:26.876980: step: 104/463, loss: 0.1585397571325302 2023-01-22 14:35:27.456760: step: 106/463, loss: 0.34361034631729126 2023-01-22 14:35:28.002606: step: 108/463, loss: 0.28889888525009155 2023-01-22 14:35:28.650635: step: 110/463, loss: 0.8251504898071289 2023-01-22 14:35:29.268744: step: 112/463, loss: 0.5852532386779785 2023-01-22 14:35:29.816615: step: 114/463, loss: 0.2219550609588623 2023-01-22 14:35:30.475513: step: 116/463, loss: 1.6506783962249756 2023-01-22 14:35:31.079965: step: 118/463, loss: 0.20651747286319733 2023-01-22 14:35:31.675509: step: 120/463, loss: 0.32658785581588745 2023-01-22 14:35:32.353305: step: 122/463, loss: 0.19296163320541382 2023-01-22 14:35:32.930560: step: 124/463, loss: 0.316299170255661 2023-01-22 14:35:33.544256: step: 126/463, loss: 1.5248197317123413 2023-01-22 14:35:34.160586: step: 128/463, loss: 0.1860351264476776 2023-01-22 14:35:34.786507: step: 130/463, loss: 0.27384746074676514 2023-01-22 14:35:35.394429: step: 132/463, loss: 0.2149728536605835 2023-01-22 14:35:36.054132: step: 134/463, loss: 0.46675795316696167 2023-01-22 14:35:36.688751: step: 136/463, loss: 0.20895560085773468 2023-01-22 14:35:37.298965: step: 138/463, loss: 0.6667411923408508 2023-01-22 14:35:37.931981: step: 140/463, loss: 0.8120225071907043 2023-01-22 14:35:38.517486: step: 142/463, loss: 0.16750864684581757 2023-01-22 14:35:39.151147: step: 144/463, loss: 0.3574519753456116 2023-01-22 14:35:39.713791: step: 146/463, loss: 0.5376714468002319 2023-01-22 14:35:40.323884: step: 148/463, loss: 0.22540387511253357 2023-01-22 14:35:40.981281: step: 150/463, loss: 0.4751604199409485 2023-01-22 14:35:41.616558: step: 152/463, loss: 0.4385087490081787 2023-01-22 14:35:42.256942: step: 154/463, 
loss: 0.6574984788894653 2023-01-22 14:35:42.866449: step: 156/463, loss: 1.6254949569702148 2023-01-22 14:35:43.512447: step: 158/463, loss: 0.3618869185447693 2023-01-22 14:35:44.119655: step: 160/463, loss: 0.3058200478553772 2023-01-22 14:35:44.708644: step: 162/463, loss: 0.4607722759246826 2023-01-22 14:35:45.355367: step: 164/463, loss: 0.4368288516998291 2023-01-22 14:35:46.003808: step: 166/463, loss: 0.22349052131175995 2023-01-22 14:35:46.641086: step: 168/463, loss: 0.23627817630767822 2023-01-22 14:35:47.242657: step: 170/463, loss: 0.8253803253173828 2023-01-22 14:35:47.847058: step: 172/463, loss: 0.44908201694488525 2023-01-22 14:35:48.486186: step: 174/463, loss: 1.5602459907531738 2023-01-22 14:35:49.109993: step: 176/463, loss: 0.13364383578300476 2023-01-22 14:35:49.730597: step: 178/463, loss: 0.23759756982326508 2023-01-22 14:35:50.350816: step: 180/463, loss: 0.8966045379638672 2023-01-22 14:35:50.941246: step: 182/463, loss: 0.16692018508911133 2023-01-22 14:35:51.543293: step: 184/463, loss: 0.5923877358436584 2023-01-22 14:35:52.202709: step: 186/463, loss: 0.2065540850162506 2023-01-22 14:35:52.899386: step: 188/463, loss: 0.2770240902900696 2023-01-22 14:35:53.501024: step: 190/463, loss: 0.5257288217544556 2023-01-22 14:35:54.155934: step: 192/463, loss: 0.15113915503025055 2023-01-22 14:35:54.915233: step: 194/463, loss: 0.29753807187080383 2023-01-22 14:35:55.521426: step: 196/463, loss: 0.5056251883506775 2023-01-22 14:35:56.131367: step: 198/463, loss: 0.21869823336601257 2023-01-22 14:35:56.762103: step: 200/463, loss: 0.22960779070854187 2023-01-22 14:35:57.298131: step: 202/463, loss: 0.27974164485931396 2023-01-22 14:35:57.862974: step: 204/463, loss: 0.9470335245132446 2023-01-22 14:35:58.410815: step: 206/463, loss: 0.2707477807998657 2023-01-22 14:35:59.016763: step: 208/463, loss: 0.46579110622406006 2023-01-22 14:35:59.679271: step: 210/463, loss: 0.3898662328720093 2023-01-22 14:36:00.325454: step: 212/463, loss: 
0.3594658374786377 2023-01-22 14:36:00.963668: step: 214/463, loss: 4.784701347351074 2023-01-22 14:36:01.576233: step: 216/463, loss: 0.4560096859931946 2023-01-22 14:36:02.175610: step: 218/463, loss: 0.19345135986804962 2023-01-22 14:36:02.766659: step: 220/463, loss: 0.5748313665390015 2023-01-22 14:36:03.470644: step: 222/463, loss: 0.5320520997047424 2023-01-22 14:36:04.198696: step: 224/463, loss: 0.39326146245002747 2023-01-22 14:36:04.930216: step: 226/463, loss: 0.38007086515426636 2023-01-22 14:36:05.623625: step: 228/463, loss: 0.3957745432853699 2023-01-22 14:36:06.240933: step: 230/463, loss: 0.35035526752471924 2023-01-22 14:36:06.861868: step: 232/463, loss: 0.10355721414089203 2023-01-22 14:36:07.461279: step: 234/463, loss: 0.251360684633255 2023-01-22 14:36:08.127061: step: 236/463, loss: 0.8822212815284729 2023-01-22 14:36:08.738819: step: 238/463, loss: 0.5282868146896362 2023-01-22 14:36:09.339616: step: 240/463, loss: 0.7721855640411377 2023-01-22 14:36:10.054083: step: 242/463, loss: 1.3109180927276611 2023-01-22 14:36:10.597240: step: 244/463, loss: 0.225619375705719 2023-01-22 14:36:11.236231: step: 246/463, loss: 0.8196591138839722 2023-01-22 14:36:11.854629: step: 248/463, loss: 0.5923926830291748 2023-01-22 14:36:12.452474: step: 250/463, loss: 0.19557514786720276 2023-01-22 14:36:13.081891: step: 252/463, loss: 1.1578744649887085 2023-01-22 14:36:13.652923: step: 254/463, loss: 1.301957130432129 2023-01-22 14:36:14.236349: step: 256/463, loss: 0.32014790177345276 2023-01-22 14:36:14.812668: step: 258/463, loss: 0.22637328505516052 2023-01-22 14:36:15.339975: step: 260/463, loss: 0.29314589500427246 2023-01-22 14:36:16.025981: step: 262/463, loss: 0.46571049094200134 2023-01-22 14:36:16.636209: step: 264/463, loss: 0.7660174369812012 2023-01-22 14:36:17.255899: step: 266/463, loss: 0.2099403738975525 2023-01-22 14:36:17.885132: step: 268/463, loss: 0.8722325563430786 2023-01-22 14:36:18.569769: step: 270/463, loss: 1.6717137098312378 
2023-01-22 14:36:19.192457: step: 272/463, loss: 0.2936002016067505 2023-01-22 14:36:19.799653: step: 274/463, loss: 0.6355797648429871 2023-01-22 14:36:20.450910: step: 276/463, loss: 0.5348999500274658 2023-01-22 14:36:21.081691: step: 278/463, loss: 0.4594631493091583 2023-01-22 14:36:21.601754: step: 280/463, loss: 1.566497802734375 2023-01-22 14:36:22.184406: step: 282/463, loss: 0.494848370552063 2023-01-22 14:36:22.794062: step: 284/463, loss: 0.6561375856399536 2023-01-22 14:36:23.309577: step: 286/463, loss: 0.7224798202514648 2023-01-22 14:36:23.978397: step: 288/463, loss: 0.25280511379241943 2023-01-22 14:36:24.614755: step: 290/463, loss: 0.18655575811862946 2023-01-22 14:36:25.189838: step: 292/463, loss: 0.1829933226108551 2023-01-22 14:36:25.755889: step: 294/463, loss: 0.26584315299987793 2023-01-22 14:36:26.391689: step: 296/463, loss: 0.24086683988571167 2023-01-22 14:36:27.009215: step: 298/463, loss: 0.1897473931312561 2023-01-22 14:36:27.600631: step: 300/463, loss: 0.3339339792728424 2023-01-22 14:36:28.204986: step: 302/463, loss: 0.3493659794330597 2023-01-22 14:36:28.807398: step: 304/463, loss: 0.7415933609008789 2023-01-22 14:36:29.466175: step: 306/463, loss: 0.1549147069454193 2023-01-22 14:36:30.029458: step: 308/463, loss: 0.30904412269592285 2023-01-22 14:36:30.576218: step: 310/463, loss: 0.21701383590698242 2023-01-22 14:36:31.154506: step: 312/463, loss: 2.7062652111053467 2023-01-22 14:36:31.777832: step: 314/463, loss: 0.48734477162361145 2023-01-22 14:36:32.320164: step: 316/463, loss: 0.22967302799224854 2023-01-22 14:36:32.930265: step: 318/463, loss: 0.4436897933483124 2023-01-22 14:36:33.585653: step: 320/463, loss: 0.6010516285896301 2023-01-22 14:36:34.176732: step: 322/463, loss: 0.22536210715770721 2023-01-22 14:36:34.771347: step: 324/463, loss: 3.061147689819336 2023-01-22 14:36:35.378453: step: 326/463, loss: 0.31790152192115784 2023-01-22 14:36:35.993157: step: 328/463, loss: 0.20307254791259766 2023-01-22 
14:36:36.548274: step: 330/463, loss: 0.4403744339942932 2023-01-22 14:36:37.114030: step: 332/463, loss: 0.3328777253627777 2023-01-22 14:36:37.706865: step: 334/463, loss: 0.26805371046066284 2023-01-22 14:36:38.296669: step: 336/463, loss: 0.16040468215942383 2023-01-22 14:36:38.920666: step: 338/463, loss: 0.3903311491012573 2023-01-22 14:36:39.536954: step: 340/463, loss: 0.7325121760368347 2023-01-22 14:36:40.176576: step: 342/463, loss: 0.4636270999908447 2023-01-22 14:36:40.790134: step: 344/463, loss: 0.7845694422721863 2023-01-22 14:36:41.345881: step: 346/463, loss: 0.9925259351730347 2023-01-22 14:36:42.047923: step: 348/463, loss: 0.3937426805496216 2023-01-22 14:36:42.662942: step: 350/463, loss: 0.45332545042037964 2023-01-22 14:36:43.313448: step: 352/463, loss: 0.45665550231933594 2023-01-22 14:36:43.968678: step: 354/463, loss: 0.6265438795089722 2023-01-22 14:36:44.565317: step: 356/463, loss: 0.14529560506343842 2023-01-22 14:36:45.176196: step: 358/463, loss: 1.0199923515319824 2023-01-22 14:36:45.812421: step: 360/463, loss: 0.7039109468460083 2023-01-22 14:36:46.432934: step: 362/463, loss: 0.40457671880722046 2023-01-22 14:36:47.021118: step: 364/463, loss: 0.12513157725334167 2023-01-22 14:36:47.582318: step: 366/463, loss: 0.5390586256980896 2023-01-22 14:36:48.120622: step: 368/463, loss: 0.2585051655769348 2023-01-22 14:36:48.684526: step: 370/463, loss: 0.473108172416687 2023-01-22 14:36:49.286060: step: 372/463, loss: 0.6961944699287415 2023-01-22 14:36:49.883692: step: 374/463, loss: 0.4352990388870239 2023-01-22 14:36:50.517828: step: 376/463, loss: 0.6548789143562317 2023-01-22 14:36:51.059339: step: 378/463, loss: 3.077148914337158 2023-01-22 14:36:51.709190: step: 380/463, loss: 0.5724066495895386 2023-01-22 14:36:52.307750: step: 382/463, loss: 0.38456666469573975 2023-01-22 14:36:52.870206: step: 384/463, loss: 0.460111141204834 2023-01-22 14:36:53.461979: step: 386/463, loss: 0.5270627737045288 2023-01-22 14:36:54.050607: step: 
388/463, loss: 0.18158987164497375 2023-01-22 14:36:54.665330: step: 390/463, loss: 0.5960894823074341 2023-01-22 14:36:55.243016: step: 392/463, loss: 0.678532600402832 2023-01-22 14:36:55.793381: step: 394/463, loss: 1.2117701768875122 2023-01-22 14:36:56.386685: step: 396/463, loss: 0.9469590783119202 2023-01-22 14:36:57.026146: step: 398/463, loss: 0.14562702178955078 2023-01-22 14:36:57.661442: step: 400/463, loss: 0.33831822872161865 2023-01-22 14:36:58.250342: step: 402/463, loss: 0.1399655044078827 2023-01-22 14:36:59.083714: step: 404/463, loss: 0.8558112978935242 2023-01-22 14:36:59.687768: step: 406/463, loss: 0.6785668134689331 2023-01-22 14:37:00.312653: step: 408/463, loss: 0.3525480031967163 2023-01-22 14:37:00.898503: step: 410/463, loss: 0.4536390006542206 2023-01-22 14:37:01.567541: step: 412/463, loss: 4.140870094299316 2023-01-22 14:37:02.218510: step: 414/463, loss: 0.7896156311035156 2023-01-22 14:37:02.816468: step: 416/463, loss: 1.6702438592910767 2023-01-22 14:37:03.410371: step: 418/463, loss: 0.3009040653705597 2023-01-22 14:37:04.019001: step: 420/463, loss: 0.26697689294815063 2023-01-22 14:37:04.626217: step: 422/463, loss: 0.6904835104942322 2023-01-22 14:37:05.197818: step: 424/463, loss: 0.6115447282791138 2023-01-22 14:37:05.770982: step: 426/463, loss: 0.5334144234657288 2023-01-22 14:37:06.343543: step: 428/463, loss: 0.35336893796920776 2023-01-22 14:37:06.917584: step: 430/463, loss: 0.13788948953151703 2023-01-22 14:37:07.608363: step: 432/463, loss: 0.5630030632019043 2023-01-22 14:37:08.167169: step: 434/463, loss: 0.11666632443666458 2023-01-22 14:37:08.785041: step: 436/463, loss: 0.3252636194229126 2023-01-22 14:37:09.397216: step: 438/463, loss: 0.6952319145202637 2023-01-22 14:37:09.950794: step: 440/463, loss: 0.21628892421722412 2023-01-22 14:37:10.576920: step: 442/463, loss: 0.2892216145992279 2023-01-22 14:37:11.151156: step: 444/463, loss: 0.7028605341911316 2023-01-22 14:37:11.850761: step: 446/463, loss: 
1.1647183895111084 2023-01-22 14:37:12.434771: step: 448/463, loss: 0.16185317933559418 2023-01-22 14:37:13.051301: step: 450/463, loss: 0.3545680344104767 2023-01-22 14:37:13.665370: step: 452/463, loss: 0.7373641729354858 2023-01-22 14:37:14.281427: step: 454/463, loss: 0.4182175099849701 2023-01-22 14:37:14.903828: step: 456/463, loss: 0.08479192852973938 2023-01-22 14:37:15.492690: step: 458/463, loss: 1.5064592361450195 2023-01-22 14:37:16.092168: step: 460/463, loss: 0.14611418545246124 2023-01-22 14:37:16.727808: step: 462/463, loss: 0.6331365704536438 2023-01-22 14:37:17.377020: step: 464/463, loss: 0.566244900226593 2023-01-22 14:37:17.953014: step: 466/463, loss: 0.1699327826499939 2023-01-22 14:37:18.620311: step: 468/463, loss: 0.1908017098903656 2023-01-22 14:37:19.173633: step: 470/463, loss: 0.28972506523132324 2023-01-22 14:37:19.780666: step: 472/463, loss: 0.3339434266090393 2023-01-22 14:37:20.455649: step: 474/463, loss: 0.9610645771026611 2023-01-22 14:37:21.056041: step: 476/463, loss: 0.22169621288776398 2023-01-22 14:37:21.636061: step: 478/463, loss: 0.37318548560142517 2023-01-22 14:37:22.253510: step: 480/463, loss: 0.4892626702785492 2023-01-22 14:37:22.887782: step: 482/463, loss: 0.6090335845947266 2023-01-22 14:37:23.554644: step: 484/463, loss: 0.5468646883964539 2023-01-22 14:37:24.136397: step: 486/463, loss: 0.6679657697677612 2023-01-22 14:37:24.760750: step: 488/463, loss: 0.20053252577781677 2023-01-22 14:37:25.365071: step: 490/463, loss: 0.644349992275238 2023-01-22 14:37:25.987206: step: 492/463, loss: 0.18672671914100647 2023-01-22 14:37:26.554551: step: 494/463, loss: 0.644810676574707 2023-01-22 14:37:27.179649: step: 496/463, loss: 1.1243841648101807 2023-01-22 14:37:27.799023: step: 498/463, loss: 0.31747666001319885 2023-01-22 14:37:28.403170: step: 500/463, loss: 0.14564982056617737 2023-01-22 14:37:28.965337: step: 502/463, loss: 0.4342922866344452 2023-01-22 14:37:29.558426: step: 504/463, loss: 0.6133755445480347 
2023-01-22 14:37:30.174663: step: 506/463, loss: 0.10595931112766266 2023-01-22 14:37:30.776284: step: 508/463, loss: 0.28709539771080017 2023-01-22 14:37:31.375103: step: 510/463, loss: 0.3986273407936096 2023-01-22 14:37:31.982033: step: 512/463, loss: 0.3383350968360901 2023-01-22 14:37:32.551059: step: 514/463, loss: 0.1657877415418625 2023-01-22 14:37:33.246527: step: 516/463, loss: 0.3530150055885315 2023-01-22 14:37:33.831432: step: 518/463, loss: 0.8292036056518555 2023-01-22 14:37:34.438608: step: 520/463, loss: 0.3785654604434967 2023-01-22 14:37:35.022255: step: 522/463, loss: 0.21164549887180328 2023-01-22 14:37:35.668168: step: 524/463, loss: 0.6882820129394531 2023-01-22 14:37:36.243308: step: 526/463, loss: 1.164467453956604 2023-01-22 14:37:36.840468: step: 528/463, loss: 0.13333871960639954 2023-01-22 14:37:37.502163: step: 530/463, loss: 1.0616865158081055 2023-01-22 14:37:38.118766: step: 532/463, loss: 0.41032901406288147 2023-01-22 14:37:38.724705: step: 534/463, loss: 0.5675647854804993 2023-01-22 14:37:39.318330: step: 536/463, loss: 0.3366907238960266 2023-01-22 14:37:39.940932: step: 538/463, loss: 0.3847780227661133 2023-01-22 14:37:40.518866: step: 540/463, loss: 0.35826772451400757 2023-01-22 14:37:41.168683: step: 542/463, loss: 0.6545200347900391 2023-01-22 14:37:41.719497: step: 544/463, loss: 0.3019644320011139 2023-01-22 14:37:42.357637: step: 546/463, loss: 0.20736636221408844 2023-01-22 14:37:42.951111: step: 548/463, loss: 0.5021512508392334 2023-01-22 14:37:43.525849: step: 550/463, loss: 0.6048513650894165 2023-01-22 14:37:44.129511: step: 552/463, loss: 0.14977408945560455 2023-01-22 14:37:44.727554: step: 554/463, loss: 0.5103378295898438 2023-01-22 14:37:45.282221: step: 556/463, loss: 3.563530921936035 2023-01-22 14:37:45.834061: step: 558/463, loss: 0.0657290369272232 2023-01-22 14:37:46.478291: step: 560/463, loss: 0.8456913232803345 2023-01-22 14:37:47.103997: step: 562/463, loss: 0.22835709154605865 2023-01-22 
14:37:47.725621: step: 564/463, loss: 0.2777712047100067 2023-01-22 14:37:48.317845: step: 566/463, loss: 0.6331112384796143 2023-01-22 14:37:48.911978: step: 568/463, loss: 0.4482853412628174 2023-01-22 14:37:49.463468: step: 570/463, loss: 0.4121679365634918 2023-01-22 14:37:50.103539: step: 572/463, loss: 0.3052824139595032 2023-01-22 14:37:50.712848: step: 574/463, loss: 0.09329718351364136 2023-01-22 14:37:51.325059: step: 576/463, loss: 0.9068806767463684 2023-01-22 14:37:51.885848: step: 578/463, loss: 0.18510934710502625 2023-01-22 14:37:52.447144: step: 580/463, loss: 0.14637409150600433 2023-01-22 14:37:53.065015: step: 582/463, loss: 0.3196537494659424 2023-01-22 14:37:53.669137: step: 584/463, loss: 0.5623336434364319 2023-01-22 14:37:54.254083: step: 586/463, loss: 0.7740505337715149 2023-01-22 14:37:54.853755: step: 588/463, loss: 0.27375495433807373 2023-01-22 14:37:55.417611: step: 590/463, loss: 0.5583195090293884 2023-01-22 14:37:55.966159: step: 592/463, loss: 0.6633048057556152 2023-01-22 14:37:56.718649: step: 594/463, loss: 0.2701930105686188 2023-01-22 14:37:57.315312: step: 596/463, loss: 0.1170184314250946 2023-01-22 14:37:57.947984: step: 598/463, loss: 0.42056772112846375 2023-01-22 14:37:58.521341: step: 600/463, loss: 0.3729879856109619 2023-01-22 14:37:59.148289: step: 602/463, loss: 0.332736998796463 2023-01-22 14:37:59.796890: step: 604/463, loss: 0.4164681136608124 2023-01-22 14:38:00.436516: step: 606/463, loss: 1.468922734260559 2023-01-22 14:38:01.012260: step: 608/463, loss: 0.18038666248321533 2023-01-22 14:38:01.565868: step: 610/463, loss: 0.13868804275989532 2023-01-22 14:38:02.193866: step: 612/463, loss: 0.3853779435157776 2023-01-22 14:38:02.810383: step: 614/463, loss: 0.1887216717004776 2023-01-22 14:38:03.395871: step: 616/463, loss: 0.23369747400283813 2023-01-22 14:38:03.968583: step: 618/463, loss: 0.527493953704834 2023-01-22 14:38:04.504558: step: 620/463, loss: 0.22691801190376282 2023-01-22 14:38:05.139187: 
step: 622/463, loss: 1.387054681777954 2023-01-22 14:38:05.780874: step: 624/463, loss: 0.2124626636505127 2023-01-22 14:38:06.416200: step: 626/463, loss: 0.21335060894489288 2023-01-22 14:38:07.032814: step: 628/463, loss: 0.10543233156204224 2023-01-22 14:38:07.639745: step: 630/463, loss: 0.1167958527803421 2023-01-22 14:38:08.348911: step: 632/463, loss: 1.2241394519805908 2023-01-22 14:38:09.043886: step: 634/463, loss: 0.273427277803421 2023-01-22 14:38:09.618997: step: 636/463, loss: 0.47759711742401123 2023-01-22 14:38:10.275820: step: 638/463, loss: 0.3240094780921936 2023-01-22 14:38:10.980894: step: 640/463, loss: 0.14702461659908295 2023-01-22 14:38:11.598508: step: 642/463, loss: 0.6459577083587646 2023-01-22 14:38:12.181758: step: 644/463, loss: 0.8614663481712341 2023-01-22 14:38:12.849866: step: 646/463, loss: 0.4682038128376007 2023-01-22 14:38:13.443663: step: 648/463, loss: 0.6623556017875671 2023-01-22 14:38:14.006599: step: 650/463, loss: 0.3775046169757843 2023-01-22 14:38:14.608341: step: 652/463, loss: 0.8535000681877136 2023-01-22 14:38:15.209536: step: 654/463, loss: 0.18346664309501648 2023-01-22 14:38:15.760336: step: 656/463, loss: 0.48958536982536316 2023-01-22 14:38:16.422917: step: 658/463, loss: 0.1783166527748108 2023-01-22 14:38:17.143129: step: 660/463, loss: 0.1954294592142105 2023-01-22 14:38:17.714420: step: 662/463, loss: 1.4892653226852417 2023-01-22 14:38:18.307244: step: 664/463, loss: 0.5399670600891113 2023-01-22 14:38:18.950984: step: 666/463, loss: 0.16029560565948486 2023-01-22 14:38:19.626248: step: 668/463, loss: 0.27526137232780457 2023-01-22 14:38:20.250318: step: 670/463, loss: 0.3510288596153259 2023-01-22 14:38:20.822646: step: 672/463, loss: 1.2451423406600952 2023-01-22 14:38:21.430459: step: 674/463, loss: 0.3461191654205322 2023-01-22 14:38:22.045556: step: 676/463, loss: 1.2946021556854248 2023-01-22 14:38:22.611225: step: 678/463, loss: 0.29601821303367615 2023-01-22 14:38:23.250549: step: 680/463, loss: 
0.5967240929603577 2023-01-22 14:38:23.931846: step: 682/463, loss: 0.43684858083724976 2023-01-22 14:38:24.572805: step: 684/463, loss: 0.37003806233406067 2023-01-22 14:38:25.170810: step: 686/463, loss: 0.2925017774105072 2023-01-22 14:38:25.883877: step: 688/463, loss: 0.750586986541748 2023-01-22 14:38:26.465132: step: 690/463, loss: 0.2662805914878845 2023-01-22 14:38:27.146406: step: 692/463, loss: 0.08315921574831009 2023-01-22 14:38:27.765190: step: 694/463, loss: 0.510014533996582 2023-01-22 14:38:28.365949: step: 696/463, loss: 0.6255664825439453 2023-01-22 14:38:28.926788: step: 698/463, loss: 0.3857118487358093 2023-01-22 14:38:29.543906: step: 700/463, loss: 0.8448182344436646 2023-01-22 14:38:30.235569: step: 702/463, loss: 0.23823508620262146 2023-01-22 14:38:30.962620: step: 704/463, loss: 0.416903555393219 2023-01-22 14:38:31.628686: step: 706/463, loss: 0.6272727251052856 2023-01-22 14:38:32.249992: step: 708/463, loss: 0.41132432222366333 2023-01-22 14:38:32.804040: step: 710/463, loss: 0.2904421389102936 2023-01-22 14:38:33.483534: step: 712/463, loss: 0.2201634645462036 2023-01-22 14:38:34.122638: step: 714/463, loss: 0.19659151136875153 2023-01-22 14:38:34.675197: step: 716/463, loss: 0.13755370676517487 2023-01-22 14:38:35.263290: step: 718/463, loss: 0.15967606008052826 2023-01-22 14:38:35.830686: step: 720/463, loss: 0.21314632892608643 2023-01-22 14:38:36.428450: step: 722/463, loss: 0.29815948009490967 2023-01-22 14:38:36.937993: step: 724/463, loss: 0.4772103726863861 2023-01-22 14:38:37.522371: step: 726/463, loss: 0.6796538829803467 2023-01-22 14:38:38.149999: step: 728/463, loss: 1.1817771196365356 2023-01-22 14:38:38.744581: step: 730/463, loss: 0.11458157002925873 2023-01-22 14:38:39.414508: step: 732/463, loss: 0.5423222184181213 2023-01-22 14:38:40.061838: step: 734/463, loss: 1.5138659477233887 2023-01-22 14:38:40.671693: step: 736/463, loss: 0.31976398825645447 2023-01-22 14:38:41.281782: step: 738/463, loss: 0.6855306625366211 
2023-01-22 14:38:41.886867: step: 740/463, loss: 0.6657004952430725 2023-01-22 14:38:42.518359: step: 742/463, loss: 0.30983278155326843 2023-01-22 14:38:43.164563: step: 744/463, loss: 0.3589613437652588 2023-01-22 14:38:43.792797: step: 746/463, loss: 0.1466592699289322 2023-01-22 14:38:44.370617: step: 748/463, loss: 0.33431005477905273 2023-01-22 14:38:44.991056: step: 750/463, loss: 0.33402177691459656 2023-01-22 14:38:45.563112: step: 752/463, loss: 0.10586196929216385 2023-01-22 14:38:46.103206: step: 754/463, loss: 0.21629676222801208 2023-01-22 14:38:46.705064: step: 756/463, loss: 0.7314077019691467 2023-01-22 14:38:47.409627: step: 758/463, loss: 0.4795093238353729 2023-01-22 14:38:47.999939: step: 760/463, loss: 0.3549164831638336 2023-01-22 14:38:48.590622: step: 762/463, loss: 1.4829680919647217 2023-01-22 14:38:49.254562: step: 764/463, loss: 0.7186530828475952 2023-01-22 14:38:49.992058: step: 766/463, loss: 0.8889808058738708 2023-01-22 14:38:50.658542: step: 768/463, loss: 0.3271034061908722 2023-01-22 14:38:51.254161: step: 770/463, loss: 0.11731631308794022 2023-01-22 14:38:51.868530: step: 772/463, loss: 0.6403341889381409 2023-01-22 14:38:52.449486: step: 774/463, loss: 0.21435602009296417 2023-01-22 14:38:53.071085: step: 776/463, loss: 0.8118274807929993 2023-01-22 14:38:53.745509: step: 778/463, loss: 0.2780480682849884 2023-01-22 14:38:54.365011: step: 780/463, loss: 0.2366432547569275 2023-01-22 14:38:54.937112: step: 782/463, loss: 0.5622676610946655 2023-01-22 14:38:55.505543: step: 784/463, loss: 0.15237371623516083 2023-01-22 14:38:56.133755: step: 786/463, loss: 0.11023455858230591 2023-01-22 14:38:56.773618: step: 788/463, loss: 0.11888183653354645 2023-01-22 14:38:57.348663: step: 790/463, loss: 0.7956339120864868 2023-01-22 14:38:57.954291: step: 792/463, loss: 0.1040564700961113 2023-01-22 14:38:58.524357: step: 794/463, loss: 0.29696404933929443 2023-01-22 14:38:59.090416: step: 796/463, loss: 0.9707725644111633 2023-01-22 
14:38:59.709227: step: 798/463, loss: 0.9434940814971924 2023-01-22 14:39:00.251260: step: 800/463, loss: 0.3342609703540802 2023-01-22 14:39:00.837295: step: 802/463, loss: 0.16596946120262146 2023-01-22 14:39:01.412400: step: 804/463, loss: 0.15220169723033905 2023-01-22 14:39:02.044629: step: 806/463, loss: 0.2993983328342438 2023-01-22 14:39:02.725564: step: 808/463, loss: 1.494969367980957 2023-01-22 14:39:03.319065: step: 810/463, loss: 0.5887359380722046 2023-01-22 14:39:03.963293: step: 812/463, loss: 0.14861951768398285 2023-01-22 14:39:04.562657: step: 814/463, loss: 1.1531627178192139 2023-01-22 14:39:05.163477: step: 816/463, loss: 1.8949087858200073 2023-01-22 14:39:05.901086: step: 818/463, loss: 1.137722373008728 2023-01-22 14:39:06.519227: step: 820/463, loss: 0.40890055894851685 2023-01-22 14:39:07.106832: step: 822/463, loss: 0.31907492876052856 2023-01-22 14:39:07.714449: step: 824/463, loss: 0.3969718813896179 2023-01-22 14:39:08.386144: step: 826/463, loss: 0.2063141018152237 2023-01-22 14:39:09.075227: step: 828/463, loss: 5.490340709686279 2023-01-22 14:39:09.793549: step: 830/463, loss: 0.5402935743331909 2023-01-22 14:39:10.358667: step: 832/463, loss: 0.1961703598499298 2023-01-22 14:39:10.952321: step: 834/463, loss: 0.28550654649734497 2023-01-22 14:39:11.603437: step: 836/463, loss: 0.3302740156650543 2023-01-22 14:39:12.192870: step: 838/463, loss: 1.9249546527862549 2023-01-22 14:39:12.788777: step: 840/463, loss: 0.14874154329299927 2023-01-22 14:39:13.402313: step: 842/463, loss: 0.2976018786430359 2023-01-22 14:39:13.961539: step: 844/463, loss: 0.22562696039676666 2023-01-22 14:39:14.611015: step: 846/463, loss: 0.5715310573577881 2023-01-22 14:39:15.219100: step: 848/463, loss: 0.4846253991127014 2023-01-22 14:39:15.849887: step: 850/463, loss: 0.35301318764686584 2023-01-22 14:39:16.394191: step: 852/463, loss: 0.3543980121612549 2023-01-22 14:39:17.017901: step: 854/463, loss: 0.18314628303050995 2023-01-22 14:39:17.688931: 
step: 856/463, loss: 0.4690706133842468 2023-01-22 14:39:18.343860: step: 858/463, loss: 0.33335211873054504 2023-01-22 14:39:18.953097: step: 860/463, loss: 0.6792327761650085 2023-01-22 14:39:19.574198: step: 862/463, loss: 0.22534975409507751 2023-01-22 14:39:20.173432: step: 864/463, loss: 0.24095089733600616 2023-01-22 14:39:20.782293: step: 866/463, loss: 0.18276841938495636 2023-01-22 14:39:21.443672: step: 868/463, loss: 3.013735771179199 2023-01-22 14:39:21.991651: step: 870/463, loss: 0.53801429271698 2023-01-22 14:39:22.590324: step: 872/463, loss: 0.2600904107093811 2023-01-22 14:39:23.213564: step: 874/463, loss: 0.7525782585144043 2023-01-22 14:39:23.789786: step: 876/463, loss: 0.7971716523170471 2023-01-22 14:39:24.487389: step: 878/463, loss: 1.2024035453796387 2023-01-22 14:39:25.092515: step: 880/463, loss: 0.20178255438804626 2023-01-22 14:39:25.685985: step: 882/463, loss: 0.9095093607902527 2023-01-22 14:39:26.281350: step: 884/463, loss: 0.5892717838287354 2023-01-22 14:39:26.916775: step: 886/463, loss: 0.12595294415950775 2023-01-22 14:39:27.542537: step: 888/463, loss: 0.720389723777771 2023-01-22 14:39:28.189485: step: 890/463, loss: 0.7258880734443665 2023-01-22 14:39:28.807640: step: 892/463, loss: 0.38389071822166443 2023-01-22 14:39:29.387424: step: 894/463, loss: 0.21582907438278198 2023-01-22 14:39:30.000502: step: 896/463, loss: 0.37585902214050293 2023-01-22 14:39:30.586425: step: 898/463, loss: 0.8942197561264038 2023-01-22 14:39:31.123036: step: 900/463, loss: 0.48552414774894714 2023-01-22 14:39:31.743764: step: 902/463, loss: 0.22777792811393738 2023-01-22 14:39:32.340077: step: 904/463, loss: 0.36894893646240234 2023-01-22 14:39:32.936333: step: 906/463, loss: 0.35650408267974854 2023-01-22 14:39:33.636315: step: 908/463, loss: 0.33863526582717896 2023-01-22 14:39:34.265597: step: 910/463, loss: 0.5312002897262573 2023-01-22 14:39:34.931887: step: 912/463, loss: 0.18997633457183838 2023-01-22 14:39:35.523749: step: 914/463, 
loss: 1.7363231182098389 2023-01-22 14:39:36.151362: step: 916/463, loss: 0.38287869095802307 2023-01-22 14:39:36.825119: step: 918/463, loss: 0.5501614809036255 2023-01-22 14:39:37.423166: step: 920/463, loss: 0.23027585446834564 2023-01-22 14:39:37.996844: step: 922/463, loss: 1.2226881980895996 2023-01-22 14:39:38.549662: step: 924/463, loss: 0.07019729912281036 2023-01-22 14:39:39.119166: step: 926/463, loss: 0.13759225606918335
==================================================
Loss: 0.541
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32028864503816795, 'r': 0.31846537001897535, 'f1': 0.3193744053282588}, 'combined': 0.2353285091892433, 'epoch': 1}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3783160839747361, 'r': 0.2851412929870718, 'f1': 0.3251860363248976}, 'combined': 0.2308820857906773, 'epoch': 1}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 1}
New best chinese model...
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
******************************
Epoch: 2
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:42:24.703528: step: 2/463, loss: 0.2262139916419983 2023-01-22 14:42:25.287896: step: 4/463, loss: 0.13970355689525604 2023-01-22 14:42:25.878828: step: 6/463, loss: 0.5326498746871948 2023-01-22 14:42:26.467918: step: 8/463, loss: 0.21552881598472595 2023-01-22 14:42:27.154568: step: 10/463, loss: 0.18897739052772522 2023-01-22 14:42:27.771938: step: 12/463, loss: 0.18361541628837585 2023-01-22 14:42:28.424551: step: 14/463, loss: 0.2795824706554413 2023-01-22 14:42:29.030441: step: 16/463, loss: 0.7946299314498901 2023-01-22 14:42:29.661100: step: 18/463, loss: 0.46807724237442017 2023-01-22 14:42:30.275825: step: 20/463, loss: 0.45619815587997437 2023-01-22 14:42:30.879440: step: 22/463, loss: 0.24499639868736267 2023-01-22 14:42:31.425500: step: 24/463, loss: 2.2843081951141357 2023-01-22 14:42:32.062774: step: 26/463, loss: 0.19392967224121094 2023-01-22 14:42:32.654220: step: 28/463, loss: 0.1821277141571045 2023-01-22 14:42:33.269240: step: 30/463, loss: 0.6032585501670837 2023-01-22 14:42:33.938215: step: 32/463, loss: 0.48990920186042786 2023-01-22 14:42:34.531471: step: 34/463, loss:
0.16831152141094208 2023-01-22 14:42:35.142050: step: 36/463, loss: 0.39644360542297363 2023-01-22 14:42:35.740611: step: 38/463, loss: 0.41686809062957764 2023-01-22 14:42:36.334683: step: 40/463, loss: 0.976547122001648 2023-01-22 14:42:36.925515: step: 42/463, loss: 0.4144194424152374 2023-01-22 14:42:37.514520: step: 44/463, loss: 0.03228946030139923 2023-01-22 14:42:38.099164: step: 46/463, loss: 0.3626503050327301 2023-01-22 14:42:38.694554: step: 48/463, loss: 0.14880554378032684 2023-01-22 14:42:39.299077: step: 50/463, loss: 0.4781567454338074 2023-01-22 14:42:39.927725: step: 52/463, loss: 0.3004966080188751 2023-01-22 14:42:40.497024: step: 54/463, loss: 0.17531314492225647 2023-01-22 14:42:41.152713: step: 56/463, loss: 0.3630841076374054 2023-01-22 14:42:41.733268: step: 58/463, loss: 0.5739310383796692 2023-01-22 14:42:42.292527: step: 60/463, loss: 0.16647598147392273 2023-01-22 14:42:42.879424: step: 62/463, loss: 0.2597082257270813 2023-01-22 14:42:43.470324: step: 64/463, loss: 0.5741511583328247 2023-01-22 14:42:44.097865: step: 66/463, loss: 0.4561243951320648 2023-01-22 14:42:44.697355: step: 68/463, loss: 0.5546277761459351 2023-01-22 14:42:45.298150: step: 70/463, loss: 0.15766729414463043 2023-01-22 14:42:45.866409: step: 72/463, loss: 0.09988642483949661 2023-01-22 14:42:46.507985: step: 74/463, loss: 0.31349870562553406 2023-01-22 14:42:47.099491: step: 76/463, loss: 0.5003664493560791 2023-01-22 14:42:47.744784: step: 78/463, loss: 0.32491016387939453 2023-01-22 14:42:48.381596: step: 80/463, loss: 0.4183450937271118 2023-01-22 14:42:48.958046: step: 82/463, loss: 0.4364106357097626 2023-01-22 14:42:49.587152: step: 84/463, loss: 0.2937825918197632 2023-01-22 14:42:50.163168: step: 86/463, loss: 0.19527187943458557 2023-01-22 14:42:50.730313: step: 88/463, loss: 1.6810400485992432 2023-01-22 14:42:51.452772: step: 90/463, loss: 0.17712566256523132 2023-01-22 14:42:52.115532: step: 92/463, loss: 0.4210018515586853 2023-01-22 
14:42:52.752964: step: 94/463, loss: 0.18066702783107758 2023-01-22 14:42:53.365818: step: 96/463, loss: 0.3043428957462311 2023-01-22 14:42:54.004187: step: 98/463, loss: 0.7485647797584534 2023-01-22 14:42:54.626462: step: 100/463, loss: 0.33182600140571594 2023-01-22 14:42:55.202982: step: 102/463, loss: 0.9638960361480713 2023-01-22 14:42:55.779821: step: 104/463, loss: 0.21546097099781036 2023-01-22 14:42:56.419165: step: 106/463, loss: 0.25197961926460266 2023-01-22 14:42:56.998546: step: 108/463, loss: 0.1622588187456131 2023-01-22 14:42:57.614976: step: 110/463, loss: 0.3231807351112366 2023-01-22 14:42:58.246606: step: 112/463, loss: 0.1387479305267334 2023-01-22 14:42:58.846255: step: 114/463, loss: 0.11353818327188492 2023-01-22 14:42:59.459664: step: 116/463, loss: 0.6061052083969116 2023-01-22 14:43:00.093927: step: 118/463, loss: 0.37963390350341797 2023-01-22 14:43:00.679782: step: 120/463, loss: 0.5895127654075623 2023-01-22 14:43:01.247731: step: 122/463, loss: 0.18380790948867798 2023-01-22 14:43:01.835277: step: 124/463, loss: 0.20000189542770386 2023-01-22 14:43:02.488381: step: 126/463, loss: 0.4867173731327057 2023-01-22 14:43:03.032814: step: 128/463, loss: 0.3935805559158325 2023-01-22 14:43:03.625365: step: 130/463, loss: 0.2907065749168396 2023-01-22 14:43:04.256907: step: 132/463, loss: 0.23521944880485535 2023-01-22 14:43:04.940788: step: 134/463, loss: 0.07782916724681854 2023-01-22 14:43:05.521472: step: 136/463, loss: 0.20115149021148682 2023-01-22 14:43:06.193105: step: 138/463, loss: 0.6252246499061584 2023-01-22 14:43:06.801683: step: 140/463, loss: 0.2027372419834137 2023-01-22 14:43:07.440187: step: 142/463, loss: 0.4098787009716034 2023-01-22 14:43:08.004289: step: 144/463, loss: 0.29198339581489563 2023-01-22 14:43:08.596974: step: 146/463, loss: 0.06793458014726639 2023-01-22 14:43:09.225978: step: 148/463, loss: 0.16480296850204468 2023-01-22 14:43:09.850341: step: 150/463, loss: 0.1725868582725525 2023-01-22 14:43:10.449503: 
step: 152/463, loss: 1.4453141689300537 2023-01-22 14:43:11.007331: step: 154/463, loss: 0.19334720075130463 2023-01-22 14:43:11.671173: step: 156/463, loss: 0.35991835594177246 2023-01-22 14:43:12.339559: step: 158/463, loss: 0.20071880519390106 2023-01-22 14:43:12.953385: step: 160/463, loss: 0.28377851843833923 2023-01-22 14:43:13.645937: step: 162/463, loss: 1.074163794517517 2023-01-22 14:43:14.231576: step: 164/463, loss: 0.29272860288619995 2023-01-22 14:43:14.808209: step: 166/463, loss: 0.3365318477153778 2023-01-22 14:43:15.412590: step: 168/463, loss: 0.19417046010494232 2023-01-22 14:43:16.026136: step: 170/463, loss: 0.5363714098930359 2023-01-22 14:43:16.682394: step: 172/463, loss: 0.28791987895965576 2023-01-22 14:43:17.353269: step: 174/463, loss: 0.24893659353256226 2023-01-22 14:43:17.950617: step: 176/463, loss: 0.48654118180274963 2023-01-22 14:43:18.510774: step: 178/463, loss: 0.2546629309654236 2023-01-22 14:43:19.151288: step: 180/463, loss: 0.07618706673383713 2023-01-22 14:43:19.774799: step: 182/463, loss: 0.30248498916625977 2023-01-22 14:43:20.407358: step: 184/463, loss: 0.6183922290802002 2023-01-22 14:43:20.995114: step: 186/463, loss: 0.08533728867769241 2023-01-22 14:43:21.688191: step: 188/463, loss: 0.20948509871959686 2023-01-22 14:43:22.285161: step: 190/463, loss: 0.1542375385761261 2023-01-22 14:43:22.900117: step: 192/463, loss: 1.817619800567627 2023-01-22 14:43:23.510026: step: 194/463, loss: 0.19710521399974823 2023-01-22 14:43:24.161924: step: 196/463, loss: 0.5215041637420654 2023-01-22 14:43:24.740353: step: 198/463, loss: 0.17316752672195435 2023-01-22 14:43:25.294389: step: 200/463, loss: 0.22215090692043304 2023-01-22 14:43:25.921444: step: 202/463, loss: 0.10861774533987045 2023-01-22 14:43:26.518306: step: 204/463, loss: 0.20030221343040466 2023-01-22 14:43:27.149491: step: 206/463, loss: 0.2579762041568756 2023-01-22 14:43:27.797140: step: 208/463, loss: 0.21427719295024872 2023-01-22 14:43:28.406967: step: 
210/463, loss: 0.32289883494377136 2023-01-22 14:43:28.979516: step: 212/463, loss: 1.2323558330535889 2023-01-22 14:43:29.585477: step: 214/463, loss: 0.33822569251060486 2023-01-22 14:43:30.219549: step: 216/463, loss: 0.13827966153621674 2023-01-22 14:43:30.823674: step: 218/463, loss: 0.16938865184783936 2023-01-22 14:43:31.444373: step: 220/463, loss: 0.49105772376060486 2023-01-22 14:43:32.159013: step: 222/463, loss: 0.16074548661708832 2023-01-22 14:43:32.894039: step: 224/463, loss: 0.07550489902496338 2023-01-22 14:43:33.489297: step: 226/463, loss: 2.0119118690490723 2023-01-22 14:43:34.064992: step: 228/463, loss: 0.19576358795166016 2023-01-22 14:43:34.663397: step: 230/463, loss: 0.12733149528503418 2023-01-22 14:43:35.286491: step: 232/463, loss: 0.5914895534515381 2023-01-22 14:43:35.880716: step: 234/463, loss: 0.14311295747756958 2023-01-22 14:43:36.509730: step: 236/463, loss: 0.07294968515634537 2023-01-22 14:43:37.116987: step: 238/463, loss: 0.5615681409835815 2023-01-22 14:43:37.720804: step: 240/463, loss: 0.38742750883102417 2023-01-22 14:43:38.393364: step: 242/463, loss: 1.1079474687576294 2023-01-22 14:43:39.013849: step: 244/463, loss: 0.15486350655555725 2023-01-22 14:43:39.577141: step: 246/463, loss: 0.4444189965724945 2023-01-22 14:43:40.196597: step: 248/463, loss: 0.14523114264011383 2023-01-22 14:43:40.772902: step: 250/463, loss: 0.24669846892356873 2023-01-22 14:43:41.456245: step: 252/463, loss: 0.16515623033046722 2023-01-22 14:43:42.043402: step: 254/463, loss: 0.8675667643547058 2023-01-22 14:43:42.638682: step: 256/463, loss: 0.12947112321853638 2023-01-22 14:43:43.303031: step: 258/463, loss: 0.353164404630661 2023-01-22 14:43:43.921688: step: 260/463, loss: 0.15890935063362122 2023-01-22 14:43:44.500681: step: 262/463, loss: 0.5473419427871704 2023-01-22 14:43:45.214596: step: 264/463, loss: 0.9091481566429138 2023-01-22 14:43:45.762327: step: 266/463, loss: 1.010164737701416 2023-01-22 14:43:46.342093: step: 268/463, 
loss: 0.16562531888484955 2023-01-22 14:43:46.985081: step: 270/463, loss: 0.11564770340919495 2023-01-22 14:43:47.569472: step: 272/463, loss: 0.32844293117523193 2023-01-22 14:43:48.149501: step: 274/463, loss: 0.3388992249965668 2023-01-22 14:43:48.738673: step: 276/463, loss: 0.48698490858078003 2023-01-22 14:43:49.349370: step: 278/463, loss: 0.49804896116256714 2023-01-22 14:43:49.932516: step: 280/463, loss: 0.2674975097179413 2023-01-22 14:43:50.532828: step: 282/463, loss: 0.16241858899593353 2023-01-22 14:43:51.181783: step: 284/463, loss: 0.690189778804779 2023-01-22 14:43:51.760549: step: 286/463, loss: 0.3890738785266876 2023-01-22 14:43:52.402598: step: 288/463, loss: 0.22666752338409424 2023-01-22 14:43:52.984153: step: 290/463, loss: 0.13155047595500946 2023-01-22 14:43:53.666599: step: 292/463, loss: 0.7252397537231445 2023-01-22 14:43:54.272273: step: 294/463, loss: 0.1373811662197113 2023-01-22 14:43:54.904389: step: 296/463, loss: 0.3560776114463806 2023-01-22 14:43:55.537256: step: 298/463, loss: 0.6798927187919617 2023-01-22 14:43:56.138914: step: 300/463, loss: 0.24240463972091675 2023-01-22 14:43:56.748600: step: 302/463, loss: 1.0126694440841675 2023-01-22 14:43:57.334643: step: 304/463, loss: 0.21671414375305176 2023-01-22 14:43:57.880316: step: 306/463, loss: 0.24639222025871277 2023-01-22 14:43:58.541516: step: 308/463, loss: 0.3770604729652405 2023-01-22 14:43:59.135223: step: 310/463, loss: 0.3968098759651184 2023-01-22 14:43:59.703992: step: 312/463, loss: 0.12437605857849121 2023-01-22 14:44:00.344311: step: 314/463, loss: 0.5885732173919678 2023-01-22 14:44:01.022602: step: 316/463, loss: 0.32180649042129517 2023-01-22 14:44:01.676392: step: 318/463, loss: 0.32718878984451294 2023-01-22 14:44:02.319529: step: 320/463, loss: 0.0870414450764656 2023-01-22 14:44:02.928422: step: 322/463, loss: 0.15117381513118744 2023-01-22 14:44:03.613448: step: 324/463, loss: 0.6232203245162964 2023-01-22 14:44:04.282201: step: 326/463, loss: 
0.26005834341049194 2023-01-22 14:44:04.998699: step: 328/463, loss: 0.6427912712097168 2023-01-22 14:44:05.631154: step: 330/463, loss: 0.8352834582328796 2023-01-22 14:44:06.217493: step: 332/463, loss: 0.362226277589798 2023-01-22 14:44:06.860825: step: 334/463, loss: 0.14629898965358734 2023-01-22 14:44:07.516694: step: 336/463, loss: 0.24533942341804504 2023-01-22 14:44:08.133033: step: 338/463, loss: 0.1690611094236374 2023-01-22 14:44:08.836720: step: 340/463, loss: 0.5808298587799072 2023-01-22 14:44:09.452588: step: 342/463, loss: 0.5011056065559387 2023-01-22 14:44:10.064781: step: 344/463, loss: 0.32861724495887756 2023-01-22 14:44:10.677223: step: 346/463, loss: 0.47371912002563477 2023-01-22 14:44:11.336569: step: 348/463, loss: 0.1712179183959961 2023-01-22 14:44:11.942702: step: 350/463, loss: 0.08994511514902115 2023-01-22 14:44:12.629036: step: 352/463, loss: 0.13645125925540924 2023-01-22 14:44:13.243937: step: 354/463, loss: 0.13113732635974884 2023-01-22 14:44:13.836114: step: 356/463, loss: 0.34946003556251526 2023-01-22 14:44:14.469696: step: 358/463, loss: 0.3220357298851013 2023-01-22 14:44:15.094595: step: 360/463, loss: 1.2338030338287354 2023-01-22 14:44:15.765096: step: 362/463, loss: 0.24648243188858032 2023-01-22 14:44:16.336511: step: 364/463, loss: 0.9990612864494324 2023-01-22 14:44:16.893268: step: 366/463, loss: 0.9966065287590027 2023-01-22 14:44:17.465818: step: 368/463, loss: 0.3378678858280182 2023-01-22 14:44:18.178664: step: 370/463, loss: 0.4601289927959442 2023-01-22 14:44:18.768192: step: 372/463, loss: 0.3299180269241333 2023-01-22 14:44:19.395622: step: 374/463, loss: 0.38735929131507874 2023-01-22 14:44:20.020631: step: 376/463, loss: 0.7826871275901794 2023-01-22 14:44:20.607754: step: 378/463, loss: 0.14810487627983093 2023-01-22 14:44:21.185940: step: 380/463, loss: 0.4473065733909607 2023-01-22 14:44:21.824666: step: 382/463, loss: 0.13955721259117126 2023-01-22 14:44:22.494267: step: 384/463, loss: 
0.07252318412065506 2023-01-22 14:44:23.132159: step: 386/463, loss: 0.24605226516723633 2023-01-22 14:44:23.715481: step: 388/463, loss: 0.498287171125412 2023-01-22 14:44:24.275079: step: 390/463, loss: 0.3971472978591919 2023-01-22 14:44:24.900014: step: 392/463, loss: 0.2603898346424103 2023-01-22 14:44:25.510845: step: 394/463, loss: 0.08706539124250412 2023-01-22 14:44:26.091736: step: 396/463, loss: 0.714023232460022 2023-01-22 14:44:26.723575: step: 398/463, loss: 0.6536813974380493 2023-01-22 14:44:27.342349: step: 400/463, loss: 0.17577290534973145 2023-01-22 14:44:27.919574: step: 402/463, loss: 0.6449795365333557 2023-01-22 14:44:28.539303: step: 404/463, loss: 0.4789900481700897 2023-01-22 14:44:29.162795: step: 406/463, loss: 0.12064830213785172 2023-01-22 14:44:29.830962: step: 408/463, loss: 0.2557472586631775 2023-01-22 14:44:30.481506: step: 410/463, loss: 0.33917632699012756 2023-01-22 14:44:31.085361: step: 412/463, loss: 0.5771576762199402 2023-01-22 14:44:31.633224: step: 414/463, loss: 0.45937082171440125 2023-01-22 14:44:32.261714: step: 416/463, loss: 0.8191688656806946 2023-01-22 14:44:32.853821: step: 418/463, loss: 0.36285141110420227 2023-01-22 14:44:33.396388: step: 420/463, loss: 0.17637917399406433 2023-01-22 14:44:34.037025: step: 422/463, loss: 0.6516456604003906 2023-01-22 14:44:34.750977: step: 424/463, loss: 0.34220048785209656 2023-01-22 14:44:35.287247: step: 426/463, loss: 0.4493963420391083 2023-01-22 14:44:35.864297: step: 428/463, loss: 0.1352560818195343 2023-01-22 14:44:36.484595: step: 430/463, loss: 2.2609877586364746 2023-01-22 14:44:37.131341: step: 432/463, loss: 0.8286347389221191 2023-01-22 14:44:37.726555: step: 434/463, loss: 0.1838429868221283 2023-01-22 14:44:38.328707: step: 436/463, loss: 0.3064652979373932 2023-01-22 14:44:38.931655: step: 438/463, loss: 0.2734317183494568 2023-01-22 14:44:39.521230: step: 440/463, loss: 1.0021854639053345 2023-01-22 14:44:40.059391: step: 442/463, loss: 0.09702878445386887 
2023-01-22 14:44:40.629196: step: 444/463, loss: 0.5952252149581909 2023-01-22 14:44:41.192206: step: 446/463, loss: 0.47734540700912476 2023-01-22 14:44:41.842182: step: 448/463, loss: 0.17956160008907318 2023-01-22 14:44:42.449850: step: 450/463, loss: 0.10206503421068192 2023-01-22 14:44:43.071279: step: 452/463, loss: 0.3675583004951477 2023-01-22 14:44:43.663307: step: 454/463, loss: 0.6757737994194031 2023-01-22 14:44:44.308548: step: 456/463, loss: 0.33928850293159485 2023-01-22 14:44:44.888530: step: 458/463, loss: 0.2503208518028259 2023-01-22 14:44:45.451302: step: 460/463, loss: 0.24320414662361145 2023-01-22 14:44:46.112943: step: 462/463, loss: 1.474251627922058 2023-01-22 14:44:46.677081: step: 464/463, loss: 0.25353512167930603 2023-01-22 14:44:47.284030: step: 466/463, loss: 0.12053804099559784 2023-01-22 14:44:47.867683: step: 468/463, loss: 0.5638688802719116 2023-01-22 14:44:48.575315: step: 470/463, loss: 0.18507513403892517 2023-01-22 14:44:49.221531: step: 472/463, loss: 0.356080025434494 2023-01-22 14:44:49.816947: step: 474/463, loss: 0.17452777922153473 2023-01-22 14:44:50.427176: step: 476/463, loss: 0.2471393197774887 2023-01-22 14:44:51.076337: step: 478/463, loss: 0.25505387783050537 2023-01-22 14:44:51.726301: step: 480/463, loss: 0.5362104177474976 2023-01-22 14:44:52.314422: step: 482/463, loss: 0.20229533314704895 2023-01-22 14:44:52.971461: step: 484/463, loss: 3.891817569732666 2023-01-22 14:44:53.636282: step: 486/463, loss: 0.16219474375247955 2023-01-22 14:44:54.219154: step: 488/463, loss: 0.1829301118850708 2023-01-22 14:44:54.787674: step: 490/463, loss: 1.6643515825271606 2023-01-22 14:44:55.374671: step: 492/463, loss: 0.43801987171173096 2023-01-22 14:44:55.980384: step: 494/463, loss: 0.2452651709318161 2023-01-22 14:44:56.548917: step: 496/463, loss: 0.3577100336551666 2023-01-22 14:44:57.153726: step: 498/463, loss: 0.22962814569473267 2023-01-22 14:44:57.838901: step: 500/463, loss: 0.4521733820438385 2023-01-22 
14:44:58.575117: step: 502/463, loss: 0.2610316574573517 2023-01-22 14:44:59.174611: step: 504/463, loss: 0.8688585758209229 2023-01-22 14:44:59.763281: step: 506/463, loss: 0.17935660481452942 2023-01-22 14:45:00.309617: step: 508/463, loss: 0.09292390942573547 2023-01-22 14:45:00.990769: step: 510/463, loss: 0.6938821077346802 2023-01-22 14:45:01.620199: step: 512/463, loss: 0.2595483958721161 2023-01-22 14:45:02.235201: step: 514/463, loss: 0.3995194137096405 2023-01-22 14:45:02.823799: step: 516/463, loss: 0.24844281375408173 2023-01-22 14:45:03.408915: step: 518/463, loss: 1.806038737297058 2023-01-22 14:45:03.980950: step: 520/463, loss: 0.20255634188652039 2023-01-22 14:45:04.583797: step: 522/463, loss: 0.37533795833587646 2023-01-22 14:45:05.227038: step: 524/463, loss: 0.25347769260406494 2023-01-22 14:45:05.817585: step: 526/463, loss: 0.7211840152740479 2023-01-22 14:45:06.524418: step: 528/463, loss: 0.20950478315353394 2023-01-22 14:45:07.139275: step: 530/463, loss: 0.06605667620897293 2023-01-22 14:45:07.762581: step: 532/463, loss: 0.3842030465602875 2023-01-22 14:45:08.343026: step: 534/463, loss: 0.12531177699565887 2023-01-22 14:45:08.964510: step: 536/463, loss: 0.37251192331314087 2023-01-22 14:45:09.605277: step: 538/463, loss: 0.23689790070056915 2023-01-22 14:45:10.276728: step: 540/463, loss: 0.16443565487861633 2023-01-22 14:45:10.916469: step: 542/463, loss: 0.8490524888038635 2023-01-22 14:45:11.558954: step: 544/463, loss: 0.7690421342849731 2023-01-22 14:45:12.136816: step: 546/463, loss: 0.4573170840740204 2023-01-22 14:45:12.714583: step: 548/463, loss: 0.19372953474521637 2023-01-22 14:45:13.282985: step: 550/463, loss: 0.8426194787025452 2023-01-22 14:45:13.876007: step: 552/463, loss: 0.34307390451431274 2023-01-22 14:45:14.484806: step: 554/463, loss: 0.9012276530265808 2023-01-22 14:45:15.116238: step: 556/463, loss: 0.1480189710855484 2023-01-22 14:45:15.756885: step: 558/463, loss: 0.6231532692909241 2023-01-22 
14:45:16.406350: step: 560/463, loss: 0.6065738201141357 2023-01-22 14:45:17.016848: step: 562/463, loss: 0.6545352339744568 2023-01-22 14:45:17.649969: step: 564/463, loss: 0.20287767052650452 2023-01-22 14:45:18.252840: step: 566/463, loss: 0.4257025718688965 2023-01-22 14:45:18.881448: step: 568/463, loss: 0.2873762845993042 2023-01-22 14:45:19.603588: step: 570/463, loss: 0.2711103558540344 2023-01-22 14:45:20.190378: step: 572/463, loss: 0.251329630613327 2023-01-22 14:45:20.811273: step: 574/463, loss: 0.5893901586532593 2023-01-22 14:45:21.379580: step: 576/463, loss: 4.22517204284668 2023-01-22 14:45:21.974599: step: 578/463, loss: 0.31776368618011475 2023-01-22 14:45:22.531433: step: 580/463, loss: 0.18703852593898773 2023-01-22 14:45:23.136413: step: 582/463, loss: 1.2758302688598633 2023-01-22 14:45:23.750662: step: 584/463, loss: 0.18132202327251434 2023-01-22 14:45:24.368111: step: 586/463, loss: 0.27566802501678467 2023-01-22 14:45:24.953066: step: 588/463, loss: 0.41068628430366516 2023-01-22 14:45:25.531080: step: 590/463, loss: 0.323525071144104 2023-01-22 14:45:26.136314: step: 592/463, loss: 0.744659423828125 2023-01-22 14:45:26.738401: step: 594/463, loss: 0.9132704734802246 2023-01-22 14:45:27.375959: step: 596/463, loss: 0.11091698706150055 2023-01-22 14:45:28.061238: step: 598/463, loss: 0.30361202359199524 2023-01-22 14:45:28.616746: step: 600/463, loss: 0.44980588555336 2023-01-22 14:45:29.204624: step: 602/463, loss: 0.3262363076210022 2023-01-22 14:45:29.786876: step: 604/463, loss: 0.19514457881450653 2023-01-22 14:45:30.410376: step: 606/463, loss: 0.15008683502674103 2023-01-22 14:45:31.068204: step: 608/463, loss: 0.18740703165531158 2023-01-22 14:45:31.698069: step: 610/463, loss: 0.22440780699253082 2023-01-22 14:45:32.262947: step: 612/463, loss: 1.23301362991333 2023-01-22 14:45:32.850722: step: 614/463, loss: 0.3038812279701233 2023-01-22 14:45:33.496526: step: 616/463, loss: 0.271881103515625 2023-01-22 14:45:34.063624: step: 
618/463, loss: 0.1711817979812622 2023-01-22 14:45:34.741837: step: 620/463, loss: 0.9485455751419067 2023-01-22 14:45:35.353020: step: 622/463, loss: 0.4422200322151184 2023-01-22 14:45:36.003730: step: 624/463, loss: 0.6881210803985596 2023-01-22 14:45:36.681458: step: 626/463, loss: 0.15195192396640778 2023-01-22 14:45:37.227675: step: 628/463, loss: 0.12405158579349518 2023-01-22 14:45:37.799475: step: 630/463, loss: 0.15682324767112732 2023-01-22 14:45:38.406928: step: 632/463, loss: 0.24069520831108093 2023-01-22 14:45:39.002829: step: 634/463, loss: 0.13905930519104004 2023-01-22 14:45:39.568544: step: 636/463, loss: 0.17272640764713287 2023-01-22 14:45:40.205748: step: 638/463, loss: 0.13042089343070984 2023-01-22 14:45:40.851103: step: 640/463, loss: 0.3743976056575775 2023-01-22 14:45:41.512654: step: 642/463, loss: 1.008528470993042 2023-01-22 14:45:42.061254: step: 644/463, loss: 1.9237840175628662 2023-01-22 14:45:42.662266: step: 646/463, loss: 0.17411521077156067 2023-01-22 14:45:43.258699: step: 648/463, loss: 0.8163173794746399 2023-01-22 14:45:43.929004: step: 650/463, loss: 2.189035415649414 2023-01-22 14:45:44.509173: step: 652/463, loss: 0.14511334896087646 2023-01-22 14:45:45.087962: step: 654/463, loss: 0.4977121353149414 2023-01-22 14:45:45.745894: step: 656/463, loss: 0.11279921978712082 2023-01-22 14:45:46.422208: step: 658/463, loss: 0.7377143502235413 2023-01-22 14:45:47.041438: step: 660/463, loss: 0.29171061515808105 2023-01-22 14:45:47.640392: step: 662/463, loss: 0.20627543330192566 2023-01-22 14:45:48.230288: step: 664/463, loss: 0.4290556311607361 2023-01-22 14:45:48.820595: step: 666/463, loss: 1.004115343093872 2023-01-22 14:45:49.507709: step: 668/463, loss: 0.1845473200082779 2023-01-22 14:45:50.102001: step: 670/463, loss: 0.3395169973373413 2023-01-22 14:45:50.739962: step: 672/463, loss: 0.23915690183639526 2023-01-22 14:45:51.316594: step: 674/463, loss: 0.4111122190952301 2023-01-22 14:45:52.009561: step: 676/463, loss: 
0.5128810405731201 2023-01-22 14:45:52.594244: step: 678/463, loss: 0.3220401108264923 2023-01-22 14:45:53.210496: step: 680/463, loss: 0.11768213659524918 2023-01-22 14:45:53.858877: step: 682/463, loss: 0.17047147452831268 2023-01-22 14:45:54.497797: step: 684/463, loss: 0.48051273822784424 2023-01-22 14:45:55.048271: step: 686/463, loss: 0.4260302782058716 2023-01-22 14:45:55.646624: step: 688/463, loss: 0.48900124430656433 2023-01-22 14:45:56.193842: step: 690/463, loss: 0.36692264676094055 2023-01-22 14:45:56.812318: step: 692/463, loss: 0.16269977390766144 2023-01-22 14:45:57.456192: step: 694/463, loss: 0.23012447357177734 2023-01-22 14:45:58.139846: step: 696/463, loss: 0.41264817118644714 2023-01-22 14:45:58.749061: step: 698/463, loss: 0.41545382142066956 2023-01-22 14:45:59.400905: step: 700/463, loss: 0.37560033798217773 2023-01-22 14:46:00.026644: step: 702/463, loss: 0.20912964642047882 2023-01-22 14:46:00.673268: step: 704/463, loss: 0.06672941148281097 2023-01-22 14:46:01.253072: step: 706/463, loss: 0.09048417210578918 2023-01-22 14:46:01.971463: step: 708/463, loss: 0.9929485321044922 2023-01-22 14:46:02.594552: step: 710/463, loss: 0.2512942850589752 2023-01-22 14:46:03.258030: step: 712/463, loss: 1.8073031902313232 2023-01-22 14:46:03.795399: step: 714/463, loss: 0.12253375351428986 2023-01-22 14:46:04.378556: step: 716/463, loss: 0.2178686410188675 2023-01-22 14:46:05.164809: step: 718/463, loss: 0.2268475890159607 2023-01-22 14:46:05.752329: step: 720/463, loss: 0.7890647649765015 2023-01-22 14:46:06.340035: step: 722/463, loss: 0.49494680762290955 2023-01-22 14:46:07.031355: step: 724/463, loss: 0.6746903657913208 2023-01-22 14:46:07.630899: step: 726/463, loss: 0.7202172875404358 2023-01-22 14:46:08.212546: step: 728/463, loss: 0.43795469403266907 2023-01-22 14:46:08.743726: step: 730/463, loss: 0.2207535356283188 2023-01-22 14:46:09.354599: step: 732/463, loss: 0.29543352127075195 2023-01-22 14:46:09.927668: step: 734/463, loss: 
0.18779702484607697 2023-01-22 14:46:10.551880: step: 736/463, loss: 0.18888381123542786 2023-01-22 14:46:11.206364: step: 738/463, loss: 0.5460423231124878 2023-01-22 14:46:11.816278: step: 740/463, loss: 0.07184544205665588 2023-01-22 14:46:12.379397: step: 742/463, loss: 0.45520514249801636 2023-01-22 14:46:13.015142: step: 744/463, loss: 1.027717113494873 2023-01-22 14:46:13.672744: step: 746/463, loss: 0.16178809106349945 2023-01-22 14:46:14.355937: step: 748/463, loss: 1.2187113761901855 2023-01-22 14:46:14.956082: step: 750/463, loss: 0.27004769444465637 2023-01-22 14:46:15.632195: step: 752/463, loss: 0.6326010823249817 2023-01-22 14:46:16.239766: step: 754/463, loss: 0.21198250353336334 2023-01-22 14:46:16.889986: step: 756/463, loss: 0.6279456615447998 2023-01-22 14:46:17.589403: step: 758/463, loss: 0.6672720313072205 2023-01-22 14:46:18.168503: step: 760/463, loss: 0.2894316613674164 2023-01-22 14:46:18.817470: step: 762/463, loss: 0.13172197341918945 2023-01-22 14:46:19.467916: step: 764/463, loss: 0.49930858612060547 2023-01-22 14:46:20.037273: step: 766/463, loss: 0.7457582354545593 2023-01-22 14:46:20.647564: step: 768/463, loss: 0.23861724138259888 2023-01-22 14:46:21.228845: step: 770/463, loss: 0.2063734531402588 2023-01-22 14:46:21.848186: step: 772/463, loss: 2.314769744873047 2023-01-22 14:46:22.438421: step: 774/463, loss: 1.7482213973999023 2023-01-22 14:46:23.028360: step: 776/463, loss: 0.1785968542098999 2023-01-22 14:46:23.640301: step: 778/463, loss: 0.8356150984764099 2023-01-22 14:46:24.234135: step: 780/463, loss: 0.1255253702402115 2023-01-22 14:46:24.816793: step: 782/463, loss: 0.2881535291671753 2023-01-22 14:46:25.493059: step: 784/463, loss: 0.15772071480751038 2023-01-22 14:46:26.117876: step: 786/463, loss: 0.13829894363880157 2023-01-22 14:46:26.749405: step: 788/463, loss: 0.16847145557403564 2023-01-22 14:46:27.375539: step: 790/463, loss: 0.3722435235977173 2023-01-22 14:46:27.910253: step: 792/463, loss: 
0.6596832275390625 2023-01-22 14:46:28.525578: step: 794/463, loss: 0.4496105909347534 2023-01-22 14:46:29.114472: step: 796/463, loss: 0.49357667565345764 2023-01-22 14:46:29.794813: step: 798/463, loss: 1.3976190090179443 2023-01-22 14:46:30.395244: step: 800/463, loss: 0.29794758558273315 2023-01-22 14:46:31.000363: step: 802/463, loss: 0.4289681017398834 2023-01-22 14:46:31.565635: step: 804/463, loss: 0.09841360151767731 2023-01-22 14:46:32.236139: step: 806/463, loss: 3.611863851547241 2023-01-22 14:46:32.868539: step: 808/463, loss: 0.5582469701766968 2023-01-22 14:46:33.463488: step: 810/463, loss: 1.1284466981887817 2023-01-22 14:46:34.117011: step: 812/463, loss: 0.21190449595451355 2023-01-22 14:46:34.716276: step: 814/463, loss: 0.21592657268047333 2023-01-22 14:46:35.319981: step: 816/463, loss: 0.35144075751304626 2023-01-22 14:46:35.933645: step: 818/463, loss: 0.0970849022269249 2023-01-22 14:46:36.632248: step: 820/463, loss: 0.5442741513252258 2023-01-22 14:46:37.280517: step: 822/463, loss: 0.26241594552993774 2023-01-22 14:46:37.839311: step: 824/463, loss: 0.38277679681777954 2023-01-22 14:46:38.434448: step: 826/463, loss: 0.09575918316841125 2023-01-22 14:46:39.006050: step: 828/463, loss: 0.2038913071155548 2023-01-22 14:46:39.617424: step: 830/463, loss: 0.23944729566574097 2023-01-22 14:46:40.277367: step: 832/463, loss: 0.09433344006538391 2023-01-22 14:46:40.895269: step: 834/463, loss: 0.15715545415878296 2023-01-22 14:46:41.524040: step: 836/463, loss: 0.2119702696800232 2023-01-22 14:46:42.239647: step: 838/463, loss: 0.46763312816619873 2023-01-22 14:46:42.881427: step: 840/463, loss: 0.19788500666618347 2023-01-22 14:46:43.539874: step: 842/463, loss: 0.35139769315719604 2023-01-22 14:46:44.137359: step: 844/463, loss: 0.3690606951713562 2023-01-22 14:46:44.839343: step: 846/463, loss: 0.12892365455627441 2023-01-22 14:46:45.427251: step: 848/463, loss: 0.1478589028120041 2023-01-22 14:46:46.003110: step: 850/463, loss: 
0.22224806249141693 2023-01-22 14:46:46.539372: step: 852/463, loss: 1.0827016830444336 2023-01-22 14:46:47.143493: step: 854/463, loss: 0.24283356964588165 2023-01-22 14:46:47.757821: step: 856/463, loss: 1.0577061176300049 2023-01-22 14:46:48.371586: step: 858/463, loss: 0.439443439245224 2023-01-22 14:46:48.949368: step: 860/463, loss: 0.11953823268413544 2023-01-22 14:46:49.555629: step: 862/463, loss: 0.18949565291404724 2023-01-22 14:46:50.149822: step: 864/463, loss: 0.5006502866744995 2023-01-22 14:46:50.780577: step: 866/463, loss: 0.2474052459001541 2023-01-22 14:46:51.336900: step: 868/463, loss: 0.4753151535987854 2023-01-22 14:46:51.952087: step: 870/463, loss: 0.33479684591293335 2023-01-22 14:46:52.537422: step: 872/463, loss: 0.1882864236831665 2023-01-22 14:46:53.118805: step: 874/463, loss: 0.8432376980781555 2023-01-22 14:46:53.776697: step: 876/463, loss: 0.4843776226043701 2023-01-22 14:46:54.367854: step: 878/463, loss: 0.41607335209846497 2023-01-22 14:46:54.958082: step: 880/463, loss: 0.7908477783203125 2023-01-22 14:46:55.633803: step: 882/463, loss: 1.6781855821609497 2023-01-22 14:46:56.221193: step: 884/463, loss: 0.30157893896102905 2023-01-22 14:46:56.807261: step: 886/463, loss: 0.14532431960105896 2023-01-22 14:46:57.393454: step: 888/463, loss: 0.5817785263061523 2023-01-22 14:46:58.024895: step: 890/463, loss: 0.12963341176509857 2023-01-22 14:46:58.656202: step: 892/463, loss: 0.3013290762901306 2023-01-22 14:46:59.252491: step: 894/463, loss: 0.21239760518074036 2023-01-22 14:46:59.797260: step: 896/463, loss: 0.22966769337654114 2023-01-22 14:47:00.376235: step: 898/463, loss: 0.7141059637069702 2023-01-22 14:47:01.031105: step: 900/463, loss: 3.5087928771972656 2023-01-22 14:47:01.658053: step: 902/463, loss: 0.21874621510505676 2023-01-22 14:47:02.266294: step: 904/463, loss: 0.2496662735939026 2023-01-22 14:47:02.905804: step: 906/463, loss: 0.30359697341918945 2023-01-22 14:47:03.528041: step: 908/463, loss: 
0.4091629683971405 2023-01-22 14:47:04.304504: step: 910/463, loss: 0.2425011545419693 2023-01-22 14:47:04.883871: step: 912/463, loss: 0.2608892321586609 2023-01-22 14:47:05.502583: step: 914/463, loss: 0.15927426517009735 2023-01-22 14:47:06.146346: step: 916/463, loss: 0.5035494565963745 2023-01-22 14:47:06.751652: step: 918/463, loss: 0.24742530286312103 2023-01-22 14:47:07.431343: step: 920/463, loss: 0.3260877728462219 2023-01-22 14:47:07.988766: step: 922/463, loss: 0.18224895000457764 2023-01-22 14:47:08.600879: step: 924/463, loss: 0.16027310490608215 2023-01-22 14:47:09.229664: step: 926/463, loss: 0.07010610401630402 ================================================== Loss: 0.449 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2586267407906559, 'r': 0.31211879913255625, 'f1': 0.2828660483970028}, 'combined': 0.20842761460831785, 'epoch': 2} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.31496860059607057, 'r': 0.3080975578256502, 'f1': 0.3114951930023777}, 'combined': 0.21914234683584363, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26608296792060926, 'r': 0.31253388452155056, 'f1': 0.287443904263276}, 'combined': 0.21180077156241386, 'epoch': 2} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.31941779034226536, 'r': 0.30854755140391604, 'f1': 0.31388858758001575}, 'combined': 0.22286089718181118, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.283145285477375, 'r': 0.3239783816752507, 'f1': 0.3021886852085967}, 'combined': 0.2226653469958081, 'epoch': 2} Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3221816396918299, 'r': 0.2918718687103686, 'f1': 
0.3062787068368401}, 'combined': 0.21745788185415646, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2333333333333333, 'r': 0.36666666666666664, 'f1': 0.28518518518518515}, 'combined': 0.19012345679012344, 'epoch': 2} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2875, 'r': 0.5, 'f1': 0.36507936507936506}, 'combined': 0.18253968253968253, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28703703703703703, 'r': 0.2672413793103448, 'f1': 0.27678571428571425}, 'combined': 0.18452380952380948, 'epoch': 2} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 
0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0} ****************************** Epoch: 3 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 14:49:44.096366: step: 2/463, loss: 0.6545366644859314 2023-01-22 14:49:44.699958: step: 4/463, loss: 0.17979156970977783 2023-01-22 14:49:45.327349: step: 6/463, loss: 0.25840112566947937 2023-01-22 14:49:45.930958: step: 8/463, loss: 0.6634191274642944 2023-01-22 14:49:46.542202: step: 10/463, loss: 0.11520200222730637 2023-01-22 14:49:47.134954: step: 12/463, loss: 0.23941530287265778 2023-01-22 14:49:47.763187: step: 14/463, loss: 0.5665580034255981 2023-01-22 14:49:48.428757: step: 16/463, loss: 0.3791900873184204 2023-01-22 14:49:49.059496: step: 18/463, loss: 0.17846426367759705 2023-01-22 14:49:49.644868: step: 20/463, loss: 0.32046788930892944 2023-01-22 14:49:50.253195: step: 22/463, loss: 0.6647206544876099 2023-01-22 14:49:50.827134: step: 24/463, loss: 0.14639554917812347 2023-01-22 14:49:51.427922: step: 26/463, loss: 0.6034649014472961 2023-01-22 14:49:52.044320: step: 28/463, loss: 0.269064724445343 2023-01-22 14:49:52.693667: step: 30/463, loss: 
0.10481162369251251 2023-01-22 14:49:53.290998: step: 32/463, loss: 0.1687096208333969 2023-01-22 14:49:54.006245: step: 34/463, loss: 0.1032605841755867 2023-01-22 14:49:54.688563: step: 36/463, loss: 0.07723211497068405 2023-01-22 14:49:55.292969: step: 38/463, loss: 0.2834753692150116 2023-01-22 14:49:55.926465: step: 40/463, loss: 0.10482033342123032 2023-01-22 14:49:56.615302: step: 42/463, loss: 0.18584904074668884 2023-01-22 14:49:57.270542: step: 44/463, loss: 0.30854901671409607 2023-01-22 14:49:57.853904: step: 46/463, loss: 0.2897805869579315 2023-01-22 14:49:58.443474: step: 48/463, loss: 0.056759871542453766 2023-01-22 14:49:59.043374: step: 50/463, loss: 0.1284082680940628 2023-01-22 14:49:59.615505: step: 52/463, loss: 0.19186964631080627 2023-01-22 14:50:00.246744: step: 54/463, loss: 0.580914318561554 2023-01-22 14:50:00.931345: step: 56/463, loss: 1.0117038488388062 2023-01-22 14:50:01.514918: step: 58/463, loss: 0.246613547205925 2023-01-22 14:50:02.150122: step: 60/463, loss: 0.15058791637420654 2023-01-22 14:50:02.772272: step: 62/463, loss: 1.129349708557129 2023-01-22 14:50:03.406037: step: 64/463, loss: 0.12945550680160522 2023-01-22 14:50:04.008578: step: 66/463, loss: 0.5077367424964905 2023-01-22 14:50:04.602831: step: 68/463, loss: 0.204027459025383 2023-01-22 14:50:05.187119: step: 70/463, loss: 0.09176284819841385 2023-01-22 14:50:05.798285: step: 72/463, loss: 0.2758084833621979 2023-01-22 14:50:06.438598: step: 74/463, loss: 0.15949974954128265 2023-01-22 14:50:06.996304: step: 76/463, loss: 0.37386447191238403 2023-01-22 14:50:07.699138: step: 78/463, loss: 1.3307422399520874 2023-01-22 14:50:08.334834: step: 80/463, loss: 0.23718556761741638 2023-01-22 14:50:09.063552: step: 82/463, loss: 0.1577424854040146 2023-01-22 14:50:09.693034: step: 84/463, loss: 0.18213438987731934 2023-01-22 14:50:10.303097: step: 86/463, loss: 0.4712124168872833 2023-01-22 14:50:10.904031: step: 88/463, loss: 0.3374689817428589 2023-01-22 
14:50:11.479469: step: 90/463, loss: 0.33871880173683167 2023-01-22 14:50:12.084834: step: 92/463, loss: 0.33789223432540894 2023-01-22 14:50:12.694517: step: 94/463, loss: 0.0928788110613823 2023-01-22 14:50:13.334225: step: 96/463, loss: 0.4106515645980835 2023-01-22 14:50:13.938885: step: 98/463, loss: 0.16269680857658386 2023-01-22 14:50:14.615305: step: 100/463, loss: 0.8528810143470764 2023-01-22 14:50:15.246758: step: 102/463, loss: 0.27488136291503906 2023-01-22 14:50:15.835344: step: 104/463, loss: 0.297185480594635 2023-01-22 14:50:16.447140: step: 106/463, loss: 0.17549465596675873 2023-01-22 14:50:17.093876: step: 108/463, loss: 0.06225522980093956 2023-01-22 14:50:17.649839: step: 110/463, loss: 0.12190604209899902 2023-01-22 14:50:18.291258: step: 112/463, loss: 0.13461928069591522 2023-01-22 14:50:18.909397: step: 114/463, loss: 0.621019184589386 2023-01-22 14:50:19.548577: step: 116/463, loss: 1.1715493202209473 2023-01-22 14:50:20.190506: step: 118/463, loss: 0.2886779308319092 2023-01-22 14:50:20.832574: step: 120/463, loss: 0.08416549861431122 2023-01-22 14:50:21.453668: step: 122/463, loss: 2.845693588256836 2023-01-22 14:50:22.056872: step: 124/463, loss: 0.0920800268650055 2023-01-22 14:50:22.670771: step: 126/463, loss: 0.2523557245731354 2023-01-22 14:50:23.229672: step: 128/463, loss: 0.09577610343694687 2023-01-22 14:50:23.851825: step: 130/463, loss: 0.34109729528427124 2023-01-22 14:50:24.379854: step: 132/463, loss: 0.1181560605764389 2023-01-22 14:50:24.952385: step: 134/463, loss: 0.040498554706573486 2023-01-22 14:50:25.539283: step: 136/463, loss: 0.8973309993743896 2023-01-22 14:50:26.106828: step: 138/463, loss: 0.12431257218122482 2023-01-22 14:50:26.692294: step: 140/463, loss: 0.1623237431049347 2023-01-22 14:50:27.351599: step: 142/463, loss: 0.06568959355354309 2023-01-22 14:50:27.926287: step: 144/463, loss: 0.11562041938304901 2023-01-22 14:50:28.567803: step: 146/463, loss: 0.12063157558441162 2023-01-22 14:50:29.193676: 
step: 148/463, loss: 0.19134759902954102 2023-01-22 14:50:29.779489: step: 150/463, loss: 0.06976736336946487 2023-01-22 14:50:30.374485: step: 152/463, loss: 0.12127666920423508 2023-01-22 14:50:30.957514: step: 154/463, loss: 0.1960328370332718 2023-01-22 14:50:31.525312: step: 156/463, loss: 0.14171122014522552 2023-01-22 14:50:32.057534: step: 158/463, loss: 0.1587807536125183 2023-01-22 14:50:32.777058: step: 160/463, loss: 1.2733919620513916 2023-01-22 14:50:33.403594: step: 162/463, loss: 0.1599987894296646 2023-01-22 14:50:33.992565: step: 164/463, loss: 0.5303712487220764 2023-01-22 14:50:34.633001: step: 166/463, loss: 0.6226189136505127 2023-01-22 14:50:35.174489: step: 168/463, loss: 0.15808279812335968 2023-01-22 14:50:35.807877: step: 170/463, loss: 0.19162683188915253 2023-01-22 14:50:36.457439: step: 172/463, loss: 0.19067732989788055 2023-01-22 14:50:37.154762: step: 174/463, loss: 0.5788407921791077 2023-01-22 14:50:37.754959: step: 176/463, loss: 0.2098609060049057 2023-01-22 14:50:38.430527: step: 178/463, loss: 0.13497823476791382 2023-01-22 14:50:39.096111: step: 180/463, loss: 0.7091683745384216 2023-01-22 14:50:39.717331: step: 182/463, loss: 0.20256191492080688 2023-01-22 14:50:40.328565: step: 184/463, loss: 0.29331544041633606 2023-01-22 14:50:41.030388: step: 186/463, loss: 0.2842046618461609 2023-01-22 14:50:41.602325: step: 188/463, loss: 0.10872101783752441 2023-01-22 14:50:42.193784: step: 190/463, loss: 0.14071539044380188 2023-01-22 14:50:42.795644: step: 192/463, loss: 0.2819697856903076 2023-01-22 14:50:43.379768: step: 194/463, loss: 0.3422684669494629 2023-01-22 14:50:44.037072: step: 196/463, loss: 0.14401301741600037 2023-01-22 14:50:44.611025: step: 198/463, loss: 0.15978208184242249 2023-01-22 14:50:45.213201: step: 200/463, loss: 0.3691805601119995 2023-01-22 14:50:45.857148: step: 202/463, loss: 0.6299613118171692 2023-01-22 14:50:46.479990: step: 204/463, loss: 0.48296013474464417 2023-01-22 14:50:47.147010: step: 
206/463, loss: 0.5042459964752197 2023-01-22 14:50:47.802780: step: 208/463, loss: 0.145422101020813 2023-01-22 14:50:48.407516: step: 210/463, loss: 0.1986807882785797 2023-01-22 14:50:49.023950: step: 212/463, loss: 0.5207840204238892 2023-01-22 14:50:49.627500: step: 214/463, loss: 0.5095028877258301 2023-01-22 14:50:50.155453: step: 216/463, loss: 0.05604223161935806 2023-01-22 14:50:50.771240: step: 218/463, loss: 0.19667601585388184 2023-01-22 14:50:51.391085: step: 220/463, loss: 0.21422915160655975 2023-01-22 14:50:52.018337: step: 222/463, loss: 0.4643193483352661 2023-01-22 14:50:52.681043: step: 224/463, loss: 0.22712451219558716 2023-01-22 14:50:53.288278: step: 226/463, loss: 0.331502228975296 2023-01-22 14:50:53.960885: step: 228/463, loss: 0.11260007321834564 2023-01-22 14:50:54.586015: step: 230/463, loss: 0.20540715754032135 2023-01-22 14:50:55.207219: step: 232/463, loss: 0.8827142119407654 2023-01-22 14:50:55.794487: step: 234/463, loss: 0.16686221957206726 2023-01-22 14:50:56.315125: step: 236/463, loss: 0.20217867195606232 2023-01-22 14:50:56.927445: step: 238/463, loss: 0.2255817949771881 2023-01-22 14:50:57.515306: step: 240/463, loss: 0.49457550048828125 2023-01-22 14:50:58.144125: step: 242/463, loss: 0.06601173430681229 2023-01-22 14:50:58.736001: step: 244/463, loss: 0.13363875448703766 2023-01-22 14:50:59.323670: step: 246/463, loss: 0.2466599941253662 2023-01-22 14:51:00.007795: step: 248/463, loss: 0.17611056566238403 2023-01-22 14:51:00.573550: step: 250/463, loss: 0.29267624020576477 2023-01-22 14:51:01.210547: step: 252/463, loss: 0.09851387143135071 2023-01-22 14:51:01.839515: step: 254/463, loss: 1.117849588394165 2023-01-22 14:51:02.462390: step: 256/463, loss: 0.11566440761089325 2023-01-22 14:51:03.024108: step: 258/463, loss: 0.7431011199951172 2023-01-22 14:51:03.623565: step: 260/463, loss: 0.209122434258461 2023-01-22 14:51:04.198667: step: 262/463, loss: 0.7733904123306274 2023-01-22 14:51:04.795188: step: 264/463, loss: 
0.37820279598236084 2023-01-22 14:51:05.447268: step: 266/463, loss: 0.19969801604747772 2023-01-22 14:51:06.049611: step: 268/463, loss: 0.2954059839248657 2023-01-22 14:51:06.742690: step: 270/463, loss: 0.37319833040237427 2023-01-22 14:51:07.363661: step: 272/463, loss: 1.2052875757217407 2023-01-22 14:51:07.973217: step: 274/463, loss: 0.18042978644371033 2023-01-22 14:51:08.547129: step: 276/463, loss: 0.14747416973114014 2023-01-22 14:51:09.153368: step: 278/463, loss: 0.3258588910102844 2023-01-22 14:51:09.777551: step: 280/463, loss: 0.4714135527610779 2023-01-22 14:51:10.375548: step: 282/463, loss: 1.6447973251342773 2023-01-22 14:51:10.989124: step: 284/463, loss: 0.07277882844209671 2023-01-22 14:51:11.650018: step: 286/463, loss: 0.16202419996261597 2023-01-22 14:51:12.255246: step: 288/463, loss: 1.617053508758545 2023-01-22 14:51:12.809449: step: 290/463, loss: 0.1369830220937729 2023-01-22 14:51:13.351971: step: 292/463, loss: 0.4580770432949066 2023-01-22 14:51:13.996504: step: 294/463, loss: 0.14030884206295013 2023-01-22 14:51:14.645149: step: 296/463, loss: 0.8546111583709717 2023-01-22 14:51:15.274195: step: 298/463, loss: 0.16766364872455597 2023-01-22 14:51:15.873704: step: 300/463, loss: 0.17069435119628906 2023-01-22 14:51:16.478576: step: 302/463, loss: 0.1539897322654724 2023-01-22 14:51:17.116452: step: 304/463, loss: 0.693009614944458 2023-01-22 14:51:17.757581: step: 306/463, loss: 0.30223792791366577 2023-01-22 14:51:18.332168: step: 308/463, loss: 0.03775439038872719 2023-01-22 14:51:18.938461: step: 310/463, loss: 0.5486364960670471 2023-01-22 14:51:19.569381: step: 312/463, loss: 0.03703434765338898 2023-01-22 14:51:20.202451: step: 314/463, loss: 0.1872120499610901 2023-01-22 14:51:20.792450: step: 316/463, loss: 0.05299428477883339 2023-01-22 14:51:21.387229: step: 318/463, loss: 0.2484895884990692 2023-01-22 14:51:22.024114: step: 320/463, loss: 0.29330599308013916 2023-01-22 14:51:22.592658: step: 322/463, loss: 
0.21668381989002228 2023-01-22 14:51:23.185762: step: 324/463, loss: 0.2975279688835144 2023-01-22 14:51:23.846051: step: 326/463, loss: 0.10119713097810745 2023-01-22 14:51:24.525003: step: 328/463, loss: 0.6551008224487305 2023-01-22 14:51:25.169975: step: 330/463, loss: 0.3158409297466278 2023-01-22 14:51:25.719152: step: 332/463, loss: 0.43347910046577454 2023-01-22 14:51:26.324577: step: 334/463, loss: 0.18957503139972687 2023-01-22 14:51:26.918311: step: 336/463, loss: 0.24078141152858734 2023-01-22 14:51:27.551692: step: 338/463, loss: 0.288547545671463 2023-01-22 14:51:28.138902: step: 340/463, loss: 0.1789487898349762 2023-01-22 14:51:28.693187: step: 342/463, loss: 0.3461121916770935 2023-01-22 14:51:29.316405: step: 344/463, loss: 0.5152040719985962 2023-01-22 14:51:29.972382: step: 346/463, loss: 0.3475997745990753 2023-01-22 14:51:30.590494: step: 348/463, loss: 0.49877727031707764 2023-01-22 14:51:31.312790: step: 350/463, loss: 0.41477981209754944 2023-01-22 14:51:31.937324: step: 352/463, loss: 0.5062109231948853 2023-01-22 14:51:32.580066: step: 354/463, loss: 0.07885991781949997 2023-01-22 14:51:33.226946: step: 356/463, loss: 0.08258064091205597 2023-01-22 14:51:33.842636: step: 358/463, loss: 0.16875484585762024 2023-01-22 14:51:34.462960: step: 360/463, loss: 0.1189001202583313 2023-01-22 14:51:35.162289: step: 362/463, loss: 0.3614869713783264 2023-01-22 14:51:35.738028: step: 364/463, loss: 0.18180584907531738 2023-01-22 14:51:36.407786: step: 366/463, loss: 0.4548305869102478 2023-01-22 14:51:36.997440: step: 368/463, loss: 0.23168841004371643 2023-01-22 14:51:37.641217: step: 370/463, loss: 0.3934367299079895 2023-01-22 14:51:38.161992: step: 372/463, loss: 0.5934949517250061 2023-01-22 14:51:38.795238: step: 374/463, loss: 0.5013599395751953 2023-01-22 14:51:39.428747: step: 376/463, loss: 0.3758770227432251 2023-01-22 14:51:39.986890: step: 378/463, loss: 0.11614383012056351 2023-01-22 14:51:40.595938: step: 380/463, loss: 
0.12868227064609528 2023-01-22 14:51:41.193908: step: 382/463, loss: 0.1567891389131546 2023-01-22 14:51:41.814036: step: 384/463, loss: 0.1386779099702835 2023-01-22 14:51:42.416716: step: 386/463, loss: 0.08209903538227081 2023-01-22 14:51:43.039427: step: 388/463, loss: 0.2553007900714874 2023-01-22 14:51:43.637017: step: 390/463, loss: 0.1405048817396164 2023-01-22 14:51:44.259533: step: 392/463, loss: 0.18045952916145325 2023-01-22 14:51:44.851975: step: 394/463, loss: 0.28160855174064636 2023-01-22 14:51:45.475344: step: 396/463, loss: 0.5107547640800476 2023-01-22 14:51:46.033527: step: 398/463, loss: 0.15843133628368378 2023-01-22 14:51:46.657763: step: 400/463, loss: 0.631945788860321 2023-01-22 14:51:47.258401: step: 402/463, loss: 0.5777909755706787 2023-01-22 14:51:47.883242: step: 404/463, loss: 0.22300545871257782 2023-01-22 14:51:48.403722: step: 406/463, loss: 0.95182865858078 2023-01-22 14:51:49.027361: step: 408/463, loss: 0.12243899703025818 2023-01-22 14:51:49.715173: step: 410/463, loss: 0.2345244139432907 2023-01-22 14:51:50.345734: step: 412/463, loss: 0.6049973368644714 2023-01-22 14:51:50.982279: step: 414/463, loss: 0.6340118646621704 2023-01-22 14:51:51.640301: step: 416/463, loss: 0.4180183410644531 2023-01-22 14:51:52.295809: step: 418/463, loss: 0.15347856283187866 2023-01-22 14:51:52.909784: step: 420/463, loss: 0.20315924286842346 2023-01-22 14:51:53.573314: step: 422/463, loss: 0.9023343324661255 2023-01-22 14:51:54.115854: step: 424/463, loss: 0.29596200585365295 2023-01-22 14:51:54.741199: step: 426/463, loss: 0.12977886199951172 2023-01-22 14:51:55.336264: step: 428/463, loss: 2.6178064346313477 2023-01-22 14:51:55.992967: step: 430/463, loss: 0.06159355491399765 2023-01-22 14:51:56.604438: step: 432/463, loss: 0.06741072982549667 2023-01-22 14:51:57.181721: step: 434/463, loss: 0.22832657396793365 2023-01-22 14:51:57.778986: step: 436/463, loss: 0.4763481020927429 2023-01-22 14:51:58.419608: step: 438/463, loss: 
0.2565934658050537 2023-01-22 14:51:58.997282: step: 440/463, loss: 0.11319302022457123 2023-01-22 14:51:59.591737: step: 442/463, loss: 0.11748762428760529 2023-01-22 14:52:00.212314: step: 444/463, loss: 0.140935480594635 2023-01-22 14:52:00.889791: step: 446/463, loss: 1.4546352624893188 2023-01-22 14:52:01.524375: step: 448/463, loss: 1.0627775192260742 2023-01-22 14:52:02.203249: step: 450/463, loss: 0.6280418038368225 2023-01-22 14:52:02.797278: step: 452/463, loss: 0.05184320732951164 2023-01-22 14:52:03.439588: step: 454/463, loss: 0.22440877556800842 2023-01-22 14:52:04.073727: step: 456/463, loss: 0.17771081626415253 2023-01-22 14:52:04.652901: step: 458/463, loss: 0.6793186068534851 2023-01-22 14:52:05.236516: step: 460/463, loss: 0.11881604045629501 2023-01-22 14:52:05.874222: step: 462/463, loss: 0.05558108910918236 2023-01-22 14:52:06.441875: step: 464/463, loss: 0.19928082823753357 2023-01-22 14:52:07.071811: step: 466/463, loss: 0.48838913440704346 2023-01-22 14:52:07.649861: step: 468/463, loss: 0.11537204682826996 2023-01-22 14:52:08.285449: step: 470/463, loss: 0.21145379543304443 2023-01-22 14:52:08.875136: step: 472/463, loss: 0.0439831018447876 2023-01-22 14:52:09.400622: step: 474/463, loss: 0.7699050903320312 2023-01-22 14:52:10.037259: step: 476/463, loss: 0.13018596172332764 2023-01-22 14:52:10.695022: step: 478/463, loss: 4.366349697113037 2023-01-22 14:52:11.395692: step: 480/463, loss: 0.18405383825302124 2023-01-22 14:52:11.973507: step: 482/463, loss: 0.4194681942462921 2023-01-22 14:52:12.634787: step: 484/463, loss: 0.27238285541534424 2023-01-22 14:52:13.256335: step: 486/463, loss: 0.2507387697696686 2023-01-22 14:52:13.818699: step: 488/463, loss: 0.21501445770263672 2023-01-22 14:52:14.449850: step: 490/463, loss: 0.3864595293998718 2023-01-22 14:52:15.079363: step: 492/463, loss: 0.1388407200574875 2023-01-22 14:52:15.600307: step: 494/463, loss: 0.5755683779716492 2023-01-22 14:52:16.204048: step: 496/463, loss: 
0.718276858329773 2023-01-22 14:52:16.766375: step: 498/463, loss: 0.246687114238739 2023-01-22 14:52:17.389535: step: 500/463, loss: 0.5972021222114563 2023-01-22 14:52:17.962488: step: 502/463, loss: 1.6822417974472046 2023-01-22 14:52:18.633542: step: 504/463, loss: 1.7047098875045776 2023-01-22 14:52:19.196172: step: 506/463, loss: 0.18766604363918304 2023-01-22 14:52:19.884866: step: 508/463, loss: 0.11246919631958008 2023-01-22 14:52:20.472280: step: 510/463, loss: 0.20906607806682587 2023-01-22 14:52:21.146160: step: 512/463, loss: 0.7496541142463684 2023-01-22 14:52:21.822507: step: 514/463, loss: 0.5023617744445801 2023-01-22 14:52:22.411952: step: 516/463, loss: 0.16917720437049866 2023-01-22 14:52:23.043598: step: 518/463, loss: 1.6889899969100952 2023-01-22 14:52:23.706310: step: 520/463, loss: 0.03227416053414345 2023-01-22 14:52:24.435061: step: 522/463, loss: 0.13447606563568115 2023-01-22 14:52:25.048864: step: 524/463, loss: 0.10282479226589203 2023-01-22 14:52:25.617437: step: 526/463, loss: 0.29651957750320435 2023-01-22 14:52:26.285673: step: 528/463, loss: 0.3634865880012512 2023-01-22 14:52:26.848151: step: 530/463, loss: 0.46373307704925537 2023-01-22 14:52:27.525406: step: 532/463, loss: 0.33311858773231506 2023-01-22 14:52:28.198019: step: 534/463, loss: 0.35378867387771606 2023-01-22 14:52:28.872621: step: 536/463, loss: 0.13664746284484863 2023-01-22 14:52:29.518122: step: 538/463, loss: 0.8930003046989441 2023-01-22 14:52:30.122724: step: 540/463, loss: 0.04304995387792587 2023-01-22 14:52:30.743186: step: 542/463, loss: 0.1762438416481018 2023-01-22 14:52:31.322769: step: 544/463, loss: 1.2176826000213623 2023-01-22 14:52:31.959259: step: 546/463, loss: 1.4446511268615723 2023-01-22 14:52:32.597279: step: 548/463, loss: 0.28892606496810913 2023-01-22 14:52:33.224022: step: 550/463, loss: 0.38826900720596313 2023-01-22 14:52:33.860062: step: 552/463, loss: 0.027628779411315918 2023-01-22 14:52:34.525727: step: 554/463, loss: 
0.23815253376960754 2023-01-22 14:52:35.121793: step: 556/463, loss: 0.5839550495147705 2023-01-22 14:52:35.756944: step: 558/463, loss: 0.10970509052276611 2023-01-22 14:52:36.372672: step: 560/463, loss: 0.16695813834667206 2023-01-22 14:52:36.991118: step: 562/463, loss: 0.3346114158630371 2023-01-22 14:52:37.663150: step: 564/463, loss: 0.2951495349407196 2023-01-22 14:52:38.302701: step: 566/463, loss: 0.1068841814994812 2023-01-22 14:52:38.940472: step: 568/463, loss: 0.15627521276474 2023-01-22 14:52:39.598596: step: 570/463, loss: 0.23941364884376526 2023-01-22 14:52:40.241655: step: 572/463, loss: 0.12726959586143494 2023-01-22 14:52:40.813100: step: 574/463, loss: 0.6501198410987854 2023-01-22 14:52:41.386474: step: 576/463, loss: 0.0766812413930893 2023-01-22 14:52:41.930123: step: 578/463, loss: 0.2877200245857239 2023-01-22 14:52:42.579328: step: 580/463, loss: 0.6787815093994141 2023-01-22 14:52:43.268827: step: 582/463, loss: 0.48363548517227173 2023-01-22 14:52:43.855680: step: 584/463, loss: 1.6430816650390625 2023-01-22 14:52:44.464823: step: 586/463, loss: 0.4476325511932373 2023-01-22 14:52:45.081411: step: 588/463, loss: 0.19050639867782593 2023-01-22 14:52:45.644300: step: 590/463, loss: 0.5077390670776367 2023-01-22 14:52:46.257448: step: 592/463, loss: 0.7412128448486328 2023-01-22 14:52:46.914189: step: 594/463, loss: 0.20310094952583313 2023-01-22 14:52:47.527197: step: 596/463, loss: 0.7274365425109863 2023-01-22 14:52:48.129566: step: 598/463, loss: 0.10617895424365997 2023-01-22 14:52:48.727668: step: 600/463, loss: 0.13573814928531647 2023-01-22 14:52:49.328105: step: 602/463, loss: 0.3039187490940094 2023-01-22 14:52:49.943370: step: 604/463, loss: 0.5601779222488403 2023-01-22 14:52:50.482996: step: 606/463, loss: 0.12362327426671982 2023-01-22 14:52:51.122200: step: 608/463, loss: 0.2803031802177429 2023-01-22 14:52:51.707726: step: 610/463, loss: 0.37527552247047424 2023-01-22 14:52:52.293526: step: 612/463, loss: 0.235462486743927 
2023-01-22 14:52:52.859501: step: 614/463, loss: 0.23509454727172852 2023-01-22 14:52:53.506701: step: 616/463, loss: 0.19121810793876648 2023-01-22 14:52:54.196488: step: 618/463, loss: 0.14084340631961823 2023-01-22 14:52:54.815395: step: 620/463, loss: 0.5777691006660461 2023-01-22 14:52:55.448147: step: 622/463, loss: 0.267713338136673 2023-01-22 14:52:55.979092: step: 624/463, loss: 0.32222938537597656 2023-01-22 14:52:56.593057: step: 626/463, loss: 0.8118187189102173 2023-01-22 14:52:57.196855: step: 628/463, loss: 0.1603214591741562 2023-01-22 14:52:57.846655: step: 630/463, loss: 1.0025355815887451 2023-01-22 14:52:58.472707: step: 632/463, loss: 0.7189676761627197 2023-01-22 14:52:59.230052: step: 634/463, loss: 0.5211580395698547 2023-01-22 14:52:59.936712: step: 636/463, loss: 0.9615458250045776 2023-01-22 14:53:00.607854: step: 638/463, loss: 0.216995507478714 2023-01-22 14:53:01.239793: step: 640/463, loss: 0.3903825879096985 2023-01-22 14:53:01.823134: step: 642/463, loss: 0.2409968227148056 2023-01-22 14:53:02.402372: step: 644/463, loss: 0.5431849956512451 2023-01-22 14:53:03.008863: step: 646/463, loss: 0.21793252229690552 2023-01-22 14:53:03.614179: step: 648/463, loss: 0.17400339245796204 2023-01-22 14:53:04.267018: step: 650/463, loss: 0.30775877833366394 2023-01-22 14:53:04.890514: step: 652/463, loss: 0.2765590250492096 2023-01-22 14:53:05.515528: step: 654/463, loss: 0.19231687486171722 2023-01-22 14:53:06.231835: step: 656/463, loss: 0.9946030378341675 2023-01-22 14:53:06.869584: step: 658/463, loss: 0.17124778032302856 2023-01-22 14:53:07.527995: step: 660/463, loss: 0.43337371945381165 2023-01-22 14:53:08.124377: step: 662/463, loss: 0.21242094039916992 2023-01-22 14:53:08.751561: step: 664/463, loss: 0.9970107674598694 2023-01-22 14:53:09.351620: step: 666/463, loss: 0.7288132309913635 2023-01-22 14:53:09.921040: step: 668/463, loss: 0.9435579180717468 2023-01-22 14:53:10.538832: step: 670/463, loss: 0.6452191472053528 2023-01-22 
14:53:11.200130: step: 672/463, loss: 0.20595823228359222 2023-01-22 14:53:11.868170: step: 674/463, loss: 0.14431369304656982 2023-01-22 14:53:12.519556: step: 676/463, loss: 0.4500294625759125 2023-01-22 14:53:13.131796: step: 678/463, loss: 0.2537057399749756 2023-01-22 14:53:13.737686: step: 680/463, loss: 0.3146466016769409 2023-01-22 14:53:14.319378: step: 682/463, loss: 0.912977397441864 2023-01-22 14:53:14.917159: step: 684/463, loss: 0.17992167174816132 2023-01-22 14:53:15.510440: step: 686/463, loss: 0.5743591785430908 2023-01-22 14:53:16.162093: step: 688/463, loss: 0.21177223324775696 2023-01-22 14:53:16.798423: step: 690/463, loss: 0.12650960683822632 2023-01-22 14:53:17.373153: step: 692/463, loss: 0.12653544545173645 2023-01-22 14:53:18.014286: step: 694/463, loss: 0.2936026155948639 2023-01-22 14:53:18.570776: step: 696/463, loss: 0.935749888420105 2023-01-22 14:53:19.242622: step: 698/463, loss: 0.6374924778938293 2023-01-22 14:53:19.874616: step: 700/463, loss: 0.19528654217720032 2023-01-22 14:53:20.491568: step: 702/463, loss: 0.21958163380622864 2023-01-22 14:53:21.193247: step: 704/463, loss: 0.1713462918996811 2023-01-22 14:53:21.814073: step: 706/463, loss: 0.4690965712070465 2023-01-22 14:53:22.406438: step: 708/463, loss: 0.7133401036262512 2023-01-22 14:53:23.004658: step: 710/463, loss: 0.2320057898759842 2023-01-22 14:53:23.627480: step: 712/463, loss: 0.3132484257221222 2023-01-22 14:53:24.307609: step: 714/463, loss: 0.3275342583656311 2023-01-22 14:53:24.874853: step: 716/463, loss: 0.7792859673500061 2023-01-22 14:53:25.466693: step: 718/463, loss: 0.7050125598907471 2023-01-22 14:53:26.086821: step: 720/463, loss: 0.4849129319190979 2023-01-22 14:53:26.736087: step: 722/463, loss: 0.24752256274223328 2023-01-22 14:53:27.358663: step: 724/463, loss: 0.646270215511322 2023-01-22 14:53:27.982944: step: 726/463, loss: 0.15952590107917786 2023-01-22 14:53:28.595088: step: 728/463, loss: 0.2958294153213501 2023-01-22 14:53:29.242107: 
step: 730/463, loss: 1.0949386358261108 2023-01-22 14:53:29.919638: step: 732/463, loss: 0.4588703513145447 2023-01-22 14:53:30.538639: step: 734/463, loss: 1.3293707370758057 2023-01-22 14:53:31.157847: step: 736/463, loss: 1.0125830173492432 2023-01-22 14:53:31.776868: step: 738/463, loss: 0.1513443887233734 2023-01-22 14:53:32.356937: step: 740/463, loss: 0.4311617612838745 2023-01-22 14:53:33.092251: step: 742/463, loss: 0.29618287086486816 2023-01-22 14:53:33.706263: step: 744/463, loss: 0.18020635843276978 2023-01-22 14:53:34.326022: step: 746/463, loss: 0.42275527119636536 2023-01-22 14:53:34.878605: step: 748/463, loss: 0.8086273074150085 2023-01-22 14:53:35.594413: step: 750/463, loss: 0.1390872299671173 2023-01-22 14:53:36.182315: step: 752/463, loss: 0.3222130537033081 2023-01-22 14:53:36.732758: step: 754/463, loss: 0.10360898822546005 2023-01-22 14:53:37.355700: step: 756/463, loss: 0.23962368071079254 2023-01-22 14:53:37.949039: step: 758/463, loss: 0.3460545539855957 2023-01-22 14:53:38.522335: step: 760/463, loss: 1.4683297872543335 2023-01-22 14:53:39.159067: step: 762/463, loss: 0.2785496413707733 2023-01-22 14:53:39.765398: step: 764/463, loss: 0.24739758670330048 2023-01-22 14:53:40.324276: step: 766/463, loss: 0.5571380853652954 2023-01-22 14:53:40.923316: step: 768/463, loss: 0.1298704445362091 2023-01-22 14:53:41.494037: step: 770/463, loss: 1.0143673419952393 2023-01-22 14:53:42.134212: step: 772/463, loss: 0.4673198163509369 2023-01-22 14:53:42.743550: step: 774/463, loss: 0.15281765162944794 2023-01-22 14:53:43.310035: step: 776/463, loss: 0.35119009017944336 2023-01-22 14:53:43.958785: step: 778/463, loss: 0.4387505054473877 2023-01-22 14:53:44.531667: step: 780/463, loss: 0.44158318638801575 2023-01-22 14:53:45.135791: step: 782/463, loss: 1.100198745727539 2023-01-22 14:53:45.666796: step: 784/463, loss: 0.3104798197746277 2023-01-22 14:53:46.281977: step: 786/463, loss: 0.12598446011543274 2023-01-22 14:53:46.863586: step: 788/463, 
loss: 0.7392730712890625 2023-01-22 14:53:47.437949: step: 790/463, loss: 0.8324344158172607 2023-01-22 14:53:48.076713: step: 792/463, loss: 0.6978740692138672 2023-01-22 14:53:48.826568: step: 794/463, loss: 0.4218312203884125 2023-01-22 14:53:49.469754: step: 796/463, loss: 0.21656791865825653 2023-01-22 14:53:50.057433: step: 798/463, loss: 0.15305152535438538 2023-01-22 14:53:50.723422: step: 800/463, loss: 1.130376935005188 2023-01-22 14:53:51.318549: step: 802/463, loss: 0.2740732729434967 2023-01-22 14:53:51.956517: step: 804/463, loss: 0.6058177351951599 2023-01-22 14:53:52.624747: step: 806/463, loss: 0.5034443736076355 2023-01-22 14:53:53.238518: step: 808/463, loss: 0.4664181172847748 2023-01-22 14:53:53.803476: step: 810/463, loss: 0.1484043449163437 2023-01-22 14:53:54.384257: step: 812/463, loss: 0.14245188236236572 2023-01-22 14:53:54.981336: step: 814/463, loss: 0.0927780345082283 2023-01-22 14:53:55.628254: step: 816/463, loss: 0.19126732647418976 2023-01-22 14:53:56.168222: step: 818/463, loss: 0.06832949817180634 2023-01-22 14:53:56.734807: step: 820/463, loss: 0.01584739051759243 2023-01-22 14:53:57.325529: step: 822/463, loss: 1.5207877159118652 2023-01-22 14:53:57.867595: step: 824/463, loss: 0.14813609421253204 2023-01-22 14:53:58.501224: step: 826/463, loss: 2.208134174346924 2023-01-22 14:53:59.106008: step: 828/463, loss: 0.1445113718509674 2023-01-22 14:53:59.711524: step: 830/463, loss: 0.3370588719844818 2023-01-22 14:54:00.289881: step: 832/463, loss: 0.20831598341464996 2023-01-22 14:54:00.822935: step: 834/463, loss: 0.6522635221481323 2023-01-22 14:54:01.383592: step: 836/463, loss: 0.0827510803937912 2023-01-22 14:54:02.094941: step: 838/463, loss: 0.19872835278511047 2023-01-22 14:54:02.692500: step: 840/463, loss: 2.4889283180236816 2023-01-22 14:54:03.299244: step: 842/463, loss: 0.4685746729373932 2023-01-22 14:54:03.889890: step: 844/463, loss: 0.19143787026405334 2023-01-22 14:54:04.498828: step: 846/463, loss: 
0.3668155074119568 2023-01-22 14:54:05.034626: step: 848/463, loss: 0.379412978887558 2023-01-22 14:54:05.646449: step: 850/463, loss: 0.5774029493331909 2023-01-22 14:54:06.198820: step: 852/463, loss: 0.36693209409713745 2023-01-22 14:54:06.830278: step: 854/463, loss: 0.8021453022956848 2023-01-22 14:54:07.453606: step: 856/463, loss: 0.18997842073440552 2023-01-22 14:54:08.000018: step: 858/463, loss: 0.12272963672876358 2023-01-22 14:54:08.615865: step: 860/463, loss: 0.2306232899427414 2023-01-22 14:54:09.215781: step: 862/463, loss: 0.8258456587791443 2023-01-22 14:54:09.833694: step: 864/463, loss: 1.0035548210144043 2023-01-22 14:54:10.458591: step: 866/463, loss: 1.2027990818023682 2023-01-22 14:54:11.048013: step: 868/463, loss: 0.17434825003147125 2023-01-22 14:54:11.699168: step: 870/463, loss: 0.08343439549207687 2023-01-22 14:54:12.391521: step: 872/463, loss: 0.5718820691108704 2023-01-22 14:54:12.983646: step: 874/463, loss: 0.1297760307788849 2023-01-22 14:54:13.559731: step: 876/463, loss: 0.1723289042711258 2023-01-22 14:54:14.186948: step: 878/463, loss: 0.5221850275993347 2023-01-22 14:54:14.794652: step: 880/463, loss: 0.7974631786346436 2023-01-22 14:54:15.417176: step: 882/463, loss: 0.553134560585022 2023-01-22 14:54:16.029418: step: 884/463, loss: 1.0612715482711792 2023-01-22 14:54:16.621208: step: 886/463, loss: 0.2208617478609085 2023-01-22 14:54:17.188582: step: 888/463, loss: 0.10531525313854218 2023-01-22 14:54:17.800134: step: 890/463, loss: 0.6264318823814392 2023-01-22 14:54:18.433677: step: 892/463, loss: 0.49431949853897095 2023-01-22 14:54:19.058939: step: 894/463, loss: 0.2178771197795868 2023-01-22 14:54:19.696349: step: 896/463, loss: 0.14518801867961884 2023-01-22 14:54:20.288935: step: 898/463, loss: 0.2855237126350403 2023-01-22 14:54:20.887398: step: 900/463, loss: 1.113693356513977 2023-01-22 14:54:21.433177: step: 902/463, loss: 0.4754798114299774 2023-01-22 14:54:22.037638: step: 904/463, loss: 0.08716350793838501 
2023-01-22 14:54:22.621094: step: 906/463, loss: 0.2030053436756134 2023-01-22 14:54:23.249679: step: 908/463, loss: 0.6809872388839722 2023-01-22 14:54:23.923736: step: 910/463, loss: 0.35770249366760254 2023-01-22 14:54:24.525470: step: 912/463, loss: 1.2180308103561401 2023-01-22 14:54:25.132225: step: 914/463, loss: 0.1960585117340088 2023-01-22 14:54:25.802678: step: 916/463, loss: 0.345977246761322 2023-01-22 14:54:26.481713: step: 918/463, loss: 1.1675078868865967 2023-01-22 14:54:27.115349: step: 920/463, loss: 2.5344204902648926 2023-01-22 14:54:27.731715: step: 922/463, loss: 0.3908182680606842 2023-01-22 14:54:28.322574: step: 924/463, loss: 0.4368446171283722 2023-01-22 14:54:28.953264: step: 926/463, loss: 0.14364628493785858
==================================================
Loss: 0.417
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28833017367906066, 'r': 0.31951579018704257, 'f1': 0.30312299087051564}, 'combined': 0.22335378274669573, 'epoch': 3}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.337231890926726, 'r': 0.3031264860901957, 'f1': 0.3192709637699307}, 'combined': 0.22461273833060452, 'epoch': 3}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2894932844932845, 'r': 0.3213540254811602, 'f1': 0.3045927543679342}, 'combined': 0.22443676637637255, 'epoch': 3}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3399848515292938, 'r': 0.3020440834423282, 'f1': 0.3198934106263624}, 'combined': 0.2271243215447173, 'epoch': 3}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3029068676507381, 'r': 0.328197004608295, 'f1': 0.3150452120739007}, 'combined': 0.232138577317611, 'epoch': 3}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34829808414918945, 'r': 0.29207195363644944, 'f1': 0.31771661971273946}, 'combined': 0.225578799996045, 'epoch': 3}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2783687943262411, 'r': 0.37380952380952376, 'f1': 0.31910569105691056}, 'combined': 0.2127371273712737, 'epoch': 3}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2833333333333333, 'r': 0.3695652173913043, 'f1': 0.32075471698113206}, 'combined': 0.16037735849056603, 'epoch': 3}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3125, 'r': 0.1724137931034483, 'f1': 0.22222222222222224}, 'combined': 0.14814814814814814, 'epoch': 3}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
******************************
Epoch: 4
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:57:03.459217: step: 2/463, loss: 0.28018978238105774 2023-01-22 14:57:04.137382: step: 4/463, loss: 0.08867963403463364 2023-01-22 14:57:04.701525: step: 6/463, loss: 0.10725268721580505 2023-01-22 14:57:05.325886: step: 8/463, loss: 0.11564333736896515 2023-01-22 14:57:05.990538: step: 10/463, loss: 0.6741631031036377 2023-01-22 14:57:06.656514: step: 12/463, loss: 0.10417842864990234 2023-01-22 14:57:07.261834: step: 14/463, loss: 0.44783318042755127 2023-01-22 14:57:07.824104: step: 16/463, loss: 0.060294680297374725 2023-01-22 14:57:08.447184: step: 18/463, loss: 0.2915246784687042 2023-01-22 14:57:08.980682: step: 20/463, loss: 0.10010324418544769 2023-01-22 14:57:09.589559: step: 22/463, loss: 0.20422087609767914 2023-01-22 14:57:10.247044: step: 24/463, loss: 0.7192528247833252 2023-01-22 14:57:10.836305: step: 26/463, loss: 0.0965629518032074
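Each evaluation entry in this log reports template and slot precision/recall/F1 plus a 'combined' score. The logged numbers are consistent with F1 being the usual harmonic mean of P and R, and with combined = template F1 × slot F1. A minimal sketch checking this against the epoch-3 "Dev Chinese" entry (the formulas are an inference from the logged values, not taken from train.py):

```python
import math

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Values copied from the epoch-3 "Dev Chinese" line of this log.
template = {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}
slot = {'p': 0.28833017367906066, 'r': 0.31951579018704257, 'f1': 0.30312299087051564}
combined = 0.22335378274669573

# F1 matches the harmonic mean of the logged P and R...
assert math.isclose(f1(template['p'], template['r']), template['f1'], rel_tol=1e-6)
assert math.isclose(f1(slot['p'], slot['r']), slot['f1'], rel_tol=1e-6)
# ...and 'combined' matches the product of the two F1 scores.
assert math.isclose(template['f1'] * slot['f1'], combined, rel_tol=1e-6)
```

The same relations hold for the other language/split entries in the dumps above.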
2023-01-22 14:57:11.415399: step: 28/463, loss: 0.4942958950996399 2023-01-22 14:57:12.025429: step: 30/463, loss: 0.17007732391357422 2023-01-22 14:57:12.750981: step: 32/463, loss: 0.23528717458248138 2023-01-22 14:57:13.315724: step: 34/463, loss: 0.20931103825569153 2023-01-22 14:57:13.892264: step: 36/463, loss: 0.11351226270198822 2023-01-22 14:57:14.541544: step: 38/463, loss: 0.18377497792243958 2023-01-22 14:57:15.147931: step: 40/463, loss: 0.06843212246894836 2023-01-22 14:57:15.789110: step: 42/463, loss: 0.4377027153968811 2023-01-22 14:57:16.428708: step: 44/463, loss: 0.3962370455265045 2023-01-22 14:57:16.992074: step: 46/463, loss: 1.4944523572921753 2023-01-22 14:57:17.605308: step: 48/463, loss: 0.8883061408996582 2023-01-22 14:57:18.192007: step: 50/463, loss: 0.2700344920158386 2023-01-22 14:57:18.899921: step: 52/463, loss: 0.6041721701622009 2023-01-22 14:57:19.523768: step: 54/463, loss: 0.0889386236667633 2023-01-22 14:57:20.149896: step: 56/463, loss: 0.24817270040512085 2023-01-22 14:57:20.743017: step: 58/463, loss: 0.18336425721645355 2023-01-22 14:57:21.338323: step: 60/463, loss: 0.2627487778663635 2023-01-22 14:57:21.915337: step: 62/463, loss: 0.29467588663101196 2023-01-22 14:57:22.528306: step: 64/463, loss: 0.12842415273189545 2023-01-22 14:57:23.203371: step: 66/463, loss: 0.14607280492782593 2023-01-22 14:57:23.796039: step: 68/463, loss: 0.7639791369438171 2023-01-22 14:57:24.399557: step: 70/463, loss: 2.5741963386535645 2023-01-22 14:57:24.974828: step: 72/463, loss: 0.20956099033355713 2023-01-22 14:57:25.632204: step: 74/463, loss: 0.3968564569950104 2023-01-22 14:57:26.317862: step: 76/463, loss: 1.4481512308120728 2023-01-22 14:57:26.962826: step: 78/463, loss: 0.29436033964157104 2023-01-22 14:57:27.578439: step: 80/463, loss: 0.18586154282093048 2023-01-22 14:57:28.191475: step: 82/463, loss: 0.4711255729198456 2023-01-22 14:57:28.751692: step: 84/463, loss: 0.11521046608686447 2023-01-22 14:57:29.323177: step: 86/463, 
loss: 0.1547417789697647 2023-01-22 14:57:29.958968: step: 88/463, loss: 0.29981720447540283 2023-01-22 14:57:30.546529: step: 90/463, loss: 0.13780200481414795 2023-01-22 14:57:31.127650: step: 92/463, loss: 0.5757108926773071 2023-01-22 14:57:31.800460: step: 94/463, loss: 0.34691524505615234 2023-01-22 14:57:32.390538: step: 96/463, loss: 0.22615396976470947 2023-01-22 14:57:32.950503: step: 98/463, loss: 0.1604449599981308 2023-01-22 14:57:33.568242: step: 100/463, loss: 0.29706767201423645 2023-01-22 14:57:34.113172: step: 102/463, loss: 0.09050627052783966 2023-01-22 14:57:34.710547: step: 104/463, loss: 0.2576325833797455 2023-01-22 14:57:35.358195: step: 106/463, loss: 0.16945090889930725 2023-01-22 14:57:35.934757: step: 108/463, loss: 0.35895398259162903 2023-01-22 14:57:36.530536: step: 110/463, loss: 0.46203070878982544 2023-01-22 14:57:37.140802: step: 112/463, loss: 0.44922515749931335 2023-01-22 14:57:37.743612: step: 114/463, loss: 0.35423481464385986 2023-01-22 14:57:38.382950: step: 116/463, loss: 0.366700142621994 2023-01-22 14:57:39.077807: step: 118/463, loss: 0.21714362502098083 2023-01-22 14:57:39.730884: step: 120/463, loss: 0.23851676285266876 2023-01-22 14:57:40.456751: step: 122/463, loss: 0.15793676674365997 2023-01-22 14:57:41.067138: step: 124/463, loss: 0.15200673043727875 2023-01-22 14:57:41.651278: step: 126/463, loss: 0.473976731300354 2023-01-22 14:57:42.373635: step: 128/463, loss: 0.13739386200904846 2023-01-22 14:57:43.030924: step: 130/463, loss: 1.7287712097167969 2023-01-22 14:57:43.644900: step: 132/463, loss: 0.6252063512802124 2023-01-22 14:57:44.266687: step: 134/463, loss: 1.0874000787734985 2023-01-22 14:57:44.955529: step: 136/463, loss: 0.11914146691560745 2023-01-22 14:57:45.639624: step: 138/463, loss: 0.12471066415309906 2023-01-22 14:57:46.218894: step: 140/463, loss: 0.595917820930481 2023-01-22 14:57:46.837222: step: 142/463, loss: 0.1939898133277893 2023-01-22 14:57:47.435752: step: 144/463, loss: 
0.15319624543190002 2023-01-22 14:57:48.087868: step: 146/463, loss: 0.14383116364479065 2023-01-22 14:57:48.669812: step: 148/463, loss: 0.8891110420227051 2023-01-22 14:57:49.333201: step: 150/463, loss: 0.19467127323150635 2023-01-22 14:57:49.903156: step: 152/463, loss: 0.18711240589618683 2023-01-22 14:57:50.477376: step: 154/463, loss: 0.7139452695846558 2023-01-22 14:57:51.138664: step: 156/463, loss: 0.14910583198070526 2023-01-22 14:57:51.750567: step: 158/463, loss: 0.6016501188278198 2023-01-22 14:57:52.377812: step: 160/463, loss: 0.9090298414230347 2023-01-22 14:57:52.975988: step: 162/463, loss: 0.22071878612041473 2023-01-22 14:57:53.623941: step: 164/463, loss: 0.5568557381629944 2023-01-22 14:57:54.211547: step: 166/463, loss: 1.0371307134628296 2023-01-22 14:57:54.782337: step: 168/463, loss: 0.3473025858402252 2023-01-22 14:57:55.389405: step: 170/463, loss: 0.20155216753482819 2023-01-22 14:57:56.002256: step: 172/463, loss: 0.18067775666713715 2023-01-22 14:57:56.641733: step: 174/463, loss: 0.1119653731584549 2023-01-22 14:57:57.273582: step: 176/463, loss: 0.1830419898033142 2023-01-22 14:57:57.885434: step: 178/463, loss: 0.4454245865345001 2023-01-22 14:57:58.501814: step: 180/463, loss: 0.05199163779616356 2023-01-22 14:57:59.098436: step: 182/463, loss: 1.2615498304367065 2023-01-22 14:57:59.743216: step: 184/463, loss: 0.15721569955348969 2023-01-22 14:58:00.318206: step: 186/463, loss: 0.1553230583667755 2023-01-22 14:58:00.939824: step: 188/463, loss: 0.27800455689430237 2023-01-22 14:58:01.533577: step: 190/463, loss: 0.49579891562461853 2023-01-22 14:58:02.090559: step: 192/463, loss: 1.173423409461975 2023-01-22 14:58:02.681867: step: 194/463, loss: 0.22896583378314972 2023-01-22 14:58:03.324257: step: 196/463, loss: 0.23491565883159637 2023-01-22 14:58:03.943701: step: 198/463, loss: 0.25036510825157166 2023-01-22 14:58:04.552961: step: 200/463, loss: 0.5655305981636047 2023-01-22 14:58:05.188373: step: 202/463, loss: 
0.3819665312767029 2023-01-22 14:58:05.855475: step: 204/463, loss: 0.5717543959617615 2023-01-22 14:58:06.448575: step: 206/463, loss: 0.2607552409172058 2023-01-22 14:58:07.038950: step: 208/463, loss: 0.11372219771146774 2023-01-22 14:58:07.635703: step: 210/463, loss: 0.3368605375289917 2023-01-22 14:58:08.246010: step: 212/463, loss: 0.12473717331886292 2023-01-22 14:58:08.827729: step: 214/463, loss: 0.19164322316646576 2023-01-22 14:58:09.383239: step: 216/463, loss: 0.09972894191741943 2023-01-22 14:58:10.080103: step: 218/463, loss: 0.845991849899292 2023-01-22 14:58:10.702237: step: 220/463, loss: 0.24577371776103973 2023-01-22 14:58:11.341999: step: 222/463, loss: 0.13573606312274933 2023-01-22 14:58:11.969349: step: 224/463, loss: 0.16246303915977478 2023-01-22 14:58:12.647728: step: 226/463, loss: 1.0864205360412598 2023-01-22 14:58:13.196272: step: 228/463, loss: 0.08607133477926254 2023-01-22 14:58:13.776813: step: 230/463, loss: 0.4082280993461609 2023-01-22 14:58:14.356539: step: 232/463, loss: 0.21475334465503693 2023-01-22 14:58:14.901426: step: 234/463, loss: 0.23794853687286377 2023-01-22 14:58:15.471667: step: 236/463, loss: 0.28860414028167725 2023-01-22 14:58:16.041181: step: 238/463, loss: 0.08671560883522034 2023-01-22 14:58:16.732062: step: 240/463, loss: 0.21967609226703644 2023-01-22 14:58:17.432833: step: 242/463, loss: 0.3845984935760498 2023-01-22 14:58:18.082029: step: 244/463, loss: 0.19625811278820038 2023-01-22 14:58:18.626781: step: 246/463, loss: 0.4338453710079193 2023-01-22 14:58:19.321982: step: 248/463, loss: 0.8202825784683228 2023-01-22 14:58:19.924352: step: 250/463, loss: 0.19100452959537506 2023-01-22 14:58:20.529480: step: 252/463, loss: 1.0280678272247314 2023-01-22 14:58:21.166919: step: 254/463, loss: 0.17735689878463745 2023-01-22 14:58:21.833395: step: 256/463, loss: 0.08322380483150482 2023-01-22 14:58:22.469104: step: 258/463, loss: 0.09070700407028198 2023-01-22 14:58:23.036710: step: 260/463, loss: 
0.4565905034542084 2023-01-22 14:58:23.621332: step: 262/463, loss: 0.4441155791282654 2023-01-22 14:58:24.248116: step: 264/463, loss: 0.06974222511053085 2023-01-22 14:58:24.900890: step: 266/463, loss: 0.5844534039497375 2023-01-22 14:58:25.523256: step: 268/463, loss: 0.0895291417837143 2023-01-22 14:58:26.173759: step: 270/463, loss: 0.09360255300998688 2023-01-22 14:58:26.733018: step: 272/463, loss: 0.3466280400753021 2023-01-22 14:58:27.326937: step: 274/463, loss: 0.2847074866294861 2023-01-22 14:58:27.939978: step: 276/463, loss: 0.49218201637268066 2023-01-22 14:58:28.573094: step: 278/463, loss: 0.09798160195350647 2023-01-22 14:58:29.145232: step: 280/463, loss: 0.08881958574056625 2023-01-22 14:58:29.737463: step: 282/463, loss: 0.278272420167923 2023-01-22 14:58:30.332668: step: 284/463, loss: 0.4639943242073059 2023-01-22 14:58:30.902162: step: 286/463, loss: 0.2576824724674225 2023-01-22 14:58:31.545963: step: 288/463, loss: 0.3048902451992035 2023-01-22 14:58:32.205111: step: 290/463, loss: 0.9544494152069092 2023-01-22 14:58:32.786194: step: 292/463, loss: 0.21287332475185394 2023-01-22 14:58:33.355832: step: 294/463, loss: 0.22078973054885864 2023-01-22 14:58:33.991415: step: 296/463, loss: 0.31360846757888794 2023-01-22 14:58:34.594820: step: 298/463, loss: 0.293476939201355 2023-01-22 14:58:35.155679: step: 300/463, loss: 0.11944038420915604 2023-01-22 14:58:35.769659: step: 302/463, loss: 0.33592134714126587 2023-01-22 14:58:36.367317: step: 304/463, loss: 0.1576250195503235 2023-01-22 14:58:37.014009: step: 306/463, loss: 0.1968565136194229 2023-01-22 14:58:37.587139: step: 308/463, loss: 0.09237449616193771 2023-01-22 14:58:38.202386: step: 310/463, loss: 0.38816022872924805 2023-01-22 14:58:38.779880: step: 312/463, loss: 0.2534879744052887 2023-01-22 14:58:39.421009: step: 314/463, loss: 0.09377417713403702 2023-01-22 14:58:40.024631: step: 316/463, loss: 0.11310450732707977 2023-01-22 14:58:40.595662: step: 318/463, loss: 
0.14047051966190338 2023-01-22 14:58:41.194252: step: 320/463, loss: 0.19853508472442627 2023-01-22 14:58:41.736134: step: 322/463, loss: 0.7994641065597534 2023-01-22 14:58:42.356583: step: 324/463, loss: 0.08601376414299011 2023-01-22 14:58:42.959137: step: 326/463, loss: 0.16427846252918243 2023-01-22 14:58:43.626824: step: 328/463, loss: 0.2599063515663147 2023-01-22 14:58:44.232351: step: 330/463, loss: 0.327951043844223 2023-01-22 14:58:44.822041: step: 332/463, loss: 0.11033209413290024 2023-01-22 14:58:45.427778: step: 334/463, loss: 0.15873004496097565 2023-01-22 14:58:46.076408: step: 336/463, loss: 0.19789578020572662 2023-01-22 14:58:46.641643: step: 338/463, loss: 0.26895442605018616 2023-01-22 14:58:47.308724: step: 340/463, loss: 0.3992057740688324 2023-01-22 14:58:47.840210: step: 342/463, loss: 0.2031499445438385 2023-01-22 14:58:48.444178: step: 344/463, loss: 0.35654348134994507 2023-01-22 14:58:49.078078: step: 346/463, loss: 0.14467209577560425 2023-01-22 14:58:49.757112: step: 348/463, loss: 0.5794543027877808 2023-01-22 14:58:50.342492: step: 350/463, loss: 0.06388884037733078 2023-01-22 14:58:50.949132: step: 352/463, loss: 0.2286040484905243 2023-01-22 14:58:51.620039: step: 354/463, loss: 0.07589076459407806 2023-01-22 14:58:52.217297: step: 356/463, loss: 0.8244566917419434 2023-01-22 14:58:52.789369: step: 358/463, loss: 0.7021282315254211 2023-01-22 14:58:53.389077: step: 360/463, loss: 0.3721470832824707 2023-01-22 14:58:53.981610: step: 362/463, loss: 0.07220583409070969 2023-01-22 14:58:54.567817: step: 364/463, loss: 0.5398956537246704 2023-01-22 14:58:55.173106: step: 366/463, loss: 0.3289538621902466 2023-01-22 14:58:55.757757: step: 368/463, loss: 0.3798968195915222 2023-01-22 14:58:56.522248: step: 370/463, loss: 0.05315621942281723 2023-01-22 14:58:57.147364: step: 372/463, loss: 0.4166836142539978 2023-01-22 14:58:57.786836: step: 374/463, loss: 0.5072207450866699 2023-01-22 14:58:58.401987: step: 376/463, loss: 
0.15064987540245056 2023-01-22 14:58:58.982044: step: 378/463, loss: 0.16925546526908875 2023-01-22 14:58:59.582572: step: 380/463, loss: 0.1383422613143921 2023-01-22 14:59:00.162671: step: 382/463, loss: 0.06041271612048149 2023-01-22 14:59:00.800018: step: 384/463, loss: 0.2947656810283661 2023-01-22 14:59:01.426162: step: 386/463, loss: 0.07814939320087433 2023-01-22 14:59:02.055531: step: 388/463, loss: 0.12006226927042007 2023-01-22 14:59:02.684629: step: 390/463, loss: 0.23326686024665833 2023-01-22 14:59:03.289525: step: 392/463, loss: 0.7897745966911316 2023-01-22 14:59:03.912063: step: 394/463, loss: 0.2859964966773987 2023-01-22 14:59:04.528727: step: 396/463, loss: 0.09860372543334961 2023-01-22 14:59:05.229698: step: 398/463, loss: 0.29780352115631104 2023-01-22 14:59:05.868233: step: 400/463, loss: 0.3475703299045563 2023-01-22 14:59:06.517538: step: 402/463, loss: 0.15948626399040222 2023-01-22 14:59:07.181610: step: 404/463, loss: 0.19314873218536377 2023-01-22 14:59:07.801753: step: 406/463, loss: 0.07498034834861755 2023-01-22 14:59:08.387426: step: 408/463, loss: 0.18619516491889954 2023-01-22 14:59:09.003817: step: 410/463, loss: 0.4005123972892761 2023-01-22 14:59:09.583253: step: 412/463, loss: 0.5407426357269287 2023-01-22 14:59:10.310794: step: 414/463, loss: 0.06398945301771164 2023-01-22 14:59:10.887858: step: 416/463, loss: 0.2890128493309021 2023-01-22 14:59:11.493593: step: 418/463, loss: 0.6001293659210205 2023-01-22 14:59:12.056309: step: 420/463, loss: 0.2903147041797638 2023-01-22 14:59:12.561312: step: 422/463, loss: 0.13867701590061188 2023-01-22 14:59:13.174989: step: 424/463, loss: 0.1688997745513916 2023-01-22 14:59:13.837777: step: 426/463, loss: 0.9968193769454956 2023-01-22 14:59:14.386824: step: 428/463, loss: 0.037324026226997375 2023-01-22 14:59:14.997505: step: 430/463, loss: 2.7202341556549072 2023-01-22 14:59:15.587072: step: 432/463, loss: 0.37451690435409546 2023-01-22 14:59:16.204283: step: 434/463, loss: 
0.17314845323562622 2023-01-22 14:59:16.831884: step: 436/463, loss: 0.8154723048210144 2023-01-22 14:59:17.469086: step: 438/463, loss: 0.16151073575019836 2023-01-22 14:59:18.081989: step: 440/463, loss: 0.2452029436826706 2023-01-22 14:59:18.708850: step: 442/463, loss: 0.13354498147964478 2023-01-22 14:59:19.337306: step: 444/463, loss: 0.47331634163856506 2023-01-22 14:59:20.005492: step: 446/463, loss: 0.11509261280298233 2023-01-22 14:59:20.620032: step: 448/463, loss: 0.2748642861843109 2023-01-22 14:59:21.205957: step: 450/463, loss: 0.6040314435958862 2023-01-22 14:59:21.832598: step: 452/463, loss: 0.21756009757518768 2023-01-22 14:59:22.493666: step: 454/463, loss: 0.12236445397138596 2023-01-22 14:59:23.074926: step: 456/463, loss: 0.38573142886161804 2023-01-22 14:59:23.727501: step: 458/463, loss: 0.11379775404930115 2023-01-22 14:59:24.366530: step: 460/463, loss: 0.1585683971643448 2023-01-22 14:59:25.044047: step: 462/463, loss: 0.25073957443237305 2023-01-22 14:59:25.609057: step: 464/463, loss: 0.03477708250284195 2023-01-22 14:59:26.273270: step: 466/463, loss: 0.1657935082912445 2023-01-22 14:59:26.822444: step: 468/463, loss: 0.04809322580695152 2023-01-22 14:59:27.371325: step: 470/463, loss: 0.06740041822195053 2023-01-22 14:59:27.979936: step: 472/463, loss: 0.14566724002361298 2023-01-22 14:59:28.610892: step: 474/463, loss: 0.15663543343544006 2023-01-22 14:59:29.225309: step: 476/463, loss: 0.6585956811904907 2023-01-22 14:59:29.919968: step: 478/463, loss: 0.4346095621585846 2023-01-22 14:59:30.563865: step: 480/463, loss: 0.08026498556137085 2023-01-22 14:59:31.136103: step: 482/463, loss: 0.9349294304847717 2023-01-22 14:59:31.702139: step: 484/463, loss: 0.4936347007751465 2023-01-22 14:59:32.336828: step: 486/463, loss: 0.2628043293952942 2023-01-22 14:59:32.916926: step: 488/463, loss: 0.07908711582422256 2023-01-22 14:59:33.538214: step: 490/463, loss: 1.6746914386749268 2023-01-22 14:59:34.165164: step: 492/463, loss: 
0.1065765842795372 2023-01-22 14:59:34.740283: step: 494/463, loss: 0.4884894788265228 2023-01-22 14:59:35.311199: step: 496/463, loss: 0.31176263093948364 2023-01-22 14:59:35.947572: step: 498/463, loss: 0.20677027106285095 2023-01-22 14:59:36.625744: step: 500/463, loss: 0.2779003381729126 2023-01-22 14:59:37.198327: step: 502/463, loss: 0.21184125542640686 2023-01-22 14:59:37.796079: step: 504/463, loss: 4.4286208152771 2023-01-22 14:59:38.466987: step: 506/463, loss: 0.5724657773971558 2023-01-22 14:59:39.019595: step: 508/463, loss: 0.3319438695907593 2023-01-22 14:59:39.577972: step: 510/463, loss: 0.33714908361434937 2023-01-22 14:59:40.172572: step: 512/463, loss: 0.5112523436546326 2023-01-22 14:59:40.762866: step: 514/463, loss: 0.44569432735443115 2023-01-22 14:59:41.323337: step: 516/463, loss: 0.6641700267791748 2023-01-22 14:59:41.947068: step: 518/463, loss: 0.8558828234672546 2023-01-22 14:59:42.588366: step: 520/463, loss: 0.8753697872161865 2023-01-22 14:59:43.255538: step: 522/463, loss: 0.6126326322555542 2023-01-22 14:59:43.928445: step: 524/463, loss: 0.8940564393997192 2023-01-22 14:59:44.551614: step: 526/463, loss: 0.5812259912490845 2023-01-22 14:59:45.177052: step: 528/463, loss: 0.08816865086555481 2023-01-22 14:59:45.789950: step: 530/463, loss: 0.22776606678962708 2023-01-22 14:59:46.359119: step: 532/463, loss: 0.9158434867858887 2023-01-22 14:59:46.965407: step: 534/463, loss: 0.05059341341257095 2023-01-22 14:59:47.582901: step: 536/463, loss: 0.1119237169623375 2023-01-22 14:59:48.200439: step: 538/463, loss: 0.14101532101631165 2023-01-22 14:59:48.816959: step: 540/463, loss: 0.06930503994226456 2023-01-22 14:59:49.522121: step: 542/463, loss: 1.9775664806365967 2023-01-22 14:59:50.116952: step: 544/463, loss: 0.2449316382408142 2023-01-22 14:59:50.744924: step: 546/463, loss: 0.14602653682231903 2023-01-22 14:59:51.311544: step: 548/463, loss: 0.15546542406082153 2023-01-22 14:59:51.933478: step: 550/463, loss: 
0.18714672327041626 2023-01-22 14:59:52.506528: step: 552/463, loss: 0.5217647552490234 2023-01-22 14:59:53.099809: step: 554/463, loss: 0.19064991176128387 2023-01-22 14:59:53.657533: step: 556/463, loss: 0.6208058595657349 2023-01-22 14:59:54.249405: step: 558/463, loss: 0.13798952102661133 2023-01-22 14:59:54.889035: step: 560/463, loss: 0.3919745683670044 2023-01-22 14:59:55.534800: step: 562/463, loss: 0.25097665190696716 2023-01-22 14:59:56.153385: step: 564/463, loss: 0.3510279357433319 2023-01-22 14:59:56.713357: step: 566/463, loss: 0.1415420025587082 2023-01-22 14:59:57.328555: step: 568/463, loss: 0.28472238779067993 2023-01-22 14:59:57.916635: step: 570/463, loss: 0.10588109493255615 2023-01-22 14:59:58.552238: step: 572/463, loss: 0.11910245567560196 2023-01-22 14:59:59.189978: step: 574/463, loss: 1.8247716426849365 2023-01-22 14:59:59.805514: step: 576/463, loss: 0.3168087899684906 2023-01-22 15:00:00.451705: step: 578/463, loss: 0.47731953859329224 2023-01-22 15:00:01.062929: step: 580/463, loss: 0.6888152956962585 2023-01-22 15:00:01.730581: step: 582/463, loss: 0.677666425704956 2023-01-22 15:00:02.385769: step: 584/463, loss: 0.4419955015182495 2023-01-22 15:00:02.991618: step: 586/463, loss: 0.19437524676322937 2023-01-22 15:00:03.581907: step: 588/463, loss: 0.17522116005420685 2023-01-22 15:00:04.185147: step: 590/463, loss: 0.12441287934780121 2023-01-22 15:00:04.780104: step: 592/463, loss: 0.3458687365055084 2023-01-22 15:00:05.398540: step: 594/463, loss: 0.1821790486574173 2023-01-22 15:00:05.951518: step: 596/463, loss: 0.43311506509780884 2023-01-22 15:00:06.500344: step: 598/463, loss: 0.21133345365524292 2023-01-22 15:00:07.108941: step: 600/463, loss: 0.904247522354126 2023-01-22 15:00:07.728562: step: 602/463, loss: 0.38962435722351074 2023-01-22 15:00:08.342079: step: 604/463, loss: 0.9967257976531982 2023-01-22 15:00:08.959743: step: 606/463, loss: 0.12764020264148712 2023-01-22 15:00:09.654353: step: 608/463, loss: 
0.6234896183013916 2023-01-22 15:00:10.199672: step: 610/463, loss: 0.3918750584125519 2023-01-22 15:00:10.766526: step: 612/463, loss: 0.14000630378723145 2023-01-22 15:00:11.374635: step: 614/463, loss: 0.3314099609851837 2023-01-22 15:00:11.995540: step: 616/463, loss: 0.8686912059783936 2023-01-22 15:00:12.573673: step: 618/463, loss: 0.11042092740535736 2023-01-22 15:00:13.193240: step: 620/463, loss: 0.6426002383232117 2023-01-22 15:00:13.819999: step: 622/463, loss: 0.3330007791519165 2023-01-22 15:00:14.564382: step: 624/463, loss: 0.2501598596572876 2023-01-22 15:00:15.218172: step: 626/463, loss: 0.08200860768556595 2023-01-22 15:00:15.793030: step: 628/463, loss: 0.15559709072113037 2023-01-22 15:00:16.403114: step: 630/463, loss: 1.5208486318588257 2023-01-22 15:00:17.002567: step: 632/463, loss: 0.28276869654655457 2023-01-22 15:00:17.757256: step: 634/463, loss: 0.271755576133728 2023-01-22 15:00:18.325650: step: 636/463, loss: 0.6508913040161133 2023-01-22 15:00:19.017019: step: 638/463, loss: 0.6424388885498047 2023-01-22 15:00:19.606667: step: 640/463, loss: 0.40920835733413696 2023-01-22 15:00:20.214948: step: 642/463, loss: 0.35774117708206177 2023-01-22 15:00:20.768076: step: 644/463, loss: 3.7830166816711426 2023-01-22 15:00:21.341910: step: 646/463, loss: 0.35346946120262146 2023-01-22 15:00:21.970790: step: 648/463, loss: 0.3704957365989685 2023-01-22 15:00:22.502309: step: 650/463, loss: 0.288404256105423 2023-01-22 15:00:23.091977: step: 652/463, loss: 0.1761763095855713 2023-01-22 15:00:23.695965: step: 654/463, loss: 6.209756374359131 2023-01-22 15:00:24.290885: step: 656/463, loss: 0.13294795155525208 2023-01-22 15:00:24.948748: step: 658/463, loss: 0.3768618702888489 2023-01-22 15:00:25.528305: step: 660/463, loss: 0.1322442889213562 2023-01-22 15:00:26.114842: step: 662/463, loss: 0.4557216465473175 2023-01-22 15:00:26.720901: step: 664/463, loss: 0.30379828810691833 2023-01-22 15:00:27.337872: step: 666/463, loss: 0.09650204330682755 
2023-01-22 15:00:27.923064: step: 668/463, loss: 0.13443848490715027 2023-01-22 15:00:28.494512: step: 670/463, loss: 0.5230703353881836 2023-01-22 15:00:29.201274: step: 672/463, loss: 0.10101820528507233 2023-01-22 15:00:29.817402: step: 674/463, loss: 0.5745805501937866 2023-01-22 15:00:30.459047: step: 676/463, loss: 0.6321688890457153 2023-01-22 15:00:31.089327: step: 678/463, loss: 0.44606903195381165 2023-01-22 15:00:31.765276: step: 680/463, loss: 0.45148608088493347 2023-01-22 15:00:32.361390: step: 682/463, loss: 0.1289391815662384 2023-01-22 15:00:33.073213: step: 684/463, loss: 0.3342464566230774 2023-01-22 15:00:33.689478: step: 686/463, loss: 0.8948984146118164 2023-01-22 15:00:34.335716: step: 688/463, loss: 0.5682541728019714 2023-01-22 15:00:35.048013: step: 690/463, loss: 0.4605291187763214 2023-01-22 15:00:35.663850: step: 692/463, loss: 0.3703670799732208 2023-01-22 15:00:36.294150: step: 694/463, loss: 0.18482042849063873 2023-01-22 15:00:36.814971: step: 696/463, loss: 0.07417617738246918 2023-01-22 15:00:37.478656: step: 698/463, loss: 0.4044078588485718 2023-01-22 15:00:38.039380: step: 700/463, loss: 0.1456187665462494 2023-01-22 15:00:38.647290: step: 702/463, loss: 0.13531725108623505 2023-01-22 15:00:39.283470: step: 704/463, loss: 0.31319645047187805 2023-01-22 15:00:39.908640: step: 706/463, loss: 0.14915625751018524 2023-01-22 15:00:40.504399: step: 708/463, loss: 0.4966917932033539 2023-01-22 15:00:41.146626: step: 710/463, loss: 0.14964525401592255 2023-01-22 15:00:41.778249: step: 712/463, loss: 0.5544085502624512 2023-01-22 15:00:42.450187: step: 714/463, loss: 0.3419226109981537 2023-01-22 15:00:43.038553: step: 716/463, loss: 0.16782115399837494 2023-01-22 15:00:43.608770: step: 718/463, loss: 0.6603156328201294 2023-01-22 15:00:44.268751: step: 720/463, loss: 0.2616020143032074 2023-01-22 15:00:44.845337: step: 722/463, loss: 0.6968644857406616 2023-01-22 15:00:45.471709: step: 724/463, loss: 0.057327695190906525 2023-01-22 
15:00:46.174805: step: 726/463, loss: 0.5646926164627075 2023-01-22 15:00:46.821895: step: 728/463, loss: 0.128641277551651 2023-01-22 15:00:47.366958: step: 730/463, loss: 0.2304677665233612 2023-01-22 15:00:47.979118: step: 732/463, loss: 0.25881531834602356 2023-01-22 15:00:48.610277: step: 734/463, loss: 0.4040507376194 2023-01-22 15:00:49.222814: step: 736/463, loss: 0.4181581437587738 2023-01-22 15:00:49.817273: step: 738/463, loss: 0.365761399269104 2023-01-22 15:00:50.431561: step: 740/463, loss: 0.2727954387664795 2023-01-22 15:00:51.063284: step: 742/463, loss: 0.20844769477844238 2023-01-22 15:00:51.717480: step: 744/463, loss: 0.5332238078117371 2023-01-22 15:00:52.375048: step: 746/463, loss: 0.3798210918903351 2023-01-22 15:00:53.063290: step: 748/463, loss: 0.5776268839836121 2023-01-22 15:00:53.750938: step: 750/463, loss: 0.14677004516124725 2023-01-22 15:00:54.391061: step: 752/463, loss: 0.28412604331970215 2023-01-22 15:00:55.048937: step: 754/463, loss: 0.11279013752937317 2023-01-22 15:00:55.626760: step: 756/463, loss: 0.42745211720466614 2023-01-22 15:00:56.224390: step: 758/463, loss: 0.62291419506073 2023-01-22 15:00:56.891583: step: 760/463, loss: 0.4180606007575989 2023-01-22 15:00:57.471741: step: 762/463, loss: 0.18151384592056274 2023-01-22 15:00:58.106947: step: 764/463, loss: 0.3013122081756592 2023-01-22 15:00:58.627694: step: 766/463, loss: 1.172857403755188 2023-01-22 15:00:59.245667: step: 768/463, loss: 0.1936178207397461 2023-01-22 15:00:59.910256: step: 770/463, loss: 0.29813894629478455 2023-01-22 15:01:00.514690: step: 772/463, loss: 0.6499261260032654 2023-01-22 15:01:01.127489: step: 774/463, loss: 0.3235485255718231 2023-01-22 15:01:01.697809: step: 776/463, loss: 0.12581439316272736 2023-01-22 15:01:02.265973: step: 778/463, loss: 0.04443514719605446 2023-01-22 15:01:02.836412: step: 780/463, loss: 0.16130831837654114 2023-01-22 15:01:03.443474: step: 782/463, loss: 0.23200927674770355 2023-01-22 15:01:04.046815: step: 
784/463, loss: 0.2795993685722351 2023-01-22 15:01:04.706459: step: 786/463, loss: 0.4868917167186737 2023-01-22 15:01:05.342092: step: 788/463, loss: 0.3389803171157837 2023-01-22 15:01:06.026555: step: 790/463, loss: 1.9167799949645996 2023-01-22 15:01:06.681308: step: 792/463, loss: 0.34763702750205994 2023-01-22 15:01:07.331681: step: 794/463, loss: 0.4146835207939148 2023-01-22 15:01:07.886450: step: 796/463, loss: 0.2676306664943695 2023-01-22 15:01:08.497278: step: 798/463, loss: 0.8373745083808899 2023-01-22 15:01:09.247508: step: 800/463, loss: 0.758481502532959 2023-01-22 15:01:09.877009: step: 802/463, loss: 0.3841819763183594 2023-01-22 15:01:10.539917: step: 804/463, loss: 0.18776606023311615 2023-01-22 15:01:11.132879: step: 806/463, loss: 0.1256423145532608 2023-01-22 15:01:11.738248: step: 808/463, loss: 0.13052454590797424 2023-01-22 15:01:12.340918: step: 810/463, loss: 1.2559645175933838 2023-01-22 15:01:12.995530: step: 812/463, loss: 0.9020232558250427 2023-01-22 15:01:13.623385: step: 814/463, loss: 0.6255305409431458 2023-01-22 15:01:14.199297: step: 816/463, loss: 0.16855143010616302 2023-01-22 15:01:14.784086: step: 818/463, loss: 0.499297559261322 2023-01-22 15:01:15.367494: step: 820/463, loss: 0.22166529297828674 2023-01-22 15:01:15.915435: step: 822/463, loss: 0.8260859251022339 2023-01-22 15:01:16.450987: step: 824/463, loss: 0.07771969586610794 2023-01-22 15:01:17.126107: step: 826/463, loss: 0.1336643248796463 2023-01-22 15:01:17.741962: step: 828/463, loss: 0.5257257223129272 2023-01-22 15:01:18.353896: step: 830/463, loss: 0.32461419701576233 2023-01-22 15:01:18.949206: step: 832/463, loss: 0.24419811367988586 2023-01-22 15:01:19.588333: step: 834/463, loss: 0.1720065027475357 2023-01-22 15:01:20.284995: step: 836/463, loss: 0.41022706031799316 2023-01-22 15:01:20.898366: step: 838/463, loss: 0.14434635639190674 2023-01-22 15:01:21.466198: step: 840/463, loss: 0.15381769835948944 2023-01-22 15:01:22.141424: step: 842/463, loss: 
0.28481221199035645 2023-01-22 15:01:22.831944: step: 844/463, loss: 0.31516778469085693 2023-01-22 15:01:23.424065: step: 846/463, loss: 1.3955180644989014 2023-01-22 15:01:23.966857: step: 848/463, loss: 0.3227834105491638 2023-01-22 15:01:24.580913: step: 850/463, loss: 0.1401841938495636 2023-01-22 15:01:25.193471: step: 852/463, loss: 0.9697494506835938 2023-01-22 15:01:25.740265: step: 854/463, loss: 0.08888167887926102 2023-01-22 15:01:26.324598: step: 856/463, loss: 0.7366852760314941 2023-01-22 15:01:26.919014: step: 858/463, loss: 0.40932899713516235 2023-01-22 15:01:27.515112: step: 860/463, loss: 1.174228310585022 2023-01-22 15:01:28.185661: step: 862/463, loss: 0.7009646892547607 2023-01-22 15:01:28.778320: step: 864/463, loss: 0.2172122299671173 2023-01-22 15:01:29.371950: step: 866/463, loss: 0.16126815974712372 2023-01-22 15:01:29.991256: step: 868/463, loss: 0.6198955178260803 2023-01-22 15:01:30.599520: step: 870/463, loss: 0.38617923855781555 2023-01-22 15:01:31.226733: step: 872/463, loss: 0.1714901626110077 2023-01-22 15:01:31.845569: step: 874/463, loss: 0.17205221951007843 2023-01-22 15:01:32.494737: step: 876/463, loss: 0.25930655002593994 2023-01-22 15:01:33.122955: step: 878/463, loss: 0.16274584829807281 2023-01-22 15:01:33.758668: step: 880/463, loss: 0.26167193055152893 2023-01-22 15:01:34.346184: step: 882/463, loss: 0.1560763418674469 2023-01-22 15:01:35.008029: step: 884/463, loss: 0.13708898425102234 2023-01-22 15:01:35.605198: step: 886/463, loss: 0.1979779303073883 2023-01-22 15:01:36.230575: step: 888/463, loss: 0.6031259894371033 2023-01-22 15:01:36.834168: step: 890/463, loss: 0.12203545868396759 2023-01-22 15:01:37.472596: step: 892/463, loss: 0.3442237079143524 2023-01-22 15:01:38.094830: step: 894/463, loss: 0.11883711069822311 2023-01-22 15:01:38.742393: step: 896/463, loss: 0.4925813674926758 2023-01-22 15:01:39.415084: step: 898/463, loss: 0.13428178429603577 2023-01-22 15:01:40.020833: step: 900/463, loss: 
2.2352845668792725 2023-01-22 15:01:40.628105: step: 902/463, loss: 0.9210226535797119 2023-01-22 15:01:41.272782: step: 904/463, loss: 0.12226151674985886 2023-01-22 15:01:41.894224: step: 906/463, loss: 1.105759620666504 2023-01-22 15:01:42.482381: step: 908/463, loss: 0.21318599581718445 2023-01-22 15:01:43.124851: step: 910/463, loss: 0.31210803985595703 2023-01-22 15:01:43.769156: step: 912/463, loss: 0.2590809762477875 2023-01-22 15:01:44.310448: step: 914/463, loss: 0.17857448756694794 2023-01-22 15:01:44.952052: step: 916/463, loss: 0.34170395135879517 2023-01-22 15:01:45.540900: step: 918/463, loss: 0.30486786365509033 2023-01-22 15:01:46.175555: step: 920/463, loss: 1.9943842887878418 2023-01-22 15:01:46.798883: step: 922/463, loss: 0.06211067736148834 2023-01-22 15:01:47.394318: step: 924/463, loss: 0.12278813123703003 2023-01-22 15:01:47.947032: step: 926/463, loss: 0.30011966824531555
==================================================
Loss: 0.407
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.257823732718894, 'r': 0.3791525481160206, 'f1': 0.3069330151415405}, 'combined': 0.2261611690516614, 'epoch': 4}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.2976644688385023, 'r': 0.3374397210064419, 'f1': 0.3163065743367794}, 'combined': 0.22252723822687998, 'epoch': 4}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2682373892953408, 'r': 0.3776701002981838, 'f1': 0.313683440279185}, 'combined': 0.2311351665215047, 'epoch': 4}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.29962244208288746, 'r': 0.3380892534245333, 'f1': 0.31769568746088683}, 'combined': 0.22556393809722963, 'epoch': 4}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2697430845364313, 'r': 0.38081376640437364, 'f1': 0.31579678189630983}, 'combined': 0.23269236560780723, 'epoch': 4}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3126555591905385, 'r': 0.32412414738792067, 'f1': 0.3182865769804195}, 'combined': 0.22598346965609784, 'epoch': 4}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.21162280701754385, 'r': 0.4595238095238095, 'f1': 0.2897897897897898}, 'combined': 0.1931931931931932, 'epoch': 4}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24468085106382978, 'r': 0.5, 'f1': 0.32857142857142857}, 'combined': 0.16428571428571428, 'epoch': 4}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3235294117647059, 'r': 0.1896551724137931, 'f1': 0.2391304347826087}, 'combined': 0.15942028985507245, 'epoch': 4}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
******************************
Epoch: 5
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:04:23.865916: step: 2/463, loss: 0.4692002534866333 2023-01-22 15:04:24.533184: step: 4/463, loss: 0.2567785382270813 2023-01-22 15:04:25.166353: step: 6/463, loss: 0.16233904659748077 2023-01-22 15:04:25.788258: step: 8/463, loss: 0.526107132434845 2023-01-22 15:04:26.434271: step: 10/463, loss: 0.3048560917377472 2023-01-22 15:04:27.048011: step: 12/463, loss: 0.3537100851535797 2023-01-22 15:04:27.712105: step: 14/463, loss: 0.7454623579978943 2023-01-22 15:04:28.379380: step: 16/463, loss: 0.6877308487892151 2023-01-22 15:04:28.993698: step: 18/463, loss: 0.1536683440208435 2023-01-22 15:04:29.591138: step: 20/463, loss: 0.012286501005291939 2023-01-22 15:04:30.217851: 
step: 22/463, loss: 0.19564437866210938 2023-01-22 15:04:30.848126: step: 24/463, loss: 0.5170451998710632 2023-01-22 15:04:31.490277: step: 26/463, loss: 0.3101789951324463 2023-01-22 15:04:32.055320: step: 28/463, loss: 0.17507588863372803 2023-01-22 15:04:32.701539: step: 30/463, loss: 0.1636398881673813 2023-01-22 15:04:33.285943: step: 32/463, loss: 0.28302571177482605 2023-01-22 15:04:33.854746: step: 34/463, loss: 0.14941485226154327 2023-01-22 15:04:34.425238: step: 36/463, loss: 0.5024131536483765 2023-01-22 15:04:35.091659: step: 38/463, loss: 0.867077112197876 2023-01-22 15:04:35.720327: step: 40/463, loss: 0.16221055388450623 2023-01-22 15:04:36.312891: step: 42/463, loss: 0.18688108026981354 2023-01-22 15:04:36.915959: step: 44/463, loss: 0.2314538210630417 2023-01-22 15:04:37.528545: step: 46/463, loss: 0.41155731678009033 2023-01-22 15:04:38.115083: step: 48/463, loss: 0.20005592703819275 2023-01-22 15:04:38.709383: step: 50/463, loss: 0.32210803031921387 2023-01-22 15:04:39.357999: step: 52/463, loss: 0.5439748764038086 2023-01-22 15:04:39.987311: step: 54/463, loss: 0.39841872453689575 2023-01-22 15:04:40.586088: step: 56/463, loss: 1.1341897249221802 2023-01-22 15:04:41.162937: step: 58/463, loss: 0.27541884779930115 2023-01-22 15:04:41.793897: step: 60/463, loss: 0.6743131279945374 2023-01-22 15:04:42.413950: step: 62/463, loss: 0.2771267592906952 2023-01-22 15:04:43.042948: step: 64/463, loss: 0.9941343665122986 2023-01-22 15:04:43.629309: step: 66/463, loss: 0.29212620854377747 2023-01-22 15:04:44.323925: step: 68/463, loss: 0.8206390738487244 2023-01-22 15:04:44.947931: step: 70/463, loss: 0.4609524607658386 2023-01-22 15:04:45.579009: step: 72/463, loss: 0.42758435010910034 2023-01-22 15:04:46.182679: step: 74/463, loss: 0.48209962248802185 2023-01-22 15:04:46.837964: step: 76/463, loss: 0.2506665587425232 2023-01-22 15:04:47.390249: step: 78/463, loss: 0.290978342294693 2023-01-22 15:04:48.016935: step: 80/463, loss: 0.2125348597764969 
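[Editor's note on the epoch summaries in this log: the "combined" score reported for each language appears to be the product of the template F1 and the slot F1, with F1 the usual harmonic mean of precision and recall. This is an inference from the logged numbers, not something taken from train.py; the sketch below checks it against the epoch-4 "Test Korean" entry.]

```python
# Sketch of the scoring relation inferred from the logged numbers
# (an assumption about the evaluation code, verified numerically below).

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Values copied from the "Test Korean" entry of the epoch-4 summary.
template_f1 = f1(p=0.9726027397260274, r=0.5590551181102362)  # matches the logged 0.71
slot_f1 = f1(p=0.29962244208288746, r=0.3380892534245333)     # matches the logged 0.31769568746088683
combined = template_f1 * slot_f1                              # matches the logged 0.22556393809722963
```

The same product relation holds for every Dev/Test/Sample entry in the summaries above.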
2023-01-22 15:04:48.603112: step: 82/463, loss: 0.3315035402774811 2023-01-22 15:04:49.221619: step: 84/463, loss: 0.3003198206424713 2023-01-22 15:04:49.864221: step: 86/463, loss: 0.33035892248153687 2023-01-22 15:04:50.415849: step: 88/463, loss: 0.10659858584403992 2023-01-22 15:04:51.059651: step: 90/463, loss: 0.15251687169075012 2023-01-22 15:04:51.661130: step: 92/463, loss: 1.2276495695114136 2023-01-22 15:04:52.210868: step: 94/463, loss: 0.18188640475273132 2023-01-22 15:04:52.762538: step: 96/463, loss: 0.1622830629348755 2023-01-22 15:04:53.305395: step: 98/463, loss: 0.25652652978897095 2023-01-22 15:04:53.853248: step: 100/463, loss: 0.2379939705133438 2023-01-22 15:04:54.452782: step: 102/463, loss: 0.6159705519676208 2023-01-22 15:04:55.073750: step: 104/463, loss: 0.40388768911361694 2023-01-22 15:04:55.699954: step: 106/463, loss: 0.703730046749115 2023-01-22 15:04:56.313971: step: 108/463, loss: 0.3708374500274658 2023-01-22 15:04:56.970963: step: 110/463, loss: 1.351558804512024 2023-01-22 15:04:57.563194: step: 112/463, loss: 0.11200806498527527 2023-01-22 15:04:58.178327: step: 114/463, loss: 1.7905954122543335 2023-01-22 15:04:58.774391: step: 116/463, loss: 1.362338662147522 2023-01-22 15:04:59.406310: step: 118/463, loss: 0.566702663898468 2023-01-22 15:05:00.022150: step: 120/463, loss: 0.3970111906528473 2023-01-22 15:05:00.713648: step: 122/463, loss: 0.26945579051971436 2023-01-22 15:05:01.295270: step: 124/463, loss: 0.2252281904220581 2023-01-22 15:05:01.887804: step: 126/463, loss: 0.5885814428329468 2023-01-22 15:05:02.550342: step: 128/463, loss: 0.8400110006332397 2023-01-22 15:05:03.224480: step: 130/463, loss: 0.3879692554473877 2023-01-22 15:05:03.899160: step: 132/463, loss: 0.6809670329093933 2023-01-22 15:05:04.489897: step: 134/463, loss: 0.1812920868396759 2023-01-22 15:05:05.177768: step: 136/463, loss: 0.10469996929168701 2023-01-22 15:05:05.804496: step: 138/463, loss: 0.11809810996055603 2023-01-22 15:05:06.446156: 
step: 140/463, loss: 0.3336801528930664 2023-01-22 15:05:07.066280: step: 142/463, loss: 0.09080840647220612 2023-01-22 15:05:07.698247: step: 144/463, loss: 0.4820631742477417 2023-01-22 15:05:08.343666: step: 146/463, loss: 0.40446344017982483 2023-01-22 15:05:08.983954: step: 148/463, loss: 1.736566185951233 2023-01-22 15:05:09.734177: step: 150/463, loss: 0.6908102035522461 2023-01-22 15:05:10.273385: step: 152/463, loss: 0.31490978598594666 2023-01-22 15:05:10.794393: step: 154/463, loss: 0.31338509917259216 2023-01-22 15:05:11.428584: step: 156/463, loss: 0.27421873807907104 2023-01-22 15:05:12.020517: step: 158/463, loss: 0.7789943218231201 2023-01-22 15:05:12.590283: step: 160/463, loss: 0.5893916487693787 2023-01-22 15:05:13.207711: step: 162/463, loss: 0.3481650650501251 2023-01-22 15:05:13.780376: step: 164/463, loss: 0.3503214716911316 2023-01-22 15:05:14.407813: step: 166/463, loss: 0.6015584468841553 2023-01-22 15:05:15.004872: step: 168/463, loss: 0.14258617162704468 2023-01-22 15:05:15.647660: step: 170/463, loss: 0.4121180772781372 2023-01-22 15:05:16.233127: step: 172/463, loss: 1.113049030303955 2023-01-22 15:05:16.890067: step: 174/463, loss: 0.18945221602916718 2023-01-22 15:05:17.481830: step: 176/463, loss: 0.15436908602714539 2023-01-22 15:05:18.114771: step: 178/463, loss: 0.15101735293865204 2023-01-22 15:05:18.744372: step: 180/463, loss: 0.14003483951091766 2023-01-22 15:05:19.296140: step: 182/463, loss: 0.2169262170791626 2023-01-22 15:05:19.883424: step: 184/463, loss: 0.6074908375740051 2023-01-22 15:05:20.521675: step: 186/463, loss: 0.2806061804294586 2023-01-22 15:05:21.108990: step: 188/463, loss: 0.3498987555503845 2023-01-22 15:05:21.731032: step: 190/463, loss: 0.09985600411891937 2023-01-22 15:05:22.390270: step: 192/463, loss: 0.19567662477493286 2023-01-22 15:05:23.020126: step: 194/463, loss: 0.07987669855356216 2023-01-22 15:05:23.585721: step: 196/463, loss: 0.6281145215034485 2023-01-22 15:05:24.142795: step: 198/463, 
loss: 0.22853246331214905 2023-01-22 15:05:24.883737: step: 200/463, loss: 0.6129348874092102 2023-01-22 15:05:25.555963: step: 202/463, loss: 0.2607996463775635 2023-01-22 15:05:26.191210: step: 204/463, loss: 0.7161058187484741 2023-01-22 15:05:26.791496: step: 206/463, loss: 0.14569544792175293 2023-01-22 15:05:27.407928: step: 208/463, loss: 0.2230064570903778 2023-01-22 15:05:28.034241: step: 210/463, loss: 0.1739092916250229 2023-01-22 15:05:28.729985: step: 212/463, loss: 1.2697973251342773 2023-01-22 15:05:29.332997: step: 214/463, loss: 0.4969445765018463 2023-01-22 15:05:29.975769: step: 216/463, loss: 0.44714510440826416 2023-01-22 15:05:30.589548: step: 218/463, loss: 0.7159148454666138 2023-01-22 15:05:31.219557: step: 220/463, loss: 0.8265483975410461 2023-01-22 15:05:31.828315: step: 222/463, loss: 0.12250448018312454 2023-01-22 15:05:32.415235: step: 224/463, loss: 0.5621204972267151 2023-01-22 15:05:32.930229: step: 226/463, loss: 0.2869696319103241 2023-01-22 15:05:33.545970: step: 228/463, loss: 0.4096831977367401 2023-01-22 15:05:34.126043: step: 230/463, loss: 0.18362700939178467 2023-01-22 15:05:34.681515: step: 232/463, loss: 0.2856053411960602 2023-01-22 15:05:35.263143: step: 234/463, loss: 0.29268038272857666 2023-01-22 15:05:35.904100: step: 236/463, loss: 0.7606827616691589 2023-01-22 15:05:36.512004: step: 238/463, loss: 0.26290079951286316 2023-01-22 15:05:37.142760: step: 240/463, loss: 0.2732445299625397 2023-01-22 15:05:37.717652: step: 242/463, loss: 0.17532040178775787 2023-01-22 15:05:38.407125: step: 244/463, loss: 0.33544921875 2023-01-22 15:05:39.073057: step: 246/463, loss: 0.35320037603378296 2023-01-22 15:05:39.646460: step: 248/463, loss: 0.24800707399845123 2023-01-22 15:05:40.295207: step: 250/463, loss: 0.5467283725738525 2023-01-22 15:05:40.907720: step: 252/463, loss: 0.8307380080223083 2023-01-22 15:05:41.526293: step: 254/463, loss: 5.084863662719727 2023-01-22 15:05:42.123030: step: 256/463, loss: 
1.1463202238082886 2023-01-22 15:05:42.677453: step: 258/463, loss: 0.6998710632324219 2023-01-22 15:05:43.293378: step: 260/463, loss: 0.1490391492843628 2023-01-22 15:05:43.895356: step: 262/463, loss: 0.09507971256971359 2023-01-22 15:05:44.529167: step: 264/463, loss: 0.2911016047000885 2023-01-22 15:05:45.150919: step: 266/463, loss: 0.12046675384044647 2023-01-22 15:05:45.780692: step: 268/463, loss: 0.21190868318080902 2023-01-22 15:05:46.420629: step: 270/463, loss: 0.4022587537765503 2023-01-22 15:05:47.045684: step: 272/463, loss: 0.1499437689781189 2023-01-22 15:05:47.640525: step: 274/463, loss: 0.27141186594963074 2023-01-22 15:05:48.255648: step: 276/463, loss: 0.1464925855398178 2023-01-22 15:05:48.888810: step: 278/463, loss: 0.4959094822406769 2023-01-22 15:05:49.510572: step: 280/463, loss: 0.7757396697998047 2023-01-22 15:05:50.190137: step: 282/463, loss: 0.18813367187976837 2023-01-22 15:05:50.770554: step: 284/463, loss: 0.486691415309906 2023-01-22 15:05:51.446696: step: 286/463, loss: 1.0428664684295654 2023-01-22 15:05:52.080500: step: 288/463, loss: 0.16024091839790344 2023-01-22 15:05:52.678970: step: 290/463, loss: 0.3730674088001251 2023-01-22 15:05:53.303183: step: 292/463, loss: 3.7202296257019043 2023-01-22 15:05:53.954195: step: 294/463, loss: 0.6177228093147278 2023-01-22 15:05:54.603260: step: 296/463, loss: 0.466582715511322 2023-01-22 15:05:55.207819: step: 298/463, loss: 0.8829436302185059 2023-01-22 15:05:55.854803: step: 300/463, loss: 0.2887510061264038 2023-01-22 15:05:56.438306: step: 302/463, loss: 0.2260335385799408 2023-01-22 15:05:57.020642: step: 304/463, loss: 1.0092142820358276 2023-01-22 15:05:57.631621: step: 306/463, loss: 0.504250705242157 2023-01-22 15:05:58.260994: step: 308/463, loss: 0.24865233898162842 2023-01-22 15:05:58.805036: step: 310/463, loss: 0.25093069672584534 2023-01-22 15:05:59.332216: step: 312/463, loss: 0.1586432307958603 2023-01-22 15:06:00.021609: step: 314/463, loss: 0.18736442923545837 
2023-01-22 15:06:00.681460: step: 316/463, loss: 0.6842339038848877 2023-01-22 15:06:01.288087: step: 318/463, loss: 1.048384189605713 2023-01-22 15:06:01.870434: step: 320/463, loss: 0.18224000930786133 2023-01-22 15:06:02.471952: step: 322/463, loss: 0.399073988199234 2023-01-22 15:06:03.056540: step: 324/463, loss: 0.4222061336040497 2023-01-22 15:06:03.613658: step: 326/463, loss: 0.1389034539461136 2023-01-22 15:06:04.185725: step: 328/463, loss: 0.5608020424842834 2023-01-22 15:06:04.868069: step: 330/463, loss: 0.26459065079689026 2023-01-22 15:06:05.464514: step: 332/463, loss: 0.1394079327583313 2023-01-22 15:06:06.118138: step: 334/463, loss: 0.18177056312561035 2023-01-22 15:06:06.695587: step: 336/463, loss: 0.286426842212677 2023-01-22 15:06:07.470163: step: 338/463, loss: 0.39777427911758423 2023-01-22 15:06:08.074846: step: 340/463, loss: 0.27986353635787964 2023-01-22 15:06:08.697242: step: 342/463, loss: 0.17406456172466278 2023-01-22 15:06:09.323359: step: 344/463, loss: 0.3088904023170471 2023-01-22 15:06:09.976834: step: 346/463, loss: 0.196720689535141 2023-01-22 15:06:10.655399: step: 348/463, loss: 0.3080258071422577 2023-01-22 15:06:11.216477: step: 350/463, loss: 0.11475144326686859 2023-01-22 15:06:11.834929: step: 352/463, loss: 0.5996674299240112 2023-01-22 15:06:12.457829: step: 354/463, loss: 1.1956913471221924 2023-01-22 15:06:13.098059: step: 356/463, loss: 1.2666531801223755 2023-01-22 15:06:13.745692: step: 358/463, loss: 0.4112864136695862 2023-01-22 15:06:14.310738: step: 360/463, loss: 0.07000018656253815 2023-01-22 15:06:14.922960: step: 362/463, loss: 0.3889230489730835 2023-01-22 15:06:15.518511: step: 364/463, loss: 0.1724991798400879 2023-01-22 15:06:16.086325: step: 366/463, loss: 0.6826949715614319 2023-01-22 15:06:16.750636: step: 368/463, loss: 0.13399218022823334 2023-01-22 15:06:17.421069: step: 370/463, loss: 0.25245383381843567 2023-01-22 15:06:18.025993: step: 372/463, loss: 0.2447829693555832 2023-01-22 
15:06:18.660722: step: 374/463, loss: 0.9818593859672546 2023-01-22 15:06:19.277104: step: 376/463, loss: 4.955348491668701 2023-01-22 15:06:19.857817: step: 378/463, loss: 0.16970118880271912 2023-01-22 15:06:20.489989: step: 380/463, loss: 0.3012924790382385 2023-01-22 15:06:21.112551: step: 382/463, loss: 1.453906536102295 2023-01-22 15:06:21.667328: step: 384/463, loss: 0.908204197883606 2023-01-22 15:06:22.298571: step: 386/463, loss: 0.5834793448448181 2023-01-22 15:06:22.900623: step: 388/463, loss: 0.7565433382987976 2023-01-22 15:06:23.489055: step: 390/463, loss: 0.6701086163520813 2023-01-22 15:06:24.094312: step: 392/463, loss: 0.4259178340435028 2023-01-22 15:06:24.735085: step: 394/463, loss: 0.11410758644342422 2023-01-22 15:06:25.355049: step: 396/463, loss: 0.27128326892852783 2023-01-22 15:06:25.940760: step: 398/463, loss: 0.4323229193687439 2023-01-22 15:06:26.624595: step: 400/463, loss: 0.7417182922363281 2023-01-22 15:06:27.227068: step: 402/463, loss: 0.5714214444160461 2023-01-22 15:06:27.858603: step: 404/463, loss: 0.17722931504249573 2023-01-22 15:06:28.452259: step: 406/463, loss: 0.598773717880249 2023-01-22 15:06:29.068183: step: 408/463, loss: 0.12944506108760834 2023-01-22 15:06:29.726305: step: 410/463, loss: 0.499483197927475 2023-01-22 15:06:30.306093: step: 412/463, loss: 0.10298417508602142 2023-01-22 15:06:30.949241: step: 414/463, loss: 0.32125940918922424 2023-01-22 15:06:31.587360: step: 416/463, loss: 0.6119561791419983 2023-01-22 15:06:32.182941: step: 418/463, loss: 0.5318193435668945 2023-01-22 15:06:32.793691: step: 420/463, loss: 0.3632904887199402 2023-01-22 15:06:33.502983: step: 422/463, loss: 0.26872432231903076 2023-01-22 15:06:34.113593: step: 424/463, loss: 0.8417807817459106 2023-01-22 15:06:34.781668: step: 426/463, loss: 0.41502952575683594 2023-01-22 15:06:35.442859: step: 428/463, loss: 0.49339085817337036 2023-01-22 15:06:36.088220: step: 430/463, loss: 0.3399065136909485 2023-01-22 15:06:36.681591: step: 
432/463, loss: 0.34788620471954346 2023-01-22 15:06:37.295986: step: 434/463, loss: 0.43992602825164795 2023-01-22 15:06:37.931972: step: 436/463, loss: 0.41373389959335327 2023-01-22 15:06:38.568438: step: 438/463, loss: 0.219792902469635 2023-01-22 15:06:39.240118: step: 440/463, loss: 1.1127560138702393 2023-01-22 15:06:39.812793: step: 442/463, loss: 3.4011247158050537 2023-01-22 15:06:40.441793: step: 444/463, loss: 0.16121172904968262 2023-01-22 15:06:40.995036: step: 446/463, loss: 0.26284611225128174 2023-01-22 15:06:41.564190: step: 448/463, loss: 0.6603972911834717 2023-01-22 15:06:42.184669: step: 450/463, loss: 1.5329499244689941 2023-01-22 15:06:42.839130: step: 452/463, loss: 0.16709434986114502 2023-01-22 15:06:43.455909: step: 454/463, loss: 0.1994452178478241 2023-01-22 15:06:44.069017: step: 456/463, loss: 0.17526139318943024 2023-01-22 15:06:44.746227: step: 458/463, loss: 0.79362952709198 2023-01-22 15:06:45.338531: step: 460/463, loss: 0.9350751638412476 2023-01-22 15:06:45.936485: step: 462/463, loss: 0.5589685440063477 2023-01-22 15:06:46.575466: step: 464/463, loss: 0.5435109734535217 2023-01-22 15:06:47.167767: step: 466/463, loss: 0.19313952326774597 2023-01-22 15:06:47.766481: step: 468/463, loss: 0.19563330709934235 2023-01-22 15:06:48.337239: step: 470/463, loss: 0.7306740283966064 2023-01-22 15:06:48.938633: step: 472/463, loss: 0.14222976565361023 2023-01-22 15:06:49.539034: step: 474/463, loss: 0.19171975553035736 2023-01-22 15:06:50.143359: step: 476/463, loss: 0.21581730246543884 2023-01-22 15:06:50.757173: step: 478/463, loss: 0.18528595566749573 2023-01-22 15:06:51.378659: step: 480/463, loss: 0.16171017289161682 2023-01-22 15:06:51.941795: step: 482/463, loss: 1.0104864835739136 2023-01-22 15:06:52.606321: step: 484/463, loss: 0.20890343189239502 2023-01-22 15:06:53.265304: step: 486/463, loss: 0.2979382872581482 2023-01-22 15:06:53.864621: step: 488/463, loss: 0.9752604961395264 2023-01-22 15:06:54.466858: step: 490/463, loss: 
0.6083256602287292 2023-01-22 15:06:55.150206: step: 492/463, loss: 0.28748103976249695 2023-01-22 15:06:55.708660: step: 494/463, loss: 2.083831310272217 2023-01-22 15:06:56.446806: step: 496/463, loss: 0.7003525495529175 2023-01-22 15:06:57.068285: step: 498/463, loss: 0.13721677660942078 2023-01-22 15:06:57.662548: step: 500/463, loss: 0.18334297835826874 2023-01-22 15:06:58.335116: step: 502/463, loss: 0.18423591554164886 2023-01-22 15:06:58.951477: step: 504/463, loss: 0.47408327460289 2023-01-22 15:06:59.503398: step: 506/463, loss: 0.26286497712135315 2023-01-22 15:07:00.132622: step: 508/463, loss: 0.32620346546173096 2023-01-22 15:07:00.728174: step: 510/463, loss: 0.687005877494812 2023-01-22 15:07:01.410008: step: 512/463, loss: 1.1442794799804688 2023-01-22 15:07:02.014473: step: 514/463, loss: 0.14275363087654114 2023-01-22 15:07:02.688732: step: 516/463, loss: 0.24082612991333008 2023-01-22 15:07:03.385303: step: 518/463, loss: 0.832391083240509 2023-01-22 15:07:03.992686: step: 520/463, loss: 0.2136317640542984 2023-01-22 15:07:04.638384: step: 522/463, loss: 0.07853472232818604 2023-01-22 15:07:05.275269: step: 524/463, loss: 0.1477528065443039 2023-01-22 15:07:05.916232: step: 526/463, loss: 0.05718028172850609 2023-01-22 15:07:06.517399: step: 528/463, loss: 0.11975432932376862 2023-01-22 15:07:07.122591: step: 530/463, loss: 0.11637480556964874 2023-01-22 15:07:07.690789: step: 532/463, loss: 0.25144657492637634 2023-01-22 15:07:08.299633: step: 534/463, loss: 0.09461621195077896 2023-01-22 15:07:08.864599: step: 536/463, loss: 2.154644727706909 2023-01-22 15:07:09.486674: step: 538/463, loss: 0.1672067791223526 2023-01-22 15:07:10.090904: step: 540/463, loss: 0.14491000771522522 2023-01-22 15:07:10.715084: step: 542/463, loss: 0.21605603396892548 2023-01-22 15:07:11.359807: step: 544/463, loss: 0.07731068879365921 2023-01-22 15:07:11.963877: step: 546/463, loss: 0.3072422742843628 2023-01-22 15:07:12.606419: step: 548/463, loss: 
0.1035570502281189 2023-01-22 15:07:13.199614: step: 550/463, loss: 0.14036881923675537 2023-01-22 15:07:13.793377: step: 552/463, loss: 0.19857291877269745 2023-01-22 15:07:14.398387: step: 554/463, loss: 2.495504856109619 2023-01-22 15:07:14.994806: step: 556/463, loss: 0.9159181118011475 2023-01-22 15:07:15.598211: step: 558/463, loss: 0.6913150548934937 2023-01-22 15:07:16.222438: step: 560/463, loss: 0.8678516149520874 2023-01-22 15:07:16.860613: step: 562/463, loss: 0.3394131660461426 2023-01-22 15:07:17.495590: step: 564/463, loss: 0.20960289239883423 2023-01-22 15:07:18.064998: step: 566/463, loss: 0.9923339486122131 2023-01-22 15:07:18.690775: step: 568/463, loss: 0.15965066850185394 2023-01-22 15:07:19.286345: step: 570/463, loss: 0.3059881627559662 2023-01-22 15:07:19.882238: step: 572/463, loss: 0.4895969033241272 2023-01-22 15:07:20.584440: step: 574/463, loss: 0.40519285202026367 2023-01-22 15:07:21.174448: step: 576/463, loss: 0.9017975926399231 2023-01-22 15:07:21.791611: step: 578/463, loss: 0.194340780377388 2023-01-22 15:07:22.458514: step: 580/463, loss: 0.39715301990509033 2023-01-22 15:07:23.031202: step: 582/463, loss: 0.3256192207336426 2023-01-22 15:07:23.739758: step: 584/463, loss: 0.8362942934036255 2023-01-22 15:07:24.288906: step: 586/463, loss: 0.47837021946907043 2023-01-22 15:07:24.865673: step: 588/463, loss: 0.4325508177280426 2023-01-22 15:07:25.459798: step: 590/463, loss: 0.24887965619564056 2023-01-22 15:07:26.065545: step: 592/463, loss: 0.23015880584716797 2023-01-22 15:07:26.663426: step: 594/463, loss: 1.168522834777832 2023-01-22 15:07:27.206423: step: 596/463, loss: 0.4639538824558258 2023-01-22 15:07:27.785238: step: 598/463, loss: 5.036430358886719 2023-01-22 15:07:28.403823: step: 600/463, loss: 0.12581677734851837 2023-01-22 15:07:28.950680: step: 602/463, loss: 0.14451941847801208 2023-01-22 15:07:29.548436: step: 604/463, loss: 0.24720533192157745 2023-01-22 15:07:30.208486: step: 606/463, loss: 2.2013638019561768 
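[Editor's note: the per-step lines in this log follow a fixed `timestamp: step: N/463, loss: X` format, so they can be scraped mechanically. A minimal parsing sketch follows; the regex and the idea that the epoch-end "Loss:" figure is the mean of the step losses are my assumptions for illustration, not taken from train.py.]

```python
import re

# Illustrative regex for the per-step entries in this log excerpt.
STEP_RE = re.compile(r"step: (\d+)/463, loss: ([0-9.]+)")

# Two entries copied verbatim from the epoch-5 stream above.
excerpt = (
    "2023-01-22 15:07:30.898180: step: 608/463, loss: 0.7725247740745544 "
    "2023-01-22 15:07:31.433410: step: 610/463, loss: 1.0460600852966309"
)
steps = [(int(s), float(l)) for s, l in STEP_RE.findall(excerpt)]
# Assumed (not confirmed) to be how the epoch-end "Loss:" line is produced:
mean_loss = sum(l for _, l in steps) / len(steps)
```

Running this over a full epoch's entries would let the epoch-end "Loss:" value be checked against the stream.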
2023-01-22 15:07:30.898180: step: 608/463, loss: 0.7725247740745544 2023-01-22 15:07:31.433410: step: 610/463, loss: 1.0460600852966309 2023-01-22 15:07:32.028627: step: 612/463, loss: 0.4521838426589966 2023-01-22 15:07:32.630191: step: 614/463, loss: 0.5793362259864807 2023-01-22 15:07:33.231785: step: 616/463, loss: 1.207414150238037 2023-01-22 15:07:33.788998: step: 618/463, loss: 0.24056366086006165 2023-01-22 15:07:34.470941: step: 620/463, loss: 0.29370030760765076 2023-01-22 15:07:35.064132: step: 622/463, loss: 0.37422001361846924 2023-01-22 15:07:35.654803: step: 624/463, loss: 0.5061303377151489 2023-01-22 15:07:36.306215: step: 626/463, loss: 0.9865623712539673 2023-01-22 15:07:36.868491: step: 628/463, loss: 0.31774213910102844 2023-01-22 15:07:37.505818: step: 630/463, loss: 0.5466070771217346 2023-01-22 15:07:38.081650: step: 632/463, loss: 0.6768357753753662 2023-01-22 15:07:38.702037: step: 634/463, loss: 0.412163108587265 2023-01-22 15:07:39.321689: step: 636/463, loss: 0.36361241340637207 2023-01-22 15:07:40.055376: step: 638/463, loss: 5.2101545333862305 2023-01-22 15:07:40.731411: step: 640/463, loss: 0.4686252772808075 2023-01-22 15:07:41.304113: step: 642/463, loss: 0.29262393712997437 2023-01-22 15:07:41.949809: step: 644/463, loss: 0.8827959299087524 2023-01-22 15:07:42.493470: step: 646/463, loss: 0.09450677782297134 2023-01-22 15:07:43.170987: step: 648/463, loss: 0.20455053448677063 2023-01-22 15:07:43.915803: step: 650/463, loss: 0.5380986928939819 2023-01-22 15:07:44.512015: step: 652/463, loss: 0.8523054122924805 2023-01-22 15:07:45.112281: step: 654/463, loss: 0.49564671516418457 2023-01-22 15:07:45.650145: step: 656/463, loss: 0.26141366362571716 2023-01-22 15:07:46.276531: step: 658/463, loss: 0.7567002177238464 2023-01-22 15:07:46.808087: step: 660/463, loss: 0.6325774788856506 2023-01-22 15:07:47.418833: step: 662/463, loss: 0.6355985999107361 2023-01-22 15:07:48.029109: step: 664/463, loss: 0.6316511631011963 2023-01-22 
15:07:48.675073: step: 666/463, loss: 0.19803586602210999 2023-01-22 15:07:49.236476: step: 668/463, loss: 1.0312713384628296 2023-01-22 15:07:49.833700: step: 670/463, loss: 0.36107760667800903 2023-01-22 15:07:50.426679: step: 672/463, loss: 0.8991619348526001 2023-01-22 15:07:51.047210: step: 674/463, loss: 0.30333635210990906 2023-01-22 15:07:51.626198: step: 676/463, loss: 0.19247207045555115 2023-01-22 15:07:52.221145: step: 678/463, loss: 0.22345250844955444 2023-01-22 15:07:52.901178: step: 680/463, loss: 2.244189977645874 2023-01-22 15:07:53.485358: step: 682/463, loss: 0.5507456660270691 2023-01-22 15:07:54.054359: step: 684/463, loss: 0.10249347239732742 2023-01-22 15:07:54.640072: step: 686/463, loss: 0.10633664578199387 2023-01-22 15:07:55.246435: step: 688/463, loss: 0.5916401147842407 2023-01-22 15:07:55.859049: step: 690/463, loss: 0.3296261429786682 2023-01-22 15:07:56.434059: step: 692/463, loss: 0.458748996257782 2023-01-22 15:07:57.167758: step: 694/463, loss: 0.22222070395946503 2023-01-22 15:07:57.889305: step: 696/463, loss: 0.23379574716091156 2023-01-22 15:07:58.587232: step: 698/463, loss: 0.2250123769044876 2023-01-22 15:07:59.206295: step: 700/463, loss: 0.5238831043243408 2023-01-22 15:07:59.819328: step: 702/463, loss: 1.195380449295044 2023-01-22 15:08:00.463188: step: 704/463, loss: 0.5709733963012695 2023-01-22 15:08:01.090005: step: 706/463, loss: 0.34684693813323975 2023-01-22 15:08:01.744161: step: 708/463, loss: 0.22114098072052002 2023-01-22 15:08:02.396077: step: 710/463, loss: 0.42899176478385925 2023-01-22 15:08:02.984451: step: 712/463, loss: 0.3177708387374878 2023-01-22 15:08:03.583114: step: 714/463, loss: 0.18190063536167145 2023-01-22 15:08:04.182388: step: 716/463, loss: 1.6301074028015137 2023-01-22 15:08:04.767816: step: 718/463, loss: 1.1122188568115234 2023-01-22 15:08:05.358217: step: 720/463, loss: 0.17177650332450867 2023-01-22 15:08:05.971495: step: 722/463, loss: 1.5505553483963013 2023-01-22 15:08:06.531884: 
step: 724/463, loss: 1.6699628829956055 2023-01-22 15:08:07.168646: step: 726/463, loss: 0.6964430809020996 2023-01-22 15:08:07.826770: step: 728/463, loss: 0.6751114726066589 2023-01-22 15:08:08.434307: step: 730/463, loss: 0.24304696917533875 2023-01-22 15:08:09.059361: step: 732/463, loss: 3.6442182064056396 2023-01-22 15:08:09.726395: step: 734/463, loss: 0.401777446269989 2023-01-22 15:08:10.307436: step: 736/463, loss: 0.5296493768692017 2023-01-22 15:08:10.864800: step: 738/463, loss: 0.3439177870750427 2023-01-22 15:08:11.476240: step: 740/463, loss: 0.8252872228622437 2023-01-22 15:08:12.084585: step: 742/463, loss: 0.37364664673805237 2023-01-22 15:08:12.698383: step: 744/463, loss: 0.6990154981613159 2023-01-22 15:08:13.266356: step: 746/463, loss: 0.29908475279808044 2023-01-22 15:08:13.988736: step: 748/463, loss: 1.3338626623153687 2023-01-22 15:08:14.701778: step: 750/463, loss: 0.40108582377433777 2023-01-22 15:08:15.314074: step: 752/463, loss: 0.7678399085998535 2023-01-22 15:08:15.966206: step: 754/463, loss: 0.4231470227241516 2023-01-22 15:08:16.626404: step: 756/463, loss: 0.3153012692928314 2023-01-22 15:08:17.317335: step: 758/463, loss: 0.948556125164032 2023-01-22 15:08:17.947213: step: 760/463, loss: 0.34161147475242615 2023-01-22 15:08:18.609424: step: 762/463, loss: 0.12547671794891357 2023-01-22 15:08:19.242458: step: 764/463, loss: 0.3423321545124054 2023-01-22 15:08:19.762743: step: 766/463, loss: 0.6311486959457397 2023-01-22 15:08:20.362804: step: 768/463, loss: 0.26898983120918274 2023-01-22 15:08:20.942923: step: 770/463, loss: 0.21572089195251465 2023-01-22 15:08:21.520593: step: 772/463, loss: 0.13775970041751862 2023-01-22 15:08:22.132733: step: 774/463, loss: 0.45379137992858887 2023-01-22 15:08:22.796828: step: 776/463, loss: 0.27344968914985657 2023-01-22 15:08:23.387069: step: 778/463, loss: 0.20151278376579285 2023-01-22 15:08:23.952378: step: 780/463, loss: 0.2211664319038391 2023-01-22 15:08:24.567380: step: 782/463, 
loss: 0.28910571336746216 2023-01-22 15:08:25.116866: step: 784/463, loss: 0.1498117446899414 2023-01-22 15:08:25.732162: step: 786/463, loss: 0.32780176401138306 2023-01-22 15:08:26.390905: step: 788/463, loss: 1.679417371749878 2023-01-22 15:08:27.004924: step: 790/463, loss: 0.38252124190330505 2023-01-22 15:08:27.602863: step: 792/463, loss: 0.49831265211105347 2023-01-22 15:08:28.228373: step: 794/463, loss: 0.16623185575008392 2023-01-22 15:08:28.807060: step: 796/463, loss: 0.2659747302532196 2023-01-22 15:08:29.457874: step: 798/463, loss: 0.2202443927526474 2023-01-22 15:08:30.053150: step: 800/463, loss: 1.234512209892273 2023-01-22 15:08:30.633364: step: 802/463, loss: 0.07375361025333405 2023-01-22 15:08:31.277619: step: 804/463, loss: 0.12125623226165771 2023-01-22 15:08:31.910376: step: 806/463, loss: 2.1898014545440674 2023-01-22 15:08:32.516040: step: 808/463, loss: 0.910386860370636 2023-01-22 15:08:33.124392: step: 810/463, loss: 2.113795757293701 2023-01-22 15:08:33.788921: step: 812/463, loss: 0.5033143162727356 2023-01-22 15:08:34.365584: step: 814/463, loss: 1.286495566368103 2023-01-22 15:08:35.011490: step: 816/463, loss: 0.8976309895515442 2023-01-22 15:08:35.580853: step: 818/463, loss: 1.146912932395935 2023-01-22 15:08:36.207054: step: 820/463, loss: 0.7825620770454407 2023-01-22 15:08:36.798727: step: 822/463, loss: 0.5147075057029724 2023-01-22 15:08:37.417166: step: 824/463, loss: 2.006443500518799 2023-01-22 15:08:37.982421: step: 826/463, loss: 0.8811243772506714 2023-01-22 15:08:38.557464: step: 828/463, loss: 0.2480451911687851 2023-01-22 15:08:39.142290: step: 830/463, loss: 0.8710680603981018 2023-01-22 15:08:39.751425: step: 832/463, loss: 0.44372621178627014 2023-01-22 15:08:40.306036: step: 834/463, loss: 1.1885063648223877 2023-01-22 15:08:40.853141: step: 836/463, loss: 0.24235740303993225 2023-01-22 15:08:41.415329: step: 838/463, loss: 0.9789263010025024 2023-01-22 15:08:42.069194: step: 840/463, loss: 0.4312312602996826 
2023-01-22 15:08:42.660265: step: 842/463, loss: 1.0303033590316772 2023-01-22 15:08:43.335884: step: 844/463, loss: 0.3283100724220276 2023-01-22 15:08:43.878512: step: 846/463, loss: 1.1429189443588257 2023-01-22 15:08:44.446491: step: 848/463, loss: 0.1295229196548462 2023-01-22 15:08:45.108863: step: 850/463, loss: 2.6772496700286865 2023-01-22 15:08:45.654176: step: 852/463, loss: 0.13577556610107422 2023-01-22 15:08:46.226806: step: 854/463, loss: 0.3224055767059326 2023-01-22 15:08:46.913956: step: 856/463, loss: 0.1329585462808609 2023-01-22 15:08:47.499842: step: 858/463, loss: 0.4568784832954407 2023-01-22 15:08:48.129858: step: 860/463, loss: 0.234943225979805 2023-01-22 15:08:48.776761: step: 862/463, loss: 0.23698650300502777 2023-01-22 15:08:49.365262: step: 864/463, loss: 0.1575796753168106 2023-01-22 15:08:50.036080: step: 866/463, loss: 0.19493438303470612 2023-01-22 15:08:50.716134: step: 868/463, loss: 0.7287836074829102 2023-01-22 15:08:51.321095: step: 870/463, loss: 3.6825809478759766 2023-01-22 15:08:51.886695: step: 872/463, loss: 0.6025021076202393 2023-01-22 15:08:52.514516: step: 874/463, loss: 0.7609367966651917 2023-01-22 15:08:53.118992: step: 876/463, loss: 0.06902584433555603 2023-01-22 15:08:53.749868: step: 878/463, loss: 1.083537220954895 2023-01-22 15:08:54.362694: step: 880/463, loss: 0.3196052014827728 2023-01-22 15:08:54.991368: step: 882/463, loss: 0.18319807946681976 2023-01-22 15:08:55.603677: step: 884/463, loss: 0.9641375541687012 2023-01-22 15:08:56.179147: step: 886/463, loss: 0.25898924469947815 2023-01-22 15:08:56.770432: step: 888/463, loss: 0.27265775203704834 2023-01-22 15:08:57.383806: step: 890/463, loss: 0.20357045531272888 2023-01-22 15:08:58.031211: step: 892/463, loss: 0.8031787276268005 2023-01-22 15:08:58.624537: step: 894/463, loss: 1.7253501415252686 2023-01-22 15:08:59.189094: step: 896/463, loss: 0.17325490713119507 2023-01-22 15:08:59.829989: step: 898/463, loss: 0.5367797017097473 2023-01-22 
15:09:00.439996: step: 900/463, loss: 0.3228587508201599 2023-01-22 15:09:01.168707: step: 902/463, loss: 0.2137545645236969 2023-01-22 15:09:01.799609: step: 904/463, loss: 0.626368522644043 2023-01-22 15:09:02.342044: step: 906/463, loss: 0.08115013688802719 2023-01-22 15:09:02.962642: step: 908/463, loss: 0.5737285017967224 2023-01-22 15:09:03.570985: step: 910/463, loss: 0.556083083152771 2023-01-22 15:09:04.156615: step: 912/463, loss: 0.9264906644821167 2023-01-22 15:09:04.755069: step: 914/463, loss: 0.36751627922058105 2023-01-22 15:09:05.344034: step: 916/463, loss: 0.7969188690185547 2023-01-22 15:09:05.902859: step: 918/463, loss: 0.22647371888160706 2023-01-22 15:09:06.497530: step: 920/463, loss: 0.11357276141643524 2023-01-22 15:09:07.214028: step: 922/463, loss: 0.3041425347328186 2023-01-22 15:09:07.784051: step: 924/463, loss: 1.3441200256347656 2023-01-22 15:09:08.404238: step: 926/463, loss: 0.15844851732254028
==================================================
Loss: 0.563
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27948282747603836, 'r': 0.3313565340909091, 'f1': 0.3032170710571924}, 'combined': 0.2234231049895102, 'epoch': 5}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3307446947006414, 'r': 0.31456853495982395, 'f1': 0.322453869766337}, 'combined': 0.2268519686798351, 'epoch': 5}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2825206611570248, 'r': 0.3237215909090909, 'f1': 0.30172109443954104}, 'combined': 0.2223208064291355, 'epoch': 5}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3279665758218992, 'r': 0.31078055438145036, 'f1': 0.3191423630195162}, 'combined': 0.2265910777438565, 'epoch': 5}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2882374138952349, 'r': 0.339006503842691, 'f1': 0.31156733512435314}, 'combined': 0.2295759311442602, 'epoch': 5}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34057803379390106, 'r': 0.3004225450933101, 'f1': 0.31924251891586086}, 'combined': 0.2266621884302612, 'epoch': 5}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.21450617283950615, 'r': 0.33095238095238094, 'f1': 0.26029962546816476}, 'combined': 0.17353308364544318, 'epoch': 5}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.44565217391304346, 'f1': 0.3203125}, 'combined': 0.16015625, 'epoch': 5}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2857142857142857, 'r': 0.27586206896551724, 'f1': 0.28070175438596495}, 'combined': 0.18713450292397663, 'epoch': 5}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.301291961130742, 'r': 0.3235887096774194, 'f1': 0.3120425434583715}, 'combined': 0.2299260846535369, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38111410877910484, 'r': 0.27421624899959984, 'f1': 0.318946559120102}, 'combined': 0.2264520569752724, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3888888888888889, 'r': 0.2413793103448276, 'f1': 0.2978723404255319}, 'combined': 0.19858156028368792, 'epoch': 0}
******************************
Epoch: 6
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:11:43.982233: step: 2/463, loss: 0.11328350752592087 2023-01-22 15:11:44.686270: step: 4/463, loss: 0.3801382780075073 2023-01-22 15:11:45.327751: step: 6/463, loss: 0.7553510665893555 2023-01-22 15:11:45.869917: step: 8/463, loss: 0.2715178430080414 2023-01-22 15:11:46.477343: step: 10/463, loss: 0.3160317540168762 2023-01-22 15:11:47.125122: step: 12/463, loss: 0.11956930160522461 2023-01-22 15:11:47.748870: step: 14/463, loss: 0.9501020908355713 2023-01-22 15:11:48.343514: step: 16/463, loss: 0.14126232266426086 2023-01-22 15:11:49.076719: step: 18/463, loss: 0.16749922931194305 2023-01-22 15:11:49.631015: step: 20/463, loss: 0.24191810190677643 2023-01-22 
15:11:50.245917: step: 22/463, loss: 0.14310456812381744 2023-01-22 15:11:50.858247: step: 24/463, loss: 0.2878342568874359 2023-01-22 15:11:51.487304: step: 26/463, loss: 0.05530151352286339 2023-01-22 15:11:52.078552: step: 28/463, loss: 0.3716318607330322 2023-01-22 15:11:52.724855: step: 30/463, loss: 0.07146845012903214 2023-01-22 15:11:53.288469: step: 32/463, loss: 0.07172932475805283 2023-01-22 15:11:53.903511: step: 34/463, loss: 0.10561642050743103 2023-01-22 15:11:54.546678: step: 36/463, loss: 0.25322550535202026 2023-01-22 15:11:55.172354: step: 38/463, loss: 0.5367940664291382 2023-01-22 15:11:55.910448: step: 40/463, loss: 0.955446720123291 2023-01-22 15:11:56.478123: step: 42/463, loss: 0.21596089005470276 2023-01-22 15:11:57.080085: step: 44/463, loss: 0.19279137253761292 2023-01-22 15:11:57.728934: step: 46/463, loss: 0.0892983078956604 2023-01-22 15:11:58.265336: step: 48/463, loss: 0.696919858455658 2023-01-22 15:11:58.876779: step: 50/463, loss: 0.20688240230083466 2023-01-22 15:11:59.421817: step: 52/463, loss: 0.5158486366271973 2023-01-22 15:12:00.049018: step: 54/463, loss: 0.20877091586589813 2023-01-22 15:12:00.680622: step: 56/463, loss: 0.16436204314231873 2023-01-22 15:12:01.241565: step: 58/463, loss: 0.2730752229690552 2023-01-22 15:12:01.848886: step: 60/463, loss: 0.13818666338920593 2023-01-22 15:12:02.453904: step: 62/463, loss: 0.2810833752155304 2023-01-22 15:12:03.131441: step: 64/463, loss: 0.16900062561035156 2023-01-22 15:12:03.793230: step: 66/463, loss: 0.11805994808673859 2023-01-22 15:12:04.390457: step: 68/463, loss: 0.4482196867465973 2023-01-22 15:12:04.952349: step: 70/463, loss: 0.019276566803455353 2023-01-22 15:12:05.578921: step: 72/463, loss: 0.35931089520454407 2023-01-22 15:12:06.242990: step: 74/463, loss: 0.2671157419681549 2023-01-22 15:12:06.853221: step: 76/463, loss: 0.20586392283439636 2023-01-22 15:12:07.415563: step: 78/463, loss: 0.48910677433013916 2023-01-22 15:12:07.963289: step: 80/463, loss: 
0.2637368440628052 2023-01-22 15:12:08.606698: step: 82/463, loss: 0.12059412896633148 2023-01-22 15:12:09.171674: step: 84/463, loss: 0.1311236321926117 2023-01-22 15:12:09.772035: step: 86/463, loss: 0.3183825612068176 2023-01-22 15:12:10.414172: step: 88/463, loss: 0.601168155670166 2023-01-22 15:12:11.031469: step: 90/463, loss: 0.12609867751598358 2023-01-22 15:12:11.645001: step: 92/463, loss: 0.2921961545944214 2023-01-22 15:12:12.252330: step: 94/463, loss: 0.38238218426704407 2023-01-22 15:12:12.923890: step: 96/463, loss: 0.20526336133480072 2023-01-22 15:12:13.560608: step: 98/463, loss: 0.12618792057037354 2023-01-22 15:12:14.258058: step: 100/463, loss: 0.16581925749778748 2023-01-22 15:12:14.935575: step: 102/463, loss: 0.1275063455104828 2023-01-22 15:12:15.455038: step: 104/463, loss: 0.23295395076274872 2023-01-22 15:12:15.997471: step: 106/463, loss: 0.09774941951036453 2023-01-22 15:12:16.639779: step: 108/463, loss: 0.5491511821746826 2023-01-22 15:12:17.252196: step: 110/463, loss: 0.1430748850107193 2023-01-22 15:12:17.855190: step: 112/463, loss: 0.3555144667625427 2023-01-22 15:12:18.446129: step: 114/463, loss: 0.3451424837112427 2023-01-22 15:12:19.163219: step: 116/463, loss: 0.15565650165081024 2023-01-22 15:12:19.795034: step: 118/463, loss: 0.10004019737243652 2023-01-22 15:12:20.402146: step: 120/463, loss: 0.09516075253486633 2023-01-22 15:12:20.994986: step: 122/463, loss: 0.31506070494651794 2023-01-22 15:12:21.617477: step: 124/463, loss: 0.46080759167671204 2023-01-22 15:12:22.190301: step: 126/463, loss: 0.32192856073379517 2023-01-22 15:12:22.925592: step: 128/463, loss: 0.5629552602767944 2023-01-22 15:12:23.452926: step: 130/463, loss: 0.22457252442836761 2023-01-22 15:12:24.017165: step: 132/463, loss: 0.2856089472770691 2023-01-22 15:12:24.628840: step: 134/463, loss: 0.15569323301315308 2023-01-22 15:12:25.256545: step: 136/463, loss: 0.17584070563316345 2023-01-22 15:12:25.843915: step: 138/463, loss: 0.1121608167886734 
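The per-language summaries printed at the end of each epoch are internally consistent: each 'f1' is the harmonic mean of the reported 'p' and 'r', and each 'combined' equals template F1 × slot F1 (e.g. 0.7368 × 0.3032 ≈ 0.2234 for Dev Chinese at epoch 5). A minimal sketch of that scoring arithmetic, assuming this multiplicative combination (the scoring code itself is not part of the log):

```python
def f1(p, r):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def combined_score(template, slot):
    # Assumed: "combined" is template F1 times slot F1,
    # which matches the numbers printed in the log.
    return template["f1"] * slot["f1"]


# Dev Chinese, epoch 5, copied from the summary above.
dev_chinese = {
    "template": {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579},
    "slot": {"p": 0.27948282747603836, "r": 0.3313565340909091, "f1": 0.3032170710571924},
}
print(combined_score(dev_chinese["template"], dev_chinese["slot"]))  # ≈ 0.2234
```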
2023-01-22 15:12:26.441593: step: 140/463, loss: 0.8706576824188232 2023-01-22 15:12:26.986685: step: 142/463, loss: 0.08985014259815216 2023-01-22 15:12:27.605645: step: 144/463, loss: 0.31690341234207153 2023-01-22 15:12:28.204205: step: 146/463, loss: 0.5659283995628357 2023-01-22 15:12:28.819794: step: 148/463, loss: 0.7791360020637512 2023-01-22 15:12:29.392973: step: 150/463, loss: 0.23787671327590942 2023-01-22 15:12:29.990246: step: 152/463, loss: 0.1963760256767273 2023-01-22 15:12:30.631573: step: 154/463, loss: 0.8513021469116211 2023-01-22 15:12:31.237926: step: 156/463, loss: 0.21973298490047455 2023-01-22 15:12:31.826696: step: 158/463, loss: 0.048415202647447586 2023-01-22 15:12:32.429438: step: 160/463, loss: 0.11972279846668243 2023-01-22 15:12:32.991388: step: 162/463, loss: 0.2714700400829315 2023-01-22 15:12:33.631889: step: 164/463, loss: 0.2716275751590729 2023-01-22 15:12:34.278178: step: 166/463, loss: 1.0869468450546265 2023-01-22 15:12:34.769300: step: 168/463, loss: 0.0733659416437149 2023-01-22 15:12:35.358889: step: 170/463, loss: 0.49404239654541016 2023-01-22 15:12:35.943300: step: 172/463, loss: 0.1014707013964653 2023-01-22 15:12:36.547328: step: 174/463, loss: 0.3815869987010956 2023-01-22 15:12:37.231921: step: 176/463, loss: 0.27903908491134644 2023-01-22 15:12:38.066727: step: 178/463, loss: 0.13306820392608643 2023-01-22 15:12:38.690532: step: 180/463, loss: 0.7304980754852295 2023-01-22 15:12:39.318619: step: 182/463, loss: 0.2579108476638794 2023-01-22 15:12:39.955514: step: 184/463, loss: 0.37644118070602417 2023-01-22 15:12:40.516967: step: 186/463, loss: 0.12137850373983383 2023-01-22 15:12:41.246345: step: 188/463, loss: 0.1423615962266922 2023-01-22 15:12:41.850180: step: 190/463, loss: 0.25330784916877747 2023-01-22 15:12:42.507047: step: 192/463, loss: 0.09347507357597351 2023-01-22 15:12:43.104486: step: 194/463, loss: 0.18479356169700623 2023-01-22 15:12:43.696622: step: 196/463, loss: 0.22414341568946838 2023-01-22 
15:12:44.275842: step: 198/463, loss: 0.20709463953971863 2023-01-22 15:12:44.877162: step: 200/463, loss: 0.5765883922576904 2023-01-22 15:12:45.459760: step: 202/463, loss: 0.6164947748184204 2023-01-22 15:12:46.080787: step: 204/463, loss: 0.11510085314512253 2023-01-22 15:12:46.663392: step: 206/463, loss: 0.6497308015823364 2023-01-22 15:12:47.285427: step: 208/463, loss: 0.7727522850036621 2023-01-22 15:12:47.916930: step: 210/463, loss: 0.4434744715690613 2023-01-22 15:12:48.550474: step: 212/463, loss: 0.7232543230056763 2023-01-22 15:12:49.153625: step: 214/463, loss: 2.5719962120056152 2023-01-22 15:12:49.814648: step: 216/463, loss: 0.8470968008041382 2023-01-22 15:12:50.433035: step: 218/463, loss: 0.34402573108673096 2023-01-22 15:12:51.030817: step: 220/463, loss: 0.7044435739517212 2023-01-22 15:12:51.633706: step: 222/463, loss: 0.7603880167007446 2023-01-22 15:12:52.248170: step: 224/463, loss: 0.4105510413646698 2023-01-22 15:12:52.816819: step: 226/463, loss: 0.13848306238651276 2023-01-22 15:12:53.394312: step: 228/463, loss: 0.3661164343357086 2023-01-22 15:12:54.035995: step: 230/463, loss: 0.39434897899627686 2023-01-22 15:12:54.679203: step: 232/463, loss: 0.2017090916633606 2023-01-22 15:12:55.225836: step: 234/463, loss: 0.2638443410396576 2023-01-22 15:12:55.880938: step: 236/463, loss: 0.7996578216552734 2023-01-22 15:12:56.458731: step: 238/463, loss: 0.2455178052186966 2023-01-22 15:12:57.085222: step: 240/463, loss: 0.12759089469909668 2023-01-22 15:12:57.718494: step: 242/463, loss: 7.540344715118408 2023-01-22 15:12:58.307295: step: 244/463, loss: 0.19620126485824585 2023-01-22 15:12:58.880971: step: 246/463, loss: 0.23500457406044006 2023-01-22 15:12:59.503909: step: 248/463, loss: 0.2618486285209656 2023-01-22 15:13:00.168209: step: 250/463, loss: 0.6461195945739746 2023-01-22 15:13:00.802257: step: 252/463, loss: 0.39269858598709106 2023-01-22 15:13:01.383470: step: 254/463, loss: 1.895592212677002 2023-01-22 15:13:02.034358: 
step: 256/463, loss: 0.18007637560367584 2023-01-22 15:13:02.608572: step: 258/463, loss: 0.3655993640422821 2023-01-22 15:13:03.209791: step: 260/463, loss: 1.1248749494552612 2023-01-22 15:13:03.804325: step: 262/463, loss: 0.1606884002685547 2023-01-22 15:13:04.412230: step: 264/463, loss: 0.22788532078266144 2023-01-22 15:13:05.079855: step: 266/463, loss: 0.23105552792549133 2023-01-22 15:13:05.612417: step: 268/463, loss: 0.3318527936935425 2023-01-22 15:13:06.186527: step: 270/463, loss: 0.7743659615516663 2023-01-22 15:13:06.840723: step: 272/463, loss: 0.4116678535938263 2023-01-22 15:13:07.422169: step: 274/463, loss: 0.24228893220424652 2023-01-22 15:13:07.979129: step: 276/463, loss: 0.16768258810043335 2023-01-22 15:13:08.555671: step: 278/463, loss: 0.10749752074480057 2023-01-22 15:13:09.287130: step: 280/463, loss: 0.23973722755908966 2023-01-22 15:13:09.956280: step: 282/463, loss: 0.41224515438079834 2023-01-22 15:13:10.559551: step: 284/463, loss: 0.11004934459924698 2023-01-22 15:13:11.241364: step: 286/463, loss: 0.31654468178749084 2023-01-22 15:13:11.881625: step: 288/463, loss: 0.1775241196155548 2023-01-22 15:13:12.500106: step: 290/463, loss: 0.5342812538146973 2023-01-22 15:13:13.203146: step: 292/463, loss: 0.2573208212852478 2023-01-22 15:13:13.788608: step: 294/463, loss: 0.28713127970695496 2023-01-22 15:13:14.412772: step: 296/463, loss: 0.49929067492485046 2023-01-22 15:13:15.029963: step: 298/463, loss: 0.9437546133995056 2023-01-22 15:13:15.651239: step: 300/463, loss: 0.2675062417984009 2023-01-22 15:13:16.264157: step: 302/463, loss: 0.24930548667907715 2023-01-22 15:13:16.909321: step: 304/463, loss: 0.3211551308631897 2023-01-22 15:13:17.503649: step: 306/463, loss: 0.17962241172790527 2023-01-22 15:13:18.095202: step: 308/463, loss: 0.5992646217346191 2023-01-22 15:13:18.770585: step: 310/463, loss: 0.1418333202600479 2023-01-22 15:13:19.419045: step: 312/463, loss: 0.34463557600975037 2023-01-22 15:13:20.060532: step: 
314/463, loss: 0.7750646471977234 2023-01-22 15:13:20.661028: step: 316/463, loss: 0.10062951594591141 2023-01-22 15:13:21.263408: step: 318/463, loss: 0.13646477460861206 2023-01-22 15:13:21.951354: step: 320/463, loss: 0.2891658544540405 2023-01-22 15:13:22.602098: step: 322/463, loss: 0.44970205426216125 2023-01-22 15:13:23.263725: step: 324/463, loss: 0.5102471709251404 2023-01-22 15:13:23.866452: step: 326/463, loss: 2.1883232593536377 2023-01-22 15:13:24.500989: step: 328/463, loss: 0.7063361406326294 2023-01-22 15:13:25.106156: step: 330/463, loss: 0.17316219210624695 2023-01-22 15:13:25.723191: step: 332/463, loss: 0.25639572739601135 2023-01-22 15:13:26.319487: step: 334/463, loss: 0.9069021940231323 2023-01-22 15:13:26.929033: step: 336/463, loss: 0.37769222259521484 2023-01-22 15:13:27.509340: step: 338/463, loss: 0.19176307320594788 2023-01-22 15:13:28.066306: step: 340/463, loss: 0.37700071930885315 2023-01-22 15:13:28.619167: step: 342/463, loss: 0.24272394180297852 2023-01-22 15:13:29.242561: step: 344/463, loss: 0.09854038804769516 2023-01-22 15:13:29.860348: step: 346/463, loss: 0.1338016390800476 2023-01-22 15:13:30.496066: step: 348/463, loss: 0.14920343458652496 2023-01-22 15:13:31.200295: step: 350/463, loss: 1.257448434829712 2023-01-22 15:13:31.908837: step: 352/463, loss: 0.3763619661331177 2023-01-22 15:13:32.527713: step: 354/463, loss: 0.12848815321922302 2023-01-22 15:13:33.098382: step: 356/463, loss: 0.17355576157569885 2023-01-22 15:13:33.801738: step: 358/463, loss: 0.11232411116361618 2023-01-22 15:13:34.347592: step: 360/463, loss: 0.27039143443107605 2023-01-22 15:13:34.949393: step: 362/463, loss: 0.40133315324783325 2023-01-22 15:13:35.471937: step: 364/463, loss: 0.831834614276886 2023-01-22 15:13:36.126716: step: 366/463, loss: 0.1765107363462448 2023-01-22 15:13:36.727078: step: 368/463, loss: 0.18606682121753693 2023-01-22 15:13:37.340308: step: 370/463, loss: 0.2402883917093277 2023-01-22 15:13:37.901993: step: 372/463, 
loss: 0.3978153169155121 2023-01-22 15:13:38.496173: step: 374/463, loss: 0.10801620036363602 2023-01-22 15:13:39.073716: step: 376/463, loss: 0.6836568117141724 2023-01-22 15:13:39.587555: step: 378/463, loss: 0.43769174814224243 2023-01-22 15:13:40.216009: step: 380/463, loss: 0.6359347701072693 2023-01-22 15:13:40.804759: step: 382/463, loss: 0.1794208288192749 2023-01-22 15:13:41.379101: step: 384/463, loss: 0.22688336670398712 2023-01-22 15:13:42.020580: step: 386/463, loss: 0.30250388383865356 2023-01-22 15:13:42.674189: step: 388/463, loss: 0.2946070432662964 2023-01-22 15:13:43.268748: step: 390/463, loss: 0.11942090094089508 2023-01-22 15:13:43.919032: step: 392/463, loss: 1.2453315258026123 2023-01-22 15:13:44.489045: step: 394/463, loss: 0.16122104227542877 2023-01-22 15:13:45.041753: step: 396/463, loss: 0.24332432448863983 2023-01-22 15:13:45.668826: step: 398/463, loss: 0.5125413537025452 2023-01-22 15:13:46.245949: step: 400/463, loss: 3.077848434448242 2023-01-22 15:13:46.811799: step: 402/463, loss: 0.6677864193916321 2023-01-22 15:13:47.459985: step: 404/463, loss: 0.20504382252693176 2023-01-22 15:13:48.063197: step: 406/463, loss: 4.356563091278076 2023-01-22 15:13:48.753867: step: 408/463, loss: 0.3462993800640106 2023-01-22 15:13:49.340279: step: 410/463, loss: 0.23271521925926208 2023-01-22 15:13:49.898887: step: 412/463, loss: 0.6323425769805908 2023-01-22 15:13:50.484211: step: 414/463, loss: 0.3099965453147888 2023-01-22 15:13:51.113214: step: 416/463, loss: 0.3564698100090027 2023-01-22 15:13:51.732767: step: 418/463, loss: 0.518254816532135 2023-01-22 15:13:52.353264: step: 420/463, loss: 0.09125236421823502 2023-01-22 15:13:53.017280: step: 422/463, loss: 0.23809640109539032 2023-01-22 15:13:53.655867: step: 424/463, loss: 0.6290116906166077 2023-01-22 15:13:54.265587: step: 426/463, loss: 0.09893369674682617 2023-01-22 15:13:54.912054: step: 428/463, loss: 0.3670092225074768 2023-01-22 15:13:55.538376: step: 430/463, loss: 
0.5810518860816956 2023-01-22 15:13:56.158281: step: 432/463, loss: 0.36125072836875916 2023-01-22 15:13:56.757907: step: 434/463, loss: 0.17330946028232574 2023-01-22 15:13:57.464417: step: 436/463, loss: 1.062544584274292 2023-01-22 15:13:58.080252: step: 438/463, loss: 0.29372650384902954 2023-01-22 15:13:58.659772: step: 440/463, loss: 0.3027943968772888 2023-01-22 15:13:59.270054: step: 442/463, loss: 0.4568956196308136 2023-01-22 15:13:59.881688: step: 444/463, loss: 0.49882906675338745 2023-01-22 15:14:00.519234: step: 446/463, loss: 0.24776211380958557 2023-01-22 15:14:01.087141: step: 448/463, loss: 0.05334063991904259 2023-01-22 15:14:01.704979: step: 450/463, loss: 0.790149986743927 2023-01-22 15:14:02.291388: step: 452/463, loss: 0.11571644246578217 2023-01-22 15:14:02.880470: step: 454/463, loss: 0.831578254699707 2023-01-22 15:14:03.492372: step: 456/463, loss: 0.16061407327651978 2023-01-22 15:14:04.087016: step: 458/463, loss: 0.2282624989748001 2023-01-22 15:14:04.696760: step: 460/463, loss: 0.1299496293067932 2023-01-22 15:14:05.261914: step: 462/463, loss: 1.1056077480316162 2023-01-22 15:14:05.869507: step: 464/463, loss: 0.39444318413734436 2023-01-22 15:14:06.426703: step: 466/463, loss: 0.13419659435749054 2023-01-22 15:14:07.098029: step: 468/463, loss: 2.4475340843200684 2023-01-22 15:14:07.647361: step: 470/463, loss: 0.24671070277690887 2023-01-22 15:14:08.286764: step: 472/463, loss: 0.12967588007450104 2023-01-22 15:14:08.942877: step: 474/463, loss: 0.42040571570396423 2023-01-22 15:14:09.524998: step: 476/463, loss: 0.36479324102401733 2023-01-22 15:14:10.139513: step: 478/463, loss: 0.2027682214975357 2023-01-22 15:14:10.678447: step: 480/463, loss: 0.3207573890686035 2023-01-22 15:14:11.257034: step: 482/463, loss: 1.2976740598678589 2023-01-22 15:14:11.858412: step: 484/463, loss: 0.3222140669822693 2023-01-22 15:14:12.501685: step: 486/463, loss: 0.6018906235694885 2023-01-22 15:14:13.071704: step: 488/463, loss: 
0.32875415682792664 2023-01-22 15:14:13.666564: step: 490/463, loss: 1.298425555229187 2023-01-22 15:14:14.267204: step: 492/463, loss: 0.19678281247615814 2023-01-22 15:14:14.867516: step: 494/463, loss: 0.7787104845046997 2023-01-22 15:14:15.514197: step: 496/463, loss: 0.43866798281669617 2023-01-22 15:14:16.165977: step: 498/463, loss: 0.37102198600769043 2023-01-22 15:14:16.786739: step: 500/463, loss: 0.6937453746795654 2023-01-22 15:14:17.435790: step: 502/463, loss: 0.15943406522274017 2023-01-22 15:14:18.034076: step: 504/463, loss: 0.10120680928230286 2023-01-22 15:14:18.626493: step: 506/463, loss: 0.21074271202087402 2023-01-22 15:14:19.271120: step: 508/463, loss: 0.2929404675960541 2023-01-22 15:14:20.004628: step: 510/463, loss: 0.19745509326457977 2023-01-22 15:14:20.591105: step: 512/463, loss: 0.3255646824836731 2023-01-22 15:14:21.285020: step: 514/463, loss: 0.24985411763191223 2023-01-22 15:14:21.888544: step: 516/463, loss: 2.2063405513763428 2023-01-22 15:14:22.569919: step: 518/463, loss: 0.3745509684085846 2023-01-22 15:14:23.172352: step: 520/463, loss: 0.14399082958698273 2023-01-22 15:14:23.759039: step: 522/463, loss: 0.19667880237102509 2023-01-22 15:14:24.458140: step: 524/463, loss: 0.27852460741996765 2023-01-22 15:14:25.141988: step: 526/463, loss: 0.23019668459892273 2023-01-22 15:14:25.775469: step: 528/463, loss: 0.2512867748737335 2023-01-22 15:14:26.375133: step: 530/463, loss: 0.20609433948993683 2023-01-22 15:14:26.979285: step: 532/463, loss: 0.22840403020381927 2023-01-22 15:14:27.581687: step: 534/463, loss: 0.4465503394603729 2023-01-22 15:14:28.187122: step: 536/463, loss: 0.8308219313621521 2023-01-22 15:14:28.755780: step: 538/463, loss: 0.5853867530822754 2023-01-22 15:14:29.365999: step: 540/463, loss: 0.2560493052005768 2023-01-22 15:14:29.958737: step: 542/463, loss: 0.353453129529953 2023-01-22 15:14:30.600170: step: 544/463, loss: 0.25842440128326416 2023-01-22 15:14:31.168744: step: 546/463, loss: 
0.2599759101867676 2023-01-22 15:14:31.782271: step: 548/463, loss: 0.15423175692558289 2023-01-22 15:14:32.481083: step: 550/463, loss: 0.16227617859840393 2023-01-22 15:14:33.079033: step: 552/463, loss: 0.08020909130573273 2023-01-22 15:14:33.724181: step: 554/463, loss: 0.13207575678825378 2023-01-22 15:14:34.259373: step: 556/463, loss: 0.2594393491744995 2023-01-22 15:14:34.804060: step: 558/463, loss: 0.3367311656475067 2023-01-22 15:14:35.489494: step: 560/463, loss: 0.6012117266654968 2023-01-22 15:14:36.160014: step: 562/463, loss: 0.2999015152454376 2023-01-22 15:14:36.774060: step: 564/463, loss: 0.19729477167129517 2023-01-22 15:14:37.400862: step: 566/463, loss: 0.13765570521354675 2023-01-22 15:14:38.022459: step: 568/463, loss: 0.4359719157218933 2023-01-22 15:14:38.595339: step: 570/463, loss: 5.058111190795898 2023-01-22 15:14:39.193311: step: 572/463, loss: 0.9088137149810791 2023-01-22 15:14:39.753710: step: 574/463, loss: 2.058014154434204 2023-01-22 15:14:40.360647: step: 576/463, loss: 0.3221476078033447 2023-01-22 15:14:40.972759: step: 578/463, loss: 0.12822946906089783 2023-01-22 15:14:41.554221: step: 580/463, loss: 0.34033873677253723 2023-01-22 15:14:42.122846: step: 582/463, loss: 0.34254229068756104 2023-01-22 15:14:42.760800: step: 584/463, loss: 0.3849256634712219 2023-01-22 15:14:43.447173: step: 586/463, loss: 0.2507324814796448 2023-01-22 15:14:44.041660: step: 588/463, loss: 0.2054041475057602 2023-01-22 15:14:44.629961: step: 590/463, loss: 0.49338018894195557 2023-01-22 15:14:45.264844: step: 592/463, loss: 0.4716828465461731 2023-01-22 15:14:45.868300: step: 594/463, loss: 0.5570412874221802 2023-01-22 15:14:46.523168: step: 596/463, loss: 0.9719157218933105 2023-01-22 15:14:47.190981: step: 598/463, loss: 0.1202988550066948 2023-01-22 15:14:47.727614: step: 600/463, loss: 0.5435601472854614 2023-01-22 15:14:48.382566: step: 602/463, loss: 0.3243338167667389 2023-01-22 15:14:49.027347: step: 604/463, loss: 0.21925713121891022 
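The command repeated at each epoch header passes two learning rates: --xlmr_learning_rate 2e-5 for the XLM-R encoder and --learning_rate 9e-4, presumably for the task-specific layers (the event/role hidden layers). This usually corresponds to optimizer parameter groups selected by parameter name. A framework-free sketch of that split, with the head parameter names invented for illustration (only the "xlmr.*" names appear in the log's trainable-params dump):

```python
XLMR_LR = 2e-5   # --xlmr_learning_rate: encoder fine-tuning rate
HEAD_LR = 9e-4   # --learning_rate: assumed to cover non-encoder parameters

def lr_for(param_name):
    # Encoder parameters in the log all start with "xlmr."; anything else
    # is treated as a task head and gets the larger rate.
    return XLMR_LR if param_name.startswith("xlmr.") else HEAD_LR

names = [
    "xlmr.embeddings.word_embeddings.weight",            # from the log's dump
    "xlmr.encoder.layer.0.attention.self.query.weight",  # from the log's dump
    "event_classifier.weight",                           # hypothetical head name
    "role_classifier.weight",                            # hypothetical head name
]
groups = {}
for n in names:
    groups.setdefault(lr_for(n), []).append(n)
print({lr: len(ps) for lr, ps in groups.items()})  # → {2e-05: 2, 0.0009: 2}
```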
2023-01-22 15:14:49.553385: step: 606/463, loss: 1.3484998941421509 2023-01-22 15:14:50.174914: step: 608/463, loss: 0.9550144672393799 2023-01-22 15:14:50.812929: step: 610/463, loss: 0.3153746724128723 2023-01-22 15:14:51.430091: step: 612/463, loss: 0.11488358676433563 2023-01-22 15:14:51.981387: step: 614/463, loss: 0.19734595715999603 2023-01-22 15:14:52.666262: step: 616/463, loss: 0.25671476125717163 2023-01-22 15:14:53.312104: step: 618/463, loss: 0.09313756972551346 2023-01-22 15:14:53.907168: step: 620/463, loss: 0.9959446787834167 2023-01-22 15:14:54.534537: step: 622/463, loss: 0.19460713863372803 2023-01-22 15:14:55.127175: step: 624/463, loss: 0.6067196130752563 2023-01-22 15:14:55.713435: step: 626/463, loss: 1.1854249238967896 2023-01-22 15:14:56.299508: step: 628/463, loss: 0.8055773377418518 2023-01-22 15:14:56.912225: step: 630/463, loss: 0.5025987029075623 2023-01-22 15:14:57.585494: step: 632/463, loss: 0.5142430067062378 2023-01-22 15:14:58.175849: step: 634/463, loss: 0.1331365555524826 2023-01-22 15:14:58.802783: step: 636/463, loss: 0.7431958913803101 2023-01-22 15:14:59.437580: step: 638/463, loss: 0.24125762283802032 2023-01-22 15:15:00.058433: step: 640/463, loss: 0.28557202219963074 2023-01-22 15:15:00.670032: step: 642/463, loss: 0.5778165459632874 2023-01-22 15:15:01.356082: step: 644/463, loss: 0.8635492920875549 2023-01-22 15:15:01.938971: step: 646/463, loss: 1.0520206689834595 2023-01-22 15:15:02.479560: step: 648/463, loss: 0.3385510742664337 2023-01-22 15:15:03.098758: step: 650/463, loss: 0.13173292577266693 2023-01-22 15:15:03.703735: step: 652/463, loss: 1.0415979623794556 2023-01-22 15:15:04.274660: step: 654/463, loss: 0.27827802300453186 2023-01-22 15:15:04.805681: step: 656/463, loss: 0.16841401159763336 2023-01-22 15:15:05.386853: step: 658/463, loss: 0.09695250540971756 2023-01-22 15:15:06.015703: step: 660/463, loss: 0.1893773376941681 2023-01-22 15:15:06.577730: step: 662/463, loss: 0.5418996214866638 2023-01-22 
15:15:07.223182: step: 664/463, loss: 0.3663601875305176 2023-01-22 15:15:07.890895: step: 666/463, loss: 1.2933084964752197 2023-01-22 15:15:08.500476: step: 668/463, loss: 0.5446755290031433 2023-01-22 15:15:09.118211: step: 670/463, loss: 0.5607168078422546 2023-01-22 15:15:09.722871: step: 672/463, loss: 0.45016294717788696 2023-01-22 15:15:10.361500: step: 674/463, loss: 0.29195892810821533 2023-01-22 15:15:10.992976: step: 676/463, loss: 0.41079187393188477 2023-01-22 15:15:11.645571: step: 678/463, loss: 0.5871946811676025 2023-01-22 15:15:12.295679: step: 680/463, loss: 0.37122276425361633 2023-01-22 15:15:12.955302: step: 682/463, loss: 0.8133432865142822 2023-01-22 15:15:13.590799: step: 684/463, loss: 0.20547789335250854 2023-01-22 15:15:14.141889: step: 686/463, loss: 0.11704316735267639 2023-01-22 15:15:14.725137: step: 688/463, loss: 0.16250087320804596 2023-01-22 15:15:15.322961: step: 690/463, loss: 0.14875176548957825 2023-01-22 15:15:15.988892: step: 692/463, loss: 0.19713599979877472 2023-01-22 15:15:16.574297: step: 694/463, loss: 0.4241962432861328 2023-01-22 15:15:17.213276: step: 696/463, loss: 0.23720425367355347 2023-01-22 15:15:17.858029: step: 698/463, loss: 0.3138216733932495 2023-01-22 15:15:18.472205: step: 700/463, loss: 0.3793291449546814 2023-01-22 15:15:19.097882: step: 702/463, loss: 0.4316708743572235 2023-01-22 15:15:19.684601: step: 704/463, loss: 0.12183252722024918 2023-01-22 15:15:20.277913: step: 706/463, loss: 0.12715552747249603 2023-01-22 15:15:20.919863: step: 708/463, loss: 6.672313690185547 2023-01-22 15:15:21.597194: step: 710/463, loss: 0.3807900547981262 2023-01-22 15:15:22.188546: step: 712/463, loss: 0.3769588768482208 2023-01-22 15:15:22.762489: step: 714/463, loss: 0.202292338013649 2023-01-22 15:15:23.358007: step: 716/463, loss: 1.0328387022018433 2023-01-22 15:15:24.013708: step: 718/463, loss: 0.5604069232940674 2023-01-22 15:15:24.599613: step: 720/463, loss: 0.22092115879058838 2023-01-22 15:15:25.216218: 
step: 722/463, loss: 0.23362916707992554 2023-01-22 15:15:25.820103: step: 724/463, loss: 0.8464804291725159 2023-01-22 15:15:26.421002: step: 726/463, loss: 0.6664421558380127 2023-01-22 15:15:27.102683: step: 728/463, loss: 0.6709352135658264 2023-01-22 15:15:27.731997: step: 730/463, loss: 0.2249390333890915 2023-01-22 15:15:28.448745: step: 732/463, loss: 0.7361041307449341 2023-01-22 15:15:29.036181: step: 734/463, loss: 0.4826154112815857 2023-01-22 15:15:29.623756: step: 736/463, loss: 0.09629987925291061 2023-01-22 15:15:30.240516: step: 738/463, loss: 0.22576287388801575 2023-01-22 15:15:30.850925: step: 740/463, loss: 0.18979591131210327 2023-01-22 15:15:31.466468: step: 742/463, loss: 0.287920743227005 2023-01-22 15:15:32.104142: step: 744/463, loss: 0.12172221392393112 2023-01-22 15:15:32.727144: step: 746/463, loss: 0.6339112520217896 2023-01-22 15:15:33.363323: step: 748/463, loss: 0.4870273172855377 2023-01-22 15:15:34.053770: step: 750/463, loss: 1.1368921995162964 2023-01-22 15:15:34.665325: step: 752/463, loss: 2.6646697521209717 2023-01-22 15:15:35.282578: step: 754/463, loss: 0.40921348333358765 2023-01-22 15:15:35.873891: step: 756/463, loss: 0.3033018112182617 2023-01-22 15:15:36.580984: step: 758/463, loss: 0.44696757197380066 2023-01-22 15:15:37.170698: step: 760/463, loss: 0.4322013556957245 2023-01-22 15:15:37.822972: step: 762/463, loss: 0.20454519987106323 2023-01-22 15:15:38.438033: step: 764/463, loss: 0.17951835691928864 2023-01-22 15:15:39.108359: step: 766/463, loss: 0.16549405455589294 2023-01-22 15:15:39.676443: step: 768/463, loss: 0.8966842293739319 2023-01-22 15:15:40.291310: step: 770/463, loss: 0.5106577277183533 2023-01-22 15:15:40.899579: step: 772/463, loss: 0.10148836672306061 2023-01-22 15:15:41.496479: step: 774/463, loss: 0.9061776995658875 2023-01-22 15:15:42.157416: step: 776/463, loss: 0.628853976726532 2023-01-22 15:15:42.820889: step: 778/463, loss: 1.1642197370529175 2023-01-22 15:15:43.462649: step: 780/463, 
loss: 1.5863817930221558 2023-01-22 15:15:44.102670: step: 782/463, loss: 0.38762885332107544 2023-01-22 15:15:44.710695: step: 784/463, loss: 0.12165044248104095 2023-01-22 15:15:45.361337: step: 786/463, loss: 0.5305049419403076 2023-01-22 15:15:45.905047: step: 788/463, loss: 1.103909969329834 2023-01-22 15:15:46.491447: step: 790/463, loss: 0.0926133468747139 2023-01-22 15:15:47.104967: step: 792/463, loss: 0.21463267505168915 2023-01-22 15:15:47.641643: step: 794/463, loss: 0.16561099886894226 2023-01-22 15:15:48.178757: step: 796/463, loss: 0.3911454677581787 2023-01-22 15:15:48.881270: step: 798/463, loss: 0.37328091263771057 2023-01-22 15:15:49.563159: step: 800/463, loss: 0.18910427391529083 2023-01-22 15:15:50.203991: step: 802/463, loss: 0.9788722991943359 2023-01-22 15:15:50.756421: step: 804/463, loss: 0.16694746911525726 2023-01-22 15:15:51.315551: step: 806/463, loss: 4.966589450836182 2023-01-22 15:15:51.900328: step: 808/463, loss: 0.5302117466926575 2023-01-22 15:15:52.507761: step: 810/463, loss: 0.2575676143169403 2023-01-22 15:15:53.152403: step: 812/463, loss: 0.7427481412887573 2023-01-22 15:15:53.779936: step: 814/463, loss: 6.533020973205566 2023-01-22 15:15:54.500204: step: 816/463, loss: 0.3338218927383423 2023-01-22 15:15:55.085806: step: 818/463, loss: 0.5398399233818054 2023-01-22 15:15:55.730799: step: 820/463, loss: 0.1866040676832199 2023-01-22 15:15:56.332894: step: 822/463, loss: 0.5480536818504333 2023-01-22 15:15:56.957172: step: 824/463, loss: 0.6829494833946228 2023-01-22 15:15:57.548720: step: 826/463, loss: 0.32083213329315186 2023-01-22 15:15:58.099556: step: 828/463, loss: 0.3168462812900543 2023-01-22 15:15:58.689469: step: 830/463, loss: 0.26398539543151855 2023-01-22 15:15:59.253292: step: 832/463, loss: 0.28886768221855164 2023-01-22 15:15:59.896178: step: 834/463, loss: 0.4076050817966461 2023-01-22 15:16:00.542495: step: 836/463, loss: 0.1953868567943573 2023-01-22 15:16:01.184156: step: 838/463, loss: 
0.9525861144065857 2023-01-22 15:16:01.906637: step: 840/463, loss: 0.6852613091468811 2023-01-22 15:16:02.538444: step: 842/463, loss: 2.841035842895508 2023-01-22 15:16:03.151382: step: 844/463, loss: 0.14252546429634094 2023-01-22 15:16:03.736344: step: 846/463, loss: 0.05969494581222534 2023-01-22 15:16:04.321789: step: 848/463, loss: 0.15092657506465912 2023-01-22 15:16:05.021179: step: 850/463, loss: 0.38697493076324463 2023-01-22 15:16:05.651269: step: 852/463, loss: 0.14084096252918243 2023-01-22 15:16:06.261656: step: 854/463, loss: 0.7985931634902954 2023-01-22 15:16:06.831144: step: 856/463, loss: 0.1944494992494583 2023-01-22 15:16:07.444083: step: 858/463, loss: 0.19951370358467102 2023-01-22 15:16:08.012215: step: 860/463, loss: 0.11571680754423141 2023-01-22 15:16:08.564663: step: 862/463, loss: 0.1899576038122177 2023-01-22 15:16:09.165384: step: 864/463, loss: 0.570140540599823 2023-01-22 15:16:09.785471: step: 866/463, loss: 0.8048853278160095 2023-01-22 15:16:10.339369: step: 868/463, loss: 0.23934796452522278 2023-01-22 15:16:10.930591: step: 870/463, loss: 0.13635562360286713 2023-01-22 15:16:11.522334: step: 872/463, loss: 1.5228379964828491 2023-01-22 15:16:12.194237: step: 874/463, loss: 0.07769973576068878 2023-01-22 15:16:12.836318: step: 876/463, loss: 0.10696491599082947 2023-01-22 15:16:13.435217: step: 878/463, loss: 0.48473307490348816 2023-01-22 15:16:14.059648: step: 880/463, loss: 0.1958603709936142 2023-01-22 15:16:14.698730: step: 882/463, loss: 0.1734141707420349 2023-01-22 15:16:15.345183: step: 884/463, loss: 0.3189689815044403 2023-01-22 15:16:16.030251: step: 886/463, loss: 0.4838476777076721 2023-01-22 15:16:16.610293: step: 888/463, loss: 0.5076563954353333 2023-01-22 15:16:17.239227: step: 890/463, loss: 0.6265429258346558 2023-01-22 15:16:17.841683: step: 892/463, loss: 0.2630029320716858 2023-01-22 15:16:18.490308: step: 894/463, loss: 0.5064318180084229 2023-01-22 15:16:19.087389: step: 896/463, loss: 
0.9581440687179565 2023-01-22 15:16:19.821750: step: 898/463, loss: 0.3112206757068634 2023-01-22 15:16:20.438924: step: 900/463, loss: 0.25947365164756775 2023-01-22 15:16:21.107329: step: 902/463, loss: 0.10890407115221024 2023-01-22 15:16:21.791919: step: 904/463, loss: 0.47971606254577637 2023-01-22 15:16:22.473917: step: 906/463, loss: 0.42998647689819336 2023-01-22 15:16:23.106022: step: 908/463, loss: 0.5029575228691101 2023-01-22 15:16:23.710874: step: 910/463, loss: 0.5117167830467224 2023-01-22 15:16:24.312735: step: 912/463, loss: 0.4868311882019043 2023-01-22 15:16:24.921700: step: 914/463, loss: 0.6755980849266052 2023-01-22 15:16:25.539035: step: 916/463, loss: 1.6050560474395752 2023-01-22 15:16:26.137056: step: 918/463, loss: 4.4098358154296875 2023-01-22 15:16:26.822088: step: 920/463, loss: 0.7806760668754578 2023-01-22 15:16:27.380101: step: 922/463, loss: 0.3419690728187561 2023-01-22 15:16:27.975815: step: 924/463, loss: 0.2484600692987442 2023-01-22 15:16:28.562160: step: 926/463, loss: 0.45909857749938965
==================================================
Loss: 0.506
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26815476190476195, 'r': 0.328197004608295, 'f1': 0.2951532788883472}, 'combined': 0.21748136339141372, 'epoch': 6}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3009323115341943, 'r': 0.31168926425686383, 'f1': 0.30621634783950563}, 'combined': 0.2154285864197527, 'epoch': 6}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.270383555351401, 'r': 0.3191244239631337, 'f1': 0.29273902772597293}, 'combined': 0.21570244148229584, 'epoch': 6}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3014546817272049, 'r': 0.3040851763321543, 'f1': 0.30276421553140653}, 'combined': 0.21496259302729862, 'epoch': 6}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2893626329100397, 'r': 0.33877940133869927, 'f1': 0.3121271757089065}, 'combined': 0.22998844525919426, 'epoch': 6}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3200163294971871, 'r': 0.2985143597142173, 'f1': 0.30889160833633683}, 'combined': 0.21931304191879913, 'epoch': 6}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.38095238095238093, 'f1': 0.3137254901960784}, 'combined': 0.2091503267973856, 'epoch': 6}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23026315789473684, 'r': 0.3804347826086957, 'f1': 0.28688524590163933}, 'combined': 0.14344262295081966, 'epoch': 6}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3977272727272727, 'r': 0.3017241379310345, 'f1': 0.34313725490196073}, 'combined': 0.2287581699346405, 'epoch': 6}
New best russian model...
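The evaluation dicts in this log report `p`/`r`/`f1` at the template and slot level plus a `combined` score. From the logged numbers, each `f1` is the harmonic mean of `p` and `r`, and `combined` is the product of the template and slot f1. A minimal sketch verifying this against the logged Dev Chinese values for epoch 6 (function names are illustrative, not taken from `train.py`):

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template_f1, slot_f1):
    """The log's 'combined' score: product of template and slot f1."""
    return template_f1 * slot_f1

# Logged Dev Chinese values, epoch 6:
template = f1(1.0, 0.5833333333333334)             # ≈ 0.7368421052631579
slot = f1(0.26815476190476195, 0.328197004608295)  # ≈ 0.2951532788883472
print(combined(template, slot))                    # ≈ 0.21748136339141372
```

The same relation holds for the other rows, e.g. Sample Korean: 0.5 × 0.28688524590163933 ≈ 0.14344262295081966.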
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2893626329100397, 'r': 0.33877940133869927, 'f1': 0.3121271757089065}, 'combined': 0.22998844525919426, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3200163294971871, 'r': 0.2985143597142173, 'f1': 0.30889160833633683}, 'combined': 0.21931304191879913, 'epoch': 6}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3977272727272727, 'r': 0.3017241379310345, 'f1': 0.34313725490196073}, 'combined': 0.2287581699346405, 'epoch': 6}
******************************
Epoch: 7
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:19:12.455732: step: 2/463, loss: 0.2292134016752243 2023-01-22 15:19:13.094970: step: 4/463, loss: 0.5140865445137024 2023-01-22 15:19:13.760632: step: 6/463, loss: 0.6457586884498596 2023-01-22 15:19:14.407916: step: 8/463, loss: 0.09470998495817184 2023-01-22 15:19:14.981179: step: 10/463, loss: 0.14442171156406403 2023-01-22 15:19:15.563145: step: 12/463, loss: 0.20705416798591614 2023-01-22 15:19:16.217153: step: 14/463, loss: 0.5588554739952087 2023-01-22 15:19:16.962639: step: 16/463, loss: 0.32287657260894775 2023-01-22 15:19:17.544214: step: 18/463, loss: 0.15792706608772278 2023-01-22 15:19:18.237018: step: 20/463, loss: 1.1745243072509766 2023-01-22 15:19:18.909761: step: 22/463, loss: 0.5958467125892639 2023-01-22 15:19:19.595579: step: 24/463, loss: 0.31491032242774963 2023-01-22 15:19:20.214843: step: 26/463, loss: 0.14607733488082886 2023-01-22 15:19:20.847998: step: 28/463, loss: 0.17305392026901245 2023-01-22 15:19:21.473658: step: 30/463, loss: 0.23052647709846497 2023-01-22 15:19:22.101061: step: 32/463, loss: 0.4587783217430115 2023-01-22 15:19:22.752860: step: 34/463, loss: 0.14067287743091583 2023-01-22 15:19:23.336776: step: 36/463, loss: 0.23134037852287292 2023-01-22 15:19:23.918589: step: 38/463, loss: 0.18950802087783813 2023-01-22 15:19:24.530080: step: 40/463, loss: 0.1487523764371872 2023-01-22 15:19:25.136167: step: 42/463, loss: 0.091647669672966 2023-01-22 15:19:25.699954: step: 44/463, loss: 0.14908897876739502 2023-01-22 15:19:26.296448: step: 46/463, loss: 0.15245479345321655 2023-01-22 15:19:26.969629: step: 48/463, loss: 0.11669953912496567 2023-01-22
15:19:27.552768: step: 50/463, loss: 0.23191702365875244 2023-01-22 15:19:28.209116: step: 52/463, loss: 0.22847819328308105 2023-01-22 15:19:28.830703: step: 54/463, loss: 0.4031789004802704 2023-01-22 15:19:29.426751: step: 56/463, loss: 0.10735268890857697 2023-01-22 15:19:30.033621: step: 58/463, loss: 0.08336841315031052 2023-01-22 15:19:30.632821: step: 60/463, loss: 0.11832805722951889 2023-01-22 15:19:31.203266: step: 62/463, loss: 1.0553998947143555 2023-01-22 15:19:31.820670: step: 64/463, loss: 0.0867585614323616 2023-01-22 15:19:32.362549: step: 66/463, loss: 0.026128387078642845 2023-01-22 15:19:32.974956: step: 68/463, loss: 0.15353162586688995 2023-01-22 15:19:33.560504: step: 70/463, loss: 0.409671813249588 2023-01-22 15:19:34.201787: step: 72/463, loss: 0.1064080223441124 2023-01-22 15:19:34.849669: step: 74/463, loss: 0.3898642659187317 2023-01-22 15:19:35.452962: step: 76/463, loss: 0.18286579847335815 2023-01-22 15:19:36.099838: step: 78/463, loss: 0.10900668054819107 2023-01-22 15:19:36.718505: step: 80/463, loss: 0.22107960283756256 2023-01-22 15:19:37.305960: step: 82/463, loss: 0.015773039311170578 2023-01-22 15:19:37.923879: step: 84/463, loss: 0.20778371393680573 2023-01-22 15:19:38.527235: step: 86/463, loss: 0.26144200563430786 2023-01-22 15:19:39.169785: step: 88/463, loss: 0.1410159468650818 2023-01-22 15:19:39.805867: step: 90/463, loss: 1.2241520881652832 2023-01-22 15:19:40.438165: step: 92/463, loss: 0.26139533519744873 2023-01-22 15:19:41.113036: step: 94/463, loss: 0.05062587931752205 2023-01-22 15:19:41.748750: step: 96/463, loss: 0.5319894552230835 2023-01-22 15:19:42.323347: step: 98/463, loss: 0.17293617129325867 2023-01-22 15:19:42.951317: step: 100/463, loss: 0.2626856565475464 2023-01-22 15:19:43.571616: step: 102/463, loss: 0.1319103091955185 2023-01-22 15:19:44.161948: step: 104/463, loss: 1.3577152490615845 2023-01-22 15:19:44.776641: step: 106/463, loss: 0.18683896958827972 2023-01-22 15:19:45.367161: step: 108/463, 
loss: 0.09024856239557266 2023-01-22 15:19:45.986426: step: 110/463, loss: 0.16265636682510376 2023-01-22 15:19:46.614526: step: 112/463, loss: 0.3683833181858063 2023-01-22 15:19:47.224508: step: 114/463, loss: 0.16429784893989563 2023-01-22 15:19:47.818818: step: 116/463, loss: 0.4095369577407837 2023-01-22 15:19:48.395572: step: 118/463, loss: 0.13131633400917053 2023-01-22 15:19:49.055927: step: 120/463, loss: 0.07633301615715027 2023-01-22 15:19:49.735540: step: 122/463, loss: 0.17867662012577057 2023-01-22 15:19:50.401513: step: 124/463, loss: 0.20043376088142395 2023-01-22 15:19:50.941628: step: 126/463, loss: 1.069822907447815 2023-01-22 15:19:51.653203: step: 128/463, loss: 0.3196142911911011 2023-01-22 15:19:52.346183: step: 130/463, loss: 0.176326721906662 2023-01-22 15:19:52.940708: step: 132/463, loss: 0.21627584099769592 2023-01-22 15:19:53.545244: step: 134/463, loss: 0.10179844498634338 2023-01-22 15:19:54.139770: step: 136/463, loss: 0.14972935616970062 2023-01-22 15:19:54.792899: step: 138/463, loss: 3.7853171825408936 2023-01-22 15:19:55.392771: step: 140/463, loss: 0.5580949187278748 2023-01-22 15:19:56.032155: step: 142/463, loss: 0.21673980355262756 2023-01-22 15:19:56.670809: step: 144/463, loss: 0.5594762563705444 2023-01-22 15:19:57.342513: step: 146/463, loss: 0.14537198841571808 2023-01-22 15:19:57.906228: step: 148/463, loss: 0.4432889521121979 2023-01-22 15:19:58.520615: step: 150/463, loss: 0.3505689203739166 2023-01-22 15:19:59.107744: step: 152/463, loss: 1.2365888357162476 2023-01-22 15:19:59.722621: step: 154/463, loss: 0.28022506833076477 2023-01-22 15:20:00.385155: step: 156/463, loss: 0.07978623360395432 2023-01-22 15:20:01.012856: step: 158/463, loss: 0.5222527980804443 2023-01-22 15:20:01.616706: step: 160/463, loss: 0.8907120227813721 2023-01-22 15:20:02.257525: step: 162/463, loss: 0.1499650627374649 2023-01-22 15:20:02.942849: step: 164/463, loss: 0.7439171075820923 2023-01-22 15:20:03.540498: step: 166/463, loss: 
0.3077215254306793 2023-01-22 15:20:04.150075: step: 168/463, loss: 0.7279030680656433 2023-01-22 15:20:04.740082: step: 170/463, loss: 0.3710136115550995 2023-01-22 15:20:05.350924: step: 172/463, loss: 0.1876528412103653 2023-01-22 15:20:05.928554: step: 174/463, loss: 0.08628953248262405 2023-01-22 15:20:06.512863: step: 176/463, loss: 0.49339529871940613 2023-01-22 15:20:07.246312: step: 178/463, loss: 0.42600178718566895 2023-01-22 15:20:07.924979: step: 180/463, loss: 0.34569844603538513 2023-01-22 15:20:08.592923: step: 182/463, loss: 0.28247255086898804 2023-01-22 15:20:09.379496: step: 184/463, loss: 0.18900494277477264 2023-01-22 15:20:09.977600: step: 186/463, loss: 0.08254369348287582 2023-01-22 15:20:10.598174: step: 188/463, loss: 0.38772132992744446 2023-01-22 15:20:11.184773: step: 190/463, loss: 0.2626463770866394 2023-01-22 15:20:11.842177: step: 192/463, loss: 0.4520440995693207 2023-01-22 15:20:12.476785: step: 194/463, loss: 0.09230727702379227 2023-01-22 15:20:13.046020: step: 196/463, loss: 0.10673794895410538 2023-01-22 15:20:13.593624: step: 198/463, loss: 0.29481348395347595 2023-01-22 15:20:14.162438: step: 200/463, loss: 0.5064359903335571 2023-01-22 15:20:14.783421: step: 202/463, loss: 0.22931136190891266 2023-01-22 15:20:15.397648: step: 204/463, loss: 0.47471046447753906 2023-01-22 15:20:16.090498: step: 206/463, loss: 0.41100674867630005 2023-01-22 15:20:16.720533: step: 208/463, loss: 0.15832196176052094 2023-01-22 15:20:17.425832: step: 210/463, loss: 0.17259295284748077 2023-01-22 15:20:17.975887: step: 212/463, loss: 0.07675910741090775 2023-01-22 15:20:18.618848: step: 214/463, loss: 0.2755281329154968 2023-01-22 15:20:19.235719: step: 216/463, loss: 0.36675071716308594 2023-01-22 15:20:19.817226: step: 218/463, loss: 0.16653701663017273 2023-01-22 15:20:20.419684: step: 220/463, loss: 1.5281400680541992 2023-01-22 15:20:21.032806: step: 222/463, loss: 0.2454165816307068 2023-01-22 15:20:21.593234: step: 224/463, loss: 
0.14281092584133148 2023-01-22 15:20:22.263266: step: 226/463, loss: 0.5093085765838623 2023-01-22 15:20:22.865208: step: 228/463, loss: 0.42358189821243286 2023-01-22 15:20:23.470036: step: 230/463, loss: 0.3917354941368103 2023-01-22 15:20:24.109269: step: 232/463, loss: 0.12960240244865417 2023-01-22 15:20:24.720227: step: 234/463, loss: 0.45275479555130005 2023-01-22 15:20:25.396883: step: 236/463, loss: 0.6266748309135437 2023-01-22 15:20:26.008935: step: 238/463, loss: 0.21018432080745697 2023-01-22 15:20:26.578599: step: 240/463, loss: 0.18314947187900543 2023-01-22 15:20:27.163981: step: 242/463, loss: 0.2929132878780365 2023-01-22 15:20:27.765117: step: 244/463, loss: 1.0889827013015747 2023-01-22 15:20:28.367284: step: 246/463, loss: 0.18940860033035278 2023-01-22 15:20:28.945552: step: 248/463, loss: 0.1883765161037445 2023-01-22 15:20:29.588422: step: 250/463, loss: 0.3153200149536133 2023-01-22 15:20:30.215425: step: 252/463, loss: 0.5354615449905396 2023-01-22 15:20:30.841987: step: 254/463, loss: 0.2133326232433319 2023-01-22 15:20:31.515387: step: 256/463, loss: 0.4468422830104828 2023-01-22 15:20:32.107417: step: 258/463, loss: 0.38873201608657837 2023-01-22 15:20:32.697179: step: 260/463, loss: 0.13451050221920013 2023-01-22 15:20:33.282695: step: 262/463, loss: 0.6182888746261597 2023-01-22 15:20:33.912835: step: 264/463, loss: 0.1433843970298767 2023-01-22 15:20:34.544277: step: 266/463, loss: 1.3607600927352905 2023-01-22 15:20:35.176760: step: 268/463, loss: 0.35817569494247437 2023-01-22 15:20:35.738587: step: 270/463, loss: 1.6474217176437378 2023-01-22 15:20:36.370318: step: 272/463, loss: 0.1847514510154724 2023-01-22 15:20:36.989530: step: 274/463, loss: 0.15895485877990723 2023-01-22 15:20:37.547830: step: 276/463, loss: 0.160550057888031 2023-01-22 15:20:38.162131: step: 278/463, loss: 0.5762653350830078 2023-01-22 15:20:38.722062: step: 280/463, loss: 0.09631293267011642 2023-01-22 15:20:39.350328: step: 282/463, loss: 
0.3952464163303375 2023-01-22 15:20:39.907048: step: 284/463, loss: 0.22372303903102875 2023-01-22 15:20:40.497207: step: 286/463, loss: 6.20141077041626 2023-01-22 15:20:41.173780: step: 288/463, loss: 0.15710070729255676 2023-01-22 15:20:41.785654: step: 290/463, loss: 1.4910959005355835 2023-01-22 15:20:42.385406: step: 292/463, loss: 0.11858776956796646 2023-01-22 15:20:42.976408: step: 294/463, loss: 0.17900453507900238 2023-01-22 15:20:43.564795: step: 296/463, loss: 0.1411808878183365 2023-01-22 15:20:44.191215: step: 298/463, loss: 0.0877775177359581 2023-01-22 15:20:44.797107: step: 300/463, loss: 0.17963624000549316 2023-01-22 15:20:45.440025: step: 302/463, loss: 0.3109763264656067 2023-01-22 15:20:46.090462: step: 304/463, loss: 0.18308374285697937 2023-01-22 15:20:46.706574: step: 306/463, loss: 0.43376490473747253 2023-01-22 15:20:47.272796: step: 308/463, loss: 0.21313676238059998 2023-01-22 15:20:47.901472: step: 310/463, loss: 0.40419626235961914 2023-01-22 15:20:48.500102: step: 312/463, loss: 0.02422020025551319 2023-01-22 15:20:49.144329: step: 314/463, loss: 0.1212041974067688 2023-01-22 15:20:49.796546: step: 316/463, loss: 0.14146606624126434 2023-01-22 15:20:50.456244: step: 318/463, loss: 0.49400806427001953 2023-01-22 15:20:51.125404: step: 320/463, loss: 0.27245575189590454 2023-01-22 15:20:51.759885: step: 322/463, loss: 0.07313971221446991 2023-01-22 15:20:52.330798: step: 324/463, loss: 0.1980755478143692 2023-01-22 15:20:52.953226: step: 326/463, loss: 0.30978578329086304 2023-01-22 15:20:53.559145: step: 328/463, loss: 0.1269151270389557 2023-01-22 15:20:54.203552: step: 330/463, loss: 0.11283473670482635 2023-01-22 15:20:54.801175: step: 332/463, loss: 0.15597856044769287 2023-01-22 15:20:55.349421: step: 334/463, loss: 0.1087990254163742 2023-01-22 15:20:56.005475: step: 336/463, loss: 0.1628381460905075 2023-01-22 15:20:56.707187: step: 338/463, loss: 0.33541399240493774 2023-01-22 15:20:57.435204: step: 340/463, loss: 
0.1708591878414154 2023-01-22 15:20:58.031703: step: 342/463, loss: 0.2645842432975769 2023-01-22 15:20:58.594719: step: 344/463, loss: 0.41883477568626404 2023-01-22 15:20:59.164847: step: 346/463, loss: 0.09723778814077377 2023-01-22 15:20:59.810987: step: 348/463, loss: 0.6665863394737244 2023-01-22 15:21:00.426367: step: 350/463, loss: 0.11722732335329056 2023-01-22 15:21:01.048801: step: 352/463, loss: 0.28830668330192566 2023-01-22 15:21:01.639008: step: 354/463, loss: 0.3687843382358551 2023-01-22 15:21:02.300491: step: 356/463, loss: 0.33121442794799805 2023-01-22 15:21:02.889272: step: 358/463, loss: 0.2664475440979004 2023-01-22 15:21:03.507712: step: 360/463, loss: 0.20674127340316772 2023-01-22 15:21:04.090322: step: 362/463, loss: 0.3410205841064453 2023-01-22 15:21:04.753473: step: 364/463, loss: 0.5178223252296448 2023-01-22 15:21:05.361544: step: 366/463, loss: 0.11074258387088776 2023-01-22 15:21:05.934992: step: 368/463, loss: 0.2785670757293701 2023-01-22 15:21:06.592160: step: 370/463, loss: 0.19549858570098877 2023-01-22 15:21:07.273851: step: 372/463, loss: 0.2714909315109253 2023-01-22 15:21:07.840636: step: 374/463, loss: 0.7515138387680054 2023-01-22 15:21:08.427604: step: 376/463, loss: 0.5290991067886353 2023-01-22 15:21:09.014040: step: 378/463, loss: 10.528651237487793 2023-01-22 15:21:09.613884: step: 380/463, loss: 0.5413484573364258 2023-01-22 15:21:10.195569: step: 382/463, loss: 0.12680268287658691 2023-01-22 15:21:10.867347: step: 384/463, loss: 0.7808083295822144 2023-01-22 15:21:11.506250: step: 386/463, loss: 0.18824996054172516 2023-01-22 15:21:12.107528: step: 388/463, loss: 0.25982654094696045 2023-01-22 15:21:12.713481: step: 390/463, loss: 0.13112325966358185 2023-01-22 15:21:13.313866: step: 392/463, loss: 0.13693010807037354 2023-01-22 15:21:13.974138: step: 394/463, loss: 0.1653897762298584 2023-01-22 15:21:14.561237: step: 396/463, loss: 0.12466676533222198 2023-01-22 15:21:15.116801: step: 398/463, loss: 
0.766970157623291 2023-01-22 15:21:15.708519: step: 400/463, loss: 0.23886620998382568 2023-01-22 15:21:16.322070: step: 402/463, loss: 0.10124516487121582 2023-01-22 15:21:16.939181: step: 404/463, loss: 0.2622757852077484 2023-01-22 15:21:17.606093: step: 406/463, loss: 0.2160647213459015 2023-01-22 15:21:18.211311: step: 408/463, loss: 0.15356677770614624 2023-01-22 15:21:18.830391: step: 410/463, loss: 0.12603981792926788 2023-01-22 15:21:19.409216: step: 412/463, loss: 0.09186432510614395 2023-01-22 15:21:20.052860: step: 414/463, loss: 0.32241737842559814 2023-01-22 15:21:20.628264: step: 416/463, loss: 0.13145940005779266 2023-01-22 15:21:21.232765: step: 418/463, loss: 2.5445446968078613 2023-01-22 15:21:21.845223: step: 420/463, loss: 0.15494303405284882 2023-01-22 15:21:22.537693: step: 422/463, loss: 0.2494560033082962 2023-01-22 15:21:23.167721: step: 424/463, loss: 0.23796483874320984 2023-01-22 15:21:23.799319: step: 426/463, loss: 0.6462833881378174 2023-01-22 15:21:24.425666: step: 428/463, loss: 0.29962360858917236 2023-01-22 15:21:25.045215: step: 430/463, loss: 0.774705171585083 2023-01-22 15:21:25.637232: step: 432/463, loss: 0.12601062655448914 2023-01-22 15:21:26.288409: step: 434/463, loss: 0.21410134434700012 2023-01-22 15:21:26.987444: step: 436/463, loss: 0.2624575197696686 2023-01-22 15:21:27.553908: step: 438/463, loss: 0.08458137512207031 2023-01-22 15:21:28.144926: step: 440/463, loss: 0.09152693301439285 2023-01-22 15:21:28.770575: step: 442/463, loss: 0.6456946730613708 2023-01-22 15:21:29.358082: step: 444/463, loss: 0.21645833551883698 2023-01-22 15:21:29.987911: step: 446/463, loss: 0.7098376154899597 2023-01-22 15:21:30.544207: step: 448/463, loss: 0.40361952781677246 2023-01-22 15:21:31.237173: step: 450/463, loss: 0.1525217890739441 2023-01-22 15:21:31.803648: step: 452/463, loss: 0.5160903930664062 2023-01-22 15:21:32.406043: step: 454/463, loss: 0.08593238145112991 2023-01-22 15:21:33.114100: step: 456/463, loss: 
0.3016653060913086 2023-01-22 15:21:33.754445: step: 458/463, loss: 0.35271504521369934 2023-01-22 15:21:34.405658: step: 460/463, loss: 0.20796242356300354 2023-01-22 15:21:35.033110: step: 462/463, loss: 0.3434599041938782 2023-01-22 15:21:35.613878: step: 464/463, loss: 0.2708801031112671 2023-01-22 15:21:36.180341: step: 466/463, loss: 0.2593349814414978 2023-01-22 15:21:36.757648: step: 468/463, loss: 0.16519767045974731 2023-01-22 15:21:37.468821: step: 470/463, loss: 1.0911171436309814 2023-01-22 15:21:38.135385: step: 472/463, loss: 0.1519259810447693 2023-01-22 15:21:38.781731: step: 474/463, loss: 0.498020201921463 2023-01-22 15:21:39.413393: step: 476/463, loss: 0.1623341143131256 2023-01-22 15:21:40.028731: step: 478/463, loss: 0.15821684896945953 2023-01-22 15:21:40.654927: step: 480/463, loss: 0.1260978877544403 2023-01-22 15:21:41.269277: step: 482/463, loss: 0.18318217992782593 2023-01-22 15:21:41.928226: step: 484/463, loss: 1.4226531982421875 2023-01-22 15:21:42.587436: step: 486/463, loss: 0.5629276633262634 2023-01-22 15:21:43.197101: step: 488/463, loss: 0.32553690671920776 2023-01-22 15:21:43.916683: step: 490/463, loss: 0.22068104147911072 2023-01-22 15:21:44.575297: step: 492/463, loss: 0.7626279592514038 2023-01-22 15:21:45.233627: step: 494/463, loss: 0.6075904369354248 2023-01-22 15:21:45.854113: step: 496/463, loss: 0.2067672610282898 2023-01-22 15:21:46.473553: step: 498/463, loss: 0.19533081352710724 2023-01-22 15:21:47.075830: step: 500/463, loss: 0.7950993776321411 2023-01-22 15:21:47.666594: step: 502/463, loss: 0.3475831151008606 2023-01-22 15:21:48.286225: step: 504/463, loss: 0.2884642779827118 2023-01-22 15:21:48.833949: step: 506/463, loss: 0.2806173264980316 2023-01-22 15:21:49.520961: step: 508/463, loss: 0.1194763332605362 2023-01-22 15:21:50.151000: step: 510/463, loss: 0.24674589931964874 2023-01-22 15:21:50.787040: step: 512/463, loss: 0.0823177620768547 2023-01-22 15:21:51.430011: step: 514/463, loss: 0.16364946961402893 
2023-01-22 15:21:52.030000: step: 516/463, loss: 0.12027714401483536 2023-01-22 15:21:52.645513: step: 518/463, loss: 0.21341311931610107 2023-01-22 15:21:53.285500: step: 520/463, loss: 0.3254562318325043 2023-01-22 15:21:53.855221: step: 522/463, loss: 0.210825115442276 2023-01-22 15:21:54.445350: step: 524/463, loss: 0.1434701830148697 2023-01-22 15:21:55.054180: step: 526/463, loss: 1.439180850982666 2023-01-22 15:21:55.615968: step: 528/463, loss: 0.39130863547325134 2023-01-22 15:21:56.194803: step: 530/463, loss: 0.5652151107788086 2023-01-22 15:21:56.786980: step: 532/463, loss: 0.08535708487033844 2023-01-22 15:21:57.437742: step: 534/463, loss: 1.410870909690857 2023-01-22 15:21:58.035127: step: 536/463, loss: 0.15043194591999054 2023-01-22 15:21:58.665879: step: 538/463, loss: 0.19373323023319244 2023-01-22 15:21:59.268367: step: 540/463, loss: 0.24187472462654114 2023-01-22 15:21:59.829704: step: 542/463, loss: 0.5943361520767212 2023-01-22 15:22:00.400598: step: 544/463, loss: 0.12170377373695374 2023-01-22 15:22:01.094352: step: 546/463, loss: 0.3253842890262604 2023-01-22 15:22:01.679371: step: 548/463, loss: 0.9854581356048584 2023-01-22 15:22:02.231427: step: 550/463, loss: 0.08538780361413956 2023-01-22 15:22:02.772929: step: 552/463, loss: 0.19582825899124146 2023-01-22 15:22:03.386960: step: 554/463, loss: 0.13915352523326874 2023-01-22 15:22:03.935105: step: 556/463, loss: 0.16813842952251434 2023-01-22 15:22:04.588174: step: 558/463, loss: 0.21973265707492828 2023-01-22 15:22:05.220872: step: 560/463, loss: 0.14147542417049408 2023-01-22 15:22:05.828399: step: 562/463, loss: 0.5848942995071411 2023-01-22 15:22:06.440702: step: 564/463, loss: 0.151697039604187 2023-01-22 15:22:07.059996: step: 566/463, loss: 0.21819627285003662 2023-01-22 15:22:07.615289: step: 568/463, loss: 0.8666470646858215 2023-01-22 15:22:08.253789: step: 570/463, loss: 0.152870312333107 2023-01-22 15:22:08.958563: step: 572/463, loss: 0.2378864735364914 2023-01-22 
15:22:09.548802: step: 574/463, loss: 1.3122667074203491 2023-01-22 15:22:10.152300: step: 576/463, loss: 0.19613826274871826 2023-01-22 15:22:10.714080: step: 578/463, loss: 0.151960551738739 2023-01-22 15:22:11.357684: step: 580/463, loss: 0.40190252661705017 2023-01-22 15:22:11.964669: step: 582/463, loss: 0.11602989584207535 2023-01-22 15:22:12.566379: step: 584/463, loss: 0.0820671021938324 2023-01-22 15:22:13.108772: step: 586/463, loss: 0.21513324975967407 2023-01-22 15:22:13.692491: step: 588/463, loss: 0.08809860050678253 2023-01-22 15:22:14.280013: step: 590/463, loss: 0.12910224497318268 2023-01-22 15:22:14.900014: step: 592/463, loss: 0.4797820746898651 2023-01-22 15:22:15.493728: step: 594/463, loss: 0.254446417093277 2023-01-22 15:22:16.083395: step: 596/463, loss: 1.6758280992507935 2023-01-22 15:22:16.700023: step: 598/463, loss: 0.796886146068573 2023-01-22 15:22:17.350793: step: 600/463, loss: 0.24479810893535614 2023-01-22 15:22:17.938846: step: 602/463, loss: 0.4781230688095093 2023-01-22 15:22:18.538478: step: 604/463, loss: 0.39534300565719604 2023-01-22 15:22:19.103149: step: 606/463, loss: 0.9694533944129944 2023-01-22 15:22:19.706534: step: 608/463, loss: 0.3088534474372864 2023-01-22 15:22:20.272103: step: 610/463, loss: 1.3978685140609741 2023-01-22 15:22:20.892024: step: 612/463, loss: 0.6127835512161255 2023-01-22 15:22:21.437644: step: 614/463, loss: 0.12447532266378403 2023-01-22 15:22:22.083270: step: 616/463, loss: 0.36375147104263306 2023-01-22 15:22:22.671260: step: 618/463, loss: 0.08155945688486099 2023-01-22 15:22:23.206603: step: 620/463, loss: 0.16822339594364166 2023-01-22 15:22:23.736046: step: 622/463, loss: 0.5318288803100586 2023-01-22 15:22:24.349176: step: 624/463, loss: 0.13485561311244965 2023-01-22 15:22:24.983788: step: 626/463, loss: 0.17041334509849548 2023-01-22 15:22:25.650648: step: 628/463, loss: 0.14291955530643463 2023-01-22 15:22:26.317501: step: 630/463, loss: 0.21260246634483337 2023-01-22 
15:22:26.977181: step: 632/463, loss: 0.2879716455936432 2023-01-22 15:22:27.577225: step: 634/463, loss: 0.09450111538171768 2023-01-22 15:22:28.170109: step: 636/463, loss: 0.2226722538471222 2023-01-22 15:22:28.783464: step: 638/463, loss: 0.08204520493745804 2023-01-22 15:22:29.451308: step: 640/463, loss: 1.0668238401412964 2023-01-22 15:22:30.068002: step: 642/463, loss: 0.18608799576759338 2023-01-22 15:22:30.717966: step: 644/463, loss: 0.19619512557983398 2023-01-22 15:22:31.330735: step: 646/463, loss: 0.30225950479507446 2023-01-22 15:22:31.940723: step: 648/463, loss: 0.11910831928253174 2023-01-22 15:22:32.568962: step: 650/463, loss: 0.5913862586021423 2023-01-22 15:22:33.166781: step: 652/463, loss: 0.16275320947170258 2023-01-22 15:22:33.774519: step: 654/463, loss: 0.13180436193943024 2023-01-22 15:22:34.403589: step: 656/463, loss: 0.3615812063217163 2023-01-22 15:22:35.051929: step: 658/463, loss: 0.13242971897125244 2023-01-22 15:22:35.629760: step: 660/463, loss: 0.07663135230541229 2023-01-22 15:22:36.302500: step: 662/463, loss: 0.8886915445327759 2023-01-22 15:22:36.939945: step: 664/463, loss: 0.19857065379619598 2023-01-22 15:22:37.569813: step: 666/463, loss: 0.21508026123046875 2023-01-22 15:22:38.249171: step: 668/463, loss: 2.0331625938415527 2023-01-22 15:22:38.857438: step: 670/463, loss: 1.0885087251663208 2023-01-22 15:22:39.529577: step: 672/463, loss: 0.11972912400960922 2023-01-22 15:22:40.131802: step: 674/463, loss: 0.1584780514240265 2023-01-22 15:22:40.746644: step: 676/463, loss: 0.08678844571113586 2023-01-22 15:22:41.350559: step: 678/463, loss: 0.586555540561676 2023-01-22 15:22:41.902851: step: 680/463, loss: 0.15392273664474487 2023-01-22 15:22:42.547284: step: 682/463, loss: 1.791028618812561 2023-01-22 15:22:43.149420: step: 684/463, loss: 0.37482935190200806 2023-01-22 15:22:43.841211: step: 686/463, loss: 0.12166769057512283 2023-01-22 15:22:44.443097: step: 688/463, loss: 0.3020041286945343 2023-01-22 
15:22:45.102287: step: 690/463, loss: 0.27312037348747253 2023-01-22 15:22:45.751212: step: 692/463, loss: 1.1477407217025757 2023-01-22 15:22:46.335378: step: 694/463, loss: 0.24349983036518097 2023-01-22 15:22:46.980119: step: 696/463, loss: 0.4801204204559326 2023-01-22 15:22:47.572012: step: 698/463, loss: 0.10426265001296997 2023-01-22 15:22:48.152168: step: 700/463, loss: 0.2514320909976959 2023-01-22 15:22:48.727050: step: 702/463, loss: 0.5907490253448486 2023-01-22 15:22:49.379963: step: 704/463, loss: 0.09696906805038452 2023-01-22 15:22:50.025752: step: 706/463, loss: 0.8883567452430725 2023-01-22 15:22:50.670238: step: 708/463, loss: 0.27232369780540466 2023-01-22 15:22:51.240741: step: 710/463, loss: 0.27925753593444824 2023-01-22 15:22:51.847669: step: 712/463, loss: 0.16633141040802002 2023-01-22 15:22:52.526619: step: 714/463, loss: 0.2651751637458801 2023-01-22 15:22:53.164590: step: 716/463, loss: 0.4528287649154663 2023-01-22 15:22:53.792498: step: 718/463, loss: 0.16663207113742828 2023-01-22 15:22:54.440370: step: 720/463, loss: 0.37191087007522583 2023-01-22 15:22:55.046315: step: 722/463, loss: 0.11990674585103989 2023-01-22 15:22:55.661950: step: 724/463, loss: 0.25941210985183716 2023-01-22 15:22:56.251142: step: 726/463, loss: 0.06690103560686111 2023-01-22 15:22:56.895197: step: 728/463, loss: 1.0274930000305176 2023-01-22 15:22:57.474766: step: 730/463, loss: 6.5989909172058105 2023-01-22 15:22:58.074649: step: 732/463, loss: 1.6679614782333374 2023-01-22 15:22:58.580253: step: 734/463, loss: 0.6403878331184387 2023-01-22 15:22:59.233225: step: 736/463, loss: 0.15095177292823792 2023-01-22 15:22:59.873658: step: 738/463, loss: 0.25578272342681885 2023-01-22 15:23:00.485452: step: 740/463, loss: 0.19997969269752502 2023-01-22 15:23:01.074632: step: 742/463, loss: 0.10498865693807602 2023-01-22 15:23:01.643500: step: 744/463, loss: 0.21233795583248138 2023-01-22 15:23:02.243357: step: 746/463, loss: 0.3749246597290039 2023-01-22 
15:23:02.845220: step: 748/463, loss: 0.23353469371795654 2023-01-22 15:23:03.386105: step: 750/463, loss: 0.1104661151766777 2023-01-22 15:23:04.000593: step: 752/463, loss: 0.16461525857448578 2023-01-22 15:23:04.666443: step: 754/463, loss: 0.14122551679611206 2023-01-22 15:23:05.190362: step: 756/463, loss: 0.37910377979278564 2023-01-22 15:23:05.777323: step: 758/463, loss: 0.1912377029657364 2023-01-22 15:23:06.307828: step: 760/463, loss: 0.2415326088666916 2023-01-22 15:23:06.928734: step: 762/463, loss: 0.7621281743049622 2023-01-22 15:23:07.557635: step: 764/463, loss: 0.1676209568977356 2023-01-22 15:23:08.121840: step: 766/463, loss: 0.21012917160987854 2023-01-22 15:23:08.730826: step: 768/463, loss: 0.09846082329750061 2023-01-22 15:23:09.260129: step: 770/463, loss: 0.2181215137243271 2023-01-22 15:23:09.909882: step: 772/463, loss: 0.3579956293106079 2023-01-22 15:23:10.484040: step: 774/463, loss: 0.331439346075058 2023-01-22 15:23:11.077407: step: 776/463, loss: 0.22908668220043182 2023-01-22 15:23:11.666534: step: 778/463, loss: 0.0487535260617733 2023-01-22 15:23:12.224583: step: 780/463, loss: 0.12046513706445694 2023-01-22 15:23:12.852478: step: 782/463, loss: 0.5209836363792419 2023-01-22 15:23:13.446816: step: 784/463, loss: 0.14264340698719025 2023-01-22 15:23:14.049378: step: 786/463, loss: 0.2825098931789398 2023-01-22 15:23:14.667491: step: 788/463, loss: 0.8692577481269836 2023-01-22 15:23:15.223281: step: 790/463, loss: 0.13770322501659393 2023-01-22 15:23:15.822805: step: 792/463, loss: 0.4892885088920593 2023-01-22 15:23:16.491530: step: 794/463, loss: 0.3616708517074585 2023-01-22 15:23:17.072285: step: 796/463, loss: 0.20980167388916016 2023-01-22 15:23:17.709839: step: 798/463, loss: 0.2856959104537964 2023-01-22 15:23:18.303195: step: 800/463, loss: 0.8828518986701965 2023-01-22 15:23:18.868411: step: 802/463, loss: 0.10210683941841125 2023-01-22 15:23:19.480798: step: 804/463, loss: 0.24084636569023132 2023-01-22 
15:23:20.115874: step: 806/463, loss: 0.7847738862037659 2023-01-22 15:23:20.828217: step: 808/463, loss: 0.23459014296531677 2023-01-22 15:23:21.435351: step: 810/463, loss: 0.5269566178321838 2023-01-22 15:23:22.050099: step: 812/463, loss: 0.2852025628089905 2023-01-22 15:23:22.669628: step: 814/463, loss: 0.36101004481315613 2023-01-22 15:23:23.258844: step: 816/463, loss: 0.8603495359420776 2023-01-22 15:23:23.868788: step: 818/463, loss: 0.1541174054145813 2023-01-22 15:23:24.475198: step: 820/463, loss: 0.05553519353270531 2023-01-22 15:23:25.083928: step: 822/463, loss: 0.14664889872074127 2023-01-22 15:23:25.681486: step: 824/463, loss: 0.344748854637146 2023-01-22 15:23:26.274114: step: 826/463, loss: 0.0747259259223938 2023-01-22 15:23:26.909549: step: 828/463, loss: 0.24167515337467194 2023-01-22 15:23:27.569774: step: 830/463, loss: 0.5423164367675781 2023-01-22 15:23:28.197078: step: 832/463, loss: 0.30657103657722473 2023-01-22 15:23:28.879872: step: 834/463, loss: 0.4214096665382385 2023-01-22 15:23:29.462530: step: 836/463, loss: 0.48464950919151306 2023-01-22 15:23:30.103316: step: 838/463, loss: 0.1506676971912384 2023-01-22 15:23:30.740376: step: 840/463, loss: 0.21291394531726837 2023-01-22 15:23:31.338887: step: 842/463, loss: 0.33670076727867126 2023-01-22 15:23:31.941699: step: 844/463, loss: 0.1512867659330368 2023-01-22 15:23:32.583075: step: 846/463, loss: 0.22316919267177582 2023-01-22 15:23:33.187933: step: 848/463, loss: 0.33850619196891785 2023-01-22 15:23:33.815624: step: 850/463, loss: 0.18111850321292877 2023-01-22 15:23:34.467494: step: 852/463, loss: 0.4996117651462555 2023-01-22 15:23:35.017435: step: 854/463, loss: 0.3203403353691101 2023-01-22 15:23:35.731255: step: 856/463, loss: 0.3423388600349426 2023-01-22 15:23:36.350330: step: 858/463, loss: 0.3194054365158081 2023-01-22 15:23:36.971166: step: 860/463, loss: 0.16993840038776398 2023-01-22 15:23:37.580467: step: 862/463, loss: 0.3208369016647339 2023-01-22 
15:23:38.239214: step: 864/463, loss: 1.1828213930130005 2023-01-22 15:23:38.859377: step: 866/463, loss: 0.19894807040691376 2023-01-22 15:23:39.446695: step: 868/463, loss: 0.13642045855522156 2023-01-22 15:23:40.038453: step: 870/463, loss: 0.166127011179924 2023-01-22 15:23:40.673894: step: 872/463, loss: 0.2317439615726471 2023-01-22 15:23:41.304068: step: 874/463, loss: 0.16621087491512299 2023-01-22 15:23:41.932592: step: 876/463, loss: 0.5331152677536011 2023-01-22 15:23:42.562546: step: 878/463, loss: 0.23488563299179077 2023-01-22 15:23:43.161522: step: 880/463, loss: 0.18018820881843567 2023-01-22 15:23:43.762402: step: 882/463, loss: 0.15383575856685638 2023-01-22 15:23:44.371713: step: 884/463, loss: 0.709888219833374 2023-01-22 15:23:45.058898: step: 886/463, loss: 0.0976945161819458 2023-01-22 15:23:45.696031: step: 888/463, loss: 0.43309342861175537 2023-01-22 15:23:46.335211: step: 890/463, loss: 0.19633664190769196 2023-01-22 15:23:46.959549: step: 892/463, loss: 0.15317735075950623 2023-01-22 15:23:47.590395: step: 894/463, loss: 0.8126393556594849 2023-01-22 15:23:48.167787: step: 896/463, loss: 0.3557514548301697 2023-01-22 15:23:48.751165: step: 898/463, loss: 0.20849908888339996 2023-01-22 15:23:49.379418: step: 900/463, loss: 1.5301814079284668 2023-01-22 15:23:50.016666: step: 902/463, loss: 0.25665178894996643 2023-01-22 15:23:50.612347: step: 904/463, loss: 1.7843436002731323 2023-01-22 15:23:51.207157: step: 906/463, loss: 2.2349588871002197 2023-01-22 15:23:51.791690: step: 908/463, loss: 0.4244661331176758 2023-01-22 15:23:52.375127: step: 910/463, loss: 0.34122493863105774 2023-01-22 15:23:52.965500: step: 912/463, loss: 1.492700457572937 2023-01-22 15:23:53.668762: step: 914/463, loss: 0.32317012548446655 2023-01-22 15:23:54.360559: step: 916/463, loss: 0.6083212494850159 2023-01-22 15:23:54.982665: step: 918/463, loss: 0.25575634837150574 2023-01-22 15:23:55.523895: step: 920/463, loss: 0.7104517221450806 2023-01-22 15:23:56.219520: 
step: 922/463, loss: 0.20503701269626617 2023-01-22 15:23:56.829716: step: 924/463, loss: 0.18798765540122986 2023-01-22 15:23:57.431873: step: 926/463, loss: 0.1550433337688446
==================================================
Loss: 0.417
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28595472440944886, 'r': 0.34455645161290327, 'f1': 0.3125322719449226}, 'combined': 0.23028693722257454, 'epoch': 7}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34012196615338974, 'r': 0.2928910555696439, 'f1': 0.31474449425362955}, 'combined': 0.22142828741461376, 'epoch': 7}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28585209003215434, 'r': 0.3373814041745731, 'f1': 0.3094865100087032}, 'combined': 0.22804269158536022, 'epoch': 7}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3450865569819719, 'r': 0.2968648546962815, 'f1': 0.3191645620913073}, 'combined': 0.22660683908482815, 'epoch': 7}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2970570388349515, 'r': 0.3483515180265655, 'f1': 0.32066593886462885}, 'combined': 0.2362801654792002, 'epoch': 7}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3573347532314924, 'r': 0.2886765473704196, 'f1': 0.3193571466078555}, 'combined': 0.22674357409157742, 'epoch': 7}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26190476190476186, 'r': 0.36666666666666664, 'f1': 0.30555555555555547}, 'combined': 0.20370370370370364, 'epoch': 7}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.41304347826086957, 'f1': 0.31147540983606553}, 'combined': 0.15573770491803277, 'epoch': 7}
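Note on how the fields in these metric records relate: each 'f1' is the harmonic mean of the adjacent 'p' and 'r', and the 'combined' score matches the product of the template F1 and the slot F1. The sketch below reproduces the Dev Chinese record from epoch 7 above; the product relationship is inferred from the logged values, not confirmed against train.py.

```python
# Minimal sketch of the apparent metric arithmetic (an inference from
# the logged numbers, not the project's actual evaluation code).

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Values copied from the "Dev Chinese" record for epoch 7 above.
template_f1 = f1(p=1.0, r=0.5833333333333334)              # ~0.7368421052631579
slot_f1 = f1(p=0.28595472440944886, r=0.34455645161290327)  # ~0.3125322719449226
combined = template_f1 * slot_f1                            # ~0.23028693722257454
```

The same relationship holds for every other per-language record in the summary, which is why 'combined' drops sharply whenever template recall does.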
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35526315789473684, 'r': 0.23275862068965517, 'f1': 0.28125}, 'combined': 0.1875, 'epoch': 7}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2893626329100397, 'r': 0.33877940133869927, 'f1': 0.3121271757089065}, 'combined': 0.22998844525919426, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3200163294971871, 'r': 0.2985143597142173, 'f1': 0.30889160833633683}, 'combined': 0.21931304191879913, 'epoch': 6}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3977272727272727, 'r': 0.3017241379310345, 'f1': 0.34313725490196073}, 'combined': 0.2287581699346405, 'epoch': 6}
******************************
Epoch: 8
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:26:31.895758: step: 2/463, loss: 0.16487164795398712 2023-01-22 15:26:32.532296: step: 4/463, loss: 0.3504563271999359 2023-01-22 15:26:33.108342: step: 6/463, loss: 0.035538896918296814 2023-01-22 15:26:33.729099: step: 8/463, loss: 0.21702080965042114 2023-01-22 15:26:34.347585: step: 10/463, loss: 0.15845009684562683 2023-01-22 15:26:34.972343: step: 12/463, loss: 0.10672367364168167 2023-01-22 15:26:35.617268: step: 14/463, loss: 0.23821987211704254 2023-01-22 15:26:36.217600: step: 16/463, loss: 0.15477898716926575 2023-01-22 15:26:36.896425: step: 18/463, loss: 0.15609805285930634 2023-01-22 15:26:37.530564: step: 20/463, loss: 0.21649298071861267 2023-01-22 15:26:38.159201: step: 22/463, loss: 1.7248178720474243 2023-01-22 15:26:38.712208: step: 24/463, loss: 0.1018708348274231 2023-01-22 15:26:39.346237: step: 26/463, loss: 0.11073853075504303 2023-01-22 15:26:40.025930: step: 28/463, loss: 0.11904218047857285 2023-01-22 15:26:40.648535: step: 30/463, loss: 0.24052776396274567 2023-01-22 15:26:41.298297: step: 32/463, loss: 0.16408266127109528 2023-01-22 15:26:41.906177: step: 34/463, loss: 0.0856943279504776 2023-01-22 15:26:42.445808: step: 36/463, loss: 0.24771706759929657 2023-01-22 15:26:43.102282: step: 38/463, loss: 0.8553184270858765 2023-01-22 15:26:43.708499: step: 40/463, loss: 2.4277901649475098 2023-01-22 15:26:44.334470: step: 42/463, loss: 0.06999494880437851 2023-01-22 15:26:44.952832: step: 44/463, 
loss: 0.5136069655418396 2023-01-22 15:26:45.627402: step: 46/463, loss: 0.054063186049461365 2023-01-22 15:26:46.238774: step: 48/463, loss: 1.5367964506149292 2023-01-22 15:26:46.834210: step: 50/463, loss: 0.03032035566866398 2023-01-22 15:26:47.482166: step: 52/463, loss: 0.43803727626800537 2023-01-22 15:26:48.065567: step: 54/463, loss: 0.5416883230209351 2023-01-22 15:26:48.663529: step: 56/463, loss: 0.1979689598083496 2023-01-22 15:26:49.330099: step: 58/463, loss: 0.1386841982603073 2023-01-22 15:26:49.978960: step: 60/463, loss: 0.6161872148513794 2023-01-22 15:26:50.633202: step: 62/463, loss: 0.4542045593261719 2023-01-22 15:26:51.212790: step: 64/463, loss: 0.1458391398191452 2023-01-22 15:26:51.804202: step: 66/463, loss: 0.11994066834449768 2023-01-22 15:26:52.455352: step: 68/463, loss: 0.19210149347782135 2023-01-22 15:26:53.086046: step: 70/463, loss: 0.16066598892211914 2023-01-22 15:26:53.674748: step: 72/463, loss: 0.07313390076160431 2023-01-22 15:26:54.294984: step: 74/463, loss: 0.5544369220733643 2023-01-22 15:26:54.938546: step: 76/463, loss: 0.04329092800617218 2023-01-22 15:26:55.546132: step: 78/463, loss: 0.11809313297271729 2023-01-22 15:26:56.153446: step: 80/463, loss: 0.6050962209701538 2023-01-22 15:26:56.681179: step: 82/463, loss: 0.3386179506778717 2023-01-22 15:26:57.242146: step: 84/463, loss: 0.11928664892911911 2023-01-22 15:26:57.798084: step: 86/463, loss: 0.08769885450601578 2023-01-22 15:26:58.427708: step: 88/463, loss: 0.19669575989246368 2023-01-22 15:26:59.019134: step: 90/463, loss: 0.1995067447423935 2023-01-22 15:26:59.709118: step: 92/463, loss: 0.16610294580459595 2023-01-22 15:27:00.284363: step: 94/463, loss: 0.32979652285575867 2023-01-22 15:27:00.893130: step: 96/463, loss: 0.3520798087120056 2023-01-22 15:27:01.541923: step: 98/463, loss: 0.11900466680526733 2023-01-22 15:27:02.161780: step: 100/463, loss: 0.2860579490661621 2023-01-22 15:27:02.699764: step: 102/463, loss: 0.10451015830039978 2023-01-22 
15:27:03.238384: step: 104/463, loss: 0.17775273323059082 2023-01-22 15:27:03.866513: step: 106/463, loss: 0.11890896409749985 2023-01-22 15:27:04.457550: step: 108/463, loss: 0.05485021322965622 2023-01-22 15:27:05.097957: step: 110/463, loss: 0.5879225730895996 2023-01-22 15:27:05.682434: step: 112/463, loss: 0.22550435364246368 2023-01-22 15:27:06.244513: step: 114/463, loss: 0.0681144967675209 2023-01-22 15:27:06.916015: step: 116/463, loss: 0.3327588438987732 2023-01-22 15:27:07.583200: step: 118/463, loss: 0.14224553108215332 2023-01-22 15:27:08.233363: step: 120/463, loss: 0.7507123947143555 2023-01-22 15:27:08.879220: step: 122/463, loss: 0.3913622200489044 2023-01-22 15:27:09.458028: step: 124/463, loss: 0.1077384501695633 2023-01-22 15:27:10.059033: step: 126/463, loss: 0.22683611512184143 2023-01-22 15:27:10.646530: step: 128/463, loss: 0.18837259709835052 2023-01-22 15:27:11.276387: step: 130/463, loss: 0.22795245051383972 2023-01-22 15:27:11.923210: step: 132/463, loss: 0.6963698267936707 2023-01-22 15:27:12.515247: step: 134/463, loss: 0.3180541396141052 2023-01-22 15:27:13.142409: step: 136/463, loss: 0.23282858729362488 2023-01-22 15:27:13.735806: step: 138/463, loss: 0.15941829979419708 2023-01-22 15:27:14.335407: step: 140/463, loss: 0.37119928002357483 2023-01-22 15:27:14.930207: step: 142/463, loss: 0.19013936817646027 2023-01-22 15:27:15.575387: step: 144/463, loss: 0.5343262553215027 2023-01-22 15:27:16.192408: step: 146/463, loss: 0.15649515390396118 2023-01-22 15:27:16.832761: step: 148/463, loss: 0.2099848985671997 2023-01-22 15:27:17.488791: step: 150/463, loss: 0.26935648918151855 2023-01-22 15:27:18.143321: step: 152/463, loss: 0.2125912755727768 2023-01-22 15:27:18.730300: step: 154/463, loss: 0.09557577222585678 2023-01-22 15:27:19.326306: step: 156/463, loss: 0.04513729736208916 2023-01-22 15:27:19.903980: step: 158/463, loss: 0.045310549437999725 2023-01-22 15:27:20.470598: step: 160/463, loss: 0.1516764760017395 2023-01-22 
15:27:21.043422: step: 162/463, loss: 0.07136696577072144 2023-01-22 15:27:21.657202: step: 164/463, loss: 0.5766372680664062 2023-01-22 15:27:22.370944: step: 166/463, loss: 0.1483779400587082 2023-01-22 15:27:23.105525: step: 168/463, loss: 1.1384159326553345 2023-01-22 15:27:23.707278: step: 170/463, loss: 1.4216501712799072 2023-01-22 15:27:24.322187: step: 172/463, loss: 0.9407877326011658 2023-01-22 15:27:24.966647: step: 174/463, loss: 0.1822318732738495 2023-01-22 15:27:25.620111: step: 176/463, loss: 2.9570534229278564 2023-01-22 15:27:26.210169: step: 178/463, loss: 0.1017463207244873 2023-01-22 15:27:26.854542: step: 180/463, loss: 0.0747569352388382 2023-01-22 15:27:27.447786: step: 182/463, loss: 0.6701416969299316 2023-01-22 15:27:28.022971: step: 184/463, loss: 1.1429237127304077 2023-01-22 15:27:28.710946: step: 186/463, loss: 0.9213649034500122 2023-01-22 15:27:29.321698: step: 188/463, loss: 0.18768484890460968 2023-01-22 15:27:29.933981: step: 190/463, loss: 0.12778973579406738 2023-01-22 15:27:30.552975: step: 192/463, loss: 0.1750268042087555 2023-01-22 15:27:31.137443: step: 194/463, loss: 0.14634548127651215 2023-01-22 15:27:31.747131: step: 196/463, loss: 0.5044635534286499 2023-01-22 15:27:32.368329: step: 198/463, loss: 0.11382979899644852 2023-01-22 15:27:32.962660: step: 200/463, loss: 0.35524529218673706 2023-01-22 15:27:33.575260: step: 202/463, loss: 1.2491135597229004 2023-01-22 15:27:34.172159: step: 204/463, loss: 0.39710697531700134 2023-01-22 15:27:34.811833: step: 206/463, loss: 0.26609599590301514 2023-01-22 15:27:35.400794: step: 208/463, loss: 0.770419716835022 2023-01-22 15:27:35.966068: step: 210/463, loss: 0.13870513439178467 2023-01-22 15:27:36.627171: step: 212/463, loss: 0.05385138839483261 2023-01-22 15:27:37.299301: step: 214/463, loss: 0.20735955238342285 2023-01-22 15:27:37.945154: step: 216/463, loss: 0.23239685595035553 2023-01-22 15:27:38.492930: step: 218/463, loss: 0.18408022820949554 2023-01-22 
15:27:39.091765: step: 220/463, loss: 0.6688066124916077 2023-01-22 15:27:39.805200: step: 222/463, loss: 0.2528699040412903 2023-01-22 15:27:40.460953: step: 224/463, loss: 0.26364046335220337 2023-01-22 15:27:41.054504: step: 226/463, loss: 0.24640771746635437 2023-01-22 15:27:41.642306: step: 228/463, loss: 0.4132780432701111 2023-01-22 15:27:42.291988: step: 230/463, loss: 0.3018377721309662 2023-01-22 15:27:42.873358: step: 232/463, loss: 0.17244209349155426 2023-01-22 15:27:43.537662: step: 234/463, loss: 0.06652279198169708 2023-01-22 15:27:44.204041: step: 236/463, loss: 0.8406684398651123 2023-01-22 15:27:44.780399: step: 238/463, loss: 0.19631734490394592 2023-01-22 15:27:45.366182: step: 240/463, loss: 0.17777059972286224 2023-01-22 15:27:45.989698: step: 242/463, loss: 0.13294045627117157 2023-01-22 15:27:46.599334: step: 244/463, loss: 0.12604284286499023 2023-01-22 15:27:47.192911: step: 246/463, loss: 0.40341418981552124 2023-01-22 15:27:47.796179: step: 248/463, loss: 0.1610405296087265 2023-01-22 15:27:48.466686: step: 250/463, loss: 0.18504329025745392 2023-01-22 15:27:49.146511: step: 252/463, loss: 0.14189612865447998 2023-01-22 15:27:49.897941: step: 254/463, loss: 0.36384716629981995 2023-01-22 15:27:50.538566: step: 256/463, loss: 0.11381727457046509 2023-01-22 15:27:51.125863: step: 258/463, loss: 0.12138859927654266 2023-01-22 15:27:51.704034: step: 260/463, loss: 0.12843194603919983 2023-01-22 15:27:52.315227: step: 262/463, loss: 0.13399603962898254 2023-01-22 15:27:52.926840: step: 264/463, loss: 0.11417409032583237 2023-01-22 15:27:53.490620: step: 266/463, loss: 0.09756356477737427 2023-01-22 15:27:54.030596: step: 268/463, loss: 0.5185312628746033 2023-01-22 15:27:54.651287: step: 270/463, loss: 0.5843031406402588 2023-01-22 15:27:55.276190: step: 272/463, loss: 0.12065514177083969 2023-01-22 15:27:55.921470: step: 274/463, loss: 0.22943642735481262 2023-01-22 15:27:56.529315: step: 276/463, loss: 0.21956069767475128 2023-01-22 
15:27:57.134207: step: 278/463, loss: 0.39214786887168884 2023-01-22 15:27:57.762286: step: 280/463, loss: 0.09084703773260117 2023-01-22 15:27:58.475898: step: 282/463, loss: 0.906652569770813 2023-01-22 15:27:59.152252: step: 284/463, loss: 0.29256656765937805 2023-01-22 15:27:59.770412: step: 286/463, loss: 1.7061409950256348 2023-01-22 15:28:00.362661: step: 288/463, loss: 0.17310208082199097 2023-01-22 15:28:00.981088: step: 290/463, loss: 0.11019329726696014 2023-01-22 15:28:01.568552: step: 292/463, loss: 0.18812063336372375 2023-01-22 15:28:02.192177: step: 294/463, loss: 0.2367040514945984 2023-01-22 15:28:02.829362: step: 296/463, loss: 0.37064042687416077 2023-01-22 15:28:03.450590: step: 298/463, loss: 5.930190086364746 2023-01-22 15:28:04.020077: step: 300/463, loss: 0.15612800419330597 2023-01-22 15:28:04.623854: step: 302/463, loss: 0.10784886032342911 2023-01-22 15:28:05.213503: step: 304/463, loss: 0.4373374283313751 2023-01-22 15:28:05.799519: step: 306/463, loss: 0.381553053855896 2023-01-22 15:28:06.482887: step: 308/463, loss: 0.27639174461364746 2023-01-22 15:28:07.119451: step: 310/463, loss: 0.3075440227985382 2023-01-22 15:28:07.720385: step: 312/463, loss: 0.6651710271835327 2023-01-22 15:28:08.360895: step: 314/463, loss: 0.4784019887447357 2023-01-22 15:28:09.027366: step: 316/463, loss: 0.36471685767173767 2023-01-22 15:28:09.580155: step: 318/463, loss: 0.19372452795505524 2023-01-22 15:28:10.146301: step: 320/463, loss: 0.0941748321056366 2023-01-22 15:28:10.704364: step: 322/463, loss: 0.17864423990249634 2023-01-22 15:28:11.276892: step: 324/463, loss: 0.12077989429235458 2023-01-22 15:28:11.880889: step: 326/463, loss: 0.151410311460495 2023-01-22 15:28:12.551295: step: 328/463, loss: 0.12612590193748474 2023-01-22 15:28:13.080304: step: 330/463, loss: 0.13746872544288635 2023-01-22 15:28:13.775795: step: 332/463, loss: 0.30397140979766846 2023-01-22 15:28:14.457205: step: 334/463, loss: 0.8617913126945496 2023-01-22 
15:28:15.101179: step: 336/463, loss: 0.09834155440330505 2023-01-22 15:28:15.718488: step: 338/463, loss: 0.2935326099395752 2023-01-22 15:28:16.289760: step: 340/463, loss: 0.1721435785293579 2023-01-22 15:28:16.913751: step: 342/463, loss: 0.1528320461511612 2023-01-22 15:28:17.479106: step: 344/463, loss: 0.14741285145282745 2023-01-22 15:28:18.095779: step: 346/463, loss: 0.145978644490242 2023-01-22 15:28:18.670429: step: 348/463, loss: 1.7788876295089722 2023-01-22 15:28:19.339491: step: 350/463, loss: 1.7614234685897827 2023-01-22 15:28:19.944579: step: 352/463, loss: 0.05853121355175972 2023-01-22 15:28:20.549647: step: 354/463, loss: 0.09219097346067429 2023-01-22 15:28:21.147295: step: 356/463, loss: 0.07740720361471176 2023-01-22 15:28:21.905896: step: 358/463, loss: 0.16658952832221985 2023-01-22 15:28:22.571370: step: 360/463, loss: 0.14364755153656006 2023-01-22 15:28:23.157007: step: 362/463, loss: 0.03519139066338539 2023-01-22 15:28:23.761624: step: 364/463, loss: 0.8442787528038025 2023-01-22 15:28:24.377423: step: 366/463, loss: 0.0770379826426506 2023-01-22 15:28:24.966730: step: 368/463, loss: 1.1323152780532837 2023-01-22 15:28:25.639110: step: 370/463, loss: 0.2045743614435196 2023-01-22 15:28:26.219540: step: 372/463, loss: 0.34592413902282715 2023-01-22 15:28:26.869661: step: 374/463, loss: 0.6158236265182495 2023-01-22 15:28:27.464094: step: 376/463, loss: 0.10464627295732498 2023-01-22 15:28:28.126750: step: 378/463, loss: 0.15170647203922272 2023-01-22 15:28:28.773102: step: 380/463, loss: 0.12863703072071075 2023-01-22 15:28:29.389919: step: 382/463, loss: 0.5255241394042969 2023-01-22 15:28:29.964144: step: 384/463, loss: 0.12661273777484894 2023-01-22 15:28:30.630813: step: 386/463, loss: 0.15400183200836182 2023-01-22 15:28:31.269657: step: 388/463, loss: 0.1466009020805359 2023-01-22 15:28:31.846859: step: 390/463, loss: 0.28363877534866333 2023-01-22 15:28:32.449187: step: 392/463, loss: 0.48686954379081726 2023-01-22 
15:28:33.053958: step: 394/463, loss: 0.5147059559822083 2023-01-22 15:28:33.627257: step: 396/463, loss: 0.12725059688091278 2023-01-22 15:28:34.247131: step: 398/463, loss: 0.15045998990535736 2023-01-22 15:28:34.929789: step: 400/463, loss: 0.391051322221756 2023-01-22 15:28:35.632671: step: 402/463, loss: 0.09459737688302994 2023-01-22 15:28:36.224791: step: 404/463, loss: 0.19938690960407257 2023-01-22 15:28:36.838673: step: 406/463, loss: 0.5834435224533081 2023-01-22 15:28:37.441404: step: 408/463, loss: 0.14898522198200226 2023-01-22 15:28:37.973854: step: 410/463, loss: 0.07291276007890701 2023-01-22 15:28:38.579809: step: 412/463, loss: 0.20091824233531952 2023-01-22 15:28:39.299358: step: 414/463, loss: 0.27391061186790466 2023-01-22 15:28:39.931290: step: 416/463, loss: 0.7262064814567566 2023-01-22 15:28:40.480465: step: 418/463, loss: 0.23653005063533783 2023-01-22 15:28:41.063846: step: 420/463, loss: 0.16781355440616608 2023-01-22 15:28:41.705888: step: 422/463, loss: 0.24119384586811066 2023-01-22 15:28:42.439166: step: 424/463, loss: 0.34987956285476685 2023-01-22 15:28:43.025436: step: 426/463, loss: 0.14340338110923767 2023-01-22 15:28:43.574404: step: 428/463, loss: 0.24128663539886475 2023-01-22 15:28:44.292262: step: 430/463, loss: 0.8730706572532654 2023-01-22 15:28:44.879488: step: 432/463, loss: 0.7110679149627686 2023-01-22 15:28:45.511694: step: 434/463, loss: 0.37842637300491333 2023-01-22 15:28:46.138883: step: 436/463, loss: 0.0801948755979538 2023-01-22 15:28:46.705819: step: 438/463, loss: 1.588850498199463 2023-01-22 15:28:47.223801: step: 440/463, loss: 0.07357046008110046 2023-01-22 15:28:47.820294: step: 442/463, loss: 0.10184233635663986 2023-01-22 15:28:48.359656: step: 444/463, loss: 0.1895674169063568 2023-01-22 15:28:48.974106: step: 446/463, loss: 0.6268930435180664 2023-01-22 15:28:49.591070: step: 448/463, loss: 0.17010632157325745 2023-01-22 15:28:50.194004: step: 450/463, loss: 0.22064565122127533 2023-01-22 
15:28:50.827696: step: 452/463, loss: 0.23295965790748596 2023-01-22 15:28:51.466153: step: 454/463, loss: 0.2041836827993393 2023-01-22 15:28:52.042795: step: 456/463, loss: 0.19297023117542267 2023-01-22 15:28:52.599969: step: 458/463, loss: 0.1553608775138855 2023-01-22 15:28:53.223112: step: 460/463, loss: 0.07147838175296783 2023-01-22 15:28:53.809827: step: 462/463, loss: 0.8132540583610535 2023-01-22 15:28:54.477299: step: 464/463, loss: 0.11224795132875443 2023-01-22 15:28:55.104796: step: 466/463, loss: 0.28505939245224 2023-01-22 15:28:55.689004: step: 468/463, loss: 0.9773705005645752 2023-01-22 15:28:56.289467: step: 470/463, loss: 0.16424565017223358 2023-01-22 15:28:56.890056: step: 472/463, loss: 0.5413669347763062 2023-01-22 15:28:57.556481: step: 474/463, loss: 0.09745154529809952 2023-01-22 15:28:58.160666: step: 476/463, loss: 0.22046037018299103 2023-01-22 15:28:58.747299: step: 478/463, loss: 0.25285619497299194 2023-01-22 15:28:59.332546: step: 480/463, loss: 0.11988124996423721 2023-01-22 15:28:59.883200: step: 482/463, loss: 0.15438736975193024 2023-01-22 15:29:00.513799: step: 484/463, loss: 0.24947504699230194 2023-01-22 15:29:01.182493: step: 486/463, loss: 0.15720424056053162 2023-01-22 15:29:01.796743: step: 488/463, loss: 0.17430947721004486 2023-01-22 15:29:02.393756: step: 490/463, loss: 4.068419456481934 2023-01-22 15:29:03.053446: step: 492/463, loss: 0.09201015532016754 2023-01-22 15:29:03.704736: step: 494/463, loss: 0.40999650955200195 2023-01-22 15:29:04.266904: step: 496/463, loss: 0.23335400223731995 2023-01-22 15:29:04.881278: step: 498/463, loss: 0.16296014189720154 2023-01-22 15:29:05.470625: step: 500/463, loss: 0.503380298614502 2023-01-22 15:29:06.075675: step: 502/463, loss: 0.17872093617916107 2023-01-22 15:29:06.651237: step: 504/463, loss: 0.19954152405261993 2023-01-22 15:29:07.237372: step: 506/463, loss: 3.158412456512451 2023-01-22 15:29:07.833280: step: 508/463, loss: 0.26305052638053894 2023-01-22 
15:29:08.437986: step: 510/463, loss: 0.29176321625709534 2023-01-22 15:29:09.015528: step: 512/463, loss: 0.12284141033887863 2023-01-22 15:29:09.630031: step: 514/463, loss: 0.713845431804657 2023-01-22 15:29:10.180896: step: 516/463, loss: 0.29918816685676575 2023-01-22 15:29:10.733368: step: 518/463, loss: 0.54299396276474 2023-01-22 15:29:11.409325: step: 520/463, loss: 0.1454082578420639 2023-01-22 15:29:12.005602: step: 522/463, loss: 0.2962797284126282 2023-01-22 15:29:12.684701: step: 524/463, loss: 0.2879945635795593 2023-01-22 15:29:13.304433: step: 526/463, loss: 0.13068807125091553 2023-01-22 15:29:13.939505: step: 528/463, loss: 0.30125415325164795 2023-01-22 15:29:14.567066: step: 530/463, loss: 0.13231755793094635 2023-01-22 15:29:15.185726: step: 532/463, loss: 0.2992933988571167 2023-01-22 15:29:15.784987: step: 534/463, loss: 0.24928715825080872 2023-01-22 15:29:16.442995: step: 536/463, loss: 0.15060816705226898 2023-01-22 15:29:17.085829: step: 538/463, loss: 0.10340354591608047 2023-01-22 15:29:17.676522: step: 540/463, loss: 0.4494140148162842 2023-01-22 15:29:18.334889: step: 542/463, loss: 0.08604943007230759 2023-01-22 15:29:18.952020: step: 544/463, loss: 0.15499401092529297 2023-01-22 15:29:19.656421: step: 546/463, loss: 0.1679101586341858 2023-01-22 15:29:20.269754: step: 548/463, loss: 0.41227996349334717 2023-01-22 15:29:20.903779: step: 550/463, loss: 0.26008448004722595 2023-01-22 15:29:21.546500: step: 552/463, loss: 0.4526502788066864 2023-01-22 15:29:22.153106: step: 554/463, loss: 2.750079393386841 2023-01-22 15:29:22.808772: step: 556/463, loss: 0.9249699711799622 2023-01-22 15:29:23.424177: step: 558/463, loss: 0.26532861590385437 2023-01-22 15:29:23.975895: step: 560/463, loss: 0.13327407836914062 2023-01-22 15:29:24.556474: step: 562/463, loss: 0.452277272939682 2023-01-22 15:29:25.189997: step: 564/463, loss: 0.14739862084388733 2023-01-22 15:29:25.836412: step: 566/463, loss: 0.11605662107467651 2023-01-22 
15:29:26.454673: step: 568/463, loss: 0.26655125617980957 2023-01-22 15:29:27.072882: step: 570/463, loss: 0.09983063489198685 2023-01-22 15:29:27.665829: step: 572/463, loss: 5.637213230133057 2023-01-22 15:29:28.325244: step: 574/463, loss: 0.05878300964832306 2023-01-22 15:29:28.915116: step: 576/463, loss: 0.39879846572875977 2023-01-22 15:29:29.563311: step: 578/463, loss: 0.15339171886444092 2023-01-22 15:29:30.151010: step: 580/463, loss: 0.40502801537513733 2023-01-22 15:29:30.789770: step: 582/463, loss: 0.25094130635261536 2023-01-22 15:29:31.404999: step: 584/463, loss: 0.21087327599525452 2023-01-22 15:29:32.024145: step: 586/463, loss: 0.06081909313797951 2023-01-22 15:29:32.574331: step: 588/463, loss: 0.15651674568653107 2023-01-22 15:29:33.168634: step: 590/463, loss: 0.5661988854408264 2023-01-22 15:29:33.721304: step: 592/463, loss: 0.07145919650793076 2023-01-22 15:29:34.361609: step: 594/463, loss: 0.13007758557796478 2023-01-22 15:29:34.941017: step: 596/463, loss: 0.12733973562717438 2023-01-22 15:29:35.708225: step: 598/463, loss: 0.17387202382087708 2023-01-22 15:29:36.315691: step: 600/463, loss: 0.2978232800960541 2023-01-22 15:29:36.996893: step: 602/463, loss: 0.18030747771263123 2023-01-22 15:29:37.614301: step: 604/463, loss: 0.6975864171981812 2023-01-22 15:29:38.248099: step: 606/463, loss: 0.8284298777580261 2023-01-22 15:29:38.914698: step: 608/463, loss: 0.06148531287908554 2023-01-22 15:29:39.568925: step: 610/463, loss: 0.13989399373531342 2023-01-22 15:29:40.185354: step: 612/463, loss: 0.11084530502557755 2023-01-22 15:29:40.785900: step: 614/463, loss: 0.3267926871776581 2023-01-22 15:29:41.410670: step: 616/463, loss: 0.5982192158699036 2023-01-22 15:29:42.007512: step: 618/463, loss: 0.1170753613114357 2023-01-22 15:29:42.580875: step: 620/463, loss: 0.22871661186218262 2023-01-22 15:29:43.138706: step: 622/463, loss: 0.573196530342102 2023-01-22 15:29:43.740507: step: 624/463, loss: 0.20662738382816315 2023-01-22 
15:29:44.361209: step: 626/463, loss: 1.3724911212921143 2023-01-22 15:29:44.917512: step: 628/463, loss: 0.29613253474235535 2023-01-22 15:29:45.539441: step: 630/463, loss: 0.34659090638160706 2023-01-22 15:29:46.149606: step: 632/463, loss: 0.08074025809764862 2023-01-22 15:29:46.684883: step: 634/463, loss: 0.1351611614227295 2023-01-22 15:29:47.232733: step: 636/463, loss: 0.045026808977127075 2023-01-22 15:29:47.792833: step: 638/463, loss: 0.13623255491256714 2023-01-22 15:29:48.432629: step: 640/463, loss: 0.3457527160644531 2023-01-22 15:29:49.044880: step: 642/463, loss: 0.5819001793861389 2023-01-22 15:29:49.587821: step: 644/463, loss: 0.1549147516489029 2023-01-22 15:29:50.214924: step: 646/463, loss: 0.3472188115119934 2023-01-22 15:29:50.794197: step: 648/463, loss: 1.06662118434906 2023-01-22 15:29:51.431145: step: 650/463, loss: 0.23092952370643616 2023-01-22 15:29:52.058430: step: 652/463, loss: 1.1923049688339233 2023-01-22 15:29:52.670051: step: 654/463, loss: 0.1946762502193451 2023-01-22 15:29:53.283035: step: 656/463, loss: 0.21864797174930573 2023-01-22 15:29:53.873174: step: 658/463, loss: 0.15166567265987396 2023-01-22 15:29:54.473705: step: 660/463, loss: 0.21830828487873077 2023-01-22 15:29:55.065556: step: 662/463, loss: 0.4088539183139801 2023-01-22 15:29:55.649098: step: 664/463, loss: 0.8744271397590637 2023-01-22 15:29:56.259948: step: 666/463, loss: 0.35041356086730957 2023-01-22 15:29:56.829804: step: 668/463, loss: 0.14496062695980072 2023-01-22 15:29:57.443586: step: 670/463, loss: 0.24225276708602905 2023-01-22 15:29:58.034644: step: 672/463, loss: 0.10711809992790222 2023-01-22 15:29:58.649879: step: 674/463, loss: 0.12685883045196533 2023-01-22 15:29:59.223820: step: 676/463, loss: 0.6289884448051453 2023-01-22 15:29:59.858883: step: 678/463, loss: 0.18914948403835297 2023-01-22 15:30:00.507808: step: 680/463, loss: 0.6180313229560852 2023-01-22 15:30:01.068675: step: 682/463, loss: 0.08595053851604462 2023-01-22 
15:30:01.712818: step: 684/463, loss: 1.0640815496444702 2023-01-22 15:30:02.379659: step: 686/463, loss: 0.1322038471698761 2023-01-22 15:30:03.019256: step: 688/463, loss: 0.7682482600212097 2023-01-22 15:30:03.668521: step: 690/463, loss: 0.5089671611785889 2023-01-22 15:30:04.302158: step: 692/463, loss: 0.2670285999774933 2023-01-22 15:30:04.908707: step: 694/463, loss: 0.12237494438886642 2023-01-22 15:30:05.525285: step: 696/463, loss: 0.8058057427406311 2023-01-22 15:30:06.259778: step: 698/463, loss: 0.19590164721012115 2023-01-22 15:30:06.959029: step: 700/463, loss: 0.13318048417568207 2023-01-22 15:30:07.611309: step: 702/463, loss: 0.11755412071943283 2023-01-22 15:30:08.267867: step: 704/463, loss: 0.18338964879512787 2023-01-22 15:30:08.874750: step: 706/463, loss: 0.07057612389326096 2023-01-22 15:30:09.550545: step: 708/463, loss: 0.08611511439085007 2023-01-22 15:30:10.203060: step: 710/463, loss: 0.2940301299095154 2023-01-22 15:30:10.826710: step: 712/463, loss: 0.09783314913511276 2023-01-22 15:30:11.385365: step: 714/463, loss: 0.03477096185088158 2023-01-22 15:30:12.022763: step: 716/463, loss: 0.13544633984565735 2023-01-22 15:30:12.640241: step: 718/463, loss: 0.08810052275657654 2023-01-22 15:30:13.233038: step: 720/463, loss: 0.7230587601661682 2023-01-22 15:30:13.821874: step: 722/463, loss: 0.5926504135131836 2023-01-22 15:30:14.444194: step: 724/463, loss: 0.2676931619644165 2023-01-22 15:30:15.080769: step: 726/463, loss: 0.7751638889312744 2023-01-22 15:30:15.703722: step: 728/463, loss: 0.612553596496582 2023-01-22 15:30:16.347563: step: 730/463, loss: 0.27694275975227356 2023-01-22 15:30:16.991449: step: 732/463, loss: 0.220717653632164 2023-01-22 15:30:17.619949: step: 734/463, loss: 0.6148403286933899 2023-01-22 15:30:18.188346: step: 736/463, loss: 0.13562646508216858 2023-01-22 15:30:18.768924: step: 738/463, loss: 0.2535829246044159 2023-01-22 15:30:19.421511: step: 740/463, loss: 0.3031376600265503 2023-01-22 15:30:20.012050: 
step: 742/463, loss: 0.22858306765556335 2023-01-22 15:30:20.665026: step: 744/463, loss: 0.9351018667221069 2023-01-22 15:30:21.284958: step: 746/463, loss: 0.17246918380260468 2023-01-22 15:30:21.958764: step: 748/463, loss: 0.5891396999359131 2023-01-22 15:30:22.535709: step: 750/463, loss: 0.17908084392547607 2023-01-22 15:30:23.201482: step: 752/463, loss: 0.22861412167549133 2023-01-22 15:30:23.748224: step: 754/463, loss: 0.28577160835266113 2023-01-22 15:30:24.383461: step: 756/463, loss: 0.09434788674116135 2023-01-22 15:30:25.012888: step: 758/463, loss: 0.17552722990512848 2023-01-22 15:30:25.629118: step: 760/463, loss: 0.1670864224433899 2023-01-22 15:30:26.230560: step: 762/463, loss: 0.1414014846086502 2023-01-22 15:30:26.828289: step: 764/463, loss: 0.220012366771698 2023-01-22 15:30:27.418760: step: 766/463, loss: 0.34742963314056396 2023-01-22 15:30:27.997485: step: 768/463, loss: 0.15765875577926636 2023-01-22 15:30:28.628486: step: 770/463, loss: 0.055323172360658646 2023-01-22 15:30:29.280551: step: 772/463, loss: 0.24944284558296204 2023-01-22 15:30:29.982315: step: 774/463, loss: 0.0986892431974411 2023-01-22 15:30:30.673342: step: 776/463, loss: 0.12194304913282394 2023-01-22 15:30:31.294227: step: 778/463, loss: 0.6660851240158081 2023-01-22 15:30:31.861521: step: 780/463, loss: 0.13439349830150604 2023-01-22 15:30:32.450415: step: 782/463, loss: 0.1164356991648674 2023-01-22 15:30:33.079590: step: 784/463, loss: 0.7993448972702026 2023-01-22 15:30:33.655082: step: 786/463, loss: 0.23112338781356812 2023-01-22 15:30:34.387667: step: 788/463, loss: 0.1860707402229309 2023-01-22 15:30:34.981627: step: 790/463, loss: 0.5992123484611511 2023-01-22 15:30:35.620099: step: 792/463, loss: 0.1465783268213272 2023-01-22 15:30:36.156869: step: 794/463, loss: 0.180286705493927 2023-01-22 15:30:36.722580: step: 796/463, loss: 0.14994490146636963 2023-01-22 15:30:37.426165: step: 798/463, loss: 0.2237142026424408 2023-01-22 15:30:38.012817: step: 
800/463, loss: 0.5729338526725769 2023-01-22 15:30:38.665159: step: 802/463, loss: 0.27626535296440125 2023-01-22 15:30:39.353065: step: 804/463, loss: 0.5199800729751587 2023-01-22 15:30:39.955144: step: 806/463, loss: 1.8115214109420776 2023-01-22 15:30:40.557161: step: 808/463, loss: 0.36487722396850586 2023-01-22 15:30:41.148892: step: 810/463, loss: 0.12763142585754395 2023-01-22 15:30:41.793014: step: 812/463, loss: 0.21795439720153809 2023-01-22 15:30:42.436782: step: 814/463, loss: 0.24033254384994507 2023-01-22 15:30:43.044408: step: 816/463, loss: 0.24460312724113464 2023-01-22 15:30:43.701000: step: 818/463, loss: 0.21404379606246948 2023-01-22 15:30:44.307784: step: 820/463, loss: 0.17983190715312958 2023-01-22 15:30:44.904243: step: 822/463, loss: 0.1139487475156784 2023-01-22 15:30:45.520129: step: 824/463, loss: 0.3910600543022156 2023-01-22 15:30:46.112410: step: 826/463, loss: 0.16988827288150787 2023-01-22 15:30:46.735808: step: 828/463, loss: 0.11899615079164505 2023-01-22 15:30:47.275666: step: 830/463, loss: 0.09909864515066147 2023-01-22 15:30:47.902286: step: 832/463, loss: 0.3792467713356018 2023-01-22 15:30:48.467642: step: 834/463, loss: 0.08687497675418854 2023-01-22 15:30:49.100589: step: 836/463, loss: 0.11133027821779251 2023-01-22 15:30:49.706516: step: 838/463, loss: 0.7069783806800842 2023-01-22 15:30:50.217259: step: 840/463, loss: 0.15885843336582184 2023-01-22 15:30:50.829327: step: 842/463, loss: 0.1356661468744278 2023-01-22 15:30:51.536550: step: 844/463, loss: 1.7529646158218384 2023-01-22 15:30:52.230342: step: 846/463, loss: 0.15293292701244354 2023-01-22 15:30:52.825132: step: 848/463, loss: 0.6598432660102844 2023-01-22 15:30:53.379863: step: 850/463, loss: 1.0685268640518188 2023-01-22 15:30:54.006796: step: 852/463, loss: 0.1158989816904068 2023-01-22 15:30:54.612099: step: 854/463, loss: 0.11234312504529953 2023-01-22 15:30:55.248602: step: 856/463, loss: 0.9744786620140076 2023-01-22 15:30:55.804411: step: 858/463, 
loss: 0.29152390360832214 2023-01-22 15:30:56.332157: step: 860/463, loss: 0.2626463770866394 2023-01-22 15:30:56.969765: step: 862/463, loss: 0.1867259293794632 2023-01-22 15:30:57.555898: step: 864/463, loss: 0.21971353888511658 2023-01-22 15:30:58.162659: step: 866/463, loss: 0.1341341733932495 2023-01-22 15:30:58.761241: step: 868/463, loss: 0.12559007108211517 2023-01-22 15:30:59.367538: step: 870/463, loss: 0.23563507199287415 2023-01-22 15:30:59.996033: step: 872/463, loss: 0.30911317467689514 2023-01-22 15:31:00.624813: step: 874/463, loss: 0.10908465087413788 2023-01-22 15:31:01.261105: step: 876/463, loss: 0.28492945432662964 2023-01-22 15:31:01.893146: step: 878/463, loss: 0.35925647616386414 2023-01-22 15:31:02.526344: step: 880/463, loss: 0.20803391933441162 2023-01-22 15:31:03.188646: step: 882/463, loss: 0.29923489689826965 2023-01-22 15:31:03.768503: step: 884/463, loss: 0.35310113430023193 2023-01-22 15:31:04.456988: step: 886/463, loss: 0.513892650604248 2023-01-22 15:31:05.100787: step: 888/463, loss: 0.10313627123832703 2023-01-22 15:31:05.749486: step: 890/463, loss: 0.20523688197135925 2023-01-22 15:31:06.387386: step: 892/463, loss: 0.09785130620002747 2023-01-22 15:31:06.985253: step: 894/463, loss: 0.16046510636806488 2023-01-22 15:31:07.609123: step: 896/463, loss: 0.3775019347667694 2023-01-22 15:31:08.181562: step: 898/463, loss: 0.08324379473924637 2023-01-22 15:31:08.782948: step: 900/463, loss: 0.4934373199939728 2023-01-22 15:31:09.341448: step: 902/463, loss: 0.27230575680732727 2023-01-22 15:31:09.993913: step: 904/463, loss: 0.1329985409975052 2023-01-22 15:31:10.585073: step: 906/463, loss: 0.17334221303462982 2023-01-22 15:31:11.194009: step: 908/463, loss: 0.16498501598834991 2023-01-22 15:31:11.843529: step: 910/463, loss: 0.6820108890533447 2023-01-22 15:31:12.469446: step: 912/463, loss: 0.270793616771698 2023-01-22 15:31:13.052666: step: 914/463, loss: 0.08387669920921326 2023-01-22 15:31:13.662051: step: 916/463, loss: 
0.4046429693698883 2023-01-22 15:31:14.254700: step: 918/463, loss: 0.2457585632801056 2023-01-22 15:31:14.864620: step: 920/463, loss: 0.6180830597877502 2023-01-22 15:31:15.466895: step: 922/463, loss: 0.17410029470920563 2023-01-22 15:31:16.136798: step: 924/463, loss: 0.12644445896148682 2023-01-22 15:31:16.848503: step: 926/463, loss: 0.23891422152519226
==================================================
Loss: 0.370
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2782685185185185, 'r': 0.35641603415559775, 'f1': 0.3125311980033278}, 'combined': 0.23028614589718888, 'epoch': 8}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.32812936617648, 'r': 0.31409940200314007, 'f1': 0.3209611365988395}, 'combined': 0.22580180464239968, 'epoch': 8}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28882978723404257, 'r': 0.3606261859582543, 'f1': 0.3207594936708861}, 'combined': 0.23634910059960026, 'epoch': 8}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.32266427723983016, 'r': 0.30972394877162424, 'f1': 0.3160617164066308}, 'combined': 0.22440381864870784, 'epoch': 8}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2960370879120879, 'r': 0.3651311330984006, 'f1': 0.3269738439130962}, 'combined': 0.24092809551491298, 'epoch': 8}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3432383458787202, 'r': 0.300071252597903, 'f1': 0.3202065090629999}, 'combined': 0.22734662143472992, 'epoch': 8}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22005208333333331, 'r': 0.40238095238095234, 'f1': 0.2845117845117845}, 'combined': 0.1896745230078563, 'epoch': 8}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.22023809523809523, 'r': 0.40217391304347827, 'f1': 0.2846153846153846}, 'combined': 0.1423076923076923, 'epoch': 8}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2708333333333333, 'r': 0.22413793103448276, 'f1': 0.24528301886792453}, 'combined': 0.16352201257861634, 'epoch': 8}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2893626329100397, 'r': 0.33877940133869927, 'f1': 0.3121271757089065}, 'combined': 0.22998844525919426, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3200163294971871, 'r': 0.2985143597142173, 'f1': 0.30889160833633683}, 'combined': 0.21931304191879913, 'epoch': 6}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3977272727272727, 'r': 0.3017241379310345, 'f1': 0.34313725490196073}, 'combined': 0.2287581699346405, 'epoch': 6}
******************************
Epoch: 9
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:33:50.430086: step: 2/463, loss: 0.17153166234493256 2023-01-22 15:33:51.086321: step: 4/463, loss: 0.2481025904417038 2023-01-22 15:33:51.674673: step: 6/463, loss: 0.11309840530157089 2023-01-22 15:33:52.215385: step: 8/463, loss: 0.09898319095373154 2023-01-22 15:33:52.922450: step: 10/463, loss: 0.1770477294921875 2023-01-22 15:33:53.499045: step: 12/463, loss: 0.1319502741098404 2023-01-22 15:33:54.085067: step: 14/463, loss: 0.1795295774936676 2023-01-22 15:33:54.755354: step: 16/463, loss: 0.09913533180952072 2023-01-22 15:33:55.368506: step: 18/463, loss: 0.2259988933801651 2023-01-22 15:33:55.996205: step: 20/463, loss: 0.12546414136886597 2023-01-22 15:33:56.588489: step: 22/463, loss: 0.1861373782157898 2023-01-22 15:33:57.279159: step: 24/463, loss: 0.2236696183681488 2023-01-22 15:33:57.939546: step: 26/463, loss: 0.18975159525871277 2023-01-22 15:33:58.634185: step: 28/463, loss: 0.08690284937620163 2023-01-22 15:33:59.203346: step: 30/463, loss: 0.18611980974674225 2023-01-22 15:33:59.826079: step: 32/463, loss: 0.09621629118919373 2023-01-22 15:34:00.438626: step: 34/463, loss: 0.07239944487810135 2023-01-22 15:34:01.062306: step: 36/463, loss: 0.1621030569076538 2023-01-22 15:34:01.626770:
step: 38/463, loss: 0.3918221592903137 2023-01-22 15:34:02.266313: step: 40/463, loss: 0.09395307302474976 2023-01-22 15:34:02.924421: step: 42/463, loss: 0.19249242544174194 2023-01-22 15:34:03.507298: step: 44/463, loss: 0.4634231925010681 2023-01-22 15:34:04.074959: step: 46/463, loss: 0.0645282119512558 2023-01-22 15:34:04.738942: step: 48/463, loss: 0.2845926880836487 2023-01-22 15:34:05.345253: step: 50/463, loss: 5.190971374511719 2023-01-22 15:34:05.899923: step: 52/463, loss: 0.1356731355190277 2023-01-22 15:34:06.520303: step: 54/463, loss: 0.13491907715797424 2023-01-22 15:34:07.131626: step: 56/463, loss: 0.04895960912108421 2023-01-22 15:34:07.765273: step: 58/463, loss: 0.09974096715450287 2023-01-22 15:34:08.323554: step: 60/463, loss: 0.42797917127609253 2023-01-22 15:34:08.998121: step: 62/463, loss: 0.05568438768386841 2023-01-22 15:34:09.621355: step: 64/463, loss: 0.17679621279239655 2023-01-22 15:34:10.210341: step: 66/463, loss: 0.29304009675979614 2023-01-22 15:34:10.870565: step: 68/463, loss: 0.09657712280750275 2023-01-22 15:34:11.493843: step: 70/463, loss: 0.09293390810489655 2023-01-22 15:34:12.090254: step: 72/463, loss: 0.13799786567687988 2023-01-22 15:34:12.685855: step: 74/463, loss: 0.2835022211074829 2023-01-22 15:34:13.271790: step: 76/463, loss: 0.16709072887897491 2023-01-22 15:34:13.820181: step: 78/463, loss: 0.07623666524887085 2023-01-22 15:34:14.351246: step: 80/463, loss: 0.49481678009033203 2023-01-22 15:34:14.886805: step: 82/463, loss: 0.06679725646972656 2023-01-22 15:34:15.462080: step: 84/463, loss: 0.1999761462211609 2023-01-22 15:34:16.079523: step: 86/463, loss: 0.32262876629829407 2023-01-22 15:34:16.664330: step: 88/463, loss: 0.2990921437740326 2023-01-22 15:34:17.287671: step: 90/463, loss: 0.11836453527212143 2023-01-22 15:34:17.853775: step: 92/463, loss: 0.44921115040779114 2023-01-22 15:34:18.470623: step: 94/463, loss: 1.538417935371399 2023-01-22 15:34:19.083985: step: 96/463, loss: 0.5896992087364197 
2023-01-22 15:34:19.705783: step: 98/463, loss: 0.21138238906860352 2023-01-22 15:34:20.318155: step: 100/463, loss: 0.1382117122411728 2023-01-22 15:34:20.963563: step: 102/463, loss: 0.1451740711927414 2023-01-22 15:34:21.679184: step: 104/463, loss: 0.1350160390138626 2023-01-22 15:34:22.232714: step: 106/463, loss: 0.12082543969154358 2023-01-22 15:34:22.795576: step: 108/463, loss: 0.039022523909807205 2023-01-22 15:34:23.384893: step: 110/463, loss: 0.22964738309383392 2023-01-22 15:34:23.952371: step: 112/463, loss: 0.09196848422288895 2023-01-22 15:34:24.537228: step: 114/463, loss: 0.06061667948961258 2023-01-22 15:34:25.213071: step: 116/463, loss: 0.1652471125125885 2023-01-22 15:34:25.808385: step: 118/463, loss: 0.1497257500886917 2023-01-22 15:34:26.415308: step: 120/463, loss: 0.39808031916618347 2023-01-22 15:34:27.134368: step: 122/463, loss: 0.23042140901088715 2023-01-22 15:34:27.812031: step: 124/463, loss: 0.5945118069648743 2023-01-22 15:34:28.380791: step: 126/463, loss: 0.17294389009475708 2023-01-22 15:34:29.016038: step: 128/463, loss: 0.10453740507364273 2023-01-22 15:34:29.660593: step: 130/463, loss: 0.07634685933589935 2023-01-22 15:34:30.306626: step: 132/463, loss: 0.2726737856864929 2023-01-22 15:34:30.954589: step: 134/463, loss: 0.16973647475242615 2023-01-22 15:34:31.591633: step: 136/463, loss: 0.16965653002262115 2023-01-22 15:34:32.120785: step: 138/463, loss: 0.17900274693965912 2023-01-22 15:34:32.728747: step: 140/463, loss: 0.666972279548645 2023-01-22 15:34:33.313879: step: 142/463, loss: 0.16657280921936035 2023-01-22 15:34:33.882659: step: 144/463, loss: 0.3656866252422333 2023-01-22 15:34:34.494978: step: 146/463, loss: 0.13201504945755005 2023-01-22 15:34:35.133693: step: 148/463, loss: 0.2741958200931549 2023-01-22 15:34:35.666680: step: 150/463, loss: 0.303128182888031 2023-01-22 15:34:36.193826: step: 152/463, loss: 0.23353055119514465 2023-01-22 15:34:36.909684: step: 154/463, loss: 0.32571038603782654 2023-01-22 
15:34:37.497587: step: 156/463, loss: 0.29289788007736206 2023-01-22 15:34:38.097453: step: 158/463, loss: 0.6369815468788147 2023-01-22 15:34:38.742139: step: 160/463, loss: 0.1397433876991272 2023-01-22 15:34:39.385211: step: 162/463, loss: 0.15871936082839966 2023-01-22 15:34:40.030271: step: 164/463, loss: 0.09473586827516556 2023-01-22 15:34:40.611983: step: 166/463, loss: 0.1598958522081375 2023-01-22 15:34:41.226809: step: 168/463, loss: 0.047141142189502716 2023-01-22 15:34:41.843631: step: 170/463, loss: 0.23950275778770447 2023-01-22 15:34:42.476114: step: 172/463, loss: 0.5382143259048462 2023-01-22 15:34:43.077365: step: 174/463, loss: 0.16742826998233795 2023-01-22 15:34:43.662087: step: 176/463, loss: 0.36320847272872925 2023-01-22 15:34:44.244758: step: 178/463, loss: 0.15095973014831543 2023-01-22 15:34:44.955001: step: 180/463, loss: 0.12796016037464142 2023-01-22 15:34:45.517567: step: 182/463, loss: 0.30316656827926636 2023-01-22 15:34:46.168312: step: 184/463, loss: 0.06674323976039886 2023-01-22 15:34:46.727487: step: 186/463, loss: 0.10858427733182907 2023-01-22 15:34:47.312273: step: 188/463, loss: 1.4883980751037598 2023-01-22 15:34:47.866071: step: 190/463, loss: 0.2310623824596405 2023-01-22 15:34:48.584242: step: 192/463, loss: 0.07873179018497467 2023-01-22 15:34:49.163734: step: 194/463, loss: 0.3672964572906494 2023-01-22 15:34:49.849315: step: 196/463, loss: 0.21914350986480713 2023-01-22 15:34:50.447072: step: 198/463, loss: 0.3682887852191925 2023-01-22 15:34:51.105153: step: 200/463, loss: 0.23460537195205688 2023-01-22 15:34:51.695015: step: 202/463, loss: 0.11394502967596054 2023-01-22 15:34:52.254079: step: 204/463, loss: 0.07090889662504196 2023-01-22 15:34:52.877111: step: 206/463, loss: 0.28772053122520447 2023-01-22 15:34:53.491956: step: 208/463, loss: 0.11017069220542908 2023-01-22 15:34:54.139041: step: 210/463, loss: 0.11104051023721695 2023-01-22 15:34:54.858803: step: 212/463, loss: 0.12580320239067078 2023-01-22 
15:34:55.521263: step: 214/463, loss: 0.12638290226459503 2023-01-22 15:34:56.111851: step: 216/463, loss: 0.3387400507926941 2023-01-22 15:34:56.837126: step: 218/463, loss: 0.1407310664653778 2023-01-22 15:34:57.434499: step: 220/463, loss: 0.15156978368759155 2023-01-22 15:34:58.114198: step: 222/463, loss: 0.15753497183322906 2023-01-22 15:34:58.718967: step: 224/463, loss: 0.1478101909160614 2023-01-22 15:34:59.354990: step: 226/463, loss: 0.23675429821014404 2023-01-22 15:35:00.027055: step: 228/463, loss: 0.2196059376001358 2023-01-22 15:35:00.671352: step: 230/463, loss: 0.4312027096748352 2023-01-22 15:35:01.296288: step: 232/463, loss: 0.23598097264766693 2023-01-22 15:35:01.827444: step: 234/463, loss: 0.47105395793914795 2023-01-22 15:35:02.471886: step: 236/463, loss: 0.2935202717781067 2023-01-22 15:35:03.079398: step: 238/463, loss: 0.05753381550312042 2023-01-22 15:35:03.654166: step: 240/463, loss: 0.12001167237758636 2023-01-22 15:35:04.245628: step: 242/463, loss: 0.05545753985643387 2023-01-22 15:35:04.892817: step: 244/463, loss: 0.3201543688774109 2023-01-22 15:35:05.477821: step: 246/463, loss: 0.49448326230049133 2023-01-22 15:35:06.167121: step: 248/463, loss: 0.07907971739768982 2023-01-22 15:35:06.807543: step: 250/463, loss: 1.4817062616348267 2023-01-22 15:35:07.447719: step: 252/463, loss: 0.040236979722976685 2023-01-22 15:35:08.031907: step: 254/463, loss: 0.3727327883243561 2023-01-22 15:35:08.640118: step: 256/463, loss: 0.1063518151640892 2023-01-22 15:35:09.255760: step: 258/463, loss: 0.20889277756214142 2023-01-22 15:35:09.868514: step: 260/463, loss: 0.18649034202098846 2023-01-22 15:35:10.511635: step: 262/463, loss: 0.33195990324020386 2023-01-22 15:35:11.106857: step: 264/463, loss: 0.05117291212081909 2023-01-22 15:35:11.708640: step: 266/463, loss: 0.23681020736694336 2023-01-22 15:35:12.297763: step: 268/463, loss: 0.03028898686170578 2023-01-22 15:35:12.921936: step: 270/463, loss: 0.07559726387262344 2023-01-22 
15:35:13.489791: step: 272/463, loss: 0.12291713804006577 2023-01-22 15:35:14.050562: step: 274/463, loss: 0.38352566957473755 2023-01-22 15:35:14.695520: step: 276/463, loss: 0.5669448375701904 2023-01-22 15:35:15.322668: step: 278/463, loss: 0.12671183049678802 2023-01-22 15:35:15.940188: step: 280/463, loss: 0.07778966426849365 2023-01-22 15:35:16.535775: step: 282/463, loss: 0.08837009221315384 2023-01-22 15:35:17.177774: step: 284/463, loss: 0.19480456411838531 2023-01-22 15:35:17.836356: step: 286/463, loss: 0.1842186450958252 2023-01-22 15:35:18.532838: step: 288/463, loss: 0.2354021668434143 2023-01-22 15:35:19.232550: step: 290/463, loss: 0.07553748786449432 2023-01-22 15:35:19.844249: step: 292/463, loss: 0.3076557517051697 2023-01-22 15:35:20.399571: step: 294/463, loss: 0.18498709797859192 2023-01-22 15:35:21.053704: step: 296/463, loss: 0.29448097944259644 2023-01-22 15:35:21.610047: step: 298/463, loss: 0.2696293592453003 2023-01-22 15:35:22.213299: step: 300/463, loss: 0.10077899694442749 2023-01-22 15:35:22.866436: step: 302/463, loss: 0.06689047068357468 2023-01-22 15:35:23.521883: step: 304/463, loss: 0.568021297454834 2023-01-22 15:35:24.113595: step: 306/463, loss: 0.43649959564208984 2023-01-22 15:35:24.725053: step: 308/463, loss: 0.1779690980911255 2023-01-22 15:35:25.366543: step: 310/463, loss: 0.156068354845047 2023-01-22 15:35:25.962772: step: 312/463, loss: 0.08728298544883728 2023-01-22 15:35:26.554668: step: 314/463, loss: 0.22395265102386475 2023-01-22 15:35:27.174715: step: 316/463, loss: 0.1379486322402954 2023-01-22 15:35:27.712993: step: 318/463, loss: 0.15776890516281128 2023-01-22 15:35:28.282088: step: 320/463, loss: 0.09931416809558868 2023-01-22 15:35:28.925843: step: 322/463, loss: 0.10670053958892822 2023-01-22 15:35:29.596379: step: 324/463, loss: 0.10092684626579285 2023-01-22 15:35:30.245048: step: 326/463, loss: 0.12958112359046936 2023-01-22 15:35:30.959127: step: 328/463, loss: 0.10120599716901779 2023-01-22 
15:35:31.541059: step: 330/463, loss: 0.20679239928722382 2023-01-22 15:35:32.121542: step: 332/463, loss: 0.124295175075531 2023-01-22 15:35:32.731254: step: 334/463, loss: 1.0542069673538208 2023-01-22 15:35:33.388509: step: 336/463, loss: 0.09998573362827301 2023-01-22 15:35:33.991811: step: 338/463, loss: 0.0704512745141983 2023-01-22 15:35:34.629087: step: 340/463, loss: 0.1309303492307663 2023-01-22 15:35:35.182916: step: 342/463, loss: 0.16639363765716553 2023-01-22 15:35:35.778690: step: 344/463, loss: 0.09570711106061935 2023-01-22 15:35:36.431097: step: 346/463, loss: 0.6679884791374207 2023-01-22 15:35:37.019230: step: 348/463, loss: 0.17838124930858612 2023-01-22 15:35:37.673094: step: 350/463, loss: 0.20480100810527802 2023-01-22 15:35:38.359740: step: 352/463, loss: 0.1766744703054428 2023-01-22 15:35:38.940459: step: 354/463, loss: 0.10462980717420578 2023-01-22 15:35:39.536190: step: 356/463, loss: 0.11210843175649643 2023-01-22 15:35:40.153298: step: 358/463, loss: 0.09804601967334747 2023-01-22 15:35:40.775609: step: 360/463, loss: 2.879030227661133 2023-01-22 15:35:41.318436: step: 362/463, loss: 0.15619498491287231 2023-01-22 15:35:41.972004: step: 364/463, loss: 0.166386216878891 2023-01-22 15:35:42.619920: step: 366/463, loss: 1.2726777791976929 2023-01-22 15:35:43.185962: step: 368/463, loss: 0.72119140625 2023-01-22 15:35:43.791871: step: 370/463, loss: 0.14832474291324615 2023-01-22 15:35:44.364039: step: 372/463, loss: 0.2228291630744934 2023-01-22 15:35:45.051685: step: 374/463, loss: 0.24049876630306244 2023-01-22 15:35:45.644183: step: 376/463, loss: 0.05110529065132141 2023-01-22 15:35:46.255920: step: 378/463, loss: 0.07428029924631119 2023-01-22 15:35:46.863620: step: 380/463, loss: 0.1189124584197998 2023-01-22 15:35:47.490390: step: 382/463, loss: 0.07407426834106445 2023-01-22 15:35:48.070904: step: 384/463, loss: 1.1588594913482666 2023-01-22 15:35:48.663020: step: 386/463, loss: 0.059389546513557434 2023-01-22 15:35:49.269423: 
step: 388/463, loss: 0.16576623916625977 2023-01-22 15:35:49.890709: step: 390/463, loss: 0.09820029139518738 2023-01-22 15:35:50.612991: step: 392/463, loss: 0.17012064158916473 2023-01-22 15:35:51.325070: step: 394/463, loss: 0.17948183417320251 2023-01-22 15:35:51.959218: step: 396/463, loss: 0.0380796454846859 2023-01-22 15:35:52.620843: step: 398/463, loss: 0.08647415041923523 2023-01-22 15:35:53.246920: step: 400/463, loss: 0.784430742263794 2023-01-22 15:35:53.820726: step: 402/463, loss: 0.5520376563072205 2023-01-22 15:35:54.358689: step: 404/463, loss: 0.07186788320541382 2023-01-22 15:35:54.966598: step: 406/463, loss: 0.0732453316450119 2023-01-22 15:35:55.538667: step: 408/463, loss: 0.34668147563934326 2023-01-22 15:35:56.208959: step: 410/463, loss: 0.11433564126491547 2023-01-22 15:35:56.798655: step: 412/463, loss: 0.027529867365956306 2023-01-22 15:35:57.332346: step: 414/463, loss: 0.21183113753795624 2023-01-22 15:35:57.929757: step: 416/463, loss: 0.1245654970407486 2023-01-22 15:35:58.529387: step: 418/463, loss: 0.07231695204973221 2023-01-22 15:35:59.155715: step: 420/463, loss: 1.348099946975708 2023-01-22 15:35:59.775610: step: 422/463, loss: 0.10820773243904114 2023-01-22 15:36:00.387893: step: 424/463, loss: 0.18913574516773224 2023-01-22 15:36:01.042972: step: 426/463, loss: 0.2273891270160675 2023-01-22 15:36:01.654545: step: 428/463, loss: 0.13242848217487335 2023-01-22 15:36:02.496382: step: 430/463, loss: 0.23015965521335602 2023-01-22 15:36:03.024346: step: 432/463, loss: 0.6252323389053345 2023-01-22 15:36:03.688496: step: 434/463, loss: 1.283473014831543 2023-01-22 15:36:04.343847: step: 436/463, loss: 0.1841343194246292 2023-01-22 15:36:04.882461: step: 438/463, loss: 0.050452981144189835 2023-01-22 15:36:05.581682: step: 440/463, loss: 0.14711600542068481 2023-01-22 15:36:06.201301: step: 442/463, loss: 0.18716341257095337 2023-01-22 15:36:06.845810: step: 444/463, loss: 0.22310778498649597 2023-01-22 15:36:07.478225: step: 
446/463, loss: 0.10852532833814621 2023-01-22 15:36:08.086561: step: 448/463, loss: 0.11836019158363342 2023-01-22 15:36:08.674182: step: 450/463, loss: 0.15737132728099823 2023-01-22 15:36:09.233438: step: 452/463, loss: 0.1498911827802658 2023-01-22 15:36:09.806358: step: 454/463, loss: 0.11912377923727036 2023-01-22 15:36:10.424485: step: 456/463, loss: 0.2811686396598816 2023-01-22 15:36:11.110587: step: 458/463, loss: 0.5437459349632263 2023-01-22 15:36:11.695672: step: 460/463, loss: 0.13738712668418884 2023-01-22 15:36:12.274741: step: 462/463, loss: 0.21534515917301178 2023-01-22 15:36:12.841318: step: 464/463, loss: 0.6040116548538208 2023-01-22 15:36:13.442712: step: 466/463, loss: 1.640557050704956 2023-01-22 15:36:13.998388: step: 468/463, loss: 0.1548285186290741 2023-01-22 15:36:14.607016: step: 470/463, loss: 0.2669467628002167 2023-01-22 15:36:15.213960: step: 472/463, loss: 0.6713567972183228 2023-01-22 15:36:15.821332: step: 474/463, loss: 0.21316078305244446 2023-01-22 15:36:16.465902: step: 476/463, loss: 0.13406170904636383 2023-01-22 15:36:17.079238: step: 478/463, loss: 0.11279979348182678 2023-01-22 15:36:17.681600: step: 480/463, loss: 0.2236279994249344 2023-01-22 15:36:18.312917: step: 482/463, loss: 0.5516587495803833 2023-01-22 15:36:18.952017: step: 484/463, loss: 0.10894601792097092 2023-01-22 15:36:19.585745: step: 486/463, loss: 0.6853892803192139 2023-01-22 15:36:20.126937: step: 488/463, loss: 0.14586472511291504 2023-01-22 15:36:20.744879: step: 490/463, loss: 0.3797264099121094 2023-01-22 15:36:21.387634: step: 492/463, loss: 0.12222345173358917 2023-01-22 15:36:21.914767: step: 494/463, loss: 0.680202305316925 2023-01-22 15:36:22.451254: step: 496/463, loss: 0.08561058342456818 2023-01-22 15:36:23.038069: step: 498/463, loss: 0.0621226541697979 2023-01-22 15:36:23.627112: step: 500/463, loss: 0.3232349157333374 2023-01-22 15:36:24.286382: step: 502/463, loss: 0.1951119601726532 2023-01-22 15:36:24.864792: step: 504/463, loss: 
0.09055054932832718 2023-01-22 15:36:25.472578: step: 506/463, loss: 0.18275560438632965 2023-01-22 15:36:26.077959: step: 508/463, loss: 3.200840473175049 2023-01-22 15:36:26.686109: step: 510/463, loss: 0.18334196507930756 2023-01-22 15:36:27.278435: step: 512/463, loss: 0.08274304866790771 2023-01-22 15:36:27.845138: step: 514/463, loss: 0.24850161373615265 2023-01-22 15:36:28.399550: step: 516/463, loss: 0.12743520736694336 2023-01-22 15:36:28.973469: step: 518/463, loss: 0.4542965590953827 2023-01-22 15:36:29.571913: step: 520/463, loss: 0.11686130613088608 2023-01-22 15:36:30.240564: step: 522/463, loss: 0.2476978451013565 2023-01-22 15:36:30.879045: step: 524/463, loss: 0.07032271474599838 2023-01-22 15:36:31.456891: step: 526/463, loss: 0.2589728534221649 2023-01-22 15:36:32.043423: step: 528/463, loss: 0.22674793004989624 2023-01-22 15:36:32.681836: step: 530/463, loss: 0.18108032643795013 2023-01-22 15:36:33.315864: step: 532/463, loss: 0.4408963918685913 2023-01-22 15:36:33.868603: step: 534/463, loss: 0.07760021090507507 2023-01-22 15:36:34.467882: step: 536/463, loss: 0.06765962392091751 2023-01-22 15:36:35.018165: step: 538/463, loss: 0.3221050500869751 2023-01-22 15:36:35.590909: step: 540/463, loss: 0.10379406809806824 2023-01-22 15:36:36.224957: step: 542/463, loss: 0.060803405940532684 2023-01-22 15:36:36.830552: step: 544/463, loss: 0.3671858608722687 2023-01-22 15:36:37.418002: step: 546/463, loss: 0.3804903030395508 2023-01-22 15:36:38.083370: step: 548/463, loss: 0.15878644585609436 2023-01-22 15:36:38.750545: step: 550/463, loss: 0.12209507822990417 2023-01-22 15:36:39.406408: step: 552/463, loss: 0.17469345033168793 2023-01-22 15:36:40.030372: step: 554/463, loss: 0.5733564496040344 2023-01-22 15:36:40.645161: step: 556/463, loss: 0.2539329528808594 2023-01-22 15:36:41.259637: step: 558/463, loss: 0.15032707154750824 2023-01-22 15:36:41.906901: step: 560/463, loss: 0.17615991830825806 2023-01-22 15:36:42.517622: step: 562/463, loss: 
0.03493887186050415 2023-01-22 15:36:43.183621: step: 564/463, loss: 0.05600854009389877 2023-01-22 15:36:43.754730: step: 566/463, loss: 0.2887996435165405 2023-01-22 15:36:44.314179: step: 568/463, loss: 0.12876653671264648 2023-01-22 15:36:44.882847: step: 570/463, loss: 0.04325012490153313 2023-01-22 15:36:45.506609: step: 572/463, loss: 0.07131408900022507 2023-01-22 15:36:46.166541: step: 574/463, loss: 0.12633644044399261 2023-01-22 15:36:46.790749: step: 576/463, loss: 0.21042633056640625 2023-01-22 15:36:47.413410: step: 578/463, loss: 0.13653793931007385 2023-01-22 15:36:48.031263: step: 580/463, loss: 0.3283112943172455 2023-01-22 15:36:48.552913: step: 582/463, loss: 0.09841673076152802 2023-01-22 15:36:49.116302: step: 584/463, loss: 0.2488481104373932 2023-01-22 15:36:49.781296: step: 586/463, loss: 0.5770904421806335 2023-01-22 15:36:50.386857: step: 588/463, loss: 0.12066004425287247 2023-01-22 15:36:50.974306: step: 590/463, loss: 0.30740925669670105 2023-01-22 15:36:51.580672: step: 592/463, loss: 0.6235149502754211 2023-01-22 15:36:52.225124: step: 594/463, loss: 0.18602100014686584 2023-01-22 15:36:52.791062: step: 596/463, loss: 0.054312027990818024 2023-01-22 15:36:53.424134: step: 598/463, loss: 0.2963472306728363 2023-01-22 15:36:54.041777: step: 600/463, loss: 0.4641953110694885 2023-01-22 15:36:54.613912: step: 602/463, loss: 0.6046051383018494 2023-01-22 15:36:55.137274: step: 604/463, loss: 0.13247612118721008 2023-01-22 15:36:55.823170: step: 606/463, loss: 0.21494345366954803 2023-01-22 15:36:56.460214: step: 608/463, loss: 0.10805203020572662 2023-01-22 15:36:57.121816: step: 610/463, loss: 0.12216905504465103 2023-01-22 15:36:57.746031: step: 612/463, loss: 0.08232171088457108 2023-01-22 15:36:58.376214: step: 614/463, loss: 0.1379295289516449 2023-01-22 15:36:59.036630: step: 616/463, loss: 0.7613576650619507 2023-01-22 15:36:59.641551: step: 618/463, loss: 0.105097696185112 2023-01-22 15:37:00.241532: step: 620/463, loss: 
0.14517799019813538 2023-01-22 15:37:00.860382: step: 622/463, loss: 0.1036609634757042 2023-01-22 15:37:01.458577: step: 624/463, loss: 0.49962103366851807 2023-01-22 15:37:02.081603: step: 626/463, loss: 0.35880550742149353 2023-01-22 15:37:02.681894: step: 628/463, loss: 0.2009057253599167 2023-01-22 15:37:03.265896: step: 630/463, loss: 0.49910545349121094 2023-01-22 15:37:03.887538: step: 632/463, loss: 7.007883071899414 2023-01-22 15:37:04.544426: step: 634/463, loss: 0.06680173426866531 2023-01-22 15:37:05.121261: step: 636/463, loss: 0.136859729886055 2023-01-22 15:37:05.743598: step: 638/463, loss: 0.9268234968185425 2023-01-22 15:37:06.359104: step: 640/463, loss: 0.33477744460105896 2023-01-22 15:37:07.006162: step: 642/463, loss: 0.1651124656200409 2023-01-22 15:37:07.688316: step: 644/463, loss: 0.12438957393169403 2023-01-22 15:37:08.325925: step: 646/463, loss: 0.1337636560201645 2023-01-22 15:37:09.023327: step: 648/463, loss: 0.1442052125930786 2023-01-22 15:37:09.647200: step: 650/463, loss: 0.153989776968956 2023-01-22 15:37:10.252582: step: 652/463, loss: 0.14472688734531403 2023-01-22 15:37:10.821286: step: 654/463, loss: 0.16586308181285858 2023-01-22 15:37:11.408278: step: 656/463, loss: 0.1985577940940857 2023-01-22 15:37:11.979497: step: 658/463, loss: 0.1763506829738617 2023-01-22 15:37:12.616054: step: 660/463, loss: 0.5323293805122375 2023-01-22 15:37:13.246254: step: 662/463, loss: 0.1505509912967682 2023-01-22 15:37:13.801856: step: 664/463, loss: 0.0553705208003521 2023-01-22 15:37:14.396156: step: 666/463, loss: 0.040222376585006714 2023-01-22 15:37:15.044562: step: 668/463, loss: 0.1532299816608429 2023-01-22 15:37:15.657105: step: 670/463, loss: 0.124493807554245 2023-01-22 15:37:16.243386: step: 672/463, loss: 0.21250513195991516 2023-01-22 15:37:16.826223: step: 674/463, loss: 0.4274196922779083 2023-01-22 15:37:17.449383: step: 676/463, loss: 0.1944085955619812 2023-01-22 15:37:18.024974: step: 678/463, loss: 0.16945792734622955 
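Each record in this log follows the fixed pattern `<timestamp>: step: <k>/463, loss: <v>`. A minimal sketch for extracting the step/loss pairs from such raw log text (the regex and helper below are hypothetical tooling, not part of the training repo):

```python
import re

# Matches records like "2023-01-22 15:37:18.620722: step: 680/463, loss: 0.0610..."
STEP_RE = re.compile(r"step: (\d+)/(\d+), loss: ([0-9.]+)")

def parse_losses(text):
    """Return a list of (step, loss) pairs found in a raw training log."""
    return [(int(m.group(1)), float(m.group(3)))
            for m in STEP_RE.finditer(text)]

log = "2023-01-22 15:37:18.620722: step: 680/463, loss: 0.06105254217982292"
print(parse_losses(log))  # [(680, 0.06105254217982292)]
```

Because the pattern is anchored on the `step:`/`loss:` keywords, it works even where line wrapping has split a timestamp across lines.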
2023-01-22 15:37:18.620722: step: 680/463, loss: 0.06105254217982292 2023-01-22 15:37:19.290340: step: 682/463, loss: 0.24851608276367188 2023-01-22 15:37:19.904580: step: 684/463, loss: 0.15362799167633057 2023-01-22 15:37:20.546316: step: 686/463, loss: 0.11208247393369675 2023-01-22 15:37:21.194549: step: 688/463, loss: 0.10445218533277512 2023-01-22 15:37:21.869114: step: 690/463, loss: 0.21176135540008545 2023-01-22 15:37:22.463901: step: 692/463, loss: 0.2937779724597931 2023-01-22 15:37:23.107171: step: 694/463, loss: 0.15573173761367798 2023-01-22 15:37:23.679725: step: 696/463, loss: 0.16489307582378387 2023-01-22 15:37:24.325574: step: 698/463, loss: 0.36452412605285645 2023-01-22 15:37:24.929569: step: 700/463, loss: 0.6376091241836548 2023-01-22 15:37:25.568363: step: 702/463, loss: 0.3924853801727295 2023-01-22 15:37:26.230982: step: 704/463, loss: 0.20033350586891174 2023-01-22 15:37:26.844623: step: 706/463, loss: 0.12370406091213226 2023-01-22 15:37:27.435349: step: 708/463, loss: 0.27027225494384766 2023-01-22 15:37:28.054158: step: 710/463, loss: 0.16252018511295319 2023-01-22 15:37:28.694017: step: 712/463, loss: 0.11269262433052063 2023-01-22 15:37:29.296388: step: 714/463, loss: 0.7407333850860596 2023-01-22 15:37:29.902175: step: 716/463, loss: 0.0738672986626625 2023-01-22 15:37:30.502910: step: 718/463, loss: 0.10366003960371017 2023-01-22 15:37:31.113157: step: 720/463, loss: 0.4596409797668457 2023-01-22 15:37:31.715233: step: 722/463, loss: 0.31061145663261414 2023-01-22 15:37:32.303184: step: 724/463, loss: 0.16199475526809692 2023-01-22 15:37:32.858501: step: 726/463, loss: 0.37917178869247437 2023-01-22 15:37:33.398210: step: 728/463, loss: 0.17258499562740326 2023-01-22 15:37:34.080736: step: 730/463, loss: 0.23333178460597992 2023-01-22 15:37:34.661418: step: 732/463, loss: 0.23857566714286804 2023-01-22 15:37:35.301984: step: 734/463, loss: 0.045403704047203064 2023-01-22 15:37:35.982389: step: 736/463, loss: 0.31272631883621216 
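The per-language evaluation summaries printed at the end of each epoch (the `Dev`/`Test` blocks below) are consistent with standard F1 (`2pr/(p+r)`) for the `template` and `slot` components, with `combined` equal to the template F1 times the slot F1. A sketch reproducing the epoch-9 Dev Chinese entry under that assumption (small rounding differences arise because the log stores the slot F1 rounded):

```python
def f1(p, r):
    """Standard F1 score from precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Dev Chinese, epoch 9, precision/recall values taken from the log:
template_f1 = f1(1.0, 0.5833333333333334)    # ~0.7368421052631579
slot_f1 = f1(0.26878457814661133, 0.36875)   # ~0.31093
combined = template_f1 * slot_f1             # ~0.2291, near the logged 0.22910631578947366
print(round(combined, 4))  # 0.2291
```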
2023-01-22 15:37:36.585022: step: 738/463, loss: 0.20262940227985382 2023-01-22 15:37:37.184159: step: 740/463, loss: 1.0687575340270996 2023-01-22 15:37:37.815626: step: 742/463, loss: 0.0727340430021286 2023-01-22 15:37:38.409095: step: 744/463, loss: 0.1452399343252182 2023-01-22 15:37:39.027111: step: 746/463, loss: 0.1234608143568039 2023-01-22 15:37:39.645355: step: 748/463, loss: 0.15639141201972961 2023-01-22 15:37:40.260041: step: 750/463, loss: 0.09996497631072998 2023-01-22 15:37:40.853298: step: 752/463, loss: 0.12619838118553162 2023-01-22 15:37:41.466103: step: 754/463, loss: 0.1188499927520752 2023-01-22 15:37:42.119761: step: 756/463, loss: 0.08778050541877747 2023-01-22 15:37:42.747452: step: 758/463, loss: 0.548507571220398 2023-01-22 15:37:43.316366: step: 760/463, loss: 0.48687586188316345 2023-01-22 15:37:43.911871: step: 762/463, loss: 0.139670729637146 2023-01-22 15:37:44.533716: step: 764/463, loss: 0.29274505376815796 2023-01-22 15:37:45.192859: step: 766/463, loss: 1.6825809478759766 2023-01-22 15:37:45.739337: step: 768/463, loss: 0.09364624321460724 2023-01-22 15:37:46.262229: step: 770/463, loss: 0.13443978130817413 2023-01-22 15:37:46.880016: step: 772/463, loss: 1.2405831813812256 2023-01-22 15:37:47.492463: step: 774/463, loss: 0.2065059095621109 2023-01-22 15:37:48.071495: step: 776/463, loss: 0.2460082620382309 2023-01-22 15:37:48.715632: step: 778/463, loss: 0.23876695334911346 2023-01-22 15:37:49.364757: step: 780/463, loss: 0.6067272424697876 2023-01-22 15:37:50.051589: step: 782/463, loss: 0.053829919546842575 2023-01-22 15:37:50.651830: step: 784/463, loss: 0.16580449044704437 2023-01-22 15:37:51.225247: step: 786/463, loss: 0.06792334467172623 2023-01-22 15:37:51.899908: step: 788/463, loss: 0.19389712810516357 2023-01-22 15:37:52.501716: step: 790/463, loss: 0.084641233086586 2023-01-22 15:37:53.230356: step: 792/463, loss: 0.14121395349502563 2023-01-22 15:37:53.820031: step: 794/463, loss: 0.1087966188788414 2023-01-22 
15:37:54.432985: step: 796/463, loss: 0.7084237933158875 2023-01-22 15:37:55.116347: step: 798/463, loss: 0.13546496629714966 2023-01-22 15:37:55.714069: step: 800/463, loss: 0.28883132338523865 2023-01-22 15:37:56.259586: step: 802/463, loss: 0.16458064317703247 2023-01-22 15:37:56.931260: step: 804/463, loss: 0.16891056299209595 2023-01-22 15:37:57.517459: step: 806/463, loss: 0.13761353492736816 2023-01-22 15:37:58.170171: step: 808/463, loss: 0.12072444707155228 2023-01-22 15:37:58.824691: step: 810/463, loss: 0.3882240951061249 2023-01-22 15:37:59.431278: step: 812/463, loss: 0.20960992574691772 2023-01-22 15:38:00.078857: step: 814/463, loss: 0.11579000949859619 2023-01-22 15:38:00.675406: step: 816/463, loss: 0.04784644395112991 2023-01-22 15:38:01.243264: step: 818/463, loss: 0.09945841133594513 2023-01-22 15:38:01.852420: step: 820/463, loss: 0.22717350721359253 2023-01-22 15:38:02.422975: step: 822/463, loss: 0.0975896343588829 2023-01-22 15:38:02.975889: step: 824/463, loss: 0.3785575330257416 2023-01-22 15:38:03.573277: step: 826/463, loss: 0.15830698609352112 2023-01-22 15:38:04.152500: step: 828/463, loss: 0.19315622746944427 2023-01-22 15:38:04.755858: step: 830/463, loss: 0.491571843624115 2023-01-22 15:38:05.342393: step: 832/463, loss: 0.08075858652591705 2023-01-22 15:38:05.905599: step: 834/463, loss: 0.3105156123638153 2023-01-22 15:38:06.545221: step: 836/463, loss: 0.2494489997625351 2023-01-22 15:38:07.120033: step: 838/463, loss: 0.28457894921302795 2023-01-22 15:38:07.737927: step: 840/463, loss: 0.1696852147579193 2023-01-22 15:38:08.461019: step: 842/463, loss: 0.10592375695705414 2023-01-22 15:38:09.062727: step: 844/463, loss: 0.2159624695777893 2023-01-22 15:38:09.678983: step: 846/463, loss: 0.1327885389328003 2023-01-22 15:38:10.239742: step: 848/463, loss: 0.12858092784881592 2023-01-22 15:38:11.010343: step: 850/463, loss: 0.053625281900167465 2023-01-22 15:38:11.609391: step: 852/463, loss: 0.07096284627914429 2023-01-22 
15:38:12.205580: step: 854/463, loss: 0.07700026780366898 2023-01-22 15:38:12.825543: step: 856/463, loss: 0.10297121852636337 2023-01-22 15:38:13.429925: step: 858/463, loss: 0.5746098160743713 2023-01-22 15:38:14.058779: step: 860/463, loss: 0.11554156988859177 2023-01-22 15:38:14.695205: step: 862/463, loss: 0.12530957162380219 2023-01-22 15:38:15.287163: step: 864/463, loss: 0.2754795551300049 2023-01-22 15:38:15.898876: step: 866/463, loss: 1.0150099992752075 2023-01-22 15:38:16.542153: step: 868/463, loss: 0.5246260762214661 2023-01-22 15:38:17.132213: step: 870/463, loss: 0.13494347035884857 2023-01-22 15:38:17.811311: step: 872/463, loss: 0.8495489954948425 2023-01-22 15:38:18.421392: step: 874/463, loss: 0.07405278831720352 2023-01-22 15:38:19.031559: step: 876/463, loss: 0.8412298560142517 2023-01-22 15:38:19.724578: step: 878/463, loss: 0.2126026749610901 2023-01-22 15:38:20.341695: step: 880/463, loss: 0.09888923913240433 2023-01-22 15:38:20.977656: step: 882/463, loss: 0.13962070643901825 2023-01-22 15:38:21.621822: step: 884/463, loss: 0.2535344064235687 2023-01-22 15:38:22.265914: step: 886/463, loss: 0.4218654930591583 2023-01-22 15:38:22.939733: step: 888/463, loss: 0.21839721500873566 2023-01-22 15:38:23.646838: step: 890/463, loss: 0.47848978638648987 2023-01-22 15:38:24.175171: step: 892/463, loss: 0.03554714843630791 2023-01-22 15:38:24.828117: step: 894/463, loss: 0.12180837988853455 2023-01-22 15:38:25.500518: step: 896/463, loss: 0.21129396557807922 2023-01-22 15:38:26.110630: step: 898/463, loss: 0.06671477109193802 2023-01-22 15:38:26.705240: step: 900/463, loss: 0.09041544795036316 2023-01-22 15:38:27.346404: step: 902/463, loss: 0.06875801086425781 2023-01-22 15:38:27.970890: step: 904/463, loss: 0.5283781290054321 2023-01-22 15:38:28.587452: step: 906/463, loss: 0.114686019718647 2023-01-22 15:38:29.171466: step: 908/463, loss: 0.06865619868040085 2023-01-22 15:38:29.803098: step: 910/463, loss: 0.07721424102783203 2023-01-22 
15:38:30.406879: step: 912/463, loss: 0.13775037229061127
2023-01-22 15:38:31.014292: step: 914/463, loss: 0.43571630120277405
2023-01-22 15:38:31.574536: step: 916/463, loss: 0.1318282037973404
2023-01-22 15:38:32.155251: step: 918/463, loss: 0.05203825607895851
2023-01-22 15:38:32.772043: step: 920/463, loss: 0.13451875746250153
2023-01-22 15:38:33.393974: step: 922/463, loss: 0.3883390724658966
2023-01-22 15:38:34.022105: step: 924/463, loss: 0.19637282192707062
2023-01-22 15:38:34.591160: step: 926/463, loss: 0.045323312282562256
==================================================
Loss: 0.284
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26878457814661133, 'r': 0.36875, 'f1': 0.31093}, 'combined': 0.22910631578947366, 'epoch': 9}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.33020162504944583, 'r': 0.31032037537369384, 'f1': 0.31995245180229703}, 'combined': 0.22509217714734467, 'epoch': 9}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27314685314685316, 'r': 0.37058823529411766, 'f1': 0.3144927536231884}, 'combined': 0.23173150266971776, 'epoch': 9}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3217223191886182, 'r': 0.3031938086594308, 'f1': 0.31218338250108507}, 'combined': 0.2216502015757704, 'epoch': 9}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23391812865497075, 'r': 0.38095238095238093, 'f1': 0.2898550724637681}, 'combined': 0.1932367149758454, 'epoch': 9}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.41304347826086957, 'f1': 0.31147540983606553}, 'combined': 0.15573770491803277, 'epoch': 9}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
******************************
Epoch: 10
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:41:18.518190: step: 2/463, loss: 0.40154069662094116
2023-01-22 15:41:19.146222: step: 4/463, loss: 0.10948758572340012
2023-01-22 15:41:19.747540: step: 6/463, loss: 0.08187143504619598
2023-01-22 15:41:20.359800: step: 8/463, loss: 0.02750558592379093
2023-01-22 15:41:20.930295: step: 10/463, loss: 0.2536138594150543
2023-01-22 15:41:21.541976: step: 12/463, loss: 0.2753535211086273
2023-01-22 15:41:22.176439: step: 14/463, loss: 0.5048105120658875
2023-01-22 15:41:22.716677: step: 16/463, loss: 0.3811872601509094
2023-01-22 15:41:23.304012: step: 18/463, loss: 0.12271176278591156
2023-01-22 15:41:23.944301: step: 20/463, loss: 0.5379421710968018
2023-01-22 15:41:24.563752: step: 22/463, loss: 0.04927234724164009
2023-01-22 15:41:25.209678: step: 24/463, loss: 0.37658411264419556
2023-01-22 15:41:25.819869: step: 26/463, loss: 0.2915007472038269
2023-01-22 15:41:26.406680: step: 28/463, loss: 0.11103135347366333
2023-01-22 15:41:27.034188: step: 30/463, loss: 0.14253082871437073
2023-01-22 15:41:27.725493: step: 32/463, loss: 0.023239625617861748
2023-01-22
15:41:28.323011: step: 34/463, loss: 0.08279220759868622 2023-01-22 15:41:28.950005: step: 36/463, loss: 0.05612502247095108 2023-01-22 15:41:29.523137: step: 38/463, loss: 0.03191516920924187 2023-01-22 15:41:30.059588: step: 40/463, loss: 0.03198060393333435 2023-01-22 15:41:30.565799: step: 42/463, loss: 0.08805370330810547 2023-01-22 15:41:31.226390: step: 44/463, loss: 0.07898853719234467 2023-01-22 15:41:31.827378: step: 46/463, loss: 0.053603533655405045 2023-01-22 15:41:32.522697: step: 48/463, loss: 0.07239585369825363 2023-01-22 15:41:33.102193: step: 50/463, loss: 0.4926121234893799 2023-01-22 15:41:33.650002: step: 52/463, loss: 0.16030244529247284 2023-01-22 15:41:34.249818: step: 54/463, loss: 0.2138657569885254 2023-01-22 15:41:34.806253: step: 56/463, loss: 0.6113528609275818 2023-01-22 15:41:35.408958: step: 58/463, loss: 0.140641987323761 2023-01-22 15:41:36.051841: step: 60/463, loss: 1.4100732803344727 2023-01-22 15:41:36.701385: step: 62/463, loss: 0.17404872179031372 2023-01-22 15:41:37.357376: step: 64/463, loss: 0.27019932866096497 2023-01-22 15:41:37.995295: step: 66/463, loss: 0.14932166039943695 2023-01-22 15:41:38.717853: step: 68/463, loss: 0.12371020019054413 2023-01-22 15:41:39.340308: step: 70/463, loss: 0.15235258638858795 2023-01-22 15:41:39.899741: step: 72/463, loss: 0.07262715697288513 2023-01-22 15:41:40.510273: step: 74/463, loss: 0.286660760641098 2023-01-22 15:41:41.131400: step: 76/463, loss: 0.07292231172323227 2023-01-22 15:41:41.708413: step: 78/463, loss: 0.06333442032337189 2023-01-22 15:41:42.413260: step: 80/463, loss: 0.10189302265644073 2023-01-22 15:41:43.044155: step: 82/463, loss: 0.22885921597480774 2023-01-22 15:41:43.693814: step: 84/463, loss: 0.07018053531646729 2023-01-22 15:41:44.360266: step: 86/463, loss: 0.040487706661224365 2023-01-22 15:41:44.984601: step: 88/463, loss: 0.11509883403778076 2023-01-22 15:41:45.527715: step: 90/463, loss: 0.030213577672839165 2023-01-22 15:41:46.185968: step: 92/463, 
loss: 0.9620280265808105 2023-01-22 15:41:46.785801: step: 94/463, loss: 0.11503848433494568 2023-01-22 15:41:47.333755: step: 96/463, loss: 0.14240910112857819 2023-01-22 15:41:47.987530: step: 98/463, loss: 0.15989504754543304 2023-01-22 15:41:48.606727: step: 100/463, loss: 0.309516578912735 2023-01-22 15:41:49.243775: step: 102/463, loss: 0.07258076220750809 2023-01-22 15:41:49.820336: step: 104/463, loss: 0.12410176545381546 2023-01-22 15:41:50.486547: step: 106/463, loss: 0.1747732162475586 2023-01-22 15:41:51.111115: step: 108/463, loss: 0.13655897974967957 2023-01-22 15:41:51.707915: step: 110/463, loss: 0.09219282865524292 2023-01-22 15:41:52.309800: step: 112/463, loss: 0.8564315438270569 2023-01-22 15:41:52.978864: step: 114/463, loss: 0.4530537724494934 2023-01-22 15:41:53.606492: step: 116/463, loss: 0.20383036136627197 2023-01-22 15:41:54.167141: step: 118/463, loss: 0.09706863760948181 2023-01-22 15:41:54.762797: step: 120/463, loss: 0.11450684815645218 2023-01-22 15:41:55.324966: step: 122/463, loss: 0.1908928006887436 2023-01-22 15:41:55.891521: step: 124/463, loss: 0.4277323782444 2023-01-22 15:41:56.548556: step: 126/463, loss: 1.3415489196777344 2023-01-22 15:41:57.201079: step: 128/463, loss: 0.16129577159881592 2023-01-22 15:41:57.766224: step: 130/463, loss: 0.5505008101463318 2023-01-22 15:41:58.303539: step: 132/463, loss: 0.34369194507598877 2023-01-22 15:41:58.898649: step: 134/463, loss: 0.07024836540222168 2023-01-22 15:41:59.530115: step: 136/463, loss: 0.09066754579544067 2023-01-22 15:42:00.136723: step: 138/463, loss: 0.3213493525981903 2023-01-22 15:42:00.739939: step: 140/463, loss: 0.043776798993349075 2023-01-22 15:42:01.354115: step: 142/463, loss: 0.8071836829185486 2023-01-22 15:42:02.027065: step: 144/463, loss: 0.16151858866214752 2023-01-22 15:42:02.640553: step: 146/463, loss: 0.23304855823516846 2023-01-22 15:42:03.239396: step: 148/463, loss: 0.11122845113277435 2023-01-22 15:42:03.836815: step: 150/463, loss: 
0.13140808045864105 2023-01-22 15:42:04.431825: step: 152/463, loss: 0.08877796679735184 2023-01-22 15:42:05.061961: step: 154/463, loss: 0.40265265107154846 2023-01-22 15:42:05.697265: step: 156/463, loss: 0.02423746883869171 2023-01-22 15:42:06.375014: step: 158/463, loss: 0.46296006441116333 2023-01-22 15:42:06.973386: step: 160/463, loss: 0.12039482593536377 2023-01-22 15:42:07.551401: step: 162/463, loss: 0.05494978278875351 2023-01-22 15:42:08.174460: step: 164/463, loss: 0.22877813875675201 2023-01-22 15:42:08.759618: step: 166/463, loss: 0.19151930510997772 2023-01-22 15:42:09.326698: step: 168/463, loss: 0.28065991401672363 2023-01-22 15:42:09.961367: step: 170/463, loss: 0.3985030949115753 2023-01-22 15:42:10.602945: step: 172/463, loss: 0.2238493412733078 2023-01-22 15:42:11.351424: step: 174/463, loss: 0.044669974595308304 2023-01-22 15:42:12.012591: step: 176/463, loss: 0.16945435106754303 2023-01-22 15:42:12.613532: step: 178/463, loss: 0.13997437059879303 2023-01-22 15:42:13.261234: step: 180/463, loss: 0.14598223567008972 2023-01-22 15:42:13.897268: step: 182/463, loss: 0.10464274138212204 2023-01-22 15:42:14.480525: step: 184/463, loss: 0.1705319583415985 2023-01-22 15:42:15.005181: step: 186/463, loss: 0.13178521394729614 2023-01-22 15:42:15.550498: step: 188/463, loss: 0.06371850520372391 2023-01-22 15:42:16.093988: step: 190/463, loss: 0.13415907323360443 2023-01-22 15:42:16.686556: step: 192/463, loss: 0.22134028375148773 2023-01-22 15:42:17.283347: step: 194/463, loss: 0.1567593812942505 2023-01-22 15:42:17.929149: step: 196/463, loss: 0.08227834105491638 2023-01-22 15:42:18.579168: step: 198/463, loss: 0.07417291402816772 2023-01-22 15:42:19.237126: step: 200/463, loss: 0.12974724173545837 2023-01-22 15:42:19.834356: step: 202/463, loss: 0.2933799624443054 2023-01-22 15:42:20.436251: step: 204/463, loss: 0.06816171109676361 2023-01-22 15:42:21.019588: step: 206/463, loss: 0.1960725337266922 2023-01-22 15:42:21.599552: step: 208/463, loss: 
0.6089431047439575 2023-01-22 15:42:22.226795: step: 210/463, loss: 0.33125486969947815 2023-01-22 15:42:22.818702: step: 212/463, loss: 0.34541037678718567 2023-01-22 15:42:23.410634: step: 214/463, loss: 0.7775437831878662 2023-01-22 15:42:24.015613: step: 216/463, loss: 0.10284820199012756 2023-01-22 15:42:24.595274: step: 218/463, loss: 0.091645747423172 2023-01-22 15:42:25.121197: step: 220/463, loss: 1.0382095575332642 2023-01-22 15:42:25.729746: step: 222/463, loss: 0.21796661615371704 2023-01-22 15:42:26.285565: step: 224/463, loss: 0.37717822194099426 2023-01-22 15:42:26.943253: step: 226/463, loss: 1.4435853958129883 2023-01-22 15:42:27.535313: step: 228/463, loss: 0.28745847940444946 2023-01-22 15:42:28.153824: step: 230/463, loss: 0.5564746856689453 2023-01-22 15:42:28.706722: step: 232/463, loss: 0.5620812177658081 2023-01-22 15:42:29.330556: step: 234/463, loss: 0.11769884079694748 2023-01-22 15:42:29.926500: step: 236/463, loss: 1.02436363697052 2023-01-22 15:42:30.570372: step: 238/463, loss: 0.12243344634771347 2023-01-22 15:42:31.207905: step: 240/463, loss: 0.18046696484088898 2023-01-22 15:42:31.828933: step: 242/463, loss: 0.11766783148050308 2023-01-22 15:42:32.454461: step: 244/463, loss: 0.10932203382253647 2023-01-22 15:42:33.718418: step: 246/463, loss: 0.2529165744781494 2023-01-22 15:42:34.334787: step: 248/463, loss: 0.08245060592889786 2023-01-22 15:42:34.928649: step: 250/463, loss: 0.27087146043777466 2023-01-22 15:42:35.542620: step: 252/463, loss: 0.15223677456378937 2023-01-22 15:42:36.156507: step: 254/463, loss: 0.3834773302078247 2023-01-22 15:42:36.879250: step: 256/463, loss: 0.17169812321662903 2023-01-22 15:42:37.531334: step: 258/463, loss: 0.25927338004112244 2023-01-22 15:42:38.168088: step: 260/463, loss: 0.11834041774272919 2023-01-22 15:42:38.763612: step: 262/463, loss: 0.20095296204090118 2023-01-22 15:42:39.342645: step: 264/463, loss: 0.11368720233440399 2023-01-22 15:42:39.943241: step: 266/463, loss: 
0.06389305740594864 2023-01-22 15:42:40.548342: step: 268/463, loss: 0.11819636821746826 2023-01-22 15:42:41.152713: step: 270/463, loss: 0.16653388738632202 2023-01-22 15:42:41.779607: step: 272/463, loss: 0.05196546018123627 2023-01-22 15:42:42.438750: step: 274/463, loss: 0.5603309869766235 2023-01-22 15:42:43.100092: step: 276/463, loss: 0.1909887045621872 2023-01-22 15:42:43.744012: step: 278/463, loss: 0.19672751426696777 2023-01-22 15:42:44.416630: step: 280/463, loss: 0.19756807386875153 2023-01-22 15:42:45.009288: step: 282/463, loss: 0.15142323076725006 2023-01-22 15:42:45.607709: step: 284/463, loss: 0.10646837204694748 2023-01-22 15:42:46.151327: step: 286/463, loss: 0.20932796597480774 2023-01-22 15:42:46.768570: step: 288/463, loss: 0.2537936568260193 2023-01-22 15:42:47.381564: step: 290/463, loss: 0.45465201139450073 2023-01-22 15:42:47.906613: step: 292/463, loss: 0.13248153030872345 2023-01-22 15:42:48.548732: step: 294/463, loss: 0.5830867886543274 2023-01-22 15:42:49.236485: step: 296/463, loss: 0.3196721374988556 2023-01-22 15:42:49.813832: step: 298/463, loss: 0.12841641902923584 2023-01-22 15:42:50.403877: step: 300/463, loss: 0.8898604512214661 2023-01-22 15:42:51.011273: step: 302/463, loss: 0.05626816675066948 2023-01-22 15:42:51.651253: step: 304/463, loss: 0.19256293773651123 2023-01-22 15:42:52.282112: step: 306/463, loss: 0.0797729641199112 2023-01-22 15:42:52.925807: step: 308/463, loss: 0.14352534711360931 2023-01-22 15:42:53.510352: step: 310/463, loss: 0.14184658229351044 2023-01-22 15:42:54.149780: step: 312/463, loss: 5.698919296264648 2023-01-22 15:42:54.788459: step: 314/463, loss: 0.5731371641159058 2023-01-22 15:42:55.409119: step: 316/463, loss: 0.12808886170387268 2023-01-22 15:42:56.040529: step: 318/463, loss: 0.10967511683702469 2023-01-22 15:42:56.634410: step: 320/463, loss: 0.11350304633378983 2023-01-22 15:42:57.301754: step: 322/463, loss: 0.14162488281726837 2023-01-22 15:42:57.863982: step: 324/463, loss: 
0.0764804556965828 2023-01-22 15:42:58.478178: step: 326/463, loss: 0.07667504251003265 2023-01-22 15:42:59.054877: step: 328/463, loss: 0.18006391823291779 2023-01-22 15:42:59.647469: step: 330/463, loss: 0.404104083776474 2023-01-22 15:43:00.229354: step: 332/463, loss: 0.09660448879003525 2023-01-22 15:43:00.934802: step: 334/463, loss: 0.06312228739261627 2023-01-22 15:43:01.527009: step: 336/463, loss: 0.41857510805130005 2023-01-22 15:43:02.114683: step: 338/463, loss: 0.10745318979024887 2023-01-22 15:43:02.682080: step: 340/463, loss: 0.6886838674545288 2023-01-22 15:43:03.273807: step: 342/463, loss: 0.060952845960855484 2023-01-22 15:43:03.887865: step: 344/463, loss: 0.1899404525756836 2023-01-22 15:43:04.475474: step: 346/463, loss: 0.22441746294498444 2023-01-22 15:43:05.085262: step: 348/463, loss: 0.07826019823551178 2023-01-22 15:43:05.655144: step: 350/463, loss: 0.08408509194850922 2023-01-22 15:43:06.259498: step: 352/463, loss: 0.15325136482715607 2023-01-22 15:43:06.863732: step: 354/463, loss: 0.062361788004636765 2023-01-22 15:43:07.571689: step: 356/463, loss: 0.11399306356906891 2023-01-22 15:43:08.197026: step: 358/463, loss: 0.30359819531440735 2023-01-22 15:43:08.806696: step: 360/463, loss: 0.2804632782936096 2023-01-22 15:43:09.430598: step: 362/463, loss: 0.11184251308441162 2023-01-22 15:43:10.061087: step: 364/463, loss: 0.13788747787475586 2023-01-22 15:43:10.677162: step: 366/463, loss: 0.03232598677277565 2023-01-22 15:43:11.312014: step: 368/463, loss: 0.08699927479028702 2023-01-22 15:43:11.903436: step: 370/463, loss: 0.12345526367425919 2023-01-22 15:43:12.504459: step: 372/463, loss: 0.033256739377975464 2023-01-22 15:43:13.063232: step: 374/463, loss: 0.16277574002742767 2023-01-22 15:43:13.680837: step: 376/463, loss: 0.46877458691596985 2023-01-22 15:43:14.338575: step: 378/463, loss: 0.033871471881866455 2023-01-22 15:43:15.007945: step: 380/463, loss: 0.37213271856307983 2023-01-22 15:43:15.576423: step: 382/463, loss: 
0.2982301115989685 2023-01-22 15:43:16.208501: step: 384/463, loss: 0.14033499360084534 2023-01-22 15:43:16.856236: step: 386/463, loss: 0.4683551788330078 2023-01-22 15:43:17.434057: step: 388/463, loss: 0.030419645830988884 2023-01-22 15:43:18.070355: step: 390/463, loss: 0.047792915254831314 2023-01-22 15:43:18.654422: step: 392/463, loss: 0.1998784989118576 2023-01-22 15:43:19.254127: step: 394/463, loss: 0.27084630727767944 2023-01-22 15:43:19.879860: step: 396/463, loss: 0.13078385591506958 2023-01-22 15:43:20.554134: step: 398/463, loss: 0.08334220945835114 2023-01-22 15:43:21.107647: step: 400/463, loss: 0.03373286873102188 2023-01-22 15:43:21.684556: step: 402/463, loss: 0.09037044644355774 2023-01-22 15:43:22.318197: step: 404/463, loss: 0.06295708566904068 2023-01-22 15:43:22.874174: step: 406/463, loss: 0.1880054771900177 2023-01-22 15:43:23.422892: step: 408/463, loss: 0.11160700023174286 2023-01-22 15:43:23.997336: step: 410/463, loss: 0.06626136600971222 2023-01-22 15:43:24.595844: step: 412/463, loss: 0.3930124342441559 2023-01-22 15:43:25.229744: step: 414/463, loss: 1.0913612842559814 2023-01-22 15:43:25.773978: step: 416/463, loss: 1.110353708267212 2023-01-22 15:43:26.443650: step: 418/463, loss: 0.7351412177085876 2023-01-22 15:43:27.047313: step: 420/463, loss: 0.16602905094623566 2023-01-22 15:43:27.802465: step: 422/463, loss: 0.2530447244644165 2023-01-22 15:43:28.426635: step: 424/463, loss: 0.2499721646308899 2023-01-22 15:43:29.018705: step: 426/463, loss: 0.04278969764709473 2023-01-22 15:43:29.649033: step: 428/463, loss: 0.2150280624628067 2023-01-22 15:43:30.308919: step: 430/463, loss: 0.08834477514028549 2023-01-22 15:43:30.956873: step: 432/463, loss: 0.09184259921312332 2023-01-22 15:43:31.612126: step: 434/463, loss: 0.14564815163612366 2023-01-22 15:43:32.186506: step: 436/463, loss: 0.11879178881645203 2023-01-22 15:43:32.837887: step: 438/463, loss: 0.14588358998298645 2023-01-22 15:43:33.485886: step: 440/463, loss: 
0.5753128528594971 2023-01-22 15:43:34.072298: step: 442/463, loss: 0.09427831321954727 2023-01-22 15:43:34.700470: step: 444/463, loss: 0.1107979267835617 2023-01-22 15:43:35.425327: step: 446/463, loss: 0.3931941092014313 2023-01-22 15:43:36.050874: step: 448/463, loss: 0.2710011303424835 2023-01-22 15:43:36.660363: step: 450/463, loss: 0.27078256011009216 2023-01-22 15:43:37.385461: step: 452/463, loss: 0.18152102828025818 2023-01-22 15:43:38.022588: step: 454/463, loss: 0.09159266948699951 2023-01-22 15:43:38.656381: step: 456/463, loss: 0.16016456484794617 2023-01-22 15:43:39.259941: step: 458/463, loss: 0.04194062575697899 2023-01-22 15:43:39.862790: step: 460/463, loss: 0.16257309913635254 2023-01-22 15:43:40.489838: step: 462/463, loss: 0.14484372735023499 2023-01-22 15:43:41.139398: step: 464/463, loss: 0.29700881242752075 2023-01-22 15:43:41.782554: step: 466/463, loss: 0.09791848808526993 2023-01-22 15:43:42.464377: step: 468/463, loss: 0.0708242803812027 2023-01-22 15:43:43.112403: step: 470/463, loss: 0.13938698172569275 2023-01-22 15:43:43.705629: step: 472/463, loss: 0.14578376710414886 2023-01-22 15:43:44.320810: step: 474/463, loss: 0.17998532950878143 2023-01-22 15:43:44.926435: step: 476/463, loss: 0.1747705191373825 2023-01-22 15:43:45.558072: step: 478/463, loss: 0.17318804562091827 2023-01-22 15:43:46.241524: step: 480/463, loss: 0.2273542881011963 2023-01-22 15:43:46.820283: step: 482/463, loss: 0.0996875986456871 2023-01-22 15:43:47.373224: step: 484/463, loss: 0.23904851078987122 2023-01-22 15:43:48.031294: step: 486/463, loss: 0.32661572098731995 2023-01-22 15:43:48.562644: step: 488/463, loss: 0.3198864161968231 2023-01-22 15:43:49.188650: step: 490/463, loss: 0.1237216368317604 2023-01-22 15:43:49.830807: step: 492/463, loss: 0.13695378601551056 2023-01-22 15:43:50.436603: step: 494/463, loss: 0.3341292440891266 2023-01-22 15:43:51.145640: step: 496/463, loss: 0.06083236262202263 2023-01-22 15:43:51.773903: step: 498/463, loss: 
0.511300265789032 2023-01-22 15:43:52.409382: step: 500/463, loss: 0.1484244167804718 2023-01-22 15:43:53.065903: step: 502/463, loss: 0.8819226026535034 2023-01-22 15:43:53.671768: step: 504/463, loss: 0.3403226137161255 2023-01-22 15:43:54.247890: step: 506/463, loss: 0.02496417425572872 2023-01-22 15:43:54.852885: step: 508/463, loss: 0.32907846570014954 2023-01-22 15:43:55.446033: step: 510/463, loss: 0.30299341678619385 2023-01-22 15:43:56.038313: step: 512/463, loss: 0.14849722385406494 2023-01-22 15:43:56.627103: step: 514/463, loss: 0.11631640791893005 2023-01-22 15:43:57.208665: step: 516/463, loss: 0.1780148148536682 2023-01-22 15:43:57.878191: step: 518/463, loss: 0.12776637077331543 2023-01-22 15:43:58.444098: step: 520/463, loss: 0.07072924077510834 2023-01-22 15:43:59.048900: step: 522/463, loss: 0.0863102376461029 2023-01-22 15:43:59.634246: step: 524/463, loss: 0.33186841011047363 2023-01-22 15:44:00.231585: step: 526/463, loss: 0.3439846336841583 2023-01-22 15:44:00.810466: step: 528/463, loss: 0.11858299374580383 2023-01-22 15:44:01.398060: step: 530/463, loss: 0.11775513738393784 2023-01-22 15:44:02.030139: step: 532/463, loss: 0.1434253752231598 2023-01-22 15:44:02.641348: step: 534/463, loss: 0.20010808110237122 2023-01-22 15:44:03.236824: step: 536/463, loss: 0.42535853385925293 2023-01-22 15:44:03.857488: step: 538/463, loss: 0.20112168788909912 2023-01-22 15:44:04.545318: step: 540/463, loss: 0.15591692924499512 2023-01-22 15:44:05.116304: step: 542/463, loss: 0.1130349412560463 2023-01-22 15:44:05.729818: step: 544/463, loss: 0.06791341304779053 2023-01-22 15:44:06.233065: step: 546/463, loss: 0.05658133700489998 2023-01-22 15:44:06.843663: step: 548/463, loss: 0.11882972717285156 2023-01-22 15:44:07.453517: step: 550/463, loss: 0.31166261434555054 2023-01-22 15:44:08.018752: step: 552/463, loss: 0.10249744355678558 2023-01-22 15:44:08.643722: step: 554/463, loss: 0.07589934766292572 2023-01-22 15:44:09.297566: step: 556/463, loss: 
0.14628173410892487 2023-01-22 15:44:09.850681: step: 558/463, loss: 0.13658225536346436 2023-01-22 15:44:10.476235: step: 560/463, loss: 0.14817124605178833 2023-01-22 15:44:11.086807: step: 562/463, loss: 0.11600441485643387 2023-01-22 15:44:11.685653: step: 564/463, loss: 0.3476855456829071 2023-01-22 15:44:12.283430: step: 566/463, loss: 0.3397993743419647 2023-01-22 15:44:12.974171: step: 568/463, loss: 0.32109546661376953 2023-01-22 15:44:13.579441: step: 570/463, loss: 0.20212127268314362 2023-01-22 15:44:14.221610: step: 572/463, loss: 0.03610093891620636 2023-01-22 15:44:14.828058: step: 574/463, loss: 0.1360633671283722 2023-01-22 15:44:15.464547: step: 576/463, loss: 0.2183108627796173 2023-01-22 15:44:16.071906: step: 578/463, loss: 1.14596688747406 2023-01-22 15:44:16.699680: step: 580/463, loss: 0.6904117465019226 2023-01-22 15:44:17.318833: step: 582/463, loss: 0.3438210189342499 2023-01-22 15:44:17.850912: step: 584/463, loss: 1.6091212034225464 2023-01-22 15:44:18.481197: step: 586/463, loss: 0.07631315290927887 2023-01-22 15:44:19.074148: step: 588/463, loss: 0.25208497047424316 2023-01-22 15:44:19.744979: step: 590/463, loss: 0.18700838088989258 2023-01-22 15:44:20.388604: step: 592/463, loss: 1.2856144905090332 2023-01-22 15:44:21.010318: step: 594/463, loss: 0.3962287902832031 2023-01-22 15:44:21.560359: step: 596/463, loss: 0.11446593701839447 2023-01-22 15:44:22.195491: step: 598/463, loss: 0.11464279890060425 2023-01-22 15:44:22.730413: step: 600/463, loss: 0.476642906665802 2023-01-22 15:44:23.318099: step: 602/463, loss: 0.01924838311970234 2023-01-22 15:44:23.892054: step: 604/463, loss: 0.09926652908325195 2023-01-22 15:44:24.536396: step: 606/463, loss: 0.042811594903469086 2023-01-22 15:44:25.152753: step: 608/463, loss: 0.19313392043113708 2023-01-22 15:44:25.751961: step: 610/463, loss: 0.09596715867519379 2023-01-22 15:44:26.390550: step: 612/463, loss: 0.5793747305870056 2023-01-22 15:44:27.026454: step: 614/463, loss: 
0.15619812905788422 2023-01-22 15:44:27.587975: step: 616/463, loss: 0.18493740260601044 2023-01-22 15:44:28.199095: step: 618/463, loss: 0.17497417330741882 2023-01-22 15:44:28.789145: step: 620/463, loss: 0.5195960402488708 2023-01-22 15:44:29.459972: step: 622/463, loss: 0.11246638745069504 2023-01-22 15:44:30.116470: step: 624/463, loss: 0.09005017578601837 2023-01-22 15:44:30.813804: step: 626/463, loss: 0.2701147198677063 2023-01-22 15:44:31.413013: step: 628/463, loss: 0.18558157980442047 2023-01-22 15:44:32.021548: step: 630/463, loss: 0.01844145357608795 2023-01-22 15:44:32.753203: step: 632/463, loss: 0.1051337718963623 2023-01-22 15:44:33.395457: step: 634/463, loss: 0.09002503007650375 2023-01-22 15:44:34.031809: step: 636/463, loss: 0.053919464349746704 2023-01-22 15:44:34.650430: step: 638/463, loss: 0.1744384616613388 2023-01-22 15:44:35.242740: step: 640/463, loss: 0.10787545144557953 2023-01-22 15:44:35.827374: step: 642/463, loss: 0.14595259726047516 2023-01-22 15:44:36.431186: step: 644/463, loss: 0.5221693515777588 2023-01-22 15:44:37.114929: step: 646/463, loss: 0.18667727708816528 2023-01-22 15:44:37.716241: step: 648/463, loss: 0.27558618783950806 2023-01-22 15:44:38.356512: step: 650/463, loss: 0.27779144048690796 2023-01-22 15:44:38.944557: step: 652/463, loss: 0.05071897804737091 2023-01-22 15:44:39.594069: step: 654/463, loss: 0.08819505572319031 2023-01-22 15:44:40.178238: step: 656/463, loss: 0.1252928227186203 2023-01-22 15:44:40.842007: step: 658/463, loss: 0.171945720911026 2023-01-22 15:44:41.445556: step: 660/463, loss: 0.10049039125442505 2023-01-22 15:44:42.033864: step: 662/463, loss: 0.1621101051568985 2023-01-22 15:44:42.617689: step: 664/463, loss: 0.1554742455482483 2023-01-22 15:44:43.184014: step: 666/463, loss: 0.24186374247074127 2023-01-22 15:44:43.761960: step: 668/463, loss: 0.13482648134231567 2023-01-22 15:44:44.320458: step: 670/463, loss: 0.19186227023601532 2023-01-22 15:44:44.932360: step: 672/463, loss: 
0.2953675091266632 2023-01-22 15:44:45.550518: step: 674/463, loss: 0.07042815536260605 2023-01-22 15:44:46.208639: step: 676/463, loss: 0.23863781988620758 2023-01-22 15:44:46.794150: step: 678/463, loss: 0.34180140495300293 2023-01-22 15:44:47.401802: step: 680/463, loss: 0.20568764209747314 2023-01-22 15:44:48.016675: step: 682/463, loss: 0.17024503648281097 2023-01-22 15:44:48.646558: step: 684/463, loss: 0.06332960724830627 2023-01-22 15:44:49.392985: step: 686/463, loss: 0.20806638896465302 2023-01-22 15:44:50.082377: step: 688/463, loss: 0.2610677182674408 2023-01-22 15:44:50.730156: step: 690/463, loss: 0.20768281817436218 2023-01-22 15:44:51.355348: step: 692/463, loss: 0.2731561064720154 2023-01-22 15:44:51.982049: step: 694/463, loss: 0.1324978470802307 2023-01-22 15:44:52.590395: step: 696/463, loss: 0.36941900849342346 2023-01-22 15:44:53.189020: step: 698/463, loss: 0.07979051023721695 2023-01-22 15:44:53.820712: step: 700/463, loss: 0.12093637883663177 2023-01-22 15:44:54.456362: step: 702/463, loss: 0.0820816308259964 2023-01-22 15:44:55.086260: step: 704/463, loss: 0.10476037859916687 2023-01-22 15:44:55.718033: step: 706/463, loss: 0.10667893290519714 2023-01-22 15:44:56.351850: step: 708/463, loss: 0.06923399865627289 2023-01-22 15:44:57.017867: step: 710/463, loss: 0.45718011260032654 2023-01-22 15:44:57.652436: step: 712/463, loss: 0.19783912599086761 2023-01-22 15:44:58.291553: step: 714/463, loss: 0.12787441909313202 2023-01-22 15:44:58.867258: step: 716/463, loss: 0.23538514971733093 2023-01-22 15:44:59.467656: step: 718/463, loss: 0.08009427040815353 2023-01-22 15:45:00.087554: step: 720/463, loss: 0.08249340206384659 2023-01-22 15:45:00.687289: step: 722/463, loss: 0.11066696792840958 2023-01-22 15:45:01.257649: step: 724/463, loss: 0.04272262752056122 2023-01-22 15:45:01.860108: step: 726/463, loss: 7.0908098220825195 2023-01-22 15:45:02.480653: step: 728/463, loss: 0.049037691205739975 2023-01-22 15:45:03.066533: step: 730/463, loss: 
0.39937570691108704 2023-01-22 15:45:03.694355: step: 732/463, loss: 1.039881944656372 2023-01-22 15:45:04.307946: step: 734/463, loss: 0.11043931543827057 2023-01-22 15:45:04.952859: step: 736/463, loss: 0.5287123918533325 2023-01-22 15:45:05.559826: step: 738/463, loss: 8.246933937072754 2023-01-22 15:45:06.170934: step: 740/463, loss: 0.2513301968574524 2023-01-22 15:45:06.742810: step: 742/463, loss: 0.07061297446489334 2023-01-22 15:45:07.397098: step: 744/463, loss: 0.18841087818145752 2023-01-22 15:45:08.000963: step: 746/463, loss: 0.23705150187015533 2023-01-22 15:45:08.616790: step: 748/463, loss: 0.9724906086921692 2023-01-22 15:45:09.199316: step: 750/463, loss: 0.10628017038106918 2023-01-22 15:45:09.839402: step: 752/463, loss: 0.035298172384500504 2023-01-22 15:45:10.380519: step: 754/463, loss: 0.124782033264637 2023-01-22 15:45:11.041411: step: 756/463, loss: 0.2902880311012268 2023-01-22 15:45:11.706121: step: 758/463, loss: 0.08859802037477493 2023-01-22 15:45:12.313770: step: 760/463, loss: 0.3655205965042114 2023-01-22 15:45:12.929053: step: 762/463, loss: 0.7972539663314819 2023-01-22 15:45:13.568786: step: 764/463, loss: 0.45659443736076355 2023-01-22 15:45:14.207686: step: 766/463, loss: 0.7290958166122437 2023-01-22 15:45:14.834075: step: 768/463, loss: 0.16718219220638275 2023-01-22 15:45:15.427201: step: 770/463, loss: 0.07359286397695541 2023-01-22 15:45:16.070174: step: 772/463, loss: 0.12085504829883575 2023-01-22 15:45:16.658549: step: 774/463, loss: 0.12890678644180298 2023-01-22 15:45:17.225430: step: 776/463, loss: 0.10536642372608185 2023-01-22 15:45:17.862614: step: 778/463, loss: 0.07809966057538986 2023-01-22 15:45:18.499617: step: 780/463, loss: 0.09633523970842361 2023-01-22 15:45:19.151124: step: 782/463, loss: 0.8652080297470093 2023-01-22 15:45:19.756136: step: 784/463, loss: 0.24520201981067657 2023-01-22 15:45:20.403150: step: 786/463, loss: 0.0845712274312973 2023-01-22 15:45:21.048645: step: 788/463, loss: 
0.1384482979774475 2023-01-22 15:45:21.673982: step: 790/463, loss: 0.21005448698997498 2023-01-22 15:45:22.314495: step: 792/463, loss: 0.6695063710212708 2023-01-22 15:45:22.886967: step: 794/463, loss: 0.1850268691778183 2023-01-22 15:45:23.528526: step: 796/463, loss: 0.11372587084770203 2023-01-22 15:45:24.160040: step: 798/463, loss: 0.03277843818068504 2023-01-22 15:45:24.746877: step: 800/463, loss: 0.06729190051555634 2023-01-22 15:45:25.353023: step: 802/463, loss: 0.21394476294517517 2023-01-22 15:45:25.992388: step: 804/463, loss: 0.30516189336776733 2023-01-22 15:45:26.674246: step: 806/463, loss: 0.07171948999166489 2023-01-22 15:45:27.328218: step: 808/463, loss: 0.24301525950431824 2023-01-22 15:45:27.966022: step: 810/463, loss: 0.6988062858581543 2023-01-22 15:45:28.579389: step: 812/463, loss: 0.35831010341644287 2023-01-22 15:45:29.179500: step: 814/463, loss: 0.096327044069767 2023-01-22 15:45:29.798420: step: 816/463, loss: 0.20689234137535095 2023-01-22 15:45:30.333961: step: 818/463, loss: 0.12810686230659485 2023-01-22 15:45:30.973641: step: 820/463, loss: 0.3458573818206787 2023-01-22 15:45:31.544481: step: 822/463, loss: 0.02082609198987484 2023-01-22 15:45:32.198010: step: 824/463, loss: 0.2951313853263855 2023-01-22 15:45:32.775425: step: 826/463, loss: 0.04201703518629074 2023-01-22 15:45:33.342489: step: 828/463, loss: 0.2750527560710907 2023-01-22 15:45:33.943391: step: 830/463, loss: 0.389217734336853 2023-01-22 15:45:34.527965: step: 832/463, loss: 0.35633736848831177 2023-01-22 15:45:35.102501: step: 834/463, loss: 0.20983098447322845 2023-01-22 15:45:35.735510: step: 836/463, loss: 0.097258061170578 2023-01-22 15:45:36.318826: step: 838/463, loss: 0.1673450767993927 2023-01-22 15:45:36.931066: step: 840/463, loss: 0.3331473171710968 2023-01-22 15:45:37.563810: step: 842/463, loss: 0.6409955024719238 2023-01-22 15:45:38.182084: step: 844/463, loss: 0.19879215955734253 2023-01-22 15:45:38.760921: step: 846/463, loss: 
0.1810797154903412 2023-01-22 15:45:39.469020: step: 848/463, loss: 0.43632590770721436 2023-01-22 15:45:40.092895: step: 850/463, loss: 0.062483321875333786 2023-01-22 15:45:40.722616: step: 852/463, loss: 0.5781866908073425 2023-01-22 15:45:41.337115: step: 854/463, loss: 0.09061413258314133 2023-01-22 15:45:41.967934: step: 856/463, loss: 0.11473110318183899 2023-01-22 15:45:42.577247: step: 858/463, loss: 0.03386783227324486 2023-01-22 15:45:43.171545: step: 860/463, loss: 0.09787828475236893 2023-01-22 15:45:43.763680: step: 862/463, loss: 0.6621948480606079 2023-01-22 15:45:44.395454: step: 864/463, loss: 0.2568643093109131 2023-01-22 15:45:45.032448: step: 866/463, loss: 0.17429155111312866 2023-01-22 15:45:45.697558: step: 868/463, loss: 0.13112936913967133 2023-01-22 15:45:46.304234: step: 870/463, loss: 0.33683228492736816 2023-01-22 15:45:46.845787: step: 872/463, loss: 0.19763724505901337 2023-01-22 15:45:47.521348: step: 874/463, loss: 0.7051858901977539 2023-01-22 15:45:48.186872: step: 876/463, loss: 0.2323628067970276 2023-01-22 15:45:48.814717: step: 878/463, loss: 9.842399597167969 2023-01-22 15:45:49.407830: step: 880/463, loss: 0.17118792235851288 2023-01-22 15:45:49.972446: step: 882/463, loss: 0.1646565943956375 2023-01-22 15:45:50.537943: step: 884/463, loss: 0.2311827838420868 2023-01-22 15:45:51.166236: step: 886/463, loss: 0.12026527523994446 2023-01-22 15:45:51.907295: step: 888/463, loss: 0.19712676107883453 2023-01-22 15:45:52.486893: step: 890/463, loss: 0.08223231136798859 2023-01-22 15:45:53.092545: step: 892/463, loss: 0.2076314091682434 2023-01-22 15:45:53.702687: step: 894/463, loss: 0.2403974086046219 2023-01-22 15:45:54.294644: step: 896/463, loss: 0.06244455277919769 2023-01-22 15:45:54.928461: step: 898/463, loss: 0.0782196968793869 2023-01-22 15:45:55.546511: step: 900/463, loss: 0.0754585787653923 2023-01-22 15:45:56.139895: step: 902/463, loss: 0.09606040269136429 2023-01-22 15:45:56.704102: step: 904/463, loss: 
0.10891515761613846 2023-01-22 15:45:57.275880: step: 906/463, loss: 0.22735673189163208 2023-01-22 15:45:57.948634: step: 908/463, loss: 0.12264873087406158 2023-01-22 15:45:58.490715: step: 910/463, loss: 0.042725756764411926 2023-01-22 15:45:59.122593: step: 912/463, loss: 0.22233512997627258 2023-01-22 15:45:59.734726: step: 914/463, loss: 0.10984070599079132 2023-01-22 15:46:00.372593: step: 916/463, loss: 0.11169915646314621 2023-01-22 15:46:00.953439: step: 918/463, loss: 0.029448162764310837 2023-01-22 15:46:01.572718: step: 920/463, loss: 0.36392742395401 2023-01-22 15:46:02.230070: step: 922/463, loss: 0.4412465989589691 2023-01-22 15:46:02.913448: step: 924/463, loss: 0.5335264801979065 2023-01-22 15:46:03.485985: step: 926/463, loss: 0.1237524077296257
==================================================
Loss: 0.300
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2763234157650695, 'r': 0.3392433586337761, 'f1': 0.30456771720613285}, 'combined': 0.22441831794136102, 'epoch': 10}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3522054392206747, 'r': 0.3331340529921316, 'f1': 0.3424043901938875}, 'combined': 0.2408875106891671, 'epoch': 10}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26967229199372056, 'r': 0.3259606261859583, 'f1': 0.2951567869415807}, 'combined': 0.21748394827274367, 'epoch': 10}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.345705756367181, 'r': 0.3263824651815918, 'f1': 0.33576632761268876}, 'combined': 0.23839409260500902, 'epoch': 10}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28493110236220476, 'r': 0.343323055028463, 'f1': 0.31141351118760763}, 'combined': 0.22946258719086876, 'epoch': 10}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36354871148677315, 'r': 0.31270273785225944, 'f1': 0.3362142219013015}, 'combined': 0.23871209754992406, 'epoch': 10}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2876984126984127, 'r': 0.3452380952380952, 'f1': 0.3138528138528138}, 'combined': 0.20923520923520916, 'epoch': 10}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24285714285714285, 'r': 0.3695652173913043, 'f1': 0.29310344827586204}, 'combined': 0.14655172413793102, 'epoch': 10}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.1724137931034483, 'f1': 0.20408163265306123}, 'combined': 0.13605442176870747, 'epoch': 10}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3095149595559295, 'r': 0.3253724622276754, 'f1': 0.31724567547453275}, 'combined': 0.23375997140228727, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.36402697976295945, 'r': 0.29452226435922085, 'f1': 0.32560678286267597}, 'combined': 0.22907009849635496, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29409479409479405, 'r': 0.32770562770562767, 'f1': 0.3099918099918099}, 'combined': 0.2066612066612066, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
******************************
Epoch: 11
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:48:36.091538: step: 2/463, loss: 0.30766698718070984 2023-01-22 15:48:36.636996: step: 4/463, loss: 0.07764735072851181 2023-01-22 15:48:37.240618: step: 6/463, loss: 0.14308613538742065 2023-01-22 15:48:37.836545: step: 8/463, loss: 0.42703041434288025 2023-01-22 15:48:38.446837: step: 10/463, loss: 0.6001687049865723 2023-01-22 15:48:39.003373: step: 12/463, loss: 0.06018494442105293 2023-01-22 15:48:39.711202: step: 14/463, loss: 0.095168836414814 2023-01-22 15:48:40.272265: step: 16/463, loss: 0.25650495290756226 2023-01-22 15:48:40.873673: step: 18/463, loss: 0.22008155286312103 2023-01-22 15:48:41.462441: step: 20/463, loss: 0.0990765318274498 2023-01-22 15:48:42.165593: step: 22/463, loss: 0.04741843417286873 2023-01-22 15:48:42.745730: step: 24/463, loss: 0.40621092915534973 2023-01-22
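Note on reading the evaluation dicts in this log: the numbers are consistent with each 'f1' being the usual harmonic mean of 'p' and 'r', and each 'combined' score being the product of the template F1 and the slot F1 (e.g. Dev Chinese at epoch 10: 0.7368421052631579 × 0.30456771720613285 ≈ 0.22441831794136102). The training code itself is not shown in this log, so this is an inferred relationship checked against the logged values; the helper name `f1` below is mine.

```python
# Sketch: reproduce the 'combined' score from the Dev Chinese block at epoch 10.
# Inferred from the logged numbers, not taken from train.py (which is not shown here).

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Values copied from the "Dev Chinese" dict at epoch 10 above
template_f1 = f1(p=1.0, r=0.5833333333333334)                  # ≈ 0.7368421052631579
slot_f1 = f1(p=0.2763234157650695, r=0.3392433586337761)       # ≈ 0.30456771720613285
combined = template_f1 * slot_f1                               # ≈ 0.22441831794136102
```

The same relation holds for the other blocks, e.g. Test Korean at epoch 10: 0.71 × 0.33576632761268876 ≈ 0.23839409260500902, matching the logged 'combined' value.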
15:48:43.350599: step: 26/463, loss: 0.12071111798286438 2023-01-22 15:48:43.998096: step: 28/463, loss: 0.09850025922060013 2023-01-22 15:48:44.581973: step: 30/463, loss: 0.04313766583800316 2023-01-22 15:48:45.173609: step: 32/463, loss: 0.14443650841712952 2023-01-22 15:48:45.763398: step: 34/463, loss: 0.877985954284668 2023-01-22 15:48:46.399215: step: 36/463, loss: 0.11363336443901062 2023-01-22 15:48:47.020240: step: 38/463, loss: 0.07350818812847137 2023-01-22 15:48:47.587592: step: 40/463, loss: 0.541226863861084 2023-01-22 15:48:48.238770: step: 42/463, loss: 0.17483672499656677 2023-01-22 15:48:48.838715: step: 44/463, loss: 0.7551212310791016 2023-01-22 15:48:49.483458: step: 46/463, loss: 0.33826547861099243 2023-01-22 15:48:50.025431: step: 48/463, loss: 0.05065212398767471 2023-01-22 15:48:50.586983: step: 50/463, loss: 0.2848389744758606 2023-01-22 15:48:51.192655: step: 52/463, loss: 0.11479523032903671 2023-01-22 15:48:51.728428: step: 54/463, loss: 0.13717034459114075 2023-01-22 15:48:52.314780: step: 56/463, loss: 0.07104825973510742 2023-01-22 15:48:52.965269: step: 58/463, loss: 0.0817020907998085 2023-01-22 15:48:53.646624: step: 60/463, loss: 0.09490220248699188 2023-01-22 15:48:54.232376: step: 62/463, loss: 0.053828563541173935 2023-01-22 15:48:54.820928: step: 64/463, loss: 0.057275477796792984 2023-01-22 15:48:55.452168: step: 66/463, loss: 0.12280435860157013 2023-01-22 15:48:56.077768: step: 68/463, loss: 0.1881435662508011 2023-01-22 15:48:56.744886: step: 70/463, loss: 0.1447756141424179 2023-01-22 15:48:57.384877: step: 72/463, loss: 0.09036615490913391 2023-01-22 15:48:58.041671: step: 74/463, loss: 0.14337694644927979 2023-01-22 15:48:58.628611: step: 76/463, loss: 0.26268401741981506 2023-01-22 15:48:59.218795: step: 78/463, loss: 0.13526304066181183 2023-01-22 15:48:59.798062: step: 80/463, loss: 0.1299312859773636 2023-01-22 15:49:00.425177: step: 82/463, loss: 0.03643864020705223 2023-01-22 15:49:00.992693: step: 84/463, 
loss: 0.03413718193769455 2023-01-22 15:49:01.598048: step: 86/463, loss: 0.33864444494247437 2023-01-22 15:49:02.199153: step: 88/463, loss: 0.12213394045829773 2023-01-22 15:49:02.800997: step: 90/463, loss: 0.1795668751001358 2023-01-22 15:49:03.400971: step: 92/463, loss: 0.37977248430252075 2023-01-22 15:49:04.005113: step: 94/463, loss: 0.19702722132205963 2023-01-22 15:49:04.584159: step: 96/463, loss: 0.13043655455112457 2023-01-22 15:49:05.187159: step: 98/463, loss: 0.16667112708091736 2023-01-22 15:49:05.823082: step: 100/463, loss: 0.2098059356212616 2023-01-22 15:49:06.429501: step: 102/463, loss: 0.7729058265686035 2023-01-22 15:49:07.010966: step: 104/463, loss: 0.01812215894460678 2023-01-22 15:49:07.640263: step: 106/463, loss: 0.19573596119880676 2023-01-22 15:49:08.225972: step: 108/463, loss: 0.14948052167892456 2023-01-22 15:49:08.820647: step: 110/463, loss: 0.8801953196525574 2023-01-22 15:49:09.382265: step: 112/463, loss: 0.04627665877342224 2023-01-22 15:49:09.999769: step: 114/463, loss: 0.43227317929267883 2023-01-22 15:49:10.596859: step: 116/463, loss: 0.3645527958869934 2023-01-22 15:49:11.193936: step: 118/463, loss: 0.41212937235832214 2023-01-22 15:49:11.818665: step: 120/463, loss: 0.13738292455673218 2023-01-22 15:49:12.374392: step: 122/463, loss: 0.08812513202428818 2023-01-22 15:49:12.951611: step: 124/463, loss: 0.169837087392807 2023-01-22 15:49:13.582154: step: 126/463, loss: 0.5371737480163574 2023-01-22 15:49:14.228402: step: 128/463, loss: 0.15004339814186096 2023-01-22 15:49:14.823437: step: 130/463, loss: 0.06935568153858185 2023-01-22 15:49:15.536048: step: 132/463, loss: 0.12364938855171204 2023-01-22 15:49:16.151809: step: 134/463, loss: 0.09454914182424545 2023-01-22 15:49:16.742574: step: 136/463, loss: 11.436644554138184 2023-01-22 15:49:17.383635: step: 138/463, loss: 0.16171963512897491 2023-01-22 15:49:17.965355: step: 140/463, loss: 0.11569420993328094 2023-01-22 15:49:18.525440: step: 142/463, loss: 
0.1380138248205185 2023-01-22 15:49:19.138631: step: 144/463, loss: 0.327555775642395 2023-01-22 15:49:19.746952: step: 146/463, loss: 0.0644770935177803 2023-01-22 15:49:20.320077: step: 148/463, loss: 0.18029683828353882 2023-01-22 15:49:20.888956: step: 150/463, loss: 0.05956190079450607 2023-01-22 15:49:21.495392: step: 152/463, loss: 0.24042275547981262 2023-01-22 15:49:22.076264: step: 154/463, loss: 0.4238482415676117 2023-01-22 15:49:22.667016: step: 156/463, loss: 0.15713714063167572 2023-01-22 15:49:23.266933: step: 158/463, loss: 0.04039907827973366 2023-01-22 15:49:23.920440: step: 160/463, loss: 0.12191184610128403 2023-01-22 15:49:24.497471: step: 162/463, loss: 0.1003834456205368 2023-01-22 15:49:25.103782: step: 164/463, loss: 0.14902736246585846 2023-01-22 15:49:25.702034: step: 166/463, loss: 0.08167887479066849 2023-01-22 15:49:26.326151: step: 168/463, loss: 0.3232946991920471 2023-01-22 15:49:26.938640: step: 170/463, loss: 1.5592657327651978 2023-01-22 15:49:27.581355: step: 172/463, loss: 0.09481150656938553 2023-01-22 15:49:28.192961: step: 174/463, loss: 0.14762264490127563 2023-01-22 15:49:28.818541: step: 176/463, loss: 0.059730175882577896 2023-01-22 15:49:29.453952: step: 178/463, loss: 0.0965738296508789 2023-01-22 15:49:30.072605: step: 180/463, loss: 0.667728841304779 2023-01-22 15:49:30.745915: step: 182/463, loss: 0.346400648355484 2023-01-22 15:49:31.382567: step: 184/463, loss: 0.18015865981578827 2023-01-22 15:49:32.153163: step: 186/463, loss: 0.15659375488758087 2023-01-22 15:49:32.742139: step: 188/463, loss: 0.13026641309261322 2023-01-22 15:49:33.336653: step: 190/463, loss: 0.4688052833080292 2023-01-22 15:49:33.931217: step: 192/463, loss: 0.3040282726287842 2023-01-22 15:49:34.537762: step: 194/463, loss: 0.049261413514614105 2023-01-22 15:49:35.166121: step: 196/463, loss: 0.32833805680274963 2023-01-22 15:49:35.774281: step: 198/463, loss: 0.07808633893728256 2023-01-22 15:49:36.349261: step: 200/463, loss: 
0.28209230303764343 2023-01-22 15:49:36.984603: step: 202/463, loss: 0.14251497387886047 2023-01-22 15:49:37.631776: step: 204/463, loss: 0.13557669520378113 2023-01-22 15:49:38.265185: step: 206/463, loss: 0.16784541308879852 2023-01-22 15:49:38.909645: step: 208/463, loss: 0.22367115318775177 2023-01-22 15:49:39.473535: step: 210/463, loss: 0.05246428772807121 2023-01-22 15:49:40.075756: step: 212/463, loss: 0.12239870429039001 2023-01-22 15:49:40.592251: step: 214/463, loss: 0.11478374898433685 2023-01-22 15:49:41.279885: step: 216/463, loss: 0.08280700445175171 2023-01-22 15:49:41.872670: step: 218/463, loss: 0.1322205662727356 2023-01-22 15:49:42.506724: step: 220/463, loss: 0.11871849000453949 2023-01-22 15:49:43.137812: step: 222/463, loss: 0.08739390224218369 2023-01-22 15:49:43.781858: step: 224/463, loss: 0.21918293833732605 2023-01-22 15:49:44.339570: step: 226/463, loss: 0.2690132260322571 2023-01-22 15:49:44.950040: step: 228/463, loss: 0.2665930986404419 2023-01-22 15:49:45.520275: step: 230/463, loss: 0.15006457269191742 2023-01-22 15:49:46.177877: step: 232/463, loss: 0.13221873342990875 2023-01-22 15:49:46.815593: step: 234/463, loss: 0.273303359746933 2023-01-22 15:49:47.456988: step: 236/463, loss: 0.2755372226238251 2023-01-22 15:49:48.036511: step: 238/463, loss: 0.18883420526981354 2023-01-22 15:49:48.627032: step: 240/463, loss: 0.1182723268866539 2023-01-22 15:49:49.222723: step: 242/463, loss: 2.2498815059661865 2023-01-22 15:49:49.836773: step: 244/463, loss: 0.1727784425020218 2023-01-22 15:49:50.446922: step: 246/463, loss: 0.1354474276304245 2023-01-22 15:49:51.090883: step: 248/463, loss: 0.07331226766109467 2023-01-22 15:49:51.733410: step: 250/463, loss: 0.049903471022844315 2023-01-22 15:49:52.367690: step: 252/463, loss: 0.27633264660835266 2023-01-22 15:49:52.973069: step: 254/463, loss: 0.1763925701379776 2023-01-22 15:49:53.560752: step: 256/463, loss: 0.2230779081583023 2023-01-22 15:49:54.175357: step: 258/463, loss: 
0.18155187368392944 2023-01-22 15:49:54.835362: step: 260/463, loss: 0.1982591450214386 2023-01-22 15:49:55.452772: step: 262/463, loss: 0.05888162553310394 2023-01-22 15:49:56.032035: step: 264/463, loss: 0.10446391999721527 2023-01-22 15:49:56.624700: step: 266/463, loss: 0.39118364453315735 2023-01-22 15:49:57.279573: step: 268/463, loss: 0.11228418350219727 2023-01-22 15:49:57.911095: step: 270/463, loss: 0.03592570498585701 2023-01-22 15:49:58.549961: step: 272/463, loss: 0.11017157882452011 2023-01-22 15:49:59.100768: step: 274/463, loss: 0.21150358021259308 2023-01-22 15:49:59.768586: step: 276/463, loss: 0.11916392296552658 2023-01-22 15:50:00.418500: step: 278/463, loss: 0.17495247721672058 2023-01-22 15:50:01.050338: step: 280/463, loss: 1.0055299997329712 2023-01-22 15:50:01.655332: step: 282/463, loss: 0.11420172452926636 2023-01-22 15:50:02.270241: step: 284/463, loss: 0.4013385474681854 2023-01-22 15:50:02.996479: step: 286/463, loss: 0.15077054500579834 2023-01-22 15:50:03.587230: step: 288/463, loss: 0.05620674416422844 2023-01-22 15:50:04.257460: step: 290/463, loss: 0.6424316167831421 2023-01-22 15:50:04.809295: step: 292/463, loss: 0.14537477493286133 2023-01-22 15:50:05.396976: step: 294/463, loss: 0.6141969561576843 2023-01-22 15:50:06.061060: step: 296/463, loss: 0.7530707120895386 2023-01-22 15:50:06.649936: step: 298/463, loss: 0.10310687124729156 2023-01-22 15:50:07.193970: step: 300/463, loss: 0.05188806354999542 2023-01-22 15:50:07.837491: step: 302/463, loss: 0.15703488886356354 2023-01-22 15:50:08.472375: step: 304/463, loss: 0.18566320836544037 2023-01-22 15:50:09.111647: step: 306/463, loss: 0.09109818190336227 2023-01-22 15:50:09.715933: step: 308/463, loss: 0.1266033947467804 2023-01-22 15:50:10.324036: step: 310/463, loss: 0.12085629999637604 2023-01-22 15:50:10.919942: step: 312/463, loss: 0.092064768075943 2023-01-22 15:50:11.486629: step: 314/463, loss: 0.06862680613994598 2023-01-22 15:50:12.117140: step: 316/463, loss: 
0.2066315859556198 2023-01-22 15:50:12.729546: step: 318/463, loss: 0.4782834053039551 2023-01-22 15:50:13.381818: step: 320/463, loss: 2.455515146255493 2023-01-22 15:50:13.987583: step: 322/463, loss: 0.08887369185686111 2023-01-22 15:50:14.575967: step: 324/463, loss: 0.11713621020317078 2023-01-22 15:50:15.133677: step: 326/463, loss: 0.1295708566904068 2023-01-22 15:50:15.703712: step: 328/463, loss: 0.07538467645645142 2023-01-22 15:50:16.314737: step: 330/463, loss: 0.18779556453227997 2023-01-22 15:50:16.970596: step: 332/463, loss: 0.1255943477153778 2023-01-22 15:50:17.565968: step: 334/463, loss: 0.6497032642364502 2023-01-22 15:50:18.236327: step: 336/463, loss: 0.32931289076805115 2023-01-22 15:50:18.825959: step: 338/463, loss: 0.09439008682966232 2023-01-22 15:50:19.422040: step: 340/463, loss: 0.06466024369001389 2023-01-22 15:50:20.025300: step: 342/463, loss: 0.19016246497631073 2023-01-22 15:50:20.682893: step: 344/463, loss: 0.19971472024917603 2023-01-22 15:50:21.259303: step: 346/463, loss: 0.18602041900157928 2023-01-22 15:50:21.951257: step: 348/463, loss: 0.09306205064058304 2023-01-22 15:50:22.592226: step: 350/463, loss: 0.1557772159576416 2023-01-22 15:50:23.284488: step: 352/463, loss: 0.045556750148534775 2023-01-22 15:50:23.976191: step: 354/463, loss: 0.3573235869407654 2023-01-22 15:50:24.522333: step: 356/463, loss: 0.08888664096593857 2023-01-22 15:50:25.220204: step: 358/463, loss: 0.04437635466456413 2023-01-22 15:50:25.855110: step: 360/463, loss: 0.10056637972593307 2023-01-22 15:50:26.462121: step: 362/463, loss: 0.0926920548081398 2023-01-22 15:50:27.024648: step: 364/463, loss: 0.04695814475417137 2023-01-22 15:50:27.658272: step: 366/463, loss: 0.07641035318374634 2023-01-22 15:50:28.239472: step: 368/463, loss: 0.14842566847801208 2023-01-22 15:50:28.847803: step: 370/463, loss: 1.2936654090881348 2023-01-22 15:50:29.519579: step: 372/463, loss: 0.835946798324585 2023-01-22 15:50:30.164031: step: 374/463, loss: 
0.11667069792747498 2023-01-22 15:50:30.765232: step: 376/463, loss: 0.06245206296443939 2023-01-22 15:50:31.364492: step: 378/463, loss: 0.13662712275981903 2023-01-22 15:50:32.075727: step: 380/463, loss: 0.4793098568916321 2023-01-22 15:50:32.649973: step: 382/463, loss: 0.111614890396595 2023-01-22 15:50:33.270343: step: 384/463, loss: 0.131459578871727 2023-01-22 15:50:33.926162: step: 386/463, loss: 0.05119938403367996 2023-01-22 15:50:34.577265: step: 388/463, loss: 0.049408622086048126 2023-01-22 15:50:35.192873: step: 390/463, loss: 0.07416415959596634 2023-01-22 15:50:35.812070: step: 392/463, loss: 0.11036046594381332 2023-01-22 15:50:36.459226: step: 394/463, loss: 0.16557873785495758 2023-01-22 15:50:37.065658: step: 396/463, loss: 0.4967389702796936 2023-01-22 15:50:37.740935: step: 398/463, loss: 0.058969512581825256 2023-01-22 15:50:38.458920: step: 400/463, loss: 1.3099561929702759 2023-01-22 15:50:39.052687: step: 402/463, loss: 0.08054342865943909 2023-01-22 15:50:39.701216: step: 404/463, loss: 0.07840954512357712 2023-01-22 15:50:40.302106: step: 406/463, loss: 0.15457703173160553 2023-01-22 15:50:40.900003: step: 408/463, loss: 0.2385210245847702 2023-01-22 15:50:41.557046: step: 410/463, loss: 0.3042565882205963 2023-01-22 15:50:42.150829: step: 412/463, loss: 0.19898365437984467 2023-01-22 15:50:42.785921: step: 414/463, loss: 0.2705179452896118 2023-01-22 15:50:43.444277: step: 416/463, loss: 0.10680816322565079 2023-01-22 15:50:44.095820: step: 418/463, loss: 0.15340305864810944 2023-01-22 15:50:44.670365: step: 420/463, loss: 0.038615792989730835 2023-01-22 15:50:45.281419: step: 422/463, loss: 0.03069743514060974 2023-01-22 15:50:45.849059: step: 424/463, loss: 0.15235330164432526 2023-01-22 15:50:46.445995: step: 426/463, loss: 3.277132749557495 2023-01-22 15:50:47.085749: step: 428/463, loss: 0.2866821885108948 2023-01-22 15:50:47.686939: step: 430/463, loss: 0.1565917730331421 2023-01-22 15:50:48.353627: step: 432/463, loss: 
0.20708094537258148 2023-01-22 15:50:48.954226: step: 434/463, loss: 0.13557101786136627 2023-01-22 15:50:49.584652: step: 436/463, loss: 0.10209204256534576 2023-01-22 15:50:50.218172: step: 438/463, loss: 0.03551975265145302 2023-01-22 15:50:50.815060: step: 440/463, loss: 0.26435375213623047 2023-01-22 15:50:51.481415: step: 442/463, loss: 0.2166479229927063 2023-01-22 15:50:52.042923: step: 444/463, loss: 0.11472812294960022 2023-01-22 15:50:52.594025: step: 446/463, loss: 0.046217337250709534 2023-01-22 15:50:53.210861: step: 448/463, loss: 0.07948990166187286 2023-01-22 15:50:53.798984: step: 450/463, loss: 0.05667305365204811 2023-01-22 15:50:54.395561: step: 452/463, loss: 0.17144590616226196 2023-01-22 15:50:55.010876: step: 454/463, loss: 0.124813973903656 2023-01-22 15:50:55.622390: step: 456/463, loss: 0.14591386914253235 2023-01-22 15:50:56.200571: step: 458/463, loss: 0.12007512152194977 2023-01-22 15:50:56.788943: step: 460/463, loss: 0.16247791051864624 2023-01-22 15:50:57.363667: step: 462/463, loss: 0.12132865935564041 2023-01-22 15:50:57.933981: step: 464/463, loss: 0.17197085916996002 2023-01-22 15:50:58.598263: step: 466/463, loss: 0.4977633059024811 2023-01-22 15:50:59.154650: step: 468/463, loss: 0.1137714833021164 2023-01-22 15:50:59.824218: step: 470/463, loss: 0.1037524864077568 2023-01-22 15:51:00.482145: step: 472/463, loss: 0.1466861069202423 2023-01-22 15:51:01.105361: step: 474/463, loss: 2.040036678314209 2023-01-22 15:51:01.682569: step: 476/463, loss: 0.3610437512397766 2023-01-22 15:51:02.267481: step: 478/463, loss: 0.11685654520988464 2023-01-22 15:51:02.891158: step: 480/463, loss: 0.029153384268283844 2023-01-22 15:51:03.487882: step: 482/463, loss: 0.10687181353569031 2023-01-22 15:51:04.167856: step: 484/463, loss: 0.07882768660783768 2023-01-22 15:51:04.874007: step: 486/463, loss: 0.22036617994308472 2023-01-22 15:51:05.531453: step: 488/463, loss: 0.13858413696289062 2023-01-22 15:51:06.167599: step: 490/463, loss: 
0.19485457241535187 2023-01-22 15:51:06.747074: step: 492/463, loss: 0.5633850693702698 2023-01-22 15:51:07.310482: step: 494/463, loss: 0.13537846505641937 2023-01-22 15:51:07.893881: step: 496/463, loss: 0.2506277859210968 2023-01-22 15:51:08.546381: step: 498/463, loss: 0.06884433329105377 2023-01-22 15:51:09.170884: step: 500/463, loss: 0.1178259328007698 2023-01-22 15:51:09.825654: step: 502/463, loss: 0.09327971935272217 2023-01-22 15:51:10.419152: step: 504/463, loss: 0.3796323835849762 2023-01-22 15:51:10.995661: step: 506/463, loss: 0.5609042048454285 2023-01-22 15:51:11.683208: step: 508/463, loss: 0.1328524649143219 2023-01-22 15:51:12.339710: step: 510/463, loss: 0.08789972960948944 2023-01-22 15:51:13.011308: step: 512/463, loss: 0.17137378454208374 2023-01-22 15:51:13.604438: step: 514/463, loss: 0.0942419022321701 2023-01-22 15:51:14.238336: step: 516/463, loss: 0.06321103125810623 2023-01-22 15:51:14.848713: step: 518/463, loss: 0.042588867247104645 2023-01-22 15:51:15.494808: step: 520/463, loss: 0.09414154291152954 2023-01-22 15:51:16.129459: step: 522/463, loss: 0.12195169925689697 2023-01-22 15:51:16.746922: step: 524/463, loss: 0.08791318535804749 2023-01-22 15:51:17.403652: step: 526/463, loss: 0.07200735807418823 2023-01-22 15:51:18.020806: step: 528/463, loss: 0.13725115358829498 2023-01-22 15:51:18.691785: step: 530/463, loss: 0.19681353867053986 2023-01-22 15:51:19.337375: step: 532/463, loss: 0.06435519456863403 2023-01-22 15:51:19.919723: step: 534/463, loss: 0.06020674854516983 2023-01-22 15:51:20.562793: step: 536/463, loss: 0.10913069546222687 2023-01-22 15:51:21.179674: step: 538/463, loss: 0.09244746714830399 2023-01-22 15:51:21.767186: step: 540/463, loss: 0.10088459402322769 2023-01-22 15:51:22.392659: step: 542/463, loss: 0.18608695268630981 2023-01-22 15:51:23.022369: step: 544/463, loss: 0.09219348430633545 2023-01-22 15:51:23.619069: step: 546/463, loss: 0.06271356344223022 2023-01-22 15:51:24.243276: step: 548/463, loss: 
0.05571884661912918 2023-01-22 15:51:24.920784: step: 550/463, loss: 0.1140749603509903 2023-01-22 15:51:25.558920: step: 552/463, loss: 0.21480242908000946 2023-01-22 15:51:26.204050: step: 554/463, loss: 0.47956418991088867 2023-01-22 15:51:26.875864: step: 556/463, loss: 0.288901150226593 2023-01-22 15:51:27.432221: step: 558/463, loss: 0.09892876446247101 2023-01-22 15:51:28.076332: step: 560/463, loss: 0.06249299272894859 2023-01-22 15:51:28.741654: step: 562/463, loss: 0.07808088511228561 2023-01-22 15:51:29.384218: step: 564/463, loss: 0.034613557159900665 2023-01-22 15:51:29.966946: step: 566/463, loss: 0.12289872020483017 2023-01-22 15:51:30.542482: step: 568/463, loss: 0.28217169642448425 2023-01-22 15:51:31.117521: step: 570/463, loss: 1.0774328708648682 2023-01-22 15:51:31.657315: step: 572/463, loss: 0.1251152753829956 2023-01-22 15:51:32.239005: step: 574/463, loss: 0.11428724974393845 2023-01-22 15:51:32.887660: step: 576/463, loss: 0.06019041687250137 2023-01-22 15:51:33.519418: step: 578/463, loss: 0.5559658408164978 2023-01-22 15:51:34.140190: step: 580/463, loss: 0.11094654351472855 2023-01-22 15:51:34.767797: step: 582/463, loss: 0.4058575928211212 2023-01-22 15:51:35.355884: step: 584/463, loss: 0.16985461115837097 2023-01-22 15:51:36.004780: step: 586/463, loss: 0.1628400683403015 2023-01-22 15:51:36.595030: step: 588/463, loss: 0.17865613102912903 2023-01-22 15:51:37.214376: step: 590/463, loss: 0.15804004669189453 2023-01-22 15:51:37.828763: step: 592/463, loss: 0.10518410801887512 2023-01-22 15:51:38.414103: step: 594/463, loss: 0.400827020406723 2023-01-22 15:51:39.046349: step: 596/463, loss: 0.13154636323451996 2023-01-22 15:51:39.720769: step: 598/463, loss: 0.2047145515680313 2023-01-22 15:51:40.353936: step: 600/463, loss: 0.026997607201337814 2023-01-22 15:51:40.972769: step: 602/463, loss: 0.18913529813289642 2023-01-22 15:51:41.616193: step: 604/463, loss: 0.20349766314029694 2023-01-22 15:51:42.188436: step: 606/463, loss: 
0.248092919588089 2023-01-22 15:51:42.797725: step: 608/463, loss: 1.3390507698059082 2023-01-22 15:51:43.460733: step: 610/463, loss: 0.8297772407531738 2023-01-22 15:51:44.074338: step: 612/463, loss: 0.046834852546453476 2023-01-22 15:51:44.667324: step: 614/463, loss: 0.15550518035888672 2023-01-22 15:51:45.357725: step: 616/463, loss: 0.3077329099178314 2023-01-22 15:51:45.985868: step: 618/463, loss: 0.322113037109375 2023-01-22 15:51:46.629737: step: 620/463, loss: 0.10847900062799454 2023-01-22 15:51:47.252235: step: 622/463, loss: 0.16469962894916534 2023-01-22 15:51:47.935019: step: 624/463, loss: 0.6569017171859741 2023-01-22 15:51:48.603952: step: 626/463, loss: 0.11301714181900024 2023-01-22 15:51:49.333256: step: 628/463, loss: 0.06796009093523026 2023-01-22 15:51:49.935309: step: 630/463, loss: 0.1826564073562622 2023-01-22 15:51:50.544681: step: 632/463, loss: 0.1905251294374466 2023-01-22 15:51:51.182855: step: 634/463, loss: 0.056017324328422546 2023-01-22 15:51:51.878703: step: 636/463, loss: 0.11843913048505783 2023-01-22 15:51:52.510298: step: 638/463, loss: 0.39750123023986816 2023-01-22 15:51:53.127829: step: 640/463, loss: 0.0960875004529953 2023-01-22 15:51:53.741575: step: 642/463, loss: 0.2190573662519455 2023-01-22 15:51:54.317118: step: 644/463, loss: 0.9889681339263916 2023-01-22 15:51:54.903078: step: 646/463, loss: 0.08479965478181839 2023-01-22 15:51:55.473387: step: 648/463, loss: 0.08075910061597824 2023-01-22 15:51:56.075687: step: 650/463, loss: 0.07180842012166977 2023-01-22 15:51:56.684983: step: 652/463, loss: 0.5454235076904297 2023-01-22 15:51:57.282113: step: 654/463, loss: 0.12177710980176926 2023-01-22 15:51:57.869685: step: 656/463, loss: 0.10010837763547897 2023-01-22 15:51:58.481603: step: 658/463, loss: 0.18917211890220642 2023-01-22 15:51:59.154638: step: 660/463, loss: 0.24335436522960663 2023-01-22 15:51:59.766096: step: 662/463, loss: 0.40236830711364746 2023-01-22 15:52:00.374348: step: 664/463, loss: 
0.3826402425765991 2023-01-22 15:52:01.035329: step: 666/463, loss: 0.10697450488805771 2023-01-22 15:52:01.606869: step: 668/463, loss: 0.08759381622076035 2023-01-22 15:52:02.216647: step: 670/463, loss: 0.6467235684394836 2023-01-22 15:52:02.774518: step: 672/463, loss: 0.3104308545589447 2023-01-22 15:52:03.403323: step: 674/463, loss: 0.20840275287628174 2023-01-22 15:52:04.018213: step: 676/463, loss: 0.11874065548181534 2023-01-22 15:52:04.602281: step: 678/463, loss: 0.062338367104530334 2023-01-22 15:52:05.145709: step: 680/463, loss: 0.12735766172409058 2023-01-22 15:52:05.789118: step: 682/463, loss: 0.2641005218029022 2023-01-22 15:52:06.458678: step: 684/463, loss: 0.43134355545043945 2023-01-22 15:52:07.179249: step: 686/463, loss: 0.07732436805963516 2023-01-22 15:52:07.933323: step: 688/463, loss: 0.18965910375118256 2023-01-22 15:52:08.570404: step: 690/463, loss: 0.12673477828502655 2023-01-22 15:52:09.183765: step: 692/463, loss: 0.022075973451137543 2023-01-22 15:52:09.775175: step: 694/463, loss: 0.03526413440704346 2023-01-22 15:52:10.393065: step: 696/463, loss: 0.08386117219924927 2023-01-22 15:52:10.986302: step: 698/463, loss: 0.3849274814128876 2023-01-22 15:52:11.617651: step: 700/463, loss: 0.04293429106473923 2023-01-22 15:52:12.278615: step: 702/463, loss: 0.15308654308319092 2023-01-22 15:52:12.879197: step: 704/463, loss: 0.03631807491183281 2023-01-22 15:52:13.471135: step: 706/463, loss: 0.0877344012260437 2023-01-22 15:52:14.056379: step: 708/463, loss: 0.0950545147061348 2023-01-22 15:52:14.660643: step: 710/463, loss: 3.013585090637207 2023-01-22 15:52:15.296715: step: 712/463, loss: 0.2090204358100891 2023-01-22 15:52:15.911715: step: 714/463, loss: 0.19892239570617676 2023-01-22 15:52:16.497055: step: 716/463, loss: 0.16003288328647614 2023-01-22 15:52:17.104788: step: 718/463, loss: 0.0838901475071907 2023-01-22 15:52:17.680913: step: 720/463, loss: 0.08273895829916 2023-01-22 15:52:18.284433: step: 722/463, loss: 
0.03586139902472496 2023-01-22 15:52:18.874726: step: 724/463, loss: 0.038860395550727844 2023-01-22 15:52:19.531738: step: 726/463, loss: 0.1511934995651245 2023-01-22 15:52:20.121968: step: 728/463, loss: 0.04311190918087959 2023-01-22 15:52:20.742952: step: 730/463, loss: 0.26674842834472656 2023-01-22 15:52:21.280612: step: 732/463, loss: 0.1552060842514038 2023-01-22 15:52:21.849898: step: 734/463, loss: 0.15326175093650818 2023-01-22 15:52:22.472863: step: 736/463, loss: 0.035149723291397095 2023-01-22 15:52:23.119680: step: 738/463, loss: 0.20921042561531067 2023-01-22 15:52:23.771498: step: 740/463, loss: 0.29525405168533325 2023-01-22 15:52:24.372511: step: 742/463, loss: 0.1712518334388733 2023-01-22 15:52:24.994304: step: 744/463, loss: 0.07051482051610947 2023-01-22 15:52:25.590242: step: 746/463, loss: 0.06408259272575378 2023-01-22 15:52:26.202896: step: 748/463, loss: 1.2105181217193604 2023-01-22 15:52:26.787775: step: 750/463, loss: 0.12165110558271408 2023-01-22 15:52:27.436579: step: 752/463, loss: 0.08526906371116638 2023-01-22 15:52:28.112564: step: 754/463, loss: 0.08251411467790604 2023-01-22 15:52:28.752106: step: 756/463, loss: 0.48402902483940125 2023-01-22 15:52:29.363804: step: 758/463, loss: 0.2341805249452591 2023-01-22 15:52:29.944193: step: 760/463, loss: 0.11187168955802917 2023-01-22 15:52:30.544514: step: 762/463, loss: 0.13485176861286163 2023-01-22 15:52:31.175553: step: 764/463, loss: 0.09754111617803574 2023-01-22 15:52:31.722470: step: 766/463, loss: 0.10498793423175812 2023-01-22 15:52:32.319824: step: 768/463, loss: 0.16940417885780334 2023-01-22 15:52:33.127681: step: 770/463, loss: 0.026495380327105522 2023-01-22 15:52:33.718721: step: 772/463, loss: 0.021905209869146347 2023-01-22 15:52:34.336511: step: 774/463, loss: 0.2635791599750519 2023-01-22 15:52:34.914934: step: 776/463, loss: 0.04806092381477356 2023-01-22 15:52:35.542129: step: 778/463, loss: 0.05991750955581665 2023-01-22 15:52:36.282513: step: 780/463, loss: 
0.5934916138648987 2023-01-22 15:52:37.015553: step: 782/463, loss: 0.4112907648086548 2023-01-22 15:52:37.645081: step: 784/463, loss: 0.11262604594230652 2023-01-22 15:52:38.284282: step: 786/463, loss: 0.21722553670406342 2023-01-22 15:52:38.919235: step: 788/463, loss: 0.08990461379289627 2023-01-22 15:52:39.520474: step: 790/463, loss: 0.02403474971652031 2023-01-22 15:52:40.127000: step: 792/463, loss: 2.3664467334747314 2023-01-22 15:52:40.797039: step: 794/463, loss: 0.28112301230430603 2023-01-22 15:52:41.404690: step: 796/463, loss: 0.043373651802539825 2023-01-22 15:52:42.007208: step: 798/463, loss: 0.24440644681453705 2023-01-22 15:52:42.719253: step: 800/463, loss: 0.10140793025493622 2023-01-22 15:52:43.298966: step: 802/463, loss: 0.11137749999761581 2023-01-22 15:52:43.931923: step: 804/463, loss: 0.1308003067970276 2023-01-22 15:52:44.538976: step: 806/463, loss: 0.5266055464744568 2023-01-22 15:52:45.170636: step: 808/463, loss: 0.7873648405075073 2023-01-22 15:52:45.775417: step: 810/463, loss: 0.11768036335706711 2023-01-22 15:52:46.337808: step: 812/463, loss: 0.12520816922187805 2023-01-22 15:52:46.949736: step: 814/463, loss: 0.13595344126224518 2023-01-22 15:52:47.446533: step: 816/463, loss: 0.1075884997844696 2023-01-22 15:52:47.993146: step: 818/463, loss: 0.6378896832466125 2023-01-22 15:52:48.604505: step: 820/463, loss: 0.15112581849098206 2023-01-22 15:52:49.286096: step: 822/463, loss: 0.143977552652359 2023-01-22 15:52:49.835730: step: 824/463, loss: 0.08646225184202194 2023-01-22 15:52:50.462053: step: 826/463, loss: 0.11122042685747147 2023-01-22 15:52:51.016705: step: 828/463, loss: 0.28062132000923157 2023-01-22 15:52:51.602984: step: 830/463, loss: 0.16655002534389496 2023-01-22 15:52:52.168382: step: 832/463, loss: 0.07993201166391373 2023-01-22 15:52:52.730799: step: 834/463, loss: 0.07382989674806595 2023-01-22 15:52:53.339590: step: 836/463, loss: 0.059531353414058685 2023-01-22 15:52:53.932650: step: 838/463, loss: 
0.11709504574537277 2023-01-22 15:52:54.569534: step: 840/463, loss: 0.10541233420372009 2023-01-22 15:52:55.127966: step: 842/463, loss: 0.06629659980535507 2023-01-22 15:52:55.771193: step: 844/463, loss: 1.1641117334365845 2023-01-22 15:52:56.353526: step: 846/463, loss: 0.09962944686412811 2023-01-22 15:52:56.979026: step: 848/463, loss: 0.04313936457037926 2023-01-22 15:52:57.617348: step: 850/463, loss: 0.12146587669849396 2023-01-22 15:52:58.244913: step: 852/463, loss: 0.06336131691932678 2023-01-22 15:52:58.887864: step: 854/463, loss: 0.1598680466413498 2023-01-22 15:52:59.499756: step: 856/463, loss: 0.1951095163822174 2023-01-22 15:53:00.032019: step: 858/463, loss: 0.17447355389595032 2023-01-22 15:53:00.640331: step: 860/463, loss: 0.2632284462451935 2023-01-22 15:53:01.334375: step: 862/463, loss: 0.106203593313694 2023-01-22 15:53:01.882140: step: 864/463, loss: 0.021640464663505554 2023-01-22 15:53:02.545660: step: 866/463, loss: 0.2745489180088043 2023-01-22 15:53:03.245880: step: 868/463, loss: 0.4149942696094513 2023-01-22 15:53:03.863490: step: 870/463, loss: 0.1331532895565033 2023-01-22 15:53:04.467368: step: 872/463, loss: 0.06813172996044159 2023-01-22 15:53:05.203897: step: 874/463, loss: 0.06313250213861465 2023-01-22 15:53:05.834606: step: 876/463, loss: 0.10989562422037125 2023-01-22 15:53:06.359079: step: 878/463, loss: 0.14949096739292145 2023-01-22 15:53:06.962641: step: 880/463, loss: 0.16397687792778015 2023-01-22 15:53:07.620174: step: 882/463, loss: 0.2599761188030243 2023-01-22 15:53:08.232166: step: 884/463, loss: 0.08316408842802048 2023-01-22 15:53:08.841358: step: 886/463, loss: 0.3813524544239044 2023-01-22 15:53:09.390353: step: 888/463, loss: 0.059639159590005875 2023-01-22 15:53:09.976808: step: 890/463, loss: 0.030805883929133415 2023-01-22 15:53:10.578178: step: 892/463, loss: 0.17491088807582855 2023-01-22 15:53:11.171308: step: 894/463, loss: 0.34603357315063477 2023-01-22 15:53:11.767567: step: 896/463, loss: 
0.03985508531332016
2023-01-22 15:53:12.369978: step: 898/463, loss: 0.046755533665418625
2023-01-22 15:53:12.956455: step: 900/463, loss: 0.09123609960079193
2023-01-22 15:53:13.548453: step: 902/463, loss: 0.12473348528146744
2023-01-22 15:53:14.205599: step: 904/463, loss: 0.18592266738414764
2023-01-22 15:53:14.824882: step: 906/463, loss: 0.09335862100124359
2023-01-22 15:53:15.413691: step: 908/463, loss: 0.6929754018783569
2023-01-22 15:53:16.039080: step: 910/463, loss: 0.11228056252002716
2023-01-22 15:53:16.618334: step: 912/463, loss: 0.7811570167541504
2023-01-22 15:53:17.250109: step: 914/463, loss: 0.15852053463459015
2023-01-22 15:53:17.832083: step: 916/463, loss: 0.22059768438339233
2023-01-22 15:53:18.433323: step: 918/463, loss: 0.06329476088285446
2023-01-22 15:53:19.058964: step: 920/463, loss: 0.2752413749694824
2023-01-22 15:53:19.728103: step: 922/463, loss: 0.0879533514380455
2023-01-22 15:53:20.391154: step: 924/463, loss: 0.9679999947547913
2023-01-22 15:53:20.990099: step: 926/463, loss: 0.027774294838309288
==================================================
Loss: 0.260
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2857077084977638, 'r': 0.2943819463269179, 'f1': 0.28997997329773034}, 'combined': 0.21366945400885393, 'epoch': 11}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.33830651938339856, 'r': 0.28900705277168165, 'f1': 0.31171960703656204}, 'combined': 0.22132092099595904, 'epoch': 11}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29229077825159916, 'r': 0.29728246137164543, 'f1': 0.29476548850960893}, 'combined': 0.21719562311234342, 'epoch': 11}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36189253213247025, 'r': 0.2879982454666779, 'f1': 0.32074440165676665}, 'combined': 0.22772852517630432, 'epoch': 11}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3106060606060606, 'r': 0.44565217391304346, 'f1': 0.3660714285714286}, 'combined': 0.1830357142857143, 'epoch': 11}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4364035087719298, 'r': 0.18058076225045372, 'f1': 0.2554557124518614}, 'combined': 0.17030380830124092, 'epoch': 11}
New best chinese model...
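Aside on reading these summaries: the logged 'combined' value is consistent with the product of the template F1 and slot F1 (e.g. Dev Chinese, epoch 11: 0.7368421… × 0.2926585… ≈ 0.2156431…), with each F1 being the usual harmonic mean of the logged 'p' and 'r'. The sketch below reproduces that arithmetic; the function names are illustrative and not taken from train.py.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall, as in the logged 'f1' fields."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def combined_score(template_f1: float, slot_f1: float) -> float:
    """The logged 'combined' value matches template_f1 * slot_f1
    (assumption inferred from the numbers in this log, not from train.py)."""
    return template_f1 * slot_f1

# Check against the Dev Chinese metrics for epoch 11:
assert abs(f1(1.0, 0.5833333333333334) - 0.7368421052631579) < 1e-9
assert abs(combined_score(0.7368421052631579, 0.29265854627300414)
           - 0.21564313935905566) < 1e-9
```

This also explains why a perfect template precision with low template recall still caps the combined score well below the slot F1.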
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
******************************
Epoch: 12
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:55:58.625621: step: 2/463, loss: 0.0861300602555275
2023-01-22 15:55:59.231396: step: 4/463, loss: 0.392968088388443
2023-01-22 15:55:59.784193: step: 6/463, loss: 0.07516488432884216
2023-01-22 15:56:00.452326: step: 8/463, loss: 0.9423526525497437
2023-01-22 15:56:01.082185: step: 10/463, loss: 0.03942079097032547
2023-01-22 15:56:01.728440: step: 12/463, loss: 0.0900811105966568
2023-01-22 15:56:02.322131: step: 14/463, loss: 0.2147255688905716
2023-01-22 15:56:02.899354: step: 16/463, loss: 0.10927651077508926
2023-01-22 15:56:03.475931: step: 18/463, loss: 0.12038875371217728
2023-01-22 15:56:04.077270: step: 20/463, loss: 0.2500070333480835
2023-01-22 15:56:04.705689: step: 22/463, loss: 0.11153824627399445
2023-01-22 15:56:05.312169: step: 24/463, loss: 0.1341220885515213
2023-01-22 15:56:05.875539: step: 26/463, loss: 0.029449567198753357
2023-01-22 15:56:06.542139: step: 28/463, loss: 0.07029224932193756
2023-01-22 15:56:07.074920: step: 30/463, loss: 0.04447728395462036
2023-01-22 15:56:07.708504: step: 32/463, loss: 0.11460871249437332
2023-01-22 15:56:08.326589: step: 34/463, loss: 0.09816394001245499
2023-01-22 15:56:08.909412: step: 36/463, loss: 0.03078196570277214
2023-01-22 15:56:09.511207: step: 38/463, loss: 0.06119254231452942
2023-01-22 15:56:10.082516: step: 40/463, loss: 0.3786177635192871
2023-01-22 15:56:10.721054: step: 42/463, loss: 0.11001528054475784
2023-01-22 15:56:11.343474: step: 44/463, loss: 0.15459610521793365
2023-01-22 15:56:11.968311: step: 46/463, loss: 0.34724149107933044
2023-01-22 15:56:12.587106: step: 48/463, loss: 0.04751883074641228
2023-01-22 15:56:13.208234: step: 
50/463, loss: 2.5044608116149902 2023-01-22 15:56:13.876041: step: 52/463, loss: 0.13847512006759644 2023-01-22 15:56:14.460520: step: 54/463, loss: 0.11425051093101501 2023-01-22 15:56:15.051475: step: 56/463, loss: 0.05636635050177574 2023-01-22 15:56:15.623747: step: 58/463, loss: 0.12301263958215714 2023-01-22 15:56:16.202287: step: 60/463, loss: 0.10562190413475037 2023-01-22 15:56:16.807638: step: 62/463, loss: 0.09519854933023453 2023-01-22 15:56:17.411465: step: 64/463, loss: 0.6992284655570984 2023-01-22 15:56:18.047542: step: 66/463, loss: 0.10098081082105637 2023-01-22 15:56:18.644214: step: 68/463, loss: 0.0506458543241024 2023-01-22 15:56:19.286457: step: 70/463, loss: 0.08689767122268677 2023-01-22 15:56:19.934036: step: 72/463, loss: 1.113677978515625 2023-01-22 15:56:20.509189: step: 74/463, loss: 0.04527551308274269 2023-01-22 15:56:21.159625: step: 76/463, loss: 0.06913824379444122 2023-01-22 15:56:21.907827: step: 78/463, loss: 0.14081479609012604 2023-01-22 15:56:22.428742: step: 80/463, loss: 0.006058207247406244 2023-01-22 15:56:23.043892: step: 82/463, loss: 0.3920303285121918 2023-01-22 15:56:23.643424: step: 84/463, loss: 0.08975867182016373 2023-01-22 15:56:24.218586: step: 86/463, loss: 0.3274294435977936 2023-01-22 15:56:24.834639: step: 88/463, loss: 0.5766287446022034 2023-01-22 15:56:25.414206: step: 90/463, loss: 0.15913768112659454 2023-01-22 15:56:25.990743: step: 92/463, loss: 0.00908822100609541 2023-01-22 15:56:26.577581: step: 94/463, loss: 0.02873828262090683 2023-01-22 15:56:27.235437: step: 96/463, loss: 0.10611401498317719 2023-01-22 15:56:27.871486: step: 98/463, loss: 0.43153154850006104 2023-01-22 15:56:28.490496: step: 100/463, loss: 0.14611750841140747 2023-01-22 15:56:29.151275: step: 102/463, loss: 0.05124979466199875 2023-01-22 15:56:29.725526: step: 104/463, loss: 0.10868179053068161 2023-01-22 15:56:30.329008: step: 106/463, loss: 0.0331915020942688 2023-01-22 15:56:30.928175: step: 108/463, loss: 
0.2829020917415619 2023-01-22 15:56:31.596205: step: 110/463, loss: 0.25730621814727783 2023-01-22 15:56:32.238069: step: 112/463, loss: 0.03875773772597313 2023-01-22 15:56:32.886462: step: 114/463, loss: 0.9396823644638062 2023-01-22 15:56:33.541627: step: 116/463, loss: 0.11610925197601318 2023-01-22 15:56:34.162322: step: 118/463, loss: 0.11845576018095016 2023-01-22 15:56:34.731185: step: 120/463, loss: 0.023275572806596756 2023-01-22 15:56:35.320159: step: 122/463, loss: 0.07260771095752716 2023-01-22 15:56:35.904732: step: 124/463, loss: 0.03091038577258587 2023-01-22 15:56:36.516945: step: 126/463, loss: 0.09225587546825409 2023-01-22 15:56:37.090590: step: 128/463, loss: 0.11037386953830719 2023-01-22 15:56:37.680178: step: 130/463, loss: 0.09758221358060837 2023-01-22 15:56:38.261148: step: 132/463, loss: 0.5677353739738464 2023-01-22 15:56:38.864259: step: 134/463, loss: 0.08946246653795242 2023-01-22 15:56:39.512655: step: 136/463, loss: 0.04699913412332535 2023-01-22 15:56:40.160318: step: 138/463, loss: 0.11516892164945602 2023-01-22 15:56:40.792168: step: 140/463, loss: 0.08148377388715744 2023-01-22 15:56:41.384253: step: 142/463, loss: 0.038631416857242584 2023-01-22 15:56:41.987958: step: 144/463, loss: 0.07037857174873352 2023-01-22 15:56:42.625070: step: 146/463, loss: 0.021262533962726593 2023-01-22 15:56:43.191735: step: 148/463, loss: 0.18214856088161469 2023-01-22 15:56:43.848318: step: 150/463, loss: 0.501178503036499 2023-01-22 15:56:44.473313: step: 152/463, loss: 0.10307686030864716 2023-01-22 15:56:45.079878: step: 154/463, loss: 0.06982319802045822 2023-01-22 15:56:45.702183: step: 156/463, loss: 0.5213944315910339 2023-01-22 15:56:46.311241: step: 158/463, loss: 0.09707575291395187 2023-01-22 15:56:46.944184: step: 160/463, loss: 0.08850989490747452 2023-01-22 15:56:47.601514: step: 162/463, loss: 0.3117218613624573 2023-01-22 15:56:48.198328: step: 164/463, loss: 0.08529256284236908 2023-01-22 15:56:48.812431: step: 166/463, loss: 
0.10052064061164856 2023-01-22 15:56:49.457147: step: 168/463, loss: 0.2278415709733963 2023-01-22 15:56:50.088868: step: 170/463, loss: 0.15734164416790009 2023-01-22 15:56:50.755658: step: 172/463, loss: 0.09187552332878113 2023-01-22 15:56:51.344000: step: 174/463, loss: 7.01143741607666 2023-01-22 15:56:51.895374: step: 176/463, loss: 0.046210866421461105 2023-01-22 15:56:52.536564: step: 178/463, loss: 0.16678856313228607 2023-01-22 15:56:53.162758: step: 180/463, loss: 0.08144627511501312 2023-01-22 15:56:53.767291: step: 182/463, loss: 0.19307918846607208 2023-01-22 15:56:54.368177: step: 184/463, loss: 0.052044034004211426 2023-01-22 15:56:54.954915: step: 186/463, loss: 0.07335599511861801 2023-01-22 15:56:55.598817: step: 188/463, loss: 0.3661571443080902 2023-01-22 15:56:56.174317: step: 190/463, loss: 0.11067470163106918 2023-01-22 15:56:56.786094: step: 192/463, loss: 0.1257559359073639 2023-01-22 15:56:57.480276: step: 194/463, loss: 0.03967934101819992 2023-01-22 15:56:58.117927: step: 196/463, loss: 0.10816127806901932 2023-01-22 15:56:58.697039: step: 198/463, loss: 0.057704199105501175 2023-01-22 15:56:59.308505: step: 200/463, loss: 0.08288305997848511 2023-01-22 15:56:59.840905: step: 202/463, loss: 0.27915018796920776 2023-01-22 15:57:00.490237: step: 204/463, loss: 0.5615749955177307 2023-01-22 15:57:01.047157: step: 206/463, loss: 0.09222240746021271 2023-01-22 15:57:01.715024: step: 208/463, loss: 0.3562775254249573 2023-01-22 15:57:02.333841: step: 210/463, loss: 0.06718975305557251 2023-01-22 15:57:02.974851: step: 212/463, loss: 0.04610195755958557 2023-01-22 15:57:03.634578: step: 214/463, loss: 0.03996937349438667 2023-01-22 15:57:04.306411: step: 216/463, loss: 0.08180630207061768 2023-01-22 15:57:04.949796: step: 218/463, loss: 0.04338874667882919 2023-01-22 15:57:05.556932: step: 220/463, loss: 0.028971415013074875 2023-01-22 15:57:06.171586: step: 222/463, loss: 0.04795433580875397 2023-01-22 15:57:06.776428: step: 224/463, loss: 
0.1831204742193222 2023-01-22 15:57:07.408607: step: 226/463, loss: 0.101833276450634 2023-01-22 15:57:07.985311: step: 228/463, loss: 0.32132524251937866 2023-01-22 15:57:08.631517: step: 230/463, loss: 0.03637580946087837 2023-01-22 15:57:09.233566: step: 232/463, loss: 0.5078372359275818 2023-01-22 15:57:09.849630: step: 234/463, loss: 0.5256062746047974 2023-01-22 15:57:10.416029: step: 236/463, loss: 0.01959248073399067 2023-01-22 15:57:11.066123: step: 238/463, loss: 0.08161444962024689 2023-01-22 15:57:11.626784: step: 240/463, loss: 0.11545759439468384 2023-01-22 15:57:12.199244: step: 242/463, loss: 0.047576628625392914 2023-01-22 15:57:12.790889: step: 244/463, loss: 0.08924965560436249 2023-01-22 15:57:13.396748: step: 246/463, loss: 0.1281774491071701 2023-01-22 15:57:13.988926: step: 248/463, loss: 0.3017335832118988 2023-01-22 15:57:14.538526: step: 250/463, loss: 0.09354396164417267 2023-01-22 15:57:15.127063: step: 252/463, loss: 0.7648566365242004 2023-01-22 15:57:15.763290: step: 254/463, loss: 0.09809574484825134 2023-01-22 15:57:16.382811: step: 256/463, loss: 0.5920016169548035 2023-01-22 15:57:17.002622: step: 258/463, loss: 0.085484579205513 2023-01-22 15:57:17.624597: step: 260/463, loss: 0.14312145113945007 2023-01-22 15:57:18.251009: step: 262/463, loss: 0.10994184017181396 2023-01-22 15:57:18.864784: step: 264/463, loss: 0.06164884567260742 2023-01-22 15:57:19.456897: step: 266/463, loss: 0.09198685735464096 2023-01-22 15:57:20.053927: step: 268/463, loss: 0.11088503897190094 2023-01-22 15:57:20.598677: step: 270/463, loss: 0.07385529577732086 2023-01-22 15:57:21.307953: step: 272/463, loss: 0.14000408351421356 2023-01-22 15:57:21.912524: step: 274/463, loss: 0.1303405612707138 2023-01-22 15:57:22.577708: step: 276/463, loss: 0.8127589821815491 2023-01-22 15:57:23.162923: step: 278/463, loss: 0.1876649260520935 2023-01-22 15:57:23.826498: step: 280/463, loss: 0.22236262261867523 2023-01-22 15:57:24.502193: step: 282/463, loss: 
0.0396820493042469 2023-01-22 15:57:25.117248: step: 284/463, loss: 0.1226588636636734 2023-01-22 15:57:25.727835: step: 286/463, loss: 0.11205204576253891 2023-01-22 15:57:26.277742: step: 288/463, loss: 0.16303880512714386 2023-01-22 15:57:26.888902: step: 290/463, loss: 0.12113070487976074 2023-01-22 15:57:27.435046: step: 292/463, loss: 0.07036332041025162 2023-01-22 15:57:28.058492: step: 294/463, loss: 0.7480481863021851 2023-01-22 15:57:28.616032: step: 296/463, loss: 0.37319570779800415 2023-01-22 15:57:29.246367: step: 298/463, loss: 0.09122464060783386 2023-01-22 15:57:29.874990: step: 300/463, loss: 0.157090425491333 2023-01-22 15:57:30.500278: step: 302/463, loss: 0.05927928164601326 2023-01-22 15:57:31.068036: step: 304/463, loss: 0.03768585994839668 2023-01-22 15:57:31.681472: step: 306/463, loss: 0.08219924569129944 2023-01-22 15:57:32.344379: step: 308/463, loss: 0.16315336525440216 2023-01-22 15:57:32.954082: step: 310/463, loss: 0.1380206197500229 2023-01-22 15:57:33.583521: step: 312/463, loss: 0.2605874836444855 2023-01-22 15:57:34.212333: step: 314/463, loss: 0.11040909588336945 2023-01-22 15:57:34.762201: step: 316/463, loss: 0.14617910981178284 2023-01-22 15:57:35.375817: step: 318/463, loss: 0.06863058358430862 2023-01-22 15:57:36.034880: step: 320/463, loss: 0.11930922418832779 2023-01-22 15:57:36.639157: step: 322/463, loss: 0.13474254310131073 2023-01-22 15:57:37.214077: step: 324/463, loss: 0.04326383396983147 2023-01-22 15:57:37.828353: step: 326/463, loss: 0.08329316228628159 2023-01-22 15:57:38.424649: step: 328/463, loss: 0.8404079675674438 2023-01-22 15:57:38.986725: step: 330/463, loss: 0.11217956990003586 2023-01-22 15:57:39.649856: step: 332/463, loss: 0.13645195960998535 2023-01-22 15:57:40.244384: step: 334/463, loss: 0.08164163678884506 2023-01-22 15:57:40.938049: step: 336/463, loss: 0.14852389693260193 2023-01-22 15:57:41.549457: step: 338/463, loss: 0.1751575767993927 2023-01-22 15:57:42.215923: step: 340/463, loss: 
0.04990030452609062 2023-01-22 15:57:42.868718: step: 342/463, loss: 0.09936562925577164 2023-01-22 15:57:43.490356: step: 344/463, loss: 0.7530326247215271 2023-01-22 15:57:44.093838: step: 346/463, loss: 0.130561962723732 2023-01-22 15:57:44.831566: step: 348/463, loss: 0.07336652278900146 2023-01-22 15:57:45.404105: step: 350/463, loss: 0.033001404255628586 2023-01-22 15:57:46.010896: step: 352/463, loss: 0.203648641705513 2023-01-22 15:57:46.618126: step: 354/463, loss: 0.29041168093681335 2023-01-22 15:57:47.225039: step: 356/463, loss: 0.07557806372642517 2023-01-22 15:57:47.848685: step: 358/463, loss: 0.3943725824356079 2023-01-22 15:57:48.499816: step: 360/463, loss: 0.7526627779006958 2023-01-22 15:57:49.113152: step: 362/463, loss: 0.12612439692020416 2023-01-22 15:57:49.749393: step: 364/463, loss: 0.8005114197731018 2023-01-22 15:57:50.354797: step: 366/463, loss: 0.09585773944854736 2023-01-22 15:57:50.953622: step: 368/463, loss: 0.07972254604101181 2023-01-22 15:57:51.540909: step: 370/463, loss: 0.12893284857273102 2023-01-22 15:57:52.081529: step: 372/463, loss: 0.14670437574386597 2023-01-22 15:57:52.718315: step: 374/463, loss: 0.2899644672870636 2023-01-22 15:57:53.328960: step: 376/463, loss: 0.16531439125537872 2023-01-22 15:57:54.027780: step: 378/463, loss: 0.11221973598003387 2023-01-22 15:57:54.624837: step: 380/463, loss: 0.02413300611078739 2023-01-22 15:57:55.162313: step: 382/463, loss: 0.3187316060066223 2023-01-22 15:57:55.762844: step: 384/463, loss: 0.08577956259250641 2023-01-22 15:57:56.399263: step: 386/463, loss: 0.3170589506626129 2023-01-22 15:57:57.075694: step: 388/463, loss: 0.07505723834037781 2023-01-22 15:57:57.641880: step: 390/463, loss: 0.301029771566391 2023-01-22 15:57:58.251201: step: 392/463, loss: 0.22670979797840118 2023-01-22 15:57:58.823656: step: 394/463, loss: 0.2627876102924347 2023-01-22 15:57:59.441168: step: 396/463, loss: 0.1508498191833496 2023-01-22 15:58:00.024496: step: 398/463, loss: 
0.1429576575756073 2023-01-22 15:58:00.667936: step: 400/463, loss: 0.2075728476047516 2023-01-22 15:58:01.329409: step: 402/463, loss: 0.105370432138443 2023-01-22 15:58:01.905508: step: 404/463, loss: 0.03335290774703026 2023-01-22 15:58:02.484663: step: 406/463, loss: 0.05270611494779587 2023-01-22 15:58:03.120267: step: 408/463, loss: 2.668886184692383 2023-01-22 15:58:03.747533: step: 410/463, loss: 0.32154756784439087 2023-01-22 15:58:04.414517: step: 412/463, loss: 0.035651061683893204 2023-01-22 15:58:05.062095: step: 414/463, loss: 5.1933183670043945 2023-01-22 15:58:05.638488: step: 416/463, loss: 0.09853220731019974 2023-01-22 15:58:06.240909: step: 418/463, loss: 0.0796646773815155 2023-01-22 15:58:06.948734: step: 420/463, loss: 0.29431986808776855 2023-01-22 15:58:07.600529: step: 422/463, loss: 0.051944028586149216 2023-01-22 15:58:08.234612: step: 424/463, loss: 0.06268949061632156 2023-01-22 15:58:08.826920: step: 426/463, loss: 0.04561392217874527 2023-01-22 15:58:09.439838: step: 428/463, loss: 0.09708326309919357 2023-01-22 15:58:10.004500: step: 430/463, loss: 0.029536819085478783 2023-01-22 15:58:10.643231: step: 432/463, loss: 0.15796515345573425 2023-01-22 15:58:11.298458: step: 434/463, loss: 0.04496566578745842 2023-01-22 15:58:11.924416: step: 436/463, loss: 0.10133031755685806 2023-01-22 15:58:12.563618: step: 438/463, loss: 0.1608199179172516 2023-01-22 15:58:13.150332: step: 440/463, loss: 0.09890472143888474 2023-01-22 15:58:13.762707: step: 442/463, loss: 1.2354098558425903 2023-01-22 15:58:14.320487: step: 444/463, loss: 0.0873207077383995 2023-01-22 15:58:14.914772: step: 446/463, loss: 0.2660064101219177 2023-01-22 15:58:15.471965: step: 448/463, loss: 0.20531857013702393 2023-01-22 15:58:16.094323: step: 450/463, loss: 0.040446486324071884 2023-01-22 15:58:16.696335: step: 452/463, loss: 0.12020596861839294 2023-01-22 15:58:17.285970: step: 454/463, loss: 0.12804917991161346 2023-01-22 15:58:17.858137: step: 456/463, loss: 
0.3558512032032013 2023-01-22 15:58:18.503853: step: 458/463, loss: 0.04215586185455322 2023-01-22 15:58:19.093960: step: 460/463, loss: 0.1493343710899353 2023-01-22 15:58:19.716784: step: 462/463, loss: 0.337640643119812 2023-01-22 15:58:20.287158: step: 464/463, loss: 0.6500495672225952 2023-01-22 15:58:20.899826: step: 466/463, loss: 0.10743508487939835 2023-01-22 15:58:21.468637: step: 468/463, loss: 0.31881073117256165 2023-01-22 15:58:22.083386: step: 470/463, loss: 0.06070874631404877 2023-01-22 15:58:22.668140: step: 472/463, loss: 0.13262760639190674 2023-01-22 15:58:23.260675: step: 474/463, loss: 0.6674783229827881 2023-01-22 15:58:23.856858: step: 476/463, loss: 0.08671616017818451 2023-01-22 15:58:24.522525: step: 478/463, loss: 0.21028344333171844 2023-01-22 15:58:25.121057: step: 480/463, loss: 0.07695206254720688 2023-01-22 15:58:25.669352: step: 482/463, loss: 0.14769043028354645 2023-01-22 15:58:26.302780: step: 484/463, loss: 0.252589613199234 2023-01-22 15:58:26.914628: step: 486/463, loss: 0.31464847922325134 2023-01-22 15:58:27.511975: step: 488/463, loss: 0.20300737023353577 2023-01-22 15:58:28.087025: step: 490/463, loss: 0.15124380588531494 2023-01-22 15:58:28.644499: step: 492/463, loss: 0.05633742734789848 2023-01-22 15:58:29.205969: step: 494/463, loss: 0.2579616904258728 2023-01-22 15:58:29.900397: step: 496/463, loss: 0.03908820450305939 2023-01-22 15:58:30.518880: step: 498/463, loss: 0.48307836055755615 2023-01-22 15:58:31.132039: step: 500/463, loss: 0.08387695252895355 2023-01-22 15:58:31.773586: step: 502/463, loss: 0.16009695827960968 2023-01-22 15:58:32.360546: step: 504/463, loss: 0.10748320072889328 2023-01-22 15:58:33.016670: step: 506/463, loss: 0.11239360272884369 2023-01-22 15:58:33.605762: step: 508/463, loss: 0.06943980604410172 2023-01-22 15:58:34.179223: step: 510/463, loss: 0.13349099457263947 2023-01-22 15:58:34.957631: step: 512/463, loss: 0.12051279842853546 2023-01-22 15:58:35.561377: step: 514/463, loss: 
0.10275838524103165 2023-01-22 15:58:36.137438: step: 516/463, loss: 0.05324134975671768 2023-01-22 15:58:36.741064: step: 518/463, loss: 0.33309122920036316 2023-01-22 15:58:37.395896: step: 520/463, loss: 0.06453830003738403 2023-01-22 15:58:38.037950: step: 522/463, loss: 0.1594284027814865 2023-01-22 15:58:38.628848: step: 524/463, loss: 0.04267764091491699 2023-01-22 15:58:39.245685: step: 526/463, loss: 0.14498993754386902 2023-01-22 15:58:39.853024: step: 528/463, loss: 0.42583170533180237 2023-01-22 15:58:40.516426: step: 530/463, loss: 0.1731795221567154 2023-01-22 15:58:41.135920: step: 532/463, loss: 0.07455648481845856 2023-01-22 15:58:41.706073: step: 534/463, loss: 0.21733824908733368 2023-01-22 15:58:42.290111: step: 536/463, loss: 0.2760898172855377 2023-01-22 15:58:42.887538: step: 538/463, loss: 0.11627143621444702 2023-01-22 15:58:43.543978: step: 540/463, loss: 0.3335151970386505 2023-01-22 15:58:44.177824: step: 542/463, loss: 0.13885654509067535 2023-01-22 15:58:44.803617: step: 544/463, loss: 0.2727121114730835 2023-01-22 15:58:45.408086: step: 546/463, loss: 0.04346305876970291 2023-01-22 15:58:45.953868: step: 548/463, loss: 0.1564708650112152 2023-01-22 15:58:46.597725: step: 550/463, loss: 0.3722462058067322 2023-01-22 15:58:47.216924: step: 552/463, loss: 0.03801852837204933 2023-01-22 15:58:47.835752: step: 554/463, loss: 0.3872503638267517 2023-01-22 15:58:48.401302: step: 556/463, loss: 0.08052097260951996 2023-01-22 15:58:49.019187: step: 558/463, loss: 0.06935181468725204 2023-01-22 15:58:49.748625: step: 560/463, loss: 0.07895726710557938 2023-01-22 15:58:50.478610: step: 562/463, loss: 0.2057817280292511 2023-01-22 15:58:51.075538: step: 564/463, loss: 0.08846582472324371 2023-01-22 15:58:51.670379: step: 566/463, loss: 0.05468550696969032 2023-01-22 15:58:52.234412: step: 568/463, loss: 0.21517720818519592 2023-01-22 15:58:52.815841: step: 570/463, loss: 0.5374999046325684 2023-01-22 15:58:53.383396: step: 572/463, loss: 
0.02009352669119835 2023-01-22 15:58:54.018874: step: 574/463, loss: 0.04121518135070801 2023-01-22 15:58:54.691569: step: 576/463, loss: 0.33559802174568176 2023-01-22 15:58:55.296293: step: 578/463, loss: 0.18520048260688782 2023-01-22 15:58:55.926955: step: 580/463, loss: 0.09580789506435394 2023-01-22 15:58:56.497397: step: 582/463, loss: 0.03978876769542694 2023-01-22 15:58:57.125306: step: 584/463, loss: 0.37242352962493896 2023-01-22 15:58:57.773175: step: 586/463, loss: 0.12584541738033295 2023-01-22 15:58:58.436554: step: 588/463, loss: 0.08844942599534988 2023-01-22 15:58:59.035722: step: 590/463, loss: 0.1319173276424408 2023-01-22 15:58:59.815367: step: 592/463, loss: 0.11733192950487137 2023-01-22 15:59:00.460337: step: 594/463, loss: 0.12338288873434067 2023-01-22 15:59:01.102700: step: 596/463, loss: 0.1313720941543579 2023-01-22 15:59:01.732857: step: 598/463, loss: 0.20948491990566254 2023-01-22 15:59:02.272884: step: 600/463, loss: 0.1684347242116928 2023-01-22 15:59:02.882440: step: 602/463, loss: 0.08271514624357224 2023-01-22 15:59:03.523351: step: 604/463, loss: 0.07532180100679398 2023-01-22 15:59:04.092156: step: 606/463, loss: 1.0916624069213867 2023-01-22 15:59:04.692187: step: 608/463, loss: 0.06839864701032639 2023-01-22 15:59:05.301625: step: 610/463, loss: 0.23737512528896332 2023-01-22 15:59:05.930628: step: 612/463, loss: 1.2182929515838623 2023-01-22 15:59:06.551597: step: 614/463, loss: 0.2538917362689972 2023-01-22 15:59:07.168159: step: 616/463, loss: 0.02101358026266098 2023-01-22 15:59:07.763124: step: 618/463, loss: 0.02385772205889225 2023-01-22 15:59:08.350149: step: 620/463, loss: 0.05643423646688461 2023-01-22 15:59:08.923704: step: 622/463, loss: 0.14926645159721375 2023-01-22 15:59:09.586171: step: 624/463, loss: 0.17984911799430847 2023-01-22 15:59:10.169255: step: 626/463, loss: 0.08038536459207535 2023-01-22 15:59:10.786731: step: 628/463, loss: 0.10429989546537399 2023-01-22 15:59:11.406715: step: 630/463, loss: 
0.10029833763837814 2023-01-22 15:59:12.044663: step: 632/463, loss: 0.12359482049942017 2023-01-22 15:59:12.661762: step: 634/463, loss: 0.06361187249422073 2023-01-22 15:59:13.322136: step: 636/463, loss: 0.1068936362862587 2023-01-22 15:59:13.876445: step: 638/463, loss: 0.03322895988821983 2023-01-22 15:59:14.560111: step: 640/463, loss: 0.2729495167732239 2023-01-22 15:59:15.181112: step: 642/463, loss: 0.4596991539001465 2023-01-22 15:59:15.761718: step: 644/463, loss: 0.06242461875081062 2023-01-22 15:59:16.321121: step: 646/463, loss: 0.06560847163200378 2023-01-22 15:59:17.023132: step: 648/463, loss: 0.13191652297973633 2023-01-22 15:59:17.582491: step: 650/463, loss: 0.17920148372650146 2023-01-22 15:59:18.209044: step: 652/463, loss: 1.4253145456314087 2023-01-22 15:59:18.808631: step: 654/463, loss: 0.06314956396818161 2023-01-22 15:59:19.488487: step: 656/463, loss: 0.08827497810125351 2023-01-22 15:59:20.016577: step: 658/463, loss: 0.23561549186706543 2023-01-22 15:59:20.622420: step: 660/463, loss: 0.20914368331432343 2023-01-22 15:59:21.175258: step: 662/463, loss: 0.07288320362567902 2023-01-22 15:59:21.770868: step: 664/463, loss: 0.12965749204158783 2023-01-22 15:59:22.393247: step: 666/463, loss: 0.06994494050741196 2023-01-22 15:59:23.047174: step: 668/463, loss: 0.1325325220823288 2023-01-22 15:59:23.643305: step: 670/463, loss: 0.05767247453331947 2023-01-22 15:59:24.237786: step: 672/463, loss: 0.07963397353887558 2023-01-22 15:59:24.915891: step: 674/463, loss: 0.05728919431567192 2023-01-22 15:59:25.569129: step: 676/463, loss: 0.12764698266983032 2023-01-22 15:59:26.119849: step: 678/463, loss: 0.04784202575683594 2023-01-22 15:59:26.666557: step: 680/463, loss: 0.07107624411582947 2023-01-22 15:59:27.301552: step: 682/463, loss: 0.08824627846479416 2023-01-22 15:59:27.961042: step: 684/463, loss: 0.257671982049942 2023-01-22 15:59:28.530189: step: 686/463, loss: 0.037035033106803894 2023-01-22 15:59:29.118021: step: 688/463, loss: 
0.2483813315629959 2023-01-22 15:59:29.819152: step: 690/463, loss: 0.0150942737236619 2023-01-22 15:59:30.441830: step: 692/463, loss: 0.08768827468156815 2023-01-22 15:59:30.995253: step: 694/463, loss: 0.4158094823360443 2023-01-22 15:59:31.651049: step: 696/463, loss: 0.08764569461345673 2023-01-22 15:59:32.236868: step: 698/463, loss: 0.04886002466082573 2023-01-22 15:59:32.885311: step: 700/463, loss: 0.17614418268203735 2023-01-22 15:59:33.567697: step: 702/463, loss: 0.16765576601028442 2023-01-22 15:59:34.122387: step: 704/463, loss: 0.09967868775129318 2023-01-22 15:59:34.728800: step: 706/463, loss: 0.18557091057300568 2023-01-22 15:59:35.349713: step: 708/463, loss: 0.009462079964578152 2023-01-22 15:59:35.958545: step: 710/463, loss: 0.06162165477871895 2023-01-22 15:59:36.569682: step: 712/463, loss: 0.4870803654193878 2023-01-22 15:59:37.272203: step: 714/463, loss: 0.1719835102558136 2023-01-22 15:59:37.887367: step: 716/463, loss: 0.28138041496276855 2023-01-22 15:59:38.489278: step: 718/463, loss: 0.08333136886358261 2023-01-22 15:59:39.095304: step: 720/463, loss: 0.10243593156337738 2023-01-22 15:59:39.666757: step: 722/463, loss: 0.19080708920955658 2023-01-22 15:59:40.363070: step: 724/463, loss: 0.06421104818582535 2023-01-22 15:59:40.919059: step: 726/463, loss: 0.011799248866736889 2023-01-22 15:59:41.546020: step: 728/463, loss: 0.4534011781215668 2023-01-22 15:59:42.156272: step: 730/463, loss: 0.06811133772134781 2023-01-22 15:59:42.802753: step: 732/463, loss: 0.09847425669431686 2023-01-22 15:59:43.433155: step: 734/463, loss: 0.16072200238704681 2023-01-22 15:59:43.961350: step: 736/463, loss: 0.17595139145851135 2023-01-22 15:59:44.569232: step: 738/463, loss: 0.15323977172374725 2023-01-22 15:59:45.187527: step: 740/463, loss: 0.5017956495285034 2023-01-22 15:59:45.911527: step: 742/463, loss: 0.16948798298835754 2023-01-22 15:59:46.532316: step: 744/463, loss: 0.31269246339797974 2023-01-22 15:59:47.180160: step: 746/463, loss: 
0.46509379148483276 2023-01-22 15:59:47.783615: step: 748/463, loss: 1.5826869010925293 2023-01-22 15:59:48.331109: step: 750/463, loss: 0.06930915266275406 2023-01-22 15:59:48.970318: step: 752/463, loss: 0.16425146162509918 2023-01-22 15:59:49.541530: step: 754/463, loss: 0.1367664933204651 2023-01-22 15:59:50.117359: step: 756/463, loss: 0.14208164811134338 2023-01-22 15:59:50.722497: step: 758/463, loss: 0.1842901110649109 2023-01-22 15:59:51.308237: step: 760/463, loss: 0.1828688234090805 2023-01-22 15:59:51.898182: step: 762/463, loss: 0.1290280818939209 2023-01-22 15:59:52.525117: step: 764/463, loss: 0.18077082931995392 2023-01-22 15:59:53.092905: step: 766/463, loss: 0.08758290112018585 2023-01-22 15:59:53.691011: step: 768/463, loss: 0.17677392065525055 2023-01-22 15:59:54.365785: step: 770/463, loss: 2.3371331691741943 2023-01-22 15:59:55.029268: step: 772/463, loss: 0.2289806306362152 2023-01-22 15:59:55.659679: step: 774/463, loss: 0.42653965950012207 2023-01-22 15:59:56.260968: step: 776/463, loss: 0.09620672464370728 2023-01-22 15:59:56.847562: step: 778/463, loss: 0.5989714860916138 2023-01-22 15:59:57.436041: step: 780/463, loss: 0.15330654382705688 2023-01-22 15:59:58.092454: step: 782/463, loss: 0.32342100143432617 2023-01-22 15:59:58.665660: step: 784/463, loss: 0.17199596762657166 2023-01-22 15:59:59.254445: step: 786/463, loss: 0.8648913502693176 2023-01-22 15:59:59.837098: step: 788/463, loss: 0.15261411666870117 2023-01-22 16:00:00.412796: step: 790/463, loss: 0.09089970588684082 2023-01-22 16:00:01.019191: step: 792/463, loss: 0.1429625153541565 2023-01-22 16:00:01.691610: step: 794/463, loss: 0.40808990597724915 2023-01-22 16:00:02.314965: step: 796/463, loss: 0.13720203936100006 2023-01-22 16:00:02.988872: step: 798/463, loss: 0.31156125664711 2023-01-22 16:00:03.570653: step: 800/463, loss: 1.7698602676391602 2023-01-22 16:00:04.190136: step: 802/463, loss: 0.05084936320781708 2023-01-22 16:00:04.826188: step: 804/463, loss: 
0.14314407110214233 2023-01-22 16:00:05.488129: step: 806/463, loss: 0.11828505247831345 2023-01-22 16:00:06.106611: step: 808/463, loss: 0.08337981253862381 2023-01-22 16:00:06.708833: step: 810/463, loss: 1.3459370136260986 2023-01-22 16:00:07.352704: step: 812/463, loss: 0.22640123963356018 2023-01-22 16:00:07.937800: step: 814/463, loss: 0.24593491852283478 2023-01-22 16:00:08.559741: step: 816/463, loss: 0.12504655122756958 2023-01-22 16:00:09.090222: step: 818/463, loss: 0.039794888347387314 2023-01-22 16:00:09.674698: step: 820/463, loss: 0.02326255664229393 2023-01-22 16:00:10.303529: step: 822/463, loss: 0.18437333405017853 2023-01-22 16:00:10.906943: step: 824/463, loss: 0.37795490026474 2023-01-22 16:00:11.521124: step: 826/463, loss: 0.7388449907302856 2023-01-22 16:00:12.160910: step: 828/463, loss: 0.05138657987117767 2023-01-22 16:00:12.791882: step: 830/463, loss: 0.3537134528160095 2023-01-22 16:00:13.411436: step: 832/463, loss: 0.11910641193389893 2023-01-22 16:00:14.071292: step: 834/463, loss: 0.3765007257461548 2023-01-22 16:00:14.702973: step: 836/463, loss: 0.12953561544418335 2023-01-22 16:00:15.328419: step: 838/463, loss: 0.23038069903850555 2023-01-22 16:00:15.970630: step: 840/463, loss: 0.029501046985387802 2023-01-22 16:00:16.549562: step: 842/463, loss: 0.18548208475112915 2023-01-22 16:00:17.133357: step: 844/463, loss: 0.06451420485973358 2023-01-22 16:00:17.727378: step: 846/463, loss: 0.19993458688259125 2023-01-22 16:00:18.286014: step: 848/463, loss: 0.22051355242729187 2023-01-22 16:00:18.967944: step: 850/463, loss: 0.0905173048377037 2023-01-22 16:00:19.575754: step: 852/463, loss: 0.2315424382686615 2023-01-22 16:00:20.131307: step: 854/463, loss: 0.11801984906196594 2023-01-22 16:00:20.787583: step: 856/463, loss: 0.6430762410163879 2023-01-22 16:00:21.372845: step: 858/463, loss: 0.1314188539981842 2023-01-22 16:00:22.049924: step: 860/463, loss: 0.1528383493423462 2023-01-22 16:00:22.667986: step: 862/463, loss: 
0.16546112298965454 2023-01-22 16:00:23.234720: step: 864/463, loss: 0.3742576241493225 2023-01-22 16:00:23.829040: step: 866/463, loss: 0.1176866665482521 2023-01-22 16:00:24.439309: step: 868/463, loss: 0.045631397515535355 2023-01-22 16:00:25.062419: step: 870/463, loss: 0.04743572697043419 2023-01-22 16:00:25.693000: step: 872/463, loss: 1.0642017126083374 2023-01-22 16:00:26.296785: step: 874/463, loss: 1.557955265045166 2023-01-22 16:00:26.911686: step: 876/463, loss: 0.09657756984233856 2023-01-22 16:00:27.528021: step: 878/463, loss: 0.08996067941188812 2023-01-22 16:00:28.184857: step: 880/463, loss: 0.13480037450790405 2023-01-22 16:00:28.857135: step: 882/463, loss: 0.1776835173368454 2023-01-22 16:00:29.494947: step: 884/463, loss: 0.10550957918167114 2023-01-22 16:00:30.121556: step: 886/463, loss: 0.9665667414665222 2023-01-22 16:00:30.668745: step: 888/463, loss: 0.111086905002594 2023-01-22 16:00:31.275538: step: 890/463, loss: 0.17086857557296753 2023-01-22 16:00:31.960800: step: 892/463, loss: 0.5125375986099243 2023-01-22 16:00:32.560724: step: 894/463, loss: 0.03621345013380051 2023-01-22 16:00:33.124704: step: 896/463, loss: 0.09092015027999878 2023-01-22 16:00:33.726385: step: 898/463, loss: 0.4515478014945984 2023-01-22 16:00:34.303239: step: 900/463, loss: 0.2754586637020111 2023-01-22 16:00:34.876963: step: 902/463, loss: 0.08035633713006973 2023-01-22 16:00:35.582737: step: 904/463, loss: 0.6884781718254089 2023-01-22 16:00:36.240544: step: 906/463, loss: 0.4741193950176239 2023-01-22 16:00:36.821642: step: 908/463, loss: 0.032955095171928406 2023-01-22 16:00:37.421687: step: 910/463, loss: 0.2583451569080353 2023-01-22 16:00:38.087289: step: 912/463, loss: 0.03579411283135414 2023-01-22 16:00:38.722469: step: 914/463, loss: 0.5829232931137085 2023-01-22 16:00:39.328240: step: 916/463, loss: 0.05902054160833359 2023-01-22 16:00:39.928734: step: 918/463, loss: 0.08912669867277145 2023-01-22 16:00:40.540852: step: 920/463, loss: 
0.17650122940540314 2023-01-22 16:00:41.155109: step: 922/463, loss: 0.11405730247497559 2023-01-22 16:00:41.852125: step: 924/463, loss: 0.19874829053878784 2023-01-22 16:00:42.459278: step: 926/463, loss: 0.2490263134241104 ================================================== Loss: 0.244 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2871675531914894, 'r': 0.35855075901328276, 'f1': 0.31891350210970465}, 'combined': 0.2349888962913613, 'epoch': 12} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.33691268922160644, 'r': 0.3230951531016976, 'f1': 0.32985928325571984}, 'combined': 0.23206180731558182, 'epoch': 12} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.295820189274448, 'r': 0.3558823529411765, 'f1': 0.3230835486649441}, 'combined': 0.23806156217416932, 'epoch': 12} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3364794936903662, 'r': 0.3179819299011053, 'f1': 0.3269693061163451}, 'combined': 0.23214820734260502, 'epoch': 12} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3031662306642402, 'r': 0.3612682976416373, 'f1': 0.3296768707482994}, 'combined': 0.2429197994987469, 'epoch': 12} Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3502538951337507, 'r': 0.30685419783096485, 'f1': 0.32712084717607975}, 'combined': 0.23225580149501662, 'epoch': 12} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22557471264367815, 'r': 0.37380952380952376, 'f1': 0.28136200716845877}, 'combined': 0.18757467144563916, 'epoch': 12} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2714285714285714, 'r': 0.41304347826086957, 'f1': 
0.3275862068965517}, 'combined': 0.16379310344827586, 'epoch': 12} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39705882352941174, 'r': 0.23275862068965517, 'f1': 0.2934782608695652}, 'combined': 0.19565217391304346, 'epoch': 12} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 
0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9} ****************************** Epoch: 13 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 16:03:16.445005: step: 2/463, loss: 0.02108386531472206 2023-01-22 16:03:17.012276: step: 4/463, loss: 0.10170077532529831 2023-01-22 16:03:17.542849: step: 6/463, loss: 0.07822522521018982 2023-01-22 16:03:18.190124: step: 8/463, loss: 0.07511308044195175 2023-01-22 16:03:18.793806: step: 10/463, loss: 0.0725671648979187 2023-01-22 16:03:19.453233: step: 12/463, loss: 0.1626165211200714 2023-01-22 16:03:20.081681: step: 14/463, loss: 0.1654897928237915 2023-01-22 16:03:20.699641: step: 16/463, loss: 0.04427202790975571 2023-01-22 16:03:21.326898: step: 18/463, loss: 0.06998095661401749 2023-01-22 16:03:21.953355: step: 20/463, loss: 0.08586881309747696 2023-01-22 16:03:22.547902: step: 22/463, loss: 0.05537542700767517 2023-01-22 16:03:23.141807: step: 24/463, loss: 0.10738552361726761 2023-01-22 16:03:23.744475: step: 26/463, loss: 0.12732115387916565 2023-01-22 16:03:24.360348: step: 28/463, loss: 0.21386030316352844 2023-01-22 16:03:24.994050: step: 30/463, loss: 0.061403244733810425 2023-01-22 16:03:25.635462: step: 32/463, loss: 0.205952450633049 2023-01-22 16:03:26.253394: step: 34/463, loss: 0.0579068660736084 2023-01-22 16:03:26.894359: step: 36/463, loss: 0.060569122433662415 2023-01-22 16:03:27.522506: step: 38/463, loss: 0.18203423917293549 2023-01-22 16:03:28.106910: step: 40/463, loss: 0.2557511329650879 2023-01-22 16:03:28.755157: 
step: 42/463, loss: 0.10263889282941818 2023-01-22 16:03:29.331343: step: 44/463, loss: 0.06202158331871033 2023-01-22 16:03:29.937248: step: 46/463, loss: 0.05014343932271004 2023-01-22 16:03:30.570961: step: 48/463, loss: 0.09642301499843597 2023-01-22 16:03:31.168848: step: 50/463, loss: 0.1670560985803604 2023-01-22 16:03:31.802292: step: 52/463, loss: 0.6552795767784119 2023-01-22 16:03:32.445838: step: 54/463, loss: 0.10286103934049606 2023-01-22 16:03:33.126352: step: 56/463, loss: 0.08303134143352509 2023-01-22 16:03:33.741446: step: 58/463, loss: 0.1617237627506256 2023-01-22 16:03:34.406386: step: 60/463, loss: 0.032152146100997925 2023-01-22 16:03:35.065902: step: 62/463, loss: 0.06829527020454407 2023-01-22 16:03:35.657140: step: 64/463, loss: 0.053932320326566696 2023-01-22 16:03:36.325275: step: 66/463, loss: 0.07688063383102417 2023-01-22 16:03:36.941854: step: 68/463, loss: 0.1235547587275505 2023-01-22 16:03:37.590662: step: 70/463, loss: 0.04493608698248863 2023-01-22 16:03:38.161161: step: 72/463, loss: 0.060436371713876724 2023-01-22 16:03:38.754003: step: 74/463, loss: 0.14306925237178802 2023-01-22 16:03:39.370390: step: 76/463, loss: 0.12424968183040619 2023-01-22 16:03:40.073234: step: 78/463, loss: 0.027544915676116943 2023-01-22 16:03:40.654328: step: 80/463, loss: 0.06189502775669098 2023-01-22 16:03:41.274780: step: 82/463, loss: 0.0353294238448143 2023-01-22 16:03:41.898325: step: 84/463, loss: 0.07019975781440735 2023-01-22 16:03:42.473954: step: 86/463, loss: 0.05137719586491585 2023-01-22 16:03:43.046734: step: 88/463, loss: 0.03704935312271118 2023-01-22 16:03:43.626916: step: 90/463, loss: 0.10829801112413406 2023-01-22 16:03:44.185705: step: 92/463, loss: 0.17878183722496033 2023-01-22 16:03:44.804480: step: 94/463, loss: 0.06415527313947678 2023-01-22 16:03:45.406725: step: 96/463, loss: 0.10029242932796478 2023-01-22 16:03:45.952219: step: 98/463, loss: 0.0673484280705452 2023-01-22 16:03:46.555838: step: 100/463, loss: 
0.1933254599571228 2023-01-22 16:03:47.134710: step: 102/463, loss: 0.09078634530305862 2023-01-22 16:03:47.719396: step: 104/463, loss: 0.07992289960384369 2023-01-22 16:03:48.387785: step: 106/463, loss: 0.3753969371318817 2023-01-22 16:03:49.025957: step: 108/463, loss: 0.011334724724292755 2023-01-22 16:03:49.656714: step: 110/463, loss: 0.10179857164621353 2023-01-22 16:03:50.296389: step: 112/463, loss: 0.03756437078118324 2023-01-22 16:03:50.959532: step: 114/463, loss: 0.13220857083797455 2023-01-22 16:03:51.563337: step: 116/463, loss: 0.053526781499385834 2023-01-22 16:03:52.176606: step: 118/463, loss: 0.06136821582913399 2023-01-22 16:03:52.834764: step: 120/463, loss: 0.06187546253204346 2023-01-22 16:03:53.472261: step: 122/463, loss: 0.177166149020195 2023-01-22 16:03:54.151866: step: 124/463, loss: 0.0868932232260704 2023-01-22 16:03:54.741102: step: 126/463, loss: 0.09526161104440689 2023-01-22 16:03:55.422844: step: 128/463, loss: 0.21327342092990875 2023-01-22 16:03:56.007172: step: 130/463, loss: 0.13131648302078247 2023-01-22 16:03:56.600017: step: 132/463, loss: 0.09675684571266174 2023-01-22 16:03:57.261426: step: 134/463, loss: 0.45969530940055847 2023-01-22 16:03:57.878607: step: 136/463, loss: 0.24603363871574402 2023-01-22 16:03:58.483531: step: 138/463, loss: 0.07422692328691483 2023-01-22 16:03:59.098946: step: 140/463, loss: 1.8143908977508545 2023-01-22 16:03:59.727167: step: 142/463, loss: 0.390268474817276 2023-01-22 16:04:00.310514: step: 144/463, loss: 0.04677719250321388 2023-01-22 16:04:00.910233: step: 146/463, loss: 0.06772051006555557 2023-01-22 16:04:01.548684: step: 148/463, loss: 0.20691873133182526 2023-01-22 16:04:02.177752: step: 150/463, loss: 0.18168911337852478 2023-01-22 16:04:02.851550: step: 152/463, loss: 0.3019852638244629 2023-01-22 16:04:03.474132: step: 154/463, loss: 0.2035442441701889 2023-01-22 16:04:04.095069: step: 156/463, loss: 0.044795673340559006 2023-01-22 16:04:04.702686: step: 158/463, loss: 
0.3285199701786041 2023-01-22 16:04:05.275384: step: 160/463, loss: 0.023714380338788033 2023-01-22 16:04:05.875357: step: 162/463, loss: 0.05044558644294739 2023-01-22 16:04:06.599758: step: 164/463, loss: 0.18508554995059967 2023-01-22 16:04:07.253690: step: 166/463, loss: 0.01925022155046463 2023-01-22 16:04:07.845742: step: 168/463, loss: 0.0679761990904808 2023-01-22 16:04:08.481801: step: 170/463, loss: 0.13748414814472198 2023-01-22 16:04:09.104470: step: 172/463, loss: 0.045770544558763504 2023-01-22 16:04:09.794951: step: 174/463, loss: 0.05927185341715813 2023-01-22 16:04:10.355995: step: 176/463, loss: 0.029446961358189583 2023-01-22 16:04:11.019464: step: 178/463, loss: 0.03968602046370506 2023-01-22 16:04:11.630600: step: 180/463, loss: 0.13516509532928467 2023-01-22 16:04:12.263744: step: 182/463, loss: 0.07503102719783783 2023-01-22 16:04:12.888596: step: 184/463, loss: 0.024883529171347618 2023-01-22 16:04:13.504018: step: 186/463, loss: 0.24376827478408813 2023-01-22 16:04:14.100454: step: 188/463, loss: 0.08412440121173859 2023-01-22 16:04:14.708625: step: 190/463, loss: 0.014118066988885403 2023-01-22 16:04:15.254932: step: 192/463, loss: 0.28129756450653076 2023-01-22 16:04:15.913705: step: 194/463, loss: 0.6414167881011963 2023-01-22 16:04:16.517479: step: 196/463, loss: 0.005964319687336683 2023-01-22 16:04:17.159972: step: 198/463, loss: 0.05918041244149208 2023-01-22 16:04:17.796388: step: 200/463, loss: 0.026724640280008316 2023-01-22 16:04:18.377349: step: 202/463, loss: 0.11227631568908691 2023-01-22 16:04:18.991505: step: 204/463, loss: 0.14118710160255432 2023-01-22 16:04:19.674261: step: 206/463, loss: 0.046949148178100586 2023-01-22 16:04:20.352039: step: 208/463, loss: 0.034755952656269073 2023-01-22 16:04:20.967124: step: 210/463, loss: 0.05624411627650261 2023-01-22 16:04:21.580789: step: 212/463, loss: 0.094609335064888 2023-01-22 16:04:22.211306: step: 214/463, loss: 0.18752305209636688 2023-01-22 16:04:22.820328: step: 216/463, 
loss: 0.019950928166508675 2023-01-22 16:04:23.453368: step: 218/463, loss: 0.10414022207260132 2023-01-22 16:04:24.046815: step: 220/463, loss: 0.03668985143303871 2023-01-22 16:04:24.623231: step: 222/463, loss: 0.053316470235586166 2023-01-22 16:04:25.201350: step: 224/463, loss: 0.11589406430721283 2023-01-22 16:04:25.779859: step: 226/463, loss: 0.36937668919563293 2023-01-22 16:04:26.436560: step: 228/463, loss: 0.03212893381714821 2023-01-22 16:04:27.114170: step: 230/463, loss: 0.061012737452983856 2023-01-22 16:04:27.683076: step: 232/463, loss: 0.03872611001133919 2023-01-22 16:04:28.262742: step: 234/463, loss: 0.3927574157714844 2023-01-22 16:04:28.881519: step: 236/463, loss: 0.12124763429164886 2023-01-22 16:04:29.475931: step: 238/463, loss: 0.3160013258457184 2023-01-22 16:04:30.051240: step: 240/463, loss: 0.1692732572555542 2023-01-22 16:04:30.656285: step: 242/463, loss: 0.04250122234225273 2023-01-22 16:04:31.322745: step: 244/463, loss: 0.2613341212272644 2023-01-22 16:04:31.897980: step: 246/463, loss: 0.05078519135713577 2023-01-22 16:04:32.455150: step: 248/463, loss: 0.09295962750911713 2023-01-22 16:04:33.123051: step: 250/463, loss: 0.014821333810687065 2023-01-22 16:04:33.758936: step: 252/463, loss: 0.2010076344013214 2023-01-22 16:04:34.386204: step: 254/463, loss: 0.1832876056432724 2023-01-22 16:04:34.980273: step: 256/463, loss: 0.11921825259923935 2023-01-22 16:04:35.577010: step: 258/463, loss: 0.03742406889796257 2023-01-22 16:04:36.168887: step: 260/463, loss: 0.03584354743361473 2023-01-22 16:04:36.717905: step: 262/463, loss: 0.006572291254997253 2023-01-22 16:04:37.299612: step: 264/463, loss: 0.04208887368440628 2023-01-22 16:04:37.886037: step: 266/463, loss: 0.034230928868055344 2023-01-22 16:04:38.471244: step: 268/463, loss: 0.04290442168712616 2023-01-22 16:04:39.059123: step: 270/463, loss: 0.06206333264708519 2023-01-22 16:04:39.617923: step: 272/463, loss: 0.6973184943199158 2023-01-22 16:04:40.189503: step: 274/463, 
loss: 0.11684607714414597 2023-01-22 16:04:40.833748: step: 276/463, loss: 0.3158898949623108 2023-01-22 16:04:41.441409: step: 278/463, loss: 0.06825271993875504 2023-01-22 16:04:42.098381: step: 280/463, loss: 0.03583059459924698 2023-01-22 16:04:42.653110: step: 282/463, loss: 0.04478013515472412 2023-01-22 16:04:43.240668: step: 284/463, loss: 0.12344963848590851 2023-01-22 16:04:43.895618: step: 286/463, loss: 0.12761276960372925 2023-01-22 16:04:44.534113: step: 288/463, loss: 0.16597716510295868 2023-01-22 16:04:45.149385: step: 290/463, loss: 0.031031807884573936 2023-01-22 16:04:45.834926: step: 292/463, loss: 0.21369682252407074 2023-01-22 16:04:46.408446: step: 294/463, loss: 0.4069356918334961 2023-01-22 16:04:46.999087: step: 296/463, loss: 0.4624345004558563 2023-01-22 16:04:47.635920: step: 298/463, loss: 0.018621452152729034 2023-01-22 16:04:48.265114: step: 300/463, loss: 0.11020310968160629 2023-01-22 16:04:48.899907: step: 302/463, loss: 0.07263147830963135 2023-01-22 16:04:49.550669: step: 304/463, loss: 0.284977525472641 2023-01-22 16:04:50.103989: step: 306/463, loss: 0.05911963805556297 2023-01-22 16:04:50.777310: step: 308/463, loss: 0.08802596479654312 2023-01-22 16:04:51.369896: step: 310/463, loss: 1.356328010559082 2023-01-22 16:04:51.953416: step: 312/463, loss: 0.027946988120675087 2023-01-22 16:04:52.505236: step: 314/463, loss: 0.055415187031030655 2023-01-22 16:04:53.102293: step: 316/463, loss: 0.3229154348373413 2023-01-22 16:04:53.650525: step: 318/463, loss: 0.04108349233865738 2023-01-22 16:04:54.238743: step: 320/463, loss: 0.5424174070358276 2023-01-22 16:04:54.887143: step: 322/463, loss: 0.07949607819318771 2023-01-22 16:04:55.496804: step: 324/463, loss: 0.12088025361299515 2023-01-22 16:04:56.092494: step: 326/463, loss: 0.12129585444927216 2023-01-22 16:04:56.675049: step: 328/463, loss: 0.14168064296245575 2023-01-22 16:04:57.350425: step: 330/463, loss: 0.05893489718437195 2023-01-22 16:04:57.926036: step: 332/463, 
loss: 0.16034460067749023 2023-01-22 16:04:58.529617: step: 334/463, loss: 0.12082453817129135 2023-01-22 16:04:59.113681: step: 336/463, loss: 0.41055870056152344 2023-01-22 16:04:59.753399: step: 338/463, loss: 0.043504249304533005 2023-01-22 16:05:00.387670: step: 340/463, loss: 0.09868357330560684 2023-01-22 16:05:00.962817: step: 342/463, loss: 0.06267727166414261 2023-01-22 16:05:01.651343: step: 344/463, loss: 0.10694398730993271 2023-01-22 16:05:02.206756: step: 346/463, loss: 0.33748090267181396 2023-01-22 16:05:02.763195: step: 348/463, loss: 0.13698513805866241 2023-01-22 16:05:03.287138: step: 350/463, loss: 0.06052187457680702 2023-01-22 16:05:03.972839: step: 352/463, loss: 0.0759342610836029 2023-01-22 16:05:04.626815: step: 354/463, loss: 1.048840045928955 2023-01-22 16:05:05.230044: step: 356/463, loss: 0.02684696950018406 2023-01-22 16:05:05.935891: step: 358/463, loss: 0.13596241176128387 2023-01-22 16:05:06.593389: step: 360/463, loss: 0.36798837780952454 2023-01-22 16:05:07.231760: step: 362/463, loss: 0.18784606456756592 2023-01-22 16:05:07.828478: step: 364/463, loss: 0.09593077749013901 2023-01-22 16:05:08.399750: step: 366/463, loss: 0.0818757712841034 2023-01-22 16:05:08.991900: step: 368/463, loss: 0.3346506357192993 2023-01-22 16:05:09.594293: step: 370/463, loss: 0.48949867486953735 2023-01-22 16:05:10.209227: step: 372/463, loss: 0.06667443364858627 2023-01-22 16:05:10.796329: step: 374/463, loss: 0.13796338438987732 2023-01-22 16:05:11.461445: step: 376/463, loss: 3.654369831085205 2023-01-22 16:05:12.041689: step: 378/463, loss: 0.4502463638782501 2023-01-22 16:05:12.653836: step: 380/463, loss: 0.02668783999979496 2023-01-22 16:05:13.229963: step: 382/463, loss: 0.12121445685625076 2023-01-22 16:05:13.792836: step: 384/463, loss: 0.08039193600416183 2023-01-22 16:05:14.377732: step: 386/463, loss: 0.04662998393177986 2023-01-22 16:05:15.018721: step: 388/463, loss: 0.7552713751792908 2023-01-22 16:05:15.647462: step: 390/463, loss: 
0.13476215302944183 2023-01-22 16:05:16.364292: step: 392/463, loss: 0.09529784321784973 2023-01-22 16:05:16.987322: step: 394/463, loss: 0.11388815939426422 2023-01-22 16:05:17.609918: step: 396/463, loss: 0.27961426973342896 2023-01-22 16:05:18.259558: step: 398/463, loss: 0.0546390563249588 2023-01-22 16:05:18.850220: step: 400/463, loss: 0.06768814474344254 2023-01-22 16:05:19.497743: step: 402/463, loss: 2.992281675338745 2023-01-22 16:05:20.092446: step: 404/463, loss: 0.08586805313825607 2023-01-22 16:05:20.732029: step: 406/463, loss: 0.10834775865077972 2023-01-22 16:05:21.311014: step: 408/463, loss: 0.20607316493988037 2023-01-22 16:05:21.933000: step: 410/463, loss: 0.10076102614402771 2023-01-22 16:05:22.486491: step: 412/463, loss: 0.24833130836486816 2023-01-22 16:05:23.102380: step: 414/463, loss: 0.18834535777568817 2023-01-22 16:05:23.664600: step: 416/463, loss: 0.09480500966310501 2023-01-22 16:05:24.278808: step: 418/463, loss: 0.20114415884017944 2023-01-22 16:05:24.923584: step: 420/463, loss: 0.05873025953769684 2023-01-22 16:05:25.470760: step: 422/463, loss: 0.02476329170167446 2023-01-22 16:05:26.154963: step: 424/463, loss: 0.4778350293636322 2023-01-22 16:05:26.714962: step: 426/463, loss: 0.2674151062965393 2023-01-22 16:05:27.307203: step: 428/463, loss: 0.3484248220920563 2023-01-22 16:05:27.942468: step: 430/463, loss: 0.22904010117053986 2023-01-22 16:05:28.538262: step: 432/463, loss: 2.497194528579712 2023-01-22 16:05:29.188351: step: 434/463, loss: 0.048180609941482544 2023-01-22 16:05:29.793926: step: 436/463, loss: 0.03460788354277611 2023-01-22 16:05:30.399898: step: 438/463, loss: 0.11969885975122452 2023-01-22 16:05:30.965488: step: 440/463, loss: 0.13664956390857697 2023-01-22 16:05:31.610211: step: 442/463, loss: 0.14179258048534393 2023-01-22 16:05:32.191887: step: 444/463, loss: 0.030060209333896637 2023-01-22 16:05:32.805114: step: 446/463, loss: 0.0655910074710846 2023-01-22 16:05:33.452874: step: 448/463, loss: 
0.05175537243485451 2023-01-22 16:05:34.043022: step: 450/463, loss: 0.20212547481060028 2023-01-22 16:05:34.650880: step: 452/463, loss: 0.6811161041259766 2023-01-22 16:05:35.278219: step: 454/463, loss: 0.3011069595813751 2023-01-22 16:05:35.875336: step: 456/463, loss: 0.2384025901556015 2023-01-22 16:05:36.509134: step: 458/463, loss: 0.07780146598815918 2023-01-22 16:05:37.136736: step: 460/463, loss: 0.1250879317522049 2023-01-22 16:05:37.800411: step: 462/463, loss: 0.17082569003105164 2023-01-22 16:05:38.348761: step: 464/463, loss: 0.101338692009449 2023-01-22 16:05:38.903150: step: 466/463, loss: 0.07114072889089584 2023-01-22 16:05:39.582721: step: 468/463, loss: 0.087715744972229 2023-01-22 16:05:40.194317: step: 470/463, loss: 0.36223268508911133 2023-01-22 16:05:40.787580: step: 472/463, loss: 0.0985742136836052 2023-01-22 16:05:41.441878: step: 474/463, loss: 0.3917888402938843 2023-01-22 16:05:42.025596: step: 476/463, loss: 0.016237527132034302 2023-01-22 16:05:42.603607: step: 478/463, loss: 0.38260263204574585 2023-01-22 16:05:43.233526: step: 480/463, loss: 0.9861243367195129 2023-01-22 16:05:43.796904: step: 482/463, loss: 0.14938409626483917 2023-01-22 16:05:44.406963: step: 484/463, loss: 0.0651649683713913 2023-01-22 16:05:45.087982: step: 486/463, loss: 0.22786535322666168 2023-01-22 16:05:45.663934: step: 488/463, loss: 0.09017148613929749 2023-01-22 16:05:46.197750: step: 490/463, loss: 0.08039987087249756 2023-01-22 16:05:46.897775: step: 492/463, loss: 0.046700067818164825 2023-01-22 16:05:47.524632: step: 494/463, loss: 0.1674748957157135 2023-01-22 16:05:48.156125: step: 496/463, loss: 0.04163726419210434 2023-01-22 16:05:48.770566: step: 498/463, loss: 0.032454267144203186 2023-01-22 16:05:49.368680: step: 500/463, loss: 0.14114023745059967 2023-01-22 16:05:49.950571: step: 502/463, loss: 0.18452498316764832 2023-01-22 16:05:50.545432: step: 504/463, loss: 0.12438204884529114 2023-01-22 16:05:51.210975: step: 506/463, loss: 
0.1521502584218979 2023-01-22 16:05:51.881828: step: 508/463, loss: 0.11815361678600311 2023-01-22 16:05:52.468737: step: 510/463, loss: 0.968411922454834 2023-01-22 16:05:53.119213: step: 512/463, loss: 0.07441670447587967 2023-01-22 16:05:53.740223: step: 514/463, loss: 0.032490868121385574 2023-01-22 16:05:54.340581: step: 516/463, loss: 0.060109928250312805 2023-01-22 16:05:54.950208: step: 518/463, loss: 0.02395005337893963 2023-01-22 16:05:55.571256: step: 520/463, loss: 0.03130461275577545 2023-01-22 16:05:56.236918: step: 522/463, loss: 0.08788760006427765 2023-01-22 16:05:56.853111: step: 524/463, loss: 0.05930419638752937 2023-01-22 16:05:57.435024: step: 526/463, loss: 0.2586973309516907 2023-01-22 16:05:58.020947: step: 528/463, loss: 0.06222150847315788 2023-01-22 16:05:58.574010: step: 530/463, loss: 0.17142131924629211 2023-01-22 16:05:59.199035: step: 532/463, loss: 0.11814301460981369 2023-01-22 16:05:59.794876: step: 534/463, loss: 0.09680081903934479 2023-01-22 16:06:00.431089: step: 536/463, loss: 0.23794129490852356 2023-01-22 16:06:01.083823: step: 538/463, loss: 0.20050817728042603 2023-01-22 16:06:01.709173: step: 540/463, loss: 0.08318908512592316 2023-01-22 16:06:02.290729: step: 542/463, loss: 0.3669341206550598 2023-01-22 16:06:02.914260: step: 544/463, loss: 0.14316150546073914 2023-01-22 16:06:03.486194: step: 546/463, loss: 0.039507895708084106 2023-01-22 16:06:04.089278: step: 548/463, loss: 0.09676271677017212 2023-01-22 16:06:04.700733: step: 550/463, loss: 0.11115095019340515 2023-01-22 16:06:05.371760: step: 552/463, loss: 0.08152812719345093 2023-01-22 16:06:05.998841: step: 554/463, loss: 0.05214051529765129 2023-01-22 16:06:06.626302: step: 556/463, loss: 0.15833570063114166 2023-01-22 16:06:07.177636: step: 558/463, loss: 0.03636357933282852 2023-01-22 16:06:07.808195: step: 560/463, loss: 0.7615919709205627 2023-01-22 16:06:08.375940: step: 562/463, loss: 0.12771488726139069 2023-01-22 16:06:09.108215: step: 564/463, loss: 
0.1430182158946991 2023-01-22 16:06:09.731164: step: 566/463, loss: 3.371283531188965 2023-01-22 16:06:10.479937: step: 568/463, loss: 0.061900973320007324 2023-01-22 16:06:11.019312: step: 570/463, loss: 0.029409583657979965 2023-01-22 16:06:11.656367: step: 572/463, loss: 0.7436249256134033 2023-01-22 16:06:12.276252: step: 574/463, loss: 0.15375366806983948 2023-01-22 16:06:12.935531: step: 576/463, loss: 0.09434197843074799 2023-01-22 16:06:13.565700: step: 578/463, loss: 0.23521144688129425 2023-01-22 16:06:14.219367: step: 580/463, loss: 0.12104136496782303 2023-01-22 16:06:14.847010: step: 582/463, loss: 0.23615461587905884 2023-01-22 16:06:15.523486: step: 584/463, loss: 0.03565884754061699 2023-01-22 16:06:16.128978: step: 586/463, loss: 0.06022859364748001 2023-01-22 16:06:16.786518: step: 588/463, loss: 0.07340779155492783 2023-01-22 16:06:17.360856: step: 590/463, loss: 0.09406647831201553 2023-01-22 16:06:17.945106: step: 592/463, loss: 0.06828715652227402 2023-01-22 16:06:18.500946: step: 594/463, loss: 0.2565891742706299 2023-01-22 16:06:19.102208: step: 596/463, loss: 0.0862281322479248 2023-01-22 16:06:19.744238: step: 598/463, loss: 0.23858952522277832 2023-01-22 16:06:20.410338: step: 600/463, loss: 0.025864511728286743 2023-01-22 16:06:20.978299: step: 602/463, loss: 0.05267537012696266 2023-01-22 16:06:21.582841: step: 604/463, loss: 0.09233048558235168 2023-01-22 16:06:22.160176: step: 606/463, loss: 0.14636364579200745 2023-01-22 16:06:22.758712: step: 608/463, loss: 0.45371413230895996 2023-01-22 16:06:23.353199: step: 610/463, loss: 0.027421575039625168 2023-01-22 16:06:23.986952: step: 612/463, loss: 1.470650553703308 2023-01-22 16:06:24.658859: step: 614/463, loss: 1.5426228046417236 2023-01-22 16:06:25.243181: step: 616/463, loss: 0.07686719298362732 2023-01-22 16:06:25.858817: step: 618/463, loss: 0.35312169790267944 2023-01-22 16:06:26.423439: step: 620/463, loss: 0.5375580787658691 2023-01-22 16:06:27.072353: step: 622/463, loss: 
0.2728302776813507 2023-01-22 16:06:27.707559: step: 624/463, loss: 0.0582653172314167 2023-01-22 16:06:28.288958: step: 626/463, loss: 0.09951641410589218 2023-01-22 16:06:28.997306: step: 628/463, loss: 0.10719003528356552 2023-01-22 16:06:29.653912: step: 630/463, loss: 0.03710818663239479 2023-01-22 16:06:30.236927: step: 632/463, loss: 0.15910382568836212 2023-01-22 16:06:30.819723: step: 634/463, loss: 0.1237945705652237 2023-01-22 16:06:31.448845: step: 636/463, loss: 0.09320122003555298 2023-01-22 16:06:32.084257: step: 638/463, loss: 0.14148275554180145 2023-01-22 16:06:32.662462: step: 640/463, loss: 0.09437666088342667 2023-01-22 16:06:33.287298: step: 642/463, loss: 0.025244833901524544 2023-01-22 16:06:33.894900: step: 644/463, loss: 0.06106661632657051 2023-01-22 16:06:34.499666: step: 646/463, loss: 0.08183084428310394 2023-01-22 16:06:35.070715: step: 648/463, loss: 0.009134301915764809 2023-01-22 16:06:35.678841: step: 650/463, loss: 0.15603183209896088 2023-01-22 16:06:36.307460: step: 652/463, loss: 0.17522603273391724 2023-01-22 16:06:36.937786: step: 654/463, loss: 0.05895433574914932 2023-01-22 16:06:37.557977: step: 656/463, loss: 0.14157070219516754 2023-01-22 16:06:38.125364: step: 658/463, loss: 0.07957937568426132 2023-01-22 16:06:38.669739: step: 660/463, loss: 0.1180666834115982 2023-01-22 16:06:39.276677: step: 662/463, loss: 0.22066418826580048 2023-01-22 16:06:39.865787: step: 664/463, loss: 0.06697670370340347 2023-01-22 16:06:40.475929: step: 666/463, loss: 0.0534360408782959 2023-01-22 16:06:41.121293: step: 668/463, loss: 0.1782628893852234 2023-01-22 16:06:41.747128: step: 670/463, loss: 0.11663768440485 2023-01-22 16:06:42.327880: step: 672/463, loss: 0.18980370461940765 2023-01-22 16:06:42.937235: step: 674/463, loss: 0.21551063656806946 2023-01-22 16:06:43.562373: step: 676/463, loss: 0.39015263319015503 2023-01-22 16:06:44.161008: step: 678/463, loss: 0.06919926404953003 2023-01-22 16:06:44.834385: step: 680/463, loss: 
0.11892160028219223 2023-01-22 16:06:45.500690: step: 682/463, loss: 0.043400838971138 2023-01-22 16:06:46.118376: step: 684/463, loss: 0.03173093870282173 2023-01-22 16:06:46.695704: step: 686/463, loss: 0.14719317853450775 2023-01-22 16:06:47.341291: step: 688/463, loss: 0.1477840542793274 2023-01-22 16:06:47.941907: step: 690/463, loss: 0.03593097999691963 2023-01-22 16:06:48.631187: step: 692/463, loss: 0.03600059077143669 2023-01-22 16:06:49.267190: step: 694/463, loss: 0.07409865409135818 2023-01-22 16:06:49.883494: step: 696/463, loss: 0.0604204498231411 2023-01-22 16:06:50.476734: step: 698/463, loss: 0.02095182240009308 2023-01-22 16:06:51.074021: step: 700/463, loss: 0.05871518701314926 2023-01-22 16:06:51.706151: step: 702/463, loss: 0.04554479196667671 2023-01-22 16:06:52.343863: step: 704/463, loss: 0.05692585930228233 2023-01-22 16:06:53.014953: step: 706/463, loss: 0.0669647753238678 2023-01-22 16:06:53.651771: step: 708/463, loss: 0.038141507655382156 2023-01-22 16:06:54.259931: step: 710/463, loss: 0.09101494401693344 2023-01-22 16:06:54.912253: step: 712/463, loss: 0.023610794916749 2023-01-22 16:06:55.489008: step: 714/463, loss: 0.0933905765414238 2023-01-22 16:06:56.104921: step: 716/463, loss: 0.07494968920946121 2023-01-22 16:06:56.675225: step: 718/463, loss: 0.39095959067344666 2023-01-22 16:06:57.235991: step: 720/463, loss: 0.051917679607868195 2023-01-22 16:06:57.866719: step: 722/463, loss: 0.19290350377559662 2023-01-22 16:06:58.451116: step: 724/463, loss: 0.07401759922504425 2023-01-22 16:06:59.025705: step: 726/463, loss: 0.14822342991828918 2023-01-22 16:06:59.667105: step: 728/463, loss: 0.0428365133702755 2023-01-22 16:07:00.255507: step: 730/463, loss: 0.1205577626824379 2023-01-22 16:07:00.800638: step: 732/463, loss: 0.1607072949409485 2023-01-22 16:07:01.474897: step: 734/463, loss: 0.06043323874473572 2023-01-22 16:07:02.120880: step: 736/463, loss: 0.09878876805305481 2023-01-22 16:07:02.705206: step: 738/463, loss: 
0.36006367206573486 2023-01-22 16:07:03.305508: step: 740/463, loss: 0.09525182843208313 2023-01-22 16:07:03.966335: step: 742/463, loss: 0.20150576531887054 2023-01-22 16:07:04.767045: step: 744/463, loss: 0.11411549896001816 2023-01-22 16:07:05.346142: step: 746/463, loss: 0.0694831907749176 2023-01-22 16:07:05.976718: step: 748/463, loss: 0.21037068963050842 2023-01-22 16:07:06.624594: step: 750/463, loss: 0.34095582365989685 2023-01-22 16:07:07.205819: step: 752/463, loss: 0.12141270935535431 2023-01-22 16:07:07.876572: step: 754/463, loss: 0.35645174980163574 2023-01-22 16:07:08.493077: step: 756/463, loss: 0.0938982143998146 2023-01-22 16:07:09.087006: step: 758/463, loss: 0.0773950070142746 2023-01-22 16:07:09.696403: step: 760/463, loss: 0.08569945394992828 2023-01-22 16:07:10.310666: step: 762/463, loss: 0.17549414932727814 2023-01-22 16:07:10.961744: step: 764/463, loss: 0.06861287355422974 2023-01-22 16:07:11.543858: step: 766/463, loss: 0.049301836639642715 2023-01-22 16:07:12.146499: step: 768/463, loss: 0.6057790517807007 2023-01-22 16:07:12.729135: step: 770/463, loss: 0.2369403839111328 2023-01-22 16:07:13.351957: step: 772/463, loss: 0.14688655734062195 2023-01-22 16:07:13.921385: step: 774/463, loss: 0.05183057859539986 2023-01-22 16:07:14.555765: step: 776/463, loss: 0.1502751260995865 2023-01-22 16:07:15.262172: step: 778/463, loss: 0.7798364162445068 2023-01-22 16:07:15.853338: step: 780/463, loss: 0.08997099101543427 2023-01-22 16:07:16.430384: step: 782/463, loss: 0.08277231454849243 2023-01-22 16:07:17.003697: step: 784/463, loss: 0.0814555287361145 2023-01-22 16:07:17.658492: step: 786/463, loss: 0.20540684461593628 2023-01-22 16:07:18.324054: step: 788/463, loss: 0.052653051912784576 2023-01-22 16:07:18.932472: step: 790/463, loss: 0.024029584601521492 2023-01-22 16:07:19.563422: step: 792/463, loss: 0.20234955847263336 2023-01-22 16:07:20.186909: step: 794/463, loss: 0.09839435666799545 2023-01-22 16:07:20.772596: step: 796/463, loss: 
0.1336684674024582 2023-01-22 16:07:21.288339: step: 798/463, loss: 0.11402252316474915 2023-01-22 16:07:21.877250: step: 800/463, loss: 0.12282467633485794 2023-01-22 16:07:22.504201: step: 802/463, loss: 0.071919746696949 2023-01-22 16:07:23.161705: step: 804/463, loss: 0.16365496814250946 2023-01-22 16:07:23.772994: step: 806/463, loss: 0.05860280618071556 2023-01-22 16:07:24.379650: step: 808/463, loss: 0.10445426404476166 2023-01-22 16:07:25.089519: step: 810/463, loss: 0.08717592805624008 2023-01-22 16:07:25.666900: step: 812/463, loss: 0.04681031405925751 2023-01-22 16:07:26.244229: step: 814/463, loss: 0.08544857054948807 2023-01-22 16:07:26.870917: step: 816/463, loss: 0.18413056433200836 2023-01-22 16:07:27.504663: step: 818/463, loss: 2.1293177604675293 2023-01-22 16:07:28.075105: step: 820/463, loss: 0.06556117534637451 2023-01-22 16:07:28.682263: step: 822/463, loss: 0.2278032898902893 2023-01-22 16:07:29.295138: step: 824/463, loss: 0.03010297566652298 2023-01-22 16:07:29.967443: step: 826/463, loss: 0.0988142117857933 2023-01-22 16:07:30.553615: step: 828/463, loss: 0.1002720296382904 2023-01-22 16:07:31.150707: step: 830/463, loss: 0.49521881341934204 2023-01-22 16:07:31.755846: step: 832/463, loss: 0.0708225890994072 2023-01-22 16:07:32.338834: step: 834/463, loss: 0.036177925765514374 2023-01-22 16:07:32.893858: step: 836/463, loss: 0.07262350618839264 2023-01-22 16:07:33.490046: step: 838/463, loss: 0.1388338953256607 2023-01-22 16:07:34.145124: step: 840/463, loss: 0.08056412637233734 2023-01-22 16:07:34.743816: step: 842/463, loss: 0.32007482647895813 2023-01-22 16:07:35.429814: step: 844/463, loss: 0.15999531745910645 2023-01-22 16:07:36.024379: step: 846/463, loss: 0.02953474409878254 2023-01-22 16:07:36.607166: step: 848/463, loss: 0.09596318006515503 2023-01-22 16:07:37.176497: step: 850/463, loss: 0.08314316719770432 2023-01-22 16:07:37.797873: step: 852/463, loss: 0.0502016507089138 2023-01-22 16:07:38.477919: step: 854/463, loss: 
0.1929430216550827 2023-01-22 16:07:39.084702: step: 856/463, loss: 0.19157880544662476 2023-01-22 16:07:39.773551: step: 858/463, loss: 0.17502203583717346 2023-01-22 16:07:40.345311: step: 860/463, loss: 0.2360767275094986 2023-01-22 16:07:40.937329: step: 862/463, loss: 0.1421293020248413 2023-01-22 16:07:41.551025: step: 864/463, loss: 0.30775728821754456 2023-01-22 16:07:42.134496: step: 866/463, loss: 0.8474909067153931 2023-01-22 16:07:42.702135: step: 868/463, loss: 0.052784815430641174 2023-01-22 16:07:43.290167: step: 870/463, loss: 0.06910984218120575 2023-01-22 16:07:43.855943: step: 872/463, loss: 0.20508794486522675 2023-01-22 16:07:44.477551: step: 874/463, loss: 0.05806577950716019 2023-01-22 16:07:45.112158: step: 876/463, loss: 0.2227485030889511 2023-01-22 16:07:45.691676: step: 878/463, loss: 0.19729956984519958 2023-01-22 16:07:46.299029: step: 880/463, loss: 0.3305603563785553 2023-01-22 16:07:46.898043: step: 882/463, loss: 0.13329380750656128 2023-01-22 16:07:47.488917: step: 884/463, loss: 0.5031425952911377 2023-01-22 16:07:48.150550: step: 886/463, loss: 0.22164581716060638 2023-01-22 16:07:48.813016: step: 888/463, loss: 0.21471001207828522 2023-01-22 16:07:49.479612: step: 890/463, loss: 0.09156519174575806 2023-01-22 16:07:50.261511: step: 892/463, loss: 0.33704155683517456 2023-01-22 16:07:50.857284: step: 894/463, loss: 0.058905228972435 2023-01-22 16:07:51.523172: step: 896/463, loss: 0.1745324730873108 2023-01-22 16:07:52.145422: step: 898/463, loss: 0.1387093961238861 2023-01-22 16:07:52.764094: step: 900/463, loss: 0.28719469904899597 2023-01-22 16:07:53.354601: step: 902/463, loss: 1.606732726097107 2023-01-22 16:07:54.016816: step: 904/463, loss: 0.08200130611658096 2023-01-22 16:07:54.631807: step: 906/463, loss: 0.07422398030757904 2023-01-22 16:07:55.245505: step: 908/463, loss: 0.23328106105327606 2023-01-22 16:07:55.854644: step: 910/463, loss: 0.08103915303945541 2023-01-22 16:07:56.544465: step: 912/463, loss: 
0.07995246350765228 2023-01-22 16:07:57.254743: step: 914/463, loss: 0.20070500671863556 2023-01-22 16:07:57.780538: step: 916/463, loss: 0.19247238337993622 2023-01-22 16:07:58.385675: step: 918/463, loss: 0.34530478715896606 2023-01-22 16:07:58.945344: step: 920/463, loss: 0.08104103803634644 2023-01-22 16:07:59.517504: step: 922/463, loss: 0.03408876433968544 2023-01-22 16:08:00.118710: step: 924/463, loss: 0.07664431631565094 2023-01-22 16:08:00.715157: step: 926/463, loss: 0.17905595898628235
==================================================
Loss: 0.194
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2942608615611193, 'r': 0.324971198156682, 'f1': 0.30885450212546695}, 'combined': 0.22757700156613353, 'epoch': 13}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3307754597605348, 'r': 0.30275343391182574, 'f1': 0.31614471667035154}, 'combined': 0.22241336851180513, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.297223624432105, 'r': 0.3192193006234752, 'f1': 0.3078290419552999}, 'combined': 0.22682139933548415, 'epoch': 13}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.33226810168539556, 'r': 0.30092752964869446, 'f1': 0.31582220114368026}, 'combined': 0.22423376281201296, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30099634384879564, 'r': 0.326126968382661, 'f1': 0.31305812811960354}, 'combined': 0.23067441019339208, 'epoch': 13}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3543294617812976, 'r': 0.29646080732443936, 'f1': 0.32282227711505757}, 'combined': 0.22920381675169085, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2540849673202614, 'r': 0.3702380952380952, 'f1': 0.3013565891472868}, 'combined': 0.20090439276485786, 'epoch': 13}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2361111111111111, 'r': 0.3695652173913043, 'f1': 0.28813559322033905}, 'combined': 0.14406779661016952, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.359375, 'r': 0.19827586206896552, 'f1': 0.2555555555555556}, 'combined': 0.1703703703703704, 'epoch': 13}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
******************************
Epoch: 14
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:10:34.385957: step: 2/463, loss: 0.07675016671419144 2023-01-22 16:10:35.002164: step: 4/463, loss: 0.0655759647488594 2023-01-22 16:10:35.589609: step: 6/463, loss: 0.09995383769273758 2023-01-22 16:10:36.218221: step: 8/463, loss: 0.060447946190834045 2023-01-22 16:10:36.818664: step: 10/463, loss: 0.5903008580207825 2023-01-22 16:10:37.375107: step: 12/463, loss: 0.08980602025985718 2023-01-22 16:10:38.028046: step: 14/463, loss: 0.1881512999534607 2023-01-22 16:10:38.631817: step: 16/463, loss: 0.10872343182563782 2023-01-22 16:10:39.229557: step: 18/463, loss: 0.2210412323474884 2023-01-22 16:10:39.952132: step: 20/463, loss: 0.045862190425395966 2023-01-22 16:10:40.603086: step: 22/463, loss: 0.0967811718583107 2023-01-22 16:10:41.232758: step: 24/463, loss: 1.0553267002105713 2023-01-22 16:10:41.985436: step: 26/463, loss: 0.0816500261425972 2023-01-22 16:10:42.661952: step: 28/463, loss: 0.07011556625366211 2023-01-22 16:10:43.286206: step: 30/463, loss: 0.06947368383407593 2023-01-22 16:10:43.923145: step: 32/463, loss: 0.4960462749004364 2023-01-22 16:10:44.589430: step:
34/463, loss: 0.1009887084364891 2023-01-22 16:10:45.177504: step: 36/463, loss: 0.37124907970428467 2023-01-22 16:10:45.802944: step: 38/463, loss: 0.03248896449804306 2023-01-22 16:10:46.461529: step: 40/463, loss: 0.014292052015662193 2023-01-22 16:10:47.076776: step: 42/463, loss: 0.0409284271299839 2023-01-22 16:10:47.704054: step: 44/463, loss: 0.13571470975875854 2023-01-22 16:10:48.391031: step: 46/463, loss: 0.02362903393805027 2023-01-22 16:10:48.944082: step: 48/463, loss: 0.10219025611877441 2023-01-22 16:10:49.569071: step: 50/463, loss: 0.23743271827697754 2023-01-22 16:10:50.173613: step: 52/463, loss: 0.11375357210636139 2023-01-22 16:10:50.747732: step: 54/463, loss: 0.05029996857047081 2023-01-22 16:10:51.344049: step: 56/463, loss: 0.01849927008152008 2023-01-22 16:10:51.978125: step: 58/463, loss: 1.5893388986587524 2023-01-22 16:10:52.541800: step: 60/463, loss: 0.22931115329265594 2023-01-22 16:10:53.129066: step: 62/463, loss: 0.5360106229782104 2023-01-22 16:10:53.747514: step: 64/463, loss: 0.09395391494035721 2023-01-22 16:10:54.330815: step: 66/463, loss: 0.2668137550354004 2023-01-22 16:10:55.011772: step: 68/463, loss: 0.10263574123382568 2023-01-22 16:10:55.638236: step: 70/463, loss: 0.05809789150953293 2023-01-22 16:10:56.287574: step: 72/463, loss: 0.22411370277404785 2023-01-22 16:10:56.973161: step: 74/463, loss: 0.06131374463438988 2023-01-22 16:10:57.592309: step: 76/463, loss: 0.14516018331050873 2023-01-22 16:10:58.196923: step: 78/463, loss: 0.03852732852101326 2023-01-22 16:10:58.780801: step: 80/463, loss: 0.0500032939016819 2023-01-22 16:10:59.358080: step: 82/463, loss: 0.03190518170595169 2023-01-22 16:10:59.980250: step: 84/463, loss: 0.12622998654842377 2023-01-22 16:11:00.640982: step: 86/463, loss: 0.15331420302391052 2023-01-22 16:11:01.251501: step: 88/463, loss: 0.1223493218421936 2023-01-22 16:11:01.872746: step: 90/463, loss: 0.10708821564912796 2023-01-22 16:11:02.450556: step: 92/463, loss: 0.07433609664440155 
2023-01-22 16:11:03.052824: step: 94/463, loss: 0.02233053557574749 2023-01-22 16:11:03.702229: step: 96/463, loss: 0.6897002458572388 2023-01-22 16:11:04.444477: step: 98/463, loss: 0.04515305534005165 2023-01-22 16:11:05.026714: step: 100/463, loss: 0.5288664698600769 2023-01-22 16:11:05.612209: step: 102/463, loss: 0.043589506298303604 2023-01-22 16:11:06.265588: step: 104/463, loss: 0.06514944136142731 2023-01-22 16:11:06.847947: step: 106/463, loss: 0.01465211994946003 2023-01-22 16:11:07.440811: step: 108/463, loss: 0.16882802546024323 2023-01-22 16:11:08.001762: step: 110/463, loss: 0.07505299150943756 2023-01-22 16:11:08.631417: step: 112/463, loss: 0.03227434307336807 2023-01-22 16:11:09.264319: step: 114/463, loss: 0.06178808584809303 2023-01-22 16:11:09.860992: step: 116/463, loss: 0.17417891323566437 2023-01-22 16:11:10.478835: step: 118/463, loss: 0.3948008716106415 2023-01-22 16:11:11.051599: step: 120/463, loss: 0.14869239926338196 2023-01-22 16:11:11.682015: step: 122/463, loss: 0.025712991133332253 2023-01-22 16:11:12.289590: step: 124/463, loss: 0.20092050731182098 2023-01-22 16:11:12.889986: step: 126/463, loss: 0.12440460920333862 2023-01-22 16:11:13.536383: step: 128/463, loss: 0.11312313377857208 2023-01-22 16:11:14.169119: step: 130/463, loss: 0.19745159149169922 2023-01-22 16:11:14.835965: step: 132/463, loss: 0.07721090316772461 2023-01-22 16:11:15.431324: step: 134/463, loss: 0.2224597930908203 2023-01-22 16:11:16.022838: step: 136/463, loss: 0.07191826403141022 2023-01-22 16:11:16.644897: step: 138/463, loss: 0.07716206461191177 2023-01-22 16:11:17.256051: step: 140/463, loss: 0.11833931505680084 2023-01-22 16:11:17.981649: step: 142/463, loss: 0.10202905535697937 2023-01-22 16:11:18.521075: step: 144/463, loss: 0.021765589714050293 2023-01-22 16:11:19.146862: step: 146/463, loss: 0.21968525648117065 2023-01-22 16:11:19.784641: step: 148/463, loss: 0.13405713438987732 2023-01-22 16:11:20.356899: step: 150/463, loss: 0.05521602928638458 
2023-01-22 16:11:20.887565: step: 152/463, loss: 0.12047287821769714 2023-01-22 16:11:21.509094: step: 154/463, loss: 0.14871466159820557 2023-01-22 16:11:22.115822: step: 156/463, loss: 0.2203676849603653 2023-01-22 16:11:22.767277: step: 158/463, loss: 0.13098083436489105 2023-01-22 16:11:23.310564: step: 160/463, loss: 0.12420705705881119 2023-01-22 16:11:23.827179: step: 162/463, loss: 0.07811812311410904 2023-01-22 16:11:24.432678: step: 164/463, loss: 0.18690241873264313 2023-01-22 16:11:25.041233: step: 166/463, loss: 0.02774176001548767 2023-01-22 16:11:25.585677: step: 168/463, loss: 0.039896685630083084 2023-01-22 16:11:26.206343: step: 170/463, loss: 0.04389895498752594 2023-01-22 16:11:26.907469: step: 172/463, loss: 0.6000957489013672 2023-01-22 16:11:27.571853: step: 174/463, loss: 0.1957181990146637 2023-01-22 16:11:28.153002: step: 176/463, loss: 0.31003808975219727 2023-01-22 16:11:28.698446: step: 178/463, loss: 0.05648312345147133 2023-01-22 16:11:29.326549: step: 180/463, loss: 0.24655158817768097 2023-01-22 16:11:29.881255: step: 182/463, loss: 0.2291954904794693 2023-01-22 16:11:30.495132: step: 184/463, loss: 0.09525701403617859 2023-01-22 16:11:31.101735: step: 186/463, loss: 0.05339603126049042 2023-01-22 16:11:31.710846: step: 188/463, loss: 0.15207292139530182 2023-01-22 16:11:32.325950: step: 190/463, loss: 0.4186971187591553 2023-01-22 16:11:32.966058: step: 192/463, loss: 0.06452116370201111 2023-01-22 16:11:33.577678: step: 194/463, loss: 0.03177981451153755 2023-01-22 16:11:34.233955: step: 196/463, loss: 0.13602229952812195 2023-01-22 16:11:34.784298: step: 198/463, loss: 0.10965681076049805 2023-01-22 16:11:35.334488: step: 200/463, loss: 0.08400817960500717 2023-01-22 16:11:35.954702: step: 202/463, loss: 0.06919145584106445 2023-01-22 16:11:36.584997: step: 204/463, loss: 0.20422585308551788 2023-01-22 16:11:37.230485: step: 206/463, loss: 0.08052266389131546 2023-01-22 16:11:37.800203: step: 208/463, loss: 0.016823094338178635 
2023-01-22 16:11:38.422868: step: 210/463, loss: 0.15973834693431854 2023-01-22 16:11:38.963844: step: 212/463, loss: 0.05912219360470772 2023-01-22 16:11:39.674811: step: 214/463, loss: 0.09284861385822296 2023-01-22 16:11:40.275430: step: 216/463, loss: 0.4221577048301697 2023-01-22 16:11:40.845856: step: 218/463, loss: 0.023563319817185402 2023-01-22 16:11:41.441337: step: 220/463, loss: 0.05988965183496475 2023-01-22 16:11:42.042297: step: 222/463, loss: 0.24863365292549133 2023-01-22 16:11:42.625888: step: 224/463, loss: 0.2983083724975586 2023-01-22 16:11:43.205316: step: 226/463, loss: 0.40733420848846436 2023-01-22 16:11:43.831125: step: 228/463, loss: 0.19234538078308105 2023-01-22 16:11:44.422830: step: 230/463, loss: 0.092800572514534 2023-01-22 16:11:44.995886: step: 232/463, loss: 0.5629388689994812 2023-01-22 16:11:45.634198: step: 234/463, loss: 0.035517431795597076 2023-01-22 16:11:46.240397: step: 236/463, loss: 0.14609943330287933 2023-01-22 16:11:46.861471: step: 238/463, loss: 0.01640402339398861 2023-01-22 16:11:47.508482: step: 240/463, loss: 0.32826536893844604 2023-01-22 16:11:48.077037: step: 242/463, loss: 0.05212201178073883 2023-01-22 16:11:48.702122: step: 244/463, loss: 0.23895691335201263 2023-01-22 16:11:49.285578: step: 246/463, loss: 0.11717447638511658 2023-01-22 16:11:49.881696: step: 248/463, loss: 0.23320214450359344 2023-01-22 16:11:50.538540: step: 250/463, loss: 0.023238496854901314 2023-01-22 16:11:51.134776: step: 252/463, loss: 0.22174282371997833 2023-01-22 16:11:51.736076: step: 254/463, loss: 0.3321410119533539 2023-01-22 16:11:52.363273: step: 256/463, loss: 0.0555860809981823 2023-01-22 16:11:52.999039: step: 258/463, loss: 0.11208189278841019 2023-01-22 16:11:53.613662: step: 260/463, loss: 0.41557398438453674 2023-01-22 16:11:54.263464: step: 262/463, loss: 0.1462109088897705 2023-01-22 16:11:54.800382: step: 264/463, loss: 0.22726233303546906 2023-01-22 16:11:55.372674: step: 266/463, loss: 0.0982743352651596 
2023-01-22 16:11:56.032100: step: 268/463, loss: 0.07731557637453079 2023-01-22 16:11:56.639151: step: 270/463, loss: 0.11731568723917007 2023-01-22 16:11:57.206766: step: 272/463, loss: 0.019450610503554344 2023-01-22 16:11:57.833144: step: 274/463, loss: 0.061928123235702515 2023-01-22 16:11:58.463314: step: 276/463, loss: 0.11042152345180511 2023-01-22 16:11:59.028894: step: 278/463, loss: 0.03382476046681404 2023-01-22 16:11:59.718635: step: 280/463, loss: 0.0794856920838356 2023-01-22 16:12:00.400649: step: 282/463, loss: 0.06790058314800262 2023-01-22 16:12:00.997439: step: 284/463, loss: 0.05404338985681534 2023-01-22 16:12:01.608574: step: 286/463, loss: 0.38700202107429504 2023-01-22 16:12:02.221975: step: 288/463, loss: 0.07135241478681564 2023-01-22 16:12:02.829327: step: 290/463, loss: 0.016554202884435654 2023-01-22 16:12:03.421412: step: 292/463, loss: 0.09790871292352676 2023-01-22 16:12:04.050204: step: 294/463, loss: 0.1203695610165596 2023-01-22 16:12:04.643901: step: 296/463, loss: 0.01246445719152689 2023-01-22 16:12:05.262605: step: 298/463, loss: 0.09969411790370941 2023-01-22 16:12:05.926566: step: 300/463, loss: 0.7231708765029907 2023-01-22 16:12:06.601309: step: 302/463, loss: 0.029620669782161713 2023-01-22 16:12:07.252350: step: 304/463, loss: 0.45540761947631836 2023-01-22 16:12:07.878870: step: 306/463, loss: 0.07372794300317764 2023-01-22 16:12:08.545751: step: 308/463, loss: 0.057653073221445084 2023-01-22 16:12:09.070785: step: 310/463, loss: 0.0717100128531456 2023-01-22 16:12:09.700955: step: 312/463, loss: 0.06671425700187683 2023-01-22 16:12:10.296412: step: 314/463, loss: 0.05578017979860306 2023-01-22 16:12:10.918813: step: 316/463, loss: 0.049616459757089615 2023-01-22 16:12:11.487397: step: 318/463, loss: 0.053819023072719574 2023-01-22 16:12:12.065977: step: 320/463, loss: 0.08290984481573105 2023-01-22 16:12:12.640261: step: 322/463, loss: 0.02363448217511177 2023-01-22 16:12:13.245813: step: 324/463, loss: 
0.5781065225601196 2023-01-22 16:12:13.852012: step: 326/463, loss: 0.0820005014538765 2023-01-22 16:12:14.512666: step: 328/463, loss: 0.0636850893497467 2023-01-22 16:12:15.152147: step: 330/463, loss: 0.1899038553237915 2023-01-22 16:12:15.800506: step: 332/463, loss: 0.01569054089486599 2023-01-22 16:12:16.401992: step: 334/463, loss: 0.20446738600730896 2023-01-22 16:12:17.014511: step: 336/463, loss: 0.158631831407547 2023-01-22 16:12:17.605120: step: 338/463, loss: 0.0439680814743042 2023-01-22 16:12:18.212993: step: 340/463, loss: 0.5541055202484131 2023-01-22 16:12:18.793060: step: 342/463, loss: 0.11805849522352219 2023-01-22 16:12:19.517555: step: 344/463, loss: 0.011548774316906929 2023-01-22 16:12:20.084838: step: 346/463, loss: 0.037358131259679794 2023-01-22 16:12:20.714135: step: 348/463, loss: 0.08829781413078308 2023-01-22 16:12:21.277717: step: 350/463, loss: 0.029557185247540474 2023-01-22 16:12:21.843570: step: 352/463, loss: 0.093375064432621 2023-01-22 16:12:22.405189: step: 354/463, loss: 0.2966417968273163 2023-01-22 16:12:22.981990: step: 356/463, loss: 0.10732920467853546 2023-01-22 16:12:23.636662: step: 358/463, loss: 0.09376738965511322 2023-01-22 16:12:24.186965: step: 360/463, loss: 0.037833139300346375 2023-01-22 16:12:24.804898: step: 362/463, loss: 0.02954859286546707 2023-01-22 16:12:25.408614: step: 364/463, loss: 0.01586347632110119 2023-01-22 16:12:26.037570: step: 366/463, loss: 0.16085737943649292 2023-01-22 16:12:26.637105: step: 368/463, loss: 0.18782326579093933 2023-01-22 16:12:27.262932: step: 370/463, loss: 0.3899107873439789 2023-01-22 16:12:27.856340: step: 372/463, loss: 0.13393762707710266 2023-01-22 16:12:28.464707: step: 374/463, loss: 0.037211667746305466 2023-01-22 16:12:29.097554: step: 376/463, loss: 0.12075044214725494 2023-01-22 16:12:29.665034: step: 378/463, loss: 0.0870567262172699 2023-01-22 16:12:30.250918: step: 380/463, loss: 0.6839930415153503 2023-01-22 16:12:30.846542: step: 382/463, loss: 
0.09883228689432144 2023-01-22 16:12:31.440760: step: 384/463, loss: 0.13694140315055847 2023-01-22 16:12:32.077682: step: 386/463, loss: 0.05821835994720459 2023-01-22 16:12:32.692160: step: 388/463, loss: 0.07449167966842651 2023-01-22 16:12:33.318937: step: 390/463, loss: 0.09761625528335571 2023-01-22 16:12:33.874395: step: 392/463, loss: 0.09591581672430038 2023-01-22 16:12:34.554062: step: 394/463, loss: 0.015023921616375446 2023-01-22 16:12:35.186024: step: 396/463, loss: 0.06875688582658768 2023-01-22 16:12:35.763359: step: 398/463, loss: 2.0661609172821045 2023-01-22 16:12:36.386787: step: 400/463, loss: 0.08367481827735901 2023-01-22 16:12:37.014658: step: 402/463, loss: 0.09931520372629166 2023-01-22 16:12:37.587520: step: 404/463, loss: 0.030169380828738213 2023-01-22 16:12:38.193137: step: 406/463, loss: 0.06585224717855453 2023-01-22 16:12:38.811924: step: 408/463, loss: 0.2315617799758911 2023-01-22 16:12:39.417831: step: 410/463, loss: 0.02580193243920803 2023-01-22 16:12:39.958855: step: 412/463, loss: 0.029623661190271378 2023-01-22 16:12:40.536470: step: 414/463, loss: 0.0852564200758934 2023-01-22 16:12:41.120359: step: 416/463, loss: 0.1628587245941162 2023-01-22 16:12:41.721147: step: 418/463, loss: 0.09467393904924393 2023-01-22 16:12:42.388095: step: 420/463, loss: 0.17455348372459412 2023-01-22 16:12:42.993057: step: 422/463, loss: 0.1355740875005722 2023-01-22 16:12:43.599013: step: 424/463, loss: 0.157327800989151 2023-01-22 16:12:44.182044: step: 426/463, loss: 0.0904545709490776 2023-01-22 16:12:44.760923: step: 428/463, loss: 0.22193805873394012 2023-01-22 16:12:45.400694: step: 430/463, loss: 0.03038228303194046 2023-01-22 16:12:46.055468: step: 432/463, loss: 0.10699906945228577 2023-01-22 16:12:46.663388: step: 434/463, loss: 0.30278247594833374 2023-01-22 16:12:47.330226: step: 436/463, loss: 0.03408712148666382 2023-01-22 16:12:47.961771: step: 438/463, loss: 0.06657236069440842 2023-01-22 16:12:48.570948: step: 440/463, loss: 
0.3224368691444397 2023-01-22 16:12:49.290216: step: 442/463, loss: 0.03749328851699829 2023-01-22 16:12:49.970623: step: 444/463, loss: 0.06525817513465881 2023-01-22 16:12:50.604820: step: 446/463, loss: 0.032084498554468155 2023-01-22 16:12:51.218185: step: 448/463, loss: 0.07130036503076553 2023-01-22 16:12:51.880194: step: 450/463, loss: 0.09904330223798752 2023-01-22 16:12:52.422341: step: 452/463, loss: 0.17401240766048431 2023-01-22 16:12:53.029279: step: 454/463, loss: 0.10478309541940689 2023-01-22 16:12:53.648482: step: 456/463, loss: 0.1700463891029358 2023-01-22 16:12:54.216543: step: 458/463, loss: 0.07533483952283859 2023-01-22 16:12:54.865183: step: 460/463, loss: 0.4262741506099701 2023-01-22 16:12:55.496215: step: 462/463, loss: 0.1497563123703003 2023-01-22 16:12:56.077514: step: 464/463, loss: 0.039665792137384415 2023-01-22 16:12:56.674138: step: 466/463, loss: 0.025088772177696228 2023-01-22 16:12:57.269265: step: 468/463, loss: 0.1001342162489891 2023-01-22 16:12:57.852044: step: 470/463, loss: 0.0783759355545044 2023-01-22 16:12:58.454334: step: 472/463, loss: 0.02220998704433441 2023-01-22 16:12:59.131235: step: 474/463, loss: 0.17247097194194794 2023-01-22 16:12:59.688237: step: 476/463, loss: 0.035394519567489624 2023-01-22 16:13:00.295330: step: 478/463, loss: 0.00604554358869791 2023-01-22 16:13:00.869704: step: 480/463, loss: 0.16228698194026947 2023-01-22 16:13:01.508532: step: 482/463, loss: 0.033223140984773636 2023-01-22 16:13:02.113417: step: 484/463, loss: 0.11758869141340256 2023-01-22 16:13:02.661579: step: 486/463, loss: 0.04347692057490349 2023-01-22 16:13:03.294071: step: 488/463, loss: 0.05089852958917618 2023-01-22 16:13:03.913921: step: 490/463, loss: 0.03744899109005928 2023-01-22 16:13:04.552322: step: 492/463, loss: 0.2863737642765045 2023-01-22 16:13:05.120798: step: 494/463, loss: 0.05894947052001953 2023-01-22 16:13:05.689994: step: 496/463, loss: 0.030051682144403458 2023-01-22 16:13:06.257656: step: 498/463, loss: 
0.051100365817546844 2023-01-22 16:13:06.831391: step: 500/463, loss: 0.0821952074766159 2023-01-22 16:13:07.439996: step: 502/463, loss: 1.6353161334991455 2023-01-22 16:13:08.106977: step: 504/463, loss: 0.11418363451957703 2023-01-22 16:13:08.728020: step: 506/463, loss: 0.14513860642910004 2023-01-22 16:13:09.344959: step: 508/463, loss: 0.3175228238105774 2023-01-22 16:13:09.957149: step: 510/463, loss: 0.3933655619621277 2023-01-22 16:13:10.544602: step: 512/463, loss: 0.22541390359401703 2023-01-22 16:13:11.161550: step: 514/463, loss: 0.06574596464633942 2023-01-22 16:13:11.770362: step: 516/463, loss: 0.02783725969493389 2023-01-22 16:13:12.447264: step: 518/463, loss: 0.07209238409996033 2023-01-22 16:13:13.155936: step: 520/463, loss: 0.13113972544670105 2023-01-22 16:13:13.709787: step: 522/463, loss: 0.1848844736814499 2023-01-22 16:13:14.312412: step: 524/463, loss: 0.01669367030262947 2023-01-22 16:13:14.890930: step: 526/463, loss: 0.006216236390173435 2023-01-22 16:13:15.533330: step: 528/463, loss: 0.10297949612140656 2023-01-22 16:13:16.177532: step: 530/463, loss: 0.12939968705177307 2023-01-22 16:13:16.835677: step: 532/463, loss: 0.09584011137485504 2023-01-22 16:13:17.427785: step: 534/463, loss: 0.10711297392845154 2023-01-22 16:13:18.065984: step: 536/463, loss: 0.032669201493263245 2023-01-22 16:13:18.626750: step: 538/463, loss: 0.2952217757701874 2023-01-22 16:13:19.200190: step: 540/463, loss: 1.1257184743881226 2023-01-22 16:13:19.867980: step: 542/463, loss: 0.07294858992099762 2023-01-22 16:13:20.438152: step: 544/463, loss: 0.10482457280158997 2023-01-22 16:13:20.987633: step: 546/463, loss: 0.0253711249679327 2023-01-22 16:13:21.564999: step: 548/463, loss: 1.6263724565505981 2023-01-22 16:13:22.207116: step: 550/463, loss: 0.034483011811971664 2023-01-22 16:13:22.824432: step: 552/463, loss: 0.09596127271652222 2023-01-22 16:13:23.469218: step: 554/463, loss: 0.16073022782802582 2023-01-22 16:13:24.052501: step: 556/463, loss: 
0.08733044564723969 2023-01-22 16:13:24.603023: step: 558/463, loss: 0.09979964047670364 2023-01-22 16:13:25.207902: step: 560/463, loss: 0.0774931088089943 2023-01-22 16:13:25.771191: step: 562/463, loss: 0.02809038758277893 2023-01-22 16:13:26.345516: step: 564/463, loss: 0.056804392486810684 2023-01-22 16:13:26.903478: step: 566/463, loss: 0.03309309110045433 2023-01-22 16:13:27.557943: step: 568/463, loss: 0.13115225732326508 2023-01-22 16:13:28.165604: step: 570/463, loss: 0.17480210959911346 2023-01-22 16:13:28.697062: step: 572/463, loss: 0.1409579962491989 2023-01-22 16:13:29.307751: step: 574/463, loss: 0.058926355093717575 2023-01-22 16:13:29.978443: step: 576/463, loss: 0.3714093267917633 2023-01-22 16:13:30.550907: step: 578/463, loss: 0.09792101383209229 2023-01-22 16:13:31.183664: step: 580/463, loss: 0.019714193418622017 2023-01-22 16:13:31.804122: step: 582/463, loss: 0.07044295966625214 2023-01-22 16:13:32.384245: step: 584/463, loss: 0.04504678398370743 2023-01-22 16:13:33.013457: step: 586/463, loss: 0.06456266343593597 2023-01-22 16:13:33.607676: step: 588/463, loss: 0.057791970670223236 2023-01-22 16:13:34.284429: step: 590/463, loss: 0.31348758935928345 2023-01-22 16:13:34.941758: step: 592/463, loss: 0.08264046162366867 2023-01-22 16:13:35.522948: step: 594/463, loss: 0.4257124066352844 2023-01-22 16:13:36.106239: step: 596/463, loss: 0.08312836289405823 2023-01-22 16:13:36.858342: step: 598/463, loss: 0.25434771180152893 2023-01-22 16:13:37.586245: step: 600/463, loss: 0.07449564337730408 2023-01-22 16:13:38.215421: step: 602/463, loss: 0.14531761407852173 2023-01-22 16:13:38.780198: step: 604/463, loss: 0.08886903524398804 2023-01-22 16:13:39.367844: step: 606/463, loss: 0.14306092262268066 2023-01-22 16:13:40.001615: step: 608/463, loss: 0.2427092045545578 2023-01-22 16:13:40.579150: step: 610/463, loss: 0.05103425681591034 2023-01-22 16:13:41.199747: step: 612/463, loss: 0.061839353293180466 2023-01-22 16:13:41.805859: step: 614/463, 
loss: 0.04441489651799202 2023-01-22 16:13:42.413404: step: 616/463, loss: 0.03961002454161644 2023-01-22 16:13:42.997712: step: 618/463, loss: 0.16857494413852692 2023-01-22 16:13:43.631482: step: 620/463, loss: 0.03683756664395332 2023-01-22 16:13:44.196582: step: 622/463, loss: 0.21234336495399475 2023-01-22 16:13:44.826871: step: 624/463, loss: 0.0774833932518959 2023-01-22 16:13:45.461657: step: 626/463, loss: 0.07937804609537125 2023-01-22 16:13:46.101552: step: 628/463, loss: 0.03992748260498047 2023-01-22 16:13:46.637531: step: 630/463, loss: 0.06797576695680618 2023-01-22 16:13:47.196357: step: 632/463, loss: 0.08546434342861176 2023-01-22 16:13:47.778296: step: 634/463, loss: 0.11566168814897537 2023-01-22 16:13:48.417797: step: 636/463, loss: 0.08628936111927032 2023-01-22 16:13:48.990888: step: 638/463, loss: 0.011403496377170086 2023-01-22 16:13:49.618815: step: 640/463, loss: 0.07936941832304001 2023-01-22 16:13:50.230744: step: 642/463, loss: 0.07581977546215057 2023-01-22 16:13:50.901942: step: 644/463, loss: 0.041407909244298935 2023-01-22 16:13:51.502924: step: 646/463, loss: 0.001550393528304994 2023-01-22 16:13:52.128841: step: 648/463, loss: 0.033705439418554306 2023-01-22 16:13:52.790714: step: 650/463, loss: 0.060700614005327225 2023-01-22 16:13:53.392902: step: 652/463, loss: 0.06698837131261826 2023-01-22 16:13:53.968954: step: 654/463, loss: 0.30131250619888306 2023-01-22 16:13:54.587975: step: 656/463, loss: 0.028993036597967148 2023-01-22 16:13:55.206371: step: 658/463, loss: 0.06961779296398163 2023-01-22 16:13:55.833205: step: 660/463, loss: 0.09118983894586563 2023-01-22 16:13:56.508085: step: 662/463, loss: 0.020568886771798134 2023-01-22 16:13:57.144581: step: 664/463, loss: 1.1154916286468506 2023-01-22 16:13:57.753640: step: 666/463, loss: 0.16427204012870789 2023-01-22 16:13:58.352115: step: 668/463, loss: 0.04682648554444313 2023-01-22 16:13:58.973754: step: 670/463, loss: 0.17305916547775269 2023-01-22 16:13:59.603265: step: 
672/463, loss: 0.050504691898822784 2023-01-22 16:14:00.177535: step: 674/463, loss: 0.0666716918349266 2023-01-22 16:14:00.776589: step: 676/463, loss: 0.17144875228405 2023-01-22 16:14:01.375846: step: 678/463, loss: 0.05541646108031273 2023-01-22 16:14:02.071174: step: 680/463, loss: 0.34367406368255615 2023-01-22 16:14:02.679715: step: 682/463, loss: 0.03627133369445801 2023-01-22 16:14:03.291054: step: 684/463, loss: 0.16341890394687653 2023-01-22 16:14:03.871944: step: 686/463, loss: 0.14793576300144196 2023-01-22 16:14:04.508700: step: 688/463, loss: 0.1131996214389801 2023-01-22 16:14:05.090359: step: 690/463, loss: 0.06781991571187973 2023-01-22 16:14:05.691887: step: 692/463, loss: 0.09343595057725906 2023-01-22 16:14:06.290475: step: 694/463, loss: 0.4842890799045563 2023-01-22 16:14:06.886184: step: 696/463, loss: 0.06069215014576912 2023-01-22 16:14:07.493205: step: 698/463, loss: 0.12126719206571579 2023-01-22 16:14:08.084333: step: 700/463, loss: 0.05985962226986885 2023-01-22 16:14:08.702654: step: 702/463, loss: 0.03357599303126335 2023-01-22 16:14:09.339311: step: 704/463, loss: 0.09971017390489578 2023-01-22 16:14:09.998988: step: 706/463, loss: 0.6043343544006348 2023-01-22 16:14:10.645071: step: 708/463, loss: 0.12592804431915283 2023-01-22 16:14:11.349643: step: 710/463, loss: 0.3056861162185669 2023-01-22 16:14:11.968302: step: 712/463, loss: 0.05643009394407272 2023-01-22 16:14:12.642281: step: 714/463, loss: 0.307606965303421 2023-01-22 16:14:13.277736: step: 716/463, loss: 0.03770419582724571 2023-01-22 16:14:13.920511: step: 718/463, loss: 0.07289402186870575 2023-01-22 16:14:14.573818: step: 720/463, loss: 0.02534320205450058 2023-01-22 16:14:15.182747: step: 722/463, loss: 0.19143863022327423 2023-01-22 16:14:15.749341: step: 724/463, loss: 0.05020402371883392 2023-01-22 16:14:16.328081: step: 726/463, loss: 0.16998742520809174 2023-01-22 16:14:16.892985: step: 728/463, loss: 0.11466042697429657 2023-01-22 16:14:17.485055: step: 
730/463, loss: 0.02007829211652279 2023-01-22 16:14:18.079298: step: 732/463, loss: 0.24262700974941254 2023-01-22 16:14:18.721169: step: 734/463, loss: 0.0720750167965889 2023-01-22 16:14:19.289042: step: 736/463, loss: 0.1819390058517456 2023-01-22 16:14:19.898853: step: 738/463, loss: 0.09077084064483643 2023-01-22 16:14:20.511096: step: 740/463, loss: 0.4807194769382477 2023-01-22 16:14:21.132723: step: 742/463, loss: 0.03873295709490776 2023-01-22 16:14:21.751654: step: 744/463, loss: 0.15439373254776 2023-01-22 16:14:22.457585: step: 746/463, loss: 0.06476614624261856 2023-01-22 16:14:23.092184: step: 748/463, loss: 0.4391195774078369 2023-01-22 16:14:23.765189: step: 750/463, loss: 0.06903206557035446 2023-01-22 16:14:24.367318: step: 752/463, loss: 0.01863720454275608 2023-01-22 16:14:24.972037: step: 754/463, loss: 2.548945903778076 2023-01-22 16:14:25.571675: step: 756/463, loss: 0.36414071917533875 2023-01-22 16:14:26.195800: step: 758/463, loss: 0.07604169100522995 2023-01-22 16:14:26.761765: step: 760/463, loss: 0.014313437044620514 2023-01-22 16:14:27.382378: step: 762/463, loss: 0.02528936229646206 2023-01-22 16:14:27.991584: step: 764/463, loss: 0.6350257992744446 2023-01-22 16:14:28.541222: step: 766/463, loss: 0.04238261282444 2023-01-22 16:14:29.165197: step: 768/463, loss: 0.045138370245695114 2023-01-22 16:14:29.720101: step: 770/463, loss: 0.04239225015044212 2023-01-22 16:14:30.322046: step: 772/463, loss: 0.13525225222110748 2023-01-22 16:14:30.935621: step: 774/463, loss: 0.07173896580934525 2023-01-22 16:14:31.559763: step: 776/463, loss: 0.4176028072834015 2023-01-22 16:14:32.175587: step: 778/463, loss: 0.15224432945251465 2023-01-22 16:14:32.755224: step: 780/463, loss: 0.07840688526630402 2023-01-22 16:14:33.358589: step: 782/463, loss: 0.0328403003513813 2023-01-22 16:14:33.952878: step: 784/463, loss: 0.08359145373106003 2023-01-22 16:14:34.520942: step: 786/463, loss: 0.8236105442047119 2023-01-22 16:14:35.146411: step: 788/463, 
loss: 0.3282352089881897 2023-01-22 16:14:35.772094: step: 790/463, loss: 0.05203894525766373 2023-01-22 16:14:36.443234: step: 792/463, loss: 0.09208649396896362 2023-01-22 16:14:37.026035: step: 794/463, loss: 0.10902723670005798 2023-01-22 16:14:37.675777: step: 796/463, loss: 0.03842010721564293 2023-01-22 16:14:38.289197: step: 798/463, loss: 0.0665179044008255 2023-01-22 16:14:38.875857: step: 800/463, loss: 0.18100035190582275 2023-01-22 16:14:39.489097: step: 802/463, loss: 0.07220342755317688 2023-01-22 16:14:40.161622: step: 804/463, loss: 0.09989255666732788 2023-01-22 16:14:40.890340: step: 806/463, loss: 0.1319730579853058 2023-01-22 16:14:41.468819: step: 808/463, loss: 0.1330530047416687 2023-01-22 16:14:42.053088: step: 810/463, loss: 0.0213184617459774 2023-01-22 16:14:42.646325: step: 812/463, loss: 0.04843056946992874 2023-01-22 16:14:43.307377: step: 814/463, loss: 0.051545727998018265 2023-01-22 16:14:43.927553: step: 816/463, loss: 0.06334976106882095 2023-01-22 16:14:44.541426: step: 818/463, loss: 0.23926696181297302 2023-01-22 16:14:45.118888: step: 820/463, loss: 0.5210287570953369 2023-01-22 16:14:45.721898: step: 822/463, loss: 0.07643789798021317 2023-01-22 16:14:46.286095: step: 824/463, loss: 0.4277855455875397 2023-01-22 16:14:47.084154: step: 826/463, loss: 0.673653244972229 2023-01-22 16:14:47.765690: step: 828/463, loss: 1.0592390298843384 2023-01-22 16:14:48.388177: step: 830/463, loss: 0.10115966945886612 2023-01-22 16:14:49.041920: step: 832/463, loss: 0.15561948716640472 2023-01-22 16:14:49.717210: step: 834/463, loss: 0.11558883637189865 2023-01-22 16:14:50.350452: step: 836/463, loss: 2.3927981853485107 2023-01-22 16:14:50.957242: step: 838/463, loss: 0.3303210735321045 2023-01-22 16:14:51.564893: step: 840/463, loss: 0.9002991914749146 2023-01-22 16:14:52.182398: step: 842/463, loss: 0.015502339228987694 2023-01-22 16:14:52.900404: step: 844/463, loss: 0.2540409564971924 2023-01-22 16:14:53.485004: step: 846/463, loss: 
0.10866440087556839 2023-01-22 16:14:54.057857: step: 848/463, loss: 0.12949763238430023 2023-01-22 16:14:54.703130: step: 850/463, loss: 0.2915351390838623 2023-01-22 16:14:55.275651: step: 852/463, loss: 0.09739460051059723 2023-01-22 16:14:55.923325: step: 854/463, loss: 0.006810775026679039 2023-01-22 16:14:56.638780: step: 856/463, loss: 0.05779821053147316 2023-01-22 16:14:57.259799: step: 858/463, loss: 0.07079167664051056 2023-01-22 16:14:57.867208: step: 860/463, loss: 0.06744373589754105 2023-01-22 16:14:58.502285: step: 862/463, loss: 0.08337841182947159 2023-01-22 16:14:59.071184: step: 864/463, loss: 0.07588480412960052 2023-01-22 16:14:59.691295: step: 866/463, loss: 0.08913889527320862 2023-01-22 16:15:00.318342: step: 868/463, loss: 0.05976057052612305 2023-01-22 16:15:00.905401: step: 870/463, loss: 0.24357064068317413 2023-01-22 16:15:01.602292: step: 872/463, loss: 0.31780675053596497 2023-01-22 16:15:02.142209: step: 874/463, loss: 0.10017620772123337 2023-01-22 16:15:02.778751: step: 876/463, loss: 0.11574061959981918 2023-01-22 16:15:03.363658: step: 878/463, loss: 0.07761568576097488 2023-01-22 16:15:03.969992: step: 880/463, loss: 0.16441595554351807 2023-01-22 16:15:04.615552: step: 882/463, loss: 0.031705714762210846 2023-01-22 16:15:05.192718: step: 884/463, loss: 0.10020439326763153 2023-01-22 16:15:05.780083: step: 886/463, loss: 0.2210567146539688 2023-01-22 16:15:06.409430: step: 888/463, loss: 0.041322700679302216 2023-01-22 16:15:07.039911: step: 890/463, loss: 0.11005710065364838 2023-01-22 16:15:07.647008: step: 892/463, loss: 0.04246811196208 2023-01-22 16:15:08.269212: step: 894/463, loss: 0.1490844190120697 2023-01-22 16:15:08.963045: step: 896/463, loss: 0.015116555616259575 2023-01-22 16:15:09.548791: step: 898/463, loss: 0.06426158547401428 2023-01-22 16:15:10.214163: step: 900/463, loss: 0.05587301030755043 2023-01-22 16:15:10.778317: step: 902/463, loss: 0.11844894289970398 2023-01-22 16:15:11.315750: step: 904/463, loss: 
0.0880693569779396
2023-01-22 16:15:11.869102: step: 906/463, loss: 0.06316044926643372
2023-01-22 16:15:12.548696: step: 908/463, loss: 0.018015868961811066
2023-01-22 16:15:13.161017: step: 910/463, loss: 0.09592820703983307
2023-01-22 16:15:13.764434: step: 912/463, loss: 0.03864612057805061
2023-01-22 16:15:14.363918: step: 914/463, loss: 0.0776737853884697
2023-01-22 16:15:14.944126: step: 916/463, loss: 0.14693352580070496
2023-01-22 16:15:15.576276: step: 918/463, loss: 0.4297368824481964
2023-01-22 16:15:16.143402: step: 920/463, loss: 0.16856685280799866
2023-01-22 16:15:16.716306: step: 922/463, loss: 0.12944503128528595
2023-01-22 16:15:17.380479: step: 924/463, loss: 0.40906912088394165
2023-01-22 16:15:18.033071: step: 926/463, loss: 0.053052015602588654
==================================================
Loss: 0.169
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2612685149093744, 'r': 0.36339624559501227, 'f1': 0.30398384353741503}, 'combined': 0.22398809523809526, 'epoch': 14}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3321868704633461, 'r': 0.3304476721886689, 'f1': 0.33131498891357347}, 'combined': 0.23308592184874516, 'epoch': 14}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2712763798701299, 'r': 0.3623881810788832, 'f1': 0.3102820006962981}, 'combined': 0.2286288426183249, 'epoch': 14}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.32746578598366227, 'r': 0.32060786376410916, 'f1': 0.3240005395711367}, 'combined': 0.23004038309550706, 'epoch': 14}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2695473997373954, 'r': 0.3728653783843667, 'f1': 0.3128981758098109}, 'combined': 0.23055655059670274, 'epoch': 14}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34422073764876104, 'r': 0.3111803527086531, 'f1': 0.32686772154364474}, 'combined': 0.23207608229598775, 'epoch': 14}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22628205128205126, 'r': 0.4202380952380952, 'f1': 0.29416666666666663}, 'combined': 0.19611111111111107, 'epoch': 14}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.22282608695652173, 'r': 0.44565217391304346, 'f1': 0.2971014492753623}, 'combined': 0.14855072463768115, 'epoch': 14}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2916666666666667, 'r': 0.2413793103448276, 'f1': 0.26415094339622647}, 'combined': 0.17610062893081763, 'epoch': 14}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2792885164051355, 'r': 0.37150142314990514, 'f1': 0.3188619706840391}, 'combined': 0.23495092576718668, 'epoch': 9}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3400664122883469, 'r': 0.29521922603896666, 'f1': 0.3160598539641111}, 'combined': 0.2244024963145189, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40476190476190477, 'r': 0.29310344827586204, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 9}
******************************
Epoch: 15
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:17:53.706943: step: 2/463, loss: 1.1136900186538696
2023-01-22 16:17:54.336269: step: 4/463, loss: 0.05129950866103172
2023-01-22 16:17:55.024400: step: 6/463, loss: 0.15911410748958588
2023-01-22 16:17:55.655607: step: 8/463, loss: 0.04372177645564079
2023-01-22 16:17:56.256990: step: 10/463, loss: 0.34314101934432983
2023-01-22 16:17:56.894898: step: 12/463, loss: 0.05023907497525215
2023-01-22 16:17:57.493490: step: 14/463, loss: 0.17375628650188446
2023-01-22 16:17:58.166821: step: 16/463, loss: 0.1016114354133606
2023-01-22 16:17:58.758947: step: 18/463, loss: 0.1722182184457779
2023-01-22 16:17:59.359611: step: 20/463, loss: 0.07967983931303024
2023-01-22 16:17:59.948892: step: 22/463, loss: 0.10332731902599335
2023-01-22 16:18:00.557961: step: 24/463, loss: 0.46601104736328125
2023-01-22
16:18:01.238201: step: 26/463, loss: 0.054461538791656494 2023-01-22 16:18:01.870314: step: 28/463, loss: 0.03344506397843361 2023-01-22 16:18:02.489836: step: 30/463, loss: 1.0197166204452515 2023-01-22 16:18:03.062021: step: 32/463, loss: 0.07345034927129745 2023-01-22 16:18:03.804080: step: 34/463, loss: 0.3728037178516388 2023-01-22 16:18:04.381719: step: 36/463, loss: 0.05589612200856209 2023-01-22 16:18:04.993228: step: 38/463, loss: 0.021735813468694687 2023-01-22 16:18:05.618660: step: 40/463, loss: 0.2985685169696808 2023-01-22 16:18:06.210467: step: 42/463, loss: 0.09553049504756927 2023-01-22 16:18:06.782928: step: 44/463, loss: 0.09049806743860245 2023-01-22 16:18:07.433205: step: 46/463, loss: 0.08738178014755249 2023-01-22 16:18:08.041069: step: 48/463, loss: 0.05087277293205261 2023-01-22 16:18:08.734549: step: 50/463, loss: 0.3785260319709778 2023-01-22 16:18:09.331854: step: 52/463, loss: 0.3614160716533661 2023-01-22 16:18:09.948967: step: 54/463, loss: 0.04793338477611542 2023-01-22 16:18:10.626170: step: 56/463, loss: 0.020170900970697403 2023-01-22 16:18:11.254779: step: 58/463, loss: 0.10645491629838943 2023-01-22 16:18:11.842870: step: 60/463, loss: 0.0173997413367033 2023-01-22 16:18:12.445594: step: 62/463, loss: 0.15898871421813965 2023-01-22 16:18:13.060748: step: 64/463, loss: 0.1287989467382431 2023-01-22 16:18:13.727839: step: 66/463, loss: 0.05263053998351097 2023-01-22 16:18:14.356338: step: 68/463, loss: 0.13185575604438782 2023-01-22 16:18:14.934330: step: 70/463, loss: 0.04612639918923378 2023-01-22 16:18:15.664494: step: 72/463, loss: 0.008111974224448204 2023-01-22 16:18:16.275724: step: 74/463, loss: 0.04101620614528656 2023-01-22 16:18:16.861219: step: 76/463, loss: 0.06766528636217117 2023-01-22 16:18:17.502934: step: 78/463, loss: 0.05290784314274788 2023-01-22 16:18:18.080857: step: 80/463, loss: 0.031204305589199066 2023-01-22 16:18:18.763199: step: 82/463, loss: 0.07508650422096252 2023-01-22 16:18:19.466677: step: 
84/463, loss: 0.07888216525316238 2023-01-22 16:18:20.080757: step: 86/463, loss: 0.10541693866252899 2023-01-22 16:18:20.687593: step: 88/463, loss: 0.1030062735080719 2023-01-22 16:18:21.257978: step: 90/463, loss: 0.04004230350255966 2023-01-22 16:18:21.853538: step: 92/463, loss: 0.027588212862610817 2023-01-22 16:18:22.461042: step: 94/463, loss: 0.07077085971832275 2023-01-22 16:18:23.048935: step: 96/463, loss: 0.14368517696857452 2023-01-22 16:18:23.626638: step: 98/463, loss: 0.013567483052611351 2023-01-22 16:18:24.187188: step: 100/463, loss: 0.019274311140179634 2023-01-22 16:18:24.828588: step: 102/463, loss: 0.09166082739830017 2023-01-22 16:18:25.393259: step: 104/463, loss: 0.03624126315116882 2023-01-22 16:18:26.054815: step: 106/463, loss: 0.016892993822693825 2023-01-22 16:18:26.732591: step: 108/463, loss: 0.31654122471809387 2023-01-22 16:18:27.390015: step: 110/463, loss: 0.13563372194766998 2023-01-22 16:18:28.016764: step: 112/463, loss: 0.15887178480625153 2023-01-22 16:18:28.608252: step: 114/463, loss: 0.07403489202260971 2023-01-22 16:18:29.177713: step: 116/463, loss: 0.04310024529695511 2023-01-22 16:18:29.770239: step: 118/463, loss: 0.028202271088957787 2023-01-22 16:18:30.449038: step: 120/463, loss: 0.4467148184776306 2023-01-22 16:18:31.037283: step: 122/463, loss: 0.007890046574175358 2023-01-22 16:18:31.645189: step: 124/463, loss: 0.07729124277830124 2023-01-22 16:18:32.178402: step: 126/463, loss: 0.007592519745230675 2023-01-22 16:18:32.782510: step: 128/463, loss: 0.04554588720202446 2023-01-22 16:18:33.404778: step: 130/463, loss: 0.021302292123436928 2023-01-22 16:18:34.001991: step: 132/463, loss: 0.09691748023033142 2023-01-22 16:18:34.629560: step: 134/463, loss: 0.0723891481757164 2023-01-22 16:18:35.287119: step: 136/463, loss: 0.6072290539741516 2023-01-22 16:18:35.872448: step: 138/463, loss: 0.03900304436683655 2023-01-22 16:18:36.425518: step: 140/463, loss: 0.04017530009150505 2023-01-22 16:18:37.029857: step: 
142/463, loss: 0.05786195397377014 2023-01-22 16:18:37.602883: step: 144/463, loss: 0.09370902925729752 2023-01-22 16:18:38.173883: step: 146/463, loss: 0.09114547073841095 2023-01-22 16:18:38.712783: step: 148/463, loss: 0.05386968329548836 2023-01-22 16:18:39.280989: step: 150/463, loss: 0.006254538428038359 2023-01-22 16:18:39.863108: step: 152/463, loss: 0.170665442943573 2023-01-22 16:18:40.503430: step: 154/463, loss: 0.011491959914565086 2023-01-22 16:18:41.147955: step: 156/463, loss: 0.21397700905799866 2023-01-22 16:18:41.773622: step: 158/463, loss: 0.04546033963561058 2023-01-22 16:18:42.479372: step: 160/463, loss: 0.32222869992256165 2023-01-22 16:18:43.063225: step: 162/463, loss: 0.019535530358552933 2023-01-22 16:18:43.638251: step: 164/463, loss: 0.039660658687353134 2023-01-22 16:18:44.346910: step: 166/463, loss: 0.06405636668205261 2023-01-22 16:18:44.975582: step: 168/463, loss: 0.6279903054237366 2023-01-22 16:18:45.574906: step: 170/463, loss: 0.031021904200315475 2023-01-22 16:18:46.153381: step: 172/463, loss: 0.12619899213314056 2023-01-22 16:18:46.789632: step: 174/463, loss: 0.08389808237552643 2023-01-22 16:18:47.435661: step: 176/463, loss: 0.0758499950170517 2023-01-22 16:18:48.027581: step: 178/463, loss: 0.0666787177324295 2023-01-22 16:18:48.653960: step: 180/463, loss: 0.058811359107494354 2023-01-22 16:18:49.285721: step: 182/463, loss: 0.04327578470110893 2023-01-22 16:18:49.935307: step: 184/463, loss: 0.5512718558311462 2023-01-22 16:18:50.497023: step: 186/463, loss: 0.12747791409492493 2023-01-22 16:18:51.114366: step: 188/463, loss: 0.05886711925268173 2023-01-22 16:18:51.810599: step: 190/463, loss: 0.03419441357254982 2023-01-22 16:18:52.427626: step: 192/463, loss: 0.06569340080022812 2023-01-22 16:18:52.989451: step: 194/463, loss: 0.298285573720932 2023-01-22 16:18:53.627076: step: 196/463, loss: 0.05984310433268547 2023-01-22 16:18:54.267217: step: 198/463, loss: 0.11792390793561935 2023-01-22 16:18:54.824345: step: 
200/463, loss: 0.0457036979496479 2023-01-22 16:18:55.421905: step: 202/463, loss: 0.0584447905421257 2023-01-22 16:18:56.064842: step: 204/463, loss: 0.02731543406844139 2023-01-22 16:18:56.665974: step: 206/463, loss: 0.2994098663330078 2023-01-22 16:18:57.281595: step: 208/463, loss: 0.03744002431631088 2023-01-22 16:18:57.906558: step: 210/463, loss: 0.060140460729599 2023-01-22 16:18:58.550367: step: 212/463, loss: 0.0964348241686821 2023-01-22 16:18:59.130554: step: 214/463, loss: 0.08038162440061569 2023-01-22 16:18:59.742471: step: 216/463, loss: 0.05648510530591011 2023-01-22 16:19:00.306497: step: 218/463, loss: 0.09165140241384506 2023-01-22 16:19:00.914481: step: 220/463, loss: 0.1603998839855194 2023-01-22 16:19:01.511584: step: 222/463, loss: 0.021137764677405357 2023-01-22 16:19:02.149103: step: 224/463, loss: 0.008030910044908524 2023-01-22 16:19:02.765556: step: 226/463, loss: 0.8400269150733948 2023-01-22 16:19:03.451824: step: 228/463, loss: 0.15571081638336182 2023-01-22 16:19:04.044788: step: 230/463, loss: 0.089493028819561 2023-01-22 16:19:04.658955: step: 232/463, loss: 0.07949037104845047 2023-01-22 16:19:05.277099: step: 234/463, loss: 0.24334228038787842 2023-01-22 16:19:05.820063: step: 236/463, loss: 0.009026282466948032 2023-01-22 16:19:06.380256: step: 238/463, loss: 0.08372674137353897 2023-01-22 16:19:06.937679: step: 240/463, loss: 0.35571572184562683 2023-01-22 16:19:07.622770: step: 242/463, loss: 0.051587529480457306 2023-01-22 16:19:08.193159: step: 244/463, loss: 0.02131333015859127 2023-01-22 16:19:08.800713: step: 246/463, loss: 0.05825590342283249 2023-01-22 16:19:09.360564: step: 248/463, loss: 0.2686914801597595 2023-01-22 16:19:10.029478: step: 250/463, loss: 0.06544006615877151 2023-01-22 16:19:10.649958: step: 252/463, loss: 0.22549249231815338 2023-01-22 16:19:11.205849: step: 254/463, loss: 0.010155598632991314 2023-01-22 16:19:11.826759: step: 256/463, loss: 0.4981191158294678 2023-01-22 16:19:12.543842: step: 
258/463, loss: 0.08362714946269989 2023-01-22 16:19:13.142604: step: 260/463, loss: 0.03091524913907051 2023-01-22 16:19:13.699161: step: 262/463, loss: 0.05925906077027321 2023-01-22 16:19:14.321834: step: 264/463, loss: 0.021677229553461075 2023-01-22 16:19:14.958512: step: 266/463, loss: 0.021694693714380264 2023-01-22 16:19:15.540926: step: 268/463, loss: 0.04191165044903755 2023-01-22 16:19:16.219400: step: 270/463, loss: 0.05651445314288139 2023-01-22 16:19:16.764652: step: 272/463, loss: 0.08495394140481949 2023-01-22 16:19:17.353467: step: 274/463, loss: 0.06803777068853378 2023-01-22 16:19:17.922550: step: 276/463, loss: 0.07572060823440552 2023-01-22 16:19:18.572425: step: 278/463, loss: 0.03756977245211601 2023-01-22 16:19:19.171130: step: 280/463, loss: 0.0072244200855493546 2023-01-22 16:19:19.818000: step: 282/463, loss: 0.86048823595047 2023-01-22 16:19:20.380222: step: 284/463, loss: 0.05817443132400513 2023-01-22 16:19:20.934653: step: 286/463, loss: 0.04221373051404953 2023-01-22 16:19:21.550304: step: 288/463, loss: 0.13312803208827972 2023-01-22 16:19:22.212010: step: 290/463, loss: 0.17133329808712006 2023-01-22 16:19:22.843026: step: 292/463, loss: 0.06664586812257767 2023-01-22 16:19:23.430514: step: 294/463, loss: 0.1188000813126564 2023-01-22 16:19:24.012842: step: 296/463, loss: 0.15704964101314545 2023-01-22 16:19:24.556439: step: 298/463, loss: 0.1252674013376236 2023-01-22 16:19:25.219898: step: 300/463, loss: 0.006677211262285709 2023-01-22 16:19:25.795499: step: 302/463, loss: 0.08593586832284927 2023-01-22 16:19:26.446534: step: 304/463, loss: 0.07941567152738571 2023-01-22 16:19:27.110383: step: 306/463, loss: 0.11019840836524963 2023-01-22 16:19:27.728610: step: 308/463, loss: 0.0493154413998127 2023-01-22 16:19:28.363676: step: 310/463, loss: 0.19458499550819397 2023-01-22 16:19:28.906763: step: 312/463, loss: 0.0695580393075943 2023-01-22 16:19:29.530879: step: 314/463, loss: 1.7120105028152466 2023-01-22 16:19:30.125189: step: 
316/463, loss: 0.03394429758191109 2023-01-22 16:19:30.734044: step: 318/463, loss: 0.08072242140769958 2023-01-22 16:19:31.386481: step: 320/463, loss: 0.12635423243045807 2023-01-22 16:19:31.977825: step: 322/463, loss: 0.10394418984651566 2023-01-22 16:19:32.606592: step: 324/463, loss: 0.03355325385928154 2023-01-22 16:19:33.214024: step: 326/463, loss: 0.038795504719018936 2023-01-22 16:19:33.818496: step: 328/463, loss: 0.045842163264751434 2023-01-22 16:19:34.406150: step: 330/463, loss: 0.0884387418627739 2023-01-22 16:19:35.011080: step: 332/463, loss: 0.08116581290960312 2023-01-22 16:19:35.606925: step: 334/463, loss: 0.1525176465511322 2023-01-22 16:19:36.152646: step: 336/463, loss: 0.01579912379384041 2023-01-22 16:19:36.786129: step: 338/463, loss: 0.04931158199906349 2023-01-22 16:19:37.337404: step: 340/463, loss: 0.10026677697896957 2023-01-22 16:19:37.944504: step: 342/463, loss: 0.06776951253414154 2023-01-22 16:19:38.532174: step: 344/463, loss: 0.04758342728018761 2023-01-22 16:19:39.167938: step: 346/463, loss: 0.0926588699221611 2023-01-22 16:19:39.794976: step: 348/463, loss: 0.07205090671777725 2023-01-22 16:19:40.419296: step: 350/463, loss: 0.015449507161974907 2023-01-22 16:19:41.025932: step: 352/463, loss: 0.5189287662506104 2023-01-22 16:19:41.647384: step: 354/463, loss: 0.20733435451984406 2023-01-22 16:19:42.182583: step: 356/463, loss: 0.08258868753910065 2023-01-22 16:19:42.828195: step: 358/463, loss: 0.05222300440073013 2023-01-22 16:19:43.466518: step: 360/463, loss: 0.37049898505210876 2023-01-22 16:19:44.142584: step: 362/463, loss: 0.009745137766003609 2023-01-22 16:19:44.738687: step: 364/463, loss: 0.05168571323156357 2023-01-22 16:19:45.345071: step: 366/463, loss: 0.4134691655635834 2023-01-22 16:19:45.985815: step: 368/463, loss: 0.08400744944810867 2023-01-22 16:19:46.598733: step: 370/463, loss: 0.05442841351032257 2023-01-22 16:19:47.176686: step: 372/463, loss: 0.0879712924361229 2023-01-22 16:19:47.801169: step: 
374/463, loss: 0.014103834517300129 2023-01-22 16:19:48.448794: step: 376/463, loss: 0.04359009489417076 2023-01-22 16:19:49.051356: step: 378/463, loss: 0.022984983399510384 2023-01-22 16:19:49.702028: step: 380/463, loss: 0.06237708032131195 2023-01-22 16:19:50.361433: step: 382/463, loss: 0.015929369255900383 2023-01-22 16:19:51.070567: step: 384/463, loss: 0.08836397528648376 2023-01-22 16:19:51.649614: step: 386/463, loss: 0.15161825716495514 2023-01-22 16:19:52.227020: step: 388/463, loss: 0.02161381207406521 2023-01-22 16:19:52.841520: step: 390/463, loss: 0.051219064742326736 2023-01-22 16:19:53.489245: step: 392/463, loss: 0.0646698847413063 2023-01-22 16:19:54.088744: step: 394/463, loss: 0.09145133942365646 2023-01-22 16:19:54.687743: step: 396/463, loss: 0.10722997039556503 2023-01-22 16:19:55.342931: step: 398/463, loss: 0.05125118046998978 2023-01-22 16:19:55.969093: step: 400/463, loss: 0.31371960043907166 2023-01-22 16:19:56.676363: step: 402/463, loss: 0.026963984593749046 2023-01-22 16:19:57.286787: step: 404/463, loss: 0.052168115973472595 2023-01-22 16:19:57.842979: step: 406/463, loss: 0.39249467849731445 2023-01-22 16:19:58.388764: step: 408/463, loss: 0.014179211109876633 2023-01-22 16:19:59.000544: step: 410/463, loss: 0.047343891113996506 2023-01-22 16:19:59.587768: step: 412/463, loss: 0.06449511647224426 2023-01-22 16:20:00.223461: step: 414/463, loss: 0.15209528803825378 2023-01-22 16:20:00.869657: step: 416/463, loss: 0.061231352388858795 2023-01-22 16:20:01.554007: step: 418/463, loss: 0.1439550817012787 2023-01-22 16:20:02.119291: step: 420/463, loss: 0.7164761424064636 2023-01-22 16:20:02.716265: step: 422/463, loss: 0.21189238131046295 2023-01-22 16:20:03.332053: step: 424/463, loss: 0.08859238028526306 2023-01-22 16:20:03.912532: step: 426/463, loss: 0.060423411428928375 2023-01-22 16:20:04.504982: step: 428/463, loss: 0.017610525712370872 2023-01-22 16:20:05.161401: step: 430/463, loss: 0.17659184336662292 2023-01-22 
16:20:05.747675: step: 432/463, loss: 0.17549853026866913 2023-01-22 16:20:06.383367: step: 434/463, loss: 0.16101182997226715 2023-01-22 16:20:07.028863: step: 436/463, loss: 0.18515095114707947 2023-01-22 16:20:07.644808: step: 438/463, loss: 0.07934846729040146 2023-01-22 16:20:08.231019: step: 440/463, loss: 0.06070747971534729 2023-01-22 16:20:09.062022: step: 442/463, loss: 0.22549283504486084 2023-01-22 16:20:09.688564: step: 444/463, loss: 0.26674482226371765 2023-01-22 16:20:10.276861: step: 446/463, loss: 0.06790928542613983 2023-01-22 16:20:10.903419: step: 448/463, loss: 0.7917834520339966 2023-01-22 16:20:11.557155: step: 450/463, loss: 0.09967049956321716 2023-01-22 16:20:12.172342: step: 452/463, loss: 0.011224746704101562 2023-01-22 16:20:12.856061: step: 454/463, loss: 0.15405090153217316 2023-01-22 16:20:13.476031: step: 456/463, loss: 0.10464118421077728 2023-01-22 16:20:14.084582: step: 458/463, loss: 0.3129270672798157 2023-01-22 16:20:14.697102: step: 460/463, loss: 0.1155051440000534 2023-01-22 16:20:15.350899: step: 462/463, loss: 0.16035684943199158 2023-01-22 16:20:16.065560: step: 464/463, loss: 0.1089453473687172 2023-01-22 16:20:16.699667: step: 466/463, loss: 0.21924130618572235 2023-01-22 16:20:17.270888: step: 468/463, loss: 0.022490041330456734 2023-01-22 16:20:17.885329: step: 470/463, loss: 0.13190558552742004 2023-01-22 16:20:18.469523: step: 472/463, loss: 0.5459837913513184 2023-01-22 16:20:19.060094: step: 474/463, loss: 0.081820547580719 2023-01-22 16:20:19.648529: step: 476/463, loss: 0.02452622912824154 2023-01-22 16:20:20.236871: step: 478/463, loss: 0.02967979572713375 2023-01-22 16:20:20.929809: step: 480/463, loss: 0.061660319566726685 2023-01-22 16:20:21.535578: step: 482/463, loss: 0.13234630227088928 2023-01-22 16:20:22.178626: step: 484/463, loss: 0.1408092975616455 2023-01-22 16:20:22.811698: step: 486/463, loss: 0.08144381642341614 2023-01-22 16:20:23.367817: step: 488/463, loss: 0.06824054569005966 2023-01-22 
16:20:23.979659: step: 490/463, loss: 0.25250914692878723 2023-01-22 16:20:24.559585: step: 492/463, loss: 0.09514086693525314 2023-01-22 16:20:25.119058: step: 494/463, loss: 0.06822168081998825 2023-01-22 16:20:25.767736: step: 496/463, loss: 0.06888014078140259 2023-01-22 16:20:26.377585: step: 498/463, loss: 0.06405879557132721 2023-01-22 16:20:26.955421: step: 500/463, loss: 0.5727821588516235 2023-01-22 16:20:27.559913: step: 502/463, loss: 0.24823711812496185 2023-01-22 16:20:28.185653: step: 504/463, loss: 0.30475059151649475 2023-01-22 16:20:28.795053: step: 506/463, loss: 0.015030224807560444 2023-01-22 16:20:29.409148: step: 508/463, loss: 0.08462733775377274 2023-01-22 16:20:29.966039: step: 510/463, loss: 0.0436248704791069 2023-01-22 16:20:30.599560: step: 512/463, loss: 0.15087315440177917 2023-01-22 16:20:31.226528: step: 514/463, loss: 0.09049743413925171 2023-01-22 16:20:31.827593: step: 516/463, loss: 0.07875020802021027 2023-01-22 16:20:32.449563: step: 518/463, loss: 0.26880571246147156 2023-01-22 16:20:33.067475: step: 520/463, loss: 0.09701051563024521 2023-01-22 16:20:33.632530: step: 522/463, loss: 0.030374577268958092 2023-01-22 16:20:34.218628: step: 524/463, loss: 0.023795338347554207 2023-01-22 16:20:34.799232: step: 526/463, loss: 0.42391639947891235 2023-01-22 16:20:35.407837: step: 528/463, loss: 0.06287197768688202 2023-01-22 16:20:35.996385: step: 530/463, loss: 0.030871154740452766 2023-01-22 16:20:36.586378: step: 532/463, loss: 1.0172512531280518 2023-01-22 16:20:37.227595: step: 534/463, loss: 22.561044692993164 2023-01-22 16:20:37.813826: step: 536/463, loss: 0.10364580154418945 2023-01-22 16:20:38.391961: step: 538/463, loss: 0.18276658654212952 2023-01-22 16:20:39.032507: step: 540/463, loss: 0.11104226112365723 2023-01-22 16:20:39.655480: step: 542/463, loss: 0.02095131389796734 2023-01-22 16:20:40.334398: step: 544/463, loss: 0.056652627885341644 2023-01-22 16:20:40.907984: step: 546/463, loss: 0.04344608634710312 
2023-01-22 16:20:41.475353: step: 548/463, loss: 0.05528009682893753 2023-01-22 16:20:42.084563: step: 550/463, loss: 0.1233786791563034 2023-01-22 16:20:42.672225: step: 552/463, loss: 0.0011431153398007154 2023-01-22 16:20:43.280203: step: 554/463, loss: 0.13020804524421692 2023-01-22 16:20:43.909936: step: 556/463, loss: 0.14191289246082306 2023-01-22 16:20:44.558291: step: 558/463, loss: 0.1067819893360138 2023-01-22 16:20:45.082092: step: 560/463, loss: 0.0632496178150177 2023-01-22 16:20:45.660610: step: 562/463, loss: 0.04296307638287544 2023-01-22 16:20:46.219619: step: 564/463, loss: 0.1018439531326294 2023-01-22 16:20:46.836098: step: 566/463, loss: 0.07820386439561844 2023-01-22 16:20:47.435328: step: 568/463, loss: 0.06435123831033707 2023-01-22 16:20:48.017076: step: 570/463, loss: 0.05581549182534218 2023-01-22 16:20:48.654200: step: 572/463, loss: 0.15068359673023224 2023-01-22 16:20:49.334988: step: 574/463, loss: 0.08564060181379318 2023-01-22 16:20:49.941622: step: 576/463, loss: 0.14013288915157318 2023-01-22 16:20:50.580429: step: 578/463, loss: 0.11942465603351593 2023-01-22 16:20:51.278698: step: 580/463, loss: 0.09550166130065918 2023-01-22 16:20:51.853136: step: 582/463, loss: 3.233044385910034 2023-01-22 16:20:52.470590: step: 584/463, loss: 0.2007221132516861 2023-01-22 16:20:53.074102: step: 586/463, loss: 0.05036419257521629 2023-01-22 16:20:53.674213: step: 588/463, loss: 0.016220109537243843 2023-01-22 16:20:54.327285: step: 590/463, loss: 0.08186416327953339 2023-01-22 16:20:54.949060: step: 592/463, loss: 0.058370716869831085 2023-01-22 16:20:55.532317: step: 594/463, loss: 0.04682866111397743 2023-01-22 16:20:56.077393: step: 596/463, loss: 0.20876897871494293 2023-01-22 16:20:56.640558: step: 598/463, loss: 0.5226532816886902 2023-01-22 16:20:57.306525: step: 600/463, loss: 0.07913534343242645 2023-01-22 16:20:57.959361: step: 602/463, loss: 0.055710215121507645 2023-01-22 16:20:58.542486: step: 604/463, loss: 0.013344991952180862 
2023-01-22 16:20:59.130550: step: 606/463, loss: 0.2663425803184509 2023-01-22 16:20:59.756013: step: 608/463, loss: 0.08978661894798279 2023-01-22 16:21:00.335682: step: 610/463, loss: 0.033474430441856384 2023-01-22 16:21:00.975052: step: 612/463, loss: 0.046257197856903076 2023-01-22 16:21:01.609141: step: 614/463, loss: 0.2595808506011963 2023-01-22 16:21:02.215857: step: 616/463, loss: 0.13854049146175385 2023-01-22 16:21:02.942653: step: 618/463, loss: 0.13499432802200317 2023-01-22 16:21:03.638355: step: 620/463, loss: 0.04853852093219757 2023-01-22 16:21:04.284306: step: 622/463, loss: 0.04568323865532875 2023-01-22 16:21:04.906798: step: 624/463, loss: 0.09981644153594971 2023-01-22 16:21:05.654985: step: 626/463, loss: 0.0970684140920639 2023-01-22 16:21:06.318512: step: 628/463, loss: 0.19540366530418396 2023-01-22 16:21:06.900274: step: 630/463, loss: 0.04772598668932915 2023-01-22 16:21:07.450096: step: 632/463, loss: 0.14751742780208588 2023-01-22 16:21:08.049816: step: 634/463, loss: 0.050115227699279785 2023-01-22 16:21:08.627208: step: 636/463, loss: 0.6294702887535095 2023-01-22 16:21:09.272517: step: 638/463, loss: 0.06821957975625992 2023-01-22 16:21:09.912124: step: 640/463, loss: 0.3480606973171234 2023-01-22 16:21:10.451897: step: 642/463, loss: 0.12253233045339584 2023-01-22 16:21:11.077495: step: 644/463, loss: 0.061676908284425735 2023-01-22 16:21:11.689088: step: 646/463, loss: 0.13214999437332153 2023-01-22 16:21:12.338904: step: 648/463, loss: 0.03600108623504639 2023-01-22 16:21:13.017880: step: 650/463, loss: 0.05791589245200157 2023-01-22 16:21:13.586434: step: 652/463, loss: 0.11187602579593658 2023-01-22 16:21:14.164646: step: 654/463, loss: 0.05199594423174858 2023-01-22 16:21:14.784244: step: 656/463, loss: 0.08178532123565674 2023-01-22 16:21:15.404605: step: 658/463, loss: 0.43516671657562256 2023-01-22 16:21:15.985953: step: 660/463, loss: 0.05036203935742378 2023-01-22 16:21:16.602538: step: 662/463, loss: 0.0774068832397461 
2023-01-22 16:21:17.166053: step: 664/463, loss: 0.10690979659557343 2023-01-22 16:21:17.809166: step: 666/463, loss: 0.04626484960317612 2023-01-22 16:21:18.448854: step: 668/463, loss: 0.0355355441570282 2023-01-22 16:21:19.022499: step: 670/463, loss: 0.07506830990314484 2023-01-22 16:21:19.650785: step: 672/463, loss: 0.04106340557336807 2023-01-22 16:21:20.318411: step: 674/463, loss: 0.06584631651639938 2023-01-22 16:21:20.929690: step: 676/463, loss: 0.08830723166465759 2023-01-22 16:21:21.513159: step: 678/463, loss: 0.06739376485347748 2023-01-22 16:21:22.078520: step: 680/463, loss: 0.09280620515346527 2023-01-22 16:21:22.816223: step: 682/463, loss: 0.02368597500026226 2023-01-22 16:21:23.473362: step: 684/463, loss: 0.7093531489372253 2023-01-22 16:21:24.040446: step: 686/463, loss: 0.21134606003761292 2023-01-22 16:21:24.567202: step: 688/463, loss: 0.030406875535845757 2023-01-22 16:21:25.159135: step: 690/463, loss: 0.7958748936653137 2023-01-22 16:21:25.822384: step: 692/463, loss: 0.14028556644916534 2023-01-22 16:21:26.414024: step: 694/463, loss: 0.012346156872808933 2023-01-22 16:21:27.008027: step: 696/463, loss: 0.10005488991737366 2023-01-22 16:21:27.631875: step: 698/463, loss: 0.20927205681800842 2023-01-22 16:21:28.295475: step: 700/463, loss: 0.23975242674350739 2023-01-22 16:21:28.862375: step: 702/463, loss: 0.2675652503967285 2023-01-22 16:21:29.479114: step: 704/463, loss: 0.1844460815191269 2023-01-22 16:21:30.063508: step: 706/463, loss: 0.124224454164505 2023-01-22 16:21:30.683170: step: 708/463, loss: 0.09228020906448364 2023-01-22 16:21:31.269653: step: 710/463, loss: 0.03246211260557175 2023-01-22 16:21:31.891463: step: 712/463, loss: 0.14148841798305511 2023-01-22 16:21:32.515463: step: 714/463, loss: 0.48018479347229004 2023-01-22 16:21:33.087965: step: 716/463, loss: 0.2848201394081116 2023-01-22 16:21:33.651529: step: 718/463, loss: 0.0663941353559494 2023-01-22 16:21:34.242304: step: 720/463, loss: 0.07282062619924545 
2023-01-22 16:21:34.902713: step: 722/463, loss: 0.14949651062488556 2023-01-22 16:21:35.505065: step: 724/463, loss: 0.11406629532575607 2023-01-22 16:21:36.093163: step: 726/463, loss: 0.03552587330341339 2023-01-22 16:21:36.741552: step: 728/463, loss: 0.03403022140264511 2023-01-22 16:21:37.379609: step: 730/463, loss: 0.12237437069416046 2023-01-22 16:21:38.043962: step: 732/463, loss: 0.06277012079954147 2023-01-22 16:21:38.620514: step: 734/463, loss: 0.025973357260227203 2023-01-22 16:21:39.216880: step: 736/463, loss: 2.7938501834869385 2023-01-22 16:21:39.849628: step: 738/463, loss: 0.06488867849111557 2023-01-22 16:21:40.399313: step: 740/463, loss: 0.053931016474962234 2023-01-22 16:21:41.009184: step: 742/463, loss: 0.07245232909917831 2023-01-22 16:21:41.644362: step: 744/463, loss: 0.21718385815620422 2023-01-22 16:21:42.251930: step: 746/463, loss: 0.08317780494689941 2023-01-22 16:21:42.832270: step: 748/463, loss: 0.18458202481269836 2023-01-22 16:21:43.402312: step: 750/463, loss: 0.3128126859664917 2023-01-22 16:21:44.006224: step: 752/463, loss: 15.12411117553711 2023-01-22 16:21:44.698160: step: 754/463, loss: 0.5582413673400879 2023-01-22 16:21:45.226460: step: 756/463, loss: 0.08040746301412582 2023-01-22 16:21:45.853063: step: 758/463, loss: 0.22712168097496033 2023-01-22 16:21:46.396103: step: 760/463, loss: 0.054894424974918365 2023-01-22 16:21:47.020291: step: 762/463, loss: 0.10484629124403 2023-01-22 16:21:47.613904: step: 764/463, loss: 0.03317183628678322 2023-01-22 16:21:48.224128: step: 766/463, loss: 0.028720177710056305 2023-01-22 16:21:48.827747: step: 768/463, loss: 0.046849265694618225 2023-01-22 16:21:49.440039: step: 770/463, loss: 0.03642425313591957 2023-01-22 16:21:50.095715: step: 772/463, loss: 0.32579338550567627 2023-01-22 16:21:50.832381: step: 774/463, loss: 0.02007574401795864 2023-01-22 16:21:51.418644: step: 776/463, loss: 0.12157100439071655 2023-01-22 16:21:52.005725: step: 778/463, loss: 0.05112222209572792 
2023-01-22 16:21:52.571882: step: 780/463, loss: 0.05304626002907753 2023-01-22 16:21:53.158848: step: 782/463, loss: 0.10159700363874435 2023-01-22 16:21:53.762447: step: 784/463, loss: 0.06608624756336212 2023-01-22 16:21:54.452110: step: 786/463, loss: 0.07575881481170654 2023-01-22 16:21:54.981414: step: 788/463, loss: 0.04067505523562431 2023-01-22 16:21:55.647617: step: 790/463, loss: 0.06143113598227501 2023-01-22 16:21:56.315976: step: 792/463, loss: 0.08414895087480545 2023-01-22 16:21:56.922461: step: 794/463, loss: 0.3566325008869171 2023-01-22 16:21:57.511682: step: 796/463, loss: 0.4592967629432678 2023-01-22 16:21:58.086269: step: 798/463, loss: 0.05540350452065468 2023-01-22 16:21:58.720770: step: 800/463, loss: 0.03470492362976074 2023-01-22 16:21:59.328525: step: 802/463, loss: 0.09503244608640671 2023-01-22 16:21:59.946306: step: 804/463, loss: 0.16782158613204956 2023-01-22 16:22:00.534922: step: 806/463, loss: 0.2240760177373886 2023-01-22 16:22:01.192011: step: 808/463, loss: 0.13780707120895386 2023-01-22 16:22:01.772002: step: 810/463, loss: 0.06771397590637207 2023-01-22 16:22:02.356642: step: 812/463, loss: 0.09324771910905838 2023-01-22 16:22:03.010312: step: 814/463, loss: 0.4550955295562744 2023-01-22 16:22:03.627049: step: 816/463, loss: 0.0725342258810997 2023-01-22 16:22:04.211716: step: 818/463, loss: 0.2422669678926468 2023-01-22 16:22:04.874418: step: 820/463, loss: 0.03026687167584896 2023-01-22 16:22:05.502643: step: 822/463, loss: 0.06415046006441116 2023-01-22 16:22:06.088208: step: 824/463, loss: 0.06968147307634354 2023-01-22 16:22:06.714923: step: 826/463, loss: 0.6831038594245911 2023-01-22 16:22:07.409496: step: 828/463, loss: 0.1439363956451416 2023-01-22 16:22:08.016770: step: 830/463, loss: 0.22143974900245667 2023-01-22 16:22:08.635212: step: 832/463, loss: 0.15041117370128632 2023-01-22 16:22:09.291162: step: 834/463, loss: 0.0824727937579155 2023-01-22 16:22:09.888495: step: 836/463, loss: 0.8542856574058533 
2023-01-22 16:22:10.510409: step: 838/463, loss: 0.03801656514406204 2023-01-22 16:22:11.048195: step: 840/463, loss: 0.0789526030421257 2023-01-22 16:22:11.705073: step: 842/463, loss: 0.097702257335186 2023-01-22 16:22:12.308887: step: 844/463, loss: 0.049499187618494034 2023-01-22 16:22:12.980358: step: 846/463, loss: 2.6355817317962646 2023-01-22 16:22:13.626534: step: 848/463, loss: 0.12496107071638107 2023-01-22 16:22:14.234085: step: 850/463, loss: 0.1772080957889557 2023-01-22 16:22:14.818674: step: 852/463, loss: 0.07433206588029861 2023-01-22 16:22:15.424319: step: 854/463, loss: 0.14316411316394806 2023-01-22 16:22:16.081048: step: 856/463, loss: 0.1420987993478775 2023-01-22 16:22:16.649087: step: 858/463, loss: 1.2901135683059692 2023-01-22 16:22:17.360289: step: 860/463, loss: 0.052284400910139084 2023-01-22 16:22:17.952304: step: 862/463, loss: 0.044873494654893875 2023-01-22 16:22:18.538511: step: 864/463, loss: 0.03619229048490524 2023-01-22 16:22:19.198483: step: 866/463, loss: 0.10276873409748077 2023-01-22 16:22:19.823163: step: 868/463, loss: 0.14511355757713318 2023-01-22 16:22:20.356548: step: 870/463, loss: 0.09672579169273376 2023-01-22 16:22:20.932563: step: 872/463, loss: 0.10883469879627228 2023-01-22 16:22:21.554112: step: 874/463, loss: 0.24897442758083344 2023-01-22 16:22:22.144614: step: 876/463, loss: 0.0469701886177063 2023-01-22 16:22:22.801656: step: 878/463, loss: 0.017191709950566292 2023-01-22 16:22:23.476357: step: 880/463, loss: 0.5178337097167969 2023-01-22 16:22:24.044602: step: 882/463, loss: 0.24583376944065094 2023-01-22 16:22:24.644999: step: 884/463, loss: 0.04219265654683113 2023-01-22 16:22:25.296409: step: 886/463, loss: 0.05753428116440773 2023-01-22 16:22:25.886518: step: 888/463, loss: 0.07607780396938324 2023-01-22 16:22:26.494843: step: 890/463, loss: 0.15747754275798798 2023-01-22 16:22:27.088949: step: 892/463, loss: 0.23383373022079468 2023-01-22 16:22:27.704931: step: 894/463, loss: 0.23343023657798767 
2023-01-22 16:22:28.352284: step: 896/463, loss: 0.31316110491752625 2023-01-22 16:22:28.998821: step: 898/463, loss: 0.056508179754018784 2023-01-22 16:22:29.607980: step: 900/463, loss: 0.152079775929451 2023-01-22 16:22:30.246693: step: 902/463, loss: 0.0897534117102623 2023-01-22 16:22:30.940455: step: 904/463, loss: 0.13596443831920624 2023-01-22 16:22:31.491648: step: 906/463, loss: 0.016014672815799713 2023-01-22 16:22:32.102086: step: 908/463, loss: 0.0466289259493351 2023-01-22 16:22:32.736804: step: 910/463, loss: 0.04789202660322189 2023-01-22 16:22:33.290132: step: 912/463, loss: 0.018287284299731255 2023-01-22 16:22:33.907707: step: 914/463, loss: 0.10381238162517548 2023-01-22 16:22:34.489568: step: 916/463, loss: 0.08761463314294815 2023-01-22 16:22:35.130230: step: 918/463, loss: 0.04405404254794121 2023-01-22 16:22:35.744194: step: 920/463, loss: 0.03705519810318947 2023-01-22 16:22:36.358317: step: 922/463, loss: 0.025801027193665504 2023-01-22 16:22:37.009916: step: 924/463, loss: 0.4665796160697937 2023-01-22 16:22:37.633710: step: 926/463, loss: 0.5286181569099426
==================================================
Loss: 0.240
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2781136035372145, 'r': 0.32983112373958073, 'f1': 0.3017725732825678}, 'combined': 0.22235873820820784, 'epoch': 15}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3272835825429386, 'r': 0.3161359321331791, 'f1': 0.32161318728786326}, 'combined': 0.22626053377035607, 'epoch': 15}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28039443897014577, 'r': 0.3250872907225029, 'f1': 0.30109139228604404}, 'combined': 0.221856815368664, 'epoch': 15}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3248514328907363, 'r': 0.30839607646483, 'f1': 0.31640995300379066}, 'combined': 0.22465106663269135, 'epoch': 15}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29419352931628273, 'r': 0.339969372587507, 'f1': 0.3154293298479158}, 'combined': 0.23242161146688528, 'epoch': 15}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.343089588698682, 'r': 0.29459001333289975, 'f1': 0.31699545096666953}, 'combined': 0.22506677018633536, 'epoch': 15}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23809523809523808, 'r': 0.38095238095238093, 'f1': 0.293040293040293}, 'combined': 0.19536019536019533, 'epoch': 15}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.30303030303030304, 'r': 0.43478260869565216, 'f1': 0.35714285714285715}, 'combined': 0.17857142857142858, 'epoch': 15}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.484375, 'r': 0.2672413793103448, 'f1': 0.34444444444444444}, 'combined': 0.22962962962962963, 'epoch': 15}
New best russian model...
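A side note on how the summary metrics above fit together: in each block, every 'f1' is the harmonic mean of its 'p' and 'r', and 'combined' matches the template f1 multiplied by the slot f1 (e.g. for Dev Chinese, 0.7368421052631579 × 0.3017725732825678 ≈ 0.22235873820820784). A minimal sketch of that arithmetic, assuming this is indeed how the script derives the combined score:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

# Dev Chinese, epoch 15 -- values taken from the log above
template_f1 = f1(1.0, 0.5833333333333334)  # ~0.7368421052631579
slot_f1 = 0.3017725732825678
combined = template_f1 * slot_f1           # ~0.22235873820820784
```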
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.286031314699793, 'r': 0.2996001626457035, 'f1': 0.29265854627300414}, 'combined': 0.21564313935905566, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3466959902350088, 'r': 0.3012773015579334, 'f1': 0.32239486942414375}, 'combined': 0.2268104609014077, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32215447154471544, 'r': 0.3773809523809524, 'f1': 0.3475877192982456}, 'combined': 0.2317251461988304, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29419352931628273, 'r': 0.339969372587507, 'f1': 0.3154293298479158}, 'combined': 0.23242161146688528, 'epoch': 15}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.343089588698682, 'r': 0.29459001333289975, 'f1': 0.31699545096666953}, 'combined': 0.22506677018633536, 'epoch': 15}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.484375, 'r': 0.2672413793103448, 'f1': 0.34444444444444444}, 'combined': 0.22962962962962963, 'epoch': 15}
******************************
Epoch: 16
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:25:21.887729: step: 2/463, loss: 0.14428521692752838 2023-01-22 16:25:22.474191: step: 4/463, loss: 0.02583499625325203 2023-01-22 16:25:23.073236: step: 6/463, loss: 0.05102044716477394 2023-01-22 16:25:23.641066: step: 8/463, loss: 0.07324698567390442 2023-01-22 16:25:24.195126: step: 10/463, loss: 0.10958367586135864 2023-01-22 16:25:24.775649: step: 12/463, loss: 0.0759357362985611 2023-01-22 16:25:25.346485: step: 14/463, loss: 0.05862908810377121 2023-01-22 16:25:25.990262: step: 16/463, loss: 0.019344186410307884 2023-01-22 16:25:26.568550: step: 18/463, loss: 0.061939436942338943 2023-01-22 16:25:27.239305: step: 20/463, loss: 0.06530731171369553 2023-01-22 16:25:27.815286: step: 22/463, loss: 0.15866726636886597 2023-01-22 16:25:28.446342: step: 24/463, loss: 0.0728439912199974 2023-01-22 16:25:29.073505: step: 26/463, loss: 0.05279998108744621 2023-01-22 16:25:29.678286: step: 28/463, loss: 0.04092046245932579 2023-01-22 16:25:30.313828: step: 30/463, loss: 0.10380543768405914 2023-01-22 16:25:30.900286: step: 32/463, loss: 0.061272770166397095 2023-01-22 16:25:31.628235: step: 34/463, loss: 0.049886491149663925 2023-01-22 16:25:32.274032: step: 36/463, loss: 0.03455881401896477 2023-01-22 16:25:32.836462: step: 38/463, loss: 0.011654849164187908 2023-01-22 16:25:33.499474: step: 40/463, loss: 0.18190611898899078 2023-01-22 16:25:34.189293: step: 42/463, loss: 0.0904662162065506 2023-01-22 16:25:34.836551: step: 44/463, loss: 0.10908166319131851 2023-01-22 16:25:35.493549: step: 46/463, loss: 0.11225692182779312 2023-01-22 16:25:36.155701: step: 48/463, loss: 0.07173627614974976 2023-01-22 
16:25:36.757509: step: 50/463, loss: 0.017527112737298012 2023-01-22 16:25:37.344882: step: 52/463, loss: 0.05212482810020447 2023-01-22 16:25:37.975925: step: 54/463, loss: 0.2083996832370758 2023-01-22 16:25:38.595850: step: 56/463, loss: 0.3192879259586334 2023-01-22 16:25:39.229881: step: 58/463, loss: 0.12610578536987305 2023-01-22 16:25:39.858002: step: 60/463, loss: 0.010418935678899288 2023-01-22 16:25:40.537821: step: 62/463, loss: 1.714826226234436 2023-01-22 16:25:41.145268: step: 64/463, loss: 0.05784856900572777 2023-01-22 16:25:41.743096: step: 66/463, loss: 0.0031393696554005146 2023-01-22 16:25:42.352497: step: 68/463, loss: 0.14895278215408325 2023-01-22 16:25:43.004046: step: 70/463, loss: 0.037023916840553284 2023-01-22 16:25:43.597635: step: 72/463, loss: 0.04162986949086189 2023-01-22 16:25:44.170434: step: 74/463, loss: 0.15059104561805725 2023-01-22 16:25:44.783116: step: 76/463, loss: 0.2820736765861511 2023-01-22 16:25:45.448885: step: 78/463, loss: 0.07529161125421524 2023-01-22 16:25:46.179302: step: 80/463, loss: 0.05722295492887497 2023-01-22 16:25:46.738997: step: 82/463, loss: 0.04796871915459633 2023-01-22 16:25:47.418168: step: 84/463, loss: 0.028142396360635757 2023-01-22 16:25:48.123385: step: 86/463, loss: 0.05026878044009209 2023-01-22 16:25:48.761267: step: 88/463, loss: 0.3885742425918579 2023-01-22 16:25:49.378725: step: 90/463, loss: 0.015361827798187733 2023-01-22 16:25:49.985880: step: 92/463, loss: 0.044300224632024765 2023-01-22 16:25:50.583903: step: 94/463, loss: 0.19278568029403687 2023-01-22 16:25:51.152749: step: 96/463, loss: 0.054867010563611984 2023-01-22 16:25:51.811878: step: 98/463, loss: 0.05514153838157654 2023-01-22 16:25:52.373023: step: 100/463, loss: 0.01758463680744171 2023-01-22 16:25:52.978681: step: 102/463, loss: 0.001292157918214798 2023-01-22 16:25:53.545442: step: 104/463, loss: 0.03637455403804779 2023-01-22 16:25:54.183928: step: 106/463, loss: 0.1117054671049118 2023-01-22 16:25:54.737915: 
step: 108/463, loss: 0.06241156905889511 2023-01-22 16:25:55.320375: step: 110/463, loss: 0.02249697782099247 2023-01-22 16:25:55.946186: step: 112/463, loss: 0.031201960518956184 2023-01-22 16:25:56.544174: step: 114/463, loss: 0.10168837755918503 2023-01-22 16:25:57.110410: step: 116/463, loss: 0.6046896576881409 2023-01-22 16:25:57.688441: step: 118/463, loss: 0.05307856947183609 2023-01-22 16:25:58.298081: step: 120/463, loss: 0.03175158426165581 2023-01-22 16:25:58.969502: step: 122/463, loss: 0.024691177532076836 2023-01-22 16:25:59.581523: step: 124/463, loss: 0.7088136672973633 2023-01-22 16:26:00.147809: step: 126/463, loss: 0.0698840320110321 2023-01-22 16:26:00.714223: step: 128/463, loss: 0.02531246468424797 2023-01-22 16:26:01.319044: step: 130/463, loss: 0.11426868289709091 2023-01-22 16:26:02.033602: step: 132/463, loss: 0.04433256760239601 2023-01-22 16:26:02.686450: step: 134/463, loss: 0.08614595234394073 2023-01-22 16:26:03.340374: step: 136/463, loss: 0.08832154422998428 2023-01-22 16:26:03.929330: step: 138/463, loss: 0.012754051014780998 2023-01-22 16:26:04.603200: step: 140/463, loss: 0.05072607472538948 2023-01-22 16:26:05.153146: step: 142/463, loss: 0.07336296886205673 2023-01-22 16:26:05.778362: step: 144/463, loss: 0.06217144429683685 2023-01-22 16:26:06.369348: step: 146/463, loss: 0.025273405015468597 2023-01-22 16:26:07.005373: step: 148/463, loss: 0.055815376341342926 2023-01-22 16:26:07.678936: step: 150/463, loss: 0.019422708079218864 2023-01-22 16:26:08.304888: step: 152/463, loss: 0.0898275375366211 2023-01-22 16:26:08.923617: step: 154/463, loss: 0.12787140905857086 2023-01-22 16:26:09.540716: step: 156/463, loss: 0.034912385046482086 2023-01-22 16:26:10.149769: step: 158/463, loss: 0.012679366394877434 2023-01-22 16:26:10.814908: step: 160/463, loss: 0.06876155734062195 2023-01-22 16:26:11.401919: step: 162/463, loss: 0.0594242662191391 2023-01-22 16:26:11.960357: step: 164/463, loss: 0.10026215016841888 2023-01-22 
16:26:12.556576: step: 166/463, loss: 0.3216547667980194 2023-01-22 16:26:13.281260: step: 168/463, loss: 0.018193338066339493 2023-01-22 16:26:13.894324: step: 170/463, loss: 0.030803395435214043 2023-01-22 16:26:14.509502: step: 172/463, loss: 0.05058509483933449 2023-01-22 16:26:15.100353: step: 174/463, loss: 0.08702760189771652 2023-01-22 16:26:15.712357: step: 176/463, loss: 0.09562745690345764 2023-01-22 16:26:16.266472: step: 178/463, loss: 0.00952006597071886 2023-01-22 16:26:16.829748: step: 180/463, loss: 0.04196525365114212 2023-01-22 16:26:17.442376: step: 182/463, loss: 0.3467611074447632 2023-01-22 16:26:18.099663: step: 184/463, loss: 0.10120903700590134 2023-01-22 16:26:18.774709: step: 186/463, loss: 0.07917984575033188 2023-01-22 16:26:19.401530: step: 188/463, loss: 0.052008554339408875 2023-01-22 16:26:20.029177: step: 190/463, loss: 0.11970135569572449 2023-01-22 16:26:20.633265: step: 192/463, loss: 0.06218023598194122 2023-01-22 16:26:21.244426: step: 194/463, loss: 1.0854462385177612 2023-01-22 16:26:21.860483: step: 196/463, loss: 0.03925105929374695 2023-01-22 16:26:22.413356: step: 198/463, loss: 0.059321433305740356 2023-01-22 16:26:23.013918: step: 200/463, loss: 0.1344059854745865 2023-01-22 16:26:23.613099: step: 202/463, loss: 0.013624987564980984 2023-01-22 16:26:24.185846: step: 204/463, loss: 0.04231403395533562 2023-01-22 16:26:24.797737: step: 206/463, loss: 0.0667969286441803 2023-01-22 16:26:25.425036: step: 208/463, loss: 0.30465611815452576 2023-01-22 16:26:26.062821: step: 210/463, loss: 0.123445525765419 2023-01-22 16:26:26.692462: step: 212/463, loss: 0.04022921621799469 2023-01-22 16:26:27.341845: step: 214/463, loss: 0.19167421758174896 2023-01-22 16:26:27.922509: step: 216/463, loss: 0.09953426569700241 2023-01-22 16:26:28.494526: step: 218/463, loss: 0.1635047346353531 2023-01-22 16:26:29.210811: step: 220/463, loss: 0.054468296468257904 2023-01-22 16:26:29.759096: step: 222/463, loss: 0.007843120023608208 2023-01-22 
16:26:30.373767: step: 224/463, loss: 0.1256270706653595 2023-01-22 16:26:31.010306: step: 226/463, loss: 0.07846182584762573 2023-01-22 16:26:31.705526: step: 228/463, loss: 0.11458995193243027 2023-01-22 16:26:32.325688: step: 230/463, loss: 0.07498395442962646 2023-01-22 16:26:32.952860: step: 232/463, loss: 0.04375192150473595 2023-01-22 16:26:33.622812: step: 234/463, loss: 0.024539044126868248 2023-01-22 16:26:34.161736: step: 236/463, loss: 0.21527370810508728 2023-01-22 16:26:34.735735: step: 238/463, loss: 0.04686209559440613 2023-01-22 16:26:35.313676: step: 240/463, loss: 0.028561212122440338 2023-01-22 16:26:35.936791: step: 242/463, loss: 0.07770399004220963 2023-01-22 16:26:36.564473: step: 244/463, loss: 0.31057068705558777 2023-01-22 16:26:37.202082: step: 246/463, loss: 0.10229376703500748 2023-01-22 16:26:37.801016: step: 248/463, loss: 0.040732596069574356 2023-01-22 16:26:38.350420: step: 250/463, loss: 0.21103571355342865 2023-01-22 16:26:38.938234: step: 252/463, loss: 0.12881596386432648 2023-01-22 16:26:39.616877: step: 254/463, loss: 0.327356219291687 2023-01-22 16:26:40.181033: step: 256/463, loss: 0.15692844986915588 2023-01-22 16:26:40.762113: step: 258/463, loss: 0.0574437715113163 2023-01-22 16:26:41.429487: step: 260/463, loss: 0.011539776809513569 2023-01-22 16:26:42.062454: step: 262/463, loss: 0.05295128747820854 2023-01-22 16:26:42.650550: step: 264/463, loss: 0.07005015015602112 2023-01-22 16:26:43.214799: step: 266/463, loss: 0.17537076771259308 2023-01-22 16:26:43.810063: step: 268/463, loss: 0.03812427446246147 2023-01-22 16:26:44.429566: step: 270/463, loss: 0.03929457068443298 2023-01-22 16:26:44.982186: step: 272/463, loss: 0.2435234785079956 2023-01-22 16:26:45.605363: step: 274/463, loss: 0.09741266816854477 2023-01-22 16:26:46.224680: step: 276/463, loss: 0.07427236437797546 2023-01-22 16:26:46.821841: step: 278/463, loss: 0.019279256463050842 2023-01-22 16:26:47.442084: step: 280/463, loss: 0.04460856318473816 
2023-01-22 16:26:48.091170: step: 282/463, loss: 0.37846627831459045 2023-01-22 16:26:48.707008: step: 284/463, loss: 0.02639560028910637 2023-01-22 16:26:49.337106: step: 286/463, loss: 0.0336846262216568 2023-01-22 16:26:49.958368: step: 288/463, loss: 0.008390380069613457 2023-01-22 16:26:50.530324: step: 290/463, loss: 0.05639432743191719 2023-01-22 16:26:51.171319: step: 292/463, loss: 0.10261567682027817 2023-01-22 16:26:51.722919: step: 294/463, loss: 0.26967504620552063 2023-01-22 16:26:52.366952: step: 296/463, loss: 0.10545353591442108 2023-01-22 16:26:52.973699: step: 298/463, loss: 0.24160727858543396 2023-01-22 16:26:53.569543: step: 300/463, loss: 0.11688096821308136 2023-01-22 16:26:54.154569: step: 302/463, loss: 0.04360333830118179 2023-01-22 16:26:54.753285: step: 304/463, loss: 0.10187655687332153 2023-01-22 16:26:55.314239: step: 306/463, loss: 0.07066205888986588 2023-01-22 16:26:55.970018: step: 308/463, loss: 0.021968575194478035 2023-01-22 16:26:56.589406: step: 310/463, loss: 0.054915785789489746 2023-01-22 16:26:57.265141: step: 312/463, loss: 0.09389752894639969 2023-01-22 16:26:57.845358: step: 314/463, loss: 0.03719539940357208 2023-01-22 16:26:58.494985: step: 316/463, loss: 0.022611156105995178 2023-01-22 16:26:59.095586: step: 318/463, loss: 0.024557486176490784 2023-01-22 16:26:59.619627: step: 320/463, loss: 0.04107135906815529 2023-01-22 16:27:00.214109: step: 322/463, loss: 0.016931327059864998 2023-01-22 16:27:00.812274: step: 324/463, loss: 0.07288599759340286 2023-01-22 16:27:01.410426: step: 326/463, loss: 0.21855506300926208 2023-01-22 16:27:02.087673: step: 328/463, loss: 0.14271271228790283 2023-01-22 16:27:02.640733: step: 330/463, loss: 0.09647845476865768 2023-01-22 16:27:03.285251: step: 332/463, loss: 0.09686043113470078 2023-01-22 16:27:03.853253: step: 334/463, loss: 0.11915914714336395 2023-01-22 16:27:04.478187: step: 336/463, loss: 0.014868238940834999 2023-01-22 16:27:05.074790: step: 338/463, loss: 
0.0663587674498558 2023-01-22 16:27:05.726648: step: 340/463, loss: 0.059334490448236465 2023-01-22 16:27:06.446244: step: 342/463, loss: 0.02988993190228939 2023-01-22 16:27:07.041531: step: 344/463, loss: 0.3301783502101898 2023-01-22 16:27:07.819722: step: 346/463, loss: 0.042132653295993805 2023-01-22 16:27:08.441964: step: 348/463, loss: 0.10090532153844833 2023-01-22 16:27:09.119333: step: 350/463, loss: 0.0435476191341877 2023-01-22 16:27:09.749567: step: 352/463, loss: 0.03111307881772518 2023-01-22 16:27:10.352911: step: 354/463, loss: 0.0939292460680008 2023-01-22 16:27:10.942416: step: 356/463, loss: 0.055269159376621246 2023-01-22 16:27:11.478420: step: 358/463, loss: 0.056476395577192307 2023-01-22 16:27:12.101199: step: 360/463, loss: 0.05588222295045853 2023-01-22 16:27:12.704379: step: 362/463, loss: 0.08272379636764526 2023-01-22 16:27:13.310357: step: 364/463, loss: 0.009526556357741356 2023-01-22 16:27:14.002139: step: 366/463, loss: 0.07410157471895218 2023-01-22 16:27:14.613119: step: 368/463, loss: 0.058374904096126556 2023-01-22 16:27:15.220798: step: 370/463, loss: 1.1631983518600464 2023-01-22 16:27:15.862059: step: 372/463, loss: 0.06203728914260864 2023-01-22 16:27:16.490895: step: 374/463, loss: 0.07586369663476944 2023-01-22 16:27:17.073282: step: 376/463, loss: 0.0773143470287323 2023-01-22 16:27:17.752178: step: 378/463, loss: 0.09704328328371048 2023-01-22 16:27:18.423123: step: 380/463, loss: 0.039056308567523956 2023-01-22 16:27:19.022489: step: 382/463, loss: 0.14919227361679077 2023-01-22 16:27:19.620653: step: 384/463, loss: 0.04070664197206497 2023-01-22 16:27:20.314535: step: 386/463, loss: 0.2610322833061218 2023-01-22 16:27:20.929542: step: 388/463, loss: 0.05771954730153084 2023-01-22 16:27:21.565134: step: 390/463, loss: 0.06652950495481491 2023-01-22 16:27:22.188872: step: 392/463, loss: 0.09036467969417572 2023-01-22 16:27:22.747247: step: 394/463, loss: 0.062034107744693756 2023-01-22 16:27:23.364307: step: 396/463, 
loss: 0.030026812106370926 2023-01-22 16:27:23.994344: step: 398/463, loss: 0.10651829838752747 2023-01-22 16:27:24.684167: step: 400/463, loss: 0.06937893480062485 2023-01-22 16:27:25.275341: step: 402/463, loss: 0.7701834440231323 2023-01-22 16:27:25.944184: step: 404/463, loss: 0.24434304237365723 2023-01-22 16:27:26.506407: step: 406/463, loss: 0.04001837223768234 2023-01-22 16:27:27.040766: step: 408/463, loss: 0.05097857117652893 2023-01-22 16:27:27.661675: step: 410/463, loss: 0.045517537742853165 2023-01-22 16:27:28.267605: step: 412/463, loss: 0.12398950010538101 2023-01-22 16:27:28.912110: step: 414/463, loss: 0.03074038401246071 2023-01-22 16:27:29.486424: step: 416/463, loss: 0.07238101959228516 2023-01-22 16:27:30.053101: step: 418/463, loss: 0.0463382750749588 2023-01-22 16:27:30.654740: step: 420/463, loss: 0.09890548139810562 2023-01-22 16:27:31.235906: step: 422/463, loss: 0.4530923664569855 2023-01-22 16:27:31.885222: step: 424/463, loss: 0.01615770533680916 2023-01-22 16:27:32.501422: step: 426/463, loss: 0.07219133526086807 2023-01-22 16:27:33.107139: step: 428/463, loss: 0.15430380403995514 2023-01-22 16:27:33.717588: step: 430/463, loss: 0.13384389877319336 2023-01-22 16:27:34.286668: step: 432/463, loss: 0.043387409299612045 2023-01-22 16:27:34.869901: step: 434/463, loss: 0.008723920211195946 2023-01-22 16:27:35.480100: step: 436/463, loss: 0.06327725201845169 2023-01-22 16:27:36.067815: step: 438/463, loss: 0.055777743458747864 2023-01-22 16:27:36.670839: step: 440/463, loss: 0.11700907349586487 2023-01-22 16:27:37.268718: step: 442/463, loss: 0.8742594718933105 2023-01-22 16:27:37.848648: step: 444/463, loss: 0.2131926417350769 2023-01-22 16:27:38.439885: step: 446/463, loss: 0.1394772082567215 2023-01-22 16:27:39.044805: step: 448/463, loss: 0.07414799183607101 2023-01-22 16:27:39.719826: step: 450/463, loss: 0.06754671782255173 2023-01-22 16:27:40.383887: step: 452/463, loss: 0.021845154464244843 2023-01-22 16:27:40.938413: step: 
454/463, loss: 0.01736283488571644 2023-01-22 16:27:41.504865: step: 456/463, loss: 0.06886675953865051 2023-01-22 16:27:42.062687: step: 458/463, loss: 0.008794305846095085 2023-01-22 16:27:42.625950: step: 460/463, loss: 0.2160838395357132 2023-01-22 16:27:43.335377: step: 462/463, loss: 0.07695823162794113 2023-01-22 16:27:43.929863: step: 464/463, loss: 0.2018650472164154 2023-01-22 16:27:44.538638: step: 466/463, loss: 0.13209478557109833 2023-01-22 16:27:45.089212: step: 468/463, loss: 0.0383484922349453 2023-01-22 16:27:45.689601: step: 470/463, loss: 0.06409667432308197 2023-01-22 16:27:46.300395: step: 472/463, loss: 0.3389008641242981 2023-01-22 16:27:46.891149: step: 474/463, loss: 0.14920105040073395 2023-01-22 16:27:47.529299: step: 476/463, loss: 0.11754865199327469 2023-01-22 16:27:48.150214: step: 478/463, loss: 0.03491789475083351 2023-01-22 16:27:48.768442: step: 480/463, loss: 0.34321334958076477 2023-01-22 16:27:49.422134: step: 482/463, loss: 0.06394869089126587 2023-01-22 16:27:50.023154: step: 484/463, loss: 0.028020964935421944 2023-01-22 16:27:50.592071: step: 486/463, loss: 0.03808412328362465 2023-01-22 16:27:51.199072: step: 488/463, loss: 0.09101155400276184 2023-01-22 16:27:51.809472: step: 490/463, loss: 0.08579631894826889 2023-01-22 16:27:52.449716: step: 492/463, loss: 0.10948781669139862 2023-01-22 16:27:53.101727: step: 494/463, loss: 0.11725734174251556 2023-01-22 16:27:53.681051: step: 496/463, loss: 0.025834525004029274 2023-01-22 16:27:54.258793: step: 498/463, loss: 0.2720796763896942 2023-01-22 16:27:54.862629: step: 500/463, loss: 0.19415166974067688 2023-01-22 16:27:55.453650: step: 502/463, loss: 0.28061479330062866 2023-01-22 16:27:56.110603: step: 504/463, loss: 0.06378906965255737 2023-01-22 16:27:56.741750: step: 506/463, loss: 0.09497807919979095 2023-01-22 16:27:57.344680: step: 508/463, loss: 0.01599784381687641 2023-01-22 16:27:57.882325: step: 510/463, loss: 0.21024303138256073 2023-01-22 16:27:58.440939: step: 
512/463, loss: 0.1353234350681305 2023-01-22 16:27:59.089900: step: 514/463, loss: 0.033040136098861694 2023-01-22 16:27:59.705265: step: 516/463, loss: 0.01999240554869175 2023-01-22 16:28:00.245965: step: 518/463, loss: 0.16727715730667114 2023-01-22 16:28:00.814963: step: 520/463, loss: 0.044732220470905304 2023-01-22 16:28:01.443268: step: 522/463, loss: 0.01749737560749054 2023-01-22 16:28:02.077302: step: 524/463, loss: 0.14807678759098053 2023-01-22 16:28:02.625611: step: 526/463, loss: 0.01560723502188921 2023-01-22 16:28:03.280809: step: 528/463, loss: 0.47401028871536255 2023-01-22 16:28:03.827562: step: 530/463, loss: 0.3662603199481964 2023-01-22 16:28:04.433856: step: 532/463, loss: 1.1697790622711182 2023-01-22 16:28:05.004960: step: 534/463, loss: 0.06237613409757614 2023-01-22 16:28:05.602856: step: 536/463, loss: 0.06309166550636292 2023-01-22 16:28:06.141988: step: 538/463, loss: 0.021909452974796295 2023-01-22 16:28:06.749954: step: 540/463, loss: 0.1158827617764473 2023-01-22 16:28:07.402234: step: 542/463, loss: 0.07556529343128204 2023-01-22 16:28:07.985205: step: 544/463, loss: 0.06147349625825882 2023-01-22 16:28:08.543810: step: 546/463, loss: 0.022432060912251472 2023-01-22 16:28:09.168974: step: 548/463, loss: 0.12663380801677704 2023-01-22 16:28:09.783031: step: 550/463, loss: 0.05349862203001976 2023-01-22 16:28:10.360502: step: 552/463, loss: 0.02763727493584156 2023-01-22 16:28:10.979946: step: 554/463, loss: 0.09086702018976212 2023-01-22 16:28:11.529413: step: 556/463, loss: 0.007912018336355686 2023-01-22 16:28:12.131837: step: 558/463, loss: 0.04992949962615967 2023-01-22 16:28:12.701728: step: 560/463, loss: 0.06533095985651016 2023-01-22 16:28:13.312561: step: 562/463, loss: 0.030591044574975967 2023-01-22 16:28:13.939087: step: 564/463, loss: 0.01619875431060791 2023-01-22 16:28:14.552468: step: 566/463, loss: 0.2588687539100647 2023-01-22 16:28:15.138307: step: 568/463, loss: 0.17858584225177765 2023-01-22 16:28:15.788311: 
step: 570/463, loss: 0.027256622910499573 2023-01-22 16:28:16.365047: step: 572/463, loss: 0.16483838856220245 2023-01-22 16:28:16.946964: step: 574/463, loss: 0.14051298797130585 2023-01-22 16:28:17.534690: step: 576/463, loss: 0.0730261579155922 2023-01-22 16:28:18.342934: step: 578/463, loss: 0.08819999545812607 2023-01-22 16:28:18.921481: step: 580/463, loss: 0.08714915812015533 2023-01-22 16:28:19.557776: step: 582/463, loss: 0.040923237800598145 2023-01-22 16:28:20.163018: step: 584/463, loss: 0.1643635481595993 2023-01-22 16:28:20.751727: step: 586/463, loss: 0.3849364221096039 2023-01-22 16:28:21.376732: step: 588/463, loss: 0.47071611881256104 2023-01-22 16:28:22.059242: step: 590/463, loss: 0.09024696052074432 2023-01-22 16:28:22.673001: step: 592/463, loss: 0.06725668162107468 2023-01-22 16:28:23.293382: step: 594/463, loss: 0.043151725083589554 2023-01-22 16:28:23.976077: step: 596/463, loss: 0.038434308022260666 2023-01-22 16:28:24.519843: step: 598/463, loss: 0.05431987717747688 2023-01-22 16:28:25.145969: step: 600/463, loss: 0.22302481532096863 2023-01-22 16:28:25.732857: step: 602/463, loss: 0.04111813008785248 2023-01-22 16:28:26.355408: step: 604/463, loss: 0.07352280616760254 2023-01-22 16:28:27.058015: step: 606/463, loss: 0.02957470528781414 2023-01-22 16:28:27.673515: step: 608/463, loss: 0.07356975227594376 2023-01-22 16:28:28.317535: step: 610/463, loss: 0.07597237825393677 2023-01-22 16:28:28.879563: step: 612/463, loss: 0.2510901391506195 2023-01-22 16:28:29.498850: step: 614/463, loss: 0.019346531480550766 2023-01-22 16:28:30.080330: step: 616/463, loss: 0.039806585758924484 2023-01-22 16:28:30.728309: step: 618/463, loss: 0.02320874109864235 2023-01-22 16:28:31.352269: step: 620/463, loss: 0.28917521238327026 2023-01-22 16:28:31.995195: step: 622/463, loss: 0.041628871113061905 2023-01-22 16:28:32.594611: step: 624/463, loss: 0.026626314967870712 2023-01-22 16:28:33.286155: step: 626/463, loss: 0.07056853920221329 2023-01-22 
16:28:33.911928: step: 628/463, loss: 0.04680918902158737 2023-01-22 16:28:34.496139: step: 630/463, loss: 0.04893006756901741 2023-01-22 16:28:35.059168: step: 632/463, loss: 0.04634620621800423 2023-01-22 16:28:35.670760: step: 634/463, loss: 0.5142707228660583 2023-01-22 16:28:36.239880: step: 636/463, loss: 0.5020127892494202 2023-01-22 16:28:36.829549: step: 638/463, loss: 0.08555576950311661 2023-01-22 16:28:37.404857: step: 640/463, loss: 0.09919973462820053 2023-01-22 16:28:37.985142: step: 642/463, loss: 0.04963303357362747 2023-01-22 16:28:38.618427: step: 644/463, loss: 0.0457271970808506 2023-01-22 16:28:39.261295: step: 646/463, loss: 0.19946283102035522 2023-01-22 16:28:39.857534: step: 648/463, loss: 0.06361276656389236 2023-01-22 16:28:40.458288: step: 650/463, loss: 0.13992442190647125 2023-01-22 16:28:41.140320: step: 652/463, loss: 0.030558515340089798 2023-01-22 16:28:41.763682: step: 654/463, loss: 0.10589060932397842 2023-01-22 16:28:42.388911: step: 656/463, loss: 0.05800039321184158 2023-01-22 16:28:42.976730: step: 658/463, loss: 0.02183489128947258 2023-01-22 16:28:43.604727: step: 660/463, loss: 0.26483556628227234 2023-01-22 16:28:44.254458: step: 662/463, loss: 0.04126259684562683 2023-01-22 16:28:44.868018: step: 664/463, loss: 0.3651045262813568 2023-01-22 16:28:45.513728: step: 666/463, loss: 0.08829773217439651 2023-01-22 16:28:46.176699: step: 668/463, loss: 0.0963931679725647 2023-01-22 16:28:46.837271: step: 670/463, loss: 0.08512981235980988 2023-01-22 16:28:47.394676: step: 672/463, loss: 0.07205789536237717 2023-01-22 16:28:47.995184: step: 674/463, loss: 0.3146595358848572 2023-01-22 16:28:48.606373: step: 676/463, loss: 0.016322940587997437 2023-01-22 16:28:49.239784: step: 678/463, loss: 0.05886604264378548 2023-01-22 16:28:49.910395: step: 680/463, loss: 0.10232941061258316 2023-01-22 16:28:50.507631: step: 682/463, loss: 0.10083825141191483 2023-01-22 16:28:51.164071: step: 684/463, loss: 0.053993843495845795 2023-01-22 
16:28:51.781849: step: 686/463, loss: 0.050933588296175 2023-01-22 16:28:52.404329: step: 688/463, loss: 0.046927452087402344 2023-01-22 16:28:52.996404: step: 690/463, loss: 0.004062780644744635 2023-01-22 16:28:53.592419: step: 692/463, loss: 0.05638902261853218 2023-01-22 16:28:54.225790: step: 694/463, loss: 0.05997899919748306 2023-01-22 16:28:54.892613: step: 696/463, loss: 0.03539610654115677 2023-01-22 16:28:55.583876: step: 698/463, loss: 0.12319815903902054 2023-01-22 16:28:56.199278: step: 700/463, loss: 0.08777925372123718 2023-01-22 16:28:56.788645: step: 702/463, loss: 0.030841384083032608 2023-01-22 16:28:57.397945: step: 704/463, loss: 0.0586824044585228 2023-01-22 16:28:58.009542: step: 706/463, loss: 0.13164947926998138 2023-01-22 16:28:58.645307: step: 708/463, loss: 0.07362952828407288 2023-01-22 16:28:59.238946: step: 710/463, loss: 0.26194509863853455 2023-01-22 16:28:59.905019: step: 712/463, loss: 0.021204551681876183 2023-01-22 16:29:00.592409: step: 714/463, loss: 0.11638923734426498 2023-01-22 16:29:01.170347: step: 716/463, loss: 0.11234039068222046 2023-01-22 16:29:01.800006: step: 718/463, loss: 0.07162986695766449 2023-01-22 16:29:02.394347: step: 720/463, loss: 0.06022441387176514 2023-01-22 16:29:03.130761: step: 722/463, loss: 0.045107003301382065 2023-01-22 16:29:03.744995: step: 724/463, loss: 0.22889390587806702 2023-01-22 16:29:04.329455: step: 726/463, loss: 0.04560711979866028 2023-01-22 16:29:04.959925: step: 728/463, loss: 0.018233399838209152 2023-01-22 16:29:05.530879: step: 730/463, loss: 0.10473345965147018 2023-01-22 16:29:06.116276: step: 732/463, loss: 0.17189721763134003 2023-01-22 16:29:06.726157: step: 734/463, loss: 0.04314691573381424 2023-01-22 16:29:07.337208: step: 736/463, loss: 0.031948771327733994 2023-01-22 16:29:08.000899: step: 738/463, loss: 0.035902611911296844 2023-01-22 16:29:08.644968: step: 740/463, loss: 0.09742969274520874 2023-01-22 16:29:09.220543: step: 742/463, loss: 0.005761745851486921 
2023-01-22 16:29:09.873870: step: 744/463, loss: 0.10262815654277802 2023-01-22 16:29:10.499117: step: 746/463, loss: 0.07813889533281326 2023-01-22 16:29:11.104206: step: 748/463, loss: 0.05031275376677513 2023-01-22 16:29:11.718480: step: 750/463, loss: 0.13020561635494232 2023-01-22 16:29:12.262859: step: 752/463, loss: 0.1293104887008667 2023-01-22 16:29:12.876080: step: 754/463, loss: 0.15095391869544983 2023-01-22 16:29:13.540287: step: 756/463, loss: 0.05689273774623871 2023-01-22 16:29:14.198721: step: 758/463, loss: 0.35717275738716125 2023-01-22 16:29:14.869579: step: 760/463, loss: 0.7136415839195251 2023-01-22 16:29:15.393507: step: 762/463, loss: 0.0679430440068245 2023-01-22 16:29:15.985621: step: 764/463, loss: 0.24026182293891907 2023-01-22 16:29:16.613723: step: 766/463, loss: 0.05929393693804741 2023-01-22 16:29:17.198453: step: 768/463, loss: 3.102717399597168 2023-01-22 16:29:17.771520: step: 770/463, loss: 0.08512435853481293 2023-01-22 16:29:18.328962: step: 772/463, loss: 0.11320531368255615 2023-01-22 16:29:18.940317: step: 774/463, loss: 0.10720217227935791 2023-01-22 16:29:19.577203: step: 776/463, loss: 0.21995840966701508 2023-01-22 16:29:20.179570: step: 778/463, loss: 0.23623734712600708 2023-01-22 16:29:20.806625: step: 780/463, loss: 0.03969380259513855 2023-01-22 16:29:21.463938: step: 782/463, loss: 0.15425720810890198 2023-01-22 16:29:22.017889: step: 784/463, loss: 0.04646308720111847 2023-01-22 16:29:22.627353: step: 786/463, loss: 0.12128916382789612 2023-01-22 16:29:23.221910: step: 788/463, loss: 0.03677933290600777 2023-01-22 16:29:23.836670: step: 790/463, loss: 0.03813580051064491 2023-01-22 16:29:24.444803: step: 792/463, loss: 0.14522534608840942 2023-01-22 16:29:25.066794: step: 794/463, loss: 0.06006068363785744 2023-01-22 16:29:25.675451: step: 796/463, loss: 0.031723007559776306 2023-01-22 16:29:26.277900: step: 798/463, loss: 0.21004703640937805 2023-01-22 16:29:26.877932: step: 800/463, loss: 0.0382004976272583 
2023-01-22 16:29:27.604238: step: 802/463, loss: 0.02075333334505558 2023-01-22 16:29:28.214597: step: 804/463, loss: 0.03983953967690468 2023-01-22 16:29:28.820788: step: 806/463, loss: 0.06696569919586182 2023-01-22 16:29:29.400376: step: 808/463, loss: 0.26854294538497925 2023-01-22 16:29:30.062043: step: 810/463, loss: 1.0108675956726074 2023-01-22 16:29:30.664459: step: 812/463, loss: 0.07373742759227753 2023-01-22 16:29:31.238776: step: 814/463, loss: 0.259560763835907 2023-01-22 16:29:31.925309: step: 816/463, loss: 0.017031140625476837 2023-01-22 16:29:32.517687: step: 818/463, loss: 0.03655696660280228 2023-01-22 16:29:33.129026: step: 820/463, loss: 0.09260348975658417 2023-01-22 16:29:33.750564: step: 822/463, loss: 0.11358434706926346 2023-01-22 16:29:34.406224: step: 824/463, loss: 0.01852261647582054 2023-01-22 16:29:35.007140: step: 826/463, loss: 2.0162434577941895 2023-01-22 16:29:35.580941: step: 828/463, loss: 7.414250373840332 2023-01-22 16:29:36.197012: step: 830/463, loss: 0.12549148499965668 2023-01-22 16:29:36.874665: step: 832/463, loss: 0.08543751388788223 2023-01-22 16:29:37.478672: step: 834/463, loss: 0.07387448102235794 2023-01-22 16:29:38.114652: step: 836/463, loss: 0.04594489932060242 2023-01-22 16:29:38.718914: step: 838/463, loss: 0.19005608558654785 2023-01-22 16:29:39.361824: step: 840/463, loss: 0.07954856753349304 2023-01-22 16:29:39.971996: step: 842/463, loss: 0.11144533008337021 2023-01-22 16:29:40.572998: step: 844/463, loss: 0.14995217323303223 2023-01-22 16:29:41.146782: step: 846/463, loss: 0.030094821006059647 2023-01-22 16:29:41.748533: step: 848/463, loss: 0.038266148418188095 2023-01-22 16:29:42.329230: step: 850/463, loss: 0.026978151872754097 2023-01-22 16:29:42.914612: step: 852/463, loss: 0.2461566925048828 2023-01-22 16:29:43.552366: step: 854/463, loss: 0.0639805793762207 2023-01-22 16:29:44.122776: step: 856/463, loss: 0.0035163352731615305 2023-01-22 16:29:44.735857: step: 858/463, loss: 0.017042599618434906 
2023-01-22 16:29:45.315278: step: 860/463, loss: 0.0718710869550705 2023-01-22 16:29:45.964041: step: 862/463, loss: 0.09531405568122864 2023-01-22 16:29:46.655937: step: 864/463, loss: 0.10290028899908066 2023-01-22 16:29:47.301290: step: 866/463, loss: 0.04332425072789192 2023-01-22 16:29:47.889969: step: 868/463, loss: 1.184149146080017 2023-01-22 16:29:48.472042: step: 870/463, loss: 0.049821630120277405 2023-01-22 16:29:49.088992: step: 872/463, loss: 0.06450564414262772 2023-01-22 16:29:49.782952: step: 874/463, loss: 0.338056355714798 2023-01-22 16:29:50.361603: step: 876/463, loss: 0.14539667963981628 2023-01-22 16:29:50.894558: step: 878/463, loss: 0.010352099314332008 2023-01-22 16:29:51.590436: step: 880/463, loss: 0.1212325468659401 2023-01-22 16:29:52.216239: step: 882/463, loss: 0.09514840692281723 2023-01-22 16:29:52.898937: step: 884/463, loss: 0.26018354296684265 2023-01-22 16:29:53.531087: step: 886/463, loss: 0.07416249811649323 2023-01-22 16:29:54.184438: step: 888/463, loss: 0.3655305504798889 2023-01-22 16:29:54.841200: step: 890/463, loss: 0.3487803041934967 2023-01-22 16:29:55.460399: step: 892/463, loss: 0.08373621851205826 2023-01-22 16:29:56.052215: step: 894/463, loss: 0.00867784209549427 2023-01-22 16:29:56.645514: step: 896/463, loss: 0.03816419094800949 2023-01-22 16:29:57.253492: step: 898/463, loss: 0.23765510320663452 2023-01-22 16:29:57.920419: step: 900/463, loss: 0.37295830249786377 2023-01-22 16:29:58.444046: step: 902/463, loss: 0.07036516070365906 2023-01-22 16:29:59.135329: step: 904/463, loss: 0.06355015188455582 2023-01-22 16:29:59.753244: step: 906/463, loss: 0.017205966636538506 2023-01-22 16:30:00.334405: step: 908/463, loss: 0.0038124164566397667 2023-01-22 16:30:00.926031: step: 910/463, loss: 0.15834790468215942 2023-01-22 16:30:01.565759: step: 912/463, loss: 0.06127948313951492 2023-01-22 16:30:02.128654: step: 914/463, loss: 0.06633953750133514 2023-01-22 16:30:02.757202: step: 916/463, loss: 0.08529923111200333 
2023-01-22 16:30:03.309590: step: 918/463, loss: 0.04229454696178436 2023-01-22 16:30:03.964908: step: 920/463, loss: 0.15324260294437408 2023-01-22 16:30:04.635358: step: 922/463, loss: 0.15101855993270874 2023-01-22 16:30:05.239673: step: 924/463, loss: 0.06082436069846153 2023-01-22 16:30:05.870338: step: 926/463, loss: 0.44002920389175415 ================================================== Loss: 0.147 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2899444861617679, 'r': 0.3477133116778696, 'f1': 0.3162121057018763}, 'combined': 0.23299839367506672, 'epoch': 16} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3425991168576647, 'r': 0.310013336982023, 'f1': 0.32549270195272406}, 'combined': 0.23109981838643406, 'epoch': 16} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2987150015373381, 'r': 0.357097629921296, 'f1': 0.32530760755146587}, 'combined': 0.23970034240634325, 'epoch': 16} Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35595372138424786, 'r': 0.29626541177221677, 'f1': 0.32337835698683337}, 'combined': 0.22959863346065168, 'epoch': 16} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16} Sample 
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25675675675675674, 'r': 0.41304347826086957, 'f1': 0.31666666666666665}, 'combined': 0.15833333333333333, 'epoch': 16} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3068181818181818, 'r': 0.23275862068965517, 'f1': 0.2647058823529411}, 'combined': 0.17647058823529407, 'epoch': 16} New best chinese model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31359210182048525, 'r': 0.32489807892596767, 'f1': 0.3191449908555172}, 'combined': 0.23515946694617054, 'epoch': 1} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3653684903443556, 'r': 0.29076445304891124, 'f1': 0.32382513429937054}, 'combined': 0.22991584535255308, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.296875, 'r': 0.41304347826086957, 'f1': 0.3454545454545454}, 'combined': 0.1727272727272727, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29419352931628273, 'r': 
0.339969372587507, 'f1': 0.3154293298479158}, 'combined': 0.23242161146688528, 'epoch': 15} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.343089588698682, 'r': 0.29459001333289975, 'f1': 0.31699545096666953}, 'combined': 0.22506677018633536, 'epoch': 15} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.484375, 'r': 0.2672413793103448, 'f1': 0.34444444444444444}, 'combined': 0.22962962962962963, 'epoch': 15} ****************************** Epoch: 17 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 16:32:49.266046: step: 2/463, loss: 0.009684029035270214 2023-01-22 16:32:49.877337: step: 4/463, loss: 0.07281268388032913 2023-01-22 16:32:50.442692: step: 6/463, loss: 0.038480617105960846 2023-01-22 16:32:51.019513: step: 8/463, loss: 0.04605584219098091 2023-01-22 16:32:51.607708: step: 10/463, loss: 0.014978301711380482 2023-01-22 16:32:52.217969: step: 12/463, loss: 0.042193908244371414 2023-01-22 16:32:52.817809: step: 14/463, loss: 0.07410027086734772 2023-01-22 16:32:53.438079: step: 16/463, loss: 0.06547863036394119 2023-01-22 16:32:54.009768: step: 18/463, loss: 0.00816053431481123 2023-01-22 16:32:54.608351: step: 20/463, loss: 0.08385128527879715 2023-01-22 16:32:55.188973: step: 22/463, loss: 0.30231255292892456 2023-01-22 16:32:55.815706: step: 24/463, loss: 0.04476666450500488 2023-01-22 16:32:56.439343: step: 26/463, loss: 0.10752668976783752 2023-01-22 16:32:57.088338: step: 28/463, loss: 0.05813394486904144 2023-01-22 16:32:57.700578: step: 30/463, loss: 0.048147160559892654 2023-01-22 16:32:58.346552: step: 32/463, loss: 0.0749736800789833 2023-01-22 16:32:58.939207: step: 34/463, loss: 0.0069557856768369675 2023-01-22 16:32:59.614250: step: 36/463, loss: 
0.026787374168634415 2023-01-22 16:33:00.221520: step: 38/463, loss: 0.11341273039579391 2023-01-22 16:33:00.859783: step: 40/463, loss: 0.09718478471040726 2023-01-22 16:33:01.492693: step: 42/463, loss: 0.12734684348106384 2023-01-22 16:33:02.058264: step: 44/463, loss: 0.03651277720928192 2023-01-22 16:33:02.700792: step: 46/463, loss: 0.2283494919538498 2023-01-22 16:33:03.319202: step: 48/463, loss: 0.023413360118865967 2023-01-22 16:33:03.890773: step: 50/463, loss: 0.1033891886472702 2023-01-22 16:33:04.561831: step: 52/463, loss: 0.05415477231144905 2023-01-22 16:33:05.191903: step: 54/463, loss: 0.04135279357433319 2023-01-22 16:33:05.810370: step: 56/463, loss: 0.12212113291025162 2023-01-22 16:33:06.520323: step: 58/463, loss: 0.06725304573774338 2023-01-22 16:33:07.080416: step: 60/463, loss: 0.013101599179208279 2023-01-22 16:33:07.702573: step: 62/463, loss: 0.10833246260881424 2023-01-22 16:33:08.373524: step: 64/463, loss: 0.035515256226062775 2023-01-22 16:33:08.948745: step: 66/463, loss: 0.07247386872768402 2023-01-22 16:33:09.587029: step: 68/463, loss: 0.10552673041820526 2023-01-22 16:33:10.185102: step: 70/463, loss: 0.009567800909280777 2023-01-22 16:33:10.768683: step: 72/463, loss: 0.10047411918640137 2023-01-22 16:33:11.388014: step: 74/463, loss: 0.02756412699818611 2023-01-22 16:33:12.018807: step: 76/463, loss: 0.08619429171085358 2023-01-22 16:33:12.677002: step: 78/463, loss: 0.06763036549091339 2023-01-22 16:33:13.277450: step: 80/463, loss: 0.041169118136167526 2023-01-22 16:33:13.918108: step: 82/463, loss: 0.025287270545959473 2023-01-22 16:33:14.622803: step: 84/463, loss: 0.04191216826438904 2023-01-22 16:33:15.219010: step: 86/463, loss: 0.06233130395412445 2023-01-22 16:33:15.870979: step: 88/463, loss: 0.018947241827845573 2023-01-22 16:33:16.511486: step: 90/463, loss: 0.12368597835302353 2023-01-22 16:33:17.135766: step: 92/463, loss: 0.043756913393735886 2023-01-22 16:33:17.713255: step: 94/463, loss: 0.10026411712169647 
2023-01-22 16:33:18.351711: step: 96/463, loss: 0.11139896512031555 2023-01-22 16:33:19.012811: step: 98/463, loss: 0.09880515933036804 2023-01-22 16:33:19.619847: step: 100/463, loss: 0.009232312440872192 2023-01-22 16:33:20.199527: step: 102/463, loss: 0.016586152836680412 2023-01-22 16:33:20.762373: step: 104/463, loss: 0.023006152361631393 2023-01-22 16:33:21.376003: step: 106/463, loss: 0.036348775029182434 2023-01-22 16:33:21.972971: step: 108/463, loss: 0.11090726405382156 2023-01-22 16:33:22.562591: step: 110/463, loss: 0.030818207189440727 2023-01-22 16:33:23.192128: step: 112/463, loss: 0.03307202458381653 2023-01-22 16:33:23.848147: step: 114/463, loss: 0.03742627426981926 2023-01-22 16:33:24.409735: step: 116/463, loss: 0.04581645503640175 2023-01-22 16:33:25.143233: step: 118/463, loss: 0.04268413409590721 2023-01-22 16:33:25.781391: step: 120/463, loss: 0.038023754954338074 2023-01-22 16:33:26.385557: step: 122/463, loss: 0.37433022260665894 2023-01-22 16:33:26.933180: step: 124/463, loss: 0.009842624887824059 2023-01-22 16:33:27.455715: step: 126/463, loss: 0.004004408605396748 2023-01-22 16:33:28.088934: step: 128/463, loss: 0.09732145071029663 2023-01-22 16:33:28.692012: step: 130/463, loss: 0.07792481780052185 2023-01-22 16:33:29.361624: step: 132/463, loss: 0.030720684677362442 2023-01-22 16:33:29.940742: step: 134/463, loss: 0.2609100937843323 2023-01-22 16:33:30.535367: step: 136/463, loss: 0.06128688156604767 2023-01-22 16:33:31.151106: step: 138/463, loss: 0.026495812460780144 2023-01-22 16:33:31.779546: step: 140/463, loss: 0.034548819065093994 2023-01-22 16:33:32.340999: step: 142/463, loss: 0.09389893710613251 2023-01-22 16:33:32.954941: step: 144/463, loss: 0.05985189229249954 2023-01-22 16:33:33.554713: step: 146/463, loss: 0.08420074731111526 2023-01-22 16:33:34.092142: step: 148/463, loss: 0.075748011469841 2023-01-22 16:33:34.747040: step: 150/463, loss: 0.011778658255934715 2023-01-22 16:33:35.412454: step: 152/463, loss: 
0.02646172232925892 2023-01-22 16:33:36.046498: step: 154/463, loss: 0.16931293904781342 2023-01-22 16:33:36.684342: step: 156/463, loss: 0.1294374167919159 2023-01-22 16:33:37.263903: step: 158/463, loss: 0.012415796518325806 2023-01-22 16:33:37.868463: step: 160/463, loss: 0.08420433104038239 2023-01-22 16:33:38.459320: step: 162/463, loss: 0.38259419798851013 2023-01-22 16:33:39.090294: step: 164/463, loss: 0.039897121489048004 2023-01-22 16:33:39.710731: step: 166/463, loss: 0.04589913785457611 2023-01-22 16:33:40.316019: step: 168/463, loss: 0.059075091034173965 2023-01-22 16:33:40.928462: step: 170/463, loss: 0.05287693068385124 2023-01-22 16:33:41.528843: step: 172/463, loss: 0.04223601147532463 2023-01-22 16:33:42.038092: step: 174/463, loss: 0.10717366635799408 2023-01-22 16:33:42.640379: step: 176/463, loss: 0.04037615284323692 2023-01-22 16:33:43.199854: step: 178/463, loss: 0.022855402901768684 2023-01-22 16:33:43.828403: step: 180/463, loss: 0.0997295007109642 2023-01-22 16:33:44.415572: step: 182/463, loss: 0.03225886449217796 2023-01-22 16:33:45.069374: step: 184/463, loss: 0.1394364982843399 2023-01-22 16:33:45.715058: step: 186/463, loss: 0.0643577128648758 2023-01-22 16:33:46.299390: step: 188/463, loss: 0.08053338527679443 2023-01-22 16:33:46.853223: step: 190/463, loss: 0.08948665112257004 2023-01-22 16:33:47.454171: step: 192/463, loss: 0.044194914400577545 2023-01-22 16:33:48.063096: step: 194/463, loss: 0.11392755806446075 2023-01-22 16:33:48.668846: step: 196/463, loss: 0.027201594784855843 2023-01-22 16:33:49.277115: step: 198/463, loss: 0.03459165617823601 2023-01-22 16:33:49.882581: step: 200/463, loss: 0.045980170369148254 2023-01-22 16:33:50.495669: step: 202/463, loss: 0.040873873978853226 2023-01-22 16:33:51.069273: step: 204/463, loss: 0.08316037058830261 2023-01-22 16:33:51.724602: step: 206/463, loss: 0.12086600810289383 2023-01-22 16:33:52.321313: step: 208/463, loss: 0.06602822244167328 2023-01-22 16:33:52.987006: step: 210/463, 
loss: 0.04403798282146454 2023-01-22 16:33:53.568525: step: 212/463, loss: 0.032360468059778214 2023-01-22 16:33:54.179763: step: 214/463, loss: 0.05648558586835861 2023-01-22 16:33:54.692911: step: 216/463, loss: 0.04868623614311218 2023-01-22 16:33:55.321073: step: 218/463, loss: 0.03459319844841957 2023-01-22 16:33:55.899118: step: 220/463, loss: 0.03120243363082409 2023-01-22 16:33:56.635927: step: 222/463, loss: 0.5293698310852051 2023-01-22 16:33:57.200930: step: 224/463, loss: 0.3355964422225952 2023-01-22 16:33:57.778924: step: 226/463, loss: 0.11252882331609726 2023-01-22 16:33:58.354877: step: 228/463, loss: 0.19139999151229858 2023-01-22 16:33:58.912229: step: 230/463, loss: 0.040742870420217514 2023-01-22 16:33:59.494501: step: 232/463, loss: 0.02730543166399002 2023-01-22 16:34:00.103614: step: 234/463, loss: 0.12655174732208252 2023-01-22 16:34:00.708884: step: 236/463, loss: 0.0627298653125763 2023-01-22 16:34:01.452294: step: 238/463, loss: 0.07463818043470383 2023-01-22 16:34:02.081778: step: 240/463, loss: 0.04659705236554146 2023-01-22 16:34:02.698940: step: 242/463, loss: 0.008892903104424477 2023-01-22 16:34:03.275632: step: 244/463, loss: 0.06608759611845016 2023-01-22 16:34:03.879612: step: 246/463, loss: 0.04788303002715111 2023-01-22 16:34:04.473123: step: 248/463, loss: 0.06421144306659698 2023-01-22 16:34:05.041011: step: 250/463, loss: 0.03391208127140999 2023-01-22 16:34:05.620674: step: 252/463, loss: 2.281822681427002 2023-01-22 16:34:06.199983: step: 254/463, loss: 0.08393685519695282 2023-01-22 16:34:06.822541: step: 256/463, loss: 0.04629409685730934 2023-01-22 16:34:07.456910: step: 258/463, loss: 0.03565816208720207 2023-01-22 16:34:08.065985: step: 260/463, loss: 0.11902303248643875 2023-01-22 16:34:08.639129: step: 262/463, loss: 0.1275409460067749 2023-01-22 16:34:09.215873: step: 264/463, loss: 0.009326746687293053 2023-01-22 16:34:09.819327: step: 266/463, loss: 0.0622209757566452 2023-01-22 16:34:10.470196: step: 268/463, 
loss: 0.09042775630950928 2023-01-22 16:34:11.119629: step: 270/463, loss: 0.06510508805513382 2023-01-22 16:34:11.787438: step: 272/463, loss: 0.06442617624998093 2023-01-22 16:34:12.459231: step: 274/463, loss: 0.36323487758636475 2023-01-22 16:34:13.118751: step: 276/463, loss: 0.0587281659245491 2023-01-22 16:34:13.845677: step: 278/463, loss: 0.12401116639375687 2023-01-22 16:34:14.455962: step: 280/463, loss: 0.07887143641710281 2023-01-22 16:34:15.081740: step: 282/463, loss: 0.08674923330545425 2023-01-22 16:34:15.670642: step: 284/463, loss: 0.0353887677192688 2023-01-22 16:34:16.285192: step: 286/463, loss: 0.0647701770067215 2023-01-22 16:34:16.921500: step: 288/463, loss: 0.12850961089134216 2023-01-22 16:34:17.499172: step: 290/463, loss: 0.04758147522807121 2023-01-22 16:34:18.085689: step: 292/463, loss: 0.09283986687660217 2023-01-22 16:34:18.763350: step: 294/463, loss: 0.03264891728758812 2023-01-22 16:34:19.381807: step: 296/463, loss: 0.22484800219535828 2023-01-22 16:34:19.977163: step: 298/463, loss: 0.03560889512300491 2023-01-22 16:34:20.577922: step: 300/463, loss: 0.11289200186729431 2023-01-22 16:34:21.197502: step: 302/463, loss: 0.20046424865722656 2023-01-22 16:34:21.795356: step: 304/463, loss: 0.22672253847122192 2023-01-22 16:34:22.397014: step: 306/463, loss: 0.03626544773578644 2023-01-22 16:34:23.023851: step: 308/463, loss: 0.2450268268585205 2023-01-22 16:34:23.657904: step: 310/463, loss: 0.026593254879117012 2023-01-22 16:34:24.281170: step: 312/463, loss: 0.019359473139047623 2023-01-22 16:34:24.897852: step: 314/463, loss: 0.06221301108598709 2023-01-22 16:34:25.620506: step: 316/463, loss: 0.11529038846492767 2023-01-22 16:34:26.241418: step: 318/463, loss: 0.06010087952017784 2023-01-22 16:34:26.834539: step: 320/463, loss: 0.10993721336126328 2023-01-22 16:34:27.468452: step: 322/463, loss: 0.2122594565153122 2023-01-22 16:34:28.057500: step: 324/463, loss: 0.03242126479744911 2023-01-22 16:34:28.705880: step: 326/463, 
loss: 0.00862402468919754 2023-01-22 16:34:29.316594: step: 328/463, loss: 0.05904664099216461 2023-01-22 16:34:29.880086: step: 330/463, loss: 0.006331127602607012 2023-01-22 16:34:30.490984: step: 332/463, loss: 0.04656258225440979 2023-01-22 16:34:31.103015: step: 334/463, loss: 0.553918719291687 2023-01-22 16:34:31.737077: step: 336/463, loss: 0.09463709592819214 2023-01-22 16:34:32.271858: step: 338/463, loss: 0.04666692018508911 2023-01-22 16:34:32.833682: step: 340/463, loss: 0.06889036297798157 2023-01-22 16:34:33.454763: step: 342/463, loss: 0.1246664747595787 2023-01-22 16:34:34.079800: step: 344/463, loss: 0.3991226553916931 2023-01-22 16:34:34.705187: step: 346/463, loss: 0.020198440179228783 2023-01-22 16:34:35.339328: step: 348/463, loss: 0.040735602378845215 2023-01-22 16:34:35.904399: step: 350/463, loss: 0.17584268748760223 2023-01-22 16:34:36.550488: step: 352/463, loss: 0.027179917320609093 2023-01-22 16:34:37.179486: step: 354/463, loss: 0.055612679570913315 2023-01-22 16:34:37.785967: step: 356/463, loss: 0.059638138860464096 2023-01-22 16:34:38.354451: step: 358/463, loss: 0.04757124185562134 2023-01-22 16:34:38.922771: step: 360/463, loss: 0.013483666814863682 2023-01-22 16:34:39.505685: step: 362/463, loss: 0.01090918853878975 2023-01-22 16:34:40.126094: step: 364/463, loss: 0.11282768100500107 2023-01-22 16:34:40.734111: step: 366/463, loss: 0.007290151435881853 2023-01-22 16:34:41.341748: step: 368/463, loss: 0.028429998084902763 2023-01-22 16:34:41.978438: step: 370/463, loss: 0.08194104582071304 2023-01-22 16:34:42.528670: step: 372/463, loss: 0.0998283103108406 2023-01-22 16:34:43.212199: step: 374/463, loss: 0.7557907104492188 2023-01-22 16:34:43.806996: step: 376/463, loss: 0.10063844174146652 2023-01-22 16:34:44.424233: step: 378/463, loss: 0.3180018961429596 2023-01-22 16:34:44.996079: step: 380/463, loss: 0.030974868685007095 2023-01-22 16:34:45.568033: step: 382/463, loss: 0.034933291375637054 2023-01-22 16:34:46.138853: step: 
384/463, loss: 0.027230076491832733 2023-01-22 16:34:46.670918: step: 386/463, loss: 0.3160860538482666 2023-01-22 16:34:47.263129: step: 388/463, loss: 0.04892370104789734 2023-01-22 16:34:47.865243: step: 390/463, loss: 0.06734148412942886 2023-01-22 16:34:48.419517: step: 392/463, loss: 0.006904246285557747 2023-01-22 16:34:49.085950: step: 394/463, loss: 0.04728604480624199 2023-01-22 16:34:49.724415: step: 396/463, loss: 0.3799084722995758 2023-01-22 16:34:50.346570: step: 398/463, loss: 0.061487410217523575 2023-01-22 16:34:51.036198: step: 400/463, loss: 0.4241001605987549 2023-01-22 16:34:51.699043: step: 402/463, loss: 0.10780290514230728 2023-01-22 16:34:52.274132: step: 404/463, loss: 0.028656506910920143 2023-01-22 16:34:52.882450: step: 406/463, loss: 0.1752186119556427 2023-01-22 16:34:53.481990: step: 408/463, loss: 0.06318075954914093 2023-01-22 16:34:54.149890: step: 410/463, loss: 0.10509040206670761 2023-01-22 16:34:54.780755: step: 412/463, loss: 0.09765133261680603 2023-01-22 16:34:55.470277: step: 414/463, loss: 0.037268057465553284 2023-01-22 16:34:56.034140: step: 416/463, loss: 0.026766357943415642 2023-01-22 16:34:56.651062: step: 418/463, loss: 0.17511078715324402 2023-01-22 16:34:57.321576: step: 420/463, loss: 0.0454324409365654 2023-01-22 16:34:57.943401: step: 422/463, loss: 0.3150566518306732 2023-01-22 16:34:58.516324: step: 424/463, loss: 0.11450080573558807 2023-01-22 16:34:59.078310: step: 426/463, loss: 0.14378651976585388 2023-01-22 16:34:59.699032: step: 428/463, loss: 0.09493079781532288 2023-01-22 16:35:00.360084: step: 430/463, loss: 0.22715190052986145 2023-01-22 16:35:01.034362: step: 432/463, loss: 0.04593156650662422 2023-01-22 16:35:01.642920: step: 434/463, loss: 0.08131331950426102 2023-01-22 16:35:02.286952: step: 436/463, loss: 0.091254822909832 2023-01-22 16:35:02.843048: step: 438/463, loss: 0.07093078643083572 2023-01-22 16:35:03.438291: step: 440/463, loss: 0.0286184661090374 2023-01-22 16:35:04.070772: step: 
442/463, loss: 0.14210593700408936 2023-01-22 16:35:04.646198: step: 444/463, loss: 0.049920275807380676 2023-01-22 16:35:05.256573: step: 446/463, loss: 0.06718917936086655 2023-01-22 16:35:05.871208: step: 448/463, loss: 0.04224003106355667 2023-01-22 16:35:06.456700: step: 450/463, loss: 0.07460243999958038 2023-01-22 16:35:07.076478: step: 452/463, loss: 0.07102516293525696 2023-01-22 16:35:07.690503: step: 454/463, loss: 0.1342989206314087 2023-01-22 16:35:08.316834: step: 456/463, loss: 0.394260436296463 2023-01-22 16:35:09.003045: step: 458/463, loss: 0.017591439187526703 2023-01-22 16:35:09.589967: step: 460/463, loss: 0.04001519829034805 2023-01-22 16:35:10.177737: step: 462/463, loss: 0.06774257123470306 2023-01-22 16:35:10.789338: step: 464/463, loss: 0.16617512702941895 2023-01-22 16:35:11.418182: step: 466/463, loss: 0.19578903913497925 2023-01-22 16:35:12.028608: step: 468/463, loss: 0.12032197415828705 2023-01-22 16:35:12.682955: step: 470/463, loss: 0.16864003241062164 2023-01-22 16:35:13.281831: step: 472/463, loss: 4.8138203620910645 2023-01-22 16:35:13.897252: step: 474/463, loss: 0.04587146267294884 2023-01-22 16:35:14.489906: step: 476/463, loss: 0.07958202809095383 2023-01-22 16:35:15.091600: step: 478/463, loss: 0.13121023774147034 2023-01-22 16:35:15.759167: step: 480/463, loss: 0.03464176878333092 2023-01-22 16:35:16.355747: step: 482/463, loss: 0.04372940957546234 2023-01-22 16:35:16.986270: step: 484/463, loss: 0.06094664707779884 2023-01-22 16:35:17.629058: step: 486/463, loss: 0.07570572197437286 2023-01-22 16:35:18.228520: step: 488/463, loss: 0.05128507688641548 2023-01-22 16:35:18.834073: step: 490/463, loss: 0.13223400712013245 2023-01-22 16:35:19.460603: step: 492/463, loss: 0.042311590164899826 2023-01-22 16:35:20.085604: step: 494/463, loss: 0.1278245896100998 2023-01-22 16:35:20.657412: step: 496/463, loss: 0.09354451298713684 2023-01-22 16:35:21.257924: step: 498/463, loss: 0.01030701957643032 2023-01-22 16:35:21.834986: step: 
500/463, loss: 0.020655181258916855 2023-01-22 16:35:22.464644: step: 502/463, loss: 0.05232895538210869 2023-01-22 16:35:23.103864: step: 504/463, loss: 0.027931835502386093 2023-01-22 16:35:23.671987: step: 506/463, loss: 0.03415033966302872 2023-01-22 16:35:24.329162: step: 508/463, loss: 0.1197151318192482 2023-01-22 16:35:24.963511: step: 510/463, loss: 0.07430978119373322 2023-01-22 16:35:25.598384: step: 512/463, loss: 0.01198669709265232 2023-01-22 16:35:26.214008: step: 514/463, loss: 0.016991952434182167 2023-01-22 16:35:26.817895: step: 516/463, loss: 0.031380075961351395 2023-01-22 16:35:27.578728: step: 518/463, loss: 0.020898595452308655 2023-01-22 16:35:28.190639: step: 520/463, loss: 0.02502376027405262 2023-01-22 16:35:28.845868: step: 522/463, loss: 0.0667397528886795 2023-01-22 16:35:29.412107: step: 524/463, loss: 0.03780106455087662 2023-01-22 16:35:29.984512: step: 526/463, loss: 0.025395944714546204 2023-01-22 16:35:30.562616: step: 528/463, loss: 0.018453773111104965 2023-01-22 16:35:31.154377: step: 530/463, loss: 0.011117308400571346 2023-01-22 16:35:31.831584: step: 532/463, loss: 0.041603848338127136 2023-01-22 16:35:32.447076: step: 534/463, loss: 0.025699466466903687 2023-01-22 16:35:33.057263: step: 536/463, loss: 0.04622204229235649 2023-01-22 16:35:33.680764: step: 538/463, loss: 0.06480484455823898 2023-01-22 16:35:34.232743: step: 540/463, loss: 0.2280200570821762 2023-01-22 16:35:34.839211: step: 542/463, loss: 0.022631408646702766 2023-01-22 16:35:35.456543: step: 544/463, loss: 0.3057030141353607 2023-01-22 16:35:36.089509: step: 546/463, loss: 0.04813225939869881 2023-01-22 16:35:36.700759: step: 548/463, loss: 0.06709921360015869 2023-01-22 16:35:37.345413: step: 550/463, loss: 0.2452041655778885 2023-01-22 16:35:37.976555: step: 552/463, loss: 0.11371764540672302 2023-01-22 16:35:38.582526: step: 554/463, loss: 0.08006143569946289 2023-01-22 16:35:39.176204: step: 556/463, loss: 0.09838936477899551 2023-01-22 
16:35:39.750561: step: 558/463, loss: 0.029758762568235397 2023-01-22 16:35:40.382671: step: 560/463, loss: 0.02741345763206482 2023-01-22 16:35:41.000967: step: 562/463, loss: 0.09545890986919403 2023-01-22 16:35:41.571036: step: 564/463, loss: 0.03813016787171364 2023-01-22 16:35:42.208339: step: 566/463, loss: 0.1972450464963913 2023-01-22 16:35:42.832556: step: 568/463, loss: 0.03822438046336174 2023-01-22 16:35:43.411695: step: 570/463, loss: 0.17337603867053986 2023-01-22 16:35:44.077211: step: 572/463, loss: 0.14963439106941223 2023-01-22 16:35:44.687507: step: 574/463, loss: 0.1748935878276825 2023-01-22 16:35:45.243671: step: 576/463, loss: 0.06768209487199783 2023-01-22 16:35:45.822419: step: 578/463, loss: 0.02161930501461029 2023-01-22 16:35:46.452751: step: 580/463, loss: 0.05919722840189934 2023-01-22 16:35:47.053206: step: 582/463, loss: 0.09013654291629791 2023-01-22 16:35:47.648636: step: 584/463, loss: 0.15383464097976685 2023-01-22 16:35:48.335086: step: 586/463, loss: 0.08814290165901184 2023-01-22 16:35:48.937442: step: 588/463, loss: 0.02814151532948017 2023-01-22 16:35:49.565156: step: 590/463, loss: 0.05853704735636711 2023-01-22 16:35:50.120792: step: 592/463, loss: 0.014149131253361702 2023-01-22 16:35:50.736523: step: 594/463, loss: 0.12199688702821732 2023-01-22 16:35:51.372116: step: 596/463, loss: 0.0885714665055275 2023-01-22 16:35:51.996666: step: 598/463, loss: 0.032831575721502304 2023-01-22 16:35:52.628618: step: 600/463, loss: 0.11337453871965408 2023-01-22 16:35:53.305758: step: 602/463, loss: 0.047468848526477814 2023-01-22 16:35:53.957278: step: 604/463, loss: 0.14818701148033142 2023-01-22 16:35:54.578458: step: 606/463, loss: 0.0007930466672405601 2023-01-22 16:35:55.250895: step: 608/463, loss: 0.09583106637001038 2023-01-22 16:35:55.883032: step: 610/463, loss: 0.2597254812717438 2023-01-22 16:35:56.493811: step: 612/463, loss: 0.01712767407298088 2023-01-22 16:35:57.085351: step: 614/463, loss: 0.06698231399059296 
2023-01-22 16:35:57.705135: step: 616/463, loss: 0.09642507880926132 2023-01-22 16:35:58.398644: step: 618/463, loss: 0.11368535459041595 2023-01-22 16:35:59.013576: step: 620/463, loss: 0.12876875698566437 2023-01-22 16:35:59.639300: step: 622/463, loss: 0.026741914451122284 2023-01-22 16:36:00.179285: step: 624/463, loss: 0.01247134804725647 2023-01-22 16:36:00.831496: step: 626/463, loss: 0.028769077733159065 2023-01-22 16:36:01.487576: step: 628/463, loss: 0.24587930738925934 2023-01-22 16:36:02.159269: step: 630/463, loss: 0.04750790074467659 2023-01-22 16:36:02.756459: step: 632/463, loss: 0.42218267917633057 2023-01-22 16:36:03.293130: step: 634/463, loss: 0.28009748458862305 2023-01-22 16:36:03.940905: step: 636/463, loss: 0.08587592095136642 2023-01-22 16:36:04.638934: step: 638/463, loss: 0.2546045482158661 2023-01-22 16:36:05.232508: step: 640/463, loss: 0.007059719879180193 2023-01-22 16:36:05.808548: step: 642/463, loss: 0.051808737218379974 2023-01-22 16:36:06.419243: step: 644/463, loss: 0.06937278807163239 2023-01-22 16:36:06.934429: step: 646/463, loss: 0.6275177597999573 2023-01-22 16:36:07.559052: step: 648/463, loss: 0.0186784528195858 2023-01-22 16:36:08.177259: step: 650/463, loss: 0.04442289099097252 2023-01-22 16:36:08.768705: step: 652/463, loss: 0.6330482363700867 2023-01-22 16:36:09.381367: step: 654/463, loss: 0.46276700496673584 2023-01-22 16:36:09.959957: step: 656/463, loss: 4.516740322113037 2023-01-22 16:36:10.566922: step: 658/463, loss: 0.15224072337150574 2023-01-22 16:36:11.121116: step: 660/463, loss: 0.32750609517097473 2023-01-22 16:36:11.755464: step: 662/463, loss: 0.2058594524860382 2023-01-22 16:36:12.339353: step: 664/463, loss: 0.7287889122962952 2023-01-22 16:36:12.971840: step: 666/463, loss: 0.047473181039094925 2023-01-22 16:36:13.603923: step: 668/463, loss: 0.09878908097743988 2023-01-22 16:36:14.232532: step: 670/463, loss: 0.0724651888012886 2023-01-22 16:36:14.899781: step: 672/463, loss: 0.2794138491153717 
2023-01-22 16:36:15.607681: step: 674/463, loss: 0.0823020190000534 2023-01-22 16:36:16.299766: step: 676/463, loss: 0.05153350532054901 2023-01-22 16:36:16.909016: step: 678/463, loss: 0.05466584116220474 2023-01-22 16:36:17.490920: step: 680/463, loss: 0.046845342963933945 2023-01-22 16:36:18.068040: step: 682/463, loss: 0.054225124418735504 2023-01-22 16:36:18.677994: step: 684/463, loss: 0.049702953547239304 2023-01-22 16:36:19.338237: step: 686/463, loss: 0.07614841312170029 2023-01-22 16:36:19.976742: step: 688/463, loss: 0.0374772809445858 2023-01-22 16:36:20.643207: step: 690/463, loss: 0.10767879337072372 2023-01-22 16:36:21.254969: step: 692/463, loss: 0.045014429837465286 2023-01-22 16:36:21.912281: step: 694/463, loss: 0.07836220413446426 2023-01-22 16:36:22.505720: step: 696/463, loss: 0.04979868605732918 2023-01-22 16:36:23.083617: step: 698/463, loss: 0.10276749730110168 2023-01-22 16:36:23.685084: step: 700/463, loss: 0.04079893231391907 2023-01-22 16:36:24.259877: step: 702/463, loss: 0.014544866047799587 2023-01-22 16:36:24.835937: step: 704/463, loss: 0.14706513285636902 2023-01-22 16:36:25.420979: step: 706/463, loss: 0.067708820104599 2023-01-22 16:36:26.028230: step: 708/463, loss: 0.16522085666656494 2023-01-22 16:36:26.597201: step: 710/463, loss: 0.06716971099376678 2023-01-22 16:36:27.205738: step: 712/463, loss: 0.046705760061740875 2023-01-22 16:36:27.877645: step: 714/463, loss: 0.059051513671875 2023-01-22 16:36:28.438060: step: 716/463, loss: 0.026394644752144814 2023-01-22 16:36:29.049255: step: 718/463, loss: 0.05047113075852394 2023-01-22 16:36:29.691911: step: 720/463, loss: 0.019043393433094025 2023-01-22 16:36:30.319270: step: 722/463, loss: 0.043644823133945465 2023-01-22 16:36:30.922550: step: 724/463, loss: 0.039119768887758255 2023-01-22 16:36:31.546277: step: 726/463, loss: 0.08412140607833862 2023-01-22 16:36:32.230331: step: 728/463, loss: 0.053985822945833206 2023-01-22 16:36:32.840368: step: 730/463, loss: 
0.1071721538901329 2023-01-22 16:36:33.387511: step: 732/463, loss: 0.037748243659734726 2023-01-22 16:36:33.992505: step: 734/463, loss: 0.027196601033210754 2023-01-22 16:36:34.610719: step: 736/463, loss: 0.03326745331287384 2023-01-22 16:36:35.218541: step: 738/463, loss: 0.05940090864896774 2023-01-22 16:36:35.825443: step: 740/463, loss: 0.2639778256416321 2023-01-22 16:36:36.434537: step: 742/463, loss: 0.07364114373922348 2023-01-22 16:36:36.996215: step: 744/463, loss: 0.03910030797123909 2023-01-22 16:36:37.558593: step: 746/463, loss: 0.10638695955276489 2023-01-22 16:36:38.114177: step: 748/463, loss: 0.04186413809657097 2023-01-22 16:36:38.664671: step: 750/463, loss: 0.10281474888324738 2023-01-22 16:36:39.266816: step: 752/463, loss: 0.0887092724442482 2023-01-22 16:36:39.887563: step: 754/463, loss: 0.45409339666366577 2023-01-22 16:36:40.461624: step: 756/463, loss: 0.015463640913367271 2023-01-22 16:36:41.090395: step: 758/463, loss: 0.06177464872598648 2023-01-22 16:36:41.729624: step: 760/463, loss: 0.23056206107139587 2023-01-22 16:36:42.392051: step: 762/463, loss: 0.07336004078388214 2023-01-22 16:36:42.977734: step: 764/463, loss: 0.03708691522479057 2023-01-22 16:36:43.655720: step: 766/463, loss: 0.0796007439494133 2023-01-22 16:36:44.324869: step: 768/463, loss: 0.013684620149433613 2023-01-22 16:36:44.937371: step: 770/463, loss: 0.07419372349977493 2023-01-22 16:36:45.660729: step: 772/463, loss: 0.1145038902759552 2023-01-22 16:36:46.313820: step: 774/463, loss: 0.2707780599594116 2023-01-22 16:36:46.918856: step: 776/463, loss: 0.10805130004882812 2023-01-22 16:36:47.604549: step: 778/463, loss: 0.17458176612854004 2023-01-22 16:36:48.228106: step: 780/463, loss: 0.05540774390101433 2023-01-22 16:36:48.911412: step: 782/463, loss: 0.0006218242342583835 2023-01-22 16:36:49.509977: step: 784/463, loss: 0.09399521350860596 2023-01-22 16:36:50.120483: step: 786/463, loss: 0.06010764464735985 2023-01-22 16:36:50.652895: step: 788/463, 
loss: 0.30939948558807373 2023-01-22 16:36:51.228291: step: 790/463, loss: 0.07681126147508621 2023-01-22 16:36:51.940366: step: 792/463, loss: 0.23799282312393188 2023-01-22 16:36:52.556426: step: 794/463, loss: 0.007040924858301878 2023-01-22 16:36:53.184931: step: 796/463, loss: 0.07794934511184692 2023-01-22 16:36:53.897143: step: 798/463, loss: 0.01628902554512024 2023-01-22 16:36:54.457555: step: 800/463, loss: 0.07221756130456924 2023-01-22 16:36:55.072684: step: 802/463, loss: 0.01632831245660782 2023-01-22 16:36:55.667207: step: 804/463, loss: 0.05923178046941757 2023-01-22 16:36:56.261176: step: 806/463, loss: 0.058539971709251404 2023-01-22 16:36:56.850471: step: 808/463, loss: 0.07161515951156616 2023-01-22 16:36:57.408752: step: 810/463, loss: 0.43876898288726807 2023-01-22 16:36:57.990130: step: 812/463, loss: 0.025683172047138214 2023-01-22 16:36:58.577025: step: 814/463, loss: 0.1485605686903 2023-01-22 16:36:59.246771: step: 816/463, loss: 0.2211214005947113 2023-01-22 16:36:59.882020: step: 818/463, loss: 0.06159538775682449 2023-01-22 16:37:00.491965: step: 820/463, loss: 0.2801964581012726 2023-01-22 16:37:01.089833: step: 822/463, loss: 0.039062581956386566 2023-01-22 16:37:01.618019: step: 824/463, loss: 0.03304717317223549 2023-01-22 16:37:02.203044: step: 826/463, loss: 0.17000944912433624 2023-01-22 16:37:02.804451: step: 828/463, loss: 0.05852793529629707 2023-01-22 16:37:03.472065: step: 830/463, loss: 0.09781431406736374 2023-01-22 16:37:04.048953: step: 832/463, loss: 0.023394500836730003 2023-01-22 16:37:04.655969: step: 834/463, loss: 0.1424710750579834 2023-01-22 16:37:05.292410: step: 836/463, loss: 0.04893549531698227 2023-01-22 16:37:06.002658: step: 838/463, loss: 0.05834334343671799 2023-01-22 16:37:06.617393: step: 840/463, loss: 0.1589689403772354 2023-01-22 16:37:07.181239: step: 842/463, loss: 0.13909125328063965 2023-01-22 16:37:07.768364: step: 844/463, loss: 0.04796094819903374 2023-01-22 16:37:08.380178: step: 846/463, 
loss: 0.031300537288188934 2023-01-22 16:37:08.985418: step: 848/463, loss: 0.11457240581512451 2023-01-22 16:37:09.604580: step: 850/463, loss: 0.5891149044036865 2023-01-22 16:37:10.229922: step: 852/463, loss: 0.05534527823328972 2023-01-22 16:37:10.866676: step: 854/463, loss: 0.07711130380630493 2023-01-22 16:37:11.462117: step: 856/463, loss: 0.03179836645722389 2023-01-22 16:37:12.117998: step: 858/463, loss: 0.12852661311626434 2023-01-22 16:37:12.735912: step: 860/463, loss: 0.11725404858589172 2023-01-22 16:37:13.407423: step: 862/463, loss: 0.014022591523826122 2023-01-22 16:37:14.130646: step: 864/463, loss: 0.04268370941281319 2023-01-22 16:37:14.698094: step: 866/463, loss: 0.14850124716758728 2023-01-22 16:37:15.273670: step: 868/463, loss: 0.010326889343559742 2023-01-22 16:37:15.870353: step: 870/463, loss: 0.052374038845300674 2023-01-22 16:37:16.546846: step: 872/463, loss: 0.08235851675271988 2023-01-22 16:37:17.162642: step: 874/463, loss: 0.0050810035318136215 2023-01-22 16:37:17.724041: step: 876/463, loss: 0.052457962185144424 2023-01-22 16:37:18.333702: step: 878/463, loss: 0.12302198261022568 2023-01-22 16:37:18.920674: step: 880/463, loss: 0.03678077831864357 2023-01-22 16:37:19.551873: step: 882/463, loss: 0.09542186558246613 2023-01-22 16:37:20.170804: step: 884/463, loss: 0.8466272950172424 2023-01-22 16:37:20.802186: step: 886/463, loss: 0.04436057433485985 2023-01-22 16:37:21.399674: step: 888/463, loss: 0.11901895701885223 2023-01-22 16:37:22.015578: step: 890/463, loss: 0.056356605142354965 2023-01-22 16:37:22.635380: step: 892/463, loss: 0.007071968633681536 2023-01-22 16:37:23.355663: step: 894/463, loss: 0.9019984006881714 2023-01-22 16:37:23.982436: step: 896/463, loss: 0.05535057187080383 2023-01-22 16:37:24.602749: step: 898/463, loss: 0.05215047299861908 2023-01-22 16:37:25.210115: step: 900/463, loss: 3.035036563873291 2023-01-22 16:37:25.839891: step: 902/463, loss: 0.07349798828363419 2023-01-22 16:37:26.496776: step: 
904/463, loss: 0.029679620638489723 2023-01-22 16:37:27.098234: step: 906/463, loss: 0.05383647233247757 2023-01-22 16:37:27.689853: step: 908/463, loss: 0.05002675577998161 2023-01-22 16:37:28.292071: step: 910/463, loss: 0.13471947610378265 2023-01-22 16:37:28.869369: step: 912/463, loss: 0.309780478477478 2023-01-22 16:37:29.540987: step: 914/463, loss: 0.035247523337602615 2023-01-22 16:37:30.131589: step: 916/463, loss: 0.883965790271759 2023-01-22 16:37:30.790623: step: 918/463, loss: 0.02262360416352749 2023-01-22 16:37:31.402965: step: 920/463, loss: 0.026997357606887817 2023-01-22 16:37:32.006650: step: 922/463, loss: 0.040129438042640686 2023-01-22 16:37:32.648459: step: 924/463, loss: 0.054094348102808 2023-01-22 16:37:33.245829: step: 926/463, loss: 0.07033561915159225 ================================================== Loss: 0.131 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27950921379164684, 'r': 0.34050268549191137, 'f1': 0.3070058430354787}, 'combined': 0.2262148317103527, 'epoch': 17} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34796317690927714, 'r': 0.3264240439009435, 'f1': 0.33684964314384364}, 'combined': 0.2369796484429051, 'epoch': 17} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2910401858939311, 'r': 0.3424002186987425, 'f1': 0.31463803880424984}, 'combined': 0.2318385549083946, 'epoch': 17} 
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36911606832749205, 'r': 0.31307226737724453, 'f1': 0.33879208537707484}, 'combined': 0.24054238061772312, 'epoch': 17} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22549019607843135, 'r': 0.43809523809523804, 'f1': 0.2977346278317152}, 'combined': 0.19848975188781015, 'epoch': 17} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32142857142857145, 'r': 0.23275862068965517, 'f1': 0.26999999999999996}, 'combined': 0.17999999999999997, 'epoch': 17} New best korean model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 
0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29419352931628273, 'r': 0.339969372587507, 'f1': 0.3154293298479158}, 'combined': 0.23242161146688528, 'epoch': 15} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.343089588698682, 'r': 0.29459001333289975, 'f1': 0.31699545096666953}, 'combined': 0.22506677018633536, 'epoch': 15} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.484375, 'r': 0.2672413793103448, 'f1': 0.34444444444444444}, 'combined': 0.22962962962962963, 'epoch': 15} ****************************** Epoch: 18 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 16:40:17.136566: step: 2/463, loss: 0.05229368060827255 2023-01-22 16:40:17.752481: step: 4/463, loss: 0.033324867486953735 2023-01-22 16:40:18.372778: step: 6/463, loss: 0.043083835393190384 2023-01-22 16:40:19.039529: step: 8/463, loss: 0.074899822473526 2023-01-22 16:40:19.694399: step: 10/463, loss: 0.016204925253987312 2023-01-22 16:40:20.271148: step: 12/463, loss: 0.02793966978788376 2023-01-22 16:40:20.808373: step: 14/463, loss: 0.010157816112041473 2023-01-22 16:40:21.492892: step: 16/463, loss: 0.01744117960333824 2023-01-22 16:40:22.123396: step: 18/463, loss: 0.019675089046359062 2023-01-22 16:40:22.720269: step: 20/463, loss: 0.2395535260438919 2023-01-22 16:40:23.300856: step: 22/463, loss: 0.006468599196523428 2023-01-22 
16:40:23.893233: step: 24/463, loss: 0.03579816222190857 2023-01-22 16:40:24.462371: step: 26/463, loss: 0.09393144398927689 2023-01-22 16:40:25.085112: step: 28/463, loss: 0.020029611885547638 2023-01-22 16:40:25.641736: step: 30/463, loss: 0.02872510440647602 2023-01-22 16:40:26.243821: step: 32/463, loss: 1.0470025539398193 2023-01-22 16:40:26.889232: step: 34/463, loss: 0.05407264828681946 2023-01-22 16:40:27.505193: step: 36/463, loss: 0.054576046764850616 2023-01-22 16:40:28.148418: step: 38/463, loss: 0.08602189272642136 2023-01-22 16:40:28.781350: step: 40/463, loss: 0.03444186970591545 2023-01-22 16:40:29.422949: step: 42/463, loss: 0.021027792245149612 2023-01-22 16:40:29.980043: step: 44/463, loss: 0.026481106877326965 2023-01-22 16:40:30.564947: step: 46/463, loss: 0.054727714508771896 2023-01-22 16:40:31.098231: step: 48/463, loss: 0.01613656058907509 2023-01-22 16:40:31.777190: step: 50/463, loss: 0.06298161298036575 2023-01-22 16:40:32.401155: step: 52/463, loss: 0.1881769746541977 2023-01-22 16:40:33.025369: step: 54/463, loss: 0.029652193188667297 2023-01-22 16:40:33.616259: step: 56/463, loss: 0.1146392971277237 2023-01-22 16:40:34.212430: step: 58/463, loss: 0.004848239943385124 2023-01-22 16:40:34.827612: step: 60/463, loss: 0.011791123077273369 2023-01-22 16:40:35.423417: step: 62/463, loss: 0.04005800187587738 2023-01-22 16:40:35.966505: step: 64/463, loss: 0.014112237840890884 2023-01-22 16:40:36.553321: step: 66/463, loss: 0.07765521109104156 2023-01-22 16:40:37.175227: step: 68/463, loss: 0.16144885122776031 2023-01-22 16:40:37.787053: step: 70/463, loss: 0.03863349184393883 2023-01-22 16:40:38.379233: step: 72/463, loss: 0.0484267994761467 2023-01-22 16:40:38.990158: step: 74/463, loss: 0.009348251856863499 2023-01-22 16:40:39.674380: step: 76/463, loss: 0.025486743077635765 2023-01-22 16:40:40.287338: step: 78/463, loss: 0.0223931223154068 2023-01-22 16:40:40.884448: step: 80/463, loss: 0.038978688418865204 2023-01-22 16:40:41.520433: 
step: 82/463, loss: 0.10097823292016983 2023-01-22 16:40:42.181465: step: 84/463, loss: 0.03650551661849022 2023-01-22 16:40:42.774305: step: 86/463, loss: 0.0343787707388401 2023-01-22 16:40:43.453797: step: 88/463, loss: 0.04078145697712898 2023-01-22 16:40:44.053683: step: 90/463, loss: 0.08851227164268494 2023-01-22 16:40:44.695805: step: 92/463, loss: 0.06457871198654175 2023-01-22 16:40:45.286173: step: 94/463, loss: 0.672856867313385 2023-01-22 16:40:45.880770: step: 96/463, loss: 0.04567726328969002 2023-01-22 16:40:46.526790: step: 98/463, loss: 0.1546766757965088 2023-01-22 16:40:47.203400: step: 100/463, loss: 0.023200906813144684 2023-01-22 16:40:47.729861: step: 102/463, loss: 0.0462331660091877 2023-01-22 16:40:48.322601: step: 104/463, loss: 0.05648176744580269 2023-01-22 16:40:48.938416: step: 106/463, loss: 0.00951012596487999 2023-01-22 16:40:49.583433: step: 108/463, loss: 0.025032036006450653 2023-01-22 16:40:50.151689: step: 110/463, loss: 0.09225096553564072 2023-01-22 16:40:50.775931: step: 112/463, loss: 0.04359101131558418 2023-01-22 16:40:51.349730: step: 114/463, loss: 0.08590131253004074 2023-01-22 16:40:51.990806: step: 116/463, loss: 0.05241032689809799 2023-01-22 16:40:52.603977: step: 118/463, loss: 0.5824062824249268 2023-01-22 16:40:53.224666: step: 120/463, loss: 0.062495261430740356 2023-01-22 16:40:53.920400: step: 122/463, loss: 0.7533474564552307 2023-01-22 16:40:54.461840: step: 124/463, loss: 0.06695589423179626 2023-01-22 16:40:55.103879: step: 126/463, loss: 0.05757410079240799 2023-01-22 16:40:55.740465: step: 128/463, loss: 0.009067753329873085 2023-01-22 16:40:56.301841: step: 130/463, loss: 0.10525200515985489 2023-01-22 16:40:56.919299: step: 132/463, loss: 0.029943600296974182 2023-01-22 16:40:57.523983: step: 134/463, loss: 0.017585385590791702 2023-01-22 16:40:58.080390: step: 136/463, loss: 0.08338368684053421 2023-01-22 16:40:58.656390: step: 138/463, loss: 0.07059365510940552 2023-01-22 16:40:59.269205: step: 
140/463, loss: 0.015042508020997047 2023-01-22 16:40:59.881143: step: 142/463, loss: 0.03258882090449333 2023-01-22 16:41:00.464709: step: 144/463, loss: 0.09933247417211533 2023-01-22 16:41:01.051743: step: 146/463, loss: 0.7414109706878662 2023-01-22 16:41:01.712961: step: 148/463, loss: 0.11750238388776779 2023-01-22 16:41:02.343220: step: 150/463, loss: 0.10198447108268738 2023-01-22 16:41:02.917739: step: 152/463, loss: 0.08951544761657715 2023-01-22 16:41:03.539859: step: 154/463, loss: 0.011343862861394882 2023-01-22 16:41:04.164568: step: 156/463, loss: 0.13887447118759155 2023-01-22 16:41:04.783480: step: 158/463, loss: 0.1324821412563324 2023-01-22 16:41:05.417914: step: 160/463, loss: 0.17779143154621124 2023-01-22 16:41:06.016841: step: 162/463, loss: 0.11114363372325897 2023-01-22 16:41:06.591807: step: 164/463, loss: 0.03589291870594025 2023-01-22 16:41:07.181050: step: 166/463, loss: 0.02523355931043625 2023-01-22 16:41:07.730931: step: 168/463, loss: 0.08646336942911148 2023-01-22 16:41:08.344080: step: 170/463, loss: 0.038340721279382706 2023-01-22 16:41:08.984241: step: 172/463, loss: 0.20569635927677155 2023-01-22 16:41:09.661171: step: 174/463, loss: 0.1924046277999878 2023-01-22 16:41:10.240624: step: 176/463, loss: 0.004122932441532612 2023-01-22 16:41:10.760483: step: 178/463, loss: 0.1464463174343109 2023-01-22 16:41:11.310495: step: 180/463, loss: 0.06310077011585236 2023-01-22 16:41:11.911493: step: 182/463, loss: 0.0304847601801157 2023-01-22 16:41:12.493960: step: 184/463, loss: 0.05080313980579376 2023-01-22 16:41:13.122289: step: 186/463, loss: 0.1242276206612587 2023-01-22 16:41:13.701163: step: 188/463, loss: 0.06924159824848175 2023-01-22 16:41:14.336032: step: 190/463, loss: 0.41993436217308044 2023-01-22 16:41:14.964041: step: 192/463, loss: 0.03753301501274109 2023-01-22 16:41:15.567429: step: 194/463, loss: 0.02593453787267208 2023-01-22 16:41:16.188756: step: 196/463, loss: 0.010376195423305035 2023-01-22 16:41:16.751377: step: 
198/463, loss: 0.07326087355613708 2023-01-22 16:41:17.396665: step: 200/463, loss: 0.03795021399855614 2023-01-22 16:41:17.970380: step: 202/463, loss: 0.0005985997850075364 2023-01-22 16:41:18.582798: step: 204/463, loss: 0.08164554089307785 2023-01-22 16:41:19.191489: step: 206/463, loss: 0.09199836105108261 2023-01-22 16:41:19.808899: step: 208/463, loss: 0.050378210842609406 2023-01-22 16:41:20.498810: step: 210/463, loss: 0.12984077632427216 2023-01-22 16:41:21.125877: step: 212/463, loss: 0.017921915277838707 2023-01-22 16:41:21.731535: step: 214/463, loss: 0.06992635130882263 2023-01-22 16:41:22.402677: step: 216/463, loss: 0.041133102029561996 2023-01-22 16:41:23.004460: step: 218/463, loss: 0.1499326527118683 2023-01-22 16:41:23.661764: step: 220/463, loss: 0.19944682717323303 2023-01-22 16:41:24.193064: step: 222/463, loss: 0.036363475024700165 2023-01-22 16:41:24.783208: step: 224/463, loss: 0.1038120910525322 2023-01-22 16:41:25.408186: step: 226/463, loss: 0.11858183145523071 2023-01-22 16:41:26.011408: step: 228/463, loss: 0.08477521687746048 2023-01-22 16:41:26.587932: step: 230/463, loss: 0.07071194052696228 2023-01-22 16:41:27.206265: step: 232/463, loss: 0.0489998459815979 2023-01-22 16:41:27.792981: step: 234/463, loss: 0.16688716411590576 2023-01-22 16:41:28.297871: step: 236/463, loss: 0.02395627647638321 2023-01-22 16:41:28.906431: step: 238/463, loss: 0.014148363843560219 2023-01-22 16:41:29.547267: step: 240/463, loss: 0.06006031110882759 2023-01-22 16:41:30.194553: step: 242/463, loss: 0.1702716201543808 2023-01-22 16:41:30.833127: step: 244/463, loss: 0.08379409462213516 2023-01-22 16:41:31.486934: step: 246/463, loss: 0.07828385382890701 2023-01-22 16:41:32.069230: step: 248/463, loss: 0.009192584082484245 2023-01-22 16:41:32.732025: step: 250/463, loss: 0.5033050775527954 2023-01-22 16:41:33.286276: step: 252/463, loss: 0.025383301079273224 2023-01-22 16:41:33.898430: step: 254/463, loss: 0.03546523302793503 2023-01-22 16:41:34.465512: 
step: 256/463, loss: 0.05065888911485672 2023-01-22 16:41:35.080074: step: 258/463, loss: 0.07243824005126953 2023-01-22 16:41:35.694856: step: 260/463, loss: 0.021998325362801552 2023-01-22 16:41:36.417693: step: 262/463, loss: 0.08135838806629181 2023-01-22 16:41:37.035201: step: 264/463, loss: 0.015787217766046524 2023-01-22 16:41:37.652701: step: 266/463, loss: 0.027945738285779953 2023-01-22 16:41:38.238242: step: 268/463, loss: 0.08336099982261658 2023-01-22 16:41:38.868738: step: 270/463, loss: 0.07006078958511353 2023-01-22 16:41:39.496324: step: 272/463, loss: 0.11573395133018494 2023-01-22 16:41:40.155809: step: 274/463, loss: 0.09761707484722137 2023-01-22 16:41:40.805653: step: 276/463, loss: 0.08779638260602951 2023-01-22 16:41:41.340695: step: 278/463, loss: 0.002975096693262458 2023-01-22 16:41:41.931680: step: 280/463, loss: 0.017354309558868408 2023-01-22 16:41:42.511531: step: 282/463, loss: 0.007142297923564911 2023-01-22 16:41:43.137406: step: 284/463, loss: 0.44680845737457275 2023-01-22 16:41:43.823545: step: 286/463, loss: 0.032051146030426025 2023-01-22 16:41:44.443563: step: 288/463, loss: 0.08525965362787247 2023-01-22 16:41:44.987561: step: 290/463, loss: 0.006251983344554901 2023-01-22 16:41:45.573205: step: 292/463, loss: 0.016534797847270966 2023-01-22 16:41:46.188999: step: 294/463, loss: 0.5868385434150696 2023-01-22 16:41:46.776171: step: 296/463, loss: 0.03886045888066292 2023-01-22 16:41:47.340395: step: 298/463, loss: 0.015437356196343899 2023-01-22 16:41:47.980690: step: 300/463, loss: 0.7030611038208008 2023-01-22 16:41:48.573211: step: 302/463, loss: 0.03936284780502319 2023-01-22 16:41:49.224790: step: 304/463, loss: 0.16305777430534363 2023-01-22 16:41:49.872737: step: 306/463, loss: 0.04054656997323036 2023-01-22 16:41:50.455499: step: 308/463, loss: 0.10261719673871994 2023-01-22 16:41:51.106582: step: 310/463, loss: 0.2812890410423279 2023-01-22 16:41:51.722192: step: 312/463, loss: 0.019766660407185555 2023-01-22 
16:41:52.283045: step: 314/463, loss: 0.029181813821196556 2023-01-22 16:41:52.900949: step: 316/463, loss: 0.08779565244913101 2023-01-22 16:41:53.483145: step: 318/463, loss: 0.08626317977905273 2023-01-22 16:41:54.157125: step: 320/463, loss: 0.04404470697045326 2023-01-22 16:41:54.879184: step: 322/463, loss: 0.07130976766347885 2023-01-22 16:41:55.465860: step: 324/463, loss: 0.019988756626844406 2023-01-22 16:41:56.095842: step: 326/463, loss: 0.04724951833486557 2023-01-22 16:41:56.659359: step: 328/463, loss: 0.09304466098546982 2023-01-22 16:41:57.285279: step: 330/463, loss: 0.08092371374368668 2023-01-22 16:41:57.854716: step: 332/463, loss: 0.007558646611869335 2023-01-22 16:41:58.544040: step: 334/463, loss: 0.10603643208742142 2023-01-22 16:41:59.152733: step: 336/463, loss: 0.03256450220942497 2023-01-22 16:41:59.832832: step: 338/463, loss: 0.14850786328315735 2023-01-22 16:42:00.476139: step: 340/463, loss: 0.029127048328518867 2023-01-22 16:42:01.063564: step: 342/463, loss: 0.05594366788864136 2023-01-22 16:42:01.652877: step: 344/463, loss: 0.09174425154924393 2023-01-22 16:42:02.253609: step: 346/463, loss: 0.25730305910110474 2023-01-22 16:42:02.950213: step: 348/463, loss: 0.08676300197839737 2023-01-22 16:42:03.560129: step: 350/463, loss: 0.0618809312582016 2023-01-22 16:42:04.224979: step: 352/463, loss: 0.06170523166656494 2023-01-22 16:42:04.788474: step: 354/463, loss: 0.40960395336151123 2023-01-22 16:42:05.415073: step: 356/463, loss: 0.0271652452647686 2023-01-22 16:42:06.022609: step: 358/463, loss: 0.060409028083086014 2023-01-22 16:42:06.638182: step: 360/463, loss: 0.0964808240532875 2023-01-22 16:42:07.295747: step: 362/463, loss: 0.07571766525506973 2023-01-22 16:42:07.903847: step: 364/463, loss: 0.0749310702085495 2023-01-22 16:42:08.522059: step: 366/463, loss: 0.8307719230651855 2023-01-22 16:42:09.159583: step: 368/463, loss: 0.05406997725367546 2023-01-22 16:42:09.830115: step: 370/463, loss: 0.03025028109550476 
2023-01-22 16:42:10.472455: step: 372/463, loss: 0.15029756724834442 2023-01-22 16:42:11.170646: step: 374/463, loss: 0.029874512925744057 2023-01-22 16:42:11.760722: step: 376/463, loss: 0.05143111199140549 2023-01-22 16:42:12.436032: step: 378/463, loss: 0.08810188621282578 2023-01-22 16:42:13.036373: step: 380/463, loss: 0.05552966892719269 2023-01-22 16:42:13.642705: step: 382/463, loss: 0.329473614692688 2023-01-22 16:42:14.235125: step: 384/463, loss: 0.05316713824868202 2023-01-22 16:42:14.855319: step: 386/463, loss: 0.03988700732588768 2023-01-22 16:42:15.494532: step: 388/463, loss: 0.07106542587280273 2023-01-22 16:42:16.170399: step: 390/463, loss: 0.0476866289973259 2023-01-22 16:42:16.852801: step: 392/463, loss: 0.006597617175430059 2023-01-22 16:42:17.415505: step: 394/463, loss: 0.018450040370225906 2023-01-22 16:42:18.011740: step: 396/463, loss: 0.0261400006711483 2023-01-22 16:42:18.604675: step: 398/463, loss: 0.016818134114146233 2023-01-22 16:42:19.223544: step: 400/463, loss: 0.07379306107759476 2023-01-22 16:42:19.827386: step: 402/463, loss: 0.03203646466135979 2023-01-22 16:42:20.472487: step: 404/463, loss: 0.03810209035873413 2023-01-22 16:42:21.168856: step: 406/463, loss: 0.11573769897222519 2023-01-22 16:42:21.771773: step: 408/463, loss: 0.01616506092250347 2023-01-22 16:42:22.377874: step: 410/463, loss: 0.09869438409805298 2023-01-22 16:42:22.998941: step: 412/463, loss: 0.1381227970123291 2023-01-22 16:42:23.611362: step: 414/463, loss: 0.48111197352409363 2023-01-22 16:42:24.148808: step: 416/463, loss: 0.008440607227385044 2023-01-22 16:42:24.715405: step: 418/463, loss: 0.026592131704092026 2023-01-22 16:42:25.341112: step: 420/463, loss: 0.05442299321293831 2023-01-22 16:42:26.017386: step: 422/463, loss: 0.03047971986234188 2023-01-22 16:42:26.614164: step: 424/463, loss: 0.0284715723246336 2023-01-22 16:42:27.212132: step: 426/463, loss: 0.03859110176563263 2023-01-22 16:42:27.895469: step: 428/463, loss: 
0.06802213191986084 2023-01-22 16:42:28.511487: step: 430/463, loss: 0.033452581614255905 2023-01-22 16:42:29.095988: step: 432/463, loss: 0.11053768545389175 2023-01-22 16:42:29.754947: step: 434/463, loss: 0.03467222675681114 2023-01-22 16:42:30.398893: step: 436/463, loss: 0.18572543561458588 2023-01-22 16:42:31.016897: step: 438/463, loss: 0.0627322643995285 2023-01-22 16:42:31.674789: step: 440/463, loss: 0.04957449063658714 2023-01-22 16:42:32.324594: step: 442/463, loss: 0.030460555106401443 2023-01-22 16:42:33.016054: step: 444/463, loss: 0.03470031917095184 2023-01-22 16:42:33.588242: step: 446/463, loss: 0.04225575178861618 2023-01-22 16:42:34.234454: step: 448/463, loss: 0.10568469762802124 2023-01-22 16:42:34.917887: step: 450/463, loss: 0.34340861439704895 2023-01-22 16:42:35.566845: step: 452/463, loss: 0.3041839599609375 2023-01-22 16:42:36.177162: step: 454/463, loss: 0.07217948883771896 2023-01-22 16:42:36.794462: step: 456/463, loss: 0.6633732914924622 2023-01-22 16:42:37.415735: step: 458/463, loss: 0.15855590999126434 2023-01-22 16:42:38.070469: step: 460/463, loss: 0.05590202659368515 2023-01-22 16:42:38.715256: step: 462/463, loss: 0.040579404681921005 2023-01-22 16:42:39.383836: step: 464/463, loss: 0.02238362468779087 2023-01-22 16:42:40.030274: step: 466/463, loss: 0.16294565796852112 2023-01-22 16:42:40.655799: step: 468/463, loss: 0.03643759712576866 2023-01-22 16:42:41.300453: step: 470/463, loss: 0.02327682450413704 2023-01-22 16:42:42.079427: step: 472/463, loss: 0.1161649227142334 2023-01-22 16:42:42.711642: step: 474/463, loss: 0.014084688387811184 2023-01-22 16:42:43.331019: step: 476/463, loss: 0.09742389619350433 2023-01-22 16:42:43.997383: step: 478/463, loss: 0.07261428982019424 2023-01-22 16:42:44.632782: step: 480/463, loss: 0.0542832650244236 2023-01-22 16:42:45.294010: step: 482/463, loss: 0.02767026610672474 2023-01-22 16:42:45.895957: step: 484/463, loss: 0.009875942952930927 2023-01-22 16:42:46.525563: step: 486/463, 
loss: 0.023435885086655617 2023-01-22 16:42:47.114397: step: 488/463, loss: 0.045102763921022415 2023-01-22 16:42:47.785261: step: 490/463, loss: 0.025840144604444504 2023-01-22 16:42:48.420072: step: 492/463, loss: 0.054885510355234146 2023-01-22 16:42:49.044311: step: 494/463, loss: 0.011808100156486034 2023-01-22 16:42:49.631298: step: 496/463, loss: 0.11696034669876099 2023-01-22 16:42:50.242592: step: 498/463, loss: 0.04152390733361244 2023-01-22 16:42:50.932031: step: 500/463, loss: 0.11099371314048767 2023-01-22 16:42:51.543042: step: 502/463, loss: 0.003418322652578354 2023-01-22 16:42:52.133007: step: 504/463, loss: 0.08316610753536224 2023-01-22 16:42:52.773824: step: 506/463, loss: 0.09416878968477249 2023-01-22 16:42:53.359658: step: 508/463, loss: 0.06148260086774826 2023-01-22 16:42:53.984080: step: 510/463, loss: 0.13419973850250244 2023-01-22 16:42:54.585970: step: 512/463, loss: 0.0012598333414644003 2023-01-22 16:42:55.202819: step: 514/463, loss: 0.07953911274671555 2023-01-22 16:42:55.800721: step: 516/463, loss: 0.021585378795862198 2023-01-22 16:42:56.426568: step: 518/463, loss: 0.07375290244817734 2023-01-22 16:42:57.060260: step: 520/463, loss: 0.14709150791168213 2023-01-22 16:42:57.699711: step: 522/463, loss: 0.06558318436145782 2023-01-22 16:42:58.288763: step: 524/463, loss: 0.04169303923845291 2023-01-22 16:42:58.995285: step: 526/463, loss: 0.10469111800193787 2023-01-22 16:42:59.595338: step: 528/463, loss: 0.026635250076651573 2023-01-22 16:43:00.264683: step: 530/463, loss: 0.007126982789486647 2023-01-22 16:43:00.902933: step: 532/463, loss: 0.02500409632921219 2023-01-22 16:43:01.452213: step: 534/463, loss: 0.01568017154932022 2023-01-22 16:43:01.990724: step: 536/463, loss: 0.0260545052587986 2023-01-22 16:43:02.577589: step: 538/463, loss: 0.21560066938400269 2023-01-22 16:43:03.173208: step: 540/463, loss: 0.021115705370903015 2023-01-22 16:43:03.731922: step: 542/463, loss: 0.35932686924934387 2023-01-22 16:43:04.359653: 
step: 544/463, loss: 0.051029980182647705 2023-01-22 16:43:04.969722: step: 546/463, loss: 0.018741494044661522 2023-01-22 16:43:05.564652: step: 548/463, loss: 0.03786850720643997 2023-01-22 16:43:06.322458: step: 550/463, loss: 0.07767565548419952 2023-01-22 16:43:06.918778: step: 552/463, loss: 0.052013445645570755 2023-01-22 16:43:07.668280: step: 554/463, loss: 0.1630832552909851 2023-01-22 16:43:08.232299: step: 556/463, loss: 0.04901263862848282 2023-01-22 16:43:08.811149: step: 558/463, loss: 0.015183957293629646 2023-01-22 16:43:09.446078: step: 560/463, loss: 0.02676558680832386 2023-01-22 16:43:10.090008: step: 562/463, loss: 0.18818849325180054 2023-01-22 16:43:10.730550: step: 564/463, loss: 0.09635596722364426 2023-01-22 16:43:11.308556: step: 566/463, loss: 0.04100256785750389 2023-01-22 16:43:12.003667: step: 568/463, loss: 0.039773665368556976 2023-01-22 16:43:12.657351: step: 570/463, loss: 0.09066710621118546 2023-01-22 16:43:13.329563: step: 572/463, loss: 0.02740287594497204 2023-01-22 16:43:13.973182: step: 574/463, loss: 0.022807829082012177 2023-01-22 16:43:14.632690: step: 576/463, loss: 0.07456956058740616 2023-01-22 16:43:15.333683: step: 578/463, loss: 0.07549942284822464 2023-01-22 16:43:15.954193: step: 580/463, loss: 0.11407119035720825 2023-01-22 16:43:16.609518: step: 582/463, loss: 0.6028149127960205 2023-01-22 16:43:17.225894: step: 584/463, loss: 0.06498437374830246 2023-01-22 16:43:17.847558: step: 586/463, loss: 0.08702302724123001 2023-01-22 16:43:18.478168: step: 588/463, loss: 0.036849603056907654 2023-01-22 16:43:19.129258: step: 590/463, loss: 0.023740509524941444 2023-01-22 16:43:19.946320: step: 592/463, loss: 0.04063057154417038 2023-01-22 16:43:20.610179: step: 594/463, loss: 0.07523534446954727 2023-01-22 16:43:21.267852: step: 596/463, loss: 0.16881796717643738 2023-01-22 16:43:21.849674: step: 598/463, loss: 0.007330241147428751 2023-01-22 16:43:22.460988: step: 600/463, loss: 0.03827233612537384 2023-01-22 
16:43:23.040890: step: 602/463, loss: 0.024059951305389404 2023-01-22 16:43:23.706561: step: 604/463, loss: 0.029604200273752213 2023-01-22 16:43:24.295033: step: 606/463, loss: 0.043861981481313705 2023-01-22 16:43:24.908136: step: 608/463, loss: 0.22660565376281738 2023-01-22 16:43:25.622373: step: 610/463, loss: 0.09877804666757584 2023-01-22 16:43:26.180006: step: 612/463, loss: 0.027870165184140205 2023-01-22 16:43:26.751788: step: 614/463, loss: 0.017990848049521446 2023-01-22 16:43:27.433337: step: 616/463, loss: 0.15495993196964264 2023-01-22 16:43:28.045430: step: 618/463, loss: 0.025977924466133118 2023-01-22 16:43:28.683730: step: 620/463, loss: 0.009924844838678837 2023-01-22 16:43:29.287853: step: 622/463, loss: 0.0826999843120575 2023-01-22 16:43:29.935288: step: 624/463, loss: 0.046227820217609406 2023-01-22 16:43:30.567097: step: 626/463, loss: 0.08325247466564178 2023-01-22 16:43:31.219749: step: 628/463, loss: 0.04553397372364998 2023-01-22 16:43:31.868825: step: 630/463, loss: 0.11556340754032135 2023-01-22 16:43:32.500465: step: 632/463, loss: 0.024727389216423035 2023-01-22 16:43:33.181352: step: 634/463, loss: 0.06717843562364578 2023-01-22 16:43:33.843364: step: 636/463, loss: 0.03639480471611023 2023-01-22 16:43:34.455559: step: 638/463, loss: 0.06774527579545975 2023-01-22 16:43:35.127538: step: 640/463, loss: 0.4937330186367035 2023-01-22 16:43:35.746935: step: 642/463, loss: 0.04669651761651039 2023-01-22 16:43:36.413049: step: 644/463, loss: 0.08514953404664993 2023-01-22 16:43:37.044388: step: 646/463, loss: 0.10228398442268372 2023-01-22 16:43:37.651109: step: 648/463, loss: 0.07793349027633667 2023-01-22 16:43:38.279516: step: 650/463, loss: 0.02804330177605152 2023-01-22 16:43:38.852187: step: 652/463, loss: 0.05896514654159546 2023-01-22 16:43:39.493235: step: 654/463, loss: 0.5892834067344666 2023-01-22 16:43:40.121287: step: 656/463, loss: 0.25063803791999817 2023-01-22 16:43:40.755796: step: 658/463, loss: 1.1057854890823364 
2023-01-22 16:43:41.381588: step: 660/463, loss: 0.05005389451980591 2023-01-22 16:43:41.985858: step: 662/463, loss: 0.040708184242248535 2023-01-22 16:43:42.568806: step: 664/463, loss: 0.07096633315086365 2023-01-22 16:43:43.154082: step: 666/463, loss: 0.018490582704544067 2023-01-22 16:43:43.842574: step: 668/463, loss: 0.06729850172996521 2023-01-22 16:43:44.491543: step: 670/463, loss: 0.05185816064476967 2023-01-22 16:43:45.089368: step: 672/463, loss: 0.08534397929906845 2023-01-22 16:43:45.639792: step: 674/463, loss: 0.008029425516724586 2023-01-22 16:43:46.252566: step: 676/463, loss: 0.06380575895309448 2023-01-22 16:43:46.787054: step: 678/463, loss: 0.06322808563709259 2023-01-22 16:43:47.423213: step: 680/463, loss: 0.1318054497241974 2023-01-22 16:43:48.051680: step: 682/463, loss: 0.11114707589149475 2023-01-22 16:43:48.709764: step: 684/463, loss: 0.15192261338233948 2023-01-22 16:43:49.339350: step: 686/463, loss: 0.06690530478954315 2023-01-22 16:43:49.952417: step: 688/463, loss: 0.08799722790718079 2023-01-22 16:43:50.562691: step: 690/463, loss: 0.043299801647663116 2023-01-22 16:43:51.129847: step: 692/463, loss: 0.5969913601875305 2023-01-22 16:43:51.763919: step: 694/463, loss: 0.040563762187957764 2023-01-22 16:43:52.399379: step: 696/463, loss: 0.06899046152830124 2023-01-22 16:43:52.993830: step: 698/463, loss: 0.014611917547881603 2023-01-22 16:43:53.646129: step: 700/463, loss: 0.05444234237074852 2023-01-22 16:43:54.382049: step: 702/463, loss: 0.17598481476306915 2023-01-22 16:43:55.055676: step: 704/463, loss: 0.10042323172092438 2023-01-22 16:43:55.644225: step: 706/463, loss: 0.6152206659317017 2023-01-22 16:43:56.233833: step: 708/463, loss: 0.0868997648358345 2023-01-22 16:43:56.796460: step: 710/463, loss: 0.07910794019699097 2023-01-22 16:43:57.504219: step: 712/463, loss: 0.05471988394856453 2023-01-22 16:43:58.091988: step: 714/463, loss: 0.030006421729922295 2023-01-22 16:43:58.707328: step: 716/463, loss: 
0.1228027194738388 2023-01-22 16:43:59.301998: step: 718/463, loss: 0.0449814610183239 2023-01-22 16:43:59.901765: step: 720/463, loss: 0.08637873083353043 2023-01-22 16:44:00.564189: step: 722/463, loss: 0.011391912586987019 2023-01-22 16:44:01.213437: step: 724/463, loss: 0.037670254707336426 2023-01-22 16:44:01.838325: step: 726/463, loss: 0.1804923713207245 2023-01-22 16:44:02.473250: step: 728/463, loss: 0.08698229491710663 2023-01-22 16:44:03.095723: step: 730/463, loss: 0.06061690300703049 2023-01-22 16:44:03.649138: step: 732/463, loss: 0.1425868421792984 2023-01-22 16:44:04.267878: step: 734/463, loss: 0.1403505653142929 2023-01-22 16:44:04.835677: step: 736/463, loss: 0.030518969520926476 2023-01-22 16:44:05.466481: step: 738/463, loss: 0.13049206137657166 2023-01-22 16:44:06.208647: step: 740/463, loss: 0.08121033012866974 2023-01-22 16:44:06.853125: step: 742/463, loss: 0.3346508741378784 2023-01-22 16:44:07.477653: step: 744/463, loss: 0.06174361705780029 2023-01-22 16:44:08.248790: step: 746/463, loss: 0.03979477286338806 2023-01-22 16:44:08.911117: step: 748/463, loss: 0.07043951004743576 2023-01-22 16:44:09.566440: step: 750/463, loss: 0.006785435602068901 2023-01-22 16:44:10.250513: step: 752/463, loss: 0.012875337153673172 2023-01-22 16:44:10.812590: step: 754/463, loss: 0.041532114148139954 2023-01-22 16:44:11.451220: step: 756/463, loss: 0.02475295029580593 2023-01-22 16:44:12.111049: step: 758/463, loss: 0.058311447501182556 2023-01-22 16:44:12.732058: step: 760/463, loss: 0.09355651587247849 2023-01-22 16:44:13.398402: step: 762/463, loss: 0.07498453557491302 2023-01-22 16:44:13.967651: step: 764/463, loss: 0.03308523818850517 2023-01-22 16:44:14.599940: step: 766/463, loss: 0.32853806018829346 2023-01-22 16:44:15.226867: step: 768/463, loss: 0.05436338856816292 2023-01-22 16:44:15.813959: step: 770/463, loss: 0.02450391836464405 2023-01-22 16:44:16.442019: step: 772/463, loss: 0.06450299173593521 2023-01-22 16:44:17.084018: step: 774/463, 
loss: 0.040684983134269714 2023-01-22 16:44:17.709979: step: 776/463, loss: 0.0960361585021019 2023-01-22 16:44:18.276562: step: 778/463, loss: 1.2636427879333496 2023-01-22 16:44:18.990710: step: 780/463, loss: 0.013372809626162052 2023-01-22 16:44:19.577189: step: 782/463, loss: 0.021004172042012215 2023-01-22 16:44:20.254934: step: 784/463, loss: 0.018933909013867378 2023-01-22 16:44:20.878419: step: 786/463, loss: 0.05911260098218918 2023-01-22 16:44:21.486541: step: 788/463, loss: 0.002860710956156254 2023-01-22 16:44:22.094569: step: 790/463, loss: 0.03068568743765354 2023-01-22 16:44:22.763993: step: 792/463, loss: 0.0661996454000473 2023-01-22 16:44:23.446252: step: 794/463, loss: 0.3197731673717499 2023-01-22 16:44:24.030452: step: 796/463, loss: 0.04053255915641785 2023-01-22 16:44:24.631026: step: 798/463, loss: 0.047353293746709824 2023-01-22 16:44:25.303352: step: 800/463, loss: 0.02688196487724781 2023-01-22 16:44:25.968664: step: 802/463, loss: 0.09575239568948746 2023-01-22 16:44:26.601959: step: 804/463, loss: 0.05460821092128754 2023-01-22 16:44:27.169360: step: 806/463, loss: 1.0921714305877686 2023-01-22 16:44:27.851102: step: 808/463, loss: 0.12091883271932602 2023-01-22 16:44:28.436761: step: 810/463, loss: 0.07184804230928421 2023-01-22 16:44:28.986019: step: 812/463, loss: 0.6034228801727295 2023-01-22 16:44:29.638859: step: 814/463, loss: 0.0809527263045311 2023-01-22 16:44:30.175082: step: 816/463, loss: 0.03601351007819176 2023-01-22 16:44:30.741628: step: 818/463, loss: 0.04107963666319847 2023-01-22 16:44:31.493194: step: 820/463, loss: 0.02762593887746334 2023-01-22 16:44:32.131977: step: 822/463, loss: 0.0075806910172104836 2023-01-22 16:44:32.877259: step: 824/463, loss: 0.03740861639380455 2023-01-22 16:44:33.485709: step: 826/463, loss: 0.03932375833392143 2023-01-22 16:44:34.109194: step: 828/463, loss: 0.020801370963454247 2023-01-22 16:44:34.744567: step: 830/463, loss: 0.05387889966368675 2023-01-22 16:44:35.401935: step: 
832/463, loss: 0.05144181847572327 2023-01-22 16:44:36.072896: step: 834/463, loss: 0.04477492719888687 2023-01-22 16:44:36.736444: step: 836/463, loss: 0.20092818140983582 2023-01-22 16:44:37.422360: step: 838/463, loss: 0.009338716976344585 2023-01-22 16:44:38.068519: step: 840/463, loss: 0.05996943637728691 2023-01-22 16:44:38.662886: step: 842/463, loss: 0.05091816186904907 2023-01-22 16:44:39.300010: step: 844/463, loss: 0.10363996773958206 2023-01-22 16:44:39.952526: step: 846/463, loss: 0.2683759927749634 2023-01-22 16:44:40.601979: step: 848/463, loss: 0.4161459803581238 2023-01-22 16:44:41.209953: step: 850/463, loss: 0.2553946077823639 2023-01-22 16:44:41.851267: step: 852/463, loss: 0.008917962200939655 2023-01-22 16:44:42.427157: step: 854/463, loss: 0.0794248878955841 2023-01-22 16:44:43.048784: step: 856/463, loss: 0.0397050678730011 2023-01-22 16:44:43.717037: step: 858/463, loss: 0.023218052461743355 2023-01-22 16:44:44.308311: step: 860/463, loss: 0.02473318576812744 2023-01-22 16:44:44.914093: step: 862/463, loss: 0.10376220941543579 2023-01-22 16:44:45.465962: step: 864/463, loss: 0.09712719917297363 2023-01-22 16:44:46.094645: step: 866/463, loss: 0.07785729318857193 2023-01-22 16:44:46.676494: step: 868/463, loss: 0.0135653642937541 2023-01-22 16:44:47.266561: step: 870/463, loss: 0.5838375091552734 2023-01-22 16:44:47.828828: step: 872/463, loss: 0.005020774435251951 2023-01-22 16:44:48.475858: step: 874/463, loss: 0.0355030782520771 2023-01-22 16:44:49.083844: step: 876/463, loss: 0.15650099515914917 2023-01-22 16:44:49.733506: step: 878/463, loss: 0.0380869098007679 2023-01-22 16:44:50.357830: step: 880/463, loss: 0.1699914038181305 2023-01-22 16:44:50.983508: step: 882/463, loss: 0.025957971811294556 2023-01-22 16:44:51.607957: step: 884/463, loss: 0.11436450481414795 2023-01-22 16:44:52.222478: step: 886/463, loss: 0.02098097652196884 2023-01-22 16:44:52.771022: step: 888/463, loss: 0.016520533710718155 2023-01-22 16:44:53.404792: step: 
890/463, loss: 0.023980949074029922
2023-01-22 16:44:54.050313: step: 892/463, loss: 0.2769082188606262
2023-01-22 16:44:54.674889: step: 894/463, loss: 0.013706614263355732
2023-01-22 16:44:55.279820: step: 896/463, loss: 0.015313009731471539
2023-01-22 16:44:55.884426: step: 898/463, loss: 0.1773129552602768
2023-01-22 16:44:56.520258: step: 900/463, loss: 0.0590256005525589
2023-01-22 16:44:57.133660: step: 902/463, loss: 0.05406447872519493
2023-01-22 16:44:57.798759: step: 904/463, loss: 0.4133858382701874
2023-01-22 16:44:58.430952: step: 906/463, loss: 0.12163659930229187
2023-01-22 16:44:59.108644: step: 908/463, loss: 0.14906884729862213
2023-01-22 16:44:59.764530: step: 910/463, loss: 0.10698559880256653
2023-01-22 16:45:00.410365: step: 912/463, loss: 2.8846487998962402
2023-01-22 16:45:01.071337: step: 914/463, loss: 0.042744118720293045
2023-01-22 16:45:01.758264: step: 916/463, loss: 0.05098891630768776
2023-01-22 16:45:02.339546: step: 918/463, loss: 0.08275730162858963
2023-01-22 16:45:02.926079: step: 920/463, loss: 0.0317838117480278
2023-01-22 16:45:03.581766: step: 922/463, loss: 0.1491996794939041
2023-01-22 16:45:04.203680: step: 924/463, loss: 0.017777718603610992
2023-01-22 16:45:04.836785: step: 926/463, loss: 0.06820128113031387
==================================================
Loss: 0.109
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2885878489326765, 'r': 0.33349146110056926, 'f1': 0.30941901408450706}, 'combined': 0.22799295774647887, 'epoch': 18}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.33908642004896467, 'r': 0.3171907873407418, 'f1': 0.32777334742334546}, 'combined': 0.2305943147701928, 'epoch': 18}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2914572864321608, 'r': 0.3301707779886148, 'f1': 0.3096085409252669}, 'combined': 0.22813260910282823, 'epoch': 18}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3417843009829022, 'r': 0.3149426019528313, 'f1': 0.327814915384146}, 'combined': 0.23274858992274364, 'epoch': 18}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29920624677336083, 'r': 0.3383812582104802, 'f1': 0.3175902459072539}, 'combined': 0.23401386540534494, 'epoch': 18}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36355641796658256, 'r': 0.3042326394676725, 'f1': 0.331259482023708}, 'combined': 0.23519423223683264, 'epoch': 18}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2580409356725146, 'r': 0.4202380952380952, 'f1': 0.3197463768115942}, 'combined': 0.21316425120772944, 'epoch': 18}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.40217391304347827, 'f1': 0.30833333333333335}, 'combined': 0.15416666666666667, 'epoch': 18}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3875, 'r': 0.2672413793103448, 'f1': 0.3163265306122449}, 'combined': 0.2108843537414966, 'epoch': 18}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29419352931628273, 'r': 0.339969372587507, 'f1': 0.3154293298479158}, 'combined': 0.23242161146688528, 'epoch': 15}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.343089588698682, 'r': 0.29459001333289975, 'f1': 0.31699545096666953}, 'combined': 0.22506677018633536, 'epoch': 15}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.484375, 'r': 0.2672413793103448, 'f1': 0.34444444444444444}, 'combined': 0.22962962962962963, 'epoch': 15}
******************************
Epoch: 19
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:47:44.549160: step: 2/463, loss: 0.04636792838573456
2023-01-22 16:47:45.152223: step: 4/463, loss: 0.09001514315605164
2023-01-22 16:47:45.770999: step: 6/463, loss: 0.1123499646782875
2023-01-22 16:47:46.383237: step: 8/463, loss: 0.10140480846166611
2023-01-22 16:47:46.933857: step: 10/463, loss: 0.017452526837587357
2023-01-22 16:47:47.495678: step: 12/463, loss: 0.039355237036943436 2023-01-22 16:47:48.080496: step: 14/463, loss: 0.027500225231051445 2023-01-22 16:47:48.725175: step: 16/463, loss: 0.09309407323598862 2023-01-22 16:47:49.340949: step: 18/463, loss: 0.03867093473672867 2023-01-22 16:47:49.984622: step: 20/463, loss: 0.12692657113075256 2023-01-22 16:47:50.602442: step: 22/463, loss: 0.05737128108739853 2023-01-22 16:47:51.229604: step: 24/463, loss: 0.1350782811641693 2023-01-22 16:47:51.815446: step: 26/463, loss: 0.07506753504276276 2023-01-22 16:47:52.449147: step: 28/463, loss: 0.040556710213422775 2023-01-22 16:47:53.003495: step: 30/463, loss: 0.07495608180761337 2023-01-22 16:47:53.669307: step: 32/463, loss: 0.2737199664115906 2023-01-22 16:47:54.314175: step: 34/463, loss: 0.4909331798553467 2023-01-22 16:47:54.978421: step: 36/463, loss: 0.014969772659242153 2023-01-22 16:47:55.585332: step: 38/463, loss: 0.09222202748060226 2023-01-22 16:47:56.190092: step: 40/463, loss: 0.05815839767456055 2023-01-22 16:47:56.798368: step: 42/463, loss: 0.1578315645456314 2023-01-22 16:47:57.347317: step: 44/463, loss: 0.06462925672531128 2023-01-22 16:47:57.970264: step: 46/463, loss: 0.09700269252061844 2023-01-22 16:47:58.609313: step: 48/463, loss: 0.05459180101752281 2023-01-22 16:47:59.214606: step: 50/463, loss: 0.0625167191028595 2023-01-22 16:47:59.858309: step: 52/463, loss: 0.17293469607830048 2023-01-22 16:48:00.492557: step: 54/463, loss: 0.09401818364858627 2023-01-22 16:48:01.090895: step: 56/463, loss: 0.022667579352855682 2023-01-22 16:48:01.697067: step: 58/463, loss: 0.009311222471296787 2023-01-22 16:48:02.350671: step: 60/463, loss: 0.03784038871526718 2023-01-22 16:48:02.969440: step: 62/463, loss: 0.028912797570228577 2023-01-22 16:48:03.511759: step: 64/463, loss: 0.01274263858795166 2023-01-22 16:48:04.118215: step: 66/463, loss: 0.07003182917833328 2023-01-22 16:48:04.801864: step: 68/463, loss: 0.03681602329015732 2023-01-22 
16:48:05.425068: step: 70/463, loss: 0.14679290354251862 2023-01-22 16:48:06.043475: step: 72/463, loss: 0.058090802282094955 2023-01-22 16:48:06.682486: step: 74/463, loss: 0.09339684993028641 2023-01-22 16:48:07.196761: step: 76/463, loss: 0.26453644037246704 2023-01-22 16:48:07.896581: step: 78/463, loss: 0.07121706008911133 2023-01-22 16:48:08.579567: step: 80/463, loss: 0.01600741036236286 2023-01-22 16:48:09.141508: step: 82/463, loss: 0.0404224656522274 2023-01-22 16:48:09.754778: step: 84/463, loss: 0.0511116199195385 2023-01-22 16:48:10.398795: step: 86/463, loss: 0.024546559900045395 2023-01-22 16:48:11.011751: step: 88/463, loss: 1.4327936172485352 2023-01-22 16:48:11.637575: step: 90/463, loss: 0.003155801212415099 2023-01-22 16:48:12.215400: step: 92/463, loss: 0.1632770448923111 2023-01-22 16:48:12.821345: step: 94/463, loss: 0.042905062437057495 2023-01-22 16:48:13.484206: step: 96/463, loss: 0.04636860266327858 2023-01-22 16:48:14.098799: step: 98/463, loss: 0.014492981135845184 2023-01-22 16:48:14.778923: step: 100/463, loss: 0.06112828478217125 2023-01-22 16:48:15.360895: step: 102/463, loss: 0.129401296377182 2023-01-22 16:48:16.006127: step: 104/463, loss: 0.16280597448349 2023-01-22 16:48:16.568441: step: 106/463, loss: 0.10495039820671082 2023-01-22 16:48:17.160225: step: 108/463, loss: 0.010070253163576126 2023-01-22 16:48:17.696546: step: 110/463, loss: 0.016290104016661644 2023-01-22 16:48:18.263888: step: 112/463, loss: 0.007364022079855204 2023-01-22 16:48:18.904506: step: 114/463, loss: 0.03768562898039818 2023-01-22 16:48:19.516728: step: 116/463, loss: 0.09434930980205536 2023-01-22 16:48:20.136476: step: 118/463, loss: 0.02073539048433304 2023-01-22 16:48:20.762398: step: 120/463, loss: 0.09806998819112778 2023-01-22 16:48:21.339467: step: 122/463, loss: 0.027869757264852524 2023-01-22 16:48:21.941597: step: 124/463, loss: 0.056343965232372284 2023-01-22 16:48:22.555077: step: 126/463, loss: 0.094172902405262 2023-01-22 
16:48:23.127513: step: 128/463, loss: 0.021216074004769325 2023-01-22 16:48:23.793244: step: 130/463, loss: 0.05503359064459801 2023-01-22 16:48:24.392041: step: 132/463, loss: 0.20799164474010468 2023-01-22 16:48:25.057960: step: 134/463, loss: 0.026273956522345543 2023-01-22 16:48:25.662347: step: 136/463, loss: 0.11471883952617645 2023-01-22 16:48:26.292064: step: 138/463, loss: 0.08235922455787659 2023-01-22 16:48:26.916727: step: 140/463, loss: 0.011850845068693161 2023-01-22 16:48:27.504795: step: 142/463, loss: 0.008053872734308243 2023-01-22 16:48:28.099825: step: 144/463, loss: 0.014741532504558563 2023-01-22 16:48:28.693057: step: 146/463, loss: 0.2938196361064911 2023-01-22 16:48:29.328470: step: 148/463, loss: 0.04815991595387459 2023-01-22 16:48:29.934060: step: 150/463, loss: 0.2392081916332245 2023-01-22 16:48:30.592257: step: 152/463, loss: 0.08781959861516953 2023-01-22 16:48:31.250015: step: 154/463, loss: 0.1522451788187027 2023-01-22 16:48:31.847635: step: 156/463, loss: 0.22752203047275543 2023-01-22 16:48:32.507851: step: 158/463, loss: 0.1171979233622551 2023-01-22 16:48:33.081284: step: 160/463, loss: 0.015109706670045853 2023-01-22 16:48:33.964588: step: 162/463, loss: 0.0403885580599308 2023-01-22 16:48:34.550274: step: 164/463, loss: 0.018345873802900314 2023-01-22 16:48:35.161852: step: 166/463, loss: 0.06599047034978867 2023-01-22 16:48:35.873783: step: 168/463, loss: 0.5042043924331665 2023-01-22 16:48:36.444048: step: 170/463, loss: 0.01108035258948803 2023-01-22 16:48:37.143217: step: 172/463, loss: 0.008897640742361546 2023-01-22 16:48:37.789145: step: 174/463, loss: 0.03958697244524956 2023-01-22 16:48:38.412271: step: 176/463, loss: 0.0385705903172493 2023-01-22 16:48:39.073254: step: 178/463, loss: 0.06585169583559036 2023-01-22 16:48:39.673203: step: 180/463, loss: 0.13198675215244293 2023-01-22 16:48:40.222858: step: 182/463, loss: 0.09679707139730453 2023-01-22 16:48:40.879570: step: 184/463, loss: 0.008188272826373577 
2023-01-22 16:48:41.535074: step: 186/463, loss: 0.12260603159666061 2023-01-22 16:48:42.138102: step: 188/463, loss: 0.09057255834341049 2023-01-22 16:48:42.716843: step: 190/463, loss: 0.02076653391122818 2023-01-22 16:48:43.315965: step: 192/463, loss: 1.4407315254211426 2023-01-22 16:48:43.931516: step: 194/463, loss: 0.03608759120106697 2023-01-22 16:48:44.571609: step: 196/463, loss: 0.03566184267401695 2023-01-22 16:48:45.217467: step: 198/463, loss: 0.23496948182582855 2023-01-22 16:48:45.833549: step: 200/463, loss: 0.03590071573853493 2023-01-22 16:48:46.428920: step: 202/463, loss: 0.057483211159706116 2023-01-22 16:48:47.028735: step: 204/463, loss: 0.04411294683814049 2023-01-22 16:48:47.633437: step: 206/463, loss: 0.011095965281128883 2023-01-22 16:48:48.310634: step: 208/463, loss: 0.13487809896469116 2023-01-22 16:48:48.921128: step: 210/463, loss: 0.026676440611481667 2023-01-22 16:48:49.577481: step: 212/463, loss: 0.04783211275935173 2023-01-22 16:48:50.185475: step: 214/463, loss: 0.025610819458961487 2023-01-22 16:48:50.829991: step: 216/463, loss: 0.04262056574225426 2023-01-22 16:48:51.441599: step: 218/463, loss: 0.03285584971308708 2023-01-22 16:48:52.045945: step: 220/463, loss: 0.005109147168695927 2023-01-22 16:48:52.631646: step: 222/463, loss: 0.0014465320855379105 2023-01-22 16:48:53.216841: step: 224/463, loss: 0.03951844945549965 2023-01-22 16:48:53.908042: step: 226/463, loss: 0.03873629868030548 2023-01-22 16:48:54.477866: step: 228/463, loss: 0.029332321137189865 2023-01-22 16:48:55.101333: step: 230/463, loss: 0.039540596306324005 2023-01-22 16:48:55.752646: step: 232/463, loss: 0.061497606337070465 2023-01-22 16:48:56.419418: step: 234/463, loss: 0.04351307451725006 2023-01-22 16:48:57.007575: step: 236/463, loss: 0.1704922467470169 2023-01-22 16:48:57.636824: step: 238/463, loss: 0.049618665128946304 2023-01-22 16:48:58.237599: step: 240/463, loss: 0.0171179361641407 2023-01-22 16:48:58.862314: step: 242/463, loss: 
0.04847273603081703 2023-01-22 16:48:59.479652: step: 244/463, loss: 0.18111002445220947 2023-01-22 16:49:00.114484: step: 246/463, loss: 0.04116060584783554 2023-01-22 16:49:00.745356: step: 248/463, loss: 0.10703171789646149 2023-01-22 16:49:01.374666: step: 250/463, loss: 0.014379863627254963 2023-01-22 16:49:01.965518: step: 252/463, loss: 0.03638704866170883 2023-01-22 16:49:02.558881: step: 254/463, loss: 0.028577188029885292 2023-01-22 16:49:03.128263: step: 256/463, loss: 0.07171151041984558 2023-01-22 16:49:03.724048: step: 258/463, loss: 0.012037718668580055 2023-01-22 16:49:04.271804: step: 260/463, loss: 0.05173099786043167 2023-01-22 16:49:04.900460: step: 262/463, loss: 0.14954306185245514 2023-01-22 16:49:05.528511: step: 264/463, loss: 0.009192417375743389 2023-01-22 16:49:06.184784: step: 266/463, loss: 0.014035576023161411 2023-01-22 16:49:06.788713: step: 268/463, loss: 0.005188019946217537 2023-01-22 16:49:07.435568: step: 270/463, loss: 0.021883683279156685 2023-01-22 16:49:08.092726: step: 272/463, loss: 0.0176668930798769 2023-01-22 16:49:08.679571: step: 274/463, loss: 0.009969591163098812 2023-01-22 16:49:09.368437: step: 276/463, loss: 0.3630470335483551 2023-01-22 16:49:10.051665: step: 278/463, loss: 0.02344396710395813 2023-01-22 16:49:10.642870: step: 280/463, loss: 0.013086456805467606 2023-01-22 16:49:11.242284: step: 282/463, loss: 1.9121757745742798 2023-01-22 16:49:11.942713: step: 284/463, loss: 0.11952221393585205 2023-01-22 16:49:12.689865: step: 286/463, loss: 0.03920687735080719 2023-01-22 16:49:13.321773: step: 288/463, loss: 0.05913982540369034 2023-01-22 16:49:13.923164: step: 290/463, loss: 0.10695138573646545 2023-01-22 16:49:14.580911: step: 292/463, loss: 0.07472793757915497 2023-01-22 16:49:15.216288: step: 294/463, loss: 0.10199268907308578 2023-01-22 16:49:15.857523: step: 296/463, loss: 0.01420282106846571 2023-01-22 16:49:16.457548: step: 298/463, loss: 0.06863246113061905 2023-01-22 16:49:17.054835: step: 
300/463, loss: 0.04446542263031006 2023-01-22 16:49:17.654878: step: 302/463, loss: 0.03597784787416458 2023-01-22 16:49:18.397530: step: 304/463, loss: 0.27933934330940247 2023-01-22 16:49:19.146385: step: 306/463, loss: 0.009806377813220024 2023-01-22 16:49:19.843648: step: 308/463, loss: 0.04126913473010063 2023-01-22 16:49:20.434870: step: 310/463, loss: 0.08789330720901489 2023-01-22 16:49:21.022701: step: 312/463, loss: 0.03982290253043175 2023-01-22 16:49:21.686220: step: 314/463, loss: 0.021273542195558548 2023-01-22 16:49:22.369686: step: 316/463, loss: 0.02829986996948719 2023-01-22 16:49:22.970460: step: 318/463, loss: 0.07617245614528656 2023-01-22 16:49:23.504404: step: 320/463, loss: 0.1808425635099411 2023-01-22 16:49:24.204021: step: 322/463, loss: 0.07029062509536743 2023-01-22 16:49:24.766234: step: 324/463, loss: 0.032603684812784195 2023-01-22 16:49:25.416242: step: 326/463, loss: 1.4972810745239258 2023-01-22 16:49:25.993284: step: 328/463, loss: 0.03929269686341286 2023-01-22 16:49:26.594302: step: 330/463, loss: 0.02704949490725994 2023-01-22 16:49:27.220580: step: 332/463, loss: 0.045854903757572174 2023-01-22 16:49:27.891253: step: 334/463, loss: 0.06264758110046387 2023-01-22 16:49:28.563794: step: 336/463, loss: 0.06017249450087547 2023-01-22 16:49:29.224579: step: 338/463, loss: 0.07814358919858932 2023-01-22 16:49:29.854830: step: 340/463, loss: 0.03903423622250557 2023-01-22 16:49:30.512071: step: 342/463, loss: 0.09933950752019882 2023-01-22 16:49:31.218256: step: 344/463, loss: 0.04313047230243683 2023-01-22 16:49:31.841863: step: 346/463, loss: 0.20800140500068665 2023-01-22 16:49:32.451073: step: 348/463, loss: 0.0947616696357727 2023-01-22 16:49:33.066815: step: 350/463, loss: 0.03944844380021095 2023-01-22 16:49:33.669009: step: 352/463, loss: 0.009616902098059654 2023-01-22 16:49:34.367882: step: 354/463, loss: 0.031036363914608955 2023-01-22 16:49:34.995078: step: 356/463, loss: 0.02447657473385334 2023-01-22 16:49:35.568732: 
step: 358/463, loss: 0.010733339935541153 2023-01-22 16:49:36.154651: step: 360/463, loss: 0.008889035321772099 2023-01-22 16:49:36.776510: step: 362/463, loss: 0.03961664065718651 2023-01-22 16:49:37.364033: step: 364/463, loss: 0.034800246357917786 2023-01-22 16:49:37.937948: step: 366/463, loss: 0.006748363841325045 2023-01-22 16:49:38.660973: step: 368/463, loss: 0.5024653673171997 2023-01-22 16:49:39.223271: step: 370/463, loss: 0.004523778334259987 2023-01-22 16:49:39.804484: step: 372/463, loss: 0.08239591121673584 2023-01-22 16:49:40.471546: step: 374/463, loss: 0.007211757358163595 2023-01-22 16:49:41.137842: step: 376/463, loss: 0.018534645438194275 2023-01-22 16:49:41.710709: step: 378/463, loss: 0.05562247708439827 2023-01-22 16:49:42.332307: step: 380/463, loss: 0.02421676740050316 2023-01-22 16:49:42.954633: step: 382/463, loss: 0.3083426058292389 2023-01-22 16:49:43.518392: step: 384/463, loss: 0.027262764051556587 2023-01-22 16:49:44.157360: step: 386/463, loss: 0.023630784824490547 2023-01-22 16:49:44.754707: step: 388/463, loss: 0.06870390474796295 2023-01-22 16:49:45.434722: step: 390/463, loss: 0.046379394829273224 2023-01-22 16:49:46.072584: step: 392/463, loss: 0.14098785817623138 2023-01-22 16:49:46.676475: step: 394/463, loss: 0.030977150425314903 2023-01-22 16:49:47.359890: step: 396/463, loss: 0.05526440963149071 2023-01-22 16:49:48.008572: step: 398/463, loss: 0.04461690038442612 2023-01-22 16:49:48.583560: step: 400/463, loss: 0.035312190651893616 2023-01-22 16:49:49.128515: step: 402/463, loss: 0.2505747377872467 2023-01-22 16:49:49.805998: step: 404/463, loss: 0.18820340931415558 2023-01-22 16:49:50.390234: step: 406/463, loss: 0.024156799539923668 2023-01-22 16:49:51.109461: step: 408/463, loss: 0.15820738673210144 2023-01-22 16:49:51.676531: step: 410/463, loss: 0.3670610189437866 2023-01-22 16:49:52.305777: step: 412/463, loss: 0.01690792664885521 2023-01-22 16:49:52.895598: step: 414/463, loss: 0.05511033907532692 2023-01-22 
16:49:53.547586: step: 416/463, loss: 0.023335812613368034 2023-01-22 16:49:54.165545: step: 418/463, loss: 0.0964590311050415 2023-01-22 16:49:54.748043: step: 420/463, loss: 0.04741751775145531 2023-01-22 16:49:55.386759: step: 422/463, loss: 0.02149251103401184 2023-01-22 16:49:56.062557: step: 424/463, loss: 0.1283600777387619 2023-01-22 16:49:56.656567: step: 426/463, loss: 0.021887797862291336 2023-01-22 16:49:57.308545: step: 428/463, loss: 0.02962470054626465 2023-01-22 16:49:57.933560: step: 430/463, loss: 0.023900965228676796 2023-01-22 16:49:58.561033: step: 432/463, loss: 0.06469332426786423 2023-01-22 16:49:59.141806: step: 434/463, loss: 0.3887791931629181 2023-01-22 16:49:59.725197: step: 436/463, loss: 0.019189484417438507 2023-01-22 16:50:00.384092: step: 438/463, loss: 0.0018229224951937795 2023-01-22 16:50:00.999524: step: 440/463, loss: 0.005696425214409828 2023-01-22 16:50:01.622107: step: 442/463, loss: 0.03701581060886383 2023-01-22 16:50:02.245127: step: 444/463, loss: 0.021698307245969772 2023-01-22 16:50:02.923066: step: 446/463, loss: 0.08930105715990067 2023-01-22 16:50:03.531899: step: 448/463, loss: 0.05044582858681679 2023-01-22 16:50:04.158831: step: 450/463, loss: 0.054706066846847534 2023-01-22 16:50:04.762296: step: 452/463, loss: 0.02749365009367466 2023-01-22 16:50:05.453278: step: 454/463, loss: 0.03461924195289612 2023-01-22 16:50:06.043953: step: 456/463, loss: 0.07269348204135895 2023-01-22 16:50:06.730130: step: 458/463, loss: 0.042938824743032455 2023-01-22 16:50:07.457596: step: 460/463, loss: 0.04059815779328346 2023-01-22 16:50:08.041180: step: 462/463, loss: 0.01474787201732397 2023-01-22 16:50:08.692267: step: 464/463, loss: 0.027494940906763077 2023-01-22 16:50:09.342755: step: 466/463, loss: 0.06110293045639992 2023-01-22 16:50:09.957214: step: 468/463, loss: 0.020943904295563698 2023-01-22 16:50:10.587155: step: 470/463, loss: 0.005138691049069166 2023-01-22 16:50:11.222272: step: 472/463, loss: 0.06810832768678665 
2023-01-22 16:50:11.873644: step: 474/463, loss: 0.1128334179520607 2023-01-22 16:50:12.538684: step: 476/463, loss: 0.12352797389030457 2023-01-22 16:50:13.135717: step: 478/463, loss: 0.03775565326213837 2023-01-22 16:50:13.740061: step: 480/463, loss: 0.07155919075012207 2023-01-22 16:50:14.346710: step: 482/463, loss: 0.0276838019490242 2023-01-22 16:50:14.956483: step: 484/463, loss: 0.046788670122623444 2023-01-22 16:50:15.564781: step: 486/463, loss: 0.045393139123916626 2023-01-22 16:50:16.157634: step: 488/463, loss: 0.038358960300683975 2023-01-22 16:50:16.784549: step: 490/463, loss: 0.055979251861572266 2023-01-22 16:50:17.442212: step: 492/463, loss: 0.04979492723941803 2023-01-22 16:50:18.154228: step: 494/463, loss: 0.047866471111774445 2023-01-22 16:50:18.728634: step: 496/463, loss: 0.02266561985015869 2023-01-22 16:50:19.359565: step: 498/463, loss: 0.027715856209397316 2023-01-22 16:50:20.023165: step: 500/463, loss: 0.05459524691104889 2023-01-22 16:50:20.566821: step: 502/463, loss: 0.05810924619436264 2023-01-22 16:50:21.157141: step: 504/463, loss: 0.06935615092515945 2023-01-22 16:50:21.772849: step: 506/463, loss: 0.05496121942996979 2023-01-22 16:50:22.422630: step: 508/463, loss: 0.041997674852609634 2023-01-22 16:50:23.034763: step: 510/463, loss: 0.04010969027876854 2023-01-22 16:50:23.626992: step: 512/463, loss: 0.03783336654305458 2023-01-22 16:50:24.274889: step: 514/463, loss: 0.017396369948983192 2023-01-22 16:50:24.952240: step: 516/463, loss: 0.05054687708616257 2023-01-22 16:50:25.594482: step: 518/463, loss: 0.030967628583312035 2023-01-22 16:50:26.279408: step: 520/463, loss: 0.07239972054958344 2023-01-22 16:50:26.889975: step: 522/463, loss: 0.0418214350938797 2023-01-22 16:50:27.509060: step: 524/463, loss: 0.02831004559993744 2023-01-22 16:50:28.138522: step: 526/463, loss: 0.028634842485189438 2023-01-22 16:50:28.762736: step: 528/463, loss: 0.029990795999765396 2023-01-22 16:50:29.481719: step: 530/463, loss: 
0.10256866365671158 2023-01-22 16:50:30.118288: step: 532/463, loss: 0.02998257428407669 2023-01-22 16:50:30.789352: step: 534/463, loss: 0.06829416006803513 2023-01-22 16:50:31.408825: step: 536/463, loss: 0.013462930917739868 2023-01-22 16:50:32.109098: step: 538/463, loss: 0.08848761022090912 2023-01-22 16:50:32.757943: step: 540/463, loss: 0.09039761871099472 2023-01-22 16:50:33.450036: step: 542/463, loss: 0.14122946560382843 2023-01-22 16:50:34.037330: step: 544/463, loss: 0.02619233727455139 2023-01-22 16:50:34.634655: step: 546/463, loss: 0.056005459278821945 2023-01-22 16:50:35.288974: step: 548/463, loss: 0.0962764173746109 2023-01-22 16:50:35.869441: step: 550/463, loss: 0.10139350593090057 2023-01-22 16:50:36.472181: step: 552/463, loss: 0.036682505160570145 2023-01-22 16:50:37.131823: step: 554/463, loss: 0.22565707564353943 2023-01-22 16:50:37.778378: step: 556/463, loss: 0.03803018853068352 2023-01-22 16:50:38.420821: step: 558/463, loss: 0.04875044897198677 2023-01-22 16:50:39.014697: step: 560/463, loss: 0.04738086089491844 2023-01-22 16:50:39.675684: step: 562/463, loss: 0.06079798564314842 2023-01-22 16:50:40.347945: step: 564/463, loss: 0.40627750754356384 2023-01-22 16:50:40.983558: step: 566/463, loss: 0.0006937486468814313 2023-01-22 16:50:41.665747: step: 568/463, loss: 0.02122250571846962 2023-01-22 16:50:42.296725: step: 570/463, loss: 0.0965903103351593 2023-01-22 16:50:42.969804: step: 572/463, loss: 0.07831482589244843 2023-01-22 16:50:43.633056: step: 574/463, loss: 0.08289043605327606 2023-01-22 16:50:44.268229: step: 576/463, loss: 0.07579950243234634 2023-01-22 16:50:44.863359: step: 578/463, loss: 0.09195798635482788 2023-01-22 16:50:45.431892: step: 580/463, loss: 0.07948186993598938 2023-01-22 16:50:46.095022: step: 582/463, loss: 1.4808086156845093 2023-01-22 16:50:46.699212: step: 584/463, loss: 0.045080315321683884 2023-01-22 16:50:47.363039: step: 586/463, loss: 0.09397546201944351 2023-01-22 16:50:48.043803: step: 588/463, 
loss: 0.0815775915980339 2023-01-22 16:50:48.726799: step: 590/463, loss: 0.40038004517555237 2023-01-22 16:50:49.368643: step: 592/463, loss: 0.01035951916128397 2023-01-22 16:50:49.988943: step: 594/463, loss: 0.06293779611587524 2023-01-22 16:50:50.582139: step: 596/463, loss: 0.05365222319960594 2023-01-22 16:50:51.245681: step: 598/463, loss: 0.03156990185379982 2023-01-22 16:50:51.867219: step: 600/463, loss: 0.03201669454574585 2023-01-22 16:50:52.492451: step: 602/463, loss: 0.044124193489551544 2023-01-22 16:50:53.157209: step: 604/463, loss: 0.08683539181947708 2023-01-22 16:50:53.757548: step: 606/463, loss: 0.08681542426347733 2023-01-22 16:50:54.340119: step: 608/463, loss: 0.038694653660058975 2023-01-22 16:50:54.956113: step: 610/463, loss: 0.06398120522499084 2023-01-22 16:50:55.530920: step: 612/463, loss: 0.035619597882032394 2023-01-22 16:50:56.171871: step: 614/463, loss: 0.030382677912712097 2023-01-22 16:50:56.835620: step: 616/463, loss: 0.1013604924082756 2023-01-22 16:50:57.475591: step: 618/463, loss: 0.020420387387275696 2023-01-22 16:50:58.036557: step: 620/463, loss: 0.030380195006728172 2023-01-22 16:50:58.700575: step: 622/463, loss: 0.054290782660245895 2023-01-22 16:50:59.338284: step: 624/463, loss: 0.055025484412908554 2023-01-22 16:50:59.981818: step: 626/463, loss: 0.11658049374818802 2023-01-22 16:51:00.606322: step: 628/463, loss: 0.011359983123838902 2023-01-22 16:51:01.325187: step: 630/463, loss: 0.020199548453092575 2023-01-22 16:51:01.991206: step: 632/463, loss: 0.1200765073299408 2023-01-22 16:51:02.660477: step: 634/463, loss: 1.6043034791946411 2023-01-22 16:51:03.257932: step: 636/463, loss: 0.01528902631253004 2023-01-22 16:51:03.833754: step: 638/463, loss: 0.05021580308675766 2023-01-22 16:51:04.408014: step: 640/463, loss: 0.019652079790830612 2023-01-22 16:51:05.038717: step: 642/463, loss: 0.17238672077655792 2023-01-22 16:51:05.687616: step: 644/463, loss: 0.08194341510534286 2023-01-22 16:51:06.334060: step: 
646/463, loss: 0.01592087931931019 2023-01-22 16:51:06.924540: step: 648/463, loss: 0.03840217366814613 2023-01-22 16:51:07.488214: step: 650/463, loss: 0.006283220369368792 2023-01-22 16:51:08.089365: step: 652/463, loss: 0.01841687224805355 2023-01-22 16:51:08.648097: step: 654/463, loss: 0.011898201890289783 2023-01-22 16:51:09.289551: step: 656/463, loss: 0.02538643777370453 2023-01-22 16:51:09.921745: step: 658/463, loss: 0.03563403710722923 2023-01-22 16:51:10.485621: step: 660/463, loss: 1.1859220266342163 2023-01-22 16:51:11.148756: step: 662/463, loss: 0.06716649979352951 2023-01-22 16:51:11.743432: step: 664/463, loss: 0.05306149646639824 2023-01-22 16:51:12.415719: step: 666/463, loss: 0.06262854486703873 2023-01-22 16:51:13.004729: step: 668/463, loss: 0.14011381566524506 2023-01-22 16:51:13.751384: step: 670/463, loss: 0.01105993427336216 2023-01-22 16:51:14.355538: step: 672/463, loss: 0.07136335223913193 2023-01-22 16:51:15.016095: step: 674/463, loss: 0.08433887362480164 2023-01-22 16:51:15.650158: step: 676/463, loss: 0.027550501748919487 2023-01-22 16:51:16.308615: step: 678/463, loss: 0.2550245523452759 2023-01-22 16:51:16.963548: step: 680/463, loss: 0.23550502955913544 2023-01-22 16:51:17.518153: step: 682/463, loss: 0.12774014472961426 2023-01-22 16:51:18.190928: step: 684/463, loss: 0.029325993731617928 2023-01-22 16:51:18.815768: step: 686/463, loss: 0.0307205468416214 2023-01-22 16:51:19.421965: step: 688/463, loss: 0.060222845524549484 2023-01-22 16:51:20.066282: step: 690/463, loss: 0.011428779922425747 2023-01-22 16:51:20.695139: step: 692/463, loss: 0.0279059000313282 2023-01-22 16:51:21.275200: step: 694/463, loss: 0.0315239243209362 2023-01-22 16:51:21.882680: step: 696/463, loss: 0.02057325281202793 2023-01-22 16:51:22.636666: step: 698/463, loss: 0.0858994796872139 2023-01-22 16:51:23.212442: step: 700/463, loss: 0.06277964264154434 2023-01-22 16:51:23.854536: step: 702/463, loss: 0.2253146916627884 2023-01-22 16:51:24.456100: step: 
704/463, loss: 0.4745645821094513 2023-01-22 16:51:25.078164: step: 706/463, loss: 0.03236434981226921 2023-01-22 16:51:25.694286: step: 708/463, loss: 0.057825569063425064 2023-01-22 16:51:26.420310: step: 710/463, loss: 0.017234375700354576 2023-01-22 16:51:27.116450: step: 712/463, loss: 0.10133519023656845 2023-01-22 16:51:27.768286: step: 714/463, loss: 0.03966178745031357 2023-01-22 16:51:28.361995: step: 716/463, loss: 0.02324477583169937 2023-01-22 16:51:28.969967: step: 718/463, loss: 0.026913844048976898 2023-01-22 16:51:29.554695: step: 720/463, loss: 0.06275444477796555 2023-01-22 16:51:30.211405: step: 722/463, loss: 0.13484707474708557 2023-01-22 16:51:30.869750: step: 724/463, loss: 0.021263226866722107 2023-01-22 16:51:31.479608: step: 726/463, loss: 0.032878726720809937 2023-01-22 16:51:32.142101: step: 728/463, loss: 0.0457865409553051 2023-01-22 16:51:32.757847: step: 730/463, loss: 0.0889144241809845 2023-01-22 16:51:33.436416: step: 732/463, loss: 0.056999340653419495 2023-01-22 16:51:34.052898: step: 734/463, loss: 0.028533494099974632 2023-01-22 16:51:34.708876: step: 736/463, loss: 0.09060779213905334 2023-01-22 16:51:35.318242: step: 738/463, loss: 0.08383861929178238 2023-01-22 16:51:36.039923: step: 740/463, loss: 0.05609513819217682 2023-01-22 16:51:36.755067: step: 742/463, loss: 0.06418629735708237 2023-01-22 16:51:37.456594: step: 744/463, loss: 0.1045694425702095 2023-01-22 16:51:38.054155: step: 746/463, loss: 0.006003578659147024 2023-01-22 16:51:38.678915: step: 748/463, loss: 0.02795448713004589 2023-01-22 16:51:39.259725: step: 750/463, loss: 0.04418785125017166 2023-01-22 16:51:39.856391: step: 752/463, loss: 0.022774826735258102 2023-01-22 16:51:40.486006: step: 754/463, loss: 0.06855323910713196 2023-01-22 16:51:41.178261: step: 756/463, loss: 0.09963860362768173 2023-01-22 16:51:41.840920: step: 758/463, loss: 0.06308478862047195 2023-01-22 16:51:42.492947: step: 760/463, loss: 0.04672875255346298 2023-01-22 16:51:43.074244: 
step: 762/463, loss: 0.014300234615802765 2023-01-22 16:51:43.678664: step: 764/463, loss: 0.019327238202095032 2023-01-22 16:51:44.333678: step: 766/463, loss: 0.007982457056641579 2023-01-22 16:51:44.947454: step: 768/463, loss: 0.01139580737799406 2023-01-22 16:51:45.585964: step: 770/463, loss: 0.04256941005587578 2023-01-22 16:51:46.209401: step: 772/463, loss: 0.007176972925662994 2023-01-22 16:51:46.828169: step: 774/463, loss: 0.04351868852972984 2023-01-22 16:51:47.444244: step: 776/463, loss: 0.31784239411354065 2023-01-22 16:51:48.011257: step: 778/463, loss: 0.10866931080818176 2023-01-22 16:51:48.565969: step: 780/463, loss: 0.0967535674571991 2023-01-22 16:51:49.138068: step: 782/463, loss: 0.30729496479034424 2023-01-22 16:51:49.781083: step: 784/463, loss: 0.0784192681312561 2023-01-22 16:51:50.391828: step: 786/463, loss: 0.06446586549282074 2023-01-22 16:51:51.030662: step: 788/463, loss: 0.048455819487571716 2023-01-22 16:51:51.603901: step: 790/463, loss: 0.05198821797966957 2023-01-22 16:51:52.345459: step: 792/463, loss: 0.039615169167518616 2023-01-22 16:51:52.966864: step: 794/463, loss: 0.0561993271112442 2023-01-22 16:51:53.559544: step: 796/463, loss: 0.30200397968292236 2023-01-22 16:51:54.156328: step: 798/463, loss: 0.022991875186562538 2023-01-22 16:51:54.853442: step: 800/463, loss: 0.032224006950855255 2023-01-22 16:51:55.485854: step: 802/463, loss: 0.5006826519966125 2023-01-22 16:51:56.103283: step: 804/463, loss: 0.3114413619041443 2023-01-22 16:51:56.717223: step: 806/463, loss: 0.019121438264846802 2023-01-22 16:51:57.407249: step: 808/463, loss: 0.0036421488039195538 2023-01-22 16:51:57.985546: step: 810/463, loss: 0.0003671071899589151 2023-01-22 16:51:58.612519: step: 812/463, loss: 0.021200869232416153 2023-01-22 16:51:59.190213: step: 814/463, loss: 0.01725013740360737 2023-01-22 16:51:59.751501: step: 816/463, loss: 0.01078586746007204 2023-01-22 16:52:00.392332: step: 818/463, loss: 0.06197042390704155 2023-01-22 
16:52:01.021993: step: 820/463, loss: 0.07802018523216248 2023-01-22 16:52:01.667545: step: 822/463, loss: 0.010541153140366077 2023-01-22 16:52:02.338868: step: 824/463, loss: 0.04055146872997284 2023-01-22 16:52:03.004111: step: 826/463, loss: 0.05507691577076912 2023-01-22 16:52:03.600747: step: 828/463, loss: 0.04060526564717293 2023-01-22 16:52:04.268371: step: 830/463, loss: 0.02670554257929325 2023-01-22 16:52:04.902334: step: 832/463, loss: 0.004849882796406746 2023-01-22 16:52:05.509519: step: 834/463, loss: 0.09346813708543777 2023-01-22 16:52:06.073781: step: 836/463, loss: 0.533324122428894 2023-01-22 16:52:06.703035: step: 838/463, loss: 0.08962450176477432 2023-01-22 16:52:07.327557: step: 840/463, loss: 0.04204501584172249 2023-01-22 16:52:07.942016: step: 842/463, loss: 0.14332985877990723 2023-01-22 16:52:08.590181: step: 844/463, loss: 0.027158468961715698 2023-01-22 16:52:09.187794: step: 846/463, loss: 0.058514151722192764 2023-01-22 16:52:09.866409: step: 848/463, loss: 0.02195899561047554 2023-01-22 16:52:10.474511: step: 850/463, loss: 0.021381020545959473 2023-01-22 16:52:11.084434: step: 852/463, loss: 0.10565589368343353 2023-01-22 16:52:11.670397: step: 854/463, loss: 0.0014853515895083547 2023-01-22 16:52:12.325058: step: 856/463, loss: 0.0504942387342453 2023-01-22 16:52:12.985007: step: 858/463, loss: 0.013602443039417267 2023-01-22 16:52:13.623597: step: 860/463, loss: 0.17649886012077332 2023-01-22 16:52:14.244909: step: 862/463, loss: 0.09694847464561462 2023-01-22 16:52:14.981820: step: 864/463, loss: 0.05557619780302048 2023-01-22 16:52:15.551708: step: 866/463, loss: 0.01276442501693964 2023-01-22 16:52:16.163447: step: 868/463, loss: 0.42978665232658386 2023-01-22 16:52:16.702002: step: 870/463, loss: 0.04307706654071808 2023-01-22 16:52:17.337078: step: 872/463, loss: 0.06050116941332817 2023-01-22 16:52:17.935985: step: 874/463, loss: 0.05931306257843971 2023-01-22 16:52:18.524385: step: 876/463, loss: 0.020400680601596832 
2023-01-22 16:52:19.188387: step: 878/463, loss: 0.05224798992276192 2023-01-22 16:52:19.865486: step: 880/463, loss: 0.019119637086987495 2023-01-22 16:52:20.512139: step: 882/463, loss: 0.003886362537741661 2023-01-22 16:52:21.155213: step: 884/463, loss: 0.04459100589156151 2023-01-22 16:52:21.725218: step: 886/463, loss: 1.7206817865371704 2023-01-22 16:52:22.299747: step: 888/463, loss: 0.06904873996973038 2023-01-22 16:52:22.965840: step: 890/463, loss: 0.01244281604886055 2023-01-22 16:52:23.595658: step: 892/463, loss: 0.35629627108573914 2023-01-22 16:52:24.226973: step: 894/463, loss: 0.011590253561735153 2023-01-22 16:52:24.903625: step: 896/463, loss: 0.36410394310951233 2023-01-22 16:52:25.507754: step: 898/463, loss: 0.011768914759159088 2023-01-22 16:52:26.156516: step: 900/463, loss: 0.003035547211766243 2023-01-22 16:52:26.821370: step: 902/463, loss: 0.19969482719898224 2023-01-22 16:52:27.409546: step: 904/463, loss: 0.02102407068014145 2023-01-22 16:52:28.062248: step: 906/463, loss: 0.018248483538627625 2023-01-22 16:52:28.721642: step: 908/463, loss: 1.3151532411575317 2023-01-22 16:52:29.378179: step: 910/463, loss: 0.05741620808839798 2023-01-22 16:52:29.944871: step: 912/463, loss: 0.011935326270759106 2023-01-22 16:52:30.494048: step: 914/463, loss: 0.049973778426647186 2023-01-22 16:52:31.161141: step: 916/463, loss: 0.24521493911743164 2023-01-22 16:52:31.796002: step: 918/463, loss: 0.08046819269657135 2023-01-22 16:52:32.411941: step: 920/463, loss: 0.04535198584198952 2023-01-22 16:52:32.986785: step: 922/463, loss: 0.038755640387535095 2023-01-22 16:52:33.578458: step: 924/463, loss: 0.022681159898638725 2023-01-22 16:52:34.195451: step: 926/463, loss: 0.04851355403661728
==================================================
Loss: 0.101
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28830040353281555, 'r': 0.3265945747800586, 'f1': 0.30625505499838235}, 'combined': 0.22566161947249225, 'epoch': 19}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3387477477506408, 'r': 0.3210122635752146, 'f1': 0.32964162549927944}, 'combined': 0.2319086812557745, 'epoch': 19}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2866866438356165, 'r': 0.3176944971537002, 'f1': 0.3013951395139514}, 'combined': 0.22208062911554313, 'epoch': 19}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.339292351677531, 'r': 0.3167912882154958, 'f1': 0.32765597138534136}, 'combined': 0.23263573968359236, 'epoch': 19}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29497819175687323, 'r': 0.3268828538634041, 'f1': 0.3101120863834635}, 'combined': 0.2285036425983415, 'epoch': 19}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3561835053267278, 'r': 0.303035704793682, 'f1': 0.32746715482655314}, 'combined': 0.23250167992685272, 'epoch': 19}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2535919540229885, 'r': 0.4202380952380952, 'f1': 0.3163082437275986}, 'combined': 0.2108721624850657, 'epoch': 19}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29285714285714287, 'r': 0.44565217391304346, 'f1': 0.35344827586206895}, 'combined': 0.17672413793103448, 'epoch': 19}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4166666666666667, 'r': 0.3017241379310345, 'f1': 0.35}, 'combined': 0.2333333333333333, 'epoch': 19}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29497819175687323, 'r': 0.3268828538634041, 'f1': 0.3101120863834635}, 'combined': 0.2285036425983415, 'epoch': 19}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3561835053267278, 'r': 0.303035704793682, 'f1': 0.32746715482655314}, 'combined': 0.23250167992685272, 'epoch': 19}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4166666666666667, 'r': 0.3017241379310345, 'f1': 0.35}, 'combined': 0.2333333333333333, 'epoch': 19}
******************************
Epoch: 20
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:55:24.579413: step: 2/463, loss: 0.08323995769023895 2023-01-22 16:55:25.205359: step: 4/463, loss: 0.007340015843510628 2023-01-22 16:55:25.788440: step: 6/463, loss: 0.02065613865852356 2023-01-22 16:55:26.385587: step: 8/463, loss: 0.018542177975177765 2023-01-22 16:55:27.042212: step: 10/463, loss: 0.00855496246367693 2023-01-22 16:55:27.679661: step: 12/463, loss: 0.24917936325073242 2023-01-22 16:55:28.330995: step: 14/463, loss: 0.029521092772483826 2023-01-22 16:55:28.980058: step: 16/463, loss: 0.0457870177924633 2023-01-22 16:55:29.622300: step: 18/463, loss: 0.023652303963899612 2023-01-22 16:55:30.176030: step: 20/463, loss: 0.037270687520504 2023-01-22 16:55:30.762085: step: 22/463, loss: 0.03394424542784691 2023-01-22 16:55:31.433258: step: 24/463, loss: 0.048702944070100784 2023-01-22 16:55:32.097552: step: 26/463, loss: 0.12050747126340866 2023-01-22 16:55:32.771410: step: 28/463, loss: 0.5310078859329224 2023-01-22 16:55:33.423150: step: 30/463, loss: 0.03218822553753853 2023-01-22 16:55:34.120512: step: 32/463, loss: 0.20882923901081085 2023-01-22 16:55:34.723045: step: 34/463, loss: 0.07173653692007065 2023-01-22 16:55:35.391999: step: 36/463, loss: 0.03950980305671692 2023-01-22 16:55:36.090545: step: 38/463, loss: 0.03376353904604912 2023-01-22 16:55:36.716785: step: 40/463, loss: 0.08236401528120041 2023-01-22 16:55:37.367367: step: 42/463, loss: 0.030163351446390152 2023-01-22 16:55:37.969786: step: 44/463, loss: 0.17320962250232697 2023-01-22 16:55:38.590268: step: 46/463, loss: 0.07755175232887268 2023-01-22 16:55:39.174602: step: 48/463, loss: 0.045449815690517426
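A sanity check on the metric dicts above (this is an inference from the logged numbers, not code taken from train.py): each 'combined' value equals the product of the template F1 and the slot F1, and each 'f1' is the usual harmonic mean of the reported 'p' and 'r'. A minimal sketch that reproduces the epoch-19 Dev Chinese entry:

```python
# Assumed scoring relations, reverse-engineered from the logged values:
#   f1 = 2 * p * r / (p + r)
#   combined = template_f1 * slot_f1

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    """Combined score as it appears in the log: product of the two F1s."""
    return template_f1 * slot_f1

# Epoch-19 Dev Chinese numbers from the log:
t = f1(1.0, 0.5833333333333334)                      # template f1, 14/19
s = f1(0.28830040353281555, 0.3265945747800586)      # slot f1
c = combined(t, s)                                   # matches the logged 'combined'
```

The same product relation holds for the other language/split entries (e.g. Test Korean: 0.71 × 0.32765597… ≈ 0.23263573…), which is what makes the template metric the bottleneck on the combined score.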
2023-01-22 16:55:39.912255: step: 50/463, loss: 0.4162406027317047 2023-01-22 16:55:40.524168: step: 52/463, loss: 0.012239661999046803 2023-01-22 16:55:41.135196: step: 54/463, loss: 0.017549682408571243 2023-01-22 16:55:41.758862: step: 56/463, loss: 0.007764208130538464 2023-01-22 16:55:42.406301: step: 58/463, loss: 0.04698004573583603 2023-01-22 16:55:43.030716: step: 60/463, loss: 0.00611442606896162 2023-01-22 16:55:43.655637: step: 62/463, loss: 0.212619349360466 2023-01-22 16:55:44.290340: step: 64/463, loss: 0.0333978496491909 2023-01-22 16:55:44.887242: step: 66/463, loss: 0.061653804033994675 2023-01-22 16:55:45.584468: step: 68/463, loss: 0.0509067103266716 2023-01-22 16:55:46.204702: step: 70/463, loss: 0.5823639631271362 2023-01-22 16:55:46.879639: step: 72/463, loss: 0.07869108766317368 2023-01-22 16:55:47.469326: step: 74/463, loss: 0.0683194175362587 2023-01-22 16:55:48.053634: step: 76/463, loss: 0.055802781134843826 2023-01-22 16:55:48.610786: step: 78/463, loss: 0.01306124310940504 2023-01-22 16:55:49.241438: step: 80/463, loss: 0.06690248101949692 2023-01-22 16:55:49.859883: step: 82/463, loss: 0.007106563542038202 2023-01-22 16:55:50.443577: step: 84/463, loss: 0.15121328830718994 2023-01-22 16:55:51.085682: step: 86/463, loss: 0.1270948350429535 2023-01-22 16:55:51.715452: step: 88/463, loss: 0.022250449284911156 2023-01-22 16:55:52.297243: step: 90/463, loss: 0.02790955826640129 2023-01-22 16:55:52.874957: step: 92/463, loss: 0.014617358334362507 2023-01-22 16:55:53.486759: step: 94/463, loss: 0.06815235316753387 2023-01-22 16:55:54.105763: step: 96/463, loss: 0.03701472282409668 2023-01-22 16:55:54.749715: step: 98/463, loss: 0.06210058555006981 2023-01-22 16:55:55.375830: step: 100/463, loss: 0.03500523045659065 2023-01-22 16:55:55.987049: step: 102/463, loss: 0.04695548117160797 2023-01-22 16:55:56.681280: step: 104/463, loss: 0.036129385232925415 2023-01-22 16:55:57.323876: step: 106/463, loss: 0.044387370347976685 2023-01-22 
16:55:57.971806: step: 108/463, loss: 0.01837249845266342 2023-01-22 16:55:58.615328: step: 110/463, loss: 0.6807417273521423 2023-01-22 16:55:59.197222: step: 112/463, loss: 0.007771647069603205 2023-01-22 16:55:59.778112: step: 114/463, loss: 0.012117752805352211 2023-01-22 16:56:00.426621: step: 116/463, loss: 0.006294673308730125 2023-01-22 16:56:01.030007: step: 118/463, loss: 0.006181786768138409 2023-01-22 16:56:01.674342: step: 120/463, loss: 0.08464090526103973 2023-01-22 16:56:02.269689: step: 122/463, loss: 0.0581471249461174 2023-01-22 16:56:02.888973: step: 124/463, loss: 0.015994256362318993 2023-01-22 16:56:03.528286: step: 126/463, loss: 0.07616140693426132 2023-01-22 16:56:04.143963: step: 128/463, loss: 0.0648871585726738 2023-01-22 16:56:04.744542: step: 130/463, loss: 0.04968608170747757 2023-01-22 16:56:05.328276: step: 132/463, loss: 0.011758648790419102 2023-01-22 16:56:05.911066: step: 134/463, loss: 0.058100178837776184 2023-01-22 16:56:06.548822: step: 136/463, loss: 1.248206615447998 2023-01-22 16:56:07.196576: step: 138/463, loss: 0.10224691033363342 2023-01-22 16:56:07.839904: step: 140/463, loss: 0.09465485066175461 2023-01-22 16:56:08.415182: step: 142/463, loss: 0.03472604602575302 2023-01-22 16:56:09.053506: step: 144/463, loss: 0.034313153475522995 2023-01-22 16:56:09.594100: step: 146/463, loss: 0.010274055413901806 2023-01-22 16:56:10.163425: step: 148/463, loss: 0.04031563922762871 2023-01-22 16:56:10.760371: step: 150/463, loss: 0.10204671323299408 2023-01-22 16:56:11.452633: step: 152/463, loss: 0.06611669063568115 2023-01-22 16:56:12.044472: step: 154/463, loss: 0.016789471730589867 2023-01-22 16:56:12.693291: step: 156/463, loss: 0.42020291090011597 2023-01-22 16:56:13.307574: step: 158/463, loss: 0.02031388320028782 2023-01-22 16:56:14.000649: step: 160/463, loss: 0.01969081349670887 2023-01-22 16:56:14.655472: step: 162/463, loss: 0.02645924873650074 2023-01-22 16:56:15.229277: step: 164/463, loss: 0.005060764029622078 
2023-01-22 16:56:15.851757: step: 166/463, loss: 0.03826911002397537 2023-01-22 16:56:16.486175: step: 168/463, loss: 0.053765106946229935 2023-01-22 16:56:17.071875: step: 170/463, loss: 0.052000198513269424 2023-01-22 16:56:17.651978: step: 172/463, loss: 0.008816355839371681 2023-01-22 16:56:18.284531: step: 174/463, loss: 0.018465502187609673 2023-01-22 16:56:18.956443: step: 176/463, loss: 0.06459164619445801 2023-01-22 16:56:19.592102: step: 178/463, loss: 0.02789643593132496 2023-01-22 16:56:20.155729: step: 180/463, loss: 0.13722079992294312 2023-01-22 16:56:20.780273: step: 182/463, loss: 0.042327359318733215 2023-01-22 16:56:21.421822: step: 184/463, loss: 0.019767504185438156 2023-01-22 16:56:22.042767: step: 186/463, loss: 0.029623975977301598 2023-01-22 16:56:22.657424: step: 188/463, loss: 0.00828305073082447 2023-01-22 16:56:23.229951: step: 190/463, loss: 0.02647193893790245 2023-01-22 16:56:23.913352: step: 192/463, loss: 0.01056437287479639 2023-01-22 16:56:24.483502: step: 194/463, loss: 0.5350810885429382 2023-01-22 16:56:25.090143: step: 196/463, loss: 0.04880645126104355 2023-01-22 16:56:25.707864: step: 198/463, loss: 0.08484818786382675 2023-01-22 16:56:26.322085: step: 200/463, loss: 0.011702382005751133 2023-01-22 16:56:26.893976: step: 202/463, loss: 0.4117920398712158 2023-01-22 16:56:27.566416: step: 204/463, loss: 0.02006661333143711 2023-01-22 16:56:28.225663: step: 206/463, loss: 0.026494525372982025 2023-01-22 16:56:28.819514: step: 208/463, loss: 0.9082757830619812 2023-01-22 16:56:29.409133: step: 210/463, loss: 0.17125080525875092 2023-01-22 16:56:30.023333: step: 212/463, loss: 1.0819920301437378 2023-01-22 16:56:30.615450: step: 214/463, loss: 0.022895555943250656 2023-01-22 16:56:31.236195: step: 216/463, loss: 0.01617003045976162 2023-01-22 16:56:31.820262: step: 218/463, loss: 0.021007556468248367 2023-01-22 16:56:32.436757: step: 220/463, loss: 0.011161042377352715 2023-01-22 16:56:33.050527: step: 222/463, loss: 
0.05958352982997894 2023-01-22 16:56:33.751621: step: 224/463, loss: 0.04096345975995064 2023-01-22 16:56:34.314310: step: 226/463, loss: 0.0013262492138892412 2023-01-22 16:56:34.962842: step: 228/463, loss: 0.04926057904958725 2023-01-22 16:56:35.609036: step: 230/463, loss: 0.03283074498176575 2023-01-22 16:56:36.214510: step: 232/463, loss: 0.031715184450149536 2023-01-22 16:56:36.821578: step: 234/463, loss: 0.040327586233615875 2023-01-22 16:56:37.536921: step: 236/463, loss: 0.030659107491374016 2023-01-22 16:56:38.125663: step: 238/463, loss: 0.054952554404735565 2023-01-22 16:56:38.864119: step: 240/463, loss: 0.046287670731544495 2023-01-22 16:56:39.517792: step: 242/463, loss: 0.0023850214201956987 2023-01-22 16:56:40.129612: step: 244/463, loss: 0.12035354226827621 2023-01-22 16:56:40.721735: step: 246/463, loss: 0.09603091329336166 2023-01-22 16:56:41.372176: step: 248/463, loss: 0.08054850995540619 2023-01-22 16:56:41.952080: step: 250/463, loss: 0.055818844586610794 2023-01-22 16:56:42.583918: step: 252/463, loss: 0.4591297209262848 2023-01-22 16:56:43.158651: step: 254/463, loss: 0.08919832855463028 2023-01-22 16:56:43.757425: step: 256/463, loss: 0.08627292513847351 2023-01-22 16:56:44.367467: step: 258/463, loss: 0.05825857073068619 2023-01-22 16:56:44.991125: step: 260/463, loss: 0.01681964285671711 2023-01-22 16:56:45.599648: step: 262/463, loss: 0.14119376242160797 2023-01-22 16:56:46.192689: step: 264/463, loss: 0.0014511578483507037 2023-01-22 16:56:46.844202: step: 266/463, loss: 0.02058991976082325 2023-01-22 16:56:47.555958: step: 268/463, loss: 0.03168392553925514 2023-01-22 16:56:48.182563: step: 270/463, loss: 0.07809014618396759 2023-01-22 16:56:48.831172: step: 272/463, loss: 0.010978511534631252 2023-01-22 16:56:49.468067: step: 274/463, loss: 0.03784770891070366 2023-01-22 16:56:50.084747: step: 276/463, loss: 0.06694339960813522 2023-01-22 16:56:50.687594: step: 278/463, loss: 0.06547904014587402 2023-01-22 16:56:51.310268: step: 
280/463, loss: 0.050147686153650284 2023-01-22 16:56:51.906666: step: 282/463, loss: 0.10435345768928528 2023-01-22 16:56:52.506281: step: 284/463, loss: 0.02806750498712063 2023-01-22 16:56:53.049950: step: 286/463, loss: 0.04192790761590004 2023-01-22 16:56:53.666880: step: 288/463, loss: 0.0384625643491745 2023-01-22 16:56:54.290425: step: 290/463, loss: 0.006255440413951874 2023-01-22 16:56:54.912119: step: 292/463, loss: 0.028770076110959053 2023-01-22 16:56:55.526315: step: 294/463, loss: 0.0681924894452095 2023-01-22 16:56:56.173253: step: 296/463, loss: 0.03222872316837311 2023-01-22 16:56:56.789515: step: 298/463, loss: 0.040263596922159195 2023-01-22 16:56:57.434464: step: 300/463, loss: 0.09183554351329803 2023-01-22 16:56:58.090408: step: 302/463, loss: 0.13943786919116974 2023-01-22 16:56:58.673584: step: 304/463, loss: 0.0012791040353477001 2023-01-22 16:56:59.284708: step: 306/463, loss: 0.46397557854652405 2023-01-22 16:56:59.898441: step: 308/463, loss: 0.054934076964855194 2023-01-22 16:57:00.524041: step: 310/463, loss: 0.05137812718749046 2023-01-22 16:57:01.141408: step: 312/463, loss: 0.007286332547664642 2023-01-22 16:57:01.745937: step: 314/463, loss: 0.06177544221282005 2023-01-22 16:57:02.381317: step: 316/463, loss: 0.025980332866311073 2023-01-22 16:57:03.042636: step: 318/463, loss: 0.042516984045505524 2023-01-22 16:57:03.677761: step: 320/463, loss: 0.01909751072525978 2023-01-22 16:57:04.296864: step: 322/463, loss: 0.003260646481066942 2023-01-22 16:57:04.916925: step: 324/463, loss: 0.027001265436410904 2023-01-22 16:57:05.485200: step: 326/463, loss: 0.027243856340646744 2023-01-22 16:57:06.123112: step: 328/463, loss: 0.1572408229112625 2023-01-22 16:57:06.701222: step: 330/463, loss: 0.027139194309711456 2023-01-22 16:57:07.300708: step: 332/463, loss: 0.03428567573428154 2023-01-22 16:57:07.874791: step: 334/463, loss: 0.26800066232681274 2023-01-22 16:57:08.514214: step: 336/463, loss: 0.001998656429350376 2023-01-22 
16:57:09.158944: step: 338/463, loss: 0.041000839322805405 2023-01-22 16:57:09.723601: step: 340/463, loss: 0.07905033975839615 2023-01-22 16:57:10.364286: step: 342/463, loss: 0.036708950996398926 2023-01-22 16:57:10.994992: step: 344/463, loss: 0.011947150342166424 2023-01-22 16:57:11.719207: step: 346/463, loss: 0.04792380332946777 2023-01-22 16:57:12.360243: step: 348/463, loss: 0.016851287335157394 2023-01-22 16:57:13.067698: step: 350/463, loss: 0.08677379786968231 2023-01-22 16:57:13.679579: step: 352/463, loss: 0.07014984637498856 2023-01-22 16:57:14.354333: step: 354/463, loss: 0.043972499668598175 2023-01-22 16:57:14.977034: step: 356/463, loss: 0.021397612988948822 2023-01-22 16:57:15.602716: step: 358/463, loss: 0.0069655487313866615 2023-01-22 16:57:16.215854: step: 360/463, loss: 0.0007701426511630416 2023-01-22 16:57:16.887912: step: 362/463, loss: 0.14952483773231506 2023-01-22 16:57:17.504319: step: 364/463, loss: 0.024697627872228622 2023-01-22 16:57:18.190807: step: 366/463, loss: 0.06226004660129547 2023-01-22 16:57:18.768285: step: 368/463, loss: 0.018628248944878578 2023-01-22 16:57:19.360629: step: 370/463, loss: 0.037878844887018204 2023-01-22 16:57:20.026413: step: 372/463, loss: 0.05800690874457359 2023-01-22 16:57:20.622076: step: 374/463, loss: 0.1900286078453064 2023-01-22 16:57:21.233122: step: 376/463, loss: 0.04533953592181206 2023-01-22 16:57:21.842041: step: 378/463, loss: 0.040352046489715576 2023-01-22 16:57:22.513604: step: 380/463, loss: 0.01498746033757925 2023-01-22 16:57:23.140465: step: 382/463, loss: 0.0223684161901474 2023-01-22 16:57:23.776209: step: 384/463, loss: 0.03730696067214012 2023-01-22 16:57:24.425531: step: 386/463, loss: 0.042159512639045715 2023-01-22 16:57:25.033609: step: 388/463, loss: 0.22115536034107208 2023-01-22 16:57:25.633335: step: 390/463, loss: 0.04116658866405487 2023-01-22 16:57:26.253220: step: 392/463, loss: 0.721065104007721 2023-01-22 16:57:26.913475: step: 394/463, loss: 
0.008944780565798283 2023-01-22 16:57:27.515729: step: 396/463, loss: 0.007003935985267162 2023-01-22 16:57:28.100787: step: 398/463, loss: 0.2672494053840637 2023-01-22 16:57:28.759637: step: 400/463, loss: 0.047528333961963654 2023-01-22 16:57:29.355003: step: 402/463, loss: 0.46634459495544434 2023-01-22 16:57:29.978962: step: 404/463, loss: 0.03278937563300133 2023-01-22 16:57:30.626182: step: 406/463, loss: 0.2179594486951828 2023-01-22 16:57:31.284521: step: 408/463, loss: 0.030622906982898712 2023-01-22 16:57:31.806976: step: 410/463, loss: 0.05897950381040573 2023-01-22 16:57:32.372767: step: 412/463, loss: 0.008753904141485691 2023-01-22 16:57:32.973536: step: 414/463, loss: 0.015373372472822666 2023-01-22 16:57:33.608301: step: 416/463, loss: 0.017205150797963142 2023-01-22 16:57:34.259887: step: 418/463, loss: 0.0032850243151187897 2023-01-22 16:57:34.889530: step: 420/463, loss: 0.03149497136473656 2023-01-22 16:57:35.549171: step: 422/463, loss: 0.021351870149374008 2023-01-22 16:57:36.104024: step: 424/463, loss: 0.523242175579071 2023-01-22 16:57:36.717688: step: 426/463, loss: 0.0421280674636364 2023-01-22 16:57:37.376692: step: 428/463, loss: 0.02364734560251236 2023-01-22 16:57:37.923758: step: 430/463, loss: 0.00742302043363452 2023-01-22 16:57:38.527329: step: 432/463, loss: 0.04400298744440079 2023-01-22 16:57:39.175385: step: 434/463, loss: 0.017799649387598038 2023-01-22 16:57:39.757162: step: 436/463, loss: 0.08979277312755585 2023-01-22 16:57:40.426584: step: 438/463, loss: 0.06308624893426895 2023-01-22 16:57:41.036396: step: 440/463, loss: 0.08002220094203949 2023-01-22 16:57:41.673617: step: 442/463, loss: 0.02492065541446209 2023-01-22 16:57:42.311420: step: 444/463, loss: 0.12558989226818085 2023-01-22 16:57:42.904703: step: 446/463, loss: 0.005345349665731192 2023-01-22 16:57:43.526355: step: 448/463, loss: 0.022938577458262444 2023-01-22 16:57:44.224791: step: 450/463, loss: 0.06727348268032074 2023-01-22 16:57:44.878627: step: 
452/463, loss: 0.031391389667987823 2023-01-22 16:57:45.562948: step: 454/463, loss: 0.021947121247649193 2023-01-22 16:57:46.202271: step: 456/463, loss: 0.033300332725048065 2023-01-22 16:57:46.798855: step: 458/463, loss: 0.021847276017069817 2023-01-22 16:57:47.484909: step: 460/463, loss: 0.4443843960762024 2023-01-22 16:57:48.057619: step: 462/463, loss: 0.010977687314152718 2023-01-22 16:57:48.654631: step: 464/463, loss: 0.13077019155025482 2023-01-22 16:57:49.360503: step: 466/463, loss: 0.05875653028488159 2023-01-22 16:57:49.995840: step: 468/463, loss: 0.08670536428689957 2023-01-22 16:57:50.604011: step: 470/463, loss: 0.018220892176032066 2023-01-22 16:57:51.225708: step: 472/463, loss: 0.04116418585181236 2023-01-22 16:57:51.893320: step: 474/463, loss: 0.04712887480854988 2023-01-22 16:57:52.573392: step: 476/463, loss: 0.016370316967368126 2023-01-22 16:57:53.194224: step: 478/463, loss: 0.034324612468481064 2023-01-22 16:57:53.847843: step: 480/463, loss: 0.08447950333356857 2023-01-22 16:57:54.489259: step: 482/463, loss: 0.010231978259980679 2023-01-22 16:57:55.140516: step: 484/463, loss: 0.015600950457155704 2023-01-22 16:57:55.756253: step: 486/463, loss: 0.06997652351856232 2023-01-22 16:57:56.337982: step: 488/463, loss: 0.021421212702989578 2023-01-22 16:57:56.970463: step: 490/463, loss: 0.03769376501441002 2023-01-22 16:57:57.650891: step: 492/463, loss: 0.07068721950054169 2023-01-22 16:57:58.283237: step: 494/463, loss: 0.006244635675102472 2023-01-22 16:57:58.914426: step: 496/463, loss: 0.04079504683613777 2023-01-22 16:57:59.529151: step: 498/463, loss: 0.07517904043197632 2023-01-22 16:58:00.164504: step: 500/463, loss: 0.034827642142772675 2023-01-22 16:58:00.796465: step: 502/463, loss: 0.028365740552544594 2023-01-22 16:58:01.425643: step: 504/463, loss: 0.08840467780828476 2023-01-22 16:58:02.088854: step: 506/463, loss: 0.13625359535217285 2023-01-22 16:58:02.704731: step: 508/463, loss: 0.027245689183473587 2023-01-22 
16:58:03.288173: step: 510/463, loss: 0.02557837963104248 2023-01-22 16:58:03.948094: step: 512/463, loss: 0.020092196762561798 2023-01-22 16:58:04.540500: step: 514/463, loss: 0.03934720903635025 2023-01-22 16:58:05.071085: step: 516/463, loss: 0.07788414508104324 2023-01-22 16:58:05.844675: step: 518/463, loss: 0.025618892163038254 2023-01-22 16:58:06.479257: step: 520/463, loss: 0.04723735526204109 2023-01-22 16:58:07.065103: step: 522/463, loss: 0.00748606538400054 2023-01-22 16:58:07.624424: step: 524/463, loss: 0.039075128734111786 2023-01-22 16:58:08.255939: step: 526/463, loss: 0.01978275552392006 2023-01-22 16:58:08.856299: step: 528/463, loss: 0.030289098620414734 2023-01-22 16:58:09.448171: step: 530/463, loss: 0.0018432075157761574 2023-01-22 16:58:10.113631: step: 532/463, loss: 0.06357678771018982 2023-01-22 16:58:10.729605: step: 534/463, loss: 0.0338468924164772 2023-01-22 16:58:11.312234: step: 536/463, loss: 0.008032926358282566 2023-01-22 16:58:11.978279: step: 538/463, loss: 0.055053334683179855 2023-01-22 16:58:12.711691: step: 540/463, loss: 0.05032167211174965 2023-01-22 16:58:13.340111: step: 542/463, loss: 0.10766009986400604 2023-01-22 16:58:13.951951: step: 544/463, loss: 0.04359640181064606 2023-01-22 16:58:14.539989: step: 546/463, loss: 0.08294761180877686 2023-01-22 16:58:15.100307: step: 548/463, loss: 0.1947774589061737 2023-01-22 16:58:15.652852: step: 550/463, loss: 0.03763885051012039 2023-01-22 16:58:16.292053: step: 552/463, loss: 0.05967513471841812 2023-01-22 16:58:16.918507: step: 554/463, loss: 0.0971040353178978 2023-01-22 16:58:17.486616: step: 556/463, loss: 0.37865322828292847 2023-01-22 16:58:18.188832: step: 558/463, loss: 0.09412643313407898 2023-01-22 16:58:18.834357: step: 560/463, loss: 0.05961790308356285 2023-01-22 16:58:19.521106: step: 562/463, loss: 0.0573711171746254 2023-01-22 16:58:20.102396: step: 564/463, loss: 0.016912255436182022 2023-01-22 16:58:20.712956: step: 566/463, loss: 0.003248361172154546 
2023-01-22 16:58:21.365956: step: 568/463, loss: 4.563074111938477 2023-01-22 16:58:22.023470: step: 570/463, loss: 0.004131477326154709 2023-01-22 16:58:22.633692: step: 572/463, loss: 0.019411368295550346 2023-01-22 16:58:23.292323: step: 574/463, loss: 0.041583724319934845 2023-01-22 16:58:24.015232: step: 576/463, loss: 0.0022921790368855 2023-01-22 16:58:24.668583: step: 578/463, loss: 0.11921312659978867 2023-01-22 16:58:25.259755: step: 580/463, loss: 0.017321715131402016 2023-01-22 16:58:25.886481: step: 582/463, loss: 0.52079838514328 2023-01-22 16:58:26.553847: step: 584/463, loss: 0.03948429599404335 2023-01-22 16:58:27.242270: step: 586/463, loss: 0.09386350959539413 2023-01-22 16:58:27.875971: step: 588/463, loss: 0.09751170128583908 2023-01-22 16:58:28.536255: step: 590/463, loss: 0.030723415315151215 2023-01-22 16:58:29.142247: step: 592/463, loss: 0.06645587831735611 2023-01-22 16:58:29.833597: step: 594/463, loss: 0.04700160399079323 2023-01-22 16:58:30.372330: step: 596/463, loss: 0.00480484776198864 2023-01-22 16:58:30.956260: step: 598/463, loss: 0.009629609063267708 2023-01-22 16:58:31.555931: step: 600/463, loss: 0.021236106753349304 2023-01-22 16:58:32.151137: step: 602/463, loss: 0.5373559594154358 2023-01-22 16:58:32.761500: step: 604/463, loss: 0.0532221794128418 2023-01-22 16:58:33.318628: step: 606/463, loss: 0.03754604607820511 2023-01-22 16:58:33.957442: step: 608/463, loss: 0.06116281822323799 2023-01-22 16:58:34.597690: step: 610/463, loss: 0.014874203130602837 2023-01-22 16:58:35.422859: step: 612/463, loss: 0.16626352071762085 2023-01-22 16:58:36.080969: step: 614/463, loss: 0.06160041317343712 2023-01-22 16:58:36.688757: step: 616/463, loss: 0.0469839982688427 2023-01-22 16:58:37.300337: step: 618/463, loss: 0.005901341792196035 2023-01-22 16:58:37.921893: step: 620/463, loss: 0.1207084208726883 2023-01-22 16:58:38.510429: step: 622/463, loss: 0.040004365146160126 2023-01-22 16:58:39.181401: step: 624/463, loss: 
0.01315951719880104 2023-01-22 16:58:39.847608: step: 626/463, loss: 0.2644183039665222 2023-01-22 16:58:40.508377: step: 628/463, loss: 0.08684351295232773 2023-01-22 16:58:41.198379: step: 630/463, loss: 0.11115144193172455 2023-01-22 16:58:41.809253: step: 632/463, loss: 0.037044450640678406 2023-01-22 16:58:42.427448: step: 634/463, loss: 0.8883649706840515 2023-01-22 16:58:43.066692: step: 636/463, loss: 0.27728837728500366 2023-01-22 16:58:43.689001: step: 638/463, loss: 0.007357340771704912 2023-01-22 16:58:44.212419: step: 640/463, loss: 0.02758486568927765 2023-01-22 16:58:44.938758: step: 642/463, loss: 0.08709313720464706 2023-01-22 16:58:45.555562: step: 644/463, loss: 0.08649551868438721 2023-01-22 16:58:46.234104: step: 646/463, loss: 0.03996846824884415 2023-01-22 16:58:46.910720: step: 648/463, loss: 0.08165590465068817 2023-01-22 16:58:47.482744: step: 650/463, loss: 0.05521314591169357 2023-01-22 16:58:48.129220: step: 652/463, loss: 0.044498685747385025 2023-01-22 16:58:48.714641: step: 654/463, loss: 0.022218558937311172 2023-01-22 16:58:49.420092: step: 656/463, loss: 0.27197614312171936 2023-01-22 16:58:50.120835: step: 658/463, loss: 0.10603546351194382 2023-01-22 16:58:50.728830: step: 660/463, loss: 0.06629909574985504 2023-01-22 16:58:51.445040: step: 662/463, loss: 0.18396151065826416 2023-01-22 16:58:52.058395: step: 664/463, loss: 0.0529121495783329 2023-01-22 16:58:52.697579: step: 666/463, loss: 0.0340249165892601 2023-01-22 16:58:53.294990: step: 668/463, loss: 0.006401388440281153 2023-01-22 16:58:53.975019: step: 670/463, loss: 0.14767079055309296 2023-01-22 16:58:54.548661: step: 672/463, loss: 0.0029904914554208517 2023-01-22 16:58:55.212965: step: 674/463, loss: 0.22466996312141418 2023-01-22 16:58:55.853264: step: 676/463, loss: 0.05319199711084366 2023-01-22 16:58:56.461612: step: 678/463, loss: 0.035423438996076584 2023-01-22 16:58:57.024183: step: 680/463, loss: 0.018738234415650368 2023-01-22 16:58:57.664234: step: 682/463, 
loss: 0.05417242273688316 2023-01-22 16:58:58.313073: step: 684/463, loss: 0.033520810306072235 2023-01-22 16:58:59.003018: step: 686/463, loss: 0.13476787507534027 2023-01-22 16:58:59.634597: step: 688/463, loss: 0.099180668592453 2023-01-22 16:59:00.265481: step: 690/463, loss: 0.06774382293224335 2023-01-22 16:59:00.890841: step: 692/463, loss: 0.02167348936200142 2023-01-22 16:59:01.525387: step: 694/463, loss: 0.42495861649513245 2023-01-22 16:59:02.123719: step: 696/463, loss: 0.019821608439087868 2023-01-22 16:59:02.697068: step: 698/463, loss: 0.016630573198199272 2023-01-22 16:59:03.316129: step: 700/463, loss: 0.034024354070425034 2023-01-22 16:59:03.967460: step: 702/463, loss: 0.015306154265999794 2023-01-22 16:59:04.698352: step: 704/463, loss: 0.018720919266343117 2023-01-22 16:59:05.337821: step: 706/463, loss: 0.022080421447753906 2023-01-22 16:59:05.965256: step: 708/463, loss: 0.010876769199967384 2023-01-22 16:59:06.589397: step: 710/463, loss: 0.06322673708200455 2023-01-22 16:59:07.208296: step: 712/463, loss: 0.022721419110894203 2023-01-22 16:59:07.860814: step: 714/463, loss: 0.04646049067378044 2023-01-22 16:59:08.641614: step: 716/463, loss: 0.4346594214439392 2023-01-22 16:59:09.340152: step: 718/463, loss: 0.6926411986351013 2023-01-22 16:59:09.989130: step: 720/463, loss: 0.021722089499235153 2023-01-22 16:59:10.656337: step: 722/463, loss: 0.07009609788656235 2023-01-22 16:59:11.268702: step: 724/463, loss: 0.15431883931159973 2023-01-22 16:59:11.936289: step: 726/463, loss: 0.0412658266723156 2023-01-22 16:59:12.525832: step: 728/463, loss: 0.20546042919158936 2023-01-22 16:59:13.158291: step: 730/463, loss: 0.13148559629917145 2023-01-22 16:59:13.795502: step: 732/463, loss: 0.024479255080223083 2023-01-22 16:59:14.420496: step: 734/463, loss: 0.034617338329553604 2023-01-22 16:59:15.055457: step: 736/463, loss: 0.0718984603881836 2023-01-22 16:59:15.697118: step: 738/463, loss: 0.0949181467294693 2023-01-22 16:59:16.296508: step: 
740/463, loss: 0.11052833497524261 2023-01-22 16:59:16.901340: step: 742/463, loss: 0.01713215932250023 2023-01-22 16:59:17.468081: step: 744/463, loss: 0.052556466311216354 2023-01-22 16:59:18.040369: step: 746/463, loss: 0.025919241830706596 2023-01-22 16:59:18.703985: step: 748/463, loss: 0.05121634900569916 2023-01-22 16:59:19.347972: step: 750/463, loss: 0.5875501036643982 2023-01-22 16:59:20.018563: step: 752/463, loss: 0.02519374154508114 2023-01-22 16:59:20.616273: step: 754/463, loss: 0.003298985306173563 2023-01-22 16:59:21.271354: step: 756/463, loss: 0.050324950367212296 2023-01-22 16:59:21.927575: step: 758/463, loss: 0.011300569400191307 2023-01-22 16:59:22.638146: step: 760/463, loss: 0.2746778428554535 2023-01-22 16:59:23.259661: step: 762/463, loss: 0.02676212042570114 2023-01-22 16:59:23.851723: step: 764/463, loss: 0.016230760142207146 2023-01-22 16:59:24.488583: step: 766/463, loss: 0.02446156181395054 2023-01-22 16:59:25.111669: step: 768/463, loss: 0.10033155977725983 2023-01-22 16:59:25.736990: step: 770/463, loss: 0.03214560076594353 2023-01-22 16:59:26.391592: step: 772/463, loss: 0.02340608276426792 2023-01-22 16:59:26.983739: step: 774/463, loss: 0.01338187139481306 2023-01-22 16:59:27.637509: step: 776/463, loss: 0.08513816446065903 2023-01-22 16:59:28.250569: step: 778/463, loss: 0.09424039721488953 2023-01-22 16:59:28.845982: step: 780/463, loss: 0.012167329899966717 2023-01-22 16:59:29.508878: step: 782/463, loss: 0.032609958201646805 2023-01-22 16:59:30.085610: step: 784/463, loss: 0.0328303799033165 2023-01-22 16:59:30.805242: step: 786/463, loss: 0.010953973047435284 2023-01-22 16:59:31.486448: step: 788/463, loss: 0.10454772412776947 2023-01-22 16:59:32.142904: step: 790/463, loss: 0.004707758314907551 2023-01-22 16:59:32.759032: step: 792/463, loss: 0.03880766034126282 2023-01-22 16:59:33.431271: step: 794/463, loss: 0.011037297546863556 2023-01-22 16:59:34.023586: step: 796/463, loss: 0.027888448908925056 2023-01-22 
16:59:34.722332: step: 798/463, loss: 0.04853122681379318 2023-01-22 16:59:35.312184: step: 800/463, loss: 1.5909734964370728 2023-01-22 16:59:35.998896: step: 802/463, loss: 0.006183129735291004 2023-01-22 16:59:36.591393: step: 804/463, loss: 0.02890894189476967 2023-01-22 16:59:37.217561: step: 806/463, loss: 0.039785102009773254 2023-01-22 16:59:37.901019: step: 808/463, loss: 0.017108134925365448 2023-01-22 16:59:38.526914: step: 810/463, loss: 0.20944510400295258 2023-01-22 16:59:39.079884: step: 812/463, loss: 0.03926607966423035 2023-01-22 16:59:39.735014: step: 814/463, loss: 0.0238968338817358 2023-01-22 16:59:40.390870: step: 816/463, loss: 0.0007349143852479756 2023-01-22 16:59:40.992471: step: 818/463, loss: 0.21055850386619568 2023-01-22 16:59:41.628040: step: 820/463, loss: 0.08366881310939789 2023-01-22 16:59:42.270620: step: 822/463, loss: 0.010718808509409428 2023-01-22 16:59:42.884259: step: 824/463, loss: 0.007650543935596943 2023-01-22 16:59:43.467901: step: 826/463, loss: 0.0949612408876419 2023-01-22 16:59:44.098552: step: 828/463, loss: 0.019172823056578636 2023-01-22 16:59:44.685170: step: 830/463, loss: 0.2670556902885437 2023-01-22 16:59:45.350413: step: 832/463, loss: 0.05254850909113884 2023-01-22 16:59:45.990476: step: 834/463, loss: 0.9196890592575073 2023-01-22 16:59:46.623097: step: 836/463, loss: 0.012537769041955471 2023-01-22 16:59:47.288653: step: 838/463, loss: 0.029650073498487473 2023-01-22 16:59:47.848049: step: 840/463, loss: 0.037620775401592255 2023-01-22 16:59:48.574710: step: 842/463, loss: 0.028186824172735214 2023-01-22 16:59:49.247968: step: 844/463, loss: 0.07818823307752609 2023-01-22 16:59:49.884032: step: 846/463, loss: 0.013711227104067802 2023-01-22 16:59:50.583674: step: 848/463, loss: 0.13804399967193604 2023-01-22 16:59:51.184722: step: 850/463, loss: 0.016752654686570168 2023-01-22 16:59:51.854206: step: 852/463, loss: 0.038956981152296066 2023-01-22 16:59:52.491760: step: 854/463, loss: 0.04014114290475845 
2023-01-22 16:59:53.167095: step: 856/463, loss: 0.6125719547271729 2023-01-22 16:59:53.843863: step: 858/463, loss: 0.042675431817770004 2023-01-22 16:59:54.488022: step: 860/463, loss: 0.6113349795341492 2023-01-22 16:59:55.118548: step: 862/463, loss: 0.03162816911935806 2023-01-22 16:59:55.752684: step: 864/463, loss: 0.06500391662120819 2023-01-22 16:59:56.365204: step: 866/463, loss: 0.3485666513442993 2023-01-22 16:59:56.961280: step: 868/463, loss: 0.6008363366127014 2023-01-22 16:59:57.623079: step: 870/463, loss: 0.1132049411535263 2023-01-22 16:59:58.150173: step: 872/463, loss: 0.028075966984033585 2023-01-22 16:59:58.779335: step: 874/463, loss: 0.019048798829317093 2023-01-22 16:59:59.369167: step: 876/463, loss: 0.07738973200321198 2023-01-22 16:59:59.924400: step: 878/463, loss: 0.11745759844779968 2023-01-22 17:00:00.473671: step: 880/463, loss: 0.047934629023075104 2023-01-22 17:00:01.173426: step: 882/463, loss: 0.09423694014549255 2023-01-22 17:00:01.853786: step: 884/463, loss: 0.18730667233467102 2023-01-22 17:00:02.476906: step: 886/463, loss: 0.052175674587488174 2023-01-22 17:00:03.072953: step: 888/463, loss: 0.03832956403493881 2023-01-22 17:00:03.657384: step: 890/463, loss: 0.016272542998194695 2023-01-22 17:00:04.232734: step: 892/463, loss: 0.048094749450683594 2023-01-22 17:00:04.839885: step: 894/463, loss: 0.04599382355809212 2023-01-22 17:00:05.472215: step: 896/463, loss: 0.03661296144127846 2023-01-22 17:00:06.083386: step: 898/463, loss: 0.049422428011894226 2023-01-22 17:00:06.680029: step: 900/463, loss: 0.060927629470825195 2023-01-22 17:00:07.289479: step: 902/463, loss: 0.005963659379631281 2023-01-22 17:00:07.907172: step: 904/463, loss: 0.18699103593826294 2023-01-22 17:00:08.563928: step: 906/463, loss: 0.04739372432231903 2023-01-22 17:00:09.233211: step: 908/463, loss: 0.03402712568640709 2023-01-22 17:00:09.838059: step: 910/463, loss: 0.034771960228681564 2023-01-22 17:00:10.491658: step: 912/463, loss: 
0.06435071676969528 2023-01-22 17:00:11.086209: step: 914/463, loss: 0.014212435111403465 2023-01-22 17:00:11.745042: step: 916/463, loss: 0.009613131172955036 2023-01-22 17:00:12.351211: step: 918/463, loss: 0.016497118398547173 2023-01-22 17:00:13.007894: step: 920/463, loss: 0.027498938143253326 2023-01-22 17:00:13.560288: step: 922/463, loss: 0.018406087532639503 2023-01-22 17:00:14.281424: step: 924/463, loss: 0.07453325390815735 2023-01-22 17:00:14.913386: step: 926/463, loss: 0.5193667411804199 ================================================== Loss: 0.102 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28174586776859506, 'r': 0.32344639468690706, 'f1': 0.3011594522968198}, 'combined': 0.22190696485028824, 'epoch': 20} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35800929520358865, 'r': 0.3148982282419, 'f1': 0.3350727665415203}, 'combined': 0.23572958450157208, 'epoch': 20} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2811266447368421, 'r': 0.3243358633776091, 'f1': 0.3011894273127753}, 'combined': 0.22192905170415023, 'epoch': 20} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35088108535058565, 'r': 0.3079785945653612, 'f1': 0.328033014676594}, 'combined': 0.23290344042038175, 'epoch': 20} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20} Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 
0.6666666666666666}, 'slot': {'p': 0.2404970760233918, 'r': 0.3916666666666666, 'f1': 0.2980072463768116}, 'combined': 0.19867149758454106, 'epoch': 20} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.21794871794871795, 'r': 0.3695652173913043, 'f1': 0.27419354838709675}, 'combined': 0.13709677419354838, 'epoch': 20} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20} New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17} 
-------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20} ****************************** Epoch: 21 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 17:03:05.682094: step: 2/463, loss: 0.022008482366800308 2023-01-22 17:03:06.283900: step: 4/463, loss: 0.029988478869199753 2023-01-22 17:03:06.895183: step: 6/463, loss: 0.01874367892742157 2023-01-22 17:03:07.560650: step: 8/463, loss: 0.08505063503980637 2023-01-22 17:03:08.206958: step: 10/463, loss: 0.33817678689956665 2023-01-22 17:03:08.800707: step: 12/463, loss: 0.05574452504515648 2023-01-22 17:03:09.431888: step: 14/463, loss: 0.041506603360176086 2023-01-22 17:03:09.960986: step: 16/463, loss: 0.0046156905591487885 2023-01-22 17:03:10.515656: step: 18/463, loss: 0.025547225028276443 2023-01-22 17:03:11.133119: step: 20/463, loss: 0.020429356023669243 2023-01-22 17:03:12.473190: step: 22/463, loss: 0.0026935527566820383 2023-01-22 17:03:13.105988: step: 24/463, loss: 0.011190974153578281 2023-01-22 17:03:13.735752: step: 26/463, loss: 0.02654440887272358 2023-01-22 17:03:14.381903: step: 28/463, loss: 0.07672686129808426 2023-01-22 17:03:14.999511: step: 30/463, loss: 0.051153987646102905 2023-01-22 17:03:15.623247: step: 32/463, 
loss: 0.01669188216328621 2023-01-22 17:03:16.252499: step: 34/463, loss: 0.026145007461309433 2023-01-22 17:03:16.917132: step: 36/463, loss: 0.022429408505558968 2023-01-22 17:03:17.545108: step: 38/463, loss: 0.00846058689057827 2023-01-22 17:03:18.261334: step: 40/463, loss: 0.2107594907283783 2023-01-22 17:03:18.836700: step: 42/463, loss: 0.02715696021914482 2023-01-22 17:03:19.472781: step: 44/463, loss: 0.0189416091889143 2023-01-22 17:03:20.060594: step: 46/463, loss: 0.40652430057525635 2023-01-22 17:03:20.654907: step: 48/463, loss: 0.028207359835505486 2023-01-22 17:03:21.277979: step: 50/463, loss: 0.1154620349407196 2023-01-22 17:03:21.936259: step: 52/463, loss: 0.019459251314401627 2023-01-22 17:03:22.552761: step: 54/463, loss: 0.21148084104061127 2023-01-22 17:03:23.176120: step: 56/463, loss: 0.06065848097205162 2023-01-22 17:03:23.761451: step: 58/463, loss: 0.007095199543982744 2023-01-22 17:03:24.411473: step: 60/463, loss: 0.049874864518642426 2023-01-22 17:03:25.058425: step: 62/463, loss: 0.0462472066283226 2023-01-22 17:03:25.640912: step: 64/463, loss: 0.028083130717277527 2023-01-22 17:03:26.207052: step: 66/463, loss: 0.003161604516208172 2023-01-22 17:03:26.823798: step: 68/463, loss: 0.01670728251338005 2023-01-22 17:03:27.516872: step: 70/463, loss: 0.07545154541730881 2023-01-22 17:03:28.132775: step: 72/463, loss: 0.03556971251964569 2023-01-22 17:03:28.742146: step: 74/463, loss: 0.05403538793325424 2023-01-22 17:03:29.462706: step: 76/463, loss: 0.1220034509897232 2023-01-22 17:03:30.124801: step: 78/463, loss: 0.025854015722870827 2023-01-22 17:03:30.791347: step: 80/463, loss: 0.10299518704414368 2023-01-22 17:03:31.349821: step: 82/463, loss: 0.005664109718054533 2023-01-22 17:03:31.975556: step: 84/463, loss: 0.025291237980127335 2023-01-22 17:03:32.569735: step: 86/463, loss: 0.010893854312598705 2023-01-22 17:03:33.119981: step: 88/463, loss: 0.06616128981113434 2023-01-22 17:03:33.767720: step: 90/463, loss: 
0.015148491598665714 2023-01-22 17:03:34.418218: step: 92/463, loss: 0.015791479498147964 2023-01-22 17:03:35.001573: step: 94/463, loss: 0.02229871042072773 2023-01-22 17:03:35.635952: step: 96/463, loss: 0.0008954100194387138 2023-01-22 17:03:36.270248: step: 98/463, loss: 0.08014589548110962 2023-01-22 17:03:36.916482: step: 100/463, loss: 0.02978227660059929 2023-01-22 17:03:37.498480: step: 102/463, loss: 0.003485077526420355 2023-01-22 17:03:38.071216: step: 104/463, loss: 0.007276567630469799 2023-01-22 17:03:38.768570: step: 106/463, loss: 0.022680295631289482 2023-01-22 17:03:39.437605: step: 108/463, loss: 0.024388842284679413 2023-01-22 17:03:40.093374: step: 110/463, loss: 0.10606863349676132 2023-01-22 17:03:40.778065: step: 112/463, loss: 0.03977262228727341 2023-01-22 17:03:41.433316: step: 114/463, loss: 0.0019069232512265444 2023-01-22 17:03:42.032770: step: 116/463, loss: 0.07033708691596985 2023-01-22 17:03:42.641010: step: 118/463, loss: 0.04773767665028572 2023-01-22 17:03:43.269317: step: 120/463, loss: 0.133262038230896 2023-01-22 17:03:43.853957: step: 122/463, loss: 0.037915125489234924 2023-01-22 17:03:44.442670: step: 124/463, loss: 0.006940323859453201 2023-01-22 17:03:45.083919: step: 126/463, loss: 0.04145856201648712 2023-01-22 17:03:45.718508: step: 128/463, loss: 0.048484671860933304 2023-01-22 17:03:46.358558: step: 130/463, loss: 0.008461753837764263 2023-01-22 17:03:46.958657: step: 132/463, loss: 0.008569824509322643 2023-01-22 17:03:47.578863: step: 134/463, loss: 0.08508991450071335 2023-01-22 17:03:48.117955: step: 136/463, loss: 0.02078446000814438 2023-01-22 17:03:48.840561: step: 138/463, loss: 0.02604726329445839 2023-01-22 17:03:49.432022: step: 140/463, loss: 0.006216873414814472 2023-01-22 17:03:49.995775: step: 142/463, loss: 0.07322961837053299 2023-01-22 17:03:50.668138: step: 144/463, loss: 0.01985345035791397 2023-01-22 17:03:51.243066: step: 146/463, loss: 0.029293887317180634 2023-01-22 17:03:51.870134: step: 
148/463, loss: 0.01702212169766426 2023-01-22 17:03:52.458494: step: 150/463, loss: 0.009137047454714775 2023-01-22 17:03:53.069634: step: 152/463, loss: 0.08957723528146744 2023-01-22 17:03:53.708534: step: 154/463, loss: 0.01358967088162899 2023-01-22 17:03:54.381687: step: 156/463, loss: 0.039461325854063034 2023-01-22 17:03:55.012162: step: 158/463, loss: 0.007351674605160952 2023-01-22 17:03:55.651247: step: 160/463, loss: 0.034397151321172714 2023-01-22 17:03:56.268252: step: 162/463, loss: 0.03313220292329788 2023-01-22 17:03:56.969464: step: 164/463, loss: 0.03177512809634209 2023-01-22 17:03:57.647561: step: 166/463, loss: 0.02378370612859726 2023-01-22 17:03:58.287618: step: 168/463, loss: 0.009129060432314873 2023-01-22 17:03:58.933539: step: 170/463, loss: 0.04293675348162651 2023-01-22 17:03:59.563362: step: 172/463, loss: 0.03969128057360649 2023-01-22 17:04:00.201045: step: 174/463, loss: 0.04807805269956589 2023-01-22 17:04:00.846665: step: 176/463, loss: 0.08762156218290329 2023-01-22 17:04:01.406174: step: 178/463, loss: 0.0022141719236969948 2023-01-22 17:04:01.998173: step: 180/463, loss: 0.23819103837013245 2023-01-22 17:04:02.628219: step: 182/463, loss: 0.03713945671916008 2023-01-22 17:04:03.256322: step: 184/463, loss: 0.05133058875799179 2023-01-22 17:04:03.952838: step: 186/463, loss: 0.18070843815803528 2023-01-22 17:04:04.570922: step: 188/463, loss: 0.05851379409432411 2023-01-22 17:04:05.281386: step: 190/463, loss: 0.08885157853364944 2023-01-22 17:04:05.893354: step: 192/463, loss: 0.012084142304956913 2023-01-22 17:04:06.567218: step: 194/463, loss: 0.029860274866223335 2023-01-22 17:04:07.192237: step: 196/463, loss: 0.04741688445210457 2023-01-22 17:04:07.801111: step: 198/463, loss: 0.013846560381352901 2023-01-22 17:04:08.398424: step: 200/463, loss: 0.004107863176614046 2023-01-22 17:04:09.059959: step: 202/463, loss: 0.0883418619632721 2023-01-22 17:04:09.719913: step: 204/463, loss: 0.023268932476639748 2023-01-22 
17:04:10.389779: step: 206/463, loss: 0.03269048407673836 2023-01-22 17:04:10.992078: step: 208/463, loss: 0.0995730310678482 2023-01-22 17:04:11.630429: step: 210/463, loss: 0.08325400948524475 2023-01-22 17:04:12.274021: step: 212/463, loss: 0.019368594512343407 2023-01-22 17:04:12.917716: step: 214/463, loss: 0.013158101588487625 2023-01-22 17:04:13.588243: step: 216/463, loss: 0.041684590280056 2023-01-22 17:04:14.287462: step: 218/463, loss: 0.2172883301973343 2023-01-22 17:04:14.996705: step: 220/463, loss: 0.025261444970965385 2023-01-22 17:04:15.601939: step: 222/463, loss: 0.02492641657590866 2023-01-22 17:04:16.240404: step: 224/463, loss: 0.0009922600584104657 2023-01-22 17:04:16.887323: step: 226/463, loss: 0.025789761915802956 2023-01-22 17:04:17.530423: step: 228/463, loss: 0.14626505970954895 2023-01-22 17:04:18.198844: step: 230/463, loss: 0.08147326856851578 2023-01-22 17:04:18.828699: step: 232/463, loss: 0.06291748583316803 2023-01-22 17:04:19.481064: step: 234/463, loss: 0.13415737450122833 2023-01-22 17:04:20.105129: step: 236/463, loss: 0.004794170148670673 2023-01-22 17:04:20.713565: step: 238/463, loss: 0.04370730742812157 2023-01-22 17:04:21.272262: step: 240/463, loss: 0.022455822676420212 2023-01-22 17:04:21.886082: step: 242/463, loss: 0.0015574638964608312 2023-01-22 17:04:22.468499: step: 244/463, loss: 0.01737472601234913 2023-01-22 17:04:23.086819: step: 246/463, loss: 0.04005654901266098 2023-01-22 17:04:23.647819: step: 248/463, loss: 0.025169437751173973 2023-01-22 17:04:24.214165: step: 250/463, loss: 0.008000411093235016 2023-01-22 17:04:24.809912: step: 252/463, loss: 0.002304816385731101 2023-01-22 17:04:25.459860: step: 254/463, loss: 0.022343797609210014 2023-01-22 17:04:26.134056: step: 256/463, loss: 0.058486081659793854 2023-01-22 17:04:26.751605: step: 258/463, loss: 0.03611346706748009 2023-01-22 17:04:27.317930: step: 260/463, loss: 0.03398415073752403 2023-01-22 17:04:27.960485: step: 262/463, loss: 
0.028867607936263084 2023-01-22 17:04:28.535542: step: 264/463, loss: 0.009431576356291771 2023-01-22 17:04:29.198177: step: 266/463, loss: 0.06314925849437714 2023-01-22 17:04:29.821672: step: 268/463, loss: 0.016855956986546516 2023-01-22 17:04:30.495105: step: 270/463, loss: 0.009606849402189255 2023-01-22 17:04:31.119463: step: 272/463, loss: 0.044898103922605515 2023-01-22 17:04:31.726212: step: 274/463, loss: 0.01386989839375019 2023-01-22 17:04:32.358946: step: 276/463, loss: 0.0811362937092781 2023-01-22 17:04:32.999123: step: 278/463, loss: 0.028530778363347054 2023-01-22 17:04:33.643186: step: 280/463, loss: 0.13090461492538452 2023-01-22 17:04:34.298912: step: 282/463, loss: 0.03901318460702896 2023-01-22 17:04:34.897457: step: 284/463, loss: 0.01338057778775692 2023-01-22 17:04:35.516342: step: 286/463, loss: 0.00290060811676085 2023-01-22 17:04:36.116079: step: 288/463, loss: 0.07087630033493042 2023-01-22 17:04:36.785500: step: 290/463, loss: 0.012638784013688564 2023-01-22 17:04:37.416649: step: 292/463, loss: 0.04471646621823311 2023-01-22 17:04:38.110822: step: 294/463, loss: 0.0410781167447567 2023-01-22 17:04:38.727860: step: 296/463, loss: 0.013152619823813438 2023-01-22 17:04:39.253436: step: 298/463, loss: 0.004940604791045189 2023-01-22 17:04:39.890605: step: 300/463, loss: 0.019711941480636597 2023-01-22 17:04:40.529992: step: 302/463, loss: 0.2259039580821991 2023-01-22 17:04:41.243554: step: 304/463, loss: 0.15583130717277527 2023-01-22 17:04:41.854553: step: 306/463, loss: 0.021658288314938545 2023-01-22 17:04:42.506007: step: 308/463, loss: 0.003552973736077547 2023-01-22 17:04:43.047098: step: 310/463, loss: 0.11902958154678345 2023-01-22 17:04:43.601613: step: 312/463, loss: 0.040493644773960114 2023-01-22 17:04:44.246224: step: 314/463, loss: 0.0886593759059906 2023-01-22 17:04:44.858541: step: 316/463, loss: 0.029729273170232773 2023-01-22 17:04:45.492249: step: 318/463, loss: 0.012689848430454731 2023-01-22 17:04:46.076055: step: 
320/463, loss: 0.06161126494407654 2023-01-22 17:04:46.780634: step: 322/463, loss: 0.032552413642406464 2023-01-22 17:04:47.402694: step: 324/463, loss: 0.12655891478061676 2023-01-22 17:04:48.039668: step: 326/463, loss: 0.09107667952775955 2023-01-22 17:04:48.622257: step: 328/463, loss: 0.027906039729714394 2023-01-22 17:04:49.321020: step: 330/463, loss: 0.002341063227504492 2023-01-22 17:04:49.975139: step: 332/463, loss: 0.06716832518577576 2023-01-22 17:04:50.634140: step: 334/463, loss: 0.006494496949017048 2023-01-22 17:04:51.254016: step: 336/463, loss: 0.019810305908322334 2023-01-22 17:04:51.894779: step: 338/463, loss: 0.003307758364826441 2023-01-22 17:04:52.509951: step: 340/463, loss: 0.012600535526871681 2023-01-22 17:04:53.068887: step: 342/463, loss: 0.017224587500095367 2023-01-22 17:04:53.752983: step: 344/463, loss: 0.03407838195562363 2023-01-22 17:04:54.431586: step: 346/463, loss: 0.05280140042304993 2023-01-22 17:04:55.091134: step: 348/463, loss: 0.06186382845044136 2023-01-22 17:04:55.672316: step: 350/463, loss: 0.15974871814250946 2023-01-22 17:04:56.296728: step: 352/463, loss: 0.053387466818094254 2023-01-22 17:04:56.894253: step: 354/463, loss: 0.02263207733631134 2023-01-22 17:04:57.508628: step: 356/463, loss: 0.12856197357177734 2023-01-22 17:04:58.177048: step: 358/463, loss: 0.06802386790513992 2023-01-22 17:04:58.761473: step: 360/463, loss: 0.0747547596693039 2023-01-22 17:04:59.450267: step: 362/463, loss: 0.07135310024023056 2023-01-22 17:05:00.096770: step: 364/463, loss: 0.023682184517383575 2023-01-22 17:05:00.813234: step: 366/463, loss: 1.6382685899734497 2023-01-22 17:05:01.410867: step: 368/463, loss: 0.05660940334200859 2023-01-22 17:05:01.988927: step: 370/463, loss: 0.053998157382011414 2023-01-22 17:05:02.650804: step: 372/463, loss: 0.015436838380992413 2023-01-22 17:05:03.345020: step: 374/463, loss: 0.03999985754489899 2023-01-22 17:05:04.017510: step: 376/463, loss: 0.02227618917822838 2023-01-22 
17:05:04.677992: step: 378/463, loss: 0.006027388386428356 2023-01-22 17:05:05.349680: step: 380/463, loss: 0.062093473970890045 2023-01-22 17:05:05.987321: step: 382/463, loss: 0.0010212885681539774 2023-01-22 17:05:06.567115: step: 384/463, loss: 0.03135315701365471 2023-01-22 17:05:07.191759: step: 386/463, loss: 0.06826417148113251 2023-01-22 17:05:07.796424: step: 388/463, loss: 0.006376666948199272 2023-01-22 17:05:08.437662: step: 390/463, loss: 0.04538203030824661 2023-01-22 17:05:09.085914: step: 392/463, loss: 0.05936737731099129 2023-01-22 17:05:09.684109: step: 394/463, loss: 0.0009442442678846419 2023-01-22 17:05:10.308795: step: 396/463, loss: 0.1517576277256012 2023-01-22 17:05:10.915195: step: 398/463, loss: 0.018915748223662376 2023-01-22 17:05:11.539503: step: 400/463, loss: 0.028445499017834663 2023-01-22 17:05:12.175675: step: 402/463, loss: 0.029400646686553955 2023-01-22 17:05:12.764492: step: 404/463, loss: 0.016852660104632378 2023-01-22 17:05:13.403137: step: 406/463, loss: 0.02031405083835125 2023-01-22 17:05:14.033309: step: 408/463, loss: 0.035715844482183456 2023-01-22 17:05:14.670334: step: 410/463, loss: 0.13425001502037048 2023-01-22 17:05:15.273271: step: 412/463, loss: 0.15887081623077393 2023-01-22 17:05:15.907941: step: 414/463, loss: 0.005851370748132467 2023-01-22 17:05:16.549083: step: 416/463, loss: 0.0094437375664711 2023-01-22 17:05:17.212419: step: 418/463, loss: 0.014794398099184036 2023-01-22 17:05:17.799938: step: 420/463, loss: 0.001093297149054706 2023-01-22 17:05:18.403628: step: 422/463, loss: 0.03567489981651306 2023-01-22 17:05:19.042972: step: 424/463, loss: 0.009832983836531639 2023-01-22 17:05:19.742624: step: 426/463, loss: 0.010569760575890541 2023-01-22 17:05:20.345098: step: 428/463, loss: 0.12131989747285843 2023-01-22 17:05:20.971678: step: 430/463, loss: 0.035594791173934937 2023-01-22 17:05:21.583184: step: 432/463, loss: 0.0460851825773716 2023-01-22 17:05:22.209974: step: 434/463, loss: 
0.01689666323363781 2023-01-22 17:05:22.801670: step: 436/463, loss: 0.023740127682685852 2023-01-22 17:05:23.424165: step: 438/463, loss: 0.039711110293865204 2023-01-22 17:05:24.008697: step: 440/463, loss: 0.025187894701957703 2023-01-22 17:05:24.626141: step: 442/463, loss: 0.39497265219688416 2023-01-22 17:05:25.257495: step: 444/463, loss: 0.017680466175079346 2023-01-22 17:05:25.946815: step: 446/463, loss: 0.003804377978667617 2023-01-22 17:05:26.526986: step: 448/463, loss: 0.004656915552914143 2023-01-22 17:05:27.146570: step: 450/463, loss: 0.0301457941532135 2023-01-22 17:05:27.750498: step: 452/463, loss: 0.024035511538386345 2023-01-22 17:05:28.337703: step: 454/463, loss: 0.021909106522798538 2023-01-22 17:05:28.970554: step: 456/463, loss: 0.022178424522280693 2023-01-22 17:05:29.648074: step: 458/463, loss: 0.15144531428813934 2023-01-22 17:05:30.282282: step: 460/463, loss: 0.06766287237405777 2023-01-22 17:05:30.906240: step: 462/463, loss: 0.035993535071611404 2023-01-22 17:05:31.557757: step: 464/463, loss: 0.22480399906635284 2023-01-22 17:05:32.173568: step: 466/463, loss: 0.03108730912208557 2023-01-22 17:05:32.804565: step: 468/463, loss: 0.019457558169960976 2023-01-22 17:05:33.417130: step: 470/463, loss: 0.2333899736404419 2023-01-22 17:05:34.048253: step: 472/463, loss: 0.07232055068016052 2023-01-22 17:05:34.662988: step: 474/463, loss: 0.01934022083878517 2023-01-22 17:05:35.458362: step: 476/463, loss: 0.0322767049074173 2023-01-22 17:05:36.093915: step: 478/463, loss: 0.03563401475548744 2023-01-22 17:05:36.734890: step: 480/463, loss: 0.026971401646733284 2023-01-22 17:05:37.340177: step: 482/463, loss: 0.0007329676300287247 2023-01-22 17:05:37.939786: step: 484/463, loss: 0.016659708693623543 2023-01-22 17:05:38.534932: step: 486/463, loss: 0.04705329239368439 2023-01-22 17:05:39.202380: step: 488/463, loss: 0.08474838733673096 2023-01-22 17:05:39.809821: step: 490/463, loss: 0.06722645461559296 2023-01-22 17:05:40.464679: step: 
492/463, loss: 0.07198620587587357 2023-01-22 17:05:41.130134: step: 494/463, loss: 0.03929497301578522 2023-01-22 17:05:41.757356: step: 496/463, loss: 0.03740274906158447 2023-01-22 17:05:42.436059: step: 498/463, loss: 0.011608954519033432 2023-01-22 17:05:43.092480: step: 500/463, loss: 0.09067392349243164 2023-01-22 17:05:43.726942: step: 502/463, loss: 0.016709964722394943 2023-01-22 17:05:44.337565: step: 504/463, loss: 0.07882294058799744 2023-01-22 17:05:45.008582: step: 506/463, loss: 0.21838000416755676 2023-01-22 17:05:45.567803: step: 508/463, loss: 0.014037218876183033 2023-01-22 17:05:46.240474: step: 510/463, loss: 0.000776133849285543 2023-01-22 17:05:46.851459: step: 512/463, loss: 0.018749598413705826 2023-01-22 17:05:47.498710: step: 514/463, loss: 0.039468154311180115 2023-01-22 17:05:48.121667: step: 516/463, loss: 0.05234163627028465 2023-01-22 17:05:48.715797: step: 518/463, loss: 0.07060364633798599 2023-01-22 17:05:49.343533: step: 520/463, loss: 0.011302628554403782 2023-01-22 17:05:49.987686: step: 522/463, loss: 0.5673316121101379 2023-01-22 17:05:50.639465: step: 524/463, loss: 0.0815105140209198 2023-01-22 17:05:51.251018: step: 526/463, loss: 0.003231844399124384 2023-01-22 17:05:51.872169: step: 528/463, loss: 0.03861748054623604 2023-01-22 17:05:52.477308: step: 530/463, loss: 0.05325410142540932 2023-01-22 17:05:53.106085: step: 532/463, loss: 0.020434843376278877 2023-01-22 17:05:53.672037: step: 534/463, loss: 0.013344867154955864 2023-01-22 17:05:54.268308: step: 536/463, loss: 0.07382109761238098 2023-01-22 17:05:54.957260: step: 538/463, loss: 0.01820579543709755 2023-01-22 17:05:55.642963: step: 540/463, loss: 0.0005593604873865843 2023-01-22 17:05:56.323577: step: 542/463, loss: 0.06149774417281151 2023-01-22 17:05:56.939977: step: 544/463, loss: 0.056089434772729874 2023-01-22 17:05:57.565283: step: 546/463, loss: 0.0012331443140283227 2023-01-22 17:05:58.163529: step: 548/463, loss: 0.0037512006238102913 2023-01-22 
17:05:58.791812: step: 550/463, loss: 0.014551788568496704 2023-01-22 17:05:59.378791: step: 552/463, loss: 0.004120564088225365 2023-01-22 17:05:59.988924: step: 554/463, loss: 0.09919439256191254 2023-01-22 17:06:00.548979: step: 556/463, loss: 0.010632972232997417 2023-01-22 17:06:01.250443: step: 558/463, loss: 0.047986019402742386 2023-01-22 17:06:01.844448: step: 560/463, loss: 0.04798350855708122 2023-01-22 17:06:02.424495: step: 562/463, loss: 0.10861191898584366 2023-01-22 17:06:03.053840: step: 564/463, loss: 0.09543894976377487 2023-01-22 17:06:03.785548: step: 566/463, loss: 0.13507740199565887 2023-01-22 17:06:04.386276: step: 568/463, loss: 0.009960348717868328 2023-01-22 17:06:04.982489: step: 570/463, loss: 0.01186533086001873 2023-01-22 17:06:05.621560: step: 572/463, loss: 0.025996817275881767 2023-01-22 17:06:06.269271: step: 574/463, loss: 0.020238740369677544 2023-01-22 17:06:06.910434: step: 576/463, loss: 0.0059691439382731915 2023-01-22 17:06:07.585881: step: 578/463, loss: 0.023758655413985252 2023-01-22 17:06:08.193321: step: 580/463, loss: 0.03507672995328903 2023-01-22 17:06:08.807742: step: 582/463, loss: 0.023334074765443802 2023-01-22 17:06:09.405797: step: 584/463, loss: 0.08426858484745026 2023-01-22 17:06:10.061975: step: 586/463, loss: 0.12785480916500092 2023-01-22 17:06:10.657023: step: 588/463, loss: 0.0009382165153510869 2023-01-22 17:06:11.243625: step: 590/463, loss: 0.01552286371588707 2023-01-22 17:06:11.862709: step: 592/463, loss: 0.013905069790780544 2023-01-22 17:06:12.425260: step: 594/463, loss: 0.07818382978439331 2023-01-22 17:06:13.071077: step: 596/463, loss: 0.03273193538188934 2023-01-22 17:06:13.693619: step: 598/463, loss: 0.08542925119400024 2023-01-22 17:06:14.280276: step: 600/463, loss: 0.01674262247979641 2023-01-22 17:06:14.898180: step: 602/463, loss: 0.11450672149658203 2023-01-22 17:06:15.585404: step: 604/463, loss: 0.06275250762701035 2023-01-22 17:06:16.207976: step: 606/463, loss: 
0.04039764404296875 2023-01-22 17:06:16.922320: step: 608/463, loss: 0.0770040974020958 2023-01-22 17:06:17.536500: step: 610/463, loss: 0.02580110915005207 2023-01-22 17:06:18.148584: step: 612/463, loss: 0.02259071357548237 2023-01-22 17:06:18.760578: step: 614/463, loss: 0.0186466034501791 2023-01-22 17:06:19.380458: step: 616/463, loss: 0.05069529637694359 2023-01-22 17:06:20.011576: step: 618/463, loss: 0.04952714592218399 2023-01-22 17:06:20.593711: step: 620/463, loss: 0.03745238110423088 2023-01-22 17:06:21.165365: step: 622/463, loss: 0.002454794477671385 2023-01-22 17:06:21.751961: step: 624/463, loss: 0.22640396654605865 2023-01-22 17:06:22.424150: step: 626/463, loss: 0.039421092718839645 2023-01-22 17:06:23.075761: step: 628/463, loss: 0.001969334902241826 2023-01-22 17:06:23.750563: step: 630/463, loss: 0.0042140791192650795 2023-01-22 17:06:24.351537: step: 632/463, loss: 0.0023637895938009024 2023-01-22 17:06:24.987443: step: 634/463, loss: 0.08299047499895096 2023-01-22 17:06:25.572641: step: 636/463, loss: 0.06453372538089752 2023-01-22 17:06:26.157817: step: 638/463, loss: 0.033531419932842255 2023-01-22 17:06:26.793489: step: 640/463, loss: 0.1430540233850479 2023-01-22 17:06:27.453268: step: 642/463, loss: 0.05303196236491203 2023-01-22 17:06:28.074183: step: 644/463, loss: 0.026107240468263626 2023-01-22 17:06:28.668729: step: 646/463, loss: 0.02516034059226513 2023-01-22 17:06:29.319028: step: 648/463, loss: 0.010120201855897903 2023-01-22 17:06:29.943886: step: 650/463, loss: 0.022002574056386948 2023-01-22 17:06:30.565228: step: 652/463, loss: 0.1363283097743988 2023-01-22 17:06:31.220130: step: 654/463, loss: 0.014807868748903275 2023-01-22 17:06:31.885265: step: 656/463, loss: 0.012575142085552216 2023-01-22 17:06:32.466232: step: 658/463, loss: 0.007345051504671574 2023-01-22 17:06:33.092940: step: 660/463, loss: 0.028650455176830292 2023-01-22 17:06:33.703491: step: 662/463, loss: 0.008022154681384563 2023-01-22 17:06:34.358352: step: 
664/463, loss: 0.04508165642619133 2023-01-22 17:06:34.981187: step: 666/463, loss: 0.09993044286966324 2023-01-22 17:06:35.652370: step: 668/463, loss: 0.10364606231451035 2023-01-22 17:06:36.254337: step: 670/463, loss: 0.011971932835876942 2023-01-22 17:06:36.932125: step: 672/463, loss: 0.03310597687959671 2023-01-22 17:06:37.518452: step: 674/463, loss: 0.04759678244590759 2023-01-22 17:06:38.177988: step: 676/463, loss: 0.07852241396903992 2023-01-22 17:06:38.729190: step: 678/463, loss: 0.015355043113231659 2023-01-22 17:06:39.331872: step: 680/463, loss: 0.36714431643486023 2023-01-22 17:06:39.949167: step: 682/463, loss: 0.2849484980106354 2023-01-22 17:06:40.549503: step: 684/463, loss: 0.0416339673101902 2023-01-22 17:06:41.156120: step: 686/463, loss: 0.014312603510916233 2023-01-22 17:06:41.767622: step: 688/463, loss: 0.019896304234862328 2023-01-22 17:06:42.373270: step: 690/463, loss: 0.08474797010421753 2023-01-22 17:06:43.064081: step: 692/463, loss: 0.06567728519439697 2023-01-22 17:06:43.681511: step: 694/463, loss: 0.08197841793298721 2023-01-22 17:06:44.236130: step: 696/463, loss: 0.015771687030792236 2023-01-22 17:06:44.874281: step: 698/463, loss: 0.006728087551891804 2023-01-22 17:06:45.505127: step: 700/463, loss: 0.13797040283679962 2023-01-22 17:06:46.125485: step: 702/463, loss: 0.10487734526395798 2023-01-22 17:06:46.681524: step: 704/463, loss: 0.8632163405418396 2023-01-22 17:06:47.275648: step: 706/463, loss: 0.08381126075983047 2023-01-22 17:06:47.901791: step: 708/463, loss: 0.0610245056450367 2023-01-22 17:06:48.493467: step: 710/463, loss: 0.312435507774353 2023-01-22 17:06:49.110612: step: 712/463, loss: 0.0230458602309227 2023-01-22 17:06:49.763523: step: 714/463, loss: 0.047541867941617966 2023-01-22 17:06:50.351690: step: 716/463, loss: 0.040169671177864075 2023-01-22 17:06:50.891493: step: 718/463, loss: 0.010622736997902393 2023-01-22 17:06:51.538918: step: 720/463, loss: 0.044290874153375626 2023-01-22 17:06:52.173158: 
step: 722/463, loss: 0.0077961222268640995 2023-01-22 17:06:52.775877: step: 724/463, loss: 0.018529990687966347 2023-01-22 17:06:53.362610: step: 726/463, loss: 0.050099924206733704 2023-01-22 17:06:53.974452: step: 728/463, loss: 0.026411956176161766 2023-01-22 17:06:54.599709: step: 730/463, loss: 0.03044850565493107 2023-01-22 17:06:55.216721: step: 732/463, loss: 0.016553515568375587 2023-01-22 17:06:55.836495: step: 734/463, loss: 0.04270682483911514 2023-01-22 17:06:56.432953: step: 736/463, loss: 0.03065469115972519 2023-01-22 17:06:57.011151: step: 738/463, loss: 0.0033613373525440693 2023-01-22 17:06:57.669229: step: 740/463, loss: 0.0203634612262249 2023-01-22 17:06:58.266165: step: 742/463, loss: 0.011422252282500267 2023-01-22 17:06:58.920288: step: 744/463, loss: 0.006883529480546713 2023-01-22 17:06:59.505022: step: 746/463, loss: 0.062039814889431 2023-01-22 17:07:00.175648: step: 748/463, loss: 0.04242417961359024 2023-01-22 17:07:00.762833: step: 750/463, loss: 0.048045188188552856 2023-01-22 17:07:01.382872: step: 752/463, loss: 0.0334664061665535 2023-01-22 17:07:01.941199: step: 754/463, loss: 0.03738798946142197 2023-01-22 17:07:02.571639: step: 756/463, loss: 0.1418430060148239 2023-01-22 17:07:03.159743: step: 758/463, loss: 0.12069568037986755 2023-01-22 17:07:03.848362: step: 760/463, loss: 0.02376287244260311 2023-01-22 17:07:04.421796: step: 762/463, loss: 0.011529789306223392 2023-01-22 17:07:05.024581: step: 764/463, loss: 0.0635392889380455 2023-01-22 17:07:05.660238: step: 766/463, loss: 0.031059306114912033 2023-01-22 17:07:06.274378: step: 768/463, loss: 0.018602974712848663 2023-01-22 17:07:06.926228: step: 770/463, loss: 0.007980188354849815 2023-01-22 17:07:07.527911: step: 772/463, loss: 0.006945118308067322 2023-01-22 17:07:08.156522: step: 774/463, loss: 0.0035656304098665714 2023-01-22 17:07:08.785791: step: 776/463, loss: 0.044834204018116 2023-01-22 17:07:09.465006: step: 778/463, loss: 0.08671990036964417 2023-01-22 
17:07:10.062254: step: 780/463, loss: 0.06817467510700226 2023-01-22 17:07:10.675675: step: 782/463, loss: 0.06093546375632286 2023-01-22 17:07:11.247253: step: 784/463, loss: 0.09929952025413513 2023-01-22 17:07:11.845419: step: 786/463, loss: 0.057542286813259125 2023-01-22 17:07:12.396817: step: 788/463, loss: 0.0003967313969042152 2023-01-22 17:07:13.023918: step: 790/463, loss: 0.01445329375565052 2023-01-22 17:07:13.669135: step: 792/463, loss: 0.023365819826722145 2023-01-22 17:07:14.275926: step: 794/463, loss: 1.6078743934631348 2023-01-22 17:07:14.891649: step: 796/463, loss: 0.05605413392186165 2023-01-22 17:07:15.588824: step: 798/463, loss: 0.03238263353705406 2023-01-22 17:07:16.253791: step: 800/463, loss: 0.04528237134218216 2023-01-22 17:07:16.860562: step: 802/463, loss: 0.8978449702262878 2023-01-22 17:07:17.505134: step: 804/463, loss: 0.7818218469619751 2023-01-22 17:07:18.139381: step: 806/463, loss: 0.11709796637296677 2023-01-22 17:07:18.812987: step: 808/463, loss: 0.05932655557990074 2023-01-22 17:07:19.528380: step: 810/463, loss: 0.020915281027555466 2023-01-22 17:07:20.109890: step: 812/463, loss: 0.03243932127952576 2023-01-22 17:07:20.672838: step: 814/463, loss: 0.00761460093781352 2023-01-22 17:07:21.280827: step: 816/463, loss: 0.11169257760047913 2023-01-22 17:07:21.955267: step: 818/463, loss: 0.023898065090179443 2023-01-22 17:07:22.594795: step: 820/463, loss: 0.08242367208003998 2023-01-22 17:07:23.168356: step: 822/463, loss: 0.02316606231033802 2023-01-22 17:07:23.777148: step: 824/463, loss: 0.06097719818353653 2023-01-22 17:07:24.394730: step: 826/463, loss: 0.045977361500263214 2023-01-22 17:07:24.990691: step: 828/463, loss: 0.012211537919938564 2023-01-22 17:07:25.565465: step: 830/463, loss: 0.0205874964594841 2023-01-22 17:07:26.165666: step: 832/463, loss: 0.005921326112002134 2023-01-22 17:07:26.768479: step: 834/463, loss: 0.01432809792459011 2023-01-22 17:07:27.405824: step: 836/463, loss: 0.011056209914386272 
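Each per-step record in this log follows the pattern `<timestamp>: step: <n>/463, loss: <value>`, and the `Loss:` figure printed in the epoch summary is presumably the mean of these per-step losses. A minimal parsing sketch, assuming only that shape (the regex and the averaging are illustrative assumptions, not taken from train.py):

```python
import re

# Matches one record of the form "<timestamp>: step: <n>/<total>, loss: <value>".
STEP_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step: (?P<step>\d+)/(?P<total>\d+), loss: (?P<loss>[\d.eE+-]+)"
)

# Two entries copied verbatim from the log above.
log = (
    "2023-01-22 17:07:54.983611: step: 924/463, loss: 0.008284215815365314 "
    "2023-01-22 17:07:55.628795: step: 926/463, loss: 0.4857058525085449"
)

losses = [float(m.group("loss")) for m in STEP_RE.finditer(log)]
mean_loss = sum(losses) / len(losses)  # epoch-mean loss over the parsed steps
```

Running this over a full epoch of records would reproduce a single epoch-average figure comparable to the summary's `Loss:` line.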
2023-01-22 17:07:28.053667: step: 838/463, loss: 0.05553867295384407 2023-01-22 17:07:28.664227: step: 840/463, loss: 0.01775946654379368 2023-01-22 17:07:29.336539: step: 842/463, loss: 0.01678019016981125 2023-01-22 17:07:29.957474: step: 844/463, loss: 0.04698930308222771 2023-01-22 17:07:30.500242: step: 846/463, loss: 0.07533220946788788 2023-01-22 17:07:31.083918: step: 848/463, loss: 0.024132536724209785 2023-01-22 17:07:31.714076: step: 850/463, loss: 0.5090657472610474 2023-01-22 17:07:32.326188: step: 852/463, loss: 0.010510803200304508 2023-01-22 17:07:32.946625: step: 854/463, loss: 0.33552324771881104 2023-01-22 17:07:33.547843: step: 856/463, loss: 0.020237218588590622 2023-01-22 17:07:34.114056: step: 858/463, loss: 0.09624751657247543 2023-01-22 17:07:34.755475: step: 860/463, loss: 0.031281422823667526 2023-01-22 17:07:35.428979: step: 862/463, loss: 0.023025600239634514 2023-01-22 17:07:36.020650: step: 864/463, loss: 0.0066196550615131855 2023-01-22 17:07:36.652171: step: 866/463, loss: 0.021688468754291534 2023-01-22 17:07:37.264277: step: 868/463, loss: 0.0010393769480288029 2023-01-22 17:07:37.916549: step: 870/463, loss: 0.005214229226112366 2023-01-22 17:07:38.555061: step: 872/463, loss: 0.021304894238710403 2023-01-22 17:07:39.193672: step: 874/463, loss: 0.007582861930131912 2023-01-22 17:07:39.880703: step: 876/463, loss: 0.0267223808914423 2023-01-22 17:07:40.496476: step: 878/463, loss: 0.022991865873336792 2023-01-22 17:07:41.139524: step: 880/463, loss: 0.01653200015425682 2023-01-22 17:07:41.760886: step: 882/463, loss: 0.0010056934552267194 2023-01-22 17:07:42.394962: step: 884/463, loss: 0.05289945751428604 2023-01-22 17:07:42.972256: step: 886/463, loss: 0.021886564791202545 2023-01-22 17:07:43.673529: step: 888/463, loss: 0.08960405737161636 2023-01-22 17:07:44.295299: step: 890/463, loss: 0.003927177749574184 2023-01-22 17:07:44.881343: step: 892/463, loss: 0.02631259337067604 2023-01-22 17:07:45.461822: step: 894/463, loss: 
0.0406310148537159 2023-01-22 17:07:46.056537: step: 896/463, loss: 0.058247070759534836 2023-01-22 17:07:46.668723: step: 898/463, loss: 0.0033351515885442495 2023-01-22 17:07:47.243937: step: 900/463, loss: 0.029780665412545204 2023-01-22 17:07:47.854409: step: 902/463, loss: 2.365645408630371 2023-01-22 17:07:48.515657: step: 904/463, loss: 0.3996106684207916 2023-01-22 17:07:49.051644: step: 906/463, loss: 0.06108494848012924 2023-01-22 17:07:49.727665: step: 908/463, loss: 0.02042175456881523 2023-01-22 17:07:50.361264: step: 910/463, loss: 0.0630432739853859 2023-01-22 17:07:50.964855: step: 912/463, loss: 0.0004885736852884293 2023-01-22 17:07:51.779338: step: 914/463, loss: 0.018718397244811058 2023-01-22 17:07:52.431131: step: 916/463, loss: 0.0016363165341317654 2023-01-22 17:07:53.126085: step: 918/463, loss: 0.005637249443680048 2023-01-22 17:07:53.765254: step: 920/463, loss: 0.053009893745183945 2023-01-22 17:07:54.388266: step: 922/463, loss: 0.04637932404875755 2023-01-22 17:07:54.983611: step: 924/463, loss: 0.008284215815365314 2023-01-22 17:07:55.628795: step: 926/463, loss: 0.4857058525085449
==================================================
Loss: 0.069
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28191302652106087, 'r': 0.342896110056926, 'f1': 0.3094285102739726}, 'combined': 0.22799995493871664, 'epoch': 21}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3384979966952492, 'r': 0.32696837060693945, 'f1': 0.332633304615678}, 'combined': 0.2340133801316328, 'epoch': 21}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.291068515497553, 'r': 0.3385673624288425, 'f1': 0.3130263157894737}, 'combined': 0.23065096952908587, 'epoch': 21}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34050941074485314, 'r': 0.3235582872405242, 'f1': 0.33181750012575034}, 'combined': 0.23559042508928274, 'epoch': 21}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.291850884244373, 'r': 0.3444615749525617, 'f1': 0.3159812880765884}, 'combined': 0.23282831753011773, 'epoch': 21}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3538062394812509, 'r': 0.30838308035134354, 'f1': 0.3295367494188412}, 'combined': 0.23397109208737724, 'epoch': 21}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2626488095238095, 'r': 0.4202380952380952, 'f1': 0.3232600732600732}, 'combined': 0.21550671550671546, 'epoch': 21}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2847222222222222, 'r': 0.44565217391304346, 'f1': 0.3474576271186441}, 'combined': 0.17372881355932204, 'epoch': 21}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42105263157894735, 'r': 0.27586206896551724, 'f1': 0.3333333333333333}, 'combined': 0.2222222222222222, 'epoch': 21}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 22
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 17:10:36.281737: step: 2/463, loss: 0.013563413172960281 2023-01-22 17:10:36.939782: step: 4/463, loss: 0.027686724439263344 2023-01-22 17:10:37.474793: step: 6/463, loss: 0.016995931044220924 2023-01-22 17:10:38.060057: step: 8/463, loss: 0.0042908089235424995 2023-01-22 17:10:38.691370: step: 10/463, loss: 0.01713423617184162 2023-01-22 17:10:39.327019: step: 12/463, loss: 0.002881145104765892 2023-01-22 17:10:40.029888: step: 14/463, loss:
0.021302253007888794 2023-01-22 17:10:40.576716: step: 16/463, loss: 0.012144491076469421 2023-01-22 17:10:41.182676: step: 18/463, loss: 0.022809915244579315 2023-01-22 17:10:41.888459: step: 20/463, loss: 0.04144177958369255 2023-01-22 17:10:42.514772: step: 22/463, loss: 0.03930528834462166 2023-01-22 17:10:43.109539: step: 24/463, loss: 0.009283098392188549 2023-01-22 17:10:43.677129: step: 26/463, loss: 0.02163575030863285 2023-01-22 17:10:44.263079: step: 28/463, loss: 0.005954277701675892 2023-01-22 17:10:44.915450: step: 30/463, loss: 0.03411492332816124 2023-01-22 17:10:45.563755: step: 32/463, loss: 0.002053888514637947 2023-01-22 17:10:46.216808: step: 34/463, loss: 0.01332179456949234 2023-01-22 17:10:46.899701: step: 36/463, loss: 0.05226156488060951 2023-01-22 17:10:47.562133: step: 38/463, loss: 0.028072770684957504 2023-01-22 17:10:48.249958: step: 40/463, loss: 0.031267743557691574 2023-01-22 17:10:48.911056: step: 42/463, loss: 0.015865441411733627 2023-01-22 17:10:49.572521: step: 44/463, loss: 0.03088334947824478 2023-01-22 17:10:50.136592: step: 46/463, loss: 0.06349833309650421 2023-01-22 17:10:50.764001: step: 48/463, loss: 0.051087215542793274 2023-01-22 17:10:51.388693: step: 50/463, loss: 0.7793177366256714 2023-01-22 17:10:51.988733: step: 52/463, loss: 0.08643217384815216 2023-01-22 17:10:52.645685: step: 54/463, loss: 0.015515283681452274 2023-01-22 17:10:53.205950: step: 56/463, loss: 0.09505452960729599 2023-01-22 17:10:53.774278: step: 58/463, loss: 0.036231376230716705 2023-01-22 17:10:54.420816: step: 60/463, loss: 0.008632997050881386 2023-01-22 17:10:55.043290: step: 62/463, loss: 0.02432946115732193 2023-01-22 17:10:55.656173: step: 64/463, loss: 0.0027217117603868246 2023-01-22 17:10:56.297272: step: 66/463, loss: 0.03582718223333359 2023-01-22 17:10:56.901188: step: 68/463, loss: 0.08178508281707764 2023-01-22 17:10:57.515370: step: 70/463, loss: 0.012343664653599262 2023-01-22 17:10:58.186188: step: 72/463, loss: 
0.02994297258555889 2023-01-22 17:10:58.832311: step: 74/463, loss: 0.012609326280653477 2023-01-22 17:10:59.429100: step: 76/463, loss: 0.027291543781757355 2023-01-22 17:11:00.009250: step: 78/463, loss: 0.00982663780450821 2023-01-22 17:11:00.542996: step: 80/463, loss: 0.0010083981323987246 2023-01-22 17:11:01.169578: step: 82/463, loss: 0.023982752114534378 2023-01-22 17:11:01.857529: step: 84/463, loss: 0.013572655618190765 2023-01-22 17:11:02.521739: step: 86/463, loss: 0.025363372638821602 2023-01-22 17:11:03.114067: step: 88/463, loss: 0.00859833788126707 2023-01-22 17:11:03.697007: step: 90/463, loss: 0.02871275134384632 2023-01-22 17:11:04.283046: step: 92/463, loss: 0.01284545287489891 2023-01-22 17:11:04.929622: step: 94/463, loss: 0.022092290222644806 2023-01-22 17:11:05.535208: step: 96/463, loss: 0.05295825004577637 2023-01-22 17:11:06.156905: step: 98/463, loss: 0.0008643814944662154 2023-01-22 17:11:06.764155: step: 100/463, loss: 0.04345071688294411 2023-01-22 17:11:07.393505: step: 102/463, loss: 0.004079500678926706 2023-01-22 17:11:08.068224: step: 104/463, loss: 0.04956790804862976 2023-01-22 17:11:08.657160: step: 106/463, loss: 0.08838072419166565 2023-01-22 17:11:09.351746: step: 108/463, loss: 0.055859848856925964 2023-01-22 17:11:09.974458: step: 110/463, loss: 0.020128486678004265 2023-01-22 17:11:10.579942: step: 112/463, loss: 0.022582082077860832 2023-01-22 17:11:11.172819: step: 114/463, loss: 0.0380750447511673 2023-01-22 17:11:11.776672: step: 116/463, loss: 0.01886761747300625 2023-01-22 17:11:12.415077: step: 118/463, loss: 0.0155165521427989 2023-01-22 17:11:12.986506: step: 120/463, loss: 0.045420918613672256 2023-01-22 17:11:13.505765: step: 122/463, loss: 0.01767851412296295 2023-01-22 17:11:14.037163: step: 124/463, loss: 0.029305074363946915 2023-01-22 17:11:14.608730: step: 126/463, loss: 0.015495275147259235 2023-01-22 17:11:15.207896: step: 128/463, loss: 0.0022032982669770718 2023-01-22 17:11:15.816813: step: 130/463, 
loss: 0.011123518459498882 2023-01-22 17:11:16.468801: step: 132/463, loss: 0.03400957211852074 2023-01-22 17:11:17.124250: step: 134/463, loss: 0.02978605031967163 2023-01-22 17:11:17.726315: step: 136/463, loss: 0.02157764695584774 2023-01-22 17:11:18.356891: step: 138/463, loss: 0.06356684118509293 2023-01-22 17:11:18.981493: step: 140/463, loss: 0.03214133158326149 2023-01-22 17:11:19.591611: step: 142/463, loss: 0.0074737779796123505 2023-01-22 17:11:20.265555: step: 144/463, loss: 0.06261921674013138 2023-01-22 17:11:20.864855: step: 146/463, loss: 0.008794319815933704 2023-01-22 17:11:21.499526: step: 148/463, loss: 0.04661964252591133 2023-01-22 17:11:22.153210: step: 150/463, loss: 0.01853261888027191 2023-01-22 17:11:22.750964: step: 152/463, loss: 0.02204013615846634 2023-01-22 17:11:23.345003: step: 154/463, loss: 0.002404892584308982 2023-01-22 17:11:23.936740: step: 156/463, loss: 0.0021966365166008472 2023-01-22 17:11:24.537375: step: 158/463, loss: 0.03265184536576271 2023-01-22 17:11:25.100127: step: 160/463, loss: 0.0012091726530343294 2023-01-22 17:11:25.737183: step: 162/463, loss: 0.03373891860246658 2023-01-22 17:11:26.294647: step: 164/463, loss: 0.023059731349349022 2023-01-22 17:11:26.943120: step: 166/463, loss: 0.0013733680825680494 2023-01-22 17:11:27.537035: step: 168/463, loss: 0.11976668983697891 2023-01-22 17:11:28.176374: step: 170/463, loss: 0.0005224257474765182 2023-01-22 17:11:28.773902: step: 172/463, loss: 0.026743799448013306 2023-01-22 17:11:29.362495: step: 174/463, loss: 0.06131874397397041 2023-01-22 17:11:29.978775: step: 176/463, loss: 0.006924128625541925 2023-01-22 17:11:30.720558: step: 178/463, loss: 0.006872483529150486 2023-01-22 17:11:31.393750: step: 180/463, loss: 0.04494689404964447 2023-01-22 17:11:32.022871: step: 182/463, loss: 0.044343478977680206 2023-01-22 17:11:32.682316: step: 184/463, loss: 0.0020337977912276983 2023-01-22 17:11:33.329171: step: 186/463, loss: 0.011643885634839535 2023-01-22 
17:11:33.922831: step: 188/463, loss: 0.012045054696500301 2023-01-22 17:11:34.561303: step: 190/463, loss: 0.01983756199479103 2023-01-22 17:11:35.161986: step: 192/463, loss: 0.042664580047130585 2023-01-22 17:11:35.849495: step: 194/463, loss: 0.08944869786500931 2023-01-22 17:11:36.510816: step: 196/463, loss: 0.054104484617710114 2023-01-22 17:11:37.143489: step: 198/463, loss: 0.095198854804039 2023-01-22 17:11:37.767265: step: 200/463, loss: 0.05829924717545509 2023-01-22 17:11:38.401629: step: 202/463, loss: 0.046723417937755585 2023-01-22 17:11:39.029634: step: 204/463, loss: 0.004173466004431248 2023-01-22 17:11:39.673684: step: 206/463, loss: 0.04351614788174629 2023-01-22 17:11:40.430991: step: 208/463, loss: 0.04002128541469574 2023-01-22 17:11:41.128250: step: 210/463, loss: 0.16399826109409332 2023-01-22 17:11:41.697675: step: 212/463, loss: 0.013926751911640167 2023-01-22 17:11:42.327168: step: 214/463, loss: 0.07999598234891891 2023-01-22 17:11:42.961140: step: 216/463, loss: 0.04332979768514633 2023-01-22 17:11:43.583887: step: 218/463, loss: 0.07990018278360367 2023-01-22 17:11:44.251099: step: 220/463, loss: 0.15040315687656403 2023-01-22 17:11:44.935289: step: 222/463, loss: 0.04178351163864136 2023-01-22 17:11:45.480285: step: 224/463, loss: 0.01731269061565399 2023-01-22 17:11:46.129016: step: 226/463, loss: 0.039577241986989975 2023-01-22 17:11:46.748398: step: 228/463, loss: 0.030887238681316376 2023-01-22 17:11:47.380652: step: 230/463, loss: 0.1715202033519745 2023-01-22 17:11:47.990097: step: 232/463, loss: 0.04403403773903847 2023-01-22 17:11:48.632166: step: 234/463, loss: 0.05182133615016937 2023-01-22 17:11:49.224421: step: 236/463, loss: 0.007900946773588657 2023-01-22 17:11:49.866003: step: 238/463, loss: 0.028874559327960014 2023-01-22 17:11:50.487337: step: 240/463, loss: 0.007244710810482502 2023-01-22 17:11:51.122602: step: 242/463, loss: 0.1025867760181427 2023-01-22 17:11:51.683109: step: 244/463, loss: 0.006252650637179613 
2023-01-22 17:11:52.328984: step: 246/463, loss: 0.04906067997217178 2023-01-22 17:11:53.003443: step: 248/463, loss: 0.04808478057384491 2023-01-22 17:11:53.620730: step: 250/463, loss: 0.03452625870704651 2023-01-22 17:11:54.246277: step: 252/463, loss: 0.05751759186387062 2023-01-22 17:11:54.891479: step: 254/463, loss: 0.08386038988828659 2023-01-22 17:11:55.470916: step: 256/463, loss: 0.09407392889261246 2023-01-22 17:11:56.104750: step: 258/463, loss: 0.011393817141652107 2023-01-22 17:11:56.754649: step: 260/463, loss: 0.02577880211174488 2023-01-22 17:11:57.351624: step: 262/463, loss: 0.03964587301015854 2023-01-22 17:11:57.967823: step: 264/463, loss: 0.04590174928307533 2023-01-22 17:11:58.555928: step: 266/463, loss: 0.036120444536209106 2023-01-22 17:11:59.155384: step: 268/463, loss: 0.10283565521240234 2023-01-22 17:11:59.747071: step: 270/463, loss: 0.01423379685729742 2023-01-22 17:12:00.362715: step: 272/463, loss: 0.21537049114704132 2023-01-22 17:12:01.031761: step: 274/463, loss: 0.023580417037010193 2023-01-22 17:12:01.715372: step: 276/463, loss: 0.007297035772353411 2023-01-22 17:12:02.336151: step: 278/463, loss: 0.037084463983774185 2023-01-22 17:12:03.048422: step: 280/463, loss: 0.027498440816998482 2023-01-22 17:12:03.729273: step: 282/463, loss: 0.06275985389947891 2023-01-22 17:12:04.384411: step: 284/463, loss: 0.04020778089761734 2023-01-22 17:12:04.983783: step: 286/463, loss: 0.03513509780168533 2023-01-22 17:12:05.564212: step: 288/463, loss: 0.010877633467316628 2023-01-22 17:12:06.222756: step: 290/463, loss: 1.2258589267730713 2023-01-22 17:12:06.854649: step: 292/463, loss: 0.08278675377368927 2023-01-22 17:12:07.507026: step: 294/463, loss: 0.0210158359259367 2023-01-22 17:12:08.180649: step: 296/463, loss: 0.0418945774435997 2023-01-22 17:12:08.794920: step: 298/463, loss: 0.011203783564269543 2023-01-22 17:12:09.435883: step: 300/463, loss: 0.6963186264038086 2023-01-22 17:12:10.084333: step: 302/463, loss: 
0.006312732584774494 2023-01-22 17:12:10.729631: step: 304/463, loss: 0.09560777992010117 2023-01-22 17:12:11.333315: step: 306/463, loss: 0.0929030030965805 2023-01-22 17:12:11.992054: step: 308/463, loss: 0.09828118234872818 2023-01-22 17:12:12.553745: step: 310/463, loss: 0.02609098330140114 2023-01-22 17:12:13.134550: step: 312/463, loss: 0.07369411736726761 2023-01-22 17:12:13.819861: step: 314/463, loss: 0.48333102464675903 2023-01-22 17:12:14.450650: step: 316/463, loss: 0.0036630574613809586 2023-01-22 17:12:15.049517: step: 318/463, loss: 0.008865703828632832 2023-01-22 17:12:15.765129: step: 320/463, loss: 0.020232485607266426 2023-01-22 17:12:16.406077: step: 322/463, loss: 0.2081756293773651 2023-01-22 17:12:17.035386: step: 324/463, loss: 0.02812458947300911 2023-01-22 17:12:17.598904: step: 326/463, loss: 0.010162158869206905 2023-01-22 17:12:18.207308: step: 328/463, loss: 0.015865325927734375 2023-01-22 17:12:18.848390: step: 330/463, loss: 1.2262437343597412 2023-01-22 17:12:19.519455: step: 332/463, loss: 0.02222246490418911 2023-01-22 17:12:20.179540: step: 334/463, loss: 0.0032969950698316097 2023-01-22 17:12:20.836545: step: 336/463, loss: 0.05294954776763916 2023-01-22 17:12:21.427186: step: 338/463, loss: 0.030851278454065323 2023-01-22 17:12:22.041565: step: 340/463, loss: 0.012105780653655529 2023-01-22 17:12:22.682944: step: 342/463, loss: 0.06529495120048523 2023-01-22 17:12:23.274405: step: 344/463, loss: 0.03589235991239548 2023-01-22 17:12:23.909033: step: 346/463, loss: 1.6785163879394531 2023-01-22 17:12:24.534810: step: 348/463, loss: 0.0415109321475029 2023-01-22 17:12:25.123192: step: 350/463, loss: 0.04612249135971069 2023-01-22 17:12:25.750064: step: 352/463, loss: 0.021317025646567345 2023-01-22 17:12:26.469253: step: 354/463, loss: 0.11501602083444595 2023-01-22 17:12:27.066426: step: 356/463, loss: 0.00674433633685112 2023-01-22 17:12:27.727085: step: 358/463, loss: 0.7569242715835571 2023-01-22 17:12:28.332902: step: 
360/463, loss: 0.020515209063887596 2023-01-22 17:12:28.990610: step: 362/463, loss: 0.10614895075559616 2023-01-22 17:12:29.582094: step: 364/463, loss: 0.027117570862174034 2023-01-22 17:12:30.239270: step: 366/463, loss: 0.03812817111611366 2023-01-22 17:12:30.823564: step: 368/463, loss: 0.7994940280914307 2023-01-22 17:12:31.461427: step: 370/463, loss: 0.06824885308742523 2023-01-22 17:12:32.055144: step: 372/463, loss: 0.29690033197402954 2023-01-22 17:12:32.773835: step: 374/463, loss: 0.008621599525213242 2023-01-22 17:12:33.455515: step: 376/463, loss: 0.23360788822174072 2023-01-22 17:12:34.040178: step: 378/463, loss: 0.10709181427955627 2023-01-22 17:12:34.696906: step: 380/463, loss: 0.10683143883943558 2023-01-22 17:12:35.288585: step: 382/463, loss: 0.4179520010948181 2023-01-22 17:12:35.892295: step: 384/463, loss: 0.010522992350161076 2023-01-22 17:12:36.631136: step: 386/463, loss: 0.07373275607824326 2023-01-22 17:12:37.203107: step: 388/463, loss: 0.0052343448624014854 2023-01-22 17:12:37.814192: step: 390/463, loss: 0.020403338596224785 2023-01-22 17:12:38.373870: step: 392/463, loss: 0.06538897007703781 2023-01-22 17:12:39.007165: step: 394/463, loss: 0.011114916764199734 2023-01-22 17:12:39.632653: step: 396/463, loss: 0.009620853699743748 2023-01-22 17:12:40.294975: step: 398/463, loss: 0.0025538557674735785 2023-01-22 17:12:40.979260: step: 400/463, loss: 0.022205937653779984 2023-01-22 17:12:41.612188: step: 402/463, loss: 0.581816554069519 2023-01-22 17:12:42.191192: step: 404/463, loss: 0.04103419557213783 2023-01-22 17:12:42.808779: step: 406/463, loss: 0.3224335014820099 2023-01-22 17:12:43.398745: step: 408/463, loss: 0.008435601368546486 2023-01-22 17:12:44.085192: step: 410/463, loss: 0.013043609447777271 2023-01-22 17:12:44.711514: step: 412/463, loss: 0.7903401255607605 2023-01-22 17:12:45.306908: step: 414/463, loss: 0.02529788948595524 2023-01-22 17:12:45.944174: step: 416/463, loss: 0.015711113810539246 2023-01-22 
17:12:46.565087: step: 418/463, loss: 0.11086002737283707 2023-01-22 17:12:47.210831: step: 420/463, loss: 0.03177548572421074 2023-01-22 17:12:47.851159: step: 422/463, loss: 0.005693338345736265 2023-01-22 17:12:48.580200: step: 424/463, loss: 0.035534437745809555 2023-01-22 17:12:49.179138: step: 426/463, loss: 0.034687794744968414 2023-01-22 17:12:49.832776: step: 428/463, loss: 0.029139943420886993 2023-01-22 17:12:50.503813: step: 430/463, loss: 0.04404307156801224 2023-01-22 17:12:51.233868: step: 432/463, loss: 0.0537167489528656 2023-01-22 17:12:51.873597: step: 434/463, loss: 0.0464879535138607 2023-01-22 17:12:52.513277: step: 436/463, loss: 0.0037006582133471966 2023-01-22 17:12:53.118818: step: 438/463, loss: 0.02211572229862213 2023-01-22 17:12:53.740364: step: 440/463, loss: 0.0015138108283281326 2023-01-22 17:12:54.425072: step: 442/463, loss: 0.011896266601979733 2023-01-22 17:12:55.049877: step: 444/463, loss: 0.001578322146087885 2023-01-22 17:12:55.635972: step: 446/463, loss: 0.011550191789865494 2023-01-22 17:12:56.268377: step: 448/463, loss: 0.026150180026888847 2023-01-22 17:12:56.948490: step: 450/463, loss: 0.012388618662953377 2023-01-22 17:12:57.554839: step: 452/463, loss: 0.028825758025050163 2023-01-22 17:12:58.148258: step: 454/463, loss: 0.00706181675195694 2023-01-22 17:12:58.794677: step: 456/463, loss: 0.062083348631858826 2023-01-22 17:12:59.475915: step: 458/463, loss: 0.013195201754570007 2023-01-22 17:13:00.110403: step: 460/463, loss: 0.01238645613193512 2023-01-22 17:13:00.711922: step: 462/463, loss: 0.01704845204949379 2023-01-22 17:13:01.397853: step: 464/463, loss: 0.013495084829628468 2023-01-22 17:13:01.992993: step: 466/463, loss: 0.07680729031562805 2023-01-22 17:13:02.636392: step: 468/463, loss: 0.0003301684046164155 2023-01-22 17:13:03.153611: step: 470/463, loss: 0.03692224249243736 2023-01-22 17:13:03.714185: step: 472/463, loss: 0.02365754172205925 2023-01-22 17:13:04.329445: step: 474/463, loss: 
0.11717119067907333 2023-01-22 17:13:04.898193: step: 476/463, loss: 0.012651990167796612 2023-01-22 17:13:05.554884: step: 478/463, loss: 0.04487377777695656 2023-01-22 17:13:06.236423: step: 480/463, loss: 1.0240111351013184 2023-01-22 17:13:06.919393: step: 482/463, loss: 0.023005060851573944 2023-01-22 17:13:07.524903: step: 484/463, loss: 0.026540931314229965 2023-01-22 17:13:08.172272: step: 486/463, loss: 0.0699540376663208 2023-01-22 17:13:08.808809: step: 488/463, loss: 0.06316050887107849 2023-01-22 17:13:09.475764: step: 490/463, loss: 0.016835026443004608 2023-01-22 17:13:10.109894: step: 492/463, loss: 0.012087613344192505 2023-01-22 17:13:10.690134: step: 494/463, loss: 0.08639559894800186 2023-01-22 17:13:11.372838: step: 496/463, loss: 0.5190495252609253 2023-01-22 17:13:11.912197: step: 498/463, loss: 0.03531806170940399 2023-01-22 17:13:12.458184: step: 500/463, loss: 0.05010486766695976 2023-01-22 17:13:13.125277: step: 502/463, loss: 0.0932348370552063 2023-01-22 17:13:13.784387: step: 504/463, loss: 0.053074758499860764 2023-01-22 17:13:14.298899: step: 506/463, loss: 0.013884359039366245 2023-01-22 17:13:14.893127: step: 508/463, loss: 0.014796269126236439 2023-01-22 17:13:15.518204: step: 510/463, loss: 0.01313356775790453 2023-01-22 17:13:16.186739: step: 512/463, loss: 0.012099887244403362 2023-01-22 17:13:16.752192: step: 514/463, loss: 0.0034200982190668583 2023-01-22 17:13:17.371989: step: 516/463, loss: 0.021783558651804924 2023-01-22 17:13:17.956536: step: 518/463, loss: 0.00556019926443696 2023-01-22 17:13:18.683290: step: 520/463, loss: 0.17949852347373962 2023-01-22 17:13:19.335760: step: 522/463, loss: 0.04827951639890671 2023-01-22 17:13:20.103635: step: 524/463, loss: 0.030402008444070816 2023-01-22 17:13:20.679159: step: 526/463, loss: 0.03898661583662033 2023-01-22 17:13:21.269943: step: 528/463, loss: 0.01171800959855318 2023-01-22 17:13:21.974896: step: 530/463, loss: 0.07287195324897766 2023-01-22 17:13:22.612218: step: 
532/463, loss: 0.3649541735649109 2023-01-22 17:13:23.245307: step: 534/463, loss: 0.027874968945980072 2023-01-22 17:13:23.881137: step: 536/463, loss: 0.07704052329063416 2023-01-22 17:13:24.482293: step: 538/463, loss: 0.009613605216145515 2023-01-22 17:13:25.077636: step: 540/463, loss: 0.02339683286845684 2023-01-22 17:13:25.662545: step: 542/463, loss: 0.00648613041266799 2023-01-22 17:13:26.273913: step: 544/463, loss: 0.02178863435983658 2023-01-22 17:13:26.833599: step: 546/463, loss: 0.3488987386226654 2023-01-22 17:13:27.401188: step: 548/463, loss: 0.005571399815380573 2023-01-22 17:13:27.954574: step: 550/463, loss: 0.005527704954147339 2023-01-22 17:13:28.674489: step: 552/463, loss: 0.01688903383910656 2023-01-22 17:13:29.296097: step: 554/463, loss: 0.07669171690940857 2023-01-22 17:13:29.924282: step: 556/463, loss: 0.013890649192035198 2023-01-22 17:13:30.511157: step: 558/463, loss: 0.09763424843549728 2023-01-22 17:13:31.167067: step: 560/463, loss: 0.08211641013622284 2023-01-22 17:13:31.807175: step: 562/463, loss: 0.011355492286384106 2023-01-22 17:13:32.401620: step: 564/463, loss: 0.005021478980779648 2023-01-22 17:13:33.016173: step: 566/463, loss: 0.029434701427817345 2023-01-22 17:13:33.768230: step: 568/463, loss: 0.030581358820199966 2023-01-22 17:13:34.505618: step: 570/463, loss: 0.007987498305737972 2023-01-22 17:13:35.065899: step: 572/463, loss: 0.07487676292657852 2023-01-22 17:13:35.661560: step: 574/463, loss: 0.09364102780818939 2023-01-22 17:13:36.239103: step: 576/463, loss: 0.03839296102523804 2023-01-22 17:13:36.937032: step: 578/463, loss: 0.019017299637198448 2023-01-22 17:13:37.523000: step: 580/463, loss: 0.06900999695062637 2023-01-22 17:13:38.122368: step: 582/463, loss: 0.09364496916532516 2023-01-22 17:13:38.734040: step: 584/463, loss: 0.020546218380331993 2023-01-22 17:13:39.335795: step: 586/463, loss: 0.025251159444451332 2023-01-22 17:13:39.992137: step: 588/463, loss: 0.03777123615145683 2023-01-22 
17:13:40.574314: step: 590/463, loss: 0.006622961722314358 2023-01-22 17:13:41.239231: step: 592/463, loss: 0.06497672200202942 2023-01-22 17:13:41.877530: step: 594/463, loss: 0.08352644741535187 2023-01-22 17:13:42.553581: step: 596/463, loss: 0.020658286288380623 2023-01-22 17:13:43.195980: step: 598/463, loss: 0.3913595378398895 2023-01-22 17:13:43.757933: step: 600/463, loss: 0.03869215399026871 2023-01-22 17:13:44.371525: step: 602/463, loss: 0.007375158369541168 2023-01-22 17:13:44.984597: step: 604/463, loss: 0.04663262888789177 2023-01-22 17:13:45.589809: step: 606/463, loss: 0.041728585958480835 2023-01-22 17:13:46.189301: step: 608/463, loss: 0.036133792251348495 2023-01-22 17:13:46.776402: step: 610/463, loss: 0.025601839646697044 2023-01-22 17:13:47.389143: step: 612/463, loss: 0.016820082440972328 2023-01-22 17:13:48.060794: step: 614/463, loss: 0.5794793367385864 2023-01-22 17:13:48.648789: step: 616/463, loss: 0.05411287397146225 2023-01-22 17:13:49.339830: step: 618/463, loss: 0.05692407116293907 2023-01-22 17:13:49.979718: step: 620/463, loss: 0.041848305612802505 2023-01-22 17:13:50.574359: step: 622/463, loss: 0.00617753341794014 2023-01-22 17:13:51.161356: step: 624/463, loss: 0.05162177234888077 2023-01-22 17:13:51.816106: step: 626/463, loss: 0.1265338808298111 2023-01-22 17:13:52.399160: step: 628/463, loss: 0.10994291305541992 2023-01-22 17:13:53.065471: step: 630/463, loss: 0.010973530821502209 2023-01-22 17:13:53.773873: step: 632/463, loss: 0.14767488837242126 2023-01-22 17:13:54.365928: step: 634/463, loss: 0.13200104236602783 2023-01-22 17:13:54.933199: step: 636/463, loss: 0.10118882358074188 2023-01-22 17:13:55.538228: step: 638/463, loss: 0.03843146935105324 2023-01-22 17:13:56.144686: step: 640/463, loss: 0.016174932941794395 2023-01-22 17:13:56.852564: step: 642/463, loss: 0.06340375542640686 2023-01-22 17:13:57.453284: step: 644/463, loss: 0.003897025715559721 2023-01-22 17:13:58.110115: step: 646/463, loss: 0.009921206161379814 
2023-01-22 17:13:58.676071: step: 648/463, loss: 0.014023966155946255 2023-01-22 17:13:59.338645: step: 650/463, loss: 0.005584945436567068 2023-01-22 17:13:59.964710: step: 652/463, loss: 0.011161362752318382 2023-01-22 17:14:00.592641: step: 654/463, loss: 0.033452730625867844 2023-01-22 17:14:01.277317: step: 656/463, loss: 0.6672853827476501 2023-01-22 17:14:01.880031: step: 658/463, loss: 0.010777494870126247 2023-01-22 17:14:02.462033: step: 660/463, loss: 0.013961912132799625 2023-01-22 17:14:03.074333: step: 662/463, loss: 0.09277906268835068 2023-01-22 17:14:03.639179: step: 664/463, loss: 0.055470533668994904 2023-01-22 17:14:04.242618: step: 666/463, loss: 0.009314214810729027 2023-01-22 17:14:04.817356: step: 668/463, loss: 0.016500145196914673 2023-01-22 17:14:05.524063: step: 670/463, loss: 0.01951323263347149 2023-01-22 17:14:06.179049: step: 672/463, loss: 0.012328967452049255 2023-01-22 17:14:06.743146: step: 674/463, loss: 0.007565303705632687 2023-01-22 17:14:07.319736: step: 676/463, loss: 0.05677812173962593 2023-01-22 17:14:07.932202: step: 678/463, loss: 0.042429450899362564 2023-01-22 17:14:08.561817: step: 680/463, loss: 0.014553539454936981 2023-01-22 17:14:09.117039: step: 682/463, loss: 0.005215069279074669 2023-01-22 17:14:09.776643: step: 684/463, loss: 0.05484722927212715 2023-01-22 17:14:10.401888: step: 686/463, loss: 0.007780789397656918 2023-01-22 17:14:11.026918: step: 688/463, loss: 0.05414300411939621 2023-01-22 17:14:11.597224: step: 690/463, loss: 0.05062070116400719 2023-01-22 17:14:12.226365: step: 692/463, loss: 0.009477240033447742 2023-01-22 17:14:12.831781: step: 694/463, loss: 0.10330638289451599 2023-01-22 17:14:13.400966: step: 696/463, loss: 0.011186743155121803 2023-01-22 17:14:13.943725: step: 698/463, loss: 0.026758011430501938 2023-01-22 17:14:14.534153: step: 700/463, loss: 0.08817766606807709 2023-01-22 17:14:15.175442: step: 702/463, loss: 0.01583784818649292 2023-01-22 17:14:15.819469: step: 704/463, loss: 
0.06759083271026611 2023-01-22 17:14:16.505209: step: 706/463, loss: 0.03433932363986969 2023-01-22 17:14:17.118403: step: 708/463, loss: 0.004085755906999111 2023-01-22 17:14:17.768162: step: 710/463, loss: 0.03228599205613136 2023-01-22 17:14:18.392721: step: 712/463, loss: 0.042199574410915375 2023-01-22 17:14:19.035722: step: 714/463, loss: 0.06078357249498367 2023-01-22 17:14:19.575948: step: 716/463, loss: 0.05979737266898155 2023-01-22 17:14:20.207742: step: 718/463, loss: 0.005694372579455376 2023-01-22 17:14:20.834634: step: 720/463, loss: 0.002384510124102235 2023-01-22 17:14:21.438030: step: 722/463, loss: 0.8429024815559387 2023-01-22 17:14:22.033984: step: 724/463, loss: 0.029743095859885216 2023-01-22 17:14:22.672389: step: 726/463, loss: 0.0015543088084086776 2023-01-22 17:14:23.274351: step: 728/463, loss: 0.05636022984981537 2023-01-22 17:14:23.897436: step: 730/463, loss: 0.010792577639222145 2023-01-22 17:14:24.533595: step: 732/463, loss: 0.017636822536587715 2023-01-22 17:14:25.139149: step: 734/463, loss: 0.0020220899023115635 2023-01-22 17:14:25.846813: step: 736/463, loss: 0.05371855944395065 2023-01-22 17:14:26.430925: step: 738/463, loss: 0.050043825060129166 2023-01-22 17:14:27.074649: step: 740/463, loss: 0.09638512134552002 2023-01-22 17:14:27.665505: step: 742/463, loss: 0.03681471571326256 2023-01-22 17:14:28.359250: step: 744/463, loss: 0.023226486518979073 2023-01-22 17:14:29.073966: step: 746/463, loss: 0.05908375233411789 2023-01-22 17:14:29.701255: step: 748/463, loss: 0.036077726632356644 2023-01-22 17:14:30.285038: step: 750/463, loss: 0.0007341078016906977 2023-01-22 17:14:30.907243: step: 752/463, loss: 0.02228367328643799 2023-01-22 17:14:31.626541: step: 754/463, loss: 0.032006774097681046 2023-01-22 17:14:32.234785: step: 756/463, loss: 0.02814781852066517 2023-01-22 17:14:32.853130: step: 758/463, loss: 0.04964398592710495 2023-01-22 17:14:33.462567: step: 760/463, loss: 0.08994432538747787 2023-01-22 17:14:34.138369: 
step: 762/463, loss: 0.08794959634542465 2023-01-22 17:14:34.757477: step: 764/463, loss: 0.04584171995520592 2023-01-22 17:14:35.371204: step: 766/463, loss: 0.020694615319371223 2023-01-22 17:14:36.001799: step: 768/463, loss: 0.010873031802475452 2023-01-22 17:14:36.616380: step: 770/463, loss: 0.015189928002655506 2023-01-22 17:14:37.198514: step: 772/463, loss: 0.014821401797235012 2023-01-22 17:14:37.878603: step: 774/463, loss: 0.009979063645005226 2023-01-22 17:14:38.475492: step: 776/463, loss: 0.01155173871666193 2023-01-22 17:14:39.183549: step: 778/463, loss: 0.03265469893813133 2023-01-22 17:14:39.869476: step: 780/463, loss: 0.005357522517442703 2023-01-22 17:14:40.427884: step: 782/463, loss: 0.03511936962604523 2023-01-22 17:14:41.006642: step: 784/463, loss: 0.00030760906520299613 2023-01-22 17:14:41.693158: step: 786/463, loss: 0.014124150387942791 2023-01-22 17:14:42.301596: step: 788/463, loss: 0.05552303045988083 2023-01-22 17:14:42.941748: step: 790/463, loss: 3.3117932616733015e-05 2023-01-22 17:14:43.542824: step: 792/463, loss: 0.01662527583539486 2023-01-22 17:14:44.132070: step: 794/463, loss: 0.026058165356516838 2023-01-22 17:14:44.734272: step: 796/463, loss: 0.008907283656299114 2023-01-22 17:14:45.319508: step: 798/463, loss: 0.012906046584248543 2023-01-22 17:14:45.908841: step: 800/463, loss: 0.010848150588572025 2023-01-22 17:14:46.503539: step: 802/463, loss: 0.009968670085072517 2023-01-22 17:14:47.175898: step: 804/463, loss: 0.034957848489284515 2023-01-22 17:14:47.727213: step: 806/463, loss: 0.07747875154018402 2023-01-22 17:14:48.385647: step: 808/463, loss: 0.07693058997392654 2023-01-22 17:14:49.024375: step: 810/463, loss: 0.047312136739492416 2023-01-22 17:14:49.646551: step: 812/463, loss: 0.06979115307331085 2023-01-22 17:14:50.260154: step: 814/463, loss: 0.017651531845331192 2023-01-22 17:14:50.921907: step: 816/463, loss: 0.039687737822532654 2023-01-22 17:14:51.479422: step: 818/463, loss: 0.02474035881459713 
2023-01-22 17:14:52.149600: step: 820/463, loss: 0.015986165031790733 2023-01-22 17:14:52.698636: step: 822/463, loss: 0.005214245989918709 2023-01-22 17:14:53.347964: step: 824/463, loss: 0.013402801938354969 2023-01-22 17:14:54.054779: step: 826/463, loss: 0.06216780096292496 2023-01-22 17:14:54.666531: step: 828/463, loss: 0.11508358269929886 2023-01-22 17:14:55.262919: step: 830/463, loss: 0.02690371684730053 2023-01-22 17:14:55.958631: step: 832/463, loss: 0.13311836123466492 2023-01-22 17:14:56.566526: step: 834/463, loss: 0.08067207783460617 2023-01-22 17:14:57.184854: step: 836/463, loss: 0.0660458654165268 2023-01-22 17:14:57.798080: step: 838/463, loss: 0.16069719195365906 2023-01-22 17:14:58.386140: step: 840/463, loss: 0.03663317859172821 2023-01-22 17:14:58.972581: step: 842/463, loss: 0.0014651352539658546 2023-01-22 17:14:59.619122: step: 844/463, loss: 0.04699042811989784 2023-01-22 17:15:00.257604: step: 846/463, loss: 0.06638261675834656 2023-01-22 17:15:00.891210: step: 848/463, loss: 0.021002106368541718 2023-01-22 17:15:01.496563: step: 850/463, loss: 0.018220746889710426 2023-01-22 17:15:02.125834: step: 852/463, loss: 0.011116367764770985 2023-01-22 17:15:02.740899: step: 854/463, loss: 0.014238220639526844 2023-01-22 17:15:03.351878: step: 856/463, loss: 0.05706850811839104 2023-01-22 17:15:04.024569: step: 858/463, loss: 0.07921537011861801 2023-01-22 17:15:04.804290: step: 860/463, loss: 0.08001001179218292 2023-01-22 17:15:05.390970: step: 862/463, loss: 0.01564917340874672 2023-01-22 17:15:05.961786: step: 864/463, loss: 0.012040984816849232 2023-01-22 17:15:06.618570: step: 866/463, loss: 0.04217517003417015 2023-01-22 17:15:07.198877: step: 868/463, loss: 0.026142189279198647 2023-01-22 17:15:07.807297: step: 870/463, loss: 0.6649972796440125 2023-01-22 17:15:08.412800: step: 872/463, loss: 0.09306490421295166 2023-01-22 17:15:09.007530: step: 874/463, loss: 0.17080290615558624 2023-01-22 17:15:09.598279: step: 876/463, loss: 
0.030153919011354446
2023-01-22 17:15:10.251991: step: 878/463, loss: 0.002523594070225954
2023-01-22 17:15:10.921549: step: 880/463, loss: 0.04563302919268608
2023-01-22 17:15:11.594629: step: 882/463, loss: 0.08882565051317215
2023-01-22 17:15:12.214006: step: 884/463, loss: 0.14906835556030273
2023-01-22 17:15:12.860431: step: 886/463, loss: 0.036489829421043396
2023-01-22 17:15:13.505547: step: 888/463, loss: 0.019559592008590698
2023-01-22 17:15:14.212806: step: 890/463, loss: 0.009866524487733841
2023-01-22 17:15:14.804502: step: 892/463, loss: 0.025303857401013374
2023-01-22 17:15:15.386814: step: 894/463, loss: 0.0018611943814903498
2023-01-22 17:15:15.998036: step: 896/463, loss: 0.0530954971909523
2023-01-22 17:15:16.687084: step: 898/463, loss: 0.3510603606700897
2023-01-22 17:15:17.331520: step: 900/463, loss: 0.0023971442133188248
2023-01-22 17:15:17.934305: step: 902/463, loss: 0.04074025899171829
2023-01-22 17:15:18.599015: step: 904/463, loss: 0.11753255873918533
2023-01-22 17:15:19.253967: step: 906/463, loss: 0.06982971727848053
2023-01-22 17:15:19.877653: step: 908/463, loss: 0.019340883940458298
2023-01-22 17:15:20.481590: step: 910/463, loss: 0.006668656133115292
2023-01-22 17:15:21.150122: step: 912/463, loss: 0.049222931265830994
2023-01-22 17:15:21.730988: step: 914/463, loss: 0.10472872108221054
2023-01-22 17:15:22.351819: step: 916/463, loss: 0.01673835702240467
2023-01-22 17:15:22.943550: step: 918/463, loss: 0.03419507294893265
2023-01-22 17:15:23.585896: step: 920/463, loss: 0.08963220566511154
2023-01-22 17:15:24.281203: step: 922/463, loss: 0.09727652370929718
2023-01-22 17:15:24.902712: step: 924/463, loss: 0.029025373980402946
2023-01-22 17:15:25.509057: step: 926/463, loss: 0.05501796305179596
==================================================
Loss: 0.071
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2758798984034833, 'r': 0.36068548387096777, 'f1': 0.3126336348684211}, 'combined': 0.2303616256925208, 'epoch': 22}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.33520418603042307, 'r': 0.33900667679691127, 'f1': 0.3370947085546727}, 'combined': 0.23715205626961902, 'epoch': 22}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28127749266862173, 'r': 0.3640061669829222, 'f1': 0.3173387096774194}, 'combined': 0.23382852292020376, 'epoch': 22}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3309096821235833, 'r': 0.33408595306892314, 'f1': 0.33249023205990963}, 'combined': 0.23606806476253583, 'epoch': 22}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28504931643799986, 'r': 0.3656420074233167, 'f1': 0.3203546764955742}, 'combined': 0.23605081425989677, 'epoch': 22}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34833035469302703, 'r': 0.32401409956611416, 'f1': 0.33573251184698627}, 'combined': 0.23837008341136023, 'epoch': 22}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23996913580246912, 'r': 0.3702380952380952, 'f1': 0.29119850187265917}, 'combined': 0.19413233458177276, 'epoch': 22}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2616279069767442, 'r': 0.4891304347826087, 'f1': 0.34090909090909094}, 'combined': 0.17045454545454547, 'epoch': 22}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3804347826086957, 'r': 0.3017241379310345, 'f1': 0.3365384615384615}, 'combined': 0.22435897435897434, 'epoch': 22}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 23
command: python train.py --model_name slot
--xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 17:18:06.182626: step: 2/463, loss: 0.005892541259527206 2023-01-22 17:18:06.765025: step: 4/463, loss: 0.010192710906267166 2023-01-22 17:18:07.432512: step: 6/463, loss: 0.05443151295185089 2023-01-22 17:18:08.010126: step: 8/463, loss: 0.01612916961312294 2023-01-22 17:18:08.610693: step: 10/463, loss: 0.009892472065985203 2023-01-22 17:18:09.248873: step: 12/463, loss: 0.3891589641571045 2023-01-22 17:18:09.947605: step: 14/463, loss: 0.06694084405899048 2023-01-22 17:18:10.656519: step: 16/463, loss: 0.034992605447769165 2023-01-22 17:18:11.292557: step: 18/463, loss: 0.024312308058142662 2023-01-22 17:18:11.995832: step: 20/463, loss: 0.06371854990720749 2023-01-22 17:18:12.668737: step: 22/463, loss: 0.009520432911813259 2023-01-22 17:18:13.285464: step: 24/463, loss: 0.07230748981237411 2023-01-22 17:18:13.939941: step: 26/463, loss: 0.006076466757804155 2023-01-22 17:18:14.649182: step: 28/463, loss: 0.5243937373161316 2023-01-22 17:18:15.336399: step: 30/463, loss: 0.005940806586295366 2023-01-22 17:18:15.909029: step: 32/463, loss: 0.005302266217768192 2023-01-22 17:18:16.483123: step: 34/463, loss: 0.02308773063123226 2023-01-22 17:18:17.141944: step: 36/463, loss: 0.036868866533041 2023-01-22 17:18:17.839921: step: 38/463, loss: 0.039779182523489 2023-01-22 17:18:18.464929: step: 40/463, loss: 0.02297835238277912 2023-01-22 17:18:19.108498: step: 42/463, loss: 0.22360916435718536 2023-01-22 17:18:19.744424: step: 44/463, loss: 0.005737891420722008 2023-01-22 17:18:20.324298: step: 46/463, loss: 0.015915706753730774 2023-01-22 17:18:20.902265: step: 48/463, loss: 0.09529364109039307 2023-01-22 17:18:21.619141: step: 50/463, loss: 0.02168303355574608 2023-01-22 17:18:22.208099: step: 52/463, loss: 0.01993958093225956 2023-01-22 17:18:22.866626: step: 54/463, loss: 
0.0014373899903148413 2023-01-22 17:18:23.497587: step: 56/463, loss: 0.04297472909092903 2023-01-22 17:18:24.145104: step: 58/463, loss: 0.011658867821097374 2023-01-22 17:18:24.836860: step: 60/463, loss: 0.0595899298787117 2023-01-22 17:18:25.496715: step: 62/463, loss: 0.024417098611593246 2023-01-22 17:18:26.117041: step: 64/463, loss: 0.06643234938383102 2023-01-22 17:18:26.700374: step: 66/463, loss: 0.09744423627853394 2023-01-22 17:18:27.259375: step: 68/463, loss: 0.05132581293582916 2023-01-22 17:18:27.879661: step: 70/463, loss: 0.0020039891824126244 2023-01-22 17:18:28.505241: step: 72/463, loss: 0.007011362351477146 2023-01-22 17:18:29.122731: step: 74/463, loss: 0.000963249709457159 2023-01-22 17:18:29.723455: step: 76/463, loss: 0.003521341597661376 2023-01-22 17:18:30.352071: step: 78/463, loss: 0.024169012904167175 2023-01-22 17:18:30.954158: step: 80/463, loss: 0.029294351115822792 2023-01-22 17:18:31.554722: step: 82/463, loss: 1.0961735248565674 2023-01-22 17:18:32.196059: step: 84/463, loss: 0.08384102582931519 2023-01-22 17:18:32.787783: step: 86/463, loss: 0.028012923896312714 2023-01-22 17:18:33.444196: step: 88/463, loss: 0.02362704649567604 2023-01-22 17:18:34.113849: step: 90/463, loss: 0.07629059255123138 2023-01-22 17:18:34.735158: step: 92/463, loss: 0.10695347934961319 2023-01-22 17:18:35.380400: step: 94/463, loss: 0.011698407121002674 2023-01-22 17:18:35.951539: step: 96/463, loss: 0.01555577851831913 2023-01-22 17:18:36.519208: step: 98/463, loss: 0.015713734552264214 2023-01-22 17:18:37.120057: step: 100/463, loss: 0.05116893723607063 2023-01-22 17:18:37.713491: step: 102/463, loss: 0.03930802270770073 2023-01-22 17:18:38.278238: step: 104/463, loss: 0.008638635277748108 2023-01-22 17:18:38.844977: step: 106/463, loss: 0.0053259022533893585 2023-01-22 17:18:39.475654: step: 108/463, loss: 0.03919784352183342 2023-01-22 17:18:40.145521: step: 110/463, loss: 0.02070312388241291 2023-01-22 17:18:40.764027: step: 112/463, loss: 
0.024203702807426453 2023-01-22 17:18:41.398961: step: 114/463, loss: 0.010861345566809177 2023-01-22 17:18:42.023058: step: 116/463, loss: 0.0460004098713398 2023-01-22 17:18:42.655605: step: 118/463, loss: 0.03444872051477432 2023-01-22 17:18:43.220254: step: 120/463, loss: 0.030014755204319954 2023-01-22 17:18:43.883576: step: 122/463, loss: 0.06193523481488228 2023-01-22 17:18:44.465774: step: 124/463, loss: 0.00955293234437704 2023-01-22 17:18:45.020463: step: 126/463, loss: 0.0012385817244648933 2023-01-22 17:18:45.630409: step: 128/463, loss: 0.038992706686258316 2023-01-22 17:18:46.177255: step: 130/463, loss: 0.00649621058255434 2023-01-22 17:18:46.786096: step: 132/463, loss: 0.011049095541238785 2023-01-22 17:18:47.438252: step: 134/463, loss: 0.20624427497386932 2023-01-22 17:18:48.092609: step: 136/463, loss: 0.020530832931399345 2023-01-22 17:18:48.672347: step: 138/463, loss: 0.014963815920054913 2023-01-22 17:18:49.334866: step: 140/463, loss: 0.007141227833926678 2023-01-22 17:18:49.966571: step: 142/463, loss: 0.04325651004910469 2023-01-22 17:18:50.593467: step: 144/463, loss: 0.03578656539320946 2023-01-22 17:18:51.273823: step: 146/463, loss: 0.023921608924865723 2023-01-22 17:18:51.872384: step: 148/463, loss: 0.012401938438415527 2023-01-22 17:18:52.554861: step: 150/463, loss: 0.0007657934911549091 2023-01-22 17:18:53.148566: step: 152/463, loss: 0.016983021050691605 2023-01-22 17:18:53.808695: step: 154/463, loss: 0.03177740424871445 2023-01-22 17:18:54.443645: step: 156/463, loss: 0.08286339789628983 2023-01-22 17:18:55.128616: step: 158/463, loss: 0.0016042310744524002 2023-01-22 17:18:55.695606: step: 160/463, loss: 0.036790765821933746 2023-01-22 17:18:56.346513: step: 162/463, loss: 0.37995806336402893 2023-01-22 17:18:57.012244: step: 164/463, loss: 0.011168638244271278 2023-01-22 17:18:57.665588: step: 166/463, loss: 0.06446376442909241 2023-01-22 17:18:58.266941: step: 168/463, loss: 1.9233320951461792 2023-01-22 17:18:58.942430: 
step: 170/463, loss: 0.027794063091278076 2023-01-22 17:18:59.564132: step: 172/463, loss: 0.006951446179300547 2023-01-22 17:19:00.203641: step: 174/463, loss: 0.030443254858255386 2023-01-22 17:19:00.770983: step: 176/463, loss: 2.280064344406128 2023-01-22 17:19:01.416272: step: 178/463, loss: 0.00327964685857296 2023-01-22 17:19:02.022989: step: 180/463, loss: 0.03174437955021858 2023-01-22 17:19:02.666276: step: 182/463, loss: 0.03662707656621933 2023-01-22 17:19:03.292672: step: 184/463, loss: 0.029410753399133682 2023-01-22 17:19:03.930462: step: 186/463, loss: 0.04096364229917526 2023-01-22 17:19:04.541229: step: 188/463, loss: 0.04170026630163193 2023-01-22 17:19:05.096420: step: 190/463, loss: 0.014283590018749237 2023-01-22 17:19:05.784369: step: 192/463, loss: 0.028489021584391594 2023-01-22 17:19:06.399460: step: 194/463, loss: 0.028533540666103363 2023-01-22 17:19:07.017995: step: 196/463, loss: 0.05597316101193428 2023-01-22 17:19:07.653967: step: 198/463, loss: 0.01959366910159588 2023-01-22 17:19:08.221861: step: 200/463, loss: 0.01876150816679001 2023-01-22 17:19:08.843544: step: 202/463, loss: 5.152926921844482 2023-01-22 17:19:09.490444: step: 204/463, loss: 0.007571298163384199 2023-01-22 17:19:10.152698: step: 206/463, loss: 0.012597275897860527 2023-01-22 17:19:10.781852: step: 208/463, loss: 0.16445235908031464 2023-01-22 17:19:11.404184: step: 210/463, loss: 0.08912359923124313 2023-01-22 17:19:12.066245: step: 212/463, loss: 0.016077691689133644 2023-01-22 17:19:12.663172: step: 214/463, loss: 0.02574787475168705 2023-01-22 17:19:13.268156: step: 216/463, loss: 0.12700188159942627 2023-01-22 17:19:13.870865: step: 218/463, loss: 0.020649759098887444 2023-01-22 17:19:14.458932: step: 220/463, loss: 0.02249561809003353 2023-01-22 17:19:15.126121: step: 222/463, loss: 0.05387392267584801 2023-01-22 17:19:15.740138: step: 224/463, loss: 0.06289222091436386 2023-01-22 17:19:16.300292: step: 226/463, loss: 0.028519917279481888 2023-01-22 
17:19:16.976026: step: 228/463, loss: 0.0172868762165308 2023-01-22 17:19:17.528442: step: 230/463, loss: 0.0009495491976849735 2023-01-22 17:19:18.145371: step: 232/463, loss: 0.042658425867557526 2023-01-22 17:19:18.750914: step: 234/463, loss: 0.002979355165734887 2023-01-22 17:19:19.369396: step: 236/463, loss: 0.020633934065699577 2023-01-22 17:19:20.022964: step: 238/463, loss: 0.007021602708846331 2023-01-22 17:19:20.629180: step: 240/463, loss: 0.015684904530644417 2023-01-22 17:19:21.280152: step: 242/463, loss: 1.3897475004196167 2023-01-22 17:19:21.922266: step: 244/463, loss: 0.02102472633123398 2023-01-22 17:19:22.509050: step: 246/463, loss: 0.03760921210050583 2023-01-22 17:19:23.240698: step: 248/463, loss: 0.09580725431442261 2023-01-22 17:19:23.865203: step: 250/463, loss: 0.052749838680028915 2023-01-22 17:19:24.557975: step: 252/463, loss: 0.05554341897368431 2023-01-22 17:19:25.125615: step: 254/463, loss: 0.0010748887434601784 2023-01-22 17:19:25.743857: step: 256/463, loss: 0.05419585108757019 2023-01-22 17:19:26.362191: step: 258/463, loss: 0.017516447231173515 2023-01-22 17:19:26.970956: step: 260/463, loss: 0.00013826134090777487 2023-01-22 17:19:27.737430: step: 262/463, loss: 0.021989811211824417 2023-01-22 17:19:28.326492: step: 264/463, loss: 0.030769724398851395 2023-01-22 17:19:28.975096: step: 266/463, loss: 0.00020292814588174224 2023-01-22 17:19:29.589947: step: 268/463, loss: 0.009271759539842606 2023-01-22 17:19:30.269473: step: 270/463, loss: 0.03550155833363533 2023-01-22 17:19:30.909117: step: 272/463, loss: 0.010885486379265785 2023-01-22 17:19:31.465622: step: 274/463, loss: 0.028866959735751152 2023-01-22 17:19:32.023299: step: 276/463, loss: 0.004384224768728018 2023-01-22 17:19:32.615691: step: 278/463, loss: 0.01474601961672306 2023-01-22 17:19:33.198860: step: 280/463, loss: 0.08131208270788193 2023-01-22 17:19:33.787471: step: 282/463, loss: 0.076774962246418 2023-01-22 17:19:34.381972: step: 284/463, loss: 
0.059826355427503586 2023-01-22 17:19:34.990885: step: 286/463, loss: 0.01489231362938881 2023-01-22 17:19:35.599148: step: 288/463, loss: 0.0409623384475708 2023-01-22 17:19:36.252754: step: 290/463, loss: 0.016631444916129112 2023-01-22 17:19:36.925551: step: 292/463, loss: 3.1487433910369873 2023-01-22 17:19:37.637265: step: 294/463, loss: 0.026927508413791656 2023-01-22 17:19:38.264028: step: 296/463, loss: 0.05589478462934494 2023-01-22 17:19:38.867527: step: 298/463, loss: 0.012181058526039124 2023-01-22 17:19:39.473837: step: 300/463, loss: 0.027618834748864174 2023-01-22 17:19:40.036487: step: 302/463, loss: 0.12885434925556183 2023-01-22 17:19:40.638311: step: 304/463, loss: 0.0649406686425209 2023-01-22 17:19:41.268515: step: 306/463, loss: 0.003937111236155033 2023-01-22 17:19:41.858201: step: 308/463, loss: 0.01266536209732294 2023-01-22 17:19:42.439690: step: 310/463, loss: 0.0030766241252422333 2023-01-22 17:19:43.033975: step: 312/463, loss: 0.010793567635118961 2023-01-22 17:19:43.695616: step: 314/463, loss: 0.005253665614873171 2023-01-22 17:19:44.377601: step: 316/463, loss: 0.0005801338120363653 2023-01-22 17:19:45.033131: step: 318/463, loss: 0.0013944539241492748 2023-01-22 17:19:45.662370: step: 320/463, loss: 0.03126085549592972 2023-01-22 17:19:46.293606: step: 322/463, loss: 0.01161214429885149 2023-01-22 17:19:46.963637: step: 324/463, loss: 0.0135923121124506 2023-01-22 17:19:47.514980: step: 326/463, loss: 0.02638242393732071 2023-01-22 17:19:48.162046: step: 328/463, loss: 0.04806811362504959 2023-01-22 17:19:48.757725: step: 330/463, loss: 0.0029666952323168516 2023-01-22 17:19:49.425163: step: 332/463, loss: 0.10732391476631165 2023-01-22 17:19:50.086215: step: 334/463, loss: 0.0409528873860836 2023-01-22 17:19:50.689231: step: 336/463, loss: 0.4554470181465149 2023-01-22 17:19:51.337582: step: 338/463, loss: 0.07143368571996689 2023-01-22 17:19:52.009756: step: 340/463, loss: 0.008603817783296108 2023-01-22 17:19:52.657431: step: 
342/463, loss: 0.016352256760001183 2023-01-22 17:19:53.270962: step: 344/463, loss: 0.0049014464020729065 2023-01-22 17:19:53.824602: step: 346/463, loss: 0.02046714536845684 2023-01-22 17:19:54.379152: step: 348/463, loss: 0.007810044102370739 2023-01-22 17:19:55.095897: step: 350/463, loss: 0.0074748508632183075 2023-01-22 17:19:55.707589: step: 352/463, loss: 1.3127591609954834 2023-01-22 17:19:56.376251: step: 354/463, loss: 0.040429987013339996 2023-01-22 17:19:56.975954: step: 356/463, loss: 0.0015518247382715344 2023-01-22 17:19:57.649554: step: 358/463, loss: 0.02201058343052864 2023-01-22 17:19:58.263449: step: 360/463, loss: 0.00019243858696427196 2023-01-22 17:19:58.942767: step: 362/463, loss: 0.0905686616897583 2023-01-22 17:19:59.625282: step: 364/463, loss: 0.33184656500816345 2023-01-22 17:20:00.217407: step: 366/463, loss: 0.014664266258478165 2023-01-22 17:20:00.770460: step: 368/463, loss: 0.03596504405140877 2023-01-22 17:20:01.415358: step: 370/463, loss: 0.023809248581528664 2023-01-22 17:20:01.999742: step: 372/463, loss: 0.008440676145255566 2023-01-22 17:20:02.637614: step: 374/463, loss: 0.01318934466689825 2023-01-22 17:20:03.275151: step: 376/463, loss: 0.06201139837503433 2023-01-22 17:20:03.925283: step: 378/463, loss: 0.02205021306872368 2023-01-22 17:20:04.497061: step: 380/463, loss: 0.007236981764435768 2023-01-22 17:20:05.168250: step: 382/463, loss: 0.11772746592760086 2023-01-22 17:20:05.801657: step: 384/463, loss: 0.01801762543618679 2023-01-22 17:20:06.451651: step: 386/463, loss: 0.030071480199694633 2023-01-22 17:20:07.042657: step: 388/463, loss: 0.028685346245765686 2023-01-22 17:20:07.736740: step: 390/463, loss: 0.0295648705214262 2023-01-22 17:20:08.340469: step: 392/463, loss: 0.019567226991057396 2023-01-22 17:20:08.943137: step: 394/463, loss: 0.01965329237282276 2023-01-22 17:20:09.463021: step: 396/463, loss: 6.529844540636986e-05 2023-01-22 17:20:10.063296: step: 398/463, loss: 0.11065395176410675 2023-01-22 
17:20:10.627721: step: 400/463, loss: 0.001248741871677339 2023-01-22 17:20:11.238318: step: 402/463, loss: 0.021731076762080193 2023-01-22 17:20:11.801028: step: 404/463, loss: 0.0017541839042678475 2023-01-22 17:20:12.467889: step: 406/463, loss: 0.05884702876210213 2023-01-22 17:20:13.098407: step: 408/463, loss: 0.019196441397070885 2023-01-22 17:20:13.736792: step: 410/463, loss: 0.13302166759967804 2023-01-22 17:20:14.411653: step: 412/463, loss: 0.009090296924114227 2023-01-22 17:20:14.972987: step: 414/463, loss: 0.006756724789738655 2023-01-22 17:20:15.624779: step: 416/463, loss: 0.024685584008693695 2023-01-22 17:20:16.244657: step: 418/463, loss: 0.012323005124926567 2023-01-22 17:20:16.829454: step: 420/463, loss: 0.007339778821915388 2023-01-22 17:20:17.437433: step: 422/463, loss: 0.08368469029664993 2023-01-22 17:20:18.061745: step: 424/463, loss: 0.020390504971146584 2023-01-22 17:20:18.629346: step: 426/463, loss: 0.027731232345104218 2023-01-22 17:20:19.242324: step: 428/463, loss: 0.04462944343686104 2023-01-22 17:20:19.875293: step: 430/463, loss: 0.030412673950195312 2023-01-22 17:20:20.476077: step: 432/463, loss: 0.04352433979511261 2023-01-22 17:20:21.098255: step: 434/463, loss: 0.0022284891456365585 2023-01-22 17:20:21.704325: step: 436/463, loss: 0.044847987592220306 2023-01-22 17:20:22.352516: step: 438/463, loss: 0.010303664021193981 2023-01-22 17:20:23.059910: step: 440/463, loss: 0.0013882527127861977 2023-01-22 17:20:23.674502: step: 442/463, loss: 0.015433170832693577 2023-01-22 17:20:24.363675: step: 444/463, loss: 0.04074522480368614 2023-01-22 17:20:24.991094: step: 446/463, loss: 0.03773694485425949 2023-01-22 17:20:25.625159: step: 448/463, loss: 0.006459082011133432 2023-01-22 17:20:26.229875: step: 450/463, loss: 0.04458160698413849 2023-01-22 17:20:26.875458: step: 452/463, loss: 0.08224526792764664 2023-01-22 17:20:27.525055: step: 454/463, loss: 0.07221974432468414 2023-01-22 17:20:28.063871: step: 456/463, loss: 
0.0016192406183108687 2023-01-22 17:20:28.698649: step: 458/463, loss: 0.017631731927394867 2023-01-22 17:20:29.289192: step: 460/463, loss: 0.015431506559252739 2023-01-22 17:20:29.937644: step: 462/463, loss: 0.0236737709492445 2023-01-22 17:20:30.522661: step: 464/463, loss: 0.007640978321433067 2023-01-22 17:20:31.139510: step: 466/463, loss: 0.01753704808652401 2023-01-22 17:20:31.794844: step: 468/463, loss: 0.028678154572844505 2023-01-22 17:20:32.406882: step: 470/463, loss: 0.004774967674165964 2023-01-22 17:20:33.049097: step: 472/463, loss: 0.016893042251467705 2023-01-22 17:20:33.628013: step: 474/463, loss: 0.008640365675091743 2023-01-22 17:20:34.223355: step: 476/463, loss: 0.06607842445373535 2023-01-22 17:20:34.865446: step: 478/463, loss: 0.02033242955803871 2023-01-22 17:20:35.478708: step: 480/463, loss: 0.10660023242235184 2023-01-22 17:20:36.113905: step: 482/463, loss: 0.03930537402629852 2023-01-22 17:20:36.790030: step: 484/463, loss: 0.06668118387460709 2023-01-22 17:20:37.392249: step: 486/463, loss: 0.2138746976852417 2023-01-22 17:20:37.997975: step: 488/463, loss: 0.08405610173940659 2023-01-22 17:20:38.711625: step: 490/463, loss: 0.005365739576518536 2023-01-22 17:20:39.392448: step: 492/463, loss: 0.05597129836678505 2023-01-22 17:20:39.958542: step: 494/463, loss: 0.04358851909637451 2023-01-22 17:20:40.660957: step: 496/463, loss: 0.12269962579011917 2023-01-22 17:20:41.269246: step: 498/463, loss: 0.017021019011735916 2023-01-22 17:20:41.836940: step: 500/463, loss: 0.031021205708384514 2023-01-22 17:20:42.431730: step: 502/463, loss: 0.06373132765293121 2023-01-22 17:20:43.018371: step: 504/463, loss: 0.02295548841357231 2023-01-22 17:20:43.647990: step: 506/463, loss: 0.04117393121123314 2023-01-22 17:20:44.300208: step: 508/463, loss: 0.15027372539043427 2023-01-22 17:20:45.067163: step: 510/463, loss: 0.011589364148676395 2023-01-22 17:20:45.719114: step: 512/463, loss: 0.020236879587173462 2023-01-22 17:20:46.316850: step: 
514/463, loss: 0.261220782995224 2023-01-22 17:20:46.893997: step: 516/463, loss: 0.024177057668566704 2023-01-22 17:20:47.495074: step: 518/463, loss: 0.08192690461874008 2023-01-22 17:20:48.137048: step: 520/463, loss: 0.01594894379377365 2023-01-22 17:20:48.679261: step: 522/463, loss: 0.01750831864774227 2023-01-22 17:20:49.296149: step: 524/463, loss: 0.009235161356627941 2023-01-22 17:20:49.933696: step: 526/463, loss: 0.008091656491160393 2023-01-22 17:20:50.595159: step: 528/463, loss: 0.5002816319465637 2023-01-22 17:20:51.186716: step: 530/463, loss: 0.043521687388420105 2023-01-22 17:20:51.882631: step: 532/463, loss: 0.024673108011484146 2023-01-22 17:20:52.535365: step: 534/463, loss: 0.04728483781218529 2023-01-22 17:20:53.137263: step: 536/463, loss: 0.022420965135097504 2023-01-22 17:20:53.734706: step: 538/463, loss: 0.021939655765891075 2023-01-22 17:20:54.326628: step: 540/463, loss: 0.02867579273879528 2023-01-22 17:20:54.973847: step: 542/463, loss: 0.06178984418511391 2023-01-22 17:20:55.628158: step: 544/463, loss: 0.025143740698695183 2023-01-22 17:20:56.320822: step: 546/463, loss: 0.04023994505405426 2023-01-22 17:20:56.977432: step: 548/463, loss: 0.16838355362415314 2023-01-22 17:20:57.606953: step: 550/463, loss: 0.039148904383182526 2023-01-22 17:20:58.167111: step: 552/463, loss: 0.052133575081825256 2023-01-22 17:20:58.766209: step: 554/463, loss: 0.27369603514671326 2023-01-22 17:20:59.463656: step: 556/463, loss: 0.003646009135991335 2023-01-22 17:21:00.149263: step: 558/463, loss: 0.046143416315317154 2023-01-22 17:21:00.699828: step: 560/463, loss: 0.042662810534238815 2023-01-22 17:21:01.348880: step: 562/463, loss: 0.0034438222646713257 2023-01-22 17:21:01.923735: step: 564/463, loss: 0.010395655408501625 2023-01-22 17:21:02.499097: step: 566/463, loss: 0.015228550881147385 2023-01-22 17:21:03.114003: step: 568/463, loss: 0.01025892049074173 2023-01-22 17:21:03.760575: step: 570/463, loss: 0.01359342411160469 2023-01-22 
17:21:04.378712: step: 572/463, loss: 0.021212926134467125 2023-01-22 17:21:04.902056: step: 574/463, loss: 0.024636268615722656 2023-01-22 17:21:05.606170: step: 576/463, loss: 0.07653139531612396 2023-01-22 17:21:06.292204: step: 578/463, loss: 0.025550944730639458 2023-01-22 17:21:06.900793: step: 580/463, loss: 0.009554654359817505 2023-01-22 17:21:07.519905: step: 582/463, loss: 0.08674691617488861 2023-01-22 17:21:08.097056: step: 584/463, loss: 0.003558396827429533 2023-01-22 17:21:08.707223: step: 586/463, loss: 0.08891670405864716 2023-01-22 17:21:09.318623: step: 588/463, loss: 0.004067990928888321 2023-01-22 17:21:09.971862: step: 590/463, loss: 0.19377462565898895 2023-01-22 17:21:10.595975: step: 592/463, loss: 0.02604489214718342 2023-01-22 17:21:11.149898: step: 594/463, loss: 0.0136541947722435 2023-01-22 17:21:11.721588: step: 596/463, loss: 0.0020479450467973948 2023-01-22 17:21:12.388527: step: 598/463, loss: 0.0237677413970232 2023-01-22 17:21:13.040905: step: 600/463, loss: 0.037905577570199966 2023-01-22 17:21:13.681322: step: 602/463, loss: 0.09957637637853622 2023-01-22 17:21:14.259495: step: 604/463, loss: 0.012053903192281723 2023-01-22 17:21:14.880888: step: 606/463, loss: 0.06479569524526596 2023-01-22 17:21:15.471986: step: 608/463, loss: 0.014586645178496838 2023-01-22 17:21:16.072205: step: 610/463, loss: 0.036499083042144775 2023-01-22 17:21:16.731050: step: 612/463, loss: 0.00592226954177022 2023-01-22 17:21:17.340873: step: 614/463, loss: 0.015755832195281982 2023-01-22 17:21:17.993074: step: 616/463, loss: 0.023067282512784004 2023-01-22 17:21:18.638440: step: 618/463, loss: 0.011051676236093044 2023-01-22 17:21:19.289111: step: 620/463, loss: 0.02911444939672947 2023-01-22 17:21:19.882528: step: 622/463, loss: 0.09129710495471954 2023-01-22 17:21:20.531887: step: 624/463, loss: 0.007406262680888176 2023-01-22 17:21:21.095352: step: 626/463, loss: 0.0021360437385737896 2023-01-22 17:21:21.717668: step: 628/463, loss: 
0.029901724308729172 2023-01-22 17:21:22.406604: step: 630/463, loss: 0.1279161423444748 2023-01-22 17:21:23.057590: step: 632/463, loss: 0.008336743339896202 2023-01-22 17:21:23.754491: step: 634/463, loss: 0.014868838712573051 2023-01-22 17:21:24.401345: step: 636/463, loss: 0.23796209692955017 2023-01-22 17:21:24.992843: step: 638/463, loss: 0.040789615362882614 2023-01-22 17:21:25.604107: step: 640/463, loss: 0.03538300469517708 2023-01-22 17:21:26.255268: step: 642/463, loss: 0.019182274118065834 2023-01-22 17:21:26.829500: step: 644/463, loss: 0.030733667314052582 2023-01-22 17:21:27.450460: step: 646/463, loss: 0.009575051255524158 2023-01-22 17:21:28.076982: step: 648/463, loss: 0.022425610572099686 2023-01-22 17:21:28.742327: step: 650/463, loss: 0.020248502492904663 2023-01-22 17:21:29.484280: step: 652/463, loss: 0.033155206590890884 2023-01-22 17:21:30.072822: step: 654/463, loss: 0.03456728160381317 2023-01-22 17:21:30.648978: step: 656/463, loss: 0.03722742572426796 2023-01-22 17:21:31.266897: step: 658/463, loss: 0.0483386255800724 2023-01-22 17:21:31.832882: step: 660/463, loss: 0.0011788289994001389 2023-01-22 17:21:32.372587: step: 662/463, loss: 0.03551824763417244 2023-01-22 17:21:32.955172: step: 664/463, loss: 0.01675087958574295 2023-01-22 17:21:33.620276: step: 666/463, loss: 0.04700243100523949 2023-01-22 17:21:34.196162: step: 668/463, loss: 0.04846629127860069 2023-01-22 17:21:34.850488: step: 670/463, loss: 0.04675858095288277 2023-01-22 17:21:35.488262: step: 672/463, loss: 0.009110938757658005 2023-01-22 17:21:36.108369: step: 674/463, loss: 0.015501391142606735 2023-01-22 17:21:36.741947: step: 676/463, loss: 0.034774646162986755 2023-01-22 17:21:37.407482: step: 678/463, loss: 0.062475454062223434 2023-01-22 17:21:38.063778: step: 680/463, loss: 0.05605718120932579 2023-01-22 17:21:38.665579: step: 682/463, loss: 0.0045094299130141735 2023-01-22 17:21:39.235885: step: 684/463, loss: 0.06791666895151138 2023-01-22 17:21:39.819871: 
step: 686/463, loss: 0.06613350659608841 2023-01-22 17:21:40.397926: step: 688/463, loss: 0.004254626575857401 2023-01-22 17:21:41.045855: step: 690/463, loss: 0.007798529230058193 2023-01-22 17:21:41.669243: step: 692/463, loss: 0.050361815840005875 2023-01-22 17:21:42.256815: step: 694/463, loss: 0.048550527542829514 2023-01-22 17:21:42.871022: step: 696/463, loss: 0.21593502163887024 2023-01-22 17:21:43.417558: step: 698/463, loss: 0.05233469232916832 2023-01-22 17:21:44.031892: step: 700/463, loss: 0.016478747129440308 2023-01-22 17:21:44.620840: step: 702/463, loss: 0.25208622217178345 2023-01-22 17:21:45.338633: step: 704/463, loss: 0.04008551314473152 2023-01-22 17:21:46.020584: step: 706/463, loss: 0.12413036078214645 2023-01-22 17:21:46.602155: step: 708/463, loss: 0.024911794811487198 2023-01-22 17:21:47.187713: step: 710/463, loss: 1.8656680583953857 2023-01-22 17:21:47.800026: step: 712/463, loss: 0.02072846330702305 2023-01-22 17:21:48.383386: step: 714/463, loss: 0.30950698256492615 2023-01-22 17:21:48.990929: step: 716/463, loss: 0.07431033998727798 2023-01-22 17:21:49.590764: step: 718/463, loss: 0.014272412285208702 2023-01-22 17:21:50.228548: step: 720/463, loss: 0.029018409550189972 2023-01-22 17:21:50.821634: step: 722/463, loss: 0.039390332996845245 2023-01-22 17:21:51.474708: step: 724/463, loss: 0.3944337069988251 2023-01-22 17:21:52.208510: step: 726/463, loss: 0.04478415101766586 2023-01-22 17:21:52.835736: step: 728/463, loss: 0.0031139031052589417 2023-01-22 17:21:53.509990: step: 730/463, loss: 0.5571728944778442 2023-01-22 17:21:54.126214: step: 732/463, loss: 0.11298578977584839 2023-01-22 17:21:54.729199: step: 734/463, loss: 0.006445199251174927 2023-01-22 17:21:55.334659: step: 736/463, loss: 0.025385599583387375 2023-01-22 17:21:55.990115: step: 738/463, loss: 0.03202042356133461 2023-01-22 17:21:56.603217: step: 740/463, loss: 0.06522183120250702 2023-01-22 17:21:57.165252: step: 742/463, loss: 0.03894011676311493 2023-01-22 
17:21:57.749577: step: 744/463, loss: 0.18785327672958374 2023-01-22 17:21:58.330556: step: 746/463, loss: 0.02080308273434639 2023-01-22 17:21:58.932990: step: 748/463, loss: 0.014974262565374374 2023-01-22 17:21:59.594480: step: 750/463, loss: 0.5792259573936462 2023-01-22 17:22:00.234115: step: 752/463, loss: 0.004052882082760334 2023-01-22 17:22:00.845456: step: 754/463, loss: 0.003757901955395937 2023-01-22 17:22:01.516645: step: 756/463, loss: 0.057845450937747955 2023-01-22 17:22:02.076147: step: 758/463, loss: 0.015557783655822277 2023-01-22 17:22:02.668486: step: 760/463, loss: 0.023919960483908653 2023-01-22 17:22:03.260804: step: 762/463, loss: 0.018450137227773666 2023-01-22 17:22:03.845112: step: 764/463, loss: 0.3332688510417938 2023-01-22 17:22:04.439957: step: 766/463, loss: 0.12415773421525955 2023-01-22 17:22:05.034579: step: 768/463, loss: 0.01973123475909233 2023-01-22 17:22:05.564157: step: 770/463, loss: 0.011695022694766521 2023-01-22 17:22:06.208979: step: 772/463, loss: 0.02195870876312256 2023-01-22 17:22:06.894506: step: 774/463, loss: 0.015525211580097675 2023-01-22 17:22:07.472210: step: 776/463, loss: 0.22645972669124603 2023-01-22 17:22:08.191968: step: 778/463, loss: 0.04359378293156624 2023-01-22 17:22:08.831804: step: 780/463, loss: 0.015679193660616875 2023-01-22 17:22:09.434408: step: 782/463, loss: 0.04318249970674515 2023-01-22 17:22:10.017957: step: 784/463, loss: 0.0847286731004715 2023-01-22 17:22:10.691944: step: 786/463, loss: 0.05258708447217941 2023-01-22 17:22:11.342517: step: 788/463, loss: 0.05275636911392212 2023-01-22 17:22:11.960941: step: 790/463, loss: 0.07554729282855988 2023-01-22 17:22:12.649606: step: 792/463, loss: 0.15744267404079437 2023-01-22 17:22:13.232178: step: 794/463, loss: 0.020089788362383842 2023-01-22 17:22:13.876652: step: 796/463, loss: 0.39157140254974365 2023-01-22 17:22:14.522994: step: 798/463, loss: 0.05580102279782295 2023-01-22 17:22:15.187042: step: 800/463, loss: 0.07222683727741241 
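Each entry above logs a wall-clock timestamp and the loss for one training step (the counter advances by 2 per entry and runs to 926 against the 463 batches per epoch). At the end of each epoch the script prints a single summary line such as `Loss: 0.097`. A minimal sketch of recovering such a figure from the raw log text, assuming the summary is the plain arithmetic mean of the step losses — an assumption about `train.py`, not something the log states; `mean_epoch_loss` is a hypothetical helper, not part of the training code:

```python
import re

def mean_epoch_loss(log_text: str) -> float:
    # Hypothetical helper: pull every "loss: <float>" out of a chunk of
    # this log and average them (assumes the epoch summary is an
    # unweighted arithmetic mean).
    losses = [float(v) for v in re.findall(r"loss: ([0-9.]+)", log_text)]
    return sum(losses) / len(losses)

# Two step entries copied verbatim from the log above.
sample = (
    "2023-01-22 17:22:14.522994: step: 798/463, loss: 0.05580102279782295 "
    "2023-01-22 17:22:15.187042: step: 800/463, loss: 0.07222683727741241"
)
print(mean_epoch_loss(sample))
```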
2023-01-22 17:22:15.816426: step: 802/463, loss: 0.0940839946269989 2023-01-22 17:22:16.421069: step: 804/463, loss: 0.046713363379240036 2023-01-22 17:22:17.036782: step: 806/463, loss: 0.04370113089680672 2023-01-22 17:22:17.673060: step: 808/463, loss: 0.006451519671827555 2023-01-22 17:22:18.257856: step: 810/463, loss: 0.08223014324903488 2023-01-22 17:22:18.847056: step: 812/463, loss: 0.16129277646541595 2023-01-22 17:22:19.451665: step: 814/463, loss: 0.020276136696338654 2023-01-22 17:22:20.094401: step: 816/463, loss: 0.01944536156952381 2023-01-22 17:22:20.839600: step: 818/463, loss: 0.06842590123414993 2023-01-22 17:22:21.456278: step: 820/463, loss: 1.2451082468032837 2023-01-22 17:22:22.048208: step: 822/463, loss: 0.059509195387363434 2023-01-22 17:22:22.673445: step: 824/463, loss: 0.030874056741595268 2023-01-22 17:22:23.317566: step: 826/463, loss: 0.018031006678938866 2023-01-22 17:22:23.883376: step: 828/463, loss: 0.0684840977191925 2023-01-22 17:22:24.524427: step: 830/463, loss: 0.017451738938689232 2023-01-22 17:22:25.170067: step: 832/463, loss: 0.02305111475288868 2023-01-22 17:22:25.763570: step: 834/463, loss: 0.029928898438811302 2023-01-22 17:22:26.404161: step: 836/463, loss: 0.08949222415685654 2023-01-22 17:22:27.010884: step: 838/463, loss: 0.057591043412685394 2023-01-22 17:22:27.862393: step: 840/463, loss: 0.22588162124156952 2023-01-22 17:22:28.495743: step: 842/463, loss: 0.04172500967979431 2023-01-22 17:22:29.107471: step: 844/463, loss: 0.5713762640953064 2023-01-22 17:22:29.736996: step: 846/463, loss: 0.08438538759946823 2023-01-22 17:22:30.377007: step: 848/463, loss: 0.23034918308258057 2023-01-22 17:22:30.975527: step: 850/463, loss: 0.015118774957954884 2023-01-22 17:22:31.558781: step: 852/463, loss: 0.046705350279808044 2023-01-22 17:22:32.163576: step: 854/463, loss: 0.07861807942390442 2023-01-22 17:22:32.779554: step: 856/463, loss: 0.2379361093044281 2023-01-22 17:22:33.410804: step: 858/463, loss: 
0.07769784331321716 2023-01-22 17:22:34.043212: step: 860/463, loss: 0.06685512512922287 2023-01-22 17:22:34.666078: step: 862/463, loss: 0.047386761754751205 2023-01-22 17:22:35.281478: step: 864/463, loss: 0.10052480548620224 2023-01-22 17:22:36.015181: step: 866/463, loss: 0.0296592116355896 2023-01-22 17:22:36.651202: step: 868/463, loss: 0.07322181761264801 2023-01-22 17:22:37.246846: step: 870/463, loss: 0.047359444200992584 2023-01-22 17:22:37.889223: step: 872/463, loss: 0.06559302657842636 2023-01-22 17:22:38.525985: step: 874/463, loss: 0.11056447774171829 2023-01-22 17:22:39.141305: step: 876/463, loss: 0.21934691071510315 2023-01-22 17:22:39.777913: step: 878/463, loss: 0.07080180943012238 2023-01-22 17:22:40.369734: step: 880/463, loss: 0.06471425294876099 2023-01-22 17:22:40.980916: step: 882/463, loss: 0.0716087743639946 2023-01-22 17:22:41.577226: step: 884/463, loss: 0.02496063895523548 2023-01-22 17:22:42.284612: step: 886/463, loss: 0.03358428552746773 2023-01-22 17:22:42.905126: step: 888/463, loss: 0.05967062711715698 2023-01-22 17:22:43.627491: step: 890/463, loss: 0.021897463127970695 2023-01-22 17:22:44.278741: step: 892/463, loss: 0.07353842258453369 2023-01-22 17:22:44.888305: step: 894/463, loss: 0.06311706453561783 2023-01-22 17:22:45.444323: step: 896/463, loss: 0.023440958932042122 2023-01-22 17:22:46.051776: step: 898/463, loss: 0.03143225237727165 2023-01-22 17:22:46.654086: step: 900/463, loss: 0.06163887679576874 2023-01-22 17:22:47.341413: step: 902/463, loss: 0.07575874030590057 2023-01-22 17:22:47.935839: step: 904/463, loss: 0.03764557093381882 2023-01-22 17:22:48.524655: step: 906/463, loss: 0.019488399848341942 2023-01-22 17:22:49.117824: step: 908/463, loss: 0.023914938792586327 2023-01-22 17:22:49.740280: step: 910/463, loss: 0.1095043495297432 2023-01-22 17:22:50.308587: step: 912/463, loss: 0.019526569172739983 2023-01-22 17:22:50.939984: step: 914/463, loss: 0.13620992004871368 2023-01-22 17:22:51.633801: step: 916/463, 
loss: 0.014058236964046955 2023-01-22 17:22:52.231480: step: 918/463, loss: 0.834291398525238 2023-01-22 17:22:52.853811: step: 920/463, loss: 0.03066139854490757 2023-01-22 17:22:53.489152: step: 922/463, loss: 0.38639944791793823 2023-01-22 17:22:54.140448: step: 924/463, loss: 0.003428705735132098 2023-01-22 17:22:54.828181: step: 926/463, loss: 0.05778537318110466
==================================================
Loss: 0.097
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2884159482758621, 'r': 0.31742172675521824, 'f1': 0.3022244805781391}, 'combined': 0.22269172253126038, 'epoch': 23}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3485597377789355, 'r': 0.31844855624305884, 'f1': 0.33282448285868255}, 'combined': 0.2341478773880179, 'epoch': 23}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2880791083916084, 'r': 0.3126778937381404, 'f1': 0.2998748862602366}, 'combined': 0.22096044250754276, 'epoch': 23}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3501879299562226, 'r': 0.3132134626571799, 'f1': 0.3306703161723889}, 'combined': 0.23477592448239612, 'epoch': 23}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2996737339783267, 'r': 0.3281057770502742, 'f1': 0.31324591395922924}, 'combined': 0.23081277870680048, 'epoch': 23}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36324738983017096, 'r': 0.30514050841503, 'f1': 0.3316681630658378}, 'combined': 0.23548439577674482, 'epoch': 23}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2796296296296296, 'r': 0.35952380952380947, 'f1': 0.3145833333333333}, 'combined': 0.2097222222222222, 'epoch': 23}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3125, 'r': 0.43478260869565216, 'f1': 0.36363636363636365}, 'combined': 0.18181818181818182, 'epoch': 23}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3194444444444444, 'r': 0.19827586206896552, 'f1': 0.24468085106382978}, 'combined': 0.1631205673758865, 'epoch': 23}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 24
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 17:25:34.710742: step: 2/463, loss: 0.02154787816107273 2023-01-22 17:25:35.278145: step: 4/463, loss: 0.011272327974438667 2023-01-22 17:25:35.950613: step: 6/463, loss: 0.09992445260286331 2023-01-22 17:25:36.552083: step: 8/463, loss: 0.16863080859184265 2023-01-22 17:25:37.171508: step: 10/463, loss: 0.008213714696466923 2023-01-22 17:25:37.766984: step: 12/463, loss: 0.00742047606036067 2023-01-22 17:25:38.416739: step: 14/463, loss: 0.02726094424724579 2023-01-22 17:25:39.013424: step: 16/463, loss: 0.04925283044576645 2023-01-22 17:25:39.659491: step: 18/463, loss: 0.031762540340423584 2023-01-22 17:25:40.265100: step: 20/463, loss: 0.02416313998401165 2023-01-22 17:25:40.921395: step: 22/463, loss: 0.135688915848732 2023-01-22 17:25:41.538425: step: 24/463, loss: 0.014163674786686897 2023-01-22 17:25:42.173323: step: 26/463, loss: 0.15010103583335876 2023-01-22 17:25:42.971103: step: 28/463, loss: 0.020053762942552567 2023-01-22 17:25:43.527956: step: 30/463, loss: 0.020847642794251442 2023-01-22 17:25:44.162255: step: 32/463, loss: 0.028525806963443756 2023-01-22 17:25:44.777724: step: 34/463, loss: 0.006106817629188299 2023-01-22 17:25:45.474950: step: 36/463, loss:
0.08901070058345795 2023-01-22 17:25:46.118854: step: 38/463, loss: 0.15647484362125397 2023-01-22 17:25:46.788427: step: 40/463, loss: 0.23000669479370117 2023-01-22 17:25:47.371038: step: 42/463, loss: 0.0217449888586998 2023-01-22 17:25:47.931688: step: 44/463, loss: 0.0065308623015880585 2023-01-22 17:25:48.449210: step: 46/463, loss: 0.10569345206022263 2023-01-22 17:25:49.158505: step: 48/463, loss: 0.0003534290590323508 2023-01-22 17:25:49.798255: step: 50/463, loss: 0.03743773326277733 2023-01-22 17:25:50.425330: step: 52/463, loss: 0.015249907970428467 2023-01-22 17:25:51.064561: step: 54/463, loss: 0.023658758029341698 2023-01-22 17:25:51.714318: step: 56/463, loss: 0.03926058113574982 2023-01-22 17:25:52.295923: step: 58/463, loss: 0.0042902622371912 2023-01-22 17:25:52.908737: step: 60/463, loss: 0.040905121713876724 2023-01-22 17:25:53.507415: step: 62/463, loss: 1.694770097732544 2023-01-22 17:25:54.134558: step: 64/463, loss: 0.17651298642158508 2023-01-22 17:25:54.814235: step: 66/463, loss: 0.24792519211769104 2023-01-22 17:25:55.394806: step: 68/463, loss: 0.005508068948984146 2023-01-22 17:25:56.029183: step: 70/463, loss: 0.3491031527519226 2023-01-22 17:25:56.636640: step: 72/463, loss: 0.017216037958860397 2023-01-22 17:25:57.281098: step: 74/463, loss: 1.288432002067566 2023-01-22 17:25:57.895555: step: 76/463, loss: 0.028438659384846687 2023-01-22 17:25:58.587947: step: 78/463, loss: 0.047596946358680725 2023-01-22 17:25:59.188849: step: 80/463, loss: 0.052112214267253876 2023-01-22 17:25:59.793086: step: 82/463, loss: 0.051036685705184937 2023-01-22 17:26:00.455110: step: 84/463, loss: 0.11541088670492172 2023-01-22 17:26:01.083674: step: 86/463, loss: 0.026292184367775917 2023-01-22 17:26:01.652161: step: 88/463, loss: 0.0139949144795537 2023-01-22 17:26:02.216462: step: 90/463, loss: 0.03600534051656723 2023-01-22 17:26:02.872653: step: 92/463, loss: 0.09447404742240906 2023-01-22 17:26:03.457096: step: 94/463, loss: 0.05121331661939621 
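The per-epoch evaluation blocks in this log report template and slot precision/recall/F1 plus a 'combined' score. From the logged numbers, 'combined' is consistent with the product of the template F1 and the slot F1, with each F1 the usual harmonic mean of precision and recall. A small sketch checking that relationship against one logged entry — the interpretation is inferred from the numbers, not taken from the training code:

```python
def f1(p: float, r: float) -> float:
    # Standard F1: harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if p + r else 0.0

# Precision/recall values copied from the epoch-23 "Test Korean" entry,
# which the log reports with 'f1': 0.71 (template), 'f1': 0.3306703161723889
# (slot), and 'combined': 0.23477592448239612.
template_f1 = f1(0.9726027397260274, 0.5590551181102362)
slot_f1 = f1(0.3501879299562226, 0.3132134626571799)
combined = template_f1 * slot_f1

print(template_f1, slot_f1, combined)
```

The same product relationship holds for the other logged entries (e.g. epoch-23 Sample Korean: 0.5 × 0.36363636363636365 = 0.18181818181818182), which is why 'combined' drops whenever either the template or the slot F1 does.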
2023-01-22 17:26:04.080654: step: 96/463, loss: 0.4795037806034088 2023-01-22 17:26:04.683002: step: 98/463, loss: 0.0849536806344986 2023-01-22 17:26:05.322507: step: 100/463, loss: 0.024082055315375328 2023-01-22 17:26:05.939216: step: 102/463, loss: 0.004697373602539301 2023-01-22 17:26:06.603552: step: 104/463, loss: 0.011492719873785973 2023-01-22 17:26:07.156926: step: 106/463, loss: 0.4253498911857605 2023-01-22 17:26:07.752125: step: 108/463, loss: 0.002878354862332344 2023-01-22 17:26:08.373523: step: 110/463, loss: 0.0021745769772678614 2023-01-22 17:26:09.002859: step: 112/463, loss: 0.01800791174173355 2023-01-22 17:26:09.693293: step: 114/463, loss: 0.06406493484973907 2023-01-22 17:26:10.332139: step: 116/463, loss: 0.05337189510464668 2023-01-22 17:26:10.985060: step: 118/463, loss: 0.0470418818295002 2023-01-22 17:26:11.526522: step: 120/463, loss: 0.010111740790307522 2023-01-22 17:26:12.128900: step: 122/463, loss: 0.04624320566654205 2023-01-22 17:26:12.757227: step: 124/463, loss: 0.042156707495450974 2023-01-22 17:26:13.327869: step: 126/463, loss: 0.050189197063446045 2023-01-22 17:26:13.986105: step: 128/463, loss: 0.14640438556671143 2023-01-22 17:26:14.561682: step: 130/463, loss: 0.005728199612349272 2023-01-22 17:26:15.151724: step: 132/463, loss: 0.07451710104942322 2023-01-22 17:26:15.772697: step: 134/463, loss: 0.06092182919383049 2023-01-22 17:26:16.423322: step: 136/463, loss: 0.016769811511039734 2023-01-22 17:26:17.036960: step: 138/463, loss: 0.012961498461663723 2023-01-22 17:26:17.629237: step: 140/463, loss: 0.019037961959838867 2023-01-22 17:26:18.182662: step: 142/463, loss: 0.012365853413939476 2023-01-22 17:26:18.827443: step: 144/463, loss: 0.05496220290660858 2023-01-22 17:26:19.539905: step: 146/463, loss: 0.03284367546439171 2023-01-22 17:26:20.127006: step: 148/463, loss: 0.7536410093307495 2023-01-22 17:26:20.804943: step: 150/463, loss: 0.043980829417705536 2023-01-22 17:26:21.398631: step: 152/463, loss: 
0.04366925731301308 2023-01-22 17:26:22.033630: step: 154/463, loss: 0.030875546857714653 2023-01-22 17:26:22.582868: step: 156/463, loss: 0.3530164957046509 2023-01-22 17:26:23.256494: step: 158/463, loss: 0.013277437537908554 2023-01-22 17:26:23.874950: step: 160/463, loss: 0.02993687614798546 2023-01-22 17:26:24.458583: step: 162/463, loss: 0.05758369341492653 2023-01-22 17:26:25.114795: step: 164/463, loss: 0.13259810209274292 2023-01-22 17:26:25.793330: step: 166/463, loss: 0.08789709210395813 2023-01-22 17:26:26.406764: step: 168/463, loss: 0.02889864332973957 2023-01-22 17:26:27.055649: step: 170/463, loss: 0.019116181880235672 2023-01-22 17:26:27.609490: step: 172/463, loss: 0.015127943828701973 2023-01-22 17:26:28.211908: step: 174/463, loss: 0.012739073485136032 2023-01-22 17:26:28.822068: step: 176/463, loss: 0.014192002825438976 2023-01-22 17:26:29.428128: step: 178/463, loss: 0.006838496308773756 2023-01-22 17:26:30.050613: step: 180/463, loss: 0.07409291714429855 2023-01-22 17:26:30.606627: step: 182/463, loss: 0.00706526031717658 2023-01-22 17:26:31.288448: step: 184/463, loss: 0.2519952058792114 2023-01-22 17:26:31.922357: step: 186/463, loss: 0.008615119382739067 2023-01-22 17:26:32.580000: step: 188/463, loss: 0.01488721463829279 2023-01-22 17:26:33.273703: step: 190/463, loss: 0.007983477786183357 2023-01-22 17:26:33.934058: step: 192/463, loss: 0.02199907973408699 2023-01-22 17:26:34.565699: step: 194/463, loss: 0.02740555629134178 2023-01-22 17:26:35.170593: step: 196/463, loss: 0.04987142235040665 2023-01-22 17:26:35.787693: step: 198/463, loss: 0.014718769118189812 2023-01-22 17:26:36.363624: step: 200/463, loss: 0.06009833142161369 2023-01-22 17:26:36.969287: step: 202/463, loss: 0.053088024258613586 2023-01-22 17:26:37.575903: step: 204/463, loss: 0.011730360798537731 2023-01-22 17:26:38.240977: step: 206/463, loss: 0.07350147515535355 2023-01-22 17:26:38.893837: step: 208/463, loss: 0.04847460240125656 2023-01-22 17:26:39.580052: step: 
210/463, loss: 0.2980313301086426 2023-01-22 17:26:40.178888: step: 212/463, loss: 0.013508399948477745 2023-01-22 17:26:40.762720: step: 214/463, loss: 0.03806102275848389 2023-01-22 17:26:41.401351: step: 216/463, loss: 0.06622885912656784 2023-01-22 17:26:42.025760: step: 218/463, loss: 0.051811717450618744 2023-01-22 17:26:42.664825: step: 220/463, loss: 0.011327734217047691 2023-01-22 17:26:43.305119: step: 222/463, loss: 0.024459224194288254 2023-01-22 17:26:44.006140: step: 224/463, loss: 0.005248096771538258 2023-01-22 17:26:44.581062: step: 226/463, loss: 0.10312044620513916 2023-01-22 17:26:45.193131: step: 228/463, loss: 0.30551475286483765 2023-01-22 17:26:45.810483: step: 230/463, loss: 0.022489266470074654 2023-01-22 17:26:46.431773: step: 232/463, loss: 0.012361498549580574 2023-01-22 17:26:46.991670: step: 234/463, loss: 0.007158802822232246 2023-01-22 17:26:47.569108: step: 236/463, loss: 0.04437195509672165 2023-01-22 17:26:48.157302: step: 238/463, loss: 0.013542373664677143 2023-01-22 17:26:48.804438: step: 240/463, loss: 0.07925643771886826 2023-01-22 17:26:49.414794: step: 242/463, loss: 0.04142526909708977 2023-01-22 17:26:49.992642: step: 244/463, loss: 0.0498080849647522 2023-01-22 17:26:50.616125: step: 246/463, loss: 0.14723414182662964 2023-01-22 17:26:51.287019: step: 248/463, loss: 0.009859489277005196 2023-01-22 17:26:51.938237: step: 250/463, loss: 0.014310535043478012 2023-01-22 17:26:52.545748: step: 252/463, loss: 0.010096844285726547 2023-01-22 17:26:53.151790: step: 254/463, loss: 0.08052971959114075 2023-01-22 17:26:53.706518: step: 256/463, loss: 0.019421596080064774 2023-01-22 17:26:54.340902: step: 258/463, loss: 0.11731848865747452 2023-01-22 17:26:55.032212: step: 260/463, loss: 0.02431412972509861 2023-01-22 17:26:55.686237: step: 262/463, loss: 0.6578973531723022 2023-01-22 17:26:56.309929: step: 264/463, loss: 0.05692674219608307 2023-01-22 17:26:56.886690: step: 266/463, loss: 0.030120881274342537 2023-01-22 
17:26:57.568207: step: 268/463, loss: 0.11921948194503784 2023-01-22 17:26:58.186552: step: 270/463, loss: 0.006165622733533382 2023-01-22 17:26:58.825437: step: 272/463, loss: 0.05765359103679657 2023-01-22 17:26:59.555697: step: 274/463, loss: 0.041743479669094086 2023-01-22 17:27:00.166376: step: 276/463, loss: 0.007901758886873722 2023-01-22 17:27:00.751337: step: 278/463, loss: 0.6516156196594238 2023-01-22 17:27:01.447184: step: 280/463, loss: 0.050377726554870605 2023-01-22 17:27:02.105310: step: 282/463, loss: 0.004036291036754847 2023-01-22 17:27:02.708097: step: 284/463, loss: 0.037798553705215454 2023-01-22 17:27:03.352682: step: 286/463, loss: 0.06682592630386353 2023-01-22 17:27:03.993565: step: 288/463, loss: 0.07578270137310028 2023-01-22 17:27:04.595742: step: 290/463, loss: 0.42399051785469055 2023-01-22 17:27:05.187727: step: 292/463, loss: 0.009073241613805294 2023-01-22 17:27:05.765182: step: 294/463, loss: 0.032846082001924515 2023-01-22 17:27:06.346946: step: 296/463, loss: 0.0070249238051474094 2023-01-22 17:27:06.985309: step: 298/463, loss: 0.0571865513920784 2023-01-22 17:27:07.530723: step: 300/463, loss: 0.0816686749458313 2023-01-22 17:27:08.134965: step: 302/463, loss: 0.10224536061286926 2023-01-22 17:27:08.698408: step: 304/463, loss: 0.021261895075440407 2023-01-22 17:27:09.321462: step: 306/463, loss: 0.021262118592858315 2023-01-22 17:27:09.942891: step: 308/463, loss: 0.013932416215538979 2023-01-22 17:27:10.545444: step: 310/463, loss: 0.029500797390937805 2023-01-22 17:27:11.220703: step: 312/463, loss: 0.027756493538618088 2023-01-22 17:27:11.774150: step: 314/463, loss: 0.02237740531563759 2023-01-22 17:27:12.411521: step: 316/463, loss: 0.018178790807724 2023-01-22 17:27:13.037667: step: 318/463, loss: 0.10535358637571335 2023-01-22 17:27:13.622966: step: 320/463, loss: 0.059715598821640015 2023-01-22 17:27:14.255080: step: 322/463, loss: 0.03232601657509804 2023-01-22 17:27:14.920245: step: 324/463, loss: 
0.03406830132007599 2023-01-22 17:27:15.604293: step: 326/463, loss: 0.022730013355612755 2023-01-22 17:27:16.194902: step: 328/463, loss: 0.06777559965848923 2023-01-22 17:27:16.811316: step: 330/463, loss: 0.343959778547287 2023-01-22 17:27:17.440758: step: 332/463, loss: 0.011968759819865227 2023-01-22 17:27:18.037082: step: 334/463, loss: 0.004772141110152006 2023-01-22 17:27:18.703709: step: 336/463, loss: 0.16213339567184448 2023-01-22 17:27:19.340595: step: 338/463, loss: 0.07047352939844131 2023-01-22 17:27:19.927551: step: 340/463, loss: 0.028111305087804794 2023-01-22 17:27:20.553556: step: 342/463, loss: 0.003760100807994604 2023-01-22 17:27:21.202479: step: 344/463, loss: 0.10969149321317673 2023-01-22 17:27:21.829578: step: 346/463, loss: 0.036409806460142136 2023-01-22 17:27:22.387800: step: 348/463, loss: 0.21110419929027557 2023-01-22 17:27:22.996230: step: 350/463, loss: 0.6004828810691833 2023-01-22 17:27:23.713523: step: 352/463, loss: 0.39065271615982056 2023-01-22 17:27:24.285216: step: 354/463, loss: 0.25227969884872437 2023-01-22 17:27:25.014551: step: 356/463, loss: 0.23308967053890228 2023-01-22 17:27:25.596210: step: 358/463, loss: 0.20950326323509216 2023-01-22 17:27:26.174311: step: 360/463, loss: 0.050682906061410904 2023-01-22 17:27:26.755729: step: 362/463, loss: 0.019851967692375183 2023-01-22 17:27:27.372405: step: 364/463, loss: 0.025335540995001793 2023-01-22 17:27:28.039540: step: 366/463, loss: 0.03753063082695007 2023-01-22 17:27:28.666398: step: 368/463, loss: 0.08870086073875427 2023-01-22 17:27:29.322033: step: 370/463, loss: 0.04684872180223465 2023-01-22 17:27:29.947454: step: 372/463, loss: 0.056196801364421844 2023-01-22 17:27:30.588069: step: 374/463, loss: 0.006583400070667267 2023-01-22 17:27:31.175585: step: 376/463, loss: 0.06797608733177185 2023-01-22 17:27:31.782266: step: 378/463, loss: 0.03502649813890457 2023-01-22 17:27:32.364320: step: 380/463, loss: 0.013109921477735043 2023-01-22 17:27:32.965714: step: 
382/463, loss: 0.041452765464782715 2023-01-22 17:27:33.612788: step: 384/463, loss: 0.04293333366513252 2023-01-22 17:27:34.246052: step: 386/463, loss: 0.024125609546899796 2023-01-22 17:27:34.839429: step: 388/463, loss: 0.023708190768957138 2023-01-22 17:27:35.469127: step: 390/463, loss: 0.023528335615992546 2023-01-22 17:27:36.071698: step: 392/463, loss: 0.059591758996248245 2023-01-22 17:27:36.667322: step: 394/463, loss: 0.02388681285083294 2023-01-22 17:27:37.228628: step: 396/463, loss: 0.010158158838748932 2023-01-22 17:27:37.827647: step: 398/463, loss: 0.012181886471807957 2023-01-22 17:27:38.489014: step: 400/463, loss: 0.0219782255589962 2023-01-22 17:27:39.073554: step: 402/463, loss: 0.0020483483094722033 2023-01-22 17:27:39.790165: step: 404/463, loss: 0.04038657993078232 2023-01-22 17:27:40.459365: step: 406/463, loss: 0.14766545593738556 2023-01-22 17:27:41.027412: step: 408/463, loss: 0.20826144516468048 2023-01-22 17:27:41.634511: step: 410/463, loss: 0.10310250520706177 2023-01-22 17:27:42.269986: step: 412/463, loss: 0.028869448229670525 2023-01-22 17:27:42.835219: step: 414/463, loss: 0.025206487625837326 2023-01-22 17:27:43.395132: step: 416/463, loss: 0.0023915076162666082 2023-01-22 17:27:44.059615: step: 418/463, loss: 0.17600232362747192 2023-01-22 17:27:44.667967: step: 420/463, loss: 0.02621486224234104 2023-01-22 17:27:45.287498: step: 422/463, loss: 0.003943255171179771 2023-01-22 17:27:45.891002: step: 424/463, loss: 0.010243468917906284 2023-01-22 17:27:46.533315: step: 426/463, loss: 0.00456245057284832 2023-01-22 17:27:47.225755: step: 428/463, loss: 0.0068969386629760265 2023-01-22 17:27:47.867499: step: 430/463, loss: 0.04178307577967644 2023-01-22 17:27:48.515582: step: 432/463, loss: 0.001963286194950342 2023-01-22 17:27:49.175588: step: 434/463, loss: 0.0351041741669178 2023-01-22 17:27:49.880067: step: 436/463, loss: 0.021570879966020584 2023-01-22 17:27:50.499241: step: 438/463, loss: 0.007141605485230684 2023-01-22 
17:27:51.026377: step: 440/463, loss: 0.016238966956734657 2023-01-22 17:27:51.649365: step: 442/463, loss: 0.006308582611382008 2023-01-22 17:27:52.237734: step: 444/463, loss: 0.0031510782428085804 2023-01-22 17:27:52.811982: step: 446/463, loss: 0.00912080891430378 2023-01-22 17:27:53.519436: step: 448/463, loss: 0.006560547277331352 2023-01-22 17:27:54.168539: step: 450/463, loss: 0.020050277933478355 2023-01-22 17:27:54.780987: step: 452/463, loss: 0.009101350791752338 2023-01-22 17:27:55.406745: step: 454/463, loss: 0.06446589529514313 2023-01-22 17:27:56.040532: step: 456/463, loss: 0.02454133704304695 2023-01-22 17:27:56.644046: step: 458/463, loss: 0.015303069725632668 2023-01-22 17:27:57.192714: step: 460/463, loss: 0.008602706715464592 2023-01-22 17:27:57.763940: step: 462/463, loss: 0.003972137812525034 2023-01-22 17:27:58.390911: step: 464/463, loss: 0.00645432760939002 2023-01-22 17:27:59.041579: step: 466/463, loss: 0.1755065768957138 2023-01-22 17:27:59.704708: step: 468/463, loss: 0.027596401050686836 2023-01-22 17:28:00.287243: step: 470/463, loss: 0.0028843635227531195 2023-01-22 17:28:00.854239: step: 472/463, loss: 0.16587810218334198 2023-01-22 17:28:01.471477: step: 474/463, loss: 0.007610809989273548 2023-01-22 17:28:02.238919: step: 476/463, loss: 0.04218628257513046 2023-01-22 17:28:02.913024: step: 478/463, loss: 0.018574006855487823 2023-01-22 17:28:03.537686: step: 480/463, loss: 0.02324530854821205 2023-01-22 17:28:04.162653: step: 482/463, loss: 0.01811802014708519 2023-01-22 17:28:04.723397: step: 484/463, loss: 0.019892647862434387 2023-01-22 17:28:05.351127: step: 486/463, loss: 0.018489811569452286 2023-01-22 17:28:05.996345: step: 488/463, loss: 0.003761408617720008 2023-01-22 17:28:06.599287: step: 490/463, loss: 0.00808839499950409 2023-01-22 17:28:07.268559: step: 492/463, loss: 0.27048489451408386 2023-01-22 17:28:07.863381: step: 494/463, loss: 0.007092142943292856 2023-01-22 17:28:08.455431: step: 496/463, loss: 
0.030156413093209267 2023-01-22 17:28:09.098004: step: 498/463, loss: 0.029225366190075874 2023-01-22 17:28:09.755605: step: 500/463, loss: 0.0170326866209507 2023-01-22 17:28:10.350444: step: 502/463, loss: 0.01223977841436863 2023-01-22 17:28:10.998828: step: 504/463, loss: 0.04225526377558708 2023-01-22 17:28:11.673610: step: 506/463, loss: 0.08552591502666473 2023-01-22 17:28:12.349533: step: 508/463, loss: 0.05372040718793869 2023-01-22 17:28:12.998764: step: 510/463, loss: 0.008716282434761524 2023-01-22 17:28:13.658438: step: 512/463, loss: 0.08692143112421036 2023-01-22 17:28:14.238626: step: 514/463, loss: 0.1859707087278366 2023-01-22 17:28:14.824969: step: 516/463, loss: 0.007628163322806358 2023-01-22 17:28:15.454641: step: 518/463, loss: 0.02198653668165207 2023-01-22 17:28:16.066898: step: 520/463, loss: 0.01780496910214424 2023-01-22 17:28:16.666898: step: 522/463, loss: 0.020145878195762634 2023-01-22 17:28:17.200610: step: 524/463, loss: 0.001263095298781991 2023-01-22 17:28:17.830025: step: 526/463, loss: 0.010636504739522934 2023-01-22 17:28:18.524771: step: 528/463, loss: 0.01673496700823307 2023-01-22 17:28:19.193830: step: 530/463, loss: 0.04983898252248764 2023-01-22 17:28:19.837153: step: 532/463, loss: 0.1179044172167778 2023-01-22 17:28:20.487164: step: 534/463, loss: 0.009475599974393845 2023-01-22 17:28:21.174338: step: 536/463, loss: 0.0910903587937355 2023-01-22 17:28:21.804879: step: 538/463, loss: 0.0021920623257756233 2023-01-22 17:28:22.428036: step: 540/463, loss: 0.013429964892566204 2023-01-22 17:28:22.983172: step: 542/463, loss: 0.01569763384759426 2023-01-22 17:28:23.600921: step: 544/463, loss: 0.006411516107618809 2023-01-22 17:28:24.184537: step: 546/463, loss: 0.011767766438424587 2023-01-22 17:28:24.822472: step: 548/463, loss: 0.015643300488591194 2023-01-22 17:28:25.452917: step: 550/463, loss: 0.0003929726080968976 2023-01-22 17:28:26.100454: step: 552/463, loss: 0.0012740949168801308 2023-01-22 17:28:26.799025: step: 
554/463, loss: 0.031146228313446045 2023-01-22 17:28:27.469082: step: 556/463, loss: 0.011685994453728199 2023-01-22 17:28:28.101396: step: 558/463, loss: 0.07473944872617722 2023-01-22 17:28:28.678589: step: 560/463, loss: 0.016522668302059174 2023-01-22 17:28:29.286438: step: 562/463, loss: 0.038474325090646744 2023-01-22 17:28:29.856394: step: 564/463, loss: 0.03601900115609169 2023-01-22 17:28:30.351383: step: 566/463, loss: 0.031685005873441696 2023-01-22 17:28:31.011953: step: 568/463, loss: 0.011367942206561565 2023-01-22 17:28:31.635422: step: 570/463, loss: 0.0053645288571715355 2023-01-22 17:28:32.249406: step: 572/463, loss: 0.1040564775466919 2023-01-22 17:28:32.913552: step: 574/463, loss: 0.03742009028792381 2023-01-22 17:28:33.588053: step: 576/463, loss: 0.019883394241333008 2023-01-22 17:28:34.223594: step: 578/463, loss: 0.04160835221409798 2023-01-22 17:28:34.825507: step: 580/463, loss: 0.0051157246343791485 2023-01-22 17:28:35.497865: step: 582/463, loss: 0.06626801937818527 2023-01-22 17:28:36.176642: step: 584/463, loss: 0.015126736834645271 2023-01-22 17:28:36.765795: step: 586/463, loss: 0.036434322595596313 2023-01-22 17:28:37.407426: step: 588/463, loss: 0.0061904569156467915 2023-01-22 17:28:38.057366: step: 590/463, loss: 0.1156497672200203 2023-01-22 17:28:38.666190: step: 592/463, loss: 0.14057716727256775 2023-01-22 17:28:39.282042: step: 594/463, loss: 0.018376464024186134 2023-01-22 17:28:39.911559: step: 596/463, loss: 0.02636924386024475 2023-01-22 17:28:40.552051: step: 598/463, loss: 0.010219041258096695 2023-01-22 17:28:41.166214: step: 600/463, loss: 0.34983694553375244 2023-01-22 17:28:41.800272: step: 602/463, loss: 0.06348025053739548 2023-01-22 17:28:42.440060: step: 604/463, loss: 0.007992386817932129 2023-01-22 17:28:43.070421: step: 606/463, loss: 0.013451758772134781 2023-01-22 17:28:43.692049: step: 608/463, loss: 0.7719197273254395 2023-01-22 17:28:44.303824: step: 610/463, loss: 0.06322049349546432 2023-01-22 
17:28:44.840407: step: 612/463, loss: 0.04225136339664459 2023-01-22 17:28:45.439129: step: 614/463, loss: 0.21242806315422058 2023-01-22 17:28:46.048029: step: 616/463, loss: 0.048087988048791885 2023-01-22 17:28:46.648769: step: 618/463, loss: 0.005673318635672331 2023-01-22 17:28:47.259197: step: 620/463, loss: 0.030607812106609344 2023-01-22 17:28:47.897440: step: 622/463, loss: 0.04865031689405441 2023-01-22 17:28:48.470237: step: 624/463, loss: 0.005974498111754656 2023-01-22 17:28:49.091200: step: 626/463, loss: 0.023962851613759995 2023-01-22 17:28:49.718556: step: 628/463, loss: 0.019295185804367065 2023-01-22 17:28:50.306306: step: 630/463, loss: 0.24297137558460236 2023-01-22 17:28:51.033158: step: 632/463, loss: 0.05746319144964218 2023-01-22 17:28:51.682731: step: 634/463, loss: 0.032981909811496735 2023-01-22 17:28:52.402666: step: 636/463, loss: 0.16302557289600372 2023-01-22 17:28:53.009817: step: 638/463, loss: 0.04449867457151413 2023-01-22 17:28:53.587733: step: 640/463, loss: 0.004948452580720186 2023-01-22 17:28:54.216454: step: 642/463, loss: 0.05447782203555107 2023-01-22 17:28:54.809131: step: 644/463, loss: 0.01917831227183342 2023-01-22 17:28:55.435153: step: 646/463, loss: 0.04870018735527992 2023-01-22 17:28:56.128589: step: 648/463, loss: 0.013602307997643948 2023-01-22 17:28:56.761156: step: 650/463, loss: 0.04065564647316933 2023-01-22 17:28:57.389359: step: 652/463, loss: 0.574588418006897 2023-01-22 17:28:58.075721: step: 654/463, loss: 0.06764312833547592 2023-01-22 17:28:58.680498: step: 656/463, loss: 0.0025896041188389063 2023-01-22 17:28:59.291555: step: 658/463, loss: 0.02156662568449974 2023-01-22 17:28:59.902838: step: 660/463, loss: 0.19365161657333374 2023-01-22 17:29:00.562320: step: 662/463, loss: 0.03667619824409485 2023-01-22 17:29:01.171252: step: 664/463, loss: 0.017535943537950516 2023-01-22 17:29:01.805123: step: 666/463, loss: 0.038888297975063324 2023-01-22 17:29:02.417446: step: 668/463, loss: 
0.0047869132831692696 2023-01-22 17:29:03.050860: step: 670/463, loss: 0.010897749103605747 2023-01-22 17:29:03.636850: step: 672/463, loss: 0.05347045511007309 2023-01-22 17:29:04.265723: step: 674/463, loss: 0.012796790339052677 2023-01-22 17:29:04.839841: step: 676/463, loss: 0.08466549217700958 2023-01-22 17:29:05.475150: step: 678/463, loss: 0.018159102648496628 2023-01-22 17:29:06.118142: step: 680/463, loss: 0.02649029903113842 2023-01-22 17:29:06.802312: step: 682/463, loss: 0.008291780948638916 2023-01-22 17:29:07.473877: step: 684/463, loss: 0.008414607495069504 2023-01-22 17:29:08.050233: step: 686/463, loss: 0.03525730222463608 2023-01-22 17:29:08.639910: step: 688/463, loss: 0.054401297122240067 2023-01-22 17:29:09.246705: step: 690/463, loss: 0.012010350823402405 2023-01-22 17:29:09.970852: step: 692/463, loss: 0.02416534535586834 2023-01-22 17:29:10.640911: step: 694/463, loss: 0.0100698946043849 2023-01-22 17:29:11.292614: step: 696/463, loss: 0.00327460584230721 2023-01-22 17:29:11.889855: step: 698/463, loss: 0.015421032905578613 2023-01-22 17:29:12.510410: step: 700/463, loss: 0.32142242789268494 2023-01-22 17:29:13.253589: step: 702/463, loss: 0.01059854868799448 2023-01-22 17:29:13.796816: step: 704/463, loss: 0.018395904451608658 2023-01-22 17:29:14.381195: step: 706/463, loss: 0.35547834634780884 2023-01-22 17:29:15.060238: step: 708/463, loss: 0.04303404688835144 2023-01-22 17:29:15.679722: step: 710/463, loss: 0.09011078625917435 2023-01-22 17:29:16.243054: step: 712/463, loss: 0.0012208075495436788 2023-01-22 17:29:16.877112: step: 714/463, loss: 0.00914571899920702 2023-01-22 17:29:17.524797: step: 716/463, loss: 0.1592751294374466 2023-01-22 17:29:18.103909: step: 718/463, loss: 0.007053169421851635 2023-01-22 17:29:18.769317: step: 720/463, loss: 0.04186827316880226 2023-01-22 17:29:19.508116: step: 722/463, loss: 0.054789889603853226 2023-01-22 17:29:20.138212: step: 724/463, loss: 0.018602756783366203 2023-01-22 17:29:20.741259: step: 
726/463, loss: 0.03133273869752884 2023-01-22 17:29:21.389025: step: 728/463, loss: 0.0033027196768671274 2023-01-22 17:29:21.975003: step: 730/463, loss: 0.06696689873933792 2023-01-22 17:29:22.591291: step: 732/463, loss: 0.008384250104427338 2023-01-22 17:29:23.216413: step: 734/463, loss: 0.03999199718236923 2023-01-22 17:29:23.824658: step: 736/463, loss: 0.029483428224921227 2023-01-22 17:29:24.446101: step: 738/463, loss: 0.04945564642548561 2023-01-22 17:29:25.029611: step: 740/463, loss: 0.02120252326130867 2023-01-22 17:29:25.672750: step: 742/463, loss: 0.055935394018888474 2023-01-22 17:29:26.438576: step: 744/463, loss: 0.02242564596235752 2023-01-22 17:29:27.135665: step: 746/463, loss: 0.02618967555463314 2023-01-22 17:29:27.799740: step: 748/463, loss: 0.03839452192187309 2023-01-22 17:29:28.432546: step: 750/463, loss: 6.765986472601071e-05 2023-01-22 17:29:29.080880: step: 752/463, loss: 0.01052500493824482 2023-01-22 17:29:29.773445: step: 754/463, loss: 0.21418775618076324 2023-01-22 17:29:30.359105: step: 756/463, loss: 0.011764584109187126 2023-01-22 17:29:30.904097: step: 758/463, loss: 0.003530450165271759 2023-01-22 17:29:31.533833: step: 760/463, loss: 0.1216595321893692 2023-01-22 17:29:32.147353: step: 762/463, loss: 0.06528561562299728 2023-01-22 17:29:32.706315: step: 764/463, loss: 0.15064913034439087 2023-01-22 17:29:33.340779: step: 766/463, loss: 0.01857985183596611 2023-01-22 17:29:33.982781: step: 768/463, loss: 0.008824251592159271 2023-01-22 17:29:34.581986: step: 770/463, loss: 0.00939925666898489 2023-01-22 17:29:35.158377: step: 772/463, loss: 0.005191699601709843 2023-01-22 17:29:35.816191: step: 774/463, loss: 0.041555870324373245 2023-01-22 17:29:36.441423: step: 776/463, loss: 0.0187307707965374 2023-01-22 17:29:37.106196: step: 778/463, loss: 0.03252154216170311 2023-01-22 17:29:37.735409: step: 780/463, loss: 0.02709745056927204 2023-01-22 17:29:38.398623: step: 782/463, loss: 0.01647551730275154 2023-01-22 
17:29:39.078293: step: 784/463, loss: 0.03129119798541069 2023-01-22 17:29:39.656493: step: 786/463, loss: 0.0324518196284771 2023-01-22 17:29:40.327388: step: 788/463, loss: 0.01122946385294199 2023-01-22 17:29:40.950472: step: 790/463, loss: 0.04954236373305321 2023-01-22 17:29:41.601326: step: 792/463, loss: 0.0038455140311270952 2023-01-22 17:29:42.243624: step: 794/463, loss: 0.028034178540110588 2023-01-22 17:29:42.881506: step: 796/463, loss: 0.007170025259256363 2023-01-22 17:29:43.499410: step: 798/463, loss: 0.005859613884240389 2023-01-22 17:29:44.126693: step: 800/463, loss: 0.09755295515060425 2023-01-22 17:29:44.754703: step: 802/463, loss: 0.0011405328987166286 2023-01-22 17:29:45.400284: step: 804/463, loss: 1.0445609092712402 2023-01-22 17:29:45.993611: step: 806/463, loss: 0.12469397485256195 2023-01-22 17:29:46.575688: step: 808/463, loss: 0.08371111005544662 2023-01-22 17:29:47.176078: step: 810/463, loss: 0.029746374115347862 2023-01-22 17:29:47.792126: step: 812/463, loss: 0.0008151483489200473 2023-01-22 17:29:48.371162: step: 814/463, loss: 0.053259484469890594 2023-01-22 17:29:49.015908: step: 816/463, loss: 0.018781164661049843 2023-01-22 17:29:49.605656: step: 818/463, loss: 0.03212551027536392 2023-01-22 17:29:50.260631: step: 820/463, loss: 0.05152464658021927 2023-01-22 17:29:50.901040: step: 822/463, loss: 0.020616773515939713 2023-01-22 17:29:51.594870: step: 824/463, loss: 0.16062775254249573 2023-01-22 17:29:52.191728: step: 826/463, loss: 0.1442812979221344 2023-01-22 17:29:52.816725: step: 828/463, loss: 0.031985729932785034 2023-01-22 17:29:53.401952: step: 830/463, loss: 0.023981083184480667 2023-01-22 17:29:53.989053: step: 832/463, loss: 0.00401545874774456 2023-01-22 17:29:54.628692: step: 834/463, loss: 0.03239932283759117 2023-01-22 17:29:55.233510: step: 836/463, loss: 0.010528789833188057 2023-01-22 17:29:55.838672: step: 838/463, loss: 0.06283333897590637 2023-01-22 17:29:56.555704: step: 840/463, loss: 
0.02500576712191105 2023-01-22 17:29:57.219835: step: 842/463, loss: 0.023158472031354904 2023-01-22 17:29:57.936010: step: 844/463, loss: 0.015501804649829865 2023-01-22 17:29:58.629857: step: 846/463, loss: 0.028191469609737396 2023-01-22 17:29:59.324198: step: 848/463, loss: 0.02797282114624977 2023-01-22 17:29:59.924563: step: 850/463, loss: 0.007414241787046194 2023-01-22 17:30:00.529921: step: 852/463, loss: 0.03013245016336441 2023-01-22 17:30:01.213950: step: 854/463, loss: 0.09525308012962341 2023-01-22 17:30:01.864876: step: 856/463, loss: 0.042966797947883606 2023-01-22 17:30:02.535087: step: 858/463, loss: 0.02702353708446026 2023-01-22 17:30:03.109689: step: 860/463, loss: 0.001902017742395401 2023-01-22 17:30:03.706825: step: 862/463, loss: 0.019140712916851044 2023-01-22 17:30:04.311328: step: 864/463, loss: 0.06699293106794357 2023-01-22 17:30:04.898615: step: 866/463, loss: 0.06756817549467087 2023-01-22 17:30:05.496012: step: 868/463, loss: 0.01569155789911747 2023-01-22 17:30:06.090762: step: 870/463, loss: 0.03409242630004883 2023-01-22 17:30:06.655424: step: 872/463, loss: 0.02184385061264038 2023-01-22 17:30:07.329836: step: 874/463, loss: 0.004961153492331505 2023-01-22 17:30:07.960700: step: 876/463, loss: 0.13240276277065277 2023-01-22 17:30:08.567769: step: 878/463, loss: 0.019126959145069122 2023-01-22 17:30:09.174738: step: 880/463, loss: 0.005521869752556086 2023-01-22 17:30:09.834019: step: 882/463, loss: 0.018516888841986656 2023-01-22 17:30:10.480741: step: 884/463, loss: 0.022748515009880066 2023-01-22 17:30:11.110092: step: 886/463, loss: 0.11106698960065842 2023-01-22 17:30:11.737071: step: 888/463, loss: 0.029722148552536964 2023-01-22 17:30:12.272020: step: 890/463, loss: 0.04380152374505997 2023-01-22 17:30:12.916650: step: 892/463, loss: 0.02680574543774128 2023-01-22 17:30:13.478788: step: 894/463, loss: 0.14267607033252716 2023-01-22 17:30:14.103132: step: 896/463, loss: 0.006017074920237064 2023-01-22 17:30:14.736249: step: 
898/463, loss: 0.11540395021438599 2023-01-22 17:30:15.371759: step: 900/463, loss: 0.04129418730735779 2023-01-22 17:30:16.061969: step: 902/463, loss: 0.01264046411961317 2023-01-22 17:30:16.704296: step: 904/463, loss: 0.020309830084443092 2023-01-22 17:30:17.307017: step: 906/463, loss: 0.10446984320878983 2023-01-22 17:30:17.943996: step: 908/463, loss: 0.00663741584867239 2023-01-22 17:30:18.604276: step: 910/463, loss: 0.019194338470697403 2023-01-22 17:30:19.189466: step: 912/463, loss: 0.05620375648140907 2023-01-22 17:30:19.834196: step: 914/463, loss: 0.08648758381605148 2023-01-22 17:30:20.424658: step: 916/463, loss: 0.0012221003416925669 2023-01-22 17:30:21.101103: step: 918/463, loss: 0.1343500018119812 2023-01-22 17:30:21.727425: step: 920/463, loss: 0.34208521246910095 2023-01-22 17:30:22.313942: step: 922/463, loss: 0.04202340543270111 2023-01-22 17:30:22.945278: step: 924/463, loss: 0.0012775585055351257 2023-01-22 17:30:23.530627: step: 926/463, loss: 0.02598246932029724
==================================================
Loss: 0.070
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29892015706806285, 'r': 0.3250118595825427, 'f1': 0.3114204545454546}, 'combined': 0.22946770334928232, 'epoch': 24}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35556081469012174, 'r': 0.3139856408956398, 'f1': 0.3334824323136267}, 'combined': 0.2346107564015464, 'epoch': 24}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2982700892857143, 'r': 0.31694734345351044, 'f1': 0.30732520699172033}, 'combined': 0.22645015252021497, 'epoch': 24}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3611765258057013, 'r': 0.31012015828342937, 'f1': 0.33370676187118314}, 'combined': 0.23693180092854002, 'epoch': 24}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30846788015905663, 'r': 0.3283690337177054, 'f1': 0.31810750141402716}, 'combined': 0.23439500104191474, 'epoch': 24}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36871348141567106, 'r': 0.2994790722415494, 'f1': 0.33050943394368587}, 'combined': 0.23466169810001697, 'epoch': 24}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26215277777777773, 'r': 0.35952380952380947, 'f1': 0.3032128514056224}, 'combined': 0.20214190093708156, 'epoch': 24}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27702702702702703, 'r': 0.44565217391304346, 'f1': 0.3416666666666667}, 'combined': 0.17083333333333334, 'epoch': 24}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40789473684210525, 'r': 0.2672413793103448, 'f1': 0.3229166666666667}, 'combined': 0.2152777777777778, 'epoch': 24}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 25
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 17:33:01.942441: step: 2/463, loss: 0.057302962988615036 2023-01-22 17:33:02.577276: step: 4/463, loss: 0.01852249912917614 2023-01-22 17:33:03.205643: step: 6/463, loss: 0.3842068314552307 2023-01-22 17:33:03.832456: step: 8/463, loss: 0.2170548439025879 2023-01-22 17:33:04.410669: step: 10/463, loss: 0.008522633463144302 2023-01-22 17:33:05.034641: step: 12/463, loss: 0.06070908159017563 2023-01-22 17:33:05.672469: step: 14/463, loss: 0.034053172916173935 2023-01-22 17:33:06.288146: step: 16/463, loss: 0.012888525612652302 2023-01-22 17:33:07.030804: step: 18/463,
loss: 0.05755499750375748 2023-01-22 17:33:07.628332: step: 20/463, loss: 0.0024352511391043663 2023-01-22 17:33:08.234300: step: 22/463, loss: 0.00221640570089221 2023-01-22 17:33:08.953204: step: 24/463, loss: 0.004661782644689083 2023-01-22 17:33:09.639749: step: 26/463, loss: 0.05005740001797676 2023-01-22 17:33:10.224678: step: 28/463, loss: 0.002592317759990692 2023-01-22 17:33:10.827205: step: 30/463, loss: 0.07645998150110245 2023-01-22 17:33:11.412225: step: 32/463, loss: 0.0014141896972432733 2023-01-22 17:33:12.045672: step: 34/463, loss: 0.021779172122478485 2023-01-22 17:33:12.718495: step: 36/463, loss: 0.02350645698606968 2023-01-22 17:33:13.393566: step: 38/463, loss: 0.012934483587741852 2023-01-22 17:33:14.064279: step: 40/463, loss: 0.031489744782447815 2023-01-22 17:33:14.610254: step: 42/463, loss: 0.020799607038497925 2023-01-22 17:33:15.211244: step: 44/463, loss: 0.040483903139829636 2023-01-22 17:33:15.821809: step: 46/463, loss: 0.1460500806570053 2023-01-22 17:33:16.462452: step: 48/463, loss: 0.19536256790161133 2023-01-22 17:33:16.996525: step: 50/463, loss: 0.006061998195946217 2023-01-22 17:33:17.650880: step: 52/463, loss: 0.01171745266765356 2023-01-22 17:33:18.277500: step: 54/463, loss: 0.011754184029996395 2023-01-22 17:33:18.962369: step: 56/463, loss: 0.1373332142829895 2023-01-22 17:33:19.605492: step: 58/463, loss: 0.00354779209010303 2023-01-22 17:33:20.221955: step: 60/463, loss: 0.027041086927056313 2023-01-22 17:33:20.852203: step: 62/463, loss: 0.04234721139073372 2023-01-22 17:33:21.402332: step: 64/463, loss: 0.005601263605058193 2023-01-22 17:33:22.003906: step: 66/463, loss: 0.04450717568397522 2023-01-22 17:33:22.609004: step: 68/463, loss: 0.007348646409809589 2023-01-22 17:33:23.209566: step: 70/463, loss: 0.0002987582702189684 2023-01-22 17:33:23.811830: step: 72/463, loss: 0.8272213935852051 2023-01-22 17:33:24.464069: step: 74/463, loss: 0.15087854862213135 2023-01-22 17:33:25.001834: step: 76/463, loss: 
0.013482317328453064 2023-01-22 17:33:25.622666: step: 78/463, loss: 0.01967170089483261 2023-01-22 17:33:26.309752: step: 80/463, loss: 0.0236872136592865 2023-01-22 17:33:26.938439: step: 82/463, loss: 0.03972487524151802 2023-01-22 17:33:27.558519: step: 84/463, loss: 0.03651516139507294 2023-01-22 17:33:28.212891: step: 86/463, loss: 0.006312840152531862 2023-01-22 17:33:28.849702: step: 88/463, loss: 0.02740023098886013 2023-01-22 17:33:29.440766: step: 90/463, loss: 0.046417590230703354 2023-01-22 17:33:30.054518: step: 92/463, loss: 0.007785239722579718 2023-01-22 17:33:30.660657: step: 94/463, loss: 0.0061247688718140125 2023-01-22 17:33:31.214002: step: 96/463, loss: 0.0011405410477891564 2023-01-22 17:33:31.852110: step: 98/463, loss: 0.06603604555130005 2023-01-22 17:33:32.493180: step: 100/463, loss: 0.008483638055622578 2023-01-22 17:33:33.089286: step: 102/463, loss: 0.009341648779809475 2023-01-22 17:33:33.780776: step: 104/463, loss: 0.04395283758640289 2023-01-22 17:33:34.381134: step: 106/463, loss: 0.001083686831407249 2023-01-22 17:33:34.985940: step: 108/463, loss: 0.022011177614331245 2023-01-22 17:33:35.595000: step: 110/463, loss: 0.008242065086960793 2023-01-22 17:33:36.176763: step: 112/463, loss: 0.02450496144592762 2023-01-22 17:33:36.800057: step: 114/463, loss: 0.022780872881412506 2023-01-22 17:33:37.417582: step: 116/463, loss: 0.13546577095985413 2023-01-22 17:33:38.034309: step: 118/463, loss: 0.06569387763738632 2023-01-22 17:33:38.654905: step: 120/463, loss: 0.008568460121750832 2023-01-22 17:33:39.184440: step: 122/463, loss: 0.005942097399383783 2023-01-22 17:33:39.810585: step: 124/463, loss: 0.011355992406606674 2023-01-22 17:33:40.369590: step: 126/463, loss: 0.0001533769245725125 2023-01-22 17:33:40.981241: step: 128/463, loss: 0.013081954792141914 2023-01-22 17:33:41.543618: step: 130/463, loss: 0.02918005920946598 2023-01-22 17:33:42.137802: step: 132/463, loss: 0.028670839965343475 2023-01-22 17:33:42.783085: step: 
134/463, loss: 0.0019708205945789814 2023-01-22 17:33:43.445403: step: 136/463, loss: 0.0029621173162013292 2023-01-22 17:33:44.078506: step: 138/463, loss: 0.006678888574242592 2023-01-22 17:33:44.785699: step: 140/463, loss: 0.005182827357202768 2023-01-22 17:33:45.379600: step: 142/463, loss: 0.09436925500631332 2023-01-22 17:33:46.000208: step: 144/463, loss: 0.028548013418912888 2023-01-22 17:33:46.595240: step: 146/463, loss: 0.03117876686155796 2023-01-22 17:33:47.305951: step: 148/463, loss: 0.006528372876346111 2023-01-22 17:33:47.927507: step: 150/463, loss: 0.1656406819820404 2023-01-22 17:33:48.593658: step: 152/463, loss: 0.017929766327142715 2023-01-22 17:33:49.286694: step: 154/463, loss: 0.6341349482536316 2023-01-22 17:33:49.937751: step: 156/463, loss: 0.040077582001686096 2023-01-22 17:33:50.554527: step: 158/463, loss: 0.02685161679983139 2023-01-22 17:33:51.167735: step: 160/463, loss: 0.0004693639057222754 2023-01-22 17:33:51.759203: step: 162/463, loss: 0.05267840251326561 2023-01-22 17:33:52.357576: step: 164/463, loss: 0.25749674439430237 2023-01-22 17:33:52.963828: step: 166/463, loss: 6.216945621417835e-05 2023-01-22 17:33:53.576836: step: 168/463, loss: 0.005322039593011141 2023-01-22 17:33:54.259087: step: 170/463, loss: 0.028385162353515625 2023-01-22 17:33:54.888317: step: 172/463, loss: 0.011960746720433235 2023-01-22 17:33:55.483779: step: 174/463, loss: 0.00389685551635921 2023-01-22 17:33:56.068030: step: 176/463, loss: 0.02316039241850376 2023-01-22 17:33:56.697210: step: 178/463, loss: 0.08430065214633942 2023-01-22 17:33:57.326347: step: 180/463, loss: 0.46658119559288025 2023-01-22 17:33:57.939961: step: 182/463, loss: 0.005048105958849192 2023-01-22 17:33:58.589858: step: 184/463, loss: 0.027088826522231102 2023-01-22 17:33:59.223697: step: 186/463, loss: 0.00453605130314827 2023-01-22 17:33:59.808773: step: 188/463, loss: 0.019154712557792664 2023-01-22 17:34:00.456268: step: 190/463, loss: 0.03827141597867012 2023-01-22 
17:34:01.065422: step: 192/463, loss: 0.008115684613585472 2023-01-22 17:34:01.711062: step: 194/463, loss: 0.040869321674108505 2023-01-22 17:34:02.311907: step: 196/463, loss: 0.023555941879749298 2023-01-22 17:34:02.940012: step: 198/463, loss: 0.017903029918670654 2023-01-22 17:34:03.525425: step: 200/463, loss: 0.04735589027404785 2023-01-22 17:34:04.137741: step: 202/463, loss: 0.38268622756004333 2023-01-22 17:34:04.843625: step: 204/463, loss: 0.19904479384422302 2023-01-22 17:34:05.497349: step: 206/463, loss: 0.008936000987887383 2023-01-22 17:34:06.113536: step: 208/463, loss: 0.005919995252043009 2023-01-22 17:34:06.708747: step: 210/463, loss: 0.010240599513053894 2023-01-22 17:34:07.294527: step: 212/463, loss: 0.03906400874257088 2023-01-22 17:34:07.952743: step: 214/463, loss: 0.0010734631214290857 2023-01-22 17:34:08.535097: step: 216/463, loss: 0.009507209062576294 2023-01-22 17:34:09.061020: step: 218/463, loss: 0.0028330180794000626 2023-01-22 17:34:09.649898: step: 220/463, loss: 0.010603483766317368 2023-01-22 17:34:10.243314: step: 222/463, loss: 0.007083239033818245 2023-01-22 17:34:10.905767: step: 224/463, loss: 0.010324249975383282 2023-01-22 17:34:11.470684: step: 226/463, loss: 0.00742448540404439 2023-01-22 17:34:12.063107: step: 228/463, loss: 0.050290994346141815 2023-01-22 17:34:12.682626: step: 230/463, loss: 0.004349336959421635 2023-01-22 17:34:13.268932: step: 232/463, loss: 0.045917484909296036 2023-01-22 17:34:13.911283: step: 234/463, loss: 0.031848661601543427 2023-01-22 17:34:14.518175: step: 236/463, loss: 0.030647816136479378 2023-01-22 17:34:15.105010: step: 238/463, loss: 0.02059847302734852 2023-01-22 17:34:15.742423: step: 240/463, loss: 0.02160876989364624 2023-01-22 17:34:16.431799: step: 242/463, loss: 0.013080844655632973 2023-01-22 17:34:17.074621: step: 244/463, loss: 0.04033321514725685 2023-01-22 17:34:17.706106: step: 246/463, loss: 0.0019364332547411323 2023-01-22 17:34:18.322878: step: 248/463, loss: 
0.027135450392961502 2023-01-22 17:34:18.927255: step: 250/463, loss: 0.011136802844703197 2023-01-22 17:34:19.548388: step: 252/463, loss: 0.0034948750399053097 2023-01-22 17:34:20.071761: step: 254/463, loss: 0.0010191729525104165 2023-01-22 17:34:20.669900: step: 256/463, loss: 0.003934869077056646 2023-01-22 17:34:21.308986: step: 258/463, loss: 0.033262692391872406 2023-01-22 17:34:21.962323: step: 260/463, loss: 0.009994233027100563 2023-01-22 17:34:22.583315: step: 262/463, loss: 0.004077099729329348 2023-01-22 17:34:23.178692: step: 264/463, loss: 0.0005193166434764862 2023-01-22 17:34:23.854741: step: 266/463, loss: 0.00695264944806695 2023-01-22 17:34:24.416708: step: 268/463, loss: 0.08168432861566544 2023-01-22 17:34:25.078621: step: 270/463, loss: 0.020601660013198853 2023-01-22 17:34:25.627941: step: 272/463, loss: 0.01118266861885786 2023-01-22 17:34:26.191883: step: 274/463, loss: 0.0009270587470382452 2023-01-22 17:34:26.794198: step: 276/463, loss: 0.0045670573599636555 2023-01-22 17:34:27.419189: step: 278/463, loss: 0.07541597634553909 2023-01-22 17:34:28.017500: step: 280/463, loss: 0.002097351010888815 2023-01-22 17:34:28.616771: step: 282/463, loss: 0.0043763574212789536 2023-01-22 17:34:29.202180: step: 284/463, loss: 0.0026054643094539642 2023-01-22 17:34:29.888689: step: 286/463, loss: 0.021727852523326874 2023-01-22 17:34:30.553164: step: 288/463, loss: 0.009471302852034569 2023-01-22 17:34:31.151835: step: 290/463, loss: 0.06551475077867508 2023-01-22 17:34:31.732461: step: 292/463, loss: 0.01292817946523428 2023-01-22 17:34:32.350152: step: 294/463, loss: 0.014555895701050758 2023-01-22 17:34:32.955758: step: 296/463, loss: 0.03987764194607735 2023-01-22 17:34:33.635784: step: 298/463, loss: 0.0177764929831028 2023-01-22 17:34:34.328898: step: 300/463, loss: 0.008657808415591717 2023-01-22 17:34:34.962654: step: 302/463, loss: 0.0062730214558541775 2023-01-22 17:34:35.530379: step: 304/463, loss: 0.005871305707842112 2023-01-22 
17:34:36.087791: step: 306/463, loss: 0.0950247049331665 2023-01-22 17:34:36.817575: step: 308/463, loss: 0.031176820397377014 2023-01-22 17:34:37.462697: step: 310/463, loss: 0.038271162658929825 2023-01-22 17:34:38.134176: step: 312/463, loss: 0.06210783123970032 2023-01-22 17:34:38.748459: step: 314/463, loss: 0.17606982588768005 2023-01-22 17:34:39.339610: step: 316/463, loss: 0.07480455189943314 2023-01-22 17:34:39.949003: step: 318/463, loss: 0.17868801951408386 2023-01-22 17:34:40.536201: step: 320/463, loss: 0.013032588176429272 2023-01-22 17:34:41.206753: step: 322/463, loss: 0.06287174671888351 2023-01-22 17:34:41.806561: step: 324/463, loss: 0.04328424111008644 2023-01-22 17:34:42.419188: step: 326/463, loss: 0.09385145455598831 2023-01-22 17:34:42.968624: step: 328/463, loss: 0.015254276804625988 2023-01-22 17:34:43.558765: step: 330/463, loss: 0.01074440497905016 2023-01-22 17:34:44.179448: step: 332/463, loss: 0.023391056805849075 2023-01-22 17:34:44.849028: step: 334/463, loss: 0.02780279703438282 2023-01-22 17:34:45.437371: step: 336/463, loss: 0.030893884599208832 2023-01-22 17:34:46.048367: step: 338/463, loss: 0.008319702930748463 2023-01-22 17:34:46.670783: step: 340/463, loss: 0.05772455781698227 2023-01-22 17:34:47.318189: step: 342/463, loss: 0.008088185451924801 2023-01-22 17:34:47.868216: step: 344/463, loss: 0.03776765987277031 2023-01-22 17:34:48.507725: step: 346/463, loss: 0.008874297142028809 2023-01-22 17:34:49.082949: step: 348/463, loss: 0.011532902717590332 2023-01-22 17:34:49.786173: step: 350/463, loss: 0.16311147809028625 2023-01-22 17:34:50.399164: step: 352/463, loss: 0.04252118989825249 2023-01-22 17:34:51.009088: step: 354/463, loss: 0.008686690591275692 2023-01-22 17:34:51.660207: step: 356/463, loss: 0.005681866314262152 2023-01-22 17:34:52.352769: step: 358/463, loss: 0.03594763204455376 2023-01-22 17:34:53.059277: step: 360/463, loss: 0.05414300039410591 2023-01-22 17:34:53.619263: step: 362/463, loss: 
0.010812772437930107 2023-01-22 17:34:54.253829: step: 364/463, loss: 0.00018398492829874158 2023-01-22 17:34:54.844908: step: 366/463, loss: 0.0038744041230529547 2023-01-22 17:34:55.442403: step: 368/463, loss: 0.013348935171961784 2023-01-22 17:34:56.121463: step: 370/463, loss: 0.030728695914149284 2023-01-22 17:34:56.715724: step: 372/463, loss: 0.012359564192593098 2023-01-22 17:34:57.365396: step: 374/463, loss: 0.018455639481544495 2023-01-22 17:34:58.005569: step: 376/463, loss: 0.01169959083199501 2023-01-22 17:34:58.640676: step: 378/463, loss: 0.14637883007526398 2023-01-22 17:34:59.273241: step: 380/463, loss: 0.03680294007062912 2023-01-22 17:34:59.841835: step: 382/463, loss: 0.0028800591826438904 2023-01-22 17:35:00.417810: step: 384/463, loss: 0.06581912934780121 2023-01-22 17:35:01.020934: step: 386/463, loss: 0.05552119016647339 2023-01-22 17:35:01.656282: step: 388/463, loss: 0.03536562621593475 2023-01-22 17:35:02.291358: step: 390/463, loss: 0.028910452499985695 2023-01-22 17:35:02.945112: step: 392/463, loss: 0.01994897611439228 2023-01-22 17:35:03.570590: step: 394/463, loss: 0.0038552694022655487 2023-01-22 17:35:04.216667: step: 396/463, loss: 0.019589632749557495 2023-01-22 17:35:04.879281: step: 398/463, loss: 0.036420971155166626 2023-01-22 17:35:05.490769: step: 400/463, loss: 0.049638014286756516 2023-01-22 17:35:06.081993: step: 402/463, loss: 0.004046909045428038 2023-01-22 17:35:06.664750: step: 404/463, loss: 0.05476788803935051 2023-01-22 17:35:07.478729: step: 406/463, loss: 0.037916891276836395 2023-01-22 17:35:08.102349: step: 408/463, loss: 0.008093694224953651 2023-01-22 17:35:08.726979: step: 410/463, loss: 0.024047253653407097 2023-01-22 17:35:09.303585: step: 412/463, loss: 0.02121901325881481 2023-01-22 17:35:09.896527: step: 414/463, loss: 0.006429386790841818 2023-01-22 17:35:10.496770: step: 416/463, loss: 0.020175179466605186 2023-01-22 17:35:11.066795: step: 418/463, loss: 0.008278806693851948 2023-01-22 
17:35:11.690574: step: 420/463, loss: 0.10228173434734344 2023-01-22 17:35:12.282864: step: 422/463, loss: 0.009379604831337929 2023-01-22 17:35:12.925825: step: 424/463, loss: 0.008021031506359577 2023-01-22 17:35:13.579232: step: 426/463, loss: 0.042402517050504684 2023-01-22 17:35:14.200890: step: 428/463, loss: 0.0088718943297863 2023-01-22 17:35:14.766839: step: 430/463, loss: 0.02621322125196457 2023-01-22 17:35:15.403003: step: 432/463, loss: 0.0310148186981678 2023-01-22 17:35:15.995634: step: 434/463, loss: 0.016806337982416153 2023-01-22 17:35:16.655039: step: 436/463, loss: 0.005243271589279175 2023-01-22 17:35:17.341707: step: 438/463, loss: 0.05722426623106003 2023-01-22 17:35:17.939725: step: 440/463, loss: 0.05976366624236107 2023-01-22 17:35:18.517251: step: 442/463, loss: 0.02002291940152645 2023-01-22 17:35:19.099814: step: 444/463, loss: 0.008887650445103645 2023-01-22 17:35:19.750965: step: 446/463, loss: 0.02857253886759281 2023-01-22 17:35:20.295512: step: 448/463, loss: 0.011843237094581127 2023-01-22 17:35:20.909925: step: 450/463, loss: 0.036049310117959976 2023-01-22 17:35:21.568563: step: 452/463, loss: 0.015252743847668171 2023-01-22 17:35:22.260481: step: 454/463, loss: 0.03929377347230911 2023-01-22 17:35:22.868115: step: 456/463, loss: 0.03385158255696297 2023-01-22 17:35:23.496245: step: 458/463, loss: 0.03129095956683159 2023-01-22 17:35:24.163780: step: 460/463, loss: 0.05460015684366226 2023-01-22 17:35:24.774076: step: 462/463, loss: 0.09239614009857178 2023-01-22 17:35:25.398846: step: 464/463, loss: 0.2792365252971649 2023-01-22 17:35:26.051485: step: 466/463, loss: 0.010799788869917393 2023-01-22 17:35:26.669182: step: 468/463, loss: 0.0030732739251106977 2023-01-22 17:35:27.270661: step: 470/463, loss: 0.005176235921680927 2023-01-22 17:35:27.890332: step: 472/463, loss: 0.010563170537352562 2023-01-22 17:35:28.456929: step: 474/463, loss: 0.03724229708313942 2023-01-22 17:35:29.108779: step: 476/463, loss: 
0.05246124416589737 2023-01-22 17:35:29.678199: step: 478/463, loss: 0.12462900578975677 2023-01-22 17:35:30.343445: step: 480/463, loss: 0.0017681369790807366 2023-01-22 17:35:30.944388: step: 482/463, loss: 0.008751614950597286 2023-01-22 17:35:31.596349: step: 484/463, loss: 0.04031358286738396 2023-01-22 17:35:32.204328: step: 486/463, loss: 0.004609514027833939 2023-01-22 17:35:32.834603: step: 488/463, loss: 0.06289667636156082 2023-01-22 17:35:33.448274: step: 490/463, loss: 0.01022890955209732 2023-01-22 17:35:34.098505: step: 492/463, loss: 0.0011745513183996081 2023-01-22 17:35:34.747861: step: 494/463, loss: 0.05416417121887207 2023-01-22 17:35:35.435926: step: 496/463, loss: 0.7456041574478149 2023-01-22 17:35:36.019876: step: 498/463, loss: 0.015438547357916832 2023-01-22 17:35:36.695783: step: 500/463, loss: 0.014089792035520077 2023-01-22 17:35:37.209311: step: 502/463, loss: 0.05061308667063713 2023-01-22 17:35:37.958486: step: 504/463, loss: 0.021909601986408234 2023-01-22 17:35:38.584316: step: 506/463, loss: 0.008621095679700375 2023-01-22 17:35:39.136188: step: 508/463, loss: 0.014597372151911259 2023-01-22 17:35:39.747590: step: 510/463, loss: 2.6798694133758545 2023-01-22 17:35:40.395808: step: 512/463, loss: 0.0065391515381634235 2023-01-22 17:35:40.990348: step: 514/463, loss: 0.10399416834115982 2023-01-22 17:35:41.577300: step: 516/463, loss: 0.020610354840755463 2023-01-22 17:35:42.133858: step: 518/463, loss: 0.13269460201263428 2023-01-22 17:35:42.729619: step: 520/463, loss: 0.06851325184106827 2023-01-22 17:35:43.325664: step: 522/463, loss: 0.00587398000061512 2023-01-22 17:35:43.975486: step: 524/463, loss: 0.015989985316991806 2023-01-22 17:35:44.595368: step: 526/463, loss: 0.0027295532636344433 2023-01-22 17:35:45.207520: step: 528/463, loss: 0.02301041968166828 2023-01-22 17:35:45.830247: step: 530/463, loss: 0.07625401020050049 2023-01-22 17:35:46.466887: step: 532/463, loss: 0.018582958728075027 2023-01-22 17:35:47.031526: 
step: 534/463, loss: 0.0025474783033132553 2023-01-22 17:35:47.679603: step: 536/463, loss: 0.008842743001878262 2023-01-22 17:35:48.303249: step: 538/463, loss: 0.07212825864553452 2023-01-22 17:35:48.899846: step: 540/463, loss: 0.22330251336097717 2023-01-22 17:35:49.540506: step: 542/463, loss: 0.5265229344367981 2023-01-22 17:35:50.232095: step: 544/463, loss: 0.028451118618249893 2023-01-22 17:35:50.859815: step: 546/463, loss: 0.05769031122326851 2023-01-22 17:35:51.460299: step: 548/463, loss: 0.012981404550373554 2023-01-22 17:35:52.097805: step: 550/463, loss: 0.06042862311005592 2023-01-22 17:35:52.712331: step: 552/463, loss: 0.028414668515324593 2023-01-22 17:35:53.334095: step: 554/463, loss: 0.1333131045103073 2023-01-22 17:35:53.976662: step: 556/463, loss: 0.04191620275378227 2023-01-22 17:35:54.640848: step: 558/463, loss: 0.05410356447100639 2023-01-22 17:35:55.256721: step: 560/463, loss: 0.23903688788414001 2023-01-22 17:35:55.885196: step: 562/463, loss: 0.017969265580177307 2023-01-22 17:35:56.444075: step: 564/463, loss: 0.001567368395626545 2023-01-22 17:35:57.095566: step: 566/463, loss: 0.0081771956756711 2023-01-22 17:35:57.743052: step: 568/463, loss: 0.0013565103290602565 2023-01-22 17:35:58.347571: step: 570/463, loss: 0.003196444595232606 2023-01-22 17:35:59.002084: step: 572/463, loss: 0.01656646840274334 2023-01-22 17:35:59.585220: step: 574/463, loss: 0.01343387458473444 2023-01-22 17:36:00.226748: step: 576/463, loss: 0.05985991284251213 2023-01-22 17:36:00.827465: step: 578/463, loss: 0.03644665703177452 2023-01-22 17:36:01.458086: step: 580/463, loss: 0.005518585909157991 2023-01-22 17:36:02.016700: step: 582/463, loss: 0.04437004029750824 2023-01-22 17:36:02.660450: step: 584/463, loss: 0.03884151205420494 2023-01-22 17:36:03.288622: step: 586/463, loss: 0.059661615639925 2023-01-22 17:36:03.965099: step: 588/463, loss: 0.060279399156570435 2023-01-22 17:36:04.615033: step: 590/463, loss: 0.000777052016928792 2023-01-22 
17:36:05.257177: step: 592/463, loss: 0.00707732979208231 2023-01-22 17:36:05.873621: step: 594/463, loss: 0.0013223501155152917 2023-01-22 17:36:06.494927: step: 596/463, loss: 0.00045954337110742927 2023-01-22 17:36:07.112571: step: 598/463, loss: 0.006680453661829233 2023-01-22 17:36:07.718356: step: 600/463, loss: 0.0018856502138078213 2023-01-22 17:36:08.357788: step: 602/463, loss: 0.048557911068201065 2023-01-22 17:36:08.930872: step: 604/463, loss: 0.00879836268723011 2023-01-22 17:36:09.599808: step: 606/463, loss: 0.03463005647063255 2023-01-22 17:36:10.217132: step: 608/463, loss: 0.19668737053871155 2023-01-22 17:36:10.805386: step: 610/463, loss: 0.006034636404365301 2023-01-22 17:36:11.435008: step: 612/463, loss: 0.0255566593259573 2023-01-22 17:36:12.060416: step: 614/463, loss: 0.03927958756685257 2023-01-22 17:36:12.609500: step: 616/463, loss: 0.020854413509368896 2023-01-22 17:36:13.237432: step: 618/463, loss: 0.1738763153553009 2023-01-22 17:36:13.882885: step: 620/463, loss: 0.02790101245045662 2023-01-22 17:36:14.548625: step: 622/463, loss: 0.019412798807024956 2023-01-22 17:36:15.188686: step: 624/463, loss: 0.009059369564056396 2023-01-22 17:36:15.798493: step: 626/463, loss: 0.06021444499492645 2023-01-22 17:36:16.381979: step: 628/463, loss: 0.015908081084489822 2023-01-22 17:36:17.004370: step: 630/463, loss: 0.014649467542767525 2023-01-22 17:36:17.683129: step: 632/463, loss: 0.009021610021591187 2023-01-22 17:36:18.326658: step: 634/463, loss: 0.06907861679792404 2023-01-22 17:36:18.900682: step: 636/463, loss: 0.07418691366910934 2023-01-22 17:36:19.505830: step: 638/463, loss: 0.009772669523954391 2023-01-22 17:36:20.135830: step: 640/463, loss: 0.01503036729991436 2023-01-22 17:36:20.741547: step: 642/463, loss: 0.0006463410099968314 2023-01-22 17:36:21.427474: step: 644/463, loss: 0.05333736166357994 2023-01-22 17:36:22.014176: step: 646/463, loss: 0.07148414850234985 2023-01-22 17:36:22.628610: step: 648/463, loss: 
0.06414070725440979 2023-01-22 17:36:23.222361: step: 650/463, loss: 0.021622028201818466 2023-01-22 17:36:23.811069: step: 652/463, loss: 0.01997608318924904 2023-01-22 17:36:24.451357: step: 654/463, loss: 0.448893278837204 2023-01-22 17:36:25.110127: step: 656/463, loss: 0.1361357867717743 2023-01-22 17:36:25.691761: step: 658/463, loss: 0.005972206126898527 2023-01-22 17:36:26.299566: step: 660/463, loss: 0.03520888462662697 2023-01-22 17:36:26.918699: step: 662/463, loss: 0.017786426469683647 2023-01-22 17:36:27.558333: step: 664/463, loss: 0.22943957149982452 2023-01-22 17:36:28.191693: step: 666/463, loss: 0.013730431906878948 2023-01-22 17:36:28.732858: step: 668/463, loss: 0.00463698199018836 2023-01-22 17:36:29.312866: step: 670/463, loss: 0.033566202968358994 2023-01-22 17:36:30.055352: step: 672/463, loss: 0.019083280116319656 2023-01-22 17:36:30.709186: step: 674/463, loss: 0.04003748670220375 2023-01-22 17:36:31.397465: step: 676/463, loss: 0.03013003058731556 2023-01-22 17:36:31.986118: step: 678/463, loss: 0.002928652800619602 2023-01-22 17:36:32.627500: step: 680/463, loss: 0.01869940757751465 2023-01-22 17:36:33.247741: step: 682/463, loss: 0.01848847232758999 2023-01-22 17:36:33.865463: step: 684/463, loss: 0.0034768334589898586 2023-01-22 17:36:34.490409: step: 686/463, loss: 0.03578011319041252 2023-01-22 17:36:35.134754: step: 688/463, loss: 0.0070393262431025505 2023-01-22 17:36:35.719747: step: 690/463, loss: 0.007814531214535236 2023-01-22 17:36:36.341420: step: 692/463, loss: 0.1775757223367691 2023-01-22 17:36:36.950285: step: 694/463, loss: 0.019050421193242073 2023-01-22 17:36:37.581685: step: 696/463, loss: 0.01946580782532692 2023-01-22 17:36:38.205857: step: 698/463, loss: 0.053767260164022446 2023-01-22 17:36:38.925593: step: 700/463, loss: 0.23071061074733734 2023-01-22 17:36:39.554075: step: 702/463, loss: 0.0009593978174962103 2023-01-22 17:36:40.156604: step: 704/463, loss: 0.031742777675390244 2023-01-22 17:36:40.785472: step: 
706/463, loss: 0.008334320038557053 2023-01-22 17:36:41.406819: step: 708/463, loss: 0.01677660271525383 2023-01-22 17:36:42.051909: step: 710/463, loss: 0.44588392972946167 2023-01-22 17:36:42.733128: step: 712/463, loss: 0.1231948658823967 2023-01-22 17:36:43.371867: step: 714/463, loss: 0.018855396658182144 2023-01-22 17:36:43.962174: step: 716/463, loss: 0.0335846021771431 2023-01-22 17:36:44.592756: step: 718/463, loss: 0.030143335461616516 2023-01-22 17:36:45.362539: step: 720/463, loss: 0.00879617128521204 2023-01-22 17:36:46.031086: step: 722/463, loss: 0.09415583312511444 2023-01-22 17:36:46.656041: step: 724/463, loss: 0.006232122424989939 2023-01-22 17:36:47.293793: step: 726/463, loss: 0.032799046486616135 2023-01-22 17:36:47.852501: step: 728/463, loss: 0.013905070722103119 2023-01-22 17:36:48.463906: step: 730/463, loss: 0.0026407698169350624 2023-01-22 17:36:49.092241: step: 732/463, loss: 0.16874364018440247 2023-01-22 17:36:49.693088: step: 734/463, loss: 0.0007326055783778429 2023-01-22 17:36:50.349216: step: 736/463, loss: 0.0007945413235574961 2023-01-22 17:36:50.966652: step: 738/463, loss: 0.03207281231880188 2023-01-22 17:36:51.544734: step: 740/463, loss: 0.05604258552193642 2023-01-22 17:36:52.194868: step: 742/463, loss: 0.026004012674093246 2023-01-22 17:36:52.825029: step: 744/463, loss: 0.002207354176789522 2023-01-22 17:36:53.455169: step: 746/463, loss: 0.023610983043909073 2023-01-22 17:36:54.055573: step: 748/463, loss: 0.0008069050381891429 2023-01-22 17:36:54.711468: step: 750/463, loss: 0.017948295921087265 2023-01-22 17:36:55.327973: step: 752/463, loss: 0.017106330022215843 2023-01-22 17:36:56.071516: step: 754/463, loss: 0.019637051969766617 2023-01-22 17:36:56.696843: step: 756/463, loss: 0.04784862697124481 2023-01-22 17:36:57.330243: step: 758/463, loss: 0.005550309084355831 2023-01-22 17:36:57.922422: step: 760/463, loss: 0.04014147073030472 2023-01-22 17:36:58.619662: step: 762/463, loss: 0.02748432755470276 2023-01-22 
17:36:59.253240: step: 764/463, loss: 0.028737803921103477 2023-01-22 17:36:59.880273: step: 766/463, loss: 0.005828005727380514 2023-01-22 17:37:00.536240: step: 768/463, loss: 0.07333066314458847 2023-01-22 17:37:01.134459: step: 770/463, loss: 0.023455141112208366 2023-01-22 17:37:01.768012: step: 772/463, loss: 0.08453132212162018 2023-01-22 17:37:02.362084: step: 774/463, loss: 0.06740622967481613 2023-01-22 17:37:03.032589: step: 776/463, loss: 0.17559926211833954 2023-01-22 17:37:03.704975: step: 778/463, loss: 0.06230633705854416 2023-01-22 17:37:04.334642: step: 780/463, loss: 0.006970869842916727 2023-01-22 17:37:05.009700: step: 782/463, loss: 0.03146901726722717 2023-01-22 17:37:05.629404: step: 784/463, loss: 0.02176230400800705 2023-01-22 17:37:06.329782: step: 786/463, loss: 0.02516130357980728 2023-01-22 17:37:06.984791: step: 788/463, loss: 0.02103971503674984 2023-01-22 17:37:07.673221: step: 790/463, loss: 0.07773151993751526 2023-01-22 17:37:08.297952: step: 792/463, loss: 0.3735676109790802 2023-01-22 17:37:08.866934: step: 794/463, loss: 0.015492771752178669 2023-01-22 17:37:09.517897: step: 796/463, loss: 0.004837566986680031 2023-01-22 17:37:10.164588: step: 798/463, loss: 0.061610620468854904 2023-01-22 17:37:10.857320: step: 800/463, loss: 0.0938883125782013 2023-01-22 17:37:11.437250: step: 802/463, loss: 0.04197343438863754 2023-01-22 17:37:12.119119: step: 804/463, loss: 0.06517980247735977 2023-01-22 17:37:12.751737: step: 806/463, loss: 0.0007306336192414165 2023-01-22 17:37:13.378784: step: 808/463, loss: 0.057785943150520325 2023-01-22 17:37:14.012274: step: 810/463, loss: 0.09021037817001343 2023-01-22 17:37:14.668704: step: 812/463, loss: 0.04687533155083656 2023-01-22 17:37:15.346174: step: 814/463, loss: 0.07298246026039124 2023-01-22 17:37:16.032400: step: 816/463, loss: 0.010638000443577766 2023-01-22 17:37:16.639722: step: 818/463, loss: 0.009270724840462208 2023-01-22 17:37:17.231486: step: 820/463, loss: 
0.0015388599131256342 2023-01-22 17:37:17.897027: step: 822/463, loss: 0.03384757786989212 2023-01-22 17:37:18.530520: step: 824/463, loss: 0.0296478308737278 2023-01-22 17:37:19.167082: step: 826/463, loss: 0.37040817737579346 2023-01-22 17:37:19.821066: step: 828/463, loss: 0.03569178283214569 2023-01-22 17:37:20.484295: step: 830/463, loss: 0.530291736125946 2023-01-22 17:37:21.077342: step: 832/463, loss: 0.003370628459379077 2023-01-22 17:37:21.699211: step: 834/463, loss: 0.006556604988873005 2023-01-22 17:37:22.277106: step: 836/463, loss: 0.0011065591825172305 2023-01-22 17:37:22.824703: step: 838/463, loss: 0.38811179995536804 2023-01-22 17:37:23.444032: step: 840/463, loss: 0.0963120386004448 2023-01-22 17:37:24.168743: step: 842/463, loss: 0.029486194252967834 2023-01-22 17:37:24.741018: step: 844/463, loss: 0.03785359859466553 2023-01-22 17:37:25.287385: step: 846/463, loss: 0.012951135635375977 2023-01-22 17:37:25.923452: step: 848/463, loss: 0.0036848343443125486 2023-01-22 17:37:26.489600: step: 850/463, loss: 0.10988558828830719 2023-01-22 17:37:27.189347: step: 852/463, loss: 0.08377789705991745 2023-01-22 17:37:27.877716: step: 854/463, loss: 0.05468481034040451 2023-01-22 17:37:28.465458: step: 856/463, loss: 0.060352668166160583 2023-01-22 17:37:29.087782: step: 858/463, loss: 0.00737422751262784 2023-01-22 17:37:29.689547: step: 860/463, loss: 0.02507697604596615 2023-01-22 17:37:30.299591: step: 862/463, loss: 0.03152686357498169 2023-01-22 17:37:30.882763: step: 864/463, loss: 0.1401437520980835 2023-01-22 17:37:31.509368: step: 866/463, loss: 0.007679919246584177 2023-01-22 17:37:32.184869: step: 868/463, loss: 0.27416884899139404 2023-01-22 17:37:32.895331: step: 870/463, loss: 0.11166494339704514 2023-01-22 17:37:33.477014: step: 872/463, loss: 0.07156647741794586 2023-01-22 17:37:34.054729: step: 874/463, loss: 0.01060706190764904 2023-01-22 17:37:34.681771: step: 876/463, loss: 0.03908117488026619 2023-01-22 17:37:35.294776: step: 
878/463, loss: 0.022888747975230217 2023-01-22 17:37:35.894282: step: 880/463, loss: 0.0176005270332098 2023-01-22 17:37:36.498219: step: 882/463, loss: 0.007385856471955776 2023-01-22 17:37:37.085744: step: 884/463, loss: 0.00019545118266250938 2023-01-22 17:37:37.687351: step: 886/463, loss: 0.04316885769367218 2023-01-22 17:37:38.318014: step: 888/463, loss: 0.02838277630507946 2023-01-22 17:37:38.925300: step: 890/463, loss: 0.01970914751291275 2023-01-22 17:37:39.429881: step: 892/463, loss: 0.004516608081758022 2023-01-22 17:37:40.029857: step: 894/463, loss: 0.006696379743516445 2023-01-22 17:37:40.716736: step: 896/463, loss: 0.036292966455221176 2023-01-22 17:37:41.365059: step: 898/463, loss: 0.6178606748580933 2023-01-22 17:37:41.930860: step: 900/463, loss: 0.016748595982789993 2023-01-22 17:37:42.550465: step: 902/463, loss: 0.02405557781457901 2023-01-22 17:37:43.234085: step: 904/463, loss: 0.017630890011787415 2023-01-22 17:37:43.867201: step: 906/463, loss: 0.03331703692674637 2023-01-22 17:37:44.482958: step: 908/463, loss: 0.16933690011501312 2023-01-22 17:37:45.104841: step: 910/463, loss: 0.027453407645225525 2023-01-22 17:37:45.763908: step: 912/463, loss: 0.04985061287879944 2023-01-22 17:37:46.453199: step: 914/463, loss: 0.05505552142858505 2023-01-22 17:37:47.049626: step: 916/463, loss: 0.045101724565029144 2023-01-22 17:37:47.612549: step: 918/463, loss: 0.042703140527009964 2023-01-22 17:37:48.179184: step: 920/463, loss: 0.005263745319098234 2023-01-22 17:37:48.773020: step: 922/463, loss: 0.10761218518018723 2023-01-22 17:37:49.420235: step: 924/463, loss: 0.12171320617198944 2023-01-22 17:37:50.057964: step: 926/463, loss: 0.04482921585440636
==================================================
Loss: 0.058
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2866532456356037, 'r': 0.31602568446733537, 'f1': 0.300623710675606}, 'combined': 0.221512207866236, 'epoch': 25}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3694899052560343, 'r': 0.29958640966705485, 'f1': 0.3308864823188367}, 'combined': 0.23278445992279972, 'epoch': 25}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2964895546659995, 'r': 0.3212439007861209, 'f1': 0.30837073900598494}, 'combined': 0.22722054453072574, 'epoch': 25}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3696017850411415, 'r': 0.29291022022876867, 'f1': 0.32681714260933625}, 'combined': 0.23204017125262874, 'epoch': 25}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2982403288950908, 'r': 0.3208771660028776, 'f1': 0.30914491130441774}, 'combined': 0.22779098727693936, 'epoch': 25}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3877496750283486, 'r': 0.2926731840309961, 'f1': 0.3335688402779926}, 'combined': 0.23683387659737473, 'epoch': 25}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2976190476190476, 'r': 0.41666666666666663, 'f1': 0.34722222222222227}, 'combined': 0.2314814814814815, 'epoch': 25}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2777777777777778, 'r': 0.43478260869565216, 'f1': 0.3389830508474576}, 'combined': 0.1694915254237288, 'epoch': 25}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39705882352941174, 'r': 0.23275862068965517, 'f1': 0.2934782608695652}, 'combined': 0.19565217391304346, 'epoch': 25}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 26
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size
16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 17:40:29.033332: step: 2/463, loss: 0.01220590341836214 2023-01-22 17:40:29.651610: step: 4/463, loss: 0.007525493856519461 2023-01-22 17:40:30.258181: step: 6/463, loss: 0.016267795115709305 2023-01-22 17:40:30.847615: step: 8/463, loss: 0.004680731799453497 2023-01-22 17:40:31.470755: step: 10/463, loss: 0.007684738840907812 2023-01-22 17:40:32.188852: step: 12/463, loss: 0.03322313725948334 2023-01-22 17:40:32.869125: step: 14/463, loss: 0.1109473928809166 2023-01-22 17:40:33.537619: step: 16/463, loss: 0.011600861325860023 2023-01-22 17:40:34.210364: step: 18/463, loss: 0.008591043762862682 2023-01-22 17:40:34.817173: step: 20/463, loss: 0.02250751107931137 2023-01-22 17:40:35.398837: step: 22/463, loss: 0.007856126874685287 2023-01-22 17:40:36.016997: step: 24/463, loss: 0.02368726022541523 2023-01-22 17:40:36.634316: step: 26/463, loss: 0.06515151262283325 2023-01-22 17:40:37.243097: step: 28/463, loss: 0.01061059720814228 2023-01-22 17:40:37.868681: step: 30/463, loss: 0.006478564348071814 2023-01-22 17:40:38.473227: step: 32/463, loss: 0.007756749168038368 2023-01-22 17:40:39.076452: step: 34/463, loss: 0.008657277561724186 2023-01-22 17:40:39.763581: step: 36/463, loss: 0.013586904853582382 2023-01-22 17:40:40.368206: step: 38/463, loss: 0.0002604983455967158 2023-01-22 17:40:41.049826: step: 40/463, loss: 0.04485365003347397 2023-01-22 17:40:41.690711: step: 42/463, loss: 0.02987808920443058 2023-01-22 17:40:42.351913: step: 44/463, loss: 0.004077726509422064 2023-01-22 17:40:42.995289: step: 46/463, loss: 0.023542793467640877 2023-01-22 17:40:43.604573: step: 48/463, loss: 0.06271526962518692 2023-01-22 17:40:44.153771: step: 50/463, loss: 0.043441154062747955 2023-01-22 17:40:44.783175: step: 52/463, loss: 0.09954315423965454 2023-01-22 17:40:45.385554: step: 54/463, loss: 0.025452645495533943 2023-01-22 
17:40:46.081764: step: 56/463, loss: 0.047646086663007736 2023-01-22 17:40:46.644737: step: 58/463, loss: 0.004461673554033041 2023-01-22 17:40:47.339944: step: 60/463, loss: 0.008077072910964489 2023-01-22 17:40:48.121813: step: 62/463, loss: 0.04395684227347374 2023-01-22 17:40:48.770939: step: 64/463, loss: 0.059292446821928024 2023-01-22 17:40:49.371049: step: 66/463, loss: 0.04674339294433594 2023-01-22 17:40:49.995594: step: 68/463, loss: 0.01653953082859516 2023-01-22 17:40:50.726770: step: 70/463, loss: 0.0017598519334569573 2023-01-22 17:40:51.304772: step: 72/463, loss: 0.007758632767945528 2023-01-22 17:40:51.887226: step: 74/463, loss: 0.006688870955258608 2023-01-22 17:40:52.503852: step: 76/463, loss: 0.06000211462378502 2023-01-22 17:40:53.116879: step: 78/463, loss: 0.019265038892626762 2023-01-22 17:40:53.752172: step: 80/463, loss: 0.22867916524410248 2023-01-22 17:40:54.334929: step: 82/463, loss: 0.02815896086394787 2023-01-22 17:40:54.967770: step: 84/463, loss: 0.10128021985292435 2023-01-22 17:40:55.641920: step: 86/463, loss: 0.011117551475763321 2023-01-22 17:40:56.246753: step: 88/463, loss: 0.04503503441810608 2023-01-22 17:40:56.829777: step: 90/463, loss: 0.012142233550548553 2023-01-22 17:40:57.461732: step: 92/463, loss: 0.003159452462568879 2023-01-22 17:40:58.095253: step: 94/463, loss: 0.0660816878080368 2023-01-22 17:40:58.652248: step: 96/463, loss: 0.015071950852870941 2023-01-22 17:40:59.275308: step: 98/463, loss: 0.05544154718518257 2023-01-22 17:40:59.940250: step: 100/463, loss: 0.11105155199766159 2023-01-22 17:41:00.548738: step: 102/463, loss: 0.0050207022577524185 2023-01-22 17:41:01.178746: step: 104/463, loss: 0.09417281299829483 2023-01-22 17:41:01.789933: step: 106/463, loss: 0.5499095916748047 2023-01-22 17:41:02.422965: step: 108/463, loss: 0.00154082418885082 2023-01-22 17:41:03.058934: step: 110/463, loss: 0.014493829570710659 2023-01-22 17:41:03.681008: step: 112/463, loss: 0.005074814427644014 2023-01-22 
17:41:04.309097: step: 114/463, loss: 0.012423597276210785 2023-01-22 17:41:04.932788: step: 116/463, loss: 0.01714438945055008 2023-01-22 17:41:05.664845: step: 118/463, loss: 0.09503595530986786 2023-01-22 17:41:06.330709: step: 120/463, loss: 0.00632100785151124 2023-01-22 17:41:07.037446: step: 122/463, loss: 0.07028459012508392 2023-01-22 17:41:07.715610: step: 124/463, loss: 0.004885394591838121 2023-01-22 17:41:08.329841: step: 126/463, loss: 0.026978055015206337 2023-01-22 17:41:08.888203: step: 128/463, loss: 0.009684693068265915 2023-01-22 17:41:09.436554: step: 130/463, loss: 0.0017024094704538584 2023-01-22 17:41:10.042348: step: 132/463, loss: 0.051269400864839554 2023-01-22 17:41:10.668972: step: 134/463, loss: 0.0035013891756534576 2023-01-22 17:41:11.240668: step: 136/463, loss: 0.014035666361451149 2023-01-22 17:41:11.841582: step: 138/463, loss: 0.004264821298420429 2023-01-22 17:41:12.480238: step: 140/463, loss: 0.0512673445045948 2023-01-22 17:41:13.170231: step: 142/463, loss: 0.039616815745830536 2023-01-22 17:41:13.758787: step: 144/463, loss: 0.0040939683094620705 2023-01-22 17:41:14.349613: step: 146/463, loss: 0.001456791302189231 2023-01-22 17:41:14.871890: step: 148/463, loss: 0.015326387248933315 2023-01-22 17:41:15.463457: step: 150/463, loss: 0.009537910111248493 2023-01-22 17:41:16.107467: step: 152/463, loss: 0.02939922735095024 2023-01-22 17:41:16.674315: step: 154/463, loss: 0.04579417034983635 2023-01-22 17:41:17.297547: step: 156/463, loss: 0.043767645955085754 2023-01-22 17:41:17.943510: step: 158/463, loss: 0.03526806831359863 2023-01-22 17:41:18.504305: step: 160/463, loss: 0.007558304350823164 2023-01-22 17:41:19.154588: step: 162/463, loss: 0.02205314300954342 2023-01-22 17:41:19.791706: step: 164/463, loss: 0.0002891231852117926 2023-01-22 17:41:20.415066: step: 166/463, loss: 0.10035229474306107 2023-01-22 17:41:21.093467: step: 168/463, loss: 0.03530791029334068 2023-01-22 17:41:21.747024: step: 170/463, loss: 
0.08970101922750473 2023-01-22 17:41:22.366563: step: 172/463, loss: 0.01310124434530735 2023-01-22 17:41:22.957786: step: 174/463, loss: 0.04036356881260872 2023-01-22 17:41:23.600614: step: 176/463, loss: 0.04655942693352699 2023-01-22 17:41:24.242035: step: 178/463, loss: 1.300333023071289 2023-01-22 17:41:24.869072: step: 180/463, loss: 0.3467961847782135 2023-01-22 17:41:25.463001: step: 182/463, loss: 0.053278595209121704 2023-01-22 17:41:26.139420: step: 184/463, loss: 0.007122400216758251 2023-01-22 17:41:26.714305: step: 186/463, loss: 0.015062619000673294 2023-01-22 17:41:27.319100: step: 188/463, loss: 0.050801608711481094 2023-01-22 17:41:27.959256: step: 190/463, loss: 0.014414542354643345 2023-01-22 17:41:28.535341: step: 192/463, loss: 0.05244988203048706 2023-01-22 17:41:29.112395: step: 194/463, loss: 0.006414277013391256 2023-01-22 17:41:29.710072: step: 196/463, loss: 0.028647569939494133 2023-01-22 17:41:30.242523: step: 198/463, loss: 0.060875605791807175 2023-01-22 17:41:30.865808: step: 200/463, loss: 0.005998514126986265 2023-01-22 17:41:31.530978: step: 202/463, loss: 0.010954087600111961 2023-01-22 17:41:32.133728: step: 204/463, loss: 0.007254703901708126 2023-01-22 17:41:32.808286: step: 206/463, loss: 0.059198930859565735 2023-01-22 17:41:33.498699: step: 208/463, loss: 0.04088457673788071 2023-01-22 17:41:34.059430: step: 210/463, loss: 0.019292255863547325 2023-01-22 17:41:34.686488: step: 212/463, loss: 0.08872763067483902 2023-01-22 17:41:35.278903: step: 214/463, loss: 0.03282894194126129 2023-01-22 17:41:35.905200: step: 216/463, loss: 0.037816330790519714 2023-01-22 17:41:36.616249: step: 218/463, loss: 0.032770853489637375 2023-01-22 17:41:37.183303: step: 220/463, loss: 0.01736808381974697 2023-01-22 17:41:37.744158: step: 222/463, loss: 0.0008901652763597667 2023-01-22 17:41:38.339068: step: 224/463, loss: 0.0069746533408761024 2023-01-22 17:41:38.946753: step: 226/463, loss: 0.020527873188257217 2023-01-22 17:41:39.555739: 
step: 228/463, loss: 0.0837789922952652 2023-01-22 17:41:40.089596: step: 230/463, loss: 0.07999513298273087 2023-01-22 17:41:40.781706: step: 232/463, loss: 0.0622144490480423 2023-01-22 17:41:41.388116: step: 234/463, loss: 0.018369058147072792 2023-01-22 17:41:41.977009: step: 236/463, loss: 0.44357410073280334 2023-01-22 17:41:42.692615: step: 238/463, loss: 0.056041594594717026 2023-01-22 17:41:43.307055: step: 240/463, loss: 0.0360465869307518 2023-01-22 17:41:43.968773: step: 242/463, loss: 0.662790060043335 2023-01-22 17:41:44.582916: step: 244/463, loss: 0.026837073266506195 2023-01-22 17:41:45.199077: step: 246/463, loss: 0.047510746866464615 2023-01-22 17:41:45.782201: step: 248/463, loss: 0.0026059444062411785 2023-01-22 17:41:46.424467: step: 250/463, loss: 0.0025001391768455505 2023-01-22 17:41:46.995208: step: 252/463, loss: 0.014354495331645012 2023-01-22 17:41:47.574738: step: 254/463, loss: 0.013815448619425297 2023-01-22 17:41:48.155189: step: 256/463, loss: 0.010786745697259903 2023-01-22 17:41:48.883293: step: 258/463, loss: 2.470428228378296 2023-01-22 17:41:49.578663: step: 260/463, loss: 0.013407396152615547 2023-01-22 17:41:50.226554: step: 262/463, loss: 0.02417255938053131 2023-01-22 17:41:50.863259: step: 264/463, loss: 0.002357170917093754 2023-01-22 17:41:51.482775: step: 266/463, loss: 0.011398532427847385 2023-01-22 17:41:52.115793: step: 268/463, loss: 0.026300424709916115 2023-01-22 17:41:52.746267: step: 270/463, loss: 0.006126223132014275 2023-01-22 17:41:53.396910: step: 272/463, loss: 0.019339794293045998 2023-01-22 17:41:54.018754: step: 274/463, loss: 0.02685556374490261 2023-01-22 17:41:54.677115: step: 276/463, loss: 0.22054935991764069 2023-01-22 17:41:55.323429: step: 278/463, loss: 0.01454430352896452 2023-01-22 17:41:55.911632: step: 280/463, loss: 0.3280911445617676 2023-01-22 17:41:56.528021: step: 282/463, loss: 0.007779586128890514 2023-01-22 17:41:57.108689: step: 284/463, loss: 0.0026116182561963797 2023-01-22 
17:41:57.733412: step: 286/463, loss: 0.08688265830278397 2023-01-22 17:41:58.449184: step: 288/463, loss: 0.007146784570068121 2023-01-22 17:41:58.991666: step: 290/463, loss: 0.10163905471563339 2023-01-22 17:41:59.561314: step: 292/463, loss: 0.016556203365325928 2023-01-22 17:42:00.161317: step: 294/463, loss: 0.01717689260840416 2023-01-22 17:42:00.817288: step: 296/463, loss: 0.08650849759578705 2023-01-22 17:42:01.433658: step: 298/463, loss: 0.014537579379975796 2023-01-22 17:42:02.042581: step: 300/463, loss: 0.02015146240592003 2023-01-22 17:42:02.647813: step: 302/463, loss: 0.04602815583348274 2023-01-22 17:42:03.341921: step: 304/463, loss: 0.13226287066936493 2023-01-22 17:42:03.923047: step: 306/463, loss: 0.019310399889945984 2023-01-22 17:42:04.667099: step: 308/463, loss: 0.021726345643401146 2023-01-22 17:42:05.267838: step: 310/463, loss: 0.0012280370574444532 2023-01-22 17:42:05.851685: step: 312/463, loss: 0.04339618608355522 2023-01-22 17:42:06.459850: step: 314/463, loss: 0.020027853548526764 2023-01-22 17:42:07.083797: step: 316/463, loss: 0.7567735910415649 2023-01-22 17:42:07.701259: step: 318/463, loss: 0.03300265222787857 2023-01-22 17:42:08.347751: step: 320/463, loss: 0.5429046154022217 2023-01-22 17:42:08.981336: step: 322/463, loss: 0.037947602570056915 2023-01-22 17:42:09.565353: step: 324/463, loss: 0.1647876650094986 2023-01-22 17:42:10.177756: step: 326/463, loss: 0.01415654830634594 2023-01-22 17:42:10.781385: step: 328/463, loss: 0.012166880071163177 2023-01-22 17:42:11.417110: step: 330/463, loss: 0.011832972057163715 2023-01-22 17:42:12.110287: step: 332/463, loss: 0.016252709552645683 2023-01-22 17:42:12.692510: step: 334/463, loss: 0.0003199432685505599 2023-01-22 17:42:13.234410: step: 336/463, loss: 0.026687612757086754 2023-01-22 17:42:13.877248: step: 338/463, loss: 0.004762224853038788 2023-01-22 17:42:14.458636: step: 340/463, loss: 0.0013707904145121574 2023-01-22 17:42:15.100175: step: 342/463, loss: 
0.016454612836241722 2023-01-22 17:42:15.753690: step: 344/463, loss: 0.017940031364560127 2023-01-22 17:42:16.332447: step: 346/463, loss: 0.04070261865854263 2023-01-22 17:42:16.931693: step: 348/463, loss: 0.042921535670757294 2023-01-22 17:42:17.520689: step: 350/463, loss: 0.1565055549144745 2023-01-22 17:42:18.091713: step: 352/463, loss: 0.002980774501338601 2023-01-22 17:42:18.815230: step: 354/463, loss: 0.021961189806461334 2023-01-22 17:42:19.426390: step: 356/463, loss: 2.072334051132202 2023-01-22 17:42:20.072533: step: 358/463, loss: 0.06832674890756607 2023-01-22 17:42:20.670931: step: 360/463, loss: 0.0011128151090815663 2023-01-22 17:42:21.276874: step: 362/463, loss: 0.01487234327942133 2023-01-22 17:42:21.881301: step: 364/463, loss: 0.054214052855968475 2023-01-22 17:42:22.523586: step: 366/463, loss: 0.03981756418943405 2023-01-22 17:42:23.113641: step: 368/463, loss: 0.005580809898674488 2023-01-22 17:42:23.725654: step: 370/463, loss: 0.008592989295721054 2023-01-22 17:42:24.309364: step: 372/463, loss: 0.0004932968295179307 2023-01-22 17:42:24.904347: step: 374/463, loss: 0.00399392144754529 2023-01-22 17:42:25.529286: step: 376/463, loss: 0.4164958894252777 2023-01-22 17:42:26.202960: step: 378/463, loss: 0.0024854878429323435 2023-01-22 17:42:26.746561: step: 380/463, loss: 0.02760501764714718 2023-01-22 17:42:27.347942: step: 382/463, loss: 0.03295260667800903 2023-01-22 17:42:27.938590: step: 384/463, loss: 0.006325080059468746 2023-01-22 17:42:28.530663: step: 386/463, loss: 0.07014255225658417 2023-01-22 17:42:29.226906: step: 388/463, loss: 0.0013382347533479333 2023-01-22 17:42:29.842083: step: 390/463, loss: 0.02432812750339508 2023-01-22 17:42:30.429120: step: 392/463, loss: 0.15580353140830994 2023-01-22 17:42:31.107164: step: 394/463, loss: 0.026020050048828125 2023-01-22 17:42:31.696480: step: 396/463, loss: 0.37998542189598083 2023-01-22 17:42:32.324424: step: 398/463, loss: 0.024236757308244705 2023-01-22 17:42:32.915817: 
step: 400/463, loss: 0.0006453873938880861 2023-01-22 17:42:33.528703: step: 402/463, loss: 0.029128113761544228 2023-01-22 17:42:34.103375: step: 404/463, loss: 0.04767274111509323 2023-01-22 17:42:34.739229: step: 406/463, loss: 0.0418865829706192 2023-01-22 17:42:35.309803: step: 408/463, loss: 0.008699294179677963 2023-01-22 17:42:35.894784: step: 410/463, loss: 0.028960194438695908 2023-01-22 17:42:36.544450: step: 412/463, loss: 0.02379652112722397 2023-01-22 17:42:37.184811: step: 414/463, loss: 0.0662323608994484 2023-01-22 17:42:37.851274: step: 416/463, loss: 0.1485428810119629 2023-01-22 17:42:38.443336: step: 418/463, loss: 0.014355101622641087 2023-01-22 17:42:39.062929: step: 420/463, loss: 0.005887479521334171 2023-01-22 17:42:39.692791: step: 422/463, loss: 0.2752608060836792 2023-01-22 17:42:40.312764: step: 424/463, loss: 0.013058009557425976 2023-01-22 17:42:41.006955: step: 426/463, loss: 0.002241884358227253 2023-01-22 17:42:41.548130: step: 428/463, loss: 0.007715311367064714 2023-01-22 17:42:42.111658: step: 430/463, loss: 0.06277306377887726 2023-01-22 17:42:42.708869: step: 432/463, loss: 0.024087119847536087 2023-01-22 17:42:43.345703: step: 434/463, loss: 0.00212637591175735 2023-01-22 17:42:43.976226: step: 436/463, loss: 0.006012834142893553 2023-01-22 17:42:44.620658: step: 438/463, loss: 0.09374812245368958 2023-01-22 17:42:45.211015: step: 440/463, loss: 0.010396486148238182 2023-01-22 17:42:45.824000: step: 442/463, loss: 0.01348874345421791 2023-01-22 17:42:46.383500: step: 444/463, loss: 0.013059775345027447 2023-01-22 17:42:47.045446: step: 446/463, loss: 0.012737814337015152 2023-01-22 17:42:47.633789: step: 448/463, loss: 0.006129831075668335 2023-01-22 17:42:48.237645: step: 450/463, loss: 0.06853323429822922 2023-01-22 17:42:48.845005: step: 452/463, loss: 0.004811081103980541 2023-01-22 17:42:49.460849: step: 454/463, loss: 0.03206224367022514 2023-01-22 17:42:50.027921: step: 456/463, loss: 0.11015310883522034 2023-01-22 
17:42:50.593075: step: 458/463, loss: 0.027929261326789856 2023-01-22 17:42:51.238843: step: 460/463, loss: 0.21594659984111786 2023-01-22 17:42:51.818539: step: 462/463, loss: 0.0057710688561201096 2023-01-22 17:42:52.440913: step: 464/463, loss: 0.019618649035692215 2023-01-22 17:42:53.021556: step: 466/463, loss: 0.019396010786294937 2023-01-22 17:42:53.667070: step: 468/463, loss: 0.038183875381946564 2023-01-22 17:42:54.269568: step: 470/463, loss: 0.005402226932346821 2023-01-22 17:42:54.901097: step: 472/463, loss: 0.02538665011525154 2023-01-22 17:42:55.512986: step: 474/463, loss: 0.018288346007466316 2023-01-22 17:42:56.115329: step: 476/463, loss: 0.10018619149923325 2023-01-22 17:42:56.752146: step: 478/463, loss: 0.03464306890964508 2023-01-22 17:42:57.400256: step: 480/463, loss: 0.014261183328926563 2023-01-22 17:42:57.985854: step: 482/463, loss: 0.0014354573795571923 2023-01-22 17:42:58.599783: step: 484/463, loss: 0.014875952154397964 2023-01-22 17:42:59.195711: step: 486/463, loss: 0.0024372367188334465 2023-01-22 17:42:59.844156: step: 488/463, loss: 0.017565980553627014 2023-01-22 17:43:00.455847: step: 490/463, loss: 0.10486725717782974 2023-01-22 17:43:01.027652: step: 492/463, loss: 0.03343328461050987 2023-01-22 17:43:01.641600: step: 494/463, loss: 0.00021012530487496406 2023-01-22 17:43:02.311688: step: 496/463, loss: 0.3929195702075958 2023-01-22 17:43:02.942478: step: 498/463, loss: 0.06423081457614899 2023-01-22 17:43:03.511697: step: 500/463, loss: 0.031972289085388184 2023-01-22 17:43:04.121434: step: 502/463, loss: 0.13389240205287933 2023-01-22 17:43:04.731377: step: 504/463, loss: 0.007267009001225233 2023-01-22 17:43:05.365241: step: 506/463, loss: 0.06748101115226746 2023-01-22 17:43:05.970920: step: 508/463, loss: 0.038641318678855896 2023-01-22 17:43:06.541730: step: 510/463, loss: 0.00047687129699625075 2023-01-22 17:43:07.174871: step: 512/463, loss: 0.0006077192956581712 2023-01-22 17:43:07.803721: step: 514/463, loss: 
0.006717605981975794 2023-01-22 17:43:08.381995: step: 516/463, loss: 1.6518760919570923 2023-01-22 17:43:09.016800: step: 518/463, loss: 0.08417150378227234 2023-01-22 17:43:09.652229: step: 520/463, loss: 0.012581528164446354 2023-01-22 17:43:10.270943: step: 522/463, loss: 0.027379076927900314 2023-01-22 17:43:10.928327: step: 524/463, loss: 0.0022716736420989037 2023-01-22 17:43:11.519795: step: 526/463, loss: 0.05606410279870033 2023-01-22 17:43:12.103467: step: 528/463, loss: 2.7191085815429688 2023-01-22 17:43:12.731333: step: 530/463, loss: 0.02138979360461235 2023-01-22 17:43:13.352616: step: 532/463, loss: 0.0332799106836319 2023-01-22 17:43:14.022453: step: 534/463, loss: 0.005720691289752722 2023-01-22 17:43:14.633258: step: 536/463, loss: 0.048267681151628494 2023-01-22 17:43:15.286976: step: 538/463, loss: 0.037405870854854584 2023-01-22 17:43:15.901769: step: 540/463, loss: 0.011664239689707756 2023-01-22 17:43:16.506835: step: 542/463, loss: 0.025763077661395073 2023-01-22 17:43:17.092042: step: 544/463, loss: 0.005322271957993507 2023-01-22 17:43:17.655056: step: 546/463, loss: 0.033957891166210175 2023-01-22 17:43:18.249803: step: 548/463, loss: 0.02395976334810257 2023-01-22 17:43:18.846365: step: 550/463, loss: 0.0911819115281105 2023-01-22 17:43:19.520445: step: 552/463, loss: 0.02012833021581173 2023-01-22 17:43:20.184263: step: 554/463, loss: 0.027553854510188103 2023-01-22 17:43:20.776394: step: 556/463, loss: 0.08565418422222137 2023-01-22 17:43:21.382177: step: 558/463, loss: 0.007254458498209715 2023-01-22 17:43:22.061721: step: 560/463, loss: 0.0073805614374578 2023-01-22 17:43:22.620796: step: 562/463, loss: 0.005820074584335089 2023-01-22 17:43:23.255498: step: 564/463, loss: 0.005134861450642347 2023-01-22 17:43:23.897590: step: 566/463, loss: 0.0003374096122570336 2023-01-22 17:43:24.505937: step: 568/463, loss: 0.01328819990158081 2023-01-22 17:43:25.129567: step: 570/463, loss: 0.11937854439020157 2023-01-22 17:43:25.738546: step: 
572/463, loss: 0.002269263844937086 2023-01-22 17:43:26.370070: step: 574/463, loss: 0.04149191081523895 2023-01-22 17:43:26.972811: step: 576/463, loss: 0.0007395473658107221 2023-01-22 17:43:27.553131: step: 578/463, loss: 0.021636318415403366 2023-01-22 17:43:28.229501: step: 580/463, loss: 0.040761176496744156 2023-01-22 17:43:28.812932: step: 582/463, loss: 0.0013148257276043296 2023-01-22 17:43:29.517097: step: 584/463, loss: 0.004243991803377867 2023-01-22 17:43:30.146018: step: 586/463, loss: 0.6195448637008667 2023-01-22 17:43:30.755181: step: 588/463, loss: 0.011664263904094696 2023-01-22 17:43:31.312732: step: 590/463, loss: 0.004717049654573202 2023-01-22 17:43:31.943430: step: 592/463, loss: 0.013969927094876766 2023-01-22 17:43:32.619027: step: 594/463, loss: 0.14267340302467346 2023-01-22 17:43:33.232373: step: 596/463, loss: 0.01867334172129631 2023-01-22 17:43:33.854022: step: 598/463, loss: 0.009579035453498363 2023-01-22 17:43:34.455809: step: 600/463, loss: 0.01106257364153862 2023-01-22 17:43:35.046423: step: 602/463, loss: 0.005207826383411884 2023-01-22 17:43:35.634809: step: 604/463, loss: 0.000493346422445029 2023-01-22 17:43:36.288890: step: 606/463, loss: 0.06467792391777039 2023-01-22 17:43:36.831938: step: 608/463, loss: 0.1683456152677536 2023-01-22 17:43:37.456764: step: 610/463, loss: 0.021216217428445816 2023-01-22 17:43:38.120523: step: 612/463, loss: 0.06379719823598862 2023-01-22 17:43:38.751692: step: 614/463, loss: 0.00026635205722413957 2023-01-22 17:43:39.503135: step: 616/463, loss: 0.07286352664232254 2023-01-22 17:43:40.155305: step: 618/463, loss: 0.009033137932419777 2023-01-22 17:43:40.802425: step: 620/463, loss: 0.06926427036523819 2023-01-22 17:43:41.405467: step: 622/463, loss: 0.016620252281427383 2023-01-22 17:43:42.051649: step: 624/463, loss: 0.012910572811961174 2023-01-22 17:43:42.592628: step: 626/463, loss: 0.010569962672889233 2023-01-22 17:43:43.206465: step: 628/463, loss: 0.004260397516191006 2023-01-22 
17:43:43.819648: step: 630/463, loss: 0.03356446325778961 2023-01-22 17:43:44.462396: step: 632/463, loss: 0.06495002657175064 2023-01-22 17:43:45.081130: step: 634/463, loss: 0.036372724920511246 2023-01-22 17:43:45.716863: step: 636/463, loss: 0.0016160152154043317 2023-01-22 17:43:46.343025: step: 638/463, loss: 0.11685538291931152 2023-01-22 17:43:46.941703: step: 640/463, loss: 0.004129200242459774 2023-01-22 17:43:47.536586: step: 642/463, loss: 0.030844535678625107 2023-01-22 17:43:48.147024: step: 644/463, loss: 0.02487705834209919 2023-01-22 17:43:48.793252: step: 646/463, loss: 0.04912056401371956 2023-01-22 17:43:49.417404: step: 648/463, loss: 0.07829011231660843 2023-01-22 17:43:50.072640: step: 650/463, loss: 0.0733383446931839 2023-01-22 17:43:50.710506: step: 652/463, loss: 0.012831995263695717 2023-01-22 17:43:51.369897: step: 654/463, loss: 1.0788288116455078 2023-01-22 17:43:51.996114: step: 656/463, loss: 0.0054109832271933556 2023-01-22 17:43:52.609661: step: 658/463, loss: 0.0188837181776762 2023-01-22 17:43:53.230551: step: 660/463, loss: 0.016306791454553604 2023-01-22 17:43:53.868684: step: 662/463, loss: 0.05278490111231804 2023-01-22 17:43:54.485754: step: 664/463, loss: 0.01989087462425232 2023-01-22 17:43:55.174609: step: 666/463, loss: 0.020794618874788284 2023-01-22 17:43:55.797558: step: 668/463, loss: 0.0035405997186899185 2023-01-22 17:43:56.422245: step: 670/463, loss: 0.028991803526878357 2023-01-22 17:43:57.035848: step: 672/463, loss: 0.06302163749933243 2023-01-22 17:43:57.614443: step: 674/463, loss: 0.042631831020116806 2023-01-22 17:43:58.229341: step: 676/463, loss: 0.031978484243154526 2023-01-22 17:43:58.821367: step: 678/463, loss: 0.009777568280696869 2023-01-22 17:43:59.519750: step: 680/463, loss: 0.003939659800380468 2023-01-22 17:44:00.132657: step: 682/463, loss: 0.08121895045042038 2023-01-22 17:44:00.726506: step: 684/463, loss: 0.04567919299006462 2023-01-22 17:44:01.384422: step: 686/463, loss: 
1.0407085418701172 2023-01-22 17:44:01.994105: step: 688/463, loss: 0.03207026422023773 2023-01-22 17:44:02.580890: step: 690/463, loss: 0.01913168840110302 2023-01-22 17:44:03.214104: step: 692/463, loss: 0.05123714357614517 2023-01-22 17:44:03.781714: step: 694/463, loss: 0.038987718522548676 2023-01-22 17:44:04.469590: step: 696/463, loss: 0.05064193904399872 2023-01-22 17:44:05.107156: step: 698/463, loss: 0.08313120901584625 2023-01-22 17:44:05.746273: step: 700/463, loss: 0.10021159052848816 2023-01-22 17:44:06.377167: step: 702/463, loss: 0.03009776398539543 2023-01-22 17:44:06.971505: step: 704/463, loss: 0.159305140376091 2023-01-22 17:44:07.649291: step: 706/463, loss: 0.06675783544778824 2023-01-22 17:44:08.230212: step: 708/463, loss: 0.019637413322925568 2023-01-22 17:44:08.845439: step: 710/463, loss: 0.027885576710104942 2023-01-22 17:44:09.463311: step: 712/463, loss: 0.019769128412008286 2023-01-22 17:44:09.994742: step: 714/463, loss: 0.024496685713529587 2023-01-22 17:44:10.646404: step: 716/463, loss: 0.02972661517560482 2023-01-22 17:44:11.322597: step: 718/463, loss: 0.003959205932915211 2023-01-22 17:44:11.968605: step: 720/463, loss: 0.01716112717986107 2023-01-22 17:44:12.594406: step: 722/463, loss: 0.009717060253024101 2023-01-22 17:44:13.195828: step: 724/463, loss: 0.05289917066693306 2023-01-22 17:44:13.902189: step: 726/463, loss: 0.08873023837804794 2023-01-22 17:44:14.472737: step: 728/463, loss: 0.005750670563429594 2023-01-22 17:44:15.099736: step: 730/463, loss: 0.0030100075528025627 2023-01-22 17:44:15.756281: step: 732/463, loss: 0.003160548396408558 2023-01-22 17:44:16.417285: step: 734/463, loss: 0.01783859170973301 2023-01-22 17:44:17.092460: step: 736/463, loss: 0.030152346938848495 2023-01-22 17:44:17.766070: step: 738/463, loss: 0.03679962456226349 2023-01-22 17:44:18.366964: step: 740/463, loss: 0.025984447449445724 2023-01-22 17:44:19.020052: step: 742/463, loss: 0.027999315410852432 2023-01-22 17:44:19.590642: step: 
744/463, loss: 0.0006463401950895786 2023-01-22 17:44:20.233387: step: 746/463, loss: 0.026396969333291054 2023-01-22 17:44:20.840834: step: 748/463, loss: 0.014458159916102886 2023-01-22 17:44:21.554922: step: 750/463, loss: 0.016929537057876587 2023-01-22 17:44:22.135220: step: 752/463, loss: 1.898078203201294 2023-01-22 17:44:22.742977: step: 754/463, loss: 0.05364091694355011 2023-01-22 17:44:23.301673: step: 756/463, loss: 0.016118738800287247 2023-01-22 17:44:24.062147: step: 758/463, loss: 0.051836829632520676 2023-01-22 17:44:24.656859: step: 760/463, loss: 0.0297658983618021 2023-01-22 17:44:25.268527: step: 762/463, loss: 0.001830625464208424 2023-01-22 17:44:25.923526: step: 764/463, loss: 0.022167814895510674 2023-01-22 17:44:26.616274: step: 766/463, loss: 0.0028956737369298935 2023-01-22 17:44:27.211668: step: 768/463, loss: 0.21833539009094238 2023-01-22 17:44:27.803068: step: 770/463, loss: 0.009094377048313618 2023-01-22 17:44:28.440101: step: 772/463, loss: 0.01965837925672531 2023-01-22 17:44:29.075428: step: 774/463, loss: 0.00923532247543335 2023-01-22 17:44:29.687974: step: 776/463, loss: 0.061453089118003845 2023-01-22 17:44:30.336210: step: 778/463, loss: 0.02154242806136608 2023-01-22 17:44:30.933036: step: 780/463, loss: 0.009268476627767086 2023-01-22 17:44:31.550408: step: 782/463, loss: 0.011497462168335915 2023-01-22 17:44:32.110500: step: 784/463, loss: 0.006660569459199905 2023-01-22 17:44:32.955873: step: 786/463, loss: 0.031129861250519753 2023-01-22 17:44:33.575980: step: 788/463, loss: 0.017950469627976418 2023-01-22 17:44:34.294718: step: 790/463, loss: 0.013328289613127708 2023-01-22 17:44:35.026506: step: 792/463, loss: 0.020557183772325516 2023-01-22 17:44:35.684235: step: 794/463, loss: 0.01604551076889038 2023-01-22 17:44:36.314356: step: 796/463, loss: 0.03984198346734047 2023-01-22 17:44:36.932807: step: 798/463, loss: 0.017685605213046074 2023-01-22 17:44:37.569390: step: 800/463, loss: 0.016408517956733704 2023-01-22 
17:44:38.176256: step: 802/463, loss: 0.01498230267316103 2023-01-22 17:44:38.826790: step: 804/463, loss: 0.03806765377521515 2023-01-22 17:44:39.403764: step: 806/463, loss: 0.005920288618654013 2023-01-22 17:44:39.962332: step: 808/463, loss: 0.017071209847927094 2023-01-22 17:44:40.574197: step: 810/463, loss: 0.0038283951580524445 2023-01-22 17:44:41.228862: step: 812/463, loss: 0.012544974684715271 2023-01-22 17:44:41.916665: step: 814/463, loss: 0.07949940115213394 2023-01-22 17:44:42.540509: step: 816/463, loss: 0.009659403935074806 2023-01-22 17:44:43.177901: step: 818/463, loss: 0.004589417949318886 2023-01-22 17:44:43.746806: step: 820/463, loss: 0.02763715386390686 2023-01-22 17:44:44.344185: step: 822/463, loss: 0.05164521560072899 2023-01-22 17:44:44.963516: step: 824/463, loss: 0.09015657007694244 2023-01-22 17:44:45.583363: step: 826/463, loss: 0.0049230121076107025 2023-01-22 17:44:46.186137: step: 828/463, loss: 0.06094367429614067 2023-01-22 17:44:46.800503: step: 830/463, loss: 0.031519025564193726 2023-01-22 17:44:47.347907: step: 832/463, loss: 0.029998885467648506 2023-01-22 17:44:47.977441: step: 834/463, loss: 0.34587058424949646 2023-01-22 17:44:48.638742: step: 836/463, loss: 0.1585591584444046 2023-01-22 17:44:49.257988: step: 838/463, loss: 0.06283228099346161 2023-01-22 17:44:49.875105: step: 840/463, loss: 0.013429426588118076 2023-01-22 17:44:50.575542: step: 842/463, loss: 0.01095914002507925 2023-01-22 17:44:51.183353: step: 844/463, loss: 0.014152479358017445 2023-01-22 17:44:51.786064: step: 846/463, loss: 0.010221419855952263 2023-01-22 17:44:52.411236: step: 848/463, loss: 0.08376269042491913 2023-01-22 17:44:53.107471: step: 850/463, loss: 0.06738091260194778 2023-01-22 17:44:53.775374: step: 852/463, loss: 0.013817171566188335 2023-01-22 17:44:54.400443: step: 854/463, loss: 0.03279763460159302 2023-01-22 17:44:54.984416: step: 856/463, loss: 0.029351856559515 2023-01-22 17:44:55.665626: step: 858/463, loss: 
0.04950103908777237 2023-01-22 17:44:56.239251: step: 860/463, loss: 0.005169173236936331 2023-01-22 17:44:56.941101: step: 862/463, loss: 0.1452832669019699 2023-01-22 17:44:57.520211: step: 864/463, loss: 0.0041940221562981606 2023-01-22 17:44:58.100396: step: 866/463, loss: 0.10963336378335953 2023-01-22 17:44:58.700344: step: 868/463, loss: 0.019062159582972527 2023-01-22 17:44:59.338774: step: 870/463, loss: 0.13614334166049957 2023-01-22 17:44:59.954657: step: 872/463, loss: 0.0030576647259294987 2023-01-22 17:45:00.584486: step: 874/463, loss: 0.02599555440247059 2023-01-22 17:45:01.185812: step: 876/463, loss: 0.005957301706075668 2023-01-22 17:45:01.820301: step: 878/463, loss: 0.020934727042913437 2023-01-22 17:45:02.462229: step: 880/463, loss: 0.007874351926147938 2023-01-22 17:45:03.043206: step: 882/463, loss: 0.0076484824530780315 2023-01-22 17:45:03.630942: step: 884/463, loss: 0.0020763282664120197 2023-01-22 17:45:04.247521: step: 886/463, loss: 0.05656498670578003 2023-01-22 17:45:04.857767: step: 888/463, loss: 0.006346619687974453 2023-01-22 17:45:05.530987: step: 890/463, loss: 0.03908694535493851 2023-01-22 17:45:06.162431: step: 892/463, loss: 0.014977081678807735 2023-01-22 17:45:06.801542: step: 894/463, loss: 0.008493378758430481 2023-01-22 17:45:07.391690: step: 896/463, loss: 0.012959162704646587 2023-01-22 17:45:08.013470: step: 898/463, loss: 0.00481724739074707 2023-01-22 17:45:08.676369: step: 900/463, loss: 0.04692864418029785 2023-01-22 17:45:09.302615: step: 902/463, loss: 0.011521544307470322 2023-01-22 17:45:09.908428: step: 904/463, loss: 0.03224186599254608 2023-01-22 17:45:10.514955: step: 906/463, loss: 0.01155626680701971 2023-01-22 17:45:11.184750: step: 908/463, loss: 0.013143565505743027 2023-01-22 17:45:11.865213: step: 910/463, loss: 0.021662337705492973 2023-01-22 17:45:12.493457: step: 912/463, loss: 0.029251230880618095 2023-01-22 17:45:13.140842: step: 914/463, loss: 0.003696347586810589 2023-01-22 
17:45:13.786916: step: 916/463, loss: 0.013457286171615124 2023-01-22 17:45:14.416260: step: 918/463, loss: 0.00865876954048872 2023-01-22 17:45:15.034411: step: 920/463, loss: 0.05250321701169014 2023-01-22 17:45:15.642907: step: 922/463, loss: 0.030419936403632164 2023-01-22 17:45:16.275828: step: 924/463, loss: 0.005583269987255335 2023-01-22 17:45:16.892481: step: 926/463, loss: 0.07217348366975784
==================================================
Loss: 0.075
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26506746626686656, 'r': 0.33548387096774196, 'f1': 0.2961474036850921}, 'combined': 0.21821387639954157, 'epoch': 26}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3307320270681909, 'r': 0.3246767763546495, 'f1': 0.3276764298097518}, 'combined': 0.23052613152444854, 'epoch': 26}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2749617151607963, 'r': 0.3407020872865275, 'f1': 0.3043220338983051}, 'combined': 0.2242372881355932, 'epoch': 26}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3324458667368832, 'r': 0.32519988010355966, 'f1': 0.3287829550275742}, 'combined': 0.23343589806957768, 'epoch': 26}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2833776393098427, 'r': 0.34897929395083094, 'f1': 0.3127756597144352}, 'combined': 0.23046627557905752, 'epoch': 26}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35040652739500383, 'r': 0.3164666281447024, 'f1': 0.33257290770639975}, 'combined': 0.23612676447154382, 'epoch': 26}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24513888888888888, 'r': 0.4202380952380952, 'f1': 0.3096491228070175}, 'combined': 0.20643274853801166, 'epoch': 26}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26973684210526316, 'r': 0.44565217391304346, 'f1': 0.3360655737704918}, 'combined': 0.1680327868852459, 'epoch': 26}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3, 'r': 0.20689655172413793, 'f1': 0.24489795918367346}, 'combined': 0.16326530612244897, 'epoch': 26}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 27
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 17:47:57.124522: step: 2/463, loss: 0.03846801444888115 2023-01-22 17:47:57.758987: step: 4/463, loss: 0.03114727884531021 2023-01-22 17:47:58.389321: step: 6/463, loss: 0.011592866852879524 2023-01-22 17:47:58.961410: step: 8/463, loss: 0.004234970547258854 2023-01-22 17:47:59.574400: step: 10/463, loss: 0.02401907928287983 2023-01-22 17:48:00.277270: step: 12/463, loss: 0.04923490434885025 2023-01-22 17:48:00.809212: step: 14/463, loss: 0.03334368020296097 2023-01-22 17:48:01.430680: step: 16/463, loss: 0.0096847889944911 2023-01-22 17:48:01.993840: step: 18/463, loss: 0.0017464231932535768 2023-01-22 17:48:02.611682: step: 20/463, loss: 0.00818121898919344 2023-01-22 17:48:03.233000: step: 22/463, loss: 0.06419355422258377 2023-01-22 17:48:03.816932: step: 24/463, loss: 0.012695678509771824 2023-01-22 17:48:04.428594: step: 26/463, loss: 0.015393340960144997 2023-01-22 17:48:05.067325: step: 28/463, loss: 0.041216179728507996 2023-01-22 17:48:05.734964: step: 30/463, loss: 0.00272354856133461 2023-01-22 17:48:06.334285: step: 32/463, loss: 0.01804034784436226 2023-01-22 17:48:07.005360: step: 34/463, loss: 0.0015794079517945647 2023-01-22
17:48:07.583814: step: 36/463, loss: 0.0941072329878807 2023-01-22 17:48:08.156613: step: 38/463, loss: 0.019646015018224716 2023-01-22 17:48:08.739844: step: 40/463, loss: 0.012248197570443153 2023-01-22 17:48:09.300077: step: 42/463, loss: 0.005155832506716251 2023-01-22 17:48:09.930166: step: 44/463, loss: 0.1626175194978714 2023-01-22 17:48:10.555249: step: 46/463, loss: 0.0161720160394907 2023-01-22 17:48:11.144832: step: 48/463, loss: 0.016141356900334358 2023-01-22 17:48:11.725657: step: 50/463, loss: 0.0012848081532865763 2023-01-22 17:48:12.409257: step: 52/463, loss: 0.008192689158022404 2023-01-22 17:48:13.021188: step: 54/463, loss: 0.0033828255254775286 2023-01-22 17:48:13.582338: step: 56/463, loss: 0.019882885739207268 2023-01-22 17:48:14.086560: step: 58/463, loss: 0.0020799366757273674 2023-01-22 17:48:14.643385: step: 60/463, loss: 0.03481445088982582 2023-01-22 17:48:15.391876: step: 62/463, loss: 0.000505155767314136 2023-01-22 17:48:16.026409: step: 64/463, loss: 0.020395614206790924 2023-01-22 17:48:16.607988: step: 66/463, loss: 0.006576713640242815 2023-01-22 17:48:17.251108: step: 68/463, loss: 0.03270922228693962 2023-01-22 17:48:17.820474: step: 70/463, loss: 0.010687068104743958 2023-01-22 17:48:18.403145: step: 72/463, loss: 0.007401933427900076 2023-01-22 17:48:19.061718: step: 74/463, loss: 0.01328960433602333 2023-01-22 17:48:19.790655: step: 76/463, loss: 0.026214729994535446 2023-01-22 17:48:20.460266: step: 78/463, loss: 0.002396677155047655 2023-01-22 17:48:21.077555: step: 80/463, loss: 0.002170937368646264 2023-01-22 17:48:21.744372: step: 82/463, loss: 0.021579084917902946 2023-01-22 17:48:22.423917: step: 84/463, loss: 0.0018461729632690549 2023-01-22 17:48:23.061507: step: 86/463, loss: 0.003414551494643092 2023-01-22 17:48:23.653843: step: 88/463, loss: 0.0010635770158842206 2023-01-22 17:48:24.294703: step: 90/463, loss: 0.03402366116642952 2023-01-22 17:48:24.901732: step: 92/463, loss: 5.3132484026718885e-05 2023-01-22 
17:48:25.593722: step: 94/463, loss: 0.008998529054224491 2023-01-22 17:48:26.139779: step: 96/463, loss: 0.001385695650242269 2023-01-22 17:48:26.758170: step: 98/463, loss: 0.0016499008052051067 2023-01-22 17:48:27.386122: step: 100/463, loss: 1.1373722553253174 2023-01-22 17:48:28.002683: step: 102/463, loss: 0.0013120889198035002 2023-01-22 17:48:28.606460: step: 104/463, loss: 0.002140115015208721 2023-01-22 17:48:29.170873: step: 106/463, loss: 0.005088759120553732 2023-01-22 17:48:29.742672: step: 108/463, loss: 7.312182424357161e-05 2023-01-22 17:48:30.338023: step: 110/463, loss: 0.01854015700519085 2023-01-22 17:48:30.937865: step: 112/463, loss: 0.2706995904445648 2023-01-22 17:48:31.573356: step: 114/463, loss: 0.00043776820530183613 2023-01-22 17:48:32.215136: step: 116/463, loss: 0.029114680364727974 2023-01-22 17:48:32.889095: step: 118/463, loss: 0.14264056086540222 2023-01-22 17:48:33.483971: step: 120/463, loss: 0.016254017129540443 2023-01-22 17:48:34.055435: step: 122/463, loss: 0.004264076706022024 2023-01-22 17:48:34.695161: step: 124/463, loss: 0.028481610119342804 2023-01-22 17:48:35.320906: step: 126/463, loss: 0.04463396593928337 2023-01-22 17:48:35.918401: step: 128/463, loss: 0.010214912705123425 2023-01-22 17:48:36.604964: step: 130/463, loss: 0.22810405492782593 2023-01-22 17:48:37.196079: step: 132/463, loss: 0.005645947065204382 2023-01-22 17:48:37.827226: step: 134/463, loss: 0.0010425091022625566 2023-01-22 17:48:38.477095: step: 136/463, loss: 0.026518329977989197 2023-01-22 17:48:39.058153: step: 138/463, loss: 0.00014999997802078724 2023-01-22 17:48:39.635682: step: 140/463, loss: 8.400728984270245e-05 2023-01-22 17:48:40.226904: step: 142/463, loss: 0.0066820476204156876 2023-01-22 17:48:40.849491: step: 144/463, loss: 0.03395969793200493 2023-01-22 17:48:41.580851: step: 146/463, loss: 0.001538825803436339 2023-01-22 17:48:42.214984: step: 148/463, loss: 0.1174059510231018 2023-01-22 17:48:42.772031: step: 150/463, loss: 
0.06088556721806526 2023-01-22 17:48:43.402696: step: 152/463, loss: 0.09464927762746811 2023-01-22 17:48:43.964554: step: 154/463, loss: 0.03171466290950775 2023-01-22 17:48:44.573606: step: 156/463, loss: 0.07049822062253952 2023-01-22 17:48:45.169053: step: 158/463, loss: 0.020116835832595825 2023-01-22 17:48:45.795259: step: 160/463, loss: 0.006592441350221634 2023-01-22 17:48:46.356518: step: 162/463, loss: 2.008444607781712e-05 2023-01-22 17:48:47.003797: step: 164/463, loss: 0.02699263207614422 2023-01-22 17:48:47.635626: step: 166/463, loss: 0.06426060944795609 2023-01-22 17:48:48.204522: step: 168/463, loss: 0.004334310535341501 2023-01-22 17:48:48.979867: step: 170/463, loss: 0.03040197119116783 2023-01-22 17:48:49.559545: step: 172/463, loss: 0.009321453049778938 2023-01-22 17:48:50.214163: step: 174/463, loss: 0.0031520274933427572 2023-01-22 17:48:50.837902: step: 176/463, loss: 0.020025089383125305 2023-01-22 17:48:51.418714: step: 178/463, loss: 0.0021182165946811438 2023-01-22 17:48:52.006514: step: 180/463, loss: 0.012605207972228527 2023-01-22 17:48:52.610323: step: 182/463, loss: 0.03446768596768379 2023-01-22 17:48:53.201886: step: 184/463, loss: 0.052701640874147415 2023-01-22 17:48:53.813112: step: 186/463, loss: 0.21379327774047852 2023-01-22 17:48:54.430541: step: 188/463, loss: 0.2432553619146347 2023-01-22 17:48:55.018715: step: 190/463, loss: 0.0143883703276515 2023-01-22 17:48:55.619290: step: 192/463, loss: 0.08418628573417664 2023-01-22 17:48:56.272332: step: 194/463, loss: 0.04601965472102165 2023-01-22 17:48:56.894853: step: 196/463, loss: 0.04428742453455925 2023-01-22 17:48:57.547492: step: 198/463, loss: 0.053761180490255356 2023-01-22 17:48:58.202556: step: 200/463, loss: 0.013337770476937294 2023-01-22 17:48:58.832013: step: 202/463, loss: 0.0003529027453623712 2023-01-22 17:48:59.498571: step: 204/463, loss: 0.009401666931807995 2023-01-22 17:49:00.193056: step: 206/463, loss: 0.008197514340281487 2023-01-22 17:49:00.889033: 
step: 208/463, loss: 0.0647740289568901 2023-01-22 17:49:01.545660: step: 210/463, loss: 0.0023389416746795177 2023-01-22 17:49:02.100384: step: 212/463, loss: 0.0004937934572808444 2023-01-22 17:49:02.662029: step: 214/463, loss: 0.004447128623723984 2023-01-22 17:49:03.286179: step: 216/463, loss: 0.03374769911170006 2023-01-22 17:49:03.885671: step: 218/463, loss: 0.028546040877699852 2023-01-22 17:49:04.497179: step: 220/463, loss: 0.05634113401174545 2023-01-22 17:49:05.163449: step: 222/463, loss: 0.02697850950062275 2023-01-22 17:49:05.767821: step: 224/463, loss: 0.020151084288954735 2023-01-22 17:49:06.449933: step: 226/463, loss: 0.01193307526409626 2023-01-22 17:49:07.121430: step: 228/463, loss: 0.01497825887054205 2023-01-22 17:49:07.714595: step: 230/463, loss: 0.07419648766517639 2023-01-22 17:49:08.328986: step: 232/463, loss: 0.006083178799599409 2023-01-22 17:49:08.902650: step: 234/463, loss: 0.032527390867471695 2023-01-22 17:49:09.468003: step: 236/463, loss: 0.016671963036060333 2023-01-22 17:49:10.115361: step: 238/463, loss: 0.03314630314707756 2023-01-22 17:49:10.738433: step: 240/463, loss: 0.03589486703276634 2023-01-22 17:49:11.349512: step: 242/463, loss: 0.045082490891218185 2023-01-22 17:49:11.971280: step: 244/463, loss: 0.06520739942789078 2023-01-22 17:49:12.581224: step: 246/463, loss: 0.000122411860502325 2023-01-22 17:49:13.146587: step: 248/463, loss: 0.05110474303364754 2023-01-22 17:49:13.806634: step: 250/463, loss: 0.016275979578495026 2023-01-22 17:49:14.441368: step: 252/463, loss: 0.12060437351465225 2023-01-22 17:49:15.014948: step: 254/463, loss: 0.006911230273544788 2023-01-22 17:49:15.594733: step: 256/463, loss: 0.006004971917718649 2023-01-22 17:49:16.213110: step: 258/463, loss: 0.029703857377171516 2023-01-22 17:49:16.798227: step: 260/463, loss: 0.0025412356480956078 2023-01-22 17:49:17.432215: step: 262/463, loss: 0.07979313284158707 2023-01-22 17:49:18.066172: step: 264/463, loss: 0.006945250555872917 
2023-01-22 17:49:18.803271: step: 266/463, loss: 0.49371370673179626 2023-01-22 17:49:19.431223: step: 268/463, loss: 0.03181619942188263 2023-01-22 17:49:20.055748: step: 270/463, loss: 0.018052563071250916 2023-01-22 17:49:20.700811: step: 272/463, loss: 0.01817806251347065 2023-01-22 17:49:21.363295: step: 274/463, loss: 0.03369865193963051 2023-01-22 17:49:21.984324: step: 276/463, loss: 0.0019523120718076825 2023-01-22 17:49:22.591620: step: 278/463, loss: 0.04299550503492355 2023-01-22 17:49:23.176962: step: 280/463, loss: 0.1027774065732956 2023-01-22 17:49:23.795505: step: 282/463, loss: 0.015144970268011093 2023-01-22 17:49:24.456600: step: 284/463, loss: 0.03970795497298241 2023-01-22 17:49:25.062056: step: 286/463, loss: 0.0022635182831436396 2023-01-22 17:49:25.621389: step: 288/463, loss: 0.006016434635967016 2023-01-22 17:49:26.237364: step: 290/463, loss: 0.03252415359020233 2023-01-22 17:49:26.834114: step: 292/463, loss: 0.0009435771498829126 2023-01-22 17:49:27.528255: step: 294/463, loss: 0.0051262229681015015 2023-01-22 17:49:28.161352: step: 296/463, loss: 0.00801279116421938 2023-01-22 17:49:28.698080: step: 298/463, loss: 0.0008969748159870505 2023-01-22 17:49:29.360622: step: 300/463, loss: 0.0290631502866745 2023-01-22 17:49:30.025548: step: 302/463, loss: 0.016329893842339516 2023-01-22 17:49:30.700337: step: 304/463, loss: 0.014553033746778965 2023-01-22 17:49:31.364720: step: 306/463, loss: 0.0007963245152495801 2023-01-22 17:49:32.022488: step: 308/463, loss: 0.0658935010433197 2023-01-22 17:49:32.628680: step: 310/463, loss: 0.0008466828148812056 2023-01-22 17:49:33.209494: step: 312/463, loss: 0.03963599354028702 2023-01-22 17:49:33.895664: step: 314/463, loss: 0.03279987350106239 2023-01-22 17:49:34.480849: step: 316/463, loss: 0.001783805899322033 2023-01-22 17:49:35.145252: step: 318/463, loss: 0.02451697178184986 2023-01-22 17:49:35.818190: step: 320/463, loss: 0.0094827339053154 2023-01-22 17:49:36.451031: step: 322/463, loss: 
0.00896123144775629 2023-01-22 17:49:37.064255: step: 324/463, loss: 0.017012618482112885 2023-01-22 17:49:37.643785: step: 326/463, loss: 0.01277776900678873 2023-01-22 17:49:38.271864: step: 328/463, loss: 0.008567464537918568 2023-01-22 17:49:38.874251: step: 330/463, loss: 0.006242402829229832 2023-01-22 17:49:39.502380: step: 332/463, loss: 0.013295956887304783 2023-01-22 17:49:40.051841: step: 334/463, loss: 0.04005291312932968 2023-01-22 17:49:40.728644: step: 336/463, loss: 0.049425654113292694 2023-01-22 17:49:41.317214: step: 338/463, loss: 0.09296686202287674 2023-01-22 17:49:41.937999: step: 340/463, loss: 0.0006993450806476176 2023-01-22 17:49:42.582351: step: 342/463, loss: 0.032701026648283005 2023-01-22 17:49:43.195105: step: 344/463, loss: 0.012323413044214249 2023-01-22 17:49:43.792765: step: 346/463, loss: 0.022011296823620796 2023-01-22 17:49:44.345295: step: 348/463, loss: 0.01293986663222313 2023-01-22 17:49:44.939554: step: 350/463, loss: 0.000851975753903389 2023-01-22 17:49:45.582533: step: 352/463, loss: 0.008460157550871372 2023-01-22 17:49:46.191919: step: 354/463, loss: 0.004642071668058634 2023-01-22 17:49:46.785653: step: 356/463, loss: 0.03562808781862259 2023-01-22 17:49:47.411104: step: 358/463, loss: 0.0018659631023183465 2023-01-22 17:49:48.020983: step: 360/463, loss: 0.0018742562970146537 2023-01-22 17:49:48.651014: step: 362/463, loss: 0.04680521786212921 2023-01-22 17:49:49.357759: step: 364/463, loss: 0.0010150556918233633 2023-01-22 17:49:50.121773: step: 366/463, loss: 0.0008003339753486216 2023-01-22 17:49:50.760463: step: 368/463, loss: 0.007737286388874054 2023-01-22 17:49:51.371360: step: 370/463, loss: 0.003575053997337818 2023-01-22 17:49:52.037554: step: 372/463, loss: 0.007499197497963905 2023-01-22 17:49:52.785499: step: 374/463, loss: 0.01739656925201416 2023-01-22 17:49:53.452579: step: 376/463, loss: 0.09678056836128235 2023-01-22 17:49:54.050583: step: 378/463, loss: 0.01632765121757984 2023-01-22 
17:49:54.629480: step: 380/463, loss: 0.055970560759305954 2023-01-22 17:49:55.262424: step: 382/463, loss: 0.014666725881397724 2023-01-22 17:49:55.874403: step: 384/463, loss: 0.09449894726276398 2023-01-22 17:49:56.514633: step: 386/463, loss: 0.03287149965763092 2023-01-22 17:49:57.167779: step: 388/463, loss: 0.019780142232775688 2023-01-22 17:49:57.860663: step: 390/463, loss: 0.015939654782414436 2023-01-22 17:49:58.491116: step: 392/463, loss: 0.06114884465932846 2023-01-22 17:49:59.206471: step: 394/463, loss: 0.04510142281651497 2023-01-22 17:49:59.872603: step: 396/463, loss: 0.08135179430246353 2023-01-22 17:50:00.462077: step: 398/463, loss: 0.039826322346925735 2023-01-22 17:50:01.154064: step: 400/463, loss: 0.031221801415085793 2023-01-22 17:50:01.774704: step: 402/463, loss: 0.004980822093784809 2023-01-22 17:50:02.379860: step: 404/463, loss: 0.005884864833205938 2023-01-22 17:50:02.998034: step: 406/463, loss: 0.0027543706819415092 2023-01-22 17:50:03.634302: step: 408/463, loss: 0.006009845063090324 2023-01-22 17:50:04.274841: step: 410/463, loss: 0.005450126715004444 2023-01-22 17:50:04.905713: step: 412/463, loss: 0.004920670296996832 2023-01-22 17:50:05.467196: step: 414/463, loss: 0.007633294444531202 2023-01-22 17:50:06.114680: step: 416/463, loss: 0.0388934500515461 2023-01-22 17:50:06.718453: step: 418/463, loss: 0.001295076566748321 2023-01-22 17:50:07.408065: step: 420/463, loss: 0.013074057176709175 2023-01-22 17:50:08.018905: step: 422/463, loss: 0.13556058704853058 2023-01-22 17:50:08.615756: step: 424/463, loss: 0.0011264777276664972 2023-01-22 17:50:09.231772: step: 426/463, loss: 0.028429098427295685 2023-01-22 17:50:09.905410: step: 428/463, loss: 0.11666837334632874 2023-01-22 17:50:10.514543: step: 430/463, loss: 0.04956163838505745 2023-01-22 17:50:11.144643: step: 432/463, loss: 0.0011699004098773003 2023-01-22 17:50:11.774074: step: 434/463, loss: 0.007975371554493904 2023-01-22 17:50:12.377255: step: 436/463, loss: 
0.013124348595738411 2023-01-22 17:50:13.002799: step: 438/463, loss: 0.0005725275841541588 2023-01-22 17:50:13.640580: step: 440/463, loss: 0.016216294839978218 2023-01-22 17:50:14.300909: step: 442/463, loss: 0.05559993162751198 2023-01-22 17:50:14.971286: step: 444/463, loss: 0.03783676400780678 2023-01-22 17:50:15.615636: step: 446/463, loss: 0.0441446453332901 2023-01-22 17:50:16.284443: step: 448/463, loss: 0.02010415866971016 2023-01-22 17:50:16.906478: step: 450/463, loss: 0.01194583810865879 2023-01-22 17:50:17.535380: step: 452/463, loss: 0.05752135068178177 2023-01-22 17:50:18.107185: step: 454/463, loss: 0.04000825434923172 2023-01-22 17:50:18.681206: step: 456/463, loss: 0.021434279158711433 2023-01-22 17:50:19.346824: step: 458/463, loss: 0.1959368884563446 2023-01-22 17:50:19.956040: step: 460/463, loss: 0.029529722407460213 2023-01-22 17:50:20.492123: step: 462/463, loss: 0.0033024826552718878 2023-01-22 17:50:21.058560: step: 464/463, loss: 0.22429098188877106 2023-01-22 17:50:21.644644: step: 466/463, loss: 0.0024790188763290644 2023-01-22 17:50:22.264541: step: 468/463, loss: 0.027486253529787064 2023-01-22 17:50:22.810851: step: 470/463, loss: 0.029336288571357727 2023-01-22 17:50:23.413659: step: 472/463, loss: 0.0404227189719677 2023-01-22 17:50:24.016678: step: 474/463, loss: 0.055558349937200546 2023-01-22 17:50:24.693177: step: 476/463, loss: 0.014828120358288288 2023-01-22 17:50:25.291994: step: 478/463, loss: 0.012185693718492985 2023-01-22 17:50:25.906303: step: 480/463, loss: 0.012028262950479984 2023-01-22 17:50:26.437192: step: 482/463, loss: 0.0011768975527957082 2023-01-22 17:50:27.018527: step: 484/463, loss: 0.021456733345985413 2023-01-22 17:50:27.622133: step: 486/463, loss: 0.06720185279846191 2023-01-22 17:50:28.247301: step: 488/463, loss: 0.055892761796712875 2023-01-22 17:50:28.905095: step: 490/463, loss: 0.014233998022973537 2023-01-22 17:50:29.542395: step: 492/463, loss: 0.018311746418476105 2023-01-22 17:50:30.140561: 
step: 494/463, loss: 0.002983683720231056 2023-01-22 17:50:30.864394: step: 496/463, loss: 0.07616374641656876 2023-01-22 17:50:31.475322: step: 498/463, loss: 0.04599510878324509 2023-01-22 17:50:32.122625: step: 500/463, loss: 0.0023552519269287586 2023-01-22 17:50:32.693404: step: 502/463, loss: 0.008894763886928558 2023-01-22 17:50:33.225911: step: 504/463, loss: 0.004337077494710684 2023-01-22 17:50:33.837848: step: 506/463, loss: 0.022625703364610672 2023-01-22 17:50:34.491924: step: 508/463, loss: 0.012854417786002159 2023-01-22 17:50:35.111466: step: 510/463, loss: 0.032762154936790466 2023-01-22 17:50:35.731950: step: 512/463, loss: 0.015778588131070137 2023-01-22 17:50:36.407536: step: 514/463, loss: 0.05038512125611305 2023-01-22 17:50:37.064887: step: 516/463, loss: 0.038881588727235794 2023-01-22 17:50:37.695401: step: 518/463, loss: 0.028790676966309547 2023-01-22 17:50:38.267901: step: 520/463, loss: 0.015478965826332569 2023-01-22 17:50:38.935687: step: 522/463, loss: 0.024034997448325157 2023-01-22 17:50:39.542731: step: 524/463, loss: 0.2900469899177551 2023-01-22 17:50:40.138767: step: 526/463, loss: 0.0015439905691891909 2023-01-22 17:50:40.706620: step: 528/463, loss: 0.008807475678622723 2023-01-22 17:50:41.322566: step: 530/463, loss: 0.024467293173074722 2023-01-22 17:50:41.951187: step: 532/463, loss: 0.017341451719403267 2023-01-22 17:50:42.663808: step: 534/463, loss: 1.5919641256332397 2023-01-22 17:50:43.292420: step: 536/463, loss: 0.06444763392210007 2023-01-22 17:50:43.926493: step: 538/463, loss: 0.03531820327043533 2023-01-22 17:50:44.568235: step: 540/463, loss: 0.01727011799812317 2023-01-22 17:50:45.219071: step: 542/463, loss: 0.026101933792233467 2023-01-22 17:50:45.886934: step: 544/463, loss: 0.04611241817474365 2023-01-22 17:50:46.484472: step: 546/463, loss: 0.0071188402362167835 2023-01-22 17:50:47.126323: step: 548/463, loss: 0.005557433236390352 2023-01-22 17:50:47.800444: step: 550/463, loss: 0.031933631747961044 
2023-01-22 17:50:48.451007: step: 552/463, loss: 0.0020858561620116234 2023-01-22 17:50:49.140717: step: 554/463, loss: 0.016088435426354408 2023-01-22 17:50:49.794463: step: 556/463, loss: 0.021044744178652763 2023-01-22 17:50:50.337735: step: 558/463, loss: 0.004519319627434015 2023-01-22 17:50:50.980095: step: 560/463, loss: 0.02332056500017643 2023-01-22 17:50:51.542008: step: 562/463, loss: 0.0058340709656476974 2023-01-22 17:50:52.189760: step: 564/463, loss: 0.0010396204888820648 2023-01-22 17:50:52.753648: step: 566/463, loss: 7.118176290532574e-05 2023-01-22 17:50:53.349883: step: 568/463, loss: 0.012243904173374176 2023-01-22 17:50:54.011210: step: 570/463, loss: 0.0006087793735787272 2023-01-22 17:50:54.634953: step: 572/463, loss: 0.0018195939483121037 2023-01-22 17:50:55.262496: step: 574/463, loss: 0.011768065392971039 2023-01-22 17:50:55.912110: step: 576/463, loss: 0.013620640151202679 2023-01-22 17:50:56.539133: step: 578/463, loss: 0.0046589127741754055 2023-01-22 17:50:57.225544: step: 580/463, loss: 0.0009700771188363433 2023-01-22 17:50:57.905592: step: 582/463, loss: 0.2398383617401123 2023-01-22 17:50:58.554861: step: 584/463, loss: 0.07839155942201614 2023-01-22 17:50:59.193763: step: 586/463, loss: 0.015046199783682823 2023-01-22 17:50:59.804440: step: 588/463, loss: 0.06470783054828644 2023-01-22 17:51:00.424558: step: 590/463, loss: 2.0666937828063965 2023-01-22 17:51:01.107981: step: 592/463, loss: 0.03148055076599121 2023-01-22 17:51:01.724839: step: 594/463, loss: 0.008366445079445839 2023-01-22 17:51:02.327904: step: 596/463, loss: 0.060453373938798904 2023-01-22 17:51:02.934472: step: 598/463, loss: 0.03336908668279648 2023-01-22 17:51:03.602133: step: 600/463, loss: 0.018315425142645836 2023-01-22 17:51:04.137983: step: 602/463, loss: 0.007002854719758034 2023-01-22 17:51:04.797834: step: 604/463, loss: 0.15297657251358032 2023-01-22 17:51:05.398770: step: 606/463, loss: 0.35663720965385437 2023-01-22 17:51:06.025485: step: 608/463, 
loss: 0.019266217947006226 2023-01-22 17:51:06.694666: step: 610/463, loss: 0.00553420465439558 2023-01-22 17:51:07.310690: step: 612/463, loss: 0.015519029460847378 2023-01-22 17:51:07.899838: step: 614/463, loss: 0.00269878632389009 2023-01-22 17:51:08.596041: step: 616/463, loss: 0.000998870120383799 2023-01-22 17:51:09.298162: step: 618/463, loss: 0.01718381978571415 2023-01-22 17:51:09.907378: step: 620/463, loss: 0.0034887201618403196 2023-01-22 17:51:10.459161: step: 622/463, loss: 0.004243575967848301 2023-01-22 17:51:11.073325: step: 624/463, loss: 0.04647991433739662 2023-01-22 17:51:11.733973: step: 626/463, loss: 0.004498799331486225 2023-01-22 17:51:12.344826: step: 628/463, loss: 0.01623944751918316 2023-01-22 17:51:12.941897: step: 630/463, loss: 0.00011566172179300338 2023-01-22 17:51:13.550914: step: 632/463, loss: 0.02832302451133728 2023-01-22 17:51:14.169419: step: 634/463, loss: 0.03712830692529678 2023-01-22 17:51:14.815177: step: 636/463, loss: 0.02037656307220459 2023-01-22 17:51:15.518673: step: 638/463, loss: 0.0339631661772728 2023-01-22 17:51:16.120492: step: 640/463, loss: 0.003495069220662117 2023-01-22 17:51:16.716434: step: 642/463, loss: 0.024904631078243256 2023-01-22 17:51:17.520602: step: 644/463, loss: 0.0036753127351403236 2023-01-22 17:51:18.168077: step: 646/463, loss: 0.29219168424606323 2023-01-22 17:51:18.920385: step: 648/463, loss: 0.037397466599941254 2023-01-22 17:51:19.580203: step: 650/463, loss: 0.024118728935718536 2023-01-22 17:51:20.143902: step: 652/463, loss: 0.0019974722526967525 2023-01-22 17:51:20.805271: step: 654/463, loss: 0.008478906005620956 2023-01-22 17:51:21.462396: step: 656/463, loss: 0.2616731524467468 2023-01-22 17:51:22.012954: step: 658/463, loss: 0.0003566553059499711 2023-01-22 17:51:22.631233: step: 660/463, loss: 0.042264457792043686 2023-01-22 17:51:23.221302: step: 662/463, loss: 0.008090407587587833 2023-01-22 17:51:23.809519: step: 664/463, loss: 0.0006050234078429639 2023-01-22 
17:51:24.427121: step: 666/463, loss: 0.04029412940144539 2023-01-22 17:51:25.049834: step: 668/463, loss: 0.012490830384194851 2023-01-22 17:51:25.647484: step: 670/463, loss: 0.028311122208833694 2023-01-22 17:51:26.240754: step: 672/463, loss: 0.07672730088233948 2023-01-22 17:51:26.890509: step: 674/463, loss: 0.06956350803375244 2023-01-22 17:51:27.518136: step: 676/463, loss: 0.12294165045022964 2023-01-22 17:51:28.164675: step: 678/463, loss: 0.04267501458525658 2023-01-22 17:51:28.744272: step: 680/463, loss: 0.001772976596839726 2023-01-22 17:51:29.327737: step: 682/463, loss: 0.005126665811985731 2023-01-22 17:51:29.965372: step: 684/463, loss: 0.18588408827781677 2023-01-22 17:51:30.637113: step: 686/463, loss: 0.07507473230361938 2023-01-22 17:51:31.248023: step: 688/463, loss: 0.003553441260010004 2023-01-22 17:51:31.872324: step: 690/463, loss: 0.0017299929168075323 2023-01-22 17:51:32.425704: step: 692/463, loss: 0.03263545781373978 2023-01-22 17:51:33.030448: step: 694/463, loss: 0.005403709597885609 2023-01-22 17:51:33.753116: step: 696/463, loss: 0.016429226845502853 2023-01-22 17:51:34.389481: step: 698/463, loss: 0.012417491525411606 2023-01-22 17:51:34.955336: step: 700/463, loss: 0.019027722999453545 2023-01-22 17:51:35.593230: step: 702/463, loss: 0.04937572032213211 2023-01-22 17:51:36.243230: step: 704/463, loss: 0.0538933202624321 2023-01-22 17:51:36.885454: step: 706/463, loss: 0.00609893212094903 2023-01-22 17:51:37.466362: step: 708/463, loss: 0.07924287021160126 2023-01-22 17:51:38.086537: step: 710/463, loss: 0.0014265759382396936 2023-01-22 17:51:38.681524: step: 712/463, loss: 0.00800226628780365 2023-01-22 17:51:39.272356: step: 714/463, loss: 0.012359429150819778 2023-01-22 17:51:39.983710: step: 716/463, loss: 0.15779682993888855 2023-01-22 17:51:40.648095: step: 718/463, loss: 0.013786989264190197 2023-01-22 17:51:41.266754: step: 720/463, loss: 0.0019537811167538166 2023-01-22 17:51:41.891300: step: 722/463, loss: 
0.4363991320133209 2023-01-22 17:51:42.500463: step: 724/463, loss: 0.011480231769382954 2023-01-22 17:51:43.075246: step: 726/463, loss: 0.011598076671361923 2023-01-22 17:51:43.677903: step: 728/463, loss: 0.019618865102529526 2023-01-22 17:51:44.268570: step: 730/463, loss: 0.008991632610559464 2023-01-22 17:51:44.965938: step: 732/463, loss: 0.056285858154296875 2023-01-22 17:51:45.480112: step: 734/463, loss: 0.08214806765317917 2023-01-22 17:51:46.150378: step: 736/463, loss: 0.007321357727050781 2023-01-22 17:51:46.813395: step: 738/463, loss: 0.21322140097618103 2023-01-22 17:51:47.390951: step: 740/463, loss: 0.01280925702303648 2023-01-22 17:51:48.006199: step: 742/463, loss: 0.022861996665596962 2023-01-22 17:51:48.625659: step: 744/463, loss: 0.01811855100095272 2023-01-22 17:51:49.284638: step: 746/463, loss: 0.010414735414087772 2023-01-22 17:51:49.820142: step: 748/463, loss: 0.010915475897490978 2023-01-22 17:51:50.410575: step: 750/463, loss: 0.015992823988199234 2023-01-22 17:51:51.141801: step: 752/463, loss: 0.01614873670041561 2023-01-22 17:51:51.798304: step: 754/463, loss: 0.025485718622803688 2023-01-22 17:51:52.406527: step: 756/463, loss: 0.11707180738449097 2023-01-22 17:51:53.068445: step: 758/463, loss: 0.01518965046852827 2023-01-22 17:51:53.727976: step: 760/463, loss: 0.045832183212041855 2023-01-22 17:51:54.324251: step: 762/463, loss: 5.17034568474628e-05 2023-01-22 17:51:54.929283: step: 764/463, loss: 0.06700917333364487 2023-01-22 17:51:55.545803: step: 766/463, loss: 0.15571404993534088 2023-01-22 17:51:56.128566: step: 768/463, loss: 0.0008002783870324492 2023-01-22 17:51:56.709618: step: 770/463, loss: 0.007696948945522308 2023-01-22 17:51:57.335620: step: 772/463, loss: 0.02202214114367962 2023-01-22 17:51:57.948227: step: 774/463, loss: 0.004566315095871687 2023-01-22 17:51:58.508535: step: 776/463, loss: 0.016592904925346375 2023-01-22 17:51:59.152490: step: 778/463, loss: 0.00365439523011446 2023-01-22 17:51:59.756873: 
step: 780/463, loss: 0.050477441400289536 2023-01-22 17:52:00.377002: step: 782/463, loss: 0.005551913287490606 2023-01-22 17:52:00.991222: step: 784/463, loss: 0.002600231906399131 2023-01-22 17:52:01.565269: step: 786/463, loss: 0.002360090147703886 2023-01-22 17:52:02.099516: step: 788/463, loss: 0.011114414781332016 2023-01-22 17:52:02.725688: step: 790/463, loss: 0.000492330058477819 2023-01-22 17:52:03.354587: step: 792/463, loss: 0.023250268772244453 2023-01-22 17:52:04.006661: step: 794/463, loss: 0.011345695704221725 2023-01-22 17:52:04.644785: step: 796/463, loss: 0.013351762667298317 2023-01-22 17:52:05.239503: step: 798/463, loss: 0.033038556575775146 2023-01-22 17:52:05.791890: step: 800/463, loss: 0.0007965235272422433 2023-01-22 17:52:06.420291: step: 802/463, loss: 0.007613286841660738 2023-01-22 17:52:07.071867: step: 804/463, loss: 0.001887855469249189 2023-01-22 17:52:07.722699: step: 806/463, loss: 0.029818959534168243 2023-01-22 17:52:08.367124: step: 808/463, loss: 0.015905221924185753 2023-01-22 17:52:08.942081: step: 810/463, loss: 0.026028377935290337 2023-01-22 17:52:09.565656: step: 812/463, loss: 0.00048201571917161345 2023-01-22 17:52:10.142034: step: 814/463, loss: 0.005213267169892788 2023-01-22 17:52:10.806611: step: 816/463, loss: 0.02499951422214508 2023-01-22 17:52:11.383672: step: 818/463, loss: 0.006720477249473333 2023-01-22 17:52:11.980248: step: 820/463, loss: 0.02923782914876938 2023-01-22 17:52:12.654705: step: 822/463, loss: 0.17592738568782806 2023-01-22 17:52:13.293976: step: 824/463, loss: 0.056264590471982956 2023-01-22 17:52:13.914622: step: 826/463, loss: 0.009056363254785538 2023-01-22 17:52:14.585944: step: 828/463, loss: 0.0030096087139099836 2023-01-22 17:52:15.235086: step: 830/463, loss: 0.0027272601146250963 2023-01-22 17:52:15.802235: step: 832/463, loss: 0.0030925008468329906 2023-01-22 17:52:16.572621: step: 834/463, loss: 0.12174199521541595 2023-01-22 17:52:17.264144: step: 836/463, loss: 
0.11134455353021622 2023-01-22 17:52:17.874113: step: 838/463, loss: 0.02167416922748089 2023-01-22 17:52:18.437667: step: 840/463, loss: 0.02003876306116581 2023-01-22 17:52:19.061776: step: 842/463, loss: 0.0011874906485900283 2023-01-22 17:52:19.749079: step: 844/463, loss: 0.053902335464954376 2023-01-22 17:52:20.403947: step: 846/463, loss: 0.007153698708862066 2023-01-22 17:52:21.040704: step: 848/463, loss: 0.13168253004550934 2023-01-22 17:52:21.688911: step: 850/463, loss: 0.015588678419589996 2023-01-22 17:52:22.408260: step: 852/463, loss: 0.11487516015768051 2023-01-22 17:52:22.947194: step: 854/463, loss: 0.027320779860019684 2023-01-22 17:52:23.570625: step: 856/463, loss: 0.03783724829554558 2023-01-22 17:52:24.154155: step: 858/463, loss: 0.0035979312378913164 2023-01-22 17:52:24.776942: step: 860/463, loss: 0.06076950207352638 2023-01-22 17:52:25.387209: step: 862/463, loss: 0.0058922236785292625 2023-01-22 17:52:25.964071: step: 864/463, loss: 0.005483855959028006 2023-01-22 17:52:26.635854: step: 866/463, loss: 0.06085788458585739 2023-01-22 17:52:27.201302: step: 868/463, loss: 0.12196002155542374 2023-01-22 17:52:27.821531: step: 870/463, loss: 0.022396961227059364 2023-01-22 17:52:28.437525: step: 872/463, loss: 0.013725541532039642 2023-01-22 17:52:29.050463: step: 874/463, loss: 0.0073176538571715355 2023-01-22 17:52:29.720387: step: 876/463, loss: 0.26693424582481384 2023-01-22 17:52:30.356911: step: 878/463, loss: 0.020757755264639854 2023-01-22 17:52:31.002490: step: 880/463, loss: 0.05497920140624046 2023-01-22 17:52:31.697003: step: 882/463, loss: 0.0981750562787056 2023-01-22 17:52:32.338104: step: 884/463, loss: 0.004204156342893839 2023-01-22 17:52:32.960218: step: 886/463, loss: 0.007944880053400993 2023-01-22 17:52:33.584679: step: 888/463, loss: 0.004250450991094112 2023-01-22 17:52:34.151924: step: 890/463, loss: 0.003027859376743436 2023-01-22 17:52:34.746601: step: 892/463, loss: 0.07564454525709152 2023-01-22 17:52:35.371832: 
step: 894/463, loss: 0.006803567986935377 2023-01-22 17:52:35.966719: step: 896/463, loss: 0.029895305633544922 2023-01-22 17:52:36.613718: step: 898/463, loss: 0.03149735555052757 2023-01-22 17:52:37.233853: step: 900/463, loss: 0.00981094315648079 2023-01-22 17:52:37.884897: step: 902/463, loss: 0.04752328246831894 2023-01-22 17:52:38.496251: step: 904/463, loss: 0.002688638400286436 2023-01-22 17:52:39.156165: step: 906/463, loss: 0.012819921597838402 2023-01-22 17:52:39.784218: step: 908/463, loss: 0.016765151172876358 2023-01-22 17:52:40.334987: step: 910/463, loss: 0.02540372498333454 2023-01-22 17:52:40.963671: step: 912/463, loss: 0.010049008764326572 2023-01-22 17:52:41.524753: step: 914/463, loss: 0.0010463562794029713 2023-01-22 17:52:42.169907: step: 916/463, loss: 0.004274103324860334 2023-01-22 17:52:42.797065: step: 918/463, loss: 0.021883543580770493 2023-01-22 17:52:43.494051: step: 920/463, loss: 0.026611991226673126 2023-01-22 17:52:44.128007: step: 922/463, loss: 0.0015798329841345549 2023-01-22 17:52:44.788972: step: 924/463, loss: 0.013593805953860283 2023-01-22 17:52:45.413513: step: 926/463, loss: 0.05056929588317871
==================================================
Loss: 0.045
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2856879606229243, 'r': 0.34586132614312276, 'f1': 0.3129080152402158}, 'combined': 0.23056380070331686, 'epoch': 27}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.33892371933973614, 'r': 0.32975562571012723, 'f1': 0.3342768218167234}, 'combined': 0.23516962338864966, 'epoch': 27}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28507518750382604, 'r': 0.34295572842016264, 'f1': 0.3113482668000443}, 'combined': 0.22941451237898, 'epoch': 27}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3381720141345674, 'r': 0.32518809736151244, 'f1': 0.3315529889468801}, 'combined': 0.23540262215228488, 'epoch': 27}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3002336704567088, 'r': 0.35378578624974605, 'f1': 0.32481726368225816}, 'combined': 0.23933903639745338, 'epoch': 27}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35346500388198765, 'r': 0.30870306015894117, 'f1': 0.32957109919066446}, 'combined': 0.23399548042537174, 'epoch': 27}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23888888888888887, 'r': 0.4095238095238095, 'f1': 0.3017543859649123}, 'combined': 0.20116959064327483, 'epoch': 27}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24404761904761904, 'r': 0.44565217391304346, 'f1': 0.3153846153846154}, 'combined': 0.1576923076923077, 'epoch': 27}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31, 'r': 0.2672413793103448, 'f1': 0.287037037037037}, 'combined': 0.191358024691358, 'epoch': 27}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 28
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 17:55:26.442183: step: 2/463, loss: 0.03213844448328018 2023-01-22 17:55:27.071246: step: 4/463, loss: 0.01139053888618946 2023-01-22 17:55:27.704865: step: 6/463, loss: 0.014507448300719261 2023-01-22 17:55:28.321341: step: 8/463, loss: 0.008234204724431038 2023-01-22 17:55:28.905888: step: 10/463, loss: 0.3811902701854706 2023-01-22 17:55:29.464180: step: 12/463, loss: 0.0018843284342437983 2023-01-22 17:55:30.036910: step:
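Note on the metric dicts in the epoch summaries above: each 'f1' is consistent with the standard harmonic mean of 'p' and 'r', and each 'combined' value equals the template f1 multiplied by the slot f1 (e.g. Dev Chinese, epoch 27: 0.7368421052631579 × 0.3129080152402158 ≈ 0.23056380070331686, matching the logged value). A minimal sketch reproducing that arithmetic; the function names are illustrative, not taken from train.py:

```python
def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall; defined as 0 when p + r == 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

def combined_score(template: dict, slot: dict) -> float:
    # 'combined' in the log equals template f1 times slot f1.
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

# Dev Chinese, epoch 27 (precision/recall copied from the summary above):
dev_chinese = {
    'template': {'p': 1.0, 'r': 0.5833333333333334},
    'slot': {'p': 0.2856879606229243, 'r': 0.34586132614312276},
}
score = combined_score(dev_chinese['template'], dev_chinese['slot'])
# score matches the logged 'combined' value 0.23056380070331686
```

The same rule reproduces the 'combined' entries for the other languages and for the "Current best result" block, which tracks the best-scoring epoch per language (16, 17, and 20 here).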
14/463, loss: 0.009751319885253906 2023-01-22 17:55:30.711545: step: 16/463, loss: 0.0012558826711028814 2023-01-22 17:55:31.336274: step: 18/463, loss: 0.022351214662194252 2023-01-22 17:55:31.969665: step: 20/463, loss: 0.0018188246758654714 2023-01-22 17:55:32.582166: step: 22/463, loss: 0.007978350855410099 2023-01-22 17:55:33.185766: step: 24/463, loss: 0.06589847058057785 2023-01-22 17:55:33.742873: step: 26/463, loss: 0.0011133685475215316 2023-01-22 17:55:34.360444: step: 28/463, loss: 0.016765886917710304 2023-01-22 17:55:34.928353: step: 30/463, loss: 0.04471869021654129 2023-01-22 17:55:35.546765: step: 32/463, loss: 0.006966112647205591 2023-01-22 17:55:36.074791: step: 34/463, loss: 0.00040883588371798396 2023-01-22 17:55:36.716640: step: 36/463, loss: 0.00040548614924773574 2023-01-22 17:55:37.370156: step: 38/463, loss: 0.027878645807504654 2023-01-22 17:55:37.958476: step: 40/463, loss: 0.03747014328837395 2023-01-22 17:55:38.590534: step: 42/463, loss: 0.003113814163953066 2023-01-22 17:55:39.170660: step: 44/463, loss: 0.0061993044801056385 2023-01-22 17:55:39.753847: step: 46/463, loss: 0.0062837558798491955 2023-01-22 17:55:40.304647: step: 48/463, loss: 0.028911467641592026 2023-01-22 17:55:40.874236: step: 50/463, loss: 0.01168891228735447 2023-01-22 17:55:41.426507: step: 52/463, loss: 0.0031217671930789948 2023-01-22 17:55:42.016326: step: 54/463, loss: 0.003798728110268712 2023-01-22 17:55:42.619498: step: 56/463, loss: 0.002153100911527872 2023-01-22 17:55:43.320504: step: 58/463, loss: 0.002094922587275505 2023-01-22 17:55:43.939433: step: 60/463, loss: 0.0279534924775362 2023-01-22 17:55:44.604891: step: 62/463, loss: 0.02709927409887314 2023-01-22 17:55:45.235345: step: 64/463, loss: 0.014271283522248268 2023-01-22 17:55:45.872351: step: 66/463, loss: 0.007896749302744865 2023-01-22 17:55:46.583331: step: 68/463, loss: 0.05483843758702278 2023-01-22 17:55:47.168787: step: 70/463, loss: 0.00017187879711855203 2023-01-22 17:55:47.759834: 
step: 72/463, loss: 0.010253151878714561 2023-01-22 17:55:48.324542: step: 74/463, loss: 0.004188257269561291 2023-01-22 17:55:48.930251: step: 76/463, loss: 0.010238183662295341 2023-01-22 17:55:49.543161: step: 78/463, loss: 0.014795300550758839 2023-01-22 17:55:50.164603: step: 80/463, loss: 0.025578083470463753 2023-01-22 17:55:50.762188: step: 82/463, loss: 0.013971716165542603 2023-01-22 17:55:51.354255: step: 84/463, loss: 0.06326129287481308 2023-01-22 17:55:51.981202: step: 86/463, loss: 0.003552919253706932 2023-01-22 17:55:52.620106: step: 88/463, loss: 0.005448946729302406 2023-01-22 17:55:53.245361: step: 90/463, loss: 0.03882410004734993 2023-01-22 17:55:53.864470: step: 92/463, loss: 0.0035626122262328863 2023-01-22 17:55:54.465932: step: 94/463, loss: 0.009476098231971264 2023-01-22 17:55:55.052300: step: 96/463, loss: 0.049219321459531784 2023-01-22 17:55:55.718287: step: 98/463, loss: 0.0011895071947947145 2023-01-22 17:55:56.343017: step: 100/463, loss: 0.023094521835446358 2023-01-22 17:55:56.958528: step: 102/463, loss: 0.10946579277515411 2023-01-22 17:55:57.639984: step: 104/463, loss: 0.012792223133146763 2023-01-22 17:55:58.253048: step: 106/463, loss: 0.02739347144961357 2023-01-22 17:55:58.857371: step: 108/463, loss: 0.00016177623183466494 2023-01-22 17:55:59.421774: step: 110/463, loss: 0.022935036569833755 2023-01-22 17:56:00.008247: step: 112/463, loss: 0.0042551723308861256 2023-01-22 17:56:00.662098: step: 114/463, loss: 0.0008929232135415077 2023-01-22 17:56:01.227773: step: 116/463, loss: 0.09152216464281082 2023-01-22 17:56:01.829345: step: 118/463, loss: 0.004237803164869547 2023-01-22 17:56:02.529125: step: 120/463, loss: 0.034009627997875214 2023-01-22 17:56:03.203132: step: 122/463, loss: 0.013493592850863934 2023-01-22 17:56:03.885485: step: 124/463, loss: 0.06996171176433563 2023-01-22 17:56:04.640254: step: 126/463, loss: 0.009534063749015331 2023-01-22 17:56:05.281743: step: 128/463, loss: 0.01058170199394226 2023-01-22 
17:56:05.888747: step: 130/463, loss: 0.011072096414864063 2023-01-22 17:56:06.597024: step: 132/463, loss: 0.020566480234265327 2023-01-22 17:56:07.181837: step: 134/463, loss: 0.010000312700867653 2023-01-22 17:56:07.807781: step: 136/463, loss: 0.0030474683735519648 2023-01-22 17:56:08.359948: step: 138/463, loss: 0.00383941363543272 2023-01-22 17:56:09.002898: step: 140/463, loss: 0.015523367561399937 2023-01-22 17:56:09.576230: step: 142/463, loss: 0.04197985306382179 2023-01-22 17:56:10.210217: step: 144/463, loss: 0.008804041892290115 2023-01-22 17:56:10.897206: step: 146/463, loss: 0.0014919346431270242 2023-01-22 17:56:11.500685: step: 148/463, loss: 0.02446535788476467 2023-01-22 17:56:12.154039: step: 150/463, loss: 0.12663733959197998 2023-01-22 17:56:12.795496: step: 152/463, loss: 0.041182879358530045 2023-01-22 17:56:13.389429: step: 154/463, loss: 0.11055509746074677 2023-01-22 17:56:14.001585: step: 156/463, loss: 0.013493106700479984 2023-01-22 17:56:14.620742: step: 158/463, loss: 0.1285579949617386 2023-01-22 17:56:15.263053: step: 160/463, loss: 0.04070800542831421 2023-01-22 17:56:15.901608: step: 162/463, loss: 0.013397276401519775 2023-01-22 17:56:16.604770: step: 164/463, loss: 0.0013415964785963297 2023-01-22 17:56:17.196036: step: 166/463, loss: 0.0005928172613494098 2023-01-22 17:56:17.853810: step: 168/463, loss: 0.0427122600376606 2023-01-22 17:56:18.527577: step: 170/463, loss: 0.02601591683924198 2023-01-22 17:56:19.133579: step: 172/463, loss: 0.3514743149280548 2023-01-22 17:56:19.729782: step: 174/463, loss: 0.009464817121624947 2023-01-22 17:56:20.453867: step: 176/463, loss: 0.08448472619056702 2023-01-22 17:56:21.133282: step: 178/463, loss: 0.017665864899754524 2023-01-22 17:56:21.757880: step: 180/463, loss: 0.0027705703396350145 2023-01-22 17:56:22.399494: step: 182/463, loss: 0.01657681353390217 2023-01-22 17:56:23.052713: step: 184/463, loss: 0.00031073205173015594 2023-01-22 17:56:23.669921: step: 186/463, loss: 
0.008332927711308002 2023-01-22 17:56:24.259876: step: 188/463, loss: 0.0016145278932526708 2023-01-22 17:56:24.894276: step: 190/463, loss: 0.0078649390488863 2023-01-22 17:56:25.574953: step: 192/463, loss: 0.009682240895926952 2023-01-22 17:56:26.172174: step: 194/463, loss: 0.0031669256277382374 2023-01-22 17:56:26.747338: step: 196/463, loss: 0.007990841753780842 2023-01-22 17:56:27.456265: step: 198/463, loss: 0.0005642786272801459 2023-01-22 17:56:28.145977: step: 200/463, loss: 0.007196986116468906 2023-01-22 17:56:28.811206: step: 202/463, loss: 0.014287058264017105 2023-01-22 17:56:29.380843: step: 204/463, loss: 0.007178491912782192 2023-01-22 17:56:30.008912: step: 206/463, loss: 0.002781319199129939 2023-01-22 17:56:30.605284: step: 208/463, loss: 0.04663149267435074 2023-01-22 17:56:31.251815: step: 210/463, loss: 0.0004133085603825748 2023-01-22 17:56:31.872655: step: 212/463, loss: 0.4990121126174927 2023-01-22 17:56:32.523273: step: 214/463, loss: 0.021204093471169472 2023-01-22 17:56:33.133208: step: 216/463, loss: 0.0008263445924967527 2023-01-22 17:56:33.782214: step: 218/463, loss: 0.0032889172434806824 2023-01-22 17:56:34.382353: step: 220/463, loss: 0.011515962891280651 2023-01-22 17:56:35.018741: step: 222/463, loss: 0.33023905754089355 2023-01-22 17:56:35.722050: step: 224/463, loss: 0.004194328095763922 2023-01-22 17:56:36.351597: step: 226/463, loss: 0.006566221825778484 2023-01-22 17:56:36.983126: step: 228/463, loss: 0.01366354525089264 2023-01-22 17:56:37.661873: step: 230/463, loss: 0.05330008268356323 2023-01-22 17:56:38.333471: step: 232/463, loss: 0.004113213159143925 2023-01-22 17:56:38.921643: step: 234/463, loss: 0.026428310200572014 2023-01-22 17:56:39.509801: step: 236/463, loss: 0.023110993206501007 2023-01-22 17:56:40.055167: step: 238/463, loss: 0.011343053542077541 2023-01-22 17:56:40.673228: step: 240/463, loss: 0.02657420188188553 2023-01-22 17:56:41.253706: step: 242/463, loss: 0.023343728855252266 2023-01-22 
17:56:41.817570: step: 244/463, loss: 0.0002803184324875474 2023-01-22 17:56:42.414333: step: 246/463, loss: 0.0015890122158452868 2023-01-22 17:56:42.948876: step: 248/463, loss: 0.004460596013814211 2023-01-22 17:56:43.569255: step: 250/463, loss: 0.001164121087640524 2023-01-22 17:56:44.163476: step: 252/463, loss: 0.012586899101734161 2023-01-22 17:56:44.800609: step: 254/463, loss: 0.041143082082271576 2023-01-22 17:56:45.421154: step: 256/463, loss: 0.026948943734169006 2023-01-22 17:56:46.038127: step: 258/463, loss: 0.03190794587135315 2023-01-22 17:56:46.762794: step: 260/463, loss: 0.006449377629905939 2023-01-22 17:56:47.374422: step: 262/463, loss: 0.0050116474740207195 2023-01-22 17:56:47.961888: step: 264/463, loss: 0.005377790424972773 2023-01-22 17:56:48.628521: step: 266/463, loss: 0.006007257383316755 2023-01-22 17:56:49.289315: step: 268/463, loss: 0.009416303597390652 2023-01-22 17:56:49.961823: step: 270/463, loss: 0.0002540118002798408 2023-01-22 17:56:50.539735: step: 272/463, loss: 0.008071673102676868 2023-01-22 17:56:51.257177: step: 274/463, loss: 0.0753302201628685 2023-01-22 17:56:51.893905: step: 276/463, loss: 0.010046935640275478 2023-01-22 17:56:52.544297: step: 278/463, loss: 0.0041891769506037235 2023-01-22 17:56:53.210258: step: 280/463, loss: 0.006256021559238434 2023-01-22 17:56:53.829888: step: 282/463, loss: 0.00045696660527028143 2023-01-22 17:56:54.530569: step: 284/463, loss: 0.19601012766361237 2023-01-22 17:56:55.154393: step: 286/463, loss: 0.01000457163900137 2023-01-22 17:56:55.797916: step: 288/463, loss: 0.014750906266272068 2023-01-22 17:56:56.414578: step: 290/463, loss: 0.010435018688440323 2023-01-22 17:56:57.037403: step: 292/463, loss: 0.01061052642762661 2023-01-22 17:56:57.639943: step: 294/463, loss: 0.01193891279399395 2023-01-22 17:56:58.232432: step: 296/463, loss: 0.008767467923462391 2023-01-22 17:56:58.837446: step: 298/463, loss: 0.005914943292737007 2023-01-22 17:56:59.404031: step: 300/463, loss: 
3.304164783912711e-05 2023-01-22 17:57:00.063353: step: 302/463, loss: 0.011755518615245819 2023-01-22 17:57:00.683839: step: 304/463, loss: 0.01596486009657383 2023-01-22 17:57:01.289653: step: 306/463, loss: 0.10865382105112076 2023-01-22 17:57:01.868154: step: 308/463, loss: 0.01518117543309927 2023-01-22 17:57:02.505332: step: 310/463, loss: 0.0005967971519567072 2023-01-22 17:57:03.206251: step: 312/463, loss: 0.0438823364675045 2023-01-22 17:57:03.853880: step: 314/463, loss: 0.034976065158843994 2023-01-22 17:57:04.442571: step: 316/463, loss: 0.00025192496832460165 2023-01-22 17:57:05.054996: step: 318/463, loss: 0.059565842151641846 2023-01-22 17:57:05.695675: step: 320/463, loss: 0.02785775251686573 2023-01-22 17:57:06.342215: step: 322/463, loss: 0.044402964413166046 2023-01-22 17:57:06.921099: step: 324/463, loss: 0.03249533101916313 2023-01-22 17:57:07.511409: step: 326/463, loss: 0.000936886586714536 2023-01-22 17:57:08.161266: step: 328/463, loss: 0.010038439184427261 2023-01-22 17:57:08.725608: step: 330/463, loss: 0.042146358639001846 2023-01-22 17:57:09.310336: step: 332/463, loss: 0.003152502002194524 2023-01-22 17:57:09.953244: step: 334/463, loss: 0.010268078185617924 2023-01-22 17:57:10.555030: step: 336/463, loss: 0.03180191293358803 2023-01-22 17:57:11.216777: step: 338/463, loss: 0.020611850544810295 2023-01-22 17:57:11.869340: step: 340/463, loss: 0.03614897280931473 2023-01-22 17:57:12.445744: step: 342/463, loss: 0.015169225633144379 2023-01-22 17:57:13.119632: step: 344/463, loss: 0.00789072085171938 2023-01-22 17:57:13.694361: step: 346/463, loss: 0.0013702625874429941 2023-01-22 17:57:14.279673: step: 348/463, loss: 0.011910945177078247 2023-01-22 17:57:14.954054: step: 350/463, loss: 0.0023197077680379152 2023-01-22 17:57:15.661245: step: 352/463, loss: 0.018045097589492798 2023-01-22 17:57:16.323510: step: 354/463, loss: 0.005368582438677549 2023-01-22 17:57:16.955028: step: 356/463, loss: 0.036318108439445496 2023-01-22 
17:57:17.583788: step: 358/463, loss: 0.007762932684272528 2023-01-22 17:57:18.238158: step: 360/463, loss: 0.009136620908975601 2023-01-22 17:57:18.858454: step: 362/463, loss: 0.07882899791002274 2023-01-22 17:57:19.529004: step: 364/463, loss: 0.030811211094260216 2023-01-22 17:57:20.133646: step: 366/463, loss: 0.14758308231830597 2023-01-22 17:57:20.794514: step: 368/463, loss: 0.0019732359796762466 2023-01-22 17:57:21.349340: step: 370/463, loss: 0.010489108972251415 2023-01-22 17:57:22.015181: step: 372/463, loss: 0.026793479919433594 2023-01-22 17:57:22.634200: step: 374/463, loss: 0.005628593266010284 2023-01-22 17:57:23.310265: step: 376/463, loss: 0.046319641172885895 2023-01-22 17:57:23.948469: step: 378/463, loss: 0.14259272813796997 2023-01-22 17:57:24.624915: step: 380/463, loss: 0.004148304462432861 2023-01-22 17:57:25.241594: step: 382/463, loss: 0.014632424339652061 2023-01-22 17:57:25.892813: step: 384/463, loss: 0.018852103501558304 2023-01-22 17:57:26.456377: step: 386/463, loss: 0.04470071569085121 2023-01-22 17:57:27.098300: step: 388/463, loss: 0.01910846307873726 2023-01-22 17:57:27.688296: step: 390/463, loss: 0.0015176574233919382 2023-01-22 17:57:28.239300: step: 392/463, loss: 0.0002993095258716494 2023-01-22 17:57:28.811703: step: 394/463, loss: 0.0035101030953228474 2023-01-22 17:57:29.410271: step: 396/463, loss: 0.0130534702911973 2023-01-22 17:57:30.080222: step: 398/463, loss: 0.01311402302235365 2023-01-22 17:57:30.750366: step: 400/463, loss: 0.029257070273160934 2023-01-22 17:57:31.358895: step: 402/463, loss: 0.0021011342760175467 2023-01-22 17:57:31.966120: step: 404/463, loss: 0.01543229166418314 2023-01-22 17:57:32.575659: step: 406/463, loss: 0.1510200798511505 2023-01-22 17:57:33.186858: step: 408/463, loss: 0.001918436260893941 2023-01-22 17:57:33.895112: step: 410/463, loss: 0.06944883614778519 2023-01-22 17:57:34.614149: step: 412/463, loss: 0.009832518175244331 2023-01-22 17:57:35.254050: step: 414/463, loss: 
0.0023887804709374905 2023-01-22 17:57:35.838918: step: 416/463, loss: 0.050731439143419266 2023-01-22 17:57:36.449629: step: 418/463, loss: 0.16306638717651367 2023-01-22 17:57:37.129978: step: 420/463, loss: 0.02276529371738434 2023-01-22 17:57:37.717914: step: 422/463, loss: 0.1663629710674286 2023-01-22 17:57:38.333132: step: 424/463, loss: 0.03923305496573448 2023-01-22 17:57:39.213668: step: 426/463, loss: 0.042238980531692505 2023-01-22 17:57:39.848058: step: 428/463, loss: 0.010411540977656841 2023-01-22 17:57:40.483834: step: 430/463, loss: 0.038226187229156494 2023-01-22 17:57:41.112076: step: 432/463, loss: 0.009631017223000526 2023-01-22 17:57:41.818751: step: 434/463, loss: 0.1680622100830078 2023-01-22 17:57:42.354442: step: 436/463, loss: 0.029324444010853767 2023-01-22 17:57:42.994523: step: 438/463, loss: 0.006575694307684898 2023-01-22 17:57:43.580682: step: 440/463, loss: 0.00043723982525989413 2023-01-22 17:57:44.154676: step: 442/463, loss: 0.09023015201091766 2023-01-22 17:57:44.797497: step: 444/463, loss: 0.013733116909861565 2023-01-22 17:57:45.362373: step: 446/463, loss: 0.04067426547408104 2023-01-22 17:57:45.903095: step: 448/463, loss: 0.005045309197157621 2023-01-22 17:57:46.512230: step: 450/463, loss: 0.046275582164525986 2023-01-22 17:57:47.136001: step: 452/463, loss: 0.020393243059515953 2023-01-22 17:57:47.731139: step: 454/463, loss: 0.4273609220981598 2023-01-22 17:57:48.462894: step: 456/463, loss: 0.2258153259754181 2023-01-22 17:57:49.029675: step: 458/463, loss: 0.0014842608943581581 2023-01-22 17:57:49.638213: step: 460/463, loss: 0.0016395433340221643 2023-01-22 17:57:50.257745: step: 462/463, loss: 0.004302798770368099 2023-01-22 17:57:50.865747: step: 464/463, loss: 0.022075161337852478 2023-01-22 17:57:51.465498: step: 466/463, loss: 0.02726873941719532 2023-01-22 17:57:52.071170: step: 468/463, loss: 0.02152709662914276 2023-01-22 17:57:52.700691: step: 470/463, loss: 0.1815168410539627 2023-01-22 17:57:53.263196: 
step: 472/463, loss: 0.023141901940107346 2023-01-22 17:57:53.902319: step: 474/463, loss: 0.0017482854891568422 2023-01-22 17:57:54.510256: step: 476/463, loss: 0.001265004277229309 2023-01-22 17:57:55.146938: step: 478/463, loss: 0.02614702098071575 2023-01-22 17:57:55.756515: step: 480/463, loss: 0.016391456127166748 2023-01-22 17:57:56.336560: step: 482/463, loss: 0.04686746001243591 2023-01-22 17:57:56.952686: step: 484/463, loss: 0.007279905956238508 2023-01-22 17:57:57.581144: step: 486/463, loss: 0.004578466061502695 2023-01-22 17:57:58.192516: step: 488/463, loss: 0.03546911105513573 2023-01-22 17:57:58.769262: step: 490/463, loss: 0.0011881497921422124 2023-01-22 17:57:59.403385: step: 492/463, loss: 0.023232176899909973 2023-01-22 17:58:00.052267: step: 494/463, loss: 0.004714783746749163 2023-01-22 17:58:00.669448: step: 496/463, loss: 0.004121197387576103 2023-01-22 17:58:01.443183: step: 498/463, loss: 0.011855159886181355 2023-01-22 17:58:02.001658: step: 500/463, loss: 0.00264635868370533 2023-01-22 17:58:02.700128: step: 502/463, loss: 0.2936907112598419 2023-01-22 17:58:03.339538: step: 504/463, loss: 0.054029520601034164 2023-01-22 17:58:03.973256: step: 506/463, loss: 0.39043569564819336 2023-01-22 17:58:04.627789: step: 508/463, loss: 0.012901032343506813 2023-01-22 17:58:05.188029: step: 510/463, loss: 0.007815157063305378 2023-01-22 17:58:05.782766: step: 512/463, loss: 0.01819210685789585 2023-01-22 17:58:06.379112: step: 514/463, loss: 0.0035125650465488434 2023-01-22 17:58:07.043145: step: 516/463, loss: 0.0009647037368267775 2023-01-22 17:58:07.682470: step: 518/463, loss: 0.028976766392588615 2023-01-22 17:58:08.270199: step: 520/463, loss: 0.019093571230769157 2023-01-22 17:58:08.931806: step: 522/463, loss: 0.0039791446179151535 2023-01-22 17:58:09.582674: step: 524/463, loss: 0.11286035180091858 2023-01-22 17:58:10.171552: step: 526/463, loss: 0.003631838597357273 2023-01-22 17:58:10.809857: step: 528/463, loss: 0.0717243105173111 
2023-01-22 17:58:11.474209: step: 530/463, loss: 0.004705814179033041 2023-01-22 17:58:12.115032: step: 532/463, loss: 0.03489552065730095 2023-01-22 17:58:12.713365: step: 534/463, loss: 0.01284461934119463 2023-01-22 17:58:13.328810: step: 536/463, loss: 0.002069346606731415 2023-01-22 17:58:13.934514: step: 538/463, loss: 0.08871117979288101 2023-01-22 17:58:14.580854: step: 540/463, loss: 0.02060859091579914 2023-01-22 17:58:15.218882: step: 542/463, loss: 0.02605126053094864 2023-01-22 17:58:15.823770: step: 544/463, loss: 0.0056695761159062386 2023-01-22 17:58:16.423003: step: 546/463, loss: 0.04290623962879181 2023-01-22 17:58:17.053075: step: 548/463, loss: 0.008508003316819668 2023-01-22 17:58:17.674479: step: 550/463, loss: 0.035869237035512924 2023-01-22 17:58:18.305500: step: 552/463, loss: 0.016515962779521942 2023-01-22 17:58:18.935565: step: 554/463, loss: 0.028028413653373718 2023-01-22 17:58:19.568954: step: 556/463, loss: 0.005842795129865408 2023-01-22 17:58:20.159803: step: 558/463, loss: 0.32232028245925903 2023-01-22 17:58:20.764592: step: 560/463, loss: 0.01715569943189621 2023-01-22 17:58:21.457848: step: 562/463, loss: 0.17176924645900726 2023-01-22 17:58:22.068096: step: 564/463, loss: 0.009791905991733074 2023-01-22 17:58:22.699567: step: 566/463, loss: 0.0023924061097204685 2023-01-22 17:58:23.351356: step: 568/463, loss: 0.031134577468037605 2023-01-22 17:58:23.927331: step: 570/463, loss: 0.04699390009045601 2023-01-22 17:58:24.558598: step: 572/463, loss: 0.0048288824036717415 2023-01-22 17:58:25.183185: step: 574/463, loss: 0.0017013158649206161 2023-01-22 17:58:25.765456: step: 576/463, loss: 0.18949079513549805 2023-01-22 17:58:26.421961: step: 578/463, loss: 0.0335036963224411 2023-01-22 17:58:27.012932: step: 580/463, loss: 0.009879798628389835 2023-01-22 17:58:27.576283: step: 582/463, loss: 0.004880913067609072 2023-01-22 17:58:28.189124: step: 584/463, loss: 0.011361636221408844 2023-01-22 17:58:28.805163: step: 586/463, loss: 
0.025270171463489532 2023-01-22 17:58:29.489645: step: 588/463, loss: 0.019494131207466125 2023-01-22 17:58:30.192677: step: 590/463, loss: 0.030884509906172752 2023-01-22 17:58:30.834368: step: 592/463, loss: 0.028992824256420135 2023-01-22 17:58:31.387064: step: 594/463, loss: 0.034830790013074875 2023-01-22 17:58:32.034667: step: 596/463, loss: 0.005582206416875124 2023-01-22 17:58:32.584564: step: 598/463, loss: 0.008395330049097538 2023-01-22 17:58:33.239593: step: 600/463, loss: 0.015479398891329765 2023-01-22 17:58:33.956200: step: 602/463, loss: 1.6992946863174438 2023-01-22 17:58:34.506796: step: 604/463, loss: 0.011650439351797104 2023-01-22 17:58:35.170559: step: 606/463, loss: 0.005732941906899214 2023-01-22 17:58:35.832979: step: 608/463, loss: 0.08030443638563156 2023-01-22 17:58:36.384031: step: 610/463, loss: 0.004602071363478899 2023-01-22 17:58:36.990005: step: 612/463, loss: 0.026011796668171883 2023-01-22 17:58:37.669728: step: 614/463, loss: 0.004580824635922909 2023-01-22 17:58:38.264854: step: 616/463, loss: 0.013105754740536213 2023-01-22 17:58:38.847848: step: 618/463, loss: 0.008687750436365604 2023-01-22 17:58:39.489581: step: 620/463, loss: 0.0771847516298294 2023-01-22 17:58:40.150347: step: 622/463, loss: 0.002787213772535324 2023-01-22 17:58:40.758022: step: 624/463, loss: 0.0270919781178236 2023-01-22 17:58:41.411139: step: 626/463, loss: 0.0072434647008776665 2023-01-22 17:58:42.030955: step: 628/463, loss: 0.020049938932061195 2023-01-22 17:58:42.608380: step: 630/463, loss: 0.039199575781822205 2023-01-22 17:58:43.181396: step: 632/463, loss: 0.018415415659546852 2023-01-22 17:58:43.776163: step: 634/463, loss: 0.01475452072918415 2023-01-22 17:58:44.385576: step: 636/463, loss: 0.0037371725775301456 2023-01-22 17:58:44.980978: step: 638/463, loss: 0.017161279916763306 2023-01-22 17:58:45.603003: step: 640/463, loss: 0.020916441455483437 2023-01-22 17:58:46.320360: step: 642/463, loss: 0.021510159596800804 2023-01-22 
17:58:46.919262: step: 644/463, loss: 0.02695145271718502 2023-01-22 17:58:47.542358: step: 646/463, loss: 0.0005162374000065029 2023-01-22 17:58:48.215046: step: 648/463, loss: 0.006295125000178814 2023-01-22 17:58:48.837344: step: 650/463, loss: 0.054281122982501984 2023-01-22 17:58:49.490158: step: 652/463, loss: 0.007555721327662468 2023-01-22 17:58:50.099277: step: 654/463, loss: 0.00992881041020155 2023-01-22 17:58:50.734515: step: 656/463, loss: 0.015221333131194115 2023-01-22 17:58:51.350525: step: 658/463, loss: 0.011583554558455944 2023-01-22 17:58:51.992729: step: 660/463, loss: 0.031957853585481644 2023-01-22 17:58:52.578811: step: 662/463, loss: 0.003323350800201297 2023-01-22 17:58:53.198697: step: 664/463, loss: 0.00031299449619837105 2023-01-22 17:58:53.802811: step: 666/463, loss: 0.1090482622385025 2023-01-22 17:58:54.405690: step: 668/463, loss: 0.007484715431928635 2023-01-22 17:58:55.011506: step: 670/463, loss: 0.0057810512371361256 2023-01-22 17:58:55.679365: step: 672/463, loss: 0.11908159404993057 2023-01-22 17:58:56.277439: step: 674/463, loss: 0.009599621407687664 2023-01-22 17:58:56.925725: step: 676/463, loss: 0.0019944333471357822 2023-01-22 17:58:57.546112: step: 678/463, loss: 0.003478676313534379 2023-01-22 17:58:58.164074: step: 680/463, loss: 0.008184224367141724 2023-01-22 17:58:58.763080: step: 682/463, loss: 0.025088230147957802 2023-01-22 17:58:59.379220: step: 684/463, loss: 0.005986685398966074 2023-01-22 17:59:00.073202: step: 686/463, loss: 0.04667002707719803 2023-01-22 17:59:00.668784: step: 688/463, loss: 0.05182386189699173 2023-01-22 17:59:01.305942: step: 690/463, loss: 0.12074615806341171 2023-01-22 17:59:01.915979: step: 692/463, loss: 0.009417477063834667 2023-01-22 17:59:02.484504: step: 694/463, loss: 0.0016443624626845121 2023-01-22 17:59:03.192874: step: 696/463, loss: 0.015490012243390083 2023-01-22 17:59:03.852512: step: 698/463, loss: 0.004734557121992111 2023-01-22 17:59:04.537105: step: 700/463, loss: 
0.04747442901134491 2023-01-22 17:59:05.102378: step: 702/463, loss: 0.06289347261190414 2023-01-22 17:59:05.681108: step: 704/463, loss: 0.0004960677470080554 2023-01-22 17:59:06.272585: step: 706/463, loss: 0.002099080942571163 2023-01-22 17:59:06.906860: step: 708/463, loss: 0.020305011421442032 2023-01-22 17:59:07.583316: step: 710/463, loss: 0.030749093741178513 2023-01-22 17:59:08.143959: step: 712/463, loss: 0.12189842015504837 2023-01-22 17:59:08.758443: step: 714/463, loss: 0.0097909951582551 2023-01-22 17:59:09.435961: step: 716/463, loss: 0.013627277687191963 2023-01-22 17:59:10.101457: step: 718/463, loss: 0.18070662021636963 2023-01-22 17:59:10.757822: step: 720/463, loss: 0.021304147318005562 2023-01-22 17:59:11.423171: step: 722/463, loss: 0.005282060243189335 2023-01-22 17:59:12.074938: step: 724/463, loss: 0.010232248343527317 2023-01-22 17:59:12.683721: step: 726/463, loss: 0.008852960541844368 2023-01-22 17:59:13.293426: step: 728/463, loss: 0.005146044306457043 2023-01-22 17:59:13.879100: step: 730/463, loss: 0.0010412195697426796 2023-01-22 17:59:14.527759: step: 732/463, loss: 0.03398527204990387 2023-01-22 17:59:15.117698: step: 734/463, loss: 0.10040097683668137 2023-01-22 17:59:15.696786: step: 736/463, loss: 0.006790989078581333 2023-01-22 17:59:16.307289: step: 738/463, loss: 0.026957111433148384 2023-01-22 17:59:16.907877: step: 740/463, loss: 0.04882344976067543 2023-01-22 17:59:17.507928: step: 742/463, loss: 0.0020014874171465635 2023-01-22 17:59:18.117515: step: 744/463, loss: 0.004357237834483385 2023-01-22 17:59:18.678278: step: 746/463, loss: 0.06280282884836197 2023-01-22 17:59:19.327095: step: 748/463, loss: 0.00973549298942089 2023-01-22 17:59:19.957068: step: 750/463, loss: 0.0024649337865412235 2023-01-22 17:59:20.497966: step: 752/463, loss: 0.011823905631899834 2023-01-22 17:59:21.202311: step: 754/463, loss: 0.0015216541942209005 2023-01-22 17:59:21.768944: step: 756/463, loss: 0.021127386018633842 2023-01-22 
17:59:22.393066: step: 758/463, loss: 0.027100063860416412 2023-01-22 17:59:22.970861: step: 760/463, loss: 0.014051140286028385 2023-01-22 17:59:23.610446: step: 762/463, loss: 0.005368887912482023 2023-01-22 17:59:24.276078: step: 764/463, loss: 0.006937088910490274 2023-01-22 17:59:24.864355: step: 766/463, loss: 0.0001394805294694379 2023-01-22 17:59:25.493321: step: 768/463, loss: 0.013887831009924412 2023-01-22 17:59:26.121597: step: 770/463, loss: 0.036024775356054306 2023-01-22 17:59:26.666400: step: 772/463, loss: 0.0005145878531038761 2023-01-22 17:59:27.237078: step: 774/463, loss: 0.0018220331985503435 2023-01-22 17:59:27.847778: step: 776/463, loss: 0.018096191808581352 2023-01-22 17:59:28.496997: step: 778/463, loss: 0.01101822592318058 2023-01-22 17:59:29.051635: step: 780/463, loss: 0.025981159880757332 2023-01-22 17:59:29.659472: step: 782/463, loss: 0.009200104512274265 2023-01-22 17:59:30.263267: step: 784/463, loss: 0.000929882749915123 2023-01-22 17:59:30.847517: step: 786/463, loss: 0.04648807272315025 2023-01-22 17:59:31.508771: step: 788/463, loss: 0.009808370843529701 2023-01-22 17:59:32.102828: step: 790/463, loss: 0.037254396826028824 2023-01-22 17:59:32.725039: step: 792/463, loss: 0.024326195940375328 2023-01-22 17:59:33.296784: step: 794/463, loss: 0.027615226805210114 2023-01-22 17:59:33.947954: step: 796/463, loss: 0.1882377415895462 2023-01-22 17:59:34.586632: step: 798/463, loss: 0.020664643496274948 2023-01-22 17:59:35.191140: step: 800/463, loss: 0.0023818279150873423 2023-01-22 17:59:35.812556: step: 802/463, loss: 0.02699568308889866 2023-01-22 17:59:36.414003: step: 804/463, loss: 0.1175592690706253 2023-01-22 17:59:36.996226: step: 806/463, loss: 0.0259011909365654 2023-01-22 17:59:37.619933: step: 808/463, loss: 0.03524906188249588 2023-01-22 17:59:38.206405: step: 810/463, loss: 0.002336753299459815 2023-01-22 17:59:38.749555: step: 812/463, loss: 0.0639367401599884 2023-01-22 17:59:39.336195: step: 814/463, loss: 
0.012017275206744671 2023-01-22 17:59:39.932386: step: 816/463, loss: 0.07620594650506973 2023-01-22 17:59:40.568177: step: 818/463, loss: 0.05655490607023239 2023-01-22 17:59:41.186437: step: 820/463, loss: 0.03756054863333702 2023-01-22 17:59:41.742560: step: 822/463, loss: 0.022915706038475037 2023-01-22 17:59:42.359856: step: 824/463, loss: 0.0022326463367789984 2023-01-22 17:59:43.061853: step: 826/463, loss: 0.001099086133763194 2023-01-22 17:59:43.692461: step: 828/463, loss: 0.02429528534412384 2023-01-22 17:59:44.251856: step: 830/463, loss: 0.024618521332740784 2023-01-22 17:59:44.947772: step: 832/463, loss: 0.033521007746458054 2023-01-22 17:59:45.545259: step: 834/463, loss: 0.0006215701578184962 2023-01-22 17:59:46.143837: step: 836/463, loss: 0.006641241256147623 2023-01-22 17:59:46.791891: step: 838/463, loss: 0.022021666169166565 2023-01-22 17:59:47.439976: step: 840/463, loss: 0.020306698977947235 2023-01-22 17:59:48.064981: step: 842/463, loss: 0.02231733500957489 2023-01-22 17:59:48.673172: step: 844/463, loss: 0.06131979078054428 2023-01-22 17:59:49.279494: step: 846/463, loss: 0.008569777011871338 2023-01-22 17:59:49.889217: step: 848/463, loss: 0.024392323568463326 2023-01-22 17:59:50.512452: step: 850/463, loss: 0.007598164025694132 2023-01-22 17:59:51.130306: step: 852/463, loss: 0.048181962221860886 2023-01-22 17:59:51.753633: step: 854/463, loss: 0.005434044636785984 2023-01-22 17:59:52.326738: step: 856/463, loss: 6.133849819889292e-05 2023-01-22 17:59:52.951909: step: 858/463, loss: 0.0040589794516563416 2023-01-22 17:59:53.580457: step: 860/463, loss: 0.024733835831284523 2023-01-22 17:59:54.201842: step: 862/463, loss: 0.15530121326446533 2023-01-22 17:59:54.834070: step: 864/463, loss: 0.007906424812972546 2023-01-22 17:59:55.521543: step: 866/463, loss: 0.024359606206417084 2023-01-22 17:59:56.136864: step: 868/463, loss: 0.024041499942541122 2023-01-22 17:59:56.765321: step: 870/463, loss: 0.0160908754914999 2023-01-22 
17:59:57.384018: step: 872/463, loss: 0.00335917086340487 2023-01-22 17:59:58.006800: step: 874/463, loss: 0.035593125969171524 2023-01-22 17:59:58.590529: step: 876/463, loss: 0.0034344070591032505 2023-01-22 17:59:59.223409: step: 878/463, loss: 0.5149649977684021 2023-01-22 17:59:59.835729: step: 880/463, loss: 0.037077710032463074 2023-01-22 18:00:00.468486: step: 882/463, loss: 0.040120307356119156 2023-01-22 18:00:01.026403: step: 884/463, loss: 0.007976699620485306 2023-01-22 18:00:01.644596: step: 886/463, loss: 0.0038741615135222673 2023-01-22 18:00:02.269289: step: 888/463, loss: 0.056547317653894424 2023-01-22 18:00:02.895066: step: 890/463, loss: 0.013041479513049126 2023-01-22 18:00:03.516012: step: 892/463, loss: 0.02891763485968113 2023-01-22 18:00:04.093844: step: 894/463, loss: 0.001965333940461278 2023-01-22 18:00:04.722849: step: 896/463, loss: 0.0716252326965332 2023-01-22 18:00:05.348056: step: 898/463, loss: 0.04837187007069588 2023-01-22 18:00:05.991269: step: 900/463, loss: 0.0030993283726274967 2023-01-22 18:00:06.588180: step: 902/463, loss: 0.011841515079140663 2023-01-22 18:00:07.220966: step: 904/463, loss: 0.023678341880440712 2023-01-22 18:00:07.855614: step: 906/463, loss: 0.03985615819692612 2023-01-22 18:00:08.428772: step: 908/463, loss: 0.0310367438942194 2023-01-22 18:00:09.024740: step: 910/463, loss: 0.016357889398932457 2023-01-22 18:00:09.607682: step: 912/463, loss: 0.014543814584612846 2023-01-22 18:00:10.199630: step: 914/463, loss: 0.010902931913733482 2023-01-22 18:00:10.887209: step: 916/463, loss: 0.02926882915198803 2023-01-22 18:00:11.482854: step: 918/463, loss: 0.0005131382495164871 2023-01-22 18:00:12.097108: step: 920/463, loss: 0.0006596340681426227 2023-01-22 18:00:12.742021: step: 922/463, loss: 0.04686674103140831 2023-01-22 18:00:13.436875: step: 924/463, loss: 0.012077810242772102 2023-01-22 18:00:14.047525: step: 926/463, loss: 0.0033161970786750317 ================================================== Loss: 
0.036
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28854839674563293, 'r': 0.32687550826782324, 'f1': 0.306518492628368}, 'combined': 0.2258557314103764, 'epoch': 28}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34286741326306547, 'r': 0.33658450254882083, 'f1': 0.3396969087811084}, 'combined': 0.2389827498962572, 'epoch': 28}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2974273607748184, 'r': 0.3329831932773109, 'f1': 0.3142025834505691}, 'combined': 0.2315176930688404, 'epoch': 28}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.340478128507631, 'r': 0.3321592911618948, 'f1': 0.3362672682610702}, 'combined': 0.2387497604653598, 'epoch': 28}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30627309113300494, 'r': 0.3370747492545405, 'f1': 0.3209365724609627}, 'combined': 0.23647957970807776, 'epoch': 28}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35224466731550763, 'r': 0.31689725305609806, 'f1': 0.33363734680963564}, 'combined': 0.2368825162348413, 'epoch': 28}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2986111111111111, 'r': 0.4095238095238095, 'f1': 0.3453815261044177}, 'combined': 0.23025435073627845, 'epoch': 28}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32142857142857145, 'r': 0.4891304347826087, 'f1': 0.3879310344827586}, 'combined': 0.1939655172413793, 'epoch': 28}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29411764705882354, 'r': 0.1724137931034483, 'f1': 0.2173913043478261}, 'combined': 0.14492753623188406, 'epoch': 28}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2887401938920082, 'r': 0.35393959251278423, 'f1': 0.3180326773303279}, 'combined': 0.2343398675065574, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3374190197383612, 'r': 0.30885912016190303, 'f1': 0.32250801977725824}, 'combined': 0.22689006416490531, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29006410256410253, 'r': 0.4309523809523809, 'f1': 0.34674329501915707}, 'combined': 0.23116219667943805, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r':
0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 29
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:02:54.447515: step: 2/463, loss: 0.0003914887201972306 2023-01-22 18:02:55.048138: step: 4/463, loss: 0.005576972849667072 2023-01-22 18:02:55.631296: step: 6/463, loss: 0.001547397579997778 2023-01-22 18:02:56.276708: step: 8/463, loss: 0.06775903701782227 2023-01-22 18:02:56.879621: step: 10/463, loss: 0.010049664415419102 2023-01-22 18:02:57.472626: step: 12/463, loss: 0.016414577141404152 2023-01-22 18:02:58.118047: step: 14/463, loss: 0.023569058626890182 2023-01-22 18:02:58.761470: step: 16/463, loss: 0.041077855974435806 2023-01-22 18:02:59.343869: step: 18/463, loss: 0.005656919442117214 2023-01-22 18:02:59.950223: step: 20/463, loss: 0.007052540313452482 2023-01-22 18:03:00.607050: step: 22/463, loss: 0.0005088591133244336 2023-01-22 18:03:01.238206: step: 24/463, loss: 0.04395225644111633 2023-01-22 18:03:01.810531: step: 26/463, loss: 0.0009947199141606688 2023-01-22 18:03:02.421759: step: 28/463, loss: 0.009419906884431839 2023-01-22 18:03:03.099003: step: 30/463, loss: 0.003832306480035186 2023-01-22 18:03:03.719001: step: 32/463, loss: 0.004256066866219044 2023-01-22 18:03:04.303280: step: 34/463, loss: 0.015332180075347424 2023-01-22 18:03:04.862887: step: 36/463, loss: 0.0004844457725994289 2023-01-22 18:03:05.472109: step: 38/463, loss: 0.0013962752418592572 2023-01-22 18:03:06.079143: step: 40/463, loss: 0.04071041941642761 2023-01-22 18:03:06.724118: step: 42/463, loss: 0.008008237928152084 2023-01-22 18:03:07.323841: step: 44/463, loss: 0.046955883502960205 2023-01-22 18:03:07.914819: step: 46/463, loss: 0.01782935857772827 2023-01-22 18:03:08.522228: step: 48/463, loss:
0.0012190013658255339 2023-01-22 18:03:09.172739: step: 50/463, loss: 0.022537730634212494 2023-01-22 18:03:09.774174: step: 52/463, loss: 0.036848150193691254 2023-01-22 18:03:10.339820: step: 54/463, loss: 0.005417682696133852 2023-01-22 18:03:10.903007: step: 56/463, loss: 0.0044907075352966785 2023-01-22 18:03:11.492198: step: 58/463, loss: 0.0004283687740098685 2023-01-22 18:03:12.258265: step: 60/463, loss: 11.077154159545898 2023-01-22 18:03:12.870159: step: 62/463, loss: 0.0892077386379242 2023-01-22 18:03:13.486455: step: 64/463, loss: 0.21198393404483795 2023-01-22 18:03:14.157896: step: 66/463, loss: 0.017917681485414505 2023-01-22 18:03:14.803234: step: 68/463, loss: 0.008726405911147594 2023-01-22 18:03:15.391622: step: 70/463, loss: 0.28265970945358276 2023-01-22 18:03:16.021456: step: 72/463, loss: 0.03540825843811035 2023-01-22 18:03:16.680063: step: 74/463, loss: 0.0668899193406105 2023-01-22 18:03:17.253128: step: 76/463, loss: 0.12833908200263977 2023-01-22 18:03:17.846163: step: 78/463, loss: 0.007680061738938093 2023-01-22 18:03:18.454986: step: 80/463, loss: 0.0007509983843192458 2023-01-22 18:03:19.075921: step: 82/463, loss: 0.004140624776482582 2023-01-22 18:03:19.733238: step: 84/463, loss: 0.03997078537940979 2023-01-22 18:03:20.366538: step: 86/463, loss: 0.003870470682159066 2023-01-22 18:03:20.984152: step: 88/463, loss: 0.03164122253656387 2023-01-22 18:03:21.624686: step: 90/463, loss: 0.001192582189105451 2023-01-22 18:03:22.158302: step: 92/463, loss: 0.015754753723740578 2023-01-22 18:03:22.707993: step: 94/463, loss: 0.0037303504068404436 2023-01-22 18:03:23.310347: step: 96/463, loss: 0.05817827582359314 2023-01-22 18:03:23.936279: step: 98/463, loss: 0.002822417300194502 2023-01-22 18:03:24.539559: step: 100/463, loss: 0.0018273844616487622 2023-01-22 18:03:25.252930: step: 102/463, loss: 0.0035583844874054193 2023-01-22 18:03:25.934343: step: 104/463, loss: 0.09877372533082962 2023-01-22 18:03:26.518124: step: 106/463, loss: 
0.002170482650399208 2023-01-22 18:03:27.134234: step: 108/463, loss: 0.0038933607283979654 2023-01-22 18:03:27.763386: step: 110/463, loss: 0.01427222415804863 2023-01-22 18:03:28.417176: step: 112/463, loss: 0.019852783530950546 2023-01-22 18:03:29.013223: step: 114/463, loss: 0.0025913782883435488 2023-01-22 18:03:29.626683: step: 116/463, loss: 0.015149939805269241 2023-01-22 18:03:30.208898: step: 118/463, loss: 0.0016816423740237951 2023-01-22 18:03:30.802517: step: 120/463, loss: 0.0009503490291535854 2023-01-22 18:03:31.333632: step: 122/463, loss: 0.002311705844476819 2023-01-22 18:03:31.929078: step: 124/463, loss: 0.10056640952825546 2023-01-22 18:03:32.545101: step: 126/463, loss: 0.048510245978832245 2023-01-22 18:03:33.144269: step: 128/463, loss: 2.432097608107142e-05 2023-01-22 18:03:33.815784: step: 130/463, loss: 0.06025919318199158 2023-01-22 18:03:34.371344: step: 132/463, loss: 0.05710262060165405 2023-01-22 18:03:34.934039: step: 134/463, loss: 0.0036655825097113848 2023-01-22 18:03:35.598479: step: 136/463, loss: 0.006012010853737593 2023-01-22 18:03:36.232499: step: 138/463, loss: 0.005076298955827951 2023-01-22 18:03:36.895976: step: 140/463, loss: 0.038241319358348846 2023-01-22 18:03:37.510165: step: 142/463, loss: 0.0007519533974118531 2023-01-22 18:03:38.086139: step: 144/463, loss: 0.016284534707665443 2023-01-22 18:03:38.724656: step: 146/463, loss: 0.026747338473796844 2023-01-22 18:03:39.401318: step: 148/463, loss: 0.004052776377648115 2023-01-22 18:03:40.017314: step: 150/463, loss: 0.05809955671429634 2023-01-22 18:03:40.652914: step: 152/463, loss: 0.0574604868888855 2023-01-22 18:03:41.217812: step: 154/463, loss: 0.012432574294507504 2023-01-22 18:03:41.889461: step: 156/463, loss: 0.049490686506032944 2023-01-22 18:03:42.535252: step: 158/463, loss: 0.0004762216121889651 2023-01-22 18:03:43.197544: step: 160/463, loss: 0.008385424502193928 2023-01-22 18:03:43.853200: step: 162/463, loss: 0.0403021015226841 2023-01-22 
18:03:44.457260: step: 164/463, loss: 0.00398400891572237 2023-01-22 18:03:45.083136: step: 166/463, loss: 0.04287290573120117 2023-01-22 18:03:45.667465: step: 168/463, loss: 0.00023844490351621062 2023-01-22 18:03:46.373119: step: 170/463, loss: 0.006485272664576769 2023-01-22 18:03:47.038708: step: 172/463, loss: 0.13799051940441132 2023-01-22 18:03:47.674531: step: 174/463, loss: 0.0020901276730000973 2023-01-22 18:03:48.266098: step: 176/463, loss: 0.013940956443548203 2023-01-22 18:03:48.915569: step: 178/463, loss: 0.11896011978387833 2023-01-22 18:03:49.576296: step: 180/463, loss: 0.017927976325154305 2023-01-22 18:03:50.226308: step: 182/463, loss: 0.0026327294763177633 2023-01-22 18:03:50.839330: step: 184/463, loss: 0.022679300978779793 2023-01-22 18:03:51.438893: step: 186/463, loss: 0.0009482899331487715 2023-01-22 18:03:52.069583: step: 188/463, loss: 0.031126342713832855 2023-01-22 18:03:52.623807: step: 190/463, loss: 0.12050122767686844 2023-01-22 18:03:53.352042: step: 192/463, loss: 0.04409010335803032 2023-01-22 18:03:53.992674: step: 194/463, loss: 0.000497934699524194 2023-01-22 18:03:54.617923: step: 196/463, loss: 0.0003541471669450402 2023-01-22 18:03:55.227334: step: 198/463, loss: 0.0023955742362886667 2023-01-22 18:03:55.749703: step: 200/463, loss: 0.0030397018417716026 2023-01-22 18:03:56.380989: step: 202/463, loss: 0.0031904818024486303 2023-01-22 18:03:56.978185: step: 204/463, loss: 0.03965320438146591 2023-01-22 18:03:57.573939: step: 206/463, loss: 0.002516996581107378 2023-01-22 18:03:58.249648: step: 208/463, loss: 0.0022032237611711025 2023-01-22 18:03:58.905045: step: 210/463, loss: 0.026234019547700882 2023-01-22 18:03:59.479154: step: 212/463, loss: 0.010337116196751595 2023-01-22 18:04:00.107530: step: 214/463, loss: 0.0010616221698001027 2023-01-22 18:04:00.746292: step: 216/463, loss: 0.03481316938996315 2023-01-22 18:04:01.365666: step: 218/463, loss: 0.005598996765911579 2023-01-22 18:04:02.026433: step: 220/463, 
loss: 0.0026432164013385773 2023-01-22 18:04:02.622099: step: 222/463, loss: 0.020484453067183495 2023-01-22 18:04:03.215317: step: 224/463, loss: 0.005181078799068928 2023-01-22 18:04:03.885779: step: 226/463, loss: 0.02644847333431244 2023-01-22 18:04:04.482036: step: 228/463, loss: 0.004441256634891033 2023-01-22 18:04:05.102283: step: 230/463, loss: 0.0005881215329281986 2023-01-22 18:04:05.764381: step: 232/463, loss: 0.00144354032818228 2023-01-22 18:04:06.342576: step: 234/463, loss: 0.03939047455787659 2023-01-22 18:04:06.940236: step: 236/463, loss: 0.008906861767172813 2023-01-22 18:04:07.520078: step: 238/463, loss: 0.0016461132327094674 2023-01-22 18:04:08.089869: step: 240/463, loss: 0.006462874356657267 2023-01-22 18:04:08.643112: step: 242/463, loss: 0.01575472205877304 2023-01-22 18:04:09.231547: step: 244/463, loss: 0.018910666927695274 2023-01-22 18:04:09.859925: step: 246/463, loss: 0.01126375887542963 2023-01-22 18:04:10.458619: step: 248/463, loss: 0.14323322474956512 2023-01-22 18:04:11.120741: step: 250/463, loss: 0.03058643639087677 2023-01-22 18:04:11.696296: step: 252/463, loss: 0.03092053532600403 2023-01-22 18:04:12.370124: step: 254/463, loss: 0.0061956229619681835 2023-01-22 18:04:13.021882: step: 256/463, loss: 0.01216513104736805 2023-01-22 18:04:13.617072: step: 258/463, loss: 0.006946695037186146 2023-01-22 18:04:14.291792: step: 260/463, loss: 0.03590548038482666 2023-01-22 18:04:14.926799: step: 262/463, loss: 0.03887266293168068 2023-01-22 18:04:15.552708: step: 264/463, loss: 0.04369470104575157 2023-01-22 18:04:16.155699: step: 266/463, loss: 5.589088686974719e-05 2023-01-22 18:04:16.791459: step: 268/463, loss: 0.0034993442241102457 2023-01-22 18:04:17.379810: step: 270/463, loss: 0.0029966821894049644 2023-01-22 18:04:18.012688: step: 272/463, loss: 0.0026113344356417656 2023-01-22 18:04:18.597694: step: 274/463, loss: 0.005718905944377184 2023-01-22 18:04:19.221911: step: 276/463, loss: 0.03133324533700943 2023-01-22 
18:04:19.869376: step: 278/463, loss: 0.0137581592425704 2023-01-22 18:04:20.481933: step: 280/463, loss: 0.007801003288477659 2023-01-22 18:04:21.062645: step: 282/463, loss: 0.03098185919225216 2023-01-22 18:04:21.664279: step: 284/463, loss: 0.0895136147737503 2023-01-22 18:04:22.420367: step: 286/463, loss: 0.001176695805042982 2023-01-22 18:04:23.045062: step: 288/463, loss: 0.0017950657056644559 2023-01-22 18:04:23.692161: step: 290/463, loss: 0.001105518196709454 2023-01-22 18:04:24.327247: step: 292/463, loss: 0.013744093477725983 2023-01-22 18:04:24.941269: step: 294/463, loss: 0.020104916766285896 2023-01-22 18:04:25.545835: step: 296/463, loss: 0.032298389822244644 2023-01-22 18:04:26.204224: step: 298/463, loss: 0.13137850165367126 2023-01-22 18:04:26.843363: step: 300/463, loss: 0.0005408317665569484 2023-01-22 18:04:27.535230: step: 302/463, loss: 0.0052017224952578545 2023-01-22 18:04:28.189139: step: 304/463, loss: 0.0073852273635566235 2023-01-22 18:04:28.910825: step: 306/463, loss: 0.0010250161867588758 2023-01-22 18:04:29.495450: step: 308/463, loss: 0.004428446292877197 2023-01-22 18:04:30.134561: step: 310/463, loss: 0.0021626728121191263 2023-01-22 18:04:30.735220: step: 312/463, loss: 0.0015278459759429097 2023-01-22 18:04:31.393663: step: 314/463, loss: 0.0007718686247244477 2023-01-22 18:04:32.052165: step: 316/463, loss: 0.00024556610151194036 2023-01-22 18:04:32.656366: step: 318/463, loss: 0.07632569223642349 2023-01-22 18:04:33.281937: step: 320/463, loss: 0.07321486622095108 2023-01-22 18:04:33.925661: step: 322/463, loss: 0.020352276042103767 2023-01-22 18:04:34.545510: step: 324/463, loss: 0.00484932167455554 2023-01-22 18:04:35.144806: step: 326/463, loss: 0.007487508002668619 2023-01-22 18:04:35.747137: step: 328/463, loss: 0.003619804745540023 2023-01-22 18:04:36.389126: step: 330/463, loss: 0.09593615680932999 2023-01-22 18:04:37.100366: step: 332/463, loss: 0.04941615089774132 2023-01-22 18:04:37.740516: step: 334/463, loss: 
0.04310867562890053 2023-01-22 18:04:38.413314: step: 336/463, loss: 0.00032819577609188855 2023-01-22 18:04:39.086269: step: 338/463, loss: 0.010150833986699581 2023-01-22 18:04:39.679667: step: 340/463, loss: 0.029327258467674255 2023-01-22 18:04:40.314633: step: 342/463, loss: 0.002538709668442607 2023-01-22 18:04:40.916073: step: 344/463, loss: 0.0010243378346785903 2023-01-22 18:04:41.538105: step: 346/463, loss: 0.0036662223283201456 2023-01-22 18:04:42.131672: step: 348/463, loss: 0.044041551649570465 2023-01-22 18:04:42.743984: step: 350/463, loss: 0.008993592113256454 2023-01-22 18:04:43.370067: step: 352/463, loss: 0.0165958721190691 2023-01-22 18:04:44.007602: step: 354/463, loss: 0.06362118571996689 2023-01-22 18:04:44.617254: step: 356/463, loss: 0.036150913685560226 2023-01-22 18:04:45.261762: step: 358/463, loss: 0.00221143732778728 2023-01-22 18:04:45.838649: step: 360/463, loss: 2.6260597705841064 2023-01-22 18:04:46.466481: step: 362/463, loss: 5.134716775501147e-05 2023-01-22 18:04:47.029074: step: 364/463, loss: 0.02292129211127758 2023-01-22 18:04:47.608109: step: 366/463, loss: 0.017430715262889862 2023-01-22 18:04:48.279501: step: 368/463, loss: 0.018831543624401093 2023-01-22 18:04:48.898407: step: 370/463, loss: 0.011705653741955757 2023-01-22 18:04:49.583573: step: 372/463, loss: 0.007447318639606237 2023-01-22 18:04:50.197345: step: 374/463, loss: 0.004266749136149883 2023-01-22 18:04:50.806682: step: 376/463, loss: 0.0897613987326622 2023-01-22 18:04:51.359627: step: 378/463, loss: 0.000258387066423893 2023-01-22 18:04:52.033848: step: 380/463, loss: 0.045856013894081116 2023-01-22 18:04:52.671942: step: 382/463, loss: 0.07217244803905487 2023-01-22 18:04:53.305814: step: 384/463, loss: 31.621356964111328 2023-01-22 18:04:53.942060: step: 386/463, loss: 0.034945424646139145 2023-01-22 18:04:54.554031: step: 388/463, loss: 0.07829007506370544 2023-01-22 18:04:55.216517: step: 390/463, loss: 0.04521381855010986 2023-01-22 18:04:55.840976: 
step: 392/463, loss: 0.16197924315929413 2023-01-22 18:04:56.496644: step: 394/463, loss: 0.0024033484514802694 2023-01-22 18:04:57.190539: step: 396/463, loss: 0.025754138827323914 2023-01-22 18:04:57.800244: step: 398/463, loss: 0.003672210033982992 2023-01-22 18:04:58.443033: step: 400/463, loss: 0.047989875078201294 2023-01-22 18:04:59.073594: step: 402/463, loss: 0.009732765145599842 2023-01-22 18:04:59.677355: step: 404/463, loss: 0.017397744581103325 2023-01-22 18:05:00.277125: step: 406/463, loss: 0.007540534250438213 2023-01-22 18:05:00.932388: step: 408/463, loss: 0.008393394760787487 2023-01-22 18:05:01.597430: step: 410/463, loss: 0.016669226810336113 2023-01-22 18:05:02.224237: step: 412/463, loss: 0.037026721984148026 2023-01-22 18:05:02.909791: step: 414/463, loss: 0.013164187781512737 2023-01-22 18:05:03.571691: step: 416/463, loss: 0.02203996106982231 2023-01-22 18:05:04.265457: step: 418/463, loss: 0.0005837487988173962 2023-01-22 18:05:04.907632: step: 420/463, loss: 0.2756882905960083 2023-01-22 18:05:05.614367: step: 422/463, loss: 0.06159980595111847 2023-01-22 18:05:06.230950: step: 424/463, loss: 0.023684632033109665 2023-01-22 18:05:06.868534: step: 426/463, loss: 0.01158506702631712 2023-01-22 18:05:07.468232: step: 428/463, loss: 0.7395918965339661 2023-01-22 18:05:08.069397: step: 430/463, loss: 0.0047838157042860985 2023-01-22 18:05:08.694025: step: 432/463, loss: 0.047143109142780304 2023-01-22 18:05:09.255037: step: 434/463, loss: 0.001740562729537487 2023-01-22 18:05:09.903953: step: 436/463, loss: 0.06021541357040405 2023-01-22 18:05:10.502690: step: 438/463, loss: 0.05092698708176613 2023-01-22 18:05:11.101491: step: 440/463, loss: 0.014860011637210846 2023-01-22 18:05:11.764618: step: 442/463, loss: 0.04093102738261223 2023-01-22 18:05:12.456130: step: 444/463, loss: 0.0009748899610713124 2023-01-22 18:05:13.040432: step: 446/463, loss: 0.004331653006374836 2023-01-22 18:05:13.743882: step: 448/463, loss: 0.00026309661916457117 
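As a side note on the evaluation summaries in this log: from the logged numbers, each 'f1' is the standard harmonic mean of 'p' and 'r', and 'combined' matches the product of the template F1 and slot F1. A minimal sketch reproducing the Dev Chinese figures reported at epoch 29 (the function name is illustrative, not taken from train.py, and this is an inferred relationship rather than the confirmed scoring code):

```python
def f1(p: float, r: float) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

# Dev Chinese precision/recall values as logged at epoch 29.
template_f1 = f1(1.0, 0.5833333333333334)             # ~0.7368421052631579
slot_f1 = f1(0.31154818059299194, 0.313321699647601)  # ~0.312432423300446
combined = template_f1 * slot_f1                      # ~0.23021336453717073
```

The same product relation holds for the other language/split rows in the summaries, e.g. Test Korean at epoch 17: 0.71 * 0.3377663639671779 = 0.2398141184166963.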
2023-01-22 18:05:14.312360: step: 450/463, loss: 0.010321945883333683 2023-01-22 18:05:15.062663: step: 452/463, loss: 0.01205496396869421 2023-01-22 18:05:15.667887: step: 454/463, loss: 0.0009673243621364236 2023-01-22 18:05:16.251098: step: 456/463, loss: 0.01407802663743496 2023-01-22 18:05:16.899546: step: 458/463, loss: 0.006319864187389612 2023-01-22 18:05:17.531967: step: 460/463, loss: 0.08002618700265884 2023-01-22 18:05:18.153612: step: 462/463, loss: 0.011293763294816017 2023-01-22 18:05:18.726260: step: 464/463, loss: 0.015847865492105484 2023-01-22 18:05:19.329231: step: 466/463, loss: 0.052237559109926224 2023-01-22 18:05:20.005123: step: 468/463, loss: 0.16998177766799927 2023-01-22 18:05:20.649268: step: 470/463, loss: 0.047351643443107605 2023-01-22 18:05:21.212268: step: 472/463, loss: 0.009008118882775307 2023-01-22 18:05:21.895011: step: 474/463, loss: 0.010648525319993496 2023-01-22 18:05:22.472022: step: 476/463, loss: 0.00024141841277014464 2023-01-22 18:05:23.082814: step: 478/463, loss: 0.006063411943614483 2023-01-22 18:05:23.688197: step: 480/463, loss: 0.08582402765750885 2023-01-22 18:05:24.323687: step: 482/463, loss: 0.01688452996313572 2023-01-22 18:05:24.967607: step: 484/463, loss: 0.035698242485523224 2023-01-22 18:05:25.519620: step: 486/463, loss: 0.0024564056657254696 2023-01-22 18:05:26.145470: step: 488/463, loss: 0.09767744690179825 2023-01-22 18:05:26.795570: step: 490/463, loss: 0.026396173983812332 2023-01-22 18:05:27.406743: step: 492/463, loss: 0.03116304613649845 2023-01-22 18:05:28.021624: step: 494/463, loss: 0.022785691544413567 2023-01-22 18:05:28.590412: step: 496/463, loss: 0.7129865288734436 2023-01-22 18:05:29.174794: step: 498/463, loss: 0.04368944093585014 2023-01-22 18:05:29.785800: step: 500/463, loss: 0.017652157694101334 2023-01-22 18:05:30.413339: step: 502/463, loss: 0.08276063203811646 2023-01-22 18:05:31.019590: step: 504/463, loss: 0.005638677626848221 2023-01-22 18:05:31.645369: step: 506/463, 
loss: 0.021448923274874687 2023-01-22 18:05:32.309930: step: 508/463, loss: 0.022473150864243507 2023-01-22 18:05:32.867594: step: 510/463, loss: 0.0121835982427001 2023-01-22 18:05:33.499467: step: 512/463, loss: 0.01876768097281456 2023-01-22 18:05:34.111226: step: 514/463, loss: 0.0019984664395451546 2023-01-22 18:05:34.772029: step: 516/463, loss: 0.030067726969718933 2023-01-22 18:05:35.414636: step: 518/463, loss: 0.02606113627552986 2023-01-22 18:05:35.998989: step: 520/463, loss: 0.05141742154955864 2023-01-22 18:05:36.647999: step: 522/463, loss: 0.02086355909705162 2023-01-22 18:05:37.280477: step: 524/463, loss: 0.042362798005342484 2023-01-22 18:05:37.908550: step: 526/463, loss: 0.10895591229200363 2023-01-22 18:05:38.505168: step: 528/463, loss: 0.01313256286084652 2023-01-22 18:05:39.163417: step: 530/463, loss: 0.004161592107266188 2023-01-22 18:05:39.771068: step: 532/463, loss: 0.04824601113796234 2023-01-22 18:05:40.449620: step: 534/463, loss: 0.010457382537424564 2023-01-22 18:05:41.024080: step: 536/463, loss: 0.005988478660583496 2023-01-22 18:05:41.740455: step: 538/463, loss: 0.2397231012582779 2023-01-22 18:05:42.381791: step: 540/463, loss: 0.0035204975865781307 2023-01-22 18:05:42.977720: step: 542/463, loss: 0.014744160696864128 2023-01-22 18:05:43.571714: step: 544/463, loss: 0.005024021025747061 2023-01-22 18:05:44.164067: step: 546/463, loss: 0.086214579641819 2023-01-22 18:05:44.814815: step: 548/463, loss: 0.08544863015413284 2023-01-22 18:05:45.447565: step: 550/463, loss: 0.16468545794487 2023-01-22 18:05:46.020868: step: 552/463, loss: 0.0075435335747897625 2023-01-22 18:05:46.622271: step: 554/463, loss: 0.022413009777665138 2023-01-22 18:05:47.285986: step: 556/463, loss: 0.007172274868935347 2023-01-22 18:05:47.900431: step: 558/463, loss: 0.00046209717402234674 2023-01-22 18:05:48.533443: step: 560/463, loss: 0.020075026899576187 2023-01-22 18:05:49.256252: step: 562/463, loss: 0.5611164569854736 2023-01-22 18:05:49.912881: 
step: 564/463, loss: 0.030953286215662956 2023-01-22 18:05:50.519195: step: 566/463, loss: 0.026850735768675804 2023-01-22 18:05:51.078957: step: 568/463, loss: 0.0010573522886261344 2023-01-22 18:05:51.776694: step: 570/463, loss: 0.0009187035611830652 2023-01-22 18:05:52.374057: step: 572/463, loss: 0.2791612148284912 2023-01-22 18:05:53.017664: step: 574/463, loss: 0.0021377175580710173 2023-01-22 18:05:53.664940: step: 576/463, loss: 0.0050544533878564835 2023-01-22 18:05:54.330956: step: 578/463, loss: 0.019523300230503082 2023-01-22 18:05:54.933954: step: 580/463, loss: 0.00529256509616971 2023-01-22 18:05:55.541576: step: 582/463, loss: 0.0405399352312088 2023-01-22 18:05:56.137142: step: 584/463, loss: 0.0271639171987772 2023-01-22 18:05:56.798058: step: 586/463, loss: 0.29573631286621094 2023-01-22 18:05:57.405086: step: 588/463, loss: 0.014864478260278702 2023-01-22 18:05:58.072000: step: 590/463, loss: 0.019664153456687927 2023-01-22 18:05:58.684356: step: 592/463, loss: 0.01845971867442131 2023-01-22 18:05:59.276461: step: 594/463, loss: 0.03472559526562691 2023-01-22 18:05:59.848813: step: 596/463, loss: 0.00039096680120564997 2023-01-22 18:06:00.526776: step: 598/463, loss: 0.21561750769615173 2023-01-22 18:06:01.136370: step: 600/463, loss: 0.6421136856079102 2023-01-22 18:06:01.753869: step: 602/463, loss: 0.02682858146727085 2023-01-22 18:06:02.287119: step: 604/463, loss: 0.032713666558265686 2023-01-22 18:06:02.889265: step: 606/463, loss: 0.00545856449753046 2023-01-22 18:06:03.526871: step: 608/463, loss: 0.023619504645466805 2023-01-22 18:06:04.113383: step: 610/463, loss: 0.004005948547273874 2023-01-22 18:06:04.692322: step: 612/463, loss: 0.010788863524794579 2023-01-22 18:06:05.301199: step: 614/463, loss: 0.04299852252006531 2023-01-22 18:06:05.880030: step: 616/463, loss: 0.004148825537413359 2023-01-22 18:06:06.489133: step: 618/463, loss: 0.016512783244252205 2023-01-22 18:06:07.167215: step: 620/463, loss: 0.028558718040585518 
2023-01-22 18:06:07.806552: step: 622/463, loss: 0.02039634995162487 2023-01-22 18:06:08.419129: step: 624/463, loss: 0.04354649409651756 2023-01-22 18:06:09.100465: step: 626/463, loss: 0.007526175118982792 2023-01-22 18:06:09.713744: step: 628/463, loss: 0.0059040142223238945 2023-01-22 18:06:10.330932: step: 630/463, loss: 0.028138358145952225 2023-01-22 18:06:10.896350: step: 632/463, loss: 9.206015238305554e-05 2023-01-22 18:06:11.496799: step: 634/463, loss: 0.01758112758398056 2023-01-22 18:06:12.079051: step: 636/463, loss: 0.016187870875000954 2023-01-22 18:06:12.679397: step: 638/463, loss: 0.0036649582907557487 2023-01-22 18:06:13.276250: step: 640/463, loss: 0.014780957251787186 2023-01-22 18:06:13.921236: step: 642/463, loss: 0.04213018715381622 2023-01-22 18:06:14.575249: step: 644/463, loss: 0.02352382056415081 2023-01-22 18:06:15.205300: step: 646/463, loss: 0.4817819595336914 2023-01-22 18:06:15.812135: step: 648/463, loss: 0.024256188422441483 2023-01-22 18:06:16.423956: step: 650/463, loss: 0.014130618423223495 2023-01-22 18:06:17.110805: step: 652/463, loss: 0.07165864109992981 2023-01-22 18:06:17.769722: step: 654/463, loss: 0.02124427817761898 2023-01-22 18:06:18.361183: step: 656/463, loss: 0.014942212030291557 2023-01-22 18:06:19.018599: step: 658/463, loss: 0.0017131698550656438 2023-01-22 18:06:19.651419: step: 660/463, loss: 0.0022824485786259174 2023-01-22 18:06:20.282685: step: 662/463, loss: 0.14455591142177582 2023-01-22 18:06:20.867317: step: 664/463, loss: 0.0003786091401707381 2023-01-22 18:06:21.466908: step: 666/463, loss: 0.05090665817260742 2023-01-22 18:06:22.093482: step: 668/463, loss: 1.8664562702178955 2023-01-22 18:06:22.693280: step: 670/463, loss: 0.008491952903568745 2023-01-22 18:06:23.344136: step: 672/463, loss: 0.000866522139403969 2023-01-22 18:06:23.930376: step: 674/463, loss: 0.024245696142315865 2023-01-22 18:06:24.564801: step: 676/463, loss: 0.10155075788497925 2023-01-22 18:06:25.219273: step: 678/463, 
loss: 0.04755963757634163 2023-01-22 18:06:25.864257: step: 680/463, loss: 0.0015442497096955776 2023-01-22 18:06:26.522768: step: 682/463, loss: 0.004446601495146751 2023-01-22 18:06:27.128775: step: 684/463, loss: 0.014279940165579319 2023-01-22 18:06:27.810098: step: 686/463, loss: 0.00420925859361887 2023-01-22 18:06:28.405508: step: 688/463, loss: 0.026696840301156044 2023-01-22 18:06:29.010861: step: 690/463, loss: 0.0007682603900320828 2023-01-22 18:06:29.621338: step: 692/463, loss: 0.004845873918384314 2023-01-22 18:06:30.209446: step: 694/463, loss: 0.024735506623983383 2023-01-22 18:06:30.872657: step: 696/463, loss: 0.07266804575920105 2023-01-22 18:06:31.492274: step: 698/463, loss: 0.012969673611223698 2023-01-22 18:06:32.154473: step: 700/463, loss: 0.001290955813601613 2023-01-22 18:06:32.862067: step: 702/463, loss: 0.022023595869541168 2023-01-22 18:06:33.449342: step: 704/463, loss: 0.009300443343818188 2023-01-22 18:06:34.020208: step: 706/463, loss: 0.007266368251293898 2023-01-22 18:06:34.632151: step: 708/463, loss: 0.009650597348809242 2023-01-22 18:06:35.281394: step: 710/463, loss: 0.006791943218559027 2023-01-22 18:06:35.822175: step: 712/463, loss: 0.003005444770678878 2023-01-22 18:06:36.482135: step: 714/463, loss: 0.003758802078664303 2023-01-22 18:06:37.093240: step: 716/463, loss: 0.011285132728517056 2023-01-22 18:06:37.829395: step: 718/463, loss: 0.009113574400544167 2023-01-22 18:06:38.494727: step: 720/463, loss: 0.008147427812218666 2023-01-22 18:06:39.123945: step: 722/463, loss: 0.000347251130733639 2023-01-22 18:06:39.681931: step: 724/463, loss: 0.007750391494482756 2023-01-22 18:06:40.321876: step: 726/463, loss: 0.013226618990302086 2023-01-22 18:06:40.949773: step: 728/463, loss: 0.03858989477157593 2023-01-22 18:06:41.562811: step: 730/463, loss: 0.07968893647193909 2023-01-22 18:06:42.204586: step: 732/463, loss: 0.0957961156964302 2023-01-22 18:06:42.825347: step: 734/463, loss: 0.02677401900291443 2023-01-22 
18:06:43.395057: step: 736/463, loss: 0.008960733190178871 2023-01-22 18:06:44.001934: step: 738/463, loss: 0.020980706438422203 2023-01-22 18:06:44.574950: step: 740/463, loss: 0.014655128121376038 2023-01-22 18:06:45.197356: step: 742/463, loss: 0.011478654108941555 2023-01-22 18:06:45.824220: step: 744/463, loss: 0.03162815049290657 2023-01-22 18:06:46.435747: step: 746/463, loss: 0.0002497062087059021 2023-01-22 18:06:47.054851: step: 748/463, loss: 0.002379894722253084 2023-01-22 18:06:47.621041: step: 750/463, loss: 0.006795261055231094 2023-01-22 18:06:48.215004: step: 752/463, loss: 0.01872861571609974 2023-01-22 18:06:48.823500: step: 754/463, loss: 0.016164487227797508 2023-01-22 18:06:49.474789: step: 756/463, loss: 0.0009352327906526625 2023-01-22 18:06:50.084206: step: 758/463, loss: 0.013647534884512424 2023-01-22 18:06:50.696597: step: 760/463, loss: 0.006284648552536964 2023-01-22 18:06:51.321023: step: 762/463, loss: 0.050970036536455154 2023-01-22 18:06:51.894599: step: 764/463, loss: 0.008069973438978195 2023-01-22 18:06:52.509008: step: 766/463, loss: 0.004806981422007084 2023-01-22 18:06:53.220172: step: 768/463, loss: 0.011381915770471096 2023-01-22 18:06:53.806321: step: 770/463, loss: 0.05059416964650154 2023-01-22 18:06:54.395618: step: 772/463, loss: 0.004449351690709591 2023-01-22 18:06:55.016244: step: 774/463, loss: 0.004779100883752108 2023-01-22 18:06:55.618144: step: 776/463, loss: 0.02330714650452137 2023-01-22 18:06:56.257828: step: 778/463, loss: 0.008431531488895416 2023-01-22 18:06:56.841150: step: 780/463, loss: 0.029059957712888718 2023-01-22 18:06:57.454275: step: 782/463, loss: 0.009260015562176704 2023-01-22 18:06:58.087009: step: 784/463, loss: 0.10108258575201035 2023-01-22 18:06:58.734177: step: 786/463, loss: 0.0020181983709335327 2023-01-22 18:06:59.359692: step: 788/463, loss: 0.02183091640472412 2023-01-22 18:06:59.920440: step: 790/463, loss: 0.006580478977411985 2023-01-22 18:07:00.491958: step: 792/463, loss: 
0.16445356607437134 2023-01-22 18:07:01.121355: step: 794/463, loss: 0.05463160574436188 2023-01-22 18:07:01.740431: step: 796/463, loss: 0.004266956355422735 2023-01-22 18:07:02.416603: step: 798/463, loss: 0.00814636517316103 2023-01-22 18:07:03.086624: step: 800/463, loss: 0.005915231071412563 2023-01-22 18:07:03.758906: step: 802/463, loss: 0.008371345698833466 2023-01-22 18:07:04.432191: step: 804/463, loss: 0.01655784621834755 2023-01-22 18:07:05.081242: step: 806/463, loss: 0.015464629046618938 2023-01-22 18:07:05.684697: step: 808/463, loss: 0.023200802505016327 2023-01-22 18:07:06.328492: step: 810/463, loss: 0.026450231671333313 2023-01-22 18:07:07.002780: step: 812/463, loss: 0.000593698990996927 2023-01-22 18:07:07.645682: step: 814/463, loss: 0.06238268315792084 2023-01-22 18:07:08.265732: step: 816/463, loss: 0.041439902037382126 2023-01-22 18:07:08.865374: step: 818/463, loss: 0.013739473186433315 2023-01-22 18:07:09.476013: step: 820/463, loss: 0.03209599852561951 2023-01-22 18:07:10.151385: step: 822/463, loss: 0.014313260093331337 2023-01-22 18:07:11.010631: step: 824/463, loss: 0.07013878971338272 2023-01-22 18:07:11.730290: step: 826/463, loss: 0.6172475218772888 2023-01-22 18:07:12.339393: step: 828/463, loss: 0.02992432750761509 2023-01-22 18:07:12.902193: step: 830/463, loss: 0.003681521862745285 2023-01-22 18:07:13.490886: step: 832/463, loss: 0.0013048171531409025 2023-01-22 18:07:14.067170: step: 834/463, loss: 0.011586617678403854 2023-01-22 18:07:14.703488: step: 836/463, loss: 0.03322337195277214 2023-01-22 18:07:15.242506: step: 838/463, loss: 0.0008399966754950583 2023-01-22 18:07:15.894489: step: 840/463, loss: 0.004164307378232479 2023-01-22 18:07:16.442337: step: 842/463, loss: 0.018846558406949043 2023-01-22 18:07:17.034253: step: 844/463, loss: 0.0021412770729511976 2023-01-22 18:07:17.627488: step: 846/463, loss: 0.0001682731817709282 2023-01-22 18:07:18.272817: step: 848/463, loss: 0.01972891204059124 2023-01-22 
18:07:18.865834: step: 850/463, loss: 0.0013729886850342155 2023-01-22 18:07:19.499572: step: 852/463, loss: 0.015464067459106445 2023-01-22 18:07:20.099625: step: 854/463, loss: 0.0069878422655165195 2023-01-22 18:07:20.744769: step: 856/463, loss: 0.018083522096276283 2023-01-22 18:07:21.405272: step: 858/463, loss: 0.002178867580369115 2023-01-22 18:07:22.068988: step: 860/463, loss: 0.06608951836824417 2023-01-22 18:07:22.731291: step: 862/463, loss: 0.005587534047663212 2023-01-22 18:07:23.389673: step: 864/463, loss: 0.04588538780808449 2023-01-22 18:07:24.008887: step: 866/463, loss: 0.04764103889465332 2023-01-22 18:07:24.606514: step: 868/463, loss: 0.010985337197780609 2023-01-22 18:07:25.153913: step: 870/463, loss: 0.004983668681234121 2023-01-22 18:07:25.705505: step: 872/463, loss: 0.00608850410208106 2023-01-22 18:07:26.300113: step: 874/463, loss: 0.03145509958267212 2023-01-22 18:07:26.886636: step: 876/463, loss: 0.02217182144522667 2023-01-22 18:07:27.547181: step: 878/463, loss: 0.004564650822430849 2023-01-22 18:07:28.171968: step: 880/463, loss: 0.003980085253715515 2023-01-22 18:07:28.773692: step: 882/463, loss: 0.005193396471440792 2023-01-22 18:07:29.391355: step: 884/463, loss: 0.028482798486948013 2023-01-22 18:07:30.051876: step: 886/463, loss: 0.020847423002123833 2023-01-22 18:07:30.669695: step: 888/463, loss: 0.011301984079182148 2023-01-22 18:07:31.312208: step: 890/463, loss: 0.06866087764501572 2023-01-22 18:07:31.943773: step: 892/463, loss: 0.05763440951704979 2023-01-22 18:07:32.531537: step: 894/463, loss: 0.008166320621967316 2023-01-22 18:07:33.189618: step: 896/463, loss: 0.02462875097990036 2023-01-22 18:07:33.753276: step: 898/463, loss: 0.00032986057340167463 2023-01-22 18:07:34.429910: step: 900/463, loss: 0.015516037121415138 2023-01-22 18:07:35.067287: step: 902/463, loss: 0.013101711869239807 2023-01-22 18:07:35.747583: step: 904/463, loss: 0.005989592056721449 2023-01-22 18:07:36.372969: step: 906/463, loss: 
0.022785846143960953
2023-01-22 18:07:37.024653: step: 908/463, loss: 0.005884743761271238
2023-01-22 18:07:37.665979: step: 910/463, loss: 0.013686166144907475
2023-01-22 18:07:38.230046: step: 912/463, loss: 0.0009274838375858963
2023-01-22 18:07:38.779242: step: 914/463, loss: 0.0066236392594873905
2023-01-22 18:07:39.504999: step: 916/463, loss: 7.755740080028772e-05
2023-01-22 18:07:40.109751: step: 918/463, loss: 0.004494545515626669
2023-01-22 18:07:40.714158: step: 920/463, loss: 0.018951473757624626
2023-01-22 18:07:41.345988: step: 922/463, loss: 0.010083302855491638
2023-01-22 18:07:41.939403: step: 924/463, loss: 0.0031991652213037014
2023-01-22 18:07:42.542733: step: 926/463, loss: 0.016238154843449593
==================================================
Loss: 0.136
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31716366158113735, 'r': 0.30994171862293307, 'f1': 0.31351110501782287}, 'combined': 0.23100818264471157, 'epoch': 29}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3483239962628753, 'r': 0.30516343128091344, 'f1': 0.3253184113934203}, 'combined': 0.2309760720893284, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33030674071525346, 'r': 0.32027845257209586, 'f1': 0.32521530733235937}, 'combined': 0.2396323317185806, 'epoch': 29}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3704599338642285, 'r': 0.2980489171228784, 'f1': 0.3303327456700374}, 'combined': 0.23453624942572654, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.30357142857142855, 'r': 0.3695652173913043, 'f1': 0.3333333333333333}, 'combined': 0.16666666666666666, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39705882352941174, 'r': 0.23275862068965517, 'f1': 0.2934782608695652}, 'combined': 0.19565217391304346, 'epoch': 29}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29811343141907926, 'r': 0.34053944158308486, 'f1': 0.3179172466152094}, 'combined': 0.23425481329541745, 'epoch': 20}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36505995641065214, 'r': 0.2978456014345199, 'f1': 0.32804522752903387}, 'combined': 0.23291211154561403, 'epoch': 20}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 20}
******************************
Epoch: 30
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:10:31.711321: step: 2/463, loss: 0.023948200047016144
2023-01-22 18:10:32.465777: step: 4/463, loss: 0.14404939115047455
2023-01-22 18:10:33.055812: step: 6/463, loss: 0.0027011537458747625
2023-01-22 18:10:33.675651: step: 8/463, loss: 0.055666182190179825
2023-01-22 18:10:34.310403: step: 10/463, loss: 0.0032131739426404238
2023-01-22 18:10:34.915787: step: 12/463, loss: 0.001157824881374836
2023-01-22 18:10:35.470381: step: 14/463, loss: 0.0008008122094906867
2023-01-22 18:10:36.053642: step: 16/463, loss: 0.0004668194451369345
2023-01-22 18:10:36.688001: step: 18/463, loss: 0.03428475186228752
2023-01-22 18:10:37.306009: step: 20/463, loss: 0.26941102743148804
2023-01-22 18:10:38.021666: step: 22/463, loss: 0.010036205872893333
2023-01-22 18:10:38.566587: step: 24/463, loss: 0.008043904788792133
2023-01-22
18:10:39.171883: step: 26/463, loss: 1.4233556985855103 2023-01-22 18:10:39.785844: step: 28/463, loss: 0.01979057863354683 2023-01-22 18:10:40.422913: step: 30/463, loss: 0.02773088775575161 2023-01-22 18:10:41.031749: step: 32/463, loss: 0.0009087707730941474 2023-01-22 18:10:41.679735: step: 34/463, loss: 0.23532138764858246 2023-01-22 18:10:42.228434: step: 36/463, loss: 0.24490056931972504 2023-01-22 18:10:42.908048: step: 38/463, loss: 0.0004997205105610192 2023-01-22 18:10:43.493629: step: 40/463, loss: 0.00344315217807889 2023-01-22 18:10:44.140225: step: 42/463, loss: 0.01547417975962162 2023-01-22 18:10:44.742958: step: 44/463, loss: 0.007229425944387913 2023-01-22 18:10:45.382525: step: 46/463, loss: 0.004866970703005791 2023-01-22 18:10:46.019680: step: 48/463, loss: 0.2209947258234024 2023-01-22 18:10:46.752988: step: 50/463, loss: 0.02936173789203167 2023-01-22 18:10:47.388720: step: 52/463, loss: 0.02489728108048439 2023-01-22 18:10:47.995929: step: 54/463, loss: 0.0013715631794184446 2023-01-22 18:10:48.600370: step: 56/463, loss: 0.011298605240881443 2023-01-22 18:10:49.179898: step: 58/463, loss: 0.005743944086134434 2023-01-22 18:10:49.844028: step: 60/463, loss: 0.015584814362227917 2023-01-22 18:10:50.550214: step: 62/463, loss: 0.013106060214340687 2023-01-22 18:10:51.214128: step: 64/463, loss: 0.04132722690701485 2023-01-22 18:10:51.798411: step: 66/463, loss: 0.0009316614596173167 2023-01-22 18:10:52.394967: step: 68/463, loss: 0.009831576608121395 2023-01-22 18:10:52.991750: step: 70/463, loss: 0.0011557565303519368 2023-01-22 18:10:53.629204: step: 72/463, loss: 0.0038804991636425257 2023-01-22 18:10:54.227870: step: 74/463, loss: 0.01038205437362194 2023-01-22 18:10:54.836427: step: 76/463, loss: 0.024874422699213028 2023-01-22 18:10:55.483344: step: 78/463, loss: 0.01572497747838497 2023-01-22 18:10:56.095782: step: 80/463, loss: 0.02184843271970749 2023-01-22 18:10:56.793165: step: 82/463, loss: 0.2565692663192749 2023-01-22 
18:10:57.508846: step: 84/463, loss: 0.02074008248746395 2023-01-22 18:10:58.215042: step: 86/463, loss: 0.03602277487516403 2023-01-22 18:10:58.725325: step: 88/463, loss: 0.008283732458949089 2023-01-22 18:10:59.393675: step: 90/463, loss: 0.04479197785258293 2023-01-22 18:11:00.045963: step: 92/463, loss: 0.033568188548088074 2023-01-22 18:11:00.629989: step: 94/463, loss: 0.0018867895705625415 2023-01-22 18:11:01.155813: step: 96/463, loss: 0.005669394042342901 2023-01-22 18:11:01.812864: step: 98/463, loss: 0.08342336118221283 2023-01-22 18:11:02.404616: step: 100/463, loss: 8.45475442474708e-05 2023-01-22 18:11:03.126272: step: 102/463, loss: 0.004925311077386141 2023-01-22 18:11:03.776697: step: 104/463, loss: 0.016624681651592255 2023-01-22 18:11:04.459226: step: 106/463, loss: 0.023950546979904175 2023-01-22 18:11:05.063995: step: 108/463, loss: 0.00323311029933393 2023-01-22 18:11:05.653469: step: 110/463, loss: 0.07704628258943558 2023-01-22 18:11:06.327474: step: 112/463, loss: 0.0078015876933932304 2023-01-22 18:11:06.941223: step: 114/463, loss: 0.018764011561870575 2023-01-22 18:11:07.587444: step: 116/463, loss: 0.004055830650031567 2023-01-22 18:11:08.208277: step: 118/463, loss: 0.00469647441059351 2023-01-22 18:11:08.799032: step: 120/463, loss: 0.011342434212565422 2023-01-22 18:11:09.496121: step: 122/463, loss: 0.007262876722961664 2023-01-22 18:11:10.150095: step: 124/463, loss: 0.014735093340277672 2023-01-22 18:11:10.759228: step: 126/463, loss: 0.002598400926217437 2023-01-22 18:11:11.412124: step: 128/463, loss: 0.0014067861484363675 2023-01-22 18:11:11.963992: step: 130/463, loss: 0.01282421313226223 2023-01-22 18:11:12.573058: step: 132/463, loss: 0.16156382858753204 2023-01-22 18:11:13.264319: step: 134/463, loss: 0.001516206655651331 2023-01-22 18:11:13.895271: step: 136/463, loss: 0.001136263133957982 2023-01-22 18:11:14.485529: step: 138/463, loss: 0.039045486599206924 2023-01-22 18:11:15.086910: step: 140/463, loss: 
0.0011799471685662866 2023-01-22 18:11:15.676772: step: 142/463, loss: 0.030932432040572166 2023-01-22 18:11:16.259329: step: 144/463, loss: 0.056320708245038986 2023-01-22 18:11:16.849195: step: 146/463, loss: 0.023523397743701935 2023-01-22 18:11:17.473027: step: 148/463, loss: 0.05645058676600456 2023-01-22 18:11:18.069405: step: 150/463, loss: 0.0016243322752416134 2023-01-22 18:11:18.718040: step: 152/463, loss: 0.03789251297712326 2023-01-22 18:11:19.293804: step: 154/463, loss: 0.03750096261501312 2023-01-22 18:11:19.960409: step: 156/463, loss: 0.031655505299568176 2023-01-22 18:11:20.569113: step: 158/463, loss: 0.0023483901750296354 2023-01-22 18:11:21.141583: step: 160/463, loss: 7.17196089681238e-05 2023-01-22 18:11:21.826331: step: 162/463, loss: 0.14080381393432617 2023-01-22 18:11:22.469987: step: 164/463, loss: 0.001104487106204033 2023-01-22 18:11:23.116811: step: 166/463, loss: 0.00022219610400497913 2023-01-22 18:11:23.809794: step: 168/463, loss: 0.0013512177392840385 2023-01-22 18:11:24.427441: step: 170/463, loss: 0.00853075459599495 2023-01-22 18:11:25.050392: step: 172/463, loss: 0.004934984724968672 2023-01-22 18:11:25.704758: step: 174/463, loss: 0.038530148565769196 2023-01-22 18:11:26.313798: step: 176/463, loss: 0.01424612756818533 2023-01-22 18:11:26.945137: step: 178/463, loss: 0.00029099208768457174 2023-01-22 18:11:27.620497: step: 180/463, loss: 0.004306720569729805 2023-01-22 18:11:28.217280: step: 182/463, loss: 0.00014756170276086777 2023-01-22 18:11:28.871382: step: 184/463, loss: 0.008879352360963821 2023-01-22 18:11:29.542498: step: 186/463, loss: 0.02718985825777054 2023-01-22 18:11:30.181145: step: 188/463, loss: 0.02127208560705185 2023-01-22 18:11:30.790000: step: 190/463, loss: 0.6382697224617004 2023-01-22 18:11:31.444656: step: 192/463, loss: 0.0473245307803154 2023-01-22 18:11:32.037603: step: 194/463, loss: 0.020394712686538696 2023-01-22 18:11:32.798976: step: 196/463, loss: 0.0036867656745016575 2023-01-22 
18:11:33.401515: step: 198/463, loss: 0.006057628896087408 2023-01-22 18:11:33.990060: step: 200/463, loss: 0.00301573914475739 2023-01-22 18:11:34.580739: step: 202/463, loss: 0.012359154410660267 2023-01-22 18:11:35.206243: step: 204/463, loss: 0.023704219609498978 2023-01-22 18:11:35.875493: step: 206/463, loss: 0.0009804400615394115 2023-01-22 18:11:36.442953: step: 208/463, loss: 0.024044297635555267 2023-01-22 18:11:37.074434: step: 210/463, loss: 0.0014464708510786295 2023-01-22 18:11:37.696233: step: 212/463, loss: 0.0012989290989935398 2023-01-22 18:11:38.354592: step: 214/463, loss: 0.007914329878985882 2023-01-22 18:11:38.970775: step: 216/463, loss: 0.03295142948627472 2023-01-22 18:11:39.583494: step: 218/463, loss: 0.019422050565481186 2023-01-22 18:11:40.261659: step: 220/463, loss: 0.09406920522451401 2023-01-22 18:11:40.978988: step: 222/463, loss: 0.35805922746658325 2023-01-22 18:11:41.588871: step: 224/463, loss: 0.021297164261341095 2023-01-22 18:11:42.209023: step: 226/463, loss: 0.011777711100876331 2023-01-22 18:11:42.781958: step: 228/463, loss: 0.004479800350964069 2023-01-22 18:11:43.363621: step: 230/463, loss: 0.02752714231610298 2023-01-22 18:11:43.994444: step: 232/463, loss: 0.01448436826467514 2023-01-22 18:11:44.676114: step: 234/463, loss: 0.00012848593178205192 2023-01-22 18:11:45.292166: step: 236/463, loss: 0.019066892564296722 2023-01-22 18:11:45.901589: step: 238/463, loss: 0.01006449293345213 2023-01-22 18:11:46.493628: step: 240/463, loss: 0.004511532373726368 2023-01-22 18:11:47.130561: step: 242/463, loss: 0.022410444915294647 2023-01-22 18:11:47.727330: step: 244/463, loss: 0.024690503254532814 2023-01-22 18:11:48.334631: step: 246/463, loss: 0.0048939441330730915 2023-01-22 18:11:48.905547: step: 248/463, loss: 0.06040778011083603 2023-01-22 18:11:49.458141: step: 250/463, loss: 0.009408536367118359 2023-01-22 18:11:50.105562: step: 252/463, loss: 0.015400157310068607 2023-01-22 18:11:50.727070: step: 254/463, loss: 
0.002154512098059058 2023-01-22 18:11:51.332381: step: 256/463, loss: 0.002788411220535636 2023-01-22 18:11:51.973660: step: 258/463, loss: 0.004452574998140335 2023-01-22 18:11:52.603465: step: 260/463, loss: 0.004687020555138588 2023-01-22 18:11:53.291742: step: 262/463, loss: 0.03842202201485634 2023-01-22 18:11:53.904181: step: 264/463, loss: 0.011100695468485355 2023-01-22 18:11:54.581189: step: 266/463, loss: 0.028578925877809525 2023-01-22 18:11:55.224820: step: 268/463, loss: 0.01955319195985794 2023-01-22 18:11:55.828539: step: 270/463, loss: 0.004465459380298853 2023-01-22 18:11:56.494255: step: 272/463, loss: 0.05097714439034462 2023-01-22 18:11:57.124932: step: 274/463, loss: 0.023559074848890305 2023-01-22 18:11:57.791878: step: 276/463, loss: 0.026307255029678345 2023-01-22 18:11:58.387227: step: 278/463, loss: 0.008315099403262138 2023-01-22 18:11:58.942631: step: 280/463, loss: 0.016143200919032097 2023-01-22 18:11:59.602037: step: 282/463, loss: 0.04260721430182457 2023-01-22 18:12:00.188562: step: 284/463, loss: 0.007427839562296867 2023-01-22 18:12:00.734540: step: 286/463, loss: 0.13695195317268372 2023-01-22 18:12:01.338443: step: 288/463, loss: 0.01557847298681736 2023-01-22 18:12:01.898505: step: 290/463, loss: 0.0008198956493288279 2023-01-22 18:12:02.520261: step: 292/463, loss: 0.014469236135482788 2023-01-22 18:12:03.097235: step: 294/463, loss: 0.0032709655351936817 2023-01-22 18:12:03.898524: step: 296/463, loss: 0.008295631036162376 2023-01-22 18:12:04.484618: step: 298/463, loss: 0.016531944274902344 2023-01-22 18:12:05.105587: step: 300/463, loss: 0.014960959553718567 2023-01-22 18:12:05.648109: step: 302/463, loss: 0.021415378898382187 2023-01-22 18:12:06.358230: step: 304/463, loss: 0.003210564376786351 2023-01-22 18:12:06.982673: step: 306/463, loss: 0.0020850584842264652 2023-01-22 18:12:07.590954: step: 308/463, loss: 0.0291800107806921 2023-01-22 18:12:08.223028: step: 310/463, loss: 0.0016299487324431539 2023-01-22 
18:12:08.829930: step: 312/463, loss: 0.02358255907893181 2023-01-22 18:12:09.539005: step: 314/463, loss: 0.0030994764529168606 2023-01-22 18:12:10.140097: step: 316/463, loss: 0.019881032407283783 2023-01-22 18:12:10.765751: step: 318/463, loss: 0.04933161288499832 2023-01-22 18:12:11.321122: step: 320/463, loss: 0.0029673855751752853 2023-01-22 18:12:11.927988: step: 322/463, loss: 0.006268812343478203 2023-01-22 18:12:12.592707: step: 324/463, loss: 0.015069541521370411 2023-01-22 18:12:13.182843: step: 326/463, loss: 0.0013001947663724422 2023-01-22 18:12:13.799523: step: 328/463, loss: 0.05226092040538788 2023-01-22 18:12:14.441182: step: 330/463, loss: 0.04904714971780777 2023-01-22 18:12:15.040050: step: 332/463, loss: 0.02910422533750534 2023-01-22 18:12:15.619351: step: 334/463, loss: 0.021304460242390633 2023-01-22 18:12:16.290960: step: 336/463, loss: 0.005897172261029482 2023-01-22 18:12:16.852231: step: 338/463, loss: 0.0006253680330701172 2023-01-22 18:12:17.521488: step: 340/463, loss: 0.007108220364898443 2023-01-22 18:12:18.130011: step: 342/463, loss: 0.12392474710941315 2023-01-22 18:12:18.748077: step: 344/463, loss: 0.04380347579717636 2023-01-22 18:12:19.382482: step: 346/463, loss: 0.0095811253413558 2023-01-22 18:12:19.969453: step: 348/463, loss: 0.004952526651322842 2023-01-22 18:12:20.584690: step: 350/463, loss: 0.023340018466114998 2023-01-22 18:12:21.174093: step: 352/463, loss: 0.0026507005095481873 2023-01-22 18:12:21.725119: step: 354/463, loss: 0.008857302367687225 2023-01-22 18:12:22.341001: step: 356/463, loss: 0.0015662929508835077 2023-01-22 18:12:22.978971: step: 358/463, loss: 0.010397627018392086 2023-01-22 18:12:23.641169: step: 360/463, loss: 0.02089148573577404 2023-01-22 18:12:24.201836: step: 362/463, loss: 0.000866714573930949 2023-01-22 18:12:24.789296: step: 364/463, loss: 0.002715121256187558 2023-01-22 18:12:25.399782: step: 366/463, loss: 0.040322382003068924 2023-01-22 18:12:26.035913: step: 368/463, loss: 
0.0043486133217811584 2023-01-22 18:12:26.581203: step: 370/463, loss: 0.010896815918385983 2023-01-22 18:12:27.214111: step: 372/463, loss: 0.00037029205122962594 2023-01-22 18:12:27.925711: step: 374/463, loss: 0.0074162790551781654 2023-01-22 18:12:28.513740: step: 376/463, loss: 0.03430149331688881 2023-01-22 18:12:29.198328: step: 378/463, loss: 0.017231294885277748 2023-01-22 18:12:29.835730: step: 380/463, loss: 0.012513946741819382 2023-01-22 18:12:30.462630: step: 382/463, loss: 0.0058025214821100235 2023-01-22 18:12:31.120844: step: 384/463, loss: 0.0472773015499115 2023-01-22 18:12:31.786959: step: 386/463, loss: 0.04094253107905388 2023-01-22 18:12:32.426752: step: 388/463, loss: 0.00018742491374723613 2023-01-22 18:12:33.072138: step: 390/463, loss: 0.003474854864180088 2023-01-22 18:12:33.715314: step: 392/463, loss: 0.0768037959933281 2023-01-22 18:12:34.323065: step: 394/463, loss: 0.014737810008227825 2023-01-22 18:12:34.873576: step: 396/463, loss: 0.03229651600122452 2023-01-22 18:12:35.462520: step: 398/463, loss: 0.012442879378795624 2023-01-22 18:12:36.068585: step: 400/463, loss: 0.00793522410094738 2023-01-22 18:12:36.699490: step: 402/463, loss: 0.0023170760832726955 2023-01-22 18:12:37.279357: step: 404/463, loss: 0.0003207748231943697 2023-01-22 18:12:37.888452: step: 406/463, loss: 0.00019383880135137588 2023-01-22 18:12:38.671595: step: 408/463, loss: 0.006120300851762295 2023-01-22 18:12:39.303380: step: 410/463, loss: 0.002548555377870798 2023-01-22 18:12:39.900262: step: 412/463, loss: 0.03936883434653282 2023-01-22 18:12:40.563898: step: 414/463, loss: 0.24368350207805634 2023-01-22 18:12:41.197669: step: 416/463, loss: 3.6957385540008545 2023-01-22 18:12:41.837000: step: 418/463, loss: 0.030091989785432816 2023-01-22 18:12:42.480771: step: 420/463, loss: 0.021147722378373146 2023-01-22 18:12:43.078744: step: 422/463, loss: 0.0026078636292368174 2023-01-22 18:12:43.722847: step: 424/463, loss: 0.0024680921342223883 2023-01-22 
18:12:44.295048: step: 426/463, loss: 0.027086207643151283 2023-01-22 18:12:44.969378: step: 428/463, loss: 0.0014038127847015858 2023-01-22 18:12:45.557493: step: 430/463, loss: 0.015695499256253242 2023-01-22 18:12:46.130873: step: 432/463, loss: 0.01798972114920616 2023-01-22 18:12:46.793776: step: 434/463, loss: 0.040278758853673935 2023-01-22 18:12:47.431754: step: 436/463, loss: 0.005900786258280277 2023-01-22 18:12:48.061858: step: 438/463, loss: 0.006206972524523735 2023-01-22 18:12:48.736569: step: 440/463, loss: 0.001937671098858118 2023-01-22 18:12:49.277764: step: 442/463, loss: 0.0002884727728087455 2023-01-22 18:12:49.916250: step: 444/463, loss: 0.02178526669740677 2023-01-22 18:12:50.479162: step: 446/463, loss: 0.0029507277067750692 2023-01-22 18:12:51.028591: step: 448/463, loss: 0.028034847229719162 2023-01-22 18:12:51.662172: step: 450/463, loss: 0.03076230175793171 2023-01-22 18:12:52.260380: step: 452/463, loss: 0.007163890637457371 2023-01-22 18:12:52.901917: step: 454/463, loss: 0.01690208539366722 2023-01-22 18:12:53.553867: step: 456/463, loss: 0.015023274347186089 2023-01-22 18:12:54.155998: step: 458/463, loss: 0.004815500695258379 2023-01-22 18:12:54.808232: step: 460/463, loss: 0.007886810228228569 2023-01-22 18:12:55.558970: step: 462/463, loss: 0.04339291527867317 2023-01-22 18:12:56.164292: step: 464/463, loss: 0.013576248660683632 2023-01-22 18:12:56.733955: step: 466/463, loss: 0.006912542507052422 2023-01-22 18:12:57.415095: step: 468/463, loss: 0.0016215142095461488 2023-01-22 18:12:58.013804: step: 470/463, loss: 0.008909751661121845 2023-01-22 18:12:58.605130: step: 472/463, loss: 0.003114415565505624 2023-01-22 18:12:59.218194: step: 474/463, loss: 0.026890741661190987 2023-01-22 18:12:59.889719: step: 476/463, loss: 0.0017079211538657546 2023-01-22 18:13:00.509926: step: 478/463, loss: 0.02384045161306858 2023-01-22 18:13:01.117067: step: 480/463, loss: 0.0069001163356006145 2023-01-22 18:13:01.748141: step: 482/463, loss: 
0.11666595190763474 2023-01-22 18:13:02.404467: step: 484/463, loss: 0.29434671998023987 2023-01-22 18:13:03.037904: step: 486/463, loss: 0.02785714715719223 2023-01-22 18:13:03.706781: step: 488/463, loss: 0.2949220538139343 2023-01-22 18:13:04.303603: step: 490/463, loss: 0.01611381769180298 2023-01-22 18:13:04.925340: step: 492/463, loss: 0.0355745293200016 2023-01-22 18:13:05.533715: step: 494/463, loss: 0.0016642909031361341 2023-01-22 18:13:06.153311: step: 496/463, loss: 0.020432552322745323 2023-01-22 18:13:06.750034: step: 498/463, loss: 0.04677198454737663 2023-01-22 18:13:07.319287: step: 500/463, loss: 0.0003822250582743436 2023-01-22 18:13:07.971749: step: 502/463, loss: 0.016908539459109306 2023-01-22 18:13:08.568850: step: 504/463, loss: 0.003968917764723301 2023-01-22 18:13:09.141943: step: 506/463, loss: 0.007253527641296387 2023-01-22 18:13:09.778034: step: 508/463, loss: 0.048732686787843704 2023-01-22 18:13:10.390456: step: 510/463, loss: 0.04935967177152634 2023-01-22 18:13:11.002800: step: 512/463, loss: 0.15548154711723328 2023-01-22 18:13:11.623901: step: 514/463, loss: 0.00038724354817532003 2023-01-22 18:13:12.270741: step: 516/463, loss: 0.000767572782933712 2023-01-22 18:13:12.881311: step: 518/463, loss: 0.0004339531878940761 2023-01-22 18:13:13.614599: step: 520/463, loss: 0.13186833262443542 2023-01-22 18:13:14.265290: step: 522/463, loss: 0.027762528508901596 2023-01-22 18:13:14.948735: step: 524/463, loss: 0.008001415990293026 2023-01-22 18:13:15.575953: step: 526/463, loss: 0.004161191638559103 2023-01-22 18:13:16.169818: step: 528/463, loss: 0.008406359702348709 2023-01-22 18:13:16.776041: step: 530/463, loss: 0.0019117603078484535 2023-01-22 18:13:17.435778: step: 532/463, loss: 0.0017048605950549245 2023-01-22 18:13:18.072524: step: 534/463, loss: 0.00047492844169028103 2023-01-22 18:13:18.616241: step: 536/463, loss: 0.002072973409667611 2023-01-22 18:13:19.289442: step: 538/463, loss: 0.16199597716331482 2023-01-22 
18:13:19.925417: step: 540/463, loss: 0.00015400812844745815 2023-01-22 18:13:20.494252: step: 542/463, loss: 6.968522211536765e-05 2023-01-22 18:13:21.131532: step: 544/463, loss: 0.008989878930151463 2023-01-22 18:13:21.755485: step: 546/463, loss: 0.0027830805629491806 2023-01-22 18:13:22.330329: step: 548/463, loss: 0.0013350980589166284 2023-01-22 18:13:22.956096: step: 550/463, loss: 0.010845081880688667 2023-01-22 18:13:23.570325: step: 552/463, loss: 0.3503897488117218 2023-01-22 18:13:24.222220: step: 554/463, loss: 0.03919822350144386 2023-01-22 18:13:24.926245: step: 556/463, loss: 0.04039038345217705 2023-01-22 18:13:25.572653: step: 558/463, loss: 0.05769497528672218 2023-01-22 18:13:26.168164: step: 560/463, loss: 0.003731369972229004 2023-01-22 18:13:26.761684: step: 562/463, loss: 0.5754086971282959 2023-01-22 18:13:27.407949: step: 564/463, loss: 0.003930822480469942 2023-01-22 18:13:28.049366: step: 566/463, loss: 0.07386160641908646 2023-01-22 18:13:28.636527: step: 568/463, loss: 0.0025761513970792294 2023-01-22 18:13:29.294553: step: 570/463, loss: 0.03909789025783539 2023-01-22 18:13:29.897964: step: 572/463, loss: 1.0972036123275757 2023-01-22 18:13:30.483515: step: 574/463, loss: 0.030675988644361496 2023-01-22 18:13:31.133948: step: 576/463, loss: 0.6055426597595215 2023-01-22 18:13:31.725887: step: 578/463, loss: 0.008553487248718739 2023-01-22 18:13:32.333986: step: 580/463, loss: 0.06233445927500725 2023-01-22 18:13:32.868265: step: 582/463, loss: 0.010070525109767914 2023-01-22 18:13:33.490167: step: 584/463, loss: 0.02859129011631012 2023-01-22 18:13:34.073944: step: 586/463, loss: 0.0010363359469920397 2023-01-22 18:13:34.681011: step: 588/463, loss: 0.005983466282486916 2023-01-22 18:13:35.319231: step: 590/463, loss: 0.15692682564258575 2023-01-22 18:13:36.040962: step: 592/463, loss: 0.017561476677656174 2023-01-22 18:13:36.639740: step: 594/463, loss: 0.003962086513638496 2023-01-22 18:13:37.214106: step: 596/463, loss: 
0.011827449314296246 2023-01-22 18:13:37.867151: step: 598/463, loss: 0.011248989962041378 2023-01-22 18:13:38.499663: step: 600/463, loss: 0.008587057702243328 2023-01-22 18:13:39.095034: step: 602/463, loss: 0.0034255480859428644 2023-01-22 18:13:39.766621: step: 604/463, loss: 0.004411138128489256 2023-01-22 18:13:40.440344: step: 606/463, loss: 0.019318683072924614 2023-01-22 18:13:41.041757: step: 608/463, loss: 0.053919848054647446 2023-01-22 18:13:41.781713: step: 610/463, loss: 0.00286387512460351 2023-01-22 18:13:42.404620: step: 612/463, loss: 0.09522686153650284 2023-01-22 18:13:43.038143: step: 614/463, loss: 0.028980562463402748 2023-01-22 18:13:43.605626: step: 616/463, loss: 0.018289050087332726 2023-01-22 18:13:44.190965: step: 618/463, loss: 0.015610437840223312 2023-01-22 18:13:44.802711: step: 620/463, loss: 0.046121641993522644 2023-01-22 18:13:45.408275: step: 622/463, loss: 0.03760074824094772 2023-01-22 18:13:45.972992: step: 624/463, loss: 0.005352803505957127 2023-01-22 18:13:46.611291: step: 626/463, loss: 0.016087571159005165 2023-01-22 18:13:47.329133: step: 628/463, loss: 0.01791827380657196 2023-01-22 18:13:47.938222: step: 630/463, loss: 0.012377656064927578 2023-01-22 18:13:48.600634: step: 632/463, loss: 0.00025046136579476297 2023-01-22 18:13:49.240130: step: 634/463, loss: 0.007167185191065073 2023-01-22 18:13:49.861238: step: 636/463, loss: 0.0035229800269007683 2023-01-22 18:13:50.416929: step: 638/463, loss: 0.0009591412381269038 2023-01-22 18:13:51.049507: step: 640/463, loss: 0.009967419318854809 2023-01-22 18:13:51.646857: step: 642/463, loss: 0.01822073385119438 2023-01-22 18:13:52.255866: step: 644/463, loss: 0.5648950934410095 2023-01-22 18:13:52.898982: step: 646/463, loss: 0.03150377795100212 2023-01-22 18:13:53.662783: step: 648/463, loss: 0.0005830589216202497 2023-01-22 18:13:54.252025: step: 650/463, loss: 0.0031935253646224737 2023-01-22 18:13:54.919463: step: 652/463, loss: 0.015042461454868317 2023-01-22 
18:13:55.482029: step: 654/463, loss: 0.014571400359272957 2023-01-22 18:13:56.067466: step: 656/463, loss: 0.0020906534045934677 2023-01-22 18:13:56.691148: step: 658/463, loss: 0.020395148545503616 2023-01-22 18:13:57.303187: step: 660/463, loss: 0.03264773264527321 2023-01-22 18:13:57.958324: step: 662/463, loss: 0.2238789200782776 2023-01-22 18:13:58.549497: step: 664/463, loss: 0.009959045797586441 2023-01-22 18:13:59.221182: step: 666/463, loss: 0.01719987764954567 2023-01-22 18:13:59.864868: step: 668/463, loss: 0.22881001234054565 2023-01-22 18:14:00.466168: step: 670/463, loss: 0.006574646569788456 2023-01-22 18:14:01.135859: step: 672/463, loss: 0.0060072652995586395 2023-01-22 18:14:01.812719: step: 674/463, loss: 0.022581797093153 2023-01-22 18:14:02.415232: step: 676/463, loss: 0.0031595751643180847 2023-01-22 18:14:03.072116: step: 678/463, loss: 0.04732261598110199 2023-01-22 18:14:03.687502: step: 680/463, loss: 0.013669160194694996 2023-01-22 18:14:04.387173: step: 682/463, loss: 0.16138359904289246 2023-01-22 18:14:05.075774: step: 684/463, loss: 0.015681058168411255 2023-01-22 18:14:05.749052: step: 686/463, loss: 0.035312749445438385 2023-01-22 18:14:06.374427: step: 688/463, loss: 0.06733265519142151 2023-01-22 18:14:07.051024: step: 690/463, loss: 0.01868884637951851 2023-01-22 18:14:07.758969: step: 692/463, loss: 0.0005354206077754498 2023-01-22 18:14:08.367726: step: 694/463, loss: 0.007417973596602678 2023-01-22 18:14:09.004790: step: 696/463, loss: 0.022458119317889214 2023-01-22 18:14:09.669777: step: 698/463, loss: 0.028263045474886894 2023-01-22 18:14:10.304332: step: 700/463, loss: 0.00888818223029375 2023-01-22 18:14:10.878966: step: 702/463, loss: 0.014472449198365211 2023-01-22 18:14:11.505017: step: 704/463, loss: 0.0031615043990314007 2023-01-22 18:14:12.166443: step: 706/463, loss: 0.4925404191017151 2023-01-22 18:14:12.774427: step: 708/463, loss: 0.10952349007129669 2023-01-22 18:14:13.344890: step: 710/463, loss: 
0.0006102272309362888 2023-01-22 18:14:13.959734: step: 712/463, loss: 0.005679975263774395 2023-01-22 18:14:14.589331: step: 714/463, loss: 0.05565885826945305 2023-01-22 18:14:15.229764: step: 716/463, loss: 0.04839213192462921 2023-01-22 18:14:15.773275: step: 718/463, loss: 0.009029711596667767 2023-01-22 18:14:16.344697: step: 720/463, loss: 0.07806672900915146 2023-01-22 18:14:16.971592: step: 722/463, loss: 0.7232964634895325 2023-01-22 18:14:17.598781: step: 724/463, loss: 0.015376397408545017 2023-01-22 18:14:18.184231: step: 726/463, loss: 0.012584639713168144 2023-01-22 18:14:18.814395: step: 728/463, loss: 0.026783838868141174 2023-01-22 18:14:19.482973: step: 730/463, loss: 0.021375712007284164 2023-01-22 18:14:20.050922: step: 732/463, loss: 0.015645084902644157 2023-01-22 18:14:20.664939: step: 734/463, loss: 0.2184251993894577 2023-01-22 18:14:21.291072: step: 736/463, loss: 0.0007292705122381449 2023-01-22 18:14:21.924788: step: 738/463, loss: 0.033939048647880554 2023-01-22 18:14:22.501054: step: 740/463, loss: 1.2818611139664426e-05 2023-01-22 18:14:23.105207: step: 742/463, loss: 0.030507301911711693 2023-01-22 18:14:23.709126: step: 744/463, loss: 0.03035360760986805 2023-01-22 18:14:24.353340: step: 746/463, loss: 0.003208100562915206 2023-01-22 18:14:24.953377: step: 748/463, loss: 0.013953549787402153 2023-01-22 18:14:25.591675: step: 750/463, loss: 0.01943303272128105 2023-01-22 18:14:26.216422: step: 752/463, loss: 0.0012870366917923093 2023-01-22 18:14:26.776366: step: 754/463, loss: 0.009313479997217655 2023-01-22 18:14:27.333656: step: 756/463, loss: 0.013476977124810219 2023-01-22 18:14:27.938458: step: 758/463, loss: 0.007046813145279884 2023-01-22 18:14:28.606810: step: 760/463, loss: 0.06792444735765457 2023-01-22 18:14:29.155514: step: 762/463, loss: 0.010370973497629166 2023-01-22 18:14:29.771133: step: 764/463, loss: 0.04215259104967117 2023-01-22 18:14:30.447278: step: 766/463, loss: 0.016461750492453575 2023-01-22 
18:14:31.193095: step: 768/463, loss: 0.011332379654049873 2023-01-22 18:14:31.813489: step: 770/463, loss: 0.004935069940984249 2023-01-22 18:14:32.455476: step: 772/463, loss: 0.0021530399098992348 2023-01-22 18:14:33.103051: step: 774/463, loss: 0.01050308533012867 2023-01-22 18:14:33.714283: step: 776/463, loss: 0.004323417320847511 2023-01-22 18:14:34.360424: step: 778/463, loss: 0.0796646848320961 2023-01-22 18:14:34.967675: step: 780/463, loss: 0.002258620923385024 2023-01-22 18:14:35.610847: step: 782/463, loss: 0.018394850194454193 2023-01-22 18:14:36.239366: step: 784/463, loss: 0.02932923473417759 2023-01-22 18:14:36.972726: step: 786/463, loss: 0.026468796655535698 2023-01-22 18:14:37.572856: step: 788/463, loss: 0.05370144173502922 2023-01-22 18:14:38.184381: step: 790/463, loss: 0.023551400750875473 2023-01-22 18:14:38.795966: step: 792/463, loss: 0.005214824341237545 2023-01-22 18:14:39.412755: step: 794/463, loss: 0.03873399645090103 2023-01-22 18:14:39.968715: step: 796/463, loss: 0.01496395654976368 2023-01-22 18:14:40.585037: step: 798/463, loss: 0.012165072374045849 2023-01-22 18:14:41.226641: step: 800/463, loss: 0.09790411591529846 2023-01-22 18:14:41.808455: step: 802/463, loss: 0.06166449561715126 2023-01-22 18:14:42.416526: step: 804/463, loss: 0.1707279533147812 2023-01-22 18:14:42.990206: step: 806/463, loss: 0.0027621288318187 2023-01-22 18:14:43.619573: step: 808/463, loss: 0.006634900346398354 2023-01-22 18:14:44.228796: step: 810/463, loss: 0.048995934426784515 2023-01-22 18:14:44.856615: step: 812/463, loss: 0.007062920834869146 2023-01-22 18:14:45.438656: step: 814/463, loss: 0.0005750320851802826 2023-01-22 18:14:46.119787: step: 816/463, loss: 0.008104966022074223 2023-01-22 18:14:46.866068: step: 818/463, loss: 0.004252342507243156 2023-01-22 18:14:47.478624: step: 820/463, loss: 0.020631130784749985 2023-01-22 18:14:48.105101: step: 822/463, loss: 0.004894971381872892 2023-01-22 18:14:48.707331: step: 824/463, loss: 
0.0016357406275346875 2023-01-22 18:14:49.340701: step: 826/463, loss: 0.009907460771501064 2023-01-22 18:14:49.964343: step: 828/463, loss: 0.04246728867292404 2023-01-22 18:14:50.670953: step: 830/463, loss: 0.0011992910876870155 2023-01-22 18:14:51.327570: step: 832/463, loss: 0.03676518797874451 2023-01-22 18:14:51.935228: step: 834/463, loss: 0.0012569967657327652 2023-01-22 18:14:52.507241: step: 836/463, loss: 0.004173298832029104 2023-01-22 18:14:53.123225: step: 838/463, loss: 0.0007855732110328972 2023-01-22 18:14:53.750179: step: 840/463, loss: 0.00662184227257967 2023-01-22 18:14:54.404855: step: 842/463, loss: 0.009061768651008606 2023-01-22 18:14:55.018053: step: 844/463, loss: 0.004875887650996447 2023-01-22 18:14:55.580451: step: 846/463, loss: 0.010103992186486721 2023-01-22 18:14:56.208391: step: 848/463, loss: 0.04652407765388489 2023-01-22 18:14:56.758002: step: 850/463, loss: 0.015500756911933422 2023-01-22 18:14:57.357460: step: 852/463, loss: 0.010568263940513134 2023-01-22 18:14:58.079023: step: 854/463, loss: 0.001282501150853932 2023-01-22 18:14:58.747160: step: 856/463, loss: 0.018431924283504486 2023-01-22 18:14:59.362234: step: 858/463, loss: 0.009424363262951374 2023-01-22 18:14:59.962915: step: 860/463, loss: 0.010929495096206665 2023-01-22 18:15:00.604077: step: 862/463, loss: 0.10462957620620728 2023-01-22 18:15:01.234626: step: 864/463, loss: 0.02342650294303894 2023-01-22 18:15:01.838778: step: 866/463, loss: 0.02719004452228546 2023-01-22 18:15:02.494874: step: 868/463, loss: 0.014827726408839226 2023-01-22 18:15:03.130680: step: 870/463, loss: 0.02214731276035309 2023-01-22 18:15:03.718654: step: 872/463, loss: 0.0008395720506086946 2023-01-22 18:15:04.354983: step: 874/463, loss: 0.0023085554130375385 2023-01-22 18:15:04.965147: step: 876/463, loss: 0.0011846653651446104 2023-01-22 18:15:05.591413: step: 878/463, loss: 0.007740335538983345 2023-01-22 18:15:06.274341: step: 880/463, loss: 0.09933079779148102 2023-01-22 
18:15:06.865470: step: 882/463, loss: 0.03235408663749695 2023-01-22 18:15:07.467809: step: 884/463, loss: 0.013510629534721375 2023-01-22 18:15:08.105499: step: 886/463, loss: 0.09222543239593506 2023-01-22 18:15:08.740157: step: 888/463, loss: 0.002404292579740286 2023-01-22 18:15:09.371805: step: 890/463, loss: 0.0023015434853732586 2023-01-22 18:15:09.941017: step: 892/463, loss: 0.0011289699468761683 2023-01-22 18:15:10.569665: step: 894/463, loss: 0.22704465687274933 2023-01-22 18:15:11.174324: step: 896/463, loss: 0.018727142363786697 2023-01-22 18:15:11.800777: step: 898/463, loss: 0.0022262544371187687 2023-01-22 18:15:12.394556: step: 900/463, loss: 0.005623789504170418 2023-01-22 18:15:12.967118: step: 902/463, loss: 0.0018593050772324204 2023-01-22 18:15:13.558031: step: 904/463, loss: 0.02604919672012329 2023-01-22 18:15:14.179972: step: 906/463, loss: 0.00552030187100172 2023-01-22 18:15:14.823247: step: 908/463, loss: 0.03418564423918724 2023-01-22 18:15:15.459560: step: 910/463, loss: 0.02089224010705948 2023-01-22 18:15:16.098804: step: 912/463, loss: 0.21227945387363434 2023-01-22 18:15:16.699456: step: 914/463, loss: 0.0019891841802746058 2023-01-22 18:15:17.354276: step: 916/463, loss: 0.03263230621814728 2023-01-22 18:15:17.929372: step: 918/463, loss: 0.013114752247929573 2023-01-22 18:15:18.569369: step: 920/463, loss: 0.021778322756290436 2023-01-22 18:15:19.224217: step: 922/463, loss: 0.010525264777243137 2023-01-22 18:15:19.786794: step: 924/463, loss: 0.002981335623189807 2023-01-22 18:15:20.435540: step: 926/463, loss: 0.010050356388092041
==================================================
Loss: 0.050
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28104676999783224, 'r': 0.35144178639197615, 'f1': 0.31232684895205975}, 'combined': 0.230135572912044, 'epoch': 30}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34333805844859144, 'r': 0.33674692643648935, 'f1': 0.3400105530363144}, 'combined': 0.23920340414615085, 'epoch': 30}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2891833931374748, 'r': 0.34954425318514504, 'f1': 0.31651172066764854}, 'combined': 0.2332191625972147, 'epoch': 30}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3382166806352807, 'r': 0.32935510821688907, 'f1': 0.33372707867994517}, 'combined': 0.23694622586276107, 'epoch': 30}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2513020833333333, 'r': 0.4595238095238095, 'f1': 0.3249158249158249}, 'combined': 0.21661054994388323, 'epoch': 30}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27439024390243905, 'r': 0.4891304347826087, 'f1': 0.35156250000000006}, 'combined': 0.17578125000000003, 'epoch': 30}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
New best russian model...
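The 'combined' scores in the evaluation summaries are consistent with the product of the template F1 and the slot F1, where each F1 is the harmonic mean of the logged p and r. A minimal sketch reproducing the epoch-30 Dev Chinese numbers above; the helper names are hypothetical, inferred from the logged values rather than taken from train.py:

```python
# Hypothetical helpers inferred from the logged numbers; the actual train.py code may differ.

def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    # The logged 'combined' value matches template F1 times slot F1.
    return template_f1 * slot_f1

# Dev Chinese, epoch 30, from the summary above:
template_f1 = f1(1.0, 0.5833333333333334)               # ~0.7368421052631579
slot_f1 = f1(0.28104676999783224, 0.35144178639197615)  # ~0.31232684895205975
print(combined(template_f1, slot_f1))                   # ~0.230135572912044
```

The same product relationship holds for every Dev/Test/Sample record in the log, so it appears to be how the per-language model-selection score ("New best russian model...") is computed.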
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 31
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:18:09.426768: step: 2/463, loss: 0.0034697141963988543 2023-01-22 18:18:10.040257: step: 4/463, loss: 7.213461503852159e-05 2023-01-22 18:18:10.623492: step: 6/463, loss: 0.006965832784771919 2023-01-22 18:18:11.350203: step: 8/463, loss: 0.006881648674607277 2023-01-22 18:18:11.930651: step: 10/463, loss: 0.00900797639042139 2023-01-22 18:18:12.544818: step: 12/463, loss: 0.008540589362382889 2023-01-22 18:18:13.187708: step: 14/463, loss: 0.010402970016002655 2023-01-22 18:18:13.807498: step: 16/463, loss: 0.007035038899630308 2023-01-22 18:18:14.359308: step: 18/463, loss: 0.004793423227965832 2023-01-22 18:18:15.033980: step: 20/463, loss: 0.04580916091799736 2023-01-22 18:18:15.672698: step: 22/463, loss: 0.3791797459125519 2023-01-22 18:18:16.289699: step: 24/463, loss: 0.0029842581134289503 2023-01-22 18:18:16.932291: step: 26/463, loss: 0.001278057461604476 2023-01-22 18:18:17.497662: step: 28/463, loss: 0.008154268376529217 2023-01-22 18:18:18.228426: step: 30/463, loss: 0.026554269716143608 2023-01-22 18:18:18.889605: step: 32/463, loss: 0.003693509614095092 2023-01-22 18:18:19.516519: step: 34/463, loss: 0.0029341252520680428 2023-01-22 18:18:20.128153: step: 36/463, loss: 0.000851911143399775 2023-01-22 18:18:20.720281: step: 38/463, loss: 0.000727150880265981 2023-01-22 18:18:21.385996: step: 40/463, loss: 0.007314437534660101 2023-01-22 18:18:22.058760: step: 42/463, loss: 0.00884784571826458 2023-01-22 18:18:22.679852: step: 44/463, loss: 0.010852406732738018 2023-01-22 18:18:23.267856: step: 46/463, loss: 0.00033820801763795316 2023-01-22 18:18:23.848962: step: 48/463, loss:
0.051972609013319016 2023-01-22 18:18:24.485638: step: 50/463, loss: 0.040540169924497604 2023-01-22 18:18:25.142952: step: 52/463, loss: 0.0003875389520544559 2023-01-22 18:18:25.683734: step: 54/463, loss: 0.012576003558933735 2023-01-22 18:18:26.305823: step: 56/463, loss: 0.010207431390881538 2023-01-22 18:18:26.861923: step: 58/463, loss: 0.003847378073260188 2023-01-22 18:18:27.511834: step: 60/463, loss: 0.006159787531942129 2023-01-22 18:18:28.123143: step: 62/463, loss: 0.003099992172792554 2023-01-22 18:18:28.800591: step: 64/463, loss: 0.025655722245573997 2023-01-22 18:18:29.425157: step: 66/463, loss: 0.027867279946804047 2023-01-22 18:18:30.039878: step: 68/463, loss: 0.02575061097741127 2023-01-22 18:18:30.655633: step: 70/463, loss: 0.006011165212839842 2023-01-22 18:18:31.328482: step: 72/463, loss: 0.018847400322556496 2023-01-22 18:18:31.954743: step: 74/463, loss: 0.005331492051482201 2023-01-22 18:18:32.554169: step: 76/463, loss: 0.007087979931384325 2023-01-22 18:18:33.238357: step: 78/463, loss: 0.024247780442237854 2023-01-22 18:18:33.895924: step: 80/463, loss: 0.04615018889307976 2023-01-22 18:18:34.555991: step: 82/463, loss: 0.019899262115359306 2023-01-22 18:18:35.183797: step: 84/463, loss: 0.044249989092350006 2023-01-22 18:18:35.837556: step: 86/463, loss: 0.0005138160195201635 2023-01-22 18:18:36.491134: step: 88/463, loss: 0.11633506417274475 2023-01-22 18:18:37.118022: step: 90/463, loss: 9.683575626695529e-05 2023-01-22 18:18:37.699286: step: 92/463, loss: 0.007304362487047911 2023-01-22 18:18:38.292613: step: 94/463, loss: 0.001968808239325881 2023-01-22 18:18:38.866619: step: 96/463, loss: 0.21648988127708435 2023-01-22 18:18:39.423277: step: 98/463, loss: 0.010240547358989716 2023-01-22 18:18:39.992946: step: 100/463, loss: 0.3823312222957611 2023-01-22 18:18:40.607197: step: 102/463, loss: 0.003952830098569393 2023-01-22 18:18:41.255154: step: 104/463, loss: 0.009170093573629856 2023-01-22 18:18:41.849774: step: 106/463, 
loss: 0.001861494849435985 2023-01-22 18:18:42.452653: step: 108/463, loss: 0.005969810765236616 2023-01-22 18:18:43.017438: step: 110/463, loss: 0.002797572175040841 2023-01-22 18:18:43.578805: step: 112/463, loss: 0.007652714848518372 2023-01-22 18:18:44.232971: step: 114/463, loss: 0.002494953805580735 2023-01-22 18:18:44.930914: step: 116/463, loss: 0.08275073021650314 2023-01-22 18:18:45.559278: step: 118/463, loss: 0.019888008013367653 2023-01-22 18:18:46.129038: step: 120/463, loss: 0.2197272926568985 2023-01-22 18:18:46.744080: step: 122/463, loss: 0.04050542414188385 2023-01-22 18:18:47.394949: step: 124/463, loss: 0.013758723624050617 2023-01-22 18:18:48.097620: step: 126/463, loss: 0.006622448097914457 2023-01-22 18:18:48.745678: step: 128/463, loss: 0.011364995501935482 2023-01-22 18:18:49.362706: step: 130/463, loss: 0.002550726057961583 2023-01-22 18:18:49.991286: step: 132/463, loss: 0.004346586763858795 2023-01-22 18:18:50.600630: step: 134/463, loss: 0.013523002155125141 2023-01-22 18:18:51.153474: step: 136/463, loss: 0.0004118725482840091 2023-01-22 18:18:51.760618: step: 138/463, loss: 0.0014834207249805331 2023-01-22 18:18:52.379832: step: 140/463, loss: 0.012234459631145 2023-01-22 18:18:53.054847: step: 142/463, loss: 0.0017610699869692326 2023-01-22 18:18:53.659371: step: 144/463, loss: 0.001764387940056622 2023-01-22 18:18:54.297848: step: 146/463, loss: 0.09388061612844467 2023-01-22 18:18:54.891628: step: 148/463, loss: 0.0024837786331772804 2023-01-22 18:18:55.503867: step: 150/463, loss: 0.005958440713584423 2023-01-22 18:18:56.118378: step: 152/463, loss: 4.94223690032959 2023-01-22 18:18:56.744640: step: 154/463, loss: 0.06739182025194168 2023-01-22 18:18:57.381285: step: 156/463, loss: 0.012782353907823563 2023-01-22 18:18:57.966294: step: 158/463, loss: 0.002535733859986067 2023-01-22 18:18:58.572132: step: 160/463, loss: 0.030457543209195137 2023-01-22 18:18:59.174088: step: 162/463, loss: 0.0581279993057251 2023-01-22 
18:18:59.755784: step: 164/463, loss: 0.03132342919707298 2023-01-22 18:19:00.458710: step: 166/463, loss: 0.004091935697942972 2023-01-22 18:19:01.125210: step: 168/463, loss: 0.0022033238783478737 2023-01-22 18:19:01.789199: step: 170/463, loss: 0.08216611295938492 2023-01-22 18:19:02.437472: step: 172/463, loss: 0.09897327423095703 2023-01-22 18:19:03.105180: step: 174/463, loss: 0.014863393269479275 2023-01-22 18:19:03.704401: step: 176/463, loss: 0.00794917345046997 2023-01-22 18:19:04.309488: step: 178/463, loss: 0.006093865260481834 2023-01-22 18:19:05.004047: step: 180/463, loss: 0.0024429578334093094 2023-01-22 18:19:05.593622: step: 182/463, loss: 0.0017789560370147228 2023-01-22 18:19:06.204102: step: 184/463, loss: 0.0001380879111820832 2023-01-22 18:19:06.789530: step: 186/463, loss: 4.164044730714522e-05 2023-01-22 18:19:07.385284: step: 188/463, loss: 0.010775119997560978 2023-01-22 18:19:07.992887: step: 190/463, loss: 0.008560256101191044 2023-01-22 18:19:08.653914: step: 192/463, loss: 0.009549740701913834 2023-01-22 18:19:09.246898: step: 194/463, loss: 0.08187191188335419 2023-01-22 18:19:09.814176: step: 196/463, loss: 1.7170051336288452 2023-01-22 18:19:10.477928: step: 198/463, loss: 0.05909733846783638 2023-01-22 18:19:11.116494: step: 200/463, loss: 0.03847628831863403 2023-01-22 18:19:11.739622: step: 202/463, loss: 0.19852684438228607 2023-01-22 18:19:12.422334: step: 204/463, loss: 0.008707087486982346 2023-01-22 18:19:12.996739: step: 206/463, loss: 0.01595526561141014 2023-01-22 18:19:13.584042: step: 208/463, loss: 0.011712945997714996 2023-01-22 18:19:14.133309: step: 210/463, loss: 0.0012921106535941362 2023-01-22 18:19:14.753684: step: 212/463, loss: 0.00217556394636631 2023-01-22 18:19:15.388879: step: 214/463, loss: 0.02038075588643551 2023-01-22 18:19:15.975000: step: 216/463, loss: 0.38973933458328247 2023-01-22 18:19:16.613862: step: 218/463, loss: 0.0052379099652171135 2023-01-22 18:19:17.211493: step: 220/463, loss: 
0.019734056666493416 2023-01-22 18:19:17.855074: step: 222/463, loss: 0.024135669693350792 2023-01-22 18:19:18.486795: step: 224/463, loss: 0.0025779984425753355 2023-01-22 18:19:19.150753: step: 226/463, loss: 0.0013697071699425578 2023-01-22 18:19:19.826456: step: 228/463, loss: 0.004149887710809708 2023-01-22 18:19:20.451522: step: 230/463, loss: 0.05079057812690735 2023-01-22 18:19:21.038070: step: 232/463, loss: 0.01663527451455593 2023-01-22 18:19:21.672257: step: 234/463, loss: 0.022769294679164886 2023-01-22 18:19:22.280444: step: 236/463, loss: 0.02324012480676174 2023-01-22 18:19:22.949251: step: 238/463, loss: 0.009856556542217731 2023-01-22 18:19:23.642298: step: 240/463, loss: 0.005087092984467745 2023-01-22 18:19:24.267676: step: 242/463, loss: 0.006066466215997934 2023-01-22 18:19:24.907578: step: 244/463, loss: 0.009654073975980282 2023-01-22 18:19:25.533121: step: 246/463, loss: 0.0014127518516033888 2023-01-22 18:19:26.190587: step: 248/463, loss: 0.004239039961248636 2023-01-22 18:19:26.817931: step: 250/463, loss: 0.0030084946192801 2023-01-22 18:19:27.472743: step: 252/463, loss: 0.005904610734432936 2023-01-22 18:19:28.089128: step: 254/463, loss: 0.0006347369635477662 2023-01-22 18:19:28.760655: step: 256/463, loss: 0.0297253280878067 2023-01-22 18:19:29.482540: step: 258/463, loss: 0.0010816711001098156 2023-01-22 18:19:30.071410: step: 260/463, loss: 0.00027384122950024903 2023-01-22 18:19:30.625477: step: 262/463, loss: 0.003992012701928616 2023-01-22 18:19:31.260057: step: 264/463, loss: 0.01676459051668644 2023-01-22 18:19:31.902594: step: 266/463, loss: 0.013297022320330143 2023-01-22 18:19:32.491364: step: 268/463, loss: 0.017277555540204048 2023-01-22 18:19:33.115792: step: 270/463, loss: 0.00940115749835968 2023-01-22 18:19:33.767392: step: 272/463, loss: 0.029294399544596672 2023-01-22 18:19:34.409118: step: 274/463, loss: 0.05817404016852379 2023-01-22 18:19:35.022369: step: 276/463, loss: 0.017300566658377647 2023-01-22 
18:19:35.671025: step: 278/463, loss: 0.011246436275541782 2023-01-22 18:19:36.260355: step: 280/463, loss: 0.007550498936325312 2023-01-22 18:19:36.834464: step: 282/463, loss: 0.005410152021795511 2023-01-22 18:19:37.387049: step: 284/463, loss: 0.00626436248421669 2023-01-22 18:19:38.057540: step: 286/463, loss: 0.02834450826048851 2023-01-22 18:19:38.685475: step: 288/463, loss: 0.021798590198159218 2023-01-22 18:19:39.286302: step: 290/463, loss: 0.00814418401569128 2023-01-22 18:19:39.875336: step: 292/463, loss: 0.010207520797848701 2023-01-22 18:19:40.476305: step: 294/463, loss: 0.01407572627067566 2023-01-22 18:19:41.084335: step: 296/463, loss: 0.0024294760078191757 2023-01-22 18:19:41.746651: step: 298/463, loss: 0.002293537138029933 2023-01-22 18:19:42.428597: step: 300/463, loss: 0.025300638750195503 2023-01-22 18:19:43.063052: step: 302/463, loss: 0.0022776476107537746 2023-01-22 18:19:43.711366: step: 304/463, loss: 0.01726912334561348 2023-01-22 18:19:44.323694: step: 306/463, loss: 0.010087522678077221 2023-01-22 18:19:44.936903: step: 308/463, loss: 0.036564819514751434 2023-01-22 18:19:45.475082: step: 310/463, loss: 0.009090251289308071 2023-01-22 18:19:46.077539: step: 312/463, loss: 0.015777084976434708 2023-01-22 18:19:46.712999: step: 314/463, loss: 0.0030677663162350655 2023-01-22 18:19:47.386811: step: 316/463, loss: 0.03831237554550171 2023-01-22 18:19:48.010900: step: 318/463, loss: 0.035081569105386734 2023-01-22 18:19:48.659847: step: 320/463, loss: 0.010471419431269169 2023-01-22 18:19:49.260108: step: 322/463, loss: 0.012351276353001595 2023-01-22 18:19:49.889780: step: 324/463, loss: 0.0021423758007586002 2023-01-22 18:19:50.556464: step: 326/463, loss: 0.11121780425310135 2023-01-22 18:19:51.190249: step: 328/463, loss: 0.06433888524770737 2023-01-22 18:19:51.819280: step: 330/463, loss: 0.02450012043118477 2023-01-22 18:19:52.482709: step: 332/463, loss: 0.02998599223792553 2023-01-22 18:19:53.111206: step: 334/463, loss: 
0.050536561757326126 2023-01-22 18:19:53.806314: step: 336/463, loss: 0.0029485237319022417 2023-01-22 18:19:54.373226: step: 338/463, loss: 0.06969481706619263 2023-01-22 18:19:55.014746: step: 340/463, loss: 6.351657066261396e-05 2023-01-22 18:19:55.710599: step: 342/463, loss: 0.0008613711106590927 2023-01-22 18:19:56.350888: step: 344/463, loss: 0.006981618236750364 2023-01-22 18:19:56.970656: step: 346/463, loss: 0.05143416300415993 2023-01-22 18:19:57.536264: step: 348/463, loss: 0.0008798282942734659 2023-01-22 18:19:58.102710: step: 350/463, loss: 0.3609671890735626 2023-01-22 18:19:58.707059: step: 352/463, loss: 0.0007065609097480774 2023-01-22 18:19:59.355037: step: 354/463, loss: 0.408699095249176 2023-01-22 18:19:59.946799: step: 356/463, loss: 0.02070353366434574 2023-01-22 18:20:00.565278: step: 358/463, loss: 0.01863936521112919 2023-01-22 18:20:01.177496: step: 360/463, loss: 0.018299000337719917 2023-01-22 18:20:01.791545: step: 362/463, loss: 0.015054052695631981 2023-01-22 18:20:02.387078: step: 364/463, loss: 0.0026879352517426014 2023-01-22 18:20:03.052621: step: 366/463, loss: 0.00772927887737751 2023-01-22 18:20:03.650008: step: 368/463, loss: 0.0036399865057319403 2023-01-22 18:20:04.238093: step: 370/463, loss: 0.01593363843858242 2023-01-22 18:20:04.897137: step: 372/463, loss: 0.25168073177337646 2023-01-22 18:20:05.533159: step: 374/463, loss: 0.014135019853711128 2023-01-22 18:20:06.181096: step: 376/463, loss: 0.005792160518467426 2023-01-22 18:20:06.841679: step: 378/463, loss: 0.006022203713655472 2023-01-22 18:20:07.453396: step: 380/463, loss: 0.009728223085403442 2023-01-22 18:20:08.080822: step: 382/463, loss: 0.059636350721120834 2023-01-22 18:20:08.681789: step: 384/463, loss: 0.00021271263540256768 2023-01-22 18:20:09.239747: step: 386/463, loss: 0.00735977990552783 2023-01-22 18:20:09.879307: step: 388/463, loss: 0.000963760947342962 2023-01-22 18:20:10.520029: step: 390/463, loss: 0.003006977727636695 2023-01-22 
18:20:11.194059: step: 392/463, loss: 0.006326448637992144 2023-01-22 18:20:11.886694: step: 394/463, loss: 0.01627027988433838 2023-01-22 18:20:12.520606: step: 396/463, loss: 0.03300509229302406 2023-01-22 18:20:13.154004: step: 398/463, loss: 0.0012454361421987414 2023-01-22 18:20:13.819189: step: 400/463, loss: 0.001087996643036604 2023-01-22 18:20:14.443947: step: 402/463, loss: 0.0675671324133873 2023-01-22 18:20:15.127618: step: 404/463, loss: 0.07142652571201324 2023-01-22 18:20:15.693388: step: 406/463, loss: 0.0006489591905847192 2023-01-22 18:20:16.322463: step: 408/463, loss: 0.0939200296998024 2023-01-22 18:20:16.998776: step: 410/463, loss: 0.6741977334022522 2023-01-22 18:20:17.622000: step: 412/463, loss: 0.10462718456983566 2023-01-22 18:20:18.214567: step: 414/463, loss: 0.016559455543756485 2023-01-22 18:20:18.816633: step: 416/463, loss: 0.0066527570597827435 2023-01-22 18:20:19.461928: step: 418/463, loss: 0.017350398004055023 2023-01-22 18:20:20.045892: step: 420/463, loss: 0.00012676433834712952 2023-01-22 18:20:20.765366: step: 422/463, loss: 0.003922599833458662 2023-01-22 18:20:21.329679: step: 424/463, loss: 3.6304047107696533 2023-01-22 18:20:21.909167: step: 426/463, loss: 0.011224079877138138 2023-01-22 18:20:22.483982: step: 428/463, loss: 0.0074096317403018475 2023-01-22 18:20:23.067207: step: 430/463, loss: 0.004232236184179783 2023-01-22 18:20:23.659738: step: 432/463, loss: 0.04192217066884041 2023-01-22 18:20:24.208402: step: 434/463, loss: 0.017733968794345856 2023-01-22 18:20:24.849229: step: 436/463, loss: 0.0022405071649700403 2023-01-22 18:20:25.534849: step: 438/463, loss: 0.006156692281365395 2023-01-22 18:20:26.124346: step: 440/463, loss: 0.01638326607644558 2023-01-22 18:20:26.753015: step: 442/463, loss: 0.0005889717722311616 2023-01-22 18:20:27.325442: step: 444/463, loss: 0.01060851477086544 2023-01-22 18:20:27.939039: step: 446/463, loss: 0.042973242700099945 2023-01-22 18:20:28.627195: step: 448/463, loss: 
0.011050221510231495 2023-01-22 18:20:29.254563: step: 450/463, loss: 0.005555661395192146 2023-01-22 18:20:29.924043: step: 452/463, loss: 0.013292369432747364 2023-01-22 18:20:30.573646: step: 454/463, loss: 0.10558976978063583 2023-01-22 18:20:31.219871: step: 456/463, loss: 0.11108414828777313 2023-01-22 18:20:31.783083: step: 458/463, loss: 0.030451759696006775 2023-01-22 18:20:32.388671: step: 460/463, loss: 0.040511392056941986 2023-01-22 18:20:32.998851: step: 462/463, loss: 7.512211595894769e-05 2023-01-22 18:20:33.600518: step: 464/463, loss: 0.005217378959059715 2023-01-22 18:20:34.168619: step: 466/463, loss: 0.0040438431315124035 2023-01-22 18:20:34.818142: step: 468/463, loss: 0.0022796131670475006 2023-01-22 18:20:35.393769: step: 470/463, loss: 0.0006914663827046752 2023-01-22 18:20:35.977674: step: 472/463, loss: 0.002179347909986973 2023-01-22 18:20:36.642472: step: 474/463, loss: 0.03809194639325142 2023-01-22 18:20:37.312191: step: 476/463, loss: 0.017258284613490105 2023-01-22 18:20:37.910461: step: 478/463, loss: 0.01539873518049717 2023-01-22 18:20:38.498034: step: 480/463, loss: 0.008838477544486523 2023-01-22 18:20:39.139664: step: 482/463, loss: 0.003954616840928793 2023-01-22 18:20:39.769134: step: 484/463, loss: 0.050852470099925995 2023-01-22 18:20:40.346248: step: 486/463, loss: 0.009960491210222244 2023-01-22 18:20:40.927746: step: 488/463, loss: 1.3520069122314453 2023-01-22 18:20:41.684674: step: 490/463, loss: 0.0007649831240996718 2023-01-22 18:20:42.234397: step: 492/463, loss: 0.17580921947956085 2023-01-22 18:20:42.885458: step: 494/463, loss: 0.049489039927721024 2023-01-22 18:20:43.559217: step: 496/463, loss: 0.020153913646936417 2023-01-22 18:20:44.171309: step: 498/463, loss: 0.013928745873272419 2023-01-22 18:20:44.722407: step: 500/463, loss: 0.008221771568059921 2023-01-22 18:20:45.368081: step: 502/463, loss: 0.0018057673005387187 2023-01-22 18:20:45.974094: step: 504/463, loss: 0.012483587488532066 2023-01-22 
18:20:46.564175: step: 506/463, loss: 0.03156764432787895 2023-01-22 18:20:47.146108: step: 508/463, loss: 0.005704615265130997 2023-01-22 18:20:47.716177: step: 510/463, loss: 0.04239983111619949 2023-01-22 18:20:48.345335: step: 512/463, loss: 0.02729366347193718 2023-01-22 18:20:48.978240: step: 514/463, loss: 0.00031904669594950974 2023-01-22 18:20:49.604789: step: 516/463, loss: 0.013547600246965885 2023-01-22 18:20:50.293836: step: 518/463, loss: 0.005843594670295715 2023-01-22 18:20:50.938696: step: 520/463, loss: 0.002676575444638729 2023-01-22 18:20:51.553664: step: 522/463, loss: 0.023169398307800293 2023-01-22 18:20:52.189037: step: 524/463, loss: 0.019753923639655113 2023-01-22 18:20:52.811716: step: 526/463, loss: 0.02781996876001358 2023-01-22 18:20:53.413701: step: 528/463, loss: 0.02394329011440277 2023-01-22 18:20:54.017518: step: 530/463, loss: 0.014806583523750305 2023-01-22 18:20:54.682712: step: 532/463, loss: 0.009182552807033062 2023-01-22 18:20:55.356475: step: 534/463, loss: 0.039411768317222595 2023-01-22 18:20:55.998038: step: 536/463, loss: 0.01279967650771141 2023-01-22 18:20:56.637908: step: 538/463, loss: 0.019111791625618935 2023-01-22 18:20:57.241247: step: 540/463, loss: 0.00539689464494586 2023-01-22 18:20:57.824484: step: 542/463, loss: 0.05816657096147537 2023-01-22 18:20:58.446261: step: 544/463, loss: 6.834299711044878e-05 2023-01-22 18:20:59.064639: step: 546/463, loss: 0.0006665163091383874 2023-01-22 18:20:59.670537: step: 548/463, loss: 0.0004959981306456029 2023-01-22 18:21:00.259913: step: 550/463, loss: 0.019053084775805473 2023-01-22 18:21:00.832211: step: 552/463, loss: 0.02591102570295334 2023-01-22 18:21:01.472994: step: 554/463, loss: 0.012576212175190449 2023-01-22 18:21:02.214750: step: 556/463, loss: 0.03803124278783798 2023-01-22 18:21:02.814870: step: 558/463, loss: 0.005009842105209827 2023-01-22 18:21:03.435496: step: 560/463, loss: 0.003059673821553588 2023-01-22 18:21:04.057071: step: 562/463, loss: 
0.01563168317079544 2023-01-22 18:21:04.716494: step: 564/463, loss: 0.02057463303208351 2023-01-22 18:21:05.326423: step: 566/463, loss: 0.005659275222569704 2023-01-22 18:21:05.928621: step: 568/463, loss: 0.04112184792757034 2023-01-22 18:21:06.553298: step: 570/463, loss: 0.5756282210350037 2023-01-22 18:21:07.161812: step: 572/463, loss: 0.001162281259894371 2023-01-22 18:21:07.826314: step: 574/463, loss: 0.01426804717630148 2023-01-22 18:21:08.406903: step: 576/463, loss: 0.04646402969956398 2023-01-22 18:21:09.012137: step: 578/463, loss: 0.003588128602132201 2023-01-22 18:21:09.613237: step: 580/463, loss: 0.000626835972070694 2023-01-22 18:21:10.203188: step: 582/463, loss: 0.03200998529791832 2023-01-22 18:21:10.869831: step: 584/463, loss: 0.0014391746371984482 2023-01-22 18:21:11.392974: step: 586/463, loss: 0.0017121941782534122 2023-01-22 18:21:11.999157: step: 588/463, loss: 0.004215845838189125 2023-01-22 18:21:12.700239: step: 590/463, loss: 0.003956167493015528 2023-01-22 18:21:13.377215: step: 592/463, loss: 0.0014639532892033458 2023-01-22 18:21:13.967364: step: 594/463, loss: 0.06867779791355133 2023-01-22 18:21:14.576762: step: 596/463, loss: 0.002170664956793189 2023-01-22 18:21:15.265348: step: 598/463, loss: 0.018251409754157066 2023-01-22 18:21:15.881152: step: 600/463, loss: 0.011018474586308002 2023-01-22 18:21:16.476376: step: 602/463, loss: 0.0013550458243116736 2023-01-22 18:21:17.086094: step: 604/463, loss: 0.00682104704901576 2023-01-22 18:21:17.668810: step: 606/463, loss: 0.0029662332963198423 2023-01-22 18:21:18.322558: step: 608/463, loss: 0.018733007833361626 2023-01-22 18:21:18.922596: step: 610/463, loss: 0.10786278545856476 2023-01-22 18:21:19.600227: step: 612/463, loss: 0.022705424576997757 2023-01-22 18:21:20.249933: step: 614/463, loss: 0.07716985791921616 2023-01-22 18:21:20.854039: step: 616/463, loss: 0.011697415262460709 2023-01-22 18:21:21.361420: step: 618/463, loss: 4.518195419223048e-05 2023-01-22 
18:21:22.007521: step: 620/463, loss: 0.15230655670166016 2023-01-22 18:21:22.588731: step: 622/463, loss: 0.028954140841960907 2023-01-22 18:21:23.171444: step: 624/463, loss: 0.0023211664520204067 2023-01-22 18:21:23.781085: step: 626/463, loss: 8.909948519431055e-05 2023-01-22 18:21:24.337368: step: 628/463, loss: 0.008880862034857273 2023-01-22 18:21:24.944525: step: 630/463, loss: 0.033434897661209106 2023-01-22 18:21:25.677739: step: 632/463, loss: 0.003263346618041396 2023-01-22 18:21:26.360375: step: 634/463, loss: 0.17233973741531372 2023-01-22 18:21:26.932276: step: 636/463, loss: 1.7139488458633423 2023-01-22 18:21:27.543695: step: 638/463, loss: 0.010796098969876766 2023-01-22 18:21:28.134812: step: 640/463, loss: 0.0068534910678863525 2023-01-22 18:21:28.786630: step: 642/463, loss: 0.068217433989048 2023-01-22 18:21:29.404012: step: 644/463, loss: 0.004636696074157953 2023-01-22 18:21:30.124919: step: 646/463, loss: 0.01768891140818596 2023-01-22 18:21:30.751774: step: 648/463, loss: 0.04882446676492691 2023-01-22 18:21:31.409906: step: 650/463, loss: 0.024751078337430954 2023-01-22 18:21:32.080437: step: 652/463, loss: 0.038990721106529236 2023-01-22 18:21:32.657024: step: 654/463, loss: 0.03226310387253761 2023-01-22 18:21:33.372914: step: 656/463, loss: 0.014916284941136837 2023-01-22 18:21:33.963211: step: 658/463, loss: 0.009870043024420738 2023-01-22 18:21:34.583623: step: 660/463, loss: 0.01927579753100872 2023-01-22 18:21:35.226239: step: 662/463, loss: 0.07705847173929214 2023-01-22 18:21:35.868712: step: 664/463, loss: 0.016272718086838722 2023-01-22 18:21:36.488570: step: 666/463, loss: 0.1324833482503891 2023-01-22 18:21:37.105777: step: 668/463, loss: 0.0579785518348217 2023-01-22 18:21:37.788872: step: 670/463, loss: 0.008580625988543034 2023-01-22 18:21:38.404093: step: 672/463, loss: 0.09812193363904953 2023-01-22 18:21:39.056944: step: 674/463, loss: 0.002078806748613715 2023-01-22 18:21:39.741493: step: 676/463, loss: 
0.025751248002052307 2023-01-22 18:21:40.314210: step: 678/463, loss: 0.02130742557346821 2023-01-22 18:21:40.912386: step: 680/463, loss: 0.002567109651863575 2023-01-22 18:21:41.567544: step: 682/463, loss: 0.18203774094581604 2023-01-22 18:21:42.167941: step: 684/463, loss: 0.007050244137644768 2023-01-22 18:21:42.795293: step: 686/463, loss: 0.25232353806495667 2023-01-22 18:21:43.491442: step: 688/463, loss: 0.009148597717285156 2023-01-22 18:21:44.085721: step: 690/463, loss: 0.03980245813727379 2023-01-22 18:21:44.654671: step: 692/463, loss: 0.0007751992670819163 2023-01-22 18:21:45.253791: step: 694/463, loss: 0.01693861000239849 2023-01-22 18:21:45.909155: step: 696/463, loss: 0.025222016498446465 2023-01-22 18:21:46.510104: step: 698/463, loss: 0.03731339052319527 2023-01-22 18:21:47.090167: step: 700/463, loss: 0.00807987805455923 2023-01-22 18:21:47.652425: step: 702/463, loss: 0.004370031412690878 2023-01-22 18:21:48.228774: step: 704/463, loss: 0.0029959494713693857 2023-01-22 18:21:48.907843: step: 706/463, loss: 0.005270706955343485 2023-01-22 18:21:49.543679: step: 708/463, loss: 0.0026616055984050035 2023-01-22 18:21:50.176172: step: 710/463, loss: 0.013084055855870247 2023-01-22 18:21:50.833080: step: 712/463, loss: 0.0007389066740870476 2023-01-22 18:21:51.455207: step: 714/463, loss: 0.012192999944090843 2023-01-22 18:21:52.072876: step: 716/463, loss: 0.111172616481781 2023-01-22 18:21:52.640660: step: 718/463, loss: 0.004755302798002958 2023-01-22 18:21:53.292512: step: 720/463, loss: 0.0052588521502912045 2023-01-22 18:21:53.915911: step: 722/463, loss: 0.015571122989058495 2023-01-22 18:21:54.533735: step: 724/463, loss: 0.0022379509173333645 2023-01-22 18:21:55.127076: step: 726/463, loss: 0.015926023945212364 2023-01-22 18:21:55.745214: step: 728/463, loss: 0.006153000518679619 2023-01-22 18:21:56.318608: step: 730/463, loss: 2.9108154194545932e-05 2023-01-22 18:21:56.934422: step: 732/463, loss: 3.0244607387430733e-06 2023-01-22 
18:21:57.520358: step: 734/463, loss: 0.0008842929382808506 2023-01-22 18:21:58.134328: step: 736/463, loss: 0.0004919427447021008 2023-01-22 18:21:58.740857: step: 738/463, loss: 0.0006872426019981503 2023-01-22 18:21:59.327617: step: 740/463, loss: 0.011594356037676334 2023-01-22 18:21:59.927407: step: 742/463, loss: 0.0021731711458414793 2023-01-22 18:22:00.510247: step: 744/463, loss: 0.0019667260348796844 2023-01-22 18:22:01.079555: step: 746/463, loss: 0.0007664577569812536 2023-01-22 18:22:01.637571: step: 748/463, loss: 0.010328497737646103 2023-01-22 18:22:02.265979: step: 750/463, loss: 0.0026019851211458445 2023-01-22 18:22:02.852621: step: 752/463, loss: 0.007410511374473572 2023-01-22 18:22:03.490590: step: 754/463, loss: 0.0032958793453872204 2023-01-22 18:22:04.082894: step: 756/463, loss: 0.01854996755719185 2023-01-22 18:22:04.730160: step: 758/463, loss: 0.002103339647874236 2023-01-22 18:22:05.335062: step: 760/463, loss: 0.057929813861846924 2023-01-22 18:22:05.981250: step: 762/463, loss: 0.001048272824846208 2023-01-22 18:22:06.625981: step: 764/463, loss: 0.00593983568251133 2023-01-22 18:22:07.265858: step: 766/463, loss: 0.00727528240531683 2023-01-22 18:22:07.868672: step: 768/463, loss: 0.0001712996163405478 2023-01-22 18:22:08.527998: step: 770/463, loss: 0.028490424156188965 2023-01-22 18:22:09.107334: step: 772/463, loss: 0.011083113960921764 2023-01-22 18:22:09.712545: step: 774/463, loss: 0.0251147523522377 2023-01-22 18:22:10.338059: step: 776/463, loss: 0.02115636318922043 2023-01-22 18:22:10.916870: step: 778/463, loss: 0.011703879572451115 2023-01-22 18:22:11.494423: step: 780/463, loss: 0.0011396488407626748 2023-01-22 18:22:12.132873: step: 782/463, loss: 0.005569474771618843 2023-01-22 18:22:12.929442: step: 784/463, loss: 0.027344342321157455 2023-01-22 18:22:13.508857: step: 786/463, loss: 0.007779438979923725 2023-01-22 18:22:14.021699: step: 788/463, loss: 0.012175945565104485 2023-01-22 18:22:14.642847: step: 790/463, 
loss: 0.0025165004190057516 2023-01-22 18:22:15.305389: step: 792/463, loss: 0.008965961635112762 2023-01-22 18:22:15.840417: step: 794/463, loss: 0.0013342257589101791 2023-01-22 18:22:16.524755: step: 796/463, loss: 0.07231802493333817 2023-01-22 18:22:17.117927: step: 798/463, loss: 0.013007629662752151 2023-01-22 18:22:17.748991: step: 800/463, loss: 0.01303919032216072 2023-01-22 18:22:18.390650: step: 802/463, loss: 0.0171971432864666 2023-01-22 18:22:19.047578: step: 804/463, loss: 0.1503831297159195 2023-01-22 18:22:19.731116: step: 806/463, loss: 0.06313421577215195 2023-01-22 18:22:20.321813: step: 808/463, loss: 0.0047308215871453285 2023-01-22 18:22:20.847404: step: 810/463, loss: 0.023119142279028893 2023-01-22 18:22:21.438637: step: 812/463, loss: 0.01511091087013483 2023-01-22 18:22:22.059577: step: 814/463, loss: 0.0013143975520506501 2023-01-22 18:22:22.688311: step: 816/463, loss: 0.003909895662218332 2023-01-22 18:22:23.361872: step: 818/463, loss: 0.010181371122598648 2023-01-22 18:22:24.059110: step: 820/463, loss: 0.029307236894965172 2023-01-22 18:22:24.651007: step: 822/463, loss: 0.4564387798309326 2023-01-22 18:22:25.286066: step: 824/463, loss: 0.06385404616594315 2023-01-22 18:22:25.970408: step: 826/463, loss: 0.005129428580403328 2023-01-22 18:22:26.670660: step: 828/463, loss: 0.006898260675370693 2023-01-22 18:22:27.270951: step: 830/463, loss: 0.04087752103805542 2023-01-22 18:22:27.922495: step: 832/463, loss: 0.12419389188289642 2023-01-22 18:22:28.512973: step: 834/463, loss: 0.03047112002968788 2023-01-22 18:22:29.136179: step: 836/463, loss: 0.0010787836508825421 2023-01-22 18:22:29.779519: step: 838/463, loss: 0.008321932516992092 2023-01-22 18:22:30.373339: step: 840/463, loss: 0.002996347611770034 2023-01-22 18:22:31.191447: step: 842/463, loss: 0.048816192895174026 2023-01-22 18:22:31.773725: step: 844/463, loss: 0.000454862805781886 2023-01-22 18:22:32.406961: step: 846/463, loss: 0.011622236110270023 2023-01-22 
18:22:33.073670: step: 848/463, loss: 0.3344492018222809 2023-01-22 18:22:33.667111: step: 850/463, loss: 0.007749462034553289 2023-01-22 18:22:34.325609: step: 852/463, loss: 0.004255947191268206 2023-01-22 18:22:34.936409: step: 854/463, loss: 0.03162684291601181 2023-01-22 18:22:35.517695: step: 856/463, loss: 0.04359538480639458 2023-01-22 18:22:36.145307: step: 858/463, loss: 0.03613938391208649 2023-01-22 18:22:36.795261: step: 860/463, loss: 0.17776446044445038 2023-01-22 18:22:37.324002: step: 862/463, loss: 0.0004483503580559045 2023-01-22 18:22:37.936012: step: 864/463, loss: 0.000983401550911367 2023-01-22 18:22:38.555763: step: 866/463, loss: 0.0016125279944390059 2023-01-22 18:22:39.202911: step: 868/463, loss: 0.009228711016476154 2023-01-22 18:22:39.873826: step: 870/463, loss: 0.003109992016106844 2023-01-22 18:22:40.401790: step: 872/463, loss: 0.036478497087955475 2023-01-22 18:22:41.062148: step: 874/463, loss: 0.02600398287177086 2023-01-22 18:22:41.693121: step: 876/463, loss: 0.005154294427484274 2023-01-22 18:22:42.330653: step: 878/463, loss: 0.018369833007454872 2023-01-22 18:22:42.938032: step: 880/463, loss: 0.04344536364078522 2023-01-22 18:22:43.555981: step: 882/463, loss: 0.041122596710920334 2023-01-22 18:22:44.190900: step: 884/463, loss: 0.0009182292269542813 2023-01-22 18:22:44.758414: step: 886/463, loss: 0.010591769590973854 2023-01-22 18:22:45.415184: step: 888/463, loss: 0.055419642478227615 2023-01-22 18:22:46.028507: step: 890/463, loss: 0.06011559069156647 2023-01-22 18:22:46.667747: step: 892/463, loss: 0.00918414257466793 2023-01-22 18:22:47.296312: step: 894/463, loss: 0.00259955320507288 2023-01-22 18:22:47.849187: step: 896/463, loss: 0.015476366505026817 2023-01-22 18:22:48.418577: step: 898/463, loss: 0.05018766224384308 2023-01-22 18:22:49.038591: step: 900/463, loss: 0.02507612854242325 2023-01-22 18:22:49.691386: step: 902/463, loss: 0.03183688968420029 2023-01-22 18:22:50.288280: step: 904/463, loss: 
3.6112687666900456e-05 2023-01-22 18:22:50.917300: step: 906/463, loss: 0.0843418538570404 2023-01-22 18:22:51.593429: step: 908/463, loss: 0.023264532908797264 2023-01-22 18:22:52.240098: step: 910/463, loss: 1.0202089548110962 2023-01-22 18:22:52.884745: step: 912/463, loss: 0.026611492037773132 2023-01-22 18:22:53.491244: step: 914/463, loss: 0.004394446033984423 2023-01-22 18:22:54.187463: step: 916/463, loss: 0.013830309733748436 2023-01-22 18:22:54.793498: step: 918/463, loss: 0.0025506827514618635 2023-01-22 18:22:55.329188: step: 920/463, loss: 0.031091446056962013 2023-01-22 18:22:55.941748: step: 922/463, loss: 0.037423647940158844 2023-01-22 18:22:56.548789: step: 924/463, loss: 0.43451374769210815 2023-01-22 18:22:57.204654: step: 926/463, loss: 0.004698766861110926 ================================================== Loss: 0.063 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2857067953020134, 'r': 0.3231143263757116, 'f1': 0.30326135351736416}, 'combined': 0.22345573417068937, 'epoch': 31} Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3583955549733647, 'r': 0.3252744313228184, 'f1': 0.3410326990194449}, 'combined': 0.23992250182272506, 'epoch': 31} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29740827645051193, 'r': 0.33070445920303604, 'f1': 0.31317385444743934}, 'combined': 0.23075968222442897, 'epoch': 31} Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.354957514911559, 'r': 0.32184465170708054, 'f1': 0.3375910521335358}, 'combined': 0.2396896470148104, 'epoch': 31} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29957053488679997, 'r': 0.3302665669245366, 'f1': 0.3141705429047487}, 'combined': 0.2314940842456043, 'epoch': 31} 
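A note on reading these metric dicts: the numbers are self-consistent under the usual definitions. This is a sketch checking that reading, not code from the training script; it assumes each `f1` is the standard harmonic mean of `p` and `r`, and that `combined` is the product of template F1 and slot F1. The values below are copied from the epoch-31 Dev Chinese result above.

```python
# Sketch (assumption): verify how the logged metrics appear to relate.
# f1 = harmonic mean of precision and recall; combined = template_f1 * slot_f1.

def f1(p: float, r: float) -> float:
    # Standard F1 score; guard against p + r == 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Values copied from the "Dev Chinese" dict for epoch 31 in the log.
template_f1 = f1(1.0, 0.5833333333333334)
slot_f1 = f1(0.2857067953020134, 0.3231143263757116)
combined = template_f1 * slot_f1

print(template_f1)  # close to the logged 0.7368421052631579
print(slot_f1)      # close to the logged 0.30326135351736416
print(combined)     # close to the logged 0.22345573417068937
```

If this reading is right, the "combined" column can be recomputed for any language/split pair directly from its `template` and `slot` F1 values.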
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36816481891513386, 'r': 0.3090012148274617, 'f1': 0.3359984719633843}, 'combined': 0.23855891509400282, 'epoch': 31} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31292517006802717, 'r': 0.43809523809523804, 'f1': 0.365079365079365}, 'combined': 0.24338624338624332, 'epoch': 31} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2642857142857143, 'r': 0.40217391304347827, 'f1': 0.31896551724137934}, 'combined': 0.15948275862068967, 'epoch': 31} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.27586206896551724, 'f1': 0.35555555555555557}, 'combined': 0.23703703703703705, 'epoch': 31} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29} Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17} Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 
0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30} Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30} ****************************** Epoch: 32 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 18:25:37.286258: step: 2/463, loss: 0.030614763498306274 2023-01-22 18:25:37.891466: step: 4/463, loss: 0.03874175623059273 2023-01-22 18:25:38.577474: step: 6/463, loss: 0.009867417626082897 2023-01-22 18:25:39.197442: step: 8/463, loss: 0.09041564166545868 2023-01-22 18:25:39.787801: step: 10/463, loss: 0.0008795055327937007 2023-01-22 18:25:40.409324: step: 12/463, loss: 0.0026088336016982794 2023-01-22 18:25:41.027206: step: 14/463, loss: 0.007183930370956659 2023-01-22 18:25:41.672904: step: 16/463, loss: 0.0006557835731655359 2023-01-22 18:25:42.302884: step: 18/463, loss: 0.004347142763435841 2023-01-22 18:25:42.884770: step: 20/463, loss: 0.0007631027256138623 2023-01-22 18:25:43.533654: step: 22/463, loss: 0.020694231614470482 2023-01-22 18:25:44.191699: step: 24/463, loss: 
0.008002565242350101 2023-01-22 18:25:44.851780: step: 26/463, loss: 0.02451903745532036 2023-01-22 18:25:45.550624: step: 28/463, loss: 0.026644079014658928 2023-01-22 18:25:46.222979: step: 30/463, loss: 0.012972644530236721 2023-01-22 18:25:46.858778: step: 32/463, loss: 0.010772773995995522 2023-01-22 18:25:47.662560: step: 34/463, loss: 0.00997359398752451 2023-01-22 18:25:48.288793: step: 36/463, loss: 0.0014216314302757382 2023-01-22 18:25:48.860675: step: 38/463, loss: 0.0024651081766933203 2023-01-22 18:25:49.459154: step: 40/463, loss: 0.0024488207418471575 2023-01-22 18:25:50.056162: step: 42/463, loss: 0.00803136546164751 2023-01-22 18:25:50.661661: step: 44/463, loss: 0.0005861379322595894 2023-01-22 18:25:51.328264: step: 46/463, loss: 0.06020209193229675 2023-01-22 18:25:51.882234: step: 48/463, loss: 0.0006684345426037908 2023-01-22 18:25:52.413467: step: 50/463, loss: 0.01361856423318386 2023-01-22 18:25:53.014087: step: 52/463, loss: 0.011278171092271805 2023-01-22 18:25:53.640437: step: 54/463, loss: 0.012625074945390224 2023-01-22 18:25:54.271992: step: 56/463, loss: 0.001920407055877149 2023-01-22 18:25:54.801647: step: 58/463, loss: 0.008513384498655796 2023-01-22 18:25:55.433986: step: 60/463, loss: 0.009283128194510937 2023-01-22 18:25:56.056149: step: 62/463, loss: 0.02183576300740242 2023-01-22 18:25:56.783154: step: 64/463, loss: 0.05939057096838951 2023-01-22 18:25:57.393651: step: 66/463, loss: 0.00858644861727953 2023-01-22 18:25:57.991506: step: 68/463, loss: 0.0007334311376325786 2023-01-22 18:25:58.611046: step: 70/463, loss: 1.187030553817749 2023-01-22 18:25:59.261002: step: 72/463, loss: 0.03280066326260567 2023-01-22 18:25:59.816637: step: 74/463, loss: 0.05102023854851723 2023-01-22 18:26:00.385296: step: 76/463, loss: 0.0038463897071778774 2023-01-22 18:26:01.043987: step: 78/463, loss: 0.0010863669449463487 2023-01-22 18:26:01.731992: step: 80/463, loss: 0.0802931934595108 2023-01-22 18:26:02.384820: step: 82/463, loss: 
0.0268262792378664 2023-01-22 18:26:03.058300: step: 84/463, loss: 0.05939289927482605 2023-01-22 18:26:03.647434: step: 86/463, loss: 4.057941259816289e-05 2023-01-22 18:26:04.274910: step: 88/463, loss: 0.01025380939245224 2023-01-22 18:26:04.861720: step: 90/463, loss: 0.00885508581995964 2023-01-22 18:26:05.462632: step: 92/463, loss: 8.519143011653796e-05 2023-01-22 18:26:06.077987: step: 94/463, loss: 0.0002763755910564214 2023-01-22 18:26:06.728433: step: 96/463, loss: 0.00088076654355973 2023-01-22 18:26:07.375615: step: 98/463, loss: 0.02465272881090641 2023-01-22 18:26:08.033560: step: 100/463, loss: 0.010534192435443401 2023-01-22 18:26:08.617920: step: 102/463, loss: 0.023925913497805595 2023-01-22 18:26:09.153151: step: 104/463, loss: 0.023714197799563408 2023-01-22 18:26:09.772373: step: 106/463, loss: 0.007070284802466631 2023-01-22 18:26:10.409886: step: 108/463, loss: 0.0009701781673356891 2023-01-22 18:26:11.042990: step: 110/463, loss: 0.003983228001743555 2023-01-22 18:26:11.664514: step: 112/463, loss: 0.08200642466545105 2023-01-22 18:26:12.338332: step: 114/463, loss: 0.0034356177784502506 2023-01-22 18:26:12.982684: step: 116/463, loss: 0.01688413880765438 2023-01-22 18:26:13.572764: step: 118/463, loss: 0.10588008910417557 2023-01-22 18:26:14.210876: step: 120/463, loss: 0.2819541394710541 2023-01-22 18:26:14.855962: step: 122/463, loss: 0.019284039735794067 2023-01-22 18:26:15.575858: step: 124/463, loss: 0.09123295545578003 2023-01-22 18:26:16.214914: step: 126/463, loss: 0.0007505693356506526 2023-01-22 18:26:16.824924: step: 128/463, loss: 0.0042683579958975315 2023-01-22 18:26:17.523177: step: 130/463, loss: 0.002902250736951828 2023-01-22 18:26:18.184256: step: 132/463, loss: 0.04759160801768303 2023-01-22 18:26:18.808340: step: 134/463, loss: 0.048020921647548676 2023-01-22 18:26:19.416331: step: 136/463, loss: 0.004624802153557539 2023-01-22 18:26:19.944857: step: 138/463, loss: 0.0021022099535912275 2023-01-22 18:26:20.575407: 
step: 140/463, loss: 0.0007675497909076512 2023-01-22 18:26:21.160712: step: 142/463, loss: 0.016619686037302017 2023-01-22 18:26:21.746268: step: 144/463, loss: 0.011588981375098228 2023-01-22 18:26:22.433616: step: 146/463, loss: 0.19113923609256744 2023-01-22 18:26:23.017685: step: 148/463, loss: 0.008705196902155876 2023-01-22 18:26:23.635671: step: 150/463, loss: 0.01780868135392666 2023-01-22 18:26:24.278828: step: 152/463, loss: 0.04104067385196686 2023-01-22 18:26:24.915165: step: 154/463, loss: 0.00010074569581774995 2023-01-22 18:26:25.536645: step: 156/463, loss: 0.07353632152080536 2023-01-22 18:26:26.150181: step: 158/463, loss: 0.03063885308802128 2023-01-22 18:26:26.747963: step: 160/463, loss: 0.06048136204481125 2023-01-22 18:26:27.386185: step: 162/463, loss: 0.013756104744970798 2023-01-22 18:26:27.987279: step: 164/463, loss: 0.000325008702930063 2023-01-22 18:26:28.658188: step: 166/463, loss: 0.0010838062735274434 2023-01-22 18:26:29.300128: step: 168/463, loss: 0.0007398619782179594 2023-01-22 18:26:29.896569: step: 170/463, loss: 0.0035692553501576185 2023-01-22 18:26:30.517190: step: 172/463, loss: 0.004098034463822842 2023-01-22 18:26:31.145046: step: 174/463, loss: 0.003794135758653283 2023-01-22 18:26:31.777862: step: 176/463, loss: 0.005292246583849192 2023-01-22 18:26:32.339703: step: 178/463, loss: 0.0021012339275330305 2023-01-22 18:26:32.930891: step: 180/463, loss: 0.0005480112740769982 2023-01-22 18:26:33.575489: step: 182/463, loss: 0.017011111602187157 2023-01-22 18:26:34.189453: step: 184/463, loss: 0.00023634152603335679 2023-01-22 18:26:34.835742: step: 186/463, loss: 0.03679811954498291 2023-01-22 18:26:35.450866: step: 188/463, loss: 0.0003832667716778815 2023-01-22 18:26:36.164912: step: 190/463, loss: 0.2623153328895569 2023-01-22 18:26:36.827619: step: 192/463, loss: 0.019213518127799034 2023-01-22 18:26:37.468644: step: 194/463, loss: 0.0011597422417253256 2023-01-22 18:26:38.082360: step: 196/463, loss: 
0.04966145008802414 2023-01-22 18:26:38.698823: step: 198/463, loss: 0.005204069893807173 2023-01-22 18:26:39.369072: step: 200/463, loss: 0.012322881259024143 2023-01-22 18:26:39.965377: step: 202/463, loss: 0.0024742227979004383 2023-01-22 18:26:40.575502: step: 204/463, loss: 0.6543542742729187 2023-01-22 18:26:41.148666: step: 206/463, loss: 0.0358550027012825 2023-01-22 18:26:41.717882: step: 208/463, loss: 5.90651725360658e-05 2023-01-22 18:26:42.373798: step: 210/463, loss: 0.00888847280293703 2023-01-22 18:26:42.996693: step: 212/463, loss: 0.0027152360416948795 2023-01-22 18:26:43.635333: step: 214/463, loss: 0.0033107048366218805 2023-01-22 18:26:44.275821: step: 216/463, loss: 0.004602914676070213 2023-01-22 18:26:44.943699: step: 218/463, loss: 0.0004551361780613661 2023-01-22 18:26:45.542683: step: 220/463, loss: 0.011092551052570343 2023-01-22 18:26:46.159715: step: 222/463, loss: 9.889698048937134e-06 2023-01-22 18:26:46.735599: step: 224/463, loss: 0.0034915637224912643 2023-01-22 18:26:47.373512: step: 226/463, loss: 0.00024109637888614088 2023-01-22 18:26:48.028519: step: 228/463, loss: 0.02294524759054184 2023-01-22 18:26:48.630105: step: 230/463, loss: 0.11641913652420044 2023-01-22 18:26:49.333196: step: 232/463, loss: 0.020388511940836906 2023-01-22 18:26:49.948225: step: 234/463, loss: 0.0025205216370522976 2023-01-22 18:26:50.530783: step: 236/463, loss: 0.0102242985740304 2023-01-22 18:26:51.163637: step: 238/463, loss: 0.06540962308645248 2023-01-22 18:26:51.798164: step: 240/463, loss: 8.097615500446409e-05 2023-01-22 18:26:52.470276: step: 242/463, loss: 0.01397916954010725 2023-01-22 18:26:53.095711: step: 244/463, loss: 0.05718614533543587 2023-01-22 18:26:53.716910: step: 246/463, loss: 0.00193133600987494 2023-01-22 18:26:54.346430: step: 248/463, loss: 0.00582908233627677 2023-01-22 18:26:54.986276: step: 250/463, loss: 0.001608937862329185 2023-01-22 18:26:55.659189: step: 252/463, loss: 0.007725281175225973 2023-01-22 
18:26:56.260103: step: 254/463, loss: 0.04413016885519028 2023-01-22 18:26:56.949739: step: 256/463, loss: 0.009714223444461823 2023-01-22 18:26:57.648800: step: 258/463, loss: 0.0006648200214840472 2023-01-22 18:26:58.269619: step: 260/463, loss: 5.2150575356790796e-05 2023-01-22 18:26:58.897198: step: 262/463, loss: 0.0007346154889091849 2023-01-22 18:26:59.552247: step: 264/463, loss: 0.0013798220315948129 2023-01-22 18:27:00.189446: step: 266/463, loss: 0.0008934060460887849 2023-01-22 18:27:00.801729: step: 268/463, loss: 0.003181750187650323 2023-01-22 18:27:01.492242: step: 270/463, loss: 0.002725494559854269 2023-01-22 18:27:02.172158: step: 272/463, loss: 0.012833239510655403 2023-01-22 18:27:02.798719: step: 274/463, loss: 0.009426075033843517 2023-01-22 18:27:03.422303: step: 276/463, loss: 0.00047624576836824417 2023-01-22 18:27:04.034741: step: 278/463, loss: 0.05621564760804176 2023-01-22 18:27:04.676632: step: 280/463, loss: 0.0353035070002079 2023-01-22 18:27:05.291527: step: 282/463, loss: 0.3190092444419861 2023-01-22 18:27:05.889554: step: 284/463, loss: 0.02347484789788723 2023-01-22 18:27:06.568804: step: 286/463, loss: 5.7020384701900184e-05 2023-01-22 18:27:07.207366: step: 288/463, loss: 0.00018010949133895338 2023-01-22 18:27:07.756134: step: 290/463, loss: 0.0036151979584246874 2023-01-22 18:27:08.496752: step: 292/463, loss: 0.017629412934184074 2023-01-22 18:27:09.065604: step: 294/463, loss: 0.007204721216112375 2023-01-22 18:27:09.688961: step: 296/463, loss: 0.0002788869314827025 2023-01-22 18:27:10.374158: step: 298/463, loss: 0.005722853355109692 2023-01-22 18:27:10.983538: step: 300/463, loss: 0.03564462065696716 2023-01-22 18:27:11.578014: step: 302/463, loss: 0.0038565327413380146 2023-01-22 18:27:12.232581: step: 304/463, loss: 0.29082024097442627 2023-01-22 18:27:12.850816: step: 306/463, loss: 0.023026762530207634 2023-01-22 18:27:13.467757: step: 308/463, loss: 0.02001926675438881 2023-01-22 18:27:14.116940: step: 310/463, 
loss: 0.005944774951785803 2023-01-22 18:27:14.714249: step: 312/463, loss: 0.014342518523335457 2023-01-22 18:27:15.306658: step: 314/463, loss: 0.05035915970802307 2023-01-22 18:27:15.906395: step: 316/463, loss: 0.011642935685813427 2023-01-22 18:27:16.474047: step: 318/463, loss: 0.004401103593409061 2023-01-22 18:27:17.173390: step: 320/463, loss: 0.002240537665784359 2023-01-22 18:27:17.866206: step: 322/463, loss: 0.014900058507919312 2023-01-22 18:27:18.583610: step: 324/463, loss: 0.02595534548163414 2023-01-22 18:27:19.168131: step: 326/463, loss: 0.010317686945199966 2023-01-22 18:27:19.803341: step: 328/463, loss: 0.2959589958190918 2023-01-22 18:27:20.386694: step: 330/463, loss: 0.012675803154706955 2023-01-22 18:27:21.021302: step: 332/463, loss: 0.006935832556337118 2023-01-22 18:27:21.693338: step: 334/463, loss: 0.003507763845846057 2023-01-22 18:27:22.287072: step: 336/463, loss: 0.005058653652667999 2023-01-22 18:27:22.867094: step: 338/463, loss: 0.0078047155402600765 2023-01-22 18:27:23.467382: step: 340/463, loss: 0.0014647762291133404 2023-01-22 18:27:24.062928: step: 342/463, loss: 0.0013325014151632786 2023-01-22 18:27:24.692006: step: 344/463, loss: 0.0273173488676548 2023-01-22 18:27:25.396116: step: 346/463, loss: 0.020032895728945732 2023-01-22 18:27:26.096942: step: 348/463, loss: 0.0010531210573390126 2023-01-22 18:27:26.753311: step: 350/463, loss: 0.029222985729575157 2023-01-22 18:27:27.360523: step: 352/463, loss: 0.0793878361582756 2023-01-22 18:27:28.004417: step: 354/463, loss: 0.00048377603525295854 2023-01-22 18:27:28.667240: step: 356/463, loss: 5.191231727600098 2023-01-22 18:27:29.189351: step: 358/463, loss: 0.0005025931168347597 2023-01-22 18:27:29.791430: step: 360/463, loss: 0.0138544337823987 2023-01-22 18:27:30.374254: step: 362/463, loss: 0.6042255759239197 2023-01-22 18:27:31.029066: step: 364/463, loss: 0.014295564033091068 2023-01-22 18:27:31.640069: step: 366/463, loss: 0.014689529314637184 2023-01-22 
18:27:32.312973: step: 368/463, loss: 0.02259857952594757 2023-01-22 18:27:32.912569: step: 370/463, loss: 0.0012402745196595788 2023-01-22 18:27:33.520283: step: 372/463, loss: 0.06784980744123459 2023-01-22 18:27:34.120757: step: 374/463, loss: 0.0006874781101942062 2023-01-22 18:27:34.703336: step: 376/463, loss: 0.0001291482913075015 2023-01-22 18:27:35.305370: step: 378/463, loss: 0.00015576594159938395 2023-01-22 18:27:35.849950: step: 380/463, loss: 0.05601614713668823 2023-01-22 18:27:36.507576: step: 382/463, loss: 0.002157850656658411 2023-01-22 18:27:37.109905: step: 384/463, loss: 0.0007407641969621181 2023-01-22 18:27:37.694928: step: 386/463, loss: 0.0016483055660501122 2023-01-22 18:27:38.265655: step: 388/463, loss: 0.03883794695138931 2023-01-22 18:27:38.889376: step: 390/463, loss: 0.033348023891448975 2023-01-22 18:27:39.496238: step: 392/463, loss: 0.006183885037899017 2023-01-22 18:27:40.070430: step: 394/463, loss: 0.0010848063975572586 2023-01-22 18:27:40.618910: step: 396/463, loss: 0.3029787540435791 2023-01-22 18:27:41.266001: step: 398/463, loss: 0.03901616856455803 2023-01-22 18:27:41.921834: step: 400/463, loss: 0.04901716485619545 2023-01-22 18:27:42.597368: step: 402/463, loss: 0.002000237349420786 2023-01-22 18:27:43.196953: step: 404/463, loss: 0.0019496651366353035 2023-01-22 18:27:43.858316: step: 406/463, loss: 0.12262634187936783 2023-01-22 18:27:44.465544: step: 408/463, loss: 0.06054764240980148 2023-01-22 18:27:45.083279: step: 410/463, loss: 0.3686816096305847 2023-01-22 18:27:45.697623: step: 412/463, loss: 0.03703209385275841 2023-01-22 18:27:46.329015: step: 414/463, loss: 0.00416553346440196 2023-01-22 18:27:46.920905: step: 416/463, loss: 0.008285529911518097 2023-01-22 18:27:47.498894: step: 418/463, loss: 0.0004289813223294914 2023-01-22 18:27:48.074150: step: 420/463, loss: 0.002054835669696331 2023-01-22 18:27:48.709983: step: 422/463, loss: 0.00015364825958386064 2023-01-22 18:27:49.328134: step: 424/463, loss: 
0.0018545184284448624 2023-01-22 18:27:49.864853: step: 426/463, loss: 0.007042792160063982 2023-01-22 18:27:50.504714: step: 428/463, loss: 0.008317075669765472 2023-01-22 18:27:51.160000: step: 430/463, loss: 0.0038274978287518024 2023-01-22 18:27:51.728138: step: 432/463, loss: 0.006218602880835533 2023-01-22 18:27:52.397997: step: 434/463, loss: 0.0031281551346182823 2023-01-22 18:27:53.007319: step: 436/463, loss: 0.012608464807271957 2023-01-22 18:27:53.602964: step: 438/463, loss: 0.025826040655374527 2023-01-22 18:27:54.242818: step: 440/463, loss: 0.01832614839076996 2023-01-22 18:27:54.820723: step: 442/463, loss: 9.750958997756243e-05 2023-01-22 18:27:55.371535: step: 444/463, loss: 0.0003094124549534172 2023-01-22 18:27:55.985358: step: 446/463, loss: 0.015861941501498222 2023-01-22 18:27:56.590658: step: 448/463, loss: 0.0005222885520197451 2023-01-22 18:27:57.256123: step: 450/463, loss: 0.03929736837744713 2023-01-22 18:27:57.879393: step: 452/463, loss: 0.017715943977236748 2023-01-22 18:27:58.443103: step: 454/463, loss: 0.0057287379167973995 2023-01-22 18:27:59.160095: step: 456/463, loss: 0.019285151734948158 2023-01-22 18:27:59.790282: step: 458/463, loss: 0.013365359976887703 2023-01-22 18:28:00.424105: step: 460/463, loss: 0.02760261297225952 2023-01-22 18:28:01.015723: step: 462/463, loss: 0.008483940735459328 2023-01-22 18:28:01.681354: step: 464/463, loss: 0.09011266380548477 2023-01-22 18:28:02.367744: step: 466/463, loss: 0.09011425077915192 2023-01-22 18:28:02.933045: step: 468/463, loss: 0.003480620216578245 2023-01-22 18:28:03.585658: step: 470/463, loss: 0.0005052518099546432 2023-01-22 18:28:04.342728: step: 472/463, loss: 0.0014285201905295253 2023-01-22 18:28:05.005438: step: 474/463, loss: 0.004950942005962133 2023-01-22 18:28:05.567311: step: 476/463, loss: 0.0006906845956109464 2023-01-22 18:28:06.155705: step: 478/463, loss: 0.035715144127607346 2023-01-22 18:28:06.713919: step: 480/463, loss: 0.032476745545864105 2023-01-22 
18:28:07.293632: step: 482/463, loss: 0.02435312792658806 2023-01-22 18:28:07.945487: step: 484/463, loss: 0.021083662286400795 2023-01-22 18:28:08.553042: step: 486/463, loss: 0.0116967111825943 2023-01-22 18:28:09.152702: step: 488/463, loss: 1.3059754371643066 2023-01-22 18:28:09.871109: step: 490/463, loss: 0.16015954315662384 2023-01-22 18:28:10.519735: step: 492/463, loss: 0.017719849944114685 2023-01-22 18:28:11.128400: step: 494/463, loss: 0.0350295715034008 2023-01-22 18:28:11.705248: step: 496/463, loss: 0.00018520528101362288 2023-01-22 18:28:12.298412: step: 498/463, loss: 0.011715661734342575 2023-01-22 18:28:12.899991: step: 500/463, loss: 0.00013896499876864254 2023-01-22 18:28:13.502654: step: 502/463, loss: 0.0026541331317275763 2023-01-22 18:28:14.122640: step: 504/463, loss: 0.0013028652174398303 2023-01-22 18:28:14.778770: step: 506/463, loss: 0.0037425588816404343 2023-01-22 18:28:15.407265: step: 508/463, loss: 0.009832057170569897 2023-01-22 18:28:16.060383: step: 510/463, loss: 0.003256046213209629 2023-01-22 18:28:16.699703: step: 512/463, loss: 0.001373638049699366 2023-01-22 18:28:17.274380: step: 514/463, loss: 0.002945641055703163 2023-01-22 18:28:17.907841: step: 516/463, loss: 0.010917948558926582 2023-01-22 18:28:18.534582: step: 518/463, loss: 0.006178031675517559 2023-01-22 18:28:19.169941: step: 520/463, loss: 0.03541672229766846 2023-01-22 18:28:19.816392: step: 522/463, loss: 0.11940894275903702 2023-01-22 18:28:20.454495: step: 524/463, loss: 0.001312454231083393 2023-01-22 18:28:21.058108: step: 526/463, loss: 0.004518647212535143 2023-01-22 18:28:21.657768: step: 528/463, loss: 0.03516843542456627 2023-01-22 18:28:22.287697: step: 530/463, loss: 0.008458506315946579 2023-01-22 18:28:22.871909: step: 532/463, loss: 0.11398664861917496 2023-01-22 18:28:23.467383: step: 534/463, loss: 0.0323471799492836 2023-01-22 18:28:24.103374: step: 536/463, loss: 0.0031823874451220036 2023-01-22 18:28:24.790336: step: 538/463, loss: 
0.07462967932224274 2023-01-22 18:28:25.321819: step: 540/463, loss: 0.0005490122712217271 2023-01-22 18:28:26.001241: step: 542/463, loss: 0.00864870660007 2023-01-22 18:28:26.617737: step: 544/463, loss: 0.004063241649419069 2023-01-22 18:28:27.249946: step: 546/463, loss: 0.00019289017654955387 2023-01-22 18:28:27.875093: step: 548/463, loss: 0.0015600437764078379 2023-01-22 18:28:28.508101: step: 550/463, loss: 0.2853275537490845 2023-01-22 18:28:29.165138: step: 552/463, loss: 0.0031575735192745924 2023-01-22 18:28:29.708200: step: 554/463, loss: 0.024457257241010666 2023-01-22 18:28:30.327071: step: 556/463, loss: 0.051583051681518555 2023-01-22 18:28:30.891723: step: 558/463, loss: 0.002855353755876422 2023-01-22 18:28:31.564337: step: 560/463, loss: 0.00010596140782581642 2023-01-22 18:28:32.222338: step: 562/463, loss: 0.018723808228969574 2023-01-22 18:28:32.908434: step: 564/463, loss: 0.010295277461409569 2023-01-22 18:28:33.512108: step: 566/463, loss: 0.001454145647585392 2023-01-22 18:28:34.086790: step: 568/463, loss: 0.003908323124051094 2023-01-22 18:28:34.690695: step: 570/463, loss: 0.021431010216474533 2023-01-22 18:28:35.341842: step: 572/463, loss: 0.005228378344327211 2023-01-22 18:28:36.043303: step: 574/463, loss: 0.11140754818916321 2023-01-22 18:28:36.608732: step: 576/463, loss: 0.0003180864732712507 2023-01-22 18:28:37.217992: step: 578/463, loss: 0.02494269795715809 2023-01-22 18:28:37.887412: step: 580/463, loss: 0.006920388899743557 2023-01-22 18:28:38.598272: step: 582/463, loss: 0.02570492774248123 2023-01-22 18:28:39.263036: step: 584/463, loss: 0.024108489975333214 2023-01-22 18:28:39.937667: step: 586/463, loss: 0.007123040501028299 2023-01-22 18:28:40.608972: step: 588/463, loss: 0.0006514330743812025 2023-01-22 18:28:41.208468: step: 590/463, loss: 0.019878430292010307 2023-01-22 18:28:41.822404: step: 592/463, loss: 0.008211604319512844 2023-01-22 18:28:42.421831: step: 594/463, loss: 0.0012728053843602538 2023-01-22 
18:28:43.078619: step: 596/463, loss: 0.013113809749484062 2023-01-22 18:28:43.648915: step: 598/463, loss: 0.0025733658112585545 2023-01-22 18:28:44.219633: step: 600/463, loss: 0.0002584099711384624 2023-01-22 18:28:44.829434: step: 602/463, loss: 0.018839865922927856 2023-01-22 18:28:45.440808: step: 604/463, loss: 0.005882123950868845 2023-01-22 18:28:46.026601: step: 606/463, loss: 0.0044534411281347275 2023-01-22 18:28:46.702602: step: 608/463, loss: 0.04973507300019264 2023-01-22 18:28:47.357705: step: 610/463, loss: 0.020817968994379044 2023-01-22 18:28:48.043434: step: 612/463, loss: 0.003578104777261615 2023-01-22 18:28:48.664372: step: 614/463, loss: 0.00031608500285074115 2023-01-22 18:28:49.283834: step: 616/463, loss: 0.015332520939409733 2023-01-22 18:28:49.863065: step: 618/463, loss: 0.0004367057990748435 2023-01-22 18:28:50.475626: step: 620/463, loss: 0.002006630413234234 2023-01-22 18:28:51.138496: step: 622/463, loss: 0.0046698865480721 2023-01-22 18:28:51.689930: step: 624/463, loss: 0.02812325768172741 2023-01-22 18:28:52.321391: step: 626/463, loss: 0.0011194414691999555 2023-01-22 18:28:53.032501: step: 628/463, loss: 0.004268256016075611 2023-01-22 18:28:53.574848: step: 630/463, loss: 0.002697776071727276 2023-01-22 18:28:54.242145: step: 632/463, loss: 0.07180566340684891 2023-01-22 18:28:55.040917: step: 634/463, loss: 0.0028531288262456656 2023-01-22 18:28:55.635951: step: 636/463, loss: 0.0021848613396286964 2023-01-22 18:28:56.279472: step: 638/463, loss: 0.005958175752311945 2023-01-22 18:28:56.940238: step: 640/463, loss: 3.0825729481875896e-05 2023-01-22 18:28:57.623120: step: 642/463, loss: 0.0029319594614207745 2023-01-22 18:28:58.286700: step: 644/463, loss: 0.6656644940376282 2023-01-22 18:28:58.968225: step: 646/463, loss: 0.0022583359386771917 2023-01-22 18:28:59.579471: step: 648/463, loss: 0.07567179203033447 2023-01-22 18:29:00.161326: step: 650/463, loss: 0.009297891519963741 2023-01-22 18:29:00.779085: step: 652/463, 
loss: 0.00017452302563469857 2023-01-22 18:29:01.375635: step: 654/463, loss: 0.026319490745663643 2023-01-22 18:29:02.005072: step: 656/463, loss: 0.0037914097774773836 2023-01-22 18:29:02.604584: step: 658/463, loss: 0.009037671610713005 2023-01-22 18:29:03.229019: step: 660/463, loss: 0.0090198228135705 2023-01-22 18:29:03.828039: step: 662/463, loss: 0.0017608230700716376 2023-01-22 18:29:04.394469: step: 664/463, loss: 0.0176743995398283 2023-01-22 18:29:04.985931: step: 666/463, loss: 0.029735233634710312 2023-01-22 18:29:05.596723: step: 668/463, loss: 0.3939843475818634 2023-01-22 18:29:06.247255: step: 670/463, loss: 1.159134080808144e-05 2023-01-22 18:29:06.857619: step: 672/463, loss: 0.3045475482940674 2023-01-22 18:29:07.525127: step: 674/463, loss: 0.012774793431162834 2023-01-22 18:29:08.145374: step: 676/463, loss: 0.0001276725815841928 2023-01-22 18:29:08.766976: step: 678/463, loss: 0.010734937153756618 2023-01-22 18:29:09.383624: step: 680/463, loss: 0.006783026736229658 2023-01-22 18:29:09.961636: step: 682/463, loss: 0.031193338334560394 2023-01-22 18:29:10.617133: step: 684/463, loss: 0.03894350677728653 2023-01-22 18:29:11.234164: step: 686/463, loss: 0.004426902625709772 2023-01-22 18:29:11.911356: step: 688/463, loss: 0.021038807928562164 2023-01-22 18:29:12.576579: step: 690/463, loss: 0.06877341866493225 2023-01-22 18:29:13.182992: step: 692/463, loss: 0.08579489588737488 2023-01-22 18:29:13.758895: step: 694/463, loss: 0.0006600141641683877 2023-01-22 18:29:14.426449: step: 696/463, loss: 0.006386469583958387 2023-01-22 18:29:15.113603: step: 698/463, loss: 0.0010192240588366985 2023-01-22 18:29:15.715379: step: 700/463, loss: 0.005759436637163162 2023-01-22 18:29:16.335610: step: 702/463, loss: 0.022929934784770012 2023-01-22 18:29:16.987739: step: 704/463, loss: 0.021200455725193024 2023-01-22 18:29:17.589793: step: 706/463, loss: 0.006237346213310957 2023-01-22 18:29:18.139816: step: 708/463, loss: 0.25409427285194397 2023-01-22 
18:29:18.799061: step: 710/463, loss: 0.0031114043667912483 2023-01-22 18:29:19.425846: step: 712/463, loss: 0.0337141752243042 2023-01-22 18:29:20.038844: step: 714/463, loss: 0.007208849303424358 2023-01-22 18:29:20.689314: step: 716/463, loss: 0.002028097864240408 2023-01-22 18:29:21.406581: step: 718/463, loss: 0.027283979579806328 2023-01-22 18:29:22.003807: step: 720/463, loss: 0.0036162796895951033 2023-01-22 18:29:22.585565: step: 722/463, loss: 0.0043805623427033424 2023-01-22 18:29:23.284021: step: 724/463, loss: 0.042655445635318756 2023-01-22 18:29:23.888863: step: 726/463, loss: 0.11347927153110504 2023-01-22 18:29:24.482420: step: 728/463, loss: 0.09242423623800278 2023-01-22 18:29:25.124482: step: 730/463, loss: 0.00517246825620532 2023-01-22 18:29:25.750257: step: 732/463, loss: 0.0007222515414468944 2023-01-22 18:29:26.320365: step: 734/463, loss: 0.03590545058250427 2023-01-22 18:29:26.996997: step: 736/463, loss: 0.036464396864175797 2023-01-22 18:29:27.690655: step: 738/463, loss: 0.2606666386127472 2023-01-22 18:29:28.378296: step: 740/463, loss: 0.024211449548602104 2023-01-22 18:29:28.962452: step: 742/463, loss: 0.03299950063228607 2023-01-22 18:29:29.578738: step: 744/463, loss: 0.05704088136553764 2023-01-22 18:29:30.198883: step: 746/463, loss: 0.0019760713912546635 2023-01-22 18:29:30.797190: step: 748/463, loss: 0.007816256955265999 2023-01-22 18:29:31.353714: step: 750/463, loss: 0.011471687816083431 2023-01-22 18:29:31.989739: step: 752/463, loss: 0.2041717767715454 2023-01-22 18:29:32.611954: step: 754/463, loss: 0.014862852171063423 2023-01-22 18:29:33.317876: step: 756/463, loss: 0.007055539172142744 2023-01-22 18:29:33.852987: step: 758/463, loss: 0.0015472363447770476 2023-01-22 18:29:34.442046: step: 760/463, loss: 0.008810878731310368 2023-01-22 18:29:35.058808: step: 762/463, loss: 0.015625275671482086 2023-01-22 18:29:35.690303: step: 764/463, loss: 0.0011172146769240499 2023-01-22 18:29:36.343905: step: 766/463, loss: 
0.027537034824490547 2023-01-22 18:29:36.976425: step: 768/463, loss: 0.025252562016248703 2023-01-22 18:29:37.604667: step: 770/463, loss: 0.023372087627649307 2023-01-22 18:29:38.261542: step: 772/463, loss: 0.06744197756052017 2023-01-22 18:29:38.872371: step: 774/463, loss: 0.41840964555740356 2023-01-22 18:29:39.480334: step: 776/463, loss: 0.0263313427567482 2023-01-22 18:29:40.091721: step: 778/463, loss: 2.7209989639231935e-05 2023-01-22 18:29:40.715686: step: 780/463, loss: 0.004690257832407951 2023-01-22 18:29:41.339139: step: 782/463, loss: 0.026597224175930023 2023-01-22 18:29:41.987686: step: 784/463, loss: 0.007800604682415724 2023-01-22 18:29:42.654516: step: 786/463, loss: 0.47231554985046387 2023-01-22 18:29:43.319111: step: 788/463, loss: 0.09862665086984634 2023-01-22 18:29:43.929325: step: 790/463, loss: 0.07699305564165115 2023-01-22 18:29:44.572807: step: 792/463, loss: 0.03080790862441063 2023-01-22 18:29:45.193178: step: 794/463, loss: 0.006956194061785936 2023-01-22 18:29:45.756710: step: 796/463, loss: 0.0028936960734426975 2023-01-22 18:29:46.318830: step: 798/463, loss: 0.0542217381298542 2023-01-22 18:29:46.936372: step: 800/463, loss: 0.01367761380970478 2023-01-22 18:29:47.571994: step: 802/463, loss: 0.017740536481142044 2023-01-22 18:29:48.187776: step: 804/463, loss: 0.0003226136032026261 2023-01-22 18:29:48.814861: step: 806/463, loss: 0.0359903983771801 2023-01-22 18:29:49.466126: step: 808/463, loss: 0.011278880760073662 2023-01-22 18:29:50.086801: step: 810/463, loss: 0.003164075780659914 2023-01-22 18:29:50.791573: step: 812/463, loss: 0.011427155695855618 2023-01-22 18:29:51.390791: step: 814/463, loss: 0.015759596601128578 2023-01-22 18:29:51.967573: step: 816/463, loss: 9.486660565016791e-05 2023-01-22 18:29:52.608587: step: 818/463, loss: 0.005746307782828808 2023-01-22 18:29:53.290414: step: 820/463, loss: 0.01592373102903366 2023-01-22 18:29:53.921254: step: 822/463, loss: 0.02960425242781639 2023-01-22 18:29:54.553941: 
step: 824/463, loss: 0.009142672643065453 2023-01-22 18:29:55.274550: step: 826/463, loss: 0.004462206736207008 2023-01-22 18:29:55.875907: step: 828/463, loss: 0.0037363788578659296 2023-01-22 18:29:56.545396: step: 830/463, loss: 0.0011974434601143003 2023-01-22 18:29:57.178384: step: 832/463, loss: 0.0061015356332063675 2023-01-22 18:29:57.752440: step: 834/463, loss: 0.11071093380451202 2023-01-22 18:29:58.300503: step: 836/463, loss: 0.027968933805823326 2023-01-22 18:29:58.954045: step: 838/463, loss: 0.010387386195361614 2023-01-22 18:29:59.592430: step: 840/463, loss: 0.021756578236818314 2023-01-22 18:30:00.180373: step: 842/463, loss: 0.04339312016963959 2023-01-22 18:30:00.798937: step: 844/463, loss: 0.04358723387122154 2023-01-22 18:30:01.442093: step: 846/463, loss: 0.006927151698619127 2023-01-22 18:30:02.030898: step: 848/463, loss: 0.00820869393646717 2023-01-22 18:30:02.596190: step: 850/463, loss: 0.0036937613040208817 2023-01-22 18:30:03.312166: step: 852/463, loss: 0.016916370019316673 2023-01-22 18:30:03.931375: step: 854/463, loss: 0.0004962085513398051 2023-01-22 18:30:04.539827: step: 856/463, loss: 0.0034861317835748196 2023-01-22 18:30:05.100744: step: 858/463, loss: 0.00635550357401371 2023-01-22 18:30:05.751303: step: 860/463, loss: 0.0257489625364542 2023-01-22 18:30:06.344927: step: 862/463, loss: 0.010596978478133678 2023-01-22 18:30:06.971003: step: 864/463, loss: 0.050487879663705826 2023-01-22 18:30:07.570371: step: 866/463, loss: 0.08265386521816254 2023-01-22 18:30:08.197233: step: 868/463, loss: 0.007073723245412111 2023-01-22 18:30:08.826603: step: 870/463, loss: 0.0021391764748841524 2023-01-22 18:30:09.549477: step: 872/463, loss: 0.0262689720839262 2023-01-22 18:30:10.171387: step: 874/463, loss: 0.007113305851817131 2023-01-22 18:30:10.779220: step: 876/463, loss: 0.05123058706521988 2023-01-22 18:30:11.387143: step: 878/463, loss: 0.016217513009905815 2023-01-22 18:30:11.966641: step: 880/463, loss: 0.0016730048228055239 
2023-01-22 18:30:12.653535: step: 882/463, loss: 0.0025919005274772644 2023-01-22 18:30:13.303514: step: 884/463, loss: 0.004198434762656689 2023-01-22 18:30:13.919641: step: 886/463, loss: 0.042319778352975845 2023-01-22 18:30:14.517515: step: 888/463, loss: 0.013793771155178547 2023-01-22 18:30:15.149939: step: 890/463, loss: 0.02415756694972515 2023-01-22 18:30:15.734746: step: 892/463, loss: 0.0168730691075325 2023-01-22 18:30:16.356706: step: 894/463, loss: 0.005962313152849674 2023-01-22 18:30:17.022868: step: 896/463, loss: 0.0002675297437235713 2023-01-22 18:30:17.600963: step: 898/463, loss: 0.0022843291517347097 2023-01-22 18:30:18.250983: step: 900/463, loss: 0.0006123908096924424 2023-01-22 18:30:18.827705: step: 902/463, loss: 0.00443311920389533 2023-01-22 18:30:19.426134: step: 904/463, loss: 0.005423808470368385 2023-01-22 18:30:20.072893: step: 906/463, loss: 0.0008052661432884634 2023-01-22 18:30:20.774200: step: 908/463, loss: 0.011643342673778534 2023-01-22 18:30:21.369150: step: 910/463, loss: 0.0069712563417851925 2023-01-22 18:30:21.919906: step: 912/463, loss: 0.007522481959313154 2023-01-22 18:30:22.637237: step: 914/463, loss: 0.11866769939661026 2023-01-22 18:30:23.178950: step: 916/463, loss: 0.0018803380662575364 2023-01-22 18:30:23.809242: step: 918/463, loss: 0.02438446134328842 2023-01-22 18:30:24.421081: step: 920/463, loss: 0.006335046142339706 2023-01-22 18:30:24.979245: step: 922/463, loss: 9.800559928407893e-05 2023-01-22 18:30:25.521556: step: 924/463, loss: 0.0166312325745821 2023-01-22 18:30:26.147716: step: 926/463, loss: 0.007627373095601797
==================================================
Loss: 0.049
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28321399582560297, 'r': 0.33104330441854163, 'f1': 0.305266529183852}, 'combined': 0.2249332320302067, 'epoch': 32}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3393303633905101, 'r': 0.32542580621583356, 'f1': 0.33223266553587993}, 'combined': 0.23373152349257886, 'epoch': 32}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28669905462184875, 'r': 0.33294083762537274, 'f1': 0.30809450645930014}, 'combined': 0.2270170047594843, 'epoch': 32}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3427604167660694, 'r': 0.3260112166448653, 'f1': 0.3341760771690659}, 'combined': 0.23726501479003678, 'epoch': 32}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30493758958432876, 'r': 0.34602026294388727, 'f1': 0.3241825396825397}, 'combined': 0.23887134502923976, 'epoch': 32}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35629862700228837, 'r': 0.3109063062847193, 'f1': 0.3320583662649472}, 'combined': 0.23576144004811253, 'epoch': 32}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3034420289855072, 'r': 0.3988095238095238, 'f1': 0.34465020576131683}, 'combined': 0.2297668038408779, 'epoch': 32}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2564102564102564, 'r': 0.43478260869565216, 'f1': 0.3225806451612903}, 'combined': 0.16129032258064516, 'epoch': 32}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.421875, 'r': 0.23275862068965517, 'f1': 0.3}, 'combined': 0.19999999999999998, 'epoch': 32}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 33
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
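The per-language metric dicts in the epoch summaries follow a consistent pattern: each 'f1' is the harmonic mean of its 'p' and 'r', and 'combined' equals the template f1 multiplied by the slot f1 (e.g. epoch-32 Dev Chinese: 0.7368421052631579 × 0.305266529183852 ≈ 0.2249332320302067). This is inferred from the logged numbers, not read out of train.py; the helper below is a minimal sketch of that arithmetic:

```python
# Sketch reproducing the epoch-32 Dev Chinese summary numbers from the log.
# The f1() helper and the combined = template_f1 * slot_f1 relation are
# inferred from the logged values; train.py's actual code may differ.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

template_f1 = f1(p=1.0, r=0.5833333333333334)  # template f1 as logged
slot_f1 = 0.305266529183852                    # slot f1 as logged
combined = template_f1 * slot_f1               # matches 'combined' in the log

print(template_f1, combined)
```

Checking any row the same way (e.g. Test Chinese template: f1(0.9722222222222222, 0.5511811023622047) gives the logged 0.7035175879396985) confirms the pattern holds across the summary.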
2023-01-22 18:33:05.287264: step: 2/463, loss: 0.0013932586880400777 2023-01-22 18:33:05.971769: step: 4/463, loss: 0.02181864343583584 2023-01-22 18:33:06.590976: step: 6/463, loss: 0.0031859776936471462 2023-01-22 18:33:07.255715: step: 8/463, loss: 0.00013484965893439949 2023-01-22 18:33:07.916755: step: 10/463, loss: 0.1254078447818756 2023-01-22 18:33:08.493692: step: 12/463, loss: 0.004445891361683607 2023-01-22 18:33:09.135057: step: 14/463, loss: 0.026911946013569832 2023-01-22 18:33:09.710688: step: 16/463, loss: 0.007043161429464817 2023-01-22 18:33:10.344318: step: 18/463, loss: 0.12216439098119736 2023-01-22 18:33:11.007222: step: 20/463, loss: 4.410505425767042e-05 2023-01-22 18:33:11.601496: step: 22/463, loss: 0.0001067329358193092 2023-01-22 18:33:12.123804: step: 24/463, loss: 8.399414218729362e-05 2023-01-22 18:33:12.776426: step: 26/463, loss: 0.2121105194091797 2023-01-22 18:33:13.406653: step: 28/463, loss: 0.04740883782505989 2023-01-22 18:33:14.092143: step: 30/463, loss: 0.0016680164262652397 2023-01-22 18:33:14.675103: step: 32/463, loss: 0.0022112783044576645 2023-01-22 18:33:15.349673: step: 34/463, loss: 0.000836713588796556 2023-01-22 18:33:15.946044: step: 36/463, loss: 0.007664518430829048 2023-01-22 18:33:16.529780: step: 38/463, loss: 0.002375347539782524 2023-01-22 18:33:17.127303: step: 40/463, loss: 0.0025359492283314466 2023-01-22 18:33:17.785493: step: 42/463, loss: 0.016810551285743713 2023-01-22 18:33:18.431821: step: 44/463, loss: 0.012377624399960041 2023-01-22 18:33:19.041545: step: 46/463, loss: 3.814995527267456 2023-01-22 18:33:19.705985: step: 48/463, loss: 0.004936085548251867 2023-01-22 18:33:20.330122: step: 50/463, loss: 0.02674945630133152 2023-01-22 18:33:20.924792: step: 52/463, loss: 0.016773661598563194 2023-01-22 18:33:21.506181: step: 54/463, loss: 0.026745310053229332 2023-01-22 18:33:22.083993: step: 56/463, loss: 0.07662426680326462 2023-01-22 18:33:22.648257: step: 58/463, loss: 0.06307216733694077 
2023-01-22 18:33:23.327391: step: 60/463, loss: 0.004318969789892435 2023-01-22 18:33:24.125168: step: 62/463, loss: 0.003160941880196333 2023-01-22 18:33:24.790073: step: 64/463, loss: 0.019474875181913376 2023-01-22 18:33:25.381753: step: 66/463, loss: 0.00017072148330044 2023-01-22 18:33:26.015665: step: 68/463, loss: 0.0054404595866799355 2023-01-22 18:33:26.653374: step: 70/463, loss: 0.015696804970502853 2023-01-22 18:33:27.306229: step: 72/463, loss: 0.2734544575214386 2023-01-22 18:33:27.889544: step: 74/463, loss: 0.0002800676738843322 2023-01-22 18:33:28.529177: step: 76/463, loss: 0.02203391119837761 2023-01-22 18:33:29.157297: step: 78/463, loss: 0.00992437731474638 2023-01-22 18:33:29.847761: step: 80/463, loss: 0.004833622369915247 2023-01-22 18:33:30.391169: step: 82/463, loss: 0.005540614016354084 2023-01-22 18:33:31.037776: step: 84/463, loss: 0.006166017148643732 2023-01-22 18:33:31.710587: step: 86/463, loss: 0.0008647437789477408 2023-01-22 18:33:32.354243: step: 88/463, loss: 0.0007783463806845248 2023-01-22 18:33:32.929677: step: 90/463, loss: 0.003765785600990057 2023-01-22 18:33:33.541190: step: 92/463, loss: 0.012625808827579021 2023-01-22 18:33:34.076491: step: 94/463, loss: 0.016006946563720703 2023-01-22 18:33:34.661817: step: 96/463, loss: 0.008302020840346813 2023-01-22 18:33:35.312760: step: 98/463, loss: 0.00893313717097044 2023-01-22 18:33:35.984039: step: 100/463, loss: 0.011998682282865047 2023-01-22 18:33:36.598537: step: 102/463, loss: 0.011211829259991646 2023-01-22 18:33:37.199683: step: 104/463, loss: 0.0062698847614228725 2023-01-22 18:33:37.776950: step: 106/463, loss: 0.006063942797482014 2023-01-22 18:33:38.396742: step: 108/463, loss: 0.008157585747539997 2023-01-22 18:33:39.011452: step: 110/463, loss: 0.027382224798202515 2023-01-22 18:33:39.649940: step: 112/463, loss: 0.022550681605935097 2023-01-22 18:33:40.286650: step: 114/463, loss: 0.00800416711717844 2023-01-22 18:33:40.884718: step: 116/463, loss: 
0.01698043756186962 2023-01-22 18:33:41.472309: step: 118/463, loss: 0.09140413999557495 2023-01-22 18:33:42.101822: step: 120/463, loss: 0.0042067538015544415 2023-01-22 18:33:42.697079: step: 122/463, loss: 0.023529253900051117 2023-01-22 18:33:43.288991: step: 124/463, loss: 0.00035012507578358054 2023-01-22 18:33:44.025082: step: 126/463, loss: 0.5861074924468994 2023-01-22 18:33:44.639326: step: 128/463, loss: 0.01316317543387413 2023-01-22 18:33:45.246476: step: 130/463, loss: 0.04388779401779175 2023-01-22 18:33:45.813156: step: 132/463, loss: 0.0010401640320196748 2023-01-22 18:33:46.401289: step: 134/463, loss: 0.0026373390574008226 2023-01-22 18:33:46.948428: step: 136/463, loss: 0.0059344228357076645 2023-01-22 18:33:47.526901: step: 138/463, loss: 0.009586157277226448 2023-01-22 18:33:48.257047: step: 140/463, loss: 0.012928391806781292 2023-01-22 18:33:48.903275: step: 142/463, loss: 0.00791609100997448 2023-01-22 18:33:49.539412: step: 144/463, loss: 0.021640172228217125 2023-01-22 18:33:50.121356: step: 146/463, loss: 0.008415917865931988 2023-01-22 18:33:50.722900: step: 148/463, loss: 0.055267058312892914 2023-01-22 18:33:51.345537: step: 150/463, loss: 0.030475394800305367 2023-01-22 18:33:52.031521: step: 152/463, loss: 0.24319012463092804 2023-01-22 18:33:52.645415: step: 154/463, loss: 0.010866263881325722 2023-01-22 18:33:53.196505: step: 156/463, loss: 0.0023217920679599047 2023-01-22 18:33:53.851548: step: 158/463, loss: 0.1085912436246872 2023-01-22 18:33:54.508860: step: 160/463, loss: 0.039575640112161636 2023-01-22 18:33:55.143953: step: 162/463, loss: 0.0006061178864911199 2023-01-22 18:33:55.829279: step: 164/463, loss: 0.0007332772365771234 2023-01-22 18:33:56.460589: step: 166/463, loss: 0.03827434778213501 2023-01-22 18:33:57.073899: step: 168/463, loss: 0.039752259850502014 2023-01-22 18:33:57.673674: step: 170/463, loss: 0.019318079575896263 2023-01-22 18:33:58.290785: step: 172/463, loss: 0.05793484300374985 2023-01-22 
18:33:58.877483: step: 174/463, loss: 0.0009875181131064892 2023-01-22 18:33:59.520568: step: 176/463, loss: 0.0027187541127204895 2023-01-22 18:34:00.182210: step: 178/463, loss: 0.05385085567831993 2023-01-22 18:34:00.856215: step: 180/463, loss: 0.03129861503839493 2023-01-22 18:34:01.474068: step: 182/463, loss: 0.0349201038479805 2023-01-22 18:34:02.086207: step: 184/463, loss: 0.004159962292760611 2023-01-22 18:34:02.801230: step: 186/463, loss: 0.003720509819686413 2023-01-22 18:34:03.449808: step: 188/463, loss: 0.172422856092453 2023-01-22 18:34:04.093976: step: 190/463, loss: 0.007242167368531227 2023-01-22 18:34:04.723704: step: 192/463, loss: 0.017842689529061317 2023-01-22 18:34:05.402093: step: 194/463, loss: 1.0088056325912476 2023-01-22 18:34:06.029403: step: 196/463, loss: 0.0006924908957444131 2023-01-22 18:34:06.615936: step: 198/463, loss: 0.0006414587260223925 2023-01-22 18:34:07.216151: step: 200/463, loss: 0.002112535061314702 2023-01-22 18:34:07.824767: step: 202/463, loss: 0.0033550274092704058 2023-01-22 18:34:08.463447: step: 204/463, loss: 0.0019070173148065805 2023-01-22 18:34:09.006696: step: 206/463, loss: 0.0013315198011696339 2023-01-22 18:34:09.606080: step: 208/463, loss: 0.004634491633623838 2023-01-22 18:34:10.240584: step: 210/463, loss: 0.028919626027345657 2023-01-22 18:34:10.880722: step: 212/463, loss: 0.0025548983830958605 2023-01-22 18:34:11.509722: step: 214/463, loss: 0.1113569512963295 2023-01-22 18:34:12.135522: step: 216/463, loss: 0.001656369655393064 2023-01-22 18:34:12.848200: step: 218/463, loss: 0.05174800753593445 2023-01-22 18:34:13.475450: step: 220/463, loss: 0.00011315010488033295 2023-01-22 18:34:14.098109: step: 222/463, loss: 0.012050597928464413 2023-01-22 18:34:14.648933: step: 224/463, loss: 0.0050604818388819695 2023-01-22 18:34:15.237833: step: 226/463, loss: 0.043445322662591934 2023-01-22 18:34:15.869508: step: 228/463, loss: 0.04577048495411873 2023-01-22 18:34:16.496615: step: 230/463, loss: 
0.021333202719688416 2023-01-22 18:34:17.096248: step: 232/463, loss: 0.02679116278886795 2023-01-22 18:34:17.713758: step: 234/463, loss: 0.05852409452199936 2023-01-22 18:34:18.365851: step: 236/463, loss: 0.0003992473357357085 2023-01-22 18:34:18.984731: step: 238/463, loss: 0.013466784730553627 2023-01-22 18:34:19.623829: step: 240/463, loss: 0.00025833951076492667 2023-01-22 18:34:20.197577: step: 242/463, loss: 0.0005600973381660879 2023-01-22 18:34:20.840486: step: 244/463, loss: 0.026733027771115303 2023-01-22 18:34:21.470534: step: 246/463, loss: 0.0020034664776176214 2023-01-22 18:34:22.057936: step: 248/463, loss: 0.024378353729844093 2023-01-22 18:34:22.641751: step: 250/463, loss: 0.00044065050315111876 2023-01-22 18:34:23.212348: step: 252/463, loss: 0.010949659161269665 2023-01-22 18:34:23.870863: step: 254/463, loss: 0.008761285804212093 2023-01-22 18:34:24.511677: step: 256/463, loss: 0.02798420563340187 2023-01-22 18:34:25.085357: step: 258/463, loss: 0.00019059523765463382 2023-01-22 18:34:25.625493: step: 260/463, loss: 0.00964433141052723 2023-01-22 18:34:26.228110: step: 262/463, loss: 0.010586505755782127 2023-01-22 18:34:26.798673: step: 264/463, loss: 0.0019379424629732966 2023-01-22 18:34:27.481279: step: 266/463, loss: 0.028102975338697433 2023-01-22 18:34:28.103422: step: 268/463, loss: 0.047110915184020996 2023-01-22 18:34:28.794618: step: 270/463, loss: 0.011390289291739464 2023-01-22 18:34:29.469096: step: 272/463, loss: 0.33488160371780396 2023-01-22 18:34:30.065437: step: 274/463, loss: 0.0023548321332782507 2023-01-22 18:34:30.641264: step: 276/463, loss: 0.0006221684161573648 2023-01-22 18:34:31.246264: step: 278/463, loss: 0.0021879354026168585 2023-01-22 18:34:31.847456: step: 280/463, loss: 0.008410616777837276 2023-01-22 18:34:32.491545: step: 282/463, loss: 0.0015852057840675116 2023-01-22 18:34:33.112085: step: 284/463, loss: 0.0025573037564754486 2023-01-22 18:34:33.749961: step: 286/463, loss: 0.004271598067134619 
2023-01-22 18:34:34.367372: step: 288/463, loss: 8.553343650419265e-05 2023-01-22 18:34:35.005395: step: 290/463, loss: 0.08332924544811249 2023-01-22 18:34:35.596290: step: 292/463, loss: 0.004096593242138624 2023-01-22 18:34:36.244226: step: 294/463, loss: 7.485674359486438e-06 2023-01-22 18:34:36.872582: step: 296/463, loss: 0.7751229405403137 2023-01-22 18:34:37.528327: step: 298/463, loss: 0.004551571793854237 2023-01-22 18:34:38.176758: step: 300/463, loss: 0.009025401435792446 2023-01-22 18:34:38.836806: step: 302/463, loss: 0.01862529292702675 2023-01-22 18:34:39.506284: step: 304/463, loss: 1.643502946535591e-05 2023-01-22 18:34:40.166640: step: 306/463, loss: 0.35719457268714905 2023-01-22 18:34:40.735066: step: 308/463, loss: 0.1354079246520996 2023-01-22 18:34:41.378974: step: 310/463, loss: 0.0034778225235641003 2023-01-22 18:34:41.936909: step: 312/463, loss: 0.0023073123302310705 2023-01-22 18:34:42.563853: step: 314/463, loss: 0.01871236227452755 2023-01-22 18:34:43.150285: step: 316/463, loss: 0.002790148137137294 2023-01-22 18:34:43.840954: step: 318/463, loss: 0.015036230906844139 2023-01-22 18:34:44.647004: step: 320/463, loss: 0.06739447265863419 2023-01-22 18:34:45.247470: step: 322/463, loss: 0.00042165833292528987 2023-01-22 18:34:45.841436: step: 324/463, loss: 0.04003236070275307 2023-01-22 18:34:46.531118: step: 326/463, loss: 0.0010843510972335935 2023-01-22 18:34:47.170521: step: 328/463, loss: 0.0039037505630403757 2023-01-22 18:34:47.779641: step: 330/463, loss: 0.15337149798870087 2023-01-22 18:34:48.324207: step: 332/463, loss: 0.004860019776970148 2023-01-22 18:34:48.933256: step: 334/463, loss: 0.003123639151453972 2023-01-22 18:34:49.581173: step: 336/463, loss: 0.025094961747527122 2023-01-22 18:34:50.172674: step: 338/463, loss: 0.006600796245038509 2023-01-22 18:34:50.862281: step: 340/463, loss: 0.008109505288302898 2023-01-22 18:34:51.453694: step: 342/463, loss: 0.06752146035432816 2023-01-22 18:34:52.045897: step: 344/463, 
loss: 0.0010341204470023513 2023-01-22 18:34:52.704580: step: 346/463, loss: 0.0037475479766726494 2023-01-22 18:34:53.309538: step: 348/463, loss: 0.00010757103882497177 2023-01-22 18:34:53.938184: step: 350/463, loss: 0.00718420697376132 2023-01-22 18:34:54.586661: step: 352/463, loss: 0.009333829395473003 2023-01-22 18:34:55.279269: step: 354/463, loss: 0.01734703965485096 2023-01-22 18:34:55.929424: step: 356/463, loss: 0.004886743146926165 2023-01-22 18:34:56.497827: step: 358/463, loss: 0.001554237911477685 2023-01-22 18:34:57.138948: step: 360/463, loss: 0.0009370250627398491 2023-01-22 18:34:57.788100: step: 362/463, loss: 0.00451401574537158 2023-01-22 18:34:58.441982: step: 364/463, loss: 0.015931345522403717 2023-01-22 18:34:59.141347: step: 366/463, loss: 0.0019849480595439672 2023-01-22 18:34:59.760080: step: 368/463, loss: 0.0027850775513798 2023-01-22 18:35:00.423818: step: 370/463, loss: 0.04677087813615799 2023-01-22 18:35:01.048960: step: 372/463, loss: 0.05528602749109268 2023-01-22 18:35:01.714748: step: 374/463, loss: 0.01552736945450306 2023-01-22 18:35:02.416502: step: 376/463, loss: 0.17703038454055786 2023-01-22 18:35:03.055202: step: 378/463, loss: 0.1686805635690689 2023-01-22 18:35:03.709029: step: 380/463, loss: 0.00019041445921175182 2023-01-22 18:35:04.395682: step: 382/463, loss: 0.10301722586154938 2023-01-22 18:35:05.031907: step: 384/463, loss: 0.010699063539505005 2023-01-22 18:35:05.660473: step: 386/463, loss: 0.007266933564096689 2023-01-22 18:35:06.243633: step: 388/463, loss: 0.004625116009265184 2023-01-22 18:35:06.905051: step: 390/463, loss: 0.007225690875202417 2023-01-22 18:35:07.448442: step: 392/463, loss: 0.01454153936356306 2023-01-22 18:35:08.019272: step: 394/463, loss: 0.00023638870334252715 2023-01-22 18:35:08.652357: step: 396/463, loss: 0.0015497403219342232 2023-01-22 18:35:09.320298: step: 398/463, loss: 0.0013933083973824978 2023-01-22 18:35:09.926238: step: 400/463, loss: 0.022542651742696762 2023-01-22 
18:35:10.576233: step: 402/463, loss: 0.001109207863919437 2023-01-22 18:35:11.210211: step: 404/463, loss: 0.0019756362307816744 2023-01-22 18:35:11.774923: step: 406/463, loss: 0.010985612869262695 2023-01-22 18:35:12.403912: step: 408/463, loss: 0.0272396057844162 2023-01-22 18:35:13.071353: step: 410/463, loss: 0.028261778876185417 2023-01-22 18:35:13.670671: step: 412/463, loss: 0.0019402196630835533 2023-01-22 18:35:14.308361: step: 414/463, loss: 0.008733673021197319 2023-01-22 18:35:14.997117: step: 416/463, loss: 0.02199440449476242 2023-01-22 18:35:15.626439: step: 418/463, loss: 0.011788634583353996 2023-01-22 18:35:16.255898: step: 420/463, loss: 0.024626873433589935 2023-01-22 18:35:16.874507: step: 422/463, loss: 0.011924585327506065 2023-01-22 18:35:17.453646: step: 424/463, loss: 0.0006248769932426512 2023-01-22 18:35:18.081184: step: 426/463, loss: 0.007920955307781696 2023-01-22 18:35:18.713309: step: 428/463, loss: 0.007296229247003794 2023-01-22 18:35:19.348803: step: 430/463, loss: 0.050273410975933075 2023-01-22 18:35:20.014138: step: 432/463, loss: 0.0019431866239756346 2023-01-22 18:35:20.733198: step: 434/463, loss: 0.019585752859711647 2023-01-22 18:35:21.291201: step: 436/463, loss: 0.0076931859366595745 2023-01-22 18:35:21.974121: step: 438/463, loss: 0.02151302807033062 2023-01-22 18:35:22.589048: step: 440/463, loss: 0.01828189380466938 2023-01-22 18:35:23.272018: step: 442/463, loss: 0.002852978650480509 2023-01-22 18:35:23.908483: step: 444/463, loss: 0.24126063287258148 2023-01-22 18:35:24.522800: step: 446/463, loss: 0.0007179048261605203 2023-01-22 18:35:25.178692: step: 448/463, loss: 0.02330598793923855 2023-01-22 18:35:25.793491: step: 450/463, loss: 0.09987345337867737 2023-01-22 18:35:26.404535: step: 452/463, loss: 0.037242788821458817 2023-01-22 18:35:26.986061: step: 454/463, loss: 0.006808450445532799 2023-01-22 18:35:27.613970: step: 456/463, loss: 3.358148387633264e-05 2023-01-22 18:35:28.198991: step: 458/463, loss: 
0.0033859792165458202 2023-01-22 18:35:28.781523: step: 460/463, loss: 0.002981805009767413 2023-01-22 18:35:29.393153: step: 462/463, loss: 0.010459107346832752 2023-01-22 18:35:30.070381: step: 464/463, loss: 1.1266754865646362 2023-01-22 18:35:30.786116: step: 466/463, loss: 0.0036795358173549175 2023-01-22 18:35:31.469251: step: 468/463, loss: 0.006028769072145224 2023-01-22 18:35:32.034737: step: 470/463, loss: 0.000837208004668355 2023-01-22 18:35:32.643031: step: 472/463, loss: 0.002633063355460763 2023-01-22 18:35:33.284762: step: 474/463, loss: 0.017999490723013878 2023-01-22 18:35:33.883109: step: 476/463, loss: 0.0019533182494342327 2023-01-22 18:35:34.517156: step: 478/463, loss: 0.03130760416388512 2023-01-22 18:35:35.186234: step: 480/463, loss: 1.5717397928237915 2023-01-22 18:35:35.830596: step: 482/463, loss: 0.0034265972208231688 2023-01-22 18:35:36.410240: step: 484/463, loss: 0.08021298795938492 2023-01-22 18:35:36.999855: step: 486/463, loss: 0.00398729695007205 2023-01-22 18:35:37.622478: step: 488/463, loss: 0.001529026310890913 2023-01-22 18:35:38.255962: step: 490/463, loss: 0.00045622012112289667 2023-01-22 18:35:38.895042: step: 492/463, loss: 0.000495657732244581 2023-01-22 18:35:39.579425: step: 494/463, loss: 0.04346424713730812 2023-01-22 18:35:40.180238: step: 496/463, loss: 0.01340216863900423 2023-01-22 18:35:40.811514: step: 498/463, loss: 0.2153106927871704 2023-01-22 18:35:41.440307: step: 500/463, loss: 0.028481867164373398 2023-01-22 18:35:42.084318: step: 502/463, loss: 0.0008101155399344862 2023-01-22 18:35:42.725321: step: 504/463, loss: 0.10249729454517365 2023-01-22 18:35:43.361914: step: 506/463, loss: 0.0007267071632668376 2023-01-22 18:35:43.970665: step: 508/463, loss: 0.023027973249554634 2023-01-22 18:35:44.667343: step: 510/463, loss: 0.0717514380812645 2023-01-22 18:35:45.331656: step: 512/463, loss: 5.780195351690054e-05 2023-01-22 18:35:45.902864: step: 514/463, loss: 0.0023229029029607773 2023-01-22 
18:35:46.506710: step: 516/463, loss: 0.0010094813769683242 2023-01-22 18:35:47.129783: step: 518/463, loss: 2.7495494578033686e-05 2023-01-22 18:35:47.778426: step: 520/463, loss: 0.0021471548825502396 2023-01-22 18:35:48.428242: step: 522/463, loss: 0.008352371864020824 2023-01-22 18:35:49.049563: step: 524/463, loss: 0.0029325515497475863 2023-01-22 18:35:49.594287: step: 526/463, loss: 0.0015799134271219373 2023-01-22 18:35:50.191532: step: 528/463, loss: 0.08026915788650513 2023-01-22 18:35:50.829858: step: 530/463, loss: 0.05052229389548302 2023-01-22 18:35:51.385212: step: 532/463, loss: 0.007742117624729872 2023-01-22 18:35:51.982098: step: 534/463, loss: 0.0004088955174665898 2023-01-22 18:35:52.564249: step: 536/463, loss: 0.050201427191495895 2023-01-22 18:35:53.147389: step: 538/463, loss: 0.00043902051402255893 2023-01-22 18:35:53.669700: step: 540/463, loss: 0.033641181886196136 2023-01-22 18:35:54.285619: step: 542/463, loss: 0.005302655976265669 2023-01-22 18:35:54.882564: step: 544/463, loss: 0.013024280779063702 2023-01-22 18:35:55.486419: step: 546/463, loss: 0.006633285898715258 2023-01-22 18:35:56.076173: step: 548/463, loss: 0.0008590467041358352 2023-01-22 18:35:56.735563: step: 550/463, loss: 0.02632511407136917 2023-01-22 18:35:57.325480: step: 552/463, loss: 0.2093154788017273 2023-01-22 18:35:57.932607: step: 554/463, loss: 0.02234162576496601 2023-01-22 18:35:58.541225: step: 556/463, loss: 0.004936086013913155 2023-01-22 18:35:59.168938: step: 558/463, loss: 0.05560288205742836 2023-01-22 18:35:59.767633: step: 560/463, loss: 0.013978354632854462 2023-01-22 18:36:00.421611: step: 562/463, loss: 0.029930485412478447 2023-01-22 18:36:01.128313: step: 564/463, loss: 0.0011513273930177093 2023-01-22 18:36:01.757781: step: 566/463, loss: 0.0008668722584843636 2023-01-22 18:36:02.347365: step: 568/463, loss: 0.05424121022224426 2023-01-22 18:36:03.067968: step: 570/463, loss: 0.00030832496122457087 2023-01-22 18:36:03.643600: step: 572/463, 
loss: 0.021626979112625122 2023-01-22 18:36:04.259727: step: 574/463, loss: 0.0032457076013088226 2023-01-22 18:36:04.856940: step: 576/463, loss: 0.009469774551689625 2023-01-22 18:36:05.489964: step: 578/463, loss: 0.0015660423086956143 2023-01-22 18:36:06.121451: step: 580/463, loss: 0.0015311285387724638 2023-01-22 18:36:06.742519: step: 582/463, loss: 0.0006403227453120053 2023-01-22 18:36:07.412726: step: 584/463, loss: 0.32313668727874756 2023-01-22 18:36:08.009300: step: 586/463, loss: 0.01327084843069315 2023-01-22 18:36:08.652289: step: 588/463, loss: 0.02713206596672535 2023-01-22 18:36:09.182315: step: 590/463, loss: 0.024466780945658684 2023-01-22 18:36:09.778020: step: 592/463, loss: 0.008217682130634785 2023-01-22 18:36:10.408651: step: 594/463, loss: 0.001997007057070732 2023-01-22 18:36:11.041247: step: 596/463, loss: 0.0015895984834060073 2023-01-22 18:36:11.633983: step: 598/463, loss: 0.01027715764939785 2023-01-22 18:36:12.237142: step: 600/463, loss: 0.01284846942871809 2023-01-22 18:36:12.830128: step: 602/463, loss: 0.0008542280411347747 2023-01-22 18:36:13.544161: step: 604/463, loss: 0.08100762963294983 2023-01-22 18:36:14.149000: step: 606/463, loss: 0.002482808195054531 2023-01-22 18:36:14.752238: step: 608/463, loss: 0.003425067523494363 2023-01-22 18:36:15.364417: step: 610/463, loss: 0.00044653331860899925 2023-01-22 18:36:16.037613: step: 612/463, loss: 0.006690045818686485 2023-01-22 18:36:16.622889: step: 614/463, loss: 0.011804546229541302 2023-01-22 18:36:17.259357: step: 616/463, loss: 0.1057143583893776 2023-01-22 18:36:17.826119: step: 618/463, loss: 0.002800230635330081 2023-01-22 18:36:18.504087: step: 620/463, loss: 0.6832827925682068 2023-01-22 18:36:19.159640: step: 622/463, loss: 0.02133691869676113 2023-01-22 18:36:19.795143: step: 624/463, loss: 0.0025007319636642933 2023-01-22 18:36:20.441558: step: 626/463, loss: 0.07407975196838379 2023-01-22 18:36:21.029868: step: 628/463, loss: 0.0006483806064352393 2023-01-22 
18:36:21.650213: step: 630/463, loss: 0.0008885335410013795 2023-01-22 18:36:22.343631: step: 632/463, loss: 0.018372874706983566 2023-01-22 18:36:23.014015: step: 634/463, loss: 0.005026048514991999 2023-01-22 18:36:23.612798: step: 636/463, loss: 0.3491978347301483 2023-01-22 18:36:24.271954: step: 638/463, loss: 0.002196703338995576 2023-01-22 18:36:24.912563: step: 640/463, loss: 0.004823832307010889 2023-01-22 18:36:25.592457: step: 642/463, loss: 0.03379920497536659 2023-01-22 18:36:26.155112: step: 644/463, loss: 0.0004682086000684649 2023-01-22 18:36:26.709182: step: 646/463, loss: 0.0007909684209153056 2023-01-22 18:36:27.330730: step: 648/463, loss: 0.023408932611346245 2023-01-22 18:36:27.940206: step: 650/463, loss: 0.075900599360466 2023-01-22 18:36:28.553668: step: 652/463, loss: 0.007950200699269772 2023-01-22 18:36:29.230849: step: 654/463, loss: 0.009109659120440483 2023-01-22 18:36:29.870886: step: 656/463, loss: 0.4728400707244873 2023-01-22 18:36:30.464642: step: 658/463, loss: 0.04246094077825546 2023-01-22 18:36:30.991606: step: 660/463, loss: 0.038936249911785126 2023-01-22 18:36:31.648964: step: 662/463, loss: 0.11980098485946655 2023-01-22 18:36:32.272781: step: 664/463, loss: 0.004500584211200476 2023-01-22 18:36:32.936484: step: 666/463, loss: 0.008632301352918148 2023-01-22 18:36:33.562226: step: 668/463, loss: 0.019336862489581108 2023-01-22 18:36:34.189704: step: 670/463, loss: 0.007340050768107176 2023-01-22 18:36:34.791871: step: 672/463, loss: 0.0001914932217914611 2023-01-22 18:36:35.367716: step: 674/463, loss: 0.0009257001802325249 2023-01-22 18:36:35.973420: step: 676/463, loss: 0.03375827148556709 2023-01-22 18:36:36.623358: step: 678/463, loss: 0.004182027652859688 2023-01-22 18:36:37.235989: step: 680/463, loss: 0.01220657303929329 2023-01-22 18:36:37.894495: step: 682/463, loss: 0.00025416986318305135 2023-01-22 18:36:38.541068: step: 684/463, loss: 0.1373051553964615 2023-01-22 18:36:39.163318: step: 686/463, loss: 
0.025592533871531487 2023-01-22 18:36:39.766465: step: 688/463, loss: 0.05938173085451126 2023-01-22 18:36:40.370318: step: 690/463, loss: 0.0010745730251073837 2023-01-22 18:36:41.014719: step: 692/463, loss: 0.015250202268362045 2023-01-22 18:36:41.659393: step: 694/463, loss: 0.01921061798930168 2023-01-22 18:36:42.386736: step: 696/463, loss: 0.009930760599672794 2023-01-22 18:36:43.094831: step: 698/463, loss: 0.007616202346980572 2023-01-22 18:36:43.670190: step: 700/463, loss: 0.005745828151702881 2023-01-22 18:36:44.268937: step: 702/463, loss: 0.005791160743683577 2023-01-22 18:36:44.818515: step: 704/463, loss: 0.010489909909665585 2023-01-22 18:36:45.464862: step: 706/463, loss: 0.06520478427410126 2023-01-22 18:36:46.057118: step: 708/463, loss: 0.01546331774443388 2023-01-22 18:36:46.682535: step: 710/463, loss: 0.0031511266715824604 2023-01-22 18:36:47.270984: step: 712/463, loss: 4.217265814077109e-05 2023-01-22 18:36:47.900371: step: 714/463, loss: 0.004015397746115923 2023-01-22 18:36:48.609978: step: 716/463, loss: 0.014853289350867271 2023-01-22 18:36:49.271275: step: 718/463, loss: 0.00044226067257113755 2023-01-22 18:36:49.942119: step: 720/463, loss: 0.040723130106925964 2023-01-22 18:36:50.613353: step: 722/463, loss: 0.000643678882624954 2023-01-22 18:36:51.235218: step: 724/463, loss: 0.03064826875925064 2023-01-22 18:36:51.812487: step: 726/463, loss: 0.006204506848007441 2023-01-22 18:36:52.365155: step: 728/463, loss: 0.0020964075811207294 2023-01-22 18:36:53.002137: step: 730/463, loss: 0.009509078226983547 2023-01-22 18:36:53.655351: step: 732/463, loss: 0.00043342376011423767 2023-01-22 18:36:54.261620: step: 734/463, loss: 0.04021352156996727 2023-01-22 18:36:54.881548: step: 736/463, loss: 0.12872037291526794 2023-01-22 18:36:55.481331: step: 738/463, loss: 0.011373971588909626 2023-01-22 18:36:56.106842: step: 740/463, loss: 0.0007512427982874215 2023-01-22 18:36:56.666164: step: 742/463, loss: 1.7922960978467017e-05 2023-01-22 
18:36:57.257824: step: 744/463, loss: 0.0004205645527690649 2023-01-22 18:36:57.950586: step: 746/463, loss: 0.01865875907242298 2023-01-22 18:36:58.584402: step: 748/463, loss: 0.005451646167784929 2023-01-22 18:36:59.262446: step: 750/463, loss: 0.8020462393760681 2023-01-22 18:36:59.926710: step: 752/463, loss: 0.002502827439457178 2023-01-22 18:37:00.488404: step: 754/463, loss: 0.0031556945759803057 2023-01-22 18:37:01.160741: step: 756/463, loss: 0.024208471179008484 2023-01-22 18:37:01.816320: step: 758/463, loss: 0.000698418531101197 2023-01-22 18:37:02.421927: step: 760/463, loss: 0.010708297602832317 2023-01-22 18:37:03.041170: step: 762/463, loss: 0.0013954704627394676 2023-01-22 18:37:03.654724: step: 764/463, loss: 0.13687138259410858 2023-01-22 18:37:04.373725: step: 766/463, loss: 0.002894794102758169 2023-01-22 18:37:05.006110: step: 768/463, loss: 0.06847632676362991 2023-01-22 18:37:05.634231: step: 770/463, loss: 0.002703516511246562 2023-01-22 18:37:06.227283: step: 772/463, loss: 0.00019253414939157665 2023-01-22 18:37:06.872836: step: 774/463, loss: 0.0027576326392591 2023-01-22 18:37:07.555063: step: 776/463, loss: 0.0017571891658008099 2023-01-22 18:37:08.193763: step: 778/463, loss: 0.007923871278762817 2023-01-22 18:37:08.872840: step: 780/463, loss: 0.2504790127277374 2023-01-22 18:37:09.434418: step: 782/463, loss: 0.0017992117209360003 2023-01-22 18:37:10.046585: step: 784/463, loss: 0.01976163126528263 2023-01-22 18:37:10.663450: step: 786/463, loss: 0.005164849106222391 2023-01-22 18:37:11.288849: step: 788/463, loss: 0.0047714668326079845 2023-01-22 18:37:11.968085: step: 790/463, loss: 0.003612370463088155 2023-01-22 18:37:12.652553: step: 792/463, loss: 0.0008338374318554997 2023-01-22 18:37:13.246185: step: 794/463, loss: 0.03897548094391823 2023-01-22 18:37:13.853758: step: 796/463, loss: 0.0046454984694719315 2023-01-22 18:37:14.442612: step: 798/463, loss: 0.02185918390750885 2023-01-22 18:37:15.153776: step: 800/463, loss: 
0.029178552329540253 2023-01-22 18:37:15.760238: step: 802/463, loss: 1.9040051698684692 2023-01-22 18:37:16.335524: step: 804/463, loss: 0.0005749660776928067 2023-01-22 18:37:17.057642: step: 806/463, loss: 0.0459834560751915 2023-01-22 18:37:17.686826: step: 808/463, loss: 0.0033718671184033155 2023-01-22 18:37:18.309036: step: 810/463, loss: 0.0038874726742506027 2023-01-22 18:37:19.050747: step: 812/463, loss: 0.0006899218424223363 2023-01-22 18:37:19.720934: step: 814/463, loss: 0.10444609820842743 2023-01-22 18:37:20.429701: step: 816/463, loss: 0.005753802601248026 2023-01-22 18:37:21.009271: step: 818/463, loss: 0.010254835709929466 2023-01-22 18:37:21.652960: step: 820/463, loss: 0.005231910385191441 2023-01-22 18:37:22.298446: step: 822/463, loss: 4.1565439460100606e-05 2023-01-22 18:37:23.022329: step: 824/463, loss: 0.007041449658572674 2023-01-22 18:37:23.665116: step: 826/463, loss: 0.04505906626582146 2023-01-22 18:37:24.251524: step: 828/463, loss: 0.020431658253073692 2023-01-22 18:37:24.924616: step: 830/463, loss: 0.0014902811963111162 2023-01-22 18:37:25.583258: step: 832/463, loss: 0.006454145070165396 2023-01-22 18:37:26.170623: step: 834/463, loss: 1.2973847389221191 2023-01-22 18:37:26.807111: step: 836/463, loss: 0.010359304957091808 2023-01-22 18:37:27.413942: step: 838/463, loss: 0.020958522334694862 2023-01-22 18:37:27.997307: step: 840/463, loss: 0.00291613070294261 2023-01-22 18:37:28.641127: step: 842/463, loss: 0.2026606649160385 2023-01-22 18:37:29.205609: step: 844/463, loss: 0.0024177723098546267 2023-01-22 18:37:29.811764: step: 846/463, loss: 0.02967863716185093 2023-01-22 18:37:30.370305: step: 848/463, loss: 0.017069164663553238 2023-01-22 18:37:31.009253: step: 850/463, loss: 3.707767973537557e-05 2023-01-22 18:37:31.602641: step: 852/463, loss: 0.06910073757171631 2023-01-22 18:37:32.265551: step: 854/463, loss: 0.009719984605908394 2023-01-22 18:37:32.857273: step: 856/463, loss: 0.012193923816084862 2023-01-22 
18:37:33.472436: step: 858/463, loss: 0.0004109850851818919 2023-01-22 18:37:34.142192: step: 860/463, loss: 0.00027402874547988176 2023-01-22 18:37:34.757291: step: 862/463, loss: 0.009598150849342346 2023-01-22 18:37:35.451826: step: 864/463, loss: 0.06702028214931488 2023-01-22 18:37:36.065650: step: 866/463, loss: 0.009674475528299809 2023-01-22 18:37:36.692557: step: 868/463, loss: 0.0011261209147050977 2023-01-22 18:37:37.362612: step: 870/463, loss: 0.020866477862000465 2023-01-22 18:37:37.971001: step: 872/463, loss: 0.00012341064575593919 2023-01-22 18:37:38.540768: step: 874/463, loss: 0.005406065843999386 2023-01-22 18:37:39.130955: step: 876/463, loss: 0.0076129804365336895 2023-01-22 18:37:39.751391: step: 878/463, loss: 0.04044797271490097 2023-01-22 18:37:40.323946: step: 880/463, loss: 0.006791677325963974 2023-01-22 18:37:40.950086: step: 882/463, loss: 0.04146798327565193 2023-01-22 18:37:41.570294: step: 884/463, loss: 0.011866828426718712 2023-01-22 18:37:42.228597: step: 886/463, loss: 0.0032745241187512875 2023-01-22 18:37:42.858169: step: 888/463, loss: 0.11990147083997726 2023-01-22 18:37:43.455790: step: 890/463, loss: 0.02671329490840435 2023-01-22 18:37:44.064257: step: 892/463, loss: 0.0045822239480912685 2023-01-22 18:37:44.669702: step: 894/463, loss: 0.0016211771871894598 2023-01-22 18:37:45.346021: step: 896/463, loss: 0.005417850334197283 2023-01-22 18:37:45.983025: step: 898/463, loss: 0.005452464334666729 2023-01-22 18:37:46.617162: step: 900/463, loss: 0.04223255068063736 2023-01-22 18:37:47.212447: step: 902/463, loss: 0.00021084809850435704 2023-01-22 18:37:47.856657: step: 904/463, loss: 0.012233521789312363 2023-01-22 18:37:48.469882: step: 906/463, loss: 0.6109666228294373 2023-01-22 18:37:49.163458: step: 908/463, loss: 0.01064689178019762 2023-01-22 18:37:49.765318: step: 910/463, loss: 0.002101311692968011 2023-01-22 18:37:50.381674: step: 912/463, loss: 0.002190230879932642 2023-01-22 18:37:50.929697: step: 914/463, 
loss: 1.3704584489460103e-05 2023-01-22 18:37:51.547055: step: 916/463, loss: 0.015062023885548115 2023-01-22 18:37:52.156810: step: 918/463, loss: 0.04398564621806145 2023-01-22 18:37:52.789160: step: 920/463, loss: 0.03191087394952774 2023-01-22 18:37:53.423647: step: 922/463, loss: 0.005315628834068775 2023-01-22 18:37:54.111931: step: 924/463, loss: 0.01851019822061062 2023-01-22 18:37:54.703277: step: 926/463, loss: 0.00482781371101737
==================================================
Loss: 0.057
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29459852005870846, 'r': 0.3264621171049065, 'f1': 0.30971293557927226}, 'combined': 0.22820953147946377, 'epoch': 33}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3669334870862626, 'r': 0.33526268044150237, 'f1': 0.350383867395356}, 'combined': 0.24650121324296403, 'epoch': 33}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2977078201970444, 'r': 0.32764807535917595, 'f1': 0.31196122080268424}, 'combined': 0.2298661626967147, 'epoch': 33}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3700155375253328, 'r': 0.33320770918511644, 'f1': 0.35064833308185805}, 'combined': 0.2489603164881192, 'epoch': 33}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3028993919161733, 'r': 0.3339365212586275, 'f1': 0.31766163664854996}, 'combined': 0.23406646910945786, 'epoch': 33}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38412633305988514, 'r': 0.32004936396430606, 'f1': 0.34917248379145344}, 'combined': 0.24791246349193194, 'epoch': 33}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3115530303030303, 'r': 0.3916666666666666, 'f1': 0.3470464135021097}, 'combined': 0.2313642756680731, 'epoch': 33}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24342105263157895, 'r': 0.40217391304347827, 'f1': 0.30327868852459017}, 'combined': 0.15163934426229508, 'epoch': 33}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39705882352941174, 'r': 0.23275862068965517, 'f1': 0.2934782608695652}, 'combined': 0.19565217391304346, 'epoch': 33}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 34
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:40:33.866107: step: 2/463, loss: 0.01421637088060379 2023-01-22 18:40:34.499888: step: 4/463, loss: 0.013926789164543152 2023-01-22 18:40:35.091145: step: 6/463, loss: 0.0064306422136723995 2023-01-22 18:40:35.722651: step: 8/463, loss: 0.0006886579212732613 2023-01-22 18:40:36.337404: step: 10/463, loss: 0.013569198548793793 2023-01-22 18:40:37.036505: step: 12/463, loss: 0.09498600661754608 2023-01-22 18:40:37.684944: step: 14/463, loss: 0.049510691314935684 2023-01-22 18:40:38.340731: step: 16/463, loss: 0.0006151176057755947 2023-01-22 18:40:38.969828: step: 18/463, loss: 0.0050384290516376495 2023-01-22 18:40:39.580781: step: 20/463, loss: 0.0061970055103302 2023-01-22 18:40:40.178068: step: 22/463, loss: 0.09867826849222183 2023-01-22 18:40:40.805388: step: 24/463, loss: 0.062110964208841324 2023-01-22 18:40:41.339439: step: 26/463, loss: 0.010279631242156029 2023-01-22 18:40:41.924032: step: 28/463, loss: 0.0009861228754743934 2023-01-22 18:40:42.596312: step: 30/463, loss: 0.002708416897803545 2023-01-22 18:40:43.180948: step: 32/463, loss: 0.004181708674877882 2023-01-22 18:40:43.764727: step:
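The numbers in the evaluation summaries above are internally consistent with the standard F1 definition, and the 'combined' field is the product of the template F1 and the slot F1 (e.g. for Dev Chinese at epoch 33: 0.7368421… × 0.3097129… ≈ 0.2282095…). The training script itself is not shown, so the sketch below only recomputes the printed scores under that inferred definition; the function names are illustrative, not from the codebase.

```python
# Recompute the scores printed in the log's evaluation summaries.
# Inferred from the numbers themselves: f1 is the harmonic mean of
# precision and recall, and 'combined' = template_f1 * slot_f1.

def f1(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template, slot):
    """'combined' as printed in the log: product of the two F1 scores."""
    return f1(*template) * f1(*slot)

# Dev Chinese, epoch 33, (precision, recall) pairs from the summary above:
score = combined_score((1.0, 0.5833333333333334),
                       (0.29459852005870846, 0.3264621171049065))
# score matches the logged 'combined': 0.22820953147946377
```

This also explains why 'combined' is always below both component F1s: a weak slot score caps the overall result even when template extraction is near-perfect.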
34/463, loss: 0.005509230308234692 2023-01-22 18:40:44.391485: step: 36/463, loss: 0.006542433984577656 2023-01-22 18:40:45.020044: step: 38/463, loss: 0.03910353407263756 2023-01-22 18:40:45.684841: step: 40/463, loss: 0.004590011667460203 2023-01-22 18:40:46.284138: step: 42/463, loss: 0.0034196621272712946 2023-01-22 18:40:46.861054: step: 44/463, loss: 0.00030749267898499966 2023-01-22 18:40:47.463918: step: 46/463, loss: 0.0011456963839009404 2023-01-22 18:40:48.156477: step: 48/463, loss: 0.08443114906549454 2023-01-22 18:40:48.726212: step: 50/463, loss: 0.0373181588947773 2023-01-22 18:40:49.344274: step: 52/463, loss: 2.1122054022271186e-05 2023-01-22 18:40:49.974848: step: 54/463, loss: 0.010377667844295502 2023-01-22 18:40:50.658133: step: 56/463, loss: 0.26323121786117554 2023-01-22 18:40:51.230792: step: 58/463, loss: 0.0013677745591849089 2023-01-22 18:40:51.798307: step: 60/463, loss: 0.0023777950555086136 2023-01-22 18:40:52.427684: step: 62/463, loss: 0.03108135238289833 2023-01-22 18:40:53.029622: step: 64/463, loss: 0.0024119571316987276 2023-01-22 18:40:53.629250: step: 66/463, loss: 0.018009508028626442 2023-01-22 18:40:54.213787: step: 68/463, loss: 0.07018401473760605 2023-01-22 18:40:54.870166: step: 70/463, loss: 0.09363169223070145 2023-01-22 18:40:55.485575: step: 72/463, loss: 0.006280925124883652 2023-01-22 18:40:56.107235: step: 74/463, loss: 0.022263793274760246 2023-01-22 18:40:56.719139: step: 76/463, loss: 0.005102612543851137 2023-01-22 18:40:57.335338: step: 78/463, loss: 0.007046980317682028 2023-01-22 18:40:58.033885: step: 80/463, loss: 0.0009749687160365283 2023-01-22 18:40:58.655560: step: 82/463, loss: 0.025792596861720085 2023-01-22 18:40:59.355601: step: 84/463, loss: 0.06601664423942566 2023-01-22 18:40:59.943085: step: 86/463, loss: 0.0005563173326663673 2023-01-22 18:41:00.539460: step: 88/463, loss: 0.014414437115192413 2023-01-22 18:41:01.130663: step: 90/463, loss: 0.01725691743195057 2023-01-22 18:41:01.788263: 
step: 92/463, loss: 0.060248587280511856 2023-01-22 18:41:02.410660: step: 94/463, loss: 0.33322712779045105 2023-01-22 18:41:02.982724: step: 96/463, loss: 0.0005854417104274035 2023-01-22 18:41:03.551458: step: 98/463, loss: 0.00033376793726347387 2023-01-22 18:41:04.137922: step: 100/463, loss: 0.007701256312429905 2023-01-22 18:41:04.718262: step: 102/463, loss: 0.004094271454960108 2023-01-22 18:41:05.347949: step: 104/463, loss: 0.009576747193932533 2023-01-22 18:41:05.916234: step: 106/463, loss: 0.9199184775352478 2023-01-22 18:41:06.475235: step: 108/463, loss: 0.01941925846040249 2023-01-22 18:41:07.155173: step: 110/463, loss: 0.13289156556129456 2023-01-22 18:41:07.794491: step: 112/463, loss: 0.0006112268310971558 2023-01-22 18:41:08.389601: step: 114/463, loss: 0.0003047761565539986 2023-01-22 18:41:08.993968: step: 116/463, loss: 0.008087718859314919 2023-01-22 18:41:09.633448: step: 118/463, loss: 0.0034443242475390434 2023-01-22 18:41:10.261597: step: 120/463, loss: 0.012340379878878593 2023-01-22 18:41:10.857513: step: 122/463, loss: 0.010474558919668198 2023-01-22 18:41:11.491612: step: 124/463, loss: 0.17184706032276154 2023-01-22 18:41:12.152218: step: 126/463, loss: 0.12175922095775604 2023-01-22 18:41:12.797852: step: 128/463, loss: 0.0006550498073920608 2023-01-22 18:41:13.382590: step: 130/463, loss: 0.0010210784384980798 2023-01-22 18:41:14.017054: step: 132/463, loss: 0.012848429381847382 2023-01-22 18:41:14.661949: step: 134/463, loss: 0.02216471917927265 2023-01-22 18:41:15.269741: step: 136/463, loss: 0.0008165210601873696 2023-01-22 18:41:15.885339: step: 138/463, loss: 1.558795884193387e-05 2023-01-22 18:41:16.569032: step: 140/463, loss: 0.0005281181656755507 2023-01-22 18:41:17.150044: step: 142/463, loss: 7.1084723472595215 2023-01-22 18:41:17.761296: step: 144/463, loss: 0.0051185511983931065 2023-01-22 18:41:18.399316: step: 146/463, loss: 0.004348520655184984 2023-01-22 18:41:19.021959: step: 148/463, loss: 
0.0001945000549312681 2023-01-22 18:41:19.684469: step: 150/463, loss: 0.0011314823059365153 2023-01-22 18:41:20.269877: step: 152/463, loss: 0.03141910955309868 2023-01-22 18:41:20.935224: step: 154/463, loss: 0.000563225185032934 2023-01-22 18:41:21.517942: step: 156/463, loss: 0.18047134578227997 2023-01-22 18:41:22.132006: step: 158/463, loss: 0.014541915617883205 2023-01-22 18:41:22.728111: step: 160/463, loss: 0.03826597332954407 2023-01-22 18:41:23.373156: step: 162/463, loss: 0.0008390791481360793 2023-01-22 18:41:23.933255: step: 164/463, loss: 0.0045412806794047356 2023-01-22 18:41:24.603951: step: 166/463, loss: 0.004094784613698721 2023-01-22 18:41:25.184527: step: 168/463, loss: 0.001470040762796998 2023-01-22 18:41:25.882789: step: 170/463, loss: 0.013009748421609402 2023-01-22 18:41:26.498708: step: 172/463, loss: 0.02990247868001461 2023-01-22 18:41:27.089664: step: 174/463, loss: 0.0015182701172307134 2023-01-22 18:41:27.740588: step: 176/463, loss: 0.0008452034671790898 2023-01-22 18:41:28.356011: step: 178/463, loss: 0.0002400897501502186 2023-01-22 18:41:28.976774: step: 180/463, loss: 0.007220083381980658 2023-01-22 18:41:29.557211: step: 182/463, loss: 0.0011007088469341397 2023-01-22 18:41:30.162707: step: 184/463, loss: 0.041540469974279404 2023-01-22 18:41:30.790918: step: 186/463, loss: 0.013444289565086365 2023-01-22 18:41:31.406255: step: 188/463, loss: 0.017034368589520454 2023-01-22 18:41:32.049214: step: 190/463, loss: 0.0080609992146492 2023-01-22 18:41:32.673741: step: 192/463, loss: 0.03661505877971649 2023-01-22 18:41:33.304177: step: 194/463, loss: 0.0014141191495582461 2023-01-22 18:41:33.920835: step: 196/463, loss: 0.02654779888689518 2023-01-22 18:41:34.501711: step: 198/463, loss: 0.01949721947312355 2023-01-22 18:41:35.093211: step: 200/463, loss: 0.012242456898093224 2023-01-22 18:41:35.749483: step: 202/463, loss: 0.00010855083382921293 2023-01-22 18:41:36.370658: step: 204/463, loss: 0.00012783582496922463 2023-01-22 
18:41:36.950670: step: 206/463, loss: 0.00021289606229402125 2023-01-22 18:41:37.581134: step: 208/463, loss: 0.014509106986224651 2023-01-22 18:41:38.299862: step: 210/463, loss: 0.035522691905498505 2023-01-22 18:41:38.925406: step: 212/463, loss: 0.03508644551038742 2023-01-22 18:41:39.562712: step: 214/463, loss: 0.009764544665813446 2023-01-22 18:41:40.164955: step: 216/463, loss: 0.0011302358470857143 2023-01-22 18:41:40.803481: step: 218/463, loss: 0.021305255591869354 2023-01-22 18:41:41.406544: step: 220/463, loss: 0.009935895912349224 2023-01-22 18:41:42.069586: step: 222/463, loss: 0.0005852561444044113 2023-01-22 18:41:42.678915: step: 224/463, loss: 0.001610026927664876 2023-01-22 18:41:43.221493: step: 226/463, loss: 0.06624916940927505 2023-01-22 18:41:43.836646: step: 228/463, loss: 0.010288701392710209 2023-01-22 18:41:44.455963: step: 230/463, loss: 0.0011153160594403744 2023-01-22 18:41:45.122913: step: 232/463, loss: 0.0006676982156932354 2023-01-22 18:41:45.806271: step: 234/463, loss: 0.014171549119055271 2023-01-22 18:41:46.415464: step: 236/463, loss: 0.07930251210927963 2023-01-22 18:41:47.030012: step: 238/463, loss: 0.06612588465213776 2023-01-22 18:41:47.642009: step: 240/463, loss: 0.0153943607583642 2023-01-22 18:41:48.255269: step: 242/463, loss: 0.7877417206764221 2023-01-22 18:41:48.842990: step: 244/463, loss: 0.008827961049973965 2023-01-22 18:41:49.522785: step: 246/463, loss: 0.029533136636018753 2023-01-22 18:41:50.188080: step: 248/463, loss: 0.0012410464696586132 2023-01-22 18:41:50.766892: step: 250/463, loss: 0.010269280523061752 2023-01-22 18:41:51.383972: step: 252/463, loss: 0.01659373939037323 2023-01-22 18:41:52.076424: step: 254/463, loss: 0.009376885369420052 2023-01-22 18:41:52.735402: step: 256/463, loss: 0.000961662910412997 2023-01-22 18:41:53.334763: step: 258/463, loss: 0.003301321528851986 2023-01-22 18:41:53.928064: step: 260/463, loss: 0.00018343979900237173 2023-01-22 18:41:54.500374: step: 262/463, loss: 
0.04936336725950241 2023-01-22 18:41:55.105251: step: 264/463, loss: 0.051275111734867096 2023-01-22 18:41:55.809912: step: 266/463, loss: 4.4701726437779143e-05 2023-01-22 18:41:56.380629: step: 268/463, loss: 0.0067665548995137215 2023-01-22 18:41:57.075693: step: 270/463, loss: 0.04054063931107521 2023-01-22 18:41:57.697824: step: 272/463, loss: 0.0013388764346018434 2023-01-22 18:41:58.389764: step: 274/463, loss: 0.2578161358833313 2023-01-22 18:41:59.049484: step: 276/463, loss: 0.02838834747672081 2023-01-22 18:41:59.650775: step: 278/463, loss: 0.03394710645079613 2023-01-22 18:42:00.264895: step: 280/463, loss: 0.05228975787758827 2023-01-22 18:42:00.873618: step: 282/463, loss: 0.0407642237842083 2023-01-22 18:42:01.505327: step: 284/463, loss: 0.001727886963635683 2023-01-22 18:42:02.160850: step: 286/463, loss: 0.02139594405889511 2023-01-22 18:42:02.790882: step: 288/463, loss: 0.000808328331913799 2023-01-22 18:42:03.392927: step: 290/463, loss: 0.024710439145565033 2023-01-22 18:42:04.019936: step: 292/463, loss: 0.042870908975601196 2023-01-22 18:42:04.650604: step: 294/463, loss: 0.014400427229702473 2023-01-22 18:42:05.259997: step: 296/463, loss: 0.018344393000006676 2023-01-22 18:42:05.859414: step: 298/463, loss: 0.013718219473958015 2023-01-22 18:42:06.521391: step: 300/463, loss: 0.0039004050195217133 2023-01-22 18:42:07.383693: step: 302/463, loss: 0.011632346548140049 2023-01-22 18:42:08.066671: step: 304/463, loss: 0.003035646630451083 2023-01-22 18:42:08.704568: step: 306/463, loss: 0.3683563470840454 2023-01-22 18:42:09.284265: step: 308/463, loss: 0.02586384490132332 2023-01-22 18:42:09.856756: step: 310/463, loss: 1.9401670215302147e-05 2023-01-22 18:42:10.463440: step: 312/463, loss: 0.0001734842808218673 2023-01-22 18:42:11.066894: step: 314/463, loss: 0.00819866918027401 2023-01-22 18:42:11.650056: step: 316/463, loss: 0.004380682948976755 2023-01-22 18:42:12.285750: step: 318/463, loss: 0.01282327901571989 2023-01-22 
18:42:12.926503: step: 320/463, loss: 0.01597239077091217 2023-01-22 18:42:13.560408: step: 322/463, loss: 0.00034171133302152157 2023-01-22 18:42:14.133098: step: 324/463, loss: 0.0008982212166301906 2023-01-22 18:42:14.738397: step: 326/463, loss: 0.0017116991803050041 2023-01-22 18:42:15.294429: step: 328/463, loss: 7.243558502523229e-05 2023-01-22 18:42:15.880829: step: 330/463, loss: 0.00014489635941572487 2023-01-22 18:42:16.600904: step: 332/463, loss: 0.11085410416126251 2023-01-22 18:42:17.231746: step: 334/463, loss: 0.007016283925622702 2023-01-22 18:42:17.860844: step: 336/463, loss: 0.001887564198113978 2023-01-22 18:42:18.561207: step: 338/463, loss: 0.0006738511146977544 2023-01-22 18:42:19.277389: step: 340/463, loss: 0.0014242117758840322 2023-01-22 18:42:19.900900: step: 342/463, loss: 0.00582523038610816 2023-01-22 18:42:20.498075: step: 344/463, loss: 0.008165675215423107 2023-01-22 18:42:21.127948: step: 346/463, loss: 0.00024797653895802796 2023-01-22 18:42:21.748599: step: 348/463, loss: 0.06975727528333664 2023-01-22 18:42:22.357436: step: 350/463, loss: 0.0052853976376354694 2023-01-22 18:42:22.935168: step: 352/463, loss: 0.0010137715144082904 2023-01-22 18:42:23.571534: step: 354/463, loss: 0.38808494806289673 2023-01-22 18:42:24.148291: step: 356/463, loss: 0.012908794917166233 2023-01-22 18:42:24.728710: step: 358/463, loss: 0.04972630366683006 2023-01-22 18:42:25.408573: step: 360/463, loss: 0.0016720225103199482 2023-01-22 18:42:26.018548: step: 362/463, loss: 0.004246782045811415 2023-01-22 18:42:26.620229: step: 364/463, loss: 0.012881260365247726 2023-01-22 18:42:27.161983: step: 366/463, loss: 0.014761758968234062 2023-01-22 18:42:27.756856: step: 368/463, loss: 0.0001875192392617464 2023-01-22 18:42:28.359470: step: 370/463, loss: 0.0015146736986935139 2023-01-22 18:42:29.010442: step: 372/463, loss: 0.0013625032734125853 2023-01-22 18:42:29.590470: step: 374/463, loss: 1.6496245734742843e-05 2023-01-22 18:42:30.175876: step: 
376/463, loss: 0.006443125195801258 2023-01-22 18:42:30.785297: step: 378/463, loss: 0.010886967182159424 2023-01-22 18:42:31.444353: step: 380/463, loss: 0.006280792411416769 2023-01-22 18:42:32.084145: step: 382/463, loss: 0.0034953660797327757 2023-01-22 18:42:32.748326: step: 384/463, loss: 0.020828668028116226 2023-01-22 18:42:33.368307: step: 386/463, loss: 0.021822623908519745 2023-01-22 18:42:34.023226: step: 388/463, loss: 0.03424214944243431 2023-01-22 18:42:34.667182: step: 390/463, loss: 0.0014773915754631162 2023-01-22 18:42:35.369042: step: 392/463, loss: 0.010415085591375828 2023-01-22 18:42:35.983371: step: 394/463, loss: 0.0007750998483970761 2023-01-22 18:42:36.615239: step: 396/463, loss: 0.006946917157620192 2023-01-22 18:42:37.244969: step: 398/463, loss: 0.017306620255112648 2023-01-22 18:42:37.852505: step: 400/463, loss: 0.009726050309836864 2023-01-22 18:42:38.468828: step: 402/463, loss: 0.10022459924221039 2023-01-22 18:42:39.091571: step: 404/463, loss: 0.013463594019412994 2023-01-22 18:42:39.705326: step: 406/463, loss: 0.0027320042718201876 2023-01-22 18:42:40.358388: step: 408/463, loss: 0.003501604776829481 2023-01-22 18:42:40.941440: step: 410/463, loss: 0.001862604171037674 2023-01-22 18:42:41.534041: step: 412/463, loss: 0.001478648860938847 2023-01-22 18:42:42.163486: step: 414/463, loss: 0.05678674951195717 2023-01-22 18:42:42.720081: step: 416/463, loss: 0.0064149657264351845 2023-01-22 18:42:43.292988: step: 418/463, loss: 0.00016864931967575103 2023-01-22 18:42:43.978663: step: 420/463, loss: 0.0039683617651462555 2023-01-22 18:42:44.614839: step: 422/463, loss: 0.003331769024953246 2023-01-22 18:42:45.229510: step: 424/463, loss: 0.04636223986744881 2023-01-22 18:42:45.861104: step: 426/463, loss: 0.009499873034656048 2023-01-22 18:42:46.438234: step: 428/463, loss: 0.007162950001657009 2023-01-22 18:42:47.039678: step: 430/463, loss: 0.021719692274928093 2023-01-22 18:42:47.733884: step: 432/463, loss: 0.002167987870052457 
2023-01-22 18:42:48.440192: step: 434/463, loss: 0.034110505133867264 2023-01-22 18:42:49.112320: step: 436/463, loss: 0.0003223160747438669 2023-01-22 18:42:49.745700: step: 438/463, loss: 0.020973848178982735 2023-01-22 18:42:50.351476: step: 440/463, loss: 0.04148552566766739 2023-01-22 18:42:50.988152: step: 442/463, loss: 0.0037114627193659544 2023-01-22 18:42:51.642723: step: 444/463, loss: 0.03722742572426796 2023-01-22 18:42:52.265096: step: 446/463, loss: 0.01808503083884716 2023-01-22 18:42:52.863285: step: 448/463, loss: 0.0007337511051446199 2023-01-22 18:42:53.507143: step: 450/463, loss: 0.0035443773958832026 2023-01-22 18:42:54.175122: step: 452/463, loss: 3.548006134224124e-05 2023-01-22 18:42:54.810737: step: 454/463, loss: 0.05485925450921059 2023-01-22 18:42:55.470692: step: 456/463, loss: 0.012426719069480896 2023-01-22 18:42:56.111182: step: 458/463, loss: 0.0035880538634955883 2023-01-22 18:42:56.666534: step: 460/463, loss: 0.01954762265086174 2023-01-22 18:42:57.324990: step: 462/463, loss: 0.0007640445255674422 2023-01-22 18:42:57.966789: step: 464/463, loss: 0.0010571957100182772 2023-01-22 18:42:58.567088: step: 466/463, loss: 0.005422906018793583 2023-01-22 18:42:59.215924: step: 468/463, loss: 0.0005060746916569769 2023-01-22 18:42:59.899982: step: 470/463, loss: 0.02033419907093048 2023-01-22 18:43:00.565133: step: 472/463, loss: 0.16299371421337128 2023-01-22 18:43:01.265472: step: 474/463, loss: 0.0031123273074626923 2023-01-22 18:43:01.890595: step: 476/463, loss: 0.00011974084191024303 2023-01-22 18:43:02.496537: step: 478/463, loss: 0.07505516707897186 2023-01-22 18:43:03.116953: step: 480/463, loss: 0.002450087107717991 2023-01-22 18:43:03.792346: step: 482/463, loss: 0.0009411225328221917 2023-01-22 18:43:04.369993: step: 484/463, loss: 0.004816989880055189 2023-01-22 18:43:04.955638: step: 486/463, loss: 0.013165805488824844 2023-01-22 18:43:05.539148: step: 488/463, loss: 0.009172674268484116 2023-01-22 18:43:06.218111: step: 
490/463, loss: 0.008746613748371601 2023-01-22 18:43:06.854711: step: 492/463, loss: 0.017872389405965805 2023-01-22 18:43:07.487368: step: 494/463, loss: 0.006268732715398073 2023-01-22 18:43:08.148231: step: 496/463, loss: 0.07411642372608185 2023-01-22 18:43:08.744723: step: 498/463, loss: 3.645800825324841e-05 2023-01-22 18:43:09.357042: step: 500/463, loss: 0.0027852184139192104 2023-01-22 18:43:09.928722: step: 502/463, loss: 0.0026338198222219944 2023-01-22 18:43:10.544637: step: 504/463, loss: 0.08276905119419098 2023-01-22 18:43:11.142378: step: 506/463, loss: 0.00038033557939343154 2023-01-22 18:43:11.707654: step: 508/463, loss: 0.0006086411303840578 2023-01-22 18:43:12.318534: step: 510/463, loss: 0.05192597210407257 2023-01-22 18:43:12.946198: step: 512/463, loss: 0.012301702983677387 2023-01-22 18:43:13.641644: step: 514/463, loss: 0.021557848900556564 2023-01-22 18:43:14.207005: step: 516/463, loss: 0.00963529571890831 2023-01-22 18:43:14.881088: step: 518/463, loss: 0.06469772756099701 2023-01-22 18:43:15.474166: step: 520/463, loss: 0.01095014251768589 2023-01-22 18:43:16.141469: step: 522/463, loss: 0.004447646904736757 2023-01-22 18:43:16.743041: step: 524/463, loss: 0.029293827712535858 2023-01-22 18:43:17.322126: step: 526/463, loss: 0.018115544691681862 2023-01-22 18:43:17.919111: step: 528/463, loss: 0.0009236755431629717 2023-01-22 18:43:18.504665: step: 530/463, loss: 0.04176928475499153 2023-01-22 18:43:19.100033: step: 532/463, loss: 0.015158458612859249 2023-01-22 18:43:19.707970: step: 534/463, loss: 0.00034961619530804455 2023-01-22 18:43:20.353829: step: 536/463, loss: 0.029664194211363792 2023-01-22 18:43:20.979594: step: 538/463, loss: 0.1713618040084839 2023-01-22 18:43:21.582792: step: 540/463, loss: 0.003012137720361352 2023-01-22 18:43:22.169066: step: 542/463, loss: 0.0031789508648216724 2023-01-22 18:43:22.761853: step: 544/463, loss: 0.009241295047104359 2023-01-22 18:43:23.461913: step: 546/463, loss: 0.08362690359354019 
2023-01-22 18:43:24.108560: step: 548/463, loss: 0.000803778471890837 2023-01-22 18:43:24.731762: step: 550/463, loss: 0.15678732097148895 2023-01-22 18:43:25.450569: step: 552/463, loss: 0.028481030836701393 2023-01-22 18:43:26.058482: step: 554/463, loss: 0.0006153411231935024 2023-01-22 18:43:26.644518: step: 556/463, loss: 0.005816456396132708 2023-01-22 18:43:27.287281: step: 558/463, loss: 1.061047911643982 2023-01-22 18:43:27.910438: step: 560/463, loss: 0.025019798427820206 2023-01-22 18:43:28.523858: step: 562/463, loss: 0.002734766574576497 2023-01-22 18:43:29.137729: step: 564/463, loss: 0.002338422928005457 2023-01-22 18:43:29.923971: step: 566/463, loss: 0.0006360527477227151 2023-01-22 18:43:30.608909: step: 568/463, loss: 0.014671143144369125 2023-01-22 18:43:31.361084: step: 570/463, loss: 0.013965191319584846 2023-01-22 18:43:31.970993: step: 572/463, loss: 0.13115009665489197 2023-01-22 18:43:32.640105: step: 574/463, loss: 0.0021860164124518633 2023-01-22 18:43:33.320188: step: 576/463, loss: 0.01173482183367014 2023-01-22 18:43:33.988349: step: 578/463, loss: 0.26513081789016724 2023-01-22 18:43:34.618678: step: 580/463, loss: 0.0008760917698964477 2023-01-22 18:43:35.235070: step: 582/463, loss: 0.0026832432486116886 2023-01-22 18:43:35.873928: step: 584/463, loss: 0.03140880540013313 2023-01-22 18:43:36.437849: step: 586/463, loss: 0.007647199090570211 2023-01-22 18:43:37.092282: step: 588/463, loss: 0.027470635250210762 2023-01-22 18:43:37.685668: step: 590/463, loss: 0.00039448257302865386 2023-01-22 18:43:38.351828: step: 592/463, loss: 0.0016140195075422525 2023-01-22 18:43:38.931613: step: 594/463, loss: 0.9824089407920837 2023-01-22 18:43:39.635077: step: 596/463, loss: 0.05376916378736496 2023-01-22 18:43:40.183497: step: 598/463, loss: 0.0022429884411394596 2023-01-22 18:43:40.830528: step: 600/463, loss: 0.0035803017672151327 2023-01-22 18:43:41.455735: step: 602/463, loss: 0.034703269600868225 2023-01-22 18:43:42.062534: step: 
604/463, loss: 0.022047266364097595 2023-01-22 18:43:42.649690: step: 606/463, loss: 0.011108850128948689 2023-01-22 18:43:43.269205: step: 608/463, loss: 0.00030934251844882965 2023-01-22 18:43:43.949591: step: 610/463, loss: 0.008202211931347847 2023-01-22 18:43:44.537132: step: 612/463, loss: 0.0013449783436954021 2023-01-22 18:43:45.085413: step: 614/463, loss: 8.708340465091169e-05 2023-01-22 18:43:45.712304: step: 616/463, loss: 0.014783303253352642 2023-01-22 18:43:46.391378: step: 618/463, loss: 0.021595174446702003 2023-01-22 18:43:47.063841: step: 620/463, loss: 0.013105475343763828 2023-01-22 18:43:47.686319: step: 622/463, loss: 0.0016359209548681974 2023-01-22 18:43:48.300891: step: 624/463, loss: 0.000614314223639667 2023-01-22 18:43:48.995555: step: 626/463, loss: 0.05638539791107178 2023-01-22 18:43:49.607829: step: 628/463, loss: 0.0017719111638143659 2023-01-22 18:43:50.308802: step: 630/463, loss: 0.011052437126636505 2023-01-22 18:43:50.949022: step: 632/463, loss: 0.04180431365966797 2023-01-22 18:43:51.493184: step: 634/463, loss: 0.0013781070010736585 2023-01-22 18:43:52.127579: step: 636/463, loss: 0.0001847767853178084 2023-01-22 18:43:52.754211: step: 638/463, loss: 0.011167913675308228 2023-01-22 18:43:53.442830: step: 640/463, loss: 0.0013468522811308503 2023-01-22 18:43:54.115369: step: 642/463, loss: 0.0011806698748841882 2023-01-22 18:43:54.771544: step: 644/463, loss: 0.013269947841763496 2023-01-22 18:43:55.325959: step: 646/463, loss: 0.010470489971339703 2023-01-22 18:43:55.931521: step: 648/463, loss: 0.0054083894938230515 2023-01-22 18:43:56.529938: step: 650/463, loss: 6.3167494772642385e-06 2023-01-22 18:43:57.166453: step: 652/463, loss: 0.003893531858921051 2023-01-22 18:43:57.755917: step: 654/463, loss: 0.060266200453042984 2023-01-22 18:43:58.364889: step: 656/463, loss: 0.012322275899350643 2023-01-22 18:43:58.975471: step: 658/463, loss: 0.004375784657895565 2023-01-22 18:43:59.551899: step: 660/463, loss: 
0.005770581774413586 2023-01-22 18:44:00.163724: step: 662/463, loss: 0.0011859607184305787 2023-01-22 18:44:00.774561: step: 664/463, loss: 0.004354021046310663 2023-01-22 18:44:01.392183: step: 666/463, loss: 0.002952700015157461 2023-01-22 18:44:01.965543: step: 668/463, loss: 0.02704070322215557 2023-01-22 18:44:02.533273: step: 670/463, loss: 0.0009742437396198511 2023-01-22 18:44:03.165486: step: 672/463, loss: 0.005109363701194525 2023-01-22 18:44:03.774373: step: 674/463, loss: 0.014681716449558735 2023-01-22 18:44:04.415250: step: 676/463, loss: 0.0013712168438360095 2023-01-22 18:44:05.052513: step: 678/463, loss: 0.010155047290027142 2023-01-22 18:44:05.686243: step: 680/463, loss: 0.023913737386465073 2023-01-22 18:44:06.436191: step: 682/463, loss: 0.004724218510091305 2023-01-22 18:44:07.180804: step: 684/463, loss: 0.006154857575893402 2023-01-22 18:44:07.804404: step: 686/463, loss: 0.0022843251936137676 2023-01-22 18:44:08.402430: step: 688/463, loss: 0.000740609597414732 2023-01-22 18:44:09.008557: step: 690/463, loss: 0.009860338643193245 2023-01-22 18:44:09.577640: step: 692/463, loss: 0.0002434194611851126 2023-01-22 18:44:10.196886: step: 694/463, loss: 4.564369737636298e-05 2023-01-22 18:44:10.842640: step: 696/463, loss: 0.0003151975106447935 2023-01-22 18:44:11.492524: step: 698/463, loss: 0.02829935774207115 2023-01-22 18:44:12.122132: step: 700/463, loss: 0.007231098599731922 2023-01-22 18:44:12.711985: step: 702/463, loss: 0.0025222443509846926 2023-01-22 18:44:13.442408: step: 704/463, loss: 0.012938730418682098 2023-01-22 18:44:14.107379: step: 706/463, loss: 0.040966153144836426 2023-01-22 18:44:14.667013: step: 708/463, loss: 2.0632429368561134e-05 2023-01-22 18:44:15.265011: step: 710/463, loss: 0.6191602945327759 2023-01-22 18:44:15.866111: step: 712/463, loss: 0.001026917714625597 2023-01-22 18:44:16.549015: step: 714/463, loss: 0.019649211317300797 2023-01-22 18:44:17.144087: step: 716/463, loss: 0.010141539387404919 2023-01-22 
18:44:17.773536: step: 718/463, loss: 0.014880494214594364 2023-01-22 18:44:18.437736: step: 720/463, loss: 0.0006479179137386382 2023-01-22 18:44:19.056892: step: 722/463, loss: 0.057085420936346054 2023-01-22 18:44:19.614668: step: 724/463, loss: 0.0007350622909143567 2023-01-22 18:44:20.215562: step: 726/463, loss: 0.013320631347596645 2023-01-22 18:44:20.784218: step: 728/463, loss: 0.0005232381517998874 2023-01-22 18:44:21.390059: step: 730/463, loss: 0.12193936854600906 2023-01-22 18:44:22.010150: step: 732/463, loss: 0.02334590256214142 2023-01-22 18:44:22.648754: step: 734/463, loss: 0.008810954168438911 2023-01-22 18:44:23.310752: step: 736/463, loss: 0.24132297933101654 2023-01-22 18:44:23.923485: step: 738/463, loss: 0.03780048340559006 2023-01-22 18:44:24.543698: step: 740/463, loss: 0.03790288418531418 2023-01-22 18:44:25.113266: step: 742/463, loss: 0.002594243036583066 2023-01-22 18:44:25.745800: step: 744/463, loss: 0.00030925730243325233 2023-01-22 18:44:26.463217: step: 746/463, loss: 0.023370644077658653 2023-01-22 18:44:27.083660: step: 748/463, loss: 0.22173257172107697 2023-01-22 18:44:27.678088: step: 750/463, loss: 0.002258113818243146 2023-01-22 18:44:28.203464: step: 752/463, loss: 3.7230318412184715e-05 2023-01-22 18:44:28.767350: step: 754/463, loss: 0.013405629433691502 2023-01-22 18:44:29.444562: step: 756/463, loss: 0.0022201465908437967 2023-01-22 18:44:30.147872: step: 758/463, loss: 0.0034790337085723877 2023-01-22 18:44:30.766480: step: 760/463, loss: 0.009706716053187847 2023-01-22 18:44:31.403396: step: 762/463, loss: 0.11818651854991913 2023-01-22 18:44:32.028642: step: 764/463, loss: 0.010218911804258823 2023-01-22 18:44:32.666152: step: 766/463, loss: 0.010501544922590256 2023-01-22 18:44:33.269779: step: 768/463, loss: 0.00041544571286067367 2023-01-22 18:44:33.882374: step: 770/463, loss: 0.033106010407209396 2023-01-22 18:44:34.507734: step: 772/463, loss: 0.007863357663154602 2023-01-22 18:44:35.177469: step: 774/463, 
loss: 0.0019719076808542013 2023-01-22 18:44:35.886162: step: 776/463, loss: 0.003379812929779291 2023-01-22 18:44:36.570429: step: 778/463, loss: 5.034834885009332e-06 2023-01-22 18:44:37.161817: step: 780/463, loss: 0.009942869655787945 2023-01-22 18:44:37.926263: step: 782/463, loss: 0.006514287553727627 2023-01-22 18:44:38.572259: step: 784/463, loss: 0.0005471589975059032 2023-01-22 18:44:39.214259: step: 786/463, loss: 0.011018045246601105 2023-01-22 18:44:39.796584: step: 788/463, loss: 0.00176827737595886 2023-01-22 18:44:40.441682: step: 790/463, loss: 0.10855037719011307 2023-01-22 18:44:41.125961: step: 792/463, loss: 0.015687497332692146 2023-01-22 18:44:41.709253: step: 794/463, loss: 0.00018437393009662628 2023-01-22 18:44:42.371907: step: 796/463, loss: 0.31448793411254883 2023-01-22 18:44:43.039591: step: 798/463, loss: 0.019415542483329773 2023-01-22 18:44:43.688323: step: 800/463, loss: 0.00561416894197464 2023-01-22 18:44:44.288703: step: 802/463, loss: 0.02965330332517624 2023-01-22 18:44:44.881954: step: 804/463, loss: 0.04786636307835579 2023-01-22 18:44:45.483744: step: 806/463, loss: 0.00014612307131756097 2023-01-22 18:44:46.074976: step: 808/463, loss: 0.05881043151021004 2023-01-22 18:44:46.710391: step: 810/463, loss: 0.01071997545659542 2023-01-22 18:44:47.303158: step: 812/463, loss: 0.0009294534684158862 2023-01-22 18:44:47.952750: step: 814/463, loss: 0.0009063658071681857 2023-01-22 18:44:48.623806: step: 816/463, loss: 0.09378843754529953 2023-01-22 18:44:49.320548: step: 818/463, loss: 0.0042310538701713085 2023-01-22 18:44:49.943039: step: 820/463, loss: 0.014233632013201714 2023-01-22 18:44:50.526149: step: 822/463, loss: 0.002345799235627055 2023-01-22 18:44:51.095103: step: 824/463, loss: 0.0008994611562229693 2023-01-22 18:44:51.755103: step: 826/463, loss: 0.04471496865153313 2023-01-22 18:44:52.381823: step: 828/463, loss: 0.024863053113222122 2023-01-22 18:44:53.025540: step: 830/463, loss: 0.021019527688622475 2023-01-22 
18:44:53.654815: step: 832/463, loss: 0.00840185396373272 2023-01-22 18:44:54.346122: step: 834/463, loss: 0.012063741683959961 2023-01-22 18:44:54.951952: step: 836/463, loss: 0.006443624384701252 2023-01-22 18:44:55.567524: step: 838/463, loss: 0.21057826280593872 2023-01-22 18:44:56.137355: step: 840/463, loss: 0.004301637876778841 2023-01-22 18:44:56.740168: step: 842/463, loss: 0.0022611659951508045 2023-01-22 18:44:57.333426: step: 844/463, loss: 0.027574626728892326 2023-01-22 18:44:57.897164: step: 846/463, loss: 0.006673639640212059 2023-01-22 18:44:58.491787: step: 848/463, loss: 0.023669401183724403 2023-01-22 18:44:59.183524: step: 850/463, loss: 0.009553012438118458 2023-01-22 18:44:59.752832: step: 852/463, loss: 0.0008357904152944684 2023-01-22 18:45:00.362508: step: 854/463, loss: 0.03610701858997345 2023-01-22 18:45:00.938465: step: 856/463, loss: 0.01557481661438942 2023-01-22 18:45:01.544637: step: 858/463, loss: 0.0006194389425218105 2023-01-22 18:45:02.104844: step: 860/463, loss: 0.02167293056845665 2023-01-22 18:45:02.715022: step: 862/463, loss: 0.00599303375929594 2023-01-22 18:45:03.328508: step: 864/463, loss: 0.010027652606368065 2023-01-22 18:45:03.925819: step: 866/463, loss: 9.620825039746705e-06 2023-01-22 18:45:04.527870: step: 868/463, loss: 0.004486383404582739 2023-01-22 18:45:05.219002: step: 870/463, loss: 0.0011229589581489563 2023-01-22 18:45:05.791290: step: 872/463, loss: 0.005795732140541077 2023-01-22 18:45:06.423116: step: 874/463, loss: 0.027347121387720108 2023-01-22 18:45:07.049903: step: 876/463, loss: 0.03581884503364563 2023-01-22 18:45:07.663086: step: 878/463, loss: 0.030752096325159073 2023-01-22 18:45:08.276315: step: 880/463, loss: 0.15358710289001465 2023-01-22 18:45:08.861355: step: 882/463, loss: 0.0038287583738565445 2023-01-22 18:45:09.495241: step: 884/463, loss: 0.009879331104457378 2023-01-22 18:45:10.135523: step: 886/463, loss: 0.011102424003183842 2023-01-22 18:45:10.723576: step: 888/463, loss: 
0.006603811867535114 2023-01-22 18:45:11.264670: step: 890/463, loss: 0.005906204227358103 2023-01-22 18:45:11.883074: step: 892/463, loss: 8.769622218096629e-05 2023-01-22 18:45:12.517340: step: 894/463, loss: 0.0002431709144730121 2023-01-22 18:45:13.214253: step: 896/463, loss: 0.0006169751868583262 2023-01-22 18:45:13.852344: step: 898/463, loss: 6.237782508833334e-05 2023-01-22 18:45:14.528566: step: 900/463, loss: 0.01859668269753456 2023-01-22 18:45:15.150849: step: 902/463, loss: 0.0001987944997381419 2023-01-22 18:45:15.761937: step: 904/463, loss: 0.009199957363307476 2023-01-22 18:45:16.371747: step: 906/463, loss: 0.014221847988665104 2023-01-22 18:45:16.963549: step: 908/463, loss: 0.0005939154652878642 2023-01-22 18:45:17.535031: step: 910/463, loss: 0.0015050870133563876 2023-01-22 18:45:18.164034: step: 912/463, loss: 0.010839171707630157 2023-01-22 18:45:18.770630: step: 914/463, loss: 0.044668667018413544 2023-01-22 18:45:19.377628: step: 916/463, loss: 0.03507235646247864 2023-01-22 18:45:19.993383: step: 918/463, loss: 0.5352575778961182 2023-01-22 18:45:20.577782: step: 920/463, loss: 0.0021275102626532316 2023-01-22 18:45:21.219912: step: 922/463, loss: 0.0016215662471950054 2023-01-22 18:45:21.915822: step: 924/463, loss: 0.00468553276732564 2023-01-22 18:45:22.553921: step: 926/463, loss: 0.024761535227298737
==================================================
Loss: 0.049
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27846247453022416, 'r': 0.33341522092708054, 'f1': 0.30347119417715274}, 'combined': 0.2236103536042178, 'epoch': 34}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3495737848958868, 'r': 0.3276301820079148, 'f1': 0.33824646153292376}, 'combined': 0.23796233474678055, 'epoch': 34}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28687983814215345, 'r': 0.3315176877202494, 'f1': 0.3075877137826962}, 'combined': 0.22664357857672351, 'epoch': 34}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3512639634296152, 'r': 0.32429081440535157, 'f1': 0.33723890499866865}, 'combined': 0.23943962254905474, 'epoch': 34}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2979280374940674, 'r': 0.34032766332339387, 'f1': 0.3177195368847273}, 'combined': 0.23410913244137802, 'epoch': 34}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3662122419274383, 'r': 0.3169574949782457, 'f1': 0.3398092993914713}, 'combined': 0.24126460256794463, 'epoch': 34}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27044025157232704, 'r': 0.4095238095238095, 'f1': 0.32575757575757575}, 'combined': 0.21717171717171715, 'epoch': 34}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3125, 'r': 0.4891304347826087, 'f1': 0.38135593220338987}, 'combined': 0.19067796610169493, 'epoch': 34}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32142857142857145, 'r': 0.23275862068965517, 'f1': 0.26999999999999996}, 'combined': 0.17999999999999997, 'epoch': 34}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 35
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:48:02.608172: step: 2/463, loss: 0.009650697000324726 2023-01-22 18:48:03.230783: step: 4/463, loss: 0.011786172166466713 2023-01-22 18:48:03.832866: step: 6/463, loss: 0.0002879654639400542 2023-01-22 18:48:04.473849: step: 
8/463, loss: 0.000464693148387596 2023-01-22 18:48:05.068031: step: 10/463, loss: 0.0024587048683315516 2023-01-22 18:48:05.631708: step: 12/463, loss: 0.0009022123995237052 2023-01-22 18:48:06.241754: step: 14/463, loss: 0.0040870062075555325 2023-01-22 18:48:06.801293: step: 16/463, loss: 0.018193447962403297 2023-01-22 18:48:07.379845: step: 18/463, loss: 0.0007111936574801803 2023-01-22 18:48:08.004653: step: 20/463, loss: 0.06329775601625443 2023-01-22 18:48:08.635376: step: 22/463, loss: 0.0019312745425850153 2023-01-22 18:48:09.279165: step: 24/463, loss: 0.0011293272254988551 2023-01-22 18:48:09.874246: step: 26/463, loss: 0.0015840608393773437 2023-01-22 18:48:10.551621: step: 28/463, loss: 0.007588351145386696 2023-01-22 18:48:11.168921: step: 30/463, loss: 0.00665410328656435 2023-01-22 18:48:11.815264: step: 32/463, loss: 9.522154141450301e-05 2023-01-22 18:48:12.389700: step: 34/463, loss: 0.029504846781492233 2023-01-22 18:48:12.932837: step: 36/463, loss: 0.008196620270609856 2023-01-22 18:48:13.545841: step: 38/463, loss: 0.005462096072733402 2023-01-22 18:48:14.154788: step: 40/463, loss: 0.001002611592411995 2023-01-22 18:48:14.752590: step: 42/463, loss: 0.010422318242490292 2023-01-22 18:48:15.368110: step: 44/463, loss: 0.0007528648129664361 2023-01-22 18:48:16.030664: step: 46/463, loss: 0.0015187760582193732 2023-01-22 18:48:16.685451: step: 48/463, loss: 0.0021412763744592667 2023-01-22 18:48:17.365397: step: 50/463, loss: 0.0019520525820553303 2023-01-22 18:48:18.079224: step: 52/463, loss: 0.08593850582838058 2023-01-22 18:48:18.696582: step: 54/463, loss: 0.010441022925078869 2023-01-22 18:48:19.284107: step: 56/463, loss: 0.0031402993481606245 2023-01-22 18:48:19.908943: step: 58/463, loss: 0.0012851492501795292 2023-01-22 18:48:20.547791: step: 60/463, loss: 0.0036043250001966953 2023-01-22 18:48:21.234669: step: 62/463, loss: 0.02823488786816597 2023-01-22 18:48:21.915476: step: 64/463, loss: 0.008473533205688 2023-01-22 
18:48:22.510851: step: 66/463, loss: 3.976836524088867e-05 2023-01-22 18:48:23.122141: step: 68/463, loss: 0.0008379635983146727 2023-01-22 18:48:23.761943: step: 70/463, loss: 0.02820459194481373 2023-01-22 18:48:24.356880: step: 72/463, loss: 0.035985805094242096 2023-01-22 18:48:24.930919: step: 74/463, loss: 0.0004739728756248951 2023-01-22 18:48:25.571427: step: 76/463, loss: 9.638145274948329e-05 2023-01-22 18:48:26.203241: step: 78/463, loss: 0.0020508503075689077 2023-01-22 18:48:26.790591: step: 80/463, loss: 0.005530265625566244 2023-01-22 18:48:27.488721: step: 82/463, loss: 0.0004159810487180948 2023-01-22 18:48:28.077525: step: 84/463, loss: 0.0013184337876737118 2023-01-22 18:48:28.683264: step: 86/463, loss: 0.00038613390643149614 2023-01-22 18:48:29.304525: step: 88/463, loss: 0.006877379026263952 2023-01-22 18:48:29.965661: step: 90/463, loss: 0.7983630299568176 2023-01-22 18:48:30.565449: step: 92/463, loss: 0.0022911932319402695 2023-01-22 18:48:31.175495: step: 94/463, loss: 0.020264288410544395 2023-01-22 18:48:31.764844: step: 96/463, loss: 0.0008965901215560734 2023-01-22 18:48:32.377668: step: 98/463, loss: 0.00029586852178908885 2023-01-22 18:48:33.039969: step: 100/463, loss: 0.049667470157146454 2023-01-22 18:48:33.703341: step: 102/463, loss: 4.8003188567236066e-05 2023-01-22 18:48:34.395311: step: 104/463, loss: 0.034216154366731644 2023-01-22 18:48:35.014974: step: 106/463, loss: 0.0013502194778993726 2023-01-22 18:48:35.691726: step: 108/463, loss: 0.00010287695477018133 2023-01-22 18:48:36.398004: step: 110/463, loss: 0.034096281975507736 2023-01-22 18:48:37.087362: step: 112/463, loss: 0.005010094027966261 2023-01-22 18:48:37.655047: step: 114/463, loss: 0.00025026770890690386 2023-01-22 18:48:38.330475: step: 116/463, loss: 0.007432885468006134 2023-01-22 18:48:38.922216: step: 118/463, loss: 0.008711685426533222 2023-01-22 18:48:39.517913: step: 120/463, loss: 0.005241652950644493 2023-01-22 18:48:40.152146: step: 122/463, loss: 
0.0001902843068819493 2023-01-22 18:48:40.751453: step: 124/463, loss: 0.00038083791150711477 2023-01-22 18:48:41.347193: step: 126/463, loss: 0.03736787661910057 2023-01-22 18:48:41.949961: step: 128/463, loss: 0.00027251310530118644 2023-01-22 18:48:42.618342: step: 130/463, loss: 0.007968614809215069 2023-01-22 18:48:43.261155: step: 132/463, loss: 0.0016482468927279115 2023-01-22 18:48:43.862288: step: 134/463, loss: 0.026196682825684547 2023-01-22 18:48:44.443283: step: 136/463, loss: 0.0005368880811147392 2023-01-22 18:48:45.094254: step: 138/463, loss: 0.002911505987867713 2023-01-22 18:48:45.771444: step: 140/463, loss: 0.006802616640925407 2023-01-22 18:48:46.437840: step: 142/463, loss: 0.19156968593597412 2023-01-22 18:48:46.989423: step: 144/463, loss: 0.0029986929148435593 2023-01-22 18:48:47.576550: step: 146/463, loss: 0.0013309495989233255 2023-01-22 18:48:48.198336: step: 148/463, loss: 0.0008818070637062192 2023-01-22 18:48:48.791747: step: 150/463, loss: 0.006695579271763563 2023-01-22 18:48:49.429123: step: 152/463, loss: 0.000710685271769762 2023-01-22 18:48:50.040777: step: 154/463, loss: 0.0012030262732878327 2023-01-22 18:48:50.723293: step: 156/463, loss: 0.006528215482831001 2023-01-22 18:48:51.357954: step: 158/463, loss: 0.003699961584061384 2023-01-22 18:48:51.993519: step: 160/463, loss: 0.002487849211320281 2023-01-22 18:48:52.648614: step: 162/463, loss: 0.00034482928458601236 2023-01-22 18:48:53.245758: step: 164/463, loss: 0.015215154737234116 2023-01-22 18:48:53.828251: step: 166/463, loss: 5.501805208041333e-05 2023-01-22 18:48:54.379395: step: 168/463, loss: 0.00016288267215713859 2023-01-22 18:48:54.970510: step: 170/463, loss: 2.893218697863631e-05 2023-01-22 18:48:55.626217: step: 172/463, loss: 0.007109466008841991 2023-01-22 18:48:56.182127: step: 174/463, loss: 0.008348418399691582 2023-01-22 18:48:56.820958: step: 176/463, loss: 0.0003959539462812245 2023-01-22 18:48:57.378502: step: 178/463, loss: 0.003911568783223629 
2023-01-22 18:48:58.098464: step: 180/463, loss: 0.07109412550926208 2023-01-22 18:48:58.756258: step: 182/463, loss: 0.005627064500004053 2023-01-22 18:48:59.391809: step: 184/463, loss: 0.005449837073683739 2023-01-22 18:49:00.000783: step: 186/463, loss: 0.0041159032844007015 2023-01-22 18:49:00.605050: step: 188/463, loss: 0.6390036940574646 2023-01-22 18:49:01.322035: step: 190/463, loss: 0.003781113075092435 2023-01-22 18:49:01.893267: step: 192/463, loss: 0.0011266513029113412 2023-01-22 18:49:02.481552: step: 194/463, loss: 0.00811034720391035 2023-01-22 18:49:03.127133: step: 196/463, loss: 0.0006623000372201204 2023-01-22 18:49:03.742484: step: 198/463, loss: 0.001529157510958612 2023-01-22 18:49:04.318636: step: 200/463, loss: 0.0033484171144664288 2023-01-22 18:49:04.923098: step: 202/463, loss: 0.06631159782409668 2023-01-22 18:49:05.533482: step: 204/463, loss: 0.0006336997030302882 2023-01-22 18:49:06.128431: step: 206/463, loss: 0.004517833702266216 2023-01-22 18:49:06.712239: step: 208/463, loss: 6.507584095001221 2023-01-22 18:49:07.352918: step: 210/463, loss: 0.018587056547403336 2023-01-22 18:49:07.962070: step: 212/463, loss: 0.0001679467677604407 2023-01-22 18:49:08.598294: step: 214/463, loss: 0.0003122456546407193 2023-01-22 18:49:09.236750: step: 216/463, loss: 0.014828535728156567 2023-01-22 18:49:09.909637: step: 218/463, loss: 0.00039677516906522214 2023-01-22 18:49:10.555424: step: 220/463, loss: 0.00014470804308075458 2023-01-22 18:49:11.170180: step: 222/463, loss: 0.0010515274479985237 2023-01-22 18:49:11.798412: step: 224/463, loss: 0.002367601729929447 2023-01-22 18:49:12.427289: step: 226/463, loss: 0.07630432397127151 2023-01-22 18:49:12.991086: step: 228/463, loss: 0.005174749530851841 2023-01-22 18:49:13.533771: step: 230/463, loss: 0.02038835734128952 2023-01-22 18:49:14.141771: step: 232/463, loss: 0.0009652617736719549 2023-01-22 18:49:14.774419: step: 234/463, loss: 0.0016483080107718706 2023-01-22 18:49:15.382791: step: 
236/463, loss: 0.014348719269037247 2023-01-22 18:49:15.955654: step: 238/463, loss: 8.999479177873582e-05 2023-01-22 18:49:16.542456: step: 240/463, loss: 0.001197825069539249 2023-01-22 18:49:17.116786: step: 242/463, loss: 0.0005172394448891282 2023-01-22 18:49:17.738230: step: 244/463, loss: 1.667422111495398e-05 2023-01-22 18:49:18.387229: step: 246/463, loss: 0.008807245641946793 2023-01-22 18:49:19.007081: step: 248/463, loss: 0.019100964069366455 2023-01-22 18:49:19.647837: step: 250/463, loss: 0.012610741890966892 2023-01-22 18:49:20.319220: step: 252/463, loss: 0.008680294267833233 2023-01-22 18:49:21.019415: step: 254/463, loss: 0.06546274572610855 2023-01-22 18:49:21.619981: step: 256/463, loss: 0.0006040350417606533 2023-01-22 18:49:22.160256: step: 258/463, loss: 0.027568619698286057 2023-01-22 18:49:22.780002: step: 260/463, loss: 0.0053232163190841675 2023-01-22 18:49:23.444749: step: 262/463, loss: 0.0008328939438797534 2023-01-22 18:49:24.072252: step: 264/463, loss: 0.004068667069077492 2023-01-22 18:49:24.756302: step: 266/463, loss: 0.009993740357458591 2023-01-22 18:49:25.380387: step: 268/463, loss: 0.0004189243772998452 2023-01-22 18:49:25.983509: step: 270/463, loss: 0.001134916441515088 2023-01-22 18:49:26.611144: step: 272/463, loss: 0.029912590980529785 2023-01-22 18:49:27.291124: step: 274/463, loss: 0.0018716041231527925 2023-01-22 18:49:27.952585: step: 276/463, loss: 0.07098497450351715 2023-01-22 18:49:28.542057: step: 278/463, loss: 0.009800136089324951 2023-01-22 18:49:29.139104: step: 280/463, loss: 0.0007468195399269462 2023-01-22 18:49:29.756724: step: 282/463, loss: 0.1672784388065338 2023-01-22 18:49:30.340657: step: 284/463, loss: 0.0019075084710493684 2023-01-22 18:49:30.953625: step: 286/463, loss: 0.01114833727478981 2023-01-22 18:49:31.590285: step: 288/463, loss: 0.004412234760820866 2023-01-22 18:49:32.247469: step: 290/463, loss: 0.020172690972685814 2023-01-22 18:49:32.828819: step: 292/463, loss: 
0.0008893111953511834 2023-01-22 18:49:33.419795: step: 294/463, loss: 0.020668407902121544 2023-01-22 18:49:33.990320: step: 296/463, loss: 0.00015217819600366056 2023-01-22 18:49:34.564544: step: 298/463, loss: 0.1386207640171051 2023-01-22 18:49:35.210985: step: 300/463, loss: 0.006524915341287851 2023-01-22 18:49:35.850785: step: 302/463, loss: 0.0006003522430546582 2023-01-22 18:49:36.496509: step: 304/463, loss: 0.0022898679599165916 2023-01-22 18:49:37.123931: step: 306/463, loss: 0.027583589777350426 2023-01-22 18:49:37.745075: step: 308/463, loss: 0.0027378464583307505 2023-01-22 18:49:38.381259: step: 310/463, loss: 0.001503090257756412 2023-01-22 18:49:38.954068: step: 312/463, loss: 0.0033252981957048178 2023-01-22 18:49:39.766027: step: 314/463, loss: 0.08234601467847824 2023-01-22 18:49:40.327702: step: 316/463, loss: 0.0004886414972133934 2023-01-22 18:49:40.935182: step: 318/463, loss: 0.0008888702723197639 2023-01-22 18:49:41.501454: step: 320/463, loss: 0.0027260619681328535 2023-01-22 18:49:42.118702: step: 322/463, loss: 0.0011248054215684533 2023-01-22 18:49:42.688457: step: 324/463, loss: 0.002195649081841111 2023-01-22 18:49:43.287978: step: 326/463, loss: 0.000714053981937468 2023-01-22 18:49:43.873501: step: 328/463, loss: 0.014387431554496288 2023-01-22 18:49:44.574085: step: 330/463, loss: 0.05212874710559845 2023-01-22 18:49:45.180263: step: 332/463, loss: 0.0006725656567141414 2023-01-22 18:49:45.765225: step: 334/463, loss: 0.008794141933321953 2023-01-22 18:49:46.328889: step: 336/463, loss: 0.013474443927407265 2023-01-22 18:49:46.901551: step: 338/463, loss: 0.003402983769774437 2023-01-22 18:49:47.487950: step: 340/463, loss: 0.00016218061500694603 2023-01-22 18:49:48.073200: step: 342/463, loss: 0.0023117836099117994 2023-01-22 18:49:48.682174: step: 344/463, loss: 0.002105920109897852 2023-01-22 18:49:49.299079: step: 346/463, loss: 0.0030313667375594378 2023-01-22 18:49:49.947891: step: 348/463, loss: 0.0006608630064874887 
2023-01-22 18:49:50.597781: step: 350/463, loss: 9.228332055499777e-05 2023-01-22 18:49:51.283057: step: 352/463, loss: 0.012822591699659824 2023-01-22 18:49:51.892004: step: 354/463, loss: 0.00246209604665637 2023-01-22 18:49:52.468217: step: 356/463, loss: 3.0101489755907096e-05 2023-01-22 18:49:53.114261: step: 358/463, loss: 0.014162423089146614 2023-01-22 18:49:53.771802: step: 360/463, loss: 0.038227446377277374 2023-01-22 18:49:54.445620: step: 362/463, loss: 0.007529544644057751 2023-01-22 18:49:55.038921: step: 364/463, loss: 0.0015825848095119 2023-01-22 18:49:55.669693: step: 366/463, loss: 4.445153899723664e-05 2023-01-22 18:49:56.337389: step: 368/463, loss: 0.0002774097374640405 2023-01-22 18:49:56.997818: step: 370/463, loss: 0.0025055406149476767 2023-01-22 18:49:57.588221: step: 372/463, loss: 0.03678932413458824 2023-01-22 18:49:58.241481: step: 374/463, loss: 7.41112744435668e-05 2023-01-22 18:49:58.885279: step: 376/463, loss: 0.0014649521326646209 2023-01-22 18:49:59.525820: step: 378/463, loss: 0.020510289818048477 2023-01-22 18:50:00.233048: step: 380/463, loss: 0.07849282026290894 2023-01-22 18:50:00.839604: step: 382/463, loss: 0.00020519566896837205 2023-01-22 18:50:01.484600: step: 384/463, loss: 0.0058251735754311085 2023-01-22 18:50:02.124467: step: 386/463, loss: 0.033200886100530624 2023-01-22 18:50:02.677039: step: 388/463, loss: 0.08934615552425385 2023-01-22 18:50:03.341116: step: 390/463, loss: 0.030569639056921005 2023-01-22 18:50:03.928815: step: 392/463, loss: 1.0173883438110352 2023-01-22 18:50:04.544568: step: 394/463, loss: 0.004208071157336235 2023-01-22 18:50:05.131139: step: 396/463, loss: 0.11540179699659348 2023-01-22 18:50:05.747159: step: 398/463, loss: 0.005547211971133947 2023-01-22 18:50:06.394644: step: 400/463, loss: 0.026464618742465973 2023-01-22 18:50:07.048412: step: 402/463, loss: 0.048501625657081604 2023-01-22 18:50:07.687263: step: 404/463, loss: 0.09062911570072174 2023-01-22 18:50:08.230903: step: 
406/463, loss: 0.0158686563372612 2023-01-22 18:50:08.785223: step: 408/463, loss: 8.738929318496957e-05 2023-01-22 18:50:09.413880: step: 410/463, loss: 0.05587149038910866 2023-01-22 18:50:10.080968: step: 412/463, loss: 0.004976948723196983 2023-01-22 18:50:10.671300: step: 414/463, loss: 0.024331748485565186 2023-01-22 18:50:11.288383: step: 416/463, loss: 0.012122121639549732 2023-01-22 18:50:11.963168: step: 418/463, loss: 0.02252669259905815 2023-01-22 18:50:12.619060: step: 420/463, loss: 0.09368427842855453 2023-01-22 18:50:13.280304: step: 422/463, loss: 0.00019348938076291233 2023-01-22 18:50:13.956242: step: 424/463, loss: 0.010643952526152134 2023-01-22 18:50:14.517839: step: 426/463, loss: 0.008839860558509827 2023-01-22 18:50:15.141986: step: 428/463, loss: 0.03608330339193344 2023-01-22 18:50:15.802146: step: 430/463, loss: 1.104791283607483 2023-01-22 18:50:16.468899: step: 432/463, loss: 0.00012076389975845814 2023-01-22 18:50:17.106830: step: 434/463, loss: 0.01150478795170784 2023-01-22 18:50:17.722613: step: 436/463, loss: 0.010172829031944275 2023-01-22 18:50:18.337999: step: 438/463, loss: 0.005157343111932278 2023-01-22 18:50:18.981320: step: 440/463, loss: 0.017767997458577156 2023-01-22 18:50:19.603414: step: 442/463, loss: 0.020208347588777542 2023-01-22 18:50:20.291320: step: 444/463, loss: 0.012978977523744106 2023-01-22 18:50:20.984267: step: 446/463, loss: 0.0028525032103061676 2023-01-22 18:50:21.622826: step: 448/463, loss: 0.09910660237073898 2023-01-22 18:50:22.260585: step: 450/463, loss: 0.05011899769306183 2023-01-22 18:50:22.883702: step: 452/463, loss: 0.00022203246771823615 2023-01-22 18:50:23.545971: step: 454/463, loss: 0.000666602049022913 2023-01-22 18:50:24.168709: step: 456/463, loss: 0.07130929082632065 2023-01-22 18:50:24.799299: step: 458/463, loss: 0.18852639198303223 2023-01-22 18:50:25.424743: step: 460/463, loss: 0.0034951341804116964 2023-01-22 18:50:26.064652: step: 462/463, loss: 0.0005693099228665233 
2023-01-22 18:50:26.762088: step: 464/463, loss: 0.2907623052597046 2023-01-22 18:50:27.292613: step: 466/463, loss: 0.00031704321736469865 2023-01-22 18:50:27.921158: step: 468/463, loss: 0.0005440630484372377 2023-01-22 18:50:28.617208: step: 470/463, loss: 0.00012365682050585747 2023-01-22 18:50:29.208997: step: 472/463, loss: 0.0015922797610983253 2023-01-22 18:50:29.819807: step: 474/463, loss: 0.008679372258484364 2023-01-22 18:50:30.563425: step: 476/463, loss: 0.002265596529468894 2023-01-22 18:50:31.123197: step: 478/463, loss: 0.01140524446964264 2023-01-22 18:50:31.738663: step: 480/463, loss: 0.03552233427762985 2023-01-22 18:50:32.367591: step: 482/463, loss: 0.003224781947210431 2023-01-22 18:50:33.069989: step: 484/463, loss: 0.007472130935639143 2023-01-22 18:50:33.632362: step: 486/463, loss: 0.00015518437430728227 2023-01-22 18:50:34.281166: step: 488/463, loss: 0.007762855850160122 2023-01-22 18:50:34.885814: step: 490/463, loss: 0.047219034284353256 2023-01-22 18:50:35.478126: step: 492/463, loss: 0.004519898444414139 2023-01-22 18:50:36.096006: step: 494/463, loss: 0.056851357221603394 2023-01-22 18:50:36.673996: step: 496/463, loss: 0.007137979380786419 2023-01-22 18:50:37.323363: step: 498/463, loss: 0.012937169522047043 2023-01-22 18:50:37.916781: step: 500/463, loss: 0.0019125521648675203 2023-01-22 18:50:38.573574: step: 502/463, loss: 0.004871410317718983 2023-01-22 18:50:39.214407: step: 504/463, loss: 0.0019807661883533 2023-01-22 18:50:39.851066: step: 506/463, loss: 0.1702852100133896 2023-01-22 18:50:40.481427: step: 508/463, loss: 0.2537592649459839 2023-01-22 18:50:41.130225: step: 510/463, loss: 0.007020138669759035 2023-01-22 18:50:41.732555: step: 512/463, loss: 0.009446575306355953 2023-01-22 18:50:42.317967: step: 514/463, loss: 0.003063920186832547 2023-01-22 18:50:42.968821: step: 516/463, loss: 0.022446447983384132 2023-01-22 18:50:43.685174: step: 518/463, loss: 0.003525157691910863 2023-01-22 18:50:44.288225: step: 
520/463, loss: 0.0021053035743534565 2023-01-22 18:50:44.904667: step: 522/463, loss: 0.03281977400183678 2023-01-22 18:50:45.545739: step: 524/463, loss: 0.027482863515615463 2023-01-22 18:50:46.194194: step: 526/463, loss: 0.0020643046591430902 2023-01-22 18:50:46.820511: step: 528/463, loss: 0.009132737293839455 2023-01-22 18:50:47.474139: step: 530/463, loss: 0.0007322818273678422 2023-01-22 18:50:48.038022: step: 532/463, loss: 0.0051918188109993935 2023-01-22 18:50:48.611227: step: 534/463, loss: 0.012080921791493893 2023-01-22 18:50:49.198043: step: 536/463, loss: 0.050436120480298996 2023-01-22 18:50:49.900218: step: 538/463, loss: 0.07078374922275543 2023-01-22 18:50:50.495062: step: 540/463, loss: 0.017102325335144997 2023-01-22 18:50:51.044822: step: 542/463, loss: 0.0016749731730669737 2023-01-22 18:50:51.659660: step: 544/463, loss: 0.0001819965400500223 2023-01-22 18:50:52.290336: step: 546/463, loss: 0.6325834393501282 2023-01-22 18:50:52.990927: step: 548/463, loss: 0.024886522442102432 2023-01-22 18:50:53.617749: step: 550/463, loss: 0.02613760158419609 2023-01-22 18:50:54.263673: step: 552/463, loss: 0.039630673825740814 2023-01-22 18:50:54.880577: step: 554/463, loss: 0.02165718376636505 2023-01-22 18:50:55.493695: step: 556/463, loss: 0.0030502143781632185 2023-01-22 18:50:56.111901: step: 558/463, loss: 0.0004469918494578451 2023-01-22 18:50:56.704531: step: 560/463, loss: 0.08587326109409332 2023-01-22 18:50:57.386208: step: 562/463, loss: 5.5723161494825035e-05 2023-01-22 18:50:57.961608: step: 564/463, loss: 0.006172677036374807 2023-01-22 18:50:58.566794: step: 566/463, loss: 0.08456075191497803 2023-01-22 18:50:59.159982: step: 568/463, loss: 0.0072333477437496185 2023-01-22 18:50:59.836126: step: 570/463, loss: 0.028079887852072716 2023-01-22 18:51:00.381707: step: 572/463, loss: 0.0029028465505689383 2023-01-22 18:51:00.987569: step: 574/463, loss: 0.8768166899681091 2023-01-22 18:51:01.641341: step: 576/463, loss: 0.000327515386743471 
2023-01-22 18:51:02.292048: step: 578/463, loss: 0.005033503752201796 2023-01-22 18:51:02.900527: step: 580/463, loss: 0.012327686883509159 2023-01-22 18:51:03.488940: step: 582/463, loss: 0.007722549606114626 2023-01-22 18:51:04.088845: step: 584/463, loss: 0.03548095375299454 2023-01-22 18:51:04.719642: step: 586/463, loss: 0.05224282294511795 2023-01-22 18:51:05.374564: step: 588/463, loss: 0.013207907788455486 2023-01-22 18:51:05.980591: step: 590/463, loss: 0.001920191221870482 2023-01-22 18:51:06.592716: step: 592/463, loss: 0.002828942146152258 2023-01-22 18:51:07.208220: step: 594/463, loss: 0.2377557009458542 2023-01-22 18:51:07.831744: step: 596/463, loss: 0.009992492385208607 2023-01-22 18:51:08.518107: step: 598/463, loss: 0.014815385453402996 2023-01-22 18:51:09.126295: step: 600/463, loss: 6.090897659305483e-05 2023-01-22 18:51:09.750654: step: 602/463, loss: 0.00013207882875576615 2023-01-22 18:51:10.404165: step: 604/463, loss: 0.012882939539849758 2023-01-22 18:51:11.063026: step: 606/463, loss: 0.002112177200615406 2023-01-22 18:51:11.682278: step: 608/463, loss: 0.021544061601161957 2023-01-22 18:51:12.298079: step: 610/463, loss: 0.0233796127140522 2023-01-22 18:51:12.901512: step: 612/463, loss: 0.0005436899373307824 2023-01-22 18:51:13.471324: step: 614/463, loss: 0.005196926183998585 2023-01-22 18:51:14.049989: step: 616/463, loss: 0.03044118359684944 2023-01-22 18:51:14.653138: step: 618/463, loss: 0.00037238546065054834 2023-01-22 18:51:15.265410: step: 620/463, loss: 0.019778165966272354 2023-01-22 18:51:15.880196: step: 622/463, loss: 0.0011244122870266438 2023-01-22 18:51:16.519552: step: 624/463, loss: 0.002646859735250473 2023-01-22 18:51:17.178863: step: 626/463, loss: 0.00043994744191877544 2023-01-22 18:51:17.768242: step: 628/463, loss: 0.0029270644299685955 2023-01-22 18:51:18.458888: step: 630/463, loss: 0.007826575078070164 2023-01-22 18:51:19.039183: step: 632/463, loss: 0.15459802746772766 2023-01-22 18:51:19.774023: step: 
634/463, loss: 0.01781396195292473 2023-01-22 18:51:20.417226: step: 636/463, loss: 0.033318206667900085 2023-01-22 18:51:21.113629: step: 638/463, loss: 0.0011194914113730192 2023-01-22 18:51:21.688053: step: 640/463, loss: 0.001153525779955089 2023-01-22 18:51:22.282183: step: 642/463, loss: 0.000616598641499877 2023-01-22 18:51:22.863005: step: 644/463, loss: 0.01366270612925291 2023-01-22 18:51:23.508297: step: 646/463, loss: 0.022628067061305046 2023-01-22 18:51:24.128034: step: 648/463, loss: 0.0162322036921978 2023-01-22 18:51:24.747342: step: 650/463, loss: 8.266350778285414e-05 2023-01-22 18:51:25.428929: step: 652/463, loss: 0.0735631063580513 2023-01-22 18:51:26.046884: step: 654/463, loss: 0.0017314935103058815 2023-01-22 18:51:26.677520: step: 656/463, loss: 0.0011976715177297592 2023-01-22 18:51:27.298078: step: 658/463, loss: 0.0059079499915242195 2023-01-22 18:51:27.933253: step: 660/463, loss: 0.0004118305805604905 2023-01-22 18:51:28.532102: step: 662/463, loss: 0.0012248677667230368 2023-01-22 18:51:29.164711: step: 664/463, loss: 0.0076681626960635185 2023-01-22 18:51:29.772195: step: 666/463, loss: 0.0035638920962810516 2023-01-22 18:51:30.385696: step: 668/463, loss: 0.04568130150437355 2023-01-22 18:51:30.943524: step: 670/463, loss: 0.0005280053592287004 2023-01-22 18:51:31.583929: step: 672/463, loss: 0.009610794484615326 2023-01-22 18:51:32.173972: step: 674/463, loss: 0.0015710145235061646 2023-01-22 18:51:32.770200: step: 676/463, loss: 0.0016349686775356531 2023-01-22 18:51:33.408933: step: 678/463, loss: 0.0008764940430410206 2023-01-22 18:51:34.125015: step: 680/463, loss: 0.004971388727426529 2023-01-22 18:51:34.763354: step: 682/463, loss: 8.591260120738298e-05 2023-01-22 18:51:35.348853: step: 684/463, loss: 0.03484330326318741 2023-01-22 18:51:35.975309: step: 686/463, loss: 0.014426385052502155 2023-01-22 18:51:36.562956: step: 688/463, loss: 0.07955188304185867 2023-01-22 18:51:37.198021: step: 690/463, loss: 
0.000625498010776937 2023-01-22 18:51:37.881890: step: 692/463, loss: 0.0320172980427742 2023-01-22 18:51:38.502665: step: 694/463, loss: 0.011903285048902035 2023-01-22 18:51:39.126996: step: 696/463, loss: 0.03269776701927185 2023-01-22 18:51:39.758506: step: 698/463, loss: 0.05856316536664963 2023-01-22 18:51:40.377133: step: 700/463, loss: 0.0001418902538716793 2023-01-22 18:51:40.994478: step: 702/463, loss: 0.13959087431430817 2023-01-22 18:51:41.617010: step: 704/463, loss: 0.0011247132206335664 2023-01-22 18:51:42.264965: step: 706/463, loss: 7.681200804654509e-05 2023-01-22 18:51:42.871440: step: 708/463, loss: 3.101267429883592e-05 2023-01-22 18:51:43.446013: step: 710/463, loss: 0.0025639438536018133 2023-01-22 18:51:44.064311: step: 712/463, loss: 0.008351258933544159 2023-01-22 18:51:44.662691: step: 714/463, loss: 0.006547629367560148 2023-01-22 18:51:45.259264: step: 716/463, loss: 0.007629115134477615 2023-01-22 18:51:45.885240: step: 718/463, loss: 0.0010633810888975859 2023-01-22 18:51:46.575032: step: 720/463, loss: 7.668213220313191e-05 2023-01-22 18:51:47.152267: step: 722/463, loss: 0.01556393038481474 2023-01-22 18:51:47.805020: step: 724/463, loss: 0.0007416423759423196 2023-01-22 18:51:48.479504: step: 726/463, loss: 0.006539164111018181 2023-01-22 18:51:49.173413: step: 728/463, loss: 0.0018867211183533072 2023-01-22 18:51:49.835373: step: 730/463, loss: 0.0005657885340042412 2023-01-22 18:51:50.459581: step: 732/463, loss: 0.04360099881887436 2023-01-22 18:51:51.110333: step: 734/463, loss: 0.004330152180045843 2023-01-22 18:51:51.771325: step: 736/463, loss: 0.08611345291137695 2023-01-22 18:51:52.341176: step: 738/463, loss: 2.9391927455435507e-05 2023-01-22 18:51:52.929380: step: 740/463, loss: 0.019473427906632423 2023-01-22 18:51:53.575098: step: 742/463, loss: 0.024920666590332985 2023-01-22 18:51:54.222613: step: 744/463, loss: 0.0024718125350773335 2023-01-22 18:51:54.929264: step: 746/463, loss: 0.0008794695604592562 2023-01-22 
18:51:55.610561: step: 748/463, loss: 0.015080979093909264 2023-01-22 18:51:56.233814: step: 750/463, loss: 0.0017277252627536654 2023-01-22 18:51:56.844559: step: 752/463, loss: 0.0650850459933281 2023-01-22 18:51:57.464283: step: 754/463, loss: 0.01961144246160984 2023-01-22 18:51:58.117205: step: 756/463, loss: 0.0017193052917718887 2023-01-22 18:51:58.798320: step: 758/463, loss: 9.541733743390068e-05 2023-01-22 18:51:59.442134: step: 760/463, loss: 0.013148232363164425 2023-01-22 18:52:00.094859: step: 762/463, loss: 0.0003687719872687012 2023-01-22 18:52:00.689558: step: 764/463, loss: 0.01600434072315693 2023-01-22 18:52:01.300606: step: 766/463, loss: 0.0002734844165388495 2023-01-22 18:52:01.913680: step: 768/463, loss: 0.0028955943416804075 2023-01-22 18:52:02.519921: step: 770/463, loss: 0.03363749384880066 2023-01-22 18:52:03.133331: step: 772/463, loss: 0.029944676905870438 2023-01-22 18:52:03.749797: step: 774/463, loss: 0.008217028342187405 2023-01-22 18:52:04.367818: step: 776/463, loss: 0.004039390478283167 2023-01-22 18:52:04.948805: step: 778/463, loss: 0.011337703093886375 2023-01-22 18:52:05.564443: step: 780/463, loss: 0.00012448150664567947 2023-01-22 18:52:06.188925: step: 782/463, loss: 0.010057143867015839 2023-01-22 18:52:06.777646: step: 784/463, loss: 0.0022137206979095936 2023-01-22 18:52:07.364286: step: 786/463, loss: 2.9386113965301774e-05 2023-01-22 18:52:07.990215: step: 788/463, loss: 0.00017630128422752023 2023-01-22 18:52:08.658841: step: 790/463, loss: 0.018591230735182762 2023-01-22 18:52:09.267830: step: 792/463, loss: 0.0015779443783685565 2023-01-22 18:52:09.854716: step: 794/463, loss: 0.07194649428129196 2023-01-22 18:52:10.470971: step: 796/463, loss: 0.06920286267995834 2023-01-22 18:52:11.079315: step: 798/463, loss: 0.0004050025891046971 2023-01-22 18:52:11.741827: step: 800/463, loss: 0.012356555089354515 2023-01-22 18:52:12.362473: step: 802/463, loss: 0.002985794795677066 2023-01-22 18:52:13.030424: step: 804/463, 
loss: 0.0031967204995453358 2023-01-22 18:52:13.790826: step: 806/463, loss: 0.002034263452515006 2023-01-22 18:52:14.461753: step: 808/463, loss: 0.01366417482495308 2023-01-22 18:52:15.111212: step: 810/463, loss: 0.0008984622545540333 2023-01-22 18:52:15.728016: step: 812/463, loss: 0.017900753766298294 2023-01-22 18:52:16.398617: step: 814/463, loss: 0.010512127541005611 2023-01-22 18:52:16.998910: step: 816/463, loss: 0.0008099261904135346 2023-01-22 18:52:17.621199: step: 818/463, loss: 0.025760695338249207 2023-01-22 18:52:18.280615: step: 820/463, loss: 0.019363844767212868 2023-01-22 18:52:18.955004: step: 822/463, loss: 0.0002963232109323144 2023-01-22 18:52:19.664095: step: 824/463, loss: 0.7088283896446228 2023-01-22 18:52:20.258837: step: 826/463, loss: 0.013112346641719341 2023-01-22 18:52:20.811317: step: 828/463, loss: 0.03232106938958168 2023-01-22 18:52:21.402894: step: 830/463, loss: 0.0036049422342330217 2023-01-22 18:52:22.054922: step: 832/463, loss: 0.02061675675213337 2023-01-22 18:52:22.689459: step: 834/463, loss: 0.0032450652215629816 2023-01-22 18:52:23.331545: step: 836/463, loss: 0.00797413568943739 2023-01-22 18:52:23.905740: step: 838/463, loss: 0.00015561634791083634 2023-01-22 18:52:24.596349: step: 840/463, loss: 0.008400904014706612 2023-01-22 18:52:25.211936: step: 842/463, loss: 0.009043578058481216 2023-01-22 18:52:25.904773: step: 844/463, loss: 0.08120032399892807 2023-01-22 18:52:26.504759: step: 846/463, loss: 0.015190019272267818 2023-01-22 18:52:27.134302: step: 848/463, loss: 0.0060034096240997314 2023-01-22 18:52:27.713477: step: 850/463, loss: 0.06084943562746048 2023-01-22 18:52:28.326781: step: 852/463, loss: 0.00817871280014515 2023-01-22 18:52:28.945383: step: 854/463, loss: 0.04268055409193039 2023-01-22 18:52:29.560500: step: 856/463, loss: 0.11056595295667648 2023-01-22 18:52:30.177428: step: 858/463, loss: 0.0015568170929327607 2023-01-22 18:52:30.799582: step: 860/463, loss: 0.027737751603126526 2023-01-22 
18:52:31.346370: step: 862/463, loss: 0.00039671960985288024 2023-01-22 18:52:31.858639: step: 864/463, loss: 0.005923689343035221 2023-01-22 18:52:32.426521: step: 866/463, loss: 0.024443788453936577 2023-01-22 18:52:33.042062: step: 868/463, loss: 0.024091830477118492 2023-01-22 18:52:33.734862: step: 870/463, loss: 0.006489522289484739 2023-01-22 18:52:34.348705: step: 872/463, loss: 0.029067212715744972 2023-01-22 18:52:34.976639: step: 874/463, loss: 0.0041711642406880856 2023-01-22 18:52:35.519016: step: 876/463, loss: 0.004569076467305422 2023-01-22 18:52:36.087997: step: 878/463, loss: 0.0005329335690476 2023-01-22 18:52:36.738376: step: 880/463, loss: 0.005759290419518948 2023-01-22 18:52:37.321755: step: 882/463, loss: 3.5535309314727783 2023-01-22 18:52:37.927544: step: 884/463, loss: 0.0018742851680144668 2023-01-22 18:52:38.504555: step: 886/463, loss: 2.019053317781072e-05 2023-01-22 18:52:39.225039: step: 888/463, loss: 0.0008340865606442094 2023-01-22 18:52:39.843012: step: 890/463, loss: 0.00854445993900299 2023-01-22 18:52:40.470247: step: 892/463, loss: 0.000495079206302762 2023-01-22 18:52:41.117992: step: 894/463, loss: 1.8116363207809627e-05 2023-01-22 18:52:41.791790: step: 896/463, loss: 0.016278408467769623 2023-01-22 18:52:42.454002: step: 898/463, loss: 0.01780099980533123 2023-01-22 18:52:43.065548: step: 900/463, loss: 0.027924466878175735 2023-01-22 18:52:43.588341: step: 902/463, loss: 0.0001748322683852166 2023-01-22 18:52:44.250645: step: 904/463, loss: 0.007044735364615917 2023-01-22 18:52:44.915168: step: 906/463, loss: 0.17951500415802002 2023-01-22 18:52:45.527751: step: 908/463, loss: 0.005990760400891304 2023-01-22 18:52:46.244900: step: 910/463, loss: 0.040801163762807846 2023-01-22 18:52:46.885075: step: 912/463, loss: 0.07330822944641113 2023-01-22 18:52:47.480205: step: 914/463, loss: 0.00014372900477610528 2023-01-22 18:52:48.046417: step: 916/463, loss: 0.002673413837328553 2023-01-22 18:52:48.625926: step: 918/463, 
loss: 0.0009781820699572563 2023-01-22 18:52:49.263678: step: 920/463, loss: 0.04004479944705963 2023-01-22 18:52:49.894216: step: 922/463, loss: 0.00036398929660208523 2023-01-22 18:52:50.535544: step: 924/463, loss: 6.87718202243559e-05 2023-01-22 18:52:51.165543: step: 926/463, loss: 0.003413501661270857
==================================================
Loss: 0.052
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29174030960507075, 'r': 0.3244019381946327, 'f1': 0.30720542934154804}, 'combined': 0.22636189530429854, 'epoch': 35}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3656360313747181, 'r': 0.3280204675541281, 'f1': 0.34580834217333173}, 'combined': 0.24328225077520826, 'epoch': 35}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2969134852216749, 'r': 0.3267738547031716, 'f1': 0.3111288553361724}, 'combined': 0.22925284077402175, 'epoch': 35}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36736581810354946, 'r': 0.32376917651359943, 'f1': 0.34419246408588583}, 'combined': 0.24437664950097893, 'epoch': 35}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3093920510912699, 'r': 0.33815905394415835, 'f1': 0.3231365755731123}, 'combined': 0.23810063463281958, 'epoch': 35}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.38469814454868806, 'r': 0.3142037201549494, 'f1': 0.34589573803801343}, 'combined': 0.24558597400698953, 'epoch': 35}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29450757575757575, 'r': 0.3702380952380952, 'f1': 0.3280590717299578}, 'combined': 0.21870604781997185, 'epoch': 35}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2962962962962963, 'r': 0.34782608695652173, 'f1': 0.31999999999999995}, 'combined': 0.15999999999999998, 'epoch': 35}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.359375, 'r': 0.19827586206896552, 'f1': 0.2555555555555556}, 'combined': 0.1703703703703704, 'epoch': 35}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined':
0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 36
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 18:55:29.646004: step: 2/463, loss: 0.0039131054654717445 2023-01-22 18:55:30.239226: step: 4/463, loss: 0.03753639757633209 2023-01-22 18:55:30.825941: step: 6/463, loss: 0.0012767130974680185 2023-01-22 18:55:31.520305: step: 8/463, loss: 0.00015890615759417415 2023-01-22 18:55:32.085310: step: 10/463, loss: 5.283621430862695e-05 2023-01-22 18:55:32.741045: step: 12/463, loss: 0.003733850084245205 2023-01-22 18:55:33.341733: step: 14/463, loss: 0.00023791068815626204 2023-01-22 18:55:33.976005: step: 16/463, loss: 0.0037667066790163517 2023-01-22 18:55:34.640736: step: 18/463, loss: 1.3957005739212036 2023-01-22 18:55:35.284926: step: 20/463, loss: 0.0007915259338915348 2023-01-22 18:55:35.889201: step: 22/463, loss: 0.011138240806758404 2023-01-22 18:55:36.679463: step: 24/463, loss: 0.0007262803264893591 2023-01-22 18:55:37.294457: step: 26/463, loss: 0.004932014271616936 2023-01-22 18:55:37.925701: step: 28/463, loss: 0.012132841162383556 2023-01-22 18:55:38.498689: step: 30/463, loss: 0.004444682039320469 2023-01-22 18:55:39.145287: step: 32/463, loss: 0.2115774303674698 2023-01-22 18:55:39.782543: step: 34/463, loss: 0.008062940090894699 2023-01-22 18:55:40.401548: step: 36/463, loss: 0.005142402835190296 2023-01-22 18:55:41.007159: step:
38/463, loss: 0.04166489467024803 2023-01-22 18:55:41.652565: step: 40/463, loss: 8.344496018253267e-05 2023-01-22 18:55:42.266978: step: 42/463, loss: 0.00018007186008617282 2023-01-22 18:55:42.814171: step: 44/463, loss: 0.007009679451584816 2023-01-22 18:55:43.466898: step: 46/463, loss: 0.08297619968652725 2023-01-22 18:55:44.039440: step: 48/463, loss: 0.03162526339292526 2023-01-22 18:55:44.603140: step: 50/463, loss: 0.0013527707196772099 2023-01-22 18:55:45.233258: step: 52/463, loss: 0.034911077469587326 2023-01-22 18:55:45.820444: step: 54/463, loss: 0.0008847775170579553 2023-01-22 18:55:46.433303: step: 56/463, loss: 0.0932626947760582 2023-01-22 18:55:47.019337: step: 58/463, loss: 0.003763317596167326 2023-01-22 18:55:47.629442: step: 60/463, loss: 0.01685957796871662 2023-01-22 18:55:48.220194: step: 62/463, loss: 0.002230278681963682 2023-01-22 18:55:48.815723: step: 64/463, loss: 0.004431103356182575 2023-01-22 18:55:49.481545: step: 66/463, loss: 0.03291596099734306 2023-01-22 18:55:50.160936: step: 68/463, loss: 0.003820683341473341 2023-01-22 18:55:50.799871: step: 70/463, loss: 0.00018920113507192582 2023-01-22 18:55:51.366924: step: 72/463, loss: 8.434458140982315e-05 2023-01-22 18:55:52.034121: step: 74/463, loss: 0.011637560091912746 2023-01-22 18:55:52.653595: step: 76/463, loss: 0.02371274307370186 2023-01-22 18:55:53.363611: step: 78/463, loss: 0.01811177469789982 2023-01-22 18:55:53.886958: step: 80/463, loss: 0.0017426020931452513 2023-01-22 18:55:54.429650: step: 82/463, loss: 0.006690945476293564 2023-01-22 18:55:55.143629: step: 84/463, loss: 0.10590013861656189 2023-01-22 18:55:55.718325: step: 86/463, loss: 0.011213444173336029 2023-01-22 18:55:56.364781: step: 88/463, loss: 0.002744789468124509 2023-01-22 18:55:56.933217: step: 90/463, loss: 0.005911767948418856 2023-01-22 18:55:57.589880: step: 92/463, loss: 0.00012543072807602584 2023-01-22 18:55:58.202916: step: 94/463, loss: 0.008700630627572536 2023-01-22 18:55:58.885453: 
step: 96/463, loss: 0.0014130865456536412 2023-01-22 18:55:59.540046: step: 98/463, loss: 0.037577275186777115 2023-01-22 18:56:00.175410: step: 100/463, loss: 0.052911508828401566 2023-01-22 18:56:00.751283: step: 102/463, loss: 0.000879697676282376 2023-01-22 18:56:01.345236: step: 104/463, loss: 0.0018070174846798182 2023-01-22 18:56:01.986494: step: 106/463, loss: 0.0029690370429307222 2023-01-22 18:56:02.613037: step: 108/463, loss: 0.04050770029425621 2023-01-22 18:56:03.214352: step: 110/463, loss: 0.022075319662690163 2023-01-22 18:56:03.825291: step: 112/463, loss: 0.007614338770508766 2023-01-22 18:56:04.556914: step: 114/463, loss: 0.018121657893061638 2023-01-22 18:56:05.115603: step: 116/463, loss: 0.006946384906768799 2023-01-22 18:56:05.690605: step: 118/463, loss: 0.021788297221064568 2023-01-22 18:56:06.323363: step: 120/463, loss: 0.03637975826859474 2023-01-22 18:56:06.926933: step: 122/463, loss: 0.006877163890749216 2023-01-22 18:56:07.549643: step: 124/463, loss: 0.0008058794192038476 2023-01-22 18:56:08.167578: step: 126/463, loss: 8.476442599203438e-05 2023-01-22 18:56:08.832716: step: 128/463, loss: 0.0007311162771657109 2023-01-22 18:56:09.518287: step: 130/463, loss: 0.11699274182319641 2023-01-22 18:56:10.163684: step: 132/463, loss: 0.006577538326382637 2023-01-22 18:56:10.746901: step: 134/463, loss: 0.0009582852362655103 2023-01-22 18:56:11.335080: step: 136/463, loss: 0.00042834447231143713 2023-01-22 18:56:11.979538: step: 138/463, loss: 0.0030110678635537624 2023-01-22 18:56:12.582950: step: 140/463, loss: 0.0009051500237546861 2023-01-22 18:56:13.148292: step: 142/463, loss: 0.01139234658330679 2023-01-22 18:56:13.768838: step: 144/463, loss: 0.08383572101593018 2023-01-22 18:56:14.353522: step: 146/463, loss: 0.0006674934411421418 2023-01-22 18:56:14.994117: step: 148/463, loss: 0.02658681571483612 2023-01-22 18:56:15.632888: step: 150/463, loss: 0.0034844321198761463 2023-01-22 18:56:16.264776: step: 152/463, loss: 
0.01202048733830452 2023-01-22 18:56:16.880390: step: 154/463, loss: 0.005570350214838982 2023-01-22 18:56:17.472435: step: 156/463, loss: 0.0001580249227117747 2023-01-22 18:56:18.054788: step: 158/463, loss: 0.006072321441024542 2023-01-22 18:56:18.688821: step: 160/463, loss: 0.0010906679090112448 2023-01-22 18:56:19.295739: step: 162/463, loss: 0.0032006497494876385 2023-01-22 18:56:20.003052: step: 164/463, loss: 0.009151017293334007 2023-01-22 18:56:20.648406: step: 166/463, loss: 0.015467564575374126 2023-01-22 18:56:21.236885: step: 168/463, loss: 0.0014867375139147043 2023-01-22 18:56:21.836647: step: 170/463, loss: 0.01828332617878914 2023-01-22 18:56:22.383542: step: 172/463, loss: 0.00014816145994700491 2023-01-22 18:56:23.005647: step: 174/463, loss: 0.008795912377536297 2023-01-22 18:56:23.580568: step: 176/463, loss: 0.12636598944664001 2023-01-22 18:56:24.176531: step: 178/463, loss: 0.029871642589569092 2023-01-22 18:56:24.799416: step: 180/463, loss: 0.05339502915740013 2023-01-22 18:56:25.405390: step: 182/463, loss: 0.001030914718285203 2023-01-22 18:56:26.045815: step: 184/463, loss: 0.008076929487287998 2023-01-22 18:56:26.684823: step: 186/463, loss: 0.012896292842924595 2023-01-22 18:56:27.257490: step: 188/463, loss: 0.004633620381355286 2023-01-22 18:56:27.912277: step: 190/463, loss: 0.002839659806340933 2023-01-22 18:56:28.523683: step: 192/463, loss: 0.048908721655607224 2023-01-22 18:56:29.079341: step: 194/463, loss: 2.005544900894165 2023-01-22 18:56:29.728607: step: 196/463, loss: 1.7679762095212936e-05 2023-01-22 18:56:30.354384: step: 198/463, loss: 0.0018164466600865126 2023-01-22 18:56:30.983601: step: 200/463, loss: 0.031455039978027344 2023-01-22 18:56:31.572483: step: 202/463, loss: 0.008206028491258621 2023-01-22 18:56:32.150420: step: 204/463, loss: 0.00013274287630338222 2023-01-22 18:56:32.776383: step: 206/463, loss: 0.0003759699466172606 2023-01-22 18:56:33.407260: step: 208/463, loss: 0.21346065402030945 2023-01-22 
18:56:34.035575: step: 210/463, loss: 0.0008245801436714828 2023-01-22 18:56:34.656236: step: 212/463, loss: 0.018925832584500313 2023-01-22 18:56:35.315174: step: 214/463, loss: 0.04367969185113907 2023-01-22 18:56:35.998716: step: 216/463, loss: 0.046914905309677124 2023-01-22 18:56:36.669141: step: 218/463, loss: 0.006718606688082218 2023-01-22 18:56:37.256476: step: 220/463, loss: 0.024260101839900017 2023-01-22 18:56:37.918479: step: 222/463, loss: 0.0035814372822642326 2023-01-22 18:56:38.528494: step: 224/463, loss: 0.03318468853831291 2023-01-22 18:56:39.145037: step: 226/463, loss: 0.01521458849310875 2023-01-22 18:56:39.759445: step: 228/463, loss: 0.03938154876232147 2023-01-22 18:56:40.329781: step: 230/463, loss: 0.0008341980865225196 2023-01-22 18:56:40.892459: step: 232/463, loss: 0.0001917454064823687 2023-01-22 18:56:41.491134: step: 234/463, loss: 0.011107788421213627 2023-01-22 18:56:42.143309: step: 236/463, loss: 0.0030267282854765654 2023-01-22 18:56:42.750240: step: 238/463, loss: 0.000329388800309971 2023-01-22 18:56:43.387394: step: 240/463, loss: 0.0038451035507023335 2023-01-22 18:56:44.065606: step: 242/463, loss: 0.0012905709445476532 2023-01-22 18:56:44.675028: step: 244/463, loss: 0.00016268555191345513 2023-01-22 18:56:45.282692: step: 246/463, loss: 0.063331738114357 2023-01-22 18:56:45.933852: step: 248/463, loss: 0.05011914670467377 2023-01-22 18:56:46.560230: step: 250/463, loss: 0.00032154045766219497 2023-01-22 18:56:47.136553: step: 252/463, loss: 0.0005501228151842952 2023-01-22 18:56:47.741331: step: 254/463, loss: 0.002352383453398943 2023-01-22 18:56:48.342420: step: 256/463, loss: 0.0007580643869005144 2023-01-22 18:56:48.990944: step: 258/463, loss: 0.08714041858911514 2023-01-22 18:56:49.617107: step: 260/463, loss: 0.005704423412680626 2023-01-22 18:56:50.247736: step: 262/463, loss: 0.0072910296730697155 2023-01-22 18:56:50.850352: step: 264/463, loss: 0.006596122402697802 2023-01-22 18:56:51.441591: step: 266/463, 
loss: 0.0018177800811827183 2023-01-22 18:56:52.073911: step: 268/463, loss: 0.015885574743151665 2023-01-22 18:56:52.603396: step: 270/463, loss: 0.0002563674934208393 2023-01-22 18:56:53.199718: step: 272/463, loss: 0.013127037324011326 2023-01-22 18:56:53.857571: step: 274/463, loss: 0.02170911431312561 2023-01-22 18:56:54.478026: step: 276/463, loss: 0.005188416223973036 2023-01-22 18:56:55.099966: step: 278/463, loss: 0.01241718977689743 2023-01-22 18:56:55.684540: step: 280/463, loss: 0.0004599408130161464 2023-01-22 18:56:56.330285: step: 282/463, loss: 0.03312910720705986 2023-01-22 18:56:56.931379: step: 284/463, loss: 0.00039400876266881824 2023-01-22 18:56:57.587944: step: 286/463, loss: 0.0006595011800527573 2023-01-22 18:56:58.189766: step: 288/463, loss: 0.021535217761993408 2023-01-22 18:56:58.896699: step: 290/463, loss: 0.0034528118558228016 2023-01-22 18:56:59.492639: step: 292/463, loss: 0.0056760963052511215 2023-01-22 18:57:00.071576: step: 294/463, loss: 0.0010164246195927262 2023-01-22 18:57:00.695869: step: 296/463, loss: 0.001583081902936101 2023-01-22 18:57:01.399912: step: 298/463, loss: 0.027305709198117256 2023-01-22 18:57:02.020397: step: 300/463, loss: 0.0017213256796821952 2023-01-22 18:57:02.654784: step: 302/463, loss: 0.00030831413459964097 2023-01-22 18:57:03.277298: step: 304/463, loss: 0.028002982959151268 2023-01-22 18:57:03.862763: step: 306/463, loss: 0.00025799195282161236 2023-01-22 18:57:04.393978: step: 308/463, loss: 1.7887819922179915e-05 2023-01-22 18:57:04.976776: step: 310/463, loss: 0.0003403961018193513 2023-01-22 18:57:05.558555: step: 312/463, loss: 0.02369399555027485 2023-01-22 18:57:06.233382: step: 314/463, loss: 0.0009541420149616897 2023-01-22 18:57:06.904958: step: 316/463, loss: 0.03247833624482155 2023-01-22 18:57:07.496402: step: 318/463, loss: 0.00019016233272850513 2023-01-22 18:57:08.086503: step: 320/463, loss: 0.016087811440229416 2023-01-22 18:57:08.708035: step: 322/463, loss: 
0.09212274104356766 2023-01-22 18:57:09.324441: step: 324/463, loss: 0.004515312612056732 2023-01-22 18:57:09.936262: step: 326/463, loss: 0.005570523906499147 2023-01-22 18:57:10.630677: step: 328/463, loss: 0.0017093459609895945 2023-01-22 18:57:11.261146: step: 330/463, loss: 0.015343230217695236 2023-01-22 18:57:11.877445: step: 332/463, loss: 0.0006614208105020225 2023-01-22 18:57:12.490948: step: 334/463, loss: 0.10579109191894531 2023-01-22 18:57:13.169914: step: 336/463, loss: 0.0022409153170883656 2023-01-22 18:57:13.778969: step: 338/463, loss: 0.0007101965020410717 2023-01-22 18:57:14.366510: step: 340/463, loss: 0.001009844709187746 2023-01-22 18:57:14.979504: step: 342/463, loss: 0.0013554352335631847 2023-01-22 18:57:15.675319: step: 344/463, loss: 0.0019103569211438298 2023-01-22 18:57:16.269983: step: 346/463, loss: 0.0005408637225627899 2023-01-22 18:57:16.857078: step: 348/463, loss: 0.0012891687219962478 2023-01-22 18:57:17.462748: step: 350/463, loss: 0.0004941669758409262 2023-01-22 18:57:18.047567: step: 352/463, loss: 0.007493946701288223 2023-01-22 18:57:18.661400: step: 354/463, loss: 0.0033008658792823553 2023-01-22 18:57:19.334998: step: 356/463, loss: 0.049578506499528885 2023-01-22 18:57:19.978149: step: 358/463, loss: 0.0028152712620794773 2023-01-22 18:57:20.524280: step: 360/463, loss: 0.007004488725215197 2023-01-22 18:57:21.079543: step: 362/463, loss: 0.0012697805650532246 2023-01-22 18:57:21.698579: step: 364/463, loss: 0.018851568922400475 2023-01-22 18:57:22.288247: step: 366/463, loss: 0.0058478000573813915 2023-01-22 18:57:22.985548: step: 368/463, loss: 0.021564127877354622 2023-01-22 18:57:23.578378: step: 370/463, loss: 0.0806829109787941 2023-01-22 18:57:24.163732: step: 372/463, loss: 0.05472458153963089 2023-01-22 18:57:24.783644: step: 374/463, loss: 0.003851136425510049 2023-01-22 18:57:25.357239: step: 376/463, loss: 0.013617169111967087 2023-01-22 18:57:25.972765: step: 378/463, loss: 0.004623953253030777 2023-01-22 
18:57:26.600667: step: 380/463, loss: 0.0003911424719262868 2023-01-22 18:57:27.152598: step: 382/463, loss: 2.276523446198553e-05 2023-01-22 18:57:27.877844: step: 384/463, loss: 0.20619510114192963 2023-01-22 18:57:28.540445: step: 386/463, loss: 0.017589032649993896 2023-01-22 18:57:29.150002: step: 388/463, loss: 0.010000547394156456 2023-01-22 18:57:29.764789: step: 390/463, loss: 0.00019680480181705207 2023-01-22 18:57:30.314263: step: 392/463, loss: 0.002149604493752122 2023-01-22 18:57:30.938381: step: 394/463, loss: 0.027686886489391327 2023-01-22 18:57:31.556827: step: 396/463, loss: 0.006160394754260778 2023-01-22 18:57:32.177452: step: 398/463, loss: 0.05810905247926712 2023-01-22 18:57:32.782758: step: 400/463, loss: 1.2715278899122495e-05 2023-01-22 18:57:33.345675: step: 402/463, loss: 0.0005860989331267774 2023-01-22 18:57:34.005810: step: 404/463, loss: 0.02245870605111122 2023-01-22 18:57:34.591061: step: 406/463, loss: 0.0012381643755361438 2023-01-22 18:57:35.256612: step: 408/463, loss: 0.04763353615999222 2023-01-22 18:57:35.861537: step: 410/463, loss: 0.11136508733034134 2023-01-22 18:57:36.488566: step: 412/463, loss: 0.05177998170256615 2023-01-22 18:57:37.187058: step: 414/463, loss: 0.0007559359655715525 2023-01-22 18:57:37.862237: step: 416/463, loss: 3.7106416129972786e-05 2023-01-22 18:57:38.508126: step: 418/463, loss: 0.019734280183911324 2023-01-22 18:57:39.172969: step: 420/463, loss: 0.06339508295059204 2023-01-22 18:57:39.765912: step: 422/463, loss: 0.0040702372789382935 2023-01-22 18:57:40.392384: step: 424/463, loss: 0.001581559656187892 2023-01-22 18:57:41.015591: step: 426/463, loss: 0.00038402422796934843 2023-01-22 18:57:41.577915: step: 428/463, loss: 0.005168115720152855 2023-01-22 18:57:42.165452: step: 430/463, loss: 0.0013646932784467936 2023-01-22 18:57:42.793881: step: 432/463, loss: 0.009912949986755848 2023-01-22 18:57:43.393325: step: 434/463, loss: 3.561094490578398e-05 2023-01-22 18:57:44.020296: step: 
436/463, loss: 0.01404653675854206 2023-01-22 18:57:44.667330: step: 438/463, loss: 0.029941964894533157 2023-01-22 18:57:45.232615: step: 440/463, loss: 0.050746191293001175 2023-01-22 18:57:45.778505: step: 442/463, loss: 0.02635306492447853 2023-01-22 18:57:46.450808: step: 444/463, loss: 0.0034588577691465616 2023-01-22 18:57:47.115508: step: 446/463, loss: 0.01602175086736679 2023-01-22 18:57:47.790898: step: 448/463, loss: 0.01984070986509323 2023-01-22 18:57:48.381875: step: 450/463, loss: 0.0006603609072044492 2023-01-22 18:57:49.054263: step: 452/463, loss: 0.0020816135220229626 2023-01-22 18:57:49.626899: step: 454/463, loss: 0.01991451345384121 2023-01-22 18:57:50.260116: step: 456/463, loss: 0.001439945655874908 2023-01-22 18:57:50.869092: step: 458/463, loss: 0.004279706161469221 2023-01-22 18:57:51.442062: step: 460/463, loss: 0.0018330500461161137 2023-01-22 18:57:52.088772: step: 462/463, loss: 0.02326149307191372 2023-01-22 18:57:52.845954: step: 464/463, loss: 0.00026935923960991204 2023-01-22 18:57:53.458221: step: 466/463, loss: 0.00837028119713068 2023-01-22 18:57:54.066839: step: 468/463, loss: 1.5012360563559923e-05 2023-01-22 18:57:54.687088: step: 470/463, loss: 0.00011508278112160042 2023-01-22 18:57:55.299231: step: 472/463, loss: 0.0009805691661313176 2023-01-22 18:57:55.918587: step: 474/463, loss: 0.0008236143621616066 2023-01-22 18:57:56.604428: step: 476/463, loss: 0.013348506763577461 2023-01-22 18:57:57.291454: step: 478/463, loss: 0.06315455585718155 2023-01-22 18:57:57.911880: step: 480/463, loss: 0.020077558234333992 2023-01-22 18:57:58.574799: step: 482/463, loss: 0.0005393887986429036 2023-01-22 18:57:59.449294: step: 484/463, loss: 0.33139973878860474 2023-01-22 18:58:00.084362: step: 486/463, loss: 0.0021186566445976496 2023-01-22 18:58:00.700852: step: 488/463, loss: 0.0009362651617266238 2023-01-22 18:58:01.306730: step: 490/463, loss: 0.021026672795414925 2023-01-22 18:58:01.912738: step: 492/463, loss: 
0.06323542445898056 2023-01-22 18:58:02.557232: step: 494/463, loss: 0.0006318576633930206 2023-01-22 18:58:03.182804: step: 496/463, loss: 0.014056823216378689 2023-01-22 18:58:03.867512: step: 498/463, loss: 0.006511223502457142 2023-01-22 18:58:04.503652: step: 500/463, loss: 0.003151458688080311 2023-01-22 18:58:05.159732: step: 502/463, loss: 6.321983528323472e-05 2023-01-22 18:58:05.754293: step: 504/463, loss: 0.006083660759031773 2023-01-22 18:58:06.400059: step: 506/463, loss: 0.003428579308092594 2023-01-22 18:58:07.093812: step: 508/463, loss: 0.019471485167741776 2023-01-22 18:58:07.760835: step: 510/463, loss: 0.0010689079063013196 2023-01-22 18:58:08.358735: step: 512/463, loss: 0.0033592672552913427 2023-01-22 18:58:08.965654: step: 514/463, loss: 0.0029803358484059572 2023-01-22 18:58:09.567154: step: 516/463, loss: 0.0064012957736849785 2023-01-22 18:58:10.187120: step: 518/463, loss: 0.004149196669459343 2023-01-22 18:58:10.882593: step: 520/463, loss: 0.0007871257839724422 2023-01-22 18:58:11.508499: step: 522/463, loss: 0.015625817701220512 2023-01-22 18:58:12.153146: step: 524/463, loss: 0.0007801534375175834 2023-01-22 18:58:12.802475: step: 526/463, loss: 0.0030307737179100513 2023-01-22 18:58:13.459956: step: 528/463, loss: 0.007288351655006409 2023-01-22 18:58:14.077620: step: 530/463, loss: 0.02911122515797615 2023-01-22 18:58:14.687394: step: 532/463, loss: 4.032709512102883e-06 2023-01-22 18:58:15.398747: step: 534/463, loss: 0.007424159441143274 2023-01-22 18:58:15.973704: step: 536/463, loss: 0.0004408792592585087 2023-01-22 18:58:16.575725: step: 538/463, loss: 7.529286085627973e-05 2023-01-22 18:58:17.231603: step: 540/463, loss: 0.6011190414428711 2023-01-22 18:58:17.874683: step: 542/463, loss: 0.005220221821218729 2023-01-22 18:58:18.467099: step: 544/463, loss: 0.015601000748574734 2023-01-22 18:58:19.032961: step: 546/463, loss: 0.128461092710495 2023-01-22 18:58:19.604086: step: 548/463, loss: 0.0009255751501768827 2023-01-22 
18:58:20.254895: step: 550/463, loss: 0.001316879061050713 2023-01-22 18:58:20.900175: step: 552/463, loss: 0.007969926111400127 2023-01-22 18:58:21.501327: step: 554/463, loss: 0.000722224940545857 2023-01-22 18:58:22.119681: step: 556/463, loss: 0.012545270845293999 2023-01-22 18:58:22.681284: step: 558/463, loss: 0.0004259929119143635 2023-01-22 18:58:23.375561: step: 560/463, loss: 0.0018221450736746192 2023-01-22 18:58:24.061581: step: 562/463, loss: 0.006431570742279291 2023-01-22 18:58:24.668077: step: 564/463, loss: 8.34755483083427e-05 2023-01-22 18:58:25.298579: step: 566/463, loss: 0.013022126629948616 2023-01-22 18:58:25.932887: step: 568/463, loss: 0.003758697770535946 2023-01-22 18:58:26.631648: step: 570/463, loss: 0.006135449279099703 2023-01-22 18:58:27.232112: step: 572/463, loss: 0.002386078704148531 2023-01-22 18:58:27.826350: step: 574/463, loss: 0.028024164959788322 2023-01-22 18:58:28.525660: step: 576/463, loss: 0.08550935983657837 2023-01-22 18:58:29.086821: step: 578/463, loss: 0.0007010336848907173 2023-01-22 18:58:29.658929: step: 580/463, loss: 1.3751759070146363e-05 2023-01-22 18:58:30.272699: step: 582/463, loss: 0.00802725087851286 2023-01-22 18:58:30.894844: step: 584/463, loss: 0.0002351757138967514 2023-01-22 18:58:31.489206: step: 586/463, loss: 0.00013442027557175606 2023-01-22 18:58:32.120823: step: 588/463, loss: 0.01335829496383667 2023-01-22 18:58:32.760823: step: 590/463, loss: 0.015719005838036537 2023-01-22 18:58:33.393953: step: 592/463, loss: 0.026370814070105553 2023-01-22 18:58:33.992061: step: 594/463, loss: 0.003387934062629938 2023-01-22 18:58:34.631932: step: 596/463, loss: 0.012097291648387909 2023-01-22 18:58:35.241535: step: 598/463, loss: 0.0012093591503798962 2023-01-22 18:58:35.881278: step: 600/463, loss: 0.00309302587993443 2023-01-22 18:58:36.466763: step: 602/463, loss: 0.005865468177944422 2023-01-22 18:58:37.154066: step: 604/463, loss: 0.002052412601187825 2023-01-22 18:58:37.760768: step: 606/463, 
loss: 0.0005783310625702143 2023-01-22 18:58:38.402676: step: 608/463, loss: 0.0040392763912677765 2023-01-22 18:58:39.021817: step: 610/463, loss: 0.0008774574380367994 2023-01-22 18:58:39.764547: step: 612/463, loss: 0.008288026787340641 2023-01-22 18:58:40.394142: step: 614/463, loss: 0.0002895063953474164 2023-01-22 18:58:41.026027: step: 616/463, loss: 0.0011733133578673005 2023-01-22 18:58:41.657361: step: 618/463, loss: 0.00048313132720068097 2023-01-22 18:58:42.304645: step: 620/463, loss: 0.007986728101968765 2023-01-22 18:58:42.913420: step: 622/463, loss: 0.009346497245132923 2023-01-22 18:58:43.500435: step: 624/463, loss: 0.00015373782662209123 2023-01-22 18:58:44.076086: step: 626/463, loss: 0.008601801469922066 2023-01-22 18:58:44.701563: step: 628/463, loss: 6.685173138976097e-05 2023-01-22 18:58:45.321545: step: 630/463, loss: 0.0022931210696697235 2023-01-22 18:58:46.001784: step: 632/463, loss: 0.003195962868630886 2023-01-22 18:58:46.567804: step: 634/463, loss: 0.00970042310655117 2023-01-22 18:58:47.204828: step: 636/463, loss: 0.0016407814109697938 2023-01-22 18:58:47.823498: step: 638/463, loss: 0.03672211244702339 2023-01-22 18:58:48.488790: step: 640/463, loss: 0.00894109159708023 2023-01-22 18:58:49.062333: step: 642/463, loss: 0.0006029381765983999 2023-01-22 18:58:49.714154: step: 644/463, loss: 0.0001658487308304757 2023-01-22 18:58:50.359389: step: 646/463, loss: 0.002161727286875248 2023-01-22 18:58:51.042716: step: 648/463, loss: 0.02210998721420765 2023-01-22 18:58:51.661795: step: 650/463, loss: 0.04925721138715744 2023-01-22 18:58:52.301322: step: 652/463, loss: 3.1916463285597274e-06 2023-01-22 18:58:52.923200: step: 654/463, loss: 0.01511157862842083 2023-01-22 18:58:53.515322: step: 656/463, loss: 0.000587633578106761 2023-01-22 18:58:54.116050: step: 658/463, loss: 0.0030336943455040455 2023-01-22 18:58:54.706512: step: 660/463, loss: 2.228893390565645e-05 2023-01-22 18:58:55.274599: step: 662/463, loss: 7.496440957766026e-05 
2023-01-22 18:58:55.930849: step: 664/463, loss: 1.0517664122744463e-05 2023-01-22 18:58:56.560584: step: 666/463, loss: 0.16666337847709656 2023-01-22 18:58:57.188654: step: 668/463, loss: 0.002395007759332657 2023-01-22 18:58:57.800764: step: 670/463, loss: 0.0007135762716643512 2023-01-22 18:58:58.393395: step: 672/463, loss: 0.0017217278946191072 2023-01-22 18:58:59.072993: step: 674/463, loss: 0.0005295167211443186 2023-01-22 18:58:59.715680: step: 676/463, loss: 0.0003577275201678276 2023-01-22 18:59:00.326907: step: 678/463, loss: 0.003206083085387945 2023-01-22 18:59:00.934488: step: 680/463, loss: 0.016423596069216728 2023-01-22 18:59:01.548648: step: 682/463, loss: 0.0005123615846969187 2023-01-22 18:59:02.118800: step: 684/463, loss: 0.03351074084639549 2023-01-22 18:59:02.723302: step: 686/463, loss: 0.027256429195404053 2023-01-22 18:59:03.356403: step: 688/463, loss: 0.0009734135819599032 2023-01-22 18:59:03.890125: step: 690/463, loss: 0.0035593505017459393 2023-01-22 18:59:04.515714: step: 692/463, loss: 0.0019398280419409275 2023-01-22 18:59:05.099478: step: 694/463, loss: 4.3778421968454495e-05 2023-01-22 18:59:05.694171: step: 696/463, loss: 0.000678095209877938 2023-01-22 18:59:06.340330: step: 698/463, loss: 0.009801367297768593 2023-01-22 18:59:07.005671: step: 700/463, loss: 0.03219345957040787 2023-01-22 18:59:07.618224: step: 702/463, loss: 0.00037196851917542517 2023-01-22 18:59:08.203119: step: 704/463, loss: 0.003169299103319645 2023-01-22 18:59:08.835819: step: 706/463, loss: 0.008782744407653809 2023-01-22 18:59:09.438912: step: 708/463, loss: 0.04423222690820694 2023-01-22 18:59:10.050896: step: 710/463, loss: 0.011025527492165565 2023-01-22 18:59:10.643442: step: 712/463, loss: 0.03643079102039337 2023-01-22 18:59:11.248125: step: 714/463, loss: 0.018903637304902077 2023-01-22 18:59:11.870279: step: 716/463, loss: 0.0023649660870432854 2023-01-22 18:59:12.453002: step: 718/463, loss: 0.009146427735686302 2023-01-22 18:59:12.956843: 
step: 720/463, loss: 0.002626665635034442 2023-01-22 18:59:13.572623: step: 722/463, loss: 0.17376849055290222 2023-01-22 18:59:14.260112: step: 724/463, loss: 3.567122621461749e-05 2023-01-22 18:59:14.875953: step: 726/463, loss: 0.0003404667950235307 2023-01-22 18:59:15.403050: step: 728/463, loss: 0.00151273503433913 2023-01-22 18:59:16.045149: step: 730/463, loss: 0.03626533970236778 2023-01-22 18:59:16.714594: step: 732/463, loss: 0.03891868144273758 2023-01-22 18:59:17.265589: step: 734/463, loss: 0.02094132825732231 2023-01-22 18:59:17.879303: step: 736/463, loss: 2.3891008822829463e-05 2023-01-22 18:59:18.564367: step: 738/463, loss: 0.00375882419757545 2023-01-22 18:59:19.176994: step: 740/463, loss: 0.01976710557937622 2023-01-22 18:59:19.807821: step: 742/463, loss: 0.017446395009756088 2023-01-22 18:59:20.447369: step: 744/463, loss: 0.0321173369884491 2023-01-22 18:59:21.045522: step: 746/463, loss: 3.3992011594818905e-05 2023-01-22 18:59:21.738006: step: 748/463, loss: 0.003993078134953976 2023-01-22 18:59:22.391984: step: 750/463, loss: 0.0040438417345285416 2023-01-22 18:59:23.017622: step: 752/463, loss: 0.02536393515765667 2023-01-22 18:59:23.686061: step: 754/463, loss: 0.039279550313949585 2023-01-22 18:59:24.294459: step: 756/463, loss: 0.000873614742886275 2023-01-22 18:59:24.955344: step: 758/463, loss: 0.30024227499961853 2023-01-22 18:59:25.625959: step: 760/463, loss: 0.0069885337725281715 2023-01-22 18:59:26.162179: step: 762/463, loss: 0.002927709836512804 2023-01-22 18:59:26.833401: step: 764/463, loss: 0.00036898721009492874 2023-01-22 18:59:27.524188: step: 766/463, loss: 0.004583714995533228 2023-01-22 18:59:28.137708: step: 768/463, loss: 0.05691809207201004 2023-01-22 18:59:28.746212: step: 770/463, loss: 0.05177024379372597 2023-01-22 18:59:29.375371: step: 772/463, loss: 0.0005056034424342215 2023-01-22 18:59:30.083141: step: 774/463, loss: 0.2132866531610489 2023-01-22 18:59:30.698344: step: 776/463, loss: 0.02564445696771145 
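[Editor's note] Every training record in this log follows the pattern `<timestamp>: step: <k>/463, loss: <value>`, and the `Loss:` figure in each epoch summary appears to be the mean of the per-step values (note the step counter increments by 2 and runs to 926 = 2 × 463, so the `/463` denominator seems to count batches). A minimal post-processing sketch; the regex and helper name are assumptions of this note, not taken from train.py:

```python
import re

# One record looks like:
#   2023-01-22 18:59:13.572623: step: 722/463, loss: 0.17376849055290222
# Assumed format; captures the step index and the loss value.
RECORD = re.compile(r"step: (\d+)/\d+, loss: ([0-9eE.+-]+)")

def epoch_mean_loss(log_text: str) -> float:
    """Mean of all per-step losses found in a chunk of log text
    (the epoch summary appears to print this as 'Loss: ...')."""
    losses = [float(m.group(2)) for m in RECORD.finditer(log_text)]
    return sum(losses) / len(losses)

# Two records copied verbatim from this log:
sample = ("2023-01-22 18:59:13.572623: step: 722/463, loss: 0.17376849055290222 "
          "2023-01-22 18:59:14.260112: step: 724/463, loss: 3.567122621461749e-05")
print(epoch_mean_loss(sample))
```

The character class `[0-9eE.+-]` also accepts scientific-notation losses such as `3.567122621461749e-05`, which occur throughout the log.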
2023-01-22 18:59:31.301928: step: 778/463, loss: 0.0026663579046726227 2023-01-22 18:59:31.848533: step: 780/463, loss: 5.4693584388587624e-05 2023-01-22 18:59:32.432869: step: 782/463, loss: 0.00015187505050562322 2023-01-22 18:59:33.079155: step: 784/463, loss: 0.014591741375625134 2023-01-22 18:59:33.677033: step: 786/463, loss: 0.002419151598587632 2023-01-22 18:59:34.293984: step: 788/463, loss: 0.005634056869894266 2023-01-22 18:59:34.907893: step: 790/463, loss: 0.07280516624450684 2023-01-22 18:59:35.589129: step: 792/463, loss: 0.012629182077944279 2023-01-22 18:59:36.179247: step: 794/463, loss: 0.008383732289075851 2023-01-22 18:59:36.876137: step: 796/463, loss: 0.02958122082054615 2023-01-22 18:59:37.494281: step: 798/463, loss: 0.0007115362677723169 2023-01-22 18:59:38.066571: step: 800/463, loss: 0.005023838020861149 2023-01-22 18:59:38.776953: step: 802/463, loss: 0.07436691969633102 2023-01-22 18:59:39.418550: step: 804/463, loss: 0.00976409949362278 2023-01-22 18:59:40.142650: step: 806/463, loss: 0.029882710427045822 2023-01-22 18:59:40.728618: step: 808/463, loss: 0.00012714667536783963 2023-01-22 18:59:41.288813: step: 810/463, loss: 0.010556872934103012 2023-01-22 18:59:41.913248: step: 812/463, loss: 0.005567050073295832 2023-01-22 18:59:42.588047: step: 814/463, loss: 0.19004641473293304 2023-01-22 18:59:43.246471: step: 816/463, loss: 0.007543179206550121 2023-01-22 18:59:43.911076: step: 818/463, loss: 0.04509204626083374 2023-01-22 18:59:44.508059: step: 820/463, loss: 0.08260630816221237 2023-01-22 18:59:45.181893: step: 822/463, loss: 0.4132557213306427 2023-01-22 18:59:45.799800: step: 824/463, loss: 0.00033459271071478724 2023-01-22 18:59:46.487701: step: 826/463, loss: 0.010395604185760021 2023-01-22 18:59:47.061498: step: 828/463, loss: 0.005294136703014374 2023-01-22 18:59:47.674928: step: 830/463, loss: 0.0024492100346833467 2023-01-22 18:59:48.248529: step: 832/463, loss: 0.012872098945081234 2023-01-22 18:59:48.864635: step: 
834/463, loss: 0.011526709422469139 2023-01-22 18:59:49.501992: step: 836/463, loss: 0.01367436908185482 2023-01-22 18:59:50.078875: step: 838/463, loss: 0.00029191887006163597 2023-01-22 18:59:50.717189: step: 840/463, loss: 0.0683939978480339 2023-01-22 18:59:51.434900: step: 842/463, loss: 0.01013981457799673 2023-01-22 18:59:52.065447: step: 844/463, loss: 0.0412985198199749 2023-01-22 18:59:52.701350: step: 846/463, loss: 0.0013584363041445613 2023-01-22 18:59:53.316702: step: 848/463, loss: 0.003594774752855301 2023-01-22 18:59:53.919580: step: 850/463, loss: 0.0016252202913165092 2023-01-22 18:59:54.565169: step: 852/463, loss: 0.0005345061654224992 2023-01-22 18:59:55.133595: step: 854/463, loss: 0.0039537339471280575 2023-01-22 18:59:55.825110: step: 856/463, loss: 0.0015526420902460814 2023-01-22 18:59:56.493324: step: 858/463, loss: 0.03394979238510132 2023-01-22 18:59:57.123608: step: 860/463, loss: 0.14379896223545074 2023-01-22 18:59:57.786741: step: 862/463, loss: 0.002866592491045594 2023-01-22 18:59:58.394814: step: 864/463, loss: 0.019480133429169655 2023-01-22 18:59:59.064746: step: 866/463, loss: 1.889379564090632e-05 2023-01-22 18:59:59.704866: step: 868/463, loss: 0.11842015385627747 2023-01-22 19:00:00.417916: step: 870/463, loss: 0.007714948616921902 2023-01-22 19:00:00.985305: step: 872/463, loss: 0.00039211814873851836 2023-01-22 19:00:01.573242: step: 874/463, loss: 2.0166089598205872e-05 2023-01-22 19:00:02.247342: step: 876/463, loss: 0.014767895452678204 2023-01-22 19:00:02.855741: step: 878/463, loss: 0.004431243985891342 2023-01-22 19:00:03.519770: step: 880/463, loss: 0.0001304658071603626 2023-01-22 19:00:04.160802: step: 882/463, loss: 0.00022995134349912405 2023-01-22 19:00:04.727051: step: 884/463, loss: 0.005044331308454275 2023-01-22 19:00:05.319673: step: 886/463, loss: 0.027933118864893913 2023-01-22 19:00:05.952546: step: 888/463, loss: 0.01362233143299818 2023-01-22 19:00:06.564794: step: 890/463, loss: 0.4199944734573364 
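[Editor's note] The evaluation summary that follows reports, per language, template and slot precision/recall/F1 plus a `combined` score. Checking the logged numbers, `combined` appears to equal template F1 × slot F1; this is an inference from the printed values, not confirmed against the training code:

```python
def combined_score(template_f1: float, slot_f1: float) -> float:
    # Assumed formula, inferred from the logged values: combined = template_f1 * slot_f1
    return template_f1 * slot_f1

# Dev Chinese, epoch 36, values copied from the summary below:
c = combined_score(0.7368421052631579, 0.304158449402194)
print(c)  # matches the logged 'combined': 0.2241167521910903
```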
2023-01-22 19:00:07.151995: step: 892/463, loss: 0.02124643139541149 2023-01-22 19:00:07.721778: step: 894/463, loss: 0.336060106754303 2023-01-22 19:00:08.367464: step: 896/463, loss: 0.20768040418624878 2023-01-22 19:00:09.015943: step: 898/463, loss: 0.06433898955583572 2023-01-22 19:00:09.692506: step: 900/463, loss: 0.008206301368772984 2023-01-22 19:00:10.311866: step: 902/463, loss: 0.0003189053386449814 2023-01-22 19:00:10.849055: step: 904/463, loss: 0.00018434094090480357 2023-01-22 19:00:11.457661: step: 906/463, loss: 6.999744800850749e-05 2023-01-22 19:00:12.046443: step: 908/463, loss: 0.004247726406902075 2023-01-22 19:00:12.656540: step: 910/463, loss: 0.002398204989731312 2023-01-22 19:00:13.312427: step: 912/463, loss: 0.0932539775967598 2023-01-22 19:00:13.926459: step: 914/463, loss: 0.01665748655796051 2023-01-22 19:00:14.508761: step: 916/463, loss: 0.009202977642416954 2023-01-22 19:00:15.102840: step: 918/463, loss: 0.7346072196960449 2023-01-22 19:00:15.820678: step: 920/463, loss: 0.005587894469499588 2023-01-22 19:00:16.436571: step: 922/463, loss: 0.09244506806135178 2023-01-22 19:00:17.068228: step: 924/463, loss: 0.0008791422587819397 2023-01-22 19:00:17.653358: step: 926/463, loss: 0.022510290145874023
==================================================
Loss: 0.032
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2788921225135624, 'r': 0.33445886419083765, 'f1': 0.304158449402194}, 'combined': 0.2241167521910903, 'epoch': 36}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3596750234403423, 'r': 0.33082576262385455, 'f1': 0.34464772909133623}, 'combined': 0.24246573905923152, 'epoch': 36}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2837778479559026, 'r': 0.3349332474925454, 'f1': 0.30724076837001113}, 'combined': 0.22638793458842923, 'epoch': 36}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35788953989753547, 'r': 0.32853385337888946, 'f1': 0.34258398177634874}, 'combined': 0.2432346270612076, 'epoch': 36}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30131371288286185, 'r': 0.3493409460558417, 'f1': 0.3235547953803666}, 'combined': 0.23840879659605957, 'epoch': 36}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3696345898496746, 'r': 0.31318951722865096, 'f1': 0.33907906163819934}, 'combined': 0.2407461337631215, 'epoch': 36}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2540849673202614, 'r': 0.3702380952380952, 'f1': 0.3013565891472868}, 'combined': 0.20090439276485786, 'epoch': 36}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.21428571428571427, 'r': 0.32608695652173914, 'f1': 0.2586206896551724}, 'combined': 0.1293103448275862, 'epoch': 36}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40789473684210525, 'r': 0.2672413793103448, 'f1': 0.3229166666666667}, 'combined': 0.2152777777777778, 'epoch': 36}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2919934913217624, 'r': 0.3557112361073462, 'f1': 0.3207182573628254}, 'combined': 0.23631871595155554, 'epoch': 30}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36122302043550025, 'r': 0.3233655859793779, 'f1': 0.34124755386763844}, 'combined': 0.24228576324602327, 'epoch': 30}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 30}
******************************
Epoch: 37
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 19:02:54.833782: step: 2/463, loss: 0.0034316214732825756 2023-01-22 19:02:55.407556: step: 4/463, loss: 0.0002714357106015086 2023-01-22 19:02:56.052334: step: 6/463, loss: 0.018349144607782364 2023-01-22 19:02:56.595398: step: 8/463, loss: 0.00012741127284243703 2023-01-22 19:02:57.258262: step: 
10/463, loss: 0.021559862419962883 2023-01-22 19:02:57.842173: step: 12/463, loss: 0.014822883531451225 2023-01-22 19:02:58.442560: step: 14/463, loss: 0.0033380286768078804 2023-01-22 19:02:59.055422: step: 16/463, loss: 0.0008424852276220918 2023-01-22 19:02:59.685644: step: 18/463, loss: 0.0024112577084451914 2023-01-22 19:03:00.260428: step: 20/463, loss: 0.0006610251148231328 2023-01-22 19:03:00.882964: step: 22/463, loss: 0.0011779398191720247 2023-01-22 19:03:01.424003: step: 24/463, loss: 0.02320272848010063 2023-01-22 19:03:02.224262: step: 26/463, loss: 0.13180238008499146 2023-01-22 19:03:02.914239: step: 28/463, loss: 0.004826754331588745 2023-01-22 19:03:03.555058: step: 30/463, loss: 0.02388058975338936 2023-01-22 19:03:04.124632: step: 32/463, loss: 0.029068345203995705 2023-01-22 19:03:04.787163: step: 34/463, loss: 0.021090632304549217 2023-01-22 19:03:05.413687: step: 36/463, loss: 0.029777739197015762 2023-01-22 19:03:06.054457: step: 38/463, loss: 0.0026042331010103226 2023-01-22 19:03:06.723960: step: 40/463, loss: 0.006657453253865242 2023-01-22 19:03:07.358906: step: 42/463, loss: 0.009387089870870113 2023-01-22 19:03:08.052966: step: 44/463, loss: 0.0012680336367338896 2023-01-22 19:03:08.654597: step: 46/463, loss: 2.8926522645633668e-05 2023-01-22 19:03:09.218549: step: 48/463, loss: 0.06527360528707504 2023-01-22 19:03:09.837548: step: 50/463, loss: 0.016339516267180443 2023-01-22 19:03:10.452831: step: 52/463, loss: 0.0015982307959347963 2023-01-22 19:03:11.052875: step: 54/463, loss: 0.0004291353980079293 2023-01-22 19:03:11.669930: step: 56/463, loss: 0.002479638671502471 2023-01-22 19:03:12.260014: step: 58/463, loss: 0.010931259021162987 2023-01-22 19:03:12.942280: step: 60/463, loss: 0.00740866968408227 2023-01-22 19:03:13.559337: step: 62/463, loss: 0.018932953476905823 2023-01-22 19:03:14.179724: step: 64/463, loss: 0.0034797838889062405 2023-01-22 19:03:14.801881: step: 66/463, loss: 0.001341668888926506 2023-01-22 
19:03:15.530198: step: 68/463, loss: 0.00017243965703528374 2023-01-22 19:03:16.123216: step: 70/463, loss: 0.0042627486400306225 2023-01-22 19:03:16.826786: step: 72/463, loss: 0.03757151588797569 2023-01-22 19:03:17.440724: step: 74/463, loss: 0.0032558634411543608 2023-01-22 19:03:18.102104: step: 76/463, loss: 0.3218415081501007 2023-01-22 19:03:18.749274: step: 78/463, loss: 0.0006900537991896272 2023-01-22 19:03:19.360969: step: 80/463, loss: 0.029682917520403862 2023-01-22 19:03:19.945915: step: 82/463, loss: 0.009259370155632496 2023-01-22 19:03:20.551423: step: 84/463, loss: 0.0007953910971991718 2023-01-22 19:03:21.187320: step: 86/463, loss: 0.0006983903585933149 2023-01-22 19:03:21.737812: step: 88/463, loss: 0.0003455177939031273 2023-01-22 19:03:22.421282: step: 90/463, loss: 0.013918382115662098 2023-01-22 19:03:23.101134: step: 92/463, loss: 0.002395426854491234 2023-01-22 19:03:23.694740: step: 94/463, loss: 0.007133430801331997 2023-01-22 19:03:24.275528: step: 96/463, loss: 0.0007088520796969533 2023-01-22 19:03:25.005287: step: 98/463, loss: 0.04382473602890968 2023-01-22 19:03:25.638443: step: 100/463, loss: 0.0032103490084409714 2023-01-22 19:03:26.216454: step: 102/463, loss: 0.0009398344554938376 2023-01-22 19:03:26.852838: step: 104/463, loss: 0.009064999409019947 2023-01-22 19:03:27.507792: step: 106/463, loss: 0.07266268879175186 2023-01-22 19:03:28.156532: step: 108/463, loss: 0.002251814818009734 2023-01-22 19:03:28.749887: step: 110/463, loss: 0.000683218298945576 2023-01-22 19:03:29.403331: step: 112/463, loss: 0.012692817486822605 2023-01-22 19:03:29.965757: step: 114/463, loss: 0.07910128682851791 2023-01-22 19:03:30.543937: step: 116/463, loss: 0.0007394339772872627 2023-01-22 19:03:31.114384: step: 118/463, loss: 0.00013344400213100016 2023-01-22 19:03:31.747795: step: 120/463, loss: 0.005026302766054869 2023-01-22 19:03:32.384503: step: 122/463, loss: 0.010279424488544464 2023-01-22 19:03:33.070857: step: 124/463, loss: 
0.008140173740684986 2023-01-22 19:03:33.733905: step: 126/463, loss: 0.0012472158996388316 2023-01-22 19:03:34.306746: step: 128/463, loss: 0.005428242031484842 2023-01-22 19:03:34.949159: step: 130/463, loss: 0.0017173783853650093 2023-01-22 19:03:35.496641: step: 132/463, loss: 5.5181961215566844e-05 2023-01-22 19:03:36.088237: step: 134/463, loss: 0.00012833734217565507 2023-01-22 19:03:36.650158: step: 136/463, loss: 0.0001899186463560909 2023-01-22 19:03:37.236263: step: 138/463, loss: 0.007140164263546467 2023-01-22 19:03:37.902053: step: 140/463, loss: 0.011061793193221092 2023-01-22 19:03:38.483291: step: 142/463, loss: 0.004177043680101633 2023-01-22 19:03:39.060744: step: 144/463, loss: 0.048303090035915375 2023-01-22 19:03:39.645396: step: 146/463, loss: 0.0013725049793720245 2023-01-22 19:03:40.253669: step: 148/463, loss: 0.0035711172968149185 2023-01-22 19:03:40.895620: step: 150/463, loss: 0.025714939460158348 2023-01-22 19:03:41.510966: step: 152/463, loss: 0.003683775896206498 2023-01-22 19:03:42.135341: step: 154/463, loss: 0.022877104580402374 2023-01-22 19:03:42.709722: step: 156/463, loss: 0.0002466371806804091 2023-01-22 19:03:43.276729: step: 158/463, loss: 0.000737882568500936 2023-01-22 19:03:43.940061: step: 160/463, loss: 0.022165384143590927 2023-01-22 19:03:44.563258: step: 162/463, loss: 0.005865116138011217 2023-01-22 19:03:45.206950: step: 164/463, loss: 0.007709412835538387 2023-01-22 19:03:45.801342: step: 166/463, loss: 0.004023921210318804 2023-01-22 19:03:46.432884: step: 168/463, loss: 0.0012560674222186208 2023-01-22 19:03:47.096575: step: 170/463, loss: 0.00015828327741473913 2023-01-22 19:03:47.716349: step: 172/463, loss: 0.00683031277731061 2023-01-22 19:03:48.279910: step: 174/463, loss: 0.00795911718159914 2023-01-22 19:03:48.879390: step: 176/463, loss: 7.729393587396771e-07 2023-01-22 19:03:49.522444: step: 178/463, loss: 0.036940865218639374 2023-01-22 19:03:50.163617: step: 180/463, loss: 0.0007378771551884711 
2023-01-22 19:03:50.748149: step: 182/463, loss: 0.011250119656324387 2023-01-22 19:03:51.331670: step: 184/463, loss: 0.006617526989430189 2023-01-22 19:03:51.892279: step: 186/463, loss: 0.058060623705387115 2023-01-22 19:03:52.436489: step: 188/463, loss: 7.652824569959193e-05 2023-01-22 19:03:53.077290: step: 190/463, loss: 0.01519687194377184 2023-01-22 19:03:53.707127: step: 192/463, loss: 0.0003063291369471699 2023-01-22 19:03:54.335129: step: 194/463, loss: 0.006012205500155687 2023-01-22 19:03:54.966899: step: 196/463, loss: 0.006280363537371159 2023-01-22 19:03:55.493490: step: 198/463, loss: 0.003202635794878006 2023-01-22 19:03:56.095361: step: 200/463, loss: 0.11471550166606903 2023-01-22 19:03:56.684157: step: 202/463, loss: 0.01721019670367241 2023-01-22 19:03:57.301953: step: 204/463, loss: 0.016893111169338226 2023-01-22 19:03:57.929827: step: 206/463, loss: 0.018344949930906296 2023-01-22 19:03:58.548318: step: 208/463, loss: 0.009665673598647118 2023-01-22 19:03:59.140656: step: 210/463, loss: 0.014448953792452812 2023-01-22 19:03:59.757901: step: 212/463, loss: 0.009668177925050259 2023-01-22 19:04:00.424023: step: 214/463, loss: 0.020127251744270325 2023-01-22 19:04:01.046568: step: 216/463, loss: 0.0059251561760902405 2023-01-22 19:04:01.654386: step: 218/463, loss: 0.0016292514046654105 2023-01-22 19:04:02.265146: step: 220/463, loss: 0.010097656399011612 2023-01-22 19:04:02.962183: step: 222/463, loss: 0.05773255601525307 2023-01-22 19:04:03.520247: step: 224/463, loss: 1.4640791960118804e-05 2023-01-22 19:04:04.128971: step: 226/463, loss: 0.003510373178869486 2023-01-22 19:04:04.720673: step: 228/463, loss: 0.0011022362159565091 2023-01-22 19:04:05.323792: step: 230/463, loss: 7.818215090082958e-05 2023-01-22 19:04:05.950766: step: 232/463, loss: 0.0026341467164456844 2023-01-22 19:04:06.613802: step: 234/463, loss: 0.39143994450569153 2023-01-22 19:04:07.210944: step: 236/463, loss: 0.00042211380787193775 2023-01-22 19:04:07.772750: step: 
238/463, loss: 0.0032821798231452703 2023-01-22 19:04:08.437734: step: 240/463, loss: 0.00011095698573626578 2023-01-22 19:04:09.124214: step: 242/463, loss: 0.10323191434144974 2023-01-22 19:04:09.741390: step: 244/463, loss: 0.004489421844482422 2023-01-22 19:04:10.344875: step: 246/463, loss: 0.007197208236902952 2023-01-22 19:04:11.032326: step: 248/463, loss: 0.00015137509035412222 2023-01-22 19:04:11.683905: step: 250/463, loss: 0.4236075282096863 2023-01-22 19:04:12.257972: step: 252/463, loss: 0.00044664324377663434 2023-01-22 19:04:12.819281: step: 254/463, loss: 0.0036588360089808702 2023-01-22 19:04:13.407165: step: 256/463, loss: 0.0019494387088343501 2023-01-22 19:04:14.070819: step: 258/463, loss: 0.10680871456861496 2023-01-22 19:04:14.687597: step: 260/463, loss: 0.00045227259397506714 2023-01-22 19:04:15.264521: step: 262/463, loss: 0.0005818310892209411 2023-01-22 19:04:15.905169: step: 264/463, loss: 0.0447736531496048 2023-01-22 19:04:16.541940: step: 266/463, loss: 0.019326768815517426 2023-01-22 19:04:17.107591: step: 268/463, loss: 0.012794767506420612 2023-01-22 19:04:17.669808: step: 270/463, loss: 0.02105092443525791 2023-01-22 19:04:18.246214: step: 272/463, loss: 0.020379452034831047 2023-01-22 19:04:18.886651: step: 274/463, loss: 0.19948792457580566 2023-01-22 19:04:19.506761: step: 276/463, loss: 0.0004633721837308258 2023-01-22 19:04:20.145582: step: 278/463, loss: 0.00030965893529355526 2023-01-22 19:04:20.737114: step: 280/463, loss: 0.0030839326791465282 2023-01-22 19:04:21.482991: step: 282/463, loss: 0.0029346898663789034 2023-01-22 19:04:22.133901: step: 284/463, loss: 0.006112061440944672 2023-01-22 19:04:22.709930: step: 286/463, loss: 0.0029282723553478718 2023-01-22 19:04:23.282873: step: 288/463, loss: 0.025896752253174782 2023-01-22 19:04:23.920309: step: 290/463, loss: 0.000425500184064731 2023-01-22 19:04:24.562887: step: 292/463, loss: 0.022494789212942123 2023-01-22 19:04:25.180134: step: 294/463, loss: 
0.10723140835762024 2023-01-22 19:04:25.779458: step: 296/463, loss: 0.0013610683381557465 2023-01-22 19:04:26.388071: step: 298/463, loss: 0.0010936838807538152 2023-01-22 19:04:26.965110: step: 300/463, loss: 0.0028232885524630547 2023-01-22 19:04:27.508063: step: 302/463, loss: 0.002621584804728627 2023-01-22 19:04:28.124433: step: 304/463, loss: 0.006185212172567844 2023-01-22 19:04:28.699998: step: 306/463, loss: 0.003315105801448226 2023-01-22 19:04:29.289955: step: 308/463, loss: 0.00944677833467722 2023-01-22 19:04:29.907901: step: 310/463, loss: 0.018645279109477997 2023-01-22 19:04:30.479867: step: 312/463, loss: 0.001943824696354568 2023-01-22 19:04:31.098248: step: 314/463, loss: 0.0027740702498704195 2023-01-22 19:04:31.730909: step: 316/463, loss: 0.0017509753815829754 2023-01-22 19:04:32.390237: step: 318/463, loss: 0.0004403857165016234 2023-01-22 19:04:33.023159: step: 320/463, loss: 0.0006594507722184062 2023-01-22 19:04:33.580692: step: 322/463, loss: 0.0003397808759473264 2023-01-22 19:04:34.176641: step: 324/463, loss: 0.007517275400459766 2023-01-22 19:04:34.743060: step: 326/463, loss: 0.18610866367816925 2023-01-22 19:04:35.358661: step: 328/463, loss: 0.013776523992419243 2023-01-22 19:04:35.945819: step: 330/463, loss: 0.0002741349453572184 2023-01-22 19:04:36.578876: step: 332/463, loss: 0.003540854901075363 2023-01-22 19:04:37.127496: step: 334/463, loss: 0.003934502135962248 2023-01-22 19:04:37.691147: step: 336/463, loss: 7.080984505591914e-05 2023-01-22 19:04:38.255771: step: 338/463, loss: 0.00022966664982959628 2023-01-22 19:04:38.886522: step: 340/463, loss: 0.030603595077991486 2023-01-22 19:04:39.450674: step: 342/463, loss: 0.0028781238943338394 2023-01-22 19:04:40.046400: step: 344/463, loss: 0.012799584306776524 2023-01-22 19:04:40.682434: step: 346/463, loss: 0.008817187510430813 2023-01-22 19:04:41.297231: step: 348/463, loss: 0.02101755328476429 2023-01-22 19:04:41.934694: step: 350/463, loss: 0.014654478058218956 
2023-01-22 19:04:42.580561: step: 352/463, loss: 0.00935424491763115 2023-01-22 19:04:43.225247: step: 354/463, loss: 0.043565843254327774 2023-01-22 19:04:43.823312: step: 356/463, loss: 0.08620220422744751 2023-01-22 19:04:44.376768: step: 358/463, loss: 0.0030100559815764427 2023-01-22 19:04:44.946998: step: 360/463, loss: 0.11220147460699081 2023-01-22 19:04:45.561544: step: 362/463, loss: 0.0014714108547195792 2023-01-22 19:04:46.143381: step: 364/463, loss: 0.005715283565223217 2023-01-22 19:04:46.718899: step: 366/463, loss: 0.0061335489153862 2023-01-22 19:04:47.307229: step: 368/463, loss: 0.3858020603656769 2023-01-22 19:04:47.843411: step: 370/463, loss: 0.17132149636745453 2023-01-22 19:04:48.404364: step: 372/463, loss: 0.017161207273602486 2023-01-22 19:04:48.974524: step: 374/463, loss: 0.011440928094089031 2023-01-22 19:04:49.614916: step: 376/463, loss: 0.004039146937429905 2023-01-22 19:04:50.268017: step: 378/463, loss: 0.0064560649916529655 2023-01-22 19:04:50.885358: step: 380/463, loss: 0.0005779936909675598 2023-01-22 19:04:51.456058: step: 382/463, loss: 0.0017574417870491743 2023-01-22 19:04:52.036234: step: 384/463, loss: 0.012567474506795406 2023-01-22 19:04:52.676292: step: 386/463, loss: 0.009510831907391548 2023-01-22 19:04:53.253849: step: 388/463, loss: 0.0036942060105502605 2023-01-22 19:04:53.836910: step: 390/463, loss: 0.0013385605998337269 2023-01-22 19:04:54.458914: step: 392/463, loss: 0.015783850103616714 2023-01-22 19:04:55.117828: step: 394/463, loss: 0.02526027150452137 2023-01-22 19:04:55.684271: step: 396/463, loss: 8.25179013190791e-05 2023-01-22 19:04:56.311073: step: 398/463, loss: 1.5807218005647883e-05 2023-01-22 19:04:56.954073: step: 400/463, loss: 0.020785463973879814 2023-01-22 19:04:57.534249: step: 402/463, loss: 0.005067126359790564 2023-01-22 19:04:58.154072: step: 404/463, loss: 0.008145703002810478 2023-01-22 19:04:58.733278: step: 406/463, loss: 0.0014468543231487274 2023-01-22 19:04:59.278592: step: 
408/463, loss: 0.004424612503498793 2023-01-22 19:04:59.927671: step: 410/463, loss: 0.008577024564146996 2023-01-22 19:05:00.575480: step: 412/463, loss: 0.08610379695892334 2023-01-22 19:05:01.169969: step: 414/463, loss: 0.023871811106801033 2023-01-22 19:05:01.842194: step: 416/463, loss: 0.012454979121685028 2023-01-22 19:05:02.477873: step: 418/463, loss: 0.20413877069950104 2023-01-22 19:05:03.028933: step: 420/463, loss: 0.005675806198269129 2023-01-22 19:05:03.663962: step: 422/463, loss: 0.045328762382268906 2023-01-22 19:05:04.329436: step: 424/463, loss: 1.0433175563812256 2023-01-22 19:05:04.975457: step: 426/463, loss: 0.00011970631021540612 2023-01-22 19:05:05.657478: step: 428/463, loss: 0.7693889141082764 2023-01-22 19:05:06.284784: step: 430/463, loss: 0.054582927376031876 2023-01-22 19:05:06.912921: step: 432/463, loss: 0.012701805680990219 2023-01-22 19:05:07.528740: step: 434/463, loss: 0.004288404248654842 2023-01-22 19:05:08.210880: step: 436/463, loss: 0.37723931670188904 2023-01-22 19:05:08.782888: step: 438/463, loss: 0.0024325258564203978 2023-01-22 19:05:09.378734: step: 440/463, loss: 0.00010750196088338271 2023-01-22 19:05:09.917456: step: 442/463, loss: 0.0009430550853721797 2023-01-22 19:05:10.535192: step: 444/463, loss: 0.03975220397114754 2023-01-22 19:05:11.133001: step: 446/463, loss: 0.04364331066608429 2023-01-22 19:05:11.796487: step: 448/463, loss: 0.016937240958213806 2023-01-22 19:05:12.351876: step: 450/463, loss: 0.002689364366233349 2023-01-22 19:05:12.948518: step: 452/463, loss: 0.007340874057263136 2023-01-22 19:05:13.520059: step: 454/463, loss: 0.0010755807161331177 2023-01-22 19:05:14.072963: step: 456/463, loss: 0.011218082159757614 2023-01-22 19:05:14.643498: step: 458/463, loss: 0.0148127106949687 2023-01-22 19:05:15.222876: step: 460/463, loss: 0.00626591918990016 2023-01-22 19:05:15.813106: step: 462/463, loss: 4.323193550109863 2023-01-22 19:05:16.509690: step: 464/463, loss: 0.04106561839580536 2023-01-22 
19:05:17.133810: step: 466/463, loss: 0.006221759133040905 2023-01-22 19:05:17.690069: step: 468/463, loss: 0.006663068197667599 2023-01-22 19:05:18.235452: step: 470/463, loss: 0.024744654074311256 2023-01-22 19:05:18.818059: step: 472/463, loss: 0.0238431878387928 2023-01-22 19:05:19.466612: step: 474/463, loss: 0.004128447733819485 2023-01-22 19:05:20.043486: step: 476/463, loss: 0.002573884790763259 2023-01-22 19:05:20.725878: step: 478/463, loss: 0.0070581501349806786 2023-01-22 19:05:21.313719: step: 480/463, loss: 0.007858439348638058 2023-01-22 19:05:21.911532: step: 482/463, loss: 0.0103904465213418 2023-01-22 19:05:22.581708: step: 484/463, loss: 0.0017777140019461513 2023-01-22 19:05:23.154137: step: 486/463, loss: 5.622552635031752e-05 2023-01-22 19:05:23.803827: step: 488/463, loss: 0.005763747729361057 2023-01-22 19:05:24.404738: step: 490/463, loss: 0.005702953319996595 2023-01-22 19:05:25.058998: step: 492/463, loss: 4.989029548596591e-05 2023-01-22 19:05:25.696636: step: 494/463, loss: 0.002477621892467141 2023-01-22 19:05:26.339333: step: 496/463, loss: 0.036307524889707565 2023-01-22 19:05:26.936904: step: 498/463, loss: 0.0005911033367738128 2023-01-22 19:05:27.570701: step: 500/463, loss: 0.005463406443595886 2023-01-22 19:05:28.133507: step: 502/463, loss: 6.260655209189281e-05 2023-01-22 19:05:28.750496: step: 504/463, loss: 0.0021783963311463594 2023-01-22 19:05:29.395421: step: 506/463, loss: 0.03412538766860962 2023-01-22 19:05:30.080455: step: 508/463, loss: 0.0001535079354653135 2023-01-22 19:05:30.692492: step: 510/463, loss: 0.014903398230671883 2023-01-22 19:05:31.271173: step: 512/463, loss: 0.00745190167799592 2023-01-22 19:05:31.881790: step: 514/463, loss: 1.7218793800566345e-05 2023-01-22 19:05:32.455168: step: 516/463, loss: 0.007843056693673134 2023-01-22 19:05:33.046691: step: 518/463, loss: 0.0010136034106835723 2023-01-22 19:05:33.619608: step: 520/463, loss: 0.050767507404088974 2023-01-22 19:05:34.232568: step: 522/463, 
loss: 0.009956415742635727 2023-01-22 19:05:34.744867: step: 524/463, loss: 3.397311593289487e-05 2023-01-22 19:05:35.323972: step: 526/463, loss: 0.0005262968479655683 2023-01-22 19:05:35.908226: step: 528/463, loss: 0.0002244722272735089 2023-01-22 19:05:36.536240: step: 530/463, loss: 0.14118845760822296 2023-01-22 19:05:37.098195: step: 532/463, loss: 0.010948466137051582 2023-01-22 19:05:37.669873: step: 534/463, loss: 0.0029030137229710817 2023-01-22 19:05:38.230207: step: 536/463, loss: 0.0016152034513652325 2023-01-22 19:05:38.791656: step: 538/463, loss: 0.003841479541733861 2023-01-22 19:05:39.497506: step: 540/463, loss: 0.0007266312022693455 2023-01-22 19:05:40.078527: step: 542/463, loss: 0.0006326115690171719 2023-01-22 19:05:40.772450: step: 544/463, loss: 0.00046351630589924753 2023-01-22 19:05:41.508790: step: 546/463, loss: 0.0012164267245680094 2023-01-22 19:05:42.105157: step: 548/463, loss: 0.007187125738710165 2023-01-22 19:05:42.633002: step: 550/463, loss: 0.006794034969061613 2023-01-22 19:05:43.223362: step: 552/463, loss: 0.00012777438678313047 2023-01-22 19:05:43.846866: step: 554/463, loss: 0.0006044174660928547 2023-01-22 19:05:44.433355: step: 556/463, loss: 0.0029851836152374744 2023-01-22 19:05:45.072917: step: 558/463, loss: 0.0009341996628791094 2023-01-22 19:05:45.648744: step: 560/463, loss: 0.004605557303875685 2023-01-22 19:05:46.360540: step: 562/463, loss: 0.024548249319195747 2023-01-22 19:05:47.007813: step: 564/463, loss: 0.009360147640109062 2023-01-22 19:05:47.668736: step: 566/463, loss: 0.009234219789505005 2023-01-22 19:05:48.326073: step: 568/463, loss: 0.015495932660996914 2023-01-22 19:05:48.970799: step: 570/463, loss: 0.004095152951776981 2023-01-22 19:05:49.613278: step: 572/463, loss: 0.016916470602154732 2023-01-22 19:05:50.187696: step: 574/463, loss: 0.013683938421308994 2023-01-22 19:05:50.756913: step: 576/463, loss: 0.002121491590514779 2023-01-22 19:05:51.389985: step: 578/463, loss: 0.03873724862933159 
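Each training record in this log follows the fixed shape `<timestamp>: step: <n>/463, loss: <value>`, which makes the stream easy to post-process. A minimal parsing sketch, assuming only the record layout visible above (the function name and aggregation are illustrative, not part of train.py):

```python
import re

# Matches: "2023-01-22 19:05:51.389985: step: 578/463, loss: 0.03873724862933159"
# The loss field may be in scientific notation (e.g. 7.080984505591914e-05).
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): step: (\d+)/(\d+), loss: ([0-9.eE+-]+)"
)

def parse_steps(text):
    """Yield (step, total_steps, loss) tuples from raw log text."""
    for m in STEP_RE.finditer(text):
        yield int(m.group(2)), int(m.group(3)), float(m.group(4))

sample = "2023-01-22 19:05:51.389985: step: 578/463, loss: 0.03873724862933159"
records = list(parse_steps(sample))
print(records)  # [(578, 463, 0.03873724862933159)]
```

Because the wrapping in this dump splits records across physical lines, running `finditer` over the whole file contents (rather than line by line) recovers every record intact.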
2023-01-22 19:05:52.107826: step: 580/463, loss: 0.002991607878357172 2023-01-22 19:05:52.683271: step: 582/463, loss: 0.00812537595629692 2023-01-22 19:05:53.291073: step: 584/463, loss: 0.0028152144514024258 2023-01-22 19:05:53.893686: step: 586/463, loss: 0.02014043554663658 2023-01-22 19:05:54.443427: step: 588/463, loss: 0.0033074701204895973 2023-01-22 19:05:55.073420: step: 590/463, loss: 0.0054642269387841225 2023-01-22 19:05:55.642150: step: 592/463, loss: 0.004553141538053751 2023-01-22 19:05:56.267789: step: 594/463, loss: 0.005062703043222427 2023-01-22 19:05:56.834340: step: 596/463, loss: 0.42387309670448303 2023-01-22 19:05:57.467275: step: 598/463, loss: 0.002603686647489667 2023-01-22 19:05:58.062193: step: 600/463, loss: 0.0002646129869390279 2023-01-22 19:05:58.621011: step: 602/463, loss: 0.0009648936102166772 2023-01-22 19:05:59.242139: step: 604/463, loss: 0.01889690011739731 2023-01-22 19:05:59.848368: step: 606/463, loss: 0.007588464301079512 2023-01-22 19:06:00.469087: step: 608/463, loss: 0.008610591292381287 2023-01-22 19:06:01.122116: step: 610/463, loss: 0.0018239142373204231 2023-01-22 19:06:01.656208: step: 612/463, loss: 1.2675186553678941e-05 2023-01-22 19:06:02.210918: step: 614/463, loss: 0.014706745743751526 2023-01-22 19:06:02.785998: step: 616/463, loss: 8.246238394349348e-06 2023-01-22 19:06:03.397227: step: 618/463, loss: 0.0003675173793453723 2023-01-22 19:06:04.060336: step: 620/463, loss: 0.011796703562140465 2023-01-22 19:06:04.708803: step: 622/463, loss: 0.008284240961074829 2023-01-22 19:06:05.295382: step: 624/463, loss: 0.017200341448187828 2023-01-22 19:06:05.892577: step: 626/463, loss: 0.15119490027427673 2023-01-22 19:06:06.517426: step: 628/463, loss: 0.00020595072419382632 2023-01-22 19:06:07.115133: step: 630/463, loss: 0.015169610269367695 2023-01-22 19:06:07.672312: step: 632/463, loss: 0.00864847656339407 2023-01-22 19:06:08.220919: step: 634/463, loss: 0.0009307865984737873 2023-01-22 19:06:08.798324: 
step: 636/463, loss: 0.01495320163667202 2023-01-22 19:06:09.396924: step: 638/463, loss: 0.004342271946370602 2023-01-22 19:06:09.971595: step: 640/463, loss: 0.012519050389528275 2023-01-22 19:06:10.621404: step: 642/463, loss: 0.00926229078322649 2023-01-22 19:06:11.270107: step: 644/463, loss: 0.17586460709571838 2023-01-22 19:06:11.865450: step: 646/463, loss: 0.041533272713422775 2023-01-22 19:06:12.445280: step: 648/463, loss: 0.0011898735538125038 2023-01-22 19:06:13.049938: step: 650/463, loss: 0.024906329810619354 2023-01-22 19:06:13.616847: step: 652/463, loss: 0.0001968204596778378 2023-01-22 19:06:14.176638: step: 654/463, loss: 0.00019154422625433654 2023-01-22 19:06:14.731083: step: 656/463, loss: 0.0004527656128630042 2023-01-22 19:06:15.350509: step: 658/463, loss: 0.04931115731596947 2023-01-22 19:06:16.019345: step: 660/463, loss: 0.04182051867246628 2023-01-22 19:06:16.620693: step: 662/463, loss: 0.00974087230861187 2023-01-22 19:06:17.245579: step: 664/463, loss: 0.0026624242309480906 2023-01-22 19:06:17.863748: step: 666/463, loss: 1.9522829055786133 2023-01-22 19:06:18.457719: step: 668/463, loss: 9.954460256267339e-05 2023-01-22 19:06:19.040247: step: 670/463, loss: 0.009883292950689793 2023-01-22 19:06:19.734727: step: 672/463, loss: 0.038442935794591904 2023-01-22 19:06:20.380379: step: 674/463, loss: 0.01573793776333332 2023-01-22 19:06:21.067161: step: 676/463, loss: 0.001642202609218657 2023-01-22 19:06:21.699313: step: 678/463, loss: 0.007947596721351147 2023-01-22 19:06:22.338318: step: 680/463, loss: 0.002847308525815606 2023-01-22 19:06:22.945316: step: 682/463, loss: 0.004827558994293213 2023-01-22 19:06:23.548080: step: 684/463, loss: 0.00879442784935236 2023-01-22 19:06:24.202361: step: 686/463, loss: 0.0013853918062523007 2023-01-22 19:06:24.770989: step: 688/463, loss: 2.3848266209824942e-05 2023-01-22 19:06:25.520728: step: 690/463, loss: 0.0003184963425155729 2023-01-22 19:06:26.117778: step: 692/463, loss: 
0.00030207211966626346 2023-01-22 19:06:26.726818: step: 694/463, loss: 0.2322780340909958 2023-01-22 19:06:27.370080: step: 696/463, loss: 0.010288759134709835 2023-01-22 19:06:27.970993: step: 698/463, loss: 0.16671058535575867 2023-01-22 19:06:28.566812: step: 700/463, loss: 0.02031872235238552 2023-01-22 19:06:29.174540: step: 702/463, loss: 0.00022678067034576088 2023-01-22 19:06:29.732948: step: 704/463, loss: 0.004212798085063696 2023-01-22 19:06:30.326400: step: 706/463, loss: 0.015182940289378166 2023-01-22 19:06:30.915048: step: 708/463, loss: 0.00022876627917867154 2023-01-22 19:06:31.534710: step: 710/463, loss: 3.6824701965088025e-05 2023-01-22 19:06:32.141570: step: 712/463, loss: 0.10247209668159485 2023-01-22 19:06:32.771080: step: 714/463, loss: 0.012001020833849907 2023-01-22 19:06:33.362017: step: 716/463, loss: 0.0014442985411733389 2023-01-22 19:06:33.981459: step: 718/463, loss: 0.0008763174409978092 2023-01-22 19:06:34.565380: step: 720/463, loss: 0.00034498321474529803 2023-01-22 19:06:35.234419: step: 722/463, loss: 0.02556798979640007 2023-01-22 19:06:35.796798: step: 724/463, loss: 0.002236104104667902 2023-01-22 19:06:36.459724: step: 726/463, loss: 0.014791916124522686 2023-01-22 19:06:37.095552: step: 728/463, loss: 0.002261174377053976 2023-01-22 19:06:37.765235: step: 730/463, loss: 0.0016264747828245163 2023-01-22 19:06:38.400474: step: 732/463, loss: 0.009160919114947319 2023-01-22 19:06:39.032149: step: 734/463, loss: 0.03938758745789528 2023-01-22 19:06:39.684723: step: 736/463, loss: 0.024364300072193146 2023-01-22 19:06:40.296893: step: 738/463, loss: 2.8477115847636014e-05 2023-01-22 19:06:40.875032: step: 740/463, loss: 0.0002387808490311727 2023-01-22 19:06:41.571460: step: 742/463, loss: 0.00715411314740777 2023-01-22 19:06:42.107003: step: 744/463, loss: 0.00694757467135787 2023-01-22 19:06:42.706645: step: 746/463, loss: 0.0034912950359284878 2023-01-22 19:06:43.305128: step: 748/463, loss: 0.006020050961524248 2023-01-22 
19:06:43.900946: step: 750/463, loss: 0.00825294479727745 2023-01-22 19:06:44.471789: step: 752/463, loss: 0.000389638589695096 2023-01-22 19:06:45.083169: step: 754/463, loss: 0.028875520452857018 2023-01-22 19:06:45.675247: step: 756/463, loss: 0.12370839715003967 2023-01-22 19:06:46.314198: step: 758/463, loss: 0.002328355796635151 2023-01-22 19:06:46.933869: step: 760/463, loss: 0.06526339054107666 2023-01-22 19:06:47.535086: step: 762/463, loss: 0.0005695425788871944 2023-01-22 19:06:48.156414: step: 764/463, loss: 0.001476838137023151 2023-01-22 19:06:48.918610: step: 766/463, loss: 0.003480741521343589 2023-01-22 19:06:49.593079: step: 768/463, loss: 0.11688916385173798 2023-01-22 19:06:50.249838: step: 770/463, loss: 0.06980141997337341 2023-01-22 19:06:50.905790: step: 772/463, loss: 0.34632471203804016 2023-01-22 19:06:51.516455: step: 774/463, loss: 0.46453630924224854 2023-01-22 19:06:52.108726: step: 776/463, loss: 0.0016633502673357725 2023-01-22 19:06:52.688499: step: 778/463, loss: 0.0006041254382580519 2023-01-22 19:06:53.279530: step: 780/463, loss: 0.012992740608751774 2023-01-22 19:06:53.846598: step: 782/463, loss: 0.004174672067165375 2023-01-22 19:06:54.458316: step: 784/463, loss: 1.5383927822113037 2023-01-22 19:06:55.105672: step: 786/463, loss: 0.16006481647491455 2023-01-22 19:06:55.716721: step: 788/463, loss: 0.00018381157133262604 2023-01-22 19:06:56.344952: step: 790/463, loss: 0.007852508686482906 2023-01-22 19:06:56.963482: step: 792/463, loss: 0.013559629209339619 2023-01-22 19:06:57.564654: step: 794/463, loss: 0.0034026142675429583 2023-01-22 19:06:58.134251: step: 796/463, loss: 0.033831655979156494 2023-01-22 19:06:58.735896: step: 798/463, loss: 0.13187310099601746 2023-01-22 19:06:59.357058: step: 800/463, loss: 0.0013999352231621742 2023-01-22 19:06:59.918416: step: 802/463, loss: 0.0027970923110842705 2023-01-22 19:07:00.556366: step: 804/463, loss: 0.019234444946050644 2023-01-22 19:07:01.140287: step: 806/463, loss: 
0.0010194600326940417 2023-01-22 19:07:01.710421: step: 808/463, loss: 0.0007224015425890684 2023-01-22 19:07:02.354801: step: 810/463, loss: 0.014147120527923107 2023-01-22 19:07:02.969657: step: 812/463, loss: 0.017070811241865158 2023-01-22 19:07:03.618601: step: 814/463, loss: 0.0003064287011511624 2023-01-22 19:07:04.178027: step: 816/463, loss: 0.0009792183991521597 2023-01-22 19:07:04.752110: step: 818/463, loss: 0.020804140716791153 2023-01-22 19:07:05.342516: step: 820/463, loss: 0.005913062486797571 2023-01-22 19:07:05.969448: step: 822/463, loss: 0.000843966961838305 2023-01-22 19:07:06.576895: step: 824/463, loss: 0.002937337150797248 2023-01-22 19:07:07.174926: step: 826/463, loss: 0.7647182941436768 2023-01-22 19:07:07.813999: step: 828/463, loss: 0.009788138791918755 2023-01-22 19:07:08.395546: step: 830/463, loss: 0.0036282429937273264 2023-01-22 19:07:08.883796: step: 832/463, loss: 0.002347236964851618 2023-01-22 19:07:09.445779: step: 834/463, loss: 0.015613395720720291 2023-01-22 19:07:10.059162: step: 836/463, loss: 0.000944484316278249 2023-01-22 19:07:10.644235: step: 838/463, loss: 0.040422726422548294 2023-01-22 19:07:11.302706: step: 840/463, loss: 0.01618790812790394 2023-01-22 19:07:11.904392: step: 842/463, loss: 0.04599295184016228 2023-01-22 19:07:12.584883: step: 844/463, loss: 0.01670876331627369 2023-01-22 19:07:13.190376: step: 846/463, loss: 3.312061744509265e-05 2023-01-22 19:07:13.780511: step: 848/463, loss: 0.014650463126599789 2023-01-22 19:07:14.352123: step: 850/463, loss: 0.002339577069506049 2023-01-22 19:07:14.956302: step: 852/463, loss: 0.03575816750526428 2023-01-22 19:07:15.518433: step: 854/463, loss: 0.00019467574020382017 2023-01-22 19:07:16.160114: step: 856/463, loss: 0.11702123284339905 2023-01-22 19:07:16.775733: step: 858/463, loss: 0.01167676318436861 2023-01-22 19:07:17.392372: step: 860/463, loss: 0.042015738785266876 2023-01-22 19:07:18.016603: step: 862/463, loss: 0.0005903188139200211 2023-01-22 
19:07:18.554677: step: 864/463, loss: 0.0009963897755369544 2023-01-22 19:07:19.098904: step: 866/463, loss: 8.39572967379354e-05 2023-01-22 19:07:19.701757: step: 868/463, loss: 0.001567073748447001 2023-01-22 19:07:20.388269: step: 870/463, loss: 0.0008154552197083831 2023-01-22 19:07:21.023299: step: 872/463, loss: 0.010500306263566017 2023-01-22 19:07:21.609145: step: 874/463, loss: 0.0006256489432416856 2023-01-22 19:07:22.215194: step: 876/463, loss: 0.0048181707970798016 2023-01-22 19:07:22.827955: step: 878/463, loss: 0.009716312400996685 2023-01-22 19:07:23.496921: step: 880/463, loss: 0.03686812147498131 2023-01-22 19:07:24.118898: step: 882/463, loss: 0.0019368560751900077 2023-01-22 19:07:24.812243: step: 884/463, loss: 0.036908797919750214 2023-01-22 19:07:25.331316: step: 886/463, loss: 0.00011586022446863353 2023-01-22 19:07:25.967816: step: 888/463, loss: 0.012970784679055214 2023-01-22 19:07:26.578196: step: 890/463, loss: 0.004012468736618757 2023-01-22 19:07:27.166653: step: 892/463, loss: 1.542651653289795 2023-01-22 19:07:27.796686: step: 894/463, loss: 0.01246715523302555 2023-01-22 19:07:28.442137: step: 896/463, loss: 0.011878938414156437 2023-01-22 19:07:29.095392: step: 898/463, loss: 0.00016034110740292817 2023-01-22 19:07:29.834668: step: 900/463, loss: 0.021903716027736664 2023-01-22 19:07:30.425516: step: 902/463, loss: 5.89243572903797e-05 2023-01-22 19:07:31.048873: step: 904/463, loss: 0.00909152626991272 2023-01-22 19:07:31.634930: step: 906/463, loss: 0.005393726751208305 2023-01-22 19:07:32.207334: step: 908/463, loss: 1.904344026115723e-05 2023-01-22 19:07:32.885080: step: 910/463, loss: 0.005100796464830637 2023-01-22 19:07:33.479142: step: 912/463, loss: 0.000634519848972559 2023-01-22 19:07:34.075167: step: 914/463, loss: 0.03214986249804497 2023-01-22 19:07:34.721836: step: 916/463, loss: 0.01719917543232441 2023-01-22 19:07:35.363082: step: 918/463, loss: 0.029929205775260925 2023-01-22 19:07:35.934177: step: 920/463, loss: 
0.006227471400052309 2023-01-22 19:07:36.535575: step: 922/463, loss: 0.02984226867556572 2023-01-22 19:07:37.229855: step: 924/463, loss: 0.17771632969379425 2023-01-22 19:07:37.831393: step: 926/463, loss: 0.0005947013269178569
==================================================
Loss: 0.049
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2869111570247934, 'r': 0.3293761859582543, 'f1': 0.3066806537102474}, 'combined': 0.22597521852334015, 'epoch': 37}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.3410820934155749, 'r': 0.3241172772509259, 'f1': 0.3323833554626945}, 'combined': 0.23383753650641825, 'epoch': 37}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2930072156196944, 'r': 0.3274786527514232, 'f1': 0.30928539426523294}, 'combined': 0.22789450103754005, 'epoch': 37}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3393963800684782, 'r': 0.3177768898895961, 'f1': 0.3282310192099839}, 'combined': 0.23304402363908855, 'epoch': 37}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30477045125482627, 'r': 0.3423607346164272, 'f1': 0.32247382867356056}, 'combined': 0.23761229481209725, 'epoch': 37}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3629923974553131, 'r': 0.3047108955951232, 'f1': 0.33130805156737303}, 'combined': 0.23522871661283484, 'epoch': 37}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3049645390070922, 'r': 0.4095238095238095, 'f1': 0.34959349593495936}, 'combined': 0.23306233062330622, 'epoch': 37}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2569444444444444, 'r': 0.40217391304347827, 'f1': 0.31355932203389825}, 'combined': 0.15677966101694912, 'epoch': 37}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 37}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30477045125482627, 'r': 0.3423607346164272, 'f1': 0.32247382867356056}, 'combined': 0.23761229481209725, 'epoch': 37}
Test for Russian: {'template':
{'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3629923974553131, 'r': 0.3047108955951232, 'f1': 0.33130805156737303}, 'combined': 0.23522871661283484, 'epoch': 37} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 37} ****************************** Epoch: 38 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 19:10:20.309294: step: 2/463, loss: 0.006746386643499136 2023-01-22 19:10:20.946159: step: 4/463, loss: 0.0031352669466286898 2023-01-22 19:10:21.571830: step: 6/463, loss: 0.01404573954641819 2023-01-22 19:10:22.202525: step: 8/463, loss: 0.0024750789161771536 2023-01-22 19:10:22.792837: step: 10/463, loss: 0.001432015560567379 2023-01-22 19:10:23.366781: step: 12/463, loss: 0.000657000346109271 2023-01-22 19:10:23.997622: step: 14/463, loss: 0.0044289883226156235 2023-01-22 19:10:24.601473: step: 16/463, loss: 0.0026413460727781057 2023-01-22 19:10:25.264048: step: 18/463, loss: 0.048979826271533966 2023-01-22 19:10:25.841800: step: 20/463, loss: 0.0003209237474948168 2023-01-22 19:10:26.389541: step: 22/463, loss: 0.0166185200214386 2023-01-22 19:10:26.992112: step: 24/463, loss: 7.029924017842859e-05 2023-01-22 19:10:27.607014: step: 26/463, loss: 0.006621234584599733 2023-01-22 19:10:28.153882: step: 28/463, loss: 0.00010080239007947966 2023-01-22 19:10:28.823159: step: 30/463, loss: 0.13674984872341156 2023-01-22 19:10:29.419818: step: 32/463, loss: 0.002081488724797964 2023-01-22 19:10:30.036462: step: 34/463, loss: 1.3824112102156505e-05 2023-01-22 19:10:30.623538: step: 36/463, loss: 0.0017340844497084618 2023-01-22 19:10:31.285534: step: 38/463, loss: 0.011636920273303986 2023-01-22 19:10:31.901038: step: 
40/463, loss: 0.0009110919781960547 2023-01-22 19:10:32.483369: step: 42/463, loss: 0.020231368020176888 2023-01-22 19:10:33.117179: step: 44/463, loss: 0.00041259583667851985 2023-01-22 19:10:33.681249: step: 46/463, loss: 0.01572292670607567 2023-01-22 19:10:34.285977: step: 48/463, loss: 0.08703038096427917 2023-01-22 19:10:34.923680: step: 50/463, loss: 0.0008654529228806496 2023-01-22 19:10:35.499180: step: 52/463, loss: 0.04329479858279228 2023-01-22 19:10:36.103246: step: 54/463, loss: 0.0011172551894560456 2023-01-22 19:10:36.678323: step: 56/463, loss: 0.0006390108028426766 2023-01-22 19:10:37.305903: step: 58/463, loss: 0.012935356236994267 2023-01-22 19:10:37.955704: step: 60/463, loss: 0.003911642823368311 2023-01-22 19:10:38.556082: step: 62/463, loss: 0.009291069582104683 2023-01-22 19:10:39.101210: step: 64/463, loss: 0.15196523070335388 2023-01-22 19:10:39.736084: step: 66/463, loss: 0.0039484635926783085 2023-01-22 19:10:40.316094: step: 68/463, loss: 0.031240791082382202 2023-01-22 19:10:40.929370: step: 70/463, loss: 0.00041737788706086576 2023-01-22 19:10:41.515120: step: 72/463, loss: 0.004021356347948313 2023-01-22 19:10:42.119054: step: 74/463, loss: 0.0003060698218178004 2023-01-22 19:10:42.751243: step: 76/463, loss: 0.00017851039592642337 2023-01-22 19:10:43.331636: step: 78/463, loss: 4.924087079416495e-06 2023-01-22 19:10:43.867226: step: 80/463, loss: 0.03483670949935913 2023-01-22 19:10:44.437744: step: 82/463, loss: 0.0038260850124061108 2023-01-22 19:10:45.014101: step: 84/463, loss: 0.0033600616734474897 2023-01-22 19:10:45.643893: step: 86/463, loss: 0.13529902696609497 2023-01-22 19:10:46.245922: step: 88/463, loss: 0.00015856024401728064 2023-01-22 19:10:46.832655: step: 90/463, loss: 0.0031417077407240868 2023-01-22 19:10:47.501852: step: 92/463, loss: 8.040045213419944e-05 2023-01-22 19:10:48.077495: step: 94/463, loss: 0.0011113312793895602 2023-01-22 19:10:48.676252: step: 96/463, loss: 0.004745126701891422 2023-01-22 
19:10:49.262293: step: 98/463, loss: 0.003652101382613182 2023-01-22 19:10:49.982942: step: 100/463, loss: 0.023657301440835 2023-01-22 19:10:50.663273: step: 102/463, loss: 0.001118181156925857 2023-01-22 19:10:51.256723: step: 104/463, loss: 0.009054522030055523 2023-01-22 19:10:52.023006: step: 106/463, loss: 0.000484171585412696 2023-01-22 19:10:52.611484: step: 108/463, loss: 0.002362383995205164 2023-01-22 19:10:53.254662: step: 110/463, loss: 0.0034026249777525663 2023-01-22 19:10:53.804851: step: 112/463, loss: 0.003150507342070341 2023-01-22 19:10:54.397907: step: 114/463, loss: 0.00013333545939531177 2023-01-22 19:10:54.981680: step: 116/463, loss: 0.0006767200538888574 2023-01-22 19:10:55.528858: step: 118/463, loss: 0.0017028190195560455 2023-01-22 19:10:56.168061: step: 120/463, loss: 0.04836149886250496 2023-01-22 19:10:56.695393: step: 122/463, loss: 0.009175170212984085 2023-01-22 19:10:57.337121: step: 124/463, loss: 0.0002815091283991933 2023-01-22 19:10:58.051597: step: 126/463, loss: 0.2115914672613144 2023-01-22 19:10:58.620800: step: 128/463, loss: 0.008026933297514915 2023-01-22 19:10:59.212070: step: 130/463, loss: 0.00773219857364893 2023-01-22 19:10:59.861497: step: 132/463, loss: 0.007410572376102209 2023-01-22 19:11:00.501398: step: 134/463, loss: 0.016471978276968002 2023-01-22 19:11:01.131425: step: 136/463, loss: 0.010331208817660809 2023-01-22 19:11:01.696267: step: 138/463, loss: 0.08242560178041458 2023-01-22 19:11:02.277246: step: 140/463, loss: 0.00022398405417334288 2023-01-22 19:11:02.812828: step: 142/463, loss: 0.0037055369466543198 2023-01-22 19:11:03.396246: step: 144/463, loss: 0.012694220058619976 2023-01-22 19:11:03.955532: step: 146/463, loss: 0.0011813599849119782 2023-01-22 19:11:04.560293: step: 148/463, loss: 0.004951213952153921 2023-01-22 19:11:05.118766: step: 150/463, loss: 0.00654065515846014 2023-01-22 19:11:05.731236: step: 152/463, loss: 0.0009564876090735197 2023-01-22 19:11:06.322860: step: 154/463, loss: 
0.011436395347118378 2023-01-22 19:11:06.942049: step: 156/463, loss: 0.005748507101088762 2023-01-22 19:11:07.519561: step: 158/463, loss: 0.007028175983577967 2023-01-22 19:11:08.076053: step: 160/463, loss: 0.014428318478167057 2023-01-22 19:11:08.680110: step: 162/463, loss: 0.12192299216985703 2023-01-22 19:11:09.277677: step: 164/463, loss: 0.015920933336019516 2023-01-22 19:11:09.880026: step: 166/463, loss: 0.000891414878424257 2023-01-22 19:11:10.451107: step: 168/463, loss: 0.06969546526670456 2023-01-22 19:11:11.084748: step: 170/463, loss: 0.012356922961771488 2023-01-22 19:11:11.658182: step: 172/463, loss: 0.022439956665039062 2023-01-22 19:11:12.246937: step: 174/463, loss: 0.004886950831860304 2023-01-22 19:11:12.906434: step: 176/463, loss: 0.0006983957136981189 2023-01-22 19:11:13.522968: step: 178/463, loss: 0.0004801067407242954 2023-01-22 19:11:14.158366: step: 180/463, loss: 0.0013559977523982525 2023-01-22 19:11:14.825323: step: 182/463, loss: 0.002655019983649254 2023-01-22 19:11:15.428108: step: 184/463, loss: 0.0005117713590152562 2023-01-22 19:11:16.036209: step: 186/463, loss: 0.006484409794211388 2023-01-22 19:11:16.662577: step: 188/463, loss: 0.002285390393808484 2023-01-22 19:11:17.187453: step: 190/463, loss: 0.00036972356610931456 2023-01-22 19:11:17.791997: step: 192/463, loss: 0.0004971589078195393 2023-01-22 19:11:18.368436: step: 194/463, loss: 0.002160158008337021 2023-01-22 19:11:18.941660: step: 196/463, loss: 0.003843221813440323 2023-01-22 19:11:19.555268: step: 198/463, loss: 0.0014152834191918373 2023-01-22 19:11:20.112222: step: 200/463, loss: 0.013148820027709007 2023-01-22 19:11:20.675102: step: 202/463, loss: 0.005296451039612293 2023-01-22 19:11:21.277014: step: 204/463, loss: 0.058996014297008514 2023-01-22 19:11:21.866044: step: 206/463, loss: 0.0016590056475251913 2023-01-22 19:11:22.432321: step: 208/463, loss: 0.00018813587666954845 2023-01-22 19:11:23.012937: step: 210/463, loss: 0.008485142141580582 
2023-01-22 19:11:23.635605: step: 212/463, loss: 0.0237103458493948 2023-01-22 19:11:24.234296: step: 214/463, loss: 0.035315703600645065 2023-01-22 19:11:24.810313: step: 216/463, loss: 0.01367004681378603 2023-01-22 19:11:25.413501: step: 218/463, loss: 0.03292296454310417 2023-01-22 19:11:26.067140: step: 220/463, loss: 0.021760089322924614 2023-01-22 19:11:26.647191: step: 222/463, loss: 0.016635950654745102 2023-01-22 19:11:27.256247: step: 224/463, loss: 1.5777399312355556e-05 2023-01-22 19:11:27.964212: step: 226/463, loss: 0.007048322353512049 2023-01-22 19:11:28.580932: step: 228/463, loss: 0.02976927161216736 2023-01-22 19:11:29.151379: step: 230/463, loss: 0.0005775204626843333 2023-01-22 19:11:29.743331: step: 232/463, loss: 0.014299589209258556 2023-01-22 19:11:30.383784: step: 234/463, loss: 0.002022725762799382 2023-01-22 19:11:31.024593: step: 236/463, loss: 0.00045913970097899437 2023-01-22 19:11:31.574766: step: 238/463, loss: 0.00012865292956121266 2023-01-22 19:11:32.168738: step: 240/463, loss: 0.0003918398870155215 2023-01-22 19:11:32.808319: step: 242/463, loss: 0.02438717521727085 2023-01-22 19:11:33.446275: step: 244/463, loss: 0.009819227270781994 2023-01-22 19:11:34.070195: step: 246/463, loss: 0.021382570266723633 2023-01-22 19:11:34.676317: step: 248/463, loss: 0.011360648088157177 2023-01-22 19:11:35.310355: step: 250/463, loss: 0.019919678568840027 2023-01-22 19:11:35.929756: step: 252/463, loss: 0.04069416970014572 2023-01-22 19:11:36.743426: step: 254/463, loss: 0.3088483512401581 2023-01-22 19:11:37.361133: step: 256/463, loss: 0.004116449970752001 2023-01-22 19:11:37.954993: step: 258/463, loss: 0.010726540349423885 2023-01-22 19:11:38.584560: step: 260/463, loss: 0.00030139245791360736 2023-01-22 19:11:39.208509: step: 262/463, loss: 0.0005882786936126649 2023-01-22 19:11:39.839777: step: 264/463, loss: 0.009292826056480408 2023-01-22 19:11:40.424683: step: 266/463, loss: 0.009715428575873375 2023-01-22 19:11:40.984559: step: 
268/463, loss: 0.003091690596193075 2023-01-22 19:11:41.593369: step: 270/463, loss: 0.005568896885961294 2023-01-22 19:11:42.256242: step: 272/463, loss: 0.039929863065481186 2023-01-22 19:11:42.865362: step: 274/463, loss: 0.005067629739642143 2023-01-22 19:11:43.435577: step: 276/463, loss: 0.035672351717948914 2023-01-22 19:11:44.031494: step: 278/463, loss: 0.0015454042004421353 2023-01-22 19:11:44.624812: step: 280/463, loss: 0.0034804092720150948 2023-01-22 19:11:45.202541: step: 282/463, loss: 0.008348464965820312 2023-01-22 19:11:45.817524: step: 284/463, loss: 0.00045176525600254536 2023-01-22 19:11:46.364564: step: 286/463, loss: 7.17291040928103e-05 2023-01-22 19:11:47.002946: step: 288/463, loss: 0.01568594202399254 2023-01-22 19:11:47.655803: step: 290/463, loss: 0.04432795196771622 2023-01-22 19:11:48.267559: step: 292/463, loss: 0.05577278137207031 2023-01-22 19:11:48.897209: step: 294/463, loss: 0.01335665863007307 2023-01-22 19:11:49.460599: step: 296/463, loss: 0.004177842754870653 2023-01-22 19:11:50.036716: step: 298/463, loss: 0.00032572110649198294 2023-01-22 19:11:50.675037: step: 300/463, loss: 0.001511085545644164 2023-01-22 19:11:51.349711: step: 302/463, loss: 0.0066068521700799465 2023-01-22 19:11:51.973860: step: 304/463, loss: 0.046314455568790436 2023-01-22 19:11:52.630070: step: 306/463, loss: 0.004831632133573294 2023-01-22 19:11:53.217915: step: 308/463, loss: 0.023934412747621536 2023-01-22 19:11:53.808199: step: 310/463, loss: 0.007402149960398674 2023-01-22 19:11:54.371785: step: 312/463, loss: 0.0038931716699153185 2023-01-22 19:11:54.969175: step: 314/463, loss: 0.029209831729531288 2023-01-22 19:11:55.570534: step: 316/463, loss: 0.045786790549755096 2023-01-22 19:11:56.106407: step: 318/463, loss: 0.0003469511866569519 2023-01-22 19:11:56.703390: step: 320/463, loss: 0.04305526986718178 2023-01-22 19:11:57.346765: step: 322/463, loss: 0.06162300333380699 2023-01-22 19:11:57.992400: step: 324/463, loss: 0.011697503738105297 
2023-01-22 19:11:58.674934: step: 326/463, loss: 0.02581833302974701 2023-01-22 19:11:59.268524: step: 328/463, loss: 0.0033242045901715755 2023-01-22 19:11:59.865541: step: 330/463, loss: 0.0012335831997916102 2023-01-22 19:12:00.499794: step: 332/463, loss: 0.004426132421940565 2023-01-22 19:12:01.170134: step: 334/463, loss: 0.018044328317046165 2023-01-22 19:12:01.808193: step: 336/463, loss: 0.002944071777164936 2023-01-22 19:12:02.427074: step: 338/463, loss: 0.0022902057971805334 2023-01-22 19:12:03.080835: step: 340/463, loss: 1.260993849427905e-05 2023-01-22 19:12:03.700490: step: 342/463, loss: 0.08599743992090225 2023-01-22 19:12:04.247708: step: 344/463, loss: 0.0003688898286782205 2023-01-22 19:12:04.869857: step: 346/463, loss: 0.015205108560621738 2023-01-22 19:12:05.487765: step: 348/463, loss: 0.020361071452498436 2023-01-22 19:12:06.057813: step: 350/463, loss: 0.0556219145655632 2023-01-22 19:12:06.746787: step: 352/463, loss: 0.0007032107678242028 2023-01-22 19:12:07.345017: step: 354/463, loss: 0.0008069656323641539 2023-01-22 19:12:07.953868: step: 356/463, loss: 0.178666889667511 2023-01-22 19:12:08.547515: step: 358/463, loss: 0.0030103803146630526 2023-01-22 19:12:09.166518: step: 360/463, loss: 0.03296664357185364 2023-01-22 19:12:09.785325: step: 362/463, loss: 0.004225002136081457 2023-01-22 19:12:10.415791: step: 364/463, loss: 0.003328059334307909 2023-01-22 19:12:10.980985: step: 366/463, loss: 0.010784521698951721 2023-01-22 19:12:11.599846: step: 368/463, loss: 0.013562551699578762 2023-01-22 19:12:12.200693: step: 370/463, loss: 0.005788852460682392 2023-01-22 19:12:12.780667: step: 372/463, loss: 0.014748837798833847 2023-01-22 19:12:13.468621: step: 374/463, loss: 0.0018124342896044254 2023-01-22 19:12:14.105966: step: 376/463, loss: 0.17702598869800568 2023-01-22 19:12:14.768988: step: 378/463, loss: 1.6585954654146917e-05 2023-01-22 19:12:15.370598: step: 380/463, loss: 0.0811033621430397 2023-01-22 19:12:16.060766: step: 
382/463, loss: 0.0019142479868605733 2023-01-22 19:12:16.707405: step: 384/463, loss: 0.0006436016410589218 2023-01-22 19:12:17.321748: step: 386/463, loss: 0.006967521272599697 2023-01-22 19:12:17.916552: step: 388/463, loss: 0.10297872126102448 2023-01-22 19:12:18.562838: step: 390/463, loss: 0.0049799648113548756 2023-01-22 19:12:19.243129: step: 392/463, loss: 0.0007665951270610094 2023-01-22 19:12:19.850205: step: 394/463, loss: 0.006818701513111591 2023-01-22 19:12:20.425382: step: 396/463, loss: 0.015515086241066456 2023-01-22 19:12:21.075795: step: 398/463, loss: 0.042736224830150604 2023-01-22 19:12:21.752534: step: 400/463, loss: 0.000497344124596566 2023-01-22 19:12:22.326420: step: 402/463, loss: 5.0855767767643556e-05 2023-01-22 19:12:23.052976: step: 404/463, loss: 0.01187044195830822 2023-01-22 19:12:23.643706: step: 406/463, loss: 0.0012635558377951384 2023-01-22 19:12:24.261018: step: 408/463, loss: 0.022774986922740936 2023-01-22 19:12:24.803199: step: 410/463, loss: 1.7997066606767476e-05 2023-01-22 19:12:25.429979: step: 412/463, loss: 0.003161196131259203 2023-01-22 19:12:26.006324: step: 414/463, loss: 0.0005029952735640109 2023-01-22 19:12:26.615170: step: 416/463, loss: 0.00325025524944067 2023-01-22 19:12:27.259977: step: 418/463, loss: 0.0008132705115713179 2023-01-22 19:12:27.880987: step: 420/463, loss: 0.016347838565707207 2023-01-22 19:12:28.511762: step: 422/463, loss: 0.029815444722771645 2023-01-22 19:12:29.099698: step: 424/463, loss: 0.034186650067567825 2023-01-22 19:12:29.750940: step: 426/463, loss: 0.030626997351646423 2023-01-22 19:12:30.334690: step: 428/463, loss: 0.012777779251337051 2023-01-22 19:12:30.976955: step: 430/463, loss: 0.01015054527670145 2023-01-22 19:12:31.594894: step: 432/463, loss: 0.0024666087701916695 2023-01-22 19:12:32.195568: step: 434/463, loss: 0.01794361136853695 2023-01-22 19:12:32.798251: step: 436/463, loss: 0.03955583646893501 2023-01-22 19:12:33.372563: step: 438/463, loss: 
0.004127271007746458 2023-01-22 19:12:33.945955: step: 440/463, loss: 0.0017556400271132588 2023-01-22 19:12:34.514868: step: 442/463, loss: 0.05451272428035736 2023-01-22 19:12:35.104541: step: 444/463, loss: 0.13425898551940918 2023-01-22 19:12:35.665304: step: 446/463, loss: 0.00013624416897073388 2023-01-22 19:12:36.249206: step: 448/463, loss: 0.0013248012401163578 2023-01-22 19:12:36.825383: step: 450/463, loss: 0.00796592142432928 2023-01-22 19:12:37.439735: step: 452/463, loss: 0.04715524613857269 2023-01-22 19:12:38.149809: step: 454/463, loss: 0.000616908073425293 2023-01-22 19:12:38.744219: step: 456/463, loss: 0.004847410600632429 2023-01-22 19:12:39.388742: step: 458/463, loss: 0.030315343290567398 2023-01-22 19:12:39.981385: step: 460/463, loss: 0.0006381303537636995 2023-01-22 19:12:40.497437: step: 462/463, loss: 0.00039535219548270106 2023-01-22 19:12:41.081879: step: 464/463, loss: 0.034011077135801315 2023-01-22 19:12:41.731223: step: 466/463, loss: 0.005730907898396254 2023-01-22 19:12:42.375621: step: 468/463, loss: 0.0019133588066324592 2023-01-22 19:12:42.951368: step: 470/463, loss: 0.0011946476297453046 2023-01-22 19:12:43.542381: step: 472/463, loss: 0.0016516479663550854 2023-01-22 19:12:44.179758: step: 474/463, loss: 0.03693294897675514 2023-01-22 19:12:44.799065: step: 476/463, loss: 0.00038448491250164807 2023-01-22 19:12:45.351970: step: 478/463, loss: 0.0055732461623847485 2023-01-22 19:12:45.930298: step: 480/463, loss: 0.000408959633205086 2023-01-22 19:12:46.497684: step: 482/463, loss: 0.023933470249176025 2023-01-22 19:12:47.012224: step: 484/463, loss: 0.00013412396947387606 2023-01-22 19:12:47.603137: step: 486/463, loss: 0.001528188120573759 2023-01-22 19:12:48.218279: step: 488/463, loss: 0.0009825716260820627 2023-01-22 19:12:48.890530: step: 490/463, loss: 6.716891221003607e-06 2023-01-22 19:12:49.549534: step: 492/463, loss: 0.02886100299656391 2023-01-22 19:12:50.126486: step: 494/463, loss: 0.01908118836581707 
2023-01-22 19:12:50.781017: step: 496/463, loss: 0.0003038989962078631 2023-01-22 19:12:51.426734: step: 498/463, loss: 0.021594882011413574 2023-01-22 19:12:52.070639: step: 500/463, loss: 0.06992525607347488 2023-01-22 19:12:52.718179: step: 502/463, loss: 0.02198343724012375 2023-01-22 19:12:53.336794: step: 504/463, loss: 0.0033501831348985434 2023-01-22 19:12:53.943108: step: 506/463, loss: 0.005412927363067865 2023-01-22 19:12:54.559790: step: 508/463, loss: 0.0015021697618067265 2023-01-22 19:12:55.165441: step: 510/463, loss: 8.666652138344944e-05 2023-01-22 19:12:55.716189: step: 512/463, loss: 0.00015826222079340369 2023-01-22 19:12:56.303337: step: 514/463, loss: 0.0005814137402921915 2023-01-22 19:12:56.895251: step: 516/463, loss: 0.003474871162325144 2023-01-22 19:12:57.514397: step: 518/463, loss: 0.0029067459981888533 2023-01-22 19:12:58.014293: step: 520/463, loss: 0.00010252672655042261 2023-01-22 19:12:58.598084: step: 522/463, loss: 0.03940977901220322 2023-01-22 19:12:59.166281: step: 524/463, loss: 0.09102381020784378 2023-01-22 19:12:59.763857: step: 526/463, loss: 0.005426268558949232 2023-01-22 19:13:00.380598: step: 528/463, loss: 2.500007758499123e-05 2023-01-22 19:13:00.986082: step: 530/463, loss: 0.00011044167331419885 2023-01-22 19:13:01.599932: step: 532/463, loss: 0.00030152901308611035 2023-01-22 19:13:02.232107: step: 534/463, loss: 0.002793220803141594 2023-01-22 19:13:02.820162: step: 536/463, loss: 0.0003889525542035699 2023-01-22 19:13:03.492951: step: 538/463, loss: 0.01775766909122467 2023-01-22 19:13:04.059214: step: 540/463, loss: 0.007488325238227844 2023-01-22 19:13:04.717041: step: 542/463, loss: 0.00114177237264812 2023-01-22 19:13:05.327576: step: 544/463, loss: 0.0068650199100375175 2023-01-22 19:13:05.938840: step: 546/463, loss: 0.0018149109091609716 2023-01-22 19:13:06.575426: step: 548/463, loss: 0.00124483706895262 2023-01-22 19:13:07.187574: step: 550/463, loss: 2.3925944333313964e-05 2023-01-22 
19:13:07.877194: step: 552/463, loss: 0.058190152049064636 2023-01-22 19:13:08.528615: step: 554/463, loss: 0.011362412944436073 2023-01-22 19:13:09.147224: step: 556/463, loss: 0.14383170008659363 2023-01-22 19:13:09.795984: step: 558/463, loss: 0.00670391833409667 2023-01-22 19:13:10.388873: step: 560/463, loss: 0.00600038655102253 2023-01-22 19:13:10.994795: step: 562/463, loss: 0.009400787763297558 2023-01-22 19:13:11.608501: step: 564/463, loss: 0.047244854271411896 2023-01-22 19:13:12.167521: step: 566/463, loss: 0.05317766219377518 2023-01-22 19:13:12.809812: step: 568/463, loss: 0.006393760908395052 2023-01-22 19:13:13.385459: step: 570/463, loss: 0.003632148029282689 2023-01-22 19:13:14.023265: step: 572/463, loss: 0.0024984190240502357 2023-01-22 19:13:14.565443: step: 574/463, loss: 0.0031935039442032576 2023-01-22 19:13:15.201555: step: 576/463, loss: 0.0018289466388523579 2023-01-22 19:13:15.803310: step: 578/463, loss: 0.004927013069391251 2023-01-22 19:13:16.464713: step: 580/463, loss: 0.00825447216629982 2023-01-22 19:13:17.197341: step: 582/463, loss: 0.0003716855717357248 2023-01-22 19:13:17.812408: step: 584/463, loss: 0.0012661003274843097 2023-01-22 19:13:18.449483: step: 586/463, loss: 0.0006166067905724049 2023-01-22 19:13:19.083222: step: 588/463, loss: 0.002301298314705491 2023-01-22 19:13:19.743939: step: 590/463, loss: 0.03572594001889229 2023-01-22 19:13:20.281232: step: 592/463, loss: 0.002622630912810564 2023-01-22 19:13:20.938325: step: 594/463, loss: 0.008832047693431377 2023-01-22 19:13:21.583231: step: 596/463, loss: 0.0019754983950406313 2023-01-22 19:13:22.226774: step: 598/463, loss: 0.0001924394309753552 2023-01-22 19:13:22.845305: step: 600/463, loss: 0.14119470119476318 2023-01-22 19:13:23.470392: step: 602/463, loss: 0.00028454052517190576 2023-01-22 19:13:24.102441: step: 604/463, loss: 0.04671543091535568 2023-01-22 19:13:24.720736: step: 606/463, loss: 0.03849666938185692 2023-01-22 19:13:25.310824: step: 608/463, loss: 
0.005580467637628317 2023-01-22 19:13:25.946950: step: 610/463, loss: 0.022600675001740456 2023-01-22 19:13:26.609900: step: 612/463, loss: 0.009045332670211792 2023-01-22 19:13:27.223875: step: 614/463, loss: 0.025188252329826355 2023-01-22 19:13:27.848906: step: 616/463, loss: 0.00040330927004106343 2023-01-22 19:13:28.471331: step: 618/463, loss: 0.023502472788095474 2023-01-22 19:13:29.058888: step: 620/463, loss: 3.5873730666935444e-05 2023-01-22 19:13:29.669436: step: 622/463, loss: 0.0006118675810284913 2023-01-22 19:13:30.341634: step: 624/463, loss: 0.028348593041300774 2023-01-22 19:13:31.014112: step: 626/463, loss: 0.006027822382748127 2023-01-22 19:13:31.644296: step: 628/463, loss: 0.09027397632598877 2023-01-22 19:13:32.279969: step: 630/463, loss: 0.003426061477512121 2023-01-22 19:13:32.864890: step: 632/463, loss: 0.0012821012642234564 2023-01-22 19:13:33.562218: step: 634/463, loss: 0.000303239852655679 2023-01-22 19:13:34.168689: step: 636/463, loss: 0.0023394376039505005 2023-01-22 19:13:34.739520: step: 638/463, loss: 0.017693674191832542 2023-01-22 19:13:35.373283: step: 640/463, loss: 0.00011275022552581504 2023-01-22 19:13:35.967965: step: 642/463, loss: 0.0006492749089375138 2023-01-22 19:13:36.608776: step: 644/463, loss: 0.029278067871928215 2023-01-22 19:13:37.250752: step: 646/463, loss: 2.2290468215942383 2023-01-22 19:13:37.820301: step: 648/463, loss: 0.004585623275488615 2023-01-22 19:13:38.491611: step: 650/463, loss: 0.04739953950047493 2023-01-22 19:13:39.125262: step: 652/463, loss: 0.09578879177570343 2023-01-22 19:13:39.794643: step: 654/463, loss: 0.01951269991695881 2023-01-22 19:13:40.436520: step: 656/463, loss: 1.13214111328125 2023-01-22 19:13:40.960228: step: 658/463, loss: 0.004209664184600115 2023-01-22 19:13:41.571686: step: 660/463, loss: 0.015327269211411476 2023-01-22 19:13:42.245855: step: 662/463, loss: 0.0065514156594872475 2023-01-22 19:13:42.833950: step: 664/463, loss: 0.0013492980506271124 2023-01-22 
19:13:43.416264: step: 666/463, loss: 0.012007782235741615 2023-01-22 19:13:44.010296: step: 668/463, loss: 0.023900073021650314 2023-01-22 19:13:44.577414: step: 670/463, loss: 0.002312966389581561 2023-01-22 19:13:45.207307: step: 672/463, loss: 0.007426915690302849 2023-01-22 19:13:45.765781: step: 674/463, loss: 0.020562587305903435 2023-01-22 19:13:46.471338: step: 676/463, loss: 0.006336801219731569 2023-01-22 19:13:47.086554: step: 678/463, loss: 0.0010996608762070537 2023-01-22 19:13:47.694162: step: 680/463, loss: 0.003843034850433469 2023-01-22 19:13:48.354344: step: 682/463, loss: 0.0017897281795740128 2023-01-22 19:13:48.980760: step: 684/463, loss: 0.07990092039108276 2023-01-22 19:13:49.555638: step: 686/463, loss: 0.04249080643057823 2023-01-22 19:13:50.100143: step: 688/463, loss: 0.05301714316010475 2023-01-22 19:13:50.626534: step: 690/463, loss: 0.0009199282503686845 2023-01-22 19:13:51.247041: step: 692/463, loss: 0.006909001152962446 2023-01-22 19:13:51.773267: step: 694/463, loss: 0.0010311786318197846 2023-01-22 19:13:52.368625: step: 696/463, loss: 0.002054056851193309 2023-01-22 19:13:52.991098: step: 698/463, loss: 0.004385179840028286 2023-01-22 19:13:53.538627: step: 700/463, loss: 4.337991776992567e-05 2023-01-22 19:13:54.169989: step: 702/463, loss: 0.026126211509108543 2023-01-22 19:13:54.764889: step: 704/463, loss: 0.014166576787829399 2023-01-22 19:13:55.389003: step: 706/463, loss: 0.0015577052254229784 2023-01-22 19:13:56.051123: step: 708/463, loss: 0.058565158396959305 2023-01-22 19:13:56.726487: step: 710/463, loss: 0.015621548518538475 2023-01-22 19:13:57.371117: step: 712/463, loss: 0.01081937924027443 2023-01-22 19:13:57.967424: step: 714/463, loss: 0.5539243221282959 2023-01-22 19:13:58.635620: step: 716/463, loss: 0.015049549750983715 2023-01-22 19:13:59.190816: step: 718/463, loss: 0.0005825216649100184 2023-01-22 19:13:59.775248: step: 720/463, loss: 0.0001708117633825168 2023-01-22 19:14:00.393722: step: 722/463, loss: 
0.0012988595990464091 2023-01-22 19:14:01.024411: step: 724/463, loss: 0.0009494388359598815 2023-01-22 19:14:01.628258: step: 726/463, loss: 0.006202553398907185 2023-01-22 19:14:02.213566: step: 728/463, loss: 0.009330949746072292 2023-01-22 19:14:02.799978: step: 730/463, loss: 0.058258265256881714 2023-01-22 19:14:03.378720: step: 732/463, loss: 0.0020145936869084835 2023-01-22 19:14:03.999141: step: 734/463, loss: 0.18634866178035736 2023-01-22 19:14:04.572835: step: 736/463, loss: 0.004939565435051918 2023-01-22 19:14:05.197319: step: 738/463, loss: 0.015489058569073677 2023-01-22 19:14:05.898342: step: 740/463, loss: 0.003114770632237196 2023-01-22 19:14:06.511795: step: 742/463, loss: 0.019322045147418976 2023-01-22 19:14:07.107417: step: 744/463, loss: 0.016432438045740128 2023-01-22 19:14:07.738365: step: 746/463, loss: 0.0032481190282851458 2023-01-22 19:14:08.492398: step: 748/463, loss: 0.0017327797831967473 2023-01-22 19:14:09.086749: step: 750/463, loss: 0.0009434268577024341 2023-01-22 19:14:09.672941: step: 752/463, loss: 0.00456859078258276 2023-01-22 19:14:10.240131: step: 754/463, loss: 0.0009508858202025294 2023-01-22 19:14:10.831510: step: 756/463, loss: 0.001065373420715332 2023-01-22 19:14:11.425101: step: 758/463, loss: 0.00022229146270547062 2023-01-22 19:14:12.032481: step: 760/463, loss: 0.059398118406534195 2023-01-22 19:14:12.653064: step: 762/463, loss: 0.001906992052681744 2023-01-22 19:14:13.266896: step: 764/463, loss: 0.023267647251486778 2023-01-22 19:14:13.857494: step: 766/463, loss: 0.43855324387550354 2023-01-22 19:14:14.506624: step: 768/463, loss: 0.0037960836198180914 2023-01-22 19:14:15.157170: step: 770/463, loss: 0.009045098908245564 2023-01-22 19:14:15.810044: step: 772/463, loss: 0.004089029505848885 2023-01-22 19:14:16.497685: step: 774/463, loss: 0.0008124143932946026 2023-01-22 19:14:17.060062: step: 776/463, loss: 0.016408808529376984 2023-01-22 19:14:17.685346: step: 778/463, loss: 0.003563628066331148 2023-01-22 
19:14:18.264414: step: 780/463, loss: 0.06206468120217323 2023-01-22 19:14:18.950996: step: 782/463, loss: 0.0012219235068187118 2023-01-22 19:14:19.526235: step: 784/463, loss: 0.00014362319780047983 2023-01-22 19:14:20.165654: step: 786/463, loss: 0.00112239015288651 2023-01-22 19:14:20.734685: step: 788/463, loss: 0.001967805903404951 2023-01-22 19:14:21.286424: step: 790/463, loss: 1.0982622370647732e-05 2023-01-22 19:14:21.874989: step: 792/463, loss: 0.0013761479640379548 2023-01-22 19:14:22.511035: step: 794/463, loss: 0.015206113457679749 2023-01-22 19:14:23.107380: step: 796/463, loss: 0.0593472383916378 2023-01-22 19:14:23.702211: step: 798/463, loss: 0.0008406026754528284 2023-01-22 19:14:24.327882: step: 800/463, loss: 0.08602361381053925 2023-01-22 19:14:24.987714: step: 802/463, loss: 0.1042935773730278 2023-01-22 19:14:25.614552: step: 804/463, loss: 0.006354521960020065 2023-01-22 19:14:26.317901: step: 806/463, loss: 0.007068177219480276 2023-01-22 19:14:26.941225: step: 808/463, loss: 0.00024940792354755104 2023-01-22 19:14:27.586402: step: 810/463, loss: 9.864004823612049e-05 2023-01-22 19:14:28.210248: step: 812/463, loss: 0.01268360298126936 2023-01-22 19:14:28.790876: step: 814/463, loss: 0.0002035978832282126 2023-01-22 19:14:29.320104: step: 816/463, loss: 0.030074607580900192 2023-01-22 19:14:29.988440: step: 818/463, loss: 0.005834620911628008 2023-01-22 19:14:30.640725: step: 820/463, loss: 0.002605207497254014 2023-01-22 19:14:31.324160: step: 822/463, loss: 0.038838449865579605 2023-01-22 19:14:31.919249: step: 824/463, loss: 0.0003345948935020715 2023-01-22 19:14:32.472765: step: 826/463, loss: 0.00043221694068051875 2023-01-22 19:14:33.058620: step: 828/463, loss: 0.00620083324611187 2023-01-22 19:14:33.653668: step: 830/463, loss: 0.005476971622556448 2023-01-22 19:14:34.248470: step: 832/463, loss: 0.0027082893066108227 2023-01-22 19:14:34.921224: step: 834/463, loss: 0.008984465152025223 2023-01-22 19:14:35.550127: step: 836/463, 
loss: 0.01769418828189373 2023-01-22 19:14:36.193621: step: 838/463, loss: 0.0002761089417617768 2023-01-22 19:14:36.831126: step: 840/463, loss: 0.00034920877078548074 2023-01-22 19:14:37.461852: step: 842/463, loss: 5.912203505431535e-06 2023-01-22 19:14:38.141785: step: 844/463, loss: 0.008575237356126308 2023-01-22 19:14:38.774331: step: 846/463, loss: 0.00025965101667679846 2023-01-22 19:14:39.488100: step: 848/463, loss: 0.05152559280395508 2023-01-22 19:14:40.084580: step: 850/463, loss: 0.08940494805574417 2023-01-22 19:14:40.762088: step: 852/463, loss: 0.03182918578386307 2023-01-22 19:14:41.357777: step: 854/463, loss: 0.13736256957054138 2023-01-22 19:14:41.929954: step: 856/463, loss: 0.0013441493501886725 2023-01-22 19:14:42.519764: step: 858/463, loss: 0.003059338079765439 2023-01-22 19:14:43.157750: step: 860/463, loss: 0.0004014249425381422 2023-01-22 19:14:43.787698: step: 862/463, loss: 0.01205840241163969 2023-01-22 19:14:44.428081: step: 864/463, loss: 0.011809009127318859 2023-01-22 19:14:45.027755: step: 866/463, loss: 0.002571647521108389 2023-01-22 19:14:45.619755: step: 868/463, loss: 0.021793268620967865 2023-01-22 19:14:46.229430: step: 870/463, loss: 0.0011173386592417955 2023-01-22 19:14:46.925624: step: 872/463, loss: 7.2265191078186035 2023-01-22 19:14:47.576034: step: 874/463, loss: 0.010943396016955376 2023-01-22 19:14:48.235489: step: 876/463, loss: 0.00993053987622261 2023-01-22 19:14:48.842285: step: 878/463, loss: 1.795295611373149e-05 2023-01-22 19:14:49.461382: step: 880/463, loss: 0.00047202350106090307 2023-01-22 19:14:50.059654: step: 882/463, loss: 0.001368070486932993 2023-01-22 19:14:50.686614: step: 884/463, loss: 0.015549518167972565 2023-01-22 19:14:51.297361: step: 886/463, loss: 0.0043998598121106625 2023-01-22 19:14:51.974223: step: 888/463, loss: 8.740511839278042e-05 2023-01-22 19:14:52.554226: step: 890/463, loss: 0.00670892558991909 2023-01-22 19:14:53.172544: step: 892/463, loss: 0.0006243825773708522 
2023-01-22 19:14:53.771879: step: 894/463, loss: 0.2699730396270752 2023-01-22 19:14:54.385368: step: 896/463, loss: 0.011053579859435558 2023-01-22 19:14:54.998873: step: 898/463, loss: 0.0038830148987472057 2023-01-22 19:14:55.562017: step: 900/463, loss: 0.00017185910837724805 2023-01-22 19:14:56.238458: step: 902/463, loss: 0.13066858053207397 2023-01-22 19:14:56.810419: step: 904/463, loss: 0.0013839170569553971 2023-01-22 19:14:57.370059: step: 906/463, loss: 3.561679841368459e-05 2023-01-22 19:14:57.965605: step: 908/463, loss: 3.531894617481157e-05 2023-01-22 19:14:58.585620: step: 910/463, loss: 0.002987832296639681 2023-01-22 19:14:59.270341: step: 912/463, loss: 0.10551473498344421 2023-01-22 19:14:59.809297: step: 914/463, loss: 0.021022681146860123 2023-01-22 19:15:00.430600: step: 916/463, loss: 0.09170028567314148 2023-01-22 19:15:01.062968: step: 918/463, loss: 0.00470306258648634 2023-01-22 19:15:01.650270: step: 920/463, loss: 0.11887484043836594 2023-01-22 19:15:02.287301: step: 922/463, loss: 7.710747013334185e-05 2023-01-22 19:15:02.938534: step: 924/463, loss: 0.008562718518078327 2023-01-22 19:15:03.551524: step: 926/463, loss: 0.02811807207763195
==================================================
Loss: 0.043
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2906089193825043, 'r': 0.32148956356736247, 'f1': 0.3052702702702703}, 'combined': 0.22493598862019917, 'epoch': 38}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34813207894015946, 'r': 0.30681797533120514, 'f1': 0.3261719849068284}, 'combined': 0.2294677280751557, 'epoch': 38}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.294271758436945, 'r': 0.31437381404174575, 'f1': 0.3039908256880734}, 'combined': 0.22399323998068565, 'epoch': 38}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.34885104213982376, 'r': 0.3065014396442469, 'f1': 0.326307901806288}, 'combined': 0.23167861028246448, 'epoch': 38}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3013433221389822, 'r': 0.31621035510978585, 'f1': 0.30859788359788365}, 'combined': 0.22738791423001953, 'epoch': 38}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.36284164059171536, 'r': 0.29122398926094883, 'f1': 0.3231118873098706}, 'combined': 0.22940943999000812, 'epoch': 38}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2657407407407407, 'r': 0.3416666666666666, 'f1': 0.29895833333333327}, 'combined': 0.1993055555555555, 'epoch': 38}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2661290322580645, 'r': 0.358695652173913, 'f1': 0.30555555555555547}, 'combined': 0.15277777777777773, 'epoch': 38}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.421875, 'r': 0.23275862068965517, 'f1': 0.3}, 'combined': 0.19999999999999998, 'epoch': 38}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
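The 'combined' score in the metric dumps above is consistent with the product of the template F1 and the slot F1 (e.g. for Dev Chinese at epoch 38, 0.7368421052631579 × 0.3052702702702703 ≈ 0.22493598862019917), with each F1 being the usual harmonic mean of precision and recall. The sketch below reconstructs that relationship from the logged values only; the actual scorer in train.py is not shown in this log, so treat the function names and structure as assumptions.

```python
# Hypothetical reconstruction of the scoring relationship observed in this
# log; not the actual code from train.py.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """Product of template-level F1 and slot-level F1."""
    return template["f1"] * slot["f1"]

# Values copied from the 'Dev Chinese' entry at epoch 38 above.
dev_chinese = {
    "template": {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579},
    "slot": {"p": 0.2906089193825043, "r": 0.32148956356736247,
             "f1": 0.3052702702702703},
}

# Matches the logged 'combined' value 0.22493598862019917 to float precision.
print(combined_score(dev_chinese["template"], dev_chinese["slot"]))
```

Checking `f1(p, r)` against the logged precision/recall pairs reproduces the logged 'f1' fields the same way, which supports the inference.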
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30477045125482627, 'r': 0.3423607346164272, 'f1': 0.32247382867356056}, 'combined': 0.23761229481209725, 'epoch': 37}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3629923974553131, 'r': 0.3047108955951232, 'f1': 0.33130805156737303}, 'combined': 0.23522871661283484, 'epoch': 37}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.3017241379310345, 'f1': 0.3571428571428571}, 'combined': 0.23809523809523805, 'epoch': 37}
******************************
Epoch: 39
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 19:17:38.497677: step: 2/463, loss: 0.02631985954940319 2023-01-22 19:17:39.089899: step: 4/463, loss: 0.006363257300108671 2023-01-22 19:17:39.671993: step: 6/463, loss: 0.017346756532788277 2023-01-22 19:17:40.325033: step: 8/463, loss: 0.015194990672171116 2023-01-22 19:17:40.929493: step: 10/463, loss: 0.00019597719074226916 2023-01-22 19:17:41.582950: step: 12/463, loss:
0.007883941754698753 2023-01-22 19:17:42.164320: step: 14/463, loss: 0.008027291856706142 2023-01-22 19:17:42.908888: step: 16/463, loss: 0.012272590771317482 2023-01-22 19:17:43.458774: step: 18/463, loss: 0.0003710162127390504 2023-01-22 19:17:44.092482: step: 20/463, loss: 0.019811728969216347 2023-01-22 19:17:44.760197: step: 22/463, loss: 0.004291980993002653 2023-01-22 19:17:45.334875: step: 24/463, loss: 0.0021375122014433146 2023-01-22 19:17:45.987648: step: 26/463, loss: 0.00023569600307382643 2023-01-22 19:17:46.642801: step: 28/463, loss: 0.004171671811491251 2023-01-22 19:17:47.239340: step: 30/463, loss: 0.020365377888083458 2023-01-22 19:17:47.801776: step: 32/463, loss: 0.017604857683181763 2023-01-22 19:17:48.427343: step: 34/463, loss: 0.00012744884588755667 2023-01-22 19:17:49.045690: step: 36/463, loss: 0.001308182254433632 2023-01-22 19:17:49.624904: step: 38/463, loss: 0.003517588833346963 2023-01-22 19:17:50.152858: step: 40/463, loss: 0.0002848721342161298 2023-01-22 19:17:50.747821: step: 42/463, loss: 0.0011000847443938255 2023-01-22 19:17:51.347431: step: 44/463, loss: 0.00045837913057766855 2023-01-22 19:17:51.977271: step: 46/463, loss: 0.4493778944015503 2023-01-22 19:17:52.533623: step: 48/463, loss: 0.009390701539814472 2023-01-22 19:17:53.130448: step: 50/463, loss: 7.077543705236167e-05 2023-01-22 19:17:53.717241: step: 52/463, loss: 0.005414931569248438 2023-01-22 19:17:54.450907: step: 54/463, loss: 4.4053296733181924e-05 2023-01-22 19:17:55.045202: step: 56/463, loss: 0.00965932197868824 2023-01-22 19:17:55.669354: step: 58/463, loss: 0.0016875831643119454 2023-01-22 19:17:56.237568: step: 60/463, loss: 0.021097179502248764 2023-01-22 19:17:56.841296: step: 62/463, loss: 0.000496727239806205 2023-01-22 19:17:57.489498: step: 64/463, loss: 0.000394094007788226 2023-01-22 19:17:58.121126: step: 66/463, loss: 0.002842820482328534 2023-01-22 19:17:58.692145: step: 68/463, loss: 0.0001231045462191105 2023-01-22 19:17:59.228532: step: 
70/463, loss: 0.005539805628359318 2023-01-22 19:17:59.831434: step: 72/463, loss: 0.006031483877450228 2023-01-22 19:18:00.427070: step: 74/463, loss: 2.8178150387248024e-05 2023-01-22 19:18:01.004754: step: 76/463, loss: 0.0023891450837254524 2023-01-22 19:18:01.624233: step: 78/463, loss: 0.004083276726305485 2023-01-22 19:18:02.304714: step: 80/463, loss: 0.030821725726127625 2023-01-22 19:18:02.899137: step: 82/463, loss: 0.000346246175467968 2023-01-22 19:18:03.512437: step: 84/463, loss: 0.01562660001218319 2023-01-22 19:18:04.154812: step: 86/463, loss: 0.0008905378053896129 2023-01-22 19:18:04.738300: step: 88/463, loss: 0.00031145240063779056 2023-01-22 19:18:05.346426: step: 90/463, loss: 0.0016455911099910736 2023-01-22 19:18:05.924518: step: 92/463, loss: 0.0032473045866936445 2023-01-22 19:18:06.560188: step: 94/463, loss: 0.0016595879569649696 2023-01-22 19:18:07.180682: step: 96/463, loss: 0.0029695986304432154 2023-01-22 19:18:07.746162: step: 98/463, loss: 0.0004126555286347866 2023-01-22 19:18:08.295899: step: 100/463, loss: 0.017140023410320282 2023-01-22 19:18:08.884254: step: 102/463, loss: 0.0034870938397943974 2023-01-22 19:18:09.476409: step: 104/463, loss: 0.00019782333401963115 2023-01-22 19:18:10.045061: step: 106/463, loss: 0.000535413681063801 2023-01-22 19:18:10.673512: step: 108/463, loss: 0.0018165619112551212 2023-01-22 19:18:11.323715: step: 110/463, loss: 0.0011332063004374504 2023-01-22 19:18:11.941896: step: 112/463, loss: 0.008882497437298298 2023-01-22 19:18:12.540070: step: 114/463, loss: 1.5770771503448486 2023-01-22 19:18:13.152529: step: 116/463, loss: 0.0012396709062159061 2023-01-22 19:18:13.734098: step: 118/463, loss: 0.005441958084702492 2023-01-22 19:18:14.263417: step: 120/463, loss: 0.004889312200248241 2023-01-22 19:18:14.809643: step: 122/463, loss: 0.087395578622818 2023-01-22 19:18:15.372023: step: 124/463, loss: 0.0012065222254022956 2023-01-22 19:18:16.021837: step: 126/463, loss: 0.00503553869202733 
2023-01-22 19:18:16.590358: step: 128/463, loss: 0.10303381830453873 2023-01-22 19:18:17.194058: step: 130/463, loss: 0.0022102543152868748 2023-01-22 19:18:17.845807: step: 132/463, loss: 0.010673107579350471 2023-01-22 19:18:18.498320: step: 134/463, loss: 0.0009051923989318311 2023-01-22 19:18:19.094583: step: 136/463, loss: 0.01500641368329525 2023-01-22 19:18:19.705188: step: 138/463, loss: 0.005843086168169975 2023-01-22 19:18:20.381156: step: 140/463, loss: 0.005791195202618837 2023-01-22 19:18:20.971341: step: 142/463, loss: 0.000311277195578441 2023-01-22 19:18:21.606607: step: 144/463, loss: 0.0524728000164032 2023-01-22 19:18:22.240018: step: 146/463, loss: 0.013564004562795162 2023-01-22 19:18:22.880471: step: 148/463, loss: 0.046273693442344666 2023-01-22 19:18:23.535070: step: 150/463, loss: 0.008562915027141571 2023-01-22 19:18:24.198573: step: 152/463, loss: 0.003957361914217472 2023-01-22 19:18:24.736385: step: 154/463, loss: 0.030854616314172745 2023-01-22 19:18:25.390298: step: 156/463, loss: 0.055271413177251816 2023-01-22 19:18:25.993930: step: 158/463, loss: 0.0006093059200793505 2023-01-22 19:18:26.621222: step: 160/463, loss: 0.0018509463407099247 2023-01-22 19:18:27.209917: step: 162/463, loss: 0.0007072144071571529 2023-01-22 19:18:27.848841: step: 164/463, loss: 0.0031710925977677107 2023-01-22 19:18:28.428397: step: 166/463, loss: 0.0057087549939751625 2023-01-22 19:18:29.005937: step: 168/463, loss: 0.0006292226607911289 2023-01-22 19:18:29.546983: step: 170/463, loss: 0.002889123512431979 2023-01-22 19:18:30.160741: step: 172/463, loss: 0.00026830894057638943 2023-01-22 19:18:30.762116: step: 174/463, loss: 5.372847954276949e-05 2023-01-22 19:18:31.363143: step: 176/463, loss: 0.017239421606063843 2023-01-22 19:18:31.937317: step: 178/463, loss: 0.000502221577335149 2023-01-22 19:18:32.547994: step: 180/463, loss: 0.002216485096141696 2023-01-22 19:18:33.146295: step: 182/463, loss: 0.0007315961993299425 2023-01-22 19:18:33.782170: 
step: 184/463, loss: 5.5602686188649386e-05 2023-01-22 19:18:34.390753: step: 186/463, loss: 0.00161487958393991 2023-01-22 19:18:34.991269: step: 188/463, loss: 0.0024896820541471243 2023-01-22 19:18:35.608322: step: 190/463, loss: 1.2525795682449825e-05 2023-01-22 19:18:36.140843: step: 192/463, loss: 3.2338972232537344e-05 2023-01-22 19:18:36.727692: step: 194/463, loss: 0.005458936095237732 2023-01-22 19:18:37.355189: step: 196/463, loss: 0.0022778292186558247 2023-01-22 19:18:37.972215: step: 198/463, loss: 0.028027566149830818 2023-01-22 19:18:38.526288: step: 200/463, loss: 0.023967748507857323 2023-01-22 19:18:39.061136: step: 202/463, loss: 0.12396543473005295 2023-01-22 19:18:39.634661: step: 204/463, loss: 0.004536129999905825 2023-01-22 19:18:40.264420: step: 206/463, loss: 0.0034182423260062933 2023-01-22 19:18:40.862001: step: 208/463, loss: 0.0034342235885560513 2023-01-22 19:18:41.489956: step: 210/463, loss: 0.0008503955323249102 2023-01-22 19:18:42.119223: step: 212/463, loss: 0.07735413312911987 2023-01-22 19:18:42.774732: step: 214/463, loss: 0.0004853928985539824 2023-01-22 19:18:43.382269: step: 216/463, loss: 0.046890437602996826 2023-01-22 19:18:43.995958: step: 218/463, loss: 0.008525453507900238 2023-01-22 19:18:44.635352: step: 220/463, loss: 0.031520359218120575 2023-01-22 19:18:45.203514: step: 222/463, loss: 2.6425008400110528e-05 2023-01-22 19:18:45.784988: step: 224/463, loss: 0.0002786096010822803 2023-01-22 19:18:46.370558: step: 226/463, loss: 9.458150452701375e-05 2023-01-22 19:18:46.950497: step: 228/463, loss: 0.028175901621580124 2023-01-22 19:18:47.623979: step: 230/463, loss: 5.3758747526444495e-05 2023-01-22 19:18:48.251092: step: 232/463, loss: 0.001925392891280353 2023-01-22 19:18:48.838056: step: 234/463, loss: 0.0001720411964925006 2023-01-22 19:18:49.473035: step: 236/463, loss: 0.00026137055829167366 2023-01-22 19:18:50.080583: step: 238/463, loss: 0.0007945472025312483 2023-01-22 19:18:50.716377: step: 240/463, loss: 
0.03565320372581482 2023-01-22 19:18:51.387392: step: 242/463, loss: 0.0002595476107671857 2023-01-22 19:18:51.991262: step: 244/463, loss: 0.0002720094344113022 2023-01-22 19:18:52.637453: step: 246/463, loss: 0.004102571401745081 2023-01-22 19:18:53.282866: step: 248/463, loss: 0.024232489988207817 2023-01-22 19:18:53.922499: step: 250/463, loss: 0.0038173289503902197 2023-01-22 19:18:54.533218: step: 252/463, loss: 0.046204451471567154 2023-01-22 19:18:55.146196: step: 254/463, loss: 0.0013366711791604757 2023-01-22 19:18:55.668869: step: 256/463, loss: 6.0445512644946575e-05 2023-01-22 19:18:56.280258: step: 258/463, loss: 0.003084553638473153 2023-01-22 19:18:56.841980: step: 260/463, loss: 0.00021135472343303263 2023-01-22 19:18:57.462125: step: 262/463, loss: 0.00285577354952693 2023-01-22 19:18:58.085761: step: 264/463, loss: 0.009045645594596863 2023-01-22 19:18:58.621686: step: 266/463, loss: 0.0032236569095402956 2023-01-22 19:18:59.227904: step: 268/463, loss: 0.003224786836653948 2023-01-22 19:18:59.881327: step: 270/463, loss: 0.016662560403347015 2023-01-22 19:19:00.442288: step: 272/463, loss: 0.0692569762468338 2023-01-22 19:19:01.046811: step: 274/463, loss: 0.001345733879134059 2023-01-22 19:19:01.647076: step: 276/463, loss: 0.40693485736846924 2023-01-22 19:19:02.213653: step: 278/463, loss: 0.0001949029101524502 2023-01-22 19:19:02.856522: step: 280/463, loss: 0.004869560245424509 2023-01-22 19:19:03.483226: step: 282/463, loss: 5.593533842329634e-06 2023-01-22 19:19:04.081541: step: 284/463, loss: 0.00147507362999022 2023-01-22 19:19:04.711508: step: 286/463, loss: 0.0113254114985466 2023-01-22 19:19:05.317434: step: 288/463, loss: 0.0023536020889878273 2023-01-22 19:19:05.912619: step: 290/463, loss: 0.0004543559916783124 2023-01-22 19:19:06.525287: step: 292/463, loss: 0.0017660722369328141 2023-01-22 19:19:07.066216: step: 294/463, loss: 1.438876824977342e-05 2023-01-22 19:19:07.652944: step: 296/463, loss: 9.385849989484996e-05 2023-01-22 
19:19:08.237544: step: 298/463, loss: 0.008466262370347977 2023-01-22 19:19:08.860564: step: 300/463, loss: 0.005293470341712236 2023-01-22 19:19:09.459652: step: 302/463, loss: 0.014306939207017422 2023-01-22 19:19:10.090847: step: 304/463, loss: 4.8343947128159925e-05 2023-01-22 19:19:10.727957: step: 306/463, loss: 0.00952550396323204 2023-01-22 19:19:11.393532: step: 308/463, loss: 0.00017297003068961203 2023-01-22 19:19:11.957485: step: 310/463, loss: 0.001623228657990694 2023-01-22 19:19:12.555135: step: 312/463, loss: 0.018758218735456467 2023-01-22 19:19:13.132464: step: 314/463, loss: 0.00010309414938092232 2023-01-22 19:19:13.752055: step: 316/463, loss: 0.001325489254668355 2023-01-22 19:19:14.378969: step: 318/463, loss: 0.00013826215581502765 2023-01-22 19:19:14.945127: step: 320/463, loss: 0.00013117739581502974 2023-01-22 19:19:15.504832: step: 322/463, loss: 0.00026670959778130054 2023-01-22 19:19:16.224432: step: 324/463, loss: 0.0038495922926813364 2023-01-22 19:19:16.784621: step: 326/463, loss: 2.3076298020896502e-05 2023-01-22 19:19:17.372729: step: 328/463, loss: 0.005292745307087898 2023-01-22 19:19:18.057805: step: 330/463, loss: 0.012439122423529625 2023-01-22 19:19:18.722160: step: 332/463, loss: 0.0030223974026739597 2023-01-22 19:19:19.333222: step: 334/463, loss: 0.16963711380958557 2023-01-22 19:19:19.905672: step: 336/463, loss: 0.0006073869881220162 2023-01-22 19:19:20.522958: step: 338/463, loss: 0.005426917225122452 2023-01-22 19:19:21.089845: step: 340/463, loss: 0.0002585617476142943 2023-01-22 19:19:21.669736: step: 342/463, loss: 0.0006838802946731448 2023-01-22 19:19:22.293178: step: 344/463, loss: 0.0004278693813830614 2023-01-22 19:19:22.913806: step: 346/463, loss: 0.0020766882225871086 2023-01-22 19:19:23.531741: step: 348/463, loss: 0.00046342916903086007 2023-01-22 19:19:24.123482: step: 350/463, loss: 0.026393737643957138 2023-01-22 19:19:24.716547: step: 352/463, loss: 0.0002271143748657778 2023-01-22 19:19:25.248636: 
step: 354/463, loss: 0.00014506481238640845 2023-01-22 19:19:25.794292: step: 356/463, loss: 0.00024762589600868523 2023-01-22 19:19:26.398219: step: 358/463, loss: 0.002072034403681755 2023-01-22 19:19:27.065086: step: 360/463, loss: 0.0005221647443249822 2023-01-22 19:19:27.707455: step: 362/463, loss: 0.00442659854888916 2023-01-22 19:19:28.264854: step: 364/463, loss: 7.12068776920205e-06 2023-01-22 19:19:29.058018: step: 366/463, loss: 0.002150141168385744 2023-01-22 19:19:29.652254: step: 368/463, loss: 0.00021313711476977915 2023-01-22 19:19:30.288000: step: 370/463, loss: 0.0013930960558354855 2023-01-22 19:19:30.899646: step: 372/463, loss: 0.00922850426286459 2023-01-22 19:19:31.528936: step: 374/463, loss: 0.08281563967466354 2023-01-22 19:19:32.128919: step: 376/463, loss: 6.857610424049199e-05 2023-01-22 19:19:32.695936: step: 378/463, loss: 0.006654652766883373 2023-01-22 19:19:33.273603: step: 380/463, loss: 0.03502865135669708 2023-01-22 19:19:33.893501: step: 382/463, loss: 0.000335115531925112 2023-01-22 19:19:34.466409: step: 384/463, loss: 0.0001996094360947609 2023-01-22 19:19:35.031164: step: 386/463, loss: 0.010503219440579414 2023-01-22 19:19:35.637443: step: 388/463, loss: 0.004548720549792051 2023-01-22 19:19:36.227070: step: 390/463, loss: 9.78617463260889e-05 2023-01-22 19:19:36.805046: step: 392/463, loss: 6.93258471073932e-06 2023-01-22 19:19:37.389411: step: 394/463, loss: 0.04954591020941734 2023-01-22 19:19:38.020149: step: 396/463, loss: 0.00012508535291999578 2023-01-22 19:19:38.540337: step: 398/463, loss: 0.008829621598124504 2023-01-22 19:19:39.145801: step: 400/463, loss: 0.00035695050610229373 2023-01-22 19:19:39.770651: step: 402/463, loss: 0.02659214287996292 2023-01-22 19:19:40.513345: step: 404/463, loss: 0.00010770234803203493 2023-01-22 19:19:41.082429: step: 406/463, loss: 2.631860479596071e-05 2023-01-22 19:19:41.684287: step: 408/463, loss: 0.0006412076181732118 2023-01-22 19:19:42.291459: step: 410/463, loss: 
0.010217915289103985 2023-01-22 19:19:42.942652: step: 412/463, loss: 0.22226382791996002 2023-01-22 19:19:43.562564: step: 414/463, loss: 0.007841676473617554 2023-01-22 19:19:44.162387: step: 416/463, loss: 0.01650642603635788 2023-01-22 19:19:44.785909: step: 418/463, loss: 0.0007042823708616197 2023-01-22 19:19:45.442558: step: 420/463, loss: 0.0003163098299410194 2023-01-22 19:19:46.003001: step: 422/463, loss: 0.010452454909682274 2023-01-22 19:19:46.615389: step: 424/463, loss: 0.016265392303466797 2023-01-22 19:19:47.232075: step: 426/463, loss: 0.0011052886256948113 2023-01-22 19:19:47.853632: step: 428/463, loss: 0.00765977194532752 2023-01-22 19:19:48.589092: step: 430/463, loss: 0.060208309441804886 2023-01-22 19:19:49.197900: step: 432/463, loss: 0.0009403592557646334 2023-01-22 19:19:49.799837: step: 434/463, loss: 0.0528746172785759 2023-01-22 19:19:50.417623: step: 436/463, loss: 0.15433084964752197 2023-01-22 19:19:51.045912: step: 438/463, loss: 0.005888060666620731 2023-01-22 19:19:51.683819: step: 440/463, loss: 0.0004917871556244791 2023-01-22 19:19:52.282510: step: 442/463, loss: 0.0016099504427984357 2023-01-22 19:19:52.841947: step: 444/463, loss: 0.0008842946263030171 2023-01-22 19:19:53.474983: step: 446/463, loss: 0.00019577554485294968 2023-01-22 19:19:54.035027: step: 448/463, loss: 0.00010528425627853721 2023-01-22 19:19:54.650021: step: 450/463, loss: 0.00032175268279388547 2023-01-22 19:19:55.278697: step: 452/463, loss: 0.01075148768723011 2023-01-22 19:19:55.867339: step: 454/463, loss: 0.0035002115182578564 2023-01-22 19:19:56.392779: step: 456/463, loss: 8.578391316405032e-06 2023-01-22 19:19:56.957920: step: 458/463, loss: 0.0001439665211364627 2023-01-22 19:19:57.588899: step: 460/463, loss: 0.01788206398487091 2023-01-22 19:19:58.217949: step: 462/463, loss: 0.003951238002628088 2023-01-22 19:19:58.762162: step: 464/463, loss: 0.00032614340307191014 2023-01-22 19:19:59.308331: step: 466/463, loss: 0.007410815451294184 
2023-01-22 19:19:59.907184: step: 468/463, loss: 0.00174837710801512 2023-01-22 19:20:00.545212: step: 470/463, loss: 0.01199332531541586 2023-01-22 19:20:01.200083: step: 472/463, loss: 0.035291220992803574 2023-01-22 19:20:01.817710: step: 474/463, loss: 0.025272618979215622 2023-01-22 19:20:02.455650: step: 476/463, loss: 0.005143485032021999 2023-01-22 19:20:03.055973: step: 478/463, loss: 0.0014207472559064627 2023-01-22 19:20:03.674146: step: 480/463, loss: 0.00036828138399869204 2023-01-22 19:20:04.292261: step: 482/463, loss: 0.00028148782439529896 2023-01-22 19:20:04.926686: step: 484/463, loss: 0.0025993476156145334 2023-01-22 19:20:05.534966: step: 486/463, loss: 0.015902405604720116 2023-01-22 19:20:06.146061: step: 488/463, loss: 0.0030720734503120184 2023-01-22 19:20:06.736853: step: 490/463, loss: 0.033776070922613144 2023-01-22 19:20:07.314494: step: 492/463, loss: 0.012743447907269001 2023-01-22 19:20:07.982143: step: 494/463, loss: 0.006398965138942003 2023-01-22 19:20:08.578203: step: 496/463, loss: 0.0002802814997266978 2023-01-22 19:20:09.132509: step: 498/463, loss: 3.824139275820926e-05 2023-01-22 19:20:09.719301: step: 500/463, loss: 0.020187703892588615 2023-01-22 19:20:10.406506: step: 502/463, loss: 0.10093924403190613 2023-01-22 19:20:10.960059: step: 504/463, loss: 0.007475260645151138 2023-01-22 19:20:11.546959: step: 506/463, loss: 0.006322294939309359 2023-01-22 19:20:12.131345: step: 508/463, loss: 0.005052641965448856 2023-01-22 19:20:12.743665: step: 510/463, loss: 0.011344644241034985 2023-01-22 19:20:13.378202: step: 512/463, loss: 0.0011517333332449198 2023-01-22 19:20:14.020418: step: 514/463, loss: 0.009060057811439037 2023-01-22 19:20:14.647352: step: 516/463, loss: 0.004933866206556559 2023-01-22 19:20:15.316835: step: 518/463, loss: 0.002929247450083494 2023-01-22 19:20:15.919358: step: 520/463, loss: 0.0008936733356676996 2023-01-22 19:20:16.475572: step: 522/463, loss: 0.0003673116152640432 2023-01-22 19:20:17.088158: 
step: 524/463, loss: 0.0015240754000842571 2023-01-22 19:20:17.745831: step: 526/463, loss: 0.0024089852813631296 2023-01-22 19:20:18.362404: step: 528/463, loss: 0.0610002838075161 2023-01-22 19:20:18.945683: step: 530/463, loss: 0.01050985511392355 2023-01-22 19:20:19.527167: step: 532/463, loss: 0.005481315776705742 2023-01-22 19:20:20.172928: step: 534/463, loss: 0.003396882675588131 2023-01-22 19:20:20.771576: step: 536/463, loss: 0.0011447732103988528 2023-01-22 19:20:21.533394: step: 538/463, loss: 0.030429044738411903 2023-01-22 19:20:22.158507: step: 540/463, loss: 0.0005776140606030822 2023-01-22 19:20:22.816897: step: 542/463, loss: 0.48920562863349915 2023-01-22 19:20:23.393112: step: 544/463, loss: 0.0001608120946912095 2023-01-22 19:20:23.986019: step: 546/463, loss: 5.0360511522740126e-05 2023-01-22 19:20:24.557918: step: 548/463, loss: 0.0006281930254772305 2023-01-22 19:20:25.168937: step: 550/463, loss: 0.00021876278333365917 2023-01-22 19:20:25.768435: step: 552/463, loss: 0.011245344765484333 2023-01-22 19:20:26.335309: step: 554/463, loss: 0.0005652742111124098 2023-01-22 19:20:26.982666: step: 556/463, loss: 0.0005486967856995761 2023-01-22 19:20:27.509475: step: 558/463, loss: 0.000936436525080353 2023-01-22 19:20:28.176653: step: 560/463, loss: 0.003898577531799674 2023-01-22 19:20:28.774857: step: 562/463, loss: 0.018520137295126915 2023-01-22 19:20:29.405796: step: 564/463, loss: 0.011104453355073929 2023-01-22 19:20:29.990438: step: 566/463, loss: 8.595120743848383e-05 2023-01-22 19:20:30.585206: step: 568/463, loss: 0.004353742580860853 2023-01-22 19:20:31.176641: step: 570/463, loss: 3.355580702191219e-05 2023-01-22 19:20:31.784629: step: 572/463, loss: 0.0011402338277548552 2023-01-22 19:20:32.360583: step: 574/463, loss: 0.004971250891685486 2023-01-22 19:20:33.029450: step: 576/463, loss: 0.0002080999838653952 2023-01-22 19:20:33.678761: step: 578/463, loss: 0.009077758528292179 2023-01-22 19:20:34.321520: step: 580/463, loss: 
0.002814331091940403 2023-01-22 19:20:34.889381: step: 582/463, loss: 0.03920842707157135 2023-01-22 19:20:35.428693: step: 584/463, loss: 0.014242704957723618 2023-01-22 19:20:36.038437: step: 586/463, loss: 0.00017616008699405938 2023-01-22 19:20:36.657952: step: 588/463, loss: 0.009579058736562729 2023-01-22 19:20:37.288105: step: 590/463, loss: 0.01011890359222889 2023-01-22 19:20:37.911196: step: 592/463, loss: 0.0038661975413560867 2023-01-22 19:20:38.501265: step: 594/463, loss: 0.00031490589026361704 2023-01-22 19:20:39.114544: step: 596/463, loss: 0.013816223479807377 2023-01-22 19:20:39.736034: step: 598/463, loss: 0.000538473017513752 2023-01-22 19:20:40.316236: step: 600/463, loss: 0.01737445965409279 2023-01-22 19:20:40.900456: step: 602/463, loss: 0.002582959597930312 2023-01-22 19:20:41.485066: step: 604/463, loss: 0.2503345310688019 2023-01-22 19:20:42.121121: step: 606/463, loss: 0.012100421823561192 2023-01-22 19:20:42.788949: step: 608/463, loss: 4.3014546236008755e-07 2023-01-22 19:20:43.393899: step: 610/463, loss: 0.0021794431377202272 2023-01-22 19:20:44.022234: step: 612/463, loss: 3.612560612964444e-05 2023-01-22 19:20:44.675871: step: 614/463, loss: 0.0001598628150532022 2023-01-22 19:20:45.289033: step: 616/463, loss: 0.029134385287761688 2023-01-22 19:20:45.830781: step: 618/463, loss: 0.0008517194073647261 2023-01-22 19:20:46.391878: step: 620/463, loss: 0.0005924435099586844 2023-01-22 19:20:46.969013: step: 622/463, loss: 0.41519680619239807 2023-01-22 19:20:47.615614: step: 624/463, loss: 6.282704998739064e-05 2023-01-22 19:20:48.237928: step: 626/463, loss: 0.01885322667658329 2023-01-22 19:20:48.855486: step: 628/463, loss: 0.003477758262306452 2023-01-22 19:20:49.459287: step: 630/463, loss: 0.007785693742334843 2023-01-22 19:20:50.126750: step: 632/463, loss: 0.0012488395441323519 2023-01-22 19:20:50.733228: step: 634/463, loss: 0.015157187357544899 2023-01-22 19:20:51.294244: step: 636/463, loss: 0.005170656368136406 2023-01-22 
19:20:51.959597: step: 638/463, loss: 0.00213405629619956 2023-01-22 19:20:52.590732: step: 640/463, loss: 0.002979469019919634 2023-01-22 19:20:53.185076: step: 642/463, loss: 0.007288097869604826 2023-01-22 19:20:53.776833: step: 644/463, loss: 0.08734485507011414 2023-01-22 19:20:54.324198: step: 646/463, loss: 0.00040480721509084105 2023-01-22 19:20:54.903064: step: 648/463, loss: 0.008404127322137356 2023-01-22 19:20:55.550755: step: 650/463, loss: 0.017118128016591072 2023-01-22 19:20:56.191038: step: 652/463, loss: 0.002976895309984684 2023-01-22 19:20:56.854615: step: 654/463, loss: 0.01703518070280552 2023-01-22 19:20:57.513335: step: 656/463, loss: 6.2815030105412e-05 2023-01-22 19:20:58.075014: step: 658/463, loss: 0.0004981746315024793 2023-01-22 19:20:58.700669: step: 660/463, loss: 0.028871091082692146 2023-01-22 19:20:59.330743: step: 662/463, loss: 0.00025219295639544725 2023-01-22 19:20:59.918325: step: 664/463, loss: 0.00043213096796534956 2023-01-22 19:21:00.551102: step: 666/463, loss: 0.0003604301600717008 2023-01-22 19:21:01.118716: step: 668/463, loss: 0.011177362874150276 2023-01-22 19:21:01.664292: step: 670/463, loss: 0.000650756002869457 2023-01-22 19:21:02.252745: step: 672/463, loss: 0.07313233613967896 2023-01-22 19:21:02.840049: step: 674/463, loss: 0.018022766336798668 2023-01-22 19:21:03.381601: step: 676/463, loss: 0.00012449591304175556 2023-01-22 19:21:03.953660: step: 678/463, loss: 0.0021458349656313658 2023-01-22 19:21:04.567603: step: 680/463, loss: 0.0006035775295458734 2023-01-22 19:21:05.100854: step: 682/463, loss: 0.09316227585077286 2023-01-22 19:21:05.752823: step: 684/463, loss: 0.0006450857035815716 2023-01-22 19:21:06.289276: step: 686/463, loss: 0.00526129174977541 2023-01-22 19:21:06.915261: step: 688/463, loss: 0.001747821574099362 2023-01-22 19:21:07.416770: step: 690/463, loss: 0.0001614074280951172 2023-01-22 19:21:08.060325: step: 692/463, loss: 0.0008860760135576129 2023-01-22 19:21:08.684823: step: 694/463, 
loss: 0.3239785134792328 2023-01-22 19:21:09.303515: step: 696/463, loss: 0.01878795213997364 2023-01-22 19:21:09.924899: step: 698/463, loss: 0.027236608788371086 2023-01-22 19:21:10.557551: step: 700/463, loss: 0.003239111043512821 2023-01-22 19:21:11.149827: step: 702/463, loss: 0.0008779906784184277 2023-01-22 19:21:11.800809: step: 704/463, loss: 0.0022555729374289513 2023-01-22 19:21:12.354763: step: 706/463, loss: 0.001425999216735363 2023-01-22 19:21:12.911855: step: 708/463, loss: 0.005605560261756182 2023-01-22 19:21:13.533980: step: 710/463, loss: 0.041293613612651825 2023-01-22 19:21:14.141506: step: 712/463, loss: 0.0007189091411419213 2023-01-22 19:21:14.731123: step: 714/463, loss: 0.000256000756053254 2023-01-22 19:21:15.324990: step: 716/463, loss: 1.4284234566730447e-05 2023-01-22 19:21:15.902459: step: 718/463, loss: 0.06201605871319771 2023-01-22 19:21:16.553934: step: 720/463, loss: 0.0034365453757345676 2023-01-22 19:21:17.170263: step: 722/463, loss: 0.05135727673768997 2023-01-22 19:21:17.809425: step: 724/463, loss: 0.017361681908369064 2023-01-22 19:21:18.429584: step: 726/463, loss: 0.04857705160975456 2023-01-22 19:21:19.033984: step: 728/463, loss: 0.018471738323569298 2023-01-22 19:21:19.665166: step: 730/463, loss: 0.02996482141315937 2023-01-22 19:21:20.264848: step: 732/463, loss: 0.0011229916708543897 2023-01-22 19:21:20.878715: step: 734/463, loss: 0.017092516645789146 2023-01-22 19:21:21.503237: step: 736/463, loss: 0.020738938823342323 2023-01-22 19:21:22.132986: step: 738/463, loss: 0.0009721467504277825 2023-01-22 19:21:22.759067: step: 740/463, loss: 0.0020712350960820913 2023-01-22 19:21:23.351522: step: 742/463, loss: 0.004902801010757685 2023-01-22 19:21:23.922783: step: 744/463, loss: 0.15072710812091827 2023-01-22 19:21:24.453002: step: 746/463, loss: 0.005985019262880087 2023-01-22 19:21:25.142846: step: 748/463, loss: 0.017352981492877007 2023-01-22 19:21:25.763922: step: 750/463, loss: 0.0013474620645865798 2023-01-22 
19:21:26.360812: step: 752/463, loss: 0.001517048804089427 2023-01-22 19:21:26.944247: step: 754/463, loss: 0.008226670324802399 2023-01-22 19:21:27.556620: step: 756/463, loss: 0.010155747644603252 2023-01-22 19:21:28.159378: step: 758/463, loss: 0.07770176231861115 2023-01-22 19:21:28.741223: step: 760/463, loss: 0.017221830785274506 2023-01-22 19:21:29.324701: step: 762/463, loss: 0.06997016817331314 2023-01-22 19:21:29.991956: step: 764/463, loss: 0.0006415266543626785 2023-01-22 19:21:30.651936: step: 766/463, loss: 0.02340009994804859 2023-01-22 19:21:31.208205: step: 768/463, loss: 0.0005615018890239298 2023-01-22 19:21:31.793110: step: 770/463, loss: 0.0103829400613904 2023-01-22 19:21:32.347572: step: 772/463, loss: 0.026568705216050148 2023-01-22 19:21:32.983948: step: 774/463, loss: 0.006846896838396788 2023-01-22 19:21:33.628305: step: 776/463, loss: 0.008683239109814167 2023-01-22 19:21:34.247924: step: 778/463, loss: 0.0015891186194494367 2023-01-22 19:21:34.944448: step: 780/463, loss: 0.10487958043813705 2023-01-22 19:21:35.622520: step: 782/463, loss: 0.0005072446656413376 2023-01-22 19:21:36.184936: step: 784/463, loss: 0.0003221355436835438 2023-01-22 19:21:36.761211: step: 786/463, loss: 0.004501454997807741 2023-01-22 19:21:37.397137: step: 788/463, loss: 0.019357256591320038 2023-01-22 19:21:37.977226: step: 790/463, loss: 0.02529536746442318 2023-01-22 19:21:38.584874: step: 792/463, loss: 0.009772663936018944 2023-01-22 19:21:39.204954: step: 794/463, loss: 0.0017570576164871454 2023-01-22 19:21:39.863212: step: 796/463, loss: 0.05137167498469353 2023-01-22 19:21:40.526606: step: 798/463, loss: 0.021431544795632362 2023-01-22 19:21:41.255435: step: 800/463, loss: 0.0030735263135284185 2023-01-22 19:21:41.842771: step: 802/463, loss: 0.002342082792893052 2023-01-22 19:21:42.406655: step: 804/463, loss: 0.00022062555945012718 2023-01-22 19:21:42.981507: step: 806/463, loss: 0.009434249252080917 2023-01-22 19:21:43.635248: step: 808/463, loss: 
0.00011012489267159253 2023-01-22 19:21:44.230353: step: 810/463, loss: 0.01204280648380518 2023-01-22 19:21:44.796816: step: 812/463, loss: 0.0001879936025943607 2023-01-22 19:21:45.428514: step: 814/463, loss: 0.21772773563861847 2023-01-22 19:21:46.064358: step: 816/463, loss: 0.009562121704220772 2023-01-22 19:21:46.683414: step: 818/463, loss: 0.021478796377778053 2023-01-22 19:21:47.282986: step: 820/463, loss: 0.0002490739861968905 2023-01-22 19:21:47.892496: step: 822/463, loss: 0.003896197769790888 2023-01-22 19:21:48.481356: step: 824/463, loss: 0.21124491095542908 2023-01-22 19:21:49.139510: step: 826/463, loss: 0.00015101763710845262 2023-01-22 19:21:49.895730: step: 828/463, loss: 0.000523419410455972 2023-01-22 19:21:50.512808: step: 830/463, loss: 0.016111573204398155 2023-01-22 19:21:51.146222: step: 832/463, loss: 0.0038049304857850075 2023-01-22 19:21:51.760609: step: 834/463, loss: 0.0025796606205403805 2023-01-22 19:21:52.342890: step: 836/463, loss: 0.0031358227133750916 2023-01-22 19:21:52.951200: step: 838/463, loss: 0.0046489788219332695 2023-01-22 19:21:53.509037: step: 840/463, loss: 0.0006247336859814823 2023-01-22 19:21:54.103490: step: 842/463, loss: 0.016195034608244896 2023-01-22 19:21:54.849018: step: 844/463, loss: 0.004666765220463276 2023-01-22 19:21:55.444940: step: 846/463, loss: 0.002970300614833832 2023-01-22 19:21:56.037721: step: 848/463, loss: 0.001579555799253285 2023-01-22 19:21:56.637508: step: 850/463, loss: 0.04145455360412598 2023-01-22 19:21:57.216000: step: 852/463, loss: 0.000334134791046381 2023-01-22 19:21:57.882335: step: 854/463, loss: 0.0006487035425379872 2023-01-22 19:21:58.563665: step: 856/463, loss: 0.0012821132550016046 2023-01-22 19:21:59.267240: step: 858/463, loss: 0.023612376302480698 2023-01-22 19:21:59.867804: step: 860/463, loss: 0.02850208431482315 2023-01-22 19:22:00.439006: step: 862/463, loss: 8.814719330985099e-05 2023-01-22 19:22:00.976131: step: 864/463, loss: 0.010587815195322037 
2023-01-22 19:22:01.579421: step: 866/463, loss: 0.00040114024886861444 2023-01-22 19:22:02.206991: step: 868/463, loss: 0.001415454433299601 2023-01-22 19:22:02.854817: step: 870/463, loss: 0.01892326958477497 2023-01-22 19:22:03.454404: step: 872/463, loss: 0.03393569961190224 2023-01-22 19:22:04.137346: step: 874/463, loss: 0.001789387664757669 2023-01-22 19:22:04.745514: step: 876/463, loss: 0.00012244857498444617 2023-01-22 19:22:05.378821: step: 878/463, loss: 0.0003315527574159205 2023-01-22 19:22:05.915435: step: 880/463, loss: 0.009065951220691204 2023-01-22 19:22:06.502848: step: 882/463, loss: 0.0001549780135974288 2023-01-22 19:22:07.140491: step: 884/463, loss: 0.018188610672950745 2023-01-22 19:22:07.761530: step: 886/463, loss: 0.0006740073440596461 2023-01-22 19:22:08.426088: step: 888/463, loss: 0.02477690950036049 2023-01-22 19:22:09.104318: step: 890/463, loss: 0.013044852763414383 2023-01-22 19:22:09.763095: step: 892/463, loss: 0.00779425585642457 2023-01-22 19:22:10.328468: step: 894/463, loss: 0.0010109026916325092 2023-01-22 19:22:10.919643: step: 896/463, loss: 0.004689314402639866 2023-01-22 19:22:11.550125: step: 898/463, loss: 0.0005530448979698122 2023-01-22 19:22:12.150069: step: 900/463, loss: 0.00035099475644528866 2023-01-22 19:22:12.926733: step: 902/463, loss: 0.002225806936621666 2023-01-22 19:22:13.543645: step: 904/463, loss: 0.01110985316336155 2023-01-22 19:22:14.155420: step: 906/463, loss: 0.0029671478550881147 2023-01-22 19:22:14.804710: step: 908/463, loss: 0.011405534110963345 2023-01-22 19:22:15.420576: step: 910/463, loss: 0.0015419799601659179 2023-01-22 19:22:16.002308: step: 912/463, loss: 0.005140809807926416 2023-01-22 19:22:16.642181: step: 914/463, loss: 0.029043421149253845 2023-01-22 19:22:17.224770: step: 916/463, loss: 0.0032956514041870832 2023-01-22 19:22:17.834255: step: 918/463, loss: 0.003885021898895502 2023-01-22 19:22:18.417201: step: 920/463, loss: 5.0685197493294254e-05 2023-01-22 19:22:19.048534: 
step: 922/463, loss: 0.001345443306490779
2023-01-22 19:22:19.666368: step: 924/463, loss: 0.006768426392227411
2023-01-22 19:22:20.297984: step: 926/463, loss: 0.0014919567620381713
==================================================
Loss: 0.021
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2996889880952381, 'r': 0.3412018839793982, 'f1': 0.3191009633667132}, 'combined': 0.23512702563863075, 'epoch': 39}
Test Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.34292218222869, 'r': 0.3165895888289302, 'f1': 0.3292301894718276}, 'combined': 0.2316192287741501, 'epoch': 39}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2991766376601402, 'r': 0.3355092843589048, 'f1': 0.31630302836698176}, 'combined': 0.2330653893230392, 'epoch': 39}
Test Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.33981542903675455, 'r': 0.310137225627431, 'f1': 0.3242987427793685}, 'combined': 0.2302521073733516, 'epoch': 39}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3098214285714286, 'r': 0.34391942260775277, 'f1': 0.32598117934224047}, 'combined': 0.2401966584627035, 'epoch': 39}
Test Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35270207665095765, 'r': 0.29576500494028823, 'f1': 0.3217339303859234}, 'combined': 0.2284310905740056, 'epoch': 39}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26996527777777773, 'r': 0.3702380952380952, 'f1': 0.3122489959839357}, 'combined': 0.20816599732262378, 'epoch': 39}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.30833333333333335, 'r': 0.40217391304347827, 'f1': 0.34905660377358494}, 'combined': 0.17452830188679247, 'epoch': 39}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4861111111111111, 'r': 0.3017241379310345, 'f1': 0.3723404255319149}, 'combined': 0.2482269503546099, 'epoch': 39}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31154818059299194, 'r': 0.313321699647601, 'f1': 0.312432423300446}, 'combined': 0.23021336453717073, 'epoch': 29}
Test for Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.35841849662687986, 'r': 0.32120052010105204, 'f1': 0.33879042433116024}, 'combined': 0.23834502214252482, 'epoch': 29}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34188034188034183, 'r': 0.38095238095238093, 'f1': 0.36036036036036034}, 'combined': 0.2402402402402402, 'epoch': 29}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28642843540005986, 'r': 0.33860515228508026, 'f1': 0.3103389830508475}, 'combined': 0.22867082961641394, 'epoch': 17}
Test for Korean: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.3492453884409228, 'r': 0.3270179138496522, 'f1': 0.3377663639671779}, 'combined': 0.2398141184166963, 'epoch': 17}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.33088235294117646, 'r': 0.4891304347826087, 'f1': 0.39473684210526316}, 'combined': 0.19736842105263158, 'epoch': 17}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3098214285714286, 'r': 0.34391942260775277, 'f1': 0.32598117934224047}, 'combined': 0.2401966584627035, 'epoch': 39}
Test for Russian: {'template': {'p': 0.9726027397260274, 'r': 0.5590551181102362, 'f1': 0.71}, 'slot': {'p': 0.35270207665095765, 'r': 0.29576500494028823, 'f1': 0.3217339303859234}, 'combined': 0.2284310905740056, 'epoch': 39}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4861111111111111, 'r': 0.3017241379310345, 'f1': 0.3723404255319149}, 'combined': 0.2482269503546099, 'epoch': 39}
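Editor's note on reading these scores: each entry holds standard precision/recall/F1 dicts for the template and slot tasks, and in every record shown the 'combined' value equals the product of the template F1 and the slot F1. A minimal sketch verifying that relationship against the logged epoch-39 "Dev Chinese" values; the helper names `f1` and `combined` are illustrative and not taken from `train.py`:

```python
import math

def f1(p, r):
    """Standard F1: harmonic mean of precision and recall (0.0 if both are 0)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template, slot):
    """'combined' in the log matches template F1 * slot F1 for every entry
    shown; this helper encodes that observed relationship (an assumption,
    not code from train.py)."""
    return template["f1"] * slot["f1"]

# Values copied verbatim from the epoch-39 "Dev Chinese" entry above.
dev_chinese = {
    "template": {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579},
    "slot": {"p": 0.2996889880952381, "r": 0.3412018839793982,
             "f1": 0.3191009633667132},
}

# The logged F1 values are recoverable from the logged p/r values,
# and the logged 'combined' score from the two F1 values.
assert math.isclose(f1(dev_chinese["template"]["p"], dev_chinese["template"]["r"]),
                    dev_chinese["template"]["f1"])
assert math.isclose(f1(dev_chinese["slot"]["p"], dev_chinese["slot"]["r"]),
                    dev_chinese["slot"]["f1"])
assert math.isclose(combined(dev_chinese["template"], dev_chinese["slot"]),
                    0.23512702563863075)
```

This explains, for example, why "Sample Korean" has the highest slot F1 of the three sample languages yet the lowest combined score: its template F1 is only 0.5.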