Command that produces this log: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
----------------------------------------------------------------------------------------------------
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024]) >>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024]) >>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024]) >>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024]) >>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.query.bias: 
torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.1.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.1.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.1.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.2.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.2.output.dense.weight: 
torch.Size([1024, 4096]) >>> xlmr.encoder.layer.2.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.3.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.3.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.3.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.output.dense.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.4.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.4.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.4.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.4.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.5.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.5.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.5.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.key.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.6.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.6.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.6.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.6.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.7.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.7.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.7.output.dense.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.7.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.8.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.8.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.8.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.9.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.9.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.9.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.9.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.10.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.10.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.10.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.11.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.11.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.11.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.11.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.12.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.12.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.12.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.13.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.13.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.13.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.14.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.14.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.14.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.14.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.15.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.15.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.15.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.16.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.16.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.16.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.16.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.17.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.17.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.17.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.18.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.18.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.18.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.weight: torch.Size([1024]) >>> 
xlmr.encoder.layer.19.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.19.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.19.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.19.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.20.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.20.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.20.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.21.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.21.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.21.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.21.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.22.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.22.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.22.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.weight: 
torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.23.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.23.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.23.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.pooler.dense.weight: torch.Size([1024, 1024]) >>> xlmr.pooler.dense.bias: torch.Size([1024]) >>> basic_gcn.T_T.0.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_T.0.bias: torch.Size([1024]) >>> basic_gcn.T_T.1.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_T.1.bias: torch.Size([1024]) >>> basic_gcn.T_T.2.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_T.2.bias: torch.Size([1024]) >>> basic_gcn.T_E.0.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_E.0.bias: torch.Size([1024]) >>> basic_gcn.T_E.1.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_E.1.bias: torch.Size([1024]) >>> basic_gcn.T_E.2.weight: torch.Size([1024, 1024]) >>> basic_gcn.T_E.2.bias: torch.Size([1024]) >>> basic_gcn.E_T.0.weight: 
torch.Size([1024, 1024]) >>> basic_gcn.E_T.0.bias: torch.Size([1024]) >>> basic_gcn.E_T.1.weight: torch.Size([1024, 1024]) >>> basic_gcn.E_T.1.bias: torch.Size([1024]) >>> basic_gcn.E_T.2.weight: torch.Size([1024, 1024]) >>> basic_gcn.E_T.2.bias: torch.Size([1024]) >>> basic_gcn.E_E.0.weight: torch.Size([1024, 1024]) >>> basic_gcn.E_E.0.bias: torch.Size([1024]) >>> basic_gcn.E_E.1.weight: torch.Size([1024, 1024]) >>> basic_gcn.E_E.1.bias: torch.Size([1024]) >>> basic_gcn.E_E.2.weight: torch.Size([1024, 1024]) >>> basic_gcn.E_E.2.bias: torch.Size([1024]) >>> basic_gcn.f_t.0.weight: torch.Size([1024, 2048]) >>> basic_gcn.f_t.0.bias: torch.Size([1024]) >>> basic_gcn.f_e.0.weight: torch.Size([1024, 2048]) >>> basic_gcn.f_e.0.bias: torch.Size([1024]) >>> name2classifier.occupy-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.occupy-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.occupy-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.occupy-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.outcome-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.outcome-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.outcome-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.outcome-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.protest-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.protest-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.protest-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.protest-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.when-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.when-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.when-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.when-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.where-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.where-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.where-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.where-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.who-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.who-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.who-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.who-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.protest-against-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.protest-against-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.protest-against-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.protest-against-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.protest-for-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.protest-for-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.protest-for-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.protest-for-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.organizer-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.organizer-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.organizer-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.organizer-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.wounded-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.wounded-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.wounded-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.wounded-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.arrested-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.arrested-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.arrested-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.arrested-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.killed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.killed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-ffn.layers.1.weight: torch.Size([2, 350]) 
>>> name2classifier.killed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.imprisoned-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.imprisoned-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.imprisoned-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.imprisoned-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.corrupt-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.corrupt-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.corrupt-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.corrupt-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.judicial-actions-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.judicial-actions-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.judicial-actions-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.judicial-actions-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.charged-with-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.charged-with-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.charged-with-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.charged-with-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.prison-term-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.prison-term-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.prison-term-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.prison-term-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.fine-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.fine-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.fine-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.fine-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.npi-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.npi-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.npi-events-ffn.layers.1.weight: torch.Size([2, 
350]) >>> name2classifier.npi-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.disease-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.disease-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.disease-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.disease-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.infected-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.infected-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.infected-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.infected-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.killed-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.killed-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.killed-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.killed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.killed-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.killed-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.outbreak-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.outbreak-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.outbreak-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.outbreak-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.infected-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.infected-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.infected-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.infected-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.killed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> 
name2classifier.killed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.killed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.infected-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.infected-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.infected-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.infected-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-count-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.tested-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.bias: 
torch.Size([2]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blamed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blamed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blamed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blamed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.claimed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.claimed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.claimed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.claimed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.terror-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.terror-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.terror-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.terror-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-org-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-org-ffn.layers.0.bias: 
torch.Size([350]) >>> name2classifier.named-perp-org-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-org-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-physical-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-physical-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-physical-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-physical-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.kidnapped-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.kidnapped-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.kidnapped-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.kidnapped-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-perp-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-killed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-killed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-killed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-killed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-human-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-human-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-human-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-human-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-captured-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-captured-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-captured-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-captured-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-objective-ffn.layers.0.weight: torch.Size([350, 1024]) >>> 
name2classifier.perp-objective-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-objective-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-objective-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.weapon-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.weapon-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.weapon-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.weapon-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-organizer-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-organizer-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-organizer-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-organizer-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.damage-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.damage-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.damage-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.damage-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.human-displacement-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.major-disaster-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.major-disaster-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.major-disaster-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.major-disaster-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.responders-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.responders-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.responders-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.responders-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-provided-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-provided-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-provided-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-provided-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescue-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescue-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescue-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescue-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.individuals-affected-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.individuals-affected-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.individuals-affected-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.individuals-affected-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.missing-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.missing-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.missing-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.missing-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.injured-count-ffn.layers.0.weight: 
torch.Size([350, 1024]) >>> name2classifier.injured-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.injured-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.injured-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-needed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-needed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-needed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-needed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescued-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescued-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescued-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescued-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.repair-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.repair-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.repair-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.repair-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.declare-emergency-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.declare-emergency-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.declare-emergency-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.declare-emergency-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.announce-disaster-warnings-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.announce-disaster-warnings-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.disease-outbreak-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.disease-outbreak-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.current-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.current-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.current-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.current-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.group-identity-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.group-identity-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.group-identity-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.group-identity-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.human-displacement-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.origin-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.total-displaced-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.total-displaced-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.total-displaced-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.total-displaced-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transitory-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transitory-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transitory-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transitory-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.destination-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.destination-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.destination-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.destination-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transiting-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transiting-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transiting-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transiting-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.detained-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.detained-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.detained-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.detained-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blocked-migration-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blocked-migration-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.bias: torch.Size([2]) >>> 
name2classifier.cybercrime-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.cybercrime-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.cybercrime-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.cybercrime-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perpetrator-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perpetrator-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perpetrator-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perpetrator-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.victim-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.response-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.response-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.response-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.response-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.information-stolen-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.information-stolen-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.information-stolen-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.information-stolen-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-crimes-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-crimes-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-crimes-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-crimes-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-impact-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-impact-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-impact-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.victim-impact-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.etip-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.etip-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.etip-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.etip-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-name-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-name-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-name-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-name-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.signatories-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.signatories-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.signatories-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.signatories-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awardee-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-awardee-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awardee-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awardee-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.overall-project-value-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.overall-project-value-ffn.layers.0.bias: 
torch.Size([350]) >>> name2classifier.overall-project-value-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.overall-project-value-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-recipient-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-recipient-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-recipient-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-recipient-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-source-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-source-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-source-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-source-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awarder-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-awarder-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awarder-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awarder-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.agreement-length-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.agreement-length-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.agreement-length-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.agreement-length-ffn.layers.1.bias: torch.Size([2]) >>> irrealis_classifier.layers.0.weight: torch.Size([350, 1127]) >>> irrealis_classifier.layers.0.bias: torch.Size([350]) >>> irrealis_classifier.layers.1.weight: torch.Size([7, 350]) >>> irrealis_classifier.layers.1.bias: torch.Size([7]) n_trainable_params: 613743345, n_nontrainable_params: 0 
---------------------------------------------------------------------------------------------------- ****************************** Epoch: 0 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 00:23:40.550329: step: 2/466, loss: 10.935382843017578 2023-01-24 00:23:41.199463: step: 4/466, loss: 31.552471160888672 2023-01-24 00:23:41.877218: step: 6/466, loss: 24.21432113647461 2023-01-24 00:23:42.513769: step: 8/466, loss: 14.790350914001465 2023-01-24 00:23:43.142695: step: 10/466, loss: 6.093600749969482 2023-01-24 00:23:43.755506: step: 12/466, loss: 22.027292251586914 2023-01-24 00:23:44.395359: step: 14/466, loss: 13.000919342041016 2023-01-24 00:23:45.021840: step: 16/466, loss: 31.851593017578125 2023-01-24 00:23:45.584641: step: 18/466, loss: 14.411346435546875 2023-01-24 00:23:46.206881: step: 20/466, loss: 11.301730155944824 2023-01-24 00:23:46.814439: step: 22/466, loss: 30.992475509643555 2023-01-24 00:23:47.440030: step: 24/466, loss: 16.0606689453125 2023-01-24 00:23:48.028406: step: 26/466, loss: 10.805730819702148 2023-01-24 00:23:48.652526: step: 28/466, loss: 20.61066436767578 2023-01-24 00:23:49.251029: step: 30/466, loss: 6.085881233215332 2023-01-24 00:23:49.885943: step: 32/466, loss: 19.626184463500977 2023-01-24 00:23:50.502380: step: 34/466, loss: 4.410502910614014 2023-01-24 00:23:51.126860: step: 36/466, loss: 25.338882446289062 2023-01-24 00:23:51.768142: step: 38/466, loss: 5.67061185836792 2023-01-24 00:23:52.349925: step: 40/466, loss: 24.38402557373047 2023-01-24 00:23:52.953311: step: 42/466, loss: 30.825908660888672 2023-01-24 00:23:53.550604: step: 44/466, loss: 6.1968994140625 2023-01-24 00:23:54.238491: step: 46/466, loss: 22.731178283691406 2023-01-24 00:23:54.870122: step: 48/466, loss: 15.784555435180664 2023-01-24 00:23:55.521908: step: 50/466, 
loss: 16.66205596923828 2023-01-24 00:23:56.126149: step: 52/466, loss: 20.352027893066406 2023-01-24 00:23:56.745497: step: 54/466, loss: 6.7969512939453125 2023-01-24 00:23:57.375194: step: 56/466, loss: 22.42051124572754 2023-01-24 00:23:57.989019: step: 58/466, loss: 17.6397647857666 2023-01-24 00:23:58.607146: step: 60/466, loss: 12.36483383178711 2023-01-24 00:23:59.243639: step: 62/466, loss: 16.508319854736328 2023-01-24 00:23:59.818759: step: 64/466, loss: 17.55796241760254 2023-01-24 00:24:00.436852: step: 66/466, loss: 12.24570369720459 2023-01-24 00:24:01.097941: step: 68/466, loss: 22.549480438232422 2023-01-24 00:24:01.619332: step: 70/466, loss: 15.868330001831055 2023-01-24 00:24:02.220759: step: 72/466, loss: 16.54954719543457 2023-01-24 00:24:02.811718: step: 74/466, loss: 12.17561149597168 2023-01-24 00:24:03.391750: step: 76/466, loss: 18.420719146728516 2023-01-24 00:24:03.955017: step: 78/466, loss: 21.157407760620117 2023-01-24 00:24:04.651734: step: 80/466, loss: 6.287039756774902 2023-01-24 00:24:05.278983: step: 82/466, loss: 23.08069610595703 2023-01-24 00:24:05.898890: step: 84/466, loss: 6.357649326324463 2023-01-24 00:24:06.540805: step: 86/466, loss: 9.457338333129883 2023-01-24 00:24:07.178447: step: 88/466, loss: 22.535430908203125 2023-01-24 00:24:07.854935: step: 90/466, loss: 10.483683586120605 2023-01-24 00:24:08.481011: step: 92/466, loss: 5.847053527832031 2023-01-24 00:24:09.108777: step: 94/466, loss: 5.426416397094727 2023-01-24 00:24:09.729557: step: 96/466, loss: 22.074766159057617 2023-01-24 00:24:10.420076: step: 98/466, loss: 13.938610076904297 2023-01-24 00:24:11.065562: step: 100/466, loss: 13.14396858215332 2023-01-24 00:24:11.658098: step: 102/466, loss: 12.43580436706543 2023-01-24 00:24:12.294704: step: 104/466, loss: 20.18524932861328 2023-01-24 00:24:12.958143: step: 106/466, loss: 5.016257286071777 2023-01-24 00:24:13.623132: step: 108/466, loss: 11.029955863952637 2023-01-24 00:24:14.237650: step: 110/466, 
loss: 13.102413177490234 2023-01-24 00:24:14.847390: step: 112/466, loss: 19.9345760345459 2023-01-24 00:24:15.431395: step: 114/466, loss: 4.981045246124268 2023-01-24 00:24:16.115611: step: 116/466, loss: 12.654452323913574 2023-01-24 00:24:16.739226: step: 118/466, loss: 9.176351547241211 2023-01-24 00:24:17.367365: step: 120/466, loss: 17.193998336791992 2023-01-24 00:24:17.970461: step: 122/466, loss: 12.385316848754883 2023-01-24 00:24:18.696644: step: 124/466, loss: 18.726558685302734 2023-01-24 00:24:19.237354: step: 126/466, loss: 4.600466251373291 2023-01-24 00:24:19.874151: step: 128/466, loss: 8.098365783691406 2023-01-24 00:24:20.576834: step: 130/466, loss: 5.148943901062012 2023-01-24 00:24:21.184345: step: 132/466, loss: 17.139907836914062 2023-01-24 00:24:21.806869: step: 134/466, loss: 14.471826553344727 2023-01-24 00:24:22.360845: step: 136/466, loss: 13.262125968933105 2023-01-24 00:24:22.934054: step: 138/466, loss: 5.0157389640808105 2023-01-24 00:24:23.564820: step: 140/466, loss: 10.014802932739258 2023-01-24 00:24:24.220752: step: 142/466, loss: 6.947539329528809 2023-01-24 00:24:24.817585: step: 144/466, loss: 9.322587966918945 2023-01-24 00:24:25.484423: step: 146/466, loss: 14.274321556091309 2023-01-24 00:24:26.121400: step: 148/466, loss: 7.078306674957275 2023-01-24 00:24:26.713526: step: 150/466, loss: 11.2247314453125 2023-01-24 00:24:27.365718: step: 152/466, loss: 7.076643943786621 2023-01-24 00:24:27.940314: step: 154/466, loss: 5.292274475097656 2023-01-24 00:24:28.574674: step: 156/466, loss: 10.551542282104492 2023-01-24 00:24:29.230201: step: 158/466, loss: 10.318815231323242 2023-01-24 00:24:29.877830: step: 160/466, loss: 7.905094146728516 2023-01-24 00:24:30.525331: step: 162/466, loss: 10.10220718383789 2023-01-24 00:24:31.137329: step: 164/466, loss: 7.8116254806518555 2023-01-24 00:24:31.764030: step: 166/466, loss: 5.788525581359863 2023-01-24 00:24:32.369553: step: 168/466, loss: 3.7819089889526367 2023-01-24 
00:24:33.120726: step: 170/466, loss: 13.363239288330078 2023-01-24 00:24:33.736282: step: 172/466, loss: 9.444230079650879 2023-01-24 00:24:34.367683: step: 174/466, loss: 3.878795862197876 2023-01-24 00:24:34.972959: step: 176/466, loss: 7.288130760192871 2023-01-24 00:24:35.635295: step: 178/466, loss: 9.1649169921875 2023-01-24 00:24:36.317523: step: 180/466, loss: 9.544353485107422 2023-01-24 00:24:36.935050: step: 182/466, loss: 12.113117218017578 2023-01-24 00:24:37.557317: step: 184/466, loss: 2.878976821899414 2023-01-24 00:24:38.215465: step: 186/466, loss: 18.907833099365234 2023-01-24 00:24:38.800808: step: 188/466, loss: 9.29400634765625 2023-01-24 00:24:39.484768: step: 190/466, loss: 3.236294746398926 2023-01-24 00:24:40.120949: step: 192/466, loss: 21.163219451904297 2023-01-24 00:24:40.742389: step: 194/466, loss: 6.056461334228516 2023-01-24 00:24:41.349528: step: 196/466, loss: 4.537060260772705 2023-01-24 00:24:41.937858: step: 198/466, loss: 5.748075485229492 2023-01-24 00:24:42.516511: step: 200/466, loss: 3.9136929512023926 2023-01-24 00:24:43.161766: step: 202/466, loss: 11.612344741821289 2023-01-24 00:24:43.758149: step: 204/466, loss: 5.7864670753479 2023-01-24 00:24:44.371792: step: 206/466, loss: 10.0962495803833 2023-01-24 00:24:45.025549: step: 208/466, loss: 17.319595336914062 2023-01-24 00:24:45.637692: step: 210/466, loss: 9.155757904052734 2023-01-24 00:24:46.339489: step: 212/466, loss: 8.002532958984375 2023-01-24 00:24:47.036279: step: 214/466, loss: 2.7576231956481934 2023-01-24 00:24:47.640906: step: 216/466, loss: 7.004976272583008 2023-01-24 00:24:48.248492: step: 218/466, loss: 9.138443946838379 2023-01-24 00:24:48.838522: step: 220/466, loss: 7.824585914611816 2023-01-24 00:24:49.458335: step: 222/466, loss: 7.4660797119140625 2023-01-24 00:24:50.201189: step: 224/466, loss: 8.690784454345703 2023-01-24 00:24:50.920867: step: 226/466, loss: 1.8159675598144531 2023-01-24 00:24:51.589263: step: 228/466, loss: 
2.712833881378174 2023-01-24 00:24:52.210797: step: 230/466, loss: 4.209612846374512 2023-01-24 00:24:52.822889: step: 232/466, loss: 2.7250709533691406 2023-01-24 00:24:53.416826: step: 234/466, loss: 3.2030768394470215 2023-01-24 00:24:54.057543: step: 236/466, loss: 6.869441986083984 2023-01-24 00:24:54.691142: step: 238/466, loss: 8.67182731628418 2023-01-24 00:24:55.381206: step: 240/466, loss: 5.470909118652344 2023-01-24 00:24:56.103401: step: 242/466, loss: 8.2802734375 2023-01-24 00:24:56.721870: step: 244/466, loss: 3.2316181659698486 2023-01-24 00:24:57.346140: step: 246/466, loss: 2.4566733837127686 2023-01-24 00:24:58.048844: step: 248/466, loss: 15.329526901245117 2023-01-24 00:24:58.615761: step: 250/466, loss: 9.337640762329102 2023-01-24 00:24:59.224474: step: 252/466, loss: 3.938605308532715 2023-01-24 00:24:59.874868: step: 254/466, loss: 4.5049967765808105 2023-01-24 00:25:00.491276: step: 256/466, loss: 9.990443229675293 2023-01-24 00:25:01.140771: step: 258/466, loss: 6.6896772384643555 2023-01-24 00:25:01.825014: step: 260/466, loss: 6.534847259521484 2023-01-24 00:25:02.479333: step: 262/466, loss: 3.0652568340301514 2023-01-24 00:25:03.125107: step: 264/466, loss: 4.987476348876953 2023-01-24 00:25:03.761323: step: 266/466, loss: 12.270776748657227 2023-01-24 00:25:04.310586: step: 268/466, loss: 6.708323955535889 2023-01-24 00:25:05.036209: step: 270/466, loss: 12.853983879089355 2023-01-24 00:25:05.739626: step: 272/466, loss: 6.741026878356934 2023-01-24 00:25:06.420489: step: 274/466, loss: 2.9499881267547607 2023-01-24 00:25:07.053412: step: 276/466, loss: 2.5946052074432373 2023-01-24 00:25:07.677073: step: 278/466, loss: 4.536483287811279 2023-01-24 00:25:08.335977: step: 280/466, loss: 2.211632013320923 2023-01-24 00:25:09.013461: step: 282/466, loss: 3.691544532775879 2023-01-24 00:25:09.634558: step: 284/466, loss: 3.1353511810302734 2023-01-24 00:25:10.303169: step: 286/466, loss: 3.9749929904937744 2023-01-24 00:25:10.874153: 
step: 288/466, loss: 5.54394006729126 2023-01-24 00:25:11.432807: step: 290/466, loss: 4.538595199584961 2023-01-24 00:25:12.105246: step: 292/466, loss: 5.125032424926758 2023-01-24 00:25:12.711232: step: 294/466, loss: 5.607701778411865 2023-01-24 00:25:13.312411: step: 296/466, loss: 2.525693893432617 2023-01-24 00:25:13.925170: step: 298/466, loss: 2.7111496925354004 2023-01-24 00:25:14.531429: step: 300/466, loss: 4.847977161407471 2023-01-24 00:25:15.165382: step: 302/466, loss: 13.873931884765625 2023-01-24 00:25:15.843559: step: 304/466, loss: 6.348834991455078 2023-01-24 00:25:16.451425: step: 306/466, loss: 8.936687469482422 2023-01-24 00:25:17.123708: step: 308/466, loss: 13.306646347045898 2023-01-24 00:25:17.704904: step: 310/466, loss: 5.151156425476074 2023-01-24 00:25:18.270355: step: 312/466, loss: 3.3634378910064697 2023-01-24 00:25:18.847006: step: 314/466, loss: 8.005563735961914 2023-01-24 00:25:19.510262: step: 316/466, loss: 7.66864013671875 2023-01-24 00:25:20.111289: step: 318/466, loss: 6.213191509246826 2023-01-24 00:25:20.708458: step: 320/466, loss: 3.8438405990600586 2023-01-24 00:25:21.368032: step: 322/466, loss: 9.124351501464844 2023-01-24 00:25:22.020364: step: 324/466, loss: 6.281722068786621 2023-01-24 00:25:22.653026: step: 326/466, loss: 2.8622822761535645 2023-01-24 00:25:23.325874: step: 328/466, loss: 8.143562316894531 2023-01-24 00:25:23.940213: step: 330/466, loss: 14.289070129394531 2023-01-24 00:25:24.569265: step: 332/466, loss: 8.19924545288086 2023-01-24 00:25:25.256297: step: 334/466, loss: 3.265841484069824 2023-01-24 00:25:25.911469: step: 336/466, loss: 2.4021248817443848 2023-01-24 00:25:26.587495: step: 338/466, loss: 3.2878198623657227 2023-01-24 00:25:27.200159: step: 340/466, loss: 3.6344642639160156 2023-01-24 00:25:27.804395: step: 342/466, loss: 5.6455583572387695 2023-01-24 00:25:28.406808: step: 344/466, loss: 15.640310287475586 2023-01-24 00:25:29.017573: step: 346/466, loss: 2.7401907444000244 
2023-01-24 00:25:29.650180: step: 348/466, loss: 6.943836212158203 2023-01-24 00:25:30.282960: step: 350/466, loss: 3.270993947982788 2023-01-24 00:25:30.924189: step: 352/466, loss: 2.140139579772949 2023-01-24 00:25:31.566993: step: 354/466, loss: 2.002262592315674 2023-01-24 00:25:32.152249: step: 356/466, loss: 6.698768615722656 2023-01-24 00:25:32.772831: step: 358/466, loss: 8.702178001403809 2023-01-24 00:25:33.383062: step: 360/466, loss: 3.5848991870880127 2023-01-24 00:25:33.960143: step: 362/466, loss: 4.4226460456848145 2023-01-24 00:25:34.637373: step: 364/466, loss: 9.540589332580566 2023-01-24 00:25:35.234054: step: 366/466, loss: 2.794017791748047 2023-01-24 00:25:35.819651: step: 368/466, loss: 3.7166731357574463 2023-01-24 00:25:36.513509: step: 370/466, loss: 2.8494551181793213 2023-01-24 00:25:37.160457: step: 372/466, loss: 2.1792683601379395 2023-01-24 00:25:37.812907: step: 374/466, loss: 6.519150257110596 2023-01-24 00:25:38.394890: step: 376/466, loss: 1.7491375207901 2023-01-24 00:25:38.992740: step: 378/466, loss: 4.340683460235596 2023-01-24 00:25:39.582040: step: 380/466, loss: 3.240265130996704 2023-01-24 00:25:40.204624: step: 382/466, loss: 7.557920932769775 2023-01-24 00:25:40.812918: step: 384/466, loss: 6.803190231323242 2023-01-24 00:25:41.531640: step: 386/466, loss: 5.763479232788086 2023-01-24 00:25:42.185441: step: 388/466, loss: 2.6579391956329346 2023-01-24 00:25:42.753900: step: 390/466, loss: 4.02760124206543 2023-01-24 00:25:43.334031: step: 392/466, loss: 4.680157661437988 2023-01-24 00:25:43.881177: step: 394/466, loss: 2.3998398780822754 2023-01-24 00:25:44.476517: step: 396/466, loss: 4.017712593078613 2023-01-24 00:25:45.187375: step: 398/466, loss: 4.725624084472656 2023-01-24 00:25:45.768942: step: 400/466, loss: 2.2863736152648926 2023-01-24 00:25:46.372830: step: 402/466, loss: 5.301486015319824 2023-01-24 00:25:46.972844: step: 404/466, loss: 4.332059383392334 2023-01-24 00:25:47.586272: step: 406/466, loss: 
2.0066604614257812 2023-01-24 00:25:48.215847: step: 408/466, loss: 3.5898351669311523 2023-01-24 00:25:48.886943: step: 410/466, loss: 3.8044934272766113 2023-01-24 00:25:49.567978: step: 412/466, loss: 7.56033992767334 2023-01-24 00:25:50.188476: step: 414/466, loss: 1.6290899515151978 2023-01-24 00:25:50.753633: step: 416/466, loss: 1.7890315055847168 2023-01-24 00:25:51.364375: step: 418/466, loss: 1.864124059677124 2023-01-24 00:25:51.951343: step: 420/466, loss: 2.122128963470459 2023-01-24 00:25:52.488088: step: 422/466, loss: 0.8767684102058411 2023-01-24 00:25:53.121162: step: 424/466, loss: 2.351550817489624 2023-01-24 00:25:53.728669: step: 426/466, loss: 6.1453447341918945 2023-01-24 00:25:54.359775: step: 428/466, loss: 2.0417628288269043 2023-01-24 00:25:54.972376: step: 430/466, loss: 0.7644177675247192 2023-01-24 00:25:55.602316: step: 432/466, loss: 1.2050378322601318 2023-01-24 00:25:56.175577: step: 434/466, loss: 0.8422468304634094 2023-01-24 00:25:56.857664: step: 436/466, loss: 5.608556747436523 2023-01-24 00:25:57.515160: step: 438/466, loss: 0.5147002339363098 2023-01-24 00:25:58.090520: step: 440/466, loss: 1.6460902690887451 2023-01-24 00:25:58.706451: step: 442/466, loss: 0.38369235396385193 2023-01-24 00:25:59.344036: step: 444/466, loss: 0.7263200283050537 2023-01-24 00:25:59.951062: step: 446/466, loss: 2.806840181350708 2023-01-24 00:26:00.609981: step: 448/466, loss: 2.154893398284912 2023-01-24 00:26:01.218269: step: 450/466, loss: 1.0717921257019043 2023-01-24 00:26:01.838861: step: 452/466, loss: 1.0342295169830322 2023-01-24 00:26:02.478297: step: 454/466, loss: 1.136860966682434 2023-01-24 00:26:03.087573: step: 456/466, loss: 7.433987617492676 2023-01-24 00:26:03.653862: step: 458/466, loss: 0.8435848951339722 2023-01-24 00:26:04.178335: step: 460/466, loss: 2.5238540172576904 2023-01-24 00:26:04.949205: step: 462/466, loss: 1.891288161277771 2023-01-24 00:26:05.558203: step: 464/466, loss: 0.9611189365386963 2023-01-24 
00:26:06.194077: step: 466/466, loss: 4.8004326820373535 2023-01-24 00:26:06.988741: step: 468/466, loss: 3.1122493743896484 2023-01-24 00:26:07.609870: step: 470/466, loss: 0.8799759745597839 2023-01-24 00:26:08.250249: step: 472/466, loss: 2.821756362915039 2023-01-24 00:26:08.837859: step: 474/466, loss: 10.722322463989258 2023-01-24 00:26:09.548195: step: 476/466, loss: 0.9301474094390869 2023-01-24 00:26:10.169732: step: 478/466, loss: 3.162686824798584 2023-01-24 00:26:10.838772: step: 480/466, loss: 3.3056721687316895 2023-01-24 00:26:11.430006: step: 482/466, loss: 2.354368209838867 2023-01-24 00:26:12.014029: step: 484/466, loss: 1.06724214553833 2023-01-24 00:26:12.746291: step: 486/466, loss: 2.0066750049591064 2023-01-24 00:26:13.421325: step: 488/466, loss: 15.266960144042969 2023-01-24 00:26:14.010300: step: 490/466, loss: 4.051244735717773 2023-01-24 00:26:14.638623: step: 492/466, loss: 0.7980690598487854 2023-01-24 00:26:15.260641: step: 494/466, loss: 1.2072261571884155 2023-01-24 00:26:15.896676: step: 496/466, loss: 1.5347435474395752 2023-01-24 00:26:16.583409: step: 498/466, loss: 0.9659013748168945 2023-01-24 00:26:17.230551: step: 500/466, loss: 1.3465356826782227 2023-01-24 00:26:17.852714: step: 502/466, loss: 0.8857125639915466 2023-01-24 00:26:18.440585: step: 504/466, loss: 0.8151245713233948 2023-01-24 00:26:19.096207: step: 506/466, loss: 0.33978378772735596 2023-01-24 00:26:19.771441: step: 508/466, loss: 0.4371962547302246 2023-01-24 00:26:20.381335: step: 510/466, loss: 7.064756870269775 2023-01-24 00:26:21.004660: step: 512/466, loss: 7.476099014282227 2023-01-24 00:26:21.653125: step: 514/466, loss: 0.9030598998069763 2023-01-24 00:26:22.289369: step: 516/466, loss: 0.8397070169448853 2023-01-24 00:26:22.916780: step: 518/466, loss: 2.821498394012451 2023-01-24 00:26:23.514167: step: 520/466, loss: 1.5774316787719727 2023-01-24 00:26:24.113112: step: 522/466, loss: 5.750486373901367 2023-01-24 00:26:24.757075: step: 524/466, 
loss: 4.969746112823486 2023-01-24 00:26:25.404672: step: 526/466, loss: 2.1585094928741455 2023-01-24 00:26:26.019410: step: 528/466, loss: 0.7462534308433533 2023-01-24 00:26:26.708222: step: 530/466, loss: 9.493864059448242 2023-01-24 00:26:27.396102: step: 532/466, loss: 4.18302059173584 2023-01-24 00:26:28.028963: step: 534/466, loss: 9.291149139404297 2023-01-24 00:26:28.651988: step: 536/466, loss: 3.3164713382720947 2023-01-24 00:26:29.308317: step: 538/466, loss: 2.596146821975708 2023-01-24 00:26:29.909335: step: 540/466, loss: 7.012548446655273 2023-01-24 00:26:30.530672: step: 542/466, loss: 3.4024271965026855 2023-01-24 00:26:31.166294: step: 544/466, loss: 0.7824108600616455 2023-01-24 00:26:31.759364: step: 546/466, loss: 3.5947446823120117 2023-01-24 00:26:32.337674: step: 548/466, loss: 2.312133550643921 2023-01-24 00:26:32.894712: step: 550/466, loss: 3.605940341949463 2023-01-24 00:26:33.503598: step: 552/466, loss: 5.734463691711426 2023-01-24 00:26:34.188260: step: 554/466, loss: 3.824420213699341 2023-01-24 00:26:34.868184: step: 556/466, loss: 1.526013970375061 2023-01-24 00:26:35.465224: step: 558/466, loss: 1.2279987335205078 2023-01-24 00:26:36.030126: step: 560/466, loss: 3.907745122909546 2023-01-24 00:26:36.669422: step: 562/466, loss: 1.75930655002594 2023-01-24 00:26:37.347610: step: 564/466, loss: 0.5211279988288879 2023-01-24 00:26:38.006794: step: 566/466, loss: 0.5466214418411255 2023-01-24 00:26:38.641883: step: 568/466, loss: 6.440335273742676 2023-01-24 00:26:39.295160: step: 570/466, loss: 4.154919624328613 2023-01-24 00:26:39.896120: step: 572/466, loss: 4.83548641204834 2023-01-24 00:26:40.583889: step: 574/466, loss: 0.8471634984016418 2023-01-24 00:26:41.180714: step: 576/466, loss: 0.60401451587677 2023-01-24 00:26:41.827176: step: 578/466, loss: 5.233648300170898 2023-01-24 00:26:42.490554: step: 580/466, loss: 0.6951800584793091 2023-01-24 00:26:43.120241: step: 582/466, loss: 3.380234718322754 2023-01-24 
00:26:43.751674: step: 584/466, loss: 2.091862678527832 2023-01-24 00:26:44.355168: step: 586/466, loss: 1.276121735572815 2023-01-24 00:26:45.016912: step: 588/466, loss: 2.7723000049591064 2023-01-24 00:26:45.613209: step: 590/466, loss: 1.389462947845459 2023-01-24 00:26:46.284244: step: 592/466, loss: 6.692296981811523 2023-01-24 00:26:46.884141: step: 594/466, loss: 1.937554121017456 2023-01-24 00:26:47.546925: step: 596/466, loss: 2.5454559326171875 2023-01-24 00:26:48.157003: step: 598/466, loss: 1.0721949338912964 2023-01-24 00:26:48.879773: step: 600/466, loss: 2.6973354816436768 2023-01-24 00:26:49.541473: step: 602/466, loss: 1.4351885318756104 2023-01-24 00:26:50.115385: step: 604/466, loss: 1.5300734043121338 2023-01-24 00:26:50.683258: step: 606/466, loss: 2.501544713973999 2023-01-24 00:26:51.293244: step: 608/466, loss: 0.6491356492042542 2023-01-24 00:26:51.905307: step: 610/466, loss: 1.4302908182144165 2023-01-24 00:26:52.490474: step: 612/466, loss: 0.9263692498207092 2023-01-24 00:26:53.096285: step: 614/466, loss: 0.7574171423912048 2023-01-24 00:26:53.684888: step: 616/466, loss: 0.5629387497901917 2023-01-24 00:26:54.376664: step: 618/466, loss: 1.5841846466064453 2023-01-24 00:26:54.985869: step: 620/466, loss: 8.079695701599121 2023-01-24 00:26:55.600608: step: 622/466, loss: 0.7748921513557434 2023-01-24 00:26:56.197200: step: 624/466, loss: 3.2063374519348145 2023-01-24 00:26:56.802769: step: 626/466, loss: 1.6632599830627441 2023-01-24 00:26:57.426324: step: 628/466, loss: 2.4611549377441406 2023-01-24 00:26:58.058897: step: 630/466, loss: 1.1037203073501587 2023-01-24 00:26:58.658604: step: 632/466, loss: 6.284277439117432 2023-01-24 00:26:59.254415: step: 634/466, loss: 1.2190725803375244 2023-01-24 00:26:59.844141: step: 636/466, loss: 1.3618507385253906 2023-01-24 00:27:00.466345: step: 638/466, loss: 5.7003679275512695 2023-01-24 00:27:01.085612: step: 640/466, loss: 8.65969467163086 2023-01-24 00:27:01.730919: step: 642/466, loss: 
3.9915173053741455 2023-01-24 00:27:02.378612: step: 644/466, loss: 2.6033904552459717 2023-01-24 00:27:02.995565: step: 646/466, loss: 0.6600139141082764 2023-01-24 00:27:03.591571: step: 648/466, loss: 1.091048240661621 2023-01-24 00:27:04.247642: step: 650/466, loss: 2.0598092079162598 2023-01-24 00:27:04.906555: step: 652/466, loss: 2.6983959674835205 2023-01-24 00:27:05.600537: step: 654/466, loss: 1.2380976676940918 2023-01-24 00:27:06.270511: step: 656/466, loss: 0.8623917698860168 2023-01-24 00:27:06.869783: step: 658/466, loss: 1.8455694913864136 2023-01-24 00:27:07.536404: step: 660/466, loss: 2.5191493034362793 2023-01-24 00:27:08.162685: step: 662/466, loss: 1.4284855127334595 2023-01-24 00:27:08.823485: step: 664/466, loss: 2.01560115814209 2023-01-24 00:27:09.494800: step: 666/466, loss: 4.184616565704346 2023-01-24 00:27:10.233747: step: 668/466, loss: 4.633384704589844 2023-01-24 00:27:10.850296: step: 670/466, loss: 0.761311948299408 2023-01-24 00:27:11.423070: step: 672/466, loss: 2.524230480194092 2023-01-24 00:27:11.997641: step: 674/466, loss: 1.7049751281738281 2023-01-24 00:27:12.599182: step: 676/466, loss: 4.10607385635376 2023-01-24 00:27:13.249498: step: 678/466, loss: 0.5962437391281128 2023-01-24 00:27:13.851094: step: 680/466, loss: 2.6895506381988525 2023-01-24 00:27:14.484351: step: 682/466, loss: 1.299211025238037 2023-01-24 00:27:15.133814: step: 684/466, loss: 2.7528464794158936 2023-01-24 00:27:15.767221: step: 686/466, loss: 1.287872552871704 2023-01-24 00:27:16.361452: step: 688/466, loss: 4.796496391296387 2023-01-24 00:27:16.960846: step: 690/466, loss: 0.9580941200256348 2023-01-24 00:27:17.598757: step: 692/466, loss: 2.782583713531494 2023-01-24 00:27:18.197110: step: 694/466, loss: 0.9516090154647827 2023-01-24 00:27:18.818488: step: 696/466, loss: 1.9994728565216064 2023-01-24 00:27:19.455932: step: 698/466, loss: 1.4923850297927856 2023-01-24 00:27:20.019547: step: 700/466, loss: 1.8313624858856201 2023-01-24 
00:27:20.679245: step: 702/466, loss: 4.105555534362793 2023-01-24 00:27:21.305906: step: 704/466, loss: 16.962718963623047 2023-01-24 00:27:21.898529: step: 706/466, loss: 2.9927914142608643 2023-01-24 00:27:22.504410: step: 708/466, loss: 0.7770897746086121 2023-01-24 00:27:23.159076: step: 710/466, loss: 14.245176315307617 2023-01-24 00:27:23.797792: step: 712/466, loss: 1.8469178676605225 2023-01-24 00:27:24.392181: step: 714/466, loss: 1.0387372970581055 2023-01-24 00:27:24.986049: step: 716/466, loss: 3.556546688079834 2023-01-24 00:27:25.630455: step: 718/466, loss: 0.7969565987586975 2023-01-24 00:27:26.336618: step: 720/466, loss: 5.698383331298828 2023-01-24 00:27:26.892144: step: 722/466, loss: 1.0497674942016602 2023-01-24 00:27:27.480228: step: 724/466, loss: 0.8513200879096985 2023-01-24 00:27:28.071940: step: 726/466, loss: 1.0116870403289795 2023-01-24 00:27:28.661667: step: 728/466, loss: 0.7006529569625854 2023-01-24 00:27:29.308385: step: 730/466, loss: 2.557363986968994 2023-01-24 00:27:29.990131: step: 732/466, loss: 3.872546672821045 2023-01-24 00:27:30.546708: step: 734/466, loss: 3.9225711822509766 2023-01-24 00:27:31.148125: step: 736/466, loss: 5.718816757202148 2023-01-24 00:27:31.772165: step: 738/466, loss: 1.5927464962005615 2023-01-24 00:27:32.423295: step: 740/466, loss: 5.787430286407471 2023-01-24 00:27:33.061571: step: 742/466, loss: 0.9632762670516968 2023-01-24 00:27:33.655476: step: 744/466, loss: 0.8985154032707214 2023-01-24 00:27:34.260492: step: 746/466, loss: 6.514135360717773 2023-01-24 00:27:34.827398: step: 748/466, loss: 5.090004920959473 2023-01-24 00:27:35.446508: step: 750/466, loss: 0.6915642619132996 2023-01-24 00:27:36.053913: step: 752/466, loss: 1.69606614112854 2023-01-24 00:27:36.663931: step: 754/466, loss: 3.6167702674865723 2023-01-24 00:27:37.272981: step: 756/466, loss: 6.165261268615723 2023-01-24 00:27:37.878204: step: 758/466, loss: 1.6209237575531006 2023-01-24 00:27:38.509841: step: 760/466, loss: 
1.159082055091858 2023-01-24 00:27:39.110852: step: 762/466, loss: 1.1850038766860962 2023-01-24 00:27:39.789650: step: 764/466, loss: 0.9388983249664307 2023-01-24 00:27:40.409030: step: 766/466, loss: 2.732558488845825 2023-01-24 00:27:41.136323: step: 768/466, loss: 2.2904887199401855 2023-01-24 00:27:41.702666: step: 770/466, loss: 2.290951728820801 2023-01-24 00:27:42.301721: step: 772/466, loss: 0.5040983557701111 2023-01-24 00:27:42.924918: step: 774/466, loss: 1.4131964445114136 2023-01-24 00:27:43.528288: step: 776/466, loss: 0.7500594854354858 2023-01-24 00:27:44.143258: step: 778/466, loss: 1.775069236755371 2023-01-24 00:27:44.737414: step: 780/466, loss: 0.8299316763877869 2023-01-24 00:27:45.358699: step: 782/466, loss: 8.539618492126465 2023-01-24 00:27:45.946956: step: 784/466, loss: 8.508955001831055 2023-01-24 00:27:46.523856: step: 786/466, loss: 1.5897914171218872 2023-01-24 00:27:47.100235: step: 788/466, loss: 2.956247568130493 2023-01-24 00:27:47.724331: step: 790/466, loss: 1.2754576206207275 2023-01-24 00:27:48.316891: step: 792/466, loss: 1.6307601928710938 2023-01-24 00:27:48.948631: step: 794/466, loss: 8.023615837097168 2023-01-24 00:27:49.583487: step: 796/466, loss: 3.913644552230835 2023-01-24 00:27:50.267847: step: 798/466, loss: 1.479268193244934 2023-01-24 00:27:50.882465: step: 800/466, loss: 3.459024667739868 2023-01-24 00:27:51.427806: step: 802/466, loss: 1.494737982749939 2023-01-24 00:27:52.077008: step: 804/466, loss: 2.402879238128662 2023-01-24 00:27:52.740290: step: 806/466, loss: 2.3793046474456787 2023-01-24 00:27:53.353247: step: 808/466, loss: 7.7648234367370605 2023-01-24 00:27:54.012188: step: 810/466, loss: 12.321928024291992 2023-01-24 00:27:54.623201: step: 812/466, loss: 8.237139701843262 2023-01-24 00:27:55.244169: step: 814/466, loss: 2.2108066082000732 2023-01-24 00:27:55.835914: step: 816/466, loss: 1.0587334632873535 2023-01-24 00:27:56.515198: step: 818/466, loss: 2.11734938621521 2023-01-24 
00:27:57.153807: step: 820/466, loss: 1.61896550655365 2023-01-24 00:27:57.780661: step: 822/466, loss: 1.7837340831756592 2023-01-24 00:27:58.467899: step: 824/466, loss: 2.19197416305542 2023-01-24 00:27:59.052533: step: 826/466, loss: 1.5917332172393799 2023-01-24 00:27:59.627035: step: 828/466, loss: 3.1244866847991943 2023-01-24 00:28:00.340780: step: 830/466, loss: 3.1029326915740967 2023-01-24 00:28:00.967169: step: 832/466, loss: 3.0160324573516846 2023-01-24 00:28:01.578233: step: 834/466, loss: 2.6842033863067627 2023-01-24 00:28:02.277510: step: 836/466, loss: 1.0783379077911377 2023-01-24 00:28:02.945684: step: 838/466, loss: 0.7518795728683472 2023-01-24 00:28:03.575326: step: 840/466, loss: 2.020292043685913 2023-01-24 00:28:04.281003: step: 842/466, loss: 9.57175350189209 2023-01-24 00:28:04.858632: step: 844/466, loss: 3.6099765300750732 2023-01-24 00:28:05.454311: step: 846/466, loss: 0.6370089054107666 2023-01-24 00:28:06.094444: step: 848/466, loss: 1.0624032020568848 2023-01-24 00:28:06.757856: step: 850/466, loss: 0.9141412377357483 2023-01-24 00:28:07.437890: step: 852/466, loss: 0.9670308828353882 2023-01-24 00:28:08.009415: step: 854/466, loss: 0.788347065448761 2023-01-24 00:28:08.590485: step: 856/466, loss: 2.0098037719726562 2023-01-24 00:28:09.214593: step: 858/466, loss: 4.156428337097168 2023-01-24 00:28:09.767087: step: 860/466, loss: 0.8457652926445007 2023-01-24 00:28:10.361899: step: 862/466, loss: 8.276666641235352 2023-01-24 00:28:11.006565: step: 864/466, loss: 1.2429496049880981 2023-01-24 00:28:11.597490: step: 866/466, loss: 1.0977206230163574 2023-01-24 00:28:12.205453: step: 868/466, loss: 1.4717178344726562 2023-01-24 00:28:12.803423: step: 870/466, loss: 9.140847206115723 2023-01-24 00:28:13.480221: step: 872/466, loss: 1.118363618850708 2023-01-24 00:28:14.103562: step: 874/466, loss: 5.926425933837891 2023-01-24 00:28:14.706590: step: 876/466, loss: 2.4152932167053223 2023-01-24 00:28:15.366409: step: 878/466, loss: 
0.7910025715827942 2023-01-24 00:28:16.022834: step: 880/466, loss: 1.3264044523239136 2023-01-24 00:28:16.607348: step: 882/466, loss: 4.652287006378174 2023-01-24 00:28:17.235223: step: 884/466, loss: 2.289107322692871 2023-01-24 00:28:17.791217: step: 886/466, loss: 5.126692295074463 2023-01-24 00:28:18.389424: step: 888/466, loss: 0.8533294796943665 2023-01-24 00:28:18.965181: step: 890/466, loss: 0.5715285539627075 2023-01-24 00:28:19.561321: step: 892/466, loss: 1.5395843982696533 2023-01-24 00:28:20.165808: step: 894/466, loss: 0.9893038272857666 2023-01-24 00:28:20.795554: step: 896/466, loss: 1.1434332132339478 2023-01-24 00:28:21.345307: step: 898/466, loss: 5.790560722351074 2023-01-24 00:28:22.040629: step: 900/466, loss: 8.075478553771973 2023-01-24 00:28:22.729985: step: 902/466, loss: 3.857839584350586 2023-01-24 00:28:23.359020: step: 904/466, loss: 1.1071827411651611 2023-01-24 00:28:23.949336: step: 906/466, loss: 6.996962547302246 2023-01-24 00:28:24.576151: step: 908/466, loss: 2.3261501789093018 2023-01-24 00:28:25.139841: step: 910/466, loss: 4.405646324157715 2023-01-24 00:28:25.717699: step: 912/466, loss: 3.5805516242980957 2023-01-24 00:28:26.438400: step: 914/466, loss: 4.537283897399902 2023-01-24 00:28:27.039529: step: 916/466, loss: 3.6143927574157715 2023-01-24 00:28:27.604304: step: 918/466, loss: 5.42225456237793 2023-01-24 00:28:28.232715: step: 920/466, loss: 2.3025240898132324 2023-01-24 00:28:28.879605: step: 922/466, loss: 0.8708323240280151 2023-01-24 00:28:29.593075: step: 924/466, loss: 3.0053529739379883 2023-01-24 00:28:30.230631: step: 926/466, loss: 1.0579087734222412 2023-01-24 00:28:30.838783: step: 928/466, loss: 5.881206512451172 2023-01-24 00:28:31.497446: step: 930/466, loss: 0.386121928691864 2023-01-24 00:28:32.112667: step: 932/466, loss: 1.6460617780685425 ================================================== Loss: 5.728 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 
0.7368421052631579}, 'slot': {'p': 0.27001551046085, 'r': 0.10157483285767324, 'f1': 0.14761836972997017}, 'combined': 0.10877143032734643, 'epoch': 0}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3110226104830422, 'r': 0.03714101620029456, 'f1': 0.06635785549830063}, 'combined': 0.04400935494187813, 'epoch': 0}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26260040160642567, 'r': 0.08240390674228103, 'f1': 0.12544364508393283}, 'combined': 0.08362909672262188, 'epoch': 0}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.30982940051020413, 'r': 0.02981176362297497, 'f1': 0.054390114196148684}, 'combined': 0.03549670610696019, 'epoch': 0}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26020516074450084, 'r': 0.09690059861373661, 'f1': 0.14121326905417814}, 'combined': 0.10405188246097336, 'epoch': 0}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.32320065430752454, 'r': 0.036373956799214534, 'f1': 0.06538885824600112}, 'combined': 0.04336670391444633, 'epoch': 0}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22421875, 'r': 0.205, 'f1': 0.21417910447761193}, 'combined': 0.1427860696517413, 'epoch': 0}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.6, 'r': 0.10344827586206896, 'f1': 0.17647058823529413}, 'combined': 0.11764705882352941, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27001551046085, 'r': 0.10157483285767324, 'f1': 0.14761836972997017}, 'combined': 0.10877143032734643, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3110226104830422, 'r': 0.03714101620029456, 'f1': 0.06635785549830063}, 'combined': 0.04400935494187813, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22421875, 'r': 0.205, 'f1': 0.21417910447761193}, 'combined': 0.1427860696517413, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26260040160642567, 'r': 0.08240390674228103, 'f1': 0.12544364508393283}, 'combined': 0.08362909672262188, 'epoch': 0}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.30982940051020413, 'r': 0.02981176362297497, 'f1': 0.054390114196148684}, 'combined': 0.03549670610696019, 'epoch': 0}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26020516074450084, 'r': 0.09690059861373661, 'f1': 0.14121326905417814}, 'combined': 0.10405188246097336, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.32320065430752454, 'r': 0.036373956799214534, 'f1': 0.06538885824600112}, 'combined': 0.04336670391444633, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.6, 'r': 0.10344827586206896, 'f1': 0.17647058823529413}, 'combined': 
0.11764705882352941, 'epoch': 0} ****************************** Epoch: 1 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 00:31:26.564606: step: 2/466, loss: 12.579239845275879 2023-01-24 00:31:27.219568: step: 4/466, loss: 8.355171203613281 2023-01-24 00:31:27.743486: step: 6/466, loss: 1.0959560871124268 2023-01-24 00:31:28.394855: step: 8/466, loss: 2.282801866531372 2023-01-24 00:31:29.063106: step: 10/466, loss: 0.46709614992141724 2023-01-24 00:31:29.758310: step: 12/466, loss: 4.711222171783447 2023-01-24 00:31:30.353084: step: 14/466, loss: 2.075840950012207 2023-01-24 00:31:30.951561: step: 16/466, loss: 1.3335590362548828 2023-01-24 00:31:31.591636: step: 18/466, loss: 2.6483826637268066 2023-01-24 00:31:32.306423: step: 20/466, loss: 0.9723260402679443 2023-01-24 00:31:32.977220: step: 22/466, loss: 3.485568046569824 2023-01-24 00:31:33.626809: step: 24/466, loss: 1.1184146404266357 2023-01-24 00:31:34.317108: step: 26/466, loss: 0.5919270515441895 2023-01-24 00:31:34.879958: step: 28/466, loss: 0.5985679626464844 2023-01-24 00:31:35.435592: step: 30/466, loss: 1.5386550426483154 2023-01-24 00:31:36.068921: step: 32/466, loss: 0.29626911878585815 2023-01-24 00:31:36.678000: step: 34/466, loss: 1.6548181772232056 2023-01-24 00:31:37.297479: step: 36/466, loss: 1.5368632078170776 2023-01-24 00:31:37.866442: step: 38/466, loss: 1.086187481880188 2023-01-24 00:31:38.491950: step: 40/466, loss: 3.108809232711792 2023-01-24 00:31:39.100890: step: 42/466, loss: 1.6362186670303345 2023-01-24 00:31:39.741903: step: 44/466, loss: 2.8455889225006104 2023-01-24 00:31:40.361328: step: 46/466, loss: 2.073005437850952 2023-01-24 00:31:40.933751: step: 48/466, loss: 1.2606151103973389 2023-01-24 00:31:41.567538: step: 50/466, loss: 0.48470303416252136 2023-01-24 00:31:42.139193: step: 
52/466, loss: 1.1830860376358032 2023-01-24 00:31:42.750443: step: 54/466, loss: 1.5276299715042114 2023-01-24 00:31:43.405537: step: 56/466, loss: 5.069943428039551 2023-01-24 00:31:44.021831: step: 58/466, loss: 1.8566813468933105 2023-01-24 00:31:44.614266: step: 60/466, loss: 2.134775161743164 2023-01-24 00:31:45.227323: step: 62/466, loss: 1.424320936203003 2023-01-24 00:31:45.846462: step: 64/466, loss: 0.26281335949897766 2023-01-24 00:31:46.559163: step: 66/466, loss: 4.2092084884643555 2023-01-24 00:31:47.230994: step: 68/466, loss: 2.038078546524048 2023-01-24 00:31:47.840653: step: 70/466, loss: 0.6230260133743286 2023-01-24 00:31:48.451095: step: 72/466, loss: 0.9377046823501587 2023-01-24 00:31:49.177707: step: 74/466, loss: 4.913173675537109 2023-01-24 00:31:49.851280: step: 76/466, loss: 8.410189628601074 2023-01-24 00:31:50.469740: step: 78/466, loss: 3.003692150115967 2023-01-24 00:31:51.069429: step: 80/466, loss: 0.8860955238342285 2023-01-24 00:31:51.638139: step: 82/466, loss: 1.2767869234085083 2023-01-24 00:31:52.243981: step: 84/466, loss: 3.6442768573760986 2023-01-24 00:31:52.874872: step: 86/466, loss: 3.817312479019165 2023-01-24 00:31:53.507939: step: 88/466, loss: 1.678422451019287 2023-01-24 00:31:54.154864: step: 90/466, loss: 5.551665306091309 2023-01-24 00:31:54.772931: step: 92/466, loss: 2.8305606842041016 2023-01-24 00:31:55.444304: step: 94/466, loss: 0.44949406385421753 2023-01-24 00:31:56.027359: step: 96/466, loss: 3.2628884315490723 2023-01-24 00:31:56.613400: step: 98/466, loss: 0.6608309745788574 2023-01-24 00:31:57.254960: step: 100/466, loss: 3.4487357139587402 2023-01-24 00:31:57.877343: step: 102/466, loss: 2.0214779376983643 2023-01-24 00:31:58.556923: step: 104/466, loss: 1.7519176006317139 2023-01-24 00:31:59.170892: step: 106/466, loss: 0.37205448746681213 2023-01-24 00:31:59.843914: step: 108/466, loss: 3.332947254180908 2023-01-24 00:32:00.434536: step: 110/466, loss: 2.115652084350586 2023-01-24 
00:32:01.047103: step: 112/466, loss: 1.5125521421432495 2023-01-24 00:32:01.667192: step: 114/466, loss: 0.2838253676891327 2023-01-24 00:32:02.298834: step: 116/466, loss: 3.1312167644500732 2023-01-24 00:32:02.881076: step: 118/466, loss: 0.916276216506958 2023-01-24 00:32:03.438094: step: 120/466, loss: 0.8722048997879028 2023-01-24 00:32:04.051214: step: 122/466, loss: 1.2177882194519043 2023-01-24 00:32:04.717732: step: 124/466, loss: 1.0140763521194458 2023-01-24 00:32:05.364483: step: 126/466, loss: 1.6305534839630127 2023-01-24 00:32:05.988622: step: 128/466, loss: 0.47239357233047485 2023-01-24 00:32:06.557365: step: 130/466, loss: 2.2214860916137695 2023-01-24 00:32:07.190519: step: 132/466, loss: 1.5602493286132812 2023-01-24 00:32:07.790065: step: 134/466, loss: 0.5952809453010559 2023-01-24 00:32:08.439962: step: 136/466, loss: 0.4576283097267151 2023-01-24 00:32:09.003010: step: 138/466, loss: 1.1963032484054565 2023-01-24 00:32:09.601066: step: 140/466, loss: 2.7550652027130127 2023-01-24 00:32:10.246713: step: 142/466, loss: 0.2905564606189728 2023-01-24 00:32:10.941467: step: 144/466, loss: 0.7401619553565979 2023-01-24 00:32:11.574971: step: 146/466, loss: 0.9804503917694092 2023-01-24 00:32:12.248670: step: 148/466, loss: 2.015255928039551 2023-01-24 00:32:12.854616: step: 150/466, loss: 0.6618127822875977 2023-01-24 00:32:13.487976: step: 152/466, loss: 0.9437558054924011 2023-01-24 00:32:14.045006: step: 154/466, loss: 5.640690803527832 2023-01-24 00:32:14.764127: step: 156/466, loss: 0.5138984322547913 2023-01-24 00:32:15.399826: step: 158/466, loss: 0.4257822334766388 2023-01-24 00:32:15.967876: step: 160/466, loss: 0.8269299864768982 2023-01-24 00:32:16.571478: step: 162/466, loss: 1.4057587385177612 2023-01-24 00:32:17.189618: step: 164/466, loss: 0.7340868711471558 2023-01-24 00:32:17.816541: step: 166/466, loss: 0.9378079175949097 2023-01-24 00:32:18.454774: step: 168/466, loss: 1.642561912536621 2023-01-24 00:32:19.078775: step: 
170/466, loss: 1.8315976858139038 2023-01-24 00:32:19.699284: step: 172/466, loss: 2.245410442352295 2023-01-24 00:32:20.392637: step: 174/466, loss: 1.2697768211364746 2023-01-24 00:32:20.974671: step: 176/466, loss: 2.777341365814209 2023-01-24 00:32:21.651403: step: 178/466, loss: 0.9618198871612549 2023-01-24 00:32:22.277468: step: 180/466, loss: 2.3667354583740234 2023-01-24 00:32:22.920918: step: 182/466, loss: 1.2746222019195557 2023-01-24 00:32:23.530063: step: 184/466, loss: 3.522149085998535 2023-01-24 00:32:24.133010: step: 186/466, loss: 1.1410295963287354 2023-01-24 00:32:24.753989: step: 188/466, loss: 0.7399966716766357 2023-01-24 00:32:25.426385: step: 190/466, loss: 0.5701156258583069 2023-01-24 00:32:25.986650: step: 192/466, loss: 1.5691550970077515 2023-01-24 00:32:26.682559: step: 194/466, loss: 1.9450173377990723 2023-01-24 00:32:27.377832: step: 196/466, loss: 1.6858083009719849 2023-01-24 00:32:28.022170: step: 198/466, loss: 0.5167574882507324 2023-01-24 00:32:28.667603: step: 200/466, loss: 0.9078482389450073 2023-01-24 00:32:29.268572: step: 202/466, loss: 0.27712491154670715 2023-01-24 00:32:29.890159: step: 204/466, loss: 0.9520632028579712 2023-01-24 00:32:30.468710: step: 206/466, loss: 1.923095464706421 2023-01-24 00:32:31.042008: step: 208/466, loss: 5.610950469970703 2023-01-24 00:32:31.671372: step: 210/466, loss: 6.252904891967773 2023-01-24 00:32:32.382109: step: 212/466, loss: 1.0028800964355469 2023-01-24 00:32:33.061265: step: 214/466, loss: 0.7181891202926636 2023-01-24 00:32:33.695489: step: 216/466, loss: 0.6957169771194458 2023-01-24 00:32:34.368482: step: 218/466, loss: 2.1483540534973145 2023-01-24 00:32:34.968014: step: 220/466, loss: 1.477760910987854 2023-01-24 00:32:35.607905: step: 222/466, loss: 0.40660589933395386 2023-01-24 00:32:36.207415: step: 224/466, loss: 1.5613031387329102 2023-01-24 00:32:36.800271: step: 226/466, loss: 0.7885868549346924 2023-01-24 00:32:37.436807: step: 228/466, loss: 
2.1106314659118652 2023-01-24 00:32:38.064771: step: 230/466, loss: 3.05161452293396 2023-01-24 00:32:38.703880: step: 232/466, loss: 0.373366117477417 2023-01-24 00:32:39.307634: step: 234/466, loss: 0.3329656422138214 2023-01-24 00:32:39.919332: step: 236/466, loss: 0.6629565954208374 2023-01-24 00:32:40.559069: step: 238/466, loss: 6.061126232147217 2023-01-24 00:32:41.191043: step: 240/466, loss: 4.7828497886657715 2023-01-24 00:32:41.833531: step: 242/466, loss: 3.0968270301818848 2023-01-24 00:32:42.449832: step: 244/466, loss: 2.103395462036133 2023-01-24 00:32:43.050653: step: 246/466, loss: 0.6191222071647644 2023-01-24 00:32:43.734458: step: 248/466, loss: 2.9041786193847656 2023-01-24 00:32:44.390315: step: 250/466, loss: 1.4008110761642456 2023-01-24 00:32:45.009486: step: 252/466, loss: 0.6556980013847351 2023-01-24 00:32:45.650979: step: 254/466, loss: 4.224248886108398 2023-01-24 00:32:46.274515: step: 256/466, loss: 2.330866813659668 2023-01-24 00:32:46.856836: step: 258/466, loss: 0.3894461989402771 2023-01-24 00:32:47.619807: step: 260/466, loss: 2.0729668140411377 2023-01-24 00:32:48.304158: step: 262/466, loss: 2.676603317260742 2023-01-24 00:32:49.001074: step: 264/466, loss: 2.0413904190063477 2023-01-24 00:32:49.770538: step: 266/466, loss: 1.497300624847412 2023-01-24 00:32:50.385910: step: 268/466, loss: 1.5379443168640137 2023-01-24 00:32:51.009972: step: 270/466, loss: 1.128397822380066 2023-01-24 00:32:51.617070: step: 272/466, loss: 1.7599122524261475 2023-01-24 00:32:52.240153: step: 274/466, loss: 0.8250323534011841 2023-01-24 00:32:52.851460: step: 276/466, loss: 3.5199387073516846 2023-01-24 00:32:53.518338: step: 278/466, loss: 0.5130527019500732 2023-01-24 00:32:54.159021: step: 280/466, loss: 1.7161411046981812 2023-01-24 00:32:54.808597: step: 282/466, loss: 1.1118619441986084 2023-01-24 00:32:55.430524: step: 284/466, loss: 1.6944289207458496 2023-01-24 00:32:56.041479: step: 286/466, loss: 1.8150008916854858 2023-01-24 
00:32:56.826577: step: 288/466, loss: 1.1598901748657227 2023-01-24 00:32:57.487628: step: 290/466, loss: 2.952678680419922 2023-01-24 00:32:58.112223: step: 292/466, loss: 0.9587304592132568 2023-01-24 00:32:58.741856: step: 294/466, loss: 1.411131739616394 2023-01-24 00:32:59.317199: step: 296/466, loss: 0.7512630820274353 2023-01-24 00:32:59.939136: step: 298/466, loss: 0.33326366543769836 2023-01-24 00:33:00.570825: step: 300/466, loss: 0.570182740688324 2023-01-24 00:33:01.225402: step: 302/466, loss: 1.506582260131836 2023-01-24 00:33:01.934371: step: 304/466, loss: 2.3775475025177 2023-01-24 00:33:02.629584: step: 306/466, loss: 1.159224033355713 2023-01-24 00:33:03.245155: step: 308/466, loss: 1.5621230602264404 2023-01-24 00:33:03.841730: step: 310/466, loss: 1.3572973012924194 2023-01-24 00:33:04.440839: step: 312/466, loss: 0.7192381620407104 2023-01-24 00:33:05.058632: step: 314/466, loss: 0.8887008428573608 2023-01-24 00:33:05.681271: step: 316/466, loss: 0.34289538860321045 2023-01-24 00:33:06.297291: step: 318/466, loss: 1.4671778678894043 2023-01-24 00:33:06.941369: step: 320/466, loss: 4.913384914398193 2023-01-24 00:33:07.538543: step: 322/466, loss: 1.5437592267990112 2023-01-24 00:33:08.114565: step: 324/466, loss: 1.5716824531555176 2023-01-24 00:33:08.768195: step: 326/466, loss: 1.6771984100341797 2023-01-24 00:33:09.368956: step: 328/466, loss: 0.4275144338607788 2023-01-24 00:33:09.998612: step: 330/466, loss: 1.4679495096206665 2023-01-24 00:33:10.626205: step: 332/466, loss: 0.2106197476387024 2023-01-24 00:33:11.293273: step: 334/466, loss: 4.643703460693359 2023-01-24 00:33:11.882361: step: 336/466, loss: 0.6623115539550781 2023-01-24 00:33:12.468729: step: 338/466, loss: 6.546868801116943 2023-01-24 00:33:13.133347: step: 340/466, loss: 11.798778533935547 2023-01-24 00:33:13.772902: step: 342/466, loss: 0.8140530586242676 2023-01-24 00:33:14.380635: step: 344/466, loss: 1.2153860330581665 2023-01-24 00:33:14.962362: step: 346/466, 
loss: 4.89198112487793 2023-01-24 00:33:15.576417: step: 348/466, loss: 1.0797613859176636 2023-01-24 00:33:16.176006: step: 350/466, loss: 0.4592924118041992 2023-01-24 00:33:16.798350: step: 352/466, loss: 0.5482463836669922 2023-01-24 00:33:17.489705: step: 354/466, loss: 1.0288336277008057 2023-01-24 00:33:18.058585: step: 356/466, loss: 4.854940891265869 2023-01-24 00:33:18.598252: step: 358/466, loss: 0.894655168056488 2023-01-24 00:33:19.235862: step: 360/466, loss: 2.249600410461426 2023-01-24 00:33:19.823018: step: 362/466, loss: 2.275702476501465 2023-01-24 00:33:20.467538: step: 364/466, loss: 3.057396411895752 2023-01-24 00:33:21.069168: step: 366/466, loss: 3.021085262298584 2023-01-24 00:33:21.742254: step: 368/466, loss: 0.44216188788414 2023-01-24 00:33:22.357013: step: 370/466, loss: 0.9607877135276794 2023-01-24 00:33:22.951920: step: 372/466, loss: 1.0053690671920776 2023-01-24 00:33:23.574681: step: 374/466, loss: 2.570640802383423 2023-01-24 00:33:24.271408: step: 376/466, loss: 2.2469892501831055 2023-01-24 00:33:24.880414: step: 378/466, loss: 5.98201322555542 2023-01-24 00:33:25.522479: step: 380/466, loss: 0.677047610282898 2023-01-24 00:33:26.110439: step: 382/466, loss: 0.7705219984054565 2023-01-24 00:33:26.718419: step: 384/466, loss: 1.9747231006622314 2023-01-24 00:33:27.387316: step: 386/466, loss: 0.9605329036712646 2023-01-24 00:33:27.967757: step: 388/466, loss: 1.0569020509719849 2023-01-24 00:33:28.521144: step: 390/466, loss: 1.4205042123794556 2023-01-24 00:33:29.197019: step: 392/466, loss: 0.48208555579185486 2023-01-24 00:33:29.927790: step: 394/466, loss: 1.7804467678070068 2023-01-24 00:33:30.547952: step: 396/466, loss: 2.304171085357666 2023-01-24 00:33:31.100861: step: 398/466, loss: 10.734625816345215 2023-01-24 00:33:31.693817: step: 400/466, loss: 0.8132683038711548 2023-01-24 00:33:32.351891: step: 402/466, loss: 1.5852059125900269 2023-01-24 00:33:32.992343: step: 404/466, loss: 1.0326261520385742 2023-01-24 
00:33:33.580020: step: 406/466, loss: 0.21298131346702576 2023-01-24 00:33:34.163402: step: 408/466, loss: 0.6365259885787964 2023-01-24 00:33:34.753258: step: 410/466, loss: 0.5460081696510315 2023-01-24 00:33:35.347923: step: 412/466, loss: 2.189100742340088 2023-01-24 00:33:35.985333: step: 414/466, loss: 0.9356738924980164 2023-01-24 00:33:36.578118: step: 416/466, loss: 2.1108479499816895 2023-01-24 00:33:37.153500: step: 418/466, loss: 0.6288458108901978 2023-01-24 00:33:37.771141: step: 420/466, loss: 1.7310302257537842 2023-01-24 00:33:38.487293: step: 422/466, loss: 2.0583267211914062 2023-01-24 00:33:39.047873: step: 424/466, loss: 0.34806692600250244 2023-01-24 00:33:39.681087: step: 426/466, loss: 3.313180923461914 2023-01-24 00:33:40.324664: step: 428/466, loss: 1.883329153060913 2023-01-24 00:33:40.927100: step: 430/466, loss: 0.9804830551147461 2023-01-24 00:33:41.548588: step: 432/466, loss: 0.4481969475746155 2023-01-24 00:33:42.176427: step: 434/466, loss: 3.6360621452331543 2023-01-24 00:33:42.770455: step: 436/466, loss: 1.488958716392517 2023-01-24 00:33:43.342700: step: 438/466, loss: 1.733879566192627 2023-01-24 00:33:43.893029: step: 440/466, loss: 2.5470359325408936 2023-01-24 00:33:44.533972: step: 442/466, loss: 4.520044803619385 2023-01-24 00:33:45.156722: step: 444/466, loss: 1.3759305477142334 2023-01-24 00:33:45.740312: step: 446/466, loss: 1.4914339780807495 2023-01-24 00:33:46.354048: step: 448/466, loss: 2.6907448768615723 2023-01-24 00:33:46.977287: step: 450/466, loss: 0.8654686808586121 2023-01-24 00:33:47.518226: step: 452/466, loss: 2.585897922515869 2023-01-24 00:33:48.140799: step: 454/466, loss: 3.011698007583618 2023-01-24 00:33:48.743511: step: 456/466, loss: 2.0853371620178223 2023-01-24 00:33:49.438899: step: 458/466, loss: 1.8473291397094727 2023-01-24 00:33:50.084250: step: 460/466, loss: 1.509905457496643 2023-01-24 00:33:50.675109: step: 462/466, loss: 0.45855090022087097 2023-01-24 00:33:51.282381: step: 464/466, 
loss: 1.5549182891845703 2023-01-24 00:33:51.926637: step: 466/466, loss: 0.8358886241912842 2023-01-24 00:33:52.542133: step: 468/466, loss: 3.466428279876709 2023-01-24 00:33:53.114292: step: 470/466, loss: 1.2473845481872559 2023-01-24 00:33:53.827519: step: 472/466, loss: 2.0258431434631348 2023-01-24 00:33:54.484946: step: 474/466, loss: 1.6379798650741577 2023-01-24 00:33:55.093500: step: 476/466, loss: 3.311021566390991 2023-01-24 00:33:55.679977: step: 478/466, loss: 0.4525070786476135 2023-01-24 00:33:56.309394: step: 480/466, loss: 0.9990955591201782 2023-01-24 00:33:57.064108: step: 482/466, loss: 0.23777519166469574 2023-01-24 00:33:57.670834: step: 484/466, loss: 6.6074652671813965 2023-01-24 00:33:58.323380: step: 486/466, loss: 2.4265973567962646 2023-01-24 00:33:58.941811: step: 488/466, loss: 1.9913256168365479 2023-01-24 00:33:59.591594: step: 490/466, loss: 2.090097665786743 2023-01-24 00:34:00.220075: step: 492/466, loss: 0.9152706861495972 2023-01-24 00:34:00.910988: step: 494/466, loss: 0.4318042993545532 2023-01-24 00:34:01.572537: step: 496/466, loss: 0.695695161819458 2023-01-24 00:34:02.190170: step: 498/466, loss: 1.1006947755813599 2023-01-24 00:34:02.887330: step: 500/466, loss: 0.4782952070236206 2023-01-24 00:34:03.500723: step: 502/466, loss: 3.6143462657928467 2023-01-24 00:34:04.095410: step: 504/466, loss: 0.45223885774612427 2023-01-24 00:34:04.756265: step: 506/466, loss: 4.147829532623291 2023-01-24 00:34:05.343933: step: 508/466, loss: 2.7985970973968506 2023-01-24 00:34:06.008109: step: 510/466, loss: 0.9218372702598572 2023-01-24 00:34:06.598110: step: 512/466, loss: 3.021129608154297 2023-01-24 00:34:07.291770: step: 514/466, loss: 2.36238956451416 2023-01-24 00:34:07.910515: step: 516/466, loss: 1.4599518775939941 2023-01-24 00:34:08.521042: step: 518/466, loss: 1.1134390830993652 2023-01-24 00:34:09.169364: step: 520/466, loss: 1.2822983264923096 2023-01-24 00:34:09.825325: step: 522/466, loss: 2.5615310668945312 
2023-01-24 00:34:10.493309: step: 524/466, loss: 0.5547291040420532 2023-01-24 00:34:11.090977: step: 526/466, loss: 3.790654182434082 2023-01-24 00:34:11.743172: step: 528/466, loss: 0.6939271688461304 2023-01-24 00:34:12.387578: step: 530/466, loss: 1.255786418914795 2023-01-24 00:34:13.029670: step: 532/466, loss: 1.297980546951294 2023-01-24 00:34:13.697179: step: 534/466, loss: 2.0212531089782715 2023-01-24 00:34:14.256928: step: 536/466, loss: 1.225157380104065 2023-01-24 00:34:14.876043: step: 538/466, loss: 0.2992454767227173 2023-01-24 00:34:15.521261: step: 540/466, loss: 3.0153403282165527 2023-01-24 00:34:16.146416: step: 542/466, loss: 0.43518757820129395 2023-01-24 00:34:16.857581: step: 544/466, loss: 1.9364233016967773 2023-01-24 00:34:17.495427: step: 546/466, loss: 2.9780986309051514 2023-01-24 00:34:18.123919: step: 548/466, loss: 0.9115053415298462 2023-01-24 00:34:18.792798: step: 550/466, loss: 0.8266890048980713 2023-01-24 00:34:19.378322: step: 552/466, loss: 1.7876451015472412 2023-01-24 00:34:19.979760: step: 554/466, loss: 0.3994894027709961 2023-01-24 00:34:20.598978: step: 556/466, loss: 1.1966663599014282 2023-01-24 00:34:21.243587: step: 558/466, loss: 8.233001708984375 2023-01-24 00:34:21.879565: step: 560/466, loss: 1.256848692893982 2023-01-24 00:34:22.480318: step: 562/466, loss: 1.3873143196105957 2023-01-24 00:34:23.098770: step: 564/466, loss: 1.9018683433532715 2023-01-24 00:34:23.714908: step: 566/466, loss: 0.48335394263267517 2023-01-24 00:34:24.334543: step: 568/466, loss: 1.3220398426055908 2023-01-24 00:34:24.992683: step: 570/466, loss: 1.0557098388671875 2023-01-24 00:34:25.601931: step: 572/466, loss: 0.586812436580658 2023-01-24 00:34:26.162452: step: 574/466, loss: 1.9707565307617188 2023-01-24 00:34:26.829460: step: 576/466, loss: 0.7529993057250977 2023-01-24 00:34:27.413102: step: 578/466, loss: 1.8851854801177979 2023-01-24 00:34:28.012395: step: 580/466, loss: 1.3797357082366943 2023-01-24 00:34:28.621820: 
step: 582/466, loss: 3.5268735885620117 2023-01-24 00:34:29.227559: step: 584/466, loss: 1.2795867919921875 2023-01-24 00:34:29.868275: step: 586/466, loss: 1.1228471994400024 2023-01-24 00:34:30.453765: step: 588/466, loss: 0.9561172127723694 2023-01-24 00:34:31.021981: step: 590/466, loss: 1.8342702388763428 2023-01-24 00:34:31.658244: step: 592/466, loss: 2.8268752098083496 2023-01-24 00:34:32.291779: step: 594/466, loss: 0.9906534552574158 2023-01-24 00:34:32.948067: step: 596/466, loss: 1.8398644924163818 2023-01-24 00:34:33.615304: step: 598/466, loss: 0.46265068650245667 2023-01-24 00:34:34.256321: step: 600/466, loss: 0.5323865413665771 2023-01-24 00:34:34.875915: step: 602/466, loss: 1.052520751953125 2023-01-24 00:34:35.471761: step: 604/466, loss: 9.658276557922363 2023-01-24 00:34:36.151597: step: 606/466, loss: 0.883283257484436 2023-01-24 00:34:36.754827: step: 608/466, loss: 0.7546254396438599 2023-01-24 00:34:37.426043: step: 610/466, loss: 6.600062847137451 2023-01-24 00:34:38.078747: step: 612/466, loss: 5.850398063659668 2023-01-24 00:34:38.734785: step: 614/466, loss: 0.40383315086364746 2023-01-24 00:34:39.350340: step: 616/466, loss: 2.252748966217041 2023-01-24 00:34:39.915231: step: 618/466, loss: 0.9612758755683899 2023-01-24 00:34:40.533264: step: 620/466, loss: 0.3715060353279114 2023-01-24 00:34:41.174004: step: 622/466, loss: 1.1357539892196655 2023-01-24 00:34:41.859032: step: 624/466, loss: 0.7216935157775879 2023-01-24 00:34:42.443759: step: 626/466, loss: 2.094508171081543 2023-01-24 00:34:43.089512: step: 628/466, loss: 3.520603656768799 2023-01-24 00:34:43.681435: step: 630/466, loss: 0.35452020168304443 2023-01-24 00:34:44.295350: step: 632/466, loss: 1.1755645275115967 2023-01-24 00:34:44.887641: step: 634/466, loss: 2.9755730628967285 2023-01-24 00:34:45.509446: step: 636/466, loss: 0.28536292910575867 2023-01-24 00:34:46.130952: step: 638/466, loss: 0.6651081442832947 2023-01-24 00:34:46.718641: step: 640/466, loss: 
1.1705361604690552 2023-01-24 00:34:47.334967: step: 642/466, loss: 1.835726261138916 2023-01-24 00:34:47.960488: step: 644/466, loss: 1.3732147216796875 2023-01-24 00:34:48.639498: step: 646/466, loss: 0.21464434266090393 2023-01-24 00:34:49.216840: step: 648/466, loss: 0.518509030342102 2023-01-24 00:34:49.814322: step: 650/466, loss: 1.4306645393371582 2023-01-24 00:34:50.431068: step: 652/466, loss: 2.33333683013916 2023-01-24 00:34:50.997503: step: 654/466, loss: 2.096909761428833 2023-01-24 00:34:51.680152: step: 656/466, loss: 1.3629062175750732 2023-01-24 00:34:52.308424: step: 658/466, loss: 2.293306350708008 2023-01-24 00:34:52.944906: step: 660/466, loss: 1.3752634525299072 2023-01-24 00:34:53.560239: step: 662/466, loss: 0.5292986035346985 2023-01-24 00:34:54.235104: step: 664/466, loss: 0.8116971850395203 2023-01-24 00:34:54.849924: step: 666/466, loss: 1.6029118299484253 2023-01-24 00:34:55.450261: step: 668/466, loss: 0.9917951822280884 2023-01-24 00:34:56.129829: step: 670/466, loss: 10.244860649108887 2023-01-24 00:34:56.743748: step: 672/466, loss: 1.4981614351272583 2023-01-24 00:34:57.345995: step: 674/466, loss: 1.0083128213882446 2023-01-24 00:34:57.932941: step: 676/466, loss: 6.35294771194458 2023-01-24 00:34:58.544963: step: 678/466, loss: 3.928889751434326 2023-01-24 00:34:59.103081: step: 680/466, loss: 0.8590149283409119 2023-01-24 00:34:59.684243: step: 682/466, loss: 0.2028389275074005 2023-01-24 00:35:00.332047: step: 684/466, loss: 0.860236644744873 2023-01-24 00:35:00.860630: step: 686/466, loss: 2.617568254470825 2023-01-24 00:35:01.498338: step: 688/466, loss: 1.339379906654358 2023-01-24 00:35:02.134906: step: 690/466, loss: 2.5586915016174316 2023-01-24 00:35:02.773321: step: 692/466, loss: 3.329720973968506 2023-01-24 00:35:03.359738: step: 694/466, loss: 1.036975622177124 2023-01-24 00:35:04.070310: step: 696/466, loss: 1.7532050609588623 2023-01-24 00:35:04.736250: step: 698/466, loss: 1.2536813020706177 2023-01-24 
00:35:05.362968: step: 700/466, loss: 1.6638786792755127 2023-01-24 00:35:05.953634: step: 702/466, loss: 0.9238472580909729 2023-01-24 00:35:06.620051: step: 704/466, loss: 0.3592261075973511 2023-01-24 00:35:07.231076: step: 706/466, loss: 3.991269588470459 2023-01-24 00:35:07.836307: step: 708/466, loss: 4.487711429595947 2023-01-24 00:35:08.406665: step: 710/466, loss: 1.7793514728546143 2023-01-24 00:35:09.048087: step: 712/466, loss: 4.636097431182861 2023-01-24 00:35:09.703639: step: 714/466, loss: 3.309096097946167 2023-01-24 00:35:10.317012: step: 716/466, loss: 0.39011961221694946 2023-01-24 00:35:10.973520: step: 718/466, loss: 2.96675968170166 2023-01-24 00:35:11.573766: step: 720/466, loss: 2.651958465576172 2023-01-24 00:35:12.248121: step: 722/466, loss: 2.880375623703003 2023-01-24 00:35:12.893494: step: 724/466, loss: 1.8882092237472534 2023-01-24 00:35:13.511896: step: 726/466, loss: 0.37530142068862915 2023-01-24 00:35:14.100382: step: 728/466, loss: 1.3665651082992554 2023-01-24 00:35:14.714405: step: 730/466, loss: 0.3963625431060791 2023-01-24 00:35:15.314015: step: 732/466, loss: 5.215664386749268 2023-01-24 00:35:15.912803: step: 734/466, loss: 1.9291287660598755 2023-01-24 00:35:16.565729: step: 736/466, loss: 0.7714915871620178 2023-01-24 00:35:17.162933: step: 738/466, loss: 1.3604142665863037 2023-01-24 00:35:17.797660: step: 740/466, loss: 0.4155275821685791 2023-01-24 00:35:18.414968: step: 742/466, loss: 1.0549376010894775 2023-01-24 00:35:19.114170: step: 744/466, loss: 0.4867190718650818 2023-01-24 00:35:19.792521: step: 746/466, loss: 1.5686765909194946 2023-01-24 00:35:20.369438: step: 748/466, loss: 1.4739316701889038 2023-01-24 00:35:21.004585: step: 750/466, loss: 0.34707027673721313 2023-01-24 00:35:21.627535: step: 752/466, loss: 5.242847442626953 2023-01-24 00:35:22.229906: step: 754/466, loss: 0.6374714970588684 2023-01-24 00:35:22.828378: step: 756/466, loss: 1.4829870462417603 2023-01-24 00:35:23.494387: step: 758/466, 
loss: 6.024981498718262 2023-01-24 00:35:24.076755: step: 760/466, loss: 1.0136044025421143 2023-01-24 00:35:24.678923: step: 762/466, loss: 0.5662339329719543 2023-01-24 00:35:25.317783: step: 764/466, loss: 2.2438929080963135 2023-01-24 00:35:25.987305: step: 766/466, loss: 0.7552876472473145 2023-01-24 00:35:26.659963: step: 768/466, loss: 1.6896439790725708 2023-01-24 00:35:27.237809: step: 770/466, loss: 10.491397857666016 2023-01-24 00:35:27.834975: step: 772/466, loss: 0.5391125679016113 2023-01-24 00:35:28.545284: step: 774/466, loss: 1.70233154296875 2023-01-24 00:35:29.139713: step: 776/466, loss: 0.6633647680282593 2023-01-24 00:35:29.768129: step: 778/466, loss: 0.354541152715683 2023-01-24 00:35:30.423375: step: 780/466, loss: 1.1705315113067627 2023-01-24 00:35:31.026688: step: 782/466, loss: 1.2291159629821777 2023-01-24 00:35:31.643229: step: 784/466, loss: 2.6389012336730957 2023-01-24 00:35:32.244328: step: 786/466, loss: 0.4652237892150879 2023-01-24 00:35:32.880473: step: 788/466, loss: 1.0216243267059326 2023-01-24 00:35:33.449779: step: 790/466, loss: 0.6143762469291687 2023-01-24 00:35:34.110023: step: 792/466, loss: 3.523286819458008 2023-01-24 00:35:34.790715: step: 794/466, loss: 2.3982529640197754 2023-01-24 00:35:35.462545: step: 796/466, loss: 0.6625871062278748 2023-01-24 00:35:36.068565: step: 798/466, loss: 0.9518342614173889 2023-01-24 00:35:36.701975: step: 800/466, loss: 1.6259984970092773 2023-01-24 00:35:37.397250: step: 802/466, loss: 0.7471064329147339 2023-01-24 00:35:38.067610: step: 804/466, loss: 1.2354010343551636 2023-01-24 00:35:38.915350: step: 806/466, loss: 0.4772886037826538 2023-01-24 00:35:39.576182: step: 808/466, loss: 1.1822104454040527 2023-01-24 00:35:40.223241: step: 810/466, loss: 0.5853030681610107 2023-01-24 00:35:40.847338: step: 812/466, loss: 1.3605884313583374 2023-01-24 00:35:41.420065: step: 814/466, loss: 3.5283548831939697 2023-01-24 00:35:42.051516: step: 816/466, loss: 3.4050121307373047 
2023-01-24 00:35:42.687211: step: 818/466, loss: 4.525536060333252 2023-01-24 00:35:43.325036: step: 820/466, loss: 1.727461576461792 2023-01-24 00:35:43.912362: step: 822/466, loss: 1.5522887706756592 2023-01-24 00:35:44.628148: step: 824/466, loss: 2.8800466060638428 2023-01-24 00:35:45.205465: step: 826/466, loss: 4.962917327880859 2023-01-24 00:35:45.850526: step: 828/466, loss: 0.2558228075504303 2023-01-24 00:35:46.458800: step: 830/466, loss: 0.6200178265571594 2023-01-24 00:35:47.106698: step: 832/466, loss: 9.803146362304688 2023-01-24 00:35:47.726893: step: 834/466, loss: 0.5240775346755981 2023-01-24 00:35:48.336873: step: 836/466, loss: 0.3322848677635193 2023-01-24 00:35:48.986561: step: 838/466, loss: 0.4767006039619446 2023-01-24 00:35:49.578394: step: 840/466, loss: 0.3725050687789917 2023-01-24 00:35:50.186988: step: 842/466, loss: 1.8058209419250488 2023-01-24 00:35:50.739838: step: 844/466, loss: 1.364620566368103 2023-01-24 00:35:51.519153: step: 846/466, loss: 0.6291117668151855 2023-01-24 00:35:52.155875: step: 848/466, loss: 0.9351247549057007 2023-01-24 00:35:52.862826: step: 850/466, loss: 0.8235954642295837 2023-01-24 00:35:53.449701: step: 852/466, loss: 0.7105476260185242 2023-01-24 00:35:54.139419: step: 854/466, loss: 0.8148956894874573 2023-01-24 00:35:54.727603: step: 856/466, loss: 0.2897741496562958 2023-01-24 00:35:55.369967: step: 858/466, loss: 1.7938950061798096 2023-01-24 00:35:55.921405: step: 860/466, loss: 1.265045166015625 2023-01-24 00:35:56.521172: step: 862/466, loss: 0.6019369959831238 2023-01-24 00:35:57.078259: step: 864/466, loss: 0.36597803235054016 2023-01-24 00:35:57.781214: step: 866/466, loss: 1.755831241607666 2023-01-24 00:35:58.414754: step: 868/466, loss: 0.627671480178833 2023-01-24 00:35:59.042935: step: 870/466, loss: 1.4150893688201904 2023-01-24 00:35:59.686970: step: 872/466, loss: 5.4517316818237305 2023-01-24 00:36:00.328264: step: 874/466, loss: 9.735913276672363 2023-01-24 00:36:00.922534: step: 
876/466, loss: 3.080693244934082 2023-01-24 00:36:01.599916: step: 878/466, loss: 1.8908751010894775 2023-01-24 00:36:02.233321: step: 880/466, loss: 1.8814504146575928 2023-01-24 00:36:02.855585: step: 882/466, loss: 0.810623049736023 2023-01-24 00:36:03.477520: step: 884/466, loss: 0.9066575765609741 2023-01-24 00:36:04.116037: step: 886/466, loss: 0.49293220043182373 2023-01-24 00:36:04.799382: step: 888/466, loss: 5.225826740264893 2023-01-24 00:36:05.404158: step: 890/466, loss: 0.3955407738685608 2023-01-24 00:36:06.065149: step: 892/466, loss: 1.4828016757965088 2023-01-24 00:36:06.690301: step: 894/466, loss: 2.075650930404663 2023-01-24 00:36:07.296003: step: 896/466, loss: 1.8842453956604004 2023-01-24 00:36:07.950932: step: 898/466, loss: 0.8249079585075378 2023-01-24 00:36:08.603185: step: 900/466, loss: 1.673976182937622 2023-01-24 00:36:09.218679: step: 902/466, loss: 1.0010340213775635 2023-01-24 00:36:09.811058: step: 904/466, loss: 0.965110182762146 2023-01-24 00:36:10.437130: step: 906/466, loss: 0.7587573528289795 2023-01-24 00:36:11.159003: step: 908/466, loss: 2.1170947551727295 2023-01-24 00:36:11.776807: step: 910/466, loss: 3.88934326171875 2023-01-24 00:36:12.411811: step: 912/466, loss: 1.153349757194519 2023-01-24 00:36:13.009521: step: 914/466, loss: 1.6431703567504883 2023-01-24 00:36:13.581384: step: 916/466, loss: 0.47427424788475037 2023-01-24 00:36:14.255974: step: 918/466, loss: 1.332406997680664 2023-01-24 00:36:14.867263: step: 920/466, loss: 0.9475909471511841 2023-01-24 00:36:15.494181: step: 922/466, loss: 1.633178949356079 2023-01-24 00:36:16.159610: step: 924/466, loss: 1.9771479368209839 2023-01-24 00:36:16.766858: step: 926/466, loss: 0.8867887854576111 2023-01-24 00:36:17.388788: step: 928/466, loss: 1.328560709953308 2023-01-24 00:36:17.978982: step: 930/466, loss: 2.47381591796875 2023-01-24 00:36:18.617485: step: 932/466, loss: 2.4950668811798096 ================================================== Loss: 1.902 
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.37565699891067533, 'r': 0.17929084038918597, 'f1': 0.24273221468074407}, 'combined': 0.17885531608054825, 'epoch': 1}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3901108312798634, 'r': 0.19371252121211807, 'f1': 0.25887745790509625}, 'combined': 0.17169074928420888, 'epoch': 1}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.37588421382022236, 'r': 0.14637457097725104, 'f1': 0.21069972257677771}, 'combined': 0.1404664817178518, 'epoch': 1}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3959540617056072, 'r': 0.17991716401080948, 'f1': 0.24741271547995342}, 'combined': 0.161469351155338, 'epoch': 1}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3899780433006536, 'r': 0.17726274695484254, 'f1': 0.2437362770629085}, 'combined': 0.17959515152003785, 'epoch': 1}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.38294116650838345, 'r': 0.17927710376984562, 'f1': 0.24422039223981315}, 'combined': 0.16197000107096415, 'epoch': 1}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36363636363636365, 'r': 0.11428571428571428, 'f1': 0.17391304347826086}, 'combined': 0.11594202898550723, 'epoch': 1}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.875, 'r': 0.07608695652173914, 'f1': 0.14}, 'combined': 0.09333333333333334, 'epoch': 1}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.034482758620689655, 'f1': 0.0625}, 'combined': 0.041666666666666664, 'epoch': 1}
New best chinese model...
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.37565699891067533, 'r': 0.17929084038918597, 'f1': 0.24273221468074407}, 'combined': 0.17885531608054825, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3901108312798634, 'r': 0.19371252121211807, 'f1': 0.25887745790509625}, 'combined': 0.17169074928420888, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36363636363636365, 'r': 0.11428571428571428, 'f1': 0.17391304347826086}, 'combined': 0.11594202898550723, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.37588421382022236, 'r': 0.14637457097725104, 'f1': 0.21069972257677771}, 'combined': 0.1404664817178518, 'epoch': 1}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3959540617056072, 'r': 0.17991716401080948, 'f1': 0.24741271547995342}, 'combined': 0.161469351155338, 'epoch': 1}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.875, 'r': 0.07608695652173914, 'f1': 0.14}, 'combined': 0.09333333333333334, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26020516074450084, 'r': 0.09690059861373661, 'f1': 0.14121326905417814}, 'combined': 0.10405188246097336, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.32320065430752454, 'r': 0.036373956799214534, 'f1': 0.06538885824600112}, 'combined': 0.04336670391444633, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot':
{'p': 0.6, 'r': 0.10344827586206896, 'f1': 0.17647058823529413}, 'combined': 0.11764705882352941, 'epoch': 0} ****************************** Epoch: 2 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 00:39:05.029480: step: 2/466, loss: 0.3859618306159973 2023-01-24 00:39:05.634970: step: 4/466, loss: 0.5589725971221924 2023-01-24 00:39:06.280520: step: 6/466, loss: 1.2912181615829468 2023-01-24 00:39:06.991966: step: 8/466, loss: 1.1314879655838013 2023-01-24 00:39:07.577491: step: 10/466, loss: 1.917891263961792 2023-01-24 00:39:08.176534: step: 12/466, loss: 1.408381700515747 2023-01-24 00:39:08.744356: step: 14/466, loss: 0.5122565031051636 2023-01-24 00:39:09.325639: step: 16/466, loss: 0.4245198369026184 2023-01-24 00:39:09.940779: step: 18/466, loss: 1.6285946369171143 2023-01-24 00:39:10.611145: step: 20/466, loss: 0.49963173270225525 2023-01-24 00:39:11.199492: step: 22/466, loss: 0.4725305736064911 2023-01-24 00:39:11.907458: step: 24/466, loss: 0.6637438535690308 2023-01-24 00:39:12.565957: step: 26/466, loss: 0.7057062983512878 2023-01-24 00:39:13.252206: step: 28/466, loss: 1.5215015411376953 2023-01-24 00:39:13.894940: step: 30/466, loss: 0.8151401877403259 2023-01-24 00:39:14.509214: step: 32/466, loss: 0.8009874820709229 2023-01-24 00:39:15.164766: step: 34/466, loss: 1.2140456438064575 2023-01-24 00:39:15.812028: step: 36/466, loss: 3.2386741638183594 2023-01-24 00:39:16.426927: step: 38/466, loss: 0.6042141318321228 2023-01-24 00:39:17.017617: step: 40/466, loss: 1.747312307357788 2023-01-24 00:39:17.606253: step: 42/466, loss: 1.4219690561294556 2023-01-24 00:39:18.173016: step: 44/466, loss: 1.022874355316162 2023-01-24 00:39:18.820802: step: 46/466, loss: 2.0148932933807373 2023-01-24 00:39:19.430381: step: 48/466, loss: 1.0092506408691406 2023-01-24 
00:39:19.963353: step: 50/466, loss: 0.8231704831123352 2023-01-24 00:39:20.598109: step: 52/466, loss: 0.3929067552089691 2023-01-24 00:39:21.277303: step: 54/466, loss: 2.7094404697418213 2023-01-24 00:39:21.891137: step: 56/466, loss: 0.7950037121772766 2023-01-24 00:39:22.581520: step: 58/466, loss: 1.2933663129806519 2023-01-24 00:39:23.226765: step: 60/466, loss: 1.3285505771636963 2023-01-24 00:39:23.827555: step: 62/466, loss: 1.7081013917922974 2023-01-24 00:39:24.403485: step: 64/466, loss: 4.110572338104248 2023-01-24 00:39:25.028179: step: 66/466, loss: 0.8001346588134766 2023-01-24 00:39:25.650646: step: 68/466, loss: 0.3810636103153229 2023-01-24 00:39:26.305080: step: 70/466, loss: 0.4169490933418274 2023-01-24 00:39:26.981267: step: 72/466, loss: 1.4064998626708984 2023-01-24 00:39:27.590990: step: 74/466, loss: 1.4400849342346191 2023-01-24 00:39:28.196778: step: 76/466, loss: 3.868173837661743 2023-01-24 00:39:28.800735: step: 78/466, loss: 0.9741390943527222 2023-01-24 00:39:29.445475: step: 80/466, loss: 0.9136323928833008 2023-01-24 00:39:30.057598: step: 82/466, loss: 1.366288661956787 2023-01-24 00:39:30.750689: step: 84/466, loss: 2.6454687118530273 2023-01-24 00:39:31.383832: step: 86/466, loss: 0.6054922342300415 2023-01-24 00:39:32.073723: step: 88/466, loss: 2.3126707077026367 2023-01-24 00:39:32.745122: step: 90/466, loss: 3.112882137298584 2023-01-24 00:39:33.365035: step: 92/466, loss: 1.7007570266723633 2023-01-24 00:39:34.012790: step: 94/466, loss: 7.418228626251221 2023-01-24 00:39:34.576316: step: 96/466, loss: 2.902988910675049 2023-01-24 00:39:35.130368: step: 98/466, loss: 0.9322847127914429 2023-01-24 00:39:35.751987: step: 100/466, loss: 1.4995085000991821 2023-01-24 00:39:36.347802: step: 102/466, loss: 2.071957588195801 2023-01-24 00:39:36.896572: step: 104/466, loss: 0.431795597076416 2023-01-24 00:39:37.543673: step: 106/466, loss: 1.375353217124939 2023-01-24 00:39:38.164121: step: 108/466, loss: 0.3198004364967346 
2023-01-24 00:39:38.765376: step: 110/466, loss: 0.9918296933174133 2023-01-24 00:39:39.338909: step: 112/466, loss: 0.3641217052936554 2023-01-24 00:39:39.921225: step: 114/466, loss: 1.6604777574539185 2023-01-24 00:39:40.549281: step: 116/466, loss: 0.3456338047981262 2023-01-24 00:39:41.231839: step: 118/466, loss: 7.719141483306885 2023-01-24 00:39:41.799044: step: 120/466, loss: 0.28431424498558044 2023-01-24 00:39:42.433334: step: 122/466, loss: 0.3917914628982544 2023-01-24 00:39:42.972069: step: 124/466, loss: 1.2277631759643555 2023-01-24 00:39:43.564192: step: 126/466, loss: 1.6502057313919067 2023-01-24 00:39:44.201069: step: 128/466, loss: 0.8259530067443848 2023-01-24 00:39:44.841237: step: 130/466, loss: 1.7057901620864868 2023-01-24 00:39:45.488313: step: 132/466, loss: 0.5677308440208435 2023-01-24 00:39:46.201364: step: 134/466, loss: 1.5732409954071045 2023-01-24 00:39:46.850116: step: 136/466, loss: 5.119172096252441 2023-01-24 00:39:47.406881: step: 138/466, loss: 3.7913308143615723 2023-01-24 00:39:48.089634: step: 140/466, loss: 0.9256842136383057 2023-01-24 00:39:48.724116: step: 142/466, loss: 0.32183602452278137 2023-01-24 00:39:49.368859: step: 144/466, loss: 1.5022695064544678 2023-01-24 00:39:49.970481: step: 146/466, loss: 0.835024893283844 2023-01-24 00:39:50.609018: step: 148/466, loss: 2.1166932582855225 2023-01-24 00:39:51.269317: step: 150/466, loss: 1.9419481754302979 2023-01-24 00:39:51.910649: step: 152/466, loss: 0.6141785979270935 2023-01-24 00:39:52.525781: step: 154/466, loss: 0.9153237342834473 2023-01-24 00:39:53.149208: step: 156/466, loss: 1.6009995937347412 2023-01-24 00:39:53.743449: step: 158/466, loss: 0.41681626439094543 2023-01-24 00:39:54.295853: step: 160/466, loss: 0.34275585412979126 2023-01-24 00:39:54.971256: step: 162/466, loss: 1.5234066247940063 2023-01-24 00:39:55.594299: step: 164/466, loss: 0.6240906715393066 2023-01-24 00:39:56.250229: step: 166/466, loss: 0.7960733771324158 2023-01-24 
00:39:56.890836: step: 168/466, loss: 0.31994709372520447 2023-01-24 00:39:57.585357: step: 170/466, loss: 0.8821254968643188 2023-01-24 00:39:58.251643: step: 172/466, loss: 1.0889562368392944 2023-01-24 00:39:58.888835: step: 174/466, loss: 7.21901273727417 2023-01-24 00:39:59.519864: step: 176/466, loss: 1.2003504037857056 2023-01-24 00:40:00.141438: step: 178/466, loss: 0.48017793893814087 2023-01-24 00:40:00.766854: step: 180/466, loss: 1.6653739213943481 2023-01-24 00:40:01.422987: step: 182/466, loss: 0.6655115485191345 2023-01-24 00:40:02.135309: step: 184/466, loss: 0.6897805333137512 2023-01-24 00:40:02.760799: step: 186/466, loss: 0.3107963502407074 2023-01-24 00:40:03.469810: step: 188/466, loss: 0.5084014534950256 2023-01-24 00:40:04.086779: step: 190/466, loss: 5.032703876495361 2023-01-24 00:40:04.849453: step: 192/466, loss: 1.2486380338668823 2023-01-24 00:40:05.449923: step: 194/466, loss: 0.20551693439483643 2023-01-24 00:40:06.030042: step: 196/466, loss: 1.0858979225158691 2023-01-24 00:40:06.759961: step: 198/466, loss: 0.31415680050849915 2023-01-24 00:40:07.384695: step: 200/466, loss: 1.15254807472229 2023-01-24 00:40:07.964659: step: 202/466, loss: 2.475886344909668 2023-01-24 00:40:08.592646: step: 204/466, loss: 1.0149493217468262 2023-01-24 00:40:09.232650: step: 206/466, loss: 1.486482858657837 2023-01-24 00:40:09.889682: step: 208/466, loss: 0.2678473889827728 2023-01-24 00:40:10.460477: step: 210/466, loss: 0.3779557943344116 2023-01-24 00:40:11.105718: step: 212/466, loss: 1.080670952796936 2023-01-24 00:40:11.744870: step: 214/466, loss: 0.9954599142074585 2023-01-24 00:40:12.350139: step: 216/466, loss: 4.8555684089660645 2023-01-24 00:40:12.914816: step: 218/466, loss: 1.719361424446106 2023-01-24 00:40:13.526559: step: 220/466, loss: 0.8947339653968811 2023-01-24 00:40:14.162225: step: 222/466, loss: 1.861836314201355 2023-01-24 00:40:14.876440: step: 224/466, loss: 1.4084279537200928 2023-01-24 00:40:15.497789: step: 226/466, 
loss: 1.3126344680786133 2023-01-24 00:40:16.164553: step: 228/466, loss: 0.772314190864563 2023-01-24 00:40:16.802800: step: 230/466, loss: 0.6389649510383606 2023-01-24 00:40:17.382738: step: 232/466, loss: 0.7215180397033691 2023-01-24 00:40:18.053487: step: 234/466, loss: 0.9780803322792053 2023-01-24 00:40:18.627124: step: 236/466, loss: 6.474110126495361 2023-01-24 00:40:19.260418: step: 238/466, loss: 0.9038907885551453 2023-01-24 00:40:19.947454: step: 240/466, loss: 0.8878388404846191 2023-01-24 00:40:20.573479: step: 242/466, loss: 0.41521763801574707 2023-01-24 00:40:21.191647: step: 244/466, loss: 1.3140779733657837 2023-01-24 00:40:21.863872: step: 246/466, loss: 0.25495314598083496 2023-01-24 00:40:22.488606: step: 248/466, loss: 0.9745254516601562 2023-01-24 00:40:23.158803: step: 250/466, loss: 0.3536645174026489 2023-01-24 00:40:23.839407: step: 252/466, loss: 0.2425452321767807 2023-01-24 00:40:24.453768: step: 254/466, loss: 3.70131516456604 2023-01-24 00:40:25.119379: step: 256/466, loss: 3.986081600189209 2023-01-24 00:40:25.722517: step: 258/466, loss: 1.9067820310592651 2023-01-24 00:40:26.354596: step: 260/466, loss: 0.39824968576431274 2023-01-24 00:40:26.951936: step: 262/466, loss: 0.8292238712310791 2023-01-24 00:40:27.517472: step: 264/466, loss: 2.207139492034912 2023-01-24 00:40:28.132686: step: 266/466, loss: 1.1207127571105957 2023-01-24 00:40:28.664745: step: 268/466, loss: 0.8023765087127686 2023-01-24 00:40:29.289850: step: 270/466, loss: 2.832197427749634 2023-01-24 00:40:29.884874: step: 272/466, loss: 1.586511492729187 2023-01-24 00:40:30.474484: step: 274/466, loss: 0.8086563348770142 2023-01-24 00:40:31.082156: step: 276/466, loss: 1.5293811559677124 2023-01-24 00:40:31.693797: step: 278/466, loss: 1.3649612665176392 2023-01-24 00:40:32.316078: step: 280/466, loss: 1.25004243850708 2023-01-24 00:40:33.015806: step: 282/466, loss: 0.8976415395736694 2023-01-24 00:40:33.694163: step: 284/466, loss: 1.1052522659301758 
2023-01-24 00:40:34.351585: step: 286/466, loss: 3.7007155418395996 2023-01-24 00:40:34.946366: step: 288/466, loss: 0.9899224042892456 2023-01-24 00:40:35.570971: step: 290/466, loss: 1.1725620031356812 2023-01-24 00:40:36.233645: step: 292/466, loss: 0.30295124650001526 2023-01-24 00:40:36.835838: step: 294/466, loss: 0.42656242847442627 2023-01-24 00:40:37.475476: step: 296/466, loss: 2.2831902503967285 2023-01-24 00:40:38.138897: step: 298/466, loss: 0.574013352394104 2023-01-24 00:40:38.754578: step: 300/466, loss: 5.317529678344727 2023-01-24 00:40:39.331453: step: 302/466, loss: 0.7478150129318237 2023-01-24 00:40:39.934490: step: 304/466, loss: 1.5402438640594482 2023-01-24 00:40:40.592316: step: 306/466, loss: 0.8089839220046997 2023-01-24 00:40:41.281619: step: 308/466, loss: 2.3785948753356934 2023-01-24 00:40:41.874063: step: 310/466, loss: 0.2538207173347473 2023-01-24 00:40:42.470792: step: 312/466, loss: 1.2879310846328735 2023-01-24 00:40:43.087401: step: 314/466, loss: 0.27034011483192444 2023-01-24 00:40:43.686492: step: 316/466, loss: 1.3283967971801758 2023-01-24 00:40:44.320234: step: 318/466, loss: 0.9468340873718262 2023-01-24 00:40:44.925985: step: 320/466, loss: 0.8612461686134338 2023-01-24 00:40:45.525256: step: 322/466, loss: 0.9248701333999634 2023-01-24 00:40:46.138634: step: 324/466, loss: 0.49967315793037415 2023-01-24 00:40:46.748546: step: 326/466, loss: 3.6828291416168213 2023-01-24 00:40:47.454282: step: 328/466, loss: 0.54094398021698 2023-01-24 00:40:48.039028: step: 330/466, loss: 0.9822638630867004 2023-01-24 00:40:48.613908: step: 332/466, loss: 0.935371458530426 2023-01-24 00:40:49.227417: step: 334/466, loss: 0.9043583869934082 2023-01-24 00:40:49.809035: step: 336/466, loss: 1.2238879203796387 2023-01-24 00:40:50.491800: step: 338/466, loss: 0.562913715839386 2023-01-24 00:40:51.109189: step: 340/466, loss: 1.2473548650741577 2023-01-24 00:40:51.748073: step: 342/466, loss: 2.6486573219299316 2023-01-24 00:40:52.391381: 
step: 344/466, loss: 1.0722802877426147 2023-01-24 00:40:53.132326: step: 346/466, loss: 1.7376291751861572 2023-01-24 00:40:53.801417: step: 348/466, loss: 2.0604095458984375 2023-01-24 00:40:54.467214: step: 350/466, loss: 1.029964566230774 2023-01-24 00:40:55.031996: step: 352/466, loss: 1.4632513523101807 2023-01-24 00:40:55.659965: step: 354/466, loss: 1.3040094375610352 2023-01-24 00:40:56.219291: step: 356/466, loss: 1.9913289546966553 2023-01-24 00:40:56.848597: step: 358/466, loss: 0.7091863751411438 2023-01-24 00:40:57.408581: step: 360/466, loss: 1.2750763893127441 2023-01-24 00:40:58.021433: step: 362/466, loss: 3.3938052654266357 2023-01-24 00:40:58.651878: step: 364/466, loss: 1.2354358434677124 2023-01-24 00:40:59.314522: step: 366/466, loss: 6.258620262145996 2023-01-24 00:40:59.885740: step: 368/466, loss: 0.35919690132141113 2023-01-24 00:41:00.526030: step: 370/466, loss: 1.1379178762435913 2023-01-24 00:41:01.149853: step: 372/466, loss: 0.9073055982589722 2023-01-24 00:41:01.749235: step: 374/466, loss: 1.434647798538208 2023-01-24 00:41:02.364870: step: 376/466, loss: 0.6871436238288879 2023-01-24 00:41:02.933604: step: 378/466, loss: 0.7046130299568176 2023-01-24 00:41:03.529379: step: 380/466, loss: 3.336244583129883 2023-01-24 00:41:04.158394: step: 382/466, loss: 0.2550167739391327 2023-01-24 00:41:04.836360: step: 384/466, loss: 0.6727685332298279 2023-01-24 00:41:05.461156: step: 386/466, loss: 2.2739858627319336 2023-01-24 00:41:06.043408: step: 388/466, loss: 5.542956352233887 2023-01-24 00:41:06.696390: step: 390/466, loss: 0.6057764291763306 2023-01-24 00:41:07.428164: step: 392/466, loss: 1.7758235931396484 2023-01-24 00:41:08.028595: step: 394/466, loss: 0.5540112257003784 2023-01-24 00:41:08.699875: step: 396/466, loss: 0.48421353101730347 2023-01-24 00:41:09.309676: step: 398/466, loss: 3.0981218814849854 2023-01-24 00:41:09.886071: step: 400/466, loss: 2.1436336040496826 2023-01-24 00:41:10.523590: step: 402/466, loss: 
0.9988797903060913 2023-01-24 00:41:11.158614: step: 404/466, loss: 1.479414463043213 2023-01-24 00:41:11.776746: step: 406/466, loss: 1.2433871030807495 2023-01-24 00:41:12.448027: step: 408/466, loss: 0.9100664854049683 2023-01-24 00:41:13.093971: step: 410/466, loss: 4.007172584533691 2023-01-24 00:41:13.686989: step: 412/466, loss: 0.5805957317352295 2023-01-24 00:41:14.293602: step: 414/466, loss: 0.994168221950531 2023-01-24 00:41:14.998882: step: 416/466, loss: 1.7388842105865479 2023-01-24 00:41:15.632754: step: 418/466, loss: 1.2173644304275513 2023-01-24 00:41:16.260978: step: 420/466, loss: 7.888503074645996 2023-01-24 00:41:16.920364: step: 422/466, loss: 0.9034971594810486 2023-01-24 00:41:17.541125: step: 424/466, loss: 12.167634963989258 2023-01-24 00:41:18.108277: step: 426/466, loss: 1.0801763534545898 2023-01-24 00:41:18.713567: step: 428/466, loss: 2.7184035778045654 2023-01-24 00:41:19.342949: step: 430/466, loss: 0.380400151014328 2023-01-24 00:41:19.891168: step: 432/466, loss: 0.9667631983757019 2023-01-24 00:41:20.545284: step: 434/466, loss: 1.403149127960205 2023-01-24 00:41:21.163741: step: 436/466, loss: 2.0224409103393555 2023-01-24 00:41:21.920205: step: 438/466, loss: 1.8769458532333374 2023-01-24 00:41:22.611916: step: 440/466, loss: 4.415177822113037 2023-01-24 00:41:23.307381: step: 442/466, loss: 2.260067939758301 2023-01-24 00:41:23.888639: step: 444/466, loss: 3.9543333053588867 2023-01-24 00:41:24.541451: step: 446/466, loss: 2.0179100036621094 2023-01-24 00:41:25.194938: step: 448/466, loss: 1.4100654125213623 2023-01-24 00:41:25.808963: step: 450/466, loss: 5.165609836578369 2023-01-24 00:41:26.454311: step: 452/466, loss: 1.6865736246109009 2023-01-24 00:41:27.123786: step: 454/466, loss: 1.8559184074401855 2023-01-24 00:41:27.727591: step: 456/466, loss: 1.371927261352539 2023-01-24 00:41:28.379065: step: 458/466, loss: 0.27529627084732056 2023-01-24 00:41:28.990792: step: 460/466, loss: 1.0816210508346558 2023-01-24 
00:41:29.604692: step: 462/466, loss: 1.4851192235946655 2023-01-24 00:41:30.184790: step: 464/466, loss: 1.0001541376113892 2023-01-24 00:41:30.852867: step: 466/466, loss: 0.3777729868888855 2023-01-24 00:41:31.438550: step: 468/466, loss: 0.24453957378864288 2023-01-24 00:41:32.032567: step: 470/466, loss: 0.7623027563095093 2023-01-24 00:41:32.685426: step: 472/466, loss: 0.84808748960495 2023-01-24 00:41:33.257484: step: 474/466, loss: 2.5589609146118164 2023-01-24 00:41:33.813409: step: 476/466, loss: 0.33705270290374756 2023-01-24 00:41:34.454683: step: 478/466, loss: 15.094965934753418 2023-01-24 00:41:35.053209: step: 480/466, loss: 0.3791036307811737 2023-01-24 00:41:35.699253: step: 482/466, loss: 0.7764727473258972 2023-01-24 00:41:36.265872: step: 484/466, loss: 0.8260162472724915 2023-01-24 00:41:36.870287: step: 486/466, loss: 0.6989418864250183 2023-01-24 00:41:37.491397: step: 488/466, loss: 0.3180275857448578 2023-01-24 00:41:38.163954: step: 490/466, loss: 0.5473271012306213 2023-01-24 00:41:38.846480: step: 492/466, loss: 0.29150083661079407 2023-01-24 00:41:39.416785: step: 494/466, loss: 1.8655388355255127 2023-01-24 00:41:40.196755: step: 496/466, loss: 0.3491297662258148 2023-01-24 00:41:40.903689: step: 498/466, loss: 1.1241211891174316 2023-01-24 00:41:41.538473: step: 500/466, loss: 0.9345641732215881 2023-01-24 00:41:42.217494: step: 502/466, loss: 1.3086748123168945 2023-01-24 00:41:42.798937: step: 504/466, loss: 1.6536461114883423 2023-01-24 00:41:43.464627: step: 506/466, loss: 0.5924420356750488 2023-01-24 00:41:44.075208: step: 508/466, loss: 0.4484773278236389 2023-01-24 00:41:44.738858: step: 510/466, loss: 5.096407890319824 2023-01-24 00:41:45.369936: step: 512/466, loss: 3.5445706844329834 2023-01-24 00:41:46.038224: step: 514/466, loss: 0.21641923487186432 2023-01-24 00:41:46.733005: step: 516/466, loss: 4.486607551574707 2023-01-24 00:41:47.388929: step: 518/466, loss: 3.2323899269104004 2023-01-24 00:41:48.032211: step: 
520/466, loss: 1.0494511127471924 2023-01-24 00:41:48.684721: step: 522/466, loss: 0.6894415020942688 2023-01-24 00:41:49.287734: step: 524/466, loss: 1.5049724578857422 2023-01-24 00:41:49.867202: step: 526/466, loss: 2.985200881958008 2023-01-24 00:41:50.462164: step: 528/466, loss: 2.1904640197753906 2023-01-24 00:41:51.036033: step: 530/466, loss: 4.436951637268066 2023-01-24 00:41:51.737894: step: 532/466, loss: 1.0465539693832397 2023-01-24 00:41:52.431739: step: 534/466, loss: 1.1398470401763916 2023-01-24 00:41:53.014800: step: 536/466, loss: 0.6183001399040222 2023-01-24 00:41:53.597751: step: 538/466, loss: 2.022190809249878 2023-01-24 00:41:54.256430: step: 540/466, loss: 0.39715272188186646 2023-01-24 00:41:54.859502: step: 542/466, loss: 0.23415739834308624 2023-01-24 00:41:55.523559: step: 544/466, loss: 1.0342296361923218 2023-01-24 00:41:56.168213: step: 546/466, loss: 1.5827267169952393 2023-01-24 00:41:56.892425: step: 548/466, loss: 2.6225316524505615 2023-01-24 00:41:57.552665: step: 550/466, loss: 0.9604104161262512 2023-01-24 00:41:58.135081: step: 552/466, loss: 0.8755918145179749 2023-01-24 00:41:58.870852: step: 554/466, loss: 1.1196208000183105 2023-01-24 00:41:59.444570: step: 556/466, loss: 2.395808219909668 2023-01-24 00:42:00.057286: step: 558/466, loss: 0.9987582564353943 2023-01-24 00:42:00.716751: step: 560/466, loss: 1.783170461654663 2023-01-24 00:42:01.273531: step: 562/466, loss: 1.1715949773788452 2023-01-24 00:42:01.869884: step: 564/466, loss: 0.42941659688949585 2023-01-24 00:42:02.509150: step: 566/466, loss: 0.5479053258895874 2023-01-24 00:42:03.180393: step: 568/466, loss: 0.7223318815231323 2023-01-24 00:42:03.784569: step: 570/466, loss: 1.9938143491744995 2023-01-24 00:42:04.327583: step: 572/466, loss: 0.6955587267875671 2023-01-24 00:42:04.962238: step: 574/466, loss: 0.6885424852371216 2023-01-24 00:42:05.545354: step: 576/466, loss: 0.5971453189849854 2023-01-24 00:42:06.162235: step: 578/466, loss: 
2.068782329559326 2023-01-24 00:42:06.759474: step: 580/466, loss: 1.2053582668304443 2023-01-24 00:42:07.421652: step: 582/466, loss: 0.5845661759376526 2023-01-24 00:42:08.031919: step: 584/466, loss: 0.4609079658985138 2023-01-24 00:42:08.707431: step: 586/466, loss: 1.3553452491760254 2023-01-24 00:42:09.344448: step: 588/466, loss: 7.491386890411377 2023-01-24 00:42:09.920850: step: 590/466, loss: 0.31566736102104187 2023-01-24 00:42:10.465755: step: 592/466, loss: 1.7004241943359375 2023-01-24 00:42:11.143084: step: 594/466, loss: 3.041285514831543 2023-01-24 00:42:11.760875: step: 596/466, loss: 2.2295432090759277 2023-01-24 00:42:12.433158: step: 598/466, loss: 1.2003366947174072 2023-01-24 00:42:13.054054: step: 600/466, loss: 1.229418396949768 2023-01-24 00:42:13.683575: step: 602/466, loss: 1.0585122108459473 2023-01-24 00:42:14.292750: step: 604/466, loss: 1.5848824977874756 2023-01-24 00:42:15.013438: step: 606/466, loss: 0.33494770526885986 2023-01-24 00:42:15.637764: step: 608/466, loss: 1.017104148864746 2023-01-24 00:42:16.243025: step: 610/466, loss: 1.0163648128509521 2023-01-24 00:42:17.024209: step: 612/466, loss: 4.332009315490723 2023-01-24 00:42:17.624144: step: 614/466, loss: 1.3501293659210205 2023-01-24 00:42:18.250764: step: 616/466, loss: 1.3002701997756958 2023-01-24 00:42:18.980163: step: 618/466, loss: 1.1786446571350098 2023-01-24 00:42:19.660456: step: 620/466, loss: 5.574863910675049 2023-01-24 00:42:20.311124: step: 622/466, loss: 8.472260475158691 2023-01-24 00:42:20.951289: step: 624/466, loss: 1.0419970750808716 2023-01-24 00:42:21.560827: step: 626/466, loss: 0.926811158657074 2023-01-24 00:42:22.213240: step: 628/466, loss: 1.934928059577942 2023-01-24 00:42:22.858584: step: 630/466, loss: 1.2481105327606201 2023-01-24 00:42:23.513575: step: 632/466, loss: 0.6394846439361572 2023-01-24 00:42:24.118815: step: 634/466, loss: 0.8781318068504333 2023-01-24 00:42:24.711925: step: 636/466, loss: 0.4937204122543335 2023-01-24 
00:42:25.352286: step: 638/466, loss: 0.72397381067276 2023-01-24 00:42:26.010750: step: 640/466, loss: 1.065069317817688 2023-01-24 00:42:26.689147: step: 642/466, loss: 0.7574830055236816 2023-01-24 00:42:27.353127: step: 644/466, loss: 1.0235209465026855 2023-01-24 00:42:27.963927: step: 646/466, loss: 1.9580715894699097 2023-01-24 00:42:28.586457: step: 648/466, loss: 0.3159792721271515 2023-01-24 00:42:29.217393: step: 650/466, loss: 0.7083503603935242 2023-01-24 00:42:29.807855: step: 652/466, loss: 0.3604220151901245 2023-01-24 00:42:30.457540: step: 654/466, loss: 0.5020242929458618 2023-01-24 00:42:31.114371: step: 656/466, loss: 1.9850904941558838 2023-01-24 00:42:31.774249: step: 658/466, loss: 2.489419937133789 2023-01-24 00:42:32.429778: step: 660/466, loss: 2.914874315261841 2023-01-24 00:42:33.066764: step: 662/466, loss: 2.6481523513793945 2023-01-24 00:42:33.717171: step: 664/466, loss: 3.4673984050750732 2023-01-24 00:42:34.336618: step: 666/466, loss: 0.20957191288471222 2023-01-24 00:42:35.023151: step: 668/466, loss: 0.39392566680908203 2023-01-24 00:42:35.603704: step: 670/466, loss: 2.5256829261779785 2023-01-24 00:42:36.221170: step: 672/466, loss: 0.48323601484298706 2023-01-24 00:42:36.890493: step: 674/466, loss: 0.2096550613641739 2023-01-24 00:42:37.436540: step: 676/466, loss: 1.0257917642593384 2023-01-24 00:42:37.996373: step: 678/466, loss: 2.546658515930176 2023-01-24 00:42:38.630399: step: 680/466, loss: 0.5435642004013062 2023-01-24 00:42:39.307882: step: 682/466, loss: 1.7187285423278809 2023-01-24 00:42:39.965093: step: 684/466, loss: 0.7300114631652832 2023-01-24 00:42:40.600998: step: 686/466, loss: 0.38423314690589905 2023-01-24 00:42:41.210161: step: 688/466, loss: 1.514926552772522 2023-01-24 00:42:41.841543: step: 690/466, loss: 2.6025986671447754 2023-01-24 00:42:42.471164: step: 692/466, loss: 2.7757177352905273 2023-01-24 00:42:43.092601: step: 694/466, loss: 4.4856367111206055 2023-01-24 00:42:43.730342: step: 
696/466, loss: 1.9674458503723145 2023-01-24 00:42:44.379252: step: 698/466, loss: 3.832584857940674 2023-01-24 00:42:44.990136: step: 700/466, loss: 1.8476121425628662 2023-01-24 00:42:45.653462: step: 702/466, loss: 1.2803583145141602 2023-01-24 00:42:46.288628: step: 704/466, loss: 2.006136894226074 2023-01-24 00:42:46.916788: step: 706/466, loss: 2.4802322387695312 2023-01-24 00:42:47.556737: step: 708/466, loss: 1.1326019763946533 2023-01-24 00:42:48.165257: step: 710/466, loss: 1.0300644636154175 2023-01-24 00:42:48.784367: step: 712/466, loss: 2.0992324352264404 2023-01-24 00:42:49.395671: step: 714/466, loss: 1.4960116147994995 2023-01-24 00:42:50.018078: step: 716/466, loss: 0.3980748653411865 2023-01-24 00:42:50.695632: step: 718/466, loss: 0.8751800060272217 2023-01-24 00:42:51.340381: step: 720/466, loss: 0.7033678293228149 2023-01-24 00:42:51.982733: step: 722/466, loss: 0.7334343791007996 2023-01-24 00:42:52.628415: step: 724/466, loss: 1.1421419382095337 2023-01-24 00:42:53.311599: step: 726/466, loss: 5.31184196472168 2023-01-24 00:42:53.918647: step: 728/466, loss: 4.327996730804443 2023-01-24 00:42:54.510099: step: 730/466, loss: 0.9183502793312073 2023-01-24 00:42:55.195586: step: 732/466, loss: 1.9118095636367798 2023-01-24 00:42:55.818843: step: 734/466, loss: 0.35955610871315 2023-01-24 00:42:56.402919: step: 736/466, loss: 0.5458019375801086 2023-01-24 00:42:57.091985: step: 738/466, loss: 1.2042368650436401 2023-01-24 00:42:57.739818: step: 740/466, loss: 0.546623706817627 2023-01-24 00:42:58.377458: step: 742/466, loss: 4.107133865356445 2023-01-24 00:42:59.015265: step: 744/466, loss: 1.4979407787322998 2023-01-24 00:42:59.639191: step: 746/466, loss: 1.1551812887191772 2023-01-24 00:43:00.231808: step: 748/466, loss: 0.409785658121109 2023-01-24 00:43:00.839017: step: 750/466, loss: 0.38589829206466675 2023-01-24 00:43:01.581826: step: 752/466, loss: 0.6685751080513 2023-01-24 00:43:02.191949: step: 754/466, loss: 0.9630231857299805 
2023-01-24 00:43:02.798590: step: 756/466, loss: 1.2398910522460938 2023-01-24 00:43:03.411843: step: 758/466, loss: 1.371281623840332 2023-01-24 00:43:04.069966: step: 760/466, loss: 0.8932585716247559 2023-01-24 00:43:04.667553: step: 762/466, loss: 1.0760459899902344 2023-01-24 00:43:05.256283: step: 764/466, loss: 0.772979199886322 2023-01-24 00:43:05.851574: step: 766/466, loss: 0.7807055711746216 2023-01-24 00:43:06.452583: step: 768/466, loss: 3.313267946243286 2023-01-24 00:43:07.073589: step: 770/466, loss: 7.387088298797607 2023-01-24 00:43:07.649062: step: 772/466, loss: 0.481009304523468 2023-01-24 00:43:08.262085: step: 774/466, loss: 1.8092561960220337 2023-01-24 00:43:08.822958: step: 776/466, loss: 0.7750083208084106 2023-01-24 00:43:09.430410: step: 778/466, loss: 0.7640812993049622 2023-01-24 00:43:10.109197: step: 780/466, loss: 0.771941602230072 2023-01-24 00:43:10.768971: step: 782/466, loss: 4.130035400390625 2023-01-24 00:43:11.384738: step: 784/466, loss: 1.6280627250671387 2023-01-24 00:43:12.044607: step: 786/466, loss: 3.1433563232421875 2023-01-24 00:43:12.607080: step: 788/466, loss: 1.0050230026245117 2023-01-24 00:43:13.215497: step: 790/466, loss: 2.5180046558380127 2023-01-24 00:43:13.893724: step: 792/466, loss: 1.341071367263794 2023-01-24 00:43:14.524893: step: 794/466, loss: 0.5717883706092834 2023-01-24 00:43:15.080633: step: 796/466, loss: 0.6130303740501404 2023-01-24 00:43:15.727012: step: 798/466, loss: 1.025365948677063 2023-01-24 00:43:16.293639: step: 800/466, loss: 0.3749749958515167 2023-01-24 00:43:16.980303: step: 802/466, loss: 2.4972636699676514 2023-01-24 00:43:17.545360: step: 804/466, loss: 0.24893394112586975 2023-01-24 00:43:18.043232: step: 806/466, loss: 0.9291529059410095 2023-01-24 00:43:18.669385: step: 808/466, loss: 0.9736714363098145 2023-01-24 00:43:19.329806: step: 810/466, loss: 0.6771429777145386 2023-01-24 00:43:19.934860: step: 812/466, loss: 0.9184495806694031 2023-01-24 00:43:20.571628: step: 
814/466, loss: 0.399580717086792 2023-01-24 00:43:21.180375: step: 816/466, loss: 0.5318378806114197 2023-01-24 00:43:21.814303: step: 818/466, loss: 2.599856376647949 2023-01-24 00:43:22.419406: step: 820/466, loss: 1.9299108982086182 2023-01-24 00:43:23.108942: step: 822/466, loss: 1.3706055879592896 2023-01-24 00:43:23.731452: step: 824/466, loss: 1.642850399017334 2023-01-24 00:43:24.333640: step: 826/466, loss: 0.41598743200302124 2023-01-24 00:43:24.972136: step: 828/466, loss: 0.9551495313644409 2023-01-24 00:43:25.540747: step: 830/466, loss: 0.5807445645332336 2023-01-24 00:43:26.162989: step: 832/466, loss: 0.9744365215301514 2023-01-24 00:43:26.770816: step: 834/466, loss: 1.759616494178772 2023-01-24 00:43:27.371442: step: 836/466, loss: 2.0120441913604736 2023-01-24 00:43:28.004957: step: 838/466, loss: 0.6406006217002869 2023-01-24 00:43:28.627192: step: 840/466, loss: 2.323092222213745 2023-01-24 00:43:29.269503: step: 842/466, loss: 0.7588452100753784 2023-01-24 00:43:29.888809: step: 844/466, loss: 0.23752079904079437 2023-01-24 00:43:30.563835: step: 846/466, loss: 2.2680630683898926 2023-01-24 00:43:31.171844: step: 848/466, loss: 0.8969069123268127 2023-01-24 00:43:31.828166: step: 850/466, loss: 1.1765252351760864 2023-01-24 00:43:32.503857: step: 852/466, loss: 0.780741274356842 2023-01-24 00:43:33.168628: step: 854/466, loss: 0.6287121176719666 2023-01-24 00:43:33.799967: step: 856/466, loss: 0.25774434208869934 2023-01-24 00:43:34.434806: step: 858/466, loss: 0.20518824458122253 2023-01-24 00:43:35.013780: step: 860/466, loss: 12.963945388793945 2023-01-24 00:43:35.559436: step: 862/466, loss: 0.8451552391052246 2023-01-24 00:43:36.184509: step: 864/466, loss: 1.062126636505127 2023-01-24 00:43:36.856246: step: 866/466, loss: 0.7250032424926758 2023-01-24 00:43:37.465025: step: 868/466, loss: 3.0983104705810547 2023-01-24 00:43:38.122482: step: 870/466, loss: 1.5424200296401978 2023-01-24 00:43:38.733395: step: 872/466, loss: 
1.0159175395965576 2023-01-24 00:43:39.269993: step: 874/466, loss: 2.1191201210021973 2023-01-24 00:43:39.889148: step: 876/466, loss: 2.0893123149871826 2023-01-24 00:43:40.488652: step: 878/466, loss: 1.5631012916564941 2023-01-24 00:43:41.145810: step: 880/466, loss: 1.3018872737884521 2023-01-24 00:43:41.768799: step: 882/466, loss: 0.6196349263191223 2023-01-24 00:43:42.406492: step: 884/466, loss: 0.6660146117210388 2023-01-24 00:43:43.019667: step: 886/466, loss: 1.7531086206436157 2023-01-24 00:43:43.754935: step: 888/466, loss: 0.9928648471832275 2023-01-24 00:43:44.371393: step: 890/466, loss: 0.5851998329162598 2023-01-24 00:43:44.999195: step: 892/466, loss: 1.0412944555282593 2023-01-24 00:43:45.649953: step: 894/466, loss: 0.5938190221786499 2023-01-24 00:43:46.233390: step: 896/466, loss: 1.4201117753982544 2023-01-24 00:43:46.834837: step: 898/466, loss: 0.5297113656997681 2023-01-24 00:43:47.422391: step: 900/466, loss: 2.7390661239624023 2023-01-24 00:43:48.043594: step: 902/466, loss: 3.369328498840332 2023-01-24 00:43:48.748768: step: 904/466, loss: 1.364393949508667 2023-01-24 00:43:49.467081: step: 906/466, loss: 0.4259275496006012 2023-01-24 00:43:50.058003: step: 908/466, loss: 2.5091474056243896 2023-01-24 00:43:50.699854: step: 910/466, loss: 1.1209821701049805 2023-01-24 00:43:51.340788: step: 912/466, loss: 0.6636772155761719 2023-01-24 00:43:51.953562: step: 914/466, loss: 0.638691246509552 2023-01-24 00:43:52.586001: step: 916/466, loss: 1.4952242374420166 2023-01-24 00:43:53.203127: step: 918/466, loss: 0.9142274856567383 2023-01-24 00:43:53.853813: step: 920/466, loss: 0.5067021250724792 2023-01-24 00:43:54.417033: step: 922/466, loss: 0.7559120059013367 2023-01-24 00:43:54.995704: step: 924/466, loss: 0.6585716009140015 2023-01-24 00:43:55.559440: step: 926/466, loss: 1.6754124164581299 2023-01-24 00:43:56.212191: step: 928/466, loss: 4.6818013191223145 2023-01-24 00:43:56.836774: step: 930/466, loss: 1.664534091949463 2023-01-24 
00:43:57.474112: step: 932/466, loss: 0.6084837913513184 ================================================== Loss: 1.577 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3525654761904762, 'r': 0.2337081755050505, 'f1': 0.281088648443432}, 'combined': 0.2071179514846341, 'epoch': 2} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3811719617405466, 'r': 0.17164268977253655, 'f1': 0.23669915621790796}, 'combined': 0.15698182381291303, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34554782082324453, 'r': 0.19269679989197946, 'f1': 0.24741894937586684}, 'combined': 0.1649459662505779, 'epoch': 2} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.40241683553303587, 'r': 0.1674999263617645, 'f1': 0.2365425789352723}, 'combined': 0.15437515677880928, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3479781856876829, 'r': 0.2359397546897547, 'f1': 0.28121036224873697}, 'combined': 0.2072076353411746, 'epoch': 2} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37637657701005184, 'r': 0.1730616585733341, 'f1': 0.23710164472391654}, 'combined': 0.15724875919513634, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3508771929824562, 'r': 0.1904761904761905, 'f1': 0.2469135802469136}, 'combined': 0.1646090534979424, 'epoch': 2} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4583333333333333, 'r': 0.2391304347826087, 'f1': 0.3142857142857143}, 'combined': 0.2095238095238095, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.16666666666666666, 'r': 0.034482758620689655, 'f1': 0.05714285714285715}, 'combined': 0.0380952380952381, 'epoch': 2} New best chinese model... New best korean model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3525654761904762, 'r': 0.2337081755050505, 'f1': 0.281088648443432}, 'combined': 0.2071179514846341, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3811719617405466, 'r': 0.17164268977253655, 'f1': 0.23669915621790796}, 'combined': 0.15698182381291303, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3508771929824562, 'r': 0.1904761904761905, 'f1': 0.2469135802469136}, 'combined': 0.1646090534979424, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34554782082324453, 'r': 0.19269679989197946, 'f1': 0.24741894937586684}, 'combined': 0.1649459662505779, 'epoch': 2} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.40241683553303587, 'r': 0.1674999263617645, 'f1': 0.2365425789352723}, 'combined': 0.15437515677880928, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4583333333333333, 'r': 0.2391304347826087, 'f1': 0.3142857142857143}, 'combined': 0.2095238095238095, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3479781856876829, 'r': 0.2359397546897547, 'f1': 0.28121036224873697}, 'combined': 0.2072076353411746, 'epoch': 2} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 
0.37637657701005184, 'r': 0.1730616585733341, 'f1': 0.23710164472391654}, 'combined': 0.15724875919513634, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.16666666666666666, 'r': 0.034482758620689655, 'f1': 0.05714285714285715}, 'combined': 0.0380952380952381, 'epoch': 2} ****************************** Epoch: 3 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 00:46:48.857142: step: 2/466, loss: 7.128997802734375 2023-01-24 00:46:49.559886: step: 4/466, loss: 1.2443701028823853 2023-01-24 00:46:50.190364: step: 6/466, loss: 1.0124177932739258 2023-01-24 00:46:50.826571: step: 8/466, loss: 0.5614454746246338 2023-01-24 00:46:51.420173: step: 10/466, loss: 3.858065128326416 2023-01-24 00:46:52.020124: step: 12/466, loss: 0.6932990550994873 2023-01-24 00:46:52.642550: step: 14/466, loss: 0.7984504699707031 2023-01-24 00:46:53.225269: step: 16/466, loss: 3.1545112133026123 2023-01-24 00:46:53.837030: step: 18/466, loss: 0.8147408962249756 2023-01-24 00:46:54.427510: step: 20/466, loss: 0.6810839176177979 2023-01-24 00:46:54.993394: step: 22/466, loss: 0.8591039180755615 2023-01-24 00:46:55.660454: step: 24/466, loss: 1.1684130430221558 2023-01-24 00:46:56.240437: step: 26/466, loss: 0.17121870815753937 2023-01-24 00:46:56.798704: step: 28/466, loss: 0.6259769201278687 2023-01-24 00:46:57.460545: step: 30/466, loss: 0.7976721525192261 2023-01-24 00:46:58.038470: step: 32/466, loss: 0.7649844288825989 2023-01-24 00:46:58.652907: step: 34/466, loss: 0.3217013478279114 2023-01-24 00:46:59.259058: step: 36/466, loss: 0.8182716965675354 2023-01-24 00:46:59.940840: step: 38/466, loss: 0.2067316174507141 2023-01-24 00:47:00.558343: step: 40/466, loss: 0.17226776480674744 2023-01-24 00:47:01.155104: step: 42/466, loss: 1.2925540208816528 2023-01-24 
00:47:01.731248: step: 44/466, loss: 0.3089175224304199 2023-01-24 00:47:02.461592: step: 46/466, loss: 0.3989468812942505 2023-01-24 00:47:03.134470: step: 48/466, loss: 1.590653419494629 2023-01-24 00:47:03.721497: step: 50/466, loss: 0.7529211640357971 2023-01-24 00:47:04.430877: step: 52/466, loss: 0.47044286131858826 2023-01-24 00:47:05.086558: step: 54/466, loss: 1.4606329202651978 2023-01-24 00:47:05.771180: step: 56/466, loss: 0.7644850015640259 2023-01-24 00:47:06.410042: step: 58/466, loss: 2.350726366043091 2023-01-24 00:47:07.037044: step: 60/466, loss: 0.7856782674789429 2023-01-24 00:47:07.658005: step: 62/466, loss: 3.875983953475952 2023-01-24 00:47:08.265439: step: 64/466, loss: 1.41089928150177 2023-01-24 00:47:08.860388: step: 66/466, loss: 0.7020902633666992 2023-01-24 00:47:09.464535: step: 68/466, loss: 0.514349102973938 2023-01-24 00:47:10.140796: step: 70/466, loss: 0.5861013531684875 2023-01-24 00:47:10.733772: step: 72/466, loss: 0.3562849760055542 2023-01-24 00:47:11.309809: step: 74/466, loss: 0.2749972641468048 2023-01-24 00:47:11.866852: step: 76/466, loss: 0.5348482131958008 2023-01-24 00:47:12.478989: step: 78/466, loss: 0.22524146735668182 2023-01-24 00:47:13.093800: step: 80/466, loss: 0.5875887274742126 2023-01-24 00:47:13.686133: step: 82/466, loss: 0.7782057523727417 2023-01-24 00:47:14.371852: step: 84/466, loss: 0.8846481442451477 2023-01-24 00:47:14.966824: step: 86/466, loss: 0.38065099716186523 2023-01-24 00:47:15.631120: step: 88/466, loss: 0.8411879539489746 2023-01-24 00:47:16.273694: step: 90/466, loss: 0.7535284757614136 2023-01-24 00:47:16.896879: step: 92/466, loss: 0.22157704830169678 2023-01-24 00:47:17.463232: step: 94/466, loss: 1.2213329076766968 2023-01-24 00:47:18.172537: step: 96/466, loss: 6.678746223449707 2023-01-24 00:47:18.867293: step: 98/466, loss: 0.27789440751075745 2023-01-24 00:47:19.555515: step: 100/466, loss: 0.40511220693588257 2023-01-24 00:47:20.175945: step: 102/466, loss: 1.249924659729004 
2023-01-24 00:47:20.784244: step: 104/466, loss: 0.6266840100288391 2023-01-24 00:47:21.422675: step: 106/466, loss: 4.088799953460693 2023-01-24 00:47:22.049171: step: 108/466, loss: 1.7367558479309082 2023-01-24 00:47:22.678561: step: 110/466, loss: 0.3987114131450653 2023-01-24 00:47:23.265376: step: 112/466, loss: 0.2352711260318756 2023-01-24 00:47:23.876055: step: 114/466, loss: 0.482532262802124 2023-01-24 00:47:24.536202: step: 116/466, loss: 0.9187915325164795 2023-01-24 00:47:25.181182: step: 118/466, loss: 0.2020147144794464 2023-01-24 00:47:25.765384: step: 120/466, loss: 0.2355295866727829 2023-01-24 00:47:26.368132: step: 122/466, loss: 1.4388062953948975 2023-01-24 00:47:26.969512: step: 124/466, loss: 1.1046127080917358 2023-01-24 00:47:27.588613: step: 126/466, loss: 0.25789308547973633 2023-01-24 00:47:28.205114: step: 128/466, loss: 1.6567487716674805 2023-01-24 00:47:28.766686: step: 130/466, loss: 1.1461472511291504 2023-01-24 00:47:29.352346: step: 132/466, loss: 1.9202277660369873 2023-01-24 00:47:30.053056: step: 134/466, loss: 0.8971623778343201 2023-01-24 00:47:30.674924: step: 136/466, loss: 1.5297441482543945 2023-01-24 00:47:31.281084: step: 138/466, loss: 1.31128990650177 2023-01-24 00:47:31.903024: step: 140/466, loss: 0.7695094347000122 2023-01-24 00:47:32.592203: step: 142/466, loss: 0.25040504336357117 2023-01-24 00:47:33.186872: step: 144/466, loss: 0.7494414448738098 2023-01-24 00:47:33.777595: step: 146/466, loss: 0.8621645569801331 2023-01-24 00:47:34.404184: step: 148/466, loss: 0.9328680634498596 2023-01-24 00:47:34.990964: step: 150/466, loss: 3.1788134574890137 2023-01-24 00:47:35.569427: step: 152/466, loss: 1.3003443479537964 2023-01-24 00:47:36.202925: step: 154/466, loss: 0.7661277055740356 2023-01-24 00:47:36.868667: step: 156/466, loss: 0.248875230550766 2023-01-24 00:47:37.460857: step: 158/466, loss: 0.33294370770454407 2023-01-24 00:47:38.138463: step: 160/466, loss: 1.2023553848266602 2023-01-24 00:47:38.849696: 
step: 162/466, loss: 1.5967686176300049 2023-01-24 00:47:39.512618: step: 164/466, loss: 0.39988163113594055 2023-01-24 00:47:40.175155: step: 166/466, loss: 0.21817664802074432 2023-01-24 00:47:40.758726: step: 168/466, loss: 0.42362329363822937 2023-01-24 00:47:41.371220: step: 170/466, loss: 0.38722550868988037 2023-01-24 00:47:41.999008: step: 172/466, loss: 3.7433886528015137 2023-01-24 00:47:42.701491: step: 174/466, loss: 0.8374991416931152 2023-01-24 00:47:43.381674: step: 176/466, loss: 0.6235227584838867 2023-01-24 00:47:43.991112: step: 178/466, loss: 1.054993987083435 2023-01-24 00:47:44.623312: step: 180/466, loss: 0.5051757097244263 2023-01-24 00:47:45.270327: step: 182/466, loss: 1.1793485879898071 2023-01-24 00:47:45.864364: step: 184/466, loss: 0.9896507263183594 2023-01-24 00:47:46.433843: step: 186/466, loss: 4.04372501373291 2023-01-24 00:47:47.059911: step: 188/466, loss: 0.5763915181159973 2023-01-24 00:47:47.689739: step: 190/466, loss: 2.586012125015259 2023-01-24 00:47:48.249785: step: 192/466, loss: 0.22528810799121857 2023-01-24 00:47:48.842977: step: 194/466, loss: 1.463167428970337 2023-01-24 00:47:49.466746: step: 196/466, loss: 0.7889247536659241 2023-01-24 00:47:50.148373: step: 198/466, loss: 1.4208873510360718 2023-01-24 00:47:50.737979: step: 200/466, loss: 5.841209888458252 2023-01-24 00:47:51.320516: step: 202/466, loss: 1.1238937377929688 2023-01-24 00:47:51.901732: step: 204/466, loss: 1.2800912857055664 2023-01-24 00:47:52.510196: step: 206/466, loss: 2.091311454772949 2023-01-24 00:47:53.087504: step: 208/466, loss: 0.321956992149353 2023-01-24 00:47:53.815679: step: 210/466, loss: 5.405332088470459 2023-01-24 00:47:54.460300: step: 212/466, loss: 12.03922176361084 2023-01-24 00:47:55.116558: step: 214/466, loss: 1.0027766227722168 2023-01-24 00:47:55.812027: step: 216/466, loss: 1.1555352210998535 2023-01-24 00:47:56.507777: step: 218/466, loss: 3.034679412841797 2023-01-24 00:47:57.226228: step: 220/466, loss: 
0.25661519169807434 2023-01-24 00:47:57.854921: step: 222/466, loss: 0.4973966181278229 2023-01-24 00:47:58.463138: step: 224/466, loss: 2.389017105102539 2023-01-24 00:47:59.121057: step: 226/466, loss: 0.22257673740386963 2023-01-24 00:47:59.832525: step: 228/466, loss: 0.7281988859176636 2023-01-24 00:48:00.478213: step: 230/466, loss: 1.7831833362579346 2023-01-24 00:48:01.163808: step: 232/466, loss: 3.927642583847046 2023-01-24 00:48:01.815557: step: 234/466, loss: 0.6490916013717651 2023-01-24 00:48:02.424811: step: 236/466, loss: 0.5916441082954407 2023-01-24 00:48:03.008977: step: 238/466, loss: 0.864163339138031 2023-01-24 00:48:03.646063: step: 240/466, loss: 4.118500709533691 2023-01-24 00:48:04.334860: step: 242/466, loss: 1.7515130043029785 2023-01-24 00:48:04.908728: step: 244/466, loss: 0.7366144061088562 2023-01-24 00:48:05.516929: step: 246/466, loss: 1.149341106414795 2023-01-24 00:48:06.110875: step: 248/466, loss: 0.9426298141479492 2023-01-24 00:48:06.758847: step: 250/466, loss: 1.1655728816986084 2023-01-24 00:48:07.337148: step: 252/466, loss: 1.1931763887405396 2023-01-24 00:48:08.005266: step: 254/466, loss: 0.6886307597160339 2023-01-24 00:48:08.652306: step: 256/466, loss: 0.7341050505638123 2023-01-24 00:48:09.318582: step: 258/466, loss: 6.162553310394287 2023-01-24 00:48:09.971020: step: 260/466, loss: 0.9465752243995667 2023-01-24 00:48:10.585984: step: 262/466, loss: 1.0216237306594849 2023-01-24 00:48:11.222849: step: 264/466, loss: 0.5391034483909607 2023-01-24 00:48:11.871076: step: 266/466, loss: 0.802638053894043 2023-01-24 00:48:12.482552: step: 268/466, loss: 1.4949768781661987 2023-01-24 00:48:13.160380: step: 270/466, loss: 1.0837867259979248 2023-01-24 00:48:13.716207: step: 272/466, loss: 0.31959158182144165 2023-01-24 00:48:14.349791: step: 274/466, loss: 0.6207334995269775 2023-01-24 00:48:15.014873: step: 276/466, loss: 0.45914387702941895 2023-01-24 00:48:15.632192: step: 278/466, loss: 4.065018653869629 2023-01-24 
00:48:16.273757: step: 280/466, loss: 0.20469020307064056 2023-01-24 00:48:16.877777: step: 282/466, loss: 0.5084442496299744 2023-01-24 00:48:17.508719: step: 284/466, loss: 0.7040725350379944 2023-01-24 00:48:18.093319: step: 286/466, loss: 0.399564266204834 2023-01-24 00:48:18.722565: step: 288/466, loss: 0.6310179233551025 2023-01-24 00:48:19.352964: step: 290/466, loss: 0.13815368711948395 2023-01-24 00:48:19.952309: step: 292/466, loss: 2.8727893829345703 2023-01-24 00:48:20.604067: step: 294/466, loss: 0.43193039298057556 2023-01-24 00:48:21.213129: step: 296/466, loss: 0.7842096090316772 2023-01-24 00:48:21.836680: step: 298/466, loss: 0.7666311264038086 2023-01-24 00:48:22.488920: step: 300/466, loss: 2.4203577041625977 2023-01-24 00:48:23.103886: step: 302/466, loss: 0.3907892405986786 2023-01-24 00:48:23.714947: step: 304/466, loss: 0.3148607611656189 2023-01-24 00:48:24.398759: step: 306/466, loss: 0.531349778175354 2023-01-24 00:48:25.001496: step: 308/466, loss: 1.8285367488861084 2023-01-24 00:48:25.619195: step: 310/466, loss: 1.708573341369629 2023-01-24 00:48:26.274924: step: 312/466, loss: 3.27811598777771 2023-01-24 00:48:26.857164: step: 314/466, loss: 0.8649401664733887 2023-01-24 00:48:27.488155: step: 316/466, loss: 0.32861870527267456 2023-01-24 00:48:28.149299: step: 318/466, loss: 1.1042927503585815 2023-01-24 00:48:28.748692: step: 320/466, loss: 0.801863431930542 2023-01-24 00:48:29.324059: step: 322/466, loss: 0.46133917570114136 2023-01-24 00:48:30.025233: step: 324/466, loss: 0.4726460874080658 2023-01-24 00:48:30.656815: step: 326/466, loss: 0.27592214941978455 2023-01-24 00:48:31.276279: step: 328/466, loss: 1.2176792621612549 2023-01-24 00:48:31.873464: step: 330/466, loss: 0.8818544149398804 2023-01-24 00:48:32.496684: step: 332/466, loss: 0.7360835075378418 2023-01-24 00:48:33.085520: step: 334/466, loss: 0.2759115993976593 2023-01-24 00:48:33.673654: step: 336/466, loss: 1.2839148044586182 2023-01-24 00:48:34.267168: step: 
338/466, loss: 0.5772988796234131 2023-01-24 00:48:34.923429: step: 340/466, loss: 0.621418833732605 2023-01-24 00:48:35.590829: step: 342/466, loss: 8.68262767791748 2023-01-24 00:48:36.219728: step: 344/466, loss: 2.322258234024048 2023-01-24 00:48:36.885482: step: 346/466, loss: 2.3204495906829834 2023-01-24 00:48:37.507967: step: 348/466, loss: 3.272414207458496 2023-01-24 00:48:38.211309: step: 350/466, loss: 0.9026280641555786 2023-01-24 00:48:38.886751: step: 352/466, loss: 1.1732512712478638 2023-01-24 00:48:39.513412: step: 354/466, loss: 1.1399867534637451 2023-01-24 00:48:40.121211: step: 356/466, loss: 0.6772962212562561 2023-01-24 00:48:40.945721: step: 358/466, loss: 1.2160906791687012 2023-01-24 00:48:41.529342: step: 360/466, loss: 0.6079787015914917 2023-01-24 00:48:42.229849: step: 362/466, loss: 0.3634703457355499 2023-01-24 00:48:42.950703: step: 364/466, loss: 0.6174222230911255 2023-01-24 00:48:43.543484: step: 366/466, loss: 0.4810933470726013 2023-01-24 00:48:44.123049: step: 368/466, loss: 1.0697897672653198 2023-01-24 00:48:44.713046: step: 370/466, loss: 0.1810256391763687 2023-01-24 00:48:45.371800: step: 372/466, loss: 1.742691159248352 2023-01-24 00:48:46.043132: step: 374/466, loss: 0.844283401966095 2023-01-24 00:48:46.658907: step: 376/466, loss: 5.25944709777832 2023-01-24 00:48:47.312000: step: 378/466, loss: 1.2075788974761963 2023-01-24 00:48:47.893259: step: 380/466, loss: 1.401288390159607 2023-01-24 00:48:48.492501: step: 382/466, loss: 1.144996166229248 2023-01-24 00:48:49.115270: step: 384/466, loss: 0.42485713958740234 2023-01-24 00:48:49.764402: step: 386/466, loss: 1.235788106918335 2023-01-24 00:48:50.432461: step: 388/466, loss: 1.0391279458999634 2023-01-24 00:48:51.008363: step: 390/466, loss: 4.329006195068359 2023-01-24 00:48:51.666784: step: 392/466, loss: 0.8605351448059082 2023-01-24 00:48:52.270862: step: 394/466, loss: 0.8281611800193787 2023-01-24 00:48:52.892481: step: 396/466, loss: 1.492803931236267 
2023-01-24 00:48:53.551287: step: 398/466, loss: 0.942059338092804 2023-01-24 00:48:54.183548: step: 400/466, loss: 0.6410725712776184 2023-01-24 00:48:54.763838: step: 402/466, loss: 1.1986944675445557 2023-01-24 00:48:55.352124: step: 404/466, loss: 1.9941927194595337 2023-01-24 00:48:55.921006: step: 406/466, loss: 0.5979148149490356 2023-01-24 00:48:56.538700: step: 408/466, loss: 0.42674484848976135 2023-01-24 00:48:57.167679: step: 410/466, loss: 0.6338918209075928 2023-01-24 00:48:57.779121: step: 412/466, loss: 0.2838844954967499 2023-01-24 00:48:58.344545: step: 414/466, loss: 1.0026823282241821 2023-01-24 00:48:58.940514: step: 416/466, loss: 1.5562171936035156 2023-01-24 00:48:59.608932: step: 418/466, loss: 0.7744855880737305 2023-01-24 00:49:00.241163: step: 420/466, loss: 0.2533835470676422 2023-01-24 00:49:00.802003: step: 422/466, loss: 0.6974008083343506 2023-01-24 00:49:01.449993: step: 424/466, loss: 1.380007028579712 2023-01-24 00:49:02.106795: step: 426/466, loss: 1.7630139589309692 2023-01-24 00:49:02.677402: step: 428/466, loss: 2.447899341583252 2023-01-24 00:49:03.251410: step: 430/466, loss: 2.01246976852417 2023-01-24 00:49:03.935915: step: 432/466, loss: 0.5250542163848877 2023-01-24 00:49:04.576681: step: 434/466, loss: 0.38290348649024963 2023-01-24 00:49:05.193363: step: 436/466, loss: 2.1723570823669434 2023-01-24 00:49:05.834222: step: 438/466, loss: 0.8501411080360413 2023-01-24 00:49:06.533629: step: 440/466, loss: 1.0419926643371582 2023-01-24 00:49:07.188020: step: 442/466, loss: 1.304261326789856 2023-01-24 00:49:07.717672: step: 444/466, loss: 0.49662578105926514 2023-01-24 00:49:08.351406: step: 446/466, loss: 0.7082849144935608 2023-01-24 00:49:08.946369: step: 448/466, loss: 0.22583594918251038 2023-01-24 00:49:09.536399: step: 450/466, loss: 0.28244996070861816 2023-01-24 00:49:10.138972: step: 452/466, loss: 2.694072723388672 2023-01-24 00:49:10.746939: step: 454/466, loss: 0.78786301612854 2023-01-24 00:49:11.329022: 
step: 456/466, loss: 1.4211052656173706 2023-01-24 00:49:11.997615: step: 458/466, loss: 1.5807013511657715 2023-01-24 00:49:12.629024: step: 460/466, loss: 0.27115046977996826 2023-01-24 00:49:13.299944: step: 462/466, loss: 1.418265700340271 2023-01-24 00:49:13.974579: step: 464/466, loss: 0.6221765279769897 2023-01-24 00:49:14.651594: step: 466/466, loss: 1.763245940208435 2023-01-24 00:49:15.246365: step: 468/466, loss: 2.8533806800842285 2023-01-24 00:49:15.897107: step: 470/466, loss: 0.33080437779426575 2023-01-24 00:49:16.506467: step: 472/466, loss: 0.6602237224578857 2023-01-24 00:49:17.123013: step: 474/466, loss: 0.256158709526062 2023-01-24 00:49:17.701976: step: 476/466, loss: 0.4613457918167114 2023-01-24 00:49:18.351574: step: 478/466, loss: 1.4783824682235718 2023-01-24 00:49:18.987193: step: 480/466, loss: 0.7388719320297241 2023-01-24 00:49:19.598839: step: 482/466, loss: 1.4873404502868652 2023-01-24 00:49:20.195955: step: 484/466, loss: 7.858376502990723 2023-01-24 00:49:20.818638: step: 486/466, loss: 0.8068593740463257 2023-01-24 00:49:21.399387: step: 488/466, loss: 1.6794354915618896 2023-01-24 00:49:22.030646: step: 490/466, loss: 2.4783554077148438 2023-01-24 00:49:22.646403: step: 492/466, loss: 0.7154982089996338 2023-01-24 00:49:23.212035: step: 494/466, loss: 0.3904711902141571 2023-01-24 00:49:23.788345: step: 496/466, loss: 0.9338459968566895 2023-01-24 00:49:24.401367: step: 498/466, loss: 0.49109259247779846 2023-01-24 00:49:25.113063: step: 500/466, loss: 1.4999065399169922 2023-01-24 00:49:25.713320: step: 502/466, loss: 0.929957926273346 2023-01-24 00:49:26.417367: step: 504/466, loss: 1.8211406469345093 2023-01-24 00:49:27.011525: step: 506/466, loss: 3.3952579498291016 2023-01-24 00:49:27.659005: step: 508/466, loss: 0.4057934880256653 2023-01-24 00:49:28.262558: step: 510/466, loss: 4.130643367767334 2023-01-24 00:49:28.832143: step: 512/466, loss: 1.2591372728347778 2023-01-24 00:49:29.456552: step: 514/466, loss: 
0.8045276999473572 2023-01-24 00:49:30.076045: step: 516/466, loss: 1.412993311882019 2023-01-24 00:49:30.724356: step: 518/466, loss: 1.677248239517212 2023-01-24 00:49:31.342773: step: 520/466, loss: 2.805729389190674 2023-01-24 00:49:31.965752: step: 522/466, loss: 1.2893203496932983 2023-01-24 00:49:32.619470: step: 524/466, loss: 1.1867783069610596 2023-01-24 00:49:33.235578: step: 526/466, loss: 0.3541945219039917 2023-01-24 00:49:33.934122: step: 528/466, loss: 0.5837615132331848 2023-01-24 00:49:34.607511: step: 530/466, loss: 0.3464820981025696 2023-01-24 00:49:35.250534: step: 532/466, loss: 1.0019688606262207 2023-01-24 00:49:35.900999: step: 534/466, loss: 2.7500970363616943 2023-01-24 00:49:36.607993: step: 536/466, loss: 1.2855634689331055 2023-01-24 00:49:37.161537: step: 538/466, loss: 1.05565345287323 2023-01-24 00:49:37.787994: step: 540/466, loss: 1.951469898223877 2023-01-24 00:49:38.432075: step: 542/466, loss: 1.0896230936050415 2023-01-24 00:49:39.078607: step: 544/466, loss: 2.1885619163513184 2023-01-24 00:49:39.700008: step: 546/466, loss: 0.2746879756450653 2023-01-24 00:49:40.339230: step: 548/466, loss: 0.5674010515213013 2023-01-24 00:49:40.980628: step: 550/466, loss: 1.5061687231063843 2023-01-24 00:49:41.572871: step: 552/466, loss: 3.505814790725708 2023-01-24 00:49:42.205784: step: 554/466, loss: 0.5314408540725708 2023-01-24 00:49:42.801636: step: 556/466, loss: 0.5787690877914429 2023-01-24 00:49:43.412854: step: 558/466, loss: 1.5390510559082031 2023-01-24 00:49:44.020996: step: 560/466, loss: 2.987964153289795 2023-01-24 00:49:44.652238: step: 562/466, loss: 0.6447228193283081 2023-01-24 00:49:45.245172: step: 564/466, loss: 0.7427317500114441 2023-01-24 00:49:45.819490: step: 566/466, loss: 0.6331799030303955 2023-01-24 00:49:46.450192: step: 568/466, loss: 2.916682481765747 2023-01-24 00:49:47.097834: step: 570/466, loss: 1.3643766641616821 2023-01-24 00:49:47.671087: step: 572/466, loss: 0.6767655611038208 2023-01-24 
00:49:48.261026: step: 574/466, loss: 0.1964035928249359 2023-01-24 00:49:48.896669: step: 576/466, loss: 0.30235856771469116 2023-01-24 00:49:49.439513: step: 578/466, loss: 1.5001364946365356 2023-01-24 00:49:50.112862: step: 580/466, loss: 2.098022937774658 2023-01-24 00:49:50.682118: step: 582/466, loss: 1.3338154554367065 2023-01-24 00:49:51.339604: step: 584/466, loss: 1.5174646377563477 2023-01-24 00:49:52.020899: step: 586/466, loss: 2.505064010620117 2023-01-24 00:49:52.662765: step: 588/466, loss: 1.6157104969024658 2023-01-24 00:49:53.421371: step: 590/466, loss: 0.6165828108787537 2023-01-24 00:49:54.054926: step: 592/466, loss: 0.4426193833351135 2023-01-24 00:49:54.699934: step: 594/466, loss: 0.5435011982917786 2023-01-24 00:49:55.256564: step: 596/466, loss: 1.061076045036316 2023-01-24 00:49:55.826048: step: 598/466, loss: 0.27508556842803955 2023-01-24 00:49:56.464845: step: 600/466, loss: 1.2685134410858154 2023-01-24 00:49:57.059428: step: 602/466, loss: 0.46343937516212463 2023-01-24 00:49:57.721749: step: 604/466, loss: 0.30765286087989807 2023-01-24 00:49:58.340844: step: 606/466, loss: 0.6040436029434204 2023-01-24 00:49:58.931680: step: 608/466, loss: 1.361868977546692 2023-01-24 00:49:59.586721: step: 610/466, loss: 1.4016616344451904 2023-01-24 00:50:00.307183: step: 612/466, loss: 0.5135511159896851 2023-01-24 00:50:00.970260: step: 614/466, loss: 0.5744441747665405 2023-01-24 00:50:01.612711: step: 616/466, loss: 2.2033591270446777 2023-01-24 00:50:02.233304: step: 618/466, loss: 0.8985404968261719 2023-01-24 00:50:02.879085: step: 620/466, loss: 0.9845884442329407 2023-01-24 00:50:03.543114: step: 622/466, loss: 1.1178152561187744 2023-01-24 00:50:04.218430: step: 624/466, loss: 0.418687105178833 2023-01-24 00:50:04.807573: step: 626/466, loss: 0.262470543384552 2023-01-24 00:50:05.393645: step: 628/466, loss: 0.7660725116729736 2023-01-24 00:50:06.026313: step: 630/466, loss: 0.33508577942848206 2023-01-24 00:50:06.679215: step: 
632/466, loss: 0.1842721551656723 2023-01-24 00:50:07.337312: step: 634/466, loss: 1.1078568696975708 2023-01-24 00:50:07.979880: step: 636/466, loss: 0.8883263468742371 2023-01-24 00:50:08.619285: step: 638/466, loss: 1.2047715187072754 2023-01-24 00:50:09.203755: step: 640/466, loss: 0.20477718114852905 2023-01-24 00:50:09.848846: step: 642/466, loss: 1.3275189399719238 2023-01-24 00:50:10.421215: step: 644/466, loss: 0.3288031816482544 2023-01-24 00:50:10.989211: step: 646/466, loss: 0.4039969742298126 2023-01-24 00:50:11.576603: step: 648/466, loss: 5.4602766036987305 2023-01-24 00:50:12.220517: step: 650/466, loss: 1.0883431434631348 2023-01-24 00:50:12.823827: step: 652/466, loss: 0.9030593633651733 2023-01-24 00:50:13.421386: step: 654/466, loss: 2.0893757343292236 2023-01-24 00:50:14.013724: step: 656/466, loss: 10.055910110473633 2023-01-24 00:50:14.711288: step: 658/466, loss: 1.5657265186309814 2023-01-24 00:50:15.325946: step: 660/466, loss: 0.8252211809158325 2023-01-24 00:50:15.926551: step: 662/466, loss: 0.2250550538301468 2023-01-24 00:50:16.566081: step: 664/466, loss: 1.1556295156478882 2023-01-24 00:50:17.163927: step: 666/466, loss: 3.229569435119629 2023-01-24 00:50:17.825592: step: 668/466, loss: 0.4573105573654175 2023-01-24 00:50:18.399832: step: 670/466, loss: 0.7028440237045288 2023-01-24 00:50:19.041614: step: 672/466, loss: 0.8527745604515076 2023-01-24 00:50:19.646721: step: 674/466, loss: 0.24894112348556519 2023-01-24 00:50:20.299008: step: 676/466, loss: 1.8360977172851562 2023-01-24 00:50:20.941707: step: 678/466, loss: 0.7502386569976807 2023-01-24 00:50:21.550846: step: 680/466, loss: 0.5243229866027832 2023-01-24 00:50:22.195111: step: 682/466, loss: 0.5677071809768677 2023-01-24 00:50:22.764936: step: 684/466, loss: 0.9893231391906738 2023-01-24 00:50:23.424389: step: 686/466, loss: 0.9406887888908386 2023-01-24 00:50:24.035955: step: 688/466, loss: 1.7328908443450928 2023-01-24 00:50:24.719947: step: 690/466, loss: 
0.5731059908866882 2023-01-24 00:50:25.338547: step: 692/466, loss: 3.6121439933776855 2023-01-24 00:50:25.948063: step: 694/466, loss: 0.5973498821258545 2023-01-24 00:50:26.568861: step: 696/466, loss: 1.0482425689697266 2023-01-24 00:50:27.188220: step: 698/466, loss: 0.5279887318611145 2023-01-24 00:50:27.751937: step: 700/466, loss: 1.2407571077346802 2023-01-24 00:50:28.363083: step: 702/466, loss: 0.5543396472930908 2023-01-24 00:50:29.026301: step: 704/466, loss: 0.9664677381515503 2023-01-24 00:50:29.621125: step: 706/466, loss: 1.2225338220596313 2023-01-24 00:50:30.287231: step: 708/466, loss: 1.5828288793563843 2023-01-24 00:50:30.929832: step: 710/466, loss: 0.4475824236869812 2023-01-24 00:50:31.543889: step: 712/466, loss: 0.6657191514968872 2023-01-24 00:50:32.216237: step: 714/466, loss: 2.9985835552215576 2023-01-24 00:50:32.895718: step: 716/466, loss: 0.8259701132774353 2023-01-24 00:50:33.565689: step: 718/466, loss: 0.8415499925613403 2023-01-24 00:50:34.166408: step: 720/466, loss: 0.903465986251831 2023-01-24 00:50:34.795760: step: 722/466, loss: 2.4392549991607666 2023-01-24 00:50:35.421805: step: 724/466, loss: 0.33140289783477783 2023-01-24 00:50:36.017035: step: 726/466, loss: 3.3589072227478027 2023-01-24 00:50:36.592373: step: 728/466, loss: 3.946051597595215 2023-01-24 00:50:37.294583: step: 730/466, loss: 1.526811957359314 2023-01-24 00:50:37.931874: step: 732/466, loss: 0.24000266194343567 2023-01-24 00:50:38.508912: step: 734/466, loss: 0.7835390567779541 2023-01-24 00:50:39.118212: step: 736/466, loss: 0.4948054254055023 2023-01-24 00:50:39.750035: step: 738/466, loss: 0.7074456810951233 2023-01-24 00:50:40.377317: step: 740/466, loss: 0.4379880428314209 2023-01-24 00:50:41.108956: step: 742/466, loss: 0.6280728578567505 2023-01-24 00:50:41.753872: step: 744/466, loss: 4.928013801574707 2023-01-24 00:50:42.422830: step: 746/466, loss: 2.8795113563537598 2023-01-24 00:50:43.059414: step: 748/466, loss: 0.34982365369796753 
2023-01-24 00:50:43.676519: step: 750/466, loss: 0.8757614493370056 2023-01-24 00:50:44.271348: step: 752/466, loss: 0.3209124207496643 2023-01-24 00:50:44.915566: step: 754/466, loss: 1.0372759103775024 2023-01-24 00:50:45.611589: step: 756/466, loss: 0.6970776915550232 2023-01-24 00:50:46.275532: step: 758/466, loss: 0.7311455607414246 2023-01-24 00:50:46.902337: step: 760/466, loss: 2.0544915199279785 2023-01-24 00:50:47.527439: step: 762/466, loss: 0.8903125524520874 2023-01-24 00:50:48.163697: step: 764/466, loss: 1.341835856437683 2023-01-24 00:50:48.846395: step: 766/466, loss: 0.5011296272277832 2023-01-24 00:50:49.467630: step: 768/466, loss: 2.6994924545288086 2023-01-24 00:50:50.128582: step: 770/466, loss: 1.665336012840271 2023-01-24 00:50:50.737870: step: 772/466, loss: 1.8195743560791016 2023-01-24 00:50:51.345095: step: 774/466, loss: 1.832624912261963 2023-01-24 00:50:51.952841: step: 776/466, loss: 0.27986520528793335 2023-01-24 00:50:52.636339: step: 778/466, loss: 0.9908888339996338 2023-01-24 00:50:53.237919: step: 780/466, loss: 0.2665031850337982 2023-01-24 00:50:54.000956: step: 782/466, loss: 0.5017292499542236 2023-01-24 00:50:54.734994: step: 784/466, loss: 1.8286349773406982 2023-01-24 00:50:55.382880: step: 786/466, loss: 0.8838155269622803 2023-01-24 00:50:56.030416: step: 788/466, loss: 0.8231385946273804 2023-01-24 00:50:56.698292: step: 790/466, loss: 1.1516478061676025 2023-01-24 00:50:57.327964: step: 792/466, loss: 2.0130913257598877 2023-01-24 00:50:57.979127: step: 794/466, loss: 1.003012776374817 2023-01-24 00:50:58.592859: step: 796/466, loss: 1.2084956169128418 2023-01-24 00:50:59.247905: step: 798/466, loss: 1.0133253335952759 2023-01-24 00:50:59.883283: step: 800/466, loss: 2.3319003582000732 2023-01-24 00:51:00.508615: step: 802/466, loss: 2.432572364807129 2023-01-24 00:51:01.096347: step: 804/466, loss: 2.595862627029419 2023-01-24 00:51:01.752229: step: 806/466, loss: 0.9174851179122925 2023-01-24 00:51:02.410529: 
step: 808/466, loss: 0.8310629725456238 2023-01-24 00:51:03.082100: step: 810/466, loss: 0.9610335826873779 2023-01-24 00:51:03.678063: step: 812/466, loss: 0.6178631782531738 2023-01-24 00:51:04.266315: step: 814/466, loss: 0.8955137729644775 2023-01-24 00:51:04.891452: step: 816/466, loss: 0.37296637892723083 2023-01-24 00:51:05.514166: step: 818/466, loss: 6.648438453674316 2023-01-24 00:51:06.107906: step: 820/466, loss: 0.9201613068580627 2023-01-24 00:51:06.743043: step: 822/466, loss: 0.6568300724029541 2023-01-24 00:51:07.371136: step: 824/466, loss: 1.3470970392227173 2023-01-24 00:51:08.016561: step: 826/466, loss: 1.4039759635925293 2023-01-24 00:51:08.703110: step: 828/466, loss: 0.38886359333992004 2023-01-24 00:51:09.303471: step: 830/466, loss: 1.1843596696853638 2023-01-24 00:51:09.899887: step: 832/466, loss: 2.6176464557647705 2023-01-24 00:51:10.539861: step: 834/466, loss: 1.1597771644592285 2023-01-24 00:51:11.142310: step: 836/466, loss: 0.3963584899902344 2023-01-24 00:51:11.797575: step: 838/466, loss: 0.2508530020713806 2023-01-24 00:51:12.356118: step: 840/466, loss: 0.8605871200561523 2023-01-24 00:51:12.997902: step: 842/466, loss: 1.2558029890060425 2023-01-24 00:51:13.571585: step: 844/466, loss: 1.2138640880584717 2023-01-24 00:51:14.215541: step: 846/466, loss: 0.5845181345939636 2023-01-24 00:51:14.920387: step: 848/466, loss: 1.2057701349258423 2023-01-24 00:51:15.512679: step: 850/466, loss: 1.1409090757369995 2023-01-24 00:51:16.122945: step: 852/466, loss: 1.0597467422485352 2023-01-24 00:51:16.705061: step: 854/466, loss: 0.5218050479888916 2023-01-24 00:51:17.335601: step: 856/466, loss: 0.7265652418136597 2023-01-24 00:51:18.020722: step: 858/466, loss: 2.1826112270355225 2023-01-24 00:51:18.673148: step: 860/466, loss: 0.37034282088279724 2023-01-24 00:51:19.364228: step: 862/466, loss: 2.7625083923339844 2023-01-24 00:51:20.008443: step: 864/466, loss: 0.6565631628036499 2023-01-24 00:51:20.586792: step: 866/466, loss: 
1.849884271621704 2023-01-24 00:51:21.255750: step: 868/466, loss: 0.7041193842887878 2023-01-24 00:51:21.893042: step: 870/466, loss: 0.3684704601764679 2023-01-24 00:51:22.563798: step: 872/466, loss: 0.36822181940078735 2023-01-24 00:51:23.200600: step: 874/466, loss: 0.5671539306640625 2023-01-24 00:51:23.862526: step: 876/466, loss: 0.28059229254722595 2023-01-24 00:51:24.518899: step: 878/466, loss: 0.19189174473285675 2023-01-24 00:51:25.098671: step: 880/466, loss: 0.7543321251869202 2023-01-24 00:51:25.680455: step: 882/466, loss: 1.4416909217834473 2023-01-24 00:51:26.270678: step: 884/466, loss: 0.3737231492996216 2023-01-24 00:51:26.974185: step: 886/466, loss: 0.37167057394981384 2023-01-24 00:51:27.634130: step: 888/466, loss: 0.3493468165397644 2023-01-24 00:51:28.235791: step: 890/466, loss: 0.610969603061676 2023-01-24 00:51:28.853368: step: 892/466, loss: 0.42475399374961853 2023-01-24 00:51:29.484119: step: 894/466, loss: 0.17129488289356232 2023-01-24 00:51:30.145483: step: 896/466, loss: 4.020137786865234 2023-01-24 00:51:30.733222: step: 898/466, loss: 0.6816555261611938 2023-01-24 00:51:31.326742: step: 900/466, loss: 1.2390077114105225 2023-01-24 00:51:31.916823: step: 902/466, loss: 0.4940222501754761 2023-01-24 00:51:32.624832: step: 904/466, loss: 0.15959131717681885 2023-01-24 00:51:33.286168: step: 906/466, loss: 2.1150925159454346 2023-01-24 00:51:33.864553: step: 908/466, loss: 1.8174481391906738 2023-01-24 00:51:34.435200: step: 910/466, loss: 3.087872266769409 2023-01-24 00:51:35.062468: step: 912/466, loss: 0.49890315532684326 2023-01-24 00:51:35.677190: step: 914/466, loss: 1.5742014646530151 2023-01-24 00:51:36.286051: step: 916/466, loss: 0.7117514610290527 2023-01-24 00:51:36.862141: step: 918/466, loss: 1.0011276006698608 2023-01-24 00:51:37.577737: step: 920/466, loss: 1.4318996667861938 2023-01-24 00:51:38.173564: step: 922/466, loss: 0.8991138935089111 2023-01-24 00:51:38.751566: step: 924/466, loss: 0.5053771734237671 
2023-01-24 00:51:39.334914: step: 926/466, loss: 2.1333112716674805 2023-01-24 00:51:39.925616: step: 928/466, loss: 4.616252422332764 2023-01-24 00:51:40.543718: step: 930/466, loss: 4.1801652908325195 2023-01-24 00:51:41.159791: step: 932/466, loss: 3.0591838359832764 ================================================== Loss: 1.294 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3164303771212846, 'r': 0.295415076932964, 'f1': 0.3055618165724671}, 'combined': 0.22515081221129155, 'epoch': 3} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34839024729428025, 'r': 0.27130476474688553, 'f1': 0.30505307367555545}, 'combined': 0.20231499186772586, 'epoch': 3} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3139643719806763, 'r': 0.24617660984848483, 'f1': 0.2759686836518046}, 'combined': 0.1839791224345364, 'epoch': 3} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3600184028992785, 'r': 0.25773338169559745, 'f1': 0.30040790740161233}, 'combined': 0.19605568693578906, 'epoch': 3} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32446373912283005, 'r': 0.30476195610208895, 'f1': 0.31430440482544203}, 'combined': 0.23159271934506254, 'epoch': 3} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3404589712580221, 'r': 0.2634574188331142, 'f1': 0.2970492050155484}, 'combined': 0.1970067266424362, 'epoch': 3} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3098290598290598, 'r': 0.3452380952380952, 'f1': 0.3265765765765765}, 'combined': 0.21771771771771767, 'epoch': 3} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.3076923076923077, 'r': 0.17391304347826086, 'f1': 0.2222222222222222}, 'combined': 0.14814814814814814, 'epoch': 3} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 3} New best chinese model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3164303771212846, 'r': 0.295415076932964, 'f1': 0.3055618165724671}, 'combined': 0.22515081221129155, 'epoch': 3} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34839024729428025, 'r': 0.27130476474688553, 'f1': 0.30505307367555545}, 'combined': 0.20231499186772586, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3098290598290598, 'r': 0.3452380952380952, 'f1': 0.3265765765765765}, 'combined': 0.21771771771771767, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34554782082324453, 'r': 0.19269679989197946, 'f1': 0.24741894937586684}, 'combined': 0.1649459662505779, 'epoch': 2} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.40241683553303587, 'r': 0.1674999263617645, 'f1': 0.2365425789352723}, 'combined': 0.15437515677880928, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4583333333333333, 'r': 0.2391304347826087, 'f1': 0.3142857142857143}, 'combined': 0.2095238095238095, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32446373912283005, 'r': 0.30476195610208895, 'f1': 0.31430440482544203}, 
'combined': 0.23159271934506254, 'epoch': 3} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3404589712580221, 'r': 0.2634574188331142, 'f1': 0.2970492050155484}, 'combined': 0.1970067266424362, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 3} ****************************** Epoch: 4 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 00:54:22.418978: step: 2/466, loss: 0.16752889752388 2023-01-24 00:54:23.153943: step: 4/466, loss: 0.6875675916671753 2023-01-24 00:54:23.739315: step: 6/466, loss: 0.975544810295105 2023-01-24 00:54:24.342788: step: 8/466, loss: 0.9720039367675781 2023-01-24 00:54:24.921583: step: 10/466, loss: 0.6962733864784241 2023-01-24 00:54:25.541178: step: 12/466, loss: 0.2384537160396576 2023-01-24 00:54:26.222564: step: 14/466, loss: 0.6073007583618164 2023-01-24 00:54:26.829919: step: 16/466, loss: 0.2983320355415344 2023-01-24 00:54:27.500447: step: 18/466, loss: 0.5829412341117859 2023-01-24 00:54:28.123763: step: 20/466, loss: 0.6241064071655273 2023-01-24 00:54:28.818184: step: 22/466, loss: 2.6800174713134766 2023-01-24 00:54:29.524310: step: 24/466, loss: 0.8684090971946716 2023-01-24 00:54:30.138187: step: 26/466, loss: 0.8638682961463928 2023-01-24 00:54:30.803414: step: 28/466, loss: 1.8441447019577026 2023-01-24 00:54:31.400714: step: 30/466, loss: 0.3063080310821533 2023-01-24 00:54:32.019425: step: 32/466, loss: 1.5212368965148926 2023-01-24 00:54:32.668095: step: 34/466, loss: 1.479271650314331 2023-01-24 00:54:33.320888: step: 36/466, loss: 0.17555014789104462 2023-01-24 00:54:33.906546: step: 38/466, loss: 
1.2827237844467163 2023-01-24 00:54:34.642309: step: 40/466, loss: 0.8869252800941467 2023-01-24 00:54:35.274307: step: 42/466, loss: 1.5399391651153564 2023-01-24 00:54:35.930709: step: 44/466, loss: 0.3328384459018707 2023-01-24 00:54:36.547962: step: 46/466, loss: 1.2157955169677734 2023-01-24 00:54:37.204652: step: 48/466, loss: 0.8616793751716614 2023-01-24 00:54:37.806352: step: 50/466, loss: 0.5039883852005005 2023-01-24 00:54:38.410443: step: 52/466, loss: 0.5184072256088257 2023-01-24 00:54:39.063705: step: 54/466, loss: 0.4360441267490387 2023-01-24 00:54:39.737821: step: 56/466, loss: 0.6127593517303467 2023-01-24 00:54:40.417222: step: 58/466, loss: 0.19667549431324005 2023-01-24 00:54:41.009814: step: 60/466, loss: 0.21572081744670868 2023-01-24 00:54:41.679879: step: 62/466, loss: 0.9960227012634277 2023-01-24 00:54:42.347941: step: 64/466, loss: 1.609165072441101 2023-01-24 00:54:42.956705: step: 66/466, loss: 0.8556930422782898 2023-01-24 00:54:43.494393: step: 68/466, loss: 1.347257137298584 2023-01-24 00:54:44.119971: step: 70/466, loss: 2.176039934158325 2023-01-24 00:54:44.771206: step: 72/466, loss: 1.7566003799438477 2023-01-24 00:54:45.373430: step: 74/466, loss: 0.6123693585395813 2023-01-24 00:54:46.064799: step: 76/466, loss: 0.9963200092315674 2023-01-24 00:54:46.721997: step: 78/466, loss: 0.6093348264694214 2023-01-24 00:54:47.367363: step: 80/466, loss: 0.748720109462738 2023-01-24 00:54:47.933933: step: 82/466, loss: 1.1287039518356323 2023-01-24 00:54:48.583358: step: 84/466, loss: 0.6180115342140198 2023-01-24 00:54:49.234083: step: 86/466, loss: 0.4055139422416687 2023-01-24 00:54:49.839272: step: 88/466, loss: 0.2868320345878601 2023-01-24 00:54:50.393590: step: 90/466, loss: 0.8599556088447571 2023-01-24 00:54:51.031329: step: 92/466, loss: 0.8055647611618042 2023-01-24 00:54:51.629828: step: 94/466, loss: 1.122262716293335 2023-01-24 00:54:52.254134: step: 96/466, loss: 3.749009609222412 2023-01-24 00:54:52.863683: step: 98/466, 
loss: 0.796328604221344 2023-01-24 00:54:53.542920: step: 100/466, loss: 0.7650906443595886 2023-01-24 00:54:54.170666: step: 102/466, loss: 0.49554768204689026 2023-01-24 00:54:54.755618: step: 104/466, loss: 0.6650872826576233 2023-01-24 00:54:55.374319: step: 106/466, loss: 0.3901982009410858 2023-01-24 00:54:56.035186: step: 108/466, loss: 0.28979477286338806 2023-01-24 00:54:56.670412: step: 110/466, loss: 0.5083305835723877 2023-01-24 00:54:57.328294: step: 112/466, loss: 2.831669330596924 2023-01-24 00:54:57.939165: step: 114/466, loss: 0.48879584670066833 2023-01-24 00:54:58.590066: step: 116/466, loss: 0.2785993814468384 2023-01-24 00:54:59.145077: step: 118/466, loss: 0.8611894845962524 2023-01-24 00:54:59.762918: step: 120/466, loss: 0.3948362171649933 2023-01-24 00:55:00.427970: step: 122/466, loss: 0.7170612812042236 2023-01-24 00:55:01.079970: step: 124/466, loss: 0.4916868507862091 2023-01-24 00:55:01.713659: step: 126/466, loss: 2.437495470046997 2023-01-24 00:55:02.346953: step: 128/466, loss: 6.306944847106934 2023-01-24 00:55:02.926142: step: 130/466, loss: 0.627547562122345 2023-01-24 00:55:03.496087: step: 132/466, loss: 0.506788969039917 2023-01-24 00:55:04.218940: step: 134/466, loss: 0.49227237701416016 2023-01-24 00:55:04.847886: step: 136/466, loss: 0.7065977454185486 2023-01-24 00:55:05.441421: step: 138/466, loss: 0.7459494471549988 2023-01-24 00:55:06.076973: step: 140/466, loss: 0.45485028624534607 2023-01-24 00:55:06.757391: step: 142/466, loss: 0.43375688791275024 2023-01-24 00:55:07.360822: step: 144/466, loss: 1.2365134954452515 2023-01-24 00:55:07.955295: step: 146/466, loss: 0.7009040713310242 2023-01-24 00:55:08.575837: step: 148/466, loss: 0.6090517044067383 2023-01-24 00:55:09.228615: step: 150/466, loss: 0.27691200375556946 2023-01-24 00:55:09.809914: step: 152/466, loss: 1.0735443830490112 2023-01-24 00:55:10.406872: step: 154/466, loss: 0.44410645961761475 2023-01-24 00:55:11.000829: step: 156/466, loss: 0.39195290207862854 
2023-01-24 00:55:11.661521: step: 158/466, loss: 1.1745972633361816 2023-01-24 00:55:12.315742: step: 160/466, loss: 0.39246582984924316 2023-01-24 00:55:12.933848: step: 162/466, loss: 0.6164143085479736 2023-01-24 00:55:13.576228: step: 164/466, loss: 0.07469842582941055 2023-01-24 00:55:14.191820: step: 166/466, loss: 0.6378391981124878 2023-01-24 00:55:14.768728: step: 168/466, loss: 1.5078856945037842 2023-01-24 00:55:15.385774: step: 170/466, loss: 0.5053169131278992 2023-01-24 00:55:16.014106: step: 172/466, loss: 2.8393361568450928 2023-01-24 00:55:16.541063: step: 174/466, loss: 0.15515677630901337 2023-01-24 00:55:17.201880: step: 176/466, loss: 0.2646979093551636 2023-01-24 00:55:17.906994: step: 178/466, loss: 0.32472139596939087 2023-01-24 00:55:18.608011: step: 180/466, loss: 0.7028464674949646 2023-01-24 00:55:19.248366: step: 182/466, loss: 0.5240946412086487 2023-01-24 00:55:19.821386: step: 184/466, loss: 0.2585528790950775 2023-01-24 00:55:20.412670: step: 186/466, loss: 0.3905927836894989 2023-01-24 00:55:21.008897: step: 188/466, loss: 0.298913836479187 2023-01-24 00:55:21.600067: step: 190/466, loss: 0.22440190613269806 2023-01-24 00:55:22.260821: step: 192/466, loss: 0.4986995458602905 2023-01-24 00:55:22.831903: step: 194/466, loss: 2.0994832515716553 2023-01-24 00:55:23.419391: step: 196/466, loss: 1.9224767684936523 2023-01-24 00:55:24.062956: step: 198/466, loss: 0.8672666549682617 2023-01-24 00:55:24.760029: step: 200/466, loss: 3.985262393951416 2023-01-24 00:55:25.371887: step: 202/466, loss: 0.5150429606437683 2023-01-24 00:55:25.979710: step: 204/466, loss: 1.835949182510376 2023-01-24 00:55:26.642519: step: 206/466, loss: 1.2212414741516113 2023-01-24 00:55:27.275708: step: 208/466, loss: 1.9496268033981323 2023-01-24 00:55:27.948183: step: 210/466, loss: 0.5779752135276794 2023-01-24 00:55:28.517551: step: 212/466, loss: 0.5665013194084167 2023-01-24 00:55:29.110109: step: 214/466, loss: 1.4755817651748657 2023-01-24 
00:55:29.781166: step: 216/466, loss: 1.2451655864715576 2023-01-24 00:55:30.404519: step: 218/466, loss: 0.9476553201675415 2023-01-24 00:55:31.075306: step: 220/466, loss: 0.3696775436401367 2023-01-24 00:55:31.680659: step: 222/466, loss: 2.8585143089294434 2023-01-24 00:55:32.339611: step: 224/466, loss: 0.5285619497299194 2023-01-24 00:55:32.971061: step: 226/466, loss: 1.3603274822235107 2023-01-24 00:55:33.561685: step: 228/466, loss: 0.6532608270645142 2023-01-24 00:55:34.199493: step: 230/466, loss: 0.1637951135635376 2023-01-24 00:55:34.778508: step: 232/466, loss: 0.4016510248184204 2023-01-24 00:55:35.449827: step: 234/466, loss: 0.8992332816123962 2023-01-24 00:55:36.076614: step: 236/466, loss: 0.87233966588974 2023-01-24 00:55:36.772135: step: 238/466, loss: 1.3923922777175903 2023-01-24 00:55:37.412560: step: 240/466, loss: 0.3028479218482971 2023-01-24 00:55:38.033849: step: 242/466, loss: 5.808621883392334 2023-01-24 00:55:38.642561: step: 244/466, loss: 0.9096077680587769 2023-01-24 00:55:39.339557: step: 246/466, loss: 2.775336742401123 2023-01-24 00:55:39.967936: step: 248/466, loss: 0.6801366806030273 2023-01-24 00:55:40.557556: step: 250/466, loss: 0.4254688024520874 2023-01-24 00:55:41.199079: step: 252/466, loss: 0.8711274862289429 2023-01-24 00:55:41.832093: step: 254/466, loss: 0.5454870462417603 2023-01-24 00:55:42.441117: step: 256/466, loss: 1.483837366104126 2023-01-24 00:55:43.112070: step: 258/466, loss: 0.21793553233146667 2023-01-24 00:55:43.755576: step: 260/466, loss: 1.0067871809005737 2023-01-24 00:55:44.402390: step: 262/466, loss: 0.9667119979858398 2023-01-24 00:55:45.098790: step: 264/466, loss: 0.24902909994125366 2023-01-24 00:55:45.727138: step: 266/466, loss: 0.7938571572303772 2023-01-24 00:55:46.372546: step: 268/466, loss: 0.6056684255599976 2023-01-24 00:55:46.991641: step: 270/466, loss: 1.96467924118042 2023-01-24 00:55:47.580010: step: 272/466, loss: 1.3282179832458496 2023-01-24 00:55:48.284238: step: 274/466, 
loss: 0.4778505563735962 2023-01-24 00:55:48.851171: step: 276/466, loss: 1.424825668334961 2023-01-24 00:55:49.553515: step: 278/466, loss: 0.8274783492088318 2023-01-24 00:55:50.179531: step: 280/466, loss: 1.7603840827941895 2023-01-24 00:55:50.757431: step: 282/466, loss: 0.4785130023956299 2023-01-24 00:55:51.411429: step: 284/466, loss: 2.4316341876983643 2023-01-24 00:55:51.990142: step: 286/466, loss: 0.45507317781448364 2023-01-24 00:55:52.618908: step: 288/466, loss: 1.6569828987121582 2023-01-24 00:55:53.237584: step: 290/466, loss: 1.6912707090377808 2023-01-24 00:55:53.841796: step: 292/466, loss: 0.99015212059021 2023-01-24 00:55:54.520418: step: 294/466, loss: 0.8423710465431213 2023-01-24 00:55:55.110560: step: 296/466, loss: 0.3664628565311432 2023-01-24 00:55:55.745386: step: 298/466, loss: 0.5789371728897095 2023-01-24 00:55:56.314285: step: 300/466, loss: 0.3125426173210144 2023-01-24 00:55:56.964646: step: 302/466, loss: 0.1801319122314453 2023-01-24 00:55:57.571485: step: 304/466, loss: 0.6854432821273804 2023-01-24 00:55:58.184875: step: 306/466, loss: 3.1088051795959473 2023-01-24 00:55:58.786126: step: 308/466, loss: 0.9818800687789917 2023-01-24 00:55:59.396665: step: 310/466, loss: 0.406833291053772 2023-01-24 00:56:00.053948: step: 312/466, loss: 1.8748011589050293 2023-01-24 00:56:00.644911: step: 314/466, loss: 1.398498296737671 2023-01-24 00:56:01.268959: step: 316/466, loss: 0.4191115200519562 2023-01-24 00:56:01.855771: step: 318/466, loss: 2.0664098262786865 2023-01-24 00:56:02.447761: step: 320/466, loss: 0.5372059345245361 2023-01-24 00:56:03.135763: step: 322/466, loss: 0.47565436363220215 2023-01-24 00:56:03.740811: step: 324/466, loss: 1.0751205682754517 2023-01-24 00:56:04.350544: step: 326/466, loss: 0.6654729843139648 2023-01-24 00:56:05.075415: step: 328/466, loss: 0.6316863894462585 2023-01-24 00:56:05.673125: step: 330/466, loss: 1.9781851768493652 2023-01-24 00:56:06.315217: step: 332/466, loss: 0.558565080165863 
2023-01-24 00:56:06.993042: step: 334/466, loss: 0.36841893196105957 2023-01-24 00:56:07.571206: step: 336/466, loss: 0.37612828612327576 2023-01-24 00:56:08.131966: step: 338/466, loss: 0.2611583173274994 2023-01-24 00:56:08.705160: step: 340/466, loss: 0.8179485201835632 2023-01-24 00:56:09.324779: step: 342/466, loss: 0.5163973569869995 2023-01-24 00:56:09.948943: step: 344/466, loss: 0.3190925121307373 2023-01-24 00:56:10.532699: step: 346/466, loss: 1.341943621635437 2023-01-24 00:56:11.147060: step: 348/466, loss: 1.0306843519210815 2023-01-24 00:56:11.814629: step: 350/466, loss: 0.23381835222244263 2023-01-24 00:56:12.424804: step: 352/466, loss: 0.32457369565963745 2023-01-24 00:56:13.050554: step: 354/466, loss: 1.210745096206665 2023-01-24 00:56:13.592938: step: 356/466, loss: 1.895652413368225 2023-01-24 00:56:14.204977: step: 358/466, loss: 0.8307660222053528 2023-01-24 00:56:14.839321: step: 360/466, loss: 0.8706965446472168 2023-01-24 00:56:15.422371: step: 362/466, loss: 0.27401426434516907 2023-01-24 00:56:16.029357: step: 364/466, loss: 1.4053891897201538 2023-01-24 00:56:16.627746: step: 366/466, loss: 0.9843682646751404 2023-01-24 00:56:17.209789: step: 368/466, loss: 0.8457871675491333 2023-01-24 00:56:17.825367: step: 370/466, loss: 0.647186815738678 2023-01-24 00:56:18.426040: step: 372/466, loss: 0.32200056314468384 2023-01-24 00:56:18.924366: step: 374/466, loss: 0.6154910326004028 2023-01-24 00:56:19.528075: step: 376/466, loss: 0.2628232538700104 2023-01-24 00:56:20.131396: step: 378/466, loss: 1.383215069770813 2023-01-24 00:56:20.752316: step: 380/466, loss: 1.0806177854537964 2023-01-24 00:56:21.334353: step: 382/466, loss: 0.937576949596405 2023-01-24 00:56:21.942684: step: 384/466, loss: 0.881920576095581 2023-01-24 00:56:22.565396: step: 386/466, loss: 1.1035183668136597 2023-01-24 00:56:23.191094: step: 388/466, loss: 1.3505921363830566 2023-01-24 00:56:23.822103: step: 390/466, loss: 0.593630850315094 2023-01-24 00:56:24.470131: 
step: 392/466, loss: 1.1212100982666016 2023-01-24 00:56:25.102827: step: 394/466, loss: 1.7661340236663818 2023-01-24 00:56:25.764420: step: 396/466, loss: 0.6906561255455017 2023-01-24 00:56:26.477039: step: 398/466, loss: 1.5854777097702026 2023-01-24 00:56:27.077210: step: 400/466, loss: 1.0544898509979248 2023-01-24 00:56:27.691722: step: 402/466, loss: 0.3514784276485443 2023-01-24 00:56:28.295529: step: 404/466, loss: 1.0262101888656616 2023-01-24 00:56:28.932559: step: 406/466, loss: 2.7127935886383057 2023-01-24 00:56:29.606259: step: 408/466, loss: 9.940874099731445 2023-01-24 00:56:30.259461: step: 410/466, loss: 0.32826903462409973 2023-01-24 00:56:30.838520: step: 412/466, loss: 2.594214916229248 2023-01-24 00:56:31.456592: step: 414/466, loss: 1.0893781185150146 2023-01-24 00:56:32.027997: step: 416/466, loss: 0.3404262661933899 2023-01-24 00:56:32.668899: step: 418/466, loss: 0.8813122510910034 2023-01-24 00:56:33.338923: step: 420/466, loss: 0.8629094362258911 2023-01-24 00:56:33.955098: step: 422/466, loss: 1.0467783212661743 2023-01-24 00:56:34.516739: step: 424/466, loss: 0.9128414988517761 2023-01-24 00:56:35.087410: step: 426/466, loss: 0.3509741425514221 2023-01-24 00:56:35.743455: step: 428/466, loss: 0.632339358329773 2023-01-24 00:56:36.383325: step: 430/466, loss: 0.25753694772720337 2023-01-24 00:56:37.039991: step: 432/466, loss: 0.3298838436603546 2023-01-24 00:56:37.664671: step: 434/466, loss: 1.2893836498260498 2023-01-24 00:56:38.489757: step: 436/466, loss: 0.8060743808746338 2023-01-24 00:56:39.117421: step: 438/466, loss: 0.6938568353652954 2023-01-24 00:56:39.762806: step: 440/466, loss: 0.3745870292186737 2023-01-24 00:56:40.424264: step: 442/466, loss: 1.093895673751831 2023-01-24 00:56:41.043146: step: 444/466, loss: 1.6614004373550415 2023-01-24 00:56:41.721466: step: 446/466, loss: 0.6754403114318848 2023-01-24 00:56:42.381945: step: 448/466, loss: 0.3983674943447113 2023-01-24 00:56:42.979970: step: 450/466, loss: 
1.8840715885162354 2023-01-24 00:56:43.691914: step: 452/466, loss: 0.5922427177429199 2023-01-24 00:56:44.338845: step: 454/466, loss: 0.6781620979309082 2023-01-24 00:56:45.011834: step: 456/466, loss: 0.6283656358718872 2023-01-24 00:56:45.556623: step: 458/466, loss: 0.5273997783660889 2023-01-24 00:56:46.201809: step: 460/466, loss: 5.435046195983887 2023-01-24 00:56:46.794983: step: 462/466, loss: 0.20654211938381195 2023-01-24 00:56:47.422412: step: 464/466, loss: 0.32080647349357605 2023-01-24 00:56:48.015087: step: 466/466, loss: 1.0139718055725098 2023-01-24 00:56:48.631963: step: 468/466, loss: 0.23736083507537842 2023-01-24 00:56:49.201057: step: 470/466, loss: 0.7319826483726501 2023-01-24 00:56:49.803719: step: 472/466, loss: 11.274197578430176 2023-01-24 00:56:50.471552: step: 474/466, loss: 1.1109938621520996 2023-01-24 00:56:51.105336: step: 476/466, loss: 0.3432893753051758 2023-01-24 00:56:51.764458: step: 478/466, loss: 0.4525010883808136 2023-01-24 00:56:52.402272: step: 480/466, loss: 0.7121342420578003 2023-01-24 00:56:53.032314: step: 482/466, loss: 1.3188581466674805 2023-01-24 00:56:53.633167: step: 484/466, loss: 0.17577265202999115 2023-01-24 00:56:54.171553: step: 486/466, loss: 0.8547149896621704 2023-01-24 00:56:54.743937: step: 488/466, loss: 1.0815746784210205 2023-01-24 00:56:55.416573: step: 490/466, loss: 0.526871383190155 2023-01-24 00:56:56.022397: step: 492/466, loss: 0.8402315378189087 2023-01-24 00:56:56.714659: step: 494/466, loss: 0.8059301972389221 2023-01-24 00:56:57.297073: step: 496/466, loss: 0.5090266466140747 2023-01-24 00:56:58.007301: step: 498/466, loss: 1.7350958585739136 2023-01-24 00:56:58.663222: step: 500/466, loss: 0.25774773955345154 2023-01-24 00:56:59.285070: step: 502/466, loss: 0.9254035353660583 2023-01-24 00:56:59.910443: step: 504/466, loss: 0.6830517649650574 2023-01-24 00:57:00.539674: step: 506/466, loss: 0.7184911370277405 2023-01-24 00:57:01.074440: step: 508/466, loss: 0.19448937475681305 
2023-01-24 00:57:01.695861: step: 510/466, loss: 0.3182905316352844 2023-01-24 00:57:02.303365: step: 512/466, loss: 0.22594107687473297 2023-01-24 00:57:02.901367: step: 514/466, loss: 0.4143276810646057 2023-01-24 00:57:03.459528: step: 516/466, loss: 0.2853304147720337 2023-01-24 00:57:04.088365: step: 518/466, loss: 2.4298224449157715 2023-01-24 00:57:04.711974: step: 520/466, loss: 2.2207794189453125 2023-01-24 00:57:05.270815: step: 522/466, loss: 0.23604892194271088 2023-01-24 00:57:05.904074: step: 524/466, loss: 0.9321451187133789 2023-01-24 00:57:06.554086: step: 526/466, loss: 0.7915027141571045 2023-01-24 00:57:07.194141: step: 528/466, loss: 1.3687559366226196 2023-01-24 00:57:07.824176: step: 530/466, loss: 7.580524444580078 2023-01-24 00:57:08.451771: step: 532/466, loss: 0.8179179430007935 2023-01-24 00:57:09.043883: step: 534/466, loss: 0.5741764903068542 2023-01-24 00:57:09.666117: step: 536/466, loss: 0.5159574747085571 2023-01-24 00:57:10.253350: step: 538/466, loss: 0.7804008722305298 2023-01-24 00:57:10.865318: step: 540/466, loss: 2.4360361099243164 2023-01-24 00:57:11.527123: step: 542/466, loss: 0.4893640875816345 2023-01-24 00:57:12.122428: step: 544/466, loss: 0.6733390092849731 2023-01-24 00:57:12.770494: step: 546/466, loss: 1.9966200590133667 2023-01-24 00:57:13.406993: step: 548/466, loss: 0.23160187900066376 2023-01-24 00:57:13.993905: step: 550/466, loss: 0.6399521231651306 2023-01-24 00:57:14.646529: step: 552/466, loss: 1.6272296905517578 2023-01-24 00:57:15.286788: step: 554/466, loss: 0.7216408252716064 2023-01-24 00:57:15.815766: step: 556/466, loss: 1.3738418817520142 2023-01-24 00:57:16.413924: step: 558/466, loss: 0.693959653377533 2023-01-24 00:57:17.005882: step: 560/466, loss: 0.4637894034385681 2023-01-24 00:57:17.642950: step: 562/466, loss: 1.4166477918624878 2023-01-24 00:57:18.302851: step: 564/466, loss: 1.5565288066864014 2023-01-24 00:57:18.932202: step: 566/466, loss: 1.0820739269256592 2023-01-24 
00:57:19.683216: step: 568/466, loss: 0.27220243215560913 2023-01-24 00:57:20.303388: step: 570/466, loss: 0.6370554566383362 2023-01-24 00:57:20.912505: step: 572/466, loss: 0.7111790776252747 2023-01-24 00:57:21.594976: step: 574/466, loss: 1.1463621854782104 2023-01-24 00:57:22.228113: step: 576/466, loss: 0.47016438841819763 2023-01-24 00:57:22.840275: step: 578/466, loss: 0.8937940001487732 2023-01-24 00:57:23.464869: step: 580/466, loss: 0.28812530636787415 2023-01-24 00:57:24.063848: step: 582/466, loss: 1.85710871219635 2023-01-24 00:57:24.726349: step: 584/466, loss: 2.4654958248138428 2023-01-24 00:57:25.332619: step: 586/466, loss: 0.30235031247138977 2023-01-24 00:57:25.979208: step: 588/466, loss: 1.227952480316162 2023-01-24 00:57:26.608447: step: 590/466, loss: 0.6962063312530518 2023-01-24 00:57:27.263545: step: 592/466, loss: 0.2512756586074829 2023-01-24 00:57:27.936066: step: 594/466, loss: 1.5459935665130615 2023-01-24 00:57:28.519567: step: 596/466, loss: 0.5050576329231262 2023-01-24 00:57:29.184774: step: 598/466, loss: 0.7506095170974731 2023-01-24 00:57:29.840944: step: 600/466, loss: 0.24673344194889069 2023-01-24 00:57:30.520205: step: 602/466, loss: 0.8932666778564453 2023-01-24 00:57:31.159824: step: 604/466, loss: 1.055237054824829 2023-01-24 00:57:31.812936: step: 606/466, loss: 0.6997503638267517 2023-01-24 00:57:32.468408: step: 608/466, loss: 0.6910296082496643 2023-01-24 00:57:33.142479: step: 610/466, loss: 1.1907850503921509 2023-01-24 00:57:33.867974: step: 612/466, loss: 0.7755650877952576 2023-01-24 00:57:34.533741: step: 614/466, loss: 1.2764533758163452 2023-01-24 00:57:35.238009: step: 616/466, loss: 0.49234524369239807 2023-01-24 00:57:35.892425: step: 618/466, loss: 0.468046635389328 2023-01-24 00:57:36.649456: step: 620/466, loss: 0.9081097841262817 2023-01-24 00:57:37.292221: step: 622/466, loss: 0.372394859790802 2023-01-24 00:57:37.865501: step: 624/466, loss: 0.43292921781539917 2023-01-24 00:57:38.487226: step: 
626/466, loss: 0.9822397828102112 2023-01-24 00:57:39.130719: step: 628/466, loss: 0.6980036497116089 2023-01-24 00:57:39.740471: step: 630/466, loss: 0.6188594698905945 2023-01-24 00:57:40.395873: step: 632/466, loss: 0.5005221366882324 2023-01-24 00:57:40.980027: step: 634/466, loss: 2.710353136062622 2023-01-24 00:57:41.609447: step: 636/466, loss: 1.0915191173553467 2023-01-24 00:57:42.250445: step: 638/466, loss: 0.34767547249794006 2023-01-24 00:57:42.949135: step: 640/466, loss: 0.43298444151878357 2023-01-24 00:57:43.674824: step: 642/466, loss: 0.9110534191131592 2023-01-24 00:57:44.379468: step: 644/466, loss: 1.2311044931411743 2023-01-24 00:57:45.033426: step: 646/466, loss: 1.852325201034546 2023-01-24 00:57:45.701678: step: 648/466, loss: 0.2186703085899353 2023-01-24 00:57:46.258986: step: 650/466, loss: 0.21883513033390045 2023-01-24 00:57:46.857437: step: 652/466, loss: 4.912973880767822 2023-01-24 00:57:47.459904: step: 654/466, loss: 1.7887861728668213 2023-01-24 00:57:48.123482: step: 656/466, loss: 0.19818219542503357 2023-01-24 00:57:48.701427: step: 658/466, loss: 4.3705573081970215 2023-01-24 00:57:49.299826: step: 660/466, loss: 1.8388097286224365 2023-01-24 00:57:49.935700: step: 662/466, loss: 2.788428783416748 2023-01-24 00:57:50.565417: step: 664/466, loss: 2.1162989139556885 2023-01-24 00:57:51.133896: step: 666/466, loss: 1.3119508028030396 2023-01-24 00:57:51.739901: step: 668/466, loss: 0.6477605700492859 2023-01-24 00:57:52.372319: step: 670/466, loss: 0.364826500415802 2023-01-24 00:57:52.991387: step: 672/466, loss: 1.7788515090942383 2023-01-24 00:57:53.588094: step: 674/466, loss: 1.2706859111785889 2023-01-24 00:57:54.212293: step: 676/466, loss: 1.1993398666381836 2023-01-24 00:57:54.790146: step: 678/466, loss: 0.48671069741249084 2023-01-24 00:57:55.414647: step: 680/466, loss: 0.4129486382007599 2023-01-24 00:57:56.038921: step: 682/466, loss: 1.152281641960144 2023-01-24 00:57:56.714529: step: 684/466, loss: 
0.6857094764709473 2023-01-24 00:57:57.260283: step: 686/466, loss: 0.5028449892997742 2023-01-24 00:57:57.839867: step: 688/466, loss: 5.923160552978516 2023-01-24 00:57:58.527648: step: 690/466, loss: 0.7625570297241211 2023-01-24 00:57:59.222594: step: 692/466, loss: 0.16711105406284332 2023-01-24 00:57:59.828401: step: 694/466, loss: 1.0117989778518677 2023-01-24 00:58:00.457346: step: 696/466, loss: 0.7212257385253906 2023-01-24 00:58:01.126720: step: 698/466, loss: 1.5466200113296509 2023-01-24 00:58:01.719502: step: 700/466, loss: 1.6017810106277466 2023-01-24 00:58:02.380431: step: 702/466, loss: 1.5867637395858765 2023-01-24 00:58:02.989361: step: 704/466, loss: 0.4246661067008972 2023-01-24 00:58:03.660502: step: 706/466, loss: 0.6573480367660522 2023-01-24 00:58:04.322192: step: 708/466, loss: 0.338861882686615 2023-01-24 00:58:04.920816: step: 710/466, loss: 0.6983233690261841 2023-01-24 00:58:05.521852: step: 712/466, loss: 1.945359230041504 2023-01-24 00:58:06.196053: step: 714/466, loss: 1.229771614074707 2023-01-24 00:58:06.844167: step: 716/466, loss: 0.4587058424949646 2023-01-24 00:58:07.534047: step: 718/466, loss: 0.6692253947257996 2023-01-24 00:58:08.231281: step: 720/466, loss: 0.5753758549690247 2023-01-24 00:58:08.834612: step: 722/466, loss: 0.7323040962219238 2023-01-24 00:58:09.457795: step: 724/466, loss: 0.8918372392654419 2023-01-24 00:58:10.030200: step: 726/466, loss: 0.4605655074119568 2023-01-24 00:58:10.633566: step: 728/466, loss: 1.4957081079483032 2023-01-24 00:58:11.235111: step: 730/466, loss: 7.627898693084717 2023-01-24 00:58:11.902870: step: 732/466, loss: 0.7290813326835632 2023-01-24 00:58:12.491725: step: 734/466, loss: 0.27680712938308716 2023-01-24 00:58:13.142274: step: 736/466, loss: 0.2125605195760727 2023-01-24 00:58:13.774635: step: 738/466, loss: 0.848861813545227 2023-01-24 00:58:14.407288: step: 740/466, loss: 1.3741085529327393 2023-01-24 00:58:15.058820: step: 742/466, loss: 2.952786445617676 2023-01-24 
00:58:15.732155: step: 744/466, loss: 1.1536624431610107 2023-01-24 00:58:16.496965: step: 746/466, loss: 0.8832286596298218 2023-01-24 00:58:17.212018: step: 748/466, loss: 0.8762491345405579 2023-01-24 00:58:17.813039: step: 750/466, loss: 1.5069191455841064 2023-01-24 00:58:18.468634: step: 752/466, loss: 1.0268547534942627 2023-01-24 00:58:19.071746: step: 754/466, loss: 1.148740530014038 2023-01-24 00:58:19.714263: step: 756/466, loss: 0.22098587453365326 2023-01-24 00:58:20.338411: step: 758/466, loss: 0.5976026654243469 2023-01-24 00:58:20.973678: step: 760/466, loss: 0.5490186214447021 2023-01-24 00:58:21.602020: step: 762/466, loss: 1.9247955083847046 2023-01-24 00:58:22.246005: step: 764/466, loss: 0.8851991891860962 2023-01-24 00:58:22.886976: step: 766/466, loss: 0.32720571756362915 2023-01-24 00:58:23.584148: step: 768/466, loss: 5.309898853302002 2023-01-24 00:58:24.146869: step: 770/466, loss: 0.6763961315155029 2023-01-24 00:58:24.806337: step: 772/466, loss: 0.4193522036075592 2023-01-24 00:58:25.379468: step: 774/466, loss: 0.5671581029891968 2023-01-24 00:58:26.072874: step: 776/466, loss: 0.7532185316085815 2023-01-24 00:58:26.709689: step: 778/466, loss: 0.14647214114665985 2023-01-24 00:58:27.337521: step: 780/466, loss: 0.7904964089393616 2023-01-24 00:58:27.949320: step: 782/466, loss: 1.7479318380355835 2023-01-24 00:58:28.556630: step: 784/466, loss: 0.15508301556110382 2023-01-24 00:58:29.298182: step: 786/466, loss: 1.5002763271331787 2023-01-24 00:58:29.977405: step: 788/466, loss: 0.510749340057373 2023-01-24 00:58:30.591798: step: 790/466, loss: 0.6137053966522217 2023-01-24 00:58:31.173277: step: 792/466, loss: 0.31379973888397217 2023-01-24 00:58:31.777259: step: 794/466, loss: 0.44335412979125977 2023-01-24 00:58:32.367426: step: 796/466, loss: 0.6573041677474976 2023-01-24 00:58:33.020216: step: 798/466, loss: 2.469816207885742 2023-01-24 00:58:33.661453: step: 800/466, loss: 0.3082996904850006 2023-01-24 00:58:34.259601: step: 
802/466, loss: 0.5501794815063477 2023-01-24 00:58:34.830867: step: 804/466, loss: 0.8786368370056152 2023-01-24 00:58:35.444189: step: 806/466, loss: 0.41694772243499756 2023-01-24 00:58:36.100205: step: 808/466, loss: 1.077990174293518 2023-01-24 00:58:36.728364: step: 810/466, loss: 2.110581398010254 2023-01-24 00:58:37.323468: step: 812/466, loss: 0.4858446717262268 2023-01-24 00:58:37.948313: step: 814/466, loss: 2.4747583866119385 2023-01-24 00:58:38.528082: step: 816/466, loss: 1.3693656921386719 2023-01-24 00:58:39.188499: step: 818/466, loss: 1.0355596542358398 2023-01-24 00:58:39.763444: step: 820/466, loss: 1.659318447113037 2023-01-24 00:58:40.449634: step: 822/466, loss: 3.9617958068847656 2023-01-24 00:58:41.105640: step: 824/466, loss: 1.1150134801864624 2023-01-24 00:58:41.649334: step: 826/466, loss: 0.36890584230422974 2023-01-24 00:58:42.271742: step: 828/466, loss: 1.1831920146942139 2023-01-24 00:58:42.944244: step: 830/466, loss: 0.43788182735443115 2023-01-24 00:58:43.543353: step: 832/466, loss: 1.0296648740768433 2023-01-24 00:58:44.115494: step: 834/466, loss: 1.4571318626403809 2023-01-24 00:58:44.777118: step: 836/466, loss: 0.3482404947280884 2023-01-24 00:58:45.385897: step: 838/466, loss: 0.7448530197143555 2023-01-24 00:58:45.982625: step: 840/466, loss: 1.256547212600708 2023-01-24 00:58:46.613094: step: 842/466, loss: 3.370251178741455 2023-01-24 00:58:47.272035: step: 844/466, loss: 0.5442215800285339 2023-01-24 00:58:47.888983: step: 846/466, loss: 2.169522285461426 2023-01-24 00:58:48.476380: step: 848/466, loss: 0.46707239747047424 2023-01-24 00:58:49.081698: step: 850/466, loss: 0.791730523109436 2023-01-24 00:58:49.702551: step: 852/466, loss: 0.681902289390564 2023-01-24 00:58:50.324231: step: 854/466, loss: 0.38136905431747437 2023-01-24 00:58:50.966935: step: 856/466, loss: 3.295001983642578 2023-01-24 00:58:51.550743: step: 858/466, loss: 0.9217782616615295 2023-01-24 00:58:52.146506: step: 860/466, loss: 
0.49116799235343933 2023-01-24 00:58:52.766542: step: 862/466, loss: 0.8243554830551147 2023-01-24 00:58:53.362842: step: 864/466, loss: 0.4150761663913727 2023-01-24 00:58:54.018849: step: 866/466, loss: 0.5980117917060852 2023-01-24 00:58:54.604211: step: 868/466, loss: 1.8143088817596436 2023-01-24 00:58:55.213126: step: 870/466, loss: 1.0737305879592896 2023-01-24 00:58:55.863770: step: 872/466, loss: 0.9121313095092773 2023-01-24 00:58:56.469074: step: 874/466, loss: 0.8232439160346985 2023-01-24 00:58:57.091600: step: 876/466, loss: 0.5766512155532837 2023-01-24 00:58:57.700271: step: 878/466, loss: 0.4171861410140991 2023-01-24 00:58:58.309151: step: 880/466, loss: 3.5865650177001953 2023-01-24 00:58:58.931598: step: 882/466, loss: 1.2284045219421387 2023-01-24 00:58:59.544143: step: 884/466, loss: 0.30379223823547363 2023-01-24 00:59:00.185412: step: 886/466, loss: 1.3876835107803345 2023-01-24 00:59:00.821853: step: 888/466, loss: 2.0049948692321777 2023-01-24 00:59:01.420515: step: 890/466, loss: 1.9099504947662354 2023-01-24 00:59:02.043340: step: 892/466, loss: 0.6487614512443542 2023-01-24 00:59:02.687142: step: 894/466, loss: 0.7433854341506958 2023-01-24 00:59:03.327308: step: 896/466, loss: 0.6980950236320496 2023-01-24 00:59:03.879364: step: 898/466, loss: 0.7908098101615906 2023-01-24 00:59:04.445255: step: 900/466, loss: 0.328832745552063 2023-01-24 00:59:05.031395: step: 902/466, loss: 0.5036404728889465 2023-01-24 00:59:05.594330: step: 904/466, loss: 0.22693654894828796 2023-01-24 00:59:06.211215: step: 906/466, loss: 0.2672615945339203 2023-01-24 00:59:06.812016: step: 908/466, loss: 0.6582983732223511 2023-01-24 00:59:07.458574: step: 910/466, loss: 0.9059508442878723 2023-01-24 00:59:08.087239: step: 912/466, loss: 0.706932544708252 2023-01-24 00:59:08.748207: step: 914/466, loss: 0.40315255522727966 2023-01-24 00:59:09.393771: step: 916/466, loss: 0.9941682815551758 2023-01-24 00:59:09.993745: step: 918/466, loss: 0.32447823882102966 
2023-01-24 00:59:10.613787: step: 920/466, loss: 0.8413281440734863 2023-01-24 00:59:11.198177: step: 922/466, loss: 0.7358835935592651 2023-01-24 00:59:11.789427: step: 924/466, loss: 0.6849548816680908 2023-01-24 00:59:12.451730: step: 926/466, loss: 0.35149043798446655 2023-01-24 00:59:13.093342: step: 928/466, loss: 0.7100387215614319 2023-01-24 00:59:13.731808: step: 930/466, loss: 0.3302081227302551 2023-01-24 00:59:14.327286: step: 932/466, loss: 2.2487170696258545 ================================================== Loss: 1.062 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35893291693705387, 'r': 0.2737970258229519, 'f1': 0.31063731455047505}, 'combined': 0.22889065282666582, 'epoch': 4} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35070226138692967, 'r': 0.22499599626641983, 'f1': 0.2741248688688976}, 'combined': 0.18180302184051236, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3595007110930516, 'r': 0.21856009140316204, 'f1': 0.27184859425410973}, 'combined': 0.18123239616940648, 'epoch': 4} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3626853594704456, 'r': 0.21164496301565397, 'f1': 0.26730446395088064}, 'combined': 0.17445133436794313, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.362943679241804, 'r': 0.2706581896433187, 'f1': 0.3100801433522369}, 'combined': 0.228480105627964, 'epoch': 4} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.340306021942204, 'r': 0.21626374035114956, 'f1': 0.2644622764484677}, 'combined': 0.1753946703906936, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.4020833333333334, 'r': 0.2297619047619048, 'f1': 0.29242424242424253}, 'combined': 0.194949494949495, 'epoch': 4} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.08695652173913043, 'f1': 0.12903225806451613}, 'combined': 0.08602150537634408, 'epoch': 4} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 4} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3164303771212846, 'r': 0.295415076932964, 'f1': 0.3055618165724671}, 'combined': 0.22515081221129155, 'epoch': 3} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34839024729428025, 'r': 0.27130476474688553, 'f1': 0.30505307367555545}, 'combined': 0.20231499186772586, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3098290598290598, 'r': 0.3452380952380952, 'f1': 0.3265765765765765}, 'combined': 0.21771771771771767, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34554782082324453, 'r': 0.19269679989197946, 'f1': 0.24741894937586684}, 'combined': 0.1649459662505779, 'epoch': 2} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.40241683553303587, 'r': 0.1674999263617645, 'f1': 0.2365425789352723}, 'combined': 0.15437515677880928, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4583333333333333, 'r': 0.2391304347826087, 'f1': 0.3142857142857143}, 'combined': 0.2095238095238095, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32446373912283005, 'r': 0.30476195610208895, 'f1': 0.31430440482544203}, 'combined': 0.23159271934506254, 'epoch': 3} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3404589712580221, 'r': 0.2634574188331142, 'f1': 0.2970492050155484}, 'combined': 0.1970067266424362, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 3} ****************************** Epoch: 5 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:01:41.411862: step: 2/466, loss: 0.8873512148857117 2023-01-24 01:01:42.045349: step: 4/466, loss: 0.43146419525146484 2023-01-24 01:01:42.669033: step: 6/466, loss: 1.6752448081970215 2023-01-24 01:01:43.284542: step: 8/466, loss: 0.789301335811615 2023-01-24 01:01:43.974056: step: 10/466, loss: 0.4882446527481079 2023-01-24 01:01:44.598668: step: 12/466, loss: 0.8184223771095276 2023-01-24 01:01:45.215415: step: 14/466, loss: 0.6357017159461975 2023-01-24 01:01:45.790912: step: 16/466, loss: 2.351536273956299 2023-01-24 01:01:46.430378: step: 18/466, loss: 1.0994080305099487 2023-01-24 01:01:47.069292: step: 20/466, loss: 0.4771495461463928 2023-01-24 01:01:47.679890: step: 22/466, loss: 0.36976194381713867 2023-01-24 01:01:48.284111: step: 24/466, loss: 0.6308794617652893 2023-01-24 01:01:48.916033: step: 26/466, loss: 0.7177110910415649 2023-01-24 01:01:49.463129: step: 28/466, loss: 0.39158812165260315 2023-01-24 01:01:50.021484: step: 30/466, loss: 1.6430250406265259 2023-01-24 01:01:50.651432: step: 32/466, loss: 0.548024594783783 2023-01-24 01:01:51.243750: step: 34/466, loss: 
1.4155356884002686 2023-01-24 01:01:51.868318: step: 36/466, loss: 0.4548807740211487 2023-01-24 01:01:52.451528: step: 38/466, loss: 0.5886859893798828 2023-01-24 01:01:53.088507: step: 40/466, loss: 0.45883312821388245 2023-01-24 01:01:53.736303: step: 42/466, loss: 0.368941992521286 2023-01-24 01:01:54.318792: step: 44/466, loss: 1.1091209650039673 2023-01-24 01:01:54.914665: step: 46/466, loss: 1.2951784133911133 2023-01-24 01:01:55.620472: step: 48/466, loss: 0.3653731048107147 2023-01-24 01:01:56.299595: step: 50/466, loss: 0.40053850412368774 2023-01-24 01:01:56.932744: step: 52/466, loss: 1.1370720863342285 2023-01-24 01:01:57.530049: step: 54/466, loss: 0.22803129255771637 2023-01-24 01:01:58.130036: step: 56/466, loss: 1.3747471570968628 2023-01-24 01:01:58.779755: step: 58/466, loss: 0.8854600191116333 2023-01-24 01:01:59.407463: step: 60/466, loss: 2.8767919540405273 2023-01-24 01:02:00.076822: step: 62/466, loss: 3.0985217094421387 2023-01-24 01:02:00.775272: step: 64/466, loss: 0.42767399549484253 2023-01-24 01:02:01.448779: step: 66/466, loss: 2.14646315574646 2023-01-24 01:02:02.036662: step: 68/466, loss: 0.949621856212616 2023-01-24 01:02:02.670528: step: 70/466, loss: 0.5875003337860107 2023-01-24 01:02:03.287152: step: 72/466, loss: 0.40015918016433716 2023-01-24 01:02:03.962218: step: 74/466, loss: 0.5424792766571045 2023-01-24 01:02:04.678935: step: 76/466, loss: 0.7409729957580566 2023-01-24 01:02:05.327604: step: 78/466, loss: 1.1596415042877197 2023-01-24 01:02:05.919234: step: 80/466, loss: 1.3621959686279297 2023-01-24 01:02:06.507301: step: 82/466, loss: 0.28334301710128784 2023-01-24 01:02:07.087756: step: 84/466, loss: 0.5116409659385681 2023-01-24 01:02:07.749268: step: 86/466, loss: 0.8505147099494934 2023-01-24 01:02:08.410404: step: 88/466, loss: 1.3884097337722778 2023-01-24 01:02:09.039074: step: 90/466, loss: 0.4708605110645294 2023-01-24 01:02:09.635745: step: 92/466, loss: 0.49022626876831055 2023-01-24 01:02:10.255369: step: 
94/466, loss: 0.6543488502502441 2023-01-24 01:02:10.866070: step: 96/466, loss: 0.6285028457641602 2023-01-24 01:02:11.505205: step: 98/466, loss: 1.4233053922653198 2023-01-24 01:02:12.101731: step: 100/466, loss: 0.39845722913742065 2023-01-24 01:02:12.717930: step: 102/466, loss: 0.5205832719802856 2023-01-24 01:02:13.301868: step: 104/466, loss: 0.7826010584831238 2023-01-24 01:02:13.868011: step: 106/466, loss: 0.6176843643188477 2023-01-24 01:02:14.461331: step: 108/466, loss: 1.3059556484222412 2023-01-24 01:02:15.081290: step: 110/466, loss: 1.2681266069412231 2023-01-24 01:02:15.746392: step: 112/466, loss: 0.3299587666988373 2023-01-24 01:02:16.426375: step: 114/466, loss: 0.4056156277656555 2023-01-24 01:02:17.056479: step: 116/466, loss: 3.232006549835205 2023-01-24 01:02:17.659693: step: 118/466, loss: 1.019670844078064 2023-01-24 01:02:18.267662: step: 120/466, loss: 0.3997257947921753 2023-01-24 01:02:18.884889: step: 122/466, loss: 0.600451648235321 2023-01-24 01:02:19.572048: step: 124/466, loss: 1.1474565267562866 2023-01-24 01:02:20.179296: step: 126/466, loss: 0.1493775099515915 2023-01-24 01:02:20.837868: step: 128/466, loss: 0.6986579895019531 2023-01-24 01:02:21.511937: step: 130/466, loss: 0.7964789271354675 2023-01-24 01:02:22.208179: step: 132/466, loss: 1.0484631061553955 2023-01-24 01:02:22.909942: step: 134/466, loss: 1.7508398294448853 2023-01-24 01:02:23.531782: step: 136/466, loss: 2.2924609184265137 2023-01-24 01:02:24.086278: step: 138/466, loss: 1.2998239994049072 2023-01-24 01:02:24.769375: step: 140/466, loss: 0.7389345765113831 2023-01-24 01:02:25.353472: step: 142/466, loss: 1.8558109998703003 2023-01-24 01:02:26.086945: step: 144/466, loss: 1.9583468437194824 2023-01-24 01:02:26.771211: step: 146/466, loss: 1.3969378471374512 2023-01-24 01:02:27.329081: step: 148/466, loss: 0.17234475910663605 2023-01-24 01:02:28.019469: step: 150/466, loss: 0.8840292096138 2023-01-24 01:02:28.594549: step: 152/466, loss: 0.41319066286087036 
2023-01-24 01:02:29.334373: step: 154/466, loss: 0.38288575410842896 2023-01-24 01:02:29.989729: step: 156/466, loss: 0.6801028251647949 2023-01-24 01:02:30.652695: step: 158/466, loss: 1.5284907817840576 2023-01-24 01:02:31.348307: step: 160/466, loss: 0.5035023093223572 2023-01-24 01:02:32.028534: step: 162/466, loss: 1.1935200691223145 2023-01-24 01:02:32.636624: step: 164/466, loss: 0.1817062497138977 2023-01-24 01:02:33.279199: step: 166/466, loss: 0.8277509212493896 2023-01-24 01:02:33.811662: step: 168/466, loss: 0.4788461923599243 2023-01-24 01:02:34.445273: step: 170/466, loss: 1.3420499563217163 2023-01-24 01:02:35.020643: step: 172/466, loss: 0.43186071515083313 2023-01-24 01:02:35.630818: step: 174/466, loss: 1.1397451162338257 2023-01-24 01:02:36.262057: step: 176/466, loss: 1.153487205505371 2023-01-24 01:02:36.935444: step: 178/466, loss: 0.3397679328918457 2023-01-24 01:02:37.514293: step: 180/466, loss: 0.42831388115882874 2023-01-24 01:02:38.150284: step: 182/466, loss: 0.6156601309776306 2023-01-24 01:02:38.820510: step: 184/466, loss: 0.5617088079452515 2023-01-24 01:02:39.426793: step: 186/466, loss: 0.5770358443260193 2023-01-24 01:02:40.036210: step: 188/466, loss: 0.20702214539051056 2023-01-24 01:02:40.646888: step: 190/466, loss: 0.6894669532775879 2023-01-24 01:02:41.245081: step: 192/466, loss: 0.30901455879211426 2023-01-24 01:02:41.849150: step: 194/466, loss: 0.6040390729904175 2023-01-24 01:02:42.447260: step: 196/466, loss: 0.6449813842773438 2023-01-24 01:02:43.072104: step: 198/466, loss: 0.34480994939804077 2023-01-24 01:02:43.762669: step: 200/466, loss: 1.084185242652893 2023-01-24 01:02:44.362686: step: 202/466, loss: 1.1693246364593506 2023-01-24 01:02:44.921181: step: 204/466, loss: 1.1242389678955078 2023-01-24 01:02:45.520353: step: 206/466, loss: 0.888567328453064 2023-01-24 01:02:46.100986: step: 208/466, loss: 0.763310432434082 2023-01-24 01:02:46.665977: step: 210/466, loss: 0.8818193674087524 2023-01-24 
01:02:47.253511: step: 212/466, loss: 3.0669779777526855 2023-01-24 01:02:47.895899: step: 214/466, loss: 1.6946501731872559 2023-01-24 01:02:48.508382: step: 216/466, loss: 1.333387017250061 2023-01-24 01:02:49.068900: step: 218/466, loss: 3.137331247329712 2023-01-24 01:02:49.751003: step: 220/466, loss: 0.2983590066432953 2023-01-24 01:02:50.357507: step: 222/466, loss: 1.8559598922729492 2023-01-24 01:02:51.003522: step: 224/466, loss: 0.50806725025177 2023-01-24 01:02:51.602318: step: 226/466, loss: 0.3796902298927307 2023-01-24 01:02:52.223635: step: 228/466, loss: 0.6262779235839844 2023-01-24 01:02:52.856319: step: 230/466, loss: 0.7788878679275513 2023-01-24 01:02:53.450383: step: 232/466, loss: 0.9816898703575134 2023-01-24 01:02:54.038370: step: 234/466, loss: 0.8697726726531982 2023-01-24 01:02:54.667719: step: 236/466, loss: 0.2544841170310974 2023-01-24 01:02:55.298399: step: 238/466, loss: 0.5122835040092468 2023-01-24 01:02:55.892812: step: 240/466, loss: 0.7203906178474426 2023-01-24 01:02:56.503439: step: 242/466, loss: 0.1544247716665268 2023-01-24 01:02:57.118602: step: 244/466, loss: 0.5812761783599854 2023-01-24 01:02:57.761608: step: 246/466, loss: 1.2550731897354126 2023-01-24 01:02:58.377468: step: 248/466, loss: 1.1732237339019775 2023-01-24 01:02:58.957616: step: 250/466, loss: 0.17169396579265594 2023-01-24 01:02:59.635677: step: 252/466, loss: 0.848768949508667 2023-01-24 01:03:00.351321: step: 254/466, loss: 0.3380299508571625 2023-01-24 01:03:01.007397: step: 256/466, loss: 0.5533966422080994 2023-01-24 01:03:01.635005: step: 258/466, loss: 0.82039475440979 2023-01-24 01:03:02.226342: step: 260/466, loss: 0.30277910828590393 2023-01-24 01:03:02.870637: step: 262/466, loss: 2.042152166366577 2023-01-24 01:03:03.485086: step: 264/466, loss: 1.1292691230773926 2023-01-24 01:03:04.087785: step: 266/466, loss: 1.9049155712127686 2023-01-24 01:03:04.763884: step: 268/466, loss: 0.306099534034729 2023-01-24 01:03:05.423362: step: 270/466, 
loss: 0.8231030106544495 2023-01-24 01:03:06.065054: step: 272/466, loss: 0.9825611710548401 2023-01-24 01:03:06.672627: step: 274/466, loss: 0.36353135108947754 2023-01-24 01:03:07.217292: step: 276/466, loss: 0.9937047958374023 2023-01-24 01:03:07.794283: step: 278/466, loss: 1.0498427152633667 2023-01-24 01:03:08.561070: step: 280/466, loss: 1.2113786935806274 2023-01-24 01:03:09.123509: step: 282/466, loss: 1.1858857870101929 2023-01-24 01:03:09.748743: step: 284/466, loss: 0.5884745717048645 2023-01-24 01:03:10.328723: step: 286/466, loss: 0.289907842874527 2023-01-24 01:03:11.003887: step: 288/466, loss: 0.32026946544647217 2023-01-24 01:03:11.694561: step: 290/466, loss: 1.020920991897583 2023-01-24 01:03:12.375972: step: 292/466, loss: 2.072936773300171 2023-01-24 01:03:12.977208: step: 294/466, loss: 0.4009312689304352 2023-01-24 01:03:13.556349: step: 296/466, loss: 1.1250522136688232 2023-01-24 01:03:14.283465: step: 298/466, loss: 0.9698195457458496 2023-01-24 01:03:14.904996: step: 300/466, loss: 1.2581521272659302 2023-01-24 01:03:15.512265: step: 302/466, loss: 0.6058701276779175 2023-01-24 01:03:16.150566: step: 304/466, loss: 1.154731035232544 2023-01-24 01:03:16.748838: step: 306/466, loss: 0.3935511112213135 2023-01-24 01:03:17.314794: step: 308/466, loss: 0.23435933887958527 2023-01-24 01:03:17.991931: step: 310/466, loss: 2.445216178894043 2023-01-24 01:03:18.576178: step: 312/466, loss: 1.0250365734100342 2023-01-24 01:03:19.293575: step: 314/466, loss: 1.1069316864013672 2023-01-24 01:03:19.952127: step: 316/466, loss: 1.266685128211975 2023-01-24 01:03:20.506797: step: 318/466, loss: 1.595887303352356 2023-01-24 01:03:21.093694: step: 320/466, loss: 0.12612585723400116 2023-01-24 01:03:21.728373: step: 322/466, loss: 2.205091714859009 2023-01-24 01:03:22.351958: step: 324/466, loss: 0.23279878497123718 2023-01-24 01:03:22.999277: step: 326/466, loss: 1.5497491359710693 2023-01-24 01:03:23.601791: step: 328/466, loss: 1.8650856018066406 
2023-01-24 01:03:24.173613: step: 330/466, loss: 0.22802956402301788 2023-01-24 01:03:24.769954: step: 332/466, loss: 1.0268545150756836 2023-01-24 01:03:25.383709: step: 334/466, loss: 0.6684575080871582 2023-01-24 01:03:26.152642: step: 336/466, loss: 1.6692156791687012 2023-01-24 01:03:26.761922: step: 338/466, loss: 0.6034660339355469 2023-01-24 01:03:27.368942: step: 340/466, loss: 1.5330373048782349 2023-01-24 01:03:27.981420: step: 342/466, loss: 1.2503236532211304 2023-01-24 01:03:28.641686: step: 344/466, loss: 0.9118637442588806 2023-01-24 01:03:29.284742: step: 346/466, loss: 0.6592655181884766 2023-01-24 01:03:29.883787: step: 348/466, loss: 0.4565240144729614 2023-01-24 01:03:30.484447: step: 350/466, loss: 0.28018730878829956 2023-01-24 01:03:31.126880: step: 352/466, loss: 1.3790347576141357 2023-01-24 01:03:31.662608: step: 354/466, loss: 0.5521497130393982 2023-01-24 01:03:32.271358: step: 356/466, loss: 0.5413537621498108 2023-01-24 01:03:32.866921: step: 358/466, loss: 0.4369373023509979 2023-01-24 01:03:33.562427: step: 360/466, loss: 1.288124680519104 2023-01-24 01:03:34.211808: step: 362/466, loss: 0.7232743501663208 2023-01-24 01:03:34.835292: step: 364/466, loss: 0.384662389755249 2023-01-24 01:03:35.475834: step: 366/466, loss: 0.9474266767501831 2023-01-24 01:03:36.033853: step: 368/466, loss: 0.9133715033531189 2023-01-24 01:03:36.753414: step: 370/466, loss: 0.5375441908836365 2023-01-24 01:03:37.365529: step: 372/466, loss: 4.506320476531982 2023-01-24 01:03:37.980832: step: 374/466, loss: 0.3736286759376526 2023-01-24 01:03:38.589426: step: 376/466, loss: 0.54074627161026 2023-01-24 01:03:39.246924: step: 378/466, loss: 1.5236220359802246 2023-01-24 01:03:39.834606: step: 380/466, loss: 0.17001619935035706 2023-01-24 01:03:40.434056: step: 382/466, loss: 0.2629481256008148 2023-01-24 01:03:41.017762: step: 384/466, loss: 0.8746252059936523 2023-01-24 01:03:41.585965: step: 386/466, loss: 0.3887893855571747 2023-01-24 01:03:42.215182: 
step: 388/466, loss: 0.9724615812301636 2023-01-24 01:03:42.802193: step: 390/466, loss: 0.5996192693710327 2023-01-24 01:03:43.441970: step: 392/466, loss: 0.3284958004951477 2023-01-24 01:03:44.079049: step: 394/466, loss: 1.205241084098816 2023-01-24 01:03:44.669720: step: 396/466, loss: 1.0375643968582153 2023-01-24 01:03:45.279899: step: 398/466, loss: 0.7388166785240173 2023-01-24 01:03:45.889217: step: 400/466, loss: 0.4791616201400757 2023-01-24 01:03:46.520145: step: 402/466, loss: 0.34600746631622314 2023-01-24 01:03:47.172601: step: 404/466, loss: 0.4240439236164093 2023-01-24 01:03:47.789485: step: 406/466, loss: 0.9592070579528809 2023-01-24 01:03:48.395414: step: 408/466, loss: 0.41275113821029663 2023-01-24 01:03:49.081556: step: 410/466, loss: 0.3482060432434082 2023-01-24 01:03:49.702266: step: 412/466, loss: 0.1422082632780075 2023-01-24 01:03:50.380571: step: 414/466, loss: 0.5417606830596924 2023-01-24 01:03:51.034585: step: 416/466, loss: 0.3440185487270355 2023-01-24 01:03:51.699624: step: 418/466, loss: 1.4496549367904663 2023-01-24 01:03:52.358974: step: 420/466, loss: 0.3177294135093689 2023-01-24 01:03:52.927594: step: 422/466, loss: 0.39011937379837036 2023-01-24 01:03:53.536879: step: 424/466, loss: 1.9770435094833374 2023-01-24 01:03:54.241194: step: 426/466, loss: 1.8488770723342896 2023-01-24 01:03:54.875126: step: 428/466, loss: 0.6467912197113037 2023-01-24 01:03:55.529252: step: 430/466, loss: 1.8490506410598755 2023-01-24 01:03:56.138965: step: 432/466, loss: 1.3182984590530396 2023-01-24 01:03:56.785069: step: 434/466, loss: 1.0293097496032715 2023-01-24 01:03:57.398840: step: 436/466, loss: 0.565805196762085 2023-01-24 01:03:58.026766: step: 438/466, loss: 1.7070420980453491 2023-01-24 01:03:58.663400: step: 440/466, loss: 0.7296619415283203 2023-01-24 01:03:59.250795: step: 442/466, loss: 1.5177580118179321 2023-01-24 01:03:59.840266: step: 444/466, loss: 1.4175868034362793 2023-01-24 01:04:00.485311: step: 446/466, loss: 
0.08863383531570435 2023-01-24 01:04:01.126270: step: 448/466, loss: 0.5503197908401489 2023-01-24 01:04:01.755274: step: 450/466, loss: 0.7010912299156189 2023-01-24 01:04:02.380542: step: 452/466, loss: 0.45214200019836426 2023-01-24 01:04:03.008443: step: 454/466, loss: 0.2602442502975464 2023-01-24 01:04:03.674821: step: 456/466, loss: 0.7292141914367676 2023-01-24 01:04:04.272591: step: 458/466, loss: 2.5713729858398438 2023-01-24 01:04:04.957783: step: 460/466, loss: 0.46030521392822266 2023-01-24 01:04:05.552517: step: 462/466, loss: 0.6398338079452515 2023-01-24 01:04:06.193276: step: 464/466, loss: 0.7439488768577576 2023-01-24 01:04:06.829317: step: 466/466, loss: 0.6633845567703247 2023-01-24 01:04:07.463409: step: 468/466, loss: 0.8546178936958313 2023-01-24 01:04:08.093347: step: 470/466, loss: 0.6649083495140076 2023-01-24 01:04:08.720517: step: 472/466, loss: 1.4693559408187866 2023-01-24 01:04:09.335509: step: 474/466, loss: 1.0182071924209595 2023-01-24 01:04:09.967950: step: 476/466, loss: 0.28218722343444824 2023-01-24 01:04:10.529752: step: 478/466, loss: 1.2137396335601807 2023-01-24 01:04:11.136627: step: 480/466, loss: 0.8096773624420166 2023-01-24 01:04:11.767436: step: 482/466, loss: 0.9805156588554382 2023-01-24 01:04:12.337731: step: 484/466, loss: 0.37933504581451416 2023-01-24 01:04:12.947536: step: 486/466, loss: 0.6712952852249146 2023-01-24 01:04:13.577175: step: 488/466, loss: 0.36372894048690796 2023-01-24 01:04:14.216819: step: 490/466, loss: 0.7804000377655029 2023-01-24 01:04:14.812387: step: 492/466, loss: 0.37860196828842163 2023-01-24 01:04:15.493155: step: 494/466, loss: 0.6025473475456238 2023-01-24 01:04:16.128001: step: 496/466, loss: 0.39121013879776 2023-01-24 01:04:16.726203: step: 498/466, loss: 1.1215208768844604 2023-01-24 01:04:17.341056: step: 500/466, loss: 0.7064223289489746 2023-01-24 01:04:17.885761: step: 502/466, loss: 0.488375186920166 2023-01-24 01:04:18.506834: step: 504/466, loss: 0.27472975850105286 
2023-01-24 01:04:19.153316: step: 506/466, loss: 1.3731070756912231 2023-01-24 01:04:19.819435: step: 508/466, loss: 0.6567395925521851 2023-01-24 01:04:20.475923: step: 510/466, loss: 0.6438676118850708 2023-01-24 01:04:21.068651: step: 512/466, loss: 1.5238269567489624 2023-01-24 01:04:21.685368: step: 514/466, loss: 0.1650468409061432 2023-01-24 01:04:22.321069: step: 516/466, loss: 2.120837926864624 2023-01-24 01:04:23.011464: step: 518/466, loss: 0.48192933201789856 2023-01-24 01:04:23.654488: step: 520/466, loss: 0.34953781962394714 2023-01-24 01:04:24.312333: step: 522/466, loss: 1.1414295434951782 2023-01-24 01:04:24.932316: step: 524/466, loss: 0.1272927224636078 2023-01-24 01:04:25.586523: step: 526/466, loss: 1.9006937742233276 2023-01-24 01:04:26.240714: step: 528/466, loss: 0.30656200647354126 2023-01-24 01:04:26.879202: step: 530/466, loss: 0.8622303605079651 2023-01-24 01:04:27.477246: step: 532/466, loss: 0.9534271359443665 2023-01-24 01:04:28.105834: step: 534/466, loss: 0.41976526379585266 2023-01-24 01:04:28.734349: step: 536/466, loss: 1.968782663345337 2023-01-24 01:04:29.319907: step: 538/466, loss: 0.3606802523136139 2023-01-24 01:04:30.087950: step: 540/466, loss: 0.6844926476478577 2023-01-24 01:04:30.726697: step: 542/466, loss: 1.232601523399353 2023-01-24 01:04:31.372137: step: 544/466, loss: 0.5401339530944824 2023-01-24 01:04:32.022631: step: 546/466, loss: 3.593629837036133 2023-01-24 01:04:32.673319: step: 548/466, loss: 0.35622620582580566 2023-01-24 01:04:33.278812: step: 550/466, loss: 1.1047563552856445 2023-01-24 01:04:33.913388: step: 552/466, loss: 1.1900886297225952 2023-01-24 01:04:34.555741: step: 554/466, loss: 0.7149781584739685 2023-01-24 01:04:35.159619: step: 556/466, loss: 1.642871379852295 2023-01-24 01:04:35.865097: step: 558/466, loss: 1.541691780090332 2023-01-24 01:04:36.454578: step: 560/466, loss: 0.5821778774261475 2023-01-24 01:04:37.116224: step: 562/466, loss: 0.3414008915424347 2023-01-24 01:04:37.712939: 
step: 564/466, loss: 0.36129286885261536 2023-01-24 01:04:38.305031: step: 566/466, loss: 0.2286379039287567 2023-01-24 01:04:38.907156: step: 568/466, loss: 0.9652560353279114 2023-01-24 01:04:39.545028: step: 570/466, loss: 0.21559062600135803 2023-01-24 01:04:40.145448: step: 572/466, loss: 0.28920507431030273 2023-01-24 01:04:40.758189: step: 574/466, loss: 0.4134334623813629 2023-01-24 01:04:41.444467: step: 576/466, loss: 0.6740312576293945 2023-01-24 01:04:42.071826: step: 578/466, loss: 4.717375755310059 2023-01-24 01:04:42.702325: step: 580/466, loss: 2.156572103500366 2023-01-24 01:04:43.294999: step: 582/466, loss: 0.8814055919647217 2023-01-24 01:04:43.969879: step: 584/466, loss: 0.525405764579773 2023-01-24 01:04:44.637655: step: 586/466, loss: 1.29059636592865 2023-01-24 01:04:45.267175: step: 588/466, loss: 0.4213157296180725 2023-01-24 01:04:45.845432: step: 590/466, loss: 0.4958527684211731 2023-01-24 01:04:46.442037: step: 592/466, loss: 1.0746906995773315 2023-01-24 01:04:47.077044: step: 594/466, loss: 0.36387670040130615 2023-01-24 01:04:47.692101: step: 596/466, loss: 1.0891448259353638 2023-01-24 01:04:48.247279: step: 598/466, loss: 1.131120204925537 2023-01-24 01:04:48.877914: step: 600/466, loss: 9.013973236083984 2023-01-24 01:04:49.512347: step: 602/466, loss: 2.1736669540405273 2023-01-24 01:04:50.108043: step: 604/466, loss: 0.8762894868850708 2023-01-24 01:04:50.706024: step: 606/466, loss: 0.28769704699516296 2023-01-24 01:04:51.387467: step: 608/466, loss: 1.3017332553863525 2023-01-24 01:04:52.077678: step: 610/466, loss: 0.4562010169029236 2023-01-24 01:04:52.710590: step: 612/466, loss: 0.520621657371521 2023-01-24 01:04:53.303638: step: 614/466, loss: 0.8116568326950073 2023-01-24 01:04:53.934624: step: 616/466, loss: 1.0235605239868164 2023-01-24 01:04:54.583356: step: 618/466, loss: 0.8463140726089478 2023-01-24 01:04:55.278257: step: 620/466, loss: 1.0514177083969116 2023-01-24 01:04:55.937741: step: 622/466, loss: 
0.5442951917648315 2023-01-24 01:04:56.542068: step: 624/466, loss: 0.7741603255271912 2023-01-24 01:04:57.167152: step: 626/466, loss: 1.048574686050415 2023-01-24 01:04:57.835805: step: 628/466, loss: 1.3407269716262817 2023-01-24 01:04:58.458112: step: 630/466, loss: 1.2626268863677979 2023-01-24 01:04:59.190584: step: 632/466, loss: 0.20626191794872284 2023-01-24 01:04:59.757361: step: 634/466, loss: 0.6527853012084961 2023-01-24 01:05:00.350705: step: 636/466, loss: 0.4264346957206726 2023-01-24 01:05:01.029006: step: 638/466, loss: 1.9168164730072021 2023-01-24 01:05:01.738373: step: 640/466, loss: 1.464083194732666 2023-01-24 01:05:02.423045: step: 642/466, loss: 1.5962927341461182 2023-01-24 01:05:03.103322: step: 644/466, loss: 0.33310598134994507 2023-01-24 01:05:03.693332: step: 646/466, loss: 0.2638262212276459 2023-01-24 01:05:04.308859: step: 648/466, loss: 0.21902170777320862 2023-01-24 01:05:04.930516: step: 650/466, loss: 1.34444260597229 2023-01-24 01:05:05.572594: step: 652/466, loss: 1.201189398765564 2023-01-24 01:05:06.164223: step: 654/466, loss: 0.8005489110946655 2023-01-24 01:05:06.842466: step: 656/466, loss: 0.2549632787704468 2023-01-24 01:05:07.462173: step: 658/466, loss: 0.41220033168792725 2023-01-24 01:05:08.054871: step: 660/466, loss: 0.7568885087966919 2023-01-24 01:05:08.646713: step: 662/466, loss: 0.1099766418337822 2023-01-24 01:05:09.263273: step: 664/466, loss: 1.3935626745224 2023-01-24 01:05:09.876021: step: 666/466, loss: 1.1161881685256958 2023-01-24 01:05:10.478460: step: 668/466, loss: 1.8901318311691284 2023-01-24 01:05:11.121243: step: 670/466, loss: 1.109795331954956 2023-01-24 01:05:11.798423: step: 672/466, loss: 1.0813663005828857 2023-01-24 01:05:12.446525: step: 674/466, loss: 0.41245943307876587 2023-01-24 01:05:13.056207: step: 676/466, loss: 0.43254128098487854 2023-01-24 01:05:13.695545: step: 678/466, loss: 0.5581113696098328 2023-01-24 01:05:14.320084: step: 680/466, loss: 0.5595136284828186 2023-01-24 
01:05:14.900910: step: 682/466, loss: 1.6868298053741455 2023-01-24 01:05:15.439183: step: 684/466, loss: 0.2144726663827896 2023-01-24 01:05:16.110514: step: 686/466, loss: 0.49233365058898926 2023-01-24 01:05:16.664591: step: 688/466, loss: 0.44163215160369873 2023-01-24 01:05:17.314572: step: 690/466, loss: 0.9274957180023193 2023-01-24 01:05:17.938408: step: 692/466, loss: 0.9633333683013916 2023-01-24 01:05:18.612283: step: 694/466, loss: 0.6739097833633423 2023-01-24 01:05:19.199207: step: 696/466, loss: 1.2564454078674316 2023-01-24 01:05:19.790331: step: 698/466, loss: 0.9555618762969971 2023-01-24 01:05:20.403070: step: 700/466, loss: 0.7222802639007568 2023-01-24 01:05:21.071777: step: 702/466, loss: 0.45501071214675903 2023-01-24 01:05:21.708845: step: 704/466, loss: 0.9140257239341736 2023-01-24 01:05:22.295317: step: 706/466, loss: 0.3494584858417511 2023-01-24 01:05:22.865801: step: 708/466, loss: 0.22887399792671204 2023-01-24 01:05:23.496467: step: 710/466, loss: 1.0088324546813965 2023-01-24 01:05:24.141030: step: 712/466, loss: 1.3507208824157715 2023-01-24 01:05:24.792789: step: 714/466, loss: 0.5512253046035767 2023-01-24 01:05:25.457886: step: 716/466, loss: 1.1831029653549194 2023-01-24 01:05:26.038186: step: 718/466, loss: 0.8344932794570923 2023-01-24 01:05:26.607935: step: 720/466, loss: 1.2759710550308228 2023-01-24 01:05:27.268408: step: 722/466, loss: 0.49068954586982727 2023-01-24 01:05:27.872039: step: 724/466, loss: 1.6091653108596802 2023-01-24 01:05:28.455840: step: 726/466, loss: 0.5767663717269897 2023-01-24 01:05:29.185240: step: 728/466, loss: 0.449196994304657 2023-01-24 01:05:29.825247: step: 730/466, loss: 0.642565906047821 2023-01-24 01:05:30.454298: step: 732/466, loss: 1.0968046188354492 2023-01-24 01:05:31.064941: step: 734/466, loss: 1.483222484588623 2023-01-24 01:05:31.674476: step: 736/466, loss: 0.23038074374198914 2023-01-24 01:05:32.370618: step: 738/466, loss: 0.20710501074790955 2023-01-24 01:05:33.001449: step: 
740/466, loss: 0.2826586067676544 2023-01-24 01:05:33.587651: step: 742/466, loss: 0.3242071270942688 2023-01-24 01:05:34.186745: step: 744/466, loss: 0.8751279711723328 2023-01-24 01:05:34.777773: step: 746/466, loss: 0.24087029695510864 2023-01-24 01:05:35.383426: step: 748/466, loss: 0.6206285953521729 2023-01-24 01:05:36.070036: step: 750/466, loss: 0.9701910614967346 2023-01-24 01:05:36.730107: step: 752/466, loss: 0.8841103315353394 2023-01-24 01:05:37.430255: step: 754/466, loss: 1.3417093753814697 2023-01-24 01:05:38.086260: step: 756/466, loss: 0.0679602101445198 2023-01-24 01:05:38.692661: step: 758/466, loss: 0.21013587713241577 2023-01-24 01:05:39.342473: step: 760/466, loss: 0.5504181385040283 2023-01-24 01:05:39.946668: step: 762/466, loss: 0.7431395649909973 2023-01-24 01:05:40.618632: step: 764/466, loss: 1.3952863216400146 2023-01-24 01:05:41.228228: step: 766/466, loss: 2.8130741119384766 2023-01-24 01:05:41.855207: step: 768/466, loss: 1.290252685546875 2023-01-24 01:05:42.469095: step: 770/466, loss: 0.8594147562980652 2023-01-24 01:05:43.094344: step: 772/466, loss: 0.5162642002105713 2023-01-24 01:05:43.716756: step: 774/466, loss: 0.8412430882453918 2023-01-24 01:05:44.307703: step: 776/466, loss: 0.4270803928375244 2023-01-24 01:05:44.945254: step: 778/466, loss: 0.45356687903404236 2023-01-24 01:05:45.553902: step: 780/466, loss: 1.001887321472168 2023-01-24 01:05:46.216810: step: 782/466, loss: 0.5256953239440918 2023-01-24 01:05:46.823320: step: 784/466, loss: 0.5954223275184631 2023-01-24 01:05:47.657317: step: 786/466, loss: 0.5046058893203735 2023-01-24 01:05:48.295830: step: 788/466, loss: 0.8506419658660889 2023-01-24 01:05:48.998579: step: 790/466, loss: 0.274544894695282 2023-01-24 01:05:49.610023: step: 792/466, loss: 0.777887225151062 2023-01-24 01:05:50.234895: step: 794/466, loss: 0.79243004322052 2023-01-24 01:05:50.869328: step: 796/466, loss: 1.4459573030471802 2023-01-24 01:05:51.476333: step: 798/466, loss: 
1.8504751920700073 2023-01-24 01:05:52.101725: step: 800/466, loss: 0.5830101370811462 2023-01-24 01:05:52.698442: step: 802/466, loss: 0.15489044785499573 2023-01-24 01:05:53.320797: step: 804/466, loss: 2.4530303478240967 2023-01-24 01:05:53.907143: step: 806/466, loss: 1.168830394744873 2023-01-24 01:05:54.520232: step: 808/466, loss: 1.9327266216278076 2023-01-24 01:05:55.129065: step: 810/466, loss: 1.0205035209655762 2023-01-24 01:05:55.749018: step: 812/466, loss: 0.7465870380401611 2023-01-24 01:05:56.384054: step: 814/466, loss: 0.7815476655960083 2023-01-24 01:05:56.956554: step: 816/466, loss: 0.33895769715309143 2023-01-24 01:05:57.675793: step: 818/466, loss: 1.771838665008545 2023-01-24 01:05:58.312495: step: 820/466, loss: 0.6762361526489258 2023-01-24 01:05:59.004688: step: 822/466, loss: 1.0862176418304443 2023-01-24 01:05:59.676595: step: 824/466, loss: 0.6672700047492981 2023-01-24 01:06:00.294051: step: 826/466, loss: 1.169159173965454 2023-01-24 01:06:00.897433: step: 828/466, loss: 0.46546027064323425 2023-01-24 01:06:01.484762: step: 830/466, loss: 1.314602255821228 2023-01-24 01:06:02.039550: step: 832/466, loss: 0.10589657723903656 2023-01-24 01:06:02.685011: step: 834/466, loss: 0.5514154434204102 2023-01-24 01:06:03.291849: step: 836/466, loss: 0.8950437903404236 2023-01-24 01:06:03.896369: step: 838/466, loss: 0.37474846839904785 2023-01-24 01:06:04.570178: step: 840/466, loss: 0.20299357175827026 2023-01-24 01:06:05.231087: step: 842/466, loss: 1.3681626319885254 2023-01-24 01:06:05.869989: step: 844/466, loss: 0.5367602109909058 2023-01-24 01:06:06.503266: step: 846/466, loss: 0.258124440908432 2023-01-24 01:06:07.120306: step: 848/466, loss: 0.23170244693756104 2023-01-24 01:06:07.738757: step: 850/466, loss: 1.8249881267547607 2023-01-24 01:06:08.310947: step: 852/466, loss: 0.8979384899139404 2023-01-24 01:06:08.910719: step: 854/466, loss: 0.6674975156784058 2023-01-24 01:06:09.513355: step: 856/466, loss: 0.2780420482158661 
2023-01-24 01:06:10.179360: step: 858/466, loss: 2.4918947219848633 2023-01-24 01:06:10.771158: step: 860/466, loss: 0.5463971495628357 2023-01-24 01:06:11.426648: step: 862/466, loss: 0.16769395768642426 2023-01-24 01:06:12.094913: step: 864/466, loss: 1.2891900539398193 2023-01-24 01:06:12.714516: step: 866/466, loss: 0.975051760673523 2023-01-24 01:06:13.265561: step: 868/466, loss: 0.39227622747421265 2023-01-24 01:06:13.862636: step: 870/466, loss: 0.5910021066665649 2023-01-24 01:06:14.506891: step: 872/466, loss: 0.2911394238471985 2023-01-24 01:06:15.142033: step: 874/466, loss: 0.8446894884109497 2023-01-24 01:06:15.768054: step: 876/466, loss: 1.073867917060852 2023-01-24 01:06:16.356931: step: 878/466, loss: 0.43695247173309326 2023-01-24 01:06:16.993123: step: 880/466, loss: 0.5594303607940674 2023-01-24 01:06:17.650567: step: 882/466, loss: 1.9501713514328003 2023-01-24 01:06:18.328552: step: 884/466, loss: 0.5615759491920471 2023-01-24 01:06:18.923110: step: 886/466, loss: 1.3543728590011597 2023-01-24 01:06:19.673213: step: 888/466, loss: 0.516370415687561 2023-01-24 01:06:20.248161: step: 890/466, loss: 1.3336546421051025 2023-01-24 01:06:20.807057: step: 892/466, loss: 1.3180499076843262 2023-01-24 01:06:21.409045: step: 894/466, loss: 0.891340434551239 2023-01-24 01:06:22.125601: step: 896/466, loss: 0.2793903648853302 2023-01-24 01:06:22.783345: step: 898/466, loss: 0.8381861448287964 2023-01-24 01:06:23.441458: step: 900/466, loss: 1.500229835510254 2023-01-24 01:06:24.111340: step: 902/466, loss: 0.8812735676765442 2023-01-24 01:06:24.735748: step: 904/466, loss: 1.176670789718628 2023-01-24 01:06:25.373424: step: 906/466, loss: 0.7041540145874023 2023-01-24 01:06:26.079499: step: 908/466, loss: 0.2905486226081848 2023-01-24 01:06:26.749008: step: 910/466, loss: 0.22905662655830383 2023-01-24 01:06:27.434834: step: 912/466, loss: 0.5883660316467285 2023-01-24 01:06:28.091366: step: 914/466, loss: 1.1307791471481323 2023-01-24 01:06:28.637975: 
step: 916/466, loss: 0.5354166030883789 2023-01-24 01:06:29.223974: step: 918/466, loss: 2.1675527095794678 2023-01-24 01:06:29.884872: step: 920/466, loss: 0.23495560884475708 2023-01-24 01:06:30.519138: step: 922/466, loss: 0.7527521252632141 2023-01-24 01:06:31.136604: step: 924/466, loss: 0.6191567182540894 2023-01-24 01:06:31.718931: step: 926/466, loss: 0.9814187288284302 2023-01-24 01:06:32.339358: step: 928/466, loss: 0.27741020917892456 2023-01-24 01:06:32.981237: step: 930/466, loss: 0.8087165355682373 2023-01-24 01:06:33.593539: step: 932/466, loss: 0.23670652508735657 ================================================== Loss: 0.900 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.318032196969697, 'r': 0.30116685319100095, 'f1': 0.30936984141021107}, 'combined': 0.2279567252496292, 'epoch': 5} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35052653959328206, 'r': 0.2626676835154499, 'f1': 0.3003028753234936}, 'combined': 0.19916460125081437, 'epoch': 5} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2959473164135636, 'r': 0.24447821790685687, 'f1': 0.2677618577075099}, 'combined': 0.17850790513833992, 'epoch': 5} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37089076752905875, 'r': 0.2591418609488748, 'f1': 0.30510586075020424}, 'combined': 0.1991217196475017, 'epoch': 5} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31075728542450565, 'r': 0.30368121820421334, 'f1': 0.30717850670560537}, 'combined': 0.22634205757255133, 'epoch': 5} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3485189989286822, 'r': 0.2601068199796745, 'f1': 0.2978913010178722}, 'combined': 
0.19756521518283748, 'epoch': 5} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24094202898550723, 'r': 0.31666666666666665, 'f1': 0.27366255144032925}, 'combined': 0.1824417009602195, 'epoch': 5} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5357142857142857, 'r': 0.32608695652173914, 'f1': 0.40540540540540543}, 'combined': 0.2702702702702703, 'epoch': 5} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39285714285714285, 'r': 0.09482758620689655, 'f1': 0.15277777777777776}, 'combined': 0.10185185185185183, 'epoch': 5} New best korean model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3164303771212846, 'r': 0.295415076932964, 'f1': 0.3055618165724671}, 'combined': 0.22515081221129155, 'epoch': 3} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34839024729428025, 'r': 0.27130476474688553, 'f1': 0.30505307367555545}, 'combined': 0.20231499186772586, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3098290598290598, 'r': 0.3452380952380952, 'f1': 0.3265765765765765}, 'combined': 0.21771771771771767, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2959473164135636, 'r': 0.24447821790685687, 'f1': 0.2677618577075099}, 'combined': 0.17850790513833992, 'epoch': 5} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37089076752905875, 'r': 0.2591418609488748, 'f1': 0.30510586075020424}, 'combined': 0.1991217196475017, 'epoch': 5} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.5357142857142857, 'r': 0.32608695652173914, 'f1': 0.40540540540540543}, 'combined': 0.2702702702702703, 'epoch': 5} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32446373912283005, 'r': 0.30476195610208895, 'f1': 0.31430440482544203}, 'combined': 0.23159271934506254, 'epoch': 3} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3404589712580221, 'r': 0.2634574188331142, 'f1': 0.2970492050155484}, 'combined': 0.1970067266424362, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 3} ****************************** Epoch: 6 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:09:08.333578: step: 2/466, loss: 0.5851011276245117 2023-01-24 01:09:09.001503: step: 4/466, loss: 0.2616092264652252 2023-01-24 01:09:09.618928: step: 6/466, loss: 0.3854227364063263 2023-01-24 01:09:10.259120: step: 8/466, loss: 0.7984930276870728 2023-01-24 01:09:10.883737: step: 10/466, loss: 0.9049591422080994 2023-01-24 01:09:11.505086: step: 12/466, loss: 1.1948599815368652 2023-01-24 01:09:12.118075: step: 14/466, loss: 1.0925015211105347 2023-01-24 01:09:12.698460: step: 16/466, loss: 0.5179755687713623 2023-01-24 01:09:13.466471: step: 18/466, loss: 0.8919814825057983 2023-01-24 01:09:14.158078: step: 20/466, loss: 0.13820907473564148 2023-01-24 01:09:14.886267: step: 22/466, loss: 0.7515466213226318 2023-01-24 01:09:15.497270: step: 24/466, loss: 1.1870259046554565 2023-01-24 01:09:16.088165: step: 26/466, loss: 0.4848284423351288 2023-01-24 01:09:16.665625: step: 28/466, loss: 0.5582742094993591 
2023-01-24 01:09:17.263484: step: 30/466, loss: 0.6025078892707825 2023-01-24 01:09:17.850359: step: 32/466, loss: 0.23402145504951477 2023-01-24 01:09:18.491059: step: 34/466, loss: 1.5965683460235596 2023-01-24 01:09:19.085207: step: 36/466, loss: 0.1658952534198761 2023-01-24 01:09:19.826389: step: 38/466, loss: 0.45339614152908325 2023-01-24 01:09:20.487151: step: 40/466, loss: 0.426493376493454 2023-01-24 01:09:21.071073: step: 42/466, loss: 0.28591638803482056 2023-01-24 01:09:21.668947: step: 44/466, loss: 0.6934127807617188 2023-01-24 01:09:22.320259: step: 46/466, loss: 0.9057992696762085 2023-01-24 01:09:22.953345: step: 48/466, loss: 0.4184841513633728 2023-01-24 01:09:23.577775: step: 50/466, loss: 0.4128725826740265 2023-01-24 01:09:24.169042: step: 52/466, loss: 0.13497361540794373 2023-01-24 01:09:24.793167: step: 54/466, loss: 0.19966351985931396 2023-01-24 01:09:25.441573: step: 56/466, loss: 1.5218298435211182 2023-01-24 01:09:26.074641: step: 58/466, loss: 0.5765039324760437 2023-01-24 01:09:26.755391: step: 60/466, loss: 0.15954983234405518 2023-01-24 01:09:27.439678: step: 62/466, loss: 0.40534988045692444 2023-01-24 01:09:28.050608: step: 64/466, loss: 0.40137627720832825 2023-01-24 01:09:28.850453: step: 66/466, loss: 0.6724600195884705 2023-01-24 01:09:29.423642: step: 68/466, loss: 1.581249713897705 2023-01-24 01:09:30.116840: step: 70/466, loss: 0.7606999278068542 2023-01-24 01:09:30.750158: step: 72/466, loss: 1.073913812637329 2023-01-24 01:09:31.344447: step: 74/466, loss: 0.4068748652935028 2023-01-24 01:09:31.953895: step: 76/466, loss: 1.0465961694717407 2023-01-24 01:09:32.586034: step: 78/466, loss: 0.5474420785903931 2023-01-24 01:09:33.196250: step: 80/466, loss: 0.6576740145683289 2023-01-24 01:09:33.883271: step: 82/466, loss: 1.0797858238220215 2023-01-24 01:09:34.534797: step: 84/466, loss: 0.3516871929168701 2023-01-24 01:09:35.188300: step: 86/466, loss: 0.5890368223190308 2023-01-24 01:09:35.926470: step: 88/466, loss: 
0.5695680975914001 2023-01-24 01:09:36.567265: step: 90/466, loss: 0.40643540024757385 2023-01-24 01:09:37.200370: step: 92/466, loss: 0.9107625484466553 2023-01-24 01:09:37.916780: step: 94/466, loss: 0.2521580159664154 2023-01-24 01:09:38.522228: step: 96/466, loss: 0.7301395535469055 2023-01-24 01:09:39.089670: step: 98/466, loss: 0.05241236835718155 2023-01-24 01:09:39.732035: step: 100/466, loss: 0.3641345202922821 2023-01-24 01:09:40.367175: step: 102/466, loss: 0.5393410921096802 2023-01-24 01:09:40.976642: step: 104/466, loss: 0.15966859459877014 2023-01-24 01:09:41.604293: step: 106/466, loss: 0.26376670598983765 2023-01-24 01:09:42.243590: step: 108/466, loss: 0.6145482063293457 2023-01-24 01:09:42.875533: step: 110/466, loss: 0.4013631343841553 2023-01-24 01:09:43.479643: step: 112/466, loss: 1.1954171657562256 2023-01-24 01:09:44.066622: step: 114/466, loss: 0.47752130031585693 2023-01-24 01:09:44.644644: step: 116/466, loss: 0.4595664441585541 2023-01-24 01:09:45.271768: step: 118/466, loss: 0.306684285402298 2023-01-24 01:09:45.892671: step: 120/466, loss: 0.9856728315353394 2023-01-24 01:09:46.533367: step: 122/466, loss: 0.22228127717971802 2023-01-24 01:09:47.097514: step: 124/466, loss: 0.5959650874137878 2023-01-24 01:09:47.697197: step: 126/466, loss: 0.5348347425460815 2023-01-24 01:09:48.276972: step: 128/466, loss: 0.3972974121570587 2023-01-24 01:09:48.846747: step: 130/466, loss: 0.9095874428749084 2023-01-24 01:09:49.452471: step: 132/466, loss: 0.2588285207748413 2023-01-24 01:09:50.035881: step: 134/466, loss: 0.33528074622154236 2023-01-24 01:09:50.713173: step: 136/466, loss: 0.5507739782333374 2023-01-24 01:09:51.289302: step: 138/466, loss: 0.6516622304916382 2023-01-24 01:09:51.883114: step: 140/466, loss: 0.16563507914543152 2023-01-24 01:09:52.470357: step: 142/466, loss: 0.23387084901332855 2023-01-24 01:09:53.104182: step: 144/466, loss: 1.556229829788208 2023-01-24 01:09:53.731546: step: 146/466, loss: 0.5888187885284424 
2023-01-24 01:09:54.435138: step: 148/466, loss: 0.7805293798446655 2023-01-24 01:09:55.057292: step: 150/466, loss: 0.41787880659103394 2023-01-24 01:09:55.631112: step: 152/466, loss: 2.2686965465545654 2023-01-24 01:09:56.241176: step: 154/466, loss: 0.3498953580856323 2023-01-24 01:09:56.791685: step: 156/466, loss: 0.6985567212104797 2023-01-24 01:09:57.426663: step: 158/466, loss: 0.9538766741752625 2023-01-24 01:09:58.024421: step: 160/466, loss: 0.19783622026443481 2023-01-24 01:09:58.700594: step: 162/466, loss: 0.3048805892467499 2023-01-24 01:09:59.314784: step: 164/466, loss: 0.9341416358947754 2023-01-24 01:09:59.887348: step: 166/466, loss: 0.19876980781555176 2023-01-24 01:10:00.528598: step: 168/466, loss: 0.32706427574157715 2023-01-24 01:10:01.273584: step: 170/466, loss: 0.6943373680114746 2023-01-24 01:10:01.893448: step: 172/466, loss: 0.7037789225578308 2023-01-24 01:10:02.549310: step: 174/466, loss: 0.9194547533988953 2023-01-24 01:10:03.261029: step: 176/466, loss: 0.30153122544288635 2023-01-24 01:10:03.909847: step: 178/466, loss: 0.5422283411026001 2023-01-24 01:10:04.626867: step: 180/466, loss: 0.7232992053031921 2023-01-24 01:10:05.287367: step: 182/466, loss: 0.6932333707809448 2023-01-24 01:10:05.893250: step: 184/466, loss: 2.055283546447754 2023-01-24 01:10:06.519459: step: 186/466, loss: 1.341884732246399 2023-01-24 01:10:07.166669: step: 188/466, loss: 0.784245491027832 2023-01-24 01:10:07.963956: step: 190/466, loss: 0.17669035494327545 2023-01-24 01:10:08.582071: step: 192/466, loss: 1.3504656553268433 2023-01-24 01:10:09.192308: step: 194/466, loss: 1.777635931968689 2023-01-24 01:10:09.922412: step: 196/466, loss: 0.5566887855529785 2023-01-24 01:10:10.479956: step: 198/466, loss: 0.2367226481437683 2023-01-24 01:10:11.147974: step: 200/466, loss: 0.521005392074585 2023-01-24 01:10:11.765542: step: 202/466, loss: 0.33490070700645447 2023-01-24 01:10:12.313232: step: 204/466, loss: 0.7058187127113342 2023-01-24 
01:10:12.958074: step: 206/466, loss: 2.002868175506592 2023-01-24 01:10:13.572504: step: 208/466, loss: 0.5000832676887512 2023-01-24 01:10:14.153135: step: 210/466, loss: 1.2851827144622803 2023-01-24 01:10:14.707416: step: 212/466, loss: 0.3409268856048584 2023-01-24 01:10:15.380732: step: 214/466, loss: 0.396659791469574 2023-01-24 01:10:16.011488: step: 216/466, loss: 1.1128814220428467 2023-01-24 01:10:16.619855: step: 218/466, loss: 0.7367991805076599 2023-01-24 01:10:17.269917: step: 220/466, loss: 0.31557464599609375 2023-01-24 01:10:17.862964: step: 222/466, loss: 0.49025699496269226 2023-01-24 01:10:18.451423: step: 224/466, loss: 0.6997539401054382 2023-01-24 01:10:19.115563: step: 226/466, loss: 0.20695850253105164 2023-01-24 01:10:19.758758: step: 228/466, loss: 0.6655076742172241 2023-01-24 01:10:20.439997: step: 230/466, loss: 0.8669025301933289 2023-01-24 01:10:21.090795: step: 232/466, loss: 0.3725025951862335 2023-01-24 01:10:21.659638: step: 234/466, loss: 0.11006996780633926 2023-01-24 01:10:22.243753: step: 236/466, loss: 0.3412753939628601 2023-01-24 01:10:22.945665: step: 238/466, loss: 1.3536577224731445 2023-01-24 01:10:23.525349: step: 240/466, loss: 0.16854868829250336 2023-01-24 01:10:24.164408: step: 242/466, loss: 0.6572349667549133 2023-01-24 01:10:24.800789: step: 244/466, loss: 0.45324134826660156 2023-01-24 01:10:25.468923: step: 246/466, loss: 0.3042180836200714 2023-01-24 01:10:26.108820: step: 248/466, loss: 0.505139172077179 2023-01-24 01:10:26.758631: step: 250/466, loss: 1.6602253913879395 2023-01-24 01:10:27.363232: step: 252/466, loss: 0.3393111228942871 2023-01-24 01:10:28.044039: step: 254/466, loss: 1.0471343994140625 2023-01-24 01:10:28.639263: step: 256/466, loss: 0.30317452549934387 2023-01-24 01:10:29.221446: step: 258/466, loss: 0.6668992042541504 2023-01-24 01:10:29.822595: step: 260/466, loss: 1.2988413572311401 2023-01-24 01:10:30.451648: step: 262/466, loss: 0.2204177975654602 2023-01-24 01:10:31.040105: step: 
264/466, loss: 0.7073379158973694 2023-01-24 01:10:31.651082: step: 266/466, loss: 0.2726287543773651 2023-01-24 01:10:32.296099: step: 268/466, loss: 1.0368794202804565 2023-01-24 01:10:32.945509: step: 270/466, loss: 0.19614382088184357 2023-01-24 01:10:33.592716: step: 272/466, loss: 0.4229815900325775 2023-01-24 01:10:34.241371: step: 274/466, loss: 0.25105732679367065 2023-01-24 01:10:34.869844: step: 276/466, loss: 1.1729516983032227 2023-01-24 01:10:35.511162: step: 278/466, loss: 1.3843333721160889 2023-01-24 01:10:36.217352: step: 280/466, loss: 0.3619017004966736 2023-01-24 01:10:36.803889: step: 282/466, loss: 0.12895332276821136 2023-01-24 01:10:37.470014: step: 284/466, loss: 1.8011032342910767 2023-01-24 01:10:38.055220: step: 286/466, loss: 1.437566876411438 2023-01-24 01:10:38.700833: step: 288/466, loss: 0.6030497550964355 2023-01-24 01:10:39.395907: step: 290/466, loss: 0.36808836460113525 2023-01-24 01:10:39.939616: step: 292/466, loss: 0.3335151970386505 2023-01-24 01:10:40.548228: step: 294/466, loss: 0.611156165599823 2023-01-24 01:10:41.191727: step: 296/466, loss: 1.2586387395858765 2023-01-24 01:10:41.845765: step: 298/466, loss: 0.2738572955131531 2023-01-24 01:10:42.457765: step: 300/466, loss: 0.6908833980560303 2023-01-24 01:10:43.112428: step: 302/466, loss: 0.679551362991333 2023-01-24 01:10:43.724247: step: 304/466, loss: 3.43265962600708 2023-01-24 01:10:44.386618: step: 306/466, loss: 0.3721560835838318 2023-01-24 01:10:45.042423: step: 308/466, loss: 1.1952612400054932 2023-01-24 01:10:45.716861: step: 310/466, loss: 0.8409422636032104 2023-01-24 01:10:46.351154: step: 312/466, loss: 0.679412305355072 2023-01-24 01:10:46.952537: step: 314/466, loss: 0.30780380964279175 2023-01-24 01:10:47.538728: step: 316/466, loss: 0.2822536528110504 2023-01-24 01:10:48.166272: step: 318/466, loss: 1.0847021341323853 2023-01-24 01:10:48.757430: step: 320/466, loss: 0.2368328869342804 2023-01-24 01:10:49.392824: step: 322/466, loss: 
0.3256698548793793 2023-01-24 01:10:50.011838: step: 324/466, loss: 0.5243033766746521 2023-01-24 01:10:50.649553: step: 326/466, loss: 0.7073649168014526 2023-01-24 01:10:51.241788: step: 328/466, loss: 1.7504061460494995 2023-01-24 01:10:51.895659: step: 330/466, loss: 0.5639289617538452 2023-01-24 01:10:52.562780: step: 332/466, loss: 0.3306322693824768 2023-01-24 01:10:53.274678: step: 334/466, loss: 0.9812665581703186 2023-01-24 01:10:53.931585: step: 336/466, loss: 0.4654000401496887 2023-01-24 01:10:54.526008: step: 338/466, loss: 0.8619095683097839 2023-01-24 01:10:55.221397: step: 340/466, loss: 0.9946466684341431 2023-01-24 01:10:55.858778: step: 342/466, loss: 0.976168155670166 2023-01-24 01:10:56.453728: step: 344/466, loss: 0.26822784543037415 2023-01-24 01:10:57.062249: step: 346/466, loss: 0.49369752407073975 2023-01-24 01:10:57.681485: step: 348/466, loss: 0.46656692028045654 2023-01-24 01:10:58.272651: step: 350/466, loss: 0.21348896622657776 2023-01-24 01:10:58.927130: step: 352/466, loss: 0.9902009963989258 2023-01-24 01:10:59.537129: step: 354/466, loss: 0.3983117938041687 2023-01-24 01:11:00.151374: step: 356/466, loss: 0.5899996161460876 2023-01-24 01:11:00.832821: step: 358/466, loss: 0.2520958483219147 2023-01-24 01:11:01.477276: step: 360/466, loss: 0.7416293025016785 2023-01-24 01:11:02.072663: step: 362/466, loss: 0.40580204129219055 2023-01-24 01:11:02.667346: step: 364/466, loss: 0.4914165139198303 2023-01-24 01:11:03.351747: step: 366/466, loss: 1.9097177982330322 2023-01-24 01:11:03.941532: step: 368/466, loss: 0.5574102997779846 2023-01-24 01:11:04.543029: step: 370/466, loss: 0.2993851602077484 2023-01-24 01:11:05.212370: step: 372/466, loss: 0.2144942283630371 2023-01-24 01:11:05.888844: step: 374/466, loss: 0.3030288815498352 2023-01-24 01:11:06.482447: step: 376/466, loss: 0.6745043992996216 2023-01-24 01:11:07.098836: step: 378/466, loss: 0.5604280233383179 2023-01-24 01:11:07.677430: step: 380/466, loss: 0.49297454953193665 
2023-01-24 01:11:08.328060: step: 382/466, loss: 0.7742137312889099 2023-01-24 01:11:08.965030: step: 384/466, loss: 1.0356718301773071 2023-01-24 01:11:09.608501: step: 386/466, loss: 0.24659501016139984 2023-01-24 01:11:10.248664: step: 388/466, loss: 0.9688482284545898 2023-01-24 01:11:10.917549: step: 390/466, loss: 0.47970694303512573 2023-01-24 01:11:11.555282: step: 392/466, loss: 0.6070094704627991 2023-01-24 01:11:12.268968: step: 394/466, loss: 0.27566829323768616 2023-01-24 01:11:12.852588: step: 396/466, loss: 1.622042179107666 2023-01-24 01:11:13.472757: step: 398/466, loss: 1.9418872594833374 2023-01-24 01:11:14.093272: step: 400/466, loss: 0.389572411775589 2023-01-24 01:11:14.729264: step: 402/466, loss: 2.1840286254882812 2023-01-24 01:11:15.340739: step: 404/466, loss: 0.16141338646411896 2023-01-24 01:11:15.963772: step: 406/466, loss: 0.3883560299873352 2023-01-24 01:11:16.532464: step: 408/466, loss: 0.753023624420166 2023-01-24 01:11:17.120083: step: 410/466, loss: 0.09870093315839767 2023-01-24 01:11:17.674684: step: 412/466, loss: 0.6118962168693542 2023-01-24 01:11:18.265058: step: 414/466, loss: 0.6355909705162048 2023-01-24 01:11:18.898813: step: 416/466, loss: 1.0369325876235962 2023-01-24 01:11:19.506368: step: 418/466, loss: 0.5004268884658813 2023-01-24 01:11:20.181147: step: 420/466, loss: 0.41801849007606506 2023-01-24 01:11:20.779934: step: 422/466, loss: 1.4625771045684814 2023-01-24 01:11:21.452533: step: 424/466, loss: 1.1229809522628784 2023-01-24 01:11:22.010050: step: 426/466, loss: 0.8910166025161743 2023-01-24 01:11:22.628086: step: 428/466, loss: 0.22723639011383057 2023-01-24 01:11:23.260396: step: 430/466, loss: 1.198258638381958 2023-01-24 01:11:23.907989: step: 432/466, loss: 1.2684011459350586 2023-01-24 01:11:24.555867: step: 434/466, loss: 0.47714242339134216 2023-01-24 01:11:25.192324: step: 436/466, loss: 0.28571125864982605 2023-01-24 01:11:25.828079: step: 438/466, loss: 1.4924527406692505 2023-01-24 
01:11:26.499374: step: 440/466, loss: 3.09102463722229 2023-01-24 01:11:27.131346: step: 442/466, loss: 2.5959880352020264 2023-01-24 01:11:27.787364: step: 444/466, loss: 0.4722447693347931 2023-01-24 01:11:28.422754: step: 446/466, loss: 0.25088632106781006 2023-01-24 01:11:29.009934: step: 448/466, loss: 0.6037149429321289 2023-01-24 01:11:29.651059: step: 450/466, loss: 0.531111478805542 2023-01-24 01:11:30.293195: step: 452/466, loss: 0.4763965606689453 2023-01-24 01:11:30.957337: step: 454/466, loss: 0.38432395458221436 2023-01-24 01:11:31.561077: step: 456/466, loss: 0.7968324422836304 2023-01-24 01:11:32.215227: step: 458/466, loss: 2.104541778564453 2023-01-24 01:11:32.898246: step: 460/466, loss: 0.8049629330635071 2023-01-24 01:11:33.567999: step: 462/466, loss: 0.14413891732692719 2023-01-24 01:11:34.151263: step: 464/466, loss: 1.6799324750900269 2023-01-24 01:11:34.786350: step: 466/466, loss: 1.1222249269485474 2023-01-24 01:11:35.421789: step: 468/466, loss: 0.7136576771736145 2023-01-24 01:11:36.067479: step: 470/466, loss: 0.4166320264339447 2023-01-24 01:11:36.615005: step: 472/466, loss: 1.827630639076233 2023-01-24 01:11:37.201861: step: 474/466, loss: 0.5749404430389404 2023-01-24 01:11:37.843235: step: 476/466, loss: 0.9107766151428223 2023-01-24 01:11:38.429314: step: 478/466, loss: 1.3580448627471924 2023-01-24 01:11:39.066790: step: 480/466, loss: 0.20735961198806763 2023-01-24 01:11:39.653562: step: 482/466, loss: 0.33012843132019043 2023-01-24 01:11:40.300743: step: 484/466, loss: 0.29290837049484253 2023-01-24 01:11:40.959692: step: 486/466, loss: 0.10790805518627167 2023-01-24 01:11:41.580696: step: 488/466, loss: 1.6741561889648438 2023-01-24 01:11:42.206518: step: 490/466, loss: 2.871872663497925 2023-01-24 01:11:42.785970: step: 492/466, loss: 0.17955073714256287 2023-01-24 01:11:43.401730: step: 494/466, loss: 0.313848078250885 2023-01-24 01:11:44.039458: step: 496/466, loss: 0.4093271791934967 2023-01-24 01:11:44.754514: step: 
498/466, loss: 0.2761945426464081 2023-01-24 01:11:45.439551: step: 500/466, loss: 0.9821873903274536 2023-01-24 01:11:46.113154: step: 502/466, loss: 0.5871090888977051 2023-01-24 01:11:46.711029: step: 504/466, loss: 0.8444579839706421 2023-01-24 01:11:47.281799: step: 506/466, loss: 1.1011338233947754 2023-01-24 01:11:47.934712: step: 508/466, loss: 0.659293532371521 2023-01-24 01:11:48.472428: step: 510/466, loss: 0.5439980030059814 2023-01-24 01:11:49.109022: step: 512/466, loss: 0.9268401861190796 2023-01-24 01:11:49.715869: step: 514/466, loss: 0.9230048656463623 2023-01-24 01:11:50.312915: step: 516/466, loss: 0.3950723707675934 2023-01-24 01:11:50.875124: step: 518/466, loss: 3.7305030822753906 2023-01-24 01:11:51.518096: step: 520/466, loss: 1.420408844947815 2023-01-24 01:11:52.164535: step: 522/466, loss: 0.12743352353572845 2023-01-24 01:11:52.755036: step: 524/466, loss: 0.30235588550567627 2023-01-24 01:11:53.392367: step: 526/466, loss: 0.5124424695968628 2023-01-24 01:11:53.990261: step: 528/466, loss: 1.0410754680633545 2023-01-24 01:11:54.644515: step: 530/466, loss: 1.2636866569519043 2023-01-24 01:11:55.282638: step: 532/466, loss: 0.19068653881549835 2023-01-24 01:11:55.914770: step: 534/466, loss: 1.2769172191619873 2023-01-24 01:11:56.570324: step: 536/466, loss: 1.0977855920791626 2023-01-24 01:11:57.196809: step: 538/466, loss: 0.31804510951042175 2023-01-24 01:11:57.874983: step: 540/466, loss: 1.8412132263183594 2023-01-24 01:11:58.478207: step: 542/466, loss: 0.6204238533973694 2023-01-24 01:11:59.036259: step: 544/466, loss: 1.0468567609786987 2023-01-24 01:11:59.580679: step: 546/466, loss: 0.43631964921951294 2023-01-24 01:12:00.244261: step: 548/466, loss: 0.2319594770669937 2023-01-24 01:12:00.934562: step: 550/466, loss: 1.136697769165039 2023-01-24 01:12:01.538911: step: 552/466, loss: 1.1643078327178955 2023-01-24 01:12:02.190839: step: 554/466, loss: 0.256322979927063 2023-01-24 01:12:02.770131: step: 556/466, loss: 
0.3079525828361511 2023-01-24 01:12:03.392487: step: 558/466, loss: 0.3707126975059509 2023-01-24 01:12:03.988576: step: 560/466, loss: 0.6252129077911377 2023-01-24 01:12:04.584788: step: 562/466, loss: 3.6634650230407715 2023-01-24 01:12:05.201162: step: 564/466, loss: 0.16154566407203674 2023-01-24 01:12:05.815645: step: 566/466, loss: 0.3450652062892914 2023-01-24 01:12:06.434260: step: 568/466, loss: 1.1304841041564941 2023-01-24 01:12:07.087322: step: 570/466, loss: 0.6155608892440796 2023-01-24 01:12:07.690254: step: 572/466, loss: 0.22098317742347717 2023-01-24 01:12:08.314098: step: 574/466, loss: 0.6425347924232483 2023-01-24 01:12:08.921123: step: 576/466, loss: 0.8756067156791687 2023-01-24 01:12:09.541941: step: 578/466, loss: 0.604966938495636 2023-01-24 01:12:10.199438: step: 580/466, loss: 0.39261394739151 2023-01-24 01:12:10.829758: step: 582/466, loss: 0.5821620225906372 2023-01-24 01:12:11.419419: step: 584/466, loss: 0.46261516213417053 2023-01-24 01:12:12.042674: step: 586/466, loss: 0.4566130042076111 2023-01-24 01:12:12.685955: step: 588/466, loss: 0.9882059097290039 2023-01-24 01:12:13.359485: step: 590/466, loss: 0.4548759460449219 2023-01-24 01:12:13.989918: step: 592/466, loss: 0.44460439682006836 2023-01-24 01:12:14.740027: step: 594/466, loss: 0.1863112598657608 2023-01-24 01:12:15.378215: step: 596/466, loss: 0.49223336577415466 2023-01-24 01:12:15.961331: step: 598/466, loss: 0.948884129524231 2023-01-24 01:12:16.635457: step: 600/466, loss: 0.27376243472099304 2023-01-24 01:12:17.261619: step: 602/466, loss: 1.7389003038406372 2023-01-24 01:12:17.861888: step: 604/466, loss: 0.8352102041244507 2023-01-24 01:12:18.480771: step: 606/466, loss: 0.22336582839488983 2023-01-24 01:12:19.216402: step: 608/466, loss: 2.5565452575683594 2023-01-24 01:12:19.895809: step: 610/466, loss: 0.2957126796245575 2023-01-24 01:12:20.493291: step: 612/466, loss: 0.30108410120010376 2023-01-24 01:12:21.087963: step: 614/466, loss: 1.1393479108810425 
2023-01-24 01:12:21.727819: step: 616/466, loss: 1.0807772874832153 2023-01-24 01:12:22.376000: step: 618/466, loss: 2.6531262397766113 2023-01-24 01:12:22.984578: step: 620/466, loss: 0.8356918096542358 2023-01-24 01:12:23.630698: step: 622/466, loss: 0.18988583981990814 2023-01-24 01:12:24.190266: step: 624/466, loss: 0.49750959873199463 2023-01-24 01:12:24.837771: step: 626/466, loss: 0.16339921951293945 2023-01-24 01:12:25.386178: step: 628/466, loss: 0.39217299222946167 2023-01-24 01:12:26.060051: step: 630/466, loss: 0.6528757810592651 2023-01-24 01:12:26.677646: step: 632/466, loss: 0.5883387923240662 2023-01-24 01:12:27.308408: step: 634/466, loss: 1.029714584350586 2023-01-24 01:12:27.906391: step: 636/466, loss: 0.3076227009296417 2023-01-24 01:12:28.505417: step: 638/466, loss: 0.39846861362457275 2023-01-24 01:12:29.161103: step: 640/466, loss: 0.7532325983047485 2023-01-24 01:12:29.730267: step: 642/466, loss: 0.3744526505470276 2023-01-24 01:12:30.345439: step: 644/466, loss: 0.7421638369560242 2023-01-24 01:12:31.004293: step: 646/466, loss: 0.5743134021759033 2023-01-24 01:12:31.563576: step: 648/466, loss: 0.40305227041244507 2023-01-24 01:12:32.196411: step: 650/466, loss: 0.5777412056922913 2023-01-24 01:12:32.867800: step: 652/466, loss: 1.007772445678711 2023-01-24 01:12:33.486850: step: 654/466, loss: 0.6450772881507874 2023-01-24 01:12:34.112045: step: 656/466, loss: 0.4225735664367676 2023-01-24 01:12:34.712957: step: 658/466, loss: 1.1234859228134155 2023-01-24 01:12:35.379541: step: 660/466, loss: 0.17602777481079102 2023-01-24 01:12:36.013270: step: 662/466, loss: 0.2158088982105255 2023-01-24 01:12:36.609618: step: 664/466, loss: 0.22115278244018555 2023-01-24 01:12:37.204651: step: 666/466, loss: 0.45712563395500183 2023-01-24 01:12:37.832929: step: 668/466, loss: 0.2096031755208969 2023-01-24 01:12:38.426904: step: 670/466, loss: 0.339317262172699 2023-01-24 01:12:39.021588: step: 672/466, loss: 0.2058376669883728 2023-01-24 
01:12:39.584379: step: 674/466, loss: 0.09214982390403748 2023-01-24 01:12:40.254269: step: 676/466, loss: 1.1575394868850708 2023-01-24 01:12:40.919984: step: 678/466, loss: 0.8664871454238892 2023-01-24 01:12:41.563834: step: 680/466, loss: 1.3440394401550293 2023-01-24 01:12:42.118862: step: 682/466, loss: 0.9377317428588867 2023-01-24 01:12:42.769637: step: 684/466, loss: 0.37596309185028076 2023-01-24 01:12:43.396057: step: 686/466, loss: 0.8694080114364624 2023-01-24 01:12:44.017739: step: 688/466, loss: 0.4781648516654968 2023-01-24 01:12:44.645206: step: 690/466, loss: 0.9240776896476746 2023-01-24 01:12:45.313439: step: 692/466, loss: 0.46945273876190186 2023-01-24 01:12:45.887413: step: 694/466, loss: 0.32516801357269287 2023-01-24 01:12:46.514896: step: 696/466, loss: 0.5208209753036499 2023-01-24 01:12:47.122870: step: 698/466, loss: 0.23446114361286163 2023-01-24 01:12:47.751015: step: 700/466, loss: 0.661338210105896 2023-01-24 01:12:48.349993: step: 702/466, loss: 0.2834647595882416 2023-01-24 01:12:48.965370: step: 704/466, loss: 0.679925799369812 2023-01-24 01:12:49.541639: step: 706/466, loss: 0.9928429126739502 2023-01-24 01:12:50.116574: step: 708/466, loss: 0.15573279559612274 2023-01-24 01:12:50.754065: step: 710/466, loss: 0.8083428740501404 2023-01-24 01:12:51.353903: step: 712/466, loss: 1.1022380590438843 2023-01-24 01:12:52.038874: step: 714/466, loss: 2.9695441722869873 2023-01-24 01:12:52.676757: step: 716/466, loss: 0.32324570417404175 2023-01-24 01:12:53.330615: step: 718/466, loss: 0.26076310873031616 2023-01-24 01:12:53.980534: step: 720/466, loss: 0.4587477147579193 2023-01-24 01:12:54.549371: step: 722/466, loss: 1.2385427951812744 2023-01-24 01:12:55.196786: step: 724/466, loss: 0.2180221974849701 2023-01-24 01:12:55.825285: step: 726/466, loss: 0.4084261655807495 2023-01-24 01:12:56.480953: step: 728/466, loss: 1.120225429534912 2023-01-24 01:12:57.074584: step: 730/466, loss: 0.4246133863925934 2023-01-24 01:12:57.746444: step: 
732/466, loss: 0.336835652589798 2023-01-24 01:12:58.332297: step: 734/466, loss: 0.4902934432029724 2023-01-24 01:12:58.894979: step: 736/466, loss: 1.239622950553894 2023-01-24 01:12:59.544726: step: 738/466, loss: 0.3462580442428589 2023-01-24 01:13:00.164099: step: 740/466, loss: 0.4404885768890381 2023-01-24 01:13:00.798968: step: 742/466, loss: 0.22442424297332764 2023-01-24 01:13:01.386737: step: 744/466, loss: 7.0663228034973145 2023-01-24 01:13:01.986078: step: 746/466, loss: 0.3398008942604065 2023-01-24 01:13:02.608148: step: 748/466, loss: 0.8893722295761108 2023-01-24 01:13:03.279632: step: 750/466, loss: 0.17342069745063782 2023-01-24 01:13:03.934527: step: 752/466, loss: 0.8895680904388428 2023-01-24 01:13:04.537015: step: 754/466, loss: 0.7023111581802368 2023-01-24 01:13:05.129817: step: 756/466, loss: 0.26367777585983276 2023-01-24 01:13:05.681576: step: 758/466, loss: 0.9022852182388306 2023-01-24 01:13:06.372222: step: 760/466, loss: 0.9276801943778992 2023-01-24 01:13:06.992276: step: 762/466, loss: 0.7745717167854309 2023-01-24 01:13:07.650994: step: 764/466, loss: 0.598774790763855 2023-01-24 01:13:08.215352: step: 766/466, loss: 0.44912904500961304 2023-01-24 01:13:08.909447: step: 768/466, loss: 0.24844907224178314 2023-01-24 01:13:09.578966: step: 770/466, loss: 0.272910475730896 2023-01-24 01:13:10.151319: step: 772/466, loss: 0.25028693675994873 2023-01-24 01:13:10.747970: step: 774/466, loss: 0.5121179223060608 2023-01-24 01:13:11.301278: step: 776/466, loss: 1.4639263153076172 2023-01-24 01:13:12.009249: step: 778/466, loss: 0.681140661239624 2023-01-24 01:13:12.669764: step: 780/466, loss: 0.8039424419403076 2023-01-24 01:13:13.332938: step: 782/466, loss: 0.35990825295448303 2023-01-24 01:13:13.906433: step: 784/466, loss: 0.34313514828681946 2023-01-24 01:13:14.478538: step: 786/466, loss: 0.5792441368103027 2023-01-24 01:13:15.289250: step: 788/466, loss: 0.6026006937026978 2023-01-24 01:13:15.880494: step: 790/466, loss: 
7.832659721374512 2023-01-24 01:13:16.592066: step: 792/466, loss: 1.1409316062927246 2023-01-24 01:13:17.149309: step: 794/466, loss: 0.5021165013313293 2023-01-24 01:13:17.804672: step: 796/466, loss: 1.1863479614257812 2023-01-24 01:13:18.490230: step: 798/466, loss: 1.0290889739990234 2023-01-24 01:13:19.162572: step: 800/466, loss: 0.2763671875 2023-01-24 01:13:19.784517: step: 802/466, loss: 1.0786328315734863 2023-01-24 01:13:20.457085: step: 804/466, loss: 0.641802966594696 2023-01-24 01:13:21.075258: step: 806/466, loss: 0.19582396745681763 2023-01-24 01:13:21.729636: step: 808/466, loss: 0.8715490698814392 2023-01-24 01:13:22.250686: step: 810/466, loss: 0.2146126627922058 2023-01-24 01:13:22.933822: step: 812/466, loss: 0.2053159922361374 2023-01-24 01:13:23.557666: step: 814/466, loss: 3.9195573329925537 2023-01-24 01:13:24.280854: step: 816/466, loss: 0.8048123121261597 2023-01-24 01:13:24.854196: step: 818/466, loss: 2.6136231422424316 2023-01-24 01:13:25.477879: step: 820/466, loss: 0.85329270362854 2023-01-24 01:13:26.103551: step: 822/466, loss: 0.8017750978469849 2023-01-24 01:13:26.701931: step: 824/466, loss: 0.26032787561416626 2023-01-24 01:13:27.313348: step: 826/466, loss: 1.9249786138534546 2023-01-24 01:13:27.920111: step: 828/466, loss: 0.6099420189857483 2023-01-24 01:13:28.582344: step: 830/466, loss: 0.21212445199489594 2023-01-24 01:13:29.204237: step: 832/466, loss: 0.5787874460220337 2023-01-24 01:13:29.824751: step: 834/466, loss: 0.5170060992240906 2023-01-24 01:13:30.483305: step: 836/466, loss: 0.4283927083015442 2023-01-24 01:13:31.088772: step: 838/466, loss: 0.8669469356536865 2023-01-24 01:13:31.744367: step: 840/466, loss: 1.6597884893417358 2023-01-24 01:13:32.396611: step: 842/466, loss: 1.3498128652572632 2023-01-24 01:13:33.047666: step: 844/466, loss: 1.7638030052185059 2023-01-24 01:13:33.698765: step: 846/466, loss: 0.43602848052978516 2023-01-24 01:13:34.406362: step: 848/466, loss: 0.6229745149612427 2023-01-24 
01:13:35.002524: step: 850/466, loss: 0.6166430115699768 2023-01-24 01:13:35.635419: step: 852/466, loss: 0.5322221517562866 2023-01-24 01:13:36.197599: step: 854/466, loss: 0.3037514388561249 2023-01-24 01:13:36.850847: step: 856/466, loss: 0.4232807755470276 2023-01-24 01:13:37.460277: step: 858/466, loss: 0.23372140526771545 2023-01-24 01:13:38.069793: step: 860/466, loss: 1.3362098932266235 2023-01-24 01:13:38.659290: step: 862/466, loss: 0.38267797231674194 2023-01-24 01:13:39.285150: step: 864/466, loss: 0.8275524377822876 2023-01-24 01:13:39.914338: step: 866/466, loss: 1.5383274555206299 2023-01-24 01:13:40.650181: step: 868/466, loss: 0.17373280227184296 2023-01-24 01:13:41.231224: step: 870/466, loss: 0.28224244713783264 2023-01-24 01:13:41.819775: step: 872/466, loss: 0.4439685344696045 2023-01-24 01:13:42.484852: step: 874/466, loss: 0.5664662718772888 2023-01-24 01:13:43.033273: step: 876/466, loss: 1.4287652969360352 2023-01-24 01:13:43.731277: step: 878/466, loss: 1.4793837070465088 2023-01-24 01:13:44.359224: step: 880/466, loss: 1.6305980682373047 2023-01-24 01:13:45.004553: step: 882/466, loss: 2.0438435077667236 2023-01-24 01:13:45.634835: step: 884/466, loss: 0.4367019534111023 2023-01-24 01:13:46.306331: step: 886/466, loss: 6.0026535987854 2023-01-24 01:13:46.897816: step: 888/466, loss: 0.6394232511520386 2023-01-24 01:13:47.500331: step: 890/466, loss: 0.775500476360321 2023-01-24 01:13:48.187454: step: 892/466, loss: 0.55797278881073 2023-01-24 01:13:48.823248: step: 894/466, loss: 0.5209435820579529 2023-01-24 01:13:49.459759: step: 896/466, loss: 3.8680386543273926 2023-01-24 01:13:50.062594: step: 898/466, loss: 0.9209475517272949 2023-01-24 01:13:50.685734: step: 900/466, loss: 0.7050312757492065 2023-01-24 01:13:51.287216: step: 902/466, loss: 0.5102740526199341 2023-01-24 01:13:51.927663: step: 904/466, loss: 0.529159665107727 2023-01-24 01:13:52.548921: step: 906/466, loss: 0.5836001038551331 2023-01-24 01:13:53.178716: step: 
908/466, loss: 0.8525623679161072 2023-01-24 01:13:53.778855: step: 910/466, loss: 10.857308387756348 2023-01-24 01:13:54.425565: step: 912/466, loss: 0.52732914686203 2023-01-24 01:13:55.027126: step: 914/466, loss: 0.9026237726211548 2023-01-24 01:13:55.621323: step: 916/466, loss: 1.7058541774749756 2023-01-24 01:13:56.205900: step: 918/466, loss: 1.7477123737335205 2023-01-24 01:13:56.851744: step: 920/466, loss: 0.24393534660339355 2023-01-24 01:13:57.532549: step: 922/466, loss: 0.695246696472168 2023-01-24 01:13:58.178209: step: 924/466, loss: 0.23524464666843414 2023-01-24 01:13:58.839079: step: 926/466, loss: 0.31187599897384644 2023-01-24 01:13:59.519953: step: 928/466, loss: 0.3860665261745453 2023-01-24 01:14:00.113961: step: 930/466, loss: 0.6614386439323425 2023-01-24 01:14:00.844616: step: 932/466, loss: 2.147660255432129 ================================================== Loss: 0.800 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3426994403339025, 'r': 0.326442351134002, 'f1': 0.3343734092276366}, 'combined': 0.24638040679931117, 'epoch': 6} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3829430975684692, 'r': 0.2621003488086857, 'f1': 0.3112023539198586}, 'combined': 0.20639327099348134, 'epoch': 6} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32754054520358866, 'r': 0.25682156385281385, 'f1': 0.2879018804974219}, 'combined': 0.19193458699828125, 'epoch': 6} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.4086617349823993, 'r': 0.2485057578790473, 'f1': 0.30906822163639414}, 'combined': 0.2017076814890151, 'epoch': 6} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31945704143331427, 'r': 0.3127890576462812, 'f1': 0.31608788759269446}, 
'combined': 0.2329068645419854, 'epoch': 6} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37083846239905494, 'r': 0.25521096594043485, 'f1': 0.3023468688335152}, 'combined': 0.20052020316419658, 'epoch': 6} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3494623655913978, 'r': 0.3095238095238095, 'f1': 0.32828282828282823}, 'combined': 0.2188552188552188, 'epoch': 6} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.2608695652173913, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 6} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.10344827586206896, 'f1': 0.15789473684210528}, 'combined': 0.10526315789473685, 'epoch': 6} New best chinese model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3426994403339025, 'r': 0.326442351134002, 'f1': 0.3343734092276366}, 'combined': 0.24638040679931117, 'epoch': 6} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3829430975684692, 'r': 0.2621003488086857, 'f1': 0.3112023539198586}, 'combined': 0.20639327099348134, 'epoch': 6} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3494623655913978, 'r': 0.3095238095238095, 'f1': 0.32828282828282823}, 'combined': 0.2188552188552188, 'epoch': 6} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2959473164135636, 'r': 0.24447821790685687, 'f1': 0.2677618577075099}, 'combined': 0.17850790513833992, 'epoch': 5} Test for Korean: {'template': {'p': 0.96875, 'r': 
0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37089076752905875, 'r': 0.2591418609488748, 'f1': 0.30510586075020424}, 'combined': 0.1991217196475017, 'epoch': 5} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5357142857142857, 'r': 0.32608695652173914, 'f1': 0.40540540540540543}, 'combined': 0.2702702702702703, 'epoch': 5} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31945704143331427, 'r': 0.3127890576462812, 'f1': 0.31608788759269446}, 'combined': 0.2329068645419854, 'epoch': 6} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37083846239905494, 'r': 0.25521096594043485, 'f1': 0.3023468688335152}, 'combined': 0.20052020316419658, 'epoch': 6} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.10344827586206896, 'f1': 0.15789473684210528}, 'combined': 0.10526315789473685, 'epoch': 6} ****************************** Epoch: 7 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:16:41.591834: step: 2/466, loss: 0.5861604809761047 2023-01-24 01:16:42.223844: step: 4/466, loss: 0.4408992528915405 2023-01-24 01:16:42.827107: step: 6/466, loss: 0.28753623366355896 2023-01-24 01:16:43.466917: step: 8/466, loss: 0.32058003544807434 2023-01-24 01:16:44.084746: step: 10/466, loss: 0.3608631491661072 2023-01-24 01:16:44.656527: step: 12/466, loss: 0.8042821288108826 2023-01-24 01:16:45.242076: step: 14/466, loss: 0.22324897348880768 2023-01-24 01:16:45.860705: step: 16/466, loss: 0.26565486192703247 2023-01-24 01:16:46.477008: step: 18/466, loss: 0.21577036380767822 2023-01-24 01:16:47.139527: step: 20/466, loss: 
0.7381148338317871 2023-01-24 01:16:47.740046: step: 22/466, loss: 0.22094513475894928 2023-01-24 01:16:48.410184: step: 24/466, loss: 0.5388932228088379 2023-01-24 01:16:49.095099: step: 26/466, loss: 0.2958400845527649 2023-01-24 01:16:49.782977: step: 28/466, loss: 2.005858898162842 2023-01-24 01:16:50.446742: step: 30/466, loss: 0.5006194114685059 2023-01-24 01:16:51.098360: step: 32/466, loss: 0.5860127210617065 2023-01-24 01:16:51.717552: step: 34/466, loss: 0.5371690392494202 2023-01-24 01:16:52.319927: step: 36/466, loss: 0.2739104628562927 2023-01-24 01:16:52.917179: step: 38/466, loss: 0.2741168737411499 2023-01-24 01:16:53.632911: step: 40/466, loss: 0.3612619936466217 2023-01-24 01:16:54.249684: step: 42/466, loss: 0.37896281480789185 2023-01-24 01:16:54.972183: step: 44/466, loss: 0.26077574491500854 2023-01-24 01:16:55.605914: step: 46/466, loss: 0.7730565071105957 2023-01-24 01:16:56.195389: step: 48/466, loss: 0.4118739664554596 2023-01-24 01:16:56.843445: step: 50/466, loss: 0.3541674017906189 2023-01-24 01:16:57.483869: step: 52/466, loss: 0.5203198790550232 2023-01-24 01:16:58.175523: step: 54/466, loss: 0.3168627619743347 2023-01-24 01:16:58.812188: step: 56/466, loss: 0.3129294812679291 2023-01-24 01:16:59.409458: step: 58/466, loss: 0.4802446663379669 2023-01-24 01:16:59.996961: step: 60/466, loss: 0.24453146755695343 2023-01-24 01:17:00.652434: step: 62/466, loss: 0.6423302292823792 2023-01-24 01:17:01.217673: step: 64/466, loss: 0.41287142038345337 2023-01-24 01:17:01.834045: step: 66/466, loss: 0.19917523860931396 2023-01-24 01:17:02.406060: step: 68/466, loss: 0.5443085432052612 2023-01-24 01:17:03.010339: step: 70/466, loss: 0.5711265206336975 2023-01-24 01:17:03.616620: step: 72/466, loss: 0.673648476600647 2023-01-24 01:17:04.233717: step: 74/466, loss: 0.23313216865062714 2023-01-24 01:17:04.872142: step: 76/466, loss: 2.2076218128204346 2023-01-24 01:17:05.525146: step: 78/466, loss: 0.2055993527173996 2023-01-24 01:17:06.193701: 
step: 80/466, loss: 0.3096136152744293 2023-01-24 01:17:06.785039: step: 82/466, loss: 0.21365565061569214 2023-01-24 01:17:07.373464: step: 84/466, loss: 0.1641816794872284 2023-01-24 01:17:07.954958: step: 86/466, loss: 0.8886588215827942 2023-01-24 01:17:08.578383: step: 88/466, loss: 0.29702797532081604 2023-01-24 01:17:09.188679: step: 90/466, loss: 0.40896323323249817 2023-01-24 01:17:09.836615: step: 92/466, loss: 0.31896260380744934 2023-01-24 01:17:10.502548: step: 94/466, loss: 1.0673526525497437 2023-01-24 01:17:11.094728: step: 96/466, loss: 0.3697822093963623 2023-01-24 01:17:11.728759: step: 98/466, loss: 0.2739983797073364 2023-01-24 01:17:12.329835: step: 100/466, loss: 0.6867333650588989 2023-01-24 01:17:12.985939: step: 102/466, loss: 0.7206767797470093 2023-01-24 01:17:13.648271: step: 104/466, loss: 1.2624249458312988 2023-01-24 01:17:14.260217: step: 106/466, loss: 0.3790944218635559 2023-01-24 01:17:14.841384: step: 108/466, loss: 0.05347498878836632 2023-01-24 01:17:15.498034: step: 110/466, loss: 1.041469931602478 2023-01-24 01:17:16.083396: step: 112/466, loss: 0.3049376904964447 2023-01-24 01:17:16.825035: step: 114/466, loss: 0.6779406070709229 2023-01-24 01:17:17.498906: step: 116/466, loss: 0.5164543390274048 2023-01-24 01:17:18.163041: step: 118/466, loss: 0.5160636305809021 2023-01-24 01:17:18.771714: step: 120/466, loss: 1.2827353477478027 2023-01-24 01:17:19.399135: step: 122/466, loss: 0.34198516607284546 2023-01-24 01:17:20.051491: step: 124/466, loss: 0.8944591879844666 2023-01-24 01:17:20.732545: step: 126/466, loss: 0.5568956136703491 2023-01-24 01:17:21.387058: step: 128/466, loss: 0.6809290647506714 2023-01-24 01:17:22.030753: step: 130/466, loss: 0.43843433260917664 2023-01-24 01:17:22.635272: step: 132/466, loss: 0.6921073794364929 2023-01-24 01:17:23.274211: step: 134/466, loss: 0.665354311466217 2023-01-24 01:17:23.948036: step: 136/466, loss: 0.23547694087028503 2023-01-24 01:17:24.601036: step: 138/466, loss: 
0.5738288164138794 2023-01-24 01:17:25.185295: step: 140/466, loss: 0.1683214008808136 2023-01-24 01:17:25.892315: step: 142/466, loss: 0.5310552716255188 2023-01-24 01:17:26.533679: step: 144/466, loss: 0.7672767043113708 2023-01-24 01:17:27.134062: step: 146/466, loss: 0.8567443490028381 2023-01-24 01:17:27.685715: step: 148/466, loss: 0.34494611620903015 2023-01-24 01:17:28.273073: step: 150/466, loss: 0.6466522812843323 2023-01-24 01:17:28.926589: step: 152/466, loss: 0.6802599430084229 2023-01-24 01:17:29.566080: step: 154/466, loss: 0.4298156201839447 2023-01-24 01:17:30.371551: step: 156/466, loss: 0.4231945276260376 2023-01-24 01:17:30.995896: step: 158/466, loss: 1.2339304685592651 2023-01-24 01:17:31.654812: step: 160/466, loss: 0.2515791952610016 2023-01-24 01:17:32.311615: step: 162/466, loss: 0.16225486993789673 2023-01-24 01:17:32.877154: step: 164/466, loss: 1.2626041173934937 2023-01-24 01:17:33.507082: step: 166/466, loss: 0.9185038805007935 2023-01-24 01:17:34.136894: step: 168/466, loss: 0.8705633878707886 2023-01-24 01:17:34.790459: step: 170/466, loss: 0.5381495952606201 2023-01-24 01:17:35.381421: step: 172/466, loss: 0.6132254600524902 2023-01-24 01:17:35.991565: step: 174/466, loss: 0.2988283038139343 2023-01-24 01:17:36.712579: step: 176/466, loss: 0.2567707300186157 2023-01-24 01:17:37.386370: step: 178/466, loss: 0.30163484811782837 2023-01-24 01:17:38.044513: step: 180/466, loss: 0.5373499393463135 2023-01-24 01:17:38.724803: step: 182/466, loss: 0.3271122872829437 2023-01-24 01:17:39.357732: step: 184/466, loss: 0.9332323670387268 2023-01-24 01:17:39.967722: step: 186/466, loss: 0.2196381390094757 2023-01-24 01:17:40.569906: step: 188/466, loss: 0.23258720338344574 2023-01-24 01:17:41.304181: step: 190/466, loss: 0.4428851008415222 2023-01-24 01:17:41.908639: step: 192/466, loss: 0.18682517111301422 2023-01-24 01:17:42.520909: step: 194/466, loss: 0.07872353494167328 2023-01-24 01:17:43.197583: step: 196/466, loss: 0.6266285181045532 
2023-01-24 01:17:43.853096: step: 198/466, loss: 0.2624841034412384 2023-01-24 01:17:44.475047: step: 200/466, loss: 0.17764367163181305 2023-01-24 01:17:45.057752: step: 202/466, loss: 0.27667200565338135 2023-01-24 01:17:45.676864: step: 204/466, loss: 0.8821558952331543 2023-01-24 01:17:46.227625: step: 206/466, loss: 0.6958469152450562 2023-01-24 01:17:46.820566: step: 208/466, loss: 0.25369971990585327 2023-01-24 01:17:47.400101: step: 210/466, loss: 0.27209770679473877 2023-01-24 01:17:47.978951: step: 212/466, loss: 0.34161660075187683 2023-01-24 01:17:48.598975: step: 214/466, loss: 1.7664700746536255 2023-01-24 01:17:49.242032: step: 216/466, loss: 1.5650590658187866 2023-01-24 01:17:49.912728: step: 218/466, loss: 0.4417518079280853 2023-01-24 01:17:50.551620: step: 220/466, loss: 0.36913424730300903 2023-01-24 01:17:51.164843: step: 222/466, loss: 3.28045916557312 2023-01-24 01:17:51.793675: step: 224/466, loss: 0.22563347220420837 2023-01-24 01:17:52.369404: step: 226/466, loss: 0.5703184008598328 2023-01-24 01:17:53.040703: step: 228/466, loss: 0.26921722292900085 2023-01-24 01:17:53.680821: step: 230/466, loss: 0.4393335282802582 2023-01-24 01:17:54.272015: step: 232/466, loss: 0.4259721040725708 2023-01-24 01:17:54.989452: step: 234/466, loss: 0.39365753531455994 2023-01-24 01:17:55.664234: step: 236/466, loss: 1.1271625757217407 2023-01-24 01:17:56.257225: step: 238/466, loss: 0.3785664141178131 2023-01-24 01:17:56.903148: step: 240/466, loss: 0.2514948546886444 2023-01-24 01:17:57.512477: step: 242/466, loss: 0.4495410919189453 2023-01-24 01:17:58.173405: step: 244/466, loss: 0.35675227642059326 2023-01-24 01:17:58.788372: step: 246/466, loss: 0.4869801104068756 2023-01-24 01:17:59.387517: step: 248/466, loss: 0.3292056918144226 2023-01-24 01:18:00.025309: step: 250/466, loss: 0.6296101212501526 2023-01-24 01:18:00.615831: step: 252/466, loss: 0.13931596279144287 2023-01-24 01:18:01.238826: step: 254/466, loss: 0.5509451627731323 2023-01-24 
01:18:01.907402: step: 256/466, loss: 0.2474900782108307 2023-01-24 01:18:02.607215: step: 258/466, loss: 1.371412992477417 2023-01-24 01:18:03.197339: step: 260/466, loss: 0.63140869140625 2023-01-24 01:18:03.827777: step: 262/466, loss: 0.3595554530620575 2023-01-24 01:18:04.511597: step: 264/466, loss: 2.6054558753967285 2023-01-24 01:18:05.124763: step: 266/466, loss: 3.7890048027038574 2023-01-24 01:18:05.722466: step: 268/466, loss: 0.9801911115646362 2023-01-24 01:18:06.301606: step: 270/466, loss: 0.2202795296907425 2023-01-24 01:18:06.870901: step: 272/466, loss: 0.42003333568573 2023-01-24 01:18:07.531066: step: 274/466, loss: 1.0269274711608887 2023-01-24 01:18:08.199736: step: 276/466, loss: 0.3591936230659485 2023-01-24 01:18:08.806093: step: 278/466, loss: 0.44897300004959106 2023-01-24 01:18:09.384494: step: 280/466, loss: 0.24281011521816254 2023-01-24 01:18:10.040228: step: 282/466, loss: 0.893500566482544 2023-01-24 01:18:10.642201: step: 284/466, loss: 0.6132438778877258 2023-01-24 01:18:11.303825: step: 286/466, loss: 0.658779501914978 2023-01-24 01:18:11.897306: step: 288/466, loss: 0.4819200932979584 2023-01-24 01:18:12.481742: step: 290/466, loss: 2.5244646072387695 2023-01-24 01:18:13.078314: step: 292/466, loss: 1.0741807222366333 2023-01-24 01:18:13.842438: step: 294/466, loss: 0.8289614319801331 2023-01-24 01:18:14.493134: step: 296/466, loss: 0.5560529828071594 2023-01-24 01:18:15.120169: step: 298/466, loss: 0.3289271295070648 2023-01-24 01:18:15.724151: step: 300/466, loss: 0.3868092894554138 2023-01-24 01:18:16.357484: step: 302/466, loss: 0.20935015380382538 2023-01-24 01:18:17.037256: step: 304/466, loss: 0.40141284465789795 2023-01-24 01:18:17.673025: step: 306/466, loss: 0.3986812233924866 2023-01-24 01:18:18.349947: step: 308/466, loss: 0.2795734703540802 2023-01-24 01:18:18.927887: step: 310/466, loss: 0.24247369170188904 2023-01-24 01:18:19.530349: step: 312/466, loss: 0.23847003281116486 2023-01-24 01:18:20.122011: step: 
314/466, loss: 0.14001093804836273 2023-01-24 01:18:20.743689: step: 316/466, loss: 0.5716454386711121 2023-01-24 01:18:21.381141: step: 318/466, loss: 0.2892405390739441 2023-01-24 01:18:22.050383: step: 320/466, loss: 0.36249661445617676 2023-01-24 01:18:22.645836: step: 322/466, loss: 0.31691575050354004 2023-01-24 01:18:23.327842: step: 324/466, loss: 0.4308742582798004 2023-01-24 01:18:23.997550: step: 326/466, loss: 0.8795764446258545 2023-01-24 01:18:24.557035: step: 328/466, loss: 0.5361948609352112 2023-01-24 01:18:25.164492: step: 330/466, loss: 0.13352420926094055 2023-01-24 01:18:25.730301: step: 332/466, loss: 0.5865218043327332 2023-01-24 01:18:26.364509: step: 334/466, loss: 0.9855297207832336 2023-01-24 01:18:27.047739: step: 336/466, loss: 1.1064568758010864 2023-01-24 01:18:27.660529: step: 338/466, loss: 0.8792426586151123 2023-01-24 01:18:28.374984: step: 340/466, loss: 0.3035004734992981 2023-01-24 01:18:28.924573: step: 342/466, loss: 0.8655188679695129 2023-01-24 01:18:29.524398: step: 344/466, loss: 0.3053174614906311 2023-01-24 01:18:30.119049: step: 346/466, loss: 0.4611417055130005 2023-01-24 01:18:30.799160: step: 348/466, loss: 0.6201977729797363 2023-01-24 01:18:31.386531: step: 350/466, loss: 0.7922503352165222 2023-01-24 01:18:31.960894: step: 352/466, loss: 0.11923874914646149 2023-01-24 01:18:32.613114: step: 354/466, loss: 0.3223556578159332 2023-01-24 01:18:33.240819: step: 356/466, loss: 0.7820574045181274 2023-01-24 01:18:33.853202: step: 358/466, loss: 1.522855520248413 2023-01-24 01:18:34.434160: step: 360/466, loss: 0.30281901359558105 2023-01-24 01:18:35.076183: step: 362/466, loss: 0.5263733863830566 2023-01-24 01:18:35.668172: step: 364/466, loss: 0.1544521301984787 2023-01-24 01:18:36.343092: step: 366/466, loss: 0.6664984822273254 2023-01-24 01:18:36.963146: step: 368/466, loss: 0.4794970452785492 2023-01-24 01:18:37.618611: step: 370/466, loss: 0.7100381851196289 2023-01-24 01:18:38.248791: step: 372/466, loss: 
0.2704818844795227 2023-01-24 01:18:38.822000: step: 374/466, loss: 0.5666533708572388 2023-01-24 01:18:39.452523: step: 376/466, loss: 0.5648034811019897 2023-01-24 01:18:40.055298: step: 378/466, loss: 0.41750550270080566 2023-01-24 01:18:40.707056: step: 380/466, loss: 0.3844192624092102 2023-01-24 01:18:41.313023: step: 382/466, loss: 0.5948653221130371 2023-01-24 01:18:41.945143: step: 384/466, loss: 1.1000791788101196 2023-01-24 01:18:42.483445: step: 386/466, loss: 0.45296505093574524 2023-01-24 01:18:43.059946: step: 388/466, loss: 0.39085686206817627 2023-01-24 01:18:43.757098: step: 390/466, loss: 0.1594216376543045 2023-01-24 01:18:44.418883: step: 392/466, loss: 0.2634032368659973 2023-01-24 01:18:45.057973: step: 394/466, loss: 0.14748629927635193 2023-01-24 01:18:45.661996: step: 396/466, loss: 1.5020861625671387 2023-01-24 01:18:46.269481: step: 398/466, loss: 0.9530461430549622 2023-01-24 01:18:46.896736: step: 400/466, loss: 0.266817182302475 2023-01-24 01:18:47.522675: step: 402/466, loss: 0.5079426169395447 2023-01-24 01:18:48.130236: step: 404/466, loss: 0.17258180677890778 2023-01-24 01:18:48.773410: step: 406/466, loss: 0.5756434202194214 2023-01-24 01:18:49.369757: step: 408/466, loss: 0.5008427500724792 2023-01-24 01:18:49.988099: step: 410/466, loss: 0.6624125838279724 2023-01-24 01:18:50.618884: step: 412/466, loss: 0.2582664489746094 2023-01-24 01:18:51.288704: step: 414/466, loss: 0.2248954474925995 2023-01-24 01:18:51.893164: step: 416/466, loss: 0.43843311071395874 2023-01-24 01:18:52.510980: step: 418/466, loss: 0.2874451279640198 2023-01-24 01:18:53.130261: step: 420/466, loss: 0.22079920768737793 2023-01-24 01:18:53.863509: step: 422/466, loss: 0.2217000275850296 2023-01-24 01:18:54.471682: step: 424/466, loss: 0.150802880525589 2023-01-24 01:18:55.103093: step: 426/466, loss: 1.0697029829025269 2023-01-24 01:18:55.766650: step: 428/466, loss: 0.6755574345588684 2023-01-24 01:18:56.411369: step: 430/466, loss: 0.8953157663345337 
2023-01-24 01:18:57.069239: step: 432/466, loss: 0.4585370719432831 2023-01-24 01:18:57.737145: step: 434/466, loss: 0.36771368980407715 2023-01-24 01:18:58.349305: step: 436/466, loss: 0.2886699140071869 2023-01-24 01:18:58.917279: step: 438/466, loss: 0.5681027770042419 2023-01-24 01:18:59.546001: step: 440/466, loss: 0.29305148124694824 2023-01-24 01:19:00.234616: step: 442/466, loss: 0.5841960310935974 2023-01-24 01:19:00.840608: step: 444/466, loss: 0.137999027967453 2023-01-24 01:19:01.489918: step: 446/466, loss: 0.45419827103614807 2023-01-24 01:19:02.013354: step: 448/466, loss: 0.15391451120376587 2023-01-24 01:19:02.645969: step: 450/466, loss: 0.40725621581077576 2023-01-24 01:19:03.230201: step: 452/466, loss: 0.42934536933898926 2023-01-24 01:19:03.830213: step: 454/466, loss: 0.15431423485279083 2023-01-24 01:19:04.497281: step: 456/466, loss: 0.24367600679397583 2023-01-24 01:19:05.133867: step: 458/466, loss: 0.578997015953064 2023-01-24 01:19:05.691943: step: 460/466, loss: 1.582404375076294 2023-01-24 01:19:06.294674: step: 462/466, loss: 0.2780749797821045 2023-01-24 01:19:06.875220: step: 464/466, loss: 0.6970811486244202 2023-01-24 01:19:07.461646: step: 466/466, loss: 0.45635974407196045 2023-01-24 01:19:08.069058: step: 468/466, loss: 0.15261076390743256 2023-01-24 01:19:08.691282: step: 470/466, loss: 0.24384662508964539 2023-01-24 01:19:09.413386: step: 472/466, loss: 0.4098736047744751 2023-01-24 01:19:10.026445: step: 474/466, loss: 0.30031678080558777 2023-01-24 01:19:10.609387: step: 476/466, loss: 0.3997558057308197 2023-01-24 01:19:11.274029: step: 478/466, loss: 0.06994882971048355 2023-01-24 01:19:11.899970: step: 480/466, loss: 0.2708031237125397 2023-01-24 01:19:12.488624: step: 482/466, loss: 0.17339487373828888 2023-01-24 01:19:13.084855: step: 484/466, loss: 2.775698661804199 2023-01-24 01:19:13.771272: step: 486/466, loss: 0.7528473138809204 2023-01-24 01:19:14.379246: step: 488/466, loss: 0.20074620842933655 2023-01-24 
01:19:15.011171: step: 490/466, loss: 0.4729272723197937 2023-01-24 01:19:15.609696: step: 492/466, loss: 0.5164822936058044 2023-01-24 01:19:16.260414: step: 494/466, loss: 0.4550749957561493 2023-01-24 01:19:16.867257: step: 496/466, loss: 0.08575890213251114 2023-01-24 01:19:17.504291: step: 498/466, loss: 1.2750965356826782 2023-01-24 01:19:18.217088: step: 500/466, loss: 0.27038100361824036 2023-01-24 01:19:18.808737: step: 502/466, loss: 0.40854859352111816 2023-01-24 01:19:19.454606: step: 504/466, loss: 0.11246982216835022 2023-01-24 01:19:20.100093: step: 506/466, loss: 0.4013828933238983 2023-01-24 01:19:20.730421: step: 508/466, loss: 0.6209487915039062 2023-01-24 01:19:21.373511: step: 510/466, loss: 0.26867666840553284 2023-01-24 01:19:22.002827: step: 512/466, loss: 0.3805329203605652 2023-01-24 01:19:22.596343: step: 514/466, loss: 0.11256767809391022 2023-01-24 01:19:23.221908: step: 516/466, loss: 0.6399586796760559 2023-01-24 01:19:23.851563: step: 518/466, loss: 1.0372354984283447 2023-01-24 01:19:24.552717: step: 520/466, loss: 0.5978058576583862 2023-01-24 01:19:25.191674: step: 522/466, loss: 0.7899807095527649 2023-01-24 01:19:25.831850: step: 524/466, loss: 0.2514680027961731 2023-01-24 01:19:26.421578: step: 526/466, loss: 0.5277990102767944 2023-01-24 01:19:27.010859: step: 528/466, loss: 0.32197409868240356 2023-01-24 01:19:27.606904: step: 530/466, loss: 0.3707515001296997 2023-01-24 01:19:28.180792: step: 532/466, loss: 0.5824082493782043 2023-01-24 01:19:28.781773: step: 534/466, loss: 0.39320623874664307 2023-01-24 01:19:29.484518: step: 536/466, loss: 0.5981414318084717 2023-01-24 01:19:30.137609: step: 538/466, loss: 0.2387053370475769 2023-01-24 01:19:30.770888: step: 540/466, loss: 0.2407008707523346 2023-01-24 01:19:31.408491: step: 542/466, loss: 0.23002319037914276 2023-01-24 01:19:31.996721: step: 544/466, loss: 0.27901536226272583 2023-01-24 01:19:32.622954: step: 546/466, loss: 0.18295063078403473 2023-01-24 01:19:33.326829: 
step: 548/466, loss: 0.27758944034576416 2023-01-24 01:19:33.960344: step: 550/466, loss: 0.49734365940093994 2023-01-24 01:19:34.577100: step: 552/466, loss: 1.140255093574524 2023-01-24 01:19:35.313024: step: 554/466, loss: 0.8541272878646851 2023-01-24 01:19:35.896405: step: 556/466, loss: 0.22601443529129028 2023-01-24 01:19:36.516017: step: 558/466, loss: 0.5696001648902893 2023-01-24 01:19:37.169850: step: 560/466, loss: 0.7366502285003662 2023-01-24 01:19:37.840034: step: 562/466, loss: 0.5757856965065002 2023-01-24 01:19:38.403517: step: 564/466, loss: 0.40094703435897827 2023-01-24 01:19:39.060109: step: 566/466, loss: 0.6098746061325073 2023-01-24 01:19:39.639509: step: 568/466, loss: 0.22799567878246307 2023-01-24 01:19:40.313817: step: 570/466, loss: 0.5381388664245605 2023-01-24 01:19:40.892798: step: 572/466, loss: 0.2369133085012436 2023-01-24 01:19:41.532567: step: 574/466, loss: 0.991308867931366 2023-01-24 01:19:42.197855: step: 576/466, loss: 0.5809664130210876 2023-01-24 01:19:42.797015: step: 578/466, loss: 0.23986878991127014 2023-01-24 01:19:43.345228: step: 580/466, loss: 0.8582189083099365 2023-01-24 01:19:43.959595: step: 582/466, loss: 0.30137068033218384 2023-01-24 01:19:44.574423: step: 584/466, loss: 0.14077231287956238 2023-01-24 01:19:45.226502: step: 586/466, loss: 0.559489905834198 2023-01-24 01:19:45.919831: step: 588/466, loss: 0.7863790988922119 2023-01-24 01:19:46.547114: step: 590/466, loss: 0.2836732864379883 2023-01-24 01:19:47.120892: step: 592/466, loss: 0.4197135269641876 2023-01-24 01:19:47.714571: step: 594/466, loss: 0.6486166715621948 2023-01-24 01:19:48.318646: step: 596/466, loss: 0.5372748970985413 2023-01-24 01:19:48.919663: step: 598/466, loss: 0.2581697106361389 2023-01-24 01:19:49.469618: step: 600/466, loss: 1.1937419176101685 2023-01-24 01:19:50.091577: step: 602/466, loss: 0.4056375026702881 2023-01-24 01:19:50.698002: step: 604/466, loss: 0.9247855544090271 2023-01-24 01:19:51.302340: step: 606/466, loss: 
0.3549393117427826 2023-01-24 01:19:51.882720: step: 608/466, loss: 0.21426716446876526 2023-01-24 01:19:52.494501: step: 610/466, loss: 3.929910898208618 2023-01-24 01:19:53.106492: step: 612/466, loss: 0.830359160900116 2023-01-24 01:19:53.715344: step: 614/466, loss: 0.7156874537467957 2023-01-24 01:19:54.312289: step: 616/466, loss: 1.1229567527770996 2023-01-24 01:19:55.011158: step: 618/466, loss: 0.7136696577072144 2023-01-24 01:19:55.619811: step: 620/466, loss: 0.21507291495800018 2023-01-24 01:19:56.233741: step: 622/466, loss: 0.10922008752822876 2023-01-24 01:19:56.893805: step: 624/466, loss: 0.4058763384819031 2023-01-24 01:19:57.520547: step: 626/466, loss: 0.9438013434410095 2023-01-24 01:19:58.175906: step: 628/466, loss: 0.3833908438682556 2023-01-24 01:19:58.791113: step: 630/466, loss: 0.4176103174686432 2023-01-24 01:19:59.416438: step: 632/466, loss: 1.2825418710708618 2023-01-24 01:20:00.055336: step: 634/466, loss: 0.18677985668182373 2023-01-24 01:20:00.657351: step: 636/466, loss: 0.7825669050216675 2023-01-24 01:20:01.397495: step: 638/466, loss: 0.3831029236316681 2023-01-24 01:20:01.974948: step: 640/466, loss: 0.5506783723831177 2023-01-24 01:20:02.666994: step: 642/466, loss: 0.6937105655670166 2023-01-24 01:20:03.359419: step: 644/466, loss: 0.5345378518104553 2023-01-24 01:20:03.993534: step: 646/466, loss: 1.5613038539886475 2023-01-24 01:20:04.590051: step: 648/466, loss: 0.5392831563949585 2023-01-24 01:20:05.249258: step: 650/466, loss: 0.566638708114624 2023-01-24 01:20:05.824079: step: 652/466, loss: 2.2288661003112793 2023-01-24 01:20:06.424124: step: 654/466, loss: 0.22525736689567566 2023-01-24 01:20:07.039436: step: 656/466, loss: 0.12807002663612366 2023-01-24 01:20:07.652842: step: 658/466, loss: 0.4514746069908142 2023-01-24 01:20:08.259852: step: 660/466, loss: 0.628321647644043 2023-01-24 01:20:08.886291: step: 662/466, loss: 0.8235511779785156 2023-01-24 01:20:09.508373: step: 664/466, loss: 0.4312689006328583 
2023-01-24 01:20:10.156677: step: 666/466, loss: 0.9901965856552124 2023-01-24 01:20:10.877039: step: 668/466, loss: 0.3410632908344269 2023-01-24 01:20:11.432903: step: 670/466, loss: 0.22190023958683014 2023-01-24 01:20:12.054180: step: 672/466, loss: 1.326055884361267 2023-01-24 01:20:12.718229: step: 674/466, loss: 0.3149159848690033 2023-01-24 01:20:13.382412: step: 676/466, loss: 0.3897155523300171 2023-01-24 01:20:13.954319: step: 678/466, loss: 0.16535858809947968 2023-01-24 01:20:14.593766: step: 680/466, loss: 0.18679854273796082 2023-01-24 01:20:15.201713: step: 682/466, loss: 0.33889710903167725 2023-01-24 01:20:15.824615: step: 684/466, loss: 0.9268393516540527 2023-01-24 01:20:16.441406: step: 686/466, loss: 0.4853194057941437 2023-01-24 01:20:17.061343: step: 688/466, loss: 2.5176568031311035 2023-01-24 01:20:17.680499: step: 690/466, loss: 0.3999820947647095 2023-01-24 01:20:18.319975: step: 692/466, loss: 4.8594584465026855 2023-01-24 01:20:18.950977: step: 694/466, loss: 0.9282346367835999 2023-01-24 01:20:19.576383: step: 696/466, loss: 0.38179799914360046 2023-01-24 01:20:20.243209: step: 698/466, loss: 0.5187469720840454 2023-01-24 01:20:20.911272: step: 700/466, loss: 0.1862371265888214 2023-01-24 01:20:21.575029: step: 702/466, loss: 0.17429044842720032 2023-01-24 01:20:22.194151: step: 704/466, loss: 1.9816677570343018 2023-01-24 01:20:22.815675: step: 706/466, loss: 0.19061195850372314 2023-01-24 01:20:23.360770: step: 708/466, loss: 0.3051075339317322 2023-01-24 01:20:23.973903: step: 710/466, loss: 0.4641377031803131 2023-01-24 01:20:24.584608: step: 712/466, loss: 0.1860985904932022 2023-01-24 01:20:25.151039: step: 714/466, loss: 0.2122507393360138 2023-01-24 01:20:25.762701: step: 716/466, loss: 0.21075765788555145 2023-01-24 01:20:26.387868: step: 718/466, loss: 0.508497953414917 2023-01-24 01:20:26.968760: step: 720/466, loss: 0.5619471669197083 2023-01-24 01:20:27.561205: step: 722/466, loss: 0.2921290695667267 2023-01-24 
01:20:28.144540: step: 724/466, loss: 0.7574358582496643 2023-01-24 01:20:28.718095: step: 726/466, loss: 0.5595558881759644 2023-01-24 01:20:29.357253: step: 728/466, loss: 0.09329615533351898 2023-01-24 01:20:29.981854: step: 730/466, loss: 0.9361396431922913 2023-01-24 01:20:30.667672: step: 732/466, loss: 0.3507343530654907 2023-01-24 01:20:31.222333: step: 734/466, loss: 0.19813771545886993 2023-01-24 01:20:31.786045: step: 736/466, loss: 2.7031946182250977 2023-01-24 01:20:32.414825: step: 738/466, loss: 0.14652515947818756 2023-01-24 01:20:33.051334: step: 740/466, loss: 0.3808930516242981 2023-01-24 01:20:33.734392: step: 742/466, loss: 0.34050026535987854 2023-01-24 01:20:34.366271: step: 744/466, loss: 0.5174801349639893 2023-01-24 01:20:34.997926: step: 746/466, loss: 1.3954707384109497 2023-01-24 01:20:35.579642: step: 748/466, loss: 0.35464537143707275 2023-01-24 01:20:36.214708: step: 750/466, loss: 0.18262244760990143 2023-01-24 01:20:36.798723: step: 752/466, loss: 0.5942860245704651 2023-01-24 01:20:37.463630: step: 754/466, loss: 2.3839564323425293 2023-01-24 01:20:38.059686: step: 756/466, loss: 0.12105849385261536 2023-01-24 01:20:38.731885: step: 758/466, loss: 1.135514497756958 2023-01-24 01:20:39.343098: step: 760/466, loss: 0.4444793462753296 2023-01-24 01:20:39.964861: step: 762/466, loss: 0.38591933250427246 2023-01-24 01:20:40.604365: step: 764/466, loss: 0.5965667963027954 2023-01-24 01:20:41.238034: step: 766/466, loss: 0.9795670509338379 2023-01-24 01:20:41.809393: step: 768/466, loss: 0.7009924650192261 2023-01-24 01:20:42.506920: step: 770/466, loss: 0.6074681878089905 2023-01-24 01:20:43.112139: step: 772/466, loss: 0.36849063634872437 2023-01-24 01:20:43.815660: step: 774/466, loss: 0.573112428188324 2023-01-24 01:20:44.423570: step: 776/466, loss: 0.6135465502738953 2023-01-24 01:20:45.062224: step: 778/466, loss: 0.7879122495651245 2023-01-24 01:20:45.636297: step: 780/466, loss: 0.8168390393257141 2023-01-24 01:20:46.244911: 
step: 782/466, loss: 0.3562544286251068 2023-01-24 01:20:46.886288: step: 784/466, loss: 0.30302950739860535 2023-01-24 01:20:47.513583: step: 786/466, loss: 4.270347595214844 2023-01-24 01:20:48.189030: step: 788/466, loss: 0.39814668893814087 2023-01-24 01:20:48.817464: step: 790/466, loss: 0.2591295838356018 2023-01-24 01:20:49.407967: step: 792/466, loss: 0.20328326523303986 2023-01-24 01:20:50.005712: step: 794/466, loss: 0.1836165338754654 2023-01-24 01:20:50.632375: step: 796/466, loss: 0.3204825222492218 2023-01-24 01:20:51.342268: step: 798/466, loss: 0.48555970191955566 2023-01-24 01:20:51.942970: step: 800/466, loss: 0.19230753183364868 2023-01-24 01:20:52.629837: step: 802/466, loss: 0.3014485836029053 2023-01-24 01:20:53.212948: step: 804/466, loss: 0.5951617956161499 2023-01-24 01:20:53.928306: step: 806/466, loss: 0.6485537886619568 2023-01-24 01:20:54.541878: step: 808/466, loss: 0.2819003462791443 2023-01-24 01:20:55.186611: step: 810/466, loss: 0.2669002115726471 2023-01-24 01:20:55.725680: step: 812/466, loss: 0.42355260252952576 2023-01-24 01:20:56.379322: step: 814/466, loss: 0.410382479429245 2023-01-24 01:20:57.003872: step: 816/466, loss: 0.4057581424713135 2023-01-24 01:20:57.614445: step: 818/466, loss: 0.784027099609375 2023-01-24 01:20:58.288636: step: 820/466, loss: 0.2816869914531708 2023-01-24 01:20:58.859278: step: 822/466, loss: 0.6319968700408936 2023-01-24 01:20:59.504725: step: 824/466, loss: 1.1715902090072632 2023-01-24 01:21:00.177712: step: 826/466, loss: 0.6217017769813538 2023-01-24 01:21:00.867903: step: 828/466, loss: 0.29474109411239624 2023-01-24 01:21:01.530031: step: 830/466, loss: 0.5678926706314087 2023-01-24 01:21:02.190058: step: 832/466, loss: 0.39161813259124756 2023-01-24 01:21:02.766853: step: 834/466, loss: 0.18232816457748413 2023-01-24 01:21:03.405583: step: 836/466, loss: 0.6148484349250793 2023-01-24 01:21:03.988203: step: 838/466, loss: 0.6800553798675537 2023-01-24 01:21:04.649886: step: 840/466, loss: 
0.766160249710083 2023-01-24 01:21:05.318536: step: 842/466, loss: 0.4223789870738983 2023-01-24 01:21:05.968548: step: 844/466, loss: 0.2277809977531433 2023-01-24 01:21:06.625431: step: 846/466, loss: 0.9845823645591736 2023-01-24 01:21:07.243913: step: 848/466, loss: 0.36414122581481934 2023-01-24 01:21:07.908930: step: 850/466, loss: 0.36403149366378784 2023-01-24 01:21:08.541856: step: 852/466, loss: 0.5862483382225037 2023-01-24 01:21:09.200842: step: 854/466, loss: 0.3021498918533325 2023-01-24 01:21:09.747907: step: 856/466, loss: 0.458347886800766 2023-01-24 01:21:10.337334: step: 858/466, loss: 0.5069761872291565 2023-01-24 01:21:10.980992: step: 860/466, loss: 0.7813975214958191 2023-01-24 01:21:11.645061: step: 862/466, loss: 1.3759708404541016 2023-01-24 01:21:12.302045: step: 864/466, loss: 0.2643243372440338 2023-01-24 01:21:12.971795: step: 866/466, loss: 0.3489965796470642 2023-01-24 01:21:13.542658: step: 868/466, loss: 0.4561339020729065 2023-01-24 01:21:14.201649: step: 870/466, loss: 1.14317786693573 2023-01-24 01:21:14.800751: step: 872/466, loss: 1.186966061592102 2023-01-24 01:21:15.441147: step: 874/466, loss: 0.40301233530044556 2023-01-24 01:21:16.004466: step: 876/466, loss: 0.9938855767250061 2023-01-24 01:21:16.592473: step: 878/466, loss: 0.29392674565315247 2023-01-24 01:21:17.302946: step: 880/466, loss: 0.7371693253517151 2023-01-24 01:21:17.890941: step: 882/466, loss: 1.307347059249878 2023-01-24 01:21:18.485555: step: 884/466, loss: 0.2597961127758026 2023-01-24 01:21:19.075968: step: 886/466, loss: 4.855459213256836 2023-01-24 01:21:19.709167: step: 888/466, loss: 0.1865711212158203 2023-01-24 01:21:20.332407: step: 890/466, loss: 0.833149790763855 2023-01-24 01:21:20.939840: step: 892/466, loss: 0.24636347591876984 2023-01-24 01:21:21.603910: step: 894/466, loss: 1.0389864444732666 2023-01-24 01:21:22.201002: step: 896/466, loss: 1.0861743688583374 2023-01-24 01:21:22.821194: step: 898/466, loss: 0.5853481888771057 2023-01-24 
01:21:23.447967: step: 900/466, loss: 1.08613121509552 2023-01-24 01:21:24.076071: step: 902/466, loss: 0.6622076034545898 2023-01-24 01:21:24.674247: step: 904/466, loss: 0.8512253761291504 2023-01-24 01:21:25.325329: step: 906/466, loss: 0.2479708194732666 2023-01-24 01:21:25.938759: step: 908/466, loss: 0.33753520250320435 2023-01-24 01:21:26.643842: step: 910/466, loss: 1.420801043510437 2023-01-24 01:21:27.222838: step: 912/466, loss: 0.2568238377571106 2023-01-24 01:21:27.838718: step: 914/466, loss: 0.15492361783981323 2023-01-24 01:21:28.462154: step: 916/466, loss: 1.5217208862304688 2023-01-24 01:21:29.061865: step: 918/466, loss: 0.9508876800537109 2023-01-24 01:21:29.633907: step: 920/466, loss: 0.3954949676990509 2023-01-24 01:21:30.226996: step: 922/466, loss: 1.1577404737472534 2023-01-24 01:21:30.783871: step: 924/466, loss: 0.8091952204704285 2023-01-24 01:21:31.424257: step: 926/466, loss: 0.7096596360206604 2023-01-24 01:21:31.994145: step: 928/466, loss: 0.18696229159832 2023-01-24 01:21:32.682103: step: 930/466, loss: 1.1425925493240356 2023-01-24 01:21:33.309394: step: 932/466, loss: 0.763130784034729 ================================================== Loss: 0.603 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3258476566168874, 'r': 0.3134815216409144, 'f1': 0.3195449940130791}, 'combined': 0.2354542061149004, 'epoch': 7} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3408063421620675, 'r': 0.2859707196342608, 'f1': 0.31098979482333533}, 'combined': 0.20625229915744517, 'epoch': 7} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32179245874898055, 'r': 0.2614563727335467, 'f1': 0.2885035837059826}, 'combined': 0.19233572247065506, 'epoch': 7} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 
0.3461798594746533, 'r': 0.26799096017951607, 'f1': 0.302108371047851}, 'combined': 0.19716546321017642, 'epoch': 7}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2689393939393939, 'r': 0.33809523809523806, 'f1': 0.2995780590717299}, 'combined': 0.19971870604781994, 'epoch': 7}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3541666666666667, 'r': 0.3695652173913043, 'f1': 0.3617021276595745}, 'combined': 0.24113475177304966, 'epoch': 7}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3426994403339025, 'r': 0.326442351134002, 'f1': 0.3343734092276366}, 'combined': 0.24638040679931117, 'epoch': 6}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3829430975684692, 'r': 0.2621003488086857, 'f1': 0.3112023539198586}, 'combined': 0.20639327099348134, 'epoch': 6}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3494623655913978, 'r': 0.3095238095238095, 'f1': 0.32828282828282823}, 'combined': 0.2188552188552188, 'epoch': 6}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2959473164135636, 'r': 0.24447821790685687, 'f1': 0.2677618577075099}, 'combined': 0.17850790513833992, 'epoch': 5}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37089076752905875, 'r': 0.2591418609488748, 'f1': 0.30510586075020424}, 'combined': 0.1991217196475017, 'epoch': 5}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5357142857142857, 'r': 0.32608695652173914, 'f1': 0.40540540540540543}, 'combined': 0.2702702702702703, 'epoch': 5}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 8 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:24:10.310416: step: 2/466, loss: 0.42070913314819336 2023-01-24 01:24:10.895961: step: 4/466, loss: 0.19329315423965454 2023-01-24 01:24:11.568960: step: 6/466, loss: 0.1542825996875763 2023-01-24 01:24:12.289693: step: 8/466, loss: 0.3330560028553009 2023-01-24 01:24:12.953856: step: 10/466, loss: 0.8174936771392822 2023-01-24 01:24:13.592206: step: 12/466, loss: 0.6355933547019958 2023-01-24 01:24:14.225295: step: 14/466, loss: 0.23780372738838196 2023-01-24 01:24:14.854096: step: 16/466, loss: 0.6033594608306885 2023-01-24 01:24:15.449347: step: 18/466, loss: 0.5725092887878418 2023-01-24 01:24:16.056950: step: 20/466, loss: 0.22369495034217834 2023-01-24 01:24:16.677669: step: 22/466, loss: 0.3154262602329254 2023-01-24 01:24:17.317837: step: 24/466, loss: 0.09151893109083176 2023-01-24 01:24:17.917241: step: 26/466, loss: 0.16328245401382446 2023-01-24 01:24:18.513409: step: 28/466, loss: 0.7152175307273865 2023-01-24 01:24:19.180901: step: 30/466, loss: 0.16174262762069702 2023-01-24 01:24:19.852345: step: 32/466, loss: 0.533721387386322 2023-01-24 01:24:20.458252: step: 34/466, loss: 1.1466821432113647 2023-01-24 01:24:21.134278: step: 36/466, loss: 0.14627352356910706 2023-01-24 01:24:21.795928: step: 38/466, loss: 0.47448137402534485 2023-01-24 01:24:22.415831: step: 40/466, loss: 0.16674591600894928 2023-01-24 01:24:23.027511: step: 42/466, loss: 1.0415873527526855 2023-01-24 01:24:23.653664: step: 44/466, loss: 0.3185746371746063 2023-01-24 01:24:24.245535: step: 46/466, loss: 0.10683126002550125 2023-01-24 01:24:24.939381: step: 48/466, loss: 0.20796655118465424 
2023-01-24 01:24:25.635028: step: 50/466, loss: 0.5955159664154053 2023-01-24 01:24:26.278207: step: 52/466, loss: 0.4141726493835449 2023-01-24 01:24:26.814902: step: 54/466, loss: 0.22971345484256744 2023-01-24 01:24:27.409251: step: 56/466, loss: 0.3233274817466736 2023-01-24 01:24:28.005975: step: 58/466, loss: 0.192288339138031 2023-01-24 01:24:28.645344: step: 60/466, loss: 0.7038285732269287 2023-01-24 01:24:29.275835: step: 62/466, loss: 0.7595869898796082 2023-01-24 01:24:29.871451: step: 64/466, loss: 0.31833571195602417 2023-01-24 01:24:30.496632: step: 66/466, loss: 0.17052781581878662 2023-01-24 01:24:31.093124: step: 68/466, loss: 0.7216683626174927 2023-01-24 01:24:31.702978: step: 70/466, loss: 0.5472569465637207 2023-01-24 01:24:32.392440: step: 72/466, loss: 0.13880658149719238 2023-01-24 01:24:33.010846: step: 74/466, loss: 0.20119856297969818 2023-01-24 01:24:33.597647: step: 76/466, loss: 0.7351464033126831 2023-01-24 01:24:34.184248: step: 78/466, loss: 0.9380894899368286 2023-01-24 01:24:34.770953: step: 80/466, loss: 1.0884746313095093 2023-01-24 01:24:35.406318: step: 82/466, loss: 0.4526834785938263 2023-01-24 01:24:36.104202: step: 84/466, loss: 0.873041033744812 2023-01-24 01:24:36.774545: step: 86/466, loss: 1.6015405654907227 2023-01-24 01:24:37.347580: step: 88/466, loss: 0.28300535678863525 2023-01-24 01:24:37.914444: step: 90/466, loss: 0.28813549876213074 2023-01-24 01:24:38.511166: step: 92/466, loss: 0.144855797290802 2023-01-24 01:24:39.095848: step: 94/466, loss: 0.41047242283821106 2023-01-24 01:24:39.720674: step: 96/466, loss: 0.2648795545101166 2023-01-24 01:24:40.303208: step: 98/466, loss: 0.4050430953502655 2023-01-24 01:24:40.864171: step: 100/466, loss: 0.9858799576759338 2023-01-24 01:24:41.506761: step: 102/466, loss: 0.17314811050891876 2023-01-24 01:24:42.074791: step: 104/466, loss: 0.42704564332962036 2023-01-24 01:24:42.718465: step: 106/466, loss: 0.16546908020973206 2023-01-24 01:24:43.387378: step: 108/466, 
loss: 0.18945223093032837 2023-01-24 01:24:44.013434: step: 110/466, loss: 0.5868598222732544 2023-01-24 01:24:44.598785: step: 112/466, loss: 0.3855229616165161 2023-01-24 01:24:45.259237: step: 114/466, loss: 0.535126268863678 2023-01-24 01:24:45.862555: step: 116/466, loss: 0.6311970353126526 2023-01-24 01:24:46.441327: step: 118/466, loss: 0.29702258110046387 2023-01-24 01:24:47.043417: step: 120/466, loss: 0.18481245636940002 2023-01-24 01:24:47.657463: step: 122/466, loss: 0.13972234725952148 2023-01-24 01:24:48.253904: step: 124/466, loss: 0.3186049461364746 2023-01-24 01:24:48.872466: step: 126/466, loss: 0.187290757894516 2023-01-24 01:24:49.482863: step: 128/466, loss: 0.2006399929523468 2023-01-24 01:24:50.167092: step: 130/466, loss: 0.27899008989334106 2023-01-24 01:24:50.749458: step: 132/466, loss: 0.23280629515647888 2023-01-24 01:24:51.372736: step: 134/466, loss: 0.274791955947876 2023-01-24 01:24:51.942860: step: 136/466, loss: 0.40975281596183777 2023-01-24 01:24:52.520879: step: 138/466, loss: 0.26108965277671814 2023-01-24 01:24:53.146988: step: 140/466, loss: 0.5550920367240906 2023-01-24 01:24:53.802678: step: 142/466, loss: 0.18534642457962036 2023-01-24 01:24:54.535424: step: 144/466, loss: 0.3210761547088623 2023-01-24 01:24:55.195934: step: 146/466, loss: 0.45470285415649414 2023-01-24 01:24:55.838442: step: 148/466, loss: 0.3254496455192566 2023-01-24 01:24:56.467037: step: 150/466, loss: 0.03482885658740997 2023-01-24 01:24:57.152778: step: 152/466, loss: 0.7344004511833191 2023-01-24 01:24:57.701887: step: 154/466, loss: 0.47011271119117737 2023-01-24 01:24:58.322397: step: 156/466, loss: 0.3940267562866211 2023-01-24 01:24:58.888111: step: 158/466, loss: 1.9927465915679932 2023-01-24 01:24:59.592250: step: 160/466, loss: 0.07561226189136505 2023-01-24 01:25:00.282942: step: 162/466, loss: 0.27004295587539673 2023-01-24 01:25:00.887860: step: 164/466, loss: 0.18644846975803375 2023-01-24 01:25:01.499701: step: 166/466, loss: 
0.579391360282898 2023-01-24 01:25:02.139202: step: 168/466, loss: 0.7410764694213867 2023-01-24 01:25:02.849634: step: 170/466, loss: 0.36173009872436523 2023-01-24 01:25:03.551338: step: 172/466, loss: 0.3252776265144348 2023-01-24 01:25:04.214281: step: 174/466, loss: 0.43130847811698914 2023-01-24 01:25:04.842470: step: 176/466, loss: 0.2706472873687744 2023-01-24 01:25:05.446834: step: 178/466, loss: 0.4012158215045929 2023-01-24 01:25:06.127773: step: 180/466, loss: 0.23293635249137878 2023-01-24 01:25:06.872558: step: 182/466, loss: 0.3880685567855835 2023-01-24 01:25:07.517406: step: 184/466, loss: 0.219434455037117 2023-01-24 01:25:08.294807: step: 186/466, loss: 0.5739613175392151 2023-01-24 01:25:08.933825: step: 188/466, loss: 0.5995786190032959 2023-01-24 01:25:09.601505: step: 190/466, loss: 0.35044941306114197 2023-01-24 01:25:10.271099: step: 192/466, loss: 0.391764372587204 2023-01-24 01:25:10.880079: step: 194/466, loss: 0.25634679198265076 2023-01-24 01:25:11.511781: step: 196/466, loss: 0.3363511562347412 2023-01-24 01:25:12.103408: step: 198/466, loss: 0.32367298007011414 2023-01-24 01:25:12.793413: step: 200/466, loss: 0.3627299964427948 2023-01-24 01:25:13.449628: step: 202/466, loss: 0.49342265725135803 2023-01-24 01:25:13.991353: step: 204/466, loss: 0.25272437930107117 2023-01-24 01:25:14.592687: step: 206/466, loss: 0.1899859458208084 2023-01-24 01:25:15.192266: step: 208/466, loss: 0.25297975540161133 2023-01-24 01:25:15.798434: step: 210/466, loss: 0.18472492694854736 2023-01-24 01:25:16.356271: step: 212/466, loss: 0.3657456040382385 2023-01-24 01:25:16.952002: step: 214/466, loss: 0.1637917011976242 2023-01-24 01:25:17.571035: step: 216/466, loss: 0.12977047264575958 2023-01-24 01:25:18.181354: step: 218/466, loss: 0.3983894884586334 2023-01-24 01:25:18.848098: step: 220/466, loss: 0.4265328645706177 2023-01-24 01:25:19.495398: step: 222/466, loss: 0.6930554509162903 2023-01-24 01:25:20.113929: step: 224/466, loss: 0.3342722952365875 
2023-01-24 01:25:20.753581: step: 226/466, loss: 2.727994918823242 2023-01-24 01:25:21.381892: step: 228/466, loss: 1.518387794494629 2023-01-24 01:25:22.027269: step: 230/466, loss: 0.3353487253189087 2023-01-24 01:25:22.573058: step: 232/466, loss: 0.5437520742416382 2023-01-24 01:25:23.179184: step: 234/466, loss: 0.1320791393518448 2023-01-24 01:25:23.860159: step: 236/466, loss: 0.6413620114326477 2023-01-24 01:25:24.493543: step: 238/466, loss: 0.3188782334327698 2023-01-24 01:25:25.218384: step: 240/466, loss: 0.3302933871746063 2023-01-24 01:25:25.833485: step: 242/466, loss: 0.28584909439086914 2023-01-24 01:25:26.451755: step: 244/466, loss: 0.5395722389221191 2023-01-24 01:25:27.057582: step: 246/466, loss: 0.22122004628181458 2023-01-24 01:25:27.696281: step: 248/466, loss: 0.6224589347839355 2023-01-24 01:25:28.280006: step: 250/466, loss: 0.5412746667861938 2023-01-24 01:25:28.915219: step: 252/466, loss: 0.7373511791229248 2023-01-24 01:25:29.754484: step: 254/466, loss: 0.23652246594429016 2023-01-24 01:25:30.476374: step: 256/466, loss: 0.2539837658405304 2023-01-24 01:25:31.040894: step: 258/466, loss: 0.260820597410202 2023-01-24 01:25:31.606436: step: 260/466, loss: 0.21884596347808838 2023-01-24 01:25:32.214967: step: 262/466, loss: 0.07499537616968155 2023-01-24 01:25:32.812197: step: 264/466, loss: 1.2581735849380493 2023-01-24 01:25:33.573118: step: 266/466, loss: 0.34262615442276 2023-01-24 01:25:34.213021: step: 268/466, loss: 0.10764749348163605 2023-01-24 01:25:34.839312: step: 270/466, loss: 0.19042930006980896 2023-01-24 01:25:35.510028: step: 272/466, loss: 0.10762417316436768 2023-01-24 01:25:36.153493: step: 274/466, loss: 0.1228412389755249 2023-01-24 01:25:36.796144: step: 276/466, loss: 0.3271018862724304 2023-01-24 01:25:37.600943: step: 278/466, loss: 1.02420973777771 2023-01-24 01:25:38.288637: step: 280/466, loss: 0.13888515532016754 2023-01-24 01:25:38.965187: step: 282/466, loss: 0.9880621433258057 2023-01-24 
01:25:39.604393: step: 284/466, loss: 0.8876910209655762 2023-01-24 01:25:40.213716: step: 286/466, loss: 1.1690138578414917 2023-01-24 01:25:40.863337: step: 288/466, loss: 0.1375790536403656 2023-01-24 01:25:41.500286: step: 290/466, loss: 0.840218186378479 2023-01-24 01:25:42.149293: step: 292/466, loss: 0.4882718324661255 2023-01-24 01:25:42.715307: step: 294/466, loss: 0.31972554326057434 2023-01-24 01:25:43.406219: step: 296/466, loss: 0.33755213022232056 2023-01-24 01:25:44.001465: step: 298/466, loss: 0.2977468967437744 2023-01-24 01:25:44.603663: step: 300/466, loss: 0.18298953771591187 2023-01-24 01:25:45.213817: step: 302/466, loss: 0.28846272826194763 2023-01-24 01:25:45.862861: step: 304/466, loss: 0.3550736606121063 2023-01-24 01:25:46.418001: step: 306/466, loss: 0.9019066095352173 2023-01-24 01:25:47.055198: step: 308/466, loss: 0.147572323679924 2023-01-24 01:25:47.644550: step: 310/466, loss: 0.3051636815071106 2023-01-24 01:25:48.245307: step: 312/466, loss: 0.28385066986083984 2023-01-24 01:25:48.816590: step: 314/466, loss: 1.4107277393341064 2023-01-24 01:25:49.395547: step: 316/466, loss: 0.14702874422073364 2023-01-24 01:25:49.963803: step: 318/466, loss: 0.41051381826400757 2023-01-24 01:25:50.585470: step: 320/466, loss: 0.18689079582691193 2023-01-24 01:25:51.187785: step: 322/466, loss: 0.24710184335708618 2023-01-24 01:25:51.818990: step: 324/466, loss: 0.7665988206863403 2023-01-24 01:25:52.416384: step: 326/466, loss: 0.45251378417015076 2023-01-24 01:25:53.021109: step: 328/466, loss: 0.21326738595962524 2023-01-24 01:25:53.643241: step: 330/466, loss: 0.18569011986255646 2023-01-24 01:25:54.165814: step: 332/466, loss: 0.44794222712516785 2023-01-24 01:25:54.808834: step: 334/466, loss: 0.2406645119190216 2023-01-24 01:25:55.411624: step: 336/466, loss: 0.30867999792099 2023-01-24 01:25:55.984413: step: 338/466, loss: 0.4746020436286926 2023-01-24 01:25:56.638421: step: 340/466, loss: 0.5891717076301575 2023-01-24 01:25:57.245738: 
step: 342/466, loss: 0.18415430188179016 2023-01-24 01:25:57.817675: step: 344/466, loss: 0.33969199657440186 2023-01-24 01:25:58.413669: step: 346/466, loss: 0.2565991282463074 2023-01-24 01:25:59.024057: step: 348/466, loss: 0.45773324370384216 2023-01-24 01:25:59.775422: step: 350/466, loss: 0.4006710350513458 2023-01-24 01:26:00.374185: step: 352/466, loss: 0.4616091847419739 2023-01-24 01:26:01.026500: step: 354/466, loss: 0.1680757999420166 2023-01-24 01:26:01.634167: step: 356/466, loss: 0.30026933550834656 2023-01-24 01:26:02.327167: step: 358/466, loss: 0.1306033730506897 2023-01-24 01:26:02.961774: step: 360/466, loss: 0.22498393058776855 2023-01-24 01:26:03.586108: step: 362/466, loss: 0.17583592236042023 2023-01-24 01:26:04.173986: step: 364/466, loss: 1.0427978038787842 2023-01-24 01:26:04.796231: step: 366/466, loss: 0.34120306372642517 2023-01-24 01:26:05.370272: step: 368/466, loss: 0.25547441840171814 2023-01-24 01:26:06.050579: step: 370/466, loss: 1.1784662008285522 2023-01-24 01:26:06.702939: step: 372/466, loss: 0.5351514220237732 2023-01-24 01:26:07.306691: step: 374/466, loss: 0.24661028385162354 2023-01-24 01:26:07.900244: step: 376/466, loss: 0.17091259360313416 2023-01-24 01:26:08.496694: step: 378/466, loss: 1.1215652227401733 2023-01-24 01:26:09.119341: step: 380/466, loss: 0.5957236886024475 2023-01-24 01:26:09.737371: step: 382/466, loss: 0.22655294835567474 2023-01-24 01:26:10.382895: step: 384/466, loss: 0.11992353945970535 2023-01-24 01:26:11.004214: step: 386/466, loss: 0.2618737816810608 2023-01-24 01:26:11.652245: step: 388/466, loss: 0.15113231539726257 2023-01-24 01:26:12.352394: step: 390/466, loss: 0.31584858894348145 2023-01-24 01:26:13.033339: step: 392/466, loss: 0.05570532754063606 2023-01-24 01:26:13.597487: step: 394/466, loss: 0.7033010721206665 2023-01-24 01:26:14.260195: step: 396/466, loss: 0.21475452184677124 2023-01-24 01:26:14.881755: step: 398/466, loss: 0.4419141113758087 2023-01-24 01:26:15.484789: step: 
400/466, loss: 0.24417926371097565 2023-01-24 01:26:16.144541: step: 402/466, loss: 0.38471463322639465 2023-01-24 01:26:16.780154: step: 404/466, loss: 0.40086495876312256 2023-01-24 01:26:17.393841: step: 406/466, loss: 1.5156762599945068 2023-01-24 01:26:17.976352: step: 408/466, loss: 0.17294296622276306 2023-01-24 01:26:18.582880: step: 410/466, loss: 0.3071759343147278 2023-01-24 01:26:19.195616: step: 412/466, loss: 0.26017171144485474 2023-01-24 01:26:19.808478: step: 414/466, loss: 0.47733762860298157 2023-01-24 01:26:20.411600: step: 416/466, loss: 0.24585777521133423 2023-01-24 01:26:21.060094: step: 418/466, loss: 0.5091809034347534 2023-01-24 01:26:21.686287: step: 420/466, loss: 0.8872667551040649 2023-01-24 01:26:22.284791: step: 422/466, loss: 0.4296327829360962 2023-01-24 01:26:22.935870: step: 424/466, loss: 0.690857470035553 2023-01-24 01:26:23.562382: step: 426/466, loss: 0.2613014876842499 2023-01-24 01:26:24.156559: step: 428/466, loss: 1.7435870170593262 2023-01-24 01:26:24.803815: step: 430/466, loss: 0.22666385769844055 2023-01-24 01:26:25.379285: step: 432/466, loss: 0.11715306341648102 2023-01-24 01:26:26.078984: step: 434/466, loss: 0.8825061917304993 2023-01-24 01:26:26.723490: step: 436/466, loss: 1.4174325466156006 2023-01-24 01:26:27.442549: step: 438/466, loss: 0.34822991490364075 2023-01-24 01:26:28.076385: step: 440/466, loss: 0.29355087876319885 2023-01-24 01:26:28.662262: step: 442/466, loss: 1.369311809539795 2023-01-24 01:26:29.249440: step: 444/466, loss: 0.38101184368133545 2023-01-24 01:26:29.866256: step: 446/466, loss: 0.3892883062362671 2023-01-24 01:26:30.504854: step: 448/466, loss: 0.26244592666625977 2023-01-24 01:26:31.087333: step: 450/466, loss: 0.25750574469566345 2023-01-24 01:26:31.720543: step: 452/466, loss: 0.8240104913711548 2023-01-24 01:26:32.392377: step: 454/466, loss: 0.48789867758750916 2023-01-24 01:26:33.040228: step: 456/466, loss: 0.4815549850463867 2023-01-24 01:26:33.652622: step: 458/466, loss: 
0.16869626939296722 2023-01-24 01:26:34.274511: step: 460/466, loss: 0.27055051922798157 2023-01-24 01:26:34.875298: step: 462/466, loss: 0.9371935129165649 2023-01-24 01:26:35.530515: step: 464/466, loss: 0.9419511556625366 2023-01-24 01:26:36.129382: step: 466/466, loss: 0.7592545747756958 2023-01-24 01:26:36.736973: step: 468/466, loss: 0.13283738493919373 2023-01-24 01:26:37.377670: step: 470/466, loss: 0.5328441858291626 2023-01-24 01:26:37.926016: step: 472/466, loss: 0.3986574709415436 2023-01-24 01:26:38.563619: step: 474/466, loss: 0.17218521237373352 2023-01-24 01:26:39.258393: step: 476/466, loss: 0.2489137500524521 2023-01-24 01:26:39.879768: step: 478/466, loss: 3.4816160202026367 2023-01-24 01:26:40.488853: step: 480/466, loss: 0.5082486867904663 2023-01-24 01:26:41.132366: step: 482/466, loss: 0.4919622838497162 2023-01-24 01:26:41.811302: step: 484/466, loss: 0.2767554521560669 2023-01-24 01:26:42.419529: step: 486/466, loss: 0.37757161259651184 2023-01-24 01:26:43.000345: step: 488/466, loss: 0.5388099551200867 2023-01-24 01:26:43.583554: step: 490/466, loss: 0.651739776134491 2023-01-24 01:26:44.162455: step: 492/466, loss: 0.23953072726726532 2023-01-24 01:26:44.722019: step: 494/466, loss: 0.41552096605300903 2023-01-24 01:26:45.336765: step: 496/466, loss: 0.44315606355667114 2023-01-24 01:26:45.958763: step: 498/466, loss: 0.2738582491874695 2023-01-24 01:26:46.569635: step: 500/466, loss: 1.0400668382644653 2023-01-24 01:26:47.218807: step: 502/466, loss: 0.4278027415275574 2023-01-24 01:26:47.856292: step: 504/466, loss: 0.6743075847625732 2023-01-24 01:26:48.484203: step: 506/466, loss: 0.6389839053153992 2023-01-24 01:26:49.076789: step: 508/466, loss: 0.13873815536499023 2023-01-24 01:26:49.749764: step: 510/466, loss: 0.3239734470844269 2023-01-24 01:26:50.373808: step: 512/466, loss: 0.995073139667511 2023-01-24 01:26:50.972799: step: 514/466, loss: 1.3936810493469238 2023-01-24 01:26:51.577505: step: 516/466, loss: 0.37060728669166565 
2023-01-24 01:26:52.265913: step: 518/466, loss: 0.6086036562919617 2023-01-24 01:26:52.844381: step: 520/466, loss: 0.6494529247283936 2023-01-24 01:26:53.494610: step: 522/466, loss: 0.5009200572967529 2023-01-24 01:26:54.146308: step: 524/466, loss: 0.1802213490009308 2023-01-24 01:26:54.875155: step: 526/466, loss: 0.5180705785751343 2023-01-24 01:26:55.573376: step: 528/466, loss: 0.8920317888259888 2023-01-24 01:26:56.170879: step: 530/466, loss: 0.3410802185535431 2023-01-24 01:26:56.800807: step: 532/466, loss: 0.26836782693862915 2023-01-24 01:26:57.404556: step: 534/466, loss: 0.1766878366470337 2023-01-24 01:26:58.025380: step: 536/466, loss: 0.09978803992271423 2023-01-24 01:26:58.675194: step: 538/466, loss: 0.40650302171707153 2023-01-24 01:26:59.336878: step: 540/466, loss: 0.1943265050649643 2023-01-24 01:26:59.929343: step: 542/466, loss: 0.22957094013690948 2023-01-24 01:27:00.517247: step: 544/466, loss: 0.3287438750267029 2023-01-24 01:27:01.093702: step: 546/466, loss: 0.714024543762207 2023-01-24 01:27:01.730592: step: 548/466, loss: 0.27795958518981934 2023-01-24 01:27:02.338935: step: 550/466, loss: 0.5719060301780701 2023-01-24 01:27:02.893574: step: 552/466, loss: 0.2060711830854416 2023-01-24 01:27:03.481399: step: 554/466, loss: 0.32010918855667114 2023-01-24 01:27:04.074330: step: 556/466, loss: 0.6014868021011353 2023-01-24 01:27:04.675415: step: 558/466, loss: 0.2549934685230255 2023-01-24 01:27:05.287223: step: 560/466, loss: 0.7524586915969849 2023-01-24 01:27:05.946681: step: 562/466, loss: 0.20897318422794342 2023-01-24 01:27:06.561903: step: 564/466, loss: 0.405833899974823 2023-01-24 01:27:07.147203: step: 566/466, loss: 0.2669678330421448 2023-01-24 01:27:07.808141: step: 568/466, loss: 0.43072205781936646 2023-01-24 01:27:08.542766: step: 570/466, loss: 0.33123594522476196 2023-01-24 01:27:09.148059: step: 572/466, loss: 0.24202190339565277 2023-01-24 01:27:09.779083: step: 574/466, loss: 0.9372299313545227 2023-01-24 
01:27:10.440656: step: 576/466, loss: 0.5606487393379211 2023-01-24 01:27:11.008731: step: 578/466, loss: 0.19988267123699188 2023-01-24 01:27:11.594845: step: 580/466, loss: 0.758746862411499 2023-01-24 01:27:12.251445: step: 582/466, loss: 0.6153707504272461 2023-01-24 01:27:12.783028: step: 584/466, loss: 0.48051753640174866 2023-01-24 01:27:13.388113: step: 586/466, loss: 0.41417208313941956 2023-01-24 01:27:13.956076: step: 588/466, loss: 0.9258098006248474 2023-01-24 01:27:14.641562: step: 590/466, loss: 0.25086748600006104 2023-01-24 01:27:15.364337: step: 592/466, loss: 0.6945814490318298 2023-01-24 01:27:16.017791: step: 594/466, loss: 0.4970313012599945 2023-01-24 01:27:16.671480: step: 596/466, loss: 0.08689434826374054 2023-01-24 01:27:17.340990: step: 598/466, loss: 0.22101353108882904 2023-01-24 01:27:18.016829: step: 600/466, loss: 0.1183728501200676 2023-01-24 01:27:18.621205: step: 602/466, loss: 0.2400856614112854 2023-01-24 01:27:19.267597: step: 604/466, loss: 0.29236307740211487 2023-01-24 01:27:19.870546: step: 606/466, loss: 1.5404255390167236 2023-01-24 01:27:20.431839: step: 608/466, loss: 0.664776086807251 2023-01-24 01:27:21.060464: step: 610/466, loss: 0.2434491068124771 2023-01-24 01:27:21.707516: step: 612/466, loss: 1.052943468093872 2023-01-24 01:27:22.329300: step: 614/466, loss: 0.8199459314346313 2023-01-24 01:27:22.977472: step: 616/466, loss: 4.667254447937012 2023-01-24 01:27:23.605826: step: 618/466, loss: 0.6161572933197021 2023-01-24 01:27:24.287976: step: 620/466, loss: 1.5852956771850586 2023-01-24 01:27:24.870724: step: 622/466, loss: 0.8171479105949402 2023-01-24 01:27:25.536148: step: 624/466, loss: 1.4977972507476807 2023-01-24 01:27:26.164228: step: 626/466, loss: 0.8954432010650635 2023-01-24 01:27:26.826972: step: 628/466, loss: 0.19785064458847046 2023-01-24 01:27:27.445413: step: 630/466, loss: 0.23454849421977997 2023-01-24 01:27:28.024133: step: 632/466, loss: 0.9934785962104797 2023-01-24 01:27:28.668850: step: 
634/466, loss: 0.4460870027542114 2023-01-24 01:27:29.356123: step: 636/466, loss: 0.10035587847232819 2023-01-24 01:27:29.960084: step: 638/466, loss: 0.11986562609672546 2023-01-24 01:27:30.508941: step: 640/466, loss: 0.7981387376785278 2023-01-24 01:27:31.147162: step: 642/466, loss: 2.019948959350586 2023-01-24 01:27:31.749472: step: 644/466, loss: 0.1733536273241043 2023-01-24 01:27:32.442572: step: 646/466, loss: 0.5257325172424316 2023-01-24 01:27:33.050216: step: 648/466, loss: 0.9252492785453796 2023-01-24 01:27:33.662973: step: 650/466, loss: 0.30040332674980164 2023-01-24 01:27:34.323822: step: 652/466, loss: 0.46063125133514404 2023-01-24 01:27:34.946785: step: 654/466, loss: 0.4684278070926666 2023-01-24 01:27:35.484797: step: 656/466, loss: 0.22722361981868744 2023-01-24 01:27:36.125400: step: 658/466, loss: 0.7254287600517273 2023-01-24 01:27:36.760097: step: 660/466, loss: 0.1062985435128212 2023-01-24 01:27:37.352516: step: 662/466, loss: 0.3585277199745178 2023-01-24 01:27:38.015198: step: 664/466, loss: 0.5264803767204285 2023-01-24 01:27:38.606497: step: 666/466, loss: 0.6416162252426147 2023-01-24 01:27:39.206634: step: 668/466, loss: 0.15391018986701965 2023-01-24 01:27:39.777025: step: 670/466, loss: 0.28397443890571594 2023-01-24 01:27:40.405734: step: 672/466, loss: 0.20207203924655914 2023-01-24 01:27:40.970608: step: 674/466, loss: 0.12301262468099594 2023-01-24 01:27:41.543478: step: 676/466, loss: 0.5341054201126099 2023-01-24 01:27:42.141994: step: 678/466, loss: 0.10569257289171219 2023-01-24 01:27:42.753902: step: 680/466, loss: 0.15092137455940247 2023-01-24 01:27:43.365556: step: 682/466, loss: 0.7261207699775696 2023-01-24 01:27:43.991765: step: 684/466, loss: 1.65827214717865 2023-01-24 01:27:44.677273: step: 686/466, loss: 1.0371516942977905 2023-01-24 01:27:45.302240: step: 688/466, loss: 0.4550298750400543 2023-01-24 01:27:45.891011: step: 690/466, loss: 0.414910227060318 2023-01-24 01:27:46.464399: step: 692/466, loss: 
0.2945830523967743 2023-01-24 01:27:47.076714: step: 694/466, loss: 0.1816388964653015 2023-01-24 01:27:47.725258: step: 696/466, loss: 0.41306084394454956 2023-01-24 01:27:48.339729: step: 698/466, loss: 1.016797423362732 2023-01-24 01:27:48.935342: step: 700/466, loss: 0.17569591104984283 2023-01-24 01:27:49.556845: step: 702/466, loss: 1.9182236194610596 2023-01-24 01:27:50.248710: step: 704/466, loss: 0.42811667919158936 2023-01-24 01:27:50.837772: step: 706/466, loss: 0.6352326273918152 2023-01-24 01:27:51.438450: step: 708/466, loss: 0.9747324585914612 2023-01-24 01:27:52.025861: step: 710/466, loss: 0.729386031627655 2023-01-24 01:27:52.691818: step: 712/466, loss: 0.14309903979301453 2023-01-24 01:27:53.282658: step: 714/466, loss: 0.5412364602088928 2023-01-24 01:27:53.877634: step: 716/466, loss: 0.49542367458343506 2023-01-24 01:27:54.515031: step: 718/466, loss: 0.44659021496772766 2023-01-24 01:27:55.112149: step: 720/466, loss: 0.4688337743282318 2023-01-24 01:27:55.697196: step: 722/466, loss: 0.7991344928741455 2023-01-24 01:27:56.355252: step: 724/466, loss: 0.9413745999336243 2023-01-24 01:27:57.014812: step: 726/466, loss: 0.9676265716552734 2023-01-24 01:27:57.586631: step: 728/466, loss: 0.21356771886348724 2023-01-24 01:27:58.246792: step: 730/466, loss: 0.3693164587020874 2023-01-24 01:27:58.821855: step: 732/466, loss: 0.21786029636859894 2023-01-24 01:27:59.495554: step: 734/466, loss: 0.5728607773780823 2023-01-24 01:28:00.138328: step: 736/466, loss: 0.5436547994613647 2023-01-24 01:28:00.814544: step: 738/466, loss: 0.3133682906627655 2023-01-24 01:28:01.464095: step: 740/466, loss: 0.34399712085723877 2023-01-24 01:28:02.187408: step: 742/466, loss: 0.21830910444259644 2023-01-24 01:28:02.850948: step: 744/466, loss: 0.2679389417171478 2023-01-24 01:28:03.491415: step: 746/466, loss: 0.2695886194705963 2023-01-24 01:28:04.084872: step: 748/466, loss: 0.30671921372413635 2023-01-24 01:28:04.667564: step: 750/466, loss: 0.172052800655365 
2023-01-24 01:28:05.340518: step: 752/466, loss: 0.648655354976654 2023-01-24 01:28:05.945124: step: 754/466, loss: 0.32802191376686096 2023-01-24 01:28:06.585447: step: 756/466, loss: 0.22855710983276367 2023-01-24 01:28:07.238392: step: 758/466, loss: 0.5866103172302246 2023-01-24 01:28:07.817597: step: 760/466, loss: 0.26203909516334534 2023-01-24 01:28:08.450748: step: 762/466, loss: 0.6300651431083679 2023-01-24 01:28:09.076245: step: 764/466, loss: 0.2713729739189148 2023-01-24 01:28:09.668709: step: 766/466, loss: 0.8633518815040588 2023-01-24 01:28:10.273001: step: 768/466, loss: 0.1878773421049118 2023-01-24 01:28:10.923466: step: 770/466, loss: 0.11146188527345657 2023-01-24 01:28:11.526260: step: 772/466, loss: 0.49176108837127686 2023-01-24 01:28:12.167870: step: 774/466, loss: 0.2051166445016861 2023-01-24 01:28:12.776204: step: 776/466, loss: 0.20102080702781677 2023-01-24 01:28:13.355937: step: 778/466, loss: 0.25653690099716187 2023-01-24 01:28:13.984764: step: 780/466, loss: 0.30637654662132263 2023-01-24 01:28:14.624817: step: 782/466, loss: 0.23440568149089813 2023-01-24 01:28:15.280594: step: 784/466, loss: 0.5576182007789612 2023-01-24 01:28:15.938713: step: 786/466, loss: 0.4898272454738617 2023-01-24 01:28:16.490101: step: 788/466, loss: 0.6654157042503357 2023-01-24 01:28:17.090145: step: 790/466, loss: 0.20696957409381866 2023-01-24 01:28:17.676890: step: 792/466, loss: 0.18953485786914825 2023-01-24 01:28:18.274998: step: 794/466, loss: 0.6122508645057678 2023-01-24 01:28:18.839293: step: 796/466, loss: 0.8150749206542969 2023-01-24 01:28:19.483930: step: 798/466, loss: 0.42300933599472046 2023-01-24 01:28:20.152716: step: 800/466, loss: 0.7897908091545105 2023-01-24 01:28:20.739760: step: 802/466, loss: 0.6928276419639587 2023-01-24 01:28:21.346300: step: 804/466, loss: 0.8072050213813782 2023-01-24 01:28:21.991595: step: 806/466, loss: 0.16678494215011597 2023-01-24 01:28:22.575653: step: 808/466, loss: 0.3240396976470947 2023-01-24 
01:28:23.244554: step: 810/466, loss: 0.45988866686820984 2023-01-24 01:28:23.917985: step: 812/466, loss: 0.23593668639659882 2023-01-24 01:28:24.512765: step: 814/466, loss: 0.4983198642730713 2023-01-24 01:28:25.143308: step: 816/466, loss: 0.5303546190261841 2023-01-24 01:28:25.813019: step: 818/466, loss: 0.20623086392879486 2023-01-24 01:28:26.474862: step: 820/466, loss: 0.22160398960113525 2023-01-24 01:28:27.141315: step: 822/466, loss: 0.365662544965744 2023-01-24 01:28:27.769613: step: 824/466, loss: 0.38205280900001526 2023-01-24 01:28:28.436998: step: 826/466, loss: 0.1559482216835022 2023-01-24 01:28:29.090016: step: 828/466, loss: 0.6489866375923157 2023-01-24 01:28:29.650614: step: 830/466, loss: 0.1958453506231308 2023-01-24 01:28:30.223179: step: 832/466, loss: 0.19367656111717224 2023-01-24 01:28:30.886066: step: 834/466, loss: 0.2849229574203491 2023-01-24 01:28:31.463314: step: 836/466, loss: 0.2829507291316986 2023-01-24 01:28:32.044353: step: 838/466, loss: 3.710892915725708 2023-01-24 01:28:32.734065: step: 840/466, loss: 0.6591838598251343 2023-01-24 01:28:33.324717: step: 842/466, loss: 0.45559462904930115 2023-01-24 01:28:33.915289: step: 844/466, loss: 0.8332613706588745 2023-01-24 01:28:34.540564: step: 846/466, loss: 0.9570567011833191 2023-01-24 01:28:35.344453: step: 848/466, loss: 0.2521909773349762 2023-01-24 01:28:35.970576: step: 850/466, loss: 0.37644124031066895 2023-01-24 01:28:36.547760: step: 852/466, loss: 0.18477340042591095 2023-01-24 01:28:37.256570: step: 854/466, loss: 0.2751915752887726 2023-01-24 01:28:37.859340: step: 856/466, loss: 0.25587552785873413 2023-01-24 01:28:38.455368: step: 858/466, loss: 1.611753225326538 2023-01-24 01:28:39.088068: step: 860/466, loss: 0.37444859743118286 2023-01-24 01:28:39.685516: step: 862/466, loss: 1.0017908811569214 2023-01-24 01:28:40.265961: step: 864/466, loss: 0.2559552788734436 2023-01-24 01:28:40.952429: step: 866/466, loss: 0.16050826013088226 2023-01-24 01:28:41.558478: 
step: 868/466, loss: 0.8576302528381348 2023-01-24 01:28:42.231691: step: 870/466, loss: 0.5445796251296997 2023-01-24 01:28:42.807057: step: 872/466, loss: 0.9365764260292053 2023-01-24 01:28:43.498713: step: 874/466, loss: 0.14005893468856812 2023-01-24 01:28:44.059045: step: 876/466, loss: 0.3255828619003296 2023-01-24 01:28:44.642929: step: 878/466, loss: 0.8145939111709595 2023-01-24 01:28:45.271415: step: 880/466, loss: 0.5594971776008606 2023-01-24 01:28:45.895246: step: 882/466, loss: 1.537627935409546 2023-01-24 01:28:46.508300: step: 884/466, loss: 0.20095406472682953 2023-01-24 01:28:47.142428: step: 886/466, loss: 0.868279218673706 2023-01-24 01:28:47.677376: step: 888/466, loss: 0.13002410531044006 2023-01-24 01:28:48.302314: step: 890/466, loss: 0.2226947396993637 2023-01-24 01:28:48.944794: step: 892/466, loss: 0.6223090887069702 2023-01-24 01:28:49.595817: step: 894/466, loss: 0.18663303554058075 2023-01-24 01:28:50.265899: step: 896/466, loss: 1.025985598564148 2023-01-24 01:28:50.868704: step: 898/466, loss: 0.2230851948261261 2023-01-24 01:28:51.471519: step: 900/466, loss: 0.39694327116012573 2023-01-24 01:28:52.090830: step: 902/466, loss: 0.15065787732601166 2023-01-24 01:28:52.706763: step: 904/466, loss: 0.20817795395851135 2023-01-24 01:28:53.361108: step: 906/466, loss: 0.49367645382881165 2023-01-24 01:28:54.061193: step: 908/466, loss: 1.8523865938186646 2023-01-24 01:28:54.674495: step: 910/466, loss: 0.26639750599861145 2023-01-24 01:28:55.334058: step: 912/466, loss: 0.4446331858634949 2023-01-24 01:28:56.004734: step: 914/466, loss: 2.210296154022217 2023-01-24 01:28:56.668235: step: 916/466, loss: 0.5116863250732422 2023-01-24 01:28:57.294786: step: 918/466, loss: 0.4732198417186737 2023-01-24 01:28:57.938816: step: 920/466, loss: 0.7057874202728271 2023-01-24 01:28:58.544509: step: 922/466, loss: 0.28075242042541504 2023-01-24 01:28:59.181950: step: 924/466, loss: 0.23045767843723297 2023-01-24 01:28:59.800847: step: 926/466, loss: 
3.6228244304656982 2023-01-24 01:29:00.419105: step: 928/466, loss: 0.7160196304321289 2023-01-24 01:29:01.113443: step: 930/466, loss: 0.2058020830154419 2023-01-24 01:29:01.777753: step: 932/466, loss: 0.09874732792377472
==================================================
Loss: 0.505
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3257307354986996, 'r': 0.3096605284342476, 'f1': 0.31749240950359625}, 'combined': 0.23394177542370248, 'epoch': 8}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36657467323348636, 'r': 0.28483200625488697, 'f1': 0.3205745440047707}, 'combined': 0.21260902400316395, 'epoch': 8}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3277193509615385, 'r': 0.25820312500000003, 'f1': 0.2888373940677967}, 'combined': 0.19255826271186444, 'epoch': 8}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.38913889350389336, 'r': 0.2719607452443086, 'f1': 0.3201650622022891}, 'combined': 0.2089498300688623, 'epoch': 8}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32109180482176475, 'r': 0.30403189868322694, 'f1': 0.31232906550889006}, 'combined': 0.2301372061644453, 'epoch': 8}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37128897699724334, 'r': 0.28127952802821465, 'f1': 0.3200767043079684}, 'combined': 0.21227885052549197, 'epoch': 8}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3390804597701149, 'r': 0.2809523809523809, 'f1': 0.30729166666666663}, 'combined': 0.20486111111111108, 'epoch': 8}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5178571428571429, 'r': 0.31521739130434784, 'f1': 0.39189189189189194}, 'combined': 0.26126126126126126, 'epoch': 8}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4444444444444444, 'r': 0.13793103448275862, 'f1': 0.21052631578947367}, 'combined': 0.14035087719298245, 'epoch': 8}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3426994403339025, 'r': 0.326442351134002, 'f1': 0.3343734092276366}, 'combined': 0.24638040679931117, 'epoch': 6}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3829430975684692, 'r': 0.2621003488086857, 'f1': 0.3112023539198586}, 'combined': 0.20639327099348134, 'epoch': 6}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3494623655913978, 'r': 0.3095238095238095, 'f1': 0.32828282828282823}, 'combined': 0.2188552188552188, 'epoch': 6}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3277193509615385, 'r': 0.25820312500000003, 'f1': 0.2888373940677967}, 'combined': 0.19255826271186444, 'epoch': 8}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.38913889350389336, 'r': 0.2719607452443086, 'f1': 0.3201650622022891}, 'combined': 0.2089498300688623, 'epoch': 8}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5178571428571429, 'r': 0.31521739130434784, 'f1': 0.39189189189189194}, 'combined': 0.26126126126126126, 'epoch': 8}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 9
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 01:31:37.229291: step: 2/466, loss: 0.2886248230934143 2023-01-24 01:31:37.864466: step: 4/466, loss: 0.308744341135025 2023-01-24 01:31:38.500235: step: 6/466, loss: 0.18767115473747253 2023-01-24 01:31:39.152127: step: 8/466, loss: 0.4757394790649414 2023-01-24 01:31:39.720362: step: 10/466, loss: 0.18802697956562042 2023-01-24 01:31:40.325256: step: 12/466, loss: 0.6095316410064697 2023-01-24 01:31:40.935574: step: 14/466, loss: 0.48483580350875854 2023-01-24 01:31:41.572496: step: 16/466, loss: 0.46638408303260803 2023-01-24 01:31:42.190216: step: 18/466, loss: 0.15202441811561584 2023-01-24 01:31:42.783395: step: 20/466, loss: 0.24129228293895721 2023-01-24 01:31:43.412257: step: 22/466, loss: 0.48490291833877563 2023-01-24 01:31:44.032619: step: 24/466, loss: 0.0903395414352417 2023-01-24 01:31:44.671028: step: 26/466, loss: 0.06721008569002151 2023-01-24 01:31:45.310666: step: 28/466, loss: 0.1363254338502884 2023-01-24 01:31:45.933036: step: 30/466, loss: 0.20085632801055908 2023-01-24 01:31:46.564290: step: 32/466, loss: 0.4882054328918457 2023-01-24 01:31:47.174231: step: 34/466, loss: 0.4240463078022003 2023-01-24 01:31:47.824036: step: 36/466, loss: 0.14644275605678558 2023-01-24 01:31:48.459883: step: 38/466, loss: 0.1937483698129654 2023-01-24
01:31:49.108300: step: 40/466, loss: 0.295713871717453 2023-01-24 01:31:49.746353: step: 42/466, loss: 0.14528240263462067 2023-01-24 01:31:50.401665: step: 44/466, loss: 0.20442795753479004 2023-01-24 01:31:51.043028: step: 46/466, loss: 0.42661958932876587 2023-01-24 01:31:51.675949: step: 48/466, loss: 0.16562780737876892 2023-01-24 01:31:52.320174: step: 50/466, loss: 0.18909737467765808 2023-01-24 01:31:52.976013: step: 52/466, loss: 0.10642421245574951 2023-01-24 01:31:53.631759: step: 54/466, loss: 0.3187907338142395 2023-01-24 01:31:54.233848: step: 56/466, loss: 0.653616726398468 2023-01-24 01:31:54.859832: step: 58/466, loss: 0.15026280283927917 2023-01-24 01:31:55.428909: step: 60/466, loss: 0.4882783889770508 2023-01-24 01:31:56.048263: step: 62/466, loss: 0.13916371762752533 2023-01-24 01:31:56.641936: step: 64/466, loss: 0.8189008235931396 2023-01-24 01:31:57.278993: step: 66/466, loss: 0.14434003829956055 2023-01-24 01:31:57.889032: step: 68/466, loss: 0.18077050149440765 2023-01-24 01:31:58.481652: step: 70/466, loss: 0.4928813576698303 2023-01-24 01:31:59.146658: step: 72/466, loss: 0.1190536841750145 2023-01-24 01:31:59.766805: step: 74/466, loss: 0.7139197587966919 2023-01-24 01:32:00.374690: step: 76/466, loss: 0.3733008801937103 2023-01-24 01:32:00.982529: step: 78/466, loss: 0.1874418556690216 2023-01-24 01:32:01.611276: step: 80/466, loss: 0.22743932902812958 2023-01-24 01:32:02.222005: step: 82/466, loss: 0.057189784944057465 2023-01-24 01:32:02.814687: step: 84/466, loss: 0.7802344560623169 2023-01-24 01:32:03.409166: step: 86/466, loss: 0.25141051411628723 2023-01-24 01:32:03.998694: step: 88/466, loss: 0.21101033687591553 2023-01-24 01:32:04.557608: step: 90/466, loss: 0.2927389144897461 2023-01-24 01:32:05.214500: step: 92/466, loss: 0.903736412525177 2023-01-24 01:32:05.842074: step: 94/466, loss: 0.9050028920173645 2023-01-24 01:32:06.494588: step: 96/466, loss: 0.19579294323921204 2023-01-24 01:32:07.156369: step: 98/466, loss: 
0.39611369371414185 2023-01-24 01:32:07.814953: step: 100/466, loss: 0.267816960811615 2023-01-24 01:32:08.468516: step: 102/466, loss: 0.3000200092792511 2023-01-24 01:32:09.049827: step: 104/466, loss: 0.20672659575939178 2023-01-24 01:32:09.664999: step: 106/466, loss: 3.0240981578826904 2023-01-24 01:32:10.356514: step: 108/466, loss: 0.2653675079345703 2023-01-24 01:32:10.961124: step: 110/466, loss: 0.2504380941390991 2023-01-24 01:32:11.569383: step: 112/466, loss: 0.1431732028722763 2023-01-24 01:32:12.259959: step: 114/466, loss: 0.2724539339542389 2023-01-24 01:32:12.855110: step: 116/466, loss: 1.323121428489685 2023-01-24 01:32:13.461179: step: 118/466, loss: 0.3038110136985779 2023-01-24 01:32:14.085399: step: 120/466, loss: 0.3248835504055023 2023-01-24 01:32:14.729854: step: 122/466, loss: 0.3454150855541229 2023-01-24 01:32:15.335048: step: 124/466, loss: 0.32745280861854553 2023-01-24 01:32:15.964006: step: 126/466, loss: 0.12488136440515518 2023-01-24 01:32:16.732012: step: 128/466, loss: 0.11687491834163666 2023-01-24 01:32:17.306526: step: 130/466, loss: 0.31433528661727905 2023-01-24 01:32:17.918879: step: 132/466, loss: 0.5409859418869019 2023-01-24 01:32:18.693729: step: 134/466, loss: 0.10108667612075806 2023-01-24 01:32:19.278208: step: 136/466, loss: 0.1774735450744629 2023-01-24 01:32:19.841322: step: 138/466, loss: 0.5564969778060913 2023-01-24 01:32:20.434484: step: 140/466, loss: 0.2754778563976288 2023-01-24 01:32:21.063241: step: 142/466, loss: 0.21344691514968872 2023-01-24 01:32:21.704808: step: 144/466, loss: 0.13812457025051117 2023-01-24 01:32:22.299459: step: 146/466, loss: 0.23207207024097443 2023-01-24 01:32:22.893423: step: 148/466, loss: 0.46367210149765015 2023-01-24 01:32:23.496587: step: 150/466, loss: 0.27947306632995605 2023-01-24 01:32:24.195916: step: 152/466, loss: 0.11693456768989563 2023-01-24 01:32:24.810086: step: 154/466, loss: 0.3103890120983124 2023-01-24 01:32:25.483585: step: 156/466, loss: 
0.26553767919540405 2023-01-24 01:32:26.107747: step: 158/466, loss: 1.2345422506332397 2023-01-24 01:32:26.685671: step: 160/466, loss: 0.4150778651237488 2023-01-24 01:32:27.371819: step: 162/466, loss: 0.9702160954475403 2023-01-24 01:32:28.025751: step: 164/466, loss: 0.6917840838432312 2023-01-24 01:32:28.682051: step: 166/466, loss: 2.0591235160827637 2023-01-24 01:32:29.306773: step: 168/466, loss: 0.2591855525970459 2023-01-24 01:32:29.964101: step: 170/466, loss: 0.20238420367240906 2023-01-24 01:32:30.551195: step: 172/466, loss: 0.17404764890670776 2023-01-24 01:32:31.089969: step: 174/466, loss: 0.17147669196128845 2023-01-24 01:32:31.699695: step: 176/466, loss: 0.28134211897850037 2023-01-24 01:32:32.360202: step: 178/466, loss: 0.3062577247619629 2023-01-24 01:32:32.999765: step: 180/466, loss: 0.36133408546447754 2023-01-24 01:32:33.622805: step: 182/466, loss: 1.115287184715271 2023-01-24 01:32:34.196385: step: 184/466, loss: 0.14870578050613403 2023-01-24 01:32:34.838201: step: 186/466, loss: 0.2662959098815918 2023-01-24 01:32:35.489260: step: 188/466, loss: 0.8476355671882629 2023-01-24 01:32:36.117190: step: 190/466, loss: 0.7082141637802124 2023-01-24 01:32:36.771986: step: 192/466, loss: 0.25874102115631104 2023-01-24 01:32:37.339477: step: 194/466, loss: 0.2716241180896759 2023-01-24 01:32:37.993210: step: 196/466, loss: 0.3178020417690277 2023-01-24 01:32:38.631558: step: 198/466, loss: 1.2630382776260376 2023-01-24 01:32:39.244819: step: 200/466, loss: 0.14553770422935486 2023-01-24 01:32:39.892853: step: 202/466, loss: 0.9363977313041687 2023-01-24 01:32:40.562286: step: 204/466, loss: 4.289189338684082 2023-01-24 01:32:41.165437: step: 206/466, loss: 0.8392980694770813 2023-01-24 01:32:41.757834: step: 208/466, loss: 0.37578555941581726 2023-01-24 01:32:42.364559: step: 210/466, loss: 0.2799866199493408 2023-01-24 01:32:43.002390: step: 212/466, loss: 0.22827349603176117 2023-01-24 01:32:43.888446: step: 214/466, loss: 0.8818905353546143 
2023-01-24 01:32:44.501554: step: 216/466, loss: 0.31714412569999695 2023-01-24 01:32:45.123785: step: 218/466, loss: 0.4018808603286743 2023-01-24 01:32:45.745904: step: 220/466, loss: 0.7286221981048584 2023-01-24 01:32:46.348223: step: 222/466, loss: 1.2584357261657715 2023-01-24 01:32:46.973764: step: 224/466, loss: 0.6334192752838135 2023-01-24 01:32:47.607708: step: 226/466, loss: 0.4385855197906494 2023-01-24 01:32:48.262770: step: 228/466, loss: 0.36628538370132446 2023-01-24 01:32:48.892980: step: 230/466, loss: 7.889693260192871 2023-01-24 01:32:49.536642: step: 232/466, loss: 0.21159392595291138 2023-01-24 01:32:50.215329: step: 234/466, loss: 0.7000142335891724 2023-01-24 01:32:50.797410: step: 236/466, loss: 0.11271099746227264 2023-01-24 01:32:51.479811: step: 238/466, loss: 0.3571082353591919 2023-01-24 01:32:52.120297: step: 240/466, loss: 0.4262726604938507 2023-01-24 01:32:52.745480: step: 242/466, loss: 0.6651432514190674 2023-01-24 01:32:53.376198: step: 244/466, loss: 0.6429592370986938 2023-01-24 01:32:53.998113: step: 246/466, loss: 0.3488086462020874 2023-01-24 01:32:54.637735: step: 248/466, loss: 0.30513519048690796 2023-01-24 01:32:55.303521: step: 250/466, loss: 0.1501394361257553 2023-01-24 01:32:55.944955: step: 252/466, loss: 0.1115301251411438 2023-01-24 01:32:56.495062: step: 254/466, loss: 0.2430560290813446 2023-01-24 01:32:57.136962: step: 256/466, loss: 0.46057721972465515 2023-01-24 01:32:57.762797: step: 258/466, loss: 0.1814437210559845 2023-01-24 01:32:58.455549: step: 260/466, loss: 0.4026636481285095 2023-01-24 01:32:59.071919: step: 262/466, loss: 0.26710110902786255 2023-01-24 01:32:59.713661: step: 264/466, loss: 0.40534016489982605 2023-01-24 01:33:00.307833: step: 266/466, loss: 0.25253409147262573 2023-01-24 01:33:01.033134: step: 268/466, loss: 0.2516902983188629 2023-01-24 01:33:01.677633: step: 270/466, loss: 0.2063271850347519 2023-01-24 01:33:02.309357: step: 272/466, loss: 0.14844533801078796 2023-01-24 
01:33:02.908782: step: 274/466, loss: 0.2199673056602478 2023-01-24 01:33:03.524821: step: 276/466, loss: 0.05996764451265335 2023-01-24 01:33:04.079114: step: 278/466, loss: 0.143018439412117 2023-01-24 01:33:04.653453: step: 280/466, loss: 0.06843327730894089 2023-01-24 01:33:05.285802: step: 282/466, loss: 0.24048197269439697 2023-01-24 01:33:05.910299: step: 284/466, loss: 0.08511865139007568 2023-01-24 01:33:06.554572: step: 286/466, loss: 0.4760672450065613 2023-01-24 01:33:07.233055: step: 288/466, loss: 0.9917468428611755 2023-01-24 01:33:07.921856: step: 290/466, loss: 0.506651759147644 2023-01-24 01:33:08.523281: step: 292/466, loss: 0.24498611688613892 2023-01-24 01:33:09.105408: step: 294/466, loss: 0.097173772752285 2023-01-24 01:33:09.697961: step: 296/466, loss: 0.11477378755807877 2023-01-24 01:33:10.243192: step: 298/466, loss: 0.4592766761779785 2023-01-24 01:33:10.871081: step: 300/466, loss: 0.23259836435317993 2023-01-24 01:33:11.488739: step: 302/466, loss: 0.6135904788970947 2023-01-24 01:33:12.115329: step: 304/466, loss: 0.45602795481681824 2023-01-24 01:33:12.785233: step: 306/466, loss: 0.229086235165596 2023-01-24 01:33:13.467597: step: 308/466, loss: 0.7307414412498474 2023-01-24 01:33:14.052440: step: 310/466, loss: 0.1451161801815033 2023-01-24 01:33:14.685678: step: 312/466, loss: 0.38837188482284546 2023-01-24 01:33:15.257928: step: 314/466, loss: 0.1958981454372406 2023-01-24 01:33:15.961170: step: 316/466, loss: 0.10762417316436768 2023-01-24 01:33:16.554423: step: 318/466, loss: 0.1184818223118782 2023-01-24 01:33:17.131510: step: 320/466, loss: 0.17390476167201996 2023-01-24 01:33:17.731991: step: 322/466, loss: 0.35525721311569214 2023-01-24 01:33:18.441705: step: 324/466, loss: 0.4133126735687256 2023-01-24 01:33:19.100917: step: 326/466, loss: 0.2449815571308136 2023-01-24 01:33:19.731345: step: 328/466, loss: 0.1683533489704132 2023-01-24 01:33:20.246621: step: 330/466, loss: 0.15328174829483032 2023-01-24 01:33:20.857731: 
step: 332/466, loss: 0.18109317123889923 2023-01-24 01:33:21.582340: step: 334/466, loss: 0.13646559417247772 2023-01-24 01:33:22.206712: step: 336/466, loss: 0.35461995005607605 2023-01-24 01:33:22.799682: step: 338/466, loss: 0.2538391351699829 2023-01-24 01:33:23.489893: step: 340/466, loss: 0.42897504568099976 2023-01-24 01:33:24.138341: step: 342/466, loss: 0.2490566372871399 2023-01-24 01:33:24.783739: step: 344/466, loss: 0.3087506890296936 2023-01-24 01:33:25.429069: step: 346/466, loss: 0.3386785686016083 2023-01-24 01:33:26.087491: step: 348/466, loss: 0.1746833473443985 2023-01-24 01:33:26.697437: step: 350/466, loss: 0.48220208287239075 2023-01-24 01:33:27.383483: step: 352/466, loss: 0.5700064301490784 2023-01-24 01:33:28.011290: step: 354/466, loss: 0.2225196361541748 2023-01-24 01:33:28.629611: step: 356/466, loss: 0.21626007556915283 2023-01-24 01:33:29.216660: step: 358/466, loss: 0.3181474208831787 2023-01-24 01:33:29.812217: step: 360/466, loss: 0.2892976701259613 2023-01-24 01:33:30.494789: step: 362/466, loss: 0.6562973856925964 2023-01-24 01:33:31.186911: step: 364/466, loss: 0.42469480633735657 2023-01-24 01:33:31.770130: step: 366/466, loss: 0.16269342601299286 2023-01-24 01:33:32.398326: step: 368/466, loss: 0.11876245588064194 2023-01-24 01:33:33.015183: step: 370/466, loss: 0.19721461832523346 2023-01-24 01:33:33.674020: step: 372/466, loss: 0.11414781957864761 2023-01-24 01:33:34.268434: step: 374/466, loss: 0.6896754503250122 2023-01-24 01:33:34.863447: step: 376/466, loss: 0.6822637319564819 2023-01-24 01:33:35.544885: step: 378/466, loss: 0.5647622346878052 2023-01-24 01:33:36.149803: step: 380/466, loss: 0.4214338958263397 2023-01-24 01:33:36.805462: step: 382/466, loss: 0.21153606474399567 2023-01-24 01:33:37.476322: step: 384/466, loss: 0.2709487974643707 2023-01-24 01:33:38.080185: step: 386/466, loss: 0.23430021107196808 2023-01-24 01:33:38.717505: step: 388/466, loss: 0.1149580180644989 2023-01-24 01:33:39.381590: step: 390/466, 
loss: 0.9785158634185791 2023-01-24 01:33:40.014701: step: 392/466, loss: 0.2314041405916214 2023-01-24 01:33:40.625268: step: 394/466, loss: 0.1163083165884018 2023-01-24 01:33:41.238902: step: 396/466, loss: 0.6735597252845764 2023-01-24 01:33:41.907611: step: 398/466, loss: 0.682388186454773 2023-01-24 01:33:42.606748: step: 400/466, loss: 0.27005475759506226 2023-01-24 01:33:43.295927: step: 402/466, loss: 0.261234849691391 2023-01-24 01:33:43.910314: step: 404/466, loss: 0.35506269335746765 2023-01-24 01:33:44.539159: step: 406/466, loss: 0.2801685631275177 2023-01-24 01:33:45.161339: step: 408/466, loss: 0.5067627429962158 2023-01-24 01:33:45.800887: step: 410/466, loss: 0.4552246332168579 2023-01-24 01:33:46.412571: step: 412/466, loss: 0.30544689297676086 2023-01-24 01:33:47.030541: step: 414/466, loss: 0.3824828267097473 2023-01-24 01:33:47.595807: step: 416/466, loss: 0.49223142862319946 2023-01-24 01:33:48.198227: step: 418/466, loss: 0.18452343344688416 2023-01-24 01:33:48.831004: step: 420/466, loss: 0.1926453560590744 2023-01-24 01:33:49.477885: step: 422/466, loss: 0.2764644920825958 2023-01-24 01:33:50.054615: step: 424/466, loss: 0.30662691593170166 2023-01-24 01:33:50.687263: step: 426/466, loss: 0.17963269352912903 2023-01-24 01:33:51.334308: step: 428/466, loss: 0.2118184119462967 2023-01-24 01:33:51.975256: step: 430/466, loss: 0.1730094999074936 2023-01-24 01:33:52.574663: step: 432/466, loss: 0.4022391140460968 2023-01-24 01:33:53.150276: step: 434/466, loss: 0.28192245960235596 2023-01-24 01:33:53.745018: step: 436/466, loss: 0.3259207010269165 2023-01-24 01:33:54.340100: step: 438/466, loss: 0.2654353380203247 2023-01-24 01:33:54.969888: step: 440/466, loss: 1.4085147380828857 2023-01-24 01:33:55.669122: step: 442/466, loss: 0.8635183572769165 2023-01-24 01:33:56.304272: step: 444/466, loss: 0.18238022923469543 2023-01-24 01:33:56.916455: step: 446/466, loss: 0.9470014572143555 2023-01-24 01:33:57.569148: step: 448/466, loss: 
0.6711329817771912 2023-01-24 01:33:58.184262: step: 450/466, loss: 0.10989248752593994 2023-01-24 01:33:58.864912: step: 452/466, loss: 0.22127914428710938 2023-01-24 01:33:59.437463: step: 454/466, loss: 0.154241144657135 2023-01-24 01:33:59.994899: step: 456/466, loss: 0.14610041677951813 2023-01-24 01:34:00.669123: step: 458/466, loss: 0.6313903331756592 2023-01-24 01:34:01.334831: step: 460/466, loss: 0.4210139513015747 2023-01-24 01:34:01.958505: step: 462/466, loss: 0.10614173859357834 2023-01-24 01:34:02.558436: step: 464/466, loss: 0.1791078895330429 2023-01-24 01:34:03.201960: step: 466/466, loss: 0.10792126506567001 2023-01-24 01:34:03.831219: step: 468/466, loss: 0.41749194264411926 2023-01-24 01:34:04.521649: step: 470/466, loss: 0.15609955787658691 2023-01-24 01:34:05.156995: step: 472/466, loss: 0.18218357861042023 2023-01-24 01:34:05.797628: step: 474/466, loss: 0.15988190472126007 2023-01-24 01:34:06.406998: step: 476/466, loss: 0.49651825428009033 2023-01-24 01:34:07.055943: step: 478/466, loss: 0.3248773515224457 2023-01-24 01:34:07.637262: step: 480/466, loss: 0.2903391420841217 2023-01-24 01:34:08.276781: step: 482/466, loss: 0.1469796597957611 2023-01-24 01:34:08.906150: step: 484/466, loss: 0.7293221950531006 2023-01-24 01:34:09.629200: step: 486/466, loss: 0.2348928600549698 2023-01-24 01:34:10.400592: step: 488/466, loss: 0.19862528145313263 2023-01-24 01:34:11.058739: step: 490/466, loss: 1.9945156574249268 2023-01-24 01:34:11.668017: step: 492/466, loss: 0.6894669532775879 2023-01-24 01:34:12.338687: step: 494/466, loss: 0.23232914507389069 2023-01-24 01:34:12.966534: step: 496/466, loss: 0.180666983127594 2023-01-24 01:34:13.593558: step: 498/466, loss: 0.4309394657611847 2023-01-24 01:34:14.301423: step: 500/466, loss: 1.1669100522994995 2023-01-24 01:34:14.899937: step: 502/466, loss: 0.6310187578201294 2023-01-24 01:34:15.455003: step: 504/466, loss: 0.1634405255317688 2023-01-24 01:34:16.053738: step: 506/466, loss: 
0.30890509486198425 2023-01-24 01:34:16.707479: step: 508/466, loss: 0.15104559063911438 2023-01-24 01:34:17.332702: step: 510/466, loss: 0.44312986731529236 2023-01-24 01:34:17.899847: step: 512/466, loss: 1.0001057386398315 2023-01-24 01:34:18.544333: step: 514/466, loss: 0.25896045565605164 2023-01-24 01:34:19.168452: step: 516/466, loss: 0.11799373477697372 2023-01-24 01:34:19.843143: step: 518/466, loss: 0.290573388338089 2023-01-24 01:34:20.410353: step: 520/466, loss: 0.22119809687137604 2023-01-24 01:34:20.990937: step: 522/466, loss: 0.10746853798627853 2023-01-24 01:34:21.559381: step: 524/466, loss: 0.38003775477409363 2023-01-24 01:34:22.165123: step: 526/466, loss: 0.09424518793821335 2023-01-24 01:34:22.896799: step: 528/466, loss: 1.2450789213180542 2023-01-24 01:34:23.528939: step: 530/466, loss: 0.7346737384796143 2023-01-24 01:34:24.166684: step: 532/466, loss: 0.1465187668800354 2023-01-24 01:34:24.750411: step: 534/466, loss: 0.20528163015842438 2023-01-24 01:34:25.338868: step: 536/466, loss: 0.2840029001235962 2023-01-24 01:34:25.924110: step: 538/466, loss: 0.4026467204093933 2023-01-24 01:34:26.592330: step: 540/466, loss: 0.13014625012874603 2023-01-24 01:34:27.222331: step: 542/466, loss: 0.20283189415931702 2023-01-24 01:34:27.850259: step: 544/466, loss: 0.13066184520721436 2023-01-24 01:34:28.511467: step: 546/466, loss: 0.32642120122909546 2023-01-24 01:34:29.153217: step: 548/466, loss: 0.21554157137870789 2023-01-24 01:34:29.729806: step: 550/466, loss: 0.2458067387342453 2023-01-24 01:34:30.362121: step: 552/466, loss: 0.1621115505695343 2023-01-24 01:34:30.955982: step: 554/466, loss: 0.22945551574230194 2023-01-24 01:34:31.617872: step: 556/466, loss: 1.2747833728790283 2023-01-24 01:34:32.298503: step: 558/466, loss: 0.1876629889011383 2023-01-24 01:34:32.934797: step: 560/466, loss: 0.1095738410949707 2023-01-24 01:34:33.600971: step: 562/466, loss: 0.35169994831085205 2023-01-24 01:34:34.233681: step: 564/466, loss: 
0.2967619001865387 2023-01-24 01:34:34.823405: step: 566/466, loss: 0.23424719274044037 2023-01-24 01:34:35.397406: step: 568/466, loss: 0.16903695464134216 2023-01-24 01:34:35.991493: step: 570/466, loss: 0.2578504979610443 2023-01-24 01:34:36.597774: step: 572/466, loss: 0.1446860134601593 2023-01-24 01:34:37.198979: step: 574/466, loss: 0.09747033566236496 2023-01-24 01:34:37.762760: step: 576/466, loss: 0.7096918225288391 2023-01-24 01:34:38.334996: step: 578/466, loss: 0.14603278040885925 2023-01-24 01:34:38.920937: step: 580/466, loss: 0.2655777037143707 2023-01-24 01:34:39.551470: step: 582/466, loss: 0.1466868370771408 2023-01-24 01:34:40.099429: step: 584/466, loss: 0.1403120905160904 2023-01-24 01:34:40.760395: step: 586/466, loss: 0.14304976165294647 2023-01-24 01:34:41.355232: step: 588/466, loss: 0.22928020358085632 2023-01-24 01:34:41.978934: step: 590/466, loss: 0.5682132840156555 2023-01-24 01:34:42.674446: step: 592/466, loss: 0.11868242174386978 2023-01-24 01:34:43.353247: step: 594/466, loss: 0.9860707521438599 2023-01-24 01:34:43.978964: step: 596/466, loss: 0.6569647192955017 2023-01-24 01:34:44.647808: step: 598/466, loss: 0.4909774363040924 2023-01-24 01:34:45.223457: step: 600/466, loss: 0.35181015729904175 2023-01-24 01:34:45.837462: step: 602/466, loss: 0.7969850301742554 2023-01-24 01:34:46.518389: step: 604/466, loss: 0.23063130676746368 2023-01-24 01:34:47.146949: step: 606/466, loss: 0.44911569356918335 2023-01-24 01:34:47.725394: step: 608/466, loss: 0.07528708875179291 2023-01-24 01:34:48.345948: step: 610/466, loss: 0.18028683960437775 2023-01-24 01:34:48.984843: step: 612/466, loss: 0.06006666645407677 2023-01-24 01:34:49.554097: step: 614/466, loss: 0.3530943989753723 2023-01-24 01:34:50.136285: step: 616/466, loss: 0.31971678137779236 2023-01-24 01:34:50.745666: step: 618/466, loss: 0.6333480477333069 2023-01-24 01:34:51.350551: step: 620/466, loss: 0.35352540016174316 2023-01-24 01:34:51.953608: step: 622/466, loss: 
0.6284568309783936 2023-01-24 01:34:52.696000: step: 624/466, loss: 0.11288725584745407 2023-01-24 01:34:53.273422: step: 626/466, loss: 0.07124555110931396 2023-01-24 01:34:53.890173: step: 628/466, loss: 0.05913181230425835 2023-01-24 01:34:54.528379: step: 630/466, loss: 0.18568114936351776 2023-01-24 01:34:55.204731: step: 632/466, loss: 0.4419129490852356 2023-01-24 01:34:55.786028: step: 634/466, loss: 0.22381986677646637 2023-01-24 01:34:56.402432: step: 636/466, loss: 1.7939949035644531 2023-01-24 01:34:57.043998: step: 638/466, loss: 0.3469640910625458 2023-01-24 01:34:57.668399: step: 640/466, loss: 0.15622952580451965 2023-01-24 01:34:58.288443: step: 642/466, loss: 0.3314429223537445 2023-01-24 01:34:58.983717: step: 644/466, loss: 0.17653773725032806 2023-01-24 01:34:59.595233: step: 646/466, loss: 0.3294999897480011 2023-01-24 01:35:00.192009: step: 648/466, loss: 1.521822452545166 2023-01-24 01:35:00.836799: step: 650/466, loss: 0.4291199743747711 2023-01-24 01:35:01.525767: step: 652/466, loss: 0.26721054315567017 2023-01-24 01:35:02.191344: step: 654/466, loss: 0.2262919545173645 2023-01-24 01:35:02.782745: step: 656/466, loss: 0.16066038608551025 2023-01-24 01:35:03.299111: step: 658/466, loss: 0.22145728766918182 2023-01-24 01:35:03.961452: step: 660/466, loss: 0.4337965250015259 2023-01-24 01:35:04.630569: step: 662/466, loss: 0.25202491879463196 2023-01-24 01:35:05.308516: step: 664/466, loss: 0.131889209151268 2023-01-24 01:35:05.896731: step: 666/466, loss: 0.3332974314689636 2023-01-24 01:35:06.447710: step: 668/466, loss: 0.19245274364948273 2023-01-24 01:35:07.056510: step: 670/466, loss: 0.41981491446495056 2023-01-24 01:35:07.661416: step: 672/466, loss: 0.503174364566803 2023-01-24 01:35:08.259167: step: 674/466, loss: 1.4129886627197266 2023-01-24 01:35:08.924224: step: 676/466, loss: 0.10373762249946594 2023-01-24 01:35:09.564342: step: 678/466, loss: 0.2581503093242645 2023-01-24 01:35:10.200280: step: 680/466, loss: 
0.7148119211196899 2023-01-24 01:35:10.851143: step: 682/466, loss: 0.43798357248306274 2023-01-24 01:35:11.425601: step: 684/466, loss: 0.36510202288627625 2023-01-24 01:35:12.047311: step: 686/466, loss: 1.9389971494674683 2023-01-24 01:35:12.625562: step: 688/466, loss: 0.08822230994701385 2023-01-24 01:35:13.166806: step: 690/466, loss: 0.20560045540332794 2023-01-24 01:35:13.740299: step: 692/466, loss: 0.2151784747838974 2023-01-24 01:35:14.308097: step: 694/466, loss: 0.12266770005226135 2023-01-24 01:35:14.993811: step: 696/466, loss: 0.15356658399105072 2023-01-24 01:35:15.634998: step: 698/466, loss: 0.7142479419708252 2023-01-24 01:35:16.341135: step: 700/466, loss: 0.2554255723953247 2023-01-24 01:35:16.950625: step: 702/466, loss: 0.15858277678489685 2023-01-24 01:35:17.585366: step: 704/466, loss: 0.4641050696372986 2023-01-24 01:35:18.196568: step: 706/466, loss: 0.9413942098617554 2023-01-24 01:35:18.806498: step: 708/466, loss: 0.16942070424556732 2023-01-24 01:35:19.473300: step: 710/466, loss: 0.13268499076366425 2023-01-24 01:35:20.096267: step: 712/466, loss: 0.365239679813385 2023-01-24 01:35:20.742243: step: 714/466, loss: 0.6016151309013367 2023-01-24 01:35:21.336843: step: 716/466, loss: 0.29063844680786133 2023-01-24 01:35:21.937848: step: 718/466, loss: 0.4106799364089966 2023-01-24 01:35:22.573360: step: 720/466, loss: 0.146284282207489 2023-01-24 01:35:23.284646: step: 722/466, loss: 0.17649759352207184 2023-01-24 01:35:23.853902: step: 724/466, loss: 0.2214135378599167 2023-01-24 01:35:24.470007: step: 726/466, loss: 0.9062934517860413 2023-01-24 01:35:25.160539: step: 728/466, loss: 0.8165644407272339 2023-01-24 01:35:25.761483: step: 730/466, loss: 0.12466230988502502 2023-01-24 01:35:26.366676: step: 732/466, loss: 0.22552810609340668 2023-01-24 01:35:27.061753: step: 734/466, loss: 0.45871126651763916 2023-01-24 01:35:27.679913: step: 736/466, loss: 0.779687225818634 2023-01-24 01:35:28.290185: step: 738/466, loss: 
0.35549259185791016 2023-01-24 01:35:28.919114: step: 740/466, loss: 0.19906482100486755 2023-01-24 01:35:29.539249: step: 742/466, loss: 0.21490739285945892 2023-01-24 01:35:30.134152: step: 744/466, loss: 0.3891027569770813 2023-01-24 01:35:30.747924: step: 746/466, loss: 0.16956500709056854 2023-01-24 01:35:31.364658: step: 748/466, loss: 0.34901368618011475 2023-01-24 01:35:31.935240: step: 750/466, loss: 0.255240261554718 2023-01-24 01:35:32.580623: step: 752/466, loss: 0.5560647249221802 2023-01-24 01:35:33.157400: step: 754/466, loss: 0.3323211073875427 2023-01-24 01:35:33.831929: step: 756/466, loss: 0.24396903812885284 2023-01-24 01:35:34.503820: step: 758/466, loss: 0.49413108825683594 2023-01-24 01:35:35.108228: step: 760/466, loss: 0.35466375946998596 2023-01-24 01:35:35.789744: step: 762/466, loss: 0.18038402497768402 2023-01-24 01:35:36.439297: step: 764/466, loss: 0.29585492610931396 2023-01-24 01:35:37.078999: step: 766/466, loss: 0.48583555221557617 2023-01-24 01:35:37.678722: step: 768/466, loss: 0.2323603630065918 2023-01-24 01:35:38.254155: step: 770/466, loss: 0.22142218053340912 2023-01-24 01:35:38.841842: step: 772/466, loss: 0.45981019735336304 2023-01-24 01:35:39.415601: step: 774/466, loss: 0.70528644323349 2023-01-24 01:35:40.036009: step: 776/466, loss: 0.4367358684539795 2023-01-24 01:35:40.647151: step: 778/466, loss: 0.16971944272518158 2023-01-24 01:35:41.299630: step: 780/466, loss: 0.5198857188224792 2023-01-24 01:35:41.977236: step: 782/466, loss: 0.37577539682388306 2023-01-24 01:35:42.618966: step: 784/466, loss: 0.9556046724319458 2023-01-24 01:35:43.270154: step: 786/466, loss: 0.3785717785358429 2023-01-24 01:35:43.911700: step: 788/466, loss: 0.2823404371738434 2023-01-24 01:35:44.526365: step: 790/466, loss: 2.3219895362854004 2023-01-24 01:35:45.152953: step: 792/466, loss: 0.31830859184265137 2023-01-24 01:35:45.719843: step: 794/466, loss: 0.29214802384376526 2023-01-24 01:35:46.306774: step: 796/466, loss: 
0.17592476308345795 2023-01-24 01:35:46.924301: step: 798/466, loss: 0.7485669851303101 2023-01-24 01:35:47.572880: step: 800/466, loss: 0.12755541503429413 2023-01-24 01:35:48.175309: step: 802/466, loss: 0.28463253378868103 2023-01-24 01:35:48.783939: step: 804/466, loss: 0.5147067308425903 2023-01-24 01:35:49.486356: step: 806/466, loss: 0.3051072061061859 2023-01-24 01:35:50.131403: step: 808/466, loss: 1.1821234226226807 2023-01-24 01:35:50.762423: step: 810/466, loss: 0.9276929497718811 2023-01-24 01:35:51.479552: step: 812/466, loss: 0.4206128418445587 2023-01-24 01:35:52.094392: step: 814/466, loss: 0.5495182275772095 2023-01-24 01:35:52.756821: step: 816/466, loss: 0.21994075179100037 2023-01-24 01:35:53.318664: step: 818/466, loss: 0.15149745345115662 2023-01-24 01:35:54.070939: step: 820/466, loss: 1.3844549655914307 2023-01-24 01:35:54.636944: step: 822/466, loss: 2.7750017642974854 2023-01-24 01:35:55.284732: step: 824/466, loss: 0.33474108576774597 2023-01-24 01:35:55.900184: step: 826/466, loss: 0.11480654776096344 2023-01-24 01:35:56.542020: step: 828/466, loss: 0.3044185936450958 2023-01-24 01:35:57.177325: step: 830/466, loss: 0.31660595536231995 2023-01-24 01:35:57.774918: step: 832/466, loss: 0.8432048559188843 2023-01-24 01:35:58.432672: step: 834/466, loss: 3.6031930446624756 2023-01-24 01:35:59.163592: step: 836/466, loss: 0.1622489094734192 2023-01-24 01:35:59.844000: step: 838/466, loss: 0.3535557985305786 2023-01-24 01:36:00.457602: step: 840/466, loss: 0.377373069524765 2023-01-24 01:36:01.113501: step: 842/466, loss: 0.31769198179244995 2023-01-24 01:36:01.730097: step: 844/466, loss: 0.19945383071899414 2023-01-24 01:36:02.364330: step: 846/466, loss: 0.18304133415222168 2023-01-24 01:36:03.005209: step: 848/466, loss: 0.07066892832517624 2023-01-24 01:36:03.618681: step: 850/466, loss: 0.3190172612667084 2023-01-24 01:36:04.341830: step: 852/466, loss: 0.19037355482578278 2023-01-24 01:36:04.984307: step: 854/466, loss: 
1.0925954580307007 2023-01-24 01:36:05.664921: step: 856/466, loss: 0.07573944330215454 2023-01-24 01:36:06.272596: step: 858/466, loss: 0.2416946142911911 2023-01-24 01:36:06.918237: step: 860/466, loss: 0.1962304413318634 2023-01-24 01:36:07.546879: step: 862/466, loss: 0.5230595469474792 2023-01-24 01:36:08.168868: step: 864/466, loss: 1.2808942794799805 2023-01-24 01:36:08.733879: step: 866/466, loss: 0.1189044788479805 2023-01-24 01:36:09.370296: step: 868/466, loss: 0.35028746724128723 2023-01-24 01:36:09.978516: step: 870/466, loss: 0.15857447683811188 2023-01-24 01:36:10.640497: step: 872/466, loss: 0.30284059047698975 2023-01-24 01:36:11.221031: step: 874/466, loss: 0.15418531000614166 2023-01-24 01:36:11.775210: step: 876/466, loss: 0.22523635625839233 2023-01-24 01:36:12.356094: step: 878/466, loss: 0.6470255255699158 2023-01-24 01:36:12.998414: step: 880/466, loss: 0.13579440116882324 2023-01-24 01:36:13.615053: step: 882/466, loss: 0.27756914496421814 2023-01-24 01:36:14.225005: step: 884/466, loss: 0.2099146693944931 2023-01-24 01:36:14.941127: step: 886/466, loss: 0.8106641173362732 2023-01-24 01:36:15.486342: step: 888/466, loss: 0.37550458312034607 2023-01-24 01:36:16.182881: step: 890/466, loss: 0.31646865606307983 2023-01-24 01:36:16.843963: step: 892/466, loss: 0.46495404839515686 2023-01-24 01:36:17.455562: step: 894/466, loss: 0.48244810104370117 2023-01-24 01:36:18.056069: step: 896/466, loss: 0.21186278760433197 2023-01-24 01:36:18.708535: step: 898/466, loss: 0.17754632234573364 2023-01-24 01:36:19.370856: step: 900/466, loss: 0.16431452333927155 2023-01-24 01:36:20.111845: step: 902/466, loss: 0.7147395610809326 2023-01-24 01:36:20.721665: step: 904/466, loss: 0.5099537372589111 2023-01-24 01:36:21.331534: step: 906/466, loss: 0.5110349655151367 2023-01-24 01:36:21.908267: step: 908/466, loss: 1.6138478517532349 2023-01-24 01:36:22.492097: step: 910/466, loss: 0.21494626998901367 2023-01-24 01:36:23.104087: step: 912/466, loss: 
0.9132940769195557 2023-01-24 01:36:23.690911: step: 914/466, loss: 0.1571851521730423 2023-01-24 01:36:24.333943: step: 916/466, loss: 0.20044052600860596 2023-01-24 01:36:24.984178: step: 918/466, loss: 0.1431189775466919 2023-01-24 01:36:25.573309: step: 920/466, loss: 0.4292384088039398 2023-01-24 01:36:26.282204: step: 922/466, loss: 0.15444906055927277 2023-01-24 01:36:26.926195: step: 924/466, loss: 0.9784746170043945 2023-01-24 01:36:27.521233: step: 926/466, loss: 0.6480464339256287 2023-01-24 01:36:28.196813: step: 928/466, loss: 0.6187877655029297 2023-01-24 01:36:28.850929: step: 930/466, loss: 0.22937346994876862 2023-01-24 01:36:29.506118: step: 932/466, loss: 0.43938004970550537 ================================================== Loss: 0.429 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3175350349581958, 'r': 0.3253679675093467, 'f1': 0.3214037842126068}, 'combined': 0.23682384099876289, 'epoch': 9} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.351612705617177, 'r': 0.27672376571949253, 'f1': 0.30970537733140885}, 'combined': 0.20540045750476854, 'epoch': 9} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3129727839999492, 'r': 0.27503668896965233, 'f1': 0.29278099148382347}, 'combined': 0.19518732765588231, 'epoch': 9} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3722595000117802, 'r': 0.26737047926515933, 'f1': 0.3112149341144762}, 'combined': 0.20310869384313182, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30907918511549515, 'r': 0.32315489753062204, 'f1': 0.31596035435739855}, 'combined': 0.23281289268439892, 'epoch': 9} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 
0.6632124352331605}, 'slot': {'p': 0.3562002750597734, 'r': 0.27639415806973766, 'f1': 0.3112631726533042}, 'combined': 0.2064336067337976, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2879629629629629, 'r': 0.3702380952380952, 'f1': 0.3239583333333333}, 'combined': 0.21597222222222218, 'epoch': 9} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.359375, 'r': 0.25, 'f1': 0.2948717948717949}, 'combined': 0.19658119658119658, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36363636363636365, 'r': 0.13793103448275862, 'f1': 0.2}, 'combined': 0.13333333333333333, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3426994403339025, 'r': 0.326442351134002, 'f1': 0.3343734092276366}, 'combined': 0.24638040679931117, 'epoch': 6} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3829430975684692, 'r': 0.2621003488086857, 'f1': 0.3112023539198586}, 'combined': 0.20639327099348134, 'epoch': 6} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3494623655913978, 'r': 0.3095238095238095, 'f1': 0.32828282828282823}, 'combined': 0.2188552188552188, 'epoch': 6} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3277193509615385, 'r': 0.25820312500000003, 'f1': 0.2888373940677967}, 'combined': 0.19255826271186444, 'epoch': 8} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.38913889350389336, 'r': 0.2719607452443086, 'f1': 0.3201650622022891}, 'combined': 0.2089498300688623, 'epoch': 8} Korean: {'template': {'p': 1.0, 'r': 0.5, 
'f1': 0.6666666666666666}, 'slot': {'p': 0.5178571428571429, 'r': 0.31521739130434784, 'f1': 0.39189189189189194}, 'combined': 0.26126126126126126, 'epoch': 8} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 10 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:38:58.240092: step: 2/466, loss: 0.3424968421459198 2023-01-24 01:38:58.871724: step: 4/466, loss: 0.15314273536205292 2023-01-24 01:38:59.446825: step: 6/466, loss: 0.16525211930274963 2023-01-24 01:39:00.140434: step: 8/466, loss: 0.08957090228796005 2023-01-24 01:39:00.763706: step: 10/466, loss: 0.1828995943069458 2023-01-24 01:39:01.369594: step: 12/466, loss: 0.5272262096405029 2023-01-24 01:39:01.920778: step: 14/466, loss: 0.3937443494796753 2023-01-24 01:39:02.556410: step: 16/466, loss: 0.22258463501930237 2023-01-24 01:39:03.165412: step: 18/466, loss: 0.35958412289619446 2023-01-24 01:39:03.796175: step: 20/466, loss: 1.7579293251037598 2023-01-24 01:39:04.366490: step: 22/466, loss: 0.6147506833076477 2023-01-24 01:39:04.961724: step: 24/466, loss: 0.561853289604187 2023-01-24 01:39:05.598163: step: 26/466, loss: 0.08265829086303711 2023-01-24 
01:39:06.244515: step: 28/466, loss: 0.1963891237974167 2023-01-24 01:39:06.908959: step: 30/466, loss: 0.15613555908203125 2023-01-24 01:39:07.525204: step: 32/466, loss: 0.14221565425395966 2023-01-24 01:39:08.160154: step: 34/466, loss: 0.11288049817085266 2023-01-24 01:39:08.779287: step: 36/466, loss: 0.23729105293750763 2023-01-24 01:39:09.427827: step: 38/466, loss: 0.41222819685935974 2023-01-24 01:39:10.024810: step: 40/466, loss: 0.1543423980474472 2023-01-24 01:39:10.605745: step: 42/466, loss: 0.10196036100387573 2023-01-24 01:39:11.219738: step: 44/466, loss: 0.1147770956158638 2023-01-24 01:39:11.842465: step: 46/466, loss: 0.09728725999593735 2023-01-24 01:39:12.456317: step: 48/466, loss: 0.08855711668729782 2023-01-24 01:39:13.122216: step: 50/466, loss: 0.10126923024654388 2023-01-24 01:39:13.742928: step: 52/466, loss: 0.44507497549057007 2023-01-24 01:39:14.335740: step: 54/466, loss: 0.31747809052467346 2023-01-24 01:39:14.894316: step: 56/466, loss: 0.2689839005470276 2023-01-24 01:39:15.495205: step: 58/466, loss: 0.41009822487831116 2023-01-24 01:39:16.128451: step: 60/466, loss: 0.09176215529441833 2023-01-24 01:39:16.746129: step: 62/466, loss: 0.24102617800235748 2023-01-24 01:39:17.361500: step: 64/466, loss: 0.08780214190483093 2023-01-24 01:39:17.975798: step: 66/466, loss: 0.13865149021148682 2023-01-24 01:39:18.617845: step: 68/466, loss: 0.13049353659152985 2023-01-24 01:39:19.244072: step: 70/466, loss: 0.292896032333374 2023-01-24 01:39:19.889447: step: 72/466, loss: 0.2684376835823059 2023-01-24 01:39:20.595118: step: 74/466, loss: 0.2586406171321869 2023-01-24 01:39:21.202178: step: 76/466, loss: 0.14662769436836243 2023-01-24 01:39:21.862931: step: 78/466, loss: 0.18522438406944275 2023-01-24 01:39:22.463313: step: 80/466, loss: 0.02016591839492321 2023-01-24 01:39:23.098295: step: 82/466, loss: 0.37746357917785645 2023-01-24 01:39:23.758498: step: 84/466, loss: 0.519001305103302 2023-01-24 01:39:24.420487: step: 86/466, loss: 
0.126045823097229 2023-01-24 01:39:25.080533: step: 88/466, loss: 0.21811991930007935 2023-01-24 01:39:25.647879: step: 90/466, loss: 0.7372027635574341 2023-01-24 01:39:26.232337: step: 92/466, loss: 0.10083332657814026 2023-01-24 01:39:26.879179: step: 94/466, loss: 0.1068217009305954 2023-01-24 01:39:27.599194: step: 96/466, loss: 0.04024646803736687 2023-01-24 01:39:28.212159: step: 98/466, loss: 0.40056195855140686 2023-01-24 01:39:28.826795: step: 100/466, loss: 0.15211622416973114 2023-01-24 01:39:29.461944: step: 102/466, loss: 0.3278835415840149 2023-01-24 01:39:30.084011: step: 104/466, loss: 0.08469592779874802 2023-01-24 01:39:30.706933: step: 106/466, loss: 0.2144700288772583 2023-01-24 01:39:31.344060: step: 108/466, loss: 0.16949813067913055 2023-01-24 01:39:31.955173: step: 110/466, loss: 0.9435144662857056 2023-01-24 01:39:32.564649: step: 112/466, loss: 0.19128726422786713 2023-01-24 01:39:33.201616: step: 114/466, loss: 0.6171504259109497 2023-01-24 01:39:33.776645: step: 116/466, loss: 0.17125974595546722 2023-01-24 01:39:34.429193: step: 118/466, loss: 0.1719478815793991 2023-01-24 01:39:35.027647: step: 120/466, loss: 0.15581001341342926 2023-01-24 01:39:35.696493: step: 122/466, loss: 0.1323140412569046 2023-01-24 01:39:36.373772: step: 124/466, loss: 0.03781604394316673 2023-01-24 01:39:36.932787: step: 126/466, loss: 0.14860422909259796 2023-01-24 01:39:37.603446: step: 128/466, loss: 0.15132874250411987 2023-01-24 01:39:38.214270: step: 130/466, loss: 0.282259076833725 2023-01-24 01:39:38.766710: step: 132/466, loss: 0.6195818781852722 2023-01-24 01:39:39.321371: step: 134/466, loss: 0.14780747890472412 2023-01-24 01:39:39.976328: step: 136/466, loss: 0.19852077960968018 2023-01-24 01:39:40.575602: step: 138/466, loss: 3.3535280227661133 2023-01-24 01:39:41.214587: step: 140/466, loss: 0.3124248683452606 2023-01-24 01:39:41.822189: step: 142/466, loss: 0.50434410572052 2023-01-24 01:39:42.495127: step: 144/466, loss: 0.19816236197948456 
2023-01-24 01:39:43.149951: step: 146/466, loss: 0.1461830735206604 2023-01-24 01:39:43.767266: step: 148/466, loss: 0.4960712194442749 2023-01-24 01:39:44.400119: step: 150/466, loss: 0.12675248086452484 2023-01-24 01:39:45.016713: step: 152/466, loss: 0.0832248106598854 2023-01-24 01:39:45.626972: step: 154/466, loss: 0.7244929075241089 2023-01-24 01:39:46.248901: step: 156/466, loss: 0.20706027746200562 2023-01-24 01:39:46.954916: step: 158/466, loss: 0.06518436223268509 2023-01-24 01:39:47.587732: step: 160/466, loss: 0.2813121974468231 2023-01-24 01:39:48.231400: step: 162/466, loss: 0.25889384746551514 2023-01-24 01:39:48.822576: step: 164/466, loss: 0.26677221059799194 2023-01-24 01:39:49.447137: step: 166/466, loss: 0.1657583862543106 2023-01-24 01:39:50.116408: step: 168/466, loss: 0.38243743777275085 2023-01-24 01:39:50.734311: step: 170/466, loss: 0.33843281865119934 2023-01-24 01:39:51.389998: step: 172/466, loss: 0.28743040561676025 2023-01-24 01:39:51.975933: step: 174/466, loss: 0.2466418594121933 2023-01-24 01:39:52.606866: step: 176/466, loss: 0.2965812087059021 2023-01-24 01:39:53.195197: step: 178/466, loss: 0.13026443123817444 2023-01-24 01:39:53.763937: step: 180/466, loss: 0.45092785358428955 2023-01-24 01:39:54.386639: step: 182/466, loss: 0.1328914910554886 2023-01-24 01:39:54.985198: step: 184/466, loss: 0.41911742091178894 2023-01-24 01:39:55.575942: step: 186/466, loss: 0.11339353024959564 2023-01-24 01:39:56.180185: step: 188/466, loss: 0.5373882055282593 2023-01-24 01:39:56.805545: step: 190/466, loss: 0.23385509848594666 2023-01-24 01:39:57.457625: step: 192/466, loss: 0.21484020352363586 2023-01-24 01:39:58.065078: step: 194/466, loss: 0.15819811820983887 2023-01-24 01:39:58.680203: step: 196/466, loss: 0.40069326758384705 2023-01-24 01:39:59.251364: step: 198/466, loss: 0.09636548161506653 2023-01-24 01:39:59.903683: step: 200/466, loss: 0.19350042939186096 2023-01-24 01:40:00.509314: step: 202/466, loss: 0.11633653938770294 
2023-01-24 01:40:01.114397: step: 204/466, loss: 0.12276885658502579 2023-01-24 01:40:01.743278: step: 206/466, loss: 0.1648620367050171 2023-01-24 01:40:02.327061: step: 208/466, loss: 0.23498764634132385 2023-01-24 01:40:02.967951: step: 210/466, loss: 0.23746566474437714 2023-01-24 01:40:03.602651: step: 212/466, loss: 0.5610257387161255 2023-01-24 01:40:04.172315: step: 214/466, loss: 0.20346219837665558 2023-01-24 01:40:04.778337: step: 216/466, loss: 0.8227632641792297 2023-01-24 01:40:05.408533: step: 218/466, loss: 0.09616819769144058 2023-01-24 01:40:06.128060: step: 220/466, loss: 0.23829485476016998 2023-01-24 01:40:06.921031: step: 222/466, loss: 0.08007515221834183 2023-01-24 01:40:07.540065: step: 224/466, loss: 0.16075478494167328 2023-01-24 01:40:08.187795: step: 226/466, loss: 0.19215460121631622 2023-01-24 01:40:08.929932: step: 228/466, loss: 0.20689161121845245 2023-01-24 01:40:09.608693: step: 230/466, loss: 0.36305904388427734 2023-01-24 01:40:10.302765: step: 232/466, loss: 0.47161346673965454 2023-01-24 01:40:10.897930: step: 234/466, loss: 0.11965955048799515 2023-01-24 01:40:11.561641: step: 236/466, loss: 0.15132421255111694 2023-01-24 01:40:12.136240: step: 238/466, loss: 0.20713631808757782 2023-01-24 01:40:12.729317: step: 240/466, loss: 0.10391706228256226 2023-01-24 01:40:13.341951: step: 242/466, loss: 0.16056931018829346 2023-01-24 01:40:13.977120: step: 244/466, loss: 0.09839039295911789 2023-01-24 01:40:14.572889: step: 246/466, loss: 0.12909168004989624 2023-01-24 01:40:15.160804: step: 248/466, loss: 0.15248292684555054 2023-01-24 01:40:15.780855: step: 250/466, loss: 0.06441118568181992 2023-01-24 01:40:16.417625: step: 252/466, loss: 0.3634990453720093 2023-01-24 01:40:17.059863: step: 254/466, loss: 0.12455514073371887 2023-01-24 01:40:17.775539: step: 256/466, loss: 0.4242327809333801 2023-01-24 01:40:18.423609: step: 258/466, loss: 0.2633454501628876 2023-01-24 01:40:19.028161: step: 260/466, loss: 0.10316567122936249 
2023-01-24 01:40:19.588509: step: 262/466, loss: 0.4540386199951172 2023-01-24 01:40:20.160102: step: 264/466, loss: 0.08635711669921875 2023-01-24 01:40:20.821402: step: 266/466, loss: 0.6375790238380432 2023-01-24 01:40:21.454440: step: 268/466, loss: 0.2528059780597687 2023-01-24 01:40:22.061783: step: 270/466, loss: 0.12142959237098694 2023-01-24 01:40:22.649909: step: 272/466, loss: 0.9206447005271912 2023-01-24 01:40:23.263025: step: 274/466, loss: 0.41452789306640625 2023-01-24 01:40:23.907308: step: 276/466, loss: 0.17503076791763306 2023-01-24 01:40:24.513629: step: 278/466, loss: 0.2776041328907013 2023-01-24 01:40:25.110680: step: 280/466, loss: 0.10080447047948837 2023-01-24 01:40:25.679084: step: 282/466, loss: 0.48633429408073425 2023-01-24 01:40:26.262278: step: 284/466, loss: 0.12377966940402985 2023-01-24 01:40:26.943486: step: 286/466, loss: 0.19372420012950897 2023-01-24 01:40:27.561223: step: 288/466, loss: 0.8181790709495544 2023-01-24 01:40:28.163987: step: 290/466, loss: 0.2616439461708069 2023-01-24 01:40:28.846030: step: 292/466, loss: 0.558844804763794 2023-01-24 01:40:29.468038: step: 294/466, loss: 0.4766530990600586 2023-01-24 01:40:30.073978: step: 296/466, loss: 0.22030682861804962 2023-01-24 01:40:30.680016: step: 298/466, loss: 0.2769697904586792 2023-01-24 01:40:31.214060: step: 300/466, loss: 0.1449180245399475 2023-01-24 01:40:31.877202: step: 302/466, loss: 0.11370931565761566 2023-01-24 01:40:32.565841: step: 304/466, loss: 0.5098021626472473 2023-01-24 01:40:33.258824: step: 306/466, loss: 0.9076952338218689 2023-01-24 01:40:33.890020: step: 308/466, loss: 0.2744629979133606 2023-01-24 01:40:34.493625: step: 310/466, loss: 0.1508442908525467 2023-01-24 01:40:35.115613: step: 312/466, loss: 0.5850167274475098 2023-01-24 01:40:35.751875: step: 314/466, loss: 0.17930437624454498 2023-01-24 01:40:36.354135: step: 316/466, loss: 1.0136152505874634 2023-01-24 01:40:37.027819: step: 318/466, loss: 0.31405845284461975 2023-01-24 
01:40:37.593590: step: 320/466, loss: 0.17222459614276886 2023-01-24 01:40:38.248300: step: 322/466, loss: 0.4613257944583893 2023-01-24 01:40:38.899728: step: 324/466, loss: 0.677399754524231 2023-01-24 01:40:39.537488: step: 326/466, loss: 0.26653000712394714 2023-01-24 01:40:40.209463: step: 328/466, loss: 0.6625899076461792 2023-01-24 01:40:40.821817: step: 330/466, loss: 0.14762520790100098 2023-01-24 01:40:41.528260: step: 332/466, loss: 0.2265157699584961 2023-01-24 01:40:42.148943: step: 334/466, loss: 0.18762357532978058 2023-01-24 01:40:42.790951: step: 336/466, loss: 0.1634010672569275 2023-01-24 01:40:43.364696: step: 338/466, loss: 0.17027297616004944 2023-01-24 01:40:43.932064: step: 340/466, loss: 0.23633556067943573 2023-01-24 01:40:44.542159: step: 342/466, loss: 0.1393851935863495 2023-01-24 01:40:45.209389: step: 344/466, loss: 0.276138037443161 2023-01-24 01:40:45.815598: step: 346/466, loss: 0.22308233380317688 2023-01-24 01:40:46.404859: step: 348/466, loss: 0.0407722033560276 2023-01-24 01:40:47.024039: step: 350/466, loss: 0.15903085470199585 2023-01-24 01:40:47.746419: step: 352/466, loss: 0.141182079911232 2023-01-24 01:40:48.438915: step: 354/466, loss: 0.11356158554553986 2023-01-24 01:40:49.051464: step: 356/466, loss: 0.11174963414669037 2023-01-24 01:40:49.600624: step: 358/466, loss: 0.1275566965341568 2023-01-24 01:40:50.238073: step: 360/466, loss: 0.33931320905685425 2023-01-24 01:40:50.857408: step: 362/466, loss: 0.3848154842853546 2023-01-24 01:40:51.493025: step: 364/466, loss: 0.1476249396800995 2023-01-24 01:40:52.099859: step: 366/466, loss: 0.6467593908309937 2023-01-24 01:40:52.720316: step: 368/466, loss: 0.48849937319755554 2023-01-24 01:40:53.371230: step: 370/466, loss: 0.2312539666891098 2023-01-24 01:40:53.973646: step: 372/466, loss: 0.40469416975975037 2023-01-24 01:40:54.629144: step: 374/466, loss: 0.8057838082313538 2023-01-24 01:40:55.296096: step: 376/466, loss: 0.1687404364347458 2023-01-24 01:40:55.978989: 
step: 378/466, loss: 0.36339056491851807 2023-01-24 01:40:56.574681: step: 380/466, loss: 0.15135253965854645 2023-01-24 01:40:57.248281: step: 382/466, loss: 0.7757834792137146 2023-01-24 01:40:57.907825: step: 384/466, loss: 0.19134391844272614 2023-01-24 01:40:58.514512: step: 386/466, loss: 0.2958052158355713 2023-01-24 01:40:59.094628: step: 388/466, loss: 0.2449759989976883 2023-01-24 01:40:59.710238: step: 390/466, loss: 0.19907286763191223 2023-01-24 01:41:00.323337: step: 392/466, loss: 0.28933587670326233 2023-01-24 01:41:00.942246: step: 394/466, loss: 0.3133198320865631 2023-01-24 01:41:01.523746: step: 396/466, loss: 0.30926313996315 2023-01-24 01:41:02.217058: step: 398/466, loss: 0.3633095324039459 2023-01-24 01:41:02.776601: step: 400/466, loss: 0.20471030473709106 2023-01-24 01:41:03.370022: step: 402/466, loss: 0.2586519420146942 2023-01-24 01:41:04.017162: step: 404/466, loss: 0.24502499401569366 2023-01-24 01:41:04.645957: step: 406/466, loss: 0.23827120661735535 2023-01-24 01:41:05.220292: step: 408/466, loss: 0.23574049770832062 2023-01-24 01:41:05.860419: step: 410/466, loss: 0.14450325071811676 2023-01-24 01:41:06.477774: step: 412/466, loss: 0.8013572096824646 2023-01-24 01:41:07.140265: step: 414/466, loss: 0.1458197832107544 2023-01-24 01:41:07.723355: step: 416/466, loss: 0.5983514785766602 2023-01-24 01:41:08.342254: step: 418/466, loss: 0.5537610650062561 2023-01-24 01:41:08.938653: step: 420/466, loss: 0.3009760081768036 2023-01-24 01:41:09.576174: step: 422/466, loss: 0.911166787147522 2023-01-24 01:41:10.251390: step: 424/466, loss: 0.15086902678012848 2023-01-24 01:41:10.853012: step: 426/466, loss: 0.1715857833623886 2023-01-24 01:41:11.468190: step: 428/466, loss: 0.861497700214386 2023-01-24 01:41:12.128270: step: 430/466, loss: 0.25514018535614014 2023-01-24 01:41:12.726816: step: 432/466, loss: 0.2692078649997711 2023-01-24 01:41:13.354630: step: 434/466, loss: 0.556423544883728 2023-01-24 01:41:14.035904: step: 436/466, loss: 
0.17007210850715637 2023-01-24 01:41:14.695973: step: 438/466, loss: 0.30793359875679016 2023-01-24 01:41:15.326696: step: 440/466, loss: 0.22319169342517853 2023-01-24 01:41:15.971719: step: 442/466, loss: 0.20929493010044098 2023-01-24 01:41:16.696300: step: 444/466, loss: 0.15269042551517487 2023-01-24 01:41:17.350036: step: 446/466, loss: 0.2763131856918335 2023-01-24 01:41:18.005407: step: 448/466, loss: 0.23531506955623627 2023-01-24 01:41:18.627636: step: 450/466, loss: 0.07401667535305023 2023-01-24 01:41:19.259911: step: 452/466, loss: 0.6907615661621094 2023-01-24 01:41:19.874577: step: 454/466, loss: 0.1963689923286438 2023-01-24 01:41:20.482260: step: 456/466, loss: 0.18842215836048126 2023-01-24 01:41:21.088632: step: 458/466, loss: 0.1701459288597107 2023-01-24 01:41:21.762625: step: 460/466, loss: 0.2941122353076935 2023-01-24 01:41:22.414573: step: 462/466, loss: 0.3442278802394867 2023-01-24 01:41:23.091351: step: 464/466, loss: 0.14135023951530457 2023-01-24 01:41:23.739235: step: 466/466, loss: 0.39910465478897095 2023-01-24 01:41:24.362369: step: 468/466, loss: 0.13927897810935974 2023-01-24 01:41:25.015323: step: 470/466, loss: 0.34257107973098755 2023-01-24 01:41:25.642316: step: 472/466, loss: 0.17023462057113647 2023-01-24 01:41:26.252016: step: 474/466, loss: 0.15634143352508545 2023-01-24 01:41:26.872398: step: 476/466, loss: 0.7175696492195129 2023-01-24 01:41:27.532048: step: 478/466, loss: 0.8162303566932678 2023-01-24 01:41:28.120612: step: 480/466, loss: 0.13726294040679932 2023-01-24 01:41:28.745771: step: 482/466, loss: 0.24070516228675842 2023-01-24 01:41:29.366872: step: 484/466, loss: 0.12580016255378723 2023-01-24 01:41:30.059458: step: 486/466, loss: 0.10945683717727661 2023-01-24 01:41:30.677294: step: 488/466, loss: 0.4962387979030609 2023-01-24 01:41:31.329717: step: 490/466, loss: 0.2147718220949173 2023-01-24 01:41:31.925310: step: 492/466, loss: 0.11244887858629227 2023-01-24 01:41:32.528469: step: 494/466, loss: 
0.500853419303894 2023-01-24 01:41:33.161660: step: 496/466, loss: 0.3162543475627899 2023-01-24 01:41:33.799675: step: 498/466, loss: 0.13642749190330505 2023-01-24 01:41:34.445126: step: 500/466, loss: 0.19492042064666748 2023-01-24 01:41:35.089650: step: 502/466, loss: 0.4172359108924866 2023-01-24 01:41:35.691760: step: 504/466, loss: 0.13469327986240387 2023-01-24 01:41:36.306486: step: 506/466, loss: 0.36627480387687683 2023-01-24 01:41:36.923507: step: 508/466, loss: 0.19249746203422546 2023-01-24 01:41:37.565186: step: 510/466, loss: 0.24976111948490143 2023-01-24 01:41:38.175798: step: 512/466, loss: 0.6852096319198608 2023-01-24 01:41:38.766537: step: 514/466, loss: 0.2618003189563751 2023-01-24 01:41:39.380824: step: 516/466, loss: 0.189588725566864 2023-01-24 01:41:40.080715: step: 518/466, loss: 0.22583279013633728 2023-01-24 01:41:40.749333: step: 520/466, loss: 0.17520493268966675 2023-01-24 01:41:41.337293: step: 522/466, loss: 0.17015887796878815 2023-01-24 01:41:41.970417: step: 524/466, loss: 0.08240482211112976 2023-01-24 01:41:42.611101: step: 526/466, loss: 0.24475426971912384 2023-01-24 01:41:43.183378: step: 528/466, loss: 0.470389723777771 2023-01-24 01:41:43.735771: step: 530/466, loss: 0.0590042769908905 2023-01-24 01:41:44.351334: step: 532/466, loss: 0.37143149971961975 2023-01-24 01:41:45.004615: step: 534/466, loss: 0.14989838004112244 2023-01-24 01:41:45.642277: step: 536/466, loss: 0.2862009108066559 2023-01-24 01:41:46.293897: step: 538/466, loss: 0.26238441467285156 2023-01-24 01:41:46.910176: step: 540/466, loss: 0.19798694550991058 2023-01-24 01:41:47.489394: step: 542/466, loss: 0.1776721179485321 2023-01-24 01:41:48.120493: step: 544/466, loss: 0.13989011943340302 2023-01-24 01:41:48.769576: step: 546/466, loss: 0.26300859451293945 2023-01-24 01:41:49.335518: step: 548/466, loss: 0.1900997906923294 2023-01-24 01:41:49.938992: step: 550/466, loss: 0.19733339548110962 2023-01-24 01:41:50.551373: step: 552/466, loss: 
0.10670574754476547 2023-01-24 01:41:51.152668: step: 554/466, loss: 0.23030467331409454 2023-01-24 01:41:51.775429: step: 556/466, loss: 0.15442197024822235 2023-01-24 01:41:52.424947: step: 558/466, loss: 0.46626031398773193 2023-01-24 01:41:53.014875: step: 560/466, loss: 0.17873597145080566 2023-01-24 01:41:53.668831: step: 562/466, loss: 0.08750782907009125 2023-01-24 01:41:54.274123: step: 564/466, loss: 0.7056400775909424 2023-01-24 01:41:54.898745: step: 566/466, loss: 0.1467796415090561 2023-01-24 01:41:55.514708: step: 568/466, loss: 0.1509588658809662 2023-01-24 01:41:56.151180: step: 570/466, loss: 0.2639663517475128 2023-01-24 01:41:56.832386: step: 572/466, loss: 0.17098958790302277 2023-01-24 01:41:57.388758: step: 574/466, loss: 0.06614339351654053 2023-01-24 01:41:58.034345: step: 576/466, loss: 0.330814003944397 2023-01-24 01:41:58.630262: step: 578/466, loss: 0.7567887902259827 2023-01-24 01:41:59.271067: step: 580/466, loss: 0.45156949758529663 2023-01-24 01:41:59.886544: step: 582/466, loss: 0.09206489473581314 2023-01-24 01:42:00.469475: step: 584/466, loss: 0.23979876935482025 2023-01-24 01:42:01.045976: step: 586/466, loss: 0.21902264654636383 2023-01-24 01:42:01.635307: step: 588/466, loss: 0.1813296377658844 2023-01-24 01:42:02.300315: step: 590/466, loss: 0.4988095462322235 2023-01-24 01:42:02.866622: step: 592/466, loss: 0.3031708300113678 2023-01-24 01:42:03.509652: step: 594/466, loss: 0.27922341227531433 2023-01-24 01:42:04.218031: step: 596/466, loss: 0.3294333815574646 2023-01-24 01:42:04.895565: step: 598/466, loss: 0.16292539238929749 2023-01-24 01:42:05.509063: step: 600/466, loss: 0.26730597019195557 2023-01-24 01:42:06.183711: step: 602/466, loss: 0.23192861676216125 2023-01-24 01:42:06.776907: step: 604/466, loss: 0.17797262966632843 2023-01-24 01:42:07.420235: step: 606/466, loss: 0.1849815845489502 2023-01-24 01:42:08.062376: step: 608/466, loss: 0.849368691444397 2023-01-24 01:42:08.745704: step: 610/466, loss: 
0.17126432061195374 2023-01-24 01:42:09.331848: step: 612/466, loss: 0.1651676893234253 2023-01-24 01:42:09.950087: step: 614/466, loss: 0.15718361735343933 2023-01-24 01:42:10.573267: step: 616/466, loss: 0.7476538419723511 2023-01-24 01:42:11.263098: step: 618/466, loss: 0.5841106176376343 2023-01-24 01:42:11.828228: step: 620/466, loss: 0.3979680836200714 2023-01-24 01:42:12.431438: step: 622/466, loss: 0.2936483919620514 2023-01-24 01:42:13.065168: step: 624/466, loss: 0.26303133368492126 2023-01-24 01:42:13.662396: step: 626/466, loss: 1.160338044166565 2023-01-24 01:42:14.238674: step: 628/466, loss: 0.100038543343544 2023-01-24 01:42:14.957721: step: 630/466, loss: 0.5164868235588074 2023-01-24 01:42:15.584251: step: 632/466, loss: 0.19388951361179352 2023-01-24 01:42:16.127420: step: 634/466, loss: 0.5035961270332336 2023-01-24 01:42:16.763366: step: 636/466, loss: 0.18198414146900177 2023-01-24 01:42:17.383704: step: 638/466, loss: 0.29664215445518494 2023-01-24 01:42:18.007817: step: 640/466, loss: 0.2572523355484009 2023-01-24 01:42:18.572452: step: 642/466, loss: 0.09871802479028702 2023-01-24 01:42:19.206153: step: 644/466, loss: 0.39587247371673584 2023-01-24 01:42:19.814505: step: 646/466, loss: 0.1277804970741272 2023-01-24 01:42:20.361486: step: 648/466, loss: 0.12480014562606812 2023-01-24 01:42:20.973996: step: 650/466, loss: 0.34665799140930176 2023-01-24 01:42:21.618758: step: 652/466, loss: 0.12297838926315308 2023-01-24 01:42:22.240336: step: 654/466, loss: 0.417325496673584 2023-01-24 01:42:22.910824: step: 656/466, loss: 0.5616034865379333 2023-01-24 01:42:23.457992: step: 658/466, loss: 0.7238473296165466 2023-01-24 01:42:24.039827: step: 660/466, loss: 0.3585425913333893 2023-01-24 01:42:24.645388: step: 662/466, loss: 0.3884541988372803 2023-01-24 01:42:25.236463: step: 664/466, loss: 0.6369414925575256 2023-01-24 01:42:25.800149: step: 666/466, loss: 0.25105607509613037 2023-01-24 01:42:26.474656: step: 668/466, loss: 0.4268193244934082 
2023-01-24 01:42:27.147736: step: 670/466, loss: 0.2666698098182678 2023-01-24 01:42:27.735305: step: 672/466, loss: 0.31034159660339355 2023-01-24 01:42:28.334277: step: 674/466, loss: 0.13177955150604248 2023-01-24 01:42:28.973022: step: 676/466, loss: 0.8183972835540771 2023-01-24 01:42:29.564722: step: 678/466, loss: 0.41874760389328003 2023-01-24 01:42:30.235173: step: 680/466, loss: 0.21547198295593262 2023-01-24 01:42:30.903926: step: 682/466, loss: 0.08435779809951782 2023-01-24 01:42:31.497033: step: 684/466, loss: 0.22993096709251404 2023-01-24 01:42:32.193879: step: 686/466, loss: 0.1858258694410324 2023-01-24 01:42:32.816388: step: 688/466, loss: 0.35966718196868896 2023-01-24 01:42:33.373595: step: 690/466, loss: 0.3706780672073364 2023-01-24 01:42:33.940953: step: 692/466, loss: 0.36994704604148865 2023-01-24 01:42:34.589680: step: 694/466, loss: 0.564039409160614 2023-01-24 01:42:35.216924: step: 696/466, loss: 0.06898679584264755 2023-01-24 01:42:35.819200: step: 698/466, loss: 0.46038195490837097 2023-01-24 01:42:36.517611: step: 700/466, loss: 0.28017866611480713 2023-01-24 01:42:37.186023: step: 702/466, loss: 0.24359598755836487 2023-01-24 01:42:37.817057: step: 704/466, loss: 0.26845529675483704 2023-01-24 01:42:38.503482: step: 706/466, loss: 0.18026471138000488 2023-01-24 01:42:39.098382: step: 708/466, loss: 0.323679655790329 2023-01-24 01:42:39.702371: step: 710/466, loss: 0.11869863420724869 2023-01-24 01:42:40.370634: step: 712/466, loss: 0.30867138504981995 2023-01-24 01:42:40.977032: step: 714/466, loss: 1.2142316102981567 2023-01-24 01:42:41.590082: step: 716/466, loss: 0.23127204179763794 2023-01-24 01:42:42.182781: step: 718/466, loss: 0.05211823433637619 2023-01-24 01:42:42.780783: step: 720/466, loss: 0.15031933784484863 2023-01-24 01:42:43.420241: step: 722/466, loss: 0.10426419228315353 2023-01-24 01:42:44.056213: step: 724/466, loss: 0.4317232370376587 2023-01-24 01:42:44.707576: step: 726/466, loss: 0.6303207278251648 
2023-01-24 01:42:45.346280: step: 728/466, loss: 0.12916883826255798 2023-01-24 01:42:45.905402: step: 730/466, loss: 0.22627326846122742 2023-01-24 01:42:46.593188: step: 732/466, loss: 0.5379096865653992 2023-01-24 01:42:47.241174: step: 734/466, loss: 0.4592354893684387 2023-01-24 01:42:47.844466: step: 736/466, loss: 0.1455874741077423 2023-01-24 01:42:48.463871: step: 738/466, loss: 0.13084712624549866 2023-01-24 01:42:49.135903: step: 740/466, loss: 0.682135283946991 2023-01-24 01:42:49.757083: step: 742/466, loss: 0.8221052885055542 2023-01-24 01:42:50.445841: step: 744/466, loss: 0.20233270525932312 2023-01-24 01:42:51.064002: step: 746/466, loss: 0.045890893787145615 2023-01-24 01:42:51.688694: step: 748/466, loss: 0.12642249464988708 2023-01-24 01:42:52.341170: step: 750/466, loss: 0.601008951663971 2023-01-24 01:42:53.142100: step: 752/466, loss: 2.653719902038574 2023-01-24 01:42:53.727432: step: 754/466, loss: 0.1881476491689682 2023-01-24 01:42:54.341685: step: 756/466, loss: 0.21659430861473083 2023-01-24 01:42:54.971058: step: 758/466, loss: 0.1342061311006546 2023-01-24 01:42:55.602019: step: 760/466, loss: 0.44529831409454346 2023-01-24 01:42:56.232587: step: 762/466, loss: 0.4104353189468384 2023-01-24 01:42:56.948308: step: 764/466, loss: 0.3914719223976135 2023-01-24 01:42:57.596260: step: 766/466, loss: 3.8568103313446045 2023-01-24 01:42:58.201323: step: 768/466, loss: 0.1950238198041916 2023-01-24 01:42:58.798870: step: 770/466, loss: 0.4235544502735138 2023-01-24 01:42:59.430654: step: 772/466, loss: 0.15523557364940643 2023-01-24 01:43:00.070564: step: 774/466, loss: 0.24269673228263855 2023-01-24 01:43:00.785670: step: 776/466, loss: 0.2621227204799652 2023-01-24 01:43:01.464618: step: 778/466, loss: 0.2298140972852707 2023-01-24 01:43:02.095956: step: 780/466, loss: 0.18909944593906403 2023-01-24 01:43:02.758250: step: 782/466, loss: 0.4999868869781494 2023-01-24 01:43:03.369636: step: 784/466, loss: 0.34763866662979126 2023-01-24 
01:43:04.010383: step: 786/466, loss: 3.5766942501068115 2023-01-24 01:43:04.695166: step: 788/466, loss: 2.156630039215088 2023-01-24 01:43:05.323379: step: 790/466, loss: 0.24687305092811584 2023-01-24 01:43:05.900597: step: 792/466, loss: 0.39704298973083496 2023-01-24 01:43:06.504965: step: 794/466, loss: 0.35833752155303955 2023-01-24 01:43:07.184149: step: 796/466, loss: 0.3989448547363281 2023-01-24 01:43:07.776119: step: 798/466, loss: 0.1479235291481018 2023-01-24 01:43:08.373587: step: 800/466, loss: 0.1703123152256012 2023-01-24 01:43:08.974457: step: 802/466, loss: 2.342402935028076 2023-01-24 01:43:09.583288: step: 804/466, loss: 0.1217118352651596 2023-01-24 01:43:10.158240: step: 806/466, loss: 0.44758445024490356 2023-01-24 01:43:10.742344: step: 808/466, loss: 0.7267645001411438 2023-01-24 01:43:11.340959: step: 810/466, loss: 0.1048843264579773 2023-01-24 01:43:11.937433: step: 812/466, loss: 0.1265694946050644 2023-01-24 01:43:12.563155: step: 814/466, loss: 0.14260146021842957 2023-01-24 01:43:13.216259: step: 816/466, loss: 0.1159842237830162 2023-01-24 01:43:13.830302: step: 818/466, loss: 0.24127842485904694 2023-01-24 01:43:14.447410: step: 820/466, loss: 0.3379100561141968 2023-01-24 01:43:15.030963: step: 822/466, loss: 0.41297850012779236 2023-01-24 01:43:15.592728: step: 824/466, loss: 0.4508364796638489 2023-01-24 01:43:16.206810: step: 826/466, loss: 0.1604514718055725 2023-01-24 01:43:16.801168: step: 828/466, loss: 0.16290849447250366 2023-01-24 01:43:17.386544: step: 830/466, loss: 0.1882670521736145 2023-01-24 01:43:18.045987: step: 832/466, loss: 0.9084370732307434 2023-01-24 01:43:18.731731: step: 834/466, loss: 0.17885644733905792 2023-01-24 01:43:19.384323: step: 836/466, loss: 0.6350449323654175 2023-01-24 01:43:20.018112: step: 838/466, loss: 0.25155529379844666 2023-01-24 01:43:20.590203: step: 840/466, loss: 0.1859191209077835 2023-01-24 01:43:21.296696: step: 842/466, loss: 0.6225964426994324 2023-01-24 01:43:21.893701: 
step: 844/466, loss: 0.32836204767227173 2023-01-24 01:43:22.527148: step: 846/466, loss: 0.12162549793720245 2023-01-24 01:43:23.158315: step: 848/466, loss: 0.3245471119880676 2023-01-24 01:43:23.742220: step: 850/466, loss: 0.17035870254039764 2023-01-24 01:43:24.314598: step: 852/466, loss: 0.14016790688037872 2023-01-24 01:43:24.926735: step: 854/466, loss: 1.0957573652267456 2023-01-24 01:43:25.533622: step: 856/466, loss: 0.17246706783771515 2023-01-24 01:43:26.103849: step: 858/466, loss: 0.09959319233894348 2023-01-24 01:43:26.721828: step: 860/466, loss: 0.06413392722606659 2023-01-24 01:43:27.337290: step: 862/466, loss: 0.11291803419589996 2023-01-24 01:43:27.967715: step: 864/466, loss: 0.1998555213212967 2023-01-24 01:43:28.594884: step: 866/466, loss: 0.8061385750770569 2023-01-24 01:43:29.231619: step: 868/466, loss: 0.21117821335792542 2023-01-24 01:43:29.902233: step: 870/466, loss: 0.386350154876709 2023-01-24 01:43:30.520025: step: 872/466, loss: 0.38534924387931824 2023-01-24 01:43:31.151364: step: 874/466, loss: 0.43389642238616943 2023-01-24 01:43:31.753273: step: 876/466, loss: 0.18496811389923096 2023-01-24 01:43:32.386224: step: 878/466, loss: 0.6521280407905579 2023-01-24 01:43:32.972619: step: 880/466, loss: 0.9200221300125122 2023-01-24 01:43:33.591857: step: 882/466, loss: 0.18344150483608246 2023-01-24 01:43:34.182746: step: 884/466, loss: 0.2574477791786194 2023-01-24 01:43:34.871935: step: 886/466, loss: 0.7089791893959045 2023-01-24 01:43:35.650342: step: 888/466, loss: 1.1452258825302124 2023-01-24 01:43:36.331355: step: 890/466, loss: 0.13744662702083588 2023-01-24 01:43:36.937424: step: 892/466, loss: 0.14688539505004883 2023-01-24 01:43:37.496248: step: 894/466, loss: 0.09006017446517944 2023-01-24 01:43:38.157356: step: 896/466, loss: 0.11454999446868896 2023-01-24 01:43:38.857479: step: 898/466, loss: 0.44436606764793396 2023-01-24 01:43:39.453800: step: 900/466, loss: 0.15809577703475952 2023-01-24 01:43:40.079338: step: 
902/466, loss: 0.25707101821899414 2023-01-24 01:43:40.742437: step: 904/466, loss: 0.31899547576904297 2023-01-24 01:43:41.344107: step: 906/466, loss: 0.07329529523849487 2023-01-24 01:43:41.954299: step: 908/466, loss: 0.4771485924720764 2023-01-24 01:43:42.551790: step: 910/466, loss: 0.32161709666252136 2023-01-24 01:43:43.144353: step: 912/466, loss: 0.20154333114624023 2023-01-24 01:43:43.726430: step: 914/466, loss: 0.19467943906784058 2023-01-24 01:43:44.297065: step: 916/466, loss: 0.11051704734563828 2023-01-24 01:43:44.965600: step: 918/466, loss: 1.6850031614303589 2023-01-24 01:43:45.657757: step: 920/466, loss: 0.3364313244819641 2023-01-24 01:43:46.322247: step: 922/466, loss: 1.0680686235427856 2023-01-24 01:43:46.964157: step: 924/466, loss: 0.15946362912654877 2023-01-24 01:43:47.573910: step: 926/466, loss: 0.0858464390039444 2023-01-24 01:43:48.204893: step: 928/466, loss: 0.32690322399139404 2023-01-24 01:43:48.837505: step: 930/466, loss: 0.2521716356277466 2023-01-24 01:43:49.479406: step: 932/466, loss: 0.3197533190250397 ================================================== Loss: 0.339 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3249718423169751, 'r': 0.34840434707607393, 'f1': 0.33628038628038637}, 'combined': 0.24778554778554784, 'epoch': 10} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35564524631816125, 'r': 0.2998937951296123, 'f1': 0.3253987814443737}, 'combined': 0.21580851826362604, 'epoch': 10} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 
'combined': 0.20919905300947209, 'epoch': 10}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31342086947210623, 'r': 0.33661520326605715, 'f1': 0.32460423077989414}, 'combined': 0.23918206478518514, 'epoch': 10}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3499058910721146, 'r': 0.29380029129675833, 'f1': 0.31940800178466694}, 'combined': 0.21183535869656664, 'epoch': 10}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33226495726495725, 'r': 0.3702380952380952, 'f1': 0.3502252252252252}, 'combined': 0.23348348348348347, 'epoch': 10}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35714285714285715, 'r': 0.1724137931034483, 'f1': 0.23255813953488377}, 'combined': 0.1550387596899225, 'epoch': 10}
New best chinese model...
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3249718423169751, 'r': 0.34840434707607393, 'f1': 0.33628038628038637}, 'combined': 0.24778554778554784, 'epoch': 10}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35564524631816125, 'r': 0.2998937951296123, 'f1': 0.3253987814443737}, 'combined': 0.21580851826362604, 'epoch': 10}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33226495726495725, 'r': 0.3702380952380952, 'f1': 0.3502252252252252}, 'combined': 0.23348348348348347, 'epoch': 10}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 11 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 01:46:39.548369: step: 2/466, loss: 0.18607491254806519 2023-01-24 01:46:40.142604: step: 4/466, loss: 0.3782379627227783 2023-01-24 01:46:40.731923: step: 6/466, loss: 0.14562508463859558 2023-01-24 01:46:41.393162: step: 8/466, loss: 0.5758311748504639 2023-01-24 01:46:42.018258: step: 10/466, loss: 0.08559700101613998 2023-01-24 01:46:42.688812: step: 12/466, loss: 0.03899945691227913 2023-01-24 01:46:43.333130: step: 14/466, loss: 0.2517959773540497 2023-01-24 01:46:43.874614: step: 16/466, loss: 0.15470734238624573 2023-01-24 01:46:44.539542: step: 18/466, loss: 0.5453633666038513 2023-01-24 01:46:45.153300: step: 20/466, loss: 0.17590969800949097 2023-01-24 01:46:45.809849: step: 22/466, loss: 0.45994845032691956 2023-01-24 01:46:46.490454: step: 24/466, loss: 0.26488208770751953 2023-01-24 01:46:47.068159: step: 26/466, loss: 0.09462356567382812 2023-01-24 01:46:47.689927: step: 28/466, loss: 0.14928802847862244 2023-01-24 01:46:48.377417: step: 30/466, loss: 0.8031451106071472 2023-01-24 01:46:49.052909: step: 32/466, loss: 0.08310721814632416 2023-01-24 01:46:49.663345: step: 34/466, loss: 0.38305965065956116 2023-01-24 01:46:50.269506: step: 36/466, loss: 0.706479549407959 2023-01-24 01:46:50.931046: step: 38/466, loss: 0.14212380349636078 2023-01-24 01:46:51.512808: step: 40/466, loss: 0.206035777926445 2023-01-24 01:46:52.087956: step: 42/466, loss: 0.17984972894191742 2023-01-24 01:46:52.744595: step: 44/466, loss: 0.12307525426149368 2023-01-24 01:46:53.401895: step: 46/466, loss: 0.1807154268026352 2023-01-24 01:46:54.012281: step: 48/466, loss: 0.17570000886917114 
2023-01-24 01:46:54.762815: step: 50/466, loss: 0.25609198212623596 2023-01-24 01:46:55.321815: step: 52/466, loss: 0.06491687893867493 2023-01-24 01:46:55.980789: step: 54/466, loss: 0.07231342792510986 2023-01-24 01:46:56.689522: step: 56/466, loss: 0.05617355927824974 2023-01-24 01:46:57.312047: step: 58/466, loss: 0.029409393668174744 2023-01-24 01:46:58.081829: step: 60/466, loss: 0.06893029063940048 2023-01-24 01:46:58.704373: step: 62/466, loss: 0.07814190536737442 2023-01-24 01:46:59.257489: step: 64/466, loss: 0.017962148413062096 2023-01-24 01:46:59.873116: step: 66/466, loss: 0.09650110453367233 2023-01-24 01:47:00.515109: step: 68/466, loss: 0.053260087966918945 2023-01-24 01:47:01.190061: step: 70/466, loss: 0.08349565416574478 2023-01-24 01:47:01.786202: step: 72/466, loss: 0.2979294955730438 2023-01-24 01:47:02.397686: step: 74/466, loss: 0.17924754321575165 2023-01-24 01:47:02.992398: step: 76/466, loss: 0.20314748585224152 2023-01-24 01:47:03.612392: step: 78/466, loss: 0.28504741191864014 2023-01-24 01:47:04.303961: step: 80/466, loss: 0.5961211919784546 2023-01-24 01:47:04.963756: step: 82/466, loss: 0.5081316828727722 2023-01-24 01:47:05.561671: step: 84/466, loss: 0.17852233350276947 2023-01-24 01:47:06.242829: step: 86/466, loss: 0.2734845280647278 2023-01-24 01:47:06.917547: step: 88/466, loss: 0.16375912725925446 2023-01-24 01:47:07.557267: step: 90/466, loss: 0.24265673756599426 2023-01-24 01:47:08.186010: step: 92/466, loss: 0.3444412052631378 2023-01-24 01:47:08.893344: step: 94/466, loss: 1.2213103771209717 2023-01-24 01:47:09.507780: step: 96/466, loss: 0.18616735935211182 2023-01-24 01:47:10.096007: step: 98/466, loss: 0.20802858471870422 2023-01-24 01:47:10.830485: step: 100/466, loss: 0.22155752778053284 2023-01-24 01:47:11.417784: step: 102/466, loss: 0.068345807492733 2023-01-24 01:47:11.978328: step: 104/466, loss: 0.6814802289009094 2023-01-24 01:47:12.570987: step: 106/466, loss: 0.013646832667291164 2023-01-24 01:47:13.189159: 
step: 108/466, loss: 0.13286662101745605 2023-01-24 01:47:13.752796: step: 110/466, loss: 0.0960453674197197 2023-01-24 01:47:14.320087: step: 112/466, loss: 0.10478698462247849 2023-01-24 01:47:15.037484: step: 114/466, loss: 0.3638004958629608 2023-01-24 01:47:15.675120: step: 116/466, loss: 0.05827972665429115 2023-01-24 01:47:16.320278: step: 118/466, loss: 0.32893651723861694 2023-01-24 01:47:16.958463: step: 120/466, loss: 0.3385923206806183 2023-01-24 01:47:17.531852: step: 122/466, loss: 0.24091915786266327 2023-01-24 01:47:18.120930: step: 124/466, loss: 0.1859840601682663 2023-01-24 01:47:18.766478: step: 126/466, loss: 0.28287869691848755 2023-01-24 01:47:19.397242: step: 128/466, loss: 0.14036421477794647 2023-01-24 01:47:19.948883: step: 130/466, loss: 0.2465360164642334 2023-01-24 01:47:20.540547: step: 132/466, loss: 0.16373713314533234 2023-01-24 01:47:21.193323: step: 134/466, loss: 0.7166844606399536 2023-01-24 01:47:21.778231: step: 136/466, loss: 0.166682630777359 2023-01-24 01:47:22.351628: step: 138/466, loss: 0.384368360042572 2023-01-24 01:47:22.985737: step: 140/466, loss: 0.5179634094238281 2023-01-24 01:47:23.556496: step: 142/466, loss: 0.04584791511297226 2023-01-24 01:47:24.161937: step: 144/466, loss: 0.7276697158813477 2023-01-24 01:47:24.764832: step: 146/466, loss: 0.41061267256736755 2023-01-24 01:47:25.397100: step: 148/466, loss: 0.2997644245624542 2023-01-24 01:47:25.967535: step: 150/466, loss: 0.21162858605384827 2023-01-24 01:47:26.611372: step: 152/466, loss: 0.09424500912427902 2023-01-24 01:47:27.258553: step: 154/466, loss: 0.16237930953502655 2023-01-24 01:47:27.903493: step: 156/466, loss: 0.21634267270565033 2023-01-24 01:47:28.551887: step: 158/466, loss: 0.3554103672504425 2023-01-24 01:47:29.271559: step: 160/466, loss: 0.6150345802307129 2023-01-24 01:47:29.998578: step: 162/466, loss: 0.14475339651107788 2023-01-24 01:47:30.643496: step: 164/466, loss: 0.19218337535858154 2023-01-24 01:47:31.270260: step: 
166/466, loss: 0.06557288765907288 2023-01-24 01:47:31.922503: step: 168/466, loss: 0.5332788228988647 2023-01-24 01:47:32.492124: step: 170/466, loss: 0.20994780957698822 2023-01-24 01:47:33.108015: step: 172/466, loss: 1.137925148010254 2023-01-24 01:47:33.750846: step: 174/466, loss: 0.21695555746555328 2023-01-24 01:47:34.397631: step: 176/466, loss: 0.09937255829572678 2023-01-24 01:47:34.982915: step: 178/466, loss: 0.17750242352485657 2023-01-24 01:47:35.513418: step: 180/466, loss: 0.19674468040466309 2023-01-24 01:47:36.137479: step: 182/466, loss: 0.09151066094636917 2023-01-24 01:47:36.821271: step: 184/466, loss: 1.1057778596878052 2023-01-24 01:47:37.476873: step: 186/466, loss: 0.14974889159202576 2023-01-24 01:47:38.135499: step: 188/466, loss: 0.23337318003177643 2023-01-24 01:47:38.755901: step: 190/466, loss: 0.1663864701986313 2023-01-24 01:47:39.414999: step: 192/466, loss: 0.1742500364780426 2023-01-24 01:47:40.050568: step: 194/466, loss: 1.0310101509094238 2023-01-24 01:47:40.681574: step: 196/466, loss: 0.28460630774497986 2023-01-24 01:47:41.303698: step: 198/466, loss: 0.2701140344142914 2023-01-24 01:47:41.967222: step: 200/466, loss: 0.10820496082305908 2023-01-24 01:47:42.658226: step: 202/466, loss: 0.10222582519054413 2023-01-24 01:47:43.305804: step: 204/466, loss: 0.10434279590845108 2023-01-24 01:47:43.891410: step: 206/466, loss: 0.6771631836891174 2023-01-24 01:47:44.502532: step: 208/466, loss: 0.08052944391965866 2023-01-24 01:47:45.127808: step: 210/466, loss: 0.2689838111400604 2023-01-24 01:47:45.707271: step: 212/466, loss: 0.25654590129852295 2023-01-24 01:47:46.329951: step: 214/466, loss: 0.5604700446128845 2023-01-24 01:47:46.943145: step: 216/466, loss: 0.1821504384279251 2023-01-24 01:47:47.554351: step: 218/466, loss: 0.12607960402965546 2023-01-24 01:47:48.159460: step: 220/466, loss: 0.3847530484199524 2023-01-24 01:47:48.818756: step: 222/466, loss: 0.1250201016664505 2023-01-24 01:47:49.403670: step: 224/466, 
loss: 0.23013734817504883 2023-01-24 01:47:49.992370: step: 226/466, loss: 0.12877550721168518 2023-01-24 01:47:50.592350: step: 228/466, loss: 0.1783233880996704 2023-01-24 01:47:51.290736: step: 230/466, loss: 0.22882427275180817 2023-01-24 01:47:51.923311: step: 232/466, loss: 0.13324867188930511 2023-01-24 01:47:52.464038: step: 234/466, loss: 0.06851288676261902 2023-01-24 01:47:53.005225: step: 236/466, loss: 0.08244407176971436 2023-01-24 01:47:53.630718: step: 238/466, loss: 1.0778659582138062 2023-01-24 01:47:54.259444: step: 240/466, loss: 0.1519305258989334 2023-01-24 01:47:54.874661: step: 242/466, loss: 2.159055233001709 2023-01-24 01:47:55.445067: step: 244/466, loss: 0.11591868102550507 2023-01-24 01:47:56.041566: step: 246/466, loss: 0.16447295248508453 2023-01-24 01:47:56.685710: step: 248/466, loss: 0.2301783561706543 2023-01-24 01:47:57.245428: step: 250/466, loss: 0.1749679446220398 2023-01-24 01:47:57.889087: step: 252/466, loss: 0.21180734038352966 2023-01-24 01:47:58.479749: step: 254/466, loss: 0.006519475486129522 2023-01-24 01:47:59.081494: step: 256/466, loss: 0.18454588949680328 2023-01-24 01:47:59.655807: step: 258/466, loss: 0.08782033622264862 2023-01-24 01:48:00.269494: step: 260/466, loss: 0.10792701691389084 2023-01-24 01:48:00.922198: step: 262/466, loss: 0.3083246946334839 2023-01-24 01:48:01.702976: step: 264/466, loss: 1.006682276725769 2023-01-24 01:48:02.391217: step: 266/466, loss: 0.1580200344324112 2023-01-24 01:48:03.045841: step: 268/466, loss: 0.08529648929834366 2023-01-24 01:48:03.616397: step: 270/466, loss: 1.1525237560272217 2023-01-24 01:48:04.290328: step: 272/466, loss: 0.21152204275131226 2023-01-24 01:48:04.916875: step: 274/466, loss: 0.10467851907014847 2023-01-24 01:48:05.541426: step: 276/466, loss: 0.2370503544807434 2023-01-24 01:48:06.160749: step: 278/466, loss: 0.37224963307380676 2023-01-24 01:48:06.725157: step: 280/466, loss: 0.27512431144714355 2023-01-24 01:48:07.304320: step: 282/466, loss: 
1.7817201614379883 2023-01-24 01:48:07.903412: step: 284/466, loss: 0.4905145764350891 2023-01-24 01:48:08.564880: step: 286/466, loss: 0.37921059131622314 2023-01-24 01:48:09.178062: step: 288/466, loss: 0.3173060119152069 2023-01-24 01:48:09.910925: step: 290/466, loss: 0.45783019065856934 2023-01-24 01:48:10.541311: step: 292/466, loss: 0.7370656728744507 2023-01-24 01:48:11.244616: step: 294/466, loss: 0.23975718021392822 2023-01-24 01:48:11.844277: step: 296/466, loss: 0.2448563575744629 2023-01-24 01:48:12.424144: step: 298/466, loss: 0.2936834394931793 2023-01-24 01:48:13.024136: step: 300/466, loss: 0.10825812071561813 2023-01-24 01:48:13.642799: step: 302/466, loss: 0.15110787749290466 2023-01-24 01:48:14.260293: step: 304/466, loss: 0.10494061559438705 2023-01-24 01:48:14.858545: step: 306/466, loss: 0.38257136940956116 2023-01-24 01:48:15.520632: step: 308/466, loss: 0.10423853993415833 2023-01-24 01:48:16.189625: step: 310/466, loss: 0.4629186987876892 2023-01-24 01:48:16.807630: step: 312/466, loss: 0.2475818246603012 2023-01-24 01:48:17.482550: step: 314/466, loss: 0.8074089288711548 2023-01-24 01:48:18.102744: step: 316/466, loss: 0.18972155451774597 2023-01-24 01:48:18.700401: step: 318/466, loss: 0.21986006200313568 2023-01-24 01:48:19.385855: step: 320/466, loss: 0.5212770700454712 2023-01-24 01:48:20.114734: step: 322/466, loss: 0.43529248237609863 2023-01-24 01:48:20.716488: step: 324/466, loss: 0.3243367075920105 2023-01-24 01:48:21.463295: step: 326/466, loss: 0.287121057510376 2023-01-24 01:48:22.102068: step: 328/466, loss: 0.061606720089912415 2023-01-24 01:48:22.857617: step: 330/466, loss: 0.29698190093040466 2023-01-24 01:48:23.434813: step: 332/466, loss: 0.18679064512252808 2023-01-24 01:48:24.059331: step: 334/466, loss: 0.20813262462615967 2023-01-24 01:48:24.668865: step: 336/466, loss: 0.1052006259560585 2023-01-24 01:48:25.243020: step: 338/466, loss: 0.12170988321304321 2023-01-24 01:48:25.881653: step: 340/466, loss: 
0.11190930753946304 2023-01-24 01:48:26.504709: step: 342/466, loss: 1.2435797452926636 2023-01-24 01:48:27.091775: step: 344/466, loss: 0.06316729635000229 2023-01-24 01:48:27.709898: step: 346/466, loss: 0.2365454137325287 2023-01-24 01:48:28.329451: step: 348/466, loss: 0.37936925888061523 2023-01-24 01:48:28.977269: step: 350/466, loss: 0.11781468242406845 2023-01-24 01:48:29.577324: step: 352/466, loss: 0.11133911460638046 2023-01-24 01:48:30.180602: step: 354/466, loss: 0.2571800947189331 2023-01-24 01:48:30.709248: step: 356/466, loss: 0.2111179381608963 2023-01-24 01:48:31.355149: step: 358/466, loss: 0.17491096258163452 2023-01-24 01:48:31.985018: step: 360/466, loss: 0.16860328614711761 2023-01-24 01:48:32.644264: step: 362/466, loss: 4.821085453033447 2023-01-24 01:48:33.270614: step: 364/466, loss: 0.13677658140659332 2023-01-24 01:48:33.898100: step: 366/466, loss: 0.3939335346221924 2023-01-24 01:48:34.507948: step: 368/466, loss: 0.37653470039367676 2023-01-24 01:48:35.116231: step: 370/466, loss: 0.18542204797267914 2023-01-24 01:48:35.706770: step: 372/466, loss: 0.07824349403381348 2023-01-24 01:48:36.347944: step: 374/466, loss: 0.12440678477287292 2023-01-24 01:48:37.046350: step: 376/466, loss: 0.2704429030418396 2023-01-24 01:48:37.695631: step: 378/466, loss: 0.1222703754901886 2023-01-24 01:48:38.284730: step: 380/466, loss: 0.1519077718257904 2023-01-24 01:48:38.926901: step: 382/466, loss: 0.17306558787822723 2023-01-24 01:48:39.506622: step: 384/466, loss: 0.09035259485244751 2023-01-24 01:48:40.191561: step: 386/466, loss: 0.337478905916214 2023-01-24 01:48:40.792984: step: 388/466, loss: 0.7239857316017151 2023-01-24 01:48:41.375637: step: 390/466, loss: 0.15432657301425934 2023-01-24 01:48:41.969062: step: 392/466, loss: 0.12917044758796692 2023-01-24 01:48:42.616060: step: 394/466, loss: 0.12573650479316711 2023-01-24 01:48:43.190668: step: 396/466, loss: 0.39603760838508606 2023-01-24 01:48:43.841094: step: 398/466, loss: 
0.22024846076965332 2023-01-24 01:48:44.461956: step: 400/466, loss: 0.2544387876987457 2023-01-24 01:48:45.079494: step: 402/466, loss: 0.24689553678035736 2023-01-24 01:48:45.809902: step: 404/466, loss: 0.19383157789707184 2023-01-24 01:48:46.450201: step: 406/466, loss: 0.7465018630027771 2023-01-24 01:48:47.031477: step: 408/466, loss: 0.25976499915122986 2023-01-24 01:48:47.605487: step: 410/466, loss: 0.2334873080253601 2023-01-24 01:48:48.297114: step: 412/466, loss: 0.29507511854171753 2023-01-24 01:48:48.990541: step: 414/466, loss: 0.06479204446077347 2023-01-24 01:48:49.713575: step: 416/466, loss: 0.26891064643859863 2023-01-24 01:48:50.286920: step: 418/466, loss: 0.2599930763244629 2023-01-24 01:48:50.941401: step: 420/466, loss: 0.7796530723571777 2023-01-24 01:48:51.581356: step: 422/466, loss: 0.10785645991563797 2023-01-24 01:48:52.171796: step: 424/466, loss: 0.10304387658834457 2023-01-24 01:48:52.807772: step: 426/466, loss: 0.0517401397228241 2023-01-24 01:48:53.479659: step: 428/466, loss: 0.19476906955242157 2023-01-24 01:48:54.120099: step: 430/466, loss: 0.630233883857727 2023-01-24 01:48:54.733186: step: 432/466, loss: 0.28799697756767273 2023-01-24 01:48:55.386359: step: 434/466, loss: 0.2140231430530548 2023-01-24 01:48:55.946819: step: 436/466, loss: 0.13422894477844238 2023-01-24 01:48:56.589311: step: 438/466, loss: 0.6948972940444946 2023-01-24 01:48:57.206594: step: 440/466, loss: 0.3033457100391388 2023-01-24 01:48:57.832358: step: 442/466, loss: 0.17105747759342194 2023-01-24 01:48:58.424184: step: 444/466, loss: 0.15871189534664154 2023-01-24 01:48:59.031373: step: 446/466, loss: 0.08819045126438141 2023-01-24 01:48:59.638741: step: 448/466, loss: 0.15792372822761536 2023-01-24 01:49:00.256505: step: 450/466, loss: 0.18644051253795624 2023-01-24 01:49:00.845987: step: 452/466, loss: 0.8649181723594666 2023-01-24 01:49:01.455761: step: 454/466, loss: 0.1666182279586792 2023-01-24 01:49:02.083764: step: 456/466, loss: 
0.09834443777799606 2023-01-24 01:49:02.727749: step: 458/466, loss: 0.15646080672740936 2023-01-24 01:49:03.411569: step: 460/466, loss: 0.42353373765945435 2023-01-24 01:49:04.065450: step: 462/466, loss: 0.549027681350708 2023-01-24 01:49:04.816720: step: 464/466, loss: 0.2670876979827881 2023-01-24 01:49:05.485318: step: 466/466, loss: 0.18293240666389465 2023-01-24 01:49:06.103000: step: 468/466, loss: 0.056671373546123505 2023-01-24 01:49:06.706489: step: 470/466, loss: 0.16536259651184082 2023-01-24 01:49:07.407390: step: 472/466, loss: 0.8619071245193481 2023-01-24 01:49:07.967032: step: 474/466, loss: 0.08908943831920624 2023-01-24 01:49:08.660208: step: 476/466, loss: 1.3924860954284668 2023-01-24 01:49:09.216354: step: 478/466, loss: 0.24757857620716095 2023-01-24 01:49:09.844159: step: 480/466, loss: 0.1696261763572693 2023-01-24 01:49:10.496233: step: 482/466, loss: 0.12697002291679382 2023-01-24 01:49:11.140643: step: 484/466, loss: 0.1525171548128128 2023-01-24 01:49:11.716945: step: 486/466, loss: 0.22188422083854675 2023-01-24 01:49:12.407646: step: 488/466, loss: 0.5944315195083618 2023-01-24 01:49:13.025871: step: 490/466, loss: 0.5309344530105591 2023-01-24 01:49:13.661016: step: 492/466, loss: 0.24633949995040894 2023-01-24 01:49:14.357989: step: 494/466, loss: 0.2065393030643463 2023-01-24 01:49:15.020570: step: 496/466, loss: 0.2309531420469284 2023-01-24 01:49:15.630917: step: 498/466, loss: 0.07951924949884415 2023-01-24 01:49:16.287586: step: 500/466, loss: 0.1381409466266632 2023-01-24 01:49:16.930193: step: 502/466, loss: 0.22923199832439423 2023-01-24 01:49:17.467054: step: 504/466, loss: 0.1416565328836441 2023-01-24 01:49:18.059522: step: 506/466, loss: 0.09851235151290894 2023-01-24 01:49:18.674369: step: 508/466, loss: 0.3562050461769104 2023-01-24 01:49:19.296829: step: 510/466, loss: 0.08076535165309906 2023-01-24 01:49:19.884773: step: 512/466, loss: 0.4545714855194092 2023-01-24 01:49:20.461328: step: 514/466, loss: 
0.36821848154067993 2023-01-24 01:49:21.099723: step: 516/466, loss: 0.16712959110736847 2023-01-24 01:49:21.665725: step: 518/466, loss: 0.12090260535478592 2023-01-24 01:49:22.205852: step: 520/466, loss: 0.15361355245113373 2023-01-24 01:49:22.804963: step: 522/466, loss: 0.2245166003704071 2023-01-24 01:49:23.455454: step: 524/466, loss: 0.09379874914884567 2023-01-24 01:49:24.045458: step: 526/466, loss: 0.06804991513490677 2023-01-24 01:49:24.705074: step: 528/466, loss: 0.33074793219566345 2023-01-24 01:49:25.305753: step: 530/466, loss: 0.1294187307357788 2023-01-24 01:49:25.967833: step: 532/466, loss: 0.2864542603492737 2023-01-24 01:49:26.707318: step: 534/466, loss: 0.18013721704483032 2023-01-24 01:49:27.308086: step: 536/466, loss: 0.3223685920238495 2023-01-24 01:49:27.913664: step: 538/466, loss: 0.11612385511398315 2023-01-24 01:49:28.564580: step: 540/466, loss: 0.051373809576034546 2023-01-24 01:49:29.155914: step: 542/466, loss: 0.036936067044734955 2023-01-24 01:49:30.024901: step: 544/466, loss: 0.5212790966033936 2023-01-24 01:49:30.665864: step: 546/466, loss: 0.08071798831224442 2023-01-24 01:49:31.275476: step: 548/466, loss: 0.07064666599035263 2023-01-24 01:49:31.839007: step: 550/466, loss: 0.17133018374443054 2023-01-24 01:49:32.441240: step: 552/466, loss: 0.47561606764793396 2023-01-24 01:49:33.113410: step: 554/466, loss: 0.069338358938694 2023-01-24 01:49:33.762149: step: 556/466, loss: 0.06468157470226288 2023-01-24 01:49:34.424545: step: 558/466, loss: 0.4185435473918915 2023-01-24 01:49:35.076789: step: 560/466, loss: 0.32739049196243286 2023-01-24 01:49:35.678517: step: 562/466, loss: 0.44227561354637146 2023-01-24 01:49:36.220033: step: 564/466, loss: 0.1070290356874466 2023-01-24 01:49:36.806702: step: 566/466, loss: 0.22124671936035156 2023-01-24 01:49:37.374061: step: 568/466, loss: 0.8226217031478882 2023-01-24 01:49:37.988743: step: 570/466, loss: 0.1698789745569229 2023-01-24 01:49:38.622065: step: 572/466, loss: 
0.3633151650428772 2023-01-24 01:49:39.234789: step: 574/466, loss: 0.32821306586265564 2023-01-24 01:49:39.799003: step: 576/466, loss: 0.1997094601392746 2023-01-24 01:49:40.410485: step: 578/466, loss: 0.26783257722854614 2023-01-24 01:49:41.044527: step: 580/466, loss: 0.4049939215183258 2023-01-24 01:49:41.674112: step: 582/466, loss: 0.7677994966506958 2023-01-24 01:49:42.293222: step: 584/466, loss: 0.1088162437081337 2023-01-24 01:49:42.960696: step: 586/466, loss: 0.3080524802207947 2023-01-24 01:49:43.523113: step: 588/466, loss: 0.029904013499617577 2023-01-24 01:49:44.140048: step: 590/466, loss: 0.343360960483551 2023-01-24 01:49:44.830025: step: 592/466, loss: 0.48682013154029846 2023-01-24 01:49:45.515692: step: 594/466, loss: 0.665473997592926 2023-01-24 01:49:46.157445: step: 596/466, loss: 0.3582887053489685 2023-01-24 01:49:46.756186: step: 598/466, loss: 0.18016378581523895 2023-01-24 01:49:47.460351: step: 600/466, loss: 0.1603153645992279 2023-01-24 01:49:48.094230: step: 602/466, loss: 0.33286023139953613 2023-01-24 01:49:48.715277: step: 604/466, loss: 0.1129535436630249 2023-01-24 01:49:49.369265: step: 606/466, loss: 0.9827211499214172 2023-01-24 01:49:50.010152: step: 608/466, loss: 0.22517018020153046 2023-01-24 01:49:50.649542: step: 610/466, loss: 0.10864400118589401 2023-01-24 01:49:51.239789: step: 612/466, loss: 0.25312262773513794 2023-01-24 01:49:51.876896: step: 614/466, loss: 0.5937185287475586 2023-01-24 01:49:52.473627: step: 616/466, loss: 0.3340551555156708 2023-01-24 01:49:53.101163: step: 618/466, loss: 0.22389067709445953 2023-01-24 01:49:53.709773: step: 620/466, loss: 0.15252818167209625 2023-01-24 01:49:54.368175: step: 622/466, loss: 0.34194090962409973 2023-01-24 01:49:55.020671: step: 624/466, loss: 1.6129209995269775 2023-01-24 01:49:55.661032: step: 626/466, loss: 0.15540900826454163 2023-01-24 01:49:56.334340: step: 628/466, loss: 0.6548113226890564 2023-01-24 01:49:57.022795: step: 630/466, loss: 
0.2582647502422333 2023-01-24 01:49:57.626738: step: 632/466, loss: 0.5408102869987488 2023-01-24 01:49:58.246674: step: 634/466, loss: 0.06651590019464493 2023-01-24 01:49:58.909231: step: 636/466, loss: 0.1293291449546814 2023-01-24 01:49:59.515936: step: 638/466, loss: 0.09778794646263123 2023-01-24 01:50:00.094170: step: 640/466, loss: 0.5053603053092957 2023-01-24 01:50:00.685194: step: 642/466, loss: 0.12509065866470337 2023-01-24 01:50:01.260309: step: 644/466, loss: 0.22191186249256134 2023-01-24 01:50:01.865211: step: 646/466, loss: 0.15831364691257477 2023-01-24 01:50:02.516488: step: 648/466, loss: 0.28588342666625977 2023-01-24 01:50:03.162801: step: 650/466, loss: 0.11837802082300186 2023-01-24 01:50:03.789805: step: 652/466, loss: 0.15723967552185059 2023-01-24 01:50:04.417369: step: 654/466, loss: 0.1029759868979454 2023-01-24 01:50:05.025767: step: 656/466, loss: 0.20298771560192108 2023-01-24 01:50:05.713377: step: 658/466, loss: 0.13640204071998596 2023-01-24 01:50:06.284126: step: 660/466, loss: 0.24446871876716614 2023-01-24 01:50:06.915016: step: 662/466, loss: 0.413612425327301 2023-01-24 01:50:07.519418: step: 664/466, loss: 0.04400372877717018 2023-01-24 01:50:08.192207: step: 666/466, loss: 0.38971996307373047 2023-01-24 01:50:08.896780: step: 668/466, loss: 0.6782568097114563 2023-01-24 01:50:09.494440: step: 670/466, loss: 0.25608938932418823 2023-01-24 01:50:10.135624: step: 672/466, loss: 0.3360452353954315 2023-01-24 01:50:10.769456: step: 674/466, loss: 0.13316825032234192 2023-01-24 01:50:11.381699: step: 676/466, loss: 0.28985121846199036 2023-01-24 01:50:11.995144: step: 678/466, loss: 0.7279584407806396 2023-01-24 01:50:12.571127: step: 680/466, loss: 0.2239963263273239 2023-01-24 01:50:13.215424: step: 682/466, loss: 0.20093773305416107 2023-01-24 01:50:13.863276: step: 684/466, loss: 1.3977488279342651 2023-01-24 01:50:14.497606: step: 686/466, loss: 0.36743053793907166 2023-01-24 01:50:15.157446: step: 688/466, loss: 
12.828628540039062 2023-01-24 01:50:15.803766: step: 690/466, loss: 0.10988299548625946 2023-01-24 01:50:16.447950: step: 692/466, loss: 0.7742509245872498 2023-01-24 01:50:17.024751: step: 694/466, loss: 0.5149338841438293 2023-01-24 01:50:17.617153: step: 696/466, loss: 0.5421413779258728 2023-01-24 01:50:18.241803: step: 698/466, loss: 0.1817217469215393 2023-01-24 01:50:18.902696: step: 700/466, loss: 0.11957786977291107 2023-01-24 01:50:19.585996: step: 702/466, loss: 0.10088533908128738 2023-01-24 01:50:20.275385: step: 704/466, loss: 0.44104164838790894 2023-01-24 01:50:20.962096: step: 706/466, loss: 0.39744728803634644 2023-01-24 01:50:21.652788: step: 708/466, loss: 0.25253307819366455 2023-01-24 01:50:22.293510: step: 710/466, loss: 0.20075541734695435 2023-01-24 01:50:22.918152: step: 712/466, loss: 0.25348201394081116 2023-01-24 01:50:23.535823: step: 714/466, loss: 0.14546802639961243 2023-01-24 01:50:24.196028: step: 716/466, loss: 0.5452193021774292 2023-01-24 01:50:24.836276: step: 718/466, loss: 0.6032496690750122 2023-01-24 01:50:25.469430: step: 720/466, loss: 0.2527567148208618 2023-01-24 01:50:26.072644: step: 722/466, loss: 0.23223836719989777 2023-01-24 01:50:26.673650: step: 724/466, loss: 0.28572478890419006 2023-01-24 01:50:27.283101: step: 726/466, loss: 0.03468862920999527 2023-01-24 01:50:27.928525: step: 728/466, loss: 0.1436808854341507 2023-01-24 01:50:28.595835: step: 730/466, loss: 0.7947600483894348 2023-01-24 01:50:29.183918: step: 732/466, loss: 0.6669540405273438 2023-01-24 01:50:29.906568: step: 734/466, loss: 0.9614865779876709 2023-01-24 01:50:30.523870: step: 736/466, loss: 0.21250706911087036 2023-01-24 01:50:31.185517: step: 738/466, loss: 0.4781217873096466 2023-01-24 01:50:31.797056: step: 740/466, loss: 0.10923323035240173 2023-01-24 01:50:32.446086: step: 742/466, loss: 0.5838552713394165 2023-01-24 01:50:33.049380: step: 744/466, loss: 0.36028003692626953 2023-01-24 01:50:33.622846: step: 746/466, loss: 
0.34653040766716003 2023-01-24 01:50:34.280021: step: 748/466, loss: 0.1995936930179596 2023-01-24 01:50:34.893522: step: 750/466, loss: 0.47797736525535583 2023-01-24 01:50:35.489537: step: 752/466, loss: 0.22129611670970917 2023-01-24 01:50:36.067985: step: 754/466, loss: 0.15425199270248413 2023-01-24 01:50:36.702607: step: 756/466, loss: 0.34954312443733215 2023-01-24 01:50:37.376710: step: 758/466, loss: 0.22375522553920746 2023-01-24 01:50:37.994882: step: 760/466, loss: 0.2790171504020691 2023-01-24 01:50:38.646692: step: 762/466, loss: 0.17393219470977783 2023-01-24 01:50:39.257583: step: 764/466, loss: 0.27619192004203796 2023-01-24 01:50:39.862896: step: 766/466, loss: 0.0823623314499855 2023-01-24 01:50:40.497820: step: 768/466, loss: 0.14845037460327148 2023-01-24 01:50:41.126365: step: 770/466, loss: 0.06040605902671814 2023-01-24 01:50:41.696416: step: 772/466, loss: 0.11510689556598663 2023-01-24 01:50:42.301670: step: 774/466, loss: 0.5105097889900208 2023-01-24 01:50:42.977890: step: 776/466, loss: 0.4596703052520752 2023-01-24 01:50:43.595815: step: 778/466, loss: 0.08241189271211624 2023-01-24 01:50:44.218023: step: 780/466, loss: 0.2193770706653595 2023-01-24 01:50:44.930776: step: 782/466, loss: 0.13047076761722565 2023-01-24 01:50:45.576203: step: 784/466, loss: 0.1318654865026474 2023-01-24 01:50:46.234735: step: 786/466, loss: 0.4017501175403595 2023-01-24 01:50:46.831191: step: 788/466, loss: 0.5616803169250488 2023-01-24 01:50:47.484277: step: 790/466, loss: 0.286371648311615 2023-01-24 01:50:48.116011: step: 792/466, loss: 0.263804167509079 2023-01-24 01:50:48.697260: step: 794/466, loss: 0.1747511327266693 2023-01-24 01:50:49.347053: step: 796/466, loss: 0.38937389850616455 2023-01-24 01:50:49.922626: step: 798/466, loss: 0.3421323001384735 2023-01-24 01:50:50.572742: step: 800/466, loss: 0.09835729748010635 2023-01-24 01:50:51.164440: step: 802/466, loss: 0.18446092307567596 2023-01-24 01:50:51.739819: step: 804/466, loss: 
0.20434127748012543 2023-01-24 01:50:52.347077: step: 806/466, loss: 0.727473258972168 2023-01-24 01:50:53.014387: step: 808/466, loss: 0.09985418617725372 2023-01-24 01:50:53.708082: step: 810/466, loss: 0.7417232990264893 2023-01-24 01:50:54.320776: step: 812/466, loss: 0.1557294726371765 2023-01-24 01:50:54.900873: step: 814/466, loss: 0.11794015765190125 2023-01-24 01:50:55.524201: step: 816/466, loss: 0.24623864889144897 2023-01-24 01:50:56.157150: step: 818/466, loss: 0.07437814772129059 2023-01-24 01:50:56.761545: step: 820/466, loss: 0.034862954169511795 2023-01-24 01:50:57.401088: step: 822/466, loss: 0.32118910551071167 2023-01-24 01:50:58.016142: step: 824/466, loss: 0.19968284666538239 2023-01-24 01:50:58.644203: step: 826/466, loss: 0.30660173296928406 2023-01-24 01:50:59.300592: step: 828/466, loss: 0.2359127253293991 2023-01-24 01:50:59.919190: step: 830/466, loss: 0.15513497591018677 2023-01-24 01:51:00.559036: step: 832/466, loss: 0.10971243679523468 2023-01-24 01:51:01.265792: step: 834/466, loss: 0.30356305837631226 2023-01-24 01:51:01.944129: step: 836/466, loss: 0.42627593874931335 2023-01-24 01:51:02.594709: step: 838/466, loss: 0.2816768288612366 2023-01-24 01:51:03.233578: step: 840/466, loss: 0.2914235293865204 2023-01-24 01:51:03.807850: step: 842/466, loss: 0.13204285502433777 2023-01-24 01:51:04.349643: step: 844/466, loss: 0.8028279542922974 2023-01-24 01:51:04.976825: step: 846/466, loss: 0.050989869982004166 2023-01-24 01:51:05.600988: step: 848/466, loss: 0.06458903849124908 2023-01-24 01:51:06.192162: step: 850/466, loss: 0.5054102540016174 2023-01-24 01:51:06.754894: step: 852/466, loss: 0.22338427603244781 2023-01-24 01:51:07.359488: step: 854/466, loss: 0.49166548252105713 2023-01-24 01:51:07.997776: step: 856/466, loss: 0.2450941950082779 2023-01-24 01:51:08.652991: step: 858/466, loss: 0.4752826988697052 2023-01-24 01:51:09.233172: step: 860/466, loss: 0.07163707166910172 2023-01-24 01:51:09.856669: step: 862/466, loss: 
0.754447340965271 2023-01-24 01:51:10.525618: step: 864/466, loss: 0.21170610189437866 2023-01-24 01:51:11.163660: step: 866/466, loss: 0.2848890423774719 2023-01-24 01:51:11.753658: step: 868/466, loss: 0.20099715888500214 2023-01-24 01:51:12.362219: step: 870/466, loss: 0.15913009643554688 2023-01-24 01:51:13.065509: step: 872/466, loss: 0.08308277279138565 2023-01-24 01:51:13.737627: step: 874/466, loss: 1.4316819906234741 2023-01-24 01:51:14.339546: step: 876/466, loss: 0.19593428075313568 2023-01-24 01:51:14.949597: step: 878/466, loss: 0.1286579817533493 2023-01-24 01:51:15.538795: step: 880/466, loss: 0.08160890638828278 2023-01-24 01:51:16.170461: step: 882/466, loss: 0.30313640832901 2023-01-24 01:51:16.732947: step: 884/466, loss: 0.11539170145988464 2023-01-24 01:51:17.361270: step: 886/466, loss: 0.7092605829238892 2023-01-24 01:51:17.980551: step: 888/466, loss: 0.2879091501235962 2023-01-24 01:51:18.597100: step: 890/466, loss: 0.05659577250480652 2023-01-24 01:51:19.197965: step: 892/466, loss: 0.2133093923330307 2023-01-24 01:51:19.807579: step: 894/466, loss: 0.19630642235279083 2023-01-24 01:51:20.391122: step: 896/466, loss: 0.28548285365104675 2023-01-24 01:51:20.989340: step: 898/466, loss: 0.13361498713493347 2023-01-24 01:51:21.578910: step: 900/466, loss: 0.3782631754875183 2023-01-24 01:51:22.158768: step: 902/466, loss: 0.4256414473056793 2023-01-24 01:51:22.814317: step: 904/466, loss: 0.1539887934923172 2023-01-24 01:51:23.436472: step: 906/466, loss: 0.32209718227386475 2023-01-24 01:51:24.058459: step: 908/466, loss: 0.17927029728889465 2023-01-24 01:51:24.652673: step: 910/466, loss: 0.23833033442497253 2023-01-24 01:51:25.167110: step: 912/466, loss: 0.3317146599292755 2023-01-24 01:51:25.774728: step: 914/466, loss: 0.3678411543369293 2023-01-24 01:51:26.392360: step: 916/466, loss: 0.0920814499258995 2023-01-24 01:51:27.005298: step: 918/466, loss: 0.1649651676416397 2023-01-24 01:51:27.600710: step: 920/466, loss: 
0.44486376643180847 2023-01-24 01:51:28.272163: step: 922/466, loss: 0.2081485092639923 2023-01-24 01:51:28.919648: step: 924/466, loss: 0.22808825969696045 2023-01-24 01:51:29.538333: step: 926/466, loss: 0.2163880616426468 2023-01-24 01:51:30.259380: step: 928/466, loss: 0.24781255424022675 2023-01-24 01:51:30.900321: step: 930/466, loss: 0.6487457752227783 2023-01-24 01:51:31.502258: step: 932/466, loss: 0.12162760645151138
==================================================
Loss: 0.332
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33592695681511475, 'r': 0.31489168247944344, 'f1': 0.3250693764283383}, 'combined': 0.23952480368403872, 'epoch': 11}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35906774970282124, 'r': 0.2969989943522903, 'f1': 0.32509729088514655}, 'combined': 0.2156085659756412, 'epoch': 11}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33049568965517245, 'r': 0.2541311553030303, 'f1': 0.287326017130621}, 'combined': 0.19155067808708065, 'epoch': 11}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3614236525454653, 'r': 0.27662208558458123, 'f1': 0.3133874535068085}, 'combined': 0.20452654860444344, 'epoch': 11}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3225165386256935, 'r': 0.304156963371859, 'f1': 0.3130678119081439}, 'combined': 0.2306815456165271, 'epoch': 11}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34849716429494293, 'r': 0.28705081899142665, 'f1': 0.3148035995953371}, 'combined': 0.20878166190778832, 'epoch': 11}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32440476190476186, 'r': 0.2595238095238095, 'f1': 0.2883597883597883}, 'combined': 0.19223985890652553, 'epoch': 11}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5166666666666667, 'r': 0.33695652173913043, 'f1': 0.40789473684210525}, 'combined': 0.27192982456140347, 'epoch': 11}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3, 'r': 0.10344827586206896, 'f1': 0.15384615384615385}, 'combined': 0.10256410256410256, 'epoch': 11}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3249718423169751, 'r': 0.34840434707607393, 'f1': 0.33628038628038637}, 'combined': 0.24778554778554784, 'epoch': 10}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35564524631816125, 'r': 0.2998937951296123, 'f1': 0.3253987814443737}, 'combined': 0.21580851826362604, 'epoch': 10}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33226495726495725, 'r': 0.3702380952380952, 'f1': 0.3502252252252252}, 'combined': 0.23348348348348347, 'epoch': 10}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 12
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 01:53:59.736156: step: 2/466, loss: 0.13710524141788483 2023-01-24 01:54:00.329068: step: 4/466, loss: 0.06199297681450844 2023-01-24 01:54:00.911919: step: 6/466, loss: 4.465801239013672 2023-01-24 01:54:01.542429: step: 8/466, loss: 0.15292063355445862 2023-01-24 01:54:02.183297: step: 10/466, loss: 0.15151306986808777 2023-01-24 01:54:02.835164: step: 12/466, loss: 0.11174985766410828 2023-01-24 01:54:03.464965: step: 14/466, loss: 0.13583029806613922 2023-01-24 01:54:04.118516: step: 16/466, loss: 1.3309175968170166 2023-01-24 01:54:04.737337: step: 18/466, loss: 0.11795716732740402 2023-01-24 01:54:05.342444: step: 20/466, loss: 0.2237384021282196 2023-01-24 01:54:06.019200: step: 22/466, loss: 0.18687458336353302 2023-01-24 01:54:06.643980: step: 24/466, loss: 0.13026343286037445 2023-01-24 01:54:07.288818: step: 26/466, loss: 0.13793495297431946 2023-01-24 01:54:07.935934: step: 28/466, loss: 0.19610273838043213 2023-01-24 01:54:08.500527: step: 30/466, loss: 0.11369558423757553 2023-01-24 01:54:09.117747: step: 32/466, loss: 0.12538641691207886 2023-01-24 01:54:09.781984: step: 
34/466, loss: 0.08715818077325821 2023-01-24 01:54:10.436115: step: 36/466, loss: 0.08705645054578781 2023-01-24 01:54:11.028321: step: 38/466, loss: 0.08952294290065765 2023-01-24 01:54:11.692789: step: 40/466, loss: 0.16769270598888397 2023-01-24 01:54:12.322069: step: 42/466, loss: 0.32103365659713745 2023-01-24 01:54:12.916208: step: 44/466, loss: 0.06588691473007202 2023-01-24 01:54:13.526308: step: 46/466, loss: 0.2702665328979492 2023-01-24 01:54:14.122705: step: 48/466, loss: 0.8478837609291077 2023-01-24 01:54:14.771367: step: 50/466, loss: 2.1849398612976074 2023-01-24 01:54:15.341853: step: 52/466, loss: 0.07038564234972 2023-01-24 01:54:15.967559: step: 54/466, loss: 0.10733061283826828 2023-01-24 01:54:16.632375: step: 56/466, loss: 0.07326068729162216 2023-01-24 01:54:17.266664: step: 58/466, loss: 0.14899058640003204 2023-01-24 01:54:17.933584: step: 60/466, loss: 0.15643711388111115 2023-01-24 01:54:18.481819: step: 62/466, loss: 0.0960053876042366 2023-01-24 01:54:19.091557: step: 64/466, loss: 0.12543736398220062 2023-01-24 01:54:19.718764: step: 66/466, loss: 0.3649769425392151 2023-01-24 01:54:20.324189: step: 68/466, loss: 0.13690850138664246 2023-01-24 01:54:20.964709: step: 70/466, loss: 0.2428659349679947 2023-01-24 01:54:21.583963: step: 72/466, loss: 0.20831294357776642 2023-01-24 01:54:22.210279: step: 74/466, loss: 0.18887560069561005 2023-01-24 01:54:22.842600: step: 76/466, loss: 0.09079209715127945 2023-01-24 01:54:23.494607: step: 78/466, loss: 0.23012274503707886 2023-01-24 01:54:24.104364: step: 80/466, loss: 0.4214737117290497 2023-01-24 01:54:24.726493: step: 82/466, loss: 0.08949767053127289 2023-01-24 01:54:25.390492: step: 84/466, loss: 0.13404501974582672 2023-01-24 01:54:26.043591: step: 86/466, loss: 0.20653145015239716 2023-01-24 01:54:26.689223: step: 88/466, loss: 0.062412939965724945 2023-01-24 01:54:27.307129: step: 90/466, loss: 0.24332228302955627 2023-01-24 01:54:27.944421: step: 92/466, loss: 0.6783890724182129 
2023-01-24 01:54:28.634505: step: 94/466, loss: 0.222461998462677 2023-01-24 01:54:29.296407: step: 96/466, loss: 0.2090371549129486 2023-01-24 01:54:29.911242: step: 98/466, loss: 0.17527589201927185 2023-01-24 01:54:30.532944: step: 100/466, loss: 0.18766961991786957 2023-01-24 01:54:31.158807: step: 102/466, loss: 0.07617698609828949 2023-01-24 01:54:31.759927: step: 104/466, loss: 0.09402088820934296 2023-01-24 01:54:32.417229: step: 106/466, loss: 0.0528925284743309 2023-01-24 01:54:33.079758: step: 108/466, loss: 0.10213060677051544 2023-01-24 01:54:33.714776: step: 110/466, loss: 0.511379599571228 2023-01-24 01:54:34.341482: step: 112/466, loss: 0.22382397949695587 2023-01-24 01:54:34.973883: step: 114/466, loss: 0.13445906341075897 2023-01-24 01:54:35.548885: step: 116/466, loss: 0.3029036521911621 2023-01-24 01:54:36.227085: step: 118/466, loss: 0.3222705125808716 2023-01-24 01:54:36.862322: step: 120/466, loss: 0.15039843320846558 2023-01-24 01:54:37.435549: step: 122/466, loss: 0.1167134940624237 2023-01-24 01:54:37.997454: step: 124/466, loss: 0.19666531682014465 2023-01-24 01:54:38.616969: step: 126/466, loss: 0.05981430411338806 2023-01-24 01:54:39.253562: step: 128/466, loss: 0.0997176542878151 2023-01-24 01:54:39.881538: step: 130/466, loss: 0.11331747472286224 2023-01-24 01:54:40.492394: step: 132/466, loss: 0.15961550176143646 2023-01-24 01:54:41.150599: step: 134/466, loss: 0.6935784816741943 2023-01-24 01:54:41.797593: step: 136/466, loss: 0.06055481359362602 2023-01-24 01:54:42.472135: step: 138/466, loss: 0.17825272679328918 2023-01-24 01:54:43.101087: step: 140/466, loss: 0.4913868308067322 2023-01-24 01:54:43.755545: step: 142/466, loss: 0.8437122702598572 2023-01-24 01:54:44.354293: step: 144/466, loss: 0.14781421422958374 2023-01-24 01:54:44.973141: step: 146/466, loss: 0.09563115239143372 2023-01-24 01:54:45.668189: step: 148/466, loss: 0.17472043633460999 2023-01-24 01:54:46.281187: step: 150/466, loss: 0.3626697063446045 2023-01-24 
01:54:47.015686: step: 152/466, loss: 0.19643613696098328 2023-01-24 01:54:47.629711: step: 154/466, loss: 0.4196276366710663 2023-01-24 01:54:48.253831: step: 156/466, loss: 0.16844557225704193 2023-01-24 01:54:48.898449: step: 158/466, loss: 3.0080199241638184 2023-01-24 01:54:49.474095: step: 160/466, loss: 0.034158289432525635 2023-01-24 01:54:50.056138: step: 162/466, loss: 0.21539528667926788 2023-01-24 01:54:50.719274: step: 164/466, loss: 0.15623827278614044 2023-01-24 01:54:51.341939: step: 166/466, loss: 0.13679563999176025 2023-01-24 01:54:52.000800: step: 168/466, loss: 0.1737860143184662 2023-01-24 01:54:52.607269: step: 170/466, loss: 2.247715711593628 2023-01-24 01:54:53.263705: step: 172/466, loss: 0.19016523659229279 2023-01-24 01:54:54.003958: step: 174/466, loss: 0.13316655158996582 2023-01-24 01:54:54.638904: step: 176/466, loss: 0.3083028495311737 2023-01-24 01:54:55.291542: step: 178/466, loss: 0.07834689319133759 2023-01-24 01:54:55.960147: step: 180/466, loss: 0.31468725204467773 2023-01-24 01:54:56.596064: step: 182/466, loss: 0.2045125812292099 2023-01-24 01:54:57.280142: step: 184/466, loss: 0.31081730127334595 2023-01-24 01:54:57.897971: step: 186/466, loss: 0.34246203303337097 2023-01-24 01:54:58.539143: step: 188/466, loss: 0.09704740345478058 2023-01-24 01:54:59.149625: step: 190/466, loss: 0.03455285355448723 2023-01-24 01:54:59.810078: step: 192/466, loss: 0.13362093269824982 2023-01-24 01:55:00.367011: step: 194/466, loss: 0.08739200234413147 2023-01-24 01:55:00.986785: step: 196/466, loss: 0.6040061712265015 2023-01-24 01:55:01.716634: step: 198/466, loss: 0.09349583834409714 2023-01-24 01:55:02.392615: step: 200/466, loss: 0.3042532205581665 2023-01-24 01:55:03.026529: step: 202/466, loss: 0.11280182003974915 2023-01-24 01:55:03.648451: step: 204/466, loss: 0.21730904281139374 2023-01-24 01:55:04.282939: step: 206/466, loss: 0.5868622064590454 2023-01-24 01:55:04.909256: step: 208/466, loss: 0.1880406141281128 2023-01-24 
01:55:05.558193: step: 210/466, loss: 0.08322806656360626 2023-01-24 01:55:06.178876: step: 212/466, loss: 0.08090437948703766 2023-01-24 01:55:06.760140: step: 214/466, loss: 0.4925406873226166 2023-01-24 01:55:07.302905: step: 216/466, loss: 0.14719174802303314 2023-01-24 01:55:07.937204: step: 218/466, loss: 0.07024139910936356 2023-01-24 01:55:08.557645: step: 220/466, loss: 0.3707457184791565 2023-01-24 01:55:09.143035: step: 222/466, loss: 0.10386056452989578 2023-01-24 01:55:09.710628: step: 224/466, loss: 0.10939282923936844 2023-01-24 01:55:10.306957: step: 226/466, loss: 0.23045538365840912 2023-01-24 01:55:10.944637: step: 228/466, loss: 0.4715212285518646 2023-01-24 01:55:11.505061: step: 230/466, loss: 0.06053485721349716 2023-01-24 01:55:12.117028: step: 232/466, loss: 0.08302391320466995 2023-01-24 01:55:12.772439: step: 234/466, loss: 0.2692738473415375 2023-01-24 01:55:13.368198: step: 236/466, loss: 0.07110875099897385 2023-01-24 01:55:13.960063: step: 238/466, loss: 0.10884547978639603 2023-01-24 01:55:14.609506: step: 240/466, loss: 0.18593835830688477 2023-01-24 01:55:15.273334: step: 242/466, loss: 0.24864543974399567 2023-01-24 01:55:15.890584: step: 244/466, loss: 2.9477343559265137 2023-01-24 01:55:16.563878: step: 246/466, loss: 0.1442636251449585 2023-01-24 01:55:17.165280: step: 248/466, loss: 0.107085682451725 2023-01-24 01:55:17.775963: step: 250/466, loss: 0.7243697047233582 2023-01-24 01:55:18.396747: step: 252/466, loss: 0.26127350330352783 2023-01-24 01:55:19.106030: step: 254/466, loss: 0.15399640798568726 2023-01-24 01:55:19.778207: step: 256/466, loss: 0.1712428778409958 2023-01-24 01:55:20.371942: step: 258/466, loss: 0.15944349765777588 2023-01-24 01:55:21.010659: step: 260/466, loss: 0.2489410638809204 2023-01-24 01:55:21.653306: step: 262/466, loss: 0.05128001049160957 2023-01-24 01:55:22.217422: step: 264/466, loss: 0.0540747307240963 2023-01-24 01:55:22.845414: step: 266/466, loss: 0.3385080099105835 2023-01-24 
01:55:23.531531: step: 268/466, loss: 0.21575245261192322 2023-01-24 01:55:24.167681: step: 270/466, loss: 0.12338259816169739 2023-01-24 01:55:24.790593: step: 272/466, loss: 0.19334723055362701 2023-01-24 01:55:25.406520: step: 274/466, loss: 0.15544076263904572 2023-01-24 01:55:25.981142: step: 276/466, loss: 0.13702447712421417 2023-01-24 01:55:26.627225: step: 278/466, loss: 0.18537229299545288 2023-01-24 01:55:27.177713: step: 280/466, loss: 0.02131054364144802 2023-01-24 01:55:27.879758: step: 282/466, loss: 0.16477516293525696 2023-01-24 01:55:28.549346: step: 284/466, loss: 0.33140814304351807 2023-01-24 01:55:29.269166: step: 286/466, loss: 0.1358073502779007 2023-01-24 01:55:29.929814: step: 288/466, loss: 0.1653151959180832 2023-01-24 01:55:30.544736: step: 290/466, loss: 0.08968791365623474 2023-01-24 01:55:31.228077: step: 292/466, loss: 0.20049548149108887 2023-01-24 01:55:31.852375: step: 294/466, loss: 0.21416182816028595 2023-01-24 01:55:32.435269: step: 296/466, loss: 0.11540967226028442 2023-01-24 01:55:33.050734: step: 298/466, loss: 0.2275293916463852 2023-01-24 01:55:33.622186: step: 300/466, loss: 0.1523159146308899 2023-01-24 01:55:34.235894: step: 302/466, loss: 0.38231658935546875 2023-01-24 01:55:34.840728: step: 304/466, loss: 0.6548060774803162 2023-01-24 01:55:35.415438: step: 306/466, loss: 0.11360199004411697 2023-01-24 01:55:36.076115: step: 308/466, loss: 0.6178106665611267 2023-01-24 01:55:36.690268: step: 310/466, loss: 0.9341347813606262 2023-01-24 01:55:37.356951: step: 312/466, loss: 0.08135673403739929 2023-01-24 01:55:37.969639: step: 314/466, loss: 0.09795916080474854 2023-01-24 01:55:38.567873: step: 316/466, loss: 0.11982527375221252 2023-01-24 01:55:39.130398: step: 318/466, loss: 0.12153121083974838 2023-01-24 01:55:39.710915: step: 320/466, loss: 0.22064079344272614 2023-01-24 01:55:40.294716: step: 322/466, loss: 0.5312843918800354 2023-01-24 01:55:40.837351: step: 324/466, loss: 0.11995311826467514 2023-01-24 
01:55:41.439758: step: 326/466, loss: 0.34171998500823975 2023-01-24 01:55:42.047021: step: 328/466, loss: 0.2656693756580353 2023-01-24 01:55:42.697895: step: 330/466, loss: 0.1589646339416504 2023-01-24 01:55:43.324064: step: 332/466, loss: 0.1326562464237213 2023-01-24 01:55:43.950267: step: 334/466, loss: 0.36644625663757324 2023-01-24 01:55:44.663591: step: 336/466, loss: 0.11936230212450027 2023-01-24 01:55:45.268709: step: 338/466, loss: 0.17160767316818237 2023-01-24 01:55:45.843714: step: 340/466, loss: 0.20013047754764557 2023-01-24 01:55:46.466657: step: 342/466, loss: 0.08777187019586563 2023-01-24 01:55:47.079483: step: 344/466, loss: 0.3543819189071655 2023-01-24 01:55:47.677860: step: 346/466, loss: 0.09497516602277756 2023-01-24 01:55:48.241211: step: 348/466, loss: 0.1696675419807434 2023-01-24 01:55:48.839677: step: 350/466, loss: 0.0606960654258728 2023-01-24 01:55:49.524112: step: 352/466, loss: 0.128300741314888 2023-01-24 01:55:50.159329: step: 354/466, loss: 0.1335906833410263 2023-01-24 01:55:50.774741: step: 356/466, loss: 0.291808545589447 2023-01-24 01:55:51.455187: step: 358/466, loss: 0.620805561542511 2023-01-24 01:55:52.061764: step: 360/466, loss: 0.23039770126342773 2023-01-24 01:55:52.693375: step: 362/466, loss: 0.0966557189822197 2023-01-24 01:55:53.327717: step: 364/466, loss: 0.321336954832077 2023-01-24 01:55:53.945795: step: 366/466, loss: 0.17469625174999237 2023-01-24 01:55:54.549499: step: 368/466, loss: 0.2617659866809845 2023-01-24 01:55:55.137643: step: 370/466, loss: 0.3682243525981903 2023-01-24 01:55:55.732126: step: 372/466, loss: 0.19373668730258942 2023-01-24 01:55:56.289013: step: 374/466, loss: 0.4200045168399811 2023-01-24 01:55:56.907767: step: 376/466, loss: 0.11585645377635956 2023-01-24 01:55:57.587688: step: 378/466, loss: 0.5483799576759338 2023-01-24 01:55:58.236240: step: 380/466, loss: 1.5028091669082642 2023-01-24 01:55:58.893625: step: 382/466, loss: 0.1518053114414215 2023-01-24 01:55:59.543111: 
step: 384/466, loss: 0.17489144206047058 2023-01-24 01:56:00.184697: step: 386/466, loss: 0.4138447940349579 2023-01-24 01:56:00.783302: step: 388/466, loss: 0.10435126721858978 2023-01-24 01:56:01.362493: step: 390/466, loss: 0.06659137457609177 2023-01-24 01:56:01.950236: step: 392/466, loss: 0.3992398679256439 2023-01-24 01:56:02.554691: step: 394/466, loss: 0.25137820839881897 2023-01-24 01:56:03.187085: step: 396/466, loss: 0.491972416639328 2023-01-24 01:56:03.813755: step: 398/466, loss: 0.28629037737846375 2023-01-24 01:56:04.401002: step: 400/466, loss: 0.24112729728221893 2023-01-24 01:56:04.983474: step: 402/466, loss: 0.21412010490894318 2023-01-24 01:56:05.646610: step: 404/466, loss: 0.14537203311920166 2023-01-24 01:56:06.275392: step: 406/466, loss: 0.2578541040420532 2023-01-24 01:56:06.874564: step: 408/466, loss: 0.23483307659626007 2023-01-24 01:56:07.524613: step: 410/466, loss: 0.21885579824447632 2023-01-24 01:56:08.143887: step: 412/466, loss: 0.05923938378691673 2023-01-24 01:56:08.747766: step: 414/466, loss: 0.1962558776140213 2023-01-24 01:56:09.402721: step: 416/466, loss: 0.09239842742681503 2023-01-24 01:56:10.003426: step: 418/466, loss: 0.3583330810070038 2023-01-24 01:56:10.620613: step: 420/466, loss: 0.3155226409435272 2023-01-24 01:56:11.224804: step: 422/466, loss: 0.1832641214132309 2023-01-24 01:56:11.849005: step: 424/466, loss: 0.27545690536499023 2023-01-24 01:56:12.494964: step: 426/466, loss: 0.19648611545562744 2023-01-24 01:56:13.197584: step: 428/466, loss: 0.15953874588012695 2023-01-24 01:56:13.847245: step: 430/466, loss: 0.10485388338565826 2023-01-24 01:56:14.436587: step: 432/466, loss: 0.10260970890522003 2023-01-24 01:56:15.006366: step: 434/466, loss: 0.3881319761276245 2023-01-24 01:56:15.657539: step: 436/466, loss: 0.09818845987319946 2023-01-24 01:56:16.217985: step: 438/466, loss: 0.25729766488075256 2023-01-24 01:56:16.875687: step: 440/466, loss: 1.216381311416626 2023-01-24 01:56:17.527709: step: 
442/466, loss: 0.02328294888138771 2023-01-24 01:56:18.229534: step: 444/466, loss: 0.3791840076446533 2023-01-24 01:56:18.790395: step: 446/466, loss: 0.08898995816707611 2023-01-24 01:56:19.449482: step: 448/466, loss: 0.0699378028512001 2023-01-24 01:56:20.046768: step: 450/466, loss: 0.2938188314437866 2023-01-24 01:56:20.743960: step: 452/466, loss: 0.09860727190971375 2023-01-24 01:56:21.401878: step: 454/466, loss: 0.15588293969631195 2023-01-24 01:56:22.020277: step: 456/466, loss: 0.286975622177124 2023-01-24 01:56:22.631416: step: 458/466, loss: 0.12239721417427063 2023-01-24 01:56:23.277947: step: 460/466, loss: 0.061540111899375916 2023-01-24 01:56:23.920843: step: 462/466, loss: 0.19804197549819946 2023-01-24 01:56:24.529817: step: 464/466, loss: 0.04648022726178169 2023-01-24 01:56:25.146761: step: 466/466, loss: 0.33823028206825256 2023-01-24 01:56:25.767976: step: 468/466, loss: 1.070517897605896 2023-01-24 01:56:26.386619: step: 470/466, loss: 0.30970969796180725 2023-01-24 01:56:26.992144: step: 472/466, loss: 0.20133820176124573 2023-01-24 01:56:27.683918: step: 474/466, loss: 0.1282711923122406 2023-01-24 01:56:28.283252: step: 476/466, loss: 0.2471041977405548 2023-01-24 01:56:28.852895: step: 478/466, loss: 0.17406708002090454 2023-01-24 01:56:29.416053: step: 480/466, loss: 0.11726734787225723 2023-01-24 01:56:30.092164: step: 482/466, loss: 0.2776939868927002 2023-01-24 01:56:30.721950: step: 484/466, loss: 0.16853603720664978 2023-01-24 01:56:31.352738: step: 486/466, loss: 0.39846131205558777 2023-01-24 01:56:31.980068: step: 488/466, loss: 0.059623707085847855 2023-01-24 01:56:32.632269: step: 490/466, loss: 0.09924140572547913 2023-01-24 01:56:33.274649: step: 492/466, loss: 0.09131093323230743 2023-01-24 01:56:33.872183: step: 494/466, loss: 0.0609959177672863 2023-01-24 01:56:34.441754: step: 496/466, loss: 0.16200877726078033 2023-01-24 01:56:35.053583: step: 498/466, loss: 0.14968974888324738 2023-01-24 01:56:35.838521: step: 
500/466, loss: 0.06152166798710823 2023-01-24 01:56:36.468444: step: 502/466, loss: 0.21100901067256927 2023-01-24 01:56:37.081015: step: 504/466, loss: 0.16417381167411804 2023-01-24 01:56:37.743714: step: 506/466, loss: 0.8375449180603027 2023-01-24 01:56:38.329838: step: 508/466, loss: 0.09576047211885452 2023-01-24 01:56:38.937025: step: 510/466, loss: 0.7462514638900757 2023-01-24 01:56:39.575508: step: 512/466, loss: 0.08928915858268738 2023-01-24 01:56:40.259373: step: 514/466, loss: 0.08065564185380936 2023-01-24 01:56:40.883504: step: 516/466, loss: 0.24708355963230133 2023-01-24 01:56:41.504726: step: 518/466, loss: 0.6688655018806458 2023-01-24 01:56:42.106060: step: 520/466, loss: 0.1237562745809555 2023-01-24 01:56:42.742991: step: 522/466, loss: 0.870952844619751 2023-01-24 01:56:43.435432: step: 524/466, loss: 0.19325175881385803 2023-01-24 01:56:44.179573: step: 526/466, loss: 0.11155366152524948 2023-01-24 01:56:44.758288: step: 528/466, loss: 0.49736759066581726 2023-01-24 01:56:45.362387: step: 530/466, loss: 0.2320837527513504 2023-01-24 01:56:46.035363: step: 532/466, loss: 0.9562352895736694 2023-01-24 01:56:46.622486: step: 534/466, loss: 0.19843395054340363 2023-01-24 01:56:47.277372: step: 536/466, loss: 0.1723707616329193 2023-01-24 01:56:47.964221: step: 538/466, loss: 0.2933112680912018 2023-01-24 01:56:48.614469: step: 540/466, loss: 0.16960805654525757 2023-01-24 01:56:49.216070: step: 542/466, loss: 0.07599765062332153 2023-01-24 01:56:49.836575: step: 544/466, loss: 0.13477495312690735 2023-01-24 01:56:50.524426: step: 546/466, loss: 0.1572151482105255 2023-01-24 01:56:51.196827: step: 548/466, loss: 0.3136172294616699 2023-01-24 01:56:51.798707: step: 550/466, loss: 0.6616876125335693 2023-01-24 01:56:52.355841: step: 552/466, loss: 0.18141965568065643 2023-01-24 01:56:52.974322: step: 554/466, loss: 0.14190468192100525 2023-01-24 01:56:53.584249: step: 556/466, loss: 0.4681580364704132 2023-01-24 01:56:54.196506: step: 558/466, 
loss: 0.26399046182632446 2023-01-24 01:56:54.728976: step: 560/466, loss: 0.10197646170854568 2023-01-24 01:56:55.357064: step: 562/466, loss: 0.14712463319301605 2023-01-24 01:56:56.017211: step: 564/466, loss: 0.8725566267967224 2023-01-24 01:56:56.652982: step: 566/466, loss: 0.23474819958209991 2023-01-24 01:56:57.267695: step: 568/466, loss: 0.07910080999135971 2023-01-24 01:56:57.892194: step: 570/466, loss: 0.2522532343864441 2023-01-24 01:56:58.513605: step: 572/466, loss: 0.08765573054552078 2023-01-24 01:56:59.137937: step: 574/466, loss: 0.1364997774362564 2023-01-24 01:56:59.740801: step: 576/466, loss: 0.1621016561985016 2023-01-24 01:57:00.341079: step: 578/466, loss: 0.33065852522850037 2023-01-24 01:57:00.911390: step: 580/466, loss: 0.29974791407585144 2023-01-24 01:57:01.533129: step: 582/466, loss: 0.24464833736419678 2023-01-24 01:57:02.143612: step: 584/466, loss: 0.41071781516075134 2023-01-24 01:57:02.768226: step: 586/466, loss: 0.05391302332282066 2023-01-24 01:57:03.383464: step: 588/466, loss: 0.09712765365839005 2023-01-24 01:57:04.008101: step: 590/466, loss: 0.09114282578229904 2023-01-24 01:57:04.607625: step: 592/466, loss: 0.12766587734222412 2023-01-24 01:57:05.216320: step: 594/466, loss: 0.41201990842819214 2023-01-24 01:57:05.797153: step: 596/466, loss: 1.335031509399414 2023-01-24 01:57:06.447634: step: 598/466, loss: 0.4613915681838989 2023-01-24 01:57:07.071709: step: 600/466, loss: 0.20050835609436035 2023-01-24 01:57:07.606371: step: 602/466, loss: 0.22310605645179749 2023-01-24 01:57:08.224676: step: 604/466, loss: 0.4295780658721924 2023-01-24 01:57:08.797233: step: 606/466, loss: 0.08041802048683167 2023-01-24 01:57:09.439058: step: 608/466, loss: 0.09882882237434387 2023-01-24 01:57:10.117642: step: 610/466, loss: 0.07247479259967804 2023-01-24 01:57:10.788701: step: 612/466, loss: 0.13833218812942505 2023-01-24 01:57:11.425196: step: 614/466, loss: 0.27434587478637695 2023-01-24 01:57:12.065132: step: 616/466, loss: 
0.5944318771362305 2023-01-24 01:57:12.697699: step: 618/466, loss: 0.17174427211284637 2023-01-24 01:57:13.315322: step: 620/466, loss: 0.14457230269908905 2023-01-24 01:57:13.847550: step: 622/466, loss: 0.07766750454902649 2023-01-24 01:57:14.470878: step: 624/466, loss: 0.10745465010404587 2023-01-24 01:57:15.117945: step: 626/466, loss: 0.4946940541267395 2023-01-24 01:57:15.711602: step: 628/466, loss: 0.15756690502166748 2023-01-24 01:57:16.339543: step: 630/466, loss: 0.09950481355190277 2023-01-24 01:57:16.975222: step: 632/466, loss: 0.18794941902160645 2023-01-24 01:57:17.562094: step: 634/466, loss: 0.10358545929193497 2023-01-24 01:57:18.184039: step: 636/466, loss: 0.16333678364753723 2023-01-24 01:57:18.781139: step: 638/466, loss: 0.5597032308578491 2023-01-24 01:57:19.395124: step: 640/466, loss: 0.18828891217708588 2023-01-24 01:57:20.035329: step: 642/466, loss: 0.12340757995843887 2023-01-24 01:57:20.646050: step: 644/466, loss: 0.17228133976459503 2023-01-24 01:57:21.261650: step: 646/466, loss: 0.12330249696969986 2023-01-24 01:57:21.916688: step: 648/466, loss: 0.36004215478897095 2023-01-24 01:57:22.502004: step: 650/466, loss: 0.32791563868522644 2023-01-24 01:57:23.251758: step: 652/466, loss: 0.36322280764579773 2023-01-24 01:57:23.909657: step: 654/466, loss: 0.09934289753437042 2023-01-24 01:57:24.541962: step: 656/466, loss: 0.12922798097133636 2023-01-24 01:57:25.118634: step: 658/466, loss: 0.14867113530635834 2023-01-24 01:57:25.834802: step: 660/466, loss: 0.1516675502061844 2023-01-24 01:57:26.469511: step: 662/466, loss: 0.2672687768936157 2023-01-24 01:57:27.256298: step: 664/466, loss: 0.10394330322742462 2023-01-24 01:57:27.891646: step: 666/466, loss: 0.36707255244255066 2023-01-24 01:57:28.465971: step: 668/466, loss: 0.15134350955486298 2023-01-24 01:57:29.122030: step: 670/466, loss: 0.16136594116687775 2023-01-24 01:57:29.775913: step: 672/466, loss: 0.052661944180727005 2023-01-24 01:57:30.430979: step: 674/466, loss: 
0.24297872185707092 2023-01-24 01:57:31.049773: step: 676/466, loss: 0.17728376388549805 2023-01-24 01:57:31.629280: step: 678/466, loss: 0.22896040976047516 2023-01-24 01:57:32.281661: step: 680/466, loss: 0.8838499188423157 2023-01-24 01:57:32.879909: step: 682/466, loss: 0.19494293630123138 2023-01-24 01:57:33.474135: step: 684/466, loss: 0.10196879506111145 2023-01-24 01:57:34.122764: step: 686/466, loss: 0.21828074753284454 2023-01-24 01:57:34.709165: step: 688/466, loss: 0.3562825918197632 2023-01-24 01:57:35.391214: step: 690/466, loss: 0.1883367896080017 2023-01-24 01:57:35.980188: step: 692/466, loss: 0.12125629931688309 2023-01-24 01:57:36.574501: step: 694/466, loss: 0.18721602857112885 2023-01-24 01:57:37.230456: step: 696/466, loss: 0.16550004482269287 2023-01-24 01:57:37.868434: step: 698/466, loss: 0.25382041931152344 2023-01-24 01:57:38.501316: step: 700/466, loss: 0.6556934714317322 2023-01-24 01:57:39.079508: step: 702/466, loss: 0.07627542316913605 2023-01-24 01:57:39.728941: step: 704/466, loss: 4.631105422973633 2023-01-24 01:57:40.358833: step: 706/466, loss: 0.10441194474697113 2023-01-24 01:57:41.046412: step: 708/466, loss: 0.26737096905708313 2023-01-24 01:57:41.684302: step: 710/466, loss: 0.14205802977085114 2023-01-24 01:57:42.280055: step: 712/466, loss: 0.07517994195222855 2023-01-24 01:57:42.904204: step: 714/466, loss: 0.14049071073532104 2023-01-24 01:57:43.424533: step: 716/466, loss: 2.2365288734436035 2023-01-24 01:57:44.047729: step: 718/466, loss: 0.07635090500116348 2023-01-24 01:57:44.660603: step: 720/466, loss: 0.17170868813991547 2023-01-24 01:57:45.263220: step: 722/466, loss: 0.8543318510055542 2023-01-24 01:57:45.880243: step: 724/466, loss: 0.4963378310203552 2023-01-24 01:57:46.452642: step: 726/466, loss: 0.05161907896399498 2023-01-24 01:57:47.123202: step: 728/466, loss: 0.2685733735561371 2023-01-24 01:57:47.756710: step: 730/466, loss: 0.1145210936665535 2023-01-24 01:57:48.425741: step: 732/466, loss: 
0.1569463610649109 2023-01-24 01:57:49.015125: step: 734/466, loss: 0.5626563429832458 2023-01-24 01:57:49.623881: step: 736/466, loss: 0.23395347595214844 2023-01-24 01:57:50.248559: step: 738/466, loss: 0.1768762171268463 2023-01-24 01:57:50.841224: step: 740/466, loss: 0.21685035526752472 2023-01-24 01:57:51.454882: step: 742/466, loss: 0.30971992015838623 2023-01-24 01:57:52.051875: step: 744/466, loss: 0.7027494311332703 2023-01-24 01:57:52.707463: step: 746/466, loss: 0.4980536997318268 2023-01-24 01:57:53.331464: step: 748/466, loss: 0.8457667827606201 2023-01-24 01:57:53.926702: step: 750/466, loss: 0.17517438530921936 2023-01-24 01:57:54.470452: step: 752/466, loss: 0.3755433261394501 2023-01-24 01:57:55.080346: step: 754/466, loss: 0.1993408501148224 2023-01-24 01:57:55.694642: step: 756/466, loss: 0.373582661151886 2023-01-24 01:57:56.330342: step: 758/466, loss: 0.16973261535167694 2023-01-24 01:57:56.952233: step: 760/466, loss: 0.18853074312210083 2023-01-24 01:57:57.632641: step: 762/466, loss: 0.40229472517967224 2023-01-24 01:57:58.283181: step: 764/466, loss: 0.16277316212654114 2023-01-24 01:57:58.909598: step: 766/466, loss: 0.18618665635585785 2023-01-24 01:57:59.560628: step: 768/466, loss: 0.5071804523468018 2023-01-24 01:58:00.232302: step: 770/466, loss: 0.25484156608581543 2023-01-24 01:58:00.857453: step: 772/466, loss: 0.4185568392276764 2023-01-24 01:58:01.454539: step: 774/466, loss: 0.1737195998430252 2023-01-24 01:58:02.124631: step: 776/466, loss: 0.3032733201980591 2023-01-24 01:58:02.775567: step: 778/466, loss: 0.6785239577293396 2023-01-24 01:58:03.435386: step: 780/466, loss: 0.09707774966955185 2023-01-24 01:58:04.028794: step: 782/466, loss: 0.1329166740179062 2023-01-24 01:58:04.721360: step: 784/466, loss: 0.09093427658081055 2023-01-24 01:58:05.362419: step: 786/466, loss: 0.14941225945949554 2023-01-24 01:58:06.015579: step: 788/466, loss: 0.05523379519581795 2023-01-24 01:58:06.618589: step: 790/466, loss: 
0.1023048460483551 2023-01-24 01:58:07.274348: step: 792/466, loss: 0.157408207654953 2023-01-24 01:58:07.859398: step: 794/466, loss: 0.1586690992116928 2023-01-24 01:58:08.460845: step: 796/466, loss: 0.12543946504592896 2023-01-24 01:58:09.132273: step: 798/466, loss: 0.4799301028251648 2023-01-24 01:58:09.735217: step: 800/466, loss: 0.08886481821537018 2023-01-24 01:58:10.333806: step: 802/466, loss: 0.16303367912769318 2023-01-24 01:58:11.043847: step: 804/466, loss: 0.8298637270927429 2023-01-24 01:58:11.660331: step: 806/466, loss: 0.23885048925876617 2023-01-24 01:58:12.235605: step: 808/466, loss: 0.07132650911808014 2023-01-24 01:58:12.885442: step: 810/466, loss: 0.32651305198669434 2023-01-24 01:58:13.554283: step: 812/466, loss: 0.39218130707740784 2023-01-24 01:58:14.147113: step: 814/466, loss: 0.1230127364397049 2023-01-24 01:58:14.834502: step: 816/466, loss: 0.23656906187534332 2023-01-24 01:58:15.515456: step: 818/466, loss: 0.27691248059272766 2023-01-24 01:58:16.127361: step: 820/466, loss: 0.07484827935695648 2023-01-24 01:58:16.780024: step: 822/466, loss: 0.18714210391044617 2023-01-24 01:58:17.344278: step: 824/466, loss: 0.05669151246547699 2023-01-24 01:58:18.023899: step: 826/466, loss: 0.615753710269928 2023-01-24 01:58:18.657592: step: 828/466, loss: 0.18706496059894562 2023-01-24 01:58:19.277494: step: 830/466, loss: 0.13120721280574799 2023-01-24 01:58:19.981964: step: 832/466, loss: 0.17712688446044922 2023-01-24 01:58:20.586957: step: 834/466, loss: 0.1563715785741806 2023-01-24 01:58:21.251392: step: 836/466, loss: 0.0996440052986145 2023-01-24 01:58:21.899883: step: 838/466, loss: 0.1388978511095047 2023-01-24 01:58:22.561340: step: 840/466, loss: 0.48037368059158325 2023-01-24 01:58:23.176555: step: 842/466, loss: 0.26534557342529297 2023-01-24 01:58:23.825241: step: 844/466, loss: 0.1332325041294098 2023-01-24 01:58:24.432220: step: 846/466, loss: 0.11895450204610825 2023-01-24 01:58:25.053076: step: 848/466, loss: 
0.16911429166793823 2023-01-24 01:58:25.659544: step: 850/466, loss: 0.16510894894599915 2023-01-24 01:58:26.277209: step: 852/466, loss: 0.21481208503246307 2023-01-24 01:58:26.871683: step: 854/466, loss: 0.17886097729206085 2023-01-24 01:58:27.461930: step: 856/466, loss: 0.25372764468193054 2023-01-24 01:58:28.067350: step: 858/466, loss: 0.4483509659767151 2023-01-24 01:58:28.668557: step: 860/466, loss: 0.14233942329883575 2023-01-24 01:58:29.189730: step: 862/466, loss: 0.1988704800605774 2023-01-24 01:58:29.800444: step: 864/466, loss: 0.19561944901943207 2023-01-24 01:58:30.459423: step: 866/466, loss: 0.4714718163013458 2023-01-24 01:58:31.083984: step: 868/466, loss: 0.22195981442928314 2023-01-24 01:58:31.702596: step: 870/466, loss: 0.13857711851596832 2023-01-24 01:58:32.312257: step: 872/466, loss: 1.053472876548767 2023-01-24 01:58:32.889374: step: 874/466, loss: 0.19076867401599884 2023-01-24 01:58:33.535146: step: 876/466, loss: 0.1434697061777115 2023-01-24 01:58:34.153792: step: 878/466, loss: 0.09571454674005508 2023-01-24 01:58:34.765335: step: 880/466, loss: 0.2282378375530243 2023-01-24 01:58:35.384895: step: 882/466, loss: 0.059323765337467194 2023-01-24 01:58:35.982288: step: 884/466, loss: 0.05962638556957245 2023-01-24 01:58:36.541409: step: 886/466, loss: 0.04837535694241524 2023-01-24 01:58:37.194232: step: 888/466, loss: 0.09476480633020401 2023-01-24 01:58:37.848490: step: 890/466, loss: 0.14488112926483154 2023-01-24 01:58:38.453299: step: 892/466, loss: 0.449963241815567 2023-01-24 01:58:39.122210: step: 894/466, loss: 0.2974949777126312 2023-01-24 01:58:39.747895: step: 896/466, loss: 0.8873539566993713 2023-01-24 01:58:40.405853: step: 898/466, loss: 1.1094868183135986 2023-01-24 01:58:41.010475: step: 900/466, loss: 1.1049017906188965 2023-01-24 01:58:41.601880: step: 902/466, loss: 0.2315780371427536 2023-01-24 01:58:42.217128: step: 904/466, loss: 1.0686709880828857 2023-01-24 01:58:42.865838: step: 906/466, loss: 
0.27643969655036926 2023-01-24 01:58:43.573714: step: 908/466, loss: 0.40389660000801086 2023-01-24 01:58:44.109742: step: 910/466, loss: 0.2511165142059326 2023-01-24 01:58:44.703016: step: 912/466, loss: 0.08071059733629227 2023-01-24 01:58:45.320327: step: 914/466, loss: 0.34828317165374756 2023-01-24 01:58:45.950496: step: 916/466, loss: 0.15177913010120392 2023-01-24 01:58:46.581603: step: 918/466, loss: 0.2506985664367676 2023-01-24 01:58:47.157615: step: 920/466, loss: 0.40900301933288574 2023-01-24 01:58:47.789874: step: 922/466, loss: 0.2533995509147644 2023-01-24 01:58:48.536921: step: 924/466, loss: 0.17880086600780487 2023-01-24 01:58:49.113814: step: 926/466, loss: 0.23364755511283875 2023-01-24 01:58:49.688320: step: 928/466, loss: 0.056145116686820984 2023-01-24 01:58:50.304175: step: 930/466, loss: 0.06831599771976471 2023-01-24 01:58:51.079738: step: 932/466, loss: 0.7609554529190063
==================================================
Loss: 0.293
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33705740762440434, 'r': 0.32106796703501134, 'f1': 0.32886845214276184}, 'combined': 0.2423241226315087, 'epoch': 12}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3600072626306723, 'r': 0.28589724043538317, 'f1': 0.31870062039892444}, 'combined': 0.21136621456508975, 'epoch': 12}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3298823989689773, 'r': 0.2661551173499703, 'f1': 0.2946119537961936}, 'combined': 0.1964079691974624, 'epoch': 12}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3669169580784741, 'r': 0.2727299774827033, 'f1': 0.3128890272161504}, 'combined': 0.20420125986738236, 'epoch': 12}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3237248942917547, 'r': 0.31696782818699326, 'f1': 0.3203107295389174}, 'combined': 0.2360184322918339, 'epoch': 12}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35686061950793674, 'r': 0.28278360891240467, 'f1': 0.31553269576867066}, 'combined': 0.20926520755642403, 'epoch': 12}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36979166666666663, 'r': 0.33809523809523806, 'f1': 0.3532338308457711}, 'combined': 0.23548922056384738, 'epoch': 12}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4411764705882353, 'r': 0.32608695652173914, 'f1': 0.375}, 'combined': 0.25, 'epoch': 12}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.13793103448275862, 'f1': 0.20512820512820515}, 'combined': 0.13675213675213677, 'epoch': 12}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3249718423169751, 'r': 0.34840434707607393, 'f1': 0.33628038628038637}, 'combined': 0.24778554778554784, 'epoch': 10}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35564524631816125, 'r': 0.2998937951296123, 'f1': 0.3253987814443737}, 'combined': 0.21580851826362604, 'epoch': 10}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33226495726495725, 'r': 0.3702380952380952, 'f1': 0.3502252252252252}, 'combined': 0.23348348348348347, 'epoch': 10}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 13
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 02:01:20.033696: step: 2/466, loss: 0.525618314743042 2023-01-24 02:01:20.640974: step: 4/466, loss: 0.12188282608985901 2023-01-24 02:01:21.209497: step: 6/466, loss: 0.18711575865745544 2023-01-24 02:01:21.850599: step: 8/466, loss: 0.07466843724250793 2023-01-24 02:01:22.461784: step: 10/466, loss: 0.2744048237800598 2023-01-24 02:01:23.065101: step: 12/466, loss: 1.0381118059158325 2023-01-24 02:01:23.706790: step: 14/466, loss: 0.10302115976810455 2023-01-24 02:01:24.287284: step: 16/466, loss: 0.1746019423007965 2023-01-24 02:01:24.965877: step: 18/466, loss: 0.2092004269361496 2023-01-24 02:01:25.572638: step: 20/466, loss: 0.7763059139251709
2023-01-24 02:01:26.190161: step: 22/466, loss: 0.1145629957318306 2023-01-24 02:01:26.833386: step: 24/466, loss: 0.32337671518325806 2023-01-24 02:01:27.466225: step: 26/466, loss: 1.5217716693878174 2023-01-24 02:01:28.039679: step: 28/466, loss: 0.07874564826488495 2023-01-24 02:01:28.695695: step: 30/466, loss: 0.1454867422580719 2023-01-24 02:01:29.251689: step: 32/466, loss: 0.6862086653709412 2023-01-24 02:01:29.919317: step: 34/466, loss: 0.3887907862663269 2023-01-24 02:01:30.503832: step: 36/466, loss: 0.04609563574194908 2023-01-24 02:01:31.062773: step: 38/466, loss: 0.06378234922885895 2023-01-24 02:01:31.658005: step: 40/466, loss: 0.07670899480581284 2023-01-24 02:01:32.296307: step: 42/466, loss: 0.030169617384672165 2023-01-24 02:01:32.886100: step: 44/466, loss: 0.19305665791034698 2023-01-24 02:01:33.563575: step: 46/466, loss: 0.04287484288215637 2023-01-24 02:01:34.241277: step: 48/466, loss: 0.40193885564804077 2023-01-24 02:01:34.865897: step: 50/466, loss: 0.06653300672769547 2023-01-24 02:01:35.516114: step: 52/466, loss: 0.18866319954395294 2023-01-24 02:01:36.129465: step: 54/466, loss: 0.16975773870944977 2023-01-24 02:01:36.705588: step: 56/466, loss: 0.06851798295974731 2023-01-24 02:01:37.338094: step: 58/466, loss: 0.1637471318244934 2023-01-24 02:01:37.962492: step: 60/466, loss: 0.0365801677107811 2023-01-24 02:01:38.561174: step: 62/466, loss: 0.04194226488471031 2023-01-24 02:01:39.309954: step: 64/466, loss: 0.029038073495030403 2023-01-24 02:01:39.910673: step: 66/466, loss: 0.11975918710231781 2023-01-24 02:01:40.522754: step: 68/466, loss: 0.1207805871963501 2023-01-24 02:01:41.139818: step: 70/466, loss: 0.2956823408603668 2023-01-24 02:01:41.730056: step: 72/466, loss: 0.10991843044757843 2023-01-24 02:01:42.397425: step: 74/466, loss: 0.8458861708641052 2023-01-24 02:01:43.021795: step: 76/466, loss: 0.12933480739593506 2023-01-24 02:01:43.646967: step: 78/466, loss: 0.33300691843032837 2023-01-24 02:01:44.247477: step: 
80/466, loss: 0.2356657087802887 2023-01-24 02:01:44.877345: step: 82/466, loss: 0.21511326730251312 2023-01-24 02:01:45.520710: step: 84/466, loss: 0.20533886551856995 2023-01-24 02:01:46.175133: step: 86/466, loss: 0.09611569344997406 2023-01-24 02:01:46.801850: step: 88/466, loss: 0.37041494250297546 2023-01-24 02:01:47.420454: step: 90/466, loss: 0.17537567019462585 2023-01-24 02:01:48.080355: step: 92/466, loss: 0.13145779073238373 2023-01-24 02:01:48.675446: step: 94/466, loss: 0.13477493822574615 2023-01-24 02:01:49.322967: step: 96/466, loss: 0.515109121799469 2023-01-24 02:01:49.875070: step: 98/466, loss: 0.6469271779060364 2023-01-24 02:01:50.572995: step: 100/466, loss: 0.39150166511535645 2023-01-24 02:01:51.255763: step: 102/466, loss: 0.3539683520793915 2023-01-24 02:01:51.848572: step: 104/466, loss: 0.07924119383096695 2023-01-24 02:01:52.486911: step: 106/466, loss: 0.6557603478431702 2023-01-24 02:01:53.116328: step: 108/466, loss: 0.16338536143302917 2023-01-24 02:01:53.684316: step: 110/466, loss: 0.05812246352434158 2023-01-24 02:01:54.344159: step: 112/466, loss: 0.08773166686296463 2023-01-24 02:01:55.013510: step: 114/466, loss: 0.17465516924858093 2023-01-24 02:01:55.656762: step: 116/466, loss: 0.20950469374656677 2023-01-24 02:01:56.251277: step: 118/466, loss: 0.17099221050739288 2023-01-24 02:01:56.907662: step: 120/466, loss: 0.20525000989437103 2023-01-24 02:01:57.560539: step: 122/466, loss: 0.1835314929485321 2023-01-24 02:01:58.238013: step: 124/466, loss: 0.22093580663204193 2023-01-24 02:01:58.837209: step: 126/466, loss: 0.08336763083934784 2023-01-24 02:01:59.462593: step: 128/466, loss: 0.21546803414821625 2023-01-24 02:02:00.147217: step: 130/466, loss: 0.09574634581804276 2023-01-24 02:02:00.727756: step: 132/466, loss: 0.07790052890777588 2023-01-24 02:02:01.433684: step: 134/466, loss: 0.21874640882015228 2023-01-24 02:02:02.034837: step: 136/466, loss: 0.12523621320724487 2023-01-24 02:02:02.657056: step: 138/466, loss: 
0.054812103509902954 2023-01-24 02:02:03.263807: step: 140/466, loss: 0.09920776635408401 2023-01-24 02:02:03.894867: step: 142/466, loss: 0.18406184017658234 2023-01-24 02:02:04.464149: step: 144/466, loss: 0.17157801985740662 2023-01-24 02:02:05.110375: step: 146/466, loss: 0.12519454956054688 2023-01-24 02:02:05.770746: step: 148/466, loss: 0.09784115105867386 2023-01-24 02:02:06.456172: step: 150/466, loss: 0.25946009159088135 2023-01-24 02:02:07.104644: step: 152/466, loss: 1.248733639717102 2023-01-24 02:02:07.780995: step: 154/466, loss: 0.13074356317520142 2023-01-24 02:02:08.425933: step: 156/466, loss: 0.3054160177707672 2023-01-24 02:02:09.013534: step: 158/466, loss: 0.18431904911994934 2023-01-24 02:02:09.619909: step: 160/466, loss: 0.17071475088596344 2023-01-24 02:02:10.256788: step: 162/466, loss: 0.38476401567459106 2023-01-24 02:02:10.899942: step: 164/466, loss: 0.1319192796945572 2023-01-24 02:02:11.479925: step: 166/466, loss: 0.5609253644943237 2023-01-24 02:02:12.168449: step: 168/466, loss: 0.11155123263597488 2023-01-24 02:02:12.869010: step: 170/466, loss: 0.07531193643808365 2023-01-24 02:02:13.562451: step: 172/466, loss: 0.12356498837471008 2023-01-24 02:02:14.196944: step: 174/466, loss: 0.09486281126737595 2023-01-24 02:02:14.793304: step: 176/466, loss: 0.21441110968589783 2023-01-24 02:02:15.442726: step: 178/466, loss: 0.1300574094057083 2023-01-24 02:02:16.075638: step: 180/466, loss: 0.16298283636569977 2023-01-24 02:02:16.720352: step: 182/466, loss: 0.3968439996242523 2023-01-24 02:02:17.301739: step: 184/466, loss: 0.17258816957473755 2023-01-24 02:02:17.966642: step: 186/466, loss: 0.10586294531822205 2023-01-24 02:02:18.556512: step: 188/466, loss: 0.07512211799621582 2023-01-24 02:02:19.175248: step: 190/466, loss: 0.109731025993824 2023-01-24 02:02:19.782063: step: 192/466, loss: 0.08937923610210419 2023-01-24 02:02:20.446034: step: 194/466, loss: 0.10278814285993576 2023-01-24 02:02:21.170309: step: 196/466, loss: 
0.19093210995197296 2023-01-24 02:02:21.828778: step: 198/466, loss: 0.09084939956665039 2023-01-24 02:02:22.431188: step: 200/466, loss: 0.174184188246727 2023-01-24 02:02:23.095045: step: 202/466, loss: 0.21613729000091553 2023-01-24 02:02:23.662159: step: 204/466, loss: 0.17608675360679626 2023-01-24 02:02:24.275220: step: 206/466, loss: 0.1232207715511322 2023-01-24 02:02:24.863561: step: 208/466, loss: 0.126413956284523 2023-01-24 02:02:25.492888: step: 210/466, loss: 0.0623779371380806 2023-01-24 02:02:26.121435: step: 212/466, loss: 0.20897889137268066 2023-01-24 02:02:26.728895: step: 214/466, loss: 0.04284898564219475 2023-01-24 02:02:27.324115: step: 216/466, loss: 0.07657509297132492 2023-01-24 02:02:27.953061: step: 218/466, loss: 0.1104479730129242 2023-01-24 02:02:28.550457: step: 220/466, loss: 0.134963259100914 2023-01-24 02:02:29.189607: step: 222/466, loss: 0.15580850839614868 2023-01-24 02:02:29.816546: step: 224/466, loss: 0.12098768353462219 2023-01-24 02:02:30.420988: step: 226/466, loss: 0.15912289917469025 2023-01-24 02:02:30.992647: step: 228/466, loss: 0.13218237459659576 2023-01-24 02:02:31.587426: step: 230/466, loss: 0.5889711380004883 2023-01-24 02:02:32.215294: step: 232/466, loss: 0.08734430372714996 2023-01-24 02:02:32.849369: step: 234/466, loss: 0.07764595746994019 2023-01-24 02:02:33.544210: step: 236/466, loss: 0.03637455403804779 2023-01-24 02:02:34.161847: step: 238/466, loss: 0.10069497674703598 2023-01-24 02:02:34.764276: step: 240/466, loss: 0.17100438475608826 2023-01-24 02:02:35.344744: step: 242/466, loss: 0.20142343640327454 2023-01-24 02:02:36.034190: step: 244/466, loss: 0.0414368100464344 2023-01-24 02:02:36.726299: step: 246/466, loss: 0.45548686385154724 2023-01-24 02:02:37.428407: step: 248/466, loss: 0.2700558006763458 2023-01-24 02:02:38.007949: step: 250/466, loss: 0.045008108019828796 2023-01-24 02:02:38.632199: step: 252/466, loss: 0.1975070983171463 2023-01-24 02:02:39.251729: step: 254/466, loss: 
0.2863794267177582 2023-01-24 02:02:39.906166: step: 256/466, loss: 0.2864835560321808 2023-01-24 02:02:40.609192: step: 258/466, loss: 0.0764208734035492 2023-01-24 02:02:41.242086: step: 260/466, loss: 0.13654173910617828 2023-01-24 02:02:41.842640: step: 262/466, loss: 0.5410063862800598 2023-01-24 02:02:42.498043: step: 264/466, loss: 0.40156546235084534 2023-01-24 02:02:43.074058: step: 266/466, loss: 0.08285211026668549 2023-01-24 02:02:43.694283: step: 268/466, loss: 0.11998789012432098 2023-01-24 02:02:44.328644: step: 270/466, loss: 0.37964218854904175 2023-01-24 02:02:44.915633: step: 272/466, loss: 0.0892152190208435 2023-01-24 02:02:45.482378: step: 274/466, loss: 0.15910491347312927 2023-01-24 02:02:46.104724: step: 276/466, loss: 0.24213582277297974 2023-01-24 02:02:46.652332: step: 278/466, loss: 0.0799097791314125 2023-01-24 02:02:47.250322: step: 280/466, loss: 0.14344455301761627 2023-01-24 02:02:47.803349: step: 282/466, loss: 0.1222216784954071 2023-01-24 02:02:48.427997: step: 284/466, loss: 0.20590557157993317 2023-01-24 02:02:49.057752: step: 286/466, loss: 0.0666474997997284 2023-01-24 02:02:49.730563: step: 288/466, loss: 0.09086759388446808 2023-01-24 02:02:50.330786: step: 290/466, loss: 0.14365486800670624 2023-01-24 02:02:50.939956: step: 292/466, loss: 0.2211560308933258 2023-01-24 02:02:51.525477: step: 294/466, loss: 0.14865663647651672 2023-01-24 02:02:52.142875: step: 296/466, loss: 0.16953952610492706 2023-01-24 02:02:52.739051: step: 298/466, loss: 0.15864787995815277 2023-01-24 02:02:53.418696: step: 300/466, loss: 0.8891596794128418 2023-01-24 02:02:54.073863: step: 302/466, loss: 0.06364311277866364 2023-01-24 02:02:54.766610: step: 304/466, loss: 0.08998765796422958 2023-01-24 02:02:55.376754: step: 306/466, loss: 0.03975175321102142 2023-01-24 02:02:56.028109: step: 308/466, loss: 0.38749098777770996 2023-01-24 02:02:56.645343: step: 310/466, loss: 0.09961052238941193 2023-01-24 02:02:57.317322: step: 312/466, loss: 
0.22312025725841522 2023-01-24 02:02:57.929515: step: 314/466, loss: 0.18734197318553925 2023-01-24 02:02:58.514550: step: 316/466, loss: 0.22206509113311768 2023-01-24 02:02:59.167822: step: 318/466, loss: 0.16230708360671997 2023-01-24 02:02:59.802710: step: 320/466, loss: 0.6054675579071045 2023-01-24 02:03:00.492938: step: 322/466, loss: 0.10808802396059036 2023-01-24 02:03:01.077457: step: 324/466, loss: 0.23263190686702728 2023-01-24 02:03:01.692547: step: 326/466, loss: 0.1171720027923584 2023-01-24 02:03:02.358481: step: 328/466, loss: 0.22656656801700592 2023-01-24 02:03:02.959506: step: 330/466, loss: 0.09640899300575256 2023-01-24 02:03:03.634687: step: 332/466, loss: 0.10093360394239426 2023-01-24 02:03:04.225030: step: 334/466, loss: 0.05876747891306877 2023-01-24 02:03:04.937228: step: 336/466, loss: 0.35687941312789917 2023-01-24 02:03:05.550241: step: 338/466, loss: 0.23797641694545746 2023-01-24 02:03:06.153273: step: 340/466, loss: 0.36994874477386475 2023-01-24 02:03:06.762282: step: 342/466, loss: 0.05669309198856354 2023-01-24 02:03:07.377161: step: 344/466, loss: 0.160336434841156 2023-01-24 02:03:07.991219: step: 346/466, loss: 0.07873602211475372 2023-01-24 02:03:08.562599: step: 348/466, loss: 0.3927478492259979 2023-01-24 02:03:09.194083: step: 350/466, loss: 0.07420491427183151 2023-01-24 02:03:09.882602: step: 352/466, loss: 0.26476380228996277 2023-01-24 02:03:10.474000: step: 354/466, loss: 0.11637300252914429 2023-01-24 02:03:11.084247: step: 356/466, loss: 0.44961050152778625 2023-01-24 02:03:11.666402: step: 358/466, loss: 0.18461167812347412 2023-01-24 02:03:12.240882: step: 360/466, loss: 0.08700277656316757 2023-01-24 02:03:12.896421: step: 362/466, loss: 0.20201227068901062 2023-01-24 02:03:13.548410: step: 364/466, loss: 0.40742918848991394 2023-01-24 02:03:14.172561: step: 366/466, loss: 0.24917148053646088 2023-01-24 02:03:14.781517: step: 368/466, loss: 0.21557331085205078 2023-01-24 02:03:15.521814: step: 370/466, loss: 
2.6784920692443848 2023-01-24 02:03:16.149129: step: 372/466, loss: 0.12315894663333893 2023-01-24 02:03:16.724549: step: 374/466, loss: 0.22482457756996155 2023-01-24 02:03:17.357359: step: 376/466, loss: 0.13915163278579712 2023-01-24 02:03:17.893306: step: 378/466, loss: 0.052151940762996674 2023-01-24 02:03:18.535476: step: 380/466, loss: 0.16188733279705048 2023-01-24 02:03:19.103942: step: 382/466, loss: 0.28139740228652954 2023-01-24 02:03:19.673635: step: 384/466, loss: 0.0629613995552063 2023-01-24 02:03:20.385523: step: 386/466, loss: 0.038815055042505264 2023-01-24 02:03:21.053692: step: 388/466, loss: 0.1372925341129303 2023-01-24 02:03:21.701939: step: 390/466, loss: 0.23523753881454468 2023-01-24 02:03:22.321994: step: 392/466, loss: 0.5332252383232117 2023-01-24 02:03:22.923824: step: 394/466, loss: 0.14214055240154266 2023-01-24 02:03:23.556012: step: 396/466, loss: 0.13956071436405182 2023-01-24 02:03:24.198373: step: 398/466, loss: 0.17934565246105194 2023-01-24 02:03:24.781082: step: 400/466, loss: 0.09901084750890732 2023-01-24 02:03:25.349393: step: 402/466, loss: 0.18380653858184814 2023-01-24 02:03:25.992310: step: 404/466, loss: 0.1347212940454483 2023-01-24 02:03:26.653721: step: 406/466, loss: 0.2966226041316986 2023-01-24 02:03:27.238274: step: 408/466, loss: 0.1731652170419693 2023-01-24 02:03:27.801591: step: 410/466, loss: 0.07733267545700073 2023-01-24 02:03:28.454057: step: 412/466, loss: 0.09514249116182327 2023-01-24 02:03:29.083853: step: 414/466, loss: 0.04069305956363678 2023-01-24 02:03:29.782756: step: 416/466, loss: 0.18317970633506775 2023-01-24 02:03:30.327885: step: 418/466, loss: 0.12108591943979263 2023-01-24 02:03:30.961445: step: 420/466, loss: 0.2907571494579315 2023-01-24 02:03:31.586537: step: 422/466, loss: 0.0754459872841835 2023-01-24 02:03:32.259166: step: 424/466, loss: 0.05278822034597397 2023-01-24 02:03:32.953567: step: 426/466, loss: 0.023496627807617188 2023-01-24 02:03:33.618779: step: 428/466, loss: 
0.3592928349971771 2023-01-24 02:03:34.220122: step: 430/466, loss: 0.29169031977653503 2023-01-24 02:03:34.892958: step: 432/466, loss: 0.1340051293373108 2023-01-24 02:03:35.548560: step: 434/466, loss: 0.2072197049856186 2023-01-24 02:03:36.126405: step: 436/466, loss: 0.10318450629711151 2023-01-24 02:03:36.742422: step: 438/466, loss: 0.2741301357746124 2023-01-24 02:03:37.382881: step: 440/466, loss: 0.12780462205410004 2023-01-24 02:03:37.980616: step: 442/466, loss: 0.28163811564445496 2023-01-24 02:03:38.655439: step: 444/466, loss: 0.16987621784210205 2023-01-24 02:03:39.265363: step: 446/466, loss: 0.18827319145202637 2023-01-24 02:03:39.930442: step: 448/466, loss: 0.15258075296878815 2023-01-24 02:03:40.595232: step: 450/466, loss: 0.6299476027488708 2023-01-24 02:03:41.183024: step: 452/466, loss: 0.2035852074623108 2023-01-24 02:03:41.861764: step: 454/466, loss: 0.46440279483795166 2023-01-24 02:03:42.468340: step: 456/466, loss: 0.20179222524166107 2023-01-24 02:03:43.073009: step: 458/466, loss: 0.08790719509124756 2023-01-24 02:03:43.649956: step: 460/466, loss: 0.1740276962518692 2023-01-24 02:03:44.253494: step: 462/466, loss: 0.7034498453140259 2023-01-24 02:03:44.922703: step: 464/466, loss: 2.465857744216919 2023-01-24 02:03:45.602476: step: 466/466, loss: 0.46868717670440674 2023-01-24 02:03:46.217278: step: 468/466, loss: 0.13198892772197723 2023-01-24 02:03:46.839604: step: 470/466, loss: 0.33430054783821106 2023-01-24 02:03:47.458671: step: 472/466, loss: 0.1354030817747116 2023-01-24 02:03:48.118377: step: 474/466, loss: 0.10925836861133575 2023-01-24 02:03:48.747935: step: 476/466, loss: 0.11237549036741257 2023-01-24 02:03:49.320366: step: 478/466, loss: 0.12133002281188965 2023-01-24 02:03:49.946041: step: 480/466, loss: 0.05736667290329933 2023-01-24 02:03:50.587236: step: 482/466, loss: 0.12249401956796646 2023-01-24 02:03:51.292232: step: 484/466, loss: 0.11551576852798462 2023-01-24 02:03:51.945421: step: 486/466, loss: 
0.13422928750514984 2023-01-24 02:03:52.508966: step: 488/466, loss: 0.22650648653507233 2023-01-24 02:03:53.088829: step: 490/466, loss: 0.1355985850095749 2023-01-24 02:03:53.740635: step: 492/466, loss: 0.038527145981788635 2023-01-24 02:03:54.320755: step: 494/466, loss: 0.09556832164525986 2023-01-24 02:03:54.940912: step: 496/466, loss: 0.13957932591438293 2023-01-24 02:03:55.552720: step: 498/466, loss: 0.13868115842342377 2023-01-24 02:03:56.193056: step: 500/466, loss: 0.15585461258888245 2023-01-24 02:03:56.774054: step: 502/466, loss: 0.07137840986251831 2023-01-24 02:03:57.424685: step: 504/466, loss: 0.3687639534473419 2023-01-24 02:03:57.982821: step: 506/466, loss: 0.04002955183386803 2023-01-24 02:03:58.600015: step: 508/466, loss: 0.1983628123998642 2023-01-24 02:03:59.210480: step: 510/466, loss: 0.22286492586135864 2023-01-24 02:03:59.865626: step: 512/466, loss: 0.08817946165800095 2023-01-24 02:04:00.509175: step: 514/466, loss: 0.10085298866033554 2023-01-24 02:04:01.081328: step: 516/466, loss: 0.6722241044044495 2023-01-24 02:04:01.707781: step: 518/466, loss: 0.42127126455307007 2023-01-24 02:04:02.342989: step: 520/466, loss: 0.09877334535121918 2023-01-24 02:04:02.936100: step: 522/466, loss: 0.18479159474372864 2023-01-24 02:04:03.542670: step: 524/466, loss: 0.10049667209386826 2023-01-24 02:04:04.315444: step: 526/466, loss: 0.21246574819087982 2023-01-24 02:04:04.891210: step: 528/466, loss: 1.5300737619400024 2023-01-24 02:04:05.529478: step: 530/466, loss: 0.1098194494843483 2023-01-24 02:04:06.135052: step: 532/466, loss: 0.031503114849328995 2023-01-24 02:04:06.775652: step: 534/466, loss: 0.09223125874996185 2023-01-24 02:04:07.462240: step: 536/466, loss: 0.09262816607952118 2023-01-24 02:04:08.045132: step: 538/466, loss: 0.08239345997571945 2023-01-24 02:04:08.682500: step: 540/466, loss: 0.12973906099796295 2023-01-24 02:04:09.291954: step: 542/466, loss: 0.5507596731185913 2023-01-24 02:04:09.936455: step: 544/466, loss: 
0.34149739146232605 2023-01-24 02:04:10.564049: step: 546/466, loss: 0.0915873795747757 2023-01-24 02:04:11.167097: step: 548/466, loss: 1.4637705087661743 2023-01-24 02:04:11.778876: step: 550/466, loss: 0.38938653469085693 2023-01-24 02:04:12.487045: step: 552/466, loss: 0.10289011150598526 2023-01-24 02:04:13.107916: step: 554/466, loss: 0.15459519624710083 2023-01-24 02:04:13.748619: step: 556/466, loss: 0.09873844683170319 2023-01-24 02:04:14.371274: step: 558/466, loss: 0.17922475934028625 2023-01-24 02:04:14.963265: step: 560/466, loss: 0.14410601556301117 2023-01-24 02:04:15.594946: step: 562/466, loss: 0.12939496338367462 2023-01-24 02:04:16.265013: step: 564/466, loss: 0.25387799739837646 2023-01-24 02:04:16.888213: step: 566/466, loss: 0.12458079308271408 2023-01-24 02:04:17.497422: step: 568/466, loss: 0.2783191204071045 2023-01-24 02:04:18.073841: step: 570/466, loss: 0.09612763673067093 2023-01-24 02:04:18.689900: step: 572/466, loss: 0.27933767437934875 2023-01-24 02:04:19.287240: step: 574/466, loss: 0.061390191316604614 2023-01-24 02:04:19.891362: step: 576/466, loss: 0.7752988934516907 2023-01-24 02:04:20.523883: step: 578/466, loss: 0.5396843552589417 2023-01-24 02:04:21.115214: step: 580/466, loss: 1.0238251686096191 2023-01-24 02:04:21.710937: step: 582/466, loss: 0.13761739432811737 2023-01-24 02:04:22.306321: step: 584/466, loss: 0.643787145614624 2023-01-24 02:04:22.896368: step: 586/466, loss: 0.8737914562225342 2023-01-24 02:04:23.535762: step: 588/466, loss: 0.6515952944755554 2023-01-24 02:04:24.178454: step: 590/466, loss: 0.20468327403068542 2023-01-24 02:04:24.769499: step: 592/466, loss: 0.17673346400260925 2023-01-24 02:04:25.412142: step: 594/466, loss: 0.2418610006570816 2023-01-24 02:04:25.938864: step: 596/466, loss: 0.10020948201417923 2023-01-24 02:04:26.540248: step: 598/466, loss: 0.060589466243982315 2023-01-24 02:04:27.120001: step: 600/466, loss: 0.08437406271696091 2023-01-24 02:04:27.696052: step: 602/466, loss: 
0.11464756727218628 2023-01-24 02:04:28.339550: step: 604/466, loss: 0.0527980662882328 2023-01-24 02:04:28.944971: step: 606/466, loss: 0.19469307363033295 2023-01-24 02:04:29.557386: step: 608/466, loss: 0.22520983219146729 2023-01-24 02:04:30.170874: step: 610/466, loss: 0.4616248905658722 2023-01-24 02:04:30.761965: step: 612/466, loss: 0.6428051590919495 2023-01-24 02:04:31.339593: step: 614/466, loss: 0.11791159957647324 2023-01-24 02:04:31.991146: step: 616/466, loss: 0.10004676878452301 2023-01-24 02:04:32.677863: step: 618/466, loss: 0.140882670879364 2023-01-24 02:04:33.339664: step: 620/466, loss: 0.2986718714237213 2023-01-24 02:04:33.975186: step: 622/466, loss: 0.2703951299190521 2023-01-24 02:04:34.554373: step: 624/466, loss: 0.09847074747085571 2023-01-24 02:04:35.112327: step: 626/466, loss: 0.08627253025770187 2023-01-24 02:04:35.759354: step: 628/466, loss: 0.15860922634601593 2023-01-24 02:04:36.457699: step: 630/466, loss: 0.15982860326766968 2023-01-24 02:04:37.192451: step: 632/466, loss: 0.5689049363136292 2023-01-24 02:04:37.830870: step: 634/466, loss: 0.24592043459415436 2023-01-24 02:04:38.487459: step: 636/466, loss: 0.14072726666927338 2023-01-24 02:04:39.111861: step: 638/466, loss: 0.1406276971101761 2023-01-24 02:04:39.719670: step: 640/466, loss: 0.03222530707716942 2023-01-24 02:04:40.387135: step: 642/466, loss: 0.38642606139183044 2023-01-24 02:04:40.993186: step: 644/466, loss: 0.09393725544214249 2023-01-24 02:04:41.624088: step: 646/466, loss: 0.08864431083202362 2023-01-24 02:04:42.208574: step: 648/466, loss: 0.07840286195278168 2023-01-24 02:04:42.781281: step: 650/466, loss: 0.2724342346191406 2023-01-24 02:04:43.422657: step: 652/466, loss: 0.22781580686569214 2023-01-24 02:04:44.020943: step: 654/466, loss: 0.1687709242105484 2023-01-24 02:04:44.766490: step: 656/466, loss: 0.15484493970870972 2023-01-24 02:04:45.533408: step: 658/466, loss: 0.06312824785709381 2023-01-24 02:04:46.131718: step: 660/466, loss: 
0.140718013048172 2023-01-24 02:04:46.820177: step: 662/466, loss: 0.10476689040660858 2023-01-24 02:04:47.366119: step: 664/466, loss: 0.07167988270521164 2023-01-24 02:04:47.971920: step: 666/466, loss: 0.12937051057815552 2023-01-24 02:04:48.579265: step: 668/466, loss: 0.05053913593292236 2023-01-24 02:04:49.216165: step: 670/466, loss: 0.3197830319404602 2023-01-24 02:04:49.815426: step: 672/466, loss: 0.2779462933540344 2023-01-24 02:04:50.496472: step: 674/466, loss: 0.18884597718715668 2023-01-24 02:04:51.131333: step: 676/466, loss: 0.3712058961391449 2023-01-24 02:04:51.811077: step: 678/466, loss: 0.2407611608505249 2023-01-24 02:04:52.414700: step: 680/466, loss: 0.394243448972702 2023-01-24 02:04:53.054744: step: 682/466, loss: 0.07410983741283417 2023-01-24 02:04:53.673143: step: 684/466, loss: 0.4667331576347351 2023-01-24 02:04:54.251520: step: 686/466, loss: 0.5091947317123413 2023-01-24 02:04:54.862422: step: 688/466, loss: 0.395091712474823 2023-01-24 02:04:55.450033: step: 690/466, loss: 0.37954455614089966 2023-01-24 02:04:56.127722: step: 692/466, loss: 0.7210215926170349 2023-01-24 02:04:56.725680: step: 694/466, loss: 0.08687297999858856 2023-01-24 02:04:57.452846: step: 696/466, loss: 0.3269064724445343 2023-01-24 02:04:58.013587: step: 698/466, loss: 0.25779739022254944 2023-01-24 02:04:58.592953: step: 700/466, loss: 0.19471396505832672 2023-01-24 02:04:59.248754: step: 702/466, loss: 0.17518281936645508 2023-01-24 02:04:59.973327: step: 704/466, loss: 0.1451595425605774 2023-01-24 02:05:00.624839: step: 706/466, loss: 0.043420497328042984 2023-01-24 02:05:01.253013: step: 708/466, loss: 0.0574161633849144 2023-01-24 02:05:01.973789: step: 710/466, loss: 0.08920268714427948 2023-01-24 02:05:02.643484: step: 712/466, loss: 0.10907725244760513 2023-01-24 02:05:03.275921: step: 714/466, loss: 0.12948867678642273 2023-01-24 02:05:03.904659: step: 716/466, loss: 0.16905492544174194 2023-01-24 02:05:04.471067: step: 718/466, loss: 
0.2499549686908722 2023-01-24 02:05:05.179318: step: 720/466, loss: 0.03271065279841423 2023-01-24 02:05:05.829609: step: 722/466, loss: 0.02659328654408455 2023-01-24 02:05:06.449672: step: 724/466, loss: 0.23334236443042755 2023-01-24 02:05:07.145425: step: 726/466, loss: 0.3844182789325714 2023-01-24 02:05:07.829582: step: 728/466, loss: 0.33857661485671997 2023-01-24 02:05:08.444314: step: 730/466, loss: 0.10223531723022461 2023-01-24 02:05:09.148050: step: 732/466, loss: 0.4171455204486847 2023-01-24 02:05:09.787070: step: 734/466, loss: 0.08916039019823074 2023-01-24 02:05:10.440774: step: 736/466, loss: 0.37796342372894287 2023-01-24 02:05:11.069188: step: 738/466, loss: 0.05818198248744011 2023-01-24 02:05:11.650774: step: 740/466, loss: 0.14418166875839233 2023-01-24 02:05:12.273901: step: 742/466, loss: 0.9279890060424805 2023-01-24 02:05:12.907801: step: 744/466, loss: 0.10512785613536835 2023-01-24 02:05:13.567328: step: 746/466, loss: 0.21932624280452728 2023-01-24 02:05:14.182211: step: 748/466, loss: 0.31123635172843933 2023-01-24 02:05:14.894265: step: 750/466, loss: 0.2003611922264099 2023-01-24 02:05:15.504008: step: 752/466, loss: 0.05662640929222107 2023-01-24 02:05:16.100001: step: 754/466, loss: 0.06766672432422638 2023-01-24 02:05:16.699037: step: 756/466, loss: 0.02892901934683323 2023-01-24 02:05:17.293815: step: 758/466, loss: 0.11633181571960449 2023-01-24 02:05:17.941155: step: 760/466, loss: 0.11391647905111313 2023-01-24 02:05:18.597509: step: 762/466, loss: 0.06218302249908447 2023-01-24 02:05:19.237737: step: 764/466, loss: 0.14876174926757812 2023-01-24 02:05:19.845185: step: 766/466, loss: 0.18223460018634796 2023-01-24 02:05:20.420375: step: 768/466, loss: 0.1018591970205307 2023-01-24 02:05:21.003423: step: 770/466, loss: 0.051450781524181366 2023-01-24 02:05:21.599511: step: 772/466, loss: 0.11895598471164703 2023-01-24 02:05:22.213997: step: 774/466, loss: 0.3299162983894348 2023-01-24 02:05:22.824702: step: 776/466, loss: 
0.03409243002533913 2023-01-24 02:05:23.493357: step: 778/466, loss: 0.2987508773803711 2023-01-24 02:05:24.183343: step: 780/466, loss: 0.4159797430038452 2023-01-24 02:05:24.721845: step: 782/466, loss: 0.0329253189265728 2023-01-24 02:05:25.392677: step: 784/466, loss: 0.07173468172550201 2023-01-24 02:05:26.033155: step: 786/466, loss: 0.17268122732639313 2023-01-24 02:05:26.709797: step: 788/466, loss: 0.37131261825561523 2023-01-24 02:05:27.338498: step: 790/466, loss: 0.1132279559969902 2023-01-24 02:05:27.937514: step: 792/466, loss: 0.17152756452560425 2023-01-24 02:05:28.532085: step: 794/466, loss: 0.19634950160980225 2023-01-24 02:05:29.142116: step: 796/466, loss: 0.2054695188999176 2023-01-24 02:05:29.778483: step: 798/466, loss: 0.2601601481437683 2023-01-24 02:05:30.407875: step: 800/466, loss: 0.7241315245628357 2023-01-24 02:05:31.026908: step: 802/466, loss: 0.8270736336708069 2023-01-24 02:05:31.659909: step: 804/466, loss: 0.03443297743797302 2023-01-24 02:05:32.309594: step: 806/466, loss: 0.06627636402845383 2023-01-24 02:05:32.990176: step: 808/466, loss: 0.10838421434164047 2023-01-24 02:05:33.601265: step: 810/466, loss: 0.09377229958772659 2023-01-24 02:05:34.228527: step: 812/466, loss: 0.11635808646678925 2023-01-24 02:05:34.882841: step: 814/466, loss: 0.0770280733704567 2023-01-24 02:05:35.521430: step: 816/466, loss: 0.12906278669834137 2023-01-24 02:05:36.139109: step: 818/466, loss: 0.41343000531196594 2023-01-24 02:05:36.730568: step: 820/466, loss: 0.04330282285809517 2023-01-24 02:05:37.335950: step: 822/466, loss: 0.08779174834489822 2023-01-24 02:05:37.924914: step: 824/466, loss: 0.13243816792964935 2023-01-24 02:05:38.515572: step: 826/466, loss: 0.156992107629776 2023-01-24 02:05:39.129989: step: 828/466, loss: 0.08866634219884872 2023-01-24 02:05:39.721896: step: 830/466, loss: 0.5653623342514038 2023-01-24 02:05:40.328064: step: 832/466, loss: 0.18135926127433777 2023-01-24 02:05:40.948499: step: 834/466, loss: 
0.8506476283073425 2023-01-24 02:05:41.562390: step: 836/466, loss: 0.10663268715143204 2023-01-24 02:05:42.249669: step: 838/466, loss: 0.37173619866371155 2023-01-24 02:05:42.858133: step: 840/466, loss: 0.8415031433105469 2023-01-24 02:05:43.471370: step: 842/466, loss: 0.4686752259731293 2023-01-24 02:05:44.098516: step: 844/466, loss: 0.2503049075603485 2023-01-24 02:05:44.754935: step: 846/466, loss: 1.3546899557113647 2023-01-24 02:05:45.363856: step: 848/466, loss: 0.08763039857149124 2023-01-24 02:05:45.947562: step: 850/466, loss: 0.044537756592035294 2023-01-24 02:05:46.497594: step: 852/466, loss: 0.12424295395612717 2023-01-24 02:05:47.133891: step: 854/466, loss: 0.17801827192306519 2023-01-24 02:05:47.773869: step: 856/466, loss: 0.08622211962938309 2023-01-24 02:05:48.392027: step: 858/466, loss: 0.10934218764305115 2023-01-24 02:05:49.015750: step: 860/466, loss: 0.5443745851516724 2023-01-24 02:05:49.629289: step: 862/466, loss: 0.32259345054626465 2023-01-24 02:05:50.272359: step: 864/466, loss: 0.3073665201663971 2023-01-24 02:05:50.906492: step: 866/466, loss: 0.0844200998544693 2023-01-24 02:05:51.549707: step: 868/466, loss: 0.15249481797218323 2023-01-24 02:05:52.184888: step: 870/466, loss: 0.09784666448831558 2023-01-24 02:05:52.767785: step: 872/466, loss: 0.26929613947868347 2023-01-24 02:05:53.365502: step: 874/466, loss: 0.7266647219657898 2023-01-24 02:05:53.941011: step: 876/466, loss: 0.11726915836334229 2023-01-24 02:05:54.606300: step: 878/466, loss: 1.9974907636642456 2023-01-24 02:05:55.273968: step: 880/466, loss: 0.37205514311790466 2023-01-24 02:05:55.913931: step: 882/466, loss: 0.10242167860269547 2023-01-24 02:05:56.575421: step: 884/466, loss: 0.12692664563655853 2023-01-24 02:05:57.184586: step: 886/466, loss: 0.23743067681789398 2023-01-24 02:05:57.821274: step: 888/466, loss: 0.3533828556537628 2023-01-24 02:05:58.446027: step: 890/466, loss: 0.11567486077547073 2023-01-24 02:05:59.038808: step: 892/466, loss: 
0.1629101186990738 2023-01-24 02:05:59.689485: step: 894/466, loss: 0.260605126619339 2023-01-24 02:06:00.293070: step: 896/466, loss: 0.10653205215930939 2023-01-24 02:06:00.907968: step: 898/466, loss: 0.13470444083213806 2023-01-24 02:06:01.587476: step: 900/466, loss: 0.10960997641086578 2023-01-24 02:06:02.267267: step: 902/466, loss: 0.39571037888526917 2023-01-24 02:06:02.903283: step: 904/466, loss: 0.05582544207572937 2023-01-24 02:06:03.533791: step: 906/466, loss: 0.13240954279899597 2023-01-24 02:06:04.163881: step: 908/466, loss: 0.3220178484916687 2023-01-24 02:06:04.708986: step: 910/466, loss: 0.09687075018882751 2023-01-24 02:06:05.350922: step: 912/466, loss: 0.19923625886440277 2023-01-24 02:06:06.021266: step: 914/466, loss: 0.09295041859149933 2023-01-24 02:06:06.671617: step: 916/466, loss: 0.07397592067718506 2023-01-24 02:06:07.303666: step: 918/466, loss: 0.12189310789108276 2023-01-24 02:06:07.896556: step: 920/466, loss: 0.14278768002986908 2023-01-24 02:06:08.457611: step: 922/466, loss: 0.07068316638469696 2023-01-24 02:06:09.035945: step: 924/466, loss: 0.08041635900735855 2023-01-24 02:06:09.643263: step: 926/466, loss: 0.1307588815689087 2023-01-24 02:06:10.200071: step: 928/466, loss: 0.054248761385679245 2023-01-24 02:06:10.797590: step: 930/466, loss: 0.41642510890960693 2023-01-24 02:06:11.401783: step: 932/466, loss: 0.646379828453064
==================================================
Loss: 0.237
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3267949100408958, 'r': 0.2587126371157092, 'f1': 0.2887955018966056}, 'combined': 0.19253033459773705, 'epoch': 13}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.38946293813003574, 'r': 0.2876293387228749, 'f1': 0.33088833289334707}, 'combined': 0.21594817515144754, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33052165810117173, 'r': 0.31358791091192767, 'f1': 0.32183218899821986}, 'combined': 0.23713950768289882, 'epoch': 13}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3621676965217385, 'r': 0.2928698082695271, 'f1': 0.32385316280641824}, 'combined': 0.21478344476280584, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.32608695652173914, 'f1': 0.39473684210526316}, 'combined': 0.2631578947368421, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4444444444444444, 'r': 0.13793103448275862, 'f1': 0.21052631578947367}, 'combined': 0.14035087719298245, 'epoch': 13}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 14
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 02:08:47.279005: step: 2/466, loss: 0.26537057757377625 2023-01-24 02:08:47.841145: step: 4/466, loss: 0.025286676362156868 2023-01-24 02:08:48.457060: step: 6/466, loss: 0.08132816851139069 2023-01-24 02:08:49.111746: step: 8/466, loss: 0.09927152842283249 2023-01-24 02:08:49.746954: step: 10/466, loss: 0.47495347261428833 2023-01-24 02:08:50.379306: step: 12/466, loss: 0.23650674521923065 2023-01-24 02:08:50.963939: step: 14/466, loss: 0.10547780245542526 2023-01-24 02:08:51.604073: step: 16/466, loss: 0.10848968476057053 2023-01-24 02:08:52.260405: step: 18/466, loss: 0.05141892284154892 2023-01-24 02:08:52.847283: step: 20/466, loss: 0.07002759724855423 2023-01-24 02:08:53.443199: step: 22/466, loss: 0.12314854562282562 2023-01-24 02:08:54.119308: step: 24/466, loss: 0.09595341980457306 2023-01-24 02:08:54.703059: step: 26/466, loss: 0.0839044377207756 2023-01-24 02:08:55.523630: step: 28/466, loss: 0.13150641322135925 2023-01-24 02:08:56.122877: step: 30/466, loss: 0.03693179786205292 2023-01-24 02:08:56.736294: step: 32/466, loss: 0.16620874404907227 2023-01-24 02:08:57.408656: step: 34/466, loss: 0.37741395831108093 2023-01-24 02:08:58.035209: step: 36/466, loss: 0.07734955102205276 2023-01-24 02:08:58.640139: step: 38/466, loss: 2.731236219406128 2023-01-24 02:08:59.250208: step: 40/466, loss: 0.16634321212768555 2023-01-24 02:08:59.884619: step: 42/466, loss: 0.047279294580221176 2023-01-24 02:09:00.532413: step: 44/466, loss: 0.19378140568733215 2023-01-24 02:09:01.103303: step: 46/466, loss: 0.14623573422431946 2023-01-24 02:09:01.738181: step: 48/466, loss: 
0.12817901372909546 2023-01-24 02:09:02.392023: step: 50/466, loss: 0.27838876843452454 2023-01-24 02:09:03.110699: step: 52/466, loss: 0.12414537370204926 2023-01-24 02:09:03.713663: step: 54/466, loss: 0.13180899620056152 2023-01-24 02:09:04.329809: step: 56/466, loss: 0.13430505990982056 2023-01-24 02:09:04.885570: step: 58/466, loss: 0.09954774379730225 2023-01-24 02:09:05.490639: step: 60/466, loss: 0.1269797831773758 2023-01-24 02:09:06.164793: step: 62/466, loss: 0.2481815665960312 2023-01-24 02:09:06.796364: step: 64/466, loss: 0.07848747819662094 2023-01-24 02:09:07.472257: step: 66/466, loss: 0.061888862401247025 2023-01-24 02:09:08.066054: step: 68/466, loss: 0.5187919735908508 2023-01-24 02:09:08.677733: step: 70/466, loss: 0.41317012906074524 2023-01-24 02:09:09.316174: step: 72/466, loss: 0.08287671208381653 2023-01-24 02:09:09.888109: step: 74/466, loss: 0.11693020164966583 2023-01-24 02:09:10.493479: step: 76/466, loss: 0.18891726434230804 2023-01-24 02:09:11.157140: step: 78/466, loss: 0.14649678766727448 2023-01-24 02:09:11.769106: step: 80/466, loss: 0.09510315954685211 2023-01-24 02:09:12.391681: step: 82/466, loss: 0.041796330362558365 2023-01-24 02:09:13.005696: step: 84/466, loss: 0.12924985587596893 2023-01-24 02:09:13.578316: step: 86/466, loss: 0.07326295226812363 2023-01-24 02:09:14.157793: step: 88/466, loss: 0.13155905902385712 2023-01-24 02:09:14.794317: step: 90/466, loss: 0.8130422234535217 2023-01-24 02:09:15.450593: step: 92/466, loss: 0.18054792284965515 2023-01-24 02:09:16.013919: step: 94/466, loss: 0.08204445242881775 2023-01-24 02:09:16.642718: step: 96/466, loss: 0.33193227648735046 2023-01-24 02:09:17.254661: step: 98/466, loss: 0.13546620309352875 2023-01-24 02:09:17.939771: step: 100/466, loss: 0.03974077105522156 2023-01-24 02:09:18.526903: step: 102/466, loss: 0.06648655980825424 2023-01-24 02:09:19.210334: step: 104/466, loss: 0.11741594225168228 2023-01-24 02:09:19.810486: step: 106/466, loss: 0.1027723178267479 
2023-01-24 02:09:20.363042: step: 108/466, loss: 0.12174259126186371 2023-01-24 02:09:21.004383: step: 110/466, loss: 0.06153004989027977 2023-01-24 02:09:21.631055: step: 112/466, loss: 0.040703579783439636 2023-01-24 02:09:22.235277: step: 114/466, loss: 1.0315808057785034 2023-01-24 02:09:22.770388: step: 116/466, loss: 0.07071185111999512 2023-01-24 02:09:23.434314: step: 118/466, loss: 0.4233225882053375 2023-01-24 02:09:24.066124: step: 120/466, loss: 0.2131342887878418 2023-01-24 02:09:24.698078: step: 122/466, loss: 0.06186900660395622 2023-01-24 02:09:25.253111: step: 124/466, loss: 0.10383732616901398 2023-01-24 02:09:25.882848: step: 126/466, loss: 0.09479478001594543 2023-01-24 02:09:26.474671: step: 128/466, loss: 0.17489869892597198 2023-01-24 02:09:27.064169: step: 130/466, loss: 0.16707611083984375 2023-01-24 02:09:27.629977: step: 132/466, loss: 0.17206239700317383 2023-01-24 02:09:28.256706: step: 134/466, loss: 0.20160230994224548 2023-01-24 02:09:28.923713: step: 136/466, loss: 0.20505794882774353 2023-01-24 02:09:29.574614: step: 138/466, loss: 0.16573834419250488 2023-01-24 02:09:30.163199: step: 140/466, loss: 0.1187601089477539 2023-01-24 02:09:30.749493: step: 142/466, loss: 0.024643516167998314 2023-01-24 02:09:31.445518: step: 144/466, loss: 0.12837985157966614 2023-01-24 02:09:32.191117: step: 146/466, loss: 0.43892914056777954 2023-01-24 02:09:32.864072: step: 148/466, loss: 0.04670446738600731 2023-01-24 02:09:33.513377: step: 150/466, loss: 0.1311047524213791 2023-01-24 02:09:34.242303: step: 152/466, loss: 0.11167477816343307 2023-01-24 02:09:34.957834: step: 154/466, loss: 0.12165019661188126 2023-01-24 02:09:35.567072: step: 156/466, loss: 0.2770508825778961 2023-01-24 02:09:36.200750: step: 158/466, loss: 0.11379371583461761 2023-01-24 02:09:36.816932: step: 160/466, loss: 0.09370989352464676 2023-01-24 02:09:37.401568: step: 162/466, loss: 0.209278404712677 2023-01-24 02:09:38.034487: step: 164/466, loss: 0.05772625654935837 
2023-01-24 02:09:38.635096: step: 166/466, loss: 0.06810871511697769 2023-01-24 02:09:39.227043: step: 168/466, loss: 0.05928172543644905 2023-01-24 02:09:39.874291: step: 170/466, loss: 0.19251835346221924 2023-01-24 02:09:40.470554: step: 172/466, loss: 0.1081613078713417 2023-01-24 02:09:41.093426: step: 174/466, loss: 0.04214688017964363 2023-01-24 02:09:41.651103: step: 176/466, loss: 0.10552027821540833 2023-01-24 02:09:42.350979: step: 178/466, loss: 0.082865409553051 2023-01-24 02:09:42.985228: step: 180/466, loss: 0.4372321367263794 2023-01-24 02:09:43.583938: step: 182/466, loss: 0.0875171348452568 2023-01-24 02:09:44.213514: step: 184/466, loss: 0.13187700510025024 2023-01-24 02:09:44.884852: step: 186/466, loss: 0.09077508002519608 2023-01-24 02:09:45.416262: step: 188/466, loss: 0.15599395334720612 2023-01-24 02:09:45.982477: step: 190/466, loss: 0.07060946524143219 2023-01-24 02:09:46.588803: step: 192/466, loss: 0.061473701149225235 2023-01-24 02:09:47.204527: step: 194/466, loss: 0.24900802969932556 2023-01-24 02:09:47.861470: step: 196/466, loss: 0.614746630191803 2023-01-24 02:09:48.517234: step: 198/466, loss: 0.5936498641967773 2023-01-24 02:09:49.147409: step: 200/466, loss: 0.17943210899829865 2023-01-24 02:09:49.756606: step: 202/466, loss: 0.14334438741207123 2023-01-24 02:09:50.389066: step: 204/466, loss: 0.13771358132362366 2023-01-24 02:09:50.986946: step: 206/466, loss: 0.10882329195737839 2023-01-24 02:09:51.669979: step: 208/466, loss: 0.171889528632164 2023-01-24 02:09:52.362933: step: 210/466, loss: 0.17607173323631287 2023-01-24 02:09:53.013209: step: 212/466, loss: 0.13134627044200897 2023-01-24 02:09:53.663083: step: 214/466, loss: 0.15773671865463257 2023-01-24 02:09:54.328812: step: 216/466, loss: 0.09228464215993881 2023-01-24 02:09:54.995420: step: 218/466, loss: 0.029966870322823524 2023-01-24 02:09:55.563933: step: 220/466, loss: 0.07456782460212708 2023-01-24 02:09:56.217982: step: 222/466, loss: 0.011664489284157753 
2023-01-24 02:09:56.842262: step: 224/466, loss: 0.1483556181192398 2023-01-24 02:09:57.451474: step: 226/466, loss: 0.2248881608247757 2023-01-24 02:09:58.093690: step: 228/466, loss: 0.17643052339553833 2023-01-24 02:09:58.748472: step: 230/466, loss: 0.05628127232193947 2023-01-24 02:09:59.392319: step: 232/466, loss: 0.21768799424171448 2023-01-24 02:10:00.035690: step: 234/466, loss: 0.5761271119117737 2023-01-24 02:10:00.642859: step: 236/466, loss: 0.3103395700454712 2023-01-24 02:10:01.290247: step: 238/466, loss: 0.2983532249927521 2023-01-24 02:10:01.926525: step: 240/466, loss: 0.16951322555541992 2023-01-24 02:10:02.545981: step: 242/466, loss: 0.07378947734832764 2023-01-24 02:10:03.224632: step: 244/466, loss: 0.11983276158571243 2023-01-24 02:10:03.865349: step: 246/466, loss: 0.48832592368125916 2023-01-24 02:10:04.554157: step: 248/466, loss: 0.07185845077037811 2023-01-24 02:10:05.138642: step: 250/466, loss: 0.03485824167728424 2023-01-24 02:10:05.768638: step: 252/466, loss: 0.12447050213813782 2023-01-24 02:10:06.377218: step: 254/466, loss: 0.1706133782863617 2023-01-24 02:10:06.975997: step: 256/466, loss: 0.11184553056955338 2023-01-24 02:10:07.684052: step: 258/466, loss: 0.11569743603467941 2023-01-24 02:10:08.277259: step: 260/466, loss: 0.13082577288150787 2023-01-24 02:10:08.923182: step: 262/466, loss: 0.16038110852241516 2023-01-24 02:10:09.580358: step: 264/466, loss: 0.13148073852062225 2023-01-24 02:10:10.202453: step: 266/466, loss: 0.21504873037338257 2023-01-24 02:10:10.882472: step: 268/466, loss: 0.44771113991737366 2023-01-24 02:10:11.477699: step: 270/466, loss: 0.14703847467899323 2023-01-24 02:10:12.122513: step: 272/466, loss: 0.10682922601699829 2023-01-24 02:10:12.757449: step: 274/466, loss: 0.07483697682619095 2023-01-24 02:10:13.366236: step: 276/466, loss: 0.04568616673350334 2023-01-24 02:10:13.992631: step: 278/466, loss: 0.2048133760690689 2023-01-24 02:10:14.649017: step: 280/466, loss: 0.3653821051120758 
2023-01-24 02:10:15.334971: step: 282/466, loss: 0.3007429242134094 2023-01-24 02:10:16.013822: step: 284/466, loss: 0.117209292948246 2023-01-24 02:10:16.636749: step: 286/466, loss: 0.06718248873949051 2023-01-24 02:10:17.222582: step: 288/466, loss: 0.3619995713233948 2023-01-24 02:10:17.825926: step: 290/466, loss: 0.34004920721054077 2023-01-24 02:10:18.476598: step: 292/466, loss: 0.08876284211874008 2023-01-24 02:10:19.093756: step: 294/466, loss: 0.056431349366903305 2023-01-24 02:10:19.747059: step: 296/466, loss: 0.13510681688785553 2023-01-24 02:10:20.386388: step: 298/466, loss: 0.18540455400943756 2023-01-24 02:10:20.946970: step: 300/466, loss: 0.1090221256017685 2023-01-24 02:10:21.596541: step: 302/466, loss: 0.08833901584148407 2023-01-24 02:10:22.224486: step: 304/466, loss: 0.18773992359638214 2023-01-24 02:10:22.861344: step: 306/466, loss: 0.35183510184288025 2023-01-24 02:10:23.486638: step: 308/466, loss: 0.07946083694696426 2023-01-24 02:10:24.106266: step: 310/466, loss: 0.06417113542556763 2023-01-24 02:10:24.713003: step: 312/466, loss: 0.026588531211018562 2023-01-24 02:10:25.308721: step: 314/466, loss: 0.16041052341461182 2023-01-24 02:10:25.943257: step: 316/466, loss: 0.07452167570590973 2023-01-24 02:10:26.573703: step: 318/466, loss: 0.09353282302618027 2023-01-24 02:10:27.264355: step: 320/466, loss: 9.388075828552246 2023-01-24 02:10:27.847635: step: 322/466, loss: 0.10507649928331375 2023-01-24 02:10:28.479573: step: 324/466, loss: 0.05475550517439842 2023-01-24 02:10:29.116767: step: 326/466, loss: 0.07265757769346237 2023-01-24 02:10:29.695555: step: 328/466, loss: 0.0750664621591568 2023-01-24 02:10:30.286935: step: 330/466, loss: 0.09978318214416504 2023-01-24 02:10:30.925824: step: 332/466, loss: 0.30219554901123047 2023-01-24 02:10:31.564534: step: 334/466, loss: 0.023919468745589256 2023-01-24 02:10:32.222022: step: 336/466, loss: 0.11267466098070145 2023-01-24 02:10:32.890931: step: 338/466, loss: 0.02998601458966732 
2023-01-24 02:10:33.514341: step: 340/466, loss: 0.24879337847232819 2023-01-24 02:10:34.183621: step: 342/466, loss: 0.23057223856449127 2023-01-24 02:10:34.788954: step: 344/466, loss: 0.08246339857578278 2023-01-24 02:10:35.373635: step: 346/466, loss: 0.14387105405330658 2023-01-24 02:10:35.966297: step: 348/466, loss: 0.405474454164505 2023-01-24 02:10:36.578691: step: 350/466, loss: 0.05964050069451332 2023-01-24 02:10:37.228732: step: 352/466, loss: 0.03839050233364105 2023-01-24 02:10:37.867722: step: 354/466, loss: 0.13237349689006805 2023-01-24 02:10:38.499539: step: 356/466, loss: 0.289070188999176 2023-01-24 02:10:39.183102: step: 358/466, loss: 0.08378314971923828 2023-01-24 02:10:39.760210: step: 360/466, loss: 0.19309566915035248 2023-01-24 02:10:40.375455: step: 362/466, loss: 0.1765211969614029 2023-01-24 02:10:41.043552: step: 364/466, loss: 0.08976773172616959 2023-01-24 02:10:41.669405: step: 366/466, loss: 5.7123122215271 2023-01-24 02:10:42.300909: step: 368/466, loss: 0.3234618902206421 2023-01-24 02:10:42.923323: step: 370/466, loss: 0.09172488003969193 2023-01-24 02:10:43.614603: step: 372/466, loss: 0.03426278382539749 2023-01-24 02:10:44.196117: step: 374/466, loss: 0.1049998477101326 2023-01-24 02:10:44.839244: step: 376/466, loss: 0.050934456288814545 2023-01-24 02:10:45.475223: step: 378/466, loss: 2.1087403297424316 2023-01-24 02:10:46.061942: step: 380/466, loss: 0.07639701664447784 2023-01-24 02:10:46.668837: step: 382/466, loss: 0.22721798717975616 2023-01-24 02:10:47.355995: step: 384/466, loss: 0.0736975371837616 2023-01-24 02:10:47.995720: step: 386/466, loss: 0.16866913437843323 2023-01-24 02:10:48.606327: step: 388/466, loss: 0.4156855642795563 2023-01-24 02:10:49.175001: step: 390/466, loss: 0.46621739864349365 2023-01-24 02:10:49.781485: step: 392/466, loss: 0.06919531524181366 2023-01-24 02:10:50.322518: step: 394/466, loss: 0.18814410269260406 2023-01-24 02:10:50.924483: step: 396/466, loss: 0.30297258496284485 2023-01-24 
02:10:51.544706: step: 398/466, loss: 0.2292100340127945 2023-01-24 02:10:52.140879: step: 400/466, loss: 0.7158749103546143 2023-01-24 02:10:52.814615: step: 402/466, loss: 0.0643678605556488 2023-01-24 02:10:53.447865: step: 404/466, loss: 0.27010399103164673 2023-01-24 02:10:54.121768: step: 406/466, loss: 0.33736926317214966 2023-01-24 02:10:54.743831: step: 408/466, loss: 0.1557852327823639 2023-01-24 02:10:55.331568: step: 410/466, loss: 0.09922328591346741 2023-01-24 02:10:55.939951: step: 412/466, loss: 0.13485293090343475 2023-01-24 02:10:56.501623: step: 414/466, loss: 0.11178892850875854 2023-01-24 02:10:57.056201: step: 416/466, loss: 0.07302606105804443 2023-01-24 02:10:57.715364: step: 418/466, loss: 0.0684945359826088 2023-01-24 02:10:58.321585: step: 420/466, loss: 0.5240396857261658 2023-01-24 02:10:58.939127: step: 422/466, loss: 0.5128722786903381 2023-01-24 02:10:59.557211: step: 424/466, loss: 0.21964208781719208 2023-01-24 02:11:00.310753: step: 426/466, loss: 0.02849763073027134 2023-01-24 02:11:00.993708: step: 428/466, loss: 0.10740663856267929 2023-01-24 02:11:01.587056: step: 430/466, loss: 0.1858994960784912 2023-01-24 02:11:02.190964: step: 432/466, loss: 0.08060059696435928 2023-01-24 02:11:02.806279: step: 434/466, loss: 0.19610847532749176 2023-01-24 02:11:03.398788: step: 436/466, loss: 0.15405970811843872 2023-01-24 02:11:04.016816: step: 438/466, loss: 0.13509126007556915 2023-01-24 02:11:04.600737: step: 440/466, loss: 0.10840291529893875 2023-01-24 02:11:05.184213: step: 442/466, loss: 0.12234757095575333 2023-01-24 02:11:05.816570: step: 444/466, loss: 0.0750017836689949 2023-01-24 02:11:06.478836: step: 446/466, loss: 0.1562875509262085 2023-01-24 02:11:07.053494: step: 448/466, loss: 0.06718388199806213 2023-01-24 02:11:07.615121: step: 450/466, loss: 0.1260349452495575 2023-01-24 02:11:08.244442: step: 452/466, loss: 0.3128783106803894 2023-01-24 02:11:08.825857: step: 454/466, loss: 0.11964062601327896 2023-01-24 
02:11:09.488261: step: 456/466, loss: 0.1366245299577713 2023-01-24 02:11:10.163607: step: 458/466, loss: 0.02635299414396286 2023-01-24 02:11:10.736695: step: 460/466, loss: 0.12284113466739655 2023-01-24 02:11:11.307194: step: 462/466, loss: 0.1805114895105362 2023-01-24 02:11:11.867860: step: 464/466, loss: 0.15980634093284607 2023-01-24 02:11:12.511792: step: 466/466, loss: 0.12533527612686157 2023-01-24 02:11:13.086826: step: 468/466, loss: 0.15628713369369507 2023-01-24 02:11:13.706227: step: 470/466, loss: 0.15992864966392517 2023-01-24 02:11:14.386078: step: 472/466, loss: 0.12801966071128845 2023-01-24 02:11:14.937199: step: 474/466, loss: 0.06098826974630356 2023-01-24 02:11:15.505066: step: 476/466, loss: 0.8425703048706055 2023-01-24 02:11:16.055783: step: 478/466, loss: 0.04198523610830307 2023-01-24 02:11:16.639809: step: 480/466, loss: 0.13882498443126678 2023-01-24 02:11:17.242830: step: 482/466, loss: 0.30974912643432617 2023-01-24 02:11:17.879391: step: 484/466, loss: 0.03182601556181908 2023-01-24 02:11:18.474770: step: 486/466, loss: 0.11850722134113312 2023-01-24 02:11:19.095386: step: 488/466, loss: 0.06199447438120842 2023-01-24 02:11:19.698234: step: 490/466, loss: 0.03679699823260307 2023-01-24 02:11:20.302675: step: 492/466, loss: 0.06447263062000275 2023-01-24 02:11:20.927142: step: 494/466, loss: 0.09972032904624939 2023-01-24 02:11:21.553564: step: 496/466, loss: 1.3976175785064697 2023-01-24 02:11:22.209933: step: 498/466, loss: 6.323942184448242 2023-01-24 02:11:22.811082: step: 500/466, loss: 0.15918752551078796 2023-01-24 02:11:23.394926: step: 502/466, loss: 0.36978256702423096 2023-01-24 02:11:24.067668: step: 504/466, loss: 0.09340979158878326 2023-01-24 02:11:24.673325: step: 506/466, loss: 0.0978708490729332 2023-01-24 02:11:25.338399: step: 508/466, loss: 0.09205763787031174 2023-01-24 02:11:25.931362: step: 510/466, loss: 0.06624890118837357 2023-01-24 02:11:26.588966: step: 512/466, loss: 0.17666137218475342 2023-01-24 
02:11:27.265457: step: 514/466, loss: 0.18705052137374878 2023-01-24 02:11:27.924578: step: 516/466, loss: 0.13752776384353638 2023-01-24 02:11:28.542945: step: 518/466, loss: 0.08362399786710739 2023-01-24 02:11:29.170301: step: 520/466, loss: 0.2757173180580139 2023-01-24 02:11:29.823512: step: 522/466, loss: 0.06851635128259659 2023-01-24 02:11:30.385607: step: 524/466, loss: 0.08473610132932663 2023-01-24 02:11:31.037064: step: 526/466, loss: 0.232706218957901 2023-01-24 02:11:31.668588: step: 528/466, loss: 1.5634676218032837 2023-01-24 02:11:32.363891: step: 530/466, loss: 0.26936009526252747 2023-01-24 02:11:32.990169: step: 532/466, loss: 0.5476840734481812 2023-01-24 02:11:33.536684: step: 534/466, loss: 0.11797524243593216 2023-01-24 02:11:34.173786: step: 536/466, loss: 0.2392173558473587 2023-01-24 02:11:34.850419: step: 538/466, loss: 0.08022218942642212 2023-01-24 02:11:35.572128: step: 540/466, loss: 0.5519012808799744 2023-01-24 02:11:36.184323: step: 542/466, loss: 0.29034101963043213 2023-01-24 02:11:36.799961: step: 544/466, loss: 0.1117844432592392 2023-01-24 02:11:37.432899: step: 546/466, loss: 0.1492680013179779 2023-01-24 02:11:38.103645: step: 548/466, loss: 0.14945177733898163 2023-01-24 02:11:38.790749: step: 550/466, loss: 0.15347805619239807 2023-01-24 02:11:39.433074: step: 552/466, loss: 0.5507827997207642 2023-01-24 02:11:40.037538: step: 554/466, loss: 0.5270827412605286 2023-01-24 02:11:40.619959: step: 556/466, loss: 0.08873269706964493 2023-01-24 02:11:41.203189: step: 558/466, loss: 8.167985916137695 2023-01-24 02:11:41.855206: step: 560/466, loss: 0.10504305362701416 2023-01-24 02:11:42.461646: step: 562/466, loss: 3.3082435131073 2023-01-24 02:11:43.065125: step: 564/466, loss: 0.15133020281791687 2023-01-24 02:11:43.659959: step: 566/466, loss: 0.10324478894472122 2023-01-24 02:11:44.193993: step: 568/466, loss: 0.23461833596229553 2023-01-24 02:11:44.787878: step: 570/466, loss: 0.06707794219255447 2023-01-24 
02:11:45.444152: step: 572/466, loss: 0.08997345715761185 2023-01-24 02:11:46.138911: step: 574/466, loss: 0.25757864117622375 2023-01-24 02:11:46.783652: step: 576/466, loss: 0.07078008353710175 2023-01-24 02:11:47.345459: step: 578/466, loss: 0.24525022506713867 2023-01-24 02:11:47.903613: step: 580/466, loss: 0.11206243187189102 2023-01-24 02:11:48.539332: step: 582/466, loss: 0.06284279376268387 2023-01-24 02:11:49.212098: step: 584/466, loss: 0.48521706461906433 2023-01-24 02:11:49.831071: step: 586/466, loss: 0.09278231859207153 2023-01-24 02:11:50.433930: step: 588/466, loss: 0.10632549226284027 2023-01-24 02:11:51.028223: step: 590/466, loss: 0.17148663103580475 2023-01-24 02:11:51.669022: step: 592/466, loss: 0.10256657749414444 2023-01-24 02:11:52.216603: step: 594/466, loss: 0.15673145651817322 2023-01-24 02:11:52.829306: step: 596/466, loss: 0.0729982927441597 2023-01-24 02:11:53.425632: step: 598/466, loss: 0.20437881350517273 2023-01-24 02:11:54.021608: step: 600/466, loss: 0.10806822776794434 2023-01-24 02:11:54.615559: step: 602/466, loss: 0.1564791202545166 2023-01-24 02:11:55.185948: step: 604/466, loss: 0.07354568690061569 2023-01-24 02:11:55.841298: step: 606/466, loss: 0.29064008593559265 2023-01-24 02:11:56.471078: step: 608/466, loss: 0.06247774884104729 2023-01-24 02:11:57.041402: step: 610/466, loss: 0.09393714368343353 2023-01-24 02:11:57.747693: step: 612/466, loss: 0.43984925746917725 2023-01-24 02:11:58.298052: step: 614/466, loss: 0.04968178644776344 2023-01-24 02:11:58.894989: step: 616/466, loss: 0.09494192153215408 2023-01-24 02:11:59.585526: step: 618/466, loss: 0.1882258951663971 2023-01-24 02:12:00.282039: step: 620/466, loss: 0.11165252327919006 2023-01-24 02:12:00.863481: step: 622/466, loss: 0.08681537210941315 2023-01-24 02:12:01.452615: step: 624/466, loss: 0.1627599149942398 2023-01-24 02:12:02.148906: step: 626/466, loss: 0.1943708062171936 2023-01-24 02:12:02.770439: step: 628/466, loss: 0.1491440385580063 2023-01-24 
02:12:03.426942: step: 630/466, loss: 0.16645674407482147 2023-01-24 02:12:04.022294: step: 632/466, loss: 0.013026190921664238 2023-01-24 02:12:04.658147: step: 634/466, loss: 0.24470369517803192 2023-01-24 02:12:05.325699: step: 636/466, loss: 0.5334924459457397 2023-01-24 02:12:05.921384: step: 638/466, loss: 1.54473876953125 2023-01-24 02:12:06.579077: step: 640/466, loss: 0.21863879263401031 2023-01-24 02:12:07.165263: step: 642/466, loss: 0.538988471031189 2023-01-24 02:12:07.805590: step: 644/466, loss: 0.190269336104393 2023-01-24 02:12:08.379350: step: 646/466, loss: 0.26166078448295593 2023-01-24 02:12:08.988968: step: 648/466, loss: 0.058466147631406784 2023-01-24 02:12:09.617343: step: 650/466, loss: 0.5121904015541077 2023-01-24 02:12:10.212322: step: 652/466, loss: 0.04038256034255028 2023-01-24 02:12:10.859615: step: 654/466, loss: 0.10419758409261703 2023-01-24 02:12:11.435885: step: 656/466, loss: 0.0707746148109436 2023-01-24 02:12:12.005706: step: 658/466, loss: 0.12626321613788605 2023-01-24 02:12:12.669393: step: 660/466, loss: 0.2745856046676636 2023-01-24 02:12:13.302102: step: 662/466, loss: 0.05933433026075363 2023-01-24 02:12:13.896245: step: 664/466, loss: 0.22782056033611298 2023-01-24 02:12:14.550010: step: 666/466, loss: 0.058970000594854355 2023-01-24 02:12:15.195610: step: 668/466, loss: 0.6271088123321533 2023-01-24 02:12:15.799482: step: 670/466, loss: 0.05891264230012894 2023-01-24 02:12:16.412618: step: 672/466, loss: 0.20704032480716705 2023-01-24 02:12:17.023810: step: 674/466, loss: 0.10041045397520065 2023-01-24 02:12:17.606416: step: 676/466, loss: 0.13243088126182556 2023-01-24 02:12:18.145533: step: 678/466, loss: 0.09281554818153381 2023-01-24 02:12:18.866604: step: 680/466, loss: 0.1242906004190445 2023-01-24 02:12:19.501344: step: 682/466, loss: 0.024335691705346107 2023-01-24 02:12:20.247660: step: 684/466, loss: 0.11704385280609131 2023-01-24 02:12:20.855481: step: 686/466, loss: 0.03983628377318382 2023-01-24 
02:12:21.468139: step: 688/466, loss: 0.13050445914268494 2023-01-24 02:12:22.109925: step: 690/466, loss: 0.21144269406795502 2023-01-24 02:12:22.772785: step: 692/466, loss: 2.6851344108581543 2023-01-24 02:12:23.433501: step: 694/466, loss: 0.06771936267614365 2023-01-24 02:12:24.060262: step: 696/466, loss: 0.15440306067466736 2023-01-24 02:12:24.672643: step: 698/466, loss: 0.11375857144594193 2023-01-24 02:12:25.312457: step: 700/466, loss: 0.16460925340652466 2023-01-24 02:12:25.902519: step: 702/466, loss: 0.16876919567584991 2023-01-24 02:12:26.509191: step: 704/466, loss: 0.054315388202667236 2023-01-24 02:12:27.139534: step: 706/466, loss: 0.13007497787475586 2023-01-24 02:12:27.807135: step: 708/466, loss: 0.4762824475765228 2023-01-24 02:12:28.420258: step: 710/466, loss: 0.3519395589828491 2023-01-24 02:12:29.102562: step: 712/466, loss: 0.06498491764068604 2023-01-24 02:12:29.663732: step: 714/466, loss: 0.10465317219495773 2023-01-24 02:12:30.318137: step: 716/466, loss: 0.10128390043973923 2023-01-24 02:12:30.988028: step: 718/466, loss: 0.27678248286247253 2023-01-24 02:12:31.557616: step: 720/466, loss: 0.1347552388906479 2023-01-24 02:12:32.239954: step: 722/466, loss: 0.3467319905757904 2023-01-24 02:12:32.866515: step: 724/466, loss: 0.6930481791496277 2023-01-24 02:12:33.446199: step: 726/466, loss: 0.07042728364467621 2023-01-24 02:12:34.074490: step: 728/466, loss: 0.3940177857875824 2023-01-24 02:12:34.724104: step: 730/466, loss: 0.2059410810470581 2023-01-24 02:12:35.308288: step: 732/466, loss: 0.0927201509475708 2023-01-24 02:12:35.938974: step: 734/466, loss: 0.49056684970855713 2023-01-24 02:12:36.601941: step: 736/466, loss: 0.2960192561149597 2023-01-24 02:12:37.211026: step: 738/466, loss: 0.22611430287361145 2023-01-24 02:12:37.851603: step: 740/466, loss: 0.054231416434049606 2023-01-24 02:12:38.500832: step: 742/466, loss: 0.16621573269367218 2023-01-24 02:12:39.141944: step: 744/466, loss: 0.10313557088375092 2023-01-24 
02:12:39.706784: step: 746/466, loss: 0.17337164282798767 2023-01-24 02:12:40.309041: step: 748/466, loss: 0.04055658355355263 2023-01-24 02:12:41.008465: step: 750/466, loss: 0.11937404423952103 2023-01-24 02:12:41.579425: step: 752/466, loss: 0.10778648406267166 2023-01-24 02:12:42.172559: step: 754/466, loss: 0.10828586667776108 2023-01-24 02:12:42.855428: step: 756/466, loss: 0.22498983144760132 2023-01-24 02:12:43.484868: step: 758/466, loss: 0.22420690953731537 2023-01-24 02:12:44.158085: step: 760/466, loss: 0.13945281505584717 2023-01-24 02:12:44.864155: step: 762/466, loss: 0.16044212877750397 2023-01-24 02:12:45.545466: step: 764/466, loss: 0.20257921516895294 2023-01-24 02:12:46.158385: step: 766/466, loss: 0.16559775173664093 2023-01-24 02:12:46.772711: step: 768/466, loss: 0.13508321344852448 2023-01-24 02:12:47.433722: step: 770/466, loss: 0.22314563393592834 2023-01-24 02:12:48.089181: step: 772/466, loss: 1.148383617401123 2023-01-24 02:12:48.681442: step: 774/466, loss: 0.13575758039951324 2023-01-24 02:12:49.321572: step: 776/466, loss: 0.15329691767692566 2023-01-24 02:12:49.943450: step: 778/466, loss: 0.15127770602703094 2023-01-24 02:12:50.559098: step: 780/466, loss: 0.26028984785079956 2023-01-24 02:12:51.187540: step: 782/466, loss: 0.16642004251480103 2023-01-24 02:12:51.810677: step: 784/466, loss: 0.0878530815243721 2023-01-24 02:12:52.449959: step: 786/466, loss: 0.13579607009887695 2023-01-24 02:12:53.091668: step: 788/466, loss: 0.1477285623550415 2023-01-24 02:12:53.769248: step: 790/466, loss: 0.3247869908809662 2023-01-24 02:12:54.394655: step: 792/466, loss: 0.11100728064775467 2023-01-24 02:12:55.096208: step: 794/466, loss: 0.4956207275390625 2023-01-24 02:12:55.713759: step: 796/466, loss: 0.18071597814559937 2023-01-24 02:12:56.285678: step: 798/466, loss: 0.08764049410820007 2023-01-24 02:12:56.931631: step: 800/466, loss: 0.09589579701423645 2023-01-24 02:12:57.537060: step: 802/466, loss: 0.2966838479042053 2023-01-24 
02:12:58.109529: step: 804/466, loss: 0.10895594954490662 2023-01-24 02:12:58.735532: step: 806/466, loss: 0.13455626368522644 2023-01-24 02:12:59.336538: step: 808/466, loss: 0.027952078729867935 2023-01-24 02:12:59.982928: step: 810/466, loss: 0.16448421776294708 2023-01-24 02:13:00.560781: step: 812/466, loss: 0.4903505742549896 2023-01-24 02:13:01.172239: step: 814/466, loss: 0.04348333925008774 2023-01-24 02:13:01.828258: step: 816/466, loss: 0.08801747858524323 2023-01-24 02:13:02.458393: step: 818/466, loss: 0.499617338180542 2023-01-24 02:13:03.077874: step: 820/466, loss: 0.11657022684812546 2023-01-24 02:13:03.751918: step: 822/466, loss: 0.09049463272094727 2023-01-24 02:13:04.332198: step: 824/466, loss: 0.10426073521375656 2023-01-24 02:13:04.993870: step: 826/466, loss: 0.07946509122848511 2023-01-24 02:13:05.556143: step: 828/466, loss: 0.30058789253234863 2023-01-24 02:13:06.210395: step: 830/466, loss: 0.22830384969711304 2023-01-24 02:13:06.883226: step: 832/466, loss: 0.16114521026611328 2023-01-24 02:13:07.525672: step: 834/466, loss: 0.11515027284622192 2023-01-24 02:13:08.125722: step: 836/466, loss: 0.2690531611442566 2023-01-24 02:13:08.791894: step: 838/466, loss: 0.1222851350903511 2023-01-24 02:13:09.386943: step: 840/466, loss: 0.10726886987686157 2023-01-24 02:13:09.984629: step: 842/466, loss: 0.16912440955638885 2023-01-24 02:13:10.638292: step: 844/466, loss: 0.09197650849819183 2023-01-24 02:13:11.273572: step: 846/466, loss: 0.10261756926774979 2023-01-24 02:13:11.845279: step: 848/466, loss: 0.20265816152095795 2023-01-24 02:13:12.458975: step: 850/466, loss: 0.10778229683637619 2023-01-24 02:13:13.084016: step: 852/466, loss: 0.05235063657164574 2023-01-24 02:13:13.722223: step: 854/466, loss: 0.3888838589191437 2023-01-24 02:13:14.362214: step: 856/466, loss: 0.20329241454601288 2023-01-24 02:13:15.005470: step: 858/466, loss: 0.09305126219987869 2023-01-24 02:13:15.729912: step: 860/466, loss: 0.13486655056476593 2023-01-24 
02:13:16.339922: step: 862/466, loss: 0.19808350503444672 2023-01-24 02:13:16.968698: step: 864/466, loss: 0.18498556315898895 2023-01-24 02:13:17.562929: step: 866/466, loss: 0.10210976004600525 2023-01-24 02:13:18.176106: step: 868/466, loss: 0.5273813605308533 2023-01-24 02:13:18.827176: step: 870/466, loss: 0.12989220023155212 2023-01-24 02:13:19.401437: step: 872/466, loss: 0.2621801793575287 2023-01-24 02:13:20.066964: step: 874/466, loss: 0.8328834176063538 2023-01-24 02:13:20.733396: step: 876/466, loss: 0.35489052534103394 2023-01-24 02:13:21.317175: step: 878/466, loss: 0.40134337544441223 2023-01-24 02:13:21.919822: step: 880/466, loss: 0.15391066670417786 2023-01-24 02:13:22.498034: step: 882/466, loss: 0.11246410757303238 2023-01-24 02:13:23.204162: step: 884/466, loss: 0.13483500480651855 2023-01-24 02:13:23.844906: step: 886/466, loss: 0.10536754876375198 2023-01-24 02:13:24.442810: step: 888/466, loss: 0.03027348406612873 2023-01-24 02:13:25.077779: step: 890/466, loss: 0.08882451057434082 2023-01-24 02:13:25.664670: step: 892/466, loss: 0.43937447667121887 2023-01-24 02:13:26.313682: step: 894/466, loss: 0.3225778639316559 2023-01-24 02:13:27.040940: step: 896/466, loss: 0.09378521889448166 2023-01-24 02:13:27.619191: step: 898/466, loss: 0.15521328151226044 2023-01-24 02:13:28.265032: step: 900/466, loss: 0.06254531443119049 2023-01-24 02:13:28.907701: step: 902/466, loss: 0.07430645078420639 2023-01-24 02:13:29.446956: step: 904/466, loss: 0.615398108959198 2023-01-24 02:13:30.030347: step: 906/466, loss: 0.3123369812965393 2023-01-24 02:13:30.704649: step: 908/466, loss: 0.4180082082748413 2023-01-24 02:13:31.307277: step: 910/466, loss: 1.0601695775985718 2023-01-24 02:13:31.922267: step: 912/466, loss: 0.09402686357498169 2023-01-24 02:13:32.589809: step: 914/466, loss: 0.029817946255207062 2023-01-24 02:13:33.210355: step: 916/466, loss: 0.24438625574111938 2023-01-24 02:13:33.888007: step: 918/466, loss: 0.07211785018444061 2023-01-24 
02:13:34.495510: step: 920/466, loss: 0.06276479363441467 2023-01-24 02:13:35.107514: step: 922/466, loss: 0.11545450985431671 2023-01-24 02:13:35.689408: step: 924/466, loss: 0.0142361456528306 2023-01-24 02:13:36.283760: step: 926/466, loss: 0.1268867552280426 2023-01-24 02:13:36.878162: step: 928/466, loss: 0.1386023461818695 2023-01-24 02:13:37.520917: step: 930/466, loss: 0.8721864223480225 2023-01-24 02:13:38.152384: step: 932/466, loss: 0.12333185970783234 ================================================== Loss: 0.274 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3436604862251414, 'r': 0.3403999503027017, 'f1': 0.34202244768260015}, 'combined': 0.2520165403977054, 'epoch': 14} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3899547166460982, 'r': 0.30697127348438524, 'f1': 0.34352254806190646}, 'combined': 0.22782842565763742, 'epoch': 14} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34310876623376624, 'r': 0.2859239718614719, 'f1': 0.3119170602125148}, 'combined': 0.2079447068083432, 'epoch': 14} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3962562007791941, 'r': 0.28118093417434603, 'f1': 0.32894473290163634}, 'combined': 0.21467972042001526, 'epoch': 14} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3401493547643738, 'r': 0.33950391006842623, 'f1': 0.3398263259374371}, 'combined': 0.2503983454275852, 'epoch': 14} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3758255981541577, 'r': 0.2920200628699981, 'f1': 0.3286646038332566}, 'combined': 0.21797445228319604, 'epoch': 14} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.31752873563218387, 'r': 0.26309523809523805, 'f1': 0.28776041666666663}, 'combined': 0.19184027777777773, 'epoch': 14} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.45588235294117646, 'r': 0.33695652173913043, 'f1': 0.3875}, 'combined': 0.2583333333333333, 'epoch': 14} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36363636363636365, 'r': 0.13793103448275862, 'f1': 0.2}, 'combined': 0.13333333333333333, 'epoch': 14} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 15 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 02:16:08.085816: step: 2/466, loss: 0.435541033744812 2023-01-24 02:16:08.693948: step: 4/466, loss: 0.03687071055173874 2023-01-24 02:16:09.263297: step: 6/466, loss: 0.327267587184906 2023-01-24 02:16:09.895673: step: 8/466, loss: 0.3533381223678589 2023-01-24 02:16:10.525887: step: 10/466, loss: 0.03968863934278488 2023-01-24 02:16:11.118873: step: 12/466, loss: 0.07497228682041168 2023-01-24 02:16:11.805845: step: 14/466, loss: 0.05809565261006355 2023-01-24 02:16:12.552787: step: 16/466, loss: 0.18471305072307587 2023-01-24 02:16:13.192887: step: 18/466, loss: 1.8211289644241333 2023-01-24 02:16:13.777166: step: 20/466, loss: 0.02774050645530224 2023-01-24 02:16:14.353361: step: 22/466, loss: 0.13910217583179474 2023-01-24 02:16:14.910588: step: 24/466, loss: 0.2877402901649475 2023-01-24 02:16:15.510503: step: 26/466, loss: 0.046039097011089325 2023-01-24 02:16:16.152887: step: 28/466, loss: 0.05408158898353577 2023-01-24 02:16:16.794503: step: 30/466, loss: 0.20368832349777222 2023-01-24 02:16:17.427107: step: 32/466, loss: 0.05387134850025177 2023-01-24 02:16:18.043304: step: 
34/466, loss: 0.06808287650346756 2023-01-24 02:16:18.649172: step: 36/466, loss: 0.22698763012886047 2023-01-24 02:16:19.252768: step: 38/466, loss: 0.13414643704891205 2023-01-24 02:16:19.835265: step: 40/466, loss: 0.017022065818309784 2023-01-24 02:16:20.448652: step: 42/466, loss: 0.11883985996246338 2023-01-24 02:16:21.090918: step: 44/466, loss: 0.1732538789510727 2023-01-24 02:16:21.705586: step: 46/466, loss: 0.13217847049236298 2023-01-24 02:16:22.343886: step: 48/466, loss: 0.10473383218050003 2023-01-24 02:16:22.945658: step: 50/466, loss: 0.07943100482225418 2023-01-24 02:16:23.578277: step: 52/466, loss: 0.1259610503911972 2023-01-24 02:16:24.189567: step: 54/466, loss: 0.0526583306491375 2023-01-24 02:16:24.798397: step: 56/466, loss: 0.04471132531762123 2023-01-24 02:16:25.428510: step: 58/466, loss: 0.33938050270080566 2023-01-24 02:16:26.062306: step: 60/466, loss: 0.1816110759973526 2023-01-24 02:16:26.660395: step: 62/466, loss: 0.058411043137311935 2023-01-24 02:16:27.216723: step: 64/466, loss: 0.08439125120639801 2023-01-24 02:16:27.854460: step: 66/466, loss: 0.017518168315291405 2023-01-24 02:16:28.482579: step: 68/466, loss: 0.05559399351477623 2023-01-24 02:16:29.168821: step: 70/466, loss: 0.8502869606018066 2023-01-24 02:16:29.815278: step: 72/466, loss: 0.1861373633146286 2023-01-24 02:16:30.465816: step: 74/466, loss: 0.8699095249176025 2023-01-24 02:16:31.105952: step: 76/466, loss: 0.059313587844371796 2023-01-24 02:16:31.777219: step: 78/466, loss: 0.17662163078784943 2023-01-24 02:16:32.400796: step: 80/466, loss: 0.031233610585331917 2023-01-24 02:16:33.025627: step: 82/466, loss: 0.1937914341688156 2023-01-24 02:16:33.694197: step: 84/466, loss: 0.10098868608474731 2023-01-24 02:16:34.295194: step: 86/466, loss: 0.07239064574241638 2023-01-24 02:16:34.925232: step: 88/466, loss: 1.1043068170547485 2023-01-24 02:16:35.557186: step: 90/466, loss: 0.11017554253339767 2023-01-24 02:16:36.179452: step: 92/466, loss: 
0.053812894970178604 2023-01-24 02:16:36.786047: step: 94/466, loss: 0.060267288237810135 2023-01-24 02:16:37.397343: step: 96/466, loss: 0.268286794424057 2023-01-24 02:16:38.080648: step: 98/466, loss: 0.08513689786195755 2023-01-24 02:16:38.685448: step: 100/466, loss: 0.1262291818857193 2023-01-24 02:16:39.245139: step: 102/466, loss: 0.06185045465826988 2023-01-24 02:16:39.899208: step: 104/466, loss: 0.14014405012130737 2023-01-24 02:16:40.527114: step: 106/466, loss: 0.5968447327613831 2023-01-24 02:16:41.202497: step: 108/466, loss: 0.10632877796888351 2023-01-24 02:16:41.876356: step: 110/466, loss: 0.11529248207807541 2023-01-24 02:16:42.445902: step: 112/466, loss: 0.021922091022133827 2023-01-24 02:16:43.074048: step: 114/466, loss: 0.30956166982650757 2023-01-24 02:16:43.717241: step: 116/466, loss: 0.07864983379840851 2023-01-24 02:16:44.345984: step: 118/466, loss: 0.08320973068475723 2023-01-24 02:16:45.018351: step: 120/466, loss: 0.09572584927082062 2023-01-24 02:16:45.666880: step: 122/466, loss: 0.05570907145738602 2023-01-24 02:16:46.311507: step: 124/466, loss: 0.250662624835968 2023-01-24 02:16:46.985943: step: 126/466, loss: 0.0643407478928566 2023-01-24 02:16:47.601952: step: 128/466, loss: 0.10747619718313217 2023-01-24 02:16:48.256418: step: 130/466, loss: 0.06883567571640015 2023-01-24 02:16:48.898171: step: 132/466, loss: 0.06860269606113434 2023-01-24 02:16:49.524831: step: 134/466, loss: 0.09205099195241928 2023-01-24 02:16:50.235987: step: 136/466, loss: 0.30829140543937683 2023-01-24 02:16:50.803656: step: 138/466, loss: 1.0106439590454102 2023-01-24 02:16:51.471622: step: 140/466, loss: 0.09725064039230347 2023-01-24 02:16:52.105881: step: 142/466, loss: 0.08082885295152664 2023-01-24 02:16:52.737045: step: 144/466, loss: 0.14608044922351837 2023-01-24 02:16:53.446228: step: 146/466, loss: 0.2036202996969223 2023-01-24 02:16:54.063068: step: 148/466, loss: 0.03470924124121666 2023-01-24 02:16:54.663345: step: 150/466, loss: 
0.13530747592449188 2023-01-24 02:16:55.266842: step: 152/466, loss: 0.15470536053180695 2023-01-24 02:16:55.925154: step: 154/466, loss: 0.2205692082643509 2023-01-24 02:16:56.621601: step: 156/466, loss: 1.0972814559936523 2023-01-24 02:16:57.237278: step: 158/466, loss: 0.16078124940395355 2023-01-24 02:16:57.883750: step: 160/466, loss: 0.15565645694732666 2023-01-24 02:16:58.488073: step: 162/466, loss: 0.03340250253677368 2023-01-24 02:16:59.181056: step: 164/466, loss: 0.08437010645866394 2023-01-24 02:16:59.807977: step: 166/466, loss: 0.3232613801956177 2023-01-24 02:17:00.397027: step: 168/466, loss: 0.4667429029941559 2023-01-24 02:17:01.041434: step: 170/466, loss: 0.09632215648889542 2023-01-24 02:17:01.675160: step: 172/466, loss: 0.038387689739465714 2023-01-24 02:17:02.294554: step: 174/466, loss: 0.19090475142002106 2023-01-24 02:17:02.940317: step: 176/466, loss: 0.08887051045894623 2023-01-24 02:17:03.617327: step: 178/466, loss: 0.45436954498291016 2023-01-24 02:17:04.211190: step: 180/466, loss: 0.7342172265052795 2023-01-24 02:17:04.834653: step: 182/466, loss: 0.08003537356853485 2023-01-24 02:17:05.435286: step: 184/466, loss: 0.10012319684028625 2023-01-24 02:17:06.065482: step: 186/466, loss: 0.06299827247858047 2023-01-24 02:17:06.747850: step: 188/466, loss: 0.19645850360393524 2023-01-24 02:17:07.417236: step: 190/466, loss: 0.08092363178730011 2023-01-24 02:17:08.073996: step: 192/466, loss: 0.19707578420639038 2023-01-24 02:17:08.693950: step: 194/466, loss: 0.06013033911585808 2023-01-24 02:17:09.317683: step: 196/466, loss: 0.08408872038125992 2023-01-24 02:17:09.910316: step: 198/466, loss: 0.3458172678947449 2023-01-24 02:17:10.609244: step: 200/466, loss: 1.0924654006958008 2023-01-24 02:17:11.223273: step: 202/466, loss: 0.33933231234550476 2023-01-24 02:17:11.883162: step: 204/466, loss: 0.0806460827589035 2023-01-24 02:17:12.554663: step: 206/466, loss: 0.10306204855442047 2023-01-24 02:17:13.212831: step: 208/466, loss: 
0.15154699981212616 2023-01-24 02:17:13.940590: step: 210/466, loss: 0.11981084942817688 2023-01-24 02:17:14.566503: step: 212/466, loss: 0.11672055721282959 2023-01-24 02:17:15.194835: step: 214/466, loss: 0.24000589549541473 2023-01-24 02:17:15.842441: step: 216/466, loss: 0.03687505051493645 2023-01-24 02:17:16.427493: step: 218/466, loss: 0.40529903769493103 2023-01-24 02:17:17.071977: step: 220/466, loss: 0.21455496549606323 2023-01-24 02:17:17.717899: step: 222/466, loss: 0.09716780483722687 2023-01-24 02:17:18.353395: step: 224/466, loss: 0.07382888346910477 2023-01-24 02:17:18.992423: step: 226/466, loss: 0.14204277098178864 2023-01-24 02:17:19.569426: step: 228/466, loss: 0.056718990206718445 2023-01-24 02:17:20.164277: step: 230/466, loss: 0.03239859640598297 2023-01-24 02:17:20.779889: step: 232/466, loss: 0.028303178027272224 2023-01-24 02:17:21.455914: step: 234/466, loss: 0.11211122572422028 2023-01-24 02:17:22.116971: step: 236/466, loss: 1.3105813264846802 2023-01-24 02:17:22.766957: step: 238/466, loss: 0.0535355880856514 2023-01-24 02:17:23.326109: step: 240/466, loss: 0.21491867303848267 2023-01-24 02:17:23.931923: step: 242/466, loss: 0.05640966817736626 2023-01-24 02:17:24.600617: step: 244/466, loss: 0.027308456599712372 2023-01-24 02:17:25.190713: step: 246/466, loss: 0.13510803878307343 2023-01-24 02:17:25.774275: step: 248/466, loss: 1.0216366052627563 2023-01-24 02:17:26.364288: step: 250/466, loss: 0.15312233567237854 2023-01-24 02:17:27.005216: step: 252/466, loss: 0.22686350345611572 2023-01-24 02:17:27.605807: step: 254/466, loss: 0.14514611661434174 2023-01-24 02:17:28.311597: step: 256/466, loss: 0.0583498515188694 2023-01-24 02:17:28.974460: step: 258/466, loss: 0.05628177151083946 2023-01-24 02:17:29.625863: step: 260/466, loss: 0.11870299279689789 2023-01-24 02:17:30.279852: step: 262/466, loss: 0.07947038859128952 2023-01-24 02:17:30.925523: step: 264/466, loss: 0.03605325520038605 2023-01-24 02:17:31.563269: step: 266/466, loss: 
0.11093839257955551 2023-01-24 02:17:32.189940: step: 268/466, loss: 0.26418861746788025 2023-01-24 02:17:32.855422: step: 270/466, loss: 0.03737075999379158 2023-01-24 02:17:33.469163: step: 272/466, loss: 0.16607210040092468 2023-01-24 02:17:34.067937: step: 274/466, loss: 0.16027934849262238 2023-01-24 02:17:34.686075: step: 276/466, loss: 0.09926323592662811 2023-01-24 02:17:35.268773: step: 278/466, loss: 0.16546255350112915 2023-01-24 02:17:35.928357: step: 280/466, loss: 0.02960674650967121 2023-01-24 02:17:36.675048: step: 282/466, loss: 0.07635494321584702 2023-01-24 02:17:37.277835: step: 284/466, loss: 0.2444678395986557 2023-01-24 02:17:37.883906: step: 286/466, loss: 0.3181353211402893 2023-01-24 02:17:38.510060: step: 288/466, loss: 0.06947898864746094 2023-01-24 02:17:39.112752: step: 290/466, loss: 0.08936656266450882 2023-01-24 02:17:39.692504: step: 292/466, loss: 0.18234677612781525 2023-01-24 02:17:40.309595: step: 294/466, loss: 0.17520812153816223 2023-01-24 02:17:40.934149: step: 296/466, loss: 0.27067720890045166 2023-01-24 02:17:41.549855: step: 298/466, loss: 0.31783533096313477 2023-01-24 02:17:42.142308: step: 300/466, loss: 0.3244658410549164 2023-01-24 02:17:42.769494: step: 302/466, loss: 0.05142327770590782 2023-01-24 02:17:43.382277: step: 304/466, loss: 0.35001641511917114 2023-01-24 02:17:44.011886: step: 306/466, loss: 0.10582118481397629 2023-01-24 02:17:44.617864: step: 308/466, loss: 0.329561322927475 2023-01-24 02:17:45.270742: step: 310/466, loss: 0.1648482382297516 2023-01-24 02:17:45.945409: step: 312/466, loss: 0.0752813071012497 2023-01-24 02:17:46.607193: step: 314/466, loss: 0.05904865264892578 2023-01-24 02:17:47.175010: step: 316/466, loss: 0.037242304533720016 2023-01-24 02:17:47.770007: step: 318/466, loss: 0.12373863905668259 2023-01-24 02:17:48.417144: step: 320/466, loss: 0.17016130685806274 2023-01-24 02:17:49.017814: step: 322/466, loss: 0.10103777050971985 2023-01-24 02:17:49.639265: step: 324/466, loss: 
0.33198267221450806 2023-01-24 02:17:50.293113: step: 326/466, loss: 0.10659413784742355 2023-01-24 02:17:50.918578: step: 328/466, loss: 0.041212305426597595 2023-01-24 02:17:51.494257: step: 330/466, loss: 0.13769038021564484 2023-01-24 02:17:52.068216: step: 332/466, loss: 0.1511014699935913 2023-01-24 02:17:52.672505: step: 334/466, loss: 0.02524626813828945 2023-01-24 02:17:53.271943: step: 336/466, loss: 0.17625188827514648 2023-01-24 02:17:53.862079: step: 338/466, loss: 0.05785483866930008 2023-01-24 02:17:54.443776: step: 340/466, loss: 0.036739859730005264 2023-01-24 02:17:55.112511: step: 342/466, loss: 0.05239155888557434 2023-01-24 02:17:55.758507: step: 344/466, loss: 0.15496668219566345 2023-01-24 02:17:56.371370: step: 346/466, loss: 0.19038543105125427 2023-01-24 02:17:56.966679: step: 348/466, loss: 0.08914028108119965 2023-01-24 02:17:57.619563: step: 350/466, loss: 0.1829203963279724 2023-01-24 02:17:58.225250: step: 352/466, loss: 0.0923493281006813 2023-01-24 02:17:58.918247: step: 354/466, loss: 0.11299723386764526 2023-01-24 02:17:59.498293: step: 356/466, loss: 0.0236994419246912 2023-01-24 02:18:00.099642: step: 358/466, loss: 0.40911865234375 2023-01-24 02:18:00.758959: step: 360/466, loss: 0.6588344573974609 2023-01-24 02:18:01.435729: step: 362/466, loss: 0.12048368155956268 2023-01-24 02:18:02.119736: step: 364/466, loss: 0.2735106348991394 2023-01-24 02:18:02.774596: step: 366/466, loss: 0.18617478013038635 2023-01-24 02:18:03.363796: step: 368/466, loss: 0.048725713044404984 2023-01-24 02:18:04.052835: step: 370/466, loss: 0.11221233755350113 2023-01-24 02:18:04.738410: step: 372/466, loss: 0.04269242659211159 2023-01-24 02:18:05.334596: step: 374/466, loss: 0.05626671761274338 2023-01-24 02:18:05.967088: step: 376/466, loss: 0.07424618303775787 2023-01-24 02:18:06.564290: step: 378/466, loss: 0.07768986374139786 2023-01-24 02:18:07.316375: step: 380/466, loss: 0.04393366351723671 2023-01-24 02:18:07.974429: step: 382/466, loss: 
0.09401178359985352 2023-01-24 02:18:08.591547: step: 384/466, loss: 0.04772252216935158 2023-01-24 02:18:09.226594: step: 386/466, loss: 0.1341552436351776 2023-01-24 02:18:09.938069: step: 388/466, loss: 0.06578285992145538 2023-01-24 02:18:10.599297: step: 390/466, loss: 0.04696401581168175 2023-01-24 02:18:11.314473: step: 392/466, loss: 0.15234379470348358 2023-01-24 02:18:12.022792: step: 394/466, loss: 0.06454946845769882 2023-01-24 02:18:12.702372: step: 396/466, loss: 0.032967355102300644 2023-01-24 02:18:13.272461: step: 398/466, loss: 0.12278541922569275 2023-01-24 02:18:13.905583: step: 400/466, loss: 0.3646247088909149 2023-01-24 02:18:14.554697: step: 402/466, loss: 0.39636847376823425 2023-01-24 02:18:15.177745: step: 404/466, loss: 0.2703320384025574 2023-01-24 02:18:15.750005: step: 406/466, loss: 0.17627400159835815 2023-01-24 02:18:16.405255: step: 408/466, loss: 0.35617542266845703 2023-01-24 02:18:16.997156: step: 410/466, loss: 0.11145827919244766 2023-01-24 02:18:17.627617: step: 412/466, loss: 0.030111731961369514 2023-01-24 02:18:18.218557: step: 414/466, loss: 0.028507882729172707 2023-01-24 02:18:18.891540: step: 416/466, loss: 0.03310058265924454 2023-01-24 02:18:19.515319: step: 418/466, loss: 0.11942702531814575 2023-01-24 02:18:20.083313: step: 420/466, loss: 0.07002561539411545 2023-01-24 02:18:20.750441: step: 422/466, loss: 0.1809057742357254 2023-01-24 02:18:21.317632: step: 424/466, loss: 0.048225123435258865 2023-01-24 02:18:21.946431: step: 426/466, loss: 0.272867888212204 2023-01-24 02:18:22.528922: step: 428/466, loss: 0.049407534301280975 2023-01-24 02:18:23.222517: step: 430/466, loss: 0.06726373732089996 2023-01-24 02:18:23.951646: step: 432/466, loss: 0.19764664769172668 2023-01-24 02:18:24.579745: step: 434/466, loss: 0.05364568531513214 2023-01-24 02:18:25.258421: step: 436/466, loss: 0.027864878997206688 2023-01-24 02:18:25.894257: step: 438/466, loss: 0.04443821683526039 2023-01-24 02:18:26.528865: step: 440/466, 
loss: 13.03366470336914 2023-01-24 02:18:27.130803: step: 442/466, loss: 0.06950782239437103 2023-01-24 02:18:27.743360: step: 444/466, loss: 0.48722684383392334 2023-01-24 02:18:28.347335: step: 446/466, loss: 0.1150822639465332 2023-01-24 02:18:29.024440: step: 448/466, loss: 0.33581680059432983 2023-01-24 02:18:29.613600: step: 450/466, loss: 0.09013377875089645 2023-01-24 02:18:30.239821: step: 452/466, loss: 0.031149625778198242 2023-01-24 02:18:30.893826: step: 454/466, loss: 0.2107708752155304 2023-01-24 02:18:31.564680: step: 456/466, loss: 0.06587281823158264 2023-01-24 02:18:32.109487: step: 458/466, loss: 0.03040958382189274 2023-01-24 02:18:32.726439: step: 460/466, loss: 0.6418898701667786 2023-01-24 02:18:33.339252: step: 462/466, loss: 0.10450328886508942 2023-01-24 02:18:33.957033: step: 464/466, loss: 0.20812921226024628 2023-01-24 02:18:34.544157: step: 466/466, loss: 0.015257641673088074 2023-01-24 02:18:35.122618: step: 468/466, loss: 0.03937196731567383 2023-01-24 02:18:35.683833: step: 470/466, loss: 0.07607219368219376 2023-01-24 02:18:36.295898: step: 472/466, loss: 0.10207568109035492 2023-01-24 02:18:36.889511: step: 474/466, loss: 0.09679700434207916 2023-01-24 02:18:37.523584: step: 476/466, loss: 0.1701655387878418 2023-01-24 02:18:38.172169: step: 478/466, loss: 0.10100958496332169 2023-01-24 02:18:38.781192: step: 480/466, loss: 0.04442289471626282 2023-01-24 02:18:39.397887: step: 482/466, loss: 0.21667301654815674 2023-01-24 02:18:40.039303: step: 484/466, loss: 0.059234533458948135 2023-01-24 02:18:40.732942: step: 486/466, loss: 0.3126879930496216 2023-01-24 02:18:41.291479: step: 488/466, loss: 0.02259223349392414 2023-01-24 02:18:41.820336: step: 490/466, loss: 0.06512950360774994 2023-01-24 02:18:42.384389: step: 492/466, loss: 0.06362253427505493 2023-01-24 02:18:42.955845: step: 494/466, loss: 0.016480471938848495 2023-01-24 02:18:43.549362: step: 496/466, loss: 0.07561665773391724 2023-01-24 02:18:44.136860: step: 498/466, 
loss: 0.03929918259382248 2023-01-24 02:18:44.754860: step: 500/466, loss: 0.19528630375862122 2023-01-24 02:18:45.403923: step: 502/466, loss: 0.10272553563117981 2023-01-24 02:18:46.035376: step: 504/466, loss: 0.2748989462852478 2023-01-24 02:18:46.562189: step: 506/466, loss: 0.04124134033918381 2023-01-24 02:18:47.217168: step: 508/466, loss: 0.0517859160900116 2023-01-24 02:18:47.809298: step: 510/466, loss: 0.09681697189807892 2023-01-24 02:18:48.424750: step: 512/466, loss: 0.10827016830444336 2023-01-24 02:18:49.111899: step: 514/466, loss: 0.9610345959663391 2023-01-24 02:18:49.711691: step: 516/466, loss: 0.2481914609670639 2023-01-24 02:18:50.369027: step: 518/466, loss: 0.1883612424135208 2023-01-24 02:18:51.004047: step: 520/466, loss: 0.08209282904863358 2023-01-24 02:18:51.645105: step: 522/466, loss: 0.08709144592285156 2023-01-24 02:18:52.259438: step: 524/466, loss: 0.07596763968467712 2023-01-24 02:18:52.870879: step: 526/466, loss: 0.35548317432403564 2023-01-24 02:18:53.462181: step: 528/466, loss: 0.03592758625745773 2023-01-24 02:18:54.065520: step: 530/466, loss: 0.07017184793949127 2023-01-24 02:18:54.655166: step: 532/466, loss: 0.06274805963039398 2023-01-24 02:18:55.272746: step: 534/466, loss: 0.019063686951994896 2023-01-24 02:18:55.872720: step: 536/466, loss: 0.08310821652412415 2023-01-24 02:18:56.483060: step: 538/466, loss: 0.11306414008140564 2023-01-24 02:18:57.112585: step: 540/466, loss: 0.11510761827230453 2023-01-24 02:18:57.746417: step: 542/466, loss: 0.12245524674654007 2023-01-24 02:18:58.408830: step: 544/466, loss: 0.08942381292581558 2023-01-24 02:18:59.023300: step: 546/466, loss: 0.9175434708595276 2023-01-24 02:18:59.674893: step: 548/466, loss: 0.03244870528578758 2023-01-24 02:19:00.325038: step: 550/466, loss: 0.03925330191850662 2023-01-24 02:19:00.920071: step: 552/466, loss: 0.0484529547393322 2023-01-24 02:19:01.547855: step: 554/466, loss: 0.2532956898212433 2023-01-24 02:19:02.144134: step: 556/466, loss: 
0.044494111090898514 2023-01-24 02:19:02.769399: step: 558/466, loss: 0.42364785075187683 2023-01-24 02:19:03.461354: step: 560/466, loss: 0.19850437343120575 2023-01-24 02:19:04.061046: step: 562/466, loss: 0.1446579545736313 2023-01-24 02:19:04.808127: step: 564/466, loss: 0.14363546669483185 2023-01-24 02:19:05.414832: step: 566/466, loss: 0.10299845039844513 2023-01-24 02:19:06.053421: step: 568/466, loss: 0.1500239372253418 2023-01-24 02:19:06.677885: step: 570/466, loss: 0.06343446671962738 2023-01-24 02:19:07.372303: step: 572/466, loss: 0.13200996816158295 2023-01-24 02:19:08.033612: step: 574/466, loss: 0.0875290259718895 2023-01-24 02:19:08.625756: step: 576/466, loss: 0.01447082869708538 2023-01-24 02:19:09.301559: step: 578/466, loss: 0.0954570472240448 2023-01-24 02:19:09.876359: step: 580/466, loss: 0.08869298547506332 2023-01-24 02:19:10.470638: step: 582/466, loss: 0.0968579426407814 2023-01-24 02:19:11.144107: step: 584/466, loss: 0.3615601062774658 2023-01-24 02:19:11.790041: step: 586/466, loss: 0.1191064789891243 2023-01-24 02:19:12.410399: step: 588/466, loss: 0.059348151087760925 2023-01-24 02:19:13.232171: step: 590/466, loss: 0.11688205599784851 2023-01-24 02:19:13.821635: step: 592/466, loss: 0.0417327918112278 2023-01-24 02:19:14.393541: step: 594/466, loss: 0.023670373484492302 2023-01-24 02:19:15.036162: step: 596/466, loss: 0.32675257325172424 2023-01-24 02:19:15.696108: step: 598/466, loss: 0.1834038347005844 2023-01-24 02:19:16.335709: step: 600/466, loss: 0.08239725977182388 2023-01-24 02:19:16.992895: step: 602/466, loss: 0.112456314265728 2023-01-24 02:19:17.575882: step: 604/466, loss: 0.10530654340982437 2023-01-24 02:19:18.220039: step: 606/466, loss: 0.09844251722097397 2023-01-24 02:19:18.847055: step: 608/466, loss: 0.2826271951198578 2023-01-24 02:19:19.481718: step: 610/466, loss: 0.20284876227378845 2023-01-24 02:19:20.122968: step: 612/466, loss: 0.0901341587305069 2023-01-24 02:19:20.785174: step: 614/466, loss: 
0.18216392397880554 2023-01-24 02:19:21.425165: step: 616/466, loss: 0.12945237755775452 2023-01-24 02:19:22.051686: step: 618/466, loss: 0.04815996438264847 2023-01-24 02:19:22.648096: step: 620/466, loss: 0.09071626514196396 2023-01-24 02:19:23.285222: step: 622/466, loss: 0.15761756896972656 2023-01-24 02:19:23.883189: step: 624/466, loss: 0.061441466212272644 2023-01-24 02:19:24.475325: step: 626/466, loss: 0.32481932640075684 2023-01-24 02:19:25.060136: step: 628/466, loss: 0.05131962150335312 2023-01-24 02:19:25.645502: step: 630/466, loss: 0.12522053718566895 2023-01-24 02:19:26.229673: step: 632/466, loss: 0.08636586368083954 2023-01-24 02:19:26.886010: step: 634/466, loss: 0.05276182293891907 2023-01-24 02:19:27.488290: step: 636/466, loss: 0.12636514008045197 2023-01-24 02:19:28.091310: step: 638/466, loss: 0.6053150296211243 2023-01-24 02:19:28.734344: step: 640/466, loss: 0.11609852313995361 2023-01-24 02:19:29.337730: step: 642/466, loss: 0.02773885801434517 2023-01-24 02:19:29.953824: step: 644/466, loss: 0.08347609639167786 2023-01-24 02:19:30.565377: step: 646/466, loss: 0.13086169958114624 2023-01-24 02:19:31.193368: step: 648/466, loss: 0.22208954393863678 2023-01-24 02:19:31.866143: step: 650/466, loss: 0.0469534769654274 2023-01-24 02:19:32.514281: step: 652/466, loss: 0.09385628253221512 2023-01-24 02:19:33.144546: step: 654/466, loss: 0.05745408311486244 2023-01-24 02:19:33.820066: step: 656/466, loss: 0.20931097865104675 2023-01-24 02:19:34.474377: step: 658/466, loss: 0.5002903342247009 2023-01-24 02:19:35.118523: step: 660/466, loss: 0.08020833879709244 2023-01-24 02:19:35.765892: step: 662/466, loss: 0.3154780864715576 2023-01-24 02:19:36.392178: step: 664/466, loss: 0.4076489210128784 2023-01-24 02:19:37.051370: step: 666/466, loss: 0.16374658048152924 2023-01-24 02:19:37.615906: step: 668/466, loss: 0.04053359851241112 2023-01-24 02:19:38.237436: step: 670/466, loss: 0.18959200382232666 2023-01-24 02:19:38.792676: step: 672/466, loss: 
0.03634379059076309 2023-01-24 02:19:39.448590: step: 674/466, loss: 0.11083564907312393 2023-01-24 02:19:40.041957: step: 676/466, loss: 0.11092586815357208 2023-01-24 02:19:40.674293: step: 678/466, loss: 0.16896851360797882 2023-01-24 02:19:41.266387: step: 680/466, loss: 1.5057514905929565 2023-01-24 02:19:41.821250: step: 682/466, loss: 0.20026735961437225 2023-01-24 02:19:42.428995: step: 684/466, loss: 0.21140651404857635 2023-01-24 02:19:43.221492: step: 686/466, loss: 0.02491314709186554 2023-01-24 02:19:43.878835: step: 688/466, loss: 0.5144820809364319 2023-01-24 02:19:44.552120: step: 690/466, loss: 0.06804531812667847 2023-01-24 02:19:45.157839: step: 692/466, loss: 0.07824739068746567 2023-01-24 02:19:45.743673: step: 694/466, loss: 0.041952550411224365 2023-01-24 02:19:46.386027: step: 696/466, loss: 0.08526717871427536 2023-01-24 02:19:47.039120: step: 698/466, loss: 0.05078260973095894 2023-01-24 02:19:47.599556: step: 700/466, loss: 0.1528579443693161 2023-01-24 02:19:48.147968: step: 702/466, loss: 0.03180788829922676 2023-01-24 02:19:48.714712: step: 704/466, loss: 0.08808571100234985 2023-01-24 02:19:49.383323: step: 706/466, loss: 0.046116508543491364 2023-01-24 02:19:49.978463: step: 708/466, loss: 0.058172132819890976 2023-01-24 02:19:50.614467: step: 710/466, loss: 0.07205815613269806 2023-01-24 02:19:51.261338: step: 712/466, loss: 0.09958315640687943 2023-01-24 02:19:51.883285: step: 714/466, loss: 0.08324968814849854 2023-01-24 02:19:52.411303: step: 716/466, loss: 0.06993231922388077 2023-01-24 02:19:53.090572: step: 718/466, loss: 0.15261013805866241 2023-01-24 02:19:53.692105: step: 720/466, loss: 0.11847439408302307 2023-01-24 02:19:54.360533: step: 722/466, loss: 0.6557598114013672 2023-01-24 02:19:54.952385: step: 724/466, loss: 0.13594911992549896 2023-01-24 02:19:55.529152: step: 726/466, loss: 0.12737493216991425 2023-01-24 02:19:56.268524: step: 728/466, loss: 0.09064139425754547 2023-01-24 02:19:56.904895: step: 730/466, loss: 
0.09410417824983597 2023-01-24 02:19:57.561644: step: 732/466, loss: 0.046360161155462265 2023-01-24 02:19:58.226496: step: 734/466, loss: 0.2701513171195984 2023-01-24 02:19:58.860286: step: 736/466, loss: 0.13428397476673126 2023-01-24 02:19:59.545296: step: 738/466, loss: 0.13104593753814697 2023-01-24 02:20:00.143931: step: 740/466, loss: 0.12228985875844955 2023-01-24 02:20:00.767175: step: 742/466, loss: 0.018590405583381653 2023-01-24 02:20:01.308143: step: 744/466, loss: 0.07382070273160934 2023-01-24 02:20:01.934218: step: 746/466, loss: 0.11061914265155792 2023-01-24 02:20:02.598843: step: 748/466, loss: 0.07724656909704208 2023-01-24 02:20:03.203988: step: 750/466, loss: 0.022709783166646957 2023-01-24 02:20:03.857502: step: 752/466, loss: 0.05855545401573181 2023-01-24 02:20:04.511501: step: 754/466, loss: 0.04040146619081497 2023-01-24 02:20:05.155145: step: 756/466, loss: 0.06897459924221039 2023-01-24 02:20:05.837057: step: 758/466, loss: 0.030555035918951035 2023-01-24 02:20:06.468353: step: 760/466, loss: 0.037560224533081055 2023-01-24 02:20:07.119163: step: 762/466, loss: 0.06906607002019882 2023-01-24 02:20:07.788752: step: 764/466, loss: 0.06891036033630371 2023-01-24 02:20:08.407328: step: 766/466, loss: 0.17637237906455994 2023-01-24 02:20:09.005918: step: 768/466, loss: 0.09575371444225311 2023-01-24 02:20:09.648646: step: 770/466, loss: 0.3711543381214142 2023-01-24 02:20:10.236962: step: 772/466, loss: 0.824753999710083 2023-01-24 02:20:10.810918: step: 774/466, loss: 0.8033647537231445 2023-01-24 02:20:11.358231: step: 776/466, loss: 0.06524951756000519 2023-01-24 02:20:11.910021: step: 778/466, loss: 0.2661142647266388 2023-01-24 02:20:12.518616: step: 780/466, loss: 0.09411405026912689 2023-01-24 02:20:13.076651: step: 782/466, loss: 0.1399596482515335 2023-01-24 02:20:13.711325: step: 784/466, loss: 0.19879567623138428 2023-01-24 02:20:14.359475: step: 786/466, loss: 0.2200627326965332 2023-01-24 02:20:14.921263: step: 788/466, loss: 
0.14337505400180817 2023-01-24 02:20:15.536436: step: 790/466, loss: 0.14063367247581482 2023-01-24 02:20:16.162746: step: 792/466, loss: 0.6075316071510315 2023-01-24 02:20:16.808604: step: 794/466, loss: 0.08412862569093704 2023-01-24 02:20:17.395903: step: 796/466, loss: 0.17682573199272156 2023-01-24 02:20:18.037215: step: 798/466, loss: 0.02219691127538681 2023-01-24 02:20:18.700162: step: 800/466, loss: 0.09762563556432724 2023-01-24 02:20:19.362669: step: 802/466, loss: 0.11324504017829895 2023-01-24 02:20:20.026620: step: 804/466, loss: 0.15628759562969208 2023-01-24 02:20:20.682555: step: 806/466, loss: 0.38020768761634827 2023-01-24 02:20:21.337879: step: 808/466, loss: 0.18933938443660736 2023-01-24 02:20:21.927557: step: 810/466, loss: 0.2434525489807129 2023-01-24 02:20:22.564160: step: 812/466, loss: 0.10575389862060547 2023-01-24 02:20:23.184979: step: 814/466, loss: 0.11092200130224228 2023-01-24 02:20:23.840908: step: 816/466, loss: 0.08927656710147858 2023-01-24 02:20:24.411097: step: 818/466, loss: 0.1141166090965271 2023-01-24 02:20:24.999485: step: 820/466, loss: 0.15775534510612488 2023-01-24 02:20:25.567126: step: 822/466, loss: 0.04831439256668091 2023-01-24 02:20:26.193315: step: 824/466, loss: 0.3275830149650574 2023-01-24 02:20:26.863114: step: 826/466, loss: 0.08251907676458359 2023-01-24 02:20:27.466400: step: 828/466, loss: 0.1518041044473648 2023-01-24 02:20:28.044650: step: 830/466, loss: 0.0283180084079504 2023-01-24 02:20:28.614211: step: 832/466, loss: 0.18164034187793732 2023-01-24 02:20:29.217820: step: 834/466, loss: 0.054658275097608566 2023-01-24 02:20:29.821846: step: 836/466, loss: 0.1343078762292862 2023-01-24 02:20:30.509057: step: 838/466, loss: 0.13170497119426727 2023-01-24 02:20:31.110676: step: 840/466, loss: 0.03123355470597744 2023-01-24 02:20:31.804482: step: 842/466, loss: 0.23193548619747162 2023-01-24 02:20:32.495262: step: 844/466, loss: 0.6250280141830444 2023-01-24 02:20:33.054431: step: 846/466, loss: 
0.1379401683807373 2023-01-24 02:20:33.705589: step: 848/466, loss: 0.42695051431655884 2023-01-24 02:20:34.292255: step: 850/466, loss: 0.10238040238618851 2023-01-24 02:20:34.876400: step: 852/466, loss: 0.11693931370973587 2023-01-24 02:20:35.498736: step: 854/466, loss: 0.19939444959163666 2023-01-24 02:20:36.091162: step: 856/466, loss: 0.10059074312448502 2023-01-24 02:20:36.694989: step: 858/466, loss: 0.05090232193470001 2023-01-24 02:20:37.294402: step: 860/466, loss: 0.15322493016719818 2023-01-24 02:20:37.904570: step: 862/466, loss: 0.022669047117233276 2023-01-24 02:20:38.514776: step: 864/466, loss: 0.08482439815998077 2023-01-24 02:20:39.171022: step: 866/466, loss: 0.12503689527511597 2023-01-24 02:20:39.848783: step: 868/466, loss: 0.020080553367733955 2023-01-24 02:20:40.397021: step: 870/466, loss: 0.1167760118842125 2023-01-24 02:20:40.998267: step: 872/466, loss: 0.05984390527009964 2023-01-24 02:20:41.615923: step: 874/466, loss: 0.0961577519774437 2023-01-24 02:20:42.191860: step: 876/466, loss: 0.51129549741745 2023-01-24 02:20:42.884791: step: 878/466, loss: 0.15473543107509613 2023-01-24 02:20:43.498344: step: 880/466, loss: 0.19731540977954865 2023-01-24 02:20:44.168938: step: 882/466, loss: 0.11906936019659042 2023-01-24 02:20:44.808214: step: 884/466, loss: 0.08546533435583115 2023-01-24 02:20:45.407806: step: 886/466, loss: 0.17211639881134033 2023-01-24 02:20:46.031788: step: 888/466, loss: 0.1469283550977707 2023-01-24 02:20:46.641506: step: 890/466, loss: 0.07431633025407791 2023-01-24 02:20:47.281779: step: 892/466, loss: 0.1898496001958847 2023-01-24 02:20:47.922302: step: 894/466, loss: 0.20680542290210724 2023-01-24 02:20:48.487024: step: 896/466, loss: 0.04137062653899193 2023-01-24 02:20:49.078218: step: 898/466, loss: 0.17267300188541412 2023-01-24 02:20:49.687034: step: 900/466, loss: 0.12561747431755066 2023-01-24 02:20:50.273963: step: 902/466, loss: 0.29961609840393066 2023-01-24 02:20:50.885979: step: 904/466, loss: 
0.538173258304596 2023-01-24 02:20:51.510004: step: 906/466, loss: 0.14885465800762177 2023-01-24 02:20:52.129192: step: 908/466, loss: 0.11128829419612885 2023-01-24 02:20:52.785579: step: 910/466, loss: 0.10712137818336487 2023-01-24 02:20:53.439003: step: 912/466, loss: 0.10897429287433624 2023-01-24 02:20:54.040592: step: 914/466, loss: 0.08950541168451309 2023-01-24 02:20:54.665058: step: 916/466, loss: 0.047900937497615814 2023-01-24 02:20:55.318818: step: 918/466, loss: 0.2691085636615753 2023-01-24 02:20:55.950045: step: 920/466, loss: 0.12888646125793457 2023-01-24 02:20:56.616493: step: 922/466, loss: 0.10967149585485458 2023-01-24 02:20:57.241467: step: 924/466, loss: 0.1109466403722763 2023-01-24 02:20:57.885123: step: 926/466, loss: 0.10536369681358337 2023-01-24 02:20:58.557630: step: 928/466, loss: 0.1543402373790741 2023-01-24 02:20:59.247747: step: 930/466, loss: 0.04843086004257202 2023-01-24 02:20:59.843964: step: 932/466, loss: 0.15260881185531616 ================================================== Loss: 0.197 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34829582517938684, 'r': 0.33772137887413034, 'f1': 0.3429271034039821}, 'combined': 0.2526831288239868, 'epoch': 15} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3704030751616487, 'r': 0.3128771445929716, 'f1': 0.339218531883306}, 'combined': 0.22497394860654488, 'epoch': 15} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3391659212880143, 'r': 0.27621467074592077, 'f1': 0.30447045126063915}, 'combined': 0.2029803008404261, 'epoch': 15} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3860668947006251, 'r': 0.29653931308470427, 'f1': 0.3354320850104895}, 'combined': 0.21891357127000363, 'epoch': 15} Dev Russian: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3381293593575992, 'r': 0.33171324248174344, 'f1': 0.3348905723905724}, 'combined': 0.24676147439305335, 'epoch': 15} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35292728046869964, 'r': 0.29720192039469445, 'f1': 0.32267637071423966}, 'combined': 0.21400298161358897, 'epoch': 15} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3063063063063063, 'r': 0.32380952380952377, 'f1': 0.3148148148148148}, 'combined': 0.20987654320987653, 'epoch': 15} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4852941176470588, 'r': 0.358695652173913, 'f1': 0.4125}, 'combined': 0.27499999999999997, 'epoch': 15} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.13793103448275862, 'f1': 0.20512820512820515}, 'combined': 0.13675213675213677, 'epoch': 15} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 
0.19873543511268066, 'epoch': 10} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 16 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 02:23:28.997795: step: 2/466, loss: 0.042849015444517136 2023-01-24 02:23:29.586580: step: 4/466, loss: 0.09236946702003479 2023-01-24 02:23:30.222559: step: 6/466, loss: 0.1932559758424759 2023-01-24 02:23:30.796199: step: 8/466, loss: 0.061434224247932434 2023-01-24 02:23:31.359176: step: 10/466, loss: 0.05620969086885452 2023-01-24 02:23:31.911977: step: 12/466, loss: 0.09529339522123337 2023-01-24 02:23:32.583233: step: 14/466, loss: 0.155241459608078 2023-01-24 02:23:33.134901: step: 16/466, loss: 0.014359625056385994 2023-01-24 02:23:33.718577: step: 18/466, loss: 
0.05823211371898651 2023-01-24 02:23:34.352055: step: 20/466, loss: 0.5219416618347168 2023-01-24 02:23:34.971231: step: 22/466, loss: 0.161069855093956 2023-01-24 02:23:35.620858: step: 24/466, loss: 0.06981996446847916 2023-01-24 02:23:36.246024: step: 26/466, loss: 0.05380038544535637 2023-01-24 02:23:36.869261: step: 28/466, loss: 0.0803898423910141 2023-01-24 02:23:37.494422: step: 30/466, loss: 0.14585107564926147 2023-01-24 02:23:38.145419: step: 32/466, loss: 0.0993504524230957 2023-01-24 02:23:38.780556: step: 34/466, loss: 0.015155520290136337 2023-01-24 02:23:39.447468: step: 36/466, loss: 0.06372464448213577 2023-01-24 02:23:40.066988: step: 38/466, loss: 0.05958646535873413 2023-01-24 02:23:40.689591: step: 40/466, loss: 0.9672218561172485 2023-01-24 02:23:41.344450: step: 42/466, loss: 0.13717980682849884 2023-01-24 02:23:41.933770: step: 44/466, loss: 0.12620197236537933 2023-01-24 02:23:42.598040: step: 46/466, loss: 0.0906166285276413 2023-01-24 02:23:43.217444: step: 48/466, loss: 0.09270259737968445 2023-01-24 02:23:43.819685: step: 50/466, loss: 0.2560976445674896 2023-01-24 02:23:44.569556: step: 52/466, loss: 0.09398878365755081 2023-01-24 02:23:45.173131: step: 54/466, loss: 0.040162697434425354 2023-01-24 02:23:45.807701: step: 56/466, loss: 0.05243329331278801 2023-01-24 02:23:46.414420: step: 58/466, loss: 0.05652332305908203 2023-01-24 02:23:47.071541: step: 60/466, loss: 0.026979945600032806 2023-01-24 02:23:47.729931: step: 62/466, loss: 0.13747258484363556 2023-01-24 02:23:48.380865: step: 64/466, loss: 0.11705562472343445 2023-01-24 02:23:48.992858: step: 66/466, loss: 0.1764337122440338 2023-01-24 02:23:49.630585: step: 68/466, loss: 0.18013818562030792 2023-01-24 02:23:50.262552: step: 70/466, loss: 0.08881554007530212 2023-01-24 02:23:50.884456: step: 72/466, loss: 0.12398698180913925 2023-01-24 02:23:51.504954: step: 74/466, loss: 0.14571566879749298 2023-01-24 02:23:52.119948: step: 76/466, loss: 0.19627001881599426 2023-01-24 
02:23:52.766162: step: 78/466, loss: 0.041960377246141434 2023-01-24 02:23:53.399011: step: 80/466, loss: 0.1169004812836647 2023-01-24 02:23:54.122784: step: 82/466, loss: 0.16998115181922913 2023-01-24 02:23:54.725409: step: 84/466, loss: 0.49829229712486267 2023-01-24 02:23:55.304503: step: 86/466, loss: 0.06498653441667557 2023-01-24 02:23:55.999003: step: 88/466, loss: 0.07490881532430649 2023-01-24 02:23:56.626299: step: 90/466, loss: 0.26454323530197144 2023-01-24 02:23:57.253583: step: 92/466, loss: 0.05853681638836861 2023-01-24 02:23:57.849526: step: 94/466, loss: 0.16520006954669952 2023-01-24 02:23:58.462740: step: 96/466, loss: 0.020829584449529648 2023-01-24 02:23:59.059687: step: 98/466, loss: 0.1657995879650116 2023-01-24 02:23:59.694665: step: 100/466, loss: 0.09411315619945526 2023-01-24 02:24:00.362137: step: 102/466, loss: 0.1425132304430008 2023-01-24 02:24:01.051621: step: 104/466, loss: 0.13217833638191223 2023-01-24 02:24:01.591093: step: 106/466, loss: 0.5152426958084106 2023-01-24 02:24:02.197628: step: 108/466, loss: 0.20067660510540009 2023-01-24 02:24:02.844661: step: 110/466, loss: 0.6288007497787476 2023-01-24 02:24:03.445755: step: 112/466, loss: 0.10686130076646805 2023-01-24 02:24:04.042470: step: 114/466, loss: 0.08602900803089142 2023-01-24 02:24:04.707105: step: 116/466, loss: 0.30152466893196106 2023-01-24 02:24:05.299413: step: 118/466, loss: 0.020262155681848526 2023-01-24 02:24:05.940646: step: 120/466, loss: 0.5782296657562256 2023-01-24 02:24:06.557178: step: 122/466, loss: 0.02323436550796032 2023-01-24 02:24:07.159434: step: 124/466, loss: 0.0832483321428299 2023-01-24 02:24:07.744421: step: 126/466, loss: 0.08001965284347534 2023-01-24 02:24:08.416697: step: 128/466, loss: 0.0427028127014637 2023-01-24 02:24:09.085491: step: 130/466, loss: 0.2891668975353241 2023-01-24 02:24:09.693859: step: 132/466, loss: 0.1530880630016327 2023-01-24 02:24:10.269738: step: 134/466, loss: 0.027556931599974632 2023-01-24 
02:24:10.813820: step: 136/466, loss: 0.04821633920073509 2023-01-24 02:24:11.456495: step: 138/466, loss: 0.07989934831857681 2023-01-24 02:24:12.112525: step: 140/466, loss: 0.11616438627243042 2023-01-24 02:24:12.674522: step: 142/466, loss: 0.10473637282848358 2023-01-24 02:24:13.358162: step: 144/466, loss: 0.19119058549404144 2023-01-24 02:24:13.937275: step: 146/466, loss: 0.08404102921485901 2023-01-24 02:24:14.560852: step: 148/466, loss: 0.23165667057037354 2023-01-24 02:24:15.159367: step: 150/466, loss: 0.2223658710718155 2023-01-24 02:24:15.797244: step: 152/466, loss: 0.06914431601762772 2023-01-24 02:24:16.392062: step: 154/466, loss: 0.20164556801319122 2023-01-24 02:24:17.050352: step: 156/466, loss: 0.24763233959674835 2023-01-24 02:24:17.702223: step: 158/466, loss: 0.16237004101276398 2023-01-24 02:24:18.320551: step: 160/466, loss: 0.0988822802901268 2023-01-24 02:24:18.984201: step: 162/466, loss: 0.46181219816207886 2023-01-24 02:24:19.597168: step: 164/466, loss: 0.11193667352199554 2023-01-24 02:24:20.238565: step: 166/466, loss: 0.09703291952610016 2023-01-24 02:24:20.873518: step: 168/466, loss: 0.07440285384654999 2023-01-24 02:24:21.415356: step: 170/466, loss: 0.08722874522209167 2023-01-24 02:24:22.078930: step: 172/466, loss: 0.21509423851966858 2023-01-24 02:24:22.758564: step: 174/466, loss: 0.17768056690692902 2023-01-24 02:24:23.426345: step: 176/466, loss: 0.08026005327701569 2023-01-24 02:24:24.058836: step: 178/466, loss: 0.09798018634319305 2023-01-24 02:24:24.687635: step: 180/466, loss: 0.0405559204518795 2023-01-24 02:24:25.314379: step: 182/466, loss: 0.06418140977621078 2023-01-24 02:24:25.983535: step: 184/466, loss: 0.045770592987537384 2023-01-24 02:24:26.645498: step: 186/466, loss: 0.21991638839244843 2023-01-24 02:24:27.250507: step: 188/466, loss: 0.0976586565375328 2023-01-24 02:24:27.886866: step: 190/466, loss: 0.03503083065152168 2023-01-24 02:24:28.459205: step: 192/466, loss: 0.11509787291288376 2023-01-24 
02:24:29.124987: step: 194/466, loss: 0.29007306694984436 2023-01-24 02:24:29.728709: step: 196/466, loss: 0.030970891937613487 2023-01-24 02:24:30.415396: step: 198/466, loss: 0.0502634197473526 2023-01-24 02:24:30.981593: step: 200/466, loss: 0.04383686184883118 2023-01-24 02:24:31.573501: step: 202/466, loss: 0.05791878700256348 2023-01-24 02:24:32.366788: step: 204/466, loss: 0.08554225414991379 2023-01-24 02:24:32.969061: step: 206/466, loss: 0.037981659173965454 2023-01-24 02:24:33.576591: step: 208/466, loss: 0.08121279627084732 2023-01-24 02:24:34.233774: step: 210/466, loss: 0.1800466924905777 2023-01-24 02:24:34.822774: step: 212/466, loss: 0.12716861069202423 2023-01-24 02:24:35.474418: step: 214/466, loss: 0.13399292528629303 2023-01-24 02:24:36.102088: step: 216/466, loss: 0.08899815380573273 2023-01-24 02:24:36.688916: step: 218/466, loss: 0.358853816986084 2023-01-24 02:24:37.319859: step: 220/466, loss: 0.09514690935611725 2023-01-24 02:24:37.960983: step: 222/466, loss: 0.05393754318356514 2023-01-24 02:24:38.600373: step: 224/466, loss: 1.1265686750411987 2023-01-24 02:24:39.301994: step: 226/466, loss: 0.05097786709666252 2023-01-24 02:24:39.888199: step: 228/466, loss: 0.10566449910402298 2023-01-24 02:24:40.536422: step: 230/466, loss: 0.05373445525765419 2023-01-24 02:24:41.137692: step: 232/466, loss: 0.022259365767240524 2023-01-24 02:24:41.748820: step: 234/466, loss: 0.0640915185213089 2023-01-24 02:24:42.452799: step: 236/466, loss: 0.1120535135269165 2023-01-24 02:24:43.130823: step: 238/466, loss: 0.07983346283435822 2023-01-24 02:24:43.759185: step: 240/466, loss: 0.17747534811496735 2023-01-24 02:24:44.396553: step: 242/466, loss: 0.20360440015792847 2023-01-24 02:24:45.009074: step: 244/466, loss: 0.1407889872789383 2023-01-24 02:24:45.623039: step: 246/466, loss: 0.11615476757287979 2023-01-24 02:24:46.250848: step: 248/466, loss: 0.1729411482810974 2023-01-24 02:24:46.878993: step: 250/466, loss: 0.05504698306322098 2023-01-24 
02:24:47.524719: step: 252/466, loss: 0.1191413551568985 2023-01-24 02:24:48.212760: step: 254/466, loss: 0.04059406742453575 2023-01-24 02:24:48.852988: step: 256/466, loss: 0.03980935364961624 2023-01-24 02:24:49.479844: step: 258/466, loss: 0.10433091968297958 2023-01-24 02:24:50.048534: step: 260/466, loss: 0.06259889155626297 2023-01-24 02:24:50.714463: step: 262/466, loss: 0.09888036549091339 2023-01-24 02:24:51.306456: step: 264/466, loss: 0.03456645458936691 2023-01-24 02:24:51.931514: step: 266/466, loss: 0.18702808022499084 2023-01-24 02:24:52.572768: step: 268/466, loss: 0.0509725883603096 2023-01-24 02:24:53.246236: step: 270/466, loss: 0.17532788217067719 2023-01-24 02:24:53.899097: step: 272/466, loss: 0.11292461305856705 2023-01-24 02:24:54.503531: step: 274/466, loss: 1.1103038787841797 2023-01-24 02:24:55.138483: step: 276/466, loss: 0.10820992290973663 2023-01-24 02:24:55.831603: step: 278/466, loss: 0.07610810548067093 2023-01-24 02:24:56.431789: step: 280/466, loss: 0.08254982531070709 2023-01-24 02:24:57.006809: step: 282/466, loss: 0.10432609170675278 2023-01-24 02:24:57.619061: step: 284/466, loss: 0.04582104831933975 2023-01-24 02:24:58.226048: step: 286/466, loss: 0.03947072848677635 2023-01-24 02:24:58.818259: step: 288/466, loss: 0.0196113009005785 2023-01-24 02:24:59.531095: step: 290/466, loss: 0.0797518640756607 2023-01-24 02:25:00.125756: step: 292/466, loss: 0.08071209490299225 2023-01-24 02:25:00.726952: step: 294/466, loss: 0.066269151866436 2023-01-24 02:25:01.364362: step: 296/466, loss: 0.20743513107299805 2023-01-24 02:25:02.026217: step: 298/466, loss: 0.24499349296092987 2023-01-24 02:25:02.687499: step: 300/466, loss: 0.05682981014251709 2023-01-24 02:25:03.320661: step: 302/466, loss: 0.08268841356039047 2023-01-24 02:25:04.064321: step: 304/466, loss: 0.06325706094503403 2023-01-24 02:25:04.740971: step: 306/466, loss: 0.04648834466934204 2023-01-24 02:25:05.376236: step: 308/466, loss: 0.04719547927379608 2023-01-24 
02:25:05.949561: step: 310/466, loss: 0.12633663415908813 2023-01-24 02:25:06.611228: step: 312/466, loss: 0.07563754171133041 2023-01-24 02:25:07.226267: step: 314/466, loss: 0.12214470654726028 2023-01-24 02:25:07.907476: step: 316/466, loss: 0.043213777244091034 2023-01-24 02:25:08.554449: step: 318/466, loss: 0.13378891348838806 2023-01-24 02:25:09.206286: step: 320/466, loss: 0.2778260111808777 2023-01-24 02:25:09.782405: step: 322/466, loss: 0.16735616326332092 2023-01-24 02:25:10.371857: step: 324/466, loss: 0.10614963620901108 2023-01-24 02:25:10.971339: step: 326/466, loss: 0.03816225752234459 2023-01-24 02:25:11.579912: step: 328/466, loss: 0.029803283512592316 2023-01-24 02:25:12.194748: step: 330/466, loss: 0.027446869760751724 2023-01-24 02:25:12.861665: step: 332/466, loss: 0.04507140442728996 2023-01-24 02:25:13.474900: step: 334/466, loss: 0.36620408296585083 2023-01-24 02:25:14.065094: step: 336/466, loss: 0.06710328906774521 2023-01-24 02:25:14.662636: step: 338/466, loss: 0.021497823297977448 2023-01-24 02:25:15.297238: step: 340/466, loss: 0.14197732508182526 2023-01-24 02:25:16.027744: step: 342/466, loss: 0.10229825973510742 2023-01-24 02:25:16.667441: step: 344/466, loss: 0.3133872449398041 2023-01-24 02:25:17.266551: step: 346/466, loss: 0.0795479416847229 2023-01-24 02:25:17.891345: step: 348/466, loss: 0.09182026982307434 2023-01-24 02:25:18.586940: step: 350/466, loss: 0.17283767461776733 2023-01-24 02:25:19.227003: step: 352/466, loss: 0.417004257440567 2023-01-24 02:25:19.953779: step: 354/466, loss: 0.24089911580085754 2023-01-24 02:25:20.583280: step: 356/466, loss: 0.07301607728004456 2023-01-24 02:25:21.306768: step: 358/466, loss: 0.07322098314762115 2023-01-24 02:25:21.990857: step: 360/466, loss: 0.13493238389492035 2023-01-24 02:25:22.634288: step: 362/466, loss: 0.029753653332591057 2023-01-24 02:25:23.276939: step: 364/466, loss: 0.11101134121417999 2023-01-24 02:25:23.903944: step: 366/466, loss: 0.1328093707561493 2023-01-24 
02:25:24.505248: step: 368/466, loss: 0.06907472759485245 2023-01-24 02:25:25.125969: step: 370/466, loss: 0.07155998796224594 2023-01-24 02:25:25.707576: step: 372/466, loss: 0.03286347910761833 2023-01-24 02:25:26.286628: step: 374/466, loss: 0.030025159940123558 2023-01-24 02:25:26.852800: step: 376/466, loss: 0.057663481682538986 2023-01-24 02:25:27.481658: step: 378/466, loss: 0.053767379373311996 2023-01-24 02:25:28.071830: step: 380/466, loss: 0.016332991421222687 2023-01-24 02:25:28.708481: step: 382/466, loss: 0.13445210456848145 2023-01-24 02:25:29.334791: step: 384/466, loss: 0.038524363189935684 2023-01-24 02:25:29.931943: step: 386/466, loss: 0.17334969341754913 2023-01-24 02:25:30.578735: step: 388/466, loss: 0.1045505702495575 2023-01-24 02:25:31.293613: step: 390/466, loss: 0.15772441029548645 2023-01-24 02:25:31.956169: step: 392/466, loss: 0.10239655524492264 2023-01-24 02:25:32.589586: step: 394/466, loss: 0.07589149475097656 2023-01-24 02:25:33.274408: step: 396/466, loss: 0.16218271851539612 2023-01-24 02:25:33.831019: step: 398/466, loss: 0.08361567556858063 2023-01-24 02:25:34.467979: step: 400/466, loss: 0.08070149272680283 2023-01-24 02:25:35.122732: step: 402/466, loss: 0.12400388717651367 2023-01-24 02:25:35.834659: step: 404/466, loss: 0.49983540177345276 2023-01-24 02:25:36.498157: step: 406/466, loss: 0.21592941880226135 2023-01-24 02:25:37.142449: step: 408/466, loss: 0.29184338450431824 2023-01-24 02:25:37.730329: step: 410/466, loss: 0.10343955457210541 2023-01-24 02:25:38.300921: step: 412/466, loss: 0.06982731074094772 2023-01-24 02:25:38.920154: step: 414/466, loss: 0.04141250625252724 2023-01-24 02:25:39.582802: step: 416/466, loss: 0.5886058807373047 2023-01-24 02:25:40.217859: step: 418/466, loss: 0.09144656360149384 2023-01-24 02:25:40.882389: step: 420/466, loss: 0.0958983525633812 2023-01-24 02:25:41.526395: step: 422/466, loss: 0.059212010353803635 2023-01-24 02:25:42.182701: step: 424/466, loss: 0.04879840463399887 
2023-01-24 02:25:42.765916: step: 426/466, loss: 0.08247572183609009 2023-01-24 02:25:43.377014: step: 428/466, loss: 0.09836183488368988 2023-01-24 02:25:43.994972: step: 430/466, loss: 0.09267791360616684 2023-01-24 02:25:44.691549: step: 432/466, loss: 0.0319414921104908 2023-01-24 02:25:45.337141: step: 434/466, loss: 0.07540285587310791 2023-01-24 02:25:45.981056: step: 436/466, loss: 0.035845641046762466 2023-01-24 02:25:46.595793: step: 438/466, loss: 0.14549075067043304 2023-01-24 02:25:47.125198: step: 440/466, loss: 0.08438480645418167 2023-01-24 02:25:47.829024: step: 442/466, loss: 0.4051809012889862 2023-01-24 02:25:48.482977: step: 444/466, loss: 0.2547866702079773 2023-01-24 02:25:49.093647: step: 446/466, loss: 0.06865016371011734 2023-01-24 02:25:49.692561: step: 448/466, loss: 0.0883961096405983 2023-01-24 02:25:50.310431: step: 450/466, loss: 0.266781747341156 2023-01-24 02:25:50.980675: step: 452/466, loss: 0.14414082467556 2023-01-24 02:25:51.594927: step: 454/466, loss: 0.5005567073822021 2023-01-24 02:25:52.219837: step: 456/466, loss: 0.07571188360452652 2023-01-24 02:25:52.873409: step: 458/466, loss: 0.04615286737680435 2023-01-24 02:25:53.483937: step: 460/466, loss: 0.17617224156856537 2023-01-24 02:25:54.070996: step: 462/466, loss: 0.10700923204421997 2023-01-24 02:25:54.718056: step: 464/466, loss: 0.3710244297981262 2023-01-24 02:25:55.332137: step: 466/466, loss: 0.1079602986574173 2023-01-24 02:25:55.953971: step: 468/466, loss: 0.12475752085447311 2023-01-24 02:25:56.503770: step: 470/466, loss: 0.0040720487013459206 2023-01-24 02:25:57.126513: step: 472/466, loss: 0.08998247236013412 2023-01-24 02:25:57.714708: step: 474/466, loss: 0.10712777078151703 2023-01-24 02:25:58.373975: step: 476/466, loss: 0.14514942467212677 2023-01-24 02:25:59.056765: step: 478/466, loss: 0.1995212584733963 2023-01-24 02:25:59.776624: step: 480/466, loss: 0.1564503163099289 2023-01-24 02:26:00.474252: step: 482/466, loss: 0.2669309675693512 2023-01-24 
02:26:01.061541: step: 484/466, loss: 0.019181542098522186 2023-01-24 02:26:01.708253: step: 486/466, loss: 0.019846079871058464 2023-01-24 02:26:02.400720: step: 488/466, loss: 0.35277754068374634 2023-01-24 02:26:02.972701: step: 490/466, loss: 0.04823637753725052 2023-01-24 02:26:03.622048: step: 492/466, loss: 0.027920136228203773 2023-01-24 02:26:04.243487: step: 494/466, loss: 0.03421937674283981 2023-01-24 02:26:04.900081: step: 496/466, loss: 0.08403817564249039 2023-01-24 02:26:05.574768: step: 498/466, loss: 0.08438766747713089 2023-01-24 02:26:06.299522: step: 500/466, loss: 0.10292106866836548 2023-01-24 02:26:06.962958: step: 502/466, loss: 0.10465852916240692 2023-01-24 02:26:07.580615: step: 504/466, loss: 0.13783647119998932 2023-01-24 02:26:08.246361: step: 506/466, loss: 0.04002920910716057 2023-01-24 02:26:08.892922: step: 508/466, loss: 0.38707074522972107 2023-01-24 02:26:09.425002: step: 510/466, loss: 0.07652793079614639 2023-01-24 02:26:10.159723: step: 512/466, loss: 0.020424529910087585 2023-01-24 02:26:10.879360: step: 514/466, loss: 0.17076388001441956 2023-01-24 02:26:11.501899: step: 516/466, loss: 0.049553290009498596 2023-01-24 02:26:12.068170: step: 518/466, loss: 0.19098210334777832 2023-01-24 02:26:12.648284: step: 520/466, loss: 0.1638951152563095 2023-01-24 02:26:13.299150: step: 522/466, loss: 0.026427242904901505 2023-01-24 02:26:13.906128: step: 524/466, loss: 0.08538713306188583 2023-01-24 02:26:14.514988: step: 526/466, loss: 0.0711318626999855 2023-01-24 02:26:15.177798: step: 528/466, loss: 0.1881960779428482 2023-01-24 02:26:15.815164: step: 530/466, loss: 0.035001594573259354 2023-01-24 02:26:16.479944: step: 532/466, loss: 0.17813636362552643 2023-01-24 02:26:17.095620: step: 534/466, loss: 0.9085713028907776 2023-01-24 02:26:17.724998: step: 536/466, loss: 0.1646883636713028 2023-01-24 02:26:18.303196: step: 538/466, loss: 0.09630478918552399 2023-01-24 02:26:18.933562: step: 540/466, loss: 0.09880103170871735 
2023-01-24 02:26:19.596215: step: 542/466, loss: 0.05040838569402695 2023-01-24 02:26:20.251976: step: 544/466, loss: 0.09186426550149918 2023-01-24 02:26:20.813203: step: 546/466, loss: 0.35977739095687866 2023-01-24 02:26:21.458565: step: 548/466, loss: 0.07499703764915466 2023-01-24 02:26:22.101461: step: 550/466, loss: 0.1100747287273407 2023-01-24 02:26:22.773028: step: 552/466, loss: 0.3778783977031708 2023-01-24 02:26:23.434362: step: 554/466, loss: 0.07648572325706482 2023-01-24 02:26:24.092462: step: 556/466, loss: 0.06567910313606262 2023-01-24 02:26:24.725829: step: 558/466, loss: 0.0745609775185585 2023-01-24 02:26:25.388213: step: 560/466, loss: 0.12958188354969025 2023-01-24 02:26:25.997757: step: 562/466, loss: 0.032594673335552216 2023-01-24 02:26:26.570765: step: 564/466, loss: 0.06931976974010468 2023-01-24 02:26:27.262958: step: 566/466, loss: 0.18030980229377747 2023-01-24 02:26:27.899967: step: 568/466, loss: 0.060430657118558884 2023-01-24 02:26:28.528746: step: 570/466, loss: 0.08576898276805878 2023-01-24 02:26:29.224842: step: 572/466, loss: 0.07026806473731995 2023-01-24 02:26:29.885422: step: 574/466, loss: 0.12407351285219193 2023-01-24 02:26:30.494572: step: 576/466, loss: 0.1357271671295166 2023-01-24 02:26:31.152231: step: 578/466, loss: 0.0655779093503952 2023-01-24 02:26:31.793494: step: 580/466, loss: 0.05485844612121582 2023-01-24 02:26:32.488880: step: 582/466, loss: 0.14106249809265137 2023-01-24 02:26:33.052622: step: 584/466, loss: 0.34478509426116943 2023-01-24 02:26:33.670427: step: 586/466, loss: 0.15412241220474243 2023-01-24 02:26:34.227894: step: 588/466, loss: 0.04285565763711929 2023-01-24 02:26:34.853439: step: 590/466, loss: 0.054340530186891556 2023-01-24 02:26:35.475080: step: 592/466, loss: 0.17163047194480896 2023-01-24 02:26:36.064501: step: 594/466, loss: 0.1610175222158432 2023-01-24 02:26:36.701112: step: 596/466, loss: 0.10305967926979065 2023-01-24 02:26:37.259097: step: 598/466, loss: 0.10210266709327698 
2023-01-24 02:26:37.865214: step: 600/466, loss: 0.02406475506722927 2023-01-24 02:26:38.435924: step: 602/466, loss: 0.17943254113197327 2023-01-24 02:26:39.106755: step: 604/466, loss: 0.06392550468444824 2023-01-24 02:26:39.742340: step: 606/466, loss: 0.0971219390630722 2023-01-24 02:26:40.418782: step: 608/466, loss: 0.026507209986448288 2023-01-24 02:26:41.019153: step: 610/466, loss: 0.12073804438114166 2023-01-24 02:26:41.653316: step: 612/466, loss: 0.038244716823101044 2023-01-24 02:26:42.262328: step: 614/466, loss: 0.047044411301612854 2023-01-24 02:26:42.870271: step: 616/466, loss: 0.06327566504478455 2023-01-24 02:26:43.554783: step: 618/466, loss: 0.053750984370708466 2023-01-24 02:26:44.139459: step: 620/466, loss: 0.045614615082740784 2023-01-24 02:26:44.727654: step: 622/466, loss: 0.05823136493563652 2023-01-24 02:26:45.388210: step: 624/466, loss: 0.3171159327030182 2023-01-24 02:26:46.041158: step: 626/466, loss: 0.11763111501932144 2023-01-24 02:26:46.708010: step: 628/466, loss: 0.07963328063488007 2023-01-24 02:26:47.328023: step: 630/466, loss: 0.10646232962608337 2023-01-24 02:26:48.004210: step: 632/466, loss: 0.07549462467432022 2023-01-24 02:26:48.639890: step: 634/466, loss: 0.043111998587846756 2023-01-24 02:26:49.192933: step: 636/466, loss: 0.27588483691215515 2023-01-24 02:26:49.855659: step: 638/466, loss: 0.18270190060138702 2023-01-24 02:26:50.466360: step: 640/466, loss: 0.10143808275461197 2023-01-24 02:26:51.105367: step: 642/466, loss: 0.08115211129188538 2023-01-24 02:26:51.758345: step: 644/466, loss: 0.09656275063753128 2023-01-24 02:26:52.359364: step: 646/466, loss: 0.027801234275102615 2023-01-24 02:26:52.961768: step: 648/466, loss: 0.045937929302453995 2023-01-24 02:26:53.597031: step: 650/466, loss: 0.10616622865200043 2023-01-24 02:26:54.273885: step: 652/466, loss: 0.11118504405021667 2023-01-24 02:26:54.899761: step: 654/466, loss: 0.20897029340267181 2023-01-24 02:26:55.515741: step: 656/466, loss: 
0.061517294496297836 2023-01-24 02:26:56.325874: step: 658/466, loss: 0.13016831874847412 2023-01-24 02:26:56.856512: step: 660/466, loss: 0.06911101192235947 2023-01-24 02:26:57.414267: step: 662/466, loss: 0.07121779024600983 2023-01-24 02:26:57.970404: step: 664/466, loss: 0.033248383551836014 2023-01-24 02:26:58.584828: step: 666/466, loss: 0.22419223189353943 2023-01-24 02:26:59.174000: step: 668/466, loss: 0.08588635176420212 2023-01-24 02:26:59.721576: step: 670/466, loss: 0.14215855300426483 2023-01-24 02:27:00.383944: step: 672/466, loss: 0.021613050252199173 2023-01-24 02:27:00.974151: step: 674/466, loss: 0.10710838437080383 2023-01-24 02:27:01.561522: step: 676/466, loss: 0.1299968957901001 2023-01-24 02:27:02.210036: step: 678/466, loss: 0.16836518049240112 2023-01-24 02:27:02.822214: step: 680/466, loss: 0.6163982152938843 2023-01-24 02:27:03.453268: step: 682/466, loss: 0.838560938835144 2023-01-24 02:27:04.087669: step: 684/466, loss: 0.3202309310436249 2023-01-24 02:27:04.696813: step: 686/466, loss: 0.13837195932865143 2023-01-24 02:27:05.301648: step: 688/466, loss: 0.07165122032165527 2023-01-24 02:27:05.933142: step: 690/466, loss: 0.059542424976825714 2023-01-24 02:27:06.518269: step: 692/466, loss: 0.0504116453230381 2023-01-24 02:27:07.132679: step: 694/466, loss: 0.046665359288454056 2023-01-24 02:27:07.796534: step: 696/466, loss: 0.3874886929988861 2023-01-24 02:27:08.448499: step: 698/466, loss: 0.131962850689888 2023-01-24 02:27:09.113786: step: 700/466, loss: 0.13335564732551575 2023-01-24 02:27:09.743296: step: 702/466, loss: 0.08524461090564728 2023-01-24 02:27:10.388134: step: 704/466, loss: 0.32217660546302795 2023-01-24 02:27:10.982385: step: 706/466, loss: 0.1130739077925682 2023-01-24 02:27:11.632581: step: 708/466, loss: 0.4742449223995209 2023-01-24 02:27:12.244231: step: 710/466, loss: 0.1473226696252823 2023-01-24 02:27:12.847111: step: 712/466, loss: 0.4161403775215149 2023-01-24 02:27:13.476237: step: 714/466, loss: 
0.11619622260332108 2023-01-24 02:27:14.078011: step: 716/466, loss: 0.1965903788805008 2023-01-24 02:27:14.690700: step: 718/466, loss: 0.07091796398162842 2023-01-24 02:27:15.302473: step: 720/466, loss: 0.12858054041862488 2023-01-24 02:27:15.990940: step: 722/466, loss: 0.07139798253774643 2023-01-24 02:27:16.633186: step: 724/466, loss: 0.08102741092443466 2023-01-24 02:27:17.253505: step: 726/466, loss: 0.039550743997097015 2023-01-24 02:27:17.883740: step: 728/466, loss: 0.16001680493354797 2023-01-24 02:27:18.484434: step: 730/466, loss: 0.06432478129863739 2023-01-24 02:27:19.154686: step: 732/466, loss: 0.051167555153369904 2023-01-24 02:27:19.750531: step: 734/466, loss: 0.16745160520076752 2023-01-24 02:27:20.344463: step: 736/466, loss: 0.07915138453245163 2023-01-24 02:27:20.919436: step: 738/466, loss: 0.1392790526151657 2023-01-24 02:27:21.544946: step: 740/466, loss: 0.16193625330924988 2023-01-24 02:27:22.217864: step: 742/466, loss: 0.09272979944944382 2023-01-24 02:27:22.804522: step: 744/466, loss: 0.09982388466596603 2023-01-24 02:27:23.464686: step: 746/466, loss: 0.23354339599609375 2023-01-24 02:27:24.113539: step: 748/466, loss: 0.4393465518951416 2023-01-24 02:27:24.754535: step: 750/466, loss: 0.09297937154769897 2023-01-24 02:27:25.409528: step: 752/466, loss: 0.17270395159721375 2023-01-24 02:27:26.004114: step: 754/466, loss: 0.015132890082895756 2023-01-24 02:27:26.621731: step: 756/466, loss: 0.07725682109594345 2023-01-24 02:27:27.217682: step: 758/466, loss: 0.2666139602661133 2023-01-24 02:27:27.815888: step: 760/466, loss: 0.3584570288658142 2023-01-24 02:27:28.448530: step: 762/466, loss: 0.05754183977842331 2023-01-24 02:27:29.060676: step: 764/466, loss: 0.27439069747924805 2023-01-24 02:27:29.676021: step: 766/466, loss: 0.02167350985109806 2023-01-24 02:27:30.277572: step: 768/466, loss: 0.05525706335902214 2023-01-24 02:27:30.957469: step: 770/466, loss: 0.1161939725279808 2023-01-24 02:27:31.593311: step: 772/466, loss: 
0.11963807046413422 2023-01-24 02:27:32.203386: step: 774/466, loss: 0.07343064248561859 2023-01-24 02:27:32.815709: step: 776/466, loss: 0.2884886562824249 2023-01-24 02:27:33.478308: step: 778/466, loss: 0.07670705765485764 2023-01-24 02:27:34.115682: step: 780/466, loss: 0.1908833086490631 2023-01-24 02:27:34.761041: step: 782/466, loss: 0.08754006773233414 2023-01-24 02:27:35.377231: step: 784/466, loss: 0.0543396957218647 2023-01-24 02:27:36.063973: step: 786/466, loss: 0.04662584885954857 2023-01-24 02:27:36.713912: step: 788/466, loss: 0.06784288585186005 2023-01-24 02:27:37.393367: step: 790/466, loss: 0.0806589275598526 2023-01-24 02:27:38.023432: step: 792/466, loss: 0.1591053009033203 2023-01-24 02:27:38.643380: step: 794/466, loss: 0.1348680704832077 2023-01-24 02:27:39.232933: step: 796/466, loss: 0.03892558068037033 2023-01-24 02:27:39.820622: step: 798/466, loss: 0.09071938693523407 2023-01-24 02:27:40.451447: step: 800/466, loss: 0.07300712913274765 2023-01-24 02:27:41.099321: step: 802/466, loss: 0.2393036037683487 2023-01-24 02:27:41.759694: step: 804/466, loss: 0.20589615404605865 2023-01-24 02:27:42.355397: step: 806/466, loss: 0.35018736124038696 2023-01-24 02:27:43.026888: step: 808/466, loss: 0.27487683296203613 2023-01-24 02:27:43.672662: step: 810/466, loss: 0.08876249939203262 2023-01-24 02:27:44.284423: step: 812/466, loss: 0.12461288273334503 2023-01-24 02:27:44.834537: step: 814/466, loss: 0.02857905998826027 2023-01-24 02:27:45.457362: step: 816/466, loss: 0.14356930553913116 2023-01-24 02:27:46.066860: step: 818/466, loss: 0.09146396070718765 2023-01-24 02:27:46.657505: step: 820/466, loss: 0.8136057257652283 2023-01-24 02:27:47.280541: step: 822/466, loss: 0.18825575709342957 2023-01-24 02:27:47.901646: step: 824/466, loss: 0.13904358446598053 2023-01-24 02:27:48.482920: step: 826/466, loss: 0.03442108631134033 2023-01-24 02:27:49.070820: step: 828/466, loss: 0.03802469000220299 2023-01-24 02:27:49.658514: step: 830/466, loss: 
0.07993285357952118 2023-01-24 02:27:50.308186: step: 832/466, loss: 0.021052174270153046 2023-01-24 02:27:50.951720: step: 834/466, loss: 0.03310456871986389 2023-01-24 02:27:51.543973: step: 836/466, loss: 0.08458700776100159 2023-01-24 02:27:52.233765: step: 838/466, loss: 0.44769522547721863 2023-01-24 02:27:52.867389: step: 840/466, loss: 0.0225103460252285 2023-01-24 02:27:53.476102: step: 842/466, loss: 0.19166380167007446 2023-01-24 02:27:54.060652: step: 844/466, loss: 0.028776248916983604 2023-01-24 02:27:54.741867: step: 846/466, loss: 0.07082303613424301 2023-01-24 02:27:55.313293: step: 848/466, loss: 0.048280857503414154 2023-01-24 02:27:55.866898: step: 850/466, loss: 0.15884102880954742 2023-01-24 02:27:56.417151: step: 852/466, loss: 0.16965623199939728 2023-01-24 02:27:57.044957: step: 854/466, loss: 0.045534998178482056 2023-01-24 02:27:57.647932: step: 856/466, loss: 0.03119780868291855 2023-01-24 02:27:58.235041: step: 858/466, loss: 0.12831765413284302 2023-01-24 02:27:58.827484: step: 860/466, loss: 0.06411344558000565 2023-01-24 02:27:59.413815: step: 862/466, loss: 0.09839695692062378 2023-01-24 02:28:00.033793: step: 864/466, loss: 0.03153237700462341 2023-01-24 02:28:00.646481: step: 866/466, loss: 0.2929447889328003 2023-01-24 02:28:01.264052: step: 868/466, loss: 0.21719765663146973 2023-01-24 02:28:01.927671: step: 870/466, loss: 0.15172508358955383 2023-01-24 02:28:02.528679: step: 872/466, loss: 0.08611549437046051 2023-01-24 02:28:03.177250: step: 874/466, loss: 0.11833415180444717 2023-01-24 02:28:03.794833: step: 876/466, loss: 0.3285122513771057 2023-01-24 02:28:04.436645: step: 878/466, loss: 0.09364908188581467 2023-01-24 02:28:05.052783: step: 880/466, loss: 0.01854356937110424 2023-01-24 02:28:05.711824: step: 882/466, loss: 0.19637906551361084 2023-01-24 02:28:06.365429: step: 884/466, loss: 0.1268959641456604 2023-01-24 02:28:06.983691: step: 886/466, loss: 0.08831460028886795 2023-01-24 02:28:07.542750: step: 888/466, 
loss: 0.01859605871140957 2023-01-24 02:28:08.223179: step: 890/466, loss: 0.05948847532272339 2023-01-24 02:28:08.832535: step: 892/466, loss: 0.03629082813858986 2023-01-24 02:28:09.438493: step: 894/466, loss: 0.06120298430323601 2023-01-24 02:28:10.059534: step: 896/466, loss: 0.04455656558275223 2023-01-24 02:28:10.687378: step: 898/466, loss: 0.10806294530630112 2023-01-24 02:28:11.238126: step: 900/466, loss: 0.18784120678901672 2023-01-24 02:28:11.849976: step: 902/466, loss: 0.018527891486883163 2023-01-24 02:28:12.458940: step: 904/466, loss: 0.026689549908041954 2023-01-24 02:28:13.039023: step: 906/466, loss: 0.10876591503620148 2023-01-24 02:28:13.674282: step: 908/466, loss: 0.1081932783126831 2023-01-24 02:28:14.319932: step: 910/466, loss: 0.0986640676856041 2023-01-24 02:28:14.924633: step: 912/466, loss: 0.015939170494675636 2023-01-24 02:28:15.515941: step: 914/466, loss: 0.07669097930192947 2023-01-24 02:28:16.088951: step: 916/466, loss: 0.05065422132611275 2023-01-24 02:28:16.734191: step: 918/466, loss: 0.11322743445634842 2023-01-24 02:28:17.412873: step: 920/466, loss: 0.05013665556907654 2023-01-24 02:28:18.075086: step: 922/466, loss: 0.040548644959926605 2023-01-24 02:28:18.655509: step: 924/466, loss: 0.16020895540714264 2023-01-24 02:28:19.304772: step: 926/466, loss: 0.18104447424411774 2023-01-24 02:28:19.968943: step: 928/466, loss: 0.09960732609033585 2023-01-24 02:28:20.561336: step: 930/466, loss: 0.1459933966398239 2023-01-24 02:28:21.175896: step: 932/466, loss: 0.11824838817119598 ================================================== Loss: 0.136 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32080274696614275, 'r': 0.31228047285319016, 'f1': 0.31648424844929085}, 'combined': 0.23319891991000377, 'epoch': 16} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35685545189819845, 'r': 
0.30799655987054214, 'f1': 0.33063072566567964}, 'combined': 0.21927840873164242, 'epoch': 16} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32328280817502747, 'r': 0.2736882864663585, 'f1': 0.296425467188179}, 'combined': 0.19761697812545267, 'epoch': 16} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.36992227333844807, 'r': 0.292094470376333, 'f1': 0.3264335880838555}, 'combined': 0.21304086801262145, 'epoch': 16} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3306482127953262, 'r': 0.3212369733419488, 'f1': 0.32587465823138984}, 'combined': 0.24011816922312934, 'epoch': 16} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3471793169972375, 'r': 0.29912681637478244, 'f1': 0.32136672837290753}, 'combined': 0.21313441052710963, 'epoch': 16} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2841880341880342, 'r': 0.31666666666666665, 'f1': 0.29954954954954954}, 'combined': 0.1996996996996997, 'epoch': 16} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4264705882352941, 'r': 0.31521739130434784, 'f1': 0.3625}, 'combined': 0.24166666666666664, 'epoch': 16} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.13793103448275862, 'f1': 0.20512820512820515}, 'combined': 0.13675213675213677, 'epoch': 16} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 
0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 17 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 02:30:50.905770: step: 2/466, 
loss: 0.07163936644792557 2023-01-24 02:30:51.562806: step: 4/466, loss: 0.20703236758708954 2023-01-24 02:30:52.152620: step: 6/466, loss: 0.2051185816526413 2023-01-24 02:30:52.705487: step: 8/466, loss: 0.0900440439581871 2023-01-24 02:30:53.308405: step: 10/466, loss: 0.04501933604478836 2023-01-24 02:30:53.903745: step: 12/466, loss: 0.18890134990215302 2023-01-24 02:30:54.539570: step: 14/466, loss: 0.09076967090368271 2023-01-24 02:30:55.155178: step: 16/466, loss: 0.1052854135632515 2023-01-24 02:30:55.832937: step: 18/466, loss: 0.12530001997947693 2023-01-24 02:30:56.477531: step: 20/466, loss: 0.1331702023744583 2023-01-24 02:30:57.109963: step: 22/466, loss: 0.09598618745803833 2023-01-24 02:30:57.754930: step: 24/466, loss: 0.10314838588237762 2023-01-24 02:30:58.373180: step: 26/466, loss: 0.10385505855083466 2023-01-24 02:30:59.011894: step: 28/466, loss: 1.707896113395691 2023-01-24 02:30:59.687852: step: 30/466, loss: 0.18123123049736023 2023-01-24 02:31:00.320462: step: 32/466, loss: 0.15528570115566254 2023-01-24 02:31:00.931479: step: 34/466, loss: 0.07825500518083572 2023-01-24 02:31:01.507851: step: 36/466, loss: 0.08328621834516525 2023-01-24 02:31:02.213232: step: 38/466, loss: 0.06680682301521301 2023-01-24 02:31:02.844155: step: 40/466, loss: 0.11616786569356918 2023-01-24 02:31:03.474687: step: 42/466, loss: 0.3848291039466858 2023-01-24 02:31:04.079675: step: 44/466, loss: 0.08743570744991302 2023-01-24 02:31:04.629319: step: 46/466, loss: 0.06026541069149971 2023-01-24 02:31:05.239307: step: 48/466, loss: 0.051632121205329895 2023-01-24 02:31:05.898142: step: 50/466, loss: 0.6892358660697937 2023-01-24 02:31:06.494595: step: 52/466, loss: 0.0357104130089283 2023-01-24 02:31:07.132766: step: 54/466, loss: 0.016136176884174347 2023-01-24 02:31:07.785302: step: 56/466, loss: 0.03522597998380661 2023-01-24 02:31:08.409746: step: 58/466, loss: 0.028929902240633965 2023-01-24 02:31:09.084621: step: 60/466, loss: 0.08128506690263748 2023-01-24 
02:31:09.830874: step: 62/466, loss: 0.30395564436912537 2023-01-24 02:31:10.492309: step: 64/466, loss: 0.04575144499540329 2023-01-24 02:31:11.077736: step: 66/466, loss: 0.054458633065223694 2023-01-24 02:31:11.749628: step: 68/466, loss: 0.08864127844572067 2023-01-24 02:31:12.313410: step: 70/466, loss: 1.7842373847961426 2023-01-24 02:31:12.947195: step: 72/466, loss: 0.12249581515789032 2023-01-24 02:31:13.632319: step: 74/466, loss: 0.05205175280570984 2023-01-24 02:31:14.221206: step: 76/466, loss: 0.09778035432100296 2023-01-24 02:31:14.838526: step: 78/466, loss: 0.15009304881095886 2023-01-24 02:31:15.431984: step: 80/466, loss: 0.015185577794909477 2023-01-24 02:31:16.066722: step: 82/466, loss: 0.10033207386732101 2023-01-24 02:31:16.761702: step: 84/466, loss: 0.07721716165542603 2023-01-24 02:31:17.435034: step: 86/466, loss: 0.061948612332344055 2023-01-24 02:31:18.054617: step: 88/466, loss: 0.2933771312236786 2023-01-24 02:31:18.710470: step: 90/466, loss: 0.17141754925251007 2023-01-24 02:31:19.361980: step: 92/466, loss: 0.10136251151561737 2023-01-24 02:31:19.997651: step: 94/466, loss: 0.18460702896118164 2023-01-24 02:31:20.636001: step: 96/466, loss: 0.012896597385406494 2023-01-24 02:31:21.247638: step: 98/466, loss: 0.175594300031662 2023-01-24 02:31:21.906115: step: 100/466, loss: 0.1161227747797966 2023-01-24 02:31:22.536905: step: 102/466, loss: 0.08087904751300812 2023-01-24 02:31:23.165807: step: 104/466, loss: 0.3065102994441986 2023-01-24 02:31:23.879444: step: 106/466, loss: 0.22393816709518433 2023-01-24 02:31:24.473849: step: 108/466, loss: 0.2180919647216797 2023-01-24 02:31:25.119917: step: 110/466, loss: 0.11499159038066864 2023-01-24 02:31:25.788610: step: 112/466, loss: 0.19297310709953308 2023-01-24 02:31:26.457543: step: 114/466, loss: 0.09911413490772247 2023-01-24 02:31:27.054779: step: 116/466, loss: 0.32927078008651733 2023-01-24 02:31:27.619449: step: 118/466, loss: 0.06469939649105072 2023-01-24 02:31:28.217239: 
step: 120/466, loss: 0.044414736330509186 2023-01-24 02:31:28.826452: step: 122/466, loss: 0.046586912125349045 2023-01-24 02:31:29.430160: step: 124/466, loss: 0.06715916842222214 2023-01-24 02:31:30.037751: step: 126/466, loss: 0.15492646396160126 2023-01-24 02:31:30.724954: step: 128/466, loss: 0.04904414340853691 2023-01-24 02:31:31.321263: step: 130/466, loss: 0.13736887276172638 2023-01-24 02:31:31.903842: step: 132/466, loss: 0.06990758329629898 2023-01-24 02:31:32.501777: step: 134/466, loss: 0.07760658860206604 2023-01-24 02:31:33.130196: step: 136/466, loss: 0.029811248183250427 2023-01-24 02:31:33.736375: step: 138/466, loss: 0.06430856883525848 2023-01-24 02:31:34.325064: step: 140/466, loss: 0.15281669795513153 2023-01-24 02:31:34.892959: step: 142/466, loss: 0.06193069368600845 2023-01-24 02:31:35.534070: step: 144/466, loss: 0.05268199369311333 2023-01-24 02:31:36.164191: step: 146/466, loss: 0.13049347698688507 2023-01-24 02:31:36.838071: step: 148/466, loss: 0.07088293880224228 2023-01-24 02:31:37.364341: step: 150/466, loss: 0.016721755266189575 2023-01-24 02:31:37.935582: step: 152/466, loss: 0.061871930956840515 2023-01-24 02:31:38.543957: step: 154/466, loss: 0.05041418969631195 2023-01-24 02:31:39.136297: step: 156/466, loss: 0.06532739847898483 2023-01-24 02:31:39.702757: step: 158/466, loss: 0.03481756150722504 2023-01-24 02:31:40.303815: step: 160/466, loss: 0.08428133279085159 2023-01-24 02:31:40.858302: step: 162/466, loss: 0.041844822466373444 2023-01-24 02:31:41.503220: step: 164/466, loss: 0.1911388486623764 2023-01-24 02:31:42.089426: step: 166/466, loss: 0.11913547664880753 2023-01-24 02:31:42.661191: step: 168/466, loss: 0.046030666679143906 2023-01-24 02:31:43.275375: step: 170/466, loss: 0.0535283088684082 2023-01-24 02:31:43.913866: step: 172/466, loss: 0.04303639754652977 2023-01-24 02:31:44.523721: step: 174/466, loss: 0.0598825141787529 2023-01-24 02:31:45.233570: step: 176/466, loss: 0.9581116437911987 2023-01-24 
02:31:45.917691: step: 178/466, loss: 0.09876693040132523 2023-01-24 02:31:46.633047: step: 180/466, loss: 1.1125866174697876 2023-01-24 02:31:47.172222: step: 182/466, loss: 0.03268519416451454 2023-01-24 02:31:47.789599: step: 184/466, loss: 0.10946287959814072 2023-01-24 02:31:48.392866: step: 186/466, loss: 0.07927299290895462 2023-01-24 02:31:48.970263: step: 188/466, loss: 0.11638938635587692 2023-01-24 02:31:49.553678: step: 190/466, loss: 2.2971489429473877 2023-01-24 02:31:50.122239: step: 192/466, loss: 0.1603151112794876 2023-01-24 02:31:50.749294: step: 194/466, loss: 0.0545412041246891 2023-01-24 02:31:51.332159: step: 196/466, loss: 0.033792734146118164 2023-01-24 02:31:51.918871: step: 198/466, loss: 0.03249639272689819 2023-01-24 02:31:52.656877: step: 200/466, loss: 0.30851632356643677 2023-01-24 02:31:53.266616: step: 202/466, loss: 0.048087891191244125 2023-01-24 02:31:53.897431: step: 204/466, loss: 0.031088771298527718 2023-01-24 02:31:54.549082: step: 206/466, loss: 0.08168289065361023 2023-01-24 02:31:55.111305: step: 208/466, loss: 0.06491156667470932 2023-01-24 02:31:55.684404: step: 210/466, loss: 0.07623785734176636 2023-01-24 02:31:56.253388: step: 212/466, loss: 0.07565370202064514 2023-01-24 02:31:56.917660: step: 214/466, loss: 0.2032523900270462 2023-01-24 02:31:57.527434: step: 216/466, loss: 0.09473734349012375 2023-01-24 02:31:58.178467: step: 218/466, loss: 0.4610787332057953 2023-01-24 02:31:58.771942: step: 220/466, loss: 0.00763842323794961 2023-01-24 02:31:59.443506: step: 222/466, loss: 0.32014867663383484 2023-01-24 02:32:00.055718: step: 224/466, loss: 0.07198350131511688 2023-01-24 02:32:00.702879: step: 226/466, loss: 1.0742568969726562 2023-01-24 02:32:01.224372: step: 228/466, loss: 0.05047997459769249 2023-01-24 02:32:01.826790: step: 230/466, loss: 0.04707179218530655 2023-01-24 02:32:02.516773: step: 232/466, loss: 0.1336919367313385 2023-01-24 02:32:03.184057: step: 234/466, loss: 0.09445635974407196 2023-01-24 
02:32:03.806398: step: 236/466, loss: 0.36126482486724854 2023-01-24 02:32:04.441032: step: 238/466, loss: 0.060997847467660904 2023-01-24 02:32:05.039064: step: 240/466, loss: 0.04949461296200752 2023-01-24 02:32:05.693634: step: 242/466, loss: 0.025348510593175888 2023-01-24 02:32:06.368858: step: 244/466, loss: 0.01648002117872238 2023-01-24 02:32:07.025390: step: 246/466, loss: 0.09901939332485199 2023-01-24 02:32:07.698830: step: 248/466, loss: 0.10556934028863907 2023-01-24 02:32:08.297755: step: 250/466, loss: 0.06789106130599976 2023-01-24 02:32:08.899790: step: 252/466, loss: 0.07301247864961624 2023-01-24 02:32:09.533756: step: 254/466, loss: 0.06608757376670837 2023-01-24 02:32:10.100388: step: 256/466, loss: 0.03352828323841095 2023-01-24 02:32:10.731672: step: 258/466, loss: 0.15243376791477203 2023-01-24 02:32:11.348262: step: 260/466, loss: 0.5134772658348083 2023-01-24 02:32:12.021892: step: 262/466, loss: 0.06259741634130478 2023-01-24 02:32:12.626828: step: 264/466, loss: 0.08475439995527267 2023-01-24 02:32:13.234143: step: 266/466, loss: 0.053116872906684875 2023-01-24 02:32:13.819614: step: 268/466, loss: 0.040617186576128006 2023-01-24 02:32:14.431019: step: 270/466, loss: 0.04007381200790405 2023-01-24 02:32:15.070049: step: 272/466, loss: 0.09571406245231628 2023-01-24 02:32:15.720999: step: 274/466, loss: 0.12687426805496216 2023-01-24 02:32:16.354527: step: 276/466, loss: 0.0580701120197773 2023-01-24 02:32:16.898307: step: 278/466, loss: 0.013055864721536636 2023-01-24 02:32:17.465145: step: 280/466, loss: 0.04769768565893173 2023-01-24 02:32:18.051340: step: 282/466, loss: 0.0945633053779602 2023-01-24 02:32:18.606379: step: 284/466, loss: 0.015341498889029026 2023-01-24 02:32:19.190813: step: 286/466, loss: 0.06844498962163925 2023-01-24 02:32:19.865420: step: 288/466, loss: 0.12033065408468246 2023-01-24 02:32:20.475862: step: 290/466, loss: 0.1408940702676773 2023-01-24 02:32:21.139609: step: 292/466, loss: 0.201069638133049 
2023-01-24 02:32:21.669725: step: 294/466, loss: 0.04977531358599663 2023-01-24 02:32:22.339703: step: 296/466, loss: 0.026707181707024574 2023-01-24 02:32:22.947416: step: 298/466, loss: 0.1375470608472824 2023-01-24 02:32:23.582395: step: 300/466, loss: 0.1574658751487732 2023-01-24 02:32:24.263377: step: 302/466, loss: 0.13280268013477325 2023-01-24 02:32:24.869827: step: 304/466, loss: 0.021194612607359886 2023-01-24 02:32:25.527687: step: 306/466, loss: 0.0552968792617321 2023-01-24 02:32:26.110684: step: 308/466, loss: 0.04537500813603401 2023-01-24 02:32:26.716761: step: 310/466, loss: 0.11480645090341568 2023-01-24 02:32:27.345220: step: 312/466, loss: 0.08807799220085144 2023-01-24 02:32:28.019502: step: 314/466, loss: 0.21497578918933868 2023-01-24 02:32:28.687225: step: 316/466, loss: 0.011518558487296104 2023-01-24 02:32:29.307837: step: 318/466, loss: 0.023418935015797615 2023-01-24 02:32:29.944949: step: 320/466, loss: 0.03557395190000534 2023-01-24 02:32:30.547419: step: 322/466, loss: 0.2847992181777954 2023-01-24 02:32:31.190055: step: 324/466, loss: 0.0652703270316124 2023-01-24 02:32:31.777959: step: 326/466, loss: 0.10121436417102814 2023-01-24 02:32:32.486586: step: 328/466, loss: 0.11288412660360336 2023-01-24 02:32:33.103359: step: 330/466, loss: 0.08179786056280136 2023-01-24 02:32:33.683683: step: 332/466, loss: 0.10343725234270096 2023-01-24 02:32:34.400073: step: 334/466, loss: 0.24230018258094788 2023-01-24 02:32:35.023836: step: 336/466, loss: 0.015430030412971973 2023-01-24 02:32:35.724724: step: 338/466, loss: 0.13451127707958221 2023-01-24 02:32:36.333690: step: 340/466, loss: 0.04132866859436035 2023-01-24 02:32:36.911734: step: 342/466, loss: 0.08746642619371414 2023-01-24 02:32:37.566302: step: 344/466, loss: 0.10066674649715424 2023-01-24 02:32:38.156413: step: 346/466, loss: 0.09099888056516647 2023-01-24 02:32:38.813609: step: 348/466, loss: 0.04106014594435692 2023-01-24 02:32:39.417882: step: 350/466, loss: 
0.10349002480506897 2023-01-24 02:32:40.017760: step: 352/466, loss: 0.0962907001376152 2023-01-24 02:32:40.632708: step: 354/466, loss: 0.25616008043289185 2023-01-24 02:32:41.252041: step: 356/466, loss: 0.06833707541227341 2023-01-24 02:32:41.828290: step: 358/466, loss: 0.27315330505371094 2023-01-24 02:32:42.411938: step: 360/466, loss: 0.17183949053287506 2023-01-24 02:32:43.092325: step: 362/466, loss: 0.03962790593504906 2023-01-24 02:32:43.700590: step: 364/466, loss: 0.06679973006248474 2023-01-24 02:32:44.334434: step: 366/466, loss: 0.12388741225004196 2023-01-24 02:32:44.939667: step: 368/466, loss: 0.009611066430807114 2023-01-24 02:32:45.551542: step: 370/466, loss: 0.0658520832657814 2023-01-24 02:32:46.213382: step: 372/466, loss: 0.02278760075569153 2023-01-24 02:32:46.798345: step: 374/466, loss: 0.00905968714505434 2023-01-24 02:32:47.401682: step: 376/466, loss: 0.05500665307044983 2023-01-24 02:32:48.056020: step: 378/466, loss: 0.04999172315001488 2023-01-24 02:32:48.676546: step: 380/466, loss: 0.12665079534053802 2023-01-24 02:32:49.291083: step: 382/466, loss: 0.04226607829332352 2023-01-24 02:32:49.841988: step: 384/466, loss: 0.08261600881814957 2023-01-24 02:32:50.452133: step: 386/466, loss: 0.014171634800732136 2023-01-24 02:32:51.090036: step: 388/466, loss: 0.2714396119117737 2023-01-24 02:32:51.694630: step: 390/466, loss: 0.11800781637430191 2023-01-24 02:32:52.292294: step: 392/466, loss: 0.04906920716166496 2023-01-24 02:32:52.960222: step: 394/466, loss: 0.15990416705608368 2023-01-24 02:32:53.487861: step: 396/466, loss: 1.1693564653396606 2023-01-24 02:32:54.148946: step: 398/466, loss: 0.1673891246318817 2023-01-24 02:32:54.779751: step: 400/466, loss: 0.03268442302942276 2023-01-24 02:32:55.394581: step: 402/466, loss: 0.11904057115316391 2023-01-24 02:32:55.996369: step: 404/466, loss: 0.2910826802253723 2023-01-24 02:32:56.587821: step: 406/466, loss: 0.26489701867103577 2023-01-24 02:32:57.194778: step: 408/466, loss: 
0.10169929265975952 2023-01-24 02:32:57.770841: step: 410/466, loss: 0.03802566975355148 2023-01-24 02:32:58.442776: step: 412/466, loss: 0.09995359182357788 2023-01-24 02:32:59.028778: step: 414/466, loss: 0.05769165977835655 2023-01-24 02:32:59.622646: step: 416/466, loss: 0.06827948987483978 2023-01-24 02:33:00.193220: step: 418/466, loss: 0.04335479065775871 2023-01-24 02:33:00.810940: step: 420/466, loss: 0.0710509717464447 2023-01-24 02:33:01.426805: step: 422/466, loss: 0.15698561072349548 2023-01-24 02:33:02.074321: step: 424/466, loss: 0.15473562479019165 2023-01-24 02:33:02.691538: step: 426/466, loss: 0.08751913905143738 2023-01-24 02:33:03.277844: step: 428/466, loss: 0.11267465353012085 2023-01-24 02:33:03.895113: step: 430/466, loss: 0.6198447346687317 2023-01-24 02:33:04.562647: step: 432/466, loss: 0.05800934135913849 2023-01-24 02:33:05.256888: step: 434/466, loss: 0.03908151388168335 2023-01-24 02:33:05.874537: step: 436/466, loss: 0.09468743205070496 2023-01-24 02:33:06.506006: step: 438/466, loss: 0.08381114900112152 2023-01-24 02:33:07.276448: step: 440/466, loss: 0.04809914156794548 2023-01-24 02:33:07.920884: step: 442/466, loss: 0.04134088754653931 2023-01-24 02:33:08.524794: step: 444/466, loss: 0.15189415216445923 2023-01-24 02:33:09.245681: step: 446/466, loss: 0.19982358813285828 2023-01-24 02:33:09.930172: step: 448/466, loss: 0.169953390955925 2023-01-24 02:33:10.532256: step: 450/466, loss: 0.08291981369256973 2023-01-24 02:33:11.155062: step: 452/466, loss: 0.29688578844070435 2023-01-24 02:33:11.816572: step: 454/466, loss: 0.14033854007720947 2023-01-24 02:33:12.476206: step: 456/466, loss: 0.07793901115655899 2023-01-24 02:33:13.177830: step: 458/466, loss: 0.02118169143795967 2023-01-24 02:33:13.758593: step: 460/466, loss: 0.05224119871854782 2023-01-24 02:33:14.327285: step: 462/466, loss: 0.05721522122621536 2023-01-24 02:33:14.954330: step: 464/466, loss: 0.601054310798645 2023-01-24 02:33:15.537255: step: 466/466, loss: 
0.019567223265767097 2023-01-24 02:33:16.181150: step: 468/466, loss: 0.43480831384658813 2023-01-24 02:33:16.781597: step: 470/466, loss: 0.05632892996072769 2023-01-24 02:33:17.389776: step: 472/466, loss: 0.12266655266284943 2023-01-24 02:33:17.980464: step: 474/466, loss: 0.24110047519207 2023-01-24 02:33:18.567506: step: 476/466, loss: 0.041576821357011795 2023-01-24 02:33:19.182074: step: 478/466, loss: 0.5691177248954773 2023-01-24 02:33:19.831931: step: 480/466, loss: 0.2592619061470032 2023-01-24 02:33:20.451230: step: 482/466, loss: 0.0854816809296608 2023-01-24 02:33:21.048420: step: 484/466, loss: 0.06306815147399902 2023-01-24 02:33:21.635491: step: 486/466, loss: 0.014913514256477356 2023-01-24 02:33:22.252394: step: 488/466, loss: 0.027327340096235275 2023-01-24 02:33:22.828662: step: 490/466, loss: 0.19861024618148804 2023-01-24 02:33:23.494718: step: 492/466, loss: 0.21431532502174377 2023-01-24 02:33:24.154093: step: 494/466, loss: 0.15162241458892822 2023-01-24 02:33:24.761140: step: 496/466, loss: 0.05793420597910881 2023-01-24 02:33:25.403197: step: 498/466, loss: 0.03180902451276779 2023-01-24 02:33:25.998382: step: 500/466, loss: 0.07200276851654053 2023-01-24 02:33:26.537163: step: 502/466, loss: 0.0146143464371562 2023-01-24 02:33:27.145732: step: 504/466, loss: 0.16181471943855286 2023-01-24 02:33:27.736849: step: 506/466, loss: 0.09121792018413544 2023-01-24 02:33:28.316101: step: 508/466, loss: 0.015691015869379044 2023-01-24 02:33:28.974371: step: 510/466, loss: 0.06310312449932098 2023-01-24 02:33:29.528406: step: 512/466, loss: 0.03279956057667732 2023-01-24 02:33:30.118488: step: 514/466, loss: 0.06740190088748932 2023-01-24 02:33:30.773772: step: 516/466, loss: 0.09652914851903915 2023-01-24 02:33:31.408300: step: 518/466, loss: 0.0983295813202858 2023-01-24 02:33:32.000062: step: 520/466, loss: 0.07624398916959763 2023-01-24 02:33:32.654800: step: 522/466, loss: 0.08277488499879837 2023-01-24 02:33:33.230922: step: 524/466, loss: 
0.047908440232276917 2023-01-24 02:33:33.846768: step: 526/466, loss: 0.16350354254245758 2023-01-24 02:33:34.499267: step: 528/466, loss: 0.06512527912855148 2023-01-24 02:33:35.131742: step: 530/466, loss: 0.18338938057422638 2023-01-24 02:33:35.782205: step: 532/466, loss: 0.10968554019927979 2023-01-24 02:33:36.402456: step: 534/466, loss: 0.02084297500550747 2023-01-24 02:33:37.062025: step: 536/466, loss: 0.02822844497859478 2023-01-24 02:33:37.700019: step: 538/466, loss: 0.15922009944915771 2023-01-24 02:33:38.317637: step: 540/466, loss: 0.06503274291753769 2023-01-24 02:33:38.954605: step: 542/466, loss: 0.054418645799160004 2023-01-24 02:33:39.649578: step: 544/466, loss: 0.0865282341837883 2023-01-24 02:33:40.231228: step: 546/466, loss: 0.026990242302417755 2023-01-24 02:33:40.744986: step: 548/466, loss: 0.040507011115550995 2023-01-24 02:33:41.392370: step: 550/466, loss: 0.05850210040807724 2023-01-24 02:33:42.032762: step: 552/466, loss: 0.06116797775030136 2023-01-24 02:33:42.651230: step: 554/466, loss: 0.04400629177689552 2023-01-24 02:33:43.274359: step: 556/466, loss: 0.20492783188819885 2023-01-24 02:33:43.926790: step: 558/466, loss: 0.1363212764263153 2023-01-24 02:33:44.572394: step: 560/466, loss: 0.07164261490106583 2023-01-24 02:33:45.194771: step: 562/466, loss: 0.05946716293692589 2023-01-24 02:33:45.803066: step: 564/466, loss: 0.07661996781826019 2023-01-24 02:33:46.493144: step: 566/466, loss: 0.045083437114953995 2023-01-24 02:33:47.063695: step: 568/466, loss: 0.0828741118311882 2023-01-24 02:33:47.749796: step: 570/466, loss: 0.054826319217681885 2023-01-24 02:33:48.380458: step: 572/466, loss: 0.16736988723278046 2023-01-24 02:33:49.015861: step: 574/466, loss: 0.11668187379837036 2023-01-24 02:33:49.619412: step: 576/466, loss: 0.10664176940917969 2023-01-24 02:33:50.259239: step: 578/466, loss: 0.06612854450941086 2023-01-24 02:33:50.934464: step: 580/466, loss: 0.10616856813430786 2023-01-24 02:33:51.509490: step: 582/466, 
loss: 0.12009678035974503 2023-01-24 02:33:52.146777: step: 584/466, loss: 0.5012925863265991 2023-01-24 02:33:52.782118: step: 586/466, loss: 0.1577220857143402 2023-01-24 02:33:53.364873: step: 588/466, loss: 0.058271583169698715 2023-01-24 02:33:53.968061: step: 590/466, loss: 0.37642702460289 2023-01-24 02:33:54.593454: step: 592/466, loss: 0.08568401634693146 2023-01-24 02:33:55.186830: step: 594/466, loss: 0.13801072537899017 2023-01-24 02:33:55.778699: step: 596/466, loss: 0.08078078180551529 2023-01-24 02:33:56.410190: step: 598/466, loss: 0.04879489168524742 2023-01-24 02:33:57.125005: step: 600/466, loss: 0.09097907692193985 2023-01-24 02:33:57.781389: step: 602/466, loss: 0.18482044339179993 2023-01-24 02:33:58.426275: step: 604/466, loss: 0.09671653807163239 2023-01-24 02:33:58.978266: step: 606/466, loss: 0.5252463221549988 2023-01-24 02:33:59.552422: step: 608/466, loss: 0.2623167932033539 2023-01-24 02:34:00.141640: step: 610/466, loss: 0.09503703564405441 2023-01-24 02:34:00.880131: step: 612/466, loss: 1.5838477611541748 2023-01-24 02:34:01.510264: step: 614/466, loss: 0.0861937403678894 2023-01-24 02:34:02.142349: step: 616/466, loss: 0.1169193685054779 2023-01-24 02:34:02.959673: step: 618/466, loss: 0.1271267980337143 2023-01-24 02:34:03.609725: step: 620/466, loss: 0.6097317337989807 2023-01-24 02:34:04.264194: step: 622/466, loss: 0.21033012866973877 2023-01-24 02:34:04.901765: step: 624/466, loss: 0.15617677569389343 2023-01-24 02:34:05.485962: step: 626/466, loss: 0.06437318027019501 2023-01-24 02:34:06.178219: step: 628/466, loss: 0.05594280734658241 2023-01-24 02:34:06.828413: step: 630/466, loss: 0.04285358265042305 2023-01-24 02:34:07.429302: step: 632/466, loss: 0.08711235225200653 2023-01-24 02:34:08.084004: step: 634/466, loss: 0.03910278156399727 2023-01-24 02:34:08.710693: step: 636/466, loss: 0.0911153107881546 2023-01-24 02:34:09.341690: step: 638/466, loss: 0.04268583655357361 2023-01-24 02:34:10.020634: step: 640/466, loss: 
0.035510484129190445 2023-01-24 02:34:10.633849: step: 642/466, loss: 0.05552142485976219 2023-01-24 02:34:11.344287: step: 644/466, loss: 0.21432751417160034 2023-01-24 02:34:12.004400: step: 646/466, loss: 0.10881240665912628 2023-01-24 02:34:12.629858: step: 648/466, loss: 0.05536452680826187 2023-01-24 02:34:13.369307: step: 650/466, loss: 0.06242331117391586 2023-01-24 02:34:14.114262: step: 652/466, loss: 0.1637096405029297 2023-01-24 02:34:14.748952: step: 654/466, loss: 0.05477649345993996 2023-01-24 02:34:15.405428: step: 656/466, loss: 0.07378639280796051 2023-01-24 02:34:16.059151: step: 658/466, loss: 0.08187831938266754 2023-01-24 02:34:16.730653: step: 660/466, loss: 0.03793375566601753 2023-01-24 02:34:17.358926: step: 662/466, loss: 0.044803231954574585 2023-01-24 02:34:18.031201: step: 664/466, loss: 0.2506631314754486 2023-01-24 02:34:18.659206: step: 666/466, loss: 0.06307424604892731 2023-01-24 02:34:19.336746: step: 668/466, loss: 0.08123359084129333 2023-01-24 02:34:19.925471: step: 670/466, loss: 0.058123879134655 2023-01-24 02:34:20.573379: step: 672/466, loss: 0.07231193035840988 2023-01-24 02:34:21.203049: step: 674/466, loss: 0.04953724145889282 2023-01-24 02:34:21.917007: step: 676/466, loss: 0.1181645542383194 2023-01-24 02:34:22.568897: step: 678/466, loss: 0.08696312457323074 2023-01-24 02:34:23.222532: step: 680/466, loss: 0.0802014023065567 2023-01-24 02:34:23.787963: step: 682/466, loss: 0.04924570769071579 2023-01-24 02:34:24.411340: step: 684/466, loss: 0.06582105904817581 2023-01-24 02:34:25.080395: step: 686/466, loss: 0.1181468814611435 2023-01-24 02:34:25.801890: step: 688/466, loss: 0.12264467030763626 2023-01-24 02:34:26.405238: step: 690/466, loss: 0.08274734765291214 2023-01-24 02:34:27.025735: step: 692/466, loss: 0.07421544939279556 2023-01-24 02:34:27.677510: step: 694/466, loss: 0.011576401069760323 2023-01-24 02:34:28.284599: step: 696/466, loss: 0.015556792728602886 2023-01-24 02:34:28.945808: step: 698/466, loss: 
1.4480268955230713 2023-01-24 02:34:29.584889: step: 700/466, loss: 0.28690746426582336 2023-01-24 02:34:30.197766: step: 702/466, loss: 0.0877830758690834 2023-01-24 02:34:30.850269: step: 704/466, loss: 0.25909364223480225 2023-01-24 02:34:31.525303: step: 706/466, loss: 0.8405012488365173 2023-01-24 02:34:32.150272: step: 708/466, loss: 0.15286941826343536 2023-01-24 02:34:32.783080: step: 710/466, loss: 0.15358571708202362 2023-01-24 02:34:33.384235: step: 712/466, loss: 0.11230973899364471 2023-01-24 02:34:33.985366: step: 714/466, loss: 0.0365796834230423 2023-01-24 02:34:34.597726: step: 716/466, loss: 0.3047601878643036 2023-01-24 02:34:35.175754: step: 718/466, loss: 0.11558191478252411 2023-01-24 02:34:35.750884: step: 720/466, loss: 0.09022244065999985 2023-01-24 02:34:36.368542: step: 722/466, loss: 0.018942786380648613 2023-01-24 02:34:36.917625: step: 724/466, loss: 0.21641039848327637 2023-01-24 02:34:37.605119: step: 726/466, loss: 0.6364198923110962 2023-01-24 02:34:38.290246: step: 728/466, loss: 0.14027713239192963 2023-01-24 02:34:38.907350: step: 730/466, loss: 0.08480183780193329 2023-01-24 02:34:39.589606: step: 732/466, loss: 0.040003206580877304 2023-01-24 02:34:40.261927: step: 734/466, loss: 0.10973091423511505 2023-01-24 02:34:40.917925: step: 736/466, loss: 0.46130046248435974 2023-01-24 02:34:41.562192: step: 738/466, loss: 20.934114456176758 2023-01-24 02:34:42.148480: step: 740/466, loss: 0.019502513110637665 2023-01-24 02:34:42.792260: step: 742/466, loss: 0.09954078495502472 2023-01-24 02:34:43.386830: step: 744/466, loss: 0.5703690648078918 2023-01-24 02:34:44.023712: step: 746/466, loss: 0.011088987812399864 2023-01-24 02:34:44.666042: step: 748/466, loss: 0.060354653745889664 2023-01-24 02:34:45.292612: step: 750/466, loss: 0.04820029065012932 2023-01-24 02:34:45.894346: step: 752/466, loss: 0.07377003133296967 2023-01-24 02:34:46.498765: step: 754/466, loss: 0.2375066727399826 2023-01-24 02:34:47.081714: step: 756/466, loss: 
0.2742730677127838 2023-01-24 02:34:47.741250: step: 758/466, loss: 0.07631280273199081 2023-01-24 02:34:48.361315: step: 760/466, loss: 0.0527004674077034 2023-01-24 02:34:49.014326: step: 762/466, loss: 0.034588977694511414 2023-01-24 02:34:49.560354: step: 764/466, loss: 0.12235993146896362 2023-01-24 02:34:50.222767: step: 766/466, loss: 0.034359920769929886 2023-01-24 02:34:50.826122: step: 768/466, loss: 0.06783917546272278 2023-01-24 02:34:51.471233: step: 770/466, loss: 0.17482250928878784 2023-01-24 02:34:52.053268: step: 772/466, loss: 0.06787631660699844 2023-01-24 02:34:52.636916: step: 774/466, loss: 0.05822568014264107 2023-01-24 02:34:53.183817: step: 776/466, loss: 0.1922196000814438 2023-01-24 02:34:53.784132: step: 778/466, loss: 0.033735353499650955 2023-01-24 02:34:54.437212: step: 780/466, loss: 0.024258237332105637 2023-01-24 02:34:55.062613: step: 782/466, loss: 0.08711081743240356 2023-01-24 02:34:55.741400: step: 784/466, loss: 0.3389933109283447 2023-01-24 02:34:56.360572: step: 786/466, loss: 0.14080190658569336 2023-01-24 02:34:56.942336: step: 788/466, loss: 0.11569789797067642 2023-01-24 02:34:57.543748: step: 790/466, loss: 0.7507767081260681 2023-01-24 02:34:58.191940: step: 792/466, loss: 0.052032433450222015 2023-01-24 02:34:58.825518: step: 794/466, loss: 0.04452061280608177 2023-01-24 02:34:59.512100: step: 796/466, loss: 0.049668360501527786 2023-01-24 02:35:00.156975: step: 798/466, loss: 0.13247907161712646 2023-01-24 02:35:00.730157: step: 800/466, loss: 0.05487549304962158 2023-01-24 02:35:01.367515: step: 802/466, loss: 0.0959228128194809 2023-01-24 02:35:01.988775: step: 804/466, loss: 0.03185079246759415 2023-01-24 02:35:02.595916: step: 806/466, loss: 0.07701914012432098 2023-01-24 02:35:03.216306: step: 808/466, loss: 0.17329666018486023 2023-01-24 02:35:03.845442: step: 810/466, loss: 0.05172254145145416 2023-01-24 02:35:04.457270: step: 812/466, loss: 0.0889836996793747 2023-01-24 02:35:05.131344: step: 814/466, loss: 
0.36501458287239075 2023-01-24 02:35:05.805483: step: 816/466, loss: 0.09125789999961853 2023-01-24 02:35:06.460054: step: 818/466, loss: 0.02714724838733673 2023-01-24 02:35:07.084509: step: 820/466, loss: 0.04561009258031845 2023-01-24 02:35:07.756095: step: 822/466, loss: 0.09402687847614288 2023-01-24 02:35:08.379183: step: 824/466, loss: 0.19560956954956055 2023-01-24 02:35:09.051016: step: 826/466, loss: 0.0626026913523674 2023-01-24 02:35:09.640421: step: 828/466, loss: 0.094856858253479 2023-01-24 02:35:10.291702: step: 830/466, loss: 0.033885981887578964 2023-01-24 02:35:10.923594: step: 832/466, loss: 0.03236625716090202 2023-01-24 02:35:11.571165: step: 834/466, loss: 0.10357360541820526 2023-01-24 02:35:12.166579: step: 836/466, loss: 0.06699132919311523 2023-01-24 02:35:12.882559: step: 838/466, loss: 0.1298193782567978 2023-01-24 02:35:13.540685: step: 840/466, loss: 0.052056584507226944 2023-01-24 02:35:14.178333: step: 842/466, loss: 0.03698112443089485 2023-01-24 02:35:14.785340: step: 844/466, loss: 0.21999523043632507 2023-01-24 02:35:15.485696: step: 846/466, loss: 0.07879167795181274 2023-01-24 02:35:16.135914: step: 848/466, loss: 0.1226952001452446 2023-01-24 02:35:16.805271: step: 850/466, loss: 0.15505649149417877 2023-01-24 02:35:17.417031: step: 852/466, loss: 0.045842863619327545 2023-01-24 02:35:18.031259: step: 854/466, loss: 0.44745418429374695 2023-01-24 02:35:18.684353: step: 856/466, loss: 0.03154830262064934 2023-01-24 02:35:19.356550: step: 858/466, loss: 0.24629947543144226 2023-01-24 02:35:19.974571: step: 860/466, loss: 0.052139271050691605 2023-01-24 02:35:20.634352: step: 862/466, loss: 0.5984631180763245 2023-01-24 02:35:21.232780: step: 864/466, loss: 0.06645335257053375 2023-01-24 02:35:21.873377: step: 866/466, loss: 0.026118503883481026 2023-01-24 02:35:22.492182: step: 868/466, loss: 0.3702729046344757 2023-01-24 02:35:23.115584: step: 870/466, loss: 0.0635104700922966 2023-01-24 02:35:23.664361: step: 872/466, loss: 
0.07143677026033401 2023-01-24 02:35:24.282501: step: 874/466, loss: 0.020376041531562805 2023-01-24 02:35:25.031927: step: 876/466, loss: 0.07505100220441818 2023-01-24 02:35:25.663285: step: 878/466, loss: 0.1263282299041748 2023-01-24 02:35:26.212131: step: 880/466, loss: 0.04056801274418831 2023-01-24 02:35:26.868658: step: 882/466, loss: 0.07290873676538467 2023-01-24 02:35:27.545731: step: 884/466, loss: 0.45427680015563965 2023-01-24 02:35:28.117571: step: 886/466, loss: 0.016614042222499847 2023-01-24 02:35:28.765381: step: 888/466, loss: 0.0716577097773552 2023-01-24 02:35:29.324698: step: 890/466, loss: 0.07818274945020676 2023-01-24 02:35:29.932462: step: 892/466, loss: 0.08689560741186142 2023-01-24 02:35:30.580790: step: 894/466, loss: 0.11718635261058807 2023-01-24 02:35:31.187508: step: 896/466, loss: 0.07583006471395493 2023-01-24 02:35:31.801103: step: 898/466, loss: 0.06984055042266846 2023-01-24 02:35:32.442468: step: 900/466, loss: 0.0533699207007885 2023-01-24 02:35:33.082340: step: 902/466, loss: 0.06694526225328445 2023-01-24 02:35:33.872879: step: 904/466, loss: 2.499326229095459 2023-01-24 02:35:34.524142: step: 906/466, loss: 0.08951568603515625 2023-01-24 02:35:35.154087: step: 908/466, loss: 0.16057872772216797 2023-01-24 02:35:35.759810: step: 910/466, loss: 0.05228939652442932 2023-01-24 02:35:36.420420: step: 912/466, loss: 0.48351049423217773 2023-01-24 02:35:36.996160: step: 914/466, loss: 0.1175408884882927 2023-01-24 02:35:37.698643: step: 916/466, loss: 0.06571335345506668 2023-01-24 02:35:38.351779: step: 918/466, loss: 0.08429094403982162 2023-01-24 02:35:38.930701: step: 920/466, loss: 0.022826338186860085 2023-01-24 02:35:39.551022: step: 922/466, loss: 1.0452570915222168 2023-01-24 02:35:40.209896: step: 924/466, loss: 0.05522065982222557 2023-01-24 02:35:40.805959: step: 926/466, loss: 0.16498221457004547 2023-01-24 02:35:41.480174: step: 928/466, loss: 0.053920526057481766 2023-01-24 02:35:42.150073: step: 930/466, loss: 
0.38070955872535706 2023-01-24 02:35:42.786479: step: 932/466, loss: 0.1380595862865448 ================================================== Loss: 0.197 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3385165429808287, 'r': 0.3462246995572423, 'f1': 0.34232723577235774}, 'combined': 0.25224112109542146, 'epoch': 17} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35244528344941667, 'r': 0.3222618205934545, 'f1': 0.3366784135617112}, 'combined': 0.22328931054869963, 'epoch': 17} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3372907366071428, 'r': 0.2861860795454545, 'f1': 0.30964395491803276}, 'combined': 0.2064293032786885, 'epoch': 17} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.36409384248846793, 'r': 0.30892810877809396, 'f1': 0.33425008490744595}, 'combined': 0.21814216067643838, 'epoch': 17} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32474947704658524, 'r': 0.33029548329595765, 'f1': 0.32749900225205963}, 'combined': 0.2413150542909913, 'epoch': 17} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34366595879909545, 'r': 0.3115587584960658, 'f1': 0.3268257031047659}, 'combined': 0.2167548704529017, 'epoch': 17} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29380341880341876, 'r': 0.32738095238095233, 'f1': 0.30968468468468463}, 'combined': 0.2064564564564564, 'epoch': 17} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4852941176470588, 'r': 0.358695652173913, 'f1': 0.4125}, 'combined': 0.27499999999999997, 'epoch': 17} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 
0.6666666666666666}, 'slot': {'p': 0.3409090909090909, 'r': 0.12931034482758622, 'f1': 0.1875}, 'combined': 0.125, 'epoch': 17} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 
0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 18 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 02:38:14.227669: step: 2/466, loss: 0.31189316511154175 2023-01-24 02:38:14.802832: step: 4/466, loss: 0.10597145557403564 2023-01-24 02:38:15.450483: step: 6/466, loss: 0.07821610569953918 2023-01-24 02:38:16.089187: step: 8/466, loss: 0.5892221927642822 2023-01-24 02:38:16.673397: step: 10/466, loss: 0.25590741634368896 2023-01-24 02:38:17.319845: step: 12/466, loss: 0.12411262094974518 2023-01-24 02:38:17.918255: step: 14/466, loss: 0.08438217639923096 2023-01-24 02:38:18.509101: step: 16/466, loss: 0.06738931685686111 2023-01-24 02:38:19.150280: step: 18/466, loss: 0.033883992582559586 2023-01-24 02:38:19.737038: step: 20/466, loss: 0.0135149285197258 2023-01-24 02:38:20.351277: step: 22/466, loss: 0.05257147178053856 2023-01-24 02:38:21.052654: step: 24/466, loss: 0.04349208250641823 2023-01-24 02:38:21.629738: step: 26/466, loss: 0.08731899410486221 2023-01-24 02:38:22.342008: step: 28/466, loss: 0.037038665264844894 2023-01-24 02:38:23.004033: step: 30/466, loss: 0.07792191207408905 2023-01-24 02:38:23.588358: step: 32/466, loss: 0.11329524219036102 2023-01-24 02:38:24.276819: step: 34/466, loss: 0.2606694996356964 2023-01-24 02:38:24.867301: step: 36/466, loss: 0.06266503036022186 2023-01-24 02:38:25.519898: step: 38/466, loss: 0.12956149876117706 2023-01-24 02:38:26.160216: step: 40/466, loss: 0.03676142543554306 2023-01-24 02:38:26.745907: step: 42/466, loss: 0.0485200434923172 2023-01-24 02:38:27.411735: step: 44/466, loss: 0.06625945121049881 2023-01-24 
02:38:28.097587: step: 46/466, loss: 0.12503652274608612 2023-01-24 02:38:28.694408: step: 48/466, loss: 0.08727210015058517 2023-01-24 02:38:29.263492: step: 50/466, loss: 0.02814129739999771 2023-01-24 02:38:29.869260: step: 52/466, loss: 0.0902329534292221 2023-01-24 02:38:30.484238: step: 54/466, loss: 0.09235873818397522 2023-01-24 02:38:31.063926: step: 56/466, loss: 0.27931010723114014 2023-01-24 02:38:31.782445: step: 58/466, loss: 0.0059571899473667145 2023-01-24 02:38:32.483148: step: 60/466, loss: 0.02129332534968853 2023-01-24 02:38:33.094718: step: 62/466, loss: 0.456002801656723 2023-01-24 02:38:33.661997: step: 64/466, loss: 0.06236730515956879 2023-01-24 02:38:34.304629: step: 66/466, loss: 0.06540032476186752 2023-01-24 02:38:34.914586: step: 68/466, loss: 0.03372754901647568 2023-01-24 02:38:35.510586: step: 70/466, loss: 0.04925305396318436 2023-01-24 02:38:36.084916: step: 72/466, loss: 0.1637629121541977 2023-01-24 02:38:36.734937: step: 74/466, loss: 0.08615628629922867 2023-01-24 02:38:37.356494: step: 76/466, loss: 0.027274945750832558 2023-01-24 02:38:38.004286: step: 78/466, loss: 0.19199179112911224 2023-01-24 02:38:38.648762: step: 80/466, loss: 0.06197204068303108 2023-01-24 02:38:39.249683: step: 82/466, loss: 0.028553558513522148 2023-01-24 02:38:39.860723: step: 84/466, loss: 0.06519344449043274 2023-01-24 02:38:40.503853: step: 86/466, loss: 0.09060356020927429 2023-01-24 02:38:41.065915: step: 88/466, loss: 0.09861177951097488 2023-01-24 02:38:41.729478: step: 90/466, loss: 0.15060625970363617 2023-01-24 02:38:42.313531: step: 92/466, loss: 0.041267793625593185 2023-01-24 02:38:42.904592: step: 94/466, loss: 0.0036512063816189766 2023-01-24 02:38:43.564561: step: 96/466, loss: 0.1721828132867813 2023-01-24 02:38:44.249647: step: 98/466, loss: 0.06883466988801956 2023-01-24 02:38:44.865481: step: 100/466, loss: 0.07559314370155334 2023-01-24 02:38:45.504057: step: 102/466, loss: 0.08263487368822098 2023-01-24 02:38:46.098528: step: 
104/466, loss: 0.07542078197002411 2023-01-24 02:38:46.784797: step: 106/466, loss: 0.04282296076416969 2023-01-24 02:38:47.444616: step: 108/466, loss: 0.2059038281440735 2023-01-24 02:38:48.110831: step: 110/466, loss: 0.647688090801239 2023-01-24 02:38:48.800716: step: 112/466, loss: 0.08664006739854813 2023-01-24 02:38:49.451986: step: 114/466, loss: 0.04270491376519203 2023-01-24 02:38:50.099057: step: 116/466, loss: 0.07413546741008759 2023-01-24 02:38:50.672422: step: 118/466, loss: 0.04129385948181152 2023-01-24 02:38:51.314612: step: 120/466, loss: 0.036656226962804794 2023-01-24 02:38:51.867910: step: 122/466, loss: 0.03237105906009674 2023-01-24 02:38:52.486044: step: 124/466, loss: 0.08183992654085159 2023-01-24 02:38:53.114013: step: 126/466, loss: 0.08177828043699265 2023-01-24 02:38:53.741049: step: 128/466, loss: 0.1501251608133316 2023-01-24 02:38:54.318947: step: 130/466, loss: 0.059020865708589554 2023-01-24 02:38:55.008224: step: 132/466, loss: 0.01964469440281391 2023-01-24 02:38:55.627667: step: 134/466, loss: 0.08319169282913208 2023-01-24 02:38:56.251241: step: 136/466, loss: 0.020794998854398727 2023-01-24 02:38:56.856862: step: 138/466, loss: 0.06269271671772003 2023-01-24 02:38:57.425162: step: 140/466, loss: 0.044970426708459854 2023-01-24 02:38:58.036044: step: 142/466, loss: 0.08268948644399643 2023-01-24 02:38:58.620934: step: 144/466, loss: 0.011812067590653896 2023-01-24 02:38:59.259057: step: 146/466, loss: 0.5511666536331177 2023-01-24 02:38:59.901553: step: 148/466, loss: 1.1024284362792969 2023-01-24 02:39:00.523920: step: 150/466, loss: 0.07916638255119324 2023-01-24 02:39:01.165141: step: 152/466, loss: 0.0712054893374443 2023-01-24 02:39:01.742339: step: 154/466, loss: 0.051965199410915375 2023-01-24 02:39:02.472009: step: 156/466, loss: 0.1934525966644287 2023-01-24 02:39:03.077054: step: 158/466, loss: 0.15321888029575348 2023-01-24 02:39:03.691775: step: 160/466, loss: 0.40636488795280457 2023-01-24 02:39:04.336554: step: 
162/466, loss: 0.02754190005362034 2023-01-24 02:39:04.864977: step: 164/466, loss: 0.022459637373685837 2023-01-24 02:39:05.534398: step: 166/466, loss: 0.04907315969467163 2023-01-24 02:39:06.122833: step: 168/466, loss: 0.032183870673179626 2023-01-24 02:39:06.742969: step: 170/466, loss: 0.1283939629793167 2023-01-24 02:39:07.431162: step: 172/466, loss: 0.02123296447098255 2023-01-24 02:39:08.065581: step: 174/466, loss: 0.09502942860126495 2023-01-24 02:39:08.667649: step: 176/466, loss: 0.06678149104118347 2023-01-24 02:39:09.227714: step: 178/466, loss: 0.05468068644404411 2023-01-24 02:39:09.844624: step: 180/466, loss: 0.0322553776204586 2023-01-24 02:39:10.458754: step: 182/466, loss: 0.030692176893353462 2023-01-24 02:39:11.062094: step: 184/466, loss: 0.06752853840589523 2023-01-24 02:39:11.678004: step: 186/466, loss: 0.565311849117279 2023-01-24 02:39:12.390995: step: 188/466, loss: 0.07549922913312912 2023-01-24 02:39:12.972403: step: 190/466, loss: 0.02192053571343422 2023-01-24 02:39:13.618738: step: 192/466, loss: 0.06956149637699127 2023-01-24 02:39:14.255685: step: 194/466, loss: 0.4531049132347107 2023-01-24 02:39:14.898848: step: 196/466, loss: 0.07766777276992798 2023-01-24 02:39:15.525716: step: 198/466, loss: 0.12816458940505981 2023-01-24 02:39:16.134505: step: 200/466, loss: 0.04339540749788284 2023-01-24 02:39:16.724158: step: 202/466, loss: 0.111495740711689 2023-01-24 02:39:17.371821: step: 204/466, loss: 0.1359895020723343 2023-01-24 02:39:17.967106: step: 206/466, loss: 0.44120514392852783 2023-01-24 02:39:18.571187: step: 208/466, loss: 0.035208724439144135 2023-01-24 02:39:19.179542: step: 210/466, loss: 0.07476918399333954 2023-01-24 02:39:19.805476: step: 212/466, loss: 0.059200409799814224 2023-01-24 02:39:20.441046: step: 214/466, loss: 0.08655449748039246 2023-01-24 02:39:21.114024: step: 216/466, loss: 0.03006293624639511 2023-01-24 02:39:21.789042: step: 218/466, loss: 0.12334776669740677 2023-01-24 02:39:22.453954: step: 
220/466, loss: 0.0786009430885315 2023-01-24 02:39:23.102859: step: 222/466, loss: 0.16112715005874634 2023-01-24 02:39:23.729603: step: 224/466, loss: 0.028480958193540573 2023-01-24 02:39:24.274341: step: 226/466, loss: 0.05188245326280594 2023-01-24 02:39:24.871020: step: 228/466, loss: 1.4788625240325928 2023-01-24 02:39:25.581458: step: 230/466, loss: 0.06716261804103851 2023-01-24 02:39:26.202094: step: 232/466, loss: 0.021676335483789444 2023-01-24 02:39:26.837818: step: 234/466, loss: 0.06132663041353226 2023-01-24 02:39:27.461345: step: 236/466, loss: 0.050321124494075775 2023-01-24 02:39:28.209653: step: 238/466, loss: 0.272202730178833 2023-01-24 02:39:28.834249: step: 240/466, loss: 0.06891126185655594 2023-01-24 02:39:29.494026: step: 242/466, loss: 0.18587826192378998 2023-01-24 02:39:30.116503: step: 244/466, loss: 0.10435792803764343 2023-01-24 02:39:30.776699: step: 246/466, loss: 0.08475508540868759 2023-01-24 02:39:31.419252: step: 248/466, loss: 0.04646135866641998 2023-01-24 02:39:32.045043: step: 250/466, loss: 0.044781383126974106 2023-01-24 02:39:32.727959: step: 252/466, loss: 0.10665663331747055 2023-01-24 02:39:33.332505: step: 254/466, loss: 0.06658080965280533 2023-01-24 02:39:33.959982: step: 256/466, loss: 0.17637376487255096 2023-01-24 02:39:34.626615: step: 258/466, loss: 0.0785864070057869 2023-01-24 02:39:35.272810: step: 260/466, loss: 0.006488861050456762 2023-01-24 02:39:35.935264: step: 262/466, loss: 0.025803137570619583 2023-01-24 02:39:36.611414: step: 264/466, loss: 0.08928970247507095 2023-01-24 02:39:37.273729: step: 266/466, loss: 0.06495597213506699 2023-01-24 02:39:37.893825: step: 268/466, loss: 0.040781501680612564 2023-01-24 02:39:38.477318: step: 270/466, loss: 0.07361089438199997 2023-01-24 02:39:39.120836: step: 272/466, loss: 0.050119150429964066 2023-01-24 02:39:39.701736: step: 274/466, loss: 0.03163013607263565 2023-01-24 02:39:40.348959: step: 276/466, loss: 0.05836133658885956 2023-01-24 02:39:40.952883: 
step: 278/466, loss: 0.19847925007343292 2023-01-24 02:39:41.571813: step: 280/466, loss: 0.05794009938836098 2023-01-24 02:39:42.194437: step: 282/466, loss: 0.01365175936371088 2023-01-24 02:39:42.836459: step: 284/466, loss: 0.07203265279531479 2023-01-24 02:39:43.489549: step: 286/466, loss: 0.0678028017282486 2023-01-24 02:39:44.125192: step: 288/466, loss: 0.3244488537311554 2023-01-24 02:39:44.730358: step: 290/466, loss: 0.017647601664066315 2023-01-24 02:39:45.337760: step: 292/466, loss: 0.1130477637052536 2023-01-24 02:39:45.898622: step: 294/466, loss: 0.10335894674062729 2023-01-24 02:39:46.540887: step: 296/466, loss: 0.013636148534715176 2023-01-24 02:39:47.235711: step: 298/466, loss: 0.054267518222332 2023-01-24 02:39:47.869362: step: 300/466, loss: 0.1287168264389038 2023-01-24 02:39:48.500950: step: 302/466, loss: 0.07640498131513596 2023-01-24 02:39:49.082740: step: 304/466, loss: 0.07450221478939056 2023-01-24 02:39:49.692623: step: 306/466, loss: 0.12391683459281921 2023-01-24 02:39:50.323497: step: 308/466, loss: 0.04375586658716202 2023-01-24 02:39:50.988929: step: 310/466, loss: 0.04667707532644272 2023-01-24 02:39:51.636063: step: 312/466, loss: 0.06447603553533554 2023-01-24 02:39:52.240427: step: 314/466, loss: 0.061549168080091476 2023-01-24 02:39:52.805574: step: 316/466, loss: 0.12271782755851746 2023-01-24 02:39:53.401563: step: 318/466, loss: 0.035681240260601044 2023-01-24 02:39:54.030356: step: 320/466, loss: 0.06072334572672844 2023-01-24 02:39:54.622352: step: 322/466, loss: 0.11734100431203842 2023-01-24 02:39:55.194784: step: 324/466, loss: 0.10797211527824402 2023-01-24 02:39:55.790894: step: 326/466, loss: 0.0348459854722023 2023-01-24 02:39:56.448883: step: 328/466, loss: 0.31300246715545654 2023-01-24 02:39:57.063177: step: 330/466, loss: 0.05565841868519783 2023-01-24 02:39:57.630798: step: 332/466, loss: 0.04864946007728577 2023-01-24 02:39:58.241760: step: 334/466, loss: 0.05759395658969879 2023-01-24 02:39:58.891835: 
step: 336/466, loss: 0.01965009979903698 2023-01-24 02:39:59.503286: step: 338/466, loss: 0.8483911156654358 2023-01-24 02:40:00.173253: step: 340/466, loss: 0.09508222341537476 2023-01-24 02:40:00.733910: step: 342/466, loss: 0.119389608502388 2023-01-24 02:40:01.458760: step: 344/466, loss: 0.049336206167936325 2023-01-24 02:40:02.145127: step: 346/466, loss: 0.08939649164676666 2023-01-24 02:40:02.679591: step: 348/466, loss: 0.025346161797642708 2023-01-24 02:40:03.339336: step: 350/466, loss: 0.13140961527824402 2023-01-24 02:40:04.007419: step: 352/466, loss: 0.14738011360168457 2023-01-24 02:40:04.636564: step: 354/466, loss: 0.08162593841552734 2023-01-24 02:40:05.219820: step: 356/466, loss: 0.04835676774382591 2023-01-24 02:40:05.905631: step: 358/466, loss: 0.24017801880836487 2023-01-24 02:40:06.541081: step: 360/466, loss: 0.014580821618437767 2023-01-24 02:40:07.179329: step: 362/466, loss: 0.1441817432641983 2023-01-24 02:40:07.799126: step: 364/466, loss: 0.05539591610431671 2023-01-24 02:40:08.402060: step: 366/466, loss: 0.059686385095119476 2023-01-24 02:40:09.048135: step: 368/466, loss: 0.03324635699391365 2023-01-24 02:40:09.687019: step: 370/466, loss: 0.10210835933685303 2023-01-24 02:40:10.337571: step: 372/466, loss: 0.06760048866271973 2023-01-24 02:40:10.967964: step: 374/466, loss: 0.05021263659000397 2023-01-24 02:40:11.584808: step: 376/466, loss: 0.07459504157304764 2023-01-24 02:40:12.195799: step: 378/466, loss: 0.08886837959289551 2023-01-24 02:40:12.770625: step: 380/466, loss: 0.08208400011062622 2023-01-24 02:40:13.402314: step: 382/466, loss: 0.01850113831460476 2023-01-24 02:40:14.073871: step: 384/466, loss: 0.12809403240680695 2023-01-24 02:40:14.706808: step: 386/466, loss: 0.14002923667430878 2023-01-24 02:40:15.342830: step: 388/466, loss: 0.14379382133483887 2023-01-24 02:40:15.958545: step: 390/466, loss: 0.10631173849105835 2023-01-24 02:40:16.574280: step: 392/466, loss: 0.038909073919057846 2023-01-24 
02:40:17.200051: step: 394/466, loss: 0.19458021223545074 2023-01-24 02:40:17.848135: step: 396/466, loss: 0.04370439797639847 2023-01-24 02:40:18.463316: step: 398/466, loss: 0.09685692936182022 2023-01-24 02:40:19.170812: step: 400/466, loss: 0.07326396554708481 2023-01-24 02:40:19.802902: step: 402/466, loss: 0.033047426491975784 2023-01-24 02:40:20.452852: step: 404/466, loss: 0.014400389045476913 2023-01-24 02:40:21.054864: step: 406/466, loss: 0.07890332490205765 2023-01-24 02:40:21.744259: step: 408/466, loss: 0.023150041699409485 2023-01-24 02:40:22.269819: step: 410/466, loss: 0.04116898775100708 2023-01-24 02:40:22.919041: step: 412/466, loss: 0.19657215476036072 2023-01-24 02:40:23.534038: step: 414/466, loss: 0.0530715212225914 2023-01-24 02:40:24.117560: step: 416/466, loss: 0.14518778026103973 2023-01-24 02:40:24.767532: step: 418/466, loss: 0.02331777848303318 2023-01-24 02:40:25.374241: step: 420/466, loss: 0.01835496909916401 2023-01-24 02:40:25.962952: step: 422/466, loss: 0.04725594073534012 2023-01-24 02:40:26.577641: step: 424/466, loss: 0.09757312387228012 2023-01-24 02:40:27.203791: step: 426/466, loss: 0.07201900333166122 2023-01-24 02:40:27.810391: step: 428/466, loss: 0.032236743718385696 2023-01-24 02:40:28.513514: step: 430/466, loss: 0.0317675843834877 2023-01-24 02:40:29.130343: step: 432/466, loss: 0.8498877286911011 2023-01-24 02:40:29.738457: step: 434/466, loss: 0.037114985287189484 2023-01-24 02:40:30.380288: step: 436/466, loss: 0.2792483866214752 2023-01-24 02:40:31.080795: step: 438/466, loss: 0.06097016483545303 2023-01-24 02:40:31.668296: step: 440/466, loss: 0.008344702422618866 2023-01-24 02:40:32.294578: step: 442/466, loss: 0.18476438522338867 2023-01-24 02:40:32.877863: step: 444/466, loss: 0.07133772224187851 2023-01-24 02:40:33.434247: step: 446/466, loss: 0.038877032697200775 2023-01-24 02:40:33.996115: step: 448/466, loss: 0.03844582289457321 2023-01-24 02:40:34.648032: step: 450/466, loss: 0.023474067449569702 
2023-01-24 02:40:35.322664: step: 452/466, loss: 0.12620507180690765 2023-01-24 02:40:35.864147: step: 454/466, loss: 0.026539389044046402 2023-01-24 02:40:36.439493: step: 456/466, loss: 0.7999367713928223 2023-01-24 02:40:37.094259: step: 458/466, loss: 0.07577664405107498 2023-01-24 02:40:37.758228: step: 460/466, loss: 0.1380268633365631 2023-01-24 02:40:38.495822: step: 462/466, loss: 0.19326843321323395 2023-01-24 02:40:39.116661: step: 464/466, loss: 0.11803829669952393 2023-01-24 02:40:39.794468: step: 466/466, loss: 0.17432992160320282 2023-01-24 02:40:40.443187: step: 468/466, loss: 0.13016678392887115 2023-01-24 02:40:41.054041: step: 470/466, loss: 0.06393105536699295 2023-01-24 02:40:41.637188: step: 472/466, loss: 0.026901068165898323 2023-01-24 02:40:42.300067: step: 474/466, loss: 0.029669426381587982 2023-01-24 02:40:42.958549: step: 476/466, loss: 0.11381050199270248 2023-01-24 02:40:43.548831: step: 478/466, loss: 0.05609027296304703 2023-01-24 02:40:44.195638: step: 480/466, loss: 0.028202557936310768 2023-01-24 02:40:44.871836: step: 482/466, loss: 0.5612933039665222 2023-01-24 02:40:45.472858: step: 484/466, loss: 0.12710697948932648 2023-01-24 02:40:46.208157: step: 486/466, loss: 0.34370070695877075 2023-01-24 02:40:46.824245: step: 488/466, loss: 0.3669477105140686 2023-01-24 02:40:47.438665: step: 490/466, loss: 0.08090394735336304 2023-01-24 02:40:48.079235: step: 492/466, loss: 0.1269034892320633 2023-01-24 02:40:48.721480: step: 494/466, loss: 1.480870246887207 2023-01-24 02:40:49.399630: step: 496/466, loss: 0.15957190096378326 2023-01-24 02:40:50.058378: step: 498/466, loss: 0.03893940895795822 2023-01-24 02:40:50.717632: step: 500/466, loss: 0.027543164789676666 2023-01-24 02:40:51.372577: step: 502/466, loss: 0.11042570322751999 2023-01-24 02:40:51.967973: step: 504/466, loss: 0.054097600281238556 2023-01-24 02:40:52.580670: step: 506/466, loss: 0.09026054292917252 2023-01-24 02:40:53.273437: step: 508/466, loss: 0.070386603474617 
2023-01-24 02:40:53.908804: step: 510/466, loss: 0.05811745673418045 2023-01-24 02:40:54.587799: step: 512/466, loss: 0.15301252901554108 2023-01-24 02:40:55.188401: step: 514/466, loss: 0.07264979183673859 2023-01-24 02:40:55.855484: step: 516/466, loss: 0.10656112432479858 2023-01-24 02:40:56.566313: step: 518/466, loss: 0.10963854938745499 2023-01-24 02:40:57.227378: step: 520/466, loss: 0.07999063283205032 2023-01-24 02:40:57.841385: step: 522/466, loss: 0.0714545026421547 2023-01-24 02:40:58.497361: step: 524/466, loss: 0.1233542338013649 2023-01-24 02:40:59.158778: step: 526/466, loss: 0.08987359702587128 2023-01-24 02:40:59.805079: step: 528/466, loss: 0.026457447558641434 2023-01-24 02:41:00.446364: step: 530/466, loss: 0.09527673572301865 2023-01-24 02:41:01.067690: step: 532/466, loss: 0.17456305027008057 2023-01-24 02:41:01.679941: step: 534/466, loss: 0.04650742560625076 2023-01-24 02:41:02.369446: step: 536/466, loss: 0.14098072052001953 2023-01-24 02:41:03.014487: step: 538/466, loss: 0.8169363141059875 2023-01-24 02:41:03.637803: step: 540/466, loss: 0.07249260693788528 2023-01-24 02:41:04.199176: step: 542/466, loss: 0.031110601499676704 2023-01-24 02:41:04.827564: step: 544/466, loss: 0.09898725897073746 2023-01-24 02:41:05.477953: step: 546/466, loss: 0.06797879934310913 2023-01-24 02:41:06.053261: step: 548/466, loss: 0.04290494695305824 2023-01-24 02:41:06.724952: step: 550/466, loss: 0.18823488056659698 2023-01-24 02:41:07.399997: step: 552/466, loss: 0.1126476526260376 2023-01-24 02:41:08.220023: step: 554/466, loss: 0.08767277002334595 2023-01-24 02:41:08.814101: step: 556/466, loss: 0.04120711237192154 2023-01-24 02:41:09.439112: step: 558/466, loss: 0.21468394994735718 2023-01-24 02:41:10.094549: step: 560/466, loss: 0.008046263828873634 2023-01-24 02:41:10.759020: step: 562/466, loss: 0.10749727487564087 2023-01-24 02:41:11.392440: step: 564/466, loss: 0.022814007475972176 2023-01-24 02:41:12.116234: step: 566/466, loss: 
0.05688408017158508 2023-01-24 02:41:12.752276: step: 568/466, loss: 0.06673478335142136 2023-01-24 02:41:13.464318: step: 570/466, loss: 0.0634184256196022 2023-01-24 02:41:14.084319: step: 572/466, loss: 0.04535965248942375 2023-01-24 02:41:14.748867: step: 574/466, loss: 0.060279201716184616 2023-01-24 02:41:15.377310: step: 576/466, loss: 0.04101169854402542 2023-01-24 02:41:16.044773: step: 578/466, loss: 0.06607304513454437 2023-01-24 02:41:16.679402: step: 580/466, loss: 0.10051118582487106 2023-01-24 02:41:17.333115: step: 582/466, loss: 0.06458373367786407 2023-01-24 02:41:17.928199: step: 584/466, loss: 0.4127489924430847 2023-01-24 02:41:18.542321: step: 586/466, loss: 0.03470811992883682 2023-01-24 02:41:19.199599: step: 588/466, loss: 0.15565644204616547 2023-01-24 02:41:19.818913: step: 590/466, loss: 0.07380617409944534 2023-01-24 02:41:20.472008: step: 592/466, loss: 0.04714164882898331 2023-01-24 02:41:21.096858: step: 594/466, loss: 0.4443550109863281 2023-01-24 02:41:21.801593: step: 596/466, loss: 0.4796164035797119 2023-01-24 02:41:22.518821: step: 598/466, loss: 0.590786337852478 2023-01-24 02:41:23.095664: step: 600/466, loss: 0.0254229623824358 2023-01-24 02:41:23.682210: step: 602/466, loss: 0.0715954527258873 2023-01-24 02:41:24.307261: step: 604/466, loss: 0.12897107005119324 2023-01-24 02:41:24.950227: step: 606/466, loss: 0.15412917733192444 2023-01-24 02:41:25.592210: step: 608/466, loss: 0.1416398286819458 2023-01-24 02:41:26.151476: step: 610/466, loss: 0.06602641195058823 2023-01-24 02:41:26.725399: step: 612/466, loss: 0.12513640522956848 2023-01-24 02:41:27.304739: step: 614/466, loss: 0.7022066712379456 2023-01-24 02:41:27.962734: step: 616/466, loss: 0.03784637153148651 2023-01-24 02:41:28.564059: step: 618/466, loss: 0.5480793118476868 2023-01-24 02:41:29.169652: step: 620/466, loss: 0.022540397942066193 2023-01-24 02:41:29.747435: step: 622/466, loss: 0.06010816618800163 2023-01-24 02:41:30.379761: step: 624/466, loss: 
0.11559788137674332 2023-01-24 02:41:31.017099: step: 626/466, loss: 0.09054840356111526 2023-01-24 02:41:31.663366: step: 628/466, loss: 0.22871863842010498 2023-01-24 02:41:32.314187: step: 630/466, loss: 0.04240802302956581 2023-01-24 02:41:32.877104: step: 632/466, loss: 0.6050617694854736 2023-01-24 02:41:33.519179: step: 634/466, loss: 0.07030927389860153 2023-01-24 02:41:34.135862: step: 636/466, loss: 0.13568347692489624 2023-01-24 02:41:34.737678: step: 638/466, loss: 0.02763962559401989 2023-01-24 02:41:35.351589: step: 640/466, loss: 0.10188723355531693 2023-01-24 02:41:36.008105: step: 642/466, loss: 0.10485479235649109 2023-01-24 02:41:36.628670: step: 644/466, loss: 0.06153221055865288 2023-01-24 02:41:37.204293: step: 646/466, loss: 0.09548943489789963 2023-01-24 02:41:37.794667: step: 648/466, loss: 0.07203955948352814 2023-01-24 02:41:38.400362: step: 650/466, loss: 0.04134916141629219 2023-01-24 02:41:39.056840: step: 652/466, loss: 0.4011443257331848 2023-01-24 02:41:39.697358: step: 654/466, loss: 0.08159706741571426 2023-01-24 02:41:40.335945: step: 656/466, loss: 0.02823558636009693 2023-01-24 02:41:40.988724: step: 658/466, loss: 0.06172872334718704 2023-01-24 02:41:41.564624: step: 660/466, loss: 0.5977843403816223 2023-01-24 02:41:42.184458: step: 662/466, loss: 0.048087794333696365 2023-01-24 02:41:42.865563: step: 664/466, loss: 0.07502186298370361 2023-01-24 02:41:43.440890: step: 666/466, loss: 0.020353632047772408 2023-01-24 02:41:43.989228: step: 668/466, loss: 0.04688615724444389 2023-01-24 02:41:44.577892: step: 670/466, loss: 0.06386066973209381 2023-01-24 02:41:45.287982: step: 672/466, loss: 0.148604616522789 2023-01-24 02:41:45.949902: step: 674/466, loss: 0.1732548624277115 2023-01-24 02:41:46.605026: step: 676/466, loss: 0.0486491434276104 2023-01-24 02:41:47.236851: step: 678/466, loss: 0.09224563091993332 2023-01-24 02:41:47.854458: step: 680/466, loss: 0.060158032923936844 2023-01-24 02:41:48.493078: step: 682/466, loss: 
0.02874758094549179 2023-01-24 02:41:49.203918: step: 684/466, loss: 0.049542300403118134 2023-01-24 02:41:49.838501: step: 686/466, loss: 0.04817377030849457 2023-01-24 02:41:50.471117: step: 688/466, loss: 0.07989070564508438 2023-01-24 02:41:51.091047: step: 690/466, loss: 0.0017653441755101085 2023-01-24 02:41:51.725169: step: 692/466, loss: 0.15680000185966492 2023-01-24 02:41:52.441702: step: 694/466, loss: 0.06483715027570724 2023-01-24 02:41:53.019830: step: 696/466, loss: 0.06312517821788788 2023-01-24 02:41:53.689485: step: 698/466, loss: 0.0584401860833168 2023-01-24 02:41:54.316600: step: 700/466, loss: 0.1653444766998291 2023-01-24 02:41:54.918428: step: 702/466, loss: 0.1201152354478836 2023-01-24 02:41:55.500452: step: 704/466, loss: 0.07937215268611908 2023-01-24 02:41:56.088624: step: 706/466, loss: 0.12155706435441971 2023-01-24 02:41:56.708052: step: 708/466, loss: 0.08106189966201782 2023-01-24 02:41:57.313770: step: 710/466, loss: 0.06951658427715302 2023-01-24 02:41:57.898319: step: 712/466, loss: 0.08725471794605255 2023-01-24 02:41:58.578273: step: 714/466, loss: 0.02670646458864212 2023-01-24 02:41:59.255126: step: 716/466, loss: 0.06493691354990005 2023-01-24 02:41:59.890809: step: 718/466, loss: 0.028881005942821503 2023-01-24 02:42:00.541536: step: 720/466, loss: 0.09733739495277405 2023-01-24 02:42:01.117326: step: 722/466, loss: 0.061924561858177185 2023-01-24 02:42:01.708618: step: 724/466, loss: 0.01758013479411602 2023-01-24 02:42:02.376936: step: 726/466, loss: 0.08518119156360626 2023-01-24 02:42:02.986979: step: 728/466, loss: 0.0513693243265152 2023-01-24 02:42:03.618113: step: 730/466, loss: 0.5823181867599487 2023-01-24 02:42:04.285457: step: 732/466, loss: 0.9185676574707031 2023-01-24 02:42:04.867144: step: 734/466, loss: 0.06561222672462463 2023-01-24 02:42:05.489832: step: 736/466, loss: 0.09126259386539459 2023-01-24 02:42:06.147154: step: 738/466, loss: 0.08993934094905853 2023-01-24 02:42:06.780709: step: 740/466, loss: 
0.15887029469013214 2023-01-24 02:42:07.425446: step: 742/466, loss: 0.5598637461662292 2023-01-24 02:42:08.080221: step: 744/466, loss: 0.17869073152542114 2023-01-24 02:42:08.718574: step: 746/466, loss: 0.09603740274906158 2023-01-24 02:42:09.348683: step: 748/466, loss: 0.07360593974590302 2023-01-24 02:42:10.025768: step: 750/466, loss: 0.1839761584997177 2023-01-24 02:42:10.670585: step: 752/466, loss: 0.13834035396575928 2023-01-24 02:42:11.298281: step: 754/466, loss: 0.008991479873657227 2023-01-24 02:42:11.898807: step: 756/466, loss: 0.06636402010917664 2023-01-24 02:42:12.499923: step: 758/466, loss: 0.08498692512512207 2023-01-24 02:42:13.114311: step: 760/466, loss: 0.05186803638935089 2023-01-24 02:42:13.792515: step: 762/466, loss: 0.04967869818210602 2023-01-24 02:42:14.389632: step: 764/466, loss: 0.3186541795730591 2023-01-24 02:42:15.116594: step: 766/466, loss: 1.305495023727417 2023-01-24 02:42:15.812891: step: 768/466, loss: 0.054354071617126465 2023-01-24 02:42:16.469087: step: 770/466, loss: 0.07185588777065277 2023-01-24 02:42:17.099177: step: 772/466, loss: 0.05645868182182312 2023-01-24 02:42:17.749663: step: 774/466, loss: 0.1832362711429596 2023-01-24 02:42:18.531688: step: 776/466, loss: 0.1475657969713211 2023-01-24 02:42:19.189420: step: 778/466, loss: 0.8674372434616089 2023-01-24 02:42:19.762267: step: 780/466, loss: 0.08966337889432907 2023-01-24 02:42:20.427268: step: 782/466, loss: 0.008127622306346893 2023-01-24 02:42:21.038306: step: 784/466, loss: 0.06214836984872818 2023-01-24 02:42:21.666360: step: 786/466, loss: 0.1667788326740265 2023-01-24 02:42:22.282734: step: 788/466, loss: 0.08831219375133514 2023-01-24 02:42:22.845973: step: 790/466, loss: 0.0821632519364357 2023-01-24 02:42:23.522568: step: 792/466, loss: 0.08928848803043365 2023-01-24 02:42:24.136381: step: 794/466, loss: 0.08582484722137451 2023-01-24 02:42:24.751696: step: 796/466, loss: 0.014395495876669884 2023-01-24 02:42:25.446159: step: 798/466, loss: 
0.03297445923089981 2023-01-24 02:42:26.090751: step: 800/466, loss: 0.011172035709023476 2023-01-24 02:42:26.677079: step: 802/466, loss: 0.016700677573680878 2023-01-24 02:42:27.298348: step: 804/466, loss: 0.1435922384262085 2023-01-24 02:42:27.956155: step: 806/466, loss: 0.5995723605155945 2023-01-24 02:42:28.581972: step: 808/466, loss: 0.04722990840673447 2023-01-24 02:42:29.181445: step: 810/466, loss: 0.4910792112350464 2023-01-24 02:42:29.843527: step: 812/466, loss: 0.0818682312965393 2023-01-24 02:42:30.510926: step: 814/466, loss: 0.026851100847125053 2023-01-24 02:42:31.059493: step: 816/466, loss: 0.08519230037927628 2023-01-24 02:42:31.665677: step: 818/466, loss: 0.636756420135498 2023-01-24 02:42:32.294352: step: 820/466, loss: 0.024506378918886185 2023-01-24 02:42:32.918174: step: 822/466, loss: 0.1749979555606842 2023-01-24 02:42:33.528430: step: 824/466, loss: 0.04244179651141167 2023-01-24 02:42:34.122556: step: 826/466, loss: 0.03763969615101814 2023-01-24 02:42:34.780760: step: 828/466, loss: 0.017545204609632492 2023-01-24 02:42:35.440139: step: 830/466, loss: 0.15983814001083374 2023-01-24 02:42:36.067812: step: 832/466, loss: 0.21922515332698822 2023-01-24 02:42:36.647980: step: 834/466, loss: 0.10022667050361633 2023-01-24 02:42:37.225378: step: 836/466, loss: 0.03854656592011452 2023-01-24 02:42:37.848417: step: 838/466, loss: 0.039883919060230255 2023-01-24 02:42:38.450027: step: 840/466, loss: 0.18616662919521332 2023-01-24 02:42:39.096549: step: 842/466, loss: 0.09625918418169022 2023-01-24 02:42:39.795709: step: 844/466, loss: 0.020380394533276558 2023-01-24 02:42:40.423013: step: 846/466, loss: 0.13634014129638672 2023-01-24 02:42:41.060506: step: 848/466, loss: 0.056695401668548584 2023-01-24 02:42:41.614247: step: 850/466, loss: 0.023742130026221275 2023-01-24 02:42:42.242588: step: 852/466, loss: 0.3502342402935028 2023-01-24 02:42:42.948386: step: 854/466, loss: 0.053310394287109375 2023-01-24 02:42:43.563171: step: 856/466, 
loss: 0.13849897682666779 2023-01-24 02:42:44.206396: step: 858/466, loss: 0.049945440143346786 2023-01-24 02:42:44.833384: step: 860/466, loss: 0.07457361370325089 2023-01-24 02:42:45.451196: step: 862/466, loss: 1.0580438375473022 2023-01-24 02:42:46.055869: step: 864/466, loss: 1.2639542818069458 2023-01-24 02:42:46.659758: step: 866/466, loss: 0.008224474266171455 2023-01-24 02:42:47.246067: step: 868/466, loss: 1.7443006038665771 2023-01-24 02:42:47.866872: step: 870/466, loss: 0.2736111581325531 2023-01-24 02:42:48.522548: step: 872/466, loss: 0.06253460794687271 2023-01-24 02:42:49.170899: step: 874/466, loss: 0.26736292243003845 2023-01-24 02:42:49.830372: step: 876/466, loss: 0.19355060160160065 2023-01-24 02:42:50.457733: step: 878/466, loss: 0.2999248504638672 2023-01-24 02:42:51.029766: step: 880/466, loss: 0.06770453602075577 2023-01-24 02:42:51.667012: step: 882/466, loss: 0.046020377427339554 2023-01-24 02:42:52.339527: step: 884/466, loss: 0.09162288904190063 2023-01-24 02:42:52.979475: step: 886/466, loss: 0.07737147808074951 2023-01-24 02:42:53.616962: step: 888/466, loss: 0.7829618453979492 2023-01-24 02:42:54.247184: step: 890/466, loss: 0.032086726278066635 2023-01-24 02:42:54.849752: step: 892/466, loss: 0.05165988206863403 2023-01-24 02:42:55.501808: step: 894/466, loss: 0.1862795352935791 2023-01-24 02:42:56.195022: step: 896/466, loss: 0.11243996024131775 2023-01-24 02:42:56.801743: step: 898/466, loss: 0.39769259095191956 2023-01-24 02:42:57.520445: step: 900/466, loss: 0.025845997035503387 2023-01-24 02:42:58.142359: step: 902/466, loss: 0.024150336161255836 2023-01-24 02:42:58.832787: step: 904/466, loss: 0.2105114609003067 2023-01-24 02:42:59.486628: step: 906/466, loss: 0.27665263414382935 2023-01-24 02:43:00.143375: step: 908/466, loss: 0.03600461781024933 2023-01-24 02:43:00.788806: step: 910/466, loss: 0.09567281603813171 2023-01-24 02:43:01.436661: step: 912/466, loss: 0.2852194607257843 2023-01-24 02:43:02.024736: step: 914/466, 
loss: 0.2959142327308655 2023-01-24 02:43:02.662758: step: 916/466, loss: 0.04316151514649391 2023-01-24 02:43:03.367756: step: 918/466, loss: 0.14325612783432007 2023-01-24 02:43:04.026549: step: 920/466, loss: 0.10374636948108673 2023-01-24 02:43:04.628433: step: 922/466, loss: 0.18825989961624146 2023-01-24 02:43:05.246664: step: 924/466, loss: 0.6862626671791077 2023-01-24 02:43:05.922334: step: 926/466, loss: 0.13701371848583221 2023-01-24 02:43:06.544129: step: 928/466, loss: 0.023627731949090958 2023-01-24 02:43:07.162007: step: 930/466, loss: 0.04441440850496292 2023-01-24 02:43:07.800127: step: 932/466, loss: 0.13912048935890198
==================================================
Loss: 0.142
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34364392396469795, 'r': 0.32016919671094246, 'f1': 0.3314914865749837}, 'combined': 0.2442568848447248, 'epoch': 18}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3478852351158289, 'r': 0.30876319310453326, 'f1': 0.327158800393071}, 'combined': 0.21697578471664808, 'epoch': 18}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3365946261682243, 'r': 0.2728456439393939, 'f1': 0.3013859832635983}, 'combined': 0.20092398884239884, 'epoch': 18}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.35330897971372305, 'r': 0.28851529139252124, 'f1': 0.31764159699976624}, 'combined': 0.2073029369893211, 'epoch': 18}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32673401891941856, 'r': 0.31557422320680084, 'f1': 0.32105717303085723}, 'combined': 0.2365684432858948, 'epoch': 18}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.337908154750774, 'r': 0.29409983736735473, 'f1': 0.31448568561370555}, 'combined': 0.20857081740183578, 'epoch': 18}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2947154471544715, 'r': 0.3452380952380952, 'f1': 0.31798245614035087}, 'combined': 0.21198830409356723, 'epoch': 18}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4642857142857143, 'r': 0.2826086956521739, 'f1': 0.35135135135135126}, 'combined': 0.23423423423423417, 'epoch': 18}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34375, 'r': 0.09482758620689655, 'f1': 0.14864864864864866}, 'combined': 0.0990990990990991, 'epoch': 18}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 19
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 02:45:39.100329: step: 2/466, loss: 0.17019319534301758 2023-01-24 02:45:39.716099: step: 4/466, loss: 0.09898035228252411 2023-01-24 02:45:40.327142: step: 6/466, loss: 0.06031796336174011 2023-01-24 02:45:40.920366: step: 8/466, loss: 0.04319017007946968 2023-01-24 02:45:41.605738: step: 10/466, loss: 0.08148910105228424 2023-01-24 02:45:42.263231: step: 12/466, loss: 0.03979048877954483 2023-01-24 02:45:42.880537: step: 14/466, loss: 0.07453230023384094 2023-01-24 02:45:43.462707: step: 16/466, loss: 0.06262771040201187 2023-01-24 02:45:44.074077: step: 18/466, loss: 0.4737427234649658 2023-01-24 02:45:44.672868: step: 20/466, loss: 0.0073438892140984535 2023-01-24 02:45:45.294347: step: 22/466, loss: 0.01852789707481861 2023-01-24 02:45:45.898347: step: 24/466, loss: 0.2755466401576996 2023-01-24 02:45:46.513537: step: 26/466, loss: 0.0645308718085289 2023-01-24 02:45:47.244032:
step: 28/466, loss: 0.076554074883461 2023-01-24 02:45:47.894865: step: 30/466, loss: 0.12746798992156982 2023-01-24 02:45:48.544254: step: 32/466, loss: 0.11316990852355957 2023-01-24 02:45:49.196622: step: 34/466, loss: 0.0632438138127327 2023-01-24 02:45:49.774199: step: 36/466, loss: 0.047239162027835846 2023-01-24 02:45:50.402199: step: 38/466, loss: 0.03865513578057289 2023-01-24 02:45:51.071409: step: 40/466, loss: 0.009294724091887474 2023-01-24 02:45:51.724789: step: 42/466, loss: 0.03557030111551285 2023-01-24 02:45:52.377999: step: 44/466, loss: 0.06436656415462494 2023-01-24 02:45:53.062559: step: 46/466, loss: 0.11278537660837173 2023-01-24 02:45:53.710237: step: 48/466, loss: 0.03214911371469498 2023-01-24 02:45:54.387157: step: 50/466, loss: 0.03796916455030441 2023-01-24 02:45:55.061528: step: 52/466, loss: 0.12940087914466858 2023-01-24 02:45:55.766418: step: 54/466, loss: 0.05129818990826607 2023-01-24 02:45:56.426937: step: 56/466, loss: 0.1207818016409874 2023-01-24 02:45:56.988774: step: 58/466, loss: 0.03639287129044533 2023-01-24 02:45:57.576894: step: 60/466, loss: 0.04632697254419327 2023-01-24 02:45:58.190569: step: 62/466, loss: 0.007665011566132307 2023-01-24 02:45:58.784472: step: 64/466, loss: 0.022454943507909775 2023-01-24 02:45:59.484913: step: 66/466, loss: 0.030852628871798515 2023-01-24 02:46:00.265558: step: 68/466, loss: 0.04700777307152748 2023-01-24 02:46:00.945262: step: 70/466, loss: 0.1398026943206787 2023-01-24 02:46:01.529610: step: 72/466, loss: 0.06548305600881577 2023-01-24 02:46:02.170643: step: 74/466, loss: 0.038578152656555176 2023-01-24 02:46:02.847582: step: 76/466, loss: 0.016550444066524506 2023-01-24 02:46:03.478158: step: 78/466, loss: 0.1139058768749237 2023-01-24 02:46:04.088575: step: 80/466, loss: 0.03092702478170395 2023-01-24 02:46:04.699806: step: 82/466, loss: 0.2325407713651657 2023-01-24 02:46:05.286179: step: 84/466, loss: 0.034473005682229996 2023-01-24 02:46:05.934168: step: 86/466, loss: 
0.22252222895622253 2023-01-24 02:46:06.536947: step: 88/466, loss: 0.09144359081983566 2023-01-24 02:46:07.101343: step: 90/466, loss: 0.05625465139746666 2023-01-24 02:46:07.723741: step: 92/466, loss: 0.01845523715019226 2023-01-24 02:46:08.370909: step: 94/466, loss: 0.010869793593883514 2023-01-24 02:46:09.008121: step: 96/466, loss: 0.05374189093708992 2023-01-24 02:46:09.738608: step: 98/466, loss: 0.1615215241909027 2023-01-24 02:46:10.370264: step: 100/466, loss: 0.13167911767959595 2023-01-24 02:46:11.006339: step: 102/466, loss: 0.03532380983233452 2023-01-24 02:46:11.587718: step: 104/466, loss: 0.05657459422945976 2023-01-24 02:46:12.258416: step: 106/466, loss: 0.011587120592594147 2023-01-24 02:46:12.826019: step: 108/466, loss: 0.00539175933226943 2023-01-24 02:46:13.397461: step: 110/466, loss: 0.01461577508598566 2023-01-24 02:46:14.046865: step: 112/466, loss: 0.04137485846877098 2023-01-24 02:46:14.720145: step: 114/466, loss: 0.10069483518600464 2023-01-24 02:46:15.358837: step: 116/466, loss: 0.03732612356543541 2023-01-24 02:46:15.985610: step: 118/466, loss: 0.06704573333263397 2023-01-24 02:46:16.709742: step: 120/466, loss: 0.17603015899658203 2023-01-24 02:46:17.327652: step: 122/466, loss: 0.00975867360830307 2023-01-24 02:46:17.990653: step: 124/466, loss: 1.9634912014007568 2023-01-24 02:46:18.592737: step: 126/466, loss: 0.033877596259117126 2023-01-24 02:46:19.193059: step: 128/466, loss: 0.04435870796442032 2023-01-24 02:46:19.885488: step: 130/466, loss: 0.06567829847335815 2023-01-24 02:46:20.535734: step: 132/466, loss: 0.06342989206314087 2023-01-24 02:46:21.253805: step: 134/466, loss: 0.10938435792922974 2023-01-24 02:46:21.841550: step: 136/466, loss: 0.023159049451351166 2023-01-24 02:46:22.511588: step: 138/466, loss: 0.028895560652017593 2023-01-24 02:46:23.194047: step: 140/466, loss: 0.07450679689645767 2023-01-24 02:46:23.797754: step: 142/466, loss: 0.09156231582164764 2023-01-24 02:46:24.398319: step: 144/466, loss: 
0.048212744295597076 2023-01-24 02:46:24.987405: step: 146/466, loss: 0.07371757179498672 2023-01-24 02:46:25.591454: step: 148/466, loss: 0.04905620589852333 2023-01-24 02:46:26.217794: step: 150/466, loss: 0.10677918791770935 2023-01-24 02:46:26.885218: step: 152/466, loss: 0.009451477788388729 2023-01-24 02:46:27.495599: step: 154/466, loss: 0.014463577419519424 2023-01-24 02:46:28.129781: step: 156/466, loss: 0.0741467997431755 2023-01-24 02:46:28.796333: step: 158/466, loss: 0.08025062084197998 2023-01-24 02:46:29.454415: step: 160/466, loss: 0.05890921503305435 2023-01-24 02:46:30.106676: step: 162/466, loss: 0.05542481318116188 2023-01-24 02:46:30.781211: step: 164/466, loss: 0.019434651359915733 2023-01-24 02:46:31.318042: step: 166/466, loss: 0.858182430267334 2023-01-24 02:46:31.884664: step: 168/466, loss: 0.0657401755452156 2023-01-24 02:46:32.555613: step: 170/466, loss: 0.04019244760274887 2023-01-24 02:46:33.175633: step: 172/466, loss: 0.06402216851711273 2023-01-24 02:46:33.868511: step: 174/466, loss: 0.08312556147575378 2023-01-24 02:46:34.484485: step: 176/466, loss: 0.05441839620471001 2023-01-24 02:46:35.205724: step: 178/466, loss: 0.03143666684627533 2023-01-24 02:46:35.833140: step: 180/466, loss: 0.12197541445493698 2023-01-24 02:46:36.474648: step: 182/466, loss: 0.2967766523361206 2023-01-24 02:46:37.075857: step: 184/466, loss: 0.03947914019227028 2023-01-24 02:46:37.677275: step: 186/466, loss: 0.05565882474184036 2023-01-24 02:46:38.271528: step: 188/466, loss: 0.12150771170854568 2023-01-24 02:46:38.925325: step: 190/466, loss: 0.741322934627533 2023-01-24 02:46:39.576542: step: 192/466, loss: 0.06171559914946556 2023-01-24 02:46:40.173612: step: 194/466, loss: 0.028971601277589798 2023-01-24 02:46:40.783802: step: 196/466, loss: 0.5960246324539185 2023-01-24 02:46:41.399251: step: 198/466, loss: 0.10869164764881134 2023-01-24 02:46:42.039184: step: 200/466, loss: 0.03132522851228714 2023-01-24 02:46:42.672448: step: 202/466, loss: 
0.11369634419679642 2023-01-24 02:46:43.274497: step: 204/466, loss: 0.04653134569525719 2023-01-24 02:46:43.840545: step: 206/466, loss: 0.24414774775505066 2023-01-24 02:46:44.449006: step: 208/466, loss: 0.09183460474014282 2023-01-24 02:46:45.065580: step: 210/466, loss: 0.03295493870973587 2023-01-24 02:46:45.680562: step: 212/466, loss: 0.05806458741426468 2023-01-24 02:46:46.346457: step: 214/466, loss: 0.18339110910892487 2023-01-24 02:46:46.932601: step: 216/466, loss: 0.021839501336216927 2023-01-24 02:46:47.494076: step: 218/466, loss: 0.014637252315878868 2023-01-24 02:46:48.122368: step: 220/466, loss: 0.06076514720916748 2023-01-24 02:46:48.800884: step: 222/466, loss: 0.06928149610757828 2023-01-24 02:46:49.435698: step: 224/466, loss: 0.06949815899133682 2023-01-24 02:46:50.141640: step: 226/466, loss: 0.03996966779232025 2023-01-24 02:46:50.719040: step: 228/466, loss: 0.11684077978134155 2023-01-24 02:46:51.312159: step: 230/466, loss: 0.02931184135377407 2023-01-24 02:46:51.905459: step: 232/466, loss: 0.03776870295405388 2023-01-24 02:46:52.533655: step: 234/466, loss: 1.6303871870040894 2023-01-24 02:46:53.233464: step: 236/466, loss: 0.02073729783296585 2023-01-24 02:46:53.872071: step: 238/466, loss: 0.11177118867635727 2023-01-24 02:46:54.530632: step: 240/466, loss: 0.02529243938624859 2023-01-24 02:46:55.151631: step: 242/466, loss: 0.726740837097168 2023-01-24 02:46:55.772647: step: 244/466, loss: 0.10609099268913269 2023-01-24 02:46:56.330219: step: 246/466, loss: 0.1528254896402359 2023-01-24 02:46:56.975764: step: 248/466, loss: 0.13189128041267395 2023-01-24 02:46:57.612197: step: 250/466, loss: 0.10527793318033218 2023-01-24 02:46:58.267196: step: 252/466, loss: 0.2536887228488922 2023-01-24 02:46:58.850908: step: 254/466, loss: 0.030160672962665558 2023-01-24 02:46:59.441183: step: 256/466, loss: 0.3606482148170471 2023-01-24 02:47:00.048726: step: 258/466, loss: 0.11497034132480621 2023-01-24 02:47:00.665479: step: 260/466, loss: 
0.18138667941093445 2023-01-24 02:47:01.225937: step: 262/466, loss: 0.043892133980989456 2023-01-24 02:47:01.820020: step: 264/466, loss: 0.04916343465447426 2023-01-24 02:47:02.411982: step: 266/466, loss: 0.041603077203035355 2023-01-24 02:47:03.040303: step: 268/466, loss: 0.14506392180919647 2023-01-24 02:47:03.624743: step: 270/466, loss: 0.01465784665197134 2023-01-24 02:47:04.251955: step: 272/466, loss: 0.03732382133603096 2023-01-24 02:47:04.858025: step: 274/466, loss: 0.04902615398168564 2023-01-24 02:47:05.462905: step: 276/466, loss: 0.024088485166430473 2023-01-24 02:47:06.129606: step: 278/466, loss: 0.04977373778820038 2023-01-24 02:47:06.832135: step: 280/466, loss: 0.019392477348446846 2023-01-24 02:47:07.441176: step: 282/466, loss: 0.30683591961860657 2023-01-24 02:47:08.069623: step: 284/466, loss: 0.21062332391738892 2023-01-24 02:47:08.743762: step: 286/466, loss: 0.13876795768737793 2023-01-24 02:47:09.299086: step: 288/466, loss: 0.06014120206236839 2023-01-24 02:47:09.923712: step: 290/466, loss: 0.07535697519779205 2023-01-24 02:47:10.513896: step: 292/466, loss: 0.059721991419792175 2023-01-24 02:47:11.115298: step: 294/466, loss: 0.09441936016082764 2023-01-24 02:47:11.796880: step: 296/466, loss: 0.1585637778043747 2023-01-24 02:47:12.324206: step: 298/466, loss: 0.09514472633600235 2023-01-24 02:47:12.915500: step: 300/466, loss: 0.18422214686870575 2023-01-24 02:47:13.480829: step: 302/466, loss: 0.036070097237825394 2023-01-24 02:47:14.089988: step: 304/466, loss: 0.033678553998470306 2023-01-24 02:47:14.759533: step: 306/466, loss: 0.09714172780513763 2023-01-24 02:47:15.440651: step: 308/466, loss: 0.13208194077014923 2023-01-24 02:47:16.069592: step: 310/466, loss: 0.0807897076010704 2023-01-24 02:47:16.765398: step: 312/466, loss: 0.07150286436080933 2023-01-24 02:47:17.394544: step: 314/466, loss: 0.07468264549970627 2023-01-24 02:47:18.113172: step: 316/466, loss: 0.05557527393102646 2023-01-24 02:47:18.775018: step: 318/466, 
loss: 0.023244045674800873 2023-01-24 02:47:19.407458: step: 320/466, loss: 0.23286482691764832 2023-01-24 02:47:20.084265: step: 322/466, loss: 0.01101283635944128 2023-01-24 02:47:20.717882: step: 324/466, loss: 0.21222420036792755 2023-01-24 02:47:21.262415: step: 326/466, loss: 0.00755928223952651 2023-01-24 02:47:21.843829: step: 328/466, loss: 0.439331591129303 2023-01-24 02:47:22.451929: step: 330/466, loss: 0.06044579669833183 2023-01-24 02:47:23.068718: step: 332/466, loss: 0.07248007506132126 2023-01-24 02:47:23.691085: step: 334/466, loss: 0.1341545581817627 2023-01-24 02:47:24.344942: step: 336/466, loss: 0.06600915640592575 2023-01-24 02:47:24.966869: step: 338/466, loss: 0.09433504194021225 2023-01-24 02:47:25.592264: step: 340/466, loss: 1.2103420495986938 2023-01-24 02:47:26.323614: step: 342/466, loss: 0.09330989420413971 2023-01-24 02:47:26.965226: step: 344/466, loss: 0.10772915929555893 2023-01-24 02:47:27.601779: step: 346/466, loss: 0.08495800942182541 2023-01-24 02:47:28.251327: step: 348/466, loss: 0.07539297640323639 2023-01-24 02:47:28.824554: step: 350/466, loss: 0.028574472293257713 2023-01-24 02:47:29.480028: step: 352/466, loss: 0.15439830720424652 2023-01-24 02:47:30.199078: step: 354/466, loss: 0.055617015808820724 2023-01-24 02:47:30.833836: step: 356/466, loss: 0.2645198404788971 2023-01-24 02:47:31.402639: step: 358/466, loss: 0.033317677676677704 2023-01-24 02:47:31.949469: step: 360/466, loss: 0.025735756382346153 2023-01-24 02:47:32.639096: step: 362/466, loss: 0.05049290880560875 2023-01-24 02:47:33.269533: step: 364/466, loss: 0.0639977902173996 2023-01-24 02:47:33.885560: step: 366/466, loss: 0.04859790951013565 2023-01-24 02:47:34.467545: step: 368/466, loss: 0.07877586036920547 2023-01-24 02:47:35.106423: step: 370/466, loss: 0.030141178518533707 2023-01-24 02:47:35.740888: step: 372/466, loss: 0.24391008913516998 2023-01-24 02:47:36.323647: step: 374/466, loss: 0.025568336248397827 2023-01-24 02:47:36.938736: step: 
376/466, loss: 0.13599267601966858 2023-01-24 02:47:37.576619: step: 378/466, loss: 0.0934305340051651 2023-01-24 02:47:38.176766: step: 380/466, loss: 0.030407529324293137 2023-01-24 02:47:38.847864: step: 382/466, loss: 0.05794856324791908 2023-01-24 02:47:39.470441: step: 384/466, loss: 0.052790604531764984 2023-01-24 02:47:40.057376: step: 386/466, loss: 0.13118720054626465 2023-01-24 02:47:40.642142: step: 388/466, loss: 0.1165083646774292 2023-01-24 02:47:41.335326: step: 390/466, loss: 0.10365615785121918 2023-01-24 02:47:41.980776: step: 392/466, loss: 0.2166069895029068 2023-01-24 02:47:42.666707: step: 394/466, loss: 0.031927574425935745 2023-01-24 02:47:43.341555: step: 396/466, loss: 0.12831361591815948 2023-01-24 02:47:43.955499: step: 398/466, loss: 0.06301099061965942 2023-01-24 02:47:44.538613: step: 400/466, loss: 0.6753872632980347 2023-01-24 02:47:45.193057: step: 402/466, loss: 0.061331309378147125 2023-01-24 02:47:45.809462: step: 404/466, loss: 0.02498142421245575 2023-01-24 02:47:46.473287: step: 406/466, loss: 0.059494998306035995 2023-01-24 02:47:47.101807: step: 408/466, loss: 0.08644162863492966 2023-01-24 02:47:47.703969: step: 410/466, loss: 0.031151285395026207 2023-01-24 02:47:48.322807: step: 412/466, loss: 0.0479624904692173 2023-01-24 02:47:48.990320: step: 414/466, loss: 0.005968266166746616 2023-01-24 02:47:49.538658: step: 416/466, loss: 0.057921234518289566 2023-01-24 02:47:50.162651: step: 418/466, loss: 0.08367381989955902 2023-01-24 02:47:50.838966: step: 420/466, loss: 0.01985708251595497 2023-01-24 02:47:51.503198: step: 422/466, loss: 0.47170504927635193 2023-01-24 02:47:52.135549: step: 424/466, loss: 0.6092000007629395 2023-01-24 02:47:52.792459: step: 426/466, loss: 0.03696879372000694 2023-01-24 02:47:53.437980: step: 428/466, loss: 0.024741677567362785 2023-01-24 02:47:54.089653: step: 430/466, loss: 0.02118833363056183 2023-01-24 02:47:54.786140: step: 432/466, loss: 0.40384843945503235 2023-01-24 02:47:55.354392: 
step: 434/466, loss: 0.04938744008541107 2023-01-24 02:47:56.025323: step: 436/466, loss: 0.06573593616485596 2023-01-24 02:47:56.697977: step: 438/466, loss: 0.05361180007457733 2023-01-24 02:47:57.353424: step: 440/466, loss: 0.20754894614219666 2023-01-24 02:47:57.953166: step: 442/466, loss: 0.04232131689786911 2023-01-24 02:47:58.590665: step: 444/466, loss: 0.0521935373544693 2023-01-24 02:47:59.220509: step: 446/466, loss: 0.04974536970257759 2023-01-24 02:47:59.783508: step: 448/466, loss: 0.3302471339702606 2023-01-24 02:48:00.413202: step: 450/466, loss: 0.01604943722486496 2023-01-24 02:48:01.064753: step: 452/466, loss: 0.09881814569234848 2023-01-24 02:48:01.751673: step: 454/466, loss: 0.07670079916715622 2023-01-24 02:48:02.389823: step: 456/466, loss: 0.06403877586126328 2023-01-24 02:48:02.999166: step: 458/466, loss: 0.04350895807147026 2023-01-24 02:48:03.611854: step: 460/466, loss: 0.06272173672914505 2023-01-24 02:48:04.261055: step: 462/466, loss: 0.030401557683944702 2023-01-24 02:48:04.897612: step: 464/466, loss: 0.47371336817741394 2023-01-24 02:48:05.524459: step: 466/466, loss: 0.07108789682388306 2023-01-24 02:48:06.225357: step: 468/466, loss: 0.07831962406635284 2023-01-24 02:48:06.849926: step: 470/466, loss: 0.195390522480011 2023-01-24 02:48:07.484922: step: 472/466, loss: 0.11402513086795807 2023-01-24 02:48:08.131220: step: 474/466, loss: 0.04407782852649689 2023-01-24 02:48:08.713469: step: 476/466, loss: 0.11348390579223633 2023-01-24 02:48:09.366405: step: 478/466, loss: 16.509958267211914 2023-01-24 02:48:09.992535: step: 480/466, loss: 0.03846314549446106 2023-01-24 02:48:10.574690: step: 482/466, loss: 0.14290757477283478 2023-01-24 02:48:11.176686: step: 484/466, loss: 0.0507693774998188 2023-01-24 02:48:11.828107: step: 486/466, loss: 0.08279737830162048 2023-01-24 02:48:12.428912: step: 488/466, loss: 0.03770788013935089 2023-01-24 02:48:13.036109: step: 490/466, loss: 0.04785335063934326 2023-01-24 02:48:13.686682: 
step: 492/466, loss: 0.11262853443622589 2023-01-24 02:48:14.307192: step: 494/466, loss: 0.0479293055832386 2023-01-24 02:48:14.923101: step: 496/466, loss: 0.09707148373126984 2023-01-24 02:48:15.535338: step: 498/466, loss: 0.08645107597112656 2023-01-24 02:48:16.167428: step: 500/466, loss: 0.07878920435905457 2023-01-24 02:48:16.787284: step: 502/466, loss: 0.25702032446861267 2023-01-24 02:48:17.417045: step: 504/466, loss: 0.026958482339978218 2023-01-24 02:48:18.020856: step: 506/466, loss: 0.10854596644639969 2023-01-24 02:48:18.569337: step: 508/466, loss: 0.07018983364105225 2023-01-24 02:48:19.176051: step: 510/466, loss: 0.033980175852775574 2023-01-24 02:48:19.778054: step: 512/466, loss: 0.03171353414654732 2023-01-24 02:48:21.117524: step: 514/466, loss: 0.15424910187721252 2023-01-24 02:48:21.744628: step: 516/466, loss: 0.05672196298837662 2023-01-24 02:48:22.375654: step: 518/466, loss: 0.24705109000205994 2023-01-24 02:48:22.940561: step: 520/466, loss: 0.04120907559990883 2023-01-24 02:48:23.555665: step: 522/466, loss: 0.045313362032175064 2023-01-24 02:48:24.224412: step: 524/466, loss: 0.01784730888903141 2023-01-24 02:48:24.845469: step: 526/466, loss: 0.047152016311883926 2023-01-24 02:48:25.477754: step: 528/466, loss: 0.06034964695572853 2023-01-24 02:48:26.107331: step: 530/466, loss: 0.07564128190279007 2023-01-24 02:48:26.740232: step: 532/466, loss: 0.49377718567848206 2023-01-24 02:48:27.365710: step: 534/466, loss: 0.030665753409266472 2023-01-24 02:48:27.974160: step: 536/466, loss: 0.03836410865187645 2023-01-24 02:48:28.593039: step: 538/466, loss: 0.12349491566419601 2023-01-24 02:48:29.192711: step: 540/466, loss: 0.07908549159765244 2023-01-24 02:48:29.767814: step: 542/466, loss: 0.017840398475527763 2023-01-24 02:48:30.337161: step: 544/466, loss: 0.12404599785804749 2023-01-24 02:48:30.967660: step: 546/466, loss: 0.05739966034889221 2023-01-24 02:48:31.605618: step: 548/466, loss: 0.13945209980010986 2023-01-24 
02:48:32.198594: step: 550/466, loss: 0.25514689087867737 2023-01-24 02:48:32.788395: step: 552/466, loss: 0.0977272093296051 2023-01-24 02:48:33.459163: step: 554/466, loss: 0.03294859081506729 2023-01-24 02:48:34.097753: step: 556/466, loss: 0.06530392915010452 2023-01-24 02:48:34.720722: step: 558/466, loss: 0.02967679314315319 2023-01-24 02:48:35.378760: step: 560/466, loss: 0.16877122223377228 2023-01-24 02:48:36.083191: step: 562/466, loss: 0.06818056851625443 2023-01-24 02:48:36.732561: step: 564/466, loss: 0.06080375611782074 2023-01-24 02:48:37.366437: step: 566/466, loss: 0.03576469048857689 2023-01-24 02:48:37.985982: step: 568/466, loss: 0.0933336690068245 2023-01-24 02:48:38.622420: step: 570/466, loss: 0.0925021767616272 2023-01-24 02:48:39.222042: step: 572/466, loss: 0.13887295126914978 2023-01-24 02:48:39.843615: step: 574/466, loss: 0.06011636182665825 2023-01-24 02:48:40.470356: step: 576/466, loss: 0.18684709072113037 2023-01-24 02:48:41.136583: step: 578/466, loss: 0.03664421662688255 2023-01-24 02:48:41.716231: step: 580/466, loss: 0.07297038286924362 2023-01-24 02:48:42.373559: step: 582/466, loss: 0.07130319625139236 2023-01-24 02:48:42.999004: step: 584/466, loss: 0.15805700421333313 2023-01-24 02:48:43.602444: step: 586/466, loss: 0.23254035413265228 2023-01-24 02:48:44.252213: step: 588/466, loss: 0.009973529726266861 2023-01-24 02:48:44.882755: step: 590/466, loss: 0.07603029161691666 2023-01-24 02:48:45.581810: step: 592/466, loss: 0.06145777925848961 2023-01-24 02:48:46.193360: step: 594/466, loss: 0.7084916830062866 2023-01-24 02:48:46.827288: step: 596/466, loss: 0.08905018866062164 2023-01-24 02:48:47.449710: step: 598/466, loss: 0.0640096366405487 2023-01-24 02:48:48.224406: step: 600/466, loss: 0.017966795712709427 2023-01-24 02:48:48.823516: step: 602/466, loss: 0.07718763500452042 2023-01-24 02:48:49.458585: step: 604/466, loss: 0.09043219685554504 2023-01-24 02:48:50.064466: step: 606/466, loss: 0.04277624562382698 2023-01-24 
02:48:50.683564: step: 608/466, loss: 0.09092157334089279 2023-01-24 02:48:51.341343: step: 610/466, loss: 0.04313157871365547 2023-01-24 02:48:51.966892: step: 612/466, loss: 0.02429984137415886 2023-01-24 02:48:52.707555: step: 614/466, loss: 0.3619789779186249 2023-01-24 02:48:53.344852: step: 616/466, loss: 0.025472547858953476 2023-01-24 02:48:53.969801: step: 618/466, loss: 0.04127861186861992 2023-01-24 02:48:54.619361: step: 620/466, loss: 0.05021412670612335 2023-01-24 02:48:55.311306: step: 622/466, loss: 0.2635730504989624 2023-01-24 02:48:55.892343: step: 624/466, loss: 0.09003724902868271 2023-01-24 02:48:56.572211: step: 626/466, loss: 0.29988300800323486 2023-01-24 02:48:57.215046: step: 628/466, loss: 0.040291618555784225 2023-01-24 02:48:57.809257: step: 630/466, loss: 0.06659505516290665 2023-01-24 02:48:58.424302: step: 632/466, loss: 0.11002293229103088 2023-01-24 02:48:59.010844: step: 634/466, loss: 0.03860000893473625 2023-01-24 02:48:59.628955: step: 636/466, loss: 0.11201709508895874 2023-01-24 02:49:00.272987: step: 638/466, loss: 0.09177178889513016 2023-01-24 02:49:00.931716: step: 640/466, loss: 0.12213198840618134 2023-01-24 02:49:01.604371: step: 642/466, loss: 0.5728474855422974 2023-01-24 02:49:02.336435: step: 644/466, loss: 0.04759373143315315 2023-01-24 02:49:02.996073: step: 646/466, loss: 0.0527830645442009 2023-01-24 02:49:03.667395: step: 648/466, loss: 0.06714562326669693 2023-01-24 02:49:04.377087: step: 650/466, loss: 0.15858644247055054 2023-01-24 02:49:05.011148: step: 652/466, loss: 0.07921741157770157 2023-01-24 02:49:05.600129: step: 654/466, loss: 0.04882184788584709 2023-01-24 02:49:06.232704: step: 656/466, loss: 0.044943083077669144 2023-01-24 02:49:06.883161: step: 658/466, loss: 0.11141613125801086 2023-01-24 02:49:07.449996: step: 660/466, loss: 0.09633949398994446 2023-01-24 02:49:08.118878: step: 662/466, loss: 0.026888061314821243 2023-01-24 02:49:08.734704: step: 664/466, loss: 0.013612605631351471 
2023-01-24 02:49:09.354505: step: 666/466, loss: 0.09860095381736755 2023-01-24 02:49:09.971104: step: 668/466, loss: 0.05068560689687729 2023-01-24 02:49:10.535314: step: 670/466, loss: 0.07181273400783539 2023-01-24 02:49:11.135931: step: 672/466, loss: 0.14695997536182404 2023-01-24 02:49:11.736298: step: 674/466, loss: 0.014467950910329819 2023-01-24 02:49:12.485693: step: 676/466, loss: 0.06014389172196388 2023-01-24 02:49:13.137219: step: 678/466, loss: 0.1156170442700386 2023-01-24 02:49:13.793798: step: 680/466, loss: 0.024952847510576248 2023-01-24 02:49:14.415340: step: 682/466, loss: 0.2912246286869049 2023-01-24 02:49:15.019985: step: 684/466, loss: 0.04628412798047066 2023-01-24 02:49:15.623740: step: 686/466, loss: 0.03355022892355919 2023-01-24 02:49:16.216864: step: 688/466, loss: 0.07397707551717758 2023-01-24 02:49:16.845200: step: 690/466, loss: 0.05694981664419174 2023-01-24 02:49:17.499451: step: 692/466, loss: 0.12282607704401016 2023-01-24 02:49:18.152979: step: 694/466, loss: 0.0603046678006649 2023-01-24 02:49:18.826439: step: 696/466, loss: 0.02642957493662834 2023-01-24 02:49:19.378256: step: 698/466, loss: 0.21970698237419128 2023-01-24 02:49:19.993757: step: 700/466, loss: 0.11033938080072403 2023-01-24 02:49:20.608240: step: 702/466, loss: 0.034184087067842484 2023-01-24 02:49:21.216951: step: 704/466, loss: 0.01633790135383606 2023-01-24 02:49:21.749988: step: 706/466, loss: 0.3997347056865692 2023-01-24 02:49:22.412533: step: 708/466, loss: 0.10520128160715103 2023-01-24 02:49:23.016487: step: 710/466, loss: 0.11116138100624084 2023-01-24 02:49:23.642210: step: 712/466, loss: 0.15260502696037292 2023-01-24 02:49:24.309087: step: 714/466, loss: 0.04590679332613945 2023-01-24 02:49:24.985498: step: 716/466, loss: 0.02621178887784481 2023-01-24 02:49:25.643169: step: 718/466, loss: 0.038498785346746445 2023-01-24 02:49:26.230202: step: 720/466, loss: 0.015885451808571815 2023-01-24 02:49:26.835325: step: 722/466, loss: 
0.057695887982845306 2023-01-24 02:49:27.448348: step: 724/466, loss: 0.04524630308151245 2023-01-24 02:49:28.012517: step: 726/466, loss: 0.00518582109361887 2023-01-24 02:49:28.627549: step: 728/466, loss: 0.035789694637060165 2023-01-24 02:49:29.261758: step: 730/466, loss: 0.11010007560253143 2023-01-24 02:49:30.071257: step: 732/466, loss: 0.06833881139755249 2023-01-24 02:49:30.700698: step: 734/466, loss: 0.07563214004039764 2023-01-24 02:49:31.341512: step: 736/466, loss: 0.04967145249247551 2023-01-24 02:49:31.937685: step: 738/466, loss: 0.029912220314145088 2023-01-24 02:49:32.617609: step: 740/466, loss: 0.11592493206262589 2023-01-24 02:49:33.259504: step: 742/466, loss: 0.03424540534615517 2023-01-24 02:49:33.888972: step: 744/466, loss: 0.016852907836437225 2023-01-24 02:49:34.524212: step: 746/466, loss: 0.033034175634384155 2023-01-24 02:49:35.250227: step: 748/466, loss: 0.12496069073677063 2023-01-24 02:49:35.876161: step: 750/466, loss: 0.14536382257938385 2023-01-24 02:49:36.508102: step: 752/466, loss: 1.701302170753479 2023-01-24 02:49:37.072729: step: 754/466, loss: 0.8775402903556824 2023-01-24 02:49:37.732968: step: 756/466, loss: 0.030453836545348167 2023-01-24 02:49:38.445156: step: 758/466, loss: 0.0717419981956482 2023-01-24 02:49:39.024354: step: 760/466, loss: 0.04080541059374809 2023-01-24 02:49:39.667603: step: 762/466, loss: 0.06974674761295319 2023-01-24 02:49:40.253312: step: 764/466, loss: 0.04770641773939133 2023-01-24 02:49:40.885035: step: 766/466, loss: 0.0678180605173111 2023-01-24 02:49:41.513717: step: 768/466, loss: 0.06281007081270218 2023-01-24 02:49:42.179476: step: 770/466, loss: 0.2280104011297226 2023-01-24 02:49:42.797599: step: 772/466, loss: 0.02376980520784855 2023-01-24 02:49:43.427135: step: 774/466, loss: 0.06682606041431427 2023-01-24 02:49:44.034886: step: 776/466, loss: 0.09349855035543442 2023-01-24 02:49:44.648465: step: 778/466, loss: 0.08626078814268112 2023-01-24 02:49:45.242854: step: 780/466, 
loss: 0.12150692194700241 2023-01-24 02:49:45.823841: step: 782/466, loss: 0.13246864080429077 2023-01-24 02:49:46.475569: step: 784/466, loss: 0.04481721296906471 2023-01-24 02:49:47.084275: step: 786/466, loss: 0.027929870411753654 2023-01-24 02:49:47.767793: step: 788/466, loss: 0.21095824241638184 2023-01-24 02:49:48.384231: step: 790/466, loss: 0.029106542468070984 2023-01-24 02:49:49.064988: step: 792/466, loss: 0.10976512730121613 2023-01-24 02:49:49.718056: step: 794/466, loss: 0.08589852601289749 2023-01-24 02:49:50.390907: step: 796/466, loss: 0.06154797598719597 2023-01-24 02:49:51.030325: step: 798/466, loss: 2.7724342346191406 2023-01-24 02:49:51.709277: step: 800/466, loss: 0.07345017045736313 2023-01-24 02:49:52.304250: step: 802/466, loss: 0.0817703828215599 2023-01-24 02:49:52.853149: step: 804/466, loss: 0.22079487144947052 2023-01-24 02:49:53.608814: step: 806/466, loss: 0.04317941144108772 2023-01-24 02:49:54.245034: step: 808/466, loss: 0.06339377909898758 2023-01-24 02:49:54.825152: step: 810/466, loss: 0.010256998240947723 2023-01-24 02:49:55.476493: step: 812/466, loss: 0.1356644630432129 2023-01-24 02:49:56.179687: step: 814/466, loss: 0.08809737861156464 2023-01-24 02:49:56.797270: step: 816/466, loss: 0.2747352421283722 2023-01-24 02:49:57.403442: step: 818/466, loss: 0.032459042966365814 2023-01-24 02:49:58.018691: step: 820/466, loss: 0.04946858808398247 2023-01-24 02:49:58.688779: step: 822/466, loss: 0.38934946060180664 2023-01-24 02:49:59.410550: step: 824/466, loss: 0.06598320603370667 2023-01-24 02:50:00.028450: step: 826/466, loss: 0.014138715341687202 2023-01-24 02:50:00.780328: step: 828/466, loss: 0.19181425869464874 2023-01-24 02:50:01.395212: step: 830/466, loss: 0.02072925865650177 2023-01-24 02:50:02.042015: step: 832/466, loss: 0.13066735863685608 2023-01-24 02:50:02.666590: step: 834/466, loss: 0.2115868330001831 2023-01-24 02:50:03.300334: step: 836/466, loss: 0.03603954613208771 2023-01-24 02:50:03.891590: step: 
838/466, loss: 0.07363086193799973 2023-01-24 02:50:04.508185: step: 840/466, loss: 0.023128263652324677 2023-01-24 02:50:05.220990: step: 842/466, loss: 0.042165763676166534 2023-01-24 02:50:05.862065: step: 844/466, loss: 0.08984938263893127 2023-01-24 02:50:06.489906: step: 846/466, loss: 0.07657621055841446 2023-01-24 02:50:07.090441: step: 848/466, loss: 0.020295914262533188 2023-01-24 02:50:07.715620: step: 850/466, loss: 0.08770626783370972 2023-01-24 02:50:08.356818: step: 852/466, loss: 0.04018435627222061 2023-01-24 02:50:08.982619: step: 854/466, loss: 0.08496487140655518 2023-01-24 02:50:09.596727: step: 856/466, loss: 0.02933735027909279 2023-01-24 02:50:10.241754: step: 858/466, loss: 0.09103970974683762 2023-01-24 02:50:10.912046: step: 860/466, loss: 0.048141077160835266 2023-01-24 02:50:11.534340: step: 862/466, loss: 0.5548475980758667 2023-01-24 02:50:12.181498: step: 864/466, loss: 0.17042019963264465 2023-01-24 02:50:12.809580: step: 866/466, loss: 0.17668190598487854 2023-01-24 02:50:13.509799: step: 868/466, loss: 0.13648608326911926 2023-01-24 02:50:14.093538: step: 870/466, loss: 0.028354184702038765 2023-01-24 02:50:14.756334: step: 872/466, loss: 0.08296729624271393 2023-01-24 02:50:15.388757: step: 874/466, loss: 0.03876206651329994 2023-01-24 02:50:15.998831: step: 876/466, loss: 0.0596102811396122 2023-01-24 02:50:16.631994: step: 878/466, loss: 0.06662774085998535 2023-01-24 02:50:17.269778: step: 880/466, loss: 0.10815246403217316 2023-01-24 02:50:17.888334: step: 882/466, loss: 0.07196032255887985 2023-01-24 02:50:18.600256: step: 884/466, loss: 0.40313446521759033 2023-01-24 02:50:19.265167: step: 886/466, loss: 0.07323966175317764 2023-01-24 02:50:19.892239: step: 888/466, loss: 0.06867456436157227 2023-01-24 02:50:20.549873: step: 890/466, loss: 0.0214415080845356 2023-01-24 02:50:21.178067: step: 892/466, loss: 0.06691908836364746 2023-01-24 02:50:21.884470: step: 894/466, loss: 0.02709440514445305 2023-01-24 02:50:22.499609: 
step: 896/466, loss: 0.02463386207818985 2023-01-24 02:50:23.110732: step: 898/466, loss: 0.08717241138219833 2023-01-24 02:50:23.764593: step: 900/466, loss: 0.08618682622909546 2023-01-24 02:50:24.396536: step: 902/466, loss: 0.06320453435182571 2023-01-24 02:50:25.010015: step: 904/466, loss: 0.0032288103830069304 2023-01-24 02:50:25.610822: step: 906/466, loss: 0.04616895318031311 2023-01-24 02:50:26.289776: step: 908/466, loss: 0.08718512952327728 2023-01-24 02:50:26.858944: step: 910/466, loss: 7.17824125289917 2023-01-24 02:50:27.407348: step: 912/466, loss: 0.016759289428591728 2023-01-24 02:50:27.988438: step: 914/466, loss: 0.029550690203905106 2023-01-24 02:50:28.559743: step: 916/466, loss: 0.03661734610795975 2023-01-24 02:50:29.215303: step: 918/466, loss: 0.11967132240533829 2023-01-24 02:50:29.788946: step: 920/466, loss: 0.03834659978747368 2023-01-24 02:50:30.446530: step: 922/466, loss: 0.06053052097558975 2023-01-24 02:50:31.085227: step: 924/466, loss: 0.06139500066637993 2023-01-24 02:50:31.693805: step: 926/466, loss: 0.05111083388328552 2023-01-24 02:50:32.371637: step: 928/466, loss: 0.2747494578361511 2023-01-24 02:50:32.993034: step: 930/466, loss: 0.051467474550008774 2023-01-24 02:50:33.614915: step: 932/466, loss: 0.041985541582107544
==================================================
Loss: 0.170
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33673082869511445, 'r': 0.3443983238456674, 'f1': 0.34052141963727334}, 'combined': 0.2509105197327277, 'epoch': 19}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37374106943884683, 'r': 0.3082634492943134, 'f1': 0.3378590846353073}, 'combined': 0.22407234628662864, 'epoch': 19}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32808459051724137, 'r': 0.2883167613636363, 'f1': 0.3069178427419354}, 'combined': 0.20461189516129025, 'epoch': 19}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37775597128051114, 'r': 0.2864267546537671, 'f1': 0.32581219799945516}, 'combined': 0.21263532922069703, 'epoch': 19}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3272973103983159, 'r': 0.3397184606980622, 'f1': 0.33339223237966253}, 'combined': 0.24565743438501447, 'epoch': 19}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35323730756356436, 'r': 0.28920730124891997, 'f1': 0.3180314910252787}, 'combined': 0.21092243964370813, 'epoch': 19}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2577519379844961, 'r': 0.31666666666666665, 'f1': 0.2841880341880341}, 'combined': 0.1894586894586894, 'epoch': 19}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.32608695652173914, 'f1': 0.38461538461538464}, 'combined': 0.2564102564102564, 'epoch': 19}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.13793103448275862, 'f1': 0.20512820512820515}, 'combined': 0.13675213675213677, 'epoch': 19}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3384687266123707, 'r': 0.32112782411040863, 'f1': 0.3295703277627758}, 'combined': 0.24284129414099268, 'epoch': 13}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37192480126309296, 'r': 0.3046241229392952, 'f1': 0.334927046163623}, 'combined': 0.2221277819116256, 'epoch': 13}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 13}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31508371199826896, 'r': 0.28285924145299146, 'f1': 0.298103152669021}, 'combined': 0.19873543511268066, 'epoch': 10}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3659577520051451, 'r': 0.28516188467933384, 'f1': 0.3205469360629008}, 'combined': 0.20919905300947209, 'epoch': 10}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.43478260869565216, 'f1': 0.46511627906976744}, 'combined': 0.31007751937984496, 'epoch': 10}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 20
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 02:53:04.799922: step: 2/466, loss: 0.02529050223529339 2023-01-24 02:53:05.468035: step: 4/466, loss: 0.1284036636352539 2023-01-24 02:53:06.085511: step: 6/466, loss: 0.8496418595314026 2023-01-24 02:53:06.694941: step: 8/466, loss: 0.25180578231811523 2023-01-24
02:53:07.370441: step: 10/466, loss: 0.040862686932086945 2023-01-24 02:53:08.018491: step: 12/466, loss: 0.0089346868917346 2023-01-24 02:53:08.624453: step: 14/466, loss: 0.07790698856115341 2023-01-24 02:53:09.277024: step: 16/466, loss: 0.02893769182264805 2023-01-24 02:53:09.932042: step: 18/466, loss: 0.032509054988622665 2023-01-24 02:53:10.509210: step: 20/466, loss: 0.0163698922842741 2023-01-24 02:53:11.090564: step: 22/466, loss: 0.06177888438105583 2023-01-24 02:53:11.699724: step: 24/466, loss: 0.005556738469749689 2023-01-24 02:53:12.393598: step: 26/466, loss: 0.028866780921816826 2023-01-24 02:53:13.084282: step: 28/466, loss: 0.022818591445684433 2023-01-24 02:53:13.708990: step: 30/466, loss: 0.014123926870524883 2023-01-24 02:53:14.319001: step: 32/466, loss: 0.034859590232372284 2023-01-24 02:53:14.930849: step: 34/466, loss: 0.06828051805496216 2023-01-24 02:53:15.600193: step: 36/466, loss: 0.20173564553260803 2023-01-24 02:53:16.281337: step: 38/466, loss: 0.03694624453783035 2023-01-24 02:53:16.875338: step: 40/466, loss: 0.016344420611858368 2023-01-24 02:53:17.532871: step: 42/466, loss: 0.07911890000104904 2023-01-24 02:53:18.097716: step: 44/466, loss: 0.05435716360807419 2023-01-24 02:53:18.708603: step: 46/466, loss: 0.14925718307495117 2023-01-24 02:53:19.342981: step: 48/466, loss: 0.06081797555088997 2023-01-24 02:53:19.980062: step: 50/466, loss: 0.014909287914633751 2023-01-24 02:53:20.633249: step: 52/466, loss: 0.14070676267147064 2023-01-24 02:53:21.256968: step: 54/466, loss: 0.10343374311923981 2023-01-24 02:53:21.897015: step: 56/466, loss: 0.07278351485729218 2023-01-24 02:53:22.514174: step: 58/466, loss: 0.05475543811917305 2023-01-24 02:53:23.151065: step: 60/466, loss: 0.09487111866474152 2023-01-24 02:53:23.717341: step: 62/466, loss: 0.04730089753866196 2023-01-24 02:53:24.385453: step: 64/466, loss: 0.08981947600841522 2023-01-24 02:53:25.005251: step: 66/466, loss: 0.07103098928928375 2023-01-24 02:53:25.670189: 
step: 68/466, loss: 0.7478634119033813 2023-01-24 02:53:26.354979: step: 70/466, loss: 0.06133272126317024 2023-01-24 02:53:26.997642: step: 72/466, loss: 0.022983666509389877 2023-01-24 02:53:27.642065: step: 74/466, loss: 0.18768425285816193 2023-01-24 02:53:28.264933: step: 76/466, loss: 0.01486103143543005 2023-01-24 02:53:28.907468: step: 78/466, loss: 0.07413389533758163 2023-01-24 02:53:29.551570: step: 80/466, loss: 0.027376830577850342 2023-01-24 02:53:30.226152: step: 82/466, loss: 0.027573490515351295 2023-01-24 02:53:30.827801: step: 84/466, loss: 0.05019203945994377 2023-01-24 02:53:31.439623: step: 86/466, loss: 0.33560293912887573 2023-01-24 02:53:32.116151: step: 88/466, loss: 0.08343034237623215 2023-01-24 02:53:32.725170: step: 90/466, loss: 0.16450229287147522 2023-01-24 02:53:33.368575: step: 92/466, loss: 0.07126929610967636 2023-01-24 02:53:33.991282: step: 94/466, loss: 0.07928193360567093 2023-01-24 02:53:34.638863: step: 96/466, loss: 0.32826054096221924 2023-01-24 02:53:35.263714: step: 98/466, loss: 0.18427452445030212 2023-01-24 02:53:35.905237: step: 100/466, loss: 0.0352289155125618 2023-01-24 02:53:36.549858: step: 102/466, loss: 0.17166413366794586 2023-01-24 02:53:37.151110: step: 104/466, loss: 0.07920347154140472 2023-01-24 02:53:37.862920: step: 106/466, loss: 0.01411384716629982 2023-01-24 02:53:38.474469: step: 108/466, loss: 0.1047717034816742 2023-01-24 02:53:39.114785: step: 110/466, loss: 0.18175145983695984 2023-01-24 02:53:39.741322: step: 112/466, loss: 1.1283595561981201 2023-01-24 02:53:40.344435: step: 114/466, loss: 0.101852186024189 2023-01-24 02:53:41.004750: step: 116/466, loss: 0.03352248668670654 2023-01-24 02:53:41.636988: step: 118/466, loss: 0.05666612461209297 2023-01-24 02:53:42.335032: step: 120/466, loss: 0.01730651780962944 2023-01-24 02:53:42.927825: step: 122/466, loss: 0.030800916254520416 2023-01-24 02:53:43.568458: step: 124/466, loss: 0.03704928979277611 2023-01-24 02:53:44.209168: step: 126/466, 
loss: 0.042389120906591415 2023-01-24 02:53:44.839571: step: 128/466, loss: 0.2540122866630554 2023-01-24 02:53:45.474022: step: 130/466, loss: 0.015463070943951607 2023-01-24 02:53:46.122087: step: 132/466, loss: 0.015473777428269386 2023-01-24 02:53:46.909689: step: 134/466, loss: 0.5191654562950134 2023-01-24 02:53:47.568332: step: 136/466, loss: 0.04981977865099907 2023-01-24 02:53:48.200309: step: 138/466, loss: 0.029307564720511436 2023-01-24 02:53:48.865715: step: 140/466, loss: 0.07894758880138397 2023-01-24 02:53:49.522640: step: 142/466, loss: 0.2544078826904297 2023-01-24 02:53:50.110316: step: 144/466, loss: 0.04332369193434715 2023-01-24 02:53:50.746536: step: 146/466, loss: 0.014201727695763111 2023-01-24 02:53:51.362657: step: 148/466, loss: 0.04584435373544693 2023-01-24 02:53:51.993820: step: 150/466, loss: 0.12334337830543518 2023-01-24 02:53:52.554047: step: 152/466, loss: 0.055985480546951294 2023-01-24 02:53:53.202551: step: 154/466, loss: 0.02131708897650242 2023-01-24 02:53:53.907790: step: 156/466, loss: 0.056722573935985565 2023-01-24 02:53:54.472264: step: 158/466, loss: 0.47319892048835754 2023-01-24 02:53:55.097455: step: 160/466, loss: 0.02541445940732956 2023-01-24 02:53:55.720159: step: 162/466, loss: 0.19760848581790924 2023-01-24 02:53:56.287277: step: 164/466, loss: 0.010579260066151619 2023-01-24 02:53:56.889523: step: 166/466, loss: 0.048830416053533554 2023-01-24 02:53:57.464697: step: 168/466, loss: 0.07627157866954803 2023-01-24 02:53:58.010490: step: 170/466, loss: 0.039207588881254196 2023-01-24 02:53:58.661827: step: 172/466, loss: 0.03126634284853935 2023-01-24 02:53:59.256171: step: 174/466, loss: 0.0609285943210125 2023-01-24 02:53:59.897887: step: 176/466, loss: 0.06033097207546234 2023-01-24 02:54:00.471601: step: 178/466, loss: 0.013422602787613869 2023-01-24 02:54:01.110099: step: 180/466, loss: 0.04581700637936592 2023-01-24 02:54:01.720098: step: 182/466, loss: 0.020022494718432426 2023-01-24 02:54:02.381142: step: 
184/466, loss: 0.09475758671760559 2023-01-24 02:54:03.056753: step: 186/466, loss: 0.14202749729156494 2023-01-24 02:54:03.680764: step: 188/466, loss: 0.3445826470851898 2023-01-24 02:54:04.323917: step: 190/466, loss: 0.035508062690496445 2023-01-24 02:54:04.928314: step: 192/466, loss: 0.008882682770490646 2023-01-24 02:54:05.529346: step: 194/466, loss: 0.10349462926387787 2023-01-24 02:54:06.200125: step: 196/466, loss: 0.07579652965068817 2023-01-24 02:54:06.842671: step: 198/466, loss: 0.105499267578125 2023-01-24 02:54:07.477401: step: 200/466, loss: 0.05291704088449478 2023-01-24 02:54:08.006459: step: 202/466, loss: 0.01687799021601677 2023-01-24 02:54:08.645228: step: 204/466, loss: 0.045372042804956436 2023-01-24 02:54:09.242326: step: 206/466, loss: 0.011181673035025597 2023-01-24 02:54:09.829189: step: 208/466, loss: 0.031141316518187523 2023-01-24 02:54:10.460717: step: 210/466, loss: 0.012476320378482342 2023-01-24 02:54:11.108759: step: 212/466, loss: 0.06704815477132797 2023-01-24 02:54:11.741004: step: 214/466, loss: 1.2883732318878174 2023-01-24 02:54:12.348932: step: 216/466, loss: 0.03643377497792244 2023-01-24 02:54:12.978313: step: 218/466, loss: 0.04377938434481621 2023-01-24 02:54:13.611835: step: 220/466, loss: 0.06631084531545639 2023-01-24 02:54:14.210238: step: 222/466, loss: 0.106280118227005 2023-01-24 02:54:14.859088: step: 224/466, loss: 0.10199912637472153 2023-01-24 02:54:15.537201: step: 226/466, loss: 0.08121784776449203 2023-01-24 02:54:16.153312: step: 228/466, loss: 0.058072999119758606 2023-01-24 02:54:16.741946: step: 230/466, loss: 0.023868972435593605 2023-01-24 02:54:17.315464: step: 232/466, loss: 0.06327088177204132 2023-01-24 02:54:18.040962: step: 234/466, loss: 0.012144657783210278 2023-01-24 02:54:18.647941: step: 236/466, loss: 0.036326225847005844 2023-01-24 02:54:19.227064: step: 238/466, loss: 0.14834503829479218 2023-01-24 02:54:19.859584: step: 240/466, loss: 0.05553761124610901 2023-01-24 02:54:20.495641: 
step: 242/466, loss: 0.0351240448653698 2023-01-24 02:54:21.184573: step: 244/466, loss: 0.06827200949192047 2023-01-24 02:54:21.846595: step: 246/466, loss: 0.0644996166229248 2023-01-24 02:54:22.449447: step: 248/466, loss: 0.04270177707076073 2023-01-24 02:54:23.057683: step: 250/466, loss: 0.030522502958774567 2023-01-24 02:54:23.709122: step: 252/466, loss: 0.028394509106874466 2023-01-24 02:54:24.367601: step: 254/466, loss: 0.46309852600097656 2023-01-24 02:54:25.069094: step: 256/466, loss: 0.05730801448225975 2023-01-24 02:54:25.726997: step: 258/466, loss: 0.08036874979734421 2023-01-24 02:54:26.366031: step: 260/466, loss: 0.08377359062433243 2023-01-24 02:54:27.044122: step: 262/466, loss: 0.14369578659534454 2023-01-24 02:54:27.621717: step: 264/466, loss: 0.1626148521900177 2023-01-24 02:54:28.217709: step: 266/466, loss: 0.07974758744239807 2023-01-24 02:54:28.912576: step: 268/466, loss: 0.018431322649121284 2023-01-24 02:54:29.519491: step: 270/466, loss: 0.040509093552827835 2023-01-24 02:54:30.103196: step: 272/466, loss: 0.009697719477117062 2023-01-24 02:54:30.685436: step: 274/466, loss: 0.08021612465381622 2023-01-24 02:54:31.274948: step: 276/466, loss: 0.06953954696655273 2023-01-24 02:54:31.934660: step: 278/466, loss: 0.037968482822179794 2023-01-24 02:54:32.563815: step: 280/466, loss: 0.038745686411857605 2023-01-24 02:54:33.152253: step: 282/466, loss: 0.10231854021549225 2023-01-24 02:54:33.691737: step: 284/466, loss: 0.12913811206817627 2023-01-24 02:54:34.304214: step: 286/466, loss: 0.030376749113202095 2023-01-24 02:54:34.913978: step: 288/466, loss: 0.03901521489024162 2023-01-24 02:54:35.548468: step: 290/466, loss: 0.17596976459026337 2023-01-24 02:54:36.189501: step: 292/466, loss: 0.07888085395097733 2023-01-24 02:54:36.755738: step: 294/466, loss: 0.054244693368673325 2023-01-24 02:54:37.401207: step: 296/466, loss: 0.04164930060505867 2023-01-24 02:54:38.098922: step: 298/466, loss: 0.06038874015212059 2023-01-24 
02:54:38.682710: step: 300/466, loss: 0.08669485151767731 2023-01-24 02:54:39.289353: step: 302/466, loss: 0.005868879612535238 2023-01-24 02:54:39.881863: step: 304/466, loss: 0.0111148776486516 2023-01-24 02:54:40.521130: step: 306/466, loss: 0.10046995431184769 2023-01-24 02:54:41.223632: step: 308/466, loss: 0.1179303228855133 2023-01-24 02:54:41.838378: step: 310/466, loss: 0.06580962985754013 2023-01-24 02:54:42.437408: step: 312/466, loss: 0.1103833019733429 2023-01-24 02:54:43.018319: step: 314/466, loss: 0.01620732806622982 2023-01-24 02:54:43.718787: step: 316/466, loss: 0.1872664988040924 2023-01-24 02:54:44.269380: step: 318/466, loss: 0.017346568405628204 2023-01-24 02:54:44.827956: step: 320/466, loss: 0.09177451580762863 2023-01-24 02:54:45.433703: step: 322/466, loss: 0.02192498929798603 2023-01-24 02:54:46.026933: step: 324/466, loss: 0.052837517112493515 2023-01-24 02:54:46.633550: step: 326/466, loss: 0.12525829672813416 2023-01-24 02:54:47.214291: step: 328/466, loss: 0.15684545040130615 2023-01-24 02:54:47.800717: step: 330/466, loss: 0.07931232452392578 2023-01-24 02:54:48.405924: step: 332/466, loss: 0.07490004599094391 2023-01-24 02:54:49.063734: step: 334/466, loss: 0.3396855592727661 2023-01-24 02:54:49.681436: step: 336/466, loss: 0.05998348817229271 2023-01-24 02:54:50.314759: step: 338/466, loss: 0.03129202499985695 2023-01-24 02:54:50.951908: step: 340/466, loss: 0.022101087495684624 2023-01-24 02:54:51.606728: step: 342/466, loss: 0.07139355689287186 2023-01-24 02:54:52.223944: step: 344/466, loss: 0.21404370665550232 2023-01-24 02:54:52.872974: step: 346/466, loss: 0.16895224153995514 2023-01-24 02:54:53.500687: step: 348/466, loss: 0.08540564775466919 2023-01-24 02:54:54.143433: step: 350/466, loss: 0.08686162531375885 2023-01-24 02:54:54.799957: step: 352/466, loss: 0.13402172923088074 2023-01-24 02:54:55.455081: step: 354/466, loss: 0.031210817396640778 2023-01-24 02:54:56.179244: step: 356/466, loss: 0.03649180755019188 
2023-01-24 02:54:56.841334: step: 358/466, loss: 0.030150389298796654 2023-01-24 02:54:57.510335: step: 360/466, loss: 0.060796499252319336 2023-01-24 02:54:58.202579: step: 362/466, loss: 0.040671683847904205 2023-01-24 02:54:58.847465: step: 364/466, loss: 0.16725201904773712 2023-01-24 02:54:59.492777: step: 366/466, loss: 0.15237775444984436 2023-01-24 02:55:00.149761: step: 368/466, loss: 0.04570477083325386 2023-01-24 02:55:00.786475: step: 370/466, loss: 0.18501229584217072 2023-01-24 02:55:01.404267: step: 372/466, loss: 0.018478266894817352 2023-01-24 02:55:02.033775: step: 374/466, loss: 0.450448215007782 2023-01-24 02:55:02.652507: step: 376/466, loss: 0.41951102018356323 2023-01-24 02:55:03.301801: step: 378/466, loss: 0.039621587842702866 2023-01-24 02:55:03.867498: step: 380/466, loss: 0.026672614738345146 2023-01-24 02:55:04.499459: step: 382/466, loss: 0.06711740791797638 2023-01-24 02:55:05.266407: step: 384/466, loss: 0.10505777597427368 2023-01-24 02:55:05.891749: step: 386/466, loss: 0.06020501255989075 2023-01-24 02:55:06.529703: step: 388/466, loss: 0.09692052751779556 2023-01-24 02:55:07.134254: step: 390/466, loss: 1.6137539148330688 2023-01-24 02:55:07.737093: step: 392/466, loss: 0.03121868334710598 2023-01-24 02:55:08.365481: step: 394/466, loss: 0.02551921457052231 2023-01-24 02:55:08.929045: step: 396/466, loss: 0.03320877254009247 2023-01-24 02:55:09.512189: step: 398/466, loss: 0.04377833381295204 2023-01-24 02:55:10.166004: step: 400/466, loss: 0.0744377076625824 2023-01-24 02:55:10.819314: step: 402/466, loss: 0.07616252452135086 2023-01-24 02:55:11.386130: step: 404/466, loss: 0.04090946167707443 2023-01-24 02:55:12.042889: step: 406/466, loss: 0.20691360533237457 2023-01-24 02:55:12.652496: step: 408/466, loss: 0.01815725676715374 2023-01-24 02:55:13.331582: step: 410/466, loss: 0.05874943360686302 2023-01-24 02:55:14.014102: step: 412/466, loss: 0.0488753616809845 2023-01-24 02:55:14.699458: step: 414/466, loss: 
0.03152293711900711 2023-01-24 02:55:15.387907: step: 416/466, loss: 0.02398141287267208 2023-01-24 02:55:16.060620: step: 418/466, loss: 0.019400104880332947 2023-01-24 02:55:16.685144: step: 420/466, loss: 0.04799790680408478 2023-01-24 02:55:17.284198: step: 422/466, loss: 0.047729700803756714 2023-01-24 02:55:17.912024: step: 424/466, loss: 0.07025714963674545 2023-01-24 02:55:18.552045: step: 426/466, loss: 0.2905091345310211 2023-01-24 02:55:19.230187: step: 428/466, loss: 0.030055930837988853 2023-01-24 02:55:19.887563: step: 430/466, loss: 0.09775189310312271 2023-01-24 02:55:20.489073: step: 432/466, loss: 0.06666316837072372 2023-01-24 02:55:21.089978: step: 434/466, loss: 0.024716192856431007 2023-01-24 02:55:21.692833: step: 436/466, loss: 0.9140031337738037 2023-01-24 02:55:22.354115: step: 438/466, loss: 0.1764558106660843 2023-01-24 02:55:22.943331: step: 440/466, loss: 0.026693593710660934 2023-01-24 02:55:23.591536: step: 442/466, loss: 0.044917576014995575 2023-01-24 02:55:24.187475: step: 444/466, loss: 0.05101257562637329 2023-01-24 02:55:24.859681: step: 446/466, loss: 0.07329098135232925 2023-01-24 02:55:25.507709: step: 448/466, loss: 0.12456972151994705 2023-01-24 02:55:26.076141: step: 450/466, loss: 0.046114739030599594 2023-01-24 02:55:26.708454: step: 452/466, loss: 0.12324994802474976 2023-01-24 02:55:27.300525: step: 454/466, loss: 0.03191633149981499 2023-01-24 02:55:27.901913: step: 456/466, loss: 0.08683544397354126 2023-01-24 02:55:28.468181: step: 458/466, loss: 0.07412993907928467 2023-01-24 02:55:29.107772: step: 460/466, loss: 0.0910847932100296 2023-01-24 02:55:29.795712: step: 462/466, loss: 0.061169225722551346 2023-01-24 02:55:30.393269: step: 464/466, loss: 0.01881217025220394 2023-01-24 02:55:31.009086: step: 466/466, loss: 0.010416771285235882 2023-01-24 02:55:31.674519: step: 468/466, loss: 0.3848955035209656 2023-01-24 02:55:32.311744: step: 470/466, loss: 0.0624094232916832 2023-01-24 02:55:32.895630: step: 472/466, 
loss: 0.09354517608880997 2023-01-24 02:55:33.521611: step: 474/466, loss: 0.1539187729358673 2023-01-24 02:55:34.166782: step: 476/466, loss: 0.04534471780061722 2023-01-24 02:55:34.721756: step: 478/466, loss: 0.019748156890273094 2023-01-24 02:55:35.358502: step: 480/466, loss: 0.27361956238746643 2023-01-24 02:55:35.998988: step: 482/466, loss: 0.0669504776597023 2023-01-24 02:55:36.662517: step: 484/466, loss: 0.025343511253595352 2023-01-24 02:55:37.264908: step: 486/466, loss: 0.11701531708240509 2023-01-24 02:55:37.874743: step: 488/466, loss: 0.22605279088020325 2023-01-24 02:55:38.584622: step: 490/466, loss: 0.5436602830886841 2023-01-24 02:55:39.305624: step: 492/466, loss: 0.024394290521740913 2023-01-24 02:55:39.930645: step: 494/466, loss: 0.042096640914678574 2023-01-24 02:55:40.574619: step: 496/466, loss: 0.022823672741651535 2023-01-24 02:55:41.234693: step: 498/466, loss: 0.12584716081619263 2023-01-24 02:55:41.905311: step: 500/466, loss: 1.1072688102722168 2023-01-24 02:55:42.534647: step: 502/466, loss: 0.04686303064227104 2023-01-24 02:55:43.208500: step: 504/466, loss: 0.014092482626438141 2023-01-24 02:55:43.821931: step: 506/466, loss: 0.012901059351861477 2023-01-24 02:55:44.469117: step: 508/466, loss: 0.06189880520105362 2023-01-24 02:55:45.064737: step: 510/466, loss: 0.1094948947429657 2023-01-24 02:55:45.758863: step: 512/466, loss: 0.011595910415053368 2023-01-24 02:55:46.392709: step: 514/466, loss: 0.08042123168706894 2023-01-24 02:55:47.022377: step: 516/466, loss: 0.06183750927448273 2023-01-24 02:55:47.603019: step: 518/466, loss: 0.11537908762693405 2023-01-24 02:55:48.199638: step: 520/466, loss: 0.14726556837558746 2023-01-24 02:55:48.838296: step: 522/466, loss: 0.27260708808898926 2023-01-24 02:55:49.554665: step: 524/466, loss: 0.12424908578395844 2023-01-24 02:55:50.136679: step: 526/466, loss: 0.019538627937436104 2023-01-24 02:55:50.726826: step: 528/466, loss: 0.46317681670188904 2023-01-24 02:55:51.323049: step: 
530/466, loss: 0.060504645109176636 2023-01-24 02:55:51.904935: step: 532/466, loss: 0.07724351435899734 2023-01-24 02:55:52.458855: step: 534/466, loss: 0.12051215767860413 2023-01-24 02:55:53.099616: step: 536/466, loss: 0.07808685302734375 2023-01-24 02:55:53.791008: step: 538/466, loss: 0.01689128950238228 2023-01-24 02:55:54.429633: step: 540/466, loss: 0.014444484375417233 2023-01-24 02:55:55.055979: step: 542/466, loss: 0.07736461609601974 2023-01-24 02:55:55.639741: step: 544/466, loss: 0.06107090041041374 2023-01-24 02:55:56.228149: step: 546/466, loss: 0.035207852721214294 2023-01-24 02:55:56.880017: step: 548/466, loss: 0.0847686156630516 2023-01-24 02:55:57.486233: step: 550/466, loss: 0.030109670013189316 2023-01-24 02:55:58.083935: step: 552/466, loss: 0.004358192905783653 2023-01-24 02:55:58.709351: step: 554/466, loss: 0.06824992597103119 2023-01-24 02:55:59.390088: step: 556/466, loss: 0.047165438532829285 2023-01-24 02:56:00.026847: step: 558/466, loss: 0.05250353738665581 2023-01-24 02:56:00.603907: step: 560/466, loss: 0.018119188025593758 2023-01-24 02:56:01.219027: step: 562/466, loss: 0.913058340549469 2023-01-24 02:56:01.825393: step: 564/466, loss: 0.05874692276120186 2023-01-24 02:56:02.520040: step: 566/466, loss: 0.03652816265821457 2023-01-24 02:56:03.125980: step: 568/466, loss: 0.0494910329580307 2023-01-24 02:56:03.748287: step: 570/466, loss: 0.23736171424388885 2023-01-24 02:56:04.468760: step: 572/466, loss: 0.3850165009498596 2023-01-24 02:56:05.038771: step: 574/466, loss: 0.11515842378139496 2023-01-24 02:56:05.650750: step: 576/466, loss: 0.18601633608341217 2023-01-24 02:56:06.325088: step: 578/466, loss: 0.2169901430606842 2023-01-24 02:56:06.973580: step: 580/466, loss: 0.05301076918840408 2023-01-24 02:56:07.570627: step: 582/466, loss: 0.02497091516852379 2023-01-24 02:56:08.222503: step: 584/466, loss: 0.15598836541175842 2023-01-24 02:56:08.777755: step: 586/466, loss: 0.02245033159852028 2023-01-24 02:56:09.440452: 
step: 588/466, loss: 0.29864266514778137 2023-01-24 02:56:10.057956: step: 590/466, loss: 0.1642669439315796 2023-01-24 02:56:10.643987: step: 592/466, loss: 0.06498217582702637 2023-01-24 02:56:11.279524: step: 594/466, loss: 0.08255688846111298 2023-01-24 02:56:11.853513: step: 596/466, loss: 0.009655744768679142 2023-01-24 02:56:12.452467: step: 598/466, loss: 0.018490519374608994 2023-01-24 02:56:13.088998: step: 600/466, loss: 0.03158906102180481 2023-01-24 02:56:13.733794: step: 602/466, loss: 0.030988484621047974 2023-01-24 02:56:14.326063: step: 604/466, loss: 0.005555752664804459 2023-01-24 02:56:15.051936: step: 606/466, loss: 0.11046244949102402 2023-01-24 02:56:15.723159: step: 608/466, loss: 0.09846675395965576 2023-01-24 02:56:16.327136: step: 610/466, loss: 0.5371130108833313 2023-01-24 02:56:16.899619: step: 612/466, loss: 0.23331308364868164 2023-01-24 02:56:17.575039: step: 614/466, loss: 0.05774373188614845 2023-01-24 02:56:18.238342: step: 616/466, loss: 0.09073471277952194 2023-01-24 02:56:18.826081: step: 618/466, loss: 0.11869775503873825 2023-01-24 02:56:19.498454: step: 620/466, loss: 0.03296220675110817 2023-01-24 02:56:20.085045: step: 622/466, loss: 0.009494036436080933 2023-01-24 02:56:20.738198: step: 624/466, loss: 0.31830641627311707 2023-01-24 02:56:21.394482: step: 626/466, loss: 0.021785251796245575 2023-01-24 02:56:22.018713: step: 628/466, loss: 0.28550639748573303 2023-01-24 02:56:22.624367: step: 630/466, loss: 0.020771397277712822 2023-01-24 02:56:23.180587: step: 632/466, loss: 0.023239050060510635 2023-01-24 02:56:23.816413: step: 634/466, loss: 0.11806279420852661 2023-01-24 02:56:24.412655: step: 636/466, loss: 0.010297476314008236 2023-01-24 02:56:25.060803: step: 638/466, loss: 0.05225253477692604 2023-01-24 02:56:25.660877: step: 640/466, loss: 0.013791006058454514 2023-01-24 02:56:26.329422: step: 642/466, loss: 0.161981463432312 2023-01-24 02:56:26.973007: step: 644/466, loss: 0.026967084035277367 2023-01-24 
02:56:27.591453: step: 646/466, loss: 0.04840588569641113 2023-01-24 02:56:28.232522: step: 648/466, loss: 0.004754234105348587 2023-01-24 02:56:28.893520: step: 650/466, loss: 0.07378221303224564 2023-01-24 02:56:29.470415: step: 652/466, loss: 0.25798919796943665 2023-01-24 02:56:30.057472: step: 654/466, loss: 0.06274838000535965 2023-01-24 02:56:30.701162: step: 656/466, loss: 0.10202678292989731 2023-01-24 02:56:31.359219: step: 658/466, loss: 0.052663031965494156 2023-01-24 02:56:31.957482: step: 660/466, loss: 0.044118110090494156 2023-01-24 02:56:32.592403: step: 662/466, loss: 0.03136580064892769 2023-01-24 02:56:33.210275: step: 664/466, loss: 0.09924820810556412 2023-01-24 02:56:33.889492: step: 666/466, loss: 0.09181737154722214 2023-01-24 02:56:34.485164: step: 668/466, loss: 0.04943827539682388 2023-01-24 02:56:35.081884: step: 670/466, loss: 0.14395646750926971 2023-01-24 02:56:35.638719: step: 672/466, loss: 0.075625479221344 2023-01-24 02:56:36.268438: step: 674/466, loss: 0.030542463064193726 2023-01-24 02:56:36.911851: step: 676/466, loss: 0.05961842089891434 2023-01-24 02:56:37.602356: step: 678/466, loss: 0.201963409781456 2023-01-24 02:56:38.256441: step: 680/466, loss: 0.16936613619327545 2023-01-24 02:56:38.907567: step: 682/466, loss: 0.06179400533437729 2023-01-24 02:56:39.680323: step: 684/466, loss: 0.7320647239685059 2023-01-24 02:56:40.256380: step: 686/466, loss: 0.13582631945610046 2023-01-24 02:56:40.844291: step: 688/466, loss: 0.027148069813847542 2023-01-24 02:56:41.491864: step: 690/466, loss: 0.042925890535116196 2023-01-24 02:56:42.146797: step: 692/466, loss: 0.2105482518672943 2023-01-24 02:56:42.769060: step: 694/466, loss: 0.09240254014730453 2023-01-24 02:56:43.413147: step: 696/466, loss: 0.07900448888540268 2023-01-24 02:56:44.063461: step: 698/466, loss: 0.041338928043842316 2023-01-24 02:56:44.666360: step: 700/466, loss: 0.04051181301474571 2023-01-24 02:56:45.291780: step: 702/466, loss: 0.6643872857093811 
2023-01-24 02:56:45.952376: step: 704/466, loss: 0.08506612479686737 2023-01-24 02:56:46.602189: step: 706/466, loss: 0.026076463982462883 2023-01-24 02:56:47.225078: step: 708/466, loss: 0.05837669596076012 2023-01-24 02:56:47.832474: step: 710/466, loss: 0.1373455673456192 2023-01-24 02:56:48.513779: step: 712/466, loss: 0.023741627112030983 2023-01-24 02:56:49.121377: step: 714/466, loss: 0.07516603916883469 2023-01-24 02:56:49.774746: step: 716/466, loss: 0.10175301879644394 2023-01-24 02:56:50.451173: step: 718/466, loss: 0.5919366478919983 2023-01-24 02:56:51.077842: step: 720/466, loss: 0.3820701539516449 2023-01-24 02:56:51.694800: step: 722/466, loss: 0.004326160065829754 2023-01-24 02:56:52.288109: step: 724/466, loss: 4.940408229827881 2023-01-24 02:56:52.873497: step: 726/466, loss: 0.07517941296100616 2023-01-24 02:56:53.494808: step: 728/466, loss: 0.0352647639811039 2023-01-24 02:56:54.125447: step: 730/466, loss: 0.04286986216902733 2023-01-24 02:56:54.723027: step: 732/466, loss: 0.05781130865216255 2023-01-24 02:56:55.413735: step: 734/466, loss: 0.9605228304862976 2023-01-24 02:56:56.076302: step: 736/466, loss: 0.0912160873413086 2023-01-24 02:56:56.777473: step: 738/466, loss: 0.11297646909952164 2023-01-24 02:56:57.363300: step: 740/466, loss: 0.0951039046049118 2023-01-24 02:56:57.898351: step: 742/466, loss: 0.021256210282444954 2023-01-24 02:56:58.486840: step: 744/466, loss: 0.02005433849990368 2023-01-24 02:56:59.165683: step: 746/466, loss: 0.07645412534475327 2023-01-24 02:56:59.813306: step: 748/466, loss: 0.03697463497519493 2023-01-24 02:57:00.342359: step: 750/466, loss: 0.04665801301598549 2023-01-24 02:57:00.973971: step: 752/466, loss: 0.04964252933859825 2023-01-24 02:57:01.616769: step: 754/466, loss: 0.07287842780351639 2023-01-24 02:57:02.232324: step: 756/466, loss: 0.025916436687111855 2023-01-24 02:57:02.876531: step: 758/466, loss: 0.22002416849136353 2023-01-24 02:57:03.513964: step: 760/466, loss: 0.019609622657299042 
2023-01-24 02:57:04.125649: step: 762/466, loss: 0.1243818998336792 2023-01-24 02:57:04.759033: step: 764/466, loss: 0.0855434387922287 2023-01-24 02:57:05.385150: step: 766/466, loss: 0.029025837779045105 2023-01-24 02:57:05.956601: step: 768/466, loss: 0.0746544748544693 2023-01-24 02:57:06.641947: step: 770/466, loss: 0.07845261693000793 2023-01-24 02:57:07.249476: step: 772/466, loss: 0.025406787171959877 2023-01-24 02:57:07.849568: step: 774/466, loss: 0.20520764589309692 2023-01-24 02:57:08.475534: step: 776/466, loss: 0.06779426336288452 2023-01-24 02:57:09.145931: step: 778/466, loss: 0.03305256366729736 2023-01-24 02:57:09.729430: step: 780/466, loss: 0.15617133677005768 2023-01-24 02:57:10.343983: step: 782/466, loss: 0.025844769552350044 2023-01-24 02:57:10.964051: step: 784/466, loss: 0.12392221391201019 2023-01-24 02:57:11.580098: step: 786/466, loss: 0.032041821628808975 2023-01-24 02:57:12.261318: step: 788/466, loss: 0.06751325726509094 2023-01-24 02:57:12.888772: step: 790/466, loss: 0.11863002926111221 2023-01-24 02:57:13.450792: step: 792/466, loss: 0.055254314094781876 2023-01-24 02:57:14.055979: step: 794/466, loss: 0.3632022738456726 2023-01-24 02:57:14.780375: step: 796/466, loss: 0.13491855561733246 2023-01-24 02:57:15.404764: step: 798/466, loss: 0.0824522003531456 2023-01-24 02:57:15.995972: step: 800/466, loss: 0.14025896787643433 2023-01-24 02:57:16.606513: step: 802/466, loss: 0.00802691001445055 2023-01-24 02:57:17.286859: step: 804/466, loss: 0.10936498641967773 2023-01-24 02:57:17.894807: step: 806/466, loss: 0.05121378228068352 2023-01-24 02:57:18.524546: step: 808/466, loss: 0.09438611567020416 2023-01-24 02:57:19.163416: step: 810/466, loss: 0.008763416670262814 2023-01-24 02:57:19.824180: step: 812/466, loss: 0.08204493671655655 2023-01-24 02:57:20.396884: step: 814/466, loss: 0.04841059818863869 2023-01-24 02:57:21.091910: step: 816/466, loss: 0.21222209930419922 2023-01-24 02:57:21.708095: step: 818/466, loss: 
0.0064653055742383 2023-01-24 02:57:22.311489: step: 820/466, loss: 0.08223865181207657 2023-01-24 02:57:22.954330: step: 822/466, loss: 0.08027202636003494 2023-01-24 02:57:23.571812: step: 824/466, loss: 0.23100262880325317 2023-01-24 02:57:24.183515: step: 826/466, loss: 0.11842045187950134 2023-01-24 02:57:24.823595: step: 828/466, loss: 0.05246425047516823 2023-01-24 02:57:25.485421: step: 830/466, loss: 0.06855402141809464 2023-01-24 02:57:26.102293: step: 832/466, loss: 0.048354990780353546 2023-01-24 02:57:26.701538: step: 834/466, loss: 0.0696854218840599 2023-01-24 02:57:27.398838: step: 836/466, loss: 0.08798477798700333 2023-01-24 02:57:27.977100: step: 838/466, loss: 0.05054626613855362 2023-01-24 02:57:28.595684: step: 840/466, loss: 0.081448994576931 2023-01-24 02:57:29.182369: step: 842/466, loss: 0.07047303020954132 2023-01-24 02:57:29.802729: step: 844/466, loss: 0.06402552872896194 2023-01-24 02:57:30.414070: step: 846/466, loss: 0.01810120977461338 2023-01-24 02:57:30.991745: step: 848/466, loss: 0.037891753017902374 2023-01-24 02:57:31.703353: step: 850/466, loss: 0.18829333782196045 2023-01-24 02:57:32.387875: step: 852/466, loss: 0.026354145258665085 2023-01-24 02:57:32.985030: step: 854/466, loss: 0.07193075120449066 2023-01-24 02:57:33.686007: step: 856/466, loss: 0.11727716773748398 2023-01-24 02:57:34.464520: step: 858/466, loss: 0.15801364183425903 2023-01-24 02:57:35.140618: step: 860/466, loss: 0.021523352712392807 2023-01-24 02:57:35.761660: step: 862/466, loss: 1.1879605054855347 2023-01-24 02:57:36.383099: step: 864/466, loss: 0.02499072439968586 2023-01-24 02:57:37.034991: step: 866/466, loss: 0.09587027877569199 2023-01-24 02:57:37.658974: step: 868/466, loss: 0.1396237313747406 2023-01-24 02:57:38.284576: step: 870/466, loss: 0.06507892161607742 2023-01-24 02:57:38.961053: step: 872/466, loss: 0.044944878667593 2023-01-24 02:57:39.558175: step: 874/466, loss: 0.03046913631260395 2023-01-24 02:57:40.168640: step: 876/466, loss: 
0.03433838114142418 2023-01-24 02:57:40.840792: step: 878/466, loss: 0.06141744181513786 2023-01-24 02:57:41.496697: step: 880/466, loss: 0.2057872861623764 2023-01-24 02:57:42.119510: step: 882/466, loss: 0.04921253025531769 2023-01-24 02:57:42.779421: step: 884/466, loss: 0.01852300949394703 2023-01-24 02:57:43.401454: step: 886/466, loss: 0.015369892120361328 2023-01-24 02:57:44.010389: step: 888/466, loss: 0.14897996187210083 2023-01-24 02:57:44.653355: step: 890/466, loss: 0.027026517316699028 2023-01-24 02:57:45.320955: step: 892/466, loss: 3.0559287071228027 2023-01-24 02:57:45.933030: step: 894/466, loss: 0.029394425451755524 2023-01-24 02:57:46.567070: step: 896/466, loss: 0.05040713772177696 2023-01-24 02:57:47.249387: step: 898/466, loss: 0.4347133934497833 2023-01-24 02:57:47.917536: step: 900/466, loss: 0.5011553764343262 2023-01-24 02:57:48.495163: step: 902/466, loss: 0.0415460541844368 2023-01-24 02:57:49.143051: step: 904/466, loss: 0.43558353185653687 2023-01-24 02:57:49.777759: step: 906/466, loss: 0.2618334889411926 2023-01-24 02:57:50.291815: step: 908/466, loss: 0.005394692067056894 2023-01-24 02:57:50.965156: step: 910/466, loss: 0.06840041279792786 2023-01-24 02:57:51.574712: step: 912/466, loss: 0.05254624783992767 2023-01-24 02:57:52.191513: step: 914/466, loss: 0.11503753811120987 2023-01-24 02:57:52.785396: step: 916/466, loss: 0.05245266482234001 2023-01-24 02:57:53.395678: step: 918/466, loss: 0.5177193284034729 2023-01-24 02:57:53.983519: step: 920/466, loss: 0.07801266014575958 2023-01-24 02:57:54.623666: step: 922/466, loss: 0.03294721990823746 2023-01-24 02:57:55.237401: step: 924/466, loss: 0.004950059577822685 2023-01-24 02:57:55.835285: step: 926/466, loss: 0.23353128135204315 2023-01-24 02:57:56.442885: step: 928/466, loss: 0.034723926335573196 2023-01-24 02:57:57.108817: step: 930/466, loss: 0.3066639006137848 2023-01-24 02:57:57.694627: step: 932/466, loss: 0.013994027860462666 
==================================================
Loss: 0.132
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3529250406962184, 'r': 0.3241285003737565, 'f1': 0.3379143812007313}, 'combined': 0.24898954404264412, 'epoch': 20}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36150161213674326, 'r': 0.2838804867601958, 'f1': 0.3180232417148653}, 'combined': 0.21091696859845985, 'epoch': 20}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.10344827586206896, 'f1': 0.15789473684210528}, 'combined': 0.10526315789473685, 'epoch': 20}
New best chinese model...
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 21
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 03:00:47.760293: step: 2/466, loss: 0.07089442014694214 2023-01-24 03:00:48.413510: step: 4/466, loss: 0.049593016505241394 2023-01-24 03:00:49.106849: step: 6/466, loss: 0.028266672044992447 2023-01-24 03:00:49.683996: step: 8/466, loss: 0.007336086593568325 2023-01-24 03:00:50.317989: step: 10/466, loss: 0.060494426637887955 2023-01-24 03:00:50.884950: step: 12/466, loss: 0.009333517402410507 2023-01-24 03:00:51.529292: step: 14/466, loss: 0.09427694231271744 2023-01-24 03:00:52.145569: step: 16/466, loss: 0.10178826749324799 2023-01-24 03:00:52.772820: step: 18/466, loss: 0.07092513889074326 2023-01-24 03:00:53.308841: step: 20/466, loss: 0.07888146489858627 2023-01-24 03:00:53.945815: step: 22/466, loss: 0.036193717271089554 2023-01-24 03:00:54.544912: step: 24/466, loss: 0.06915474683046341 2023-01-24 03:00:55.124540: step: 26/466, loss: 0.05310487374663353 2023-01-24 03:00:55.746347: step: 28/466, loss: 0.2157112956047058 2023-01-24 03:00:56.350044: step: 30/466, loss: 0.011305051855742931 2023-01-24 03:00:56.964162: step: 32/466, loss: 0.023503711447119713 2023-01-24 03:00:57.584140: step: 34/466, loss: 0.027887945994734764 2023-01-24 03:00:58.294241: step: 36/466, loss: 0.005748997442424297 2023-01-24 03:00:58.907847: step: 38/466, loss: 0.11313948035240173 2023-01-24 03:00:59.514292: step: 40/466, loss: 0.05925760790705681 2023-01-24 03:01:00.072494: step: 42/466, loss: 0.007985766977071762 2023-01-24 03:01:00.695377: step: 44/466, loss: 
0.030068177729845047 2023-01-24 03:01:01.303516: step: 46/466, loss: 0.05711689963936806 2023-01-24 03:01:01.930991: step: 48/466, loss: 0.053210336714982986 2023-01-24 03:01:02.552274: step: 50/466, loss: 0.44939592480659485 2023-01-24 03:01:03.143692: step: 52/466, loss: 0.03848563879728317 2023-01-24 03:01:03.811666: step: 54/466, loss: 0.22953787446022034 2023-01-24 03:01:04.443425: step: 56/466, loss: 0.10415433347225189 2023-01-24 03:01:05.029410: step: 58/466, loss: 0.08039028197526932 2023-01-24 03:01:05.819725: step: 60/466, loss: 0.03800737485289574 2023-01-24 03:01:06.450170: step: 62/466, loss: 0.1007322296500206 2023-01-24 03:01:07.034360: step: 64/466, loss: 0.07855730503797531 2023-01-24 03:01:07.644228: step: 66/466, loss: 0.006035930011421442 2023-01-24 03:01:08.289885: step: 68/466, loss: 0.07759585231542587 2023-01-24 03:01:08.953624: step: 70/466, loss: 0.05112384259700775 2023-01-24 03:01:09.658229: step: 72/466, loss: 0.12357943505048752 2023-01-24 03:01:10.361411: step: 74/466, loss: 0.11555702984333038 2023-01-24 03:01:11.133117: step: 76/466, loss: 0.028289109468460083 2023-01-24 03:01:11.761712: step: 78/466, loss: 0.08214019984006882 2023-01-24 03:01:12.347113: step: 80/466, loss: 0.02809653989970684 2023-01-24 03:01:12.969020: step: 82/466, loss: 0.06252384185791016 2023-01-24 03:01:13.592833: step: 84/466, loss: 0.06915662437677383 2023-01-24 03:01:14.230897: step: 86/466, loss: 0.0634181797504425 2023-01-24 03:01:14.865038: step: 88/466, loss: 0.1411423683166504 2023-01-24 03:01:15.533872: step: 90/466, loss: 0.17460677027702332 2023-01-24 03:01:16.236929: step: 92/466, loss: 0.04477472975850105 2023-01-24 03:01:16.923315: step: 94/466, loss: 0.014824727550148964 2023-01-24 03:01:17.544302: step: 96/466, loss: 0.022244907915592194 2023-01-24 03:01:18.211096: step: 98/466, loss: 0.6839988231658936 2023-01-24 03:01:18.870659: step: 100/466, loss: 0.021492348983883858 2023-01-24 03:01:19.501013: step: 102/466, loss: 0.01563192345201969 
2023-01-24 03:01:20.163936: step: 104/466, loss: 0.0054557230323553085 2023-01-24 03:01:20.759466: step: 106/466, loss: 0.19327561557292938 2023-01-24 03:01:21.357057: step: 108/466, loss: 0.029385611414909363 2023-01-24 03:01:21.981722: step: 110/466, loss: 0.11822323501110077 2023-01-24 03:01:22.645126: step: 112/466, loss: 0.043833471834659576 2023-01-24 03:01:23.262156: step: 114/466, loss: 1.1973426342010498 2023-01-24 03:01:23.913773: step: 116/466, loss: 0.03044801950454712 2023-01-24 03:01:24.567973: step: 118/466, loss: 0.06850605458021164 2023-01-24 03:01:25.233267: step: 120/466, loss: 0.027155661955475807 2023-01-24 03:01:25.814792: step: 122/466, loss: 0.010388410650193691 2023-01-24 03:01:26.415739: step: 124/466, loss: 0.05787854641675949 2023-01-24 03:01:27.089295: step: 126/466, loss: 0.07108601182699203 2023-01-24 03:01:27.709717: step: 128/466, loss: 0.023687263950705528 2023-01-24 03:01:28.330674: step: 130/466, loss: 0.007384110242128372 2023-01-24 03:01:28.942655: step: 132/466, loss: 0.06754773110151291 2023-01-24 03:01:29.560560: step: 134/466, loss: 0.05037299543619156 2023-01-24 03:01:30.238773: step: 136/466, loss: 0.007863358594477177 2023-01-24 03:01:30.862187: step: 138/466, loss: 0.03649885579943657 2023-01-24 03:01:31.472344: step: 140/466, loss: 0.03751469403505325 2023-01-24 03:01:32.150917: step: 142/466, loss: 0.037405870854854584 2023-01-24 03:01:32.822197: step: 144/466, loss: 0.05105443298816681 2023-01-24 03:01:33.502911: step: 146/466, loss: 0.027803178876638412 2023-01-24 03:01:34.142705: step: 148/466, loss: 0.027827255427837372 2023-01-24 03:01:34.878428: step: 150/466, loss: 0.09103554487228394 2023-01-24 03:01:35.531905: step: 152/466, loss: 0.011072168126702309 2023-01-24 03:01:36.162041: step: 154/466, loss: 0.06728797405958176 2023-01-24 03:01:36.851714: step: 156/466, loss: 0.040998075157403946 2023-01-24 03:01:37.485445: step: 158/466, loss: 0.08982980996370316 2023-01-24 03:01:38.082118: step: 160/466, loss: 
0.03985963389277458 2023-01-24 03:01:38.719770: step: 162/466, loss: 0.23042425513267517 2023-01-24 03:01:39.299679: step: 164/466, loss: 0.06946532428264618 2023-01-24 03:01:39.929449: step: 166/466, loss: 0.05300728231668472 2023-01-24 03:01:40.510632: step: 168/466, loss: 0.019600747153162956 2023-01-24 03:01:41.124483: step: 170/466, loss: 0.7505860924720764 2023-01-24 03:01:41.833761: step: 172/466, loss: 0.08120985329151154 2023-01-24 03:01:42.476954: step: 174/466, loss: 0.03269527107477188 2023-01-24 03:01:43.083223: step: 176/466, loss: 0.027445124462246895 2023-01-24 03:01:43.689555: step: 178/466, loss: 0.05066234618425369 2023-01-24 03:01:44.257695: step: 180/466, loss: 0.06630782783031464 2023-01-24 03:01:44.880837: step: 182/466, loss: 0.03700609877705574 2023-01-24 03:01:45.500099: step: 184/466, loss: 0.006005365867167711 2023-01-24 03:01:46.163174: step: 186/466, loss: 0.08509300649166107 2023-01-24 03:01:46.820885: step: 188/466, loss: 0.18691766262054443 2023-01-24 03:01:47.443730: step: 190/466, loss: 0.017493173480033875 2023-01-24 03:01:48.044300: step: 192/466, loss: 0.029288295656442642 2023-01-24 03:01:48.701083: step: 194/466, loss: 0.017930535599589348 2023-01-24 03:01:49.305771: step: 196/466, loss: 0.1339683085680008 2023-01-24 03:01:49.938825: step: 198/466, loss: 0.07602230459451675 2023-01-24 03:01:50.624795: step: 200/466, loss: 0.026082264259457588 2023-01-24 03:01:51.346233: step: 202/466, loss: 0.29196175932884216 2023-01-24 03:01:51.940510: step: 204/466, loss: 0.004092944320291281 2023-01-24 03:01:52.578870: step: 206/466, loss: 0.0425824411213398 2023-01-24 03:01:53.311761: step: 208/466, loss: 1.332690954208374 2023-01-24 03:01:53.932722: step: 210/466, loss: 0.09265276789665222 2023-01-24 03:01:54.563202: step: 212/466, loss: 0.07450100779533386 2023-01-24 03:01:55.179019: step: 214/466, loss: 0.062187325209379196 2023-01-24 03:01:55.805888: step: 216/466, loss: 0.1990901678800583 2023-01-24 03:01:56.445290: step: 218/466, 
loss: 0.11643319576978683 2023-01-24 03:01:57.073513: step: 220/466, loss: 0.04903949797153473 2023-01-24 03:01:57.658978: step: 222/466, loss: 0.1236814484000206 2023-01-24 03:01:58.313488: step: 224/466, loss: 0.03793327882885933 2023-01-24 03:01:58.935775: step: 226/466, loss: 0.006792228668928146 2023-01-24 03:01:59.586249: step: 228/466, loss: 0.7302740812301636 2023-01-24 03:02:00.192451: step: 230/466, loss: 0.13557493686676025 2023-01-24 03:02:00.835003: step: 232/466, loss: 0.03244762122631073 2023-01-24 03:02:01.435860: step: 234/466, loss: 0.049592722207307816 2023-01-24 03:02:02.093163: step: 236/466, loss: 0.10927261412143707 2023-01-24 03:02:02.806382: step: 238/466, loss: 0.03153478354215622 2023-01-24 03:02:03.372731: step: 240/466, loss: 0.007875959388911724 2023-01-24 03:02:03.978413: step: 242/466, loss: 0.8636252880096436 2023-01-24 03:02:04.604621: step: 244/466, loss: 0.03750202804803848 2023-01-24 03:02:05.222224: step: 246/466, loss: 0.14788737893104553 2023-01-24 03:02:05.856478: step: 248/466, loss: 0.050005968660116196 2023-01-24 03:02:06.496149: step: 250/466, loss: 0.027507677674293518 2023-01-24 03:02:07.072059: step: 252/466, loss: 0.04816051572561264 2023-01-24 03:02:07.714000: step: 254/466, loss: 0.06385090202093124 2023-01-24 03:02:08.334736: step: 256/466, loss: 0.04586299508810043 2023-01-24 03:02:08.979787: step: 258/466, loss: 0.08451353013515472 2023-01-24 03:02:09.559766: step: 260/466, loss: 0.031098363921046257 2023-01-24 03:02:10.182746: step: 262/466, loss: 0.03151926398277283 2023-01-24 03:02:10.789554: step: 264/466, loss: 0.054376523941755295 2023-01-24 03:02:11.378600: step: 266/466, loss: 0.07591578364372253 2023-01-24 03:02:11.962699: step: 268/466, loss: 0.06133175268769264 2023-01-24 03:02:12.600733: step: 270/466, loss: 0.061705656349658966 2023-01-24 03:02:13.230300: step: 272/466, loss: 0.05282587185502052 2023-01-24 03:02:13.843154: step: 274/466, loss: 0.016464704647660255 2023-01-24 03:02:14.439174: step: 
276/466, loss: 0.019573258236050606 2023-01-24 03:02:15.086185: step: 278/466, loss: 0.07507047802209854 2023-01-24 03:02:15.725597: step: 280/466, loss: 0.026676513254642487 2023-01-24 03:02:16.403402: step: 282/466, loss: 1.1871224641799927 2023-01-24 03:02:17.010311: step: 284/466, loss: 0.037493109703063965 2023-01-24 03:02:17.680908: step: 286/466, loss: 0.028032071888446808 2023-01-24 03:02:18.333677: step: 288/466, loss: 0.03383786231279373 2023-01-24 03:02:19.012850: step: 290/466, loss: 0.10688777267932892 2023-01-24 03:02:19.750823: step: 292/466, loss: 0.08672936260700226 2023-01-24 03:02:20.451628: step: 294/466, loss: 0.0877775326371193 2023-01-24 03:02:21.063762: step: 296/466, loss: 0.019870324060320854 2023-01-24 03:02:21.695622: step: 298/466, loss: 0.06824889779090881 2023-01-24 03:02:22.295141: step: 300/466, loss: 0.01863786205649376 2023-01-24 03:02:22.889601: step: 302/466, loss: 0.0038263502065092325 2023-01-24 03:02:23.537427: step: 304/466, loss: 0.019151929765939713 2023-01-24 03:02:24.169937: step: 306/466, loss: 0.04331501945853233 2023-01-24 03:02:24.741309: step: 308/466, loss: 0.06901475042104721 2023-01-24 03:02:25.393675: step: 310/466, loss: 0.07544735819101334 2023-01-24 03:02:26.023137: step: 312/466, loss: 0.02430512011051178 2023-01-24 03:02:26.651604: step: 314/466, loss: 0.0816473513841629 2023-01-24 03:02:27.266570: step: 316/466, loss: 0.04463319852948189 2023-01-24 03:02:27.878904: step: 318/466, loss: 0.010686423629522324 2023-01-24 03:02:28.555667: step: 320/466, loss: 0.047023750841617584 2023-01-24 03:02:29.197896: step: 322/466, loss: 0.06095031648874283 2023-01-24 03:02:29.767831: step: 324/466, loss: 0.019171956926584244 2023-01-24 03:02:30.379240: step: 326/466, loss: 0.10878366231918335 2023-01-24 03:02:31.084865: step: 328/466, loss: 0.10855834186077118 2023-01-24 03:02:31.738981: step: 330/466, loss: 0.09366338700056076 2023-01-24 03:02:32.358984: step: 332/466, loss: 0.021269308403134346 2023-01-24 
03:02:32.973950: step: 334/466, loss: 0.0992373451590538 2023-01-24 03:02:33.613142: step: 336/466, loss: 0.15862494707107544 2023-01-24 03:02:34.126436: step: 338/466, loss: 0.027936633676290512 2023-01-24 03:02:34.758284: step: 340/466, loss: 0.027892297133803368 2023-01-24 03:02:35.319962: step: 342/466, loss: 0.054857946932315826 2023-01-24 03:02:35.986308: step: 344/466, loss: 0.019250625744462013 2023-01-24 03:02:36.586045: step: 346/466, loss: 0.058253031224012375 2023-01-24 03:02:37.330836: step: 348/466, loss: 0.10461164265871048 2023-01-24 03:02:37.854710: step: 350/466, loss: 0.008837548084557056 2023-01-24 03:02:38.437140: step: 352/466, loss: 0.05633135139942169 2023-01-24 03:02:38.991899: step: 354/466, loss: 0.35336071252822876 2023-01-24 03:02:39.590466: step: 356/466, loss: 0.14346835017204285 2023-01-24 03:02:40.166426: step: 358/466, loss: 0.07668203860521317 2023-01-24 03:02:40.754701: step: 360/466, loss: 0.08573367446660995 2023-01-24 03:02:41.366275: step: 362/466, loss: 0.013371109962463379 2023-01-24 03:02:41.953349: step: 364/466, loss: 0.014967229217290878 2023-01-24 03:02:42.697814: step: 366/466, loss: 0.035412389785051346 2023-01-24 03:02:43.278578: step: 368/466, loss: 0.026346392929553986 2023-01-24 03:02:43.910594: step: 370/466, loss: 0.06741994619369507 2023-01-24 03:02:44.580988: step: 372/466, loss: 0.15302880108356476 2023-01-24 03:02:45.202210: step: 374/466, loss: 0.04921845719218254 2023-01-24 03:02:45.792516: step: 376/466, loss: 0.03806721791625023 2023-01-24 03:02:46.372567: step: 378/466, loss: 0.08171114325523376 2023-01-24 03:02:46.996654: step: 380/466, loss: 0.01198134571313858 2023-01-24 03:02:47.572117: step: 382/466, loss: 0.23268190026283264 2023-01-24 03:02:48.138047: step: 384/466, loss: 0.25207993388175964 2023-01-24 03:02:48.751007: step: 386/466, loss: 0.2673199474811554 2023-01-24 03:02:49.420710: step: 388/466, loss: 0.07382744550704956 2023-01-24 03:02:50.021490: step: 390/466, loss: 0.05229777470231056 
2023-01-24 03:02:50.592328: step: 392/466, loss: 0.04268745332956314 2023-01-24 03:02:51.173280: step: 394/466, loss: 0.0031622499227523804 2023-01-24 03:02:51.793640: step: 396/466, loss: 0.19509929418563843 2023-01-24 03:02:52.478610: step: 398/466, loss: 0.08037510514259338 2023-01-24 03:02:53.137290: step: 400/466, loss: 0.05764692276716232 2023-01-24 03:02:53.764878: step: 402/466, loss: 0.05277779698371887 2023-01-24 03:02:54.453140: step: 404/466, loss: 0.049481362104415894 2023-01-24 03:02:55.073198: step: 406/466, loss: 0.1518610268831253 2023-01-24 03:02:55.687684: step: 408/466, loss: 0.0408242866396904 2023-01-24 03:02:56.284114: step: 410/466, loss: 0.07179911434650421 2023-01-24 03:02:56.943711: step: 412/466, loss: 0.08425852656364441 2023-01-24 03:02:57.626561: step: 414/466, loss: 0.05670277029275894 2023-01-24 03:02:58.196914: step: 416/466, loss: 0.014492363668978214 2023-01-24 03:02:58.847293: step: 418/466, loss: 0.10018357634544373 2023-01-24 03:02:59.397550: step: 420/466, loss: 0.005106528755277395 2023-01-24 03:03:00.055306: step: 422/466, loss: 0.03475326672196388 2023-01-24 03:03:00.678330: step: 424/466, loss: 0.05975669249892235 2023-01-24 03:03:01.321050: step: 426/466, loss: 1.164226770401001 2023-01-24 03:03:01.952379: step: 428/466, loss: 0.03273681923747063 2023-01-24 03:03:02.610840: step: 430/466, loss: 0.020399469882249832 2023-01-24 03:03:03.226350: step: 432/466, loss: 0.12875859439373016 2023-01-24 03:03:03.882302: step: 434/466, loss: 0.057574886828660965 2023-01-24 03:03:04.526322: step: 436/466, loss: 0.04370862618088722 2023-01-24 03:03:05.132850: step: 438/466, loss: 0.07838702201843262 2023-01-24 03:03:05.730124: step: 440/466, loss: 0.3533000349998474 2023-01-24 03:03:06.356467: step: 442/466, loss: 0.025857489556074142 2023-01-24 03:03:06.981300: step: 444/466, loss: 0.02611837536096573 2023-01-24 03:03:07.610486: step: 446/466, loss: 0.034644901752471924 2023-01-24 03:03:08.245826: step: 448/466, loss: 
0.18575358390808105 2023-01-24 03:03:08.894831: step: 450/466, loss: 0.016335871070623398 2023-01-24 03:03:09.516669: step: 452/466, loss: 0.16690199077129364 2023-01-24 03:03:10.096542: step: 454/466, loss: 0.04394973814487457 2023-01-24 03:03:10.704826: step: 456/466, loss: 0.41671910881996155 2023-01-24 03:03:11.289552: step: 458/466, loss: 4.167999267578125 2023-01-24 03:03:11.913121: step: 460/466, loss: 1.054667592048645 2023-01-24 03:03:12.630288: step: 462/466, loss: 0.10038013756275177 2023-01-24 03:03:13.312641: step: 464/466, loss: 0.034050267189741135 2023-01-24 03:03:13.973245: step: 466/466, loss: 0.040084537118673325 2023-01-24 03:03:14.650515: step: 468/466, loss: 0.08686064928770065 2023-01-24 03:03:15.281166: step: 470/466, loss: 0.03887888044118881 2023-01-24 03:03:15.826310: step: 472/466, loss: 0.019564831629395485 2023-01-24 03:03:16.437180: step: 474/466, loss: 0.011036333627998829 2023-01-24 03:03:17.064965: step: 476/466, loss: 0.13818205893039703 2023-01-24 03:03:17.681305: step: 478/466, loss: 0.0896473228931427 2023-01-24 03:03:18.370894: step: 480/466, loss: 0.04010609909892082 2023-01-24 03:03:18.979818: step: 482/466, loss: 0.05827804282307625 2023-01-24 03:03:19.576080: step: 484/466, loss: 0.033501170575618744 2023-01-24 03:03:20.225526: step: 486/466, loss: 0.12091982364654541 2023-01-24 03:03:20.886074: step: 488/466, loss: 0.059742338955402374 2023-01-24 03:03:21.548833: step: 490/466, loss: 0.0497790165245533 2023-01-24 03:03:22.225519: step: 492/466, loss: 0.022443275898694992 2023-01-24 03:03:22.863858: step: 494/466, loss: 0.07436135411262512 2023-01-24 03:03:23.431315: step: 496/466, loss: 0.05396753177046776 2023-01-24 03:03:24.090369: step: 498/466, loss: 0.06043091043829918 2023-01-24 03:03:24.660890: step: 500/466, loss: 0.006692246999591589 2023-01-24 03:03:25.320516: step: 502/466, loss: 0.056607022881507874 2023-01-24 03:03:25.954766: step: 504/466, loss: 0.03901538997888565 2023-01-24 03:03:26.629717: step: 506/466, 
loss: 0.047609083354473114 2023-01-24 03:03:27.161265: step: 508/466, loss: 0.008187373168766499 2023-01-24 03:03:27.746611: step: 510/466, loss: 0.038169074803590775 2023-01-24 03:03:28.375172: step: 512/466, loss: 0.10957618057727814 2023-01-24 03:03:29.032959: step: 514/466, loss: 0.06552216410636902 2023-01-24 03:03:29.742610: step: 516/466, loss: 0.09400929510593414 2023-01-24 03:03:30.401503: step: 518/466, loss: 0.03930743411183357 2023-01-24 03:03:31.057204: step: 520/466, loss: 0.011088813655078411 2023-01-24 03:03:31.724511: step: 522/466, loss: 0.043089017271995544 2023-01-24 03:03:32.372223: step: 524/466, loss: 0.01242264173924923 2023-01-24 03:03:33.004648: step: 526/466, loss: 0.0019793591927736998 2023-01-24 03:03:33.632832: step: 528/466, loss: 0.057496801018714905 2023-01-24 03:03:34.262150: step: 530/466, loss: 0.018461046740412712 2023-01-24 03:03:34.912396: step: 532/466, loss: 0.05408164858818054 2023-01-24 03:03:35.526875: step: 534/466, loss: 0.024738499894738197 2023-01-24 03:03:36.169487: step: 536/466, loss: 0.04198291152715683 2023-01-24 03:03:36.795885: step: 538/466, loss: 0.7320752143859863 2023-01-24 03:03:37.481931: step: 540/466, loss: 0.132931649684906 2023-01-24 03:03:38.135712: step: 542/466, loss: 0.0771997720003128 2023-01-24 03:03:38.755648: step: 544/466, loss: 0.30851155519485474 2023-01-24 03:03:39.388491: step: 546/466, loss: 0.016442397609353065 2023-01-24 03:03:40.004895: step: 548/466, loss: 0.07672475278377533 2023-01-24 03:03:40.659700: step: 550/466, loss: 0.03206007182598114 2023-01-24 03:03:41.283565: step: 552/466, loss: 0.08088162541389465 2023-01-24 03:03:41.982192: step: 554/466, loss: 0.33736979961395264 2023-01-24 03:03:42.596070: step: 556/466, loss: 0.049401722848415375 2023-01-24 03:03:43.270647: step: 558/466, loss: 0.033894944936037064 2023-01-24 03:03:43.878234: step: 560/466, loss: 0.6094706654548645 2023-01-24 03:03:44.478902: step: 562/466, loss: 0.030762692913413048 2023-01-24 03:03:45.110279: 
step: 564/466, loss: 0.022721601650118828 2023-01-24 03:03:45.710211: step: 566/466, loss: 0.01707230508327484 2023-01-24 03:03:46.344877: step: 568/466, loss: 0.02287483401596546 2023-01-24 03:03:46.957970: step: 570/466, loss: 0.045053161680698395 2023-01-24 03:03:47.532683: step: 572/466, loss: 0.04668237641453743 2023-01-24 03:03:48.136952: step: 574/466, loss: 0.017005762085318565 2023-01-24 03:03:48.754652: step: 576/466, loss: 0.034409213811159134 2023-01-24 03:03:49.362617: step: 578/466, loss: 0.08835047483444214 2023-01-24 03:03:50.006479: step: 580/466, loss: 0.05900361016392708 2023-01-24 03:03:50.579115: step: 582/466, loss: 0.07710152119398117 2023-01-24 03:03:51.179994: step: 584/466, loss: 0.033414438366889954 2023-01-24 03:03:51.862643: step: 586/466, loss: 0.04823828116059303 2023-01-24 03:03:52.553272: step: 588/466, loss: 0.04212420806288719 2023-01-24 03:03:53.126418: step: 590/466, loss: 0.015205265022814274 2023-01-24 03:03:53.731164: step: 592/466, loss: 0.019105862826108932 2023-01-24 03:03:54.329629: step: 594/466, loss: 0.04656066745519638 2023-01-24 03:03:54.942876: step: 596/466, loss: 0.5374192595481873 2023-01-24 03:03:55.578597: step: 598/466, loss: 0.12201011925935745 2023-01-24 03:03:56.190508: step: 600/466, loss: 0.04644192382693291 2023-01-24 03:03:56.866767: step: 602/466, loss: 0.08558553457260132 2023-01-24 03:03:57.481502: step: 604/466, loss: 0.004719720687717199 2023-01-24 03:03:58.188705: step: 606/466, loss: 0.05577418953180313 2023-01-24 03:03:58.764373: step: 608/466, loss: 0.03790593519806862 2023-01-24 03:03:59.394428: step: 610/466, loss: 0.15378184616565704 2023-01-24 03:03:59.927858: step: 612/466, loss: 0.09049602597951889 2023-01-24 03:04:00.595632: step: 614/466, loss: 0.056327223777770996 2023-01-24 03:04:01.165293: step: 616/466, loss: 0.04873852804303169 2023-01-24 03:04:01.794587: step: 618/466, loss: 0.021303944289684296 2023-01-24 03:04:02.464306: step: 620/466, loss: 0.03830740973353386 2023-01-24 
03:04:03.045725: step: 622/466, loss: 0.08673488348722458 2023-01-24 03:04:03.668585: step: 624/466, loss: 0.010257110930979252 2023-01-24 03:04:04.299114: step: 626/466, loss: 0.005480485036969185 2023-01-24 03:04:04.897153: step: 628/466, loss: 0.09009231626987457 2023-01-24 03:04:05.517470: step: 630/466, loss: 0.12214872241020203 2023-01-24 03:04:06.177051: step: 632/466, loss: 0.048097286373376846 2023-01-24 03:04:06.767970: step: 634/466, loss: 0.18809495866298676 2023-01-24 03:04:07.372303: step: 636/466, loss: 0.040272731333971024 2023-01-24 03:04:08.059675: step: 638/466, loss: 0.009438310749828815 2023-01-24 03:04:08.705992: step: 640/466, loss: 0.052058763802051544 2023-01-24 03:04:09.370690: step: 642/466, loss: 0.04133598506450653 2023-01-24 03:04:09.980730: step: 644/466, loss: 0.028369462117552757 2023-01-24 03:04:10.556178: step: 646/466, loss: 0.00392342172563076 2023-01-24 03:04:11.323458: step: 648/466, loss: 0.47618475556373596 2023-01-24 03:04:11.970397: step: 650/466, loss: 0.06533084064722061 2023-01-24 03:04:12.589370: step: 652/466, loss: 0.022673049941658974 2023-01-24 03:04:13.258246: step: 654/466, loss: 0.06585227698087692 2023-01-24 03:04:13.916831: step: 656/466, loss: 0.1225774809718132 2023-01-24 03:04:14.512146: step: 658/466, loss: 0.12389501929283142 2023-01-24 03:04:15.138603: step: 660/466, loss: 0.10344074666500092 2023-01-24 03:04:15.739709: step: 662/466, loss: 0.06762241572141647 2023-01-24 03:04:16.375964: step: 664/466, loss: 0.018936727195978165 2023-01-24 03:04:16.968908: step: 666/466, loss: 0.1071561723947525 2023-01-24 03:04:17.628202: step: 668/466, loss: 0.07230392098426819 2023-01-24 03:04:18.264706: step: 670/466, loss: 0.034910738468170166 2023-01-24 03:04:18.847701: step: 672/466, loss: 0.009688897989690304 2023-01-24 03:04:19.420942: step: 674/466, loss: 0.03755712881684303 2023-01-24 03:04:20.068459: step: 676/466, loss: 0.12783585488796234 2023-01-24 03:04:20.844161: step: 678/466, loss: 0.01756904274225235 
2023-01-24 03:04:21.513734: step: 680/466, loss: 0.09806942939758301 2023-01-24 03:04:22.096189: step: 682/466, loss: 0.01094213966280222 2023-01-24 03:04:22.691784: step: 684/466, loss: 0.08024577796459198 2023-01-24 03:04:23.268581: step: 686/466, loss: 0.08542653173208237 2023-01-24 03:04:23.904766: step: 688/466, loss: 0.0802675113081932 2023-01-24 03:04:24.608409: step: 690/466, loss: 0.07116401195526123 2023-01-24 03:04:25.201303: step: 692/466, loss: 0.00978310126811266 2023-01-24 03:04:25.941128: step: 694/466, loss: 0.062231533229351044 2023-01-24 03:04:26.566259: step: 696/466, loss: 0.009687211364507675 2023-01-24 03:04:27.206695: step: 698/466, loss: 0.017184313386678696 2023-01-24 03:04:27.800023: step: 700/466, loss: 0.10960377752780914 2023-01-24 03:04:28.371654: step: 702/466, loss: 0.11471197009086609 2023-01-24 03:04:28.977769: step: 704/466, loss: 0.11303263157606125 2023-01-24 03:04:29.565030: step: 706/466, loss: 0.06571970880031586 2023-01-24 03:04:30.193538: step: 708/466, loss: 0.02506648376584053 2023-01-24 03:04:30.817474: step: 710/466, loss: 0.05620276555418968 2023-01-24 03:04:31.502570: step: 712/466, loss: 0.1362437605857849 2023-01-24 03:04:32.193322: step: 714/466, loss: 0.10670512914657593 2023-01-24 03:04:32.793793: step: 716/466, loss: 0.012019413523375988 2023-01-24 03:04:33.382885: step: 718/466, loss: 0.5341221690177917 2023-01-24 03:04:34.020615: step: 720/466, loss: 0.14672306180000305 2023-01-24 03:04:34.661819: step: 722/466, loss: 0.11437015235424042 2023-01-24 03:04:35.317942: step: 724/466, loss: 0.09794958680868149 2023-01-24 03:04:35.925714: step: 726/466, loss: 0.01830868236720562 2023-01-24 03:04:36.537614: step: 728/466, loss: 0.04564938694238663 2023-01-24 03:04:37.182708: step: 730/466, loss: 0.05757681280374527 2023-01-24 03:04:37.795529: step: 732/466, loss: 0.06785449385643005 2023-01-24 03:04:38.416726: step: 734/466, loss: 0.027226001024246216 2023-01-24 03:04:39.059952: step: 736/466, loss: 
0.06051834672689438 2023-01-24 03:04:39.762763: step: 738/466, loss: 0.8186269402503967 2023-01-24 03:04:40.435569: step: 740/466, loss: 0.22527964413166046 2023-01-24 03:04:41.209557: step: 742/466, loss: 0.024167144671082497 2023-01-24 03:04:41.916847: step: 744/466, loss: 0.06646440923213959 2023-01-24 03:04:42.543307: step: 746/466, loss: 0.049378346651792526 2023-01-24 03:04:43.107590: step: 748/466, loss: 0.01376586128026247 2023-01-24 03:04:43.707138: step: 750/466, loss: 0.08511679619550705 2023-01-24 03:04:44.309525: step: 752/466, loss: 0.028222622349858284 2023-01-24 03:04:44.908762: step: 754/466, loss: 0.12172418087720871 2023-01-24 03:04:45.555402: step: 756/466, loss: 0.021877678111195564 2023-01-24 03:04:46.175010: step: 758/466, loss: 0.05017302185297012 2023-01-24 03:04:46.781528: step: 760/466, loss: 0.050847191363573074 2023-01-24 03:04:47.454094: step: 762/466, loss: 0.05814317241311073 2023-01-24 03:04:48.070693: step: 764/466, loss: 0.12317948043346405 2023-01-24 03:04:48.691166: step: 766/466, loss: 0.013819770887494087 2023-01-24 03:04:49.358782: step: 768/466, loss: 0.10745225101709366 2023-01-24 03:04:50.013367: step: 770/466, loss: 0.5924587845802307 2023-01-24 03:04:50.598658: step: 772/466, loss: 0.10218442976474762 2023-01-24 03:04:51.206151: step: 774/466, loss: 0.06877769529819489 2023-01-24 03:04:51.843760: step: 776/466, loss: 0.039890218526124954 2023-01-24 03:04:52.474446: step: 778/466, loss: 0.0665740892291069 2023-01-24 03:04:53.112337: step: 780/466, loss: 0.037569571286439896 2023-01-24 03:04:53.717876: step: 782/466, loss: 0.021711068227887154 2023-01-24 03:04:54.308002: step: 784/466, loss: 0.021156296133995056 2023-01-24 03:04:54.894281: step: 786/466, loss: 0.05858328565955162 2023-01-24 03:04:55.538367: step: 788/466, loss: 0.016213275492191315 2023-01-24 03:04:56.142721: step: 790/466, loss: 0.011195183731615543 2023-01-24 03:04:56.817382: step: 792/466, loss: 0.01327480748295784 2023-01-24 03:04:57.442085: step: 
794/466, loss: 0.015499280765652657 2023-01-24 03:04:58.099617: step: 796/466, loss: 0.007978331297636032 2023-01-24 03:04:58.719082: step: 798/466, loss: 0.31933045387268066 2023-01-24 03:04:59.379690: step: 800/466, loss: 0.13486042618751526 2023-01-24 03:05:00.034044: step: 802/466, loss: 0.028332777321338654 2023-01-24 03:05:00.702147: step: 804/466, loss: 0.036711569875478745 2023-01-24 03:05:01.402412: step: 806/466, loss: 0.02059587650001049 2023-01-24 03:05:02.018779: step: 808/466, loss: 0.023733140900731087 2023-01-24 03:05:02.726840: step: 810/466, loss: 0.055078815668821335 2023-01-24 03:05:03.314221: step: 812/466, loss: 0.015577802434563637 2023-01-24 03:05:03.931758: step: 814/466, loss: 0.01860414631664753 2023-01-24 03:05:04.516360: step: 816/466, loss: 0.005331545602530241 2023-01-24 03:05:05.167393: step: 818/466, loss: 0.060170210897922516 2023-01-24 03:05:05.747996: step: 820/466, loss: 0.0259061511605978 2023-01-24 03:05:06.385178: step: 822/466, loss: 0.05004158988595009 2023-01-24 03:05:07.024139: step: 824/466, loss: 0.10845375061035156 2023-01-24 03:05:07.672154: step: 826/466, loss: 0.08583880960941315 2023-01-24 03:05:08.387430: step: 828/466, loss: 0.0069772228598594666 2023-01-24 03:05:08.974222: step: 830/466, loss: 0.0416720025241375 2023-01-24 03:05:09.652354: step: 832/466, loss: 0.04423435404896736 2023-01-24 03:05:10.256052: step: 834/466, loss: 0.10517225414514542 2023-01-24 03:05:10.861985: step: 836/466, loss: 0.03952678292989731 2023-01-24 03:05:11.501501: step: 838/466, loss: 0.13019147515296936 2023-01-24 03:05:12.121874: step: 840/466, loss: 1.2260103225708008 2023-01-24 03:05:12.750401: step: 842/466, loss: 0.016969038173556328 2023-01-24 03:05:13.368371: step: 844/466, loss: 0.06362573057413101 2023-01-24 03:05:13.981218: step: 846/466, loss: 0.12209760397672653 2023-01-24 03:05:14.569509: step: 848/466, loss: 0.1008475199341774 2023-01-24 03:05:15.186968: step: 850/466, loss: 0.00492806127294898 2023-01-24 
03:05:15.890492: step: 852/466, loss: 0.039185430854558945 2023-01-24 03:05:16.483787: step: 854/466, loss: 0.061721861362457275 2023-01-24 03:05:17.059676: step: 856/466, loss: 0.02105090022087097 2023-01-24 03:05:17.711433: step: 858/466, loss: 0.04675479978322983 2023-01-24 03:05:18.355054: step: 860/466, loss: 0.019012775272130966 2023-01-24 03:05:18.956134: step: 862/466, loss: 0.22230355441570282 2023-01-24 03:05:19.574694: step: 864/466, loss: 0.08994642645120621 2023-01-24 03:05:20.204605: step: 866/466, loss: 0.03868407756090164 2023-01-24 03:05:20.860080: step: 868/466, loss: 0.09276425838470459 2023-01-24 03:05:21.504021: step: 870/466, loss: 0.06973981112241745 2023-01-24 03:05:22.108198: step: 872/466, loss: 0.01598411612212658 2023-01-24 03:05:22.751614: step: 874/466, loss: 0.10209723562002182 2023-01-24 03:05:23.368698: step: 876/466, loss: 0.012818582355976105 2023-01-24 03:05:23.983165: step: 878/466, loss: 0.024183141067624092 2023-01-24 03:05:24.644488: step: 880/466, loss: 0.052464935928583145 2023-01-24 03:05:25.315093: step: 882/466, loss: 0.06050201132893562 2023-01-24 03:05:25.944197: step: 884/466, loss: 0.7219332456588745 2023-01-24 03:05:26.586785: step: 886/466, loss: 0.05493218079209328 2023-01-24 03:05:27.224460: step: 888/466, loss: 0.01993950642645359 2023-01-24 03:05:27.896545: step: 890/466, loss: 0.2340915948152542 2023-01-24 03:05:28.525148: step: 892/466, loss: 0.0463443323969841 2023-01-24 03:05:29.121227: step: 894/466, loss: 0.19318298995494843 2023-01-24 03:05:29.781816: step: 896/466, loss: 0.051702868193387985 2023-01-24 03:05:30.366130: step: 898/466, loss: 0.0025893133133649826 2023-01-24 03:05:30.967980: step: 900/466, loss: 0.05517101660370827 2023-01-24 03:05:31.548702: step: 902/466, loss: 0.024286819621920586 2023-01-24 03:05:32.196307: step: 904/466, loss: 0.10800661891698837 2023-01-24 03:05:32.799204: step: 906/466, loss: 0.04178786650300026 2023-01-24 03:05:33.443762: step: 908/466, loss: 0.7517957091331482 
2023-01-24 03:05:34.119825: step: 910/466, loss: 0.07966821640729904 2023-01-24 03:05:34.687997: step: 912/466, loss: 0.03132426366209984 2023-01-24 03:05:35.403829: step: 914/466, loss: 0.09199076145887375 2023-01-24 03:05:36.047890: step: 916/466, loss: 0.07831721752882004 2023-01-24 03:05:36.722854: step: 918/466, loss: 0.02353138104081154 2023-01-24 03:05:37.343854: step: 920/466, loss: 0.007074636872857809 2023-01-24 03:05:37.873352: step: 922/466, loss: 0.025794483721256256 2023-01-24 03:05:38.520868: step: 924/466, loss: 0.06823277473449707 2023-01-24 03:05:39.133919: step: 926/466, loss: 0.04336352273821831 2023-01-24 03:05:39.773556: step: 928/466, loss: 0.09034590423107147 2023-01-24 03:05:40.398558: step: 930/466, loss: 0.03232569992542267 2023-01-24 03:05:41.010275: step: 932/466, loss: 0.10629475116729736 ================================================== Loss: 0.106 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3447215691832288, 'r': 0.32967679481659834, 'f1': 0.33703136928874355}, 'combined': 0.2483389036864426, 'epoch': 21} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35437441974499184, 'r': 0.2999734897407737, 'f1': 0.3249125727300254}, 'combined': 0.2154860585981515, 'epoch': 21} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34081101190476193, 'r': 0.2788453733766234, 'f1': 0.30672991071428574}, 'combined': 0.20448660714285716, 'epoch': 21} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3652470445732468, 'r': 0.28232267223512664, 'f1': 0.3184754288947568}, 'combined': 0.20784712201552544, 'epoch': 21} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3301824730606656, 'r': 0.3189048933356334, 'f1': 0.3244457119457119}, 'combined': 
0.23906526143368245, 'epoch': 21} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3409594251478631, 'r': 0.2903921556766709, 'f1': 0.313650731143046}, 'combined': 0.20801706521404084, 'epoch': 21} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3131313131313131, 'r': 0.2952380952380952, 'f1': 0.30392156862745096}, 'combined': 0.2026143790849673, 'epoch': 21} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4642857142857143, 'r': 0.2826086956521739, 'f1': 0.35135135135135126}, 'combined': 0.23423423423423417, 'epoch': 21} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.09482758620689655, 'f1': 0.141025641025641}, 'combined': 0.09401709401709399, 'epoch': 21} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 
0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 22 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 03:08:13.226142: step: 2/466, loss: 0.05701306089758873 2023-01-24 03:08:13.746705: step: 4/466, loss: 0.016013167798519135 2023-01-24 03:08:14.378798: step: 6/466, loss: 0.05998304486274719 2023-01-24 03:08:15.011809: step: 8/466, loss: 0.009235535748302937 2023-01-24 03:08:15.705854: step: 10/466, loss: 0.0133706359192729 2023-01-24 03:08:16.363055: step: 12/466, loss: 0.012997171841561794 2023-01-24 03:08:16.995333: step: 14/466, loss: 0.05838117375969887 2023-01-24 03:08:17.617298: step: 16/466, loss: 0.02190295420587063 2023-01-24 03:08:18.250635: step: 18/466, loss: 0.009717869572341442 2023-01-24 03:08:18.889983: step: 20/466, loss: 0.0579771064221859 2023-01-24 03:08:19.636162: step: 22/466, loss: 
0.2816099226474762 2023-01-24 03:08:20.250755: step: 24/466, loss: 0.016298677772283554 2023-01-24 03:08:20.896418: step: 26/466, loss: 0.01881585642695427 2023-01-24 03:08:21.537429: step: 28/466, loss: 0.02248998172581196 2023-01-24 03:08:22.158221: step: 30/466, loss: 0.11135374009609222 2023-01-24 03:08:22.733575: step: 32/466, loss: 0.009332059882581234 2023-01-24 03:08:23.323790: step: 34/466, loss: 0.04593450576066971 2023-01-24 03:08:23.947841: step: 36/466, loss: 1.151442050933838 2023-01-24 03:08:24.554907: step: 38/466, loss: 0.027117688208818436 2023-01-24 03:08:25.228981: step: 40/466, loss: 0.4642413556575775 2023-01-24 03:08:25.926894: step: 42/466, loss: 0.012373731471598148 2023-01-24 03:08:26.558229: step: 44/466, loss: 0.017485400661826134 2023-01-24 03:08:27.125767: step: 46/466, loss: 0.05597267299890518 2023-01-24 03:08:27.760266: step: 48/466, loss: 0.020483272150158882 2023-01-24 03:08:28.464508: step: 50/466, loss: 0.09144927561283112 2023-01-24 03:08:29.070857: step: 52/466, loss: 0.0037049695383757353 2023-01-24 03:08:29.752468: step: 54/466, loss: 0.12996885180473328 2023-01-24 03:08:30.401588: step: 56/466, loss: 0.06591861695051193 2023-01-24 03:08:31.023398: step: 58/466, loss: 0.04005799442529678 2023-01-24 03:08:31.646337: step: 60/466, loss: 0.08128900080919266 2023-01-24 03:08:32.291829: step: 62/466, loss: 0.06788836419582367 2023-01-24 03:08:32.872482: step: 64/466, loss: 0.03723767027258873 2023-01-24 03:08:33.462894: step: 66/466, loss: 0.05795411020517349 2023-01-24 03:08:34.069635: step: 68/466, loss: 0.006802109070122242 2023-01-24 03:08:34.713306: step: 70/466, loss: 0.023723842576146126 2023-01-24 03:08:35.373960: step: 72/466, loss: 0.04105640947818756 2023-01-24 03:08:35.982164: step: 74/466, loss: 0.014921381138265133 2023-01-24 03:08:36.624053: step: 76/466, loss: 0.010845566168427467 2023-01-24 03:08:37.237053: step: 78/466, loss: 0.007722970098257065 2023-01-24 03:08:37.827251: step: 80/466, loss: 
0.058121874928474426 2023-01-24 03:08:38.381276: step: 82/466, loss: 0.03176690265536308 2023-01-24 03:08:39.028303: step: 84/466, loss: 0.0331866480410099 2023-01-24 03:08:39.614016: step: 86/466, loss: 0.016345029696822166 2023-01-24 03:08:40.217623: step: 88/466, loss: 0.034479204565286636 2023-01-24 03:08:40.828616: step: 90/466, loss: 0.01930215209722519 2023-01-24 03:08:41.483993: step: 92/466, loss: 0.20928475260734558 2023-01-24 03:08:42.149411: step: 94/466, loss: 0.05373154953122139 2023-01-24 03:08:42.783833: step: 96/466, loss: 0.1054593026638031 2023-01-24 03:08:43.379981: step: 98/466, loss: 0.03443080559372902 2023-01-24 03:08:43.971660: step: 100/466, loss: 0.024371333420276642 2023-01-24 03:08:44.574632: step: 102/466, loss: 0.0245809443295002 2023-01-24 03:08:45.167152: step: 104/466, loss: 0.025508206337690353 2023-01-24 03:08:45.781566: step: 106/466, loss: 0.013353685848414898 2023-01-24 03:08:46.455104: step: 108/466, loss: 0.02436508983373642 2023-01-24 03:08:47.072207: step: 110/466, loss: 0.011672325432300568 2023-01-24 03:08:47.698084: step: 112/466, loss: 0.02600080519914627 2023-01-24 03:08:48.341706: step: 114/466, loss: 0.005994630511850119 2023-01-24 03:08:48.936575: step: 116/466, loss: 0.04980537295341492 2023-01-24 03:08:49.596638: step: 118/466, loss: 0.7140741944313049 2023-01-24 03:08:50.219071: step: 120/466, loss: 0.4516701400279999 2023-01-24 03:08:50.849545: step: 122/466, loss: 0.06617661565542221 2023-01-24 03:08:51.505006: step: 124/466, loss: 0.05022461339831352 2023-01-24 03:08:52.068969: step: 126/466, loss: 0.004974848125129938 2023-01-24 03:08:52.674044: step: 128/466, loss: 0.07024489343166351 2023-01-24 03:08:53.276796: step: 130/466, loss: 0.06471412628889084 2023-01-24 03:08:53.879860: step: 132/466, loss: 0.019128937274217606 2023-01-24 03:08:54.566277: step: 134/466, loss: 0.06269881129264832 2023-01-24 03:08:55.197088: step: 136/466, loss: 0.006089359056204557 2023-01-24 03:08:55.798371: step: 138/466, loss: 
0.03454560786485672 2023-01-24 03:08:56.479152: step: 140/466, loss: 0.043931517750024796 2023-01-24 03:08:57.161451: step: 142/466, loss: 0.04773161932826042 2023-01-24 03:08:57.768629: step: 144/466, loss: 0.00738488556817174 2023-01-24 03:08:58.359172: step: 146/466, loss: 0.010454999282956123 2023-01-24 03:08:59.028955: step: 148/466, loss: 0.06720507889986038 2023-01-24 03:08:59.688681: step: 150/466, loss: 0.09316574037075043 2023-01-24 03:09:00.277962: step: 152/466, loss: 0.21962876617908478 2023-01-24 03:09:00.941122: step: 154/466, loss: 0.031010324135422707 2023-01-24 03:09:01.560025: step: 156/466, loss: 0.04724857956171036 2023-01-24 03:09:02.198928: step: 158/466, loss: 0.09203273057937622 2023-01-24 03:09:02.800831: step: 160/466, loss: 0.19428269565105438 2023-01-24 03:09:03.451019: step: 162/466, loss: 0.033569034188985825 2023-01-24 03:09:04.050019: step: 164/466, loss: 0.04066508635878563 2023-01-24 03:09:04.624407: step: 166/466, loss: 0.09030517190694809 2023-01-24 03:09:05.253805: step: 168/466, loss: 0.052252087742090225 2023-01-24 03:09:05.774365: step: 170/466, loss: 0.035676948726177216 2023-01-24 03:09:06.389972: step: 172/466, loss: 0.006707613822072744 2023-01-24 03:09:06.985491: step: 174/466, loss: 0.027036748826503754 2023-01-24 03:09:07.688315: step: 176/466, loss: 0.0769827589392662 2023-01-24 03:09:08.266604: step: 178/466, loss: 0.15034034848213196 2023-01-24 03:09:08.905190: step: 180/466, loss: 0.04687143862247467 2023-01-24 03:09:09.552101: step: 182/466, loss: 0.026359152048826218 2023-01-24 03:09:10.158902: step: 184/466, loss: 0.032875049859285355 2023-01-24 03:09:10.785429: step: 186/466, loss: 0.035999175161123276 2023-01-24 03:09:11.430073: step: 188/466, loss: 0.09714668989181519 2023-01-24 03:09:12.079117: step: 190/466, loss: 0.004884875845164061 2023-01-24 03:09:12.634390: step: 192/466, loss: 0.031640052795410156 2023-01-24 03:09:13.320076: step: 194/466, loss: 0.012053578160703182 2023-01-24 03:09:14.019714: step: 
196/466, loss: 0.07030040770769119 2023-01-24 03:09:14.678325: step: 198/466, loss: 0.07704166322946548 2023-01-24 03:09:15.304205: step: 200/466, loss: 3.1605722904205322 2023-01-24 03:09:15.937640: step: 202/466, loss: 0.13462023437023163 2023-01-24 03:09:16.517442: step: 204/466, loss: 0.08356402069330215 2023-01-24 03:09:17.138348: step: 206/466, loss: 0.03021739050745964 2023-01-24 03:09:17.790518: step: 208/466, loss: 0.018619155511260033 2023-01-24 03:09:18.396037: step: 210/466, loss: 0.9524333477020264 2023-01-24 03:09:18.968304: step: 212/466, loss: 0.04453596845269203 2023-01-24 03:09:19.557078: step: 214/466, loss: 0.22760796546936035 2023-01-24 03:09:20.181326: step: 216/466, loss: 0.043370675295591354 2023-01-24 03:09:20.837809: step: 218/466, loss: 0.009173259139060974 2023-01-24 03:09:21.542874: step: 220/466, loss: 0.09860237687826157 2023-01-24 03:09:22.128295: step: 222/466, loss: 0.035534873604774475 2023-01-24 03:09:22.698688: step: 224/466, loss: 0.04748576879501343 2023-01-24 03:09:23.298701: step: 226/466, loss: 0.0008534886292181909 2023-01-24 03:09:23.927278: step: 228/466, loss: 0.2527320086956024 2023-01-24 03:09:24.501852: step: 230/466, loss: 0.09271737933158875 2023-01-24 03:09:25.097157: step: 232/466, loss: 0.00955366250127554 2023-01-24 03:09:25.789220: step: 234/466, loss: 0.006326592527329922 2023-01-24 03:09:26.411324: step: 236/466, loss: 0.03701920807361603 2023-01-24 03:09:27.107834: step: 238/466, loss: 0.017830880358815193 2023-01-24 03:09:27.710239: step: 240/466, loss: 0.6824085712432861 2023-01-24 03:09:28.309057: step: 242/466, loss: 0.10842016339302063 2023-01-24 03:09:28.948095: step: 244/466, loss: 0.012845429591834545 2023-01-24 03:09:29.569952: step: 246/466, loss: 0.014873155392706394 2023-01-24 03:09:30.206930: step: 248/466, loss: 0.015622666105628014 2023-01-24 03:09:30.780647: step: 250/466, loss: 0.06831051409244537 2023-01-24 03:09:31.434652: step: 252/466, loss: 0.0325254425406456 2023-01-24 
03:09:32.041289: step: 254/466, loss: 0.05407670885324478 2023-01-24 03:09:32.689331: step: 256/466, loss: 0.2906532883644104 2023-01-24 03:09:33.352129: step: 258/466, loss: 0.0017354099545627832 2023-01-24 03:09:33.940952: step: 260/466, loss: 0.09595493227243423 2023-01-24 03:09:34.581606: step: 262/466, loss: 0.215519517660141 2023-01-24 03:09:35.225158: step: 264/466, loss: 0.009827272966504097 2023-01-24 03:09:35.843104: step: 266/466, loss: 0.04185410216450691 2023-01-24 03:09:36.465707: step: 268/466, loss: 0.023078553378582 2023-01-24 03:09:37.101370: step: 270/466, loss: 0.07125691324472427 2023-01-24 03:09:37.712922: step: 272/466, loss: 0.06753233820199966 2023-01-24 03:09:38.333016: step: 274/466, loss: 0.013809312134981155 2023-01-24 03:09:38.970421: step: 276/466, loss: 0.009868262335658073 2023-01-24 03:09:39.654258: step: 278/466, loss: 0.0939086303114891 2023-01-24 03:09:40.251207: step: 280/466, loss: 0.07660525292158127 2023-01-24 03:09:40.899495: step: 282/466, loss: 0.023387039080262184 2023-01-24 03:09:41.570612: step: 284/466, loss: 0.11623013019561768 2023-01-24 03:09:42.209035: step: 286/466, loss: 0.08008614182472229 2023-01-24 03:09:42.870629: step: 288/466, loss: 0.04176018759608269 2023-01-24 03:09:43.558597: step: 290/466, loss: 0.6140563488006592 2023-01-24 03:09:44.189084: step: 292/466, loss: 0.06483432650566101 2023-01-24 03:09:44.743041: step: 294/466, loss: 0.28950560092926025 2023-01-24 03:09:45.402844: step: 296/466, loss: 0.07724431157112122 2023-01-24 03:09:46.007689: step: 298/466, loss: 0.05625889077782631 2023-01-24 03:09:46.568281: step: 300/466, loss: 0.1193554475903511 2023-01-24 03:09:47.208580: step: 302/466, loss: 0.0603012852370739 2023-01-24 03:09:47.875464: step: 304/466, loss: 0.07429218292236328 2023-01-24 03:09:48.516906: step: 306/466, loss: 0.055066242814064026 2023-01-24 03:09:49.143615: step: 308/466, loss: 0.17455391585826874 2023-01-24 03:09:49.750416: step: 310/466, loss: 0.02669745311141014 2023-01-24 
03:09:50.378948: step: 312/466, loss: 0.010350657626986504 2023-01-24 03:09:50.994345: step: 314/466, loss: 0.05983854457736015 2023-01-24 03:09:51.646375: step: 316/466, loss: 0.07685940712690353 2023-01-24 03:09:52.305387: step: 318/466, loss: 0.025662999600172043 2023-01-24 03:09:53.004693: step: 320/466, loss: 0.09239023178815842 2023-01-24 03:09:53.610522: step: 322/466, loss: 0.0246170312166214 2023-01-24 03:09:54.368224: step: 324/466, loss: 0.07709164917469025 2023-01-24 03:09:55.069094: step: 326/466, loss: 0.034793369472026825 2023-01-24 03:09:55.687005: step: 328/466, loss: 0.0283452607691288 2023-01-24 03:09:56.364904: step: 330/466, loss: 0.02577410452067852 2023-01-24 03:09:56.989048: step: 332/466, loss: 0.034362152218818665 2023-01-24 03:09:57.586269: step: 334/466, loss: 0.08701495826244354 2023-01-24 03:09:58.213761: step: 336/466, loss: 0.07708227634429932 2023-01-24 03:09:58.852820: step: 338/466, loss: 0.04221782088279724 2023-01-24 03:09:59.517165: step: 340/466, loss: 0.08996445685625076 2023-01-24 03:10:00.107946: step: 342/466, loss: 0.02697903849184513 2023-01-24 03:10:00.685034: step: 344/466, loss: 0.062445010989904404 2023-01-24 03:10:01.280675: step: 346/466, loss: 0.05373561754822731 2023-01-24 03:10:01.929738: step: 348/466, loss: 0.05835549533367157 2023-01-24 03:10:02.618650: step: 350/466, loss: 0.6431697607040405 2023-01-24 03:10:03.252834: step: 352/466, loss: 0.30082228779792786 2023-01-24 03:10:03.884429: step: 354/466, loss: 0.11711085587739944 2023-01-24 03:10:04.515576: step: 356/466, loss: 0.012538755312561989 2023-01-24 03:10:05.125179: step: 358/466, loss: 0.06648360192775726 2023-01-24 03:10:05.770778: step: 360/466, loss: 0.022442931309342384 2023-01-24 03:10:06.429564: step: 362/466, loss: 0.09700530767440796 2023-01-24 03:10:07.053226: step: 364/466, loss: 0.08707442879676819 2023-01-24 03:10:07.733205: step: 366/466, loss: 0.06549425423145294 2023-01-24 03:10:08.339351: step: 368/466, loss: 0.00937668140977621 
2023-01-24 03:10:09.001676: step: 370/466, loss: 0.02462584339082241 2023-01-24 03:10:09.630175: step: 372/466, loss: 0.012783851474523544 2023-01-24 03:10:10.245837: step: 374/466, loss: 0.08071441948413849 2023-01-24 03:10:10.887002: step: 376/466, loss: 0.12172964215278625 2023-01-24 03:10:11.568779: step: 378/466, loss: 0.023829028010368347 2023-01-24 03:10:12.143269: step: 380/466, loss: 0.16985544562339783 2023-01-24 03:10:12.730390: step: 382/466, loss: 0.03699149936437607 2023-01-24 03:10:13.392202: step: 384/466, loss: 0.011300604790449142 2023-01-24 03:10:14.015829: step: 386/466, loss: 0.0649348646402359 2023-01-24 03:10:14.637841: step: 388/466, loss: 0.031760040670633316 2023-01-24 03:10:15.258801: step: 390/466, loss: 0.044917892664670944 2023-01-24 03:10:15.842495: step: 392/466, loss: 0.2522069811820984 2023-01-24 03:10:16.379728: step: 394/466, loss: 0.00824504904448986 2023-01-24 03:10:16.984383: step: 396/466, loss: 0.037625525146722794 2023-01-24 03:10:17.610467: step: 398/466, loss: 0.11952449381351471 2023-01-24 03:10:18.194047: step: 400/466, loss: 0.5322363376617432 2023-01-24 03:10:18.829337: step: 402/466, loss: 0.026489701122045517 2023-01-24 03:10:19.444540: step: 404/466, loss: 0.053119905292987823 2023-01-24 03:10:20.080376: step: 406/466, loss: 0.04038258269429207 2023-01-24 03:10:20.812500: step: 408/466, loss: 0.1848820447921753 2023-01-24 03:10:21.405298: step: 410/466, loss: 0.025337228551506996 2023-01-24 03:10:22.093956: step: 412/466, loss: 0.02773207239806652 2023-01-24 03:10:22.717197: step: 414/466, loss: 0.03828001767396927 2023-01-24 03:10:23.362060: step: 416/466, loss: 0.35797908902168274 2023-01-24 03:10:23.975248: step: 418/466, loss: 0.03279887139797211 2023-01-24 03:10:24.561678: step: 420/466, loss: 0.00925395917147398 2023-01-24 03:10:25.110696: step: 422/466, loss: 0.21304737031459808 2023-01-24 03:10:25.745292: step: 424/466, loss: 0.028571486473083496 2023-01-24 03:10:26.395276: step: 426/466, loss: 
0.2181033194065094 2023-01-24 03:10:27.053121: step: 428/466, loss: 0.07693160325288773 2023-01-24 03:10:27.734897: step: 430/466, loss: 0.14517450332641602 2023-01-24 03:10:28.370264: step: 432/466, loss: 0.016695642843842506 2023-01-24 03:10:28.986357: step: 434/466, loss: 0.1899147927761078 2023-01-24 03:10:29.590723: step: 436/466, loss: 0.009916950948536396 2023-01-24 03:10:30.189687: step: 438/466, loss: 0.00741297984495759 2023-01-24 03:10:30.832856: step: 440/466, loss: 0.0704260990023613 2023-01-24 03:10:31.512082: step: 442/466, loss: 0.011597951874136925 2023-01-24 03:10:32.145086: step: 444/466, loss: 0.013319095596671104 2023-01-24 03:10:32.769773: step: 446/466, loss: 0.054979003965854645 2023-01-24 03:10:33.390147: step: 448/466, loss: 0.023092715069651604 2023-01-24 03:10:34.023020: step: 450/466, loss: 0.045721933245658875 2023-01-24 03:10:34.659874: step: 452/466, loss: 0.004618957173079252 2023-01-24 03:10:35.292851: step: 454/466, loss: 0.0299563929438591 2023-01-24 03:10:35.927852: step: 456/466, loss: 0.0010994257172569633 2023-01-24 03:10:36.515983: step: 458/466, loss: 0.016170697286725044 2023-01-24 03:10:37.152101: step: 460/466, loss: 0.04268626496195793 2023-01-24 03:10:37.862722: step: 462/466, loss: 0.05014938861131668 2023-01-24 03:10:38.520455: step: 464/466, loss: 0.04969983547925949 2023-01-24 03:10:39.360633: step: 466/466, loss: 0.04016483575105667 2023-01-24 03:10:39.927705: step: 468/466, loss: 0.014492525719106197 2023-01-24 03:10:40.547590: step: 470/466, loss: 0.8588945865631104 2023-01-24 03:10:41.233366: step: 472/466, loss: 0.049487411975860596 2023-01-24 03:10:41.861105: step: 474/466, loss: 0.06922387331724167 2023-01-24 03:10:42.540455: step: 476/466, loss: 0.040300093591213226 2023-01-24 03:10:43.128717: step: 478/466, loss: 0.17167863249778748 2023-01-24 03:10:43.706552: step: 480/466, loss: 0.03799637407064438 2023-01-24 03:10:44.357154: step: 482/466, loss: 0.08162984997034073 2023-01-24 03:10:44.979041: step: 
484/466, loss: 0.04913508519530296 2023-01-24 03:10:45.604217: step: 486/466, loss: 0.06300981342792511 2023-01-24 03:10:46.222640: step: 488/466, loss: 0.09749052673578262 2023-01-24 03:10:46.847104: step: 490/466, loss: 0.06170743703842163 2023-01-24 03:10:47.511941: step: 492/466, loss: 0.23926964402198792 2023-01-24 03:10:48.117065: step: 494/466, loss: 0.04832770302891731 2023-01-24 03:10:48.737034: step: 496/466, loss: 0.019094914197921753 2023-01-24 03:10:49.350150: step: 498/466, loss: 0.06809645146131516 2023-01-24 03:10:49.955569: step: 500/466, loss: 0.07340987026691437 2023-01-24 03:10:50.567527: step: 502/466, loss: 0.08297903835773468 2023-01-24 03:10:51.210795: step: 504/466, loss: 0.04602425917983055 2023-01-24 03:10:51.771582: step: 506/466, loss: 0.047050878405570984 2023-01-24 03:10:52.371204: step: 508/466, loss: 0.056306980550289154 2023-01-24 03:10:52.993632: step: 510/466, loss: 0.02863304875791073 2023-01-24 03:10:53.619252: step: 512/466, loss: 0.023716315627098083 2023-01-24 03:10:54.251468: step: 514/466, loss: 0.059006400406360626 2023-01-24 03:10:54.922589: step: 516/466, loss: 0.1620517373085022 2023-01-24 03:10:55.476947: step: 518/466, loss: 0.0029416207689791918 2023-01-24 03:10:56.048797: step: 520/466, loss: 0.07356319576501846 2023-01-24 03:10:56.740675: step: 522/466, loss: 0.018130777403712273 2023-01-24 03:10:57.337944: step: 524/466, loss: 0.013137644156813622 2023-01-24 03:10:58.016231: step: 526/466, loss: 0.05470186471939087 2023-01-24 03:10:58.594839: step: 528/466, loss: 0.0040150294080376625 2023-01-24 03:10:59.261453: step: 530/466, loss: 0.016491297632455826 2023-01-24 03:10:59.952589: step: 532/466, loss: 0.06494159996509552 2023-01-24 03:11:00.595099: step: 534/466, loss: 0.06707453727722168 2023-01-24 03:11:01.259668: step: 536/466, loss: 0.015546813607215881 2023-01-24 03:11:01.886512: step: 538/466, loss: 0.024059467017650604 2023-01-24 03:11:02.504223: step: 540/466, loss: 0.054490331560373306 2023-01-24 
03:11:03.108814: step: 542/466, loss: 0.0058572967536747456 2023-01-24 03:11:03.747499: step: 544/466, loss: 0.3280733823776245 2023-01-24 03:11:04.314006: step: 546/466, loss: 0.019469719380140305 2023-01-24 03:11:04.961825: step: 548/466, loss: 0.05076098069548607 2023-01-24 03:11:05.649464: step: 550/466, loss: 0.07283362746238708 2023-01-24 03:11:06.227566: step: 552/466, loss: 0.26903781294822693 2023-01-24 03:11:06.803458: step: 554/466, loss: 0.06180036813020706 2023-01-24 03:11:07.385332: step: 556/466, loss: 0.00028102347278036177 2023-01-24 03:11:08.034278: step: 558/466, loss: 0.03176679462194443 2023-01-24 03:11:08.636830: step: 560/466, loss: 0.04618803411722183 2023-01-24 03:11:09.235555: step: 562/466, loss: 0.005040397401899099 2023-01-24 03:11:09.960878: step: 564/466, loss: 0.022054478526115417 2023-01-24 03:11:10.571088: step: 566/466, loss: 0.006742571480572224 2023-01-24 03:11:11.219807: step: 568/466, loss: 0.09823835641145706 2023-01-24 03:11:11.855354: step: 570/466, loss: 0.01592562161386013 2023-01-24 03:11:12.487499: step: 572/466, loss: 0.060743462294340134 2023-01-24 03:11:13.170003: step: 574/466, loss: 0.06120866909623146 2023-01-24 03:11:13.857213: step: 576/466, loss: 0.018308117985725403 2023-01-24 03:11:14.510489: step: 578/466, loss: 0.0007237203535623848 2023-01-24 03:11:15.157207: step: 580/466, loss: 0.02220681495964527 2023-01-24 03:11:15.736982: step: 582/466, loss: 0.016686266288161278 2023-01-24 03:11:16.333341: step: 584/466, loss: 0.024640217423439026 2023-01-24 03:11:16.898700: step: 586/466, loss: 0.07240679115056992 2023-01-24 03:11:17.542672: step: 588/466, loss: 0.03808869048953056 2023-01-24 03:11:18.132015: step: 590/466, loss: 0.03586426377296448 2023-01-24 03:11:18.736488: step: 592/466, loss: 0.024462319910526276 2023-01-24 03:11:19.350304: step: 594/466, loss: 0.05637772008776665 2023-01-24 03:11:19.974477: step: 596/466, loss: 0.04977262765169144 2023-01-24 03:11:20.567437: step: 598/466, loss: 
0.015603912062942982 2023-01-24 03:11:21.238207: step: 600/466, loss: 0.0405561625957489 2023-01-24 03:11:21.861793: step: 602/466, loss: 0.09790562093257904 2023-01-24 03:11:22.497969: step: 604/466, loss: 0.033606477081775665 2023-01-24 03:11:23.179756: step: 606/466, loss: 0.008672555908560753 2023-01-24 03:11:23.747685: step: 608/466, loss: 0.12055826932191849 2023-01-24 03:11:24.396460: step: 610/466, loss: 0.03952990099787712 2023-01-24 03:11:25.022075: step: 612/466, loss: 0.056830912828445435 2023-01-24 03:11:25.710111: step: 614/466, loss: 0.0020802398212254047 2023-01-24 03:11:26.331111: step: 616/466, loss: 0.054080478847026825 2023-01-24 03:11:26.965359: step: 618/466, loss: 0.7020652890205383 2023-01-24 03:11:27.561428: step: 620/466, loss: 0.07433077692985535 2023-01-24 03:11:28.211881: step: 622/466, loss: 0.015092556364834309 2023-01-24 03:11:28.897495: step: 624/466, loss: 0.08288074284791946 2023-01-24 03:11:29.484083: step: 626/466, loss: 0.02156096138060093 2023-01-24 03:11:30.099324: step: 628/466, loss: 0.01666255295276642 2023-01-24 03:11:30.718138: step: 630/466, loss: 0.08199740201234818 2023-01-24 03:11:31.351706: step: 632/466, loss: 0.01296154409646988 2023-01-24 03:11:31.948426: step: 634/466, loss: 0.05086514353752136 2023-01-24 03:11:32.653624: step: 636/466, loss: 0.06781143695116043 2023-01-24 03:11:33.316640: step: 638/466, loss: 0.16483181715011597 2023-01-24 03:11:33.941884: step: 640/466, loss: 0.025504454970359802 2023-01-24 03:11:34.554413: step: 642/466, loss: 0.4947722852230072 2023-01-24 03:11:35.231854: step: 644/466, loss: 0.03764424845576286 2023-01-24 03:11:35.847688: step: 646/466, loss: 0.009929514490067959 2023-01-24 03:11:36.456301: step: 648/466, loss: 0.06044996529817581 2023-01-24 03:11:37.085444: step: 650/466, loss: 0.1053556576371193 2023-01-24 03:11:37.727007: step: 652/466, loss: 0.025075990706682205 2023-01-24 03:11:38.352624: step: 654/466, loss: 0.022899439558386803 2023-01-24 03:11:39.106084: step: 
656/466, loss: 0.32105860114097595 2023-01-24 03:11:39.724009: step: 658/466, loss: 0.021026339381933212 2023-01-24 03:11:40.322670: step: 660/466, loss: 0.08893311023712158 2023-01-24 03:11:40.981200: step: 662/466, loss: 0.004652900155633688 2023-01-24 03:11:41.562116: step: 664/466, loss: 0.006031044293195009 2023-01-24 03:11:42.179812: step: 666/466, loss: 0.07278808206319809 2023-01-24 03:11:42.766040: step: 668/466, loss: 0.0484389029443264 2023-01-24 03:11:43.379014: step: 670/466, loss: 0.19756385684013367 2023-01-24 03:11:44.017132: step: 672/466, loss: 0.12207819521427155 2023-01-24 03:11:44.678822: step: 674/466, loss: 0.051613833755254745 2023-01-24 03:11:45.318715: step: 676/466, loss: 0.08182787150144577 2023-01-24 03:11:45.941331: step: 678/466, loss: 0.055441003292798996 2023-01-24 03:11:46.589729: step: 680/466, loss: 0.07255645096302032 2023-01-24 03:11:47.266845: step: 682/466, loss: 0.1373283565044403 2023-01-24 03:11:47.969676: step: 684/466, loss: 0.08799053728580475 2023-01-24 03:11:48.644235: step: 686/466, loss: 0.09948298335075378 2023-01-24 03:11:49.289750: step: 688/466, loss: 0.3227572739124298 2023-01-24 03:11:49.956066: step: 690/466, loss: 0.13032247126102448 2023-01-24 03:11:50.584195: step: 692/466, loss: 0.05579046905040741 2023-01-24 03:11:51.251329: step: 694/466, loss: 0.08206340670585632 2023-01-24 03:11:51.828041: step: 696/466, loss: 0.08887199312448502 2023-01-24 03:11:52.452344: step: 698/466, loss: 0.056815605610609055 2023-01-24 03:11:53.081594: step: 700/466, loss: 0.1173422560095787 2023-01-24 03:11:53.748537: step: 702/466, loss: 0.025796858593821526 2023-01-24 03:11:54.392048: step: 704/466, loss: 0.1318012923002243 2023-01-24 03:11:55.040666: step: 706/466, loss: 0.06997198611497879 2023-01-24 03:11:55.635631: step: 708/466, loss: 0.0759538933634758 2023-01-24 03:11:56.283707: step: 710/466, loss: 0.12641946971416473 2023-01-24 03:11:56.929635: step: 712/466, loss: 0.07710997015237808 2023-01-24 03:11:57.471912: 
step: 714/466, loss: 0.05794893577694893 2023-01-24 03:11:58.047237: step: 716/466, loss: 0.037828873842954636 2023-01-24 03:11:58.657031: step: 718/466, loss: 0.06943363696336746 2023-01-24 03:11:59.360583: step: 720/466, loss: 0.07901345193386078 2023-01-24 03:11:59.887854: step: 722/466, loss: 0.015312760137021542 2023-01-24 03:12:00.464965: step: 724/466, loss: 0.053611285984516144 2023-01-24 03:12:01.091544: step: 726/466, loss: 0.016187705099582672 2023-01-24 03:12:01.712010: step: 728/466, loss: 0.08087359368801117 2023-01-24 03:12:02.376799: step: 730/466, loss: 0.07364023476839066 2023-01-24 03:12:03.018731: step: 732/466, loss: 1.0714809894561768 2023-01-24 03:12:03.584048: step: 734/466, loss: 0.0519305095076561 2023-01-24 03:12:04.244676: step: 736/466, loss: 0.12604323029518127 2023-01-24 03:12:04.848139: step: 738/466, loss: 0.0702681615948677 2023-01-24 03:12:05.462938: step: 740/466, loss: 0.058377061039209366 2023-01-24 03:12:06.067941: step: 742/466, loss: 0.08331472426652908 2023-01-24 03:12:06.700506: step: 744/466, loss: 0.1307016909122467 2023-01-24 03:12:07.314433: step: 746/466, loss: 0.057829976081848145 2023-01-24 03:12:07.948887: step: 748/466, loss: 0.0034869564697146416 2023-01-24 03:12:08.587690: step: 750/466, loss: 0.028278179466724396 2023-01-24 03:12:09.197460: step: 752/466, loss: 0.5153988599777222 2023-01-24 03:12:09.827434: step: 754/466, loss: 0.03906656429171562 2023-01-24 03:12:10.520597: step: 756/466, loss: 0.21079841256141663 2023-01-24 03:12:11.105836: step: 758/466, loss: 0.026310265064239502 2023-01-24 03:12:11.795388: step: 760/466, loss: 0.0046156407333910465 2023-01-24 03:12:12.444028: step: 762/466, loss: 0.06732354313135147 2023-01-24 03:12:13.058096: step: 764/466, loss: 0.021007949486374855 2023-01-24 03:12:13.657056: step: 766/466, loss: 0.035399969667196274 2023-01-24 03:12:14.292825: step: 768/466, loss: 0.05831155180931091 2023-01-24 03:12:14.886290: step: 770/466, loss: 0.1032584011554718 2023-01-24 
03:12:15.474182: step: 772/466, loss: 0.020311955362558365 2023-01-24 03:12:16.083527: step: 774/466, loss: 0.28223592042922974 2023-01-24 03:12:16.723447: step: 776/466, loss: 0.04397254437208176 2023-01-24 03:12:17.362632: step: 778/466, loss: 0.06936588138341904 2023-01-24 03:12:17.989930: step: 780/466, loss: 0.2023026943206787 2023-01-24 03:12:18.574706: step: 782/466, loss: 0.013379652053117752 2023-01-24 03:12:19.166752: step: 784/466, loss: 0.07581201940774918 2023-01-24 03:12:19.804706: step: 786/466, loss: 0.1080794408917427 2023-01-24 03:12:20.518863: step: 788/466, loss: 0.010013054125010967 2023-01-24 03:12:21.136919: step: 790/466, loss: 0.1666671633720398 2023-01-24 03:12:21.781469: step: 792/466, loss: 0.043947044759988785 2023-01-24 03:12:22.346333: step: 794/466, loss: 0.03906881436705589 2023-01-24 03:12:22.935751: step: 796/466, loss: 0.03650645911693573 2023-01-24 03:12:23.581403: step: 798/466, loss: 0.01717812567949295 2023-01-24 03:12:24.203308: step: 800/466, loss: 0.05117480456829071 2023-01-24 03:12:24.885404: step: 802/466, loss: 0.0996786504983902 2023-01-24 03:12:25.499699: step: 804/466, loss: 0.015471360646188259 2023-01-24 03:12:26.069402: step: 806/466, loss: 0.033040329813957214 2023-01-24 03:12:26.703892: step: 808/466, loss: 0.0883571058511734 2023-01-24 03:12:27.336824: step: 810/466, loss: 0.07848524302244186 2023-01-24 03:12:27.966022: step: 812/466, loss: 0.029181169345974922 2023-01-24 03:12:28.739479: step: 814/466, loss: 0.031819898635149 2023-01-24 03:12:29.298069: step: 816/466, loss: 0.02793828584253788 2023-01-24 03:12:29.922038: step: 818/466, loss: 0.08263324946165085 2023-01-24 03:12:30.515750: step: 820/466, loss: 0.029203617945313454 2023-01-24 03:12:31.105855: step: 822/466, loss: 0.0037075963336974382 2023-01-24 03:12:31.704667: step: 824/466, loss: 0.1230461373925209 2023-01-24 03:12:32.291183: step: 826/466, loss: 0.05489637702703476 2023-01-24 03:12:32.957557: step: 828/466, loss: 0.041764117777347565 
2023-01-24 03:12:33.576711: step: 830/466, loss: 0.035227157175540924 2023-01-24 03:12:34.223827: step: 832/466, loss: 0.178544819355011 2023-01-24 03:12:34.800034: step: 834/466, loss: 0.07546160370111465 2023-01-24 03:12:35.433890: step: 836/466, loss: 0.036592237651348114 2023-01-24 03:12:36.001293: step: 838/466, loss: 0.0232541523873806 2023-01-24 03:12:36.664944: step: 840/466, loss: 0.13159701228141785 2023-01-24 03:12:37.280373: step: 842/466, loss: 0.04711227864027023 2023-01-24 03:12:37.841078: step: 844/466, loss: 0.014443104155361652 2023-01-24 03:12:38.435286: step: 846/466, loss: 0.03396781533956528 2023-01-24 03:12:39.213633: step: 848/466, loss: 0.0871630385518074 2023-01-24 03:12:39.855254: step: 850/466, loss: 0.2766623795032501 2023-01-24 03:12:40.446569: step: 852/466, loss: 0.03323986008763313 2023-01-24 03:12:41.060060: step: 854/466, loss: 0.02848413586616516 2023-01-24 03:12:41.730995: step: 856/466, loss: 0.06942589581012726 2023-01-24 03:12:42.348196: step: 858/466, loss: 0.02229192480444908 2023-01-24 03:12:43.038130: step: 860/466, loss: 0.04600051790475845 2023-01-24 03:12:43.691474: step: 862/466, loss: 0.02044217474758625 2023-01-24 03:12:44.293142: step: 864/466, loss: 0.0533686988055706 2023-01-24 03:12:44.934886: step: 866/466, loss: 0.050430938601493835 2023-01-24 03:12:45.554578: step: 868/466, loss: 0.05311398208141327 2023-01-24 03:12:46.155021: step: 870/466, loss: 0.08430203050374985 2023-01-24 03:12:46.765750: step: 872/466, loss: 0.0548601895570755 2023-01-24 03:12:47.418038: step: 874/466, loss: 0.0949767678976059 2023-01-24 03:12:48.081233: step: 876/466, loss: 0.1755812168121338 2023-01-24 03:12:48.770698: step: 878/466, loss: 0.04888708516955376 2023-01-24 03:12:49.362335: step: 880/466, loss: 0.012096043676137924 2023-01-24 03:12:49.966595: step: 882/466, loss: 0.10377335548400879 2023-01-24 03:12:50.598390: step: 884/466, loss: 0.04046103358268738 2023-01-24 03:12:51.226455: step: 886/466, loss: 0.023146986961364746 
2023-01-24 03:12:51.803956: step: 888/466, loss: 0.038587309420108795 2023-01-24 03:12:52.446985: step: 890/466, loss: 0.03209630399942398 2023-01-24 03:12:53.159801: step: 892/466, loss: 0.05548754334449768 2023-01-24 03:12:53.863623: step: 894/466, loss: 0.033855512738227844 2023-01-24 03:12:54.524566: step: 896/466, loss: 0.015545746311545372 2023-01-24 03:12:55.083999: step: 898/466, loss: 0.02351602539420128 2023-01-24 03:12:55.668692: step: 900/466, loss: 0.03280020132660866 2023-01-24 03:12:56.268492: step: 902/466, loss: 0.003082484472543001 2023-01-24 03:12:56.884529: step: 904/466, loss: 0.023638760671019554 2023-01-24 03:12:57.483303: step: 906/466, loss: 0.06338606029748917 2023-01-24 03:12:58.168330: step: 908/466, loss: 0.04905043542385101 2023-01-24 03:12:58.770138: step: 910/466, loss: 0.016157738864421844 2023-01-24 03:12:59.421144: step: 912/466, loss: 0.06345370411872864 2023-01-24 03:13:00.137043: step: 914/466, loss: 0.04141675680875778 2023-01-24 03:13:00.787540: step: 916/466, loss: 0.08778607845306396 2023-01-24 03:13:01.358559: step: 918/466, loss: 0.05597427487373352 2023-01-24 03:13:01.928578: step: 920/466, loss: 0.02496947906911373 2023-01-24 03:13:02.589633: step: 922/466, loss: 0.06652742624282837 2023-01-24 03:13:03.199192: step: 924/466, loss: 0.026141630485653877 2023-01-24 03:13:03.979065: step: 926/466, loss: 0.0483468696475029 2023-01-24 03:13:04.601482: step: 928/466, loss: 0.04227868467569351 2023-01-24 03:13:05.232044: step: 930/466, loss: 0.040610309690237045 2023-01-24 03:13:05.924015: step: 932/466, loss: 0.014453450217843056 ================================================== Loss: 0.087 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34465461707097556, 'r': 0.33288273261693846, 'f1': 0.33866640943846826}, 'combined': 0.24954367011255554, 'epoch': 22} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 
0.6632124352331605}, 'slot': {'p': 0.3508830906160646, 'r': 0.30778153450232654, 'f1': 0.3279220773130778}, 'combined': 0.21748199946152308, 'epoch': 22} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32959686774942, 'r': 0.269045928030303, 'f1': 0.29625912408759125}, 'combined': 0.19750608272506082, 'epoch': 22} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3640709243766355, 'r': 0.2946639890535334, 'f1': 0.32571096108024666}, 'combined': 0.21256925881026623, 'epoch': 22} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3297188799762329, 'r': 0.3190827870737738, 'f1': 0.3243136524356389}, 'combined': 0.23896795442626023, 'epoch': 22} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3383797257632996, 'r': 0.29915577831322854, 'f1': 0.31756113841147127}, 'combined': 0.2106104959412866, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33854166666666663, 'r': 0.3095238095238095, 'f1': 0.3233830845771144}, 'combined': 0.21558872305140958, 'epoch': 22} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5535714285714286, 'r': 0.33695652173913043, 'f1': 0.41891891891891897}, 'combined': 0.2792792792792793, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.45454545454545453, 'r': 0.1724137931034483, 'f1': 0.25000000000000006}, 'combined': 0.16666666666666669, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for 
Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 23 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 
--role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 03:15:37.387886: step: 2/466, loss: 0.014219677075743675 2023-01-24 03:15:37.966508: step: 4/466, loss: 0.02799920178949833 2023-01-24 03:15:38.610881: step: 6/466, loss: 0.09419617056846619 2023-01-24 03:15:39.235398: step: 8/466, loss: 0.31039220094680786 2023-01-24 03:15:39.920671: step: 10/466, loss: 0.2978193163871765 2023-01-24 03:15:40.549302: step: 12/466, loss: 0.007043609861284494 2023-01-24 03:15:41.148882: step: 14/466, loss: 0.027942348271608353 2023-01-24 03:15:41.767330: step: 16/466, loss: 0.11210431903600693 2023-01-24 03:15:42.557230: step: 18/466, loss: 0.10927058011293411 2023-01-24 03:15:43.242001: step: 20/466, loss: 0.021317631006240845 2023-01-24 03:15:43.946403: step: 22/466, loss: 0.038125600665807724 2023-01-24 03:15:44.583739: step: 24/466, loss: 0.0418558269739151 2023-01-24 03:15:45.208498: step: 26/466, loss: 0.018245242536067963 2023-01-24 03:15:45.809077: step: 28/466, loss: 0.047045011073350906 2023-01-24 03:15:46.451482: step: 30/466, loss: 0.03388955444097519 2023-01-24 03:15:47.041244: step: 32/466, loss: 0.03465335816144943 2023-01-24 03:15:47.575272: step: 34/466, loss: 0.017350606620311737 2023-01-24 03:15:48.241555: step: 36/466, loss: 0.014976875856518745 2023-01-24 03:15:48.885124: step: 38/466, loss: 0.033230796456336975 2023-01-24 03:15:49.519547: step: 40/466, loss: 0.020534897223114967 2023-01-24 03:15:50.125278: step: 42/466, loss: 0.021572591736912727 2023-01-24 03:15:50.770175: step: 44/466, loss: 0.06211506202816963 2023-01-24 03:15:51.389604: step: 46/466, loss: 0.01666572503745556 2023-01-24 03:15:52.059799: step: 48/466, loss: 0.027486102655529976 2023-01-24 03:15:52.681055: step: 50/466, loss: 0.10002031922340393 2023-01-24 03:15:53.310951: step: 52/466, loss: 0.019476555287837982 2023-01-24 03:15:54.036037: step: 54/466, loss: 0.00325954332947731 2023-01-24 03:15:54.656318: step: 56/466, loss: 0.030487297102808952 2023-01-24 
03:15:55.312253: step: 58/466, loss: 0.0035734514240175486 2023-01-24 03:15:55.984275: step: 60/466, loss: 0.029996968805789948 2023-01-24 03:15:56.551274: step: 62/466, loss: 0.0012635894818231463 2023-01-24 03:15:57.149906: step: 64/466, loss: 0.11661484092473984 2023-01-24 03:15:57.790150: step: 66/466, loss: 0.20790398120880127 2023-01-24 03:15:58.444543: step: 68/466, loss: 0.045636314898729324 2023-01-24 03:15:59.130164: step: 70/466, loss: 0.0090608149766922 2023-01-24 03:15:59.797059: step: 72/466, loss: 0.03447333350777626 2023-01-24 03:16:00.415399: step: 74/466, loss: 0.04142190143465996 2023-01-24 03:16:01.031420: step: 76/466, loss: 0.08950883895158768 2023-01-24 03:16:01.621677: step: 78/466, loss: 0.03892095386981964 2023-01-24 03:16:02.294664: step: 80/466, loss: 0.03963964059948921 2023-01-24 03:16:02.984046: step: 82/466, loss: 0.08380357921123505 2023-01-24 03:16:03.579026: step: 84/466, loss: 0.04118441417813301 2023-01-24 03:16:04.207792: step: 86/466, loss: 0.024973010644316673 2023-01-24 03:16:04.762517: step: 88/466, loss: 0.0016479233745485544 2023-01-24 03:16:05.361432: step: 90/466, loss: 0.006236465647816658 2023-01-24 03:16:06.017456: step: 92/466, loss: 0.08336075395345688 2023-01-24 03:16:06.651384: step: 94/466, loss: 0.015114004723727703 2023-01-24 03:16:07.309124: step: 96/466, loss: 0.010505511425435543 2023-01-24 03:16:07.975070: step: 98/466, loss: 0.2507909834384918 2023-01-24 03:16:08.610959: step: 100/466, loss: 0.260568231344223 2023-01-24 03:16:09.271333: step: 102/466, loss: 0.044688452035188675 2023-01-24 03:16:09.917601: step: 104/466, loss: 0.004466955550014973 2023-01-24 03:16:10.551078: step: 106/466, loss: 0.06955769658088684 2023-01-24 03:16:11.225638: step: 108/466, loss: 0.06379435211420059 2023-01-24 03:16:11.855737: step: 110/466, loss: 0.015097280032932758 2023-01-24 03:16:12.415744: step: 112/466, loss: 0.08070417493581772 2023-01-24 03:16:13.013473: step: 114/466, loss: 0.061258550733327866 2023-01-24 
03:16:13.659217: step: 116/466, loss: 0.008601571433246136 2023-01-24 03:16:14.285554: step: 118/466, loss: 0.021000182256102562 2023-01-24 03:16:14.951360: step: 120/466, loss: 0.010197958908975124 2023-01-24 03:16:15.578741: step: 122/466, loss: 0.011712507344782352 2023-01-24 03:16:16.191731: step: 124/466, loss: 0.02409764751791954 2023-01-24 03:16:16.766313: step: 126/466, loss: 0.10284940153360367 2023-01-24 03:16:17.351347: step: 128/466, loss: 0.013877983205020428 2023-01-24 03:16:18.003211: step: 130/466, loss: 0.06758103519678116 2023-01-24 03:16:18.662884: step: 132/466, loss: 0.04407202824950218 2023-01-24 03:16:19.299884: step: 134/466, loss: 0.1829264909029007 2023-01-24 03:16:19.901364: step: 136/466, loss: 0.034951724112033844 2023-01-24 03:16:20.485596: step: 138/466, loss: 0.08043573796749115 2023-01-24 03:16:21.099272: step: 140/466, loss: 0.0018060511210933328 2023-01-24 03:16:21.765527: step: 142/466, loss: 0.08757258951663971 2023-01-24 03:16:22.372853: step: 144/466, loss: 0.014110724441707134 2023-01-24 03:16:22.954641: step: 146/466, loss: 0.0018658102490007877 2023-01-24 03:16:23.581914: step: 148/466, loss: 0.04405156150460243 2023-01-24 03:16:24.167226: step: 150/466, loss: 0.006086637265980244 2023-01-24 03:16:24.848784: step: 152/466, loss: 0.13918232917785645 2023-01-24 03:16:25.426496: step: 154/466, loss: 0.02179715596139431 2023-01-24 03:16:26.025541: step: 156/466, loss: 0.1636473536491394 2023-01-24 03:16:26.798710: step: 158/466, loss: 0.04377252236008644 2023-01-24 03:16:27.407701: step: 160/466, loss: 0.030937492847442627 2023-01-24 03:16:28.025776: step: 162/466, loss: 0.06732071191072464 2023-01-24 03:16:28.659567: step: 164/466, loss: 0.1363304704427719 2023-01-24 03:16:29.318052: step: 166/466, loss: 0.08036568760871887 2023-01-24 03:16:29.902513: step: 168/466, loss: 0.08385451883077621 2023-01-24 03:16:30.540738: step: 170/466, loss: 0.012089313007891178 2023-01-24 03:16:31.150977: step: 172/466, loss: 
0.048308826982975006 2023-01-24 03:16:31.880289: step: 174/466, loss: 0.03280562534928322 2023-01-24 03:16:32.541961: step: 176/466, loss: 0.023207364603877068 2023-01-24 03:16:33.118342: step: 178/466, loss: 0.0603955052793026 2023-01-24 03:16:33.724898: step: 180/466, loss: 1.5096756219863892 2023-01-24 03:16:34.402457: step: 182/466, loss: 0.02950737066566944 2023-01-24 03:16:34.956626: step: 184/466, loss: 0.026478124782443047 2023-01-24 03:16:35.539981: step: 186/466, loss: 0.016791829839348793 2023-01-24 03:16:36.215492: step: 188/466, loss: 0.01706968992948532 2023-01-24 03:16:36.876300: step: 190/466, loss: 0.014634549617767334 2023-01-24 03:16:37.514480: step: 192/466, loss: 0.014598404057323933 2023-01-24 03:16:38.155538: step: 194/466, loss: 0.016471099108457565 2023-01-24 03:16:38.799812: step: 196/466, loss: 0.011771813035011292 2023-01-24 03:16:39.481669: step: 198/466, loss: 0.03157324343919754 2023-01-24 03:16:40.074173: step: 200/466, loss: 0.0009993526618927717 2023-01-24 03:16:40.717419: step: 202/466, loss: 0.038105957210063934 2023-01-24 03:16:41.397542: step: 204/466, loss: 0.04690699279308319 2023-01-24 03:16:41.952768: step: 206/466, loss: 0.02774890325963497 2023-01-24 03:16:42.559537: step: 208/466, loss: 0.02330356277525425 2023-01-24 03:16:43.258081: step: 210/466, loss: 0.04613875970244408 2023-01-24 03:16:43.928917: step: 212/466, loss: 0.006724870763719082 2023-01-24 03:16:44.590506: step: 214/466, loss: 0.035854555666446686 2023-01-24 03:16:45.211077: step: 216/466, loss: 0.018583467230200768 2023-01-24 03:16:45.808831: step: 218/466, loss: 0.03919846564531326 2023-01-24 03:16:46.410578: step: 220/466, loss: 0.1321801096200943 2023-01-24 03:16:46.955062: step: 222/466, loss: 0.03498874232172966 2023-01-24 03:16:47.558819: step: 224/466, loss: 0.052615873515605927 2023-01-24 03:16:48.163211: step: 226/466, loss: 0.026402872055768967 2023-01-24 03:16:48.792380: step: 228/466, loss: 0.026835087686777115 2023-01-24 03:16:49.449838: step: 
230/466, loss: 0.03692265599966049 2023-01-24 03:16:50.096500: step: 232/466, loss: 0.007541773375123739 2023-01-24 03:16:50.715690: step: 234/466, loss: 0.009179790504276752 2023-01-24 03:16:51.358844: step: 236/466, loss: 0.0473739430308342 2023-01-24 03:16:51.938597: step: 238/466, loss: 0.0576513297855854 2023-01-24 03:16:52.644004: step: 240/466, loss: 0.11710616946220398 2023-01-24 03:16:53.216104: step: 242/466, loss: 0.02180691994726658 2023-01-24 03:16:53.862145: step: 244/466, loss: 0.03474956750869751 2023-01-24 03:16:54.488118: step: 246/466, loss: 0.268618106842041 2023-01-24 03:16:55.082364: step: 248/466, loss: 0.04940341040492058 2023-01-24 03:16:55.730126: step: 250/466, loss: 0.019864708185195923 2023-01-24 03:16:56.330272: step: 252/466, loss: 0.4139997959136963 2023-01-24 03:16:56.965717: step: 254/466, loss: 0.22472988069057465 2023-01-24 03:16:57.611665: step: 256/466, loss: 0.08591841161251068 2023-01-24 03:16:58.317107: step: 258/466, loss: 0.04147052392363548 2023-01-24 03:16:58.973430: step: 260/466, loss: 0.041704703122377396 2023-01-24 03:16:59.559524: step: 262/466, loss: 0.07820732891559601 2023-01-24 03:17:00.158379: step: 264/466, loss: 0.18900932371616364 2023-01-24 03:17:00.784461: step: 266/466, loss: 0.006983160972595215 2023-01-24 03:17:01.377988: step: 268/466, loss: 0.007102139759808779 2023-01-24 03:17:02.008709: step: 270/466, loss: 0.03518517315387726 2023-01-24 03:17:02.568581: step: 272/466, loss: 0.0104660140350461 2023-01-24 03:17:03.188200: step: 274/466, loss: 0.08754755556583405 2023-01-24 03:17:03.854139: step: 276/466, loss: 0.03776015713810921 2023-01-24 03:17:04.409845: step: 278/466, loss: 0.06886523216962814 2023-01-24 03:17:05.013325: step: 280/466, loss: 0.054589878767728806 2023-01-24 03:17:05.642488: step: 282/466, loss: 0.03363099694252014 2023-01-24 03:17:06.238525: step: 284/466, loss: 0.022089608013629913 2023-01-24 03:17:06.816491: step: 286/466, loss: 0.03461292013525963 2023-01-24 03:17:07.423038: 
step: 288/466, loss: 0.1107168048620224 2023-01-24 03:17:08.070064: step: 290/466, loss: 0.06371128559112549 2023-01-24 03:17:08.681683: step: 292/466, loss: 0.014528674073517323 2023-01-24 03:17:09.316014: step: 294/466, loss: 0.12037225812673569 2023-01-24 03:17:09.926410: step: 296/466, loss: 0.036495864391326904 2023-01-24 03:17:10.529868: step: 298/466, loss: 0.032784491777420044 2023-01-24 03:17:11.127443: step: 300/466, loss: 0.03633768483996391 2023-01-24 03:17:11.865244: step: 302/466, loss: 0.005397193133831024 2023-01-24 03:17:12.495098: step: 304/466, loss: 0.19300426542758942 2023-01-24 03:17:13.138475: step: 306/466, loss: 0.0667954683303833 2023-01-24 03:17:13.767863: step: 308/466, loss: 0.3774685561656952 2023-01-24 03:17:14.442770: step: 310/466, loss: 0.03482966870069504 2023-01-24 03:17:15.063425: step: 312/466, loss: 0.055863045156002045 2023-01-24 03:17:15.633558: step: 314/466, loss: 0.034730829298496246 2023-01-24 03:17:16.251927: step: 316/466, loss: 0.004963377956300974 2023-01-24 03:17:16.868537: step: 318/466, loss: 0.06198538467288017 2023-01-24 03:17:17.534615: step: 320/466, loss: 0.4823494255542755 2023-01-24 03:17:18.223036: step: 322/466, loss: 0.02422151528298855 2023-01-24 03:17:18.852016: step: 324/466, loss: 0.01642017997801304 2023-01-24 03:17:19.473044: step: 326/466, loss: 0.2166920006275177 2023-01-24 03:17:20.091553: step: 328/466, loss: 0.10384485125541687 2023-01-24 03:17:20.798538: step: 330/466, loss: 0.0906425192952156 2023-01-24 03:17:21.478071: step: 332/466, loss: 0.04465954750776291 2023-01-24 03:17:22.091005: step: 334/466, loss: 0.02751227281987667 2023-01-24 03:17:22.781669: step: 336/466, loss: 0.019835565239191055 2023-01-24 03:17:23.423633: step: 338/466, loss: 0.059981536120176315 2023-01-24 03:17:23.957939: step: 340/466, loss: 0.06736723333597183 2023-01-24 03:17:24.546992: step: 342/466, loss: 0.21253731846809387 2023-01-24 03:17:25.193898: step: 344/466, loss: 0.022313419729471207 2023-01-24 
03:17:25.793413: step: 346/466, loss: 0.03751381114125252 2023-01-24 03:17:26.435423: step: 348/466, loss: 0.030055083334445953 2023-01-24 03:17:27.099651: step: 350/466, loss: 0.022654645144939423 2023-01-24 03:17:27.715153: step: 352/466, loss: 0.1595204770565033 2023-01-24 03:17:28.396537: step: 354/466, loss: 0.016580000519752502 2023-01-24 03:17:29.011121: step: 356/466, loss: 0.07218082994222641 2023-01-24 03:17:29.624978: step: 358/466, loss: 0.6492629647254944 2023-01-24 03:17:30.249165: step: 360/466, loss: 0.1036364808678627 2023-01-24 03:17:30.911408: step: 362/466, loss: 0.14433376491069794 2023-01-24 03:17:31.496596: step: 364/466, loss: 0.04885255917906761 2023-01-24 03:17:32.144128: step: 366/466, loss: 0.014400053769350052 2023-01-24 03:17:32.770702: step: 368/466, loss: 0.04731348156929016 2023-01-24 03:17:33.411170: step: 370/466, loss: 0.04398616775870323 2023-01-24 03:17:34.025704: step: 372/466, loss: 8.719284057617188 2023-01-24 03:17:34.651583: step: 374/466, loss: 0.04839807003736496 2023-01-24 03:17:35.271232: step: 376/466, loss: 0.015130757354199886 2023-01-24 03:17:35.957672: step: 378/466, loss: 4.531932830810547 2023-01-24 03:17:36.596784: step: 380/466, loss: 0.002152061089873314 2023-01-24 03:17:37.207872: step: 382/466, loss: 0.07399486005306244 2023-01-24 03:17:37.819434: step: 384/466, loss: 0.026347309350967407 2023-01-24 03:17:38.425341: step: 386/466, loss: 0.009930618107318878 2023-01-24 03:17:39.095754: step: 388/466, loss: 0.029909221455454826 2023-01-24 03:17:39.636613: step: 390/466, loss: 0.04094990715384483 2023-01-24 03:17:40.305595: step: 392/466, loss: 0.06808948516845703 2023-01-24 03:17:40.896729: step: 394/466, loss: 0.05577511712908745 2023-01-24 03:17:41.507158: step: 396/466, loss: 0.026923881843686104 2023-01-24 03:17:42.121352: step: 398/466, loss: 0.10875479131937027 2023-01-24 03:17:42.703404: step: 400/466, loss: 0.026609651744365692 2023-01-24 03:17:43.309752: step: 402/466, loss: 0.028742486611008644 
2023-01-24 03:17:43.799911: step: 404/466, loss: 0.006182553246617317 2023-01-24 03:17:44.346189: step: 406/466, loss: 0.5232219696044922 2023-01-24 03:17:44.984854: step: 408/466, loss: 0.03529982641339302 2023-01-24 03:17:45.577484: step: 410/466, loss: 0.025698933750391006 2023-01-24 03:17:46.168234: step: 412/466, loss: 0.062040552496910095 2023-01-24 03:17:46.772899: step: 414/466, loss: 0.015775354579091072 2023-01-24 03:17:47.357334: step: 416/466, loss: 0.03277741000056267 2023-01-24 03:17:48.009775: step: 418/466, loss: 0.04921555519104004 2023-01-24 03:17:48.634101: step: 420/466, loss: 0.006552983541041613 2023-01-24 03:17:49.274303: step: 422/466, loss: 0.010042181238532066 2023-01-24 03:17:49.877953: step: 424/466, loss: 0.010497386567294598 2023-01-24 03:17:50.472409: step: 426/466, loss: 0.015600942075252533 2023-01-24 03:17:51.076408: step: 428/466, loss: 0.09118566662073135 2023-01-24 03:17:51.684997: step: 430/466, loss: 0.08123798668384552 2023-01-24 03:17:52.383067: step: 432/466, loss: 0.0365171805024147 2023-01-24 03:17:53.045696: step: 434/466, loss: 0.06810249388217926 2023-01-24 03:17:53.639260: step: 436/466, loss: 0.04731258004903793 2023-01-24 03:17:54.359425: step: 438/466, loss: 0.04645892605185509 2023-01-24 03:17:55.075398: step: 440/466, loss: 0.03728519007563591 2023-01-24 03:17:55.675742: step: 442/466, loss: 0.016282543540000916 2023-01-24 03:17:56.313236: step: 444/466, loss: 0.022868501022458076 2023-01-24 03:17:56.885723: step: 446/466, loss: 0.027541210874915123 2023-01-24 03:17:57.520003: step: 448/466, loss: 0.00614620978012681 2023-01-24 03:17:58.139882: step: 450/466, loss: 0.01625632867217064 2023-01-24 03:17:58.807782: step: 452/466, loss: 0.012641062028706074 2023-01-24 03:17:59.393544: step: 454/466, loss: 0.015201380476355553 2023-01-24 03:18:00.050067: step: 456/466, loss: 0.04819103702902794 2023-01-24 03:18:00.653461: step: 458/466, loss: 0.005609402433037758 2023-01-24 03:18:01.256099: step: 460/466, loss: 
0.016664976254105568 2023-01-24 03:18:01.913455: step: 462/466, loss: 0.019776416942477226 2023-01-24 03:18:02.551543: step: 464/466, loss: 0.009411384351551533 2023-01-24 03:18:03.175056: step: 466/466, loss: 0.02541434019804001 2023-01-24 03:18:03.747104: step: 468/466, loss: 0.016636408865451813 2023-01-24 03:18:04.409178: step: 470/466, loss: 0.03303654119372368 2023-01-24 03:18:04.951188: step: 472/466, loss: 0.054419100284576416 2023-01-24 03:18:05.559799: step: 474/466, loss: 0.09407573938369751 2023-01-24 03:18:06.197066: step: 476/466, loss: 0.034601762890815735 2023-01-24 03:18:06.827836: step: 478/466, loss: 0.02005459927022457 2023-01-24 03:18:07.442660: step: 480/466, loss: 0.02703125588595867 2023-01-24 03:18:08.051591: step: 482/466, loss: 0.0024854594375938177 2023-01-24 03:18:08.634550: step: 484/466, loss: 0.04000410437583923 2023-01-24 03:18:09.243381: step: 486/466, loss: 0.22972926497459412 2023-01-24 03:18:09.898192: step: 488/466, loss: 0.1464962661266327 2023-01-24 03:18:10.494891: step: 490/466, loss: 0.21413233876228333 2023-01-24 03:18:11.111102: step: 492/466, loss: 0.017804233357310295 2023-01-24 03:18:11.787911: step: 494/466, loss: 0.018891816958785057 2023-01-24 03:18:12.458628: step: 496/466, loss: 0.05741456523537636 2023-01-24 03:18:13.032740: step: 498/466, loss: 0.06254232674837112 2023-01-24 03:18:13.632591: step: 500/466, loss: 0.0880371630191803 2023-01-24 03:18:14.243771: step: 502/466, loss: 0.026517130434513092 2023-01-24 03:18:14.829436: step: 504/466, loss: 0.015300702303647995 2023-01-24 03:18:15.399163: step: 506/466, loss: 0.019848767668008804 2023-01-24 03:18:16.078396: step: 508/466, loss: 0.5972859263420105 2023-01-24 03:18:16.655723: step: 510/466, loss: 0.002276431303471327 2023-01-24 03:18:17.367114: step: 512/466, loss: 0.02450762875378132 2023-01-24 03:18:17.924263: step: 514/466, loss: 0.06573548167943954 2023-01-24 03:18:18.558874: step: 516/466, loss: 0.10455503314733505 2023-01-24 03:18:19.168612: step: 
518/466, loss: 0.04081840068101883 2023-01-24 03:18:19.825972: step: 520/466, loss: 0.0435999259352684 2023-01-24 03:18:20.444976: step: 522/466, loss: 0.019436439499258995 2023-01-24 03:18:21.055529: step: 524/466, loss: 0.07334206253290176 2023-01-24 03:18:21.709005: step: 526/466, loss: 0.2929591238498688 2023-01-24 03:18:22.344094: step: 528/466, loss: 0.02410922572016716 2023-01-24 03:18:23.026666: step: 530/466, loss: 0.08006177097558975 2023-01-24 03:18:23.584405: step: 532/466, loss: 0.07058415561914444 2023-01-24 03:18:24.169250: step: 534/466, loss: 0.06360165774822235 2023-01-24 03:18:24.774698: step: 536/466, loss: 0.012963922694325447 2023-01-24 03:18:25.460765: step: 538/466, loss: 0.005889566615223885 2023-01-24 03:18:26.110597: step: 540/466, loss: 0.0956955999135971 2023-01-24 03:18:26.783900: step: 542/466, loss: 0.014393622055649757 2023-01-24 03:18:27.423445: step: 544/466, loss: 0.2035215198993683 2023-01-24 03:18:28.042588: step: 546/466, loss: 0.061057738959789276 2023-01-24 03:18:28.703988: step: 548/466, loss: 0.049713004380464554 2023-01-24 03:18:29.386054: step: 550/466, loss: 0.005676799453794956 2023-01-24 03:18:29.971202: step: 552/466, loss: 0.015204534865915775 2023-01-24 03:18:30.607248: step: 554/466, loss: 0.06786204129457474 2023-01-24 03:18:31.249132: step: 556/466, loss: 0.055277228355407715 2023-01-24 03:18:31.857981: step: 558/466, loss: 0.2268158197402954 2023-01-24 03:18:32.541128: step: 560/466, loss: 0.03270851820707321 2023-01-24 03:18:33.251874: step: 562/466, loss: 0.08453086018562317 2023-01-24 03:18:33.916347: step: 564/466, loss: 0.15126405656337738 2023-01-24 03:18:34.549521: step: 566/466, loss: 0.18492238223552704 2023-01-24 03:18:35.169472: step: 568/466, loss: 0.015301107428967953 2023-01-24 03:18:35.803113: step: 570/466, loss: 0.016525086015462875 2023-01-24 03:18:36.406318: step: 572/466, loss: 0.07693120837211609 2023-01-24 03:18:36.980318: step: 574/466, loss: 0.011329410597682 2023-01-24 03:18:37.629816: 
step: 576/466, loss: 0.14618626236915588 2023-01-24 03:18:38.246684: step: 578/466, loss: 0.23490644991397858 2023-01-24 03:18:38.921582: step: 580/466, loss: 0.0417061522603035 2023-01-24 03:18:39.522068: step: 582/466, loss: 0.03330326825380325 2023-01-24 03:18:40.155006: step: 584/466, loss: 0.013184239156544209 2023-01-24 03:18:40.738241: step: 586/466, loss: 0.01833726279437542 2023-01-24 03:18:41.380387: step: 588/466, loss: 0.1240232065320015 2023-01-24 03:18:42.021453: step: 590/466, loss: 0.16520902514457703 2023-01-24 03:18:42.592895: step: 592/466, loss: 0.05892754718661308 2023-01-24 03:18:43.194438: step: 594/466, loss: 0.025030579417943954 2023-01-24 03:18:43.778154: step: 596/466, loss: 0.04172952473163605 2023-01-24 03:18:44.431848: step: 598/466, loss: 0.017421064898371696 2023-01-24 03:18:45.048545: step: 600/466, loss: 0.0441550612449646 2023-01-24 03:18:45.669878: step: 602/466, loss: 0.009792739525437355 2023-01-24 03:18:46.215792: step: 604/466, loss: 0.034416936337947845 2023-01-24 03:18:46.782893: step: 606/466, loss: 0.00828731432557106 2023-01-24 03:18:47.467385: step: 608/466, loss: 0.03894294798374176 2023-01-24 03:18:48.178671: step: 610/466, loss: 0.5128212571144104 2023-01-24 03:18:48.793439: step: 612/466, loss: 0.014220272190868855 2023-01-24 03:18:49.418093: step: 614/466, loss: 0.05291266366839409 2023-01-24 03:18:50.032083: step: 616/466, loss: 0.08025780320167542 2023-01-24 03:18:50.685865: step: 618/466, loss: 0.014249353669583797 2023-01-24 03:18:51.408122: step: 620/466, loss: 0.09287619590759277 2023-01-24 03:18:52.012004: step: 622/466, loss: 0.022892482578754425 2023-01-24 03:18:52.634647: step: 624/466, loss: 0.029883043840527534 2023-01-24 03:18:53.258309: step: 626/466, loss: 0.04735163226723671 2023-01-24 03:18:53.851208: step: 628/466, loss: 0.051800090819597244 2023-01-24 03:18:54.455490: step: 630/466, loss: 0.13179709017276764 2023-01-24 03:18:55.137017: step: 632/466, loss: 0.01729048229753971 2023-01-24 
03:18:55.835484: step: 634/466, loss: 0.07633604854345322 2023-01-24 03:18:56.586520: step: 636/466, loss: 0.023014308884739876 2023-01-24 03:18:57.191307: step: 638/466, loss: 0.0821717232465744 2023-01-24 03:18:57.858715: step: 640/466, loss: 0.01899549923837185 2023-01-24 03:18:58.482876: step: 642/466, loss: 0.007269714493304491 2023-01-24 03:18:59.135308: step: 644/466, loss: 0.027033589780330658 2023-01-24 03:18:59.754313: step: 646/466, loss: 0.061697300523519516 2023-01-24 03:19:00.406251: step: 648/466, loss: 0.06357747316360474 2023-01-24 03:19:01.030221: step: 650/466, loss: 0.3793969452381134 2023-01-24 03:19:01.648874: step: 652/466, loss: 0.04672146216034889 2023-01-24 03:19:02.279301: step: 654/466, loss: 0.00910738855600357 2023-01-24 03:19:02.899235: step: 656/466, loss: 0.008787152357399464 2023-01-24 03:19:03.546177: step: 658/466, loss: 0.027202855795621872 2023-01-24 03:19:04.227284: step: 660/466, loss: 0.032332953065633774 2023-01-24 03:19:04.908698: step: 662/466, loss: 0.0030818090308457613 2023-01-24 03:19:05.547052: step: 664/466, loss: 0.04429381713271141 2023-01-24 03:19:06.144957: step: 666/466, loss: 0.24332015216350555 2023-01-24 03:19:06.852837: step: 668/466, loss: 0.07040964066982269 2023-01-24 03:19:07.438888: step: 670/466, loss: 0.012174397706985474 2023-01-24 03:19:08.121198: step: 672/466, loss: 0.5175571441650391 2023-01-24 03:19:08.724479: step: 674/466, loss: 0.047638919204473495 2023-01-24 03:19:09.349270: step: 676/466, loss: 0.781222939491272 2023-01-24 03:19:09.952820: step: 678/466, loss: 0.08504980802536011 2023-01-24 03:19:10.611398: step: 680/466, loss: 0.001895928755402565 2023-01-24 03:19:11.223921: step: 682/466, loss: 0.15976615250110626 2023-01-24 03:19:11.807935: step: 684/466, loss: 0.013469494879245758 2023-01-24 03:19:12.440579: step: 686/466, loss: 0.02041921205818653 2023-01-24 03:19:13.065869: step: 688/466, loss: 0.008028162643313408 2023-01-24 03:19:13.756902: step: 690/466, loss: 0.17626157402992249 
2023-01-24 03:19:14.414112: step: 692/466, loss: 0.8268342614173889 2023-01-24 03:19:15.077983: step: 694/466, loss: 0.08224974572658539 2023-01-24 03:19:15.712098: step: 696/466, loss: 0.03389971703290939 2023-01-24 03:19:16.375320: step: 698/466, loss: 0.031209105625748634 2023-01-24 03:19:17.025835: step: 700/466, loss: 0.0534808486700058 2023-01-24 03:19:17.633033: step: 702/466, loss: 0.12038559466600418 2023-01-24 03:19:18.257708: step: 704/466, loss: 0.04523187503218651 2023-01-24 03:19:18.912951: step: 706/466, loss: 0.026461729779839516 2023-01-24 03:19:19.615730: step: 708/466, loss: 0.14966435730457306 2023-01-24 03:19:20.226223: step: 710/466, loss: 0.025082498788833618 2023-01-24 03:19:20.918686: step: 712/466, loss: 0.0047665368765592575 2023-01-24 03:19:21.537997: step: 714/466, loss: 0.025602247565984726 2023-01-24 03:19:22.162820: step: 716/466, loss: 0.05528547987341881 2023-01-24 03:19:22.819928: step: 718/466, loss: 0.030770858749747276 2023-01-24 03:19:23.403363: step: 720/466, loss: 0.08510482311248779 2023-01-24 03:19:23.989732: step: 722/466, loss: 0.01946280337870121 2023-01-24 03:19:24.648375: step: 724/466, loss: 0.030918046832084656 2023-01-24 03:19:25.247906: step: 726/466, loss: 0.026320137083530426 2023-01-24 03:19:25.853537: step: 728/466, loss: 0.035525575280189514 2023-01-24 03:19:26.514072: step: 730/466, loss: 0.23102110624313354 2023-01-24 03:19:27.145425: step: 732/466, loss: 0.05604511499404907 2023-01-24 03:19:27.763027: step: 734/466, loss: 0.1168517991900444 2023-01-24 03:19:28.350068: step: 736/466, loss: 0.02180151827633381 2023-01-24 03:19:28.973147: step: 738/466, loss: 0.008714184165000916 2023-01-24 03:19:29.594679: step: 740/466, loss: 0.0680922344326973 2023-01-24 03:19:30.262264: step: 742/466, loss: 0.02514813095331192 2023-01-24 03:19:30.865946: step: 744/466, loss: 0.017655352130532265 2023-01-24 03:19:31.455169: step: 746/466, loss: 0.06468646228313446 2023-01-24 03:19:32.025589: step: 748/466, loss: 
0.009327889420092106 2023-01-24 03:19:32.638707: step: 750/466, loss: 0.018987977877259254 2023-01-24 03:19:33.272223: step: 752/466, loss: 0.017294151708483696 2023-01-24 03:19:33.946810: step: 754/466, loss: 0.01210116222500801 2023-01-24 03:19:34.605607: step: 756/466, loss: 0.5217264294624329 2023-01-24 03:19:35.251704: step: 758/466, loss: 0.12612611055374146 2023-01-24 03:19:35.898750: step: 760/466, loss: 0.6621501445770264 2023-01-24 03:19:36.447999: step: 762/466, loss: 0.0022763311862945557 2023-01-24 03:19:37.041208: step: 764/466, loss: 0.012427791953086853 2023-01-24 03:19:37.672606: step: 766/466, loss: 0.06161829084157944 2023-01-24 03:19:38.349126: step: 768/466, loss: 0.0627053752541542 2023-01-24 03:19:39.013131: step: 770/466, loss: 0.0074045998044312 2023-01-24 03:19:39.643690: step: 772/466, loss: 0.05282897502183914 2023-01-24 03:19:40.262248: step: 774/466, loss: 0.01831836998462677 2023-01-24 03:19:40.890274: step: 776/466, loss: 0.026106664910912514 2023-01-24 03:19:41.534641: step: 778/466, loss: 0.048861004412174225 2023-01-24 03:19:42.120607: step: 780/466, loss: 0.006538981571793556 2023-01-24 03:19:42.783202: step: 782/466, loss: 0.04800443723797798 2023-01-24 03:19:43.426085: step: 784/466, loss: 0.03181067481637001 2023-01-24 03:19:44.067280: step: 786/466, loss: 0.012432624585926533 2023-01-24 03:19:44.730694: step: 788/466, loss: 0.1364171802997589 2023-01-24 03:19:45.323054: step: 790/466, loss: 0.011952829547226429 2023-01-24 03:19:45.898164: step: 792/466, loss: 0.3651338517665863 2023-01-24 03:19:46.542008: step: 794/466, loss: 0.14958646893501282 2023-01-24 03:19:47.175420: step: 796/466, loss: 0.013606944121420383 2023-01-24 03:19:47.813361: step: 798/466, loss: 0.024553818628191948 2023-01-24 03:19:48.426054: step: 800/466, loss: 0.018948564305901527 2023-01-24 03:19:49.074095: step: 802/466, loss: 0.052254047244787216 2023-01-24 03:19:49.658075: step: 804/466, loss: 0.08240722119808197 2023-01-24 03:19:50.245372: step: 
806/466, loss: 0.050371814519166946 2023-01-24 03:19:50.896716: step: 808/466, loss: 0.0409635566174984 2023-01-24 03:19:51.532820: step: 810/466, loss: 0.060565996915102005 2023-01-24 03:19:52.163711: step: 812/466, loss: 0.031367238610982895 2023-01-24 03:19:52.772196: step: 814/466, loss: 0.08814281970262527 2023-01-24 03:19:53.450359: step: 816/466, loss: 0.04216299206018448 2023-01-24 03:19:54.171544: step: 818/466, loss: 0.021370599046349525 2023-01-24 03:19:54.752978: step: 820/466, loss: 0.021927153691649437 2023-01-24 03:19:55.372587: step: 822/466, loss: 0.03202419355511665 2023-01-24 03:19:55.988622: step: 824/466, loss: 0.8685194253921509 2023-01-24 03:19:56.598611: step: 826/466, loss: 0.10557854175567627 2023-01-24 03:19:57.168121: step: 828/466, loss: 0.027995627373456955 2023-01-24 03:19:57.791599: step: 830/466, loss: 0.008571329526603222 2023-01-24 03:19:58.472296: step: 832/466, loss: 0.017078906297683716 2023-01-24 03:19:59.077739: step: 834/466, loss: 0.06253548711538315 2023-01-24 03:19:59.730149: step: 836/466, loss: 0.03214738890528679 2023-01-24 03:20:00.525487: step: 838/466, loss: 0.11375249922275543 2023-01-24 03:20:01.173590: step: 840/466, loss: 0.010107310488820076 2023-01-24 03:20:01.796569: step: 842/466, loss: 0.0403822585940361 2023-01-24 03:20:02.436379: step: 844/466, loss: 0.07216506451368332 2023-01-24 03:20:03.023973: step: 846/466, loss: 0.03277384862303734 2023-01-24 03:20:03.678063: step: 848/466, loss: 1.960360050201416 2023-01-24 03:20:04.229652: step: 850/466, loss: 0.028355680406093597 2023-01-24 03:20:04.792034: step: 852/466, loss: 0.049147844314575195 2023-01-24 03:20:05.458196: step: 854/466, loss: 0.08806189894676208 2023-01-24 03:20:06.068938: step: 856/466, loss: 0.19922910630702972 2023-01-24 03:20:06.689783: step: 858/466, loss: 0.017885398119688034 2023-01-24 03:20:07.292930: step: 860/466, loss: 0.02041354402899742 2023-01-24 03:20:07.976153: step: 862/466, loss: 0.04174542427062988 2023-01-24 
03:20:08.623405: step: 864/466, loss: 0.011507490649819374 2023-01-24 03:20:09.351025: step: 866/466, loss: 0.00754587771371007 2023-01-24 03:20:10.009132: step: 868/466, loss: 0.13759632408618927 2023-01-24 03:20:10.675840: step: 870/466, loss: 0.008561786264181137 2023-01-24 03:20:11.313130: step: 872/466, loss: 0.019189957529306412 2023-01-24 03:20:11.951168: step: 874/466, loss: 0.08911080658435822 2023-01-24 03:20:12.592828: step: 876/466, loss: 0.03029986470937729 2023-01-24 03:20:13.211922: step: 878/466, loss: 0.014277573674917221 2023-01-24 03:20:13.846674: step: 880/466, loss: 0.001170774339698255 2023-01-24 03:20:14.473686: step: 882/466, loss: 0.04861774295568466 2023-01-24 03:20:15.094530: step: 884/466, loss: 0.04374024271965027 2023-01-24 03:20:15.698948: step: 886/466, loss: 0.02988075092434883 2023-01-24 03:20:16.271070: step: 888/466, loss: 0.03934662044048309 2023-01-24 03:20:16.914616: step: 890/466, loss: 0.06640966981649399 2023-01-24 03:20:17.603764: step: 892/466, loss: 0.03840658813714981 2023-01-24 03:20:18.218646: step: 894/466, loss: 0.017629683017730713 2023-01-24 03:20:18.905336: step: 896/466, loss: 0.0430067740380764 2023-01-24 03:20:19.566622: step: 898/466, loss: 0.025082241743803024 2023-01-24 03:20:20.248915: step: 900/466, loss: 0.06866186112165451 2023-01-24 03:20:20.956969: step: 902/466, loss: 0.08462058007717133 2023-01-24 03:20:21.585391: step: 904/466, loss: 0.041680388152599335 2023-01-24 03:20:22.206066: step: 906/466, loss: 0.04381396621465683 2023-01-24 03:20:22.855171: step: 908/466, loss: 0.0076791406609117985 2023-01-24 03:20:23.506197: step: 910/466, loss: 0.0775487944483757 2023-01-24 03:20:24.155556: step: 912/466, loss: 0.019085828214883804 2023-01-24 03:20:24.780326: step: 914/466, loss: 0.050074320286512375 2023-01-24 03:20:25.435916: step: 916/466, loss: 0.0503820925951004 2023-01-24 03:20:26.129312: step: 918/466, loss: 0.1022348552942276 2023-01-24 03:20:26.764276: step: 920/466, loss: 0.0878489688038826 
2023-01-24 03:20:27.512737: step: 922/466, loss: 0.08654739707708359
2023-01-24 03:20:28.110293: step: 924/466, loss: 0.01694355346262455
2023-01-24 03:20:28.717915: step: 926/466, loss: 0.17691993713378906
2023-01-24 03:20:29.364183: step: 928/466, loss: 0.07006848603487015
2023-01-24 03:20:29.947737: step: 930/466, loss: 0.008420702069997787
2023-01-24 03:20:30.466965: step: 932/466, loss: 0.44087299704551697
==================================================
Loss: 0.106
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3465223418134377, 'r': 0.33600173940543965, 'f1': 0.3411809569685293}, 'combined': 0.25139649460839, 'epoch': 23}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36496073526407535, 'r': 0.2964516699074107, 'f1': 0.3271581197259826}, 'combined': 0.2169753332897708, 'epoch': 23}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3380231900452489, 'r': 0.28296638257575757, 'f1': 0.3080541237113402}, 'combined': 0.20536941580756013, 'epoch': 23}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37949139688666383, 'r': 0.28313872852635835, 'f1': 0.3243097694485534}, 'combined': 0.21165479690326638, 'epoch': 23}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3306209672773361, 'r': 0.32999360301305275, 'f1': 0.3303069872514317}, 'combined': 0.24338409586947599, 'epoch': 23}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35441556327912643, 'r': 0.28291874402022044, 'f1': 0.31465686022470346}, 'combined': 0.2086843425324458, 'epoch': 23}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3069153069153069, 'r': 0.34199134199134196, 'f1': 0.3235053235053235}, 'combined': 0.21567021567021566, 'epoch': 23}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.48333333333333334, 'r': 0.31521739130434784, 'f1': 0.381578947368421}, 'combined': 0.25438596491228066, 'epoch': 23}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.13793103448275862, 'f1': 0.2162162162162162}, 'combined': 0.14414414414414412, 'epoch': 23}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 24
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 03:23:02.606533: step: 2/466, loss: 0.5174586772918701
2023-01-24 03:23:03.189283: step: 4/466, loss: 0.009579693898558617
2023-01-24 03:23:03.879673: step: 6/466, loss: 0.09315939247608185
2023-01-24 03:23:04.453530: step: 8/466, loss: 0.03629219904541969
2023-01-24 03:23:05.055267: step: 10/466, loss: 0.016620714217424393
2023-01-24 03:23:05.647684: step: 12/466, loss: 0.031349532306194305
2023-01-24 03:23:06.279585: step: 14/466, loss: 0.25620338320732117
2023-01-24 03:23:06.918527: step: 16/466, loss: 0.04357297345995903
2023-01-24 03:23:07.553453: step: 18/466, loss: 0.004880046471953392
2023-01-24 03:23:08.177799: step: 20/466, loss: 0.017147121950984
2023-01-24 03:23:08.815636: step: 22/466, loss: 0.11786799132823944
2023-01-24 03:23:09.432856: step: 24/466, loss: 0.021733110770583153
2023-01-24 03:23:10.096409: step: 26/466, loss: 0.031022317707538605
2023-01-24 03:23:10.784256: step: 28/466, loss: 0.04381708428263664
2023-01-24 03:23:11.410069: step: 30/466, loss: 0.061711300164461136
2023-01-24 03:23:12.014860: step: 32/466, loss: 0.015330356545746326
2023-01-24 03:23:12.631434: step: 34/466, loss:
0.022332649677991867 2023-01-24 03:23:13.249319: step: 36/466, loss: 0.048555828630924225 2023-01-24 03:23:13.827009: step: 38/466, loss: 0.022368174046278 2023-01-24 03:23:14.555234: step: 40/466, loss: 0.042115744203329086 2023-01-24 03:23:15.152757: step: 42/466, loss: 0.05961944907903671 2023-01-24 03:23:15.798270: step: 44/466, loss: 0.014870070852339268 2023-01-24 03:23:16.411423: step: 46/466, loss: 0.3316933214664459 2023-01-24 03:23:17.033355: step: 48/466, loss: 0.022402647882699966 2023-01-24 03:23:17.714968: step: 50/466, loss: 0.20861859619617462 2023-01-24 03:23:18.342942: step: 52/466, loss: 0.03810880705714226 2023-01-24 03:23:19.015144: step: 54/466, loss: 0.03817550092935562 2023-01-24 03:23:19.613962: step: 56/466, loss: 0.002065233187749982 2023-01-24 03:23:20.310843: step: 58/466, loss: 0.011738712899386883 2023-01-24 03:23:20.896687: step: 60/466, loss: 0.007858922705054283 2023-01-24 03:23:21.543979: step: 62/466, loss: 0.014686870388686657 2023-01-24 03:23:22.189059: step: 64/466, loss: 0.007629875093698502 2023-01-24 03:23:22.808452: step: 66/466, loss: 0.015685701742768288 2023-01-24 03:23:23.450639: step: 68/466, loss: 0.0027154877316206694 2023-01-24 03:23:24.045900: step: 70/466, loss: 0.0007991743623279035 2023-01-24 03:23:24.649688: step: 72/466, loss: 0.05554645135998726 2023-01-24 03:23:25.334482: step: 74/466, loss: 0.023773424327373505 2023-01-24 03:23:25.928084: step: 76/466, loss: 0.043621812015771866 2023-01-24 03:23:26.577446: step: 78/466, loss: 0.08062449097633362 2023-01-24 03:23:27.167046: step: 80/466, loss: 0.11252401769161224 2023-01-24 03:23:27.804791: step: 82/466, loss: 0.15902875363826752 2023-01-24 03:23:28.439343: step: 84/466, loss: 0.05257668346166611 2023-01-24 03:23:29.077945: step: 86/466, loss: 0.022970149293541908 2023-01-24 03:23:29.649641: step: 88/466, loss: 0.04428599402308464 2023-01-24 03:23:30.311225: step: 90/466, loss: 0.21043623983860016 2023-01-24 03:23:30.973051: step: 92/466, loss: 
0.05659813806414604 2023-01-24 03:23:31.564124: step: 94/466, loss: 0.02158990129828453 2023-01-24 03:23:32.147547: step: 96/466, loss: 0.0067445761524140835 2023-01-24 03:23:32.752186: step: 98/466, loss: 0.06192212179303169 2023-01-24 03:23:33.444066: step: 100/466, loss: 0.04253120347857475 2023-01-24 03:23:34.031940: step: 102/466, loss: 0.06492670625448227 2023-01-24 03:23:34.657527: step: 104/466, loss: 0.04266819730401039 2023-01-24 03:23:35.235835: step: 106/466, loss: 0.036097414791584015 2023-01-24 03:23:35.955692: step: 108/466, loss: 0.2894755005836487 2023-01-24 03:23:36.607042: step: 110/466, loss: 0.05016065761446953 2023-01-24 03:23:37.329672: step: 112/466, loss: 0.043521471321582794 2023-01-24 03:23:38.013082: step: 114/466, loss: 0.10526257008314133 2023-01-24 03:23:38.653177: step: 116/466, loss: 0.0045556980185210705 2023-01-24 03:23:39.291272: step: 118/466, loss: 4.002925872802734 2023-01-24 03:23:39.897391: step: 120/466, loss: 0.016178200021386147 2023-01-24 03:23:40.539909: step: 122/466, loss: 0.03486013412475586 2023-01-24 03:23:41.243029: step: 124/466, loss: 0.11223459243774414 2023-01-24 03:23:41.829993: step: 126/466, loss: 0.01602235622704029 2023-01-24 03:23:42.453750: step: 128/466, loss: 0.03682239353656769 2023-01-24 03:23:43.101564: step: 130/466, loss: 0.03416857123374939 2023-01-24 03:23:43.739163: step: 132/466, loss: 0.08398226648569107 2023-01-24 03:23:44.321076: step: 134/466, loss: 0.0524989515542984 2023-01-24 03:23:44.973511: step: 136/466, loss: 0.01495880726724863 2023-01-24 03:23:45.561769: step: 138/466, loss: 0.0672888457775116 2023-01-24 03:23:46.241865: step: 140/466, loss: 0.03272339329123497 2023-01-24 03:23:46.863665: step: 142/466, loss: 0.010015328414738178 2023-01-24 03:23:47.487245: step: 144/466, loss: 0.023423034697771072 2023-01-24 03:23:48.100341: step: 146/466, loss: 0.012548690661787987 2023-01-24 03:23:48.758036: step: 148/466, loss: 0.009754863567650318 2023-01-24 03:23:49.352768: step: 150/466, 
loss: 0.0527384988963604 2023-01-24 03:23:49.957155: step: 152/466, loss: 0.00016302397125400603 2023-01-24 03:23:50.530583: step: 154/466, loss: 0.040271829813718796 2023-01-24 03:23:51.145461: step: 156/466, loss: 0.023221759125590324 2023-01-24 03:23:51.796367: step: 158/466, loss: 0.005486933048814535 2023-01-24 03:23:52.445148: step: 160/466, loss: 0.14957301318645477 2023-01-24 03:23:53.056443: step: 162/466, loss: 0.040905553847551346 2023-01-24 03:23:53.712197: step: 164/466, loss: 0.14041171967983246 2023-01-24 03:23:54.324809: step: 166/466, loss: 0.045817721635103226 2023-01-24 03:23:54.944731: step: 168/466, loss: 0.020683517679572105 2023-01-24 03:23:55.476385: step: 170/466, loss: 0.12141299247741699 2023-01-24 03:23:56.108959: step: 172/466, loss: 0.03760865703225136 2023-01-24 03:23:56.763485: step: 174/466, loss: 0.030975865200161934 2023-01-24 03:23:57.337942: step: 176/466, loss: 0.02207639068365097 2023-01-24 03:23:58.006930: step: 178/466, loss: 0.017290756106376648 2023-01-24 03:23:58.608810: step: 180/466, loss: 0.0924760028719902 2023-01-24 03:23:59.214870: step: 182/466, loss: 0.02489081583917141 2023-01-24 03:23:59.846131: step: 184/466, loss: 0.010242631658911705 2023-01-24 03:24:00.460425: step: 186/466, loss: 0.023558583110570908 2023-01-24 03:24:01.098312: step: 188/466, loss: 0.07605066150426865 2023-01-24 03:24:01.680527: step: 190/466, loss: 0.0016990734729915857 2023-01-24 03:24:02.332366: step: 192/466, loss: 0.005150969605892897 2023-01-24 03:24:02.929799: step: 194/466, loss: 0.012659971602261066 2023-01-24 03:24:03.566068: step: 196/466, loss: 0.014583379961550236 2023-01-24 03:24:04.210259: step: 198/466, loss: 0.010530868545174599 2023-01-24 03:24:04.795935: step: 200/466, loss: 0.04471125826239586 2023-01-24 03:24:05.488318: step: 202/466, loss: 0.04739345610141754 2023-01-24 03:24:06.140164: step: 204/466, loss: 0.07824733853340149 2023-01-24 03:24:06.729061: step: 206/466, loss: 0.0018017380498349667 2023-01-24 
03:24:07.287376: step: 208/466, loss: 0.07637045532464981 2023-01-24 03:24:07.940499: step: 210/466, loss: 0.07172827422618866 2023-01-24 03:24:08.585457: step: 212/466, loss: 0.10092227905988693 2023-01-24 03:24:09.257427: step: 214/466, loss: 0.023817284032702446 2023-01-24 03:24:09.890224: step: 216/466, loss: 0.29648497700691223 2023-01-24 03:24:10.468949: step: 218/466, loss: 0.010222791694104671 2023-01-24 03:24:11.191377: step: 220/466, loss: 0.01697523146867752 2023-01-24 03:24:11.790637: step: 222/466, loss: 0.03982623293995857 2023-01-24 03:24:12.390275: step: 224/466, loss: 0.16172361373901367 2023-01-24 03:24:13.041346: step: 226/466, loss: 0.020176347345113754 2023-01-24 03:24:13.682401: step: 228/466, loss: 0.03942863643169403 2023-01-24 03:24:14.298805: step: 230/466, loss: 0.0117865651845932 2023-01-24 03:24:14.984599: step: 232/466, loss: 0.09619100391864777 2023-01-24 03:24:15.607731: step: 234/466, loss: 0.10768011957406998 2023-01-24 03:24:16.247194: step: 236/466, loss: 0.011421790346503258 2023-01-24 03:24:16.911969: step: 238/466, loss: 0.07146967202425003 2023-01-24 03:24:17.521974: step: 240/466, loss: 0.014844970777630806 2023-01-24 03:24:18.075842: step: 242/466, loss: 0.002789780031889677 2023-01-24 03:24:18.712772: step: 244/466, loss: 0.7855795621871948 2023-01-24 03:24:19.407986: step: 246/466, loss: 0.05135297030210495 2023-01-24 03:24:20.055975: step: 248/466, loss: 0.012107757851481438 2023-01-24 03:24:20.694004: step: 250/466, loss: 0.02014041505753994 2023-01-24 03:24:21.294138: step: 252/466, loss: 0.02659180946648121 2023-01-24 03:24:21.964150: step: 254/466, loss: 0.016569100320339203 2023-01-24 03:24:22.586897: step: 256/466, loss: 0.08627676963806152 2023-01-24 03:24:23.226216: step: 258/466, loss: 0.008308811113238335 2023-01-24 03:24:23.832943: step: 260/466, loss: 0.036741409450769424 2023-01-24 03:24:24.577221: step: 262/466, loss: 0.026793222874403 2023-01-24 03:24:25.179480: step: 264/466, loss: 0.04340960457921028 
2023-01-24 03:24:25.757250: step: 266/466, loss: 0.015985840931534767 2023-01-24 03:24:26.356902: step: 268/466, loss: 0.02499624900519848 2023-01-24 03:24:27.070234: step: 270/466, loss: 0.6226954460144043 2023-01-24 03:24:27.705295: step: 272/466, loss: 0.052729010581970215 2023-01-24 03:24:28.330551: step: 274/466, loss: 0.023654548451304436 2023-01-24 03:24:28.883206: step: 276/466, loss: 0.0032235553953796625 2023-01-24 03:24:29.510904: step: 278/466, loss: 0.0302759800106287 2023-01-24 03:24:30.108150: step: 280/466, loss: 0.008658492006361485 2023-01-24 03:24:30.760250: step: 282/466, loss: 0.03539307788014412 2023-01-24 03:24:31.422150: step: 284/466, loss: 0.033288102596998215 2023-01-24 03:24:32.091109: step: 286/466, loss: 0.06405381113290787 2023-01-24 03:24:32.686863: step: 288/466, loss: 0.018247967585921288 2023-01-24 03:24:33.344515: step: 290/466, loss: 0.02976909652352333 2023-01-24 03:24:34.011110: step: 292/466, loss: 0.06236724928021431 2023-01-24 03:24:34.614336: step: 294/466, loss: 0.031794726848602295 2023-01-24 03:24:35.225917: step: 296/466, loss: 0.02337266318500042 2023-01-24 03:24:35.820325: step: 298/466, loss: 0.010144537314772606 2023-01-24 03:24:36.487974: step: 300/466, loss: 0.03884013369679451 2023-01-24 03:24:37.089460: step: 302/466, loss: 0.0067479852586984634 2023-01-24 03:24:37.647952: step: 304/466, loss: 0.0077685341238975525 2023-01-24 03:24:38.280952: step: 306/466, loss: 0.014038033783435822 2023-01-24 03:24:38.881648: step: 308/466, loss: 0.024625875055789948 2023-01-24 03:24:39.559498: step: 310/466, loss: 0.023192688822746277 2023-01-24 03:24:40.172010: step: 312/466, loss: 0.003318582195788622 2023-01-24 03:24:40.815610: step: 314/466, loss: 0.06751104444265366 2023-01-24 03:24:41.524878: step: 316/466, loss: 0.08357201516628265 2023-01-24 03:24:42.143963: step: 318/466, loss: 0.03922853246331215 2023-01-24 03:24:42.756423: step: 320/466, loss: 0.04735429957509041 2023-01-24 03:24:43.406894: step: 322/466, loss: 
0.03607185557484627 2023-01-24 03:24:44.019778: step: 324/466, loss: 2.4579479694366455 2023-01-24 03:24:44.643386: step: 326/466, loss: 0.0066762263886630535 2023-01-24 03:24:45.351193: step: 328/466, loss: 0.020376691594719887 2023-01-24 03:24:45.950202: step: 330/466, loss: 0.0028863598126918077 2023-01-24 03:24:46.606673: step: 332/466, loss: 0.061129070818424225 2023-01-24 03:24:47.318910: step: 334/466, loss: 0.003399621695280075 2023-01-24 03:24:48.005977: step: 336/466, loss: 0.0011069603497162461 2023-01-24 03:24:48.575106: step: 338/466, loss: 0.008519783616065979 2023-01-24 03:24:49.195777: step: 340/466, loss: 0.01564904674887657 2023-01-24 03:24:49.822448: step: 342/466, loss: 0.059392645955085754 2023-01-24 03:24:50.486807: step: 344/466, loss: 0.006279794033616781 2023-01-24 03:24:51.136388: step: 346/466, loss: 0.09346354007720947 2023-01-24 03:24:51.745908: step: 348/466, loss: 0.03669194132089615 2023-01-24 03:24:52.410702: step: 350/466, loss: 0.04633399099111557 2023-01-24 03:24:53.022151: step: 352/466, loss: 0.014510966837406158 2023-01-24 03:24:53.756932: step: 354/466, loss: 0.04897434264421463 2023-01-24 03:24:54.387864: step: 356/466, loss: 0.011935423128306866 2023-01-24 03:24:55.071443: step: 358/466, loss: 0.02206057496368885 2023-01-24 03:24:55.680465: step: 360/466, loss: 0.17159996926784515 2023-01-24 03:24:56.301779: step: 362/466, loss: 0.014811763539910316 2023-01-24 03:24:56.939135: step: 364/466, loss: 0.014396698214113712 2023-01-24 03:24:57.585633: step: 366/466, loss: 0.02253626473248005 2023-01-24 03:24:58.226152: step: 368/466, loss: 0.03557388857007027 2023-01-24 03:24:58.870043: step: 370/466, loss: 0.014918249100446701 2023-01-24 03:24:59.410034: step: 372/466, loss: 0.7161222100257874 2023-01-24 03:25:00.068594: step: 374/466, loss: 0.0009834831580519676 2023-01-24 03:25:00.761963: step: 376/466, loss: 0.5186393857002258 2023-01-24 03:25:01.331082: step: 378/466, loss: 0.09099865704774857 2023-01-24 03:25:01.927956: 
step: 380/466, loss: 0.03353620693087578 2023-01-24 03:25:02.559495: step: 382/466, loss: 0.011328631080687046 2023-01-24 03:25:03.157997: step: 384/466, loss: 0.02683149091899395 2023-01-24 03:25:03.845523: step: 386/466, loss: 0.15679757297039032 2023-01-24 03:25:04.422246: step: 388/466, loss: 0.19017009437084198 2023-01-24 03:25:05.064657: step: 390/466, loss: 0.03918571397662163 2023-01-24 03:25:05.671524: step: 392/466, loss: 0.006072483025491238 2023-01-24 03:25:06.307082: step: 394/466, loss: 0.01014915481209755 2023-01-24 03:25:06.960582: step: 396/466, loss: 0.07179990410804749 2023-01-24 03:25:07.582100: step: 398/466, loss: 0.11546458303928375 2023-01-24 03:25:08.223935: step: 400/466, loss: 0.031725261360406876 2023-01-24 03:25:08.853994: step: 402/466, loss: 0.03558645024895668 2023-01-24 03:25:09.455562: step: 404/466, loss: 0.48780667781829834 2023-01-24 03:25:10.019083: step: 406/466, loss: 0.04735840857028961 2023-01-24 03:25:10.623127: step: 408/466, loss: 0.00975461583584547 2023-01-24 03:25:11.223974: step: 410/466, loss: 0.012872268445789814 2023-01-24 03:25:11.830865: step: 412/466, loss: 0.019132103770971298 2023-01-24 03:25:12.511059: step: 414/466, loss: 0.35264527797698975 2023-01-24 03:25:13.111173: step: 416/466, loss: 0.008495563641190529 2023-01-24 03:25:13.729231: step: 418/466, loss: 0.056623030453920364 2023-01-24 03:25:14.343733: step: 420/466, loss: 0.08087506145238876 2023-01-24 03:25:14.960631: step: 422/466, loss: 0.9835139513015747 2023-01-24 03:25:15.581413: step: 424/466, loss: 0.013827823102474213 2023-01-24 03:25:16.194094: step: 426/466, loss: 0.027738556265830994 2023-01-24 03:25:16.789252: step: 428/466, loss: 0.02511964738368988 2023-01-24 03:25:17.447551: step: 430/466, loss: 0.051109958440065384 2023-01-24 03:25:18.077411: step: 432/466, loss: 0.024527449160814285 2023-01-24 03:25:18.718010: step: 434/466, loss: 0.04220099374651909 2023-01-24 03:25:19.385825: step: 436/466, loss: 0.3841703534126282 2023-01-24 
03:25:19.986086: step: 438/466, loss: 0.03868027776479721 2023-01-24 03:25:20.594482: step: 440/466, loss: 0.009918006137013435 2023-01-24 03:25:21.220465: step: 442/466, loss: 0.02802487462759018 2023-01-24 03:25:21.866512: step: 444/466, loss: 0.03066260740160942 2023-01-24 03:25:22.414797: step: 446/466, loss: 0.12128327786922455 2023-01-24 03:25:23.088160: step: 448/466, loss: 0.013228870928287506 2023-01-24 03:25:23.731615: step: 450/466, loss: 0.0864589735865593 2023-01-24 03:25:24.333590: step: 452/466, loss: 0.22337065637111664 2023-01-24 03:25:24.960247: step: 454/466, loss: 0.031695976853370667 2023-01-24 03:25:25.575999: step: 456/466, loss: 0.07854624837636948 2023-01-24 03:25:26.164369: step: 458/466, loss: 0.0038878133054822683 2023-01-24 03:25:26.833992: step: 460/466, loss: 0.009189965203404427 2023-01-24 03:25:27.507315: step: 462/466, loss: 0.03963831812143326 2023-01-24 03:25:28.179735: step: 464/466, loss: 0.003491988405585289 2023-01-24 03:25:28.845246: step: 466/466, loss: 0.0387866348028183 2023-01-24 03:25:29.521319: step: 468/466, loss: 0.09318447858095169 2023-01-24 03:25:30.189308: step: 470/466, loss: 0.06316616386175156 2023-01-24 03:25:30.928649: step: 472/466, loss: 0.03835407271981239 2023-01-24 03:25:31.569672: step: 474/466, loss: 0.010601839981973171 2023-01-24 03:25:32.286178: step: 476/466, loss: 0.04524579644203186 2023-01-24 03:25:32.881077: step: 478/466, loss: 0.041294775903224945 2023-01-24 03:25:33.528776: step: 480/466, loss: 0.050560399889945984 2023-01-24 03:25:34.123351: step: 482/466, loss: 0.05223943665623665 2023-01-24 03:25:34.671803: step: 484/466, loss: 0.019072677940130234 2023-01-24 03:25:35.419787: step: 486/466, loss: 0.04862317070364952 2023-01-24 03:25:36.046616: step: 488/466, loss: 0.03726345673203468 2023-01-24 03:25:36.780206: step: 490/466, loss: 0.2885895073413849 2023-01-24 03:25:37.441694: step: 492/466, loss: 0.02119053155183792 2023-01-24 03:25:38.112931: step: 494/466, loss: 0.04115508496761322 
2023-01-24 03:25:38.763570: step: 496/466, loss: 0.041004154831171036 2023-01-24 03:25:39.340220: step: 498/466, loss: 0.03753751143813133 2023-01-24 03:25:39.890326: step: 500/466, loss: 0.00984681211411953 2023-01-24 03:25:40.569629: step: 502/466, loss: 0.0046401964500546455 2023-01-24 03:25:41.176124: step: 504/466, loss: 0.04446402192115784 2023-01-24 03:25:41.748549: step: 506/466, loss: 0.045399025082588196 2023-01-24 03:25:42.417404: step: 508/466, loss: 0.379410058259964 2023-01-24 03:25:43.032499: step: 510/466, loss: 0.04710298031568527 2023-01-24 03:25:43.666581: step: 512/466, loss: 0.0644146129488945 2023-01-24 03:25:44.238517: step: 514/466, loss: 0.08703358471393585 2023-01-24 03:25:44.841622: step: 516/466, loss: 0.0953550636768341 2023-01-24 03:25:45.495253: step: 518/466, loss: 0.01842649094760418 2023-01-24 03:25:46.109665: step: 520/466, loss: 0.002994328737258911 2023-01-24 03:25:46.703289: step: 522/466, loss: 0.01781976781785488 2023-01-24 03:25:47.314986: step: 524/466, loss: 0.03385910764336586 2023-01-24 03:25:47.976794: step: 526/466, loss: 0.02381124719977379 2023-01-24 03:25:48.648138: step: 528/466, loss: 0.008905943483114243 2023-01-24 03:25:49.278196: step: 530/466, loss: 0.013084372505545616 2023-01-24 03:25:49.923123: step: 532/466, loss: 0.06574474275112152 2023-01-24 03:25:50.597251: step: 534/466, loss: 0.20575186610221863 2023-01-24 03:25:51.225898: step: 536/466, loss: 0.18175360560417175 2023-01-24 03:25:51.840679: step: 538/466, loss: 0.3304038345813751 2023-01-24 03:25:52.544582: step: 540/466, loss: 0.06652913987636566 2023-01-24 03:25:53.209950: step: 542/466, loss: 0.04313105344772339 2023-01-24 03:25:53.838417: step: 544/466, loss: 0.023333299905061722 2023-01-24 03:25:54.524332: step: 546/466, loss: 0.006351171061396599 2023-01-24 03:25:55.194978: step: 548/466, loss: 0.02981061115860939 2023-01-24 03:25:55.825674: step: 550/466, loss: 0.09407296776771545 2023-01-24 03:25:56.483894: step: 552/466, loss: 
0.0001765866472851485 2023-01-24 03:25:57.113391: step: 554/466, loss: 0.037890564650297165 2023-01-24 03:25:57.692190: step: 556/466, loss: 0.04096215218305588 2023-01-24 03:25:58.322538: step: 558/466, loss: 0.07184533774852753 2023-01-24 03:25:59.042160: step: 560/466, loss: 0.01907883584499359 2023-01-24 03:25:59.756992: step: 562/466, loss: 0.03673747554421425 2023-01-24 03:26:00.404163: step: 564/466, loss: 0.007795785088092089 2023-01-24 03:26:01.044484: step: 566/466, loss: 0.21403954923152924 2023-01-24 03:26:01.618744: step: 568/466, loss: 0.003437581704929471 2023-01-24 03:26:02.220766: step: 570/466, loss: 0.048479288816452026 2023-01-24 03:26:02.812794: step: 572/466, loss: 0.042687129229307175 2023-01-24 03:26:03.420366: step: 574/466, loss: 0.06387472152709961 2023-01-24 03:26:04.024456: step: 576/466, loss: 0.005016942508518696 2023-01-24 03:26:04.742298: step: 578/466, loss: 0.010053220205008984 2023-01-24 03:26:05.374590: step: 580/466, loss: 0.003696274943649769 2023-01-24 03:26:05.951038: step: 582/466, loss: 0.024698616936802864 2023-01-24 03:26:06.532320: step: 584/466, loss: 0.025689121335744858 2023-01-24 03:26:07.104860: step: 586/466, loss: 0.159027099609375 2023-01-24 03:26:07.817825: step: 588/466, loss: 0.03976619243621826 2023-01-24 03:26:08.458246: step: 590/466, loss: 0.008023654110729694 2023-01-24 03:26:09.159752: step: 592/466, loss: 0.052323099225759506 2023-01-24 03:26:09.812158: step: 594/466, loss: 0.03495736047625542 2023-01-24 03:26:10.487611: step: 596/466, loss: 0.07690518349409103 2023-01-24 03:26:11.134453: step: 598/466, loss: 0.02207178808748722 2023-01-24 03:26:11.766601: step: 600/466, loss: 0.05315900593996048 2023-01-24 03:26:12.373058: step: 602/466, loss: 0.027270402759313583 2023-01-24 03:26:13.042068: step: 604/466, loss: 0.037802521139383316 2023-01-24 03:26:13.860426: step: 606/466, loss: 0.020048823207616806 2023-01-24 03:26:14.448084: step: 608/466, loss: 0.06595315784215927 2023-01-24 03:26:15.026748: 
step: 610/466, loss: 0.07635252922773361 2023-01-24 03:26:15.669220: step: 612/466, loss: 0.02917221561074257 2023-01-24 03:26:16.310742: step: 614/466, loss: 0.049803707748651505 2023-01-24 03:26:16.974275: step: 616/466, loss: 0.04995589330792427 2023-01-24 03:26:17.613189: step: 618/466, loss: 0.015341007150709629 2023-01-24 03:26:18.260994: step: 620/466, loss: 0.013207652606070042 2023-01-24 03:26:18.876575: step: 622/466, loss: 0.9710530042648315 2023-01-24 03:26:19.522934: step: 624/466, loss: 0.02297407202422619 2023-01-24 03:26:20.162270: step: 626/466, loss: 0.1126004084944725 2023-01-24 03:26:20.790882: step: 628/466, loss: 0.07128822803497314 2023-01-24 03:26:21.440161: step: 630/466, loss: 0.057949699461460114 2023-01-24 03:26:22.014287: step: 632/466, loss: 0.035416506230831146 2023-01-24 03:26:22.620543: step: 634/466, loss: 0.024575524032115936 2023-01-24 03:26:23.184021: step: 636/466, loss: 0.09601810574531555 2023-01-24 03:26:23.872458: step: 638/466, loss: 0.03678590804338455 2023-01-24 03:26:24.472490: step: 640/466, loss: 0.05857367068529129 2023-01-24 03:26:25.103329: step: 642/466, loss: 0.009968073107302189 2023-01-24 03:26:25.711394: step: 644/466, loss: 0.05171883851289749 2023-01-24 03:26:26.420811: step: 646/466, loss: 0.10127750039100647 2023-01-24 03:26:27.049409: step: 648/466, loss: 0.08442340046167374 2023-01-24 03:26:27.645350: step: 650/466, loss: 0.21667565405368805 2023-01-24 03:26:28.284606: step: 652/466, loss: 0.034365225583314896 2023-01-24 03:26:28.913688: step: 654/466, loss: 0.011747542768716812 2023-01-24 03:26:29.541630: step: 656/466, loss: 0.0651884526014328 2023-01-24 03:26:30.191141: step: 658/466, loss: 0.02953972853720188 2023-01-24 03:26:30.880945: step: 660/466, loss: 0.05717940255999565 2023-01-24 03:26:31.540516: step: 662/466, loss: 0.0911126658320427 2023-01-24 03:26:32.153479: step: 664/466, loss: 0.02355436235666275 2023-01-24 03:26:32.766463: step: 666/466, loss: 0.16068434715270996 2023-01-24 
03:26:33.389234: step: 668/466, loss: 0.011923057027161121 2023-01-24 03:26:34.061688: step: 670/466, loss: 0.01504041999578476 2023-01-24 03:26:34.717509: step: 672/466, loss: 0.10375463217496872 2023-01-24 03:26:35.374402: step: 674/466, loss: 0.060116566717624664 2023-01-24 03:26:35.991339: step: 676/466, loss: 0.023999834433197975 2023-01-24 03:26:36.612683: step: 678/466, loss: 0.013309174217283726 2023-01-24 03:26:37.225343: step: 680/466, loss: 0.011735780164599419 2023-01-24 03:26:37.864079: step: 682/466, loss: 0.20573440194129944 2023-01-24 03:26:38.429033: step: 684/466, loss: 0.03826363757252693 2023-01-24 03:26:39.055142: step: 686/466, loss: 0.038021162152290344 2023-01-24 03:26:39.664762: step: 688/466, loss: 0.05251982808113098 2023-01-24 03:26:40.287022: step: 690/466, loss: 0.13504108786582947 2023-01-24 03:26:40.904020: step: 692/466, loss: 0.09346406161785126 2023-01-24 03:26:41.591912: step: 694/466, loss: 0.007996978238224983 2023-01-24 03:26:42.183300: step: 696/466, loss: 0.007372912019491196 2023-01-24 03:26:42.780688: step: 698/466, loss: 0.05871805176138878 2023-01-24 03:26:43.402584: step: 700/466, loss: 0.026169583201408386 2023-01-24 03:26:43.998621: step: 702/466, loss: 0.08200190961360931 2023-01-24 03:26:44.613814: step: 704/466, loss: 0.08218079805374146 2023-01-24 03:26:45.202395: step: 706/466, loss: 0.002647866029292345 2023-01-24 03:26:45.834303: step: 708/466, loss: 0.03334326297044754 2023-01-24 03:26:46.459815: step: 710/466, loss: 0.015425094403326511 2023-01-24 03:26:47.100818: step: 712/466, loss: 0.033472705632448196 2023-01-24 03:26:47.675931: step: 714/466, loss: 0.09041909128427505 2023-01-24 03:26:48.320889: step: 716/466, loss: 0.021868692710995674 2023-01-24 03:26:48.915120: step: 718/466, loss: 0.00036438918323256075 2023-01-24 03:26:49.564742: step: 720/466, loss: 0.03918217122554779 2023-01-24 03:26:50.188595: step: 722/466, loss: 0.014676067046821117 2023-01-24 03:26:50.757743: step: 724/466, loss: 
0.0330355279147625 2023-01-24 03:26:51.293374: step: 726/466, loss: 0.10043784976005554 2023-01-24 03:26:51.913013: step: 728/466, loss: 0.016035128384828568 2023-01-24 03:26:52.545174: step: 730/466, loss: 0.13090096414089203 2023-01-24 03:26:53.175709: step: 732/466, loss: 0.016727568581700325 2023-01-24 03:26:53.803695: step: 734/466, loss: 0.016407720744609833 2023-01-24 03:26:54.452332: step: 736/466, loss: 0.3815709948539734 2023-01-24 03:26:55.099045: step: 738/466, loss: 0.013494301587343216 2023-01-24 03:26:55.827054: step: 740/466, loss: 0.033299703150987625 2023-01-24 03:26:56.480166: step: 742/466, loss: 0.004631971009075642 2023-01-24 03:26:57.034978: step: 744/466, loss: 0.0062673985958099365 2023-01-24 03:26:57.650228: step: 746/466, loss: 0.01861630566418171 2023-01-24 03:26:58.267119: step: 748/466, loss: 0.016963746398687363 2023-01-24 03:26:58.925491: step: 750/466, loss: 0.016340335831046104 2023-01-24 03:26:59.535059: step: 752/466, loss: 0.03189510852098465 2023-01-24 03:27:00.161946: step: 754/466, loss: 0.030143678188323975 2023-01-24 03:27:00.795553: step: 756/466, loss: 0.10187117755413055 2023-01-24 03:27:01.428879: step: 758/466, loss: 0.019785935059189796 2023-01-24 03:27:02.049822: step: 760/466, loss: 0.026150960475206375 2023-01-24 03:27:02.644205: step: 762/466, loss: 0.008185453712940216 2023-01-24 03:27:03.306333: step: 764/466, loss: 0.0060665239579975605 2023-01-24 03:27:03.904667: step: 766/466, loss: 0.03670055419206619 2023-01-24 03:27:04.530689: step: 768/466, loss: 0.004702350124716759 2023-01-24 03:27:05.151556: step: 770/466, loss: 0.023463036864995956 2023-01-24 03:27:05.763225: step: 772/466, loss: 0.08310149610042572 2023-01-24 03:27:06.313774: step: 774/466, loss: 0.018694311380386353 2023-01-24 03:27:06.942345: step: 776/466, loss: 0.1447852998971939 2023-01-24 03:27:07.526476: step: 778/466, loss: 0.025596708059310913 2023-01-24 03:27:08.247779: step: 780/466, loss: 0.4690394997596741 2023-01-24 03:27:08.805766: 
step: 782/466, loss: 0.012877174653112888 2023-01-24 03:27:09.427771: step: 784/466, loss: 0.10288326442241669 2023-01-24 03:27:10.017102: step: 786/466, loss: 0.03267328813672066 2023-01-24 03:27:10.679079: step: 788/466, loss: 0.08561872690916061 2023-01-24 03:27:11.335474: step: 790/466, loss: 0.015591911971569061 2023-01-24 03:27:11.979052: step: 792/466, loss: 0.021638771519064903 2023-01-24 03:27:12.591181: step: 794/466, loss: 0.02019706554710865 2023-01-24 03:27:13.156296: step: 796/466, loss: 0.01928093656897545 2023-01-24 03:27:13.755644: step: 798/466, loss: 0.04225790873169899 2023-01-24 03:27:14.346093: step: 800/466, loss: 0.00393377710133791 2023-01-24 03:27:14.935332: step: 802/466, loss: 0.06843379884958267 2023-01-24 03:27:15.597167: step: 804/466, loss: 0.01800605095922947 2023-01-24 03:27:16.199352: step: 806/466, loss: 0.01616629585623741 2023-01-24 03:27:16.840602: step: 808/466, loss: 0.024410026147961617 2023-01-24 03:27:17.570530: step: 810/466, loss: 0.004984854720532894 2023-01-24 03:27:18.152538: step: 812/466, loss: 0.020824674516916275 2023-01-24 03:27:18.860462: step: 814/466, loss: 0.021463720127940178 2023-01-24 03:27:19.450688: step: 816/466, loss: 0.07688427716493607 2023-01-24 03:27:20.014867: step: 818/466, loss: 0.033285610377788544 2023-01-24 03:27:20.638658: step: 820/466, loss: 0.039045482873916626 2023-01-24 03:27:21.294280: step: 822/466, loss: 0.007764583453536034 2023-01-24 03:27:21.926402: step: 824/466, loss: 0.10974142700433731 2023-01-24 03:27:22.541546: step: 826/466, loss: 0.05139411613345146 2023-01-24 03:27:23.191499: step: 828/466, loss: 0.41900506615638733 2023-01-24 03:27:23.819946: step: 830/466, loss: 0.05137622356414795 2023-01-24 03:27:24.393249: step: 832/466, loss: 0.0016587182180956006 2023-01-24 03:27:25.014756: step: 834/466, loss: 0.028809627518057823 2023-01-24 03:27:25.630774: step: 836/466, loss: 0.07262729853391647 2023-01-24 03:27:26.227709: step: 838/466, loss: 0.028702951967716217 2023-01-24 
03:27:26.874554: step: 840/466, loss: 0.02230178751051426 2023-01-24 03:27:27.571800: step: 842/466, loss: 0.03495894372463226 2023-01-24 03:27:28.137528: step: 844/466, loss: 0.008132087998092175 2023-01-24 03:27:28.792375: step: 846/466, loss: 0.04570523276925087 2023-01-24 03:27:29.401645: step: 848/466, loss: 0.024685293436050415 2023-01-24 03:27:30.053367: step: 850/466, loss: 0.016668280586600304 2023-01-24 03:27:30.635211: step: 852/466, loss: 0.08117682486772537 2023-01-24 03:27:31.256189: step: 854/466, loss: 0.08451499789953232 2023-01-24 03:27:31.846992: step: 856/466, loss: 0.0015832686331123114 2023-01-24 03:27:32.432248: step: 858/466, loss: 0.008496706373989582 2023-01-24 03:27:33.047490: step: 860/466, loss: 0.07007251679897308 2023-01-24 03:27:33.622209: step: 862/466, loss: 0.047850530594587326 2023-01-24 03:27:34.262398: step: 864/466, loss: 0.04546159505844116 2023-01-24 03:27:34.898256: step: 866/466, loss: 0.020945146679878235 2023-01-24 03:27:35.545201: step: 868/466, loss: 0.06443048268556595 2023-01-24 03:27:36.176669: step: 870/466, loss: 0.043095044791698456 2023-01-24 03:27:36.800657: step: 872/466, loss: 0.024777084589004517 2023-01-24 03:27:37.532630: step: 874/466, loss: 0.04186893627047539 2023-01-24 03:27:38.079195: step: 876/466, loss: 0.7887666821479797 2023-01-24 03:27:38.730030: step: 878/466, loss: 0.018086452037096024 2023-01-24 03:27:39.344752: step: 880/466, loss: 0.06803685426712036 2023-01-24 03:27:39.931532: step: 882/466, loss: 0.08622019737958908 2023-01-24 03:27:40.510859: step: 884/466, loss: 0.030558787286281586 2023-01-24 03:27:41.173125: step: 886/466, loss: 0.021821726113557816 2023-01-24 03:27:41.785123: step: 888/466, loss: 0.09157668799161911 2023-01-24 03:27:42.359493: step: 890/466, loss: 0.005649706348776817 2023-01-24 03:27:42.983416: step: 892/466, loss: 0.3497958183288574 2023-01-24 03:27:43.593328: step: 894/466, loss: 0.04933559149503708 2023-01-24 03:27:44.194805: step: 896/466, loss: 
0.029646283015608788 2023-01-24 03:27:44.809568: step: 898/466, loss: 0.006502463947981596 2023-01-24 03:27:45.360398: step: 900/466, loss: 0.013586016371846199 2023-01-24 03:27:45.976464: step: 902/466, loss: 0.023692291229963303 2023-01-24 03:27:46.620264: step: 904/466, loss: 0.010394468903541565 2023-01-24 03:27:47.231634: step: 906/466, loss: 7.299670696258545 2023-01-24 03:27:47.992309: step: 908/466, loss: 0.09700772166252136 2023-01-24 03:27:48.528524: step: 910/466, loss: 0.048023246228694916 2023-01-24 03:27:49.135787: step: 912/466, loss: 0.01920267567038536 2023-01-24 03:27:49.712031: step: 914/466, loss: 0.005680992268025875 2023-01-24 03:27:50.351999: step: 916/466, loss: 0.015114396810531616 2023-01-24 03:27:50.992393: step: 918/466, loss: 0.012126674875617027 2023-01-24 03:27:51.594253: step: 920/466, loss: 0.006995673291385174 2023-01-24 03:27:52.192158: step: 922/466, loss: 0.001748422160744667 2023-01-24 03:27:52.850810: step: 924/466, loss: 0.03409386798739433 2023-01-24 03:27:53.487675: step: 926/466, loss: 0.05706127732992172 2023-01-24 03:27:54.107181: step: 928/466, loss: 0.016984621062874794 2023-01-24 03:27:54.713714: step: 930/466, loss: 0.07859759777784348 2023-01-24 03:27:55.348250: step: 932/466, loss: 0.11002255231142044
==================================================
Loss: 0.093
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3528882632482729, 'r': 0.3287820441269487, 'f1': 0.34040891405678186}, 'combined': 0.2508276208839445, 'epoch': 24}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36160702182458143, 'r': 0.30744423846903807, 'f1': 0.33233326666517454}, 'combined': 0.22040755509400173, 'epoch': 24}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34940898345153665, 'r': 0.27992424242424246, 'f1': 0.3108307045215563}, 'combined': 0.20722046968103752, 'epoch': 24}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3681622173271595, 'r': 0.28393793190742805, 'f1': 0.3206109328974286}, 'combined': 0.2092408193646376, 'epoch': 24}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3414336248254084, 'r': 0.32329293887643035, 'f1': 0.33211574812452005}, 'combined': 0.24471686703912002, 'epoch': 24}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3544878566919009, 'r': 0.30292598662762443, 'f1': 0.32668488753959496}, 'combined': 0.21666147981900596, 'epoch': 24}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29059829059829057, 'r': 0.32380952380952377, 'f1': 0.3063063063063063}, 'combined': 0.20420420420420418, 'epoch': 24}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39285714285714285, 'r': 0.2391304347826087, 'f1': 0.2972972972972973}, 'combined': 0.1981981981981982, 'epoch': 24}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.09482758620689655, 'f1': 0.141025641025641}, 'combined': 0.09401709401709399, 'epoch': 24}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3478307038834952, 'r': 0.2714133522727273, 'f1': 0.30490691489361704}, 'combined': 0.20327127659574468, 'epoch': 20}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37366495205875794, 'r': 0.2779860369987299, 'f1': 0.3188014471929879}, 'combined': 0.20805989185226578, 'epoch': 20}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5441176470588235, 'r': 0.40217391304347827, 'f1': 0.46249999999999997}, 'combined': 0.3083333333333333, 'epoch': 20}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 25
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 03:30:26.696904: step: 2/466, loss: 1.2913615703582764 2023-01-24 03:30:27.368296: step: 4/466, loss: 0.221303328871727 2023-01-24 03:30:27.991401: step: 6/466, loss: 0.07709360867738724 2023-01-24 03:30:28.625044: step: 8/466, loss:
0.07634203135967255 2023-01-24 03:30:29.208916: step: 10/466, loss: 0.808277428150177 2023-01-24 03:30:29.869781: step: 12/466, loss: 0.02934054285287857 2023-01-24 03:30:30.508740: step: 14/466, loss: 0.044773418456315994 2023-01-24 03:30:31.106991: step: 16/466, loss: 0.0567832812666893 2023-01-24 03:30:31.708547: step: 18/466, loss: 0.04091653227806091 2023-01-24 03:30:32.382081: step: 20/466, loss: 0.012444142252206802 2023-01-24 03:30:32.944433: step: 22/466, loss: 0.004964071791619062 2023-01-24 03:30:33.570635: step: 24/466, loss: 0.0060002775862813 2023-01-24 03:30:34.220750: step: 26/466, loss: 0.7530148029327393 2023-01-24 03:30:34.881808: step: 28/466, loss: 0.010613384656608105 2023-01-24 03:30:35.471225: step: 30/466, loss: 0.0024078087881207466 2023-01-24 03:30:36.161184: step: 32/466, loss: 0.007756354752928019 2023-01-24 03:30:36.804718: step: 34/466, loss: 0.0068458979949355125 2023-01-24 03:30:37.468160: step: 36/466, loss: 0.04409642145037651 2023-01-24 03:30:38.088006: step: 38/466, loss: 0.0186622217297554 2023-01-24 03:30:38.692750: step: 40/466, loss: 0.0013015653239563107 2023-01-24 03:30:39.324655: step: 42/466, loss: 0.2598473131656647 2023-01-24 03:30:39.966302: step: 44/466, loss: 0.006549442186951637 2023-01-24 03:30:40.625657: step: 46/466, loss: 0.002103047212585807 2023-01-24 03:30:41.287654: step: 48/466, loss: 0.004661684390157461 2023-01-24 03:30:41.962017: step: 50/466, loss: 0.011915341019630432 2023-01-24 03:30:42.600259: step: 52/466, loss: 0.007606809958815575 2023-01-24 03:30:43.208431: step: 54/466, loss: 0.044846631586551666 2023-01-24 03:30:43.807564: step: 56/466, loss: 0.017281727865338326 2023-01-24 03:30:44.427170: step: 58/466, loss: 0.024489082396030426 2023-01-24 03:30:45.083211: step: 60/466, loss: 0.04151010513305664 2023-01-24 03:30:45.745986: step: 62/466, loss: 0.0005681780166924 2023-01-24 03:30:46.352967: step: 64/466, loss: 0.06110693886876106 2023-01-24 03:30:47.023911: step: 66/466, loss: 
0.1623503714799881 2023-01-24 03:30:47.640234: step: 68/466, loss: 0.012861186638474464 2023-01-24 03:30:48.288587: step: 70/466, loss: 0.018734971061348915 2023-01-24 03:30:48.940276: step: 72/466, loss: 0.04459035396575928 2023-01-24 03:30:49.565411: step: 74/466, loss: 0.05913710221648216 2023-01-24 03:30:50.140184: step: 76/466, loss: 0.01718541607260704 2023-01-24 03:30:50.698598: step: 78/466, loss: 0.0338447242975235 2023-01-24 03:30:51.313560: step: 80/466, loss: 0.009336519986391068 2023-01-24 03:30:51.934058: step: 82/466, loss: 0.012285180389881134 2023-01-24 03:30:52.548035: step: 84/466, loss: 0.044400352984666824 2023-01-24 03:30:53.227520: step: 86/466, loss: 0.0017768596298992634 2023-01-24 03:30:53.881233: step: 88/466, loss: 0.03392492234706879 2023-01-24 03:30:54.593241: step: 90/466, loss: 0.08839622884988785 2023-01-24 03:30:55.165243: step: 92/466, loss: 0.28095516562461853 2023-01-24 03:30:55.829480: step: 94/466, loss: 0.04762560501694679 2023-01-24 03:30:56.590530: step: 96/466, loss: 0.03430235758423805 2023-01-24 03:30:57.233619: step: 98/466, loss: 0.011945521458983421 2023-01-24 03:30:57.910077: step: 100/466, loss: 0.022856013849377632 2023-01-24 03:30:58.562275: step: 102/466, loss: 0.002989865140989423 2023-01-24 03:30:59.203479: step: 104/466, loss: 0.03537214174866676 2023-01-24 03:30:59.838859: step: 106/466, loss: 0.02911374531686306 2023-01-24 03:31:00.553524: step: 108/466, loss: 0.5798623561859131 2023-01-24 03:31:01.245167: step: 110/466, loss: 0.02080407552421093 2023-01-24 03:31:01.898379: step: 112/466, loss: 0.032129302620887756 2023-01-24 03:31:02.541469: step: 114/466, loss: 0.014343895949423313 2023-01-24 03:31:03.213538: step: 116/466, loss: 0.031172694638371468 2023-01-24 03:31:03.781705: step: 118/466, loss: 0.01042020134627819 2023-01-24 03:31:04.514595: step: 120/466, loss: 0.006632381118834019 2023-01-24 03:31:05.187851: step: 122/466, loss: 0.038836054503917694 2023-01-24 03:31:05.771160: step: 124/466, loss: 
0.008173519745469093 2023-01-24 03:31:06.373087: step: 126/466, loss: 0.016281452029943466 2023-01-24 03:31:07.036974: step: 128/466, loss: 0.0038052171003073454 2023-01-24 03:31:07.667887: step: 130/466, loss: 0.11949094384908676 2023-01-24 03:31:08.263211: step: 132/466, loss: 0.05344999581575394 2023-01-24 03:31:08.893955: step: 134/466, loss: 0.024968191981315613 2023-01-24 03:31:09.552944: step: 136/466, loss: 0.03751816600561142 2023-01-24 03:31:10.145591: step: 138/466, loss: 0.0008284052019007504 2023-01-24 03:31:10.803687: step: 140/466, loss: 0.11725995689630508 2023-01-24 03:31:11.446030: step: 142/466, loss: 0.0842091366648674 2023-01-24 03:31:12.073968: step: 144/466, loss: 0.0696955993771553 2023-01-24 03:31:12.702941: step: 146/466, loss: 0.00623112078756094 2023-01-24 03:31:13.346397: step: 148/466, loss: 0.5660732984542847 2023-01-24 03:31:13.930269: step: 150/466, loss: 0.02277527004480362 2023-01-24 03:31:14.525162: step: 152/466, loss: 0.014372216537594795 2023-01-24 03:31:15.175854: step: 154/466, loss: 0.04187621921300888 2023-01-24 03:31:15.793466: step: 156/466, loss: 3.977751839556731e-05 2023-01-24 03:31:16.460769: step: 158/466, loss: 0.10576049983501434 2023-01-24 03:31:17.066187: step: 160/466, loss: 0.026855265721678734 2023-01-24 03:31:17.708983: step: 162/466, loss: 0.05360988527536392 2023-01-24 03:31:18.294533: step: 164/466, loss: 0.002422438934445381 2023-01-24 03:31:18.936960: step: 166/466, loss: 0.046311236917972565 2023-01-24 03:31:19.596605: step: 168/466, loss: 0.004069761373102665 2023-01-24 03:31:20.203702: step: 170/466, loss: 0.002492958679795265 2023-01-24 03:31:20.745869: step: 172/466, loss: 0.04683367908000946 2023-01-24 03:31:21.427160: step: 174/466, loss: 0.0008087892201729119 2023-01-24 03:31:22.054239: step: 176/466, loss: 0.00553395040333271 2023-01-24 03:31:22.583259: step: 178/466, loss: 0.005937342531979084 2023-01-24 03:31:23.204687: step: 180/466, loss: 0.026208385825157166 2023-01-24 03:31:23.914318: 
step: 182/466, loss: 0.04297297075390816 2023-01-24 03:31:24.619170: step: 184/466, loss: 0.019133813679218292 2023-01-24 03:31:25.236916: step: 186/466, loss: 0.24795441329479218 2023-01-24 03:31:25.827961: step: 188/466, loss: 0.05760320648550987 2023-01-24 03:31:26.442798: step: 190/466, loss: 0.7904965281486511 2023-01-24 03:31:27.052850: step: 192/466, loss: 0.033945102244615555 2023-01-24 03:31:27.662659: step: 194/466, loss: 0.01732119917869568 2023-01-24 03:31:28.283739: step: 196/466, loss: 0.0867454931139946 2023-01-24 03:31:29.003306: step: 198/466, loss: 0.14356158673763275 2023-01-24 03:31:29.585527: step: 200/466, loss: 0.00874892994761467 2023-01-24 03:31:30.252590: step: 202/466, loss: 0.09583394229412079 2023-01-24 03:31:30.841465: step: 204/466, loss: 0.06581012904644012 2023-01-24 03:31:31.443552: step: 206/466, loss: 0.0020562163554131985 2023-01-24 03:31:32.143355: step: 208/466, loss: 0.026082372292876244 2023-01-24 03:31:32.783818: step: 210/466, loss: 0.0012509813532233238 2023-01-24 03:31:33.417917: step: 212/466, loss: 0.019496295601129532 2023-01-24 03:31:34.038700: step: 214/466, loss: 0.4355490207672119 2023-01-24 03:31:34.547330: step: 216/466, loss: 0.003379611298441887 2023-01-24 03:31:35.126512: step: 218/466, loss: 0.04011888802051544 2023-01-24 03:31:35.729312: step: 220/466, loss: 0.020177897065877914 2023-01-24 03:31:36.330986: step: 222/466, loss: 0.05331127345561981 2023-01-24 03:31:36.908744: step: 224/466, loss: 0.0049538989551365376 2023-01-24 03:31:37.512116: step: 226/466, loss: 0.02534342184662819 2023-01-24 03:31:38.158480: step: 228/466, loss: 0.02678047865629196 2023-01-24 03:31:38.788086: step: 230/466, loss: 0.08728460222482681 2023-01-24 03:31:39.449281: step: 232/466, loss: 0.12621189653873444 2023-01-24 03:31:40.080424: step: 234/466, loss: 0.08374364674091339 2023-01-24 03:31:40.725962: step: 236/466, loss: 0.19564181566238403 2023-01-24 03:31:41.292591: step: 238/466, loss: 0.00403230544179678 2023-01-24 
03:31:41.868547: step: 240/466, loss: 0.02667354792356491 2023-01-24 03:31:42.460318: step: 242/466, loss: 0.032107461243867874 2023-01-24 03:31:43.086837: step: 244/466, loss: 0.018708113580942154 2023-01-24 03:31:43.737587: step: 246/466, loss: 0.10217863321304321 2023-01-24 03:31:44.329084: step: 248/466, loss: 0.01089020911604166 2023-01-24 03:31:45.047303: step: 250/466, loss: 0.03414095938205719 2023-01-24 03:31:45.621264: step: 252/466, loss: 0.007417671848088503 2023-01-24 03:31:46.149933: step: 254/466, loss: 0.014984002336859703 2023-01-24 03:31:46.732912: step: 256/466, loss: 0.010449710302054882 2023-01-24 03:31:47.319813: step: 258/466, loss: 0.039140958338975906 2023-01-24 03:31:47.927273: step: 260/466, loss: 0.005764668807387352 2023-01-24 03:31:48.509302: step: 262/466, loss: 0.010224856436252594 2023-01-24 03:31:49.115356: step: 264/466, loss: 0.004748664330691099 2023-01-24 03:31:49.774248: step: 266/466, loss: 0.06514088809490204 2023-01-24 03:31:50.388288: step: 268/466, loss: 0.018356163054704666 2023-01-24 03:31:51.031890: step: 270/466, loss: 0.037056829780340195 2023-01-24 03:31:51.712926: step: 272/466, loss: 0.03318117931485176 2023-01-24 03:31:52.335170: step: 274/466, loss: 0.02692977897822857 2023-01-24 03:31:52.983059: step: 276/466, loss: 0.1166185736656189 2023-01-24 03:31:53.597014: step: 278/466, loss: 0.05034998804330826 2023-01-24 03:31:54.330694: step: 280/466, loss: 0.07291010767221451 2023-01-24 03:31:54.911783: step: 282/466, loss: 0.040174517780542374 2023-01-24 03:31:55.502432: step: 284/466, loss: 0.03222532942891121 2023-01-24 03:31:56.127449: step: 286/466, loss: 0.41899573802948 2023-01-24 03:31:56.741844: step: 288/466, loss: 0.01620565727353096 2023-01-24 03:31:57.314824: step: 290/466, loss: 0.0240196343511343 2023-01-24 03:31:57.941972: step: 292/466, loss: 0.012082924135029316 2023-01-24 03:31:58.541215: step: 294/466, loss: 0.11501817405223846 2023-01-24 03:31:59.126234: step: 296/466, loss: 0.004925146698951721 
2023-01-24 03:31:59.820961: step: 298/466, loss: 0.36350271105766296 2023-01-24 03:32:00.481057: step: 300/466, loss: 0.05301080644130707 2023-01-24 03:32:01.136110: step: 302/466, loss: 0.0018715260084718466 2023-01-24 03:32:01.770513: step: 304/466, loss: 0.030930697917938232 2023-01-24 03:32:02.428773: step: 306/466, loss: 0.02174968086183071 2023-01-24 03:32:03.017985: step: 308/466, loss: 0.0800512433052063 2023-01-24 03:32:03.583032: step: 310/466, loss: 0.03545597568154335 2023-01-24 03:32:04.228706: step: 312/466, loss: 0.05008088797330856 2023-01-24 03:32:04.900771: step: 314/466, loss: 0.14534986019134521 2023-01-24 03:32:05.515505: step: 316/466, loss: 0.03142093867063522 2023-01-24 03:32:06.176414: step: 318/466, loss: 0.02881583571434021 2023-01-24 03:32:06.907242: step: 320/466, loss: 0.01100649032741785 2023-01-24 03:32:07.594498: step: 322/466, loss: 0.028189118951559067 2023-01-24 03:32:08.259928: step: 324/466, loss: 0.010489918291568756 2023-01-24 03:32:08.907982: step: 326/466, loss: 0.04265473783016205 2023-01-24 03:32:09.752416: step: 328/466, loss: 0.1682056188583374 2023-01-24 03:32:10.312492: step: 330/466, loss: 0.019547518342733383 2023-01-24 03:32:10.856732: step: 332/466, loss: 0.021306661888957024 2023-01-24 03:32:11.539405: step: 334/466, loss: 0.04452720656991005 2023-01-24 03:32:12.197029: step: 336/466, loss: 0.0865069106221199 2023-01-24 03:32:12.823355: step: 338/466, loss: 0.040249284356832504 2023-01-24 03:32:13.433018: step: 340/466, loss: 0.01776963099837303 2023-01-24 03:32:14.120681: step: 342/466, loss: 0.06774012744426727 2023-01-24 03:32:14.772755: step: 344/466, loss: 0.0254667978733778 2023-01-24 03:32:15.446264: step: 346/466, loss: 0.015283987857401371 2023-01-24 03:32:16.016298: step: 348/466, loss: 0.038537781685590744 2023-01-24 03:32:16.673541: step: 350/466, loss: 0.09057411551475525 2023-01-24 03:32:17.288055: step: 352/466, loss: 0.051457326859235764 2023-01-24 03:32:17.886837: step: 354/466, loss: 
0.04070008918642998 2023-01-24 03:32:18.565201: step: 356/466, loss: 0.05365927889943123 2023-01-24 03:32:19.185372: step: 358/466, loss: 0.07253986597061157 2023-01-24 03:32:19.834447: step: 360/466, loss: 0.001430719392374158 2023-01-24 03:32:20.504020: step: 362/466, loss: 0.03338882699608803 2023-01-24 03:32:21.120559: step: 364/466, loss: 0.02356225810945034 2023-01-24 03:32:21.796275: step: 366/466, loss: 0.029113180935382843 2023-01-24 03:32:22.417739: step: 368/466, loss: 0.021528951823711395 2023-01-24 03:32:23.030173: step: 370/466, loss: 0.011167202144861221 2023-01-24 03:32:23.686744: step: 372/466, loss: 0.0653170719742775 2023-01-24 03:32:24.378749: step: 374/466, loss: 0.20392701029777527 2023-01-24 03:32:25.009977: step: 376/466, loss: 0.01070206705480814 2023-01-24 03:32:25.620709: step: 378/466, loss: 0.03856460750102997 2023-01-24 03:32:26.217338: step: 380/466, loss: 0.02206994593143463 2023-01-24 03:32:26.876888: step: 382/466, loss: 0.17710959911346436 2023-01-24 03:32:27.537022: step: 384/466, loss: 0.34115350246429443 2023-01-24 03:32:28.134311: step: 386/466, loss: 0.030216675251722336 2023-01-24 03:32:28.812499: step: 388/466, loss: 0.01600075513124466 2023-01-24 03:32:29.456400: step: 390/466, loss: 0.08036049455404282 2023-01-24 03:32:30.085953: step: 392/466, loss: 0.7863232493400574 2023-01-24 03:32:30.759119: step: 394/466, loss: 0.09372567385435104 2023-01-24 03:32:31.452171: step: 396/466, loss: 0.2711730897426605 2023-01-24 03:32:32.125733: step: 398/466, loss: 0.031892675906419754 2023-01-24 03:32:32.671267: step: 400/466, loss: 0.040052060037851334 2023-01-24 03:32:33.283488: step: 402/466, loss: 0.029966307803988457 2023-01-24 03:32:33.873973: step: 404/466, loss: 0.00928488653153181 2023-01-24 03:32:34.596444: step: 406/466, loss: 0.005141105968505144 2023-01-24 03:32:35.224524: step: 408/466, loss: 0.025026477873325348 2023-01-24 03:32:35.882599: step: 410/466, loss: 0.020624669268727303 2023-01-24 03:32:36.517658: step: 
412/466, loss: 0.015934638679027557 2023-01-24 03:32:37.113858: step: 414/466, loss: 0.2350529432296753 2023-01-24 03:32:37.726570: step: 416/466, loss: 0.03973572701215744 2023-01-24 03:32:38.388517: step: 418/466, loss: 0.028968505561351776 2023-01-24 03:32:38.984569: step: 420/466, loss: 0.020778411999344826 2023-01-24 03:32:39.564854: step: 422/466, loss: 0.3516223430633545 2023-01-24 03:32:40.196748: step: 424/466, loss: 0.13670171797275543 2023-01-24 03:32:40.822463: step: 426/466, loss: 0.0782453641295433 2023-01-24 03:32:41.507803: step: 428/466, loss: 0.007612935733050108 2023-01-24 03:32:42.169479: step: 430/466, loss: 0.009548550471663475 2023-01-24 03:32:42.783807: step: 432/466, loss: 0.10455300658941269 2023-01-24 03:32:43.368460: step: 434/466, loss: 0.005702846217900515 2023-01-24 03:32:43.973904: step: 436/466, loss: 0.00872819498181343 2023-01-24 03:32:44.506293: step: 438/466, loss: 0.0432143472135067 2023-01-24 03:32:45.165771: step: 440/466, loss: 0.0244740080088377 2023-01-24 03:32:45.849682: step: 442/466, loss: 0.036844585090875626 2023-01-24 03:32:46.514431: step: 444/466, loss: 0.038751665502786636 2023-01-24 03:32:47.079240: step: 446/466, loss: 0.028784453868865967 2023-01-24 03:32:47.797845: step: 448/466, loss: 0.051399070769548416 2023-01-24 03:32:48.382466: step: 450/466, loss: 0.014154715463519096 2023-01-24 03:32:48.979173: step: 452/466, loss: 0.008030824363231659 2023-01-24 03:32:49.525717: step: 454/466, loss: 0.05862676724791527 2023-01-24 03:32:50.211542: step: 456/466, loss: 0.006854075472801924 2023-01-24 03:32:50.828959: step: 458/466, loss: 0.0058186049573123455 2023-01-24 03:32:51.458067: step: 460/466, loss: 0.07675496488809586 2023-01-24 03:32:52.010612: step: 462/466, loss: 0.02009126916527748 2023-01-24 03:32:52.690937: step: 464/466, loss: 0.013251720927655697 2023-01-24 03:32:53.301904: step: 466/466, loss: 0.019120588898658752 2023-01-24 03:32:53.979478: step: 468/466, loss: 0.03249607980251312 2023-01-24 
03:32:54.551858: step: 470/466, loss: 0.030159030109643936 2023-01-24 03:32:55.184502: step: 472/466, loss: 0.36210766434669495 2023-01-24 03:32:55.820686: step: 474/466, loss: 0.01699150539934635 2023-01-24 03:32:56.488181: step: 476/466, loss: 0.0033404179848730564 2023-01-24 03:32:57.050165: step: 478/466, loss: 0.01301879994571209 2023-01-24 03:32:57.613072: step: 480/466, loss: 0.0035054609179496765 2023-01-24 03:32:58.223318: step: 482/466, loss: 0.021764883771538734 2023-01-24 03:32:58.851369: step: 484/466, loss: 0.015501120127737522 2023-01-24 03:32:59.472578: step: 486/466, loss: 0.026269903406500816 2023-01-24 03:33:00.079047: step: 488/466, loss: 0.0158759243786335 2023-01-24 03:33:00.733970: step: 490/466, loss: 0.0009323122794739902 2023-01-24 03:33:01.388950: step: 492/466, loss: 0.01813668943941593 2023-01-24 03:33:01.995895: step: 494/466, loss: 0.2687680423259735 2023-01-24 03:33:02.590945: step: 496/466, loss: 0.022268308326601982 2023-01-24 03:33:03.204829: step: 498/466, loss: 0.020895710214972496 2023-01-24 03:33:03.866634: step: 500/466, loss: 0.03320271894335747 2023-01-24 03:33:04.511802: step: 502/466, loss: 0.08355792611837387 2023-01-24 03:33:05.112907: step: 504/466, loss: 0.16657261550426483 2023-01-24 03:33:05.675103: step: 506/466, loss: 0.02144947461783886 2023-01-24 03:33:06.354799: step: 508/466, loss: 0.02311338298022747 2023-01-24 03:33:06.948064: step: 510/466, loss: 0.008023801259696484 2023-01-24 03:33:07.560228: step: 512/466, loss: 0.022611264139413834 2023-01-24 03:33:08.227252: step: 514/466, loss: 0.013177191838622093 2023-01-24 03:33:08.865477: step: 516/466, loss: 0.003986111376434565 2023-01-24 03:33:09.474528: step: 518/466, loss: 0.16800203919410706 2023-01-24 03:33:10.054111: step: 520/466, loss: 0.0060579185374081135 2023-01-24 03:33:10.706041: step: 522/466, loss: 0.029098467901349068 2023-01-24 03:33:11.335415: step: 524/466, loss: 0.029289137572050095 2023-01-24 03:33:11.948938: step: 526/466, loss: 
0.07186546176671982 2023-01-24 03:33:12.548298: step: 528/466, loss: 0.074771448969841 2023-01-24 03:33:13.126159: step: 530/466, loss: 0.005138975568115711 2023-01-24 03:33:13.758738: step: 532/466, loss: 0.03942608833312988 2023-01-24 03:33:14.396049: step: 534/466, loss: 0.005945540964603424 2023-01-24 03:33:15.032964: step: 536/466, loss: 0.028450069949030876 2023-01-24 03:33:15.665245: step: 538/466, loss: 0.010474931448698044 2023-01-24 03:33:16.331483: step: 540/466, loss: 0.011315548792481422 2023-01-24 03:33:17.008691: step: 542/466, loss: 0.07123514264822006 2023-01-24 03:33:17.641189: step: 544/466, loss: 0.020416492596268654 2023-01-24 03:33:18.260940: step: 546/466, loss: 0.04518275335431099 2023-01-24 03:33:18.891943: step: 548/466, loss: 0.05349896848201752 2023-01-24 03:33:19.579992: step: 550/466, loss: 0.03100154921412468 2023-01-24 03:33:20.201446: step: 552/466, loss: 0.02530047297477722 2023-01-24 03:33:20.818272: step: 554/466, loss: 0.009493943303823471 2023-01-24 03:33:21.407333: step: 556/466, loss: 0.18496611714363098 2023-01-24 03:33:22.047890: step: 558/466, loss: 0.027933640405535698 2023-01-24 03:33:22.704440: step: 560/466, loss: 0.04296899959445 2023-01-24 03:33:23.316371: step: 562/466, loss: 0.038775667548179626 2023-01-24 03:33:23.907659: step: 564/466, loss: 0.004564823117107153 2023-01-24 03:33:24.522243: step: 566/466, loss: 0.03634239733219147 2023-01-24 03:33:25.159073: step: 568/466, loss: 0.027375012636184692 2023-01-24 03:33:25.789808: step: 570/466, loss: 0.008710439316928387 2023-01-24 03:33:26.421712: step: 572/466, loss: 0.02768733724951744 2023-01-24 03:33:27.113827: step: 574/466, loss: 0.05412431061267853 2023-01-24 03:33:27.726906: step: 576/466, loss: 0.008412402123212814 2023-01-24 03:33:28.374694: step: 578/466, loss: 0.06098160147666931 2023-01-24 03:33:28.983387: step: 580/466, loss: 0.07021938264369965 2023-01-24 03:33:29.589739: step: 582/466, loss: 0.06752918660640717 2023-01-24 03:33:30.226306: step: 
584/466, loss: 0.05675654858350754 2023-01-24 03:33:30.834056: step: 586/466, loss: 0.025131892412900925 2023-01-24 03:33:31.401923: step: 588/466, loss: 0.026210255920886993 2023-01-24 03:33:32.086263: step: 590/466, loss: 0.025238752365112305 2023-01-24 03:33:32.712953: step: 592/466, loss: 0.0021176671143621206 2023-01-24 03:33:33.323825: step: 594/466, loss: 0.06994631141424179 2023-01-24 03:33:33.997326: step: 596/466, loss: 0.13406339287757874 2023-01-24 03:33:34.531692: step: 598/466, loss: 0.04605163633823395 2023-01-24 03:33:35.169642: step: 600/466, loss: 0.04097013548016548 2023-01-24 03:33:35.749271: step: 602/466, loss: 0.039376720786094666 2023-01-24 03:33:36.377488: step: 604/466, loss: 0.05856088921427727 2023-01-24 03:33:37.024930: step: 606/466, loss: 0.004100484307855368 2023-01-24 03:33:37.620011: step: 608/466, loss: 0.00830951239913702 2023-01-24 03:33:38.273899: step: 610/466, loss: 0.03391300514340401 2023-01-24 03:33:38.841322: step: 612/466, loss: 0.04908227175474167 2023-01-24 03:33:39.410568: step: 614/466, loss: 0.006420582998543978 2023-01-24 03:33:40.009928: step: 616/466, loss: 0.05737851932644844 2023-01-24 03:33:40.621521: step: 618/466, loss: 0.004047843161970377 2023-01-24 03:33:41.217403: step: 620/466, loss: 0.018269887194037437 2023-01-24 03:33:41.798292: step: 622/466, loss: 0.015939846634864807 2023-01-24 03:33:42.390324: step: 624/466, loss: 0.11771312355995178 2023-01-24 03:33:43.043324: step: 626/466, loss: 0.23366926610469818 2023-01-24 03:33:43.719579: step: 628/466, loss: 0.14914771914482117 2023-01-24 03:33:44.414275: step: 630/466, loss: 0.056283481419086456 2023-01-24 03:33:45.065660: step: 632/466, loss: 0.055580224841833115 2023-01-24 03:33:45.748129: step: 634/466, loss: 0.00767617579549551 2023-01-24 03:33:46.314654: step: 636/466, loss: 0.004347812384366989 2023-01-24 03:33:46.979836: step: 638/466, loss: 0.018882201984524727 2023-01-24 03:33:47.575948: step: 640/466, loss: 0.010219431482255459 2023-01-24 
03:33:48.148469: step: 642/466, loss: 0.056204766035079956 2023-01-24 03:33:48.762824: step: 644/466, loss: 0.02552511729300022 2023-01-24 03:33:49.332813: step: 646/466, loss: 0.15252268314361572 2023-01-24 03:33:49.949493: step: 648/466, loss: 0.2137022316455841 2023-01-24 03:33:50.569767: step: 650/466, loss: 0.05027611553668976 2023-01-24 03:33:51.238897: step: 652/466, loss: 0.011146695353090763 2023-01-24 03:33:51.874435: step: 654/466, loss: 0.0017963236896321177 2023-01-24 03:33:52.456825: step: 656/466, loss: 0.008355779573321342 2023-01-24 03:33:53.121997: step: 658/466, loss: 0.011687744408845901 2023-01-24 03:33:53.776643: step: 660/466, loss: 0.017062479630112648 2023-01-24 03:33:54.407532: step: 662/466, loss: 0.027931544929742813 2023-01-24 03:33:55.045886: step: 664/466, loss: 0.0728331133723259 2023-01-24 03:33:55.671970: step: 666/466, loss: 0.040246982127428055 2023-01-24 03:33:56.318126: step: 668/466, loss: 0.0050750491209328175 2023-01-24 03:33:56.917250: step: 670/466, loss: 0.0024562934413552284 2023-01-24 03:33:57.580006: step: 672/466, loss: 0.0371859148144722 2023-01-24 03:33:58.248530: step: 674/466, loss: 0.014029420912265778 2023-01-24 03:33:58.911276: step: 676/466, loss: 0.008747943677008152 2023-01-24 03:33:59.511174: step: 678/466, loss: 0.08006829023361206 2023-01-24 03:34:00.063033: step: 680/466, loss: 0.0167327169328928 2023-01-24 03:34:00.716046: step: 682/466, loss: 0.05692826956510544 2023-01-24 03:34:01.318136: step: 684/466, loss: 0.007294619921594858 2023-01-24 03:34:01.957964: step: 686/466, loss: 0.0033462075516581535 2023-01-24 03:34:02.702065: step: 688/466, loss: 0.023898642510175705 2023-01-24 03:34:03.330826: step: 690/466, loss: 0.013447328470647335 2023-01-24 03:34:03.949490: step: 692/466, loss: 0.02081284299492836 2023-01-24 03:34:04.543513: step: 694/466, loss: 0.5867592692375183 2023-01-24 03:34:05.167074: step: 696/466, loss: 0.05716710537672043 2023-01-24 03:34:05.809936: step: 698/466, loss: 
0.014385833404958248 2023-01-24 03:34:06.464560: step: 700/466, loss: 0.14949959516525269 2023-01-24 03:34:07.038980: step: 702/466, loss: 0.0021207034587860107 2023-01-24 03:34:07.627842: step: 704/466, loss: 0.013642613776028156 2023-01-24 03:34:08.229497: step: 706/466, loss: 0.050527218729257584 2023-01-24 03:34:08.899991: step: 708/466, loss: 0.09773019701242447 2023-01-24 03:34:09.528232: step: 710/466, loss: 0.007971648126840591 2023-01-24 03:34:10.234247: step: 712/466, loss: 0.04106210917234421 2023-01-24 03:34:10.897605: step: 714/466, loss: 0.033593036234378815 2023-01-24 03:34:11.517258: step: 716/466, loss: 0.005900632124394178 2023-01-24 03:34:12.118718: step: 718/466, loss: 0.025367496535182 2023-01-24 03:34:12.730763: step: 720/466, loss: 0.13995912671089172 2023-01-24 03:34:13.337133: step: 722/466, loss: 0.07217904180288315 2023-01-24 03:34:14.045499: step: 724/466, loss: 0.04147706553339958 2023-01-24 03:34:14.618616: step: 726/466, loss: 0.012126031331717968 2023-01-24 03:34:15.273124: step: 728/466, loss: 0.002041777828708291 2023-01-24 03:34:15.882310: step: 730/466, loss: 0.096912682056427 2023-01-24 03:34:16.532803: step: 732/466, loss: 0.13700343668460846 2023-01-24 03:34:17.118107: step: 734/466, loss: 0.015220297500491142 2023-01-24 03:34:17.761387: step: 736/466, loss: 0.003426916664466262 2023-01-24 03:34:18.428520: step: 738/466, loss: 0.02217986062169075 2023-01-24 03:34:19.074173: step: 740/466, loss: 0.013844664208590984 2023-01-24 03:34:19.704513: step: 742/466, loss: 0.015300248749554157 2023-01-24 03:34:20.322302: step: 744/466, loss: 1.4703454971313477 2023-01-24 03:34:20.949984: step: 746/466, loss: 0.04135194048285484 2023-01-24 03:34:21.605773: step: 748/466, loss: 0.027552183717489243 2023-01-24 03:34:22.227852: step: 750/466, loss: 0.28672534227371216 2023-01-24 03:34:22.884086: step: 752/466, loss: 0.011509292759001255 2023-01-24 03:34:23.471984: step: 754/466, loss: 0.02878592722117901 2023-01-24 03:34:24.155540: step: 
756/466, loss: 0.01280699297785759 2023-01-24 03:34:24.832812: step: 758/466, loss: 0.009594520553946495 2023-01-24 03:34:25.518454: step: 760/466, loss: 0.03990183770656586 2023-01-24 03:34:26.142222: step: 762/466, loss: 0.04524315893650055 2023-01-24 03:34:26.778394: step: 764/466, loss: 0.08061192184686661 2023-01-24 03:34:27.363350: step: 766/466, loss: 0.02775973081588745 2023-01-24 03:34:27.957788: step: 768/466, loss: 0.02547493390738964 2023-01-24 03:34:28.566877: step: 770/466, loss: 0.06113841384649277 2023-01-24 03:34:29.237180: step: 772/466, loss: 0.05478772148489952 2023-01-24 03:34:29.925724: step: 774/466, loss: 0.021239129826426506 2023-01-24 03:34:30.612113: step: 776/466, loss: 0.002169389743357897 2023-01-24 03:34:31.249996: step: 778/466, loss: 0.07251054793596268 2023-01-24 03:34:31.919590: step: 780/466, loss: 0.02980934828519821 2023-01-24 03:34:32.586237: step: 782/466, loss: 0.007325939834117889 2023-01-24 03:34:33.161122: step: 784/466, loss: 0.02687842957675457 2023-01-24 03:34:33.850531: step: 786/466, loss: 0.022026900202035904 2023-01-24 03:34:34.487251: step: 788/466, loss: 0.03064543381333351 2023-01-24 03:34:35.121650: step: 790/466, loss: 0.051075346767902374 2023-01-24 03:34:35.767942: step: 792/466, loss: 0.014683965593576431 2023-01-24 03:34:36.337083: step: 794/466, loss: 0.0023933525662869215 2023-01-24 03:34:36.986835: step: 796/466, loss: 0.0172136090695858 2023-01-24 03:34:37.633272: step: 798/466, loss: 0.021011296659708023 2023-01-24 03:34:38.282066: step: 800/466, loss: 0.06336275488138199 2023-01-24 03:34:38.916815: step: 802/466, loss: 0.16190217435359955 2023-01-24 03:34:39.553631: step: 804/466, loss: 0.03494153544306755 2023-01-24 03:34:40.135823: step: 806/466, loss: 0.04694707319140434 2023-01-24 03:34:40.750052: step: 808/466, loss: 0.018555352464318275 2023-01-24 03:34:41.367956: step: 810/466, loss: 0.026234649121761322 2023-01-24 03:34:41.982592: step: 812/466, loss: 0.029778089374303818 2023-01-24 
03:34:42.619699: step: 814/466, loss: 0.01820109598338604 2023-01-24 03:34:43.192577: step: 816/466, loss: 0.0023990797344595194 2023-01-24 03:34:43.844284: step: 818/466, loss: 0.024855811148881912 2023-01-24 03:34:44.444442: step: 820/466, loss: 0.03238849341869354 2023-01-24 03:34:45.049966: step: 822/466, loss: 0.04709932580590248 2023-01-24 03:34:45.640454: step: 824/466, loss: 0.019101126119494438 2023-01-24 03:34:46.265390: step: 826/466, loss: 0.03390325605869293 2023-01-24 03:34:46.899951: step: 828/466, loss: 0.08838388323783875 2023-01-24 03:34:47.503696: step: 830/466, loss: 0.04158073291182518 2023-01-24 03:34:48.126870: step: 832/466, loss: 0.48680511116981506 2023-01-24 03:34:48.800730: step: 834/466, loss: 0.011362655088305473 2023-01-24 03:34:49.422361: step: 836/466, loss: 0.043221790343523026 2023-01-24 03:34:50.079401: step: 838/466, loss: 0.029000479727983475 2023-01-24 03:34:50.880330: step: 840/466, loss: 0.1502661556005478 2023-01-24 03:34:51.526097: step: 842/466, loss: 0.09018251299858093 2023-01-24 03:34:52.208340: step: 844/466, loss: 0.03327755257487297 2023-01-24 03:34:52.858972: step: 846/466, loss: 0.034349311143159866 2023-01-24 03:34:53.522390: step: 848/466, loss: 0.5977291464805603 2023-01-24 03:34:54.160828: step: 850/466, loss: 0.0321338064968586 2023-01-24 03:34:54.756005: step: 852/466, loss: 0.09711363166570663 2023-01-24 03:34:55.365951: step: 854/466, loss: 0.017812050879001617 2023-01-24 03:34:55.919731: step: 856/466, loss: 0.013304860331118107 2023-01-24 03:34:56.517768: step: 858/466, loss: 0.017864806577563286 2023-01-24 03:34:57.211643: step: 860/466, loss: 0.025627808645367622 2023-01-24 03:34:57.777255: step: 862/466, loss: 0.04636122286319733 2023-01-24 03:34:58.392391: step: 864/466, loss: 0.2754199802875519 2023-01-24 03:34:59.069671: step: 866/466, loss: 0.02035515569150448 2023-01-24 03:34:59.668498: step: 868/466, loss: 0.019799085333943367 2023-01-24 03:35:00.309535: step: 870/466, loss: 0.014882412739098072 
2023-01-24 03:35:00.980540: step: 872/466, loss: 0.13393734395503998 2023-01-24 03:35:01.613281: step: 874/466, loss: 0.057462695986032486 2023-01-24 03:35:02.277413: step: 876/466, loss: 0.018368752673268318 2023-01-24 03:35:02.848000: step: 878/466, loss: 0.04251131787896156 2023-01-24 03:35:03.465964: step: 880/466, loss: 0.03908228501677513 2023-01-24 03:35:04.101844: step: 882/466, loss: 0.0012620191555470228 2023-01-24 03:35:04.798036: step: 884/466, loss: 0.07473523169755936 2023-01-24 03:35:05.461979: step: 886/466, loss: 0.04367680475115776 2023-01-24 03:35:06.084143: step: 888/466, loss: 0.005734715610742569 2023-01-24 03:35:06.689435: step: 890/466, loss: 0.018414320424199104 2023-01-24 03:35:07.335470: step: 892/466, loss: 0.028936324641108513 2023-01-24 03:35:07.940784: step: 894/466, loss: 0.022568896412849426 2023-01-24 03:35:08.512383: step: 896/466, loss: 0.013084076344966888 2023-01-24 03:35:09.106151: step: 898/466, loss: 0.0422712080180645 2023-01-24 03:35:09.731641: step: 900/466, loss: 0.034558240324258804 2023-01-24 03:35:10.325937: step: 902/466, loss: 0.03358054161071777 2023-01-24 03:35:10.897190: step: 904/466, loss: 0.06838488578796387 2023-01-24 03:35:11.511825: step: 906/466, loss: 0.03336134925484657 2023-01-24 03:35:12.133058: step: 908/466, loss: 0.024106288328766823 2023-01-24 03:35:12.763946: step: 910/466, loss: 0.1128292828798294 2023-01-24 03:35:13.418724: step: 912/466, loss: 0.26297375559806824 2023-01-24 03:35:14.012801: step: 914/466, loss: 0.0409359335899353 2023-01-24 03:35:14.621102: step: 916/466, loss: 0.6932467222213745 2023-01-24 03:35:15.239285: step: 918/466, loss: 0.15438520908355713 2023-01-24 03:35:15.912183: step: 920/466, loss: 0.011742820963263512 2023-01-24 03:35:16.519941: step: 922/466, loss: 0.021806012839078903 2023-01-24 03:35:17.203390: step: 924/466, loss: 0.19510570168495178 2023-01-24 03:35:17.913496: step: 926/466, loss: 0.01388524565845728 2023-01-24 03:35:18.542340: step: 928/466, loss: 
0.008038608357310295 2023-01-24 03:35:19.152736: step: 930/466, loss: 0.019992290064692497 2023-01-24 03:35:19.808784: step: 932/466, loss: 0.21347476541996002 ================================================== Loss: 0.068 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3500995901261859, 'r': 0.34345633414656185, 'f1': 0.3467461457763182}, 'combined': 0.2554971600457081, 'epoch': 25} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3737478786852443, 'r': 0.3065572488092453, 'f1': 0.33683450795089914}, 'combined': 0.2233928342886792, 'epoch': 25} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33157153761954145, 'r': 0.3309423696164683, 'f1': 0.33125665486776595}, 'combined': 0.24408385095519594, 'epoch': 25} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36147812462678597, 'r': 0.297521890954744, 'f1': 0.3263965178599618}, 'combined': 0.21647022946152905, 'epoch': 25} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3145833333333333, 'r': 0.35952380952380947, 'f1': 0.3355555555555555}, 'combined': 0.22370370370370365, 'epoch': 25} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 
'epoch': 25} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4166666666666667, 'r': 0.12931034482758622, 'f1': 0.19736842105263158}, 'combined': 0.13157894736842105, 'epoch': 25} New best korean model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 
'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 26 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 03:37:58.930854: step: 2/466, loss: 0.045414477586746216 2023-01-24 03:37:59.532672: step: 4/466, loss: 0.03918834775686264 2023-01-24 03:38:00.181233: step: 6/466, loss: 0.33598366379737854 2023-01-24 03:38:00.768810: step: 8/466, loss: 0.3022470772266388 2023-01-24 03:38:01.369046: step: 10/466, loss: 0.012483332306146622 2023-01-24 03:38:01.950643: step: 12/466, loss: 0.024693988263607025 2023-01-24 03:38:02.642808: step: 14/466, loss: 0.024169620126485825 2023-01-24 03:38:03.325226: step: 16/466, loss: 0.09472887217998505 2023-01-24 03:38:03.929747: step: 18/466, loss: 0.008662912994623184 2023-01-24 03:38:04.516533: step: 20/466, loss: 0.008214856497943401 2023-01-24 03:38:05.121996: step: 22/466, loss: 0.022339563816785812 2023-01-24 03:38:05.689149: step: 24/466, loss: 0.03240164369344711 2023-01-24 03:38:06.296527: step: 26/466, loss: 0.04471060633659363 2023-01-24 03:38:06.961687: step: 28/466, loss: 0.009255090728402138 2023-01-24 03:38:07.639810: step: 30/466, loss: 0.020481910556554794 2023-01-24 03:38:08.238941: step: 32/466, loss: 0.06160318851470947 2023-01-24 03:38:08.864552: step: 34/466, loss: 0.03593392297625542 2023-01-24 03:38:09.520814: step: 36/466, loss: 0.021053927019238472 2023-01-24 03:38:10.103240: step: 38/466, loss: 0.014367097057402134 2023-01-24 03:38:10.692404: step: 40/466, loss: 0.004586027003824711 
2023-01-24 03:38:11.280074: step: 42/466, loss: 0.009572326205670834 2023-01-24 03:38:11.914844: step: 44/466, loss: 0.013665699400007725 2023-01-24 03:38:12.451483: step: 46/466, loss: 0.0024054343812167645 2023-01-24 03:38:13.164684: step: 48/466, loss: 0.06742464751005173 2023-01-24 03:38:13.787225: step: 50/466, loss: 2.3411319255828857 2023-01-24 03:38:14.442016: step: 52/466, loss: 0.15515947341918945 2023-01-24 03:38:15.091509: step: 54/466, loss: 0.018642693758010864 2023-01-24 03:38:15.765930: step: 56/466, loss: 0.007206358015537262 2023-01-24 03:38:16.348759: step: 58/466, loss: 0.025296475738286972 2023-01-24 03:38:16.986006: step: 60/466, loss: 0.02713521011173725 2023-01-24 03:38:17.576058: step: 62/466, loss: 0.002757714129984379 2023-01-24 03:38:18.214591: step: 64/466, loss: 0.0047846343368291855 2023-01-24 03:38:19.006902: step: 66/466, loss: 0.04622029885649681 2023-01-24 03:38:19.615713: step: 68/466, loss: 0.028843289241194725 2023-01-24 03:38:20.284216: step: 70/466, loss: 0.07008807361125946 2023-01-24 03:38:20.857751: step: 72/466, loss: 0.013329028151929379 2023-01-24 03:38:21.437967: step: 74/466, loss: 0.06537533551454544 2023-01-24 03:38:22.073774: step: 76/466, loss: 0.11617883294820786 2023-01-24 03:38:22.717085: step: 78/466, loss: 0.0019994976464658976 2023-01-24 03:38:23.392336: step: 80/466, loss: 0.010550200939178467 2023-01-24 03:38:24.041606: step: 82/466, loss: 0.025649353861808777 2023-01-24 03:38:24.677872: step: 84/466, loss: 0.003584762569516897 2023-01-24 03:38:25.311247: step: 86/466, loss: 0.007658570073544979 2023-01-24 03:38:25.877590: step: 88/466, loss: 0.014431743882596493 2023-01-24 03:38:26.507384: step: 90/466, loss: 0.035823170095682144 2023-01-24 03:38:27.133971: step: 92/466, loss: 0.026814866811037064 2023-01-24 03:38:27.730664: step: 94/466, loss: 0.047312382608652115 2023-01-24 03:38:28.278084: step: 96/466, loss: 0.028254160657525063 2023-01-24 03:38:28.901856: step: 98/466, loss: 0.042886678129434586 
2023-01-24 03:38:29.530331: step: 100/466, loss: 0.03463972359895706 2023-01-24 03:38:30.141273: step: 102/466, loss: 0.016591209918260574 2023-01-24 03:38:30.727409: step: 104/466, loss: 0.010787018574774265 2023-01-24 03:38:31.331941: step: 106/466, loss: 0.024628739804029465 2023-01-24 03:38:31.918607: step: 108/466, loss: 0.0013052689610049129 2023-01-24 03:38:32.610069: step: 110/466, loss: 0.008944116532802582 2023-01-24 03:38:33.214955: step: 112/466, loss: 0.01800631731748581 2023-01-24 03:38:33.855935: step: 114/466, loss: 0.01092657633125782 2023-01-24 03:38:34.466911: step: 116/466, loss: 0.056671883910894394 2023-01-24 03:38:35.070559: step: 118/466, loss: 0.05620972812175751 2023-01-24 03:38:35.679534: step: 120/466, loss: 0.09269988536834717 2023-01-24 03:38:36.300205: step: 122/466, loss: 0.03398977220058441 2023-01-24 03:38:36.919051: step: 124/466, loss: 0.005538527388125658 2023-01-24 03:38:37.568370: step: 126/466, loss: 0.019748879596590996 2023-01-24 03:38:38.143947: step: 128/466, loss: 0.012078657746315002 2023-01-24 03:38:38.762720: step: 130/466, loss: 0.022150034084916115 2023-01-24 03:38:39.380802: step: 132/466, loss: 0.02786128781735897 2023-01-24 03:38:39.997438: step: 134/466, loss: 0.040518466383218765 2023-01-24 03:38:40.673079: step: 136/466, loss: 0.011007298715412617 2023-01-24 03:38:41.341919: step: 138/466, loss: 0.01236250065267086 2023-01-24 03:38:41.954919: step: 140/466, loss: 0.00545964390039444 2023-01-24 03:38:42.613625: step: 142/466, loss: 0.013405879959464073 2023-01-24 03:38:43.192981: step: 144/466, loss: 0.019116604700684547 2023-01-24 03:38:43.801327: step: 146/466, loss: 0.024427859112620354 2023-01-24 03:38:44.431122: step: 148/466, loss: 0.03811703994870186 2023-01-24 03:38:44.991966: step: 150/466, loss: 0.00464991107583046 2023-01-24 03:38:45.580950: step: 152/466, loss: 4.555197715759277 2023-01-24 03:38:46.191036: step: 154/466, loss: 0.017821406945586205 2023-01-24 03:38:46.825756: step: 156/466, loss: 
0.1571727991104126 2023-01-24 03:38:47.422729: step: 158/466, loss: 0.04710790142416954 2023-01-24 03:38:48.065747: step: 160/466, loss: 0.033908210694789886 2023-01-24 03:38:48.671291: step: 162/466, loss: 0.022364942356944084 2023-01-24 03:38:49.338434: step: 164/466, loss: 0.19430869817733765 2023-01-24 03:38:49.962810: step: 166/466, loss: 0.003512404393404722 2023-01-24 03:38:50.545604: step: 168/466, loss: 0.043155040591955185 2023-01-24 03:38:51.116324: step: 170/466, loss: 0.00416438328102231 2023-01-24 03:38:51.775703: step: 172/466, loss: 0.03939838707447052 2023-01-24 03:38:52.340269: step: 174/466, loss: 0.03198012337088585 2023-01-24 03:38:52.932423: step: 176/466, loss: 0.003328982973471284 2023-01-24 03:38:53.517335: step: 178/466, loss: 0.03389064222574234 2023-01-24 03:38:54.105748: step: 180/466, loss: 0.009292037226259708 2023-01-24 03:38:54.732585: step: 182/466, loss: 0.0263590719550848 2023-01-24 03:38:55.407804: step: 184/466, loss: 0.01964017190039158 2023-01-24 03:38:56.024301: step: 186/466, loss: 0.05610666796565056 2023-01-24 03:38:56.711018: step: 188/466, loss: 0.06703130900859833 2023-01-24 03:38:57.350511: step: 190/466, loss: 0.007344461977481842 2023-01-24 03:38:57.935827: step: 192/466, loss: 0.004412634763866663 2023-01-24 03:38:58.592973: step: 194/466, loss: 0.011533022858202457 2023-01-24 03:38:59.208728: step: 196/466, loss: 0.08746032416820526 2023-01-24 03:38:59.850472: step: 198/466, loss: 0.00856334250420332 2023-01-24 03:39:00.484264: step: 200/466, loss: 0.038558684289455414 2023-01-24 03:39:01.106211: step: 202/466, loss: 0.028453713282942772 2023-01-24 03:39:01.755213: step: 204/466, loss: 0.38487887382507324 2023-01-24 03:39:02.451572: step: 206/466, loss: 0.008262191899120808 2023-01-24 03:39:03.064766: step: 208/466, loss: 0.0030298414640128613 2023-01-24 03:39:03.707385: step: 210/466, loss: 0.05420486629009247 2023-01-24 03:39:04.259048: step: 212/466, loss: 0.016683876514434814 2023-01-24 03:39:04.910338: step: 
214/466, loss: 0.05660427734255791 2023-01-24 03:39:05.507783: step: 216/466, loss: 0.08958832919597626 2023-01-24 03:39:06.128671: step: 218/466, loss: 0.012682387605309486 2023-01-24 03:39:06.755016: step: 220/466, loss: 0.016953550279140472 2023-01-24 03:39:07.399833: step: 222/466, loss: 0.017252590507268906 2023-01-24 03:39:07.981428: step: 224/466, loss: 0.011150977574288845 2023-01-24 03:39:08.577406: step: 226/466, loss: 0.006136606447398663 2023-01-24 03:39:09.134675: step: 228/466, loss: 0.008039235137403011 2023-01-24 03:39:09.754007: step: 230/466, loss: 0.0220362339168787 2023-01-24 03:39:10.463065: step: 232/466, loss: 0.014643709175288677 2023-01-24 03:39:11.078522: step: 234/466, loss: 0.22404278814792633 2023-01-24 03:39:11.740674: step: 236/466, loss: 0.04199331998825073 2023-01-24 03:39:12.310240: step: 238/466, loss: 0.0051811812445521355 2023-01-24 03:39:12.994478: step: 240/466, loss: 0.02466060034930706 2023-01-24 03:39:13.640936: step: 242/466, loss: 0.11676938086748123 2023-01-24 03:39:14.188365: step: 244/466, loss: 1.998118204937782e-05 2023-01-24 03:39:14.862475: step: 246/466, loss: 0.4189917743206024 2023-01-24 03:39:15.496021: step: 248/466, loss: 0.025706855580210686 2023-01-24 03:39:16.091239: step: 250/466, loss: 0.3023975193500519 2023-01-24 03:39:16.799407: step: 252/466, loss: 0.04882144555449486 2023-01-24 03:39:17.428870: step: 254/466, loss: 0.008105840533971786 2023-01-24 03:39:18.101414: step: 256/466, loss: 5.065145523985848e-05 2023-01-24 03:39:18.785688: step: 258/466, loss: 0.057812340557575226 2023-01-24 03:39:19.399301: step: 260/466, loss: 0.14420419931411743 2023-01-24 03:39:19.948879: step: 262/466, loss: 0.001418450498022139 2023-01-24 03:39:20.575363: step: 264/466, loss: 0.03700948879122734 2023-01-24 03:39:21.232400: step: 266/466, loss: 0.009782583452761173 2023-01-24 03:39:21.826828: step: 268/466, loss: 0.06596330553293228 2023-01-24 03:39:22.442652: step: 270/466, loss: 0.0538744255900383 2023-01-24 
03:39:23.109866: step: 272/466, loss: 0.04572300612926483 2023-01-24 03:39:23.725884: step: 274/466, loss: 0.027452930808067322 2023-01-24 03:39:24.357893: step: 276/466, loss: 0.0159525778144598 2023-01-24 03:39:24.967408: step: 278/466, loss: 0.001892161089926958 2023-01-24 03:39:25.633378: step: 280/466, loss: 0.029880749061703682 2023-01-24 03:39:26.248948: step: 282/466, loss: 0.004739616997539997 2023-01-24 03:39:26.927181: step: 284/466, loss: 0.033471375703811646 2023-01-24 03:39:27.556969: step: 286/466, loss: 0.05095507577061653 2023-01-24 03:39:28.210621: step: 288/466, loss: 0.041758857667446136 2023-01-24 03:39:28.846633: step: 290/466, loss: 0.015397449024021626 2023-01-24 03:39:29.485740: step: 292/466, loss: 0.013284178450703621 2023-01-24 03:39:30.184503: step: 294/466, loss: 0.21159973740577698 2023-01-24 03:39:30.833406: step: 296/466, loss: 0.018651289865374565 2023-01-24 03:39:31.498743: step: 298/466, loss: 0.040743838995695114 2023-01-24 03:39:32.111167: step: 300/466, loss: 0.029271796345710754 2023-01-24 03:39:32.802730: step: 302/466, loss: 0.04050165042281151 2023-01-24 03:39:33.364031: step: 304/466, loss: 0.08837659657001495 2023-01-24 03:39:33.995155: step: 306/466, loss: 0.12287881225347519 2023-01-24 03:39:34.589977: step: 308/466, loss: 1.0752689838409424 2023-01-24 03:39:35.238011: step: 310/466, loss: 0.06996869295835495 2023-01-24 03:39:35.894665: step: 312/466, loss: 0.016347158700227737 2023-01-24 03:39:36.572829: step: 314/466, loss: 0.0714387372136116 2023-01-24 03:39:37.162000: step: 316/466, loss: 0.04747457429766655 2023-01-24 03:39:37.786479: step: 318/466, loss: 0.025179000571370125 2023-01-24 03:39:38.419722: step: 320/466, loss: 0.11100359261035919 2023-01-24 03:39:39.079240: step: 322/466, loss: 0.039913929998874664 2023-01-24 03:39:39.731135: step: 324/466, loss: 0.07095092535018921 2023-01-24 03:39:40.347055: step: 326/466, loss: 0.009094692766666412 2023-01-24 03:39:41.045926: step: 328/466, loss: 
0.015984689816832542 2023-01-24 03:39:41.675481: step: 330/466, loss: 0.029985295608639717 2023-01-24 03:39:42.263464: step: 332/466, loss: 0.018034322187304497 2023-01-24 03:39:42.825491: step: 334/466, loss: 0.052487436681985855 2023-01-24 03:39:43.426642: step: 336/466, loss: 0.011984365992248058 2023-01-24 03:39:44.051716: step: 338/466, loss: 0.028638869524002075 2023-01-24 03:39:44.646615: step: 340/466, loss: 0.4480482339859009 2023-01-24 03:39:45.216243: step: 342/466, loss: 0.005989938974380493 2023-01-24 03:39:45.754302: step: 344/466, loss: 0.03386138007044792 2023-01-24 03:39:46.427714: step: 346/466, loss: 0.007416254375129938 2023-01-24 03:39:47.117918: step: 348/466, loss: 0.10737764090299606 2023-01-24 03:39:47.718829: step: 350/466, loss: 0.02148372493684292 2023-01-24 03:39:48.372600: step: 352/466, loss: 0.00944596342742443 2023-01-24 03:39:49.025313: step: 354/466, loss: 0.03134218230843544 2023-01-24 03:39:49.613282: step: 356/466, loss: 0.023554576560854912 2023-01-24 03:39:50.209740: step: 358/466, loss: 0.006292336154729128 2023-01-24 03:39:50.770913: step: 360/466, loss: 0.011793393641710281 2023-01-24 03:39:51.393158: step: 362/466, loss: 0.048451006412506104 2023-01-24 03:39:52.001514: step: 364/466, loss: 0.13643038272857666 2023-01-24 03:39:52.589602: step: 366/466, loss: 0.0035364192444831133 2023-01-24 03:39:53.223718: step: 368/466, loss: 0.02475653775036335 2023-01-24 03:39:53.847048: step: 370/466, loss: 0.03772176057100296 2023-01-24 03:39:54.523955: step: 372/466, loss: 0.06777799874544144 2023-01-24 03:39:55.158116: step: 374/466, loss: 0.12316180020570755 2023-01-24 03:39:55.720122: step: 376/466, loss: 0.06544934213161469 2023-01-24 03:39:56.375081: step: 378/466, loss: 0.024594111368060112 2023-01-24 03:39:56.962885: step: 380/466, loss: 0.015402672812342644 2023-01-24 03:39:57.600526: step: 382/466, loss: 0.021741464734077454 2023-01-24 03:39:58.207014: step: 384/466, loss: 0.03798363730311394 2023-01-24 03:39:58.796379: 
step: 386/466, loss: 0.00046246900456026196 2023-01-24 03:39:59.442250: step: 388/466, loss: 0.020787667483091354 2023-01-24 03:40:00.074220: step: 390/466, loss: 0.020255569368600845 2023-01-24 03:40:00.881043: step: 392/466, loss: 0.3557094633579254 2023-01-24 03:40:01.461503: step: 394/466, loss: 0.010126069188117981 2023-01-24 03:40:02.139894: step: 396/466, loss: 0.005736412014812231 2023-01-24 03:40:02.756160: step: 398/466, loss: 0.032208964228630066 2023-01-24 03:40:03.453230: step: 400/466, loss: 0.3010401725769043 2023-01-24 03:40:04.081903: step: 402/466, loss: 0.28091520071029663 2023-01-24 03:40:04.671850: step: 404/466, loss: 0.0113126365467906 2023-01-24 03:40:05.283433: step: 406/466, loss: 0.031300827860832214 2023-01-24 03:40:05.918049: step: 408/466, loss: 0.8436365127563477 2023-01-24 03:40:06.569890: step: 410/466, loss: 0.022062180563807487 2023-01-24 03:40:07.210111: step: 412/466, loss: 0.01742429845035076 2023-01-24 03:40:07.809597: step: 414/466, loss: 0.09323088824748993 2023-01-24 03:40:08.462184: step: 416/466, loss: 0.02583315037190914 2023-01-24 03:40:09.084012: step: 418/466, loss: 0.02988622523844242 2023-01-24 03:40:09.719418: step: 420/466, loss: 0.0024579628370702267 2023-01-24 03:40:10.367806: step: 422/466, loss: 0.004086425062268972 2023-01-24 03:40:11.082728: step: 424/466, loss: 0.03260628879070282 2023-01-24 03:40:11.710037: step: 426/466, loss: 0.049904391169548035 2023-01-24 03:40:12.322059: step: 428/466, loss: 0.0036581484600901604 2023-01-24 03:40:12.964341: step: 430/466, loss: 0.06395833939313889 2023-01-24 03:40:13.653598: step: 432/466, loss: 0.048287052661180496 2023-01-24 03:40:14.233237: step: 434/466, loss: 0.0005487494054250419 2023-01-24 03:40:14.837603: step: 436/466, loss: 0.3673534691333771 2023-01-24 03:40:15.419766: step: 438/466, loss: 0.005553035531193018 2023-01-24 03:40:16.001096: step: 440/466, loss: 0.03513813391327858 2023-01-24 03:40:16.625330: step: 442/466, loss: 0.0010807143989950418 
2023-01-24 03:40:17.254154: step: 444/466, loss: 0.00991811603307724 2023-01-24 03:40:17.944477: step: 446/466, loss: 0.35704296827316284 2023-01-24 03:40:18.577444: step: 448/466, loss: 0.0004759306611958891 2023-01-24 03:40:19.180612: step: 450/466, loss: 0.013852309435606003 2023-01-24 03:40:19.911002: step: 452/466, loss: 0.055704496800899506 2023-01-24 03:40:20.470947: step: 454/466, loss: 0.05234242230653763 2023-01-24 03:40:21.152210: step: 456/466, loss: 0.029207957908511162 2023-01-24 03:40:21.811717: step: 458/466, loss: 0.010664550587534904 2023-01-24 03:40:22.411939: step: 460/466, loss: 0.09975343197584152 2023-01-24 03:40:23.190104: step: 462/466, loss: 0.17967922985553741 2023-01-24 03:40:23.878729: step: 464/466, loss: 0.001165557187050581 2023-01-24 03:40:24.526023: step: 466/466, loss: 0.01934705674648285 2023-01-24 03:40:25.135721: step: 468/466, loss: 0.009555082768201828 2023-01-24 03:40:25.825689: step: 470/466, loss: 0.004604933317750692 2023-01-24 03:40:26.413700: step: 472/466, loss: 0.014080885797739029 2023-01-24 03:40:27.059364: step: 474/466, loss: 0.042255330830812454 2023-01-24 03:40:27.708225: step: 476/466, loss: 0.02824373170733452 2023-01-24 03:40:28.340989: step: 478/466, loss: 0.014276370406150818 2023-01-24 03:40:28.932921: step: 480/466, loss: 0.033427610993385315 2023-01-24 03:40:29.546063: step: 482/466, loss: 0.006971611641347408 2023-01-24 03:40:30.143955: step: 484/466, loss: 0.03468838706612587 2023-01-24 03:40:30.754037: step: 486/466, loss: 0.008514195680618286 2023-01-24 03:40:31.349202: step: 488/466, loss: 0.21864424645900726 2023-01-24 03:40:32.023624: step: 490/466, loss: 0.2849474847316742 2023-01-24 03:40:32.753815: step: 492/466, loss: 0.005178901366889477 2023-01-24 03:40:33.409322: step: 494/466, loss: 0.010462756268680096 2023-01-24 03:40:34.001952: step: 496/466, loss: 0.10011930763721466 2023-01-24 03:40:34.671618: step: 498/466, loss: 0.034411054104566574 2023-01-24 03:40:35.251635: step: 500/466, loss: 
0.002200616290792823 2023-01-24 03:40:35.919294: step: 502/466, loss: 0.0178882647305727 2023-01-24 03:40:36.596738: step: 504/466, loss: 0.07477042824029922 2023-01-24 03:40:37.306127: step: 506/466, loss: 0.007386498153209686 2023-01-24 03:40:37.903095: step: 508/466, loss: 0.052657391875982285 2023-01-24 03:40:38.548983: step: 510/466, loss: 3.171741247177124 2023-01-24 03:40:39.211551: step: 512/466, loss: 0.038998816162347794 2023-01-24 03:40:39.845752: step: 514/466, loss: 0.010146564804017544 2023-01-24 03:40:40.424055: step: 516/466, loss: 0.00945010595023632 2023-01-24 03:40:41.191705: step: 518/466, loss: 0.09256458282470703 2023-01-24 03:40:41.866000: step: 520/466, loss: 0.0418822318315506 2023-01-24 03:40:42.406121: step: 522/466, loss: 0.027969488874077797 2023-01-24 03:40:43.061462: step: 524/466, loss: 0.012059665285050869 2023-01-24 03:40:43.681511: step: 526/466, loss: 0.01905934512615204 2023-01-24 03:40:44.285301: step: 528/466, loss: 0.05664241686463356 2023-01-24 03:40:44.937444: step: 530/466, loss: 0.017221873626112938 2023-01-24 03:40:45.546849: step: 532/466, loss: 0.03145075589418411 2023-01-24 03:40:46.187196: step: 534/466, loss: 0.011874757707118988 2023-01-24 03:40:46.862315: step: 536/466, loss: 0.19178661704063416 2023-01-24 03:40:47.454039: step: 538/466, loss: 0.01207797322422266 2023-01-24 03:40:48.133263: step: 540/466, loss: 0.03249429538846016 2023-01-24 03:40:48.748573: step: 542/466, loss: 0.03537470102310181 2023-01-24 03:40:49.374067: step: 544/466, loss: 0.04688917100429535 2023-01-24 03:40:50.017810: step: 546/466, loss: 0.008935499005019665 2023-01-24 03:40:50.629384: step: 548/466, loss: 0.005889051128178835 2023-01-24 03:40:51.247668: step: 550/466, loss: 0.04385385289788246 2023-01-24 03:40:51.887475: step: 552/466, loss: 0.0009785328293219209 2023-01-24 03:40:52.525027: step: 554/466, loss: 0.008239595219492912 2023-01-24 03:40:53.154222: step: 556/466, loss: 0.6651448607444763 2023-01-24 03:40:53.828274: step: 
558/466, loss: 0.11339122802019119 2023-01-24 03:40:54.417894: step: 560/466, loss: 0.015921475365757942 2023-01-24 03:40:55.035001: step: 562/466, loss: 0.03143639117479324 2023-01-24 03:40:55.756853: step: 564/466, loss: 0.05940140783786774 2023-01-24 03:40:56.389690: step: 566/466, loss: 0.08028946816921234 2023-01-24 03:40:57.034454: step: 568/466, loss: 0.010268310084939003 2023-01-24 03:40:57.706969: step: 570/466, loss: 0.019668761640787125 2023-01-24 03:40:58.329855: step: 572/466, loss: 0.01937008835375309 2023-01-24 03:40:58.980062: step: 574/466, loss: 0.03339478373527527 2023-01-24 03:40:59.617279: step: 576/466, loss: 0.04097174480557442 2023-01-24 03:41:00.255690: step: 578/466, loss: 0.052022628486156464 2023-01-24 03:41:00.833071: step: 580/466, loss: 0.0010472792200744152 2023-01-24 03:41:01.472336: step: 582/466, loss: 0.08285340666770935 2023-01-24 03:41:02.149501: step: 584/466, loss: 0.05154658481478691 2023-01-24 03:41:02.768142: step: 586/466, loss: 0.013209663331508636 2023-01-24 03:41:03.445298: step: 588/466, loss: 0.06628485023975372 2023-01-24 03:41:04.093592: step: 590/466, loss: 0.007210195530205965 2023-01-24 03:41:04.659600: step: 592/466, loss: 0.002422439167276025 2023-01-24 03:41:05.282118: step: 594/466, loss: 0.008276514708995819 2023-01-24 03:41:05.910800: step: 596/466, loss: 0.028346292674541473 2023-01-24 03:41:06.468407: step: 598/466, loss: 0.01611250266432762 2023-01-24 03:41:07.140763: step: 600/466, loss: 0.007911566644906998 2023-01-24 03:41:07.773110: step: 602/466, loss: 0.0665612444281578 2023-01-24 03:41:08.417992: step: 604/466, loss: 0.029158849269151688 2023-01-24 03:41:08.996101: step: 606/466, loss: 0.05460003763437271 2023-01-24 03:41:09.631237: step: 608/466, loss: 0.003954010549932718 2023-01-24 03:41:10.235366: step: 610/466, loss: 0.03006788343191147 2023-01-24 03:41:10.900382: step: 612/466, loss: 0.026150286197662354 2023-01-24 03:41:11.527764: step: 614/466, loss: 0.028111020103096962 2023-01-24 
03:41:12.142244: step: 616/466, loss: 0.036183565855026245 2023-01-24 03:41:12.814244: step: 618/466, loss: 0.02873288281261921 2023-01-24 03:41:13.413257: step: 620/466, loss: 0.006896775681525469 2023-01-24 03:41:14.098923: step: 622/466, loss: 0.03590833768248558 2023-01-24 03:41:14.786899: step: 624/466, loss: 0.0337134450674057 2023-01-24 03:41:15.452304: step: 626/466, loss: 0.05777391046285629 2023-01-24 03:41:16.056222: step: 628/466, loss: 0.0067988913506269455 2023-01-24 03:41:16.665650: step: 630/466, loss: 0.0537460558116436 2023-01-24 03:41:17.314364: step: 632/466, loss: 0.012341397814452648 2023-01-24 03:41:17.871820: step: 634/466, loss: 0.08001233637332916 2023-01-24 03:41:18.457200: step: 636/466, loss: 0.1843031942844391 2023-01-24 03:41:19.040791: step: 638/466, loss: 0.038092005997896194 2023-01-24 03:41:19.677872: step: 640/466, loss: 0.07952667027711868 2023-01-24 03:41:20.290251: step: 642/466, loss: 0.0035366981755942106 2023-01-24 03:41:20.966060: step: 644/466, loss: 0.5054967403411865 2023-01-24 03:41:21.599533: step: 646/466, loss: 0.010712129063904285 2023-01-24 03:41:22.368209: step: 648/466, loss: 0.3804076611995697 2023-01-24 03:41:22.961022: step: 650/466, loss: 0.09094398468732834 2023-01-24 03:41:23.672439: step: 652/466, loss: 0.13170231878757477 2023-01-24 03:41:24.284893: step: 654/466, loss: 0.012623382732272148 2023-01-24 03:41:24.895112: step: 656/466, loss: 0.012589056976139545 2023-01-24 03:41:25.514550: step: 658/466, loss: 0.011988475918769836 2023-01-24 03:41:26.120852: step: 660/466, loss: 0.0673658475279808 2023-01-24 03:41:26.748444: step: 662/466, loss: 0.019919512793421745 2023-01-24 03:41:27.337398: step: 664/466, loss: 0.06564678996801376 2023-01-24 03:41:27.948232: step: 666/466, loss: 0.001861299155279994 2023-01-24 03:41:28.707149: step: 668/466, loss: 0.007484862580895424 2023-01-24 03:41:29.324156: step: 670/466, loss: 0.029324056580662727 2023-01-24 03:41:29.993398: step: 672/466, loss: 0.00355354230850935 
2023-01-24 03:41:30.674960: step: 674/466, loss: 0.009839470498263836 2023-01-24 03:41:31.275604: step: 676/466, loss: 0.10285638272762299 2023-01-24 03:41:31.877854: step: 678/466, loss: 0.02608523704111576 2023-01-24 03:41:32.502163: step: 680/466, loss: 0.029285017400979996 2023-01-24 03:41:33.166486: step: 682/466, loss: 0.042839415371418 2023-01-24 03:41:33.817803: step: 684/466, loss: 0.052892055362463 2023-01-24 03:41:34.470277: step: 686/466, loss: 0.024776723235845566 2023-01-24 03:41:35.158736: step: 688/466, loss: 0.07299697399139404 2023-01-24 03:41:35.753751: step: 690/466, loss: 0.03305630758404732 2023-01-24 03:41:36.427489: step: 692/466, loss: 0.022931385785341263 2023-01-24 03:41:37.062804: step: 694/466, loss: 0.23795311152935028 2023-01-24 03:41:37.694851: step: 696/466, loss: 0.0009287637658417225 2023-01-24 03:41:38.345636: step: 698/466, loss: 0.06150897219777107 2023-01-24 03:41:38.912337: step: 700/466, loss: 0.025871066376566887 2023-01-24 03:41:39.533725: step: 702/466, loss: 0.038701131939888 2023-01-24 03:41:40.128498: step: 704/466, loss: 0.004738318733870983 2023-01-24 03:41:40.738126: step: 706/466, loss: 0.017376113682985306 2023-01-24 03:41:41.352647: step: 708/466, loss: 0.010453774593770504 2023-01-24 03:41:42.021216: step: 710/466, loss: 0.003291316330432892 2023-01-24 03:41:42.573698: step: 712/466, loss: 0.2283499538898468 2023-01-24 03:41:43.160184: step: 714/466, loss: 0.030491085723042488 2023-01-24 03:41:43.777389: step: 716/466, loss: 0.044772472232580185 2023-01-24 03:41:44.395847: step: 718/466, loss: 0.024380743503570557 2023-01-24 03:41:45.056264: step: 720/466, loss: 0.1536007523536682 2023-01-24 03:41:45.701554: step: 722/466, loss: 0.23740418255329132 2023-01-24 03:41:46.325477: step: 724/466, loss: 0.01061465684324503 2023-01-24 03:41:46.947109: step: 726/466, loss: 0.024473462253808975 2023-01-24 03:41:47.568263: step: 728/466, loss: 0.00921687949448824 2023-01-24 03:41:48.187973: step: 730/466, loss: 
0.026211457327008247 2023-01-24 03:41:48.872660: step: 732/466, loss: 0.0797419473528862 2023-01-24 03:41:49.395848: step: 734/466, loss: 0.0002902036940213293 2023-01-24 03:41:50.063859: step: 736/466, loss: 0.08485154062509537 2023-01-24 03:41:50.674596: step: 738/466, loss: 0.010234514251351357 2023-01-24 03:41:51.318209: step: 740/466, loss: 0.021577196195721626 2023-01-24 03:41:51.945498: step: 742/466, loss: 0.06836622208356857 2023-01-24 03:41:52.569268: step: 744/466, loss: 0.045496616512537 2023-01-24 03:41:53.312927: step: 746/466, loss: 0.027603916823863983 2023-01-24 03:41:53.902946: step: 748/466, loss: 0.006657042074948549 2023-01-24 03:41:54.527435: step: 750/466, loss: 0.004206167533993721 2023-01-24 03:41:55.123013: step: 752/466, loss: 0.00930891465395689 2023-01-24 03:41:55.848517: step: 754/466, loss: 0.015324637293815613 2023-01-24 03:41:56.463730: step: 756/466, loss: 0.005557236261665821 2023-01-24 03:41:57.064795: step: 758/466, loss: 0.0005388028803281486 2023-01-24 03:41:57.741070: step: 760/466, loss: 0.007462238892912865 2023-01-24 03:41:58.462925: step: 762/466, loss: 0.05765986815094948 2023-01-24 03:41:59.093635: step: 764/466, loss: 0.00529530318453908 2023-01-24 03:41:59.776462: step: 766/466, loss: 0.22407777607440948 2023-01-24 03:42:00.396662: step: 768/466, loss: 0.01470536831766367 2023-01-24 03:42:01.044505: step: 770/466, loss: 0.004962172359228134 2023-01-24 03:42:01.648914: step: 772/466, loss: 0.02007342502474785 2023-01-24 03:42:02.233744: step: 774/466, loss: 0.01086547039449215 2023-01-24 03:42:02.818776: step: 776/466, loss: 0.01764582097530365 2023-01-24 03:42:03.449373: step: 778/466, loss: 0.03523987904191017 2023-01-24 03:42:04.088714: step: 780/466, loss: 0.0007762151653878391 2023-01-24 03:42:04.726167: step: 782/466, loss: 0.01184089295566082 2023-01-24 03:42:05.294892: step: 784/466, loss: 0.005018068943172693 2023-01-24 03:42:05.899685: step: 786/466, loss: 0.04472871869802475 2023-01-24 03:42:06.475592: step: 
788/466, loss: 0.015511106699705124 2023-01-24 03:42:07.087246: step: 790/466, loss: 0.2705400884151459 2023-01-24 03:42:07.691822: step: 792/466, loss: 0.026682354509830475 2023-01-24 03:42:08.298270: step: 794/466, loss: 0.08655679225921631 2023-01-24 03:42:08.960927: step: 796/466, loss: 0.026306811720132828 2023-01-24 03:42:09.541986: step: 798/466, loss: 0.0008450224995613098 2023-01-24 03:42:10.207733: step: 800/466, loss: 0.03993954509496689 2023-01-24 03:42:10.865269: step: 802/466, loss: 0.02803313173353672 2023-01-24 03:42:11.535378: step: 804/466, loss: 0.04920896887779236 2023-01-24 03:42:12.215762: step: 806/466, loss: 0.00908689759671688 2023-01-24 03:42:12.756225: step: 808/466, loss: 0.007737348787486553 2023-01-24 03:42:13.354174: step: 810/466, loss: 0.01188454870134592 2023-01-24 03:42:13.965808: step: 812/466, loss: 0.10305050015449524 2023-01-24 03:42:14.618109: step: 814/466, loss: 0.05134795978665352 2023-01-24 03:42:15.268016: step: 816/466, loss: 0.0744282677769661 2023-01-24 03:42:15.860793: step: 818/466, loss: 0.008248677477240562 2023-01-24 03:42:16.501147: step: 820/466, loss: 0.03369758278131485 2023-01-24 03:42:17.112836: step: 822/466, loss: 0.03700336441397667 2023-01-24 03:42:17.722282: step: 824/466, loss: 0.026925429701805115 2023-01-24 03:42:18.375668: step: 826/466, loss: 0.05135258287191391 2023-01-24 03:42:18.991410: step: 828/466, loss: 0.01281293947249651 2023-01-24 03:42:19.651624: step: 830/466, loss: 0.01877225935459137 2023-01-24 03:42:20.191183: step: 832/466, loss: 0.050698645412921906 2023-01-24 03:42:20.826047: step: 834/466, loss: 0.006537098903208971 2023-01-24 03:42:21.437059: step: 836/466, loss: 0.004928469192236662 2023-01-24 03:42:22.098127: step: 838/466, loss: 0.6613799333572388 2023-01-24 03:42:22.724731: step: 840/466, loss: 0.016151078045368195 2023-01-24 03:42:23.526100: step: 842/466, loss: 0.03449910506606102 2023-01-24 03:42:24.147675: step: 844/466, loss: 0.042612750083208084 2023-01-24 
03:42:24.763859: step: 846/466, loss: 0.012053466401994228 2023-01-24 03:42:25.384834: step: 848/466, loss: 0.016944939270615578 2023-01-24 03:42:25.968584: step: 850/466, loss: 0.009946225211024284 2023-01-24 03:42:26.578465: step: 852/466, loss: 0.03996839001774788 2023-01-24 03:42:27.171180: step: 854/466, loss: 0.25412529706954956 2023-01-24 03:42:27.771884: step: 856/466, loss: 0.003932584077119827 2023-01-24 03:42:28.358584: step: 858/466, loss: 0.019082466140389442 2023-01-24 03:42:28.996631: step: 860/466, loss: 0.028855543583631516 2023-01-24 03:42:29.586518: step: 862/466, loss: 0.012767578475177288 2023-01-24 03:42:30.229411: step: 864/466, loss: 0.09636927396059036 2023-01-24 03:42:30.835833: step: 866/466, loss: 0.02893782965838909 2023-01-24 03:42:31.443439: step: 868/466, loss: 0.017977142706513405 2023-01-24 03:42:32.097197: step: 870/466, loss: 0.08191262930631638 2023-01-24 03:42:32.724609: step: 872/466, loss: 0.05199733003973961 2023-01-24 03:42:33.359178: step: 874/466, loss: 0.007144299801439047 2023-01-24 03:42:33.958786: step: 876/466, loss: 0.027346298098564148 2023-01-24 03:42:34.641996: step: 878/466, loss: 0.25393012166023254 2023-01-24 03:42:35.277521: step: 880/466, loss: 0.009548576548695564 2023-01-24 03:42:35.931582: step: 882/466, loss: 0.03719523549079895 2023-01-24 03:42:36.558142: step: 884/466, loss: 0.18985582888126373 2023-01-24 03:42:37.213668: step: 886/466, loss: 0.051307398825883865 2023-01-24 03:42:37.770345: step: 888/466, loss: 0.04601292312145233 2023-01-24 03:42:38.388766: step: 890/466, loss: 0.06417310982942581 2023-01-24 03:42:38.999915: step: 892/466, loss: 0.04209336265921593 2023-01-24 03:42:39.553178: step: 894/466, loss: 0.029217995703220367 2023-01-24 03:42:40.182040: step: 896/466, loss: 0.08725843578577042 2023-01-24 03:42:40.878013: step: 898/466, loss: 0.013245921581983566 2023-01-24 03:42:41.489693: step: 900/466, loss: 0.006073274649679661 2023-01-24 03:42:42.149334: step: 902/466, loss: 
0.016061050817370415 2023-01-24 03:42:42.788882: step: 904/466, loss: 0.034913044422864914 2023-01-24 03:42:43.463572: step: 906/466, loss: 0.0108271399512887 2023-01-24 03:42:44.106803: step: 908/466, loss: 0.005191510077565908 2023-01-24 03:42:44.670159: step: 910/466, loss: 0.020119434222579002 2023-01-24 03:42:45.348435: step: 912/466, loss: 0.058973293751478195 2023-01-24 03:42:45.965338: step: 914/466, loss: 0.051008958369493484 2023-01-24 03:42:46.535579: step: 916/466, loss: 0.028807258233428 2023-01-24 03:42:47.217550: step: 918/466, loss: 0.09778264164924622 2023-01-24 03:42:47.864058: step: 920/466, loss: 0.04884487763047218 2023-01-24 03:42:48.489592: step: 922/466, loss: 0.029354941099882126 2023-01-24 03:42:49.036403: step: 924/466, loss: 0.007687765639275312 2023-01-24 03:42:49.744174: step: 926/466, loss: 0.05332213640213013 2023-01-24 03:42:50.352853: step: 928/466, loss: 0.05926041677594185 2023-01-24 03:42:50.919060: step: 930/466, loss: 0.026867201551795006 2023-01-24 03:42:51.588567: step: 932/466, loss: 0.0791272446513176 ================================================== Loss: 0.076 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35111244972219385, 'r': 0.33845374660127986, 'f1': 0.3446669071669072}, 'combined': 0.2539650894914053, 'epoch': 26} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35292858284037004, 'r': 0.3162606781296823, 'f1': 0.3335900303559663}, 'combined': 0.22124105640188435, 'epoch': 26} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33748563218390804, 'r': 0.2780421401515152, 'f1': 0.3048935617860852}, 'combined': 0.20326237452405677, 'epoch': 26} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.36640296608963324, 'r': 0.2982570857182914, 'f1': 0.3288366152506866}, 
'combined': 0.21460915942676384, 'epoch': 26} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3434394995774478, 'r': 0.32714730320280605, 'f1': 0.3350954884118149}, 'combined': 0.24691246514554782, 'epoch': 26} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3401543889925739, 'r': 0.3039301553595985, 'f1': 0.3210236208873674}, 'combined': 0.21290685737607784, 'epoch': 26} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29617117117117114, 'r': 0.31309523809523804, 'f1': 0.3043981481481481}, 'combined': 0.20293209876543206, 'epoch': 26} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.2717391304347826, 'f1': 0.33783783783783783}, 'combined': 0.2252252252252252, 'epoch': 26} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3055555555555556, 'r': 0.09482758620689655, 'f1': 0.14473684210526314}, 'combined': 0.09649122807017542, 'epoch': 26} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 
'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 27 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 03:45:23.053046: step: 2/466, loss: 0.027833836153149605 2023-01-24 03:45:23.669166: step: 4/466, loss: 0.024089179933071136 2023-01-24 03:45:24.260593: step: 6/466, loss: 0.05892466381192207 2023-01-24 03:45:24.923695: step: 8/466, loss: 0.0033265934325754642 2023-01-24 03:45:25.574249: step: 10/466, loss: 0.008479108102619648 2023-01-24 03:45:26.236760: step: 12/466, loss: 0.017342019826173782 2023-01-24 03:45:26.791565: step: 14/466, loss: 0.014171866700053215 2023-01-24 
03:45:27.419695: step: 16/466, loss: 0.0018908302299678326 2023-01-24 03:45:28.048197: step: 18/466, loss: 0.022276606410741806 2023-01-24 03:45:28.606894: step: 20/466, loss: 0.0065987976267933846 2023-01-24 03:45:29.244554: step: 22/466, loss: 0.6846793293952942 2023-01-24 03:45:29.927357: step: 24/466, loss: 0.03842934966087341 2023-01-24 03:45:30.529085: step: 26/466, loss: 0.053983282297849655 2023-01-24 03:45:31.164314: step: 28/466, loss: 0.014167653396725655 2023-01-24 03:45:31.797328: step: 30/466, loss: 0.008245084434747696 2023-01-24 03:45:32.455345: step: 32/466, loss: 0.017242398113012314 2023-01-24 03:45:33.144120: step: 34/466, loss: 0.3137844204902649 2023-01-24 03:45:33.774211: step: 36/466, loss: 0.005847569555044174 2023-01-24 03:45:34.384606: step: 38/466, loss: 0.036376770585775375 2023-01-24 03:45:35.023582: step: 40/466, loss: 0.04871950298547745 2023-01-24 03:45:35.655681: step: 42/466, loss: 0.0542115792632103 2023-01-24 03:45:36.321364: step: 44/466, loss: 0.014360906556248665 2023-01-24 03:45:36.959514: step: 46/466, loss: 0.011436935514211655 2023-01-24 03:45:37.625131: step: 48/466, loss: 0.004590773489326239 2023-01-24 03:45:38.223127: step: 50/466, loss: 0.00757006648927927 2023-01-24 03:45:38.853181: step: 52/466, loss: 0.043378762900829315 2023-01-24 03:45:39.458546: step: 54/466, loss: 0.016078811138868332 2023-01-24 03:45:40.053427: step: 56/466, loss: 0.021416189149022102 2023-01-24 03:45:40.597127: step: 58/466, loss: 0.013789871707558632 2023-01-24 03:45:41.162681: step: 60/466, loss: 0.010896018706262112 2023-01-24 03:45:41.887228: step: 62/466, loss: 0.05236181616783142 2023-01-24 03:45:42.439297: step: 64/466, loss: 0.04595043882727623 2023-01-24 03:45:43.068886: step: 66/466, loss: 0.0866723507642746 2023-01-24 03:45:43.748045: step: 68/466, loss: 0.114671491086483 2023-01-24 03:45:44.440415: step: 70/466, loss: 0.004479016177356243 2023-01-24 03:45:45.076378: step: 72/466, loss: 0.018120339140295982 2023-01-24 
03:45:45.718837: step: 74/466, loss: 0.06656885892152786 2023-01-24 03:45:46.365132: step: 76/466, loss: 0.024030594155192375 2023-01-24 03:45:47.020903: step: 78/466, loss: 0.2569972574710846 2023-01-24 03:45:47.644001: step: 80/466, loss: 0.0023115696385502815 2023-01-24 03:45:48.179438: step: 82/466, loss: 0.016187287867069244 2023-01-24 03:45:48.816919: step: 84/466, loss: 0.003180457279086113 2023-01-24 03:45:49.471473: step: 86/466, loss: 0.08133621513843536 2023-01-24 03:45:50.046319: step: 88/466, loss: 0.008258305490016937 2023-01-24 03:45:50.685391: step: 90/466, loss: 0.003522202605381608 2023-01-24 03:45:51.346546: step: 92/466, loss: 1.0791115760803223 2023-01-24 03:45:51.932080: step: 94/466, loss: 0.024251636117696762 2023-01-24 03:45:52.546010: step: 96/466, loss: 0.024751055985689163 2023-01-24 03:45:53.192967: step: 98/466, loss: 0.11981932818889618 2023-01-24 03:45:53.783872: step: 100/466, loss: 0.03855177015066147 2023-01-24 03:45:54.386863: step: 102/466, loss: 0.0010817792499437928 2023-01-24 03:45:55.026232: step: 104/466, loss: 0.05485154315829277 2023-01-24 03:45:55.609255: step: 106/466, loss: 0.08621357381343842 2023-01-24 03:45:56.219581: step: 108/466, loss: 0.0025955094024538994 2023-01-24 03:45:56.817800: step: 110/466, loss: 0.033904727548360825 2023-01-24 03:45:57.469230: step: 112/466, loss: 0.0053804488852620125 2023-01-24 03:45:58.024806: step: 114/466, loss: 0.03207462280988693 2023-01-24 03:45:58.689814: step: 116/466, loss: 0.00749216740950942 2023-01-24 03:45:59.299483: step: 118/466, loss: 0.001963667571544647 2023-01-24 03:45:59.885921: step: 120/466, loss: 0.012088139541447163 2023-01-24 03:46:00.483336: step: 122/466, loss: 0.009510976262390614 2023-01-24 03:46:01.181301: step: 124/466, loss: 0.0329180583357811 2023-01-24 03:46:01.867749: step: 126/466, loss: 0.051859792321920395 2023-01-24 03:46:02.445167: step: 128/466, loss: 0.023622572422027588 2023-01-24 03:46:03.029599: step: 130/466, loss: 0.053788863122463226 
2023-01-24 03:46:03.664304: step: 132/466, loss: 0.06380758434534073 2023-01-24 03:46:04.243259: step: 134/466, loss: 0.0036389161832630634 2023-01-24 03:46:04.896132: step: 136/466, loss: 0.003958689048886299 2023-01-24 03:46:05.489264: step: 138/466, loss: 0.008076903410255909 2023-01-24 03:46:06.097102: step: 140/466, loss: 0.04739199951291084 2023-01-24 03:46:06.695096: step: 142/466, loss: 0.021099107339978218 2023-01-24 03:46:07.301557: step: 144/466, loss: 0.005976251792162657 2023-01-24 03:46:07.945343: step: 146/466, loss: 0.006119987461715937 2023-01-24 03:46:08.528294: step: 148/466, loss: 0.0069958982057869434 2023-01-24 03:46:09.218124: step: 150/466, loss: 0.030476871877908707 2023-01-24 03:46:09.896964: step: 152/466, loss: 0.01349339447915554 2023-01-24 03:46:10.559369: step: 154/466, loss: 0.025189753621816635 2023-01-24 03:46:11.127624: step: 156/466, loss: 0.03439941629767418 2023-01-24 03:46:11.721440: step: 158/466, loss: 0.005432324483990669 2023-01-24 03:46:12.388420: step: 160/466, loss: 0.17197854816913605 2023-01-24 03:46:12.975952: step: 162/466, loss: 0.00020336541638243943 2023-01-24 03:46:13.617786: step: 164/466, loss: 0.06483791023492813 2023-01-24 03:46:14.261030: step: 166/466, loss: 0.056566525250673294 2023-01-24 03:46:14.880253: step: 168/466, loss: 0.017517898231744766 2023-01-24 03:46:15.518413: step: 170/466, loss: 0.007111198268830776 2023-01-24 03:46:16.181307: step: 172/466, loss: 0.0664910301566124 2023-01-24 03:46:16.798181: step: 174/466, loss: 0.3708658814430237 2023-01-24 03:46:17.438007: step: 176/466, loss: 0.019381744787096977 2023-01-24 03:46:18.067038: step: 178/466, loss: 0.023170998319983482 2023-01-24 03:46:18.733612: step: 180/466, loss: 0.006079450715333223 2023-01-24 03:46:19.369577: step: 182/466, loss: 0.07094259560108185 2023-01-24 03:46:19.961705: step: 184/466, loss: 0.005931614898145199 2023-01-24 03:46:20.540804: step: 186/466, loss: 0.01944388635456562 2023-01-24 03:46:21.250822: step: 188/466, 
loss: 0.06885656714439392 2023-01-24 03:46:21.864069: step: 190/466, loss: 0.0036963620223104954 2023-01-24 03:46:22.437717: step: 192/466, loss: 0.0355551652610302 2023-01-24 03:46:23.062944: step: 194/466, loss: 0.0029540781397372484 2023-01-24 03:46:23.664370: step: 196/466, loss: 0.0021283773239701986 2023-01-24 03:46:24.312349: step: 198/466, loss: 0.03791055455803871 2023-01-24 03:46:24.896461: step: 200/466, loss: 0.0039059543050825596 2023-01-24 03:46:25.590478: step: 202/466, loss: 0.04408857598900795 2023-01-24 03:46:26.238250: step: 204/466, loss: 0.025127647444605827 2023-01-24 03:46:26.876368: step: 206/466, loss: 0.00040855578845366836 2023-01-24 03:46:27.445320: step: 208/466, loss: 0.021951226517558098 2023-01-24 03:46:28.028615: step: 210/466, loss: 0.10126690566539764 2023-01-24 03:46:28.690307: step: 212/466, loss: 0.01390159223228693 2023-01-24 03:46:29.365024: step: 214/466, loss: 0.0047782729379832745 2023-01-24 03:46:30.027498: step: 216/466, loss: 0.12035967409610748 2023-01-24 03:46:30.685748: step: 218/466, loss: 0.027765575796365738 2023-01-24 03:46:31.310446: step: 220/466, loss: 0.0063645802438259125 2023-01-24 03:46:32.036254: step: 222/466, loss: 0.0021806962322443724 2023-01-24 03:46:32.670308: step: 224/466, loss: 0.030791504308581352 2023-01-24 03:46:33.245517: step: 226/466, loss: 0.013874297961592674 2023-01-24 03:46:33.859197: step: 228/466, loss: 0.012175521813333035 2023-01-24 03:46:34.510259: step: 230/466, loss: 0.01178658939898014 2023-01-24 03:46:35.112451: step: 232/466, loss: 0.01597941480576992 2023-01-24 03:46:35.745605: step: 234/466, loss: 0.009201081469655037 2023-01-24 03:46:36.371659: step: 236/466, loss: 0.08578303456306458 2023-01-24 03:46:37.013298: step: 238/466, loss: 0.00881474930793047 2023-01-24 03:46:37.693658: step: 240/466, loss: 0.0507836751639843 2023-01-24 03:46:38.344679: step: 242/466, loss: 0.01997714675962925 2023-01-24 03:46:38.967359: step: 244/466, loss: 0.014055879786610603 2023-01-24 
03:46:39.582133: step: 246/466, loss: 0.013484900817275047 2023-01-24 03:46:40.239792: step: 248/466, loss: 0.053369153290987015 2023-01-24 03:46:40.924209: step: 250/466, loss: 0.0779917910695076 2023-01-24 03:46:41.470885: step: 252/466, loss: 0.018295586109161377 2023-01-24 03:46:42.142382: step: 254/466, loss: 0.013621524907648563 2023-01-24 03:46:42.771831: step: 256/466, loss: 0.017526626586914062 2023-01-24 03:46:43.416394: step: 258/466, loss: 0.008595292456448078 2023-01-24 03:46:44.075550: step: 260/466, loss: 0.012955628335475922 2023-01-24 03:46:44.680075: step: 262/466, loss: 0.00752677395939827 2023-01-24 03:46:45.255795: step: 264/466, loss: 0.028774773702025414 2023-01-24 03:46:45.866137: step: 266/466, loss: 0.015406875871121883 2023-01-24 03:46:46.461843: step: 268/466, loss: 0.0460406057536602 2023-01-24 03:46:47.059213: step: 270/466, loss: 0.09528829902410507 2023-01-24 03:46:47.708801: step: 272/466, loss: 0.016132591292262077 2023-01-24 03:46:48.400329: step: 274/466, loss: 0.008536629378795624 2023-01-24 03:46:49.024008: step: 276/466, loss: 0.00013562050298787653 2023-01-24 03:46:49.626456: step: 278/466, loss: 0.009571983478963375 2023-01-24 03:46:50.312760: step: 280/466, loss: 0.008088672533631325 2023-01-24 03:46:50.934665: step: 282/466, loss: 0.046699605882167816 2023-01-24 03:46:51.471648: step: 284/466, loss: 0.11790218204259872 2023-01-24 03:46:52.083211: step: 286/466, loss: 0.9116804599761963 2023-01-24 03:46:52.681904: step: 288/466, loss: 0.029373178258538246 2023-01-24 03:46:53.317362: step: 290/466, loss: 0.025495873764157295 2023-01-24 03:46:53.973061: step: 292/466, loss: 0.0008974373922683299 2023-01-24 03:46:54.604889: step: 294/466, loss: 0.041956234723329544 2023-01-24 03:46:55.258598: step: 296/466, loss: 0.01854517310857773 2023-01-24 03:46:55.886301: step: 298/466, loss: 0.009173722006380558 2023-01-24 03:46:56.483323: step: 300/466, loss: 0.043664708733558655 2023-01-24 03:46:57.042357: step: 302/466, loss: 
0.03538736328482628 2023-01-24 03:46:57.643419: step: 304/466, loss: 0.051593050360679626 2023-01-24 03:46:58.237810: step: 306/466, loss: 0.12222576141357422 2023-01-24 03:46:58.907371: step: 308/466, loss: 0.203830286860466 2023-01-24 03:46:59.546802: step: 310/466, loss: 0.029027223587036133 2023-01-24 03:47:00.232832: step: 312/466, loss: 0.1208413764834404 2023-01-24 03:47:00.826231: step: 314/466, loss: 0.013764460571110249 2023-01-24 03:47:01.401066: step: 316/466, loss: 0.014483177103102207 2023-01-24 03:47:02.000356: step: 318/466, loss: 0.007903095334768295 2023-01-24 03:47:02.639879: step: 320/466, loss: 0.07149031013250351 2023-01-24 03:47:03.235915: step: 322/466, loss: 0.010006767697632313 2023-01-24 03:47:03.871755: step: 324/466, loss: 0.004422049969434738 2023-01-24 03:47:04.495343: step: 326/466, loss: 0.015813073143363 2023-01-24 03:47:05.125364: step: 328/466, loss: 0.0066657704301178455 2023-01-24 03:47:05.756952: step: 330/466, loss: 0.04342114180326462 2023-01-24 03:47:06.381158: step: 332/466, loss: 0.02800775319337845 2023-01-24 03:47:06.918392: step: 334/466, loss: 0.029935946688055992 2023-01-24 03:47:07.532573: step: 336/466, loss: 0.028563717380166054 2023-01-24 03:47:08.140228: step: 338/466, loss: 0.0034600680228322744 2023-01-24 03:47:08.769618: step: 340/466, loss: 0.006282786373049021 2023-01-24 03:47:09.397819: step: 342/466, loss: 0.014332322403788567 2023-01-24 03:47:10.061559: step: 344/466, loss: 0.04797680303454399 2023-01-24 03:47:10.661053: step: 346/466, loss: 0.033090557903051376 2023-01-24 03:47:11.330751: step: 348/466, loss: 0.042578570544719696 2023-01-24 03:47:11.961312: step: 350/466, loss: 0.11458615958690643 2023-01-24 03:47:12.639486: step: 352/466, loss: 0.002273987978696823 2023-01-24 03:47:13.254241: step: 354/466, loss: 0.041761383414268494 2023-01-24 03:47:13.914263: step: 356/466, loss: 0.0080189760774374 2023-01-24 03:47:14.532945: step: 358/466, loss: 0.25496694445610046 2023-01-24 03:47:15.155719: step: 
360/466, loss: 0.03920642286539078 2023-01-24 03:47:15.738014: step: 362/466, loss: 0.007292563561350107 2023-01-24 03:47:16.314905: step: 364/466, loss: 0.004238718654960394 2023-01-24 03:47:16.965624: step: 366/466, loss: 0.002630181610584259 2023-01-24 03:47:17.606639: step: 368/466, loss: 0.01754770055413246 2023-01-24 03:47:18.315963: step: 370/466, loss: 0.24520449340343475 2023-01-24 03:47:18.916467: step: 372/466, loss: 0.012534240260720253 2023-01-24 03:47:19.535441: step: 374/466, loss: 0.0020073133055120707 2023-01-24 03:47:20.182293: step: 376/466, loss: 0.0053610121831297874 2023-01-24 03:47:20.787578: step: 378/466, loss: 0.002231658436357975 2023-01-24 03:47:21.406270: step: 380/466, loss: 0.06281581521034241 2023-01-24 03:47:22.061811: step: 382/466, loss: 0.10772094130516052 2023-01-24 03:47:22.690162: step: 384/466, loss: 0.03544430807232857 2023-01-24 03:47:23.364535: step: 386/466, loss: 0.018868377432227135 2023-01-24 03:47:23.998530: step: 388/466, loss: 0.007571527734398842 2023-01-24 03:47:24.612007: step: 390/466, loss: 0.09080404788255692 2023-01-24 03:47:25.273533: step: 392/466, loss: 0.11332575976848602 2023-01-24 03:47:25.882714: step: 394/466, loss: 0.004821168724447489 2023-01-24 03:47:26.548502: step: 396/466, loss: 0.021221790462732315 2023-01-24 03:47:27.110627: step: 398/466, loss: 0.05074651166796684 2023-01-24 03:47:27.698829: step: 400/466, loss: 0.009762517176568508 2023-01-24 03:47:28.365137: step: 402/466, loss: 0.02435818873345852 2023-01-24 03:47:29.003150: step: 404/466, loss: 0.003784589236602187 2023-01-24 03:47:29.615868: step: 406/466, loss: 0.019062306731939316 2023-01-24 03:47:30.261008: step: 408/466, loss: 0.0393337719142437 2023-01-24 03:47:30.871863: step: 410/466, loss: 0.011388128623366356 2023-01-24 03:47:31.473591: step: 412/466, loss: 0.012343622744083405 2023-01-24 03:47:32.100929: step: 414/466, loss: 0.014441592618823051 2023-01-24 03:47:32.667662: step: 416/466, loss: 0.010982821695506573 2023-01-24 
03:47:33.343219: step: 418/466, loss: 0.051883138716220856 2023-01-24 03:47:34.057273: step: 420/466, loss: 0.01821015402674675 2023-01-24 03:47:34.688529: step: 422/466, loss: 0.23361898958683014 2023-01-24 03:47:35.289036: step: 424/466, loss: 0.009966660290956497 2023-01-24 03:47:35.836179: step: 426/466, loss: 0.07146193832159042 2023-01-24 03:47:36.451134: step: 428/466, loss: 0.003511168761178851 2023-01-24 03:47:37.034345: step: 430/466, loss: 0.0016422433545812964 2023-01-24 03:47:37.674508: step: 432/466, loss: 0.0014138933038339019 2023-01-24 03:47:38.367871: step: 434/466, loss: 0.013912991620600224 2023-01-24 03:47:39.047112: step: 436/466, loss: 0.055950235575437546 2023-01-24 03:47:39.686386: step: 438/466, loss: 0.08086015284061432 2023-01-24 03:47:40.339407: step: 440/466, loss: 0.019065426662564278 2023-01-24 03:47:40.953552: step: 442/466, loss: 0.012725220061838627 2023-01-24 03:47:41.587519: step: 444/466, loss: 0.019271058961749077 2023-01-24 03:47:42.200620: step: 446/466, loss: 0.022563360631465912 2023-01-24 03:47:42.806696: step: 448/466, loss: 0.09099145978689194 2023-01-24 03:47:43.450010: step: 450/466, loss: 0.06450436264276505 2023-01-24 03:47:44.075895: step: 452/466, loss: 0.0597335547208786 2023-01-24 03:47:44.655161: step: 454/466, loss: 0.0012813321081921458 2023-01-24 03:47:45.292530: step: 456/466, loss: 0.023356245830655098 2023-01-24 03:47:45.805360: step: 458/466, loss: 0.0005833054892718792 2023-01-24 03:47:46.417789: step: 460/466, loss: 0.07915735244750977 2023-01-24 03:47:47.040262: step: 462/466, loss: 0.0856717973947525 2023-01-24 03:47:47.714374: step: 464/466, loss: 0.018680641427636147 2023-01-24 03:47:48.326838: step: 466/466, loss: 0.04496914893388748 2023-01-24 03:47:48.987689: step: 468/466, loss: 0.009340988472104073 2023-01-24 03:47:49.654933: step: 470/466, loss: 0.028811069205403328 2023-01-24 03:47:50.316511: step: 472/466, loss: 0.02744913473725319 2023-01-24 03:47:51.014814: step: 474/466, loss: 
0.013957190327346325 2023-01-24 03:47:51.652453: step: 476/466, loss: 0.004813686013221741 2023-01-24 03:47:52.240350: step: 478/466, loss: 0.04070420563220978 2023-01-24 03:47:52.887913: step: 480/466, loss: 0.014992502517998219 2023-01-24 03:47:53.523523: step: 482/466, loss: 0.008838329464197159 2023-01-24 03:47:54.147722: step: 484/466, loss: 0.021108128130435944 2023-01-24 03:47:54.835825: step: 486/466, loss: 0.00671556917950511 2023-01-24 03:47:55.455820: step: 488/466, loss: 0.010902033187448978 2023-01-24 03:47:56.055048: step: 490/466, loss: 0.03226817771792412 2023-01-24 03:47:56.606484: step: 492/466, loss: 0.03758244588971138 2023-01-24 03:47:57.236349: step: 494/466, loss: 0.0034771913196891546 2023-01-24 03:47:57.854042: step: 496/466, loss: 0.025463111698627472 2023-01-24 03:47:58.480908: step: 498/466, loss: 0.0301282349973917 2023-01-24 03:47:59.085310: step: 500/466, loss: 0.010275552049279213 2023-01-24 03:47:59.748322: step: 502/466, loss: 0.0036540618166327477 2023-01-24 03:48:00.430328: step: 504/466, loss: 0.01159196998924017 2023-01-24 03:48:01.125853: step: 506/466, loss: 0.01329745166003704 2023-01-24 03:48:01.753591: step: 508/466, loss: 0.04020538181066513 2023-01-24 03:48:02.367604: step: 510/466, loss: 0.009330449625849724 2023-01-24 03:48:03.027859: step: 512/466, loss: 0.04804263263940811 2023-01-24 03:48:03.652555: step: 514/466, loss: 0.01904565468430519 2023-01-24 03:48:04.265070: step: 516/466, loss: 0.0011679809540510178 2023-01-24 03:48:04.906050: step: 518/466, loss: 0.051677506417036057 2023-01-24 03:48:05.586694: step: 520/466, loss: 0.022494550794363022 2023-01-24 03:48:06.199341: step: 522/466, loss: 0.056992609053850174 2023-01-24 03:48:06.876448: step: 524/466, loss: 0.007421521935611963 2023-01-24 03:48:07.499730: step: 526/466, loss: 0.0017072822665795684 2023-01-24 03:48:08.230736: step: 528/466, loss: 0.11647544056177139 2023-01-24 03:48:08.892827: step: 530/466, loss: 0.25799375772476196 2023-01-24 03:48:09.521217: 
step: 532/466, loss: 0.007462191861122847 2023-01-24 03:48:10.085990: step: 534/466, loss: 0.0788254365324974 2023-01-24 03:48:10.683807: step: 536/466, loss: 0.0102780656889081 2023-01-24 03:48:11.242912: step: 538/466, loss: 0.15707173943519592 2023-01-24 03:48:11.915366: step: 540/466, loss: 0.03727540746331215 2023-01-24 03:48:12.499359: step: 542/466, loss: 0.0025355357211083174 2023-01-24 03:48:13.114744: step: 544/466, loss: 0.006033079698681831 2023-01-24 03:48:13.765798: step: 546/466, loss: 0.026804547756910324 2023-01-24 03:48:14.322906: step: 548/466, loss: 0.004583065398037434 2023-01-24 03:48:15.036687: step: 550/466, loss: 0.15192300081253052 2023-01-24 03:48:15.671199: step: 552/466, loss: 0.015816085040569305 2023-01-24 03:48:16.282903: step: 554/466, loss: 0.2682707905769348 2023-01-24 03:48:16.897166: step: 556/466, loss: 0.018862945958971977 2023-01-24 03:48:17.490748: step: 558/466, loss: 0.009017648175358772 2023-01-24 03:48:18.122107: step: 560/466, loss: 0.017621343955397606 2023-01-24 03:48:18.995136: step: 562/466, loss: 0.01245852280408144 2023-01-24 03:48:19.675504: step: 564/466, loss: 0.004409522283822298 2023-01-24 03:48:20.320801: step: 566/466, loss: 0.020851565524935722 2023-01-24 03:48:20.971254: step: 568/466, loss: 0.004684189334511757 2023-01-24 03:48:21.579119: step: 570/466, loss: 0.0760229304432869 2023-01-24 03:48:22.164672: step: 572/466, loss: 0.0607258677482605 2023-01-24 03:48:22.750377: step: 574/466, loss: 0.027867596596479416 2023-01-24 03:48:23.391749: step: 576/466, loss: 0.013862524181604385 2023-01-24 03:48:24.058977: step: 578/466, loss: 0.07694022357463837 2023-01-24 03:48:24.669371: step: 580/466, loss: 0.0014871252933517098 2023-01-24 03:48:25.286913: step: 582/466, loss: 0.03727436065673828 2023-01-24 03:48:25.877704: step: 584/466, loss: 0.0019148996798321605 2023-01-24 03:48:26.537913: step: 586/466, loss: 0.020923055708408356 2023-01-24 03:48:27.169722: step: 588/466, loss: 1.0160681009292603 2023-01-24 
03:48:27.801384: step: 590/466, loss: 0.033878784626722336 2023-01-24 03:48:28.408000: step: 592/466, loss: 0.06347450613975525 2023-01-24 03:48:29.095196: step: 594/466, loss: 0.08106108009815216 2023-01-24 03:48:29.688191: step: 596/466, loss: 0.00410859240218997 2023-01-24 03:48:30.291556: step: 598/466, loss: 0.06994364410638809 2023-01-24 03:48:31.007187: step: 600/466, loss: 0.2086619883775711 2023-01-24 03:48:31.727380: step: 602/466, loss: 0.015858981758356094 2023-01-24 03:48:32.396304: step: 604/466, loss: 0.01849283091723919 2023-01-24 03:48:32.982615: step: 606/466, loss: 0.025155507028102875 2023-01-24 03:48:33.676861: step: 608/466, loss: 0.0005380866350606084 2023-01-24 03:48:34.253765: step: 610/466, loss: 0.005587004590779543 2023-01-24 03:48:34.893630: step: 612/466, loss: 0.12961412966251373 2023-01-24 03:48:35.497674: step: 614/466, loss: 0.020661529153585434 2023-01-24 03:48:36.177775: step: 616/466, loss: 0.022744232788681984 2023-01-24 03:48:36.766330: step: 618/466, loss: 0.023633470758795738 2023-01-24 03:48:37.388189: step: 620/466, loss: 0.02143573947250843 2023-01-24 03:48:38.012137: step: 622/466, loss: 0.013075039722025394 2023-01-24 03:48:38.634325: step: 624/466, loss: 0.026958230882883072 2023-01-24 03:48:39.301225: step: 626/466, loss: 0.1400860995054245 2023-01-24 03:48:39.960675: step: 628/466, loss: 0.009638143703341484 2023-01-24 03:48:40.683872: step: 630/466, loss: 0.031049687415361404 2023-01-24 03:48:41.272577: step: 632/466, loss: 0.2267194539308548 2023-01-24 03:48:41.966946: step: 634/466, loss: 0.03791540116071701 2023-01-24 03:48:42.583067: step: 636/466, loss: 0.02210753597319126 2023-01-24 03:48:43.218149: step: 638/466, loss: 0.03216801956295967 2023-01-24 03:48:43.797005: step: 640/466, loss: 0.12850381433963776 2023-01-24 03:48:44.396134: step: 642/466, loss: 0.0019831915851682425 2023-01-24 03:48:45.037574: step: 644/466, loss: 0.014089684002101421 2023-01-24 03:48:45.579197: step: 646/466, loss: 
0.0263113621622324 2023-01-24 03:48:46.207661: step: 648/466, loss: 0.27425116300582886 2023-01-24 03:48:46.764474: step: 650/466, loss: 0.13408559560775757 2023-01-24 03:48:47.351944: step: 652/466, loss: 0.0064635686576366425 2023-01-24 03:48:47.939833: step: 654/466, loss: 0.023577343672513962 2023-01-24 03:48:48.608610: step: 656/466, loss: 0.023118676617741585 2023-01-24 03:48:49.260489: step: 658/466, loss: 0.02726762555539608 2023-01-24 03:48:49.918052: step: 660/466, loss: 0.0005869396263733506 2023-01-24 03:48:50.514891: step: 662/466, loss: 0.05336834862828255 2023-01-24 03:48:51.261183: step: 664/466, loss: 0.02326621487736702 2023-01-24 03:48:51.854471: step: 666/466, loss: 0.003729963907971978 2023-01-24 03:48:52.564949: step: 668/466, loss: 0.0011509779142215848 2023-01-24 03:48:53.228200: step: 670/466, loss: 0.013985698111355305 2023-01-24 03:48:53.791921: step: 672/466, loss: 0.0028651151806116104 2023-01-24 03:48:54.402131: step: 674/466, loss: 0.003631117520853877 2023-01-24 03:48:55.006981: step: 676/466, loss: 0.29877933859825134 2023-01-24 03:48:55.620729: step: 678/466, loss: 0.018547525629401207 2023-01-24 03:48:56.222918: step: 680/466, loss: 0.019268687814474106 2023-01-24 03:48:56.831141: step: 682/466, loss: 0.01983795315027237 2023-01-24 03:48:57.474276: step: 684/466, loss: 0.006188856437802315 2023-01-24 03:48:58.139067: step: 686/466, loss: 0.08563053607940674 2023-01-24 03:48:58.732371: step: 688/466, loss: 0.0037638735957443714 2023-01-24 03:48:59.405840: step: 690/466, loss: 0.015350519679486752 2023-01-24 03:49:00.028802: step: 692/466, loss: 0.00996602326631546 2023-01-24 03:49:00.684810: step: 694/466, loss: 0.00022550724679604173 2023-01-24 03:49:01.340514: step: 696/466, loss: 0.04517294839024544 2023-01-24 03:49:01.966721: step: 698/466, loss: 0.1106954887509346 2023-01-24 03:49:02.633275: step: 700/466, loss: 0.021119993180036545 2023-01-24 03:49:03.421221: step: 702/466, loss: 0.058732982724905014 2023-01-24 
03:49:04.009570: step: 704/466, loss: 0.005045011639595032 2023-01-24 03:49:04.656781: step: 706/466, loss: 0.003177064238116145 2023-01-24 03:49:05.242327: step: 708/466, loss: 0.03055475652217865 2023-01-24 03:49:05.871537: step: 710/466, loss: 0.1264713704586029 2023-01-24 03:49:06.498015: step: 712/466, loss: 0.0012993437703698874 2023-01-24 03:49:07.030600: step: 714/466, loss: 0.0006832975777797401 2023-01-24 03:49:07.655576: step: 716/466, loss: 0.03061027266085148 2023-01-24 03:49:08.287027: step: 718/466, loss: 0.0036416640505194664 2023-01-24 03:49:08.884501: step: 720/466, loss: 0.0023828730918467045 2023-01-24 03:49:09.485914: step: 722/466, loss: 0.05810290575027466 2023-01-24 03:49:10.175233: step: 724/466, loss: 0.026543136686086655 2023-01-24 03:49:10.811891: step: 726/466, loss: 0.0126041816547513 2023-01-24 03:49:11.415012: step: 728/466, loss: 0.05292503908276558 2023-01-24 03:49:12.065280: step: 730/466, loss: 0.06300801038742065 2023-01-24 03:49:12.640371: step: 732/466, loss: 0.011260651051998138 2023-01-24 03:49:13.287423: step: 734/466, loss: 0.0007103653624653816 2023-01-24 03:49:13.884622: step: 736/466, loss: 0.00010583127732388675 2023-01-24 03:49:14.468846: step: 738/466, loss: 0.022099090740084648 2023-01-24 03:49:15.129629: step: 740/466, loss: 0.0035885353572666645 2023-01-24 03:49:15.658916: step: 742/466, loss: 0.6832292675971985 2023-01-24 03:49:16.259000: step: 744/466, loss: 0.09095166623592377 2023-01-24 03:49:16.802139: step: 746/466, loss: 0.007521773222833872 2023-01-24 03:49:17.428597: step: 748/466, loss: 0.10964198410511017 2023-01-24 03:49:18.046873: step: 750/466, loss: 0.00558021105825901 2023-01-24 03:49:18.617039: step: 752/466, loss: 0.009862803854048252 2023-01-24 03:49:19.248416: step: 754/466, loss: 0.05558396503329277 2023-01-24 03:49:19.846359: step: 756/466, loss: 0.015214133076369762 2023-01-24 03:49:20.431368: step: 758/466, loss: 0.04548295959830284 2023-01-24 03:49:21.043531: step: 760/466, loss: 
0.03597099334001541 2023-01-24 03:49:21.615964: step: 762/466, loss: 0.0132424496114254 2023-01-24 03:49:22.284527: step: 764/466, loss: 0.0013959844363853335 2023-01-24 03:49:22.891083: step: 766/466, loss: 0.010815260000526905 2023-01-24 03:49:23.558685: step: 768/466, loss: 0.11648006737232208 2023-01-24 03:49:24.164851: step: 770/466, loss: 0.0757567435503006 2023-01-24 03:49:24.763329: step: 772/466, loss: 0.0016913153231143951 2023-01-24 03:49:25.401903: step: 774/466, loss: 0.11876269429922104 2023-01-24 03:49:26.020557: step: 776/466, loss: 0.014088819734752178 2023-01-24 03:49:26.734247: step: 778/466, loss: 0.05380717292428017 2023-01-24 03:49:27.307031: step: 780/466, loss: 0.017499873414635658 2023-01-24 03:49:28.005035: step: 782/466, loss: 0.0022974826861172915 2023-01-24 03:49:28.573653: step: 784/466, loss: 0.0068471673876047134 2023-01-24 03:49:29.205068: step: 786/466, loss: 0.01211138442158699 2023-01-24 03:49:29.842064: step: 788/466, loss: 0.09165941178798676 2023-01-24 03:49:30.443318: step: 790/466, loss: 0.02449076995253563 2023-01-24 03:49:31.016678: step: 792/466, loss: 0.045832838863134384 2023-01-24 03:49:31.656303: step: 794/466, loss: 0.041032642126083374 2023-01-24 03:49:32.313256: step: 796/466, loss: 0.01552598550915718 2023-01-24 03:49:32.940162: step: 798/466, loss: 0.01997545175254345 2023-01-24 03:49:33.572123: step: 800/466, loss: 0.010445799678564072 2023-01-24 03:49:34.181942: step: 802/466, loss: 0.00863324198871851 2023-01-24 03:49:34.809182: step: 804/466, loss: 0.006967922672629356 2023-01-24 03:49:35.445991: step: 806/466, loss: 0.05250157043337822 2023-01-24 03:49:36.163126: step: 808/466, loss: 0.03862131014466286 2023-01-24 03:49:36.795178: step: 810/466, loss: 0.05220973864197731 2023-01-24 03:49:37.413108: step: 812/466, loss: 0.07533097267150879 2023-01-24 03:49:38.022547: step: 814/466, loss: 0.03043753281235695 2023-01-24 03:49:38.627476: step: 816/466, loss: 0.00482554966583848 2023-01-24 03:49:39.277748: step: 
818/466, loss: 0.0547814704477787 2023-01-24 03:49:39.892849: step: 820/466, loss: 0.020278315991163254 2023-01-24 03:49:40.503505: step: 822/466, loss: 0.0074560982175171375 2023-01-24 03:49:41.222670: step: 824/466, loss: 0.01077156700193882 2023-01-24 03:49:41.933033: step: 826/466, loss: 0.024306146427989006 2023-01-24 03:49:42.536484: step: 828/466, loss: 0.034570641815662384 2023-01-24 03:49:43.160943: step: 830/466, loss: 0.06997299194335938 2023-01-24 03:49:43.833832: step: 832/466, loss: 0.00283684185706079 2023-01-24 03:49:44.511407: step: 834/466, loss: 0.005547116976231337 2023-01-24 03:49:45.138166: step: 836/466, loss: 0.027647120878100395 2023-01-24 03:49:45.819300: step: 838/466, loss: 0.06419383734464645 2023-01-24 03:49:46.476316: step: 840/466, loss: 0.0660874992609024 2023-01-24 03:49:47.103197: step: 842/466, loss: 0.056007083505392075 2023-01-24 03:49:47.693245: step: 844/466, loss: 0.01848028413951397 2023-01-24 03:49:48.359597: step: 846/466, loss: 0.06765052676200867 2023-01-24 03:49:49.021307: step: 848/466, loss: 0.032074473798274994 2023-01-24 03:49:49.726166: step: 850/466, loss: 0.002927222289144993 2023-01-24 03:49:50.381533: step: 852/466, loss: 0.062171820551157 2023-01-24 03:49:50.994040: step: 854/466, loss: 0.011124547570943832 2023-01-24 03:49:51.595280: step: 856/466, loss: 0.2015541046857834 2023-01-24 03:49:52.265694: step: 858/466, loss: 0.07552161812782288 2023-01-24 03:49:52.890417: step: 860/466, loss: 0.17228293418884277 2023-01-24 03:49:53.483690: step: 862/466, loss: 0.01148252934217453 2023-01-24 03:49:54.062735: step: 864/466, loss: 0.10406894981861115 2023-01-24 03:49:54.687708: step: 866/466, loss: 0.9883565306663513 2023-01-24 03:49:55.330515: step: 868/466, loss: 0.03360062092542648 2023-01-24 03:49:55.987030: step: 870/466, loss: 0.044887661933898926 2023-01-24 03:49:56.617966: step: 872/466, loss: 0.0762392058968544 2023-01-24 03:49:57.244286: step: 874/466, loss: 0.019538970664143562 2023-01-24 
03:49:57.906563: step: 876/466, loss: 0.14621831476688385 2023-01-24 03:49:58.582791: step: 878/466, loss: 0.027906371280550957 2023-01-24 03:49:59.181380: step: 880/466, loss: 0.06606736779212952 2023-01-24 03:49:59.856159: step: 882/466, loss: 0.05360845848917961 2023-01-24 03:50:00.503620: step: 884/466, loss: 0.019824877381324768 2023-01-24 03:50:01.148413: step: 886/466, loss: 0.021310580894351006 2023-01-24 03:50:01.774386: step: 888/466, loss: 0.027834007516503334 2023-01-24 03:50:02.453173: step: 890/466, loss: 0.04889824241399765 2023-01-24 03:50:03.063692: step: 892/466, loss: 0.02395694889128208 2023-01-24 03:50:03.715379: step: 894/466, loss: 0.05169637128710747 2023-01-24 03:50:04.377674: step: 896/466, loss: 0.025937411934137344 2023-01-24 03:50:04.994699: step: 898/466, loss: 0.004313454497605562 2023-01-24 03:50:05.581053: step: 900/466, loss: 0.04247196391224861 2023-01-24 03:50:06.225378: step: 902/466, loss: 0.008535345084965229 2023-01-24 03:50:06.866617: step: 904/466, loss: 0.011454445309937 2023-01-24 03:50:07.458046: step: 906/466, loss: 0.01287270337343216 2023-01-24 03:50:08.048155: step: 908/466, loss: 0.02286466211080551 2023-01-24 03:50:08.663169: step: 910/466, loss: 0.029042303562164307 2023-01-24 03:50:09.313527: step: 912/466, loss: 0.006412671413272619 2023-01-24 03:50:09.934775: step: 914/466, loss: 0.06761791557073593 2023-01-24 03:50:10.551917: step: 916/466, loss: 0.08030974119901657 2023-01-24 03:50:11.216131: step: 918/466, loss: 0.002412214642390609 2023-01-24 03:50:11.838047: step: 920/466, loss: 0.031146619468927383 2023-01-24 03:50:12.442125: step: 922/466, loss: 0.08206193894147873 2023-01-24 03:50:13.112286: step: 924/466, loss: 0.07852324843406677 2023-01-24 03:50:13.726165: step: 926/466, loss: 0.008640158921480179 2023-01-24 03:50:14.301071: step: 928/466, loss: 0.07596441358327866 2023-01-24 03:50:14.922702: step: 930/466, loss: 0.008638971485197544 2023-01-24 03:50:15.529593: step: 932/466, loss: 
0.02195640094578266
==================================================
Loss: 0.050
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3447524870962372, 'r': 0.3401732320494181, 'f1': 0.3424475516524228}, 'combined': 0.2523297749017852, 'epoch': 27}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35949230593427955, 'r': 0.3142437083991489, 'f1': 0.335348542914145}, 'combined': 0.22240732379798214, 'epoch': 27}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33313022844272844, 'r': 0.28013223755411254, 'f1': 0.3043411963550852}, 'combined': 0.20289413090339015, 'epoch': 27}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37139549882749806, 'r': 0.29344431867463205, 'f1': 0.32785009634869255}, 'combined': 0.21396532603809407, 'epoch': 27}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3304591607924015, 'r': 0.3266968174057707, 'f1': 0.32856721903213965}, 'combined': 0.2421021613921029, 'epoch': 27}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3490689186721952, 'r': 0.30127612044844576, 'f1': 0.32341641209070365}, 'combined': 0.21449378625704696, 'epoch': 27}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2809523809523809, 'r': 0.2809523809523809, 'f1': 0.2809523809523809}, 'combined': 0.18730158730158725, 'epoch': 27}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4473684210526316, 'r': 0.3695652173913043, 'f1': 0.40476190476190477}, 'combined': 0.2698412698412698, 'epoch': 27}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3, 'r': 0.10344827586206896, 'f1': 0.15384615384615385}, 'combined': 0.10256410256410256, 'epoch': 27}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 28
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 03:52:48.120337: step: 2/466, loss: 0.014356344006955624 2023-01-24 03:52:48.761509: step: 4/466, loss: 0.04025975614786148 2023-01-24 03:52:49.405652: step: 6/466, loss: 0.007678386755287647 2023-01-24 03:52:50.049314: step: 8/466, loss: 0.006299751810729504 2023-01-24 03:52:50.689131: step: 10/466, loss: 0.02206443063914776 2023-01-24 03:52:51.323421: step: 12/466, loss: 0.009057655930519104 2023-01-24 03:52:51.920144: step: 14/466, loss: 0.0005861087120138109 2023-01-24 03:52:52.502108: step: 16/466, loss: 0.020142707973718643 2023-01-24 03:52:53.079952: step: 18/466, loss: 0.01774671860039234 2023-01-24 03:52:53.650378: step: 20/466, loss: 0.3696327209472656 2023-01-24 03:52:54.332564: step: 22/466, loss: 0.04497640207409859 2023-01-24 03:52:55.024791: step: 24/466, loss: 0.0403880812227726 2023-01-24 03:52:55.660195: step: 26/466, loss: 0.028408147394657135 2023-01-24 03:52:56.272026: step: 28/466, loss: 0.013730722479522228 2023-01-24 03:52:56.915941: step: 30/466, loss: 0.220284104347229 2023-01-24 03:52:57.512178: step: 32/466, loss: 0.0032014932949095964 2023-01-24 03:52:58.134461: step: 34/466, loss: 0.020034316927194595 2023-01-24 03:52:58.744011: step: 36/466, loss: 0.010054860264062881 2023-01-24 03:52:59.293379: step: 38/466, loss: 0.000760409515351057 2023-01-24 03:52:59.906684: step: 40/466, loss: 0.002009687013924122 2023-01-24 03:53:00.501969: step: 42/466, loss: 0.058458112180233 2023-01-24 03:53:01.069411: step: 44/466, loss: 0.004694878589361906 2023-01-24 03:53:01.653877: step: 46/466,
loss: 0.04860213026404381 2023-01-24 03:53:02.367304: step: 48/466, loss: 0.019404586404561996 2023-01-24 03:53:02.946496: step: 50/466, loss: 1.8576234579086304 2023-01-24 03:53:03.505579: step: 52/466, loss: 0.0023364226799458265 2023-01-24 03:53:04.163440: step: 54/466, loss: 0.03962698578834534 2023-01-24 03:53:04.780160: step: 56/466, loss: 0.014290332794189453 2023-01-24 03:53:05.475720: step: 58/466, loss: 0.5267027616500854 2023-01-24 03:53:06.104368: step: 60/466, loss: 0.03794585168361664 2023-01-24 03:53:06.695040: step: 62/466, loss: 0.4988641142845154 2023-01-24 03:53:07.442299: step: 64/466, loss: 0.03297088295221329 2023-01-24 03:53:08.055943: step: 66/466, loss: 0.0036860518157482147 2023-01-24 03:53:08.714684: step: 68/466, loss: 0.05012097209692001 2023-01-24 03:53:09.325443: step: 70/466, loss: 0.007256614975631237 2023-01-24 03:53:09.862637: step: 72/466, loss: 0.012991432100534439 2023-01-24 03:53:10.481161: step: 74/466, loss: 0.7627309560775757 2023-01-24 03:53:11.187278: step: 76/466, loss: 0.06482483446598053 2023-01-24 03:53:11.729864: step: 78/466, loss: 0.02229459211230278 2023-01-24 03:53:12.374144: step: 80/466, loss: 0.011697539128363132 2023-01-24 03:53:12.964429: step: 82/466, loss: 0.046592358499765396 2023-01-24 03:53:13.566511: step: 84/466, loss: 0.019263656809926033 2023-01-24 03:53:14.205668: step: 86/466, loss: 0.015519551932811737 2023-01-24 03:53:14.793830: step: 88/466, loss: 0.010792257264256477 2023-01-24 03:53:15.404216: step: 90/466, loss: 0.047918882220983505 2023-01-24 03:53:16.068482: step: 92/466, loss: 0.035363685339689255 2023-01-24 03:53:16.679724: step: 94/466, loss: 0.005803626496344805 2023-01-24 03:53:17.308710: step: 96/466, loss: 0.0007789679802954197 2023-01-24 03:53:17.882063: step: 98/466, loss: 0.05806687846779823 2023-01-24 03:53:18.468058: step: 100/466, loss: 0.015525509603321552 2023-01-24 03:53:19.113203: step: 102/466, loss: 0.046151284128427505 2023-01-24 03:53:19.714768: step: 104/466, loss: 
0.04594738781452179 2023-01-24 03:53:20.287818: step: 106/466, loss: 0.015765078365802765 2023-01-24 03:53:20.907471: step: 108/466, loss: 0.0011960859410464764 2023-01-24 03:53:21.587267: step: 110/466, loss: 0.014489513821899891 2023-01-24 03:53:22.283311: step: 112/466, loss: 0.10112947225570679 2023-01-24 03:53:22.975253: step: 114/466, loss: 0.03327832370996475 2023-01-24 03:53:23.687830: step: 116/466, loss: 0.022093651816248894 2023-01-24 03:53:24.326615: step: 118/466, loss: 0.050986457616090775 2023-01-24 03:53:24.982675: step: 120/466, loss: 0.053221650421619415 2023-01-24 03:53:25.571640: step: 122/466, loss: 0.03191741928458214 2023-01-24 03:53:26.195568: step: 124/466, loss: 0.01965634897351265 2023-01-24 03:53:26.801136: step: 126/466, loss: 0.02230476588010788 2023-01-24 03:53:27.418646: step: 128/466, loss: 0.005988352932035923 2023-01-24 03:53:28.005358: step: 130/466, loss: 0.0023371069692075253 2023-01-24 03:53:28.583663: step: 132/466, loss: 0.007428319193422794 2023-01-24 03:53:29.185643: step: 134/466, loss: 0.03449222072958946 2023-01-24 03:53:29.797767: step: 136/466, loss: 0.014719021506607533 2023-01-24 03:53:30.402498: step: 138/466, loss: 0.03975159302353859 2023-01-24 03:53:30.956860: step: 140/466, loss: 0.005082350689917803 2023-01-24 03:53:31.574722: step: 142/466, loss: 0.006860900670289993 2023-01-24 03:53:32.216381: step: 144/466, loss: 0.024904923513531685 2023-01-24 03:53:32.872823: step: 146/466, loss: 0.020591579377651215 2023-01-24 03:53:33.485574: step: 148/466, loss: 0.07017152011394501 2023-01-24 03:53:34.176354: step: 150/466, loss: 0.016340402886271477 2023-01-24 03:53:34.852595: step: 152/466, loss: 0.009680577553808689 2023-01-24 03:53:35.444583: step: 154/466, loss: 0.11456497013568878 2023-01-24 03:53:36.166296: step: 156/466, loss: 0.0032149055041372776 2023-01-24 03:53:36.787926: step: 158/466, loss: 0.002328538103029132 2023-01-24 03:53:37.443673: step: 160/466, loss: 0.04624420031905174 2023-01-24 
03:53:37.990125: step: 162/466, loss: 0.0006145125371403992 2023-01-24 03:53:38.599561: step: 164/466, loss: 0.017932584509253502 2023-01-24 03:53:39.289417: step: 166/466, loss: 0.10638228803873062 2023-01-24 03:53:39.934352: step: 168/466, loss: 0.03491752967238426 2023-01-24 03:53:40.569982: step: 170/466, loss: 0.008974886499345303 2023-01-24 03:53:41.187611: step: 172/466, loss: 0.009993337094783783 2023-01-24 03:53:41.804649: step: 174/466, loss: 0.009299321100115776 2023-01-24 03:53:42.445989: step: 176/466, loss: 0.06450952589511871 2023-01-24 03:53:43.092963: step: 178/466, loss: 0.01272459328174591 2023-01-24 03:53:43.672515: step: 180/466, loss: 0.004704466089606285 2023-01-24 03:53:44.282461: step: 182/466, loss: 0.06017916649580002 2023-01-24 03:53:44.886700: step: 184/466, loss: 0.005661633796989918 2023-01-24 03:53:45.497422: step: 186/466, loss: 0.03322940692305565 2023-01-24 03:53:46.073502: step: 188/466, loss: 0.0073445746675133705 2023-01-24 03:53:46.677701: step: 190/466, loss: 0.013768017292022705 2023-01-24 03:53:47.331219: step: 192/466, loss: 0.01368219219148159 2023-01-24 03:53:47.921154: step: 194/466, loss: 0.00118536117952317 2023-01-24 03:53:48.539852: step: 196/466, loss: 0.021496085450053215 2023-01-24 03:53:49.166601: step: 198/466, loss: 0.04687000811100006 2023-01-24 03:53:49.825293: step: 200/466, loss: 0.05527500808238983 2023-01-24 03:53:50.460356: step: 202/466, loss: 0.021703608334064484 2023-01-24 03:53:51.116222: step: 204/466, loss: 0.03900735452771187 2023-01-24 03:53:51.732768: step: 206/466, loss: 0.015040869824588299 2023-01-24 03:53:52.400923: step: 208/466, loss: 0.006747271865606308 2023-01-24 03:53:53.057133: step: 210/466, loss: 0.05444791540503502 2023-01-24 03:53:53.690672: step: 212/466, loss: 0.01580660231411457 2023-01-24 03:53:54.288323: step: 214/466, loss: 0.0037571711000055075 2023-01-24 03:53:54.883081: step: 216/466, loss: 0.002167017897590995 2023-01-24 03:53:55.547025: step: 218/466, loss: 
0.02093181200325489 2023-01-24 03:53:56.201532: step: 220/466, loss: 0.27529504895210266 2023-01-24 03:53:56.821001: step: 222/466, loss: 0.017809255048632622 2023-01-24 03:53:57.409465: step: 224/466, loss: 0.025749122723937035 2023-01-24 03:53:58.033551: step: 226/466, loss: 0.0021865046583116055 2023-01-24 03:53:58.631619: step: 228/466, loss: 0.004157186485826969 2023-01-24 03:53:59.329979: step: 230/466, loss: 0.051419783383607864 2023-01-24 03:53:59.949025: step: 232/466, loss: 0.5104377865791321 2023-01-24 03:54:00.643507: step: 234/466, loss: 0.014660846441984177 2023-01-24 03:54:01.277727: step: 236/466, loss: 0.016716402024030685 2023-01-24 03:54:01.849925: step: 238/466, loss: 0.00012194723240099847 2023-01-24 03:54:02.467191: step: 240/466, loss: 0.027347303926944733 2023-01-24 03:54:03.121290: step: 242/466, loss: 0.00809621810913086 2023-01-24 03:54:03.660416: step: 244/466, loss: 0.004353704862296581 2023-01-24 03:54:04.336174: step: 246/466, loss: 0.002220169873908162 2023-01-24 03:54:04.991755: step: 248/466, loss: 0.065149687230587 2023-01-24 03:54:05.614681: step: 250/466, loss: 0.0572744645178318 2023-01-24 03:54:06.275788: step: 252/466, loss: 0.008158638142049313 2023-01-24 03:54:06.958243: step: 254/466, loss: 0.00939987227320671 2023-01-24 03:54:07.580704: step: 256/466, loss: 0.021884001791477203 2023-01-24 03:54:08.136264: step: 258/466, loss: 0.006775291170924902 2023-01-24 03:54:08.751353: step: 260/466, loss: 0.08223210275173187 2023-01-24 03:54:09.348860: step: 262/466, loss: 0.011056998744606972 2023-01-24 03:54:09.969514: step: 264/466, loss: 0.011703469790518284 2023-01-24 03:54:10.631498: step: 266/466, loss: 0.01117135863751173 2023-01-24 03:54:11.265444: step: 268/466, loss: 0.031074855476617813 2023-01-24 03:54:11.894358: step: 270/466, loss: 0.023963337764143944 2023-01-24 03:54:12.570050: step: 272/466, loss: 0.008821115829050541 2023-01-24 03:54:13.188636: step: 274/466, loss: 0.03331343084573746 2023-01-24 03:54:13.796269: 
step: 276/466, loss: 0.0015307868598029017 2023-01-24 03:54:14.411819: step: 278/466, loss: 0.0017039136728271842 2023-01-24 03:54:15.053900: step: 280/466, loss: 0.08409566432237625 2023-01-24 03:54:15.657463: step: 282/466, loss: 0.016561800613999367 2023-01-24 03:54:16.268024: step: 284/466, loss: 0.010610692203044891 2023-01-24 03:54:16.837463: step: 286/466, loss: 0.001059519941918552 2023-01-24 03:54:17.398984: step: 288/466, loss: 0.03135376423597336 2023-01-24 03:54:18.000679: step: 290/466, loss: 0.012570686638355255 2023-01-24 03:54:18.573522: step: 292/466, loss: 0.05125534161925316 2023-01-24 03:54:19.317743: step: 294/466, loss: 0.04573095589876175 2023-01-24 03:54:19.958195: step: 296/466, loss: 0.0014932537451386452 2023-01-24 03:54:20.579610: step: 298/466, loss: 0.011165394447743893 2023-01-24 03:54:21.181105: step: 300/466, loss: 0.11039699614048004 2023-01-24 03:54:21.861896: step: 302/466, loss: 0.017728859558701515 2023-01-24 03:54:22.415694: step: 304/466, loss: 0.013740737922489643 2023-01-24 03:54:23.115833: step: 306/466, loss: 0.5609971284866333 2023-01-24 03:54:23.780450: step: 308/466, loss: 0.0003966326476074755 2023-01-24 03:54:24.357632: step: 310/466, loss: 0.11472980678081512 2023-01-24 03:54:24.950819: step: 312/466, loss: 0.02893056534230709 2023-01-24 03:54:25.526382: step: 314/466, loss: 0.001872989465482533 2023-01-24 03:54:26.177922: step: 316/466, loss: 3.7685582637786865 2023-01-24 03:54:26.791542: step: 318/466, loss: 0.05967605859041214 2023-01-24 03:54:27.429575: step: 320/466, loss: 0.022798847407102585 2023-01-24 03:54:28.081772: step: 322/466, loss: 0.057037223130464554 2023-01-24 03:54:28.635375: step: 324/466, loss: 0.011289509013295174 2023-01-24 03:54:29.200760: step: 326/466, loss: 0.13259345293045044 2023-01-24 03:54:29.845311: step: 328/466, loss: 0.0009833957301452756 2023-01-24 03:54:30.503951: step: 330/466, loss: 0.05492498725652695 2023-01-24 03:54:31.145531: step: 332/466, loss: 0.46318674087524414 
2023-01-24 03:54:31.801339: step: 334/466, loss: 0.0014214838156476617 2023-01-24 03:54:32.413939: step: 336/466, loss: 0.02357303723692894 2023-01-24 03:54:32.974976: step: 338/466, loss: 0.006936745252460241 2023-01-24 03:54:33.592199: step: 340/466, loss: 0.07935026288032532 2023-01-24 03:54:34.213486: step: 342/466, loss: 0.0024470698554068804 2023-01-24 03:54:34.776599: step: 344/466, loss: 0.0981178805232048 2023-01-24 03:54:35.507584: step: 346/466, loss: 0.004784220829606056 2023-01-24 03:54:36.061202: step: 348/466, loss: 0.0003342062991578132 2023-01-24 03:54:36.653941: step: 350/466, loss: 0.0058012730441987514 2023-01-24 03:54:37.304038: step: 352/466, loss: 0.015377351082861423 2023-01-24 03:54:37.951217: step: 354/466, loss: 0.010469252243638039 2023-01-24 03:54:38.646829: step: 356/466, loss: 0.006357523147016764 2023-01-24 03:54:39.277621: step: 358/466, loss: 0.021507292985916138 2023-01-24 03:54:39.844570: step: 360/466, loss: 0.002699486678466201 2023-01-24 03:54:40.480797: step: 362/466, loss: 0.0266110859811306 2023-01-24 03:54:41.074512: step: 364/466, loss: 0.0008565335301682353 2023-01-24 03:54:41.726129: step: 366/466, loss: 0.012085853144526482 2023-01-24 03:54:42.322854: step: 368/466, loss: 0.0008071648189797997 2023-01-24 03:54:42.947604: step: 370/466, loss: 0.0019020200707018375 2023-01-24 03:54:43.609428: step: 372/466, loss: 0.014376146718859673 2023-01-24 03:54:44.225017: step: 374/466, loss: 0.2126854807138443 2023-01-24 03:54:44.824399: step: 376/466, loss: 0.004843392875045538 2023-01-24 03:54:45.494791: step: 378/466, loss: 0.020051002502441406 2023-01-24 03:54:46.188233: step: 380/466, loss: 0.022910239174962044 2023-01-24 03:54:46.799820: step: 382/466, loss: 0.02626357600092888 2023-01-24 03:54:47.393702: step: 384/466, loss: 0.01931600458920002 2023-01-24 03:54:47.984414: step: 386/466, loss: 0.0068773808889091015 2023-01-24 03:54:48.617478: step: 388/466, loss: 0.021012280136346817 2023-01-24 03:54:49.179901: step: 
390/466, loss: 0.004145463462918997 2023-01-24 03:54:49.830423: step: 392/466, loss: 0.08756530284881592 2023-01-24 03:54:50.504597: step: 394/466, loss: 0.05061415582895279 2023-01-24 03:54:51.117411: step: 396/466, loss: 0.005754661746323109 2023-01-24 03:54:51.712906: step: 398/466, loss: 0.0050090826116502285 2023-01-24 03:54:52.340338: step: 400/466, loss: 0.03864866867661476 2023-01-24 03:54:52.921773: step: 402/466, loss: 0.005267775617539883 2023-01-24 03:54:53.576748: step: 404/466, loss: 0.028270401060581207 2023-01-24 03:54:54.251503: step: 406/466, loss: 0.2942807674407959 2023-01-24 03:54:54.928362: step: 408/466, loss: 0.1558419167995453 2023-01-24 03:54:55.604926: step: 410/466, loss: 0.0996171310544014 2023-01-24 03:54:56.245711: step: 412/466, loss: 0.01766573078930378 2023-01-24 03:54:57.043060: step: 414/466, loss: 0.0002705434162635356 2023-01-24 03:54:57.653546: step: 416/466, loss: 0.00040582052315585315 2023-01-24 03:54:58.387644: step: 418/466, loss: 0.021263068541884422 2023-01-24 03:54:58.966991: step: 420/466, loss: 0.11323844641447067 2023-01-24 03:54:59.590807: step: 422/466, loss: 0.01785622164607048 2023-01-24 03:55:00.297159: step: 424/466, loss: 0.00032024685060605407 2023-01-24 03:55:00.842550: step: 426/466, loss: 0.0005431880126707256 2023-01-24 03:55:01.468252: step: 428/466, loss: 0.024054253473877907 2023-01-24 03:55:02.168529: step: 430/466, loss: 0.002253231592476368 2023-01-24 03:55:02.778928: step: 432/466, loss: 0.020636066794395447 2023-01-24 03:55:03.372617: step: 434/466, loss: 0.00567647023126483 2023-01-24 03:55:04.021135: step: 436/466, loss: 0.13762988150119781 2023-01-24 03:55:04.664050: step: 438/466, loss: 0.06241016462445259 2023-01-24 03:55:05.297039: step: 440/466, loss: 0.0031003092881292105 2023-01-24 03:55:05.963498: step: 442/466, loss: 0.00021963377366773784 2023-01-24 03:55:06.634197: step: 444/466, loss: 0.21239763498306274 2023-01-24 03:55:07.315931: step: 446/466, loss: 0.024081414565443993 
2023-01-24 03:55:07.967737: step: 448/466, loss: 0.01938778907060623 2023-01-24 03:55:08.572700: step: 450/466, loss: 0.004417332820594311 2023-01-24 03:55:09.240095: step: 452/466, loss: 0.00482999486848712 2023-01-24 03:55:09.881675: step: 454/466, loss: 0.016177594661712646 2023-01-24 03:55:10.472445: step: 456/466, loss: 0.01952294632792473 2023-01-24 03:55:11.058315: step: 458/466, loss: 0.035597063601017 2023-01-24 03:55:11.663361: step: 460/466, loss: 0.021008197218179703 2023-01-24 03:55:12.227326: step: 462/466, loss: 0.031786419451236725 2023-01-24 03:55:12.866365: step: 464/466, loss: 0.04171544685959816 2023-01-24 03:55:13.554945: step: 466/466, loss: 0.024207210168242455 2023-01-24 03:55:14.205300: step: 468/466, loss: 0.060788240283727646 2023-01-24 03:55:14.750358: step: 470/466, loss: 0.009493236429989338 2023-01-24 03:55:15.400582: step: 472/466, loss: 0.04282115772366524 2023-01-24 03:55:15.967484: step: 474/466, loss: 0.20088478922843933 2023-01-24 03:55:16.603919: step: 476/466, loss: 0.01010823342949152 2023-01-24 03:55:17.176624: step: 478/466, loss: 0.02495218999683857 2023-01-24 03:55:17.682571: step: 480/466, loss: 0.005852373316884041 2023-01-24 03:55:18.347286: step: 482/466, loss: 0.0020090469624847174 2023-01-24 03:55:18.961178: step: 484/466, loss: 0.003161755623295903 2023-01-24 03:55:19.560457: step: 486/466, loss: 0.024762656539678574 2023-01-24 03:55:20.190043: step: 488/466, loss: 0.04741264879703522 2023-01-24 03:55:20.854575: step: 490/466, loss: 0.02133660390973091 2023-01-24 03:55:21.498888: step: 492/466, loss: 0.013741686008870602 2023-01-24 03:55:22.090018: step: 494/466, loss: 0.010217435657978058 2023-01-24 03:55:22.803816: step: 496/466, loss: 0.0017479138914495707 2023-01-24 03:55:23.372800: step: 498/466, loss: 0.01496767345815897 2023-01-24 03:55:24.036363: step: 500/466, loss: 0.015532799065113068 2023-01-24 03:55:24.669613: step: 502/466, loss: 0.012100692838430405 2023-01-24 03:55:25.228638: step: 504/466, loss: 
0.0010967212729156017 2023-01-24 03:55:25.821211: step: 506/466, loss: 0.014203607104718685 2023-01-24 03:55:26.441265: step: 508/466, loss: 0.010653822682797909 2023-01-24 03:55:27.023216: step: 510/466, loss: 0.013628825545310974 2023-01-24 03:55:27.646457: step: 512/466, loss: 0.060214050114154816 2023-01-24 03:55:28.265972: step: 514/466, loss: 0.006166969425976276 2023-01-24 03:55:28.814087: step: 516/466, loss: 0.00669153593480587 2023-01-24 03:55:29.393250: step: 518/466, loss: 0.04255734756588936 2023-01-24 03:55:30.036451: step: 520/466, loss: 0.06647557765245438 2023-01-24 03:55:30.720713: step: 522/466, loss: 0.005385304801166058 2023-01-24 03:55:31.339540: step: 524/466, loss: 1.0983834266662598 2023-01-24 03:55:31.968123: step: 526/466, loss: 0.039968132972717285 2023-01-24 03:55:32.646058: step: 528/466, loss: 0.03974110633134842 2023-01-24 03:55:33.247783: step: 530/466, loss: 0.01054446306079626 2023-01-24 03:55:33.863327: step: 532/466, loss: 0.018481917679309845 2023-01-24 03:55:34.443462: step: 534/466, loss: 0.018977802246809006 2023-01-24 03:55:35.025183: step: 536/466, loss: 0.009774553589522839 2023-01-24 03:55:35.660589: step: 538/466, loss: 0.001242406782694161 2023-01-24 03:55:36.258370: step: 540/466, loss: 0.000893936085049063 2023-01-24 03:55:36.886861: step: 542/466, loss: 0.012543848715722561 2023-01-24 03:55:37.544373: step: 544/466, loss: 0.0074572633020579815 2023-01-24 03:55:38.137455: step: 546/466, loss: 0.42661380767822266 2023-01-24 03:55:38.797703: step: 548/466, loss: 0.02388249896466732 2023-01-24 03:55:39.469781: step: 550/466, loss: 0.0061629582196474075 2023-01-24 03:55:40.047070: step: 552/466, loss: 0.011627927422523499 2023-01-24 03:55:40.647425: step: 554/466, loss: 0.06222614645957947 2023-01-24 03:55:41.313429: step: 556/466, loss: 0.018478328362107277 2023-01-24 03:55:41.953671: step: 558/466, loss: 0.055090226233005524 2023-01-24 03:55:42.605246: step: 560/466, loss: 0.022607017308473587 2023-01-24 
03:55:43.226513: step: 562/466, loss: 0.001342977280728519 2023-01-24 03:55:43.828082: step: 564/466, loss: 0.009456566534936428 2023-01-24 03:55:44.420224: step: 566/466, loss: 0.03890547528862953 2023-01-24 03:55:45.089381: step: 568/466, loss: 0.008460449986159801 2023-01-24 03:55:45.701764: step: 570/466, loss: 0.0013223905116319656 2023-01-24 03:55:46.335943: step: 572/466, loss: 0.0388493649661541 2023-01-24 03:55:46.983697: step: 574/466, loss: 0.06376676261425018 2023-01-24 03:55:47.637327: step: 576/466, loss: 0.02470981702208519 2023-01-24 03:55:48.230355: step: 578/466, loss: 0.24580951035022736 2023-01-24 03:55:48.860182: step: 580/466, loss: 0.07626300305128098 2023-01-24 03:55:49.490784: step: 582/466, loss: 0.025210030376911163 2023-01-24 03:55:50.166500: step: 584/466, loss: 0.13735957443714142 2023-01-24 03:55:50.828254: step: 586/466, loss: 0.004883863963186741 2023-01-24 03:55:51.539421: step: 588/466, loss: 0.023837409913539886 2023-01-24 03:55:52.169658: step: 590/466, loss: 0.02724401466548443 2023-01-24 03:55:52.819413: step: 592/466, loss: 0.019942408427596092 2023-01-24 03:55:53.398965: step: 594/466, loss: 0.3258499801158905 2023-01-24 03:55:54.022441: step: 596/466, loss: 0.205764502286911 2023-01-24 03:55:54.653271: step: 598/466, loss: 0.017428552731871605 2023-01-24 03:55:55.339342: step: 600/466, loss: 0.019880976527929306 2023-01-24 03:55:55.904893: step: 602/466, loss: 0.0377473384141922 2023-01-24 03:55:56.571500: step: 604/466, loss: 7.953244494274259e-05 2023-01-24 03:55:57.163235: step: 606/466, loss: 0.07877268642187119 2023-01-24 03:55:57.770332: step: 608/466, loss: 0.009737065061926842 2023-01-24 03:55:58.435669: step: 610/466, loss: 0.004096858203411102 2023-01-24 03:55:59.078863: step: 612/466, loss: 0.001593309105373919 2023-01-24 03:55:59.728311: step: 614/466, loss: 0.005396370310336351 2023-01-24 03:56:00.379278: step: 616/466, loss: 0.0022211617324501276 2023-01-24 03:56:00.979024: step: 618/466, loss: 
0.061748407781124115 2023-01-24 03:56:01.591507: step: 620/466, loss: 0.021056359633803368 2023-01-24 03:56:02.245955: step: 622/466, loss: 0.029573313891887665 2023-01-24 03:56:02.907355: step: 624/466, loss: 0.01509782113134861 2023-01-24 03:56:03.511010: step: 626/466, loss: 0.22090202569961548 2023-01-24 03:56:04.198656: step: 628/466, loss: 0.027898678556084633 2023-01-24 03:56:04.835619: step: 630/466, loss: 0.0019739815033972263 2023-01-24 03:56:05.463531: step: 632/466, loss: 0.012427431531250477 2023-01-24 03:56:06.092217: step: 634/466, loss: 0.005324557889252901 2023-01-24 03:56:06.776466: step: 636/466, loss: 0.5416416525840759 2023-01-24 03:56:07.402308: step: 638/466, loss: 0.02002260461449623 2023-01-24 03:56:07.965823: step: 640/466, loss: 0.0034390026703476906 2023-01-24 03:56:08.545646: step: 642/466, loss: 0.003750969422981143 2023-01-24 03:56:09.182531: step: 644/466, loss: 0.02836640365421772 2023-01-24 03:56:09.800901: step: 646/466, loss: 0.01306913048028946 2023-01-24 03:56:10.427129: step: 648/466, loss: 0.039452120661735535 2023-01-24 03:56:11.062282: step: 650/466, loss: 0.43644657731056213 2023-01-24 03:56:11.699098: step: 652/466, loss: 0.054574090987443924 2023-01-24 03:56:12.315619: step: 654/466, loss: 0.005646299570798874 2023-01-24 03:56:12.892282: step: 656/466, loss: 0.0025711979251354933 2023-01-24 03:56:13.559031: step: 658/466, loss: 0.13658203184604645 2023-01-24 03:56:14.166502: step: 660/466, loss: 0.011020969599485397 2023-01-24 03:56:14.775683: step: 662/466, loss: 0.12328854203224182 2023-01-24 03:56:15.388466: step: 664/466, loss: 0.2930150330066681 2023-01-24 03:56:16.040437: step: 666/466, loss: 0.006133504211902618 2023-01-24 03:56:16.673557: step: 668/466, loss: 0.005903858691453934 2023-01-24 03:56:17.306612: step: 670/466, loss: 0.018333403393626213 2023-01-24 03:56:18.035627: step: 672/466, loss: 0.0354229100048542 2023-01-24 03:56:18.633227: step: 674/466, loss: 0.025511808693408966 2023-01-24 03:56:19.256279: 
step: 676/466, loss: 0.01756656914949417 2023-01-24 03:56:19.820051: step: 678/466, loss: 0.030399370938539505 2023-01-24 03:56:20.456387: step: 680/466, loss: 0.04207872599363327 2023-01-24 03:56:21.041754: step: 682/466, loss: 0.026421351358294487 2023-01-24 03:56:21.742751: step: 684/466, loss: 0.006762126926332712 2023-01-24 03:56:22.375701: step: 686/466, loss: 0.020466167479753494 2023-01-24 03:56:23.076140: step: 688/466, loss: 0.00631640525534749 2023-01-24 03:56:23.708273: step: 690/466, loss: 0.04035324230790138 2023-01-24 03:56:24.279478: step: 692/466, loss: 0.009602941572666168 2023-01-24 03:56:24.866431: step: 694/466, loss: 0.03543735668063164 2023-01-24 03:56:25.454718: step: 696/466, loss: 0.12897473573684692 2023-01-24 03:56:26.161977: step: 698/466, loss: 0.002430747961625457 2023-01-24 03:56:26.778491: step: 700/466, loss: 0.10553860664367676 2023-01-24 03:56:27.371274: step: 702/466, loss: 0.2577919661998749 2023-01-24 03:56:27.988853: step: 704/466, loss: 0.04251670464873314 2023-01-24 03:56:28.610830: step: 706/466, loss: 0.019595172256231308 2023-01-24 03:56:29.284773: step: 708/466, loss: 0.008916772902011871 2023-01-24 03:56:29.882391: step: 710/466, loss: 0.04072237387299538 2023-01-24 03:56:30.554439: step: 712/466, loss: 0.08060120791196823 2023-01-24 03:56:31.123932: step: 714/466, loss: 0.015646446496248245 2023-01-24 03:56:31.861328: step: 716/466, loss: 0.04056132212281227 2023-01-24 03:56:32.522525: step: 718/466, loss: 0.056185342371463776 2023-01-24 03:56:33.307830: step: 720/466, loss: 0.01655575819313526 2023-01-24 03:56:33.917584: step: 722/466, loss: 0.04714022949337959 2023-01-24 03:56:34.471938: step: 724/466, loss: 0.008735047653317451 2023-01-24 03:56:35.058962: step: 726/466, loss: 0.05232086777687073 2023-01-24 03:56:35.711976: step: 728/466, loss: 0.013638571836054325 2023-01-24 03:56:36.366340: step: 730/466, loss: 0.020393308252096176 2023-01-24 03:56:36.977426: step: 732/466, loss: 0.037196703255176544 2023-01-24 
03:56:37.555640: step: 734/466, loss: 0.0359979048371315 2023-01-24 03:56:38.145058: step: 736/466, loss: 0.036491766571998596 2023-01-24 03:56:38.790484: step: 738/466, loss: 1.5769621133804321 2023-01-24 03:56:39.479388: step: 740/466, loss: 0.013950319960713387 2023-01-24 03:56:40.116646: step: 742/466, loss: 0.13955432176589966 2023-01-24 03:56:40.741203: step: 744/466, loss: 0.014387154020369053 2023-01-24 03:56:41.405492: step: 746/466, loss: 0.025880424305796623 2023-01-24 03:56:42.004328: step: 748/466, loss: 0.008674003183841705 2023-01-24 03:56:42.581798: step: 750/466, loss: 0.002916875295341015 2023-01-24 03:56:43.260070: step: 752/466, loss: 0.015369150787591934 2023-01-24 03:56:43.946808: step: 754/466, loss: 0.09049924463033676 2023-01-24 03:56:44.574080: step: 756/466, loss: 0.009892329573631287 2023-01-24 03:56:45.225050: step: 758/466, loss: 0.02250661514699459 2023-01-24 03:56:45.833645: step: 760/466, loss: 0.019506581127643585 2023-01-24 03:56:46.418494: step: 762/466, loss: 0.015529129654169083 2023-01-24 03:56:47.084093: step: 764/466, loss: 0.004296987317502499 2023-01-24 03:56:47.738647: step: 766/466, loss: 0.05676477029919624 2023-01-24 03:56:48.374786: step: 768/466, loss: 0.01909170299768448 2023-01-24 03:56:49.034562: step: 770/466, loss: 0.07450731098651886 2023-01-24 03:56:49.639361: step: 772/466, loss: 0.03332969546318054 2023-01-24 03:56:50.310538: step: 774/466, loss: 0.048512302339076996 2023-01-24 03:56:50.897025: step: 776/466, loss: 0.009806359186768532 2023-01-24 03:56:51.528633: step: 778/466, loss: 0.10658646374940872 2023-01-24 03:56:52.106629: step: 780/466, loss: 0.02136125974357128 2023-01-24 03:56:52.798886: step: 782/466, loss: 0.02680668793618679 2023-01-24 03:56:53.416294: step: 784/466, loss: 0.01709206961095333 2023-01-24 03:56:54.115189: step: 786/466, loss: 0.0015258606290444732 2023-01-24 03:56:54.756773: step: 788/466, loss: 0.0022918814793229103 2023-01-24 03:56:55.348949: step: 790/466, loss: 
0.04231948405504227 2023-01-24 03:56:55.928072: step: 792/466, loss: 0.0018199823098257184 2023-01-24 03:56:56.612327: step: 794/466, loss: 0.041555311530828476 2023-01-24 03:56:57.300578: step: 796/466, loss: 0.009676672518253326 2023-01-24 03:56:57.921767: step: 798/466, loss: 0.0991070494055748 2023-01-24 03:56:58.574724: step: 800/466, loss: 0.25320005416870117 2023-01-24 03:56:59.210481: step: 802/466, loss: 0.015998797491192818 2023-01-24 03:56:59.889540: step: 804/466, loss: 0.011345572769641876 2023-01-24 03:57:00.496236: step: 806/466, loss: 0.005489513278007507 2023-01-24 03:57:01.112570: step: 808/466, loss: 0.08279027789831161 2023-01-24 03:57:01.684223: step: 810/466, loss: 0.03269160911440849 2023-01-24 03:57:02.244034: step: 812/466, loss: 0.006571331061422825 2023-01-24 03:57:02.861643: step: 814/466, loss: 0.0014252661494538188 2023-01-24 03:57:03.508750: step: 816/466, loss: 0.012872443534433842 2023-01-24 03:57:04.131791: step: 818/466, loss: 0.403464138507843 2023-01-24 03:57:04.867791: step: 820/466, loss: 0.011302152648568153 2023-01-24 03:57:05.528376: step: 822/466, loss: 0.06163051724433899 2023-01-24 03:57:06.146171: step: 824/466, loss: 0.014406262896955013 2023-01-24 03:57:06.780880: step: 826/466, loss: 0.013885182328522205 2023-01-24 03:57:07.430637: step: 828/466, loss: 0.003926701378077269 2023-01-24 03:57:08.070116: step: 830/466, loss: 0.07288981229066849 2023-01-24 03:57:08.708288: step: 832/466, loss: 0.0074288509786129 2023-01-24 03:57:09.382076: step: 834/466, loss: 0.0028718686662614346 2023-01-24 03:57:10.010071: step: 836/466, loss: 0.0031497676391154528 2023-01-24 03:57:10.618909: step: 838/466, loss: 0.03677372261881828 2023-01-24 03:57:11.245555: step: 840/466, loss: 0.01764160580933094 2023-01-24 03:57:11.907081: step: 842/466, loss: 0.00905714463442564 2023-01-24 03:57:12.541905: step: 844/466, loss: 0.052184417843818665 2023-01-24 03:57:13.131601: step: 846/466, loss: 0.00874386541545391 2023-01-24 03:57:13.734884: 
step: 848/466, loss: 0.0046443939208984375 2023-01-24 03:57:14.350148: step: 850/466, loss: 0.01964038610458374 2023-01-24 03:57:15.007749: step: 852/466, loss: 0.04825638607144356 2023-01-24 03:57:15.580269: step: 854/466, loss: 0.007597595453262329 2023-01-24 03:57:16.201711: step: 856/466, loss: 0.03477509319782257 2023-01-24 03:57:16.812934: step: 858/466, loss: 0.0387188121676445 2023-01-24 03:57:17.397958: step: 860/466, loss: 0.714797854423523 2023-01-24 03:57:18.057005: step: 862/466, loss: 0.0008584187598899007 2023-01-24 03:57:18.688312: step: 864/466, loss: 0.12720586359500885 2023-01-24 03:57:19.353904: step: 866/466, loss: 0.03522484004497528 2023-01-24 03:57:19.999284: step: 868/466, loss: 0.0659174770116806 2023-01-24 03:57:20.550326: step: 870/466, loss: 0.0002512026985641569 2023-01-24 03:57:21.159906: step: 872/466, loss: 0.016449326649308205 2023-01-24 03:57:21.752830: step: 874/466, loss: 0.0061976853758096695 2023-01-24 03:57:22.329043: step: 876/466, loss: 0.01034632883965969 2023-01-24 03:57:22.945420: step: 878/466, loss: 0.024085231125354767 2023-01-24 03:57:23.600613: step: 880/466, loss: 0.04048647731542587 2023-01-24 03:57:24.198204: step: 882/466, loss: 0.00844142772257328 2023-01-24 03:57:24.880157: step: 884/466, loss: 0.01513928547501564 2023-01-24 03:57:25.534545: step: 886/466, loss: 0.3922380208969116 2023-01-24 03:57:26.153713: step: 888/466, loss: 0.058890506625175476 2023-01-24 03:57:26.794404: step: 890/466, loss: 0.011841786094009876 2023-01-24 03:57:27.446167: step: 892/466, loss: 0.00801936350762844 2023-01-24 03:57:28.019164: step: 894/466, loss: 0.021403668448328972 2023-01-24 03:57:28.637133: step: 896/466, loss: 0.009452925063669682 2023-01-24 03:57:29.287763: step: 898/466, loss: 0.005652129650115967 2023-01-24 03:57:29.884435: step: 900/466, loss: 0.058477722108364105 2023-01-24 03:57:30.555647: step: 902/466, loss: 0.018946265801787376 2023-01-24 03:57:31.188378: step: 904/466, loss: 0.07883594185113907 2023-01-24 
03:57:31.801835: step: 906/466, loss: 0.011059698648750782 2023-01-24 03:57:32.428095: step: 908/466, loss: 0.0018178574973717332 2023-01-24 03:57:33.036331: step: 910/466, loss: 0.012226266786456108 2023-01-24 03:57:33.781647: step: 912/466, loss: 0.023213017731904984 2023-01-24 03:57:34.387959: step: 914/466, loss: 0.04142146185040474 2023-01-24 03:57:34.944423: step: 916/466, loss: 0.18669646978378296 2023-01-24 03:57:35.618008: step: 918/466, loss: 0.002439426491037011 2023-01-24 03:57:36.260146: step: 920/466, loss: 0.0272270068526268 2023-01-24 03:57:36.791029: step: 922/466, loss: 0.13135747611522675 2023-01-24 03:57:37.471994: step: 924/466, loss: 0.0030987828504294157 2023-01-24 03:57:38.073310: step: 926/466, loss: 0.017529543489217758 2023-01-24 03:57:38.648354: step: 928/466, loss: 0.02338702231645584 2023-01-24 03:57:39.263479: step: 930/466, loss: 0.009364855475723743 2023-01-24 03:57:39.892554: step: 932/466, loss: 0.07453788071870804
==================================================
Loss: 0.064
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3439181738004384, 'r': 0.3367396160930288, 'f1': 0.34029104061558235}, 'combined': 0.25074076676937646, 'epoch': 28}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35303984630785135, 'r': 0.3072301776692893, 'f1': 0.3285458699220152}, 'combined': 0.2178957064767769, 'epoch': 28}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34082836067485617, 'r': 0.2814416008602979, 'f1': 0.3083011727266334}, 'combined': 0.2055341151510889, 'epoch': 28}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3602948861550504, 'r': 0.2848698321329724, 'f1': 0.31817345502001554}, 'combined': 0.20765004432885223, 'epoch': 28}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3415890600427777, 'r': 0.3351072941975637, 'f1': 0.33831713418029896}, 'combined': 0.24928630939600974, 'epoch': 28}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3405750213780812, 'r': 0.29571591977542233, 'f1': 0.3165641664386246}, 'combined': 0.20994929173131577, 'epoch': 28}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3114035087719298, 'r': 0.33809523809523806, 'f1': 0.3242009132420091}, 'combined': 0.2161339421613394, 'epoch': 28}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.32608695652173914, 'f1': 0.39473684210526316}, 'combined': 0.2631578947368421, 'epoch': 28}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.45454545454545453, 'r': 0.1724137931034483, 'f1': 0.25000000000000006}, 'combined': 0.16666666666666669, 'epoch': 28}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 29
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:00:11.833640: step: 2/466, loss: 0.04386208578944206 2023-01-24 04:00:12.441535: step: 4/466, loss: 0.012072478421032429 2023-01-24 04:00:13.132086: step: 6/466, loss: 0.005538816098123789 2023-01-24 04:00:13.695053: step: 8/466, loss: 0.002927206689491868 2023-01-24 04:00:14.360636: step: 10/466, loss: 0.0018422355642542243 2023-01-24 04:00:14.956367: step: 12/466, loss: 0.006833975203335285 2023-01-24 04:00:15.567613: step: 14/466, loss: 0.7979661822319031 2023-01-24 04:00:16.305286: step: 16/466, loss: 2.504417657852173 2023-01-24 04:00:17.001883: step: 18/466, 
loss: 0.0023924887645989656 2023-01-24 04:00:17.579276: step: 20/466, loss: 0.03579749912023544 2023-01-24 04:00:18.281189: step: 22/466, loss: 0.1898466944694519 2023-01-24 04:00:18.887486: step: 24/466, loss: 0.008246527053415775 2023-01-24 04:00:19.473789: step: 26/466, loss: 0.014673544093966484 2023-01-24 04:00:20.076696: step: 28/466, loss: 0.012497391551733017 2023-01-24 04:00:20.705997: step: 30/466, loss: 0.02695520780980587 2023-01-24 04:00:21.246460: step: 32/466, loss: 0.004595303442329168 2023-01-24 04:00:21.832675: step: 34/466, loss: 0.005384698044508696 2023-01-24 04:00:22.500069: step: 36/466, loss: 0.029108962044119835 2023-01-24 04:00:23.093564: step: 38/466, loss: 0.00017507233133073896 2023-01-24 04:00:23.740665: step: 40/466, loss: 0.0008183540776371956 2023-01-24 04:00:24.322435: step: 42/466, loss: 0.2459452599287033 2023-01-24 04:00:24.877973: step: 44/466, loss: 0.021580185741186142 2023-01-24 04:00:25.517301: step: 46/466, loss: 0.05248536914587021 2023-01-24 04:00:26.109211: step: 48/466, loss: 0.034506577998399734 2023-01-24 04:00:26.719746: step: 50/466, loss: 0.0018523776670917869 2023-01-24 04:00:27.320426: step: 52/466, loss: 0.04379947483539581 2023-01-24 04:00:27.952268: step: 54/466, loss: 0.07555657625198364 2023-01-24 04:00:28.543863: step: 56/466, loss: 0.05783117190003395 2023-01-24 04:00:29.153018: step: 58/466, loss: 0.446468323469162 2023-01-24 04:00:29.803032: step: 60/466, loss: 0.0647222027182579 2023-01-24 04:00:30.451291: step: 62/466, loss: 0.019346090033650398 2023-01-24 04:00:31.140381: step: 64/466, loss: 0.07000437378883362 2023-01-24 04:00:31.752285: step: 66/466, loss: 0.07137604802846909 2023-01-24 04:00:32.367301: step: 68/466, loss: 0.005074875894933939 2023-01-24 04:00:33.002771: step: 70/466, loss: 0.01667775772511959 2023-01-24 04:00:33.616100: step: 72/466, loss: 0.04960961639881134 2023-01-24 04:00:34.177443: step: 74/466, loss: 0.03634113445878029 2023-01-24 04:00:34.783385: step: 76/466, loss: 
0.009740367531776428 2023-01-24 04:00:35.358654: step: 78/466, loss: 0.029107673093676567 2023-01-24 04:00:35.950119: step: 80/466, loss: 0.044322799891233444 2023-01-24 04:00:36.530045: step: 82/466, loss: 0.0037913850974291563 2023-01-24 04:00:37.190289: step: 84/466, loss: 0.0012504170881584287 2023-01-24 04:00:37.789388: step: 86/466, loss: 0.01192387379705906 2023-01-24 04:00:38.386298: step: 88/466, loss: 0.013240060769021511 2023-01-24 04:00:38.981026: step: 90/466, loss: 0.01812347024679184 2023-01-24 04:00:39.628063: step: 92/466, loss: 0.01432541012763977 2023-01-24 04:00:40.253627: step: 94/466, loss: 0.005916904658079147 2023-01-24 04:00:40.862063: step: 96/466, loss: 0.19921573996543884 2023-01-24 04:00:41.407230: step: 98/466, loss: 0.0003574567672330886 2023-01-24 04:00:42.057678: step: 100/466, loss: 0.0010081242071464658 2023-01-24 04:00:42.638153: step: 102/466, loss: 0.00035429472336545587 2023-01-24 04:00:43.288210: step: 104/466, loss: 0.004386215470731258 2023-01-24 04:00:43.921308: step: 106/466, loss: 0.08126308768987656 2023-01-24 04:00:44.514339: step: 108/466, loss: 0.30261746048927307 2023-01-24 04:00:45.145481: step: 110/466, loss: 0.0006659993669018149 2023-01-24 04:00:45.752614: step: 112/466, loss: 0.7788109183311462 2023-01-24 04:00:46.357898: step: 114/466, loss: 0.002139084041118622 2023-01-24 04:00:47.060052: step: 116/466, loss: 0.04724963381886482 2023-01-24 04:00:47.703130: step: 118/466, loss: 0.08716720342636108 2023-01-24 04:00:48.368968: step: 120/466, loss: 0.007712248712778091 2023-01-24 04:00:48.966109: step: 122/466, loss: 0.019319266080856323 2023-01-24 04:00:49.545920: step: 124/466, loss: 0.002920225029811263 2023-01-24 04:00:50.198480: step: 126/466, loss: 0.044984228909015656 2023-01-24 04:00:50.772907: step: 128/466, loss: 0.242207333445549 2023-01-24 04:00:51.340031: step: 130/466, loss: 0.1114872545003891 2023-01-24 04:00:52.004515: step: 132/466, loss: 0.015735222026705742 2023-01-24 04:00:52.686769: step: 
134/466, loss: 0.04876323789358139 2023-01-24 04:00:53.348833: step: 136/466, loss: 0.003588942578062415 2023-01-24 04:00:53.985591: step: 138/466, loss: 0.015752285718917847 2023-01-24 04:00:54.625700: step: 140/466, loss: 0.02088429592549801 2023-01-24 04:00:55.265980: step: 142/466, loss: 0.05881446972489357 2023-01-24 04:00:55.849239: step: 144/466, loss: 0.034169264137744904 2023-01-24 04:00:56.464395: step: 146/466, loss: 0.016726719215512276 2023-01-24 04:00:57.089527: step: 148/466, loss: 0.15403063595294952 2023-01-24 04:00:57.714545: step: 150/466, loss: 0.0009215656900778413 2023-01-24 04:00:58.376773: step: 152/466, loss: 0.002376297488808632 2023-01-24 04:00:58.971263: step: 154/466, loss: 0.0004850882978644222 2023-01-24 04:00:59.565406: step: 156/466, loss: 0.0015028626658022404 2023-01-24 04:01:00.166749: step: 158/466, loss: 0.006702759303152561 2023-01-24 04:01:00.804764: step: 160/466, loss: 0.013435509987175465 2023-01-24 04:01:01.358428: step: 162/466, loss: 0.08299136906862259 2023-01-24 04:01:01.946219: step: 164/466, loss: 0.008563477545976639 2023-01-24 04:01:02.601393: step: 166/466, loss: 0.018727490678429604 2023-01-24 04:01:03.317422: step: 168/466, loss: 0.00214596395380795 2023-01-24 04:01:03.953841: step: 170/466, loss: 0.005763393361121416 2023-01-24 04:01:04.571812: step: 172/466, loss: 0.011483149603009224 2023-01-24 04:01:05.184746: step: 174/466, loss: 0.0037359632551670074 2023-01-24 04:01:05.876114: step: 176/466, loss: 0.0036448759492486715 2023-01-24 04:01:06.478082: step: 178/466, loss: 0.000318721286021173 2023-01-24 04:01:07.075555: step: 180/466, loss: 0.01972508803009987 2023-01-24 04:01:07.757742: step: 182/466, loss: 0.03576623275876045 2023-01-24 04:01:08.378398: step: 184/466, loss: 0.0046725524589419365 2023-01-24 04:01:09.001308: step: 186/466, loss: 0.017895927652716637 2023-01-24 04:01:09.644816: step: 188/466, loss: 0.01983766257762909 2023-01-24 04:01:10.370178: step: 190/466, loss: 0.006723640486598015 
2023-01-24 04:01:10.972784: step: 192/466, loss: 0.006670257076621056 2023-01-24 04:01:11.556018: step: 194/466, loss: 0.008300933986902237 2023-01-24 04:01:12.198981: step: 196/466, loss: 0.010337584652006626 2023-01-24 04:01:12.842506: step: 198/466, loss: 0.00903736799955368 2023-01-24 04:01:13.420553: step: 200/466, loss: 0.016851140186190605 2023-01-24 04:01:14.064196: step: 202/466, loss: 0.03792807459831238 2023-01-24 04:01:14.777830: step: 204/466, loss: 0.02895973064005375 2023-01-24 04:01:15.438382: step: 206/466, loss: 0.007502868305891752 2023-01-24 04:01:16.056524: step: 208/466, loss: 0.031056923791766167 2023-01-24 04:01:16.669526: step: 210/466, loss: 0.013935143128037453 2023-01-24 04:01:17.273737: step: 212/466, loss: 0.002155495807528496 2023-01-24 04:01:17.911149: step: 214/466, loss: 0.01674588769674301 2023-01-24 04:01:18.560764: step: 216/466, loss: 0.0019156233174726367 2023-01-24 04:01:19.154088: step: 218/466, loss: 0.08050195872783661 2023-01-24 04:01:19.694842: step: 220/466, loss: 0.0023728234227746725 2023-01-24 04:01:20.277064: step: 222/466, loss: 0.11334630846977234 2023-01-24 04:01:20.902886: step: 224/466, loss: 0.0033581804018467665 2023-01-24 04:01:21.519798: step: 226/466, loss: 0.004865583032369614 2023-01-24 04:01:22.179218: step: 228/466, loss: 0.1188046932220459 2023-01-24 04:01:22.807852: step: 230/466, loss: 0.017414169386029243 2023-01-24 04:01:23.419318: step: 232/466, loss: 0.0009787804447114468 2023-01-24 04:01:23.977470: step: 234/466, loss: 0.0036582364700734615 2023-01-24 04:01:24.559021: step: 236/466, loss: 0.007959416136145592 2023-01-24 04:01:25.147153: step: 238/466, loss: 0.02049945294857025 2023-01-24 04:01:25.740849: step: 240/466, loss: 0.0003932398685719818 2023-01-24 04:01:26.374989: step: 242/466, loss: 0.015090403147041798 2023-01-24 04:01:26.977813: step: 244/466, loss: 0.011627012863755226 2023-01-24 04:01:27.569190: step: 246/466, loss: 0.053877320140600204 2023-01-24 04:01:28.199176: step: 248/466, 
loss: 0.019046692177653313 2023-01-24 04:01:28.789895: step: 250/466, loss: 0.0024457424879074097 2023-01-24 04:01:29.368938: step: 252/466, loss: 0.01016867347061634 2023-01-24 04:01:29.978413: step: 254/466, loss: 0.001433478551916778 2023-01-24 04:01:30.620461: step: 256/466, loss: 0.023510020226240158 2023-01-24 04:01:31.202004: step: 258/466, loss: 0.0041503384709358215 2023-01-24 04:01:31.777280: step: 260/466, loss: 0.0030759386718273163 2023-01-24 04:01:32.348620: step: 262/466, loss: 0.05536990612745285 2023-01-24 04:01:33.175536: step: 264/466, loss: 0.0028465837240219116 2023-01-24 04:01:33.824061: step: 266/466, loss: 0.0016013040440157056 2023-01-24 04:01:34.411646: step: 268/466, loss: 0.032356321811676025 2023-01-24 04:01:34.994791: step: 270/466, loss: 0.029682258144021034 2023-01-24 04:01:35.624403: step: 272/466, loss: 0.0367642343044281 2023-01-24 04:01:36.318287: step: 274/466, loss: 0.06313731521368027 2023-01-24 04:01:36.876376: step: 276/466, loss: 0.0310556311160326 2023-01-24 04:01:37.443802: step: 278/466, loss: 0.0056461249478161335 2023-01-24 04:01:38.072003: step: 280/466, loss: 0.00848975870758295 2023-01-24 04:01:38.679279: step: 282/466, loss: 0.28914815187454224 2023-01-24 04:01:39.301973: step: 284/466, loss: 0.014305856078863144 2023-01-24 04:01:39.898177: step: 286/466, loss: 0.01340281218290329 2023-01-24 04:01:40.497802: step: 288/466, loss: 0.32226353883743286 2023-01-24 04:01:41.192521: step: 290/466, loss: 0.026695629581809044 2023-01-24 04:01:41.806629: step: 292/466, loss: 0.1655883491039276 2023-01-24 04:01:42.372198: step: 294/466, loss: 0.0001594788918737322 2023-01-24 04:01:43.009324: step: 296/466, loss: 0.08155865222215652 2023-01-24 04:01:43.691515: step: 298/466, loss: 0.008540213108062744 2023-01-24 04:01:44.397740: step: 300/466, loss: 0.0696592628955841 2023-01-24 04:01:45.071315: step: 302/466, loss: 0.002189208986237645 2023-01-24 04:01:45.695979: step: 304/466, loss: 0.33625826239585876 2023-01-24 
04:01:46.255283: step: 306/466, loss: 0.009354050271213055 2023-01-24 04:01:46.977308: step: 308/466, loss: 0.022616852074861526 2023-01-24 04:01:47.582851: step: 310/466, loss: 0.0198007021099329 2023-01-24 04:01:48.237222: step: 312/466, loss: 0.012987097725272179 2023-01-24 04:01:48.849325: step: 314/466, loss: 0.01676531694829464 2023-01-24 04:01:49.451751: step: 316/466, loss: 0.0025543151423335075 2023-01-24 04:01:50.072252: step: 318/466, loss: 0.3158953785896301 2023-01-24 04:01:50.843257: step: 320/466, loss: 0.1547013521194458 2023-01-24 04:01:51.476860: step: 322/466, loss: 0.010139084421098232 2023-01-24 04:01:52.071051: step: 324/466, loss: 0.014096586033701897 2023-01-24 04:01:52.718544: step: 326/466, loss: 0.012834510765969753 2023-01-24 04:01:53.303984: step: 328/466, loss: 0.02946978434920311 2023-01-24 04:01:53.908982: step: 330/466, loss: 0.00866196770220995 2023-01-24 04:01:54.511415: step: 332/466, loss: 0.03772924840450287 2023-01-24 04:01:55.176320: step: 334/466, loss: 0.015611520037055016 2023-01-24 04:01:55.841155: step: 336/466, loss: 0.010216974653303623 2023-01-24 04:01:56.482491: step: 338/466, loss: 0.010417732410132885 2023-01-24 04:01:57.076287: step: 340/466, loss: 0.00566525012254715 2023-01-24 04:01:57.731178: step: 342/466, loss: 0.0067802900448441505 2023-01-24 04:01:58.387481: step: 344/466, loss: 0.030475255101919174 2023-01-24 04:01:59.034411: step: 346/466, loss: 0.05227546766400337 2023-01-24 04:01:59.669917: step: 348/466, loss: 0.6098231673240662 2023-01-24 04:02:00.271204: step: 350/466, loss: 0.011939155869185925 2023-01-24 04:02:00.951519: step: 352/466, loss: 0.010138709098100662 2023-01-24 04:02:01.627181: step: 354/466, loss: 0.3563244640827179 2023-01-24 04:02:02.259858: step: 356/466, loss: 0.021631816402077675 2023-01-24 04:02:02.960439: step: 358/466, loss: 0.028252137824892998 2023-01-24 04:02:03.601926: step: 360/466, loss: 0.02579502761363983 2023-01-24 04:02:04.325099: step: 362/466, loss: 
1.4603418111801147 2023-01-24 04:02:04.875665: step: 364/466, loss: 0.005669127218425274 2023-01-24 04:02:05.491422: step: 366/466, loss: 0.026713905856013298 2023-01-24 04:02:06.128714: step: 368/466, loss: 0.002620895393192768 2023-01-24 04:02:06.737879: step: 370/466, loss: 0.02652410790324211 2023-01-24 04:02:07.384012: step: 372/466, loss: 0.04273194819688797 2023-01-24 04:02:08.119468: step: 374/466, loss: 0.023813379928469658 2023-01-24 04:02:08.736410: step: 376/466, loss: 0.02738260105252266 2023-01-24 04:02:09.400009: step: 378/466, loss: 0.02079232968389988 2023-01-24 04:02:09.989252: step: 380/466, loss: 0.0023200856521725655 2023-01-24 04:02:10.626603: step: 382/466, loss: 0.009450546465814114 2023-01-24 04:02:11.193985: step: 384/466, loss: 0.0009971472900360823 2023-01-24 04:02:11.909194: step: 386/466, loss: 0.06446226686239243 2023-01-24 04:02:12.489610: step: 388/466, loss: 0.03794258087873459 2023-01-24 04:02:13.171869: step: 390/466, loss: 0.0006017423584125936 2023-01-24 04:02:13.775981: step: 392/466, loss: 0.00589497247710824 2023-01-24 04:02:14.346342: step: 394/466, loss: 0.00412049749866128 2023-01-24 04:02:15.015854: step: 396/466, loss: 0.010548084042966366 2023-01-24 04:02:15.637022: step: 398/466, loss: 0.1052076667547226 2023-01-24 04:02:16.263243: step: 400/466, loss: 0.005869503598660231 2023-01-24 04:02:16.878935: step: 402/466, loss: 0.004526306409388781 2023-01-24 04:02:17.498368: step: 404/466, loss: 0.06651441007852554 2023-01-24 04:02:18.108469: step: 406/466, loss: 0.00849270448088646 2023-01-24 04:02:18.678848: step: 408/466, loss: 0.0012003247393295169 2023-01-24 04:02:19.337482: step: 410/466, loss: 0.03083435632288456 2023-01-24 04:02:20.001705: step: 412/466, loss: 0.04964280128479004 2023-01-24 04:02:20.625406: step: 414/466, loss: 0.014443082734942436 2023-01-24 04:02:21.248672: step: 416/466, loss: 0.0007269966299645603 2023-01-24 04:02:21.909522: step: 418/466, loss: 0.017137521877884865 2023-01-24 04:02:22.530303: 
step: 420/466, loss: 0.015335258096456528 2023-01-24 04:02:23.175303: step: 422/466, loss: 0.008419616147875786 2023-01-24 04:02:23.793087: step: 424/466, loss: 0.0051277210004627705 2023-01-24 04:02:24.484812: step: 426/466, loss: 0.013941477052867413 2023-01-24 04:02:25.106761: step: 428/466, loss: 0.017742734402418137 2023-01-24 04:02:25.812124: step: 430/466, loss: 0.005647761281579733 2023-01-24 04:02:26.449689: step: 432/466, loss: 0.004662233404815197 2023-01-24 04:02:27.137862: step: 434/466, loss: 1.107483148574829 2023-01-24 04:02:27.780321: step: 436/466, loss: 0.014265783131122589 2023-01-24 04:02:28.453022: step: 438/466, loss: 1.2633209228515625 2023-01-24 04:02:29.076804: step: 440/466, loss: 0.0201712679117918 2023-01-24 04:02:29.624997: step: 442/466, loss: 0.014451676979660988 2023-01-24 04:02:30.255468: step: 444/466, loss: 0.13905183970928192 2023-01-24 04:02:30.931760: step: 446/466, loss: 0.034704096615314484 2023-01-24 04:02:31.501477: step: 448/466, loss: 0.012170841917395592 2023-01-24 04:02:32.126150: step: 450/466, loss: 0.08967079222202301 2023-01-24 04:02:32.783048: step: 452/466, loss: 0.008854460902512074 2023-01-24 04:02:33.389452: step: 454/466, loss: 0.006327802315354347 2023-01-24 04:02:34.036298: step: 456/466, loss: 0.31326547265052795 2023-01-24 04:02:34.663626: step: 458/466, loss: 0.001508050598204136 2023-01-24 04:02:35.267560: step: 460/466, loss: 0.008478164672851562 2023-01-24 04:02:35.842152: step: 462/466, loss: 0.015076616778969765 2023-01-24 04:02:36.463009: step: 464/466, loss: 0.018953507766127586 2023-01-24 04:02:37.074017: step: 466/466, loss: 0.9736811518669128 2023-01-24 04:02:37.726951: step: 468/466, loss: 0.029113473370671272 2023-01-24 04:02:38.388669: step: 470/466, loss: 0.031205737963318825 2023-01-24 04:02:38.974465: step: 472/466, loss: 0.0006522354669868946 2023-01-24 04:02:39.585576: step: 474/466, loss: 0.0013683760771527886 2023-01-24 04:02:40.204450: step: 476/466, loss: 0.0010390589013695717 
2023-01-24 04:02:40.763642: step: 478/466, loss: 0.029676884412765503 2023-01-24 04:02:41.375808: step: 480/466, loss: 0.11172933876514435 2023-01-24 04:02:42.027813: step: 482/466, loss: 0.1738106608390808 2023-01-24 04:02:42.671308: step: 484/466, loss: 0.013176539912819862 2023-01-24 04:02:43.362912: step: 486/466, loss: 0.03815144672989845 2023-01-24 04:02:43.997164: step: 488/466, loss: 0.15745548903942108 2023-01-24 04:02:44.578047: step: 490/466, loss: 0.8841054439544678 2023-01-24 04:02:45.244579: step: 492/466, loss: 0.17422613501548767 2023-01-24 04:02:45.864676: step: 494/466, loss: 0.0311062540858984 2023-01-24 04:02:46.525179: step: 496/466, loss: 0.47466355562210083 2023-01-24 04:02:47.209402: step: 498/466, loss: 0.004679899197071791 2023-01-24 04:02:47.841785: step: 500/466, loss: 0.0400448776781559 2023-01-24 04:02:48.452678: step: 502/466, loss: 0.08364969491958618 2023-01-24 04:02:49.085057: step: 504/466, loss: 0.015947094187140465 2023-01-24 04:02:49.733061: step: 506/466, loss: 0.04322103038430214 2023-01-24 04:02:50.339242: step: 508/466, loss: 0.09971942752599716 2023-01-24 04:02:50.981022: step: 510/466, loss: 0.33148303627967834 2023-01-24 04:02:51.697788: step: 512/466, loss: 0.014690735377371311 2023-01-24 04:02:52.322528: step: 514/466, loss: 0.012732322327792645 2023-01-24 04:02:53.014338: step: 516/466, loss: 0.012350421398878098 2023-01-24 04:02:53.615226: step: 518/466, loss: 0.031112931668758392 2023-01-24 04:02:54.269809: step: 520/466, loss: 0.004019064828753471 2023-01-24 04:02:54.846657: step: 522/466, loss: 0.0008966823806986213 2023-01-24 04:02:55.419856: step: 524/466, loss: 0.03462705761194229 2023-01-24 04:02:56.077528: step: 526/466, loss: 0.0080671152099967 2023-01-24 04:02:56.691053: step: 528/466, loss: 0.005421999841928482 2023-01-24 04:02:57.348078: step: 530/466, loss: 0.01156659610569477 2023-01-24 04:02:58.031671: step: 532/466, loss: 0.002874686848372221 2023-01-24 04:02:58.603040: step: 534/466, loss: 
0.004489653278142214 2023-01-24 04:02:59.190545: step: 536/466, loss: 0.024734828621149063 2023-01-24 04:02:59.779934: step: 538/466, loss: 0.009555038064718246 2023-01-24 04:03:00.398523: step: 540/466, loss: 0.02235630340874195 2023-01-24 04:03:01.104823: step: 542/466, loss: 0.03142331540584564 2023-01-24 04:03:01.738458: step: 544/466, loss: 0.010514368303120136 2023-01-24 04:03:02.432894: step: 546/466, loss: 0.006072200834751129 2023-01-24 04:03:03.040819: step: 548/466, loss: 0.2278628945350647 2023-01-24 04:03:03.625592: step: 550/466, loss: 0.005550784058868885 2023-01-24 04:03:04.251624: step: 552/466, loss: 0.029451590031385422 2023-01-24 04:03:04.926890: step: 554/466, loss: 0.022609373554587364 2023-01-24 04:03:05.541110: step: 556/466, loss: 0.029526591300964355 2023-01-24 04:03:06.129747: step: 558/466, loss: 0.009895593859255314 2023-01-24 04:03:06.799871: step: 560/466, loss: 0.009355615824460983 2023-01-24 04:03:07.428993: step: 562/466, loss: 3.3665573596954346 2023-01-24 04:03:08.046174: step: 564/466, loss: 0.090004563331604 2023-01-24 04:03:08.705653: step: 566/466, loss: 0.0005413692560978234 2023-01-24 04:03:09.341842: step: 568/466, loss: 0.05639300495386124 2023-01-24 04:03:10.005402: step: 570/466, loss: 0.011769573204219341 2023-01-24 04:03:10.644290: step: 572/466, loss: 0.017560189589858055 2023-01-24 04:03:11.290602: step: 574/466, loss: 0.01766313426196575 2023-01-24 04:03:11.855843: step: 576/466, loss: 0.0012013798113912344 2023-01-24 04:03:12.427057: step: 578/466, loss: 0.005746868904680014 2023-01-24 04:03:13.032193: step: 580/466, loss: 0.0014637453714385629 2023-01-24 04:03:13.671414: step: 582/466, loss: 0.07036874443292618 2023-01-24 04:03:14.344892: step: 584/466, loss: 0.03070271760225296 2023-01-24 04:03:14.970360: step: 586/466, loss: 0.005935574881732464 2023-01-24 04:03:15.668733: step: 588/466, loss: 0.04731940105557442 2023-01-24 04:03:16.261936: step: 590/466, loss: 0.1229914054274559 2023-01-24 04:03:16.919886: 
step: 592/466, loss: 0.028384150937199593 2023-01-24 04:03:17.525782: step: 594/466, loss: 0.0725310817360878 2023-01-24 04:03:18.183035: step: 596/466, loss: 0.0028359137941151857 2023-01-24 04:03:18.856396: step: 598/466, loss: 0.03283404931426048 2023-01-24 04:03:19.489818: step: 600/466, loss: 0.007365007419139147 2023-01-24 04:03:20.103584: step: 602/466, loss: 0.0025631687603890896 2023-01-24 04:03:20.751469: step: 604/466, loss: 0.011097467504441738 2023-01-24 04:03:21.391677: step: 606/466, loss: 0.0015583861386403441 2023-01-24 04:03:21.990838: step: 608/466, loss: 0.1234859973192215 2023-01-24 04:03:22.633049: step: 610/466, loss: 0.1035982072353363 2023-01-24 04:03:23.172434: step: 612/466, loss: 0.006027908064424992 2023-01-24 04:03:23.729722: step: 614/466, loss: 0.030623938888311386 2023-01-24 04:03:24.343212: step: 616/466, loss: 0.040002647787332535 2023-01-24 04:03:24.943156: step: 618/466, loss: 0.0009975489228963852 2023-01-24 04:03:25.555304: step: 620/466, loss: 0.023999236524105072 2023-01-24 04:03:26.138614: step: 622/466, loss: 0.576716423034668 2023-01-24 04:03:26.806633: step: 624/466, loss: 0.006064989138394594 2023-01-24 04:03:27.571648: step: 626/466, loss: 0.019019240513443947 2023-01-24 04:03:28.244036: step: 628/466, loss: 0.001003078417852521 2023-01-24 04:03:28.871633: step: 630/466, loss: 0.001638995367102325 2023-01-24 04:03:29.429325: step: 632/466, loss: 0.004462275188416243 2023-01-24 04:03:30.081577: step: 634/466, loss: 0.006321469321846962 2023-01-24 04:03:30.705832: step: 636/466, loss: 0.0014737026067450643 2023-01-24 04:03:31.310361: step: 638/466, loss: 0.005010698921978474 2023-01-24 04:03:31.937047: step: 640/466, loss: 0.06774980574846268 2023-01-24 04:03:32.586805: step: 642/466, loss: 0.001580861397087574 2023-01-24 04:03:33.197381: step: 644/466, loss: 0.00344419595785439 2023-01-24 04:03:33.891991: step: 646/466, loss: 0.013089598156511784 2023-01-24 04:03:34.498765: step: 648/466, loss: 0.0016473623691126704 
2023-01-24 04:03:35.078638: step: 650/466, loss: 0.03453054651618004 2023-01-24 04:03:35.655317: step: 652/466, loss: 0.2897597551345825 2023-01-24 04:03:36.240596: step: 654/466, loss: 0.03459963575005531 2023-01-24 04:03:36.895377: step: 656/466, loss: 0.01754223369061947 2023-01-24 04:03:37.594955: step: 658/466, loss: 0.067201629281044 2023-01-24 04:03:38.169223: step: 660/466, loss: 0.002362264320254326 2023-01-24 04:03:38.733734: step: 662/466, loss: 0.0005576476105488837 2023-01-24 04:03:39.328021: step: 664/466, loss: 0.005788153037428856 2023-01-24 04:03:39.976274: step: 666/466, loss: 0.014405912719666958 2023-01-24 04:03:40.619048: step: 668/466, loss: 0.03309021145105362 2023-01-24 04:03:41.203607: step: 670/466, loss: 0.06431547552347183 2023-01-24 04:03:41.838294: step: 672/466, loss: 0.013910328038036823 2023-01-24 04:03:42.459883: step: 674/466, loss: 0.22224929928779602 2023-01-24 04:03:43.086498: step: 676/466, loss: 0.034740518778562546 2023-01-24 04:03:43.724261: step: 678/466, loss: 0.013087316416203976 2023-01-24 04:03:44.307430: step: 680/466, loss: 0.012118241749703884 2023-01-24 04:03:44.985525: step: 682/466, loss: 0.03825824707746506 2023-01-24 04:03:45.605626: step: 684/466, loss: 0.010267302393913269 2023-01-24 04:03:46.193586: step: 686/466, loss: 0.008821537718176842 2023-01-24 04:03:46.810252: step: 688/466, loss: 0.01823040284216404 2023-01-24 04:03:47.433948: step: 690/466, loss: 0.033738717436790466 2023-01-24 04:03:48.045433: step: 692/466, loss: 0.018728850409388542 2023-01-24 04:03:48.685082: step: 694/466, loss: 0.007886269129812717 2023-01-24 04:03:49.304983: step: 696/466, loss: 0.4238188564777374 2023-01-24 04:03:49.964138: step: 698/466, loss: 0.015290306881070137 2023-01-24 04:03:50.613216: step: 700/466, loss: 0.1405331939458847 2023-01-24 04:03:51.200274: step: 702/466, loss: 0.0031958790495991707 2023-01-24 04:03:51.946602: step: 704/466, loss: 0.009226196445524693 2023-01-24 04:03:52.553388: step: 706/466, loss: 
0.012337690219283104 2023-01-24 04:03:53.222585: step: 708/466, loss: 0.025385482236742973 2023-01-24 04:03:53.848325: step: 710/466, loss: 0.0626063346862793 2023-01-24 04:03:54.490809: step: 712/466, loss: 0.0034278137609362602 2023-01-24 04:03:55.086576: step: 714/466, loss: 0.06579583138227463 2023-01-24 04:03:55.723268: step: 716/466, loss: 0.1320769339799881 2023-01-24 04:03:56.341118: step: 718/466, loss: 0.007669322192668915 2023-01-24 04:03:56.956696: step: 720/466, loss: 0.009058769792318344 2023-01-24 04:03:57.617205: step: 722/466, loss: 0.008587099611759186 2023-01-24 04:03:58.332181: step: 724/466, loss: 0.011049827560782433 2023-01-24 04:03:59.078770: step: 726/466, loss: 0.04374136030673981 2023-01-24 04:03:59.703103: step: 728/466, loss: 0.0014486410655081272 2023-01-24 04:04:00.344536: step: 730/466, loss: 0.15509715676307678 2023-01-24 04:04:00.941704: step: 732/466, loss: 0.0023542954586446285 2023-01-24 04:04:01.557283: step: 734/466, loss: 0.020629791542887688 2023-01-24 04:04:02.172004: step: 736/466, loss: 0.009806080721318722 2023-01-24 04:04:02.789389: step: 738/466, loss: 0.012316007167100906 2023-01-24 04:04:03.464981: step: 740/466, loss: 0.005239745602011681 2023-01-24 04:04:04.042732: step: 742/466, loss: 0.0018927620258182287 2023-01-24 04:04:04.655286: step: 744/466, loss: 0.041778482496738434 2023-01-24 04:04:05.209484: step: 746/466, loss: 0.002031937940046191 2023-01-24 04:04:05.818296: step: 748/466, loss: 0.031134361401200294 2023-01-24 04:04:06.397926: step: 750/466, loss: 0.01668274775147438 2023-01-24 04:04:07.005032: step: 752/466, loss: 0.008619182743132114 2023-01-24 04:04:07.633268: step: 754/466, loss: 0.050235576927661896 2023-01-24 04:04:08.232352: step: 756/466, loss: 0.006068137940019369 2023-01-24 04:04:08.904923: step: 758/466, loss: 0.018372712656855583 2023-01-24 04:04:09.539892: step: 760/466, loss: 0.010020049288868904 2023-01-24 04:04:10.154789: step: 762/466, loss: 0.005020998418331146 2023-01-24 
04:04:10.762640: step: 764/466, loss: 9.63665297604166e-06 2023-01-24 04:04:11.405834: step: 766/466, loss: 0.20721535384655 2023-01-24 04:04:12.074045: step: 768/466, loss: 0.02705824188888073 2023-01-24 04:04:12.756131: step: 770/466, loss: 0.020132362842559814 2023-01-24 04:04:13.347940: step: 772/466, loss: 0.00651069451123476 2023-01-24 04:04:14.002678: step: 774/466, loss: 0.00996352918446064 2023-01-24 04:04:14.623283: step: 776/466, loss: 0.05295846238732338 2023-01-24 04:04:15.244236: step: 778/466, loss: 0.028427729383111 2023-01-24 04:04:15.885457: step: 780/466, loss: 0.046110689640045166 2023-01-24 04:04:16.537755: step: 782/466, loss: 0.052261386066675186 2023-01-24 04:04:17.148744: step: 784/466, loss: 0.04086603596806526 2023-01-24 04:04:17.805149: step: 786/466, loss: 0.002186446450650692 2023-01-24 04:04:18.488381: step: 788/466, loss: 0.004350494593381882 2023-01-24 04:04:19.168175: step: 790/466, loss: 0.010115685872733593 2023-01-24 04:04:19.758794: step: 792/466, loss: 0.0673961415886879 2023-01-24 04:04:20.346397: step: 794/466, loss: 0.002036201534792781 2023-01-24 04:04:20.940695: step: 796/466, loss: 0.002516336739063263 2023-01-24 04:04:21.503042: step: 798/466, loss: 0.00342946476303041 2023-01-24 04:04:22.159631: step: 800/466, loss: 0.0020202852319926023 2023-01-24 04:04:22.714863: step: 802/466, loss: 0.017617767676711082 2023-01-24 04:04:23.319482: step: 804/466, loss: 3.9753360748291016 2023-01-24 04:04:23.939301: step: 806/466, loss: 0.011270052753388882 2023-01-24 04:04:24.541846: step: 808/466, loss: 0.009572568349540234 2023-01-24 04:04:25.267224: step: 810/466, loss: 0.005419893190264702 2023-01-24 04:04:25.920226: step: 812/466, loss: 0.03984509035944939 2023-01-24 04:04:26.468530: step: 814/466, loss: 0.05492782965302467 2023-01-24 04:04:27.105802: step: 816/466, loss: 0.010290661826729774 2023-01-24 04:04:27.740210: step: 818/466, loss: 0.0068497927859425545 2023-01-24 04:04:28.383020: step: 820/466, loss: 
0.0007102004019543529 2023-01-24 04:04:28.994940: step: 822/466, loss: 0.012335572391748428 2023-01-24 04:04:29.641911: step: 824/466, loss: 0.026347529143095016 2023-01-24 04:04:30.244949: step: 826/466, loss: 0.002580393571406603 2023-01-24 04:04:30.902157: step: 828/466, loss: 0.03455239161849022 2023-01-24 04:04:31.527950: step: 830/466, loss: 0.05823826044797897 2023-01-24 04:04:32.140377: step: 832/466, loss: 0.021898532286286354 2023-01-24 04:04:32.755066: step: 834/466, loss: 0.020318279042840004 2023-01-24 04:04:33.355009: step: 836/466, loss: 0.005964603740721941 2023-01-24 04:04:33.928915: step: 838/466, loss: 0.0307414922863245 2023-01-24 04:04:34.586876: step: 840/466, loss: 0.00021112307149451226 2023-01-24 04:04:35.204974: step: 842/466, loss: 0.05015534535050392 2023-01-24 04:04:35.810898: step: 844/466, loss: 0.006480558775365353 2023-01-24 04:04:36.470491: step: 846/466, loss: 0.010075699537992477 2023-01-24 04:04:37.047687: step: 848/466, loss: 0.14495618641376495 2023-01-24 04:04:37.724422: step: 850/466, loss: 0.04412548616528511 2023-01-24 04:04:38.335624: step: 852/466, loss: 0.00012186765525257215 2023-01-24 04:04:38.925675: step: 854/466, loss: 0.013047690503299236 2023-01-24 04:04:39.534816: step: 856/466, loss: 0.004875212907791138 2023-01-24 04:04:40.182871: step: 858/466, loss: 0.04846766218543053 2023-01-24 04:04:40.821852: step: 860/466, loss: 0.004972430877387524 2023-01-24 04:04:41.400951: step: 862/466, loss: 0.014919591136276722 2023-01-24 04:04:42.097101: step: 864/466, loss: 0.04054275527596474 2023-01-24 04:04:42.728421: step: 866/466, loss: 0.004106948152184486 2023-01-24 04:04:43.361458: step: 868/466, loss: 0.012273982167243958 2023-01-24 04:04:44.000344: step: 870/466, loss: 0.020487243309617043 2023-01-24 04:04:44.642906: step: 872/466, loss: 0.018841352313756943 2023-01-24 04:04:45.235704: step: 874/466, loss: 0.01061300653964281 2023-01-24 04:04:45.846454: step: 876/466, loss: 0.043411578983068466 2023-01-24 
04:04:46.477042: step: 878/466, loss: 0.029093991965055466 2023-01-24 04:04:47.134153: step: 880/466, loss: 0.058155424892902374 2023-01-24 04:04:47.744247: step: 882/466, loss: 0.034859657287597656 2023-01-24 04:04:48.409248: step: 884/466, loss: 0.13316437602043152 2023-01-24 04:04:48.991427: step: 886/466, loss: 0.10845877230167389 2023-01-24 04:04:49.599171: step: 888/466, loss: 0.005917152855545282 2023-01-24 04:04:50.177484: step: 890/466, loss: 0.054976899176836014 2023-01-24 04:04:50.780971: step: 892/466, loss: 0.08583804219961166 2023-01-24 04:04:51.418339: step: 894/466, loss: 0.0240468867123127 2023-01-24 04:04:52.064988: step: 896/466, loss: 0.02332800254225731 2023-01-24 04:04:52.657880: step: 898/466, loss: 0.019618580117821693 2023-01-24 04:04:53.291188: step: 900/466, loss: 0.16611427068710327 2023-01-24 04:04:53.859935: step: 902/466, loss: 0.018733682110905647 2023-01-24 04:04:54.446716: step: 904/466, loss: 0.011458895169198513 2023-01-24 04:04:55.108832: step: 906/466, loss: 0.014567876234650612 2023-01-24 04:04:55.757271: step: 908/466, loss: 0.00459162425249815 2023-01-24 04:04:56.442380: step: 910/466, loss: 0.1225447803735733 2023-01-24 04:04:57.026466: step: 912/466, loss: 0.007581851910799742 2023-01-24 04:04:57.639111: step: 914/466, loss: 0.03084220364689827 2023-01-24 04:04:58.311716: step: 916/466, loss: 0.004150428809225559 2023-01-24 04:04:58.928571: step: 918/466, loss: 0.07581573724746704 2023-01-24 04:04:59.609366: step: 920/466, loss: 0.033521562814712524 2023-01-24 04:05:00.247460: step: 922/466, loss: 0.005024290177971125 2023-01-24 04:05:00.881146: step: 924/466, loss: 0.014391470700502396 2023-01-24 04:05:01.465449: step: 926/466, loss: 0.04559220001101494 2023-01-24 04:05:02.129975: step: 928/466, loss: 0.011902103200554848 2023-01-24 04:05:02.733531: step: 930/466, loss: 0.03757358342409134 2023-01-24 04:05:03.326508: step: 932/466, loss: 0.033605653792619705 ================================================== Loss: 0.076 
-------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3375974233699761, 'r': 0.3337538094416652, 'f1': 0.3356646136941938}, 'combined': 0.2473318206167744, 'epoch': 29} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.363637010728126, 'r': 0.2955531182967167, 'f1': 0.32607907101941425}, 'combined': 0.21625969476935242, 'epoch': 29} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3247466216216216, 'r': 0.27308238636363635, 'f1': 0.29668209876543206}, 'combined': 0.19778806584362135, 'epoch': 29} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.37055964194737895, 'r': 0.27185022008380993, 'f1': 0.31362133793855745}, 'combined': 0.20467918897042695, 'epoch': 29} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3282728414084346, 'r': 0.33076447587832786, 'f1': 0.32951394855931715}, 'combined': 0.24279975157002315, 'epoch': 29} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3469329427801109, 'r': 0.28049257852152815, 'f1': 0.31019495506836936}, 'combined': 0.20572515154793405, 'epoch': 29} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.271780303030303, 'r': 0.3416666666666666, 'f1': 0.30274261603375524}, 'combined': 0.20182841068917015, 'epoch': 29} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.32608695652173914, 'f1': 0.38461538461538464}, 'combined': 0.2564102564102564, 'epoch': 29} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5277777777777778, 'r': 0.16379310344827586, 'f1': 0.24999999999999997}, 'combined': 0.16666666666666663, 'epoch': 29} 
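Note on reading the evaluation dicts above: the logged values appear consistent with `f1 = 2*p*r / (p + r)` for both the `template` and `slot` entries, and with `combined = template f1 * slot f1`. This is inferred from the numbers only; the code of `train.py` is not shown in this log, so treat it as an assumption. The helper names below (`f1_from_pr`, `is_consistent`) are hypothetical and exist only for this sanity check, which uses the Dev Chinese record from epoch 29.

```python
import math

# Dev Chinese record copied verbatim from the epoch-29 evaluation output.
dev_chinese = {
    'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579},
    'slot': {'p': 0.3375974233699761, 'r': 0.3337538094416652, 'f1': 0.3356646136941938},
    'combined': 0.2473318206167744,
    'epoch': 29,
}


def f1_from_pr(p, r):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def is_consistent(record, rel_tol=1e-6):
    """Check the assumed relations between p, r, f1, and combined.

    Assumption (not confirmed by the log): combined = template f1 * slot f1.
    """
    for part in ('template', 'slot'):
        scores = record[part]
        if not math.isclose(f1_from_pr(scores['p'], scores['r']),
                            scores['f1'], rel_tol=rel_tol):
            return False
    product = record['template']['f1'] * record['slot']['f1']
    return math.isclose(product, record['combined'], rel_tol=rel_tol)


if __name__ == '__main__':
    print('consistent:', is_consistent(dev_chinese))
```

The same check can be applied to any other evaluation dict in this log by pasting the record in place of `dev_chinese`.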
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 30 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 04:07:35.996560: step: 2/466, loss: 0.00028478680178523064 2023-01-24 04:07:36.608612: step: 4/466, loss: 0.014078116975724697 2023-01-24 04:07:37.265985: step: 6/466, loss: 0.012460625730454922 2023-01-24 04:07:37.888733: step: 8/466, loss: 0.05085223913192749 2023-01-24 04:07:38.426393: step: 10/466, loss: 0.009307858534157276 2023-01-24 04:07:39.120835: step: 12/466, loss: 0.004645699169486761 2023-01-24 04:07:39.727401: step: 14/466, loss: 0.00043225885019637644 2023-01-24 04:07:40.389201: step: 16/466, loss: 0.009154175408184528 2023-01-24 04:07:41.050920: step: 18/466, loss: 0.004041125066578388 2023-01-24 04:07:41.668149: step: 20/466, loss: 0.14259779453277588 2023-01-24 04:07:42.248699: step: 22/466, loss: 0.004271865822374821 2023-01-24 04:07:42.862435: step: 24/466, loss: 0.018932662904262543 2023-01-24 04:07:43.544489: step: 26/466, loss: 0.008465741761028767 2023-01-24 04:07:44.164700: step: 28/466, loss: 0.015602695755660534 2023-01-24 04:07:44.799563: step: 30/466, loss: 0.758287787437439 2023-01-24 04:07:45.391843: step: 32/466, loss: 0.016388943418860435 2023-01-24 04:07:46.018013: step: 34/466, loss: 0.07368203997612 2023-01-24 04:07:46.593122: step: 36/466, loss: 0.0026354417204856873 2023-01-24 04:07:47.212867: step: 38/466, loss: 0.004369013477116823 2023-01-24 04:07:47.869754: step: 40/466, loss: 0.0012482845922932029 2023-01-24 04:07:48.458494: step: 42/466, loss: 0.0878681018948555 2023-01-24 04:07:49.173395: step: 44/466, loss: 0.08650446683168411 2023-01-24 04:07:49.714584: step: 46/466, loss: 0.0037090929690748453 2023-01-24 04:07:50.389632: step: 
48/466, loss: 0.04962155595421791 2023-01-24 04:07:51.030561: step: 50/466, loss: 0.008082413114607334 2023-01-24 04:07:51.677643: step: 52/466, loss: 0.02043752558529377 2023-01-24 04:07:52.294600: step: 54/466, loss: 0.021009817719459534 2023-01-24 04:07:52.870900: step: 56/466, loss: 0.0005981952417641878 2023-01-24 04:07:53.463780: step: 58/466, loss: 0.012455707415938377 2023-01-24 04:07:54.118459: step: 60/466, loss: 0.010042482987046242 2023-01-24 04:07:54.789084: step: 62/466, loss: 0.0007668372127227485 2023-01-24 04:07:55.379562: step: 64/466, loss: 0.06405071169137955 2023-01-24 04:07:55.894339: step: 66/466, loss: 0.026616506278514862 2023-01-24 04:07:56.603994: step: 68/466, loss: 0.11562003195285797 2023-01-24 04:07:57.255841: step: 70/466, loss: 0.0031265579164028168 2023-01-24 04:07:57.902254: step: 72/466, loss: 0.07842712104320526 2023-01-24 04:07:58.601086: step: 74/466, loss: 0.02148415334522724 2023-01-24 04:07:59.152039: step: 76/466, loss: 0.01080042589455843 2023-01-24 04:07:59.805627: step: 78/466, loss: 0.018368011340498924 2023-01-24 04:08:00.430949: step: 80/466, loss: 0.00820663571357727 2023-01-24 04:08:01.043699: step: 82/466, loss: 0.003451654454693198 2023-01-24 04:08:01.640056: step: 84/466, loss: 0.011481214314699173 2023-01-24 04:08:02.271776: step: 86/466, loss: 0.00872575119137764 2023-01-24 04:08:02.925940: step: 88/466, loss: 0.003943064250051975 2023-01-24 04:08:03.587819: step: 90/466, loss: 0.001124784117564559 2023-01-24 04:08:04.201973: step: 92/466, loss: 0.011985089629888535 2023-01-24 04:08:04.912394: step: 94/466, loss: 0.005329095292836428 2023-01-24 04:08:05.540379: step: 96/466, loss: 0.36632513999938965 2023-01-24 04:08:06.138627: step: 98/466, loss: 0.03597888723015785 2023-01-24 04:08:06.722956: step: 100/466, loss: 0.014806758612394333 2023-01-24 04:08:07.398246: step: 102/466, loss: 0.0015192838618531823 2023-01-24 04:08:08.140103: step: 104/466, loss: 0.0349656417965889 2023-01-24 04:08:08.753028: step: 
106/466, loss: 0.05895746126770973 2023-01-24 04:08:09.371860: step: 108/466, loss: 0.004985973238945007 2023-01-24 04:08:10.003140: step: 110/466, loss: 0.019262507557868958 2023-01-24 04:08:10.617076: step: 112/466, loss: 0.004171041306108236 2023-01-24 04:08:11.252177: step: 114/466, loss: 0.010270554572343826 2023-01-24 04:08:11.882944: step: 116/466, loss: 0.008071281015872955 2023-01-24 04:08:12.520194: step: 118/466, loss: 0.012917655520141125 2023-01-24 04:08:13.162984: step: 120/466, loss: 0.01800263673067093 2023-01-24 04:08:13.820570: step: 122/466, loss: 0.004215335939079523 2023-01-24 04:08:14.415097: step: 124/466, loss: 0.017569512128829956 2023-01-24 04:08:15.017201: step: 126/466, loss: 0.01752289943397045 2023-01-24 04:08:15.619682: step: 128/466, loss: 0.010893118567764759 2023-01-24 04:08:16.185316: step: 130/466, loss: 0.0005880341632291675 2023-01-24 04:08:16.772896: step: 132/466, loss: 0.003655801061540842 2023-01-24 04:08:17.409857: step: 134/466, loss: 0.03518220782279968 2023-01-24 04:08:18.023734: step: 136/466, loss: 0.05233493447303772 2023-01-24 04:08:18.720745: step: 138/466, loss: 0.021850325167179108 2023-01-24 04:08:19.422605: step: 140/466, loss: 0.027908580377697945 2023-01-24 04:08:20.014928: step: 142/466, loss: 0.006345653906464577 2023-01-24 04:08:20.684284: step: 144/466, loss: 0.0006852993974462152 2023-01-24 04:08:21.316303: step: 146/466, loss: 0.007213433738797903 2023-01-24 04:08:21.908855: step: 148/466, loss: 0.0066226390190422535 2023-01-24 04:08:22.525929: step: 150/466, loss: 0.015624548308551311 2023-01-24 04:08:23.192618: step: 152/466, loss: 0.02410743571817875 2023-01-24 04:08:23.777463: step: 154/466, loss: 0.0009644230012781918 2023-01-24 04:08:24.450549: step: 156/466, loss: 0.010475953109562397 2023-01-24 04:08:25.116769: step: 158/466, loss: 0.002348509384319186 2023-01-24 04:08:25.753918: step: 160/466, loss: 0.008804123848676682 2023-01-24 04:08:26.357989: step: 162/466, loss: 0.01287774182856083 
2023-01-24 04:08:27.016821: step: 164/466, loss: 0.16314955055713654 2023-01-24 04:08:27.614573: step: 166/466, loss: 0.005296224262565374 2023-01-24 04:08:28.205188: step: 168/466, loss: 0.012881525792181492 2023-01-24 04:08:28.781691: step: 170/466, loss: 0.004743877798318863 2023-01-24 04:08:29.355562: step: 172/466, loss: 0.017502468079328537 2023-01-24 04:08:30.019705: step: 174/466, loss: 0.01408388838171959 2023-01-24 04:08:30.572327: step: 176/466, loss: 0.0014216086128726602 2023-01-24 04:08:31.211482: step: 178/466, loss: 0.008555619977414608 2023-01-24 04:08:31.819866: step: 180/466, loss: 0.003428714582696557 2023-01-24 04:08:32.483068: step: 182/466, loss: 0.005940432660281658 2023-01-24 04:08:33.141997: step: 184/466, loss: 0.011507420800626278 2023-01-24 04:08:33.825780: step: 186/466, loss: 0.05055483058094978 2023-01-24 04:08:34.485567: step: 188/466, loss: 0.3473048806190491 2023-01-24 04:08:35.154561: step: 190/466, loss: 0.006810220889747143 2023-01-24 04:08:35.804317: step: 192/466, loss: 0.046747103333473206 2023-01-24 04:08:36.372947: step: 194/466, loss: 0.0039004923310130835 2023-01-24 04:08:36.944497: step: 196/466, loss: 0.0035725836642086506 2023-01-24 04:08:37.527651: step: 198/466, loss: 0.14844007790088654 2023-01-24 04:08:38.152470: step: 200/466, loss: 0.033517882227897644 2023-01-24 04:08:38.753049: step: 202/466, loss: 0.01868906244635582 2023-01-24 04:08:39.351836: step: 204/466, loss: 0.025287127122282982 2023-01-24 04:08:39.991010: step: 206/466, loss: 0.010811547748744488 2023-01-24 04:08:40.601216: step: 208/466, loss: 0.07126505672931671 2023-01-24 04:08:41.297401: step: 210/466, loss: 0.0052129654213786125 2023-01-24 04:08:41.931888: step: 212/466, loss: 0.040356654673814774 2023-01-24 04:08:42.575951: step: 214/466, loss: 0.03850207477807999 2023-01-24 04:08:43.181073: step: 216/466, loss: 0.003108978969976306 2023-01-24 04:08:43.842941: step: 218/466, loss: 0.022466685622930527 2023-01-24 04:08:44.512127: step: 220/466, 
loss: 0.0003463841858319938 2023-01-24 04:08:45.179529: step: 222/466, loss: 0.016880718991160393 2023-01-24 04:08:45.783006: step: 224/466, loss: 0.015215063467621803 2023-01-24 04:08:46.391046: step: 226/466, loss: 0.004265309311449528 2023-01-24 04:08:46.972141: step: 228/466, loss: 0.0458204559981823 2023-01-24 04:08:47.582697: step: 230/466, loss: 0.018718402832746506 2023-01-24 04:08:48.253703: step: 232/466, loss: 0.00042664248030632734 2023-01-24 04:08:48.885786: step: 234/466, loss: 0.017034878954291344 2023-01-24 04:08:49.515411: step: 236/466, loss: 0.3019435703754425 2023-01-24 04:08:50.140937: step: 238/466, loss: 0.2951653301715851 2023-01-24 04:08:50.782471: step: 240/466, loss: 0.00019873416749760509 2023-01-24 04:08:51.390386: step: 242/466, loss: 0.023621132597327232 2023-01-24 04:08:52.024038: step: 244/466, loss: 0.07778826355934143 2023-01-24 04:08:52.648869: step: 246/466, loss: 0.1465650498867035 2023-01-24 04:08:53.233866: step: 248/466, loss: 0.013963623903691769 2023-01-24 04:08:53.827984: step: 250/466, loss: 0.16531668603420258 2023-01-24 04:08:54.494234: step: 252/466, loss: 0.02791769616305828 2023-01-24 04:08:55.091096: step: 254/466, loss: 0.027266457676887512 2023-01-24 04:08:55.716828: step: 256/466, loss: 0.004218158777803183 2023-01-24 04:08:56.353443: step: 258/466, loss: 0.011412985622882843 2023-01-24 04:08:56.971806: step: 260/466, loss: 0.00474912254139781 2023-01-24 04:08:57.572895: step: 262/466, loss: 0.004200744442641735 2023-01-24 04:08:58.173229: step: 264/466, loss: 0.007517603226006031 2023-01-24 04:08:58.727816: step: 266/466, loss: 0.017428897321224213 2023-01-24 04:08:59.387397: step: 268/466, loss: 0.02294786460697651 2023-01-24 04:08:59.995524: step: 270/466, loss: 0.12526442110538483 2023-01-24 04:09:00.630155: step: 272/466, loss: 0.08035369962453842 2023-01-24 04:09:01.199084: step: 274/466, loss: 0.006867168005555868 2023-01-24 04:09:01.842478: step: 276/466, loss: 0.026198022067546844 2023-01-24 
04:09:02.467059: step: 278/466, loss: 0.00045783095993101597 2023-01-24 04:09:03.071887: step: 280/466, loss: 0.00981088075786829 2023-01-24 04:09:03.746770: step: 282/466, loss: 0.021191062405705452 2023-01-24 04:09:04.254744: step: 284/466, loss: 0.0037732140626758337 2023-01-24 04:09:04.879897: step: 286/466, loss: 0.0004485807439778 2023-01-24 04:09:05.493581: step: 288/466, loss: 0.008615471422672272 2023-01-24 04:09:06.114563: step: 290/466, loss: 0.6578670740127563 2023-01-24 04:09:06.757289: step: 292/466, loss: 0.0301654115319252 2023-01-24 04:09:07.429920: step: 294/466, loss: 0.01989324577152729 2023-01-24 04:09:08.056272: step: 296/466, loss: 0.0056756469421088696 2023-01-24 04:09:08.689145: step: 298/466, loss: 0.030090264976024628 2023-01-24 04:09:09.324257: step: 300/466, loss: 0.04090450331568718 2023-01-24 04:09:09.947292: step: 302/466, loss: 0.012088625691831112 2023-01-24 04:09:10.570739: step: 304/466, loss: 0.7122946977615356 2023-01-24 04:09:11.221655: step: 306/466, loss: 0.01763085089623928 2023-01-24 04:09:11.854604: step: 308/466, loss: 0.016639074310660362 2023-01-24 04:09:12.610579: step: 310/466, loss: 0.05521659180521965 2023-01-24 04:09:13.223368: step: 312/466, loss: 0.025561662390828133 2023-01-24 04:09:13.811136: step: 314/466, loss: 0.056694965809583664 2023-01-24 04:09:14.420891: step: 316/466, loss: 0.0177704319357872 2023-01-24 04:09:15.070113: step: 318/466, loss: 0.1333787739276886 2023-01-24 04:09:15.704548: step: 320/466, loss: 0.023962095379829407 2023-01-24 04:09:16.310820: step: 322/466, loss: 0.0034161526709795 2023-01-24 04:09:16.914110: step: 324/466, loss: 0.00201751128770411 2023-01-24 04:09:17.525928: step: 326/466, loss: 0.019854096695780754 2023-01-24 04:09:18.092470: step: 328/466, loss: 0.0136026656255126 2023-01-24 04:09:18.704448: step: 330/466, loss: 0.06052771210670471 2023-01-24 04:09:19.316752: step: 332/466, loss: 0.1464303433895111 2023-01-24 04:09:19.951453: step: 334/466, loss: 0.03735050559043884 
2023-01-24 04:09:20.557605: step: 336/466, loss: 0.004379080608487129 2023-01-24 04:09:21.186322: step: 338/466, loss: 0.0054306890815496445 2023-01-24 04:09:21.779391: step: 340/466, loss: 0.00680329417809844 2023-01-24 04:09:22.353885: step: 342/466, loss: 0.005904025863856077 2023-01-24 04:09:22.985649: step: 344/466, loss: 0.01370063703507185 2023-01-24 04:09:23.594289: step: 346/466, loss: 0.027154013514518738 2023-01-24 04:09:24.207263: step: 348/466, loss: 0.0003415009123273194 2023-01-24 04:09:24.775259: step: 350/466, loss: 0.010796810500323772 2023-01-24 04:09:25.430554: step: 352/466, loss: 0.007304156664758921 2023-01-24 04:09:26.086965: step: 354/466, loss: 0.009061838500201702 2023-01-24 04:09:26.709698: step: 356/466, loss: 0.034129682928323746 2023-01-24 04:09:27.297595: step: 358/466, loss: 0.0017467025900259614 2023-01-24 04:09:27.824766: step: 360/466, loss: 0.0001253996742889285 2023-01-24 04:09:28.440570: step: 362/466, loss: 0.008916638791561127 2023-01-24 04:09:29.014024: step: 364/466, loss: 0.023510871455073357 2023-01-24 04:09:29.602177: step: 366/466, loss: 0.009778701700270176 2023-01-24 04:09:30.181755: step: 368/466, loss: 0.00026640386204235256 2023-01-24 04:09:30.764492: step: 370/466, loss: 0.00020242726895958185 2023-01-24 04:09:31.486051: step: 372/466, loss: 0.005702751688659191 2023-01-24 04:09:32.142004: step: 374/466, loss: 0.088680200278759 2023-01-24 04:09:32.708458: step: 376/466, loss: 0.13011717796325684 2023-01-24 04:09:33.350203: step: 378/466, loss: 0.033206891268491745 2023-01-24 04:09:33.965751: step: 380/466, loss: 0.008764302358031273 2023-01-24 04:09:34.701580: step: 382/466, loss: 0.0035456165205687284 2023-01-24 04:09:35.332010: step: 384/466, loss: 0.035435259342193604 2023-01-24 04:09:35.975932: step: 386/466, loss: 0.0048704869113862514 2023-01-24 04:09:36.627010: step: 388/466, loss: 0.03243507817387581 2023-01-24 04:09:37.217969: step: 390/466, loss: 0.009492951445281506 2023-01-24 04:09:37.850439: step: 
392/466, loss: 0.04516708850860596 2023-01-24 04:09:38.509738: step: 394/466, loss: 0.0015181071357801557 2023-01-24 04:09:39.160251: step: 396/466, loss: 0.005739421583712101 2023-01-24 04:09:39.834336: step: 398/466, loss: 0.023896964266896248 2023-01-24 04:09:40.548423: step: 400/466, loss: 0.0031137680634856224 2023-01-24 04:09:41.142078: step: 402/466, loss: 0.0023499038070440292 2023-01-24 04:09:41.753191: step: 404/466, loss: 0.022085856646299362 2023-01-24 04:09:42.413576: step: 406/466, loss: 0.002061337698251009 2023-01-24 04:09:43.058637: step: 408/466, loss: 0.016577694565057755 2023-01-24 04:09:43.636935: step: 410/466, loss: 0.05804067105054855 2023-01-24 04:09:44.313165: step: 412/466, loss: 0.002286148490384221 2023-01-24 04:09:44.860066: step: 414/466, loss: 0.02970394864678383 2023-01-24 04:09:45.464797: step: 416/466, loss: 0.0006221303483471274 2023-01-24 04:09:46.151605: step: 418/466, loss: 0.07081273198127747 2023-01-24 04:09:46.723733: step: 420/466, loss: 0.018962359055876732 2023-01-24 04:09:47.359060: step: 422/466, loss: 0.007997725158929825 2023-01-24 04:09:47.982603: step: 424/466, loss: 0.021207401528954506 2023-01-24 04:09:48.562421: step: 426/466, loss: 0.004502940457314253 2023-01-24 04:09:49.240892: step: 428/466, loss: 0.0023897087667137384 2023-01-24 04:09:49.885307: step: 430/466, loss: 0.027787597849965096 2023-01-24 04:09:50.540033: step: 432/466, loss: 0.012907175347208977 2023-01-24 04:09:51.164312: step: 434/466, loss: 0.005142767447978258 2023-01-24 04:09:51.759213: step: 436/466, loss: 0.010803007520735264 2023-01-24 04:09:52.351662: step: 438/466, loss: 0.0029781744815409184 2023-01-24 04:09:52.968744: step: 440/466, loss: 0.002498141722753644 2023-01-24 04:09:53.539185: step: 442/466, loss: 0.006571412086486816 2023-01-24 04:09:54.204798: step: 444/466, loss: 0.010829522274434566 2023-01-24 04:09:54.756934: step: 446/466, loss: 0.0059796967543661594 2023-01-24 04:09:55.393887: step: 448/466, loss: 0.0032199707347899675 
2023-01-24 04:09:56.186445: step: 450/466, loss: 0.10496566444635391 2023-01-24 04:09:56.810281: step: 452/466, loss: 0.022163324058055878 2023-01-24 04:09:57.428834: step: 454/466, loss: 0.012960226275026798 2023-01-24 04:09:58.003521: step: 456/466, loss: 0.02838488295674324 2023-01-24 04:09:58.598743: step: 458/466, loss: 0.0011124319862574339 2023-01-24 04:09:59.206923: step: 460/466, loss: 0.0037779128178954124 2023-01-24 04:09:59.843523: step: 462/466, loss: 0.005112520884722471 2023-01-24 04:10:00.451678: step: 464/466, loss: 0.018566833809018135 2023-01-24 04:10:01.036137: step: 466/466, loss: 0.0032261114101856947 2023-01-24 04:10:01.774958: step: 468/466, loss: 0.027344532310962677 2023-01-24 04:10:02.443354: step: 470/466, loss: 0.038745488971471786 2023-01-24 04:10:03.053411: step: 472/466, loss: 0.001235641655512154 2023-01-24 04:10:03.618612: step: 474/466, loss: 0.0025396200362592936 2023-01-24 04:10:04.176178: step: 476/466, loss: 0.0005299833719618618 2023-01-24 04:10:04.787591: step: 478/466, loss: 0.013481545262038708 2023-01-24 04:10:05.455092: step: 480/466, loss: 0.04830469191074371 2023-01-24 04:10:06.090014: step: 482/466, loss: 0.02216290310025215 2023-01-24 04:10:06.664783: step: 484/466, loss: 0.007065951824188232 2023-01-24 04:10:07.260735: step: 486/466, loss: 0.08484815806150436 2023-01-24 04:10:07.956076: step: 488/466, loss: 0.035927895456552505 2023-01-24 04:10:08.593358: step: 490/466, loss: 0.010688919574022293 2023-01-24 04:10:09.232163: step: 492/466, loss: 0.0003162022912874818 2023-01-24 04:10:09.820467: step: 494/466, loss: 0.008528665639460087 2023-01-24 04:10:10.449372: step: 496/466, loss: 0.014202555641531944 2023-01-24 04:10:11.065833: step: 498/466, loss: 0.0950498878955841 2023-01-24 04:10:11.673937: step: 500/466, loss: 0.0037851862143725157 2023-01-24 04:10:12.254576: step: 502/466, loss: 0.002583063906058669 2023-01-24 04:10:12.881095: step: 504/466, loss: 0.09229692816734314 2023-01-24 04:10:13.579448: step: 
506/466, loss: 0.03515050932765007 2023-01-24 04:10:14.273779: step: 508/466, loss: 0.2170080989599228 2023-01-24 04:10:14.870811: step: 510/466, loss: 0.023440072312951088 2023-01-24 04:10:15.510114: step: 512/466, loss: 0.012734951451420784 2023-01-24 04:10:16.143025: step: 514/466, loss: 0.0022743670269846916 2023-01-24 04:10:16.780709: step: 516/466, loss: 0.014371275901794434 2023-01-24 04:10:17.328829: step: 518/466, loss: 0.017327800393104553 2023-01-24 04:10:17.954842: step: 520/466, loss: 0.06441115587949753 2023-01-24 04:10:18.684086: step: 522/466, loss: 0.036276645958423615 2023-01-24 04:10:19.267245: step: 524/466, loss: 0.0009572524577379227 2023-01-24 04:10:19.897777: step: 526/466, loss: 0.010352073237299919 2023-01-24 04:10:20.523107: step: 528/466, loss: 0.0002700319164432585 2023-01-24 04:10:21.146609: step: 530/466, loss: 0.04756855592131615 2023-01-24 04:10:21.722874: step: 532/466, loss: 0.03081398457288742 2023-01-24 04:10:22.322866: step: 534/466, loss: 0.013193175196647644 2023-01-24 04:10:22.955156: step: 536/466, loss: 0.08822406083345413 2023-01-24 04:10:23.539150: step: 538/466, loss: 0.02872275933623314 2023-01-24 04:10:24.136178: step: 540/466, loss: 0.0002939131227321923 2023-01-24 04:10:24.765479: step: 542/466, loss: 0.05053912475705147 2023-01-24 04:10:25.451888: step: 544/466, loss: 0.03908180445432663 2023-01-24 04:10:26.016274: step: 546/466, loss: 0.0028365193866193295 2023-01-24 04:10:26.600554: step: 548/466, loss: 1.3741116523742676 2023-01-24 04:10:27.226420: step: 550/466, loss: 0.013899214565753937 2023-01-24 04:10:27.863540: step: 552/466, loss: 0.011286936700344086 2023-01-24 04:10:28.394441: step: 554/466, loss: 0.009468899108469486 2023-01-24 04:10:28.984073: step: 556/466, loss: 0.003917319234460592 2023-01-24 04:10:29.607610: step: 558/466, loss: 0.010787371546030045 2023-01-24 04:10:30.297976: step: 560/466, loss: 0.009908036328852177 2023-01-24 04:10:30.879268: step: 562/466, loss: 0.012839104980230331 2023-01-24 
04:10:31.483342: step: 564/466, loss: 0.0020545353181660175 2023-01-24 04:10:32.216326: step: 566/466, loss: 0.02453666739165783 2023-01-24 04:10:32.921326: step: 568/466, loss: 0.032193608582019806 2023-01-24 04:10:33.511248: step: 570/466, loss: 0.08889345824718475 2023-01-24 04:10:34.134663: step: 572/466, loss: 0.3024088740348816 2023-01-24 04:10:34.720606: step: 574/466, loss: 0.0006328593008220196 2023-01-24 04:10:35.326694: step: 576/466, loss: 0.0075796786695718765 2023-01-24 04:10:35.970099: step: 578/466, loss: 0.012202093377709389 2023-01-24 04:10:36.593483: step: 580/466, loss: 0.2748091220855713 2023-01-24 04:10:37.216759: step: 582/466, loss: 0.08011313527822495 2023-01-24 04:10:37.827615: step: 584/466, loss: 0.007275673560798168 2023-01-24 04:10:38.434985: step: 586/466, loss: 0.023380331695079803 2023-01-24 04:10:39.060796: step: 588/466, loss: 0.03591989725828171 2023-01-24 04:10:39.645441: step: 590/466, loss: 0.0058691492304205894 2023-01-24 04:10:40.264093: step: 592/466, loss: 0.004461975302547216 2023-01-24 04:10:41.011054: step: 594/466, loss: 0.02557777799665928 2023-01-24 04:10:41.689715: step: 596/466, loss: 0.0014842869713902473 2023-01-24 04:10:42.358415: step: 598/466, loss: 0.014528783038258553 2023-01-24 04:10:42.986800: step: 600/466, loss: 0.0010791391832754016 2023-01-24 04:10:43.652887: step: 602/466, loss: 0.07167264074087143 2023-01-24 04:10:44.193224: step: 604/466, loss: 0.0077194636687636375 2023-01-24 04:10:44.827017: step: 606/466, loss: 0.18285511434078217 2023-01-24 04:10:45.427624: step: 608/466, loss: 0.0013037014286965132 2023-01-24 04:10:46.013395: step: 610/466, loss: 0.2825278043746948 2023-01-24 04:10:46.572277: step: 612/466, loss: 0.028179537504911423 2023-01-24 04:10:47.152093: step: 614/466, loss: 0.036346279084682465 2023-01-24 04:10:47.740640: step: 616/466, loss: 0.017021380364894867 2023-01-24 04:10:48.371041: step: 618/466, loss: 0.02601482905447483 2023-01-24 04:10:49.007388: step: 620/466, loss: 
0.037656866014003754 2023-01-24 04:10:49.605173: step: 622/466, loss: 0.02671785093843937 2023-01-24 04:10:50.209103: step: 624/466, loss: 0.000846438982989639 2023-01-24 04:10:50.894114: step: 626/466, loss: 0.0006022404413670301 2023-01-24 04:10:51.538644: step: 628/466, loss: 0.0027155440766364336 2023-01-24 04:10:52.205175: step: 630/466, loss: 0.03708767890930176 2023-01-24 04:10:52.859832: step: 632/466, loss: 0.007931017316877842 2023-01-24 04:10:53.444473: step: 634/466, loss: 0.006869449745863676 2023-01-24 04:10:54.076744: step: 636/466, loss: 0.049254097044467926 2023-01-24 04:10:54.687966: step: 638/466, loss: 0.12904299795627594 2023-01-24 04:10:55.366858: step: 640/466, loss: 0.020783551037311554 2023-01-24 04:10:55.967974: step: 642/466, loss: 0.005933032371103764 2023-01-24 04:10:56.624251: step: 644/466, loss: 0.06857472658157349 2023-01-24 04:10:57.253023: step: 646/466, loss: 0.004435302224010229 2023-01-24 04:10:57.916553: step: 648/466, loss: 0.717083752155304 2023-01-24 04:10:58.728495: step: 650/466, loss: 0.08154986053705215 2023-01-24 04:10:59.323049: step: 652/466, loss: 0.015391502529382706 2023-01-24 04:10:59.923276: step: 654/466, loss: 0.05044484883546829 2023-01-24 04:11:00.513182: step: 656/466, loss: 0.013044895604252815 2023-01-24 04:11:01.167510: step: 658/466, loss: 0.09905446320772171 2023-01-24 04:11:01.846772: step: 660/466, loss: 0.010607562959194183 2023-01-24 04:11:02.461303: step: 662/466, loss: 0.005811905954033136 2023-01-24 04:11:03.073163: step: 664/466, loss: 0.012407036498188972 2023-01-24 04:11:03.667776: step: 666/466, loss: 0.08473341166973114 2023-01-24 04:11:04.290384: step: 668/466, loss: 0.018354739993810654 2023-01-24 04:11:04.927401: step: 670/466, loss: 0.005217993166297674 2023-01-24 04:11:05.502943: step: 672/466, loss: 0.10953141003847122 2023-01-24 04:11:06.267731: step: 674/466, loss: 0.05236465856432915 2023-01-24 04:11:06.774474: step: 676/466, loss: 0.01544020976871252 2023-01-24 04:11:07.429443: 
step: 678/466, loss: 0.0276241023093462 2023-01-24 04:11:08.041451: step: 680/466, loss: 0.05030062049627304 2023-01-24 04:11:08.597763: step: 682/466, loss: 0.026244675740599632 2023-01-24 04:11:09.203992: step: 684/466, loss: 0.07645662128925323 2023-01-24 04:11:09.847995: step: 686/466, loss: 0.010076702572405338 2023-01-24 04:11:10.524206: step: 688/466, loss: 0.013777351938188076 2023-01-24 04:11:11.167891: step: 690/466, loss: 0.001946625066921115 2023-01-24 04:11:11.840434: step: 692/466, loss: 0.013552557677030563 2023-01-24 04:11:12.440193: step: 694/466, loss: 0.04471652954816818 2023-01-24 04:11:13.041333: step: 696/466, loss: 0.01028574537485838 2023-01-24 04:11:13.658270: step: 698/466, loss: 0.00933043658733368 2023-01-24 04:11:14.303183: step: 700/466, loss: 0.046223234385252 2023-01-24 04:11:15.036969: step: 702/466, loss: 0.08472418040037155 2023-01-24 04:11:15.702188: step: 704/466, loss: 0.04190603643655777 2023-01-24 04:11:16.332908: step: 706/466, loss: 0.2162904441356659 2023-01-24 04:11:16.922847: step: 708/466, loss: 0.01777159608900547 2023-01-24 04:11:17.537366: step: 710/466, loss: 0.018709806725382805 2023-01-24 04:11:18.163928: step: 712/466, loss: 0.04080433398485184 2023-01-24 04:11:18.763944: step: 714/466, loss: 0.024690769612789154 2023-01-24 04:11:19.436669: step: 716/466, loss: 0.026956753805279732 2023-01-24 04:11:20.153850: step: 718/466, loss: 0.23514507710933685 2023-01-24 04:11:20.785572: step: 720/466, loss: 0.005163929425179958 2023-01-24 04:11:21.467705: step: 722/466, loss: 0.16484905779361725 2023-01-24 04:11:22.059931: step: 724/466, loss: 0.04459630697965622 2023-01-24 04:11:22.613664: step: 726/466, loss: 0.0003833868831861764 2023-01-24 04:11:23.256250: step: 728/466, loss: 0.0007012172718532383 2023-01-24 04:11:23.864853: step: 730/466, loss: 0.009721334092319012 2023-01-24 04:11:24.599977: step: 732/466, loss: 0.006624647881835699 2023-01-24 04:11:25.203480: step: 734/466, loss: 0.0274251289665699 2023-01-24 
04:11:25.792619: step: 736/466, loss: 0.015381962060928345 2023-01-24 04:11:26.476098: step: 738/466, loss: 0.18253710865974426 2023-01-24 04:11:27.101615: step: 740/466, loss: 0.9736369848251343 2023-01-24 04:11:27.709499: step: 742/466, loss: 0.005604333244264126 2023-01-24 04:11:28.354650: step: 744/466, loss: 0.006971567869186401 2023-01-24 04:11:28.975471: step: 746/466, loss: 0.0017240039305761456 2023-01-24 04:11:29.589421: step: 748/466, loss: 0.028396597132086754 2023-01-24 04:11:30.160332: step: 750/466, loss: 0.0009606159874238074 2023-01-24 04:11:30.799828: step: 752/466, loss: 0.12095136940479279 2023-01-24 04:11:31.655027: step: 754/466, loss: 0.01669827103614807 2023-01-24 04:11:32.254714: step: 756/466, loss: 0.0016191216418519616 2023-01-24 04:11:32.954375: step: 758/466, loss: 0.024371657520532608 2023-01-24 04:11:33.527597: step: 760/466, loss: 0.004538081120699644 2023-01-24 04:11:34.166313: step: 762/466, loss: 0.006123277824372053 2023-01-24 04:11:34.793863: step: 764/466, loss: 0.133865624666214 2023-01-24 04:11:35.488005: step: 766/466, loss: 0.02243422344326973 2023-01-24 04:11:36.180492: step: 768/466, loss: 0.048773620277643204 2023-01-24 04:11:36.797725: step: 770/466, loss: 0.004993142560124397 2023-01-24 04:11:37.408417: step: 772/466, loss: 0.005450873170047998 2023-01-24 04:11:38.089299: step: 774/466, loss: 0.039966829121112823 2023-01-24 04:11:38.693672: step: 776/466, loss: 0.000651595531962812 2023-01-24 04:11:39.353240: step: 778/466, loss: 0.02831016108393669 2023-01-24 04:11:39.954954: step: 780/466, loss: 0.3915857970714569 2023-01-24 04:11:40.542795: step: 782/466, loss: 0.01388892438262701 2023-01-24 04:11:41.177553: step: 784/466, loss: 0.07443975657224655 2023-01-24 04:11:41.810303: step: 786/466, loss: 0.03358267992734909 2023-01-24 04:11:42.421990: step: 788/466, loss: 0.08037069439888 2023-01-24 04:11:43.068249: step: 790/466, loss: 0.03405177965760231 2023-01-24 04:11:43.660272: step: 792/466, loss: 
0.010468493215739727 2023-01-24 04:11:44.296200: step: 794/466, loss: 0.0027855599764734507 2023-01-24 04:11:44.935888: step: 796/466, loss: 0.007384442258626223 2023-01-24 04:11:45.580526: step: 798/466, loss: 0.017550082877278328 2023-01-24 04:11:46.253183: step: 800/466, loss: 0.021345291286706924 2023-01-24 04:11:46.843994: step: 802/466, loss: 0.014420836232602596 2023-01-24 04:11:47.435318: step: 804/466, loss: 0.0008694904972799122 2023-01-24 04:11:48.075355: step: 806/466, loss: 0.21264083683490753 2023-01-24 04:11:48.693393: step: 808/466, loss: 0.030158042907714844 2023-01-24 04:11:49.287212: step: 810/466, loss: 0.004542697686702013 2023-01-24 04:11:49.894511: step: 812/466, loss: 0.012855260632932186 2023-01-24 04:11:50.581619: step: 814/466, loss: 0.013972077518701553 2023-01-24 04:11:51.189213: step: 816/466, loss: 0.00239029573276639 2023-01-24 04:11:51.839334: step: 818/466, loss: 0.0032751683611422777 2023-01-24 04:11:52.418276: step: 820/466, loss: 0.002388720866292715 2023-01-24 04:11:53.066539: step: 822/466, loss: 0.006094999145716429 2023-01-24 04:11:53.701145: step: 824/466, loss: 0.034857261925935745 2023-01-24 04:11:54.318195: step: 826/466, loss: 0.026030782610177994 2023-01-24 04:11:54.885202: step: 828/466, loss: 0.004401189275085926 2023-01-24 04:11:55.472562: step: 830/466, loss: 0.007346487138420343 2023-01-24 04:11:56.103405: step: 832/466, loss: 0.011273558251559734 2023-01-24 04:11:56.724408: step: 834/466, loss: 0.00036735390312969685 2023-01-24 04:11:57.284206: step: 836/466, loss: 0.013762393966317177 2023-01-24 04:11:57.880537: step: 838/466, loss: 0.010150541551411152 2023-01-24 04:11:58.514301: step: 840/466, loss: 0.014261798933148384 2023-01-24 04:11:59.075376: step: 842/466, loss: 0.0029600472189486027 2023-01-24 04:11:59.659674: step: 844/466, loss: 0.005396983586251736 2023-01-24 04:12:00.192548: step: 846/466, loss: 4.712427471531555e-05 2023-01-24 04:12:00.874705: step: 848/466, loss: 0.023324286565184593 2023-01-24 
04:12:01.507837: step: 850/466, loss: 0.22444498538970947 2023-01-24 04:12:02.133039: step: 852/466, loss: 0.009761333465576172 2023-01-24 04:12:02.806809: step: 854/466, loss: 0.14486908912658691 2023-01-24 04:12:03.495654: step: 856/466, loss: 0.004524150863289833 2023-01-24 04:12:04.159353: step: 858/466, loss: 0.005788067821413279 2023-01-24 04:12:04.762145: step: 860/466, loss: 0.005012852605432272 2023-01-24 04:12:05.412971: step: 862/466, loss: 0.024343110620975494 2023-01-24 04:12:06.050259: step: 864/466, loss: 0.0030281671788543463 2023-01-24 04:12:06.671671: step: 866/466, loss: 0.005659808404743671 2023-01-24 04:12:07.344974: step: 868/466, loss: 0.0009139773319475353 2023-01-24 04:12:07.966359: step: 870/466, loss: 0.012200686149299145 2023-01-24 04:12:08.540716: step: 872/466, loss: 0.010772855952382088 2023-01-24 04:12:09.194476: step: 874/466, loss: 0.002077052602544427 2023-01-24 04:12:09.865317: step: 876/466, loss: 0.1010916456580162 2023-01-24 04:12:10.454617: step: 878/466, loss: 0.18734891712665558 2023-01-24 04:12:11.045408: step: 880/466, loss: 0.020228328183293343 2023-01-24 04:12:11.720822: step: 882/466, loss: 0.03781450167298317 2023-01-24 04:12:12.335879: step: 884/466, loss: 0.010309301316738129 2023-01-24 04:12:12.958249: step: 886/466, loss: 0.001091604819521308 2023-01-24 04:12:13.635788: step: 888/466, loss: 0.005849903449416161 2023-01-24 04:12:14.318336: step: 890/466, loss: 0.01180856954306364 2023-01-24 04:12:14.910021: step: 892/466, loss: 0.012540485709905624 2023-01-24 04:12:15.513152: step: 894/466, loss: 0.07636093348264694 2023-01-24 04:12:16.123770: step: 896/466, loss: 0.022477617487311363 2023-01-24 04:12:16.751407: step: 898/466, loss: 0.003358501475304365 2023-01-24 04:12:17.319308: step: 900/466, loss: 0.006681774277240038 2023-01-24 04:12:17.952454: step: 902/466, loss: 0.008023912087082863 2023-01-24 04:12:18.571025: step: 904/466, loss: 0.008172152563929558 2023-01-24 04:12:19.173050: step: 906/466, loss: 
0.012225516140460968 2023-01-24 04:12:19.810865: step: 908/466, loss: 0.0005171762895770371 2023-01-24 04:12:20.496138: step: 910/466, loss: 0.03175482898950577 2023-01-24 04:12:21.103413: step: 912/466, loss: 0.011137586086988449 2023-01-24 04:12:21.712780: step: 914/466, loss: 0.010821109637618065 2023-01-24 04:12:22.228596: step: 916/466, loss: 0.03788109868764877 2023-01-24 04:12:22.883275: step: 918/466, loss: 0.0038046720437705517 2023-01-24 04:12:23.549250: step: 920/466, loss: 0.1045636311173439 2023-01-24 04:12:24.194315: step: 922/466, loss: 0.023890666663646698 2023-01-24 04:12:24.747463: step: 924/466, loss: 0.005773146171122789 2023-01-24 04:12:25.358550: step: 926/466, loss: 0.000679883174598217 2023-01-24 04:12:26.045980: step: 928/466, loss: 0.01151447743177414 2023-01-24 04:12:26.561327: step: 930/466, loss: 0.006437981501221657 2023-01-24 04:12:27.183683: step: 932/466, loss: 0.003340943017974496
==================================================
Loss: 0.043
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3559793292203472, 'r': 0.3384167816686792, 'f1': 0.34697596097158356}, 'combined': 0.2556664975580089, 'epoch': 30}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35935018763595095, 'r': 0.3020534506125411, 'f1': 0.3282200395544938}, 'combined': 0.2176796117252601, 'epoch': 30}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35564095127610207, 'r': 0.2903053977272727, 'f1': 0.3196689259645464}, 'combined': 0.21311261730969758, 'epoch': 30}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3695936361540456, 'r': 0.28731998807143666, 'f1': 0.3233047244415487}, 'combined': 0.21099887279343174, 'epoch': 30}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3481891668380639, 'r': 0.3435642632937253, 'f1': 0.3458612545478381}, 'combined': 0.25484513492998595, 'epoch': 30}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3474197022461863, 'r': 0.2908700017939932, 'f1': 0.31663982287659015}, 'combined': 0.20999946802177996, 'epoch': 30}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3, 'r': 0.3, 'f1': 0.3}, 'combined': 0.19999999999999998, 'epoch': 30}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4264705882352941, 'r': 0.31521739130434784, 'f1': 0.3625}, 'combined': 0.24166666666666664, 'epoch': 30}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.475, 'r': 0.16379310344827586, 'f1': 0.24358974358974356}, 'combined': 0.16239316239316237, 'epoch': 30}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 31
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:14:59.081994: step: 2/466, loss: 0.03907116875052452 2023-01-24 04:14:59.677867: step: 4/466, loss: 0.022938139736652374 2023-01-24 04:15:00.337534: step: 6/466, loss: 0.0043020425364375114 2023-01-24 04:15:00.997659: step: 8/466, loss: 0.017011329531669617 2023-01-24 04:15:01.626079: step: 10/466, loss: 0.00695255259051919 2023-01-24 04:15:02.380974: step: 12/466, loss: 0.6651405692100525 2023-01-24 04:15:03.021468: step: 14/466, loss: 0.02912992797791958 2023-01-24 04:15:03.607612: step: 16/466, loss: 0.044540539383888245 2023-01-24 04:15:04.230687: step: 18/466, loss: 0.03385673463344574 2023-01-24 04:15:04.869025: step: 20/466, loss: 0.002722368575632572 2023-01-24
04:15:05.510917: step: 22/466, loss: 0.00797630287706852 2023-01-24 04:15:06.106853: step: 24/466, loss: 0.013233490288257599 2023-01-24 04:15:06.745086: step: 26/466, loss: 0.11989618092775345 2023-01-24 04:15:07.406146: step: 28/466, loss: 0.00013836105063091964 2023-01-24 04:15:08.017426: step: 30/466, loss: 0.04954499006271362 2023-01-24 04:15:08.591352: step: 32/466, loss: 0.0007336020935326815 2023-01-24 04:15:09.252659: step: 34/466, loss: 0.0631098598241806 2023-01-24 04:15:09.875778: step: 36/466, loss: 0.016741055995225906 2023-01-24 04:15:10.500370: step: 38/466, loss: 0.00044407794484868646 2023-01-24 04:15:11.141572: step: 40/466, loss: 0.0063453433103859425 2023-01-24 04:15:11.815293: step: 42/466, loss: 0.061242565512657166 2023-01-24 04:15:12.442270: step: 44/466, loss: 0.03584650903940201 2023-01-24 04:15:13.134391: step: 46/466, loss: 0.00015868403716012836 2023-01-24 04:15:13.739754: step: 48/466, loss: 0.1516694277524948 2023-01-24 04:15:14.363281: step: 50/466, loss: 0.0011374569730833173 2023-01-24 04:15:14.945200: step: 52/466, loss: 0.025352753698825836 2023-01-24 04:15:15.575655: step: 54/466, loss: 0.004081857856363058 2023-01-24 04:15:16.245856: step: 56/466, loss: 0.011429132893681526 2023-01-24 04:15:16.765895: step: 58/466, loss: 0.0034054499119520187 2023-01-24 04:15:17.407638: step: 60/466, loss: 0.005714302882552147 2023-01-24 04:15:18.122892: step: 62/466, loss: 0.0326569564640522 2023-01-24 04:15:18.769303: step: 64/466, loss: 0.024123404175043106 2023-01-24 04:15:19.415048: step: 66/466, loss: 0.0016146288253366947 2023-01-24 04:15:20.057011: step: 68/466, loss: 0.038427501916885376 2023-01-24 04:15:20.693642: step: 70/466, loss: 0.03747810795903206 2023-01-24 04:15:21.328125: step: 72/466, loss: 0.0019054411677643657 2023-01-24 04:15:21.927835: step: 74/466, loss: 0.002627598587423563 2023-01-24 04:15:22.628530: step: 76/466, loss: 0.011531215161085129 2023-01-24 04:15:23.233647: step: 78/466, loss: 0.017680393531918526 
2023-01-24 04:15:23.863210: step: 80/466, loss: 0.019427277147769928 2023-01-24 04:15:24.476138: step: 82/466, loss: 0.029101530089974403 2023-01-24 04:15:25.096288: step: 84/466, loss: 0.03290451690554619 2023-01-24 04:15:25.818918: step: 86/466, loss: 5.953895379207097e-05 2023-01-24 04:15:26.372388: step: 88/466, loss: 0.00045504237641580403 2023-01-24 04:15:27.095577: step: 90/466, loss: 0.019244128838181496 2023-01-24 04:15:27.738183: step: 92/466, loss: 0.037002597004175186 2023-01-24 04:15:28.430453: step: 94/466, loss: 0.0021970639936625957 2023-01-24 04:15:29.175500: step: 96/466, loss: 0.020365744829177856 2023-01-24 04:15:29.793702: step: 98/466, loss: 0.005219418555498123 2023-01-24 04:15:30.430264: step: 100/466, loss: 0.020970579236745834 2023-01-24 04:15:31.030455: step: 102/466, loss: 0.02584446780383587 2023-01-24 04:15:31.674855: step: 104/466, loss: 0.02854795567691326 2023-01-24 04:15:32.318577: step: 106/466, loss: 0.0016252042260020971 2023-01-24 04:15:32.950875: step: 108/466, loss: 0.642196774482727 2023-01-24 04:15:33.574623: step: 110/466, loss: 0.006834291387349367 2023-01-24 04:15:34.249683: step: 112/466, loss: 0.27037227153778076 2023-01-24 04:15:34.806079: step: 114/466, loss: 0.011913014575839043 2023-01-24 04:15:35.412729: step: 116/466, loss: 0.03166844695806503 2023-01-24 04:15:36.120182: step: 118/466, loss: 0.014308612793684006 2023-01-24 04:15:36.759923: step: 120/466, loss: 0.03876231610774994 2023-01-24 04:15:37.404668: step: 122/466, loss: 0.02030685916543007 2023-01-24 04:15:38.146000: step: 124/466, loss: 0.004717486910521984 2023-01-24 04:15:38.726897: step: 126/466, loss: 0.03216750919818878 2023-01-24 04:15:39.309265: step: 128/466, loss: 0.011396355926990509 2023-01-24 04:15:40.040588: step: 130/466, loss: 0.034590549767017365 2023-01-24 04:15:40.686983: step: 132/466, loss: 0.0032397115137428045 2023-01-24 04:15:41.342506: step: 134/466, loss: 0.06025297939777374 2023-01-24 04:15:41.985404: step: 136/466, loss: 
0.006710045505315065 2023-01-24 04:15:42.607408: step: 138/466, loss: 0.009684402495622635 2023-01-24 04:15:43.267060: step: 140/466, loss: 0.04758431017398834 2023-01-24 04:15:43.849976: step: 142/466, loss: 0.0214069951325655 2023-01-24 04:15:44.422652: step: 144/466, loss: 0.0012129333335906267 2023-01-24 04:15:45.058395: step: 146/466, loss: 0.004989032633602619 2023-01-24 04:15:45.722963: step: 148/466, loss: 0.005683788564056158 2023-01-24 04:15:46.375122: step: 150/466, loss: 0.007793087977916002 2023-01-24 04:15:47.045094: step: 152/466, loss: 0.010247008875012398 2023-01-24 04:15:47.612570: step: 154/466, loss: 0.0003937317233067006 2023-01-24 04:15:48.315769: step: 156/466, loss: 0.004915554076433182 2023-01-24 04:15:48.973779: step: 158/466, loss: 0.005361888092011213 2023-01-24 04:15:49.592310: step: 160/466, loss: 0.014959918335080147 2023-01-24 04:15:50.163915: step: 162/466, loss: 0.008497266098856926 2023-01-24 04:15:50.751339: step: 164/466, loss: 0.051364511251449585 2023-01-24 04:15:51.364904: step: 166/466, loss: 0.007937022484838963 2023-01-24 04:15:51.950138: step: 168/466, loss: 0.013537651859223843 2023-01-24 04:15:52.566901: step: 170/466, loss: 0.004952648654580116 2023-01-24 04:15:53.226306: step: 172/466, loss: 0.015321213752031326 2023-01-24 04:15:53.848019: step: 174/466, loss: 0.0075547955930233 2023-01-24 04:15:54.485803: step: 176/466, loss: 0.08551667630672455 2023-01-24 04:15:55.073191: step: 178/466, loss: 0.05356985330581665 2023-01-24 04:15:55.692273: step: 180/466, loss: 0.007350060157477856 2023-01-24 04:15:56.291446: step: 182/466, loss: 0.10914287716150284 2023-01-24 04:15:56.932434: step: 184/466, loss: 0.02786598540842533 2023-01-24 04:15:57.547033: step: 186/466, loss: 0.0018171078991144896 2023-01-24 04:15:58.177252: step: 188/466, loss: 0.008747768588364124 2023-01-24 04:15:58.842436: step: 190/466, loss: 0.019645584747195244 2023-01-24 04:15:59.459919: step: 192/466, loss: 0.0018772552721202374 2023-01-24 
04:16:00.087629: step: 194/466, loss: 0.026600424200296402 2023-01-24 04:16:00.685757: step: 196/466, loss: 0.008282824419438839 2023-01-24 04:16:01.353637: step: 198/466, loss: 0.011049061082303524 2023-01-24 04:16:02.108190: step: 200/466, loss: 0.0026763228233903646 2023-01-24 04:16:02.740666: step: 202/466, loss: 0.006706198211759329 2023-01-24 04:16:03.358368: step: 204/466, loss: 0.11584700644016266 2023-01-24 04:16:03.934060: step: 206/466, loss: 0.030152572318911552 2023-01-24 04:16:04.635281: step: 208/466, loss: 0.020291537046432495 2023-01-24 04:16:05.247332: step: 210/466, loss: 0.0031722483690828085 2023-01-24 04:16:05.876020: step: 212/466, loss: 0.03458666428923607 2023-01-24 04:16:06.521184: step: 214/466, loss: 0.03625361621379852 2023-01-24 04:16:07.152152: step: 216/466, loss: 0.0018723373068496585 2023-01-24 04:16:07.710300: step: 218/466, loss: 0.10305580496788025 2023-01-24 04:16:08.268338: step: 220/466, loss: 0.0006719183875247836 2023-01-24 04:16:08.880336: step: 222/466, loss: 0.013313040137290955 2023-01-24 04:16:09.505160: step: 224/466, loss: 0.0028870024252682924 2023-01-24 04:16:10.089704: step: 226/466, loss: 0.0007040125783532858 2023-01-24 04:16:10.737133: step: 228/466, loss: 0.16122116148471832 2023-01-24 04:16:11.324913: step: 230/466, loss: 0.04338943958282471 2023-01-24 04:16:11.949182: step: 232/466, loss: 0.0360637828707695 2023-01-24 04:16:12.707291: step: 234/466, loss: 0.07026376575231552 2023-01-24 04:16:13.372314: step: 236/466, loss: 0.05887288227677345 2023-01-24 04:16:13.967545: step: 238/466, loss: 0.006853953935205936 2023-01-24 04:16:14.613027: step: 240/466, loss: 0.0164225772023201 2023-01-24 04:16:15.275327: step: 242/466, loss: 0.052106983959674835 2023-01-24 04:16:15.962540: step: 244/466, loss: 0.16110815107822418 2023-01-24 04:16:16.612471: step: 246/466, loss: 0.02118993178009987 2023-01-24 04:16:17.257515: step: 248/466, loss: 0.007692660205066204 2023-01-24 04:16:17.888225: step: 250/466, loss: 
0.036612603813409805 2023-01-24 04:16:18.560676: step: 252/466, loss: 0.1068938821554184 2023-01-24 04:16:19.197350: step: 254/466, loss: 0.04812590405344963 2023-01-24 04:16:19.810209: step: 256/466, loss: 0.06173084303736687 2023-01-24 04:16:20.470805: step: 258/466, loss: 0.02443801425397396 2023-01-24 04:16:21.057488: step: 260/466, loss: 0.0062148310244083405 2023-01-24 04:16:21.625326: step: 262/466, loss: 0.022380737587809563 2023-01-24 04:16:22.185839: step: 264/466, loss: 0.012677857652306557 2023-01-24 04:16:22.777952: step: 266/466, loss: 0.003117139218375087 2023-01-24 04:16:23.386752: step: 268/466, loss: 0.014295085333287716 2023-01-24 04:16:23.956053: step: 270/466, loss: 0.022981129586696625 2023-01-24 04:16:24.624849: step: 272/466, loss: 0.04769422858953476 2023-01-24 04:16:25.314146: step: 274/466, loss: 0.04571692645549774 2023-01-24 04:16:25.941623: step: 276/466, loss: 0.015166614204645157 2023-01-24 04:16:26.520966: step: 278/466, loss: 9.916900307871401e-05 2023-01-24 04:16:27.175344: step: 280/466, loss: 0.30812007188796997 2023-01-24 04:16:27.782030: step: 282/466, loss: 0.0048457966186106205 2023-01-24 04:16:28.397441: step: 284/466, loss: 0.03308245912194252 2023-01-24 04:16:29.010444: step: 286/466, loss: 0.043762821704149246 2023-01-24 04:16:29.598328: step: 288/466, loss: 2.7630398273468018 2023-01-24 04:16:30.191404: step: 290/466, loss: 7.269015789031982 2023-01-24 04:16:30.852983: step: 292/466, loss: 0.004670634400099516 2023-01-24 04:16:31.491428: step: 294/466, loss: 0.005938251968473196 2023-01-24 04:16:32.117395: step: 296/466, loss: 0.01262521743774414 2023-01-24 04:16:32.701378: step: 298/466, loss: 0.00548663130030036 2023-01-24 04:16:33.305508: step: 300/466, loss: 0.08520643413066864 2023-01-24 04:16:34.081432: step: 302/466, loss: 0.050616439431905746 2023-01-24 04:16:34.652334: step: 304/466, loss: 0.015373089350759983 2023-01-24 04:16:35.231647: step: 306/466, loss: 0.013275873847305775 2023-01-24 04:16:35.847203: 
step: 308/466, loss: 0.0582607239484787 2023-01-24 04:16:36.493775: step: 310/466, loss: 0.37358009815216064 2023-01-24 04:16:37.094213: step: 312/466, loss: 0.2164192646741867 2023-01-24 04:16:37.761587: step: 314/466, loss: 0.048840828239917755 2023-01-24 04:16:38.407938: step: 316/466, loss: 0.0010105979163199663 2023-01-24 04:16:38.998104: step: 318/466, loss: 0.014398284256458282 2023-01-24 04:16:39.650833: step: 320/466, loss: 0.004466744605451822 2023-01-24 04:16:40.270189: step: 322/466, loss: 0.0074524241499602795 2023-01-24 04:16:40.835190: step: 324/466, loss: 0.02958526834845543 2023-01-24 04:16:41.493935: step: 326/466, loss: 0.01925458014011383 2023-01-24 04:16:42.179370: step: 328/466, loss: 0.022651994600892067 2023-01-24 04:16:42.821789: step: 330/466, loss: 0.0065770456567406654 2023-01-24 04:16:43.451587: step: 332/466, loss: 0.03941261023283005 2023-01-24 04:16:44.062070: step: 334/466, loss: 0.016719456762075424 2023-01-24 04:16:44.724436: step: 336/466, loss: 0.04962924122810364 2023-01-24 04:16:45.345756: step: 338/466, loss: 0.0021794959902763367 2023-01-24 04:16:46.031480: step: 340/466, loss: 0.00030944435275159776 2023-01-24 04:16:46.624224: step: 342/466, loss: 0.010837533511221409 2023-01-24 04:16:47.214519: step: 344/466, loss: 0.021002069115638733 2023-01-24 04:16:47.761751: step: 346/466, loss: 0.0004114443436264992 2023-01-24 04:16:48.429860: step: 348/466, loss: 4.3047621147707105e-05 2023-01-24 04:16:49.067828: step: 350/466, loss: 0.0005217364523559809 2023-01-24 04:16:49.650189: step: 352/466, loss: 0.013475162908434868 2023-01-24 04:16:50.247425: step: 354/466, loss: 0.06028321385383606 2023-01-24 04:16:50.874275: step: 356/466, loss: 0.001066471915692091 2023-01-24 04:16:51.524325: step: 358/466, loss: 0.02661680243909359 2023-01-24 04:16:52.135526: step: 360/466, loss: 0.1952916979789734 2023-01-24 04:16:52.796861: step: 362/466, loss: 0.0018600921612232924 2023-01-24 04:16:53.446422: step: 364/466, loss: 0.012644550763070583 
2023-01-24 04:16:54.064100: step: 366/466, loss: 0.026175061240792274 2023-01-24 04:16:54.669160: step: 368/466, loss: 0.010393361561000347 2023-01-24 04:16:55.352834: step: 370/466, loss: 0.020793933421373367 2023-01-24 04:16:56.014496: step: 372/466, loss: 0.0021076411940157413 2023-01-24 04:16:56.565772: step: 374/466, loss: 0.019097154960036278 2023-01-24 04:16:57.182737: step: 376/466, loss: 0.002498042769730091 2023-01-24 04:16:57.818531: step: 378/466, loss: 0.010241997428238392 2023-01-24 04:16:58.485912: step: 380/466, loss: 0.008600963279604912 2023-01-24 04:16:59.114095: step: 382/466, loss: 0.018622539937496185 2023-01-24 04:16:59.743188: step: 384/466, loss: 0.07526848465204239 2023-01-24 04:17:00.399914: step: 386/466, loss: 0.019352687522768974 2023-01-24 04:17:00.996132: step: 388/466, loss: 0.061983659863471985 2023-01-24 04:17:01.617447: step: 390/466, loss: 0.011880343779921532 2023-01-24 04:17:02.275422: step: 392/466, loss: 0.22809994220733643 2023-01-24 04:17:02.913740: step: 394/466, loss: 0.04433238133788109 2023-01-24 04:17:03.542990: step: 396/466, loss: 0.018859324976801872 2023-01-24 04:17:04.175197: step: 398/466, loss: 0.0173000730574131 2023-01-24 04:17:04.776866: step: 400/466, loss: 0.006490845233201981 2023-01-24 04:17:05.391975: step: 402/466, loss: 0.012548327445983887 2023-01-24 04:17:06.021272: step: 404/466, loss: 0.01079154945909977 2023-01-24 04:17:06.639815: step: 406/466, loss: 0.009839864447712898 2023-01-24 04:17:07.256331: step: 408/466, loss: 0.04299784451723099 2023-01-24 04:17:07.902321: step: 410/466, loss: 0.033929768949747086 2023-01-24 04:17:08.565175: step: 412/466, loss: 0.015043784864246845 2023-01-24 04:17:09.168724: step: 414/466, loss: 0.003470017807558179 2023-01-24 04:17:09.789094: step: 416/466, loss: 0.000816081534139812 2023-01-24 04:17:10.379809: step: 418/466, loss: 0.014361785724759102 2023-01-24 04:17:10.948382: step: 420/466, loss: 0.025000043213367462 2023-01-24 04:17:11.579898: step: 422/466, 
loss: 0.024604009464383125 2023-01-24 04:17:12.165101: step: 424/466, loss: 0.0064855716191232204 2023-01-24 04:17:12.756474: step: 426/466, loss: 0.003926909063011408 2023-01-24 04:17:13.377230: step: 428/466, loss: 0.0009603975340723991 2023-01-24 04:17:13.969927: step: 430/466, loss: 0.002520288573578 2023-01-24 04:17:14.630802: step: 432/466, loss: 0.014697364531457424 2023-01-24 04:17:15.280189: step: 434/466, loss: 0.015412437729537487 2023-01-24 04:17:15.907483: step: 436/466, loss: 0.00022073285072110593 2023-01-24 04:17:16.462166: step: 438/466, loss: 0.0008744518272578716 2023-01-24 04:17:17.048168: step: 440/466, loss: 0.02448434755206108 2023-01-24 04:17:17.650692: step: 442/466, loss: 0.01597929373383522 2023-01-24 04:17:18.286085: step: 444/466, loss: 0.009523588232696056 2023-01-24 04:17:18.858286: step: 446/466, loss: 0.0016114782774820924 2023-01-24 04:17:19.537283: step: 448/466, loss: 0.0013937718467786908 2023-01-24 04:17:20.145565: step: 450/466, loss: 0.0294453427195549 2023-01-24 04:17:20.807708: step: 452/466, loss: 0.02636844851076603 2023-01-24 04:17:21.478916: step: 454/466, loss: 0.027769722044467926 2023-01-24 04:17:22.110243: step: 456/466, loss: 0.06924347579479218 2023-01-24 04:17:22.740387: step: 458/466, loss: 0.009108162485063076 2023-01-24 04:17:23.351304: step: 460/466, loss: 0.040876325219869614 2023-01-24 04:17:23.932179: step: 462/466, loss: 0.01908290758728981 2023-01-24 04:17:24.575602: step: 464/466, loss: 0.0011091256747022271 2023-01-24 04:17:25.149743: step: 466/466, loss: 0.012499526143074036 2023-01-24 04:17:25.766901: step: 468/466, loss: 0.04883186146616936 2023-01-24 04:17:26.404716: step: 470/466, loss: 0.0036303414963185787 2023-01-24 04:17:27.000119: step: 472/466, loss: 0.13335932791233063 2023-01-24 04:17:27.652277: step: 474/466, loss: 0.0019364000763744116 2023-01-24 04:17:28.243210: step: 476/466, loss: 7.695942622376606e-05 2023-01-24 04:17:28.890891: step: 478/466, loss: 0.0039782109670341015 2023-01-24 
04:17:29.523531: step: 480/466, loss: 0.019232138991355896 2023-01-24 04:17:30.178105: step: 482/466, loss: 0.01933993399143219 2023-01-24 04:17:30.766900: step: 484/466, loss: 0.10488546639680862 2023-01-24 04:17:31.347526: step: 486/466, loss: 0.0014955231454223394 2023-01-24 04:17:32.022439: step: 488/466, loss: 0.013511566445231438 2023-01-24 04:17:32.643560: step: 490/466, loss: 0.021985583007335663 2023-01-24 04:17:33.228761: step: 492/466, loss: 0.011495518498122692 2023-01-24 04:17:34.005863: step: 494/466, loss: 0.023479973897337914 2023-01-24 04:17:34.626967: step: 496/466, loss: 0.02006375789642334 2023-01-24 04:17:35.262347: step: 498/466, loss: 0.008442089892923832 2023-01-24 04:17:35.929848: step: 500/466, loss: 0.0020238172728568316 2023-01-24 04:17:36.507832: step: 502/466, loss: 0.001612320076674223 2023-01-24 04:17:37.084217: step: 504/466, loss: 0.009541473351418972 2023-01-24 04:17:37.719769: step: 506/466, loss: 0.8183833360671997 2023-01-24 04:17:38.371587: step: 508/466, loss: 0.08022051304578781 2023-01-24 04:17:38.985514: step: 510/466, loss: 0.015515362843871117 2023-01-24 04:17:39.633306: step: 512/466, loss: 0.02318497560918331 2023-01-24 04:17:40.244172: step: 514/466, loss: 0.004739983938634396 2023-01-24 04:17:40.844421: step: 516/466, loss: 0.06561258435249329 2023-01-24 04:17:41.513449: step: 518/466, loss: 0.00429729325696826 2023-01-24 04:17:42.132674: step: 520/466, loss: 0.003679410321637988 2023-01-24 04:17:42.782316: step: 522/466, loss: 0.0016411789692938328 2023-01-24 04:17:43.471610: step: 524/466, loss: 0.04868793487548828 2023-01-24 04:17:44.018432: step: 526/466, loss: 0.013301272876560688 2023-01-24 04:17:44.686646: step: 528/466, loss: 0.6133697032928467 2023-01-24 04:17:45.304566: step: 530/466, loss: 0.0038876847829669714 2023-01-24 04:17:45.970750: step: 532/466, loss: 0.0029869128484278917 2023-01-24 04:17:46.628976: step: 534/466, loss: 0.017007049173116684 2023-01-24 04:17:47.247645: step: 536/466, loss: 
0.027911217883229256 2023-01-24 04:17:47.939411: step: 538/466, loss: 0.011974958702921867 2023-01-24 04:17:48.566514: step: 540/466, loss: 0.012257498688995838 2023-01-24 04:17:49.178977: step: 542/466, loss: 0.0002447470906190574 2023-01-24 04:17:49.815189: step: 544/466, loss: 0.0006711081368848681 2023-01-24 04:17:50.451773: step: 546/466, loss: 0.04007217660546303 2023-01-24 04:17:51.051204: step: 548/466, loss: 0.0033877098467200994 2023-01-24 04:17:51.643707: step: 550/466, loss: 0.0033911005593836308 2023-01-24 04:17:52.311616: step: 552/466, loss: 0.03495388478040695 2023-01-24 04:17:52.926370: step: 554/466, loss: 0.01888686791062355 2023-01-24 04:17:53.551563: step: 556/466, loss: 0.010386298410594463 2023-01-24 04:17:54.237168: step: 558/466, loss: 0.9091969132423401 2023-01-24 04:17:54.900945: step: 560/466, loss: 0.004457194823771715 2023-01-24 04:17:55.460884: step: 562/466, loss: 0.03882373869419098 2023-01-24 04:17:56.054590: step: 564/466, loss: 0.005371954757720232 2023-01-24 04:17:56.645188: step: 566/466, loss: 0.024788759648799896 2023-01-24 04:17:57.245384: step: 568/466, loss: 0.008069412782788277 2023-01-24 04:17:57.816478: step: 570/466, loss: 0.028956662863492966 2023-01-24 04:17:58.505844: step: 572/466, loss: 0.0024390460457652807 2023-01-24 04:17:59.197393: step: 574/466, loss: 0.00887396652251482 2023-01-24 04:17:59.822012: step: 576/466, loss: 0.008376321755349636 2023-01-24 04:18:00.474984: step: 578/466, loss: 0.0015364962164312601 2023-01-24 04:18:01.125508: step: 580/466, loss: 0.029758483171463013 2023-01-24 04:18:01.779277: step: 582/466, loss: 0.010353045538067818 2023-01-24 04:18:02.431708: step: 584/466, loss: 0.0016091320430859923 2023-01-24 04:18:03.077811: step: 586/466, loss: 0.03720836341381073 2023-01-24 04:18:03.759335: step: 588/466, loss: 0.010439724661409855 2023-01-24 04:18:04.494874: step: 590/466, loss: 0.02553858608007431 2023-01-24 04:18:05.101456: step: 592/466, loss: 0.007521891500800848 2023-01-24 
04:18:05.717987: step: 594/466, loss: 9.809506445890293e-05 2023-01-24 04:18:06.419998: step: 596/466, loss: 0.013808910734951496 2023-01-24 04:18:07.124864: step: 598/466, loss: 0.027568509802222252 2023-01-24 04:18:07.747927: step: 600/466, loss: 0.007159669417887926 2023-01-24 04:18:08.372632: step: 602/466, loss: 0.17362605035305023 2023-01-24 04:18:09.042440: step: 604/466, loss: 0.036958254873752594 2023-01-24 04:18:09.715329: step: 606/466, loss: 0.0023265182971954346 2023-01-24 04:18:10.325771: step: 608/466, loss: 0.03575790673494339 2023-01-24 04:18:10.929113: step: 610/466, loss: 0.0038715917617082596 2023-01-24 04:18:11.631047: step: 612/466, loss: 0.02022351138293743 2023-01-24 04:18:12.278434: step: 614/466, loss: 0.02672039344906807 2023-01-24 04:18:12.856906: step: 616/466, loss: 0.005389675032347441 2023-01-24 04:18:13.472535: step: 618/466, loss: 0.0008047828450798988 2023-01-24 04:18:14.104549: step: 620/466, loss: 0.006716866511851549 2023-01-24 04:18:14.661739: step: 622/466, loss: 0.0020203841850161552 2023-01-24 04:18:15.309410: step: 624/466, loss: 0.031006475910544395 2023-01-24 04:18:15.960796: step: 626/466, loss: 0.010505175217986107 2023-01-24 04:18:16.657988: step: 628/466, loss: 0.007487665396183729 2023-01-24 04:18:17.288466: step: 630/466, loss: 0.021982848644256592 2023-01-24 04:18:17.857504: step: 632/466, loss: 0.003571385983377695 2023-01-24 04:18:18.503810: step: 634/466, loss: 0.011617590673267841 2023-01-24 04:18:19.105637: step: 636/466, loss: 0.013700786046683788 2023-01-24 04:18:19.731544: step: 638/466, loss: 0.08118680864572525 2023-01-24 04:18:20.411008: step: 640/466, loss: 0.03135356307029724 2023-01-24 04:18:21.007976: step: 642/466, loss: 0.0007427539676427841 2023-01-24 04:18:21.594160: step: 644/466, loss: 0.03151165693998337 2023-01-24 04:18:22.249541: step: 646/466, loss: 0.10037717968225479 2023-01-24 04:18:22.915550: step: 648/466, loss: 0.0019138669595122337 2023-01-24 04:18:23.508558: step: 650/466, loss: 
0.00436336500570178 2023-01-24 04:18:24.169535: step: 652/466, loss: 0.009080728515982628 2023-01-24 04:18:24.778984: step: 654/466, loss: 0.007250132970511913 2023-01-24 04:18:25.336709: step: 656/466, loss: 0.010160627774894238 2023-01-24 04:18:25.905143: step: 658/466, loss: 0.005207338836044073 2023-01-24 04:18:26.482127: step: 660/466, loss: 0.008738711476325989 2023-01-24 04:18:27.117059: step: 662/466, loss: 0.012235710397362709 2023-01-24 04:18:27.753705: step: 664/466, loss: 0.015246798284351826 2023-01-24 04:18:28.371253: step: 666/466, loss: 0.05778932943940163 2023-01-24 04:18:29.016649: step: 668/466, loss: 0.08114214986562729 2023-01-24 04:18:29.633660: step: 670/466, loss: 0.0023649095091968775 2023-01-24 04:18:30.274979: step: 672/466, loss: 0.026188427582383156 2023-01-24 04:18:30.982570: step: 674/466, loss: 0.002187538892030716 2023-01-24 04:18:31.542892: step: 676/466, loss: 0.030874181538820267 2023-01-24 04:18:32.196821: step: 678/466, loss: 0.07868461310863495 2023-01-24 04:18:32.825578: step: 680/466, loss: 0.004194202832877636 2023-01-24 04:18:33.391027: step: 682/466, loss: 0.0002074727526633069 2023-01-24 04:18:33.968114: step: 684/466, loss: 0.002662382321432233 2023-01-24 04:18:34.571640: step: 686/466, loss: 0.0001038382833939977 2023-01-24 04:18:35.192695: step: 688/466, loss: 0.021273411810398102 2023-01-24 04:18:35.827566: step: 690/466, loss: 0.008990940637886524 2023-01-24 04:18:36.500208: step: 692/466, loss: 0.044320378452539444 2023-01-24 04:18:37.147093: step: 694/466, loss: 0.017037227749824524 2023-01-24 04:18:37.783430: step: 696/466, loss: 0.04047022759914398 2023-01-24 04:18:38.413705: step: 698/466, loss: 0.01361551322042942 2023-01-24 04:18:38.990808: step: 700/466, loss: 0.006879597902297974 2023-01-24 04:18:39.595668: step: 702/466, loss: 0.060287315398454666 2023-01-24 04:18:40.204139: step: 704/466, loss: 0.003986467607319355 2023-01-24 04:18:40.797434: step: 706/466, loss: 0.08152619749307632 2023-01-24 
04:18:41.519149: step: 708/466, loss: 0.001346687087789178 2023-01-24 04:18:42.198422: step: 710/466, loss: 0.08224773406982422 2023-01-24 04:18:42.827506: step: 712/466, loss: 0.018333574756979942 2023-01-24 04:18:43.475218: step: 714/466, loss: 0.0008959770784713328 2023-01-24 04:18:44.116519: step: 716/466, loss: 0.020969970151782036 2023-01-24 04:18:44.714883: step: 718/466, loss: 0.02299494855105877 2023-01-24 04:18:45.395684: step: 720/466, loss: 0.047333959490060806 2023-01-24 04:18:46.026193: step: 722/466, loss: 0.005843609105795622 2023-01-24 04:18:46.636917: step: 724/466, loss: 0.0007942087249830365 2023-01-24 04:18:47.195489: step: 726/466, loss: 0.00327915302477777 2023-01-24 04:18:47.886268: step: 728/466, loss: 0.1284780651330948 2023-01-24 04:18:48.470797: step: 730/466, loss: 0.0004653561918530613 2023-01-24 04:18:49.076428: step: 732/466, loss: 0.011936572380363941 2023-01-24 04:18:49.688525: step: 734/466, loss: 0.0001334332046099007 2023-01-24 04:18:50.269100: step: 736/466, loss: 0.031935177743434906 2023-01-24 04:18:50.843846: step: 738/466, loss: 0.034123290330171585 2023-01-24 04:18:51.430727: step: 740/466, loss: 0.0009675708715803921 2023-01-24 04:18:52.076147: step: 742/466, loss: 0.005988074000924826 2023-01-24 04:18:52.678924: step: 744/466, loss: 0.020195962861180305 2023-01-24 04:18:53.372210: step: 746/466, loss: 0.030538450926542282 2023-01-24 04:18:54.007682: step: 748/466, loss: 0.004954514559358358 2023-01-24 04:18:54.643896: step: 750/466, loss: 0.024093136191368103 2023-01-24 04:18:55.205048: step: 752/466, loss: 0.006772663444280624 2023-01-24 04:18:55.819677: step: 754/466, loss: 0.0009768909076228738 2023-01-24 04:18:56.424504: step: 756/466, loss: 0.01045869942754507 2023-01-24 04:18:57.093111: step: 758/466, loss: 0.003131929552182555 2023-01-24 04:18:57.692518: step: 760/466, loss: 0.009658971801400185 2023-01-24 04:18:58.304211: step: 762/466, loss: 0.0002491218037903309 2023-01-24 04:18:58.867955: step: 764/466, loss: 
0.0007519605569541454 2023-01-24 04:18:59.467871: step: 766/466, loss: 0.010694680735468864 2023-01-24 04:19:00.019633: step: 768/466, loss: 0.0009108647354878485 2023-01-24 04:19:00.628052: step: 770/466, loss: 0.03541029989719391 2023-01-24 04:19:01.317831: step: 772/466, loss: 0.030286915600299835 2023-01-24 04:19:01.891232: step: 774/466, loss: 0.007199748884886503 2023-01-24 04:19:02.532113: step: 776/466, loss: 0.03642946481704712 2023-01-24 04:19:03.183794: step: 778/466, loss: 0.04434438794851303 2023-01-24 04:19:03.811513: step: 780/466, loss: 0.0012827562168240547 2023-01-24 04:19:04.476457: step: 782/466, loss: 0.009099474176764488 2023-01-24 04:19:05.100553: step: 784/466, loss: 0.0005583069869317114 2023-01-24 04:19:05.674619: step: 786/466, loss: 0.0026499121449887753 2023-01-24 04:19:06.305760: step: 788/466, loss: 0.00255676475353539 2023-01-24 04:19:06.874804: step: 790/466, loss: 0.03963962942361832 2023-01-24 04:19:07.470303: step: 792/466, loss: 0.015734344720840454 2023-01-24 04:19:08.130423: step: 794/466, loss: 0.009534102864563465 2023-01-24 04:19:08.731108: step: 796/466, loss: 0.09624442458152771 2023-01-24 04:19:09.382376: step: 798/466, loss: 0.003762713400647044 2023-01-24 04:19:10.006183: step: 800/466, loss: 0.0015033320523798466 2023-01-24 04:19:10.674284: step: 802/466, loss: 0.04189210757613182 2023-01-24 04:19:11.448351: step: 804/466, loss: 0.10371461510658264 2023-01-24 04:19:12.122694: step: 806/466, loss: 0.48933953046798706 2023-01-24 04:19:12.755935: step: 808/466, loss: 0.004488999489694834 2023-01-24 04:19:13.410766: step: 810/466, loss: 0.0008561794529668987 2023-01-24 04:19:14.089619: step: 812/466, loss: 0.007319875992834568 2023-01-24 04:19:14.652422: step: 814/466, loss: 0.19539205729961395 2023-01-24 04:19:15.249403: step: 816/466, loss: 0.009182372130453587 2023-01-24 04:19:15.901277: step: 818/466, loss: 0.045163724571466446 2023-01-24 04:19:16.582460: step: 820/466, loss: 0.009169608354568481 2023-01-24 
04:19:17.254849: step: 822/466, loss: 0.0011140016140416265 2023-01-24 04:19:17.854767: step: 824/466, loss: 0.0025377203710377216 2023-01-24 04:19:18.439731: step: 826/466, loss: 0.028506889939308167 2023-01-24 04:19:19.084393: step: 828/466, loss: 0.03266207501292229 2023-01-24 04:19:19.744624: step: 830/466, loss: 0.022404903545975685 2023-01-24 04:19:20.368419: step: 832/466, loss: 0.02750786393880844 2023-01-24 04:19:20.991853: step: 834/466, loss: 0.011269154027104378 2023-01-24 04:19:21.608484: step: 836/466, loss: 0.022157708182930946 2023-01-24 04:19:22.212764: step: 838/466, loss: 0.0012200011406093836 2023-01-24 04:19:22.846730: step: 840/466, loss: 0.023260338231921196 2023-01-24 04:19:23.586532: step: 842/466, loss: 0.013728282414376736 2023-01-24 04:19:24.216919: step: 844/466, loss: 0.027568018063902855 2023-01-24 04:19:24.807401: step: 846/466, loss: 0.02342831902205944 2023-01-24 04:19:25.421643: step: 848/466, loss: 0.031135616824030876 2023-01-24 04:19:26.093810: step: 850/466, loss: 0.16939984261989594 2023-01-24 04:19:26.717872: step: 852/466, loss: 0.03348062187433243 2023-01-24 04:19:27.346625: step: 854/466, loss: 0.0009071927634067833 2023-01-24 04:19:27.937952: step: 856/466, loss: 0.009717311710119247 2023-01-24 04:19:28.551204: step: 858/466, loss: 0.01987823285162449 2023-01-24 04:19:29.168334: step: 860/466, loss: 0.021753111854195595 2023-01-24 04:19:29.737035: step: 862/466, loss: 0.025307273492217064 2023-01-24 04:19:30.339664: step: 864/466, loss: 0.002120056189596653 2023-01-24 04:19:30.994832: step: 866/466, loss: 0.13705721497535706 2023-01-24 04:19:31.530465: step: 868/466, loss: 1.5340710878372192 2023-01-24 04:19:32.188111: step: 870/466, loss: 0.005955138243734837 2023-01-24 04:19:32.866187: step: 872/466, loss: 0.03116076998412609 2023-01-24 04:19:33.451705: step: 874/466, loss: 0.042830005288124084 2023-01-24 04:19:34.108504: step: 876/466, loss: 0.07013010233640671 2023-01-24 04:19:34.703037: step: 878/466, loss: 
0.005209819413721561 2023-01-24 04:19:35.351138: step: 880/466, loss: 0.025534560903906822 2023-01-24 04:19:35.913972: step: 882/466, loss: 0.00187689031008631 2023-01-24 04:19:36.552075: step: 884/466, loss: 0.042018599808216095 2023-01-24 04:19:37.189823: step: 886/466, loss: 0.039473798125982285 2023-01-24 04:19:37.776747: step: 888/466, loss: 0.17161712050437927 2023-01-24 04:19:38.405732: step: 890/466, loss: 0.0070687225088477135 2023-01-24 04:19:38.980323: step: 892/466, loss: 0.007750903721898794 2023-01-24 04:19:39.640514: step: 894/466, loss: 0.003680052701383829 2023-01-24 04:19:40.183315: step: 896/466, loss: 0.04857083782553673 2023-01-24 04:19:40.739315: step: 898/466, loss: 0.0007663737633265555 2023-01-24 04:19:41.283413: step: 900/466, loss: 0.04972157999873161 2023-01-24 04:19:41.921096: step: 902/466, loss: 0.022673940286040306 2023-01-24 04:19:42.587849: step: 904/466, loss: 0.02935558557510376 2023-01-24 04:19:43.195001: step: 906/466, loss: 0.02141900546848774 2023-01-24 04:19:43.823569: step: 908/466, loss: 0.026830747723579407 2023-01-24 04:19:44.510301: step: 910/466, loss: 0.00036158942384645343 2023-01-24 04:19:45.190812: step: 912/466, loss: 0.047231242060661316 2023-01-24 04:19:45.825097: step: 914/466, loss: 0.07010781764984131 2023-01-24 04:19:46.435699: step: 916/466, loss: 0.0036849970929324627 2023-01-24 04:19:47.069606: step: 918/466, loss: 0.0026618526317179203 2023-01-24 04:19:47.717975: step: 920/466, loss: 0.027415724471211433 2023-01-24 04:19:48.293706: step: 922/466, loss: 0.012835756875574589 2023-01-24 04:19:48.898327: step: 924/466, loss: 0.025382524356245995 2023-01-24 04:19:49.530882: step: 926/466, loss: 0.005560525692999363 2023-01-24 04:19:50.118324: step: 928/466, loss: 0.008374533616006374 2023-01-24 04:19:50.678432: step: 930/466, loss: 0.008973834104835987 2023-01-24 04:19:51.283993: step: 932/466, loss: 0.01936868205666542 ================================================== Loss: 0.060 -------------------- Dev 
Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.36336552250904103, 'r': 0.33509609855672473, 'f1': 0.34865872446079754}, 'combined': 0.25690642855006135, 'epoch': 31} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36725972232992726, 'r': 0.3028782684154409, 'f1': 0.33197636992435453}, 'combined': 0.22017085673739573, 'epoch': 31} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35263694638694637, 'r': 0.2865175189393939, 'f1': 0.3161572622779519}, 'combined': 0.21077150818530124, 'epoch': 31} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3790470611556661, 'r': 0.28633287074334673, 'f1': 0.3262305578040888}, 'combined': 0.2129083640405632, 'epoch': 31} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3451014318032982, 'r': 0.3333142861250072, 'f1': 0.33910546098046096}, 'combined': 0.2498671817750765, 'epoch': 31} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35907171376180635, 'r': 0.29772121279816727, 'f1': 0.32553111271340623}, 'combined': 0.21589628200681862, 'epoch': 31} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3625, 'r': 0.3107142857142857, 'f1': 0.3346153846153846}, 'combined': 0.22307692307692306, 'epoch': 31} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26666666666666666, 'r': 0.17391304347826086, 'f1': 0.2105263157894737}, 'combined': 0.14035087719298245, 'epoch': 31} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.375, 'r': 0.12931034482758622, 'f1': 0.19230769230769235}, 'combined': 0.12820512820512822, 'epoch': 31} 
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 
0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 32 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 04:22:23.077316: step: 2/466, loss: 0.004893600009381771 2023-01-24 04:22:23.756693: step: 4/466, loss: 0.001807203982025385 2023-01-24 04:22:24.339970: step: 6/466, loss: 0.004189473111182451 2023-01-24 04:22:25.022657: step: 8/466, loss: 0.009971344843506813 2023-01-24 04:22:25.622760: step: 10/466, loss: 0.0013811824610456824 2023-01-24 04:22:26.204861: step: 12/466, loss: 0.026940230280160904 2023-01-24 04:22:26.933328: step: 14/466, loss: 0.00019078326295129955 2023-01-24 04:22:27.585608: step: 16/466, loss: 0.03184819594025612 2023-01-24 04:22:28.239177: step: 18/466, loss: 2.7896217943634838e-05 2023-01-24 04:22:28.842946: step: 20/466, loss: 0.0034164085518568754 2023-01-24 04:22:29.434548: step: 22/466, loss: 0.0020141061395406723 2023-01-24 04:22:30.044475: step: 24/466, loss: 0.014625578187406063 2023-01-24 04:22:30.595347: step: 26/466, loss: 0.2044469118118286 2023-01-24 04:22:31.195851: step: 28/466, loss: 0.0008368648122996092 2023-01-24 04:22:31.774595: step: 30/466, loss: 0.0176450964063406 2023-01-24 04:22:32.406994: step: 32/466, loss: 0.04258988797664642 2023-01-24 04:22:33.042796: step: 34/466, loss: 0.06267206370830536 2023-01-24 04:22:33.650343: step: 36/466, loss: 0.016352983191609383 2023-01-24 04:22:34.349183: step: 38/466, loss: 0.0013128308346495032 2023-01-24 04:22:34.929430: step: 40/466, loss: 0.002571600489318371 2023-01-24 04:22:35.562045: step: 42/466, loss: 0.012269257567822933 2023-01-24 04:22:36.146542: step: 44/466, loss: 0.001995067112147808 2023-01-24 04:22:36.780369: step: 46/466, loss: 0.046191588044166565 2023-01-24 04:22:37.424810: step: 
48/466, loss: 0.009372851811349392 2023-01-24 04:22:38.067095: step: 50/466, loss: 0.017919622361660004 2023-01-24 04:22:38.658107: step: 52/466, loss: 0.007651403080672026 2023-01-24 04:22:39.241053: step: 54/466, loss: 0.010477669537067413 2023-01-24 04:22:39.855733: step: 56/466, loss: 0.01425440888851881 2023-01-24 04:22:40.505958: step: 58/466, loss: 0.00841886643320322 2023-01-24 04:22:41.163948: step: 60/466, loss: 0.012636151164770126 2023-01-24 04:22:41.727217: step: 62/466, loss: 0.02314668707549572 2023-01-24 04:22:42.332204: step: 64/466, loss: 0.045987606048583984 2023-01-24 04:22:43.002257: step: 66/466, loss: 0.26990488171577454 2023-01-24 04:22:43.623413: step: 68/466, loss: 0.005559887737035751 2023-01-24 04:22:44.361945: step: 70/466, loss: 0.0006924280896782875 2023-01-24 04:22:44.937376: step: 72/466, loss: 0.0032412128057330847 2023-01-24 04:22:45.567721: step: 74/466, loss: 0.0180651992559433 2023-01-24 04:22:46.149506: step: 76/466, loss: 0.0009903390891849995 2023-01-24 04:22:46.797299: step: 78/466, loss: 0.02404508925974369 2023-01-24 04:22:47.468556: step: 80/466, loss: 0.0448649562895298 2023-01-24 04:22:48.031321: step: 82/466, loss: 0.004130546469241381 2023-01-24 04:22:48.570968: step: 84/466, loss: 0.005723233800381422 2023-01-24 04:22:49.215333: step: 86/466, loss: 0.003985012881457806 2023-01-24 04:22:49.802850: step: 88/466, loss: 0.020042594522237778 2023-01-24 04:22:50.467007: step: 90/466, loss: 0.0017711673863232136 2023-01-24 04:22:51.124452: step: 92/466, loss: 0.0028102535288780928 2023-01-24 04:22:51.733104: step: 94/466, loss: 0.03830303996801376 2023-01-24 04:22:52.354022: step: 96/466, loss: 0.0036949864588677883 2023-01-24 04:22:52.983510: step: 98/466, loss: 0.001504071638919413 2023-01-24 04:22:53.646623: step: 100/466, loss: 0.0048750219866633415 2023-01-24 04:22:54.353060: step: 102/466, loss: 0.03824969753623009 2023-01-24 04:22:54.893287: step: 104/466, loss: 0.004800750873982906 2023-01-24 04:22:55.453567: step: 
106/466, loss: 0.0020471615716814995 2023-01-24 04:22:56.009683: step: 108/466, loss: 0.000646074942778796 2023-01-24 04:22:56.624472: step: 110/466, loss: 0.0010992448078468442 2023-01-24 04:22:57.324123: step: 112/466, loss: 0.006446210667490959 2023-01-24 04:22:57.863261: step: 114/466, loss: 0.0009745300631038845 2023-01-24 04:22:58.500216: step: 116/466, loss: 0.013410649262368679 2023-01-24 04:22:59.119073: step: 118/466, loss: 0.0012341059045866132 2023-01-24 04:22:59.744990: step: 120/466, loss: 0.001941015711054206 2023-01-24 04:23:00.358786: step: 122/466, loss: 0.20783285796642303 2023-01-24 04:23:01.057664: step: 124/466, loss: 0.025142325088381767 2023-01-24 04:23:01.618880: step: 126/466, loss: 0.020406607538461685 2023-01-24 04:23:02.292589: step: 128/466, loss: 0.003357550362125039 2023-01-24 04:23:02.879915: step: 130/466, loss: 0.003221118589863181 2023-01-24 04:23:03.481103: step: 132/466, loss: 0.4911470115184784 2023-01-24 04:23:04.101505: step: 134/466, loss: 0.011972006410360336 2023-01-24 04:23:04.722855: step: 136/466, loss: 0.008492819964885712 2023-01-24 04:23:05.335695: step: 138/466, loss: 0.006594249978661537 2023-01-24 04:23:05.948846: step: 140/466, loss: 0.007354431785643101 2023-01-24 04:23:06.515238: step: 142/466, loss: 0.016762271523475647 2023-01-24 04:23:07.111162: step: 144/466, loss: 0.007904180325567722 2023-01-24 04:23:07.699616: step: 146/466, loss: 0.003073585219681263 2023-01-24 04:23:08.355617: step: 148/466, loss: 0.014073972590267658 2023-01-24 04:23:08.969648: step: 150/466, loss: 0.04962771013379097 2023-01-24 04:23:09.579148: step: 152/466, loss: 0.050608739256858826 2023-01-24 04:23:10.192046: step: 154/466, loss: 0.006356202531605959 2023-01-24 04:23:10.759744: step: 156/466, loss: 0.014832945540547371 2023-01-24 04:23:11.386855: step: 158/466, loss: 0.03682565316557884 2023-01-24 04:23:11.988683: step: 160/466, loss: 1.9299652194604278e-05 2023-01-24 04:23:12.647164: step: 162/466, loss: 0.003554818918928504 
2023-01-24 04:23:13.296662: step: 164/466, loss: 0.4816342294216156 2023-01-24 04:23:13.930947: step: 166/466, loss: 0.019198158755898476 2023-01-24 04:23:14.531341: step: 168/466, loss: 0.004678398370742798 2023-01-24 04:23:15.172664: step: 170/466, loss: 0.0006657781777903438 2023-01-24 04:23:15.822158: step: 172/466, loss: 0.04507363960146904 2023-01-24 04:23:16.462094: step: 174/466, loss: 0.06059860438108444 2023-01-24 04:23:17.070326: step: 176/466, loss: 0.023954475298523903 2023-01-24 04:23:17.683215: step: 178/466, loss: 0.006275218445807695 2023-01-24 04:23:18.232537: step: 180/466, loss: 0.004957010503858328 2023-01-24 04:23:18.879930: step: 182/466, loss: 0.008655861020088196 2023-01-24 04:23:19.561969: step: 184/466, loss: 0.07140230387449265 2023-01-24 04:23:20.238534: step: 186/466, loss: 0.00874532200396061 2023-01-24 04:23:20.872487: step: 188/466, loss: 0.017771543934941292 2023-01-24 04:23:21.496641: step: 190/466, loss: 0.05333786457777023 2023-01-24 04:23:22.144367: step: 192/466, loss: 0.0453433096408844 2023-01-24 04:23:22.790494: step: 194/466, loss: 0.03081073798239231 2023-01-24 04:23:23.491078: step: 196/466, loss: 0.018321197479963303 2023-01-24 04:23:24.080041: step: 198/466, loss: 0.005789279937744141 2023-01-24 04:23:24.667564: step: 200/466, loss: 0.00897742249071598 2023-01-24 04:23:25.329285: step: 202/466, loss: 0.050437066704034805 2023-01-24 04:23:25.931969: step: 204/466, loss: 0.0827391967177391 2023-01-24 04:23:26.574593: step: 206/466, loss: 4.914351666229777e-05 2023-01-24 04:23:27.214361: step: 208/466, loss: 0.04163723438978195 2023-01-24 04:23:27.839263: step: 210/466, loss: 0.04904909431934357 2023-01-24 04:23:28.425302: step: 212/466, loss: 0.0018292181193828583 2023-01-24 04:23:29.108283: step: 214/466, loss: 0.016725745052099228 2023-01-24 04:23:29.715420: step: 216/466, loss: 0.08441532403230667 2023-01-24 04:23:30.323493: step: 218/466, loss: 0.014585371129214764 2023-01-24 04:23:30.916671: step: 220/466, loss: 
0.03478140011429787 2023-01-24 04:23:31.574673: step: 222/466, loss: 0.02361155115067959 2023-01-24 04:23:32.149110: step: 224/466, loss: 0.0034576638136059046 2023-01-24 04:23:32.807291: step: 226/466, loss: 0.0021823784336447716 2023-01-24 04:23:33.507726: step: 228/466, loss: 0.002427509054541588 2023-01-24 04:23:34.134673: step: 230/466, loss: 0.012929845601320267 2023-01-24 04:23:34.748740: step: 232/466, loss: 0.05927010253071785 2023-01-24 04:23:35.354512: step: 234/466, loss: 0.005471110809594393 2023-01-24 04:23:36.032824: step: 236/466, loss: 0.002051880117505789 2023-01-24 04:23:36.747096: step: 238/466, loss: 0.02066088654100895 2023-01-24 04:23:37.430990: step: 240/466, loss: 0.012557800859212875 2023-01-24 04:23:37.986652: step: 242/466, loss: 0.0020497667137533426 2023-01-24 04:23:38.573486: step: 244/466, loss: 0.0003299048694316298 2023-01-24 04:23:39.254543: step: 246/466, loss: 0.0010864927899092436 2023-01-24 04:23:39.977170: step: 248/466, loss: 0.007769247982650995 2023-01-24 04:23:40.641323: step: 250/466, loss: 0.02015012316405773 2023-01-24 04:23:41.266025: step: 252/466, loss: 0.01395625900477171 2023-01-24 04:23:41.856432: step: 254/466, loss: 5.754221638198942e-05 2023-01-24 04:23:42.510975: step: 256/466, loss: 0.008632590994238853 2023-01-24 04:23:43.144257: step: 258/466, loss: 0.009349017404019833 2023-01-24 04:23:43.736761: step: 260/466, loss: 0.6971507668495178 2023-01-24 04:23:44.400728: step: 262/466, loss: 0.05355757847428322 2023-01-24 04:23:45.021818: step: 264/466, loss: 0.002188299084082246 2023-01-24 04:23:45.702465: step: 266/466, loss: 0.10608571767807007 2023-01-24 04:23:46.371579: step: 268/466, loss: 0.062313344329595566 2023-01-24 04:23:46.981776: step: 270/466, loss: 0.007442169357091188 2023-01-24 04:23:47.624386: step: 272/466, loss: 0.0062944344244897366 2023-01-24 04:23:48.206823: step: 274/466, loss: 0.04566106200218201 2023-01-24 04:23:48.828729: step: 276/466, loss: 0.02992089092731476 2023-01-24 
04:23:49.407976: step: 278/466, loss: 0.0064385076984763145 2023-01-24 04:23:49.999826: step: 280/466, loss: 0.0003736176004167646 2023-01-24 04:23:50.612450: step: 282/466, loss: 0.0007755811675451696 2023-01-24 04:23:51.233661: step: 284/466, loss: 0.011125924997031689 2023-01-24 04:23:51.807657: step: 286/466, loss: 0.03555982559919357 2023-01-24 04:23:52.441238: step: 288/466, loss: 0.004205263219773769 2023-01-24 04:23:53.023946: step: 290/466, loss: 0.008824083022773266 2023-01-24 04:23:53.654476: step: 292/466, loss: 0.04853668063879013 2023-01-24 04:23:54.220696: step: 294/466, loss: 0.006314811296761036 2023-01-24 04:23:54.859728: step: 296/466, loss: 0.14895592629909515 2023-01-24 04:23:55.484120: step: 298/466, loss: 0.09949561953544617 2023-01-24 04:23:56.124832: step: 300/466, loss: 0.061377447098493576 2023-01-24 04:23:56.740319: step: 302/466, loss: 0.016520462930202484 2023-01-24 04:23:57.402881: step: 304/466, loss: 0.012629845179617405 2023-01-24 04:23:58.053015: step: 306/466, loss: 0.037604060024023056 2023-01-24 04:23:58.670750: step: 308/466, loss: 0.00816622469574213 2023-01-24 04:23:59.307412: step: 310/466, loss: 0.015745321288704872 2023-01-24 04:23:59.922052: step: 312/466, loss: 0.00892604049295187 2023-01-24 04:24:00.587936: step: 314/466, loss: 0.002678923076018691 2023-01-24 04:24:01.181069: step: 316/466, loss: 0.018849464133381844 2023-01-24 04:24:01.816240: step: 318/466, loss: 0.011159980669617653 2023-01-24 04:24:02.418634: step: 320/466, loss: 0.03757862374186516 2023-01-24 04:24:03.076272: step: 322/466, loss: 0.0057309032417833805 2023-01-24 04:24:03.653640: step: 324/466, loss: 0.03270480036735535 2023-01-24 04:24:04.289799: step: 326/466, loss: 0.011551638133823872 2023-01-24 04:24:04.958542: step: 328/466, loss: 0.012251759879291058 2023-01-24 04:24:05.608117: step: 330/466, loss: 0.05296879634261131 2023-01-24 04:24:06.231976: step: 332/466, loss: 0.0004901235224679112 2023-01-24 04:24:06.818146: step: 334/466, loss: 
0.014748001471161842 2023-01-24 04:24:07.451908: step: 336/466, loss: 0.004974644631147385 2023-01-24 04:24:08.077623: step: 338/466, loss: 0.376626580953598 2023-01-24 04:24:08.763172: step: 340/466, loss: 0.04346088692545891 2023-01-24 04:24:09.395341: step: 342/466, loss: 1.6282410797430202e-05 2023-01-24 04:24:09.982476: step: 344/466, loss: 0.022755125537514687 2023-01-24 04:24:10.624874: step: 346/466, loss: 0.031009746715426445 2023-01-24 04:24:11.263107: step: 348/466, loss: 0.0008659077575430274 2023-01-24 04:24:11.905896: step: 350/466, loss: 0.02779753878712654 2023-01-24 04:24:12.476838: step: 352/466, loss: 0.0033039574045687914 2023-01-24 04:24:13.116319: step: 354/466, loss: 0.01538103073835373 2023-01-24 04:24:13.758625: step: 356/466, loss: 0.013847199268639088 2023-01-24 04:24:14.343834: step: 358/466, loss: 0.007189096882939339 2023-01-24 04:24:14.963497: step: 360/466, loss: 0.0009842516155913472 2023-01-24 04:24:15.581061: step: 362/466, loss: 0.011768832802772522 2023-01-24 04:24:16.228744: step: 364/466, loss: 0.09084676206111908 2023-01-24 04:24:16.892517: step: 366/466, loss: 0.016181915998458862 2023-01-24 04:24:17.464499: step: 368/466, loss: 0.006875579711049795 2023-01-24 04:24:18.099777: step: 370/466, loss: 0.010697558522224426 2023-01-24 04:24:18.701419: step: 372/466, loss: 0.01979762688279152 2023-01-24 04:24:19.353470: step: 374/466, loss: 0.04282763600349426 2023-01-24 04:24:19.957018: step: 376/466, loss: 0.07851805537939072 2023-01-24 04:24:20.631879: step: 378/466, loss: 0.0012981746112927794 2023-01-24 04:24:21.274614: step: 380/466, loss: 0.08073896914720535 2023-01-24 04:24:21.816120: step: 382/466, loss: 0.004018519539386034 2023-01-24 04:24:22.422294: step: 384/466, loss: 0.23132452368736267 2023-01-24 04:24:23.057541: step: 386/466, loss: 0.013552346266806126 2023-01-24 04:24:23.630829: step: 388/466, loss: 0.007649657316505909 2023-01-24 04:24:24.294581: step: 390/466, loss: 0.035173993557691574 2023-01-24 
04:24:24.947831: step: 392/466, loss: 0.004693764727562666 2023-01-24 04:24:25.563160: step: 394/466, loss: 0.012217522598803043 2023-01-24 04:24:26.168080: step: 396/466, loss: 0.01885879971086979 2023-01-24 04:24:26.747842: step: 398/466, loss: 0.007570372894406319 2023-01-24 04:24:27.404739: step: 400/466, loss: 0.03087976947426796 2023-01-24 04:24:27.990520: step: 402/466, loss: 0.015562249347567558 2023-01-24 04:24:28.667234: step: 404/466, loss: 0.05325576290488243 2023-01-24 04:24:29.323026: step: 406/466, loss: 0.013357589021325111 2023-01-24 04:24:29.979250: step: 408/466, loss: 0.004710727371275425 2023-01-24 04:24:30.608598: step: 410/466, loss: 0.0020447219721972942 2023-01-24 04:24:31.162852: step: 412/466, loss: 0.00359279103577137 2023-01-24 04:24:31.769227: step: 414/466, loss: 0.101906418800354 2023-01-24 04:24:32.409658: step: 416/466, loss: 0.026152092963457108 2023-01-24 04:24:33.001715: step: 418/466, loss: 0.00011928620369872078 2023-01-24 04:24:33.609923: step: 420/466, loss: 0.006412764545530081 2023-01-24 04:24:34.164329: step: 422/466, loss: 0.00213825237005949 2023-01-24 04:24:34.875712: step: 424/466, loss: 0.04086513817310333 2023-01-24 04:24:35.555000: step: 426/466, loss: 2.8988325595855713 2023-01-24 04:24:36.151637: step: 428/466, loss: 0.0008585536852478981 2023-01-24 04:24:36.855624: step: 430/466, loss: 0.07981168478727341 2023-01-24 04:24:37.449292: step: 432/466, loss: 0.012860963121056557 2023-01-24 04:24:38.151514: step: 434/466, loss: 0.004198842216283083 2023-01-24 04:24:38.753771: step: 436/466, loss: 0.005549263209104538 2023-01-24 04:24:39.311922: step: 438/466, loss: 0.0016691813943907619 2023-01-24 04:24:39.938316: step: 440/466, loss: 0.015377065166831017 2023-01-24 04:24:40.575553: step: 442/466, loss: 0.004243817180395126 2023-01-24 04:24:41.249080: step: 444/466, loss: 0.7251068353652954 2023-01-24 04:24:41.856395: step: 446/466, loss: 0.041107308119535446 2023-01-24 04:24:42.522821: step: 448/466, loss: 
0.030324382707476616 2023-01-24 04:24:43.148187: step: 450/466, loss: 0.012087621726095676 2023-01-24 04:24:43.740116: step: 452/466, loss: 0.024647416546940804 2023-01-24 04:24:44.388044: step: 454/466, loss: 0.001115889404900372 2023-01-24 04:24:45.058161: step: 456/466, loss: 0.017052508890628815 2023-01-24 04:24:45.704433: step: 458/466, loss: 0.02020523138344288 2023-01-24 04:24:46.377831: step: 460/466, loss: 0.0014908250886946917 2023-01-24 04:24:47.013260: step: 462/466, loss: 0.023595234379172325 2023-01-24 04:24:47.711757: step: 464/466, loss: 0.3415941894054413 2023-01-24 04:24:48.314093: step: 466/466, loss: 0.00020736547594424337 2023-01-24 04:24:49.024407: step: 468/466, loss: 0.0034636743366718292 2023-01-24 04:24:49.644121: step: 470/466, loss: 0.0024635677691549063 2023-01-24 04:24:50.199768: step: 472/466, loss: 0.006891455966979265 2023-01-24 04:24:50.805647: step: 474/466, loss: 0.006211124360561371 2023-01-24 04:24:51.357470: step: 476/466, loss: 0.004383048042654991 2023-01-24 04:24:51.953006: step: 478/466, loss: 0.004917972721159458 2023-01-24 04:24:52.617430: step: 480/466, loss: 0.001569868065416813 2023-01-24 04:24:53.268934: step: 482/466, loss: 0.0012542768381536007 2023-01-24 04:24:53.919512: step: 484/466, loss: 0.03853151574730873 2023-01-24 04:24:54.519531: step: 486/466, loss: 0.0009523625485599041 2023-01-24 04:24:55.155092: step: 488/466, loss: 0.0006722973193973303 2023-01-24 04:24:55.854346: step: 490/466, loss: 0.008869620971381664 2023-01-24 04:24:56.473753: step: 492/466, loss: 0.017616745084524155 2023-01-24 04:24:57.192541: step: 494/466, loss: 0.004762888886034489 2023-01-24 04:24:57.795431: step: 496/466, loss: 0.2257559895515442 2023-01-24 04:24:58.387928: step: 498/466, loss: 0.01182840671390295 2023-01-24 04:24:59.005042: step: 500/466, loss: 0.009502295404672623 2023-01-24 04:24:59.609600: step: 502/466, loss: 0.01936890184879303 2023-01-24 04:25:00.287834: step: 504/466, loss: 0.020262565463781357 2023-01-24 
04:25:00.984421: step: 506/466, loss: 0.013965466059744358 2023-01-24 04:25:01.626533: step: 508/466, loss: 0.005476214457303286 2023-01-24 04:25:02.247783: step: 510/466, loss: 0.004246894735842943 2023-01-24 04:25:02.829192: step: 512/466, loss: 0.023494044318795204 2023-01-24 04:25:03.466039: step: 514/466, loss: 0.00397317111492157 2023-01-24 04:25:04.075458: step: 516/466, loss: 0.01444228459149599 2023-01-24 04:25:04.685445: step: 518/466, loss: 0.001565203652717173 2023-01-24 04:25:05.277330: step: 520/466, loss: 0.0030631034169346094 2023-01-24 04:25:05.889897: step: 522/466, loss: 0.0035917942877858877 2023-01-24 04:25:06.473817: step: 524/466, loss: 0.011667449027299881 2023-01-24 04:25:07.083678: step: 526/466, loss: 0.005897277966141701 2023-01-24 04:25:07.732459: step: 528/466, loss: 0.007870707660913467 2023-01-24 04:25:08.366119: step: 530/466, loss: 0.05467689782381058 2023-01-24 04:25:09.002977: step: 532/466, loss: 0.009704221971333027 2023-01-24 04:25:09.634700: step: 534/466, loss: 0.011946790851652622 2023-01-24 04:25:10.249508: step: 536/466, loss: 0.03798848018050194 2023-01-24 04:25:10.853945: step: 538/466, loss: 0.010317761451005936 2023-01-24 04:25:11.464058: step: 540/466, loss: 1.3332763046491891e-05 2023-01-24 04:25:12.040464: step: 542/466, loss: 0.019306158646941185 2023-01-24 04:25:12.637954: step: 544/466, loss: 0.003761201398447156 2023-01-24 04:25:13.260968: step: 546/466, loss: 0.3841245472431183 2023-01-24 04:25:13.826642: step: 548/466, loss: 0.02378680743277073 2023-01-24 04:25:14.398064: step: 550/466, loss: 0.014144516550004482 2023-01-24 04:25:15.045835: step: 552/466, loss: 0.0009709474397823215 2023-01-24 04:25:15.648461: step: 554/466, loss: 0.012108071707189083 2023-01-24 04:25:16.309299: step: 556/466, loss: 0.0035140831023454666 2023-01-24 04:25:16.956478: step: 558/466, loss: 0.012012338265776634 2023-01-24 04:25:17.535520: step: 560/466, loss: 3.552379846572876 2023-01-24 04:25:18.151079: step: 562/466, loss: 
0.014286902733147144 2023-01-24 04:25:18.809552: step: 564/466, loss: 0.03940424695611 2023-01-24 04:25:19.423943: step: 566/466, loss: 0.030426137149333954 2023-01-24 04:25:20.063150: step: 568/466, loss: 0.012256816029548645 2023-01-24 04:25:20.682211: step: 570/466, loss: 0.05975402146577835 2023-01-24 04:25:21.265831: step: 572/466, loss: 0.08712486922740936 2023-01-24 04:25:21.930792: step: 574/466, loss: 0.02631019987165928 2023-01-24 04:25:22.525408: step: 576/466, loss: 0.0014418819919228554 2023-01-24 04:25:23.178049: step: 578/466, loss: 0.04173552244901657 2023-01-24 04:25:23.801867: step: 580/466, loss: 0.012788847088813782 2023-01-24 04:25:24.421410: step: 582/466, loss: 0.03496984392404556 2023-01-24 04:25:25.010168: step: 584/466, loss: 0.00038491602754220366 2023-01-24 04:25:25.687546: step: 586/466, loss: 0.012075595557689667 2023-01-24 04:25:26.347441: step: 588/466, loss: 0.02586449310183525 2023-01-24 04:25:27.054449: step: 590/466, loss: 0.050056036561727524 2023-01-24 04:25:27.685950: step: 592/466, loss: 0.07017027586698532 2023-01-24 04:25:28.288881: step: 594/466, loss: 0.02162768505513668 2023-01-24 04:25:28.937734: step: 596/466, loss: 0.001000135438516736 2023-01-24 04:25:29.563729: step: 598/466, loss: 0.05198930576443672 2023-01-24 04:25:30.160659: step: 600/466, loss: 0.0026975013315677643 2023-01-24 04:25:30.813044: step: 602/466, loss: 0.9263908863067627 2023-01-24 04:25:31.451845: step: 604/466, loss: 0.004877468105405569 2023-01-24 04:25:32.129777: step: 606/466, loss: 0.023529309779405594 2023-01-24 04:25:32.747268: step: 608/466, loss: 0.022583313286304474 2023-01-24 04:25:33.348483: step: 610/466, loss: 0.02809302881360054 2023-01-24 04:25:33.929468: step: 612/466, loss: 0.0007098540663719177 2023-01-24 04:25:34.559883: step: 614/466, loss: 0.012303457595407963 2023-01-24 04:25:35.157114: step: 616/466, loss: 0.005014235619455576 2023-01-24 04:25:35.726919: step: 618/466, loss: 0.006878813728690147 2023-01-24 04:25:36.341658: 
step: 620/466, loss: 0.03495979681611061 2023-01-24 04:25:36.984576: step: 622/466, loss: 0.010468830354511738 2023-01-24 04:25:37.702049: step: 624/466, loss: 0.12259616702795029 2023-01-24 04:25:38.354819: step: 626/466, loss: 0.0879226103425026 2023-01-24 04:25:38.991530: step: 628/466, loss: 0.02980031818151474 2023-01-24 04:25:39.620998: step: 630/466, loss: 0.00035697617568075657 2023-01-24 04:25:40.232840: step: 632/466, loss: 0.007286733016371727 2023-01-24 04:25:40.831199: step: 634/466, loss: 0.00031953808502294123 2023-01-24 04:25:41.539507: step: 636/466, loss: 5.5171603889903054e-05 2023-01-24 04:25:42.165227: step: 638/466, loss: 0.01621876284480095 2023-01-24 04:25:42.782851: step: 640/466, loss: 0.0016461930936202407 2023-01-24 04:25:43.373214: step: 642/466, loss: 0.20600132644176483 2023-01-24 04:25:43.995558: step: 644/466, loss: 0.0797785222530365 2023-01-24 04:25:44.654670: step: 646/466, loss: 2.0617271729861386e-05 2023-01-24 04:25:45.300920: step: 648/466, loss: 0.026157179847359657 2023-01-24 04:25:45.942200: step: 650/466, loss: 0.01519196480512619 2023-01-24 04:25:46.557634: step: 652/466, loss: 0.12516631186008453 2023-01-24 04:25:47.254313: step: 654/466, loss: 0.01664874143898487 2023-01-24 04:25:47.865365: step: 656/466, loss: 0.009894109331071377 2023-01-24 04:25:48.538391: step: 658/466, loss: 0.03975345939397812 2023-01-24 04:25:49.163893: step: 660/466, loss: 0.0023406550753861666 2023-01-24 04:25:49.795855: step: 662/466, loss: 0.014231977052986622 2023-01-24 04:25:50.392234: step: 664/466, loss: 3.681071029859595e-05 2023-01-24 04:25:50.971306: step: 666/466, loss: 0.0002408704167464748 2023-01-24 04:25:51.644791: step: 668/466, loss: 0.008114716969430447 2023-01-24 04:25:52.317732: step: 670/466, loss: 0.02963598445057869 2023-01-24 04:25:52.904338: step: 672/466, loss: 0.16881270706653595 2023-01-24 04:25:53.527013: step: 674/466, loss: 0.011266290210187435 2023-01-24 04:25:54.150290: step: 676/466, loss: 0.02551671490073204 
2023-01-24 04:25:54.797027: step: 678/466, loss: 0.00021820540132466704 2023-01-24 04:25:55.362879: step: 680/466, loss: 0.01580444909632206 2023-01-24 04:25:56.016465: step: 682/466, loss: 0.0003003499296028167 2023-01-24 04:25:56.680137: step: 684/466, loss: 0.20750835537910461 2023-01-24 04:25:57.279858: step: 686/466, loss: 0.0017898082733154297 2023-01-24 04:25:57.935577: step: 688/466, loss: 0.02732234075665474 2023-01-24 04:25:58.567450: step: 690/466, loss: 0.0007180742104537785 2023-01-24 04:25:59.163907: step: 692/466, loss: 0.5810558795928955 2023-01-24 04:25:59.864324: step: 694/466, loss: 0.00484499940648675 2023-01-24 04:26:00.505370: step: 696/466, loss: 0.05192787945270538 2023-01-24 04:26:01.132345: step: 698/466, loss: 0.010271672159433365 2023-01-24 04:26:01.698207: step: 700/466, loss: 0.028488852083683014 2023-01-24 04:26:02.364139: step: 702/466, loss: 0.008909497410058975 2023-01-24 04:26:03.004085: step: 704/466, loss: 0.012639102526009083 2023-01-24 04:26:03.609196: step: 706/466, loss: 0.022058192640542984 2023-01-24 04:26:04.278576: step: 708/466, loss: 0.0082834642380476 2023-01-24 04:26:04.938918: step: 710/466, loss: 0.08404853194952011 2023-01-24 04:26:05.527283: step: 712/466, loss: 0.0002973505179397762 2023-01-24 04:26:06.119783: step: 714/466, loss: 0.01604515127837658 2023-01-24 04:26:06.721286: step: 716/466, loss: 0.0721011534333229 2023-01-24 04:26:07.359789: step: 718/466, loss: 0.021386979147791862 2023-01-24 04:26:07.934652: step: 720/466, loss: 0.0023640303406864405 2023-01-24 04:26:08.596344: step: 722/466, loss: 0.04245742782950401 2023-01-24 04:26:09.199509: step: 724/466, loss: 0.005683106370270252 2023-01-24 04:26:09.824162: step: 726/466, loss: 0.01418995764106512 2023-01-24 04:26:10.445545: step: 728/466, loss: 0.032957032322883606 2023-01-24 04:26:11.092023: step: 730/466, loss: 0.008482889272272587 2023-01-24 04:26:11.689383: step: 732/466, loss: 0.0015429266495630145 2023-01-24 04:26:12.311397: step: 734/466, 
loss: 0.0006709589506499469 2023-01-24 04:26:12.987033: step: 736/466, loss: 0.0624673031270504 2023-01-24 04:26:13.627896: step: 738/466, loss: 0.037133317440748215 2023-01-24 04:26:14.236602: step: 740/466, loss: 0.0023529180325567722 2023-01-24 04:26:14.860264: step: 742/466, loss: 0.0028179381042718887 2023-01-24 04:26:15.524098: step: 744/466, loss: 0.006925663445144892 2023-01-24 04:26:16.167429: step: 746/466, loss: 0.009411841630935669 2023-01-24 04:26:16.719146: step: 748/466, loss: 0.056623708456754684 2023-01-24 04:26:17.369009: step: 750/466, loss: 0.021879877895116806 2023-01-24 04:26:17.943403: step: 752/466, loss: 0.018209518864750862 2023-01-24 04:26:18.570805: step: 754/466, loss: 0.0008968443726189435 2023-01-24 04:26:19.260809: step: 756/466, loss: 0.001655777683481574 2023-01-24 04:26:19.905979: step: 758/466, loss: 0.006058151368051767 2023-01-24 04:26:20.543384: step: 760/466, loss: 0.004729055799543858 2023-01-24 04:26:21.111359: step: 762/466, loss: 5.468863673740998e-05 2023-01-24 04:26:21.711899: step: 764/466, loss: 0.01741127111017704 2023-01-24 04:26:22.392976: step: 766/466, loss: 0.01633036509156227 2023-01-24 04:26:22.995046: step: 768/466, loss: 0.002862333320081234 2023-01-24 04:26:23.576286: step: 770/466, loss: 0.03288822993636131 2023-01-24 04:26:24.182654: step: 772/466, loss: 0.04681827872991562 2023-01-24 04:26:24.869210: step: 774/466, loss: 0.03713144734501839 2023-01-24 04:26:25.525344: step: 776/466, loss: 0.01241319626569748 2023-01-24 04:26:26.169620: step: 778/466, loss: 0.008597268722951412 2023-01-24 04:26:26.823534: step: 780/466, loss: 0.017979905009269714 2023-01-24 04:26:27.363781: step: 782/466, loss: 0.0002149198844563216 2023-01-24 04:26:28.023611: step: 784/466, loss: 0.05203290283679962 2023-01-24 04:26:28.643493: step: 786/466, loss: 0.013367298059165478 2023-01-24 04:26:29.398968: step: 788/466, loss: 0.014973390847444534 2023-01-24 04:26:30.075560: step: 790/466, loss: 0.03173128515481949 2023-01-24 
04:26:30.723925: step: 792/466, loss: 0.08076297491788864 2023-01-24 04:26:31.291852: step: 794/466, loss: 0.001049180282279849 2023-01-24 04:26:31.889907: step: 796/466, loss: 0.0002666466752998531 2023-01-24 04:26:32.451638: step: 798/466, loss: 0.01713642291724682 2023-01-24 04:26:33.107057: step: 800/466, loss: 0.0040335580706596375 2023-01-24 04:26:33.750223: step: 802/466, loss: 0.0013026159722357988 2023-01-24 04:26:34.355629: step: 804/466, loss: 0.0008358605555258691 2023-01-24 04:26:34.945452: step: 806/466, loss: 0.007629586383700371 2023-01-24 04:26:35.576444: step: 808/466, loss: 0.0026855634059756994 2023-01-24 04:26:36.161606: step: 810/466, loss: 0.0005732735153287649 2023-01-24 04:26:36.785003: step: 812/466, loss: 0.018595751374959946 2023-01-24 04:26:37.451951: step: 814/466, loss: 0.0013379593146964908 2023-01-24 04:26:38.109072: step: 816/466, loss: 0.11120335757732391 2023-01-24 04:26:38.703076: step: 818/466, loss: 0.04682184010744095 2023-01-24 04:26:39.322682: step: 820/466, loss: 0.04700683429837227 2023-01-24 04:26:39.942069: step: 822/466, loss: 0.0036845568101853132 2023-01-24 04:26:40.578496: step: 824/466, loss: 0.010761707089841366 2023-01-24 04:26:41.225386: step: 826/466, loss: 0.03720209747552872 2023-01-24 04:26:41.865673: step: 828/466, loss: 0.006173505913466215 2023-01-24 04:26:42.433212: step: 830/466, loss: 0.2876453399658203 2023-01-24 04:26:43.144976: step: 832/466, loss: 0.0029347799718379974 2023-01-24 04:26:43.747379: step: 834/466, loss: 0.00524116912856698 2023-01-24 04:26:44.379242: step: 836/466, loss: 0.03424202650785446 2023-01-24 04:26:44.971135: step: 838/466, loss: 0.03777669370174408 2023-01-24 04:26:45.631583: step: 840/466, loss: 0.06569282710552216 2023-01-24 04:26:46.294939: step: 842/466, loss: 0.02280786633491516 2023-01-24 04:26:46.904158: step: 844/466, loss: 0.031920842826366425 2023-01-24 04:26:47.612432: step: 846/466, loss: 0.021082449704408646 2023-01-24 04:26:48.366047: step: 848/466, loss: 
0.012781715020537376 2023-01-24 04:26:48.955657: step: 850/466, loss: 0.03208983317017555 2023-01-24 04:26:49.655832: step: 852/466, loss: 0.0037364468444138765 2023-01-24 04:26:50.274074: step: 854/466, loss: 0.06761616468429565 2023-01-24 04:26:50.985558: step: 856/466, loss: 0.0003608867700677365 2023-01-24 04:26:51.646316: step: 858/466, loss: 0.004010418429970741 2023-01-24 04:26:52.237848: step: 860/466, loss: 0.005447422154247761 2023-01-24 04:26:52.855857: step: 862/466, loss: 0.012670483440160751 2023-01-24 04:26:53.512518: step: 864/466, loss: 0.002907310612499714 2023-01-24 04:26:54.116909: step: 866/466, loss: 0.010033238679170609 2023-01-24 04:26:54.661155: step: 868/466, loss: 5.622063667942712e-07 2023-01-24 04:26:55.274151: step: 870/466, loss: 0.9328770637512207 2023-01-24 04:26:55.855422: step: 872/466, loss: 0.12100088596343994 2023-01-24 04:26:56.479344: step: 874/466, loss: 0.08131342381238937 2023-01-24 04:26:57.141636: step: 876/466, loss: 0.03969806432723999 2023-01-24 04:26:57.817313: step: 878/466, loss: 0.044402651488780975 2023-01-24 04:26:58.423418: step: 880/466, loss: 0.021927524358034134 2023-01-24 04:26:59.038690: step: 882/466, loss: 0.03587482497096062 2023-01-24 04:26:59.676622: step: 884/466, loss: 0.03218105062842369 2023-01-24 04:27:00.330612: step: 886/466, loss: 0.009517614729702473 2023-01-24 04:27:00.946962: step: 888/466, loss: 0.06283598393201828 2023-01-24 04:27:01.602118: step: 890/466, loss: 0.057447660714387894 2023-01-24 04:27:02.292435: step: 892/466, loss: 0.01942952163517475 2023-01-24 04:27:02.936535: step: 894/466, loss: 0.01550187449902296 2023-01-24 04:27:03.520906: step: 896/466, loss: 0.004953037016093731 2023-01-24 04:27:04.125027: step: 898/466, loss: 0.0468306839466095 2023-01-24 04:27:04.718839: step: 900/466, loss: 0.0006143661448732018 2023-01-24 04:27:05.331277: step: 902/466, loss: 0.0217197947204113 2023-01-24 04:27:06.001466: step: 904/466, loss: 0.014523173682391644 2023-01-24 04:27:06.679439: 
step: 906/466, loss: 0.00903339497745037 2023-01-24 04:27:07.236099: step: 908/466, loss: 0.0004052472941111773 2023-01-24 04:27:07.869606: step: 910/466, loss: 0.008048434741795063 2023-01-24 04:27:08.462819: step: 912/466, loss: 0.06616722792387009 2023-01-24 04:27:09.201341: step: 914/466, loss: 7.085939432727173e-05 2023-01-24 04:27:09.840277: step: 916/466, loss: 0.6689620018005371 2023-01-24 04:27:10.527914: step: 918/466, loss: 0.019276319071650505 2023-01-24 04:27:11.129366: step: 920/466, loss: 0.002870056079700589 2023-01-24 04:27:11.779199: step: 922/466, loss: 0.14844343066215515 2023-01-24 04:27:12.430561: step: 924/466, loss: 0.07893482595682144 2023-01-24 04:27:13.095585: step: 926/466, loss: 0.025511648505926132 2023-01-24 04:27:13.731328: step: 928/466, loss: 0.0012719589285552502 2023-01-24 04:27:14.377740: step: 930/466, loss: 0.03931349515914917 2023-01-24 04:27:15.034159: step: 932/466, loss: 0.007727402728050947
==================================================
Loss: 0.052
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3434870810436421, 'r': 0.3304515181956861, 'f1': 0.33684323034647307}, 'combined': 0.24820027499213804, 'epoch': 32}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36756595281385557, 'r': 0.2948816186928156, 'f1': 0.3272362910036732}, 'combined': 0.2170271774532133, 'epoch': 32}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33440328054298646, 'r': 0.2799360795454546, 'f1': 0.3047551546391753}, 'combined': 0.20317010309278352, 'epoch': 32}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3808899803446344, 'r': 0.2746104537666688, 'f1': 0.3191344044780824}, 'combined': 0.208277190290959, 'epoch': 32}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3292864357864358, 'r': 0.32803677189350816, 'f1': 0.3286604159465376}, 'combined': 0.2421708328027119, 'epoch': 32}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35490022755098094, 'r': 0.28441267151527005, 'f1': 0.3157706405942286}, 'combined': 0.20942301552363346, 'epoch': 32}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3063063063063063, 'r': 0.32380952380952377, 'f1': 0.3148148148148148}, 'combined': 0.20987654320987653, 'epoch': 32}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.30434782608695654, 'f1': 0.358974358974359}, 'combined': 0.2393162393162393, 'epoch': 32}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.13793103448275862, 'f1': 0.20512820512820515}, 'combined': 0.13675213675213677, 'epoch': 32}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 33
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:29:46.896981: step: 2/466, loss: 0.013426671735942364 2023-01-24 04:29:47.502615: step: 4/466, loss: 0.04405581206083298 2023-01-24 04:29:48.084255: step: 6/466, loss: 0.036784835159778595 2023-01-24 04:29:48.708440: step: 8/466, loss: 0.002065311186015606 2023-01-24 04:29:49.316388: step: 10/466, loss: 0.001026434125378728 2023-01-24 04:29:49.942692: step: 12/466, loss: 0.000631114817224443 2023-01-24 04:29:50.531126: step: 14/466, loss: 0.0009648153209127486 2023-01-24 04:29:51.072500: step: 16/466, loss: 0.004327333532273769 2023-01-24 04:29:51.665796: step: 18/466, loss: 0.024526774883270264 2023-01-24
04:29:52.340351: step: 20/466, loss: 0.0209369957447052 2023-01-24 04:29:52.993927: step: 22/466, loss: 0.009978730231523514 2023-01-24 04:29:53.651721: step: 24/466, loss: 0.02210579626262188 2023-01-24 04:29:54.313958: step: 26/466, loss: 0.008876738138496876 2023-01-24 04:29:54.922007: step: 28/466, loss: 0.1963396519422531 2023-01-24 04:29:55.553615: step: 30/466, loss: 0.055339012295007706 2023-01-24 04:29:56.140438: step: 32/466, loss: 0.0009447642951272428 2023-01-24 04:29:56.776118: step: 34/466, loss: 0.0005788745475001633 2023-01-24 04:29:57.355838: step: 36/466, loss: 0.00714969402179122 2023-01-24 04:29:57.971822: step: 38/466, loss: 0.029303908348083496 2023-01-24 04:29:58.586352: step: 40/466, loss: 0.006237163674086332 2023-01-24 04:29:59.252836: step: 42/466, loss: 0.000784186355303973 2023-01-24 04:29:59.866727: step: 44/466, loss: 0.005701386835426092 2023-01-24 04:30:00.550758: step: 46/466, loss: 0.15250776708126068 2023-01-24 04:30:01.225063: step: 48/466, loss: 0.0013273664517328143 2023-01-24 04:30:01.870427: step: 50/466, loss: 0.021707775071263313 2023-01-24 04:30:02.616023: step: 52/466, loss: 0.0027274428866803646 2023-01-24 04:30:03.196262: step: 54/466, loss: 0.020889142528176308 2023-01-24 04:30:03.868141: step: 56/466, loss: 0.019990235567092896 2023-01-24 04:30:04.510700: step: 58/466, loss: 0.00216134125366807 2023-01-24 04:30:05.166969: step: 60/466, loss: 0.011055981740355492 2023-01-24 04:30:05.823293: step: 62/466, loss: 0.001772857503965497 2023-01-24 04:30:06.486630: step: 64/466, loss: 0.007328768260776997 2023-01-24 04:30:07.179801: step: 66/466, loss: 0.0024733725003898144 2023-01-24 04:30:07.734290: step: 68/466, loss: 0.022164082154631615 2023-01-24 04:30:08.361989: step: 70/466, loss: 0.16365617513656616 2023-01-24 04:30:08.985772: step: 72/466, loss: 0.04055704176425934 2023-01-24 04:30:09.635458: step: 74/466, loss: 0.5103712677955627 2023-01-24 04:30:10.239956: step: 76/466, loss: 0.01298505999147892 2023-01-24 
04:30:10.882191: step: 78/466, loss: 0.018091892823576927 2023-01-24 04:30:11.559212: step: 80/466, loss: 0.002492034574970603 2023-01-24 04:30:12.154802: step: 82/466, loss: 0.030289528891444206 2023-01-24 04:30:12.838022: step: 84/466, loss: 0.0030104555189609528 2023-01-24 04:30:13.473269: step: 86/466, loss: 0.014978118240833282 2023-01-24 04:30:14.074132: step: 88/466, loss: 0.004207504913210869 2023-01-24 04:30:14.694457: step: 90/466, loss: 0.011144906282424927 2023-01-24 04:30:15.300222: step: 92/466, loss: 0.016953200101852417 2023-01-24 04:30:15.868390: step: 94/466, loss: 0.060297973453998566 2023-01-24 04:30:16.484297: step: 96/466, loss: 0.03718177229166031 2023-01-24 04:30:17.141513: step: 98/466, loss: 0.006500702351331711 2023-01-24 04:30:17.773205: step: 100/466, loss: 0.06412974745035172 2023-01-24 04:30:18.370818: step: 102/466, loss: 0.005766607355326414 2023-01-24 04:30:18.963647: step: 104/466, loss: 0.016049914062023163 2023-01-24 04:30:19.575985: step: 106/466, loss: 0.01654273457825184 2023-01-24 04:30:20.160579: step: 108/466, loss: 0.038459960371255875 2023-01-24 04:30:20.709169: step: 110/466, loss: 0.009186459705233574 2023-01-24 04:30:21.285608: step: 112/466, loss: 0.00904849823564291 2023-01-24 04:30:21.859481: step: 114/466, loss: 0.0003890866064466536 2023-01-24 04:30:22.471607: step: 116/466, loss: 0.01985808089375496 2023-01-24 04:30:23.079573: step: 118/466, loss: 0.014591362327337265 2023-01-24 04:30:23.710508: step: 120/466, loss: 0.00038054390461184084 2023-01-24 04:30:24.284530: step: 122/466, loss: 0.0006946328212507069 2023-01-24 04:30:24.920804: step: 124/466, loss: 0.016799703240394592 2023-01-24 04:30:25.571910: step: 126/466, loss: 0.0006378726684488356 2023-01-24 04:30:26.238107: step: 128/466, loss: 0.005696744192391634 2023-01-24 04:30:26.849437: step: 130/466, loss: 0.010593943297863007 2023-01-24 04:30:27.463999: step: 132/466, loss: 0.02038961462676525 2023-01-24 04:30:28.096070: step: 134/466, loss: 
0.004535729065537453 2023-01-24 04:30:28.733897: step: 136/466, loss: 0.003972996957600117 2023-01-24 04:30:29.390491: step: 138/466, loss: 0.07120930403470993 2023-01-24 04:30:30.141616: step: 140/466, loss: 0.024490805342793465 2023-01-24 04:30:30.773604: step: 142/466, loss: 0.005146074574440718 2023-01-24 04:30:31.425902: step: 144/466, loss: 0.00045003503328189254 2023-01-24 04:30:32.034031: step: 146/466, loss: 0.037370651960372925 2023-01-24 04:30:32.635887: step: 148/466, loss: 0.0008118917467072606 2023-01-24 04:30:33.284912: step: 150/466, loss: 0.002586959395557642 2023-01-24 04:30:33.958920: step: 152/466, loss: 0.04539839178323746 2023-01-24 04:30:34.565999: step: 154/466, loss: 0.0006347663584165275 2023-01-24 04:30:35.161403: step: 156/466, loss: 0.03967674821615219 2023-01-24 04:30:35.812342: step: 158/466, loss: 0.005713362712413073 2023-01-24 04:30:36.483623: step: 160/466, loss: 0.07420878112316132 2023-01-24 04:30:37.142586: step: 162/466, loss: 0.002022551139816642 2023-01-24 04:30:37.826960: step: 164/466, loss: 0.018466947600245476 2023-01-24 04:30:38.484603: step: 166/466, loss: 0.008294089697301388 2023-01-24 04:30:39.155375: step: 168/466, loss: 0.09669829159975052 2023-01-24 04:30:39.794468: step: 170/466, loss: 0.008423964492976665 2023-01-24 04:30:40.376455: step: 172/466, loss: 0.0061074914410710335 2023-01-24 04:30:40.995943: step: 174/466, loss: 0.03758373484015465 2023-01-24 04:30:41.608562: step: 176/466, loss: 0.08949467539787292 2023-01-24 04:30:42.205300: step: 178/466, loss: 0.06499304622411728 2023-01-24 04:30:42.847906: step: 180/466, loss: 0.03493135794997215 2023-01-24 04:30:43.481422: step: 182/466, loss: 0.0016586108831688762 2023-01-24 04:30:44.102793: step: 184/466, loss: 0.003063349286094308 2023-01-24 04:30:44.791168: step: 186/466, loss: 0.005292469635605812 2023-01-24 04:30:45.391556: step: 188/466, loss: 0.004043189808726311 2023-01-24 04:30:46.130959: step: 190/466, loss: 0.06299145519733429 2023-01-24 
04:30:46.772719: step: 192/466, loss: 0.02014276757836342 2023-01-24 04:30:47.397202: step: 194/466, loss: 0.008015135303139687 2023-01-24 04:30:48.049993: step: 196/466, loss: 0.012555583380162716 2023-01-24 04:30:48.700721: step: 198/466, loss: 0.006674039643257856 2023-01-24 04:30:49.372704: step: 200/466, loss: 0.004354489967226982 2023-01-24 04:30:49.999626: step: 202/466, loss: 0.0257272869348526 2023-01-24 04:30:50.601122: step: 204/466, loss: 0.06419980525970459 2023-01-24 04:30:51.285173: step: 206/466, loss: 0.05442401394248009 2023-01-24 04:30:51.923855: step: 208/466, loss: 0.08057260513305664 2023-01-24 04:30:52.485707: step: 210/466, loss: 3.373867988586426 2023-01-24 04:30:53.081467: step: 212/466, loss: 0.001134110032580793 2023-01-24 04:30:53.653258: step: 214/466, loss: 0.0033750978764146566 2023-01-24 04:30:54.274550: step: 216/466, loss: 0.024464482441544533 2023-01-24 04:30:54.906607: step: 218/466, loss: 0.006701041478663683 2023-01-24 04:30:55.520550: step: 220/466, loss: 0.004200145602226257 2023-01-24 04:30:56.124573: step: 222/466, loss: 0.01735462062060833 2023-01-24 04:30:56.786991: step: 224/466, loss: 0.004717234987765551 2023-01-24 04:30:57.491716: step: 226/466, loss: 0.010514490306377411 2023-01-24 04:30:58.076415: step: 228/466, loss: 0.08764377236366272 2023-01-24 04:30:58.749102: step: 230/466, loss: 0.0010582390241324902 2023-01-24 04:30:59.400819: step: 232/466, loss: 0.006454814225435257 2023-01-24 04:31:00.035183: step: 234/466, loss: 0.00016704069275874645 2023-01-24 04:31:00.666478: step: 236/466, loss: 0.0017406251281499863 2023-01-24 04:31:01.320622: step: 238/466, loss: 0.04129519686102867 2023-01-24 04:31:01.870686: step: 240/466, loss: 0.02657444216310978 2023-01-24 04:31:02.515337: step: 242/466, loss: 0.00041153430356644094 2023-01-24 04:31:03.163693: step: 244/466, loss: 0.02432902529835701 2023-01-24 04:31:03.779992: step: 246/466, loss: 0.00379131268709898 2023-01-24 04:31:04.350716: step: 248/466, loss: 
0.013274705968797207 2023-01-24 04:31:04.976849: step: 250/466, loss: 0.0012937224237248302 2023-01-24 04:31:05.569885: step: 252/466, loss: 0.0030212379060685635 2023-01-24 04:31:06.179456: step: 254/466, loss: 0.000778071815147996 2023-01-24 04:31:06.818453: step: 256/466, loss: 0.023154953494668007 2023-01-24 04:31:07.447928: step: 258/466, loss: 0.004224831238389015 2023-01-24 04:31:08.048065: step: 260/466, loss: 0.0016274972585961223 2023-01-24 04:31:08.782152: step: 262/466, loss: 0.02594602108001709 2023-01-24 04:31:09.417369: step: 264/466, loss: 0.013618562370538712 2023-01-24 04:31:10.009211: step: 266/466, loss: 0.003424879163503647 2023-01-24 04:31:10.668737: step: 268/466, loss: 0.01895812712609768 2023-01-24 04:31:11.269262: step: 270/466, loss: 0.00664788531139493 2023-01-24 04:31:11.892338: step: 272/466, loss: 0.08949951827526093 2023-01-24 04:31:12.545394: step: 274/466, loss: 0.04765995219349861 2023-01-24 04:31:13.170369: step: 276/466, loss: 0.0042534093372523785 2023-01-24 04:31:13.848933: step: 278/466, loss: 0.043888408690690994 2023-01-24 04:31:14.458226: step: 280/466, loss: 0.0014408992137759924 2023-01-24 04:31:15.020004: step: 282/466, loss: 0.001083766925148666 2023-01-24 04:31:15.685029: step: 284/466, loss: 3.8808677196502686 2023-01-24 04:31:16.286939: step: 286/466, loss: 0.0072978041134774685 2023-01-24 04:31:16.914901: step: 288/466, loss: 0.023800894618034363 2023-01-24 04:31:17.554992: step: 290/466, loss: 0.0016516759060323238 2023-01-24 04:31:18.199775: step: 292/466, loss: 0.015784960240125656 2023-01-24 04:31:18.938620: step: 294/466, loss: 0.007960588671267033 2023-01-24 04:31:19.533347: step: 296/466, loss: 0.009296424686908722 2023-01-24 04:31:20.154795: step: 298/466, loss: 0.04585607349872589 2023-01-24 04:31:20.835701: step: 300/466, loss: 0.5705857276916504 2023-01-24 04:31:21.381765: step: 302/466, loss: 0.03182634338736534 2023-01-24 04:31:21.986426: step: 304/466, loss: 0.00026920222444459796 2023-01-24 
04:31:22.586214: step: 306/466, loss: 0.019237732514739037 2023-01-24 04:31:23.144926: step: 308/466, loss: 0.0022195458877831697 2023-01-24 04:31:23.753834: step: 310/466, loss: 0.00019793420506175607 2023-01-24 04:31:24.320688: step: 312/466, loss: 0.13449236750602722 2023-01-24 04:31:24.892866: step: 314/466, loss: 0.09944283962249756 2023-01-24 04:31:25.474217: step: 316/466, loss: 0.009415938518941402 2023-01-24 04:31:26.052966: step: 318/466, loss: 0.017806226387619972 2023-01-24 04:31:26.709198: step: 320/466, loss: 0.022105487063527107 2023-01-24 04:31:27.266416: step: 322/466, loss: 0.004306672140955925 2023-01-24 04:31:27.872005: step: 324/466, loss: 0.047689154744148254 2023-01-24 04:31:28.497955: step: 326/466, loss: 0.13233791291713715 2023-01-24 04:31:29.095367: step: 328/466, loss: 0.03858339786529541 2023-01-24 04:31:29.732970: step: 330/466, loss: 0.004022891633212566 2023-01-24 04:31:30.318142: step: 332/466, loss: 0.028247395530343056 2023-01-24 04:31:30.944489: step: 334/466, loss: 0.01629316434264183 2023-01-24 04:31:31.526262: step: 336/466, loss: 0.020957961678504944 2023-01-24 04:31:32.184826: step: 338/466, loss: 0.02816147543489933 2023-01-24 04:31:32.784829: step: 340/466, loss: 0.10556326806545258 2023-01-24 04:31:33.396342: step: 342/466, loss: 0.0038239702116698027 2023-01-24 04:31:34.187529: step: 344/466, loss: 0.012613574974238873 2023-01-24 04:31:34.897012: step: 346/466, loss: 0.27932924032211304 2023-01-24 04:31:35.501654: step: 348/466, loss: 0.015454914420843124 2023-01-24 04:31:36.061141: step: 350/466, loss: 0.0010309700155630708 2023-01-24 04:31:36.701593: step: 352/466, loss: 0.04021390900015831 2023-01-24 04:31:37.383959: step: 354/466, loss: 0.05277429148554802 2023-01-24 04:31:37.985322: step: 356/466, loss: 0.0029488904401659966 2023-01-24 04:31:38.572288: step: 358/466, loss: 0.008632347919046879 2023-01-24 04:31:39.239001: step: 360/466, loss: 0.0010874831350520253 2023-01-24 04:31:39.847734: step: 362/466, loss: 
0.07463423907756805 2023-01-24 04:31:40.425017: step: 364/466, loss: 0.008078490383923054 2023-01-24 04:31:41.015232: step: 366/466, loss: 3.8495003536809236e-05 2023-01-24 04:31:41.682041: step: 368/466, loss: 0.00020329591643530875 2023-01-24 04:31:42.318821: step: 370/466, loss: 0.0014669331721961498 2023-01-24 04:31:42.933914: step: 372/466, loss: 0.019579097628593445 2023-01-24 04:31:43.525967: step: 374/466, loss: 0.01790340431034565 2023-01-24 04:31:44.118208: step: 376/466, loss: 0.0022823030594736338 2023-01-24 04:31:44.762989: step: 378/466, loss: 0.10092777758836746 2023-01-24 04:31:45.310416: step: 380/466, loss: 0.0015150151448324323 2023-01-24 04:31:45.918469: step: 382/466, loss: 0.0008342181099578738 2023-01-24 04:31:46.535179: step: 384/466, loss: 0.020368829369544983 2023-01-24 04:31:47.136724: step: 386/466, loss: 0.009976758621633053 2023-01-24 04:31:47.680475: step: 388/466, loss: 0.005474635865539312 2023-01-24 04:31:48.275932: step: 390/466, loss: 0.00407253485172987 2023-01-24 04:31:48.875902: step: 392/466, loss: 0.005135375075042248 2023-01-24 04:31:49.512684: step: 394/466, loss: 0.000495641550514847 2023-01-24 04:31:50.087876: step: 396/466, loss: 0.019225360825657845 2023-01-24 04:31:50.864658: step: 398/466, loss: 0.010420488193631172 2023-01-24 04:31:51.473415: step: 400/466, loss: 0.7432917952537537 2023-01-24 04:31:52.121802: step: 402/466, loss: 0.08139077574014664 2023-01-24 04:31:52.740646: step: 404/466, loss: 0.025644859299063683 2023-01-24 04:31:53.367302: step: 406/466, loss: 0.01703513041138649 2023-01-24 04:31:53.989873: step: 408/466, loss: 0.0009812447242438793 2023-01-24 04:31:54.698643: step: 410/466, loss: 0.0013426410732790828 2023-01-24 04:31:55.329773: step: 412/466, loss: 0.02057889848947525 2023-01-24 04:31:55.909925: step: 414/466, loss: 0.03548679128289223 2023-01-24 04:31:56.533259: step: 416/466, loss: 0.01182898785918951 2023-01-24 04:31:57.146745: step: 418/466, loss: 0.013406840153038502 2023-01-24 
04:31:57.767399: step: 420/466, loss: 0.007967946119606495 2023-01-24 04:31:58.414388: step: 422/466, loss: 0.017121534794569016 2023-01-24 04:31:58.998344: step: 424/466, loss: 0.005006145685911179 2023-01-24 04:31:59.648600: step: 426/466, loss: 0.008276369422674179 2023-01-24 04:32:00.269521: step: 428/466, loss: 0.00829931627959013 2023-01-24 04:32:00.888401: step: 430/466, loss: 0.012971614487469196 2023-01-24 04:32:01.506445: step: 432/466, loss: 0.0019958114717155695 2023-01-24 04:32:02.144933: step: 434/466, loss: 0.0078291529789567 2023-01-24 04:32:02.814870: step: 436/466, loss: 0.003694443963468075 2023-01-24 04:32:03.440813: step: 438/466, loss: 0.0021362758707255125 2023-01-24 04:32:04.098685: step: 440/466, loss: 0.006038742605596781 2023-01-24 04:32:04.807861: step: 442/466, loss: 0.0014593410305678844 2023-01-24 04:32:05.436084: step: 444/466, loss: 0.016899054870009422 2023-01-24 04:32:06.096887: step: 446/466, loss: 0.023046938702464104 2023-01-24 04:32:06.751549: step: 448/466, loss: 0.00014703207125421613 2023-01-24 04:32:07.367616: step: 450/466, loss: 0.13239708542823792 2023-01-24 04:32:08.036335: step: 452/466, loss: 0.014334054663777351 2023-01-24 04:32:08.687045: step: 454/466, loss: 0.001111467950977385 2023-01-24 04:32:09.303264: step: 456/466, loss: 0.022834081202745438 2023-01-24 04:32:09.939778: step: 458/466, loss: 0.2740824818611145 2023-01-24 04:32:10.580711: step: 460/466, loss: 0.0005852937465533614 2023-01-24 04:32:11.242146: step: 462/466, loss: 0.0012171048438176513 2023-01-24 04:32:11.821932: step: 464/466, loss: 0.012282396666705608 2023-01-24 04:32:12.439409: step: 466/466, loss: 0.008685320615768433 2023-01-24 04:32:13.079953: step: 468/466, loss: 0.0013729456113651395 2023-01-24 04:32:13.697441: step: 470/466, loss: 0.006827492732554674 2023-01-24 04:32:14.298257: step: 472/466, loss: 0.08489412069320679 2023-01-24 04:32:14.928320: step: 474/466, loss: 0.049523983150720596 2023-01-24 04:32:15.540822: step: 476/466, loss: 
0.009977877140045166 2023-01-24 04:32:16.146778: step: 478/466, loss: 0.0059689804911613464 2023-01-24 04:32:16.772988: step: 480/466, loss: 0.11533409357070923 2023-01-24 04:32:17.458852: step: 482/466, loss: 0.013530323281884193 2023-01-24 04:32:18.076063: step: 484/466, loss: 0.27607443928718567 2023-01-24 04:32:18.642892: step: 486/466, loss: 0.0014335147570818663 2023-01-24 04:32:19.280675: step: 488/466, loss: 0.038051776587963104 2023-01-24 04:32:19.840585: step: 490/466, loss: 0.019769888371229172 2023-01-24 04:32:20.456194: step: 492/466, loss: 0.054358504712581635 2023-01-24 04:32:21.114025: step: 494/466, loss: 0.0036494287196546793 2023-01-24 04:32:21.688564: step: 496/466, loss: 0.745313823223114 2023-01-24 04:32:22.323148: step: 498/466, loss: 0.034797411412000656 2023-01-24 04:32:22.880217: step: 500/466, loss: 0.025983678176999092 2023-01-24 04:32:23.455734: step: 502/466, loss: 0.0233176089823246 2023-01-24 04:32:24.097510: step: 504/466, loss: 0.018735304474830627 2023-01-24 04:32:24.720331: step: 506/466, loss: 0.029395515099167824 2023-01-24 04:32:25.283998: step: 508/466, loss: 0.0024682055227458477 2023-01-24 04:32:25.965944: step: 510/466, loss: 0.005346355494111776 2023-01-24 04:32:26.588672: step: 512/466, loss: 0.00011466677096905187 2023-01-24 04:32:27.183165: step: 514/466, loss: 0.02455970272421837 2023-01-24 04:32:27.811044: step: 516/466, loss: 0.4798331558704376 2023-01-24 04:32:28.478844: step: 518/466, loss: 0.001988023519515991 2023-01-24 04:32:29.154138: step: 520/466, loss: 0.055366151034832 2023-01-24 04:32:29.747818: step: 522/466, loss: 0.06646957248449326 2023-01-24 04:32:30.323704: step: 524/466, loss: 0.018799712881445885 2023-01-24 04:32:30.911355: step: 526/466, loss: 0.0010934981983155012 2023-01-24 04:32:31.578164: step: 528/466, loss: 0.005480987951159477 2023-01-24 04:32:32.261303: step: 530/466, loss: 0.015281077474355698 2023-01-24 04:32:32.883241: step: 532/466, loss: 0.01949336566030979 2023-01-24 
04:32:33.427539: step: 534/466, loss: 0.0007982194656506181 2023-01-24 04:32:34.050697: step: 536/466, loss: 0.010515277273952961 2023-01-24 04:32:34.682256: step: 538/466, loss: 0.024209585040807724 2023-01-24 04:32:35.340088: step: 540/466, loss: 0.007699800655245781 2023-01-24 04:32:35.992140: step: 542/466, loss: 0.0018632489955052733 2023-01-24 04:32:36.610587: step: 544/466, loss: 0.0011185059556737542 2023-01-24 04:32:37.208905: step: 546/466, loss: 0.0012074961559846997 2023-01-24 04:32:37.868996: step: 548/466, loss: 0.003445641603320837 2023-01-24 04:32:38.448657: step: 550/466, loss: 0.01648532971739769 2023-01-24 04:32:38.988977: step: 552/466, loss: 0.03117789328098297 2023-01-24 04:32:39.646334: step: 554/466, loss: 0.029531968757510185 2023-01-24 04:32:40.258665: step: 556/466, loss: 0.007115309592336416 2023-01-24 04:32:40.931618: step: 558/466, loss: 0.028260016813874245 2023-01-24 04:32:41.568434: step: 560/466, loss: 0.024944953620433807 2023-01-24 04:32:42.318063: step: 562/466, loss: 0.0020174370147287846 2023-01-24 04:32:42.973686: step: 564/466, loss: 0.01885637827217579 2023-01-24 04:32:43.597622: step: 566/466, loss: 0.0009125018259510398 2023-01-24 04:32:44.284995: step: 568/466, loss: 0.016139905899763107 2023-01-24 04:32:44.935802: step: 570/466, loss: 0.06416390091180801 2023-01-24 04:32:45.620172: step: 572/466, loss: 0.15958631038665771 2023-01-24 04:32:46.255706: step: 574/466, loss: 0.016670292243361473 2023-01-24 04:32:46.869823: step: 576/466, loss: 0.0015069821383804083 2023-01-24 04:32:47.473150: step: 578/466, loss: 0.001389741781167686 2023-01-24 04:32:48.205209: step: 580/466, loss: 0.043244875967502594 2023-01-24 04:32:48.846504: step: 582/466, loss: 0.013401959091424942 2023-01-24 04:32:49.448969: step: 584/466, loss: 0.002430893713608384 2023-01-24 04:32:50.102864: step: 586/466, loss: 0.00083020085003227 2023-01-24 04:32:50.747162: step: 588/466, loss: 0.007434462197124958 2023-01-24 04:32:51.360362: step: 590/466, loss: 
0.06438363343477249 2023-01-24 04:32:51.913612: step: 592/466, loss: 0.0016319775022566319 2023-01-24 04:32:52.548440: step: 594/466, loss: 0.00047406292287632823 2023-01-24 04:32:53.199490: step: 596/466, loss: 0.022736098617315292 2023-01-24 04:32:53.859731: step: 598/466, loss: 0.3369652032852173 2023-01-24 04:32:54.462839: step: 600/466, loss: 0.00040172869921661913 2023-01-24 04:32:55.062001: step: 602/466, loss: 0.005799479782581329 2023-01-24 04:32:55.693208: step: 604/466, loss: 0.07075946778059006 2023-01-24 04:32:56.279866: step: 606/466, loss: 0.0024011246860027313 2023-01-24 04:32:56.863233: step: 608/466, loss: 0.5161957144737244 2023-01-24 04:32:57.520275: step: 610/466, loss: 0.002114727860316634 2023-01-24 04:32:58.205066: step: 612/466, loss: 0.0044073727913200855 2023-01-24 04:32:58.810733: step: 614/466, loss: 0.099338598549366 2023-01-24 04:32:59.410119: step: 616/466, loss: 0.015629105269908905 2023-01-24 04:33:00.078072: step: 618/466, loss: 0.3922637403011322 2023-01-24 04:33:00.682422: step: 620/466, loss: 8.24608578113839e-05 2023-01-24 04:33:01.352080: step: 622/466, loss: 0.022870955988764763 2023-01-24 04:33:01.984622: step: 624/466, loss: 0.00864611566066742 2023-01-24 04:33:02.700296: step: 626/466, loss: 0.00594196654856205 2023-01-24 04:33:03.342560: step: 628/466, loss: 0.012344695627689362 2023-01-24 04:33:03.923063: step: 630/466, loss: 0.0002703698119148612 2023-01-24 04:33:04.517728: step: 632/466, loss: 3.023464887519367e-05 2023-01-24 04:33:05.109644: step: 634/466, loss: 0.0016201818361878395 2023-01-24 04:33:05.724554: step: 636/466, loss: 0.03301899880170822 2023-01-24 04:33:06.402036: step: 638/466, loss: 0.04122118651866913 2023-01-24 04:33:06.993722: step: 640/466, loss: 7.819570600986481e-05 2023-01-24 04:33:07.571529: step: 642/466, loss: 0.001188569818623364 2023-01-24 04:33:08.172845: step: 644/466, loss: 0.0031083908397704363 2023-01-24 04:33:08.786223: step: 646/466, loss: 0.01917244680225849 2023-01-24 
04:33:09.335499: step: 648/466, loss: 0.142510324716568 2023-01-24 04:33:09.970413: step: 650/466, loss: 0.019693298265337944 2023-01-24 04:33:10.605144: step: 652/466, loss: 0.08926106989383698 2023-01-24 04:33:11.237940: step: 654/466, loss: 0.060860756784677505 2023-01-24 04:33:11.844584: step: 656/466, loss: 0.002574354875832796 2023-01-24 04:33:12.514463: step: 658/466, loss: 0.022529149428009987 2023-01-24 04:33:13.159274: step: 660/466, loss: 0.00436887051910162 2023-01-24 04:33:13.768182: step: 662/466, loss: 0.023581165820360184 2023-01-24 04:33:14.421341: step: 664/466, loss: 0.10075651109218597 2023-01-24 04:33:14.995499: step: 666/466, loss: 0.005409528501331806 2023-01-24 04:33:15.621550: step: 668/466, loss: 0.0017528823809698224 2023-01-24 04:33:16.284613: step: 670/466, loss: 0.08418670296669006 2023-01-24 04:33:16.942702: step: 672/466, loss: 0.001370617770589888 2023-01-24 04:33:17.593492: step: 674/466, loss: 0.0022740717977285385 2023-01-24 04:33:18.229426: step: 676/466, loss: 0.003733733668923378 2023-01-24 04:33:18.848598: step: 678/466, loss: 0.0049021136946976185 2023-01-24 04:33:19.434257: step: 680/466, loss: 0.05791711434721947 2023-01-24 04:33:20.037667: step: 682/466, loss: 1.726668357849121 2023-01-24 04:33:20.665226: step: 684/466, loss: 0.10720763355493546 2023-01-24 04:33:21.327114: step: 686/466, loss: 0.040691643953323364 2023-01-24 04:33:21.995491: step: 688/466, loss: 0.0018499793950468302 2023-01-24 04:33:22.618311: step: 690/466, loss: 0.012728684581816196 2023-01-24 04:33:23.242529: step: 692/466, loss: 0.0239124558866024 2023-01-24 04:33:23.855506: step: 694/466, loss: 0.005421977955847979 2023-01-24 04:33:24.424994: step: 696/466, loss: 0.03604663908481598 2023-01-24 04:33:25.058941: step: 698/466, loss: 0.021979086101055145 2023-01-24 04:33:25.749959: step: 700/466, loss: 0.007704432122409344 2023-01-24 04:33:26.392275: step: 702/466, loss: 0.058133162558078766 2023-01-24 04:33:27.061272: step: 704/466, loss: 
0.004512380808591843 2023-01-24 04:33:27.674865: step: 706/466, loss: 0.005167886149138212 2023-01-24 04:33:28.335187: step: 708/466, loss: 0.013947400264441967 2023-01-24 04:33:28.973934: step: 710/466, loss: 0.006945465691387653 2023-01-24 04:33:29.636818: step: 712/466, loss: 0.0446675680577755 2023-01-24 04:33:30.257688: step: 714/466, loss: 0.015970544889569283 2023-01-24 04:33:30.897411: step: 716/466, loss: 0.0590476393699646 2023-01-24 04:33:31.550545: step: 718/466, loss: 0.014604576863348484 2023-01-24 04:33:32.205428: step: 720/466, loss: 0.006326437462121248 2023-01-24 04:33:32.809569: step: 722/466, loss: 0.006244446150958538 2023-01-24 04:33:33.507002: step: 724/466, loss: 0.22038207948207855 2023-01-24 04:33:34.124786: step: 726/466, loss: 0.008666588924825191 2023-01-24 04:33:34.795112: step: 728/466, loss: 0.0798700824379921 2023-01-24 04:33:35.381772: step: 730/466, loss: 0.054656002670526505 2023-01-24 04:33:35.987967: step: 732/466, loss: 0.004361420404165983 2023-01-24 04:33:36.600390: step: 734/466, loss: 0.01673595793545246 2023-01-24 04:33:37.252976: step: 736/466, loss: 0.002686643274500966 2023-01-24 04:33:37.881275: step: 738/466, loss: 0.012211292050778866 2023-01-24 04:33:38.516983: step: 740/466, loss: 0.007673958782106638 2023-01-24 04:33:39.094755: step: 742/466, loss: 0.0045503550209105015 2023-01-24 04:33:39.708410: step: 744/466, loss: 0.002969563938677311 2023-01-24 04:33:40.328761: step: 746/466, loss: 0.08212627470493317 2023-01-24 04:33:40.927380: step: 748/466, loss: 0.000188513717148453 2023-01-24 04:33:41.502137: step: 750/466, loss: 0.053561802953481674 2023-01-24 04:33:42.145162: step: 752/466, loss: 0.028425009921193123 2023-01-24 04:33:42.777982: step: 754/466, loss: 0.08109253644943237 2023-01-24 04:33:43.549254: step: 756/466, loss: 0.004710098262876272 2023-01-24 04:33:44.228466: step: 758/466, loss: 0.013896183110773563 2023-01-24 04:33:44.802732: step: 760/466, loss: 0.018331099301576614 2023-01-24 04:33:45.344191: 
step: 762/466, loss: 0.017580410465598106 2023-01-24 04:33:45.935709: step: 764/466, loss: 0.030865274369716644 2023-01-24 04:33:46.587140: step: 766/466, loss: 0.05202435702085495 2023-01-24 04:33:47.184715: step: 768/466, loss: 0.0060366494581103325 2023-01-24 04:33:47.904705: step: 770/466, loss: 0.019957980141043663 2023-01-24 04:33:48.517363: step: 772/466, loss: 0.023260585963726044 2023-01-24 04:33:49.137475: step: 774/466, loss: 0.009710898622870445 2023-01-24 04:33:49.838537: step: 776/466, loss: 0.002994952257722616 2023-01-24 04:33:50.464264: step: 778/466, loss: 0.0006601830828003585 2023-01-24 04:33:51.120950: step: 780/466, loss: 0.0024687948171049356 2023-01-24 04:33:51.715556: step: 782/466, loss: 0.0004755732079502195 2023-01-24 04:33:52.310645: step: 784/466, loss: 0.06548666208982468 2023-01-24 04:33:52.923958: step: 786/466, loss: 0.014849133789539337 2023-01-24 04:33:53.541535: step: 788/466, loss: 0.012060455977916718 2023-01-24 04:33:54.209199: step: 790/466, loss: 0.016499169170856476 2023-01-24 04:33:54.775155: step: 792/466, loss: 0.01723051816225052 2023-01-24 04:33:55.459208: step: 794/466, loss: 0.0010327841155231 2023-01-24 04:33:56.060408: step: 796/466, loss: 0.0008903178968466818 2023-01-24 04:33:56.643631: step: 798/466, loss: 0.07221655547618866 2023-01-24 04:33:57.212564: step: 800/466, loss: 0.003996188286691904 2023-01-24 04:33:57.822666: step: 802/466, loss: 0.04473995044827461 2023-01-24 04:33:58.452023: step: 804/466, loss: 0.027255544438958168 2023-01-24 04:33:59.110394: step: 806/466, loss: 1.667365550994873 2023-01-24 04:33:59.814281: step: 808/466, loss: 0.0021742419339716434 2023-01-24 04:34:00.497620: step: 810/466, loss: 0.008183703757822514 2023-01-24 04:34:01.109502: step: 812/466, loss: 0.0016619794769212604 2023-01-24 04:34:01.734385: step: 814/466, loss: 0.011446056887507439 2023-01-24 04:34:02.353304: step: 816/466, loss: 0.001039049937389791 2023-01-24 04:34:02.953540: step: 818/466, loss: 0.0022509461268782616 
2023-01-24 04:34:03.569386: step: 820/466, loss: 0.010372995398938656 2023-01-24 04:34:04.170957: step: 822/466, loss: 3.292357723694295e-05 2023-01-24 04:34:04.821477: step: 824/466, loss: 0.002079638186842203 2023-01-24 04:34:05.482734: step: 826/466, loss: 0.015205027535557747 2023-01-24 04:34:06.096959: step: 828/466, loss: 0.022317998111248016 2023-01-24 04:34:06.778707: step: 830/466, loss: 0.022038595750927925 2023-01-24 04:34:07.405298: step: 832/466, loss: 0.015924014151096344 2023-01-24 04:34:08.006270: step: 834/466, loss: 0.001443645916879177 2023-01-24 04:34:08.650220: step: 836/466, loss: 0.04264770820736885 2023-01-24 04:34:09.315371: step: 838/466, loss: 0.002594459569081664 2023-01-24 04:34:09.929367: step: 840/466, loss: 0.014926884323358536 2023-01-24 04:34:10.582549: step: 842/466, loss: 7.93904036981985e-05 2023-01-24 04:34:11.179719: step: 844/466, loss: 0.00013029819820076227 2023-01-24 04:34:11.900365: step: 846/466, loss: 0.0015288868453353643 2023-01-24 04:34:12.550291: step: 848/466, loss: 0.001493810210376978 2023-01-24 04:34:13.174377: step: 850/466, loss: 0.01249898225069046 2023-01-24 04:34:13.770592: step: 852/466, loss: 0.025699099525809288 2023-01-24 04:34:14.431336: step: 854/466, loss: 0.01823694258928299 2023-01-24 04:34:15.060785: step: 856/466, loss: 0.06470238417387009 2023-01-24 04:34:15.668181: step: 858/466, loss: 0.0032758882734924555 2023-01-24 04:34:16.282476: step: 860/466, loss: 0.013614756055176258 2023-01-24 04:34:16.871116: step: 862/466, loss: 0.026815857738256454 2023-01-24 04:34:17.479157: step: 864/466, loss: 0.000988278421573341 2023-01-24 04:34:18.137252: step: 866/466, loss: 0.05569792911410332 2023-01-24 04:34:18.785730: step: 868/466, loss: 0.005735119339078665 2023-01-24 04:34:19.420800: step: 870/466, loss: 3.250784720876254e-05 2023-01-24 04:34:19.966992: step: 872/466, loss: 0.00030958899878896773 2023-01-24 04:34:20.552191: step: 874/466, loss: 0.02781083807349205 2023-01-24 04:34:21.138087: step: 
876/466, loss: 0.02429971843957901 2023-01-24 04:34:21.769626: step: 878/466, loss: 0.012174009345471859 2023-01-24 04:34:22.371578: step: 880/466, loss: 0.0009367524180561304 2023-01-24 04:34:22.946434: step: 882/466, loss: 0.01255623809993267 2023-01-24 04:34:23.552441: step: 884/466, loss: 0.0006068717339076102 2023-01-24 04:34:24.223776: step: 886/466, loss: 0.009702799841761589 2023-01-24 04:34:24.895096: step: 888/466, loss: 0.099673792719841 2023-01-24 04:34:25.538483: step: 890/466, loss: 0.04090374335646629 2023-01-24 04:34:26.162725: step: 892/466, loss: 0.0034436695277690887 2023-01-24 04:34:26.819140: step: 894/466, loss: 0.003322442527860403 2023-01-24 04:34:27.485461: step: 896/466, loss: 0.009773043915629387 2023-01-24 04:34:28.092801: step: 898/466, loss: 0.0044140624813735485 2023-01-24 04:34:28.707135: step: 900/466, loss: 0.1074460819363594 2023-01-24 04:34:29.338395: step: 902/466, loss: 0.027551736682653427 2023-01-24 04:34:29.969845: step: 904/466, loss: 0.0009217429906129837 2023-01-24 04:34:30.627737: step: 906/466, loss: 0.002145333681255579 2023-01-24 04:34:31.227508: step: 908/466, loss: 0.019214188680052757 2023-01-24 04:34:31.869548: step: 910/466, loss: 0.0341348797082901 2023-01-24 04:34:32.483113: step: 912/466, loss: 1.8653154373168945 2023-01-24 04:34:33.127540: step: 914/466, loss: 0.009925548918545246 2023-01-24 04:34:33.757178: step: 916/466, loss: 0.06136850267648697 2023-01-24 04:34:34.399639: step: 918/466, loss: 0.02847776934504509 2023-01-24 04:34:35.072647: step: 920/466, loss: 0.002534595550969243 2023-01-24 04:34:35.738370: step: 922/466, loss: 0.00827273353934288 2023-01-24 04:34:36.330793: step: 924/466, loss: 0.03495006263256073 2023-01-24 04:34:36.945959: step: 926/466, loss: 0.006841725669801235 2023-01-24 04:34:37.549974: step: 928/466, loss: 0.0022770666982978582 2023-01-24 04:34:38.236106: step: 930/466, loss: 0.03333687782287598 2023-01-24 04:34:38.809856: step: 932/466, loss: 0.02715749852359295 
==================================================
Loss: 0.059
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32857101993131405, 'r': 0.3229597501412157, 'f1': 0.3257412216735324}, 'combined': 0.2400198475489186, 'epoch': 33}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3595587881578532, 'r': 0.2950625410619471, 'f1': 0.3241334339700019}, 'combined': 0.2149693240837318, 'epoch': 33}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32338554987212276, 'r': 0.26765054032976826, 'f1': 0.29289012496190187}, 'combined': 0.19526008330793457, 'epoch': 33}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3795602647983616, 'r': 0.2800034740315782, 'f1': 0.32226814935709946}, 'combined': 0.21032237115937016, 'epoch': 33}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3223985338172601, 'r': 0.3217867718935082, 'f1': 0.32209236237014016}, 'combined': 0.237331214377998, 'epoch': 33}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3544053467625014, 'r': 0.2856126278068685, 'f1': 0.31631187378994846}, 'combined': 0.20978196810939584, 'epoch': 33}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25675675675675674, 'r': 0.2714285714285714, 'f1': 0.2638888888888889}, 'combined': 0.17592592592592593, 'epoch': 33}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.515625, 'r': 0.358695652173913, 'f1': 0.423076923076923}, 'combined': 0.282051282051282, 'epoch': 33}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 33}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 34
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:37:10.215654: step: 2/466, loss: 0.00145103526301682
2023-01-24 04:37:10.885659: step: 4/466, loss: 0.01543799601495266
2023-01-24 04:37:11.491986: step: 6/466, loss: 0.00019660727411974221
2023-01-24 04:37:12.057612: step: 8/466, loss: 0.01038823090493679
2023-01-24 04:37:12.692886: step: 10/466, loss: 0.1893748939037323
2023-01-24 04:37:13.345663: step: 12/466, loss: 0.002862396417185664
2023-01-24 04:37:14.015292: step: 14/466, loss: 0.4766947627067566
2023-01-24 04:37:14.633039: step: 16/466, loss: 0.0016948574921116233
2023-01-24 04:37:15.318514: step: 18/466, loss: 0.00034551628050394356
2023-01-24 04:37:15.926932: step: 20/466, loss: 0.00022335999528877437
2023-01-24 04:37:16.554743: step: 22/466, loss: 0.882878303527832
2023-01-24 04:37:17.156158: step: 24/466, loss: 0.007838007993996143
2023-01-24 04:37:17.907422: step: 26/466, loss: 0.031117921695113182
2023-01-24 04:37:18.498382: step: 28/466, loss: 0.017217734828591347
2023-01-24 04:37:19.179365: step: 30/466, loss: 0.025149572640657425
2023-01-24 04:37:19.863680: step: 32/466, loss: 0.0020521816331893206
2023-01-24 04:37:20.515732: step: 34/466, loss: 0.011958999559283257
2023-01-24 04:37:21.118906: step: 36/466, loss: 0.008193464949727058
2023-01-24 04:37:21.719291: step: 38/466, loss: 0.00041370323742739856
2023-01-24 04:37:22.337735: step: 40/466, loss: 0.0024150144308805466
2023-01-24 04:37:22.940371: step: 42/466, loss: 0.010512173175811768
2023-01-24 04:37:23.583464: step: 44/466, loss: 0.00981095340102911
2023-01-24 04:37:24.252093: step: 46/466,
loss: 0.00645467359572649 2023-01-24 04:37:24.942921: step: 48/466, loss: 0.003874067682772875 2023-01-24 04:37:25.626775: step: 50/466, loss: 0.00272333063185215 2023-01-24 04:37:26.267742: step: 52/466, loss: 0.003950105048716068 2023-01-24 04:37:26.910528: step: 54/466, loss: 0.014292075298726559 2023-01-24 04:37:27.460444: step: 56/466, loss: 0.026993967592716217 2023-01-24 04:37:28.094818: step: 58/466, loss: 0.015481133945286274 2023-01-24 04:37:28.761581: step: 60/466, loss: 0.0004928237176500261 2023-01-24 04:37:29.370594: step: 62/466, loss: 0.014285244047641754 2023-01-24 04:37:29.952796: step: 64/466, loss: 0.0006637139013037086 2023-01-24 04:37:30.557609: step: 66/466, loss: 0.02772347629070282 2023-01-24 04:37:31.217008: step: 68/466, loss: 0.002016782760620117 2023-01-24 04:37:31.890695: step: 70/466, loss: 0.0017903584521263838 2023-01-24 04:37:32.501104: step: 72/466, loss: 0.018268978223204613 2023-01-24 04:37:33.119013: step: 74/466, loss: 0.005079879891127348 2023-01-24 04:37:33.708534: step: 76/466, loss: 0.003510237904265523 2023-01-24 04:37:34.382494: step: 78/466, loss: 0.00917114969342947 2023-01-24 04:37:34.964352: step: 80/466, loss: 0.006327086128294468 2023-01-24 04:37:35.608036: step: 82/466, loss: 0.06136607751250267 2023-01-24 04:37:36.210224: step: 84/466, loss: 0.0007208603201434016 2023-01-24 04:37:36.898839: step: 86/466, loss: 0.002733423840254545 2023-01-24 04:37:37.546134: step: 88/466, loss: 0.013530532829463482 2023-01-24 04:37:38.197408: step: 90/466, loss: 0.006512346677482128 2023-01-24 04:37:38.890649: step: 92/466, loss: 0.005329577252268791 2023-01-24 04:37:39.605195: step: 94/466, loss: 6.773492813110352 2023-01-24 04:37:40.276937: step: 96/466, loss: 0.009055419825017452 2023-01-24 04:37:40.921149: step: 98/466, loss: 0.02896817773580551 2023-01-24 04:37:41.480640: step: 100/466, loss: 0.01143586728721857 2023-01-24 04:37:42.025501: step: 102/466, loss: 0.00017800797650124878 2023-01-24 04:37:42.667417: step: 104/466, 
loss: 0.023297762498259544 2023-01-24 04:37:43.294649: step: 106/466, loss: 0.003763154149055481 2023-01-24 04:37:43.938866: step: 108/466, loss: 0.0016384078189730644 2023-01-24 04:37:44.577269: step: 110/466, loss: 0.08477050065994263 2023-01-24 04:37:45.173139: step: 112/466, loss: 0.04829823970794678 2023-01-24 04:37:45.826466: step: 114/466, loss: 0.01076584868133068 2023-01-24 04:37:46.460556: step: 116/466, loss: 0.002037909347563982 2023-01-24 04:37:47.069488: step: 118/466, loss: 0.005181357264518738 2023-01-24 04:37:47.701531: step: 120/466, loss: 0.001771796029061079 2023-01-24 04:37:48.300673: step: 122/466, loss: 0.014115221798419952 2023-01-24 04:37:48.903613: step: 124/466, loss: 0.00441603921353817 2023-01-24 04:37:49.484711: step: 126/466, loss: 0.008045581169426441 2023-01-24 04:37:50.098057: step: 128/466, loss: 0.017257267609238625 2023-01-24 04:37:50.771710: step: 130/466, loss: 5.825229891343042e-05 2023-01-24 04:37:51.411764: step: 132/466, loss: 0.003156565362587571 2023-01-24 04:37:52.024515: step: 134/466, loss: 0.03854076936841011 2023-01-24 04:37:52.610341: step: 136/466, loss: 0.16053859889507294 2023-01-24 04:37:53.212840: step: 138/466, loss: 0.015324209816753864 2023-01-24 04:37:53.853713: step: 140/466, loss: 0.005850520916283131 2023-01-24 04:37:54.507249: step: 142/466, loss: 0.0016848534578457475 2023-01-24 04:37:55.154205: step: 144/466, loss: 0.03472881019115448 2023-01-24 04:37:55.840673: step: 146/466, loss: 0.01919623836874962 2023-01-24 04:37:56.524178: step: 148/466, loss: 0.14697599411010742 2023-01-24 04:37:57.128370: step: 150/466, loss: 0.014312896877527237 2023-01-24 04:37:57.770294: step: 152/466, loss: 0.00022975423780735582 2023-01-24 04:37:58.461981: step: 154/466, loss: 0.00428403215482831 2023-01-24 04:37:59.041462: step: 156/466, loss: 0.000444417935796082 2023-01-24 04:37:59.673645: step: 158/466, loss: 0.010773024521768093 2023-01-24 04:38:00.312581: step: 160/466, loss: 0.010588569566607475 2023-01-24 
04:38:01.001406: step: 162/466, loss: 0.43219447135925293 2023-01-24 04:38:01.557861: step: 164/466, loss: 0.0008868477889336646 2023-01-24 04:38:02.191517: step: 166/466, loss: 0.002060934202745557 2023-01-24 04:38:02.799098: step: 168/466, loss: 0.0030699800699949265 2023-01-24 04:38:03.383059: step: 170/466, loss: 0.00046602555084973574 2023-01-24 04:38:04.025730: step: 172/466, loss: 0.0474085696041584 2023-01-24 04:38:04.639386: step: 174/466, loss: 0.00043793785152956843 2023-01-24 04:38:05.228167: step: 176/466, loss: 0.0032128584571182728 2023-01-24 04:38:05.849328: step: 178/466, loss: 0.017221592366695404 2023-01-24 04:38:06.451483: step: 180/466, loss: 0.0053990427404642105 2023-01-24 04:38:07.030871: step: 182/466, loss: 0.007466496899724007 2023-01-24 04:38:07.623867: step: 184/466, loss: 0.0009559531463310122 2023-01-24 04:38:08.239115: step: 186/466, loss: 0.20325548946857452 2023-01-24 04:38:08.899320: step: 188/466, loss: 0.02338382601737976 2023-01-24 04:38:09.550436: step: 190/466, loss: 0.0074676163494586945 2023-01-24 04:38:10.159009: step: 192/466, loss: 0.005346193909645081 2023-01-24 04:38:10.710198: step: 194/466, loss: 0.01663224585354328 2023-01-24 04:38:11.372585: step: 196/466, loss: 0.0007311832159757614 2023-01-24 04:38:12.022355: step: 198/466, loss: 0.0018670763820409775 2023-01-24 04:38:12.665886: step: 200/466, loss: 0.01166330836713314 2023-01-24 04:38:13.287115: step: 202/466, loss: 0.03932376578450203 2023-01-24 04:38:13.904323: step: 204/466, loss: 0.005174162797629833 2023-01-24 04:38:14.505818: step: 206/466, loss: 0.005198253784328699 2023-01-24 04:38:15.073270: step: 208/466, loss: 0.012226833030581474 2023-01-24 04:38:15.662791: step: 210/466, loss: 0.005281386431306601 2023-01-24 04:38:16.287920: step: 212/466, loss: 0.0019306482281535864 2023-01-24 04:38:16.913803: step: 214/466, loss: 0.00234972289763391 2023-01-24 04:38:17.485952: step: 216/466, loss: 0.009221348911523819 2023-01-24 04:38:18.091896: step: 218/466, 
loss: 0.0305698923766613 2023-01-24 04:38:18.756901: step: 220/466, loss: 0.0838170200586319 2023-01-24 04:38:19.407257: step: 222/466, loss: 0.002633826807141304 2023-01-24 04:38:20.092308: step: 224/466, loss: 0.0003977204905822873 2023-01-24 04:38:20.700603: step: 226/466, loss: 0.0055332910269498825 2023-01-24 04:38:21.398507: step: 228/466, loss: 0.049719490110874176 2023-01-24 04:38:22.081860: step: 230/466, loss: 0.014783482067286968 2023-01-24 04:38:22.695103: step: 232/466, loss: 0.004010343458503485 2023-01-24 04:38:23.325680: step: 234/466, loss: 0.1441631317138672 2023-01-24 04:38:23.986906: step: 236/466, loss: 0.16137421131134033 2023-01-24 04:38:24.732543: step: 238/466, loss: 0.5159417390823364 2023-01-24 04:38:25.409998: step: 240/466, loss: 0.000661507889162749 2023-01-24 04:38:26.039792: step: 242/466, loss: 0.022038301452994347 2023-01-24 04:38:26.600332: step: 244/466, loss: 0.0033813118934631348 2023-01-24 04:38:27.259331: step: 246/466, loss: 0.00786846037954092 2023-01-24 04:38:27.818966: step: 248/466, loss: 0.21639883518218994 2023-01-24 04:38:28.384292: step: 250/466, loss: 0.0043896762654185295 2023-01-24 04:38:28.972228: step: 252/466, loss: 0.004003674257546663 2023-01-24 04:38:29.665910: step: 254/466, loss: 0.03909432142972946 2023-01-24 04:38:30.268878: step: 256/466, loss: 0.3959011435508728 2023-01-24 04:38:30.966800: step: 258/466, loss: 0.01426810584962368 2023-01-24 04:38:31.575796: step: 260/466, loss: 0.009774630889296532 2023-01-24 04:38:32.185945: step: 262/466, loss: 0.004245213698595762 2023-01-24 04:38:32.801713: step: 264/466, loss: 0.004529725294560194 2023-01-24 04:38:33.373346: step: 266/466, loss: 0.0024402092676609755 2023-01-24 04:38:34.006319: step: 268/466, loss: 0.002173554850742221 2023-01-24 04:38:34.660510: step: 270/466, loss: 0.0045709493570029736 2023-01-24 04:38:35.245330: step: 272/466, loss: 0.07321876287460327 2023-01-24 04:38:35.838178: step: 274/466, loss: 0.06183570250868797 2023-01-24 
04:38:36.444480: step: 276/466, loss: 0.0011557862162590027 2023-01-24 04:38:37.036751: step: 278/466, loss: 0.062261614948511124 2023-01-24 04:38:37.691217: step: 280/466, loss: 0.011697226203978062 2023-01-24 04:38:38.377774: step: 282/466, loss: 0.006745433434844017 2023-01-24 04:38:39.020845: step: 284/466, loss: 0.0017828167183324695 2023-01-24 04:38:39.660492: step: 286/466, loss: 0.052710626274347305 2023-01-24 04:38:40.254021: step: 288/466, loss: 0.004695044830441475 2023-01-24 04:38:40.918478: step: 290/466, loss: 0.02131335251033306 2023-01-24 04:38:41.554409: step: 292/466, loss: 0.007738930638879538 2023-01-24 04:38:42.181398: step: 294/466, loss: 0.01593361236155033 2023-01-24 04:38:42.755582: step: 296/466, loss: 0.00036103566526435316 2023-01-24 04:38:43.380546: step: 298/466, loss: 0.009133227169513702 2023-01-24 04:38:43.967199: step: 300/466, loss: 0.0015408407198265195 2023-01-24 04:38:44.590667: step: 302/466, loss: 0.3041478991508484 2023-01-24 04:38:45.181237: step: 304/466, loss: 0.04272019490599632 2023-01-24 04:38:45.853118: step: 306/466, loss: 0.027257483452558517 2023-01-24 04:38:46.504506: step: 308/466, loss: 0.10972411930561066 2023-01-24 04:38:47.086759: step: 310/466, loss: 0.0012863239971920848 2023-01-24 04:38:47.681460: step: 312/466, loss: 0.023251373320817947 2023-01-24 04:38:48.342707: step: 314/466, loss: 0.010081824846565723 2023-01-24 04:38:48.978353: step: 316/466, loss: 0.007842368446290493 2023-01-24 04:38:49.619441: step: 318/466, loss: 0.006509221158921719 2023-01-24 04:38:50.248282: step: 320/466, loss: 0.014905976131558418 2023-01-24 04:38:50.873371: step: 322/466, loss: 0.000455966976005584 2023-01-24 04:38:51.540224: step: 324/466, loss: 0.012475952506065369 2023-01-24 04:38:52.242454: step: 326/466, loss: 0.0011488832533359528 2023-01-24 04:38:52.913280: step: 328/466, loss: 0.0370166040956974 2023-01-24 04:38:53.576051: step: 330/466, loss: 0.050278399139642715 2023-01-24 04:38:54.225482: step: 332/466, loss: 
0.02249862253665924 2023-01-24 04:38:54.867633: step: 334/466, loss: 0.0007561338716186583 2023-01-24 04:38:55.450544: step: 336/466, loss: 0.009127049706876278 2023-01-24 04:38:56.047001: step: 338/466, loss: 0.028104346245527267 2023-01-24 04:38:56.683994: step: 340/466, loss: 0.018344346433877945 2023-01-24 04:38:57.328192: step: 342/466, loss: 0.015621643513441086 2023-01-24 04:38:57.993198: step: 344/466, loss: 0.01132090762257576 2023-01-24 04:38:58.618558: step: 346/466, loss: 0.007212344091385603 2023-01-24 04:38:59.222363: step: 348/466, loss: 0.003566417610272765 2023-01-24 04:38:59.857453: step: 350/466, loss: 8.620372682344168e-05 2023-01-24 04:39:00.434531: step: 352/466, loss: 0.014750273898243904 2023-01-24 04:39:01.084482: step: 354/466, loss: 0.010027170181274414 2023-01-24 04:39:01.797271: step: 356/466, loss: 0.026798786595463753 2023-01-24 04:39:02.502234: step: 358/466, loss: 0.022073006257414818 2023-01-24 04:39:03.137463: step: 360/466, loss: 0.008595152758061886 2023-01-24 04:39:03.752604: step: 362/466, loss: 0.01112151425331831 2023-01-24 04:39:04.387273: step: 364/466, loss: 0.005875526927411556 2023-01-24 04:39:05.067142: step: 366/466, loss: 0.00016584506374783814 2023-01-24 04:39:05.632180: step: 368/466, loss: 0.010205023922026157 2023-01-24 04:39:06.238228: step: 370/466, loss: 0.009187527000904083 2023-01-24 04:39:06.847380: step: 372/466, loss: 0.02374931424856186 2023-01-24 04:39:07.471128: step: 374/466, loss: 0.008131437003612518 2023-01-24 04:39:08.048013: step: 376/466, loss: 0.05454736202955246 2023-01-24 04:39:08.641062: step: 378/466, loss: 0.03893023729324341 2023-01-24 04:39:09.293422: step: 380/466, loss: 0.014377097599208355 2023-01-24 04:39:09.926869: step: 382/466, loss: 0.03035423718392849 2023-01-24 04:39:10.573710: step: 384/466, loss: 0.0008646403439342976 2023-01-24 04:39:11.195568: step: 386/466, loss: 0.005868476815521717 2023-01-24 04:39:11.847068: step: 388/466, loss: 0.004623404238373041 2023-01-24 
04:39:12.404423: step: 390/466, loss: 0.07496129721403122 2023-01-24 04:39:13.016155: step: 392/466, loss: 0.0005022316472604871 2023-01-24 04:39:13.731456: step: 394/466, loss: 0.0347086526453495 2023-01-24 04:39:14.325565: step: 396/466, loss: 0.04865318536758423 2023-01-24 04:39:14.958598: step: 398/466, loss: 0.009207713417708874 2023-01-24 04:39:15.541907: step: 400/466, loss: 0.003410718170925975 2023-01-24 04:39:16.158082: step: 402/466, loss: 0.07742343842983246 2023-01-24 04:39:16.787870: step: 404/466, loss: 0.019158687442541122 2023-01-24 04:39:17.382541: step: 406/466, loss: 0.03728799149394035 2023-01-24 04:39:17.983133: step: 408/466, loss: 0.009705687873065472 2023-01-24 04:39:18.517631: step: 410/466, loss: 0.00032010520226322114 2023-01-24 04:39:19.121333: step: 412/466, loss: 0.014273714274168015 2023-01-24 04:39:19.678820: step: 414/466, loss: 0.010331586003303528 2023-01-24 04:39:20.390640: step: 416/466, loss: 0.014624751172959805 2023-01-24 04:39:21.031848: step: 418/466, loss: 0.005155010148882866 2023-01-24 04:39:21.674830: step: 420/466, loss: 0.02165280096232891 2023-01-24 04:39:22.280414: step: 422/466, loss: 0.003787653986364603 2023-01-24 04:39:22.851055: step: 424/466, loss: 0.002895970596000552 2023-01-24 04:39:23.458036: step: 426/466, loss: 0.0736176073551178 2023-01-24 04:39:24.329856: step: 428/466, loss: 0.023823287338018417 2023-01-24 04:39:24.967525: step: 430/466, loss: 0.02936473861336708 2023-01-24 04:39:25.623304: step: 432/466, loss: 0.007644005585461855 2023-01-24 04:39:26.248453: step: 434/466, loss: 0.027653008699417114 2023-01-24 04:39:26.888744: step: 436/466, loss: 0.016150671988725662 2023-01-24 04:39:27.536567: step: 438/466, loss: 0.01745207980275154 2023-01-24 04:39:28.155725: step: 440/466, loss: 0.016127359122037888 2023-01-24 04:39:28.816429: step: 442/466, loss: 0.06432020664215088 2023-01-24 04:39:29.418500: step: 444/466, loss: 0.022350141778588295 2023-01-24 04:39:30.041936: step: 446/466, loss: 
0.013962076045572758 2023-01-24 04:39:30.708320: step: 448/466, loss: 0.007573869079351425 2023-01-24 04:39:31.291956: step: 450/466, loss: 0.00517288688570261 2023-01-24 04:39:31.917435: step: 452/466, loss: 0.055208589881658554 2023-01-24 04:39:32.654327: step: 454/466, loss: 0.01886332593858242 2023-01-24 04:39:33.230480: step: 456/466, loss: 0.00902930460870266 2023-01-24 04:39:33.817590: step: 458/466, loss: 0.017374033108353615 2023-01-24 04:39:34.477961: step: 460/466, loss: 0.0030088305938988924 2023-01-24 04:39:35.088859: step: 462/466, loss: 0.007050098851323128 2023-01-24 04:39:35.689788: step: 464/466, loss: 0.0005313614383339882 2023-01-24 04:39:36.298000: step: 466/466, loss: 0.03915034234523773 2023-01-24 04:39:36.884920: step: 468/466, loss: 0.00026846598484553397 2023-01-24 04:39:37.513945: step: 470/466, loss: 0.027079172432422638 2023-01-24 04:39:38.143363: step: 472/466, loss: 0.0015112333931028843 2023-01-24 04:39:38.693932: step: 474/466, loss: 3.813156217802316e-05 2023-01-24 04:39:39.275450: step: 476/466, loss: 0.0012540343450382352 2023-01-24 04:39:39.896277: step: 478/466, loss: 0.01737871579825878 2023-01-24 04:39:40.574672: step: 480/466, loss: 0.0008481427212245762 2023-01-24 04:39:41.157993: step: 482/466, loss: 0.00066763861104846 2023-01-24 04:39:41.822883: step: 484/466, loss: 0.033831000328063965 2023-01-24 04:39:42.413700: step: 486/466, loss: 7.685121818212792e-05 2023-01-24 04:39:43.004808: step: 488/466, loss: 0.013425313867628574 2023-01-24 04:39:43.565462: step: 490/466, loss: 0.04113367199897766 2023-01-24 04:39:44.215569: step: 492/466, loss: 0.004991469904780388 2023-01-24 04:39:44.814014: step: 494/466, loss: 0.04344819113612175 2023-01-24 04:39:45.412960: step: 496/466, loss: 0.0021680507343262434 2023-01-24 04:39:46.035859: step: 498/466, loss: 0.027491370216012 2023-01-24 04:39:46.675441: step: 500/466, loss: 0.02037181705236435 2023-01-24 04:39:47.307983: step: 502/466, loss: 0.016716280952095985 2023-01-24 
04:39:47.903426: step: 504/466, loss: 0.0015023309970274568 2023-01-24 04:39:48.540243: step: 506/466, loss: 0.003428276162594557 2023-01-24 04:39:49.160103: step: 508/466, loss: 0.004894760437309742 2023-01-24 04:39:49.771888: step: 510/466, loss: 0.03600066527724266 2023-01-24 04:39:50.376910: step: 512/466, loss: 0.0008272746345028281 2023-01-24 04:39:50.961640: step: 514/466, loss: 0.004729779902845621 2023-01-24 04:39:51.499487: step: 516/466, loss: 0.01697508431971073 2023-01-24 04:39:52.094741: step: 518/466, loss: 0.033606935292482376 2023-01-24 04:39:52.708181: step: 520/466, loss: 0.011284686625003815 2023-01-24 04:39:53.334420: step: 522/466, loss: 0.0027831706684082747 2023-01-24 04:39:53.961936: step: 524/466, loss: 0.0012507356004789472 2023-01-24 04:39:54.608786: step: 526/466, loss: 0.006484354380518198 2023-01-24 04:39:55.219528: step: 528/466, loss: 0.018996044993400574 2023-01-24 04:39:55.851154: step: 530/466, loss: 0.084006167948246 2023-01-24 04:39:56.500406: step: 532/466, loss: 0.0024460619315505028 2023-01-24 04:39:57.127389: step: 534/466, loss: 0.0011832149466499686 2023-01-24 04:39:57.853148: step: 536/466, loss: 0.0038138527888804674 2023-01-24 04:39:58.449197: step: 538/466, loss: 0.003790761809796095 2023-01-24 04:39:59.047562: step: 540/466, loss: 0.0055133383721113205 2023-01-24 04:39:59.671059: step: 542/466, loss: 0.004894603043794632 2023-01-24 04:40:00.266721: step: 544/466, loss: 0.012636066414415836 2023-01-24 04:40:00.876378: step: 546/466, loss: 0.0008947293972596526 2023-01-24 04:40:01.470767: step: 548/466, loss: 0.02004299685359001 2023-01-24 04:40:02.144731: step: 550/466, loss: 0.1391642540693283 2023-01-24 04:40:02.712628: step: 552/466, loss: 0.016957959160208702 2023-01-24 04:40:03.327742: step: 554/466, loss: 0.004670858848839998 2023-01-24 04:40:03.948415: step: 556/466, loss: 0.0005273279966786504 2023-01-24 04:40:04.595120: step: 558/466, loss: 0.0011098579270765185 2023-01-24 04:40:05.252287: step: 560/466, 
loss: 0.0007891767891123891 2023-01-24 04:40:05.918348: step: 562/466, loss: 0.011333691887557507 2023-01-24 04:40:06.537242: step: 564/466, loss: 0.026420513167977333 2023-01-24 04:40:07.162296: step: 566/466, loss: 0.010849075391888618 2023-01-24 04:40:07.818711: step: 568/466, loss: 0.0009764400310814381 2023-01-24 04:40:08.480855: step: 570/466, loss: 0.004660797771066427 2023-01-24 04:40:09.159306: step: 572/466, loss: 0.003154938342049718 2023-01-24 04:40:09.829563: step: 574/466, loss: 0.0002601823944132775 2023-01-24 04:40:10.505310: step: 576/466, loss: 0.0025972893927246332 2023-01-24 04:40:11.129268: step: 578/466, loss: 0.0016454608412459493 2023-01-24 04:40:11.710361: step: 580/466, loss: 0.001574407215230167 2023-01-24 04:40:12.285568: step: 582/466, loss: 0.025794681161642075 2023-01-24 04:40:12.914767: step: 584/466, loss: 0.005013021640479565 2023-01-24 04:40:13.576591: step: 586/466, loss: 0.28778019547462463 2023-01-24 04:40:14.194690: step: 588/466, loss: 0.0031442195177078247 2023-01-24 04:40:14.831086: step: 590/466, loss: 0.009091082960367203 2023-01-24 04:40:15.482459: step: 592/466, loss: 0.03503568843007088 2023-01-24 04:40:16.149936: step: 594/466, loss: 0.0561491958796978 2023-01-24 04:40:16.771915: step: 596/466, loss: 0.03535356745123863 2023-01-24 04:40:17.413224: step: 598/466, loss: 0.027905019000172615 2023-01-24 04:40:17.999321: step: 600/466, loss: 0.0008747755200602114 2023-01-24 04:40:18.597819: step: 602/466, loss: 0.01476990431547165 2023-01-24 04:40:19.284331: step: 604/466, loss: 0.02200404927134514 2023-01-24 04:40:19.912221: step: 606/466, loss: 0.005571092013269663 2023-01-24 04:40:20.554507: step: 608/466, loss: 0.03939218446612358 2023-01-24 04:40:21.201523: step: 610/466, loss: 0.02648027054965496 2023-01-24 04:40:21.814577: step: 612/466, loss: 0.0020359500776976347 2023-01-24 04:40:22.380695: step: 614/466, loss: 0.0784219428896904 2023-01-24 04:40:22.956623: step: 616/466, loss: 0.00010506340186111629 2023-01-24 
04:40:23.630689: step: 618/466, loss: 0.01642722263932228 2023-01-24 04:40:24.345180: step: 620/466, loss: 0.00024249528360087425 2023-01-24 04:40:24.985769: step: 622/466, loss: 0.007444320246577263 2023-01-24 04:40:25.567954: step: 624/466, loss: 0.07463310658931732 2023-01-24 04:40:26.153295: step: 626/466, loss: 0.0032307819928973913 2023-01-24 04:40:26.797513: step: 628/466, loss: 3.353673219680786 2023-01-24 04:40:27.485644: step: 630/466, loss: 0.02865377441048622 2023-01-24 04:40:28.054556: step: 632/466, loss: 0.013395837508141994 2023-01-24 04:40:28.729543: step: 634/466, loss: 0.021773796528577805 2023-01-24 04:40:29.306555: step: 636/466, loss: 2.937152657978004e-06 2023-01-24 04:40:29.938989: step: 638/466, loss: 0.023525815457105637 2023-01-24 04:40:30.527244: step: 640/466, loss: 0.005066219717264175 2023-01-24 04:40:31.142323: step: 642/466, loss: 0.0014348170952871442 2023-01-24 04:40:31.773626: step: 644/466, loss: 0.20574060082435608 2023-01-24 04:40:32.420724: step: 646/466, loss: 0.005184273701161146 2023-01-24 04:40:33.068777: step: 648/466, loss: 0.011186490766704082 2023-01-24 04:40:33.719210: step: 650/466, loss: 0.011190414428710938 2023-01-24 04:40:34.344592: step: 652/466, loss: 0.052119433879852295 2023-01-24 04:40:34.943163: step: 654/466, loss: 0.12704919278621674 2023-01-24 04:40:35.627419: step: 656/466, loss: 0.05566718056797981 2023-01-24 04:40:36.213144: step: 658/466, loss: 0.013254563324153423 2023-01-24 04:40:36.840521: step: 660/466, loss: 0.03673689812421799 2023-01-24 04:40:37.468583: step: 662/466, loss: 0.046389028429985046 2023-01-24 04:40:38.085912: step: 664/466, loss: 0.06802118569612503 2023-01-24 04:40:38.691755: step: 666/466, loss: 0.012471283785998821 2023-01-24 04:40:39.240292: step: 668/466, loss: 0.043097637593746185 2023-01-24 04:40:39.860971: step: 670/466, loss: 0.14850647747516632 2023-01-24 04:40:40.523348: step: 672/466, loss: 0.08933252096176147 2023-01-24 04:40:41.222767: step: 674/466, loss: 
0.023991243913769722 2023-01-24 04:40:41.820948: step: 676/466, loss: 0.01252694707363844 2023-01-24 04:40:42.453139: step: 678/466, loss: 0.001300438423641026 2023-01-24 04:40:43.056565: step: 680/466, loss: 0.011431179940700531 2023-01-24 04:40:43.729309: step: 682/466, loss: 0.030292831361293793 2023-01-24 04:40:44.448053: step: 684/466, loss: 0.0014602902811020613 2023-01-24 04:40:45.106569: step: 686/466, loss: 0.03479599207639694 2023-01-24 04:40:45.817665: step: 688/466, loss: 0.021553900092840195 2023-01-24 04:40:46.397408: step: 690/466, loss: 0.009053482674062252 2023-01-24 04:40:47.004631: step: 692/466, loss: 0.007322824560105801 2023-01-24 04:40:47.602840: step: 694/466, loss: 0.007402040995657444 2023-01-24 04:40:48.226290: step: 696/466, loss: 0.00453294487670064 2023-01-24 04:40:48.935708: step: 698/466, loss: 0.004316447302699089 2023-01-24 04:40:49.659772: step: 700/466, loss: 0.01291638519614935 2023-01-24 04:40:50.299267: step: 702/466, loss: 1.0230196714401245 2023-01-24 04:40:50.972822: step: 704/466, loss: 0.01189468428492546 2023-01-24 04:40:51.587572: step: 706/466, loss: 0.004292115103453398 2023-01-24 04:40:52.188815: step: 708/466, loss: 0.01580323837697506 2023-01-24 04:40:52.834480: step: 710/466, loss: 0.008795082569122314 2023-01-24 04:40:53.509418: step: 712/466, loss: 0.0017119685653597116 2023-01-24 04:40:54.149013: step: 714/466, loss: 0.07867859303951263 2023-01-24 04:40:54.771630: step: 716/466, loss: 0.06292006373405457 2023-01-24 04:40:55.404575: step: 718/466, loss: 0.01796099729835987 2023-01-24 04:40:56.043997: step: 720/466, loss: 0.011473596096038818 2023-01-24 04:40:56.650754: step: 722/466, loss: 0.004189202096313238 2023-01-24 04:40:57.322394: step: 724/466, loss: 0.006237534806132317 2023-01-24 04:40:57.966969: step: 726/466, loss: 0.014844987541437149 2023-01-24 04:40:58.570453: step: 728/466, loss: 0.004478269722312689 2023-01-24 04:40:59.128829: step: 730/466, loss: 0.0007747714407742023 2023-01-24 
04:40:59.739321: step: 732/466, loss: 0.020828496664762497 2023-01-24 04:41:00.350320: step: 734/466, loss: 0.0076482766307890415 2023-01-24 04:41:00.958225: step: 736/466, loss: 0.006298763677477837 2023-01-24 04:41:01.556595: step: 738/466, loss: 0.00467700744047761 2023-01-24 04:41:02.215767: step: 740/466, loss: 0.0018371690530329943 2023-01-24 04:41:02.831478: step: 742/466, loss: 0.01124525535851717 2023-01-24 04:41:03.492246: step: 744/466, loss: 0.0011776175815612078 2023-01-24 04:41:04.023095: step: 746/466, loss: 0.0012847530888393521 2023-01-24 04:41:04.582307: step: 748/466, loss: 0.030908742919564247 2023-01-24 04:41:05.188010: step: 750/466, loss: 0.03198450431227684 2023-01-24 04:41:05.821063: step: 752/466, loss: 0.003392572049051523 2023-01-24 04:41:06.478197: step: 754/466, loss: 0.008934213779866695 2023-01-24 04:41:07.093284: step: 756/466, loss: 0.0018937125569209456 2023-01-24 04:41:07.759971: step: 758/466, loss: 0.0012889329809695482 2023-01-24 04:41:08.400467: step: 760/466, loss: 0.0003134472935926169 2023-01-24 04:41:09.011940: step: 762/466, loss: 0.0072999438270926476 2023-01-24 04:41:09.677622: step: 764/466, loss: 0.0006366133457049727 2023-01-24 04:41:10.284570: step: 766/466, loss: 0.10606065392494202 2023-01-24 04:41:10.949075: step: 768/466, loss: 0.2799156904220581 2023-01-24 04:41:11.557801: step: 770/466, loss: 0.037246257066726685 2023-01-24 04:41:12.136279: step: 772/466, loss: 0.0026118732057511806 2023-01-24 04:41:12.692303: step: 774/466, loss: 0.0012117475271224976 2023-01-24 04:41:13.298870: step: 776/466, loss: 0.001340881921350956 2023-01-24 04:41:13.926086: step: 778/466, loss: 0.0028708649333566427 2023-01-24 04:41:14.520936: step: 780/466, loss: 0.0027547755744308233 2023-01-24 04:41:15.151624: step: 782/466, loss: 0.019336048513650894 2023-01-24 04:41:15.834488: step: 784/466, loss: 0.0019793594256043434 2023-01-24 04:41:16.482051: step: 786/466, loss: 0.0011085917940363288 2023-01-24 04:41:17.075870: step: 
788/466, loss: 0.0020195310935378075 2023-01-24 04:41:17.662313: step: 790/466, loss: 0.005421169102191925 2023-01-24 04:41:18.282050: step: 792/466, loss: 0.1041063740849495 2023-01-24 04:41:18.907022: step: 794/466, loss: 1.3896070413466077e-05 2023-01-24 04:41:19.517190: step: 796/466, loss: 0.0036372009199112654 2023-01-24 04:41:20.156704: step: 798/466, loss: 0.0005540642305277288 2023-01-24 04:41:20.713082: step: 800/466, loss: 0.0004064899403601885 2023-01-24 04:41:21.345777: step: 802/466, loss: 0.0005614913534373045 2023-01-24 04:41:21.940360: step: 804/466, loss: 0.061234891414642334 2023-01-24 04:41:22.526635: step: 806/466, loss: 0.00150016276165843 2023-01-24 04:41:23.153484: step: 808/466, loss: 0.12199815362691879 2023-01-24 04:41:23.829730: step: 810/466, loss: 0.021360736340284348 2023-01-24 04:41:24.579373: step: 812/466, loss: 0.013401246629655361 2023-01-24 04:41:25.213364: step: 814/466, loss: 0.35365334153175354 2023-01-24 04:41:25.861173: step: 816/466, loss: 0.3945031464099884 2023-01-24 04:41:26.444440: step: 818/466, loss: 0.008412548340857029 2023-01-24 04:41:27.057540: step: 820/466, loss: 0.4514215886592865 2023-01-24 04:41:27.717115: step: 822/466, loss: 0.0017607983900234103 2023-01-24 04:41:28.326546: step: 824/466, loss: 0.013890708796679974 2023-01-24 04:41:28.929667: step: 826/466, loss: 0.31904810667037964 2023-01-24 04:41:29.553417: step: 828/466, loss: 0.006622544955462217 2023-01-24 04:41:30.085366: step: 830/466, loss: 0.008045758120715618 2023-01-24 04:41:30.710335: step: 832/466, loss: 0.055824968963861465 2023-01-24 04:41:31.266511: step: 834/466, loss: 0.0005027701845392585 2023-01-24 04:41:31.953165: step: 836/466, loss: 0.012430011294782162 2023-01-24 04:41:32.678880: step: 838/466, loss: 0.0092149768024683 2023-01-24 04:41:33.323061: step: 840/466, loss: 0.12487414479255676 2023-01-24 04:41:33.923634: step: 842/466, loss: 0.0024845825973898172 2023-01-24 04:41:34.575133: step: 844/466, loss: 0.004382827784866095 
2023-01-24 04:41:35.199809: step: 846/466, loss: 0.009878009557723999 2023-01-24 04:41:35.843661: step: 848/466, loss: 0.5359202027320862 2023-01-24 04:41:36.445687: step: 850/466, loss: 0.00421258294954896 2023-01-24 04:41:37.034319: step: 852/466, loss: 0.0002831450547091663 2023-01-24 04:41:37.683220: step: 854/466, loss: 0.02538694068789482 2023-01-24 04:41:38.287667: step: 856/466, loss: 0.007417216431349516 2023-01-24 04:41:38.967607: step: 858/466, loss: 0.00018170750990975648 2023-01-24 04:41:39.604904: step: 860/466, loss: 0.005004175007343292 2023-01-24 04:41:40.219006: step: 862/466, loss: 0.00015706718841101974 2023-01-24 04:41:40.823079: step: 864/466, loss: 0.037154652178287506 2023-01-24 04:41:41.471196: step: 866/466, loss: 0.13900421559810638 2023-01-24 04:41:42.063552: step: 868/466, loss: 0.01564154215157032 2023-01-24 04:41:42.695896: step: 870/466, loss: 0.021798204630613327 2023-01-24 04:41:43.255001: step: 872/466, loss: 0.07405062019824982 2023-01-24 04:41:43.830052: step: 874/466, loss: 0.000820789544377476 2023-01-24 04:41:44.492087: step: 876/466, loss: 0.0032635561656206846 2023-01-24 04:41:45.173574: step: 878/466, loss: 0.009002278558909893 2023-01-24 04:41:45.750696: step: 880/466, loss: 0.007568451575934887 2023-01-24 04:41:46.442103: step: 882/466, loss: 0.005844644736498594 2023-01-24 04:41:47.049102: step: 884/466, loss: 0.03638865426182747 2023-01-24 04:41:47.752775: step: 886/466, loss: 0.026529058814048767 2023-01-24 04:41:48.391081: step: 888/466, loss: 0.013912210240960121 2023-01-24 04:41:48.988811: step: 890/466, loss: 0.025714127346873283 2023-01-24 04:41:49.649671: step: 892/466, loss: 0.012615867890417576 2023-01-24 04:41:50.335233: step: 894/466, loss: 0.001323892269283533 2023-01-24 04:41:50.962434: step: 896/466, loss: 0.0011425853008404374 2023-01-24 04:41:51.536064: step: 898/466, loss: 0.010848517529666424 2023-01-24 04:41:52.129595: step: 900/466, loss: 0.006292147561907768 2023-01-24 04:41:52.746643: step: 
902/466, loss: 0.009328721091151237 2023-01-24 04:41:53.428631: step: 904/466, loss: 0.03581482172012329 2023-01-24 04:41:54.074062: step: 906/466, loss: 0.012340017594397068 2023-01-24 04:41:54.608469: step: 908/466, loss: 0.12274104356765747 2023-01-24 04:41:55.247961: step: 910/466, loss: 0.00909186527132988 2023-01-24 04:41:55.786499: step: 912/466, loss: 0.00010829462553374469 2023-01-24 04:41:56.449324: step: 914/466, loss: 0.11186635494232178 2023-01-24 04:41:57.111765: step: 916/466, loss: 0.08896740525960922 2023-01-24 04:41:57.723400: step: 918/466, loss: 0.017871228978037834 2023-01-24 04:41:58.367634: step: 920/466, loss: 0.0023724103812128305 2023-01-24 04:41:58.956452: step: 922/466, loss: 0.4413450360298157 2023-01-24 04:41:59.588309: step: 924/466, loss: 0.0022770024370402098 2023-01-24 04:42:00.225870: step: 926/466, loss: 0.0038027584087103605 2023-01-24 04:42:00.817874: step: 928/466, loss: 0.016884373500943184 2023-01-24 04:42:01.446756: step: 930/466, loss: 0.014772958122193813 2023-01-24 04:42:02.102601: step: 932/466, loss: 0.015412840060889721 ================================================== Loss: 0.057 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34684152127333945, 'r': 0.325780935541372, 'f1': 0.335981512779458}, 'combined': 0.2475653252059164, 'epoch': 34} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37197752619015295, 'r': 0.3077767215831794, 'f1': 0.3368453345851029}, 'combined': 0.22340001464711484, 'epoch': 34} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3416424418604651, 'r': 0.2782315340909091, 'f1': 0.30669363256784965}, 'combined': 0.20446242171189977, 'epoch': 34} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3688349995138794, 'r': 0.28669410274172424, 'f1': 
0.3226182297064357}, 'combined': 0.2105508446505159, 'epoch': 34} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33966138435132576, 'r': 0.32999360301305275, 'f1': 0.3347577070026541}, 'combined': 0.24666357358090302, 'epoch': 34} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3512832960894507, 'r': 0.2930410134912866, 'f1': 0.3195298131017152}, 'combined': 0.21191614547678517, 'epoch': 34} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35349462365591394, 'r': 0.31309523809523804, 'f1': 0.332070707070707}, 'combined': 0.22138047138047134, 'epoch': 34} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5625, 'r': 0.391304347826087, 'f1': 0.46153846153846156}, 'combined': 0.3076923076923077, 'epoch': 34} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.375, 'r': 0.12931034482758622, 'f1': 0.19230769230769235}, 'combined': 0.12820512820512822, 'epoch': 34} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': 
{'p': 0.32910511363636363, 'r': 0.2742542613636364, 'f1': 0.2991864669421488}, 'combined': 0.19945764462809917, 'epoch': 25} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3803562606644222, 'r': 0.28970140060968946, 'f1': 0.32889629598629455}, 'combined': 0.21464810895947642, 'epoch': 25} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.578125, 'r': 0.40217391304347827, 'f1': 0.47435897435897434}, 'combined': 0.3162393162393162, 'epoch': 25} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7} ****************************** Epoch: 35 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 04:44:34.373899: step: 2/466, loss: 0.002012265380471945 2023-01-24 04:44:34.974631: step: 4/466, loss: 0.012894488871097565 2023-01-24 04:44:35.575066: step: 6/466, loss: 0.00037483443156816065 2023-01-24 04:44:36.131539: step: 8/466, loss: 0.10634152591228485 2023-01-24 04:44:36.770585: step: 10/466, loss: 0.0006054774858057499 2023-01-24 04:44:37.450169: step: 12/466, loss: 0.09973970800638199 2023-01-24 04:44:38.080476: step: 14/466, loss: 0.006195437163114548 2023-01-24 
04:44:38.680625: step: 16/466, loss: 0.012353341095149517 2023-01-24 04:44:39.304352: step: 18/466, loss: 0.007871192879974842 2023-01-24 04:44:39.914160: step: 20/466, loss: 0.03719558194279671 2023-01-24 04:44:40.536053: step: 22/466, loss: 0.8608661890029907 2023-01-24 04:44:41.117492: step: 24/466, loss: 0.009443704038858414 2023-01-24 04:44:41.726147: step: 26/466, loss: 0.024249430745840073 2023-01-24 04:44:42.367534: step: 28/466, loss: 0.0005519503611139953 2023-01-24 04:44:42.986717: step: 30/466, loss: 0.0016365292249247432 2023-01-24 04:44:43.554722: step: 32/466, loss: 0.08336590975522995 2023-01-24 04:44:44.157333: step: 34/466, loss: 0.013230552896857262 2023-01-24 04:44:44.804603: step: 36/466, loss: 0.05108135566115379 2023-01-24 04:44:45.375725: step: 38/466, loss: 0.002142277080565691 2023-01-24 04:44:45.999435: step: 40/466, loss: 0.013755310326814651 2023-01-24 04:44:46.719323: step: 42/466, loss: 0.02609098330140114 2023-01-24 04:44:47.377333: step: 44/466, loss: 0.005456157959997654 2023-01-24 04:44:48.001827: step: 46/466, loss: 0.0007077806512825191 2023-01-24 04:44:48.699088: step: 48/466, loss: 1.049682855606079 2023-01-24 04:44:49.327462: step: 50/466, loss: 0.006461585871875286 2023-01-24 04:44:49.947423: step: 52/466, loss: 0.004847324453294277 2023-01-24 04:44:50.637843: step: 54/466, loss: 0.003220370737835765 2023-01-24 04:44:51.230763: step: 56/466, loss: 0.004356561228632927 2023-01-24 04:44:51.813146: step: 58/466, loss: 0.01184358075261116 2023-01-24 04:44:52.338944: step: 60/466, loss: 0.0025993327144533396 2023-01-24 04:44:52.960814: step: 62/466, loss: 0.0036130023654550314 2023-01-24 04:44:53.519756: step: 64/466, loss: 0.0017474743071943521 2023-01-24 04:44:54.188167: step: 66/466, loss: 0.008417025208473206 2023-01-24 04:44:54.788844: step: 68/466, loss: 0.0027821469120681286 2023-01-24 04:44:55.332951: step: 70/466, loss: 0.0111384941264987 2023-01-24 04:44:56.001143: step: 72/466, loss: 0.021571749821305275 2023-01-24 
04:44:56.590833: step: 74/466, loss: 0.0024087431374937296 2023-01-24 04:44:57.201704: step: 76/466, loss: 0.0031457082368433475 2023-01-24 04:44:57.777533: step: 78/466, loss: 0.0029580495320260525 2023-01-24 04:44:58.388019: step: 80/466, loss: 0.0061109671369194984 2023-01-24 04:44:59.031262: step: 82/466, loss: 0.011633175425231457 2023-01-24 04:44:59.645862: step: 84/466, loss: 0.000648948538582772 2023-01-24 04:45:00.249351: step: 86/466, loss: 0.022715579718351364 2023-01-24 04:45:00.864036: step: 88/466, loss: 0.04326004534959793 2023-01-24 04:45:01.470496: step: 90/466, loss: 0.0031368292402476072 2023-01-24 04:45:02.128031: step: 92/466, loss: 0.008613396435976028 2023-01-24 04:45:02.748104: step: 94/466, loss: 0.00023159265401773155 2023-01-24 04:45:03.401741: step: 96/466, loss: 0.009789356961846352 2023-01-24 04:45:04.030534: step: 98/466, loss: 0.011413405649363995 2023-01-24 04:45:04.622211: step: 100/466, loss: 0.002103513339534402 2023-01-24 04:45:05.255245: step: 102/466, loss: 0.07808516919612885 2023-01-24 04:45:05.867198: step: 104/466, loss: 0.007273864466696978 2023-01-24 04:45:06.689559: step: 106/466, loss: 0.0027130546513944864 2023-01-24 04:45:07.316258: step: 108/466, loss: 0.005673782899975777 2023-01-24 04:45:08.039180: step: 110/466, loss: 0.003928873222321272 2023-01-24 04:45:08.670604: step: 112/466, loss: 0.0015045427717268467 2023-01-24 04:45:09.235108: step: 114/466, loss: 0.36543506383895874 2023-01-24 04:45:09.815502: step: 116/466, loss: 0.009011466056108475 2023-01-24 04:45:10.421331: step: 118/466, loss: 0.0004968467983417213 2023-01-24 04:45:11.107608: step: 120/466, loss: 0.012009035795927048 2023-01-24 04:45:11.746550: step: 122/466, loss: 0.014121904037892818 2023-01-24 04:45:12.339012: step: 124/466, loss: 0.02587854117155075 2023-01-24 04:45:12.979319: step: 126/466, loss: 0.011958899907767773 2023-01-24 04:45:13.646469: step: 128/466, loss: 0.008790363557636738 2023-01-24 04:45:14.277693: step: 130/466, loss: 
0.02592388354241848 2023-01-24 04:45:14.913158: step: 132/466, loss: 0.024880820885300636 2023-01-24 04:45:15.556924: step: 134/466, loss: 0.07368521392345428 2023-01-24 04:45:16.120925: step: 136/466, loss: 0.20551887154579163 2023-01-24 04:45:16.716434: step: 138/466, loss: 0.12554225325584412 2023-01-24 04:45:17.279594: step: 140/466, loss: 0.7651264667510986 2023-01-24 04:45:17.887250: step: 142/466, loss: 0.012582586146891117 2023-01-24 04:45:18.551901: step: 144/466, loss: 0.0031956112943589687 2023-01-24 04:45:19.178958: step: 146/466, loss: 0.03728998452425003 2023-01-24 04:45:19.852004: step: 148/466, loss: 0.010229956358671188 2023-01-24 04:45:20.493960: step: 150/466, loss: 0.004200585186481476 2023-01-24 04:45:21.077221: step: 152/466, loss: 0.02673015184700489 2023-01-24 04:45:21.695989: step: 154/466, loss: 0.012018535286188126 2023-01-24 04:45:22.388634: step: 156/466, loss: 0.006779039278626442 2023-01-24 04:45:23.010149: step: 158/466, loss: 0.01809326931834221 2023-01-24 04:45:23.649964: step: 160/466, loss: 0.011561712250113487 2023-01-24 04:45:24.286387: step: 162/466, loss: 0.019324587658047676 2023-01-24 04:45:24.895286: step: 164/466, loss: 0.008751314133405685 2023-01-24 04:45:25.512537: step: 166/466, loss: 0.009230945259332657 2023-01-24 04:45:26.137970: step: 168/466, loss: 0.00518200034275651 2023-01-24 04:45:26.852128: step: 170/466, loss: 0.004224380478262901 2023-01-24 04:45:27.445794: step: 172/466, loss: 0.009996664710342884 2023-01-24 04:45:27.981347: step: 174/466, loss: 0.0077126771211624146 2023-01-24 04:45:28.613374: step: 176/466, loss: 0.019181014969944954 2023-01-24 04:45:29.241284: step: 178/466, loss: 0.004100507125258446 2023-01-24 04:45:29.859180: step: 180/466, loss: 0.010374164208769798 2023-01-24 04:45:30.484678: step: 182/466, loss: 0.032318126410245895 2023-01-24 04:45:31.125149: step: 184/466, loss: 0.006364541593939066 2023-01-24 04:45:31.710895: step: 186/466, loss: 0.014697098173201084 2023-01-24 
04:45:32.389413: step: 188/466, loss: 0.0025067031383514404 2023-01-24 04:45:33.033234: step: 190/466, loss: 0.01777309738099575 2023-01-24 04:45:33.735747: step: 192/466, loss: 0.008049978874623775 2023-01-24 04:45:34.356066: step: 194/466, loss: 0.006283496040850878 2023-01-24 04:45:35.027022: step: 196/466, loss: 0.0018434731755405664 2023-01-24 04:45:35.702362: step: 198/466, loss: 0.009735134430229664 2023-01-24 04:45:36.306988: step: 200/466, loss: 0.028072571381926537 2023-01-24 04:45:36.864229: step: 202/466, loss: 0.013600992038846016 2023-01-24 04:45:37.585959: step: 204/466, loss: 0.21987058222293854 2023-01-24 04:45:38.268853: step: 206/466, loss: 0.034088876098394394 2023-01-24 04:45:38.967128: step: 208/466, loss: 0.010856841690838337 2023-01-24 04:45:39.578010: step: 210/466, loss: 0.033877093344926834 2023-01-24 04:45:40.223998: step: 212/466, loss: 0.05989128351211548 2023-01-24 04:45:40.980635: step: 214/466, loss: 0.24762582778930664 2023-01-24 04:45:41.577673: step: 216/466, loss: 0.008403414860367775 2023-01-24 04:45:42.141491: step: 218/466, loss: 0.003733893157914281 2023-01-24 04:45:42.755935: step: 220/466, loss: 0.0018547578947618604 2023-01-24 04:45:43.413962: step: 222/466, loss: 0.0006512874970212579 2023-01-24 04:45:43.987732: step: 224/466, loss: 0.08464835584163666 2023-01-24 04:45:44.638472: step: 226/466, loss: 0.013823796063661575 2023-01-24 04:45:45.278047: step: 228/466, loss: 0.05559268966317177 2023-01-24 04:45:45.928688: step: 230/466, loss: 0.025688299909234047 2023-01-24 04:45:46.491102: step: 232/466, loss: 0.0025740419514477253 2023-01-24 04:45:47.024704: step: 234/466, loss: 0.0037856039125472307 2023-01-24 04:45:47.631665: step: 236/466, loss: 5.876278877258301 2023-01-24 04:45:48.256356: step: 238/466, loss: 0.004950941540300846 2023-01-24 04:45:48.927442: step: 240/466, loss: 0.030039146542549133 2023-01-24 04:45:49.532771: step: 242/466, loss: 0.0005290998960845172 2023-01-24 04:45:50.134923: step: 244/466, loss: 
0.002077216748148203 2023-01-24 04:45:50.696059: step: 246/466, loss: 0.008934388868510723 2023-01-24 04:45:51.297667: step: 248/466, loss: 0.011586735025048256 2023-01-24 04:45:51.929520: step: 250/466, loss: 0.02876337245106697 2023-01-24 04:45:52.581633: step: 252/466, loss: 0.09464550018310547 2023-01-24 04:45:53.213808: step: 254/466, loss: 0.0040171486325562 2023-01-24 04:45:53.835350: step: 256/466, loss: 0.004150137770920992 2023-01-24 04:45:54.439557: step: 258/466, loss: 0.0008456672076135874 2023-01-24 04:45:55.037557: step: 260/466, loss: 0.0011844084365293384 2023-01-24 04:45:55.577425: step: 262/466, loss: 0.0007199643296189606 2023-01-24 04:45:56.189357: step: 264/466, loss: 0.0002797591732814908 2023-01-24 04:45:56.809537: step: 266/466, loss: 0.0007250283961184323 2023-01-24 04:45:57.497098: step: 268/466, loss: 0.017860015854239464 2023-01-24 04:45:58.164764: step: 270/466, loss: 0.025834210216999054 2023-01-24 04:45:58.801290: step: 272/466, loss: 0.05327354371547699 2023-01-24 04:45:59.406936: step: 274/466, loss: 0.023485787212848663 2023-01-24 04:46:00.008821: step: 276/466, loss: 0.00020847993437200785 2023-01-24 04:46:00.606173: step: 278/466, loss: 0.02168998494744301 2023-01-24 04:46:01.242069: step: 280/466, loss: 0.000491889426484704 2023-01-24 04:46:01.805943: step: 282/466, loss: 0.006376080680638552 2023-01-24 04:46:02.416957: step: 284/466, loss: 0.06472522020339966 2023-01-24 04:46:03.067185: step: 286/466, loss: 0.0011324502993375063 2023-01-24 04:46:03.682527: step: 288/466, loss: 0.00033992258249782026 2023-01-24 04:46:04.309373: step: 290/466, loss: 0.006721579935401678 2023-01-24 04:46:04.969193: step: 292/466, loss: 0.001619403250515461 2023-01-24 04:46:05.528617: step: 294/466, loss: 0.0036933289375156164 2023-01-24 04:46:06.116105: step: 296/466, loss: 0.00347943720407784 2023-01-24 04:46:06.776539: step: 298/466, loss: 0.04912320151925087 2023-01-24 04:46:07.430513: step: 300/466, loss: 0.0557754747569561 2023-01-24 
04:46:08.075062: step: 302/466, loss: 0.0028693275526165962 2023-01-24 04:46:08.639424: step: 304/466, loss: 0.08447981625795364 2023-01-24 04:46:09.249716: step: 306/466, loss: 0.012897913344204426 2023-01-24 04:46:09.836699: step: 308/466, loss: 0.003185515059158206 2023-01-24 04:46:10.465618: step: 310/466, loss: 0.03757956251502037 2023-01-24 04:46:11.094244: step: 312/466, loss: 0.053706299513578415 2023-01-24 04:46:11.660230: step: 314/466, loss: 0.0005277339951135218 2023-01-24 04:46:12.325595: step: 316/466, loss: 0.006508746184408665 2023-01-24 04:46:12.963947: step: 318/466, loss: 0.04517892748117447 2023-01-24 04:46:13.576491: step: 320/466, loss: 0.01487236749380827 2023-01-24 04:46:14.275981: step: 322/466, loss: 0.013757728040218353 2023-01-24 04:46:14.882896: step: 324/466, loss: 0.0076672472059726715 2023-01-24 04:46:15.570239: step: 326/466, loss: 0.0007945263059809804 2023-01-24 04:46:16.197290: step: 328/466, loss: 0.006535803899168968 2023-01-24 04:46:16.819700: step: 330/466, loss: 0.0017893729964271188 2023-01-24 04:46:17.441983: step: 332/466, loss: 0.0004827079246751964 2023-01-24 04:46:18.079197: step: 334/466, loss: 0.00041103153489530087 2023-01-24 04:46:18.769558: step: 336/466, loss: 0.004424719139933586 2023-01-24 04:46:19.395487: step: 338/466, loss: 0.042862024158239365 2023-01-24 04:46:19.993460: step: 340/466, loss: 0.0029731725808233023 2023-01-24 04:46:20.639939: step: 342/466, loss: 0.0063887243159115314 2023-01-24 04:46:21.317277: step: 344/466, loss: 0.0518309660255909 2023-01-24 04:46:21.911872: step: 346/466, loss: 0.4405127763748169 2023-01-24 04:46:22.537002: step: 348/466, loss: 0.037538956850767136 2023-01-24 04:46:23.113764: step: 350/466, loss: 0.005816600285470486 2023-01-24 04:46:23.694833: step: 352/466, loss: 0.0004314797988627106 2023-01-24 04:46:24.314109: step: 354/466, loss: 0.013126222416758537 2023-01-24 04:46:24.920718: step: 356/466, loss: 0.016857674345374107 2023-01-24 04:46:25.451262: step: 358/466, 
loss: 0.012838209047913551 2023-01-24 04:46:26.011544: step: 360/466, loss: 0.0008656500140205026 2023-01-24 04:46:26.679531: step: 362/466, loss: 0.16545487940311432 2023-01-24 04:46:27.283768: step: 364/466, loss: 0.006057723890990019 2023-01-24 04:46:27.895981: step: 366/466, loss: 0.0021400887053459883 2023-01-24 04:46:28.576132: step: 368/466, loss: 0.24039912223815918 2023-01-24 04:46:29.324042: step: 370/466, loss: 0.004498835653066635 2023-01-24 04:46:29.981690: step: 372/466, loss: 0.0002478906826581806 2023-01-24 04:46:30.642635: step: 374/466, loss: 0.009848754853010178 2023-01-24 04:46:31.280134: step: 376/466, loss: 0.007356069982051849 2023-01-24 04:46:31.957750: step: 378/466, loss: 0.08883567154407501 2023-01-24 04:46:32.620557: step: 380/466, loss: 0.12306622415781021 2023-01-24 04:46:33.194876: step: 382/466, loss: 0.0016421453328803182 2023-01-24 04:46:33.719314: step: 384/466, loss: 0.021117009222507477 2023-01-24 04:46:34.287949: step: 386/466, loss: 0.005438767373561859 2023-01-24 04:46:34.928447: step: 388/466, loss: 0.0004887752002105117 2023-01-24 04:46:35.528899: step: 390/466, loss: 0.0034964196383953094 2023-01-24 04:46:36.122703: step: 392/466, loss: 0.003222419647499919 2023-01-24 04:46:36.789818: step: 394/466, loss: 0.0004024989320896566 2023-01-24 04:46:37.426209: step: 396/466, loss: 0.0022491118870675564 2023-01-24 04:46:38.085625: step: 398/466, loss: 0.08100035041570663 2023-01-24 04:46:38.643190: step: 400/466, loss: 0.004757395014166832 2023-01-24 04:46:39.252199: step: 402/466, loss: 0.01630372554063797 2023-01-24 04:46:39.860681: step: 404/466, loss: 0.0118543840944767 2023-01-24 04:46:40.438483: step: 406/466, loss: 0.0010891692945733666 2023-01-24 04:46:41.025865: step: 408/466, loss: 0.05898061767220497 2023-01-24 04:46:41.661587: step: 410/466, loss: 0.0320536345243454 2023-01-24 04:46:42.325745: step: 412/466, loss: 0.015358486212790012 2023-01-24 04:46:42.915635: step: 414/466, loss: 0.021513421088457108 2023-01-24 
04:46:43.489894: step: 416/466, loss: 0.004470662213861942 2023-01-24 04:46:44.077780: step: 418/466, loss: 0.01665816269814968 2023-01-24 04:46:44.667557: step: 420/466, loss: 0.012034298852086067 2023-01-24 04:46:45.309361: step: 422/466, loss: 0.05559059977531433 2023-01-24 04:46:45.929234: step: 424/466, loss: 0.004251624457538128 2023-01-24 04:46:46.527007: step: 426/466, loss: 0.005079424940049648 2023-01-24 04:46:47.099548: step: 428/466, loss: 0.00046944443602114916 2023-01-24 04:46:47.737722: step: 430/466, loss: 0.0004140078090131283 2023-01-24 04:46:48.266665: step: 432/466, loss: 0.0032719718292355537 2023-01-24 04:46:48.906220: step: 434/466, loss: 0.0004585221759043634 2023-01-24 04:46:49.550138: step: 436/466, loss: 0.3552585542201996 2023-01-24 04:46:50.135961: step: 438/466, loss: 0.009487979114055634 2023-01-24 04:46:50.744818: step: 440/466, loss: 0.010056398808956146 2023-01-24 04:46:51.343896: step: 442/466, loss: 0.0001256998657481745 2023-01-24 04:46:52.029978: step: 444/466, loss: 0.00877401977777481 2023-01-24 04:46:52.702648: step: 446/466, loss: 0.046221181750297546 2023-01-24 04:46:53.355407: step: 448/466, loss: 0.0017273180419579148 2023-01-24 04:46:53.985969: step: 450/466, loss: 0.006932374089956284 2023-01-24 04:46:54.631546: step: 452/466, loss: 0.0431647002696991 2023-01-24 04:46:55.245826: step: 454/466, loss: 5.422514004749246e-05 2023-01-24 04:46:55.815466: step: 456/466, loss: 0.0026062550023198128 2023-01-24 04:46:56.419072: step: 458/466, loss: 0.09274192899465561 2023-01-24 04:46:57.000494: step: 460/466, loss: 0.0007122901733964682 2023-01-24 04:46:57.652160: step: 462/466, loss: 0.0017089269822463393 2023-01-24 04:46:58.267366: step: 464/466, loss: 0.010782505385577679 2023-01-24 04:46:58.844770: step: 466/466, loss: 0.0005593986716121435 2023-01-24 04:46:59.442112: step: 468/466, loss: 0.000961803481914103 2023-01-24 04:47:00.032967: step: 470/466, loss: 0.015021142549812794 2023-01-24 04:47:00.589829: step: 472/466, 
loss: 0.002034959616139531 2023-01-24 04:47:01.159692: step: 474/466, loss: 0.013840865343809128 2023-01-24 04:47:01.818279: step: 476/466, loss: 0.0044415052980184555 2023-01-24 04:47:02.402865: step: 478/466, loss: 0.0029340493492782116 2023-01-24 04:47:03.056257: step: 480/466, loss: 0.02091211825609207 2023-01-24 04:47:03.684170: step: 482/466, loss: 0.010680991224944592 2023-01-24 04:47:04.322389: step: 484/466, loss: 0.03337204456329346 2023-01-24 04:47:05.013359: step: 486/466, loss: 0.0021719844080507755 2023-01-24 04:47:05.652192: step: 488/466, loss: 0.0014358946355059743 2023-01-24 04:47:06.199573: step: 490/466, loss: 0.0053126271814107895 2023-01-24 04:47:06.933370: step: 492/466, loss: 0.005152451805770397 2023-01-24 04:47:07.545238: step: 494/466, loss: 0.016728512942790985 2023-01-24 04:47:08.216464: step: 496/466, loss: 0.0023070168681442738 2023-01-24 04:47:08.884473: step: 498/466, loss: 0.0012790559558197856 2023-01-24 04:47:09.555995: step: 500/466, loss: 0.0027140311431139708 2023-01-24 04:47:10.197706: step: 502/466, loss: 0.00013023210340179503 2023-01-24 04:47:10.895815: step: 504/466, loss: 0.0019716620445251465 2023-01-24 04:47:11.528283: step: 506/466, loss: 0.004332349635660648 2023-01-24 04:47:12.179109: step: 508/466, loss: 0.0004268595075700432 2023-01-24 04:47:12.869059: step: 510/466, loss: 0.018236014991998672 2023-01-24 04:47:13.561727: step: 512/466, loss: 0.0096100103110075 2023-01-24 04:47:14.180060: step: 514/466, loss: 0.26996174454689026 2023-01-24 04:47:14.866513: step: 516/466, loss: 0.07348065078258514 2023-01-24 04:47:15.552698: step: 518/466, loss: 0.011505013331770897 2023-01-24 04:47:16.155305: step: 520/466, loss: 0.02061111107468605 2023-01-24 04:47:16.790648: step: 522/466, loss: 0.04129303619265556 2023-01-24 04:47:17.421800: step: 524/466, loss: 0.0015660661738365889 2023-01-24 04:47:18.057298: step: 526/466, loss: 0.006148469168692827 2023-01-24 04:47:18.632496: step: 528/466, loss: 0.01998778060078621 
2023-01-24 04:47:19.246915: step: 530/466, loss: 0.022510893642902374 2023-01-24 04:47:19.870723: step: 532/466, loss: 0.0020203834865242243 2023-01-24 04:47:20.527599: step: 534/466, loss: 0.016820572316646576 2023-01-24 04:47:21.212176: step: 536/466, loss: 0.07796097546815872 2023-01-24 04:47:21.766142: step: 538/466, loss: 0.008878006599843502 2023-01-24 04:47:22.318070: step: 540/466, loss: 0.002800745191052556 2023-01-24 04:47:22.887261: step: 542/466, loss: 0.00031313998624682426 2023-01-24 04:47:23.529836: step: 544/466, loss: 2.1622219719574787e-05 2023-01-24 04:47:24.165080: step: 546/466, loss: 0.0022388698998838663 2023-01-24 04:47:24.819867: step: 548/466, loss: 0.03359275311231613 2023-01-24 04:47:25.428031: step: 550/466, loss: 0.011839710175991058 2023-01-24 04:47:25.997655: step: 552/466, loss: 0.0029783437494188547 2023-01-24 04:47:26.607660: step: 554/466, loss: 0.0022885308135300875 2023-01-24 04:47:27.218196: step: 556/466, loss: 0.0033760373480618 2023-01-24 04:47:27.822455: step: 558/466, loss: 0.0036777276545763016 2023-01-24 04:47:28.454878: step: 560/466, loss: 0.01203171443194151 2023-01-24 04:47:29.070660: step: 562/466, loss: 0.055107247084379196 2023-01-24 04:47:29.672755: step: 564/466, loss: 0.05906687304377556 2023-01-24 04:47:30.304475: step: 566/466, loss: 0.13570892810821533 2023-01-24 04:47:30.969427: step: 568/466, loss: 0.00559958815574646 2023-01-24 04:47:31.588835: step: 570/466, loss: 0.002212547929957509 2023-01-24 04:47:32.277978: step: 572/466, loss: 0.013021819293498993 2023-01-24 04:47:32.987568: step: 574/466, loss: 0.005140095017850399 2023-01-24 04:47:33.618437: step: 576/466, loss: 0.00024036553804762661 2023-01-24 04:47:34.253817: step: 578/466, loss: 0.00848733726888895 2023-01-24 04:47:34.926518: step: 580/466, loss: 0.0004640131664928049 2023-01-24 04:47:35.507813: step: 582/466, loss: 0.0006688210996799171 2023-01-24 04:47:36.071599: step: 584/466, loss: 0.006631654687225819 2023-01-24 04:47:36.814024: step: 
586/466, loss: 0.034514736384153366 2023-01-24 04:47:37.418842: step: 588/466, loss: 0.01949184387922287 2023-01-24 04:47:38.026806: step: 590/466, loss: 0.0045889331959187984 2023-01-24 04:47:38.606662: step: 592/466, loss: 0.02588224969804287 2023-01-24 04:47:39.187801: step: 594/466, loss: 2.4295799448736943e-05 2023-01-24 04:47:39.780805: step: 596/466, loss: 0.03241288661956787 2023-01-24 04:47:40.435106: step: 598/466, loss: 0.028325794264674187 2023-01-24 04:47:41.111157: step: 600/466, loss: 0.0477851964533329 2023-01-24 04:47:41.729436: step: 602/466, loss: 0.009745490737259388 2023-01-24 04:47:42.369588: step: 604/466, loss: 0.015827316790819168 2023-01-24 04:47:42.921909: step: 606/466, loss: 0.002607752103358507 2023-01-24 04:47:43.464198: step: 608/466, loss: 0.003142754314467311 2023-01-24 04:47:44.057028: step: 610/466, loss: 0.1278313845396042 2023-01-24 04:47:44.680931: step: 612/466, loss: 0.8673977851867676 2023-01-24 04:47:45.292155: step: 614/466, loss: 0.019910240545868874 2023-01-24 04:47:45.902798: step: 616/466, loss: 0.00808005966246128 2023-01-24 04:47:46.542542: step: 618/466, loss: 0.09069032222032547 2023-01-24 04:47:47.185102: step: 620/466, loss: 0.12162695825099945 2023-01-24 04:47:47.804406: step: 622/466, loss: 0.004187727812677622 2023-01-24 04:47:48.459212: step: 624/466, loss: 0.008895082399249077 2023-01-24 04:47:49.144112: step: 626/466, loss: 0.02198082022368908 2023-01-24 04:47:49.776864: step: 628/466, loss: 0.09377451986074448 2023-01-24 04:47:50.387404: step: 630/466, loss: 0.007859280332922935 2023-01-24 04:47:50.998025: step: 632/466, loss: 0.0025520678609609604 2023-01-24 04:47:51.609886: step: 634/466, loss: 0.043739113956689835 2023-01-24 04:47:52.270043: step: 636/466, loss: 0.0031407184433192015 2023-01-24 04:47:52.936618: step: 638/466, loss: 0.008512577973306179 2023-01-24 04:47:53.573235: step: 640/466, loss: 0.008761793375015259 2023-01-24 04:47:54.097897: step: 642/466, loss: 0.0004297647101338953 2023-01-24 
04:47:54.735836: step: 644/466, loss: 0.030514473095536232 2023-01-24 04:47:55.357995: step: 646/466, loss: 0.016970517113804817 2023-01-24 04:47:56.002025: step: 648/466, loss: 0.0021906227339059114 2023-01-24 04:47:56.525550: step: 650/466, loss: 0.0007351999520324171 2023-01-24 04:47:57.150395: step: 652/466, loss: 0.01548150647431612 2023-01-24 04:47:57.829566: step: 654/466, loss: 0.017431585118174553 2023-01-24 04:47:58.454914: step: 656/466, loss: 0.05731036514043808 2023-01-24 04:47:59.065147: step: 658/466, loss: 0.00026396213797852397 2023-01-24 04:47:59.643118: step: 660/466, loss: 0.012690886855125427 2023-01-24 04:48:00.197178: step: 662/466, loss: 2.1528674551518634e-05 2023-01-24 04:48:00.825782: step: 664/466, loss: 0.0011564485030248761 2023-01-24 04:48:01.511631: step: 666/466, loss: 0.03397271782159805 2023-01-24 04:48:02.185738: step: 668/466, loss: 0.02553057111799717 2023-01-24 04:48:02.819133: step: 670/466, loss: 0.048447057604789734 2023-01-24 04:48:03.408175: step: 672/466, loss: 0.01090525183826685 2023-01-24 04:48:04.025857: step: 674/466, loss: 0.01414660457521677 2023-01-24 04:48:04.669000: step: 676/466, loss: 0.005885283462703228 2023-01-24 04:48:05.287299: step: 678/466, loss: 0.022681541740894318 2023-01-24 04:48:05.886815: step: 680/466, loss: 0.011428939178586006 2023-01-24 04:48:06.441192: step: 682/466, loss: 0.007325511425733566 2023-01-24 04:48:07.034310: step: 684/466, loss: 0.004513971973210573 2023-01-24 04:48:07.702655: step: 686/466, loss: 0.0050977542996406555 2023-01-24 04:48:08.259835: step: 688/466, loss: 0.002478186273947358 2023-01-24 04:48:08.858577: step: 690/466, loss: 0.0032952444162219763 2023-01-24 04:48:09.499921: step: 692/466, loss: 0.0006835025269538164 2023-01-24 04:48:10.153096: step: 694/466, loss: 0.039983466267585754 2023-01-24 04:48:10.745514: step: 696/466, loss: 0.05446304753422737 2023-01-24 04:48:11.404935: step: 698/466, loss: 0.0005271573318168521 2023-01-24 04:48:11.969167: step: 700/466, 
loss: 0.026197724044322968 2023-01-24 04:48:12.556725: step: 702/466, loss: 0.000332408380927518 2023-01-24 04:48:13.204119: step: 704/466, loss: 0.010294831357896328 2023-01-24 04:48:13.842229: step: 706/466, loss: 0.013582558371126652 2023-01-24 04:48:14.420914: step: 708/466, loss: 0.00780984153971076 2023-01-24 04:48:15.071349: step: 710/466, loss: 0.0002171879168599844 2023-01-24 04:48:15.643907: step: 712/466, loss: 0.004038907587528229 2023-01-24 04:48:16.351341: step: 714/466, loss: 0.0026337308809161186 2023-01-24 04:48:16.929305: step: 716/466, loss: 0.01237061619758606 2023-01-24 04:48:17.498796: step: 718/466, loss: 0.0006351915071718395 2023-01-24 04:48:18.197021: step: 720/466, loss: 0.002086139051243663 2023-01-24 04:48:18.823415: step: 722/466, loss: 0.0002931247581727803 2023-01-24 04:48:19.437544: step: 724/466, loss: 0.07869864255189896 2023-01-24 04:48:20.087260: step: 726/466, loss: 0.024655615910887718 2023-01-24 04:48:20.749864: step: 728/466, loss: 0.00587571831420064 2023-01-24 04:48:21.394928: step: 730/466, loss: 0.0014530383050441742 2023-01-24 04:48:22.079943: step: 732/466, loss: 3.143577487207949e-05 2023-01-24 04:48:22.702640: step: 734/466, loss: 0.0005164192989468575 2023-01-24 04:48:23.319690: step: 736/466, loss: 0.0863901674747467 2023-01-24 04:48:24.006322: step: 738/466, loss: 0.02110672928392887 2023-01-24 04:48:24.632826: step: 740/466, loss: 0.05317772179841995 2023-01-24 04:48:25.236409: step: 742/466, loss: 0.008193924091756344 2023-01-24 04:48:25.904876: step: 744/466, loss: 0.00725803105160594 2023-01-24 04:48:26.546499: step: 746/466, loss: 0.016299206763505936 2023-01-24 04:48:27.132768: step: 748/466, loss: 0.07704475522041321 2023-01-24 04:48:27.773920: step: 750/466, loss: 0.0030760911758989096 2023-01-24 04:48:28.400316: step: 752/466, loss: 0.00893436186015606 2023-01-24 04:48:28.996683: step: 754/466, loss: 0.0026519685052335262 2023-01-24 04:48:29.600445: step: 756/466, loss: 0.007419214583933353 2023-01-24 
04:48:30.170868: step: 758/466, loss: 0.012317190878093243 2023-01-24 04:48:30.855782: step: 760/466, loss: 0.006765678990632296 2023-01-24 04:48:31.445563: step: 762/466, loss: 0.04686688259243965 2023-01-24 04:48:32.085073: step: 764/466, loss: 0.02657747082412243 2023-01-24 04:48:32.725165: step: 766/466, loss: 0.003171471878886223 2023-01-24 04:48:33.393773: step: 768/466, loss: 0.000835277431178838 2023-01-24 04:48:34.003274: step: 770/466, loss: 0.028463780879974365 2023-01-24 04:48:34.514111: step: 772/466, loss: 0.0001282665180042386 2023-01-24 04:48:35.133863: step: 774/466, loss: 0.018418481573462486 2023-01-24 04:48:35.790058: step: 776/466, loss: 0.0020590792410075665 2023-01-24 04:48:36.426475: step: 778/466, loss: 0.00039314222522079945 2023-01-24 04:48:37.051891: step: 780/466, loss: 0.00898836925625801 2023-01-24 04:48:37.702563: step: 782/466, loss: 0.009828277863562107 2023-01-24 04:48:38.393980: step: 784/466, loss: 0.009791073389351368 2023-01-24 04:48:39.007364: step: 786/466, loss: 0.0013660689583048224 2023-01-24 04:48:39.687834: step: 788/466, loss: 0.020413346588611603 2023-01-24 04:48:40.365110: step: 790/466, loss: 0.004203976131975651 2023-01-24 04:48:41.014992: step: 792/466, loss: 0.005849914625287056 2023-01-24 04:48:41.650524: step: 794/466, loss: 0.05871148779988289 2023-01-24 04:48:42.332989: step: 796/466, loss: 0.006010339595377445 2023-01-24 04:48:42.940636: step: 798/466, loss: 0.0334501676261425 2023-01-24 04:48:43.537162: step: 800/466, loss: 0.009299539029598236 2023-01-24 04:48:44.159650: step: 802/466, loss: 0.014091639779508114 2023-01-24 04:48:44.804898: step: 804/466, loss: 0.014303319156169891 2023-01-24 04:48:45.413827: step: 806/466, loss: 0.004221655894070864 2023-01-24 04:48:45.980597: step: 808/466, loss: 0.01323504839092493 2023-01-24 04:48:46.608957: step: 810/466, loss: 0.11757577955722809 2023-01-24 04:48:47.242293: step: 812/466, loss: 0.001458548940718174 2023-01-24 04:48:47.908186: step: 814/466, loss: 
0.00623718835413456 2023-01-24 04:48:48.530862: step: 816/466, loss: 0.006825688295066357 2023-01-24 04:48:49.176546: step: 818/466, loss: 0.005815485492348671 2023-01-24 04:48:49.796309: step: 820/466, loss: 0.0036692700814455748 2023-01-24 04:48:50.452155: step: 822/466, loss: 0.008228559046983719 2023-01-24 04:48:51.074343: step: 824/466, loss: 0.009341354481875896 2023-01-24 04:48:51.675299: step: 826/466, loss: 0.021602025255560875 2023-01-24 04:48:52.285824: step: 828/466, loss: 0.0003038942231796682 2023-01-24 04:48:52.908226: step: 830/466, loss: 0.041580308228731155 2023-01-24 04:48:53.519576: step: 832/466, loss: 0.005860107019543648 2023-01-24 04:48:54.117914: step: 834/466, loss: 0.04242083430290222 2023-01-24 04:48:54.795339: step: 836/466, loss: 0.005504989065229893 2023-01-24 04:48:55.433377: step: 838/466, loss: 0.004706955049186945 2023-01-24 04:48:56.010064: step: 840/466, loss: 0.0018675555475056171 2023-01-24 04:48:56.734463: step: 842/466, loss: 0.0537487268447876 2023-01-24 04:48:57.320362: step: 844/466, loss: 0.009648822247982025 2023-01-24 04:48:57.954015: step: 846/466, loss: 0.0014984318986535072 2023-01-24 04:48:58.547805: step: 848/466, loss: 0.004129413515329361 2023-01-24 04:48:59.195906: step: 850/466, loss: 0.06596551835536957 2023-01-24 04:48:59.749011: step: 852/466, loss: 0.00615499634295702 2023-01-24 04:49:00.349634: step: 854/466, loss: 0.3469690978527069 2023-01-24 04:49:00.951079: step: 856/466, loss: 0.0012979827588424087 2023-01-24 04:49:01.590898: step: 858/466, loss: 0.004978050012141466 2023-01-24 04:49:02.276771: step: 860/466, loss: 0.16628430783748627 2023-01-24 04:49:02.912473: step: 862/466, loss: 0.016617318615317345 2023-01-24 04:49:03.551151: step: 864/466, loss: 0.022998757660388947 2023-01-24 04:49:04.137111: step: 866/466, loss: 0.0009642437798902392 2023-01-24 04:49:04.719600: step: 868/466, loss: 0.007418345659971237 2023-01-24 04:49:05.431402: step: 870/466, loss: 0.01664545387029648 2023-01-24 
04:49:06.083916: step: 872/466, loss: 0.06470434367656708 2023-01-24 04:49:06.683921: step: 874/466, loss: 0.012197468429803848 2023-01-24 04:49:07.332337: step: 876/466, loss: 0.18374070525169373 2023-01-24 04:49:07.966718: step: 878/466, loss: 0.6884382963180542 2023-01-24 04:49:08.618785: step: 880/466, loss: 0.012180290184915066 2023-01-24 04:49:09.328149: step: 882/466, loss: 0.0011859252117574215 2023-01-24 04:49:09.945830: step: 884/466, loss: 0.002707699779421091 2023-01-24 04:49:10.660710: step: 886/466, loss: 0.010159934870898724 2023-01-24 04:49:11.406448: step: 888/466, loss: 0.0010092303855344653 2023-01-24 04:49:11.951851: step: 890/466, loss: 0.013814612291753292 2023-01-24 04:49:12.683506: step: 892/466, loss: 0.04083891957998276 2023-01-24 04:49:13.293484: step: 894/466, loss: 0.032238803803920746 2023-01-24 04:49:13.933843: step: 896/466, loss: 0.012169080786406994 2023-01-24 04:49:14.552752: step: 898/466, loss: 0.003366566263139248 2023-01-24 04:49:15.159755: step: 900/466, loss: 0.00712932413443923 2023-01-24 04:49:15.761543: step: 902/466, loss: 0.0002415215567452833 2023-01-24 04:49:16.360932: step: 904/466, loss: 6.97071009199135e-05 2023-01-24 04:49:16.920617: step: 906/466, loss: 0.0115275327116251 2023-01-24 04:49:17.463652: step: 908/466, loss: 0.0026450790464878082 2023-01-24 04:49:18.115292: step: 910/466, loss: 0.020093899220228195 2023-01-24 04:49:18.761295: step: 912/466, loss: 0.006807427387684584 2023-01-24 04:49:19.381139: step: 914/466, loss: 0.0032413809094578028 2023-01-24 04:49:19.992061: step: 916/466, loss: 0.0001539701479487121 2023-01-24 04:49:20.653188: step: 918/466, loss: 0.009963742457330227 2023-01-24 04:49:21.264478: step: 920/466, loss: 0.004906130023300648 2023-01-24 04:49:21.917975: step: 922/466, loss: 0.023356501013040543 2023-01-24 04:49:22.510898: step: 924/466, loss: 0.17208455502986908 2023-01-24 04:49:23.272356: step: 926/466, loss: 0.0598161555826664 2023-01-24 04:49:23.898135: step: 928/466, loss: 
0.06309767067432404 2023-01-24 04:49:24.517597: step: 930/466, loss: 0.00023569687618874013 2023-01-24 04:49:25.157426: step: 932/466, loss: 0.0010740574216470122
==================================================
Loss: 0.045
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34467858193586354, 'r': 0.33683011327698237, 'f1': 0.34070915488861747}, 'combined': 0.25104885097056023, 'epoch': 35}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36855939881861094, 'r': 0.30091040094021654, 'f1': 0.33131698101615836}, 'combined': 0.2197335418138252, 'epoch': 35}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33670273348519364, 'r': 0.2799479166666667, 'f1': 0.3057135470527405}, 'combined': 0.20380903136849365, 'epoch': 35}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3696950279336589, 'r': 0.27919010327515936, 'f1': 0.31813086188869816}, 'combined': 0.20762224670630824, 'epoch': 35}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3273626091422701, 'r': 0.3298473348283595, 'f1': 0.32860027496133354}, 'combined': 0.24212651839256155, 'epoch': 35}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3590822417300275, 'r': 0.2897529431102906, 'f1': 0.320713607371716}, 'combined': 0.2127012525574075, 'epoch': 35}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29954954954954954, 'r': 0.31666666666666665, 'f1': 0.3078703703703704}, 'combined': 0.20524691358024694, 'epoch': 35}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.43478260869565216, 'f1': 0.4878048780487805}, 'combined': 0.3252032520325203, 'epoch': 35}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.45454545454545453, 'r': 0.1724137931034483, 'f1': 0.25000000000000006}, 'combined': 0.16666666666666669, 'epoch': 35}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33670273348519364, 'r': 0.2799479166666667, 'f1': 0.3057135470527405}, 'combined': 0.20380903136849365, 'epoch': 35}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3696950279336589, 'r': 0.27919010327515936, 'f1': 0.31813086188869816}, 'combined': 0.20762224670630824, 'epoch': 35}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.43478260869565216, 'f1': 0.4878048780487805}, 'combined': 0.3252032520325203, 'epoch': 35}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 36
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:52:04.117712: step: 2/466, loss: 0.004134782124310732 2023-01-24 04:52:04.729310: step: 4/466, loss: 0.014764559455215931 2023-01-24 04:52:05.393230: step: 6/466, loss: 0.00023114972282201052 2023-01-24 04:52:06.024107: step: 8/466, loss: 0.0036247021052986383 2023-01-24 04:52:06.611341: step: 10/466, loss: 0.035689063370227814 2023-01-24 04:52:07.279075: step: 12/466, loss: 0.13503894209861755 2023-01-24 04:52:07.936502: step: 14/466, loss: 0.0015168027020990849 2023-01-24 04:52:08.579363: step: 16/466, loss: 0.00198641000315547 2023-01-24 04:52:09.170301: step: 18/466, loss: 0.007483336143195629 2023-01-24 04:52:09.897513: step: 20/466, loss: 0.011556041426956654 2023-01-24 04:52:10.525391: step: 22/466, loss: 0.011115025728940964 2023-01-24 04:52:11.118776: step: 24/466, loss: 0.00882694125175476 2023-01-24 04:52:11.780402: step: 26/466, loss: 0.11645354330539703 2023-01-24 04:52:12.455391: step: 28/466, loss: 0.056584376841783524 2023-01-24 04:52:13.022550: step: 30/466, loss: 0.001533707370981574 2023-01-24 04:52:13.636331: step: 32/466, loss: 0.0009939366718754172 2023-01-24 04:52:14.278073: step: 34/466, loss: 0.02367999777197838 2023-01-24 04:52:14.899909: step: 36/466, loss: 4.9245820264331996e-05 2023-01-24 04:52:15.510295: step: 38/466, loss: 0.0013309363275766373 2023-01-24 04:52:16.149373: step: 
40/466, loss: 0.00654733506962657 2023-01-24 04:52:16.787117: step: 42/466, loss: 0.012524322606623173 2023-01-24 04:52:17.430953: step: 44/466, loss: 9.207048424286768e-05 2023-01-24 04:52:18.029511: step: 46/466, loss: 0.0011543171713128686 2023-01-24 04:52:18.637360: step: 48/466, loss: 0.08923041075468063 2023-01-24 04:52:19.290032: step: 50/466, loss: 0.013645232655107975 2023-01-24 04:52:19.813849: step: 52/466, loss: 0.025942079722881317 2023-01-24 04:52:20.474050: step: 54/466, loss: 0.002431912114843726 2023-01-24 04:52:21.081798: step: 56/466, loss: 0.01853998750448227 2023-01-24 04:52:21.646191: step: 58/466, loss: 0.00020470521121751517 2023-01-24 04:52:22.388559: step: 60/466, loss: 0.0017083105631172657 2023-01-24 04:52:22.945323: step: 62/466, loss: 0.0002841560635715723 2023-01-24 04:52:23.573894: step: 64/466, loss: 0.007360382005572319 2023-01-24 04:52:24.222555: step: 66/466, loss: 0.018356889486312866 2023-01-24 04:52:24.861332: step: 68/466, loss: 0.012392015196383 2023-01-24 04:52:25.561618: step: 70/466, loss: 0.005072339903563261 2023-01-24 04:52:26.111393: step: 72/466, loss: 0.003504128661006689 2023-01-24 04:52:26.870144: step: 74/466, loss: 0.045964308083057404 2023-01-24 04:52:27.441783: step: 76/466, loss: 0.0016541439108550549 2023-01-24 04:52:28.068167: step: 78/466, loss: 0.007963433861732483 2023-01-24 04:52:28.766320: step: 80/466, loss: 0.0008533421787433326 2023-01-24 04:52:29.458891: step: 82/466, loss: 0.0015797861851751804 2023-01-24 04:52:30.072149: step: 84/466, loss: 0.004055194091051817 2023-01-24 04:52:30.687820: step: 86/466, loss: 0.04232597351074219 2023-01-24 04:52:31.335804: step: 88/466, loss: 0.005299170035868883 2023-01-24 04:52:31.916447: step: 90/466, loss: 0.0011265198700129986 2023-01-24 04:52:32.519063: step: 92/466, loss: 0.0067293415777385235 2023-01-24 04:52:33.134946: step: 94/466, loss: 0.011530286632478237 2023-01-24 04:52:33.720296: step: 96/466, loss: 0.0001433990546502173 2023-01-24 04:52:34.331771: 
step: 98/466, loss: 0.0009852410294115543 2023-01-24 04:52:34.968392: step: 100/466, loss: 0.3928929567337036 2023-01-24 04:52:35.544774: step: 102/466, loss: 0.01997952163219452 2023-01-24 04:52:36.147456: step: 104/466, loss: 0.012904261238873005 2023-01-24 04:52:36.651415: step: 106/466, loss: 0.0003965873329434544 2023-01-24 04:52:37.273142: step: 108/466, loss: 0.01776755601167679 2023-01-24 04:52:37.871225: step: 110/466, loss: 0.003667705925181508 2023-01-24 04:52:38.454161: step: 112/466, loss: 0.0005210679373703897 2023-01-24 04:52:39.039736: step: 114/466, loss: 0.00494978716596961 2023-01-24 04:52:39.695008: step: 116/466, loss: 0.18897537887096405 2023-01-24 04:52:40.382339: step: 118/466, loss: 0.014218885451555252 2023-01-24 04:52:40.968771: step: 120/466, loss: 0.0057451482862234116 2023-01-24 04:52:41.632621: step: 122/466, loss: 0.00048032597987912595 2023-01-24 04:52:42.289362: step: 124/466, loss: 0.007962165400385857 2023-01-24 04:52:42.871156: step: 126/466, loss: 0.006490239407867193 2023-01-24 04:52:43.476175: step: 128/466, loss: 0.06516426056623459 2023-01-24 04:52:44.093426: step: 130/466, loss: 0.0018735408084467053 2023-01-24 04:52:44.735900: step: 132/466, loss: 0.0036469902843236923 2023-01-24 04:52:45.433646: step: 134/466, loss: 0.0337684229016304 2023-01-24 04:52:46.039124: step: 136/466, loss: 0.008938982151448727 2023-01-24 04:52:46.618546: step: 138/466, loss: 0.004523784387856722 2023-01-24 04:52:47.342236: step: 140/466, loss: 0.0073253437876701355 2023-01-24 04:52:47.889625: step: 142/466, loss: 0.0024708369746804237 2023-01-24 04:52:48.495399: step: 144/466, loss: 0.05165370553731918 2023-01-24 04:52:49.082798: step: 146/466, loss: 0.0035205720923841 2023-01-24 04:52:49.698074: step: 148/466, loss: 0.0007877741591073573 2023-01-24 04:52:50.333849: step: 150/466, loss: 0.009827791713178158 2023-01-24 04:52:51.021677: step: 152/466, loss: 0.02217124029994011 2023-01-24 04:52:51.633109: step: 154/466, loss: 0.008764995262026787 
2023-01-24 04:52:52.264844: step: 156/466, loss: 0.28140372037887573 2023-01-24 04:52:52.889114: step: 158/466, loss: 0.00439114635810256 2023-01-24 04:52:53.502230: step: 160/466, loss: 0.0021257810294628143 2023-01-24 04:52:54.109973: step: 162/466, loss: 0.005833633244037628 2023-01-24 04:52:54.765698: step: 164/466, loss: 0.0014589038910344243 2023-01-24 04:52:55.471375: step: 166/466, loss: 0.06629382073879242 2023-01-24 04:52:56.062206: step: 168/466, loss: 0.05375420302152634 2023-01-24 04:52:56.705198: step: 170/466, loss: 0.0029997029341757298 2023-01-24 04:52:57.280445: step: 172/466, loss: 0.01608169823884964 2023-01-24 04:52:57.931444: step: 174/466, loss: 0.0902029424905777 2023-01-24 04:52:58.513636: step: 176/466, loss: 0.011634060181677341 2023-01-24 04:52:59.077856: step: 178/466, loss: 0.02953268401324749 2023-01-24 04:52:59.652046: step: 180/466, loss: 0.003666577860713005 2023-01-24 04:53:00.257643: step: 182/466, loss: 0.028573906049132347 2023-01-24 04:53:00.826621: step: 184/466, loss: 0.11055367439985275 2023-01-24 04:53:01.423573: step: 186/466, loss: 4.221461296081543 2023-01-24 04:53:02.095439: step: 188/466, loss: 0.10896050184965134 2023-01-24 04:53:02.698104: step: 190/466, loss: 0.0401657298207283 2023-01-24 04:53:03.338485: step: 192/466, loss: 0.0023471752647310495 2023-01-24 04:53:03.991230: step: 194/466, loss: 0.015245432034134865 2023-01-24 04:53:04.672311: step: 196/466, loss: 0.01414613239467144 2023-01-24 04:53:05.296933: step: 198/466, loss: 0.15259598195552826 2023-01-24 04:53:05.912164: step: 200/466, loss: 0.00308390986174345 2023-01-24 04:53:06.576872: step: 202/466, loss: 0.03860871493816376 2023-01-24 04:53:07.186190: step: 204/466, loss: 0.009423403069376945 2023-01-24 04:53:07.795289: step: 206/466, loss: 0.00128264632076025 2023-01-24 04:53:08.388583: step: 208/466, loss: 0.007487480994313955 2023-01-24 04:53:09.004793: step: 210/466, loss: 5.24006986618042 2023-01-24 04:53:09.619296: step: 212/466, loss: 
0.0017698605079203844 2023-01-24 04:53:10.280840: step: 214/466, loss: 0.00010806312639033422 2023-01-24 04:53:10.928517: step: 216/466, loss: 0.006685130763798952 2023-01-24 04:53:11.522744: step: 218/466, loss: 0.001411755452863872 2023-01-24 04:53:12.123669: step: 220/466, loss: 0.0016788601642474532 2023-01-24 04:53:12.815662: step: 222/466, loss: 0.001158503582701087 2023-01-24 04:53:13.420461: step: 224/466, loss: 0.002265445189550519 2023-01-24 04:53:14.013880: step: 226/466, loss: 0.00012758112279698253 2023-01-24 04:53:14.610409: step: 228/466, loss: 0.0001739346917020157 2023-01-24 04:53:15.199840: step: 230/466, loss: 0.029974643141031265 2023-01-24 04:53:15.814179: step: 232/466, loss: 0.002089595189318061 2023-01-24 04:53:16.434688: step: 234/466, loss: 0.015846455469727516 2023-01-24 04:53:17.014431: step: 236/466, loss: 0.00983512494713068 2023-01-24 04:53:17.626374: step: 238/466, loss: 0.0007549841539002955 2023-01-24 04:53:18.223551: step: 240/466, loss: 0.0016962449299171567 2023-01-24 04:53:18.887429: step: 242/466, loss: 0.00811100471764803 2023-01-24 04:53:19.563558: step: 244/466, loss: 0.016276143491268158 2023-01-24 04:53:20.147958: step: 246/466, loss: 0.00047804482164792717 2023-01-24 04:53:20.736993: step: 248/466, loss: 0.0029368638060986996 2023-01-24 04:53:21.318703: step: 250/466, loss: 0.005448102951049805 2023-01-24 04:53:21.942623: step: 252/466, loss: 0.008779305964708328 2023-01-24 04:53:22.556168: step: 254/466, loss: 0.005257518962025642 2023-01-24 04:53:23.213926: step: 256/466, loss: 0.0065764570608735085 2023-01-24 04:53:23.822831: step: 258/466, loss: 0.0008405945263803005 2023-01-24 04:53:24.406729: step: 260/466, loss: 0.00186285434756428 2023-01-24 04:53:25.002023: step: 262/466, loss: 0.0009987503290176392 2023-01-24 04:53:25.545263: step: 264/466, loss: 0.0072653088718652725 2023-01-24 04:53:26.382713: step: 266/466, loss: 0.004129834473133087 2023-01-24 04:53:27.011153: step: 268/466, loss: 0.0023377113975584507 
2023-01-24 04:53:27.652738: step: 270/466, loss: 0.0005597401759587228 2023-01-24 04:53:28.252182: step: 272/466, loss: 0.00031847835634835064 2023-01-24 04:53:28.826036: step: 274/466, loss: 0.0003899749426636845 2023-01-24 04:53:29.420356: step: 276/466, loss: 0.01791643165051937 2023-01-24 04:53:30.072553: step: 278/466, loss: 0.011705256067216396 2023-01-24 04:53:30.715405: step: 280/466, loss: 0.015126674436032772 2023-01-24 04:53:31.301402: step: 282/466, loss: 0.002787497593089938 2023-01-24 04:53:31.896879: step: 284/466, loss: 0.004077774006873369 2023-01-24 04:53:32.567812: step: 286/466, loss: 0.007059828843921423 2023-01-24 04:53:33.186421: step: 288/466, loss: 0.01327449269592762 2023-01-24 04:53:33.783692: step: 290/466, loss: 0.00043224674300290644 2023-01-24 04:53:34.418598: step: 292/466, loss: 0.00018172396812587976 2023-01-24 04:53:34.998659: step: 294/466, loss: 0.009487542323768139 2023-01-24 04:53:35.629728: step: 296/466, loss: 1.0592328310012817 2023-01-24 04:53:36.241566: step: 298/466, loss: 0.001033150008879602 2023-01-24 04:53:36.972447: step: 300/466, loss: 0.00043564982479438186 2023-01-24 04:53:37.659690: step: 302/466, loss: 0.0006059581646695733 2023-01-24 04:53:38.233051: step: 304/466, loss: 0.03221989795565605 2023-01-24 04:53:38.819894: step: 306/466, loss: 0.004947112873196602 2023-01-24 04:53:39.479579: step: 308/466, loss: 0.005399622954428196 2023-01-24 04:53:40.110328: step: 310/466, loss: 0.026465794071555138 2023-01-24 04:53:40.690845: step: 312/466, loss: 0.0011521864216774702 2023-01-24 04:53:41.335182: step: 314/466, loss: 0.0005903297569602728 2023-01-24 04:53:41.927530: step: 316/466, loss: 0.022255612537264824 2023-01-24 04:53:42.684048: step: 318/466, loss: 0.032267697155475616 2023-01-24 04:53:43.280058: step: 320/466, loss: 0.022860009223222733 2023-01-24 04:53:43.918801: step: 322/466, loss: 0.00012504737242124975 2023-01-24 04:53:44.467054: step: 324/466, loss: 0.05044516921043396 2023-01-24 04:53:45.043962: 
step: 326/466, loss: 0.008676145225763321 2023-01-24 04:53:45.599523: step: 328/466, loss: 0.005747594870626926 2023-01-24 04:53:46.211701: step: 330/466, loss: 0.01149928942322731 2023-01-24 04:53:46.843749: step: 332/466, loss: 0.020429149270057678 2023-01-24 04:53:47.508205: step: 334/466, loss: 0.6362595558166504 2023-01-24 04:53:48.124501: step: 336/466, loss: 0.0003849474887829274 2023-01-24 04:53:48.726619: step: 338/466, loss: 0.000748310936614871 2023-01-24 04:53:49.440195: step: 340/466, loss: 0.22001215815544128 2023-01-24 04:53:50.059996: step: 342/466, loss: 0.0038748010993003845 2023-01-24 04:53:50.694620: step: 344/466, loss: 0.007948961108922958 2023-01-24 04:53:51.356301: step: 346/466, loss: 0.07563716918230057 2023-01-24 04:53:51.986340: step: 348/466, loss: 0.019224612042307854 2023-01-24 04:53:52.653493: step: 350/466, loss: 0.01067200768738985 2023-01-24 04:53:53.175630: step: 352/466, loss: 0.00024166921502910554 2023-01-24 04:53:53.805639: step: 354/466, loss: 0.012266403995454311 2023-01-24 04:53:54.396343: step: 356/466, loss: 0.011130928993225098 2023-01-24 04:53:54.980156: step: 358/466, loss: 0.012516900897026062 2023-01-24 04:53:55.546616: step: 360/466, loss: 0.03315199911594391 2023-01-24 04:53:56.181168: step: 362/466, loss: 0.0007820624159649014 2023-01-24 04:53:56.826279: step: 364/466, loss: 0.005528890527784824 2023-01-24 04:53:57.399738: step: 366/466, loss: 0.04025204852223396 2023-01-24 04:53:57.998720: step: 368/466, loss: 0.04432765766978264 2023-01-24 04:53:58.594825: step: 370/466, loss: 2.249654608021956e-05 2023-01-24 04:53:59.229801: step: 372/466, loss: 0.050257954746484756 2023-01-24 04:53:59.866098: step: 374/466, loss: 0.002168118953704834 2023-01-24 04:54:00.463014: step: 376/466, loss: 0.000975790957454592 2023-01-24 04:54:01.053119: step: 378/466, loss: 0.04330888390541077 2023-01-24 04:54:01.685287: step: 380/466, loss: 0.02073555439710617 2023-01-24 04:54:02.381686: step: 382/466, loss: 0.0008999567362479866 
2023-01-24 04:54:02.997809: step: 384/466, loss: 0.006720571778714657 2023-01-24 04:54:03.589433: step: 386/466, loss: 0.015682553872466087 2023-01-24 04:54:04.239651: step: 388/466, loss: 0.014605056494474411 2023-01-24 04:54:04.808545: step: 390/466, loss: 0.004097847267985344 2023-01-24 04:54:05.439759: step: 392/466, loss: 0.0360754057765007 2023-01-24 04:54:06.050914: step: 394/466, loss: 0.008980442769825459 2023-01-24 04:54:06.637215: step: 396/466, loss: 0.0016102999215945601 2023-01-24 04:54:07.254788: step: 398/466, loss: 0.012340670451521873 2023-01-24 04:54:07.855161: step: 400/466, loss: 0.004707478452473879 2023-01-24 04:54:08.503414: step: 402/466, loss: 0.003513404866680503 2023-01-24 04:54:09.128092: step: 404/466, loss: 0.018040353432297707 2023-01-24 04:54:09.776122: step: 406/466, loss: 0.007725434377789497 2023-01-24 04:54:10.359933: step: 408/466, loss: 0.007257946766912937 2023-01-24 04:54:10.974567: step: 410/466, loss: 0.0008977011311799288 2023-01-24 04:54:11.590928: step: 412/466, loss: 0.0028205220587551594 2023-01-24 04:54:12.250341: step: 414/466, loss: 0.0014505174476653337 2023-01-24 04:54:12.826440: step: 416/466, loss: 0.007715018931776285 2023-01-24 04:54:13.422589: step: 418/466, loss: 0.0150665994733572 2023-01-24 04:54:14.090708: step: 420/466, loss: 0.000246486539253965 2023-01-24 04:54:14.776395: step: 422/466, loss: 0.04313650727272034 2023-01-24 04:54:15.328104: step: 424/466, loss: 0.02649090252816677 2023-01-24 04:54:15.924529: step: 426/466, loss: 0.004771926905959845 2023-01-24 04:54:16.635086: step: 428/466, loss: 0.0037256137002259493 2023-01-24 04:54:17.214857: step: 430/466, loss: 0.00038914073957130313 2023-01-24 04:54:17.769705: step: 432/466, loss: 0.0022230239119380713 2023-01-24 04:54:18.442642: step: 434/466, loss: 0.03072909638285637 2023-01-24 04:54:19.079124: step: 436/466, loss: 0.10898005962371826 2023-01-24 04:54:19.695626: step: 438/466, loss: 0.04404553398489952 2023-01-24 04:54:20.345701: step: 
440/466, loss: 0.0014041807735338807 2023-01-24 04:54:21.011787: step: 442/466, loss: 0.0072995019145309925 2023-01-24 04:54:21.632662: step: 444/466, loss: 0.09468042850494385 2023-01-24 04:54:22.255738: step: 446/466, loss: 0.019442066550254822 2023-01-24 04:54:22.906368: step: 448/466, loss: 0.013830686919391155 2023-01-24 04:54:23.541389: step: 450/466, loss: 0.07783650606870651 2023-01-24 04:54:24.249314: step: 452/466, loss: 0.03487912937998772 2023-01-24 04:54:24.885056: step: 454/466, loss: 0.02686239592730999 2023-01-24 04:54:25.508244: step: 456/466, loss: 0.0005249048699624836 2023-01-24 04:54:26.111666: step: 458/466, loss: 0.00393849890679121 2023-01-24 04:54:26.707436: step: 460/466, loss: 0.27787500619888306 2023-01-24 04:54:27.340305: step: 462/466, loss: 0.00037776704994030297 2023-01-24 04:54:27.950815: step: 464/466, loss: 0.00618037860840559 2023-01-24 04:54:28.583106: step: 466/466, loss: 0.0005530905327759683 2023-01-24 04:54:29.264665: step: 468/466, loss: 0.03683909401297569 2023-01-24 04:54:29.854753: step: 470/466, loss: 0.0035309072118252516 2023-01-24 04:54:30.447603: step: 472/466, loss: 0.0008134430972859263 2023-01-24 04:54:31.044389: step: 474/466, loss: 0.009414694271981716 2023-01-24 04:54:31.707577: step: 476/466, loss: 0.0031402541790157557 2023-01-24 04:54:32.313426: step: 478/466, loss: 0.012147171422839165 2023-01-24 04:54:32.938369: step: 480/466, loss: 0.00014230998931452632 2023-01-24 04:54:33.518310: step: 482/466, loss: 0.0010596913052722812 2023-01-24 04:54:34.120289: step: 484/466, loss: 0.0105279004201293 2023-01-24 04:54:34.712514: step: 486/466, loss: 0.012947582639753819 2023-01-24 04:54:35.367743: step: 488/466, loss: 0.004752716515213251 2023-01-24 04:54:35.984258: step: 490/466, loss: 0.0015016966499388218 2023-01-24 04:54:36.585880: step: 492/466, loss: 0.0045246221125125885 2023-01-24 04:54:37.215514: step: 494/466, loss: 0.003018786199390888 2023-01-24 04:54:37.854964: step: 496/466, loss: 
0.0028835893608629704 2023-01-24 04:54:38.458265: step: 498/466, loss: 0.004857239313423634 2023-01-24 04:54:39.069044: step: 500/466, loss: 0.0025022828485816717 2023-01-24 04:54:39.702032: step: 502/466, loss: 0.005179490428417921 2023-01-24 04:54:40.281113: step: 504/466, loss: 0.0005254794377833605 2023-01-24 04:54:40.872387: step: 506/466, loss: 0.46436429023742676 2023-01-24 04:54:41.481554: step: 508/466, loss: 0.008739815093576908 2023-01-24 04:54:42.079920: step: 510/466, loss: 0.003550920868292451 2023-01-24 04:54:42.669754: step: 512/466, loss: 0.003878939663991332 2023-01-24 04:54:43.320864: step: 514/466, loss: 0.00949602760374546 2023-01-24 04:54:43.895553: step: 516/466, loss: 0.0002720048651099205 2023-01-24 04:54:44.476031: step: 518/466, loss: 0.0005417861975729465 2023-01-24 04:54:45.076880: step: 520/466, loss: 0.018731823191046715 2023-01-24 04:54:45.686049: step: 522/466, loss: 0.0026590144261717796 2023-01-24 04:54:46.300900: step: 524/466, loss: 0.01841605268418789 2023-01-24 04:54:46.914214: step: 526/466, loss: 0.052160799503326416 2023-01-24 04:54:47.458248: step: 528/466, loss: 0.0019730718340724707 2023-01-24 04:54:48.042875: step: 530/466, loss: 7.600292155984789e-05 2023-01-24 04:54:48.641598: step: 532/466, loss: 0.04475203529000282 2023-01-24 04:54:49.229748: step: 534/466, loss: 0.0004294338868930936 2023-01-24 04:54:49.802562: step: 536/466, loss: 0.017182378098368645 2023-01-24 04:54:50.485994: step: 538/466, loss: 0.008043097332119942 2023-01-24 04:54:51.093929: step: 540/466, loss: 0.0004025986127089709 2023-01-24 04:54:51.720807: step: 542/466, loss: 0.008009443059563637 2023-01-24 04:54:52.373340: step: 544/466, loss: 0.0009922012686729431 2023-01-24 04:54:52.967648: step: 546/466, loss: 1.1559067388589028e-05 2023-01-24 04:54:53.625332: step: 548/466, loss: 0.0022911306004971266 2023-01-24 04:54:54.266664: step: 550/466, loss: 0.04490361735224724 2023-01-24 04:54:54.876892: step: 552/466, loss: 0.00046182976802811027 
2023-01-24 04:54:55.465559: step: 554/466, loss: 0.012909126468002796 2023-01-24 04:54:56.078416: step: 556/466, loss: 0.0005594027461484075 2023-01-24 04:54:56.704772: step: 558/466, loss: 0.0003095760475844145 2023-01-24 04:54:57.337550: step: 560/466, loss: 0.002141920616850257 2023-01-24 04:54:57.999994: step: 562/466, loss: 0.0038322710897773504 2023-01-24 04:54:58.534503: step: 564/466, loss: 0.09330803900957108 2023-01-24 04:54:59.222400: step: 566/466, loss: 0.0011068833991885185 2023-01-24 04:54:59.889028: step: 568/466, loss: 0.0008409665897488594 2023-01-24 04:55:00.505256: step: 570/466, loss: 0.001502808416262269 2023-01-24 04:55:01.170160: step: 572/466, loss: 0.2815001606941223 2023-01-24 04:55:01.773010: step: 574/466, loss: 0.0001007162791211158 2023-01-24 04:55:02.434204: step: 576/466, loss: 0.03037295676767826 2023-01-24 04:55:03.162680: step: 578/466, loss: 0.17398248612880707 2023-01-24 04:55:03.819373: step: 580/466, loss: 0.04349957033991814 2023-01-24 04:55:04.469156: step: 582/466, loss: 0.1279125064611435 2023-01-24 04:55:05.057629: step: 584/466, loss: 0.0009704646654427052 2023-01-24 04:55:05.692603: step: 586/466, loss: 0.0023126662708818913 2023-01-24 04:55:06.338785: step: 588/466, loss: 0.008005252107977867 2023-01-24 04:55:06.912117: step: 590/466, loss: 0.0005765442620031536 2023-01-24 04:55:07.505273: step: 592/466, loss: 0.05980457738041878 2023-01-24 04:55:08.157415: step: 594/466, loss: 0.00028216736973263323 2023-01-24 04:55:08.740433: step: 596/466, loss: 0.000833461235743016 2023-01-24 04:55:09.392001: step: 598/466, loss: 0.0018226697575300932 2023-01-24 04:55:10.008143: step: 600/466, loss: 0.00600542314350605 2023-01-24 04:55:10.573181: step: 602/466, loss: 0.013094187714159489 2023-01-24 04:55:11.212009: step: 604/466, loss: 0.003840026678517461 2023-01-24 04:55:11.869517: step: 606/466, loss: 0.004819925874471664 2023-01-24 04:55:12.481800: step: 608/466, loss: 0.004980082157999277 2023-01-24 04:55:13.169997: step: 
610/466, loss: 0.0036126424092799425 2023-01-24 04:55:13.768715: step: 612/466, loss: 0.0014587597688660026 2023-01-24 04:55:14.481648: step: 614/466, loss: 0.00035369230317883193 2023-01-24 04:55:15.123021: step: 616/466, loss: 0.014491337351500988 2023-01-24 04:55:15.772021: step: 618/466, loss: 0.12356384098529816 2023-01-24 04:55:16.371242: step: 620/466, loss: 0.00992063619196415 2023-01-24 04:55:17.008507: step: 622/466, loss: 0.00022361463925335556 2023-01-24 04:55:17.684009: step: 624/466, loss: 0.026897968724370003 2023-01-24 04:55:18.267764: step: 626/466, loss: 0.0015888429479673505 2023-01-24 04:55:18.874798: step: 628/466, loss: 0.022302014753222466 2023-01-24 04:55:19.537961: step: 630/466, loss: 0.0008641614695079625 2023-01-24 04:55:20.166423: step: 632/466, loss: 0.00048572392552159727 2023-01-24 04:55:20.773487: step: 634/466, loss: 0.04261472076177597 2023-01-24 04:55:21.458270: step: 636/466, loss: 0.027452753856778145 2023-01-24 04:55:22.118539: step: 638/466, loss: 0.0005783818196505308 2023-01-24 04:55:22.680553: step: 640/466, loss: 0.020351797342300415 2023-01-24 04:55:23.282870: step: 642/466, loss: 0.0030048314947634935 2023-01-24 04:55:23.919956: step: 644/466, loss: 0.00014717664453200996 2023-01-24 04:55:24.555439: step: 646/466, loss: 0.007317872252315283 2023-01-24 04:55:25.142562: step: 648/466, loss: 0.034398648887872696 2023-01-24 04:55:25.776298: step: 650/466, loss: 0.034762825816869736 2023-01-24 04:55:26.368506: step: 652/466, loss: 0.0005621562013402581 2023-01-24 04:55:26.980927: step: 654/466, loss: 0.002596135251224041 2023-01-24 04:55:27.599820: step: 656/466, loss: 0.03442927077412605 2023-01-24 04:55:28.244923: step: 658/466, loss: 0.0009023649035952985 2023-01-24 04:55:28.855640: step: 660/466, loss: 0.026509709656238556 2023-01-24 04:55:29.513358: step: 662/466, loss: 0.014872992411255836 2023-01-24 04:55:30.116140: step: 664/466, loss: 0.00032420759089291096 2023-01-24 04:55:30.805766: step: 666/466, loss: 
0.040809109807014465 2023-01-24 04:55:31.474066: step: 668/466, loss: 0.0006128854001872241 2023-01-24 04:55:32.101216: step: 670/466, loss: 0.14126506447792053 2023-01-24 04:55:32.775290: step: 672/466, loss: 0.0026687774807214737 2023-01-24 04:55:33.451140: step: 674/466, loss: 0.017287014052271843 2023-01-24 04:55:34.120278: step: 676/466, loss: 0.0005365812685340643 2023-01-24 04:55:34.803563: step: 678/466, loss: 0.049192655831575394 2023-01-24 04:55:35.436935: step: 680/466, loss: 0.012509983032941818 2023-01-24 04:55:36.091947: step: 682/466, loss: 0.058808889240026474 2023-01-24 04:55:36.698186: step: 684/466, loss: 0.004044192377477884 2023-01-24 04:55:37.247694: step: 686/466, loss: 0.013986448757350445 2023-01-24 04:55:37.921520: step: 688/466, loss: 0.00043944790377281606 2023-01-24 04:55:38.583595: step: 690/466, loss: 0.00938647985458374 2023-01-24 04:55:39.207769: step: 692/466, loss: 0.0008670755196362734 2023-01-24 04:55:39.843840: step: 694/466, loss: 0.003636277047917247 2023-01-24 04:55:40.539314: step: 696/466, loss: 0.023744331672787666 2023-01-24 04:55:41.175967: step: 698/466, loss: 0.0008914788486436009 2023-01-24 04:55:41.737475: step: 700/466, loss: 0.007494314573705196 2023-01-24 04:55:42.428126: step: 702/466, loss: 0.007531964685767889 2023-01-24 04:55:43.085630: step: 704/466, loss: 0.0007346156635321677 2023-01-24 04:55:43.723992: step: 706/466, loss: 0.015438981354236603 2023-01-24 04:55:44.312477: step: 708/466, loss: 0.038965754210948944 2023-01-24 04:55:44.927509: step: 710/466, loss: 0.029853714630007744 2023-01-24 04:55:45.501316: step: 712/466, loss: 0.06593216955661774 2023-01-24 04:55:46.159793: step: 714/466, loss: 0.01110734324902296 2023-01-24 04:55:46.760949: step: 716/466, loss: 0.016697432845830917 2023-01-24 04:55:47.405751: step: 718/466, loss: 0.04144787788391113 2023-01-24 04:55:48.084398: step: 720/466, loss: 0.0013874376891180873 2023-01-24 04:55:48.690489: step: 722/466, loss: 0.009951998479664326 2023-01-24 
04:55:49.357258: step: 724/466, loss: 0.022471958771348 2023-01-24 04:55:50.049925: step: 726/466, loss: 0.14198991656303406 2023-01-24 04:55:50.699940: step: 728/466, loss: 0.010681631043553352 2023-01-24 04:55:51.274502: step: 730/466, loss: 0.044323984533548355 2023-01-24 04:55:51.860466: step: 732/466, loss: 0.03531954064965248 2023-01-24 04:55:52.450495: step: 734/466, loss: 0.0006543992785736918 2023-01-24 04:55:53.066723: step: 736/466, loss: 0.001076264539733529 2023-01-24 04:55:53.683582: step: 738/466, loss: 0.011064611375331879 2023-01-24 04:55:54.281215: step: 740/466, loss: 0.004203278571367264 2023-01-24 04:55:54.881844: step: 742/466, loss: 0.0006326843285933137 2023-01-24 04:55:55.499248: step: 744/466, loss: 0.0007471975404769182 2023-01-24 04:55:56.093400: step: 746/466, loss: 0.007592168636620045 2023-01-24 04:55:56.713029: step: 748/466, loss: 0.006634784862399101 2023-01-24 04:55:57.347549: step: 750/466, loss: 0.03417897969484329 2023-01-24 04:55:57.981477: step: 752/466, loss: 0.001295391470193863 2023-01-24 04:55:58.581869: step: 754/466, loss: 0.0022968819830566645 2023-01-24 04:55:59.178097: step: 756/466, loss: 0.0014519159449264407 2023-01-24 04:55:59.766283: step: 758/466, loss: 0.1301531195640564 2023-01-24 04:56:00.372309: step: 760/466, loss: 0.043121397495269775 2023-01-24 04:56:01.073254: step: 762/466, loss: 0.003836657153442502 2023-01-24 04:56:01.752008: step: 764/466, loss: 0.007856789045035839 2023-01-24 04:56:02.369171: step: 766/466, loss: 0.05979446694254875 2023-01-24 04:56:02.964468: step: 768/466, loss: 0.025634147226810455 2023-01-24 04:56:03.568715: step: 770/466, loss: 0.0024480284191668034 2023-01-24 04:56:04.228292: step: 772/466, loss: 0.011907415464520454 2023-01-24 04:56:04.913341: step: 774/466, loss: 0.004685855470597744 2023-01-24 04:56:05.527939: step: 776/466, loss: 0.00906300637871027 2023-01-24 04:56:06.180677: step: 778/466, loss: 0.15089234709739685 2023-01-24 04:56:06.826804: step: 780/466, loss: 
0.06680942326784134 2023-01-24 04:56:07.421883: step: 782/466, loss: 0.0020852303132414818 2023-01-24 04:56:08.023877: step: 784/466, loss: 0.012161944061517715 2023-01-24 04:56:08.588393: step: 786/466, loss: 0.03335161507129669 2023-01-24 04:56:09.236328: step: 788/466, loss: 0.0030902328435331583 2023-01-24 04:56:09.933471: step: 790/466, loss: 0.0237263310700655 2023-01-24 04:56:10.498226: step: 792/466, loss: 0.002544113900512457 2023-01-24 04:56:11.190201: step: 794/466, loss: 0.06023455411195755 2023-01-24 04:56:11.843285: step: 796/466, loss: 0.24715900421142578 2023-01-24 04:56:12.418917: step: 798/466, loss: 0.022844484075903893 2023-01-24 04:56:13.062007: step: 800/466, loss: 0.40122318267822266 2023-01-24 04:56:13.719359: step: 802/466, loss: 0.025761272758245468 2023-01-24 04:56:14.416861: step: 804/466, loss: 0.019390100613236427 2023-01-24 04:56:15.029956: step: 806/466, loss: 0.0019449134124442935 2023-01-24 04:56:15.653001: step: 808/466, loss: 0.0016414802521467209 2023-01-24 04:56:16.282958: step: 810/466, loss: 0.009476746432483196 2023-01-24 04:56:16.896960: step: 812/466, loss: 0.011853148229420185 2023-01-24 04:56:17.700503: step: 814/466, loss: 0.037017446011304855 2023-01-24 04:56:18.303173: step: 816/466, loss: 0.012307077646255493 2023-01-24 04:56:18.885072: step: 818/466, loss: 0.0005546507891267538 2023-01-24 04:56:19.560279: step: 820/466, loss: 0.017318541184067726 2023-01-24 04:56:20.179086: step: 822/466, loss: 0.002184445969760418 2023-01-24 04:56:20.834563: step: 824/466, loss: 0.0024101058952510357 2023-01-24 04:56:21.457938: step: 826/466, loss: 0.01470540463924408 2023-01-24 04:56:22.059582: step: 828/466, loss: 0.00307606253772974 2023-01-24 04:56:22.736532: step: 830/466, loss: 0.0006091810064390302 2023-01-24 04:56:23.376211: step: 832/466, loss: 0.006069089286029339 2023-01-24 04:56:24.050876: step: 834/466, loss: 0.09584932774305344 2023-01-24 04:56:24.684151: step: 836/466, loss: 0.006171443499624729 2023-01-24 
04:56:25.310492: step: 838/466, loss: 0.006870781537145376 2023-01-24 04:56:25.929517: step: 840/466, loss: 0.010167393833398819 2023-01-24 04:56:26.568752: step: 842/466, loss: 0.0011944527504965663 2023-01-24 04:56:27.231667: step: 844/466, loss: 0.0006144341314211488 2023-01-24 04:56:27.904478: step: 846/466, loss: 0.2855139672756195 2023-01-24 04:56:28.535839: step: 848/466, loss: 1.5135047435760498 2023-01-24 04:56:29.204766: step: 850/466, loss: 0.006842055357992649 2023-01-24 04:56:29.831345: step: 852/466, loss: 0.01040316466242075 2023-01-24 04:56:30.470740: step: 854/466, loss: 0.003283366095274687 2023-01-24 04:56:31.139434: step: 856/466, loss: 0.00498466519638896 2023-01-24 04:56:31.791766: step: 858/466, loss: 0.004681904800236225 2023-01-24 04:56:32.349072: step: 860/466, loss: 0.0169754009693861 2023-01-24 04:56:32.964531: step: 862/466, loss: 0.0372665636241436 2023-01-24 04:56:33.627628: step: 864/466, loss: 0.02054089680314064 2023-01-24 04:56:34.225377: step: 866/466, loss: 0.03896929696202278 2023-01-24 04:56:34.876557: step: 868/466, loss: 0.007434012833982706 2023-01-24 04:56:35.524088: step: 870/466, loss: 0.0048303971998393536 2023-01-24 04:56:36.085188: step: 872/466, loss: 0.004994220100343227 2023-01-24 04:56:36.731358: step: 874/466, loss: 0.0015695391921326518 2023-01-24 04:56:37.352308: step: 876/466, loss: 0.002584850648418069 2023-01-24 04:56:38.006424: step: 878/466, loss: 0.006896187085658312 2023-01-24 04:56:38.601251: step: 880/466, loss: 0.003456108272075653 2023-01-24 04:56:39.264325: step: 882/466, loss: 0.028172794729471207 2023-01-24 04:56:39.877592: step: 884/466, loss: 0.00853010918945074 2023-01-24 04:56:40.484295: step: 886/466, loss: 0.007580365054309368 2023-01-24 04:56:41.163160: step: 888/466, loss: 0.006292473059147596 2023-01-24 04:56:41.753773: step: 890/466, loss: 0.0060486397705972195 2023-01-24 04:56:42.379753: step: 892/466, loss: 0.03900180757045746 2023-01-24 04:56:42.994883: step: 894/466, loss: 
0.004179924260824919 2023-01-24 04:56:43.619626: step: 896/466, loss: 0.0010741227306425571 2023-01-24 04:56:44.241509: step: 898/466, loss: 0.0005468535237014294 2023-01-24 04:56:44.986322: step: 900/466, loss: 0.02371526136994362 2023-01-24 04:56:45.591299: step: 902/466, loss: 0.009221994318068027 2023-01-24 04:56:46.227295: step: 904/466, loss: 0.2059815227985382 2023-01-24 04:56:46.806381: step: 906/466, loss: 0.0070160552859306335 2023-01-24 04:56:47.379813: step: 908/466, loss: 0.032636530697345734 2023-01-24 04:56:47.929219: step: 910/466, loss: 0.01037522405385971 2023-01-24 04:56:48.612835: step: 912/466, loss: 0.008214067667722702 2023-01-24 04:56:49.277632: step: 914/466, loss: 0.0076082623563706875 2023-01-24 04:56:49.866627: step: 916/466, loss: 0.1538480818271637 2023-01-24 04:56:50.488885: step: 918/466, loss: 9.918749856296927e-05 2023-01-24 04:56:51.130189: step: 920/466, loss: 0.008902657777071 2023-01-24 04:56:51.694449: step: 922/466, loss: 0.0007451887940987945 2023-01-24 04:56:52.272269: step: 924/466, loss: 5.677406443282962e-05 2023-01-24 04:56:52.948420: step: 926/466, loss: 0.0005498333484865725 2023-01-24 04:56:53.711460: step: 928/466, loss: 0.10708189755678177 2023-01-24 04:56:54.270008: step: 930/466, loss: 0.0016411797842010856 2023-01-24 04:56:54.834657: step: 932/466, loss: 0.027343858033418655
==================================================
Loss: 0.050
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33758570554025097, 'r': 0.31708714277499855, 'f1': 0.3270155073237265}, 'combined': 0.24095879487011423, 'epoch': 36}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37349445419472704, 'r': 0.3074121743545498, 'f1': 0.3372466574983787}, 'combined': 0.22366617699374333, 'epoch': 36}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3222870783384444, 'r': 0.25758550579322637, 'f1': 0.28632662538699694}, 'combined': 0.19088441692466462, 'epoch': 36}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.36472667077324966, 'r': 0.280801675595322, 'f1': 0.31730863830712824}, 'combined': 0.2070856376320205, 'epoch': 36}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3299676944473325, 'r': 0.3174452012994261, 'f1': 0.32358534058955035}, 'combined': 0.23843130359230025, 'epoch': 36}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36113741416040845, 'r': 0.29436461102137457, 'f1': 0.32435010224449884}, 'combined': 0.21511302117769868, 'epoch': 36}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33620689655172414, 'r': 0.2785714285714286, 'f1': 0.3046875}, 'combined': 0.203125, 'epoch': 36}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.48333333333333334, 'r': 0.31521739130434784, 'f1': 0.381578947368421}, 'combined': 0.25438596491228066, 'epoch': 36}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4444444444444444, 'r': 0.13793103448275862, 'f1': 0.21052631578947367}, 'combined': 0.14035087719298245, 'epoch': 36}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33670273348519364, 'r': 0.2799479166666667, 'f1': 0.3057135470527405}, 'combined': 0.20380903136849365, 'epoch': 35}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3696950279336589, 'r': 0.27919010327515936, 'f1': 0.31813086188869816}, 'combined': 0.20762224670630824, 'epoch': 35}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.43478260869565216, 'f1': 0.4878048780487805}, 'combined': 0.3252032520325203, 'epoch': 35}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3214272868712161, 'r': 0.31166858366449984, 'f1': 0.3164727236824498}, 'combined': 0.23319042797654194, 'epoch': 7}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3374639287478496, 'r': 0.2816078301964814, 'f1': 0.30701605547736693}, 'combined': 0.20361686580882363, 'epoch': 7}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4423076923076923, 'r': 0.19827586206896552, 'f1': 0.2738095238095238}, 'combined': 0.1825396825396825, 'epoch': 7}
******************************
Epoch: 37
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 04:59:28.380412: step: 2/466, loss: 0.0008805061224848032 2023-01-24 04:59:28.959676: step: 4/466, loss: 3.681400266941637e-05 2023-01-24 04:59:29.665862: step: 6/466, loss:
0.036304231733083725 2023-01-24 04:59:30.260552: step: 8/466, loss: 0.0014005315024405718 2023-01-24 04:59:30.902881: step: 10/466, loss: 0.2493869513273239 2023-01-24 04:59:31.546067: step: 12/466, loss: 0.005454708356410265 2023-01-24 04:59:32.213343: step: 14/466, loss: 0.0005787847330793738 2023-01-24 04:59:32.838794: step: 16/466, loss: 0.008444041945040226 2023-01-24 04:59:33.472973: step: 18/466, loss: 0.0016095231985673308 2023-01-24 04:59:34.108714: step: 20/466, loss: 0.00022498948965221643 2023-01-24 04:59:34.794736: step: 22/466, loss: 0.012059216387569904 2023-01-24 04:59:35.417783: step: 24/466, loss: 0.0009429472265765071 2023-01-24 04:59:36.087267: step: 26/466, loss: 0.004221235867589712 2023-01-24 04:59:36.745281: step: 28/466, loss: 0.07568201422691345 2023-01-24 04:59:37.409127: step: 30/466, loss: 0.0034840416628867388 2023-01-24 04:59:38.061554: step: 32/466, loss: 0.012441596947610378 2023-01-24 04:59:38.600614: step: 34/466, loss: 0.00023383184452541173 2023-01-24 04:59:39.345686: step: 36/466, loss: 0.00031923266942612827 2023-01-24 04:59:39.968221: step: 38/466, loss: 0.003666159464046359 2023-01-24 04:59:40.662738: step: 40/466, loss: 0.00039576643030159175 2023-01-24 04:59:41.322291: step: 42/466, loss: 0.0009798625251278281 2023-01-24 04:59:41.963549: step: 44/466, loss: 0.005588307045400143 2023-01-24 04:59:42.609181: step: 46/466, loss: 0.03727317228913307 2023-01-24 04:59:43.263932: step: 48/466, loss: 0.004420082084834576 2023-01-24 04:59:43.834171: step: 50/466, loss: 2.9930888558737934e-06 2023-01-24 04:59:44.498544: step: 52/466, loss: 0.01459225732833147 2023-01-24 04:59:45.087923: step: 54/466, loss: 0.00021791511971969157 2023-01-24 04:59:45.720169: step: 56/466, loss: 0.00026564046856947243 2023-01-24 04:59:46.413106: step: 58/466, loss: 0.015419005416333675 2023-01-24 04:59:47.043481: step: 60/466, loss: 0.04069914296269417 2023-01-24 04:59:47.662418: step: 62/466, loss: 0.057508617639541626 2023-01-24 04:59:48.249768: step: 
64/466, loss: 0.0007234219228848815 2023-01-24 04:59:48.876464: step: 66/466, loss: 0.0033933455124497414 2023-01-24 04:59:49.497274: step: 68/466, loss: 0.011373174376785755 2023-01-24 04:59:50.201387: step: 70/466, loss: 0.007621078286319971 2023-01-24 04:59:50.782471: step: 72/466, loss: 0.0011365091195330024 2023-01-24 04:59:51.408641: step: 74/466, loss: 0.0007960588554851711 2023-01-24 04:59:52.046424: step: 76/466, loss: 0.020760798826813698 2023-01-24 04:59:52.649369: step: 78/466, loss: 0.013785758055746555 2023-01-24 04:59:53.220042: step: 80/466, loss: 0.003760263556614518 2023-01-24 04:59:53.796283: step: 82/466, loss: 0.010897494852542877 2023-01-24 04:59:54.429423: step: 84/466, loss: 0.000967467378359288 2023-01-24 04:59:55.066337: step: 86/466, loss: 0.005571451503783464 2023-01-24 04:59:55.804769: step: 88/466, loss: 0.04830334335565567 2023-01-24 04:59:56.437092: step: 90/466, loss: 0.026635482907295227 2023-01-24 04:59:57.045299: step: 92/466, loss: 0.00018935652042273432 2023-01-24 04:59:57.705632: step: 94/466, loss: 0.002730258274823427 2023-01-24 04:59:58.284257: step: 96/466, loss: 0.4884699285030365 2023-01-24 04:59:58.875237: step: 98/466, loss: 0.0043935151770710945 2023-01-24 04:59:59.584226: step: 100/466, loss: 0.005710270255804062 2023-01-24 05:00:00.249613: step: 102/466, loss: 0.03708332031965256 2023-01-24 05:00:00.839700: step: 104/466, loss: 0.0016062766080722213 2023-01-24 05:00:01.420868: step: 106/466, loss: 0.02114509418606758 2023-01-24 05:00:02.027165: step: 108/466, loss: 0.0015665657119825482 2023-01-24 05:00:02.596221: step: 110/466, loss: 0.0013903353828936815 2023-01-24 05:00:03.140258: step: 112/466, loss: 0.004117294680327177 2023-01-24 05:00:03.800331: step: 114/466, loss: 9.603073704056442e-05 2023-01-24 05:00:04.563431: step: 116/466, loss: 0.0019076470052823424 2023-01-24 05:00:05.171953: step: 118/466, loss: 0.03609192371368408 2023-01-24 05:00:05.806339: step: 120/466, loss: 0.0002453723573125899 2023-01-24 
05:00:06.356992: step: 122/466, loss: 0.0007782530738040805 2023-01-24 05:00:06.998840: step: 124/466, loss: 0.0011430905433371663 2023-01-24 05:00:07.627278: step: 126/466, loss: 0.022032130509614944 2023-01-24 05:00:08.245773: step: 128/466, loss: 0.07144544273614883 2023-01-24 05:00:08.873818: step: 130/466, loss: 0.004590063355863094 2023-01-24 05:00:09.499850: step: 132/466, loss: 0.028195375576615334 2023-01-24 05:00:10.097040: step: 134/466, loss: 0.023656990379095078 2023-01-24 05:00:10.714637: step: 136/466, loss: 0.05975626781582832 2023-01-24 05:00:11.407135: step: 138/466, loss: 0.003439907915890217 2023-01-24 05:00:11.992410: step: 140/466, loss: 0.0012263577664270997 2023-01-24 05:00:12.672532: step: 142/466, loss: 0.01997874490916729 2023-01-24 05:00:13.228925: step: 144/466, loss: 0.0015897410921752453 2023-01-24 05:00:13.904832: step: 146/466, loss: 0.04166782647371292 2023-01-24 05:00:14.634050: step: 148/466, loss: 0.014002179726958275 2023-01-24 05:00:15.259115: step: 150/466, loss: 0.000593072734773159 2023-01-24 05:00:15.915181: step: 152/466, loss: 0.0024511227384209633 2023-01-24 05:00:16.556706: step: 154/466, loss: 0.003020963165909052 2023-01-24 05:00:17.241110: step: 156/466, loss: 0.008708332665264606 2023-01-24 05:00:17.941267: step: 158/466, loss: 0.018368329852819443 2023-01-24 05:00:18.561386: step: 160/466, loss: 0.04773017019033432 2023-01-24 05:00:19.145308: step: 162/466, loss: 0.0012530928943306208 2023-01-24 05:00:19.750443: step: 164/466, loss: 0.0264136865735054 2023-01-24 05:00:20.434725: step: 166/466, loss: 0.005277651362121105 2023-01-24 05:00:21.086291: step: 168/466, loss: 0.006790454499423504 2023-01-24 05:00:21.673752: step: 170/466, loss: 0.007004170678555965 2023-01-24 05:00:22.328701: step: 172/466, loss: 0.0011634072288870811 2023-01-24 05:00:23.020651: step: 174/466, loss: 0.039164841175079346 2023-01-24 05:00:23.579172: step: 176/466, loss: 0.00011331143468851224 2023-01-24 05:00:24.250284: step: 178/466, loss: 
0.0017837842460721731 2023-01-24 05:00:24.870191: step: 180/466, loss: 0.007201091386377811 2023-01-24 05:00:25.452799: step: 182/466, loss: 2.483464231772814e-05 2023-01-24 05:00:26.119423: step: 184/466, loss: 0.00038544295239262283 2023-01-24 05:00:26.705889: step: 186/466, loss: 0.003666855860501528 2023-01-24 05:00:27.286770: step: 188/466, loss: 0.0021711955778300762 2023-01-24 05:00:27.967026: step: 190/466, loss: 0.009761896915733814 2023-01-24 05:00:28.599267: step: 192/466, loss: 0.006024029105901718 2023-01-24 05:00:29.168237: step: 194/466, loss: 0.001021972973830998 2023-01-24 05:00:29.859631: step: 196/466, loss: 0.07072720676660538 2023-01-24 05:00:30.453574: step: 198/466, loss: 0.21727712452411652 2023-01-24 05:00:30.981682: step: 200/466, loss: 0.0025600316002964973 2023-01-24 05:00:31.569885: step: 202/466, loss: 0.008431457914412022 2023-01-24 05:00:32.209191: step: 204/466, loss: 0.0926947146654129 2023-01-24 05:00:32.823573: step: 206/466, loss: 0.011736077256500721 2023-01-24 05:00:33.457995: step: 208/466, loss: 0.01215405948460102 2023-01-24 05:00:34.089690: step: 210/466, loss: 0.016161885112524033 2023-01-24 05:00:34.702153: step: 212/466, loss: 0.0162214208394289 2023-01-24 05:00:35.367749: step: 214/466, loss: 0.011673922650516033 2023-01-24 05:00:36.076112: step: 216/466, loss: 0.02247869037091732 2023-01-24 05:00:36.679907: step: 218/466, loss: 0.025550978258252144 2023-01-24 05:00:37.350773: step: 220/466, loss: 0.20616596937179565 2023-01-24 05:00:38.007183: step: 222/466, loss: 0.012072236277163029 2023-01-24 05:00:38.592217: step: 224/466, loss: 0.0014509232714772224 2023-01-24 05:00:39.382249: step: 226/466, loss: 0.28090688586235046 2023-01-24 05:00:39.989742: step: 228/466, loss: 0.0036569759249687195 2023-01-24 05:00:40.595481: step: 230/466, loss: 0.004075853154063225 2023-01-24 05:00:41.238262: step: 232/466, loss: 0.01301665510982275 2023-01-24 05:00:41.845463: step: 234/466, loss: 0.0003938743029721081 2023-01-24 
05:00:42.462550: step: 236/466, loss: 0.31214165687561035 2023-01-24 05:00:43.112935: step: 238/466, loss: 0.01012321375310421 2023-01-24 05:00:43.677831: step: 240/466, loss: 0.00015666645776946098 2023-01-24 05:00:44.252544: step: 242/466, loss: 0.0032289193477481604 2023-01-24 05:00:44.873729: step: 244/466, loss: 2.2957598048378713e-05 2023-01-24 05:00:45.540115: step: 246/466, loss: 0.6001042723655701 2023-01-24 05:00:46.138251: step: 248/466, loss: 0.003817453980445862 2023-01-24 05:00:46.828148: step: 250/466, loss: 0.006529581733047962 2023-01-24 05:00:47.378250: step: 252/466, loss: 0.0010104465764015913 2023-01-24 05:00:48.010583: step: 254/466, loss: 0.02891940250992775 2023-01-24 05:00:48.617681: step: 256/466, loss: 0.01952749490737915 2023-01-24 05:00:49.193082: step: 258/466, loss: 0.0005210338858887553 2023-01-24 05:00:49.912132: step: 260/466, loss: 0.05206609517335892 2023-01-24 05:00:50.505810: step: 262/466, loss: 0.006175199057906866 2023-01-24 05:00:51.103380: step: 264/466, loss: 0.01327715627849102 2023-01-24 05:00:51.678593: step: 266/466, loss: 0.01733010821044445 2023-01-24 05:00:52.292153: step: 268/466, loss: 0.0005196294514462352 2023-01-24 05:00:52.894149: step: 270/466, loss: 0.0004086203407496214 2023-01-24 05:00:53.542952: step: 272/466, loss: 0.07791987806558609 2023-01-24 05:00:54.141360: step: 274/466, loss: 0.6732566356658936 2023-01-24 05:00:54.699782: step: 276/466, loss: 0.000941718346439302 2023-01-24 05:00:55.369773: step: 278/466, loss: 0.024449041113257408 2023-01-24 05:00:55.966048: step: 280/466, loss: 0.15648572146892548 2023-01-24 05:00:56.570164: step: 282/466, loss: 0.013371721841394901 2023-01-24 05:00:57.166311: step: 284/466, loss: 0.0007062857621349394 2023-01-24 05:00:57.744108: step: 286/466, loss: 0.010413425974547863 2023-01-24 05:00:58.330802: step: 288/466, loss: 0.0002909430186264217 2023-01-24 05:00:58.968651: step: 290/466, loss: 0.02714431658387184 2023-01-24 05:00:59.655067: step: 292/466, loss: 
0.015656817704439163 2023-01-24 05:01:00.303298: step: 294/466, loss: 0.034121233969926834 2023-01-24 05:01:00.913545: step: 296/466, loss: 0.02271353267133236 2023-01-24 05:01:01.483820: step: 298/466, loss: 0.0004174639761913568 2023-01-24 05:01:02.146140: step: 300/466, loss: 0.0024831509217619896 2023-01-24 05:01:02.704820: step: 302/466, loss: 0.0013438124442473054 2023-01-24 05:01:03.328623: step: 304/466, loss: 0.005839589983224869 2023-01-24 05:01:03.967419: step: 306/466, loss: 0.020242193713784218 2023-01-24 05:01:04.678791: step: 308/466, loss: 0.00034786100150085986 2023-01-24 05:01:05.307620: step: 310/466, loss: 0.00068121642107144 2023-01-24 05:01:05.977859: step: 312/466, loss: 0.01574590802192688 2023-01-24 05:01:06.562279: step: 314/466, loss: 0.059288665652275085 2023-01-24 05:01:07.168729: step: 316/466, loss: 2.29655165639997e-06 2023-01-24 05:01:07.819333: step: 318/466, loss: 0.0023216346744447947 2023-01-24 05:01:08.469687: step: 320/466, loss: 0.01720270700752735 2023-01-24 05:01:09.050204: step: 322/466, loss: 0.010333416052162647 2023-01-24 05:01:09.690637: step: 324/466, loss: 0.019171064719557762 2023-01-24 05:01:10.252167: step: 326/466, loss: 0.00010471227869857103 2023-01-24 05:01:10.904882: step: 328/466, loss: 0.00028663675766438246 2023-01-24 05:01:11.568150: step: 330/466, loss: 0.044246140867471695 2023-01-24 05:01:12.196292: step: 332/466, loss: 0.0018243632512167096 2023-01-24 05:01:12.822191: step: 334/466, loss: 0.0005321518983691931 2023-01-24 05:01:13.402050: step: 336/466, loss: 0.0016758107813075185 2023-01-24 05:01:13.969832: step: 338/466, loss: 0.07432331889867783 2023-01-24 05:01:14.557657: step: 340/466, loss: 0.002926712855696678 2023-01-24 05:01:15.144310: step: 342/466, loss: 0.011686854995787144 2023-01-24 05:01:15.810647: step: 344/466, loss: 0.003963540308177471 2023-01-24 05:01:16.418425: step: 346/466, loss: 0.0013557865750044584 2023-01-24 05:01:17.008652: step: 348/466, loss: 0.006266581825911999 
2023-01-24 05:01:17.621088: step: 350/466, loss: 0.005226859822869301 2023-01-24 05:01:18.246222: step: 352/466, loss: 1.3675599802809302e-05 2023-01-24 05:01:18.924289: step: 354/466, loss: 0.0035143913701176643 2023-01-24 05:01:19.600454: step: 356/466, loss: 0.0013365527847781777 2023-01-24 05:01:20.292631: step: 358/466, loss: 0.000295504491077736 2023-01-24 05:01:20.927373: step: 360/466, loss: 0.010652300901710987 2023-01-24 05:01:21.544665: step: 362/466, loss: 0.0011146770557388663 2023-01-24 05:01:22.180270: step: 364/466, loss: 0.16300763189792633 2023-01-24 05:01:22.793563: step: 366/466, loss: 0.00569057185202837 2023-01-24 05:01:23.450734: step: 368/466, loss: 0.04630326107144356 2023-01-24 05:01:24.084857: step: 370/466, loss: 0.00736673828214407 2023-01-24 05:01:24.720546: step: 372/466, loss: 0.24240000545978546 2023-01-24 05:01:25.299063: step: 374/466, loss: 0.00043700882815755904 2023-01-24 05:01:25.934856: step: 376/466, loss: 0.003471532603725791 2023-01-24 05:01:26.573317: step: 378/466, loss: 0.013358261436223984 2023-01-24 05:01:27.254336: step: 380/466, loss: 0.04184216260910034 2023-01-24 05:01:27.850270: step: 382/466, loss: 0.006497319787740707 2023-01-24 05:01:28.496333: step: 384/466, loss: 0.01057523861527443 2023-01-24 05:01:29.134691: step: 386/466, loss: 0.00025049291434697807 2023-01-24 05:01:29.756166: step: 388/466, loss: 0.09662456810474396 2023-01-24 05:01:30.477830: step: 390/466, loss: 0.0034395684488117695 2023-01-24 05:01:31.081650: step: 392/466, loss: 0.04491157457232475 2023-01-24 05:01:31.745289: step: 394/466, loss: 0.00925894919782877 2023-01-24 05:01:32.473705: step: 396/466, loss: 0.008593680337071419 2023-01-24 05:01:33.050179: step: 398/466, loss: 0.000266447284957394 2023-01-24 05:01:33.756193: step: 400/466, loss: 0.0008228529477491975 2023-01-24 05:01:34.274695: step: 402/466, loss: 0.0007457354222424328 2023-01-24 05:01:34.888847: step: 404/466, loss: 0.0006555092404596508 2023-01-24 05:01:35.504276: step: 
406/466, loss: 0.013119361363351345 2023-01-24 05:01:36.152201: step: 408/466, loss: 0.004940561484545469 2023-01-24 05:01:36.804192: step: 410/466, loss: 0.0086433170363307 2023-01-24 05:01:37.409053: step: 412/466, loss: 0.00023552594939246774 2023-01-24 05:01:38.011370: step: 414/466, loss: 0.01769222877919674 2023-01-24 05:01:38.609363: step: 416/466, loss: 0.003074225503951311 2023-01-24 05:01:39.190364: step: 418/466, loss: 0.0005861958488821983 2023-01-24 05:01:39.868690: step: 420/466, loss: 0.012538060545921326 2023-01-24 05:01:40.441088: step: 422/466, loss: 0.06709732115268707 2023-01-24 05:01:41.037822: step: 424/466, loss: 0.021091319620609283 2023-01-24 05:01:41.691194: step: 426/466, loss: 0.006403537467122078 2023-01-24 05:01:42.363145: step: 428/466, loss: 0.03239838406443596 2023-01-24 05:01:42.957232: step: 430/466, loss: 0.0004718708514701575 2023-01-24 05:01:43.556758: step: 432/466, loss: 0.04298185929656029 2023-01-24 05:01:44.180217: step: 434/466, loss: 0.001556994509883225 2023-01-24 05:01:44.834379: step: 436/466, loss: 0.0026232635136693716 2023-01-24 05:01:45.461197: step: 438/466, loss: 0.0005475019570440054 2023-01-24 05:01:46.104210: step: 440/466, loss: 0.00032511187600903213 2023-01-24 05:01:46.742436: step: 442/466, loss: 0.030677784234285355 2023-01-24 05:01:47.356309: step: 444/466, loss: 0.028117796406149864 2023-01-24 05:01:47.989688: step: 446/466, loss: 0.008695948868989944 2023-01-24 05:01:48.608648: step: 448/466, loss: 0.009362148120999336 2023-01-24 05:01:49.173702: step: 450/466, loss: 0.0016575742047280073 2023-01-24 05:01:49.803410: step: 452/466, loss: 0.04321468248963356 2023-01-24 05:01:50.456304: step: 454/466, loss: 0.02245822176337242 2023-01-24 05:01:51.118707: step: 456/466, loss: 0.006367537658661604 2023-01-24 05:01:51.739197: step: 458/466, loss: 0.014981807209551334 2023-01-24 05:01:52.314527: step: 460/466, loss: 0.015233676880598068 2023-01-24 05:01:52.954745: step: 462/466, loss: 0.0007042231736704707 
2023-01-24 05:01:53.668215: step: 464/466, loss: 0.011992243118584156 2023-01-24 05:01:54.245787: step: 466/466, loss: 0.07552166283130646 2023-01-24 05:01:54.818463: step: 468/466, loss: 0.0010221432894468307 2023-01-24 05:01:55.488158: step: 470/466, loss: 0.002009578747674823 2023-01-24 05:01:56.100203: step: 472/466, loss: 0.00045798145583830774 2023-01-24 05:01:56.752524: step: 474/466, loss: 0.0030494979582726955 2023-01-24 05:01:57.416766: step: 476/466, loss: 0.07997559756040573 2023-01-24 05:01:57.949352: step: 478/466, loss: 0.0004329228540882468 2023-01-24 05:01:58.524990: step: 480/466, loss: 0.0002875140926335007 2023-01-24 05:01:59.087773: step: 482/466, loss: 0.039382219314575195 2023-01-24 05:01:59.738328: step: 484/466, loss: 0.029435431584715843 2023-01-24 05:02:00.381702: step: 486/466, loss: 0.008995798416435719 2023-01-24 05:02:01.006309: step: 488/466, loss: 0.0009553866693750024 2023-01-24 05:02:01.577935: step: 490/466, loss: 0.0010284394957125187 2023-01-24 05:02:02.184383: step: 492/466, loss: 0.0042778304778039455 2023-01-24 05:02:02.782071: step: 494/466, loss: 5.903772034798749e-05 2023-01-24 05:02:03.391023: step: 496/466, loss: 0.0021234445739537477 2023-01-24 05:02:03.941929: step: 498/466, loss: 0.01050239522010088 2023-01-24 05:02:04.604062: step: 500/466, loss: 0.008695174008607864 2023-01-24 05:02:05.190579: step: 502/466, loss: 0.004072991665452719 2023-01-24 05:02:05.848536: step: 504/466, loss: 0.002067763125523925 2023-01-24 05:02:06.444433: step: 506/466, loss: 0.02591841109097004 2023-01-24 05:02:07.055934: step: 508/466, loss: 0.012583310715854168 2023-01-24 05:02:07.642672: step: 510/466, loss: 0.009463530965149403 2023-01-24 05:02:08.267453: step: 512/466, loss: 0.0017065029824152589 2023-01-24 05:02:08.909268: step: 514/466, loss: 0.28508394956588745 2023-01-24 05:02:09.544398: step: 516/466, loss: 0.002218435751274228 2023-01-24 05:02:10.188080: step: 518/466, loss: 0.031737249344587326 2023-01-24 05:02:10.762114: 
step: 520/466, loss: 0.004431023262441158 2023-01-24 05:02:11.435025: step: 522/466, loss: 0.002075351309031248 2023-01-24 05:02:12.000989: step: 524/466, loss: 0.027918415144085884 2023-01-24 05:02:12.583606: step: 526/466, loss: 0.0009299259399995208 2023-01-24 05:02:13.165122: step: 528/466, loss: 4.171831096755341e-05 2023-01-24 05:02:13.779687: step: 530/466, loss: 0.00861969031393528 2023-01-24 05:02:14.314663: step: 532/466, loss: 0.000704513571690768 2023-01-24 05:02:14.941667: step: 534/466, loss: 0.023732317611575127 2023-01-24 05:02:15.596371: step: 536/466, loss: 0.006008785683661699 2023-01-24 05:02:16.186791: step: 538/466, loss: 0.0002596065169200301 2023-01-24 05:02:16.875099: step: 540/466, loss: 0.013545497320592403 2023-01-24 05:02:17.473857: step: 542/466, loss: 0.00020320458861533552 2023-01-24 05:02:18.108719: step: 544/466, loss: 0.002717326395213604 2023-01-24 05:02:18.737471: step: 546/466, loss: 0.03454839065670967 2023-01-24 05:02:19.437387: step: 548/466, loss: 0.005441261455416679 2023-01-24 05:02:20.045003: step: 550/466, loss: 0.00033432990312576294 2023-01-24 05:02:20.690712: step: 552/466, loss: 0.0008576879044994712 2023-01-24 05:02:21.387957: step: 554/466, loss: 0.01396965142339468 2023-01-24 05:02:21.942854: step: 556/466, loss: 0.00013199001841712743 2023-01-24 05:02:22.527110: step: 558/466, loss: 0.0009953243425115943 2023-01-24 05:02:23.148008: step: 560/466, loss: 0.0013477897737175226 2023-01-24 05:02:23.807241: step: 562/466, loss: 0.006996339187026024 2023-01-24 05:02:24.353390: step: 564/466, loss: 0.004686971195042133 2023-01-24 05:02:25.029586: step: 566/466, loss: 0.0018786239670589566 2023-01-24 05:02:25.630482: step: 568/466, loss: 0.001708917785435915 2023-01-24 05:02:26.314568: step: 570/466, loss: 0.00029886234551668167 2023-01-24 05:02:26.889232: step: 572/466, loss: 0.0005396566120907664 2023-01-24 05:02:27.511914: step: 574/466, loss: 0.0022396999411284924 2023-01-24 05:02:28.087981: step: 576/466, loss: 
0.039275556802749634 2023-01-24 05:02:28.702185: step: 578/466, loss: 0.00020643284369725734 2023-01-24 05:02:29.344927: step: 580/466, loss: 0.004480894189327955 2023-01-24 05:02:30.038449: step: 582/466, loss: 0.14224693179130554 2023-01-24 05:02:30.568020: step: 584/466, loss: 0.005192149896174669 2023-01-24 05:02:31.265559: step: 586/466, loss: 0.014921769499778748 2023-01-24 05:02:31.879869: step: 588/466, loss: 0.04195178672671318 2023-01-24 05:02:32.607064: step: 590/466, loss: 0.021067453548312187 2023-01-24 05:02:33.238363: step: 592/466, loss: 0.0013888038229197264 2023-01-24 05:02:33.827804: step: 594/466, loss: 9.107340883929282e-05 2023-01-24 05:02:34.402404: step: 596/466, loss: 0.039545539766550064 2023-01-24 05:02:35.031941: step: 598/466, loss: 0.0014226339990273118 2023-01-24 05:02:35.646186: step: 600/466, loss: 0.011670433916151524 2023-01-24 05:02:36.254718: step: 602/466, loss: 0.03440529480576515 2023-01-24 05:02:36.915727: step: 604/466, loss: 0.010719393379986286 2023-01-24 05:02:37.494575: step: 606/466, loss: 0.0003759894461836666 2023-01-24 05:02:38.086198: step: 608/466, loss: 0.028834575787186623 2023-01-24 05:02:38.683948: step: 610/466, loss: 0.014594810083508492 2023-01-24 05:02:39.319495: step: 612/466, loss: 0.001817152719013393 2023-01-24 05:02:39.847774: step: 614/466, loss: 0.027273844927549362 2023-01-24 05:02:40.505281: step: 616/466, loss: 0.016842560842633247 2023-01-24 05:02:41.155736: step: 618/466, loss: 0.005247347056865692 2023-01-24 05:02:41.832034: step: 620/466, loss: 0.05189330130815506 2023-01-24 05:02:42.447874: step: 622/466, loss: 0.002156839007511735 2023-01-24 05:02:43.115665: step: 624/466, loss: 0.037101082503795624 2023-01-24 05:02:43.719625: step: 626/466, loss: 0.021021481603384018 2023-01-24 05:02:44.344068: step: 628/466, loss: 0.011911360546946526 2023-01-24 05:02:44.986227: step: 630/466, loss: 0.3343954086303711 2023-01-24 05:02:45.625647: step: 632/466, loss: 0.0003303734411019832 2023-01-24 
05:02:46.270054: step: 634/466, loss: 0.0020633211825042963 2023-01-24 05:02:46.871310: step: 636/466, loss: 0.02070808969438076 2023-01-24 05:02:47.495446: step: 638/466, loss: 0.6909323930740356 2023-01-24 05:02:48.084004: step: 640/466, loss: 0.030152659863233566 2023-01-24 05:02:48.688170: step: 642/466, loss: 0.009435637854039669 2023-01-24 05:02:49.328838: step: 644/466, loss: 0.022851625457406044 2023-01-24 05:02:49.934452: step: 646/466, loss: 0.004921406973153353 2023-01-24 05:02:50.658354: step: 648/466, loss: 0.007689262740314007 2023-01-24 05:02:51.205513: step: 650/466, loss: 1.8792456103255972e-05 2023-01-24 05:02:51.801386: step: 652/466, loss: 0.001837368356063962 2023-01-24 05:02:52.458342: step: 654/466, loss: 0.5769074559211731 2023-01-24 05:02:53.064515: step: 656/466, loss: 0.025163520127534866 2023-01-24 05:02:53.667126: step: 658/466, loss: 0.3724534213542938 2023-01-24 05:02:54.255304: step: 660/466, loss: 0.006495846901088953 2023-01-24 05:02:54.889324: step: 662/466, loss: 0.737528383731842 2023-01-24 05:02:55.478413: step: 664/466, loss: 0.030375540256500244 2023-01-24 05:02:56.054125: step: 666/466, loss: 6.750689499313012e-05 2023-01-24 05:02:56.576349: step: 668/466, loss: 0.0002596633567009121 2023-01-24 05:02:57.213304: step: 670/466, loss: 0.00030843622516840696 2023-01-24 05:02:57.787844: step: 672/466, loss: 0.003468897892162204 2023-01-24 05:02:58.425617: step: 674/466, loss: 1.4518646001815796 2023-01-24 05:02:59.034958: step: 676/466, loss: 0.003203595755621791 2023-01-24 05:02:59.819768: step: 678/466, loss: 0.0002415669005131349 2023-01-24 05:03:00.407372: step: 680/466, loss: 0.1821751594543457 2023-01-24 05:03:01.000685: step: 682/466, loss: 0.022470567375421524 2023-01-24 05:03:01.612773: step: 684/466, loss: 0.0009556738659739494 2023-01-24 05:03:02.201222: step: 686/466, loss: 0.008491684682667255 2023-01-24 05:03:02.913217: step: 688/466, loss: 0.10258602350950241 2023-01-24 05:03:03.482342: step: 690/466, loss: 
0.0031227143481373787 2023-01-24 05:03:04.084020: step: 692/466, loss: 0.0009964940836653113 2023-01-24 05:03:04.644683: step: 694/466, loss: 0.001099542947486043 2023-01-24 05:03:05.314986: step: 696/466, loss: 0.011481222696602345 2023-01-24 05:03:05.975371: step: 698/466, loss: 3.475100517272949 2023-01-24 05:03:06.605297: step: 700/466, loss: 0.002428211271762848 2023-01-24 05:03:07.265935: step: 702/466, loss: 0.00325836637057364 2023-01-24 05:03:07.891979: step: 704/466, loss: 0.06448805332183838 2023-01-24 05:03:08.477313: step: 706/466, loss: 19.545780181884766 2023-01-24 05:03:09.090092: step: 708/466, loss: 0.00236521870829165 2023-01-24 05:03:09.746858: step: 710/466, loss: 0.007757192011922598 2023-01-24 05:03:10.380054: step: 712/466, loss: 0.47919291257858276 2023-01-24 05:03:11.010863: step: 714/466, loss: 0.06101994961500168 2023-01-24 05:03:11.667577: step: 716/466, loss: 0.002332099014893174 2023-01-24 05:03:12.301897: step: 718/466, loss: 2.3008902644505724e-05 2023-01-24 05:03:12.851366: step: 720/466, loss: 0.015303296968340874 2023-01-24 05:03:13.485117: step: 722/466, loss: 0.009421579539775848 2023-01-24 05:03:14.028153: step: 724/466, loss: 3.652509258245118e-05 2023-01-24 05:03:14.653185: step: 726/466, loss: 0.008834085427224636 2023-01-24 05:03:15.218201: step: 728/466, loss: 4.781947791343555e-05 2023-01-24 05:03:15.833195: step: 730/466, loss: 0.004869956523180008 2023-01-24 05:03:16.396624: step: 732/466, loss: 0.0074061122722923756 2023-01-24 05:03:17.035095: step: 734/466, loss: 0.001221657614223659 2023-01-24 05:03:17.659035: step: 736/466, loss: 0.026640279218554497 2023-01-24 05:03:18.245570: step: 738/466, loss: 0.0030338140204548836 2023-01-24 05:03:18.915747: step: 740/466, loss: 0.03298734873533249 2023-01-24 05:03:19.487590: step: 742/466, loss: 0.06399720907211304 2023-01-24 05:03:20.111839: step: 744/466, loss: 0.01769380457699299 2023-01-24 05:03:20.719695: step: 746/466, loss: 0.030083784833550453 2023-01-24 
05:03:21.363695: step: 748/466, loss: 0.001962339971214533 2023-01-24 05:03:21.947334: step: 750/466, loss: 0.009762940928339958 2023-01-24 05:03:22.518940: step: 752/466, loss: 0.005155081860721111 2023-01-24 05:03:23.142416: step: 754/466, loss: 0.03969765082001686 2023-01-24 05:03:23.801753: step: 756/466, loss: 0.00046790673513896763 2023-01-24 05:03:24.371662: step: 758/466, loss: 9.174644947052002e-05 2023-01-24 05:03:24.961099: step: 760/466, loss: 0.001274529262445867 2023-01-24 05:03:25.607323: step: 762/466, loss: 0.009623071178793907 2023-01-24 05:03:26.248617: step: 764/466, loss: 0.0006088234367780387 2023-01-24 05:03:26.916196: step: 766/466, loss: 0.005249843932688236 2023-01-24 05:03:27.464514: step: 768/466, loss: 0.00038335684803314507 2023-01-24 05:03:28.090161: step: 770/466, loss: 0.015080186538398266 2023-01-24 05:03:28.690520: step: 772/466, loss: 0.00324824801646173 2023-01-24 05:03:29.288202: step: 774/466, loss: 0.0041448757983744144 2023-01-24 05:03:29.946170: step: 776/466, loss: 0.0011435933411121368 2023-01-24 05:03:30.576355: step: 778/466, loss: 0.0011415882036089897 2023-01-24 05:03:31.162696: step: 780/466, loss: 0.0017376645701006055 2023-01-24 05:03:31.901100: step: 782/466, loss: 0.06268323957920074 2023-01-24 05:03:32.529537: step: 784/466, loss: 0.04164966568350792 2023-01-24 05:03:33.222156: step: 786/466, loss: 0.05158247798681259 2023-01-24 05:03:33.802396: step: 788/466, loss: 0.28123775124549866 2023-01-24 05:03:34.378329: step: 790/466, loss: 0.002174936467781663 2023-01-24 05:03:35.052231: step: 792/466, loss: 0.009450436569750309 2023-01-24 05:03:35.710167: step: 794/466, loss: 0.00193881057202816 2023-01-24 05:03:36.358438: step: 796/466, loss: 0.009928989224135876 2023-01-24 05:03:36.933856: step: 798/466, loss: 2.243615199404303e-06 2023-01-24 05:03:37.663840: step: 800/466, loss: 0.001064679236151278 2023-01-24 05:03:38.290159: step: 802/466, loss: 0.0488688126206398 2023-01-24 05:03:38.868258: step: 804/466, loss: 
0.0025524573866277933 2023-01-24 05:03:39.477305: step: 806/466, loss: 0.010675763711333275 2023-01-24 05:03:40.094663: step: 808/466, loss: 0.00010395424033049494 2023-01-24 05:03:40.703039: step: 810/466, loss: 0.0003952126717194915 2023-01-24 05:03:41.316247: step: 812/466, loss: 0.04357978701591492 2023-01-24 05:03:41.936526: step: 814/466, loss: 0.0015918684657663107 2023-01-24 05:03:42.541998: step: 816/466, loss: 0.004090688657015562 2023-01-24 05:03:43.164886: step: 818/466, loss: 2.7879799745278433e-05 2023-01-24 05:03:43.781572: step: 820/466, loss: 0.010990914888679981 2023-01-24 05:03:44.394240: step: 822/466, loss: 0.006704024970531464 2023-01-24 05:03:45.029929: step: 824/466, loss: 0.014601575210690498 2023-01-24 05:03:45.622201: step: 826/466, loss: 0.01189486589282751 2023-01-24 05:03:46.230806: step: 828/466, loss: 0.1457836776971817 2023-01-24 05:03:46.819180: step: 830/466, loss: 6.903712346684188e-06 2023-01-24 05:03:47.439370: step: 832/466, loss: 0.000658975972328335 2023-01-24 05:03:48.062858: step: 834/466, loss: 0.004803813993930817 2023-01-24 05:03:48.673206: step: 836/466, loss: 0.07124597579240799 2023-01-24 05:03:49.276751: step: 838/466, loss: 0.00013458853936754167 2023-01-24 05:03:49.887277: step: 840/466, loss: 0.016244329512119293 2023-01-24 05:03:50.606254: step: 842/466, loss: 0.027946149930357933 2023-01-24 05:03:51.223039: step: 844/466, loss: 0.002878320636227727 2023-01-24 05:03:51.824316: step: 846/466, loss: 0.0013908261898905039 2023-01-24 05:03:52.414073: step: 848/466, loss: 3.669026045827195e-05 2023-01-24 05:03:53.022754: step: 850/466, loss: 0.014631629921495914 2023-01-24 05:03:53.608014: step: 852/466, loss: 0.038550667464733124 2023-01-24 05:03:54.182294: step: 854/466, loss: 0.0003395678650122136 2023-01-24 05:03:54.816847: step: 856/466, loss: 0.015413991175591946 2023-01-24 05:03:55.425240: step: 858/466, loss: 0.5240403413772583 2023-01-24 05:03:56.163154: step: 860/466, loss: 0.04657680168747902 2023-01-24 
05:03:56.810353: step: 862/466, loss: 0.000312518939608708 2023-01-24 05:03:57.484358: step: 864/466, loss: 0.011041994206607342 2023-01-24 05:03:58.131034: step: 866/466, loss: 0.005065944045782089 2023-01-24 05:03:58.751157: step: 868/466, loss: 0.11262018233537674 2023-01-24 05:03:59.321985: step: 870/466, loss: 0.00018041301518678665 2023-01-24 05:03:59.898599: step: 872/466, loss: 0.0028136800974607468 2023-01-24 05:04:00.501437: step: 874/466, loss: 0.0003731518518179655 2023-01-24 05:04:01.128760: step: 876/466, loss: 0.03876349702477455 2023-01-24 05:04:01.836133: step: 878/466, loss: 0.00015004277520347387 2023-01-24 05:04:02.524637: step: 880/466, loss: 0.08454664051532745 2023-01-24 05:04:03.097794: step: 882/466, loss: 0.0009456683765165508 2023-01-24 05:04:03.767612: step: 884/466, loss: 0.0033473100047558546 2023-01-24 05:04:04.420373: step: 886/466, loss: 0.005513956304639578 2023-01-24 05:04:05.049394: step: 888/466, loss: 0.018945837393403053 2023-01-24 05:04:05.638982: step: 890/466, loss: 0.000268537609372288 2023-01-24 05:04:06.305167: step: 892/466, loss: 0.0008538711117580533 2023-01-24 05:04:06.923643: step: 894/466, loss: 0.049071550369262695 2023-01-24 05:04:07.584630: step: 896/466, loss: 0.023004358634352684 2023-01-24 05:04:08.209738: step: 898/466, loss: 0.03628868982195854 2023-01-24 05:04:08.896312: step: 900/466, loss: 0.00012156509910710156 2023-01-24 05:04:09.527594: step: 902/466, loss: 0.0007333770045079291 2023-01-24 05:04:10.132757: step: 904/466, loss: 0.012230328284204006 2023-01-24 05:04:10.791819: step: 906/466, loss: 0.021114924922585487 2023-01-24 05:04:11.408704: step: 908/466, loss: 0.0005755637539550662 2023-01-24 05:04:12.021265: step: 910/466, loss: 0.001959437970072031 2023-01-24 05:04:12.660410: step: 912/466, loss: 0.003534089308232069 2023-01-24 05:04:13.428756: step: 914/466, loss: 0.025242313742637634 2023-01-24 05:04:14.083396: step: 916/466, loss: 0.0010080791544169188 2023-01-24 05:04:14.685595: step: 
918/466, loss: 0.002278506988659501 2023-01-24 05:04:15.348374: step: 920/466, loss: 9.028924978338182e-05 2023-01-24 05:04:15.969074: step: 922/466, loss: 0.004241098649799824 2023-01-24 05:04:16.612015: step: 924/466, loss: 0.02510761469602585 2023-01-24 05:04:17.218706: step: 926/466, loss: 0.0008595963008701801 2023-01-24 05:04:17.889088: step: 928/466, loss: 0.004858372732996941 2023-01-24 05:04:18.485632: step: 930/466, loss: 0.0005197401624172926 2023-01-24 05:04:19.112582: step: 932/466, loss: 0.03285631164908409
==================================================
Loss: 0.083
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3482924363153447, 'r': 0.3317700247254327, 'f1': 0.33983052095296995}, 'combined': 0.2504014364916621, 'epoch': 37}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3655986081301611, 'r': 0.3084047031417781, 'f1': 0.3345750037834386}, 'combined': 0.2218943030273582, 'epoch': 37}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3398944636678201, 'r': 0.2735892936720143, 'f1': 0.3031587556323684}, 'combined': 0.2021058370882456, 'epoch': 37}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3711168003071324, 'r': 0.28540037670678264, 'f1': 0.3226629197780349}, 'combined': 0.21058001080250696, 'epoch': 37}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33845733174116954, 'r': 0.33075052342827765, 'f1': 0.3345595505694863}, 'combined': 0.2465175635775162, 'epoch': 37}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36164126771888916, 'r': 0.3062209176009295, 'f1': 0.33163165478581674}, 'combined': 0.21994223737090435, 'epoch': 37}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3452380952380952, 'r': 0.3452380952380952, 'f1': 0.3452380952380952}, 'combined': 0.23015873015873012, 'epoch': 37}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5294117647058824, 'r': 0.391304347826087, 'f1': 0.45}, 'combined': 0.3, 'epoch': 37}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 37}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33670273348519364, 'r': 0.2799479166666667, 'f1': 0.3057135470527405}, 'combined': 0.20380903136849365, 'epoch': 35}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3696950279336589, 'r': 0.27919010327515936, 'f1': 0.31813086188869816}, 'combined': 0.20762224670630824, 'epoch': 35}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.43478260869565216, 'f1': 0.4878048780487805}, 'combined': 0.3252032520325203, 'epoch': 35}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33845733174116954, 'r': 0.33075052342827765, 'f1': 0.3345595505694863}, 'combined': 0.2465175635775162, 'epoch': 37}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36164126771888916, 'r': 0.3062209176009295, 'f1': 0.33163165478581674}, 'combined': 0.21994223737090435, 'epoch': 37}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 37}
******************************
Epoch: 38
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-24 05:06:58.424546: step: 2/466, loss: 0.021889768540859222 2023-01-24 05:06:59.016646: step: 4/466, loss: 0.06989212334156036 2023-01-24 05:06:59.651944: step: 6/466, loss: 0.004475723020732403 2023-01-24 05:07:00.215749: step: 8/466, loss: 0.054539233446121216 2023-01-24 05:07:00.950488: step: 10/466, loss: 0.04412826523184776 2023-01-24 05:07:01.556955: step: 12/466, loss: 0.0008282770286314189 2023-01-24 05:07:02.144761: step: 14/466, loss: 0.00038537505315616727 2023-01-24 05:07:02.815485: step: 16/466, loss: 0.005497610196471214 2023-01-24 05:07:03.395635: step: 18/466, loss: 0.0009423164883628488 2023-01-24 05:07:04.023658: step: 20/466, loss: 0.02297457680106163 2023-01-24 05:07:04.643117: step: 22/466, loss: 0.041743602603673935 2023-01-24 05:07:05.230044: step: 24/466, loss: 0.0004014494188595563 2023-01-24 05:07:05.850601: step: 26/466, loss: 0.003360750386491418 2023-01-24 05:07:06.497360: step: 28/466, loss: 0.001229410874657333 2023-01-24 05:07:07.154061: step: 30/466, loss: 
0.007953322492539883 2023-01-24 05:07:07.770597: step: 32/466, loss: 0.5298741459846497 2023-01-24 05:07:08.450169: step: 34/466, loss: 0.0012196226743981242 2023-01-24 05:07:09.096870: step: 36/466, loss: 0.23362146317958832 2023-01-24 05:07:09.694786: step: 38/466, loss: 0.002468062797561288 2023-01-24 05:07:10.376008: step: 40/466, loss: 0.06696293503046036 2023-01-24 05:07:11.018984: step: 42/466, loss: 0.013891519047319889 2023-01-24 05:07:11.700439: step: 44/466, loss: 6.297170330071822e-05 2023-01-24 05:07:12.296343: step: 46/466, loss: 0.06854761391878128 2023-01-24 05:07:12.880631: step: 48/466, loss: 0.11188627034425735 2023-01-24 05:07:13.494271: step: 50/466, loss: 0.002684258623048663 2023-01-24 05:07:14.152385: step: 52/466, loss: 2.0872628283541417e-06 2023-01-24 05:07:14.782791: step: 54/466, loss: 0.03521451726555824 2023-01-24 05:07:15.421296: step: 56/466, loss: 0.0017311619594693184 2023-01-24 05:07:16.006722: step: 58/466, loss: 0.006770215928554535 2023-01-24 05:07:16.658066: step: 60/466, loss: 0.0028816615231335163 2023-01-24 05:07:17.298968: step: 62/466, loss: 0.0008021043031476438 2023-01-24 05:07:17.950687: step: 64/466, loss: 0.0010710136266425252 2023-01-24 05:07:18.623617: step: 66/466, loss: 0.00023852888261899352 2023-01-24 05:07:19.258757: step: 68/466, loss: 0.00030558169237338006 2023-01-24 05:07:19.903470: step: 70/466, loss: 0.0077896034345030785 2023-01-24 05:07:20.452171: step: 72/466, loss: 0.0009368745377287269 2023-01-24 05:07:21.109105: step: 74/466, loss: 0.0009904556209221482 2023-01-24 05:07:21.664401: step: 76/466, loss: 0.00010989147267537192 2023-01-24 05:07:22.284800: step: 78/466, loss: 0.02084057591855526 2023-01-24 05:07:22.928937: step: 80/466, loss: 0.4380775988101959 2023-01-24 05:07:23.568134: step: 82/466, loss: 0.010385693050920963 2023-01-24 05:07:24.240560: step: 84/466, loss: 0.010272561572492123 2023-01-24 05:07:24.837262: step: 86/466, loss: 0.0013637507800012827 2023-01-24 05:07:25.383037: step: 
88/466, loss: 0.000447495753178373 2023-01-24 05:07:26.039222: step: 90/466, loss: 0.000879900180734694 2023-01-24 05:07:26.640275: step: 92/466, loss: 0.03361436724662781 2023-01-24 05:07:27.296019: step: 94/466, loss: 0.0165041983127594 2023-01-24 05:07:27.931025: step: 96/466, loss: 0.022173665463924408 2023-01-24 05:07:28.519961: step: 98/466, loss: 0.005095276981592178 2023-01-24 05:07:29.088329: step: 100/466, loss: 0.00683171022683382 2023-01-24 05:07:29.717819: step: 102/466, loss: 0.01334309671074152 2023-01-24 05:07:30.347901: step: 104/466, loss: 5.372382656787522e-05 2023-01-24 05:07:31.010607: step: 106/466, loss: 0.0027603597845882177 2023-01-24 05:07:31.600041: step: 108/466, loss: 0.046707384288311005 2023-01-24 05:07:32.212043: step: 110/466, loss: 6.69806604491896e-06 2023-01-24 05:07:32.824291: step: 112/466, loss: 0.0007423114730045199 2023-01-24 05:07:33.468560: step: 114/466, loss: 0.002536195795983076 2023-01-24 05:07:34.097556: step: 116/466, loss: 0.0004711195360869169 2023-01-24 05:07:34.644125: step: 118/466, loss: 0.010566012002527714 2023-01-24 05:07:35.309667: step: 120/466, loss: 0.004253188613802195 2023-01-24 05:07:35.940735: step: 122/466, loss: 0.06324297934770584 2023-01-24 05:07:36.548239: step: 124/466, loss: 0.015077656134963036 2023-01-24 05:07:37.239247: step: 126/466, loss: 0.0012931502424180508 2023-01-24 05:07:37.879396: step: 128/466, loss: 0.010562596842646599 2023-01-24 05:07:38.522069: step: 130/466, loss: 0.13915599882602692 2023-01-24 05:07:39.157974: step: 132/466, loss: 0.016481924802064896 2023-01-24 05:07:39.762548: step: 134/466, loss: 0.019297005608677864 2023-01-24 05:07:40.433883: step: 136/466, loss: 0.06798158586025238 2023-01-24 05:07:41.081453: step: 138/466, loss: 0.02616349793970585 2023-01-24 05:07:41.718728: step: 140/466, loss: 0.06994859874248505 2023-01-24 05:07:42.345017: step: 142/466, loss: 0.01263127289712429 2023-01-24 05:07:42.960993: step: 144/466, loss: 0.028079643845558167 2023-01-24 
05:07:43.540640: step: 146/466, loss: 0.00032038360950537026 2023-01-24 05:07:44.171380: step: 148/466, loss: 0.039556972682476044 2023-01-24 05:07:44.749117: step: 150/466, loss: 2.667648186616134e-05 2023-01-24 05:07:45.365334: step: 152/466, loss: 0.00024154878337867558 2023-01-24 05:07:46.119740: step: 154/466, loss: 0.010044950991868973 2023-01-24 05:07:46.686126: step: 156/466, loss: 0.00782605167478323 2023-01-24 05:07:47.307097: step: 158/466, loss: 0.00023160962155088782 2023-01-24 05:07:47.866022: step: 160/466, loss: 0.056918852031230927 2023-01-24 05:07:48.442883: step: 162/466, loss: 0.0318228043615818 2023-01-24 05:07:49.059131: step: 164/466, loss: 0.0011338344775140285 2023-01-24 05:07:49.667815: step: 166/466, loss: 0.028528816998004913 2023-01-24 05:07:50.264061: step: 168/466, loss: 0.0003143611247651279 2023-01-24 05:07:50.838473: step: 170/466, loss: 0.02056513912975788 2023-01-24 05:07:51.459343: step: 172/466, loss: 0.0010503178928047419 2023-01-24 05:07:52.074366: step: 174/466, loss: 0.025755485519766808 2023-01-24 05:07:52.731894: step: 176/466, loss: 3.9795515476725996e-05 2023-01-24 05:07:53.387816: step: 178/466, loss: 0.031630728393793106 2023-01-24 05:07:54.006078: step: 180/466, loss: 0.061293408274650574 2023-01-24 05:07:54.604673: step: 182/466, loss: 0.010040693916380405 2023-01-24 05:07:55.215320: step: 184/466, loss: 1.6368253231048584 2023-01-24 05:07:55.792401: step: 186/466, loss: 0.00037069921381771564 2023-01-24 05:07:56.404650: step: 188/466, loss: 0.0012840436538681388 2023-01-24 05:07:56.989693: step: 190/466, loss: 0.001338571310043335 2023-01-24 05:07:57.640753: step: 192/466, loss: 0.013955621048808098 2023-01-24 05:07:58.239273: step: 194/466, loss: 0.04023696482181549 2023-01-24 05:07:58.896639: step: 196/466, loss: 0.0016070693964138627 2023-01-24 05:07:59.439076: step: 198/466, loss: 0.0025192557368427515 2023-01-24 05:08:00.084668: step: 200/466, loss: 0.0037205498665571213 2023-01-24 05:08:00.742567: step: 
202/466, loss: 0.008088977076113224 2023-01-24 05:08:01.358204: step: 204/466, loss: 0.006108959671109915 2023-01-24 05:08:01.946496: step: 206/466, loss: 0.0037351585924625397 2023-01-24 05:08:02.564691: step: 208/466, loss: 0.04453587904572487 2023-01-24 05:08:03.136675: step: 210/466, loss: 0.00010247220052406192 2023-01-24 05:08:03.801433: step: 212/466, loss: 0.07220045477151871 2023-01-24 05:08:04.416323: step: 214/466, loss: 0.03526448830962181 2023-01-24 05:08:05.016644: step: 216/466, loss: 0.0002246480726171285 2023-01-24 05:08:05.617405: step: 218/466, loss: 0.16548627614974976 2023-01-24 05:08:06.195577: step: 220/466, loss: 0.001846380764618516 2023-01-24 05:08:06.812292: step: 222/466, loss: 0.03966742008924484 2023-01-24 05:08:07.466255: step: 224/466, loss: 0.0020343000069260597 2023-01-24 05:08:08.056716: step: 226/466, loss: 0.005603497382253408 2023-01-24 05:08:08.616768: step: 228/466, loss: 0.010272718966007233 2023-01-24 05:08:09.245734: step: 230/466, loss: 0.035140909254550934 2023-01-24 05:08:09.846626: step: 232/466, loss: 0.008486488834023476 2023-01-24 05:08:10.386512: step: 234/466, loss: 0.004173364490270615 2023-01-24 05:08:10.987953: step: 236/466, loss: 0.004337500315159559 2023-01-24 05:08:11.611059: step: 238/466, loss: 0.006094334181398153 2023-01-24 05:08:12.341270: step: 240/466, loss: 0.007469480391591787 2023-01-24 05:08:12.928359: step: 242/466, loss: 0.0025166794657707214 2023-01-24 05:08:13.509942: step: 244/466, loss: 0.04684692248702049 2023-01-24 05:08:14.146393: step: 246/466, loss: 0.014013230800628662 2023-01-24 05:08:14.772102: step: 248/466, loss: 0.0654238760471344 2023-01-24 05:08:15.319527: step: 250/466, loss: 0.0053533343598246574 2023-01-24 05:08:15.955453: step: 252/466, loss: 0.00657287985086441 2023-01-24 05:08:16.506597: step: 254/466, loss: 0.0019773482345044613 2023-01-24 05:08:17.120684: step: 256/466, loss: 0.009218761697411537 2023-01-24 05:08:17.739607: step: 258/466, loss: 0.0005720287445001304 
2023-01-24 05:08:18.349111: step: 260/466, loss: 0.0006210109568201005 2023-01-24 05:08:19.009427: step: 262/466, loss: 0.004660551901906729 2023-01-24 05:08:19.730466: step: 264/466, loss: 0.004938114434480667 2023-01-24 05:08:20.355167: step: 266/466, loss: 0.0023985709995031357 2023-01-24 05:08:20.959368: step: 268/466, loss: 0.0012294613989070058 2023-01-24 05:08:21.590652: step: 270/466, loss: 0.0014288395177572966 2023-01-24 05:08:22.282457: step: 272/466, loss: 0.0003633807064034045 2023-01-24 05:08:22.917877: step: 274/466, loss: 0.008363694883883 2023-01-24 05:08:23.570767: step: 276/466, loss: 0.009151811711490154 2023-01-24 05:08:24.196209: step: 278/466, loss: 0.005131376441568136 2023-01-24 05:08:24.813486: step: 280/466, loss: 0.0008800456416793168 2023-01-24 05:08:25.451271: step: 282/466, loss: 0.00010396930883871391 2023-01-24 05:08:26.056238: step: 284/466, loss: 0.002452635671943426 2023-01-24 05:08:26.631114: step: 286/466, loss: 0.010570264421403408 2023-01-24 05:08:27.241932: step: 288/466, loss: 0.05791933462023735 2023-01-24 05:08:27.810707: step: 290/466, loss: 0.0001513109600637108 2023-01-24 05:08:28.436552: step: 292/466, loss: 0.527622640132904 2023-01-24 05:08:29.016467: step: 294/466, loss: 0.03949645906686783 2023-01-24 05:08:29.621649: step: 296/466, loss: 6.994983673095703 2023-01-24 05:08:30.267211: step: 298/466, loss: 0.00429984787479043 2023-01-24 05:08:30.887548: step: 300/466, loss: 0.018944447860121727 2023-01-24 05:08:31.491983: step: 302/466, loss: 0.006439780816435814 2023-01-24 05:08:32.154130: step: 304/466, loss: 0.001012927619740367 2023-01-24 05:08:32.748491: step: 306/466, loss: 0.029987772926688194 2023-01-24 05:08:33.327283: step: 308/466, loss: 0.0018202938372269273 2023-01-24 05:08:34.004265: step: 310/466, loss: 0.014104348607361317 2023-01-24 05:08:34.663853: step: 312/466, loss: 0.009202539920806885 2023-01-24 05:08:35.382320: step: 314/466, loss: 0.006665595341473818 2023-01-24 05:08:36.076953: step: 
316/466, loss: 0.022786777466535568 2023-01-24 05:08:36.699816: step: 318/466, loss: 0.1801539659500122 2023-01-24 05:08:37.267862: step: 320/466, loss: 0.0052084228955209255 2023-01-24 05:08:37.921092: step: 322/466, loss: 0.010929033160209656 2023-01-24 05:08:38.621759: step: 324/466, loss: 0.0002629157970659435 2023-01-24 05:08:39.301453: step: 326/466, loss: 0.003500701393932104 2023-01-24 05:08:39.993810: step: 328/466, loss: 0.0722888931632042 2023-01-24 05:08:40.577475: step: 330/466, loss: 0.0023090096656233072 2023-01-24 05:08:41.226619: step: 332/466, loss: 0.00232186121866107 2023-01-24 05:08:41.925870: step: 334/466, loss: 0.000827113923151046 2023-01-24 05:08:42.507806: step: 336/466, loss: 0.01287727802991867 2023-01-24 05:08:43.111799: step: 338/466, loss: 2.9299275411176495e-05 2023-01-24 05:08:43.751624: step: 340/466, loss: 0.00022971018915995955 2023-01-24 05:08:44.281672: step: 342/466, loss: 0.20835307240486145 2023-01-24 05:08:44.895396: step: 344/466, loss: 0.0007434620638377964 2023-01-24 05:08:45.575492: step: 346/466, loss: 0.0011136470129713416 2023-01-24 05:08:46.249120: step: 348/466, loss: 0.0001844180515035987 2023-01-24 05:08:46.906931: step: 350/466, loss: 7.553988689323887e-05 2023-01-24 05:08:47.452420: step: 352/466, loss: 0.005245328415185213 2023-01-24 05:08:48.014646: step: 354/466, loss: 9.995513391913846e-05 2023-01-24 05:08:48.616687: step: 356/466, loss: 0.0033678971230983734 2023-01-24 05:08:49.307661: step: 358/466, loss: 0.007808188907802105 2023-01-24 05:08:49.957777: step: 360/466, loss: 0.03540022298693657 2023-01-24 05:08:50.563097: step: 362/466, loss: 0.00361061654984951 2023-01-24 05:08:51.150593: step: 364/466, loss: 0.007014125119894743 2023-01-24 05:08:51.749427: step: 366/466, loss: 0.13524511456489563 2023-01-24 05:08:52.408794: step: 368/466, loss: 0.007791826035827398 2023-01-24 05:08:52.985336: step: 370/466, loss: 0.010278028436005116 2023-01-24 05:08:53.668343: step: 372/466, loss: 0.0022314770612865686 
2023-01-24 05:08:54.229506: step: 374/466, loss: 0.0009229807765223086 2023-01-24 05:08:54.917753: step: 376/466, loss: 0.028267759829759598 2023-01-24 05:08:55.568733: step: 378/466, loss: 0.0022364002652466297 2023-01-24 05:08:56.180864: step: 380/466, loss: 0.0007268958725035191 2023-01-24 05:08:56.780580: step: 382/466, loss: 0.005729918368160725 2023-01-24 05:08:57.430772: step: 384/466, loss: 4.04190068366006e-05 2023-01-24 05:08:57.973993: step: 386/466, loss: 0.0007473346777260303 2023-01-24 05:08:58.592472: step: 388/466, loss: 0.012823650613427162 2023-01-24 05:08:59.209274: step: 390/466, loss: 0.13350141048431396 2023-01-24 05:08:59.881292: step: 392/466, loss: 0.04062308743596077 2023-01-24 05:09:00.545241: step: 394/466, loss: 0.023992329835891724 2023-01-24 05:09:01.105218: step: 396/466, loss: 0.0023777533788233995 2023-01-24 05:09:01.712879: step: 398/466, loss: 0.36690983176231384 2023-01-24 05:09:02.366353: step: 400/466, loss: 0.06401252001523972 2023-01-24 05:09:03.024978: step: 402/466, loss: 0.005016879644244909 2023-01-24 05:09:03.672356: step: 404/466, loss: 0.0005979741690680385 2023-01-24 05:09:04.318974: step: 406/466, loss: 0.08845721185207367 2023-01-24 05:09:04.902245: step: 408/466, loss: 0.04313786327838898 2023-01-24 05:09:05.510579: step: 410/466, loss: 0.0065497104078531265 2023-01-24 05:09:06.056287: step: 412/466, loss: 0.0006080567254684865 2023-01-24 05:09:06.691977: step: 414/466, loss: 0.008212543092668056 2023-01-24 05:09:07.320394: step: 416/466, loss: 2.887671689677518e-05 2023-01-24 05:09:07.928162: step: 418/466, loss: 0.016217680647969246 2023-01-24 05:09:08.580601: step: 420/466, loss: 0.5591292381286621 2023-01-24 05:09:09.244237: step: 422/466, loss: 0.002283054869621992 2023-01-24 05:09:09.812835: step: 424/466, loss: 0.001116280909627676 2023-01-24 05:09:10.461371: step: 426/466, loss: 0.0035510158631950617 2023-01-24 05:09:11.061979: step: 428/466, loss: 0.010575790889561176 2023-01-24 05:09:11.665183: step: 
430/466, loss: 0.04808664321899414 2023-01-24 05:09:12.265054: step: 432/466, loss: 5.420338766271016e-06 2023-01-24 05:09:12.842026: step: 434/466, loss: 0.0012021951843053102 2023-01-24 05:09:13.532995: step: 436/466, loss: 3.396605825400911e-05 2023-01-24 05:09:14.159152: step: 438/466, loss: 0.02876610867679119 2023-01-24 05:09:14.762119: step: 440/466, loss: 0.0016284855082631111 2023-01-24 05:09:15.391855: step: 442/466, loss: 0.024594612419605255 2023-01-24 05:09:16.095966: step: 444/466, loss: 0.004494968801736832 2023-01-24 05:09:16.686201: step: 446/466, loss: 0.009465239942073822 2023-01-24 05:09:17.317353: step: 448/466, loss: 2.231669714092277e-05 2023-01-24 05:09:17.976442: step: 450/466, loss: 0.0015232398873195052 2023-01-24 05:09:18.584319: step: 452/466, loss: 0.0008695161668583751 2023-01-24 05:09:19.165105: step: 454/466, loss: 0.2243443727493286 2023-01-24 05:09:19.735502: step: 456/466, loss: 0.0012619862100109458 2023-01-24 05:09:20.373963: step: 458/466, loss: 0.00047538834041915834 2023-01-24 05:09:21.038196: step: 460/466, loss: 0.002151229651644826 2023-01-24 05:09:21.683064: step: 462/466, loss: 0.005750596057623625 2023-01-24 05:09:22.341708: step: 464/466, loss: 0.0219623614102602 2023-01-24 05:09:22.953511: step: 466/466, loss: 0.0009176198509521782 2023-01-24 05:09:23.588788: step: 468/466, loss: 0.0002723286161199212 2023-01-24 05:09:24.236660: step: 470/466, loss: 0.005967268254607916 2023-01-24 05:09:24.865400: step: 472/466, loss: 0.021503537893295288 2023-01-24 05:09:25.478393: step: 474/466, loss: 0.12058486044406891 2023-01-24 05:09:26.127966: step: 476/466, loss: 0.16243629157543182 2023-01-24 05:09:26.757441: step: 478/466, loss: 0.00038640364073216915 2023-01-24 05:09:27.384711: step: 480/466, loss: 0.008760428056120872 2023-01-24 05:09:28.026327: step: 482/466, loss: 0.0006672271993011236 2023-01-24 05:09:28.700694: step: 484/466, loss: 0.0004302192246541381 2023-01-24 05:09:29.340227: step: 486/466, loss: 
2.0018022041767836e-05 2023-01-24 05:09:30.057182: step: 488/466, loss: 0.0021745488047599792 2023-01-24 05:09:30.670291: step: 490/466, loss: 0.005278285127133131 2023-01-24 05:09:31.306098: step: 492/466, loss: 0.02675752528011799 2023-01-24 05:09:31.893497: step: 494/466, loss: 0.0060821110382676125 2023-01-24 05:09:32.525239: step: 496/466, loss: 0.14712998270988464 2023-01-24 05:09:33.134331: step: 498/466, loss: 0.12543053925037384 2023-01-24 05:09:33.982746: step: 500/466, loss: 0.007587286178022623 2023-01-24 05:09:34.595412: step: 502/466, loss: 0.0019519716734066606 2023-01-24 05:09:35.185744: step: 504/466, loss: 0.0018390478799119592 2023-01-24 05:09:35.746509: step: 506/466, loss: 0.0011793702142313123 2023-01-24 05:09:36.399430: step: 508/466, loss: 0.0064719608053565025 2023-01-24 05:09:37.011559: step: 510/466, loss: 0.025509748607873917 2023-01-24 05:09:37.677786: step: 512/466, loss: 0.0004068021953571588 2023-01-24 05:09:38.231877: step: 514/466, loss: 0.0006000860594213009 2023-01-24 05:09:38.808367: step: 516/466, loss: 0.007948077283799648 2023-01-24 05:09:39.458567: step: 518/466, loss: 0.0008945147856138647 2023-01-24 05:09:40.107108: step: 520/466, loss: 0.0008437213837169111 2023-01-24 05:09:40.736759: step: 522/466, loss: 0.000982102588750422 2023-01-24 05:09:41.350628: step: 524/466, loss: 0.0002092236973112449 2023-01-24 05:09:41.989133: step: 526/466, loss: 0.018853899091482162 2023-01-24 05:09:42.616230: step: 528/466, loss: 0.0017451480962336063 2023-01-24 05:09:43.255917: step: 530/466, loss: 0.0051223840564489365 2023-01-24 05:09:43.868101: step: 532/466, loss: 0.008472130633890629 2023-01-24 05:09:44.515350: step: 534/466, loss: 0.0018088719807565212 2023-01-24 05:09:45.190940: step: 536/466, loss: 0.005146315321326256 2023-01-24 05:09:45.857378: step: 538/466, loss: 0.011584267020225525 2023-01-24 05:09:46.524071: step: 540/466, loss: 0.002393155125901103 2023-01-24 05:09:47.131206: step: 542/466, loss: 0.0217942725867033 
2023-01-24 05:09:47.773081: step: 544/466, loss: 0.0023021483793854713 2023-01-24 05:09:48.411327: step: 546/466, loss: 1.6428306480520405e-05 2023-01-24 05:09:49.037925: step: 548/466, loss: 0.001946398988366127 2023-01-24 05:09:49.748261: step: 550/466, loss: 0.003615370951592922 2023-01-24 05:09:50.365902: step: 552/466, loss: 0.004972133319824934 2023-01-24 05:09:50.996791: step: 554/466, loss: 0.3673803508281708 2023-01-24 05:09:51.600214: step: 556/466, loss: 0.0005943301948718727 2023-01-24 05:09:52.218339: step: 558/466, loss: 0.002106464933604002 2023-01-24 05:09:52.900635: step: 560/466, loss: 0.027747251093387604 2023-01-24 05:09:53.498061: step: 562/466, loss: 0.005243944935500622 2023-01-24 05:09:54.134842: step: 564/466, loss: 0.03971351683139801 2023-01-24 05:09:54.751814: step: 566/466, loss: 0.0243286844342947 2023-01-24 05:09:55.338566: step: 568/466, loss: 0.0017740450566634536 2023-01-24 05:09:55.945716: step: 570/466, loss: 0.001602650503627956 2023-01-24 05:09:56.583984: step: 572/466, loss: 0.0005635821144096553 2023-01-24 05:09:57.328451: step: 574/466, loss: 0.0005379511276260018 2023-01-24 05:09:57.930025: step: 576/466, loss: 0.001342284376733005 2023-01-24 05:09:58.546496: step: 578/466, loss: 0.005829032510519028 2023-01-24 05:09:59.164392: step: 580/466, loss: 0.0016859722090885043 2023-01-24 05:09:59.762407: step: 582/466, loss: 0.02985936589539051 2023-01-24 05:10:00.359549: step: 584/466, loss: 0.02248987928032875 2023-01-24 05:10:00.961672: step: 586/466, loss: 0.003450884949415922 2023-01-24 05:10:01.583458: step: 588/466, loss: 0.003009258070960641 2023-01-24 05:10:02.210885: step: 590/466, loss: 0.003541896352544427 2023-01-24 05:10:02.854166: step: 592/466, loss: 0.005958197638392448 2023-01-24 05:10:03.445693: step: 594/466, loss: 0.8511886596679688 2023-01-24 05:10:04.087631: step: 596/466, loss: 0.001087585580535233 2023-01-24 05:10:04.730491: step: 598/466, loss: 0.05506772920489311 2023-01-24 05:10:05.386795: step: 
600/466, loss: 0.017406271770596504 2023-01-24 05:10:06.047875: step: 602/466, loss: 0.016767336055636406 2023-01-24 05:10:06.833439: step: 604/466, loss: 0.014688264578580856 2023-01-24 05:10:07.440319: step: 606/466, loss: 0.0007069968269206583 2023-01-24 05:10:08.024637: step: 608/466, loss: 0.0014018111396580935 2023-01-24 05:10:08.612325: step: 610/466, loss: 0.016013823449611664 2023-01-24 05:10:09.223529: step: 612/466, loss: 0.02588215284049511 2023-01-24 05:10:09.916064: step: 614/466, loss: 0.0009318602387793362 2023-01-24 05:10:10.563750: step: 616/466, loss: 0.00710050854831934 2023-01-24 05:10:11.113186: step: 618/466, loss: 0.018266601487994194 2023-01-24 05:10:11.758530: step: 620/466, loss: 0.001025032834149897 2023-01-24 05:10:12.384824: step: 622/466, loss: 0.5549904704093933 2023-01-24 05:10:13.011840: step: 624/466, loss: 0.001709211734123528 2023-01-24 05:10:13.595778: step: 626/466, loss: 0.025714827701449394 2023-01-24 05:10:14.228082: step: 628/466, loss: 0.010942323133349419 2023-01-24 05:10:14.826592: step: 630/466, loss: 0.00045645053614862263 2023-01-24 05:10:15.497737: step: 632/466, loss: 0.047666460275650024 2023-01-24 05:10:16.058209: step: 634/466, loss: 0.011204993352293968 2023-01-24 05:10:16.654453: step: 636/466, loss: 0.0030038023833185434 2023-01-24 05:10:17.269237: step: 638/466, loss: 1.2626858949661255 2023-01-24 05:10:17.944936: step: 640/466, loss: 0.009039514698088169 2023-01-24 05:10:18.558446: step: 642/466, loss: 0.246150940656662 2023-01-24 05:10:19.159718: step: 644/466, loss: 0.000953633920289576 2023-01-24 05:10:19.719502: step: 646/466, loss: 0.0077830590307712555 2023-01-24 05:10:20.282340: step: 648/466, loss: 0.04225435107946396 2023-01-24 05:10:20.903903: step: 650/466, loss: 0.008064341731369495 2023-01-24 05:10:21.585310: step: 652/466, loss: 0.013870935887098312 2023-01-24 05:10:22.180828: step: 654/466, loss: 6.591837882297114e-05 2023-01-24 05:10:22.756432: step: 656/466, loss: 0.04355807229876518 
2023-01-24 05:10:23.399516: step: 658/466, loss: 0.002071819268167019 2023-01-24 05:10:24.015305: step: 660/466, loss: 0.08513505756855011 2023-01-24 05:10:24.665075: step: 662/466, loss: 2.543550729751587 2023-01-24 05:10:25.338482: step: 664/466, loss: 0.0010810650419443846 2023-01-24 05:10:26.002528: step: 666/466, loss: 0.042096130549907684 2023-01-24 05:10:26.599506: step: 668/466, loss: 0.013255462050437927 2023-01-24 05:10:27.213620: step: 670/466, loss: 0.06090088561177254 2023-01-24 05:10:27.862616: step: 672/466, loss: 0.004362513776868582 2023-01-24 05:10:28.554915: step: 674/466, loss: 0.057001881301403046 2023-01-24 05:10:29.181620: step: 676/466, loss: 0.11104325205087662 2023-01-24 05:10:29.791320: step: 678/466, loss: 0.0040749735198915005 2023-01-24 05:10:30.402252: step: 680/466, loss: 0.0012445810716599226 2023-01-24 05:10:31.068146: step: 682/466, loss: 0.00045973557280376554 2023-01-24 05:10:31.650300: step: 684/466, loss: 0.014887809753417969 2023-01-24 05:10:32.276336: step: 686/466, loss: 0.004492264240980148 2023-01-24 05:10:32.949284: step: 688/466, loss: 0.03745220601558685 2023-01-24 05:10:33.638576: step: 690/466, loss: 0.006510350853204727 2023-01-24 05:10:34.255331: step: 692/466, loss: 0.008633963763713837 2023-01-24 05:10:34.901410: step: 694/466, loss: 0.0186244398355484 2023-01-24 05:10:35.525395: step: 696/466, loss: 0.0014183515449985862 2023-01-24 05:10:36.174996: step: 698/466, loss: 0.030302129685878754 2023-01-24 05:10:36.757057: step: 700/466, loss: 0.0003668579738587141 2023-01-24 05:10:37.356338: step: 702/466, loss: 0.013108673505485058 2023-01-24 05:10:38.024900: step: 704/466, loss: 0.05892335623502731 2023-01-24 05:10:38.695049: step: 706/466, loss: 0.005821194499731064 2023-01-24 05:10:39.428175: step: 708/466, loss: 0.024348394945263863 2023-01-24 05:10:40.047632: step: 710/466, loss: 0.0029098910745233297 2023-01-24 05:10:40.641293: step: 712/466, loss: 0.0014831717126071453 2023-01-24 05:10:41.254631: step: 
714/466, loss: 0.0011402704985812306 2023-01-24 05:10:41.881832: step: 716/466, loss: 0.004425371065735817 2023-01-24 05:10:42.544511: step: 718/466, loss: 0.0004287810006644577 2023-01-24 05:10:43.149629: step: 720/466, loss: 0.06450676918029785 2023-01-24 05:10:43.839708: step: 722/466, loss: 0.006807366851717234 2023-01-24 05:10:44.443543: step: 724/466, loss: 0.00010520143405301496 2023-01-24 05:10:45.109379: step: 726/466, loss: 0.03818788751959801 2023-01-24 05:10:45.653417: step: 728/466, loss: 0.001018814742565155 2023-01-24 05:10:46.279636: step: 730/466, loss: 0.038347743451595306 2023-01-24 05:10:46.932725: step: 732/466, loss: 0.0014345066156238317 2023-01-24 05:10:47.512423: step: 734/466, loss: 0.04617827758193016 2023-01-24 05:10:48.103202: step: 736/466, loss: 0.004945599474012852 2023-01-24 05:10:48.729203: step: 738/466, loss: 0.008316083811223507 2023-01-24 05:10:49.361025: step: 740/466, loss: 0.014469870366156101 2023-01-24 05:10:49.992967: step: 742/466, loss: 0.023186495527625084 2023-01-24 05:10:50.688977: step: 744/466, loss: 0.018661288544535637 2023-01-24 05:10:51.325661: step: 746/466, loss: 0.011117344722151756 2023-01-24 05:10:51.889719: step: 748/466, loss: 0.0007372678956016898 2023-01-24 05:10:52.517109: step: 750/466, loss: 0.005543145816773176 2023-01-24 05:10:53.150497: step: 752/466, loss: 0.03786356747150421 2023-01-24 05:10:53.762544: step: 754/466, loss: 0.0009820330888032913 2023-01-24 05:10:54.399226: step: 756/466, loss: 0.004211984109133482 2023-01-24 05:10:55.052331: step: 758/466, loss: 0.008682195097208023 2023-01-24 05:10:55.596252: step: 760/466, loss: 0.006925350055098534 2023-01-24 05:10:56.175606: step: 762/466, loss: 0.003041060408577323 2023-01-24 05:10:56.715352: step: 764/466, loss: 0.013051096349954605 2023-01-24 05:10:57.337876: step: 766/466, loss: 0.0006539585301652551 2023-01-24 05:10:57.925914: step: 768/466, loss: 0.09705096483230591 2023-01-24 05:10:58.636302: step: 770/466, loss: 0.007492034696042538 
2023-01-24 05:10:59.234906: step: 772/466, loss: 0.006348648574203253 2023-01-24 05:10:59.943658: step: 774/466, loss: 0.017989221960306168 2023-01-24 05:11:00.562830: step: 776/466, loss: 0.17219893634319305 2023-01-24 05:11:01.247314: step: 778/466, loss: 0.010285553522408009 2023-01-24 05:11:01.867319: step: 780/466, loss: 0.008526391349732876 2023-01-24 05:11:02.512700: step: 782/466, loss: 0.008109268732368946 2023-01-24 05:11:03.094069: step: 784/466, loss: 0.005808887537568808 2023-01-24 05:11:03.694421: step: 786/466, loss: 0.0009838847909122705 2023-01-24 05:11:04.259796: step: 788/466, loss: 0.1035638228058815 2023-01-24 05:11:04.888180: step: 790/466, loss: 0.0033334919717162848 2023-01-24 05:11:05.421955: step: 792/466, loss: 0.0018809232860803604 2023-01-24 05:11:06.016613: step: 794/466, loss: 0.0032806447707116604 2023-01-24 05:11:06.639277: step: 796/466, loss: 0.0874895304441452 2023-01-24 05:11:07.261283: step: 798/466, loss: 0.00040602550143375993 2023-01-24 05:11:07.850920: step: 800/466, loss: 0.03953491151332855 2023-01-24 05:11:08.401911: step: 802/466, loss: 0.00030488843913190067 2023-01-24 05:11:09.037499: step: 804/466, loss: 0.0029468308202922344 2023-01-24 05:11:09.626927: step: 806/466, loss: 0.02056989260017872 2023-01-24 05:11:10.302230: step: 808/466, loss: 0.0040731183253228664 2023-01-24 05:11:10.936337: step: 810/466, loss: 0.008805769495666027 2023-01-24 05:11:11.578937: step: 812/466, loss: 0.003765393979847431 2023-01-24 05:11:12.162191: step: 814/466, loss: 0.0022657178342342377 2023-01-24 05:11:12.774590: step: 816/466, loss: 0.0012500463053584099 2023-01-24 05:11:13.376142: step: 818/466, loss: 0.00040347143658436835 2023-01-24 05:11:14.009669: step: 820/466, loss: 0.0158883985131979 2023-01-24 05:11:14.681158: step: 822/466, loss: 0.007519667502492666 2023-01-24 05:11:15.305812: step: 824/466, loss: 0.009993246756494045 2023-01-24 05:11:15.918148: step: 826/466, loss: 0.015041830018162727 2023-01-24 05:11:16.527087: step: 
828/466, loss: 0.00337592582218349 2023-01-24 05:11:17.114837: step: 830/466, loss: 7.99222761997953e-05 2023-01-24 05:11:17.822386: step: 832/466, loss: 0.5875672101974487 2023-01-24 05:11:18.473055: step: 834/466, loss: 0.057680025696754456 2023-01-24 05:11:19.132979: step: 836/466, loss: 0.019776932895183563 2023-01-24 05:11:19.855746: step: 838/466, loss: 0.014451738446950912 2023-01-24 05:11:20.451233: step: 840/466, loss: 0.004580538719892502 2023-01-24 05:11:21.094802: step: 842/466, loss: 0.0045847478322684765 2023-01-24 05:11:21.756007: step: 844/466, loss: 0.0069063575938344 2023-01-24 05:11:22.385138: step: 846/466, loss: 0.079214908182621 2023-01-24 05:11:23.014706: step: 848/466, loss: 0.0007532158633694053 2023-01-24 05:11:23.658451: step: 850/466, loss: 0.1626470386981964 2023-01-24 05:11:24.292673: step: 852/466, loss: 0.0035434728488326073 2023-01-24 05:11:24.913574: step: 854/466, loss: 0.0006868537748232484 2023-01-24 05:11:25.500616: step: 856/466, loss: 0.008462544530630112 2023-01-24 05:11:26.089618: step: 858/466, loss: 0.0036610299721360207 2023-01-24 05:11:26.658385: step: 860/466, loss: 0.0031512551940977573 2023-01-24 05:11:27.247815: step: 862/466, loss: 0.045224349945783615 2023-01-24 05:11:27.856331: step: 864/466, loss: 7.859440665924922e-05 2023-01-24 05:11:28.487706: step: 866/466, loss: 0.0007107240962795913 2023-01-24 05:11:29.071373: step: 868/466, loss: 0.0005173115059733391 2023-01-24 05:11:29.750244: step: 870/466, loss: 0.0029340023174881935 2023-01-24 05:11:30.307698: step: 872/466, loss: 0.000720936746802181 2023-01-24 05:11:30.956546: step: 874/466, loss: 0.18558841943740845 2023-01-24 05:11:31.581988: step: 876/466, loss: 0.0594550259411335 2023-01-24 05:11:32.251222: step: 878/466, loss: 0.0013480983907356858 2023-01-24 05:11:32.838279: step: 880/466, loss: 0.000333346746629104 2023-01-24 05:11:33.420547: step: 882/466, loss: 0.0003667432174552232 2023-01-24 05:11:34.005140: step: 884/466, loss: 0.014059399254620075 
2023-01-24 05:11:34.610948: step: 886/466, loss: 0.0007658023969270289 2023-01-24 05:11:35.343071: step: 888/466, loss: 0.004983733873814344 2023-01-24 05:11:35.932747: step: 890/466, loss: 0.0006161926430650055 2023-01-24 05:11:36.592058: step: 892/466, loss: 0.0013628561282530427 2023-01-24 05:11:37.189218: step: 894/466, loss: 0.0006844276795163751 2023-01-24 05:11:37.855387: step: 896/466, loss: 0.001898131798952818 2023-01-24 05:11:38.506484: step: 898/466, loss: 0.007265943568199873 2023-01-24 05:11:39.136248: step: 900/466, loss: 0.019399205222725868 2023-01-24 05:11:39.731792: step: 902/466, loss: 0.05709183216094971 2023-01-24 05:11:40.367049: step: 904/466, loss: 0.0287553071975708 2023-01-24 05:11:41.009753: step: 906/466, loss: 0.0010598688386380672 2023-01-24 05:11:41.588530: step: 908/466, loss: 0.00038106116699054837 2023-01-24 05:11:42.191739: step: 910/466, loss: 0.0017812260193750262 2023-01-24 05:11:42.791527: step: 912/466, loss: 0.01407301239669323 2023-01-24 05:11:43.392577: step: 914/466, loss: 0.008931309916079044 2023-01-24 05:11:43.994226: step: 916/466, loss: 0.04127160459756851 2023-01-24 05:11:44.591668: step: 918/466, loss: 0.0005697328597307205 2023-01-24 05:11:45.249059: step: 920/466, loss: 0.010193925350904465 2023-01-24 05:11:45.898988: step: 922/466, loss: 0.0004008575342595577 2023-01-24 05:11:46.485256: step: 924/466, loss: 0.0037713053170591593 2023-01-24 05:11:47.133444: step: 926/466, loss: 0.009518631733953953 2023-01-24 05:11:47.708817: step: 928/466, loss: 0.005249551497399807 2023-01-24 05:11:48.390629: step: 930/466, loss: 0.002054632408544421 2023-01-24 05:11:49.067095: step: 932/466, loss: 0.04487023502588272 ================================================== Loss: 0.055 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3485983360123832, 'r': 0.33669175148065095, 'f1': 0.3425416081666082}, 'combined': 0.2523990797017113, 'epoch': 38} Test 
Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.37005689571718187, 'r': 0.3113736964830014, 'f1': 0.3381884665801257}, 'combined': 0.2242907964883735, 'epoch': 38} Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.331546462548573, 'r': 0.2756607898841355, 'f1': 0.3010318450027374}, 'combined': 0.20068789666849157, 'epoch': 38} Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3801019711199896, 'r': 0.2880357497414454, 'f1': 0.32772571525699895}, 'combined': 0.21388415100983088, 'epoch': 38} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33014223323296105, 'r': 0.33264805663510877, 'f1': 0.3313904080277927}, 'combined': 0.24418240591521564, 'epoch': 38} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.35844887921996177, 'r': 0.2957591185252152, 'f1': 0.32410036233076245}, 'combined': 0.21494739056133466, 'epoch': 38} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3194444444444444, 'r': 0.3559523809523809, 'f1': 0.33671171171171166}, 'combined': 0.22447447447447444, 'epoch': 38} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.53125, 'r': 0.3695652173913043, 'f1': 0.4358974358974359}, 'combined': 0.29059829059829057, 'epoch': 38} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 38} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 
0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20} Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33670273348519364, 'r': 0.2799479166666667, 'f1': 0.3057135470527405}, 'combined': 0.20380903136849365, 'epoch': 35} Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3696950279336589, 'r': 0.27919010327515936, 'f1': 0.31813086188869816}, 'combined': 0.20762224670630824, 'epoch': 35} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.43478260869565216, 'f1': 0.4878048780487805}, 'combined': 0.3252032520325203, 'epoch': 35} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33845733174116954, 'r': 0.33075052342827765, 'f1': 0.3345595505694863}, 'combined': 0.2465175635775162, 'epoch': 37} Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36164126771888916, 'r': 0.3062209176009295, 'f1': 0.33163165478581674}, 'combined': 0.21994223737090435, 'epoch': 37} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 37} ****************************** Epoch: 39 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large 
--batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-24 05:14:22.506644: step: 2/466, loss: 0.018085388466715813 2023-01-24 05:14:23.221429: step: 4/466, loss: 0.0055228341370821 2023-01-24 05:14:23.848759: step: 6/466, loss: 0.03207993879914284 2023-01-24 05:14:24.440475: step: 8/466, loss: 0.02125280536711216 2023-01-24 05:14:25.076594: step: 10/466, loss: 0.0024259898345917463 2023-01-24 05:14:25.705525: step: 12/466, loss: 3.278315853094682e-05 2023-01-24 05:14:26.301799: step: 14/466, loss: 0.00158892129547894 2023-01-24 05:14:26.871312: step: 16/466, loss: 0.001289099338464439 2023-01-24 05:14:27.452266: step: 18/466, loss: 0.5699836611747742 2023-01-24 05:14:27.965486: step: 20/466, loss: 4.0209528378909454e-05 2023-01-24 05:14:28.509533: step: 22/466, loss: 0.0016772763337939978 2023-01-24 05:14:29.144464: step: 24/466, loss: 0.05627201870083809 2023-01-24 05:14:29.784150: step: 26/466, loss: 0.0013877139426767826 2023-01-24 05:14:30.408343: step: 28/466, loss: 0.0003633845772128552 2023-01-24 05:14:31.160660: step: 30/466, loss: 0.009575406089425087 2023-01-24 05:14:31.752609: step: 32/466, loss: 1.4860745523037622e-06 2023-01-24 05:14:32.352358: step: 34/466, loss: 0.0001350823586108163 2023-01-24 05:14:33.006516: step: 36/466, loss: 0.008692679926753044 2023-01-24 05:14:33.620270: step: 38/466, loss: 0.03054310567677021 2023-01-24 05:14:34.283775: step: 40/466, loss: 0.005888329353183508 2023-01-24 05:14:34.868503: step: 42/466, loss: 0.00015866171452216804 2023-01-24 05:14:35.503195: step: 44/466, loss: 0.007867840118706226 2023-01-24 05:14:36.080838: step: 46/466, loss: 0.0014041324611753225 2023-01-24 05:14:36.718326: step: 48/466, loss: 0.0002700977784115821 2023-01-24 05:14:37.366638: step: 50/466, loss: 0.012439551763236523 2023-01-24 05:14:38.026639: step: 52/466, loss: 0.024994459003210068 2023-01-24 05:14:38.674887: step: 54/466, loss: 
0.0005091347265988588 2023-01-24 05:14:39.342159: step: 56/466, loss: 0.012248466722667217 2023-01-24 05:14:39.943508: step: 58/466, loss: 0.00040014597470872104 2023-01-24 05:14:40.541252: step: 60/466, loss: 0.005358839873224497 2023-01-24 05:14:41.164667: step: 62/466, loss: 0.06228935718536377 2023-01-24 05:14:41.723827: step: 64/466, loss: 0.0008789148996584117 2023-01-24 05:14:42.347565: step: 66/466, loss: 0.02750891074538231 2023-01-24 05:14:42.895149: step: 68/466, loss: 0.019110241904854774 2023-01-24 05:14:43.552743: step: 70/466, loss: 0.0016093218000605702 2023-01-24 05:14:44.205130: step: 72/466, loss: 4.843094211537391e-05 2023-01-24 05:14:44.874815: step: 74/466, loss: 0.0009375785011798143 2023-01-24 05:14:45.512518: step: 76/466, loss: 0.0005191043601371348 2023-01-24 05:14:46.097363: step: 78/466, loss: 0.00801012758165598 2023-01-24 05:14:46.751605: step: 80/466, loss: 0.03936135768890381 2023-01-24 05:14:47.446540: step: 82/466, loss: 0.0033180981408804655 2023-01-24 05:14:48.051547: step: 84/466, loss: 0.047777898609638214 2023-01-24 05:14:48.698070: step: 86/466, loss: 0.023926258087158203 2023-01-24 05:14:49.328423: step: 88/466, loss: 0.008756158873438835 2023-01-24 05:14:50.019415: step: 90/466, loss: 0.002737005241215229 2023-01-24 05:14:50.649328: step: 92/466, loss: 0.027509748935699463 2023-01-24 05:14:51.268130: step: 94/466, loss: 0.000658797740470618 2023-01-24 05:14:51.999038: step: 96/466, loss: 0.04544578492641449 2023-01-24 05:14:52.583543: step: 98/466, loss: 2.4974578991532326e-05 2023-01-24 05:14:53.211974: step: 100/466, loss: 0.000990367727354169 2023-01-24 05:14:53.871633: step: 102/466, loss: 0.009184726513922215 2023-01-24 05:14:54.627483: step: 104/466, loss: 0.01867060735821724 2023-01-24 05:14:55.268730: step: 106/466, loss: 8.735142182558775e-05 2023-01-24 05:14:55.840165: step: 108/466, loss: 0.020845962688326836 2023-01-24 05:14:56.431194: step: 110/466, loss: 0.0040642134845256805 2023-01-24 05:14:57.056875: step: 
112/466, loss: 0.016638586297631264 2023-01-24 05:14:57.627597: step: 114/466, loss: 0.0012167844688519835 2023-01-24 05:14:58.192768: step: 116/466, loss: 7.273747905855998e-05 2023-01-24 05:14:58.764029: step: 118/466, loss: 0.0003589347761590034 2023-01-24 05:14:59.358890: step: 120/466, loss: 0.000928523309994489 2023-01-24 05:15:00.034873: step: 122/466, loss: 0.005702007096260786 2023-01-24 05:15:00.698624: step: 124/466, loss: 0.007791510783135891 2023-01-24 05:15:01.268536: step: 126/466, loss: 0.008224002085626125 2023-01-24 05:15:01.869304: step: 128/466, loss: 0.0030422741547226906 2023-01-24 05:15:02.528889: step: 130/466, loss: 0.011904054321348667 2023-01-24 05:15:03.152303: step: 132/466, loss: 0.0015938390279188752 2023-01-24 05:15:03.845781: step: 134/466, loss: 0.01457253284752369 2023-01-24 05:15:04.699523: step: 136/466, loss: 0.08270794153213501 2023-01-24 05:15:05.320455: step: 138/466, loss: 0.009393492713570595 2023-01-24 05:15:05.944620: step: 140/466, loss: 0.0021446438040584326 2023-01-24 05:15:06.609893: step: 142/466, loss: 0.005898400209844112 2023-01-24 05:15:07.238717: step: 144/466, loss: 0.03170425072312355 2023-01-24 05:15:07.880902: step: 146/466, loss: 0.000670681125484407 2023-01-24 05:15:08.494182: step: 148/466, loss: 0.00017904472770169377 2023-01-24 05:15:09.055004: step: 150/466, loss: 0.002040478866547346 2023-01-24 05:15:09.610898: step: 152/466, loss: 0.0013364459155127406 2023-01-24 05:15:10.269499: step: 154/466, loss: 0.013797912746667862 2023-01-24 05:15:10.855561: step: 156/466, loss: 0.14163194596767426 2023-01-24 05:15:11.475073: step: 158/466, loss: 0.0034958203323185444 2023-01-24 05:15:12.165164: step: 160/466, loss: 0.0715586319565773 2023-01-24 05:15:12.821401: step: 162/466, loss: 5.988257908029482e-05 2023-01-24 05:15:13.429926: step: 164/466, loss: 0.0022908237297087908 2023-01-24 05:15:14.027308: step: 166/466, loss: 0.07565037161111832 2023-01-24 05:15:14.644566: step: 168/466, loss: 
0.002730026375502348 2023-01-24 05:15:15.232759: step: 170/466, loss: 0.0032308921217918396 2023-01-24 05:15:15.890495: step: 172/466, loss: 0.0012114137643948197 2023-01-24 05:15:16.501789: step: 174/466, loss: 2.362889289855957 2023-01-24 05:15:17.096521: step: 176/466, loss: 0.0025763954035937786 2023-01-24 05:15:17.722064: step: 178/466, loss: 0.0015755045460537076 2023-01-24 05:15:18.319758: step: 180/466, loss: 0.0049422625452280045 2023-01-24 05:15:18.885808: step: 182/466, loss: 0.0011983857257291675 2023-01-24 05:15:19.498724: step: 184/466, loss: 0.0022618311922997236 2023-01-24 05:15:20.085479: step: 186/466, loss: 0.0013922039652243257 2023-01-24 05:15:20.647488: step: 188/466, loss: 0.002917697886005044 2023-01-24 05:15:21.274384: step: 190/466, loss: 0.004662810824811459 2023-01-24 05:15:21.920906: step: 192/466, loss: 0.016557466238737106 2023-01-24 05:15:22.546644: step: 194/466, loss: 0.0029304565396159887 2023-01-24 05:15:23.206411: step: 196/466, loss: 0.0011918117525056005 2023-01-24 05:15:23.773074: step: 198/466, loss: 0.00022702784917782992 2023-01-24 05:15:24.383589: step: 200/466, loss: 0.024145185947418213 2023-01-24 05:15:25.079231: step: 202/466, loss: 0.002579348860308528 2023-01-24 05:15:25.745375: step: 204/466, loss: 0.00012517660798039287 2023-01-24 05:15:26.350361: step: 206/466, loss: 0.004631017800420523 2023-01-24 05:15:27.041080: step: 208/466, loss: 0.03834258392453194 2023-01-24 05:15:27.659370: step: 210/466, loss: 0.00010660316911526024 2023-01-24 05:15:28.192541: step: 212/466, loss: 0.0031571954023092985 2023-01-24 05:15:28.793630: step: 214/466, loss: 0.00038896503974683583 2023-01-24 05:15:29.388598: step: 216/466, loss: 0.001999725354835391 2023-01-24 05:15:30.008020: step: 218/466, loss: 0.03977908939123154 2023-01-24 05:15:30.654803: step: 220/466, loss: 0.0003452486707828939 2023-01-24 05:15:31.316352: step: 222/466, loss: 0.00011602720041992143 2023-01-24 05:15:31.915367: step: 224/466, loss: 0.0012798572424799204 
2023-01-24 05:15:32.635558: step: 226/466, loss: 0.0071009122766554356 2023-01-24 05:15:33.286447: step: 228/466, loss: 0.023304520174860954 2023-01-24 05:15:33.851444: step: 230/466, loss: 0.027974234893918037 2023-01-24 05:15:34.457912: step: 232/466, loss: 0.0005350593710318208 2023-01-24 05:15:35.045069: step: 234/466, loss: 0.026748616248369217 2023-01-24 05:15:35.677741: step: 236/466, loss: 0.003761754836887121 2023-01-24 05:15:36.299193: step: 238/466, loss: 0.020880121737718582 2023-01-24 05:15:36.904437: step: 240/466, loss: 0.002736175199970603 2023-01-24 05:15:37.523740: step: 242/466, loss: 0.013938499614596367 2023-01-24 05:15:38.187809: step: 244/466, loss: 0.0015270530711859465 2023-01-24 05:15:38.860114: step: 246/466, loss: 0.005346247460693121 2023-01-24 05:15:39.488405: step: 248/466, loss: 0.00041722002788446844 2023-01-24 05:15:40.174471: step: 250/466, loss: 0.0007358561852015555 2023-01-24 05:15:40.765825: step: 252/466, loss: 0.0042246123775839806 2023-01-24 05:15:41.423028: step: 254/466, loss: 0.0004641209670808166 2023-01-24 05:15:41.977631: step: 256/466, loss: 0.004896281752735376 2023-01-24 05:15:42.637800: step: 258/466, loss: 0.006107239983975887 2023-01-24 05:15:43.258770: step: 260/466, loss: 0.00105644844006747 2023-01-24 05:15:43.874812: step: 262/466, loss: 0.00015503763279411942 2023-01-24 05:15:44.483942: step: 264/466, loss: 0.019540099427103996 2023-01-24 05:15:45.069933: step: 266/466, loss: 0.0032516028732061386 2023-01-24 05:15:45.679908: step: 268/466, loss: 0.00861570704728365 2023-01-24 05:15:46.282325: step: 270/466, loss: 0.01724906452000141 2023-01-24 05:15:46.837323: step: 272/466, loss: 0.0008817288908176124 2023-01-24 05:15:47.477186: step: 274/466, loss: 0.0004552304744720459 2023-01-24 05:15:48.075211: step: 276/466, loss: 0.11388035118579865 2023-01-24 05:15:48.726529: step: 278/466, loss: 0.001476992736570537 2023-01-24 05:15:49.361294: step: 280/466, loss: 0.0038886708207428455 2023-01-24 05:15:49.972246: 
step: 282/466, loss: 1.6863867131178267e-05 2023-01-24 05:15:50.652959: step: 284/466, loss: 0.037166088819503784 2023-01-24 05:15:51.294008: step: 286/466, loss: 0.01547346543520689 2023-01-24 05:15:51.866747: step: 288/466, loss: 0.0002924564469140023 2023-01-24 05:15:52.419756: step: 290/466, loss: 0.01838175393640995 2023-01-24 05:15:53.051646: step: 292/466, loss: 0.0003352120111230761 2023-01-24 05:15:53.637248: step: 294/466, loss: 8.774209709372371e-05 2023-01-24 05:15:54.239437: step: 296/466, loss: 0.06365593522787094 2023-01-24 05:15:54.874544: step: 298/466, loss: 0.07829616218805313 2023-01-24 05:15:55.469690: step: 300/466, loss: 0.004876286722719669 2023-01-24 05:15:56.134246: step: 302/466, loss: 0.010864168405532837 2023-01-24 05:15:56.705265: step: 304/466, loss: 0.083220936357975 2023-01-24 05:15:57.346228: step: 306/466, loss: 0.019162513315677643 2023-01-24 05:15:58.062443: step: 308/466, loss: 0.09040261805057526 2023-01-24 05:15:58.646233: step: 310/466, loss: 0.038859039545059204 2023-01-24 05:15:59.294738: step: 312/466, loss: 0.04216773808002472 2023-01-24 05:15:59.996796: step: 314/466, loss: 0.12224699556827545 2023-01-24 05:16:00.584191: step: 316/466, loss: 0.00013712629151996225 2023-01-24 05:16:01.281016: step: 318/466, loss: 0.08204283565282822 2023-01-24 05:16:02.010821: step: 320/466, loss: 0.0006768596358597279 2023-01-24 05:16:02.687945: step: 322/466, loss: 0.07268203049898148 2023-01-24 05:16:03.268042: step: 324/466, loss: 0.011528857052326202 2023-01-24 05:16:03.873316: step: 326/466, loss: 0.009545979090034962 2023-01-24 05:16:04.474351: step: 328/466, loss: 0.0033273734152317047 2023-01-24 05:16:05.069829: step: 330/466, loss: 0.0013916256139054894 2023-01-24 05:16:05.675485: step: 332/466, loss: 0.15865670144557953 2023-01-24 05:16:06.308030: step: 334/466, loss: 0.011998772621154785 2023-01-24 05:16:06.867136: step: 336/466, loss: 3.965425639762543e-05 2023-01-24 05:16:07.477852: step: 338/466, loss: 0.009822729974985123 
2023-01-24 05:16:08.104426: step: 340/466, loss: 0.00204171659424901 2023-01-24 05:16:08.647099: step: 342/466, loss: 0.04342419654130936 2023-01-24 05:16:09.243983: step: 344/466, loss: 0.027267329394817352 2023-01-24 05:16:09.853904: step: 346/466, loss: 0.008024612441658974 2023-01-24 05:16:10.487855: step: 348/466, loss: 0.0008232182590290904 2023-01-24 05:16:11.081682: step: 350/466, loss: 4.253632505424321e-05 2023-01-24 05:16:11.780512: step: 352/466, loss: 0.012067830190062523 2023-01-24 05:16:12.436849: step: 354/466, loss: 0.6041533350944519 2023-01-24 05:16:13.044515: step: 356/466, loss: 0.02979869954288006 2023-01-24 05:16:13.686135: step: 358/466, loss: 0.015235206112265587 2023-01-24 05:16:14.273333: step: 360/466, loss: 0.0009958329610526562 2023-01-24 05:16:14.872708: step: 362/466, loss: 0.0004881559289060533 2023-01-24 05:16:15.519298: step: 364/466, loss: 0.01994839869439602 2023-01-24 05:16:16.177095: step: 366/466, loss: 0.00024279524222947657 2023-01-24 05:16:16.795246: step: 368/466, loss: 0.0027458854019641876 2023-01-24 05:16:17.449510: step: 370/466, loss: 0.0752294585108757 2023-01-24 05:16:18.070668: step: 372/466, loss: 0.002107922686263919 2023-01-24 05:16:18.680671: step: 374/466, loss: 0.021487440913915634 2023-01-24 05:16:19.279975: step: 376/466, loss: 0.0018568960949778557 2023-01-24 05:16:19.873339: step: 378/466, loss: 0.011506550945341587 2023-01-24 05:16:20.529478: step: 380/466, loss: 0.002839966444298625 2023-01-24 05:16:21.108197: step: 382/466, loss: 0.001984285656362772 2023-01-24 05:16:21.792341: step: 384/466, loss: 0.015150221064686775 2023-01-24 05:16:22.516674: step: 386/466, loss: 0.0021268455311656 2023-01-24 05:16:23.163246: step: 388/466, loss: 0.006002889946103096 2023-01-24 05:16:23.770395: step: 390/466, loss: 0.0029884742107242346 2023-01-24 05:16:24.360253: step: 392/466, loss: 0.0004607290029525757 2023-01-24 05:16:25.043733: step: 394/466, loss: 0.06509587913751602 2023-01-24 05:16:25.684727: step: 
396/466, loss: 0.004332786425948143 2023-01-24 05:16:26.266977: step: 398/466, loss: 0.00031428266083821654 2023-01-24 05:16:26.865545: step: 400/466, loss: 0.002126887906342745 2023-01-24 05:16:27.430981: step: 402/466, loss: 0.004601453896611929 2023-01-24 05:16:28.054570: step: 404/466, loss: 0.018566809594631195 2023-01-24 05:16:28.611180: step: 406/466, loss: 0.005234793294221163 2023-01-24 05:16:29.242274: step: 408/466, loss: 0.013058140873908997 2023-01-24 05:16:29.842719: step: 410/466, loss: 0.00037722065462730825 2023-01-24 05:16:30.561825: step: 412/466, loss: 0.0024766004644334316 2023-01-24 05:16:31.174914: step: 414/466, loss: 3.426059629418887e-05 2023-01-24 05:16:31.815064: step: 416/466, loss: 0.00046308644232340157 2023-01-24 05:16:32.466338: step: 418/466, loss: 0.006776104681193829 2023-01-24 05:16:33.121686: step: 420/466, loss: 0.0001457214675610885 2023-01-24 05:16:33.795692: step: 422/466, loss: 0.1614760011434555 2023-01-24 05:16:34.408491: step: 424/466, loss: 0.04326094686985016 2023-01-24 05:16:35.032568: step: 426/466, loss: 0.01517427433282137 2023-01-24 05:16:35.614078: step: 428/466, loss: 0.0005652170511893928 2023-01-24 05:16:36.221548: step: 430/466, loss: 0.005981959402561188 2023-01-24 05:16:36.837285: step: 432/466, loss: 0.013260902836918831 2023-01-24 05:16:37.445948: step: 434/466, loss: 0.019691504538059235 2023-01-24 05:16:38.085142: step: 436/466, loss: 0.0031405503395944834 2023-01-24 05:16:38.707922: step: 438/466, loss: 0.0012364968424662948 2023-01-24 05:16:39.300860: step: 440/466, loss: 0.001196483033709228 2023-01-24 05:16:39.941317: step: 442/466, loss: 0.0004985735868103802 2023-01-24 05:16:40.543140: step: 444/466, loss: 0.038667019456624985 2023-01-24 05:16:41.142282: step: 446/466, loss: 0.07287077605724335 2023-01-24 05:16:41.749684: step: 448/466, loss: 0.006370658054947853 2023-01-24 05:16:42.427434: step: 450/466, loss: 0.13874077796936035 2023-01-24 05:16:42.973625: step: 452/466, loss: 
0.004097006283700466 2023-01-24 05:16:43.643947: step: 454/466, loss: 0.015667414292693138 2023-01-24 05:16:44.269936: step: 456/466, loss: 0.007470360491424799 2023-01-24 05:16:44.902461: step: 458/466, loss: 0.00036517850821837783 2023-01-24 05:16:45.547593: step: 460/466, loss: 0.00029116624500602484 2023-01-24 05:16:46.229692: step: 462/466, loss: 0.001083413721062243 2023-01-24 05:16:46.816417: step: 464/466, loss: 0.0014429357834160328 2023-01-24 05:16:47.520679: step: 466/466, loss: 0.028476649895310402 2023-01-24 05:16:48.152257: step: 468/466, loss: 0.0031984311062842607 2023-01-24 05:16:48.744330: step: 470/466, loss: 3.856273542623967e-05 2023-01-24 05:16:49.325758: step: 472/466, loss: 0.0018934006802737713 2023-01-24 05:16:49.943112: step: 474/466, loss: 6.685448170173913e-06 2023-01-24 05:16:50.594712: step: 476/466, loss: 0.007408824283629656 2023-01-24 05:16:51.197306: step: 478/466, loss: 0.002856920473277569 2023-01-24 05:16:51.816173: step: 480/466, loss: 0.0033477952238172293 2023-01-24 05:16:52.457954: step: 482/466, loss: 0.6972894668579102 2023-01-24 05:16:53.021159: step: 484/466, loss: 0.0016652895137667656 2023-01-24 05:16:53.645767: step: 486/466, loss: 0.0008347773109562695 2023-01-24 05:16:54.280928: step: 488/466, loss: 0.0007183398702181876 2023-01-24 05:16:54.946804: step: 490/466, loss: 0.01576983742415905 2023-01-24 05:16:55.591246: step: 492/466, loss: 0.0009741125977598131 2023-01-24 05:16:56.197455: step: 494/466, loss: 0.017025131732225418 2023-01-24 05:16:56.815421: step: 496/466, loss: 0.0027654848527163267 2023-01-24 05:16:57.383007: step: 498/466, loss: 0.0006783697754144669 2023-01-24 05:16:57.990103: step: 500/466, loss: 0.0022116999607533216 2023-01-24 05:16:58.675366: step: 502/466, loss: 0.17197492718696594 2023-01-24 05:16:59.321595: step: 504/466, loss: 0.019571226090192795 2023-01-24 05:16:59.930650: step: 506/466, loss: 0.029277196153998375 2023-01-24 05:17:00.538674: step: 508/466, loss: 0.0432414673268795 
2023-01-24 05:17:01.134485: step: 510/466, loss: 1.600991890882142e-05 2023-01-24 05:17:01.719096: step: 512/466, loss: 0.000689912645611912 2023-01-24 05:17:02.394298: step: 514/466, loss: 0.0004268517659511417 2023-01-24 05:17:03.080814: step: 516/466, loss: 0.0882229208946228 2023-01-24 05:17:03.730921: step: 518/466, loss: 4.202971831546165e-05 2023-01-24 05:17:04.307322: step: 520/466, loss: 0.0021449143532663584 2023-01-24 05:17:04.919383: step: 522/466, loss: 4.7994446504162624e-05 2023-01-24 05:17:05.514159: step: 524/466, loss: 0.00016817261348478496 2023-01-24 05:17:06.060677: step: 526/466, loss: 0.003549777902662754 2023-01-24 05:17:06.721391: step: 528/466, loss: 0.03497760370373726 2023-01-24 05:17:07.350714: step: 530/466, loss: 0.0014893412590026855 2023-01-24 05:17:07.962792: step: 532/466, loss: 0.005852533038705587 2023-01-24 05:17:08.586793: step: 534/466, loss: 0.01630992442369461 2023-01-24 05:17:09.193014: step: 536/466, loss: 0.0014065640280023217 2023-01-24 05:17:09.846552: step: 538/466, loss: 0.04440496116876602 2023-01-24 05:17:10.444957: step: 540/466, loss: 0.009069890715181828 2023-01-24 05:17:11.044134: step: 542/466, loss: 0.2030581682920456 2023-01-24 05:17:11.749083: step: 544/466, loss: 0.037008028477430344 2023-01-24 05:17:12.380602: step: 546/466, loss: 0.10246668010950089 2023-01-24 05:17:12.980027: step: 548/466, loss: 0.00014032924082130194 2023-01-24 05:17:13.558925: step: 550/466, loss: 0.0036922169383615255 2023-01-24 05:17:14.211691: step: 552/466, loss: 0.0041724443435668945 2023-01-24 05:17:14.915741: step: 554/466, loss: 0.10033238679170609 2023-01-24 05:17:15.586372: step: 556/466, loss: 0.001182153937406838 2023-01-24 05:17:16.215115: step: 558/466, loss: 0.026487678289413452 2023-01-24 05:17:16.835314: step: 560/466, loss: 0.0033549184445291758 2023-01-24 05:17:17.399464: step: 562/466, loss: 0.00424228236079216 2023-01-24 05:17:17.943027: step: 564/466, loss: 0.0006431190413422883 2023-01-24 05:17:18.598310: step: 
566/466, loss: 0.014425793662667274 2023-01-24 05:17:19.245772: step: 568/466, loss: 0.000643086910713464 2023-01-24 05:17:19.920711: step: 570/466, loss: 0.00036597021971829236 2023-01-24 05:17:20.620709: step: 572/466, loss: 0.0026783770881593227 2023-01-24 05:17:21.258404: step: 574/466, loss: 0.004231403581798077 2023-01-24 05:17:21.876231: step: 576/466, loss: 0.000492847990244627 2023-01-24 05:17:22.449747: step: 578/466, loss: 0.011168533936142921 2023-01-24 05:17:23.047880: step: 580/466, loss: 0.005795257166028023 2023-01-24 05:17:23.672964: step: 582/466, loss: 0.001321491552516818 2023-01-24 05:17:24.325858: step: 584/466, loss: 0.0002804806281346828 2023-01-24 05:17:24.895964: step: 586/466, loss: 0.00013717249385081232 2023-01-24 05:17:25.478497: step: 588/466, loss: 0.007665012031793594 2023-01-24 05:17:26.108689: step: 590/466, loss: 0.0003840482677333057 2023-01-24 05:17:26.735636: step: 592/466, loss: 0.0006111133843660355 2023-01-24 05:17:27.326205: step: 594/466, loss: 0.0006169495172798634 2023-01-24 05:17:28.080822: step: 596/466, loss: 0.00024222326464951038 2023-01-24 05:17:28.692245: step: 598/466, loss: 0.02187812514603138 2023-01-24 05:17:29.286961: step: 600/466, loss: 0.023061878979206085 2023-01-24 05:17:29.905506: step: 602/466, loss: 0.015324532985687256 2023-01-24 05:17:30.519499: step: 604/466, loss: 5.273514398140833e-05 2023-01-24 05:17:31.187935: step: 606/466, loss: 0.0560213066637516 2023-01-24 05:17:31.860210: step: 608/466, loss: 0.04418664053082466 2023-01-24 05:17:32.466410: step: 610/466, loss: 0.0077951340936124325 2023-01-24 05:17:33.103109: step: 612/466, loss: 0.006088362541049719 2023-01-24 05:17:33.715270: step: 614/466, loss: 0.036048773676157 2023-01-24 05:17:34.377957: step: 616/466, loss: 0.00034599084756337106 2023-01-24 05:17:35.014511: step: 618/466, loss: 0.010976719669997692 2023-01-24 05:17:35.626193: step: 620/466, loss: 0.016567006707191467 2023-01-24 05:17:36.258152: step: 622/466, loss: 
2.328640221094247e-05 2023-01-24 05:17:36.932784: step: 624/466, loss: 0.0007390398532152176 2023-01-24 05:17:37.568466: step: 626/466, loss: 0.00033580331364646554 2023-01-24 05:17:38.171110: step: 628/466, loss: 0.00021214628941379488 2023-01-24 05:17:38.783665: step: 630/466, loss: 0.0009931318927556276 2023-01-24 05:17:39.379253: step: 632/466, loss: 0.0009459779830649495 2023-01-24 05:17:39.977099: step: 634/466, loss: 0.009291564114391804 2023-01-24 05:17:40.631163: step: 636/466, loss: 0.013678831979632378 2023-01-24 05:17:41.253848: step: 638/466, loss: 0.00016148884606081992 2023-01-24 05:17:41.826305: step: 640/466, loss: 0.0016017744783312082 2023-01-24 05:17:42.396975: step: 642/466, loss: 0.0026671478990465403 2023-01-24 05:17:43.011361: step: 644/466, loss: 0.14576734602451324 2023-01-24 05:17:43.649586: step: 646/466, loss: 0.025174817070364952 2023-01-24 05:17:44.248724: step: 648/466, loss: 0.00022869682288728654 2023-01-24 05:17:44.900259: step: 650/466, loss: 0.005399439483880997 2023-01-24 05:17:45.502593: step: 652/466, loss: 4.397485463414341e-05 2023-01-24 05:17:46.117771: step: 654/466, loss: 0.017691288143396378 2023-01-24 05:17:46.820434: step: 656/466, loss: 0.006667081732302904 2023-01-24 05:17:47.385105: step: 658/466, loss: 0.0010638827225193381 2023-01-24 05:17:48.014015: step: 660/466, loss: 0.0013960172655060887 2023-01-24 05:17:48.552367: step: 662/466, loss: 0.020952045917510986 2023-01-24 05:17:49.243801: step: 664/466, loss: 0.00041501622763462365 2023-01-24 05:17:49.847150: step: 666/466, loss: 9.108463564189151e-05 2023-01-24 05:17:50.450101: step: 668/466, loss: 9.921830496750772e-05 2023-01-24 05:17:51.021876: step: 670/466, loss: 0.00023376091849058867 2023-01-24 05:17:51.630464: step: 672/466, loss: 0.00015998842718545347 2023-01-24 05:17:52.234018: step: 674/466, loss: 8.416108175879344e-05 2023-01-24 05:17:52.826110: step: 676/466, loss: 0.022522931918501854 2023-01-24 05:17:53.529278: step: 678/466, loss: 
0.0007208925671875477 2023-01-24 05:17:54.200574: step: 680/466, loss: 0.004861995577812195 2023-01-24 05:17:54.793887: step: 682/466, loss: 0.01086998637765646 2023-01-24 05:17:55.486314: step: 684/466, loss: 0.004682690836489201 2023-01-24 05:17:56.084619: step: 686/466, loss: 0.00023975843214429915 2023-01-24 05:17:56.784836: step: 688/466, loss: 0.00241883029229939 2023-01-24 05:17:57.427202: step: 690/466, loss: 1.4486318826675415 2023-01-24 05:17:58.083661: step: 692/466, loss: 0.010795808397233486 2023-01-24 05:17:58.690808: step: 694/466, loss: 0.0011310448171570897 2023-01-24 05:17:59.278027: step: 696/466, loss: 0.2947859466075897 2023-01-24 05:17:59.874665: step: 698/466, loss: 9.568103996571153e-05 2023-01-24 05:18:00.419371: step: 700/466, loss: 0.004735068883746862 2023-01-24 05:18:01.012225: step: 702/466, loss: 0.0025225926656275988 2023-01-24 05:18:01.765005: step: 704/466, loss: 1.4744700193405151 2023-01-24 05:18:02.428142: step: 706/466, loss: 0.013943861238658428 2023-01-24 05:18:03.065526: step: 708/466, loss: 0.011442775838077068 2023-01-24 05:18:03.686839: step: 710/466, loss: 0.7480612993240356 2023-01-24 05:18:04.251771: step: 712/466, loss: 0.008141737431287766 2023-01-24 05:18:04.917319: step: 714/466, loss: 0.009706948883831501 2023-01-24 05:18:05.551435: step: 716/466, loss: 0.003717868123203516 2023-01-24 05:18:06.200260: step: 718/466, loss: 0.01340749766677618 2023-01-24 05:18:06.875423: step: 720/466, loss: 0.001570258755236864 2023-01-24 05:18:07.418849: step: 722/466, loss: 0.013996385037899017 2023-01-24 05:18:08.052194: step: 724/466, loss: 0.008785456418991089 2023-01-24 05:18:08.695633: step: 726/466, loss: 0.04284227639436722 2023-01-24 05:18:09.322408: step: 728/466, loss: 0.007695217151194811 2023-01-24 05:18:10.035038: step: 730/466, loss: 0.00438790675252676 2023-01-24 05:18:10.593719: step: 732/466, loss: 0.0010599372908473015 2023-01-24 05:18:11.186671: step: 734/466, loss: 0.0034963488578796387 2023-01-24 
05:18:11.830617: step: 736/466, loss: 0.0013097894843667746 2023-01-24 05:18:12.492119: step: 738/466, loss: 0.00042985836626030505 2023-01-24 05:18:13.082985: step: 740/466, loss: 0.0009328377200290561 2023-01-24 05:18:13.729908: step: 742/466, loss: 0.9467132687568665 2023-01-24 05:18:14.334427: step: 744/466, loss: 0.0005049612373113632 2023-01-24 05:18:15.049537: step: 746/466, loss: 0.6162884831428528 2023-01-24 05:18:15.622475: step: 748/466, loss: 0.0048098317347466946 2023-01-24 05:18:16.277930: step: 750/466, loss: 0.02664458565413952 2023-01-24 05:18:16.937281: step: 752/466, loss: 0.01973588392138481 2023-01-24 05:18:17.547498: step: 754/466, loss: 0.09814440459012985 2023-01-24 05:18:18.193163: step: 756/466, loss: 0.009177024476230145 2023-01-24 05:18:18.789103: step: 758/466, loss: 0.006227452773600817 2023-01-24 05:18:19.431613: step: 760/466, loss: 7.741794252069667e-05 2023-01-24 05:18:20.123032: step: 762/466, loss: 0.024184376001358032 2023-01-24 05:18:20.749502: step: 764/466, loss: 0.0005583571037277579 2023-01-24 05:18:21.341083: step: 766/466, loss: 0.16849538683891296 2023-01-24 05:18:21.948641: step: 768/466, loss: 0.0015892699593678117 2023-01-24 05:18:22.544767: step: 770/466, loss: 3.370045669726096e-05 2023-01-24 05:18:23.173803: step: 772/466, loss: 0.007952672429382801 2023-01-24 05:18:23.764075: step: 774/466, loss: 0.0034520758781582117 2023-01-24 05:18:24.390314: step: 776/466, loss: 0.14407941699028015 2023-01-24 05:18:25.042625: step: 778/466, loss: 0.005839360412210226 2023-01-24 05:18:25.703742: step: 780/466, loss: 0.003146085422486067 2023-01-24 05:18:26.304520: step: 782/466, loss: 0.000626790220849216 2023-01-24 05:18:26.941425: step: 784/466, loss: 0.012974369339644909 2023-01-24 05:18:27.590639: step: 786/466, loss: 6.935989404155407e-06 2023-01-24 05:18:28.248738: step: 788/466, loss: 5.563042213907465e-05 2023-01-24 05:18:28.941639: step: 790/466, loss: 0.0004825711075682193 2023-01-24 05:18:29.542239: step: 792/466, 
loss: 0.014958060346543789 2023-01-24 05:18:30.159176: step: 794/466, loss: 0.012933915480971336 2023-01-24 05:18:30.778866: step: 796/466, loss: 5.619807325274451e-06 2023-01-24 05:18:31.353686: step: 798/466, loss: 0.00030835132929496467 2023-01-24 05:18:31.965831: step: 800/466, loss: 0.008561772294342518 2023-01-24 05:18:32.826285: step: 802/466, loss: 0.034978095442056656 2023-01-24 05:18:33.452110: step: 804/466, loss: 0.001175934448838234 2023-01-24 05:18:34.085314: step: 806/466, loss: 0.056332968175411224 2023-01-24 05:18:34.702248: step: 808/466, loss: 0.009805538691580296 2023-01-24 05:18:35.317664: step: 810/466, loss: 0.00032590050250291824 2023-01-24 05:18:35.861459: step: 812/466, loss: 0.005281270947307348 2023-01-24 05:18:36.425484: step: 814/466, loss: 0.00117440742906183 2023-01-24 05:18:37.029520: step: 816/466, loss: 0.003840024583041668 2023-01-24 05:18:37.631594: step: 818/466, loss: 0.00021635252051055431 2023-01-24 05:18:38.316279: step: 820/466, loss: 0.01948506012558937 2023-01-24 05:18:38.982666: step: 822/466, loss: 0.0017720076721161604 2023-01-24 05:18:39.569971: step: 824/466, loss: 0.003839016892015934 2023-01-24 05:18:40.133074: step: 826/466, loss: 1.951284139067866e-05 2023-01-24 05:18:40.708585: step: 828/466, loss: 0.005112734157592058 2023-01-24 05:18:41.324301: step: 830/466, loss: 4.354869088274427e-05 2023-01-24 05:18:41.967475: step: 832/466, loss: 0.0506337434053421 2023-01-24 05:18:42.589012: step: 834/466, loss: 0.0003722644178196788 2023-01-24 05:18:43.231102: step: 836/466, loss: 4.691800131695345e-05 2023-01-24 05:18:43.846289: step: 838/466, loss: 0.0004766513593494892 2023-01-24 05:18:44.469011: step: 840/466, loss: 4.759379226015881e-05 2023-01-24 05:18:45.053304: step: 842/466, loss: 0.0035098448861390352 2023-01-24 05:18:45.630858: step: 844/466, loss: 0.013005075044929981 2023-01-24 05:18:46.225653: step: 846/466, loss: 0.01408541202545166 2023-01-24 05:18:46.845172: step: 848/466, loss: 0.00024867948377504945 
2023-01-24 05:18:47.506185: step: 850/466, loss: 0.016286736354231834 2023-01-24 05:18:48.166318: step: 852/466, loss: 0.0002770618593785912 2023-01-24 05:18:48.841989: step: 854/466, loss: 0.006749913562089205 2023-01-24 05:18:49.522060: step: 856/466, loss: 0.1688026338815689 2023-01-24 05:18:50.180999: step: 858/466, loss: 0.05698254704475403 2023-01-24 05:18:50.824563: step: 860/466, loss: 0.0005167814088054001 2023-01-24 05:18:51.403637: step: 862/466, loss: 0.00023729843087494373 2023-01-24 05:18:52.074329: step: 864/466, loss: 0.011458028107881546 2023-01-24 05:18:52.690673: step: 866/466, loss: 0.004539316054433584 2023-01-24 05:18:53.302751: step: 868/466, loss: 0.29944872856140137 2023-01-24 05:18:53.838407: step: 870/466, loss: 2.1613164790323935e-05 2023-01-24 05:18:54.464712: step: 872/466, loss: 0.06793544441461563 2023-01-24 05:18:55.067863: step: 874/466, loss: 0.01879395917057991 2023-01-24 05:18:55.668331: step: 876/466, loss: 0.003989489749073982 2023-01-24 05:18:56.353858: step: 878/466, loss: 0.0003668160643428564 2023-01-24 05:18:56.979661: step: 880/466, loss: 0.0028202959802001715 2023-01-24 05:18:57.656428: step: 882/466, loss: 0.013466686941683292 2023-01-24 05:18:58.238259: step: 884/466, loss: 0.008356669917702675 2023-01-24 05:18:58.849678: step: 886/466, loss: 0.0007907028775662184 2023-01-24 05:18:59.475595: step: 888/466, loss: 2.8483047572080977e-05 2023-01-24 05:19:00.071527: step: 890/466, loss: 0.04728523641824722 2023-01-24 05:19:00.715741: step: 892/466, loss: 0.000980711542069912 2023-01-24 05:19:01.308211: step: 894/466, loss: 0.01104824896901846 2023-01-24 05:19:01.958680: step: 896/466, loss: 0.04990740492939949 2023-01-24 05:19:02.603779: step: 898/466, loss: 0.02798609994351864 2023-01-24 05:19:03.175886: step: 900/466, loss: 0.005783462896943092 2023-01-24 05:19:03.808045: step: 902/466, loss: 1.5856760001042858e-05 2023-01-24 05:19:04.405853: step: 904/466, loss: 0.00041372032137587667 2023-01-24 05:19:05.028016: step: 
906/466, loss: 0.83101886510849 2023-01-24 05:19:05.636821: step: 908/466, loss: 1.1503672112667118e-06 2023-01-24 05:19:06.253298: step: 910/466, loss: 0.001675062463618815 2023-01-24 05:19:06.949698: step: 912/466, loss: 0.556420624256134 2023-01-24 05:19:07.549296: step: 914/466, loss: 0.0005540436832234263 2023-01-24 05:19:08.199178: step: 916/466, loss: 0.0001962615642696619 2023-01-24 05:19:08.831976: step: 918/466, loss: 1.648462176322937 2023-01-24 05:19:09.448578: step: 920/466, loss: 0.0024485636968165636 2023-01-24 05:19:10.062621: step: 922/466, loss: 2.499064248695504e-05 2023-01-24 05:19:10.715836: step: 924/466, loss: 0.0010540563380345702 2023-01-24 05:19:11.293898: step: 926/466, loss: 0.0013296870747581124 2023-01-24 05:19:11.915539: step: 928/466, loss: 0.006474127992987633 2023-01-24 05:19:12.569256: step: 930/466, loss: 0.049021899700164795 2023-01-24 05:19:13.138967: step: 932/466, loss: 0.007290661800652742
==================================================
Loss: 0.043
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34078330745788055, 'r': 0.34078330745788055, 'f1': 0.34078330745788055}, 'combined': 0.2511034897058067, 'epoch': 39}
Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3570916749413701, 'r': 0.3103665077696657, 'f1': 0.3320936021939678}, 'combined': 0.22024860663641382, 'epoch': 39}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32793956043956046, 'r': 0.2825994318181818, 'f1': 0.3035859613428281}, 'combined': 0.20239064089521874, 'epoch': 39}
Test Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3657722769994456, 'r': 0.2928713899025024, 'f1': 0.3252873762728467}, 'combined': 0.21229281398859468, 'epoch': 39}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3306434373309168, 'r': 0.3369175063504788, 'f1': 0.3337509884336509}, 'combined': 0.2459217809511112, 'epoch': 39}
Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.34843746611739584, 'r': 0.300813660194776, 'f1': 0.3228789147494531}, 'combined': 0.21413731133642486, 'epoch': 39}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2876984126984127, 'r': 0.3452380952380952, 'f1': 0.3138528138528138}, 'combined': 0.20923520923520916, 'epoch': 39}
Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.625, 'r': 0.43478260869565216, 'f1': 0.5128205128205128}, 'combined': 0.34188034188034183, 'epoch': 39}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4444444444444444, 'r': 0.13793103448275862, 'f1': 0.21052631578947367}, 'combined': 0.14035087719298245, 'epoch': 39}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.35600821224540485, 'r': 0.3296622534644356, 'f1': 0.34232907896701}, 'combined': 0.25224247923884946, 'epoch': 20}
Test for Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.3708503216596444, 'r': 0.2994463568648126, 'f1': 0.33134515303755174}, 'combined': 0.21975222584873896, 'epoch': 20}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38541666666666663, 'r': 0.35238095238095235, 'f1': 0.36815920398009944}, 'combined': 0.24543946932006627, 'epoch': 20}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.32793956043956046, 'r': 0.2825994318181818, 'f1': 0.3035859613428281}, 'combined': 0.20239064089521874, 'epoch': 39}
Test for Korean: {'template': {'p': 0.96875, 'r': 0.49206349206349204, 'f1': 0.6526315789473683}, 'slot': {'p': 0.3657722769994456, 'r': 0.2928713899025024, 'f1': 0.3252873762728467}, 'combined': 0.21229281398859468, 'epoch': 39}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.625, 'r': 0.43478260869565216, 'f1': 0.5128205128205128}, 'combined': 0.34188034188034183, 'epoch': 39}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33845733174116954, 'r': 0.33075052342827765, 'f1': 0.3345595505694863}, 'combined': 0.2465175635775162, 'epoch': 37}
Test for Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5079365079365079, 'f1': 0.6632124352331605}, 'slot': {'p': 0.36164126771888916, 'r': 0.3062209176009295, 'f1': 0.33163165478581674}, 'combined': 0.21994223737090435, 'epoch': 37}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5555555555555556, 'r': 0.1724137931034483, 'f1': 0.26315789473684215}, 'combined': 0.1754385964912281, 'epoch': 37}