Command that produces this log:
python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
----------------------------------------------------------------------------------------------------
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024])
>>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024])
>>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024])
>>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024])
>>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024])
>>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096])
>>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096])
>>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024])
[xlmr.encoder.layer.1 through xlmr.encoder.layer.23 each list the same 16 parameters with identical shapes as layer 0]
>>> xlmr.pooler.dense.weight: torch.Size([1024, 1024])
>>> xlmr.pooler.dense.bias: torch.Size([1024])
>>> basic_gcn.T_T.0.weight: torch.Size([1024, 1024])
>>> basic_gcn.T_T.0.bias: torch.Size([1024])
>>> basic_gcn.T_T.1.weight: torch.Size([1024, 1024])
>>> basic_gcn.T_T.1.bias: torch.Size([1024])
>>> basic_gcn.T_T.2.weight: torch.Size([1024, 1024])
>>> basic_gcn.T_T.2.bias: torch.Size([1024])
[basic_gcn.T_E, basic_gcn.E_T, and basic_gcn.E_E each list the same six parameters (indices 0-2, weight torch.Size([1024, 1024]), bias torch.Size([1024])) as basic_gcn.T_T]
>>> basic_gcn.f_t.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_t.0.bias: torch.Size([1024])
>>> basic_gcn.f_e.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_e.0.bias: torch.Size([1024])
>>> name2classifier.occupy-ffn.layers.0.weight: torch.Size([350, 1024])
>>> name2classifier.occupy-ffn.layers.0.bias: torch.Size([350])
>>> name2classifier.occupy-ffn.layers.1.weight: torch.Size([2, 350])
>>> name2classifier.occupy-ffn.layers.1.bias: torch.Size([2])
[Every name2classifier.<role>-ffn head lists the same four parameter shapes as occupy-ffn. Heads logged, in order: occupy, outcome, protest-event, when, where, who, protest-against, protest-for, organizer, wounded, arrested, killed, imprisoned, charged-with, corrupt-event, judicial-actions, prison-term, fine, npi-events, disease, infected-cumulative, killed-count, killed-cumulative, outbreak-event, infected-count, killed-individuals]
>>> name2classifier.killed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>>
[log truncated here]
name2classifier.killed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.killed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.infected-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.infected-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.infected-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.infected-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-count-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.tested-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.vaccinated-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blamed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blamed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blamed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blamed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.claimed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.claimed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.claimed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.claimed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.terror-event-ffn.layers.0.weight: 
torch.Size([350, 1024]) >>> name2classifier.terror-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.terror-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.terror-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.kidnapped-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.kidnapped-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.kidnapped-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.kidnapped-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-org-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-org-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-perp-org-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-org-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-physical-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-physical-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-physical-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-physical-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-perp-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-killed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-killed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-killed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-killed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-human-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-human-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-human-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-human-ffn.layers.1.bias: torch.Size([2]) >>> 
name2classifier.perp-captured-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-captured-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-captured-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-captured-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-objective-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-objective-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-objective-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-objective-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.weapon-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.weapon-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.weapon-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.weapon-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-organizer-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-organizer-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-organizer-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-organizer-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.damage-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.damage-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.damage-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.damage-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-events-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.human-displacement-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.major-disaster-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.major-disaster-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.major-disaster-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.major-disaster-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.responders-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.responders-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.responders-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.responders-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-provided-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-provided-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-provided-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-provided-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescue-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescue-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescue-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescue-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.individuals-affected-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.individuals-affected-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.individuals-affected-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.individuals-affected-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.missing-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.missing-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.missing-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.missing-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.injured-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.injured-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.injured-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.injured-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-needed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-needed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-needed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-needed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescued-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescued-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescued-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescued-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.repair-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.repair-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.repair-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.repair-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.declare-emergency-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.declare-emergency-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.declare-emergency-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.declare-emergency-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.announce-disaster-warnings-ffn.layers.0.weight: torch.Size([350, 1024]) >>> 
name2classifier.announce-disaster-warnings-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.disease-outbreak-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.disease-outbreak-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.current-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.current-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.current-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.current-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.group-identity-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.group-identity-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.group-identity-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.group-identity-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.human-displacement-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.origin-ffn.layers.0.bias: torch.Size([350]) 
>>> name2classifier.origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.total-displaced-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.total-displaced-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.total-displaced-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.total-displaced-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transitory-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transitory-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transitory-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transitory-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.destination-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.destination-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.destination-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.destination-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transiting-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transiting-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transiting-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transiting-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.detained-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.detained-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.detained-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.detained-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blocked-migration-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blocked-migration-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.cybercrime-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.cybercrime-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.cybercrime-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.cybercrime-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.information-stolen-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.information-stolen-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.information-stolen-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.information-stolen-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-crimes-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-crimes-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-crimes-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-crimes-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.response-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.response-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.response-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.response-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.victim-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perpetrator-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perpetrator-ffn.layers.0.bias: 
torch.Size([350]) >>> name2classifier.perpetrator-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perpetrator-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-impact-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-impact-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-impact-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.victim-impact-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.etip-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.etip-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.etip-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.etip-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-name-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-name-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-name-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-name-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.signatories-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.signatories-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.signatories-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.signatories-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awardee-ffn.layers.0.weight: torch.Size([350, 
1024]) >>> name2classifier.contract-awardee-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awardee-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awardee-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.overall-project-value-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.overall-project-value-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.overall-project-value-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.overall-project-value-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-recipient-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-recipient-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-recipient-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-recipient-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-source-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-source-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-source-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-source-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awarder-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-awarder-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awarder-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awarder-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.agreement-length-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.agreement-length-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.agreement-length-ffn.layers.1.weight: 
torch.Size([2, 350]) >>> name2classifier.agreement-length-ffn.layers.1.bias: torch.Size([2]) >>> irrealis_classifier.layers.0.weight: torch.Size([350, 1128]) >>> irrealis_classifier.layers.0.bias: torch.Size([350]) >>> irrealis_classifier.layers.1.weight: torch.Size([7, 350]) >>> irrealis_classifier.layers.1.bias: torch.Size([7]) n_trainable_params: 614103147, n_nontrainable_params: 0 ---------------------------------------------------------------------------------------------------- ****************************** Epoch: 0 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:18:27.163833: step: 2/464, loss: 9.374310493469238 2023-01-22 11:18:27.962857: step: 4/464, loss: 18.727828979492188 2023-01-22 11:18:28.668707: step: 6/464, loss: 12.132365226745605 2023-01-22 11:18:29.422379: step: 8/464, loss: 19.571609497070312 2023-01-22 11:18:30.313173: step: 10/464, loss: 5.619302749633789 2023-01-22 11:18:31.095539: step: 12/464, loss: 8.781749725341797 2023-01-22 11:18:31.795334: step: 14/464, loss: 12.500725746154785 2023-01-22 11:18:32.522811: step: 16/464, loss: 16.95241928100586 2023-01-22 11:18:33.282859: step: 18/464, loss: 10.717041015625 2023-01-22 11:18:34.125687: step: 20/464, loss: 6.4841814041137695 2023-01-22 11:18:34.831463: step: 22/464, loss: 27.81028938293457 2023-01-22 11:18:35.580527: step: 24/464, loss: 11.50721549987793 2023-01-22 11:18:36.406820: step: 26/464, loss: 17.522432327270508 2023-01-22 11:18:37.171969: step: 28/464, loss: 16.023082733154297 2023-01-22 11:18:37.923727: step: 30/464, loss: 6.1217217445373535 2023-01-22 11:18:38.698611: step: 32/464, loss: 15.088384628295898 2023-01-22 11:18:39.440851: step: 34/464, loss: 18.26673698425293 2023-01-22 11:18:40.190217: step: 36/464, loss: 16.102869033813477 2023-01-22 11:18:41.001542: step: 38/464, loss: 
19.091053009033203 2023-01-22 11:18:41.704038: step: 40/464, loss: 16.98545265197754 2023-01-22 11:18:42.488177: step: 42/464, loss: 5.916228294372559 2023-01-22 11:18:43.211755: step: 44/464, loss: 19.067594528198242 2023-01-22 11:18:43.948727: step: 46/464, loss: 14.185728073120117 2023-01-22 11:18:44.708325: step: 48/464, loss: 10.070418357849121 2023-01-22 11:18:45.457610: step: 50/464, loss: 9.842507362365723 2023-01-22 11:18:46.244734: step: 52/464, loss: 29.716541290283203 2023-01-22 11:18:46.989757: step: 54/464, loss: 20.947689056396484 2023-01-22 11:18:47.770497: step: 56/464, loss: 25.807109832763672 2023-01-22 11:18:48.584983: step: 58/464, loss: 6.3881378173828125 2023-01-22 11:18:49.351724: step: 60/464, loss: 10.062652587890625 2023-01-22 11:18:50.143203: step: 62/464, loss: 9.788475036621094 2023-01-22 11:18:50.851337: step: 64/464, loss: 15.149709701538086 2023-01-22 11:18:51.524984: step: 66/464, loss: 18.39317512512207 2023-01-22 11:18:52.282499: step: 68/464, loss: 19.415142059326172 2023-01-22 11:18:53.022648: step: 70/464, loss: 9.763272285461426 2023-01-22 11:18:53.847233: step: 72/464, loss: 12.71608829498291 2023-01-22 11:18:54.552325: step: 74/464, loss: 19.501523971557617 2023-01-22 11:18:55.296172: step: 76/464, loss: 8.746461868286133 2023-01-22 11:18:56.032841: step: 78/464, loss: 9.88006591796875 2023-01-22 11:18:56.825304: step: 80/464, loss: 11.285353660583496 2023-01-22 11:18:57.622040: step: 82/464, loss: 5.65945291519165 2023-01-22 11:18:58.373758: step: 84/464, loss: 13.332258224487305 2023-01-22 11:18:59.154074: step: 86/464, loss: 10.611173629760742 2023-01-22 11:18:59.840491: step: 88/464, loss: 10.113225936889648 2023-01-22 11:19:00.611774: step: 90/464, loss: 12.189413070678711 2023-01-22 11:19:01.300606: step: 92/464, loss: 10.418415069580078 2023-01-22 11:19:01.998602: step: 94/464, loss: 7.840319633483887 2023-01-22 11:19:02.784217: step: 96/464, loss: 16.116275787353516 2023-01-22 11:19:03.469676: step: 98/464, loss: 
4.881072521209717 2023-01-22 11:19:04.223580: step: 100/464, loss: 12.32833194732666 2023-01-22 11:19:05.086571: step: 102/464, loss: 14.402412414550781 2023-01-22 11:19:05.803793: step: 104/464, loss: 7.665492534637451 2023-01-22 11:19:06.530275: step: 106/464, loss: 4.3873772621154785 2023-01-22 11:19:07.297146: step: 108/464, loss: 5.572765350341797 2023-01-22 11:19:08.062862: step: 110/464, loss: 14.701895713806152 2023-01-22 11:19:08.797618: step: 112/464, loss: 9.786662101745605 2023-01-22 11:19:09.515719: step: 114/464, loss: 5.392247200012207 2023-01-22 11:19:10.310691: step: 116/464, loss: 8.178632736206055 2023-01-22 11:19:11.087370: step: 118/464, loss: 9.314098358154297 2023-01-22 11:19:11.849121: step: 120/464, loss: 20.993085861206055 2023-01-22 11:19:12.638320: step: 122/464, loss: 4.979355335235596 2023-01-22 11:19:13.372528: step: 124/464, loss: 20.287872314453125 2023-01-22 11:19:14.130391: step: 126/464, loss: 13.280475616455078 2023-01-22 11:19:14.876871: step: 128/464, loss: 5.503267765045166 2023-01-22 11:19:15.734607: step: 130/464, loss: 14.450859069824219 2023-01-22 11:19:16.543161: step: 132/464, loss: 16.351163864135742 2023-01-22 11:19:17.296454: step: 134/464, loss: 9.963763236999512 2023-01-22 11:19:17.998241: step: 136/464, loss: 13.730572700500488 2023-01-22 11:19:18.631923: step: 138/464, loss: 11.675607681274414 2023-01-22 11:19:19.389055: step: 140/464, loss: 11.64958667755127 2023-01-22 11:19:20.122150: step: 142/464, loss: 7.854397296905518 2023-01-22 11:19:20.909916: step: 144/464, loss: 12.849346160888672 2023-01-22 11:19:21.751427: step: 146/464, loss: 21.994861602783203 2023-01-22 11:19:22.445952: step: 148/464, loss: 6.948868274688721 2023-01-22 11:19:23.250275: step: 150/464, loss: 8.413993835449219 2023-01-22 11:19:23.981657: step: 152/464, loss: 2.9484715461730957 2023-01-22 11:19:24.748923: step: 154/464, loss: 13.926671028137207 2023-01-22 11:19:25.555171: step: 156/464, loss: 2.777003049850464 2023-01-22 
11:19:26.270567: step: 158/464, loss: 10.172282218933105 2023-01-22 11:19:27.089059: step: 160/464, loss: 5.547769546508789 2023-01-22 11:19:27.840487: step: 162/464, loss: 4.884523868560791 2023-01-22 11:19:28.581856: step: 164/464, loss: 6.705595016479492 2023-01-22 11:19:29.406755: step: 166/464, loss: 8.995575904846191 2023-01-22 11:19:30.162117: step: 168/464, loss: 11.319543838500977 2023-01-22 11:19:30.969791: step: 170/464, loss: 15.827716827392578 2023-01-22 11:19:31.749573: step: 172/464, loss: 10.849842071533203 2023-01-22 11:19:32.477189: step: 174/464, loss: 3.5209128856658936 2023-01-22 11:19:33.220004: step: 176/464, loss: 6.0415544509887695 2023-01-22 11:19:33.953717: step: 178/464, loss: 4.605469703674316 2023-01-22 11:19:34.724439: step: 180/464, loss: 6.940061569213867 2023-01-22 11:19:35.578430: step: 182/464, loss: 11.00831413269043 2023-01-22 11:19:36.256449: step: 184/464, loss: 4.500629901885986 2023-01-22 11:19:36.941853: step: 186/464, loss: 11.850529670715332 2023-01-22 11:19:37.713948: step: 188/464, loss: 9.06021785736084 2023-01-22 11:19:38.362708: step: 190/464, loss: 6.195465087890625 2023-01-22 11:19:39.155307: step: 192/464, loss: 3.9920973777770996 2023-01-22 11:19:39.878191: step: 194/464, loss: 10.8495512008667 2023-01-22 11:19:40.634018: step: 196/464, loss: 5.404269695281982 2023-01-22 11:19:41.346036: step: 198/464, loss: 12.221254348754883 2023-01-22 11:19:42.186622: step: 200/464, loss: 7.055595397949219 2023-01-22 11:19:42.898112: step: 202/464, loss: 15.114753723144531 2023-01-22 11:19:43.642950: step: 204/464, loss: 4.877196311950684 2023-01-22 11:19:44.351093: step: 206/464, loss: 6.17531681060791 2023-01-22 11:19:45.101168: step: 208/464, loss: 6.529362201690674 2023-01-22 11:19:45.762303: step: 210/464, loss: 8.570602416992188 2023-01-22 11:19:46.468947: step: 212/464, loss: 17.8670711517334 2023-01-22 11:19:47.159873: step: 214/464, loss: 1.9516692161560059 2023-01-22 11:19:47.865956: step: 216/464, loss: 
6.888588905334473 2023-01-22 11:19:48.566382: step: 218/464, loss: 19.382837295532227 2023-01-22 11:19:49.436231: step: 220/464, loss: 4.47511100769043 2023-01-22 11:19:50.188102: step: 222/464, loss: 3.614959239959717 2023-01-22 11:19:50.997244: step: 224/464, loss: 6.809412002563477 2023-01-22 11:19:51.777611: step: 226/464, loss: 5.449559688568115 2023-01-22 11:19:52.491536: step: 228/464, loss: 11.705848693847656 2023-01-22 11:19:53.250933: step: 230/464, loss: 16.532360076904297 2023-01-22 11:19:53.993424: step: 232/464, loss: 3.4342377185821533 2023-01-22 11:19:54.743335: step: 234/464, loss: 4.141678333282471 2023-01-22 11:19:55.449638: step: 236/464, loss: 5.295482158660889 2023-01-22 11:19:56.091427: step: 238/464, loss: 9.353677749633789 2023-01-22 11:19:56.863718: step: 240/464, loss: 3.7013063430786133 2023-01-22 11:19:57.610680: step: 242/464, loss: 3.4981331825256348 2023-01-22 11:19:58.380229: step: 244/464, loss: 6.408645153045654 2023-01-22 11:19:59.146152: step: 246/464, loss: 9.78841781616211 2023-01-22 11:19:59.845556: step: 248/464, loss: 9.539456367492676 2023-01-22 11:20:00.715337: step: 250/464, loss: 9.358221054077148 2023-01-22 11:20:01.490210: step: 252/464, loss: 9.760936737060547 2023-01-22 11:20:02.289074: step: 254/464, loss: 5.140605449676514 2023-01-22 11:20:03.001642: step: 256/464, loss: 13.301159858703613 2023-01-22 11:20:03.731718: step: 258/464, loss: 12.485919952392578 2023-01-22 11:20:04.449700: step: 260/464, loss: 15.078859329223633 2023-01-22 11:20:05.178812: step: 262/464, loss: 6.826125144958496 2023-01-22 11:20:05.919914: step: 264/464, loss: 3.3939123153686523 2023-01-22 11:20:06.666428: step: 266/464, loss: 4.8039350509643555 2023-01-22 11:20:07.380531: step: 268/464, loss: 7.7959089279174805 2023-01-22 11:20:08.087024: step: 270/464, loss: 4.709813117980957 2023-01-22 11:20:08.854206: step: 272/464, loss: 2.368288516998291 2023-01-22 11:20:09.620558: step: 274/464, loss: 9.393926620483398 2023-01-22 11:20:10.354357: 
step: 276/464, loss: 7.802385330200195 2023-01-22 11:20:10.995714: step: 278/464, loss: 7.686842918395996 2023-01-22 11:20:11.757222: step: 280/464, loss: 9.50601577758789 2023-01-22 11:20:12.534141: step: 282/464, loss: 5.707566738128662 2023-01-22 11:20:13.244389: step: 284/464, loss: 5.4928107261657715 2023-01-22 11:20:14.003606: step: 286/464, loss: 10.356019973754883 2023-01-22 11:20:14.780201: step: 288/464, loss: 2.413654327392578 2023-01-22 11:20:15.661919: step: 290/464, loss: 8.67782211303711 2023-01-22 11:20:16.350193: step: 292/464, loss: 5.6798858642578125 2023-01-22 11:20:17.111198: step: 294/464, loss: 14.590892791748047 2023-01-22 11:20:17.827394: step: 296/464, loss: 5.307000160217285 2023-01-22 11:20:18.668155: step: 298/464, loss: 16.15806007385254 2023-01-22 11:20:19.415898: step: 300/464, loss: 5.119200706481934 2023-01-22 11:20:20.099830: step: 302/464, loss: 2.9111478328704834 2023-01-22 11:20:20.901311: step: 304/464, loss: 6.235689163208008 2023-01-22 11:20:21.774829: step: 306/464, loss: 5.8132147789001465 2023-01-22 11:20:22.546364: step: 308/464, loss: 2.717848300933838 2023-01-22 11:20:23.330972: step: 310/464, loss: 6.308784484863281 2023-01-22 11:20:24.100257: step: 312/464, loss: 4.407107353210449 2023-01-22 11:20:24.825418: step: 314/464, loss: 3.238675117492676 2023-01-22 11:20:25.598023: step: 316/464, loss: 11.632369995117188 2023-01-22 11:20:26.329378: step: 318/464, loss: 6.453070163726807 2023-01-22 11:20:27.069461: step: 320/464, loss: 2.863231897354126 2023-01-22 11:20:27.861864: step: 322/464, loss: 2.2388625144958496 2023-01-22 11:20:28.672616: step: 324/464, loss: 19.541534423828125 2023-01-22 11:20:29.500041: step: 326/464, loss: 9.022233009338379 2023-01-22 11:20:30.348457: step: 328/464, loss: 4.538577556610107 2023-01-22 11:20:31.146422: step: 330/464, loss: 1.4825515747070312 2023-01-22 11:20:31.835996: step: 332/464, loss: 3.8235790729522705 2023-01-22 11:20:32.655378: step: 334/464, loss: 6.108069896697998 
2023-01-22 11:20:33.425627: step: 336/464, loss: 7.865338325500488 2023-01-22 11:20:34.132870: step: 338/464, loss: 1.6649757623672485 2023-01-22 11:20:34.809353: step: 340/464, loss: 15.700176239013672 2023-01-22 11:20:35.522614: step: 342/464, loss: 1.5405945777893066 2023-01-22 11:20:36.280842: step: 344/464, loss: 6.349061965942383 2023-01-22 11:20:37.075226: step: 346/464, loss: 3.196582317352295 2023-01-22 11:20:37.746403: step: 348/464, loss: 8.026315689086914 2023-01-22 11:20:38.494310: step: 350/464, loss: 1.8204209804534912 2023-01-22 11:20:39.208142: step: 352/464, loss: 4.419278144836426 2023-01-22 11:20:40.017251: step: 354/464, loss: 6.484158039093018 2023-01-22 11:20:40.725864: step: 356/464, loss: 3.7357139587402344 2023-01-22 11:20:41.535503: step: 358/464, loss: 8.426981925964355 2023-01-22 11:20:42.247986: step: 360/464, loss: 5.3540544509887695 2023-01-22 11:20:42.995612: step: 362/464, loss: 10.684189796447754 2023-01-22 11:20:43.847475: step: 364/464, loss: 4.385252952575684 2023-01-22 11:20:44.579216: step: 366/464, loss: 1.5382404327392578 2023-01-22 11:20:45.285149: step: 368/464, loss: 2.7515692710876465 2023-01-22 11:20:46.070996: step: 370/464, loss: 2.2363736629486084 2023-01-22 11:20:46.803412: step: 372/464, loss: 3.405794620513916 2023-01-22 11:20:47.697370: step: 374/464, loss: 3.746248245239258 2023-01-22 11:20:48.365743: step: 376/464, loss: 6.643892288208008 2023-01-22 11:20:49.166587: step: 378/464, loss: 9.15991497039795 2023-01-22 11:20:49.868602: step: 380/464, loss: 1.7974623441696167 2023-01-22 11:20:50.639360: step: 382/464, loss: 2.5357003211975098 2023-01-22 11:20:51.452980: step: 384/464, loss: 6.739011764526367 2023-01-22 11:20:52.141161: step: 386/464, loss: 7.696263790130615 2023-01-22 11:20:52.880506: step: 388/464, loss: 3.198539972305298 2023-01-22 11:20:53.766349: step: 390/464, loss: 4.967364311218262 2023-01-22 11:20:54.583960: step: 392/464, loss: 8.086828231811523 2023-01-22 11:20:55.310497: step: 394/464, 
loss: 2.0360107421875 2023-01-22 11:20:56.033119: step: 396/464, loss: 1.9013662338256836 2023-01-22 11:20:56.808666: step: 398/464, loss: 5.694022178649902 2023-01-22 11:20:57.569164: step: 400/464, loss: 5.332306861877441 2023-01-22 11:20:58.334053: step: 402/464, loss: 7.137255668640137 2023-01-22 11:20:59.052710: step: 404/464, loss: 1.5250176191329956 2023-01-22 11:20:59.723264: step: 406/464, loss: 6.838038921356201 2023-01-22 11:21:00.465833: step: 408/464, loss: 6.499872207641602 2023-01-22 11:21:01.254073: step: 410/464, loss: 2.3436036109924316 2023-01-22 11:21:02.027572: step: 412/464, loss: 4.896717071533203 2023-01-22 11:21:02.794976: step: 414/464, loss: 2.5820794105529785 2023-01-22 11:21:03.605440: step: 416/464, loss: 1.179425597190857 2023-01-22 11:21:04.362673: step: 418/464, loss: 4.6026611328125 2023-01-22 11:21:05.099213: step: 420/464, loss: 2.9256577491760254 2023-01-22 11:21:05.835327: step: 422/464, loss: 3.4536118507385254 2023-01-22 11:21:06.588895: step: 424/464, loss: 4.34628963470459 2023-01-22 11:21:07.434785: step: 426/464, loss: 1.0080816745758057 2023-01-22 11:21:08.192710: step: 428/464, loss: 6.766529083251953 2023-01-22 11:21:08.881899: step: 430/464, loss: 1.8124334812164307 2023-01-22 11:21:09.696064: step: 432/464, loss: 3.1404690742492676 2023-01-22 11:21:10.440837: step: 434/464, loss: 3.186899185180664 2023-01-22 11:21:11.227731: step: 436/464, loss: 1.518646478652954 2023-01-22 11:21:11.940814: step: 438/464, loss: 2.915464162826538 2023-01-22 11:21:12.692386: step: 440/464, loss: 3.0261709690093994 2023-01-22 11:21:13.455535: step: 442/464, loss: 8.494826316833496 2023-01-22 11:21:14.207715: step: 444/464, loss: 1.1129310131072998 2023-01-22 11:21:15.037141: step: 446/464, loss: 1.0951164960861206 2023-01-22 11:21:15.767314: step: 448/464, loss: 3.7174148559570312 2023-01-22 11:21:16.638677: step: 450/464, loss: 1.5301337242126465 2023-01-22 11:21:17.414793: step: 452/464, loss: 1.1110576391220093 2023-01-22 
11:21:18.190967: step: 454/464, loss: 4.443788528442383 2023-01-22 11:21:18.946455: step: 456/464, loss: 1.8520171642303467 2023-01-22 11:21:19.755995: step: 458/464, loss: 2.3815908432006836 2023-01-22 11:21:20.540101: step: 460/464, loss: 1.3823764324188232 2023-01-22 11:21:21.206274: step: 462/464, loss: 1.3576384782791138 2023-01-22 11:21:21.950287: step: 464/464, loss: 4.694050312042236 2023-01-22 11:21:22.783313: step: 466/464, loss: 1.1287555694580078 2023-01-22 11:21:23.501605: step: 468/464, loss: 2.7844748497009277 2023-01-22 11:21:24.276930: step: 470/464, loss: 1.916163682937622 2023-01-22 11:21:25.019563: step: 472/464, loss: 3.91558837890625 2023-01-22 11:21:25.752418: step: 474/464, loss: 10.167104721069336 2023-01-22 11:21:26.534112: step: 476/464, loss: 4.300200939178467 2023-01-22 11:21:27.233194: step: 478/464, loss: 1.098439335823059 2023-01-22 11:21:27.893919: step: 480/464, loss: 4.241396903991699 2023-01-22 11:21:28.635660: step: 482/464, loss: 2.5566744804382324 2023-01-22 11:21:29.378494: step: 484/464, loss: 0.6062437891960144 2023-01-22 11:21:30.149450: step: 486/464, loss: 3.4063737392425537 2023-01-22 11:21:30.943190: step: 488/464, loss: 1.0972111225128174 2023-01-22 11:21:31.737744: step: 490/464, loss: 2.4079630374908447 2023-01-22 11:21:32.449122: step: 492/464, loss: 1.7030748128890991 2023-01-22 11:21:33.230080: step: 494/464, loss: 0.6144923567771912 2023-01-22 11:21:33.976382: step: 496/464, loss: 4.25278902053833 2023-01-22 11:21:34.810076: step: 498/464, loss: 11.448986053466797 2023-01-22 11:21:35.558063: step: 500/464, loss: 1.90968656539917 2023-01-22 11:21:36.299522: step: 502/464, loss: 3.7317092418670654 2023-01-22 11:21:37.139292: step: 504/464, loss: 5.710849761962891 2023-01-22 11:21:37.877189: step: 506/464, loss: 3.7273013591766357 2023-01-22 11:21:38.642983: step: 508/464, loss: 1.74265718460083 2023-01-22 11:21:39.384646: step: 510/464, loss: 4.305446624755859 2023-01-22 11:21:40.118113: step: 512/464, loss: 
3.4846932888031006 2023-01-22 11:21:40.823920: step: 514/464, loss: 0.8061954379081726 2023-01-22 11:21:41.512145: step: 516/464, loss: 1.9426944255828857 2023-01-22 11:21:42.295047: step: 518/464, loss: 3.0346713066101074 2023-01-22 11:21:43.066570: step: 520/464, loss: 5.749312877655029 2023-01-22 11:21:43.861404: step: 522/464, loss: 8.312419891357422 2023-01-22 11:21:44.546210: step: 524/464, loss: 0.8057400584220886 2023-01-22 11:21:45.289919: step: 526/464, loss: 0.5294106602668762 2023-01-22 11:21:46.100664: step: 528/464, loss: 2.6814074516296387 2023-01-22 11:21:46.727372: step: 530/464, loss: 2.8819162845611572 2023-01-22 11:21:47.521498: step: 532/464, loss: 5.775310039520264 2023-01-22 11:21:48.308981: step: 534/464, loss: 1.208691954612732 2023-01-22 11:21:49.092893: step: 536/464, loss: 0.7920262217521667 2023-01-22 11:21:49.845417: step: 538/464, loss: 0.8998887538909912 2023-01-22 11:21:50.569651: step: 540/464, loss: 5.381350517272949 2023-01-22 11:21:51.324564: step: 542/464, loss: 0.716932475566864 2023-01-22 11:21:52.134161: step: 544/464, loss: 7.021783828735352 2023-01-22 11:21:52.825555: step: 546/464, loss: 1.2206028699874878 2023-01-22 11:21:53.606580: step: 548/464, loss: 1.2179982662200928 2023-01-22 11:21:54.316782: step: 550/464, loss: 0.6256527900695801 2023-01-22 11:21:55.110127: step: 552/464, loss: 2.34977388381958 2023-01-22 11:21:55.849478: step: 554/464, loss: 2.508776903152466 2023-01-22 11:21:56.579102: step: 556/464, loss: 0.8136802911758423 2023-01-22 11:21:57.354264: step: 558/464, loss: 5.513727188110352 2023-01-22 11:21:58.041295: step: 560/464, loss: 1.3077001571655273 2023-01-22 11:21:58.771365: step: 562/464, loss: 2.8942441940307617 2023-01-22 11:21:59.507838: step: 564/464, loss: 1.109363317489624 2023-01-22 11:22:00.337446: step: 566/464, loss: 1.4600365161895752 2023-01-22 11:22:01.206135: step: 568/464, loss: 2.8010220527648926 2023-01-22 11:22:01.952141: step: 570/464, loss: 3.948335647583008 2023-01-22 
11:22:02.690265: step: 572/464, loss: 2.7902843952178955 2023-01-22 11:22:03.384168: step: 574/464, loss: 0.5588739514350891 2023-01-22 11:22:04.128863: step: 576/464, loss: 2.748721122741699 2023-01-22 11:22:05.068850: step: 578/464, loss: 0.30092504620552063 2023-01-22 11:22:05.894405: step: 580/464, loss: 0.7211551666259766 2023-01-22 11:22:06.684902: step: 582/464, loss: 3.9377760887145996 2023-01-22 11:22:07.393963: step: 584/464, loss: 2.0166115760803223 2023-01-22 11:22:08.119511: step: 586/464, loss: 6.889471054077148 2023-01-22 11:22:08.794993: step: 588/464, loss: 2.986854076385498 2023-01-22 11:22:09.582633: step: 590/464, loss: 9.561071395874023 2023-01-22 11:22:10.372121: step: 592/464, loss: 1.638632893562317 2023-01-22 11:22:11.171489: step: 594/464, loss: 2.9973866939544678 2023-01-22 11:22:11.962739: step: 596/464, loss: 1.0983021259307861 2023-01-22 11:22:12.740079: step: 598/464, loss: 1.7044750452041626 2023-01-22 11:22:13.541067: step: 600/464, loss: 2.925450325012207 2023-01-22 11:22:14.279024: step: 602/464, loss: 8.897144317626953 2023-01-22 11:22:15.027259: step: 604/464, loss: 1.5220918655395508 2023-01-22 11:22:15.767571: step: 606/464, loss: 2.9862239360809326 2023-01-22 11:22:16.504912: step: 608/464, loss: 5.759862899780273 2023-01-22 11:22:17.367504: step: 610/464, loss: 1.089566946029663 2023-01-22 11:22:18.156446: step: 612/464, loss: 3.5152370929718018 2023-01-22 11:22:18.898898: step: 614/464, loss: 1.3656071424484253 2023-01-22 11:22:19.700436: step: 616/464, loss: 6.461595058441162 2023-01-22 11:22:20.471845: step: 618/464, loss: 0.7708498239517212 2023-01-22 11:22:21.209327: step: 620/464, loss: 2.1966259479522705 2023-01-22 11:22:21.967291: step: 622/464, loss: 1.6778101921081543 2023-01-22 11:22:22.706976: step: 624/464, loss: 1.459666132926941 2023-01-22 11:22:23.380056: step: 626/464, loss: 0.9618430733680725 2023-01-22 11:22:24.131115: step: 628/464, loss: 2.3581111431121826 2023-01-22 11:22:24.869371: step: 630/464, loss: 
2.8875811100006104 2023-01-22 11:22:25.673275: step: 632/464, loss: 2.020002603530884 2023-01-22 11:22:26.456795: step: 634/464, loss: 1.6766326427459717 2023-01-22 11:22:27.144808: step: 636/464, loss: 2.358569383621216 2023-01-22 11:22:27.858389: step: 638/464, loss: 1.4719403982162476 2023-01-22 11:22:28.524254: step: 640/464, loss: 3.1487109661102295 2023-01-22 11:22:29.231615: step: 642/464, loss: 1.3823643922805786 2023-01-22 11:22:30.124195: step: 644/464, loss: 6.858518600463867 2023-01-22 11:22:30.863761: step: 646/464, loss: 2.9186878204345703 2023-01-22 11:22:31.618270: step: 648/464, loss: 0.6788272857666016 2023-01-22 11:22:32.336656: step: 650/464, loss: 0.8581159114837646 2023-01-22 11:22:33.252963: step: 652/464, loss: 3.3742990493774414 2023-01-22 11:22:34.021575: step: 654/464, loss: 0.6534217596054077 2023-01-22 11:22:34.769565: step: 656/464, loss: 0.4227631688117981 2023-01-22 11:22:35.476077: step: 658/464, loss: 0.6241885423660278 2023-01-22 11:22:36.261246: step: 660/464, loss: 2.2499334812164307 2023-01-22 11:22:36.923454: step: 662/464, loss: 2.9914143085479736 2023-01-22 11:22:37.674185: step: 664/464, loss: 5.7285356521606445 2023-01-22 11:22:38.469800: step: 666/464, loss: 4.63377046585083 2023-01-22 11:22:39.174176: step: 668/464, loss: 0.9508398175239563 2023-01-22 11:22:40.004723: step: 670/464, loss: 1.86622154712677 2023-01-22 11:22:40.791886: step: 672/464, loss: 18.502479553222656 2023-01-22 11:22:41.542927: step: 674/464, loss: 2.061119556427002 2023-01-22 11:22:42.222918: step: 676/464, loss: 1.0714634656906128 2023-01-22 11:22:42.934167: step: 678/464, loss: 2.290773868560791 2023-01-22 11:22:43.660740: step: 680/464, loss: 1.4382787942886353 2023-01-22 11:22:44.348249: step: 682/464, loss: 1.2964938879013062 2023-01-22 11:22:45.145522: step: 684/464, loss: 1.352104902267456 2023-01-22 11:22:46.037690: step: 686/464, loss: 2.0508487224578857 2023-01-22 11:22:46.866121: step: 688/464, loss: 4.085446357727051 2023-01-22 
11:22:47.754297: step: 690/464, loss: 2.051877737045288 2023-01-22 11:22:48.491399: step: 692/464, loss: 1.0510663986206055 2023-01-22 11:22:49.265365: step: 694/464, loss: 1.0191794633865356 2023-01-22 11:22:50.032695: step: 696/464, loss: 2.4271769523620605 2023-01-22 11:22:50.776935: step: 698/464, loss: 5.522613048553467 2023-01-22 11:22:51.480433: step: 700/464, loss: 2.2515859603881836 2023-01-22 11:22:52.294585: step: 702/464, loss: 5.858063220977783 2023-01-22 11:22:53.051962: step: 704/464, loss: 1.3351037502288818 2023-01-22 11:22:53.811300: step: 706/464, loss: 3.453185558319092 2023-01-22 11:22:54.593996: step: 708/464, loss: 0.923098087310791 2023-01-22 11:22:55.358920: step: 710/464, loss: 1.162076473236084 2023-01-22 11:22:56.166589: step: 712/464, loss: 6.112480640411377 2023-01-22 11:22:56.897470: step: 714/464, loss: 0.42647066712379456 2023-01-22 11:22:57.692756: step: 716/464, loss: 1.2001023292541504 2023-01-22 11:22:58.422194: step: 718/464, loss: 2.029315233230591 2023-01-22 11:22:59.227972: step: 720/464, loss: 1.3218841552734375 2023-01-22 11:22:59.973660: step: 722/464, loss: 4.933912754058838 2023-01-22 11:23:00.726794: step: 724/464, loss: 3.8572864532470703 2023-01-22 11:23:01.427679: step: 726/464, loss: 3.1059410572052 2023-01-22 11:23:02.197904: step: 728/464, loss: 2.590851306915283 2023-01-22 11:23:02.965849: step: 730/464, loss: 1.0027847290039062 2023-01-22 11:23:03.666190: step: 732/464, loss: 1.3879283666610718 2023-01-22 11:23:04.411279: step: 734/464, loss: 0.9771751761436462 2023-01-22 11:23:05.220614: step: 736/464, loss: 1.621541142463684 2023-01-22 11:23:05.971535: step: 738/464, loss: 1.1764367818832397 2023-01-22 11:23:06.710499: step: 740/464, loss: 1.2249219417572021 2023-01-22 11:23:07.457887: step: 742/464, loss: 1.8417370319366455 2023-01-22 11:23:08.128544: step: 744/464, loss: 6.823138236999512 2023-01-22 11:23:08.895866: step: 746/464, loss: 1.8373222351074219 2023-01-22 11:23:09.551366: step: 748/464, loss: 
5.270013809204102 2023-01-22 11:23:10.314492: step: 750/464, loss: 4.491097450256348 2023-01-22 11:23:11.073787: step: 752/464, loss: 0.7358742356300354 2023-01-22 11:23:11.921305: step: 754/464, loss: 3.433688163757324 2023-01-22 11:23:12.675236: step: 756/464, loss: 2.0288889408111572 2023-01-22 11:23:13.433003: step: 758/464, loss: 3.583008050918579 2023-01-22 11:23:14.183348: step: 760/464, loss: 1.6618074178695679 2023-01-22 11:23:14.916618: step: 762/464, loss: 14.27196216583252 2023-01-22 11:23:15.699923: step: 764/464, loss: 2.677417039871216 2023-01-22 11:23:16.423114: step: 766/464, loss: 4.172243118286133 2023-01-22 11:23:17.233047: step: 768/464, loss: 1.7907335758209229 2023-01-22 11:23:17.914404: step: 770/464, loss: 5.166937828063965 2023-01-22 11:23:18.650718: step: 772/464, loss: 4.351705551147461 2023-01-22 11:23:19.397768: step: 774/464, loss: 3.2508463859558105 2023-01-22 11:23:20.138294: step: 776/464, loss: 1.9359157085418701 2023-01-22 11:23:20.874405: step: 778/464, loss: 3.473889112472534 2023-01-22 11:23:21.620674: step: 780/464, loss: 2.1327149868011475 2023-01-22 11:23:22.492516: step: 782/464, loss: 2.69435715675354 2023-01-22 11:23:23.219754: step: 784/464, loss: 1.1987321376800537 2023-01-22 11:23:24.020311: step: 786/464, loss: 1.4761255979537964 2023-01-22 11:23:24.801717: step: 788/464, loss: 7.58142614364624 2023-01-22 11:23:25.495763: step: 790/464, loss: 1.345672845840454 2023-01-22 11:23:26.182528: step: 792/464, loss: 1.5891671180725098 2023-01-22 11:23:26.892057: step: 794/464, loss: 1.8026704788208008 2023-01-22 11:23:27.659855: step: 796/464, loss: 1.2802784442901611 2023-01-22 11:23:28.397655: step: 798/464, loss: 0.4238661825656891 2023-01-22 11:23:29.219254: step: 800/464, loss: 1.4901365041732788 2023-01-22 11:23:29.991671: step: 802/464, loss: 1.8327875137329102 2023-01-22 11:23:30.840120: step: 804/464, loss: 4.032846927642822 2023-01-22 11:23:31.583479: step: 806/464, loss: 1.8030288219451904 2023-01-22 
11:23:32.278879: step: 808/464, loss: 6.166379928588867 2023-01-22 11:23:33.067057: step: 810/464, loss: 2.339250326156616 2023-01-22 11:23:33.809112: step: 812/464, loss: 0.9494217038154602 2023-01-22 11:23:34.576837: step: 814/464, loss: 1.4521145820617676 2023-01-22 11:23:35.292591: step: 816/464, loss: 2.0722744464874268 2023-01-22 11:23:35.994879: step: 818/464, loss: 9.778717994689941 2023-01-22 11:23:36.815857: step: 820/464, loss: 5.905690670013428 2023-01-22 11:23:37.600994: step: 822/464, loss: 3.5487167835235596 2023-01-22 11:23:38.449154: step: 824/464, loss: 1.1876356601715088 2023-01-22 11:23:39.168877: step: 826/464, loss: 4.4284491539001465 2023-01-22 11:23:39.882326: step: 828/464, loss: 4.3333821296691895 2023-01-22 11:23:40.593050: step: 830/464, loss: 8.989230155944824 2023-01-22 11:23:41.305185: step: 832/464, loss: 3.92063045501709 2023-01-22 11:23:42.197954: step: 834/464, loss: 2.353074312210083 2023-01-22 11:23:42.993990: step: 836/464, loss: 1.9473124742507935 2023-01-22 11:23:43.772799: step: 838/464, loss: 1.1626951694488525 2023-01-22 11:23:44.483279: step: 840/464, loss: 2.613588809967041 2023-01-22 11:23:45.243771: step: 842/464, loss: 1.9653387069702148 2023-01-22 11:23:45.953847: step: 844/464, loss: 0.7982001304626465 2023-01-22 11:23:46.678916: step: 846/464, loss: 2.37483286857605 2023-01-22 11:23:47.456131: step: 848/464, loss: 0.9634339809417725 2023-01-22 11:23:48.147175: step: 850/464, loss: 0.6170372366905212 2023-01-22 11:23:48.854639: step: 852/464, loss: 0.7587774991989136 2023-01-22 11:23:49.551471: step: 854/464, loss: 3.129608392715454 2023-01-22 11:23:50.351399: step: 856/464, loss: 0.570770263671875 2023-01-22 11:23:51.118693: step: 858/464, loss: 2.8230128288269043 2023-01-22 11:23:51.805088: step: 860/464, loss: 0.7748469114303589 2023-01-22 11:23:52.540355: step: 862/464, loss: 1.5421247482299805 2023-01-22 11:23:53.325129: step: 864/464, loss: 1.4275951385498047 2023-01-22 11:23:54.095812: step: 866/464, loss: 
1.0909823179244995 2023-01-22 11:23:54.820769: step: 868/464, loss: 7.108874797821045 2023-01-22 11:23:55.536561: step: 870/464, loss: 1.9868006706237793 2023-01-22 11:23:56.281273: step: 872/464, loss: 11.043020248413086 2023-01-22 11:23:56.957195: step: 874/464, loss: 0.9537006616592407 2023-01-22 11:23:57.759293: step: 876/464, loss: 5.960934638977051 2023-01-22 11:23:58.525354: step: 878/464, loss: 2.2544214725494385 2023-01-22 11:23:59.272850: step: 880/464, loss: 0.5682533979415894 2023-01-22 11:23:59.977531: step: 882/464, loss: 0.8755449056625366 2023-01-22 11:24:00.775281: step: 884/464, loss: 1.6833244562149048 2023-01-22 11:24:01.469804: step: 886/464, loss: 2.0925192832946777 2023-01-22 11:24:02.196892: step: 888/464, loss: 5.369467735290527 2023-01-22 11:24:02.898384: step: 890/464, loss: 5.678498268127441 2023-01-22 11:24:03.679713: step: 892/464, loss: 2.654069423675537 2023-01-22 11:24:04.454559: step: 894/464, loss: 3.5575973987579346 2023-01-22 11:24:05.162785: step: 896/464, loss: 0.8754044771194458 2023-01-22 11:24:05.881224: step: 898/464, loss: 1.8832297325134277 2023-01-22 11:24:06.626940: step: 900/464, loss: 1.5971415042877197 2023-01-22 11:24:07.426364: step: 902/464, loss: 4.067296028137207 2023-01-22 11:24:08.108214: step: 904/464, loss: 1.232337236404419 2023-01-22 11:24:08.922636: step: 906/464, loss: 1.807895302772522 2023-01-22 11:24:09.698640: step: 908/464, loss: 1.2402400970458984 2023-01-22 11:24:10.420630: step: 910/464, loss: 7.4919891357421875 2023-01-22 11:24:11.188317: step: 912/464, loss: 0.6001932621002197 2023-01-22 11:24:11.952615: step: 914/464, loss: 2.4097423553466797 2023-01-22 11:24:12.751332: step: 916/464, loss: 2.4119486808776855 2023-01-22 11:24:13.520408: step: 918/464, loss: 5.363138198852539 2023-01-22 11:24:14.328625: step: 920/464, loss: 1.7590237855911255 2023-01-22 11:24:15.064410: step: 922/464, loss: 1.0465037822723389 2023-01-22 11:24:15.816703: step: 924/464, loss: 0.8016859889030457 2023-01-22 
11:24:16.554046: step: 926/464, loss: 5.46735954284668 2023-01-22 11:24:17.285337: step: 928/464, loss: 1.4254200458526611 2023-01-22 11:24:17.951540: step: 930/464, loss: 1.331081509590149
==================================================
Loss: 5.576
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5103754940711462, 'r': 0.044380477745317067, 'f1': 0.08166007905138341}, 'combined': 0.060170584564177246, 'epoch': 0}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3605498721227622, 'r': 0.06510419673266755, 'f1': 0.11029289521294804}, 'combined': 0.06849769281646247, 'epoch': 0}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4726443768996961, 'r': 0.041992978665946534, 'f1': 0.07713293650793651}, 'combined': 0.056834795321637425, 'epoch': 0}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3658153701968135, 'r': 0.06964492818271033, 'f1': 0.11701266581728247}, 'combined': 0.07267102403389122, 'epoch': 0}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5066445182724252, 'r': 0.04118282473669997, 'f1': 0.07617382617382616}, 'combined': 0.05612808244387191, 'epoch': 0}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.35325, 'r': 0.06933267909715408, 'f1': 0.1159146841673503}, 'combined': 0.07198911964077545, 'epoch': 0}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.02857142857142857, 'f1': 0.05263157894736842}, 'combined': 0.03508771929824561, 'epoch': 0}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5103754940711462, 'r': 0.044380477745317067, 'f1': 0.08166007905138341}, 'combined': 0.060170584564177246, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3605498721227622, 'r': 0.06510419673266755, 'f1': 0.11029289521294804}, 'combined': 0.06849769281646247, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.02857142857142857, 'f1': 0.05263157894736842}, 'combined': 0.03508771929824561, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4726443768996961, 'r': 0.041992978665946534, 'f1': 0.07713293650793651}, 'combined': 0.056834795321637425, 'epoch': 0}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3658153701968135, 'r': 0.06964492818271033, 'f1': 0.11701266581728247}, 'combined': 0.07267102403389122, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5066445182724252, 'r': 0.04118282473669997, 'f1': 0.07617382617382616}, 'combined': 0.05612808244387191, 'epoch': 0}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.35325, 'r': 0.06933267909715408, 'f1': 0.1159146841673503}, 'combined': 0.07198911964077545, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0}
******************************
Epoch: 1
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 11:27:22.568733: step: 2/464, loss: 3.567324638366699 2023-01-22 11:27:23.343367: step: 4/464, loss: 2.116065263748169 2023-01-22 11:27:24.026288: step: 6/464, loss: 1.166826844215393 2023-01-22 11:27:24.824771: step: 8/464, loss: 4.697350025177002 2023-01-22 11:27:25.555857: step: 10/464, loss: 5.934710502624512 2023-01-22 11:27:26.327449: step: 12/464, loss: 1.6735713481903076 2023-01-22 11:27:27.050288: step: 14/464, loss: 1.6156316995620728 2023-01-22 11:27:27.771599: step: 16/464, loss: 2.347018003463745 2023-01-22 11:27:28.486319: step: 18/464, loss: 1.8784767389297485 2023-01-22 11:27:29.291128: step: 20/464, loss: 1.6167913675308228 2023-01-22 11:27:30.024407: step: 22/464, loss: 1.060924768447876 2023-01-22 11:27:30.782598: step: 24/464, loss: 2.0685951709747314 2023-01-22 11:27:31.536070: step: 26/464, loss: 1.5308924913406372 2023-01-22 11:27:32.260678: step: 28/464, loss: 2.8823797702789307 2023-01-22 11:27:32.942999: step: 30/464, loss: 3.9805989265441895 2023-01-22 11:27:33.664006: step: 32/464, loss: 4.019295692443848 2023-01-22 11:27:34.400678: step: 34/464, loss: 0.39726945757865906 2023-01-22 11:27:35.155382: step: 36/464, loss: 1.1988452672958374 2023-01-22 11:27:35.943461: step: 38/464, loss: 2.4698407649993896 2023-01-22 11:27:36.721825: step: 40/464, loss: 0.7756727933883667 2023-01-22 11:27:37.548893: step: 42/464, loss: 0.9783002734184265 2023-01-22 11:27:38.303030: step: 44/464, loss: 0.5537374019622803 2023-01-22
11:27:39.032568: step: 46/464, loss: 0.5232036709785461 2023-01-22 11:27:39.766758: step: 48/464, loss: 2.6981263160705566 2023-01-22 11:27:40.504453: step: 50/464, loss: 1.9383834600448608 2023-01-22 11:27:41.293037: step: 52/464, loss: 2.12388277053833 2023-01-22 11:27:42.086714: step: 54/464, loss: 2.682612895965576 2023-01-22 11:27:42.845367: step: 56/464, loss: 2.9795823097229004 2023-01-22 11:27:43.635298: step: 58/464, loss: 15.888510704040527 2023-01-22 11:27:44.360741: step: 60/464, loss: 0.7368502020835876 2023-01-22 11:27:45.106711: step: 62/464, loss: 2.890212297439575 2023-01-22 11:27:45.761203: step: 64/464, loss: 2.0080223083496094 2023-01-22 11:27:46.602808: step: 66/464, loss: 0.7404903769493103 2023-01-22 11:27:47.387058: step: 68/464, loss: 0.9457007050514221 2023-01-22 11:27:48.143281: step: 70/464, loss: 0.7126715183258057 2023-01-22 11:27:48.966585: step: 72/464, loss: 0.5138718485832214 2023-01-22 11:27:49.707999: step: 74/464, loss: 2.5417838096618652 2023-01-22 11:27:50.380262: step: 76/464, loss: 1.963165283203125 2023-01-22 11:27:51.220421: step: 78/464, loss: 3.571277618408203 2023-01-22 11:27:51.960587: step: 80/464, loss: 1.5786569118499756 2023-01-22 11:27:52.728860: step: 82/464, loss: 0.7398642301559448 2023-01-22 11:27:53.480405: step: 84/464, loss: 2.779233694076538 2023-01-22 11:27:54.282919: step: 86/464, loss: 1.2509294748306274 2023-01-22 11:27:54.980643: step: 88/464, loss: 1.2668819427490234 2023-01-22 11:27:55.732828: step: 90/464, loss: 1.0955171585083008 2023-01-22 11:27:56.463826: step: 92/464, loss: 2.2722392082214355 2023-01-22 11:27:57.220026: step: 94/464, loss: 5.017848014831543 2023-01-22 11:27:57.945135: step: 96/464, loss: 1.3737874031066895 2023-01-22 11:27:58.728142: step: 98/464, loss: 1.84757661819458 2023-01-22 11:27:59.456573: step: 100/464, loss: 0.489383727312088 2023-01-22 11:28:00.228401: step: 102/464, loss: 1.7194496393203735 2023-01-22 11:28:00.956792: step: 104/464, loss: 2.4874281883239746 
2023-01-22 11:28:01.689690: step: 106/464, loss: 4.907480239868164 2023-01-22 11:28:02.451708: step: 108/464, loss: 0.7825038433074951 2023-01-22 11:28:03.222057: step: 110/464, loss: 1.3089637756347656 2023-01-22 11:28:03.897106: step: 112/464, loss: 0.7727731466293335 2023-01-22 11:28:04.620218: step: 114/464, loss: 4.281703948974609 2023-01-22 11:28:05.405567: step: 116/464, loss: 0.6136764883995056 2023-01-22 11:28:06.161987: step: 118/464, loss: 1.5267757177352905 2023-01-22 11:28:06.819893: step: 120/464, loss: 0.6549609899520874 2023-01-22 11:28:07.511945: step: 122/464, loss: 0.6695728302001953 2023-01-22 11:28:08.207939: step: 124/464, loss: 1.9565430879592896 2023-01-22 11:28:08.991281: step: 126/464, loss: 0.830868124961853 2023-01-22 11:28:09.701316: step: 128/464, loss: 2.3312859535217285 2023-01-22 11:28:10.437829: step: 130/464, loss: 1.7556043863296509 2023-01-22 11:28:11.162124: step: 132/464, loss: 0.5519129037857056 2023-01-22 11:28:11.937439: step: 134/464, loss: 1.4241178035736084 2023-01-22 11:28:12.642253: step: 136/464, loss: 4.3382134437561035 2023-01-22 11:28:13.538511: step: 138/464, loss: 0.9232587814331055 2023-01-22 11:28:14.356423: step: 140/464, loss: 4.314373970031738 2023-01-22 11:28:15.106850: step: 142/464, loss: 1.8007235527038574 2023-01-22 11:28:15.785049: step: 144/464, loss: 1.3967361450195312 2023-01-22 11:28:16.594164: step: 146/464, loss: 2.1668901443481445 2023-01-22 11:28:17.410982: step: 148/464, loss: 2.6193809509277344 2023-01-22 11:28:18.156746: step: 150/464, loss: 0.4625154137611389 2023-01-22 11:28:18.990904: step: 152/464, loss: 1.2553250789642334 2023-01-22 11:28:19.794095: step: 154/464, loss: 2.662593364715576 2023-01-22 11:28:20.556159: step: 156/464, loss: 1.529561161994934 2023-01-22 11:28:21.419766: step: 158/464, loss: 5.751806259155273 2023-01-22 11:28:22.186503: step: 160/464, loss: 3.984288215637207 2023-01-22 11:28:22.929739: step: 162/464, loss: 1.781134843826294 2023-01-22 11:28:23.632059: step: 
164/464, loss: 8.015787124633789 2023-01-22 11:28:24.349864: step: 166/464, loss: 2.554163932800293 2023-01-22 11:28:25.079531: step: 168/464, loss: 4.118683815002441 2023-01-22 11:28:25.804939: step: 170/464, loss: 2.1517744064331055 2023-01-22 11:28:26.636131: step: 172/464, loss: 1.5270841121673584 2023-01-22 11:28:27.343016: step: 174/464, loss: 1.0206589698791504 2023-01-22 11:28:28.107716: step: 176/464, loss: 0.5664258599281311 2023-01-22 11:28:28.789275: step: 178/464, loss: 0.5027251243591309 2023-01-22 11:28:29.532199: step: 180/464, loss: 1.4795325994491577 2023-01-22 11:28:30.197019: step: 182/464, loss: 0.8331640362739563 2023-01-22 11:28:30.920220: step: 184/464, loss: 1.1108388900756836 2023-01-22 11:28:31.719104: step: 186/464, loss: 1.221932291984558 2023-01-22 11:28:32.467108: step: 188/464, loss: 1.2539130449295044 2023-01-22 11:28:33.247176: step: 190/464, loss: 1.0047433376312256 2023-01-22 11:28:34.072044: step: 192/464, loss: 1.9757084846496582 2023-01-22 11:28:34.870441: step: 194/464, loss: 4.2894744873046875 2023-01-22 11:28:35.607112: step: 196/464, loss: 1.083547830581665 2023-01-22 11:28:36.388421: step: 198/464, loss: 1.2944920063018799 2023-01-22 11:28:37.099476: step: 200/464, loss: 0.8665717840194702 2023-01-22 11:28:37.880419: step: 202/464, loss: 2.007253646850586 2023-01-22 11:28:38.658374: step: 204/464, loss: 1.715686559677124 2023-01-22 11:28:39.584265: step: 206/464, loss: 0.9834060072898865 2023-01-22 11:28:40.254062: step: 208/464, loss: 1.235771894454956 2023-01-22 11:28:41.001907: step: 210/464, loss: 2.0683391094207764 2023-01-22 11:28:41.678508: step: 212/464, loss: 0.8923385143280029 2023-01-22 11:28:42.526336: step: 214/464, loss: 4.624021053314209 2023-01-22 11:28:43.249936: step: 216/464, loss: 0.5816634297370911 2023-01-22 11:28:43.932828: step: 218/464, loss: 2.2024126052856445 2023-01-22 11:28:44.651910: step: 220/464, loss: 0.6494408845901489 2023-01-22 11:28:45.285519: step: 222/464, loss: 0.8478401899337769 
2023-01-22 11:28:46.019722: step: 224/464, loss: 2.6478207111358643 2023-01-22 11:28:46.800485: step: 226/464, loss: 0.974275529384613 2023-01-22 11:28:47.575815: step: 228/464, loss: 0.543366551399231 2023-01-22 11:28:48.390401: step: 230/464, loss: 3.2722392082214355 2023-01-22 11:28:49.096617: step: 232/464, loss: 13.146383285522461 2023-01-22 11:28:49.806343: step: 234/464, loss: 1.7761332988739014 2023-01-22 11:28:50.593754: step: 236/464, loss: 4.8828349113464355 2023-01-22 11:28:51.302975: step: 238/464, loss: 6.12351131439209 2023-01-22 11:28:51.964700: step: 240/464, loss: 2.6610379219055176 2023-01-22 11:28:52.699190: step: 242/464, loss: 2.2539026737213135 2023-01-22 11:28:53.424595: step: 244/464, loss: 1.0267163515090942 2023-01-22 11:28:54.226573: step: 246/464, loss: 2.6492509841918945 2023-01-22 11:28:55.047923: step: 248/464, loss: 4.662771701812744 2023-01-22 11:28:55.804357: step: 250/464, loss: 2.4732513427734375 2023-01-22 11:28:56.603454: step: 252/464, loss: 4.1236186027526855 2023-01-22 11:28:57.388837: step: 254/464, loss: 2.082460880279541 2023-01-22 11:28:58.108259: step: 256/464, loss: 1.0133386850357056 2023-01-22 11:28:58.774745: step: 258/464, loss: 0.5736017823219299 2023-01-22 11:28:59.653026: step: 260/464, loss: 1.344170331954956 2023-01-22 11:29:00.402300: step: 262/464, loss: 0.6307258605957031 2023-01-22 11:29:01.216311: step: 264/464, loss: 0.8127894997596741 2023-01-22 11:29:01.955561: step: 266/464, loss: 3.181589126586914 2023-01-22 11:29:02.661293: step: 268/464, loss: 5.5859174728393555 2023-01-22 11:29:03.389568: step: 270/464, loss: 2.046428918838501 2023-01-22 11:29:04.234311: step: 272/464, loss: 5.400157451629639 2023-01-22 11:29:05.012480: step: 274/464, loss: 0.8204623460769653 2023-01-22 11:29:05.716700: step: 276/464, loss: 4.42466402053833 2023-01-22 11:29:06.491551: step: 278/464, loss: 2.860304117202759 2023-01-22 11:29:07.263219: step: 280/464, loss: 1.755745530128479 2023-01-22 11:29:07.979171: step: 
282/464, loss: 0.9788022041320801 2023-01-22 11:29:08.650079: step: 284/464, loss: 0.9861152768135071 2023-01-22 11:29:09.459160: step: 286/464, loss: 1.0290707349777222 2023-01-22 11:29:10.186442: step: 288/464, loss: 2.2862699031829834 2023-01-22 11:29:10.905540: step: 290/464, loss: 2.5819599628448486 2023-01-22 11:29:11.689517: step: 292/464, loss: 0.915147602558136 2023-01-22 11:29:12.442108: step: 294/464, loss: 1.1519476175308228 2023-01-22 11:29:13.207054: step: 296/464, loss: 0.3368190824985504 2023-01-22 11:29:13.916474: step: 298/464, loss: 0.5094312429428101 2023-01-22 11:29:14.659441: step: 300/464, loss: 1.7346651554107666 2023-01-22 11:29:15.360384: step: 302/464, loss: 1.1590203046798706 2023-01-22 11:29:16.139292: step: 304/464, loss: 1.1996859312057495 2023-01-22 11:29:16.898509: step: 306/464, loss: 2.341116189956665 2023-01-22 11:29:17.633669: step: 308/464, loss: 1.8273674249649048 2023-01-22 11:29:18.498218: step: 310/464, loss: 0.4473875164985657 2023-01-22 11:29:19.303803: step: 312/464, loss: 3.9436159133911133 2023-01-22 11:29:20.073753: step: 314/464, loss: 3.0034258365631104 2023-01-22 11:29:20.784207: step: 316/464, loss: 2.238938331604004 2023-01-22 11:29:21.554486: step: 318/464, loss: 2.2146379947662354 2023-01-22 11:29:22.328708: step: 320/464, loss: 0.7863084673881531 2023-01-22 11:29:23.087291: step: 322/464, loss: 0.6234296560287476 2023-01-22 11:29:23.882052: step: 324/464, loss: 0.4865874648094177 2023-01-22 11:29:24.678654: step: 326/464, loss: 0.5788937211036682 2023-01-22 11:29:25.466939: step: 328/464, loss: 2.890346050262451 2023-01-22 11:29:26.263002: step: 330/464, loss: 7.377608299255371 2023-01-22 11:29:26.955989: step: 332/464, loss: 2.9715189933776855 2023-01-22 11:29:27.706955: step: 334/464, loss: 0.6012032628059387 2023-01-22 11:29:28.456796: step: 336/464, loss: 1.0220718383789062 2023-01-22 11:29:29.181923: step: 338/464, loss: 1.4404304027557373 2023-01-22 11:29:29.953855: step: 340/464, loss: 
0.9285256862640381 2023-01-22 11:29:30.643050: step: 342/464, loss: 0.7093393802642822 2023-01-22 11:29:31.456412: step: 344/464, loss: 0.5021221041679382 2023-01-22 11:29:32.308801: step: 346/464, loss: 1.502563714981079 2023-01-22 11:29:33.028212: step: 348/464, loss: 0.9983633756637573 2023-01-22 11:29:33.868672: step: 350/464, loss: 0.690024197101593 2023-01-22 11:29:34.633703: step: 352/464, loss: 0.8608592748641968 2023-01-22 11:29:35.410911: step: 354/464, loss: 2.9874119758605957 2023-01-22 11:29:36.173193: step: 356/464, loss: 1.582247257232666 2023-01-22 11:29:36.886292: step: 358/464, loss: 1.9923336505889893 2023-01-22 11:29:37.622321: step: 360/464, loss: 0.45969799160957336 2023-01-22 11:29:38.369767: step: 362/464, loss: 0.37600523233413696 2023-01-22 11:29:39.061825: step: 364/464, loss: 1.5279144048690796 2023-01-22 11:29:39.816590: step: 366/464, loss: 7.2152509689331055 2023-01-22 11:29:40.590344: step: 368/464, loss: 1.3439027070999146 2023-01-22 11:29:41.370386: step: 370/464, loss: 1.8933783769607544 2023-01-22 11:29:42.156435: step: 372/464, loss: 0.5813678503036499 2023-01-22 11:29:42.854773: step: 374/464, loss: 0.8357667922973633 2023-01-22 11:29:43.599739: step: 376/464, loss: 2.265641927719116 2023-01-22 11:29:44.403989: step: 378/464, loss: 0.6052720546722412 2023-01-22 11:29:45.101379: step: 380/464, loss: 3.890368938446045 2023-01-22 11:29:45.847677: step: 382/464, loss: 2.141329526901245 2023-01-22 11:29:46.604617: step: 384/464, loss: 0.5587427616119385 2023-01-22 11:29:47.298292: step: 386/464, loss: 2.0051937103271484 2023-01-22 11:29:48.012857: step: 388/464, loss: 1.6070865392684937 2023-01-22 11:29:48.881514: step: 390/464, loss: 2.655045747756958 2023-01-22 11:29:49.654736: step: 392/464, loss: 1.8066232204437256 2023-01-22 11:29:50.331783: step: 394/464, loss: 2.347830295562744 2023-01-22 11:29:51.046560: step: 396/464, loss: 6.169438362121582 2023-01-22 11:29:51.861665: step: 398/464, loss: 5.859373569488525 2023-01-22 
11:29:52.564388: step: 400/464, loss: 0.5093414783477783 2023-01-22 11:29:53.341741: step: 402/464, loss: 1.0215308666229248 2023-01-22 11:29:54.068309: step: 404/464, loss: 2.536679267883301 2023-01-22 11:29:54.812199: step: 406/464, loss: 0.9781687259674072 2023-01-22 11:29:55.541032: step: 408/464, loss: 1.8663051128387451 2023-01-22 11:29:56.293622: step: 410/464, loss: 1.7260236740112305 2023-01-22 11:29:57.021011: step: 412/464, loss: 0.708279013633728 2023-01-22 11:29:57.782255: step: 414/464, loss: 0.8949853181838989 2023-01-22 11:29:58.507203: step: 416/464, loss: 8.901383399963379 2023-01-22 11:29:59.291171: step: 418/464, loss: 4.989002227783203 2023-01-22 11:30:00.129852: step: 420/464, loss: 6.246890544891357 2023-01-22 11:30:00.864248: step: 422/464, loss: 1.0378128290176392 2023-01-22 11:30:01.685983: step: 424/464, loss: 1.038848876953125 2023-01-22 11:30:02.398423: step: 426/464, loss: 1.004784345626831 2023-01-22 11:30:03.136154: step: 428/464, loss: 0.6553089022636414 2023-01-22 11:30:03.879747: step: 430/464, loss: 2.451887845993042 2023-01-22 11:30:04.634351: step: 432/464, loss: 0.3229205012321472 2023-01-22 11:30:05.334347: step: 434/464, loss: 0.18707747757434845 2023-01-22 11:30:06.044965: step: 436/464, loss: 2.7739617824554443 2023-01-22 11:30:06.794007: step: 438/464, loss: 5.1485137939453125 2023-01-22 11:30:07.638297: step: 440/464, loss: 1.6739176511764526 2023-01-22 11:30:08.307209: step: 442/464, loss: 0.21984398365020752 2023-01-22 11:30:09.103395: step: 444/464, loss: 4.9870991706848145 2023-01-22 11:30:09.749910: step: 446/464, loss: 2.1164140701293945 2023-01-22 11:30:10.478157: step: 448/464, loss: 3.631910800933838 2023-01-22 11:30:11.275604: step: 450/464, loss: 0.6704921722412109 2023-01-22 11:30:12.101664: step: 452/464, loss: 0.7273821234703064 2023-01-22 11:30:12.836811: step: 454/464, loss: 1.1897422075271606 2023-01-22 11:30:13.564072: step: 456/464, loss: 0.177559033036232 2023-01-22 11:30:14.276279: step: 458/464, 
loss: 3.297564744949341 2023-01-22 11:30:15.003609: step: 460/464, loss: 1.4739949703216553 2023-01-22 11:30:15.801303: step: 462/464, loss: 1.0736751556396484 2023-01-22 11:30:16.516948: step: 464/464, loss: 0.20888689160346985 2023-01-22 11:30:17.223503: step: 466/464, loss: 2.5224709510803223 2023-01-22 11:30:18.035425: step: 468/464, loss: 4.367872714996338 2023-01-22 11:30:18.786554: step: 470/464, loss: 0.2336680293083191 2023-01-22 11:30:19.539485: step: 472/464, loss: 0.4050363302230835 2023-01-22 11:30:20.308834: step: 474/464, loss: 0.3901623487472534 2023-01-22 11:30:21.030496: step: 476/464, loss: 0.5768817067146301 2023-01-22 11:30:21.775963: step: 478/464, loss: 1.0388317108154297 2023-01-22 11:30:22.609682: step: 480/464, loss: 2.2464306354522705 2023-01-22 11:30:23.466048: step: 482/464, loss: 10.363744735717773 2023-01-22 11:30:24.224297: step: 484/464, loss: 0.3566378951072693 2023-01-22 11:30:24.945748: step: 486/464, loss: 0.7015933990478516 2023-01-22 11:30:25.657577: step: 488/464, loss: 1.5301611423492432 2023-01-22 11:30:26.494516: step: 490/464, loss: 1.2292773723602295 2023-01-22 11:30:27.308442: step: 492/464, loss: 0.82595294713974 2023-01-22 11:30:27.990874: step: 494/464, loss: 0.7543503046035767 2023-01-22 11:30:28.808140: step: 496/464, loss: 0.8798842430114746 2023-01-22 11:30:29.503029: step: 498/464, loss: 1.4138880968093872 2023-01-22 11:30:30.193701: step: 500/464, loss: 0.45087796449661255 2023-01-22 11:30:30.924394: step: 502/464, loss: 1.0304077863693237 2023-01-22 11:30:31.777458: step: 504/464, loss: 3.080888032913208 2023-01-22 11:30:32.538085: step: 506/464, loss: 3.425476551055908 2023-01-22 11:30:33.253269: step: 508/464, loss: 0.5018462538719177 2023-01-22 11:30:33.925903: step: 510/464, loss: 0.4485335350036621 2023-01-22 11:30:34.669696: step: 512/464, loss: 2.6635360717773438 2023-01-22 11:30:35.431572: step: 514/464, loss: 3.398329257965088 2023-01-22 11:30:36.098677: step: 516/464, loss: 0.505752444267273 
2023-01-22 11:30:36.816972: step: 518/464, loss: 1.5625126361846924 2023-01-22 11:30:37.554285: step: 520/464, loss: 0.7278554439544678 2023-01-22 11:30:38.325130: step: 522/464, loss: 1.5637742280960083 2023-01-22 11:30:39.097270: step: 524/464, loss: 1.576052188873291 2023-01-22 11:30:39.903364: step: 526/464, loss: 1.3697073459625244 2023-01-22 11:30:40.610534: step: 528/464, loss: 0.9000505208969116 2023-01-22 11:30:41.295051: step: 530/464, loss: 3.3905420303344727 2023-01-22 11:30:42.091771: step: 532/464, loss: 0.975693941116333 2023-01-22 11:30:42.786861: step: 534/464, loss: 6.041010856628418 2023-01-22 11:30:43.553128: step: 536/464, loss: 0.629315197467804 2023-01-22 11:30:44.345715: step: 538/464, loss: 2.6904046535491943 2023-01-22 11:30:45.066578: step: 540/464, loss: 1.1637006998062134 2023-01-22 11:30:45.797053: step: 542/464, loss: 4.338569164276123 2023-01-22 11:30:46.493303: step: 544/464, loss: 10.183465957641602 2023-01-22 11:30:47.225939: step: 546/464, loss: 1.6647273302078247 2023-01-22 11:30:47.980107: step: 548/464, loss: 0.8222893476486206 2023-01-22 11:30:48.723257: step: 550/464, loss: 1.9054585695266724 2023-01-22 11:30:49.430189: step: 552/464, loss: 2.1345303058624268 2023-01-22 11:30:50.206000: step: 554/464, loss: 4.587452411651611 2023-01-22 11:30:50.974507: step: 556/464, loss: 0.9287194609642029 2023-01-22 11:30:51.615708: step: 558/464, loss: 0.9194431900978088 2023-01-22 11:30:52.498622: step: 560/464, loss: 1.7446281909942627 2023-01-22 11:30:53.184203: step: 562/464, loss: 0.6097180843353271 2023-01-22 11:30:53.980204: step: 564/464, loss: 2.0353572368621826 2023-01-22 11:30:54.803879: step: 566/464, loss: 8.625846862792969 2023-01-22 11:30:55.598614: step: 568/464, loss: 0.5222208499908447 2023-01-22 11:30:56.440302: step: 570/464, loss: 1.3264281749725342 2023-01-22 11:30:57.170463: step: 572/464, loss: 2.493659734725952 2023-01-22 11:30:58.050265: step: 574/464, loss: 0.6286075711250305 2023-01-22 11:30:58.748142: step: 
576/464, loss: 0.6821500062942505 2023-01-22 11:30:59.471075: step: 578/464, loss: 2.119264602661133 2023-01-22 11:31:00.222563: step: 580/464, loss: 3.8549129962921143 2023-01-22 11:31:00.960147: step: 582/464, loss: 1.0581588745117188 2023-01-22 11:31:01.784660: step: 584/464, loss: 0.9843960404396057 2023-01-22 11:31:02.563036: step: 586/464, loss: 0.9626711010932922 2023-01-22 11:31:03.293180: step: 588/464, loss: 1.1887109279632568 2023-01-22 11:31:04.035272: step: 590/464, loss: 1.310795545578003 2023-01-22 11:31:04.777199: step: 592/464, loss: 1.8952412605285645 2023-01-22 11:31:05.556138: step: 594/464, loss: 0.790389895439148 2023-01-22 11:31:06.301841: step: 596/464, loss: 1.7052712440490723 2023-01-22 11:31:07.045552: step: 598/464, loss: 0.49568837881088257 2023-01-22 11:31:07.798675: step: 600/464, loss: 0.4807586669921875 2023-01-22 11:31:08.597296: step: 602/464, loss: 2.6345295906066895 2023-01-22 11:31:09.296520: step: 604/464, loss: 0.7684303522109985 2023-01-22 11:31:10.045543: step: 606/464, loss: 0.34246817231178284 2023-01-22 11:31:10.838180: step: 608/464, loss: 1.8327733278274536 2023-01-22 11:31:11.625368: step: 610/464, loss: 0.8029569387435913 2023-01-22 11:31:12.402214: step: 612/464, loss: 2.389103412628174 2023-01-22 11:31:13.126666: step: 614/464, loss: 1.6085231304168701 2023-01-22 11:31:13.945770: step: 616/464, loss: 2.841780185699463 2023-01-22 11:31:14.751435: step: 618/464, loss: 0.411793053150177 2023-01-22 11:31:15.490901: step: 620/464, loss: 2.203213930130005 2023-01-22 11:31:16.258487: step: 622/464, loss: 7.631477355957031 2023-01-22 11:31:17.109250: step: 624/464, loss: 0.8750141263008118 2023-01-22 11:31:17.931150: step: 626/464, loss: 0.9731322526931763 2023-01-22 11:31:18.719212: step: 628/464, loss: 1.101901888847351 2023-01-22 11:31:19.451947: step: 630/464, loss: 0.8351380825042725 2023-01-22 11:31:20.195093: step: 632/464, loss: 1.7162944078445435 2023-01-22 11:31:21.009078: step: 634/464, loss: 4.447570323944092 
2023-01-22 11:31:21.709398: step: 636/464, loss: 1.5265611410140991 2023-01-22 11:31:22.512989: step: 638/464, loss: 3.4335930347442627 2023-01-22 11:31:23.191491: step: 640/464, loss: 0.40369030833244324 2023-01-22 11:31:23.928669: step: 642/464, loss: 1.4695643186569214 2023-01-22 11:31:24.654309: step: 644/464, loss: 0.5647727251052856 2023-01-22 11:31:25.414231: step: 646/464, loss: 1.4111278057098389 2023-01-22 11:31:26.161295: step: 648/464, loss: 3.0106451511383057 2023-01-22 11:31:26.929021: step: 650/464, loss: 1.3463108539581299 2023-01-22 11:31:27.632045: step: 652/464, loss: 1.3156379461288452 2023-01-22 11:31:28.335989: step: 654/464, loss: 2.6676833629608154 2023-01-22 11:31:29.099099: step: 656/464, loss: 1.5572566986083984 2023-01-22 11:31:29.956677: step: 658/464, loss: 0.9726899862289429 2023-01-22 11:31:30.702463: step: 660/464, loss: 4.307257652282715 2023-01-22 11:31:31.455342: step: 662/464, loss: 0.39950403571128845 2023-01-22 11:31:32.158341: step: 664/464, loss: 7.1081109046936035 2023-01-22 11:31:32.997428: step: 666/464, loss: 0.9712538719177246 2023-01-22 11:31:33.758922: step: 668/464, loss: 0.19049298763275146 2023-01-22 11:31:34.536271: step: 670/464, loss: 0.5407162308692932 2023-01-22 11:31:35.284182: step: 672/464, loss: 1.2553319931030273 2023-01-22 11:31:36.056439: step: 674/464, loss: 0.3443169891834259 2023-01-22 11:31:36.806617: step: 676/464, loss: 1.114622950553894 2023-01-22 11:31:37.550327: step: 678/464, loss: 1.3746274709701538 2023-01-22 11:31:38.327748: step: 680/464, loss: 5.34721040725708 2023-01-22 11:31:39.071448: step: 682/464, loss: 0.7788916230201721 2023-01-22 11:31:39.782791: step: 684/464, loss: 0.7353830337524414 2023-01-22 11:31:40.517142: step: 686/464, loss: 0.2988717257976532 2023-01-22 11:31:41.371350: step: 688/464, loss: 0.9029651284217834 2023-01-22 11:31:42.167304: step: 690/464, loss: 5.1300811767578125 2023-01-22 11:31:42.950603: step: 692/464, loss: 1.1903573274612427 2023-01-22 11:31:43.799790: 
step: 694/464, loss: 7.861872673034668 2023-01-22 11:31:44.591079: step: 696/464, loss: 0.6495417356491089 2023-01-22 11:31:45.361054: step: 698/464, loss: 1.1668031215667725 2023-01-22 11:31:46.097461: step: 700/464, loss: 0.5051676630973816 2023-01-22 11:31:46.837673: step: 702/464, loss: 3.5430235862731934 2023-01-22 11:31:47.616807: step: 704/464, loss: 0.5422537922859192 2023-01-22 11:31:48.404709: step: 706/464, loss: 0.7244465351104736 2023-01-22 11:31:49.234735: step: 708/464, loss: 2.217958450317383 2023-01-22 11:31:49.951119: step: 710/464, loss: 4.301735877990723 2023-01-22 11:31:50.615575: step: 712/464, loss: 7.51944637298584 2023-01-22 11:31:51.367603: step: 714/464, loss: 1.5007673501968384 2023-01-22 11:31:52.155872: step: 716/464, loss: 1.6526864767074585 2023-01-22 11:31:52.970974: step: 718/464, loss: 1.4878666400909424 2023-01-22 11:31:53.755277: step: 720/464, loss: 1.9696204662322998 2023-01-22 11:31:54.609036: step: 722/464, loss: 1.068194031715393 2023-01-22 11:31:55.414872: step: 724/464, loss: 1.606663465499878 2023-01-22 11:31:56.227004: step: 726/464, loss: 0.4832223951816559 2023-01-22 11:31:57.063986: step: 728/464, loss: 0.7201308608055115 2023-01-22 11:31:57.840461: step: 730/464, loss: 3.175027847290039 2023-01-22 11:31:58.640865: step: 732/464, loss: 4.3967437744140625 2023-01-22 11:31:59.434823: step: 734/464, loss: 2.772202491760254 2023-01-22 11:32:00.144234: step: 736/464, loss: 0.44274598360061646 2023-01-22 11:32:00.880497: step: 738/464, loss: 1.7568962574005127 2023-01-22 11:32:01.658953: step: 740/464, loss: 2.24100923538208 2023-01-22 11:32:02.397494: step: 742/464, loss: 0.6800580024719238 2023-01-22 11:32:03.156312: step: 744/464, loss: 0.685122013092041 2023-01-22 11:32:03.853762: step: 746/464, loss: 0.48600563406944275 2023-01-22 11:32:04.631178: step: 748/464, loss: 2.582446336746216 2023-01-22 11:32:05.368986: step: 750/464, loss: 0.6945471167564392 2023-01-22 11:32:06.083504: step: 752/464, loss: 
3.8455324172973633 2023-01-22 11:32:06.767228: step: 754/464, loss: 0.5839530229568481 2023-01-22 11:32:07.470422: step: 756/464, loss: 1.003867745399475 2023-01-22 11:32:08.276549: step: 758/464, loss: 1.6268566846847534 2023-01-22 11:32:09.006630: step: 760/464, loss: 0.4314429461956024 2023-01-22 11:32:09.825422: step: 762/464, loss: 1.3930310010910034 2023-01-22 11:32:10.538991: step: 764/464, loss: 1.0327585935592651 2023-01-22 11:32:11.354166: step: 766/464, loss: 1.0543372631072998 2023-01-22 11:32:12.138967: step: 768/464, loss: 1.979860544204712 2023-01-22 11:32:13.058465: step: 770/464, loss: 0.5045173168182373 2023-01-22 11:32:13.766237: step: 772/464, loss: 1.0678372383117676 2023-01-22 11:32:14.513285: step: 774/464, loss: 0.7546037435531616 2023-01-22 11:32:15.187776: step: 776/464, loss: 0.40795233845710754 2023-01-22 11:32:15.933598: step: 778/464, loss: 1.6976772546768188 2023-01-22 11:32:16.671301: step: 780/464, loss: 0.39988434314727783 2023-01-22 11:32:17.484923: step: 782/464, loss: 8.160738945007324 2023-01-22 11:32:18.333002: step: 784/464, loss: 1.8919651508331299 2023-01-22 11:32:19.111891: step: 786/464, loss: 1.4552507400512695 2023-01-22 11:32:19.839052: step: 788/464, loss: 1.8836709260940552 2023-01-22 11:32:20.587999: step: 790/464, loss: 2.472114086151123 2023-01-22 11:32:21.386060: step: 792/464, loss: 4.261412143707275 2023-01-22 11:32:22.196610: step: 794/464, loss: 0.5101083517074585 2023-01-22 11:32:22.945492: step: 796/464, loss: 0.31527039408683777 2023-01-22 11:32:23.682920: step: 798/464, loss: 0.5338462591171265 2023-01-22 11:32:24.391754: step: 800/464, loss: 4.2285871505737305 2023-01-22 11:32:25.174201: step: 802/464, loss: 0.4452413022518158 2023-01-22 11:32:26.022824: step: 804/464, loss: 2.7038815021514893 2023-01-22 11:32:26.780405: step: 806/464, loss: 1.2515664100646973 2023-01-22 11:32:27.553256: step: 808/464, loss: 0.48019105195999146 2023-01-22 11:32:28.326860: step: 810/464, loss: 0.627007246017456 2023-01-22 
11:32:29.134257: step: 812/464, loss: 0.7928240895271301 2023-01-22 11:32:29.908576: step: 814/464, loss: 1.1529955863952637 2023-01-22 11:32:30.770158: step: 816/464, loss: 0.727571964263916 2023-01-22 11:32:31.524451: step: 818/464, loss: 2.150031566619873 2023-01-22 11:32:32.250674: step: 820/464, loss: 1.0720735788345337 2023-01-22 11:32:33.065922: step: 822/464, loss: 1.3180612325668335 2023-01-22 11:32:33.908647: step: 824/464, loss: 1.2374560832977295 2023-01-22 11:32:34.613347: step: 826/464, loss: 10.174234390258789 2023-01-22 11:32:35.342359: step: 828/464, loss: 0.2928467392921448 2023-01-22 11:32:36.164007: step: 830/464, loss: 2.6930325031280518 2023-01-22 11:32:36.941737: step: 832/464, loss: 4.566496849060059 2023-01-22 11:32:37.678725: step: 834/464, loss: 0.7894934415817261 2023-01-22 11:32:38.523366: step: 836/464, loss: 13.314796447753906 2023-01-22 11:32:39.236962: step: 838/464, loss: 1.932656168937683 2023-01-22 11:32:39.946228: step: 840/464, loss: 0.557920515537262 2023-01-22 11:32:40.728678: step: 842/464, loss: 0.49012115597724915 2023-01-22 11:32:41.472599: step: 844/464, loss: 0.7082429528236389 2023-01-22 11:32:42.271205: step: 846/464, loss: 1.2856661081314087 2023-01-22 11:32:43.062882: step: 848/464, loss: 5.341327667236328 2023-01-22 11:32:43.864873: step: 850/464, loss: 0.5253156423568726 2023-01-22 11:32:44.555277: step: 852/464, loss: 7.842291831970215 2023-01-22 11:32:45.291894: step: 854/464, loss: 0.8897383809089661 2023-01-22 11:32:46.072851: step: 856/464, loss: 1.7032040357589722 2023-01-22 11:32:46.777019: step: 858/464, loss: 0.3025331497192383 2023-01-22 11:32:47.527157: step: 860/464, loss: 0.7118395566940308 2023-01-22 11:32:48.357946: step: 862/464, loss: 0.8716322183609009 2023-01-22 11:32:49.086536: step: 864/464, loss: 2.1756081581115723 2023-01-22 11:32:49.860676: step: 866/464, loss: 0.9115775227546692 2023-01-22 11:32:50.603946: step: 868/464, loss: 0.44515085220336914 2023-01-22 11:32:51.327807: step: 870/464, 
loss: 0.5697401762008667 2023-01-22 11:32:52.058851: step: 872/464, loss: 0.3869156539440155 2023-01-22 11:32:52.844338: step: 874/464, loss: 0.9719885587692261 2023-01-22 11:32:53.630393: step: 876/464, loss: 0.6325796842575073 2023-01-22 11:32:54.366012: step: 878/464, loss: 0.21479357779026031 2023-01-22 11:32:55.134546: step: 880/464, loss: 1.4217106103897095 2023-01-22 11:32:55.863347: step: 882/464, loss: 2.153522491455078 2023-01-22 11:32:56.611399: step: 884/464, loss: 1.7959179878234863 2023-01-22 11:32:57.410719: step: 886/464, loss: 1.5518654584884644 2023-01-22 11:32:58.154621: step: 888/464, loss: 1.2342078685760498 2023-01-22 11:32:58.991265: step: 890/464, loss: 1.6774334907531738 2023-01-22 11:32:59.865026: step: 892/464, loss: 0.7303564548492432 2023-01-22 11:33:00.604374: step: 894/464, loss: 1.8460235595703125 2023-01-22 11:33:01.359063: step: 896/464, loss: 1.0837057828903198 2023-01-22 11:33:02.088819: step: 898/464, loss: 2.979775905609131 2023-01-22 11:33:02.849621: step: 900/464, loss: 1.2387704849243164 2023-01-22 11:33:03.543655: step: 902/464, loss: 0.5870174169540405 2023-01-22 11:33:04.315101: step: 904/464, loss: 3.3540124893188477 2023-01-22 11:33:05.046169: step: 906/464, loss: 5.47309684753418 2023-01-22 11:33:05.861296: step: 908/464, loss: 0.7924412488937378 2023-01-22 11:33:06.560540: step: 910/464, loss: 1.008704662322998 2023-01-22 11:33:07.255786: step: 912/464, loss: 2.3228940963745117 2023-01-22 11:33:08.015081: step: 914/464, loss: 1.8138552904129028 2023-01-22 11:33:08.778161: step: 916/464, loss: 6.6115570068359375 2023-01-22 11:33:09.654690: step: 918/464, loss: 0.5387253761291504 2023-01-22 11:33:10.350712: step: 920/464, loss: 0.24979518353939056 2023-01-22 11:33:11.110343: step: 922/464, loss: 0.9564343690872192 2023-01-22 11:33:11.876978: step: 924/464, loss: 1.6989599466323853 2023-01-22 11:33:12.629856: step: 926/464, loss: 0.4422229528427124 2023-01-22 11:33:13.383027: step: 928/464, loss: 7.08799934387207 
2023-01-22 11:33:13.985782: step: 930/464, loss: 0.38450002670288086
==================================================
Loss: 2.017
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2942022263450835, 'r': 0.2145224567099567, 'f1': 0.24812235956814271}, 'combined': 0.18282700178705252, 'epoch': 1}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3433293159201212, 'r': 0.16676959888352003, 'f1': 0.22449329194512227}, 'combined': 0.1394221497343391, 'epoch': 1}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2684233449477352, 'r': 0.20843479437229437, 'f1': 0.2346558026195553}, 'combined': 0.17290427561440916, 'epoch': 1}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.32913483958241896, 'r': 0.16181653863442427, 'f1': 0.2169642976812254}, 'combined': 0.13474624803360316, 'epoch': 1}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2701031070322866, 'r': 0.20871603725222146, 'f1': 0.23547450356660882}, 'combined': 0.17350752894381702, 'epoch': 1}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.34470901105698804, 'r': 0.17015135059056832, 'f1': 0.2278392673477393}, 'combined': 0.14150017656333283, 'epoch': 1}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2478448275862069, 'r': 0.20535714285714285, 'f1': 0.224609375}, 'combined': 0.14973958333333331, 'epoch': 1}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.18181818181818182, 'r': 0.08695652173913043, 'f1': 0.1176470588235294}, 'combined': 0.0588235294117647, 'epoch': 1}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.06896551724137931, 'f1': 0.1176470588235294}, 'combined': 0.07843137254901959, 'epoch': 1}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2942022263450835, 'r': 0.2145224567099567, 'f1': 0.24812235956814271}, 'combined': 0.18282700178705252, 'epoch': 1}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3433293159201212, 'r': 0.16676959888352003, 'f1': 0.22449329194512227}, 'combined': 0.1394221497343391, 'epoch': 1}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2478448275862069, 'r': 0.20535714285714285, 'f1': 0.224609375}, 'combined': 0.14973958333333331, 'epoch': 1}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2684233449477352, 'r': 0.20843479437229437, 'f1': 0.2346558026195553}, 'combined': 0.17290427561440916, 'epoch': 1}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.32913483958241896, 'r': 0.16181653863442427, 'f1': 0.2169642976812254}, 'combined': 0.13474624803360316, 'epoch': 1}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.18181818181818182, 'r': 0.08695652173913043, 'f1': 0.1176470588235294}, 'combined': 0.0588235294117647, 'epoch': 1}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2701031070322866, 'r': 0.20871603725222146, 'f1': 0.23547450356660882}, 'combined': 0.17350752894381702, 'epoch': 1}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.34470901105698804, 'r': 0.17015135059056832, 'f1': 0.2278392673477393}, 'combined': 0.14150017656333283, 'epoch': 1}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.06896551724137931, 'f1': 0.1176470588235294}, 'combined': 0.07843137254901959, 'epoch': 1}
******************************
Epoch: 2
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 11:36:24.656427: step: 2/464, loss: 1.2311524152755737 2023-01-22 11:36:25.448705: step: 4/464, loss: 1.7830935716629028 2023-01-22 11:36:26.142987: step: 6/464, loss: 2.093751907348633 2023-01-22 11:36:26.805557: step: 8/464, loss: 2.221421718597412 2023-01-22 11:36:27.562599: step: 10/464, loss: 0.6308695077896118 2023-01-22 11:36:28.254360: step: 12/464, loss: 0.4722578525543213 2023-01-22 11:36:29.022114: step: 14/464, loss: 0.9641572833061218 2023-01-22 11:36:29.717148: step: 16/464, loss: 0.20689396560192108 2023-01-22 11:36:30.461744: step: 18/464, loss: 0.521108865737915 2023-01-22 11:36:31.173775: step: 20/464, loss: 0.6766486167907715 2023-01-22 11:36:31.973576: step: 22/464, loss: 2.2151641845703125 2023-01-22 11:36:32.767304: step: 24/464, loss: 1.4389357566833496 2023-01-22 11:36:33.484220: step: 26/464, loss: 0.9313511848449707 2023-01-22 11:36:34.188547: step: 28/464, loss: 2.294585943222046 2023-01-22 11:36:34.880965: step: 30/464, loss: 0.44173091650009155 2023-01-22 11:36:35.611096: step: 32/464, loss: 3.0902254581451416 2023-01-22 11:36:36.369065: step: 34/464, loss: 0.5559873580932617 2023-01-22 11:36:37.087261: step: 36/464, loss: 3.523479461669922 2023-01-22 11:36:37.919838: step: 38/464, loss: 1.3762128353118896 2023-01-22 11:36:38.653699: step: 40/464, loss: 0.8632673025131226 2023-01-22 11:36:39.428964: step: 42/464, loss: 0.735785722732544 2023-01-22 11:36:40.203314: step: 44/464, loss: 0.5903066992759705 2023-01-22
11:36:40.951493: step: 46/464, loss: 2.5188724994659424 2023-01-22 11:36:41.676346: step: 48/464, loss: 0.8308882117271423 2023-01-22 11:36:42.438809: step: 50/464, loss: 6.881199836730957 2023-01-22 11:36:43.175640: step: 52/464, loss: 2.8523025512695312 2023-01-22 11:36:43.883042: step: 54/464, loss: 2.872032880783081 2023-01-22 11:36:44.584728: step: 56/464, loss: 0.8633500337600708 2023-01-22 11:36:45.335211: step: 58/464, loss: 0.66408371925354 2023-01-22 11:36:46.086381: step: 60/464, loss: 1.057525396347046 2023-01-22 11:36:46.890587: step: 62/464, loss: 0.3248614966869354 2023-01-22 11:36:47.577920: step: 64/464, loss: 1.3190155029296875 2023-01-22 11:36:48.242592: step: 66/464, loss: 0.5487788915634155 2023-01-22 11:36:48.971181: step: 68/464, loss: 1.3203892707824707 2023-01-22 11:36:49.678269: step: 70/464, loss: 0.4012579023838043 2023-01-22 11:36:50.496194: step: 72/464, loss: 1.2286897897720337 2023-01-22 11:36:51.189485: step: 74/464, loss: 2.457897663116455 2023-01-22 11:36:51.918073: step: 76/464, loss: 3.04465913772583 2023-01-22 11:36:52.700878: step: 78/464, loss: 4.860772609710693 2023-01-22 11:36:53.427670: step: 80/464, loss: 1.2226388454437256 2023-01-22 11:36:54.174834: step: 82/464, loss: 0.30012208223342896 2023-01-22 11:36:54.906653: step: 84/464, loss: 1.0310789346694946 2023-01-22 11:36:55.652763: step: 86/464, loss: 3.1720633506774902 2023-01-22 11:36:56.388877: step: 88/464, loss: 1.8114345073699951 2023-01-22 11:36:57.080619: step: 90/464, loss: 1.5391186475753784 2023-01-22 11:36:57.741808: step: 92/464, loss: 0.7284096479415894 2023-01-22 11:36:58.552298: step: 94/464, loss: 6.220029830932617 2023-01-22 11:36:59.328419: step: 96/464, loss: 0.8167260885238647 2023-01-22 11:37:00.107694: step: 98/464, loss: 1.1842533349990845 2023-01-22 11:37:00.920055: step: 100/464, loss: 6.020644664764404 2023-01-22 11:37:01.597387: step: 102/464, loss: 0.5480703711509705 2023-01-22 11:37:02.311762: step: 104/464, loss: 1.0576692819595337 
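The per-step lines above all follow the fixed shape `timestamp: step: N/464, loss: X`, and each epoch ends with an averaged summary (e.g. `Loss: 1.627`). A minimal sketch of recomputing that per-epoch mean from such lines — the regex and the sample lines are illustrative assumptions, not code from `train.py`:

```python
import re
from statistics import mean

# Matches log lines like "2023-01-22 11:36:24.656427: step: 2/464, loss: 1.2311524152755737"
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.]+)")

def epoch_mean_loss(log_lines):
    """Average the per-step losses over one epoch's worth of log lines."""
    losses = [float(m.group(2)) for line in log_lines
              if (m := STEP_RE.search(line))]
    return mean(losses) if losses else None

# Usage with made-up lines in the log's format (values are hypothetical):
lines = [
    "2023-01-22 11:36:24.656427: step: 2/464, loss: 1.0",
    "2023-01-22 11:36:25.448705: step: 4/464, loss: 3.0",
]
print(epoch_mean_loss(lines))  # -> 2.0
```

Non-matching lines (separators, metric dicts) are simply skipped, so the whole log can be fed through without pre-filtering.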
2023-01-22 11:37:03.007346: step: 106/464, loss: 1.0002667903900146 2023-01-22 11:37:03.735695: step: 108/464, loss: 1.7153160572052002 2023-01-22 11:37:04.447699: step: 110/464, loss: 0.47342249751091003 2023-01-22 11:37:05.195600: step: 112/464, loss: 3.2831356525421143 2023-01-22 11:37:05.896346: step: 114/464, loss: 0.616359293460846 2023-01-22 11:37:06.634428: step: 116/464, loss: 7.425902843475342 2023-01-22 11:37:07.424114: step: 118/464, loss: 0.6359020471572876 2023-01-22 11:37:08.218026: step: 120/464, loss: 0.44187021255493164 2023-01-22 11:37:08.974604: step: 122/464, loss: 1.5610920190811157 2023-01-22 11:37:09.685252: step: 124/464, loss: 1.3820793628692627 2023-01-22 11:37:10.532242: step: 126/464, loss: 1.758582592010498 2023-01-22 11:37:11.301002: step: 128/464, loss: 0.8005374670028687 2023-01-22 11:37:12.198589: step: 130/464, loss: 4.700216293334961 2023-01-22 11:37:13.013555: step: 132/464, loss: 1.1407063007354736 2023-01-22 11:37:13.741698: step: 134/464, loss: 0.48220837116241455 2023-01-22 11:37:14.506629: step: 136/464, loss: 2.5536394119262695 2023-01-22 11:37:15.271045: step: 138/464, loss: 2.0750999450683594 2023-01-22 11:37:15.966262: step: 140/464, loss: 2.2614126205444336 2023-01-22 11:37:16.743566: step: 142/464, loss: 1.667802095413208 2023-01-22 11:37:17.488907: step: 144/464, loss: 1.616926908493042 2023-01-22 11:37:18.257913: step: 146/464, loss: 0.33585309982299805 2023-01-22 11:37:18.979230: step: 148/464, loss: 0.7217862010002136 2023-01-22 11:37:19.688504: step: 150/464, loss: 0.8636781573295593 2023-01-22 11:37:20.448849: step: 152/464, loss: 0.3174082636833191 2023-01-22 11:37:21.175079: step: 154/464, loss: 4.170505523681641 2023-01-22 11:37:21.906284: step: 156/464, loss: 5.086087226867676 2023-01-22 11:37:22.626209: step: 158/464, loss: 2.6408884525299072 2023-01-22 11:37:23.354559: step: 160/464, loss: 9.706635475158691 2023-01-22 11:37:24.090358: step: 162/464, loss: 0.5258678793907166 2023-01-22 11:37:24.772069: 
step: 164/464, loss: 1.1015256643295288 2023-01-22 11:37:25.516934: step: 166/464, loss: 0.4202771484851837 2023-01-22 11:37:26.290635: step: 168/464, loss: 0.8112895488739014 2023-01-22 11:37:26.981677: step: 170/464, loss: 0.44624030590057373 2023-01-22 11:37:27.655228: step: 172/464, loss: 0.385553240776062 2023-01-22 11:37:28.544722: step: 174/464, loss: 1.3247578144073486 2023-01-22 11:37:29.317349: step: 176/464, loss: 1.6399046182632446 2023-01-22 11:37:30.042298: step: 178/464, loss: 0.32270684838294983 2023-01-22 11:37:30.895657: step: 180/464, loss: 0.5270733833312988 2023-01-22 11:37:31.676266: step: 182/464, loss: 0.6382239460945129 2023-01-22 11:37:32.444751: step: 184/464, loss: 0.5134657621383667 2023-01-22 11:37:33.254912: step: 186/464, loss: 1.0302172899246216 2023-01-22 11:37:33.963795: step: 188/464, loss: 0.4852708578109741 2023-01-22 11:37:34.740892: step: 190/464, loss: 6.4960784912109375 2023-01-22 11:37:35.465584: step: 192/464, loss: 0.7964613437652588 2023-01-22 11:37:36.227846: step: 194/464, loss: 0.3542407155036926 2023-01-22 11:37:36.924405: step: 196/464, loss: 3.117703437805176 2023-01-22 11:37:37.650039: step: 198/464, loss: 0.8800047636032104 2023-01-22 11:37:38.454494: step: 200/464, loss: 0.6491445899009705 2023-01-22 11:37:39.201664: step: 202/464, loss: 0.549182116985321 2023-01-22 11:37:39.918744: step: 204/464, loss: 1.4039323329925537 2023-01-22 11:37:40.657887: step: 206/464, loss: 0.6293597221374512 2023-01-22 11:37:41.413852: step: 208/464, loss: 3.826097249984741 2023-01-22 11:37:42.358308: step: 210/464, loss: 0.357878178358078 2023-01-22 11:37:43.097691: step: 212/464, loss: 0.6146247982978821 2023-01-22 11:37:43.814818: step: 214/464, loss: 0.48976725339889526 2023-01-22 11:37:44.663941: step: 216/464, loss: 0.2621183395385742 2023-01-22 11:37:45.489876: step: 218/464, loss: 0.9402350187301636 2023-01-22 11:37:46.289759: step: 220/464, loss: 2.0679545402526855 2023-01-22 11:37:46.955455: step: 222/464, loss: 
2.71578049659729 2023-01-22 11:37:47.712043: step: 224/464, loss: 1.4649124145507812 2023-01-22 11:37:48.514880: step: 226/464, loss: 1.0338512659072876 2023-01-22 11:37:49.295322: step: 228/464, loss: 0.4438144564628601 2023-01-22 11:37:50.023114: step: 230/464, loss: 0.5852053761482239 2023-01-22 11:37:50.826542: step: 232/464, loss: 0.9666265249252319 2023-01-22 11:37:51.558048: step: 234/464, loss: 0.9432559609413147 2023-01-22 11:37:52.368755: step: 236/464, loss: 0.6843967437744141 2023-01-22 11:37:53.102593: step: 238/464, loss: 2.344256639480591 2023-01-22 11:37:53.876200: step: 240/464, loss: 1.774417757987976 2023-01-22 11:37:54.578573: step: 242/464, loss: 1.28359854221344 2023-01-22 11:37:55.374479: step: 244/464, loss: 1.9311877489089966 2023-01-22 11:37:56.093055: step: 246/464, loss: 1.1652761697769165 2023-01-22 11:37:56.870724: step: 248/464, loss: 1.9170876741409302 2023-01-22 11:37:57.637139: step: 250/464, loss: 0.16579344868659973 2023-01-22 11:37:58.457811: step: 252/464, loss: 0.22092676162719727 2023-01-22 11:37:59.213234: step: 254/464, loss: 0.44775503873825073 2023-01-22 11:38:00.010369: step: 256/464, loss: 0.511645495891571 2023-01-22 11:38:00.756831: step: 258/464, loss: 1.7976717948913574 2023-01-22 11:38:01.587410: step: 260/464, loss: 1.1174240112304688 2023-01-22 11:38:02.303754: step: 262/464, loss: 0.29902729392051697 2023-01-22 11:38:03.046190: step: 264/464, loss: 1.1252002716064453 2023-01-22 11:38:03.795159: step: 266/464, loss: 1.2579469680786133 2023-01-22 11:38:04.534325: step: 268/464, loss: 1.7518572807312012 2023-01-22 11:38:05.327823: step: 270/464, loss: 5.4651265144348145 2023-01-22 11:38:06.095792: step: 272/464, loss: 1.3823081254959106 2023-01-22 11:38:06.811681: step: 274/464, loss: 0.7875335216522217 2023-01-22 11:38:07.577759: step: 276/464, loss: 0.34622815251350403 2023-01-22 11:38:08.332393: step: 278/464, loss: 0.7284860014915466 2023-01-22 11:38:09.103725: step: 280/464, loss: 0.7788302302360535 2023-01-22 
11:38:09.809073: step: 282/464, loss: 0.44866272807121277 2023-01-22 11:38:10.602476: step: 284/464, loss: 3.9130313396453857 2023-01-22 11:38:11.440903: step: 286/464, loss: 0.5465903282165527 2023-01-22 11:38:12.340255: step: 288/464, loss: 1.2728848457336426 2023-01-22 11:38:13.074779: step: 290/464, loss: 3.584545612335205 2023-01-22 11:38:13.809355: step: 292/464, loss: 0.26818153262138367 2023-01-22 11:38:14.561808: step: 294/464, loss: 0.5165697336196899 2023-01-22 11:38:15.370120: step: 296/464, loss: 0.5162287950515747 2023-01-22 11:38:16.089644: step: 298/464, loss: 0.7398544549942017 2023-01-22 11:38:16.837136: step: 300/464, loss: 1.1116465330123901 2023-01-22 11:38:17.512321: step: 302/464, loss: 1.6432995796203613 2023-01-22 11:38:18.327020: step: 304/464, loss: 2.452953338623047 2023-01-22 11:38:19.172966: step: 306/464, loss: 0.43862515687942505 2023-01-22 11:38:19.950639: step: 308/464, loss: 0.7555627226829529 2023-01-22 11:38:20.768108: step: 310/464, loss: 10.338842391967773 2023-01-22 11:38:21.512895: step: 312/464, loss: 1.2393325567245483 2023-01-22 11:38:22.350999: step: 314/464, loss: 3.2715816497802734 2023-01-22 11:38:23.063024: step: 316/464, loss: 2.1629157066345215 2023-01-22 11:38:23.829151: step: 318/464, loss: 1.165483832359314 2023-01-22 11:38:24.568781: step: 320/464, loss: 1.7454993724822998 2023-01-22 11:38:25.312498: step: 322/464, loss: 0.9971399307250977 2023-01-22 11:38:26.116977: step: 324/464, loss: 0.9380273818969727 2023-01-22 11:38:26.952955: step: 326/464, loss: 2.0600996017456055 2023-01-22 11:38:27.661634: step: 328/464, loss: 0.7213514447212219 2023-01-22 11:38:28.435972: step: 330/464, loss: 0.18026822805404663 2023-01-22 11:38:29.182713: step: 332/464, loss: 0.91179358959198 2023-01-22 11:38:29.904523: step: 334/464, loss: 0.45584264397621155 2023-01-22 11:38:30.639121: step: 336/464, loss: 1.7639589309692383 2023-01-22 11:38:31.401433: step: 338/464, loss: 0.46160751581192017 2023-01-22 11:38:32.178802: step: 
340/464, loss: 0.4666309058666229 2023-01-22 11:38:33.031346: step: 342/464, loss: 1.0047087669372559 2023-01-22 11:38:33.864780: step: 344/464, loss: 3.103390693664551 2023-01-22 11:38:34.705006: step: 346/464, loss: 1.8842649459838867 2023-01-22 11:38:35.432790: step: 348/464, loss: 1.3954133987426758 2023-01-22 11:38:36.150698: step: 350/464, loss: 0.9122724533081055 2023-01-22 11:38:36.941394: step: 352/464, loss: 1.214568853378296 2023-01-22 11:38:37.641777: step: 354/464, loss: 0.7801638245582581 2023-01-22 11:38:38.533354: step: 356/464, loss: 0.8779234290122986 2023-01-22 11:38:39.307592: step: 358/464, loss: 0.21076428890228271 2023-01-22 11:38:40.058188: step: 360/464, loss: 0.5160536170005798 2023-01-22 11:38:40.743751: step: 362/464, loss: 0.2467440962791443 2023-01-22 11:38:41.531042: step: 364/464, loss: 3.439088821411133 2023-01-22 11:38:42.296154: step: 366/464, loss: 1.7350010871887207 2023-01-22 11:38:42.980241: step: 368/464, loss: 1.040642499923706 2023-01-22 11:38:43.744976: step: 370/464, loss: 2.3546676635742188 2023-01-22 11:38:44.510222: step: 372/464, loss: 1.243515968322754 2023-01-22 11:38:45.215614: step: 374/464, loss: 1.7401405572891235 2023-01-22 11:38:45.924746: step: 376/464, loss: 0.2650654911994934 2023-01-22 11:38:46.741705: step: 378/464, loss: 5.185362339019775 2023-01-22 11:38:47.436438: step: 380/464, loss: 0.27703002095222473 2023-01-22 11:38:48.188395: step: 382/464, loss: 1.7378602027893066 2023-01-22 11:38:48.920137: step: 384/464, loss: 0.3635416626930237 2023-01-22 11:38:49.717941: step: 386/464, loss: 0.9020249843597412 2023-01-22 11:38:50.437910: step: 388/464, loss: 1.87539803981781 2023-01-22 11:38:51.155133: step: 390/464, loss: 2.89346981048584 2023-01-22 11:38:51.884661: step: 392/464, loss: 4.191920757293701 2023-01-22 11:38:52.637776: step: 394/464, loss: 4.365891456604004 2023-01-22 11:38:53.354629: step: 396/464, loss: 0.9762228727340698 2023-01-22 11:38:54.082561: step: 398/464, loss: 1.7938779592514038 
2023-01-22 11:38:54.861751: step: 400/464, loss: 0.5622841119766235 2023-01-22 11:38:55.633253: step: 402/464, loss: 1.528921365737915 2023-01-22 11:38:56.428704: step: 404/464, loss: 0.5592197179794312 2023-01-22 11:38:57.144814: step: 406/464, loss: 0.3098835349082947 2023-01-22 11:38:57.928262: step: 408/464, loss: 1.2348670959472656 2023-01-22 11:38:58.600655: step: 410/464, loss: 2.2563488483428955 2023-01-22 11:38:59.334661: step: 412/464, loss: 0.5252102017402649 2023-01-22 11:39:00.083795: step: 414/464, loss: 0.188716322183609 2023-01-22 11:39:00.918373: step: 416/464, loss: 0.5975791811943054 2023-01-22 11:39:01.734425: step: 418/464, loss: 0.9677351117134094 2023-01-22 11:39:02.420063: step: 420/464, loss: 1.548161268234253 2023-01-22 11:39:03.110750: step: 422/464, loss: 0.8674664497375488 2023-01-22 11:39:03.846073: step: 424/464, loss: 0.962982714176178 2023-01-22 11:39:04.569749: step: 426/464, loss: 1.0416945219039917 2023-01-22 11:39:05.433858: step: 428/464, loss: 1.573225975036621 2023-01-22 11:39:06.151759: step: 430/464, loss: 1.4529905319213867 2023-01-22 11:39:06.991741: step: 432/464, loss: 0.43926820158958435 2023-01-22 11:39:07.775046: step: 434/464, loss: 0.9773903489112854 2023-01-22 11:39:08.502027: step: 436/464, loss: 4.343841552734375 2023-01-22 11:39:09.221789: step: 438/464, loss: 9.071602821350098 2023-01-22 11:39:09.939390: step: 440/464, loss: 0.8627247214317322 2023-01-22 11:39:10.692966: step: 442/464, loss: 0.36518824100494385 2023-01-22 11:39:11.444619: step: 444/464, loss: 1.3917042016983032 2023-01-22 11:39:12.261166: step: 446/464, loss: 0.9935654401779175 2023-01-22 11:39:13.083208: step: 448/464, loss: 0.8858616948127747 2023-01-22 11:39:13.887222: step: 450/464, loss: 0.7505353093147278 2023-01-22 11:39:14.604614: step: 452/464, loss: 1.6640448570251465 2023-01-22 11:39:15.371953: step: 454/464, loss: 1.4120301008224487 2023-01-22 11:39:16.094947: step: 456/464, loss: 1.7707470655441284 2023-01-22 11:39:16.843470: 
step: 458/464, loss: 0.7131549715995789 2023-01-22 11:39:17.639594: step: 460/464, loss: 1.9891457557678223 2023-01-22 11:39:18.500104: step: 462/464, loss: 0.7496906518936157 2023-01-22 11:39:19.243281: step: 464/464, loss: 1.9933056831359863 2023-01-22 11:39:20.056007: step: 466/464, loss: 0.29916486144065857 2023-01-22 11:39:20.831904: step: 468/464, loss: 0.9164116382598877 2023-01-22 11:39:21.605552: step: 470/464, loss: 1.3954243659973145 2023-01-22 11:39:22.363368: step: 472/464, loss: 1.375968098640442 2023-01-22 11:39:23.038127: step: 474/464, loss: 0.18370366096496582 2023-01-22 11:39:23.880709: step: 476/464, loss: 0.6455228328704834 2023-01-22 11:39:24.651788: step: 478/464, loss: 5.674762725830078 2023-01-22 11:39:25.424153: step: 480/464, loss: 1.7659356594085693 2023-01-22 11:39:26.226827: step: 482/464, loss: 13.219393730163574 2023-01-22 11:39:26.922582: step: 484/464, loss: 3.6019363403320312 2023-01-22 11:39:27.673726: step: 486/464, loss: 1.4204580783843994 2023-01-22 11:39:28.395323: step: 488/464, loss: 0.5861679315567017 2023-01-22 11:39:29.136605: step: 490/464, loss: 1.4343019723892212 2023-01-22 11:39:29.955755: step: 492/464, loss: 0.4973408579826355 2023-01-22 11:39:30.682735: step: 494/464, loss: 0.45284712314605713 2023-01-22 11:39:31.377255: step: 496/464, loss: 0.6855869293212891 2023-01-22 11:39:32.118099: step: 498/464, loss: 0.5692172050476074 2023-01-22 11:39:32.833072: step: 500/464, loss: 0.7531363368034363 2023-01-22 11:39:33.580380: step: 502/464, loss: 1.243901252746582 2023-01-22 11:39:34.279354: step: 504/464, loss: 1.8031026124954224 2023-01-22 11:39:35.017310: step: 506/464, loss: 4.986914157867432 2023-01-22 11:39:35.797664: step: 508/464, loss: 2.2970335483551025 2023-01-22 11:39:36.490113: step: 510/464, loss: 1.328364372253418 2023-01-22 11:39:37.345630: step: 512/464, loss: 0.7237935066223145 2023-01-22 11:39:38.123916: step: 514/464, loss: 0.4380602240562439 2023-01-22 11:39:38.934429: step: 516/464, loss: 
0.8923473954200745 2023-01-22 11:39:39.778328: step: 518/464, loss: 0.30256325006484985 2023-01-22 11:39:40.573103: step: 520/464, loss: 1.0971101522445679 2023-01-22 11:39:41.369813: step: 522/464, loss: 0.7510563135147095 2023-01-22 11:39:42.254010: step: 524/464, loss: 7.840592384338379 2023-01-22 11:39:42.952395: step: 526/464, loss: 0.3427618145942688 2023-01-22 11:39:43.762450: step: 528/464, loss: 0.6645991206169128 2023-01-22 11:39:44.499641: step: 530/464, loss: 3.503472328186035 2023-01-22 11:39:45.197944: step: 532/464, loss: 1.8764781951904297 2023-01-22 11:39:46.039507: step: 534/464, loss: 8.248307228088379 2023-01-22 11:39:46.812898: step: 536/464, loss: 1.3547332286834717 2023-01-22 11:39:47.614263: step: 538/464, loss: 2.041508436203003 2023-01-22 11:39:48.340286: step: 540/464, loss: 3.7953057289123535 2023-01-22 11:39:49.065198: step: 542/464, loss: 1.6800973415374756 2023-01-22 11:39:49.872813: step: 544/464, loss: 1.61313796043396 2023-01-22 11:39:50.596589: step: 546/464, loss: 1.4671369791030884 2023-01-22 11:39:51.348169: step: 548/464, loss: 1.2521719932556152 2023-01-22 11:39:52.152638: step: 550/464, loss: 5.39314079284668 2023-01-22 11:39:52.851037: step: 552/464, loss: 1.7202141284942627 2023-01-22 11:39:53.660309: step: 554/464, loss: 1.8110288381576538 2023-01-22 11:39:54.501415: step: 556/464, loss: 1.6474603414535522 2023-01-22 11:39:55.248871: step: 558/464, loss: 1.1023237705230713 2023-01-22 11:39:56.023301: step: 560/464, loss: 1.2070395946502686 2023-01-22 11:39:56.779721: step: 562/464, loss: 2.496333360671997 2023-01-22 11:39:57.517976: step: 564/464, loss: 1.7204372882843018 2023-01-22 11:39:58.251991: step: 566/464, loss: 1.6716899871826172 2023-01-22 11:39:58.996636: step: 568/464, loss: 1.0436874628067017 2023-01-22 11:39:59.742323: step: 570/464, loss: 1.597758412361145 2023-01-22 11:40:00.514729: step: 572/464, loss: 1.1227065324783325 2023-01-22 11:40:01.302424: step: 574/464, loss: 0.37439507246017456 2023-01-22 
11:40:01.986107: step: 576/464, loss: 0.45976024866104126 2023-01-22 11:40:02.755333: step: 578/464, loss: 1.6979976892471313 2023-01-22 11:40:03.476528: step: 580/464, loss: 0.21175535023212433 2023-01-22 11:40:04.293858: step: 582/464, loss: 0.5808183550834656 2023-01-22 11:40:05.055285: step: 584/464, loss: 2.6092567443847656 2023-01-22 11:40:05.799026: step: 586/464, loss: 1.6025683879852295 2023-01-22 11:40:06.570154: step: 588/464, loss: 0.41113734245300293 2023-01-22 11:40:07.303657: step: 590/464, loss: 0.741346538066864 2023-01-22 11:40:08.080607: step: 592/464, loss: 1.8383212089538574 2023-01-22 11:40:08.801281: step: 594/464, loss: 2.8973169326782227 2023-01-22 11:40:09.573692: step: 596/464, loss: 2.1495635509490967 2023-01-22 11:40:10.362545: step: 598/464, loss: 3.4064736366271973 2023-01-22 11:40:11.101906: step: 600/464, loss: 1.1322810649871826 2023-01-22 11:40:11.977145: step: 602/464, loss: 2.7991814613342285 2023-01-22 11:40:12.687633: step: 604/464, loss: 0.3369024395942688 2023-01-22 11:40:13.461164: step: 606/464, loss: 1.2564321756362915 2023-01-22 11:40:14.127784: step: 608/464, loss: 2.30317759513855 2023-01-22 11:40:14.847905: step: 610/464, loss: 1.2888926267623901 2023-01-22 11:40:15.585632: step: 612/464, loss: 1.0882058143615723 2023-01-22 11:40:16.351676: step: 614/464, loss: 1.2446033954620361 2023-01-22 11:40:17.138078: step: 616/464, loss: 0.1993260234594345 2023-01-22 11:40:17.874223: step: 618/464, loss: 0.8716303110122681 2023-01-22 11:40:18.628454: step: 620/464, loss: 1.3918261528015137 2023-01-22 11:40:19.351985: step: 622/464, loss: 0.903435230255127 2023-01-22 11:40:20.129333: step: 624/464, loss: 0.8670368194580078 2023-01-22 11:40:20.860985: step: 626/464, loss: 0.6818031072616577 2023-01-22 11:40:21.613928: step: 628/464, loss: 0.5814501047134399 2023-01-22 11:40:22.391232: step: 630/464, loss: 2.889941453933716 2023-01-22 11:40:23.175110: step: 632/464, loss: 0.46588319540023804 2023-01-22 11:40:23.894807: step: 
634/464, loss: 1.2020893096923828 2023-01-22 11:40:24.753891: step: 636/464, loss: 4.312961101531982 2023-01-22 11:40:25.536506: step: 638/464, loss: 2.40643310546875 2023-01-22 11:40:26.316173: step: 640/464, loss: 0.5957890152931213 2023-01-22 11:40:27.030017: step: 642/464, loss: 0.5673452615737915 2023-01-22 11:40:27.678105: step: 644/464, loss: 1.2440736293792725 2023-01-22 11:40:28.375754: step: 646/464, loss: 2.3645412921905518 2023-01-22 11:40:29.144561: step: 648/464, loss: 3.939936876296997 2023-01-22 11:40:29.893499: step: 650/464, loss: 1.5717930793762207 2023-01-22 11:40:30.625199: step: 652/464, loss: 0.369645893573761 2023-01-22 11:40:31.357473: step: 654/464, loss: 0.5531639456748962 2023-01-22 11:40:32.076918: step: 656/464, loss: 2.1650702953338623 2023-01-22 11:40:32.898108: step: 658/464, loss: 4.002811431884766 2023-01-22 11:40:33.583596: step: 660/464, loss: 0.5150967836380005 2023-01-22 11:40:34.228714: step: 662/464, loss: 0.8544698357582092 2023-01-22 11:40:34.946022: step: 664/464, loss: 1.0840424299240112 2023-01-22 11:40:35.706501: step: 666/464, loss: 0.2541203498840332 2023-01-22 11:40:36.478225: step: 668/464, loss: 2.227607488632202 2023-01-22 11:40:37.210860: step: 670/464, loss: 0.5744845867156982 2023-01-22 11:40:38.009221: step: 672/464, loss: 1.1762940883636475 2023-01-22 11:40:38.912439: step: 674/464, loss: 1.3462352752685547 2023-01-22 11:40:39.665623: step: 676/464, loss: 1.058624505996704 2023-01-22 11:40:40.442380: step: 678/464, loss: 0.7258554697036743 2023-01-22 11:40:41.181380: step: 680/464, loss: 0.5475314855575562 2023-01-22 11:40:41.923307: step: 682/464, loss: 1.2779431343078613 2023-01-22 11:40:42.703082: step: 684/464, loss: 1.897872805595398 2023-01-22 11:40:43.452356: step: 686/464, loss: 0.41958799958229065 2023-01-22 11:40:44.249715: step: 688/464, loss: 0.3179838955402374 2023-01-22 11:40:45.040494: step: 690/464, loss: 0.5948276519775391 2023-01-22 11:40:45.845614: step: 692/464, loss: 0.8153871297836304 
2023-01-22 11:40:46.533748: step: 694/464, loss: 0.4472903311252594 2023-01-22 11:40:47.294693: step: 696/464, loss: 0.3059772849082947 2023-01-22 11:40:48.018428: step: 698/464, loss: 1.5691204071044922 2023-01-22 11:40:48.792445: step: 700/464, loss: 0.8117853403091431 2023-01-22 11:40:49.552519: step: 702/464, loss: 0.5508538484573364 2023-01-22 11:40:50.301635: step: 704/464, loss: 0.8707534670829773 2023-01-22 11:40:51.103814: step: 706/464, loss: 1.5053908824920654 2023-01-22 11:40:51.798523: step: 708/464, loss: 1.3240344524383545 2023-01-22 11:40:52.565397: step: 710/464, loss: 0.18909026682376862 2023-01-22 11:40:53.297776: step: 712/464, loss: 0.9368202090263367 2023-01-22 11:40:54.070808: step: 714/464, loss: 0.8139339685440063 2023-01-22 11:40:54.824306: step: 716/464, loss: 2.7701938152313232 2023-01-22 11:40:55.524934: step: 718/464, loss: 0.41402795910835266 2023-01-22 11:40:56.353703: step: 720/464, loss: 2.5215907096862793 2023-01-22 11:40:57.217302: step: 722/464, loss: 0.7189838290214539 2023-01-22 11:40:57.925484: step: 724/464, loss: 0.9132556915283203 2023-01-22 11:40:58.648295: step: 726/464, loss: 1.678982138633728 2023-01-22 11:40:59.459979: step: 728/464, loss: 1.4411935806274414 2023-01-22 11:41:00.187548: step: 730/464, loss: 0.815687358379364 2023-01-22 11:41:00.908063: step: 732/464, loss: 1.181318759918213 2023-01-22 11:41:01.586468: step: 734/464, loss: 2.313734531402588 2023-01-22 11:41:02.265133: step: 736/464, loss: 4.424801349639893 2023-01-22 11:41:02.947484: step: 738/464, loss: 2.7652111053466797 2023-01-22 11:41:03.652924: step: 740/464, loss: 1.9522521495819092 2023-01-22 11:41:04.401435: step: 742/464, loss: 1.2239620685577393 2023-01-22 11:41:05.176606: step: 744/464, loss: 1.2581905126571655 2023-01-22 11:41:05.910121: step: 746/464, loss: 1.5191899538040161 2023-01-22 11:41:06.618851: step: 748/464, loss: 2.230410099029541 2023-01-22 11:41:07.384896: step: 750/464, loss: 1.4481711387634277 2023-01-22 11:41:08.086886: 
step: 752/464, loss: 1.096886157989502 2023-01-22 11:41:08.842036: step: 754/464, loss: 0.3762080669403076 2023-01-22 11:41:09.517366: step: 756/464, loss: 1.5608004331588745 2023-01-22 11:41:10.269524: step: 758/464, loss: 1.5761467218399048 2023-01-22 11:41:11.026932: step: 760/464, loss: 0.257263720035553 2023-01-22 11:41:11.820973: step: 762/464, loss: 0.40274691581726074 2023-01-22 11:41:12.591357: step: 764/464, loss: 0.15530270338058472 2023-01-22 11:41:13.378562: step: 766/464, loss: 0.3401526212692261 2023-01-22 11:41:14.138900: step: 768/464, loss: 4.480698585510254 2023-01-22 11:41:14.897004: step: 770/464, loss: 2.8207991123199463 2023-01-22 11:41:15.678568: step: 772/464, loss: 0.4728001356124878 2023-01-22 11:41:16.353210: step: 774/464, loss: 3.435164451599121 2023-01-22 11:41:17.174649: step: 776/464, loss: 0.5584661364555359 2023-01-22 11:41:17.895437: step: 778/464, loss: 10.358461380004883 2023-01-22 11:41:18.658726: step: 780/464, loss: 1.0920209884643555 2023-01-22 11:41:19.410830: step: 782/464, loss: 5.505377769470215 2023-01-22 11:41:20.254154: step: 784/464, loss: 0.4454532265663147 2023-01-22 11:41:21.031603: step: 786/464, loss: 0.917724609375 2023-01-22 11:41:21.742721: step: 788/464, loss: 0.9360051155090332 2023-01-22 11:41:22.477533: step: 790/464, loss: 1.591188907623291 2023-01-22 11:41:23.208604: step: 792/464, loss: 0.5870780348777771 2023-01-22 11:41:23.922402: step: 794/464, loss: 0.6872538924217224 2023-01-22 11:41:24.592426: step: 796/464, loss: 0.4402296841144562 2023-01-22 11:41:25.308872: step: 798/464, loss: 1.5215715169906616 2023-01-22 11:41:26.190481: step: 800/464, loss: 1.1026206016540527 2023-01-22 11:41:26.942139: step: 802/464, loss: 0.8330729603767395 2023-01-22 11:41:27.907146: step: 804/464, loss: 0.5337406396865845 2023-01-22 11:41:28.610038: step: 806/464, loss: 1.0772619247436523 2023-01-22 11:41:29.372252: step: 808/464, loss: 0.3447778522968292 2023-01-22 11:41:30.115646: step: 810/464, loss: 
1.0803074836730957 2023-01-22 11:41:30.953039: step: 812/464, loss: 3.355602741241455 2023-01-22 11:41:31.672777: step: 814/464, loss: 0.5117831230163574 2023-01-22 11:41:32.431711: step: 816/464, loss: 4.319705009460449 2023-01-22 11:41:33.140587: step: 818/464, loss: 1.5117748975753784 2023-01-22 11:41:33.868191: step: 820/464, loss: 2.1120705604553223 2023-01-22 11:41:34.629575: step: 822/464, loss: 0.9269815683364868 2023-01-22 11:41:35.364603: step: 824/464, loss: 0.7869834899902344 2023-01-22 11:41:36.096830: step: 826/464, loss: 2.005740165710449 2023-01-22 11:41:36.880948: step: 828/464, loss: 0.9282411336898804 2023-01-22 11:41:37.734085: step: 830/464, loss: 0.7959149479866028 2023-01-22 11:41:38.470955: step: 832/464, loss: 4.79236364364624 2023-01-22 11:41:39.202724: step: 834/464, loss: 0.7582805752754211 2023-01-22 11:41:39.960216: step: 836/464, loss: 0.9575939178466797 2023-01-22 11:41:40.713347: step: 838/464, loss: 2.441572666168213 2023-01-22 11:41:41.505415: step: 840/464, loss: 7.67648983001709 2023-01-22 11:41:42.262586: step: 842/464, loss: 0.6135707497596741 2023-01-22 11:41:43.021634: step: 844/464, loss: 0.34217751026153564 2023-01-22 11:41:43.741835: step: 846/464, loss: 1.6250519752502441 2023-01-22 11:41:44.587889: step: 848/464, loss: 5.8050079345703125 2023-01-22 11:41:45.341489: step: 850/464, loss: 0.6835314631462097 2023-01-22 11:41:46.062701: step: 852/464, loss: 0.7090028524398804 2023-01-22 11:41:46.837903: step: 854/464, loss: 0.9978798031806946 2023-01-22 11:41:47.592242: step: 856/464, loss: 0.6591278910636902 2023-01-22 11:41:48.427707: step: 858/464, loss: 0.9084625244140625 2023-01-22 11:41:49.258002: step: 860/464, loss: 0.6810509562492371 2023-01-22 11:41:49.952148: step: 862/464, loss: 0.6677621006965637 2023-01-22 11:41:50.785103: step: 864/464, loss: 1.1423194408416748 2023-01-22 11:41:51.584708: step: 866/464, loss: 4.478385925292969 2023-01-22 11:41:52.425795: step: 868/464, loss: 2.2984297275543213 2023-01-22 
11:41:53.174450: step: 870/464, loss: 8.030241966247559 2023-01-22 11:41:53.941750: step: 872/464, loss: 0.6332724690437317 2023-01-22 11:41:54.789011: step: 874/464, loss: 11.14112377166748 2023-01-22 11:41:55.571643: step: 876/464, loss: 0.5625685453414917 2023-01-22 11:41:56.326815: step: 878/464, loss: 0.28894609212875366 2023-01-22 11:41:57.046996: step: 880/464, loss: 0.6808298826217651 2023-01-22 11:41:57.800598: step: 882/464, loss: 0.6325333118438721 2023-01-22 11:41:58.575842: step: 884/464, loss: 1.1005043983459473 2023-01-22 11:41:59.273012: step: 886/464, loss: 1.6734281778335571 2023-01-22 11:42:00.054553: step: 888/464, loss: 1.63663911819458 2023-01-22 11:42:00.811498: step: 890/464, loss: 0.4760796129703522 2023-01-22 11:42:01.553898: step: 892/464, loss: 4.4341254234313965 2023-01-22 11:42:02.231539: step: 894/464, loss: 2.1422908306121826 2023-01-22 11:42:03.004924: step: 896/464, loss: 3.9849419593811035 2023-01-22 11:42:03.747669: step: 898/464, loss: 1.9513940811157227 2023-01-22 11:42:04.417537: step: 900/464, loss: 2.256234645843506 2023-01-22 11:42:05.107988: step: 902/464, loss: 1.4129538536071777 2023-01-22 11:42:05.881618: step: 904/464, loss: 0.4391446113586426 2023-01-22 11:42:06.654758: step: 906/464, loss: 1.476486086845398 2023-01-22 11:42:07.334138: step: 908/464, loss: 1.112925410270691 2023-01-22 11:42:08.110679: step: 910/464, loss: 2.4584295749664307 2023-01-22 11:42:08.833217: step: 912/464, loss: 0.6789507865905762 2023-01-22 11:42:09.638149: step: 914/464, loss: 3.559232473373413 2023-01-22 11:42:10.427030: step: 916/464, loss: 0.7847874760627747 2023-01-22 11:42:11.276445: step: 918/464, loss: 0.9742597341537476 2023-01-22 11:42:12.090656: step: 920/464, loss: 1.0963793992996216 2023-01-22 11:42:12.906004: step: 922/464, loss: 0.8131226897239685 2023-01-22 11:42:13.601780: step: 924/464, loss: 1.1673150062561035 2023-01-22 11:42:14.388647: step: 926/464, loss: 1.7198338508605957 2023-01-22 11:42:15.101892: step: 928/464, 
loss: 1.1896001100540161
2023-01-22 11:42:15.760751: step: 930/464, loss: 0.19970275461673737
==================================================
Loss: 1.627
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26392734575488536, 'r': 0.26042830897404406, 'f1': 0.2621661527898861}, 'combined': 0.19317505995044237, 'epoch': 2}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.26018035901844405, 'r': 0.23575631345841225, 'f1': 0.24736691469146002}, 'combined': 0.15362787333469624, 'epoch': 2}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791589190130989, 'r': 0.2728143981264376, 'f1': 0.27595019580605185}, 'combined': 0.20333172322551188, 'epoch': 2}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2684008847293172, 'r': 0.23796094218112052, 'f1': 0.2522659648422961}, 'combined': 0.1566704413231102, 'epoch': 2}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.16428571428571428, 'r': 0.25, 'f1': 0.19827586206896552}, 'combined': 0.09913793103448276, 'epoch': 2}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3125, 'r': 0.1724137931034483, 'f1': 0.22222222222222224}, 'combined': 0.14814814814814814, 'epoch': 2}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26392734575488536, 'r': 0.26042830897404406, 'f1': 0.2621661527898861}, 'combined': 0.19317505995044237, 'epoch': 2}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.26018035901844405, 'r': 0.23575631345841225, 'f1': 0.24736691469146002}, 'combined': 0.15362787333469624, 'epoch': 2}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.16428571428571428, 'r': 0.25, 'f1': 0.19827586206896552}, 'combined': 0.09913793103448276, 'epoch': 2}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791589190130989, 'r': 0.2728143981264376, 'f1': 0.27595019580605185}, 'combined': 0.20333172322551188, 'epoch': 2}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2684008847293172, 'r': 0.23796094218112052,
'f1': 0.2522659648422961}, 'combined': 0.1566704413231102, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3125, 'r': 0.1724137931034483, 'f1': 0.22222222222222224}, 'combined': 0.14814814814814814, 'epoch': 2} ****************************** Epoch: 3 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:45:31.357836: step: 2/464, loss: 0.5728092789649963 2023-01-22 11:45:32.056795: step: 4/464, loss: 1.6666297912597656 2023-01-22 11:45:32.881790: step: 6/464, loss: 0.9216357469558716 2023-01-22 11:45:33.685278: step: 8/464, loss: 0.5197516679763794 2023-01-22 11:45:34.429784: step: 10/464, loss: 1.2050963640213013 2023-01-22 11:45:35.189593: step: 12/464, loss: 0.5885632038116455 2023-01-22 11:45:35.968151: step: 14/464, loss: 3.2682347297668457 2023-01-22 11:45:36.682962: step: 16/464, loss: 0.785646378993988 2023-01-22 11:45:37.508214: step: 18/464, loss: 1.124619483947754 2023-01-22 11:45:38.382836: step: 20/464, loss: 1.2299696207046509 2023-01-22 11:45:39.168509: step: 22/464, loss: 0.6963951587677002 2023-01-22 11:45:39.968162: step: 24/464, loss: 0.9705716967582703 2023-01-22 11:45:40.769716: step: 26/464, loss: 0.5654642581939697 2023-01-22 11:45:41.518570: step: 28/464, loss: 0.9723125100135803 2023-01-22 11:45:42.327495: step: 30/464, loss: 1.4342992305755615 2023-01-22 11:45:43.056250: step: 32/464, loss: 1.0085419416427612 2023-01-22 11:45:43.799647: step: 34/464, loss: 0.2719883620738983 2023-01-22 11:45:44.512630: step: 36/464, loss: 0.6373543739318848 2023-01-22 11:45:45.261169: step: 38/464, loss: 1.7300281524658203 2023-01-22 11:45:46.072618: step: 40/464, loss: 0.2606988251209259 2023-01-22 11:45:46.908835: step: 42/464, loss: 0.26028841733932495 2023-01-22 11:45:47.671668: step: 44/464, loss: 0.4146405756473541 
2023-01-22 11:45:48.423149: step: 46/464, loss: 0.3497615456581116 2023-01-22 11:45:49.264369: step: 48/464, loss: 0.6876152753829956 2023-01-22 11:45:49.992086: step: 50/464, loss: 0.3188195824623108 2023-01-22 11:45:50.769620: step: 52/464, loss: 0.5253841280937195 2023-01-22 11:45:51.523729: step: 54/464, loss: 0.6125118136405945 2023-01-22 11:45:52.278316: step: 56/464, loss: 0.7966494560241699 2023-01-22 11:45:53.072776: step: 58/464, loss: 2.040208339691162 2023-01-22 11:45:53.899806: step: 60/464, loss: 1.6140918731689453 2023-01-22 11:45:54.714635: step: 62/464, loss: 1.486433982849121 2023-01-22 11:45:55.503074: step: 64/464, loss: 0.5457777380943298 2023-01-22 11:45:56.265136: step: 66/464, loss: 0.5147194266319275 2023-01-22 11:45:56.945167: step: 68/464, loss: 0.9083465933799744 2023-01-22 11:45:57.675563: step: 70/464, loss: 0.44538620114326477 2023-01-22 11:45:58.426134: step: 72/464, loss: 0.8497267961502075 2023-01-22 11:45:59.262724: step: 74/464, loss: 0.9923356175422668 2023-01-22 11:46:00.023114: step: 76/464, loss: 0.22544407844543457 2023-01-22 11:46:00.721080: step: 78/464, loss: 0.7999193668365479 2023-01-22 11:46:01.386182: step: 80/464, loss: 1.3178879022598267 2023-01-22 11:46:02.184769: step: 82/464, loss: 1.2020602226257324 2023-01-22 11:46:02.934252: step: 84/464, loss: 0.2553924322128296 2023-01-22 11:46:03.667666: step: 86/464, loss: 0.285842627286911 2023-01-22 11:46:04.445049: step: 88/464, loss: 0.6283391714096069 2023-01-22 11:46:05.138589: step: 90/464, loss: 1.5153621435165405 2023-01-22 11:46:05.912183: step: 92/464, loss: 0.7195795178413391 2023-01-22 11:46:06.661018: step: 94/464, loss: 0.4200820326805115 2023-01-22 11:46:07.449053: step: 96/464, loss: 0.575192391872406 2023-01-22 11:46:08.309993: step: 98/464, loss: 3.373455762863159 2023-01-22 11:46:09.000891: step: 100/464, loss: 1.1632630825042725 2023-01-22 11:46:09.724880: step: 102/464, loss: 0.7568671703338623 2023-01-22 11:46:10.443994: step: 104/464, loss: 
0.4086417257785797 2023-01-22 11:46:11.210895: step: 106/464, loss: 2.0888540744781494 2023-01-22 11:46:11.959654: step: 108/464, loss: 0.4860653281211853 2023-01-22 11:46:12.769656: step: 110/464, loss: 2.1116738319396973 2023-01-22 11:46:13.540984: step: 112/464, loss: 1.112531065940857 2023-01-22 11:46:14.322692: step: 114/464, loss: 2.252843141555786 2023-01-22 11:46:15.108057: step: 116/464, loss: 0.5225369930267334 2023-01-22 11:46:15.858150: step: 118/464, loss: 1.0584449768066406 2023-01-22 11:46:16.598049: step: 120/464, loss: 6.0835371017456055 2023-01-22 11:46:17.424918: step: 122/464, loss: 2.733490467071533 2023-01-22 11:46:18.186095: step: 124/464, loss: 1.4847700595855713 2023-01-22 11:46:19.011606: step: 126/464, loss: 2.487853527069092 2023-01-22 11:46:19.846908: step: 128/464, loss: 1.2639483213424683 2023-01-22 11:46:20.623783: step: 130/464, loss: 1.7387046813964844 2023-01-22 11:46:21.365022: step: 132/464, loss: 1.5289192199707031 2023-01-22 11:46:22.123523: step: 134/464, loss: 1.3074873685836792 2023-01-22 11:46:22.853899: step: 136/464, loss: 0.5716285109519958 2023-01-22 11:46:23.596399: step: 138/464, loss: 0.24294021725654602 2023-01-22 11:46:24.242403: step: 140/464, loss: 1.5488613843917847 2023-01-22 11:46:25.014624: step: 142/464, loss: 1.200225591659546 2023-01-22 11:46:25.736207: step: 144/464, loss: 1.0220304727554321 2023-01-22 11:46:26.482510: step: 146/464, loss: 0.27229607105255127 2023-01-22 11:46:27.237737: step: 148/464, loss: 1.827118158340454 2023-01-22 11:46:28.039199: step: 150/464, loss: 0.7478753924369812 2023-01-22 11:46:28.858653: step: 152/464, loss: 1.1500465869903564 2023-01-22 11:46:29.608504: step: 154/464, loss: 0.29159173369407654 2023-01-22 11:46:30.443926: step: 156/464, loss: 2.232395648956299 2023-01-22 11:46:31.173410: step: 158/464, loss: 0.5495754480361938 2023-01-22 11:46:31.934524: step: 160/464, loss: 1.171030879020691 2023-01-22 11:46:32.720084: step: 162/464, loss: 3.065204620361328 2023-01-22 
11:46:33.433911: step: 164/464, loss: 1.0704265832901 2023-01-22 11:46:34.231986: step: 166/464, loss: 1.6894574165344238 2023-01-22 11:46:35.000009: step: 168/464, loss: 0.600529134273529 2023-01-22 11:46:35.755516: step: 170/464, loss: 6.212244510650635 2023-01-22 11:46:36.521689: step: 172/464, loss: 0.6440250277519226 2023-01-22 11:46:37.273408: step: 174/464, loss: 1.3235870599746704 2023-01-22 11:46:37.996269: step: 176/464, loss: 0.802613377571106 2023-01-22 11:46:38.806930: step: 178/464, loss: 1.1508985757827759 2023-01-22 11:46:39.567491: step: 180/464, loss: 0.834412693977356 2023-01-22 11:46:40.351685: step: 182/464, loss: 1.041805624961853 2023-01-22 11:46:41.065537: step: 184/464, loss: 0.36926427483558655 2023-01-22 11:46:41.803348: step: 186/464, loss: 2.562335968017578 2023-01-22 11:46:42.601464: step: 188/464, loss: 2.016711711883545 2023-01-22 11:46:43.378665: step: 190/464, loss: 1.2292652130126953 2023-01-22 11:46:44.077445: step: 192/464, loss: 0.31911700963974 2023-01-22 11:46:44.800533: step: 194/464, loss: 1.648099422454834 2023-01-22 11:46:45.552195: step: 196/464, loss: 0.5986658334732056 2023-01-22 11:46:46.257285: step: 198/464, loss: 1.1526308059692383 2023-01-22 11:46:47.031815: step: 200/464, loss: 1.5631656646728516 2023-01-22 11:46:47.903183: step: 202/464, loss: 0.9817005395889282 2023-01-22 11:46:48.641416: step: 204/464, loss: 0.5580400228500366 2023-01-22 11:46:49.371192: step: 206/464, loss: 0.6199046969413757 2023-01-22 11:46:50.022103: step: 208/464, loss: 0.47845181822776794 2023-01-22 11:46:50.723983: step: 210/464, loss: 0.6486523747444153 2023-01-22 11:46:51.431989: step: 212/464, loss: 1.9843801259994507 2023-01-22 11:46:52.190024: step: 214/464, loss: 0.5011245608329773 2023-01-22 11:46:52.833277: step: 216/464, loss: 0.6039958000183105 2023-01-22 11:46:53.533098: step: 218/464, loss: 0.7514966726303101 2023-01-22 11:46:54.247201: step: 220/464, loss: 0.9103838801383972 2023-01-22 11:46:55.004525: step: 222/464, loss: 
0.17597933113574982 2023-01-22 11:46:55.751077: step: 224/464, loss: 1.262900710105896 2023-01-22 11:46:56.393170: step: 226/464, loss: 0.26214057207107544 2023-01-22 11:46:57.136186: step: 228/464, loss: 0.3917402923107147 2023-01-22 11:46:57.965251: step: 230/464, loss: 1.3757576942443848 2023-01-22 11:46:58.737283: step: 232/464, loss: 0.43058058619499207 2023-01-22 11:46:59.524124: step: 234/464, loss: 1.0661262273788452 2023-01-22 11:47:00.304499: step: 236/464, loss: 0.6610551476478577 2023-01-22 11:47:01.091152: step: 238/464, loss: 5.103069305419922 2023-01-22 11:47:01.843174: step: 240/464, loss: 1.5642491579055786 2023-01-22 11:47:02.669122: step: 242/464, loss: 3.216989040374756 2023-01-22 11:47:03.381239: step: 244/464, loss: 1.0435258150100708 2023-01-22 11:47:04.202756: step: 246/464, loss: 0.7400312423706055 2023-01-22 11:47:04.979488: step: 248/464, loss: 1.1300452947616577 2023-01-22 11:47:05.749806: step: 250/464, loss: 3.0344204902648926 2023-01-22 11:47:06.480969: step: 252/464, loss: 0.28440696001052856 2023-01-22 11:47:07.173247: step: 254/464, loss: 2.7709171772003174 2023-01-22 11:47:07.892242: step: 256/464, loss: 1.6949918270111084 2023-01-22 11:47:08.580763: step: 258/464, loss: 0.7783820033073425 2023-01-22 11:47:09.373122: step: 260/464, loss: 1.269107699394226 2023-01-22 11:47:10.151602: step: 262/464, loss: 0.2354559451341629 2023-01-22 11:47:10.862219: step: 264/464, loss: 0.3233901858329773 2023-01-22 11:47:11.646485: step: 266/464, loss: 6.524602890014648 2023-01-22 11:47:12.482408: step: 268/464, loss: 1.039006233215332 2023-01-22 11:47:13.192636: step: 270/464, loss: 0.36432382464408875 2023-01-22 11:47:14.051595: step: 272/464, loss: 0.6748474836349487 2023-01-22 11:47:14.801841: step: 274/464, loss: 2.2763798236846924 2023-01-22 11:47:15.546774: step: 276/464, loss: 0.3138844072818756 2023-01-22 11:47:16.277738: step: 278/464, loss: 1.578597068786621 2023-01-22 11:47:17.083240: step: 280/464, loss: 0.9769968390464783 2023-01-22 
11:47:17.836430: step: 282/464, loss: 1.1653294563293457 2023-01-22 11:47:18.757097: step: 284/464, loss: 1.2490097284317017 2023-01-22 11:47:19.535899: step: 286/464, loss: 1.5943344831466675 2023-01-22 11:47:20.311472: step: 288/464, loss: 0.7358044981956482 2023-01-22 11:47:21.098971: step: 290/464, loss: 0.6281088590621948 2023-01-22 11:47:21.867030: step: 292/464, loss: 0.30778172612190247 2023-01-22 11:47:22.724006: step: 294/464, loss: 1.1002516746520996 2023-01-22 11:47:23.436888: step: 296/464, loss: 0.5382761359214783 2023-01-22 11:47:24.219982: step: 298/464, loss: 1.0521222352981567 2023-01-22 11:47:24.920277: step: 300/464, loss: 0.6565387845039368 2023-01-22 11:47:25.674113: step: 302/464, loss: 0.42484602332115173 2023-01-22 11:47:26.456958: step: 304/464, loss: 1.4506440162658691 2023-01-22 11:47:27.257705: step: 306/464, loss: 0.7423726320266724 2023-01-22 11:47:28.063148: step: 308/464, loss: 0.9235724806785583 2023-01-22 11:47:28.809139: step: 310/464, loss: 0.973768949508667 2023-01-22 11:47:29.550982: step: 312/464, loss: 5.143934726715088 2023-01-22 11:47:30.289877: step: 314/464, loss: 0.5139178037643433 2023-01-22 11:47:30.992013: step: 316/464, loss: 0.7528850436210632 2023-01-22 11:47:31.701196: step: 318/464, loss: 1.8455528020858765 2023-01-22 11:47:32.409211: step: 320/464, loss: 0.9531939029693604 2023-01-22 11:47:33.102087: step: 322/464, loss: 2.141551971435547 2023-01-22 11:47:33.884310: step: 324/464, loss: 1.0747900009155273 2023-01-22 11:47:34.715380: step: 326/464, loss: 1.205752968788147 2023-01-22 11:47:35.529967: step: 328/464, loss: 1.5321749448776245 2023-01-22 11:47:36.294584: step: 330/464, loss: 1.215419054031372 2023-01-22 11:47:37.049274: step: 332/464, loss: 0.4505934417247772 2023-01-22 11:47:37.782260: step: 334/464, loss: 0.421829491853714 2023-01-22 11:47:38.536703: step: 336/464, loss: 0.3405742645263672 2023-01-22 11:47:39.221297: step: 338/464, loss: 0.27169865369796753 2023-01-22 11:47:39.838045: step: 
340/464, loss: 1.4433740377426147 2023-01-22 11:47:40.606389: step: 342/464, loss: 0.1989186704158783 2023-01-22 11:47:41.451236: step: 344/464, loss: 3.2341744899749756 2023-01-22 11:47:42.199911: step: 346/464, loss: 1.6050325632095337 2023-01-22 11:47:42.926158: step: 348/464, loss: 2.6557462215423584 2023-01-22 11:47:43.822627: step: 350/464, loss: 0.30212539434432983 2023-01-22 11:47:44.548490: step: 352/464, loss: 0.28420719504356384 2023-01-22 11:47:45.335995: step: 354/464, loss: 0.18610507249832153 2023-01-22 11:47:46.179095: step: 356/464, loss: 1.3019359111785889 2023-01-22 11:47:46.950554: step: 358/464, loss: 1.315913200378418 2023-01-22 11:47:47.629278: step: 360/464, loss: 1.4917500019073486 2023-01-22 11:47:48.456269: step: 362/464, loss: 0.5647187829017639 2023-01-22 11:47:49.201097: step: 364/464, loss: 0.34709829092025757 2023-01-22 11:47:49.983473: step: 366/464, loss: 1.1877985000610352 2023-01-22 11:47:50.707352: step: 368/464, loss: 0.29472312331199646 2023-01-22 11:47:51.469317: step: 370/464, loss: 0.6145370006561279 2023-01-22 11:47:52.207682: step: 372/464, loss: 2.30469012260437 2023-01-22 11:47:53.003398: step: 374/464, loss: 1.8337349891662598 2023-01-22 11:47:53.789982: step: 376/464, loss: 0.39600124955177307 2023-01-22 11:47:54.622437: step: 378/464, loss: 0.4111950695514679 2023-01-22 11:47:55.340037: step: 380/464, loss: 2.2897562980651855 2023-01-22 11:47:56.201626: step: 382/464, loss: 0.4143930673599243 2023-01-22 11:47:56.989344: step: 384/464, loss: 1.055375099182129 2023-01-22 11:47:57.852796: step: 386/464, loss: 0.27740585803985596 2023-01-22 11:47:58.640111: step: 388/464, loss: 0.8603242635726929 2023-01-22 11:47:59.372484: step: 390/464, loss: 0.9055214524269104 2023-01-22 11:48:00.091492: step: 392/464, loss: 0.43781185150146484 2023-01-22 11:48:01.036098: step: 394/464, loss: 1.1656153202056885 2023-01-22 11:48:01.729946: step: 396/464, loss: 0.13810645043849945 2023-01-22 11:48:02.422563: step: 398/464, loss: 
1.0912761688232422 2023-01-22 11:48:03.196105: step: 400/464, loss: 0.5084050297737122 2023-01-22 11:48:03.992385: step: 402/464, loss: 3.217175006866455 2023-01-22 11:48:04.701633: step: 404/464, loss: 1.1894950866699219 2023-01-22 11:48:05.498428: step: 406/464, loss: 0.6266841888427734 2023-01-22 11:48:06.200133: step: 408/464, loss: 0.862480640411377 2023-01-22 11:48:06.911396: step: 410/464, loss: 0.3887926936149597 2023-01-22 11:48:07.722453: step: 412/464, loss: 1.5715738534927368 2023-01-22 11:48:08.499683: step: 414/464, loss: 0.43553680181503296 2023-01-22 11:48:09.259842: step: 416/464, loss: 0.9947322010993958 2023-01-22 11:48:10.011312: step: 418/464, loss: 0.2606787383556366 2023-01-22 11:48:10.755138: step: 420/464, loss: 0.36131685972213745 2023-01-22 11:48:11.510728: step: 422/464, loss: 1.416828989982605 2023-01-22 11:48:12.198019: step: 424/464, loss: 0.21712100505828857 2023-01-22 11:48:12.993785: step: 426/464, loss: 8.038595199584961 2023-01-22 11:48:13.721869: step: 428/464, loss: 0.4072921872138977 2023-01-22 11:48:14.425755: step: 430/464, loss: 1.0790162086486816 2023-01-22 11:48:15.185261: step: 432/464, loss: 0.5021647810935974 2023-01-22 11:48:16.004240: step: 434/464, loss: 0.18573813140392303 2023-01-22 11:48:16.817981: step: 436/464, loss: 0.41294416785240173 2023-01-22 11:48:17.547817: step: 438/464, loss: 0.49419182538986206 2023-01-22 11:48:18.256081: step: 440/464, loss: 0.7594491839408875 2023-01-22 11:48:19.025303: step: 442/464, loss: 0.5722238421440125 2023-01-22 11:48:19.824933: step: 444/464, loss: 0.5605573654174805 2023-01-22 11:48:20.562015: step: 446/464, loss: 0.43364423513412476 2023-01-22 11:48:21.335563: step: 448/464, loss: 0.632633626461029 2023-01-22 11:48:22.067231: step: 450/464, loss: 1.6663942337036133 2023-01-22 11:48:22.760096: step: 452/464, loss: 3.8243656158447266 2023-01-22 11:48:23.597569: step: 454/464, loss: 0.572558581829071 2023-01-22 11:48:24.421960: step: 456/464, loss: 0.7511688470840454 
2023-01-22 11:48:25.186952: step: 458/464, loss: 1.0489780902862549 2023-01-22 11:48:25.978224: step: 460/464, loss: 1.9205814599990845 2023-01-22 11:48:26.716767: step: 462/464, loss: 0.6067019104957581 2023-01-22 11:48:27.419954: step: 464/464, loss: 0.5245887041091919 2023-01-22 11:48:28.195043: step: 466/464, loss: 0.38798803091049194 2023-01-22 11:48:28.963345: step: 468/464, loss: 1.0211551189422607 2023-01-22 11:48:29.675039: step: 470/464, loss: 0.848628044128418 2023-01-22 11:48:30.448232: step: 472/464, loss: 4.330991744995117 2023-01-22 11:48:31.198910: step: 474/464, loss: 0.36634135246276855 2023-01-22 11:48:32.051847: step: 476/464, loss: 0.8512176275253296 2023-01-22 11:48:32.810314: step: 478/464, loss: 3.9317703247070312 2023-01-22 11:48:33.625341: step: 480/464, loss: 0.7550517320632935 2023-01-22 11:48:34.347026: step: 482/464, loss: 0.6175093650817871 2023-01-22 11:48:35.059703: step: 484/464, loss: 0.4945186376571655 2023-01-22 11:48:35.813739: step: 486/464, loss: 0.8519952297210693 2023-01-22 11:48:36.549903: step: 488/464, loss: 1.2580838203430176 2023-01-22 11:48:37.306414: step: 490/464, loss: 1.7508350610733032 2023-01-22 11:48:38.037040: step: 492/464, loss: 2.554095983505249 2023-01-22 11:48:38.807737: step: 494/464, loss: 0.6756281852722168 2023-01-22 11:48:39.557777: step: 496/464, loss: 0.5863864421844482 2023-01-22 11:48:40.334283: step: 498/464, loss: 0.5947279334068298 2023-01-22 11:48:41.122671: step: 500/464, loss: 0.8904541730880737 2023-01-22 11:48:41.916762: step: 502/464, loss: 5.003479957580566 2023-01-22 11:48:42.710257: step: 504/464, loss: 1.011299967765808 2023-01-22 11:48:43.486474: step: 506/464, loss: 0.562726616859436 2023-01-22 11:48:44.147664: step: 508/464, loss: 0.868272602558136 2023-01-22 11:48:44.872292: step: 510/464, loss: 0.799949049949646 2023-01-22 11:48:45.657793: step: 512/464, loss: 1.1345824003219604 2023-01-22 11:48:46.352336: step: 514/464, loss: 1.110640287399292 2023-01-22 11:48:47.115129: step: 
516/464, loss: 6.441725254058838 2023-01-22 11:48:47.795072: step: 518/464, loss: 0.31550291180610657 2023-01-22 11:48:48.572824: step: 520/464, loss: 0.36972540616989136 2023-01-22 11:48:49.319781: step: 522/464, loss: 0.3183565139770508 2023-01-22 11:48:50.088840: step: 524/464, loss: 1.7634189128875732 2023-01-22 11:48:50.862901: step: 526/464, loss: 0.6870174407958984 2023-01-22 11:48:51.612413: step: 528/464, loss: 0.8734363317489624 2023-01-22 11:48:52.317602: step: 530/464, loss: 3.1973259449005127 2023-01-22 11:48:53.038069: step: 532/464, loss: 0.20919574797153473 2023-01-22 11:48:53.856475: step: 534/464, loss: 4.95903205871582 2023-01-22 11:48:54.559462: step: 536/464, loss: 0.2242266833782196 2023-01-22 11:48:55.302305: step: 538/464, loss: 0.9058042168617249 2023-01-22 11:48:56.045155: step: 540/464, loss: 0.5100103616714478 2023-01-22 11:48:56.788671: step: 542/464, loss: 4.897339820861816 2023-01-22 11:48:57.525380: step: 544/464, loss: 1.1864523887634277 2023-01-22 11:48:58.273400: step: 546/464, loss: 1.217078447341919 2023-01-22 11:48:59.084895: step: 548/464, loss: 1.9085209369659424 2023-01-22 11:48:59.828006: step: 550/464, loss: 1.4283592700958252 2023-01-22 11:49:00.605655: step: 552/464, loss: 1.6112210750579834 2023-01-22 11:49:01.296539: step: 554/464, loss: 1.0772231817245483 2023-01-22 11:49:02.072674: step: 556/464, loss: 0.282481849193573 2023-01-22 11:49:02.825835: step: 558/464, loss: 2.148629665374756 2023-01-22 11:49:03.590033: step: 560/464, loss: 1.2082405090332031 2023-01-22 11:49:04.256658: step: 562/464, loss: 0.44685670733451843 2023-01-22 11:49:04.915471: step: 564/464, loss: 1.7282108068466187 2023-01-22 11:49:05.647160: step: 566/464, loss: 1.388054609298706 2023-01-22 11:49:06.457941: step: 568/464, loss: 1.1820733547210693 2023-01-22 11:49:07.260008: step: 570/464, loss: 0.8305114507675171 2023-01-22 11:49:08.011361: step: 572/464, loss: 2.3896570205688477 2023-01-22 11:49:08.798212: step: 574/464, loss: 
0.38784927129745483 2023-01-22 11:49:09.562189: step: 576/464, loss: 0.8815006017684937 2023-01-22 11:49:10.325711: step: 578/464, loss: 1.1910144090652466 2023-01-22 11:49:11.006882: step: 580/464, loss: 1.0457309484481812 2023-01-22 11:49:11.811463: step: 582/464, loss: 1.1997699737548828 2023-01-22 11:49:12.530057: step: 584/464, loss: 2.263993978500366 2023-01-22 11:49:13.351711: step: 586/464, loss: 1.034746527671814 2023-01-22 11:49:14.232548: step: 588/464, loss: 3.714615821838379 2023-01-22 11:49:15.029999: step: 590/464, loss: 1.3886295557022095 2023-01-22 11:49:15.783024: step: 592/464, loss: 0.8830761909484863 2023-01-22 11:49:16.545790: step: 594/464, loss: 0.5461584329605103 2023-01-22 11:49:17.336401: step: 596/464, loss: 1.3844913244247437 2023-01-22 11:49:18.111447: step: 598/464, loss: 1.209411859512329 2023-01-22 11:49:18.801018: step: 600/464, loss: 0.11756696552038193 2023-01-22 11:49:19.562434: step: 602/464, loss: 0.6817506551742554 2023-01-22 11:49:20.317690: step: 604/464, loss: 0.7887982726097107 2023-01-22 11:49:21.093801: step: 606/464, loss: 10.397603988647461 2023-01-22 11:49:21.885820: step: 608/464, loss: 2.0216078758239746 2023-01-22 11:49:22.697362: step: 610/464, loss: 3.20801043510437 2023-01-22 11:49:23.519327: step: 612/464, loss: 1.7752575874328613 2023-01-22 11:49:24.284998: step: 614/464, loss: 0.9904509782791138 2023-01-22 11:49:25.015324: step: 616/464, loss: 1.2129480838775635 2023-01-22 11:49:25.736298: step: 618/464, loss: 0.8754247426986694 2023-01-22 11:49:26.458604: step: 620/464, loss: 2.8872084617614746 2023-01-22 11:49:27.151954: step: 622/464, loss: 0.9464370012283325 2023-01-22 11:49:27.928061: step: 624/464, loss: 0.6756294965744019 2023-01-22 11:49:28.650917: step: 626/464, loss: 0.8634769916534424 2023-01-22 11:49:29.318821: step: 628/464, loss: 2.143869400024414 2023-01-22 11:49:30.109538: step: 630/464, loss: 0.4692254066467285 2023-01-22 11:49:30.833198: step: 632/464, loss: 0.5207785367965698 2023-01-22 
11:49:31.546606: step: 634/464, loss: 0.5450736284255981 2023-01-22 11:49:32.331118: step: 636/464, loss: 0.4662814140319824 2023-01-22 11:49:33.107194: step: 638/464, loss: 0.448432594537735 2023-01-22 11:49:33.850546: step: 640/464, loss: 0.6444108486175537 2023-01-22 11:49:34.508261: step: 642/464, loss: 1.191615343093872 2023-01-22 11:49:35.220750: step: 644/464, loss: 2.963665008544922 2023-01-22 11:49:35.928135: step: 646/464, loss: 1.7940711975097656 2023-01-22 11:49:36.677702: step: 648/464, loss: 0.5459948778152466 2023-01-22 11:49:37.390104: step: 650/464, loss: 1.9765942096710205 2023-01-22 11:49:38.069108: step: 652/464, loss: 0.784062922000885 2023-01-22 11:49:38.826491: step: 654/464, loss: 0.6920633912086487 2023-01-22 11:49:39.632859: step: 656/464, loss: 0.2723618149757385 2023-01-22 11:49:40.278374: step: 658/464, loss: 7.097645282745361 2023-01-22 11:49:41.050883: step: 660/464, loss: 1.8126413822174072 2023-01-22 11:49:41.858173: step: 662/464, loss: 1.025843620300293 2023-01-22 11:49:42.590068: step: 664/464, loss: 0.48338747024536133 2023-01-22 11:49:43.268266: step: 666/464, loss: 1.12424898147583 2023-01-22 11:49:43.954950: step: 668/464, loss: 2.855159282684326 2023-01-22 11:49:44.720518: step: 670/464, loss: 0.46674615144729614 2023-01-22 11:49:45.481015: step: 672/464, loss: 4.845136642456055 2023-01-22 11:49:46.255803: step: 674/464, loss: 0.7831045389175415 2023-01-22 11:49:47.084592: step: 676/464, loss: 1.4002931118011475 2023-01-22 11:49:47.862445: step: 678/464, loss: 0.4147849678993225 2023-01-22 11:49:48.601646: step: 680/464, loss: 1.0630037784576416 2023-01-22 11:49:49.485598: step: 682/464, loss: 2.586699962615967 2023-01-22 11:49:50.159227: step: 684/464, loss: 0.8683406114578247 2023-01-22 11:49:50.909115: step: 686/464, loss: 1.7670503854751587 2023-01-22 11:49:51.650106: step: 688/464, loss: 0.2609246075153351 2023-01-22 11:49:52.442067: step: 690/464, loss: 0.4074583947658539 2023-01-22 11:49:53.132882: step: 692/464, 
loss: 1.5731773376464844 2023-01-22 11:49:53.853017: step: 694/464, loss: 0.8332323431968689 2023-01-22 11:49:54.633617: step: 696/464, loss: 1.06688392162323 2023-01-22 11:49:55.434194: step: 698/464, loss: 17.2744083404541 2023-01-22 11:49:56.200560: step: 700/464, loss: 1.0277228355407715 2023-01-22 11:49:56.846916: step: 702/464, loss: 1.0320792198181152 2023-01-22 11:49:57.637615: step: 704/464, loss: 0.6775314807891846 2023-01-22 11:49:58.400235: step: 706/464, loss: 0.8533573150634766 2023-01-22 11:49:59.208680: step: 708/464, loss: 1.7036771774291992 2023-01-22 11:49:59.905396: step: 710/464, loss: 0.3614809513092041 2023-01-22 11:50:00.677483: step: 712/464, loss: 0.7562206983566284 2023-01-22 11:50:01.527658: step: 714/464, loss: 1.420016884803772 2023-01-22 11:50:02.316883: step: 716/464, loss: 1.6971116065979004 2023-01-22 11:50:03.092432: step: 718/464, loss: 0.3137013614177704 2023-01-22 11:50:03.779357: step: 720/464, loss: 0.7519422769546509 2023-01-22 11:50:04.516082: step: 722/464, loss: 1.1476322412490845 2023-01-22 11:50:05.283938: step: 724/464, loss: 0.9484537243843079 2023-01-22 11:50:06.124830: step: 726/464, loss: 0.9967315196990967 2023-01-22 11:50:06.858381: step: 728/464, loss: 1.7490859031677246 2023-01-22 11:50:07.623917: step: 730/464, loss: 0.9392281174659729 2023-01-22 11:50:08.412862: step: 732/464, loss: 2.6961867809295654 2023-01-22 11:50:09.162039: step: 734/464, loss: 0.4748970866203308 2023-01-22 11:50:09.853504: step: 736/464, loss: 0.4516463279724121 2023-01-22 11:50:10.597380: step: 738/464, loss: 0.8927331566810608 2023-01-22 11:50:11.313517: step: 740/464, loss: 1.5245000123977661 2023-01-22 11:50:12.066849: step: 742/464, loss: 2.048813581466675 2023-01-22 11:50:12.868715: step: 744/464, loss: 0.941074788570404 2023-01-22 11:50:13.591036: step: 746/464, loss: 0.9800759553909302 2023-01-22 11:50:14.329342: step: 748/464, loss: 0.7823781371116638 2023-01-22 11:50:15.071083: step: 750/464, loss: 2.4357547760009766 
2023-01-22 11:50:15.821642: step: 752/464, loss: 2.9422607421875 2023-01-22 11:50:16.621793: step: 754/464, loss: 1.5935968160629272 2023-01-22 11:50:17.416569: step: 756/464, loss: 0.6326267123222351 2023-01-22 11:50:18.099604: step: 758/464, loss: 0.5571883320808411 2023-01-22 11:50:18.850299: step: 760/464, loss: 0.4855504631996155 2023-01-22 11:50:19.559180: step: 762/464, loss: 2.940896511077881 2023-01-22 11:50:20.271060: step: 764/464, loss: 0.7516842484474182 2023-01-22 11:50:21.054284: step: 766/464, loss: 0.41791966557502747 2023-01-22 11:50:21.763702: step: 768/464, loss: 1.0552831888198853 2023-01-22 11:50:22.495071: step: 770/464, loss: 2.5800318717956543 2023-01-22 11:50:23.251395: step: 772/464, loss: 0.5675879716873169 2023-01-22 11:50:24.099487: step: 774/464, loss: 0.8680490255355835 2023-01-22 11:50:24.870358: step: 776/464, loss: 1.4004287719726562 2023-01-22 11:50:25.669786: step: 778/464, loss: 0.8454161882400513 2023-01-22 11:50:26.438698: step: 780/464, loss: 0.904163122177124 2023-01-22 11:50:27.204302: step: 782/464, loss: 0.6053056120872498 2023-01-22 11:50:27.921977: step: 784/464, loss: 0.21278899908065796 2023-01-22 11:50:28.673230: step: 786/464, loss: 4.879207611083984 2023-01-22 11:50:29.400899: step: 788/464, loss: 0.8809518814086914 2023-01-22 11:50:30.162674: step: 790/464, loss: 1.1817303895950317 2023-01-22 11:50:30.916508: step: 792/464, loss: 2.4634742736816406 2023-01-22 11:50:31.677992: step: 794/464, loss: 0.7877818942070007 2023-01-22 11:50:32.414561: step: 796/464, loss: 0.3282577693462372 2023-01-22 11:50:33.159576: step: 798/464, loss: 0.5703331828117371 2023-01-22 11:50:33.870274: step: 800/464, loss: 1.4936720132827759 2023-01-22 11:50:34.621651: step: 802/464, loss: 2.9826011657714844 2023-01-22 11:50:35.427425: step: 804/464, loss: 0.5595707893371582 2023-01-22 11:50:36.156170: step: 806/464, loss: 3.8043880462646484 2023-01-22 11:50:36.903875: step: 808/464, loss: 1.1721092462539673 2023-01-22 11:50:37.617278: 
step: 810/464, loss: 0.980811595916748 2023-01-22 11:50:38.372160: step: 812/464, loss: 4.484066963195801 2023-01-22 11:50:39.250684: step: 814/464, loss: 0.5161080360412598 2023-01-22 11:50:40.038903: step: 816/464, loss: 0.8435653448104858 2023-01-22 11:50:40.870498: step: 818/464, loss: 0.9036491513252258 2023-01-22 11:50:41.650899: step: 820/464, loss: 1.5572112798690796 2023-01-22 11:50:42.399935: step: 822/464, loss: 0.7679738402366638 2023-01-22 11:50:43.156406: step: 824/464, loss: 1.4088101387023926 2023-01-22 11:50:43.956370: step: 826/464, loss: 0.3526294529438019 2023-01-22 11:50:44.689021: step: 828/464, loss: 0.7417711019515991 2023-01-22 11:50:45.423543: step: 830/464, loss: 0.29044491052627563 2023-01-22 11:50:46.165354: step: 832/464, loss: 0.6436646580696106 2023-01-22 11:50:46.870914: step: 834/464, loss: 0.7028468251228333 2023-01-22 11:50:47.611807: step: 836/464, loss: 0.35998618602752686 2023-01-22 11:50:48.314190: step: 838/464, loss: 0.3419780135154724 2023-01-22 11:50:48.976417: step: 840/464, loss: 1.7909822463989258 2023-01-22 11:50:49.754068: step: 842/464, loss: 1.2623323202133179 2023-01-22 11:50:50.538030: step: 844/464, loss: 2.257474899291992 2023-01-22 11:50:51.319906: step: 846/464, loss: 0.32528308033943176 2023-01-22 11:50:52.208490: step: 848/464, loss: 0.24421803653240204 2023-01-22 11:50:52.896058: step: 850/464, loss: 0.41921886801719666 2023-01-22 11:50:53.758891: step: 852/464, loss: 4.022235870361328 2023-01-22 11:50:54.474006: step: 854/464, loss: 1.883143424987793 2023-01-22 11:50:55.192027: step: 856/464, loss: 0.9430786371231079 2023-01-22 11:50:55.865344: step: 858/464, loss: 0.6863318681716919 2023-01-22 11:50:56.722729: step: 860/464, loss: 5.270090103149414 2023-01-22 11:50:57.490184: step: 862/464, loss: 0.574114978313446 2023-01-22 11:50:58.235683: step: 864/464, loss: 0.9139434695243835 2023-01-22 11:50:58.944496: step: 866/464, loss: 1.3902381658554077 2023-01-22 11:50:59.696549: step: 868/464, loss: 
0.6280410289764404 2023-01-22 11:51:00.444343: step: 870/464, loss: 0.24456611275672913 2023-01-22 11:51:01.236371: step: 872/464, loss: 2.4077582359313965 2023-01-22 11:51:01.915314: step: 874/464, loss: 1.9177312850952148 2023-01-22 11:51:02.633023: step: 876/464, loss: 0.7885171175003052 2023-01-22 11:51:03.417750: step: 878/464, loss: 0.9026452302932739 2023-01-22 11:51:04.236227: step: 880/464, loss: 2.1258692741394043 2023-01-22 11:51:05.023843: step: 882/464, loss: 2.132807970046997 2023-01-22 11:51:05.742350: step: 884/464, loss: 0.6072775721549988 2023-01-22 11:51:06.534756: step: 886/464, loss: 1.4739820957183838 2023-01-22 11:51:07.346141: step: 888/464, loss: 0.6416248679161072 2023-01-22 11:51:08.075544: step: 890/464, loss: 2.314173936843872 2023-01-22 11:51:08.862796: step: 892/464, loss: 0.9566014409065247 2023-01-22 11:51:09.719797: step: 894/464, loss: 0.6913099884986877 2023-01-22 11:51:10.485727: step: 896/464, loss: 2.844928741455078 2023-01-22 11:51:11.256792: step: 898/464, loss: 0.2614089846611023 2023-01-22 11:51:12.084805: step: 900/464, loss: 5.129838943481445 2023-01-22 11:51:12.815643: step: 902/464, loss: 0.91115802526474 2023-01-22 11:51:13.565801: step: 904/464, loss: 2.6563570499420166 2023-01-22 11:51:14.376451: step: 906/464, loss: 1.552895188331604 2023-01-22 11:51:15.086967: step: 908/464, loss: 0.2228650450706482 2023-01-22 11:51:15.865339: step: 910/464, loss: 1.1201412677764893 2023-01-22 11:51:16.580706: step: 912/464, loss: 0.28336161375045776 2023-01-22 11:51:17.365804: step: 914/464, loss: 0.6156604290008545 2023-01-22 11:51:18.072895: step: 916/464, loss: 2.0127663612365723 2023-01-22 11:51:18.844522: step: 918/464, loss: 0.4022485017776489 2023-01-22 11:51:19.612791: step: 920/464, loss: 0.40107741951942444 2023-01-22 11:51:20.375739: step: 922/464, loss: 0.6638324856758118 2023-01-22 11:51:21.180181: step: 924/464, loss: 0.5233751535415649 2023-01-22 11:51:21.884649: step: 926/464, loss: 0.5563598871231079 2023-01-22 
11:51:22.602482: step: 928/464, loss: 1.0427556037902832
2023-01-22 11:51:23.258367: step: 930/464, loss: 2.201897144317627
==================================================
Loss: 1.302
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3008384287567167, 'r': 0.2934173669467787, 'f1': 0.29708156077032155}, 'combined': 0.2189022026728685, 'epoch': 3}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3061827404249836, 'r': 0.24652101949152982, 'f1': 0.273131781595547}, 'combined': 0.1696292117277608, 'epoch': 3}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2771944367689048, 'r': 0.27193458028372636, 'f1': 0.27453931764276585}, 'combined': 0.2022921287894064, 'epoch': 3}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2988706955680474, 'r': 0.24453056910112972, 'f1': 0.2689836260112427}, 'combined': 0.1670529887859297, 'epoch': 3}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2951776413690476, 'r': 0.2867760007228698, 'f1': 0.29091617397680924}, 'combined': 0.21435928608817523, 'epoch': 3}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3139478423272764, 'r': 0.25531529074639175, 'f1': 0.28161207001127897}, 'combined': 0.17489591716489958, 'epoch': 3}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24509803921568626, 'r': 0.35714285714285715, 'f1': 0.29069767441860467}, 'combined': 0.1937984496124031, 'epoch': 3}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23684210526315788, 'r': 0.391304347826087, 'f1': 0.2950819672131147}, 'combined': 0.14754098360655735, 'epoch': 3}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35833333333333334, 'r': 0.1853448275862069, 'f1': 0.24431818181818182}, 'combined': 0.16287878787878787, 'epoch': 3}
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2771944367689048, 'r': 0.27193458028372636, 'f1': 0.27453931764276585}, 'combined': 0.2022921287894064, 'epoch': 3}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2988706955680474, 'r': 0.24453056910112972, 'f1': 0.2689836260112427}, 'combined': 0.1670529887859297, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23684210526315788, 'r': 0.391304347826087, 'f1': 0.2950819672131147}, 'combined': 0.14754098360655735, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2951776413690476, 'r': 0.2867760007228698, 'f1': 0.29091617397680924}, 'combined': 0.21435928608817523, 'epoch': 3}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p':
0.3139478423272764, 'r': 0.25531529074639175, 'f1': 0.28161207001127897}, 'combined': 0.17489591716489958, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35833333333333334, 'r': 0.1853448275862069, 'f1': 0.24431818181818182}, 'combined': 0.16287878787878787, 'epoch': 3} ****************************** Epoch: 4 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:54:23.194474: step: 2/464, loss: 0.40228208899497986 2023-01-22 11:54:23.944303: step: 4/464, loss: 0.761715829372406 2023-01-22 11:54:24.692739: step: 6/464, loss: 0.7494341731071472 2023-01-22 11:54:25.468813: step: 8/464, loss: 0.24450060725212097 2023-01-22 11:54:26.150967: step: 10/464, loss: 0.5414758920669556 2023-01-22 11:54:26.958852: step: 12/464, loss: 1.0487775802612305 2023-01-22 11:54:27.650054: step: 14/464, loss: 0.7465416193008423 2023-01-22 11:54:28.438286: step: 16/464, loss: 0.2710585296154022 2023-01-22 11:54:29.191904: step: 18/464, loss: 0.36387255787849426 2023-01-22 11:54:29.872163: step: 20/464, loss: 1.130811333656311 2023-01-22 11:54:30.569707: step: 22/464, loss: 0.6743142604827881 2023-01-22 11:54:31.326845: step: 24/464, loss: 1.241051197052002 2023-01-22 11:54:32.051200: step: 26/464, loss: 0.5547807216644287 2023-01-22 11:54:32.907319: step: 28/464, loss: 0.6198810935020447 2023-01-22 11:54:33.603664: step: 30/464, loss: 1.1782160997390747 2023-01-22 11:54:34.396737: step: 32/464, loss: 0.813835620880127 2023-01-22 11:54:35.146362: step: 34/464, loss: 0.641745388507843 2023-01-22 11:54:35.897075: step: 36/464, loss: 0.4335750937461853 2023-01-22 11:54:36.672273: step: 38/464, loss: 0.7220770120620728 2023-01-22 11:54:37.388686: step: 40/464, loss: 0.1818157434463501 2023-01-22 11:54:38.160643: step: 42/464, loss: 0.573371171951294 2023-01-22 
11:54:39.036147: step: 44/464, loss: 2.2441868782043457 2023-01-22 11:54:39.781760: step: 46/464, loss: 0.2036203145980835 2023-01-22 11:54:40.584473: step: 48/464, loss: 0.3498281240463257 2023-01-22 11:54:41.374372: step: 50/464, loss: 3.470280647277832 2023-01-22 11:54:42.161434: step: 52/464, loss: 2.04329776763916 2023-01-22 11:54:42.919200: step: 54/464, loss: 0.6625763773918152 2023-01-22 11:54:43.667147: step: 56/464, loss: 1.1685752868652344 2023-01-22 11:54:44.406373: step: 58/464, loss: 2.4842872619628906 2023-01-22 11:54:45.188477: step: 60/464, loss: 1.0792351961135864 2023-01-22 11:54:45.900785: step: 62/464, loss: 0.49406152963638306 2023-01-22 11:54:46.610297: step: 64/464, loss: 0.2838814854621887 2023-01-22 11:54:47.370445: step: 66/464, loss: 0.40156522393226624 2023-01-22 11:54:48.148562: step: 68/464, loss: 1.2194373607635498 2023-01-22 11:54:48.954932: step: 70/464, loss: 0.8869839310646057 2023-01-22 11:54:49.707221: step: 72/464, loss: 0.5222635865211487 2023-01-22 11:54:50.431771: step: 74/464, loss: 0.6052136421203613 2023-01-22 11:54:51.213636: step: 76/464, loss: 0.8005256056785583 2023-01-22 11:54:51.960481: step: 78/464, loss: 0.8984832763671875 2023-01-22 11:54:52.750585: step: 80/464, loss: 0.41829895973205566 2023-01-22 11:54:53.531012: step: 82/464, loss: 0.3826153874397278 2023-01-22 11:54:54.241317: step: 84/464, loss: 0.2728906273841858 2023-01-22 11:54:55.033971: step: 86/464, loss: 0.673712968826294 2023-01-22 11:54:55.820202: step: 88/464, loss: 0.44470030069351196 2023-01-22 11:54:56.647734: step: 90/464, loss: 2.178097724914551 2023-01-22 11:54:57.333722: step: 92/464, loss: 0.7721235752105713 2023-01-22 11:54:58.104868: step: 94/464, loss: 0.8557058572769165 2023-01-22 11:54:58.899981: step: 96/464, loss: 0.9540102481842041 2023-01-22 11:54:59.668785: step: 98/464, loss: 4.250099182128906 2023-01-22 11:55:00.410580: step: 100/464, loss: 0.47582846879959106 2023-01-22 11:55:01.203663: step: 102/464, loss: 1.3785321712493896 
2023-01-22 11:55:02.051255: step: 104/464, loss: 0.26402074098587036 2023-01-22 11:55:02.808572: step: 106/464, loss: 0.7524766325950623 2023-01-22 11:55:03.650620: step: 108/464, loss: 1.1828628778457642 2023-01-22 11:55:04.344200: step: 110/464, loss: 0.8786439299583435 2023-01-22 11:55:05.090769: step: 112/464, loss: 1.3983309268951416 2023-01-22 11:55:05.871671: step: 114/464, loss: 0.604883074760437 2023-01-22 11:55:06.605231: step: 116/464, loss: 1.3533883094787598 2023-01-22 11:55:07.356936: step: 118/464, loss: 0.4021783173084259 2023-01-22 11:55:08.053033: step: 120/464, loss: 0.9296446442604065 2023-01-22 11:55:08.838861: step: 122/464, loss: 0.7201257944107056 2023-01-22 11:55:09.590206: step: 124/464, loss: 0.46312540769577026 2023-01-22 11:55:10.405006: step: 126/464, loss: 1.0364785194396973 2023-01-22 11:55:11.281924: step: 128/464, loss: 0.2913583517074585 2023-01-22 11:55:12.072957: step: 130/464, loss: 0.6095572710037231 2023-01-22 11:55:12.844260: step: 132/464, loss: 1.744209885597229 2023-01-22 11:55:13.591221: step: 134/464, loss: 0.7447730898857117 2023-01-22 11:55:14.306303: step: 136/464, loss: 1.452962040901184 2023-01-22 11:55:15.074692: step: 138/464, loss: 1.8778823614120483 2023-01-22 11:55:15.769077: step: 140/464, loss: 0.7671042084693909 2023-01-22 11:55:16.591774: step: 142/464, loss: 1.4017772674560547 2023-01-22 11:55:17.359919: step: 144/464, loss: 0.5678072571754456 2023-01-22 11:55:18.149333: step: 146/464, loss: 3.1386330127716064 2023-01-22 11:55:18.932104: step: 148/464, loss: 0.49076610803604126 2023-01-22 11:55:19.667205: step: 150/464, loss: 1.209444284439087 2023-01-22 11:55:20.563110: step: 152/464, loss: 1.068770408630371 2023-01-22 11:55:21.339379: step: 154/464, loss: 1.9193202257156372 2023-01-22 11:55:22.109788: step: 156/464, loss: 0.46384406089782715 2023-01-22 11:55:22.885961: step: 158/464, loss: 1.053743839263916 2023-01-22 11:55:23.639019: step: 160/464, loss: 0.545936107635498 2023-01-22 11:55:24.365712: 
step: 162/464, loss: 0.8476034998893738 2023-01-22 11:55:25.153536: step: 164/464, loss: 0.8386257886886597 2023-01-22 11:55:25.920282: step: 166/464, loss: 0.4921552538871765 2023-01-22 11:55:26.661609: step: 168/464, loss: 0.7412563562393188 2023-01-22 11:55:27.460534: step: 170/464, loss: 0.6649913191795349 2023-01-22 11:55:28.222691: step: 172/464, loss: 0.16923581063747406 2023-01-22 11:55:29.065266: step: 174/464, loss: 0.6418856978416443 2023-01-22 11:55:29.837940: step: 176/464, loss: 0.5956724882125854 2023-01-22 11:55:30.595093: step: 178/464, loss: 1.5731823444366455 2023-01-22 11:55:31.359536: step: 180/464, loss: 0.6780312061309814 2023-01-22 11:55:32.114342: step: 182/464, loss: 0.3298395574092865 2023-01-22 11:55:32.926912: step: 184/464, loss: 0.8208402395248413 2023-01-22 11:55:33.622153: step: 186/464, loss: 1.238532304763794 2023-01-22 11:55:34.335372: step: 188/464, loss: 0.22493049502372742 2023-01-22 11:55:35.070082: step: 190/464, loss: 1.5633842945098877 2023-01-22 11:55:35.794629: step: 192/464, loss: 0.8262635469436646 2023-01-22 11:55:36.548011: step: 194/464, loss: 0.917330801486969 2023-01-22 11:55:37.289745: step: 196/464, loss: 0.9680485129356384 2023-01-22 11:55:38.059790: step: 198/464, loss: 0.3854012191295624 2023-01-22 11:55:38.904227: step: 200/464, loss: 1.4094256162643433 2023-01-22 11:55:39.628379: step: 202/464, loss: 0.3496444821357727 2023-01-22 11:55:40.340688: step: 204/464, loss: 1.3187754154205322 2023-01-22 11:55:41.059219: step: 206/464, loss: 1.3401422500610352 2023-01-22 11:55:41.811558: step: 208/464, loss: 1.0585306882858276 2023-01-22 11:55:42.477501: step: 210/464, loss: 0.3400452733039856 2023-01-22 11:55:43.268030: step: 212/464, loss: 0.5716167688369751 2023-01-22 11:55:44.041444: step: 214/464, loss: 0.9066051840782166 2023-01-22 11:55:44.750755: step: 216/464, loss: 0.510635495185852 2023-01-22 11:55:45.530544: step: 218/464, loss: 0.5335964560508728 2023-01-22 11:55:46.323779: step: 220/464, loss: 
0.7558530569076538 2023-01-22 11:55:47.033171: step: 222/464, loss: 0.4452095031738281 2023-01-22 11:55:47.702051: step: 224/464, loss: 0.32481619715690613 2023-01-22 11:55:48.430684: step: 226/464, loss: 1.4661684036254883 2023-01-22 11:55:49.302205: step: 228/464, loss: 0.3202166259288788 2023-01-22 11:55:50.030050: step: 230/464, loss: 0.694557785987854 2023-01-22 11:55:50.791662: step: 232/464, loss: 1.280537486076355 2023-01-22 11:55:51.482054: step: 234/464, loss: 1.118554711341858 2023-01-22 11:55:52.221134: step: 236/464, loss: 0.620063066482544 2023-01-22 11:55:52.952113: step: 238/464, loss: 0.40609267354011536 2023-01-22 11:55:53.723662: step: 240/464, loss: 3.694124460220337 2023-01-22 11:55:54.451279: step: 242/464, loss: 0.3537047505378723 2023-01-22 11:55:55.195850: step: 244/464, loss: 0.5573844909667969 2023-01-22 11:55:55.960573: step: 246/464, loss: 0.921636164188385 2023-01-22 11:55:56.664283: step: 248/464, loss: 1.0031071901321411 2023-01-22 11:55:57.434834: step: 250/464, loss: 0.6408367156982422 2023-01-22 11:55:58.257026: step: 252/464, loss: 0.6051366329193115 2023-01-22 11:55:59.047373: step: 254/464, loss: 0.29460883140563965 2023-01-22 11:55:59.741615: step: 256/464, loss: 0.6332646608352661 2023-01-22 11:56:00.526240: step: 258/464, loss: 3.4274678230285645 2023-01-22 11:56:01.273747: step: 260/464, loss: 0.6829164028167725 2023-01-22 11:56:02.027679: step: 262/464, loss: 0.6332787275314331 2023-01-22 11:56:02.852319: step: 264/464, loss: 0.2608512043952942 2023-01-22 11:56:03.583679: step: 266/464, loss: 1.234724760055542 2023-01-22 11:56:04.352754: step: 268/464, loss: 0.23023658990859985 2023-01-22 11:56:05.101618: step: 270/464, loss: 2.5275135040283203 2023-01-22 11:56:05.878478: step: 272/464, loss: 0.7357670068740845 2023-01-22 11:56:06.693890: step: 274/464, loss: 1.2567230463027954 2023-01-22 11:56:07.414543: step: 276/464, loss: 0.9408270716667175 2023-01-22 11:56:08.147953: step: 278/464, loss: 0.8275658488273621 2023-01-22 
11:56:08.893953: step: 280/464, loss: 1.6424264907836914 2023-01-22 11:56:09.639569: step: 282/464, loss: 0.5935283899307251 2023-01-22 11:56:10.319027: step: 284/464, loss: 0.923167884349823 2023-01-22 11:56:11.027496: step: 286/464, loss: 4.628636360168457 2023-01-22 11:56:11.874365: step: 288/464, loss: 0.6874059438705444 2023-01-22 11:56:12.627198: step: 290/464, loss: 3.098637104034424 2023-01-22 11:56:13.397260: step: 292/464, loss: 0.7319610118865967 2023-01-22 11:56:14.109078: step: 294/464, loss: 2.2061524391174316 2023-01-22 11:56:14.900247: step: 296/464, loss: 0.7146373987197876 2023-01-22 11:56:15.599211: step: 298/464, loss: 0.8566915392875671 2023-01-22 11:56:16.325252: step: 300/464, loss: 2.7071781158447266 2023-01-22 11:56:17.041526: step: 302/464, loss: 0.4175330400466919 2023-01-22 11:56:17.759885: step: 304/464, loss: 2.2370574474334717 2023-01-22 11:56:18.585959: step: 306/464, loss: 0.8507087230682373 2023-01-22 11:56:19.359070: step: 308/464, loss: 1.7574455738067627 2023-01-22 11:56:20.133555: step: 310/464, loss: 1.437440037727356 2023-01-22 11:56:20.812841: step: 312/464, loss: 1.0773636102676392 2023-01-22 11:56:21.468944: step: 314/464, loss: 4.062402248382568 2023-01-22 11:56:22.198685: step: 316/464, loss: 1.0066360235214233 2023-01-22 11:56:22.970447: step: 318/464, loss: 0.920922040939331 2023-01-22 11:56:23.674718: step: 320/464, loss: 1.5979911088943481 2023-01-22 11:56:24.569601: step: 322/464, loss: 3.2422804832458496 2023-01-22 11:56:25.316554: step: 324/464, loss: 0.5224069356918335 2023-01-22 11:56:26.083497: step: 326/464, loss: 1.6979987621307373 2023-01-22 11:56:26.810676: step: 328/464, loss: 0.4662569761276245 2023-01-22 11:56:27.618796: step: 330/464, loss: 0.4052315950393677 2023-01-22 11:56:28.321822: step: 332/464, loss: 0.2515992224216461 2023-01-22 11:56:29.027801: step: 334/464, loss: 0.7200693488121033 2023-01-22 11:56:29.794919: step: 336/464, loss: 1.927569031715393 2023-01-22 11:56:30.583760: step: 338/464, 
loss: 1.7271506786346436 2023-01-22 11:56:31.313336: step: 340/464, loss: 0.10802609473466873 2023-01-22 11:56:32.106390: step: 342/464, loss: 0.7540997266769409 2023-01-22 11:56:32.948595: step: 344/464, loss: 2.4743459224700928 2023-01-22 11:56:33.642026: step: 346/464, loss: 1.7580007314682007 2023-01-22 11:56:34.456357: step: 348/464, loss: 0.2616334557533264 2023-01-22 11:56:35.157295: step: 350/464, loss: 0.4886997938156128 2023-01-22 11:56:35.928510: step: 352/464, loss: 0.6108362078666687 2023-01-22 11:56:36.665247: step: 354/464, loss: 1.4357883930206299 2023-01-22 11:56:37.427577: step: 356/464, loss: 0.7157084345817566 2023-01-22 11:56:38.118056: step: 358/464, loss: 0.188466876745224 2023-01-22 11:56:38.872723: step: 360/464, loss: 0.6241902112960815 2023-01-22 11:56:39.612980: step: 362/464, loss: 0.7048114538192749 2023-01-22 11:56:40.319354: step: 364/464, loss: 0.7779271006584167 2023-01-22 11:56:41.069536: step: 366/464, loss: 0.2853296101093292 2023-01-22 11:56:41.844320: step: 368/464, loss: 1.675212025642395 2023-01-22 11:56:42.628196: step: 370/464, loss: 0.7650219202041626 2023-01-22 11:56:43.383971: step: 372/464, loss: 0.40282338857650757 2023-01-22 11:56:44.143594: step: 374/464, loss: 0.49744731187820435 2023-01-22 11:56:44.941960: step: 376/464, loss: 2.2720847129821777 2023-01-22 11:56:45.691700: step: 378/464, loss: 0.8903071284294128 2023-01-22 11:56:46.409471: step: 380/464, loss: 1.1935958862304688 2023-01-22 11:56:47.184150: step: 382/464, loss: 1.7125489711761475 2023-01-22 11:56:47.961165: step: 384/464, loss: 1.872301459312439 2023-01-22 11:56:48.715755: step: 386/464, loss: 0.4071800708770752 2023-01-22 11:56:49.489665: step: 388/464, loss: 0.4406774938106537 2023-01-22 11:56:50.254656: step: 390/464, loss: 0.8730248212814331 2023-01-22 11:56:51.030275: step: 392/464, loss: 0.19085253775119781 2023-01-22 11:56:51.785587: step: 394/464, loss: 0.2352760285139084 2023-01-22 11:56:52.526410: step: 396/464, loss: 2.909917116165161 
2023-01-22 11:56:53.319872: step: 398/464, loss: 4.20819616317749 2023-01-22 11:56:54.091805: step: 400/464, loss: 1.4049768447875977 2023-01-22 11:56:54.889746: step: 402/464, loss: 0.6060394644737244 2023-01-22 11:56:55.586788: step: 404/464, loss: 0.3055975139141083 2023-01-22 11:56:56.339270: step: 406/464, loss: 0.778168261051178 2023-01-22 11:56:57.068139: step: 408/464, loss: 0.9540801644325256 2023-01-22 11:56:57.878025: step: 410/464, loss: 0.276130735874176 2023-01-22 11:56:58.670892: step: 412/464, loss: 0.6118130087852478 2023-01-22 11:56:59.367777: step: 414/464, loss: 0.8638632893562317 2023-01-22 11:57:00.108992: step: 416/464, loss: 0.24808137118816376 2023-01-22 11:57:00.905659: step: 418/464, loss: 1.0742295980453491 2023-01-22 11:57:01.731265: step: 420/464, loss: 0.5043665170669556 2023-01-22 11:57:02.474419: step: 422/464, loss: 0.34764429926872253 2023-01-22 11:57:03.258833: step: 424/464, loss: 1.4734704494476318 2023-01-22 11:57:03.994736: step: 426/464, loss: 0.7792679071426392 2023-01-22 11:57:04.906120: step: 428/464, loss: 1.2594319581985474 2023-01-22 11:57:05.732864: step: 430/464, loss: 0.8206206560134888 2023-01-22 11:57:06.451705: step: 432/464, loss: 0.9869316816329956 2023-01-22 11:57:07.169170: step: 434/464, loss: 2.3212811946868896 2023-01-22 11:57:07.827293: step: 436/464, loss: 0.2014824002981186 2023-01-22 11:57:08.508324: step: 438/464, loss: 0.7801494598388672 2023-01-22 11:57:09.331766: step: 440/464, loss: 0.9981138110160828 2023-01-22 11:57:10.116830: step: 442/464, loss: 0.5376265645027161 2023-01-22 11:57:11.002487: step: 444/464, loss: 0.3107282519340515 2023-01-22 11:57:11.812854: step: 446/464, loss: 0.45178577303886414 2023-01-22 11:57:12.650907: step: 448/464, loss: 0.47116056084632874 2023-01-22 11:57:13.357373: step: 450/464, loss: 1.1121314764022827 2023-01-22 11:57:14.061057: step: 452/464, loss: 1.8235584497451782 2023-01-22 11:57:14.755531: step: 454/464, loss: 1.2838335037231445 2023-01-22 11:57:15.582440: 
step: 456/464, loss: 0.43410632014274597 2023-01-22 11:57:16.323997: step: 458/464, loss: 1.8817094564437866 2023-01-22 11:57:17.101479: step: 460/464, loss: 0.8303155899047852 2023-01-22 11:57:17.859494: step: 462/464, loss: 0.7997901439666748 2023-01-22 11:57:18.605065: step: 464/464, loss: 0.4771265685558319 2023-01-22 11:57:19.398161: step: 466/464, loss: 0.8618097901344299 2023-01-22 11:57:20.139442: step: 468/464, loss: 0.3874833285808563 2023-01-22 11:57:20.967629: step: 470/464, loss: 0.8758236169815063 2023-01-22 11:57:21.699803: step: 472/464, loss: 0.2645411193370819 2023-01-22 11:57:22.469112: step: 474/464, loss: 0.423144668340683 2023-01-22 11:57:23.172034: step: 476/464, loss: 0.4718470573425293 2023-01-22 11:57:23.907277: step: 478/464, loss: 0.7281615734100342 2023-01-22 11:57:24.670680: step: 480/464, loss: 1.2971644401550293 2023-01-22 11:57:25.393857: step: 482/464, loss: 0.3497962951660156 2023-01-22 11:57:26.192199: step: 484/464, loss: 3.243353843688965 2023-01-22 11:57:26.909233: step: 486/464, loss: 1.0315594673156738 2023-01-22 11:57:27.586558: step: 488/464, loss: 2.403261184692383 2023-01-22 11:57:28.319886: step: 490/464, loss: 0.3512910008430481 2023-01-22 11:57:29.067326: step: 492/464, loss: 4.593682289123535 2023-01-22 11:57:29.824328: step: 494/464, loss: 0.6878775954246521 2023-01-22 11:57:30.567498: step: 496/464, loss: 0.7801576852798462 2023-01-22 11:57:31.321179: step: 498/464, loss: 0.2250710427761078 2023-01-22 11:57:32.051878: step: 500/464, loss: 1.9546105861663818 2023-01-22 11:57:32.880562: step: 502/464, loss: 0.5993082523345947 2023-01-22 11:57:33.640424: step: 504/464, loss: 0.9544249773025513 2023-01-22 11:57:34.328164: step: 506/464, loss: 0.6948167085647583 2023-01-22 11:57:35.072175: step: 508/464, loss: 1.0587568283081055 2023-01-22 11:57:35.817247: step: 510/464, loss: 1.2667818069458008 2023-01-22 11:57:36.661298: step: 512/464, loss: 0.41677990555763245 2023-01-22 11:57:37.367412: step: 514/464, loss: 
1.2263845205307007 2023-01-22 11:57:38.154385: step: 516/464, loss: 1.315137505531311 2023-01-22 11:57:38.922087: step: 518/464, loss: 1.9078733921051025 2023-01-22 11:57:39.679576: step: 520/464, loss: 1.1094090938568115 2023-01-22 11:57:40.375748: step: 522/464, loss: 0.9098461866378784 2023-01-22 11:57:41.199379: step: 524/464, loss: 1.0443501472473145 2023-01-22 11:57:41.942636: step: 526/464, loss: 1.2700201272964478 2023-01-22 11:57:42.812437: step: 528/464, loss: 0.7448489665985107 2023-01-22 11:57:43.522983: step: 530/464, loss: 0.4656221270561218 2023-01-22 11:57:44.234577: step: 532/464, loss: 0.850081205368042 2023-01-22 11:57:45.064623: step: 534/464, loss: 0.5706437230110168 2023-01-22 11:57:45.828422: step: 536/464, loss: 0.5199675559997559 2023-01-22 11:57:46.629118: step: 538/464, loss: 1.1948305368423462 2023-01-22 11:57:47.361605: step: 540/464, loss: 0.8280027508735657 2023-01-22 11:57:48.103424: step: 542/464, loss: 0.7765970826148987 2023-01-22 11:57:48.845809: step: 544/464, loss: 1.8322759866714478 2023-01-22 11:57:49.594843: step: 546/464, loss: 0.9865900278091431 2023-01-22 11:57:50.306602: step: 548/464, loss: 1.1756948232650757 2023-01-22 11:57:51.042789: step: 550/464, loss: 0.7664552927017212 2023-01-22 11:57:51.810425: step: 552/464, loss: 0.8237978219985962 2023-01-22 11:57:52.488399: step: 554/464, loss: 1.9098732471466064 2023-01-22 11:57:53.259705: step: 556/464, loss: 0.3626733422279358 2023-01-22 11:57:53.998557: step: 558/464, loss: 0.9509762525558472 2023-01-22 11:57:54.775141: step: 560/464, loss: 2.97688364982605 2023-01-22 11:57:55.629797: step: 562/464, loss: 0.4221142530441284 2023-01-22 11:57:56.306727: step: 564/464, loss: 0.9137882590293884 2023-01-22 11:57:57.052976: step: 566/464, loss: 0.972124457359314 2023-01-22 11:57:57.895558: step: 568/464, loss: 0.9981567859649658 2023-01-22 11:57:58.597115: step: 570/464, loss: 0.873939037322998 2023-01-22 11:57:59.325740: step: 572/464, loss: 0.6504555344581604 2023-01-22 
11:58:00.067763: step: 574/464, loss: 0.5079010128974915 2023-01-22 11:58:00.813218: step: 576/464, loss: 0.9543384313583374 2023-01-22 11:58:01.547369: step: 578/464, loss: 0.23754742741584778 2023-01-22 11:58:02.336027: step: 580/464, loss: 0.7818069458007812 2023-01-22 11:58:03.092371: step: 582/464, loss: 0.7598432898521423 2023-01-22 11:58:03.788219: step: 584/464, loss: 0.978622317314148 2023-01-22 11:58:04.501262: step: 586/464, loss: 0.7816836833953857 2023-01-22 11:58:05.308398: step: 588/464, loss: 1.2809606790542603 2023-01-22 11:58:06.054970: step: 590/464, loss: 0.5787965059280396 2023-01-22 11:58:06.822456: step: 592/464, loss: 1.170480489730835 2023-01-22 11:58:07.599467: step: 594/464, loss: 0.41068509221076965 2023-01-22 11:58:08.309846: step: 596/464, loss: 1.8384788036346436 2023-01-22 11:58:09.014517: step: 598/464, loss: 0.7644957304000854 2023-01-22 11:58:09.720412: step: 600/464, loss: 0.5268754363059998 2023-01-22 11:58:10.462879: step: 602/464, loss: 0.7573223114013672 2023-01-22 11:58:11.212828: step: 604/464, loss: 0.5257716774940491 2023-01-22 11:58:12.021814: step: 606/464, loss: 0.8698573112487793 2023-01-22 11:58:12.688021: step: 608/464, loss: 0.6615485548973083 2023-01-22 11:58:13.447616: step: 610/464, loss: 0.40329596400260925 2023-01-22 11:58:14.186003: step: 612/464, loss: 0.7785581350326538 2023-01-22 11:58:14.905137: step: 614/464, loss: 0.8779846429824829 2023-01-22 11:58:15.659191: step: 616/464, loss: 0.89288729429245 2023-01-22 11:58:16.389234: step: 618/464, loss: 0.8979387879371643 2023-01-22 11:58:17.120899: step: 620/464, loss: 0.37278124690055847 2023-01-22 11:58:17.876916: step: 622/464, loss: 0.32420939207077026 2023-01-22 11:58:18.600979: step: 624/464, loss: 0.5986490845680237 2023-01-22 11:58:19.359187: step: 626/464, loss: 1.7960107326507568 2023-01-22 11:58:20.101372: step: 628/464, loss: 0.23462745547294617 2023-01-22 11:58:20.867134: step: 630/464, loss: 0.7956054210662842 2023-01-22 11:58:21.646102: step: 
632/464, loss: 2.865629196166992 2023-01-22 11:58:22.392041: step: 634/464, loss: 0.3457973301410675 2023-01-22 11:58:23.199629: step: 636/464, loss: 0.4045425057411194 2023-01-22 11:58:24.010688: step: 638/464, loss: 0.5609143376350403 2023-01-22 11:58:24.832951: step: 640/464, loss: 1.1414687633514404 2023-01-22 11:58:25.564006: step: 642/464, loss: 1.053191900253296 2023-01-22 11:58:26.419651: step: 644/464, loss: 0.66423499584198 2023-01-22 11:58:27.211074: step: 646/464, loss: 1.7464537620544434 2023-01-22 11:58:27.919072: step: 648/464, loss: 0.8280461430549622 2023-01-22 11:58:28.639422: step: 650/464, loss: 1.8533332347869873 2023-01-22 11:58:29.406307: step: 652/464, loss: 0.9674221277236938 2023-01-22 11:58:30.176671: step: 654/464, loss: 0.37536531686782837 2023-01-22 11:58:30.955729: step: 656/464, loss: 2.653020143508911 2023-01-22 11:58:31.742961: step: 658/464, loss: 4.386117458343506 2023-01-22 11:58:32.520193: step: 660/464, loss: 1.0777931213378906 2023-01-22 11:58:33.379368: step: 662/464, loss: 1.7315013408660889 2023-01-22 11:58:34.071058: step: 664/464, loss: 1.4899041652679443 2023-01-22 11:58:34.869601: step: 666/464, loss: 1.2888059616088867 2023-01-22 11:58:35.674106: step: 668/464, loss: 0.9007691144943237 2023-01-22 11:58:36.416085: step: 670/464, loss: 8.064138412475586 2023-01-22 11:58:37.205928: step: 672/464, loss: 0.3794676959514618 2023-01-22 11:58:37.966355: step: 674/464, loss: 0.6640487909317017 2023-01-22 11:58:38.732248: step: 676/464, loss: 1.3708484172821045 2023-01-22 11:58:39.476041: step: 678/464, loss: 0.2608605921268463 2023-01-22 11:58:40.169595: step: 680/464, loss: 0.4714253544807434 2023-01-22 11:58:40.935530: step: 682/464, loss: 0.598812997341156 2023-01-22 11:58:41.692701: step: 684/464, loss: 0.558146595954895 2023-01-22 11:58:42.529454: step: 686/464, loss: 0.0715576708316803 2023-01-22 11:58:43.242535: step: 688/464, loss: 0.5410977005958557 2023-01-22 11:58:43.959343: step: 690/464, loss: 0.7585949897766113 
2023-01-22 11:58:44.779478: step: 692/464, loss: 0.643237292766571 2023-01-22 11:58:45.594515: step: 694/464, loss: 0.5445526242256165 2023-01-22 11:58:46.359292: step: 696/464, loss: 0.9776055812835693 2023-01-22 11:58:47.104491: step: 698/464, loss: 0.6344531774520874 2023-01-22 11:58:47.847161: step: 700/464, loss: 0.9945721626281738 2023-01-22 11:58:48.596402: step: 702/464, loss: 0.7680280804634094 2023-01-22 11:58:49.410867: step: 704/464, loss: 0.3382861614227295 2023-01-22 11:58:50.149579: step: 706/464, loss: 0.6660884618759155 2023-01-22 11:58:50.958451: step: 708/464, loss: 1.070008397102356 2023-01-22 11:58:51.739454: step: 710/464, loss: 0.7828474044799805 2023-01-22 11:58:52.528502: step: 712/464, loss: 1.4197325706481934 2023-01-22 11:58:53.239043: step: 714/464, loss: 0.23748788237571716 2023-01-22 11:58:54.024887: step: 716/464, loss: 8.611396789550781 2023-01-22 11:58:54.814929: step: 718/464, loss: 1.331899881362915 2023-01-22 11:58:55.598183: step: 720/464, loss: 2.305028200149536 2023-01-22 11:58:56.462793: step: 722/464, loss: 1.1942238807678223 2023-01-22 11:58:57.240480: step: 724/464, loss: 2.60715913772583 2023-01-22 11:58:57.962161: step: 726/464, loss: 0.2433922439813614 2023-01-22 11:58:58.748508: step: 728/464, loss: 0.3391481637954712 2023-01-22 11:58:59.506166: step: 730/464, loss: 3.575850486755371 2023-01-22 11:59:00.161774: step: 732/464, loss: 1.3523809909820557 2023-01-22 11:59:00.883315: step: 734/464, loss: 1.4300650358200073 2023-01-22 11:59:01.629254: step: 736/464, loss: 0.24131545424461365 2023-01-22 11:59:02.418508: step: 738/464, loss: 1.5541698932647705 2023-01-22 11:59:03.129292: step: 740/464, loss: 0.812211275100708 2023-01-22 11:59:03.968228: step: 742/464, loss: 0.7140324115753174 2023-01-22 11:59:04.685379: step: 744/464, loss: 0.3727984130382538 2023-01-22 11:59:05.444965: step: 746/464, loss: 0.7878866195678711 2023-01-22 11:59:06.212155: step: 748/464, loss: 0.2013627290725708 2023-01-22 11:59:06.886573: step: 
750/464, loss: 0.2744179964065552 2023-01-22 11:59:07.672322: step: 752/464, loss: 2.1657464504241943 2023-01-22 11:59:08.398540: step: 754/464, loss: 0.2591649293899536 2023-01-22 11:59:09.197047: step: 756/464, loss: 5.291714668273926 2023-01-22 11:59:10.021647: step: 758/464, loss: 0.6014288663864136 2023-01-22 11:59:10.768602: step: 760/464, loss: 1.4996731281280518 2023-01-22 11:59:11.493032: step: 762/464, loss: 1.117444634437561 2023-01-22 11:59:12.328327: step: 764/464, loss: 0.1649799793958664 2023-01-22 11:59:13.053084: step: 766/464, loss: 1.094132661819458 2023-01-22 11:59:13.814350: step: 768/464, loss: 0.41113656759262085 2023-01-22 11:59:14.576880: step: 770/464, loss: 1.1057980060577393 2023-01-22 11:59:15.405015: step: 772/464, loss: 2.4089059829711914 2023-01-22 11:59:16.222532: step: 774/464, loss: 0.7369115948677063 2023-01-22 11:59:16.911927: step: 776/464, loss: 1.485062599182129 2023-01-22 11:59:17.616387: step: 778/464, loss: 1.239599585533142 2023-01-22 11:59:18.375746: step: 780/464, loss: 1.2481592893600464 2023-01-22 11:59:19.108013: step: 782/464, loss: 0.5724455118179321 2023-01-22 11:59:19.875917: step: 784/464, loss: 0.562556803226471 2023-01-22 11:59:20.684867: step: 786/464, loss: 0.19182422757148743 2023-01-22 11:59:21.440409: step: 788/464, loss: 0.9646680951118469 2023-01-22 11:59:22.191289: step: 790/464, loss: 0.4203810691833496 2023-01-22 11:59:22.930185: step: 792/464, loss: 0.7610728740692139 2023-01-22 11:59:23.642844: step: 794/464, loss: 1.6422280073165894 2023-01-22 11:59:24.296080: step: 796/464, loss: 1.0750470161437988 2023-01-22 11:59:24.963337: step: 798/464, loss: 0.2180919051170349 2023-01-22 11:59:25.748982: step: 800/464, loss: 3.362419366836548 2023-01-22 11:59:26.546212: step: 802/464, loss: 0.798356294631958 2023-01-22 11:59:27.281982: step: 804/464, loss: 0.351754367351532 2023-01-22 11:59:28.245062: step: 806/464, loss: 0.20373542606830597 2023-01-22 11:59:29.000501: step: 808/464, loss: 
0.25572940707206726 2023-01-22 11:59:29.733677: step: 810/464, loss: 0.6978384852409363 2023-01-22 11:59:30.391067: step: 812/464, loss: 0.8130373954772949 2023-01-22 11:59:31.187184: step: 814/464, loss: 1.1461539268493652 2023-01-22 11:59:31.950108: step: 816/464, loss: 0.3477003574371338 2023-01-22 11:59:32.719444: step: 818/464, loss: 0.8022814989089966 2023-01-22 11:59:33.579433: step: 820/464, loss: 0.9525044560432434 2023-01-22 11:59:34.285279: step: 822/464, loss: 0.5294411182403564 2023-01-22 11:59:35.093408: step: 824/464, loss: 0.6476122140884399 2023-01-22 11:59:35.909933: step: 826/464, loss: 0.4958522617816925 2023-01-22 11:59:36.721976: step: 828/464, loss: 1.3305891752243042 2023-01-22 11:59:37.479769: step: 830/464, loss: 2.5851356983184814 2023-01-22 11:59:38.274983: step: 832/464, loss: 1.622355341911316 2023-01-22 11:59:39.033413: step: 834/464, loss: 0.9364479184150696 2023-01-22 11:59:39.777964: step: 836/464, loss: 0.2947760224342346 2023-01-22 11:59:40.523944: step: 838/464, loss: 0.1488008052110672 2023-01-22 11:59:41.207379: step: 840/464, loss: 0.41798877716064453 2023-01-22 11:59:41.932903: step: 842/464, loss: 0.5885865688323975 2023-01-22 11:59:42.653436: step: 844/464, loss: 1.004331350326538 2023-01-22 11:59:43.322241: step: 846/464, loss: 0.27680039405822754 2023-01-22 11:59:44.089242: step: 848/464, loss: 0.32235682010650635 2023-01-22 11:59:44.803407: step: 850/464, loss: 1.527274250984192 2023-01-22 11:59:45.565010: step: 852/464, loss: 0.4541100263595581 2023-01-22 11:59:46.277034: step: 854/464, loss: 1.2514349222183228 2023-01-22 11:59:46.956399: step: 856/464, loss: 1.489627718925476 2023-01-22 11:59:47.865609: step: 858/464, loss: 0.9419457316398621 2023-01-22 11:59:48.624430: step: 860/464, loss: 0.7608920931816101 2023-01-22 11:59:49.362163: step: 862/464, loss: 0.3095041513442993 2023-01-22 11:59:50.112485: step: 864/464, loss: 2.6239664554595947 2023-01-22 11:59:50.867036: step: 866/464, loss: 5.607526779174805 
2023-01-22 11:59:51.724802: step: 868/464, loss: 1.5280218124389648 2023-01-22 11:59:52.498295: step: 870/464, loss: 0.5562810897827148 2023-01-22 11:59:53.237301: step: 872/464, loss: 0.493901789188385 2023-01-22 11:59:53.937195: step: 874/464, loss: 0.5550905466079712 2023-01-22 11:59:54.857199: step: 876/464, loss: 0.6465400457382202 2023-01-22 11:59:55.673375: step: 878/464, loss: 0.8182223439216614 2023-01-22 11:59:56.374450: step: 880/464, loss: 0.44080498814582825 2023-01-22 11:59:57.114778: step: 882/464, loss: 0.8962292671203613 2023-01-22 11:59:57.755086: step: 884/464, loss: 0.7163071632385254 2023-01-22 11:59:58.526851: step: 886/464, loss: 0.781548023223877 2023-01-22 11:59:59.345996: step: 888/464, loss: 0.6812397241592407 2023-01-22 12:00:00.141117: step: 890/464, loss: 1.0619182586669922 2023-01-22 12:00:00.874605: step: 892/464, loss: 0.78568434715271 2023-01-22 12:00:01.634853: step: 894/464, loss: 0.39763808250427246 2023-01-22 12:00:02.363205: step: 896/464, loss: 1.4507485628128052 2023-01-22 12:00:03.114078: step: 898/464, loss: 0.8367409706115723 2023-01-22 12:00:03.856789: step: 900/464, loss: 0.7433250546455383 2023-01-22 12:00:04.622172: step: 902/464, loss: 1.7983527183532715 2023-01-22 12:00:05.335866: step: 904/464, loss: 0.7389869689941406 2023-01-22 12:00:06.080540: step: 906/464, loss: 0.3523313105106354 2023-01-22 12:00:06.744855: step: 908/464, loss: 0.7650223970413208 2023-01-22 12:00:07.488039: step: 910/464, loss: 0.5269652009010315 2023-01-22 12:00:08.253919: step: 912/464, loss: 4.824460029602051 2023-01-22 12:00:09.031278: step: 914/464, loss: 2.659306526184082 2023-01-22 12:00:09.780853: step: 916/464, loss: 0.8726629018783569 2023-01-22 12:00:10.507478: step: 918/464, loss: 1.209802508354187 2023-01-22 12:00:11.267099: step: 920/464, loss: 1.4985301494598389 2023-01-22 12:00:12.042285: step: 922/464, loss: 0.36272984743118286 2023-01-22 12:00:12.750835: step: 924/464, loss: 0.9055819511413574 2023-01-22 12:00:13.483597: 
step: 926/464, loss: 0.9490545392036438 2023-01-22 12:00:14.313035: step: 928/464, loss: 1.9158504009246826 2023-01-22 12:00:14.956492: step: 930/464, loss: 0.1756402850151062 ================================================== Loss: 1.044 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28758196899664623, 'r': 0.3175384241004635, 'f1': 0.30181870013509404}, 'combined': 0.22239272641533245, 'epoch': 4} Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28839284190210124, 'r': 0.24446356628100965, 'f1': 0.2646174148930415}, 'combined': 0.1643413418809416, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28460783492462427, 'r': 0.3234916377985767, 'f1': 0.3028065597155416}, 'combined': 0.2231206229482938, 'epoch': 4} Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28410097979888116, 'r': 0.2478507064120803, 'f1': 0.2647406911596547}, 'combined': 0.16441790293073294, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28736309260435466, 'r': 0.32062523425305606, 'f1': 0.30308430215490684}, 'combined': 0.2233252752720366, 'epoch': 4} Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2962319368793944, 'r': 0.2503203731182247, 'f1': 0.2713478201912912}, 'combined': 0.16852127780301246, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2602040816326531, 'r': 0.36428571428571427, 'f1': 0.3035714285714286}, 'combined': 0.20238095238095238, 'epoch': 4} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23170731707317074, 'r': 0.41304347826086957, 'f1': 0.296875}, 'combined': 0.1484375, 'epoch': 4} Sample 
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4107142857142857, 'r': 0.2974137931034483, 'f1': 0.345}, 'combined': 0.22999999999999998, 'epoch': 4} New best korean model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28460783492462427, 'r': 0.3234916377985767, 'f1': 0.3028065597155416}, 'combined': 0.2231206229482938, 'epoch': 4} Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28410097979888116, 'r': 0.2478507064120803, 'f1': 0.2647406911596547}, 'combined': 0.16441790293073294, 'epoch': 4} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23170731707317074, 'r': 0.41304347826086957, 'f1': 0.296875}, 'combined': 0.1484375, 'epoch': 4} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28736309260435466, 'r': 0.32062523425305606, 'f1': 0.30308430215490684}, 'combined': 0.2233252752720366, 'epoch': 4} Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 
0.2962319368793944, 'r': 0.2503203731182247, 'f1': 0.2713478201912912}, 'combined': 0.16852127780301246, 'epoch': 4} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4107142857142857, 'r': 0.2974137931034483, 'f1': 0.345}, 'combined': 0.22999999999999998, 'epoch': 4} ****************************** Epoch: 5 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 12:03:09.523515: step: 2/464, loss: 0.6496837139129639 2023-01-22 12:03:10.355798: step: 4/464, loss: 0.6422550678253174 2023-01-22 12:03:11.083719: step: 6/464, loss: 1.8030405044555664 2023-01-22 12:03:11.860536: step: 8/464, loss: 0.9896953105926514 2023-01-22 12:03:12.594348: step: 10/464, loss: 0.6033563613891602 2023-01-22 12:03:13.378557: step: 12/464, loss: 1.4169790744781494 2023-01-22 12:03:14.098510: step: 14/464, loss: 0.4817117750644684 2023-01-22 12:03:14.805718: step: 16/464, loss: 0.47974684834480286 2023-01-22 12:03:15.582917: step: 18/464, loss: 5.893843650817871 2023-01-22 12:03:16.345599: step: 20/464, loss: 1.6159776449203491 2023-01-22 12:03:17.033324: step: 22/464, loss: 1.6918559074401855 2023-01-22 12:03:17.759045: step: 24/464, loss: 0.6972661018371582 2023-01-22 12:03:18.520133: step: 26/464, loss: 0.8587067127227783 2023-01-22 12:03:19.256248: step: 28/464, loss: 0.42765671014785767 2023-01-22 12:03:20.001273: step: 30/464, loss: 0.6781284809112549 2023-01-22 12:03:20.773633: step: 32/464, loss: 0.2493610978126526 2023-01-22 12:03:21.668688: step: 34/464, loss: 0.9981824159622192 2023-01-22 12:03:22.444868: step: 36/464, loss: 0.4823274314403534 2023-01-22 12:03:23.149675: step: 38/464, loss: 0.4170873761177063 2023-01-22 12:03:23.926174: step: 40/464, loss: 0.6056738495826721 2023-01-22 12:03:24.626896: step: 42/464, loss: 0.22396408021450043 2023-01-22 
12:03:25.643717: step: 44/464, loss: 1.1149306297302246 2023-01-22 12:03:26.331236: step: 46/464, loss: 0.6650685667991638 2023-01-22 12:03:27.035260: step: 48/464, loss: 0.6071967482566833 2023-01-22 12:03:27.841225: step: 50/464, loss: 0.4230685234069824 2023-01-22 12:03:28.691847: step: 52/464, loss: 2.148970603942871 2023-01-22 12:03:29.399320: step: 54/464, loss: 0.40129613876342773 2023-01-22 12:03:30.189054: step: 56/464, loss: 0.5853995680809021 2023-01-22 12:03:30.877083: step: 58/464, loss: 0.43005678057670593 2023-01-22 12:03:31.744539: step: 60/464, loss: 0.14942124485969543 2023-01-22 12:03:32.557712: step: 62/464, loss: 0.8140151500701904 2023-01-22 12:03:33.353448: step: 64/464, loss: 0.20544444024562836 2023-01-22 12:03:34.086941: step: 66/464, loss: 0.3382854759693146 2023-01-22 12:03:34.832876: step: 68/464, loss: 0.4904332160949707 2023-01-22 12:03:35.581085: step: 70/464, loss: 0.642681360244751 2023-01-22 12:03:36.277811: step: 72/464, loss: 0.7253457307815552 2023-01-22 12:03:37.013239: step: 74/464, loss: 0.7383750677108765 2023-01-22 12:03:37.709134: step: 76/464, loss: 0.6518418788909912 2023-01-22 12:03:38.461438: step: 78/464, loss: 1.713513731956482 2023-01-22 12:03:39.169439: step: 80/464, loss: 0.5307286381721497 2023-01-22 12:03:39.856791: step: 82/464, loss: 0.26544585824012756 2023-01-22 12:03:40.570575: step: 84/464, loss: 0.14702025055885315 2023-01-22 12:03:41.355292: step: 86/464, loss: 0.34280622005462646 2023-01-22 12:03:42.117211: step: 88/464, loss: 0.8543655276298523 2023-01-22 12:03:42.823936: step: 90/464, loss: 1.2161846160888672 2023-01-22 12:03:43.570194: step: 92/464, loss: 0.30192968249320984 2023-01-22 12:03:44.290250: step: 94/464, loss: 1.5269676446914673 2023-01-22 12:03:45.010531: step: 96/464, loss: 0.28412994742393494 2023-01-22 12:03:45.818652: step: 98/464, loss: 6.153512001037598 2023-01-22 12:03:46.621025: step: 100/464, loss: 1.3124456405639648 2023-01-22 12:03:47.434375: step: 102/464, loss: 
1.2780077457427979 2023-01-22 12:03:48.208995: step: 104/464, loss: 0.7583173513412476 2023-01-22 12:03:49.011066: step: 106/464, loss: 0.5176824331283569 2023-01-22 12:03:49.794796: step: 108/464, loss: 0.8629632592201233 2023-01-22 12:03:50.508396: step: 110/464, loss: 0.9889530539512634 2023-01-22 12:03:51.185107: step: 112/464, loss: 0.48290348052978516 2023-01-22 12:03:51.910293: step: 114/464, loss: 1.1069570779800415 2023-01-22 12:03:52.689167: step: 116/464, loss: 0.26499226689338684 2023-01-22 12:03:53.474706: step: 118/464, loss: 0.9873026609420776 2023-01-22 12:03:54.207959: step: 120/464, loss: 0.37587887048721313 2023-01-22 12:03:54.912835: step: 122/464, loss: 2.3691699504852295 2023-01-22 12:03:55.786506: step: 124/464, loss: 0.31297823786735535 2023-01-22 12:03:56.488910: step: 126/464, loss: 0.3409603536128998 2023-01-22 12:03:57.222602: step: 128/464, loss: 1.7431035041809082 2023-01-22 12:03:58.008556: step: 130/464, loss: 0.7571587562561035 2023-01-22 12:03:58.853013: step: 132/464, loss: 0.7416355609893799 2023-01-22 12:03:59.592102: step: 134/464, loss: 0.8539094924926758 2023-01-22 12:04:00.334177: step: 136/464, loss: 1.114874243736267 2023-01-22 12:04:01.158893: step: 138/464, loss: 1.1200034618377686 2023-01-22 12:04:01.876093: step: 140/464, loss: 0.6744300723075867 2023-01-22 12:04:02.630534: step: 142/464, loss: 0.36426013708114624 2023-01-22 12:04:03.335305: step: 144/464, loss: 0.6616504788398743 2023-01-22 12:04:04.021538: step: 146/464, loss: 0.7991723418235779 2023-01-22 12:04:04.777073: step: 148/464, loss: 0.3798237442970276 2023-01-22 12:04:05.509272: step: 150/464, loss: 0.30810824036598206 2023-01-22 12:04:06.253701: step: 152/464, loss: 0.7624216079711914 2023-01-22 12:04:07.041946: step: 154/464, loss: 0.20280928909778595 2023-01-22 12:04:07.853229: step: 156/464, loss: 0.28102943301200867 2023-01-22 12:04:08.585144: step: 158/464, loss: 0.8951029181480408 2023-01-22 12:04:09.326075: step: 160/464, loss: 2.415346622467041 
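[Editor's note] The Dev/Test/Sample summary blocks in this log report template and slot precision/recall/F1 alongside a 'combined' score. Checking the logged numbers suggests that each F1 is the usual harmonic mean and that 'combined' is simply template F1 × slot F1. This is an inference from the logged values, not a confirmed reading of train.py; the helper names below are hypothetical. A minimal sketch that reproduces the epoch-4 Dev Chinese entry:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    """'combined' in this log appears to be the product of the two F1 scores
    (assumption inferred from the logged values)."""
    return template_f1 * slot_f1

# Reproduce the epoch-4 "Dev Chinese" entry from this log:
template_f1 = f1(1.0, 0.5833333333333334)            # logged as 0.7368421052631579
combined = combined_score(template_f1, 0.30181870013509404)
# combined matches the logged 'combined': 0.22239272641533245
```

Under this reading, a high template F1 gates the combined score: slot-level gains only show up in 'combined' in proportion to how well templates are matched, which is consistent with the Sample Korean entries (template f1 = 0.5) having roughly half the combined score of their slot F1.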
2023-01-22 12:04:10.099084: step: 162/464, loss: 1.141200065612793 2023-01-22 12:04:10.803992: step: 164/464, loss: 0.28005051612854004 2023-01-22 12:04:11.491118: step: 166/464, loss: 1.085160732269287 2023-01-22 12:04:12.256646: step: 168/464, loss: 0.33003517985343933 2023-01-22 12:04:12.963059: step: 170/464, loss: 0.5919443964958191 2023-01-22 12:04:13.703249: step: 172/464, loss: 0.7433185577392578 2023-01-22 12:04:14.403573: step: 174/464, loss: 0.7050065994262695 2023-01-22 12:04:15.209879: step: 176/464, loss: 0.946040153503418 2023-01-22 12:04:15.901402: step: 178/464, loss: 0.3946067988872528 2023-01-22 12:04:16.683931: step: 180/464, loss: 0.8988597989082336 2023-01-22 12:04:17.511977: step: 182/464, loss: 0.24341298639774323 2023-01-22 12:04:18.271135: step: 184/464, loss: 0.634074330329895 2023-01-22 12:04:19.036871: step: 186/464, loss: 0.6246568560600281 2023-01-22 12:04:19.826076: step: 188/464, loss: 0.496273934841156 2023-01-22 12:04:20.544274: step: 190/464, loss: 0.5497092008590698 2023-01-22 12:04:21.276455: step: 192/464, loss: 1.4893434047698975 2023-01-22 12:04:21.995332: step: 194/464, loss: 0.7866815328598022 2023-01-22 12:04:22.782121: step: 196/464, loss: 1.3385767936706543 2023-01-22 12:04:23.529768: step: 198/464, loss: 2.3288447856903076 2023-01-22 12:04:24.332909: step: 200/464, loss: 0.9130333662033081 2023-01-22 12:04:25.095576: step: 202/464, loss: 0.2995184361934662 2023-01-22 12:04:25.941055: step: 204/464, loss: 0.38763970136642456 2023-01-22 12:04:26.796141: step: 206/464, loss: 0.16953855752944946 2023-01-22 12:04:27.518579: step: 208/464, loss: 0.316944420337677 2023-01-22 12:04:28.224836: step: 210/464, loss: 7.582270622253418 2023-01-22 12:04:28.993522: step: 212/464, loss: 1.0553747415542603 2023-01-22 12:04:29.741832: step: 214/464, loss: 0.27340349555015564 2023-01-22 12:04:30.574985: step: 216/464, loss: 0.4284445345401764 2023-01-22 12:04:31.414875: step: 218/464, loss: 0.3169950246810913 2023-01-22 12:04:32.143759: 
step: 220/464, loss: 0.38640186190605164 2023-01-22 12:04:32.890481: step: 222/464, loss: 1.3186635971069336 2023-01-22 12:04:33.694600: step: 224/464, loss: 0.8435964584350586 2023-01-22 12:04:34.597181: step: 226/464, loss: 0.3275609016418457 2023-01-22 12:04:35.336899: step: 228/464, loss: 0.5751467347145081 2023-01-22 12:04:36.131435: step: 230/464, loss: 0.3827030658721924 2023-01-22 12:04:36.875351: step: 232/464, loss: 0.6583121418952942 2023-01-22 12:04:37.589314: step: 234/464, loss: 0.6310871243476868 2023-01-22 12:04:38.440418: step: 236/464, loss: 1.0897424221038818 2023-01-22 12:04:39.224471: step: 238/464, loss: 0.5430383682250977 2023-01-22 12:04:39.968292: step: 240/464, loss: 4.718739986419678 2023-01-22 12:04:40.789773: step: 242/464, loss: 0.5165674090385437 2023-01-22 12:04:41.497805: step: 244/464, loss: 0.4952998757362366 2023-01-22 12:04:42.224209: step: 246/464, loss: 0.6369624137878418 2023-01-22 12:04:43.052026: step: 248/464, loss: 0.6831679344177246 2023-01-22 12:04:43.751619: step: 250/464, loss: 0.6541798710823059 2023-01-22 12:04:44.529868: step: 252/464, loss: 3.147430181503296 2023-01-22 12:04:45.212951: step: 254/464, loss: 0.6115115880966187 2023-01-22 12:04:46.003619: step: 256/464, loss: 0.977453351020813 2023-01-22 12:04:46.734436: step: 258/464, loss: 0.7032570838928223 2023-01-22 12:04:47.483495: step: 260/464, loss: 0.6393693089485168 2023-01-22 12:04:48.162682: step: 262/464, loss: 0.581774115562439 2023-01-22 12:04:48.831355: step: 264/464, loss: 1.0034289360046387 2023-01-22 12:04:49.612335: step: 266/464, loss: 0.18100430071353912 2023-01-22 12:04:50.390101: step: 268/464, loss: 0.1348925530910492 2023-01-22 12:04:51.153303: step: 270/464, loss: 0.5453640818595886 2023-01-22 12:04:51.854200: step: 272/464, loss: 0.20027294754981995 2023-01-22 12:04:52.818858: step: 274/464, loss: 0.9242507219314575 2023-01-22 12:04:53.584362: step: 276/464, loss: 0.49561578035354614 2023-01-22 12:04:54.249990: step: 278/464, loss: 
1.211693286895752 2023-01-22 12:04:55.039852: step: 280/464, loss: 0.40580329298973083 2023-01-22 12:04:55.780026: step: 282/464, loss: 1.4561851024627686 2023-01-22 12:04:56.536882: step: 284/464, loss: 1.5789389610290527 2023-01-22 12:04:57.327170: step: 286/464, loss: 0.38022440671920776 2023-01-22 12:04:58.062931: step: 288/464, loss: 1.5119850635528564 2023-01-22 12:04:58.804258: step: 290/464, loss: 0.8197949528694153 2023-01-22 12:04:59.592494: step: 292/464, loss: 0.18052376806735992 2023-01-22 12:05:00.350019: step: 294/464, loss: 1.1562833786010742 2023-01-22 12:05:01.085471: step: 296/464, loss: 0.34153157472610474 2023-01-22 12:05:01.833385: step: 298/464, loss: 0.3635067939758301 2023-01-22 12:05:02.648487: step: 300/464, loss: 2.3166756629943848 2023-01-22 12:05:03.386450: step: 302/464, loss: 0.3704625070095062 2023-01-22 12:05:04.160728: step: 304/464, loss: 0.5335472226142883 2023-01-22 12:05:04.890864: step: 306/464, loss: 0.5522812008857727 2023-01-22 12:05:05.691749: step: 308/464, loss: 0.7435985207557678 2023-01-22 12:05:06.436767: step: 310/464, loss: 0.5566904544830322 2023-01-22 12:05:07.187768: step: 312/464, loss: 0.5143800377845764 2023-01-22 12:05:07.926412: step: 314/464, loss: 2.127366065979004 2023-01-22 12:05:08.609787: step: 316/464, loss: 1.1592189073562622 2023-01-22 12:05:09.404685: step: 318/464, loss: 0.45830056071281433 2023-01-22 12:05:10.151985: step: 320/464, loss: 1.5356149673461914 2023-01-22 12:05:10.863613: step: 322/464, loss: 0.8824881315231323 2023-01-22 12:05:11.593113: step: 324/464, loss: 0.9135230779647827 2023-01-22 12:05:12.532556: step: 326/464, loss: 0.4782622158527374 2023-01-22 12:05:13.142180: step: 328/464, loss: 0.4283546209335327 2023-01-22 12:05:13.875427: step: 330/464, loss: 1.4542497396469116 2023-01-22 12:05:14.571254: step: 332/464, loss: 1.019738793373108 2023-01-22 12:05:15.290130: step: 334/464, loss: 0.6632002592086792 2023-01-22 12:05:16.018086: step: 336/464, loss: 0.8099590539932251 
2023-01-22 12:05:16.785638: step: 338/464, loss: 1.555448055267334 2023-01-22 12:05:17.472763: step: 340/464, loss: 0.9206175804138184 2023-01-22 12:05:18.185750: step: 342/464, loss: 0.35792335867881775 2023-01-22 12:05:18.966721: step: 344/464, loss: 0.3381599187850952 2023-01-22 12:05:19.764051: step: 346/464, loss: 0.16699723899364471 2023-01-22 12:05:20.438068: step: 348/464, loss: 0.5061295628547668 2023-01-22 12:05:21.288155: step: 350/464, loss: 1.0372631549835205 2023-01-22 12:05:22.092657: step: 352/464, loss: 0.6776844263076782 2023-01-22 12:05:22.944774: step: 354/464, loss: 0.32754576206207275 2023-01-22 12:05:23.690259: step: 356/464, loss: 0.9929206371307373 2023-01-22 12:05:24.445054: step: 358/464, loss: 1.648221731185913 2023-01-22 12:05:25.162264: step: 360/464, loss: 0.7910313606262207 2023-01-22 12:05:25.906420: step: 362/464, loss: 0.3531991243362427 2023-01-22 12:05:26.663799: step: 364/464, loss: 7.432271957397461 2023-01-22 12:05:27.450985: step: 366/464, loss: 0.783743143081665 2023-01-22 12:05:28.234506: step: 368/464, loss: 0.7211997509002686 2023-01-22 12:05:28.996155: step: 370/464, loss: 4.978240966796875 2023-01-22 12:05:29.715841: step: 372/464, loss: 1.7499102354049683 2023-01-22 12:05:30.548712: step: 374/464, loss: 0.985282301902771 2023-01-22 12:05:31.325911: step: 376/464, loss: 0.3552509546279907 2023-01-22 12:05:32.062202: step: 378/464, loss: 3.0569443702697754 2023-01-22 12:05:32.890433: step: 380/464, loss: 0.9991600513458252 2023-01-22 12:05:33.687469: step: 382/464, loss: 0.6255131363868713 2023-01-22 12:05:34.408780: step: 384/464, loss: 0.280425488948822 2023-01-22 12:05:35.148988: step: 386/464, loss: 0.9654706716537476 2023-01-22 12:05:35.953705: step: 388/464, loss: 0.7268888354301453 2023-01-22 12:05:36.667610: step: 390/464, loss: 0.5317648649215698 2023-01-22 12:05:37.466610: step: 392/464, loss: 0.3457382917404175 2023-01-22 12:05:38.268352: step: 394/464, loss: 1.3924022912979126 2023-01-22 12:05:39.089835: 
step: 396/464, loss: 0.32296717166900635 2023-01-22 12:05:39.927162: step: 398/464, loss: 0.6169033050537109 2023-01-22 12:05:40.748993: step: 400/464, loss: 0.20771922171115875 2023-01-22 12:05:41.484498: step: 402/464, loss: 0.8123390674591064 2023-01-22 12:05:42.382183: step: 404/464, loss: 0.37989774346351624 2023-01-22 12:05:43.173638: step: 406/464, loss: 1.4054900407791138 2023-01-22 12:05:43.944013: step: 408/464, loss: 1.6585242748260498 2023-01-22 12:05:44.692760: step: 410/464, loss: 0.742591381072998 2023-01-22 12:05:45.560783: step: 412/464, loss: 3.3262979984283447 2023-01-22 12:05:46.234097: step: 414/464, loss: 0.4885295331478119 2023-01-22 12:05:47.033427: step: 416/464, loss: 0.6670538187026978 2023-01-22 12:05:47.781202: step: 418/464, loss: 0.8947754502296448 2023-01-22 12:05:48.637509: step: 420/464, loss: 1.3545454740524292 2023-01-22 12:05:49.429104: step: 422/464, loss: 1.00066077709198 2023-01-22 12:05:50.190920: step: 424/464, loss: 1.192989468574524 2023-01-22 12:05:50.904785: step: 426/464, loss: 0.8917605876922607 2023-01-22 12:05:51.597044: step: 428/464, loss: 0.6383041143417358 2023-01-22 12:05:52.339062: step: 430/464, loss: 1.215901255607605 2023-01-22 12:05:53.057313: step: 432/464, loss: 0.7293286323547363 2023-01-22 12:05:53.799805: step: 434/464, loss: 0.6771726012229919 2023-01-22 12:05:54.764908: step: 436/464, loss: 0.7097762823104858 2023-01-22 12:05:55.435531: step: 438/464, loss: 0.4456166923046112 2023-01-22 12:05:56.182485: step: 440/464, loss: 0.5973028540611267 2023-01-22 12:05:56.886475: step: 442/464, loss: 0.6711571216583252 2023-01-22 12:05:57.751962: step: 444/464, loss: 0.19482938945293427 2023-01-22 12:05:58.527756: step: 446/464, loss: 0.2720489203929901 2023-01-22 12:05:59.272412: step: 448/464, loss: 2.246628999710083 2023-01-22 12:06:00.017616: step: 450/464, loss: 0.6945271492004395 2023-01-22 12:06:00.758240: step: 452/464, loss: 1.3300384283065796 2023-01-22 12:06:01.487311: step: 454/464, loss: 
0.28272974491119385 2023-01-22 12:06:02.301694: step: 456/464, loss: 1.008739709854126 2023-01-22 12:06:03.018656: step: 458/464, loss: 1.3190908432006836 2023-01-22 12:06:03.690281: step: 460/464, loss: 1.1253019571304321 2023-01-22 12:06:04.404513: step: 462/464, loss: 0.7815342545509338 2023-01-22 12:06:05.236496: step: 464/464, loss: 0.5137872695922852 2023-01-22 12:06:05.996140: step: 466/464, loss: 0.7086704969406128 2023-01-22 12:06:06.864206: step: 468/464, loss: 0.1647951453924179 2023-01-22 12:06:07.603032: step: 470/464, loss: 1.607774019241333 2023-01-22 12:06:08.388805: step: 472/464, loss: 0.23846077919006348 2023-01-22 12:06:09.110656: step: 474/464, loss: 0.39363813400268555 2023-01-22 12:06:09.949463: step: 476/464, loss: 0.29684215784072876 2023-01-22 12:06:10.690592: step: 478/464, loss: 0.08082844316959381 2023-01-22 12:06:11.415653: step: 480/464, loss: 0.9043727517127991 2023-01-22 12:06:12.198865: step: 482/464, loss: 1.3557593822479248 2023-01-22 12:06:12.945982: step: 484/464, loss: 3.4816226959228516 2023-01-22 12:06:13.718255: step: 486/464, loss: 0.8969863653182983 2023-01-22 12:06:14.476039: step: 488/464, loss: 0.7074337601661682 2023-01-22 12:06:15.226615: step: 490/464, loss: 0.8259276747703552 2023-01-22 12:06:15.968990: step: 492/464, loss: 0.4520622491836548 2023-01-22 12:06:16.741582: step: 494/464, loss: 0.7717126607894897 2023-01-22 12:06:17.516119: step: 496/464, loss: 0.25604677200317383 2023-01-22 12:06:18.243311: step: 498/464, loss: 0.6380689740180969 2023-01-22 12:06:19.027298: step: 500/464, loss: 0.42727962136268616 2023-01-22 12:06:19.785722: step: 502/464, loss: 0.3959901034832001 2023-01-22 12:06:20.551633: step: 504/464, loss: 0.4253363311290741 2023-01-22 12:06:21.305480: step: 506/464, loss: 0.8880304098129272 2023-01-22 12:06:22.078235: step: 508/464, loss: 0.2152801901102066 2023-01-22 12:06:22.851908: step: 510/464, loss: 0.9369227886199951 2023-01-22 12:06:23.588846: step: 512/464, loss: 2.3638715744018555 
2023-01-22 12:06:24.336984: step: 514/464, loss: 0.42224061489105225 2023-01-22 12:06:25.102235: step: 516/464, loss: 0.343061238527298 2023-01-22 12:06:25.866536: step: 518/464, loss: 0.5075637102127075 2023-01-22 12:06:26.540008: step: 520/464, loss: 0.6902667880058289 2023-01-22 12:06:27.400483: step: 522/464, loss: 0.6213053464889526 2023-01-22 12:06:28.236542: step: 524/464, loss: 15.072422981262207 2023-01-22 12:06:28.967771: step: 526/464, loss: 2.107042074203491 2023-01-22 12:06:29.710347: step: 528/464, loss: 0.17740926146507263 2023-01-22 12:06:30.463682: step: 530/464, loss: 2.396019697189331 2023-01-22 12:06:31.185440: step: 532/464, loss: 1.1157536506652832 2023-01-22 12:06:31.873429: step: 534/464, loss: 0.7146736979484558 2023-01-22 12:06:32.658418: step: 536/464, loss: 0.8112752437591553 2023-01-22 12:06:33.427487: step: 538/464, loss: 0.6888623833656311 2023-01-22 12:06:34.206599: step: 540/464, loss: 0.2979349195957184 2023-01-22 12:06:34.948272: step: 542/464, loss: 0.2549689710140228 2023-01-22 12:06:35.717291: step: 544/464, loss: 2.5347049236297607 2023-01-22 12:06:36.516691: step: 546/464, loss: 0.4879702925682068 2023-01-22 12:06:37.231785: step: 548/464, loss: 0.24486508965492249 2023-01-22 12:06:37.945529: step: 550/464, loss: 1.338243842124939 2023-01-22 12:06:38.668925: step: 552/464, loss: 2.470813274383545 2023-01-22 12:06:39.419819: step: 554/464, loss: 0.2586001455783844 2023-01-22 12:06:40.230815: step: 556/464, loss: 0.21788036823272705 2023-01-22 12:06:40.933574: step: 558/464, loss: 0.6774846315383911 2023-01-22 12:06:41.693724: step: 560/464, loss: 0.31087690591812134 2023-01-22 12:06:42.536736: step: 562/464, loss: 1.0382733345031738 2023-01-22 12:06:43.246047: step: 564/464, loss: 0.28306156396865845 2023-01-22 12:06:43.952212: step: 566/464, loss: 0.713935911655426 2023-01-22 12:06:44.815572: step: 568/464, loss: 0.38265261054039 2023-01-22 12:06:45.565041: step: 570/464, loss: 0.7601750493049622 2023-01-22 12:06:46.294102: 
step: 572/464, loss: 0.7853434681892395 2023-01-22 12:06:47.027368: step: 574/464, loss: 0.24956616759300232 2023-01-22 12:06:47.779641: step: 576/464, loss: 0.48518678545951843 2023-01-22 12:06:48.532611: step: 578/464, loss: 0.6642841100692749 2023-01-22 12:06:49.249900: step: 580/464, loss: 0.8533198237419128 2023-01-22 12:06:49.987100: step: 582/464, loss: 0.4441028833389282 2023-01-22 12:06:50.698094: step: 584/464, loss: 0.8509069085121155 2023-01-22 12:06:51.489308: step: 586/464, loss: 0.816931426525116 2023-01-22 12:06:52.240024: step: 588/464, loss: 0.8874893188476562 2023-01-22 12:06:53.036810: step: 590/464, loss: 1.2424025535583496 2023-01-22 12:06:53.765896: step: 592/464, loss: 0.39771029353141785 2023-01-22 12:06:54.539224: step: 594/464, loss: 0.2105729579925537 2023-01-22 12:06:55.305663: step: 596/464, loss: 1.876196265220642 2023-01-22 12:06:56.145105: step: 598/464, loss: 0.22702628374099731 2023-01-22 12:06:56.804305: step: 600/464, loss: 4.492393970489502 2023-01-22 12:06:57.589724: step: 602/464, loss: 0.18425554037094116 2023-01-22 12:06:58.299119: step: 604/464, loss: 0.5201539993286133 2023-01-22 12:06:59.121136: step: 606/464, loss: 0.30435648560523987 2023-01-22 12:06:59.848798: step: 608/464, loss: 0.6498906016349792 2023-01-22 12:07:00.590513: step: 610/464, loss: 1.6036509275436401 2023-01-22 12:07:01.356087: step: 612/464, loss: 1.4098213911056519 2023-01-22 12:07:02.107613: step: 614/464, loss: 0.24880897998809814 2023-01-22 12:07:02.865538: step: 616/464, loss: 1.1465108394622803 2023-01-22 12:07:03.668723: step: 618/464, loss: 2.580686569213867 2023-01-22 12:07:04.462402: step: 620/464, loss: 0.45269080996513367 2023-01-22 12:07:05.264266: step: 622/464, loss: 5.887742042541504 2023-01-22 12:07:06.042854: step: 624/464, loss: 0.5069065690040588 2023-01-22 12:07:06.800839: step: 626/464, loss: 0.25332337617874146 2023-01-22 12:07:07.528026: step: 628/464, loss: 1.8866604566574097 2023-01-22 12:07:08.216781: step: 630/464, loss: 
0.4924744963645935 2023-01-22 12:07:08.937522: step: 632/464, loss: 1.128131628036499 2023-01-22 12:07:09.665110: step: 634/464, loss: 0.3855154812335968 2023-01-22 12:07:10.453791: step: 636/464, loss: 0.3899562954902649 2023-01-22 12:07:11.180165: step: 638/464, loss: 2.0404696464538574 2023-01-22 12:07:12.013183: step: 640/464, loss: 0.23224273324012756 2023-01-22 12:07:12.753508: step: 642/464, loss: 0.9769107103347778 2023-01-22 12:07:13.588960: step: 644/464, loss: 1.0341360569000244 2023-01-22 12:07:14.351321: step: 646/464, loss: 1.0036234855651855 2023-01-22 12:07:15.083235: step: 648/464, loss: 0.36653128266334534 2023-01-22 12:07:15.817018: step: 650/464, loss: 0.6084499955177307 2023-01-22 12:07:16.627093: step: 652/464, loss: 0.6824120283126831 2023-01-22 12:07:17.407528: step: 654/464, loss: 1.4943143129348755 2023-01-22 12:07:18.096113: step: 656/464, loss: 0.5834726691246033 2023-01-22 12:07:18.844709: step: 658/464, loss: 0.48704636096954346 2023-01-22 12:07:19.529804: step: 660/464, loss: 1.881295919418335 2023-01-22 12:07:20.249088: step: 662/464, loss: 0.5884661078453064 2023-01-22 12:07:21.024021: step: 664/464, loss: 0.8509942293167114 2023-01-22 12:07:21.759862: step: 666/464, loss: 0.8986581563949585 2023-01-22 12:07:22.592042: step: 668/464, loss: 0.965084433555603 2023-01-22 12:07:23.361486: step: 670/464, loss: 0.47922876477241516 2023-01-22 12:07:24.185682: step: 672/464, loss: 0.6889102458953857 2023-01-22 12:07:24.946783: step: 674/464, loss: 0.4863603413105011 2023-01-22 12:07:25.686481: step: 676/464, loss: 0.4932281970977783 2023-01-22 12:07:26.416044: step: 678/464, loss: 0.7120741009712219 2023-01-22 12:07:27.183827: step: 680/464, loss: 1.7616684436798096 2023-01-22 12:07:27.892753: step: 682/464, loss: 5.395840644836426 2023-01-22 12:07:28.654924: step: 684/464, loss: 0.5092490315437317 2023-01-22 12:07:29.397380: step: 686/464, loss: 0.4408661723136902 2023-01-22 12:07:30.102689: step: 688/464, loss: 0.6526728272438049 
2023-01-22 12:07:30.853210: step: 690/464, loss: 0.4350407123565674 2023-01-22 12:07:31.569040: step: 692/464, loss: 0.9150271415710449 2023-01-22 12:07:32.330242: step: 694/464, loss: 0.9353266358375549 2023-01-22 12:07:33.103734: step: 696/464, loss: 0.6587817072868347 2023-01-22 12:07:33.875504: step: 698/464, loss: 0.34132251143455505 2023-01-22 12:07:34.639156: step: 700/464, loss: 0.9389505982398987 2023-01-22 12:07:35.343587: step: 702/464, loss: 0.8931823968887329 2023-01-22 12:07:36.080732: step: 704/464, loss: 0.45848965644836426 2023-01-22 12:07:36.774204: step: 706/464, loss: 0.5228652358055115 2023-01-22 12:07:37.551352: step: 708/464, loss: 0.7406331896781921 2023-01-22 12:07:38.363790: step: 710/464, loss: 1.119754672050476 2023-01-22 12:07:39.144413: step: 712/464, loss: 0.46346792578697205 2023-01-22 12:07:39.826620: step: 714/464, loss: 0.7823930382728577 2023-01-22 12:07:40.539774: step: 716/464, loss: 0.3420368731021881 2023-01-22 12:07:41.313078: step: 718/464, loss: 0.6503040194511414 2023-01-22 12:07:42.079641: step: 720/464, loss: 0.41361570358276367 2023-01-22 12:07:42.866056: step: 722/464, loss: 0.8351732492446899 2023-01-22 12:07:43.672470: step: 724/464, loss: 0.37005409598350525 2023-01-22 12:07:44.428405: step: 726/464, loss: 6.30983829498291 2023-01-22 12:07:45.192209: step: 728/464, loss: 1.8675658702850342 2023-01-22 12:07:45.946832: step: 730/464, loss: 0.3663023114204407 2023-01-22 12:07:46.676134: step: 732/464, loss: 1.7332795858383179 2023-01-22 12:07:47.417452: step: 734/464, loss: 0.4731151759624481 2023-01-22 12:07:48.242732: step: 736/464, loss: 1.2002259492874146 2023-01-22 12:07:48.970825: step: 738/464, loss: 0.8168594241142273 2023-01-22 12:07:49.663783: step: 740/464, loss: 0.66452556848526 2023-01-22 12:07:50.420963: step: 742/464, loss: 0.3836321532726288 2023-01-22 12:07:51.095648: step: 744/464, loss: 1.8308157920837402 2023-01-22 12:07:51.831111: step: 746/464, loss: 0.5129436254501343 2023-01-22 12:07:52.617799: 
step: 748/464, loss: 2.239680290222168 2023-01-22 12:07:53.388622: step: 750/464, loss: 0.9776561260223389 2023-01-22 12:07:54.049866: step: 752/464, loss: 1.0558445453643799 2023-01-22 12:07:54.815223: step: 754/464, loss: 0.875616192817688 2023-01-22 12:07:55.519436: step: 756/464, loss: 0.948789656162262 2023-01-22 12:07:56.372545: step: 758/464, loss: 0.5340040922164917 2023-01-22 12:07:57.087654: step: 760/464, loss: 0.20830850303173065 2023-01-22 12:07:57.896912: step: 762/464, loss: 2.270432472229004 2023-01-22 12:07:58.561106: step: 764/464, loss: 0.6409857273101807 2023-01-22 12:07:59.207928: step: 766/464, loss: 1.0566260814666748 2023-01-22 12:07:59.945322: step: 768/464, loss: 0.5132637023925781 2023-01-22 12:08:00.640208: step: 770/464, loss: 0.6227934956550598 2023-01-22 12:08:01.474513: step: 772/464, loss: 0.2820763885974884 2023-01-22 12:08:02.215842: step: 774/464, loss: 0.7153021097183228 2023-01-22 12:08:02.990277: step: 776/464, loss: 1.080057144165039 2023-01-22 12:08:03.791934: step: 778/464, loss: 1.1948184967041016 2023-01-22 12:08:04.635169: step: 780/464, loss: 0.8179348707199097 2023-01-22 12:08:05.438768: step: 782/464, loss: 1.160705804824829 2023-01-22 12:08:06.157121: step: 784/464, loss: 1.1092203855514526 2023-01-22 12:08:06.882370: step: 786/464, loss: 0.7730402946472168 2023-01-22 12:08:07.650993: step: 788/464, loss: 4.028815746307373 2023-01-22 12:08:08.374836: step: 790/464, loss: 1.1363494396209717 2023-01-22 12:08:09.285082: step: 792/464, loss: 1.3450284004211426 2023-01-22 12:08:10.051313: step: 794/464, loss: 0.47559836506843567 2023-01-22 12:08:10.802382: step: 796/464, loss: 0.6122507452964783 2023-01-22 12:08:11.476169: step: 798/464, loss: 0.22464758157730103 2023-01-22 12:08:12.287661: step: 800/464, loss: 1.3674689531326294 2023-01-22 12:08:13.015492: step: 802/464, loss: 0.24557270109653473 2023-01-22 12:08:13.841890: step: 804/464, loss: 1.0580368041992188 2023-01-22 12:08:14.628201: step: 806/464, loss: 
1.3049439191818237 2023-01-22 12:08:15.507615: step: 808/464, loss: 1.3354514837265015 2023-01-22 12:08:16.227529: step: 810/464, loss: 0.13791459798812866 2023-01-22 12:08:16.994452: step: 812/464, loss: 0.5022804737091064 2023-01-22 12:08:17.756532: step: 814/464, loss: 2.687323570251465 2023-01-22 12:08:18.472170: step: 816/464, loss: 0.39061281085014343 2023-01-22 12:08:19.167434: step: 818/464, loss: 0.26048630475997925 2023-01-22 12:08:19.967915: step: 820/464, loss: 0.9743564128875732 2023-01-22 12:08:20.719833: step: 822/464, loss: 0.20935381948947906 2023-01-22 12:08:21.424334: step: 824/464, loss: 0.782504677772522 2023-01-22 12:08:22.176871: step: 826/464, loss: 0.5422369837760925 2023-01-22 12:08:22.954614: step: 828/464, loss: 0.569701611995697 2023-01-22 12:08:23.743366: step: 830/464, loss: 0.788164496421814 2023-01-22 12:08:24.582791: step: 832/464, loss: 0.47551649808883667 2023-01-22 12:08:25.329514: step: 834/464, loss: 0.6161491870880127 2023-01-22 12:08:26.049037: step: 836/464, loss: 0.2869093418121338 2023-01-22 12:08:26.796638: step: 838/464, loss: 0.30319467186927795 2023-01-22 12:08:27.492430: step: 840/464, loss: 0.7133747935295105 2023-01-22 12:08:28.230809: step: 842/464, loss: 0.6851193904876709 2023-01-22 12:08:29.091547: step: 844/464, loss: 1.2948142290115356 2023-01-22 12:08:29.893822: step: 846/464, loss: 1.523021936416626 2023-01-22 12:08:30.632176: step: 848/464, loss: 0.5034209489822388 2023-01-22 12:08:31.450373: step: 850/464, loss: 0.7313014268875122 2023-01-22 12:08:32.208718: step: 852/464, loss: 0.7082140445709229 2023-01-22 12:08:32.887414: step: 854/464, loss: 1.2561655044555664 2023-01-22 12:08:33.642983: step: 856/464, loss: 0.7107453942298889 2023-01-22 12:08:34.292364: step: 858/464, loss: 0.4908789396286011 2023-01-22 12:08:35.085261: step: 860/464, loss: 0.49198484420776367 2023-01-22 12:08:35.832248: step: 862/464, loss: 1.3563846349716187 2023-01-22 12:08:36.537144: step: 864/464, loss: 0.3984234929084778 
2023-01-22 12:08:37.305863: step: 866/464, loss: 1.1371181011199951 2023-01-22 12:08:38.057920: step: 868/464, loss: 1.318985939025879 2023-01-22 12:08:38.786008: step: 870/464, loss: 0.7706924080848694 2023-01-22 12:08:39.537830: step: 872/464, loss: 0.45280876755714417 2023-01-22 12:08:40.331532: step: 874/464, loss: 0.3999007046222687 2023-01-22 12:08:41.029175: step: 876/464, loss: 0.28994327783584595 2023-01-22 12:08:41.812050: step: 878/464, loss: 0.5507034063339233 2023-01-22 12:08:42.555152: step: 880/464, loss: 0.23364470899105072 2023-01-22 12:08:43.227993: step: 882/464, loss: 3.641611099243164 2023-01-22 12:08:43.909616: step: 884/464, loss: 0.41873595118522644 2023-01-22 12:08:44.743683: step: 886/464, loss: 0.6705647706985474 2023-01-22 12:08:45.494136: step: 888/464, loss: 0.30085307359695435 2023-01-22 12:08:46.281570: step: 890/464, loss: 1.7496700286865234 2023-01-22 12:08:47.117153: step: 892/464, loss: 2.0429694652557373 2023-01-22 12:08:47.863030: step: 894/464, loss: 1.0652697086334229 2023-01-22 12:08:48.526898: step: 896/464, loss: 0.562082052230835 2023-01-22 12:08:49.232258: step: 898/464, loss: 0.5753889083862305 2023-01-22 12:08:49.998988: step: 900/464, loss: 0.5420715808868408 2023-01-22 12:08:50.782272: step: 902/464, loss: 1.3072800636291504 2023-01-22 12:08:51.526469: step: 904/464, loss: 0.48481813073158264 2023-01-22 12:08:52.269143: step: 906/464, loss: 1.3669205904006958 2023-01-22 12:08:53.004455: step: 908/464, loss: 0.38139575719833374 2023-01-22 12:08:53.665124: step: 910/464, loss: 0.4156199097633362 2023-01-22 12:08:54.411878: step: 912/464, loss: 0.3172839283943176 2023-01-22 12:08:55.121349: step: 914/464, loss: 0.29000595211982727 2023-01-22 12:08:55.821106: step: 916/464, loss: 0.2754070460796356 2023-01-22 12:08:56.506743: step: 918/464, loss: 0.8639828562736511 2023-01-22 12:08:57.419389: step: 920/464, loss: 0.6062501072883606 2023-01-22 12:08:58.132268: step: 922/464, loss: 0.9811422228813171 2023-01-22 
12:08:58.879884: step: 924/464, loss: 1.0257622003555298
2023-01-22 12:08:59.705007: step: 926/464, loss: 1.0060298442840576
2023-01-22 12:09:00.407273: step: 928/464, loss: 0.8957557082176208
2023-01-22 12:09:01.092287: step: 930/464, loss: 0.33803051710128784
==================================================
Loss: 0.956
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27596925661816635, 'r': 0.35175626837883706, 'f1': 0.30928777635974347}, 'combined': 0.22789625626507412, 'epoch': 5}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2782250985025974, 'r': 0.275203898025274, 'f1': 0.27670625181102}, 'combined': 0.17184914586158084, 'epoch': 5}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2614865884963018, 'r': 0.34980653679296536, 'f1': 0.2992663066394363}, 'combined': 0.22051201541853202, 'epoch': 5}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.26818733210187995, 'r': 0.2779829207373879, 'f1': 0.27299728425300435}, 'combined': 0.1695456817992343, 'epoch': 5}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26695042427531807, 'r': 0.3422830250651332, 'f1': 0.29995923192429935}, 'combined': 0.22102259194422055, 'epoch': 5}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28079776794420064, 'r': 0.27664804723566566, 'f1': 0.2787074619793555}, 'combined': 0.17309200270296815, 'epoch': 5}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2076923076923077, 'r': 0.38571428571428573, 'f1': 0.27}, 'combined': 0.18, 'epoch': 5}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.21808510638297873, 'r': 0.44565217391304346, 'f1': 0.29285714285714287}, 'combined': 0.14642857142857144, 'epoch': 5}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 5}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28460783492462427, 'r': 0.3234916377985767, 'f1': 0.3028065597155416}, 'combined': 0.2231206229482938, 'epoch': 4}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28410097979888116, 'r': 0.2478507064120803, 'f1': 0.2647406911596547}, 'combined': 0.16441790293073294, 'epoch': 4}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23170731707317074, 'r': 0.41304347826086957, 'f1': 0.296875}, 'combined': 0.1484375, 'epoch': 4}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28736309260435466, 'r': 0.32062523425305606, 'f1': 0.30308430215490684}, 'combined': 0.2233252752720366, 'epoch': 4}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2962319368793944, 'r': 0.2503203731182247, 'f1': 0.2713478201912912}, 'combined': 0.16852127780301246, 'epoch': 4}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4107142857142857, 'r': 0.2974137931034483, 'f1': 0.345}, 'combined': 0.22999999999999998, 'epoch': 4}
******************************
Epoch: 6
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:11:48.293348: step: 2/464, loss: 0.12472327053546906
2023-01-22 12:11:49.018564: step: 4/464, loss: 0.4600578546524048
2023-01-22 12:11:49.798153: step: 6/464, loss: 0.5284104943275452
2023-01-22 12:11:50.547711: step: 8/464, loss: 0.2502985894680023
2023-01-22 12:11:51.258956: step: 10/464, loss: 0.3811435401439667
2023-01-22 12:11:51.940167: step: 12/464, loss: 1.4870409965515137
2023-01-22 12:11:52.675879: step: 14/464, loss: 0.32735830545425415
2023-01-22 12:11:53.472522: step: 16/464, loss: 0.19058562815189362
2023-01-22 12:11:54.221778: step: 18/464, loss: 0.4643670618534088
2023-01-22 12:11:54.986581: step: 20/464, loss: 0.7304343581199646
2023-01-22 12:11:55.682897: step: 22/464, loss: 1.3959012031555176
2023-01-22 12:11:56.442021: step: 24/464, loss: 0.8194465637207031
2023-01-22 12:11:57.299307: step: 26/464, loss: 0.5014887452125549
2023-01-22 12:11:58.171916: step: 28/464, loss: 0.1794343739748001
2023-01-22 12:11:58.865692: step: 30/464, loss: 0.03984064981341362
2023-01-22 12:11:59.553521: step: 32/464, loss: 0.23485831916332245
2023-01-22 12:12:00.505062: step: 34/464, loss: 0.7319287657737732
2023-01-22 12:12:01.267077: step: 36/464, loss: 0.5528953671455383
2023-01-22 12:12:01.937224: step: 38/464, loss: 0.6319125890731812
2023-01-22 12:12:02.797988: step: 40/464, loss: 0.5621069073677063
2023-01-22 12:12:03.484098: step: 42/464, loss:
0.9505723118782043 2023-01-22 12:12:04.274437: step: 44/464, loss: 1.7991302013397217 2023-01-22 12:12:04.973298: step: 46/464, loss: 6.067421913146973 2023-01-22 12:12:05.684634: step: 48/464, loss: 0.37295812368392944 2023-01-22 12:12:06.413312: step: 50/464, loss: 0.19374096393585205 2023-01-22 12:12:07.155985: step: 52/464, loss: 0.2651880979537964 2023-01-22 12:12:07.960146: step: 54/464, loss: 0.6591582298278809 2023-01-22 12:12:08.839037: step: 56/464, loss: 1.1974310874938965 2023-01-22 12:12:09.596640: step: 58/464, loss: 0.44687268137931824 2023-01-22 12:12:10.419919: step: 60/464, loss: 0.8164681196212769 2023-01-22 12:12:11.216492: step: 62/464, loss: 2.135221004486084 2023-01-22 12:12:12.045902: step: 64/464, loss: 1.5907602310180664 2023-01-22 12:12:12.785835: step: 66/464, loss: 0.6092633008956909 2023-01-22 12:12:13.469510: step: 68/464, loss: 0.4194296598434448 2023-01-22 12:12:14.173948: step: 70/464, loss: 0.2739868462085724 2023-01-22 12:12:14.883672: step: 72/464, loss: 0.4589228630065918 2023-01-22 12:12:15.596631: step: 74/464, loss: 1.0880742073059082 2023-01-22 12:12:16.357314: step: 76/464, loss: 3.139005184173584 2023-01-22 12:12:17.135306: step: 78/464, loss: 0.6059046387672424 2023-01-22 12:12:17.836597: step: 80/464, loss: 4.217742919921875 2023-01-22 12:12:18.611717: step: 82/464, loss: 0.2161814421415329 2023-01-22 12:12:19.360913: step: 84/464, loss: 0.6917901039123535 2023-01-22 12:12:20.151841: step: 86/464, loss: 0.4375626742839813 2023-01-22 12:12:20.947755: step: 88/464, loss: 1.4799258708953857 2023-01-22 12:12:21.768126: step: 90/464, loss: 0.49118438363075256 2023-01-22 12:12:22.515834: step: 92/464, loss: 0.2365902066230774 2023-01-22 12:12:23.340850: step: 94/464, loss: 0.541080117225647 2023-01-22 12:12:24.174093: step: 96/464, loss: 0.5610907673835754 2023-01-22 12:12:24.960409: step: 98/464, loss: 0.505711019039154 2023-01-22 12:12:25.682624: step: 100/464, loss: 0.6612415909767151 2023-01-22 12:12:26.421010: step: 
102/464, loss: 0.38282832503318787 2023-01-22 12:12:27.213232: step: 104/464, loss: 7.262805938720703 2023-01-22 12:12:27.944213: step: 106/464, loss: 0.750267744064331 2023-01-22 12:12:28.718637: step: 108/464, loss: 0.32267841696739197 2023-01-22 12:12:29.457537: step: 110/464, loss: 0.5233975052833557 2023-01-22 12:12:30.194798: step: 112/464, loss: 1.1282964944839478 2023-01-22 12:12:30.912690: step: 114/464, loss: 0.3750358521938324 2023-01-22 12:12:31.659254: step: 116/464, loss: 0.2865687906742096 2023-01-22 12:12:32.354643: step: 118/464, loss: 0.23131638765335083 2023-01-22 12:12:33.176672: step: 120/464, loss: 0.7558478116989136 2023-01-22 12:12:33.963796: step: 122/464, loss: 0.8477310538291931 2023-01-22 12:12:34.687289: step: 124/464, loss: 0.24333900213241577 2023-01-22 12:12:35.413334: step: 126/464, loss: 0.3285759687423706 2023-01-22 12:12:36.131524: step: 128/464, loss: 0.7810271382331848 2023-01-22 12:12:36.923920: step: 130/464, loss: 0.33401158452033997 2023-01-22 12:12:37.691930: step: 132/464, loss: 0.24617628753185272 2023-01-22 12:12:38.437341: step: 134/464, loss: 0.30306100845336914 2023-01-22 12:12:39.172031: step: 136/464, loss: 6.122437477111816 2023-01-22 12:12:39.932250: step: 138/464, loss: 0.17895404994487762 2023-01-22 12:12:40.669778: step: 140/464, loss: 0.3743041157722473 2023-01-22 12:12:41.358842: step: 142/464, loss: 0.9956777095794678 2023-01-22 12:12:42.204478: step: 144/464, loss: 0.5255347490310669 2023-01-22 12:12:43.002623: step: 146/464, loss: 1.6436017751693726 2023-01-22 12:12:43.773813: step: 148/464, loss: 0.44622185826301575 2023-01-22 12:12:44.451093: step: 150/464, loss: 0.4024694561958313 2023-01-22 12:12:45.167138: step: 152/464, loss: 1.1937528848648071 2023-01-22 12:12:46.001092: step: 154/464, loss: 0.5776351690292358 2023-01-22 12:12:46.837023: step: 156/464, loss: 0.6195184588432312 2023-01-22 12:12:47.683728: step: 158/464, loss: 0.24424467980861664 2023-01-22 12:12:48.429921: step: 160/464, loss: 
2.263171672821045 2023-01-22 12:12:49.124159: step: 162/464, loss: 1.060383915901184 2023-01-22 12:12:49.931155: step: 164/464, loss: 0.5988831520080566 2023-01-22 12:12:50.721761: step: 166/464, loss: 0.6007016897201538 2023-01-22 12:12:51.467949: step: 168/464, loss: 1.3479111194610596 2023-01-22 12:12:52.229260: step: 170/464, loss: 2.0738675594329834 2023-01-22 12:12:52.959913: step: 172/464, loss: 1.2061904668807983 2023-01-22 12:12:53.722331: step: 174/464, loss: 0.7636981010437012 2023-01-22 12:12:54.455694: step: 176/464, loss: 0.5179991126060486 2023-01-22 12:12:55.207563: step: 178/464, loss: 1.457108497619629 2023-01-22 12:12:55.983159: step: 180/464, loss: 0.7868736386299133 2023-01-22 12:12:56.753161: step: 182/464, loss: 0.5773451328277588 2023-01-22 12:12:57.470641: step: 184/464, loss: 0.21230916678905487 2023-01-22 12:12:58.252415: step: 186/464, loss: 0.5763471126556396 2023-01-22 12:12:59.043872: step: 188/464, loss: 0.23866167664527893 2023-01-22 12:12:59.797895: step: 190/464, loss: 0.3690454661846161 2023-01-22 12:13:00.560111: step: 192/464, loss: 0.975060224533081 2023-01-22 12:13:01.308797: step: 194/464, loss: 0.6560547947883606 2023-01-22 12:13:02.046258: step: 196/464, loss: 0.3782944679260254 2023-01-22 12:13:02.808237: step: 198/464, loss: 0.33345192670822144 2023-01-22 12:13:03.539882: step: 200/464, loss: 0.810796856880188 2023-01-22 12:13:04.290085: step: 202/464, loss: 0.35488876700401306 2023-01-22 12:13:05.012877: step: 204/464, loss: 1.0040911436080933 2023-01-22 12:13:05.748037: step: 206/464, loss: 0.3226962089538574 2023-01-22 12:13:06.500050: step: 208/464, loss: 0.6221941113471985 2023-01-22 12:13:07.246657: step: 210/464, loss: 0.4910554885864258 2023-01-22 12:13:08.146878: step: 212/464, loss: 0.7612470984458923 2023-01-22 12:13:08.880798: step: 214/464, loss: 0.5407028794288635 2023-01-22 12:13:09.587520: step: 216/464, loss: 3.4458680152893066 2023-01-22 12:13:10.373739: step: 218/464, loss: 0.4417225122451782 
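The eval summaries printed in this log (e.g. the epoch-5 block above) are internally consistent with each 'f1' being the harmonic mean of 'p' and 'r', and 'combined' being the product of template F1 and slot F1. A minimal sketch to sanity-check that arithmetic — inferred from the logged numbers, not taken from train.py:

```python
# Sanity-check the metric arithmetic seen in the eval summaries.
# NOTE: the formulas below are inferred from the logged values; the
# actual scoring code in train.py is not shown in this log.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    """Combined score: product of template F1 and slot F1."""
    return template_f1 * slot_f1

# Dev Chinese figures from the epoch-5 summary in this log:
t = f1(1.0, 0.5833333333333334)                    # ~0.7368421052631579
s = f1(0.27596925661816635, 0.35175626837883706)   # ~0.30928777635974347
print(combined(t, s))                              # ~0.22789625626507412
```

The check reproduces the logged 'combined' value for Dev Chinese to floating-point precision, which supports the inferred formulas.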
2023-01-22 12:13:11.081636: step: 220/464, loss: 0.5013437271118164 2023-01-22 12:13:11.860152: step: 222/464, loss: 0.36002710461616516 2023-01-22 12:13:12.573677: step: 224/464, loss: 0.42992568016052246 2023-01-22 12:13:13.269276: step: 226/464, loss: 0.6748926639556885 2023-01-22 12:13:14.045343: step: 228/464, loss: 0.6752133369445801 2023-01-22 12:13:14.843049: step: 230/464, loss: 1.1018412113189697 2023-01-22 12:13:15.529884: step: 232/464, loss: 1.4467039108276367 2023-01-22 12:13:16.229938: step: 234/464, loss: 0.5036592483520508 2023-01-22 12:13:17.016005: step: 236/464, loss: 1.1971505880355835 2023-01-22 12:13:17.750757: step: 238/464, loss: 0.7818788886070251 2023-01-22 12:13:18.608127: step: 240/464, loss: 0.3119359016418457 2023-01-22 12:13:19.365151: step: 242/464, loss: 1.0075072050094604 2023-01-22 12:13:20.068013: step: 244/464, loss: 0.7762327194213867 2023-01-22 12:13:20.748545: step: 246/464, loss: 0.29996880888938904 2023-01-22 12:13:21.492107: step: 248/464, loss: 0.5244670510292053 2023-01-22 12:13:22.225616: step: 250/464, loss: 0.4027194082736969 2023-01-22 12:13:23.023864: step: 252/464, loss: 0.3438206911087036 2023-01-22 12:13:23.780551: step: 254/464, loss: 0.3174258768558502 2023-01-22 12:13:24.539660: step: 256/464, loss: 0.44195500016212463 2023-01-22 12:13:25.283852: step: 258/464, loss: 0.7431162595748901 2023-01-22 12:13:26.034189: step: 260/464, loss: 0.390606552362442 2023-01-22 12:13:26.793840: step: 262/464, loss: 0.21001914143562317 2023-01-22 12:13:27.549668: step: 264/464, loss: 0.7312803864479065 2023-01-22 12:13:28.263188: step: 266/464, loss: 0.41264137625694275 2023-01-22 12:13:29.004822: step: 268/464, loss: 1.7871190309524536 2023-01-22 12:13:29.755208: step: 270/464, loss: 1.6481120586395264 2023-01-22 12:13:30.580524: step: 272/464, loss: 0.2945783734321594 2023-01-22 12:13:31.369504: step: 274/464, loss: 0.9895068407058716 2023-01-22 12:13:32.119211: step: 276/464, loss: 1.1285908222198486 2023-01-22 
12:13:32.833083: step: 278/464, loss: 2.258369207382202 2023-01-22 12:13:33.595416: step: 280/464, loss: 1.4946790933609009 2023-01-22 12:13:34.445171: step: 282/464, loss: 1.9093759059906006 2023-01-22 12:13:35.189356: step: 284/464, loss: 0.6587912440299988 2023-01-22 12:13:35.945055: step: 286/464, loss: 0.5902923345565796 2023-01-22 12:13:36.799998: step: 288/464, loss: 0.2788597345352173 2023-01-22 12:13:37.534931: step: 290/464, loss: 0.7567580938339233 2023-01-22 12:13:38.239893: step: 292/464, loss: 0.2813103497028351 2023-01-22 12:13:38.939285: step: 294/464, loss: 0.3289432227611542 2023-01-22 12:13:39.677747: step: 296/464, loss: 0.3027788996696472 2023-01-22 12:13:40.461402: step: 298/464, loss: 0.5907791256904602 2023-01-22 12:13:41.267548: step: 300/464, loss: 0.9173389673233032 2023-01-22 12:13:42.028178: step: 302/464, loss: 0.2431243360042572 2023-01-22 12:13:42.829767: step: 304/464, loss: 1.1197048425674438 2023-01-22 12:13:43.599865: step: 306/464, loss: 0.37354084849357605 2023-01-22 12:13:44.288543: step: 308/464, loss: 0.1799282729625702 2023-01-22 12:13:45.038848: step: 310/464, loss: 0.6075442433357239 2023-01-22 12:13:45.900852: step: 312/464, loss: 0.3274470865726471 2023-01-22 12:13:46.656434: step: 314/464, loss: 0.6684872508049011 2023-01-22 12:13:47.372693: step: 316/464, loss: 1.3049945831298828 2023-01-22 12:13:48.107358: step: 318/464, loss: 0.777603268623352 2023-01-22 12:13:48.848846: step: 320/464, loss: 0.567020833492279 2023-01-22 12:13:49.590608: step: 322/464, loss: 0.32762405276298523 2023-01-22 12:13:50.329631: step: 324/464, loss: 0.5362238883972168 2023-01-22 12:13:51.149780: step: 326/464, loss: 0.2942344546318054 2023-01-22 12:13:51.896576: step: 328/464, loss: 0.3326949179172516 2023-01-22 12:13:52.608309: step: 330/464, loss: 0.7896073460578918 2023-01-22 12:13:53.396830: step: 332/464, loss: 0.25079795718193054 2023-01-22 12:13:54.174348: step: 334/464, loss: 0.3522076904773712 2023-01-22 12:13:55.004857: step: 
336/464, loss: 0.6076290607452393 2023-01-22 12:13:55.698117: step: 338/464, loss: 0.27255961298942566 2023-01-22 12:13:56.567963: step: 340/464, loss: 0.5010226368904114 2023-01-22 12:13:57.276901: step: 342/464, loss: 0.2654639780521393 2023-01-22 12:13:57.986890: step: 344/464, loss: 1.3633586168289185 2023-01-22 12:13:58.719544: step: 346/464, loss: 0.7960776686668396 2023-01-22 12:13:59.482980: step: 348/464, loss: 0.8121628165245056 2023-01-22 12:14:00.351877: step: 350/464, loss: 0.8419029116630554 2023-01-22 12:14:01.164618: step: 352/464, loss: 2.2198326587677 2023-01-22 12:14:01.986322: step: 354/464, loss: 0.5667412281036377 2023-01-22 12:14:02.777752: step: 356/464, loss: 0.6649174690246582 2023-01-22 12:14:03.554394: step: 358/464, loss: 0.24803166091442108 2023-01-22 12:14:04.283732: step: 360/464, loss: 0.6043193340301514 2023-01-22 12:14:05.054896: step: 362/464, loss: 0.651032030582428 2023-01-22 12:14:05.771649: step: 364/464, loss: 0.4023246765136719 2023-01-22 12:14:06.556913: step: 366/464, loss: 1.8113951683044434 2023-01-22 12:14:07.300743: step: 368/464, loss: 0.25779157876968384 2023-01-22 12:14:08.023488: step: 370/464, loss: 1.8274005651474 2023-01-22 12:14:08.761540: step: 372/464, loss: 1.1966185569763184 2023-01-22 12:14:09.478333: step: 374/464, loss: 0.8468092083930969 2023-01-22 12:14:10.216725: step: 376/464, loss: 0.6065338850021362 2023-01-22 12:14:11.008257: step: 378/464, loss: 0.2458256334066391 2023-01-22 12:14:11.921313: step: 380/464, loss: 2.1111176013946533 2023-01-22 12:14:12.702380: step: 382/464, loss: 0.22330960631370544 2023-01-22 12:14:13.434998: step: 384/464, loss: 1.5689202547073364 2023-01-22 12:14:14.225878: step: 386/464, loss: 0.49095427989959717 2023-01-22 12:14:15.028132: step: 388/464, loss: 0.340639591217041 2023-01-22 12:14:15.873173: step: 390/464, loss: 1.0026850700378418 2023-01-22 12:14:16.650948: step: 392/464, loss: 0.7220879793167114 2023-01-22 12:14:17.425752: step: 394/464, loss: 
0.8729772567749023 2023-01-22 12:14:18.187147: step: 396/464, loss: 0.26813337206840515 2023-01-22 12:14:18.987474: step: 398/464, loss: 0.6239210963249207 2023-01-22 12:14:19.823369: step: 400/464, loss: 0.20718388259410858 2023-01-22 12:14:20.668029: step: 402/464, loss: 0.2550746202468872 2023-01-22 12:14:21.485110: step: 404/464, loss: 0.849943995475769 2023-01-22 12:14:22.245749: step: 406/464, loss: 0.5320936441421509 2023-01-22 12:14:22.991107: step: 408/464, loss: 0.2525803744792938 2023-01-22 12:14:23.794014: step: 410/464, loss: 0.4498633146286011 2023-01-22 12:14:24.490581: step: 412/464, loss: 0.7323371171951294 2023-01-22 12:14:25.223349: step: 414/464, loss: 0.20026081800460815 2023-01-22 12:14:26.055811: step: 416/464, loss: 0.7177056670188904 2023-01-22 12:14:26.904455: step: 418/464, loss: 0.39402830600738525 2023-01-22 12:14:27.603160: step: 420/464, loss: 0.22038528323173523 2023-01-22 12:14:28.335553: step: 422/464, loss: 1.1169726848602295 2023-01-22 12:14:29.091879: step: 424/464, loss: 0.6293752193450928 2023-01-22 12:14:29.807570: step: 426/464, loss: 0.3213549554347992 2023-01-22 12:14:30.547402: step: 428/464, loss: 1.2308756113052368 2023-01-22 12:14:31.236987: step: 430/464, loss: 1.4570958614349365 2023-01-22 12:14:31.940945: step: 432/464, loss: 0.6932898759841919 2023-01-22 12:14:32.679868: step: 434/464, loss: 0.6488903164863586 2023-01-22 12:14:33.387864: step: 436/464, loss: 0.09309423714876175 2023-01-22 12:14:34.172530: step: 438/464, loss: 1.178100347518921 2023-01-22 12:14:34.915517: step: 440/464, loss: 2.2519099712371826 2023-01-22 12:14:35.699862: step: 442/464, loss: 1.376326322555542 2023-01-22 12:14:36.497183: step: 444/464, loss: 0.41640615463256836 2023-01-22 12:14:37.282202: step: 446/464, loss: 0.6458289623260498 2023-01-22 12:14:38.110197: step: 448/464, loss: 0.5401825308799744 2023-01-22 12:14:38.830475: step: 450/464, loss: 1.0886940956115723 2023-01-22 12:14:39.627254: step: 452/464, loss: 1.7589441537857056 
2023-01-22 12:14:40.375398: step: 454/464, loss: 0.406145304441452 2023-01-22 12:14:41.152454: step: 456/464, loss: 0.26355504989624023 2023-01-22 12:14:41.883551: step: 458/464, loss: 0.23833003640174866 2023-01-22 12:14:42.633250: step: 460/464, loss: 0.5560361742973328 2023-01-22 12:14:43.373252: step: 462/464, loss: 0.5383360981941223 2023-01-22 12:14:44.138781: step: 464/464, loss: 1.2050065994262695 2023-01-22 12:14:44.921266: step: 466/464, loss: 0.5009714365005493 2023-01-22 12:14:45.611280: step: 468/464, loss: 0.5247203707695007 2023-01-22 12:14:46.395962: step: 470/464, loss: 0.24868924915790558 2023-01-22 12:14:47.133405: step: 472/464, loss: 0.8285905122756958 2023-01-22 12:14:47.930422: step: 474/464, loss: 0.3940313458442688 2023-01-22 12:14:48.651254: step: 476/464, loss: 0.5545944571495056 2023-01-22 12:14:49.360077: step: 478/464, loss: 1.8019726276397705 2023-01-22 12:14:50.190762: step: 480/464, loss: 1.1479442119598389 2023-01-22 12:14:50.983473: step: 482/464, loss: 0.43199148774147034 2023-01-22 12:14:51.791303: step: 484/464, loss: 0.1521385759115219 2023-01-22 12:14:52.550621: step: 486/464, loss: 3.1033084392547607 2023-01-22 12:14:53.246121: step: 488/464, loss: 0.5504053235054016 2023-01-22 12:14:54.042249: step: 490/464, loss: 2.8585193157196045 2023-01-22 12:14:54.813893: step: 492/464, loss: 0.5221114158630371 2023-01-22 12:14:55.577508: step: 494/464, loss: 0.24164840579032898 2023-01-22 12:14:56.353125: step: 496/464, loss: 0.649586021900177 2023-01-22 12:14:57.145728: step: 498/464, loss: 0.20573021471500397 2023-01-22 12:14:57.925964: step: 500/464, loss: 0.356124609708786 2023-01-22 12:14:58.735876: step: 502/464, loss: 1.0682035684585571 2023-01-22 12:14:59.468745: step: 504/464, loss: 0.2562498450279236 2023-01-22 12:15:00.128750: step: 506/464, loss: 0.9069114327430725 2023-01-22 12:15:00.858503: step: 508/464, loss: 0.260059118270874 2023-01-22 12:15:01.655889: step: 510/464, loss: 0.5727384090423584 2023-01-22 
12:15:02.413271: step: 512/464, loss: 0.6543835997581482 2023-01-22 12:15:03.167902: step: 514/464, loss: 1.037026047706604 2023-01-22 12:15:03.903481: step: 516/464, loss: 0.5035011768341064 2023-01-22 12:15:04.658150: step: 518/464, loss: 0.40311571955680847 2023-01-22 12:15:05.439136: step: 520/464, loss: 0.6194513440132141 2023-01-22 12:15:06.205665: step: 522/464, loss: 0.9201607704162598 2023-01-22 12:15:06.965678: step: 524/464, loss: 0.37401604652404785 2023-01-22 12:15:07.763445: step: 526/464, loss: 0.3895934224128723 2023-01-22 12:15:08.497971: step: 528/464, loss: 0.2508663535118103 2023-01-22 12:15:09.187417: step: 530/464, loss: 0.6469005942344666 2023-01-22 12:15:09.954920: step: 532/464, loss: 4.749858856201172 2023-01-22 12:15:10.816446: step: 534/464, loss: 0.15832938253879547 2023-01-22 12:15:11.536869: step: 536/464, loss: 0.6010364890098572 2023-01-22 12:15:12.318381: step: 538/464, loss: 0.9541651010513306 2023-01-22 12:15:13.187852: step: 540/464, loss: 0.7156731486320496 2023-01-22 12:15:13.891586: step: 542/464, loss: 0.26916003227233887 2023-01-22 12:15:14.648543: step: 544/464, loss: 0.34205710887908936 2023-01-22 12:15:15.409962: step: 546/464, loss: 0.38013705611228943 2023-01-22 12:15:16.198124: step: 548/464, loss: 4.3453240394592285 2023-01-22 12:15:16.934723: step: 550/464, loss: 1.071614384651184 2023-01-22 12:15:17.725035: step: 552/464, loss: 0.4483928978443146 2023-01-22 12:15:18.459060: step: 554/464, loss: 1.2307825088500977 2023-01-22 12:15:19.263096: step: 556/464, loss: 0.25892454385757446 2023-01-22 12:15:19.995163: step: 558/464, loss: 2.0551559925079346 2023-01-22 12:15:20.863353: step: 560/464, loss: 5.914419174194336 2023-01-22 12:15:21.641796: step: 562/464, loss: 0.41530847549438477 2023-01-22 12:15:22.397575: step: 564/464, loss: 0.9857712388038635 2023-01-22 12:15:23.206598: step: 566/464, loss: 0.3567578196525574 2023-01-22 12:15:23.926391: step: 568/464, loss: 0.18065162003040314 2023-01-22 12:15:24.741752: step: 
570/464, loss: 0.5749038457870483 2023-01-22 12:15:25.463949: step: 572/464, loss: 0.13398727774620056 2023-01-22 12:15:26.145762: step: 574/464, loss: 0.3920579254627228 2023-01-22 12:15:27.010922: step: 576/464, loss: 1.651190161705017 2023-01-22 12:15:27.732988: step: 578/464, loss: 0.17924638092517853 2023-01-22 12:15:28.443440: step: 580/464, loss: 0.8294220566749573 2023-01-22 12:15:29.226764: step: 582/464, loss: 0.4700474143028259 2023-01-22 12:15:30.010387: step: 584/464, loss: 0.7171123623847961 2023-01-22 12:15:30.804673: step: 586/464, loss: 0.5684266090393066 2023-01-22 12:15:31.512396: step: 588/464, loss: 0.5454313158988953 2023-01-22 12:15:32.288598: step: 590/464, loss: 0.331861287355423 2023-01-22 12:15:33.009678: step: 592/464, loss: 0.1441434621810913 2023-01-22 12:15:33.768334: step: 594/464, loss: 0.8970625996589661 2023-01-22 12:15:34.500616: step: 596/464, loss: 0.77960604429245 2023-01-22 12:15:35.323497: step: 598/464, loss: 1.413642406463623 2023-01-22 12:15:36.055357: step: 600/464, loss: 0.2677861154079437 2023-01-22 12:15:36.747623: step: 602/464, loss: 0.2316088229417801 2023-01-22 12:15:37.444208: step: 604/464, loss: 0.7844187021255493 2023-01-22 12:15:38.155630: step: 606/464, loss: 0.8072429299354553 2023-01-22 12:15:38.892166: step: 608/464, loss: 0.3729170262813568 2023-01-22 12:15:39.606116: step: 610/464, loss: 1.2633850574493408 2023-01-22 12:15:40.348561: step: 612/464, loss: 0.3410780429840088 2023-01-22 12:15:41.167254: step: 614/464, loss: 0.32746434211730957 2023-01-22 12:15:41.986956: step: 616/464, loss: 0.283867210149765 2023-01-22 12:15:42.739391: step: 618/464, loss: 0.36813536286354065 2023-01-22 12:15:43.432212: step: 620/464, loss: 0.30180618166923523 2023-01-22 12:15:44.129505: step: 622/464, loss: 0.547936201095581 2023-01-22 12:15:44.819022: step: 624/464, loss: 0.12627914547920227 2023-01-22 12:15:45.532662: step: 626/464, loss: 0.2586286664009094 2023-01-22 12:15:46.325357: step: 628/464, loss: 
0.186224102973938 2023-01-22 12:15:47.124519: step: 630/464, loss: 0.5099416375160217 2023-01-22 12:15:47.879581: step: 632/464, loss: 0.3976011574268341 2023-01-22 12:15:48.624741: step: 634/464, loss: 1.3149917125701904 2023-01-22 12:15:49.337984: step: 636/464, loss: 0.35117021203041077 2023-01-22 12:15:50.069179: step: 638/464, loss: 0.4128319323062897 2023-01-22 12:15:50.765702: step: 640/464, loss: 1.0195266008377075 2023-01-22 12:15:51.493905: step: 642/464, loss: 0.30473464727401733 2023-01-22 12:15:52.225342: step: 644/464, loss: 0.833469808101654 2023-01-22 12:15:52.978729: step: 646/464, loss: 0.20172801613807678 2023-01-22 12:15:53.698979: step: 648/464, loss: 0.29524433612823486 2023-01-22 12:15:54.501694: step: 650/464, loss: 0.34519845247268677 2023-01-22 12:15:55.263416: step: 652/464, loss: 0.18610112369060516 2023-01-22 12:15:55.981973: step: 654/464, loss: 0.448944091796875 2023-01-22 12:15:56.887825: step: 656/464, loss: 0.7930111289024353 2023-01-22 12:15:57.638062: step: 658/464, loss: 0.4779053032398224 2023-01-22 12:15:58.384176: step: 660/464, loss: 1.270046353340149 2023-01-22 12:15:59.126142: step: 662/464, loss: 0.6414569020271301 2023-01-22 12:15:59.930568: step: 664/464, loss: 1.1048862934112549 2023-01-22 12:16:00.681512: step: 666/464, loss: 0.19807268679141998 2023-01-22 12:16:01.428287: step: 668/464, loss: 1.0151479244232178 2023-01-22 12:16:02.156192: step: 670/464, loss: 1.5276272296905518 2023-01-22 12:16:02.913144: step: 672/464, loss: 0.37346214056015015 2023-01-22 12:16:03.638363: step: 674/464, loss: 0.4739722013473511 2023-01-22 12:16:04.418467: step: 676/464, loss: 0.37458914518356323 2023-01-22 12:16:05.231301: step: 678/464, loss: 0.9428262114524841 2023-01-22 12:16:06.018448: step: 680/464, loss: 2.244079828262329 2023-01-22 12:16:06.798340: step: 682/464, loss: 0.25588732957839966 2023-01-22 12:16:07.529921: step: 684/464, loss: 0.09564242511987686 2023-01-22 12:16:08.302811: step: 686/464, loss: 0.44685637950897217 
2023-01-22 12:16:09.052304: step: 688/464, loss: 0.34719905257225037 2023-01-22 12:16:09.769450: step: 690/464, loss: 0.3859795033931732 2023-01-22 12:16:10.473647: step: 692/464, loss: 0.5400804877281189 2023-01-22 12:16:11.175930: step: 694/464, loss: 0.5231386423110962 2023-01-22 12:16:11.929393: step: 696/464, loss: 0.30685046315193176 2023-01-22 12:16:12.680356: step: 698/464, loss: 0.6772754192352295 2023-01-22 12:16:13.405661: step: 700/464, loss: 0.259883314371109 2023-01-22 12:16:14.112027: step: 702/464, loss: 0.6134689450263977 2023-01-22 12:16:14.933517: step: 704/464, loss: 0.29171958565711975 2023-01-22 12:16:15.686565: step: 706/464, loss: 0.130640909075737 2023-01-22 12:16:16.470196: step: 708/464, loss: 0.11610277742147446 2023-01-22 12:16:17.288254: step: 710/464, loss: 0.304213285446167 2023-01-22 12:16:17.984301: step: 712/464, loss: 0.7409196496009827 2023-01-22 12:16:18.646026: step: 714/464, loss: 0.31891071796417236 2023-01-22 12:16:19.410760: step: 716/464, loss: 0.728702962398529 2023-01-22 12:16:20.131830: step: 718/464, loss: 2.555067539215088 2023-01-22 12:16:20.867017: step: 720/464, loss: 0.33662861585617065 2023-01-22 12:16:21.583694: step: 722/464, loss: 0.1771378517150879 2023-01-22 12:16:22.455974: step: 724/464, loss: 2.431415319442749 2023-01-22 12:16:23.239257: step: 726/464, loss: 0.11213550716638565 2023-01-22 12:16:24.031382: step: 728/464, loss: 0.7706308960914612 2023-01-22 12:16:24.838816: step: 730/464, loss: 0.31764960289001465 2023-01-22 12:16:25.689392: step: 732/464, loss: 0.5397222638130188 2023-01-22 12:16:26.452979: step: 734/464, loss: 0.24206602573394775 2023-01-22 12:16:27.171370: step: 736/464, loss: 0.4958699941635132 2023-01-22 12:16:27.918186: step: 738/464, loss: 0.6772986054420471 2023-01-22 12:16:28.757451: step: 740/464, loss: 2.2552149295806885 2023-01-22 12:16:29.505174: step: 742/464, loss: 0.2647966146469116 2023-01-22 12:16:30.228070: step: 744/464, loss: 0.6256701946258545 2023-01-22 
12:16:30.917262: step: 746/464, loss: 0.27616631984710693 2023-01-22 12:16:31.708080: step: 748/464, loss: 0.3815680146217346 2023-01-22 12:16:32.393086: step: 750/464, loss: 0.36198049783706665 2023-01-22 12:16:33.136142: step: 752/464, loss: 0.10101629048585892 2023-01-22 12:16:33.804440: step: 754/464, loss: 1.5349050760269165 2023-01-22 12:16:34.470432: step: 756/464, loss: 1.0652238130569458 2023-01-22 12:16:35.260786: step: 758/464, loss: 0.4184451401233673 2023-01-22 12:16:35.920650: step: 760/464, loss: 0.4156523644924164 2023-01-22 12:16:36.698758: step: 762/464, loss: 0.3656218945980072 2023-01-22 12:16:37.446644: step: 764/464, loss: 1.1675796508789062 2023-01-22 12:16:38.218164: step: 766/464, loss: 0.6156561970710754 2023-01-22 12:16:39.039164: step: 768/464, loss: 0.32256415486335754 2023-01-22 12:16:39.903489: step: 770/464, loss: 1.1748402118682861 2023-01-22 12:16:40.602778: step: 772/464, loss: 0.9681693315505981 2023-01-22 12:16:41.422075: step: 774/464, loss: 0.25069674849510193 2023-01-22 12:16:42.219384: step: 776/464, loss: 0.368958979845047 2023-01-22 12:16:42.986356: step: 778/464, loss: 0.7839964628219604 2023-01-22 12:16:43.702140: step: 780/464, loss: 0.20218144357204437 2023-01-22 12:16:44.480023: step: 782/464, loss: 1.0326519012451172 2023-01-22 12:16:45.246034: step: 784/464, loss: 0.5584985613822937 2023-01-22 12:16:45.955274: step: 786/464, loss: 0.9903871417045593 2023-01-22 12:16:46.686541: step: 788/464, loss: 2.6020190715789795 2023-01-22 12:16:47.406068: step: 790/464, loss: 2.3147616386413574 2023-01-22 12:16:48.257273: step: 792/464, loss: 1.7034763097763062 2023-01-22 12:16:48.964599: step: 794/464, loss: 0.5222653746604919 2023-01-22 12:16:49.712035: step: 796/464, loss: 0.92058265209198 2023-01-22 12:16:50.421081: step: 798/464, loss: 0.3047405183315277 2023-01-22 12:16:51.199424: step: 800/464, loss: 0.3361084759235382 2023-01-22 12:16:51.902780: step: 802/464, loss: 0.63560950756073 2023-01-22 12:16:52.667532: step: 
804/464, loss: 0.42710667848587036 2023-01-22 12:16:53.415816: step: 806/464, loss: 1.273741364479065 2023-01-22 12:16:54.093568: step: 808/464, loss: 0.6142624020576477 2023-01-22 12:16:54.906027: step: 810/464, loss: 1.2906978130340576 2023-01-22 12:16:55.614790: step: 812/464, loss: 0.656749427318573 2023-01-22 12:16:56.382488: step: 814/464, loss: 0.5131410956382751 2023-01-22 12:16:57.106914: step: 816/464, loss: 1.9324876070022583 2023-01-22 12:16:57.927579: step: 818/464, loss: 0.3766292631626129 2023-01-22 12:16:58.606577: step: 820/464, loss: 1.3447926044464111 2023-01-22 12:16:59.362559: step: 822/464, loss: 0.8682775497436523 2023-01-22 12:17:00.069957: step: 824/464, loss: 0.21632665395736694 2023-01-22 12:17:00.825139: step: 826/464, loss: 0.4097214937210083 2023-01-22 12:17:01.570694: step: 828/464, loss: 0.25114351511001587 2023-01-22 12:17:02.348638: step: 830/464, loss: 0.2567175328731537 2023-01-22 12:17:03.047550: step: 832/464, loss: 1.210261344909668 2023-01-22 12:17:03.901855: step: 834/464, loss: 0.6198481321334839 2023-01-22 12:17:04.639094: step: 836/464, loss: 0.06847158074378967 2023-01-22 12:17:05.399909: step: 838/464, loss: 0.43360579013824463 2023-01-22 12:17:06.153525: step: 840/464, loss: 0.6006065011024475 2023-01-22 12:17:06.923673: step: 842/464, loss: 0.17225876450538635 2023-01-22 12:17:07.719959: step: 844/464, loss: 0.733666181564331 2023-01-22 12:17:08.427643: step: 846/464, loss: 0.3815063238143921 2023-01-22 12:17:09.225679: step: 848/464, loss: 0.484394371509552 2023-01-22 12:17:10.028276: step: 850/464, loss: 0.6926419734954834 2023-01-22 12:17:10.823587: step: 852/464, loss: 1.4232854843139648 2023-01-22 12:17:11.634332: step: 854/464, loss: 2.2661728858947754 2023-01-22 12:17:12.540966: step: 856/464, loss: 0.2177295684814453 2023-01-22 12:17:13.244512: step: 858/464, loss: 2.403341293334961 2023-01-22 12:17:13.941034: step: 860/464, loss: 0.27968692779541016 2023-01-22 12:17:14.713441: step: 862/464, loss: 
0.810583233833313 2023-01-22 12:17:15.496156: step: 864/464, loss: 1.0074455738067627 2023-01-22 12:17:16.280003: step: 866/464, loss: 1.2238481044769287 2023-01-22 12:17:17.033369: step: 868/464, loss: 0.25915518403053284 2023-01-22 12:17:17.742830: step: 870/464, loss: 0.522385835647583 2023-01-22 12:17:18.505996: step: 872/464, loss: 0.3643008768558502 2023-01-22 12:17:19.372859: step: 874/464, loss: 1.0815117359161377 2023-01-22 12:17:20.141925: step: 876/464, loss: 0.21362431347370148 2023-01-22 12:17:20.864480: step: 878/464, loss: 0.5841333866119385 2023-01-22 12:17:21.586190: step: 880/464, loss: 0.7333189249038696 2023-01-22 12:17:22.282686: step: 882/464, loss: 0.6409208178520203 2023-01-22 12:17:22.990668: step: 884/464, loss: 0.18310889601707458 2023-01-22 12:17:23.697832: step: 886/464, loss: 0.2629651427268982 2023-01-22 12:17:24.435784: step: 888/464, loss: 0.22283713519573212 2023-01-22 12:17:25.183964: step: 890/464, loss: 7.620507717132568 2023-01-22 12:17:25.955336: step: 892/464, loss: 0.3329012095928192 2023-01-22 12:17:26.645732: step: 894/464, loss: 0.2923496663570404 2023-01-22 12:17:27.571380: step: 896/464, loss: 0.5430619120597839 2023-01-22 12:17:28.248557: step: 898/464, loss: 0.46999070048332214 2023-01-22 12:17:28.951692: step: 900/464, loss: 0.30711182951927185 2023-01-22 12:17:29.729231: step: 902/464, loss: 0.360378623008728 2023-01-22 12:17:30.556506: step: 904/464, loss: 0.7840194702148438 2023-01-22 12:17:31.249556: step: 906/464, loss: 0.7028653621673584 2023-01-22 12:17:31.963189: step: 908/464, loss: 0.2442149817943573 2023-01-22 12:17:32.808669: step: 910/464, loss: 0.49424415826797485 2023-01-22 12:17:33.556281: step: 912/464, loss: 0.33642280101776123 2023-01-22 12:17:34.355460: step: 914/464, loss: 1.1186965703964233 2023-01-22 12:17:35.188117: step: 916/464, loss: 0.18287938833236694 2023-01-22 12:17:35.956037: step: 918/464, loss: 0.21742171049118042 2023-01-22 12:17:36.809002: step: 920/464, loss: 3.1498212814331055 
2023-01-22 12:17:37.634245: step: 922/464, loss: 0.5353689193725586 2023-01-22 12:17:38.356942: step: 924/464, loss: 0.5739856958389282 2023-01-22 12:17:39.184334: step: 926/464, loss: 1.5230512619018555 2023-01-22 12:17:39.931342: step: 928/464, loss: 1.3895797729492188 2023-01-22 12:17:40.593456: step: 930/464, loss: 0.4182075262069702
==================================================
Loss: 0.784
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2817282697206331, 'r': 0.33251420069494075, 'f1': 0.3050217297932703}, 'combined': 0.22475285353188337, 'epoch': 6}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2903003877480343, 'r': 0.2750969089430483, 'f1': 0.282494238305799}, 'combined': 0.17544379010570677, 'epoch': 6}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2653414101330768, 'r': 0.3262642006949408, 'f1': 0.292665929814866}, 'combined': 0.21564857986358546, 'epoch': 6}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2771622768868883, 'r': 0.2678505007859454, 'f1': 0.27242684100037867}, 'combined': 0.16919140651602466, 'epoch': 6}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2756479266347688, 'r': 0.327953036053131, 'f1': 0.2995342287694974}, 'combined': 0.22070943172489282, 'epoch': 6}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2939259120206898, 'r': 0.2791134401698448, 'f1': 0.2863282325918732}, 'combined': 0.17782490234653178, 'epoch': 6}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2169811320754717, 'r': 0.32857142857142857, 'f1': 0.26136363636363635}, 'combined': 0.17424242424242423, 'epoch': 6}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26666666666666666, 'r': 0.5217391304347826, 'f1': 0.3529411764705882}, 'combined': 0.1764705882352941, 'epoch': 6}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 6}
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2653414101330768, 'r': 0.3262642006949408, 'f1': 0.292665929814866}, 'combined': 0.21564857986358546, 'epoch': 6}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2771622768868883, 'r': 0.2678505007859454, 'f1': 0.27242684100037867}, 'combined': 0.16919140651602466, 'epoch': 6}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26666666666666666, 'r': 0.5217391304347826, 'f1': 0.3529411764705882}, 'combined': 0.1764705882352941, 'epoch': 6}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2756479266347688, 'r': 0.327953036053131, 'f1': 0.2995342287694974}, 'combined': 0.22070943172489282, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2939259120206898, 'r': 0.2791134401698448, 'f1': 0.2863282325918732}, 'combined': 0.17782490234653178, 'epoch': 6}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 6}
******************************
Epoch: 7
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:20:41.397771: step: 2/464, loss: 0.36327677965164185 2023-01-22 12:20:42.212010: step: 4/464, loss: 1.713686466217041 2023-01-22 12:20:42.961655: step: 6/464, loss: 0.20647361874580383 2023-01-22 12:20:43.675607: step: 8/464, loss: 0.32153114676475525 2023-01-22 12:20:44.507475: step: 10/464, loss: 0.21674101054668427 2023-01-22 12:20:45.321393: step: 12/464, loss: 0.2844339609146118 2023-01-22 12:20:46.090854: step: 14/464, loss: 0.13163089752197266 2023-01-22 12:20:46.833483: step: 16/464, loss: 0.12527650594711304 2023-01-22 12:20:47.491724: step: 18/464, loss: 1.9080713987350464 2023-01-22 12:20:48.362420: step: 20/464, loss: 0.2996969521045685 2023-01-22 12:20:49.109623: step: 22/464, loss: 0.4937852919101715 2023-01-22 12:20:49.872853: step: 24/464, loss: 0.17886342108249664 2023-01-22 12:20:50.623520: step: 26/464, loss: 1.044276475906372 2023-01-22 12:20:51.358720: step: 28/464, loss: 0.2835633456707001 2023-01-22 12:20:52.199470: step: 30/464, loss: 0.5925331115722656 2023-01-22 12:20:52.954475: step: 32/464, loss: 0.5911035537719727 2023-01-22 12:20:53.761515: step: 34/464, loss: 1.7006969451904297 2023-01-22 12:20:54.493217: step: 36/464, loss: 0.7910258769989014 2023-01-22 12:20:55.261695: step: 38/464,
loss: 1.1063692569732666 2023-01-22 12:20:55.996544: step: 40/464, loss: 0.3510200083255768 2023-01-22 12:20:56.729888: step: 42/464, loss: 0.4138941764831543 2023-01-22 12:20:57.408950: step: 44/464, loss: 0.6433913111686707 2023-01-22 12:20:58.118673: step: 46/464, loss: 0.19204004108905792 2023-01-22 12:20:58.864891: step: 48/464, loss: 0.41403651237487793 2023-01-22 12:20:59.580005: step: 50/464, loss: 0.9922606945037842 2023-01-22 12:21:00.390864: step: 52/464, loss: 0.2410106509923935 2023-01-22 12:21:01.128204: step: 54/464, loss: 0.993074893951416 2023-01-22 12:21:01.913178: step: 56/464, loss: 0.3063466548919678 2023-01-22 12:21:02.744021: step: 58/464, loss: 0.6400434970855713 2023-01-22 12:21:03.543361: step: 60/464, loss: 0.3732597827911377 2023-01-22 12:21:04.296407: step: 62/464, loss: 0.2601960301399231 2023-01-22 12:21:05.090270: step: 64/464, loss: 0.4548768699169159 2023-01-22 12:21:05.845225: step: 66/464, loss: 0.5139056444168091 2023-01-22 12:21:06.578036: step: 68/464, loss: 1.3860743045806885 2023-01-22 12:21:07.293058: step: 70/464, loss: 0.3661994934082031 2023-01-22 12:21:08.138603: step: 72/464, loss: 0.9115047454833984 2023-01-22 12:21:08.858336: step: 74/464, loss: 0.32631921768188477 2023-01-22 12:21:09.606869: step: 76/464, loss: 0.8258537650108337 2023-01-22 12:21:10.428204: step: 78/464, loss: 0.5958231687545776 2023-01-22 12:21:11.259909: step: 80/464, loss: 1.1353201866149902 2023-01-22 12:21:12.070036: step: 82/464, loss: 0.9883450269699097 2023-01-22 12:21:12.868517: step: 84/464, loss: 0.6555312871932983 2023-01-22 12:21:13.646304: step: 86/464, loss: 0.585017740726471 2023-01-22 12:21:14.337159: step: 88/464, loss: 0.1543656438589096 2023-01-22 12:21:15.136769: step: 90/464, loss: 0.2782514691352844 2023-01-22 12:21:15.901480: step: 92/464, loss: 5.046449184417725 2023-01-22 12:21:16.591190: step: 94/464, loss: 0.4302979111671448 2023-01-22 12:21:17.270862: step: 96/464, loss: 0.570051908493042 2023-01-22 12:21:18.076405: 
step: 98/464, loss: 0.4938129186630249 2023-01-22 12:21:18.853567: step: 100/464, loss: 0.5796581506729126 2023-01-22 12:21:19.619892: step: 102/464, loss: 0.3145013749599457 2023-01-22 12:21:20.384826: step: 104/464, loss: 0.49091294407844543 2023-01-22 12:21:21.108962: step: 106/464, loss: 2.0131349563598633 2023-01-22 12:21:21.835037: step: 108/464, loss: 0.31967562437057495 2023-01-22 12:21:22.622176: step: 110/464, loss: 0.2807406485080719 2023-01-22 12:21:23.401926: step: 112/464, loss: 0.3552629351615906 2023-01-22 12:21:24.153588: step: 114/464, loss: 0.15171876549720764 2023-01-22 12:21:24.905731: step: 116/464, loss: 0.29429763555526733 2023-01-22 12:21:25.725216: step: 118/464, loss: 0.8097703456878662 2023-01-22 12:21:26.460136: step: 120/464, loss: 0.12587079405784607 2023-01-22 12:21:27.243894: step: 122/464, loss: 0.15087245404720306 2023-01-22 12:21:28.002606: step: 124/464, loss: 0.16236327588558197 2023-01-22 12:21:28.794763: step: 126/464, loss: 0.6959328651428223 2023-01-22 12:21:29.639943: step: 128/464, loss: 0.2463809698820114 2023-01-22 12:21:30.333173: step: 130/464, loss: 0.21326425671577454 2023-01-22 12:21:31.040568: step: 132/464, loss: 0.5366947054862976 2023-01-22 12:21:31.734938: step: 134/464, loss: 0.3185555934906006 2023-01-22 12:21:32.493673: step: 136/464, loss: 0.20601888000965118 2023-01-22 12:21:33.200072: step: 138/464, loss: 0.3682593107223511 2023-01-22 12:21:33.915891: step: 140/464, loss: 0.21474689245224 2023-01-22 12:21:34.630566: step: 142/464, loss: 1.0219560861587524 2023-01-22 12:21:35.356814: step: 144/464, loss: 1.8815715312957764 2023-01-22 12:21:36.138260: step: 146/464, loss: 0.5428639054298401 2023-01-22 12:21:36.853969: step: 148/464, loss: 0.5084086656570435 2023-01-22 12:21:37.538129: step: 150/464, loss: 0.5396155714988708 2023-01-22 12:21:38.270327: step: 152/464, loss: 0.21716253459453583 2023-01-22 12:21:38.989031: step: 154/464, loss: 0.9855729937553406 2023-01-22 12:21:39.791739: step: 156/464, loss: 
0.3413148820400238 2023-01-22 12:21:40.712658: step: 158/464, loss: 0.25234007835388184 2023-01-22 12:21:41.583937: step: 160/464, loss: 0.3375931680202484 2023-01-22 12:21:42.311337: step: 162/464, loss: 0.4368821084499359 2023-01-22 12:21:43.059429: step: 164/464, loss: 0.6206197142601013 2023-01-22 12:21:43.850159: step: 166/464, loss: 0.24986518919467926 2023-01-22 12:21:44.581450: step: 168/464, loss: 1.968464970588684 2023-01-22 12:21:45.301941: step: 170/464, loss: 0.4953523576259613 2023-01-22 12:21:46.072020: step: 172/464, loss: 0.102138951420784 2023-01-22 12:21:46.792934: step: 174/464, loss: 0.6342282891273499 2023-01-22 12:21:47.540031: step: 176/464, loss: 0.5496773719787598 2023-01-22 12:21:48.245878: step: 178/464, loss: 0.21457117795944214 2023-01-22 12:21:49.052490: step: 180/464, loss: 0.7042529582977295 2023-01-22 12:21:49.794667: step: 182/464, loss: 0.8047784566879272 2023-01-22 12:21:50.702720: step: 184/464, loss: 0.2432776391506195 2023-01-22 12:21:51.477504: step: 186/464, loss: 0.27830901741981506 2023-01-22 12:21:52.179371: step: 188/464, loss: 0.03338748216629028 2023-01-22 12:21:52.936284: step: 190/464, loss: 0.8855580687522888 2023-01-22 12:21:53.638221: step: 192/464, loss: 0.1724783480167389 2023-01-22 12:21:54.407380: step: 194/464, loss: 0.39989563822746277 2023-01-22 12:21:55.201771: step: 196/464, loss: 0.5461003184318542 2023-01-22 12:21:56.001160: step: 198/464, loss: 3.4108574390411377 2023-01-22 12:21:56.744245: step: 200/464, loss: 0.42918553948402405 2023-01-22 12:21:57.555072: step: 202/464, loss: 0.8983579277992249 2023-01-22 12:21:58.384403: step: 204/464, loss: 0.3279099464416504 2023-01-22 12:21:59.105224: step: 206/464, loss: 0.2302406132221222 2023-01-22 12:21:59.790420: step: 208/464, loss: 1.0292377471923828 2023-01-22 12:22:00.524029: step: 210/464, loss: 0.5882940888404846 2023-01-22 12:22:01.283560: step: 212/464, loss: 0.17929045855998993 2023-01-22 12:22:02.088579: step: 214/464, loss: 0.1669246256351471 
2023-01-22 12:22:02.897589: step: 216/464, loss: 0.5283328294754028 2023-01-22 12:22:03.696035: step: 218/464, loss: 1.1631416082382202 2023-01-22 12:22:04.450418: step: 220/464, loss: 0.6150327920913696 2023-01-22 12:22:05.229750: step: 222/464, loss: 0.4816374182701111 2023-01-22 12:22:05.914341: step: 224/464, loss: 0.18757089972496033 2023-01-22 12:22:06.627428: step: 226/464, loss: 0.2773834466934204 2023-01-22 12:22:07.352029: step: 228/464, loss: 0.2142343968153 2023-01-22 12:22:08.112309: step: 230/464, loss: 0.19133417308330536 2023-01-22 12:22:08.859972: step: 232/464, loss: 0.4470163881778717 2023-01-22 12:22:09.627877: step: 234/464, loss: 0.5169656276702881 2023-01-22 12:22:10.461538: step: 236/464, loss: 1.0284165143966675 2023-01-22 12:22:11.239665: step: 238/464, loss: 1.306230902671814 2023-01-22 12:22:11.983237: step: 240/464, loss: 0.20086947083473206 2023-01-22 12:22:12.736765: step: 242/464, loss: 0.9084507822990417 2023-01-22 12:22:13.372512: step: 244/464, loss: 1.1629050970077515 2023-01-22 12:22:14.127480: step: 246/464, loss: 0.24601887166500092 2023-01-22 12:22:14.873218: step: 248/464, loss: 0.6524565815925598 2023-01-22 12:22:15.622785: step: 250/464, loss: 0.9458284974098206 2023-01-22 12:22:16.448356: step: 252/464, loss: 0.5239224433898926 2023-01-22 12:22:17.244184: step: 254/464, loss: 1.2422142028808594 2023-01-22 12:22:17.963505: step: 256/464, loss: 0.9221518635749817 2023-01-22 12:22:18.670143: step: 258/464, loss: 0.17750589549541473 2023-01-22 12:22:19.441156: step: 260/464, loss: 0.5422233939170837 2023-01-22 12:22:20.172695: step: 262/464, loss: 0.26078078150749207 2023-01-22 12:22:20.900354: step: 264/464, loss: 0.7859283089637756 2023-01-22 12:22:21.679183: step: 266/464, loss: 0.4978753328323364 2023-01-22 12:22:22.415164: step: 268/464, loss: 0.19332841038703918 2023-01-22 12:22:23.114137: step: 270/464, loss: 0.04691997542977333 2023-01-22 12:22:23.948704: step: 272/464, loss: 1.331789255142212 2023-01-22 
12:22:24.749107: step: 274/464, loss: 0.511642575263977 2023-01-22 12:22:25.485568: step: 276/464, loss: 0.16044507920742035 2023-01-22 12:22:26.283043: step: 278/464, loss: 0.45334386825561523 2023-01-22 12:22:26.958547: step: 280/464, loss: 0.25107744336128235 2023-01-22 12:22:27.723478: step: 282/464, loss: 0.14941652119159698 2023-01-22 12:22:28.499313: step: 284/464, loss: 0.753431499004364 2023-01-22 12:22:29.441054: step: 286/464, loss: 0.21342191100120544 2023-01-22 12:22:30.133968: step: 288/464, loss: 0.4793633222579956 2023-01-22 12:22:30.903557: step: 290/464, loss: 1.7517832517623901 2023-01-22 12:22:31.598902: step: 292/464, loss: 0.23970462381839752 2023-01-22 12:22:32.337448: step: 294/464, loss: 1.064223051071167 2023-01-22 12:22:33.174526: step: 296/464, loss: 0.584053099155426 2023-01-22 12:22:33.941391: step: 298/464, loss: 1.830169439315796 2023-01-22 12:22:34.744966: step: 300/464, loss: 0.1358712762594223 2023-01-22 12:22:35.487882: step: 302/464, loss: 1.2534114122390747 2023-01-22 12:22:36.278717: step: 304/464, loss: 0.5717884302139282 2023-01-22 12:22:37.028811: step: 306/464, loss: 0.220110222697258 2023-01-22 12:22:37.804190: step: 308/464, loss: 0.2528989613056183 2023-01-22 12:22:38.657919: step: 310/464, loss: 0.46339473128318787 2023-01-22 12:22:39.336118: step: 312/464, loss: 0.20236316323280334 2023-01-22 12:22:40.039190: step: 314/464, loss: 0.32077664136886597 2023-01-22 12:22:40.769500: step: 316/464, loss: 0.1633232831954956 2023-01-22 12:22:41.480623: step: 318/464, loss: 0.7350283861160278 2023-01-22 12:22:42.245923: step: 320/464, loss: 0.8395072221755981 2023-01-22 12:22:42.888948: step: 322/464, loss: 0.597608745098114 2023-01-22 12:22:43.624317: step: 324/464, loss: 0.34893375635147095 2023-01-22 12:22:44.342539: step: 326/464, loss: 0.7595546841621399 2023-01-22 12:22:45.126084: step: 328/464, loss: 0.7179768085479736 2023-01-22 12:22:45.890265: step: 330/464, loss: 0.24666252732276917 2023-01-22 12:22:46.584228: step: 
332/464, loss: 0.28931012749671936 2023-01-22 12:22:47.379467: step: 334/464, loss: 0.7932789325714111 2023-01-22 12:22:48.223567: step: 336/464, loss: 0.14844614267349243 2023-01-22 12:22:48.980968: step: 338/464, loss: 0.1988063007593155 2023-01-22 12:22:49.699721: step: 340/464, loss: 0.466692179441452 2023-01-22 12:22:50.442460: step: 342/464, loss: 0.20156744122505188 2023-01-22 12:22:51.156121: step: 344/464, loss: 0.4504646360874176 2023-01-22 12:22:51.905696: step: 346/464, loss: 0.7976289987564087 2023-01-22 12:22:52.642298: step: 348/464, loss: 0.5727869272232056 2023-01-22 12:22:53.469782: step: 350/464, loss: 0.13022443652153015 2023-01-22 12:22:54.249198: step: 352/464, loss: 0.1916579008102417 2023-01-22 12:22:55.072756: step: 354/464, loss: 0.6014117002487183 2023-01-22 12:22:55.844782: step: 356/464, loss: 0.18349871039390564 2023-01-22 12:22:56.563045: step: 358/464, loss: 0.382188081741333 2023-01-22 12:22:57.311498: step: 360/464, loss: 1.9701430797576904 2023-01-22 12:22:58.102490: step: 362/464, loss: 0.9048128128051758 2023-01-22 12:22:58.927657: step: 364/464, loss: 0.5793383717536926 2023-01-22 12:22:59.682844: step: 366/464, loss: 0.9921724796295166 2023-01-22 12:23:00.477954: step: 368/464, loss: 0.44221946597099304 2023-01-22 12:23:01.165737: step: 370/464, loss: 0.42944034934043884 2023-01-22 12:23:01.886345: step: 372/464, loss: 0.3041359484195709 2023-01-22 12:23:02.642546: step: 374/464, loss: 0.3399038314819336 2023-01-22 12:23:03.315077: step: 376/464, loss: 0.35421428084373474 2023-01-22 12:23:04.033481: step: 378/464, loss: 0.26342979073524475 2023-01-22 12:23:04.778560: step: 380/464, loss: 0.10741574317216873 2023-01-22 12:23:05.634265: step: 382/464, loss: 0.5449819564819336 2023-01-22 12:23:06.493224: step: 384/464, loss: 1.1310272216796875 2023-01-22 12:23:07.319551: step: 386/464, loss: 1.4800504446029663 2023-01-22 12:23:08.090714: step: 388/464, loss: 0.13597045838832855 2023-01-22 12:23:08.853772: step: 390/464, loss: 
0.8671262860298157 2023-01-22 12:23:09.612132: step: 392/464, loss: 0.5314968228340149 2023-01-22 12:23:10.372128: step: 394/464, loss: 0.5388355255126953 2023-01-22 12:23:11.136628: step: 396/464, loss: 2.4609920978546143 2023-01-22 12:23:11.937430: step: 398/464, loss: 0.32738056778907776 2023-01-22 12:23:12.718632: step: 400/464, loss: 0.4142988324165344 2023-01-22 12:23:13.521220: step: 402/464, loss: 0.41231632232666016 2023-01-22 12:23:14.274040: step: 404/464, loss: 0.8846166133880615 2023-01-22 12:23:14.992767: step: 406/464, loss: 0.2211620807647705 2023-01-22 12:23:15.749611: step: 408/464, loss: 0.31446048617362976 2023-01-22 12:23:16.481621: step: 410/464, loss: 1.0601859092712402 2023-01-22 12:23:17.254093: step: 412/464, loss: 1.1913033723831177 2023-01-22 12:23:18.087913: step: 414/464, loss: 0.6423701047897339 2023-01-22 12:23:18.884295: step: 416/464, loss: 0.6132628917694092 2023-01-22 12:23:19.710186: step: 418/464, loss: 1.12965726852417 2023-01-22 12:23:20.435061: step: 420/464, loss: 0.36937347054481506 2023-01-22 12:23:21.201061: step: 422/464, loss: 0.21653147041797638 2023-01-22 12:23:21.907094: step: 424/464, loss: 1.2950403690338135 2023-01-22 12:23:22.616920: step: 426/464, loss: 1.0248606204986572 2023-01-22 12:23:23.305432: step: 428/464, loss: 0.16431821882724762 2023-01-22 12:23:24.021563: step: 430/464, loss: 0.9855753779411316 2023-01-22 12:23:24.730158: step: 432/464, loss: 0.7400719523429871 2023-01-22 12:23:25.514025: step: 434/464, loss: 1.0649687051773071 2023-01-22 12:23:26.266405: step: 436/464, loss: 0.1411045640707016 2023-01-22 12:23:26.965743: step: 438/464, loss: 0.8169607520103455 2023-01-22 12:23:27.756005: step: 440/464, loss: 0.3585193455219269 2023-01-22 12:23:28.443587: step: 442/464, loss: 0.4122265577316284 2023-01-22 12:23:29.231141: step: 444/464, loss: 0.8370229601860046 2023-01-22 12:23:30.051309: step: 446/464, loss: 1.1493605375289917 2023-01-22 12:23:30.842251: step: 448/464, loss: 0.2955279052257538 
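An aside on the evaluation summaries interleaved with these step logs: the numbers are internally consistent with each 'f1' being the usual harmonic mean of the listed 'p' and 'r', and each 'combined' score being the template F1 multiplied by the slot F1. A minimal sketch reproducing the epoch-6 Dev Chinese figures under that assumption (the scoring code itself is not shown in this log):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Epoch-6 Dev Chinese figures taken from the log above.
template_f1 = f1(1.0, 0.5833333333333334)
slot_f1 = f1(0.2817282697206331, 0.33251420069494075)
combined = template_f1 * slot_f1

print(round(template_f1, 6))  # 0.736842
print(round(combined, 6))     # 0.224753
```

The same product relation holds for every Dev/Test/Sample line in the summary (e.g. Sample Russian: 0.6666... x 0.375 = 0.25).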
2023-01-22 12:23:31.600332: step: 450/464, loss: 0.16914665699005127 2023-01-22 12:23:32.343962: step: 452/464, loss: 0.3054179847240448 2023-01-22 12:23:33.063478: step: 454/464, loss: 0.31248268485069275 2023-01-22 12:23:33.811586: step: 456/464, loss: 1.0347895622253418 2023-01-22 12:23:34.522886: step: 458/464, loss: 0.4280281662940979 2023-01-22 12:23:35.318672: step: 460/464, loss: 0.2224617600440979 2023-01-22 12:23:36.066656: step: 462/464, loss: 0.4653012454509735 2023-01-22 12:23:36.850446: step: 464/464, loss: 1.165769338607788 2023-01-22 12:23:37.588894: step: 466/464, loss: 0.2663632333278656 2023-01-22 12:23:38.307641: step: 468/464, loss: 0.09995938092470169 2023-01-22 12:23:39.127699: step: 470/464, loss: 0.2406713366508484 2023-01-22 12:23:39.863641: step: 472/464, loss: 0.1282961070537567 2023-01-22 12:23:40.631506: step: 474/464, loss: 0.2567034959793091 2023-01-22 12:23:41.304821: step: 476/464, loss: 1.0045839548110962 2023-01-22 12:23:42.087692: step: 478/464, loss: 0.4196160137653351 2023-01-22 12:23:42.973882: step: 480/464, loss: 1.2827622890472412 2023-01-22 12:23:43.751151: step: 482/464, loss: 2.9640793800354004 2023-01-22 12:23:44.523457: step: 484/464, loss: 0.22173580527305603 2023-01-22 12:23:45.258178: step: 486/464, loss: 0.25746065378189087 2023-01-22 12:23:46.079212: step: 488/464, loss: 1.617734432220459 2023-01-22 12:23:46.905795: step: 490/464, loss: 0.5065245032310486 2023-01-22 12:23:47.635557: step: 492/464, loss: 0.17967504262924194 2023-01-22 12:23:48.340852: step: 494/464, loss: 0.39161476492881775 2023-01-22 12:23:48.991955: step: 496/464, loss: 0.4677339494228363 2023-01-22 12:23:49.730043: step: 498/464, loss: 0.4482991695404053 2023-01-22 12:23:50.601027: step: 500/464, loss: 0.6047283411026001 2023-01-22 12:23:51.443725: step: 502/464, loss: 0.8439066410064697 2023-01-22 12:23:52.161919: step: 504/464, loss: 0.3559965193271637 2023-01-22 12:23:52.911178: step: 506/464, loss: 0.7783767580986023 2023-01-22 
12:23:53.618018: step: 508/464, loss: 0.42967337369918823 2023-01-22 12:23:54.398716: step: 510/464, loss: 0.4983972907066345 2023-01-22 12:23:55.160128: step: 512/464, loss: 0.7379698157310486 2023-01-22 12:23:56.017870: step: 514/464, loss: 0.19468754529953003 2023-01-22 12:23:56.729900: step: 516/464, loss: 0.3303523659706116 2023-01-22 12:23:57.401945: step: 518/464, loss: 0.40996867418289185 2023-01-22 12:23:58.081696: step: 520/464, loss: 1.7850862741470337 2023-01-22 12:23:58.784028: step: 522/464, loss: 0.6261616945266724 2023-01-22 12:23:59.522341: step: 524/464, loss: 0.45365774631500244 2023-01-22 12:24:00.316694: step: 526/464, loss: 1.1625231504440308 2023-01-22 12:24:00.985325: step: 528/464, loss: 0.4727645516395569 2023-01-22 12:24:01.680669: step: 530/464, loss: 3.4565367698669434 2023-01-22 12:24:02.441313: step: 532/464, loss: 0.5150865316390991 2023-01-22 12:24:03.197985: step: 534/464, loss: 4.625498294830322 2023-01-22 12:24:03.922820: step: 536/464, loss: 0.5529197454452515 2023-01-22 12:24:04.749777: step: 538/464, loss: 0.3276940584182739 2023-01-22 12:24:05.461340: step: 540/464, loss: 0.3311607539653778 2023-01-22 12:24:06.160563: step: 542/464, loss: 1.1282060146331787 2023-01-22 12:24:06.913228: step: 544/464, loss: 0.18877974152565002 2023-01-22 12:24:07.649731: step: 546/464, loss: 0.5707278251647949 2023-01-22 12:24:08.390235: step: 548/464, loss: 0.24991458654403687 2023-01-22 12:24:09.278152: step: 550/464, loss: 0.7295461893081665 2023-01-22 12:24:10.019575: step: 552/464, loss: 2.1148874759674072 2023-01-22 12:24:10.759911: step: 554/464, loss: 1.4030309915542603 2023-01-22 12:24:11.538041: step: 556/464, loss: 0.21013569831848145 2023-01-22 12:24:12.372848: step: 558/464, loss: 1.9200247526168823 2023-01-22 12:24:13.120485: step: 560/464, loss: 0.3160833716392517 2023-01-22 12:24:13.813434: step: 562/464, loss: 0.34151044487953186 2023-01-22 12:24:14.568918: step: 564/464, loss: 1.735850214958191 2023-01-22 12:24:15.314941: 
step: 566/464, loss: 0.7673071026802063 2023-01-22 12:24:16.127754: step: 568/464, loss: 0.1795767992734909 2023-01-22 12:24:16.831148: step: 570/464, loss: 0.2937583327293396 2023-01-22 12:24:17.663245: step: 572/464, loss: 0.28772732615470886 2023-01-22 12:24:18.387387: step: 574/464, loss: 0.5424758791923523 2023-01-22 12:24:19.093697: step: 576/464, loss: 1.393399953842163 2023-01-22 12:24:19.856105: step: 578/464, loss: 0.27327778935432434 2023-01-22 12:24:20.538169: step: 580/464, loss: 0.2959989011287689 2023-01-22 12:24:21.341785: step: 582/464, loss: 1.3786649703979492 2023-01-22 12:24:22.126931: step: 584/464, loss: 0.6207864284515381 2023-01-22 12:24:22.864355: step: 586/464, loss: 0.14585839211940765 2023-01-22 12:24:23.576656: step: 588/464, loss: 0.752159595489502 2023-01-22 12:24:24.278747: step: 590/464, loss: 0.6962541937828064 2023-01-22 12:24:25.001608: step: 592/464, loss: 0.22196218371391296 2023-01-22 12:24:25.757867: step: 594/464, loss: 0.4736698865890503 2023-01-22 12:24:26.433431: step: 596/464, loss: 0.827335000038147 2023-01-22 12:24:27.165298: step: 598/464, loss: 0.9425488710403442 2023-01-22 12:24:27.892298: step: 600/464, loss: 0.35686421394348145 2023-01-22 12:24:28.629611: step: 602/464, loss: 0.2859801650047302 2023-01-22 12:24:29.385163: step: 604/464, loss: 0.21103359758853912 2023-01-22 12:24:30.044744: step: 606/464, loss: 0.19106243550777435 2023-01-22 12:24:30.916414: step: 608/464, loss: 0.6681709885597229 2023-01-22 12:24:31.719251: step: 610/464, loss: 0.1334463208913803 2023-01-22 12:24:32.502324: step: 612/464, loss: 0.3725147247314453 2023-01-22 12:24:33.323096: step: 614/464, loss: 0.8420844078063965 2023-01-22 12:24:34.093287: step: 616/464, loss: 1.3648327589035034 2023-01-22 12:24:34.828826: step: 618/464, loss: 1.5693870782852173 2023-01-22 12:24:35.613271: step: 620/464, loss: 0.35302260518074036 2023-01-22 12:24:36.456871: step: 622/464, loss: 0.11129667609930038 2023-01-22 12:24:37.178863: step: 624/464, loss: 
0.5374612212181091 2023-01-22 12:24:37.890841: step: 626/464, loss: 0.34831130504608154 2023-01-22 12:24:38.653511: step: 628/464, loss: 0.1023213267326355 2023-01-22 12:24:39.341545: step: 630/464, loss: 0.8473554849624634 2023-01-22 12:24:40.100330: step: 632/464, loss: 0.3351025879383087 2023-01-22 12:24:40.876857: step: 634/464, loss: 0.3838156461715698 2023-01-22 12:24:41.682283: step: 636/464, loss: 0.2758370339870453 2023-01-22 12:24:42.546771: step: 638/464, loss: 0.29054591059684753 2023-01-22 12:24:43.341566: step: 640/464, loss: 0.2623771131038666 2023-01-22 12:24:44.062092: step: 642/464, loss: 0.4661090672016144 2023-01-22 12:24:44.813208: step: 644/464, loss: 0.16628088057041168 2023-01-22 12:24:45.570551: step: 646/464, loss: 0.32660287618637085 2023-01-22 12:24:46.323367: step: 648/464, loss: 0.49122801423072815 2023-01-22 12:24:47.001799: step: 650/464, loss: 0.3161919116973877 2023-01-22 12:24:47.661068: step: 652/464, loss: 0.1668117344379425 2023-01-22 12:24:48.373699: step: 654/464, loss: 0.32301071286201477 2023-01-22 12:24:49.145969: step: 656/464, loss: 0.39199984073638916 2023-01-22 12:24:49.876777: step: 658/464, loss: 0.6262532472610474 2023-01-22 12:24:50.591410: step: 660/464, loss: 0.22475136816501617 2023-01-22 12:24:51.345793: step: 662/464, loss: 1.1950969696044922 2023-01-22 12:24:52.109613: step: 664/464, loss: 0.5805909037590027 2023-01-22 12:24:52.880151: step: 666/464, loss: 0.42047902941703796 2023-01-22 12:24:53.629916: step: 668/464, loss: 0.7445780634880066 2023-01-22 12:24:54.361057: step: 670/464, loss: 0.9380486607551575 2023-01-22 12:24:55.086534: step: 672/464, loss: 0.49395236372947693 2023-01-22 12:24:55.823330: step: 674/464, loss: 0.2148665338754654 2023-01-22 12:24:56.592258: step: 676/464, loss: 0.16246622800827026 2023-01-22 12:24:57.331311: step: 678/464, loss: 0.36869409680366516 2023-01-22 12:24:58.006793: step: 680/464, loss: 0.34773388504981995 2023-01-22 12:24:58.743977: step: 682/464, loss: 
0.4818330705165863 2023-01-22 12:24:59.489367: step: 684/464, loss: 0.3407643139362335 2023-01-22 12:25:00.337537: step: 686/464, loss: 0.5388532876968384 2023-01-22 12:25:01.124379: step: 688/464, loss: 0.9964404702186584 2023-01-22 12:25:01.872437: step: 690/464, loss: 0.3177805244922638 2023-01-22 12:25:02.638748: step: 692/464, loss: 1.2455317974090576 2023-01-22 12:25:03.414080: step: 694/464, loss: 0.6586301326751709 2023-01-22 12:25:04.160533: step: 696/464, loss: 0.932684600353241 2023-01-22 12:25:04.864921: step: 698/464, loss: 0.2300596386194229 2023-01-22 12:25:05.655153: step: 700/464, loss: 0.5244414210319519 2023-01-22 12:25:06.442048: step: 702/464, loss: 0.15712791681289673 2023-01-22 12:25:07.224189: step: 704/464, loss: 0.5612965226173401 2023-01-22 12:25:07.933314: step: 706/464, loss: 0.4752296805381775 2023-01-22 12:25:08.617785: step: 708/464, loss: 0.8295560479164124 2023-01-22 12:25:09.357596: step: 710/464, loss: 0.5479645133018494 2023-01-22 12:25:10.154247: step: 712/464, loss: 0.1347873955965042 2023-01-22 12:25:10.946313: step: 714/464, loss: 0.41366565227508545 2023-01-22 12:25:11.646930: step: 716/464, loss: 0.4166448414325714 2023-01-22 12:25:12.459706: step: 718/464, loss: 0.4311206340789795 2023-01-22 12:25:13.186902: step: 720/464, loss: 0.37743064761161804 2023-01-22 12:25:13.912249: step: 722/464, loss: 0.804713785648346 2023-01-22 12:25:14.666759: step: 724/464, loss: 0.38146859407424927 2023-01-22 12:25:15.432535: step: 726/464, loss: 0.2579302489757538 2023-01-22 12:25:16.223755: step: 728/464, loss: 0.23407909274101257 2023-01-22 12:25:16.997259: step: 730/464, loss: 0.618119478225708 2023-01-22 12:25:17.700328: step: 732/464, loss: 0.6577721834182739 2023-01-22 12:25:18.471892: step: 734/464, loss: 1.7237317562103271 2023-01-22 12:25:19.216218: step: 736/464, loss: 0.2821105122566223 2023-01-22 12:25:20.057121: step: 738/464, loss: 0.8553675413131714 2023-01-22 12:25:20.751257: step: 740/464, loss: 0.9743956327438354 
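Every step entry in this log follows the pattern `TIMESTAMP: step: N/TOTAL, loss: VALUE`, and each epoch summary reports a single aggregate (e.g. `Loss: 0.784`), presumably a mean over the epoch's step losses. A small parser for pulling losses back out of a log like this one might look as follows (the regex and function name are illustrative, not taken from the training script):

```python
import re

# Pattern inferred from the log format, e.g.
# "2023-01-22 12:25:21.466360: step: 742/464, loss: 0.7038723826408386"
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.]+)")

def mean_loss(log_text: str) -> float:
    """Average the loss values of every step entry found in log_text."""
    losses = [float(m.group(2)) for m in STEP_RE.finditer(log_text)]
    return sum(losses) / len(losses)

sample = ("2023-01-22 12:25:21.466360: step: 742/464, loss: 0.5 "
          "2023-01-22 12:25:22.147796: step: 744/464, loss: 0.7")
print(mean_loss(sample))  # 0.6
```

Note that step counters here run past the nominal total (e.g. `930/464`), so the denominator in `N/TOTAL` is not a reliable bound when parsing.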
2023-01-22 12:25:21.466360: step: 742/464, loss: 0.7038723826408386 2023-01-22 12:25:22.147796: step: 744/464, loss: 0.6714287996292114 2023-01-22 12:25:22.833770: step: 746/464, loss: 0.48815155029296875 2023-01-22 12:25:23.597713: step: 748/464, loss: 0.14024285972118378 2023-01-22 12:25:24.428798: step: 750/464, loss: 0.35646864771842957 2023-01-22 12:25:25.206259: step: 752/464, loss: 0.4604167342185974 2023-01-22 12:25:26.004824: step: 754/464, loss: 0.1984981894493103 2023-01-22 12:25:26.733877: step: 756/464, loss: 0.36041873693466187 2023-01-22 12:25:27.529598: step: 758/464, loss: 0.5603522062301636 2023-01-22 12:25:28.356958: step: 760/464, loss: 0.6194506883621216 2023-01-22 12:25:29.073022: step: 762/464, loss: 0.30803316831588745 2023-01-22 12:25:29.865607: step: 764/464, loss: 1.318713665008545 2023-01-22 12:25:30.709557: step: 766/464, loss: 0.13423074781894684 2023-01-22 12:25:31.549224: step: 768/464, loss: 0.3748876452445984 2023-01-22 12:25:32.333680: step: 770/464, loss: 1.5057021379470825 2023-01-22 12:25:33.122577: step: 772/464, loss: 0.6628919243812561 2023-01-22 12:25:33.821935: step: 774/464, loss: 0.6508646607398987 2023-01-22 12:25:34.551231: step: 776/464, loss: 0.5535852313041687 2023-01-22 12:25:35.304960: step: 778/464, loss: 0.6056692600250244 2023-01-22 12:25:36.087932: step: 780/464, loss: 0.1761878877878189 2023-01-22 12:25:36.948229: step: 782/464, loss: 1.3329436779022217 2023-01-22 12:25:37.691740: step: 784/464, loss: 0.22690290212631226 2023-01-22 12:25:38.414807: step: 786/464, loss: 0.4519822299480438 2023-01-22 12:25:39.140490: step: 788/464, loss: 1.4893604516983032 2023-01-22 12:25:39.836708: step: 790/464, loss: 0.43520382046699524 2023-01-22 12:25:40.634582: step: 792/464, loss: 2.284243106842041 2023-01-22 12:25:41.446271: step: 794/464, loss: 1.342424750328064 2023-01-22 12:25:42.251471: step: 796/464, loss: 0.18607650697231293 2023-01-22 12:25:43.022672: step: 798/464, loss: 1.91313898563385 2023-01-22 
12:25:43.826072: step: 800/464, loss: 0.14298135042190552 2023-01-22 12:25:44.499489: step: 802/464, loss: 0.16389590501785278 2023-01-22 12:25:45.209610: step: 804/464, loss: 1.263809323310852 2023-01-22 12:25:45.974805: step: 806/464, loss: 0.5659499764442444 2023-01-22 12:25:46.698926: step: 808/464, loss: 0.3437536656856537 2023-01-22 12:25:47.494855: step: 810/464, loss: 0.23270297050476074 2023-01-22 12:25:48.250029: step: 812/464, loss: 0.2002369612455368 2023-01-22 12:25:49.122955: step: 814/464, loss: 0.3684091567993164 2023-01-22 12:25:49.817494: step: 816/464, loss: 0.2902369797229767 2023-01-22 12:25:50.575375: step: 818/464, loss: 0.7540484666824341 2023-01-22 12:25:51.308070: step: 820/464, loss: 0.9410716891288757 2023-01-22 12:25:52.022483: step: 822/464, loss: 0.9295744299888611 2023-01-22 12:25:52.754440: step: 824/464, loss: 0.16161967813968658 2023-01-22 12:25:53.465524: step: 826/464, loss: 0.23595713078975677 2023-01-22 12:25:54.167072: step: 828/464, loss: 1.1932841539382935 2023-01-22 12:25:54.928022: step: 830/464, loss: 0.3627581298351288 2023-01-22 12:25:55.673122: step: 832/464, loss: 0.9675153493881226 2023-01-22 12:25:56.493232: step: 834/464, loss: 0.37052589654922485 2023-01-22 12:25:57.231938: step: 836/464, loss: 0.817292332649231 2023-01-22 12:25:57.963012: step: 838/464, loss: 0.3747046887874603 2023-01-22 12:25:58.862474: step: 840/464, loss: 1.0048964023590088 2023-01-22 12:25:59.684498: step: 842/464, loss: 0.47187212109565735 2023-01-22 12:26:00.377409: step: 844/464, loss: 0.8129963278770447 2023-01-22 12:26:01.188728: step: 846/464, loss: 0.7479146718978882 2023-01-22 12:26:01.942462: step: 848/464, loss: 0.48911774158477783 2023-01-22 12:26:02.626694: step: 850/464, loss: 1.1757187843322754 2023-01-22 12:26:03.427155: step: 852/464, loss: 0.2751547694206238 2023-01-22 12:26:04.122782: step: 854/464, loss: 0.9891265630722046 2023-01-22 12:26:04.861143: step: 856/464, loss: 0.3847293257713318 2023-01-22 12:26:05.613689: 
step: 858/464, loss: 0.40411749482154846 2023-01-22 12:26:06.378339: step: 860/464, loss: 0.17029529809951782 2023-01-22 12:26:07.046190: step: 862/464, loss: 0.46243488788604736 2023-01-22 12:26:07.746701: step: 864/464, loss: 0.10078753530979156 2023-01-22 12:26:08.538239: step: 866/464, loss: 0.6281033158302307 2023-01-22 12:26:09.298879: step: 868/464, loss: 0.9208279252052307 2023-01-22 12:26:10.086140: step: 870/464, loss: 0.6805605888366699 2023-01-22 12:26:10.815644: step: 872/464, loss: 0.09829024970531464 2023-01-22 12:26:11.568572: step: 874/464, loss: 0.10819683223962784 2023-01-22 12:26:12.303978: step: 876/464, loss: 0.4335554838180542 2023-01-22 12:26:13.072262: step: 878/464, loss: 0.2784618139266968 2023-01-22 12:26:13.864208: step: 880/464, loss: 0.603888213634491 2023-01-22 12:26:14.570765: step: 882/464, loss: 0.15740109980106354 2023-01-22 12:26:15.336820: step: 884/464, loss: 0.9284422993659973 2023-01-22 12:26:16.082427: step: 886/464, loss: 0.9966660737991333 2023-01-22 12:26:16.880561: step: 888/464, loss: 0.4289911985397339 2023-01-22 12:26:17.645583: step: 890/464, loss: 0.796715259552002 2023-01-22 12:26:18.440895: step: 892/464, loss: 1.042264461517334 2023-01-22 12:26:19.176693: step: 894/464, loss: 0.21413186192512512 2023-01-22 12:26:19.881456: step: 896/464, loss: 0.148712158203125 2023-01-22 12:26:20.646099: step: 898/464, loss: 0.34306561946868896 2023-01-22 12:26:21.551926: step: 900/464, loss: 0.1933916062116623 2023-01-22 12:26:22.362204: step: 902/464, loss: 0.8368797898292542 2023-01-22 12:26:23.079805: step: 904/464, loss: 0.1777791827917099 2023-01-22 12:26:23.825139: step: 906/464, loss: 0.17683547735214233 2023-01-22 12:26:24.496153: step: 908/464, loss: 0.19002756476402283 2023-01-22 12:26:25.257293: step: 910/464, loss: 0.2975521385669708 2023-01-22 12:26:25.963390: step: 912/464, loss: 0.48966196179389954 2023-01-22 12:26:26.699407: step: 914/464, loss: 0.8314082622528076 2023-01-22 12:26:27.511492: step: 916/464, 
loss: 0.25605788826942444 2023-01-22 12:26:28.299936: step: 918/464, loss: 0.255087286233902 2023-01-22 12:26:29.059629: step: 920/464, loss: 0.28800809383392334 2023-01-22 12:26:29.845837: step: 922/464, loss: 0.5026906132698059 2023-01-22 12:26:30.597503: step: 924/464, loss: 0.5396649241447449 2023-01-22 12:26:31.336430: step: 926/464, loss: 1.2712345123291016 2023-01-22 12:26:32.091338: step: 928/464, loss: 0.32311421632766724 2023-01-22 12:26:32.705358: step: 930/464, loss: 0.2566630244255066
==================================================
Loss: 0.608
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.311460762286681, 'r': 0.3191293037823758, 'f1': 0.31524840485892314}, 'combined': 0.23228829831710124, 'epoch': 7}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3040844575388374, 'r': 0.25205022296493146, 'f1': 0.2756330723824183}, 'combined': 0.17118264495329139, 'epoch': 7}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2861951723842379, 'r': 0.30625051590358787, 'f1': 0.2958833895646742}, 'combined': 0.2180193396792336, 'epoch': 7}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2937920423504621, 'r': 0.25049331405153147, 'f1': 0.2704204383407349}, 'combined': 0.1679453248642459, 'epoch': 7}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.297284019576004, 'r': 0.3141751570519133, 'f1': 0.3054962853101477}, 'combined': 0.22510252601800357, 'epoch': 7}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31121577852929394, 'r': 0.26473350102392956, 'f1': 0.2860989519350003}, 'combined': 0.17768250699121071, 'epoch': 7}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.30714285714285716, 'f1': 0.27564102564102566}, 'combined': 0.18376068376068377, 'epoch': 7}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3046875, 'r': 0.42391304347826086, 'f1': 0.3545454545454545}, 'combined': 0.17727272727272725, 'epoch': 7}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3465909090909091, 'r': 0.2629310344827586, 'f1': 0.2990196078431373}, 'combined': 0.19934640522875818, 'epoch': 7}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2861951723842379, 'r': 0.30625051590358787, 'f1': 0.2958833895646742}, 'combined': 0.2180193396792336, 'epoch': 7}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2937920423504621, 'r': 0.25049331405153147, 'f1': 0.2704204383407349}, 'combined': 0.1679453248642459, 'epoch': 7}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3046875, 'r': 0.42391304347826086, 'f1': 0.3545454545454545}, 'combined': 0.17727272727272725, 'epoch': 7}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2756479266347688, 'r': 0.327953036053131, 'f1': 0.2995342287694974}, 'combined': 0.22070943172489282, 'epoch': 6}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2939259120206898, 'r': 0.2791134401698448, 'f1': 0.2863282325918732}, 'combined': 0.17782490234653178, 'epoch': 6}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 6}
******************************
Epoch: 8
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:29:23.163713: step: 2/464, loss: 0.24098850786685944 2023-01-22 12:29:23.809056: step: 4/464, loss: 0.21834394335746765 2023-01-22 12:29:24.448762: step: 6/464, loss: 0.6970574259757996 2023-01-22 12:29:25.187325: step: 8/464, loss: 0.11427651345729828 2023-01-22 12:29:25.968346: step: 10/464, loss: 0.3398534059524536 2023-01-22 12:29:26.781257: step: 12/464, loss: 0.3831498920917511 2023-01-22 12:29:27.473425: step: 14/464, loss: 0.32490742206573486 2023-01-22 12:29:28.189985: step: 16/464, loss: 0.5013684630393982 2023-01-22 12:29:28.894553: step: 18/464, loss: 0.2335856556892395 2023-01-22 12:29:29.605575: step: 20/464, loss: 1.0744787454605103 2023-01-22 12:29:30.485918: step: 22/464, loss: 0.1543176919221878 2023-01-22 12:29:31.274731: step: 24/464, loss: 0.36627593636512756 2023-01-22 12:29:32.021530: step: 26/464, loss: 0.32450997829437256 2023-01-22 12:29:32.844810: step: 28/464, loss: 0.13919056951999664 2023-01-22 12:29:33.550895: step: 30/464, loss: 1.290597915649414 2023-01-22 12:29:34.292886: step: 32/464, loss: 5.267265319824219 2023-01-22 12:29:35.016041: step:
34/464, loss: 0.13220682740211487 2023-01-22 12:29:35.740017: step: 36/464, loss: 0.8730195760726929 2023-01-22 12:29:36.550704: step: 38/464, loss: 0.9116947650909424 2023-01-22 12:29:37.245919: step: 40/464, loss: 0.31921955943107605 2023-01-22 12:29:38.090314: step: 42/464, loss: 0.09013380110263824 2023-01-22 12:29:38.873847: step: 44/464, loss: 0.518787145614624 2023-01-22 12:29:39.550695: step: 46/464, loss: 0.558215856552124 2023-01-22 12:29:40.245489: step: 48/464, loss: 0.25874605774879456 2023-01-22 12:29:40.982723: step: 50/464, loss: 0.24418827891349792 2023-01-22 12:29:41.728337: step: 52/464, loss: 0.9657977819442749 2023-01-22 12:29:42.524529: step: 54/464, loss: 0.12193039804697037 2023-01-22 12:29:43.263193: step: 56/464, loss: 0.35909783840179443 2023-01-22 12:29:44.006424: step: 58/464, loss: 2.500666618347168 2023-01-22 12:29:44.686470: step: 60/464, loss: 0.5895683765411377 2023-01-22 12:29:45.406845: step: 62/464, loss: 0.21890252828598022 2023-01-22 12:29:46.163195: step: 64/464, loss: 0.5999624133110046 2023-01-22 12:29:46.990169: step: 66/464, loss: 2.444990396499634 2023-01-22 12:29:47.762228: step: 68/464, loss: 0.13536398112773895 2023-01-22 12:29:48.472523: step: 70/464, loss: 1.011974811553955 2023-01-22 12:29:49.204835: step: 72/464, loss: 0.4110199511051178 2023-01-22 12:29:49.894825: step: 74/464, loss: 0.25794094800949097 2023-01-22 12:29:50.668364: step: 76/464, loss: 0.2513432204723358 2023-01-22 12:29:51.481891: step: 78/464, loss: 0.7264769673347473 2023-01-22 12:29:52.171267: step: 80/464, loss: 0.10029570758342743 2023-01-22 12:29:52.859466: step: 82/464, loss: 0.0936921015381813 2023-01-22 12:29:53.550030: step: 84/464, loss: 0.2198677659034729 2023-01-22 12:29:54.286849: step: 86/464, loss: 1.1094175577163696 2023-01-22 12:29:54.938294: step: 88/464, loss: 0.5705043077468872 2023-01-22 12:29:55.695164: step: 90/464, loss: 0.7141909003257751 2023-01-22 12:29:56.382109: step: 92/464, loss: 0.42526358366012573 2023-01-22 
12:29:57.076687: step: 94/464, loss: 0.29963523149490356 2023-01-22 12:29:57.861225: step: 96/464, loss: 0.844380795955658 2023-01-22 12:29:58.556460: step: 98/464, loss: 0.2561212182044983 2023-01-22 12:29:59.269965: step: 100/464, loss: 0.28629907965660095 2023-01-22 12:30:00.003276: step: 102/464, loss: 1.0631016492843628 2023-01-22 12:30:00.813639: step: 104/464, loss: 0.21041373908519745 2023-01-22 12:30:01.508999: step: 106/464, loss: 0.6114664077758789 2023-01-22 12:30:02.280819: step: 108/464, loss: 0.1547708362340927 2023-01-22 12:30:03.002236: step: 110/464, loss: 0.24771416187286377 2023-01-22 12:30:03.790529: step: 112/464, loss: 0.4548543393611908 2023-01-22 12:30:04.539473: step: 114/464, loss: 0.4440891146659851 2023-01-22 12:30:05.223684: step: 116/464, loss: 0.609367311000824 2023-01-22 12:30:05.976736: step: 118/464, loss: 0.4446181356906891 2023-01-22 12:30:06.663479: step: 120/464, loss: 0.08416564762592316 2023-01-22 12:30:07.328771: step: 122/464, loss: 0.10157329589128494 2023-01-22 12:30:08.073233: step: 124/464, loss: 0.16663186252117157 2023-01-22 12:30:08.827711: step: 126/464, loss: 0.24829711019992828 2023-01-22 12:30:09.543226: step: 128/464, loss: 0.5671038031578064 2023-01-22 12:30:10.288586: step: 130/464, loss: 0.2176276594400406 2023-01-22 12:30:11.067492: step: 132/464, loss: 0.30462634563446045 2023-01-22 12:30:11.884360: step: 134/464, loss: 0.158929243683815 2023-01-22 12:30:12.633462: step: 136/464, loss: 0.2325487732887268 2023-01-22 12:30:13.378761: step: 138/464, loss: 0.3691418766975403 2023-01-22 12:30:14.127941: step: 140/464, loss: 0.08586454391479492 2023-01-22 12:30:14.882983: step: 142/464, loss: 0.22785647213459015 2023-01-22 12:30:15.629925: step: 144/464, loss: 0.2567114531993866 2023-01-22 12:30:16.387273: step: 146/464, loss: 0.21886906027793884 2023-01-22 12:30:17.136058: step: 148/464, loss: 0.7984110116958618 2023-01-22 12:30:17.875936: step: 150/464, loss: 0.35591015219688416 2023-01-22 12:30:18.557342: 
step: 152/464, loss: 0.4163927435874939 2023-01-22 12:30:19.304360: step: 154/464, loss: 0.20556171238422394 2023-01-22 12:30:20.210259: step: 156/464, loss: 0.2828492820262909 2023-01-22 12:30:20.962259: step: 158/464, loss: 0.13407738506793976 2023-01-22 12:30:21.753286: step: 160/464, loss: 0.9878560900688171 2023-01-22 12:30:22.485380: step: 162/464, loss: 0.10810502618551254 2023-01-22 12:30:23.276944: step: 164/464, loss: 0.2033417969942093 2023-01-22 12:30:24.009427: step: 166/464, loss: 0.641880452632904 2023-01-22 12:30:24.773547: step: 168/464, loss: 0.2256614863872528 2023-01-22 12:30:25.472775: step: 170/464, loss: 0.49018874764442444 2023-01-22 12:30:26.136761: step: 172/464, loss: 0.47806239128112793 2023-01-22 12:30:26.926566: step: 174/464, loss: 0.7968089580535889 2023-01-22 12:30:27.688305: step: 176/464, loss: 0.13665267825126648 2023-01-22 12:30:28.473515: step: 178/464, loss: 0.14715999364852905 2023-01-22 12:30:29.232992: step: 180/464, loss: 0.19499091804027557 2023-01-22 12:30:30.018286: step: 182/464, loss: 0.26871755719184875 2023-01-22 12:30:30.855065: step: 184/464, loss: 0.5004696846008301 2023-01-22 12:30:31.722822: step: 186/464, loss: 0.17549346387386322 2023-01-22 12:30:32.470135: step: 188/464, loss: 0.29949861764907837 2023-01-22 12:30:33.206728: step: 190/464, loss: 0.29847872257232666 2023-01-22 12:30:33.947133: step: 192/464, loss: 0.4522988200187683 2023-01-22 12:30:34.603992: step: 194/464, loss: 0.4335414469242096 2023-01-22 12:30:35.299367: step: 196/464, loss: 0.1644258052110672 2023-01-22 12:30:35.995098: step: 198/464, loss: 0.5021744966506958 2023-01-22 12:30:36.791986: step: 200/464, loss: 1.0680073499679565 2023-01-22 12:30:37.472513: step: 202/464, loss: 0.1432187706232071 2023-01-22 12:30:38.161345: step: 204/464, loss: 0.48796772956848145 2023-01-22 12:30:38.929553: step: 206/464, loss: 0.3300281763076782 2023-01-22 12:30:39.659297: step: 208/464, loss: 0.9741002321243286 2023-01-22 12:30:40.437486: step: 210/464, 
loss: 0.9319281578063965 2023-01-22 12:30:41.309607: step: 212/464, loss: 0.9063480496406555 2023-01-22 12:30:42.061481: step: 214/464, loss: 0.17574895918369293 2023-01-22 12:30:42.840800: step: 216/464, loss: 0.392004132270813 2023-01-22 12:30:43.568299: step: 218/464, loss: 0.9362916946411133 2023-01-22 12:30:44.314383: step: 220/464, loss: 0.08633681386709213 2023-01-22 12:30:45.021109: step: 222/464, loss: 0.2805129587650299 2023-01-22 12:30:45.720314: step: 224/464, loss: 0.7117599248886108 2023-01-22 12:30:46.398996: step: 226/464, loss: 0.22947485744953156 2023-01-22 12:30:47.118652: step: 228/464, loss: 0.19838595390319824 2023-01-22 12:30:47.801766: step: 230/464, loss: 0.1607564240694046 2023-01-22 12:30:48.578993: step: 232/464, loss: 0.5248289108276367 2023-01-22 12:30:49.361610: step: 234/464, loss: 0.972169816493988 2023-01-22 12:30:50.033538: step: 236/464, loss: 0.3914636969566345 2023-01-22 12:30:50.911976: step: 238/464, loss: 0.11111956089735031 2023-01-22 12:30:51.685276: step: 240/464, loss: 0.1756807267665863 2023-01-22 12:30:52.318569: step: 242/464, loss: 0.4231186807155609 2023-01-22 12:30:52.960166: step: 244/464, loss: 0.3299257755279541 2023-01-22 12:30:53.666329: step: 246/464, loss: 0.36662188172340393 2023-01-22 12:30:54.473934: step: 248/464, loss: 0.298601895570755 2023-01-22 12:30:55.188557: step: 250/464, loss: 0.15368987619876862 2023-01-22 12:30:55.963364: step: 252/464, loss: 0.5711445212364197 2023-01-22 12:30:56.671800: step: 254/464, loss: 0.26884526014328003 2023-01-22 12:30:57.510959: step: 256/464, loss: 0.4593277871608734 2023-01-22 12:30:58.274661: step: 258/464, loss: 0.3296632766723633 2023-01-22 12:30:59.059381: step: 260/464, loss: 0.23488648235797882 2023-01-22 12:30:59.794934: step: 262/464, loss: 0.4223480522632599 2023-01-22 12:31:00.529389: step: 264/464, loss: 0.43289366364479065 2023-01-22 12:31:01.299103: step: 266/464, loss: 0.13029201328754425 2023-01-22 12:31:02.011527: step: 268/464, loss: 
0.8143272995948792 2023-01-22 12:31:02.816965: step: 270/464, loss: 0.3097124695777893 2023-01-22 12:31:03.526358: step: 272/464, loss: 0.7568479180335999 2023-01-22 12:31:04.254889: step: 274/464, loss: 0.1135609894990921 2023-01-22 12:31:04.988315: step: 276/464, loss: 0.29489538073539734 2023-01-22 12:31:05.662015: step: 278/464, loss: 0.40163666009902954 2023-01-22 12:31:06.373177: step: 280/464, loss: 0.3510648012161255 2023-01-22 12:31:07.081966: step: 282/464, loss: 0.4717705547809601 2023-01-22 12:31:07.848743: step: 284/464, loss: 0.2724142074584961 2023-01-22 12:31:08.634583: step: 286/464, loss: 0.24752037227153778 2023-01-22 12:31:09.428968: step: 288/464, loss: 0.6803609132766724 2023-01-22 12:31:10.127506: step: 290/464, loss: 0.25626876950263977 2023-01-22 12:31:10.883068: step: 292/464, loss: 0.3036022186279297 2023-01-22 12:31:11.608848: step: 294/464, loss: 0.21912869811058044 2023-01-22 12:31:12.374284: step: 296/464, loss: 0.2549686133861542 2023-01-22 12:31:13.111940: step: 298/464, loss: 0.11996863037347794 2023-01-22 12:31:13.882200: step: 300/464, loss: 0.1933732032775879 2023-01-22 12:31:14.655475: step: 302/464, loss: 1.3739672899246216 2023-01-22 12:31:15.521355: step: 304/464, loss: 0.7721089124679565 2023-01-22 12:31:16.201582: step: 306/464, loss: 0.16225355863571167 2023-01-22 12:31:17.032558: step: 308/464, loss: 0.24977611005306244 2023-01-22 12:31:17.716136: step: 310/464, loss: 0.1537967324256897 2023-01-22 12:31:18.368746: step: 312/464, loss: 0.43189412355422974 2023-01-22 12:31:19.138301: step: 314/464, loss: 0.2579169273376465 2023-01-22 12:31:19.860533: step: 316/464, loss: 0.42762044072151184 2023-01-22 12:31:20.654938: step: 318/464, loss: 0.5698819160461426 2023-01-22 12:31:21.421165: step: 320/464, loss: 0.22148877382278442 2023-01-22 12:31:22.185175: step: 322/464, loss: 0.44749972224235535 2023-01-22 12:31:22.904988: step: 324/464, loss: 0.18170328438282013 2023-01-22 12:31:23.651104: step: 326/464, loss: 
2.2977375984191895 2023-01-22 12:31:24.333332: step: 328/464, loss: 0.22166982293128967 2023-01-22 12:31:25.093398: step: 330/464, loss: 1.137890100479126 2023-01-22 12:31:25.808970: step: 332/464, loss: 0.3206275701522827 2023-01-22 12:31:26.522433: step: 334/464, loss: 0.7786614894866943 2023-01-22 12:31:27.296023: step: 336/464, loss: 0.08940373361110687 2023-01-22 12:31:28.043831: step: 338/464, loss: 0.7511243224143982 2023-01-22 12:31:28.823894: step: 340/464, loss: 0.18221306800842285 2023-01-22 12:31:29.495194: step: 342/464, loss: 0.18981769680976868 2023-01-22 12:31:30.209158: step: 344/464, loss: 0.2692997455596924 2023-01-22 12:31:30.942398: step: 346/464, loss: 0.4894130825996399 2023-01-22 12:31:31.668583: step: 348/464, loss: 0.7022056579589844 2023-01-22 12:31:32.382231: step: 350/464, loss: 0.17065872251987457 2023-01-22 12:31:33.187764: step: 352/464, loss: 0.13291820883750916 2023-01-22 12:31:34.019631: step: 354/464, loss: 0.7096335887908936 2023-01-22 12:31:34.834463: step: 356/464, loss: 0.6142822504043579 2023-01-22 12:31:35.619713: step: 358/464, loss: 0.60723876953125 2023-01-22 12:31:36.387668: step: 360/464, loss: 0.3338114321231842 2023-01-22 12:31:37.174966: step: 362/464, loss: 0.43763482570648193 2023-01-22 12:31:37.996665: step: 364/464, loss: 0.2908762991428375 2023-01-22 12:31:38.732401: step: 366/464, loss: 0.3009243905544281 2023-01-22 12:31:39.450266: step: 368/464, loss: 0.2691645920276642 2023-01-22 12:31:40.180968: step: 370/464, loss: 0.7100043296813965 2023-01-22 12:31:40.906092: step: 372/464, loss: 0.23052287101745605 2023-01-22 12:31:41.621738: step: 374/464, loss: 0.2514415979385376 2023-01-22 12:31:42.442256: step: 376/464, loss: 0.2036939263343811 2023-01-22 12:31:43.111134: step: 378/464, loss: 0.6348828077316284 2023-01-22 12:31:43.865237: step: 380/464, loss: 0.1718049794435501 2023-01-22 12:31:44.565011: step: 382/464, loss: 0.2206181138753891 2023-01-22 12:31:45.276114: step: 384/464, loss: 0.6723924875259399 
2023-01-22 12:31:46.021330: step: 386/464, loss: 0.24519896507263184 2023-01-22 12:31:46.628810: step: 388/464, loss: 0.6588399410247803 2023-01-22 12:31:47.410868: step: 390/464, loss: 0.6879944801330566 2023-01-22 12:31:48.249208: step: 392/464, loss: 0.7047319412231445 2023-01-22 12:31:49.058220: step: 394/464, loss: 0.15743254125118256 2023-01-22 12:31:49.719176: step: 396/464, loss: 0.060449376702308655 2023-01-22 12:31:50.495068: step: 398/464, loss: 0.4597553014755249 2023-01-22 12:31:51.241337: step: 400/464, loss: 0.19338826835155487 2023-01-22 12:31:51.917059: step: 402/464, loss: 0.8644437789916992 2023-01-22 12:31:52.705605: step: 404/464, loss: 1.0874934196472168 2023-01-22 12:31:53.434904: step: 406/464, loss: 0.1290365308523178 2023-01-22 12:31:54.212619: step: 408/464, loss: 0.26761525869369507 2023-01-22 12:31:54.877280: step: 410/464, loss: 0.5796024799346924 2023-01-22 12:31:55.591318: step: 412/464, loss: 0.6488765478134155 2023-01-22 12:31:56.322031: step: 414/464, loss: 0.22247929871082306 2023-01-22 12:31:57.109709: step: 416/464, loss: 0.35781481862068176 2023-01-22 12:31:57.794053: step: 418/464, loss: 0.785033643245697 2023-01-22 12:31:58.549968: step: 420/464, loss: 0.23130568861961365 2023-01-22 12:31:59.276770: step: 422/464, loss: 0.10906381905078888 2023-01-22 12:32:00.074650: step: 424/464, loss: 0.16461393237113953 2023-01-22 12:32:00.825721: step: 426/464, loss: 0.10141077637672424 2023-01-22 12:32:01.549817: step: 428/464, loss: 0.4492965042591095 2023-01-22 12:32:02.413030: step: 430/464, loss: 0.42122381925582886 2023-01-22 12:32:03.191660: step: 432/464, loss: 1.8734062910079956 2023-01-22 12:32:03.993320: step: 434/464, loss: 0.2933647334575653 2023-01-22 12:32:04.718865: step: 436/464, loss: 0.05986648052930832 2023-01-22 12:32:05.447522: step: 438/464, loss: 0.8424848914146423 2023-01-22 12:32:06.154375: step: 440/464, loss: 0.14208859205245972 2023-01-22 12:32:06.913246: step: 442/464, loss: 0.43434256315231323 2023-01-22 
12:32:07.647665: step: 444/464, loss: 0.524348795413971 2023-01-22 12:32:08.539440: step: 446/464, loss: 0.2798750102519989 2023-01-22 12:32:09.234610: step: 448/464, loss: 0.18711858987808228 2023-01-22 12:32:09.857612: step: 450/464, loss: 0.15931551158428192 2023-01-22 12:32:10.609156: step: 452/464, loss: 0.34590521454811096 2023-01-22 12:32:11.287095: step: 454/464, loss: 0.19958004355430603 2023-01-22 12:32:12.102183: step: 456/464, loss: 0.5248697996139526 2023-01-22 12:32:12.785140: step: 458/464, loss: 0.2061764895915985 2023-01-22 12:32:13.592923: step: 460/464, loss: 0.3065434992313385 2023-01-22 12:32:14.261131: step: 462/464, loss: 0.9484499096870422 2023-01-22 12:32:15.000788: step: 464/464, loss: 0.554010808467865 2023-01-22 12:32:15.738259: step: 466/464, loss: 3.225388526916504 2023-01-22 12:32:16.542639: step: 468/464, loss: 2.29485821723938 2023-01-22 12:32:17.189699: step: 470/464, loss: 0.7562961578369141 2023-01-22 12:32:17.947013: step: 472/464, loss: 0.8341850638389587 2023-01-22 12:32:18.679397: step: 474/464, loss: 0.33977198600769043 2023-01-22 12:32:19.377396: step: 476/464, loss: 0.28051602840423584 2023-01-22 12:32:20.150540: step: 478/464, loss: 0.19837482273578644 2023-01-22 12:32:20.887726: step: 480/464, loss: 0.33840715885162354 2023-01-22 12:32:21.720558: step: 482/464, loss: 0.41176798939704895 2023-01-22 12:32:22.587002: step: 484/464, loss: 0.4754239320755005 2023-01-22 12:32:23.353001: step: 486/464, loss: 0.07769311219453812 2023-01-22 12:32:24.142776: step: 488/464, loss: 0.6456102728843689 2023-01-22 12:32:24.884317: step: 490/464, loss: 0.32670024037361145 2023-01-22 12:32:25.609069: step: 492/464, loss: 0.3997291922569275 2023-01-22 12:32:26.370001: step: 494/464, loss: 0.24537032842636108 2023-01-22 12:32:27.134205: step: 496/464, loss: 0.45198655128479004 2023-01-22 12:32:27.813473: step: 498/464, loss: 0.7440019845962524 2023-01-22 12:32:28.482459: step: 500/464, loss: 0.3995836675167084 2023-01-22 12:32:29.224730: 
step: 502/464, loss: 0.8018618822097778 2023-01-22 12:32:29.960560: step: 504/464, loss: 0.4375607371330261 2023-01-22 12:32:30.687618: step: 506/464, loss: 0.49072539806365967 2023-01-22 12:32:31.424700: step: 508/464, loss: 0.37828344106674194 2023-01-22 12:32:32.172037: step: 510/464, loss: 1.328326940536499 2023-01-22 12:32:32.871209: step: 512/464, loss: 0.18207770586013794 2023-01-22 12:32:33.566635: step: 514/464, loss: 0.2423122227191925 2023-01-22 12:32:34.258514: step: 516/464, loss: 0.19432279467582703 2023-01-22 12:32:35.004248: step: 518/464, loss: 0.17209100723266602 2023-01-22 12:32:35.725962: step: 520/464, loss: 0.4468704164028168 2023-01-22 12:32:36.420077: step: 522/464, loss: 0.29360130429267883 2023-01-22 12:32:37.235076: step: 524/464, loss: 0.12493553012609482 2023-01-22 12:32:37.906212: step: 526/464, loss: 0.20450644195079803 2023-01-22 12:32:38.694637: step: 528/464, loss: 0.28605949878692627 2023-01-22 12:32:39.380351: step: 530/464, loss: 0.08858560770750046 2023-01-22 12:32:40.032246: step: 532/464, loss: 0.511139988899231 2023-01-22 12:32:40.825075: step: 534/464, loss: 0.33546334505081177 2023-01-22 12:32:41.543857: step: 536/464, loss: 1.009992003440857 2023-01-22 12:32:42.326101: step: 538/464, loss: 0.2257574498653412 2023-01-22 12:32:43.050394: step: 540/464, loss: 0.31822896003723145 2023-01-22 12:32:43.830673: step: 542/464, loss: 0.42033156752586365 2023-01-22 12:32:44.519421: step: 544/464, loss: 0.3264782130718231 2023-01-22 12:32:45.357805: step: 546/464, loss: 2.3610825538635254 2023-01-22 12:32:46.096191: step: 548/464, loss: 0.1951487511396408 2023-01-22 12:32:46.807110: step: 550/464, loss: 0.21507689356803894 2023-01-22 12:32:47.573838: step: 552/464, loss: 0.6023355722427368 2023-01-22 12:32:48.440740: step: 554/464, loss: 0.5097122192382812 2023-01-22 12:32:49.218766: step: 556/464, loss: 0.2306637465953827 2023-01-22 12:32:49.968377: step: 558/464, loss: 0.7728459239006042 2023-01-22 12:32:50.744103: step: 560/464, 
loss: 0.21024790406227112 2023-01-22 12:32:51.454285: step: 562/464, loss: 0.6667017936706543 2023-01-22 12:32:52.221220: step: 564/464, loss: 0.19177010655403137 2023-01-22 12:32:52.999471: step: 566/464, loss: 0.3377716541290283 2023-01-22 12:32:53.748203: step: 568/464, loss: 0.4583777189254761 2023-01-22 12:32:54.491066: step: 570/464, loss: 0.698870837688446 2023-01-22 12:32:55.166278: step: 572/464, loss: 0.23514105379581451 2023-01-22 12:32:55.966014: step: 574/464, loss: 0.16757991909980774 2023-01-22 12:32:56.752422: step: 576/464, loss: 0.7315901517868042 2023-01-22 12:32:57.519121: step: 578/464, loss: 0.3616211712360382 2023-01-22 12:32:58.237319: step: 580/464, loss: 0.6538026928901672 2023-01-22 12:32:59.061143: step: 582/464, loss: 0.7223403453826904 2023-01-22 12:32:59.818987: step: 584/464, loss: 0.17685629427433014 2023-01-22 12:33:00.499137: step: 586/464, loss: 0.3052135705947876 2023-01-22 12:33:01.259906: step: 588/464, loss: 0.49685677886009216 2023-01-22 12:33:02.015026: step: 590/464, loss: 0.06801187247037888 2023-01-22 12:33:02.773144: step: 592/464, loss: 0.2643454372882843 2023-01-22 12:33:03.448398: step: 594/464, loss: 0.23082764446735382 2023-01-22 12:33:04.238942: step: 596/464, loss: 0.19490766525268555 2023-01-22 12:33:04.987907: step: 598/464, loss: 0.15837804973125458 2023-01-22 12:33:05.733146: step: 600/464, loss: 0.208059623837471 2023-01-22 12:33:06.478957: step: 602/464, loss: 0.4763175845146179 2023-01-22 12:33:07.184113: step: 604/464, loss: 0.41145578026771545 2023-01-22 12:33:07.903911: step: 606/464, loss: 0.19580335915088654 2023-01-22 12:33:08.702163: step: 608/464, loss: 6.187551021575928 2023-01-22 12:33:09.431873: step: 610/464, loss: 0.345478355884552 2023-01-22 12:33:10.207101: step: 612/464, loss: 1.0338616371154785 2023-01-22 12:33:10.918406: step: 614/464, loss: 0.4276840388774872 2023-01-22 12:33:11.571984: step: 616/464, loss: 0.7984152436256409 2023-01-22 12:33:12.272871: step: 618/464, loss: 
0.5086597800254822 2023-01-22 12:33:13.005453: step: 620/464, loss: 1.4494423866271973 2023-01-22 12:33:13.789107: step: 622/464, loss: 0.088454470038414 2023-01-22 12:33:14.517610: step: 624/464, loss: 0.5433585047721863 2023-01-22 12:33:15.229616: step: 626/464, loss: 0.19739283621311188 2023-01-22 12:33:15.980918: step: 628/464, loss: 0.23902741074562073 2023-01-22 12:33:16.693908: step: 630/464, loss: 0.38742804527282715 2023-01-22 12:33:17.446168: step: 632/464, loss: 0.6698702573776245 2023-01-22 12:33:18.195599: step: 634/464, loss: 0.9108648300170898 2023-01-22 12:33:18.942293: step: 636/464, loss: 0.13190753757953644 2023-01-22 12:33:19.668405: step: 638/464, loss: 0.2754857540130615 2023-01-22 12:33:20.425235: step: 640/464, loss: 0.920817494392395 2023-01-22 12:33:21.091714: step: 642/464, loss: 0.8862787485122681 2023-01-22 12:33:21.831773: step: 644/464, loss: 0.34673672914505005 2023-01-22 12:33:22.518213: step: 646/464, loss: 0.6815704107284546 2023-01-22 12:33:23.232320: step: 648/464, loss: 0.1537713259458542 2023-01-22 12:33:24.013734: step: 650/464, loss: 0.33302345871925354 2023-01-22 12:33:24.751649: step: 652/464, loss: 0.19951090216636658 2023-01-22 12:33:25.453650: step: 654/464, loss: 0.20980501174926758 2023-01-22 12:33:26.193680: step: 656/464, loss: 0.5005956888198853 2023-01-22 12:33:26.936648: step: 658/464, loss: 0.27629342675209045 2023-01-22 12:33:27.596190: step: 660/464, loss: 0.3143295347690582 2023-01-22 12:33:28.263409: step: 662/464, loss: 0.10405009239912033 2023-01-22 12:33:28.981423: step: 664/464, loss: 0.4270290434360504 2023-01-22 12:33:29.751143: step: 666/464, loss: 0.2317143976688385 2023-01-22 12:33:30.505802: step: 668/464, loss: 0.3168708384037018 2023-01-22 12:33:31.250533: step: 670/464, loss: 0.3464636504650116 2023-01-22 12:33:31.964625: step: 672/464, loss: 0.48676538467407227 2023-01-22 12:33:32.708107: step: 674/464, loss: 0.6338642835617065 2023-01-22 12:33:33.561990: step: 676/464, loss: 0.536266028881073 
2023-01-22 12:33:34.241245: step: 678/464, loss: 0.5338266491889954 2023-01-22 12:33:34.919551: step: 680/464, loss: 0.4320400655269623 2023-01-22 12:33:35.654511: step: 682/464, loss: 0.1096869707107544 2023-01-22 12:33:36.412390: step: 684/464, loss: 0.3257814645767212 2023-01-22 12:33:37.163936: step: 686/464, loss: 0.48217248916625977 2023-01-22 12:33:37.872130: step: 688/464, loss: 0.12141574174165726 2023-01-22 12:33:38.598714: step: 690/464, loss: 0.4606741666793823 2023-01-22 12:33:39.257851: step: 692/464, loss: 0.18588869273662567 2023-01-22 12:33:40.127545: step: 694/464, loss: 0.469809353351593 2023-01-22 12:33:40.919800: step: 696/464, loss: 0.20712803304195404 2023-01-22 12:33:41.631078: step: 698/464, loss: 0.3310616612434387 2023-01-22 12:33:42.448768: step: 700/464, loss: 0.6919637322425842 2023-01-22 12:33:43.200891: step: 702/464, loss: 5.140326976776123 2023-01-22 12:33:43.872889: step: 704/464, loss: 0.48945337533950806 2023-01-22 12:33:44.618441: step: 706/464, loss: 0.25606769323349 2023-01-22 12:33:45.321005: step: 708/464, loss: 0.607251763343811 2023-01-22 12:33:46.013276: step: 710/464, loss: 0.3275914192199707 2023-01-22 12:33:46.785085: step: 712/464, loss: 0.23145699501037598 2023-01-22 12:33:47.586773: step: 714/464, loss: 0.2339666187763214 2023-01-22 12:33:48.257291: step: 716/464, loss: 0.14772441983222961 2023-01-22 12:33:48.979922: step: 718/464, loss: 11.812416076660156 2023-01-22 12:33:49.727862: step: 720/464, loss: 0.7451220750808716 2023-01-22 12:33:50.441071: step: 722/464, loss: 0.31856614351272583 2023-01-22 12:33:51.195426: step: 724/464, loss: 0.17423215508460999 2023-01-22 12:33:52.024761: step: 726/464, loss: 0.2563784718513489 2023-01-22 12:33:52.819954: step: 728/464, loss: 0.23273800313472748 2023-01-22 12:33:53.552109: step: 730/464, loss: 0.19138793647289276 2023-01-22 12:33:54.278097: step: 732/464, loss: 0.07046403735876083 2023-01-22 12:33:55.006010: step: 734/464, loss: 0.32704442739486694 2023-01-22 
12:33:55.873401: step: 736/464, loss: 1.0046184062957764 2023-01-22 12:33:56.616883: step: 738/464, loss: 0.9572482705116272 2023-01-22 12:33:57.391967: step: 740/464, loss: 0.46071451902389526 2023-01-22 12:33:58.151983: step: 742/464, loss: 0.2304687350988388 2023-01-22 12:33:58.859807: step: 744/464, loss: 0.5303010940551758 2023-01-22 12:33:59.588811: step: 746/464, loss: 0.3964517414569855 2023-01-22 12:34:00.345011: step: 748/464, loss: 0.27344828844070435 2023-01-22 12:34:01.091082: step: 750/464, loss: 0.33394724130630493 2023-01-22 12:34:01.824560: step: 752/464, loss: 0.48939767479896545 2023-01-22 12:34:02.418019: step: 754/464, loss: 0.1414308100938797 2023-01-22 12:34:03.165126: step: 756/464, loss: 0.18101376295089722 2023-01-22 12:34:03.917701: step: 758/464, loss: 0.38260453939437866 2023-01-22 12:34:04.643039: step: 760/464, loss: 1.707888126373291 2023-01-22 12:34:05.409984: step: 762/464, loss: 0.31803902983665466 2023-01-22 12:34:06.193735: step: 764/464, loss: 0.38109463453292847 2023-01-22 12:34:06.983455: step: 766/464, loss: 0.3848925530910492 2023-01-22 12:34:07.800479: step: 768/464, loss: 0.447073757648468 2023-01-22 12:34:08.553602: step: 770/464, loss: 0.822684645652771 2023-01-22 12:34:09.254459: step: 772/464, loss: 0.20294032990932465 2023-01-22 12:34:10.092341: step: 774/464, loss: 0.34626081585884094 2023-01-22 12:34:10.885666: step: 776/464, loss: 0.4519239068031311 2023-01-22 12:34:11.600271: step: 778/464, loss: 0.6026832461357117 2023-01-22 12:34:12.331093: step: 780/464, loss: 0.2907651364803314 2023-01-22 12:34:13.118242: step: 782/464, loss: 0.45342278480529785 2023-01-22 12:34:13.840490: step: 784/464, loss: 0.4344886541366577 2023-01-22 12:34:14.577579: step: 786/464, loss: 0.4853147864341736 2023-01-22 12:34:15.235818: step: 788/464, loss: 0.10905138403177261 2023-01-22 12:34:15.956312: step: 790/464, loss: 0.22025759518146515 2023-01-22 12:34:16.894290: step: 792/464, loss: 0.12082141637802124 2023-01-22 12:34:17.609046: 
step: 794/464, loss: 0.8412554264068604 2023-01-22 12:34:18.393271: step: 796/464, loss: 0.2237834930419922 2023-01-22 12:34:19.103336: step: 798/464, loss: 0.18725483119487762 2023-01-22 12:34:19.828749: step: 800/464, loss: 0.3585301339626312 2023-01-22 12:34:20.533271: step: 802/464, loss: 0.2320476770401001 2023-01-22 12:34:21.392702: step: 804/464, loss: 0.6213764548301697 2023-01-22 12:34:22.240641: step: 806/464, loss: 0.1954798549413681 2023-01-22 12:34:22.962748: step: 808/464, loss: 1.6411569118499756 2023-01-22 12:34:23.739793: step: 810/464, loss: 0.23510593175888062 2023-01-22 12:34:24.505064: step: 812/464, loss: 0.9333515763282776 2023-01-22 12:34:25.227696: step: 814/464, loss: 0.5875794887542725 2023-01-22 12:34:26.080871: step: 816/464, loss: 0.22823041677474976 2023-01-22 12:34:26.782295: step: 818/464, loss: 0.2510369122028351 2023-01-22 12:34:27.532014: step: 820/464, loss: 0.9232980012893677 2023-01-22 12:34:28.262287: step: 822/464, loss: 0.29697927832603455 2023-01-22 12:34:28.966770: step: 824/464, loss: 0.19320182502269745 2023-01-22 12:34:29.673974: step: 826/464, loss: 0.20587298274040222 2023-01-22 12:34:30.415721: step: 828/464, loss: 0.3217647075653076 2023-01-22 12:34:31.183389: step: 830/464, loss: 0.5316275954246521 2023-01-22 12:34:31.931893: step: 832/464, loss: 0.4806307554244995 2023-01-22 12:34:32.599071: step: 834/464, loss: 0.12461249530315399 2023-01-22 12:34:33.358165: step: 836/464, loss: 0.1817103773355484 2023-01-22 12:34:34.116788: step: 838/464, loss: 0.20273560285568237 2023-01-22 12:34:34.895220: step: 840/464, loss: 0.6376850008964539 2023-01-22 12:34:35.703282: step: 842/464, loss: 0.1536242812871933 2023-01-22 12:34:36.441061: step: 844/464, loss: 0.4686459004878998 2023-01-22 12:34:37.259960: step: 846/464, loss: 0.18846267461776733 2023-01-22 12:34:38.019773: step: 848/464, loss: 1.2304623126983643 2023-01-22 12:34:38.735469: step: 850/464, loss: 0.7672339081764221 2023-01-22 12:34:39.467062: step: 852/464, 
loss: 0.48824992775917053 2023-01-22 12:34:40.257967: step: 854/464, loss: 0.1030515730381012 2023-01-22 12:34:40.979926: step: 856/464, loss: 1.1871172189712524 2023-01-22 12:34:41.807529: step: 858/464, loss: 0.321567565202713 2023-01-22 12:34:42.535798: step: 860/464, loss: 0.18968945741653442 2023-01-22 12:34:43.271726: step: 862/464, loss: 0.11518923193216324 2023-01-22 12:34:44.010791: step: 864/464, loss: 0.3778989911079407 2023-01-22 12:34:44.787366: step: 866/464, loss: 0.21847502887248993 2023-01-22 12:34:45.491971: step: 868/464, loss: 0.3434563875198364 2023-01-22 12:34:46.267722: step: 870/464, loss: 0.4898814558982849 2023-01-22 12:34:47.065394: step: 872/464, loss: 1.2420252561569214 2023-01-22 12:34:47.786607: step: 874/464, loss: 0.5621987581253052 2023-01-22 12:34:48.544068: step: 876/464, loss: 0.5099846124649048 2023-01-22 12:34:49.202206: step: 878/464, loss: 0.9314126372337341 2023-01-22 12:34:49.900928: step: 880/464, loss: 0.720192551612854 2023-01-22 12:34:50.667834: step: 882/464, loss: 0.15588845312595367 2023-01-22 12:34:51.367166: step: 884/464, loss: 0.18109045922756195 2023-01-22 12:34:52.096069: step: 886/464, loss: 0.513529360294342 2023-01-22 12:34:52.785056: step: 888/464, loss: 0.2132686972618103 2023-01-22 12:34:53.493279: step: 890/464, loss: 0.40210071206092834 2023-01-22 12:34:54.242013: step: 892/464, loss: 0.36502307653427124 2023-01-22 12:34:55.075044: step: 894/464, loss: 0.5589260458946228 2023-01-22 12:34:55.883121: step: 896/464, loss: 0.4219580888748169 2023-01-22 12:34:56.622599: step: 898/464, loss: 0.3626061975955963 2023-01-22 12:34:57.254391: step: 900/464, loss: 0.4053698182106018 2023-01-22 12:34:57.972717: step: 902/464, loss: 1.0465059280395508 2023-01-22 12:34:58.720673: step: 904/464, loss: 0.16469791531562805 2023-01-22 12:34:59.453143: step: 906/464, loss: 0.5895463228225708 2023-01-22 12:35:00.229119: step: 908/464, loss: 0.7060515284538269 2023-01-22 12:35:00.900888: step: 910/464, loss: 
0.3695732057094574 2023-01-22 12:35:01.608116: step: 912/464, loss: 0.22272424399852753 2023-01-22 12:35:02.289969: step: 914/464, loss: 0.381093293428421 2023-01-22 12:35:02.964759: step: 916/464, loss: 0.16798549890518188 2023-01-22 12:35:03.651902: step: 918/464, loss: 0.4496923089027405 2023-01-22 12:35:04.432935: step: 920/464, loss: 0.2888336777687073 2023-01-22 12:35:05.171991: step: 922/464, loss: 0.3023048937320709 2023-01-22 12:35:05.955476: step: 924/464, loss: 0.29694077372550964 2023-01-22 12:35:06.669598: step: 926/464, loss: 0.178087055683136 2023-01-22 12:35:07.334850: step: 928/464, loss: 0.3675735592842102 2023-01-22 12:35:08.001024: step: 930/464, loss: 0.22161225974559784
==================================================
Loss: 0.500
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3094859327619774, 'r': 0.327103727795866, 'f1': 0.3180510416022535}, 'combined': 0.23435339907534466, 'epoch': 8}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2983547954674098, 'r': 0.28596023423137495, 'f1': 0.2920260573817375}, 'combined': 0.18136355142655278, 'epoch': 8}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29660202530160235, 'r': 0.32361701052831376, 'f1': 0.30952116977934907}, 'combined': 0.2280682303637309, 'epoch': 8}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2910978686511131, 'r': 0.2844754641219582, 'f1': 0.2877485685115555}, 'combined': 0.1787070057071766, 'epoch': 8}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2682926829268293, 'r': 0.3142857142857143, 'f1': 0.2894736842105263}, 'combined': 0.19298245614035087, 'epoch': 8}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3269230769230769, 'r': 0.5543478260869565, 'f1': 0.4112903225806452}, 'combined': 0.2056451612903226, 'epoch': 8}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27277317990287775, 'r': 0.2614291157103195, 'f1': 0.2669806992485695}, 'combined': 0.19672262049894593, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2661425985210101, 'r': 0.23333409278617157, 'f1': 0.2486608198477961}, 'combined': 0.15443145653705231, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.34459459459459457, 'r': 0.36428571428571427, 'f1': 0.3541666666666667}, 'combined': 0.2361111111111111, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29660202530160235, 'r': 0.32361701052831376, 'f1': 0.30952116977934907}, 'combined': 0.2280682303637309, 'epoch': 8}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2910978686511131, 'r': 0.2844754641219582, 'f1': 0.2877485685115555}, 'combined': 0.1787070057071766, 'epoch': 8}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3269230769230769, 'r': 0.5543478260869565, 'f1': 0.4112903225806452}, 'combined': 0.2056451612903226, 'epoch': 8}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 9
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:37:57.628940: step: 2/464, loss: 0.7644608020782471 2023-01-22 12:37:58.340915: step: 4/464, loss: 0.1526871919631958 2023-01-22 12:37:59.057388: step: 6/464, loss: 0.13496136665344238 2023-01-22 12:37:59.797925: step: 8/464, loss: 0.38817891478538513 2023-01-22 12:38:00.525858: step: 10/464, loss: 0.09272399544715881 2023-01-22 12:38:01.302514: step: 12/464, loss: 0.9074238538742065 2023-01-22 12:38:02.010824: step: 14/464, loss: 0.2888231575489044 2023-01-22 12:38:02.744251: step: 16/464, loss: 0.22618751227855682 2023-01-22 12:38:03.476549: step: 18/464, loss: 0.14795294404029846 2023-01-22 12:38:04.282037: step: 20/464, loss: 0.29055696725845337 2023-01-22 12:38:05.026103: step: 22/464, loss: 1.158676028251648 2023-01-22 12:38:05.789996: step: 24/464, loss: 0.32179543375968933 2023-01-22 12:38:06.535218: step: 26/464, loss: 0.23352162539958954 2023-01-22
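The evaluation summaries in this log report per-component 'p' (precision), 'r' (recall), and 'f1' scores plus a 'combined' number. The actual formula lives in train.py, which is not shown here, but the logged values are consistent with each 'f1' being the usual harmonic mean of precision and recall, and 'combined' being the product of the template F1 and slot F1. A minimal sketch checking this against the epoch-8 "Dev Chinese" entry, with the dict copied verbatim from the log:

```python
# Sanity check of the metric relationships inferred from the logged values.
# NOTE: this is an assumption about train.py's scoring, not its actual code.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (standard F1)."""
    return 2 * p * r / (p + r) if p + r else 0.0

# "Dev Chinese" entry for epoch 8, copied from the log above.
dev_chinese = {
    "template": {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579},
    "slot": {"p": 0.3094859327619774, "r": 0.327103727795866,
             "f1": 0.3180510416022535},
    "combined": 0.23435339907534466,
}

# Each component's f1 matches the harmonic mean of its p and r.
for part in ("template", "slot"):
    scores = dev_chinese[part]
    assert abs(f1(scores["p"], scores["r"]) - scores["f1"]) < 1e-6

# 'combined' matches template_f1 * slot_f1.
combined = dev_chinese["template"]["f1"] * dev_chinese["slot"]["f1"]
assert abs(combined - dev_chinese["combined"]) < 1e-6
print(f"combined = {combined:.6f}")  # -> combined = 0.234353
```

The same identity holds for the other entries above, e.g. Dev Korean: 0.7368421052631579 × 0.30952116977934907 ≈ 0.2280682, matching its logged 'combined'.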
12:38:07.304280: step: 28/464, loss: 0.15808358788490295 2023-01-22 12:38:08.036818: step: 30/464, loss: 0.44342419505119324 2023-01-22 12:38:08.814047: step: 32/464, loss: 0.6043683290481567 2023-01-22 12:38:09.543477: step: 34/464, loss: 0.1392679214477539 2023-01-22 12:38:10.266802: step: 36/464, loss: 0.2797326147556305 2023-01-22 12:38:10.966704: step: 38/464, loss: 0.22573457658290863 2023-01-22 12:38:11.809432: step: 40/464, loss: 0.16020973026752472 2023-01-22 12:38:12.502836: step: 42/464, loss: 0.43194395303726196 2023-01-22 12:38:13.220695: step: 44/464, loss: 0.3286089599132538 2023-01-22 12:38:13.945494: step: 46/464, loss: 0.40341219305992126 2023-01-22 12:38:14.727704: step: 48/464, loss: 0.27753618359565735 2023-01-22 12:38:15.463329: step: 50/464, loss: 0.9163278937339783 2023-01-22 12:38:16.192927: step: 52/464, loss: 0.2896912693977356 2023-01-22 12:38:16.886732: step: 54/464, loss: 0.6954100728034973 2023-01-22 12:38:17.608038: step: 56/464, loss: 0.1046927198767662 2023-01-22 12:38:18.365412: step: 58/464, loss: 0.36709487438201904 2023-01-22 12:38:19.059530: step: 60/464, loss: 0.3054746985435486 2023-01-22 12:38:19.778868: step: 62/464, loss: 0.3704795837402344 2023-01-22 12:38:20.456254: step: 64/464, loss: 0.34034356474876404 2023-01-22 12:38:21.224204: step: 66/464, loss: 0.05187249183654785 2023-01-22 12:38:21.906810: step: 68/464, loss: 0.28469640016555786 2023-01-22 12:38:22.615289: step: 70/464, loss: 0.14327633380889893 2023-01-22 12:38:23.323447: step: 72/464, loss: 0.2619388699531555 2023-01-22 12:38:24.068696: step: 74/464, loss: 0.5715567469596863 2023-01-22 12:38:24.763932: step: 76/464, loss: 0.3016913831233978 2023-01-22 12:38:25.447235: step: 78/464, loss: 0.26843929290771484 2023-01-22 12:38:26.215533: step: 80/464, loss: 0.28098320960998535 2023-01-22 12:38:26.910755: step: 82/464, loss: 0.3540817201137543 2023-01-22 12:38:27.608718: step: 84/464, loss: 0.13923750817775726 2023-01-22 12:38:28.303601: step: 86/464, loss: 
0.234792560338974 2023-01-22 12:38:28.974652: step: 88/464, loss: 0.11665306985378265 2023-01-22 12:38:29.778146: step: 90/464, loss: 0.32617178559303284 2023-01-22 12:38:30.507406: step: 92/464, loss: 0.1398487240076065 2023-01-22 12:38:31.252492: step: 94/464, loss: 0.5901756882667542 2023-01-22 12:38:32.008428: step: 96/464, loss: 0.25927284359931946 2023-01-22 12:38:32.705950: step: 98/464, loss: 0.5350198149681091 2023-01-22 12:38:33.438367: step: 100/464, loss: 0.36380600929260254 2023-01-22 12:38:34.298760: step: 102/464, loss: 0.24624530971050262 2023-01-22 12:38:35.108301: step: 104/464, loss: 0.3970596492290497 2023-01-22 12:38:35.849409: step: 106/464, loss: 0.2849036455154419 2023-01-22 12:38:36.619688: step: 108/464, loss: 0.4194733500480652 2023-01-22 12:38:37.291190: step: 110/464, loss: 0.8568342328071594 2023-01-22 12:38:38.087820: step: 112/464, loss: 0.22576601803302765 2023-01-22 12:38:38.825108: step: 114/464, loss: 0.22194799780845642 2023-01-22 12:38:39.618822: step: 116/464, loss: 0.08715911954641342 2023-01-22 12:38:40.333843: step: 118/464, loss: 0.20871415734291077 2023-01-22 12:38:41.172173: step: 120/464, loss: 0.2952080965042114 2023-01-22 12:38:41.896493: step: 122/464, loss: 0.3873203992843628 2023-01-22 12:38:42.608939: step: 124/464, loss: 0.23726877570152283 2023-01-22 12:38:43.263984: step: 126/464, loss: 0.08396518975496292 2023-01-22 12:38:44.039416: step: 128/464, loss: 0.28581172227859497 2023-01-22 12:38:44.773981: step: 130/464, loss: 0.18190300464630127 2023-01-22 12:38:45.662034: step: 132/464, loss: 3.430314064025879 2023-01-22 12:38:46.408713: step: 134/464, loss: 0.25667664408683777 2023-01-22 12:38:47.134172: step: 136/464, loss: 0.46132755279541016 2023-01-22 12:38:47.915181: step: 138/464, loss: 0.20997853577136993 2023-01-22 12:38:48.734590: step: 140/464, loss: 0.5437301993370056 2023-01-22 12:38:49.443922: step: 142/464, loss: 0.18092238903045654 2023-01-22 12:38:50.156250: step: 144/464, loss: 
0.16767120361328125 2023-01-22 12:38:50.896829: step: 146/464, loss: 0.24160486459732056 2023-01-22 12:38:51.613945: step: 148/464, loss: 1.0245047807693481 2023-01-22 12:38:52.318143: step: 150/464, loss: 0.1303568333387375 2023-01-22 12:38:53.047550: step: 152/464, loss: 0.21133965253829956 2023-01-22 12:38:53.796092: step: 154/464, loss: 0.3464350700378418 2023-01-22 12:38:54.463554: step: 156/464, loss: 1.610190749168396 2023-01-22 12:38:55.206928: step: 158/464, loss: 0.49895238876342773 2023-01-22 12:38:55.900703: step: 160/464, loss: 0.09356162697076797 2023-01-22 12:38:56.668523: step: 162/464, loss: 0.3734308183193207 2023-01-22 12:38:57.556591: step: 164/464, loss: 5.4781317710876465 2023-01-22 12:38:58.358956: step: 166/464, loss: 0.2703748345375061 2023-01-22 12:38:59.120781: step: 168/464, loss: 0.48195675015449524 2023-01-22 12:38:59.885916: step: 170/464, loss: 0.30958640575408936 2023-01-22 12:39:00.637110: step: 172/464, loss: 0.47871479392051697 2023-01-22 12:39:01.358926: step: 174/464, loss: 0.26660820841789246 2023-01-22 12:39:02.128552: step: 176/464, loss: 0.4663125276565552 2023-01-22 12:39:02.855357: step: 178/464, loss: 0.1425776481628418 2023-01-22 12:39:03.571947: step: 180/464, loss: 0.08970626443624496 2023-01-22 12:39:04.210094: step: 182/464, loss: 0.26843932271003723 2023-01-22 12:39:04.851829: step: 184/464, loss: 0.19826146960258484 2023-01-22 12:39:05.573620: step: 186/464, loss: 0.366080105304718 2023-01-22 12:39:06.310176: step: 188/464, loss: 0.39854663610458374 2023-01-22 12:39:07.035850: step: 190/464, loss: 0.14912031590938568 2023-01-22 12:39:07.705180: step: 192/464, loss: 0.07307624816894531 2023-01-22 12:39:08.401965: step: 194/464, loss: 0.2346281260251999 2023-01-22 12:39:09.191408: step: 196/464, loss: 0.25821825861930847 2023-01-22 12:39:09.960446: step: 198/464, loss: 0.13114692270755768 2023-01-22 12:39:10.700394: step: 200/464, loss: 0.2134842872619629 2023-01-22 12:39:11.399735: step: 202/464, loss: 
0.5420844554901123 2023-01-22 12:39:12.241832: step: 204/464, loss: 0.5633516311645508 2023-01-22 12:39:12.940783: step: 206/464, loss: 0.11726406216621399 2023-01-22 12:39:13.607860: step: 208/464, loss: 0.21820184588432312 2023-01-22 12:39:14.384959: step: 210/464, loss: 0.19968479871749878 2023-01-22 12:39:15.046363: step: 212/464, loss: 0.1356292963027954 2023-01-22 12:39:15.746563: step: 214/464, loss: 0.18377524614334106 2023-01-22 12:39:16.463471: step: 216/464, loss: 0.1599075347185135 2023-01-22 12:39:17.275188: step: 218/464, loss: 0.6362401247024536 2023-01-22 12:39:18.023429: step: 220/464, loss: 0.16068458557128906 2023-01-22 12:39:18.712925: step: 222/464, loss: 0.20188620686531067 2023-01-22 12:39:19.426942: step: 224/464, loss: 0.12553514540195465 2023-01-22 12:39:20.242086: step: 226/464, loss: 1.1796537637710571 2023-01-22 12:39:21.056406: step: 228/464, loss: 0.6061399579048157 2023-01-22 12:39:21.767014: step: 230/464, loss: 0.2460578978061676 2023-01-22 12:39:22.424162: step: 232/464, loss: 0.0386289544403553 2023-01-22 12:39:23.203145: step: 234/464, loss: 0.30962854623794556 2023-01-22 12:39:23.895343: step: 236/464, loss: 0.2959977388381958 2023-01-22 12:39:24.650595: step: 238/464, loss: 0.1805409938097 2023-01-22 12:39:25.367439: step: 240/464, loss: 0.2075037956237793 2023-01-22 12:39:26.041081: step: 242/464, loss: 0.5304340124130249 2023-01-22 12:39:26.740196: step: 244/464, loss: 0.10196323692798615 2023-01-22 12:39:27.472589: step: 246/464, loss: 0.41461655497550964 2023-01-22 12:39:28.225644: step: 248/464, loss: 0.36507394909858704 2023-01-22 12:39:28.943083: step: 250/464, loss: 0.15062814950942993 2023-01-22 12:39:29.773637: step: 252/464, loss: 0.5410463213920593 2023-01-22 12:39:30.455872: step: 254/464, loss: 0.32974231243133545 2023-01-22 12:39:31.217747: step: 256/464, loss: 0.2526639699935913 2023-01-22 12:39:31.936069: step: 258/464, loss: 0.22622418403625488 2023-01-22 12:39:32.622922: step: 260/464, loss: 
0.15075217187404633 2023-01-22 12:39:33.489922: step: 262/464, loss: 0.2535134255886078 2023-01-22 12:39:34.247296: step: 264/464, loss: 0.2069745510816574 2023-01-22 12:39:34.974727: step: 266/464, loss: 1.1059925556182861 2023-01-22 12:39:35.720382: step: 268/464, loss: 0.16578927636146545 2023-01-22 12:39:36.423820: step: 270/464, loss: 0.1252116560935974 2023-01-22 12:39:37.213532: step: 272/464, loss: 0.4197810888290405 2023-01-22 12:39:38.026590: step: 274/464, loss: 0.7874919176101685 2023-01-22 12:39:38.816940: step: 276/464, loss: 0.43675607442855835 2023-01-22 12:39:39.603964: step: 278/464, loss: 0.15678708255290985 2023-01-22 12:39:40.366057: step: 280/464, loss: 0.3846047818660736 2023-01-22 12:39:41.077511: step: 282/464, loss: 0.35570046305656433 2023-01-22 12:39:41.841937: step: 284/464, loss: 0.10399957001209259 2023-01-22 12:39:42.596728: step: 286/464, loss: 0.3356718122959137 2023-01-22 12:39:43.338095: step: 288/464, loss: 0.2095506191253662 2023-01-22 12:39:44.085478: step: 290/464, loss: 0.1627904325723648 2023-01-22 12:39:44.823303: step: 292/464, loss: 0.47878342866897583 2023-01-22 12:39:45.589324: step: 294/464, loss: 0.29623061418533325 2023-01-22 12:39:46.297406: step: 296/464, loss: 0.31689897179603577 2023-01-22 12:39:47.064690: step: 298/464, loss: 0.14780159294605255 2023-01-22 12:39:47.799815: step: 300/464, loss: 0.11021255701780319 2023-01-22 12:39:48.518866: step: 302/464, loss: 0.151094451546669 2023-01-22 12:39:49.242630: step: 304/464, loss: 0.49748021364212036 2023-01-22 12:39:49.961658: step: 306/464, loss: 0.6771782636642456 2023-01-22 12:39:50.745743: step: 308/464, loss: 0.1410297006368637 2023-01-22 12:39:51.462246: step: 310/464, loss: 0.3992559313774109 2023-01-22 12:39:52.264437: step: 312/464, loss: 0.4508551061153412 2023-01-22 12:39:53.008405: step: 314/464, loss: 0.1885729432106018 2023-01-22 12:39:53.809294: step: 316/464, loss: 0.18776766955852509 2023-01-22 12:39:54.500879: step: 318/464, loss: 
0.19099010527133942 2023-01-22 12:39:55.195938: step: 320/464, loss: 0.5384302139282227 2023-01-22 12:39:55.861049: step: 322/464, loss: 0.10380914062261581 2023-01-22 12:39:56.628371: step: 324/464, loss: 0.22476299107074738 2023-01-22 12:39:57.408334: step: 326/464, loss: 0.2729957699775696 2023-01-22 12:39:58.149142: step: 328/464, loss: 0.5379776954650879 2023-01-22 12:39:58.857834: step: 330/464, loss: 0.770466685295105 2023-01-22 12:39:59.633652: step: 332/464, loss: 0.4133603274822235 2023-01-22 12:40:00.374308: step: 334/464, loss: 0.18049293756484985 2023-01-22 12:40:01.182690: step: 336/464, loss: 0.6266219019889832 2023-01-22 12:40:01.954469: step: 338/464, loss: 0.11049497127532959 2023-01-22 12:40:02.656485: step: 340/464, loss: 0.2000427544116974 2023-01-22 12:40:03.463468: step: 342/464, loss: 0.7935618758201599 2023-01-22 12:40:04.241328: step: 344/464, loss: 0.18541643023490906 2023-01-22 12:40:04.947580: step: 346/464, loss: 1.1642208099365234 2023-01-22 12:40:05.790506: step: 348/464, loss: 0.7056695818901062 2023-01-22 12:40:06.563523: step: 350/464, loss: 0.06701252609491348 2023-01-22 12:40:07.308732: step: 352/464, loss: 0.6939871311187744 2023-01-22 12:40:08.074884: step: 354/464, loss: 0.439879834651947 2023-01-22 12:40:08.744849: step: 356/464, loss: 0.29044264554977417 2023-01-22 12:40:09.573653: step: 358/464, loss: 0.3111569583415985 2023-01-22 12:40:10.295008: step: 360/464, loss: 0.17994770407676697 2023-01-22 12:40:11.044615: step: 362/464, loss: 0.22458504140377045 2023-01-22 12:40:11.748810: step: 364/464, loss: 0.22156009078025818 2023-01-22 12:40:12.510872: step: 366/464, loss: 0.6971669793128967 2023-01-22 12:40:13.230744: step: 368/464, loss: 0.14607200026512146 2023-01-22 12:40:13.961587: step: 370/464, loss: 0.41244274377822876 2023-01-22 12:40:14.682476: step: 372/464, loss: 0.21590563654899597 2023-01-22 12:40:15.406144: step: 374/464, loss: 0.3225278854370117 2023-01-22 12:40:16.130243: step: 376/464, loss: 
0.09340378642082214 2023-01-22 12:40:16.899065: step: 378/464, loss: 0.23559413850307465 2023-01-22 12:40:17.560268: step: 380/464, loss: 0.23710493743419647 2023-01-22 12:40:18.362872: step: 382/464, loss: 0.18201129138469696 2023-01-22 12:40:19.130549: step: 384/464, loss: 0.7326669692993164 2023-01-22 12:40:19.874209: step: 386/464, loss: 0.37326520681381226 2023-01-22 12:40:20.692023: step: 388/464, loss: 0.08627074211835861 2023-01-22 12:40:21.335328: step: 390/464, loss: 0.13408063352108002 2023-01-22 12:40:22.018182: step: 392/464, loss: 0.21325339376926422 2023-01-22 12:40:22.839210: step: 394/464, loss: 0.21826452016830444 2023-01-22 12:40:23.640623: step: 396/464, loss: 0.9736225008964539 2023-01-22 12:40:24.342907: step: 398/464, loss: 0.23253558576107025 2023-01-22 12:40:25.122616: step: 400/464, loss: 0.060269854962825775 2023-01-22 12:40:25.955939: step: 402/464, loss: 1.2768745422363281 2023-01-22 12:40:26.704843: step: 404/464, loss: 7.715400695800781 2023-01-22 12:40:27.386245: step: 406/464, loss: 0.39205795526504517 2023-01-22 12:40:28.103856: step: 408/464, loss: 0.2768790125846863 2023-01-22 12:40:28.828067: step: 410/464, loss: 0.14054222404956818 2023-01-22 12:40:29.519745: step: 412/464, loss: 0.9549347162246704 2023-01-22 12:40:30.260157: step: 414/464, loss: 0.30312371253967285 2023-01-22 12:40:30.920788: step: 416/464, loss: 0.2272430658340454 2023-01-22 12:40:31.695635: step: 418/464, loss: 0.41558340191841125 2023-01-22 12:40:32.619262: step: 420/464, loss: 0.6737045645713806 2023-01-22 12:40:33.344166: step: 422/464, loss: 0.2897966206073761 2023-01-22 12:40:33.997784: step: 424/464, loss: 0.7693721055984497 2023-01-22 12:40:34.749348: step: 426/464, loss: 0.12149006128311157 2023-01-22 12:40:35.441195: step: 428/464, loss: 0.21027140319347382 2023-01-22 12:40:36.121284: step: 430/464, loss: 0.13245093822479248 2023-01-22 12:40:36.854685: step: 432/464, loss: 0.503391444683075 2023-01-22 12:40:37.631340: step: 434/464, loss: 
0.164590984582901 2023-01-22 12:40:38.380000: step: 436/464, loss: 0.42156076431274414 2023-01-22 12:40:39.147705: step: 438/464, loss: 0.20272719860076904 2023-01-22 12:40:39.888935: step: 440/464, loss: 0.5416241884231567 2023-01-22 12:40:40.620301: step: 442/464, loss: 0.20643553137779236 2023-01-22 12:40:41.397260: step: 444/464, loss: 0.2214449644088745 2023-01-22 12:40:42.122219: step: 446/464, loss: 0.14966677129268646 2023-01-22 12:40:42.869384: step: 448/464, loss: 0.17533740401268005 2023-01-22 12:40:43.580115: step: 450/464, loss: 0.37993356585502625 2023-01-22 12:40:44.324436: step: 452/464, loss: 0.09402307868003845 2023-01-22 12:40:45.041946: step: 454/464, loss: 0.17073950171470642 2023-01-22 12:40:45.781566: step: 456/464, loss: 0.2114734798669815 2023-01-22 12:40:46.461330: step: 458/464, loss: 0.21925882995128632 2023-01-22 12:40:47.170879: step: 460/464, loss: 0.6372971534729004 2023-01-22 12:40:47.986606: step: 462/464, loss: 0.8937779664993286 2023-01-22 12:40:48.828602: step: 464/464, loss: 0.1505538523197174 2023-01-22 12:40:49.640209: step: 466/464, loss: 0.1130121722817421 2023-01-22 12:40:50.457132: step: 468/464, loss: 0.3290260434150696 2023-01-22 12:40:51.208832: step: 470/464, loss: 0.5424768328666687 2023-01-22 12:40:51.945401: step: 472/464, loss: 0.1475936770439148 2023-01-22 12:40:52.706360: step: 474/464, loss: 0.5609605312347412 2023-01-22 12:40:53.386995: step: 476/464, loss: 1.326806664466858 2023-01-22 12:40:54.131808: step: 478/464, loss: 0.3607947826385498 2023-01-22 12:40:54.838461: step: 480/464, loss: 0.825629472732544 2023-01-22 12:40:55.634397: step: 482/464, loss: 0.40424904227256775 2023-01-22 12:40:56.406199: step: 484/464, loss: 0.25395286083221436 2023-01-22 12:40:57.174991: step: 486/464, loss: 0.2789684534072876 2023-01-22 12:40:57.876924: step: 488/464, loss: 0.5685667991638184 2023-01-22 12:40:58.625202: step: 490/464, loss: 0.25906193256378174 2023-01-22 12:40:59.379543: step: 492/464, loss: 
0.14646874368190765 2023-01-22 12:41:00.153033: step: 494/464, loss: 0.10087735950946808 2023-01-22 12:41:00.884259: step: 496/464, loss: 0.10177915543317795 2023-01-22 12:41:01.619944: step: 498/464, loss: 1.3080449104309082 2023-01-22 12:41:02.344145: step: 500/464, loss: 0.9278597831726074 2023-01-22 12:41:03.226751: step: 502/464, loss: 0.2622971534729004 2023-01-22 12:41:04.019009: step: 504/464, loss: 0.07536578923463821 2023-01-22 12:41:04.826142: step: 506/464, loss: 0.19723260402679443 2023-01-22 12:41:05.629679: step: 508/464, loss: 0.12470276653766632 2023-01-22 12:41:06.437952: step: 510/464, loss: 0.2333928346633911 2023-01-22 12:41:07.162398: step: 512/464, loss: 0.1920052170753479 2023-01-22 12:41:07.924364: step: 514/464, loss: 0.22460505366325378 2023-01-22 12:41:08.729000: step: 516/464, loss: 0.2736990451812744 2023-01-22 12:41:09.425177: step: 518/464, loss: 0.3205868601799011 2023-01-22 12:41:10.249638: step: 520/464, loss: 0.25555187463760376 2023-01-22 12:41:11.023848: step: 522/464, loss: 0.3476880192756653 2023-01-22 12:41:11.831675: step: 524/464, loss: 0.15089766681194305 2023-01-22 12:41:12.577330: step: 526/464, loss: 0.1468602567911148 2023-01-22 12:41:13.314777: step: 528/464, loss: 3.05962872505188 2023-01-22 12:41:14.031507: step: 530/464, loss: 0.14920800924301147 2023-01-22 12:41:14.698055: step: 532/464, loss: 0.1825367510318756 2023-01-22 12:41:15.379080: step: 534/464, loss: 0.08955138921737671 2023-01-22 12:41:16.085025: step: 536/464, loss: 0.08840145170688629 2023-01-22 12:41:16.785690: step: 538/464, loss: 1.535651445388794 2023-01-22 12:41:17.493852: step: 540/464, loss: 0.14180992543697357 2023-01-22 12:41:18.260220: step: 542/464, loss: 0.39354032278060913 2023-01-22 12:41:18.916462: step: 544/464, loss: 0.2640589475631714 2023-01-22 12:41:19.662048: step: 546/464, loss: 0.17533175647258759 2023-01-22 12:41:20.408094: step: 548/464, loss: 0.2676864266395569 2023-01-22 12:41:21.149611: step: 550/464, loss: 
0.16309544444084167 2023-01-22 12:41:21.893453: step: 552/464, loss: 0.30232033133506775 2023-01-22 12:41:22.556177: step: 554/464, loss: 0.5454744100570679 2023-01-22 12:41:23.304012: step: 556/464, loss: 0.28288885951042175 2023-01-22 12:41:24.125047: step: 558/464, loss: 0.2505795359611511 2023-01-22 12:41:24.741935: step: 560/464, loss: 0.4683348834514618 2023-01-22 12:41:25.555285: step: 562/464, loss: 0.2200739085674286 2023-01-22 12:41:26.277680: step: 564/464, loss: 0.12059260159730911 2023-01-22 12:41:27.051344: step: 566/464, loss: 0.5861608982086182 2023-01-22 12:41:27.780128: step: 568/464, loss: 0.14045344293117523 2023-01-22 12:41:28.532991: step: 570/464, loss: 0.38023900985717773 2023-01-22 12:41:29.274503: step: 572/464, loss: 0.508532702922821 2023-01-22 12:41:29.998847: step: 574/464, loss: 0.145177960395813 2023-01-22 12:41:30.697315: step: 576/464, loss: 0.2807120382785797 2023-01-22 12:41:31.459647: step: 578/464, loss: 0.22880499064922333 2023-01-22 12:41:32.273915: step: 580/464, loss: 0.14477400481700897 2023-01-22 12:41:33.034143: step: 582/464, loss: 0.40797674655914307 2023-01-22 12:41:33.716459: step: 584/464, loss: 0.8094741702079773 2023-01-22 12:41:34.443351: step: 586/464, loss: 0.11368077993392944 2023-01-22 12:41:35.172182: step: 588/464, loss: 0.37009745836257935 2023-01-22 12:41:35.887749: step: 590/464, loss: 0.9669628739356995 2023-01-22 12:41:36.745519: step: 592/464, loss: 1.0077821016311646 2023-01-22 12:41:37.461104: step: 594/464, loss: 0.2285720556974411 2023-01-22 12:41:38.299987: step: 596/464, loss: 0.2633154094219208 2023-01-22 12:41:38.949507: step: 598/464, loss: 0.16029880940914154 2023-01-22 12:41:39.708459: step: 600/464, loss: 0.22524085640907288 2023-01-22 12:41:40.489092: step: 602/464, loss: 0.6412642598152161 2023-01-22 12:41:41.255530: step: 604/464, loss: 0.2362058162689209 2023-01-22 12:41:41.970735: step: 606/464, loss: 2.0845179557800293 2023-01-22 12:41:42.682555: step: 608/464, loss: 
0.2981489598751068 2023-01-22 12:41:43.395835: step: 610/464, loss: 0.1880742907524109 2023-01-22 12:41:44.120637: step: 612/464, loss: 0.4543037414550781 2023-01-22 12:41:44.886353: step: 614/464, loss: 0.44105908274650574 2023-01-22 12:41:45.639502: step: 616/464, loss: 0.2084379643201828 2023-01-22 12:41:46.456059: step: 618/464, loss: 0.30074912309646606 2023-01-22 12:41:47.174488: step: 620/464, loss: 0.837376058101654 2023-01-22 12:41:48.020302: step: 622/464, loss: 0.09070264548063278 2023-01-22 12:41:48.794477: step: 624/464, loss: 0.7012251615524292 2023-01-22 12:41:49.448373: step: 626/464, loss: 0.2049962729215622 2023-01-22 12:41:50.234413: step: 628/464, loss: 0.27639079093933105 2023-01-22 12:41:51.021523: step: 630/464, loss: 0.16173185408115387 2023-01-22 12:41:51.737747: step: 632/464, loss: 0.1771915704011917 2023-01-22 12:41:52.516249: step: 634/464, loss: 0.0934867411851883 2023-01-22 12:41:53.278439: step: 636/464, loss: 0.23240536451339722 2023-01-22 12:41:53.984626: step: 638/464, loss: 0.9698227047920227 2023-01-22 12:41:54.669495: step: 640/464, loss: 0.49567627906799316 2023-01-22 12:41:55.380565: step: 642/464, loss: 0.07443743199110031 2023-01-22 12:41:56.033976: step: 644/464, loss: 0.30980682373046875 2023-01-22 12:41:56.886786: step: 646/464, loss: 0.22703073918819427 2023-01-22 12:41:57.756829: step: 648/464, loss: 0.271577388048172 2023-01-22 12:41:58.488619: step: 650/464, loss: 0.3126218914985657 2023-01-22 12:41:59.189224: step: 652/464, loss: 0.5985322594642639 2023-01-22 12:41:59.879894: step: 654/464, loss: 0.24709294736385345 2023-01-22 12:42:00.647169: step: 656/464, loss: 0.7609562873840332 2023-01-22 12:42:01.297719: step: 658/464, loss: 0.2991550862789154 2023-01-22 12:42:02.061654: step: 660/464, loss: 0.28040316700935364 2023-01-22 12:42:02.848547: step: 662/464, loss: 0.09969662874937057 2023-01-22 12:42:03.568947: step: 664/464, loss: 0.8616743087768555 2023-01-22 12:42:04.304342: step: 666/464, loss: 
0.730205774307251 2023-01-22 12:42:04.951627: step: 668/464, loss: 0.2294856160879135 2023-01-22 12:42:05.644539: step: 670/464, loss: 0.17386649549007416 2023-01-22 12:42:06.394402: step: 672/464, loss: 0.21030853688716888 2023-01-22 12:42:07.125629: step: 674/464, loss: 0.4977702796459198 2023-01-22 12:42:07.812288: step: 676/464, loss: 0.23179440200328827 2023-01-22 12:42:08.467514: step: 678/464, loss: 0.3305434286594391 2023-01-22 12:42:09.280795: step: 680/464, loss: 0.5477005839347839 2023-01-22 12:42:09.973719: step: 682/464, loss: 0.4550781548023224 2023-01-22 12:42:10.685698: step: 684/464, loss: 0.127974271774292 2023-01-22 12:42:11.389020: step: 686/464, loss: 0.22048044204711914 2023-01-22 12:42:12.118104: step: 688/464, loss: 0.08273495733737946 2023-01-22 12:42:12.867291: step: 690/464, loss: 0.8862178921699524 2023-01-22 12:42:13.608353: step: 692/464, loss: 0.27741530537605286 2023-01-22 12:42:14.380154: step: 694/464, loss: 0.3920413553714752 2023-01-22 12:42:15.087800: step: 696/464, loss: 0.21315898001194 2023-01-22 12:42:15.901836: step: 698/464, loss: 0.28585731983184814 2023-01-22 12:42:16.622951: step: 700/464, loss: 0.2642350196838379 2023-01-22 12:42:17.349212: step: 702/464, loss: 0.21909445524215698 2023-01-22 12:42:18.028712: step: 704/464, loss: 0.13675111532211304 2023-01-22 12:42:18.722796: step: 706/464, loss: 0.6145305633544922 2023-01-22 12:42:19.547683: step: 708/464, loss: 0.633683979511261 2023-01-22 12:42:20.289461: step: 710/464, loss: 0.23416614532470703 2023-01-22 12:42:20.961484: step: 712/464, loss: 0.43624699115753174 2023-01-22 12:42:21.682818: step: 714/464, loss: 0.29760614037513733 2023-01-22 12:42:22.447812: step: 716/464, loss: 0.11144375801086426 2023-01-22 12:42:23.119866: step: 718/464, loss: 0.5339705348014832 2023-01-22 12:42:23.899521: step: 720/464, loss: 0.1553748995065689 2023-01-22 12:42:24.655601: step: 722/464, loss: 0.15754304826259613 2023-01-22 12:42:25.361794: step: 724/464, loss: 
0.15813219547271729 2023-01-22 12:42:26.061259: step: 726/464, loss: 0.16239820420742035 2023-01-22 12:42:26.740033: step: 728/464, loss: 0.29819780588150024 2023-01-22 12:42:27.569498: step: 730/464, loss: 0.7145152688026428 2023-01-22 12:42:28.440177: step: 732/464, loss: 0.384068101644516 2023-01-22 12:42:29.185917: step: 734/464, loss: 0.4480460584163666 2023-01-22 12:42:29.930308: step: 736/464, loss: 0.5774493217468262 2023-01-22 12:42:30.761277: step: 738/464, loss: 0.22652669250965118 2023-01-22 12:42:31.481007: step: 740/464, loss: 0.485545814037323 2023-01-22 12:42:32.174675: step: 742/464, loss: 0.051977403461933136 2023-01-22 12:42:32.900835: step: 744/464, loss: 0.1289929449558258 2023-01-22 12:42:33.665891: step: 746/464, loss: 0.15977248549461365 2023-01-22 12:42:34.428057: step: 748/464, loss: 0.14681187272071838 2023-01-22 12:42:35.249688: step: 750/464, loss: 0.14676165580749512 2023-01-22 12:42:35.929770: step: 752/464, loss: 0.4534044563770294 2023-01-22 12:42:36.815026: step: 754/464, loss: 0.31766563653945923 2023-01-22 12:42:37.562935: step: 756/464, loss: 0.5398532152175903 2023-01-22 12:42:38.261625: step: 758/464, loss: 0.14937244355678558 2023-01-22 12:42:39.066231: step: 760/464, loss: 0.26810702681541443 2023-01-22 12:42:39.786772: step: 762/464, loss: 0.4462334215641022 2023-01-22 12:42:40.543300: step: 764/464, loss: 0.44105684757232666 2023-01-22 12:42:41.350653: step: 766/464, loss: 0.3404810130596161 2023-01-22 12:42:42.153203: step: 768/464, loss: 0.4538693130016327 2023-01-22 12:42:42.847584: step: 770/464, loss: 0.17481106519699097 2023-01-22 12:42:43.484462: step: 772/464, loss: 0.22036492824554443 2023-01-22 12:42:44.178919: step: 774/464, loss: 0.2719525396823883 2023-01-22 12:42:44.925144: step: 776/464, loss: 0.21486765146255493 2023-01-22 12:42:45.670036: step: 778/464, loss: 0.4466085433959961 2023-01-22 12:42:46.402114: step: 780/464, loss: 0.21514861285686493 2023-01-22 12:42:47.071365: step: 782/464, loss: 
0.4557275176048279 2023-01-22 12:42:47.863974: step: 784/464, loss: 0.12855775654315948 2023-01-22 12:42:48.605685: step: 786/464, loss: 0.24510878324508667 2023-01-22 12:42:49.390290: step: 788/464, loss: 0.12076195329427719 2023-01-22 12:42:50.204097: step: 790/464, loss: 0.5561219453811646 2023-01-22 12:42:50.908936: step: 792/464, loss: 0.5087153911590576 2023-01-22 12:42:51.594933: step: 794/464, loss: 1.0408319234848022 2023-01-22 12:42:52.380876: step: 796/464, loss: 0.18620692193508148 2023-01-22 12:42:53.029679: step: 798/464, loss: 0.156281515955925 2023-01-22 12:42:53.770734: step: 800/464, loss: 0.48612117767333984 2023-01-22 12:42:54.555407: step: 802/464, loss: 0.6834096908569336 2023-01-22 12:42:55.237699: step: 804/464, loss: 0.2013227492570877 2023-01-22 12:42:56.025895: step: 806/464, loss: 0.5944108366966248 2023-01-22 12:42:56.767705: step: 808/464, loss: 0.1534864604473114 2023-01-22 12:42:57.467681: step: 810/464, loss: 0.2654111385345459 2023-01-22 12:42:58.216743: step: 812/464, loss: 0.11527702957391739 2023-01-22 12:42:58.974576: step: 814/464, loss: 0.4222082197666168 2023-01-22 12:42:59.682468: step: 816/464, loss: 0.19088996946811676 2023-01-22 12:43:00.470110: step: 818/464, loss: 0.459178626537323 2023-01-22 12:43:01.233181: step: 820/464, loss: 0.0953294038772583 2023-01-22 12:43:02.023962: step: 822/464, loss: 0.5236583352088928 2023-01-22 12:43:02.846430: step: 824/464, loss: 0.3258075416088104 2023-01-22 12:43:03.569547: step: 826/464, loss: 0.1721639782190323 2023-01-22 12:43:04.311215: step: 828/464, loss: 0.39053988456726074 2023-01-22 12:43:05.066463: step: 830/464, loss: 0.23920369148254395 2023-01-22 12:43:05.775114: step: 832/464, loss: 0.3683702349662781 2023-01-22 12:43:06.550472: step: 834/464, loss: 0.32811301946640015 2023-01-22 12:43:07.232323: step: 836/464, loss: 0.5699950456619263 2023-01-22 12:43:07.971650: step: 838/464, loss: 0.3164607584476471 2023-01-22 12:43:08.694787: step: 840/464, loss: 0.16344623267650604 
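Every per-step entry in this log follows the pattern `<date> <time>: step: <n>/<total>, loss: <value>`. A minimal sketch (assuming only that textual format, nothing from `train.py` itself; the function name is hypothetical) for pulling the loss values out of a saved copy of the log, e.g. to recompute the epoch-average `Loss:` figure reported at the end of each epoch:

```python
import re

# Matches entries like "2023-01-22 12:43:09.411327: step: 842/464, loss: 0.337907..."
STEP_RE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: step: \d+/\d+, loss: ([\d.]+)"
)

def step_losses(log_text: str) -> list:
    """Extract every per-step loss value from raw training-log text."""
    return [float(m.group(1)) for m in STEP_RE.finditer(log_text)]

sample = ("2023-01-22 12:43:09.411327: step: 842/464, loss: 0.33790773153305054 "
          "2023-01-22 12:43:10.162286: step: 844/464, loss: 0.17128928005695343")
losses = step_losses(sample)
mean_loss = sum(losses) / len(losses)  # the per-epoch "Loss:" line is an average like this
```

Note the regex deliberately ignores the `n/total` counter, since the step numbers here run past the nominal 464 total.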
2023-01-22 12:43:09.411327: step: 842/464, loss: 0.33790773153305054 2023-01-22 12:43:10.162286: step: 844/464, loss: 0.17128928005695343 2023-01-22 12:43:10.852866: step: 846/464, loss: 0.20610596239566803 2023-01-22 12:43:11.564594: step: 848/464, loss: 0.41363999247550964 2023-01-22 12:43:12.293961: step: 850/464, loss: 0.10889443010091782 2023-01-22 12:43:12.997679: step: 852/464, loss: 0.1969783902168274 2023-01-22 12:43:13.748831: step: 854/464, loss: 3.389371156692505 2023-01-22 12:43:14.482169: step: 856/464, loss: 0.2419338971376419 2023-01-22 12:43:15.187292: step: 858/464, loss: 0.23793935775756836 2023-01-22 12:43:15.895782: step: 860/464, loss: 0.4282359182834625 2023-01-22 12:43:16.643652: step: 862/464, loss: 0.15168102085590363 2023-01-22 12:43:17.351058: step: 864/464, loss: 0.2177678644657135 2023-01-22 12:43:18.121952: step: 866/464, loss: 0.6482052803039551 2023-01-22 12:43:18.923714: step: 868/464, loss: 0.49208351969718933 2023-01-22 12:43:19.755644: step: 870/464, loss: 0.8154049515724182 2023-01-22 12:43:20.488170: step: 872/464, loss: 0.289344847202301 2023-01-22 12:43:21.228654: step: 874/464, loss: 0.2476513385772705 2023-01-22 12:43:21.976394: step: 876/464, loss: 0.3879658877849579 2023-01-22 12:43:22.722077: step: 878/464, loss: 0.17360030114650726 2023-01-22 12:43:23.476524: step: 880/464, loss: 1.1755492687225342 2023-01-22 12:43:24.286899: step: 882/464, loss: 0.6070190072059631 2023-01-22 12:43:25.080030: step: 884/464, loss: 0.6003977060317993 2023-01-22 12:43:25.843430: step: 886/464, loss: 0.5772039294242859 2023-01-22 12:43:26.513121: step: 888/464, loss: 1.0244700908660889 2023-01-22 12:43:27.298908: step: 890/464, loss: 0.26510658860206604 2023-01-22 12:43:27.998678: step: 892/464, loss: 0.2694578766822815 2023-01-22 12:43:28.663147: step: 894/464, loss: 0.1434503048658371 2023-01-22 12:43:29.396153: step: 896/464, loss: 0.41476795077323914 2023-01-22 12:43:30.077242: step: 898/464, loss: 0.15585322678089142 2023-01-22 
12:43:30.756991: step: 900/464, loss: 0.6767136454582214 2023-01-22 12:43:31.515868: step: 902/464, loss: 0.30244138836860657 2023-01-22 12:43:32.172896: step: 904/464, loss: 0.07525203377008438 2023-01-22 12:43:32.870046: step: 906/464, loss: 0.3611765205860138 2023-01-22 12:43:33.593417: step: 908/464, loss: 0.21834854781627655 2023-01-22 12:43:34.247564: step: 910/464, loss: 0.307809442281723 2023-01-22 12:43:34.964556: step: 912/464, loss: 0.48542436957359314 2023-01-22 12:43:35.750054: step: 914/464, loss: 1.1236315965652466 2023-01-22 12:43:36.461544: step: 916/464, loss: 0.6004358530044556 2023-01-22 12:43:37.183554: step: 918/464, loss: 0.5024645328521729 2023-01-22 12:43:37.938276: step: 920/464, loss: 0.8452791571617126 2023-01-22 12:43:38.656431: step: 922/464, loss: 0.24478454887866974 2023-01-22 12:43:39.415919: step: 924/464, loss: 0.14474430680274963 2023-01-22 12:43:40.162861: step: 926/464, loss: 0.27781936526298523 2023-01-22 12:43:40.909567: step: 928/464, loss: 0.12701298296451569 2023-01-22 12:43:41.528062: step: 930/464, loss: 1.2429008483886719
==================================================
Loss: 0.400
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2988087325791506, 'r': 0.32829270619227363, 'f1': 0.3128576060819678}, 'combined': 0.2305266571130289, 'epoch': 9}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3070130647219126, 'r': 0.2824155788243113, 'f1': 0.29420108211373386}, 'combined': 0.1827143562601084, 'epoch': 9}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2842735877627297, 'r': 0.31933579498204173, 'f1': 0.3007863520206184}, 'combined': 0.22163204885729773, 'epoch': 9}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3017364948970924, 'r': 0.2874248824909062, 'f1': 0.2944068634421023}, 'combined': 0.18284215729562145, 'epoch': 9}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2923686966789003, 'r': 0.31955288289762157, 'f1': 0.30535697060207906}, 'combined': 0.22499987307521613, 'epoch': 9}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31799698762667294, 'r': 0.2922558691810785, 'f1': 0.3045835344448894}, 'combined': 0.18916240560261552, 'epoch': 9}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27698863636363635, 'r': 0.3482142857142857, 'f1': 0.30854430379746833}, 'combined': 0.20569620253164556, 'epoch': 9}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27325581395348836, 'r': 0.5108695652173914, 'f1': 0.356060606060606}, 'combined': 0.178030303030303, 'epoch': 9}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.359375, 'r': 0.2974137931034483, 'f1': 0.32547169811320753}, 'combined': 0.21698113207547168, 'epoch': 9}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2988087325791506, 'r': 0.32829270619227363, 'f1': 0.3128576060819678}, 'combined': 0.2305266571130289, 'epoch': 9}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3070130647219126, 'r': 0.2824155788243113, 'f1': 0.29420108211373386}, 'combined': 0.1827143562601084, 'epoch': 9}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27698863636363635, 'r': 0.3482142857142857, 'f1': 0.30854430379746833}, 'combined': 0.20569620253164556, 'epoch': 9}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29660202530160235, 'r': 0.32361701052831376, 'f1': 0.30952116977934907}, 'combined': 0.2280682303637309, 'epoch': 8}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2910978686511131, 'r': 0.2844754641219582, 'f1': 0.2877485685115555}, 'combined': 0.1787070057071766, 'epoch': 8}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3269230769230769, 'r': 0.5543478260869565, 'f1': 0.4112903225806452}, 'combined': 0.2056451612903226, 'epoch': 8}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 10
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:46:31.107113: step: 2/464, loss: 0.21232697367668152 2023-01-22 12:46:31.915994: step: 4/464, loss: 0.5965004563331604 2023-01-22 12:46:32.658616: step: 6/464, loss: 0.24643248319625854 2023-01-22 12:46:33.324059: step: 8/464, loss: 0.17525090277194977 2023-01-22 12:46:34.150210: step: 10/464, loss: 0.2345692366361618 2023-01-22 12:46:34.870696: step: 12/464, loss: 0.18845805525779724 2023-01-22 12:46:35.571836: step: 14/464, loss: 0.09380072355270386 2023-01-22 12:46:36.304329: step: 16/464, loss: 0.1685120165348053 2023-01-22 12:46:37.052912: step: 18/464, loss: 0.15246163308620453 2023-01-22 12:46:37.811664: step: 20/464, loss: 0.2983258068561554 2023-01-22 12:46:38.584434: step: 22/464, loss: 0.10962452739477158 2023-01-22 12:46:39.285509: step: 24/464, loss: 0.19744998216629028 2023-01-22 12:46:40.051366: step: 26/464, loss: 0.18615196645259857 2023-01-22 12:46:40.731820: step: 28/464, loss: 1.7140228748321533 2023-01-22 12:46:41.450161: step: 30/464, loss: 0.06185387820005417 2023-01-22 12:46:42.253504: step: 32/464, loss: 0.3651220500469208 2023-01-22 12:46:42.918203: step: 34/464, loss: 0.12869255244731903 2023-01-22 12:46:43.546857: step: 36/464, loss: 0.09327950328588486 2023-01-22 12:46:44.223454: step: 38/464, loss: 0.2185249775648117 2023-01-22 12:46:44.965106: step: 40/464, loss: 0.12256766110658646 2023-01-22 12:46:45.689470: step: 42/464, loss: 0.09723819047212601 2023-01-22 12:46:46.341396: step: 44/464, loss: 0.26763907074928284 2023-01-22 12:46:47.028715: step: 46/464, loss: 0.21186493337154388 2023-01-22 12:46:47.701817: step: 48/464, loss: 0.09950794279575348 2023-01-22 12:46:48.472255: step:
50/464, loss: 0.8632354736328125 2023-01-22 12:46:49.276332: step: 52/464, loss: 0.5110155940055847 2023-01-22 12:46:50.054422: step: 54/464, loss: 0.32015207409858704 2023-01-22 12:46:50.969000: step: 56/464, loss: 0.19230931997299194 2023-01-22 12:46:51.692741: step: 58/464, loss: 0.22721707820892334 2023-01-22 12:46:52.491544: step: 60/464, loss: 0.11136981099843979 2023-01-22 12:46:53.200883: step: 62/464, loss: 0.24339455366134644 2023-01-22 12:46:53.900389: step: 64/464, loss: 0.3092723786830902 2023-01-22 12:46:54.590545: step: 66/464, loss: 0.2001309096813202 2023-01-22 12:46:55.304014: step: 68/464, loss: 0.21909552812576294 2023-01-22 12:46:56.075417: step: 70/464, loss: 0.10150119662284851 2023-01-22 12:46:56.849423: step: 72/464, loss: 0.20717015862464905 2023-01-22 12:46:57.648190: step: 74/464, loss: 0.08364620059728622 2023-01-22 12:46:58.448241: step: 76/464, loss: 0.9749086499214172 2023-01-22 12:46:59.251170: step: 78/464, loss: 0.8755331039428711 2023-01-22 12:47:00.059744: step: 80/464, loss: 0.18153031170368195 2023-01-22 12:47:00.742956: step: 82/464, loss: 0.15206857025623322 2023-01-22 12:47:01.437496: step: 84/464, loss: 0.12143020331859589 2023-01-22 12:47:02.146435: step: 86/464, loss: 0.23054926097393036 2023-01-22 12:47:03.036538: step: 88/464, loss: 1.0099869966506958 2023-01-22 12:47:03.840975: step: 90/464, loss: 0.21190431714057922 2023-01-22 12:47:04.577130: step: 92/464, loss: 0.177865132689476 2023-01-22 12:47:05.302228: step: 94/464, loss: 0.16128742694854736 2023-01-22 12:47:06.014254: step: 96/464, loss: 0.6147654056549072 2023-01-22 12:47:06.826007: step: 98/464, loss: 0.44657233357429504 2023-01-22 12:47:07.554798: step: 100/464, loss: 0.61492520570755 2023-01-22 12:47:08.208697: step: 102/464, loss: 0.1421038955450058 2023-01-22 12:47:08.908777: step: 104/464, loss: 0.15594199299812317 2023-01-22 12:47:09.620387: step: 106/464, loss: 0.1598719209432602 2023-01-22 12:47:10.358450: step: 108/464, loss: 0.13725920021533966 
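In every evaluation dict this log prints, each `f1` is the standard harmonic mean of `p` and `r`, and the `combined` score is the template F1 multiplied by the slot F1 (e.g. for Dev Chinese at epoch 9, 0.7368421052631579 × 0.3128576060819678 ≈ 0.2305266571130289). A sketch of that arithmetic with hypothetical helper names (not taken from `train.py`):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    """The logged 'combined' metric is the product of the two F1 scores."""
    return template_f1 * slot_f1

# Dev Chinese, epoch 9 (p/r values copied from the log)
template = f1(1.0, 0.5833333333333334)               # ≈ 0.7368421052631579
slot = f1(0.2988087325791506, 0.32829270619227363)   # ≈ 0.3128576060819678
score = combined_score(template, slot)               # ≈ 0.2305266571130289
```

Note that because `combined` is a product of two F1s, a drop in template recall (as in the Test dicts, where r ≈ 0.46) caps the combined score regardless of slot quality.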
2023-01-22 12:47:11.042654: step: 110/464, loss: 0.135905459523201 2023-01-22 12:47:11.798894: step: 112/464, loss: 0.23689238727092743 2023-01-22 12:47:12.612626: step: 114/464, loss: 0.19912511110305786 2023-01-22 12:47:13.377905: step: 116/464, loss: 0.1403755396604538 2023-01-22 12:47:14.124737: step: 118/464, loss: 0.1728145331144333 2023-01-22 12:47:14.892830: step: 120/464, loss: 0.3202742338180542 2023-01-22 12:47:15.696201: step: 122/464, loss: 0.13201336562633514 2023-01-22 12:47:16.390740: step: 124/464, loss: 0.11553767323493958 2023-01-22 12:47:17.195251: step: 126/464, loss: 0.3744227886199951 2023-01-22 12:47:17.977639: step: 128/464, loss: 0.6965504288673401 2023-01-22 12:47:18.746047: step: 130/464, loss: 0.4453356862068176 2023-01-22 12:47:19.517907: step: 132/464, loss: 0.08157233148813248 2023-01-22 12:47:20.183706: step: 134/464, loss: 0.2185574620962143 2023-01-22 12:47:20.931920: step: 136/464, loss: 0.2841760218143463 2023-01-22 12:47:21.711116: step: 138/464, loss: 0.5950050354003906 2023-01-22 12:47:22.393687: step: 140/464, loss: 0.9071783423423767 2023-01-22 12:47:23.145962: step: 142/464, loss: 0.20753763616085052 2023-01-22 12:47:23.891857: step: 144/464, loss: 0.39771732687950134 2023-01-22 12:47:24.580788: step: 146/464, loss: 0.11873120069503784 2023-01-22 12:47:25.293774: step: 148/464, loss: 0.7489532828330994 2023-01-22 12:47:25.989830: step: 150/464, loss: 0.25500231981277466 2023-01-22 12:47:26.786266: step: 152/464, loss: 0.49194595217704773 2023-01-22 12:47:27.543421: step: 154/464, loss: 0.08242286741733551 2023-01-22 12:47:28.373011: step: 156/464, loss: 0.05969241261482239 2023-01-22 12:47:29.152237: step: 158/464, loss: 0.25976186990737915 2023-01-22 12:47:29.950194: step: 160/464, loss: 0.46697625517845154 2023-01-22 12:47:30.728465: step: 162/464, loss: 0.5594256520271301 2023-01-22 12:47:31.471364: step: 164/464, loss: 0.20075185596942902 2023-01-22 12:47:32.230012: step: 166/464, loss: 0.3751225471496582 2023-01-22 
12:47:32.958249: step: 168/464, loss: 0.5394560098648071 2023-01-22 12:47:33.679403: step: 170/464, loss: 0.4806734323501587 2023-01-22 12:47:34.449632: step: 172/464, loss: 0.22877338528633118 2023-01-22 12:47:35.180169: step: 174/464, loss: 0.04806321859359741 2023-01-22 12:47:35.890644: step: 176/464, loss: 0.20131415128707886 2023-01-22 12:47:36.629541: step: 178/464, loss: 1.2342121601104736 2023-01-22 12:47:37.332929: step: 180/464, loss: 0.19610396027565002 2023-01-22 12:47:38.043626: step: 182/464, loss: 0.32422640919685364 2023-01-22 12:47:38.798352: step: 184/464, loss: 0.1297033131122589 2023-01-22 12:47:39.625618: step: 186/464, loss: 0.17376866936683655 2023-01-22 12:47:40.359931: step: 188/464, loss: 0.19256730377674103 2023-01-22 12:47:41.082475: step: 190/464, loss: 0.1108497604727745 2023-01-22 12:47:41.756103: step: 192/464, loss: 0.24731086194515228 2023-01-22 12:47:42.577298: step: 194/464, loss: 0.14615006744861603 2023-01-22 12:47:43.290021: step: 196/464, loss: 0.0613839365541935 2023-01-22 12:47:43.980425: step: 198/464, loss: 0.17077118158340454 2023-01-22 12:47:44.651343: step: 200/464, loss: 0.18954557180404663 2023-01-22 12:47:45.399650: step: 202/464, loss: 0.11936686933040619 2023-01-22 12:47:46.118069: step: 204/464, loss: 0.05609899386763573 2023-01-22 12:47:46.784338: step: 206/464, loss: 0.18242907524108887 2023-01-22 12:47:47.477522: step: 208/464, loss: 0.15568356215953827 2023-01-22 12:47:48.174203: step: 210/464, loss: 0.25138142704963684 2023-01-22 12:47:48.937113: step: 212/464, loss: 0.41353484988212585 2023-01-22 12:47:49.687179: step: 214/464, loss: 0.12580302357673645 2023-01-22 12:47:50.387469: step: 216/464, loss: 0.05890248343348503 2023-01-22 12:47:51.139936: step: 218/464, loss: 0.4115862548351288 2023-01-22 12:47:51.856864: step: 220/464, loss: 0.16974353790283203 2023-01-22 12:47:52.666457: step: 222/464, loss: 1.10183584690094 2023-01-22 12:47:53.490890: step: 224/464, loss: 0.8364865183830261 2023-01-22 
12:47:54.192800: step: 226/464, loss: 0.60638827085495 2023-01-22 12:47:54.935886: step: 228/464, loss: 0.2671571373939514 2023-01-22 12:47:55.622097: step: 230/464, loss: 0.055853791534900665 2023-01-22 12:47:56.321769: step: 232/464, loss: 0.23713281750679016 2023-01-22 12:47:57.120352: step: 234/464, loss: 0.2899948060512543 2023-01-22 12:47:57.836636: step: 236/464, loss: 0.13573069870471954 2023-01-22 12:47:58.530187: step: 238/464, loss: 0.8137500882148743 2023-01-22 12:47:59.290253: step: 240/464, loss: 0.4511808156967163 2023-01-22 12:48:00.007765: step: 242/464, loss: 0.41318273544311523 2023-01-22 12:48:00.712302: step: 244/464, loss: 0.11683430522680283 2023-01-22 12:48:01.462781: step: 246/464, loss: 0.6399244070053101 2023-01-22 12:48:02.265266: step: 248/464, loss: 0.2516360282897949 2023-01-22 12:48:02.990628: step: 250/464, loss: 0.29193392395973206 2023-01-22 12:48:03.693153: step: 252/464, loss: 0.27569180727005005 2023-01-22 12:48:04.482047: step: 254/464, loss: 0.5015103816986084 2023-01-22 12:48:05.187606: step: 256/464, loss: 0.30280640721321106 2023-01-22 12:48:05.939026: step: 258/464, loss: 0.15494994819164276 2023-01-22 12:48:06.640510: step: 260/464, loss: 0.3739899694919586 2023-01-22 12:48:07.379194: step: 262/464, loss: 0.15253880620002747 2023-01-22 12:48:08.137104: step: 264/464, loss: 0.24771569669246674 2023-01-22 12:48:08.911386: step: 266/464, loss: 0.1636432707309723 2023-01-22 12:48:09.629644: step: 268/464, loss: 0.0732090175151825 2023-01-22 12:48:10.349857: step: 270/464, loss: 0.250224232673645 2023-01-22 12:48:11.167696: step: 272/464, loss: 0.09976043552160263 2023-01-22 12:48:11.892487: step: 274/464, loss: 0.14588111639022827 2023-01-22 12:48:12.654754: step: 276/464, loss: 0.08698129653930664 2023-01-22 12:48:13.352909: step: 278/464, loss: 0.4589356780052185 2023-01-22 12:48:14.028131: step: 280/464, loss: 0.15701982378959656 2023-01-22 12:48:14.664573: step: 282/464, loss: 0.19665288925170898 2023-01-22 
12:48:15.352259: step: 284/464, loss: 0.2983488142490387 2023-01-22 12:48:16.118627: step: 286/464, loss: 0.5707964301109314 2023-01-22 12:48:16.822791: step: 288/464, loss: 0.31916865706443787 2023-01-22 12:48:17.660303: step: 290/464, loss: 0.1850687861442566 2023-01-22 12:48:18.401503: step: 292/464, loss: 0.5208576917648315 2023-01-22 12:48:19.107944: step: 294/464, loss: 0.0874454528093338 2023-01-22 12:48:19.768403: step: 296/464, loss: 0.5099970102310181 2023-01-22 12:48:20.409915: step: 298/464, loss: 0.15315058827400208 2023-01-22 12:48:21.144338: step: 300/464, loss: 0.08985976129770279 2023-01-22 12:48:21.841273: step: 302/464, loss: 0.26404041051864624 2023-01-22 12:48:22.607076: step: 304/464, loss: 0.07328619062900543 2023-01-22 12:48:23.404421: step: 306/464, loss: 0.17824620008468628 2023-01-22 12:48:24.100743: step: 308/464, loss: 0.5839994549751282 2023-01-22 12:48:24.873264: step: 310/464, loss: 2.2232160568237305 2023-01-22 12:48:25.594513: step: 312/464, loss: 0.32043343782424927 2023-01-22 12:48:26.356133: step: 314/464, loss: 0.5206624269485474 2023-01-22 12:48:27.056936: step: 316/464, loss: 0.1408088058233261 2023-01-22 12:48:27.737669: step: 318/464, loss: 0.5124632716178894 2023-01-22 12:48:28.499166: step: 320/464, loss: 0.1030653789639473 2023-01-22 12:48:29.216320: step: 322/464, loss: 0.1808159351348877 2023-01-22 12:48:30.015498: step: 324/464, loss: 0.1294044405221939 2023-01-22 12:48:30.714792: step: 326/464, loss: 0.3512157201766968 2023-01-22 12:48:31.486594: step: 328/464, loss: 0.5164199471473694 2023-01-22 12:48:32.223896: step: 330/464, loss: 0.4815867245197296 2023-01-22 12:48:32.955637: step: 332/464, loss: 0.6558340787887573 2023-01-22 12:48:33.896952: step: 334/464, loss: 0.6065689325332642 2023-01-22 12:48:34.704362: step: 336/464, loss: 0.10820100456476212 2023-01-22 12:48:35.441666: step: 338/464, loss: 0.18210628628730774 2023-01-22 12:48:36.156082: step: 340/464, loss: 2.3327083587646484 2023-01-22 12:48:36.849624: 
step: 342/464, loss: 0.3005005419254303 2023-01-22 12:48:37.601070: step: 344/464, loss: 0.7286275625228882 2023-01-22 12:48:38.332746: step: 346/464, loss: 0.288993239402771 2023-01-22 12:48:39.045219: step: 348/464, loss: 0.2670809328556061 2023-01-22 12:48:39.786915: step: 350/464, loss: 0.1438087671995163 2023-01-22 12:48:40.586330: step: 352/464, loss: 0.32805734872817993 2023-01-22 12:48:41.291827: step: 354/464, loss: 0.11649665981531143 2023-01-22 12:48:42.047802: step: 356/464, loss: 0.1538315713405609 2023-01-22 12:48:42.763040: step: 358/464, loss: 0.1075054258108139 2023-01-22 12:48:43.484993: step: 360/464, loss: 0.4133543372154236 2023-01-22 12:48:44.194443: step: 362/464, loss: 1.6680829524993896 2023-01-22 12:48:44.934444: step: 364/464, loss: 0.25736382603645325 2023-01-22 12:48:45.670983: step: 366/464, loss: 0.1185770332813263 2023-01-22 12:48:46.352233: step: 368/464, loss: 0.3984162509441376 2023-01-22 12:48:47.065907: step: 370/464, loss: 0.3035191297531128 2023-01-22 12:48:47.794338: step: 372/464, loss: 0.18461394309997559 2023-01-22 12:48:48.566175: step: 374/464, loss: 0.09824996441602707 2023-01-22 12:48:49.303618: step: 376/464, loss: 0.2347511500120163 2023-01-22 12:48:50.011180: step: 378/464, loss: 0.6326672434806824 2023-01-22 12:48:50.749409: step: 380/464, loss: 0.11433830857276917 2023-01-22 12:48:51.433197: step: 382/464, loss: 0.18060731887817383 2023-01-22 12:48:52.129069: step: 384/464, loss: 0.2918536365032196 2023-01-22 12:48:52.847443: step: 386/464, loss: 0.20384199917316437 2023-01-22 12:48:53.583734: step: 388/464, loss: 0.28546860814094543 2023-01-22 12:48:54.456183: step: 390/464, loss: 0.07320096343755722 2023-01-22 12:48:55.123597: step: 392/464, loss: 4.606675624847412 2023-01-22 12:48:55.869529: step: 394/464, loss: 0.6734826564788818 2023-01-22 12:48:56.601461: step: 396/464, loss: 0.14568394422531128 2023-01-22 12:48:57.354434: step: 398/464, loss: 0.13595956563949585 2023-01-22 12:48:58.079532: step: 400/464, 
loss: 0.32695436477661133 2023-01-22 12:48:58.737841: step: 402/464, loss: 0.0527927540242672 2023-01-22 12:48:59.511765: step: 404/464, loss: 0.1709793359041214 2023-01-22 12:49:00.258284: step: 406/464, loss: 0.31607356667518616 2023-01-22 12:49:00.970226: step: 408/464, loss: 0.3742814362049103 2023-01-22 12:49:01.677186: step: 410/464, loss: 0.26817089319229126 2023-01-22 12:49:02.336055: step: 412/464, loss: 0.22187118232250214 2023-01-22 12:49:03.057222: step: 414/464, loss: 0.3849436044692993 2023-01-22 12:49:03.719794: step: 416/464, loss: 0.4200716018676758 2023-01-22 12:49:04.415838: step: 418/464, loss: 0.12109020352363586 2023-01-22 12:49:05.195343: step: 420/464, loss: 0.12587374448776245 2023-01-22 12:49:06.006394: step: 422/464, loss: 0.28375929594039917 2023-01-22 12:49:06.726911: step: 424/464, loss: 0.26978546380996704 2023-01-22 12:49:07.391672: step: 426/464, loss: 0.08260051906108856 2023-01-22 12:49:08.140092: step: 428/464, loss: 0.20075982809066772 2023-01-22 12:49:08.879381: step: 430/464, loss: 0.22721540927886963 2023-01-22 12:49:09.651505: step: 432/464, loss: 0.2437964528799057 2023-01-22 12:49:10.381287: step: 434/464, loss: 0.09995950013399124 2023-01-22 12:49:11.115076: step: 436/464, loss: 0.05342341214418411 2023-01-22 12:49:11.845616: step: 438/464, loss: 0.22190716862678528 2023-01-22 12:49:12.656422: step: 440/464, loss: 0.05813509225845337 2023-01-22 12:49:13.390835: step: 442/464, loss: 1.4940779209136963 2023-01-22 12:49:14.071120: step: 444/464, loss: 0.11205273866653442 2023-01-22 12:49:14.884044: step: 446/464, loss: 0.23172278702259064 2023-01-22 12:49:15.609606: step: 448/464, loss: 0.32702767848968506 2023-01-22 12:49:16.483271: step: 450/464, loss: 0.7603186368942261 2023-01-22 12:49:17.220566: step: 452/464, loss: 0.7385392189025879 2023-01-22 12:49:18.050134: step: 454/464, loss: 0.4690871834754944 2023-01-22 12:49:18.748062: step: 456/464, loss: 0.7157934904098511 2023-01-22 12:49:19.556709: step: 458/464, loss: 
0.32446351647377014 2023-01-22 12:49:20.354769: step: 460/464, loss: 0.9630855917930603 2023-01-22 12:49:21.129889: step: 462/464, loss: 0.4231507480144501 2023-01-22 12:49:21.848220: step: 464/464, loss: 0.12859512865543365 2023-01-22 12:49:22.566770: step: 466/464, loss: 0.22822855412960052 2023-01-22 12:49:23.223190: step: 468/464, loss: 0.17116977274417877 2023-01-22 12:49:24.043620: step: 470/464, loss: 0.3771575689315796 2023-01-22 12:49:24.860580: step: 472/464, loss: 0.23884013295173645 2023-01-22 12:49:25.549711: step: 474/464, loss: 0.20674486458301544 2023-01-22 12:49:26.312758: step: 476/464, loss: 0.22556279599666595 2023-01-22 12:49:27.028813: step: 478/464, loss: 0.3144245743751526 2023-01-22 12:49:27.735757: step: 480/464, loss: 0.44248825311660767 2023-01-22 12:49:28.568665: step: 482/464, loss: 0.2980799674987793 2023-01-22 12:49:29.270273: step: 484/464, loss: 0.6016811728477478 2023-01-22 12:49:29.985118: step: 486/464, loss: 0.4583924412727356 2023-01-22 12:49:30.729481: step: 488/464, loss: 0.16038653254508972 2023-01-22 12:49:31.468866: step: 490/464, loss: 0.2621766924858093 2023-01-22 12:49:32.124289: step: 492/464, loss: 0.11310744285583496 2023-01-22 12:49:32.829643: step: 494/464, loss: 0.27761155366897583 2023-01-22 12:49:33.570682: step: 496/464, loss: 0.18872065842151642 2023-01-22 12:49:34.355858: step: 498/464, loss: 0.15203912556171417 2023-01-22 12:49:35.112537: step: 500/464, loss: 0.505059003829956 2023-01-22 12:49:35.811868: step: 502/464, loss: 0.1673978865146637 2023-01-22 12:49:36.467940: step: 504/464, loss: 0.11424268037080765 2023-01-22 12:49:37.209754: step: 506/464, loss: 0.503311812877655 2023-01-22 12:49:37.887816: step: 508/464, loss: 1.2228924036026 2023-01-22 12:49:38.690033: step: 510/464, loss: 0.29327163100242615 2023-01-22 12:49:39.374245: step: 512/464, loss: 0.1813446283340454 2023-01-22 12:49:40.116942: step: 514/464, loss: 0.16473616659641266 2023-01-22 12:49:40.831277: step: 516/464, loss: 
0.4814782738685608 2023-01-22 12:49:41.602972: step: 518/464, loss: 0.24993804097175598 2023-01-22 12:49:42.340550: step: 520/464, loss: 0.15137940645217896 2023-01-22 12:49:43.092889: step: 522/464, loss: 0.12308914214372635 2023-01-22 12:49:43.816103: step: 524/464, loss: 0.766007661819458 2023-01-22 12:49:44.523642: step: 526/464, loss: 0.42049071192741394 2023-01-22 12:49:45.286472: step: 528/464, loss: 0.11827506870031357 2023-01-22 12:49:45.982914: step: 530/464, loss: 0.06717483699321747 2023-01-22 12:49:46.696020: step: 532/464, loss: 0.2546265721321106 2023-01-22 12:49:47.589950: step: 534/464, loss: 0.23186105489730835 2023-01-22 12:49:48.377307: step: 536/464, loss: 0.17715178430080414 2023-01-22 12:49:49.098391: step: 538/464, loss: 0.37223905324935913 2023-01-22 12:49:49.837618: step: 540/464, loss: 0.1452968567609787 2023-01-22 12:49:50.547505: step: 542/464, loss: 0.1264655441045761 2023-01-22 12:49:51.293407: step: 544/464, loss: 0.240777388215065 2023-01-22 12:49:52.076095: step: 546/464, loss: 0.3481501340866089 2023-01-22 12:49:52.824153: step: 548/464, loss: 0.18340934813022614 2023-01-22 12:49:53.607039: step: 550/464, loss: 0.24937808513641357 2023-01-22 12:49:54.299875: step: 552/464, loss: 0.08746077120304108 2023-01-22 12:49:54.971565: step: 554/464, loss: 0.13055148720741272 2023-01-22 12:49:55.755522: step: 556/464, loss: 0.5113113522529602 2023-01-22 12:49:56.568083: step: 558/464, loss: 0.4550716280937195 2023-01-22 12:49:57.464648: step: 560/464, loss: 0.6709883213043213 2023-01-22 12:49:58.194095: step: 562/464, loss: 0.26181280612945557 2023-01-22 12:49:58.904669: step: 564/464, loss: 0.09431593120098114 2023-01-22 12:49:59.589150: step: 566/464, loss: 0.0892590880393982 2023-01-22 12:50:00.356554: step: 568/464, loss: 0.15265516936779022 2023-01-22 12:50:01.154830: step: 570/464, loss: 0.2754208743572235 2023-01-22 12:50:01.904693: step: 572/464, loss: 0.23584358394145966 2023-01-22 12:50:02.651038: step: 574/464, loss: 
0.2605666518211365 2023-01-22 12:50:03.382980: step: 576/464, loss: 0.407966673374176 2023-01-22 12:50:04.074167: step: 578/464, loss: 0.2766265869140625 2023-01-22 12:50:04.812214: step: 580/464, loss: 0.17403101921081543 2023-01-22 12:50:05.566732: step: 582/464, loss: 0.3033926486968994 2023-01-22 12:50:06.335603: step: 584/464, loss: 0.6713306307792664 2023-01-22 12:50:07.034708: step: 586/464, loss: 0.14127404987812042 2023-01-22 12:50:07.678830: step: 588/464, loss: 0.9483509063720703 2023-01-22 12:50:08.470509: step: 590/464, loss: 0.41415512561798096 2023-01-22 12:50:09.231483: step: 592/464, loss: 0.12702539563179016 2023-01-22 12:50:10.005814: step: 594/464, loss: 0.43179914355278015 2023-01-22 12:50:10.749152: step: 596/464, loss: 0.23740844428539276 2023-01-22 12:50:11.510604: step: 598/464, loss: 0.15830759704113007 2023-01-22 12:50:12.344134: step: 600/464, loss: 0.13203541934490204 2023-01-22 12:50:13.087837: step: 602/464, loss: 0.3251971900463104 2023-01-22 12:50:13.778192: step: 604/464, loss: 0.1645713448524475 2023-01-22 12:50:14.472224: step: 606/464, loss: 0.2223944514989853 2023-01-22 12:50:15.226219: step: 608/464, loss: 0.2733703851699829 2023-01-22 12:50:15.960239: step: 610/464, loss: 0.7987146377563477 2023-01-22 12:50:16.668863: step: 612/464, loss: 0.5771629810333252 2023-01-22 12:50:17.341203: step: 614/464, loss: 0.31207334995269775 2023-01-22 12:50:18.046570: step: 616/464, loss: 0.48628270626068115 2023-01-22 12:50:18.761469: step: 618/464, loss: 0.22458508610725403 2023-01-22 12:50:19.466483: step: 620/464, loss: 0.16614551842212677 2023-01-22 12:50:20.157520: step: 622/464, loss: 0.14270026981830597 2023-01-22 12:50:20.893510: step: 624/464, loss: 0.28465527296066284 2023-01-22 12:50:21.687011: step: 626/464, loss: 1.9318636655807495 2023-01-22 12:50:22.357277: step: 628/464, loss: 0.8859139680862427 2023-01-22 12:50:23.149073: step: 630/464, loss: 0.18957673013210297 2023-01-22 12:50:23.804102: step: 632/464, loss: 
0.08624786138534546 2023-01-22 12:50:24.517307: step: 634/464, loss: 0.18184931576251984 2023-01-22 12:50:25.304388: step: 636/464, loss: 0.20926685631275177 2023-01-22 12:50:26.104538: step: 638/464, loss: 0.04862609878182411 2023-01-22 12:50:26.795017: step: 640/464, loss: 0.3046737313270569 2023-01-22 12:50:27.522641: step: 642/464, loss: 0.03974409028887749 2023-01-22 12:50:28.317222: step: 644/464, loss: 0.16337545216083527 2023-01-22 12:50:28.996285: step: 646/464, loss: 0.5412200689315796 2023-01-22 12:50:29.672363: step: 648/464, loss: 0.2021602839231491 2023-01-22 12:50:30.411186: step: 650/464, loss: 0.17793788015842438 2023-01-22 12:50:31.234679: step: 652/464, loss: 0.21241311728954315 2023-01-22 12:50:31.943417: step: 654/464, loss: 1.645676851272583 2023-01-22 12:50:32.720030: step: 656/464, loss: 0.188690647482872 2023-01-22 12:50:33.422286: step: 658/464, loss: 0.09510191529989243 2023-01-22 12:50:34.145214: step: 660/464, loss: 0.11184026300907135 2023-01-22 12:50:34.937723: step: 662/464, loss: 0.4701847434043884 2023-01-22 12:50:35.656109: step: 664/464, loss: 0.09061688184738159 2023-01-22 12:50:36.386166: step: 666/464, loss: 0.26259922981262207 2023-01-22 12:50:37.060936: step: 668/464, loss: 0.12384660542011261 2023-01-22 12:50:37.776626: step: 670/464, loss: 0.17336471378803253 2023-01-22 12:50:38.536539: step: 672/464, loss: 0.2143736183643341 2023-01-22 12:50:39.314484: step: 674/464, loss: 0.08235962688922882 2023-01-22 12:50:39.985513: step: 676/464, loss: 0.8620685338973999 2023-01-22 12:50:40.737944: step: 678/464, loss: 0.27369424700737 2023-01-22 12:50:41.466174: step: 680/464, loss: 0.251381516456604 2023-01-22 12:50:42.163879: step: 682/464, loss: 0.3370826244354248 2023-01-22 12:50:42.892870: step: 684/464, loss: 0.22877030074596405 2023-01-22 12:50:43.571787: step: 686/464, loss: 0.11255232989788055 2023-01-22 12:50:44.274509: step: 688/464, loss: 0.13139210641384125 2023-01-22 12:50:45.036074: step: 690/464, loss: 
0.32984012365341187 2023-01-22 12:50:45.748318: step: 692/464, loss: 0.11645828187465668 2023-01-22 12:50:46.425068: step: 694/464, loss: 0.315295934677124 2023-01-22 12:50:47.132966: step: 696/464, loss: 0.1865382045507431 2023-01-22 12:50:47.929413: step: 698/464, loss: 0.06216064468026161 2023-01-22 12:50:48.634291: step: 700/464, loss: 0.0807647556066513 2023-01-22 12:50:49.315847: step: 702/464, loss: 0.10790904611349106 2023-01-22 12:50:50.097389: step: 704/464, loss: 0.2581593096256256 2023-01-22 12:50:50.801339: step: 706/464, loss: 0.15664775669574738 2023-01-22 12:50:51.533846: step: 708/464, loss: 3.8456437587738037 2023-01-22 12:50:52.224009: step: 710/464, loss: 0.2697579264640808 2023-01-22 12:50:52.979883: step: 712/464, loss: 0.7548564672470093 2023-01-22 12:50:53.760736: step: 714/464, loss: 0.11732342094182968 2023-01-22 12:50:54.442585: step: 716/464, loss: 0.3719097971916199 2023-01-22 12:50:55.181747: step: 718/464, loss: 0.40137097239494324 2023-01-22 12:50:55.940122: step: 720/464, loss: 0.09929775446653366 2023-01-22 12:50:56.662644: step: 722/464, loss: 1.2608824968338013 2023-01-22 12:50:57.482024: step: 724/464, loss: 0.2232675701379776 2023-01-22 12:50:58.345904: step: 726/464, loss: 0.2738083600997925 2023-01-22 12:50:59.104383: step: 728/464, loss: 0.5997371077537537 2023-01-22 12:50:59.766392: step: 730/464, loss: 0.38297492265701294 2023-01-22 12:51:00.612218: step: 732/464, loss: 0.3900352716445923 2023-01-22 12:51:01.313245: step: 734/464, loss: 0.18780764937400818 2023-01-22 12:51:02.072297: step: 736/464, loss: 0.09367948770523071 2023-01-22 12:51:02.807744: step: 738/464, loss: 0.24276983737945557 2023-01-22 12:51:03.540273: step: 740/464, loss: 0.6175634860992432 2023-01-22 12:51:04.349278: step: 742/464, loss: 0.10526596754789352 2023-01-22 12:51:05.086606: step: 744/464, loss: 0.45263907313346863 2023-01-22 12:51:05.783816: step: 746/464, loss: 0.1915971040725708 2023-01-22 12:51:06.510481: step: 748/464, loss: 
0.15978604555130005 2023-01-22 12:51:07.331297: step: 750/464, loss: 0.14335952699184418 2023-01-22 12:51:08.042994: step: 752/464, loss: 0.2819792330265045 2023-01-22 12:51:08.898562: step: 754/464, loss: 0.6500842571258545 2023-01-22 12:51:09.674016: step: 756/464, loss: 0.303840696811676 2023-01-22 12:51:10.391435: step: 758/464, loss: 0.18252825736999512 2023-01-22 12:51:11.148236: step: 760/464, loss: 0.2440166026353836 2023-01-22 12:51:11.839756: step: 762/464, loss: 0.18778173625469208 2023-01-22 12:51:12.644333: step: 764/464, loss: 0.24666345119476318 2023-01-22 12:51:13.421404: step: 766/464, loss: 0.2585979700088501 2023-01-22 12:51:14.162383: step: 768/464, loss: 0.7861073017120361 2023-01-22 12:51:14.905617: step: 770/464, loss: 0.03258257359266281 2023-01-22 12:51:15.720010: step: 772/464, loss: 0.2287474125623703 2023-01-22 12:51:16.478368: step: 774/464, loss: 0.21275871992111206 2023-01-22 12:51:17.192879: step: 776/464, loss: 0.08996638655662537 2023-01-22 12:51:17.883498: step: 778/464, loss: 0.22523269057273865 2023-01-22 12:51:18.587439: step: 780/464, loss: 0.12908609211444855 2023-01-22 12:51:19.346717: step: 782/464, loss: 0.4444272518157959 2023-01-22 12:51:20.104611: step: 784/464, loss: 0.24299582839012146 2023-01-22 12:51:20.817933: step: 786/464, loss: 0.11312457919120789 2023-01-22 12:51:21.562845: step: 788/464, loss: 0.09250324219465256 2023-01-22 12:51:22.388515: step: 790/464, loss: 0.794976532459259 2023-01-22 12:51:23.133625: step: 792/464, loss: 0.3445376753807068 2023-01-22 12:51:23.853428: step: 794/464, loss: 0.23356936872005463 2023-01-22 12:51:24.608526: step: 796/464, loss: 0.6917689442634583 2023-01-22 12:51:25.344661: step: 798/464, loss: 0.13185939192771912 2023-01-22 12:51:26.095123: step: 800/464, loss: 2.712449312210083 2023-01-22 12:51:26.806857: step: 802/464, loss: 0.8467655181884766 2023-01-22 12:51:27.542026: step: 804/464, loss: 0.07459612935781479 2023-01-22 12:51:28.297251: step: 806/464, loss: 
0.09105648845434189 2023-01-22 12:51:28.980830: step: 808/464, loss: 0.21356217563152313 2023-01-22 12:51:29.683301: step: 810/464, loss: 0.2339024692773819 2023-01-22 12:51:30.459991: step: 812/464, loss: 0.23405727744102478 2023-01-22 12:51:31.212871: step: 814/464, loss: 0.18663422763347626 2023-01-22 12:51:31.899745: step: 816/464, loss: 0.18019157648086548 2023-01-22 12:51:32.562093: step: 818/464, loss: 0.17258216440677643 2023-01-22 12:51:33.390732: step: 820/464, loss: 0.2562738358974457 2023-01-22 12:51:34.050168: step: 822/464, loss: 0.20118999481201172 2023-01-22 12:51:34.831791: step: 824/464, loss: 0.4466688632965088 2023-01-22 12:51:35.633769: step: 826/464, loss: 0.2582007646560669 2023-01-22 12:51:36.373792: step: 828/464, loss: 0.3365974724292755 2023-01-22 12:51:37.121988: step: 830/464, loss: 0.2806304097175598 2023-01-22 12:51:37.851595: step: 832/464, loss: 1.128015160560608 2023-01-22 12:51:38.538140: step: 834/464, loss: 0.07908018678426743 2023-01-22 12:51:39.265986: step: 836/464, loss: 0.22766578197479248 2023-01-22 12:51:40.039399: step: 838/464, loss: 0.25081801414489746 2023-01-22 12:51:40.870722: step: 840/464, loss: 0.14951854944229126 2023-01-22 12:51:41.664787: step: 842/464, loss: 0.1881357729434967 2023-01-22 12:51:42.455681: step: 844/464, loss: 0.13177481293678284 2023-01-22 12:51:43.241104: step: 846/464, loss: 0.24283462762832642 2023-01-22 12:51:44.018855: step: 848/464, loss: 0.18885497748851776 2023-01-22 12:51:44.829952: step: 850/464, loss: 0.17050006985664368 2023-01-22 12:51:45.523391: step: 852/464, loss: 0.33716070652008057 2023-01-22 12:51:46.396575: step: 854/464, loss: 0.3635439872741699 2023-01-22 12:51:47.174411: step: 856/464, loss: 0.19956035912036896 2023-01-22 12:51:47.845263: step: 858/464, loss: 0.10865310579538345 2023-01-22 12:51:48.542736: step: 860/464, loss: 0.23677769303321838 2023-01-22 12:51:49.281610: step: 862/464, loss: 3.2011382579803467 2023-01-22 12:51:50.114864: step: 864/464, loss: 
0.9883920550346375 2023-01-22 12:51:50.863114: step: 866/464, loss: 0.24623115360736847 2023-01-22 12:51:51.555053: step: 868/464, loss: 0.2305896133184433 2023-01-22 12:51:52.301510: step: 870/464, loss: 0.21101327240467072 2023-01-22 12:51:53.087623: step: 872/464, loss: 0.23975980281829834 2023-01-22 12:51:53.870585: step: 874/464, loss: 0.5831571221351624 2023-01-22 12:51:54.668015: step: 876/464, loss: 0.39869871735572815 2023-01-22 12:51:55.462082: step: 878/464, loss: 0.18923348188400269 2023-01-22 12:51:56.137497: step: 880/464, loss: 0.14323951303958893 2023-01-22 12:51:56.937938: step: 882/464, loss: 4.778229713439941 2023-01-22 12:51:57.728589: step: 884/464, loss: 0.16568735241889954 2023-01-22 12:51:58.450440: step: 886/464, loss: 1.808220624923706 2023-01-22 12:51:59.142461: step: 888/464, loss: 0.0770537406206131 2023-01-22 12:51:59.948103: step: 890/464, loss: 0.5956398248672485 2023-01-22 12:52:00.725400: step: 892/464, loss: 0.06159377470612526 2023-01-22 12:52:01.419875: step: 894/464, loss: 0.10025126487016678 2023-01-22 12:52:02.148127: step: 896/464, loss: 0.27834150195121765 2023-01-22 12:52:02.876060: step: 898/464, loss: 0.25632181763648987 2023-01-22 12:52:03.565771: step: 900/464, loss: 0.8219600319862366 2023-01-22 12:52:04.321042: step: 902/464, loss: 0.05821787193417549 2023-01-22 12:52:05.034256: step: 904/464, loss: 0.08082897216081619 2023-01-22 12:52:05.757378: step: 906/464, loss: 0.34297946095466614 2023-01-22 12:52:06.549924: step: 908/464, loss: 0.11477062106132507 2023-01-22 12:52:07.244007: step: 910/464, loss: 0.6422632336616516 2023-01-22 12:52:07.976555: step: 912/464, loss: 1.6362946033477783 2023-01-22 12:52:08.751008: step: 914/464, loss: 0.8962820768356323 2023-01-22 12:52:09.470852: step: 916/464, loss: 0.2309628278017044 2023-01-22 12:52:10.323787: step: 918/464, loss: 0.3708310127258301 2023-01-22 12:52:11.064158: step: 920/464, loss: 0.24511513113975525 2023-01-22 12:52:11.748143: step: 922/464, loss: 
0.2654450833797455 2023-01-22 12:52:12.473830: step: 924/464, loss: 0.11861108988523483 2023-01-22 12:52:13.208182: step: 926/464, loss: 0.18497033417224884 2023-01-22 12:52:14.090380: step: 928/464, loss: 0.39454326033592224 2023-01-22 12:52:14.774779: step: 930/464, loss: 0.6818116307258606
==================================================
Loss: 0.360
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29345262754638907, 'r': 0.3290226430065575, 'f1': 0.3102213491204684}, 'combined': 0.22858415198350301, 'epoch': 10}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28356176775008884, 'r': 0.26198641585606036, 'f1': 0.27234746055093284}, 'combined': 0.1691421070790004, 'epoch': 10}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2772498855903927, 'r': 0.317682160572325, 'f1': 0.2960921108246913}, 'combined': 0.21817313429187776, 'epoch': 10}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2838887284877848, 'r': 0.26705738094898335, 'f1': 0.2752159567417221}, 'combined': 0.17092359418696426, 'epoch': 10}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28506743295294545, 'r': 0.3207008620720636, 'f1': 0.3018361054795893}, 'combined': 0.22240555140601315, 'epoch': 10}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2971733628112622, 'r': 0.27021088807887944, 'f1': 0.2830514881322146}, 'combined': 0.17578987157684908, 'epoch': 10}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22169811320754718, 'r': 0.3357142857142857, 'f1': 0.2670454545454545}, 'combined': 0.17803030303030298, 'epoch': 10}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.5217391304347826, 'f1': 0.3380281690140845}, 'combined': 0.16901408450704225, 'epoch': 10}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3806818181818182, 'r': 0.28879310344827586, 'f1': 0.3284313725490196}, 'combined': 0.21895424836601307, 'epoch': 10}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2988087325791506, 'r': 0.32829270619227363, 'f1': 0.3128576060819678}, 'combined': 0.2305266571130289, 'epoch': 9}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3070130647219126, 'r': 0.2824155788243113, 'f1': 0.29420108211373386}, 'combined': 0.1827143562601084, 'epoch': 9}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27698863636363635, 'r': 0.3482142857142857, 'f1': 0.30854430379746833}, 'combined': 0.20569620253164556, 'epoch': 9}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29660202530160235, 'r': 0.32361701052831376, 'f1': 0.30952116977934907}, 'combined': 0.2280682303637309, 'epoch': 8}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2910978686511131, 'r': 0.2844754641219582, 'f1': 0.2877485685115555}, 'combined': 0.1787070057071766, 'epoch': 8}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3269230769230769, 'r': 0.5543478260869565, 'f1': 0.4112903225806452}, 'combined': 0.2056451612903226, 'epoch': 8}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 11
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:54:55.132509: step: 2/464, loss: 0.43263551592826843 2023-01-22 12:54:55.896114: step: 4/464, loss: 0.5643942356109619 2023-01-22 12:54:56.620530: step: 6/464, loss: 0.11672255396842957 2023-01-22 12:54:57.304769: step: 8/464, loss: 0.06750161945819855 2023-01-22 12:54:58.062748: step: 10/464, loss: 0.20070071518421173 2023-01-22 12:54:58.763086: step: 12/464, loss: 0.41763728857040405 2023-01-22 12:54:59.625186: step: 14/464, loss: 0.11456458270549774 2023-01-22 12:55:00.390826: step: 16/464, loss: 0.38342341780662537 2023-01-22 12:55:01.195194: step: 18/464, loss: 0.35928112268447876 2023-01-22 12:55:02.095231: step: 20/464, loss: 0.2673722207546234 2023-01-22 12:55:02.736280: step: 22/464, loss: 0.7194101810455322 2023-01-22 12:55:03.418569: step: 24/464, loss: 0.27697527408599854 2023-01-22 12:55:04.205595: step: 26/464, loss: 0.15484285354614258 2023-01-22 12:55:04.840515: step: 28/464, loss: 0.35449036955833435 2023-01-22 12:55:05.565066: step: 30/464, loss: 0.11388775706291199 2023-01-22 12:55:06.282077: step: 32/464, loss: 0.23958417773246765 2023-01-22 12:55:07.000136: step: 34/464, loss: 0.07231630384922028 2023-01-22 12:55:07.716287: step: 36/464, loss: 0.14240187406539917 2023-01-22 12:55:08.343845: step: 38/464, loss: 0.8622165322303772 2023-01-22 12:55:09.067950: step: 40/464,
loss: 0.19801019132137299 2023-01-22 12:55:09.753711: step: 42/464, loss: 0.22898069024085999 2023-01-22 12:55:10.478094: step: 44/464, loss: 0.30996227264404297 2023-01-22 12:55:11.254901: step: 46/464, loss: 0.11852078139781952 2023-01-22 12:55:11.955577: step: 48/464, loss: 0.127349853515625 2023-01-22 12:55:12.753326: step: 50/464, loss: 0.2572477161884308 2023-01-22 12:55:13.476188: step: 52/464, loss: 0.5002968907356262 2023-01-22 12:55:14.286581: step: 54/464, loss: 0.17681193351745605 2023-01-22 12:55:15.025969: step: 56/464, loss: 0.07735830545425415 2023-01-22 12:55:15.727352: step: 58/464, loss: 0.3611297905445099 2023-01-22 12:55:16.566973: step: 60/464, loss: 0.19696767628192902 2023-01-22 12:55:17.258544: step: 62/464, loss: 0.13411551713943481 2023-01-22 12:55:17.981604: step: 64/464, loss: 0.09733086079359055 2023-01-22 12:55:18.726079: step: 66/464, loss: 0.1393062323331833 2023-01-22 12:55:19.480930: step: 68/464, loss: 0.24694481492042542 2023-01-22 12:55:20.330188: step: 70/464, loss: 3.0466761589050293 2023-01-22 12:55:21.106419: step: 72/464, loss: 0.12388309836387634 2023-01-22 12:55:21.779012: step: 74/464, loss: 0.07793045043945312 2023-01-22 12:55:22.495590: step: 76/464, loss: 1.1562849283218384 2023-01-22 12:55:23.205592: step: 78/464, loss: 0.528100311756134 2023-01-22 12:55:23.888221: step: 80/464, loss: 0.08475414663553238 2023-01-22 12:55:24.659849: step: 82/464, loss: 0.4272237718105316 2023-01-22 12:55:25.444361: step: 84/464, loss: 1.6456173658370972 2023-01-22 12:55:26.231004: step: 86/464, loss: 0.19112280011177063 2023-01-22 12:55:26.887947: step: 88/464, loss: 0.1366272270679474 2023-01-22 12:55:27.648283: step: 90/464, loss: 1.2250635623931885 2023-01-22 12:55:28.313372: step: 92/464, loss: 0.6903375387191772 2023-01-22 12:55:29.037624: step: 94/464, loss: 0.19363798201084137 2023-01-22 12:55:29.839688: step: 96/464, loss: 0.1598142832517624 2023-01-22 12:55:30.560681: step: 98/464, loss: 0.0631617084145546 2023-01-22 
12:55:31.319707: step: 100/464, loss: 0.1760246604681015 2023-01-22 12:55:31.994215: step: 102/464, loss: 0.051239561289548874 2023-01-22 12:55:32.735279: step: 104/464, loss: 0.27649593353271484 2023-01-22 12:55:33.496975: step: 106/464, loss: 0.4610246419906616 2023-01-22 12:55:34.292340: step: 108/464, loss: 0.23989155888557434 2023-01-22 12:55:35.051963: step: 110/464, loss: 0.05748572573065758 2023-01-22 12:55:35.868671: step: 112/464, loss: 0.18556129932403564 2023-01-22 12:55:36.629507: step: 114/464, loss: 0.13368819653987885 2023-01-22 12:55:37.303624: step: 116/464, loss: 0.10952291637659073 2023-01-22 12:55:38.013420: step: 118/464, loss: 0.07338271290063858 2023-01-22 12:55:38.802032: step: 120/464, loss: 0.21023805439472198 2023-01-22 12:55:39.563082: step: 122/464, loss: 0.12759849429130554 2023-01-22 12:55:40.322795: step: 124/464, loss: 0.38608530163764954 2023-01-22 12:55:41.193984: step: 126/464, loss: 0.046279970556497574 2023-01-22 12:55:42.111662: step: 128/464, loss: 0.6578817367553711 2023-01-22 12:55:42.811961: step: 130/464, loss: 0.06236157938838005 2023-01-22 12:55:43.511748: step: 132/464, loss: 0.2531399130821228 2023-01-22 12:55:44.264750: step: 134/464, loss: 0.14961302280426025 2023-01-22 12:55:44.967938: step: 136/464, loss: 0.08229457587003708 2023-01-22 12:55:45.721262: step: 138/464, loss: 0.14706575870513916 2023-01-22 12:55:46.445331: step: 140/464, loss: 0.7278061509132385 2023-01-22 12:55:47.207228: step: 142/464, loss: 0.18592041730880737 2023-01-22 12:55:47.897225: step: 144/464, loss: 0.1443493366241455 2023-01-22 12:55:48.643118: step: 146/464, loss: 0.1252126544713974 2023-01-22 12:55:49.454204: step: 148/464, loss: 0.4083552062511444 2023-01-22 12:55:50.210767: step: 150/464, loss: 2.9126224517822266 2023-01-22 12:55:50.930200: step: 152/464, loss: 0.29100507497787476 2023-01-22 12:55:51.647393: step: 154/464, loss: 7.7827019691467285 2023-01-22 12:55:52.452256: step: 156/464, loss: 0.2517273724079132 2023-01-22 
12:55:53.192135: step: 158/464, loss: 0.22927714884281158 2023-01-22 12:55:53.935270: step: 160/464, loss: 0.13074521720409393 2023-01-22 12:55:54.589060: step: 162/464, loss: 0.4229300916194916 2023-01-22 12:55:55.343896: step: 164/464, loss: 0.1576210856437683 2023-01-22 12:55:56.054577: step: 166/464, loss: 0.12701904773712158 2023-01-22 12:55:56.766377: step: 168/464, loss: 0.4471745193004608 2023-01-22 12:55:57.502753: step: 170/464, loss: 0.2795443832874298 2023-01-22 12:55:58.149075: step: 172/464, loss: 0.04011191427707672 2023-01-22 12:55:58.843704: step: 174/464, loss: 0.19026388227939606 2023-01-22 12:55:59.478177: step: 176/464, loss: 0.01748960092663765 2023-01-22 12:56:00.180969: step: 178/464, loss: 0.16954189538955688 2023-01-22 12:56:00.859140: step: 180/464, loss: 0.4888118505477905 2023-01-22 12:56:01.606126: step: 182/464, loss: 0.05453702062368393 2023-01-22 12:56:02.424914: step: 184/464, loss: 0.20064681768417358 2023-01-22 12:56:03.111718: step: 186/464, loss: 0.2641415596008301 2023-01-22 12:56:03.851042: step: 188/464, loss: 0.2177356630563736 2023-01-22 12:56:04.538727: step: 190/464, loss: 0.7603049874305725 2023-01-22 12:56:05.298819: step: 192/464, loss: 0.13496249914169312 2023-01-22 12:56:05.962331: step: 194/464, loss: 0.10651998221874237 2023-01-22 12:56:06.711117: step: 196/464, loss: 0.08830710500478745 2023-01-22 12:56:07.435059: step: 198/464, loss: 0.30557310581207275 2023-01-22 12:56:08.195929: step: 200/464, loss: 0.08114877343177795 2023-01-22 12:56:08.989123: step: 202/464, loss: 0.25547945499420166 2023-01-22 12:56:09.690125: step: 204/464, loss: 0.2108420878648758 2023-01-22 12:56:10.396440: step: 206/464, loss: 0.165110781788826 2023-01-22 12:56:11.179879: step: 208/464, loss: 2.033724069595337 2023-01-22 12:56:11.952622: step: 210/464, loss: 0.06678891181945801 2023-01-22 12:56:12.706331: step: 212/464, loss: 0.09556983411312103 2023-01-22 12:56:13.385790: step: 214/464, loss: 0.17456085979938507 2023-01-22 
12:56:14.169645: step: 216/464, loss: 0.06433313339948654 2023-01-22 12:56:14.917082: step: 218/464, loss: 0.11081457883119583 2023-01-22 12:56:15.618545: step: 220/464, loss: 0.2911708950996399 2023-01-22 12:56:16.417265: step: 222/464, loss: 0.13677887618541718 2023-01-22 12:56:17.160991: step: 224/464, loss: 0.16569681465625763 2023-01-22 12:56:17.903345: step: 226/464, loss: 0.13031910359859467 2023-01-22 12:56:18.642250: step: 228/464, loss: 0.8445505499839783 2023-01-22 12:56:19.349593: step: 230/464, loss: 0.09451509267091751 2023-01-22 12:56:20.128370: step: 232/464, loss: 0.2014075368642807 2023-01-22 12:56:20.780664: step: 234/464, loss: 0.399213582277298 2023-01-22 12:56:21.495231: step: 236/464, loss: 0.08709046989679337 2023-01-22 12:56:22.232457: step: 238/464, loss: 0.27674511075019836 2023-01-22 12:56:22.995848: step: 240/464, loss: 0.8091335296630859 2023-01-22 12:56:23.734087: step: 242/464, loss: 0.04643746092915535 2023-01-22 12:56:24.434761: step: 244/464, loss: 0.20637056231498718 2023-01-22 12:56:25.162259: step: 246/464, loss: 0.1723395735025406 2023-01-22 12:56:25.869738: step: 248/464, loss: 0.21998530626296997 2023-01-22 12:56:26.670617: step: 250/464, loss: 0.37970659136772156 2023-01-22 12:56:27.367760: step: 252/464, loss: 0.11584115773439407 2023-01-22 12:56:28.164240: step: 254/464, loss: 1.5689878463745117 2023-01-22 12:56:28.870408: step: 256/464, loss: 0.3336993157863617 2023-01-22 12:56:29.616202: step: 258/464, loss: 0.390744686126709 2023-01-22 12:56:30.329984: step: 260/464, loss: 0.19020554423332214 2023-01-22 12:56:31.114279: step: 262/464, loss: 0.1410345435142517 2023-01-22 12:56:31.879553: step: 264/464, loss: 0.3876209259033203 2023-01-22 12:56:32.619786: step: 266/464, loss: 0.14369329810142517 2023-01-22 12:56:33.411659: step: 268/464, loss: 0.1893119215965271 2023-01-22 12:56:34.110500: step: 270/464, loss: 0.17482930421829224 2023-01-22 12:56:34.872900: step: 272/464, loss: 0.24558770656585693 2023-01-22 
12:56:35.614389: step: 274/464, loss: 0.42578569054603577 2023-01-22 12:56:36.372622: step: 276/464, loss: 0.14443501830101013 2023-01-22 12:56:37.158300: step: 278/464, loss: 0.1805008202791214 2023-01-22 12:56:37.921886: step: 280/464, loss: 0.8799417018890381 2023-01-22 12:56:38.668021: step: 282/464, loss: 0.3285672664642334 2023-01-22 12:56:39.389047: step: 284/464, loss: 0.34625038504600525 2023-01-22 12:56:40.063240: step: 286/464, loss: 0.10508932918310165 2023-01-22 12:56:40.781567: step: 288/464, loss: 0.15506626665592194 2023-01-22 12:56:41.476961: step: 290/464, loss: 0.9398106932640076 2023-01-22 12:56:42.310522: step: 292/464, loss: 0.21145036816596985 2023-01-22 12:56:43.047755: step: 294/464, loss: 0.08800602704286575 2023-01-22 12:56:43.741391: step: 296/464, loss: 0.9233723878860474 2023-01-22 12:56:44.455088: step: 298/464, loss: 0.08186593651771545 2023-01-22 12:56:45.247047: step: 300/464, loss: 0.1038471981883049 2023-01-22 12:56:45.949981: step: 302/464, loss: 0.18606869876384735 2023-01-22 12:56:46.665341: step: 304/464, loss: 0.20838655531406403 2023-01-22 12:56:47.427595: step: 306/464, loss: 0.5244540572166443 2023-01-22 12:56:48.180448: step: 308/464, loss: 0.47285252809524536 2023-01-22 12:56:48.899645: step: 310/464, loss: 0.18951445817947388 2023-01-22 12:56:49.787340: step: 312/464, loss: 0.36259371042251587 2023-01-22 12:56:50.502130: step: 314/464, loss: 0.18875542283058167 2023-01-22 12:56:51.273380: step: 316/464, loss: 0.1639244109392166 2023-01-22 12:56:52.041031: step: 318/464, loss: 0.2972382605075836 2023-01-22 12:56:52.747247: step: 320/464, loss: 0.1820705235004425 2023-01-22 12:56:53.408255: step: 322/464, loss: 0.13749344646930695 2023-01-22 12:56:54.054960: step: 324/464, loss: 0.22790080308914185 2023-01-22 12:56:54.829039: step: 326/464, loss: 0.2505926191806793 2023-01-22 12:56:55.599419: step: 328/464, loss: 0.16676443815231323 2023-01-22 12:56:56.346484: step: 330/464, loss: 2.2576770782470703 2023-01-22 
12:56:57.134738: step: 332/464, loss: 0.2201552540063858 2023-01-22 12:56:57.924657: step: 334/464, loss: 0.1650933176279068 2023-01-22 12:56:58.790086: step: 336/464, loss: 0.22324177622795105 2023-01-22 12:56:59.527046: step: 338/464, loss: 1.0308393239974976 2023-01-22 12:57:00.243109: step: 340/464, loss: 0.4288811981678009 2023-01-22 12:57:00.960665: step: 342/464, loss: 0.38893359899520874 2023-01-22 12:57:01.729006: step: 344/464, loss: 0.42521703243255615 2023-01-22 12:57:02.470113: step: 346/464, loss: 0.3413054943084717 2023-01-22 12:57:03.152646: step: 348/464, loss: 0.3558991849422455 2023-01-22 12:57:03.832935: step: 350/464, loss: 0.370784193277359 2023-01-22 12:57:04.516187: step: 352/464, loss: 0.12938739359378815 2023-01-22 12:57:05.189199: step: 354/464, loss: 0.17673499882221222 2023-01-22 12:57:05.901680: step: 356/464, loss: 0.07088849693536758 2023-01-22 12:57:06.609504: step: 358/464, loss: 0.2451186180114746 2023-01-22 12:57:07.308520: step: 360/464, loss: 0.16760440170764923 2023-01-22 12:57:08.056679: step: 362/464, loss: 0.46949276328086853 2023-01-22 12:57:08.840540: step: 364/464, loss: 0.615884006023407 2023-01-22 12:57:09.557840: step: 366/464, loss: 0.12176218628883362 2023-01-22 12:57:10.282000: step: 368/464, loss: 0.17412437498569489 2023-01-22 12:57:11.012467: step: 370/464, loss: 0.2818848192691803 2023-01-22 12:57:11.708323: step: 372/464, loss: 0.04271039739251137 2023-01-22 12:57:12.428040: step: 374/464, loss: 0.366900771856308 2023-01-22 12:57:13.166981: step: 376/464, loss: 0.6440262198448181 2023-01-22 12:57:14.027361: step: 378/464, loss: 0.08439314365386963 2023-01-22 12:57:14.764493: step: 380/464, loss: 0.7600069642066956 2023-01-22 12:57:15.380660: step: 382/464, loss: 0.40017855167388916 2023-01-22 12:57:16.135254: step: 384/464, loss: 0.11778063327074051 2023-01-22 12:57:16.841402: step: 386/464, loss: 0.021956732496619225 2023-01-22 12:57:17.589533: step: 388/464, loss: 0.12280718237161636 2023-01-22 
12:57:18.323449: step: 390/464, loss: 0.4814095199108124 2023-01-22 12:57:19.041463: step: 392/464, loss: 0.26054421067237854 2023-01-22 12:57:19.774527: step: 394/464, loss: 0.11960306018590927 2023-01-22 12:57:20.662828: step: 396/464, loss: 0.5592536330223083 2023-01-22 12:57:21.465029: step: 398/464, loss: 0.14069019258022308 2023-01-22 12:57:22.181813: step: 400/464, loss: 0.3604012727737427 2023-01-22 12:57:22.910867: step: 402/464, loss: 0.09957307577133179 2023-01-22 12:57:23.607331: step: 404/464, loss: 0.3808475434780121 2023-01-22 12:57:24.347953: step: 406/464, loss: 0.03552810475230217 2023-01-22 12:57:25.092186: step: 408/464, loss: 0.1248745247721672 2023-01-22 12:57:25.902890: step: 410/464, loss: 0.07676209509372711 2023-01-22 12:57:26.633995: step: 412/464, loss: 0.3310137689113617 2023-01-22 12:57:27.425703: step: 414/464, loss: 0.26078546047210693 2023-01-22 12:57:28.125596: step: 416/464, loss: 0.11648165434598923 2023-01-22 12:57:28.897015: step: 418/464, loss: 0.11332453042268753 2023-01-22 12:57:29.656327: step: 420/464, loss: 0.27925005555152893 2023-01-22 12:57:30.325697: step: 422/464, loss: 0.22074222564697266 2023-01-22 12:57:31.036463: step: 424/464, loss: 0.16748501360416412 2023-01-22 12:57:31.711328: step: 426/464, loss: 1.1123805046081543 2023-01-22 12:57:32.499253: step: 428/464, loss: 0.37792542576789856 2023-01-22 12:57:33.271003: step: 430/464, loss: 0.1879335194826126 2023-01-22 12:57:33.995534: step: 432/464, loss: 1.1048240661621094 2023-01-22 12:57:34.618519: step: 434/464, loss: 0.17384260892868042 2023-01-22 12:57:35.315931: step: 436/464, loss: 0.058489926159381866 2023-01-22 12:57:36.010534: step: 438/464, loss: 0.10755658894777298 2023-01-22 12:57:36.772955: step: 440/464, loss: 0.154677152633667 2023-01-22 12:57:37.515400: step: 442/464, loss: 0.335865318775177 2023-01-22 12:57:38.252100: step: 444/464, loss: 0.17729879915714264 2023-01-22 12:57:38.987319: step: 446/464, loss: 0.1523967832326889 2023-01-22 
12:57:39.798633: step: 448/464, loss: 0.14690501987934113 2023-01-22 12:57:40.590766: step: 450/464, loss: 0.3801630437374115 2023-01-22 12:57:41.560780: step: 452/464, loss: 0.18971210718154907 2023-01-22 12:57:42.324862: step: 454/464, loss: 0.28348278999328613 2023-01-22 12:57:43.019122: step: 456/464, loss: 0.11890676617622375 2023-01-22 12:57:43.702795: step: 458/464, loss: 0.08258378505706787 2023-01-22 12:57:44.398931: step: 460/464, loss: 0.10658489167690277 2023-01-22 12:57:45.130964: step: 462/464, loss: 0.3449063301086426 2023-01-22 12:57:45.852891: step: 464/464, loss: 0.2024800330400467 2023-01-22 12:57:46.575541: step: 466/464, loss: 0.11131348460912704 2023-01-22 12:57:47.327955: step: 468/464, loss: 0.050689972937107086 2023-01-22 12:57:47.986234: step: 470/464, loss: 0.09831853210926056 2023-01-22 12:57:48.774227: step: 472/464, loss: 0.16549623012542725 2023-01-22 12:57:49.452146: step: 474/464, loss: 0.3030500113964081 2023-01-22 12:57:50.197274: step: 476/464, loss: 0.11540064960718155 2023-01-22 12:57:50.909372: step: 478/464, loss: 0.1258566528558731 2023-01-22 12:57:51.739774: step: 480/464, loss: 0.3872591257095337 2023-01-22 12:57:52.506808: step: 482/464, loss: 0.4863373041152954 2023-01-22 12:57:53.196933: step: 484/464, loss: 0.03375108912587166 2023-01-22 12:57:53.926595: step: 486/464, loss: 0.15512655675411224 2023-01-22 12:57:54.667104: step: 488/464, loss: 0.18190723657608032 2023-01-22 12:57:55.370911: step: 490/464, loss: 0.08832676708698273 2023-01-22 12:57:56.129668: step: 492/464, loss: 0.1871776580810547 2023-01-22 12:57:56.859665: step: 494/464, loss: 0.22434929013252258 2023-01-22 12:57:57.646152: step: 496/464, loss: 0.34984269738197327 2023-01-22 12:57:58.348272: step: 498/464, loss: 0.9877106547355652 2023-01-22 12:57:59.067583: step: 500/464, loss: 0.1210198700428009 2023-01-22 12:57:59.893604: step: 502/464, loss: 0.0619652234017849 2023-01-22 12:58:00.622833: step: 504/464, loss: 0.21530750393867493 2023-01-22 
12:58:01.371221: step: 506/464, loss: 0.10127289593219757 2023-01-22 12:58:02.039991: step: 508/464, loss: 0.19827330112457275 2023-01-22 12:58:02.743498: step: 510/464, loss: 0.2336958348751068 2023-01-22 12:58:03.414941: step: 512/464, loss: 0.14155313372612 2023-01-22 12:58:04.216075: step: 514/464, loss: 2.5356342792510986 2023-01-22 12:58:04.942407: step: 516/464, loss: 0.1917305439710617 2023-01-22 12:58:05.724670: step: 518/464, loss: 0.32812514901161194 2023-01-22 12:58:06.529409: step: 520/464, loss: 0.7535200119018555 2023-01-22 12:58:07.237995: step: 522/464, loss: 0.0939045324921608 2023-01-22 12:58:07.968574: step: 524/464, loss: 0.08146481961011887 2023-01-22 12:58:08.666222: step: 526/464, loss: 0.8648735284805298 2023-01-22 12:58:09.405936: step: 528/464, loss: 0.22401273250579834 2023-01-22 12:58:10.128014: step: 530/464, loss: 0.10157474130392075 2023-01-22 12:58:10.786464: step: 532/464, loss: 0.29921257495880127 2023-01-22 12:58:11.570869: step: 534/464, loss: 0.24234499037265778 2023-01-22 12:58:12.290616: step: 536/464, loss: 0.23960405588150024 2023-01-22 12:58:13.036092: step: 538/464, loss: 0.09281633049249649 2023-01-22 12:58:13.719884: step: 540/464, loss: 0.11242015659809113 2023-01-22 12:58:14.379403: step: 542/464, loss: 0.2687571048736572 2023-01-22 12:58:15.154175: step: 544/464, loss: 0.07830285280942917 2023-01-22 12:58:15.867295: step: 546/464, loss: 0.40920308232307434 2023-01-22 12:58:16.644111: step: 548/464, loss: 0.16095489263534546 2023-01-22 12:58:17.376814: step: 550/464, loss: 0.12933610379695892 2023-01-22 12:58:18.049955: step: 552/464, loss: 0.21303704380989075 2023-01-22 12:58:18.720673: step: 554/464, loss: 0.17689307034015656 2023-01-22 12:58:19.465272: step: 556/464, loss: 0.45598405599594116 2023-01-22 12:58:20.225286: step: 558/464, loss: 0.22006511688232422 2023-01-22 12:58:20.942999: step: 560/464, loss: 0.09161525964736938 2023-01-22 12:58:21.688136: step: 562/464, loss: 0.07944789528846741 2023-01-22 
12:58:22.439242: step: 564/464, loss: 0.21354304254055023 2023-01-22 12:58:23.200021: step: 566/464, loss: 0.22804072499275208 2023-01-22 12:58:23.980018: step: 568/464, loss: 0.6266902685165405 2023-01-22 12:58:24.683421: step: 570/464, loss: 0.6643524169921875 2023-01-22 12:58:25.438403: step: 572/464, loss: 0.11704916507005692 2023-01-22 12:58:26.195520: step: 574/464, loss: 0.1938568651676178 2023-01-22 12:58:26.922344: step: 576/464, loss: 0.04783834144473076 2023-01-22 12:58:27.643574: step: 578/464, loss: 0.08962643146514893 2023-01-22 12:58:28.381889: step: 580/464, loss: 0.1545608639717102 2023-01-22 12:58:29.158916: step: 582/464, loss: 0.37907034158706665 2023-01-22 12:58:29.882913: step: 584/464, loss: 0.442465603351593 2023-01-22 12:58:30.605214: step: 586/464, loss: 0.23899458348751068 2023-01-22 12:58:31.289500: step: 588/464, loss: 0.09116481244564056 2023-01-22 12:58:32.020715: step: 590/464, loss: 0.20019377768039703 2023-01-22 12:58:32.794723: step: 592/464, loss: 0.24498000741004944 2023-01-22 12:58:33.534847: step: 594/464, loss: 0.1740119606256485 2023-01-22 12:58:34.269565: step: 596/464, loss: 0.09426325559616089 2023-01-22 12:58:35.009166: step: 598/464, loss: 0.6598936915397644 2023-01-22 12:58:35.696658: step: 600/464, loss: 0.21235957741737366 2023-01-22 12:58:36.462879: step: 602/464, loss: 0.17811620235443115 2023-01-22 12:58:37.188177: step: 604/464, loss: 0.19934476912021637 2023-01-22 12:58:37.995036: step: 606/464, loss: 0.11637186259031296 2023-01-22 12:58:38.708873: step: 608/464, loss: 0.37004193663597107 2023-01-22 12:58:39.419346: step: 610/464, loss: 0.28444433212280273 2023-01-22 12:58:40.144915: step: 612/464, loss: 0.16172698140144348 2023-01-22 12:58:40.831537: step: 614/464, loss: 0.12194719165563583 2023-01-22 12:58:41.624004: step: 616/464, loss: 0.0865287110209465 2023-01-22 12:58:42.371022: step: 618/464, loss: 0.12266885489225388 2023-01-22 12:58:43.090097: step: 620/464, loss: 0.4905916154384613 2023-01-22 
12:58:43.710639: step: 622/464, loss: 0.34407004714012146 2023-01-22 12:58:44.464494: step: 624/464, loss: 0.06899049878120422 2023-01-22 12:58:45.227674: step: 626/464, loss: 0.07489988952875137 2023-01-22 12:58:45.969950: step: 628/464, loss: 0.08575539290904999 2023-01-22 12:58:46.709574: step: 630/464, loss: 0.27297940850257874 2023-01-22 12:58:47.436606: step: 632/464, loss: 0.16481491923332214 2023-01-22 12:58:48.128857: step: 634/464, loss: 0.15143327414989471 2023-01-22 12:58:48.850437: step: 636/464, loss: 0.44829922914505005 2023-01-22 12:58:49.530078: step: 638/464, loss: 0.09175318479537964 2023-01-22 12:58:50.249953: step: 640/464, loss: 0.1762494593858719 2023-01-22 12:58:50.879319: step: 642/464, loss: 0.038562506437301636 2023-01-22 12:58:51.676948: step: 644/464, loss: 0.2596011757850647 2023-01-22 12:58:52.436799: step: 646/464, loss: 0.14710719883441925 2023-01-22 12:58:53.180191: step: 648/464, loss: 0.07652927190065384 2023-01-22 12:58:53.937342: step: 650/464, loss: 0.24959689378738403 2023-01-22 12:58:54.794643: step: 652/464, loss: 0.09222123771905899 2023-01-22 12:58:55.515102: step: 654/464, loss: 0.540006697177887 2023-01-22 12:58:56.170139: step: 656/464, loss: 0.3779156804084778 2023-01-22 12:58:56.839570: step: 658/464, loss: 0.12363962829113007 2023-01-22 12:58:57.609009: step: 660/464, loss: 1.2623299360275269 2023-01-22 12:58:58.326159: step: 662/464, loss: 0.40365299582481384 2023-01-22 12:58:59.031480: step: 664/464, loss: 0.06017347052693367 2023-01-22 12:58:59.799550: step: 666/464, loss: 0.16187569499015808 2023-01-22 12:59:00.581592: step: 668/464, loss: 0.08684320747852325 2023-01-22 12:59:01.327711: step: 670/464, loss: 0.2427690476179123 2023-01-22 12:59:02.102057: step: 672/464, loss: 0.12731128931045532 2023-01-22 12:59:02.927726: step: 674/464, loss: 0.09823483228683472 2023-01-22 12:59:03.651877: step: 676/464, loss: 0.09728223085403442 2023-01-22 12:59:04.397458: step: 678/464, loss: 0.06802794337272644 2023-01-22 
12:59:05.172256: step: 680/464, loss: 0.46626871824264526 2023-01-22 12:59:05.840950: step: 682/464, loss: 0.1886630654335022 2023-01-22 12:59:06.584879: step: 684/464, loss: 0.15889041125774384 2023-01-22 12:59:07.428339: step: 686/464, loss: 0.4553874135017395 2023-01-22 12:59:08.081232: step: 688/464, loss: 0.09692560881376266 2023-01-22 12:59:08.768858: step: 690/464, loss: 0.7323739528656006 2023-01-22 12:59:09.548266: step: 692/464, loss: 0.5018308758735657 2023-01-22 12:59:10.277126: step: 694/464, loss: 0.3774145841598511 2023-01-22 12:59:11.058271: step: 696/464, loss: 0.22533495724201202 2023-01-22 12:59:11.796691: step: 698/464, loss: 0.11258202791213989 2023-01-22 12:59:12.508788: step: 700/464, loss: 0.24560454487800598 2023-01-22 12:59:13.261431: step: 702/464, loss: 0.23531009256839752 2023-01-22 12:59:14.031584: step: 704/464, loss: 0.07021478563547134 2023-01-22 12:59:14.789622: step: 706/464, loss: 0.18745137751102448 2023-01-22 12:59:15.486309: step: 708/464, loss: 0.20717504620552063 2023-01-22 12:59:16.274910: step: 710/464, loss: 0.20312827825546265 2023-01-22 12:59:17.013041: step: 712/464, loss: 0.2219710499048233 2023-01-22 12:59:17.755244: step: 714/464, loss: 1.655066967010498 2023-01-22 12:59:18.473162: step: 716/464, loss: 0.6989168524742126 2023-01-22 12:59:19.239157: step: 718/464, loss: 0.23458097875118256 2023-01-22 12:59:19.951153: step: 720/464, loss: 0.27344822883605957 2023-01-22 12:59:20.720434: step: 722/464, loss: 0.12376883625984192 2023-01-22 12:59:21.537457: step: 724/464, loss: 1.085131049156189 2023-01-22 12:59:22.223044: step: 726/464, loss: 0.23603534698486328 2023-01-22 12:59:23.027071: step: 728/464, loss: 0.16410315036773682 2023-01-22 12:59:23.733548: step: 730/464, loss: 0.2255164086818695 2023-01-22 12:59:24.485218: step: 732/464, loss: 0.2828912138938904 2023-01-22 12:59:25.190540: step: 734/464, loss: 0.1097203716635704 2023-01-22 12:59:25.953664: step: 736/464, loss: 0.10253311693668365 2023-01-22 
12:59:26.674761: step: 738/464, loss: 0.3227848708629608 2023-01-22 12:59:27.512506: step: 740/464, loss: 0.41590064764022827 2023-01-22 12:59:28.227673: step: 742/464, loss: 0.17311303317546844 2023-01-22 12:59:29.037628: step: 744/464, loss: 0.3792298436164856 2023-01-22 12:59:29.727118: step: 746/464, loss: 0.1337263584136963 2023-01-22 12:59:30.441627: step: 748/464, loss: 0.056437354534864426 2023-01-22 12:59:31.210811: step: 750/464, loss: 0.7661721706390381 2023-01-22 12:59:31.904632: step: 752/464, loss: 0.1342168003320694 2023-01-22 12:59:32.710948: step: 754/464, loss: 2.0010664463043213 2023-01-22 12:59:33.420844: step: 756/464, loss: 0.35609573125839233 2023-01-22 12:59:34.076198: step: 758/464, loss: 0.1894560605287552 2023-01-22 12:59:34.778036: step: 760/464, loss: 0.6358622312545776 2023-01-22 12:59:35.574932: step: 762/464, loss: 0.08535061031579971 2023-01-22 12:59:36.377851: step: 764/464, loss: 0.8524316549301147 2023-01-22 12:59:37.145033: step: 766/464, loss: 0.5615988969802856 2023-01-22 12:59:37.876972: step: 768/464, loss: 0.12083742022514343 2023-01-22 12:59:38.669678: step: 770/464, loss: 0.5996964573860168 2023-01-22 12:59:39.406177: step: 772/464, loss: 0.19729484617710114 2023-01-22 12:59:40.186926: step: 774/464, loss: 0.17660316824913025 2023-01-22 12:59:40.881736: step: 776/464, loss: 0.3362042307853699 2023-01-22 12:59:41.587385: step: 778/464, loss: 0.9083384871482849 2023-01-22 12:59:42.388849: step: 780/464, loss: 0.8565486669540405 2023-01-22 12:59:43.098172: step: 782/464, loss: 0.09819147735834122 2023-01-22 12:59:43.912292: step: 784/464, loss: 0.21145141124725342 2023-01-22 12:59:44.628353: step: 786/464, loss: 0.35320553183555603 2023-01-22 12:59:45.379743: step: 788/464, loss: 0.28306126594543457 2023-01-22 12:59:46.145134: step: 790/464, loss: 0.6834411025047302 2023-01-22 12:59:46.810990: step: 792/464, loss: 0.12468130141496658 2023-01-22 12:59:47.558874: step: 794/464, loss: 0.19863691926002502 2023-01-22 
12:59:48.333039: step: 796/464, loss: 0.4513365626335144 2023-01-22 12:59:49.082466: step: 798/464, loss: 0.1359991431236267 2023-01-22 12:59:49.804001: step: 800/464, loss: 0.17588861286640167 2023-01-22 12:59:50.540828: step: 802/464, loss: 0.09235641360282898 2023-01-22 12:59:51.331409: step: 804/464, loss: 0.1508709341287613 2023-01-22 12:59:52.067337: step: 806/464, loss: 0.2306538075208664 2023-01-22 12:59:52.802286: step: 808/464, loss: 0.19599375128746033 2023-01-22 12:59:53.618689: step: 810/464, loss: 0.2671712338924408 2023-01-22 12:59:54.333711: step: 812/464, loss: 0.09009677171707153 2023-01-22 12:59:55.107112: step: 814/464, loss: 0.1923481523990631 2023-01-22 12:59:55.794952: step: 816/464, loss: 0.14651606976985931 2023-01-22 12:59:56.561013: step: 818/464, loss: 0.41739848256111145 2023-01-22 12:59:57.288482: step: 820/464, loss: 0.2460222989320755 2023-01-22 12:59:58.083606: step: 822/464, loss: 0.2720450162887573 2023-01-22 12:59:58.795507: step: 824/464, loss: 0.19671209156513214 2023-01-22 12:59:59.563357: step: 826/464, loss: 0.1822405904531479 2023-01-22 13:00:00.265366: step: 828/464, loss: 0.18423821032047272 2023-01-22 13:00:01.037280: step: 830/464, loss: 0.36167916655540466 2023-01-22 13:00:01.790944: step: 832/464, loss: 0.16588671505451202 2023-01-22 13:00:02.579484: step: 834/464, loss: 0.2735016644001007 2023-01-22 13:00:03.374004: step: 836/464, loss: 0.12201464921236038 2023-01-22 13:00:04.103267: step: 838/464, loss: 0.20774728059768677 2023-01-22 13:00:04.751554: step: 840/464, loss: 0.6711456179618835 2023-01-22 13:00:05.477461: step: 842/464, loss: 0.08495127409696579 2023-01-22 13:00:06.213099: step: 844/464, loss: 0.16191741824150085 2023-01-22 13:00:06.988799: step: 846/464, loss: 0.25720930099487305 2023-01-22 13:00:07.695252: step: 848/464, loss: 0.21723899245262146 2023-01-22 13:00:08.436252: step: 850/464, loss: 0.7831099033355713 2023-01-22 13:00:09.164749: step: 852/464, loss: 0.4225570559501648 2023-01-22 
13:00:09.955373: step: 854/464, loss: 0.21615463495254517 2023-01-22 13:00:10.683887: step: 856/464, loss: 0.286521852016449 2023-01-22 13:00:11.426384: step: 858/464, loss: 0.11874765157699585 2023-01-22 13:00:12.209856: step: 860/464, loss: 0.12384872138500214 2023-01-22 13:00:12.922814: step: 862/464, loss: 0.15509487688541412 2023-01-22 13:00:13.580681: step: 864/464, loss: 0.16075512766838074 2023-01-22 13:00:14.323553: step: 866/464, loss: 0.3110632300376892 2023-01-22 13:00:15.028580: step: 868/464, loss: 0.10222408920526505 2023-01-22 13:00:15.763217: step: 870/464, loss: 0.09057188034057617 2023-01-22 13:00:16.498799: step: 872/464, loss: 0.13266919553279877 2023-01-22 13:00:17.231042: step: 874/464, loss: 0.1931861937046051 2023-01-22 13:00:17.958932: step: 876/464, loss: 0.08242028206586838 2023-01-22 13:00:18.781315: step: 878/464, loss: 0.1272575557231903 2023-01-22 13:00:19.549532: step: 880/464, loss: 0.22169773280620575 2023-01-22 13:00:20.312410: step: 882/464, loss: 0.26219919323921204 2023-01-22 13:00:21.023400: step: 884/464, loss: 0.04633217677474022 2023-01-22 13:00:21.748744: step: 886/464, loss: 0.1503944844007492 2023-01-22 13:00:22.520433: step: 888/464, loss: 0.27836212515830994 2023-01-22 13:00:23.332001: step: 890/464, loss: 0.16426533460617065 2023-01-22 13:00:24.083840: step: 892/464, loss: 0.15028542280197144 2023-01-22 13:00:24.797031: step: 894/464, loss: 0.12456360459327698 2023-01-22 13:00:25.569831: step: 896/464, loss: 0.6605837941169739 2023-01-22 13:00:26.291982: step: 898/464, loss: 0.4971863925457001 2023-01-22 13:00:27.085661: step: 900/464, loss: 0.205971360206604 2023-01-22 13:00:27.886924: step: 902/464, loss: 0.36103111505508423 2023-01-22 13:00:28.648626: step: 904/464, loss: 0.14627264440059662 2023-01-22 13:00:29.349639: step: 906/464, loss: 0.09463157504796982 2023-01-22 13:00:30.127072: step: 908/464, loss: 0.32430994510650635 2023-01-22 13:00:30.860067: step: 910/464, loss: 0.2579716742038727 2023-01-22 
13:00:31.652054: step: 912/464, loss: 0.4348582923412323 2023-01-22 13:00:32.440477: step: 914/464, loss: 0.08700353652238846 2023-01-22 13:00:33.188552: step: 916/464, loss: 0.35793405771255493 2023-01-22 13:00:33.919466: step: 918/464, loss: 0.19745084643363953 2023-01-22 13:00:34.771493: step: 920/464, loss: 0.3313872814178467 2023-01-22 13:00:35.465935: step: 922/464, loss: 0.16230227053165436 2023-01-22 13:00:36.215675: step: 924/464, loss: 0.5203825831413269 2023-01-22 13:00:36.895068: step: 926/464, loss: 0.40407904982566833 2023-01-22 13:00:37.685665: step: 928/464, loss: 0.2651551365852356 2023-01-22 13:00:38.392717: step: 930/464, loss: 0.24683575332164764
==================================================
Loss: 0.317
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30845857280434635, 'r': 0.30845857280434635, 'f1': 0.30845857280434635}, 'combined': 0.22728526417162362, 'epoch': 11}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29367589772764763, 'r': 0.2890373840419197, 'f1': 0.29133817913877086}, 'combined': 0.18093634283355245, 'epoch': 11}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29603156339535325, 'r': 0.3083896172752352, 'f1': 0.30208425335325084}, 'combined': 0.2225883972076585, 'epoch': 11}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28848411788393924, 'r': 0.29332541107646337, 'f1': 0.29088462204645854}, 'combined': 0.1806546600078006, 'epoch': 11}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29736738379562355, 'r': 0.29567779638769387, 'f1': 0.2965201832719893}, 'combined': 0.21848855609515, 'epoch': 11}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29606572311988383, 'r': 0.297234788166754, 'f1': 0.2966491038550954}, 'combined': 0.18423470660474348, 'epoch': 11}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25426136363636365, 'r': 0.3196428571428571, 'f1': 0.28322784810126583}, 'combined': 0.18881856540084388, 'epoch': 11}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29651162790697677, 'r': 0.5543478260869565, 'f1': 0.38636363636363635}, 'combined': 0.19318181818181818, 'epoch': 11}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 11}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2988087325791506, 'r': 0.32829270619227363, 'f1': 0.3128576060819678}, 'combined': 0.2305266571130289, 'epoch': 9}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3070130647219126, 'r': 0.2824155788243113, 'f1': 0.29420108211373386}, 'combined': 0.1827143562601084, 'epoch': 9}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27698863636363635, 'r': 0.3482142857142857, 'f1': 0.30854430379746833}, 'combined': 0.20569620253164556, 'epoch': 9}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29660202530160235, 'r': 0.32361701052831376, 'f1': 0.30952116977934907}, 'combined': 0.2280682303637309, 'epoch': 8}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2910978686511131, 'r': 0.2844754641219582, 'f1': 0.2877485685115555}, 'combined': 0.1787070057071766, 'epoch': 8}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3269230769230769, 'r': 0.5543478260869565, 'f1': 0.4112903225806452}, 'combined': 0.2056451612903226, 'epoch': 8}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 12
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:03:18.152930: step: 2/464, loss: 0.07937145233154297 2023-01-22 13:03:18.869672: step: 4/464, loss: 0.08152173459529877 2023-01-22 13:03:19.622212: step: 6/464, loss: 1.303863525390625 2023-01-22 13:03:20.328154: step: 8/464, loss: 0.13373735547065735 2023-01-22 13:03:21.092278: step: 10/464, loss: 0.7212203145027161 2023-01-22 13:03:21.909298: step: 12/464, loss: 0.30934861302375793 2023-01-22 13:03:22.687158: step: 14/464, loss: 0.5328418612480164 2023-01-22 13:03:23.486803: step: 16/464, loss: 0.09542962163686752 2023-01-22 13:03:24.156427: step: 18/464, loss: 0.14745691418647766 2023-01-22 13:03:24.844243: step: 20/464, loss: 0.2676483988761902 2023-01-22 13:03:25.627810: step: 22/464, loss: 0.09156662225723267 2023-01-22 13:03:26.278622: step: 24/464, loss: 0.03031635843217373 2023-01-22 13:03:26.997190: step: 26/464, loss: 0.09614569693803787 2023-01-22 13:03:27.764695: step: 28/464, loss:
0.18411265313625336 2023-01-22 13:03:28.462257: step: 30/464, loss: 0.11576136201620102 2023-01-22 13:03:29.203010: step: 32/464, loss: 0.12587027251720428 2023-01-22 13:03:29.891077: step: 34/464, loss: 0.2171574831008911 2023-01-22 13:03:30.648434: step: 36/464, loss: 0.1379026621580124 2023-01-22 13:03:31.379709: step: 38/464, loss: 0.038375645875930786 2023-01-22 13:03:32.087499: step: 40/464, loss: 0.6074793338775635 2023-01-22 13:03:32.806563: step: 42/464, loss: 0.12009892612695694 2023-01-22 13:03:33.609055: step: 44/464, loss: 0.28919121623039246 2023-01-22 13:03:34.281253: step: 46/464, loss: 0.09155625849962234 2023-01-22 13:03:35.060318: step: 48/464, loss: 0.16924382746219635 2023-01-22 13:03:35.770489: step: 50/464, loss: 0.33645200729370117 2023-01-22 13:03:36.505981: step: 52/464, loss: 0.5285319089889526 2023-01-22 13:03:37.223803: step: 54/464, loss: 0.5698367357254028 2023-01-22 13:03:37.991070: step: 56/464, loss: 0.09041299670934677 2023-01-22 13:03:38.737467: step: 58/464, loss: 0.04541686549782753 2023-01-22 13:03:39.449267: step: 60/464, loss: 0.35840001702308655 2023-01-22 13:03:40.141501: step: 62/464, loss: 0.6764888167381287 2023-01-22 13:03:40.911259: step: 64/464, loss: 0.03156716749072075 2023-01-22 13:03:41.650543: step: 66/464, loss: 0.6557780504226685 2023-01-22 13:03:42.390034: step: 68/464, loss: 0.10692758858203888 2023-01-22 13:03:43.143359: step: 70/464, loss: 0.09629504382610321 2023-01-22 13:03:43.939473: step: 72/464, loss: 0.11967064440250397 2023-01-22 13:03:44.720354: step: 74/464, loss: 0.07440176606178284 2023-01-22 13:03:45.476892: step: 76/464, loss: 0.395173579454422 2023-01-22 13:03:46.263989: step: 78/464, loss: 0.17125675082206726 2023-01-22 13:03:47.052140: step: 80/464, loss: 0.08940937370061874 2023-01-22 13:03:47.822998: step: 82/464, loss: 0.19092987477779388 2023-01-22 13:03:48.606124: step: 84/464, loss: 0.25753524899482727 2023-01-22 13:03:49.399661: step: 86/464, loss: 0.3011278212070465 2023-01-22 
13:03:50.160073: step: 88/464, loss: 0.13556747138500214 2023-01-22 13:03:50.968734: step: 90/464, loss: 0.09217557311058044 2023-01-22 13:03:51.622711: step: 92/464, loss: 0.06008195877075195 2023-01-22 13:03:52.356827: step: 94/464, loss: 1.1136311292648315 2023-01-22 13:03:53.088648: step: 96/464, loss: 0.0928504541516304 2023-01-22 13:03:53.815650: step: 98/464, loss: 0.11634860187768936 2023-01-22 13:03:54.537856: step: 100/464, loss: 0.17749115824699402 2023-01-22 13:03:55.220432: step: 102/464, loss: 0.3903777599334717 2023-01-22 13:03:55.964657: step: 104/464, loss: 0.08186224102973938 2023-01-22 13:03:56.779908: step: 106/464, loss: 0.149544358253479 2023-01-22 13:03:57.517483: step: 108/464, loss: 2.127565383911133 2023-01-22 13:03:58.223047: step: 110/464, loss: 0.20751747488975525 2023-01-22 13:03:58.924392: step: 112/464, loss: 0.6430408954620361 2023-01-22 13:03:59.643619: step: 114/464, loss: 0.31235331296920776 2023-01-22 13:04:00.375563: step: 116/464, loss: 0.22165215015411377 2023-01-22 13:04:01.094130: step: 118/464, loss: 0.28718554973602295 2023-01-22 13:04:01.808029: step: 120/464, loss: 0.16045460104942322 2023-01-22 13:04:02.619311: step: 122/464, loss: 0.15227341651916504 2023-01-22 13:04:03.336633: step: 124/464, loss: 0.17640146613121033 2023-01-22 13:04:04.076959: step: 126/464, loss: 0.12168958783149719 2023-01-22 13:04:04.810353: step: 128/464, loss: 0.14149035513401031 2023-01-22 13:04:05.493946: step: 130/464, loss: 0.14450493454933167 2023-01-22 13:04:06.253294: step: 132/464, loss: 0.13244052231311798 2023-01-22 13:04:06.995805: step: 134/464, loss: 0.09429279714822769 2023-01-22 13:04:07.709160: step: 136/464, loss: 0.11651817709207535 2023-01-22 13:04:08.383650: step: 138/464, loss: 0.08832782506942749 2023-01-22 13:04:09.112691: step: 140/464, loss: 0.08300164341926575 2023-01-22 13:04:09.909693: step: 142/464, loss: 0.2834710478782654 2023-01-22 13:04:10.631931: step: 144/464, loss: 0.1599501669406891 2023-01-22 
13:04:11.406555: step: 146/464, loss: 0.313719242811203 2023-01-22 13:04:12.173327: step: 148/464, loss: 0.07726339995861053 2023-01-22 13:04:12.886516: step: 150/464, loss: 0.18269270658493042 2023-01-22 13:04:13.651654: step: 152/464, loss: 0.10299442708492279 2023-01-22 13:04:14.401753: step: 154/464, loss: 0.2939721941947937 2023-01-22 13:04:15.071171: step: 156/464, loss: 0.1128954365849495 2023-01-22 13:04:15.773404: step: 158/464, loss: 0.13372232019901276 2023-01-22 13:04:16.546771: step: 160/464, loss: 0.1052638366818428 2023-01-22 13:04:17.286336: step: 162/464, loss: 0.7225176692008972 2023-01-22 13:04:18.078888: step: 164/464, loss: 0.51224684715271 2023-01-22 13:04:19.018911: step: 166/464, loss: 0.11861090362071991 2023-01-22 13:04:19.736593: step: 168/464, loss: 0.3967590928077698 2023-01-22 13:04:20.476224: step: 170/464, loss: 0.16093403100967407 2023-01-22 13:04:21.220402: step: 172/464, loss: 0.16586624085903168 2023-01-22 13:04:21.901696: step: 174/464, loss: 0.47254830598831177 2023-01-22 13:04:22.556594: step: 176/464, loss: 0.8602539300918579 2023-01-22 13:04:23.421457: step: 178/464, loss: 0.08452915400266647 2023-01-22 13:04:24.175307: step: 180/464, loss: 0.14934788644313812 2023-01-22 13:04:24.904416: step: 182/464, loss: 0.15610671043395996 2023-01-22 13:04:25.692035: step: 184/464, loss: 0.11498009413480759 2023-01-22 13:04:26.339470: step: 186/464, loss: 0.02484835684299469 2023-01-22 13:04:27.012193: step: 188/464, loss: 0.050583284348249435 2023-01-22 13:04:27.746076: step: 190/464, loss: 0.022346261888742447 2023-01-22 13:04:28.456517: step: 192/464, loss: 0.3246009647846222 2023-01-22 13:04:29.290134: step: 194/464, loss: 0.18790866434574127 2023-01-22 13:04:30.030839: step: 196/464, loss: 0.27084141969680786 2023-01-22 13:04:30.738790: step: 198/464, loss: 0.08829298615455627 2023-01-22 13:04:31.516980: step: 200/464, loss: 0.47170162200927734 2023-01-22 13:04:32.187112: step: 202/464, loss: 0.09879495203495026 2023-01-22 
13:04:32.978482: step: 204/464, loss: 0.13257142901420593 2023-01-22 13:04:33.741802: step: 206/464, loss: 0.1926821917295456 2023-01-22 13:04:34.560485: step: 208/464, loss: 0.2796195447444916 2023-01-22 13:04:35.254509: step: 210/464, loss: 0.11591766774654388 2023-01-22 13:04:36.085660: step: 212/464, loss: 0.13543307781219482 2023-01-22 13:04:36.873633: step: 214/464, loss: 0.6604641675949097 2023-01-22 13:04:37.625686: step: 216/464, loss: 0.13425329327583313 2023-01-22 13:04:38.349341: step: 218/464, loss: 0.6592523455619812 2023-01-22 13:04:39.010836: step: 220/464, loss: 0.6960573196411133 2023-01-22 13:04:39.739648: step: 222/464, loss: 0.4156104326248169 2023-01-22 13:04:40.470886: step: 224/464, loss: 0.06526309996843338 2023-01-22 13:04:41.235737: step: 226/464, loss: 0.14824193716049194 2023-01-22 13:04:41.975395: step: 228/464, loss: 0.1567968726158142 2023-01-22 13:04:42.750774: step: 230/464, loss: 0.18765020370483398 2023-01-22 13:04:43.445429: step: 232/464, loss: 0.32986509799957275 2023-01-22 13:04:44.159322: step: 234/464, loss: 0.08478040993213654 2023-01-22 13:04:44.913597: step: 236/464, loss: 0.7538495659828186 2023-01-22 13:04:45.626116: step: 238/464, loss: 0.30088624358177185 2023-01-22 13:04:46.416950: step: 240/464, loss: 0.08321912586688995 2023-01-22 13:04:47.086061: step: 242/464, loss: 0.15158098936080933 2023-01-22 13:04:47.840346: step: 244/464, loss: 0.0883256271481514 2023-01-22 13:04:48.628128: step: 246/464, loss: 0.12839487195014954 2023-01-22 13:04:49.385106: step: 248/464, loss: 0.11329327523708344 2023-01-22 13:04:50.138640: step: 250/464, loss: 0.14634855091571808 2023-01-22 13:04:50.872491: step: 252/464, loss: 0.1465548425912857 2023-01-22 13:04:51.545151: step: 254/464, loss: 0.1491321623325348 2023-01-22 13:04:52.329738: step: 256/464, loss: 0.05420532450079918 2023-01-22 13:04:53.076378: step: 258/464, loss: 0.06870382279157639 2023-01-22 13:04:53.824518: step: 260/464, loss: 0.22869589924812317 2023-01-22 
13:04:54.600755: step: 262/464, loss: 0.24778543412685394 2023-01-22 13:04:55.307024: step: 264/464, loss: 0.04789621755480766 2023-01-22 13:04:56.058651: step: 266/464, loss: 0.10004598647356033 2023-01-22 13:04:56.813410: step: 268/464, loss: 0.049069881439208984 2023-01-22 13:04:57.590545: step: 270/464, loss: 0.11798962205648422 2023-01-22 13:04:58.325116: step: 272/464, loss: 0.0744778960943222 2023-01-22 13:04:59.072659: step: 274/464, loss: 0.21960477530956268 2023-01-22 13:04:59.748961: step: 276/464, loss: 0.45826032757759094 2023-01-22 13:05:00.558440: step: 278/464, loss: 0.22718052566051483 2023-01-22 13:05:01.293964: step: 280/464, loss: 0.32442647218704224 2023-01-22 13:05:02.099609: step: 282/464, loss: 0.1793290227651596 2023-01-22 13:05:02.971477: step: 284/464, loss: 0.3830242156982422 2023-01-22 13:05:03.696757: step: 286/464, loss: 0.18583126366138458 2023-01-22 13:05:04.468201: step: 288/464, loss: 0.24565443396568298 2023-01-22 13:05:05.145793: step: 290/464, loss: 0.17036233842372894 2023-01-22 13:05:05.884013: step: 292/464, loss: 0.17884781956672668 2023-01-22 13:05:06.718394: step: 294/464, loss: 0.13129177689552307 2023-01-22 13:05:07.486791: step: 296/464, loss: 0.09641414880752563 2023-01-22 13:05:08.296620: step: 298/464, loss: 0.2380957305431366 2023-01-22 13:05:08.958973: step: 300/464, loss: 0.2236909717321396 2023-01-22 13:05:09.731621: step: 302/464, loss: 0.25487086176872253 2023-01-22 13:05:10.460443: step: 304/464, loss: 0.08199506253004074 2023-01-22 13:05:11.158171: step: 306/464, loss: 0.14794254302978516 2023-01-22 13:05:11.963018: step: 308/464, loss: 0.11556050926446915 2023-01-22 13:05:12.602692: step: 310/464, loss: 0.4206964075565338 2023-01-22 13:05:13.344049: step: 312/464, loss: 0.35478389263153076 2023-01-22 13:05:14.117922: step: 314/464, loss: 0.08433805406093597 2023-01-22 13:05:14.847500: step: 316/464, loss: 0.2681663930416107 2023-01-22 13:05:15.622395: step: 318/464, loss: 0.08490245789289474 2023-01-22 
13:05:16.306469: step: 320/464, loss: 0.8618941307067871 2023-01-22 13:05:17.088506: step: 322/464, loss: 0.16621652245521545 2023-01-22 13:05:17.830527: step: 324/464, loss: 0.12393920123577118 2023-01-22 13:05:18.560416: step: 326/464, loss: 0.4252515733242035 2023-01-22 13:05:19.274641: step: 328/464, loss: 0.18016891181468964 2023-01-22 13:05:20.079463: step: 330/464, loss: 0.1707686185836792 2023-01-22 13:05:20.821342: step: 332/464, loss: 0.1263190656900406 2023-01-22 13:05:21.579290: step: 334/464, loss: 0.07681165635585785 2023-01-22 13:05:22.300560: step: 336/464, loss: 0.2365681231021881 2023-01-22 13:05:23.001031: step: 338/464, loss: 0.3317764401435852 2023-01-22 13:05:23.765853: step: 340/464, loss: 0.17738425731658936 2023-01-22 13:05:24.517057: step: 342/464, loss: 0.09095247089862823 2023-01-22 13:05:25.194513: step: 344/464, loss: 0.15220151841640472 2023-01-22 13:05:25.936511: step: 346/464, loss: 0.19217680394649506 2023-01-22 13:05:26.631997: step: 348/464, loss: 0.20572134852409363 2023-01-22 13:05:27.415907: step: 350/464, loss: 0.08010413497686386 2023-01-22 13:05:28.136072: step: 352/464, loss: 0.18971148133277893 2023-01-22 13:05:28.823910: step: 354/464, loss: 0.3263251483440399 2023-01-22 13:05:29.643415: step: 356/464, loss: 0.1769842803478241 2023-01-22 13:05:30.301883: step: 358/464, loss: 0.08641387522220612 2023-01-22 13:05:30.941357: step: 360/464, loss: 0.14438386261463165 2023-01-22 13:05:31.604655: step: 362/464, loss: 0.28434932231903076 2023-01-22 13:05:32.353459: step: 364/464, loss: 0.1055278480052948 2023-01-22 13:05:33.112747: step: 366/464, loss: 0.10482140630483627 2023-01-22 13:05:33.902386: step: 368/464, loss: 0.3840358555316925 2023-01-22 13:05:34.629441: step: 370/464, loss: 0.1421050727367401 2023-01-22 13:05:35.390403: step: 372/464, loss: 0.20900005102157593 2023-01-22 13:05:36.086502: step: 374/464, loss: 0.07068168371915817 2023-01-22 13:05:36.844155: step: 376/464, loss: 0.34346091747283936 2023-01-22 
13:05:37.592640: step: 378/464, loss: 0.14604853093624115 2023-01-22 13:05:38.326554: step: 380/464, loss: 0.23045358061790466 2023-01-22 13:05:39.030595: step: 382/464, loss: 0.42827993631362915 2023-01-22 13:05:39.716647: step: 384/464, loss: 0.0735638290643692 2023-01-22 13:05:40.478882: step: 386/464, loss: 0.3617570698261261 2023-01-22 13:05:41.228773: step: 388/464, loss: 0.1411140263080597 2023-01-22 13:05:42.076361: step: 390/464, loss: 0.03528972342610359 2023-01-22 13:05:42.760255: step: 392/464, loss: 0.4784262776374817 2023-01-22 13:05:43.457906: step: 394/464, loss: 0.08293693512678146 2023-01-22 13:05:44.241448: step: 396/464, loss: 0.07473158836364746 2023-01-22 13:05:44.958322: step: 398/464, loss: 0.06551724672317505 2023-01-22 13:05:45.685523: step: 400/464, loss: 0.10966359823942184 2023-01-22 13:05:46.444100: step: 402/464, loss: 0.5371749401092529 2023-01-22 13:05:47.143780: step: 404/464, loss: 0.06447066366672516 2023-01-22 13:05:47.873608: step: 406/464, loss: 1.2373099327087402 2023-01-22 13:05:48.671399: step: 408/464, loss: 0.12304025888442993 2023-01-22 13:05:49.364473: step: 410/464, loss: 0.09652536362409592 2023-01-22 13:05:50.090139: step: 412/464, loss: 0.04025071859359741 2023-01-22 13:05:50.846936: step: 414/464, loss: 0.4970055818557739 2023-01-22 13:05:51.570553: step: 416/464, loss: 0.5476180911064148 2023-01-22 13:05:52.336918: step: 418/464, loss: 0.20421092212200165 2023-01-22 13:05:53.095890: step: 420/464, loss: 0.07824033498764038 2023-01-22 13:05:53.793866: step: 422/464, loss: 0.14528338611125946 2023-01-22 13:05:54.450796: step: 424/464, loss: 0.4104193150997162 2023-01-22 13:05:55.198102: step: 426/464, loss: 0.05804063379764557 2023-01-22 13:05:55.876560: step: 428/464, loss: 0.2260199934244156 2023-01-22 13:05:56.534417: step: 430/464, loss: 0.27057549357414246 2023-01-22 13:05:57.383422: step: 432/464, loss: 0.23801268637180328 2023-01-22 13:05:58.158768: step: 434/464, loss: 0.27735766768455505 2023-01-22 
13:05:58.863991: step: 436/464, loss: 0.20993174612522125 2023-01-22 13:05:59.595626: step: 438/464, loss: 0.23531346023082733 2023-01-22 13:06:00.435472: step: 440/464, loss: 0.13657617568969727 2023-01-22 13:06:01.111392: step: 442/464, loss: 0.5901662111282349 2023-01-22 13:06:01.829301: step: 444/464, loss: 0.07006267458200455 2023-01-22 13:06:02.678992: step: 446/464, loss: 0.4572935700416565 2023-01-22 13:06:03.339955: step: 448/464, loss: 0.07816348224878311 2023-01-22 13:06:04.095191: step: 450/464, loss: 5.27097225189209 2023-01-22 13:06:04.860934: step: 452/464, loss: 0.19717015326023102 2023-01-22 13:06:05.525256: step: 454/464, loss: 0.06615665555000305 2023-01-22 13:06:06.181737: step: 456/464, loss: 0.12976990640163422 2023-01-22 13:06:06.943564: step: 458/464, loss: 0.5949162840843201 2023-01-22 13:06:07.644962: step: 460/464, loss: 0.45521727204322815 2023-01-22 13:06:08.411054: step: 462/464, loss: 0.10309097915887833 2023-01-22 13:06:09.088947: step: 464/464, loss: 0.3543277680873871 2023-01-22 13:06:09.898222: step: 466/464, loss: 0.44685256481170654 2023-01-22 13:06:10.577106: step: 468/464, loss: 0.2305956780910492 2023-01-22 13:06:11.391182: step: 470/464, loss: 0.2975473403930664 2023-01-22 13:06:12.104075: step: 472/464, loss: 0.36820775270462036 2023-01-22 13:06:12.808484: step: 474/464, loss: 0.1223047524690628 2023-01-22 13:06:13.555410: step: 476/464, loss: 0.22195355594158173 2023-01-22 13:06:14.421635: step: 478/464, loss: 0.1410691738128662 2023-01-22 13:06:15.126481: step: 480/464, loss: 0.1481795310974121 2023-01-22 13:06:15.940206: step: 482/464, loss: 0.08102936297655106 2023-01-22 13:06:16.700226: step: 484/464, loss: 0.15827953815460205 2023-01-22 13:06:17.491542: step: 486/464, loss: 0.08449569344520569 2023-01-22 13:06:18.265979: step: 488/464, loss: 0.09026332944631577 2023-01-22 13:06:19.001013: step: 490/464, loss: 2.810152053833008 2023-01-22 13:06:19.704833: step: 492/464, loss: 0.10736886411905289 2023-01-22 
13:06:20.424228: step: 494/464, loss: 0.06060099974274635 2023-01-22 13:06:21.198702: step: 496/464, loss: 0.10546278208494186 2023-01-22 13:06:21.975380: step: 498/464, loss: 1.2901486158370972 2023-01-22 13:06:22.829451: step: 500/464, loss: 0.09726813435554504 2023-01-22 13:06:23.577406: step: 502/464, loss: 0.09726590663194656 2023-01-22 13:06:24.276207: step: 504/464, loss: 0.08936482667922974 2023-01-22 13:06:25.023415: step: 506/464, loss: 0.193405881524086 2023-01-22 13:06:25.788299: step: 508/464, loss: 0.31795811653137207 2023-01-22 13:06:26.519075: step: 510/464, loss: 0.10925085097551346 2023-01-22 13:06:27.261232: step: 512/464, loss: 0.11257359385490417 2023-01-22 13:06:28.098296: step: 514/464, loss: 0.11842934787273407 2023-01-22 13:06:28.799364: step: 516/464, loss: 0.07557530701160431 2023-01-22 13:06:29.525099: step: 518/464, loss: 0.12648552656173706 2023-01-22 13:06:30.298403: step: 520/464, loss: 0.19210243225097656 2023-01-22 13:06:31.028220: step: 522/464, loss: 0.1300201416015625 2023-01-22 13:06:31.838447: step: 524/464, loss: 0.30112624168395996 2023-01-22 13:06:32.664100: step: 526/464, loss: 0.1210969090461731 2023-01-22 13:06:33.478227: step: 528/464, loss: 0.09769611805677414 2023-01-22 13:06:34.282466: step: 530/464, loss: 0.18061119318008423 2023-01-22 13:06:35.053182: step: 532/464, loss: 0.2147839516401291 2023-01-22 13:06:35.805125: step: 534/464, loss: 0.057371918112039566 2023-01-22 13:06:36.573899: step: 536/464, loss: 0.07153794169425964 2023-01-22 13:06:37.364372: step: 538/464, loss: 1.1326093673706055 2023-01-22 13:06:38.240974: step: 540/464, loss: 0.17589691281318665 2023-01-22 13:06:38.959629: step: 542/464, loss: 0.10882126539945602 2023-01-22 13:06:39.651740: step: 544/464, loss: 0.20188839733600616 2023-01-22 13:06:40.435047: step: 546/464, loss: 0.11440496891736984 2023-01-22 13:06:41.182050: step: 548/464, loss: 0.15246495604515076 2023-01-22 13:06:41.871625: step: 550/464, loss: 1.742716670036316 2023-01-22 
13:06:42.608310: step: 552/464, loss: 0.15757957100868225 2023-01-22 13:06:43.340746: step: 554/464, loss: 0.14114047586917877 2023-01-22 13:06:44.025488: step: 556/464, loss: 0.2917866110801697 2023-01-22 13:06:44.750582: step: 558/464, loss: 0.21258586645126343 2023-01-22 13:06:45.521253: step: 560/464, loss: 0.3417026698589325 2023-01-22 13:06:46.279487: step: 562/464, loss: 0.09667295962572098 2023-01-22 13:06:47.074103: step: 564/464, loss: 0.34167689085006714 2023-01-22 13:06:47.854853: step: 566/464, loss: 0.0827411338686943 2023-01-22 13:06:48.515673: step: 568/464, loss: 0.5343703627586365 2023-01-22 13:06:49.326825: step: 570/464, loss: 0.24220702052116394 2023-01-22 13:06:50.208233: step: 572/464, loss: 0.42066872119903564 2023-01-22 13:06:50.990645: step: 574/464, loss: 0.2661969065666199 2023-01-22 13:06:51.661951: step: 576/464, loss: 0.42867612838745117 2023-01-22 13:06:52.390754: step: 578/464, loss: 0.20908300578594208 2023-01-22 13:06:53.160874: step: 580/464, loss: 0.2226964831352234 2023-01-22 13:06:53.920181: step: 582/464, loss: 0.18661530315876007 2023-01-22 13:06:54.631700: step: 584/464, loss: 0.46391546726226807 2023-01-22 13:06:55.339080: step: 586/464, loss: 0.05677974224090576 2023-01-22 13:06:56.091032: step: 588/464, loss: 0.09799033403396606 2023-01-22 13:06:56.849674: step: 590/464, loss: 0.14943784475326538 2023-01-22 13:06:57.511791: step: 592/464, loss: 0.11508098244667053 2023-01-22 13:06:58.206958: step: 594/464, loss: 0.2567056119441986 2023-01-22 13:06:58.988930: step: 596/464, loss: 0.10912559926509857 2023-01-22 13:06:59.686242: step: 598/464, loss: 0.1308480203151703 2023-01-22 13:07:00.409518: step: 600/464, loss: 0.1481272131204605 2023-01-22 13:07:01.105595: step: 602/464, loss: 0.38644760847091675 2023-01-22 13:07:01.865763: step: 604/464, loss: 0.37461990118026733 2023-01-22 13:07:02.612033: step: 606/464, loss: 0.11868741363286972 2023-01-22 13:07:03.309879: step: 608/464, loss: 0.20522792637348175 2023-01-22 
13:07:04.021116: step: 610/464, loss: 0.08558636158704758 2023-01-22 13:07:04.724692: step: 612/464, loss: 0.11278170347213745 2023-01-22 13:07:05.444039: step: 614/464, loss: 0.3965875506401062 2023-01-22 13:07:06.199494: step: 616/464, loss: 0.7394839525222778 2023-01-22 13:07:06.997253: step: 618/464, loss: 0.4225020408630371 2023-01-22 13:07:07.729083: step: 620/464, loss: 0.36073020100593567 2023-01-22 13:07:08.482791: step: 622/464, loss: 0.07548674196004868 2023-01-22 13:07:09.198394: step: 624/464, loss: 0.06665664911270142 2023-01-22 13:07:09.960037: step: 626/464, loss: 0.3739912509918213 2023-01-22 13:07:10.665992: step: 628/464, loss: 0.117310531437397 2023-01-22 13:07:11.478048: step: 630/464, loss: 0.05616062879562378 2023-01-22 13:07:12.249397: step: 632/464, loss: 0.20686718821525574 2023-01-22 13:07:12.916734: step: 634/464, loss: 0.09264864772558212 2023-01-22 13:07:13.608230: step: 636/464, loss: 0.21587902307510376 2023-01-22 13:07:14.316333: step: 638/464, loss: 0.15633289515972137 2023-01-22 13:07:14.991579: step: 640/464, loss: 0.04739254713058472 2023-01-22 13:07:15.712585: step: 642/464, loss: 0.05121554806828499 2023-01-22 13:07:16.368359: step: 644/464, loss: 0.03187675401568413 2023-01-22 13:07:17.122486: step: 646/464, loss: 0.07687317579984665 2023-01-22 13:07:17.806241: step: 648/464, loss: 0.20603327453136444 2023-01-22 13:07:18.511031: step: 650/464, loss: 0.16405758261680603 2023-01-22 13:07:19.180418: step: 652/464, loss: 0.07752983272075653 2023-01-22 13:07:19.988888: step: 654/464, loss: 0.16881927847862244 2023-01-22 13:07:20.732941: step: 656/464, loss: 1.8332757949829102 2023-01-22 13:07:21.469341: step: 658/464, loss: 0.03292281553149223 2023-01-22 13:07:22.132753: step: 660/464, loss: 0.14211608469486237 2023-01-22 13:07:22.889425: step: 662/464, loss: 0.14775115251541138 2023-01-22 13:07:23.681901: step: 664/464, loss: 0.21745797991752625 2023-01-22 13:07:24.404644: step: 666/464, loss: 0.1632862389087677 2023-01-22 
13:07:25.113016: step: 668/464, loss: 0.30970701575279236 2023-01-22 13:07:25.924389: step: 670/464, loss: 0.049705106765031815 2023-01-22 13:07:26.640688: step: 672/464, loss: 0.1696358174085617 2023-01-22 13:07:27.388265: step: 674/464, loss: 0.18233445286750793 2023-01-22 13:07:28.261791: step: 676/464, loss: 0.12525096535682678 2023-01-22 13:07:29.009842: step: 678/464, loss: 0.18021506071090698 2023-01-22 13:07:29.760215: step: 680/464, loss: 0.3737623691558838 2023-01-22 13:07:30.589599: step: 682/464, loss: 0.3178415596485138 2023-01-22 13:07:31.275504: step: 684/464, loss: 0.061164434999227524 2023-01-22 13:07:31.978108: step: 686/464, loss: 0.16735850274562836 2023-01-22 13:07:32.804517: step: 688/464, loss: 0.6186379194259644 2023-01-22 13:07:33.496225: step: 690/464, loss: 0.17663516104221344 2023-01-22 13:07:34.289444: step: 692/464, loss: 0.16589435935020447 2023-01-22 13:07:34.992372: step: 694/464, loss: 0.21240673959255219 2023-01-22 13:07:35.679552: step: 696/464, loss: 0.18954595923423767 2023-01-22 13:07:36.456965: step: 698/464, loss: 0.20750971138477325 2023-01-22 13:07:37.307562: step: 700/464, loss: 0.44897404313087463 2023-01-22 13:07:37.990953: step: 702/464, loss: 2.3649990558624268 2023-01-22 13:07:38.751087: step: 704/464, loss: 0.7894167900085449 2023-01-22 13:07:39.479868: step: 706/464, loss: 0.04682024195790291 2023-01-22 13:07:40.255790: step: 708/464, loss: 0.353426069021225 2023-01-22 13:07:41.036078: step: 710/464, loss: 0.3856732249259949 2023-01-22 13:07:41.803549: step: 712/464, loss: 0.10253427922725677 2023-01-22 13:07:42.546427: step: 714/464, loss: 0.1930823177099228 2023-01-22 13:07:43.267711: step: 716/464, loss: 0.5934716463088989 2023-01-22 13:07:44.010815: step: 718/464, loss: 0.4938408434391022 2023-01-22 13:07:44.804026: step: 720/464, loss: 0.10433968156576157 2023-01-22 13:07:45.522435: step: 722/464, loss: 0.16280996799468994 2023-01-22 13:07:46.301641: step: 724/464, loss: 0.5624796152114868 2023-01-22 
13:07:47.084040: step: 726/464, loss: 0.05956294387578964 2023-01-22 13:07:47.769035: step: 728/464, loss: 0.8196101784706116 2023-01-22 13:07:48.510340: step: 730/464, loss: 0.13151977956295013 2023-01-22 13:07:49.269123: step: 732/464, loss: 0.3388148248195648 2023-01-22 13:07:49.995746: step: 734/464, loss: 0.17667055130004883 2023-01-22 13:07:50.758392: step: 736/464, loss: 0.2739188075065613 2023-01-22 13:07:51.386252: step: 738/464, loss: 0.14711619913578033 2023-01-22 13:07:52.080750: step: 740/464, loss: 0.07891000062227249 2023-01-22 13:07:52.800317: step: 742/464, loss: 0.12373429536819458 2023-01-22 13:07:53.575959: step: 744/464, loss: 2.3303115367889404 2023-01-22 13:07:54.392475: step: 746/464, loss: 0.0886702761054039 2023-01-22 13:07:55.219355: step: 748/464, loss: 1.4180445671081543 2023-01-22 13:07:55.885953: step: 750/464, loss: 0.057705819606781006 2023-01-22 13:07:56.643023: step: 752/464, loss: 0.14857631921768188 2023-01-22 13:07:57.378438: step: 754/464, loss: 0.048156969249248505 2023-01-22 13:07:58.063862: step: 756/464, loss: 0.1455095112323761 2023-01-22 13:07:58.752569: step: 758/464, loss: 0.09087461978197098 2023-01-22 13:07:59.467878: step: 760/464, loss: 0.1472252607345581 2023-01-22 13:08:00.221836: step: 762/464, loss: 0.08561497181653976 2023-01-22 13:08:00.927606: step: 764/464, loss: 0.10442690551280975 2023-01-22 13:08:01.623971: step: 766/464, loss: 0.12880218029022217 2023-01-22 13:08:02.364384: step: 768/464, loss: 0.498107373714447 2023-01-22 13:08:03.057378: step: 770/464, loss: 0.14554743468761444 2023-01-22 13:08:03.803044: step: 772/464, loss: 0.1577606350183487 2023-01-22 13:08:04.539600: step: 774/464, loss: 0.19420844316482544 2023-01-22 13:08:05.279377: step: 776/464, loss: 0.07381264120340347 2023-01-22 13:08:06.031047: step: 778/464, loss: 0.1493281126022339 2023-01-22 13:08:06.720606: step: 780/464, loss: 0.03374498337507248 2023-01-22 13:08:07.509517: step: 782/464, loss: 0.10605400055646896 2023-01-22 
13:08:08.162398: step: 784/464, loss: 0.28580501675605774 2023-01-22 13:08:08.972034: step: 786/464, loss: 0.26233959197998047 2023-01-22 13:08:09.717477: step: 788/464, loss: 0.05284832417964935 2023-01-22 13:08:10.499098: step: 790/464, loss: 0.12521898746490479 2023-01-22 13:08:11.207934: step: 792/464, loss: 0.20476263761520386 2023-01-22 13:08:11.991524: step: 794/464, loss: 0.18052512407302856 2023-01-22 13:08:12.724202: step: 796/464, loss: 0.309608519077301 2023-01-22 13:08:13.434547: step: 798/464, loss: 0.13055773079395294 2023-01-22 13:08:14.180456: step: 800/464, loss: 0.22315725684165955 2023-01-22 13:08:14.926295: step: 802/464, loss: 0.2699635624885559 2023-01-22 13:08:15.613974: step: 804/464, loss: 0.17608486115932465 2023-01-22 13:08:16.277397: step: 806/464, loss: 0.05220549553632736 2023-01-22 13:08:17.043144: step: 808/464, loss: 0.16342726349830627 2023-01-22 13:08:17.726639: step: 810/464, loss: 0.17474690079689026 2023-01-22 13:08:18.419733: step: 812/464, loss: 0.2082083821296692 2023-01-22 13:08:19.118385: step: 814/464, loss: 0.21456584334373474 2023-01-22 13:08:19.823704: step: 816/464, loss: 0.07093963772058487 2023-01-22 13:08:20.548270: step: 818/464, loss: 0.12586569786071777 2023-01-22 13:08:21.290346: step: 820/464, loss: 0.4578618109226227 2023-01-22 13:08:22.027886: step: 822/464, loss: 0.10049530863761902 2023-01-22 13:08:22.761210: step: 824/464, loss: 0.7494491338729858 2023-01-22 13:08:23.502644: step: 826/464, loss: 0.10575184226036072 2023-01-22 13:08:24.246207: step: 828/464, loss: 0.4233231246471405 2023-01-22 13:08:24.909227: step: 830/464, loss: 0.15735939145088196 2023-01-22 13:08:25.624870: step: 832/464, loss: 0.21875634789466858 2023-01-22 13:08:26.310114: step: 834/464, loss: 0.2116895616054535 2023-01-22 13:08:26.980677: step: 836/464, loss: 0.05463794991374016 2023-01-22 13:08:27.583668: step: 838/464, loss: 0.06432764232158661 2023-01-22 13:08:28.302081: step: 840/464, loss: 0.07017546892166138 2023-01-22 
13:08:29.072966: step: 842/464, loss: 0.5909464955329895 2023-01-22 13:08:29.861254: step: 844/464, loss: 0.17316317558288574 2023-01-22 13:08:30.666249: step: 846/464, loss: 0.14773967862129211 2023-01-22 13:08:31.386778: step: 848/464, loss: 0.19056522846221924 2023-01-22 13:08:32.068335: step: 850/464, loss: 0.20791678130626678 2023-01-22 13:08:32.813773: step: 852/464, loss: 0.3895057141780853 2023-01-22 13:08:33.566718: step: 854/464, loss: 0.13461123406887054 2023-01-22 13:08:34.261473: step: 856/464, loss: 0.1486469954252243 2023-01-22 13:08:35.059768: step: 858/464, loss: 0.12327220290899277 2023-01-22 13:08:35.767442: step: 860/464, loss: 0.11395876854658127 2023-01-22 13:08:36.562607: step: 862/464, loss: 0.0672190859913826 2023-01-22 13:08:37.375071: step: 864/464, loss: 0.08600833266973495 2023-01-22 13:08:38.025448: step: 866/464, loss: 0.21235749125480652 2023-01-22 13:08:38.751517: step: 868/464, loss: 0.4932500422000885 2023-01-22 13:08:39.474101: step: 870/464, loss: 0.19560837745666504 2023-01-22 13:08:40.274550: step: 872/464, loss: 0.35193949937820435 2023-01-22 13:08:41.037917: step: 874/464, loss: 0.17032520473003387 2023-01-22 13:08:41.849831: step: 876/464, loss: 0.1449870467185974 2023-01-22 13:08:42.551892: step: 878/464, loss: 0.14561982452869415 2023-01-22 13:08:43.194666: step: 880/464, loss: 0.16295947134494781 2023-01-22 13:08:43.912382: step: 882/464, loss: 0.1561228334903717 2023-01-22 13:08:44.642904: step: 884/464, loss: 0.24844062328338623 2023-01-22 13:08:45.399010: step: 886/464, loss: 0.5855473279953003 2023-01-22 13:08:46.090809: step: 888/464, loss: 0.9291168451309204 2023-01-22 13:08:46.805581: step: 890/464, loss: 0.16777721047401428 2023-01-22 13:08:47.526389: step: 892/464, loss: 0.30441814661026 2023-01-22 13:08:48.271176: step: 894/464, loss: 0.1941244751214981 2023-01-22 13:08:49.029376: step: 896/464, loss: 0.10518061369657516 2023-01-22 13:08:49.837187: step: 898/464, loss: 0.282848984003067 2023-01-22 
13:08:50.568739: step: 900/464, loss: 0.4408087134361267
2023-01-22 13:08:51.320157: step: 902/464, loss: 0.07189252972602844
2023-01-22 13:08:52.077942: step: 904/464, loss: 0.06143733859062195
2023-01-22 13:08:52.943167: step: 906/464, loss: 0.09300398826599121
2023-01-22 13:08:53.689474: step: 908/464, loss: 0.07987049967050552
2023-01-22 13:08:54.406762: step: 910/464, loss: 0.16798578202724457
2023-01-22 13:08:55.145568: step: 912/464, loss: 0.1030452772974968
2023-01-22 13:08:55.950057: step: 914/464, loss: 0.36905360221862793
2023-01-22 13:08:56.701367: step: 916/464, loss: 0.1241687461733818
2023-01-22 13:08:57.474926: step: 918/464, loss: 0.20748136937618256
2023-01-22 13:08:58.206186: step: 920/464, loss: 0.16023887693881989
2023-01-22 13:08:58.898361: step: 922/464, loss: 0.22584021091461182
2023-01-22 13:08:59.647044: step: 924/464, loss: 0.19338735938072205
2023-01-22 13:09:00.375332: step: 926/464, loss: 0.16376447677612305
2023-01-22 13:09:01.183233: step: 928/464, loss: 0.41982167959213257
2023-01-22 13:09:01.884744: step: 930/464, loss: 0.07592824101448059
==================================================
Loss: 0.262
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31381650028432423, 'r': 0.3543089519339145, 'f1': 0.3328356821197378}, 'combined': 0.24524734471980678, 'epoch': 12}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31785040890500044, 'r': 0.29772931477055925, 'f1': 0.3074610186241424}, 'combined': 0.19094947472446738, 'epoch': 12}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36875, 'r': 0.2543103448275862, 'f1': 0.3010204081632653}, 'combined': 0.20068027210884354, 'epoch': 12}
New best chinese model...
New best korean model...
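(Editor's note: the 'combined' figure in these summaries is reproducible from the printed precision/recall values: it appears to be the product of the template F1 and the slot F1, with each F1 the usual harmonic mean of p and r. A minimal sketch under that assumption, checked against the Dev Chinese entry for epoch 12:)

```python
def f1(p: float, r: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def combined_score(template_p, template_r, slot_p, slot_r):
    # 'combined' in this log = template F1 * slot F1 (matches the printed values)
    return f1(template_p, template_r) * f1(slot_p, slot_r)

# Dev Chinese, epoch 12: template p=1.0, r=0.5833..., slot p=0.3128..., r=0.3603...
score = combined_score(1.0, 0.5833333333333334,
                       0.31282801835726054, 0.3603161425860667)
# score ~= 0.24676622110832633, the 'combined' value printed above
```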
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 13
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:12:01.257434: step: 2/464, loss: 0.10845023393630981
2023-01-22 13:12:01.954567: step: 4/464, loss: 0.24459651112556458
2023-01-22 13:12:02.615511: step: 6/464, loss: 0.04099079221487045
2023-01-22 13:12:03.370292: step: 8/464, loss: 0.1733992099761963
2023-01-22 13:12:04.154543: step: 10/464, loss: 0.42204564809799194
2023-01-22 13:12:04.852186: step: 12/464, loss: 0.058562155812978745
2023-01-22 13:12:05.608067: step: 14/464, loss: 0.13704180717468262
2023-01-22 13:12:06.371383: step: 16/464, loss: 4.886255741119385
2023-01-22 13:12:07.126700: step: 18/464, loss: 0.04163377359509468
2023-01-22 13:12:07.817280: step: 20/464, loss: 0.1298331320285797
2023-01-22 13:12:08.644277: step: 22/464, loss: 0.11849203705787659
2023-01-22 13:12:09.370266: step: 24/464, loss: 0.12065554410219193
2023-01-22 13:12:10.143386: step: 26/464, loss: 0.32254287600517273
2023-01-22 13:12:10.871589: step: 28/464, loss: 0.10869194567203522
2023-01-22 13:12:11.541497: step: 30/464, loss: 0.1350581794977188
2023-01-22 13:12:12.300425: step: 32/464, loss: 0.34600409865379333
2023-01-22 13:12:13.002335: step: 34/464, loss: 0.14678220450878143
2023-01-22 13:12:13.652564: step: 36/464, loss: 0.10113175958395004
2023-01-22 13:12:14.377365: step: 38/464, loss: 0.08116129785776138
2023-01-22 13:12:15.045697: step: 40/464, loss: 0.22657261788845062
2023-01-22 13:12:15.777246: step: 42/464, loss: 0.06568806618452072
2023-01-22 13:12:16.508450: step: 44/464, loss: 0.046456482261419296
2023-01-22 13:12:17.220944: step: 46/464, loss: 0.08516017347574234
2023-01-22 13:12:17.938255: step: 48/464, loss: 0.1707119345664978
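(Editor's note: every training entry in this log follows the fixed layout "timestamp: step: i/total, loss: value", so the run-together stream can be parsed back into numbers for plotting or averaging. A sketch; the regex is an assumption derived from the lines above, not part of the training script:)

```python
import re

# One record per match: "2023-01-22 13:12:01.257434: step: 2/464, loss: 0.108..."
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): step: (\d+)/(\d+), loss: ([0-9.]+)"
)

def parse_steps(log_text: str):
    """Extract (step, loss) pairs from a run-together training log."""
    return [(int(m.group(2)), float(m.group(4))) for m in STEP_RE.finditer(log_text)]

sample = ("2023-01-22 13:12:01.257434: step: 2/464, loss: 0.10845023393630981 "
          "2023-01-22 13:12:01.954567: step: 4/464, loss: 0.24459651112556458")
pairs = parse_steps(sample)
# pairs == [(2, 0.10845023393630981), (4, 0.24459651112556458)]
```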
2023-01-22 13:12:18.605063: step: 50/464, loss: 0.44092610478401184 2023-01-22 13:12:19.341629: step: 52/464, loss: 0.13745073974132538 2023-01-22 13:12:20.027581: step: 54/464, loss: 0.0608690120279789 2023-01-22 13:12:20.773685: step: 56/464, loss: 0.21165050566196442 2023-01-22 13:12:21.505115: step: 58/464, loss: 0.09404819458723068 2023-01-22 13:12:22.277318: step: 60/464, loss: 0.09275195747613907 2023-01-22 13:12:23.005541: step: 62/464, loss: 0.2595752477645874 2023-01-22 13:12:23.695394: step: 64/464, loss: 0.0873362123966217 2023-01-22 13:12:24.448495: step: 66/464, loss: 0.35337966680526733 2023-01-22 13:12:25.176708: step: 68/464, loss: 0.19586828351020813 2023-01-22 13:12:25.956286: step: 70/464, loss: 0.10347774624824524 2023-01-22 13:12:26.688880: step: 72/464, loss: 0.1529945284128189 2023-01-22 13:12:27.527011: step: 74/464, loss: 0.17042525112628937 2023-01-22 13:12:28.294862: step: 76/464, loss: 0.06554456055164337 2023-01-22 13:12:28.987236: step: 78/464, loss: 0.03990291431546211 2023-01-22 13:12:29.726467: step: 80/464, loss: 0.06882677972316742 2023-01-22 13:12:30.448838: step: 82/464, loss: 0.2006833404302597 2023-01-22 13:12:31.244193: step: 84/464, loss: 0.15863539278507233 2023-01-22 13:12:31.928337: step: 86/464, loss: 0.05487312376499176 2023-01-22 13:12:32.654387: step: 88/464, loss: 0.061266910284757614 2023-01-22 13:12:33.296892: step: 90/464, loss: 0.10806545615196228 2023-01-22 13:12:33.978739: step: 92/464, loss: 0.12428941577672958 2023-01-22 13:12:34.760718: step: 94/464, loss: 0.5336252450942993 2023-01-22 13:12:35.508812: step: 96/464, loss: 0.9217267036437988 2023-01-22 13:12:36.215293: step: 98/464, loss: 0.12581682205200195 2023-01-22 13:12:36.889954: step: 100/464, loss: 0.18978308141231537 2023-01-22 13:12:37.639872: step: 102/464, loss: 0.1525404453277588 2023-01-22 13:12:38.434637: step: 104/464, loss: 0.12237998098134995 2023-01-22 13:12:39.114202: step: 106/464, loss: 0.12113020569086075 2023-01-22 13:12:39.924354: 
step: 108/464, loss: 0.05587160959839821 2023-01-22 13:12:40.664623: step: 110/464, loss: 0.20736397802829742 2023-01-22 13:12:41.405640: step: 112/464, loss: 0.2167879045009613 2023-01-22 13:12:42.228414: step: 114/464, loss: 0.028888702392578125 2023-01-22 13:12:42.953751: step: 116/464, loss: 0.1010793000459671 2023-01-22 13:12:43.661260: step: 118/464, loss: 0.24765628576278687 2023-01-22 13:12:44.384860: step: 120/464, loss: 0.4231231212615967 2023-01-22 13:12:45.083561: step: 122/464, loss: 0.061653271317481995 2023-01-22 13:12:45.886074: step: 124/464, loss: 0.516242265701294 2023-01-22 13:12:46.550139: step: 126/464, loss: 0.03295483440160751 2023-01-22 13:12:47.218236: step: 128/464, loss: 0.07076632976531982 2023-01-22 13:12:48.003497: step: 130/464, loss: 0.398122638463974 2023-01-22 13:12:48.719279: step: 132/464, loss: 0.0682402104139328 2023-01-22 13:12:49.470197: step: 134/464, loss: 0.46888625621795654 2023-01-22 13:12:50.200580: step: 136/464, loss: 0.10444146394729614 2023-01-22 13:12:51.004776: step: 138/464, loss: 0.28288185596466064 2023-01-22 13:12:51.767848: step: 140/464, loss: 0.41589614748954773 2023-01-22 13:12:52.500451: step: 142/464, loss: 0.12509053945541382 2023-01-22 13:12:53.214451: step: 144/464, loss: 0.06599561870098114 2023-01-22 13:12:53.986134: step: 146/464, loss: 0.5306773781776428 2023-01-22 13:12:54.733855: step: 148/464, loss: 1.9244468212127686 2023-01-22 13:12:55.481851: step: 150/464, loss: 0.12096443772315979 2023-01-22 13:12:56.334973: step: 152/464, loss: 0.37596365809440613 2023-01-22 13:12:57.091282: step: 154/464, loss: 0.21632319688796997 2023-01-22 13:12:57.827691: step: 156/464, loss: 0.03397297114133835 2023-01-22 13:12:58.642236: step: 158/464, loss: 0.1000724658370018 2023-01-22 13:12:59.426653: step: 160/464, loss: 0.10586489737033844 2023-01-22 13:13:00.128619: step: 162/464, loss: 0.13357606530189514 2023-01-22 13:13:00.894584: step: 164/464, loss: 0.18611209094524384 2023-01-22 13:13:01.677402: step: 
166/464, loss: 0.08677546679973602 2023-01-22 13:13:02.462102: step: 168/464, loss: 0.1102922335267067 2023-01-22 13:13:03.210500: step: 170/464, loss: 0.04976590722799301 2023-01-22 13:13:03.918953: step: 172/464, loss: 0.09167968481779099 2023-01-22 13:13:04.771737: step: 174/464, loss: 0.21131104230880737 2023-01-22 13:13:05.519046: step: 176/464, loss: 0.030156195163726807 2023-01-22 13:13:06.242044: step: 178/464, loss: 0.0817871242761612 2023-01-22 13:13:06.930367: step: 180/464, loss: 0.08249973505735397 2023-01-22 13:13:07.606515: step: 182/464, loss: 0.09012583643198013 2023-01-22 13:13:08.321676: step: 184/464, loss: 0.20371857285499573 2023-01-22 13:13:09.110409: step: 186/464, loss: 0.11443076282739639 2023-01-22 13:13:09.882735: step: 188/464, loss: 0.11218014359474182 2023-01-22 13:13:10.677585: step: 190/464, loss: 0.12328223139047623 2023-01-22 13:13:11.443842: step: 192/464, loss: 0.1352798342704773 2023-01-22 13:13:12.225811: step: 194/464, loss: 0.7996044158935547 2023-01-22 13:13:12.937224: step: 196/464, loss: 0.5265755653381348 2023-01-22 13:13:13.692505: step: 198/464, loss: 0.2508794963359833 2023-01-22 13:13:14.404033: step: 200/464, loss: 0.5351430177688599 2023-01-22 13:13:15.253199: step: 202/464, loss: 0.22345688939094543 2023-01-22 13:13:15.975872: step: 204/464, loss: 0.14427419006824493 2023-01-22 13:13:16.714189: step: 206/464, loss: 0.16566066443920135 2023-01-22 13:13:17.415215: step: 208/464, loss: 0.509405255317688 2023-01-22 13:13:18.180635: step: 210/464, loss: 0.20510387420654297 2023-01-22 13:13:19.110772: step: 212/464, loss: 0.10925981402397156 2023-01-22 13:13:19.902992: step: 214/464, loss: 0.05766135826706886 2023-01-22 13:13:20.578968: step: 216/464, loss: 0.2569079101085663 2023-01-22 13:13:21.307076: step: 218/464, loss: 0.16548825800418854 2023-01-22 13:13:22.078061: step: 220/464, loss: 0.038141459226608276 2023-01-22 13:13:22.857773: step: 222/464, loss: 0.08520923554897308 2023-01-22 13:13:23.551137: step: 
224/464, loss: 0.5157656073570251 2023-01-22 13:13:24.275992: step: 226/464, loss: 0.04011956602334976 2023-01-22 13:13:25.065238: step: 228/464, loss: 0.1085488423705101 2023-01-22 13:13:25.734331: step: 230/464, loss: 0.20553813874721527 2023-01-22 13:13:26.454752: step: 232/464, loss: 0.1176021471619606 2023-01-22 13:13:27.214879: step: 234/464, loss: 0.17439326643943787 2023-01-22 13:13:27.922269: step: 236/464, loss: 0.6796504259109497 2023-01-22 13:13:28.573914: step: 238/464, loss: 0.05329074710607529 2023-01-22 13:13:29.269325: step: 240/464, loss: 0.14916737377643585 2023-01-22 13:13:30.051993: step: 242/464, loss: 0.6748087406158447 2023-01-22 13:13:30.751539: step: 244/464, loss: 0.4097531735897064 2023-01-22 13:13:31.615244: step: 246/464, loss: 0.07001913338899612 2023-01-22 13:13:32.311820: step: 248/464, loss: 0.1428634524345398 2023-01-22 13:13:33.032854: step: 250/464, loss: 0.8062233924865723 2023-01-22 13:13:33.722932: step: 252/464, loss: 0.13343364000320435 2023-01-22 13:13:34.536314: step: 254/464, loss: 0.21857410669326782 2023-01-22 13:13:35.242123: step: 256/464, loss: 0.11554975062608719 2023-01-22 13:13:35.959868: step: 258/464, loss: 1.3301513195037842 2023-01-22 13:13:36.712949: step: 260/464, loss: 0.1528233289718628 2023-01-22 13:13:37.410806: step: 262/464, loss: 0.15933214128017426 2023-01-22 13:13:38.271854: step: 264/464, loss: 0.11591895669698715 2023-01-22 13:13:38.990816: step: 266/464, loss: 0.08424033224582672 2023-01-22 13:13:39.846146: step: 268/464, loss: 0.2541303038597107 2023-01-22 13:13:40.552419: step: 270/464, loss: 0.07291339337825775 2023-01-22 13:13:41.231401: step: 272/464, loss: 0.06928709894418716 2023-01-22 13:13:41.966379: step: 274/464, loss: 0.6142052412033081 2023-01-22 13:13:42.764118: step: 276/464, loss: 0.45714521408081055 2023-01-22 13:13:43.463263: step: 278/464, loss: 0.024776801466941833 2023-01-22 13:13:44.273849: step: 280/464, loss: 0.240465447306633 2023-01-22 13:13:45.120825: step: 282/464, 
loss: 0.19866108894348145 2023-01-22 13:13:45.897213: step: 284/464, loss: 0.0858876183629036 2023-01-22 13:13:46.671376: step: 286/464, loss: 0.11999360471963882 2023-01-22 13:13:47.372987: step: 288/464, loss: 0.10557374358177185 2023-01-22 13:13:48.038644: step: 290/464, loss: 0.050895802676677704 2023-01-22 13:13:48.743998: step: 292/464, loss: 0.08306840807199478 2023-01-22 13:13:49.517950: step: 294/464, loss: 1.0396608114242554 2023-01-22 13:13:50.217267: step: 296/464, loss: 0.3304722011089325 2023-01-22 13:13:51.007312: step: 298/464, loss: 0.15357911586761475 2023-01-22 13:13:51.673528: step: 300/464, loss: 0.3373052477836609 2023-01-22 13:13:52.387009: step: 302/464, loss: 0.20188729465007782 2023-01-22 13:13:53.033998: step: 304/464, loss: 0.21684570610523224 2023-01-22 13:13:53.754528: step: 306/464, loss: 0.199372798204422 2023-01-22 13:13:54.462990: step: 308/464, loss: 0.6930966377258301 2023-01-22 13:13:55.146990: step: 310/464, loss: 0.08602418750524521 2023-01-22 13:13:55.893599: step: 312/464, loss: 0.08976003527641296 2023-01-22 13:13:56.617202: step: 314/464, loss: 0.07710489630699158 2023-01-22 13:13:57.334041: step: 316/464, loss: 0.03334254398941994 2023-01-22 13:13:58.055922: step: 318/464, loss: 0.1080012395977974 2023-01-22 13:13:58.795570: step: 320/464, loss: 1.0282444953918457 2023-01-22 13:13:59.551241: step: 322/464, loss: 0.11568324267864227 2023-01-22 13:14:00.298362: step: 324/464, loss: 0.04866556078195572 2023-01-22 13:14:01.092073: step: 326/464, loss: 0.18815982341766357 2023-01-22 13:14:01.799388: step: 328/464, loss: 0.2963554263114929 2023-01-22 13:14:02.539096: step: 330/464, loss: 0.2101053148508072 2023-01-22 13:14:03.189684: step: 332/464, loss: 0.11286016553640366 2023-01-22 13:14:03.941136: step: 334/464, loss: 0.26691845059394836 2023-01-22 13:14:04.661580: step: 336/464, loss: 0.12511800229549408 2023-01-22 13:14:05.404123: step: 338/464, loss: 0.06494363397359848 2023-01-22 13:14:06.192764: step: 340/464, loss: 
0.10067279636859894 2023-01-22 13:14:06.903980: step: 342/464, loss: 0.08120899647474289 2023-01-22 13:14:07.594879: step: 344/464, loss: 0.07520805299282074 2023-01-22 13:14:08.361292: step: 346/464, loss: 0.1738840788602829 2023-01-22 13:14:09.102541: step: 348/464, loss: 0.05793759599328041 2023-01-22 13:14:09.809940: step: 350/464, loss: 0.15568017959594727 2023-01-22 13:14:10.622577: step: 352/464, loss: 0.08221323788166046 2023-01-22 13:14:11.374233: step: 354/464, loss: 0.18200474977493286 2023-01-22 13:14:12.109727: step: 356/464, loss: 0.12087699770927429 2023-01-22 13:14:12.877850: step: 358/464, loss: 0.11079593002796173 2023-01-22 13:14:13.629623: step: 360/464, loss: 0.5768294930458069 2023-01-22 13:14:14.379338: step: 362/464, loss: 0.09069310873746872 2023-01-22 13:14:15.142745: step: 364/464, loss: 1.772871732711792 2023-01-22 13:14:15.816226: step: 366/464, loss: 0.13492810726165771 2023-01-22 13:14:16.554185: step: 368/464, loss: 0.3811272978782654 2023-01-22 13:14:17.306766: step: 370/464, loss: 0.153494730591774 2023-01-22 13:14:18.040658: step: 372/464, loss: 0.07832568883895874 2023-01-22 13:14:18.875279: step: 374/464, loss: 0.09619408845901489 2023-01-22 13:14:19.596142: step: 376/464, loss: 0.06830616295337677 2023-01-22 13:14:20.333523: step: 378/464, loss: 0.5078091621398926 2023-01-22 13:14:21.056361: step: 380/464, loss: 0.07470594346523285 2023-01-22 13:14:21.855708: step: 382/464, loss: 0.2070901244878769 2023-01-22 13:14:22.592934: step: 384/464, loss: 0.2671006917953491 2023-01-22 13:14:23.421430: step: 386/464, loss: 0.13977664709091187 2023-01-22 13:14:24.153027: step: 388/464, loss: 2.476294994354248 2023-01-22 13:14:24.939317: step: 390/464, loss: 0.9573323726654053 2023-01-22 13:14:25.699055: step: 392/464, loss: 0.3616604208946228 2023-01-22 13:14:26.494633: step: 394/464, loss: 0.08241743594408035 2023-01-22 13:14:27.250169: step: 396/464, loss: 0.17898499965667725 2023-01-22 13:14:28.009695: step: 398/464, loss: 
0.09183957427740097 2023-01-22 13:14:28.732751: step: 400/464, loss: 0.14725667238235474 2023-01-22 13:14:29.424015: step: 402/464, loss: 0.25489723682403564 2023-01-22 13:14:30.159503: step: 404/464, loss: 0.5332356691360474 2023-01-22 13:14:30.817079: step: 406/464, loss: 0.12564606964588165 2023-01-22 13:14:31.526390: step: 408/464, loss: 0.10011353343725204 2023-01-22 13:14:32.227301: step: 410/464, loss: 0.06462383270263672 2023-01-22 13:14:32.977291: step: 412/464, loss: 0.06372683495283127 2023-01-22 13:14:33.774145: step: 414/464, loss: 0.9359332323074341 2023-01-22 13:14:34.510891: step: 416/464, loss: 0.47240427136421204 2023-01-22 13:14:35.242131: step: 418/464, loss: 0.15439915657043457 2023-01-22 13:14:35.976700: step: 420/464, loss: 0.13898834586143494 2023-01-22 13:14:36.735493: step: 422/464, loss: 0.4279281795024872 2023-01-22 13:14:37.451599: step: 424/464, loss: 0.15131518244743347 2023-01-22 13:14:38.171370: step: 426/464, loss: 0.19741450250148773 2023-01-22 13:14:38.931172: step: 428/464, loss: 0.8428908586502075 2023-01-22 13:14:39.631831: step: 430/464, loss: 0.1573386788368225 2023-01-22 13:14:40.393525: step: 432/464, loss: 0.15728452801704407 2023-01-22 13:14:41.145919: step: 434/464, loss: 6.404244422912598 2023-01-22 13:14:42.117603: step: 436/464, loss: 0.15470275282859802 2023-01-22 13:14:42.772362: step: 438/464, loss: 0.39804452657699585 2023-01-22 13:14:43.404809: step: 440/464, loss: 0.0591038353741169 2023-01-22 13:14:44.081836: step: 442/464, loss: 0.53672194480896 2023-01-22 13:14:44.850393: step: 444/464, loss: 0.10315102338790894 2023-01-22 13:14:45.603338: step: 446/464, loss: 0.08972831070423126 2023-01-22 13:14:46.363805: step: 448/464, loss: 0.10898428410291672 2023-01-22 13:14:47.159609: step: 450/464, loss: 0.06059310585260391 2023-01-22 13:14:47.911622: step: 452/464, loss: 0.03156914561986923 2023-01-22 13:14:48.668381: step: 454/464, loss: 0.13455218076705933 2023-01-22 13:14:49.399483: step: 456/464, loss: 
0.09226778149604797 2023-01-22 13:14:50.102499: step: 458/464, loss: 0.029771529138088226 2023-01-22 13:14:50.867749: step: 460/464, loss: 0.32640859484672546 2023-01-22 13:14:51.694775: step: 462/464, loss: 0.14315687119960785 2023-01-22 13:14:52.480465: step: 464/464, loss: 0.23702003061771393 2023-01-22 13:14:53.248140: step: 466/464, loss: 0.3839260935783386 2023-01-22 13:14:53.941000: step: 468/464, loss: 0.17654064297676086 2023-01-22 13:14:54.712305: step: 470/464, loss: 0.28439781069755554 2023-01-22 13:14:55.401503: step: 472/464, loss: 0.10062745958566666 2023-01-22 13:14:56.213388: step: 474/464, loss: 0.6959630846977234 2023-01-22 13:14:56.953531: step: 476/464, loss: 0.07147932797670364 2023-01-22 13:14:57.664889: step: 478/464, loss: 1.0545878410339355 2023-01-22 13:14:58.480062: step: 480/464, loss: 0.1763465404510498 2023-01-22 13:14:59.152182: step: 482/464, loss: 0.08560379594564438 2023-01-22 13:14:59.870181: step: 484/464, loss: 0.16833864152431488 2023-01-22 13:15:00.536005: step: 486/464, loss: 0.10443995893001556 2023-01-22 13:15:01.318683: step: 488/464, loss: 0.12339601665735245 2023-01-22 13:15:02.108095: step: 490/464, loss: 0.08989642560482025 2023-01-22 13:15:02.795299: step: 492/464, loss: 0.14790289103984833 2023-01-22 13:15:03.533876: step: 494/464, loss: 0.054193589836359024 2023-01-22 13:15:04.325087: step: 496/464, loss: 0.09965498745441437 2023-01-22 13:15:04.999411: step: 498/464, loss: 0.6218535304069519 2023-01-22 13:15:05.731209: step: 500/464, loss: 0.08032213151454926 2023-01-22 13:15:06.395428: step: 502/464, loss: 0.1574018895626068 2023-01-22 13:15:07.077385: step: 504/464, loss: 0.2863512337207794 2023-01-22 13:15:07.824655: step: 506/464, loss: 0.11170153319835663 2023-01-22 13:15:08.559565: step: 508/464, loss: 0.13133688271045685 2023-01-22 13:15:09.319821: step: 510/464, loss: 0.41238370537757874 2023-01-22 13:15:10.030910: step: 512/464, loss: 0.11662472039461136 2023-01-22 13:15:10.735065: step: 514/464, loss: 
0.22386302053928375 2023-01-22 13:15:11.476166: step: 516/464, loss: 0.10225214809179306 2023-01-22 13:15:12.190924: step: 518/464, loss: 0.08818010240793228 2023-01-22 13:15:12.868396: step: 520/464, loss: 0.07073657214641571 2023-01-22 13:15:13.594502: step: 522/464, loss: 0.4500882625579834 2023-01-22 13:15:14.382309: step: 524/464, loss: 0.18536502122879028 2023-01-22 13:15:15.132751: step: 526/464, loss: 0.1312844306230545 2023-01-22 13:15:15.853352: step: 528/464, loss: 0.08108577877283096 2023-01-22 13:15:16.780823: step: 530/464, loss: 0.3510834276676178 2023-01-22 13:15:17.465857: step: 532/464, loss: 0.1260337233543396 2023-01-22 13:15:18.293378: step: 534/464, loss: 0.1317775845527649 2023-01-22 13:15:19.057913: step: 536/464, loss: 0.193727508187294 2023-01-22 13:15:19.763595: step: 538/464, loss: 0.05266972631216049 2023-01-22 13:15:20.490253: step: 540/464, loss: 0.14960551261901855 2023-01-22 13:15:21.215624: step: 542/464, loss: 0.27731019258499146 2023-01-22 13:15:21.924768: step: 544/464, loss: 0.0634918063879013 2023-01-22 13:15:22.623755: step: 546/464, loss: 0.024509821087121964 2023-01-22 13:15:23.356823: step: 548/464, loss: 0.9180271029472351 2023-01-22 13:15:24.047265: step: 550/464, loss: 0.17400743067264557 2023-01-22 13:15:24.754929: step: 552/464, loss: 0.13902823626995087 2023-01-22 13:15:25.436243: step: 554/464, loss: 0.11064955592155457 2023-01-22 13:15:26.223860: step: 556/464, loss: 0.14851465821266174 2023-01-22 13:15:26.993251: step: 558/464, loss: 0.10951215028762817 2023-01-22 13:15:27.706087: step: 560/464, loss: 0.15823350846767426 2023-01-22 13:15:28.468787: step: 562/464, loss: 0.0656096562743187 2023-01-22 13:15:29.294842: step: 564/464, loss: 0.174315944314003 2023-01-22 13:15:30.064891: step: 566/464, loss: 0.2251901626586914 2023-01-22 13:15:30.693390: step: 568/464, loss: 0.3147839605808258 2023-01-22 13:15:31.474126: step: 570/464, loss: 0.17955265939235687 2023-01-22 13:15:32.267790: step: 572/464, loss: 
0.1577025055885315 2023-01-22 13:15:33.039684: step: 574/464, loss: 0.11040801554918289 2023-01-22 13:15:33.738668: step: 576/464, loss: 0.08774346113204956 2023-01-22 13:15:34.517422: step: 578/464, loss: 0.14458192884922028 2023-01-22 13:15:35.329372: step: 580/464, loss: 0.17943516373634338 2023-01-22 13:15:36.091483: step: 582/464, loss: 0.4944537878036499 2023-01-22 13:15:36.817745: step: 584/464, loss: 0.08026743680238724 2023-01-22 13:15:37.479711: step: 586/464, loss: 0.3535708785057068 2023-01-22 13:15:38.186702: step: 588/464, loss: 0.1416577249765396 2023-01-22 13:15:38.895589: step: 590/464, loss: 0.15077544748783112 2023-01-22 13:15:39.629538: step: 592/464, loss: 0.21983230113983154 2023-01-22 13:15:40.381515: step: 594/464, loss: 0.0527484267950058 2023-01-22 13:15:41.130969: step: 596/464, loss: 0.5502574443817139 2023-01-22 13:15:41.874432: step: 598/464, loss: 0.10318192839622498 2023-01-22 13:15:42.630771: step: 600/464, loss: 0.7747368812561035 2023-01-22 13:15:43.377318: step: 602/464, loss: 0.19791975617408752 2023-01-22 13:15:44.131836: step: 604/464, loss: 0.03329138085246086 2023-01-22 13:15:44.917037: step: 606/464, loss: 0.10343381017446518 2023-01-22 13:15:45.647438: step: 608/464, loss: 0.051676757633686066 2023-01-22 13:15:46.386465: step: 610/464, loss: 0.2705204486846924 2023-01-22 13:15:47.109739: step: 612/464, loss: 0.1125253215432167 2023-01-22 13:15:47.801521: step: 614/464, loss: 0.545214056968689 2023-01-22 13:15:48.589456: step: 616/464, loss: 0.2631594240665436 2023-01-22 13:15:49.311599: step: 618/464, loss: 0.10287163406610489 2023-01-22 13:15:50.073367: step: 620/464, loss: 0.06313641369342804 2023-01-22 13:15:50.921776: step: 622/464, loss: 0.15470679104328156 2023-01-22 13:15:51.671086: step: 624/464, loss: 0.2945192754268646 2023-01-22 13:15:52.380560: step: 626/464, loss: 0.43829968571662903 2023-01-22 13:15:53.239052: step: 628/464, loss: 0.1536589115858078 2023-01-22 13:15:54.021785: step: 630/464, loss: 
0.06712499260902405 2023-01-22 13:15:54.740177: step: 632/464, loss: 0.2594963312149048 2023-01-22 13:15:55.560332: step: 634/464, loss: 0.08554352074861526 2023-01-22 13:15:56.361478: step: 636/464, loss: 0.3566303253173828 2023-01-22 13:15:57.051422: step: 638/464, loss: 0.24919937551021576 2023-01-22 13:15:57.731175: step: 640/464, loss: 0.09062323719263077 2023-01-22 13:15:58.484054: step: 642/464, loss: 0.7201142311096191 2023-01-22 13:15:59.182044: step: 644/464, loss: 0.3398234248161316 2023-01-22 13:15:59.870543: step: 646/464, loss: 0.3669608533382416 2023-01-22 13:16:00.649762: step: 648/464, loss: 0.10835307091474533 2023-01-22 13:16:01.409894: step: 650/464, loss: 0.07052277028560638 2023-01-22 13:16:02.193682: step: 652/464, loss: 0.07522750645875931 2023-01-22 13:16:02.971112: step: 654/464, loss: 0.37385356426239014 2023-01-22 13:16:03.700846: step: 656/464, loss: 0.09178084135055542 2023-01-22 13:16:04.408516: step: 658/464, loss: 0.07646650820970535 2023-01-22 13:16:05.214178: step: 660/464, loss: 0.3632817268371582 2023-01-22 13:16:05.944266: step: 662/464, loss: 1.816754937171936 2023-01-22 13:16:06.656934: step: 664/464, loss: 0.42521968483924866 2023-01-22 13:16:07.431597: step: 666/464, loss: 0.30828657746315 2023-01-22 13:16:08.169506: step: 668/464, loss: 0.10740874707698822 2023-01-22 13:16:08.908657: step: 670/464, loss: 0.12233463674783707 2023-01-22 13:16:09.610853: step: 672/464, loss: 0.05220026522874832 2023-01-22 13:16:10.340063: step: 674/464, loss: 0.07350676506757736 2023-01-22 13:16:11.141816: step: 676/464, loss: 0.8677085638046265 2023-01-22 13:16:11.926169: step: 678/464, loss: 0.12398731708526611 2023-01-22 13:16:12.634013: step: 680/464, loss: 0.4580591917037964 2023-01-22 13:16:13.392627: step: 682/464, loss: 0.30364760756492615 2023-01-22 13:16:14.211191: step: 684/464, loss: 0.0784701257944107 2023-01-22 13:16:14.910034: step: 686/464, loss: 0.02582676336169243 2023-01-22 13:16:15.610830: step: 688/464, loss: 
0.2821761667728424 2023-01-22 13:16:16.349231: step: 690/464, loss: 0.12536336481571198 2023-01-22 13:16:17.087219: step: 692/464, loss: 0.05118219181895256 2023-01-22 13:16:17.837471: step: 694/464, loss: 0.07651496678590775 2023-01-22 13:16:18.624297: step: 696/464, loss: 0.16557954251766205 2023-01-22 13:16:19.344440: step: 698/464, loss: 0.2265680879354477 2023-01-22 13:16:20.149855: step: 700/464, loss: 0.37888118624687195 2023-01-22 13:16:20.874180: step: 702/464, loss: 0.13355915248394012 2023-01-22 13:16:21.618572: step: 704/464, loss: 0.2651720643043518 2023-01-22 13:16:22.306362: step: 706/464, loss: 0.4567992687225342 2023-01-22 13:16:23.137026: step: 708/464, loss: 0.24589699506759644 2023-01-22 13:16:23.788638: step: 710/464, loss: 0.5090956687927246 2023-01-22 13:16:24.524460: step: 712/464, loss: 0.1037251204252243 2023-01-22 13:16:25.216556: step: 714/464, loss: 0.1926645189523697 2023-01-22 13:16:25.914636: step: 716/464, loss: 0.0683947280049324 2023-01-22 13:16:26.678591: step: 718/464, loss: 0.11341574788093567 2023-01-22 13:16:27.399775: step: 720/464, loss: 0.1150423139333725 2023-01-22 13:16:28.131028: step: 722/464, loss: 0.09730232506990433 2023-01-22 13:16:28.943222: step: 724/464, loss: 0.0670548602938652 2023-01-22 13:16:29.671498: step: 726/464, loss: 0.0629471018910408 2023-01-22 13:16:30.386426: step: 728/464, loss: 0.0662987157702446 2023-01-22 13:16:31.066221: step: 730/464, loss: 0.05995099991559982 2023-01-22 13:16:31.854980: step: 732/464, loss: 0.15863323211669922 2023-01-22 13:16:32.578062: step: 734/464, loss: 0.10982012748718262 2023-01-22 13:16:33.259254: step: 736/464, loss: 0.11240449547767639 2023-01-22 13:16:33.991084: step: 738/464, loss: 0.22407962381839752 2023-01-22 13:16:34.796422: step: 740/464, loss: 0.10131499916315079 2023-01-22 13:16:35.605336: step: 742/464, loss: 0.31418102979660034 2023-01-22 13:16:36.442152: step: 744/464, loss: 0.11370068043470383 2023-01-22 13:16:37.253845: step: 746/464, loss: 
0.41384071111679077 2023-01-22 13:16:38.122030: step: 748/464, loss: 0.08551589399576187 2023-01-22 13:16:38.861162: step: 750/464, loss: 0.0748281255364418 2023-01-22 13:16:39.686130: step: 752/464, loss: 0.07672805339097977 2023-01-22 13:16:40.458328: step: 754/464, loss: 0.09141271561384201 2023-01-22 13:16:41.200206: step: 756/464, loss: 0.11692787706851959 2023-01-22 13:16:41.938876: step: 758/464, loss: 0.05921081081032753 2023-01-22 13:16:42.656786: step: 760/464, loss: 0.2514224648475647 2023-01-22 13:16:43.385236: step: 762/464, loss: 0.509127140045166 2023-01-22 13:16:44.109237: step: 764/464, loss: 0.11973954737186432 2023-01-22 13:16:44.864402: step: 766/464, loss: 0.1569305807352066 2023-01-22 13:16:45.592460: step: 768/464, loss: 0.21292757987976074 2023-01-22 13:16:46.339059: step: 770/464, loss: 0.08074121177196503 2023-01-22 13:16:47.048776: step: 772/464, loss: 0.10858672112226486 2023-01-22 13:16:47.788618: step: 774/464, loss: 0.23681391775608063 2023-01-22 13:16:48.502186: step: 776/464, loss: 0.06215392425656319 2023-01-22 13:16:49.273885: step: 778/464, loss: 0.15121573209762573 2023-01-22 13:16:50.022577: step: 780/464, loss: 0.14246949553489685 2023-01-22 13:16:50.656645: step: 782/464, loss: 0.11438118666410446 2023-01-22 13:16:51.342502: step: 784/464, loss: 0.10496123880147934 2023-01-22 13:16:52.090705: step: 786/464, loss: 0.20628753304481506 2023-01-22 13:16:52.818858: step: 788/464, loss: 0.5667099356651306 2023-01-22 13:16:53.649921: step: 790/464, loss: 0.1638394445180893 2023-01-22 13:16:54.401095: step: 792/464, loss: 0.6485792398452759 2023-01-22 13:16:55.098736: step: 794/464, loss: 0.2431643158197403 2023-01-22 13:16:55.827226: step: 796/464, loss: 0.7835709452629089 2023-01-22 13:16:56.542848: step: 798/464, loss: 0.7168622016906738 2023-01-22 13:16:57.315002: step: 800/464, loss: 0.20961681008338928 2023-01-22 13:16:58.026561: step: 802/464, loss: 0.3085970878601074 2023-01-22 13:16:58.735977: step: 804/464, loss: 
0.180080845952034 2023-01-22 13:16:59.520779: step: 806/464, loss: 0.028265029191970825 2023-01-22 13:17:00.287504: step: 808/464, loss: 0.24229370057582855 2023-01-22 13:17:01.024150: step: 810/464, loss: 0.10656850785017014 2023-01-22 13:17:01.809547: step: 812/464, loss: 6.04249382019043 2023-01-22 13:17:02.503245: step: 814/464, loss: 0.2670717239379883 2023-01-22 13:17:03.290220: step: 816/464, loss: 0.4301287531852722 2023-01-22 13:17:04.004766: step: 818/464, loss: 0.09188126027584076 2023-01-22 13:17:04.724397: step: 820/464, loss: 0.05205368250608444 2023-01-22 13:17:05.463817: step: 822/464, loss: 0.06847725808620453 2023-01-22 13:17:06.294886: step: 824/464, loss: 0.1087225005030632 2023-01-22 13:17:07.038229: step: 826/464, loss: 0.03399376943707466 2023-01-22 13:17:07.818874: step: 828/464, loss: 0.19704696536064148 2023-01-22 13:17:08.518872: step: 830/464, loss: 0.1437470018863678 2023-01-22 13:17:09.338619: step: 832/464, loss: 0.0708436593413353 2023-01-22 13:17:10.094834: step: 834/464, loss: 0.4941231906414032 2023-01-22 13:17:10.820054: step: 836/464, loss: 0.35776862502098083 2023-01-22 13:17:11.566186: step: 838/464, loss: 0.6135424375534058 2023-01-22 13:17:12.357282: step: 840/464, loss: 0.21788859367370605 2023-01-22 13:17:13.064550: step: 842/464, loss: 0.10796776413917542 2023-01-22 13:17:13.758924: step: 844/464, loss: 0.08246316015720367 2023-01-22 13:17:14.533012: step: 846/464, loss: 0.7474268674850464 2023-01-22 13:17:15.170491: step: 848/464, loss: 0.1485125720500946 2023-01-22 13:17:15.914388: step: 850/464, loss: 0.12459192425012589 2023-01-22 13:17:16.645531: step: 852/464, loss: 0.16107913851737976 2023-01-22 13:17:17.373990: step: 854/464, loss: 0.15444625914096832 2023-01-22 13:17:18.168345: step: 856/464, loss: 0.10845639556646347 2023-01-22 13:17:18.809981: step: 858/464, loss: 0.33226022124290466 2023-01-22 13:17:19.577820: step: 860/464, loss: 0.6130020618438721 2023-01-22 13:17:20.321075: step: 862/464, loss: 
0.1199200227856636 2023-01-22 13:17:21.053748: step: 864/464, loss: 0.1062421128153801 2023-01-22 13:17:21.746909: step: 866/464, loss: 0.02942419983446598 2023-01-22 13:17:22.530850: step: 868/464, loss: 0.11861138045787811 2023-01-22 13:17:23.221268: step: 870/464, loss: 0.41673585772514343 2023-01-22 13:17:23.995218: step: 872/464, loss: 0.2176818698644638 2023-01-22 13:17:24.737901: step: 874/464, loss: 0.06265763193368912 2023-01-22 13:17:25.437338: step: 876/464, loss: 0.33866050839424133 2023-01-22 13:17:26.224372: step: 878/464, loss: 0.09827618300914764 2023-01-22 13:17:26.991386: step: 880/464, loss: 0.6393712759017944 2023-01-22 13:17:27.653030: step: 882/464, loss: 0.12278065085411072 2023-01-22 13:17:28.346678: step: 884/464, loss: 0.034253865480422974 2023-01-22 13:17:29.110048: step: 886/464, loss: 0.12241361290216446 2023-01-22 13:17:29.915051: step: 888/464, loss: 0.18203134834766388 2023-01-22 13:17:30.619568: step: 890/464, loss: 0.4041353762149811 2023-01-22 13:17:31.352628: step: 892/464, loss: 0.06696341186761856 2023-01-22 13:17:32.093861: step: 894/464, loss: 0.14808495342731476 2023-01-22 13:17:32.885356: step: 896/464, loss: 0.06377518177032471 2023-01-22 13:17:33.612057: step: 898/464, loss: 0.34000760316848755 2023-01-22 13:17:34.403084: step: 900/464, loss: 0.19238989055156708 2023-01-22 13:17:35.167370: step: 902/464, loss: 0.22991186380386353 2023-01-22 13:17:35.912024: step: 904/464, loss: 0.16415190696716309 2023-01-22 13:17:36.623895: step: 906/464, loss: 0.06087620556354523 2023-01-22 13:17:37.362490: step: 908/464, loss: 0.04438728466629982 2023-01-22 13:17:38.157101: step: 910/464, loss: 0.18937140703201294 2023-01-22 13:17:38.884954: step: 912/464, loss: 0.1797570437192917 2023-01-22 13:17:39.606871: step: 914/464, loss: 0.11785709112882614 2023-01-22 13:17:40.384065: step: 916/464, loss: 0.3231666684150696 2023-01-22 13:17:41.006348: step: 918/464, loss: 0.11942718923091888 2023-01-22 13:17:41.729924: step: 920/464, loss: 
0.5971169471740723
2023-01-22 13:17:42.484538: step: 922/464, loss: 0.1580820530653
2023-01-22 13:17:43.250175: step: 924/464, loss: 0.5575023889541626
2023-01-22 13:17:44.087463: step: 926/464, loss: 0.1856396347284317
2023-01-22 13:17:44.781065: step: 928/464, loss: 0.07226559519767761
2023-01-22 13:17:45.484849: step: 930/464, loss: 0.09687872231006622
==================================================
Loss: 0.264
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29510078534176937, 'r': 0.3566967746920438, 'f1': 0.3229883166025895}, 'combined': 0.23799139118085538, 'epoch': 13}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29619582478547035, 'r': 0.28535581932843535, 'f1': 0.29067479429828524}, 'combined': 0.18052434593261926, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2742058046346005, 'r': 0.34704985140660066, 'f1': 0.3063572390138669}, 'combined': 0.22573691295758613, 'epoch': 13}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2905983872341264, 'r': 0.2917481335733316, 'f1': 0.2911721254122786}, 'combined': 0.18083321472973093, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29856527627161444, 'r': 0.35861825404161657, 'f1': 0.3258479653102275}, 'combined': 0.2400985007549045, 'epoch': 13}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31431241050496134, 'r': 0.30312027719321194, 'f1': 0.3086149045743578}, 'combined': 0.19166609863039066, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26851851851851855, 'r': 0.4142857142857143, 'f1': 0.3258426966292135}, 'combined': 0.21722846441947566, 'epoch': 13}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26666666666666666, 'r': 0.5217391304347826, 'f1': 0.3529411764705882}, 'combined': 0.1764705882352941, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 13}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 14
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:20:25.454330: step: 2/464, loss: 0.12193366140127182
2023-01-22 13:20:26.203683: step: 4/464, loss: 0.07567768543958664
2023-01-22 13:20:26.795886: step: 6/464, loss: 0.08617586642503738
2023-01-22 13:20:27.510984: step: 8/464, loss: 0.06880141794681549
2023-01-22 13:20:28.193030: step: 10/464, loss: 0.2527928054332733
2023-01-22 13:20:28.942985: step: 12/464, loss: 0.13666020333766937
2023-01-22 13:20:29.674577: step: 14/464, loss: 0.9853371381759644
2023-01-22 13:20:30.422896: step: 16/464, loss: 0.16098473966121674
2023-01-22 13:20:31.157906: step: 18/464, loss: 0.05923319235444069
2023-01-22 13:20:31.954093: step: 20/464, loss: 0.2810465097427368
2023-01-22 13:20:32.817092: step: 22/464, loss: 0.1063055694103241
2023-01-22 13:20:33.554381: step: 24/464, loss: 0.08378912508487701
2023-01-22 13:20:34.284590: step: 26/464, loss: 0.13146083056926727
2023-01-22 13:20:35.027114: step: 28/464, loss: 0.0617261677980423
2023-01-22 13:20:35.691174: step: 30/464, loss: 0.0540962852537632
2023-01-22 13:20:36.411839: step: 32/464, loss: 0.10380225628614426
2023-01-22 13:20:37.123489: step: 34/464, loss: 0.04630668833851814
2023-01-22 13:20:37.823299: step: 36/464, loss: 0.06711572408676147
2023-01-22
13:20:38.683786: step: 38/464, loss: 0.1516401767730713 2023-01-22 13:20:39.414135: step: 40/464, loss: 0.04767386242747307 2023-01-22 13:20:40.188380: step: 42/464, loss: 0.05505099147558212 2023-01-22 13:20:40.982268: step: 44/464, loss: 0.6285038590431213 2023-01-22 13:20:41.745987: step: 46/464, loss: 0.4285893142223358 2023-01-22 13:20:42.487360: step: 48/464, loss: 0.2863122522830963 2023-01-22 13:20:43.189858: step: 50/464, loss: 0.17619168758392334 2023-01-22 13:20:43.862698: step: 52/464, loss: 0.10409979522228241 2023-01-22 13:20:44.564663: step: 54/464, loss: 0.11232734471559525 2023-01-22 13:20:45.330506: step: 56/464, loss: 0.0699315294623375 2023-01-22 13:20:46.093021: step: 58/464, loss: 0.12718603014945984 2023-01-22 13:20:46.824507: step: 60/464, loss: 0.1453656256198883 2023-01-22 13:20:47.593850: step: 62/464, loss: 0.06315405666828156 2023-01-22 13:20:48.438544: step: 64/464, loss: 0.19121019542217255 2023-01-22 13:20:49.172169: step: 66/464, loss: 0.11257484555244446 2023-01-22 13:20:49.914413: step: 68/464, loss: 0.27853450179100037 2023-01-22 13:20:50.592918: step: 70/464, loss: 0.45154255628585815 2023-01-22 13:20:51.307209: step: 72/464, loss: 0.16039679944515228 2023-01-22 13:20:52.019594: step: 74/464, loss: 0.1091892197728157 2023-01-22 13:20:52.715946: step: 76/464, loss: 0.17020484805107117 2023-01-22 13:20:53.427346: step: 78/464, loss: 0.08507838845252991 2023-01-22 13:20:54.203580: step: 80/464, loss: 0.07408726215362549 2023-01-22 13:20:54.933407: step: 82/464, loss: 0.06273666769266129 2023-01-22 13:20:55.625732: step: 84/464, loss: 0.06049322336912155 2023-01-22 13:20:56.392485: step: 86/464, loss: 0.0883559063076973 2023-01-22 13:20:57.052467: step: 88/464, loss: 0.21968767046928406 2023-01-22 13:20:57.717775: step: 90/464, loss: 0.1691584289073944 2023-01-22 13:20:58.479027: step: 92/464, loss: 0.14545169472694397 2023-01-22 13:20:59.211269: step: 94/464, loss: 0.08463499695062637 2023-01-22 13:20:59.965837: step: 96/464, loss: 
0.211579829454422 2023-01-22 13:21:00.719697: step: 98/464, loss: 0.06251304596662521 2023-01-22 13:21:01.363001: step: 100/464, loss: 0.037433043122291565 2023-01-22 13:21:02.143592: step: 102/464, loss: 0.17153921723365784 2023-01-22 13:21:02.950989: step: 104/464, loss: 0.0873965322971344 2023-01-22 13:21:03.759263: step: 106/464, loss: 0.17969262599945068 2023-01-22 13:21:04.532478: step: 108/464, loss: 0.07596272975206375 2023-01-22 13:21:05.245362: step: 110/464, loss: 0.06037634611129761 2023-01-22 13:21:06.032623: step: 112/464, loss: 0.11389008164405823 2023-01-22 13:21:06.852242: step: 114/464, loss: 0.10771888494491577 2023-01-22 13:21:07.609155: step: 116/464, loss: 0.08732682466506958 2023-01-22 13:21:08.407177: step: 118/464, loss: 0.11964301764965057 2023-01-22 13:21:09.163327: step: 120/464, loss: 0.03714948520064354 2023-01-22 13:21:09.885021: step: 122/464, loss: 0.08723358064889908 2023-01-22 13:21:10.554978: step: 124/464, loss: 0.08652307838201523 2023-01-22 13:21:11.317370: step: 126/464, loss: 0.182054340839386 2023-01-22 13:21:12.044733: step: 128/464, loss: 0.03836066275835037 2023-01-22 13:21:12.782049: step: 130/464, loss: 0.05950037017464638 2023-01-22 13:21:13.625958: step: 132/464, loss: 0.04085146263241768 2023-01-22 13:21:14.313818: step: 134/464, loss: 0.07664566487073898 2023-01-22 13:21:15.090391: step: 136/464, loss: 0.16005469858646393 2023-01-22 13:21:15.819256: step: 138/464, loss: 0.04177999496459961 2023-01-22 13:21:16.509344: step: 140/464, loss: 0.46670863032341003 2023-01-22 13:21:17.212943: step: 142/464, loss: 0.5049713850021362 2023-01-22 13:21:17.957112: step: 144/464, loss: 0.037760179489851 2023-01-22 13:21:18.637923: step: 146/464, loss: 0.18749308586120605 2023-01-22 13:21:19.373330: step: 148/464, loss: 0.06520560383796692 2023-01-22 13:21:20.071228: step: 150/464, loss: 0.08682045340538025 2023-01-22 13:21:20.893209: step: 152/464, loss: 0.28364092111587524 2023-01-22 13:21:21.590713: step: 154/464, loss: 
0.2779575288295746 2023-01-22 13:21:22.363934: step: 156/464, loss: 0.21594323217868805 2023-01-22 13:21:23.236169: step: 158/464, loss: 0.2842862010002136 2023-01-22 13:21:24.013855: step: 160/464, loss: 0.13473035395145416 2023-01-22 13:21:24.642357: step: 162/464, loss: 0.10318247228860855 2023-01-22 13:21:25.367667: step: 164/464, loss: 0.1572057455778122 2023-01-22 13:21:26.200146: step: 166/464, loss: 0.12798070907592773 2023-01-22 13:21:26.900468: step: 168/464, loss: 0.07746720314025879 2023-01-22 13:21:27.687184: step: 170/464, loss: 0.06719473749399185 2023-01-22 13:21:28.448599: step: 172/464, loss: 0.5005451440811157 2023-01-22 13:21:29.208561: step: 174/464, loss: 0.11085796356201172 2023-01-22 13:21:29.980844: step: 176/464, loss: 0.08189515769481659 2023-01-22 13:21:30.757226: step: 178/464, loss: 0.22294452786445618 2023-01-22 13:21:31.555711: step: 180/464, loss: 0.14733389019966125 2023-01-22 13:21:32.305539: step: 182/464, loss: 0.9163328409194946 2023-01-22 13:21:33.043324: step: 184/464, loss: 0.18387994170188904 2023-01-22 13:21:33.784234: step: 186/464, loss: 0.10005448013544083 2023-01-22 13:21:34.520398: step: 188/464, loss: 0.07477206736803055 2023-01-22 13:21:35.215960: step: 190/464, loss: 0.6215589046478271 2023-01-22 13:21:35.982289: step: 192/464, loss: 0.443669855594635 2023-01-22 13:21:36.732084: step: 194/464, loss: 0.29367828369140625 2023-01-22 13:21:37.496563: step: 196/464, loss: 0.0811825841665268 2023-01-22 13:21:38.218660: step: 198/464, loss: 0.037247225642204285 2023-01-22 13:21:38.907190: step: 200/464, loss: 0.06594298779964447 2023-01-22 13:21:39.634261: step: 202/464, loss: 0.16184520721435547 2023-01-22 13:21:40.349672: step: 204/464, loss: 0.10985446721315384 2023-01-22 13:21:41.090241: step: 206/464, loss: 0.053390901535749435 2023-01-22 13:21:41.848219: step: 208/464, loss: 1.1080310344696045 2023-01-22 13:21:42.534853: step: 210/464, loss: 0.41309675574302673 2023-01-22 13:21:43.242046: step: 212/464, loss: 
0.09760172665119171 2023-01-22 13:21:44.088840: step: 214/464, loss: 0.17236673831939697 2023-01-22 13:21:44.801899: step: 216/464, loss: 0.06706502288579941 2023-01-22 13:21:45.582106: step: 218/464, loss: 0.640880286693573 2023-01-22 13:21:46.366143: step: 220/464, loss: 0.15519407391548157 2023-01-22 13:21:47.128369: step: 222/464, loss: 0.12634916603565216 2023-01-22 13:21:47.875204: step: 224/464, loss: 0.18875771760940552 2023-01-22 13:21:48.567627: step: 226/464, loss: 0.08672748506069183 2023-01-22 13:21:49.306396: step: 228/464, loss: 0.0703166127204895 2023-01-22 13:21:50.007349: step: 230/464, loss: 0.21372142434120178 2023-01-22 13:21:50.764136: step: 232/464, loss: 0.09859161078929901 2023-01-22 13:21:51.523622: step: 234/464, loss: 0.08668128401041031 2023-01-22 13:21:52.182504: step: 236/464, loss: 0.1777801811695099 2023-01-22 13:21:52.999324: step: 238/464, loss: 0.1596376895904541 2023-01-22 13:21:53.693927: step: 240/464, loss: 0.7244187593460083 2023-01-22 13:21:54.516969: step: 242/464, loss: 0.1390981376171112 2023-01-22 13:21:55.235319: step: 244/464, loss: 0.20769082009792328 2023-01-22 13:21:56.047257: step: 246/464, loss: 0.06973709166049957 2023-01-22 13:21:56.806193: step: 248/464, loss: 0.3444404900074005 2023-01-22 13:21:57.448886: step: 250/464, loss: 0.16165480017662048 2023-01-22 13:21:58.174784: step: 252/464, loss: 0.12662559747695923 2023-01-22 13:21:58.965855: step: 254/464, loss: 0.19266805052757263 2023-01-22 13:21:59.797980: step: 256/464, loss: 0.46310684084892273 2023-01-22 13:22:00.487061: step: 258/464, loss: 0.07980255782604218 2023-01-22 13:22:01.215883: step: 260/464, loss: 0.15134549140930176 2023-01-22 13:22:01.909487: step: 262/464, loss: 0.16883066296577454 2023-01-22 13:22:02.708786: step: 264/464, loss: 0.22058843076229095 2023-01-22 13:22:03.337969: step: 266/464, loss: 0.06367085129022598 2023-01-22 13:22:04.083278: step: 268/464, loss: 0.055862151086330414 2023-01-22 13:22:04.777305: step: 270/464, loss: 
0.04270695149898529 2023-01-22 13:22:05.493027: step: 272/464, loss: 0.24421437084674835 2023-01-22 13:22:06.230514: step: 274/464, loss: 0.07772643119096756 2023-01-22 13:22:07.077316: step: 276/464, loss: 0.09192586690187454 2023-01-22 13:22:07.844414: step: 278/464, loss: 0.09877754002809525 2023-01-22 13:22:08.560776: step: 280/464, loss: 0.12516112625598907 2023-01-22 13:22:09.255990: step: 282/464, loss: 0.12664923071861267 2023-01-22 13:22:10.068427: step: 284/464, loss: 1.0937851667404175 2023-01-22 13:22:10.844366: step: 286/464, loss: 0.317772775888443 2023-01-22 13:22:11.529168: step: 288/464, loss: 0.0594954788684845 2023-01-22 13:22:12.236366: step: 290/464, loss: 0.07765944302082062 2023-01-22 13:22:12.956012: step: 292/464, loss: 0.20633435249328613 2023-01-22 13:22:13.657465: step: 294/464, loss: 0.4687633514404297 2023-01-22 13:22:14.471808: step: 296/464, loss: 1.9894096851348877 2023-01-22 13:22:15.221856: step: 298/464, loss: 0.1260242909193039 2023-01-22 13:22:16.023639: step: 300/464, loss: 0.056300777941942215 2023-01-22 13:22:16.729470: step: 302/464, loss: 0.19352252781391144 2023-01-22 13:22:17.404016: step: 304/464, loss: 0.08822372555732727 2023-01-22 13:22:18.173914: step: 306/464, loss: 0.14995773136615753 2023-01-22 13:22:18.896283: step: 308/464, loss: 0.05846775323152542 2023-01-22 13:22:19.656715: step: 310/464, loss: 0.18117393553256989 2023-01-22 13:22:20.330530: step: 312/464, loss: 0.034953609108924866 2023-01-22 13:22:21.026823: step: 314/464, loss: 0.024108169600367546 2023-01-22 13:22:21.763324: step: 316/464, loss: 0.0547054260969162 2023-01-22 13:22:22.486870: step: 318/464, loss: 0.03408285230398178 2023-01-22 13:22:23.254121: step: 320/464, loss: 0.028545448556542397 2023-01-22 13:22:23.948412: step: 322/464, loss: 0.2630191743373871 2023-01-22 13:22:24.682944: step: 324/464, loss: 0.19812065362930298 2023-01-22 13:22:25.380514: step: 326/464, loss: 0.16171224415302277 2023-01-22 13:22:26.115990: step: 328/464, loss: 
0.22023920714855194 2023-01-22 13:22:26.899362: step: 330/464, loss: 0.11093226820230484 2023-01-22 13:22:27.656508: step: 332/464, loss: 0.14447692036628723 2023-01-22 13:22:28.404991: step: 334/464, loss: 0.35919249057769775 2023-01-22 13:22:29.091200: step: 336/464, loss: 0.09195218980312347 2023-01-22 13:22:29.910461: step: 338/464, loss: 0.0950898677110672 2023-01-22 13:22:30.716087: step: 340/464, loss: 0.017913883551955223 2023-01-22 13:22:31.398378: step: 342/464, loss: 0.09973374009132385 2023-01-22 13:22:32.228257: step: 344/464, loss: 0.140656977891922 2023-01-22 13:22:32.916784: step: 346/464, loss: 0.09039287269115448 2023-01-22 13:22:33.723607: step: 348/464, loss: 0.1490309089422226 2023-01-22 13:22:34.445298: step: 350/464, loss: 0.26216188073158264 2023-01-22 13:22:35.204880: step: 352/464, loss: 0.1854851096868515 2023-01-22 13:22:35.907404: step: 354/464, loss: 0.0873323529958725 2023-01-22 13:22:36.628238: step: 356/464, loss: 0.13864044845104218 2023-01-22 13:22:37.370586: step: 358/464, loss: 0.14705531299114227 2023-01-22 13:22:38.125719: step: 360/464, loss: 0.06606864929199219 2023-01-22 13:22:38.809855: step: 362/464, loss: 0.05874408036470413 2023-01-22 13:22:39.584705: step: 364/464, loss: 0.12137801945209503 2023-01-22 13:22:40.265383: step: 366/464, loss: 0.1096571609377861 2023-01-22 13:22:40.912311: step: 368/464, loss: 0.08940312266349792 2023-01-22 13:22:41.645376: step: 370/464, loss: 1.0309978723526 2023-01-22 13:22:42.336069: step: 372/464, loss: 0.1272679716348648 2023-01-22 13:22:43.030481: step: 374/464, loss: 0.29737213253974915 2023-01-22 13:22:43.756292: step: 376/464, loss: 0.1702267825603485 2023-01-22 13:22:44.473749: step: 378/464, loss: 0.1460234671831131 2023-01-22 13:22:45.160498: step: 380/464, loss: 0.13599790632724762 2023-01-22 13:22:45.943949: step: 382/464, loss: 0.1381097435951233 2023-01-22 13:22:46.620097: step: 384/464, loss: 0.13972963392734528 2023-01-22 13:22:47.366880: step: 386/464, loss: 
0.07999303191900253 2023-01-22 13:22:48.085844: step: 388/464, loss: 0.07907330989837646 2023-01-22 13:22:48.853563: step: 390/464, loss: 0.03261999040842056 2023-01-22 13:22:49.580023: step: 392/464, loss: 0.1951265037059784 2023-01-22 13:22:50.331597: step: 394/464, loss: 0.06247112527489662 2023-01-22 13:22:51.088869: step: 396/464, loss: 0.02576691471040249 2023-01-22 13:22:52.014851: step: 398/464, loss: 0.02725120075047016 2023-01-22 13:22:52.736424: step: 400/464, loss: 0.25071337819099426 2023-01-22 13:22:53.438277: step: 402/464, loss: 0.09235862642526627 2023-01-22 13:22:54.189875: step: 404/464, loss: 0.10845604538917542 2023-01-22 13:22:54.892949: step: 406/464, loss: 0.11261343210935593 2023-01-22 13:22:55.600624: step: 408/464, loss: 0.11463362723588943 2023-01-22 13:22:56.386547: step: 410/464, loss: 0.1571776568889618 2023-01-22 13:22:57.118716: step: 412/464, loss: 0.12521491944789886 2023-01-22 13:22:57.841028: step: 414/464, loss: 0.045416172593832016 2023-01-22 13:22:58.542119: step: 416/464, loss: 0.054573748260736465 2023-01-22 13:22:59.262396: step: 418/464, loss: 0.225603848695755 2023-01-22 13:23:00.086357: step: 420/464, loss: 0.055413082242012024 2023-01-22 13:23:00.934319: step: 422/464, loss: 0.39748069643974304 2023-01-22 13:23:01.683047: step: 424/464, loss: 0.26622945070266724 2023-01-22 13:23:02.461763: step: 426/464, loss: 0.12175989151000977 2023-01-22 13:23:03.142517: step: 428/464, loss: 0.09528283774852753 2023-01-22 13:23:03.944456: step: 430/464, loss: 0.22560282051563263 2023-01-22 13:23:04.613824: step: 432/464, loss: 0.20506493747234344 2023-01-22 13:23:05.396142: step: 434/464, loss: 0.08719082176685333 2023-01-22 13:23:06.197116: step: 436/464, loss: 2.499934673309326 2023-01-22 13:23:06.907443: step: 438/464, loss: 0.6153557300567627 2023-01-22 13:23:07.611809: step: 440/464, loss: 0.1747024953365326 2023-01-22 13:23:08.315417: step: 442/464, loss: 0.06936346739530563 2023-01-22 13:23:09.052821: step: 444/464, loss: 
0.06344647705554962 2023-01-22 13:23:09.753684: step: 446/464, loss: 0.07927471399307251 2023-01-22 13:23:10.467468: step: 448/464, loss: 0.40280646085739136 2023-01-22 13:23:11.308831: step: 450/464, loss: 0.21933694183826447 2023-01-22 13:23:12.062329: step: 452/464, loss: 0.26208335161209106 2023-01-22 13:23:12.747033: step: 454/464, loss: 0.14688733220100403 2023-01-22 13:23:13.464087: step: 456/464, loss: 1.7487246990203857 2023-01-22 13:23:14.181167: step: 458/464, loss: 0.18525251746177673 2023-01-22 13:23:15.034836: step: 460/464, loss: 0.19878581166267395 2023-01-22 13:23:15.848109: step: 462/464, loss: 0.1086234375834465 2023-01-22 13:23:16.553484: step: 464/464, loss: 0.5612615346908569 2023-01-22 13:23:17.339118: step: 466/464, loss: 0.22032488882541656 2023-01-22 13:23:18.157232: step: 468/464, loss: 0.024769339710474014 2023-01-22 13:23:18.901043: step: 470/464, loss: 0.11801978200674057 2023-01-22 13:23:19.603074: step: 472/464, loss: 0.06288079172372818 2023-01-22 13:23:20.330032: step: 474/464, loss: 0.10719543695449829 2023-01-22 13:23:21.240192: step: 476/464, loss: 0.09371764212846756 2023-01-22 13:23:21.930261: step: 478/464, loss: 0.09058858454227448 2023-01-22 13:23:22.651949: step: 480/464, loss: 1.2796767950057983 2023-01-22 13:23:23.358482: step: 482/464, loss: 0.33729735016822815 2023-01-22 13:23:24.049044: step: 484/464, loss: 0.10503074526786804 2023-01-22 13:23:24.783842: step: 486/464, loss: 0.14708486199378967 2023-01-22 13:23:25.501519: step: 488/464, loss: 0.04026735946536064 2023-01-22 13:23:26.295741: step: 490/464, loss: 0.1619386523962021 2023-01-22 13:23:27.046455: step: 492/464, loss: 0.07265634089708328 2023-01-22 13:23:27.747944: step: 494/464, loss: 0.06399974226951599 2023-01-22 13:23:28.440142: step: 496/464, loss: 0.16489113867282867 2023-01-22 13:23:29.217483: step: 498/464, loss: 0.03770787641406059 2023-01-22 13:23:30.036931: step: 500/464, loss: 0.10955356806516647 2023-01-22 13:23:30.811981: step: 502/464, loss: 
0.26145660877227783 2023-01-22 13:23:31.614917: step: 504/464, loss: 0.3547782897949219 2023-01-22 13:23:32.338554: step: 506/464, loss: 0.10551081597805023 2023-01-22 13:23:33.070991: step: 508/464, loss: 0.07227484881877899 2023-01-22 13:23:33.778341: step: 510/464, loss: 0.24487046897411346 2023-01-22 13:23:34.681829: step: 512/464, loss: 0.11837557703256607 2023-01-22 13:23:35.433719: step: 514/464, loss: 0.19440263509750366 2023-01-22 13:23:36.123778: step: 516/464, loss: 0.04968242719769478 2023-01-22 13:23:36.867896: step: 518/464, loss: 0.2314143180847168 2023-01-22 13:23:37.620746: step: 520/464, loss: 0.0630800649523735 2023-01-22 13:23:38.409826: step: 522/464, loss: 0.09166283905506134 2023-01-22 13:23:39.145447: step: 524/464, loss: 0.09149608761072159 2023-01-22 13:23:39.854360: step: 526/464, loss: 0.13420897722244263 2023-01-22 13:23:40.569192: step: 528/464, loss: 0.09974882006645203 2023-01-22 13:23:41.381400: step: 530/464, loss: 0.09064196050167084 2023-01-22 13:23:42.117642: step: 532/464, loss: 0.08575993031263351 2023-01-22 13:23:42.891790: step: 534/464, loss: 0.04460528492927551 2023-01-22 13:23:43.643849: step: 536/464, loss: 0.08206569403409958 2023-01-22 13:23:44.434285: step: 538/464, loss: 0.3511834442615509 2023-01-22 13:23:45.203787: step: 540/464, loss: 0.9145286083221436 2023-01-22 13:23:45.915010: step: 542/464, loss: 0.23256400227546692 2023-01-22 13:23:46.595094: step: 544/464, loss: 0.0949745923280716 2023-01-22 13:23:47.361638: step: 546/464, loss: 0.05418713763356209 2023-01-22 13:23:48.056185: step: 548/464, loss: 1.9522722959518433 2023-01-22 13:23:48.753666: step: 550/464, loss: 0.6723595857620239 2023-01-22 13:23:49.530120: step: 552/464, loss: 0.21266116201877594 2023-01-22 13:23:50.249608: step: 554/464, loss: 0.1409553587436676 2023-01-22 13:23:51.057575: step: 556/464, loss: 0.11059970408678055 2023-01-22 13:23:51.791569: step: 558/464, loss: 0.09071413427591324 2023-01-22 13:23:52.437664: step: 560/464, loss: 
0.12655948102474213 2023-01-22 13:23:53.110879: step: 562/464, loss: 0.05802586302161217 2023-01-22 13:23:53.793120: step: 564/464, loss: 0.09048962593078613 2023-01-22 13:23:54.451852: step: 566/464, loss: 0.23743292689323425 2023-01-22 13:23:55.144804: step: 568/464, loss: 0.07267441600561142 2023-01-22 13:23:55.916761: step: 570/464, loss: 0.07767453044652939 2023-01-22 13:23:56.638195: step: 572/464, loss: 0.11691311746835709 2023-01-22 13:23:57.456948: step: 574/464, loss: 0.10471902787685394 2023-01-22 13:23:58.216170: step: 576/464, loss: 0.1262369155883789 2023-01-22 13:23:58.930142: step: 578/464, loss: 0.06593358516693115 2023-01-22 13:23:59.658865: step: 580/464, loss: 0.11966902017593384 2023-01-22 13:24:00.337303: step: 582/464, loss: 0.1588023602962494 2023-01-22 13:24:01.009360: step: 584/464, loss: 0.07839857786893845 2023-01-22 13:24:01.757178: step: 586/464, loss: 0.07492205500602722 2023-01-22 13:24:02.492307: step: 588/464, loss: 0.04727175459265709 2023-01-22 13:24:03.309242: step: 590/464, loss: 0.2579101622104645 2023-01-22 13:24:04.060680: step: 592/464, loss: 0.05314266309142113 2023-01-22 13:24:04.867588: step: 594/464, loss: 0.3269496560096741 2023-01-22 13:24:05.563608: step: 596/464, loss: 0.07443402707576752 2023-01-22 13:24:06.298140: step: 598/464, loss: 0.1413809210062027 2023-01-22 13:24:06.986943: step: 600/464, loss: 0.12987062335014343 2023-01-22 13:24:07.694153: step: 602/464, loss: 0.34331780672073364 2023-01-22 13:24:08.384072: step: 604/464, loss: 0.13412347435951233 2023-01-22 13:24:09.087626: step: 606/464, loss: 0.12206920236349106 2023-01-22 13:24:09.787606: step: 608/464, loss: 0.11294567584991455 2023-01-22 13:24:10.603945: step: 610/464, loss: 0.1791577935218811 2023-01-22 13:24:11.348413: step: 612/464, loss: 0.05504743009805679 2023-01-22 13:24:12.123056: step: 614/464, loss: 0.13075266778469086 2023-01-22 13:24:12.830392: step: 616/464, loss: 0.17414139211177826 2023-01-22 13:24:13.515357: step: 618/464, loss: 
0.0166025310754776 2023-01-22 13:24:14.296346: step: 620/464, loss: 0.08685021847486496 2023-01-22 13:24:14.952945: step: 622/464, loss: 0.029836809262633324 2023-01-22 13:24:15.641265: step: 624/464, loss: 0.19079022109508514 2023-01-22 13:24:16.384081: step: 626/464, loss: 0.06109185889363289 2023-01-22 13:24:17.074636: step: 628/464, loss: 0.08585669845342636 2023-01-22 13:24:17.829644: step: 630/464, loss: 0.2822912335395813 2023-01-22 13:24:18.557510: step: 632/464, loss: 0.11809217929840088 2023-01-22 13:24:19.262971: step: 634/464, loss: 0.7315946817398071 2023-01-22 13:24:19.990705: step: 636/464, loss: 0.1486750692129135 2023-01-22 13:24:20.742234: step: 638/464, loss: 0.08701854944229126 2023-01-22 13:24:21.438397: step: 640/464, loss: 0.22685103118419647 2023-01-22 13:24:22.150251: step: 642/464, loss: 0.5486653447151184 2023-01-22 13:24:22.902009: step: 644/464, loss: 0.14657476544380188 2023-01-22 13:24:23.612949: step: 646/464, loss: 0.04882512614130974 2023-01-22 13:24:24.419371: step: 648/464, loss: 0.18402329087257385 2023-01-22 13:24:25.079084: step: 650/464, loss: 0.21440860629081726 2023-01-22 13:24:25.877187: step: 652/464, loss: 0.3449436128139496 2023-01-22 13:24:26.569824: step: 654/464, loss: 0.13148115575313568 2023-01-22 13:24:27.348748: step: 656/464, loss: 0.16635462641716003 2023-01-22 13:24:28.149467: step: 658/464, loss: 0.1267225593328476 2023-01-22 13:24:28.906546: step: 660/464, loss: 0.028304558247327805 2023-01-22 13:24:29.631522: step: 662/464, loss: 0.054090242832899094 2023-01-22 13:24:30.384271: step: 664/464, loss: 0.11156564950942993 2023-01-22 13:24:31.052505: step: 666/464, loss: 0.3686261773109436 2023-01-22 13:24:31.764450: step: 668/464, loss: 0.13700802624225616 2023-01-22 13:24:32.461978: step: 670/464, loss: 0.09640567749738693 2023-01-22 13:24:33.214431: step: 672/464, loss: 0.08035552501678467 2023-01-22 13:24:33.994876: step: 674/464, loss: 0.07916685938835144 2023-01-22 13:24:34.656874: step: 676/464, loss: 
0.039851244539022446 2023-01-22 13:24:35.364005: step: 678/464, loss: 0.6721044182777405 2023-01-22 13:24:36.147362: step: 680/464, loss: 0.18033455312252045 2023-01-22 13:24:36.795964: step: 682/464, loss: 0.7768173217773438 2023-01-22 13:24:37.564264: step: 684/464, loss: 2.363861322402954 2023-01-22 13:24:38.333701: step: 686/464, loss: 0.2924078702926636 2023-01-22 13:24:39.051770: step: 688/464, loss: 0.05811668187379837 2023-01-22 13:24:39.788149: step: 690/464, loss: 0.21463051438331604 2023-01-22 13:24:40.574888: step: 692/464, loss: 0.06769423186779022 2023-01-22 13:24:41.250576: step: 694/464, loss: 0.06266149878501892 2023-01-22 13:24:41.991886: step: 696/464, loss: 0.0853852853178978 2023-01-22 13:24:42.744133: step: 698/464, loss: 0.09094809740781784 2023-01-22 13:24:43.467700: step: 700/464, loss: 0.25462332367897034 2023-01-22 13:24:44.211466: step: 702/464, loss: 0.16619449853897095 2023-01-22 13:24:44.987300: step: 704/464, loss: 0.10084399580955505 2023-01-22 13:24:45.679818: step: 706/464, loss: 0.29148754477500916 2023-01-22 13:24:46.369260: step: 708/464, loss: 0.02944398857653141 2023-01-22 13:24:47.119175: step: 710/464, loss: 0.10250351577997208 2023-01-22 13:24:47.841923: step: 712/464, loss: 0.18411992490291595 2023-01-22 13:24:48.546849: step: 714/464, loss: 0.15183106064796448 2023-01-22 13:24:49.321453: step: 716/464, loss: 0.1189422532916069 2023-01-22 13:24:50.003697: step: 718/464, loss: 0.07556652277708054 2023-01-22 13:24:50.881020: step: 720/464, loss: 0.23556427657604218 2023-01-22 13:24:51.523111: step: 722/464, loss: 0.1929933726787567 2023-01-22 13:24:52.262414: step: 724/464, loss: 0.1436728984117508 2023-01-22 13:24:52.975496: step: 726/464, loss: 0.046241115778684616 2023-01-22 13:24:53.665184: step: 728/464, loss: 0.15094700455665588 2023-01-22 13:24:54.292450: step: 730/464, loss: 0.06769252568483353 2023-01-22 13:24:55.053241: step: 732/464, loss: 0.15382608771324158 2023-01-22 13:24:55.791355: step: 734/464, loss: 
0.1293579787015915 2023-01-22 13:24:56.526472: step: 736/464, loss: 0.050788234919309616 2023-01-22 13:24:57.200207: step: 738/464, loss: 0.18262383341789246 2023-01-22 13:24:57.926956: step: 740/464, loss: 0.17732545733451843 2023-01-22 13:24:58.595555: step: 742/464, loss: 0.12889239192008972 2023-01-22 13:24:59.308503: step: 744/464, loss: 0.1251494586467743 2023-01-22 13:25:00.064908: step: 746/464, loss: 0.4357171058654785 2023-01-22 13:25:00.787975: step: 748/464, loss: 0.6001867055892944 2023-01-22 13:25:01.549238: step: 750/464, loss: 0.11146920174360275 2023-01-22 13:25:02.269129: step: 752/464, loss: 0.09523777663707733 2023-01-22 13:25:02.955006: step: 754/464, loss: 0.03619766607880592 2023-01-22 13:25:03.648432: step: 756/464, loss: 0.10129847377538681 2023-01-22 13:25:04.329907: step: 758/464, loss: 0.06092946603894234 2023-01-22 13:25:05.030046: step: 760/464, loss: 0.2895306646823883 2023-01-22 13:25:05.790966: step: 762/464, loss: 0.0353437177836895 2023-01-22 13:25:06.585831: step: 764/464, loss: 0.14700661599636078 2023-01-22 13:25:07.381431: step: 766/464, loss: 0.5515168309211731 2023-01-22 13:25:08.067161: step: 768/464, loss: 0.029831836000084877 2023-01-22 13:25:08.828515: step: 770/464, loss: 0.08818569034337997 2023-01-22 13:25:09.603184: step: 772/464, loss: 0.16117626428604126 2023-01-22 13:25:10.309351: step: 774/464, loss: 0.543182909488678 2023-01-22 13:25:11.107011: step: 776/464, loss: 0.07934430986642838 2023-01-22 13:25:11.862174: step: 778/464, loss: 0.21586981415748596 2023-01-22 13:25:12.656825: step: 780/464, loss: 0.22837424278259277 2023-01-22 13:25:13.391881: step: 782/464, loss: 0.05814993754029274 2023-01-22 13:25:14.139745: step: 784/464, loss: 0.06691834330558777 2023-01-22 13:25:14.866051: step: 786/464, loss: 0.12464606761932373 2023-01-22 13:25:15.620932: step: 788/464, loss: 0.0432974137365818 2023-01-22 13:25:16.383260: step: 790/464, loss: 0.19307135045528412 2023-01-22 13:25:17.135575: step: 792/464, loss: 
0.17207276821136475 2023-01-22 13:25:17.864155: step: 794/464, loss: 0.08930698037147522 2023-01-22 13:25:18.645968: step: 796/464, loss: 0.25028547644615173 2023-01-22 13:25:19.368642: step: 798/464, loss: 0.11197063326835632 2023-01-22 13:25:20.113382: step: 800/464, loss: 0.08689679205417633 2023-01-22 13:25:20.896436: step: 802/464, loss: 0.05291495844721794 2023-01-22 13:25:21.600949: step: 804/464, loss: 0.049140579998493195 2023-01-22 13:25:22.338399: step: 806/464, loss: 0.07981767505407333 2023-01-22 13:25:23.133804: step: 808/464, loss: 0.22347483038902283 2023-01-22 13:25:23.952073: step: 810/464, loss: 0.7353914976119995 2023-01-22 13:25:24.717527: step: 812/464, loss: 0.3017483949661255 2023-01-22 13:25:25.410345: step: 814/464, loss: 0.46327030658721924 2023-01-22 13:25:26.180693: step: 816/464, loss: 0.28747430443763733 2023-01-22 13:25:26.937556: step: 818/464, loss: 0.2881736755371094 2023-01-22 13:25:27.634249: step: 820/464, loss: 0.15409404039382935 2023-01-22 13:25:28.332522: step: 822/464, loss: 0.08141999691724777 2023-01-22 13:25:29.076850: step: 824/464, loss: 0.07773395627737045 2023-01-22 13:25:29.900691: step: 826/464, loss: 0.18517354130744934 2023-01-22 13:25:30.614338: step: 828/464, loss: 0.09534697234630585 2023-01-22 13:25:31.412215: step: 830/464, loss: 0.022645127028226852 2023-01-22 13:25:32.155908: step: 832/464, loss: 0.5086654424667358 2023-01-22 13:25:32.947437: step: 834/464, loss: 0.08283492177724838 2023-01-22 13:25:33.628676: step: 836/464, loss: 0.3400731384754181 2023-01-22 13:25:34.359552: step: 838/464, loss: 0.13037338852882385 2023-01-22 13:25:35.081405: step: 840/464, loss: 0.06704337894916534 2023-01-22 13:25:35.872402: step: 842/464, loss: 0.11626792699098587 2023-01-22 13:25:36.610662: step: 844/464, loss: 0.16931547224521637 2023-01-22 13:25:37.360935: step: 846/464, loss: 0.07658080011606216 2023-01-22 13:25:38.149596: step: 848/464, loss: 0.39117011427879333 2023-01-22 13:25:38.943001: step: 850/464, loss: 
0.1561334729194641 2023-01-22 13:25:39.646225: step: 852/464, loss: 0.11885397136211395 2023-01-22 13:25:40.396909: step: 854/464, loss: 0.09748605638742447 2023-01-22 13:25:41.139203: step: 856/464, loss: 0.14782683551311493 2023-01-22 13:25:41.878935: step: 858/464, loss: 0.05930705741047859 2023-01-22 13:25:42.566988: step: 860/464, loss: 0.19739149510860443 2023-01-22 13:25:43.294041: step: 862/464, loss: 0.23783403635025024 2023-01-22 13:25:44.047314: step: 864/464, loss: 0.1938062161207199 2023-01-22 13:25:44.808389: step: 866/464, loss: 0.1528290957212448 2023-01-22 13:25:45.485993: step: 868/464, loss: 0.27736181020736694 2023-01-22 13:25:46.242302: step: 870/464, loss: 0.24971850216388702 2023-01-22 13:25:46.931804: step: 872/464, loss: 0.29064181447029114 2023-01-22 13:25:47.653943: step: 874/464, loss: 0.29505786299705505 2023-01-22 13:25:48.445548: step: 876/464, loss: 0.13866356015205383 2023-01-22 13:25:49.272024: step: 878/464, loss: 0.12224125117063522 2023-01-22 13:25:50.029587: step: 880/464, loss: 0.07472982257604599 2023-01-22 13:25:50.749481: step: 882/464, loss: 0.572647750377655 2023-01-22 13:25:51.545441: step: 884/464, loss: 0.47569239139556885 2023-01-22 13:25:52.282644: step: 886/464, loss: 0.05924500524997711 2023-01-22 13:25:52.991665: step: 888/464, loss: 0.3303375840187073 2023-01-22 13:25:53.661437: step: 890/464, loss: 0.01358190830796957 2023-01-22 13:25:54.407881: step: 892/464, loss: 0.06630312651395798 2023-01-22 13:25:55.132564: step: 894/464, loss: 0.1577807366847992 2023-01-22 13:25:55.893929: step: 896/464, loss: 0.16048268973827362 2023-01-22 13:25:56.687978: step: 898/464, loss: 0.6700915694236755 2023-01-22 13:25:57.406007: step: 900/464, loss: 0.15095430612564087 2023-01-22 13:25:58.119762: step: 902/464, loss: 0.052950140088796616 2023-01-22 13:25:58.799822: step: 904/464, loss: 0.04643837735056877 2023-01-22 13:25:59.528987: step: 906/464, loss: 0.15152107179164886 2023-01-22 13:26:00.200116: step: 908/464, loss: 
0.11968246847391129 2023-01-22 13:26:00.918062: step: 910/464, loss: 0.05407499521970749 2023-01-22 13:26:01.702496: step: 912/464, loss: 0.26440638303756714 2023-01-22 13:26:02.398106: step: 914/464, loss: 0.2121550738811493 2023-01-22 13:26:03.136265: step: 916/464, loss: 0.15850675106048584 2023-01-22 13:26:03.938972: step: 918/464, loss: 0.1400802731513977 2023-01-22 13:26:04.681253: step: 920/464, loss: 0.09546446800231934 2023-01-22 13:26:05.427808: step: 922/464, loss: 0.32698971033096313 2023-01-22 13:26:06.169018: step: 924/464, loss: 0.16159042716026306 2023-01-22 13:26:07.029212: step: 926/464, loss: 0.1334841400384903 2023-01-22 13:26:07.754097: step: 928/464, loss: 0.18339157104492188 2023-01-22 13:26:08.396999: step: 930/464, loss: 0.2129916399717331
==================================================
Loss: 0.197
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2886159701363528, 'r': 0.32914269080066044, 'f1': 0.3075499965460072}, 'combined': 0.22661578692863685, 'epoch': 14}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2983585336153087, 'r': 0.2924562876486359, 'f1': 0.29537792888388703}, 'combined': 0.1834452400436772, 'epoch': 14}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27788650234259415, 'r': 0.3332528832647429, 'f1': 0.30306172472911047}, 'combined': 0.2233086392740814, 'epoch': 14}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2925649527952923, 'r': 0.29487771922055156, 'f1': 0.29371678331810847}, 'combined': 0.18241358121861476, 'epoch': 14}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29389853676479755, 'r': 0.33628238646901887, 'f1': 0.31366516401623523}, 'combined': 0.2311216998014365, 'epoch': 14}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3099877201683801, 'r': 0.3023299207571059, 'f1': 0.30611093527382804}, 'combined': 0.19011100190690375, 'epoch': 14}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2109375, 'r': 0.38571428571428573, 'f1': 0.27272727272727276}, 'combined': 0.18181818181818182, 'epoch': 14}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25555555555555554, 'r': 0.5, 'f1': 0.338235294117647}, 'combined': 0.1691176470588235, 'epoch': 14}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.33152173913043476, 'r': 0.2629310344827586, 'f1': 0.2932692307692308}, 'combined': 0.1955128205128205, 'epoch': 14}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 15
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:28:47.830469: step: 2/464, loss: 0.09386724978685379 2023-01-22 13:28:48.589659: step: 4/464, loss: 0.12544168531894684 2023-01-22 13:28:49.277898: step: 6/464, loss: 0.3711128830909729 2023-01-22 13:28:50.106290: step: 8/464, loss: 0.14079999923706055 2023-01-22 13:28:50.800171: step: 10/464, loss: 0.13299894332885742 2023-01-22 13:28:51.518359: step: 12/464, loss: 0.08605366945266724 2023-01-22 13:28:52.257085: step: 14/464, loss: 0.08199825882911682 2023-01-22 13:28:52.946768: step: 16/464, loss: 0.0731671154499054 2023-01-22 13:28:53.660593: step: 18/464, loss: 0.38960978388786316 2023-01-22 13:28:54.364247: step: 20/464, loss: 0.0806836411356926 2023-01-22 13:28:55.050975: step: 22/464, loss: 0.6706730723381042 2023-01-22 13:28:55.793284: step: 24/464, loss: 0.11350344866514206 2023-01-22 13:28:56.483185:
step: 26/464, loss: 0.13751348853111267 2023-01-22 13:28:57.225666: step: 28/464, loss: 0.07605013996362686 2023-01-22 13:28:57.977641: step: 30/464, loss: 0.08604296296834946 2023-01-22 13:28:58.784059: step: 32/464, loss: 0.11444265395402908 2023-01-22 13:28:59.454841: step: 34/464, loss: 0.14891904592514038 2023-01-22 13:29:00.201490: step: 36/464, loss: 0.08350467681884766 2023-01-22 13:29:01.000610: step: 38/464, loss: 0.09013022482395172 2023-01-22 13:29:01.713609: step: 40/464, loss: 0.6425589323043823 2023-01-22 13:29:02.504758: step: 42/464, loss: 0.45394182205200195 2023-01-22 13:29:03.170803: step: 44/464, loss: 0.07432316243648529 2023-01-22 13:29:03.942679: step: 46/464, loss: 0.09558834880590439 2023-01-22 13:29:04.692569: step: 48/464, loss: 0.05760970339179039 2023-01-22 13:29:05.458101: step: 50/464, loss: 0.06453204900026321 2023-01-22 13:29:06.192958: step: 52/464, loss: 0.10807399451732635 2023-01-22 13:29:06.916048: step: 54/464, loss: 0.10865321755409241 2023-01-22 13:29:07.655016: step: 56/464, loss: 0.0828612744808197 2023-01-22 13:29:08.387959: step: 58/464, loss: 0.6269303560256958 2023-01-22 13:29:09.075978: step: 60/464, loss: 0.044964175671339035 2023-01-22 13:29:09.764430: step: 62/464, loss: 0.07399984449148178 2023-01-22 13:29:10.511406: step: 64/464, loss: 0.08610676974058151 2023-01-22 13:29:11.199902: step: 66/464, loss: 0.40302664041519165 2023-01-22 13:29:11.923584: step: 68/464, loss: 0.10739628970623016 2023-01-22 13:29:12.626457: step: 70/464, loss: 0.06525947153568268 2023-01-22 13:29:13.416086: step: 72/464, loss: 0.04451471567153931 2023-01-22 13:29:14.189054: step: 74/464, loss: 0.14145416021347046 2023-01-22 13:29:14.888230: step: 76/464, loss: 0.11191177368164062 2023-01-22 13:29:15.643451: step: 78/464, loss: 0.06259164214134216 2023-01-22 13:29:16.480002: step: 80/464, loss: 0.1686648726463318 2023-01-22 13:29:17.201461: step: 82/464, loss: 0.11285480856895447 2023-01-22 13:29:17.967909: step: 84/464, loss: 
0.20290781557559967 2023-01-22 13:29:18.760304: step: 86/464, loss: 4.417758941650391 2023-01-22 13:29:19.482815: step: 88/464, loss: 0.06766335666179657 2023-01-22 13:29:20.172141: step: 90/464, loss: 0.09443461894989014 2023-01-22 13:29:20.865224: step: 92/464, loss: 0.07344210147857666 2023-01-22 13:29:21.539970: step: 94/464, loss: 0.07837525755167007 2023-01-22 13:29:22.340095: step: 96/464, loss: 0.2997106611728668 2023-01-22 13:29:23.078751: step: 98/464, loss: 0.2080036699771881 2023-01-22 13:29:23.804355: step: 100/464, loss: 0.049382418394088745 2023-01-22 13:29:24.545267: step: 102/464, loss: 0.44997745752334595 2023-01-22 13:29:25.221919: step: 104/464, loss: 0.03246447443962097 2023-01-22 13:29:25.971546: step: 106/464, loss: 0.19689927995204926 2023-01-22 13:29:26.718405: step: 108/464, loss: 0.0539761483669281 2023-01-22 13:29:27.427719: step: 110/464, loss: 0.3616011142730713 2023-01-22 13:29:28.217359: step: 112/464, loss: 0.050215672701597214 2023-01-22 13:29:28.917831: step: 114/464, loss: 0.03970296308398247 2023-01-22 13:29:29.658352: step: 116/464, loss: 0.8578179478645325 2023-01-22 13:29:30.436143: step: 118/464, loss: 0.0712665542960167 2023-01-22 13:29:31.145658: step: 120/464, loss: 0.03966226801276207 2023-01-22 13:29:31.905769: step: 122/464, loss: 0.05801122263073921 2023-01-22 13:29:32.635670: step: 124/464, loss: 0.14632351696491241 2023-01-22 13:29:33.325340: step: 126/464, loss: 0.13727235794067383 2023-01-22 13:29:34.151636: step: 128/464, loss: 0.11179246753454208 2023-01-22 13:29:34.887213: step: 130/464, loss: 0.052726734429597855 2023-01-22 13:29:35.580500: step: 132/464, loss: 0.605263888835907 2023-01-22 13:29:36.359037: step: 134/464, loss: 0.1118796169757843 2023-01-22 13:29:37.149568: step: 136/464, loss: 0.060034509748220444 2023-01-22 13:29:37.837564: step: 138/464, loss: 0.10273944586515427 2023-01-22 13:29:38.562555: step: 140/464, loss: 0.00863298773765564 2023-01-22 13:29:39.294172: step: 142/464, loss: 
0.0825846940279007 2023-01-22 13:29:40.032005: step: 144/464, loss: 0.1935960054397583 2023-01-22 13:29:40.689743: step: 146/464, loss: 0.18002963066101074 2023-01-22 13:29:41.415465: step: 148/464, loss: 0.1243882030248642 2023-01-22 13:29:42.242780: step: 150/464, loss: 2.4855635166168213 2023-01-22 13:29:42.940928: step: 152/464, loss: 0.049686331301927567 2023-01-22 13:29:43.613258: step: 154/464, loss: 0.15782882273197174 2023-01-22 13:29:44.387288: step: 156/464, loss: 0.38794490694999695 2023-01-22 13:29:45.145776: step: 158/464, loss: 0.13704653084278107 2023-01-22 13:29:45.874501: step: 160/464, loss: 0.04243621975183487 2023-01-22 13:29:46.608425: step: 162/464, loss: 0.04862409457564354 2023-01-22 13:29:47.458021: step: 164/464, loss: 0.04644959047436714 2023-01-22 13:29:48.225940: step: 166/464, loss: 0.10157779604196548 2023-01-22 13:29:49.007062: step: 168/464, loss: 0.08971478044986725 2023-01-22 13:29:49.743503: step: 170/464, loss: 0.17621955275535583 2023-01-22 13:29:50.492840: step: 172/464, loss: 0.32910415530204773 2023-01-22 13:29:51.131946: step: 174/464, loss: 0.09267746657133102 2023-01-22 13:29:51.851821: step: 176/464, loss: 0.3283587396144867 2023-01-22 13:29:52.642172: step: 178/464, loss: 0.0949794203042984 2023-01-22 13:29:53.292860: step: 180/464, loss: 0.26500043272972107 2023-01-22 13:29:54.187246: step: 182/464, loss: 0.08976930379867554 2023-01-22 13:29:54.860511: step: 184/464, loss: 0.19251510500907898 2023-01-22 13:29:55.591876: step: 186/464, loss: 0.2029496431350708 2023-01-22 13:29:56.250648: step: 188/464, loss: 0.018383512273430824 2023-01-22 13:29:57.059166: step: 190/464, loss: 0.8381131291389465 2023-01-22 13:29:57.792399: step: 192/464, loss: 0.0751001387834549 2023-01-22 13:29:58.455531: step: 194/464, loss: 0.05229335278272629 2023-01-22 13:29:59.193214: step: 196/464, loss: 0.1134602278470993 2023-01-22 13:29:59.940161: step: 198/464, loss: 0.10519812256097794 2023-01-22 13:30:00.581919: step: 200/464, loss: 
0.18681646883487701 2023-01-22 13:30:01.305060: step: 202/464, loss: 0.20296519994735718 2023-01-22 13:30:02.083843: step: 204/464, loss: 0.14943040907382965 2023-01-22 13:30:02.823483: step: 206/464, loss: 0.2886558473110199 2023-01-22 13:30:03.611753: step: 208/464, loss: 0.12967827916145325 2023-01-22 13:30:04.339541: step: 210/464, loss: 0.2362452745437622 2023-01-22 13:30:05.013954: step: 212/464, loss: 0.09625612944364548 2023-01-22 13:30:05.754863: step: 214/464, loss: 0.20666609704494476 2023-01-22 13:30:06.471003: step: 216/464, loss: 0.09178787469863892 2023-01-22 13:30:07.199401: step: 218/464, loss: 0.06775946170091629 2023-01-22 13:30:07.871332: step: 220/464, loss: 0.06090725213289261 2023-01-22 13:30:08.545554: step: 222/464, loss: 0.08698819577693939 2023-01-22 13:30:09.302809: step: 224/464, loss: 0.05898406729102135 2023-01-22 13:30:10.088341: step: 226/464, loss: 0.07399263978004456 2023-01-22 13:30:10.825629: step: 228/464, loss: 0.3723280131816864 2023-01-22 13:30:11.599190: step: 230/464, loss: 0.11726538836956024 2023-01-22 13:30:12.298641: step: 232/464, loss: 0.008238730020821095 2023-01-22 13:30:13.080319: step: 234/464, loss: 0.0894642323255539 2023-01-22 13:30:13.837440: step: 236/464, loss: 0.7798489928245544 2023-01-22 13:30:14.625898: step: 238/464, loss: 4.281653881072998 2023-01-22 13:30:15.325977: step: 240/464, loss: 0.06438206881284714 2023-01-22 13:30:16.047162: step: 242/464, loss: 0.036539752036333084 2023-01-22 13:30:16.852666: step: 244/464, loss: 0.0853443443775177 2023-01-22 13:30:17.601991: step: 246/464, loss: 0.1012922152876854 2023-01-22 13:30:18.304859: step: 248/464, loss: 0.535499095916748 2023-01-22 13:30:19.078625: step: 250/464, loss: 0.11578743159770966 2023-01-22 13:30:19.883128: step: 252/464, loss: 0.02177901193499565 2023-01-22 13:30:20.610770: step: 254/464, loss: 0.148213192820549 2023-01-22 13:30:21.365732: step: 256/464, loss: 0.1408969908952713 2023-01-22 13:30:22.046616: step: 258/464, loss: 
0.09203886240720749 2023-01-22 13:30:22.803582: step: 260/464, loss: 0.08816751092672348 2023-01-22 13:30:23.601274: step: 262/464, loss: 0.2151239514350891 2023-01-22 13:30:24.338229: step: 264/464, loss: 0.23791560530662537 2023-01-22 13:30:25.106797: step: 266/464, loss: 0.060156311839818954 2023-01-22 13:30:25.785872: step: 268/464, loss: 1.4113879203796387 2023-01-22 13:30:26.486480: step: 270/464, loss: 0.05631784349679947 2023-01-22 13:30:27.221725: step: 272/464, loss: 0.04653649777173996 2023-01-22 13:30:27.959458: step: 274/464, loss: 0.038742709904909134 2023-01-22 13:30:28.693153: step: 276/464, loss: 0.36380136013031006 2023-01-22 13:30:29.406014: step: 278/464, loss: 0.18330629169940948 2023-01-22 13:30:30.113211: step: 280/464, loss: 1.0259242057800293 2023-01-22 13:30:30.886582: step: 282/464, loss: 0.35638394951820374 2023-01-22 13:30:31.620438: step: 284/464, loss: 0.08629385381937027 2023-01-22 13:30:32.376382: step: 286/464, loss: 0.8568986654281616 2023-01-22 13:30:33.059624: step: 288/464, loss: 0.10635808855295181 2023-01-22 13:30:33.784404: step: 290/464, loss: 0.16357223689556122 2023-01-22 13:30:34.636271: step: 292/464, loss: 0.13970738649368286 2023-01-22 13:30:35.396399: step: 294/464, loss: 0.18330936133861542 2023-01-22 13:30:36.140901: step: 296/464, loss: 0.13619013130664825 2023-01-22 13:30:36.814434: step: 298/464, loss: 0.032106779515743256 2023-01-22 13:30:37.483836: step: 300/464, loss: 0.31436678767204285 2023-01-22 13:30:38.274073: step: 302/464, loss: 0.4427456259727478 2023-01-22 13:30:39.076343: step: 304/464, loss: 0.08416996151208878 2023-01-22 13:30:39.837059: step: 306/464, loss: 0.12893402576446533 2023-01-22 13:30:40.705777: step: 308/464, loss: 0.15583616495132446 2023-01-22 13:30:41.458882: step: 310/464, loss: 0.1951410174369812 2023-01-22 13:30:42.225349: step: 312/464, loss: 0.040011730045080185 2023-01-22 13:30:43.008780: step: 314/464, loss: 0.09809692949056625 2023-01-22 13:30:43.723319: step: 316/464, loss: 
0.1280430257320404 2023-01-22 13:30:44.462399: step: 318/464, loss: 0.33851298689842224 2023-01-22 13:30:45.132802: step: 320/464, loss: 0.32978272438049316 2023-01-22 13:30:45.862026: step: 322/464, loss: 0.06723512709140778 2023-01-22 13:30:46.614297: step: 324/464, loss: 0.934802770614624 2023-01-22 13:30:47.390295: step: 326/464, loss: 1.1091173887252808 2023-01-22 13:30:48.139724: step: 328/464, loss: 0.08063969761133194 2023-01-22 13:30:48.913711: step: 330/464, loss: 0.07467617839574814 2023-01-22 13:30:49.649970: step: 332/464, loss: 0.03026079200208187 2023-01-22 13:30:50.406197: step: 334/464, loss: 0.039279479533433914 2023-01-22 13:30:51.131925: step: 336/464, loss: 0.04296695813536644 2023-01-22 13:30:51.870359: step: 338/464, loss: 0.15277546644210815 2023-01-22 13:30:52.552441: step: 340/464, loss: 0.1530921310186386 2023-01-22 13:30:53.267946: step: 342/464, loss: 0.1040443629026413 2023-01-22 13:30:54.093030: step: 344/464, loss: 0.11835810542106628 2023-01-22 13:30:54.861180: step: 346/464, loss: 0.6051956415176392 2023-01-22 13:30:55.584016: step: 348/464, loss: 0.4107264280319214 2023-01-22 13:30:56.342498: step: 350/464, loss: 0.1013975739479065 2023-01-22 13:30:57.094724: step: 352/464, loss: 0.976349413394928 2023-01-22 13:30:57.897586: step: 354/464, loss: 0.3468686640262604 2023-01-22 13:30:58.635469: step: 356/464, loss: 0.0867522656917572 2023-01-22 13:30:59.369684: step: 358/464, loss: 0.672160267829895 2023-01-22 13:31:00.186829: step: 360/464, loss: 0.1508776992559433 2023-01-22 13:31:01.008924: step: 362/464, loss: 0.08101523667573929 2023-01-22 13:31:01.723336: step: 364/464, loss: 0.19883856177330017 2023-01-22 13:31:02.530202: step: 366/464, loss: 0.14487911760807037 2023-01-22 13:31:03.281476: step: 368/464, loss: 0.11524210125207901 2023-01-22 13:31:04.018214: step: 370/464, loss: 0.1271398514509201 2023-01-22 13:31:04.625271: step: 372/464, loss: 0.049729228019714355 2023-01-22 13:31:05.299323: step: 374/464, loss: 
0.030202293768525124 2023-01-22 13:31:06.074062: step: 376/464, loss: 0.2078704684972763 2023-01-22 13:31:06.717702: step: 378/464, loss: 0.16754186153411865 2023-01-22 13:31:07.469417: step: 380/464, loss: 0.14381146430969238 2023-01-22 13:31:08.161147: step: 382/464, loss: 0.05142787843942642 2023-01-22 13:31:09.050594: step: 384/464, loss: 0.6397085785865784 2023-01-22 13:31:09.742478: step: 386/464, loss: 1.1454335451126099 2023-01-22 13:31:10.473076: step: 388/464, loss: 0.9766638278961182 2023-01-22 13:31:11.211952: step: 390/464, loss: 0.01772836036980152 2023-01-22 13:31:11.971294: step: 392/464, loss: 0.11965905874967575 2023-01-22 13:31:12.700532: step: 394/464, loss: 0.4042905569076538 2023-01-22 13:31:13.555738: step: 396/464, loss: 0.06917060911655426 2023-01-22 13:31:14.210364: step: 398/464, loss: 0.14034123718738556 2023-01-22 13:31:15.000544: step: 400/464, loss: 0.03209187090396881 2023-01-22 13:31:15.798518: step: 402/464, loss: 0.14110608398914337 2023-01-22 13:31:16.499883: step: 404/464, loss: 0.12455212324857712 2023-01-22 13:31:17.219961: step: 406/464, loss: 0.0245119147002697 2023-01-22 13:31:17.958054: step: 408/464, loss: 0.12244775146245956 2023-01-22 13:31:18.696816: step: 410/464, loss: 0.11255189776420593 2023-01-22 13:31:19.451021: step: 412/464, loss: 0.25171393156051636 2023-01-22 13:31:20.146547: step: 414/464, loss: 0.24863357841968536 2023-01-22 13:31:20.883143: step: 416/464, loss: 0.07378604263067245 2023-01-22 13:31:21.585172: step: 418/464, loss: 0.32955202460289 2023-01-22 13:31:22.305040: step: 420/464, loss: 0.16634100675582886 2023-01-22 13:31:23.044628: step: 422/464, loss: 0.09777677059173584 2023-01-22 13:31:23.703779: step: 424/464, loss: 1.3568098545074463 2023-01-22 13:31:24.323517: step: 426/464, loss: 0.1650019735097885 2023-01-22 13:31:25.049921: step: 428/464, loss: 7.026379108428955 2023-01-22 13:31:25.840831: step: 430/464, loss: 0.09765823185443878 2023-01-22 13:31:26.598524: step: 432/464, loss: 
0.07757440954446793 2023-01-22 13:31:27.264346: step: 434/464, loss: 0.08847343176603317 2023-01-22 13:31:28.011867: step: 436/464, loss: 0.07123363018035889 2023-01-22 13:31:28.672319: step: 438/464, loss: 0.04508393257856369 2023-01-22 13:31:29.406200: step: 440/464, loss: 0.04964699596166611 2023-01-22 13:31:30.096375: step: 442/464, loss: 0.01695246435701847 2023-01-22 13:31:30.853114: step: 444/464, loss: 0.17230522632598877 2023-01-22 13:31:31.551057: step: 446/464, loss: 0.1555473655462265 2023-01-22 13:31:32.365834: step: 448/464, loss: 0.19200685620307922 2023-01-22 13:31:33.121455: step: 450/464, loss: 0.32288166880607605 2023-01-22 13:31:33.915851: step: 452/464, loss: 0.3152397572994232 2023-01-22 13:31:34.707347: step: 454/464, loss: 0.11890127509832382 2023-01-22 13:31:35.424493: step: 456/464, loss: 0.10466277599334717 2023-01-22 13:31:36.125915: step: 458/464, loss: 0.05214134231209755 2023-01-22 13:31:36.859947: step: 460/464, loss: 0.06556827574968338 2023-01-22 13:31:37.580671: step: 462/464, loss: 0.16239741444587708 2023-01-22 13:31:38.295972: step: 464/464, loss: 0.08355816453695297 2023-01-22 13:31:39.028019: step: 466/464, loss: 0.13289578258991241 2023-01-22 13:31:39.713863: step: 468/464, loss: 0.06553588062524796 2023-01-22 13:31:40.538453: step: 470/464, loss: 0.09732535481452942 2023-01-22 13:31:41.267188: step: 472/464, loss: 1.0334306955337524 2023-01-22 13:31:42.035063: step: 474/464, loss: 0.8623963594436646 2023-01-22 13:31:42.870820: step: 476/464, loss: 0.098114974796772 2023-01-22 13:31:43.619192: step: 478/464, loss: 0.4096452593803406 2023-01-22 13:31:44.329045: step: 480/464, loss: 0.5679343938827515 2023-01-22 13:31:45.034853: step: 482/464, loss: 0.0800742655992508 2023-01-22 13:31:45.699832: step: 484/464, loss: 0.07798271626234055 2023-01-22 13:31:46.391740: step: 486/464, loss: 0.15702295303344727 2023-01-22 13:31:47.122563: step: 488/464, loss: 0.11062658578157425 2023-01-22 13:31:47.828598: step: 490/464, loss: 
0.3952050805091858 2023-01-22 13:31:48.526610: step: 492/464, loss: 0.03058406338095665 2023-01-22 13:31:49.214252: step: 494/464, loss: 0.08915737271308899 2023-01-22 13:31:49.949788: step: 496/464, loss: 0.11728838831186295 2023-01-22 13:31:50.718715: step: 498/464, loss: 0.21412897109985352 2023-01-22 13:31:51.484798: step: 500/464, loss: 0.09385736286640167 2023-01-22 13:31:52.235131: step: 502/464, loss: 0.0986119955778122 2023-01-22 13:31:52.969151: step: 504/464, loss: 0.1011638268828392 2023-01-22 13:31:53.661653: step: 506/464, loss: 0.050703346729278564 2023-01-22 13:31:54.380285: step: 508/464, loss: 0.8299161195755005 2023-01-22 13:31:55.017890: step: 510/464, loss: 0.10197915881872177 2023-01-22 13:31:55.714513: step: 512/464, loss: 0.050886351615190506 2023-01-22 13:31:56.424178: step: 514/464, loss: 0.0736493468284607 2023-01-22 13:31:57.130844: step: 516/464, loss: 0.32507458329200745 2023-01-22 13:31:57.881046: step: 518/464, loss: 0.1292927861213684 2023-01-22 13:31:58.601154: step: 520/464, loss: 0.11475580185651779 2023-01-22 13:31:59.279201: step: 522/464, loss: 0.07191069424152374 2023-01-22 13:32:00.021777: step: 524/464, loss: 0.21141907572746277 2023-01-22 13:32:00.673421: step: 526/464, loss: 0.12864668667316437 2023-01-22 13:32:01.390352: step: 528/464, loss: 0.0564606748521328 2023-01-22 13:32:02.124065: step: 530/464, loss: 0.10558000206947327 2023-01-22 13:32:02.890819: step: 532/464, loss: 0.10238667577505112 2023-01-22 13:32:03.548654: step: 534/464, loss: 0.13729903101921082 2023-01-22 13:32:04.249258: step: 536/464, loss: 0.5416274070739746 2023-01-22 13:32:05.011126: step: 538/464, loss: 0.11077231168746948 2023-01-22 13:32:05.791279: step: 540/464, loss: 0.10182888060808182 2023-01-22 13:32:06.569684: step: 542/464, loss: 0.08829410374164581 2023-01-22 13:32:07.270370: step: 544/464, loss: 0.20814473927021027 2023-01-22 13:32:08.048997: step: 546/464, loss: 0.10945319384336472 2023-01-22 13:32:08.774150: step: 548/464, loss: 
0.10658914595842361 2023-01-22 13:32:09.392758: step: 550/464, loss: 0.05725671350955963 2023-01-22 13:32:10.076916: step: 552/464, loss: 0.05665695294737816 2023-01-22 13:32:10.772952: step: 554/464, loss: 0.07857471704483032 2023-01-22 13:32:11.489841: step: 556/464, loss: 0.0755433514714241 2023-01-22 13:32:12.314343: step: 558/464, loss: 0.21889066696166992 2023-01-22 13:32:13.125628: step: 560/464, loss: 0.05423108488321304 2023-01-22 13:32:13.887008: step: 562/464, loss: 0.4585532248020172 2023-01-22 13:32:14.696330: step: 564/464, loss: 0.11373139917850494 2023-01-22 13:32:15.441112: step: 566/464, loss: 0.14729614555835724 2023-01-22 13:32:16.087569: step: 568/464, loss: 0.1796552836894989 2023-01-22 13:32:16.879548: step: 570/464, loss: 0.10746961086988449 2023-01-22 13:32:17.596516: step: 572/464, loss: 0.08348584920167923 2023-01-22 13:32:18.347037: step: 574/464, loss: 0.226358100771904 2023-01-22 13:32:19.116842: step: 576/464, loss: 0.08186423778533936 2023-01-22 13:32:19.811146: step: 578/464, loss: 0.3481625020503998 2023-01-22 13:32:20.542289: step: 580/464, loss: 0.15354043245315552 2023-01-22 13:32:21.181688: step: 582/464, loss: 0.07363526523113251 2023-01-22 13:32:21.846342: step: 584/464, loss: 0.15959565341472626 2023-01-22 13:32:22.590140: step: 586/464, loss: 0.11872921884059906 2023-01-22 13:32:23.333554: step: 588/464, loss: 0.24006997048854828 2023-01-22 13:32:24.031730: step: 590/464, loss: 0.08303096145391464 2023-01-22 13:32:24.801000: step: 592/464, loss: 0.07172245532274246 2023-01-22 13:32:25.601819: step: 594/464, loss: 0.06550987809896469 2023-01-22 13:32:26.320405: step: 596/464, loss: 0.15820249915122986 2023-01-22 13:32:27.049825: step: 598/464, loss: 0.1852339804172516 2023-01-22 13:32:27.727117: step: 600/464, loss: 0.10605773329734802 2023-01-22 13:32:28.514304: step: 602/464, loss: 0.12317892909049988 2023-01-22 13:32:29.132766: step: 604/464, loss: 0.4308888912200928 2023-01-22 13:32:29.898371: step: 606/464, loss: 
0.22137056291103363 2023-01-22 13:32:30.698912: step: 608/464, loss: 0.06388486176729202 2023-01-22 13:32:31.465139: step: 610/464, loss: 0.15812082588672638 2023-01-22 13:32:32.338218: step: 612/464, loss: 0.1343749761581421 2023-01-22 13:32:33.042967: step: 614/464, loss: 0.16867481172084808 2023-01-22 13:32:33.781776: step: 616/464, loss: 0.34840089082717896 2023-01-22 13:32:34.532325: step: 618/464, loss: 0.08506720513105392 2023-01-22 13:32:35.193646: step: 620/464, loss: 0.12645494937896729 2023-01-22 13:32:35.881462: step: 622/464, loss: 0.03901791200041771 2023-01-22 13:32:36.666017: step: 624/464, loss: 0.036129169166088104 2023-01-22 13:32:37.375889: step: 626/464, loss: 0.05090763792395592 2023-01-22 13:32:38.169198: step: 628/464, loss: 0.1342857927083969 2023-01-22 13:32:38.851566: step: 630/464, loss: 0.1088811606168747 2023-01-22 13:32:39.511678: step: 632/464, loss: 0.023789554834365845 2023-01-22 13:32:40.190298: step: 634/464, loss: 0.12072187662124634 2023-01-22 13:32:41.031463: step: 636/464, loss: 0.0999072790145874 2023-01-22 13:32:41.759052: step: 638/464, loss: 0.0696912482380867 2023-01-22 13:32:42.499832: step: 640/464, loss: 0.06875769048929214 2023-01-22 13:32:43.300346: step: 642/464, loss: 0.10171730816364288 2023-01-22 13:32:43.999797: step: 644/464, loss: 0.03848240152001381 2023-01-22 13:32:44.755233: step: 646/464, loss: 0.08445584028959274 2023-01-22 13:32:45.430736: step: 648/464, loss: 2.2473201751708984 2023-01-22 13:32:46.160746: step: 650/464, loss: 0.08144298195838928 2023-01-22 13:32:46.822330: step: 652/464, loss: 0.06221764162182808 2023-01-22 13:32:47.551376: step: 654/464, loss: 0.12438201904296875 2023-01-22 13:32:48.301544: step: 656/464, loss: 0.03517334535717964 2023-01-22 13:32:49.029643: step: 658/464, loss: 0.023519620299339294 2023-01-22 13:32:49.803328: step: 660/464, loss: 0.16124774515628815 2023-01-22 13:32:50.599991: step: 662/464, loss: 0.15201835334300995 2023-01-22 13:32:51.376507: step: 664/464, loss: 
0.9340047240257263 2023-01-22 13:32:52.086113: step: 666/464, loss: 0.20893046259880066 2023-01-22 13:32:52.772234: step: 668/464, loss: 0.03892991691827774 2023-01-22 13:32:53.502106: step: 670/464, loss: 0.09852541983127594 2023-01-22 13:32:54.209845: step: 672/464, loss: 0.14694903790950775 2023-01-22 13:32:54.893558: step: 674/464, loss: 0.03230362758040428 2023-01-22 13:32:55.547513: step: 676/464, loss: 0.0965629294514656 2023-01-22 13:32:56.260258: step: 678/464, loss: 0.11053648591041565 2023-01-22 13:32:57.069941: step: 680/464, loss: 0.0767754390835762 2023-01-22 13:32:57.786971: step: 682/464, loss: 0.08364859968423843 2023-01-22 13:32:58.565051: step: 684/464, loss: 0.6616068482398987 2023-01-22 13:32:59.222846: step: 686/464, loss: 0.13472962379455566 2023-01-22 13:32:59.983549: step: 688/464, loss: 0.07598034292459488 2023-01-22 13:33:00.816806: step: 690/464, loss: 0.030683374032378197 2023-01-22 13:33:01.536071: step: 692/464, loss: 0.09763442724943161 2023-01-22 13:33:02.251675: step: 694/464, loss: 0.0856194719672203 2023-01-22 13:33:03.026025: step: 696/464, loss: 0.10150092095136642 2023-01-22 13:33:03.845844: step: 698/464, loss: 0.04385127127170563 2023-01-22 13:33:04.735489: step: 700/464, loss: 0.07088147103786469 2023-01-22 13:33:05.417496: step: 702/464, loss: 0.0784793421626091 2023-01-22 13:33:06.240443: step: 704/464, loss: 0.07931099832057953 2023-01-22 13:33:07.042700: step: 706/464, loss: 0.4231050908565521 2023-01-22 13:33:07.739747: step: 708/464, loss: 0.2891331911087036 2023-01-22 13:33:08.474695: step: 710/464, loss: 0.25972074270248413 2023-01-22 13:33:09.214260: step: 712/464, loss: 0.0433255136013031 2023-01-22 13:33:09.962838: step: 714/464, loss: 0.18018367886543274 2023-01-22 13:33:10.665319: step: 716/464, loss: 0.08854905515909195 2023-01-22 13:33:11.438514: step: 718/464, loss: 0.08068551868200302 2023-01-22 13:33:12.203821: step: 720/464, loss: 0.040290337055921555 2023-01-22 13:33:12.931872: step: 722/464, loss: 
0.09700462222099304 2023-01-22 13:33:13.684737: step: 724/464, loss: 0.07090365886688232 2023-01-22 13:33:14.479526: step: 726/464, loss: 0.27016595005989075 2023-01-22 13:33:15.118286: step: 728/464, loss: 0.20455826818943024 2023-01-22 13:33:15.826335: step: 730/464, loss: 0.09726991504430771 2023-01-22 13:33:16.518068: step: 732/464, loss: 0.23708124458789825 2023-01-22 13:33:17.214274: step: 734/464, loss: 0.2544391453266144 2023-01-22 13:33:17.904075: step: 736/464, loss: 0.059123240411281586 2023-01-22 13:33:18.608654: step: 738/464, loss: 0.25635281205177307 2023-01-22 13:33:19.298819: step: 740/464, loss: 0.07739058881998062 2023-01-22 13:33:20.018510: step: 742/464, loss: 0.0810338705778122 2023-01-22 13:33:20.656070: step: 744/464, loss: 0.061010394245386124 2023-01-22 13:33:21.356075: step: 746/464, loss: 0.05705961585044861 2023-01-22 13:33:22.090529: step: 748/464, loss: 0.22858786582946777 2023-01-22 13:33:22.858256: step: 750/464, loss: 0.09986155480146408 2023-01-22 13:33:23.576492: step: 752/464, loss: 0.1608353704214096 2023-01-22 13:33:24.198709: step: 754/464, loss: 0.019120080396533012 2023-01-22 13:33:24.939723: step: 756/464, loss: 0.06734953075647354 2023-01-22 13:33:25.626956: step: 758/464, loss: 0.03486610949039459 2023-01-22 13:33:26.350681: step: 760/464, loss: 0.13575465977191925 2023-01-22 13:33:27.124312: step: 762/464, loss: 0.1357676386833191 2023-01-22 13:33:27.874674: step: 764/464, loss: 0.051089994609355927 2023-01-22 13:33:28.627634: step: 766/464, loss: 0.04774364084005356 2023-01-22 13:33:29.318778: step: 768/464, loss: 0.06380462646484375 2023-01-22 13:33:30.095695: step: 770/464, loss: 0.373801589012146 2023-01-22 13:33:30.866153: step: 772/464, loss: 0.17129270732402802 2023-01-22 13:33:31.644410: step: 774/464, loss: 0.1697038859128952 2023-01-22 13:33:32.419960: step: 776/464, loss: 0.1404598206281662 2023-01-22 13:33:33.162186: step: 778/464, loss: 0.03730373829603195 2023-01-22 13:33:33.967042: step: 780/464, loss: 
0.05492500960826874 2023-01-22 13:33:34.766286: step: 782/464, loss: 0.04647694155573845 2023-01-22 13:33:35.477171: step: 784/464, loss: 0.046614717692136765 2023-01-22 13:33:36.189944: step: 786/464, loss: 0.11796533316373825 2023-01-22 13:33:37.020730: step: 788/464, loss: 0.27363863587379456 2023-01-22 13:33:37.722063: step: 790/464, loss: 0.0794137716293335 2023-01-22 13:33:38.457866: step: 792/464, loss: 0.314433753490448 2023-01-22 13:33:39.140118: step: 794/464, loss: 0.2909509837627411 2023-01-22 13:33:39.950981: step: 796/464, loss: 0.7796887159347534 2023-01-22 13:33:40.664749: step: 798/464, loss: 0.06901483237743378 2023-01-22 13:33:41.383644: step: 800/464, loss: 0.05117961764335632 2023-01-22 13:33:42.174535: step: 802/464, loss: 0.07418478280305862 2023-01-22 13:33:42.972455: step: 804/464, loss: 0.21300861239433289 2023-01-22 13:33:43.714963: step: 806/464, loss: 1.5926798582077026 2023-01-22 13:33:44.420450: step: 808/464, loss: 0.04421302676200867 2023-01-22 13:33:45.094365: step: 810/464, loss: 0.037948526442050934 2023-01-22 13:33:45.781050: step: 812/464, loss: 0.07859799265861511 2023-01-22 13:33:46.522117: step: 814/464, loss: 0.07585307210683823 2023-01-22 13:33:47.321205: step: 816/464, loss: 0.1594003438949585 2023-01-22 13:33:48.027945: step: 818/464, loss: 0.13735361397266388 2023-01-22 13:33:48.771356: step: 820/464, loss: 0.10319264233112335 2023-01-22 13:33:49.536019: step: 822/464, loss: 0.07057742774486542 2023-01-22 13:33:50.303432: step: 824/464, loss: 0.3316095173358917 2023-01-22 13:33:51.001654: step: 826/464, loss: 0.24340875446796417 2023-01-22 13:33:51.684284: step: 828/464, loss: 0.21709150075912476 2023-01-22 13:33:52.437782: step: 830/464, loss: 0.2157144844532013 2023-01-22 13:33:53.177146: step: 832/464, loss: 0.06577938795089722 2023-01-22 13:33:53.945406: step: 834/464, loss: 0.10622964054346085 2023-01-22 13:33:54.650123: step: 836/464, loss: 0.10273239016532898 2023-01-22 13:33:55.357029: step: 838/464, loss: 
0.5213263630867004 2023-01-22 13:33:56.194232: step: 840/464, loss: 0.11066462844610214 2023-01-22 13:33:57.014015: step: 842/464, loss: 0.06470531970262527 2023-01-22 13:33:57.806412: step: 844/464, loss: 0.29106324911117554 2023-01-22 13:33:58.483534: step: 846/464, loss: 0.12173037230968475 2023-01-22 13:33:59.210464: step: 848/464, loss: 0.04491506889462471 2023-01-22 13:33:59.925103: step: 850/464, loss: 0.08093959838151932 2023-01-22 13:34:00.590357: step: 852/464, loss: 0.4785907566547394 2023-01-22 13:34:01.308138: step: 854/464, loss: 0.39316803216934204 2023-01-22 13:34:02.008337: step: 856/464, loss: 0.10662375390529633 2023-01-22 13:34:02.750423: step: 858/464, loss: 0.1058470830321312 2023-01-22 13:34:03.409038: step: 860/464, loss: 0.1215873584151268 2023-01-22 13:34:04.150232: step: 862/464, loss: 0.1317547708749771 2023-01-22 13:34:05.103808: step: 864/464, loss: 0.0965326651930809 2023-01-22 13:34:05.936748: step: 866/464, loss: 0.2264188826084137 2023-01-22 13:34:06.795368: step: 868/464, loss: 0.17923258244991302 2023-01-22 13:34:07.497818: step: 870/464, loss: 0.08647506684064865 2023-01-22 13:34:08.242395: step: 872/464, loss: 0.03950796648859978 2023-01-22 13:34:08.996833: step: 874/464, loss: 0.3292936086654663 2023-01-22 13:34:09.804933: step: 876/464, loss: 0.15255765616893768 2023-01-22 13:34:10.473526: step: 878/464, loss: 0.08292162418365479 2023-01-22 13:34:11.225833: step: 880/464, loss: 0.15378907322883606 2023-01-22 13:34:12.030543: step: 882/464, loss: 0.07309963554143906 2023-01-22 13:34:12.767807: step: 884/464, loss: 0.028615491464734077 2023-01-22 13:34:13.527723: step: 886/464, loss: 0.09923788160085678 2023-01-22 13:34:14.281426: step: 888/464, loss: 0.19777190685272217 2023-01-22 13:34:15.007656: step: 890/464, loss: 0.14272283017635345 2023-01-22 13:34:15.824316: step: 892/464, loss: 0.06808196753263474 2023-01-22 13:34:16.532564: step: 894/464, loss: 0.08321517705917358 2023-01-22 13:34:17.219368: step: 896/464, loss: 
0.059523243457078934 2023-01-22 13:34:17.961649: step: 898/464, loss: 0.18098331987857819 2023-01-22 13:34:18.682697: step: 900/464, loss: 0.07278680801391602 2023-01-22 13:34:19.409604: step: 902/464, loss: 0.1367041915655136 2023-01-22 13:34:20.162464: step: 904/464, loss: 0.1275164633989334 2023-01-22 13:34:20.841477: step: 906/464, loss: 0.05483197793364525 2023-01-22 13:34:21.549664: step: 908/464, loss: 0.22534726560115814 2023-01-22 13:34:22.272766: step: 910/464, loss: 0.1497402787208557 2023-01-22 13:34:23.024483: step: 912/464, loss: 0.11753479391336441 2023-01-22 13:34:23.723283: step: 914/464, loss: 0.4801797866821289 2023-01-22 13:34:24.532214: step: 916/464, loss: 0.1518942415714264 2023-01-22 13:34:25.223202: step: 918/464, loss: 0.1326410174369812 2023-01-22 13:34:25.968307: step: 920/464, loss: 0.5608806014060974 2023-01-22 13:34:26.733465: step: 922/464, loss: 4.899883270263672 2023-01-22 13:34:27.457437: step: 924/464, loss: 0.1643269956111908 2023-01-22 13:34:28.195160: step: 926/464, loss: 0.07229573279619217 2023-01-22 13:34:29.004294: step: 928/464, loss: 0.0629381537437439 2023-01-22 13:34:29.760977: step: 930/464, loss: 0.11444586515426636
==================================================
Loss: 0.233
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28497951516568537, 'r': 0.3304031950023411, 'f1': 0.30601490995823155}, 'combined': 0.22548467049553902, 'epoch': 15}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3018951838274165, 'r': 0.28696466039381524, 'f1': 0.2942406406269241}, 'combined': 0.18273892417882656, 'epoch': 15}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27389260699991835, 'r': 0.3305421215406984, 'f1': 0.2995626793670646}, 'combined': 0.2207303953231002, 'epoch': 15}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2873582368661075, 'r': 0.27969156453865207, 'f1': 0.2834730729224996}, 'combined': 0.17605169792028924, 'epoch': 15}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28292839051135177, 'r': 0.329098867900301, 'f1': 0.30427211119905023}, 'combined': 0.22420050298877384, 'epoch': 15}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30708218570626866, 'r': 0.28980312886001264, 'f1': 0.2981925541241166}, 'combined': 0.18519327045603032, 'epoch': 15}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.20535714285714285, 'r': 0.32857142857142857, 'f1': 0.25274725274725274}, 'combined': 0.16849816849816848, 'epoch': 15}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.22, 'r': 0.4782608695652174, 'f1': 0.30136986301369867}, 'combined': 0.15068493150684933, 'epoch': 15}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.390625, 'r': 0.3232758620689655, 'f1': 0.3537735849056604}, 'combined': 0.2358490566037736, 'epoch': 15}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 16
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:37:08.149637: step: 2/464, loss: 0.14209577441215515 2023-01-22 13:37:08.866794: step: 4/464, loss: 0.08315715938806534 2023-01-22 13:37:09.545379: step: 6/464, loss: 0.1677294224500656 2023-01-22 13:37:10.259544: step: 8/464, loss: 0.06222666800022125 2023-01-22 13:37:11.051574: step: 10/464, loss: 0.12703558802604675 2023-01-22 13:37:11.868861: step: 12/464, loss: 0.14261215925216675 2023-01-22 13:37:12.586787: step:
14/464, loss: 0.18909241259098053 2023-01-22 13:37:13.415140: step: 16/464, loss: 0.10527751594781876 2023-01-22 13:37:14.140781: step: 18/464, loss: 0.09355314821004868 2023-01-22 13:37:14.872134: step: 20/464, loss: 0.10597775876522064 2023-01-22 13:37:15.592039: step: 22/464, loss: 0.027014456689357758 2023-01-22 13:37:16.327838: step: 24/464, loss: 0.18352515995502472 2023-01-22 13:37:17.140530: step: 26/464, loss: 0.12560158967971802 2023-01-22 13:37:17.826535: step: 28/464, loss: 0.029618391767144203 2023-01-22 13:37:18.555238: step: 30/464, loss: 0.04836886003613472 2023-01-22 13:37:19.251850: step: 32/464, loss: 0.10874973237514496 2023-01-22 13:37:19.929825: step: 34/464, loss: 0.05077312886714935 2023-01-22 13:37:20.625709: step: 36/464, loss: 0.2322484850883484 2023-01-22 13:37:21.285946: step: 38/464, loss: 0.06784657388925552 2023-01-22 13:37:22.008272: step: 40/464, loss: 0.09843548387289047 2023-01-22 13:37:22.747448: step: 42/464, loss: 0.07803135365247726 2023-01-22 13:37:23.539598: step: 44/464, loss: 0.06171514838933945 2023-01-22 13:37:24.248943: step: 46/464, loss: 0.046159062534570694 2023-01-22 13:37:25.037804: step: 48/464, loss: 0.05216488987207413 2023-01-22 13:37:25.746896: step: 50/464, loss: 0.02905622124671936 2023-01-22 13:37:26.508240: step: 52/464, loss: 0.06213120371103287 2023-01-22 13:37:27.263488: step: 54/464, loss: 0.12204937636852264 2023-01-22 13:37:28.078429: step: 56/464, loss: 0.06039648875594139 2023-01-22 13:37:28.835725: step: 58/464, loss: 0.04854537174105644 2023-01-22 13:37:29.545742: step: 60/464, loss: 0.0071175843477249146 2023-01-22 13:37:30.227684: step: 62/464, loss: 0.09326845407485962 2023-01-22 13:37:30.908639: step: 64/464, loss: 0.0703241154551506 2023-01-22 13:37:31.636100: step: 66/464, loss: 0.0811472162604332 2023-01-22 13:37:32.358191: step: 68/464, loss: 0.5887887477874756 2023-01-22 13:37:33.153736: step: 70/464, loss: 0.19554060697555542 2023-01-22 13:37:33.945042: step: 72/464, loss: 
0.13443249464035034 2023-01-22 13:37:34.716566: step: 74/464, loss: 0.054412368685007095 2023-01-22 13:37:35.510926: step: 76/464, loss: 0.4430738389492035 2023-01-22 13:37:36.357624: step: 78/464, loss: 0.07522349804639816 2023-01-22 13:37:37.103592: step: 80/464, loss: 0.07084693759679794 2023-01-22 13:37:37.867067: step: 82/464, loss: 0.06538707762956619 2023-01-22 13:37:38.596568: step: 84/464, loss: 0.0802653431892395 2023-01-22 13:37:39.310544: step: 86/464, loss: 0.008309608325362206 2023-01-22 13:37:40.129463: step: 88/464, loss: 0.14886318147182465 2023-01-22 13:37:40.821832: step: 90/464, loss: 0.070876345038414 2023-01-22 13:37:41.520580: step: 92/464, loss: 0.054328061640262604 2023-01-22 13:37:42.338807: step: 94/464, loss: 0.05771628022193909 2023-01-22 13:37:43.030509: step: 96/464, loss: 0.05725085735321045 2023-01-22 13:37:43.739950: step: 98/464, loss: 0.12023055553436279 2023-01-22 13:37:44.343314: step: 100/464, loss: 0.07223106175661087 2023-01-22 13:37:45.107656: step: 102/464, loss: 0.7687522768974304 2023-01-22 13:37:45.819816: step: 104/464, loss: 0.11119861900806427 2023-01-22 13:37:46.608385: step: 106/464, loss: 0.4024055302143097 2023-01-22 13:37:47.338376: step: 108/464, loss: 0.06760146468877792 2023-01-22 13:37:48.058749: step: 110/464, loss: 0.31549784541130066 2023-01-22 13:37:48.735693: step: 112/464, loss: 0.0992489904165268 2023-01-22 13:37:49.478329: step: 114/464, loss: 0.0670260563492775 2023-01-22 13:37:50.208540: step: 116/464, loss: 0.0990232601761818 2023-01-22 13:37:50.963784: step: 118/464, loss: 0.104999840259552 2023-01-22 13:37:51.652077: step: 120/464, loss: 0.08298064768314362 2023-01-22 13:37:52.389025: step: 122/464, loss: 0.0959516391158104 2023-01-22 13:37:53.142003: step: 124/464, loss: 0.05884644761681557 2023-01-22 13:37:53.857752: step: 126/464, loss: 0.13606123626232147 2023-01-22 13:37:54.602222: step: 128/464, loss: 0.10948154330253601 2023-01-22 13:37:55.351606: step: 130/464, loss: 0.31888434290885925 
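The run-together step records above all share one shape, `timestamp: step: k/464, loss: value`. A small helper for pulling step/loss pairs back out of such text can be sketched as follows; this is an illustration for working with the log, not code from `train.py`:

```python
import re

# Each training record in this log has the shape
# "YYYY-MM-DD HH:MM:SS.ffffff: step: k/N, loss: v".
RECORD = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step: (?P<step>\d+)/(?P<total>\d+), loss: (?P<loss>[\d.]+)"
)

def parse_records(text):
    """Yield (step, loss) pairs from a blob of run-together log text."""
    for m in RECORD.finditer(text):
        yield int(m.group("step")), float(m.group("loss"))

blob = "2023-01-22 13:37:08.149637: step: 2/464, loss: 0.14209577441215515"
print(list(parse_records(blob)))  # [(2, 0.14209577441215515)]
```

Feeding a whole epoch's worth of wrapped lines through `parse_records` gives the loss curve without hand-splitting the records.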
2023-01-22 13:37:56.107175: step: 132/464, loss: 0.059614237397909164 2023-01-22 13:37:56.791794: step: 134/464, loss: 0.2272975742816925 2023-01-22 13:37:57.565108: step: 136/464, loss: 0.05067453533411026 2023-01-22 13:37:58.323017: step: 138/464, loss: 0.0519007109105587 2023-01-22 13:37:59.032220: step: 140/464, loss: 0.08054037392139435 2023-01-22 13:37:59.742104: step: 142/464, loss: 0.067112997174263 2023-01-22 13:38:00.552361: step: 144/464, loss: 0.02732996642589569 2023-01-22 13:38:01.363665: step: 146/464, loss: 0.18856127560138702 2023-01-22 13:38:02.017793: step: 148/464, loss: 0.026846928521990776 2023-01-22 13:38:02.711626: step: 150/464, loss: 0.056829262524843216 2023-01-22 13:38:03.452236: step: 152/464, loss: 0.09267004579305649 2023-01-22 13:38:04.271875: step: 154/464, loss: 0.02071995846927166 2023-01-22 13:38:05.013486: step: 156/464, loss: 0.04269138723611832 2023-01-22 13:38:05.815971: step: 158/464, loss: 0.11235304921865463 2023-01-22 13:38:06.510856: step: 160/464, loss: 0.09503141790628433 2023-01-22 13:38:07.212798: step: 162/464, loss: 0.12102089822292328 2023-01-22 13:38:07.892693: step: 164/464, loss: 0.11403682827949524 2023-01-22 13:38:08.627892: step: 166/464, loss: 0.25968801975250244 2023-01-22 13:38:09.279152: step: 168/464, loss: 0.1058613508939743 2023-01-22 13:38:10.070317: step: 170/464, loss: 0.22510240972042084 2023-01-22 13:38:10.880470: step: 172/464, loss: 0.05563351884484291 2023-01-22 13:38:11.578705: step: 174/464, loss: 0.03602204844355583 2023-01-22 13:38:12.399844: step: 176/464, loss: 0.03502077981829643 2023-01-22 13:38:13.158549: step: 178/464, loss: 0.025843489915132523 2023-01-22 13:38:13.906640: step: 180/464, loss: 0.0642140656709671 2023-01-22 13:38:14.742689: step: 182/464, loss: 0.8990870714187622 2023-01-22 13:38:15.528923: step: 184/464, loss: 0.04531247913837433 2023-01-22 13:38:16.245593: step: 186/464, loss: 0.0421457402408123 2023-01-22 13:38:16.936185: step: 188/464, loss: 0.05311638116836548 
2023-01-22 13:38:17.668435: step: 190/464, loss: 0.08210153877735138 2023-01-22 13:38:18.385722: step: 192/464, loss: 0.07027444243431091 2023-01-22 13:38:19.095015: step: 194/464, loss: 0.0335502065718174 2023-01-22 13:38:19.797284: step: 196/464, loss: 0.06261256337165833 2023-01-22 13:38:20.522704: step: 198/464, loss: 0.12928935885429382 2023-01-22 13:38:21.244191: step: 200/464, loss: 0.03679617494344711 2023-01-22 13:38:21.977006: step: 202/464, loss: 1.5623284578323364 2023-01-22 13:38:22.678699: step: 204/464, loss: 0.04236576333642006 2023-01-22 13:38:23.390769: step: 206/464, loss: 2.797375440597534 2023-01-22 13:38:24.092217: step: 208/464, loss: 0.15503378212451935 2023-01-22 13:38:24.771780: step: 210/464, loss: 0.46465569734573364 2023-01-22 13:38:25.495764: step: 212/464, loss: 0.16911257803440094 2023-01-22 13:38:26.294480: step: 214/464, loss: 0.05166523531079292 2023-01-22 13:38:26.962524: step: 216/464, loss: 0.02335522696375847 2023-01-22 13:38:27.604591: step: 218/464, loss: 0.05577556788921356 2023-01-22 13:38:28.351612: step: 220/464, loss: 0.05349784716963768 2023-01-22 13:38:29.128314: step: 222/464, loss: 0.7577762603759766 2023-01-22 13:38:29.861637: step: 224/464, loss: 0.067133329808712 2023-01-22 13:38:30.646024: step: 226/464, loss: 0.0397614948451519 2023-01-22 13:38:31.420756: step: 228/464, loss: 0.0500374361872673 2023-01-22 13:38:32.151373: step: 230/464, loss: 0.12101941555738449 2023-01-22 13:38:32.836146: step: 232/464, loss: 0.03364633768796921 2023-01-22 13:38:33.531406: step: 234/464, loss: 0.029257718473672867 2023-01-22 13:38:34.270362: step: 236/464, loss: 0.09055133163928986 2023-01-22 13:38:34.967961: step: 238/464, loss: 0.7606881260871887 2023-01-22 13:38:35.769627: step: 240/464, loss: 0.05093425512313843 2023-01-22 13:38:36.504301: step: 242/464, loss: 0.09483727812767029 2023-01-22 13:38:37.167164: step: 244/464, loss: 0.09020978957414627 2023-01-22 13:38:37.843111: step: 246/464, loss: 0.07504907995462418 
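The evaluation summaries interleaved in this log report template, slot, and combined scores per language. The logged numbers are consistent with `combined` being the product of the two F1 values (e.g. 0.7368421… × 0.3206606… ≈ 0.2362762…). A minimal sketch under that assumption; the actual scorer used by `train.py` may differ:

```python
def f1(p, r):
    # Standard F1: harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1, slot_f1):
    # Assumed relationship, inferred from the logged numbers:
    # every 'combined' value equals template F1 * slot F1.
    return template_f1 * slot_f1

# 'Dev for Korean' best-result entry (epoch 12) from this log:
template = f1(1.0, 0.5833333333333334)             # ~0.7368421052631579
slot = f1(0.2957374656593406, 0.3501711168338303)  # ~0.3206606056844979
print(combined_score(template, slot))              # ~0.23627623576752474
```

The same product reproduces the other `combined` values in the log (e.g. Test Korean: 0.6210526… × 0.2945603… ≈ 0.1829374…), which supports the inferred formula.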
2023-01-22 13:38:38.592038: step: 248/464, loss: 0.1254086196422577 2023-01-22 13:38:39.341645: step: 250/464, loss: 0.1057504266500473 2023-01-22 13:38:40.109768: step: 252/464, loss: 0.16019676625728607 2023-01-22 13:38:40.833776: step: 254/464, loss: 0.033384401351213455 2023-01-22 13:38:41.660306: step: 256/464, loss: 0.022989513352513313 2023-01-22 13:38:42.418280: step: 258/464, loss: 0.03476962074637413 2023-01-22 13:38:43.147873: step: 260/464, loss: 0.032892998307943344 2023-01-22 13:38:43.841171: step: 262/464, loss: 0.03973241522908211 2023-01-22 13:38:44.631390: step: 264/464, loss: 0.0409555658698082 2023-01-22 13:38:45.355560: step: 266/464, loss: 0.6431903839111328 2023-01-22 13:38:46.085756: step: 268/464, loss: 0.10075043886899948 2023-01-22 13:38:46.805843: step: 270/464, loss: 0.12035145610570908 2023-01-22 13:38:47.544616: step: 272/464, loss: 0.008856400847434998 2023-01-22 13:38:48.335578: step: 274/464, loss: 0.1363389790058136 2023-01-22 13:38:49.047898: step: 276/464, loss: 0.01673283614218235 2023-01-22 13:38:49.726755: step: 278/464, loss: 0.15913592278957367 2023-01-22 13:38:50.457054: step: 280/464, loss: 0.05854547768831253 2023-01-22 13:38:51.303426: step: 282/464, loss: 0.37333014607429504 2023-01-22 13:38:52.103186: step: 284/464, loss: 0.10043874382972717 2023-01-22 13:38:52.809088: step: 286/464, loss: 0.043367329984903336 2023-01-22 13:38:53.527019: step: 288/464, loss: 0.08529473096132278 2023-01-22 13:38:54.263772: step: 290/464, loss: 0.10794027149677277 2023-01-22 13:38:55.018429: step: 292/464, loss: 0.1581147313117981 2023-01-22 13:38:55.905336: step: 294/464, loss: 0.06559644639492035 2023-01-22 13:38:56.698845: step: 296/464, loss: 0.2656407952308655 2023-01-22 13:38:57.540962: step: 298/464, loss: 0.12226749956607819 2023-01-22 13:38:58.286675: step: 300/464, loss: 0.03235700726509094 2023-01-22 13:38:58.985602: step: 302/464, loss: 0.07469668239355087 2023-01-22 13:38:59.720021: step: 304/464, loss: 0.019965143874287605 
2023-01-22 13:39:00.379189: step: 306/464, loss: 0.05402126908302307 2023-01-22 13:39:01.124825: step: 308/464, loss: 0.11711370199918747 2023-01-22 13:39:01.864394: step: 310/464, loss: 0.06864065676927567 2023-01-22 13:39:02.563477: step: 312/464, loss: 0.2924725115299225 2023-01-22 13:39:03.307256: step: 314/464, loss: 0.14239171147346497 2023-01-22 13:39:04.032017: step: 316/464, loss: 0.20074528455734253 2023-01-22 13:39:04.713784: step: 318/464, loss: 0.05544080212712288 2023-01-22 13:39:05.422402: step: 320/464, loss: 0.11986785382032394 2023-01-22 13:39:06.149965: step: 322/464, loss: 0.26871606707572937 2023-01-22 13:39:06.847468: step: 324/464, loss: 0.183811217546463 2023-01-22 13:39:07.696048: step: 326/464, loss: 0.1335066854953766 2023-01-22 13:39:08.414414: step: 328/464, loss: 0.14025689661502838 2023-01-22 13:39:09.118348: step: 330/464, loss: 0.07599826157093048 2023-01-22 13:39:09.796644: step: 332/464, loss: 0.10522612184286118 2023-01-22 13:39:10.515612: step: 334/464, loss: 0.08387448638677597 2023-01-22 13:39:11.208639: step: 336/464, loss: 0.06999139487743378 2023-01-22 13:39:11.942435: step: 338/464, loss: 0.12560762465000153 2023-01-22 13:39:12.721014: step: 340/464, loss: 0.5550030469894409 2023-01-22 13:39:13.465133: step: 342/464, loss: 0.1845470666885376 2023-01-22 13:39:14.196272: step: 344/464, loss: 0.19275884330272675 2023-01-22 13:39:14.953685: step: 346/464, loss: 0.14432045817375183 2023-01-22 13:39:15.721543: step: 348/464, loss: 0.09600774198770523 2023-01-22 13:39:16.381143: step: 350/464, loss: 0.33380627632141113 2023-01-22 13:39:17.155404: step: 352/464, loss: 0.12676355242729187 2023-01-22 13:39:17.824662: step: 354/464, loss: 0.11150521785020828 2023-01-22 13:39:18.607979: step: 356/464, loss: 0.1399673968553543 2023-01-22 13:39:19.443685: step: 358/464, loss: 0.103044293820858 2023-01-22 13:39:20.088958: step: 360/464, loss: 0.5707974433898926 2023-01-22 13:39:20.902103: step: 362/464, loss: 0.0869031548500061 
2023-01-22 13:39:21.558609: step: 364/464, loss: 0.08653386682271957 2023-01-22 13:39:22.195368: step: 366/464, loss: 0.0813024491071701 2023-01-22 13:39:22.914592: step: 368/464, loss: 0.023970788344740868 2023-01-22 13:39:23.648586: step: 370/464, loss: 0.13009853661060333 2023-01-22 13:39:24.384470: step: 372/464, loss: 0.03696196526288986 2023-01-22 13:39:25.092942: step: 374/464, loss: 0.02843949757516384 2023-01-22 13:39:25.852247: step: 376/464, loss: 0.0992714986205101 2023-01-22 13:39:26.569804: step: 378/464, loss: 0.1639476865530014 2023-01-22 13:39:27.339821: step: 380/464, loss: 0.10682159662246704 2023-01-22 13:39:28.087682: step: 382/464, loss: 0.11330337077379227 2023-01-22 13:39:28.779277: step: 384/464, loss: 0.052299533039331436 2023-01-22 13:39:29.483208: step: 386/464, loss: 0.16930243372917175 2023-01-22 13:39:30.223597: step: 388/464, loss: 0.06845500320196152 2023-01-22 13:39:30.963659: step: 390/464, loss: 0.17916421592235565 2023-01-22 13:39:31.678859: step: 392/464, loss: 0.17624793946743011 2023-01-22 13:39:32.510814: step: 394/464, loss: 0.09612559527158737 2023-01-22 13:39:33.269657: step: 396/464, loss: 0.1727244257926941 2023-01-22 13:39:34.015011: step: 398/464, loss: 0.010704328306019306 2023-01-22 13:39:34.704005: step: 400/464, loss: 0.13919922709465027 2023-01-22 13:39:35.483745: step: 402/464, loss: 0.062354229390621185 2023-01-22 13:39:36.208935: step: 404/464, loss: 0.1840866208076477 2023-01-22 13:39:37.011659: step: 406/464, loss: 0.019472267478704453 2023-01-22 13:39:37.696876: step: 408/464, loss: 0.09749460965394974 2023-01-22 13:39:38.366388: step: 410/464, loss: 0.05314226448535919 2023-01-22 13:39:39.123415: step: 412/464, loss: 0.04876469820737839 2023-01-22 13:39:39.787705: step: 414/464, loss: 0.0391797199845314 2023-01-22 13:39:40.506073: step: 416/464, loss: 0.0012564189964905381 2023-01-22 13:39:41.183849: step: 418/464, loss: 0.04942185804247856 2023-01-22 13:39:41.922447: step: 420/464, loss: 
0.07895748317241669 2023-01-22 13:39:42.619732: step: 422/464, loss: 0.07031918317079544 2023-01-22 13:39:43.378645: step: 424/464, loss: 0.03398099169135094 2023-01-22 13:39:44.127392: step: 426/464, loss: 0.17639093101024628 2023-01-22 13:39:44.938712: step: 428/464, loss: 0.040956635028123856 2023-01-22 13:39:45.700294: step: 430/464, loss: 0.15845489501953125 2023-01-22 13:39:46.364537: step: 432/464, loss: 0.06485290080308914 2023-01-22 13:39:47.187789: step: 434/464, loss: 0.10166499018669128 2023-01-22 13:39:47.936993: step: 436/464, loss: 0.01902008429169655 2023-01-22 13:39:48.644946: step: 438/464, loss: 0.05797155201435089 2023-01-22 13:39:49.381701: step: 440/464, loss: 0.048199351876974106 2023-01-22 13:39:50.093725: step: 442/464, loss: 0.07311541587114334 2023-01-22 13:39:50.792991: step: 444/464, loss: 0.07391253858804703 2023-01-22 13:39:51.619004: step: 446/464, loss: 0.16486096382141113 2023-01-22 13:39:52.385493: step: 448/464, loss: 0.11881289631128311 2023-01-22 13:39:53.091702: step: 450/464, loss: 0.030278276652097702 2023-01-22 13:39:53.800549: step: 452/464, loss: 0.2739794850349426 2023-01-22 13:39:54.520149: step: 454/464, loss: 0.06841745972633362 2023-01-22 13:39:55.195190: step: 456/464, loss: 0.19904053211212158 2023-01-22 13:39:55.900117: step: 458/464, loss: 0.02035653218626976 2023-01-22 13:39:56.683799: step: 460/464, loss: 0.05959150567650795 2023-01-22 13:39:57.437287: step: 462/464, loss: 0.03956807032227516 2023-01-22 13:39:58.210003: step: 464/464, loss: 0.13315275311470032 2023-01-22 13:39:58.933814: step: 466/464, loss: 0.2876358926296234 2023-01-22 13:39:59.635037: step: 468/464, loss: 0.04231845214962959 2023-01-22 13:40:00.345848: step: 470/464, loss: 0.7945632338523865 2023-01-22 13:40:01.181154: step: 472/464, loss: 0.09187690913677216 2023-01-22 13:40:01.908423: step: 474/464, loss: 0.10587914288043976 2023-01-22 13:40:02.645147: step: 476/464, loss: 0.2761503756046295 2023-01-22 13:40:03.376112: step: 478/464, loss: 
0.07797357439994812 2023-01-22 13:40:04.142729: step: 480/464, loss: 0.05710531771183014 2023-01-22 13:40:04.878168: step: 482/464, loss: 0.18004289269447327 2023-01-22 13:40:05.573501: step: 484/464, loss: 0.08591750264167786 2023-01-22 13:40:06.268136: step: 486/464, loss: 0.09022261202335358 2023-01-22 13:40:06.951597: step: 488/464, loss: 0.10273952037096024 2023-01-22 13:40:07.649269: step: 490/464, loss: 0.0322842039167881 2023-01-22 13:40:08.385140: step: 492/464, loss: 0.0460544228553772 2023-01-22 13:40:09.110099: step: 494/464, loss: 0.04815031960606575 2023-01-22 13:40:09.939779: step: 496/464, loss: 0.18230149149894714 2023-01-22 13:40:10.670253: step: 498/464, loss: 0.055478718131780624 2023-01-22 13:40:11.335466: step: 500/464, loss: 0.035472407937049866 2023-01-22 13:40:12.116894: step: 502/464, loss: 0.054278500378131866 2023-01-22 13:40:12.832203: step: 504/464, loss: 0.9115884900093079 2023-01-22 13:40:13.565373: step: 506/464, loss: 0.14384281635284424 2023-01-22 13:40:14.345838: step: 508/464, loss: 0.1132027804851532 2023-01-22 13:40:15.081570: step: 510/464, loss: 0.03535445034503937 2023-01-22 13:40:15.855205: step: 512/464, loss: 0.10807029157876968 2023-01-22 13:40:16.534913: step: 514/464, loss: 0.03422875702381134 2023-01-22 13:40:17.280029: step: 516/464, loss: 0.3180900514125824 2023-01-22 13:40:18.036983: step: 518/464, loss: 0.19532497227191925 2023-01-22 13:40:18.785767: step: 520/464, loss: 0.1341649740934372 2023-01-22 13:40:19.469941: step: 522/464, loss: 0.4564950168132782 2023-01-22 13:40:20.263984: step: 524/464, loss: 0.13296714425086975 2023-01-22 13:40:21.036938: step: 526/464, loss: 0.13749663531780243 2023-01-22 13:40:21.739781: step: 528/464, loss: 0.2917221784591675 2023-01-22 13:40:22.441800: step: 530/464, loss: 0.06330596655607224 2023-01-22 13:40:23.183317: step: 532/464, loss: 0.32829830050468445 2023-01-22 13:40:23.922562: step: 534/464, loss: 0.07539442181587219 2023-01-22 13:40:24.625036: step: 536/464, loss: 
0.05646499991416931 2023-01-22 13:40:25.418180: step: 538/464, loss: 0.07317481189966202 2023-01-22 13:40:26.141620: step: 540/464, loss: 0.14793869853019714 2023-01-22 13:40:26.911452: step: 542/464, loss: 0.220576211810112 2023-01-22 13:40:27.674552: step: 544/464, loss: 0.09299348294734955 2023-01-22 13:40:28.487567: step: 546/464, loss: 0.13680137693881989 2023-01-22 13:40:29.156252: step: 548/464, loss: 0.11256013065576553 2023-01-22 13:40:29.840244: step: 550/464, loss: 0.06750702112913132 2023-01-22 13:40:30.643110: step: 552/464, loss: 0.08740395307540894 2023-01-22 13:40:31.351855: step: 554/464, loss: 0.02208864875137806 2023-01-22 13:40:32.145600: step: 556/464, loss: 0.14167027175426483 2023-01-22 13:40:32.844825: step: 558/464, loss: 0.039901308715343475 2023-01-22 13:40:33.514542: step: 560/464, loss: 0.18916989862918854 2023-01-22 13:40:34.341200: step: 562/464, loss: 0.1889684796333313 2023-01-22 13:40:35.113915: step: 564/464, loss: 0.019529862329363823 2023-01-22 13:40:35.819480: step: 566/464, loss: 0.07504037022590637 2023-01-22 13:40:36.547268: step: 568/464, loss: 0.02826717123389244 2023-01-22 13:40:37.264275: step: 570/464, loss: 0.15966065227985382 2023-01-22 13:40:37.963803: step: 572/464, loss: 0.052043791860342026 2023-01-22 13:40:38.723765: step: 574/464, loss: 0.2825374901294708 2023-01-22 13:40:39.426119: step: 576/464, loss: 0.17329668998718262 2023-01-22 13:40:40.157903: step: 578/464, loss: 0.15861612558364868 2023-01-22 13:40:40.945214: step: 580/464, loss: 0.1683676838874817 2023-01-22 13:40:41.720915: step: 582/464, loss: 0.1867954432964325 2023-01-22 13:40:42.442534: step: 584/464, loss: 0.04729166254401207 2023-01-22 13:40:43.141535: step: 586/464, loss: 0.049370501190423965 2023-01-22 13:40:43.867909: step: 588/464, loss: 0.12045831978321075 2023-01-22 13:40:44.569725: step: 590/464, loss: 0.029787318781018257 2023-01-22 13:40:45.383785: step: 592/464, loss: 0.605231761932373 2023-01-22 13:40:46.165743: step: 594/464, loss: 
0.09484227001667023 2023-01-22 13:40:46.865271: step: 596/464, loss: 1.07864248752594 2023-01-22 13:40:47.660601: step: 598/464, loss: 0.5804974436759949 2023-01-22 13:40:48.372228: step: 600/464, loss: 0.10189292579889297 2023-01-22 13:40:49.133742: step: 602/464, loss: 0.13452737033367157 2023-01-22 13:40:49.906191: step: 604/464, loss: 0.40827855467796326 2023-01-22 13:40:50.600174: step: 606/464, loss: 0.25222280621528625 2023-01-22 13:40:51.270941: step: 608/464, loss: 0.09318625926971436 2023-01-22 13:40:52.040086: step: 610/464, loss: 0.3048606514930725 2023-01-22 13:40:52.774526: step: 612/464, loss: 0.04482343792915344 2023-01-22 13:40:53.533309: step: 614/464, loss: 0.09011220932006836 2023-01-22 13:40:54.207097: step: 616/464, loss: 0.056642841547727585 2023-01-22 13:40:54.885502: step: 618/464, loss: 0.05928850173950195 2023-01-22 13:40:55.745176: step: 620/464, loss: 0.3149532079696655 2023-01-22 13:40:56.602992: step: 622/464, loss: 0.13324689865112305 2023-01-22 13:40:57.307761: step: 624/464, loss: 0.3803398311138153 2023-01-22 13:40:58.049914: step: 626/464, loss: 0.0937994047999382 2023-01-22 13:40:58.776102: step: 628/464, loss: 0.4643716812133789 2023-01-22 13:40:59.473034: step: 630/464, loss: 0.06096263229846954 2023-01-22 13:41:00.200401: step: 632/464, loss: 0.1206122562289238 2023-01-22 13:41:00.993554: step: 634/464, loss: 0.056166987866163254 2023-01-22 13:41:01.697284: step: 636/464, loss: 0.07867898792028427 2023-01-22 13:41:02.451421: step: 638/464, loss: 0.07757259160280228 2023-01-22 13:41:03.089791: step: 640/464, loss: 0.29400986433029175 2023-01-22 13:41:03.874512: step: 642/464, loss: 0.042422376573085785 2023-01-22 13:41:04.589501: step: 644/464, loss: 0.11885477602481842 2023-01-22 13:41:05.325888: step: 646/464, loss: 0.05155384540557861 2023-01-22 13:41:06.040312: step: 648/464, loss: 0.045021940022706985 2023-01-22 13:41:06.786829: step: 650/464, loss: 0.1434958279132843 2023-01-22 13:41:07.574323: step: 652/464, loss: 
0.10911697894334793 2023-01-22 13:41:08.244431: step: 654/464, loss: 0.13170289993286133 2023-01-22 13:41:09.018429: step: 656/464, loss: 0.10425911843776703 2023-01-22 13:41:09.783427: step: 658/464, loss: 0.11268703639507294 2023-01-22 13:41:10.483644: step: 660/464, loss: 0.16174089908599854 2023-01-22 13:41:11.254167: step: 662/464, loss: 0.12235087156295776 2023-01-22 13:41:12.033871: step: 664/464, loss: 0.0665929839015007 2023-01-22 13:41:12.757251: step: 666/464, loss: 0.0692693442106247 2023-01-22 13:41:13.413871: step: 668/464, loss: 0.04867265373468399 2023-01-22 13:41:14.215389: step: 670/464, loss: 0.055253803730010986 2023-01-22 13:41:14.855762: step: 672/464, loss: 0.03663313016295433 2023-01-22 13:41:15.607214: step: 674/464, loss: 0.08477963507175446 2023-01-22 13:41:16.332466: step: 676/464, loss: 0.1519874483346939 2023-01-22 13:41:17.013192: step: 678/464, loss: 0.09137319773435593 2023-01-22 13:41:17.748606: step: 680/464, loss: 0.14208124577999115 2023-01-22 13:41:18.605356: step: 682/464, loss: 0.05060447007417679 2023-01-22 13:41:19.415630: step: 684/464, loss: 0.12298265844583511 2023-01-22 13:41:20.231075: step: 686/464, loss: 0.05008777230978012 2023-01-22 13:41:20.938885: step: 688/464, loss: 0.07089913636445999 2023-01-22 13:41:21.667501: step: 690/464, loss: 0.03305942565202713 2023-01-22 13:41:22.320817: step: 692/464, loss: 0.03266128525137901 2023-01-22 13:41:23.010912: step: 694/464, loss: 0.045487675815820694 2023-01-22 13:41:23.736794: step: 696/464, loss: 0.1335483193397522 2023-01-22 13:41:24.413643: step: 698/464, loss: 0.15396928787231445 2023-01-22 13:41:25.141798: step: 700/464, loss: 0.6332160234451294 2023-01-22 13:41:25.924811: step: 702/464, loss: 0.03149405121803284 2023-01-22 13:41:26.714654: step: 704/464, loss: 0.2502283453941345 2023-01-22 13:41:27.404116: step: 706/464, loss: 0.0986698642373085 2023-01-22 13:41:28.167939: step: 708/464, loss: 0.09098249673843384 2023-01-22 13:41:28.908198: step: 710/464, loss: 
0.11281674355268478 2023-01-22 13:41:29.651102: step: 712/464, loss: 0.4118633270263672 2023-01-22 13:41:30.303664: step: 714/464, loss: 0.2575540840625763 2023-01-22 13:41:31.053010: step: 716/464, loss: 0.10508500039577484 2023-01-22 13:41:31.802262: step: 718/464, loss: 0.06450307369232178 2023-01-22 13:41:32.511162: step: 720/464, loss: 0.08441021293401718 2023-01-22 13:41:33.236603: step: 722/464, loss: 0.27181103825569153 2023-01-22 13:41:33.962201: step: 724/464, loss: 0.20128807425498962 2023-01-22 13:41:34.748729: step: 726/464, loss: 0.02161598950624466 2023-01-22 13:41:35.537507: step: 728/464, loss: 0.05868678167462349 2023-01-22 13:41:36.401956: step: 730/464, loss: 0.04888615012168884 2023-01-22 13:41:37.164547: step: 732/464, loss: 0.1488361805677414 2023-01-22 13:41:37.849565: step: 734/464, loss: 0.06514663249254227 2023-01-22 13:41:38.628746: step: 736/464, loss: 0.0822744145989418 2023-01-22 13:41:39.341122: step: 738/464, loss: 0.13314495980739594 2023-01-22 13:41:40.004342: step: 740/464, loss: 0.018895376473665237 2023-01-22 13:41:40.785271: step: 742/464, loss: 0.12033816426992416 2023-01-22 13:41:41.537224: step: 744/464, loss: 0.09358536452054977 2023-01-22 13:41:42.452234: step: 746/464, loss: 0.07539544254541397 2023-01-22 13:41:43.195140: step: 748/464, loss: 0.07023809105157852 2023-01-22 13:41:43.966605: step: 750/464, loss: 0.18539175391197205 2023-01-22 13:41:44.641639: step: 752/464, loss: 0.17972595989704132 2023-01-22 13:41:45.433991: step: 754/464, loss: 0.038272675126791 2023-01-22 13:41:46.255959: step: 756/464, loss: 0.0807376578450203 2023-01-22 13:41:46.940595: step: 758/464, loss: 0.04930340126156807 2023-01-22 13:41:47.655000: step: 760/464, loss: 0.02274372987449169 2023-01-22 13:41:48.325560: step: 762/464, loss: 0.0977998822927475 2023-01-22 13:41:49.080258: step: 764/464, loss: 0.9066311120986938 2023-01-22 13:41:49.746173: step: 766/464, loss: 0.2617078125476837 2023-01-22 13:41:50.551305: step: 768/464, loss: 
0.13507243990898132 2023-01-22 13:41:51.273765: step: 770/464, loss: 0.1343207210302353 2023-01-22 13:41:52.024056: step: 772/464, loss: 0.029202204197645187 2023-01-22 13:41:52.712458: step: 774/464, loss: 0.030908869579434395 2023-01-22 13:41:53.355078: step: 776/464, loss: 0.03610406443476677 2023-01-22 13:41:54.098990: step: 778/464, loss: 0.04053062945604324 2023-01-22 13:41:54.873826: step: 780/464, loss: 0.32335996627807617 2023-01-22 13:41:55.575880: step: 782/464, loss: 0.1535036861896515 2023-01-22 13:41:56.290068: step: 784/464, loss: 0.2088811993598938 2023-01-22 13:41:57.061732: step: 786/464, loss: 0.05972848832607269 2023-01-22 13:41:57.852362: step: 788/464, loss: 0.05589446425437927 2023-01-22 13:41:58.588338: step: 790/464, loss: 0.31635093688964844 2023-01-22 13:41:59.336497: step: 792/464, loss: 0.06617505848407745 2023-01-22 13:42:00.086644: step: 794/464, loss: 0.06160557270050049 2023-01-22 13:42:00.785203: step: 796/464, loss: 0.18889334797859192 2023-01-22 13:42:01.497187: step: 798/464, loss: 0.2467067390680313 2023-01-22 13:42:02.248215: step: 800/464, loss: 0.04508241266012192 2023-01-22 13:42:02.964140: step: 802/464, loss: 0.06874433159828186 2023-01-22 13:42:03.708804: step: 804/464, loss: 0.1067737489938736 2023-01-22 13:42:04.526868: step: 806/464, loss: 0.9908697605133057 2023-01-22 13:42:05.368509: step: 808/464, loss: 0.0754680186510086 2023-01-22 13:42:06.027976: step: 810/464, loss: 0.16445475816726685 2023-01-22 13:42:06.680785: step: 812/464, loss: 0.20411959290504456 2023-01-22 13:42:07.423409: step: 814/464, loss: 0.01780773140490055 2023-01-22 13:42:08.163868: step: 816/464, loss: 0.1358191967010498 2023-01-22 13:42:08.935263: step: 818/464, loss: 0.17191946506500244 2023-01-22 13:42:09.627644: step: 820/464, loss: 0.03713075816631317 2023-01-22 13:42:10.433983: step: 822/464, loss: 0.026605524122714996 2023-01-22 13:42:11.145970: step: 824/464, loss: 0.12163020670413971 2023-01-22 13:42:11.878131: step: 826/464, loss: 
0.3582850396633148 2023-01-22 13:42:12.571620: step: 828/464, loss: 0.13930034637451172 2023-01-22 13:42:13.296304: step: 830/464, loss: 1.0212076902389526 2023-01-22 13:42:14.127250: step: 832/464, loss: 0.07644716650247574 2023-01-22 13:42:14.848863: step: 834/464, loss: 0.4002009630203247 2023-01-22 13:42:15.604147: step: 836/464, loss: 0.11011244356632233 2023-01-22 13:42:16.353401: step: 838/464, loss: 0.03478245809674263 2023-01-22 13:42:17.076275: step: 840/464, loss: 0.12680082023143768 2023-01-22 13:42:17.777732: step: 842/464, loss: 0.069913849234581 2023-01-22 13:42:18.581507: step: 844/464, loss: 0.10052835196256638 2023-01-22 13:42:19.352357: step: 846/464, loss: 0.15281496942043304 2023-01-22 13:42:20.083739: step: 848/464, loss: 0.08110205829143524 2023-01-22 13:42:20.807677: step: 850/464, loss: 0.08946767449378967 2023-01-22 13:42:21.564568: step: 852/464, loss: 0.0644698217511177 2023-01-22 13:42:22.323671: step: 854/464, loss: 0.10672362148761749 2023-01-22 13:42:23.008587: step: 856/464, loss: 0.33436694741249084 2023-01-22 13:42:23.806618: step: 858/464, loss: 0.0706905946135521 2023-01-22 13:42:24.484888: step: 860/464, loss: 0.14880642294883728 2023-01-22 13:42:25.132617: step: 862/464, loss: 0.03052304871380329 2023-01-22 13:42:25.824230: step: 864/464, loss: 0.0998193621635437 2023-01-22 13:42:26.583399: step: 866/464, loss: 0.26408851146698 2023-01-22 13:42:27.339864: step: 868/464, loss: 0.3294277787208557 2023-01-22 13:42:28.071736: step: 870/464, loss: 0.06395434588193893 2023-01-22 13:42:28.817658: step: 872/464, loss: 0.0582747757434845 2023-01-22 13:42:29.633139: step: 874/464, loss: 0.17045952379703522 2023-01-22 13:42:30.385746: step: 876/464, loss: 0.28945544362068176 2023-01-22 13:42:31.077213: step: 878/464, loss: 0.05567263439297676 2023-01-22 13:42:31.844755: step: 880/464, loss: 0.4171179533004761 2023-01-22 13:42:32.578796: step: 882/464, loss: 0.05881846696138382 2023-01-22 13:42:33.259988: step: 884/464, loss: 
0.06386110186576843 2023-01-22 13:42:33.971178: step: 886/464, loss: 0.17204733192920685 2023-01-22 13:42:34.691237: step: 888/464, loss: 0.10010688006877899 2023-01-22 13:42:35.363100: step: 890/464, loss: 0.20999249815940857 2023-01-22 13:42:36.069345: step: 892/464, loss: 0.23485256731510162 2023-01-22 13:42:36.801856: step: 894/464, loss: 0.06420190632343292 2023-01-22 13:42:37.515891: step: 896/464, loss: 0.09333354979753494 2023-01-22 13:42:38.206175: step: 898/464, loss: 0.07883160561323166 2023-01-22 13:42:39.034479: step: 900/464, loss: 0.053643010556697845 2023-01-22 13:42:39.784682: step: 902/464, loss: 0.13103336095809937 2023-01-22 13:42:40.525025: step: 904/464, loss: 0.007707820273935795 2023-01-22 13:42:41.262322: step: 906/464, loss: 0.4912790060043335 2023-01-22 13:42:42.076087: step: 908/464, loss: 0.049322012811899185 2023-01-22 13:42:42.952548: step: 910/464, loss: 0.18857868015766144 2023-01-22 13:42:43.688198: step: 912/464, loss: 0.017859352752566338 2023-01-22 13:42:44.409435: step: 914/464, loss: 0.02454872988164425 2023-01-22 13:42:45.096800: step: 916/464, loss: 0.06313516944646835 2023-01-22 13:42:45.788611: step: 918/464, loss: 0.025236960500478745 2023-01-22 13:42:46.504868: step: 920/464, loss: 0.06392581760883331 2023-01-22 13:42:47.293844: step: 922/464, loss: 0.059632401913404465 2023-01-22 13:42:48.062966: step: 924/464, loss: 0.9361553192138672 2023-01-22 13:42:48.835893: step: 926/464, loss: 0.061501532793045044 2023-01-22 13:42:49.553610: step: 928/464, loss: 0.06427504122257233 2023-01-22 13:42:50.153650: step: 930/464, loss: 0.0780082419514656
==================================================
Loss: 0.147
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28953002842928216, 'r': 0.33128388452155055, 'f1': 0.3090028445006321}, 'combined': 0.22768630647414995, 'epoch': 16}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 
0.6210526315789474}, 'slot': {'p': 0.30218843828446346, 'r': 0.2980038308304748, 'f1': 0.30008154678248017}, 'combined': 0.18636643431754032, 'epoch': 16}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27995321909951243, 'r': 0.33679381576677586, 'f1': 0.30575424790541067}, 'combined': 0.22529260371977627, 'epoch': 16}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2882000952166545, 'r': 0.2930461897949761, 'f1': 0.2906029405421489}, 'combined': 0.18047972096828196, 'epoch': 16}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29120245466607697, 'r': 0.32877696494557074, 'f1': 0.3088510882822028}, 'combined': 0.22757448610267575, 'epoch': 16}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3086048341149748, 'r': 0.3031212339036525, 'f1': 0.3058384561199203}, 'combined': 0.18994177801131892, 'epoch': 16}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27325581395348836, 'r': 0.3357142857142857, 'f1': 0.30128205128205127}, 'combined': 0.20085470085470084, 'epoch': 16}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.20121951219512196, 'r': 0.358695652173913, 'f1': 0.2578125}, 'combined': 0.12890625, 'epoch': 16}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42613636363636365, 'r': 0.3232758620689655, 'f1': 0.36764705882352944}, 'combined': 0.2450980392156863, 'epoch': 16}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: 
{'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 17
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 
--learning_rate 9e-4 2023-01-22 13:45:29.276648: step: 2/464, loss: 0.053664859384298325 2023-01-22 13:45:29.947908: step: 4/464, loss: 0.0166542399674654 2023-01-22 13:45:30.716734: step: 6/464, loss: 0.16694216430187225 2023-01-22 13:45:31.467204: step: 8/464, loss: 0.07075951993465424 2023-01-22 13:45:32.223439: step: 10/464, loss: 0.07365650683641434 2023-01-22 13:45:32.956264: step: 12/464, loss: 0.13355489075183868 2023-01-22 13:45:33.687955: step: 14/464, loss: 0.5236701965332031 2023-01-22 13:45:34.414481: step: 16/464, loss: 0.0357673317193985 2023-01-22 13:45:35.118838: step: 18/464, loss: 0.40205228328704834 2023-01-22 13:45:35.850527: step: 20/464, loss: 0.09290299564599991 2023-01-22 13:45:36.589122: step: 22/464, loss: 0.031853266060352325 2023-01-22 13:45:37.487821: step: 24/464, loss: 0.060423143208026886 2023-01-22 13:45:38.138941: step: 26/464, loss: 0.04558515176177025 2023-01-22 13:45:38.845425: step: 28/464, loss: 0.0652964860200882 2023-01-22 13:45:39.554215: step: 30/464, loss: 0.0554376095533371 2023-01-22 13:45:40.264499: step: 32/464, loss: 0.04711688682436943 2023-01-22 13:45:41.047736: step: 34/464, loss: 0.10371864587068558 2023-01-22 13:45:41.807988: step: 36/464, loss: 0.07498136907815933 2023-01-22 13:45:42.690384: step: 38/464, loss: 0.10975520312786102 2023-01-22 13:45:43.398845: step: 40/464, loss: 0.027045963332057 2023-01-22 13:45:44.144583: step: 42/464, loss: 0.14196310937404633 2023-01-22 13:45:44.929229: step: 44/464, loss: 0.06612968444824219 2023-01-22 13:45:45.667845: step: 46/464, loss: 0.41418808698654175 2023-01-22 13:45:46.418816: step: 48/464, loss: 0.9328832626342773 2023-01-22 13:45:47.135173: step: 50/464, loss: 0.3546507656574249 2023-01-22 13:45:47.926375: step: 52/464, loss: 0.06665902584791183 2023-01-22 13:45:48.629152: step: 54/464, loss: 0.09104351699352264 2023-01-22 13:45:49.354049: step: 56/464, loss: 0.1324085146188736 2023-01-22 13:45:50.136588: step: 58/464, loss: 0.14064887166023254 2023-01-22 
13:45:50.919379: step: 60/464, loss: 0.027966858819127083 2023-01-22 13:45:51.750303: step: 62/464, loss: 0.09557768702507019 2023-01-22 13:45:52.495316: step: 64/464, loss: 0.03680592030286789 2023-01-22 13:45:53.201188: step: 66/464, loss: 0.11538437753915787 2023-01-22 13:45:53.996171: step: 68/464, loss: 0.17318518459796906 2023-01-22 13:45:54.690126: step: 70/464, loss: 0.05451393499970436 2023-01-22 13:45:55.410286: step: 72/464, loss: 0.11980122327804565 2023-01-22 13:45:56.145551: step: 74/464, loss: 0.04295728728175163 2023-01-22 13:45:56.876751: step: 76/464, loss: 0.090221107006073 2023-01-22 13:45:57.577510: step: 78/464, loss: 0.15044856071472168 2023-01-22 13:45:58.429320: step: 80/464, loss: 0.17875273525714874 2023-01-22 13:45:59.108528: step: 82/464, loss: 0.0368921123445034 2023-01-22 13:45:59.866848: step: 84/464, loss: 0.07362562417984009 2023-01-22 13:46:00.627352: step: 86/464, loss: 0.34966787695884705 2023-01-22 13:46:01.399795: step: 88/464, loss: 0.061326391994953156 2023-01-22 13:46:02.097879: step: 90/464, loss: 0.10193252563476562 2023-01-22 13:46:02.837335: step: 92/464, loss: 0.032848015427589417 2023-01-22 13:46:03.599161: step: 94/464, loss: 0.08947242051362991 2023-01-22 13:46:04.304590: step: 96/464, loss: 0.0824618861079216 2023-01-22 13:46:05.086285: step: 98/464, loss: 0.10630713403224945 2023-01-22 13:46:05.781094: step: 100/464, loss: 0.032170526683330536 2023-01-22 13:46:06.499392: step: 102/464, loss: 0.04691094160079956 2023-01-22 13:46:07.326845: step: 104/464, loss: 0.06669402122497559 2023-01-22 13:46:07.981842: step: 106/464, loss: 0.33576953411102295 2023-01-22 13:46:08.668560: step: 108/464, loss: 0.061419643461704254 2023-01-22 13:46:09.374208: step: 110/464, loss: 0.11293346434831619 2023-01-22 13:46:10.194027: step: 112/464, loss: 0.0694413036108017 2023-01-22 13:46:10.870354: step: 114/464, loss: 0.030548907816410065 2023-01-22 13:46:11.679425: step: 116/464, loss: 0.12947489321231842 2023-01-22 13:46:12.411047: 
step: 118/464, loss: 0.12525877356529236 2023-01-22 13:46:13.108219: step: 120/464, loss: 0.18001887202262878 2023-01-22 13:46:13.818963: step: 122/464, loss: 0.04662255942821503 2023-01-22 13:46:14.604642: step: 124/464, loss: 0.08843176811933517 2023-01-22 13:46:15.340242: step: 126/464, loss: 0.07850769907236099 2023-01-22 13:46:16.172765: step: 128/464, loss: 0.10110843926668167 2023-01-22 13:46:16.945205: step: 130/464, loss: 0.0414954349398613 2023-01-22 13:46:17.722653: step: 132/464, loss: 0.09983392059803009 2023-01-22 13:46:18.392992: step: 134/464, loss: 0.23782214522361755 2023-01-22 13:46:19.127616: step: 136/464, loss: 0.04706363379955292 2023-01-22 13:46:19.851910: step: 138/464, loss: 0.02044433355331421 2023-01-22 13:46:20.634421: step: 140/464, loss: 0.1627044975757599 2023-01-22 13:46:21.361805: step: 142/464, loss: 0.39245784282684326 2023-01-22 13:46:22.084318: step: 144/464, loss: 0.053587671369314194 2023-01-22 13:46:22.712176: step: 146/464, loss: 0.03694664314389229 2023-01-22 13:46:23.428549: step: 148/464, loss: 0.05931307375431061 2023-01-22 13:46:24.158399: step: 150/464, loss: 0.2365327775478363 2023-01-22 13:46:24.846424: step: 152/464, loss: 0.016334207728505135 2023-01-22 13:46:25.520922: step: 154/464, loss: 0.020577138289809227 2023-01-22 13:46:26.439409: step: 156/464, loss: 0.7687644958496094 2023-01-22 13:46:27.245187: step: 158/464, loss: 0.11480724066495895 2023-01-22 13:46:27.885647: step: 160/464, loss: 0.04226589575409889 2023-01-22 13:46:28.644206: step: 162/464, loss: 0.0029008612036705017 2023-01-22 13:46:29.381411: step: 164/464, loss: 0.040296342223882675 2023-01-22 13:46:30.092357: step: 166/464, loss: 0.0675327405333519 2023-01-22 13:46:30.874589: step: 168/464, loss: 0.05431724339723587 2023-01-22 13:46:31.586520: step: 170/464, loss: 0.027871621772646904 2023-01-22 13:46:32.325465: step: 172/464, loss: 0.11166887730360031 2023-01-22 13:46:33.062431: step: 174/464, loss: 0.10647082328796387 2023-01-22 
13:46:33.815998: step: 176/464, loss: 0.0702541172504425 2023-01-22 13:46:34.601748: step: 178/464, loss: 0.11109448224306107 2023-01-22 13:46:35.299231: step: 180/464, loss: 0.12108645588159561 2023-01-22 13:46:35.993994: step: 182/464, loss: 0.0694524422287941 2023-01-22 13:46:36.714892: step: 184/464, loss: 0.462844580411911 2023-01-22 13:46:37.462779: step: 186/464, loss: 0.030476845800876617 2023-01-22 13:46:38.249669: step: 188/464, loss: 0.0202813558280468 2023-01-22 13:46:38.941356: step: 190/464, loss: 0.1016131266951561 2023-01-22 13:46:39.735022: step: 192/464, loss: 0.048376649618148804 2023-01-22 13:46:40.450289: step: 194/464, loss: 0.18338793516159058 2023-01-22 13:46:41.243487: step: 196/464, loss: 0.01997218281030655 2023-01-22 13:46:41.989479: step: 198/464, loss: 0.05872989073395729 2023-01-22 13:46:42.653476: step: 200/464, loss: 0.07014588266611099 2023-01-22 13:46:43.324533: step: 202/464, loss: 0.03125053271651268 2023-01-22 13:46:44.065141: step: 204/464, loss: 0.03680029883980751 2023-01-22 13:46:44.738235: step: 206/464, loss: 0.007591854315251112 2023-01-22 13:46:45.523821: step: 208/464, loss: 0.016236577183008194 2023-01-22 13:46:46.161937: step: 210/464, loss: 0.06709948182106018 2023-01-22 13:46:46.948233: step: 212/464, loss: 0.28125205636024475 2023-01-22 13:46:47.728131: step: 214/464, loss: 0.11939781904220581 2023-01-22 13:46:48.403541: step: 216/464, loss: 0.04702724516391754 2023-01-22 13:46:49.179896: step: 218/464, loss: 0.05703671649098396 2023-01-22 13:46:49.861439: step: 220/464, loss: 0.5108763575553894 2023-01-22 13:46:50.603362: step: 222/464, loss: 0.08599857240915298 2023-01-22 13:46:51.376087: step: 224/464, loss: 0.12017587572336197 2023-01-22 13:46:52.031511: step: 226/464, loss: 0.07810725271701813 2023-01-22 13:46:52.731485: step: 228/464, loss: 0.13311772048473358 2023-01-22 13:46:53.473755: step: 230/464, loss: 0.10150721669197083 2023-01-22 13:46:54.145897: step: 232/464, loss: 0.41148698329925537 2023-01-22 
13:46:54.938087: step: 234/464, loss: 0.08408664166927338 2023-01-22 13:46:55.663138: step: 236/464, loss: 0.06322506070137024 2023-01-22 13:46:56.339751: step: 238/464, loss: 0.0461643822491169 2023-01-22 13:46:57.026863: step: 240/464, loss: 0.10806065052747726 2023-01-22 13:46:57.807051: step: 242/464, loss: 0.09661133587360382 2023-01-22 13:46:58.553311: step: 244/464, loss: 0.1098666563630104 2023-01-22 13:46:59.313963: step: 246/464, loss: 0.23182234168052673 2023-01-22 13:47:00.046467: step: 248/464, loss: 0.033908504992723465 2023-01-22 13:47:00.764944: step: 250/464, loss: 10.392996788024902 2023-01-22 13:47:01.498548: step: 252/464, loss: 0.0998709425330162 2023-01-22 13:47:02.283513: step: 254/464, loss: 0.13033442199230194 2023-01-22 13:47:03.011224: step: 256/464, loss: 0.0023037393111735582 2023-01-22 13:47:03.862113: step: 258/464, loss: 0.08774647116661072 2023-01-22 13:47:04.578896: step: 260/464, loss: 0.052995506674051285 2023-01-22 13:47:05.282544: step: 262/464, loss: 0.05491722375154495 2023-01-22 13:47:05.980914: step: 264/464, loss: 0.07200101763010025 2023-01-22 13:47:06.724218: step: 266/464, loss: 0.06661888211965561 2023-01-22 13:47:07.483733: step: 268/464, loss: 0.09004511684179306 2023-01-22 13:47:08.202100: step: 270/464, loss: 0.05332894250750542 2023-01-22 13:47:08.939877: step: 272/464, loss: 0.048882585018873215 2023-01-22 13:47:09.681733: step: 274/464, loss: 0.1596928983926773 2023-01-22 13:47:10.323501: step: 276/464, loss: 0.09304609894752502 2023-01-22 13:47:11.107884: step: 278/464, loss: 0.1633264273405075 2023-01-22 13:47:11.926758: step: 280/464, loss: 0.16213783621788025 2023-01-22 13:47:12.605921: step: 282/464, loss: 0.11811444163322449 2023-01-22 13:47:13.358625: step: 284/464, loss: 0.055999331176280975 2023-01-22 13:47:14.150542: step: 286/464, loss: 0.06834820657968521 2023-01-22 13:47:14.892154: step: 288/464, loss: 0.08850032836198807 2023-01-22 13:47:15.637189: step: 290/464, loss: 0.31907227635383606 
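The evaluation dicts in this log report template and slot precision/recall/F1 plus a 'combined' value. From the logged numbers, 'combined' is exactly the product of the two F1 scores (e.g. 0.5 × 0.2578125 = 0.12890625 for Sample Korean at epoch 16). A minimal sketch of that arithmetic, assuming the standard F1 definition (the function names here are illustrative, not taken from `train.py`):

```python
def f1(p, r):
    # Harmonic mean of precision and recall; 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

def combined_score(template, slot):
    # 'combined' in this log equals template F1 times slot F1.
    return template['f1'] * slot['f1']

# Sample Korean, epoch 16 (values copied from the log):
template = {'p': 0.5, 'r': 0.5, 'f1': 0.5}
slot = {'p': 0.20121951219512196, 'r': 0.358695652173913, 'f1': 0.2578125}
assert abs(f1(template['p'], template['r']) - template['f1']) < 1e-12
assert combined_score(template, slot) == 0.12890625
```

The same product reproduces every 'combined' field in the epoch summaries above, which suggests the template F1 acts as a multiplicative gate on the slot F1.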
2023-01-22 13:47:16.353446: step: 292/464, loss: 0.22951997816562653 2023-01-22 13:47:17.119163: step: 294/464, loss: 0.1545470803976059 2023-01-22 13:47:17.825432: step: 296/464, loss: 0.05630459636449814 2023-01-22 13:47:18.525096: step: 298/464, loss: 0.06210784241557121 2023-01-22 13:47:19.296751: step: 300/464, loss: 0.14896251261234283 2023-01-22 13:47:20.025395: step: 302/464, loss: 0.06157911568880081 2023-01-22 13:47:20.749936: step: 304/464, loss: 0.12659184634685516 2023-01-22 13:47:21.429885: step: 306/464, loss: 0.024008898064494133 2023-01-22 13:47:22.172815: step: 308/464, loss: 0.025252996012568474 2023-01-22 13:47:22.993797: step: 310/464, loss: 0.11464542150497437 2023-01-22 13:47:23.678392: step: 312/464, loss: 0.04284798353910446 2023-01-22 13:47:24.461134: step: 314/464, loss: 0.16082020103931427 2023-01-22 13:47:25.151225: step: 316/464, loss: 0.8475959897041321 2023-01-22 13:47:25.928239: step: 318/464, loss: 0.08074049651622772 2023-01-22 13:47:26.650778: step: 320/464, loss: 0.04344986751675606 2023-01-22 13:47:27.416567: step: 322/464, loss: 0.13628548383712769 2023-01-22 13:47:28.156597: step: 324/464, loss: 0.06325775384902954 2023-01-22 13:47:28.865529: step: 326/464, loss: 0.03645012527704239 2023-01-22 13:47:29.590132: step: 328/464, loss: 0.07878056168556213 2023-01-22 13:47:30.376217: step: 330/464, loss: 0.19644658267498016 2023-01-22 13:47:31.067018: step: 332/464, loss: 0.10182038694620132 2023-01-22 13:47:31.761062: step: 334/464, loss: 0.10113890469074249 2023-01-22 13:47:32.701367: step: 336/464, loss: 0.11203952878713608 2023-01-22 13:47:33.417240: step: 338/464, loss: 0.05734609067440033 2023-01-22 13:47:34.148802: step: 340/464, loss: 0.04465271905064583 2023-01-22 13:47:34.791371: step: 342/464, loss: 0.004280386958271265 2023-01-22 13:47:35.515446: step: 344/464, loss: 0.19001714885234833 2023-01-22 13:47:36.276857: step: 346/464, loss: 0.14562982320785522 2023-01-22 13:47:37.137350: step: 348/464, loss: 0.14720419049263 
2023-01-22 13:47:37.857578: step: 350/464, loss: 0.12624314427375793 2023-01-22 13:47:38.609914: step: 352/464, loss: 0.13830460608005524 2023-01-22 13:47:39.364647: step: 354/464, loss: 0.05004170536994934 2023-01-22 13:47:40.076769: step: 356/464, loss: 0.12028498202562332 2023-01-22 13:47:40.787485: step: 358/464, loss: 0.2403750866651535 2023-01-22 13:47:41.497458: step: 360/464, loss: 0.08636949211359024 2023-01-22 13:47:42.216945: step: 362/464, loss: 0.09380398690700531 2023-01-22 13:47:42.967991: step: 364/464, loss: 0.06335175037384033 2023-01-22 13:47:43.666711: step: 366/464, loss: 0.3888946771621704 2023-01-22 13:47:44.422690: step: 368/464, loss: 0.03396635875105858 2023-01-22 13:47:45.129039: step: 370/464, loss: 0.0674733966588974 2023-01-22 13:47:45.778518: step: 372/464, loss: 0.03991612046957016 2023-01-22 13:47:46.441492: step: 374/464, loss: 0.09819450229406357 2023-01-22 13:47:47.088025: step: 376/464, loss: 0.08212851732969284 2023-01-22 13:47:47.770042: step: 378/464, loss: 0.04800790548324585 2023-01-22 13:47:48.496292: step: 380/464, loss: 0.07307100296020508 2023-01-22 13:47:49.193929: step: 382/464, loss: 0.04123353213071823 2023-01-22 13:47:49.913405: step: 384/464, loss: 0.3220861554145813 2023-01-22 13:47:50.632528: step: 386/464, loss: 0.054907046258449554 2023-01-22 13:47:51.369947: step: 388/464, loss: 0.17442238330841064 2023-01-22 13:47:52.180374: step: 390/464, loss: 0.04887682572007179 2023-01-22 13:47:52.887921: step: 392/464, loss: 0.041350577026605606 2023-01-22 13:47:53.580074: step: 394/464, loss: 0.15166236460208893 2023-01-22 13:47:54.295596: step: 396/464, loss: 0.14650413393974304 2023-01-22 13:47:55.055123: step: 398/464, loss: 0.09053334593772888 2023-01-22 13:47:55.853828: step: 400/464, loss: 0.01918404921889305 2023-01-22 13:47:56.595523: step: 402/464, loss: 0.14491277933120728 2023-01-22 13:47:57.246657: step: 404/464, loss: 0.22183500230312347 2023-01-22 13:47:57.964788: step: 406/464, loss: 0.22847139835357666 
2023-01-22 13:47:58.750928: step: 408/464, loss: 0.0427161380648613 2023-01-22 13:47:59.475446: step: 410/464, loss: 0.04364943131804466 2023-01-22 13:48:00.217476: step: 412/464, loss: 0.14266836643218994 2023-01-22 13:48:01.006357: step: 414/464, loss: 0.022271789610385895 2023-01-22 13:48:01.751104: step: 416/464, loss: 0.36364445090293884 2023-01-22 13:48:02.446299: step: 418/464, loss: 0.11318691074848175 2023-01-22 13:48:03.217283: step: 420/464, loss: 0.030474815517663956 2023-01-22 13:48:03.952277: step: 422/464, loss: 0.09793737530708313 2023-01-22 13:48:04.817185: step: 424/464, loss: 0.09572654217481613 2023-01-22 13:48:05.587183: step: 426/464, loss: 0.051216285675764084 2023-01-22 13:48:06.297111: step: 428/464, loss: 0.11468888074159622 2023-01-22 13:48:07.000691: step: 430/464, loss: 0.16534972190856934 2023-01-22 13:48:07.685919: step: 432/464, loss: 0.04753812029957771 2023-01-22 13:48:08.405526: step: 434/464, loss: 0.08464813232421875 2023-01-22 13:48:09.227544: step: 436/464, loss: 0.049439672380685806 2023-01-22 13:48:09.963972: step: 438/464, loss: 0.05753437429666519 2023-01-22 13:48:10.751820: step: 440/464, loss: 0.036289017647504807 2023-01-22 13:48:11.497772: step: 442/464, loss: 0.10135679692029953 2023-01-22 13:48:12.242442: step: 444/464, loss: 0.05048893392086029 2023-01-22 13:48:13.028678: step: 446/464, loss: 0.7747906446456909 2023-01-22 13:48:13.784540: step: 448/464, loss: 0.03790482506155968 2023-01-22 13:48:14.445653: step: 450/464, loss: 0.05181882530450821 2023-01-22 13:48:15.114519: step: 452/464, loss: 0.09344540536403656 2023-01-22 13:48:15.803116: step: 454/464, loss: 0.11064407229423523 2023-01-22 13:48:16.513820: step: 456/464, loss: 1.7301889657974243 2023-01-22 13:48:17.244045: step: 458/464, loss: 0.05675554648041725 2023-01-22 13:48:17.929691: step: 460/464, loss: 0.13673949241638184 2023-01-22 13:48:18.679958: step: 462/464, loss: 0.26604804396629333 2023-01-22 13:48:19.369035: step: 464/464, loss: 
0.04592286795377731 2023-01-22 13:48:20.098411: step: 466/464, loss: 0.10509765148162842 2023-01-22 13:48:20.843951: step: 468/464, loss: 1.0288773775100708 2023-01-22 13:48:21.553555: step: 470/464, loss: 0.037989452481269836 2023-01-22 13:48:22.300224: step: 472/464, loss: 0.057457275688648224 2023-01-22 13:48:23.033839: step: 474/464, loss: 0.12352025508880615 2023-01-22 13:48:23.779594: step: 476/464, loss: 0.06553643941879272 2023-01-22 13:48:24.506259: step: 478/464, loss: 0.03871558606624603 2023-01-22 13:48:25.334116: step: 480/464, loss: 0.03770974278450012 2023-01-22 13:48:26.044883: step: 482/464, loss: 0.1302376687526703 2023-01-22 13:48:26.815027: step: 484/464, loss: 0.11569955945014954 2023-01-22 13:48:27.577332: step: 486/464, loss: 0.15845021605491638 2023-01-22 13:48:28.318346: step: 488/464, loss: 0.0666755884885788 2023-01-22 13:48:29.030662: step: 490/464, loss: 0.14751411974430084 2023-01-22 13:48:29.766297: step: 492/464, loss: 0.05031317099928856 2023-01-22 13:48:30.493951: step: 494/464, loss: 0.05618094280362129 2023-01-22 13:48:31.342180: step: 496/464, loss: 0.25684428215026855 2023-01-22 13:48:32.093005: step: 498/464, loss: 0.11302121728658676 2023-01-22 13:48:32.846278: step: 500/464, loss: 0.8344253897666931 2023-01-22 13:48:33.520273: step: 502/464, loss: 0.183640256524086 2023-01-22 13:48:34.307454: step: 504/464, loss: 0.05752811208367348 2023-01-22 13:48:35.170276: step: 506/464, loss: 0.016010737046599388 2023-01-22 13:48:35.814377: step: 508/464, loss: 0.018036268651485443 2023-01-22 13:48:36.658200: step: 510/464, loss: 0.0805363729596138 2023-01-22 13:48:37.417051: step: 512/464, loss: 0.0920647606253624 2023-01-22 13:48:38.126570: step: 514/464, loss: 0.08872433006763458 2023-01-22 13:48:38.823397: step: 516/464, loss: 0.1249750405550003 2023-01-22 13:48:39.630797: step: 518/464, loss: 0.2511100769042969 2023-01-22 13:48:40.329695: step: 520/464, loss: 0.012199416756629944 2023-01-22 13:48:41.028565: step: 522/464, loss: 
0.2406979352235794 2023-01-22 13:48:41.807513: step: 524/464, loss: 0.11704836785793304 2023-01-22 13:48:42.549348: step: 526/464, loss: 0.013306746259331703 2023-01-22 13:48:43.203263: step: 528/464, loss: 0.014066663570702076 2023-01-22 13:48:43.875173: step: 530/464, loss: 0.060074660927057266 2023-01-22 13:48:44.622676: step: 532/464, loss: 0.034528978168964386 2023-01-22 13:48:45.333982: step: 534/464, loss: 0.04230083152651787 2023-01-22 13:48:46.120088: step: 536/464, loss: 0.12458521872758865 2023-01-22 13:48:46.789236: step: 538/464, loss: 0.05736164376139641 2023-01-22 13:48:47.486557: step: 540/464, loss: 0.20850035548210144 2023-01-22 13:48:48.264169: step: 542/464, loss: 0.0998268872499466 2023-01-22 13:48:49.032534: step: 544/464, loss: 0.09749908000230789 2023-01-22 13:48:49.736189: step: 546/464, loss: 1.4653208255767822 2023-01-22 13:48:50.579173: step: 548/464, loss: 0.05344126373529434 2023-01-22 13:48:51.282670: step: 550/464, loss: 0.07183989882469177 2023-01-22 13:48:52.010361: step: 552/464, loss: 0.26208972930908203 2023-01-22 13:48:52.778362: step: 554/464, loss: 0.06099194660782814 2023-01-22 13:48:53.526356: step: 556/464, loss: 0.13452257215976715 2023-01-22 13:48:54.268891: step: 558/464, loss: 0.0388176366686821 2023-01-22 13:48:55.036343: step: 560/464, loss: 0.13151054084300995 2023-01-22 13:48:55.751770: step: 562/464, loss: 0.11742176860570908 2023-01-22 13:48:56.423937: step: 564/464, loss: 0.03709885850548744 2023-01-22 13:48:57.109587: step: 566/464, loss: 0.20520006120204926 2023-01-22 13:48:57.882854: step: 568/464, loss: 0.1609494388103485 2023-01-22 13:48:58.586254: step: 570/464, loss: 0.0556962713599205 2023-01-22 13:48:59.282792: step: 572/464, loss: 0.21883781254291534 2023-01-22 13:49:00.055644: step: 574/464, loss: 0.07949330657720566 2023-01-22 13:49:00.826994: step: 576/464, loss: 0.3774619698524475 2023-01-22 13:49:01.503948: step: 578/464, loss: 0.18832574784755707 2023-01-22 13:49:02.258372: step: 580/464, loss: 
0.11123646795749664 2023-01-22 13:49:03.010295: step: 582/464, loss: 0.051139943301677704 2023-01-22 13:49:03.736216: step: 584/464, loss: 0.043388500809669495 2023-01-22 13:49:04.423566: step: 586/464, loss: 0.037496041506528854 2023-01-22 13:49:05.100813: step: 588/464, loss: 0.03766334429383278 2023-01-22 13:49:05.707003: step: 590/464, loss: 0.04886811599135399 2023-01-22 13:49:06.427403: step: 592/464, loss: 0.06284377723932266 2023-01-22 13:49:07.149395: step: 594/464, loss: 0.0439876988530159 2023-01-22 13:49:07.811689: step: 596/464, loss: 0.04255954176187515 2023-01-22 13:49:08.511229: step: 598/464, loss: 0.24127870798110962 2023-01-22 13:49:09.129927: step: 600/464, loss: 0.07364135980606079 2023-01-22 13:49:09.841459: step: 602/464, loss: 0.07154645770788193 2023-01-22 13:49:10.629244: step: 604/464, loss: 0.3457305133342743 2023-01-22 13:49:11.309258: step: 606/464, loss: 0.0533280149102211 2023-01-22 13:49:12.069071: step: 608/464, loss: 0.3225669264793396 2023-01-22 13:49:12.731129: step: 610/464, loss: 0.05882871150970459 2023-01-22 13:49:13.466126: step: 612/464, loss: 0.03248715400695801 2023-01-22 13:49:14.201453: step: 614/464, loss: 0.64145427942276 2023-01-22 13:49:14.918135: step: 616/464, loss: 0.07119648158550262 2023-01-22 13:49:15.631541: step: 618/464, loss: 0.04753183200955391 2023-01-22 13:49:16.311714: step: 620/464, loss: 0.20422199368476868 2023-01-22 13:49:17.010381: step: 622/464, loss: 0.08181479573249817 2023-01-22 13:49:17.705588: step: 624/464, loss: 0.02665702998638153 2023-01-22 13:49:18.401631: step: 626/464, loss: 0.09652336686849594 2023-01-22 13:49:19.161970: step: 628/464, loss: 0.11436853557825089 2023-01-22 13:49:19.919510: step: 630/464, loss: 0.13034532964229584 2023-01-22 13:49:20.609793: step: 632/464, loss: 0.127781942486763 2023-01-22 13:49:21.356657: step: 634/464, loss: 0.09202573448419571 2023-01-22 13:49:22.047040: step: 636/464, loss: 0.03257625922560692 2023-01-22 13:49:22.781292: step: 638/464, loss: 
0.15791824460029602 2023-01-22 13:49:23.495955: step: 640/464, loss: 0.018911080434918404 2023-01-22 13:49:24.236814: step: 642/464, loss: 0.040161360055208206 2023-01-22 13:49:24.958484: step: 644/464, loss: 0.21853290498256683 2023-01-22 13:49:25.702333: step: 646/464, loss: 0.06965027749538422 2023-01-22 13:49:26.446669: step: 648/464, loss: 0.1252724528312683 2023-01-22 13:49:27.144354: step: 650/464, loss: 0.17936329543590546 2023-01-22 13:49:27.926753: step: 652/464, loss: 0.1071472018957138 2023-01-22 13:49:28.648136: step: 654/464, loss: 0.024697106331586838 2023-01-22 13:49:29.399316: step: 656/464, loss: 0.1417209357023239 2023-01-22 13:49:30.118603: step: 658/464, loss: 0.05779734253883362 2023-01-22 13:49:30.788277: step: 660/464, loss: 0.07815540581941605 2023-01-22 13:49:31.576934: step: 662/464, loss: 0.15495631098747253 2023-01-22 13:49:32.311686: step: 664/464, loss: 0.186952605843544 2023-01-22 13:49:33.158866: step: 666/464, loss: 0.08128277957439423 2023-01-22 13:49:33.869541: step: 668/464, loss: 0.16984660923480988 2023-01-22 13:49:34.631228: step: 670/464, loss: 0.026893116533756256 2023-01-22 13:49:35.368333: step: 672/464, loss: 0.043474629521369934 2023-01-22 13:49:36.131313: step: 674/464, loss: 0.13293564319610596 2023-01-22 13:49:36.841823: step: 676/464, loss: 0.030793726444244385 2023-01-22 13:49:37.517639: step: 678/464, loss: 0.06726095080375671 2023-01-22 13:49:38.234505: step: 680/464, loss: 0.15387548506259918 2023-01-22 13:49:38.969468: step: 682/464, loss: 0.09032996743917465 2023-01-22 13:49:39.704618: step: 684/464, loss: 0.06952432543039322 2023-01-22 13:49:40.449160: step: 686/464, loss: 0.03810926526784897 2023-01-22 13:49:41.240732: step: 688/464, loss: 0.07896184921264648 2023-01-22 13:49:42.063437: step: 690/464, loss: 0.4305756688117981 2023-01-22 13:49:42.791987: step: 692/464, loss: 0.23241256177425385 2023-01-22 13:49:43.552739: step: 694/464, loss: 0.06421250104904175 2023-01-22 13:49:44.265658: step: 696/464, 
loss: 0.03044717386364937 2023-01-22 13:49:45.003007: step: 698/464, loss: 0.1370568573474884 2023-01-22 13:49:45.756936: step: 700/464, loss: 0.07529357820749283 2023-01-22 13:49:46.393347: step: 702/464, loss: 0.04130513593554497 2023-01-22 13:49:47.092329: step: 704/464, loss: 0.02765693888068199 2023-01-22 13:49:47.793432: step: 706/464, loss: 0.19176216423511505 2023-01-22 13:49:48.504895: step: 708/464, loss: 0.15551027655601501 2023-01-22 13:49:49.172080: step: 710/464, loss: 0.06480307877063751 2023-01-22 13:49:49.923185: step: 712/464, loss: 0.023103512823581696 2023-01-22 13:49:50.616296: step: 714/464, loss: 0.03260474652051926 2023-01-22 13:49:51.300045: step: 716/464, loss: 0.0389396958053112 2023-01-22 13:49:52.036299: step: 718/464, loss: 0.020739024505019188 2023-01-22 13:49:52.723000: step: 720/464, loss: 0.1336267739534378 2023-01-22 13:49:53.418547: step: 722/464, loss: 0.08211637288331985 2023-01-22 13:49:54.162145: step: 724/464, loss: 0.04815949127078056 2023-01-22 13:49:54.858133: step: 726/464, loss: 0.16563375294208527 2023-01-22 13:49:55.652989: step: 728/464, loss: 0.13736072182655334 2023-01-22 13:49:56.403580: step: 730/464, loss: 0.06849687546491623 2023-01-22 13:49:57.126405: step: 732/464, loss: 0.08200888335704803 2023-01-22 13:49:57.883608: step: 734/464, loss: 0.15218576788902283 2023-01-22 13:49:58.655626: step: 736/464, loss: 0.09333252906799316 2023-01-22 13:49:59.345519: step: 738/464, loss: 0.04118682071566582 2023-01-22 13:50:00.118836: step: 740/464, loss: 0.08373872935771942 2023-01-22 13:50:00.883829: step: 742/464, loss: 0.08737599104642868 2023-01-22 13:50:01.586143: step: 744/464, loss: 0.1110134944319725 2023-01-22 13:50:02.310981: step: 746/464, loss: 0.09193912148475647 2023-01-22 13:50:03.039059: step: 748/464, loss: 0.09227243065834045 2023-01-22 13:50:03.722499: step: 750/464, loss: 0.1252930611371994 2023-01-22 13:50:04.468163: step: 752/464, loss: 0.08073785156011581 2023-01-22 13:50:05.208300: step: 754/464, 
loss: 0.17625649273395538 2023-01-22 13:50:05.997943: step: 756/464, loss: 0.2500367760658264 2023-01-22 13:50:06.632623: step: 758/464, loss: 0.05067160353064537 2023-01-22 13:50:07.379773: step: 760/464, loss: 0.02755684033036232 2023-01-22 13:50:08.115549: step: 762/464, loss: 0.1689205765724182 2023-01-22 13:50:08.820493: step: 764/464, loss: 0.05219363793730736 2023-01-22 13:50:09.518070: step: 766/464, loss: 0.07540833950042725 2023-01-22 13:50:10.264727: step: 768/464, loss: 0.11556242406368256 2023-01-22 13:50:10.967716: step: 770/464, loss: 0.059864141047000885 2023-01-22 13:50:11.721022: step: 772/464, loss: 0.23969833552837372 2023-01-22 13:50:12.418231: step: 774/464, loss: 0.036458227783441544 2023-01-22 13:50:13.191506: step: 776/464, loss: 0.08701157569885254 2023-01-22 13:50:13.917843: step: 778/464, loss: 0.028931396082043648 2023-01-22 13:50:14.600861: step: 780/464, loss: 0.02051432803273201 2023-01-22 13:50:15.412553: step: 782/464, loss: 0.13937965035438538 2023-01-22 13:50:16.180320: step: 784/464, loss: 0.11918213218450546 2023-01-22 13:50:16.977370: step: 786/464, loss: 0.024718215689063072 2023-01-22 13:50:17.689208: step: 788/464, loss: 0.0715637132525444 2023-01-22 13:50:18.432913: step: 790/464, loss: 0.1728140264749527 2023-01-22 13:50:19.162030: step: 792/464, loss: 0.0436103418469429 2023-01-22 13:50:19.885264: step: 794/464, loss: 0.05633265897631645 2023-01-22 13:50:20.686110: step: 796/464, loss: 0.13844357430934906 2023-01-22 13:50:21.440510: step: 798/464, loss: 0.13182282447814941 2023-01-22 13:50:22.196382: step: 800/464, loss: 0.04310326278209686 2023-01-22 13:50:22.954574: step: 802/464, loss: 0.6118875741958618 2023-01-22 13:50:23.733714: step: 804/464, loss: 0.06036046892404556 2023-01-22 13:50:24.470963: step: 806/464, loss: 0.08748368918895721 2023-01-22 13:50:25.205776: step: 808/464, loss: 0.08084870874881744 2023-01-22 13:50:25.945845: step: 810/464, loss: 0.07438945770263672 2023-01-22 13:50:26.803230: step: 812/464, 
loss: 0.04066689312458038 2023-01-22 13:50:27.514907: step: 814/464, loss: 0.034743160009384155 2023-01-22 13:50:28.259989: step: 816/464, loss: 0.09501675516366959 2023-01-22 13:50:28.984814: step: 818/464, loss: 0.04034503549337387 2023-01-22 13:50:29.676932: step: 820/464, loss: 0.08743909001350403 2023-01-22 13:50:30.445919: step: 822/464, loss: 0.18358755111694336 2023-01-22 13:50:31.159901: step: 824/464, loss: 0.08890260010957718 2023-01-22 13:50:31.898175: step: 826/464, loss: 0.022742677479982376 2023-01-22 13:50:32.677417: step: 828/464, loss: 0.4063442051410675 2023-01-22 13:50:33.436839: step: 830/464, loss: 0.10741689056158066 2023-01-22 13:50:34.213068: step: 832/464, loss: 0.051975641399621964 2023-01-22 13:50:34.910121: step: 834/464, loss: 0.014520746655762196 2023-01-22 13:50:35.693876: step: 836/464, loss: 0.13959254324436188 2023-01-22 13:50:36.382518: step: 838/464, loss: 0.24080374836921692 2023-01-22 13:50:37.007697: step: 840/464, loss: 0.02364250086247921 2023-01-22 13:50:37.676573: step: 842/464, loss: 0.04288835823535919 2023-01-22 13:50:38.368414: step: 844/464, loss: 0.11207117140293121 2023-01-22 13:50:39.115870: step: 846/464, loss: 0.12408368289470673 2023-01-22 13:50:39.887714: step: 848/464, loss: 0.05987241491675377 2023-01-22 13:50:40.665118: step: 850/464, loss: 0.11004193872213364 2023-01-22 13:50:41.451838: step: 852/464, loss: 0.07125671207904816 2023-01-22 13:50:42.167790: step: 854/464, loss: 0.09247303009033203 2023-01-22 13:50:42.873392: step: 856/464, loss: 0.07699891924858093 2023-01-22 13:50:43.524127: step: 858/464, loss: 0.0834435373544693 2023-01-22 13:50:44.258023: step: 860/464, loss: 0.15045170485973358 2023-01-22 13:50:45.034061: step: 862/464, loss: 0.03895842283964157 2023-01-22 13:50:45.735460: step: 864/464, loss: 0.07983269542455673 2023-01-22 13:50:46.606604: step: 866/464, loss: 0.09023448079824448 2023-01-22 13:50:47.341523: step: 868/464, loss: 0.05701834335923195 2023-01-22 13:50:48.037185: step: 
870/464, loss: 0.03738541156053543 2023-01-22 13:50:48.745746: step: 872/464, loss: 0.022355815395712852 2023-01-22 13:50:49.467675: step: 874/464, loss: 0.011521057225763798 2023-01-22 13:50:50.365104: step: 876/464, loss: 0.038567617535591125 2023-01-22 13:50:51.185983: step: 878/464, loss: 0.2596600353717804 2023-01-22 13:50:51.867399: step: 880/464, loss: 0.04852532595396042 2023-01-22 13:50:52.566530: step: 882/464, loss: 0.008914230391383171 2023-01-22 13:50:53.334454: step: 884/464, loss: 0.12365903705358505 2023-01-22 13:50:54.126473: step: 886/464, loss: 0.07245268672704697 2023-01-22 13:50:54.892323: step: 888/464, loss: 0.13257306814193726 2023-01-22 13:50:55.718176: step: 890/464, loss: 0.1250031590461731 2023-01-22 13:50:56.403502: step: 892/464, loss: 0.409727543592453 2023-01-22 13:50:57.200552: step: 894/464, loss: 0.16821452975273132 2023-01-22 13:50:57.937105: step: 896/464, loss: 0.5855254530906677 2023-01-22 13:50:58.723735: step: 898/464, loss: 0.09568691998720169 2023-01-22 13:50:59.539081: step: 900/464, loss: 0.028606195002794266 2023-01-22 13:51:00.236193: step: 902/464, loss: 0.06103532388806343 2023-01-22 13:51:01.003417: step: 904/464, loss: 0.18515242636203766 2023-01-22 13:51:01.937206: step: 906/464, loss: 0.03543923795223236 2023-01-22 13:51:02.673106: step: 908/464, loss: 0.4824013411998749 2023-01-22 13:51:03.476943: step: 910/464, loss: 0.059209562838077545 2023-01-22 13:51:04.216387: step: 912/464, loss: 0.7797068953514099 2023-01-22 13:51:04.895682: step: 914/464, loss: 0.0929434671998024 2023-01-22 13:51:05.785575: step: 916/464, loss: 0.3149912655353546 2023-01-22 13:51:06.510785: step: 918/464, loss: 0.03383219987154007 2023-01-22 13:51:07.196422: step: 920/464, loss: 0.029487041756510735 2023-01-22 13:51:08.008790: step: 922/464, loss: 0.0898323506116867 2023-01-22 13:51:08.747356: step: 924/464, loss: 0.19288434088230133 2023-01-22 13:51:09.538614: step: 926/464, loss: 0.13376876711845398 2023-01-22 13:51:10.282232: step: 
928/464, loss: 0.06767135858535767
2023-01-22 13:51:10.868389: step: 930/464, loss: 0.021101508289575577
==================================================
Loss: 0.147
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33272748592870544, 'r': 0.33651565464895633, 'f1': 0.33461084905660377}, 'combined': 0.24655536246276066, 'epoch': 17}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29871038770749647, 'r': 0.29369252546339825, 'f1': 0.2961802050512795}, 'combined': 0.1839434957686894, 'epoch': 17}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.308911323051948, 'r': 0.3282549163360358, 'f1': 0.31828949569289955}, 'combined': 0.23452910208950492, 'epoch': 17}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2880918419815071, 'r': 0.29036924784697354, 'f1': 0.28922606183182803}, 'combined': 0.17962460682187215, 'epoch': 17}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3181684355917418, 'r': 0.3278281983421552, 'f1': 0.3229260944417118}, 'combined': 0.23794554327284023, 'epoch': 17}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31144583921633184, 'r': 0.3037520190775885, 'f1': 0.3075508187158774}, 'combined': 0.19100524530775545, 'epoch': 17}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.30357142857142855, 'r': 0.36428571428571427, 'f1': 0.3311688311688311}, 'combined': 0.22077922077922074, 'epoch': 17}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31875, 'r': 0.5543478260869565, 'f1': 0.40476190476190477}, 'combined': 0.20238095238095238, 'epoch': 17}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3977272727272727, 'r': 0.3017241379310345, 'f1': 0.34313725490196073}, 'combined': 0.2287581699346405, 'epoch': 17}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1':
0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8} ****************************** Epoch: 18 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 13:53:48.704720: step: 2/464, loss: 0.02255629189312458 2023-01-22 13:53:49.434840: step: 4/464, loss: 0.10551592707633972 2023-01-22 13:53:50.162138: step: 6/464, loss: 0.03464389219880104 2023-01-22 13:53:50.926601: step: 8/464, loss: 0.10338738560676575 2023-01-22 13:53:51.618530: step: 10/464, loss: 0.08868830651044846 2023-01-22 13:53:52.405349: step: 12/464, loss: 0.02098909579217434 2023-01-22 13:53:53.107542: step: 14/464, loss: 0.043363526463508606 2023-01-22 13:53:53.809124: step: 16/464, loss: 0.04640447720885277 2023-01-22 13:53:54.545291: step: 18/464, loss: 0.12110107392072678 2023-01-22 13:53:55.205973: step: 20/464, loss: 0.03413453325629234 2023-01-22 13:53:56.096095: step: 22/464, loss: 0.012522303499281406 2023-01-22 13:53:56.757796: step: 24/464, loss: 0.131692036986351 2023-01-22 13:53:57.520524: step: 26/464, loss: 0.02454289421439171 2023-01-22 13:53:58.208864: step: 28/464, loss: 0.07694856822490692 2023-01-22 13:53:58.894080: step: 30/464, loss: 0.02013799548149109 2023-01-22 13:53:59.595794: step: 32/464, loss: 1.6733767986297607 2023-01-22 13:54:00.307734: step: 34/464, loss: 0.1602635681629181 2023-01-22 13:54:00.985320: step: 36/464, loss: 0.05486570671200752 2023-01-22 13:54:01.757267: step: 38/464, loss: 0.06268543750047684 2023-01-22 13:54:02.481823: step: 40/464, loss: 0.10344596207141876 2023-01-22 13:54:03.208656: step: 42/464, loss: 0.22582951188087463 2023-01-22 13:54:03.928257: step: 44/464, loss: 
0.06255070865154266 2023-01-22 13:54:04.745287: step: 46/464, loss: 0.06916432827711105 2023-01-22 13:54:05.574892: step: 48/464, loss: 0.048753079026937485 2023-01-22 13:54:06.338325: step: 50/464, loss: 0.19315694272518158 2023-01-22 13:54:07.087036: step: 52/464, loss: 0.06600367277860641 2023-01-22 13:54:07.775050: step: 54/464, loss: 0.885759711265564 2023-01-22 13:54:08.532714: step: 56/464, loss: 0.06824956089258194 2023-01-22 13:54:09.280219: step: 58/464, loss: 0.10237880796194077 2023-01-22 13:54:10.055846: step: 60/464, loss: 0.0079014478251338 2023-01-22 13:54:10.741910: step: 62/464, loss: 0.08703933656215668 2023-01-22 13:54:11.429363: step: 64/464, loss: 0.20505230128765106 2023-01-22 13:54:12.121612: step: 66/464, loss: 0.08236251026391983 2023-01-22 13:54:12.897505: step: 68/464, loss: 0.13015301525592804 2023-01-22 13:54:13.621098: step: 70/464, loss: 0.06615529209375381 2023-01-22 13:54:14.312894: step: 72/464, loss: 0.22916918992996216 2023-01-22 13:54:15.080156: step: 74/464, loss: 0.07430724054574966 2023-01-22 13:54:15.809827: step: 76/464, loss: 0.0898202508687973 2023-01-22 13:54:16.580615: step: 78/464, loss: 0.08509740978479385 2023-01-22 13:54:17.283234: step: 80/464, loss: 0.02653665468096733 2023-01-22 13:54:17.961147: step: 82/464, loss: 0.1384216696023941 2023-01-22 13:54:18.691502: step: 84/464, loss: 0.1956842988729477 2023-01-22 13:54:19.445610: step: 86/464, loss: 0.047416504472494125 2023-01-22 13:54:20.131365: step: 88/464, loss: 0.19975516200065613 2023-01-22 13:54:20.831694: step: 90/464, loss: 0.1394379884004593 2023-01-22 13:54:21.558829: step: 92/464, loss: 0.11064222455024719 2023-01-22 13:54:22.219305: step: 94/464, loss: 0.011957090348005295 2023-01-22 13:54:22.946207: step: 96/464, loss: 0.0212737824767828 2023-01-22 13:54:23.693356: step: 98/464, loss: 0.11996756494045258 2023-01-22 13:54:24.386353: step: 100/464, loss: 0.08332812041044235 2023-01-22 13:54:25.224157: step: 102/464, loss: 0.04559627175331116 2023-01-22 
13:54:25.957724: step: 104/464, loss: 0.11228682845830917 2023-01-22 13:54:26.681651: step: 106/464, loss: 0.057501498609781265 2023-01-22 13:54:27.354042: step: 108/464, loss: 0.02210753969848156 2023-01-22 13:54:28.067406: step: 110/464, loss: 0.021231796592473984 2023-01-22 13:54:28.836449: step: 112/464, loss: 0.22066837549209595 2023-01-22 13:54:29.505638: step: 114/464, loss: 0.04990150406956673 2023-01-22 13:54:30.360426: step: 116/464, loss: 0.05642591789364815 2023-01-22 13:54:31.166628: step: 118/464, loss: 0.010035468265414238 2023-01-22 13:54:31.944416: step: 120/464, loss: 0.15385819971561432 2023-01-22 13:54:32.666557: step: 122/464, loss: 0.12485004961490631 2023-01-22 13:54:33.342557: step: 124/464, loss: 0.01280268281698227 2023-01-22 13:54:34.089409: step: 126/464, loss: 0.2483009397983551 2023-01-22 13:54:34.833183: step: 128/464, loss: 0.05752422288060188 2023-01-22 13:54:35.614937: step: 130/464, loss: 0.036647357046604156 2023-01-22 13:54:36.325636: step: 132/464, loss: 0.034497249871492386 2023-01-22 13:54:37.051438: step: 134/464, loss: 0.03933246433734894 2023-01-22 13:54:37.728438: step: 136/464, loss: 0.029888764023780823 2023-01-22 13:54:38.442400: step: 138/464, loss: 0.06552189588546753 2023-01-22 13:54:39.137519: step: 140/464, loss: 0.38262122869491577 2023-01-22 13:54:39.838616: step: 142/464, loss: 0.015346909873187542 2023-01-22 13:54:40.565298: step: 144/464, loss: 0.07747817039489746 2023-01-22 13:54:41.260101: step: 146/464, loss: 0.0321495346724987 2023-01-22 13:54:42.029230: step: 148/464, loss: 0.04494288191199303 2023-01-22 13:54:42.788318: step: 150/464, loss: 0.05675000697374344 2023-01-22 13:54:43.437674: step: 152/464, loss: 0.10795766115188599 2023-01-22 13:54:44.168150: step: 154/464, loss: 0.2547047436237335 2023-01-22 13:54:44.805897: step: 156/464, loss: 0.018213415518403053 2023-01-22 13:54:45.507258: step: 158/464, loss: 0.027684977278113365 2023-01-22 13:54:46.297881: step: 160/464, loss: 0.04280570521950722 
2023-01-22 13:54:47.090309: step: 162/464, loss: 0.2729480266571045 2023-01-22 13:54:47.806076: step: 164/464, loss: 0.06972559541463852 2023-01-22 13:54:48.528200: step: 166/464, loss: 0.049423202872276306 2023-01-22 13:54:49.266567: step: 168/464, loss: 0.5567521452903748 2023-01-22 13:54:50.018067: step: 170/464, loss: 0.3044826090335846 2023-01-22 13:54:50.713270: step: 172/464, loss: 0.06705247610807419 2023-01-22 13:54:51.420122: step: 174/464, loss: 0.012513558380305767 2023-01-22 13:54:52.137782: step: 176/464, loss: 0.13715636730194092 2023-01-22 13:54:52.869173: step: 178/464, loss: 0.014747089706361294 2023-01-22 13:54:53.574440: step: 180/464, loss: 0.01857588067650795 2023-01-22 13:54:54.314905: step: 182/464, loss: 0.05693085864186287 2023-01-22 13:54:55.114788: step: 184/464, loss: 0.09271720796823502 2023-01-22 13:54:55.876151: step: 186/464, loss: 0.04319971427321434 2023-01-22 13:54:56.577350: step: 188/464, loss: 0.009966270998120308 2023-01-22 13:54:57.264809: step: 190/464, loss: 0.060091424733400345 2023-01-22 13:54:57.975137: step: 192/464, loss: 0.09363576769828796 2023-01-22 13:54:58.725632: step: 194/464, loss: 0.03685108199715614 2023-01-22 13:54:59.475915: step: 196/464, loss: 0.10837969183921814 2023-01-22 13:55:00.252557: step: 198/464, loss: 0.0591653436422348 2023-01-22 13:55:01.014524: step: 200/464, loss: 0.034615155309438705 2023-01-22 13:55:01.712549: step: 202/464, loss: 0.1155146136879921 2023-01-22 13:55:02.502517: step: 204/464, loss: 0.05499124899506569 2023-01-22 13:55:03.219031: step: 206/464, loss: 0.06080270931124687 2023-01-22 13:55:03.929233: step: 208/464, loss: 0.060843829065561295 2023-01-22 13:55:04.747601: step: 210/464, loss: 0.06305276602506638 2023-01-22 13:55:05.448035: step: 212/464, loss: 0.07687051594257355 2023-01-22 13:55:06.200206: step: 214/464, loss: 0.08562051504850388 2023-01-22 13:55:06.915796: step: 216/464, loss: 0.11981651932001114 2023-01-22 13:55:07.629212: step: 218/464, loss: 
0.08380502462387085 2023-01-22 13:55:08.384826: step: 220/464, loss: 0.11919360607862473 2023-01-22 13:55:09.157699: step: 222/464, loss: 0.08396501839160919 2023-01-22 13:55:09.868977: step: 224/464, loss: 0.05867454409599304 2023-01-22 13:55:10.571181: step: 226/464, loss: 0.018938075751066208 2023-01-22 13:55:11.327853: step: 228/464, loss: 0.21620342135429382 2023-01-22 13:55:12.151861: step: 230/464, loss: 0.05132962390780449 2023-01-22 13:55:12.894180: step: 232/464, loss: 0.09675484895706177 2023-01-22 13:55:13.607322: step: 234/464, loss: 0.03961890563368797 2023-01-22 13:55:14.347109: step: 236/464, loss: 0.027667952701449394 2023-01-22 13:55:15.069149: step: 238/464, loss: 0.14270053803920746 2023-01-22 13:55:15.724284: step: 240/464, loss: 0.031896259635686874 2023-01-22 13:55:16.544649: step: 242/464, loss: 0.05330142378807068 2023-01-22 13:55:17.278749: step: 244/464, loss: 0.06403126567602158 2023-01-22 13:55:17.984985: step: 246/464, loss: 0.03436197713017464 2023-01-22 13:55:18.719170: step: 248/464, loss: 0.06037990003824234 2023-01-22 13:55:19.446753: step: 250/464, loss: 0.04214172810316086 2023-01-22 13:55:20.148111: step: 252/464, loss: 0.03639375790953636 2023-01-22 13:55:20.897753: step: 254/464, loss: 0.13453000783920288 2023-01-22 13:55:21.689107: step: 256/464, loss: 0.05805106833577156 2023-01-22 13:55:22.508145: step: 258/464, loss: 0.16511701047420502 2023-01-22 13:55:23.206793: step: 260/464, loss: 0.025146618485450745 2023-01-22 13:55:23.946076: step: 262/464, loss: 0.04297715798020363 2023-01-22 13:55:24.645179: step: 264/464, loss: 0.08677572011947632 2023-01-22 13:55:25.354356: step: 266/464, loss: 0.18877673149108887 2023-01-22 13:55:26.141227: step: 268/464, loss: 0.029620453715324402 2023-01-22 13:55:26.852737: step: 270/464, loss: 0.021564066410064697 2023-01-22 13:55:27.629908: step: 272/464, loss: 0.03365343436598778 2023-01-22 13:55:28.451689: step: 274/464, loss: 0.06616527587175369 2023-01-22 13:55:29.291327: step: 
276/464, loss: 0.05282546579837799 2023-01-22 13:55:30.081020: step: 278/464, loss: 0.09444158524274826 2023-01-22 13:55:30.766213: step: 280/464, loss: 0.048270583152770996 2023-01-22 13:55:31.514540: step: 282/464, loss: 0.035702046006917953 2023-01-22 13:55:32.259017: step: 284/464, loss: 0.15187041461467743 2023-01-22 13:55:33.029484: step: 286/464, loss: 0.07321696728467941 2023-01-22 13:55:33.788255: step: 288/464, loss: 0.061114389449357986 2023-01-22 13:55:34.453522: step: 290/464, loss: 0.07513292133808136 2023-01-22 13:55:35.138471: step: 292/464, loss: 0.02930048480629921 2023-01-22 13:55:35.833462: step: 294/464, loss: 0.06662975996732712 2023-01-22 13:55:36.556682: step: 296/464, loss: 0.01861395500600338 2023-01-22 13:55:37.299700: step: 298/464, loss: 0.07548941671848297 2023-01-22 13:55:38.040661: step: 300/464, loss: 0.06447035074234009 2023-01-22 13:55:38.841099: step: 302/464, loss: 0.032488834112882614 2023-01-22 13:55:39.548224: step: 304/464, loss: 0.11268679797649384 2023-01-22 13:55:40.252364: step: 306/464, loss: 0.023737547919154167 2023-01-22 13:55:41.059327: step: 308/464, loss: 0.05261879414319992 2023-01-22 13:55:41.720375: step: 310/464, loss: 0.02802574262022972 2023-01-22 13:55:42.494480: step: 312/464, loss: 0.057686544954776764 2023-01-22 13:55:43.225973: step: 314/464, loss: 0.015248388051986694 2023-01-22 13:55:43.926161: step: 316/464, loss: 0.09674489498138428 2023-01-22 13:55:44.643861: step: 318/464, loss: 0.4312308132648468 2023-01-22 13:55:45.376575: step: 320/464, loss: 0.03816446289420128 2023-01-22 13:55:46.126419: step: 322/464, loss: 0.1555195152759552 2023-01-22 13:55:46.835726: step: 324/464, loss: 0.09488389641046524 2023-01-22 13:55:47.643795: step: 326/464, loss: 0.059349559247493744 2023-01-22 13:55:48.394582: step: 328/464, loss: 0.04325365275144577 2023-01-22 13:55:49.073578: step: 330/464, loss: 0.08058194071054459 2023-01-22 13:55:49.862288: step: 332/464, loss: 0.08156687766313553 2023-01-22 
13:55:50.602855: step: 334/464, loss: 0.055490389466285706 2023-01-22 13:55:51.282374: step: 336/464, loss: 0.03358432650566101 2023-01-22 13:55:52.018364: step: 338/464, loss: 0.09756134450435638 2023-01-22 13:55:52.752291: step: 340/464, loss: 0.2598879635334015 2023-01-22 13:55:53.491129: step: 342/464, loss: 0.09739559888839722 2023-01-22 13:55:54.196964: step: 344/464, loss: 0.0758894681930542 2023-01-22 13:55:54.889284: step: 346/464, loss: 0.016688646748661995 2023-01-22 13:55:55.519693: step: 348/464, loss: 0.07276943325996399 2023-01-22 13:55:56.297279: step: 350/464, loss: 0.4115365147590637 2023-01-22 13:55:57.079397: step: 352/464, loss: 0.6110472083091736 2023-01-22 13:55:57.772672: step: 354/464, loss: 0.02503076381981373 2023-01-22 13:55:58.482182: step: 356/464, loss: 0.04955613613128662 2023-01-22 13:55:59.242998: step: 358/464, loss: 0.036105427891016006 2023-01-22 13:55:59.943274: step: 360/464, loss: 0.0953097864985466 2023-01-22 13:56:00.553943: step: 362/464, loss: 0.05211088806390762 2023-01-22 13:56:01.349682: step: 364/464, loss: 0.06627248227596283 2023-01-22 13:56:02.065519: step: 366/464, loss: 0.29732802510261536 2023-01-22 13:56:02.780821: step: 368/464, loss: 0.07204879820346832 2023-01-22 13:56:03.479780: step: 370/464, loss: 0.08538369089365005 2023-01-22 13:56:04.176971: step: 372/464, loss: 0.0855962336063385 2023-01-22 13:56:04.884862: step: 374/464, loss: 0.34332185983657837 2023-01-22 13:56:05.594238: step: 376/464, loss: 0.11024769395589828 2023-01-22 13:56:06.317158: step: 378/464, loss: 0.022629186511039734 2023-01-22 13:56:07.032935: step: 380/464, loss: 0.0872216522693634 2023-01-22 13:56:07.834358: step: 382/464, loss: 0.07278859615325928 2023-01-22 13:56:08.717498: step: 384/464, loss: 0.0784880667924881 2023-01-22 13:56:09.363658: step: 386/464, loss: 0.06884322315454483 2023-01-22 13:56:10.090843: step: 388/464, loss: 0.07317520678043365 2023-01-22 13:56:10.771202: step: 390/464, loss: 0.07874592393636703 2023-01-22 
13:56:11.507814: step: 392/464, loss: 0.025225400924682617 2023-01-22 13:56:12.253410: step: 394/464, loss: 0.01785973832011223 2023-01-22 13:56:12.965855: step: 396/464, loss: 0.0650726780295372 2023-01-22 13:56:13.707818: step: 398/464, loss: 0.0992753878235817 2023-01-22 13:56:14.375658: step: 400/464, loss: 0.03817930445075035 2023-01-22 13:56:15.127602: step: 402/464, loss: 0.15180909633636475 2023-01-22 13:56:15.919070: step: 404/464, loss: 0.09004110097885132 2023-01-22 13:56:16.690956: step: 406/464, loss: 0.09567401558160782 2023-01-22 13:56:17.401470: step: 408/464, loss: 0.027423489838838577 2023-01-22 13:56:18.109361: step: 410/464, loss: 0.02999250404536724 2023-01-22 13:56:18.842306: step: 412/464, loss: 0.023866957053542137 2023-01-22 13:56:19.547664: step: 414/464, loss: 0.039055559784173965 2023-01-22 13:56:20.400412: step: 416/464, loss: 0.13388648629188538 2023-01-22 13:56:21.208032: step: 418/464, loss: 0.07439249008893967 2023-01-22 13:56:21.953718: step: 420/464, loss: 0.04521423205733299 2023-01-22 13:56:22.707642: step: 422/464, loss: 0.07073336094617844 2023-01-22 13:56:23.406900: step: 424/464, loss: 0.21017691493034363 2023-01-22 13:56:24.148072: step: 426/464, loss: 0.2631050646305084 2023-01-22 13:56:24.822161: step: 428/464, loss: 0.060145895928144455 2023-01-22 13:56:25.577474: step: 430/464, loss: 0.08439886569976807 2023-01-22 13:56:26.346433: step: 432/464, loss: 0.09918252378702164 2023-01-22 13:56:27.100411: step: 434/464, loss: 0.14070436358451843 2023-01-22 13:56:27.735995: step: 436/464, loss: 0.005157202482223511 2023-01-22 13:56:28.591824: step: 438/464, loss: 0.05908423289656639 2023-01-22 13:56:29.320305: step: 440/464, loss: 0.19316250085830688 2023-01-22 13:56:29.992190: step: 442/464, loss: 0.26289963722229004 2023-01-22 13:56:30.814441: step: 444/464, loss: 0.3197251558303833 2023-01-22 13:56:31.547777: step: 446/464, loss: 0.03284204378724098 2023-01-22 13:56:32.206445: step: 448/464, loss: 0.09915520995855331 
2023-01-22 13:56:32.921468: step: 450/464, loss: 0.2551385462284088 2023-01-22 13:56:33.586129: step: 452/464, loss: 0.1864646077156067 2023-01-22 13:56:34.275277: step: 454/464, loss: 0.0436815470457077 2023-01-22 13:56:34.928512: step: 456/464, loss: 0.10498335212469101 2023-01-22 13:56:35.671729: step: 458/464, loss: 0.033282823860645294 2023-01-22 13:56:36.389767: step: 460/464, loss: 0.010245820507407188 2023-01-22 13:56:37.206754: step: 462/464, loss: 0.06738439947366714 2023-01-22 13:56:37.916493: step: 464/464, loss: 0.050078701227903366 2023-01-22 13:56:38.683029: step: 466/464, loss: 0.06606537848711014 2023-01-22 13:56:39.425563: step: 468/464, loss: 0.025787794962525368 2023-01-22 13:56:40.132392: step: 470/464, loss: 0.07703365385532379 2023-01-22 13:56:40.850284: step: 472/464, loss: 0.01791863888502121 2023-01-22 13:56:41.534202: step: 474/464, loss: 0.3191978633403778 2023-01-22 13:56:42.292935: step: 476/464, loss: 1.24513578414917 2023-01-22 13:56:43.041997: step: 478/464, loss: 0.09245769679546356 2023-01-22 13:56:43.782745: step: 480/464, loss: 0.08820255845785141 2023-01-22 13:56:44.509035: step: 482/464, loss: 0.10357934981584549 2023-01-22 13:56:45.224250: step: 484/464, loss: 0.3993353843688965 2023-01-22 13:56:45.857025: step: 486/464, loss: 0.24043171107769012 2023-01-22 13:56:46.629596: step: 488/464, loss: 0.04684942960739136 2023-01-22 13:56:47.355802: step: 490/464, loss: 0.07265922427177429 2023-01-22 13:56:48.104926: step: 492/464, loss: 0.1522282212972641 2023-01-22 13:56:48.915127: step: 494/464, loss: 0.05145607516169548 2023-01-22 13:56:49.656186: step: 496/464, loss: 0.07494250684976578 2023-01-22 13:56:50.366530: step: 498/464, loss: 0.07337863743305206 2023-01-22 13:56:51.065867: step: 500/464, loss: 0.10319215059280396 2023-01-22 13:56:51.768580: step: 502/464, loss: 0.3263223171234131 2023-01-22 13:56:52.442722: step: 504/464, loss: 0.06274055689573288 2023-01-22 13:56:53.159118: step: 506/464, loss: 0.04613077640533447 
2023-01-22 13:56:53.879772: step: 508/464, loss: 0.06665505468845367 2023-01-22 13:56:54.673647: step: 510/464, loss: 0.09452710300683975 2023-01-22 13:56:55.397683: step: 512/464, loss: 0.026932209730148315 2023-01-22 13:56:56.180274: step: 514/464, loss: 0.1044914722442627 2023-01-22 13:56:56.884772: step: 516/464, loss: 0.07837115973234177 2023-01-22 13:56:57.738032: step: 518/464, loss: 0.07439406961202621 2023-01-22 13:56:58.521057: step: 520/464, loss: 0.060525111854076385 2023-01-22 13:56:59.253257: step: 522/464, loss: 0.03752945363521576 2023-01-22 13:56:59.994957: step: 524/464, loss: 0.03834892064332962 2023-01-22 13:57:00.674412: step: 526/464, loss: 0.36184078454971313 2023-01-22 13:57:01.413460: step: 528/464, loss: 0.35955408215522766 2023-01-22 13:57:02.157399: step: 530/464, loss: 0.1059010699391365 2023-01-22 13:57:02.841488: step: 532/464, loss: 0.012662368826568127 2023-01-22 13:57:03.571277: step: 534/464, loss: 0.021875306963920593 2023-01-22 13:57:04.306882: step: 536/464, loss: 0.06533759832382202 2023-01-22 13:57:05.054576: step: 538/464, loss: 0.020436787977814674 2023-01-22 13:57:05.758793: step: 540/464, loss: 0.1005106270313263 2023-01-22 13:57:06.479904: step: 542/464, loss: 0.037786103785037994 2023-01-22 13:57:07.283137: step: 544/464, loss: 0.0772073045372963 2023-01-22 13:57:08.044350: step: 546/464, loss: 0.19233551621437073 2023-01-22 13:57:08.743386: step: 548/464, loss: 0.04664510488510132 2023-01-22 13:57:09.562534: step: 550/464, loss: 0.09136329591274261 2023-01-22 13:57:10.273097: step: 552/464, loss: 0.03763750195503235 2023-01-22 13:57:11.053471: step: 554/464, loss: 0.08128439635038376 2023-01-22 13:57:11.957452: step: 556/464, loss: 0.10550633072853088 2023-01-22 13:57:12.674347: step: 558/464, loss: 0.019246075302362442 2023-01-22 13:57:13.422502: step: 560/464, loss: 0.0624995157122612 2023-01-22 13:57:14.166085: step: 562/464, loss: 1.1483707427978516 2023-01-22 13:57:14.822060: step: 564/464, loss: 
0.12313096970319748 2023-01-22 13:57:15.532242: step: 566/464, loss: 0.08051449060440063 2023-01-22 13:57:16.236863: step: 568/464, loss: 0.16188694536685944 2023-01-22 13:57:17.075223: step: 570/464, loss: 0.036759935319423676 2023-01-22 13:57:17.852635: step: 572/464, loss: 0.01996193267405033 2023-01-22 13:57:18.580235: step: 574/464, loss: 0.020887991413474083 2023-01-22 13:57:19.433229: step: 576/464, loss: 0.08122891932725906 2023-01-22 13:57:20.141157: step: 578/464, loss: 0.03574630990624428 2023-01-22 13:57:20.839583: step: 580/464, loss: 0.09584583342075348 2023-01-22 13:57:21.562924: step: 582/464, loss: 0.11320322006940842 2023-01-22 13:57:22.365915: step: 584/464, loss: 0.11420974135398865 2023-01-22 13:57:23.040585: step: 586/464, loss: 0.07629022002220154 2023-01-22 13:57:23.775253: step: 588/464, loss: 0.07765085250139236 2023-01-22 13:57:24.564756: step: 590/464, loss: 0.04749465361237526 2023-01-22 13:57:25.321081: step: 592/464, loss: 0.0955575630068779 2023-01-22 13:57:26.120570: step: 594/464, loss: 0.02726142108440399 2023-01-22 13:57:26.815310: step: 596/464, loss: 0.7329921126365662 2023-01-22 13:57:27.491324: step: 598/464, loss: 0.006696996279060841 2023-01-22 13:57:28.236655: step: 600/464, loss: 0.06922253221273422 2023-01-22 13:57:28.981333: step: 602/464, loss: 0.05629788339138031 2023-01-22 13:57:29.749915: step: 604/464, loss: 0.06038188934326172 2023-01-22 13:57:30.534611: step: 606/464, loss: 0.03784613311290741 2023-01-22 13:57:31.337355: step: 608/464, loss: 0.1631857305765152 2023-01-22 13:57:32.030499: step: 610/464, loss: 0.08317292481660843 2023-01-22 13:57:32.700870: step: 612/464, loss: 0.0531904511153698 2023-01-22 13:57:33.433095: step: 614/464, loss: 0.07334493845701218 2023-01-22 13:57:34.135620: step: 616/464, loss: 0.0606456883251667 2023-01-22 13:57:34.861989: step: 618/464, loss: 0.03547906503081322 2023-01-22 13:57:35.649026: step: 620/464, loss: 0.03521246835589409 2023-01-22 13:57:36.427502: step: 622/464, loss: 
0.05137834697961807 2023-01-22 13:57:37.242982: step: 624/464, loss: 0.03159233555197716 2023-01-22 13:57:37.929962: step: 626/464, loss: 0.015035794116556644 2023-01-22 13:57:38.797430: step: 628/464, loss: 0.071849025785923 2023-01-22 13:57:39.552993: step: 630/464, loss: 0.16488297283649445 2023-01-22 13:57:40.311493: step: 632/464, loss: 0.03442539647221565 2023-01-22 13:57:41.027469: step: 634/464, loss: 0.03260069712996483 2023-01-22 13:57:41.740523: step: 636/464, loss: 0.056695397943258286 2023-01-22 13:57:42.517801: step: 638/464, loss: 2.0955114364624023 2023-01-22 13:57:43.218506: step: 640/464, loss: 0.045452218502759933 2023-01-22 13:57:43.947461: step: 642/464, loss: 0.12092519551515579 2023-01-22 13:57:44.657452: step: 644/464, loss: 0.02838761731982231 2023-01-22 13:57:45.359633: step: 646/464, loss: 0.019557658582925797 2023-01-22 13:57:46.054974: step: 648/464, loss: 0.03526949882507324 2023-01-22 13:57:46.721484: step: 650/464, loss: 0.07882774621248245 2023-01-22 13:57:47.446076: step: 652/464, loss: 0.1663265824317932 2023-01-22 13:57:48.193354: step: 654/464, loss: 0.09100101888179779 2023-01-22 13:57:48.898756: step: 656/464, loss: 0.08623427897691727 2023-01-22 13:57:49.582812: step: 658/464, loss: 0.04138614237308502 2023-01-22 13:57:50.251451: step: 660/464, loss: 0.015468517318367958 2023-01-22 13:57:51.033790: step: 662/464, loss: 0.060640037059783936 2023-01-22 13:57:51.744324: step: 664/464, loss: 0.1918199211359024 2023-01-22 13:57:52.457950: step: 666/464, loss: 0.009935087524354458 2023-01-22 13:57:53.219608: step: 668/464, loss: 0.03264904394745827 2023-01-22 13:57:54.017738: step: 670/464, loss: 0.06861291825771332 2023-01-22 13:57:54.762812: step: 672/464, loss: 0.04679633677005768 2023-01-22 13:57:55.540112: step: 674/464, loss: 0.10642059892416 2023-01-22 13:57:56.267710: step: 676/464, loss: 0.08146971464157104 2023-01-22 13:57:56.943430: step: 678/464, loss: 0.026526009663939476 2023-01-22 13:57:57.609119: step: 680/464, 
loss: 0.13253115117549896 2023-01-22 13:57:58.245160: step: 682/464, loss: 0.15299640595912933 2023-01-22 13:57:59.052427: step: 684/464, loss: 0.16351103782653809 2023-01-22 13:57:59.800019: step: 686/464, loss: 0.11877278983592987 2023-01-22 13:58:00.556226: step: 688/464, loss: 0.08799266070127487 2023-01-22 13:58:01.261857: step: 690/464, loss: 0.01087766420096159 2023-01-22 13:58:02.043759: step: 692/464, loss: 0.17610910534858704 2023-01-22 13:58:02.764205: step: 694/464, loss: 0.0128702437505126 2023-01-22 13:58:03.468876: step: 696/464, loss: 4.251014709472656 2023-01-22 13:58:04.288276: step: 698/464, loss: 0.12636759877204895 2023-01-22 13:58:05.051127: step: 700/464, loss: 0.48409542441368103 2023-01-22 13:58:05.864901: step: 702/464, loss: 0.03458774834871292 2023-01-22 13:58:06.657146: step: 704/464, loss: 0.34720274806022644 2023-01-22 13:58:07.347192: step: 706/464, loss: 0.12886154651641846 2023-01-22 13:58:08.104315: step: 708/464, loss: 0.015398476272821426 2023-01-22 13:58:08.781991: step: 710/464, loss: 0.1452849805355072 2023-01-22 13:58:09.531215: step: 712/464, loss: 0.05062844231724739 2023-01-22 13:58:10.234316: step: 714/464, loss: 0.052139606326818466 2023-01-22 13:58:11.020719: step: 716/464, loss: 0.07707355916500092 2023-01-22 13:58:11.801221: step: 718/464, loss: 0.01662560924887657 2023-01-22 13:58:12.520309: step: 720/464, loss: 0.27249106764793396 2023-01-22 13:58:13.212223: step: 722/464, loss: 0.22377265989780426 2023-01-22 13:58:14.076369: step: 724/464, loss: 0.11260569095611572 2023-01-22 13:58:14.838954: step: 726/464, loss: 0.06264045834541321 2023-01-22 13:58:15.687797: step: 728/464, loss: 0.013300295919179916 2023-01-22 13:58:16.431784: step: 730/464, loss: 0.007667948491871357 2023-01-22 13:58:17.178817: step: 732/464, loss: 0.04861823841929436 2023-01-22 13:58:17.905107: step: 734/464, loss: 0.24045132100582123 2023-01-22 13:58:18.767321: step: 736/464, loss: 0.08063305169343948 2023-01-22 13:58:19.533978: step: 
738/464, loss: 0.07206561416387558 2023-01-22 13:58:20.257785: step: 740/464, loss: 1.408238172531128 2023-01-22 13:58:20.892689: step: 742/464, loss: 0.10014645755290985 2023-01-22 13:58:21.623070: step: 744/464, loss: 0.15454338490962982 2023-01-22 13:58:22.373819: step: 746/464, loss: 0.05530770123004913 2023-01-22 13:58:23.109622: step: 748/464, loss: 0.15269611775875092 2023-01-22 13:58:23.868471: step: 750/464, loss: 0.10810256749391556 2023-01-22 13:58:24.695814: step: 752/464, loss: 0.18739116191864014 2023-01-22 13:58:25.464989: step: 754/464, loss: 0.029451007023453712 2023-01-22 13:58:26.188208: step: 756/464, loss: 0.25769859552383423 2023-01-22 13:58:26.925136: step: 758/464, loss: 0.07550712674856186 2023-01-22 13:58:27.720131: step: 760/464, loss: 0.07921797782182693 2023-01-22 13:58:28.490811: step: 762/464, loss: 0.3808189928531647 2023-01-22 13:58:29.193097: step: 764/464, loss: 0.07425136119127274 2023-01-22 13:58:30.016874: step: 766/464, loss: 0.08345261216163635 2023-01-22 13:58:30.782375: step: 768/464, loss: 0.017589038237929344 2023-01-22 13:58:31.519617: step: 770/464, loss: 0.06161960959434509 2023-01-22 13:58:32.274951: step: 772/464, loss: 0.033091556280851364 2023-01-22 13:58:32.967117: step: 774/464, loss: 0.058344125747680664 2023-01-22 13:58:33.683425: step: 776/464, loss: 0.0743429884314537 2023-01-22 13:58:34.422424: step: 778/464, loss: 0.07232428342103958 2023-01-22 13:58:35.141336: step: 780/464, loss: 0.09190531075000763 2023-01-22 13:58:35.862340: step: 782/464, loss: 0.05084923282265663 2023-01-22 13:58:36.548168: step: 784/464, loss: 0.2766363322734833 2023-01-22 13:58:37.223404: step: 786/464, loss: 1.6356395483016968 2023-01-22 13:58:37.923539: step: 788/464, loss: 0.003092309460043907 2023-01-22 13:58:38.668894: step: 790/464, loss: 0.04528704658150673 2023-01-22 13:58:39.386872: step: 792/464, loss: 0.030916161835193634 2023-01-22 13:58:40.152857: step: 794/464, loss: 0.02986675500869751 2023-01-22 13:58:40.926036: 
step: 796/464, loss: 0.1350688338279724 2023-01-22 13:58:41.580195: step: 798/464, loss: 1.9431557655334473 2023-01-22 13:58:42.339890: step: 800/464, loss: 0.057894084602594376 2023-01-22 13:58:43.075745: step: 802/464, loss: 0.11123964190483093 2023-01-22 13:58:43.806533: step: 804/464, loss: 0.021497314795851707 2023-01-22 13:58:44.610079: step: 806/464, loss: 0.3729839622974396 2023-01-22 13:58:45.299189: step: 808/464, loss: 0.05687344819307327 2023-01-22 13:58:45.925870: step: 810/464, loss: 0.0007358321454375982 2023-01-22 13:58:46.689025: step: 812/464, loss: 0.5136083960533142 2023-01-22 13:58:47.430791: step: 814/464, loss: 0.07494036853313446 2023-01-22 13:58:48.195028: step: 816/464, loss: 0.027758443728089333 2023-01-22 13:58:48.967328: step: 818/464, loss: 0.12550386786460876 2023-01-22 13:58:49.757917: step: 820/464, loss: 0.04769080877304077 2023-01-22 13:58:50.436085: step: 822/464, loss: 0.02603987790644169 2023-01-22 13:58:51.192960: step: 824/464, loss: 0.3562179505825043 2023-01-22 13:58:51.849309: step: 826/464, loss: 0.021887611597776413 2023-01-22 13:58:52.682230: step: 828/464, loss: 0.07408808916807175 2023-01-22 13:58:53.346225: step: 830/464, loss: 1.2311229705810547 2023-01-22 13:58:54.042368: step: 832/464, loss: 0.46964800357818604 2023-01-22 13:58:54.735194: step: 834/464, loss: 0.0071813007816672325 2023-01-22 13:58:55.425613: step: 836/464, loss: 0.03797230124473572 2023-01-22 13:58:56.106332: step: 838/464, loss: 0.11793620139360428 2023-01-22 13:58:56.809725: step: 840/464, loss: 0.04512689262628555 2023-01-22 13:58:57.487913: step: 842/464, loss: 0.1048913225531578 2023-01-22 13:58:58.174485: step: 844/464, loss: 0.05974305421113968 2023-01-22 13:58:58.882502: step: 846/464, loss: 0.17225582897663116 2023-01-22 13:58:59.655855: step: 848/464, loss: 0.11195321381092072 2023-01-22 13:59:00.406417: step: 850/464, loss: 0.07324235886335373 2023-01-22 13:59:01.140684: step: 852/464, loss: 0.05328064784407616 2023-01-22 
13:59:01.918465: step: 854/464, loss: 0.06797124445438385 2023-01-22 13:59:02.662741: step: 856/464, loss: 0.1463581919670105 2023-01-22 13:59:03.442129: step: 858/464, loss: 0.1293702870607376 2023-01-22 13:59:04.158220: step: 860/464, loss: 0.05080826207995415 2023-01-22 13:59:04.911804: step: 862/464, loss: 0.04502733424305916 2023-01-22 13:59:05.683821: step: 864/464, loss: 0.8059793710708618 2023-01-22 13:59:06.420039: step: 866/464, loss: 0.2736210227012634 2023-01-22 13:59:07.145783: step: 868/464, loss: 0.04146193712949753 2023-01-22 13:59:07.874768: step: 870/464, loss: 0.035493213683366776 2023-01-22 13:59:08.673043: step: 872/464, loss: 0.13061979413032532 2023-01-22 13:59:09.333459: step: 874/464, loss: 0.04180055856704712 2023-01-22 13:59:10.066340: step: 876/464, loss: 0.9490019083023071 2023-01-22 13:59:10.909669: step: 878/464, loss: 0.14072301983833313 2023-01-22 13:59:11.776845: step: 880/464, loss: 0.051200225949287415 2023-01-22 13:59:12.577608: step: 882/464, loss: 0.06961522996425629 2023-01-22 13:59:13.300775: step: 884/464, loss: 0.11146531999111176 2023-01-22 13:59:14.072969: step: 886/464, loss: 0.17565035820007324 2023-01-22 13:59:14.814030: step: 888/464, loss: 0.11682313680648804 2023-01-22 13:59:15.644657: step: 890/464, loss: 0.07494233548641205 2023-01-22 13:59:16.408212: step: 892/464, loss: 0.06947507709264755 2023-01-22 13:59:17.119230: step: 894/464, loss: 0.041834089905023575 2023-01-22 13:59:17.815546: step: 896/464, loss: 0.07336413115262985 2023-01-22 13:59:18.574063: step: 898/464, loss: 0.010007267817854881 2023-01-22 13:59:19.279883: step: 900/464, loss: 0.07785910367965698 2023-01-22 13:59:20.022438: step: 902/464, loss: 0.14214085042476654 2023-01-22 13:59:20.755809: step: 904/464, loss: 0.7422015070915222 2023-01-22 13:59:21.391327: step: 906/464, loss: 0.08649776130914688 2023-01-22 13:59:22.133358: step: 908/464, loss: 0.017394889146089554 2023-01-22 13:59:22.820434: step: 910/464, loss: 0.19215063750743866 2023-01-22 
13:59:23.534365: step: 912/464, loss: 0.14460989832878113 2023-01-22 13:59:24.225728: step: 914/464, loss: 0.037630844861269 2023-01-22 13:59:25.057733: step: 916/464, loss: 0.09516682475805283 2023-01-22 13:59:25.855519: step: 918/464, loss: 0.03875623270869255 2023-01-22 13:59:26.516456: step: 920/464, loss: 0.07534055411815643 2023-01-22 13:59:27.181269: step: 922/464, loss: 0.010584483854472637 2023-01-22 13:59:27.914144: step: 924/464, loss: 0.12951800227165222 2023-01-22 13:59:28.574149: step: 926/464, loss: 0.09220469743013382 2023-01-22 13:59:29.319064: step: 928/464, loss: 0.06358341872692108 2023-01-22 13:59:29.902407: step: 930/464, loss: 0.0035241262521594763
==================================================
Loss: 0.135
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3144723139902327, 'r': 0.3520657784710385, 'f1': 0.33220889033883133}, 'combined': 0.244785498144402, 'epoch': 18}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29167670193268874, 'r': 0.2965764093762221, 'f1': 0.29410615020944314}, 'combined': 0.18265539855112786, 'epoch': 18}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29561765164484577, 'r': 0.34049319648656806, 'f1': 0.31647251243107827}, 'combined': 0.23319027231763662, 'epoch': 18}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2828514550053801, 'r': 0.2948817344962123, 'f1': 0.28874134002486257}, 'combined': 0.17932356906807256, 'epoch': 18}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3079470851293202, 'r': 0.34826653270792185, 'f1': 0.32686814378820095}, 'combined': 0.24085021121235858, 'epoch': 18}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3010747823987893, 'r': 0.3046483703204366, 'f1': 0.30285103480232195}, 'combined': 0.18808643214038945, 'epoch': 18}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.30357142857142855, 'r': 0.36428571428571427, 'f1': 0.3311688311688311}, 'combined': 0.22077922077922074, 'epoch': 18}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26704545454545453, 'r': 0.5108695652173914, 'f1': 0.3507462686567164}, 'combined': 0.1753731343283582, 'epoch': 18}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 18}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 19
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:02:09.361299: step: 2/464, loss: 0.12920217216014862 2023-01-22 14:02:10.119666: step: 4/464, loss: 0.04563036561012268 2023-01-22 14:02:10.912561: step: 6/464, loss: 0.09706927090883255 2023-01-22 14:02:11.598176: step: 8/464, loss: 0.03109694831073284 2023-01-22 14:02:12.368924: step: 10/464, loss: 0.036775220185518265 2023-01-22 14:02:13.111104: step: 12/464, loss: 0.18778790533542633 2023-01-22 14:02:13.843859: step: 14/464, loss: 0.0676003098487854 2023-01-22 14:02:14.538678: step: 16/464, loss: 0.8434752821922302 2023-01-22 14:02:15.225468: step: 18/464, loss: 0.11235176026821136 2023-01-22 14:02:16.031667: step: 20/464, loss: 0.06825807690620422 2023-01-22 14:02:16.790887: step: 22/464, loss: 0.006161467172205448 2023-01-22 14:02:17.527542: step: 24/464, loss: 0.04186315834522247 2023-01-22 14:02:18.211082: step: 26/464, loss: 0.03112836368381977 2023-01-22 14:02:18.933633: step: 
28/464, loss: 0.03736530616879463 2023-01-22 14:02:19.665820: step: 30/464, loss: 0.07068456709384918 2023-01-22 14:02:20.402806: step: 32/464, loss: 0.06576775014400482 2023-01-22 14:02:21.104884: step: 34/464, loss: 0.06414441019296646 2023-01-22 14:02:21.863916: step: 36/464, loss: 0.08074159920215607 2023-01-22 14:02:22.677368: step: 38/464, loss: 0.03453665226697922 2023-01-22 14:02:23.418863: step: 40/464, loss: 0.09841134399175644 2023-01-22 14:02:24.304025: step: 42/464, loss: 0.10741052031517029 2023-01-22 14:02:24.964897: step: 44/464, loss: 0.05406144633889198 2023-01-22 14:02:25.807057: step: 46/464, loss: 0.05802828073501587 2023-01-22 14:02:26.535780: step: 48/464, loss: 0.015924572944641113 2023-01-22 14:02:27.296396: step: 50/464, loss: 0.036363400518894196 2023-01-22 14:02:27.967008: step: 52/464, loss: 0.024387864395976067 2023-01-22 14:02:28.771404: step: 54/464, loss: 0.030658980831503868 2023-01-22 14:02:29.425571: step: 56/464, loss: 0.020193232223391533 2023-01-22 14:02:30.148896: step: 58/464, loss: 0.34004831314086914 2023-01-22 14:02:30.917161: step: 60/464, loss: 0.05436702072620392 2023-01-22 14:02:31.704213: step: 62/464, loss: 0.9230442047119141 2023-01-22 14:02:32.374705: step: 64/464, loss: 0.021562401205301285 2023-01-22 14:02:33.107789: step: 66/464, loss: 0.030617645010352135 2023-01-22 14:02:33.919337: step: 68/464, loss: 0.2545686960220337 2023-01-22 14:02:34.597352: step: 70/464, loss: 0.05021323636174202 2023-01-22 14:02:35.316672: step: 72/464, loss: 0.03269173204898834 2023-01-22 14:02:36.023562: step: 74/464, loss: 0.07474524527788162 2023-01-22 14:02:36.735482: step: 76/464, loss: 0.05859186127781868 2023-01-22 14:02:37.459522: step: 78/464, loss: 0.022472452372312546 2023-01-22 14:02:38.185062: step: 80/464, loss: 0.03345894068479538 2023-01-22 14:02:38.893544: step: 82/464, loss: 0.002976879244670272 2023-01-22 14:02:39.661166: step: 84/464, loss: 0.5960641503334045 2023-01-22 14:02:40.384967: step: 86/464, loss: 
0.0958690419793129 2023-01-22 14:02:41.147525: step: 88/464, loss: 0.026807444170117378 2023-01-22 14:02:41.878732: step: 90/464, loss: 0.11869814991950989 2023-01-22 14:02:42.561696: step: 92/464, loss: 0.031810395419597626 2023-01-22 14:02:43.196548: step: 94/464, loss: 0.05304112657904625 2023-01-22 14:02:43.931250: step: 96/464, loss: 0.19493931531906128 2023-01-22 14:02:44.683282: step: 98/464, loss: 0.03594636544585228 2023-01-22 14:02:45.294494: step: 100/464, loss: 0.0064991544932127 2023-01-22 14:02:45.988819: step: 102/464, loss: 0.1808507740497589 2023-01-22 14:02:46.757717: step: 104/464, loss: 0.006921856198459864 2023-01-22 14:02:47.527642: step: 106/464, loss: 0.21227648854255676 2023-01-22 14:02:48.184665: step: 108/464, loss: 0.027723848819732666 2023-01-22 14:02:48.891715: step: 110/464, loss: 0.031780317425727844 2023-01-22 14:02:49.625863: step: 112/464, loss: 0.14285974204540253 2023-01-22 14:02:50.241174: step: 114/464, loss: 0.11264093220233917 2023-01-22 14:02:51.071003: step: 116/464, loss: 0.05362096428871155 2023-01-22 14:02:51.728614: step: 118/464, loss: 0.03317618370056152 2023-01-22 14:02:52.426653: step: 120/464, loss: 0.767920196056366 2023-01-22 14:02:53.165218: step: 122/464, loss: 0.030632128939032555 2023-01-22 14:02:53.843846: step: 124/464, loss: 0.19124922156333923 2023-01-22 14:02:54.587945: step: 126/464, loss: 0.013469697907567024 2023-01-22 14:02:55.292356: step: 128/464, loss: 0.016623545438051224 2023-01-22 14:02:56.013966: step: 130/464, loss: 0.058871425688266754 2023-01-22 14:02:56.768181: step: 132/464, loss: 0.029303649440407753 2023-01-22 14:02:57.481840: step: 134/464, loss: 0.057812321931123734 2023-01-22 14:02:58.234152: step: 136/464, loss: 0.07526414096355438 2023-01-22 14:02:58.936645: step: 138/464, loss: 0.05507267266511917 2023-01-22 14:02:59.577300: step: 140/464, loss: 0.06511414051055908 2023-01-22 14:03:00.356679: step: 142/464, loss: 0.0870545282959938 2023-01-22 14:03:01.121059: step: 144/464, loss: 
0.03300241008400917 2023-01-22 14:03:01.799087: step: 146/464, loss: 0.10764817148447037 2023-01-22 14:03:02.533020: step: 148/464, loss: 0.12006808072328568 2023-01-22 14:03:03.266006: step: 150/464, loss: 0.007246334571391344 2023-01-22 14:03:04.131545: step: 152/464, loss: 0.043186988681554794 2023-01-22 14:03:04.897892: step: 154/464, loss: 0.09337171167135239 2023-01-22 14:03:05.595102: step: 156/464, loss: 1.414793610572815 2023-01-22 14:03:06.290944: step: 158/464, loss: 0.051513053476810455 2023-01-22 14:03:07.062328: step: 160/464, loss: 0.015428408049046993 2023-01-22 14:03:07.975219: step: 162/464, loss: 0.08127868920564651 2023-01-22 14:03:08.675948: step: 164/464, loss: 0.007645327132195234 2023-01-22 14:03:09.488171: step: 166/464, loss: 0.16297253966331482 2023-01-22 14:03:10.194912: step: 168/464, loss: 0.0968436524271965 2023-01-22 14:03:10.885662: step: 170/464, loss: 0.18944884836673737 2023-01-22 14:03:11.675459: step: 172/464, loss: 0.032020069658756256 2023-01-22 14:03:12.425611: step: 174/464, loss: 0.053174279630184174 2023-01-22 14:03:13.196563: step: 176/464, loss: 0.11757547408342361 2023-01-22 14:03:13.909887: step: 178/464, loss: 0.028553200885653496 2023-01-22 14:03:14.589953: step: 180/464, loss: 0.09267770498991013 2023-01-22 14:03:15.214073: step: 182/464, loss: 0.02398275025188923 2023-01-22 14:03:15.921243: step: 184/464, loss: 0.07906018197536469 2023-01-22 14:03:16.631777: step: 186/464, loss: 0.02486865594983101 2023-01-22 14:03:17.346067: step: 188/464, loss: 0.06919976323843002 2023-01-22 14:03:18.136362: step: 190/464, loss: 0.048569340258836746 2023-01-22 14:03:18.893900: step: 192/464, loss: 0.08920535445213318 2023-01-22 14:03:19.625927: step: 194/464, loss: 0.03628360852599144 2023-01-22 14:03:20.440856: step: 196/464, loss: 0.6250218749046326 2023-01-22 14:03:21.162661: step: 198/464, loss: 0.17623533308506012 2023-01-22 14:03:21.838047: step: 200/464, loss: 0.08829116821289062 2023-01-22 14:03:22.598810: step: 202/464, 
loss: 0.1636122614145279 2023-01-22 14:03:23.324143: step: 204/464, loss: 0.027989499270915985 2023-01-22 14:03:24.010812: step: 206/464, loss: 0.0427122488617897 2023-01-22 14:03:24.736315: step: 208/464, loss: 0.034912075847387314 2023-01-22 14:03:25.478043: step: 210/464, loss: 0.14765290915966034 2023-01-22 14:03:26.259845: step: 212/464, loss: 0.0036442021373659372 2023-01-22 14:03:27.037347: step: 214/464, loss: 0.09502021223306656 2023-01-22 14:03:27.772709: step: 216/464, loss: 0.06033490598201752 2023-01-22 14:03:28.435509: step: 218/464, loss: 0.01887083612382412 2023-01-22 14:03:29.203958: step: 220/464, loss: 0.08002995699644089 2023-01-22 14:03:30.055008: step: 222/464, loss: 0.09107064455747604 2023-01-22 14:03:30.813241: step: 224/464, loss: 0.05492367967963219 2023-01-22 14:03:31.575918: step: 226/464, loss: 0.03423614054918289 2023-01-22 14:03:32.397007: step: 228/464, loss: 0.3338893949985504 2023-01-22 14:03:33.087219: step: 230/464, loss: 0.09567377716302872 2023-01-22 14:03:33.780707: step: 232/464, loss: 0.05865306407213211 2023-01-22 14:03:34.496471: step: 234/464, loss: 0.07105015218257904 2023-01-22 14:03:35.292967: step: 236/464, loss: 0.05263880640268326 2023-01-22 14:03:36.040267: step: 238/464, loss: 0.0064240251667797565 2023-01-22 14:03:36.736347: step: 240/464, loss: 0.01830356940627098 2023-01-22 14:03:37.459345: step: 242/464, loss: 0.010274741798639297 2023-01-22 14:03:38.237205: step: 244/464, loss: 0.07704408466815948 2023-01-22 14:03:38.940891: step: 246/464, loss: 0.08474908024072647 2023-01-22 14:03:39.701009: step: 248/464, loss: 0.018210938200354576 2023-01-22 14:03:40.381889: step: 250/464, loss: 0.03733988106250763 2023-01-22 14:03:41.091767: step: 252/464, loss: 0.0747935101389885 2023-01-22 14:03:41.751867: step: 254/464, loss: 0.031007833778858185 2023-01-22 14:03:42.423035: step: 256/464, loss: 0.005993698723614216 2023-01-22 14:03:43.163761: step: 258/464, loss: 0.03846803680062294 2023-01-22 14:03:43.847197: step: 
260/464, loss: 0.014938733540475368 2023-01-22 14:03:44.645501: step: 262/464, loss: 0.030163099989295006 2023-01-22 14:03:45.339540: step: 264/464, loss: 0.02831958793103695 2023-01-22 14:03:46.096734: step: 266/464, loss: 0.08256911486387253 2023-01-22 14:03:46.818659: step: 268/464, loss: 0.0718783363699913 2023-01-22 14:03:47.544380: step: 270/464, loss: 0.31167250871658325 2023-01-22 14:03:48.312862: step: 272/464, loss: 0.05279042571783066 2023-01-22 14:03:49.080181: step: 274/464, loss: 0.02307545766234398 2023-01-22 14:03:49.802433: step: 276/464, loss: 0.011180113069713116 2023-01-22 14:03:50.609740: step: 278/464, loss: 0.02118949219584465 2023-01-22 14:03:51.409464: step: 280/464, loss: 0.11446698009967804 2023-01-22 14:03:52.146594: step: 282/464, loss: 0.09866126626729965 2023-01-22 14:03:52.893136: step: 284/464, loss: 0.472391277551651 2023-01-22 14:03:53.571298: step: 286/464, loss: 0.11861933022737503 2023-01-22 14:03:54.392067: step: 288/464, loss: 0.024260375648736954 2023-01-22 14:03:55.129740: step: 290/464, loss: 0.16927288472652435 2023-01-22 14:03:55.811913: step: 292/464, loss: 0.10281242430210114 2023-01-22 14:03:56.546086: step: 294/464, loss: 0.026213889941573143 2023-01-22 14:03:57.365247: step: 296/464, loss: 0.6735707521438599 2023-01-22 14:03:58.029330: step: 298/464, loss: 0.05733761563897133 2023-01-22 14:03:58.725109: step: 300/464, loss: 0.02001768723130226 2023-01-22 14:03:59.440692: step: 302/464, loss: 0.1101067066192627 2023-01-22 14:04:00.087856: step: 304/464, loss: 0.06691374629735947 2023-01-22 14:04:00.802570: step: 306/464, loss: 0.06036174297332764 2023-01-22 14:04:01.524842: step: 308/464, loss: 0.085693359375 2023-01-22 14:04:02.261035: step: 310/464, loss: 0.017201263457536697 2023-01-22 14:04:02.955906: step: 312/464, loss: 0.05184159800410271 2023-01-22 14:04:03.707277: step: 314/464, loss: 0.0776776447892189 2023-01-22 14:04:04.394436: step: 316/464, loss: 0.07553061097860336 2023-01-22 14:04:05.142625: step: 
318/464, loss: 0.3421646058559418 2023-01-22 14:04:05.880856: step: 320/464, loss: 0.9886565804481506 2023-01-22 14:04:06.619722: step: 322/464, loss: 0.13509303331375122 2023-01-22 14:04:07.450496: step: 324/464, loss: 0.02174947038292885 2023-01-22 14:04:08.250413: step: 326/464, loss: 0.055290013551712036 2023-01-22 14:04:08.957437: step: 328/464, loss: 0.0537884496152401 2023-01-22 14:04:09.649573: step: 330/464, loss: 0.04658864066004753 2023-01-22 14:04:10.431357: step: 332/464, loss: 0.26379671692848206 2023-01-22 14:04:11.215023: step: 334/464, loss: 0.2708098590373993 2023-01-22 14:04:11.949781: step: 336/464, loss: 0.02883169986307621 2023-01-22 14:04:12.623980: step: 338/464, loss: 0.045695751905441284 2023-01-22 14:04:13.310488: step: 340/464, loss: 0.019763052463531494 2023-01-22 14:04:14.018230: step: 342/464, loss: 0.07616054266691208 2023-01-22 14:04:14.736160: step: 344/464, loss: 0.06221643462777138 2023-01-22 14:04:15.473410: step: 346/464, loss: 0.07993372529745102 2023-01-22 14:04:16.187608: step: 348/464, loss: 0.02037712186574936 2023-01-22 14:04:17.035191: step: 350/464, loss: 0.17848798632621765 2023-01-22 14:04:17.718900: step: 352/464, loss: 0.06046653166413307 2023-01-22 14:04:18.427432: step: 354/464, loss: 0.05236731842160225 2023-01-22 14:04:19.159679: step: 356/464, loss: 0.019864128902554512 2023-01-22 14:04:19.948931: step: 358/464, loss: 0.025152064859867096 2023-01-22 14:04:20.686288: step: 360/464, loss: 0.18172380328178406 2023-01-22 14:04:21.407123: step: 362/464, loss: 0.06978324800729752 2023-01-22 14:04:22.143026: step: 364/464, loss: 0.09853015094995499 2023-01-22 14:04:22.853395: step: 366/464, loss: 0.051949791610240936 2023-01-22 14:04:23.537883: step: 368/464, loss: 0.0738070160150528 2023-01-22 14:04:24.224975: step: 370/464, loss: 0.017090322449803352 2023-01-22 14:04:24.901195: step: 372/464, loss: 0.04485044628381729 2023-01-22 14:04:25.705968: step: 374/464, loss: 0.09104947000741959 2023-01-22 14:04:26.440617: 
step: 376/464, loss: 0.021099206060171127 2023-01-22 14:04:27.151144: step: 378/464, loss: 0.40562692284584045 2023-01-22 14:04:27.906347: step: 380/464, loss: 0.10820455104112625 2023-01-22 14:04:28.640060: step: 382/464, loss: 0.06618161499500275 2023-01-22 14:04:29.418158: step: 384/464, loss: 0.1598353087902069 2023-01-22 14:04:30.205660: step: 386/464, loss: 0.049808088690042496 2023-01-22 14:04:30.904676: step: 388/464, loss: 0.07933302223682404 2023-01-22 14:04:31.610025: step: 390/464, loss: 0.02542533352971077 2023-01-22 14:04:32.363956: step: 392/464, loss: 0.04747779667377472 2023-01-22 14:04:33.046536: step: 394/464, loss: 0.02016112394630909 2023-01-22 14:04:33.744210: step: 396/464, loss: 0.025817180052399635 2023-01-22 14:04:34.461997: step: 398/464, loss: 0.06619580835103989 2023-01-22 14:04:35.296023: step: 400/464, loss: 0.20073483884334564 2023-01-22 14:04:36.036771: step: 402/464, loss: 0.17219051718711853 2023-01-22 14:04:36.782822: step: 404/464, loss: 0.021845456212759018 2023-01-22 14:04:37.505183: step: 406/464, loss: 0.06755460798740387 2023-01-22 14:04:38.317515: step: 408/464, loss: 0.037240903824567795 2023-01-22 14:04:39.017623: step: 410/464, loss: 0.07010459899902344 2023-01-22 14:04:39.639952: step: 412/464, loss: 0.11949370056390762 2023-01-22 14:04:40.430515: step: 414/464, loss: 0.10700497776269913 2023-01-22 14:04:41.197399: step: 416/464, loss: 0.1285242736339569 2023-01-22 14:04:41.978634: step: 418/464, loss: 0.09551849216222763 2023-01-22 14:04:42.639342: step: 420/464, loss: 0.05062438175082207 2023-01-22 14:04:43.309978: step: 422/464, loss: 0.1664879471063614 2023-01-22 14:04:44.038713: step: 424/464, loss: 0.037926722317934036 2023-01-22 14:04:44.944139: step: 426/464, loss: 0.2571927011013031 2023-01-22 14:04:45.614658: step: 428/464, loss: 0.036315422505140305 2023-01-22 14:04:46.296570: step: 430/464, loss: 0.024037066847085953 2023-01-22 14:04:47.004779: step: 432/464, loss: 13.031647682189941 2023-01-22 
14:04:47.752532: step: 434/464, loss: 0.03742430731654167 2023-01-22 14:04:48.413677: step: 436/464, loss: 0.10749435424804688 2023-01-22 14:04:49.175477: step: 438/464, loss: 0.06732063740491867 2023-01-22 14:04:49.910956: step: 440/464, loss: 0.15107282996177673 2023-01-22 14:04:50.654952: step: 442/464, loss: 0.11087916791439056 2023-01-22 14:04:51.297013: step: 444/464, loss: 0.19436655938625336 2023-01-22 14:04:52.042819: step: 446/464, loss: 0.15645796060562134 2023-01-22 14:04:52.836172: step: 448/464, loss: 0.08497386425733566 2023-01-22 14:04:53.541333: step: 450/464, loss: 0.07939328998327255 2023-01-22 14:04:54.243130: step: 452/464, loss: 0.026771867647767067 2023-01-22 14:04:54.969423: step: 454/464, loss: 0.09241552650928497 2023-01-22 14:04:55.785111: step: 456/464, loss: 0.03661811724305153 2023-01-22 14:04:56.511499: step: 458/464, loss: 0.016207559034228325 2023-01-22 14:04:57.265957: step: 460/464, loss: 0.01745665818452835 2023-01-22 14:04:58.020542: step: 462/464, loss: 0.04709574952721596 2023-01-22 14:04:58.731851: step: 464/464, loss: 0.12045050412416458 2023-01-22 14:04:59.527984: step: 466/464, loss: 0.048567306250333786 2023-01-22 14:05:00.201662: step: 468/464, loss: 0.20824171602725983 2023-01-22 14:05:00.924839: step: 470/464, loss: 0.1283740997314453 2023-01-22 14:05:01.667794: step: 472/464, loss: 0.052708856761455536 2023-01-22 14:05:02.386474: step: 474/464, loss: 0.021493423730134964 2023-01-22 14:05:03.129275: step: 476/464, loss: 0.03765876218676567 2023-01-22 14:05:03.878789: step: 478/464, loss: 0.06767209619283676 2023-01-22 14:05:04.636453: step: 480/464, loss: 0.0993877649307251 2023-01-22 14:05:05.365377: step: 482/464, loss: 0.011112421751022339 2023-01-22 14:05:06.117005: step: 484/464, loss: 0.10992801189422607 2023-01-22 14:05:06.892355: step: 486/464, loss: 0.044933851808309555 2023-01-22 14:05:07.650542: step: 488/464, loss: 0.042952749878168106 2023-01-22 14:05:08.359179: step: 490/464, loss: 0.07052365690469742 
2023-01-22 14:05:09.107238: step: 492/464, loss: 0.012943526729941368 2023-01-22 14:05:09.835454: step: 494/464, loss: 0.09424760937690735 2023-01-22 14:05:10.559128: step: 496/464, loss: 0.014014906249940395 2023-01-22 14:05:11.276667: step: 498/464, loss: 0.018368808552622795 2023-01-22 14:05:12.018404: step: 500/464, loss: 0.06157062575221062 2023-01-22 14:05:12.818211: step: 502/464, loss: 0.10935144871473312 2023-01-22 14:05:13.530573: step: 504/464, loss: 0.04085175320506096 2023-01-22 14:05:14.223798: step: 506/464, loss: 0.04039693623781204 2023-01-22 14:05:14.996512: step: 508/464, loss: 0.0954434722661972 2023-01-22 14:05:15.776430: step: 510/464, loss: 0.054345130920410156 2023-01-22 14:05:16.523254: step: 512/464, loss: 0.06101298704743385 2023-01-22 14:05:17.221562: step: 514/464, loss: 0.09147502481937408 2023-01-22 14:05:17.944900: step: 516/464, loss: 0.36184293031692505 2023-01-22 14:05:18.712765: step: 518/464, loss: 0.019469410181045532 2023-01-22 14:05:19.501369: step: 520/464, loss: 0.20030872523784637 2023-01-22 14:05:20.283318: step: 522/464, loss: 0.8680578470230103 2023-01-22 14:05:20.979302: step: 524/464, loss: 0.6503241062164307 2023-01-22 14:05:21.685895: step: 526/464, loss: 0.1460971087217331 2023-01-22 14:05:22.392184: step: 528/464, loss: 0.009627019986510277 2023-01-22 14:05:23.119370: step: 530/464, loss: 0.06993068754673004 2023-01-22 14:05:23.856554: step: 532/464, loss: 0.14353105425834656 2023-01-22 14:05:24.581923: step: 534/464, loss: 0.058041051030159 2023-01-22 14:05:25.364558: step: 536/464, loss: 0.2680377960205078 2023-01-22 14:05:26.099148: step: 538/464, loss: 0.036180466413497925 2023-01-22 14:05:26.858277: step: 540/464, loss: 0.07975853234529495 2023-01-22 14:05:27.604063: step: 542/464, loss: 0.06441237032413483 2023-01-22 14:05:28.335681: step: 544/464, loss: 0.02542603202164173 2023-01-22 14:05:28.988069: step: 546/464, loss: 0.19588123261928558 2023-01-22 14:05:29.771400: step: 548/464, loss: 
0.05832467973232269 2023-01-22 14:05:30.558685: step: 550/464, loss: 0.057443540543317795 2023-01-22 14:05:31.374396: step: 552/464, loss: 0.35340365767478943 2023-01-22 14:05:32.065342: step: 554/464, loss: 0.1496967375278473 2023-01-22 14:05:32.780833: step: 556/464, loss: 0.08306215703487396 2023-01-22 14:05:33.509162: step: 558/464, loss: 0.05035701021552086 2023-01-22 14:05:34.247671: step: 560/464, loss: 0.14342284202575684 2023-01-22 14:05:34.927100: step: 562/464, loss: 0.18788276612758636 2023-01-22 14:05:35.665216: step: 564/464, loss: 0.07673000544309616 2023-01-22 14:05:36.376944: step: 566/464, loss: 0.015498715452849865 2023-01-22 14:05:37.039339: step: 568/464, loss: 0.04238492622971535 2023-01-22 14:05:37.804581: step: 570/464, loss: 0.046905532479286194 2023-01-22 14:05:38.523862: step: 572/464, loss: 0.03854568675160408 2023-01-22 14:05:39.319153: step: 574/464, loss: 0.20472976565361023 2023-01-22 14:05:40.052537: step: 576/464, loss: 0.05854792892932892 2023-01-22 14:05:40.793705: step: 578/464, loss: 0.05327191203832626 2023-01-22 14:05:41.577076: step: 580/464, loss: 0.049676697701215744 2023-01-22 14:05:42.446918: step: 582/464, loss: 0.047542087733745575 2023-01-22 14:05:43.205470: step: 584/464, loss: 0.07915417104959488 2023-01-22 14:05:43.923998: step: 586/464, loss: 0.05390486866235733 2023-01-22 14:05:44.670459: step: 588/464, loss: 0.06481099873781204 2023-01-22 14:05:45.370408: step: 590/464, loss: 0.10352101922035217 2023-01-22 14:05:46.102933: step: 592/464, loss: 0.10283131152391434 2023-01-22 14:05:46.804104: step: 594/464, loss: 0.06312693655490875 2023-01-22 14:05:47.521548: step: 596/464, loss: 0.032400377094745636 2023-01-22 14:05:48.248618: step: 598/464, loss: 0.04646240547299385 2023-01-22 14:05:48.976613: step: 600/464, loss: 0.08965610712766647 2023-01-22 14:05:49.680607: step: 602/464, loss: 0.12899154424667358 2023-01-22 14:05:50.480491: step: 604/464, loss: 0.05910657346248627 2023-01-22 14:05:51.273897: step: 606/464, 
loss: 0.029358960688114166 2023-01-22 14:05:51.987567: step: 608/464, loss: 0.053024690598249435 2023-01-22 14:05:52.759896: step: 610/464, loss: 0.1587500274181366 2023-01-22 14:05:53.502254: step: 612/464, loss: 0.03578795865178108 2023-01-22 14:05:54.182713: step: 614/464, loss: 0.0607665553689003 2023-01-22 14:05:54.858127: step: 616/464, loss: 0.023167381063103676 2023-01-22 14:05:55.582056: step: 618/464, loss: 0.19250163435935974 2023-01-22 14:05:56.260297: step: 620/464, loss: 0.03690149635076523 2023-01-22 14:05:57.020453: step: 622/464, loss: 0.07546142488718033 2023-01-22 14:05:57.755206: step: 624/464, loss: 0.04564463347196579 2023-01-22 14:05:58.499449: step: 626/464, loss: 0.045563727617263794 2023-01-22 14:05:59.243188: step: 628/464, loss: 0.12343955785036087 2023-01-22 14:06:00.045716: step: 630/464, loss: 0.057827290147542953 2023-01-22 14:06:00.822743: step: 632/464, loss: 0.02587360516190529 2023-01-22 14:06:01.590136: step: 634/464, loss: 0.2215512990951538 2023-01-22 14:06:02.386641: step: 636/464, loss: 0.020450718700885773 2023-01-22 14:06:03.067970: step: 638/464, loss: 0.08267145603895187 2023-01-22 14:06:03.823048: step: 640/464, loss: 0.14544984698295593 2023-01-22 14:06:04.631763: step: 642/464, loss: 0.02726486511528492 2023-01-22 14:06:05.355679: step: 644/464, loss: 0.058721382170915604 2023-01-22 14:06:06.113309: step: 646/464, loss: 0.11252403259277344 2023-01-22 14:06:06.876163: step: 648/464, loss: 0.05572717636823654 2023-01-22 14:06:07.625885: step: 650/464, loss: 0.018893899396061897 2023-01-22 14:06:08.384855: step: 652/464, loss: 0.09553354978561401 2023-01-22 14:06:09.071956: step: 654/464, loss: 0.06263506412506104 2023-01-22 14:06:09.808762: step: 656/464, loss: 0.010931789875030518 2023-01-22 14:06:10.498938: step: 658/464, loss: 0.004481497686356306 2023-01-22 14:06:11.232099: step: 660/464, loss: 0.0381837897002697 2023-01-22 14:06:12.017047: step: 662/464, loss: 0.25678524374961853 2023-01-22 14:06:12.748531: step: 
664/464, loss: 0.031600549817085266 2023-01-22 14:06:13.459781: step: 666/464, loss: 0.17737647891044617 2023-01-22 14:06:14.147825: step: 668/464, loss: 0.08674076944589615 2023-01-22 14:06:14.856792: step: 670/464, loss: 0.025893433019518852 2023-01-22 14:06:15.636994: step: 672/464, loss: 0.04026762768626213 2023-01-22 14:06:16.392345: step: 674/464, loss: 0.048054732382297516 2023-01-22 14:06:17.106078: step: 676/464, loss: 0.04531051591038704 2023-01-22 14:06:17.917714: step: 678/464, loss: 0.15182489156723022 2023-01-22 14:06:18.697339: step: 680/464, loss: 0.05073189735412598 2023-01-22 14:06:19.501784: step: 682/464, loss: 0.10138265043497086 2023-01-22 14:06:20.217272: step: 684/464, loss: 0.01679018884897232 2023-01-22 14:06:20.984789: step: 686/464, loss: 0.2979933023452759 2023-01-22 14:06:21.702933: step: 688/464, loss: 0.05901891365647316 2023-01-22 14:06:22.388385: step: 690/464, loss: 0.03939371928572655 2023-01-22 14:06:23.181177: step: 692/464, loss: 0.05838843062520027 2023-01-22 14:06:24.078144: step: 694/464, loss: 0.038581814616918564 2023-01-22 14:06:24.870181: step: 696/464, loss: 1.0980371236801147 2023-01-22 14:06:25.563529: step: 698/464, loss: 0.0928528904914856 2023-01-22 14:06:26.238403: step: 700/464, loss: 0.18491260707378387 2023-01-22 14:06:26.923526: step: 702/464, loss: 0.03720412403345108 2023-01-22 14:06:27.641157: step: 704/464, loss: 0.025067999958992004 2023-01-22 14:06:28.335560: step: 706/464, loss: 0.07839228212833405 2023-01-22 14:06:29.067976: step: 708/464, loss: 0.07519470900297165 2023-01-22 14:06:29.816953: step: 710/464, loss: 0.11080680787563324 2023-01-22 14:06:30.587835: step: 712/464, loss: 0.0394955612719059 2023-01-22 14:06:31.292616: step: 714/464, loss: 0.020085155963897705 2023-01-22 14:06:32.007342: step: 716/464, loss: 0.13045024871826172 2023-01-22 14:06:32.825260: step: 718/464, loss: 0.027165353298187256 2023-01-22 14:06:33.556081: step: 720/464, loss: 0.08542001247406006 2023-01-22 14:06:34.213368: 
step: 722/464, loss: 0.2847355604171753 2023-01-22 14:06:34.969991: step: 724/464, loss: 0.03878088667988777 2023-01-22 14:06:35.727228: step: 726/464, loss: 0.49133923649787903 2023-01-22 14:06:36.417097: step: 728/464, loss: 0.046279340982437134 2023-01-22 14:06:37.133727: step: 730/464, loss: 0.030544577166438103 2023-01-22 14:06:37.940770: step: 732/464, loss: 0.03781994804739952 2023-01-22 14:06:38.676773: step: 734/464, loss: 0.057049937546253204 2023-01-22 14:06:39.437707: step: 736/464, loss: 0.03740621358156204 2023-01-22 14:06:40.238991: step: 738/464, loss: 0.10029034316539764 2023-01-22 14:06:40.877402: step: 740/464, loss: 0.012559827417135239 2023-01-22 14:06:41.662173: step: 742/464, loss: 0.016750726848840714 2023-01-22 14:06:42.381739: step: 744/464, loss: 0.2108759582042694 2023-01-22 14:06:43.102385: step: 746/464, loss: 0.035853851586580276 2023-01-22 14:06:43.774016: step: 748/464, loss: 0.43498602509498596 2023-01-22 14:06:44.488496: step: 750/464, loss: 0.13186833262443542 2023-01-22 14:06:45.199410: step: 752/464, loss: 0.06522515416145325 2023-01-22 14:06:45.899862: step: 754/464, loss: 0.030804447829723358 2023-01-22 14:06:46.584373: step: 756/464, loss: 0.16361655294895172 2023-01-22 14:06:47.319781: step: 758/464, loss: 0.20067808032035828 2023-01-22 14:06:48.067433: step: 760/464, loss: 0.8925706148147583 2023-01-22 14:06:48.776845: step: 762/464, loss: 0.08398979157209396 2023-01-22 14:06:49.542512: step: 764/464, loss: 0.08976228535175323 2023-01-22 14:06:50.274200: step: 766/464, loss: 0.024204669520258904 2023-01-22 14:06:50.960685: step: 768/464, loss: 0.04503363370895386 2023-01-22 14:06:51.627730: step: 770/464, loss: 0.08009073883295059 2023-01-22 14:06:52.459571: step: 772/464, loss: 0.10919658839702606 2023-01-22 14:06:53.169645: step: 774/464, loss: 0.09016293287277222 2023-01-22 14:06:53.877038: step: 776/464, loss: 0.13644663989543915 2023-01-22 14:06:54.641321: step: 778/464, loss: 0.07259364426136017 2023-01-22 
14:06:55.336387: step: 780/464, loss: 0.17036207020282745 2023-01-22 14:06:56.059313: step: 782/464, loss: 0.13564035296440125 2023-01-22 14:06:56.821560: step: 784/464, loss: 0.08406843990087509 2023-01-22 14:06:57.588326: step: 786/464, loss: 0.136801615357399 2023-01-22 14:06:58.368047: step: 788/464, loss: 0.13680937886238098 2023-01-22 14:06:59.233907: step: 790/464, loss: 0.09460506588220596 2023-01-22 14:06:59.982320: step: 792/464, loss: 0.13996107876300812 2023-01-22 14:07:00.705528: step: 794/464, loss: 0.06452802568674088 2023-01-22 14:07:01.496508: step: 796/464, loss: 0.1253369301557541 2023-01-22 14:07:02.215127: step: 798/464, loss: 0.04280988872051239 2023-01-22 14:07:02.935674: step: 800/464, loss: 0.022860821336507797 2023-01-22 14:07:03.642547: step: 802/464, loss: 0.05299188196659088 2023-01-22 14:07:04.450535: step: 804/464, loss: 0.13382039964199066 2023-01-22 14:07:05.153387: step: 806/464, loss: 0.14443878829479218 2023-01-22 14:07:05.922554: step: 808/464, loss: 0.033110111951828 2023-01-22 14:07:06.733375: step: 810/464, loss: 0.05904128775000572 2023-01-22 14:07:07.430913: step: 812/464, loss: 0.044445887207984924 2023-01-22 14:07:08.145112: step: 814/464, loss: 0.16334925591945648 2023-01-22 14:07:08.977345: step: 816/464, loss: 0.034357622265815735 2023-01-22 14:07:09.630873: step: 818/464, loss: 0.021517088636755943 2023-01-22 14:07:10.352846: step: 820/464, loss: 0.0479215607047081 2023-01-22 14:07:11.129669: step: 822/464, loss: 0.057476677000522614 2023-01-22 14:07:11.884164: step: 824/464, loss: 0.03518066182732582 2023-01-22 14:07:12.563057: step: 826/464, loss: 0.03787121921777725 2023-01-22 14:07:13.277883: step: 828/464, loss: 0.315551221370697 2023-01-22 14:07:13.988424: step: 830/464, loss: 0.28821465373039246 2023-01-22 14:07:14.677169: step: 832/464, loss: 0.018016284331679344 2023-01-22 14:07:15.397682: step: 834/464, loss: 0.04078349843621254 2023-01-22 14:07:16.103162: step: 836/464, loss: 0.6406852006912231 2023-01-22 
14:07:16.847622: step: 838/464, loss: 0.08561952412128448 2023-01-22 14:07:17.604503: step: 840/464, loss: 0.04118037223815918 2023-01-22 14:07:18.379294: step: 842/464, loss: 0.06859776377677917 2023-01-22 14:07:19.154294: step: 844/464, loss: 0.06416252255439758 2023-01-22 14:07:19.924582: step: 846/464, loss: 0.0812632292509079 2023-01-22 14:07:20.646553: step: 848/464, loss: 0.4164171516895294 2023-01-22 14:07:21.339543: step: 850/464, loss: 0.15695109963417053 2023-01-22 14:07:22.031930: step: 852/464, loss: 0.2507461905479431 2023-01-22 14:07:22.735592: step: 854/464, loss: 0.14173288643360138 2023-01-22 14:07:23.486420: step: 856/464, loss: 0.07978172600269318 2023-01-22 14:07:24.156554: step: 858/464, loss: 0.036435868591070175 2023-01-22 14:07:24.957555: step: 860/464, loss: 0.026326339691877365 2023-01-22 14:07:25.748736: step: 862/464, loss: 0.08801314234733582 2023-01-22 14:07:26.505199: step: 864/464, loss: 0.17660625278949738 2023-01-22 14:07:27.222932: step: 866/464, loss: 0.06663139164447784 2023-01-22 14:07:27.951737: step: 868/464, loss: 0.041380446404218674 2023-01-22 14:07:28.699470: step: 870/464, loss: 0.10867740958929062 2023-01-22 14:07:29.367211: step: 872/464, loss: 0.06529152393341064 2023-01-22 14:07:30.054705: step: 874/464, loss: 0.12430823594331741 2023-01-22 14:07:30.782709: step: 876/464, loss: 0.057288605719804764 2023-01-22 14:07:31.620277: step: 878/464, loss: 0.06806528568267822 2023-01-22 14:07:32.245201: step: 880/464, loss: 0.013468866236507893 2023-01-22 14:07:32.959591: step: 882/464, loss: 0.03978579863905907 2023-01-22 14:07:33.775677: step: 884/464, loss: 0.436234712600708 2023-01-22 14:07:34.557052: step: 886/464, loss: 0.011038696393370628 2023-01-22 14:07:35.264338: step: 888/464, loss: 0.18119382858276367 2023-01-22 14:07:36.033070: step: 890/464, loss: 0.1493879109621048 2023-01-22 14:07:36.798124: step: 892/464, loss: 0.16955973207950592 2023-01-22 14:07:37.516568: step: 894/464, loss: 0.08893486857414246 
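The per-epoch evaluation dicts in this log report, for each language, a template dict and a slot dict (p, r, f1) plus a 'combined' score. The logged numbers are self-consistent: each f1 is the usual harmonic mean of p and r, and 'combined' equals template-f1 × slot-f1. A minimal sketch verifying this against the Dev Chinese values from epoch 19 (the helper names `f1` and `combined` are illustrative, not taken from train.py):

```python
def f1(p, r):
    # Harmonic mean of precision and recall; 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) else 0.0

def combined(template, slot):
    # The logged 'combined' value matches template_f1 * slot_f1.
    return f1(template["p"], template["r"]) * f1(slot["p"], slot["r"])

# Dev Chinese, epoch 19 (p/r values copied from the log):
template = {"p": 1.0, "r": 0.5833333333333334}
slot = {"p": 0.30367177709899756, "r": 0.3503461868618416}
print(round(combined(template, slot), 5))  # → 0.23973
```

This reproduces the logged 'combined': 0.23972679496097074 for that entry.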
2023-01-22 14:07:38.201780: step: 896/464, loss: 0.04885321855545044 2023-01-22 14:07:38.988882: step: 898/464, loss: 0.03767932206392288 2023-01-22 14:07:39.768993: step: 900/464, loss: 0.22809790074825287 2023-01-22 14:07:40.629337: step: 902/464, loss: 0.07466711103916168 2023-01-22 14:07:41.378897: step: 904/464, loss: 0.08488618582487106 2023-01-22 14:07:42.215304: step: 906/464, loss: 0.1496511995792389 2023-01-22 14:07:43.055446: step: 908/464, loss: 0.03573276102542877 2023-01-22 14:07:43.720364: step: 910/464, loss: 0.05416325107216835 2023-01-22 14:07:44.374807: step: 912/464, loss: 0.04344616085290909 2023-01-22 14:07:45.162306: step: 914/464, loss: 0.07182371616363525 2023-01-22 14:07:45.878472: step: 916/464, loss: 0.021106265485286713 2023-01-22 14:07:46.565294: step: 918/464, loss: 0.2376040667295456 2023-01-22 14:07:47.300834: step: 920/464, loss: 0.08329570293426514 2023-01-22 14:07:47.950716: step: 922/464, loss: 0.030238067731261253 2023-01-22 14:07:48.597879: step: 924/464, loss: 0.05236712470650673 2023-01-22 14:07:49.273251: step: 926/464, loss: 0.05165240168571472 2023-01-22 14:07:50.001938: step: 928/464, loss: 0.05135872960090637 2023-01-22 14:07:50.678826: step: 930/464, loss: 0.03845001384615898
==================================================
Loss: 0.134
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30367177709899756, 'r': 0.3503461868618416, 'f1': 0.3253435074470317}, 'combined': 0.23972679496097074, 'epoch': 19}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29659803276724767, 'r': 0.2933741411067341, 'f1': 0.29497727848983096}, 'combined': 0.18319641506210557, 'epoch': 19}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27595524622892637, 'r': 0.3372204526972079, 'f1': 0.3035272050750274}, 'combined': 0.22365162479212544, 'epoch': 19}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.298321804105213, 'r': 0.29360525384268, 'f1': 0.29594473793704396}, 'combined': 0.1837972582977431, 'epoch': 19}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2947401118086478, 'r': 0.3456345144169722, 'f1': 0.318164871786453}, 'combined': 0.2344372739479127, 'epoch': 19}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3115117262645741, 'r': 0.2986215858674193, 'f1': 0.30493049261109717}, 'combined': 0.18937788488478668, 'epoch': 19}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2388888888888889, 'r': 0.30714285714285716, 'f1': 0.26875}, 'combined': 0.17916666666666664, 'epoch': 19}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2621951219512195, 'r': 0.4673913043478261, 'f1': 0.3359375}, 'combined': 0.16796875, 'epoch': 19}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 19}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 20
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:10:29.393709: step: 2/464, loss: 0.01712164096534252 2023-01-22 14:10:30.106070: step: 4/464, loss: 0.05320145562291145 2023-01-22 14:10:30.872230: step: 6/464, loss: 0.003598960814997554 2023-01-22 14:10:31.556279: step: 8/464, loss: 0.05244593694806099 2023-01-22 14:10:32.299721: step: 10/464, loss: 0.130269393324852 2023-01-22 14:10:33.078369: step: 12/464, loss: 
0.02787611447274685 2023-01-22 14:10:33.779682: step: 14/464, loss: 0.0313459113240242 2023-01-22 14:10:34.476167: step: 16/464, loss: 0.018669242039322853 2023-01-22 14:10:35.232580: step: 18/464, loss: 0.03460734337568283 2023-01-22 14:10:36.049966: step: 20/464, loss: 0.05031322315335274 2023-01-22 14:10:36.757332: step: 22/464, loss: 0.10367558896541595 2023-01-22 14:10:37.564193: step: 24/464, loss: 0.045394621789455414 2023-01-22 14:10:38.331854: step: 26/464, loss: 0.013776100240647793 2023-01-22 14:10:39.013840: step: 28/464, loss: 0.05621929094195366 2023-01-22 14:10:39.766715: step: 30/464, loss: 0.017570164054632187 2023-01-22 14:10:40.484557: step: 32/464, loss: 0.06579134613275528 2023-01-22 14:10:41.223747: step: 34/464, loss: 0.01885702833533287 2023-01-22 14:10:41.916323: step: 36/464, loss: 0.005795876029878855 2023-01-22 14:10:42.604435: step: 38/464, loss: 0.3700122833251953 2023-01-22 14:10:43.322599: step: 40/464, loss: 0.060654789209365845 2023-01-22 14:10:44.133227: step: 42/464, loss: 0.09823274612426758 2023-01-22 14:10:44.823019: step: 44/464, loss: 0.05419113114476204 2023-01-22 14:10:45.487487: step: 46/464, loss: 0.05794157832860947 2023-01-22 14:10:46.167865: step: 48/464, loss: 0.053394392132759094 2023-01-22 14:10:46.900659: step: 50/464, loss: 0.06811627000570297 2023-01-22 14:10:47.664545: step: 52/464, loss: 0.006924196146428585 2023-01-22 14:10:48.395040: step: 54/464, loss: 0.011313280090689659 2023-01-22 14:10:49.150545: step: 56/464, loss: 0.02086627669632435 2023-01-22 14:10:49.972679: step: 58/464, loss: 0.06859908252954483 2023-01-22 14:10:50.686375: step: 60/464, loss: 0.048081107437610626 2023-01-22 14:10:51.437719: step: 62/464, loss: 0.2862105071544647 2023-01-22 14:10:52.174360: step: 64/464, loss: 0.04191207513213158 2023-01-22 14:10:52.922806: step: 66/464, loss: 0.48177024722099304 2023-01-22 14:10:53.607350: step: 68/464, loss: 0.14845621585845947 2023-01-22 14:10:54.255981: step: 70/464, loss: 0.04261395335197449 
2023-01-22 14:10:54.922777: step: 72/464, loss: 0.1031220480799675 2023-01-22 14:10:55.819076: step: 74/464, loss: 0.03773185983300209 2023-01-22 14:10:56.515691: step: 76/464, loss: 0.0502183698117733 2023-01-22 14:10:57.273997: step: 78/464, loss: 0.047639064490795135 2023-01-22 14:10:58.072467: step: 80/464, loss: 0.054580919444561005 2023-01-22 14:10:58.838190: step: 82/464, loss: 0.05993421748280525 2023-01-22 14:10:59.593688: step: 84/464, loss: 0.06883340328931808 2023-01-22 14:11:00.269322: step: 86/464, loss: 0.15197524428367615 2023-01-22 14:11:01.037581: step: 88/464, loss: 0.11038285493850708 2023-01-22 14:11:01.712676: step: 90/464, loss: 0.00881672278046608 2023-01-22 14:11:02.445483: step: 92/464, loss: 0.0875900387763977 2023-01-22 14:11:03.177936: step: 94/464, loss: 0.3431815803050995 2023-01-22 14:11:03.853734: step: 96/464, loss: 0.017735961824655533 2023-01-22 14:11:04.616264: step: 98/464, loss: 0.05344702675938606 2023-01-22 14:11:05.322094: step: 100/464, loss: 0.05173359811306 2023-01-22 14:11:06.028165: step: 102/464, loss: 0.033794596791267395 2023-01-22 14:11:06.719457: step: 104/464, loss: 0.40232419967651367 2023-01-22 14:11:07.420672: step: 106/464, loss: 0.026458468288183212 2023-01-22 14:11:08.120147: step: 108/464, loss: 0.02863013930618763 2023-01-22 14:11:08.893027: step: 110/464, loss: 0.2328375279903412 2023-01-22 14:11:09.575347: step: 112/464, loss: 0.05186981335282326 2023-01-22 14:11:10.268573: step: 114/464, loss: 0.1496034860610962 2023-01-22 14:11:11.012282: step: 116/464, loss: 0.05353805795311928 2023-01-22 14:11:11.682296: step: 118/464, loss: 0.02698923647403717 2023-01-22 14:11:12.492178: step: 120/464, loss: 0.07385917752981186 2023-01-22 14:11:13.157031: step: 122/464, loss: 0.043923210352659225 2023-01-22 14:11:13.930745: step: 124/464, loss: 0.08073197305202484 2023-01-22 14:11:14.634955: step: 126/464, loss: 0.008128171786665916 2023-01-22 14:11:15.345386: step: 128/464, loss: 0.08548290282487869 2023-01-22 
14:11:16.058928: step: 130/464, loss: 0.00616619223728776 2023-01-22 14:11:16.878307: step: 132/464, loss: 0.13567934930324554 2023-01-22 14:11:17.559247: step: 134/464, loss: 0.004059022758156061 2023-01-22 14:11:18.368876: step: 136/464, loss: 0.052950870245695114 2023-01-22 14:11:19.139429: step: 138/464, loss: 0.0192416962236166 2023-01-22 14:11:19.883536: step: 140/464, loss: 0.052168261259794235 2023-01-22 14:11:20.663692: step: 142/464, loss: 0.0380428284406662 2023-01-22 14:11:21.423496: step: 144/464, loss: 0.034303802996873856 2023-01-22 14:11:22.172312: step: 146/464, loss: 0.13178157806396484 2023-01-22 14:11:22.919317: step: 148/464, loss: 0.0370471328496933 2023-01-22 14:11:23.682321: step: 150/464, loss: 0.07469688355922699 2023-01-22 14:11:24.410384: step: 152/464, loss: 0.02738485112786293 2023-01-22 14:11:25.098716: step: 154/464, loss: 0.03580975532531738 2023-01-22 14:11:25.839978: step: 156/464, loss: 0.07651842385530472 2023-01-22 14:11:26.584816: step: 158/464, loss: 0.22850660979747772 2023-01-22 14:11:27.318082: step: 160/464, loss: 0.12235674262046814 2023-01-22 14:11:28.039142: step: 162/464, loss: 0.013990739360451698 2023-01-22 14:11:28.769209: step: 164/464, loss: 0.0508531779050827 2023-01-22 14:11:29.444266: step: 166/464, loss: 0.005535896867513657 2023-01-22 14:11:30.176123: step: 168/464, loss: 0.05309152603149414 2023-01-22 14:11:30.932370: step: 170/464, loss: 0.023079494014382362 2023-01-22 14:11:31.653247: step: 172/464, loss: 0.14203958213329315 2023-01-22 14:11:32.370186: step: 174/464, loss: 0.020587345585227013 2023-01-22 14:11:33.080071: step: 176/464, loss: 0.39844274520874023 2023-01-22 14:11:33.758549: step: 178/464, loss: 0.03402119129896164 2023-01-22 14:11:34.493172: step: 180/464, loss: 0.16260108351707458 2023-01-22 14:11:35.239100: step: 182/464, loss: 0.03452138602733612 2023-01-22 14:11:36.045313: step: 184/464, loss: 0.16593848168849945 2023-01-22 14:11:36.843066: step: 186/464, loss: 0.08963475376367569 
2023-01-22 14:11:37.603518: step: 188/464, loss: 0.023618275299668312 2023-01-22 14:11:38.317366: step: 190/464, loss: 0.024404142051935196 2023-01-22 14:11:39.007827: step: 192/464, loss: 0.005555496551096439 2023-01-22 14:11:39.765111: step: 194/464, loss: 0.06363875418901443 2023-01-22 14:11:40.460043: step: 196/464, loss: 0.05725649371743202 2023-01-22 14:11:41.252924: step: 198/464, loss: 0.32381248474121094 2023-01-22 14:11:41.951300: step: 200/464, loss: 0.014517308212816715 2023-01-22 14:11:42.653644: step: 202/464, loss: 0.050813108682632446 2023-01-22 14:11:43.333042: step: 204/464, loss: 0.10932115465402603 2023-01-22 14:11:44.212857: step: 206/464, loss: 0.04808484762907028 2023-01-22 14:11:44.917445: step: 208/464, loss: 0.046366311609745026 2023-01-22 14:11:45.632771: step: 210/464, loss: 0.12464817613363266 2023-01-22 14:11:46.349963: step: 212/464, loss: 0.033209338784217834 2023-01-22 14:11:47.064133: step: 214/464, loss: 0.044971611350774765 2023-01-22 14:11:47.859943: step: 216/464, loss: 0.11272618174552917 2023-01-22 14:11:48.581385: step: 218/464, loss: 0.10156702995300293 2023-01-22 14:11:49.362011: step: 220/464, loss: 0.04471761733293533 2023-01-22 14:11:50.094203: step: 222/464, loss: 0.026782341301441193 2023-01-22 14:11:50.787425: step: 224/464, loss: 0.01698833890259266 2023-01-22 14:11:51.544760: step: 226/464, loss: 0.322632372379303 2023-01-22 14:11:52.342869: step: 228/464, loss: 0.09721288830041885 2023-01-22 14:11:53.086808: step: 230/464, loss: 0.059638142585754395 2023-01-22 14:11:53.783659: step: 232/464, loss: 0.018348954617977142 2023-01-22 14:11:54.617855: step: 234/464, loss: 0.028666546568274498 2023-01-22 14:11:55.279258: step: 236/464, loss: 0.03871145844459534 2023-01-22 14:11:55.961050: step: 238/464, loss: 0.011665408499538898 2023-01-22 14:11:56.639581: step: 240/464, loss: 0.04983703792095184 2023-01-22 14:11:57.351410: step: 242/464, loss: 0.05028875172138214 2023-01-22 14:11:58.023939: step: 244/464, loss: 
0.003809570102021098 2023-01-22 14:11:58.818516: step: 246/464, loss: 0.09055406600236893 2023-01-22 14:11:59.527722: step: 248/464, loss: 0.028566665947437286 2023-01-22 14:12:00.334728: step: 250/464, loss: 0.10694783926010132 2023-01-22 14:12:01.091917: step: 252/464, loss: 0.6403794288635254 2023-01-22 14:12:01.761888: step: 254/464, loss: 0.02813796140253544 2023-01-22 14:12:02.559133: step: 256/464, loss: 0.044382158666849136 2023-01-22 14:12:03.349634: step: 258/464, loss: 0.021823404356837273 2023-01-22 14:12:03.992760: step: 260/464, loss: 0.0053736069239676 2023-01-22 14:12:04.756162: step: 262/464, loss: 1.5771262645721436 2023-01-22 14:12:05.429925: step: 264/464, loss: 0.01660967990756035 2023-01-22 14:12:06.188565: step: 266/464, loss: 0.0449100024998188 2023-01-22 14:12:06.999277: step: 268/464, loss: 0.056982915848493576 2023-01-22 14:12:07.732786: step: 270/464, loss: 0.17105650901794434 2023-01-22 14:12:08.433279: step: 272/464, loss: 0.004213482141494751 2023-01-22 14:12:09.136756: step: 274/464, loss: 0.1180211529135704 2023-01-22 14:12:09.966977: step: 276/464, loss: 0.1284855455160141 2023-01-22 14:12:10.747849: step: 278/464, loss: 0.09054433554410934 2023-01-22 14:12:11.469205: step: 280/464, loss: 0.07391920685768127 2023-01-22 14:12:12.235936: step: 282/464, loss: 0.09057587385177612 2023-01-22 14:12:12.992624: step: 284/464, loss: 0.006217170972377062 2023-01-22 14:12:13.761178: step: 286/464, loss: 0.06684119254350662 2023-01-22 14:12:14.468113: step: 288/464, loss: 0.11914774030447006 2023-01-22 14:12:15.244743: step: 290/464, loss: 0.02932037226855755 2023-01-22 14:12:15.948820: step: 292/464, loss: 0.04236677289009094 2023-01-22 14:12:16.691193: step: 294/464, loss: 0.05178089812397957 2023-01-22 14:12:17.420524: step: 296/464, loss: 0.06522136926651001 2023-01-22 14:12:18.188401: step: 298/464, loss: 0.13979460299015045 2023-01-22 14:12:18.925261: step: 300/464, loss: 0.017481815069913864 2023-01-22 14:12:19.680995: step: 302/464, 
loss: 0.03961453214287758 2023-01-22 14:12:20.402880: step: 304/464, loss: 0.31653302907943726 2023-01-22 14:12:21.206146: step: 306/464, loss: 0.05984179303050041 2023-01-22 14:12:21.980381: step: 308/464, loss: 0.039903104305267334 2023-01-22 14:12:22.692288: step: 310/464, loss: 0.27464759349823 2023-01-22 14:12:23.388025: step: 312/464, loss: 0.11150558292865753 2023-01-22 14:12:24.116416: step: 314/464, loss: 0.046920839697122574 2023-01-22 14:12:24.874140: step: 316/464, loss: 0.011616310104727745 2023-01-22 14:12:25.709869: step: 318/464, loss: 0.06396178901195526 2023-01-22 14:12:26.406252: step: 320/464, loss: 0.22413629293441772 2023-01-22 14:12:27.094014: step: 322/464, loss: 0.057452570647001266 2023-01-22 14:12:27.814852: step: 324/464, loss: 0.06499453634023666 2023-01-22 14:12:28.630131: step: 326/464, loss: 0.027838967740535736 2023-01-22 14:12:29.373894: step: 328/464, loss: 0.4065115451812744 2023-01-22 14:12:30.068926: step: 330/464, loss: 0.03231246396899223 2023-01-22 14:12:30.859394: step: 332/464, loss: 0.2747350335121155 2023-01-22 14:12:31.562342: step: 334/464, loss: 0.753449559211731 2023-01-22 14:12:32.358465: step: 336/464, loss: 0.2993898391723633 2023-01-22 14:12:33.071184: step: 338/464, loss: 0.20569021999835968 2023-01-22 14:12:33.854116: step: 340/464, loss: 0.5489237904548645 2023-01-22 14:12:34.530776: step: 342/464, loss: 0.09590140730142593 2023-01-22 14:12:35.303348: step: 344/464, loss: 0.04386964812874794 2023-01-22 14:12:35.982035: step: 346/464, loss: 0.05596918985247612 2023-01-22 14:12:36.729476: step: 348/464, loss: 0.08769191056489944 2023-01-22 14:12:37.474433: step: 350/464, loss: 0.05492006987333298 2023-01-22 14:12:38.132752: step: 352/464, loss: 0.055251482874155045 2023-01-22 14:12:38.878311: step: 354/464, loss: 0.02360299788415432 2023-01-22 14:12:39.618684: step: 356/464, loss: 0.059911515563726425 2023-01-22 14:12:40.312393: step: 358/464, loss: 0.017682574689388275 2023-01-22 14:12:41.040319: step: 360/464, 
loss: 0.013219114392995834 2023-01-22 14:12:41.872588: step: 362/464, loss: 0.04908164218068123 2023-01-22 14:12:42.616257: step: 364/464, loss: 0.17003220319747925 2023-01-22 14:12:43.298661: step: 366/464, loss: 0.021513281390070915 2023-01-22 14:12:43.983740: step: 368/464, loss: 0.02743382751941681 2023-01-22 14:12:44.770537: step: 370/464, loss: 0.02264592982828617 2023-01-22 14:12:45.522556: step: 372/464, loss: 0.015140403062105179 2023-01-22 14:12:46.227755: step: 374/464, loss: 0.06058161333203316 2023-01-22 14:12:46.951621: step: 376/464, loss: 0.4133065938949585 2023-01-22 14:12:47.638038: step: 378/464, loss: 0.1243203729391098 2023-01-22 14:12:48.324763: step: 380/464, loss: 0.14328400790691376 2023-01-22 14:12:49.095930: step: 382/464, loss: 0.02986525557935238 2023-01-22 14:12:49.893328: step: 384/464, loss: 0.15979009866714478 2023-01-22 14:12:50.746982: step: 386/464, loss: 0.020061299204826355 2023-01-22 14:12:51.494655: step: 388/464, loss: 0.0972837433218956 2023-01-22 14:12:52.216912: step: 390/464, loss: 0.040774136781692505 2023-01-22 14:12:52.894305: step: 392/464, loss: 0.03275227174162865 2023-01-22 14:12:53.574166: step: 394/464, loss: 0.035346053540706635 2023-01-22 14:12:54.343891: step: 396/464, loss: 0.05919338017702103 2023-01-22 14:12:55.115654: step: 398/464, loss: 0.06316707283258438 2023-01-22 14:12:55.853256: step: 400/464, loss: 0.007029821164906025 2023-01-22 14:12:56.705342: step: 402/464, loss: 0.033194344490766525 2023-01-22 14:12:57.526938: step: 404/464, loss: 0.08716045320034027 2023-01-22 14:12:58.343474: step: 406/464, loss: 0.14590072631835938 2023-01-22 14:12:59.038695: step: 408/464, loss: 0.07667405158281326 2023-01-22 14:12:59.717454: step: 410/464, loss: 0.9854905009269714 2023-01-22 14:13:00.553685: step: 412/464, loss: 0.04124729707837105 2023-01-22 14:13:01.321440: step: 414/464, loss: 0.4641401171684265 2023-01-22 14:13:02.026306: step: 416/464, loss: 0.05213777348399162 2023-01-22 14:13:02.760495: step: 
418/464, loss: 0.041891228407621384 2023-01-22 14:13:03.482460: step: 420/464, loss: 0.10602536052465439 2023-01-22 14:13:04.175214: step: 422/464, loss: 0.031055880710482597 2023-01-22 14:13:04.924890: step: 424/464, loss: 0.046828266233205795 2023-01-22 14:13:05.578491: step: 426/464, loss: 0.016272274777293205 2023-01-22 14:13:06.288662: step: 428/464, loss: 0.041486937552690506 2023-01-22 14:13:06.993180: step: 430/464, loss: 0.11460515856742859 2023-01-22 14:13:07.694630: step: 432/464, loss: 0.04825626313686371 2023-01-22 14:13:08.420142: step: 434/464, loss: 0.016949543729424477 2023-01-22 14:13:09.194111: step: 436/464, loss: 0.11145293712615967 2023-01-22 14:13:09.899386: step: 438/464, loss: 0.025824761018157005 2023-01-22 14:13:10.607223: step: 440/464, loss: 0.2621608078479767 2023-01-22 14:13:11.288744: step: 442/464, loss: 0.058782532811164856 2023-01-22 14:13:12.129589: step: 444/464, loss: 0.09873618185520172 2023-01-22 14:13:12.828685: step: 446/464, loss: 0.03907536715269089 2023-01-22 14:13:13.506080: step: 448/464, loss: 0.11768066138029099 2023-01-22 14:13:14.271182: step: 450/464, loss: 0.016299933195114136 2023-01-22 14:13:14.955620: step: 452/464, loss: 0.07312924414873123 2023-01-22 14:13:15.717747: step: 454/464, loss: 0.013142102397978306 2023-01-22 14:13:16.420292: step: 456/464, loss: 0.015804991126060486 2023-01-22 14:13:17.242162: step: 458/464, loss: 0.07611986994743347 2023-01-22 14:13:18.076775: step: 460/464, loss: 0.027487069368362427 2023-01-22 14:13:18.767146: step: 462/464, loss: 0.013240874744951725 2023-01-22 14:13:19.554711: step: 464/464, loss: 0.10774283111095428 2023-01-22 14:13:20.274891: step: 466/464, loss: 0.026716547086834908 2023-01-22 14:13:20.995861: step: 468/464, loss: 0.12575234472751617 2023-01-22 14:13:21.756975: step: 470/464, loss: 0.015290974639356136 2023-01-22 14:13:22.502052: step: 472/464, loss: 0.017766600474715233 2023-01-22 14:13:23.270587: step: 474/464, loss: 0.07921173423528671 2023-01-22 
14:13:24.013016: step: 476/464, loss: 0.017183203250169754 2023-01-22 14:13:24.709540: step: 478/464, loss: 0.06559363752603531 2023-01-22 14:13:25.483166: step: 480/464, loss: 0.06318671256303787 2023-01-22 14:13:26.170878: step: 482/464, loss: 0.008761835284531116 2023-01-22 14:13:26.876480: step: 484/464, loss: 0.23408980667591095 2023-01-22 14:13:27.657157: step: 486/464, loss: 0.05240677669644356 2023-01-22 14:13:28.479033: step: 488/464, loss: 0.11231359094381332 2023-01-22 14:13:29.248928: step: 490/464, loss: 0.029335487633943558 2023-01-22 14:13:29.875460: step: 492/464, loss: 0.002755326684564352 2023-01-22 14:13:30.578105: step: 494/464, loss: 0.017630210146307945 2023-01-22 14:13:31.289498: step: 496/464, loss: 0.09911082684993744 2023-01-22 14:13:31.980853: step: 498/464, loss: 0.028314009308815002 2023-01-22 14:13:32.736285: step: 500/464, loss: 0.0582076795399189 2023-01-22 14:13:33.543438: step: 502/464, loss: 0.011484694667160511 2023-01-22 14:13:34.231295: step: 504/464, loss: 0.026740385219454765 2023-01-22 14:13:34.923367: step: 506/464, loss: 0.031355928629636765 2023-01-22 14:13:35.682813: step: 508/464, loss: 0.02741563692688942 2023-01-22 14:13:36.452417: step: 510/464, loss: 0.03371531143784523 2023-01-22 14:13:37.233179: step: 512/464, loss: 0.02765187807381153 2023-01-22 14:13:37.932444: step: 514/464, loss: 0.01636318862438202 2023-01-22 14:13:38.756273: step: 516/464, loss: 0.058873556554317474 2023-01-22 14:13:39.470808: step: 518/464, loss: 0.06401578336954117 2023-01-22 14:13:40.224004: step: 520/464, loss: 0.03372131660580635 2023-01-22 14:13:40.992433: step: 522/464, loss: 0.05860723555088043 2023-01-22 14:13:41.698203: step: 524/464, loss: 0.014996029436588287 2023-01-22 14:13:42.465869: step: 526/464, loss: 0.07337082922458649 2023-01-22 14:13:43.190812: step: 528/464, loss: 0.29840704798698425 2023-01-22 14:13:43.859595: step: 530/464, loss: 0.1524370163679123 2023-01-22 14:13:44.585931: step: 532/464, loss: 0.03650538623332977 
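The per-step records in this log all share one layout (`timestamp: step: N/464, loss: value`), and the epoch-summary `Loss:` line appears to be the average of those per-step losses, so both can be recovered from the raw text with a single regex. A sketch, assuming only the record format visible in this log (`mean_loss` is an illustrative helper, not part of train.py):

```python
import re

# One match per step record; group 2 is the loss value.
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.]+)")

def mean_loss(log_text):
    # Extract every per-step loss from the flowed log text and average them.
    losses = [float(m.group(2)) for m in STEP_RE.finditer(log_text)]
    return sum(losses) / len(losses) if losses else float("nan")

sample = ("2023-01-22 14:13:45.282077: step: 534/464, loss: 0.25 "
          "2023-01-22 14:13:45.970489: step: 536/464, loss: 0.75")
print(mean_loss(sample))  # → 0.5
```

Note that the step counter can exceed the nominal 464 per epoch (e.g. `step: 930/464` above), so the regex deliberately ignores the denominator.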
2023-01-22 14:13:45.282077: step: 534/464, loss: 0.2502961754798889 2023-01-22 14:13:45.970489: step: 536/464, loss: 0.24290591478347778 2023-01-22 14:13:46.754388: step: 538/464, loss: 0.08531565964221954 2023-01-22 14:13:47.506739: step: 540/464, loss: 0.047006767243146896 2023-01-22 14:13:48.198269: step: 542/464, loss: 0.07336611300706863 2023-01-22 14:13:48.906508: step: 544/464, loss: 0.053565382957458496 2023-01-22 14:13:49.663456: step: 546/464, loss: 0.08777978271245956 2023-01-22 14:13:50.390526: step: 548/464, loss: 0.16474194824695587 2023-01-22 14:13:51.129670: step: 550/464, loss: 0.08033395558595657 2023-01-22 14:13:51.891341: step: 552/464, loss: 0.10461867600679398 2023-01-22 14:13:52.682695: step: 554/464, loss: 0.07356631755828857 2023-01-22 14:13:53.391163: step: 556/464, loss: 0.004075208678841591 2023-01-22 14:13:54.221117: step: 558/464, loss: 0.050737012177705765 2023-01-22 14:13:54.995417: step: 560/464, loss: 0.0372830405831337 2023-01-22 14:13:55.714964: step: 562/464, loss: 0.17407050728797913 2023-01-22 14:13:56.503828: step: 564/464, loss: 0.017158811911940575 2023-01-22 14:13:57.217329: step: 566/464, loss: 0.05447883531451225 2023-01-22 14:13:57.928716: step: 568/464, loss: 0.026156453415751457 2023-01-22 14:13:58.647420: step: 570/464, loss: 0.1320725530385971 2023-01-22 14:13:59.331227: step: 572/464, loss: 0.06671235710382462 2023-01-22 14:14:00.098524: step: 574/464, loss: 0.06426624208688736 2023-01-22 14:14:00.885104: step: 576/464, loss: 0.11161121726036072 2023-01-22 14:14:01.683839: step: 578/464, loss: 0.0519336573779583 2023-01-22 14:14:02.509774: step: 580/464, loss: 0.12640352547168732 2023-01-22 14:14:03.303745: step: 582/464, loss: 0.06665212661027908 2023-01-22 14:14:04.018458: step: 584/464, loss: 0.05036737024784088 2023-01-22 14:14:04.797190: step: 586/464, loss: 0.04287680611014366 2023-01-22 14:14:05.512935: step: 588/464, loss: 0.06328330934047699 2023-01-22 14:14:06.297701: step: 590/464, loss: 
0.019768400117754936 2023-01-22 14:14:06.964446: step: 592/464, loss: 0.035888101905584335 2023-01-22 14:14:07.700948: step: 594/464, loss: 7.311809539794922 2023-01-22 14:14:08.454124: step: 596/464, loss: 1.0828490257263184 2023-01-22 14:14:09.310033: step: 598/464, loss: 0.02576698176562786 2023-01-22 14:14:10.031571: step: 600/464, loss: 0.028343813493847847 2023-01-22 14:14:10.839136: step: 602/464, loss: 0.04987442120909691 2023-01-22 14:14:11.555426: step: 604/464, loss: 0.03227812796831131 2023-01-22 14:14:12.265000: step: 606/464, loss: 0.06706904619932175 2023-01-22 14:14:12.965155: step: 608/464, loss: 0.11491075903177261 2023-01-22 14:14:13.705247: step: 610/464, loss: 0.01785075105726719 2023-01-22 14:14:14.439670: step: 612/464, loss: 0.3231183588504791 2023-01-22 14:14:15.169651: step: 614/464, loss: 0.061851970851421356 2023-01-22 14:14:15.907702: step: 616/464, loss: 0.05380820855498314 2023-01-22 14:14:16.630248: step: 618/464, loss: 0.12674058973789215 2023-01-22 14:14:17.366923: step: 620/464, loss: 0.08594769239425659 2023-01-22 14:14:18.117640: step: 622/464, loss: 0.007985075004398823 2023-01-22 14:14:18.867960: step: 624/464, loss: 0.07399801164865494 2023-01-22 14:14:19.574128: step: 626/464, loss: 0.20825162529945374 2023-01-22 14:14:20.309252: step: 628/464, loss: 0.019077161327004433 2023-01-22 14:14:21.075491: step: 630/464, loss: 0.0259280726313591 2023-01-22 14:14:21.762230: step: 632/464, loss: 0.18837212026119232 2023-01-22 14:14:22.545619: step: 634/464, loss: 0.17273612320423126 2023-01-22 14:14:23.287472: step: 636/464, loss: 0.03467350825667381 2023-01-22 14:14:23.915931: step: 638/464, loss: 0.07087120413780212 2023-01-22 14:14:24.659189: step: 640/464, loss: 0.027575084939599037 2023-01-22 14:14:25.324412: step: 642/464, loss: 0.01259750034660101 2023-01-22 14:14:26.116230: step: 644/464, loss: 0.06660866737365723 2023-01-22 14:14:26.833729: step: 646/464, loss: 0.03903059661388397 2023-01-22 14:14:27.588004: step: 648/464, 
loss: 0.24814175069332123 2023-01-22 14:14:28.378435: step: 650/464, loss: 0.0645311027765274 2023-01-22 14:14:29.099108: step: 652/464, loss: 0.018755216151475906 2023-01-22 14:14:29.814019: step: 654/464, loss: 0.04339805245399475 2023-01-22 14:14:30.578158: step: 656/464, loss: 0.09537842124700546 2023-01-22 14:14:31.291866: step: 658/464, loss: 0.76046222448349 2023-01-22 14:14:32.013890: step: 660/464, loss: 0.037168506532907486 2023-01-22 14:14:32.700460: step: 662/464, loss: 0.06753918528556824 2023-01-22 14:14:33.431053: step: 664/464, loss: 0.036744534969329834 2023-01-22 14:14:34.275774: step: 666/464, loss: 0.13704568147659302 2023-01-22 14:14:35.024055: step: 668/464, loss: 0.05090853571891785 2023-01-22 14:14:35.752962: step: 670/464, loss: 0.04945719987154007 2023-01-22 14:14:36.476395: step: 672/464, loss: 0.44987764954566956 2023-01-22 14:14:37.201847: step: 674/464, loss: 0.10461324453353882 2023-01-22 14:14:37.958461: step: 676/464, loss: 0.08109944313764572 2023-01-22 14:14:38.613739: step: 678/464, loss: 0.04540938511490822 2023-01-22 14:14:39.333434: step: 680/464, loss: 0.019449058920145035 2023-01-22 14:14:40.089734: step: 682/464, loss: 0.052725568413734436 2023-01-22 14:14:40.776034: step: 684/464, loss: 0.0179046131670475 2023-01-22 14:14:41.604738: step: 686/464, loss: 0.01967434585094452 2023-01-22 14:14:42.284762: step: 688/464, loss: 0.0413089245557785 2023-01-22 14:14:42.924524: step: 690/464, loss: 0.06626380234956741 2023-01-22 14:14:43.654639: step: 692/464, loss: 1.3306009769439697 2023-01-22 14:14:44.389131: step: 694/464, loss: 0.01851009391248226 2023-01-22 14:14:45.100738: step: 696/464, loss: 0.00975774321705103 2023-01-22 14:14:45.874486: step: 698/464, loss: 0.04369215667247772 2023-01-22 14:14:46.577304: step: 700/464, loss: 0.03224438801407814 2023-01-22 14:14:47.256722: step: 702/464, loss: 0.03615008294582367 2023-01-22 14:14:47.944073: step: 704/464, loss: 0.9618604183197021 2023-01-22 14:14:48.602862: step: 706/464, 
loss: 0.030223244801163673 2023-01-22 14:14:49.288883: step: 708/464, loss: 0.052093807607889175 2023-01-22 14:14:49.983604: step: 710/464, loss: 0.1564226597547531 2023-01-22 14:14:50.728957: step: 712/464, loss: 0.038040641695261 2023-01-22 14:14:51.446588: step: 714/464, loss: 0.06683960556983948 2023-01-22 14:14:52.173721: step: 716/464, loss: 0.03439417853951454 2023-01-22 14:14:52.938652: step: 718/464, loss: 0.012578189373016357 2023-01-22 14:14:53.717249: step: 720/464, loss: 0.011915381997823715 2023-01-22 14:14:54.497870: step: 722/464, loss: 0.11274097114801407 2023-01-22 14:14:55.208162: step: 724/464, loss: 0.01812215894460678 2023-01-22 14:14:55.907165: step: 726/464, loss: 0.05313228443264961 2023-01-22 14:14:56.614937: step: 728/464, loss: 0.0890948623418808 2023-01-22 14:14:57.292173: step: 730/464, loss: 0.3860137462615967 2023-01-22 14:14:58.012115: step: 732/464, loss: 0.024147091433405876 2023-01-22 14:14:58.733590: step: 734/464, loss: 0.0798778086900711 2023-01-22 14:14:59.480317: step: 736/464, loss: 0.191580131649971 2023-01-22 14:15:00.223774: step: 738/464, loss: 0.07451540231704712 2023-01-22 14:15:00.950335: step: 740/464, loss: 0.044656287878751755 2023-01-22 14:15:01.642234: step: 742/464, loss: 0.26471590995788574 2023-01-22 14:15:02.388632: step: 744/464, loss: 0.08491283655166626 2023-01-22 14:15:03.192992: step: 746/464, loss: 0.04643037170171738 2023-01-22 14:15:03.994942: step: 748/464, loss: 0.24509261548519135 2023-01-22 14:15:04.636379: step: 750/464, loss: 0.04010670259594917 2023-01-22 14:15:05.370718: step: 752/464, loss: 0.04671388119459152 2023-01-22 14:15:06.155176: step: 754/464, loss: 0.10081122815608978 2023-01-22 14:15:06.903396: step: 756/464, loss: 0.17207136750221252 2023-01-22 14:15:07.707650: step: 758/464, loss: 0.055953122675418854 2023-01-22 14:15:08.478176: step: 760/464, loss: 0.056002210825681686 2023-01-22 14:15:09.157996: step: 762/464, loss: 0.038005270063877106 2023-01-22 14:15:09.871752: step: 
764/464, loss: 0.26270192861557007 2023-01-22 14:15:10.565288: step: 766/464, loss: 0.09764519333839417 2023-01-22 14:15:11.371012: step: 768/464, loss: 0.06416065245866776 2023-01-22 14:15:12.173728: step: 770/464, loss: 0.05219209939241409 2023-01-22 14:15:12.986123: step: 772/464, loss: 0.008765576407313347 2023-01-22 14:15:13.731623: step: 774/464, loss: 1.222083568572998 2023-01-22 14:15:14.407044: step: 776/464, loss: 0.12024626135826111 2023-01-22 14:15:15.148802: step: 778/464, loss: 0.05762336403131485 2023-01-22 14:15:15.863266: step: 780/464, loss: 0.051064152270555496 2023-01-22 14:15:16.609992: step: 782/464, loss: 0.016113949939608574 2023-01-22 14:15:17.300333: step: 784/464, loss: 0.5698862075805664 2023-01-22 14:15:18.038119: step: 786/464, loss: 0.0769958645105362 2023-01-22 14:15:18.758621: step: 788/464, loss: 0.07791303098201752 2023-01-22 14:15:19.604608: step: 790/464, loss: 0.11422698944807053 2023-01-22 14:15:20.387084: step: 792/464, loss: 0.08424840122461319 2023-01-22 14:15:21.180979: step: 794/464, loss: 0.051752686500549316 2023-01-22 14:15:21.943887: step: 796/464, loss: 0.12153435498476028 2023-01-22 14:15:22.730739: step: 798/464, loss: 0.050786785781383514 2023-01-22 14:15:23.438507: step: 800/464, loss: 0.023236608132719994 2023-01-22 14:15:24.138952: step: 802/464, loss: 0.5370764136314392 2023-01-22 14:15:24.908994: step: 804/464, loss: 0.08696542680263519 2023-01-22 14:15:25.639635: step: 806/464, loss: 0.033709701150655746 2023-01-22 14:15:26.337622: step: 808/464, loss: 0.11055350303649902 2023-01-22 14:15:27.051217: step: 810/464, loss: 0.04368586465716362 2023-01-22 14:15:27.694038: step: 812/464, loss: 0.004434123169630766 2023-01-22 14:15:28.506699: step: 814/464, loss: 0.10015460103750229 2023-01-22 14:15:29.238263: step: 816/464, loss: 0.03871988505125046 2023-01-22 14:15:29.907574: step: 818/464, loss: 0.022155897691845894 2023-01-22 14:15:30.677796: step: 820/464, loss: 0.07232918590307236 2023-01-22 14:15:31.337751: 
step: 822/464, loss: 0.060673587024211884 2023-01-22 14:15:31.986960: step: 824/464, loss: 0.005866081919521093 2023-01-22 14:15:32.644963: step: 826/464, loss: 0.09922577440738678 2023-01-22 14:15:33.310139: step: 828/464, loss: 0.09058462828397751 2023-01-22 14:15:34.109164: step: 830/464, loss: 0.02790232188999653 2023-01-22 14:15:34.779805: step: 832/464, loss: 0.008085663430392742 2023-01-22 14:15:35.534340: step: 834/464, loss: 0.05334268510341644 2023-01-22 14:15:36.304337: step: 836/464, loss: 0.029737956821918488 2023-01-22 14:15:37.045302: step: 838/464, loss: 0.25331979990005493 2023-01-22 14:15:37.809676: step: 840/464, loss: 0.10104769468307495 2023-01-22 14:15:38.506175: step: 842/464, loss: 0.017015360295772552 2023-01-22 14:15:39.245497: step: 844/464, loss: 0.11380429565906525 2023-01-22 14:15:40.050465: step: 846/464, loss: 0.04035916551947594 2023-01-22 14:15:40.784458: step: 848/464, loss: 0.03218059614300728 2023-01-22 14:15:41.520081: step: 850/464, loss: 0.08396913856267929 2023-01-22 14:15:42.286215: step: 852/464, loss: 0.2091253399848938 2023-01-22 14:15:42.953370: step: 854/464, loss: 0.041528936475515366 2023-01-22 14:15:43.608050: step: 856/464, loss: 0.05493639409542084 2023-01-22 14:15:44.316069: step: 858/464, loss: 0.11059834808111191 2023-01-22 14:15:45.027674: step: 860/464, loss: 0.06909855455160141 2023-01-22 14:15:45.719954: step: 862/464, loss: 0.0038817180320620537 2023-01-22 14:15:46.451439: step: 864/464, loss: 0.01490766741335392 2023-01-22 14:15:47.218132: step: 866/464, loss: 0.058409348130226135 2023-01-22 14:15:47.924478: step: 868/464, loss: 0.052582159638404846 2023-01-22 14:15:48.602094: step: 870/464, loss: 0.011038169264793396 2023-01-22 14:15:49.252389: step: 872/464, loss: 0.002536794636398554 2023-01-22 14:15:49.978550: step: 874/464, loss: 0.1839173287153244 2023-01-22 14:15:50.697034: step: 876/464, loss: 0.01185121014714241 2023-01-22 14:15:51.461550: step: 878/464, loss: 0.04605353623628616 2023-01-22 
14:15:52.267339: step: 880/464, loss: 0.056616757065057755
2023-01-22 14:15:53.000709: step: 882/464, loss: 0.04454910755157471
2023-01-22 14:15:53.743630: step: 884/464, loss: 0.10783626139163971
2023-01-22 14:15:54.470373: step: 886/464, loss: 0.029736977070569992
2023-01-22 14:15:55.163810: step: 888/464, loss: 0.00434313528239727
2023-01-22 14:15:55.821530: step: 890/464, loss: 0.02773529291152954
2023-01-22 14:15:56.582768: step: 892/464, loss: 0.14134477078914642
2023-01-22 14:15:57.327629: step: 894/464, loss: 0.038266871124506
2023-01-22 14:15:58.052828: step: 896/464, loss: 0.03223191574215889
2023-01-22 14:15:58.802397: step: 898/464, loss: 0.028650274500250816
2023-01-22 14:15:59.545897: step: 900/464, loss: 0.0798439085483551
2023-01-22 14:16:00.332125: step: 902/464, loss: 0.05823419243097305
2023-01-22 14:16:01.018387: step: 904/464, loss: 0.017082292586565018
2023-01-22 14:16:01.796860: step: 906/464, loss: 0.5384454727172852
2023-01-22 14:16:02.600276: step: 908/464, loss: 0.07013249397277832
2023-01-22 14:16:03.294972: step: 910/464, loss: 0.10660932958126068
2023-01-22 14:16:03.972430: step: 912/464, loss: 0.007429053075611591
2023-01-22 14:16:04.726971: step: 914/464, loss: 0.059720948338508606
2023-01-22 14:16:05.451182: step: 916/464, loss: 0.027867119759321213
2023-01-22 14:16:06.162785: step: 918/464, loss: 0.031830303370952606
2023-01-22 14:16:06.830506: step: 920/464, loss: 0.010990363545715809
2023-01-22 14:16:07.609501: step: 922/464, loss: 0.012598209083080292
2023-01-22 14:16:08.324477: step: 924/464, loss: 0.3085867762565613
2023-01-22 14:16:09.049766: step: 926/464, loss: 0.022714342921972275
2023-01-22 14:16:09.831841: step: 928/464, loss: 0.09439236670732498
2023-01-22 14:16:10.518112: step: 930/464, loss: 0.08354859799146652
==================================================
Loss: 0.114
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28191990135660516, 'r': 0.33595009118016705, 'f1': 0.30657263731939055}, 'combined': 0.22589562749849829, 'epoch': 20}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2894572451409419, 'r': 0.2894572451409419, 'f1': 0.2894572451409419}, 'combined': 0.17976818382437446, 'epoch': 20}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.269011087553637, 'r': 0.33014997108855454, 'f1': 0.2964611985284979}, 'combined': 0.2184450936525774, 'epoch': 20}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2764599237660333, 'r': 0.28274931866872244, 'f1': 0.2795692529819838}, 'combined': 0.17362722027302155, 'epoch': 20}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2830887081416697, 'r': 0.3308063123549436, 'f1': 0.3050929832723322}, 'combined': 0.22480535609540267, 'epoch': 20}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2968497693394078, 'r': 0.2927472058934535, 'f1': 0.2947842142843622}, 'combined': 0.18307651202923547, 'epoch': 20}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.20982142857142858, 'r': 0.3357142857142857, 'f1': 0.25824175824175827}, 'combined': 0.17216117216117216, 'epoch': 20}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23863636363636365, 'r': 0.45652173913043476, 'f1': 0.31343283582089554}, 'combined': 0.15671641791044777, 'epoch': 20}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40131578947368424, 'r': 0.2629310344827586, 'f1': 0.31770833333333337}, 'combined': 0.21180555555555558, 'epoch': 20}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 21
command:
python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 14:18:50.557314: step: 2/464, loss: 0.11975234001874924 2023-01-22 14:18:51.248842: step: 4/464, loss: 0.05728471279144287 2023-01-22 14:18:51.946075: step: 6/464, loss: 0.03545444458723068 2023-01-22 14:18:52.679860: step: 8/464, loss: 0.13810107111930847 2023-01-22 14:18:53.444202: step: 10/464, loss: 0.0021625454537570477 2023-01-22 14:18:54.144873: step: 12/464, loss: 0.24097493290901184 2023-01-22 14:18:54.895194: step: 14/464, loss: 0.5120839476585388 2023-01-22 14:18:55.683915: step: 16/464, loss: 0.07629422098398209 2023-01-22 14:18:56.404465: step: 18/464, loss: 0.35801276564598083 2023-01-22 14:18:57.171765: step: 20/464, loss: 0.021506866440176964 2023-01-22 14:18:57.870078: step: 22/464, loss: 0.07394732534885406 2023-01-22 14:18:58.523825: step: 24/464, loss: 0.007438796106725931 2023-01-22 14:18:59.353959: step: 26/464, loss: 0.07921618223190308 2023-01-22 14:19:00.199082: step: 28/464, loss: 0.03039364144206047 2023-01-22 14:19:00.997330: step: 30/464, loss: 0.040770698338747025 2023-01-22 14:19:01.766543: step: 32/464, loss: 0.0478195957839489 2023-01-22 14:19:02.417159: step: 34/464, loss: 0.011056684888899326 2023-01-22 14:19:03.082213: step: 36/464, loss: 0.033487141132354736 2023-01-22 14:19:03.804971: step: 38/464, loss: 0.028325526043772697 2023-01-22 14:19:04.503998: step: 40/464, loss: 0.11843187361955643 2023-01-22 14:19:05.299903: step: 42/464, loss: 0.03145572170615196 2023-01-22 14:19:06.003792: step: 44/464, loss: 0.11456778645515442 2023-01-22 14:19:06.717697: step: 46/464, loss: 0.00822090357542038 2023-01-22 14:19:07.451449: step: 48/464, loss: 0.05254720151424408 2023-01-22 14:19:08.328172: step: 50/464, loss: 0.02053934521973133 2023-01-22 14:19:08.990852: step: 52/464, loss: 0.01988641545176506 2023-01-22 
14:19:09.731264: step: 54/464, loss: 0.19410990178585052 2023-01-22 14:19:10.479603: step: 56/464, loss: 0.06711027026176453 2023-01-22 14:19:11.184698: step: 58/464, loss: 0.012005824595689774 2023-01-22 14:19:11.942594: step: 60/464, loss: 0.02315746806561947 2023-01-22 14:19:12.693969: step: 62/464, loss: 0.01853269897401333 2023-01-22 14:19:13.473568: step: 64/464, loss: 0.026548484340310097 2023-01-22 14:19:14.197640: step: 66/464, loss: 0.008066806942224503 2023-01-22 14:19:14.957926: step: 68/464, loss: 0.08661675453186035 2023-01-22 14:19:15.708866: step: 70/464, loss: 0.025570321828126907 2023-01-22 14:19:16.425929: step: 72/464, loss: 0.016779575496912003 2023-01-22 14:19:17.096786: step: 74/464, loss: 0.017785130068659782 2023-01-22 14:19:17.856606: step: 76/464, loss: 0.007999873720109463 2023-01-22 14:19:18.656317: step: 78/464, loss: 0.02611628547310829 2023-01-22 14:19:19.354746: step: 80/464, loss: 0.052033498883247375 2023-01-22 14:19:20.081561: step: 82/464, loss: 0.007343766279518604 2023-01-22 14:19:20.862764: step: 84/464, loss: 0.013929629698395729 2023-01-22 14:19:21.626990: step: 86/464, loss: 0.05986665561795235 2023-01-22 14:19:22.352741: step: 88/464, loss: 0.054861441254615784 2023-01-22 14:19:23.134728: step: 90/464, loss: 0.018272459506988525 2023-01-22 14:19:23.770278: step: 92/464, loss: 0.00936177372932434 2023-01-22 14:19:24.411881: step: 94/464, loss: 0.02910560928285122 2023-01-22 14:19:25.172017: step: 96/464, loss: 0.014908765442669392 2023-01-22 14:19:25.917872: step: 98/464, loss: 0.030643368139863014 2023-01-22 14:19:26.642535: step: 100/464, loss: 0.04429630935192108 2023-01-22 14:19:27.405008: step: 102/464, loss: 0.11084703356027603 2023-01-22 14:19:28.159441: step: 104/464, loss: 0.013898937962949276 2023-01-22 14:19:28.919922: step: 106/464, loss: 0.09585338830947876 2023-01-22 14:19:29.669808: step: 108/464, loss: 0.03203404322266579 2023-01-22 14:19:30.281034: step: 110/464, loss: 0.012530727311968803 2023-01-22 
14:19:30.947981: step: 112/464, loss: 0.36144790053367615 2023-01-22 14:19:31.652877: step: 114/464, loss: 0.04872361198067665 2023-01-22 14:19:32.350999: step: 116/464, loss: 0.05325092375278473 2023-01-22 14:19:33.078226: step: 118/464, loss: 0.013846561312675476 2023-01-22 14:19:33.823004: step: 120/464, loss: 2.050266981124878 2023-01-22 14:19:34.595849: step: 122/464, loss: 0.030106237158179283 2023-01-22 14:19:35.302300: step: 124/464, loss: 0.024658236652612686 2023-01-22 14:19:36.026495: step: 126/464, loss: 0.12699775397777557 2023-01-22 14:19:36.727490: step: 128/464, loss: 0.03840040788054466 2023-01-22 14:19:37.501998: step: 130/464, loss: 0.057593636214733124 2023-01-22 14:19:38.230686: step: 132/464, loss: 0.11989856511354446 2023-01-22 14:19:38.898303: step: 134/464, loss: 0.2313280552625656 2023-01-22 14:19:39.623333: step: 136/464, loss: 0.06446570158004761 2023-01-22 14:19:40.355934: step: 138/464, loss: 0.07844987511634827 2023-01-22 14:19:41.047880: step: 140/464, loss: 0.10319890826940536 2023-01-22 14:19:41.822924: step: 142/464, loss: 0.1371772140264511 2023-01-22 14:19:42.636084: step: 144/464, loss: 0.421131432056427 2023-01-22 14:19:43.252411: step: 146/464, loss: 0.09265350550413132 2023-01-22 14:19:44.017165: step: 148/464, loss: 0.09814798831939697 2023-01-22 14:19:44.765121: step: 150/464, loss: 0.003304727841168642 2023-01-22 14:19:45.581654: step: 152/464, loss: 0.09079311043024063 2023-01-22 14:19:46.301958: step: 154/464, loss: 0.07017511874437332 2023-01-22 14:19:47.003837: step: 156/464, loss: 0.03948606550693512 2023-01-22 14:19:47.793762: step: 158/464, loss: 0.031009254977107048 2023-01-22 14:19:48.628767: step: 160/464, loss: 0.03819343075156212 2023-01-22 14:19:49.383191: step: 162/464, loss: 0.021974503993988037 2023-01-22 14:19:50.086305: step: 164/464, loss: 0.041077401489019394 2023-01-22 14:19:50.795769: step: 166/464, loss: 0.033953309059143066 2023-01-22 14:19:51.538790: step: 168/464, loss: 0.07859236747026443 
2023-01-22 14:19:52.244086: step: 170/464, loss: 0.044446911662817 2023-01-22 14:19:52.953520: step: 172/464, loss: 0.044578131288290024 2023-01-22 14:19:53.659524: step: 174/464, loss: 0.029382256790995598 2023-01-22 14:19:54.347963: step: 176/464, loss: 0.16760005056858063 2023-01-22 14:19:55.051824: step: 178/464, loss: 0.013348049484193325 2023-01-22 14:19:55.767663: step: 180/464, loss: 0.04084669426083565 2023-01-22 14:19:56.524683: step: 182/464, loss: 0.3408990204334259 2023-01-22 14:19:57.265858: step: 184/464, loss: 0.9496893286705017 2023-01-22 14:19:57.989373: step: 186/464, loss: 0.0338098406791687 2023-01-22 14:19:58.731498: step: 188/464, loss: 0.06081795319914818 2023-01-22 14:19:59.448234: step: 190/464, loss: 0.06129498407244682 2023-01-22 14:20:00.277342: step: 192/464, loss: 0.018290026113390923 2023-01-22 14:20:00.986588: step: 194/464, loss: 0.029910344630479813 2023-01-22 14:20:01.761541: step: 196/464, loss: 0.00872301310300827 2023-01-22 14:20:02.516297: step: 198/464, loss: 0.04122915118932724 2023-01-22 14:20:03.201999: step: 200/464, loss: 0.018726056441664696 2023-01-22 14:20:03.960546: step: 202/464, loss: 0.0619959831237793 2023-01-22 14:20:04.664845: step: 204/464, loss: 0.02457534149289131 2023-01-22 14:20:05.390173: step: 206/464, loss: 0.009037651121616364 2023-01-22 14:20:06.183197: step: 208/464, loss: 0.10055802017450333 2023-01-22 14:20:06.921442: step: 210/464, loss: 0.06440237164497375 2023-01-22 14:20:07.664360: step: 212/464, loss: 0.4239731729030609 2023-01-22 14:20:08.427043: step: 214/464, loss: 0.03676867112517357 2023-01-22 14:20:09.144525: step: 216/464, loss: 0.03093644417822361 2023-01-22 14:20:09.887112: step: 218/464, loss: 0.04951198399066925 2023-01-22 14:20:10.586084: step: 220/464, loss: 0.03629906848073006 2023-01-22 14:20:11.362175: step: 222/464, loss: 0.02488712966442108 2023-01-22 14:20:12.147256: step: 224/464, loss: 0.047976914793252945 2023-01-22 14:20:12.868076: step: 226/464, loss: 
0.0374411940574646 2023-01-22 14:20:13.527268: step: 228/464, loss: 0.01827031560242176 2023-01-22 14:20:14.205337: step: 230/464, loss: 0.012965069152414799 2023-01-22 14:20:14.935294: step: 232/464, loss: 0.05805027112364769 2023-01-22 14:20:15.651942: step: 234/464, loss: 0.002010543365031481 2023-01-22 14:20:16.373784: step: 236/464, loss: 0.004295994061976671 2023-01-22 14:20:17.135982: step: 238/464, loss: 0.05023113265633583 2023-01-22 14:20:17.880062: step: 240/464, loss: 0.04340185225009918 2023-01-22 14:20:18.588470: step: 242/464, loss: 0.028225280344486237 2023-01-22 14:20:19.239197: step: 244/464, loss: 0.032393768429756165 2023-01-22 14:20:19.942732: step: 246/464, loss: 0.048727914690971375 2023-01-22 14:20:20.679188: step: 248/464, loss: 0.007662827614694834 2023-01-22 14:20:21.397243: step: 250/464, loss: 0.023732934147119522 2023-01-22 14:20:22.090072: step: 252/464, loss: 0.06910266727209091 2023-01-22 14:20:22.869188: step: 254/464, loss: 0.03406153991818428 2023-01-22 14:20:23.555488: step: 256/464, loss: 0.0055818636901676655 2023-01-22 14:20:24.334227: step: 258/464, loss: 0.038593094795942307 2023-01-22 14:20:24.989167: step: 260/464, loss: 0.06385339051485062 2023-01-22 14:20:25.640120: step: 262/464, loss: 0.0857200101017952 2023-01-22 14:20:26.371208: step: 264/464, loss: 0.03669006749987602 2023-01-22 14:20:27.062607: step: 266/464, loss: 0.018789229914546013 2023-01-22 14:20:27.804296: step: 268/464, loss: 0.04875718429684639 2023-01-22 14:20:28.470968: step: 270/464, loss: 0.14598001539707184 2023-01-22 14:20:29.185270: step: 272/464, loss: 0.04028183966875076 2023-01-22 14:20:29.874050: step: 274/464, loss: 0.36252397298812866 2023-01-22 14:20:30.701449: step: 276/464, loss: 0.06845507770776749 2023-01-22 14:20:31.420285: step: 278/464, loss: 0.17075254023075104 2023-01-22 14:20:32.251414: step: 280/464, loss: 0.024251427501440048 2023-01-22 14:20:32.970991: step: 282/464, loss: 0.5997735857963562 2023-01-22 14:20:33.717714: step: 
284/464, loss: 0.02716807648539543 2023-01-22 14:20:34.456726: step: 286/464, loss: 0.5014903545379639 2023-01-22 14:20:35.185058: step: 288/464, loss: 0.07848905771970749 2023-01-22 14:20:35.897339: step: 290/464, loss: 0.1874598264694214 2023-01-22 14:20:36.646036: step: 292/464, loss: 0.04445421323180199 2023-01-22 14:20:37.318504: step: 294/464, loss: 0.029825005680322647 2023-01-22 14:20:37.999803: step: 296/464, loss: 0.16320347785949707 2023-01-22 14:20:38.730637: step: 298/464, loss: 0.16363082826137543 2023-01-22 14:20:39.399666: step: 300/464, loss: 0.2617831230163574 2023-01-22 14:20:40.164867: step: 302/464, loss: 0.009264235384762287 2023-01-22 14:20:40.912661: step: 304/464, loss: 0.02039198949933052 2023-01-22 14:20:41.688524: step: 306/464, loss: 0.013186992146074772 2023-01-22 14:20:42.479371: step: 308/464, loss: 0.1154455617070198 2023-01-22 14:20:43.204980: step: 310/464, loss: 0.008570646867156029 2023-01-22 14:20:43.991949: step: 312/464, loss: 0.047249674797058105 2023-01-22 14:20:44.717358: step: 314/464, loss: 0.03741049766540527 2023-01-22 14:20:45.381731: step: 316/464, loss: 0.023542368784546852 2023-01-22 14:20:46.160951: step: 318/464, loss: 0.014139845035970211 2023-01-22 14:20:46.958714: step: 320/464, loss: 0.03782298043370247 2023-01-22 14:20:47.677544: step: 322/464, loss: 0.04408476501703262 2023-01-22 14:20:48.432352: step: 324/464, loss: 0.010484587401151657 2023-01-22 14:20:49.189083: step: 326/464, loss: 0.6605069637298584 2023-01-22 14:20:49.945766: step: 328/464, loss: 0.023553185164928436 2023-01-22 14:20:50.692909: step: 330/464, loss: 0.05442517250776291 2023-01-22 14:20:51.410849: step: 332/464, loss: 0.40287327766418457 2023-01-22 14:20:52.163973: step: 334/464, loss: 0.14791059494018555 2023-01-22 14:20:52.840895: step: 336/464, loss: 0.3907485008239746 2023-01-22 14:20:53.639351: step: 338/464, loss: 0.08137544989585876 2023-01-22 14:20:54.421904: step: 340/464, loss: 0.12367463111877441 2023-01-22 14:20:55.148427: 
step: 342/464, loss: 0.012803366407752037 2023-01-22 14:20:55.812315: step: 344/464, loss: 0.024366678670048714 2023-01-22 14:20:56.570495: step: 346/464, loss: 0.02786608599126339 2023-01-22 14:20:57.299914: step: 348/464, loss: 12.75429916381836 2023-01-22 14:20:58.027356: step: 350/464, loss: 0.0908380076289177 2023-01-22 14:20:58.810189: step: 352/464, loss: 0.016326939687132835 2023-01-22 14:20:59.545931: step: 354/464, loss: 0.023789649829268456 2023-01-22 14:21:00.327168: step: 356/464, loss: 0.1525188535451889 2023-01-22 14:21:01.041108: step: 358/464, loss: 0.05632682889699936 2023-01-22 14:21:01.788432: step: 360/464, loss: 0.04506479203701019 2023-01-22 14:21:02.600078: step: 362/464, loss: 0.07740932703018188 2023-01-22 14:21:03.311017: step: 364/464, loss: 0.11900129169225693 2023-01-22 14:21:04.032636: step: 366/464, loss: 0.020981112495064735 2023-01-22 14:21:04.747249: step: 368/464, loss: 0.024556007236242294 2023-01-22 14:21:05.419592: step: 370/464, loss: 0.05328279733657837 2023-01-22 14:21:06.127191: step: 372/464, loss: 0.019029740244150162 2023-01-22 14:21:06.891162: step: 374/464, loss: 0.007005555089563131 2023-01-22 14:21:07.614531: step: 376/464, loss: 0.1613781452178955 2023-01-22 14:21:08.300304: step: 378/464, loss: 0.1356181651353836 2023-01-22 14:21:09.034682: step: 380/464, loss: 0.035038989037275314 2023-01-22 14:21:09.714525: step: 382/464, loss: 0.01615901291370392 2023-01-22 14:21:10.461941: step: 384/464, loss: 0.021121768280863762 2023-01-22 14:21:11.239154: step: 386/464, loss: 0.25910723209381104 2023-01-22 14:21:11.968883: step: 388/464, loss: 0.5805371403694153 2023-01-22 14:21:12.773530: step: 390/464, loss: 0.03570196032524109 2023-01-22 14:21:13.489654: step: 392/464, loss: 0.05775092914700508 2023-01-22 14:21:14.253306: step: 394/464, loss: 0.011715341359376907 2023-01-22 14:21:15.013361: step: 396/464, loss: 0.20321254432201385 2023-01-22 14:21:15.796260: step: 398/464, loss: 0.04123862460255623 2023-01-22 
14:21:16.488357: step: 400/464, loss: 0.04705808311700821 2023-01-22 14:21:17.203698: step: 402/464, loss: 0.08706068992614746 2023-01-22 14:21:17.960218: step: 404/464, loss: 2.7982990741729736 2023-01-22 14:21:18.632986: step: 406/464, loss: 0.05648184195160866 2023-01-22 14:21:19.298326: step: 408/464, loss: 0.004382117185741663 2023-01-22 14:21:20.038475: step: 410/464, loss: 0.008848189376294613 2023-01-22 14:21:20.780763: step: 412/464, loss: 0.0265215914696455 2023-01-22 14:21:21.512095: step: 414/464, loss: 0.07577735930681229 2023-01-22 14:21:22.196849: step: 416/464, loss: 0.5927653908729553 2023-01-22 14:21:22.929635: step: 418/464, loss: 0.13155563175678253 2023-01-22 14:21:23.621048: step: 420/464, loss: 0.0663122907280922 2023-01-22 14:21:24.247316: step: 422/464, loss: 0.04869604483246803 2023-01-22 14:21:24.972083: step: 424/464, loss: 0.01776723749935627 2023-01-22 14:21:25.663396: step: 426/464, loss: 0.13402137160301208 2023-01-22 14:21:26.416891: step: 428/464, loss: 0.02516421489417553 2023-01-22 14:21:27.230989: step: 430/464, loss: 0.03531419485807419 2023-01-22 14:21:27.971966: step: 432/464, loss: 0.009145190007984638 2023-01-22 14:21:28.724874: step: 434/464, loss: 0.015271683223545551 2023-01-22 14:21:29.490546: step: 436/464, loss: 0.007533950265496969 2023-01-22 14:21:30.202827: step: 438/464, loss: 0.026609541848301888 2023-01-22 14:21:31.001152: step: 440/464, loss: 0.13909491896629333 2023-01-22 14:21:31.670344: step: 442/464, loss: 0.007417329587042332 2023-01-22 14:21:32.435255: step: 444/464, loss: 0.03112209588289261 2023-01-22 14:21:33.224772: step: 446/464, loss: 0.07906953245401382 2023-01-22 14:21:34.073837: step: 448/464, loss: 0.03891181945800781 2023-01-22 14:21:34.769082: step: 450/464, loss: 0.13607284426689148 2023-01-22 14:21:35.459894: step: 452/464, loss: 0.04584129899740219 2023-01-22 14:21:36.176124: step: 454/464, loss: 0.07778904587030411 2023-01-22 14:21:36.830006: step: 456/464, loss: 0.014196380972862244 
2023-01-22 14:21:37.550351: step: 458/464, loss: 0.42927083373069763 2023-01-22 14:21:38.297912: step: 460/464, loss: 0.029154594987630844 2023-01-22 14:21:38.971229: step: 462/464, loss: 0.0662357434630394 2023-01-22 14:21:39.710955: step: 464/464, loss: 0.05748889595270157 2023-01-22 14:21:40.419760: step: 466/464, loss: 0.09649182856082916 2023-01-22 14:21:41.161672: step: 468/464, loss: 0.08588551729917526 2023-01-22 14:21:41.828441: step: 470/464, loss: 0.3580757677555084 2023-01-22 14:21:42.561432: step: 472/464, loss: 0.049161091446876526 2023-01-22 14:21:43.370419: step: 474/464, loss: 0.019643118605017662 2023-01-22 14:21:44.064775: step: 476/464, loss: 0.13698026537895203 2023-01-22 14:21:44.831607: step: 478/464, loss: 0.04523114860057831 2023-01-22 14:21:45.493131: step: 480/464, loss: 0.013664919883012772 2023-01-22 14:21:46.282124: step: 482/464, loss: 0.032867517322301865 2023-01-22 14:21:47.010113: step: 484/464, loss: 0.028636222705245018 2023-01-22 14:21:47.807640: step: 486/464, loss: 0.04870878905057907 2023-01-22 14:21:48.497386: step: 488/464, loss: 0.10918847471475601 2023-01-22 14:21:49.231350: step: 490/464, loss: 0.046230800449848175 2023-01-22 14:21:50.016532: step: 492/464, loss: 0.08702777326107025 2023-01-22 14:21:50.785535: step: 494/464, loss: 0.104853555560112 2023-01-22 14:21:51.527108: step: 496/464, loss: 0.04353467375040054 2023-01-22 14:21:52.283472: step: 498/464, loss: 0.26607006788253784 2023-01-22 14:21:52.971524: step: 500/464, loss: 0.07972798496484756 2023-01-22 14:21:53.777801: step: 502/464, loss: 0.11201713979244232 2023-01-22 14:21:54.540769: step: 504/464, loss: 0.15039947628974915 2023-01-22 14:21:55.298047: step: 506/464, loss: 0.03871912509202957 2023-01-22 14:21:56.072319: step: 508/464, loss: 0.008508550934493542 2023-01-22 14:21:56.800258: step: 510/464, loss: 0.09333653002977371 2023-01-22 14:21:57.537462: step: 512/464, loss: 0.03663307800889015 2023-01-22 14:21:58.274413: step: 514/464, loss: 
0.052844543009996414 2023-01-22 14:21:59.094052: step: 516/464, loss: 0.42462268471717834 2023-01-22 14:21:59.763156: step: 518/464, loss: 0.030073698610067368 2023-01-22 14:22:00.483328: step: 520/464, loss: 0.058907970786094666 2023-01-22 14:22:01.200020: step: 522/464, loss: 0.04350745305418968 2023-01-22 14:22:01.964081: step: 524/464, loss: 0.44929540157318115 2023-01-22 14:22:02.692787: step: 526/464, loss: 0.1174045279622078 2023-01-22 14:22:03.378163: step: 528/464, loss: 0.16680335998535156 2023-01-22 14:22:04.082259: step: 530/464, loss: 0.04566435143351555 2023-01-22 14:22:04.937199: step: 532/464, loss: 0.04176734387874603 2023-01-22 14:22:05.660298: step: 534/464, loss: 0.7629539966583252 2023-01-22 14:22:06.368532: step: 536/464, loss: 0.018504736945033073 2023-01-22 14:22:07.062653: step: 538/464, loss: 0.02703903056681156 2023-01-22 14:22:07.832046: step: 540/464, loss: 0.09397948533296585 2023-01-22 14:22:08.550056: step: 542/464, loss: 0.11122222989797592 2023-01-22 14:22:09.253455: step: 544/464, loss: 0.26820042729377747 2023-01-22 14:22:09.997473: step: 546/464, loss: 0.026853233575820923 2023-01-22 14:22:10.795284: step: 548/464, loss: 0.018640518188476562 2023-01-22 14:22:11.488676: step: 550/464, loss: 0.3911222517490387 2023-01-22 14:22:12.236082: step: 552/464, loss: 0.04636659845709801 2023-01-22 14:22:12.909216: step: 554/464, loss: 0.06301417201757431 2023-01-22 14:22:13.686074: step: 556/464, loss: 0.05418648570775986 2023-01-22 14:22:14.406442: step: 558/464, loss: 0.3370462954044342 2023-01-22 14:22:15.179891: step: 560/464, loss: 0.004634824115782976 2023-01-22 14:22:15.908172: step: 562/464, loss: 0.01641211099922657 2023-01-22 14:22:16.633405: step: 564/464, loss: 0.023339591920375824 2023-01-22 14:22:17.353149: step: 566/464, loss: 0.0751141682267189 2023-01-22 14:22:18.038780: step: 568/464, loss: 0.004145853221416473 2023-01-22 14:22:18.764377: step: 570/464, loss: 0.04671361669898033 2023-01-22 14:22:19.506630: step: 572/464, 
loss: 0.032283443957567215 2023-01-22 14:22:20.253167: step: 574/464, loss: 0.052826691418886185 2023-01-22 14:22:21.032743: step: 576/464, loss: 0.08888741582632065 2023-01-22 14:22:21.728705: step: 578/464, loss: 0.019992846995592117 2023-01-22 14:22:22.496751: step: 580/464, loss: 0.11506533622741699 2023-01-22 14:22:23.249910: step: 582/464, loss: 0.044647980481386185 2023-01-22 14:22:24.034077: step: 584/464, loss: 0.15260529518127441 2023-01-22 14:22:24.828522: step: 586/464, loss: 0.02975417673587799 2023-01-22 14:22:25.678967: step: 588/464, loss: 0.08308760821819305 2023-01-22 14:22:26.444885: step: 590/464, loss: 0.05899420380592346 2023-01-22 14:22:27.149986: step: 592/464, loss: 0.01571488566696644 2023-01-22 14:22:27.855992: step: 594/464, loss: 0.06790255010128021 2023-01-22 14:22:28.610161: step: 596/464, loss: 0.027487952262163162 2023-01-22 14:22:29.262780: step: 598/464, loss: 0.05656803399324417 2023-01-22 14:22:29.989483: step: 600/464, loss: 0.010896249674260616 2023-01-22 14:22:30.767668: step: 602/464, loss: 0.10806481540203094 2023-01-22 14:22:31.434871: step: 604/464, loss: 0.0048584723845124245 2023-01-22 14:22:32.146243: step: 606/464, loss: 0.057283997535705566 2023-01-22 14:22:32.926716: step: 608/464, loss: 0.02630491741001606 2023-01-22 14:22:33.605227: step: 610/464, loss: 0.027782080695033073 2023-01-22 14:22:34.318212: step: 612/464, loss: 0.04245239496231079 2023-01-22 14:22:35.016518: step: 614/464, loss: 0.05944732576608658 2023-01-22 14:22:35.737906: step: 616/464, loss: 0.007870529778301716 2023-01-22 14:22:36.537649: step: 618/464, loss: 0.05828342214226723 2023-01-22 14:22:37.208489: step: 620/464, loss: 0.1730232685804367 2023-01-22 14:22:37.924349: step: 622/464, loss: 0.05308079347014427 2023-01-22 14:22:38.659641: step: 624/464, loss: 0.07295246422290802 2023-01-22 14:22:39.375043: step: 626/464, loss: 0.017740445211529732 2023-01-22 14:22:40.179041: step: 628/464, loss: 0.24502922594547272 2023-01-22 14:22:40.982169: 
step: 630/464, loss: 0.04027089849114418 2023-01-22 14:22:41.713555: step: 632/464, loss: 0.08156416565179825 2023-01-22 14:22:42.442603: step: 634/464, loss: 0.11167629808187485 2023-01-22 14:22:43.238302: step: 636/464, loss: 0.00642075389623642 2023-01-22 14:22:44.009788: step: 638/464, loss: 0.0548754557967186 2023-01-22 14:22:44.695188: step: 640/464, loss: 0.10855691879987717 2023-01-22 14:22:45.519475: step: 642/464, loss: 0.01500038243830204 2023-01-22 14:22:46.201192: step: 644/464, loss: 0.05018560588359833 2023-01-22 14:22:47.022814: step: 646/464, loss: 0.02341439574956894 2023-01-22 14:22:47.726375: step: 648/464, loss: 0.09186741709709167 2023-01-22 14:22:48.500520: step: 650/464, loss: 0.039070356637239456 2023-01-22 14:22:49.229013: step: 652/464, loss: 0.07108777016401291 2023-01-22 14:22:49.872194: step: 654/464, loss: 0.03220715373754501 2023-01-22 14:22:50.622166: step: 656/464, loss: 0.17577025294303894 2023-01-22 14:22:51.446814: step: 658/464, loss: 0.15418113768100739 2023-01-22 14:22:52.115961: step: 660/464, loss: 0.058939479291439056 2023-01-22 14:22:52.833317: step: 662/464, loss: 0.2582094371318817 2023-01-22 14:22:53.581984: step: 664/464, loss: 0.008546644821763039 2023-01-22 14:22:54.288002: step: 666/464, loss: 0.00984677579253912 2023-01-22 14:22:54.988822: step: 668/464, loss: 0.10566455870866776 2023-01-22 14:22:55.779916: step: 670/464, loss: 0.0995333343744278 2023-01-22 14:22:56.495535: step: 672/464, loss: 0.14754481613636017 2023-01-22 14:22:57.213163: step: 674/464, loss: 0.0048037185333669186 2023-01-22 14:22:57.968587: step: 676/464, loss: 0.041190601885318756 2023-01-22 14:22:58.690939: step: 678/464, loss: 0.47897103428840637 2023-01-22 14:22:59.458628: step: 680/464, loss: 0.031011898070573807 2023-01-22 14:23:00.236181: step: 682/464, loss: 0.05496121942996979 2023-01-22 14:23:01.030307: step: 684/464, loss: 0.04974808543920517 2023-01-22 14:23:01.718621: step: 686/464, loss: 0.07194700092077255 2023-01-22 
14:23:02.446936: step: 688/464, loss: 0.016992082819342613 2023-01-22 14:23:03.190653: step: 690/464, loss: 0.08583711832761765 2023-01-22 14:23:03.932883: step: 692/464, loss: 0.0561419315636158 2023-01-22 14:23:04.656968: step: 694/464, loss: 0.0464247465133667 2023-01-22 14:23:05.430604: step: 696/464, loss: 0.07206926494836807 2023-01-22 14:23:06.247266: step: 698/464, loss: 0.044868115335702896 2023-01-22 14:23:06.997921: step: 700/464, loss: 0.05982567369937897 2023-01-22 14:23:07.841367: step: 702/464, loss: 0.13941554725170135 2023-01-22 14:23:08.479907: step: 704/464, loss: 0.017828669399023056 2023-01-22 14:23:09.160602: step: 706/464, loss: 0.007217045873403549 2023-01-22 14:23:09.841355: step: 708/464, loss: 0.031991638243198395 2023-01-22 14:23:10.527464: step: 710/464, loss: 0.10996410250663757 2023-01-22 14:23:11.227249: step: 712/464, loss: 0.022569648921489716 2023-01-22 14:23:11.973768: step: 714/464, loss: 0.08497320115566254 2023-01-22 14:23:12.822020: step: 716/464, loss: 0.029784033074975014 2023-01-22 14:23:13.497019: step: 718/464, loss: 0.04577137902379036 2023-01-22 14:23:14.157830: step: 720/464, loss: 0.01864718087017536 2023-01-22 14:23:14.914550: step: 722/464, loss: 0.08041205257177353 2023-01-22 14:23:15.662805: step: 724/464, loss: 0.0066671911627054214 2023-01-22 14:23:16.492958: step: 726/464, loss: 0.07465112209320068 2023-01-22 14:23:17.234096: step: 728/464, loss: 0.3429624140262604 2023-01-22 14:23:17.997133: step: 730/464, loss: 0.07083141058683395 2023-01-22 14:23:18.743342: step: 732/464, loss: 0.03474923595786095 2023-01-22 14:23:19.406259: step: 734/464, loss: 0.02197730541229248 2023-01-22 14:23:20.160269: step: 736/464, loss: 0.020554326474666595 2023-01-22 14:23:20.850867: step: 738/464, loss: 0.07115784287452698 2023-01-22 14:23:21.555929: step: 740/464, loss: 0.026381531730294228 2023-01-22 14:23:22.244628: step: 742/464, loss: 0.31829917430877686 2023-01-22 14:23:23.008687: step: 744/464, loss: 0.0353279784321785 
2023-01-22 14:23:23.767622: step: 746/464, loss: 0.03324635699391365 2023-01-22 14:23:24.430358: step: 748/464, loss: 0.009753571823239326 2023-01-22 14:23:25.132355: step: 750/464, loss: 0.06722907721996307 2023-01-22 14:23:25.816806: step: 752/464, loss: 0.03678857162594795 2023-01-22 14:23:26.522855: step: 754/464, loss: 0.08187399059534073 2023-01-22 14:23:27.221734: step: 756/464, loss: 0.08561398833990097 2023-01-22 14:23:28.003948: step: 758/464, loss: 0.1783737689256668 2023-01-22 14:23:28.752553: step: 760/464, loss: 0.06394535303115845 2023-01-22 14:23:29.538780: step: 762/464, loss: 0.039629869163036346 2023-01-22 14:23:30.287518: step: 764/464, loss: 0.026326943188905716 2023-01-22 14:23:31.041897: step: 766/464, loss: 0.004333826247602701 2023-01-22 14:23:31.724718: step: 768/464, loss: 0.09951043874025345 2023-01-22 14:23:32.433724: step: 770/464, loss: 0.057956498116254807 2023-01-22 14:23:33.321444: step: 772/464, loss: 0.05867960676550865 2023-01-22 14:23:34.036227: step: 774/464, loss: 0.041094571352005005 2023-01-22 14:23:34.784419: step: 776/464, loss: 0.08115795254707336 2023-01-22 14:23:35.540306: step: 778/464, loss: 0.03187811002135277 2023-01-22 14:23:36.385800: step: 780/464, loss: 6.479488372802734 2023-01-22 14:23:37.126242: step: 782/464, loss: 0.387126088142395 2023-01-22 14:23:37.954302: step: 784/464, loss: 0.08255114406347275 2023-01-22 14:23:38.685500: step: 786/464, loss: 0.13230735063552856 2023-01-22 14:23:39.448630: step: 788/464, loss: 0.04115178436040878 2023-01-22 14:23:40.296878: step: 790/464, loss: 0.03452340140938759 2023-01-22 14:23:41.028944: step: 792/464, loss: 0.06277056038379669 2023-01-22 14:23:41.804721: step: 794/464, loss: 0.039516910910606384 2023-01-22 14:23:42.662774: step: 796/464, loss: 0.05712104216217995 2023-01-22 14:23:43.384281: step: 798/464, loss: 0.03840438649058342 2023-01-22 14:23:44.049050: step: 800/464, loss: 0.010737729258835316 2023-01-22 14:23:44.957004: step: 802/464, loss: 
0.01658307947218418 2023-01-22 14:23:45.743148: step: 804/464, loss: 0.20165158808231354 2023-01-22 14:23:46.429638: step: 806/464, loss: 0.038009755313396454 2023-01-22 14:23:47.158402: step: 808/464, loss: 0.031555887311697006 2023-01-22 14:23:47.889704: step: 810/464, loss: 0.04327677935361862 2023-01-22 14:23:48.647210: step: 812/464, loss: 0.030223559588193893 2023-01-22 14:23:49.325106: step: 814/464, loss: 0.13202138245105743 2023-01-22 14:23:50.005228: step: 816/464, loss: 0.039860039949417114 2023-01-22 14:23:50.852448: step: 818/464, loss: 0.14208300411701202 2023-01-22 14:23:51.583071: step: 820/464, loss: 0.03917303681373596 2023-01-22 14:23:52.330996: step: 822/464, loss: 0.053042635321617126 2023-01-22 14:23:53.087715: step: 824/464, loss: 0.24905726313591003 2023-01-22 14:23:53.738314: step: 826/464, loss: 0.005672526080161333 2023-01-22 14:23:54.446359: step: 828/464, loss: 0.14162936806678772 2023-01-22 14:23:55.191833: step: 830/464, loss: 0.1332976073026657 2023-01-22 14:23:55.867543: step: 832/464, loss: 0.052006494253873825 2023-01-22 14:23:56.497228: step: 834/464, loss: 0.024735642597079277 2023-01-22 14:23:57.176252: step: 836/464, loss: 0.07817701250314713 2023-01-22 14:23:58.030062: step: 838/464, loss: 0.023987311869859695 2023-01-22 14:23:58.778022: step: 840/464, loss: 0.06612563133239746 2023-01-22 14:23:59.489567: step: 842/464, loss: 0.2896845042705536 2023-01-22 14:24:00.157255: step: 844/464, loss: 0.03888014703989029 2023-01-22 14:24:00.920086: step: 846/464, loss: 0.11084342002868652 2023-01-22 14:24:01.733660: step: 848/464, loss: 0.045841753482818604 2023-01-22 14:24:02.407397: step: 850/464, loss: 0.035428255796432495 2023-01-22 14:24:03.241501: step: 852/464, loss: 0.027241677045822144 2023-01-22 14:24:03.970087: step: 854/464, loss: 0.05396389588713646 2023-01-22 14:24:04.657553: step: 856/464, loss: 0.03509226068854332 2023-01-22 14:24:05.383452: step: 858/464, loss: 0.02402915060520172 2023-01-22 14:24:06.091140: step: 
860/464, loss: 0.03456997871398926 2023-01-22 14:24:06.811847: step: 862/464, loss: 0.09369662404060364 2023-01-22 14:24:07.547623: step: 864/464, loss: 0.05106193199753761 2023-01-22 14:24:08.221210: step: 866/464, loss: 0.013446471653878689 2023-01-22 14:24:09.079726: step: 868/464, loss: 0.08844896405935287 2023-01-22 14:24:09.801770: step: 870/464, loss: 0.0037490795366466045 2023-01-22 14:24:10.496592: step: 872/464, loss: 0.2520065903663635 2023-01-22 14:24:11.287425: step: 874/464, loss: 0.2629806697368622 2023-01-22 14:24:12.021412: step: 876/464, loss: 0.07260128110647202 2023-01-22 14:24:12.804244: step: 878/464, loss: 0.11487490683794022 2023-01-22 14:24:13.494504: step: 880/464, loss: 0.08654702454805374 2023-01-22 14:24:14.158887: step: 882/464, loss: 0.04224324971437454 2023-01-22 14:24:14.815481: step: 884/464, loss: 0.24555733799934387 2023-01-22 14:24:15.547570: step: 886/464, loss: 0.0406196303665638 2023-01-22 14:24:16.251717: step: 888/464, loss: 0.018796822056174278 2023-01-22 14:24:16.850066: step: 890/464, loss: 0.05456920340657234 2023-01-22 14:24:17.558122: step: 892/464, loss: 0.030479077249765396 2023-01-22 14:24:18.383031: step: 894/464, loss: 0.046267785131931305 2023-01-22 14:24:19.119451: step: 896/464, loss: 0.014180081896483898 2023-01-22 14:24:19.892404: step: 898/464, loss: 0.07442369312047958 2023-01-22 14:24:20.683438: step: 900/464, loss: 0.05919457599520683 2023-01-22 14:24:21.401151: step: 902/464, loss: 0.0010191906476393342 2023-01-22 14:24:22.117343: step: 904/464, loss: 0.41506895422935486 2023-01-22 14:24:22.874001: step: 906/464, loss: 0.07254950702190399 2023-01-22 14:24:23.628523: step: 908/464, loss: 0.03722519800066948 2023-01-22 14:24:24.440329: step: 910/464, loss: 0.042995885014534 2023-01-22 14:24:25.100603: step: 912/464, loss: 0.06633169203996658 2023-01-22 14:24:25.883830: step: 914/464, loss: 0.11544479429721832 2023-01-22 14:24:26.651739: step: 916/464, loss: 0.03535765781998634 2023-01-22 14:24:27.345290: 
step: 918/464, loss: 0.09880461543798447 2023-01-22 14:24:28.039255: step: 920/464, loss: 0.19629810750484467 2023-01-22 14:24:28.823933: step: 922/464, loss: 0.07875211536884308 2023-01-22 14:24:29.630720: step: 924/464, loss: 0.1211591511964798 2023-01-22 14:24:30.316060: step: 926/464, loss: 0.03621485084295273 2023-01-22 14:24:30.992134: step: 928/464, loss: 0.0687384083867073 2023-01-22 14:24:31.643067: step: 930/464, loss: 0.06540561467409134
==================================================
Loss: 0.136
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3019097440321601, 'r': 0.34430314262491124, 'f1': 0.32171587972221316}, 'combined': 0.2370538061111044, 'epoch': 21}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.27531420096356374, 'r': 0.2892024544246337, 'f1': 0.2820874881073851}, 'combined': 0.17519117682458654, 'epoch': 21}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2848150120890121, 'r': 0.33940005235654575, 'f1': 0.3097209135790469}, 'combined': 0.2282154100056135, 'epoch': 21}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2743035375004626, 'r': 0.2894823893779585, 'f1': 0.28168863274085965}, 'combined': 0.17494346664958652, 'epoch': 21}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29576729296758847, 'r': 0.33673695594032843, 'f1': 0.3149252453958351}, 'combined': 0.23205018081798376, 'epoch': 21}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2932632523432252, 'r': 0.3010797457423043, 'f1': 0.2971200997924542}, 'combined': 0.18452721987110313, 'epoch': 21}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24107142857142858, 'r': 0.38571428571428573, 'f1': 0.2967032967032967}, 'combined': 0.1978021978021978, 'epoch': 21}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2558139534883721, 'r': 0.4782608695652174, 'f1': 0.33333333333333337}, 'combined': 0.16666666666666669, 'epoch': 21}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.390625, 'r': 0.3232758620689655, 'f1': 0.3537735849056604}, 'combined': 0.2358490566037736, 'epoch': 21}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 22
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:27:12.413856: step: 2/464, loss: 0.022911660373210907 2023-01-22 14:27:13.171721: step: 4/464, loss: 0.048933934420347214 2023-01-22 14:27:13.959003: step: 6/464, loss: 0.19535832107067108 2023-01-22 14:27:14.687663: step: 8/464, loss: 0.014200533740222454 2023-01-22 14:27:15.503741: step: 10/464, loss: 0.06074250862002373 2023-01-22 14:27:16.230429: step: 12/464, loss: 0.04321758449077606 2023-01-22 14:27:17.004124: step: 14/464, loss: 0.020962128415703773 2023-01-22 14:27:17.809030: step: 16/464, loss: 0.028577618300914764 2023-01-22 14:27:18.546334: step: 18/464, loss: 0.026005618274211884 2023-01-22 14:27:19.311217: step: 20/464, loss: 0.06449444591999054 2023-01-22 14:27:20.085364: step: 22/464, loss: 0.010618659667670727 2023-01-22 14:27:20.845808: step: 24/464, loss: 0.07123077660799026 2023-01-22 14:27:21.649950: step: 26/464, loss: 0.03983668237924576 2023-01-22 14:27:22.373171: step: 28/464, loss: 0.03342576324939728 2023-01-22 14:27:23.149130: step: 30/464, loss: 0.039252303540706635 2023-01-22 14:27:23.911186: step: 32/464, loss: 0.018664592877030373 2023-01-22 14:27:24.616534: step: 34/464, loss:
0.1506842076778412 2023-01-22 14:27:25.351988: step: 36/464, loss: 0.9334834814071655 2023-01-22 14:27:26.223559: step: 38/464, loss: 0.007731994614005089 2023-01-22 14:27:26.982205: step: 40/464, loss: 0.020626060664653778 2023-01-22 14:27:27.722076: step: 42/464, loss: 0.011708186939358711 2023-01-22 14:27:28.508688: step: 44/464, loss: 0.11415652185678482 2023-01-22 14:27:29.247167: step: 46/464, loss: 0.03801979497075081 2023-01-22 14:27:30.057708: step: 48/464, loss: 0.05007893592119217 2023-01-22 14:27:30.755880: step: 50/464, loss: 0.02001386322081089 2023-01-22 14:27:31.537480: step: 52/464, loss: 0.03638208657503128 2023-01-22 14:27:32.264975: step: 54/464, loss: 0.02277105487883091 2023-01-22 14:27:33.014936: step: 56/464, loss: 0.05654141679406166 2023-01-22 14:27:33.773875: step: 58/464, loss: 0.04333549737930298 2023-01-22 14:27:34.538530: step: 60/464, loss: 0.06319651007652283 2023-01-22 14:27:35.254772: step: 62/464, loss: 0.03601942956447601 2023-01-22 14:27:35.948519: step: 64/464, loss: 0.018980784341692924 2023-01-22 14:27:36.715885: step: 66/464, loss: 0.5078731775283813 2023-01-22 14:27:37.484510: step: 68/464, loss: 0.013460144400596619 2023-01-22 14:27:38.312484: step: 70/464, loss: 0.24691608548164368 2023-01-22 14:27:39.042234: step: 72/464, loss: 0.03298424929380417 2023-01-22 14:27:39.800663: step: 74/464, loss: 0.03243899345397949 2023-01-22 14:27:40.488348: step: 76/464, loss: 0.061396270990371704 2023-01-22 14:27:41.246618: step: 78/464, loss: 0.044844284653663635 2023-01-22 14:27:41.988726: step: 80/464, loss: 0.2925359606742859 2023-01-22 14:27:42.751245: step: 82/464, loss: 0.053221434354782104 2023-01-22 14:27:43.444892: step: 84/464, loss: 0.02855098433792591 2023-01-22 14:27:44.150710: step: 86/464, loss: 0.04170306771993637 2023-01-22 14:27:44.852809: step: 88/464, loss: 0.1138177141547203 2023-01-22 14:27:45.590385: step: 90/464, loss: 0.008606791496276855 2023-01-22 14:27:46.414462: step: 92/464, loss: 0.006901235319674015 
2023-01-22 14:27:47.088437: step: 94/464, loss: 0.13284118473529816 2023-01-22 14:27:47.783436: step: 96/464, loss: 0.018453076481819153 2023-01-22 14:27:48.511699: step: 98/464, loss: 0.06665834039449692 2023-01-22 14:27:49.257030: step: 100/464, loss: 0.07431719452142715 2023-01-22 14:27:50.002636: step: 102/464, loss: 0.009169948287308216 2023-01-22 14:27:50.674797: step: 104/464, loss: 0.007283048704266548 2023-01-22 14:27:51.465484: step: 106/464, loss: 0.10819932818412781 2023-01-22 14:27:52.167504: step: 108/464, loss: 0.02185155265033245 2023-01-22 14:27:52.921528: step: 110/464, loss: 0.015204259194433689 2023-01-22 14:27:53.683988: step: 112/464, loss: 0.02292870543897152 2023-01-22 14:27:54.422507: step: 114/464, loss: 0.19087494909763336 2023-01-22 14:27:55.177521: step: 116/464, loss: 0.07576505094766617 2023-01-22 14:27:55.896345: step: 118/464, loss: 0.01860167644917965 2023-01-22 14:27:56.735949: step: 120/464, loss: 0.04837913438677788 2023-01-22 14:27:57.494072: step: 122/464, loss: 0.05998101457953453 2023-01-22 14:27:58.285236: step: 124/464, loss: 0.09130522608757019 2023-01-22 14:27:59.011061: step: 126/464, loss: 0.06464896351099014 2023-01-22 14:27:59.770917: step: 128/464, loss: 0.014377767220139503 2023-01-22 14:28:00.520427: step: 130/464, loss: 0.015954501926898956 2023-01-22 14:28:01.314263: step: 132/464, loss: 0.09322734922170639 2023-01-22 14:28:02.029110: step: 134/464, loss: 0.007519495207816362 2023-01-22 14:28:02.855688: step: 136/464, loss: 0.24314019083976746 2023-01-22 14:28:03.624940: step: 138/464, loss: 0.08465724438428879 2023-01-22 14:28:04.405470: step: 140/464, loss: 0.04498489573597908 2023-01-22 14:28:05.091822: step: 142/464, loss: 0.05277256295084953 2023-01-22 14:28:05.954770: step: 144/464, loss: 0.0748797133564949 2023-01-22 14:28:06.702587: step: 146/464, loss: 0.6362795233726501 2023-01-22 14:28:07.467841: step: 148/464, loss: 0.0029451518785208464 2023-01-22 14:28:08.178888: step: 150/464, loss: 
0.03429803252220154 2023-01-22 14:28:09.010744: step: 152/464, loss: 0.04802894964814186 2023-01-22 14:28:09.880563: step: 154/464, loss: 0.07614947855472565 2023-01-22 14:28:10.656806: step: 156/464, loss: 0.0460069477558136 2023-01-22 14:28:11.413840: step: 158/464, loss: 0.03649597987532616 2023-01-22 14:28:12.231134: step: 160/464, loss: 0.02964697778224945 2023-01-22 14:28:12.990254: step: 162/464, loss: 0.03786936402320862 2023-01-22 14:28:13.693102: step: 164/464, loss: 0.030483728274703026 2023-01-22 14:28:14.374970: step: 166/464, loss: 0.05337598919868469 2023-01-22 14:28:15.135477: step: 168/464, loss: 0.017475681379437447 2023-01-22 14:28:15.925473: step: 170/464, loss: 0.26207685470581055 2023-01-22 14:28:16.673086: step: 172/464, loss: 0.07319401949644089 2023-01-22 14:28:17.355221: step: 174/464, loss: 0.025556493550539017 2023-01-22 14:28:18.127350: step: 176/464, loss: 0.06023148447275162 2023-01-22 14:28:18.931961: step: 178/464, loss: 0.16737209260463715 2023-01-22 14:28:19.647935: step: 180/464, loss: 0.013148444704711437 2023-01-22 14:28:20.349342: step: 182/464, loss: 0.08468661457300186 2023-01-22 14:28:21.124534: step: 184/464, loss: 0.013178776018321514 2023-01-22 14:28:21.830253: step: 186/464, loss: 0.02558089606463909 2023-01-22 14:28:22.608466: step: 188/464, loss: 0.4905921518802643 2023-01-22 14:28:23.305077: step: 190/464, loss: 0.034917011857032776 2023-01-22 14:28:23.960423: step: 192/464, loss: 0.004073809366673231 2023-01-22 14:28:24.700432: step: 194/464, loss: 0.012788454070687294 2023-01-22 14:28:25.520269: step: 196/464, loss: 0.055027078837156296 2023-01-22 14:28:26.248898: step: 198/464, loss: 0.03156561031937599 2023-01-22 14:28:27.042438: step: 200/464, loss: 0.004597052466124296 2023-01-22 14:28:27.737949: step: 202/464, loss: 0.010899416171014309 2023-01-22 14:28:28.498020: step: 204/464, loss: 0.0004813208943232894 2023-01-22 14:28:29.218741: step: 206/464, loss: 0.012121593579649925 2023-01-22 14:28:29.956381: step: 
208/464, loss: 0.031046288087964058 2023-01-22 14:28:30.737826: step: 210/464, loss: 0.0077215139754116535 2023-01-22 14:28:31.468940: step: 212/464, loss: 0.02506837248802185 2023-01-22 14:28:32.370542: step: 214/464, loss: 0.027091197669506073 2023-01-22 14:28:33.120588: step: 216/464, loss: 0.020081406459212303 2023-01-22 14:28:33.868175: step: 218/464, loss: 0.07068915665149689 2023-01-22 14:28:34.587296: step: 220/464, loss: 0.06371705234050751 2023-01-22 14:28:35.260219: step: 222/464, loss: 0.07542310655117035 2023-01-22 14:28:36.077079: step: 224/464, loss: 0.02536981925368309 2023-01-22 14:28:36.803033: step: 226/464, loss: 0.019606802612543106 2023-01-22 14:28:37.520230: step: 228/464, loss: 0.006087008863687515 2023-01-22 14:28:38.219676: step: 230/464, loss: 0.019796758890151978 2023-01-22 14:28:38.880282: step: 232/464, loss: 0.00813978910446167 2023-01-22 14:28:39.542459: step: 234/464, loss: 0.008569409139454365 2023-01-22 14:28:40.237986: step: 236/464, loss: 0.0056127277202904224 2023-01-22 14:28:40.977739: step: 238/464, loss: 0.015920937061309814 2023-01-22 14:28:41.729475: step: 240/464, loss: 0.010793288238346577 2023-01-22 14:28:42.540807: step: 242/464, loss: 0.07124904543161392 2023-01-22 14:28:43.296253: step: 244/464, loss: 0.038125790655612946 2023-01-22 14:28:44.010332: step: 246/464, loss: 0.08426562696695328 2023-01-22 14:28:44.738858: step: 248/464, loss: 0.43768274784088135 2023-01-22 14:28:45.425423: step: 250/464, loss: 0.005656179040670395 2023-01-22 14:28:46.114748: step: 252/464, loss: 0.00686601409688592 2023-01-22 14:28:46.802557: step: 254/464, loss: 0.030220884829759598 2023-01-22 14:28:47.586500: step: 256/464, loss: 0.009510435163974762 2023-01-22 14:28:48.230739: step: 258/464, loss: 0.022821389138698578 2023-01-22 14:28:48.930678: step: 260/464, loss: 0.05829579755663872 2023-01-22 14:28:49.752704: step: 262/464, loss: 0.08035297691822052 2023-01-22 14:28:50.460804: step: 264/464, loss: 0.14977802336215973 2023-01-22 
14:28:51.139241: step: 266/464, loss: 0.03890952467918396 2023-01-22 14:28:51.852986: step: 268/464, loss: 0.023425551131367683 2023-01-22 14:28:52.560654: step: 270/464, loss: 0.03978399559855461 2023-01-22 14:28:53.293369: step: 272/464, loss: 0.1666889637708664 2023-01-22 14:28:53.982405: step: 274/464, loss: 0.13102814555168152 2023-01-22 14:28:54.681326: step: 276/464, loss: 0.5807995796203613 2023-01-22 14:28:55.349992: step: 278/464, loss: 0.050122637301683426 2023-01-22 14:28:56.086301: step: 280/464, loss: 0.03379981592297554 2023-01-22 14:28:56.767077: step: 282/464, loss: 0.049888014793395996 2023-01-22 14:28:57.524393: step: 284/464, loss: 0.01664917916059494 2023-01-22 14:28:58.279612: step: 286/464, loss: 0.05668970197439194 2023-01-22 14:28:59.011078: step: 288/464, loss: 0.023765837773680687 2023-01-22 14:28:59.745190: step: 290/464, loss: 0.12605422735214233 2023-01-22 14:29:00.478019: step: 292/464, loss: 0.08360307663679123 2023-01-22 14:29:01.250191: step: 294/464, loss: 0.04222423955798149 2023-01-22 14:29:02.098455: step: 296/464, loss: 0.055150359869003296 2023-01-22 14:29:02.900531: step: 298/464, loss: 0.016825374215841293 2023-01-22 14:29:03.644239: step: 300/464, loss: 0.008687654510140419 2023-01-22 14:29:04.333592: step: 302/464, loss: 0.12264332920312881 2023-01-22 14:29:05.034826: step: 304/464, loss: 0.008854144252836704 2023-01-22 14:29:05.787971: step: 306/464, loss: 0.04204096645116806 2023-01-22 14:29:06.624158: step: 308/464, loss: 0.06922618299722672 2023-01-22 14:29:07.405010: step: 310/464, loss: 0.016595102846622467 2023-01-22 14:29:08.107071: step: 312/464, loss: 0.0613955520093441 2023-01-22 14:29:08.868063: step: 314/464, loss: 0.06470753252506256 2023-01-22 14:29:09.569475: step: 316/464, loss: 0.0006224927492439747 2023-01-22 14:29:10.319352: step: 318/464, loss: 0.38640323281288147 2023-01-22 14:29:10.981353: step: 320/464, loss: 0.05243564769625664 2023-01-22 14:29:11.754501: step: 322/464, loss: 0.03029179573059082 
2023-01-22 14:29:12.654675: step: 324/464, loss: 0.04183750972151756 2023-01-22 14:29:13.345725: step: 326/464, loss: 0.021579837426543236 2023-01-22 14:29:14.044328: step: 328/464, loss: 0.0030274181626737118 2023-01-22 14:29:14.762201: step: 330/464, loss: 0.016017360612750053 2023-01-22 14:29:15.562165: step: 332/464, loss: 0.1567627489566803 2023-01-22 14:29:16.255587: step: 334/464, loss: 0.009751198813319206 2023-01-22 14:29:17.041599: step: 336/464, loss: 0.2539663314819336 2023-01-22 14:29:17.727560: step: 338/464, loss: 0.0058738901279866695 2023-01-22 14:29:18.550703: step: 340/464, loss: 0.01695852354168892 2023-01-22 14:29:19.308685: step: 342/464, loss: 0.11627886444330215 2023-01-22 14:29:20.139123: step: 344/464, loss: 0.1474836766719818 2023-01-22 14:29:20.906201: step: 346/464, loss: 0.18009726703166962 2023-01-22 14:29:21.587761: step: 348/464, loss: 0.8425463438034058 2023-01-22 14:29:22.307381: step: 350/464, loss: 0.0017548745963722467 2023-01-22 14:29:22.969441: step: 352/464, loss: 0.06723642349243164 2023-01-22 14:29:23.705838: step: 354/464, loss: 0.033363692462444305 2023-01-22 14:29:24.450968: step: 356/464, loss: 0.006056908518075943 2023-01-22 14:29:25.323817: step: 358/464, loss: 0.04731690138578415 2023-01-22 14:29:26.047225: step: 360/464, loss: 0.03328216075897217 2023-01-22 14:29:26.797000: step: 362/464, loss: 0.0012911552330479026 2023-01-22 14:29:27.544282: step: 364/464, loss: 0.02457933872938156 2023-01-22 14:29:28.237676: step: 366/464, loss: 0.07838442176580429 2023-01-22 14:29:28.948275: step: 368/464, loss: 0.06474398076534271 2023-01-22 14:29:29.759881: step: 370/464, loss: 0.05870361626148224 2023-01-22 14:29:30.575242: step: 372/464, loss: 0.24338991940021515 2023-01-22 14:29:31.255584: step: 374/464, loss: 0.04443281888961792 2023-01-22 14:29:31.984743: step: 376/464, loss: 0.041696105152368546 2023-01-22 14:29:32.662262: step: 378/464, loss: 0.014436283148825169 2023-01-22 14:29:33.362729: step: 380/464, loss: 
0.06830400973558426 2023-01-22 14:29:34.034217: step: 382/464, loss: 0.06629308313131332 2023-01-22 14:29:34.742132: step: 384/464, loss: 0.0009762270492501557 2023-01-22 14:29:35.390910: step: 386/464, loss: 0.07588119804859161 2023-01-22 14:29:36.126566: step: 388/464, loss: 0.03207830712199211 2023-01-22 14:29:36.844718: step: 390/464, loss: 0.23488739132881165 2023-01-22 14:29:37.647478: step: 392/464, loss: 0.024468503892421722 2023-01-22 14:29:38.444213: step: 394/464, loss: 0.03723740577697754 2023-01-22 14:29:39.198889: step: 396/464, loss: 0.08763992786407471 2023-01-22 14:29:39.867640: step: 398/464, loss: 0.06267467141151428 2023-01-22 14:29:40.609092: step: 400/464, loss: 0.02886023558676243 2023-01-22 14:29:41.347394: step: 402/464, loss: 0.057991873472929 2023-01-22 14:29:42.219025: step: 404/464, loss: 0.018807729706168175 2023-01-22 14:29:42.938402: step: 406/464, loss: 0.011335845105350018 2023-01-22 14:29:43.709949: step: 408/464, loss: 0.058162566274404526 2023-01-22 14:29:44.478570: step: 410/464, loss: 0.1280667930841446 2023-01-22 14:29:45.203388: step: 412/464, loss: 0.02704724669456482 2023-01-22 14:29:45.852932: step: 414/464, loss: 0.02012377791106701 2023-01-22 14:29:46.601633: step: 416/464, loss: 0.041829902678728104 2023-01-22 14:29:47.436212: step: 418/464, loss: 0.08959502726793289 2023-01-22 14:29:48.162276: step: 420/464, loss: 0.017190277576446533 2023-01-22 14:29:48.894819: step: 422/464, loss: 1.137946605682373 2023-01-22 14:29:49.600526: step: 424/464, loss: 0.08984936028718948 2023-01-22 14:29:50.322439: step: 426/464, loss: 0.05588805302977562 2023-01-22 14:29:51.026769: step: 428/464, loss: 0.02053980715572834 2023-01-22 14:29:51.731190: step: 430/464, loss: 0.029160495847463608 2023-01-22 14:29:52.543702: step: 432/464, loss: 0.0642860159277916 2023-01-22 14:29:53.289165: step: 434/464, loss: 0.6985005736351013 2023-01-22 14:29:53.998309: step: 436/464, loss: 0.5311950445175171 2023-01-22 14:29:54.744297: step: 438/464, 
loss: 0.11107250303030014 2023-01-22 14:29:55.440169: step: 440/464, loss: 0.021011127158999443 2023-01-22 14:29:56.222835: step: 442/464, loss: 0.04673737287521362 2023-01-22 14:29:56.921873: step: 444/464, loss: 0.002503372263163328 2023-01-22 14:29:57.613475: step: 446/464, loss: 0.11988291144371033 2023-01-22 14:29:58.310202: step: 448/464, loss: 0.07330343127250671 2023-01-22 14:29:59.096255: step: 450/464, loss: 0.5222264528274536 2023-01-22 14:29:59.833797: step: 452/464, loss: 0.03502650558948517 2023-01-22 14:30:00.577554: step: 454/464, loss: 0.24348917603492737 2023-01-22 14:30:01.378188: step: 456/464, loss: 0.06922262907028198 2023-01-22 14:30:02.150411: step: 458/464, loss: 0.01801128126680851 2023-01-22 14:30:02.922212: step: 460/464, loss: 0.05330865830183029 2023-01-22 14:30:03.666938: step: 462/464, loss: 0.10185433179140091 2023-01-22 14:30:04.463465: step: 464/464, loss: 0.033372409641742706 2023-01-22 14:30:05.253894: step: 466/464, loss: 0.06386049836874008 2023-01-22 14:30:05.930471: step: 468/464, loss: 0.022253213450312614 2023-01-22 14:30:06.736498: step: 470/464, loss: 0.04618272930383682 2023-01-22 14:30:07.507014: step: 472/464, loss: 0.029864918440580368 2023-01-22 14:30:08.209652: step: 474/464, loss: 0.13251470029354095 2023-01-22 14:30:08.939065: step: 476/464, loss: 0.2079554796218872 2023-01-22 14:30:09.634406: step: 478/464, loss: 0.2565006613731384 2023-01-22 14:30:10.308335: step: 480/464, loss: 0.02998241037130356 2023-01-22 14:30:11.040242: step: 482/464, loss: 0.014584019780158997 2023-01-22 14:30:11.898142: step: 484/464, loss: 0.08024832606315613 2023-01-22 14:30:12.648082: step: 486/464, loss: 0.14153648912906647 2023-01-22 14:30:13.467649: step: 488/464, loss: 0.16485629975795746 2023-01-22 14:30:14.264216: step: 490/464, loss: 0.01724920980632305 2023-01-22 14:30:15.000903: step: 492/464, loss: 0.028226202353835106 2023-01-22 14:30:15.719483: step: 494/464, loss: 0.044736597687006 2023-01-22 14:30:16.530230: step: 
496/464, loss: 0.0690031349658966 2023-01-22 14:30:17.229882: step: 498/464, loss: 0.021006660535931587 2023-01-22 14:30:18.048185: step: 500/464, loss: 0.12912136316299438 2023-01-22 14:30:18.808856: step: 502/464, loss: 0.10888178646564484 2023-01-22 14:30:19.495776: step: 504/464, loss: 0.009902720339596272 2023-01-22 14:30:20.275948: step: 506/464, loss: 0.02853531204164028 2023-01-22 14:30:21.004387: step: 508/464, loss: 0.0805187001824379 2023-01-22 14:30:21.735688: step: 510/464, loss: 0.030831599608063698 2023-01-22 14:30:22.437524: step: 512/464, loss: 0.041993558406829834 2023-01-22 14:30:23.168985: step: 514/464, loss: 0.02250916138291359 2023-01-22 14:30:23.994648: step: 516/464, loss: 0.1520540416240692 2023-01-22 14:30:24.754913: step: 518/464, loss: 0.04919816926121712 2023-01-22 14:30:25.660748: step: 520/464, loss: 0.029099134728312492 2023-01-22 14:30:26.365633: step: 522/464, loss: 0.04807744175195694 2023-01-22 14:30:27.039536: step: 524/464, loss: 0.041146181523799896 2023-01-22 14:30:27.746163: step: 526/464, loss: 0.042510777711868286 2023-01-22 14:30:28.477018: step: 528/464, loss: 0.058072157204151154 2023-01-22 14:30:29.222325: step: 530/464, loss: 1.4828964471817017 2023-01-22 14:30:29.913054: step: 532/464, loss: 0.002875608392059803 2023-01-22 14:30:30.786366: step: 534/464, loss: 0.02097652293741703 2023-01-22 14:30:31.516094: step: 536/464, loss: 0.01865016296505928 2023-01-22 14:30:32.227537: step: 538/464, loss: 0.01164599135518074 2023-01-22 14:30:33.039826: step: 540/464, loss: 0.024205351248383522 2023-01-22 14:30:33.770289: step: 542/464, loss: 0.03508354350924492 2023-01-22 14:30:34.604783: step: 544/464, loss: 1.322606086730957 2023-01-22 14:30:35.313377: step: 546/464, loss: 0.07354927062988281 2023-01-22 14:30:36.059160: step: 548/464, loss: 0.0703032910823822 2023-01-22 14:30:36.844783: step: 550/464, loss: 0.016364462673664093 2023-01-22 14:30:37.594493: step: 552/464, loss: 0.010696408338844776 2023-01-22 14:30:38.303435: 
step: 554/464, loss: 0.03282889723777771 2023-01-22 14:30:38.996654: step: 556/464, loss: 0.004567502066493034 2023-01-22 14:30:39.764465: step: 558/464, loss: 0.2184530347585678 2023-01-22 14:30:40.523024: step: 560/464, loss: 0.016430631279945374 2023-01-22 14:30:41.350421: step: 562/464, loss: 0.1944238543510437 2023-01-22 14:30:42.114393: step: 564/464, loss: 0.030758991837501526 2023-01-22 14:30:42.863588: step: 566/464, loss: 0.046245649456977844 2023-01-22 14:30:43.612294: step: 568/464, loss: 0.03639359399676323 2023-01-22 14:30:44.241776: step: 570/464, loss: 0.03184223920106888 2023-01-22 14:30:44.909696: step: 572/464, loss: 0.0072190905921161175 2023-01-22 14:30:45.555593: step: 574/464, loss: 0.006273357663303614 2023-01-22 14:30:46.365487: step: 576/464, loss: 0.1006399616599083 2023-01-22 14:30:47.046671: step: 578/464, loss: 0.014032825827598572 2023-01-22 14:30:47.924343: step: 580/464, loss: 0.04493585601449013 2023-01-22 14:30:48.672949: step: 582/464, loss: 0.02772378921508789 2023-01-22 14:30:49.357966: step: 584/464, loss: 0.053779810667037964 2023-01-22 14:30:50.101264: step: 586/464, loss: 0.11084982752799988 2023-01-22 14:30:50.882390: step: 588/464, loss: 0.03748084232211113 2023-01-22 14:30:51.627099: step: 590/464, loss: 0.0179930217564106 2023-01-22 14:30:52.446237: step: 592/464, loss: 0.014146368019282818 2023-01-22 14:30:53.224255: step: 594/464, loss: 0.14494703710079193 2023-01-22 14:30:53.980591: step: 596/464, loss: 0.05555734410881996 2023-01-22 14:30:54.681989: step: 598/464, loss: 0.05155664309859276 2023-01-22 14:30:55.400573: step: 600/464, loss: 0.16895079612731934 2023-01-22 14:30:56.157279: step: 602/464, loss: 0.06997605413198471 2023-01-22 14:30:56.873201: step: 604/464, loss: 0.040857136249542236 2023-01-22 14:30:57.601081: step: 606/464, loss: 0.0022094054147601128 2023-01-22 14:30:58.410122: step: 608/464, loss: 0.007077968679368496 2023-01-22 14:30:59.137396: step: 610/464, loss: 0.02948709949851036 2023-01-22 
14:30:59.905662: step: 612/464, loss: 0.047195546329021454 2023-01-22 14:31:00.617328: step: 614/464, loss: 0.2917841970920563 2023-01-22 14:31:01.366648: step: 616/464, loss: 0.0415961816906929 2023-01-22 14:31:02.167795: step: 618/464, loss: 0.31351780891418457 2023-01-22 14:31:02.942684: step: 620/464, loss: 0.36635974049568176 2023-01-22 14:31:03.612476: step: 622/464, loss: 0.17023654282093048 2023-01-22 14:31:04.311436: step: 624/464, loss: 0.016414647921919823 2023-01-22 14:31:05.103604: step: 626/464, loss: 0.045262303203344345 2023-01-22 14:31:05.826954: step: 628/464, loss: 0.11020675301551819 2023-01-22 14:31:06.528245: step: 630/464, loss: 0.05932263284921646 2023-01-22 14:31:07.291216: step: 632/464, loss: 0.11531080305576324 2023-01-22 14:31:08.033669: step: 634/464, loss: 0.03160162642598152 2023-01-22 14:31:08.756517: step: 636/464, loss: 0.10523614287376404 2023-01-22 14:31:09.508406: step: 638/464, loss: 0.019944677129387856 2023-01-22 14:31:10.278594: step: 640/464, loss: 0.031220460310578346 2023-01-22 14:31:10.911301: step: 642/464, loss: 0.08424553275108337 2023-01-22 14:31:11.660830: step: 644/464, loss: 0.05185442790389061 2023-01-22 14:31:12.363814: step: 646/464, loss: 0.01238182745873928 2023-01-22 14:31:13.078670: step: 648/464, loss: 0.040707577019929886 2023-01-22 14:31:13.761814: step: 650/464, loss: 0.0012632677098736167 2023-01-22 14:31:14.499180: step: 652/464, loss: 0.06378267705440521 2023-01-22 14:31:15.216154: step: 654/464, loss: 0.033059362322092056 2023-01-22 14:31:15.987684: step: 656/464, loss: 0.015181529335677624 2023-01-22 14:31:16.725360: step: 658/464, loss: 0.04422836750745773 2023-01-22 14:31:17.559608: step: 660/464, loss: 0.12794829905033112 2023-01-22 14:31:18.346577: step: 662/464, loss: 0.04287206009030342 2023-01-22 14:31:19.057072: step: 664/464, loss: 0.008199275471270084 2023-01-22 14:31:19.789196: step: 666/464, loss: 0.024731893092393875 2023-01-22 14:31:20.684837: step: 668/464, loss: 
0.028729377314448357 2023-01-22 14:31:21.425888: step: 670/464, loss: 0.056502483785152435 2023-01-22 14:31:22.184854: step: 672/464, loss: 0.022548364475369453 2023-01-22 14:31:22.975604: step: 674/464, loss: 0.037794943898916245 2023-01-22 14:31:23.661219: step: 676/464, loss: 0.006393145769834518 2023-01-22 14:31:24.360719: step: 678/464, loss: 0.28734323382377625 2023-01-22 14:31:25.128277: step: 680/464, loss: 0.16981349885463715 2023-01-22 14:31:25.933801: step: 682/464, loss: 0.13790276646614075 2023-01-22 14:31:26.696258: step: 684/464, loss: 0.07101164013147354 2023-01-22 14:31:27.392260: step: 686/464, loss: 0.04143301770091057 2023-01-22 14:31:28.198268: step: 688/464, loss: 0.03418415039777756 2023-01-22 14:31:28.925173: step: 690/464, loss: 0.005767000373452902 2023-01-22 14:31:29.675194: step: 692/464, loss: 0.04734319448471069 2023-01-22 14:31:30.460409: step: 694/464, loss: 0.04089387133717537 2023-01-22 14:31:31.170768: step: 696/464, loss: 0.02467603236436844 2023-01-22 14:31:31.867338: step: 698/464, loss: 0.041893370449543 2023-01-22 14:31:32.609512: step: 700/464, loss: 0.05832946300506592 2023-01-22 14:31:33.461688: step: 702/464, loss: 0.045026276260614395 2023-01-22 14:31:34.243612: step: 704/464, loss: 0.10983053594827652 2023-01-22 14:31:34.936225: step: 706/464, loss: 0.029054909944534302 2023-01-22 14:31:35.684683: step: 708/464, loss: 0.04096385836601257 2023-01-22 14:31:36.422013: step: 710/464, loss: 0.012389479205012321 2023-01-22 14:31:37.126269: step: 712/464, loss: 0.07159966975450516 2023-01-22 14:31:37.774626: step: 714/464, loss: 0.030216261744499207 2023-01-22 14:31:38.523416: step: 716/464, loss: 0.03051423840224743 2023-01-22 14:31:39.288899: step: 718/464, loss: 0.017640970647335052 2023-01-22 14:31:40.051742: step: 720/464, loss: 0.017679141834378242 2023-01-22 14:31:40.794472: step: 722/464, loss: 0.046189967542886734 2023-01-22 14:31:41.541514: step: 724/464, loss: 0.3321578800678253 2023-01-22 14:31:42.282086: step: 
726/464, loss: 0.09389912337064743 2023-01-22 14:31:43.038646: step: 728/464, loss: 0.42146536707878113 2023-01-22 14:31:43.782521: step: 730/464, loss: 0.1609586775302887 2023-01-22 14:31:44.496118: step: 732/464, loss: 0.29355260729789734 2023-01-22 14:31:45.232151: step: 734/464, loss: 0.013211609795689583 2023-01-22 14:31:45.955902: step: 736/464, loss: 0.013481264002621174 2023-01-22 14:31:46.785551: step: 738/464, loss: 0.040613461285829544 2023-01-22 14:31:47.505142: step: 740/464, loss: 0.01638202555477619 2023-01-22 14:31:48.239870: step: 742/464, loss: 0.024240298196673393 2023-01-22 14:31:48.936161: step: 744/464, loss: 0.04148198664188385 2023-01-22 14:31:49.669491: step: 746/464, loss: 0.980080783367157 2023-01-22 14:31:50.453404: step: 748/464, loss: 0.03613200783729553 2023-01-22 14:31:51.185115: step: 750/464, loss: 0.07290980219841003 2023-01-22 14:31:52.097870: step: 752/464, loss: 0.10207195580005646 2023-01-22 14:31:52.787921: step: 754/464, loss: 0.045234184712171555 2023-01-22 14:31:53.542649: step: 756/464, loss: 0.014640030451118946 2023-01-22 14:31:54.264920: step: 758/464, loss: 0.021963372826576233 2023-01-22 14:31:54.971873: step: 760/464, loss: 0.030695024877786636 2023-01-22 14:31:55.745170: step: 762/464, loss: 0.06672321259975433 2023-01-22 14:31:56.471874: step: 764/464, loss: 0.0397285558283329 2023-01-22 14:31:57.199476: step: 766/464, loss: 0.05721195042133331 2023-01-22 14:31:57.946238: step: 768/464, loss: 0.007625281810760498 2023-01-22 14:31:58.749147: step: 770/464, loss: 0.10010773688554764 2023-01-22 14:31:59.416116: step: 772/464, loss: 0.04354416951537132 2023-01-22 14:32:00.232666: step: 774/464, loss: 0.06004622206091881 2023-01-22 14:32:01.001921: step: 776/464, loss: 0.03328992426395416 2023-01-22 14:32:01.734939: step: 778/464, loss: 0.007132671307772398 2023-01-22 14:32:02.466725: step: 780/464, loss: 0.06367408484220505 2023-01-22 14:32:03.258061: step: 782/464, loss: 0.24943578243255615 2023-01-22 
14:32:03.994504: step: 784/464, loss: 0.029331697151064873 2023-01-22 14:32:04.678518: step: 786/464, loss: 0.07233763486146927 2023-01-22 14:32:05.370999: step: 788/464, loss: 0.015424920246005058 2023-01-22 14:32:06.142252: step: 790/464, loss: 0.17237427830696106 2023-01-22 14:32:06.810983: step: 792/464, loss: 0.05033355578780174 2023-01-22 14:32:07.514569: step: 794/464, loss: 0.012341322377324104 2023-01-22 14:32:08.188547: step: 796/464, loss: 0.041706882417201996 2023-01-22 14:32:08.972168: step: 798/464, loss: 0.15417589247226715 2023-01-22 14:32:09.754695: step: 800/464, loss: 0.5894966125488281 2023-01-22 14:32:10.543483: step: 802/464, loss: 0.04737395420670509 2023-01-22 14:32:11.265371: step: 804/464, loss: 0.03578581288456917 2023-01-22 14:32:11.956463: step: 806/464, loss: 0.018068134784698486 2023-01-22 14:32:12.739433: step: 808/464, loss: 0.02468440867960453 2023-01-22 14:32:13.503525: step: 810/464, loss: 0.010481053963303566 2023-01-22 14:32:14.245021: step: 812/464, loss: 0.05397513136267662 2023-01-22 14:32:15.049023: step: 814/464, loss: 0.03365986421704292 2023-01-22 14:32:15.848423: step: 816/464, loss: 1.4167505502700806 2023-01-22 14:32:16.480762: step: 818/464, loss: 0.20036783814430237 2023-01-22 14:32:17.186132: step: 820/464, loss: 0.0020387363620102406 2023-01-22 14:32:17.878662: step: 822/464, loss: 0.022577812895178795 2023-01-22 14:32:18.644646: step: 824/464, loss: 0.18894566595554352 2023-01-22 14:32:19.399493: step: 826/464, loss: 0.11747203022241592 2023-01-22 14:32:20.038521: step: 828/464, loss: 0.013679565861821175 2023-01-22 14:32:20.734510: step: 830/464, loss: 0.010823186486959457 2023-01-22 14:32:21.485201: step: 832/464, loss: 0.002872373443096876 2023-01-22 14:32:22.188898: step: 834/464, loss: 0.09699154645204544 2023-01-22 14:32:22.940067: step: 836/464, loss: 0.030660971999168396 2023-01-22 14:32:23.643462: step: 838/464, loss: 0.027780331671237946 2023-01-22 14:32:24.415458: step: 840/464, loss: 
0.06117891147732735 2023-01-22 14:32:25.218434: step: 842/464, loss: 0.012357320636510849 2023-01-22 14:32:25.922979: step: 844/464, loss: 0.036991577595472336 2023-01-22 14:32:26.626250: step: 846/464, loss: 0.1404614895582199 2023-01-22 14:32:27.389682: step: 848/464, loss: 0.006232497747987509 2023-01-22 14:32:28.111399: step: 850/464, loss: 0.0432085283100605 2023-01-22 14:32:28.912317: step: 852/464, loss: 0.03327338397502899 2023-01-22 14:32:29.576975: step: 854/464, loss: 0.03004441037774086 2023-01-22 14:32:30.400495: step: 856/464, loss: 0.01888202875852585 2023-01-22 14:32:31.152735: step: 858/464, loss: 0.025666790083050728 2023-01-22 14:32:31.862787: step: 860/464, loss: 0.040373627096414566 2023-01-22 14:32:32.550126: step: 862/464, loss: 0.0444427989423275 2023-01-22 14:32:33.209549: step: 864/464, loss: 0.020637089386582375 2023-01-22 14:32:33.935927: step: 866/464, loss: 0.013471740297973156 2023-01-22 14:32:34.718714: step: 868/464, loss: 0.44508418440818787 2023-01-22 14:32:35.509331: step: 870/464, loss: 0.06201205402612686 2023-01-22 14:32:36.244941: step: 872/464, loss: 0.026107273995876312 2023-01-22 14:32:36.944878: step: 874/464, loss: 0.026736147701740265 2023-01-22 14:32:37.769961: step: 876/464, loss: 0.0658494085073471 2023-01-22 14:32:38.495970: step: 878/464, loss: 0.013066594488918781 2023-01-22 14:32:39.283202: step: 880/464, loss: 0.027758102864027023 2023-01-22 14:32:40.033254: step: 882/464, loss: 0.004973025061190128 2023-01-22 14:32:40.719443: step: 884/464, loss: 0.09600269794464111 2023-01-22 14:32:41.464303: step: 886/464, loss: 0.0065295337699353695 2023-01-22 14:32:42.211482: step: 888/464, loss: 0.07899457216262817 2023-01-22 14:32:42.952436: step: 890/464, loss: 0.05921126902103424 2023-01-22 14:32:43.654575: step: 892/464, loss: 0.07933531701564789 2023-01-22 14:32:44.408604: step: 894/464, loss: 1.9494664669036865 2023-01-22 14:32:45.109168: step: 896/464, loss: 0.02592792920768261 2023-01-22 14:32:45.892551: step: 
898/464, loss: 0.03737859055399895 2023-01-22 14:32:46.749499: step: 900/464, loss: 0.040085483342409134 2023-01-22 14:32:47.442911: step: 902/464, loss: 0.02060028165578842 2023-01-22 14:32:48.136843: step: 904/464, loss: 0.012544417753815651 2023-01-22 14:32:48.819430: step: 906/464, loss: 0.005425968207418919 2023-01-22 14:32:49.455233: step: 908/464, loss: 0.02438453957438469 2023-01-22 14:32:50.207468: step: 910/464, loss: 0.06317855417728424 2023-01-22 14:32:50.920124: step: 912/464, loss: 0.009412641637027264 2023-01-22 14:32:51.645957: step: 914/464, loss: 0.032311778515577316 2023-01-22 14:32:52.435154: step: 916/464, loss: 0.018643241375684738 2023-01-22 14:32:53.257758: step: 918/464, loss: 0.0377984344959259 2023-01-22 14:32:54.042275: step: 920/464, loss: 0.10418711602687836 2023-01-22 14:32:54.749171: step: 922/464, loss: 0.198570117354393 2023-01-22 14:32:55.528767: step: 924/464, loss: 1.4531371593475342 2023-01-22 14:32:56.218620: step: 926/464, loss: 0.06543616205453873 2023-01-22 14:32:56.908969: step: 928/464, loss: 0.2391623854637146 2023-01-22 14:32:57.528989: step: 930/464, loss: 0.027653774246573448 ================================================== Loss: 0.093 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30544195972566757, 'r': 0.36108224081421425, 'f1': 0.3309397233201581}, 'combined': 0.24385032244643226, 'epoch': 22} Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2904566881696699, 'r': 0.29792639528382564, 'f1': 0.29414412659369893}, 'combined': 0.18267898388450776, 'epoch': 22} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28800359586649904, 'r': 0.35576914783508706, 'f1': 0.3183197638524463}, 'combined': 0.2345514049439078, 'epoch': 22} Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 
'slot': {'p': 0.2830303762456697, 'r': 0.29170885464687224, 'f1': 0.28730409356842457}, 'combined': 0.17843096337407421, 'epoch': 22} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3001040707890553, 'r': 0.3581887296514531, 'f1': 0.3265838417410308}, 'combined': 0.24064072549339108, 'epoch': 22} Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3011024222801996, 'r': 0.301400248612821, 'f1': 0.30125126183644296}, 'combined': 0.18709288893000142, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.30714285714285716, 'f1': 0.27564102564102566}, 'combined': 0.18376068376068377, 'epoch': 22} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27717391304347827, 'r': 0.5543478260869565, 'f1': 0.3695652173913043}, 'combined': 0.18478260869565216, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31, 'r': 0.2672413793103448, 'f1': 0.287037037037037}, 'combined': 0.191358024691358, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12} Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2957374656593406, 'r': 0.3501711168338303, 'f1': 0.3206606056844979}, 'combined': 0.23627623576752474, 'epoch': 12} Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30237642650399266, 'r': 0.2871380888046808, 'f1': 0.2945603100560943}, 'combined': 0.18293745571904804, 'epoch': 12} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.31976744186046513, 'r': 0.5978260869565217, 'f1': 0.4166666666666667}, 'combined': 0.20833333333333334, 'epoch': 12} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8} Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8} ****************************** Epoch: 23 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 14:35:39.112820: step: 2/464, loss: 0.02068914659321308 2023-01-22 14:35:39.871539: step: 4/464, loss: 0.03398273512721062 2023-01-22 14:35:40.568840: step: 6/464, loss: 0.08365309983491898 2023-01-22 14:35:41.190556: step: 8/464, loss: 0.026360979303717613 2023-01-22 14:35:41.971798: step: 10/464, loss: 0.021755725145339966 2023-01-22 14:35:42.735364: step: 12/464, loss: 0.03131546080112457 2023-01-22 14:35:43.625009: step: 14/464, loss: 0.06308458745479584 
2023-01-22 14:35:44.354744: step: 16/464, loss: 0.04444117099046707 2023-01-22 14:35:45.108409: step: 18/464, loss: 0.08508457988500595 2023-01-22 14:35:45.827955: step: 20/464, loss: 0.11405806988477707 2023-01-22 14:35:46.527917: step: 22/464, loss: 0.08450333029031754 2023-01-22 14:35:47.314063: step: 24/464, loss: 0.1825638711452484 2023-01-22 14:35:47.988631: step: 26/464, loss: 0.035793039947748184 2023-01-22 14:35:48.754923: step: 28/464, loss: 0.04304810240864754 2023-01-22 14:35:49.539952: step: 30/464, loss: 0.42393800616264343 2023-01-22 14:35:50.327490: step: 32/464, loss: 0.04675702378153801 2023-01-22 14:35:50.967935: step: 34/464, loss: 0.006157548166811466 2023-01-22 14:35:51.681795: step: 36/464, loss: 0.7356515526771545 2023-01-22 14:35:52.351916: step: 38/464, loss: 0.02841765619814396 2023-01-22 14:35:53.066508: step: 40/464, loss: 0.011313747614622116 2023-01-22 14:35:53.811682: step: 42/464, loss: 0.7958417534828186 2023-01-22 14:35:54.507057: step: 44/464, loss: 0.023113884031772614 2023-01-22 14:35:55.263516: step: 46/464, loss: 0.05851215496659279 2023-01-22 14:35:56.034653: step: 48/464, loss: 0.020469114184379578 2023-01-22 14:35:56.766107: step: 50/464, loss: 0.009623406454920769 2023-01-22 14:35:57.525953: step: 52/464, loss: 0.06048066169023514 2023-01-22 14:35:58.289927: step: 54/464, loss: 0.02682196907699108 2023-01-22 14:35:58.979603: step: 56/464, loss: 0.020548773929476738 2023-01-22 14:35:59.681753: step: 58/464, loss: 0.033835507929325104 2023-01-22 14:36:00.465693: step: 60/464, loss: 0.041417691856622696 2023-01-22 14:36:01.176170: step: 62/464, loss: 0.026796171441674232 2023-01-22 14:36:01.896262: step: 64/464, loss: 0.04955141246318817 2023-01-22 14:36:02.624755: step: 66/464, loss: 0.04345294088125229 2023-01-22 14:36:03.365146: step: 68/464, loss: 0.00819335225969553 2023-01-22 14:36:04.121769: step: 70/464, loss: 0.01806209795176983 2023-01-22 14:36:04.875296: step: 72/464, loss: 0.4654695391654968 2023-01-22 
14:36:05.579752: step: 74/464, loss: 0.1681564301252365 2023-01-22 14:36:06.252986: step: 76/464, loss: 0.008412414230406284 2023-01-22 14:36:06.966905: step: 78/464, loss: 0.013876117765903473 2023-01-22 14:36:07.752879: step: 80/464, loss: 0.053247906267642975 2023-01-22 14:36:08.610031: step: 82/464, loss: 0.08832264691591263 2023-01-22 14:36:09.370750: step: 84/464, loss: 0.20598171651363373 2023-01-22 14:36:10.113494: step: 86/464, loss: 0.028211485594511032 2023-01-22 14:36:10.846748: step: 88/464, loss: 0.07934857904911041 2023-01-22 14:36:11.531465: step: 90/464, loss: 0.07944697141647339 2023-01-22 14:36:12.286067: step: 92/464, loss: 0.08435480296611786 2023-01-22 14:36:12.928064: step: 94/464, loss: 0.00021403271239250898 2023-01-22 14:36:13.624216: step: 96/464, loss: 0.022179214283823967 2023-01-22 14:36:14.365406: step: 98/464, loss: 0.06918807327747345 2023-01-22 14:36:15.105571: step: 100/464, loss: 0.3057056665420532 2023-01-22 14:36:15.844312: step: 102/464, loss: 0.03395991399884224 2023-01-22 14:36:16.515534: step: 104/464, loss: 0.0005153696401976049 2023-01-22 14:36:17.369340: step: 106/464, loss: 0.010365036316215992 2023-01-22 14:36:18.066199: step: 108/464, loss: 0.1427009552717209 2023-01-22 14:36:18.831229: step: 110/464, loss: 0.017721237614750862 2023-01-22 14:36:19.558182: step: 112/464, loss: 0.09974846243858337 2023-01-22 14:36:20.304376: step: 114/464, loss: 0.02816518023610115 2023-01-22 14:36:21.020371: step: 116/464, loss: 0.06467827409505844 2023-01-22 14:36:21.764891: step: 118/464, loss: 0.004381565377116203 2023-01-22 14:36:22.553337: step: 120/464, loss: 0.0018641493516042829 2023-01-22 14:36:23.412561: step: 122/464, loss: 0.006512850988656282 2023-01-22 14:36:24.147180: step: 124/464, loss: 0.17227263748645782 2023-01-22 14:36:24.929911: step: 126/464, loss: 0.13674922287464142 2023-01-22 14:36:25.701972: step: 128/464, loss: 0.036892298609018326 2023-01-22 14:36:26.402940: step: 130/464, loss: 0.28341665863990784 
2023-01-22 14:36:27.135437: step: 132/464, loss: 0.04057060182094574 2023-01-22 14:36:27.867904: step: 134/464, loss: 0.03943055495619774 2023-01-22 14:36:28.620173: step: 136/464, loss: 0.10524342954158783 2023-01-22 14:36:29.427661: step: 138/464, loss: 0.024937961250543594 2023-01-22 14:36:30.167877: step: 140/464, loss: 1.0579050779342651 2023-01-22 14:36:30.910923: step: 142/464, loss: 0.020976925268769264 2023-01-22 14:36:31.672542: step: 144/464, loss: 0.04678136110305786 2023-01-22 14:36:32.438893: step: 146/464, loss: 0.04356539994478226 2023-01-22 14:36:33.193334: step: 148/464, loss: 0.027293233200907707 2023-01-22 14:36:33.925991: step: 150/464, loss: 0.07927026599645615 2023-01-22 14:36:34.687542: step: 152/464, loss: 0.052855201065540314 2023-01-22 14:36:35.445843: step: 154/464, loss: 0.07365193963050842 2023-01-22 14:36:36.181728: step: 156/464, loss: 0.014998635277152061 2023-01-22 14:36:36.844997: step: 158/464, loss: 0.020044660195708275 2023-01-22 14:36:37.573403: step: 160/464, loss: 0.006574376951903105 2023-01-22 14:36:38.340030: step: 162/464, loss: 0.10204543173313141 2023-01-22 14:36:39.093603: step: 164/464, loss: 0.017370322719216347 2023-01-22 14:36:39.825265: step: 166/464, loss: 0.05167322978377342 2023-01-22 14:36:40.566725: step: 168/464, loss: 0.02687511220574379 2023-01-22 14:36:41.357118: step: 170/464, loss: 0.04617585614323616 2023-01-22 14:36:42.189448: step: 172/464, loss: 0.011619649827480316 2023-01-22 14:36:42.928889: step: 174/464, loss: 0.0502808652818203 2023-01-22 14:36:43.703836: step: 176/464, loss: 0.027911610901355743 2023-01-22 14:36:44.488526: step: 178/464, loss: 0.02121800370514393 2023-01-22 14:36:45.250658: step: 180/464, loss: 0.025166582316160202 2023-01-22 14:36:45.922795: step: 182/464, loss: 0.01773116923868656 2023-01-22 14:36:46.637741: step: 184/464, loss: 0.22564925253391266 2023-01-22 14:36:47.368159: step: 186/464, loss: 0.00383604783564806 2023-01-22 14:36:48.147693: step: 188/464, loss: 
0.03472012281417847 2023-01-22 14:36:48.901078: step: 190/464, loss: 0.03232753276824951 2023-01-22 14:36:49.639719: step: 192/464, loss: 0.02210436575114727 2023-01-22 14:36:50.344015: step: 194/464, loss: 0.030768616124987602 2023-01-22 14:36:51.068747: step: 196/464, loss: 0.028965983539819717 2023-01-22 14:36:51.808260: step: 198/464, loss: 0.0606737844645977 2023-01-22 14:36:52.557846: step: 200/464, loss: 0.22486914694309235 2023-01-22 14:36:53.265618: step: 202/464, loss: 0.018386442214250565 2023-01-22 14:36:54.023216: step: 204/464, loss: 0.03689342364668846 2023-01-22 14:36:54.782466: step: 206/464, loss: 0.1655503213405609 2023-01-22 14:36:55.470137: step: 208/464, loss: 0.039773356169462204 2023-01-22 14:36:56.290065: step: 210/464, loss: 0.057250939309597015 2023-01-22 14:36:57.046527: step: 212/464, loss: 0.018177485093474388 2023-01-22 14:36:57.779387: step: 214/464, loss: 0.014665749855339527 2023-01-22 14:36:58.549739: step: 216/464, loss: 0.04910998418927193 2023-01-22 14:36:59.292793: step: 218/464, loss: 0.04642369598150253 2023-01-22 14:36:59.973445: step: 220/464, loss: 0.038499731570482254 2023-01-22 14:37:00.652047: step: 222/464, loss: 0.03830535709857941 2023-01-22 14:37:01.320074: step: 224/464, loss: 0.0034814043901860714 2023-01-22 14:37:02.070407: step: 226/464, loss: 0.04010867327451706 2023-01-22 14:37:02.762685: step: 228/464, loss: 0.18558181822299957 2023-01-22 14:37:03.595616: step: 230/464, loss: 0.03923764452338219 2023-01-22 14:37:04.281712: step: 232/464, loss: 0.003573687979951501 2023-01-22 14:37:05.019789: step: 234/464, loss: 0.06191304326057434 2023-01-22 14:37:05.683557: step: 236/464, loss: 0.04891486093401909 2023-01-22 14:37:06.509072: step: 238/464, loss: 0.05465091019868851 2023-01-22 14:37:07.224963: step: 240/464, loss: 0.0303682591766119 2023-01-22 14:37:07.967236: step: 242/464, loss: 0.010735997930169106 2023-01-22 14:37:08.747889: step: 244/464, loss: 0.03328661620616913 2023-01-22 14:37:09.486967: step: 
246/464, loss: 0.01912694051861763 2023-01-22 14:37:10.182887: step: 248/464, loss: 0.03149857372045517 2023-01-22 14:37:10.873305: step: 250/464, loss: 0.07606285810470581 2023-01-22 14:37:11.603944: step: 252/464, loss: 0.02646615169942379 2023-01-22 14:37:12.468418: step: 254/464, loss: 0.03380844369530678 2023-01-22 14:37:13.143832: step: 256/464, loss: 0.05554010719060898 2023-01-22 14:37:13.953095: step: 258/464, loss: 0.037372887134552 2023-01-22 14:37:14.690647: step: 260/464, loss: 0.05111650377511978 2023-01-22 14:37:15.473408: step: 262/464, loss: 0.06232859566807747 2023-01-22 14:37:16.190558: step: 264/464, loss: 0.02072969451546669 2023-01-22 14:37:17.011794: step: 266/464, loss: 0.0021110870875418186 2023-01-22 14:37:17.786277: step: 268/464, loss: 0.07640819996595383 2023-01-22 14:37:18.502007: step: 270/464, loss: 0.09810517728328705 2023-01-22 14:37:19.300080: step: 272/464, loss: 0.10910936444997787 2023-01-22 14:37:19.975193: step: 274/464, loss: 0.11092743277549744 2023-01-22 14:37:20.643130: step: 276/464, loss: 0.04382758587598801 2023-01-22 14:37:21.360959: step: 278/464, loss: 0.02791966311633587 2023-01-22 14:37:22.097356: step: 280/464, loss: 0.013097509741783142 2023-01-22 14:37:22.864668: step: 282/464, loss: 0.19788333773612976 2023-01-22 14:37:23.583620: step: 284/464, loss: 0.02221561409533024 2023-01-22 14:37:24.321116: step: 286/464, loss: 0.05585845559835434 2023-01-22 14:37:25.024181: step: 288/464, loss: 0.029758591204881668 2023-01-22 14:37:25.708793: step: 290/464, loss: 0.039395757019519806 2023-01-22 14:37:26.448255: step: 292/464, loss: 0.061954349279403687 2023-01-22 14:37:27.183253: step: 294/464, loss: 0.020964158698916435 2023-01-22 14:37:27.886802: step: 296/464, loss: 0.01524094957858324 2023-01-22 14:37:28.657512: step: 298/464, loss: 0.05274888500571251 2023-01-22 14:37:29.376034: step: 300/464, loss: 0.09849551320075989 2023-01-22 14:37:30.268614: step: 302/464, loss: 0.44920846819877625 2023-01-22 14:37:30.947056: 
step: 304/464, loss: 0.027357229962944984 2023-01-22 14:37:31.601677: step: 306/464, loss: 0.11521462351083755 2023-01-22 14:37:32.396432: step: 308/464, loss: 0.05012825131416321 2023-01-22 14:37:33.176461: step: 310/464, loss: 0.06689614057540894 2023-01-22 14:37:33.924916: step: 312/464, loss: 0.015322437509894371 2023-01-22 14:37:34.664380: step: 314/464, loss: 0.0024135306011885405 2023-01-22 14:37:35.418006: step: 316/464, loss: 0.04468587040901184 2023-01-22 14:37:36.151146: step: 318/464, loss: 0.11873765289783478 2023-01-22 14:37:36.883982: step: 320/464, loss: 0.019776202738285065 2023-01-22 14:37:37.621703: step: 322/464, loss: 0.05951458588242531 2023-01-22 14:37:38.297745: step: 324/464, loss: 0.0024840477854013443 2023-01-22 14:37:39.008258: step: 326/464, loss: 0.036224596202373505 2023-01-22 14:37:39.787130: step: 328/464, loss: 0.021483929827809334 2023-01-22 14:37:40.519162: step: 330/464, loss: 0.021507222205400467 2023-01-22 14:37:41.243666: step: 332/464, loss: 0.004655972123146057 2023-01-22 14:37:41.963245: step: 334/464, loss: 0.05982765182852745 2023-01-22 14:37:42.757087: step: 336/464, loss: 0.09155252575874329 2023-01-22 14:37:43.465562: step: 338/464, loss: 0.01392027921974659 2023-01-22 14:37:44.173495: step: 340/464, loss: 0.03629325330257416 2023-01-22 14:37:44.960967: step: 342/464, loss: 0.02009829320013523 2023-01-22 14:37:45.669886: step: 344/464, loss: 0.1702081263065338 2023-01-22 14:37:46.293393: step: 346/464, loss: 0.009383754804730415 2023-01-22 14:37:46.971096: step: 348/464, loss: 0.0682232603430748 2023-01-22 14:37:47.709411: step: 350/464, loss: 0.08816039562225342 2023-01-22 14:37:48.368667: step: 352/464, loss: 0.0227990560233593 2023-01-22 14:37:49.101664: step: 354/464, loss: 0.7874547243118286 2023-01-22 14:37:49.783226: step: 356/464, loss: 0.015370347537100315 2023-01-22 14:37:50.561236: step: 358/464, loss: 0.17370478808879852 2023-01-22 14:37:51.333765: step: 360/464, loss: 0.010362098924815655 2023-01-22 
14:37:52.039922: step: 362/464, loss: 0.107728011906147 2023-01-22 14:37:52.773752: step: 364/464, loss: 1.1347079277038574 2023-01-22 14:37:53.486094: step: 366/464, loss: 0.06244118884205818 2023-01-22 14:37:54.267173: step: 368/464, loss: 0.021270813420414925 2023-01-22 14:37:55.021376: step: 370/464, loss: 0.042185161262750626 2023-01-22 14:37:55.778909: step: 372/464, loss: 0.022724080830812454 2023-01-22 14:37:56.551436: step: 374/464, loss: 0.3049149811267853 2023-01-22 14:37:57.235619: step: 376/464, loss: 0.04081624746322632 2023-01-22 14:37:57.998499: step: 378/464, loss: 0.019985472783446312 2023-01-22 14:37:58.728350: step: 380/464, loss: 0.013072039932012558 2023-01-22 14:37:59.450593: step: 382/464, loss: 0.007333280052989721 2023-01-22 14:38:00.292662: step: 384/464, loss: 0.051796723157167435 2023-01-22 14:38:01.041637: step: 386/464, loss: 0.013379773125052452 2023-01-22 14:38:01.756117: step: 388/464, loss: 0.01237155869603157 2023-01-22 14:38:02.481507: step: 390/464, loss: 0.05361822247505188 2023-01-22 14:38:03.267408: step: 392/464, loss: 0.044191669672727585 2023-01-22 14:38:04.028893: step: 394/464, loss: 0.07439300417900085 2023-01-22 14:38:04.786583: step: 396/464, loss: 0.10303560644388199 2023-01-22 14:38:05.487948: step: 398/464, loss: 0.04386376962065697 2023-01-22 14:38:06.206408: step: 400/464, loss: 0.02358752302825451 2023-01-22 14:38:06.917054: step: 402/464, loss: 0.025624962523579597 2023-01-22 14:38:07.642411: step: 404/464, loss: 0.13347728550434113 2023-01-22 14:38:08.373947: step: 406/464, loss: 0.011129355989396572 2023-01-22 14:38:09.075846: step: 408/464, loss: 0.11874540150165558 2023-01-22 14:38:10.014267: step: 410/464, loss: 0.03069928288459778 2023-01-22 14:38:10.820569: step: 412/464, loss: 0.06555747240781784 2023-01-22 14:38:11.545700: step: 414/464, loss: 0.11554235965013504 2023-01-22 14:38:12.336374: step: 416/464, loss: 0.056944798678159714 2023-01-22 14:38:13.087479: step: 418/464, loss: 0.06231764703989029 
2023-01-22 14:38:13.848297: step: 420/464, loss: 0.006298363674432039 2023-01-22 14:38:14.668881: step: 422/464, loss: 0.20415742695331573 2023-01-22 14:38:15.430124: step: 424/464, loss: 0.021070022135972977 2023-01-22 14:38:16.175152: step: 426/464, loss: 0.007275238633155823 2023-01-22 14:38:16.866980: step: 428/464, loss: 0.0696689710021019 2023-01-22 14:38:17.502557: step: 430/464, loss: 0.06242004409432411 2023-01-22 14:38:18.264592: step: 432/464, loss: 0.019797658547759056 2023-01-22 14:38:18.968932: step: 434/464, loss: 0.029195213690400124 2023-01-22 14:38:19.825025: step: 436/464, loss: 0.29423123598098755 2023-01-22 14:38:20.544777: step: 438/464, loss: 0.03901423513889313 2023-01-22 14:38:21.244735: step: 440/464, loss: 0.02818756178021431 2023-01-22 14:38:21.995239: step: 442/464, loss: 0.09110277891159058 2023-01-22 14:38:22.732965: step: 444/464, loss: 0.5170221328735352 2023-01-22 14:38:23.464239: step: 446/464, loss: 0.0012232566950842738 2023-01-22 14:38:24.139549: step: 448/464, loss: 0.006468184292316437 2023-01-22 14:38:24.923282: step: 450/464, loss: 0.061203423887491226 2023-01-22 14:38:25.671552: step: 452/464, loss: 0.07362731546163559 2023-01-22 14:38:26.537869: step: 454/464, loss: 0.15933118760585785 2023-01-22 14:38:27.295692: step: 456/464, loss: 0.2627415955066681 2023-01-22 14:38:28.066036: step: 458/464, loss: 0.04037583991885185 2023-01-22 14:38:28.761860: step: 460/464, loss: 0.04900494962930679 2023-01-22 14:38:29.501219: step: 462/464, loss: 0.9591463208198547 2023-01-22 14:38:30.327183: step: 464/464, loss: 0.012484485283493996 2023-01-22 14:38:31.132321: step: 466/464, loss: 0.08850326389074326 2023-01-22 14:38:31.872347: step: 468/464, loss: 0.037056587636470795 2023-01-22 14:38:32.670832: step: 470/464, loss: 0.06972339749336243 2023-01-22 14:38:33.523486: step: 472/464, loss: 0.024174122139811516 2023-01-22 14:38:34.139286: step: 474/464, loss: 0.014855777844786644 2023-01-22 14:38:34.912189: step: 476/464, loss: 
0.02535218745470047 2023-01-22 14:38:35.637357: step: 478/464, loss: 0.012558290734887123 2023-01-22 14:38:36.325748: step: 480/464, loss: 0.0026807389222085476 2023-01-22 14:38:37.205731: step: 482/464, loss: 0.004997937940061092 2023-01-22 14:38:37.975606: step: 484/464, loss: 0.042675137519836426 2023-01-22 14:38:38.682350: step: 486/464, loss: 0.021459804847836494 2023-01-22 14:38:39.470527: step: 488/464, loss: 0.032483913004398346 2023-01-22 14:38:40.281681: step: 490/464, loss: 0.2530280351638794 2023-01-22 14:38:40.956796: step: 492/464, loss: 0.010920085944235325 2023-01-22 14:38:41.809980: step: 494/464, loss: 0.06375552713871002 2023-01-22 14:38:42.582145: step: 496/464, loss: 0.10358325392007828 2023-01-22 14:38:43.326653: step: 498/464, loss: 0.03224386274814606 2023-01-22 14:38:44.001620: step: 500/464, loss: 0.02806994877755642 2023-01-22 14:38:44.655194: step: 502/464, loss: 0.004388626664876938 2023-01-22 14:38:45.379775: step: 504/464, loss: 0.023558881133794785 2023-01-22 14:38:46.129205: step: 506/464, loss: 0.03565063700079918 2023-01-22 14:38:46.857105: step: 508/464, loss: 0.04556173458695412 2023-01-22 14:38:47.655879: step: 510/464, loss: 0.041230447590351105 2023-01-22 14:38:48.519136: step: 512/464, loss: 0.5097260475158691 2023-01-22 14:38:49.216556: step: 514/464, loss: 0.06440841406583786 2023-01-22 14:38:49.975728: step: 516/464, loss: 0.024644505232572556 2023-01-22 14:38:50.652370: step: 518/464, loss: 0.023807339370250702 2023-01-22 14:38:51.419004: step: 520/464, loss: 0.03844551742076874 2023-01-22 14:38:52.260426: step: 522/464, loss: 0.017746197059750557 2023-01-22 14:38:53.007945: step: 524/464, loss: 0.022382527589797974 2023-01-22 14:38:53.697678: step: 526/464, loss: 0.15803313255310059 2023-01-22 14:38:54.417494: step: 528/464, loss: 0.06795371323823929 2023-01-22 14:38:55.134220: step: 530/464, loss: 0.05073828995227814 2023-01-22 14:38:55.861256: step: 532/464, loss: 0.050155382603406906 2023-01-22 14:38:56.553699: step: 
534/464, loss: 0.12563583254814148 2023-01-22 14:38:57.226103: step: 536/464, loss: 0.14176763594150543 2023-01-22 14:38:57.974314: step: 538/464, loss: 0.025613034144043922 2023-01-22 14:38:58.743028: step: 540/464, loss: 0.15801124274730682 2023-01-22 14:38:59.428516: step: 542/464, loss: 0.12523072957992554 2023-01-22 14:39:00.187665: step: 544/464, loss: 0.28735101222991943 2023-01-22 14:39:00.906544: step: 546/464, loss: 0.182787224650383 2023-01-22 14:39:01.774881: step: 548/464, loss: 0.11324718594551086 2023-01-22 14:39:02.557087: step: 550/464, loss: 0.10533501207828522 2023-01-22 14:39:03.230534: step: 552/464, loss: 0.03276410326361656 2023-01-22 14:39:03.994353: step: 554/464, loss: 0.029330329969525337 2023-01-22 14:39:04.741449: step: 556/464, loss: 0.00085727364057675 2023-01-22 14:39:05.408623: step: 558/464, loss: 0.025780213996767998 2023-01-22 14:39:06.142662: step: 560/464, loss: 0.05369078367948532 2023-01-22 14:39:06.859731: step: 562/464, loss: 0.03835272043943405 2023-01-22 14:39:07.598619: step: 564/464, loss: 0.030365869402885437 2023-01-22 14:39:08.331889: step: 566/464, loss: 0.0115794837474823 2023-01-22 14:39:09.067096: step: 568/464, loss: 0.07855859398841858 2023-01-22 14:39:09.857686: step: 570/464, loss: 0.020242253318428993 2023-01-22 14:39:10.605477: step: 572/464, loss: 0.08514165133237839 2023-01-22 14:39:11.326404: step: 574/464, loss: 0.023960469290614128 2023-01-22 14:39:12.127783: step: 576/464, loss: 3.4834632873535156 2023-01-22 14:39:12.856181: step: 578/464, loss: 0.022735727950930595 2023-01-22 14:39:13.561702: step: 580/464, loss: 0.056836120784282684 2023-01-22 14:39:14.305035: step: 582/464, loss: 0.02904140204191208 2023-01-22 14:39:15.103409: step: 584/464, loss: 0.0137387840077281 2023-01-22 14:39:15.805529: step: 586/464, loss: 0.021905794739723206 2023-01-22 14:39:16.549520: step: 588/464, loss: 0.006982157006859779 2023-01-22 14:39:17.276081: step: 590/464, loss: 0.011092767119407654 2023-01-22 
14:39:17.931222: step: 592/464, loss: 0.05825098976492882 2023-01-22 14:39:18.612965: step: 594/464, loss: 0.0053213657811284065 2023-01-22 14:39:19.275851: step: 596/464, loss: 0.07439672201871872 2023-01-22 14:39:20.006972: step: 598/464, loss: 0.08712119609117508 2023-01-22 14:39:20.713214: step: 600/464, loss: 0.04311757907271385 2023-01-22 14:39:21.507125: step: 602/464, loss: 0.08456127345561981 2023-01-22 14:39:22.197293: step: 604/464, loss: 0.04044141620397568 2023-01-22 14:39:22.889283: step: 606/464, loss: 0.016841372475028038 2023-01-22 14:39:23.594919: step: 608/464, loss: 0.026384098455309868 2023-01-22 14:39:24.240029: step: 610/464, loss: 0.007268472574651241 2023-01-22 14:39:24.990098: step: 612/464, loss: 0.014984131790697575 2023-01-22 14:39:25.724160: step: 614/464, loss: 0.026813920587301254 2023-01-22 14:39:26.393040: step: 616/464, loss: 0.0704665556550026 2023-01-22 14:39:27.144489: step: 618/464, loss: 0.06092187389731407 2023-01-22 14:39:27.837376: step: 620/464, loss: 0.019842494279146194 2023-01-22 14:39:28.574113: step: 622/464, loss: 0.03070417419075966 2023-01-22 14:39:29.253449: step: 624/464, loss: 0.00864504650235176 2023-01-22 14:39:30.011403: step: 626/464, loss: 0.0526597835123539 2023-01-22 14:39:30.782129: step: 628/464, loss: 0.04984492063522339 2023-01-22 14:39:31.559810: step: 630/464, loss: 0.1512703150510788 2023-01-22 14:39:32.357415: step: 632/464, loss: 0.07364907115697861 2023-01-22 14:39:33.033933: step: 634/464, loss: 0.018775172531604767 2023-01-22 14:39:33.758881: step: 636/464, loss: 0.03580787405371666 2023-01-22 14:39:34.488313: step: 638/464, loss: 0.068617083132267 2023-01-22 14:39:35.280547: step: 640/464, loss: 0.016016459092497826 2023-01-22 14:39:36.025389: step: 642/464, loss: 0.06150452792644501 2023-01-22 14:39:36.817675: step: 644/464, loss: 0.03719574585556984 2023-01-22 14:39:37.547879: step: 646/464, loss: 0.05981617048382759 2023-01-22 14:39:38.237252: step: 648/464, loss: 0.034328460693359375 
2023-01-22 14:39:38.959663: step: 650/464, loss: 0.023457909002900124 2023-01-22 14:39:39.753234: step: 652/464, loss: 0.3904138207435608 2023-01-22 14:39:40.549631: step: 654/464, loss: 0.02361692301928997 2023-01-22 14:39:41.287211: step: 656/464, loss: 0.07791952788829803 2023-01-22 14:39:42.024960: step: 658/464, loss: 0.021876579150557518 2023-01-22 14:39:42.712163: step: 660/464, loss: 0.04831695184111595 2023-01-22 14:39:43.424836: step: 662/464, loss: 0.14800116419792175 2023-01-22 14:39:44.228586: step: 664/464, loss: 0.007488205097615719 2023-01-22 14:39:45.005813: step: 666/464, loss: 0.02209978550672531 2023-01-22 14:39:45.657354: step: 668/464, loss: 0.001123339869081974 2023-01-22 14:39:46.385354: step: 670/464, loss: 0.10765864700078964 2023-01-22 14:39:47.207236: step: 672/464, loss: 0.10831757634878159 2023-01-22 14:39:47.965294: step: 674/464, loss: 0.011373594403266907 2023-01-22 14:39:48.757370: step: 676/464, loss: 0.015240014530718327 2023-01-22 14:39:49.496161: step: 678/464, loss: 0.4238124191761017 2023-01-22 14:39:50.321424: step: 680/464, loss: 0.01398735772818327 2023-01-22 14:39:51.052131: step: 682/464, loss: 0.09794515371322632 2023-01-22 14:39:51.763673: step: 684/464, loss: 0.05332712456583977 2023-01-22 14:39:52.471623: step: 686/464, loss: 0.0721079632639885 2023-01-22 14:39:53.219978: step: 688/464, loss: 0.09018091857433319 2023-01-22 14:39:53.993603: step: 690/464, loss: 0.5125716924667358 2023-01-22 14:39:54.758714: step: 692/464, loss: 0.06655218452215195 2023-01-22 14:39:55.544769: step: 694/464, loss: 0.13092640042304993 2023-01-22 14:39:56.283247: step: 696/464, loss: 0.032938260585069656 2023-01-22 14:39:56.964622: step: 698/464, loss: 0.008865829557180405 2023-01-22 14:39:57.622636: step: 700/464, loss: 0.08985569328069687 2023-01-22 14:39:58.407905: step: 702/464, loss: 0.002692323410883546 2023-01-22 14:39:59.163627: step: 704/464, loss: 0.03374726325273514 2023-01-22 14:39:59.910873: step: 706/464, loss: 
0.03221918269991875 2023-01-22 14:40:00.643240: step: 708/464, loss: 0.04863818734884262 2023-01-22 14:40:01.536706: step: 710/464, loss: 0.028400612995028496 2023-01-22 14:40:02.307208: step: 712/464, loss: 0.03320635110139847 2023-01-22 14:40:03.002418: step: 714/464, loss: 0.04137527197599411 2023-01-22 14:40:03.771792: step: 716/464, loss: 0.06885325163602829 2023-01-22 14:40:04.501698: step: 718/464, loss: 2.812523126602173 2023-01-22 14:40:05.198784: step: 720/464, loss: 0.05131245404481888 2023-01-22 14:40:05.961762: step: 722/464, loss: 0.03149893507361412 2023-01-22 14:40:06.748808: step: 724/464, loss: 0.023914489895105362 2023-01-22 14:40:07.554288: step: 726/464, loss: 0.10525958240032196 2023-01-22 14:40:08.300035: step: 728/464, loss: 0.015110420063138008 2023-01-22 14:40:09.008672: step: 730/464, loss: 0.0554676316678524 2023-01-22 14:40:09.657056: step: 732/464, loss: 0.047157224267721176 2023-01-22 14:40:10.403988: step: 734/464, loss: 0.12007953226566315 2023-01-22 14:40:11.161230: step: 736/464, loss: 0.05309174582362175 2023-01-22 14:40:11.922495: step: 738/464, loss: 0.011370309628546238 2023-01-22 14:40:12.671456: step: 740/464, loss: 0.05437813326716423 2023-01-22 14:40:13.406566: step: 742/464, loss: 0.02289034053683281 2023-01-22 14:40:14.213629: step: 744/464, loss: 0.03706967830657959 2023-01-22 14:40:14.953729: step: 746/464, loss: 0.11801394075155258 2023-01-22 14:40:15.659440: step: 748/464, loss: 0.019746290519833565 2023-01-22 14:40:16.559488: step: 750/464, loss: 0.014962991699576378 2023-01-22 14:40:17.330340: step: 752/464, loss: 0.07079713046550751 2023-01-22 14:40:18.058173: step: 754/464, loss: 0.07932285964488983 2023-01-22 14:40:18.825075: step: 756/464, loss: 0.03423899784684181 2023-01-22 14:40:19.557778: step: 758/464, loss: 0.0764031857252121 2023-01-22 14:40:20.205719: step: 760/464, loss: 0.08636163175106049 2023-01-22 14:40:20.936909: step: 762/464, loss: 0.021301904693245888 2023-01-22 14:40:21.715143: step: 764/464, 
loss: 0.017300531268119812 2023-01-22 14:40:22.453455: step: 766/464, loss: 0.029869444668293 2023-01-22 14:40:23.147231: step: 768/464, loss: 0.13179388642311096 2023-01-22 14:40:23.890755: step: 770/464, loss: 0.24576228857040405 2023-01-22 14:40:24.597160: step: 772/464, loss: 0.04601896554231644 2023-01-22 14:40:25.353306: step: 774/464, loss: 0.008786949329078197 2023-01-22 14:40:26.092295: step: 776/464, loss: 0.08430537581443787 2023-01-22 14:40:26.819124: step: 778/464, loss: 0.002615898149088025 2023-01-22 14:40:27.519818: step: 780/464, loss: 0.004254632163792849 2023-01-22 14:40:28.284521: step: 782/464, loss: 0.08581655472517014 2023-01-22 14:40:29.000679: step: 784/464, loss: 0.007409216836094856 2023-01-22 14:40:29.780242: step: 786/464, loss: 0.052561745047569275 2023-01-22 14:40:30.605130: step: 788/464, loss: 0.01425325870513916 2023-01-22 14:40:31.298912: step: 790/464, loss: 0.12579873204231262 2023-01-22 14:40:32.047816: step: 792/464, loss: 0.10305667668581009 2023-01-22 14:40:32.830923: step: 794/464, loss: 0.1198534145951271 2023-01-22 14:40:33.570733: step: 796/464, loss: 0.13133980333805084 2023-01-22 14:40:34.321703: step: 798/464, loss: 0.04805370420217514 2023-01-22 14:40:35.033739: step: 800/464, loss: 0.09546133875846863 2023-01-22 14:40:35.779928: step: 802/464, loss: 0.1074833944439888 2023-01-22 14:40:36.462222: step: 804/464, loss: 0.016533413901925087 2023-01-22 14:40:37.187096: step: 806/464, loss: 0.018085090443491936 2023-01-22 14:40:37.892820: step: 808/464, loss: 0.014931551180779934 2023-01-22 14:40:38.576396: step: 810/464, loss: 0.016278119757771492 2023-01-22 14:40:39.235276: step: 812/464, loss: 0.013235675171017647 2023-01-22 14:40:40.010995: step: 814/464, loss: 0.18394386768341064 2023-01-22 14:40:40.716119: step: 816/464, loss: 0.039952490478754044 2023-01-22 14:40:41.507056: step: 818/464, loss: 0.03714795038104057 2023-01-22 14:40:42.240608: step: 820/464, loss: 0.17323879897594452 2023-01-22 14:40:42.960787: step: 
822/464, loss: 0.028041217476129532 2023-01-22 14:40:43.640531: step: 824/464, loss: 0.03805036470293999 2023-01-22 14:40:44.432677: step: 826/464, loss: 0.05020849406719208 2023-01-22 14:40:45.171659: step: 828/464, loss: 0.010961787775158882 2023-01-22 14:40:45.870117: step: 830/464, loss: 0.6328175663948059 2023-01-22 14:40:46.597196: step: 832/464, loss: 0.6535952687263489 2023-01-22 14:40:47.327197: step: 834/464, loss: 0.2235526591539383 2023-01-22 14:40:48.135022: step: 836/464, loss: 0.04729365184903145 2023-01-22 14:40:48.977104: step: 838/464, loss: 0.11663021892309189 2023-01-22 14:40:49.729697: step: 840/464, loss: 0.01592096872627735 2023-01-22 14:40:50.537989: step: 842/464, loss: 0.028326695784926414 2023-01-22 14:40:51.356830: step: 844/464, loss: 0.009754992090165615 2023-01-22 14:40:52.114748: step: 846/464, loss: 0.05774148181080818 2023-01-22 14:40:52.817073: step: 848/464, loss: 0.004699027165770531 2023-01-22 14:40:53.611162: step: 850/464, loss: 0.2547849416732788 2023-01-22 14:40:54.379195: step: 852/464, loss: 0.021360615268349648 2023-01-22 14:40:55.132665: step: 854/464, loss: 0.05056114122271538 2023-01-22 14:40:55.812405: step: 856/464, loss: 0.002268953714519739 2023-01-22 14:40:56.509338: step: 858/464, loss: 0.01847556233406067 2023-01-22 14:40:57.269310: step: 860/464, loss: 0.012106945738196373 2023-01-22 14:40:57.973847: step: 862/464, loss: 0.04129493981599808 2023-01-22 14:40:58.689756: step: 864/464, loss: 0.04316987842321396 2023-01-22 14:40:59.379695: step: 866/464, loss: 0.05905461311340332 2023-01-22 14:41:00.122425: step: 868/464, loss: 0.28730905055999756 2023-01-22 14:41:00.761040: step: 870/464, loss: 0.021385056897997856 2023-01-22 14:41:01.488966: step: 872/464, loss: 0.04858007654547691 2023-01-22 14:41:02.223439: step: 874/464, loss: 0.10534320026636124 2023-01-22 14:41:02.942728: step: 876/464, loss: 0.034968048334121704 2023-01-22 14:41:03.744127: step: 878/464, loss: 0.012991942465305328 2023-01-22 
14:41:04.648330: step: 880/464, loss: 0.03125219792127609 2023-01-22 14:41:05.368704: step: 882/464, loss: 0.044226501137018204 2023-01-22 14:41:06.109992: step: 884/464, loss: 0.01766287535429001 2023-01-22 14:41:06.908029: step: 886/464, loss: 0.14172212779521942 2023-01-22 14:41:07.673524: step: 888/464, loss: 0.013373379595577717 2023-01-22 14:41:08.448058: step: 890/464, loss: 0.0199363362044096 2023-01-22 14:41:09.197873: step: 892/464, loss: 0.9622757434844971 2023-01-22 14:41:09.876142: step: 894/464, loss: 0.0009300005622208118 2023-01-22 14:41:10.750238: step: 896/464, loss: 0.013780355453491211 2023-01-22 14:41:11.513573: step: 898/464, loss: 0.01795125938951969 2023-01-22 14:41:12.255202: step: 900/464, loss: 0.15666379034519196 2023-01-22 14:41:13.019821: step: 902/464, loss: 0.09425216168165207 2023-01-22 14:41:13.773771: step: 904/464, loss: 0.04233413562178612 2023-01-22 14:41:14.532321: step: 906/464, loss: 0.07996021956205368 2023-01-22 14:41:15.394697: step: 908/464, loss: 0.03722037002444267 2023-01-22 14:41:16.112350: step: 910/464, loss: 0.020252905786037445 2023-01-22 14:41:16.833324: step: 912/464, loss: 0.06584428250789642 2023-01-22 14:41:17.540648: step: 914/464, loss: 0.36199814081192017 2023-01-22 14:41:18.386692: step: 916/464, loss: 0.0024279439821839333 2023-01-22 14:41:19.123512: step: 918/464, loss: 0.0012322930851951241 2023-01-22 14:41:19.848576: step: 920/464, loss: 0.07466727495193481 2023-01-22 14:41:20.603532: step: 922/464, loss: 0.08562754839658737 2023-01-22 14:41:21.368665: step: 924/464, loss: 0.03290723264217377 2023-01-22 14:41:22.089478: step: 926/464, loss: 0.034934818744659424 2023-01-22 14:41:22.812105: step: 928/464, loss: 0.04039369150996208 2023-01-22 14:41:23.482199: step: 930/464, loss: 0.010865515097975731
==================================================
Loss: 0.092
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3142144440924929, 'r': 0.34223736415387274, 'f1': 0.3276277764016184}, 'combined': 0.24140994050645564, 'epoch': 23}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2996634344061455, 'r': 0.2910677473658109, 'f1': 0.2953030532732913}, 'combined': 0.18339873834867568, 'epoch': 23}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31745530420505336, 'r': 0.3475744032757415, 'f1': 0.3318328089244851}, 'combined': 0.24450838552330478, 'epoch': 23}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3097120969999749, 'r': 0.295342434059206, 'f1': 0.30235663032033927}, 'combined': 0.18777938093578966, 'epoch': 23}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26875, 'r': 0.30714285714285716, 'f1': 0.2866666666666666}, 'combined': 0.19111111111111106, 'epoch': 23}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3641304347826087, 'r': 0.28879310344827586, 'f1': 0.32211538461538464}, 'combined': 0.21474358974358976, 'epoch': 23}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3020804336867607, 'r': 0.32271590923272536, 'f1': 0.3120574021388005}, 'combined': 0.22993703315490563, 'epoch': 8}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3069139619466407, 'r': 0.292949528465389, 'f1': 0.29976920372318655}, 'combined': 0.1861724528386106, 'epoch': 8}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.44642857142857145, 'r': 0.3232758620689655, 'f1': 0.37500000000000006}, 'combined': 0.25, 'epoch': 8}
******************************
Epoch: 24
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:44:11.362787: step: 2/464, loss: 0.04216277226805687 2023-01-22 14:44:12.161310: step: 4/464, loss: 0.025450855493545532 2023-01-22 14:44:12.880879: step: 6/464, loss: 1.791204810142517 2023-01-22 14:44:13.621885: step: 8/464, loss: 0.01918802782893181 2023-01-22 14:44:14.380161: step: 10/464, loss: 0.03409218788146973 2023-01-22 14:44:15.110717: step: 12/464, loss: 0.00027603565831668675 2023-01-22 14:44:15.899097: step: 14/464, loss: 0.03904130682349205 2023-01-22 14:44:16.657555: step: 16/464, loss: 0.003379831090569496 2023-01-22 14:44:17.394958: step: 18/464, loss: 0.03145407885313034 2023-01-22 14:44:18.113198: step: 20/464, loss: 0.013981604017317295 2023-01-22 14:44:18.851380: step: 22/464, loss: 0.052782196551561356 2023-01-22 14:44:19.605138: step: 24/464, loss: 0.09911604225635529 2023-01-22 14:44:20.315770: step: 26/464, loss: 0.4873526692390442 2023-01-22 14:44:21.004604: step: 28/464, loss: 0.007562725339084864 2023-01-22 14:44:21.801580: step: 30/464, loss: 0.08361497521400452 2023-01-22 14:44:22.554399: step: 32/464, loss: 0.6130539774894714 2023-01-22 14:44:23.312835: step: 34/464, loss: 0.005319906864315271 2023-01-22 14:44:24.067980: step: 36/464, loss: 0.014001567848026752 2023-01-22 14:44:24.816402: step: 38/464, loss: 0.05576007440686226 2023-01-22 14:44:25.549510: step: 40/464, loss: 0.01852346770465374 2023-01-22 14:44:26.260097: step: 42/464, loss: 0.005907487124204636 2023-01-22 14:44:27.097770: step: 44/464, loss: 0.06917471438646317 2023-01-22 14:44:27.879238: step: 46/464, loss: 0.0684736967086792 2023-01-22 14:44:28.666432: step: 48/464, loss:
0.019859906286001205 2023-01-22 14:44:29.470232: step: 50/464, loss: 0.010839635506272316 2023-01-22 14:44:30.169751: step: 52/464, loss: 0.050964005291461945 2023-01-22 14:44:30.881617: step: 54/464, loss: 0.04273327812552452 2023-01-22 14:44:31.603519: step: 56/464, loss: 0.3398931324481964 2023-01-22 14:44:32.369377: step: 58/464, loss: 0.018881939351558685 2023-01-22 14:44:33.162567: step: 60/464, loss: 0.10342434793710709 2023-01-22 14:44:33.922426: step: 62/464, loss: 0.019913988187909126 2023-01-22 14:44:34.721591: step: 64/464, loss: 0.2856569290161133 2023-01-22 14:44:35.454269: step: 66/464, loss: 0.006631938274949789 2023-01-22 14:44:36.189772: step: 68/464, loss: 0.003249118337407708 2023-01-22 14:44:37.064972: step: 70/464, loss: 0.025199271738529205 2023-01-22 14:44:37.907861: step: 72/464, loss: 0.04742187634110451 2023-01-22 14:44:38.581967: step: 74/464, loss: 0.023487098515033722 2023-01-22 14:44:39.298155: step: 76/464, loss: 0.0440724715590477 2023-01-22 14:44:40.160340: step: 78/464, loss: 0.03205224126577377 2023-01-22 14:44:40.911619: step: 80/464, loss: 0.32313501834869385 2023-01-22 14:44:41.681721: step: 82/464, loss: 0.017892520874738693 2023-01-22 14:44:42.422203: step: 84/464, loss: 0.00491137383505702 2023-01-22 14:44:43.169835: step: 86/464, loss: 0.013730145059525967 2023-01-22 14:44:43.832649: step: 88/464, loss: 0.15901567041873932 2023-01-22 14:44:44.615809: step: 90/464, loss: 0.029119687154889107 2023-01-22 14:44:45.341407: step: 92/464, loss: 0.04215722158551216 2023-01-22 14:44:45.980619: step: 94/464, loss: 0.02104552648961544 2023-01-22 14:44:46.673977: step: 96/464, loss: 0.06284292042255402 2023-01-22 14:44:47.470923: step: 98/464, loss: 0.030893992632627487 2023-01-22 14:44:48.188887: step: 100/464, loss: 0.046926919370889664 2023-01-22 14:44:48.977642: step: 102/464, loss: 0.02747052162885666 2023-01-22 14:44:49.752976: step: 104/464, loss: 0.04495851695537567 2023-01-22 14:44:50.491737: step: 106/464, loss: 
0.005997125059366226 2023-01-22 14:44:51.242978: step: 108/464, loss: 0.021033380180597305 2023-01-22 14:44:51.928924: step: 110/464, loss: 0.03415670990943909 2023-01-22 14:44:52.774011: step: 112/464, loss: 0.05024714022874832 2023-01-22 14:44:53.519878: step: 114/464, loss: 0.006801267620176077 2023-01-22 14:44:54.243918: step: 116/464, loss: 0.11825337260961533 2023-01-22 14:44:54.953776: step: 118/464, loss: 0.025873728096485138 2023-01-22 14:44:55.676056: step: 120/464, loss: 0.0497831255197525 2023-01-22 14:44:56.431412: step: 122/464, loss: 0.047892551869153976 2023-01-22 14:44:57.162329: step: 124/464, loss: 0.05598433315753937 2023-01-22 14:44:57.901935: step: 126/464, loss: 0.007054745685309172 2023-01-22 14:44:58.649288: step: 128/464, loss: 0.07335645705461502 2023-01-22 14:44:59.370399: step: 130/464, loss: 0.040367502719163895 2023-01-22 14:45:00.132509: step: 132/464, loss: 0.2059478759765625 2023-01-22 14:45:00.811248: step: 134/464, loss: 0.023334026336669922 2023-01-22 14:45:01.574129: step: 136/464, loss: 0.2638107240200043 2023-01-22 14:45:02.314615: step: 138/464, loss: 0.021714024245738983 2023-01-22 14:45:03.063082: step: 140/464, loss: 0.024914352223277092 2023-01-22 14:45:03.805214: step: 142/464, loss: 0.03169897198677063 2023-01-22 14:45:04.494872: step: 144/464, loss: 0.02856578677892685 2023-01-22 14:45:05.201342: step: 146/464, loss: 0.02671072632074356 2023-01-22 14:45:05.924175: step: 148/464, loss: 0.05585016682744026 2023-01-22 14:45:06.736938: step: 150/464, loss: 0.022833239287137985 2023-01-22 14:45:07.431678: step: 152/464, loss: 0.023303162306547165 2023-01-22 14:45:08.211620: step: 154/464, loss: 0.057167548686265945 2023-01-22 14:45:08.957729: step: 156/464, loss: 0.0038631537463515997 2023-01-22 14:45:09.723155: step: 158/464, loss: 7.0057597160339355 2023-01-22 14:45:10.555770: step: 160/464, loss: 0.038201894611120224 2023-01-22 14:45:11.230487: step: 162/464, loss: 0.006435670889914036 2023-01-22 14:45:11.994952: step: 
164/464, loss: 0.05948295816779137 2023-01-22 14:45:12.743308: step: 166/464, loss: 0.004582865629345179 2023-01-22 14:45:13.481024: step: 168/464, loss: 0.03444823622703552 2023-01-22 14:45:14.195258: step: 170/464, loss: 0.07894705981016159 2023-01-22 14:45:14.964099: step: 172/464, loss: 0.019771507009863853 2023-01-22 14:45:15.646194: step: 174/464, loss: 0.01344778761267662 2023-01-22 14:45:16.364282: step: 176/464, loss: 0.09892649203538895 2023-01-22 14:45:17.164530: step: 178/464, loss: 0.02090064436197281 2023-01-22 14:45:17.990996: step: 180/464, loss: 0.3622448444366455 2023-01-22 14:45:18.673487: step: 182/464, loss: 0.014399930834770203 2023-01-22 14:45:19.438234: step: 184/464, loss: 0.019762679934501648 2023-01-22 14:45:20.240388: step: 186/464, loss: 0.007910501211881638 2023-01-22 14:45:21.129808: step: 188/464, loss: 0.05793405696749687 2023-01-22 14:45:21.891241: step: 190/464, loss: 0.01847977377474308 2023-01-22 14:45:22.705278: step: 192/464, loss: 0.25239136815071106 2023-01-22 14:45:23.359361: step: 194/464, loss: 0.053811557590961456 2023-01-22 14:45:24.066547: step: 196/464, loss: 0.11339523643255234 2023-01-22 14:45:24.733665: step: 198/464, loss: 0.03743233159184456 2023-01-22 14:45:25.532367: step: 200/464, loss: 0.026974214240908623 2023-01-22 14:45:26.226064: step: 202/464, loss: 0.0056051891297101974 2023-01-22 14:45:26.975275: step: 204/464, loss: 2.888148069381714 2023-01-22 14:45:27.755179: step: 206/464, loss: 0.22232641279697418 2023-01-22 14:45:28.572053: step: 208/464, loss: 0.05838814750313759 2023-01-22 14:45:29.292231: step: 210/464, loss: 0.08797691762447357 2023-01-22 14:45:29.988487: step: 212/464, loss: 0.014637548476457596 2023-01-22 14:45:30.692344: step: 214/464, loss: 0.017743902280926704 2023-01-22 14:45:31.402482: step: 216/464, loss: 0.0002531524805817753 2023-01-22 14:45:32.108052: step: 218/464, loss: 0.0060111405327916145 2023-01-22 14:45:32.924474: step: 220/464, loss: 0.10804308950901031 2023-01-22 
14:45:33.642689: step: 222/464, loss: 0.03683672472834587 2023-01-22 14:45:34.435004: step: 224/464, loss: 0.2199680656194687 2023-01-22 14:45:35.117646: step: 226/464, loss: 0.18807777762413025 2023-01-22 14:45:35.913442: step: 228/464, loss: 0.06853378564119339 2023-01-22 14:45:36.687543: step: 230/464, loss: 1.3367910385131836 2023-01-22 14:45:37.377190: step: 232/464, loss: 0.0589243546128273 2023-01-22 14:45:38.184293: step: 234/464, loss: 0.07898443192243576 2023-01-22 14:45:38.908553: step: 236/464, loss: 0.22820298373699188 2023-01-22 14:45:39.554280: step: 238/464, loss: 0.03967048227787018 2023-01-22 14:45:40.304866: step: 240/464, loss: 0.037137553095817566 2023-01-22 14:45:40.979453: step: 242/464, loss: 0.004308091476559639 2023-01-22 14:45:41.687647: step: 244/464, loss: 0.006603443995118141 2023-01-22 14:45:42.461190: step: 246/464, loss: 0.02261928655207157 2023-01-22 14:45:43.148778: step: 248/464, loss: 0.010544451884925365 2023-01-22 14:45:43.964865: step: 250/464, loss: 0.014210019260644913 2023-01-22 14:45:44.754677: step: 252/464, loss: 0.07230827957391739 2023-01-22 14:45:45.460997: step: 254/464, loss: 0.015503766015172005 2023-01-22 14:45:46.217879: step: 256/464, loss: 0.024298088625073433 2023-01-22 14:45:46.992355: step: 258/464, loss: 0.08183684200048447 2023-01-22 14:45:47.741588: step: 260/464, loss: 0.03380714729428291 2023-01-22 14:45:48.415604: step: 262/464, loss: 0.003679390996694565 2023-01-22 14:45:49.128122: step: 264/464, loss: 0.02577679231762886 2023-01-22 14:45:49.886676: step: 266/464, loss: 0.2571578919887543 2023-01-22 14:45:50.687028: step: 268/464, loss: 0.01715640164911747 2023-01-22 14:45:51.458055: step: 270/464, loss: 0.07080483436584473 2023-01-22 14:45:52.188439: step: 272/464, loss: 0.030466442927718163 2023-01-22 14:45:52.915679: step: 274/464, loss: 0.23173189163208008 2023-01-22 14:45:53.677781: step: 276/464, loss: 0.02331365831196308 2023-01-22 14:45:54.415446: step: 278/464, loss: 0.018626758828759193 
2023-01-22 14:45:55.077233: step: 280/464, loss: 0.02581346221268177 2023-01-22 14:45:55.799803: step: 282/464, loss: 0.07633237540721893 2023-01-22 14:45:56.538774: step: 284/464, loss: 0.0020407966803759336 2023-01-22 14:45:57.228664: step: 286/464, loss: 0.07563275098800659 2023-01-22 14:45:57.930564: step: 288/464, loss: 0.08578566461801529 2023-01-22 14:45:58.655473: step: 290/464, loss: 0.02253068797290325 2023-01-22 14:45:59.401535: step: 292/464, loss: 0.07634179294109344 2023-01-22 14:46:00.170819: step: 294/464, loss: 0.017030959948897362 2023-01-22 14:46:01.011019: step: 296/464, loss: 0.05161159113049507 2023-01-22 14:46:01.771265: step: 298/464, loss: 0.1865466833114624 2023-01-22 14:46:02.501047: step: 300/464, loss: 0.034873880445957184 2023-01-22 14:46:03.181988: step: 302/464, loss: 0.22604385018348694 2023-01-22 14:46:03.943727: step: 304/464, loss: 0.03261805325746536 2023-01-22 14:46:04.669395: step: 306/464, loss: 0.24040238559246063 2023-01-22 14:46:05.418464: step: 308/464, loss: 0.382816880941391 2023-01-22 14:46:06.196589: step: 310/464, loss: 0.02216779626905918 2023-01-22 14:46:06.894906: step: 312/464, loss: 0.00621431227773428 2023-01-22 14:46:07.644685: step: 314/464, loss: 0.01933230273425579 2023-01-22 14:46:08.307636: step: 316/464, loss: 0.10341157764196396 2023-01-22 14:46:09.171223: step: 318/464, loss: 0.05525139346718788 2023-01-22 14:46:09.896190: step: 320/464, loss: 0.03290149196982384 2023-01-22 14:46:10.659282: step: 322/464, loss: 0.009247610345482826 2023-01-22 14:46:11.391308: step: 324/464, loss: 0.009450143203139305 2023-01-22 14:46:12.108669: step: 326/464, loss: 0.006791172549128532 2023-01-22 14:46:12.796573: step: 328/464, loss: 0.012052402831614017 2023-01-22 14:46:13.598248: step: 330/464, loss: 0.2342667579650879 2023-01-22 14:46:14.344974: step: 332/464, loss: 0.2897554039955139 2023-01-22 14:46:15.013423: step: 334/464, loss: 0.01669563166797161 2023-01-22 14:46:15.711287: step: 336/464, loss: 
0.06249774247407913 2023-01-22 14:46:16.437679: step: 338/464, loss: 0.005403649061918259 2023-01-22 14:46:17.180127: step: 340/464, loss: 0.01682235673069954 2023-01-22 14:46:17.928780: step: 342/464, loss: 0.050066784024238586 2023-01-22 14:46:18.634876: step: 344/464, loss: 0.02224194072186947 2023-01-22 14:46:19.335375: step: 346/464, loss: 0.28325849771499634 2023-01-22 14:46:20.010901: step: 348/464, loss: 0.14532120525836945 2023-01-22 14:46:20.728111: step: 350/464, loss: 0.04144451767206192 2023-01-22 14:46:21.428273: step: 352/464, loss: 0.015775060281157494 2023-01-22 14:46:22.175815: step: 354/464, loss: 0.027337608858942986 2023-01-22 14:46:22.911327: step: 356/464, loss: 0.015663186088204384 2023-01-22 14:46:23.646830: step: 358/464, loss: 0.022179163992404938 2023-01-22 14:46:24.382210: step: 360/464, loss: 0.025821568444371223 2023-01-22 14:46:25.089422: step: 362/464, loss: 0.05381513386964798 2023-01-22 14:46:25.812398: step: 364/464, loss: 0.03104221634566784 2023-01-22 14:46:26.537122: step: 366/464, loss: 0.0401470810174942 2023-01-22 14:46:27.222989: step: 368/464, loss: 0.05643609166145325 2023-01-22 14:46:27.984108: step: 370/464, loss: 0.05265054106712341 2023-01-22 14:46:28.709372: step: 372/464, loss: 0.10396604984998703 2023-01-22 14:46:29.419722: step: 374/464, loss: 0.057258524000644684 2023-01-22 14:46:30.148710: step: 376/464, loss: 0.0088130421936512 2023-01-22 14:46:30.870394: step: 378/464, loss: 0.06175927817821503 2023-01-22 14:46:31.611364: step: 380/464, loss: 0.0916573703289032 2023-01-22 14:46:32.303317: step: 382/464, loss: 0.08674547076225281 2023-01-22 14:46:33.014399: step: 384/464, loss: 0.1691557765007019 2023-01-22 14:46:33.674769: step: 386/464, loss: 0.08958717435598373 2023-01-22 14:46:34.519891: step: 388/464, loss: 0.016225000843405724 2023-01-22 14:46:35.230659: step: 390/464, loss: 0.020185474306344986 2023-01-22 14:46:36.010683: step: 392/464, loss: 0.03769129142165184 2023-01-22 14:46:36.656058: step: 
394/464, loss: 0.009174306876957417 2023-01-22 14:46:37.305334: step: 396/464, loss: 0.005557133350521326 2023-01-22 14:46:38.081995: step: 398/464, loss: 0.00031553933513350785 2023-01-22 14:46:38.895686: step: 400/464, loss: 0.009889748878777027 2023-01-22 14:46:39.716686: step: 402/464, loss: 0.09212450683116913 2023-01-22 14:46:40.506684: step: 404/464, loss: 0.016208630055189133 2023-01-22 14:46:41.193117: step: 406/464, loss: 0.0028690879698842764 2023-01-22 14:46:42.071888: step: 408/464, loss: 0.04372553899884224 2023-01-22 14:46:42.787250: step: 410/464, loss: 0.042513079941272736 2023-01-22 14:46:43.458179: step: 412/464, loss: 0.04648435860872269 2023-01-22 14:46:44.201467: step: 414/464, loss: 0.0648239478468895 2023-01-22 14:46:44.929372: step: 416/464, loss: 0.03131159394979477 2023-01-22 14:46:45.663363: step: 418/464, loss: 0.04443042725324631 2023-01-22 14:46:46.430699: step: 420/464, loss: 0.029055485501885414 2023-01-22 14:46:47.219800: step: 422/464, loss: 0.04159461706876755 2023-01-22 14:46:48.006236: step: 424/464, loss: 0.5957244634628296 2023-01-22 14:46:48.808008: step: 426/464, loss: 0.09278106689453125 2023-01-22 14:46:49.578120: step: 428/464, loss: 0.018769947811961174 2023-01-22 14:46:50.404908: step: 430/464, loss: 0.35789287090301514 2023-01-22 14:46:51.100816: step: 432/464, loss: 0.061263084411621094 2023-01-22 14:46:51.834703: step: 434/464, loss: 0.11250555515289307 2023-01-22 14:46:52.534106: step: 436/464, loss: 0.016005242243409157 2023-01-22 14:46:53.213874: step: 438/464, loss: 0.024398449808359146 2023-01-22 14:46:53.872226: step: 440/464, loss: 0.053523093461990356 2023-01-22 14:46:54.630009: step: 442/464, loss: 0.09610924869775772 2023-01-22 14:46:55.345422: step: 444/464, loss: 0.018684133887290955 2023-01-22 14:46:56.041522: step: 446/464, loss: 0.2663451135158539 2023-01-22 14:46:56.761088: step: 448/464, loss: 0.01431179791688919 2023-01-22 14:46:57.442576: step: 450/464, loss: 0.03379271551966667 2023-01-22 
14:46:58.231195: step: 452/464, loss: 0.07740333676338196 2023-01-22 14:46:59.035824: step: 454/464, loss: 0.03083229996263981 2023-01-22 14:46:59.722611: step: 456/464, loss: 0.10461738705635071 2023-01-22 14:47:00.436137: step: 458/464, loss: 0.4135224223136902 2023-01-22 14:47:01.131281: step: 460/464, loss: 0.28587913513183594 2023-01-22 14:47:01.914601: step: 462/464, loss: 0.3790675103664398 2023-01-22 14:47:02.681318: step: 464/464, loss: 0.10238826274871826 2023-01-22 14:47:03.380662: step: 466/464, loss: 0.00831583235412836 2023-01-22 14:47:04.200958: step: 468/464, loss: 0.3245208263397217 2023-01-22 14:47:04.901647: step: 470/464, loss: 0.0015388050815090537 2023-01-22 14:47:05.643304: step: 472/464, loss: 0.12672469019889832 2023-01-22 14:47:06.386216: step: 474/464, loss: 1.2395349740982056 2023-01-22 14:47:07.123158: step: 476/464, loss: 16.51024055480957 2023-01-22 14:47:07.955603: step: 478/464, loss: 0.003449058858677745 2023-01-22 14:47:08.676033: step: 480/464, loss: 0.009907940402626991 2023-01-22 14:47:09.476258: step: 482/464, loss: 0.8217645287513733 2023-01-22 14:47:10.267806: step: 484/464, loss: 0.1157865822315216 2023-01-22 14:47:11.032723: step: 486/464, loss: 0.11795174330472946 2023-01-22 14:47:11.826140: step: 488/464, loss: 0.03714916482567787 2023-01-22 14:47:12.630564: step: 490/464, loss: 0.2336808443069458 2023-01-22 14:47:13.333996: step: 492/464, loss: 0.19613677263259888 2023-01-22 14:47:14.124553: step: 494/464, loss: 0.07602205127477646 2023-01-22 14:47:14.901385: step: 496/464, loss: 0.03136870637536049 2023-01-22 14:47:15.585378: step: 498/464, loss: 1.29381263256073 2023-01-22 14:47:16.311500: step: 500/464, loss: 0.019961200654506683 2023-01-22 14:47:17.031174: step: 502/464, loss: 6.027398109436035 2023-01-22 14:47:17.834127: step: 504/464, loss: 2.7483906745910645 2023-01-22 14:47:18.583528: step: 506/464, loss: 0.21104967594146729 2023-01-22 14:47:19.325194: step: 508/464, loss: 0.13606393337249756 2023-01-22 
14:47:20.105439: step: 510/464, loss: 4.976814270019531 2023-01-22 14:47:20.812023: step: 512/464, loss: 0.10234958678483963 2023-01-22 14:47:21.538117: step: 514/464, loss: 0.05969827249646187 2023-01-22 14:47:22.294786: step: 516/464, loss: 1.065191626548767 2023-01-22 14:47:23.000713: step: 518/464, loss: 0.04677828773856163 2023-01-22 14:47:23.781401: step: 520/464, loss: 0.24411696195602417 2023-01-22 14:47:24.484708: step: 522/464, loss: 0.11171295493841171 2023-01-22 14:47:25.191569: step: 524/464, loss: 0.0074959914200007915 2023-01-22 14:47:26.042422: step: 526/464, loss: 0.9222694039344788 2023-01-22 14:47:26.824900: step: 528/464, loss: 0.0855538472533226 2023-01-22 14:47:27.513289: step: 530/464, loss: 0.011772975325584412 2023-01-22 14:47:28.199167: step: 532/464, loss: 0.09362389147281647 2023-01-22 14:47:28.931884: step: 534/464, loss: 0.016605759039521217 2023-01-22 14:47:29.733638: step: 536/464, loss: 2.961850881576538 2023-01-22 14:47:30.357610: step: 538/464, loss: 0.00629340810701251 2023-01-22 14:47:31.111953: step: 540/464, loss: 0.01514369435608387 2023-01-22 14:47:31.937548: step: 542/464, loss: 0.875588595867157 2023-01-22 14:47:32.699745: step: 544/464, loss: 0.009261378087103367 2023-01-22 14:47:33.396826: step: 546/464, loss: 0.021717533469200134 2023-01-22 14:47:34.196211: step: 548/464, loss: 0.1673237681388855 2023-01-22 14:47:34.881401: step: 550/464, loss: 0.05481730401515961 2023-01-22 14:47:35.703116: step: 552/464, loss: 0.06743796169757843 2023-01-22 14:47:36.511934: step: 554/464, loss: 0.5204348564147949 2023-01-22 14:47:37.232750: step: 556/464, loss: 0.09271502494812012 2023-01-22 14:47:37.977627: step: 558/464, loss: 0.4915308654308319 2023-01-22 14:47:38.655636: step: 560/464, loss: 0.18924763798713684 2023-01-22 14:47:39.343767: step: 562/464, loss: 0.01631811447441578 2023-01-22 14:47:40.140488: step: 564/464, loss: 0.00829684641212225 2023-01-22 14:47:40.814028: step: 566/464, loss: 0.016613049432635307 2023-01-22 
14:47:41.574085: step: 568/464, loss: 0.01600293442606926 2023-01-22 14:47:42.253763: step: 570/464, loss: 0.1619260311126709 2023-01-22 14:47:42.996586: step: 572/464, loss: 0.05017637461423874 2023-01-22 14:47:43.695765: step: 574/464, loss: 0.017390595749020576 2023-01-22 14:47:44.433066: step: 576/464, loss: 0.007992753759026527 2023-01-22 14:47:45.198195: step: 578/464, loss: 0.09127692133188248 2023-01-22 14:47:45.894683: step: 580/464, loss: 0.024530908092856407 2023-01-22 14:47:46.674412: step: 582/464, loss: 0.2850531041622162 2023-01-22 14:47:47.398777: step: 584/464, loss: 0.02502652071416378 2023-01-22 14:47:48.172911: step: 586/464, loss: 0.020360369235277176 2023-01-22 14:47:48.900860: step: 588/464, loss: 0.047007378190755844 2023-01-22 14:47:49.609424: step: 590/464, loss: 0.005273542366921902 2023-01-22 14:47:50.371807: step: 592/464, loss: 0.0321476086974144 2023-01-22 14:47:51.054006: step: 594/464, loss: 0.023537082597613335 2023-01-22 14:47:51.727038: step: 596/464, loss: 0.021900134161114693 2023-01-22 14:47:52.484471: step: 598/464, loss: 0.06477120518684387 2023-01-22 14:47:53.220711: step: 600/464, loss: 0.022067628800868988 2023-01-22 14:47:53.898723: step: 602/464, loss: 0.02224505878984928 2023-01-22 14:47:54.754460: step: 604/464, loss: 0.0633058026432991 2023-01-22 14:47:55.546168: step: 606/464, loss: 0.0420142337679863 2023-01-22 14:47:56.216620: step: 608/464, loss: 0.023906320333480835 2023-01-22 14:47:56.905069: step: 610/464, loss: 0.19567719101905823 2023-01-22 14:47:57.629781: step: 612/464, loss: 0.006387477740645409 2023-01-22 14:47:58.393759: step: 614/464, loss: 0.00764105562120676 2023-01-22 14:47:59.093467: step: 616/464, loss: 0.07847554236650467 2023-01-22 14:47:59.803127: step: 618/464, loss: 0.007820271886885166 2023-01-22 14:48:00.503628: step: 620/464, loss: 0.06269571930170059 2023-01-22 14:48:01.265601: step: 622/464, loss: 0.02810000814497471 2023-01-22 14:48:02.114057: step: 624/464, loss: 0.011240391060709953 
2023-01-22 14:48:02.844409: step: 626/464, loss: 0.026797929778695107 2023-01-22 14:48:03.569324: step: 628/464, loss: 0.11885682493448257 2023-01-22 14:48:04.306013: step: 630/464, loss: 0.038971733301877975 2023-01-22 14:48:05.080083: step: 632/464, loss: 0.014642666094005108 2023-01-22 14:48:05.874347: step: 634/464, loss: 0.08038690686225891 2023-01-22 14:48:06.624166: step: 636/464, loss: 0.3002861738204956 2023-01-22 14:48:07.402629: step: 638/464, loss: 0.10866020619869232 2023-01-22 14:48:08.096751: step: 640/464, loss: 0.031055618077516556 2023-01-22 14:48:08.849403: step: 642/464, loss: 0.8886435031890869 2023-01-22 14:48:09.613465: step: 644/464, loss: 0.3501538336277008 2023-01-22 14:48:10.417248: step: 646/464, loss: 0.040703218430280685 2023-01-22 14:48:11.237416: step: 648/464, loss: 0.027664896100759506 2023-01-22 14:48:12.018860: step: 650/464, loss: 0.0026877825148403645 2023-01-22 14:48:12.761885: step: 652/464, loss: 0.03684217855334282 2023-01-22 14:48:13.521176: step: 654/464, loss: 0.050308216363191605 2023-01-22 14:48:14.248827: step: 656/464, loss: 0.046196311712265015 2023-01-22 14:48:14.984123: step: 658/464, loss: 0.01166035607457161 2023-01-22 14:48:15.706519: step: 660/464, loss: 0.08269614726305008 2023-01-22 14:48:16.417104: step: 662/464, loss: 0.035362984985113144 2023-01-22 14:48:17.195898: step: 664/464, loss: 0.0011275168508291245 2023-01-22 14:48:17.889382: step: 666/464, loss: 0.000661600090097636 2023-01-22 14:48:18.645158: step: 668/464, loss: 0.053751636296510696 2023-01-22 14:48:19.331067: step: 670/464, loss: 0.024816978722810745 2023-01-22 14:48:20.071867: step: 672/464, loss: 0.0922572910785675 2023-01-22 14:48:20.857250: step: 674/464, loss: 0.04353713616728783 2023-01-22 14:48:21.621589: step: 676/464, loss: 0.00623973598703742 2023-01-22 14:48:22.297614: step: 678/464, loss: 0.007656637113541365 2023-01-22 14:48:23.083005: step: 680/464, loss: 0.11449252814054489 2023-01-22 14:48:23.872534: step: 682/464, loss: 
0.023578859865665436 2023-01-22 14:48:24.608519: step: 684/464, loss: 0.0682889074087143 2023-01-22 14:48:25.356397: step: 686/464, loss: 0.02251061610877514 2023-01-22 14:48:26.099562: step: 688/464, loss: 0.1295761913061142 2023-01-22 14:48:26.767613: step: 690/464, loss: 0.008971642702817917 2023-01-22 14:48:27.440885: step: 692/464, loss: 0.02738947980105877 2023-01-22 14:48:28.061048: step: 694/464, loss: 0.016341639682650566 2023-01-22 14:48:28.799741: step: 696/464, loss: 0.014974063262343407 2023-01-22 14:48:29.506300: step: 698/464, loss: 0.0013749344507232308 2023-01-22 14:48:30.241844: step: 700/464, loss: 0.016520194709300995 2023-01-22 14:48:31.078030: step: 702/464, loss: 0.052617549896240234 2023-01-22 14:48:31.803970: step: 704/464, loss: 0.024146053940057755 2023-01-22 14:48:32.626631: step: 706/464, loss: 0.026096755638718605 2023-01-22 14:48:33.330265: step: 708/464, loss: 0.014157273806631565 2023-01-22 14:48:34.058346: step: 710/464, loss: 0.04489751532673836 2023-01-22 14:48:34.833732: step: 712/464, loss: 0.2788737416267395 2023-01-22 14:48:35.572173: step: 714/464, loss: 0.012423225678503513 2023-01-22 14:48:36.296246: step: 716/464, loss: 0.05163278803229332 2023-01-22 14:48:37.011701: step: 718/464, loss: 0.03783539682626724 2023-01-22 14:48:37.879753: step: 720/464, loss: 0.007425676565617323 2023-01-22 14:48:38.573598: step: 722/464, loss: 0.03953913599252701 2023-01-22 14:48:39.333412: step: 724/464, loss: 0.22150929272174835 2023-01-22 14:48:40.086992: step: 726/464, loss: 0.010052965953946114 2023-01-22 14:48:40.817541: step: 728/464, loss: 0.051495082676410675 2023-01-22 14:48:41.603325: step: 730/464, loss: 0.05625370517373085 2023-01-22 14:48:42.324858: step: 732/464, loss: 0.06636309623718262 2023-01-22 14:48:43.051890: step: 734/464, loss: 0.07243285328149796 2023-01-22 14:48:43.922964: step: 736/464, loss: 0.13123223185539246 2023-01-22 14:48:44.678831: step: 738/464, loss: 0.04448264464735985 2023-01-22 14:48:45.468570: step: 
740/464, loss: 0.05884762108325958 2023-01-22 14:48:46.182596: step: 742/464, loss: 0.02446148730814457 2023-01-22 14:48:46.978955: step: 744/464, loss: 0.0030263373628258705 2023-01-22 14:48:47.781064: step: 746/464, loss: 0.04585908725857735 2023-01-22 14:48:48.477586: step: 748/464, loss: 0.06436149030923843 2023-01-22 14:48:49.151648: step: 750/464, loss: 0.0034831571392714977 2023-01-22 14:48:49.901338: step: 752/464, loss: 0.0066585722379386425 2023-01-22 14:48:50.671413: step: 754/464, loss: 0.08963311463594437 2023-01-22 14:48:51.393550: step: 756/464, loss: 0.028069909662008286 2023-01-22 14:48:52.112095: step: 758/464, loss: 0.03629578277468681 2023-01-22 14:48:52.856414: step: 760/464, loss: 0.09160567820072174 2023-01-22 14:48:53.555306: step: 762/464, loss: 0.03636811301112175 2023-01-22 14:48:54.311524: step: 764/464, loss: 0.012255738489329815 2023-01-22 14:48:55.163653: step: 766/464, loss: 0.07985013723373413 2023-01-22 14:48:55.901517: step: 768/464, loss: 0.011398863978683949 2023-01-22 14:48:56.615989: step: 770/464, loss: 0.10022206604480743 2023-01-22 14:48:57.376330: step: 772/464, loss: 0.2913207709789276 2023-01-22 14:48:58.090869: step: 774/464, loss: 0.016932744532823563 2023-01-22 14:48:58.845081: step: 776/464, loss: 0.006578746717423201 2023-01-22 14:48:59.489693: step: 778/464, loss: 0.423820823431015 2023-01-22 14:49:00.133265: step: 780/464, loss: 0.027954528108239174 2023-01-22 14:49:00.880960: step: 782/464, loss: 0.34176504611968994 2023-01-22 14:49:01.639209: step: 784/464, loss: 0.47238412499427795 2023-01-22 14:49:02.387926: step: 786/464, loss: 0.019191065803170204 2023-01-22 14:49:03.161632: step: 788/464, loss: 0.006832434795796871 2023-01-22 14:49:03.933717: step: 790/464, loss: 0.051002223044633865 2023-01-22 14:49:04.581163: step: 792/464, loss: 0.009259670041501522 2023-01-22 14:49:05.247434: step: 794/464, loss: 0.02580624632537365 2023-01-22 14:49:05.890115: step: 796/464, loss: 0.057828933000564575 2023-01-22 
14:49:06.627303: step: 798/464, loss: 0.05048080533742905 2023-01-22 14:49:07.384523: step: 800/464, loss: 0.06187806278467178 2023-01-22 14:49:08.216645: step: 802/464, loss: 0.04779934138059616 2023-01-22 14:49:08.943926: step: 804/464, loss: 0.04217812046408653 2023-01-22 14:49:09.657056: step: 806/464, loss: 0.03599544242024422 2023-01-22 14:49:10.408638: step: 808/464, loss: 0.0035362495109438896 2023-01-22 14:49:11.110805: step: 810/464, loss: 0.019715817645192146 2023-01-22 14:49:11.859820: step: 812/464, loss: 0.026319032534956932 2023-01-22 14:49:12.628471: step: 814/464, loss: 0.1430322378873825 2023-01-22 14:49:13.427009: step: 816/464, loss: 0.02330571413040161 2023-01-22 14:49:14.174413: step: 818/464, loss: 0.01277852151542902 2023-01-22 14:49:14.842501: step: 820/464, loss: 0.0470539890229702 2023-01-22 14:49:15.534511: step: 822/464, loss: 0.02166724018752575 2023-01-22 14:49:16.218974: step: 824/464, loss: 0.0243095550686121 2023-01-22 14:49:16.972134: step: 826/464, loss: 0.007452270481735468 2023-01-22 14:49:17.594565: step: 828/464, loss: 0.04806208238005638 2023-01-22 14:49:18.336657: step: 830/464, loss: 0.0014277611626312137 2023-01-22 14:49:19.147026: step: 832/464, loss: 0.051559727638959885 2023-01-22 14:49:19.878188: step: 834/464, loss: 0.017420530319213867 2023-01-22 14:49:20.698935: step: 836/464, loss: 0.048703331500291824 2023-01-22 14:49:21.372381: step: 838/464, loss: 0.539999783039093 2023-01-22 14:49:22.150662: step: 840/464, loss: 0.05690797418355942 2023-01-22 14:49:22.911506: step: 842/464, loss: 0.03980085253715515 2023-01-22 14:49:23.605704: step: 844/464, loss: 0.4376107156276703 2023-01-22 14:49:24.406509: step: 846/464, loss: 0.02877359464764595 2023-01-22 14:49:25.310295: step: 848/464, loss: 0.006814028136432171 2023-01-22 14:49:25.994157: step: 850/464, loss: 0.03682967647910118 2023-01-22 14:49:26.699576: step: 852/464, loss: 0.048403650522232056 2023-01-22 14:49:27.531652: step: 854/464, loss: 0.009567310102283955 
2023-01-22 14:49:28.347774: step: 856/464, loss: 0.010289072059094906 2023-01-22 14:49:29.113560: step: 858/464, loss: 0.025864794850349426 2023-01-22 14:49:29.790515: step: 860/464, loss: 0.003975290339440107 2023-01-22 14:49:30.601472: step: 862/464, loss: 0.2906065285205841 2023-01-22 14:49:31.324512: step: 864/464, loss: 0.04651524871587753 2023-01-22 14:49:32.039181: step: 866/464, loss: 0.020465346053242683 2023-01-22 14:49:32.753544: step: 868/464, loss: 0.07506394386291504 2023-01-22 14:49:33.462521: step: 870/464, loss: 0.0477907694876194 2023-01-22 14:49:34.196730: step: 872/464, loss: 0.03773835673928261 2023-01-22 14:49:34.950515: step: 874/464, loss: 0.01569550670683384 2023-01-22 14:49:35.671384: step: 876/464, loss: 0.024225924164056778 2023-01-22 14:49:36.488570: step: 878/464, loss: 0.03536829352378845 2023-01-22 14:49:37.156614: step: 880/464, loss: 0.01074038352817297 2023-01-22 14:49:37.867191: step: 882/464, loss: 0.010568689554929733 2023-01-22 14:49:38.610999: step: 884/464, loss: 0.05296974256634712 2023-01-22 14:49:39.342323: step: 886/464, loss: 0.027974413707852364 2023-01-22 14:49:40.107622: step: 888/464, loss: 0.00846959464251995 2023-01-22 14:49:40.820398: step: 890/464, loss: 0.1548718512058258 2023-01-22 14:49:41.527635: step: 892/464, loss: 0.013443255797028542 2023-01-22 14:49:42.234947: step: 894/464, loss: 0.011265892535448074 2023-01-22 14:49:43.022278: step: 896/464, loss: 0.042480919510126114 2023-01-22 14:49:43.808030: step: 898/464, loss: 0.3838881254196167 2023-01-22 14:49:44.511747: step: 900/464, loss: 0.010052017867565155 2023-01-22 14:49:45.241181: step: 902/464, loss: 0.1091848760843277 2023-01-22 14:49:45.975391: step: 904/464, loss: 0.004417903255671263 2023-01-22 14:49:46.667085: step: 906/464, loss: 0.034344952553510666 2023-01-22 14:49:47.375380: step: 908/464, loss: 0.008729771710932255 2023-01-22 14:49:48.099450: step: 910/464, loss: 0.19502171874046326 2023-01-22 14:49:48.801860: step: 912/464, loss: 
0.029527118429541588
2023-01-22 14:49:49.671113: step: 914/464, loss: 0.01878412254154682
2023-01-22 14:49:50.429563: step: 916/464, loss: 0.15181230008602142
2023-01-22 14:49:51.170714: step: 918/464, loss: 0.004631921648979187
2023-01-22 14:49:51.943324: step: 920/464, loss: 0.04741932079195976
2023-01-22 14:49:52.753701: step: 922/464, loss: 0.0586797297000885
2023-01-22 14:49:53.489829: step: 924/464, loss: 0.05027880519628525
2023-01-22 14:49:54.112614: step: 926/464, loss: 0.18030577898025513
2023-01-22 14:49:54.949407: step: 928/464, loss: 0.004381497856229544
2023-01-22 14:49:55.681009: step: 930/464, loss: 0.010361275635659695
==================================================
Loss: 0.184
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31846038146441374, 'r': 0.33719334507996745, 'f1': 0.32755924950625415}, 'combined': 0.24135944700460832, 'epoch': 24}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30381005352067847, 'r': 0.2876148482984508, 'f1': 0.295490711284311}, 'combined': 0.1835152838502563, 'epoch': 24}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2983394666988417, 'r': 0.33513655462184877, 'f1': 0.31566928379931064}, 'combined': 0.2325984196415973, 'epoch': 24}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29803361470226897, 'r': 0.28744209433772633, 'f1': 0.29264205182323294}, 'combined': 0.18174611639548152, 'epoch': 24}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26875, 'r': 0.30714285714285716, 'f1': 0.2866666666666666}, 'combined': 0.19111111111111106, 'epoch': 24}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2727272727272727, 'r': 0.5217391304347826, 'f1': 0.3582089552238806}, 'combined': 0.1791044776119403, 'epoch': 24}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 25
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:52:41.236583: step: 2/464, loss: 0.0419095940887928
2023-01-22 14:52:42.003975: step: 4/464, loss: 0.033328909426927567
2023-01-22 14:52:42.725186: step: 6/464, loss: 0.018745923414826393
2023-01-22 14:52:43.462722: step: 8/464, loss: 0.029888661578297615
2023-01-22 14:52:44.209086: step: 10/464, loss: 0.06249646469950676
2023-01-22 14:52:44.975270: step: 12/464, loss: 0.06099044531583786
2023-01-22 14:52:45.669920: step: 14/464, loss: 0.0551176480948925
2023-01-22 14:52:46.355216: step: 16/464, loss: 0.051852885633707047
2023-01-22 14:52:47.086253: step: 18/464, loss: 0.015076231211423874
2023-01-22 14:52:47.908799: step: 20/464, loss: 0.0029031329322606325
2023-01-22 14:52:48.619679: step: 22/464, loss: 0.018571581691503525
2023-01-22 14:52:49.429354: step: 24/464, loss: 0.03045966476202011
2023-01-22 14:52:50.098481: step: 26/464, loss: 0.047910746186971664
2023-01-22 14:52:50.937606: step: 28/464,
loss: 0.016626769676804543 2023-01-22 14:52:51.594850: step: 30/464, loss: 0.05166122689843178 2023-01-22 14:52:52.387470: step: 32/464, loss: 0.020833736285567284 2023-01-22 14:52:53.117129: step: 34/464, loss: 0.04184484854340553 2023-01-22 14:52:54.045378: step: 36/464, loss: 0.0944288820028305 2023-01-22 14:52:54.732988: step: 38/464, loss: 0.0076257940381765366 2023-01-22 14:52:55.479958: step: 40/464, loss: 0.0417209193110466 2023-01-22 14:52:56.182817: step: 42/464, loss: 0.026369836181402206 2023-01-22 14:52:57.033769: step: 44/464, loss: 0.021718092262744904 2023-01-22 14:52:57.818490: step: 46/464, loss: 0.021722707897424698 2023-01-22 14:52:58.522194: step: 48/464, loss: 0.08610419183969498 2023-01-22 14:52:59.314387: step: 50/464, loss: 0.01406539510935545 2023-01-22 14:53:00.014940: step: 52/464, loss: 0.009583326987922192 2023-01-22 14:53:00.810430: step: 54/464, loss: 0.0708504468202591 2023-01-22 14:53:01.508789: step: 56/464, loss: 0.023223470896482468 2023-01-22 14:53:02.226816: step: 58/464, loss: 0.026299627497792244 2023-01-22 14:53:02.944011: step: 60/464, loss: 0.47032177448272705 2023-01-22 14:53:03.708658: step: 62/464, loss: 0.017070485278964043 2023-01-22 14:53:04.438862: step: 64/464, loss: 0.24147924780845642 2023-01-22 14:53:05.216490: step: 66/464, loss: 0.05244443193078041 2023-01-22 14:53:05.948198: step: 68/464, loss: 0.030533207580447197 2023-01-22 14:53:06.684682: step: 70/464, loss: 0.014197474345564842 2023-01-22 14:53:07.415747: step: 72/464, loss: 0.06051541492342949 2023-01-22 14:53:08.136692: step: 74/464, loss: 0.03234916180372238 2023-01-22 14:53:08.858456: step: 76/464, loss: 0.011735406704246998 2023-01-22 14:53:09.595480: step: 78/464, loss: 0.03225342929363251 2023-01-22 14:53:10.300799: step: 80/464, loss: 0.019292151555418968 2023-01-22 14:53:11.124068: step: 82/464, loss: 0.019184516742825508 2023-01-22 14:53:11.880595: step: 84/464, loss: 0.008848621509969234 2023-01-22 14:53:12.575639: step: 86/464, loss: 
0.022780464962124825 2023-01-22 14:53:13.321833: step: 88/464, loss: 0.00024235923774540424 2023-01-22 14:53:14.054647: step: 90/464, loss: 0.04511115700006485 2023-01-22 14:53:14.792188: step: 92/464, loss: 0.007133756764233112 2023-01-22 14:53:15.569732: step: 94/464, loss: 0.055271781980991364 2023-01-22 14:53:16.344561: step: 96/464, loss: 0.006807959638535976 2023-01-22 14:53:16.992910: step: 98/464, loss: 0.0008343493682332337 2023-01-22 14:53:17.722483: step: 100/464, loss: 0.7415191531181335 2023-01-22 14:53:18.405904: step: 102/464, loss: 0.016804508864879608 2023-01-22 14:53:19.163242: step: 104/464, loss: 0.03757983446121216 2023-01-22 14:53:20.017075: step: 106/464, loss: 0.07727273553609848 2023-01-22 14:53:20.721410: step: 108/464, loss: 0.0360029935836792 2023-01-22 14:53:21.449211: step: 110/464, loss: 0.026875462383031845 2023-01-22 14:53:22.099420: step: 112/464, loss: 0.033845219761133194 2023-01-22 14:53:22.739329: step: 114/464, loss: 0.22472339868545532 2023-01-22 14:53:23.387278: step: 116/464, loss: 0.012281347066164017 2023-01-22 14:53:24.103024: step: 118/464, loss: 0.0384163036942482 2023-01-22 14:53:24.834121: step: 120/464, loss: 0.020371444523334503 2023-01-22 14:53:25.569477: step: 122/464, loss: 0.005753284320235252 2023-01-22 14:53:26.221323: step: 124/464, loss: 0.009750408120453358 2023-01-22 14:53:26.864550: step: 126/464, loss: 0.007648147642612457 2023-01-22 14:53:27.597188: step: 128/464, loss: 0.006110684014856815 2023-01-22 14:53:28.383561: step: 130/464, loss: 0.008586232550442219 2023-01-22 14:53:29.082615: step: 132/464, loss: 0.020226802676916122 2023-01-22 14:53:29.818249: step: 134/464, loss: 0.023586483672261238 2023-01-22 14:53:30.498418: step: 136/464, loss: 0.022161681205034256 2023-01-22 14:53:31.264149: step: 138/464, loss: 0.2518976628780365 2023-01-22 14:53:32.064537: step: 140/464, loss: 0.02171619050204754 2023-01-22 14:53:32.838179: step: 142/464, loss: 0.0016933977603912354 2023-01-22 14:53:33.590641: step: 
144/464, loss: 0.002125772647559643 2023-01-22 14:53:34.271765: step: 146/464, loss: 0.0029265161138027906 2023-01-22 14:53:34.999593: step: 148/464, loss: 0.17299233376979828 2023-01-22 14:53:35.700941: step: 150/464, loss: 0.04470699280500412 2023-01-22 14:53:36.385122: step: 152/464, loss: 0.028527090325951576 2023-01-22 14:53:37.106747: step: 154/464, loss: 0.005463603418320417 2023-01-22 14:53:37.910330: step: 156/464, loss: 0.030952492728829384 2023-01-22 14:53:38.575837: step: 158/464, loss: 0.010252845473587513 2023-01-22 14:53:39.245618: step: 160/464, loss: 0.01254748459905386 2023-01-22 14:53:39.905224: step: 162/464, loss: 0.009948525577783585 2023-01-22 14:53:40.604301: step: 164/464, loss: 0.01122485101222992 2023-01-22 14:53:41.337295: step: 166/464, loss: 0.028889335691928864 2023-01-22 14:53:42.085661: step: 168/464, loss: 0.2363210767507553 2023-01-22 14:53:42.803034: step: 170/464, loss: 0.04092266410589218 2023-01-22 14:53:43.505618: step: 172/464, loss: 0.059117868542671204 2023-01-22 14:53:44.222715: step: 174/464, loss: 0.007250105030834675 2023-01-22 14:53:45.030459: step: 176/464, loss: 0.03736590966582298 2023-01-22 14:53:45.771284: step: 178/464, loss: 0.019464800134301186 2023-01-22 14:53:46.518899: step: 180/464, loss: 0.030115772038698196 2023-01-22 14:53:47.264626: step: 182/464, loss: 0.03560752794146538 2023-01-22 14:53:48.020573: step: 184/464, loss: 0.20950794219970703 2023-01-22 14:53:48.763788: step: 186/464, loss: 0.031240640208125114 2023-01-22 14:53:49.467919: step: 188/464, loss: 0.018869325518608093 2023-01-22 14:53:50.322198: step: 190/464, loss: 1.1431554555892944 2023-01-22 14:53:51.032234: step: 192/464, loss: 0.02737320400774479 2023-01-22 14:53:51.717494: step: 194/464, loss: 0.003148356219753623 2023-01-22 14:53:52.464334: step: 196/464, loss: 0.04883582890033722 2023-01-22 14:53:53.152822: step: 198/464, loss: 0.03754701465368271 2023-01-22 14:53:53.947434: step: 200/464, loss: 0.011245599016547203 2023-01-22 
14:53:54.698413: step: 202/464, loss: 0.07296408712863922 2023-01-22 14:53:55.374554: step: 204/464, loss: 0.024501923471689224 2023-01-22 14:53:56.170751: step: 206/464, loss: 0.1741705983877182 2023-01-22 14:53:56.941196: step: 208/464, loss: 0.0009615622693672776 2023-01-22 14:53:57.593861: step: 210/464, loss: 0.005018910858780146 2023-01-22 14:53:58.281080: step: 212/464, loss: 0.06708942353725433 2023-01-22 14:53:59.033426: step: 214/464, loss: 0.0673610121011734 2023-01-22 14:53:59.878407: step: 216/464, loss: 0.0976266860961914 2023-01-22 14:54:00.696732: step: 218/464, loss: 0.005886498838663101 2023-01-22 14:54:01.429343: step: 220/464, loss: 0.05451471731066704 2023-01-22 14:54:02.204879: step: 222/464, loss: 0.03318226337432861 2023-01-22 14:54:02.907360: step: 224/464, loss: 0.035258155316114426 2023-01-22 14:54:03.633124: step: 226/464, loss: 0.04683135449886322 2023-01-22 14:54:04.354909: step: 228/464, loss: 0.04850374534726143 2023-01-22 14:54:05.153771: step: 230/464, loss: 0.017435938119888306 2023-01-22 14:54:05.894186: step: 232/464, loss: 0.0573580302298069 2023-01-22 14:54:06.630963: step: 234/464, loss: 1.5812244415283203 2023-01-22 14:54:07.277268: step: 236/464, loss: 0.005821664817631245 2023-01-22 14:54:07.995010: step: 238/464, loss: 0.01273046899586916 2023-01-22 14:54:08.697066: step: 240/464, loss: 0.008258101530373096 2023-01-22 14:54:09.472783: step: 242/464, loss: 0.012394250370562077 2023-01-22 14:54:10.169173: step: 244/464, loss: 0.022714046761393547 2023-01-22 14:54:10.937318: step: 246/464, loss: 0.04308302700519562 2023-01-22 14:54:11.813814: step: 248/464, loss: 0.07067999243736267 2023-01-22 14:54:12.621449: step: 250/464, loss: 0.02938844822347164 2023-01-22 14:54:13.371834: step: 252/464, loss: 0.07796920835971832 2023-01-22 14:54:14.030124: step: 254/464, loss: 0.017405791208148003 2023-01-22 14:54:14.809543: step: 256/464, loss: 0.004125073552131653 2023-01-22 14:54:15.663925: step: 258/464, loss: 0.03315935656428337 
2023-01-22 14:54:16.324860: step: 260/464, loss: 0.005836012773215771 2023-01-22 14:54:17.072831: step: 262/464, loss: 0.04530177637934685 2023-01-22 14:54:17.822389: step: 264/464, loss: 0.32245516777038574 2023-01-22 14:54:18.516892: step: 266/464, loss: 0.0796879455447197 2023-01-22 14:54:19.349566: step: 268/464, loss: 0.1779918223619461 2023-01-22 14:54:20.117857: step: 270/464, loss: 0.015152917243540287 2023-01-22 14:54:20.822184: step: 272/464, loss: 0.017251847311854362 2023-01-22 14:54:21.537352: step: 274/464, loss: 0.07568525522947311 2023-01-22 14:54:22.293899: step: 276/464, loss: 0.010382352396845818 2023-01-22 14:54:23.054438: step: 278/464, loss: 0.050715453922748566 2023-01-22 14:54:23.762748: step: 280/464, loss: 0.009335944429039955 2023-01-22 14:54:24.406042: step: 282/464, loss: 0.03318656235933304 2023-01-22 14:54:25.137368: step: 284/464, loss: 0.03201092779636383 2023-01-22 14:54:25.846087: step: 286/464, loss: 0.028812000527977943 2023-01-22 14:54:26.567973: step: 288/464, loss: 0.001550150802358985 2023-01-22 14:54:27.291502: step: 290/464, loss: 0.003170878393575549 2023-01-22 14:54:28.066744: step: 292/464, loss: 0.01939692534506321 2023-01-22 14:54:28.953946: step: 294/464, loss: 0.0947684794664383 2023-01-22 14:54:29.660919: step: 296/464, loss: 0.3248078525066376 2023-01-22 14:54:30.413220: step: 298/464, loss: 0.06562580913305283 2023-01-22 14:54:31.131054: step: 300/464, loss: 0.028699979186058044 2023-01-22 14:54:31.890614: step: 302/464, loss: 0.024749379605054855 2023-01-22 14:54:32.620579: step: 304/464, loss: 0.012792816385626793 2023-01-22 14:54:33.430978: step: 306/464, loss: 0.008358832448720932 2023-01-22 14:54:34.123581: step: 308/464, loss: 0.2585357129573822 2023-01-22 14:54:34.877935: step: 310/464, loss: 0.0026077909860759974 2023-01-22 14:54:35.529472: step: 312/464, loss: 0.04927883297204971 2023-01-22 14:54:36.328324: step: 314/464, loss: 0.051128946244716644 2023-01-22 14:54:37.093282: step: 316/464, loss: 
0.05816463753581047 2023-01-22 14:54:37.870750: step: 318/464, loss: 0.2281452864408493 2023-01-22 14:54:38.569046: step: 320/464, loss: 0.04876122996211052 2023-01-22 14:54:39.308720: step: 322/464, loss: 0.03142109885811806 2023-01-22 14:54:40.084523: step: 324/464, loss: 0.08197657018899918 2023-01-22 14:54:40.889372: step: 326/464, loss: 0.03738683834671974 2023-01-22 14:54:41.577381: step: 328/464, loss: 0.005808872636407614 2023-01-22 14:54:42.306947: step: 330/464, loss: 0.022158043459057808 2023-01-22 14:54:43.057871: step: 332/464, loss: 0.005494985263794661 2023-01-22 14:54:43.797913: step: 334/464, loss: 0.03324667736887932 2023-01-22 14:54:44.624176: step: 336/464, loss: 0.029779573902487755 2023-01-22 14:54:45.344137: step: 338/464, loss: 0.01161265093833208 2023-01-22 14:54:46.077542: step: 340/464, loss: 0.0024349091108888388 2023-01-22 14:54:46.836106: step: 342/464, loss: 0.08912298828363419 2023-01-22 14:54:47.560516: step: 344/464, loss: 0.07514897733926773 2023-01-22 14:54:48.322162: step: 346/464, loss: 0.020701896399259567 2023-01-22 14:54:49.001956: step: 348/464, loss: 0.015195302665233612 2023-01-22 14:54:49.716570: step: 350/464, loss: 0.004897533915936947 2023-01-22 14:54:50.438970: step: 352/464, loss: 0.015249923802912235 2023-01-22 14:54:51.151702: step: 354/464, loss: 0.0631057620048523 2023-01-22 14:54:51.835620: step: 356/464, loss: 0.005183662287890911 2023-01-22 14:54:52.494483: step: 358/464, loss: 0.015517822466790676 2023-01-22 14:54:53.311812: step: 360/464, loss: 0.027061475440859795 2023-01-22 14:54:54.062567: step: 362/464, loss: 0.02585103176534176 2023-01-22 14:54:54.793821: step: 364/464, loss: 0.014051264151930809 2023-01-22 14:54:55.492346: step: 366/464, loss: 0.007386281155049801 2023-01-22 14:54:56.223479: step: 368/464, loss: 0.01598726026713848 2023-01-22 14:54:56.888893: step: 370/464, loss: 0.03127245977520943 2023-01-22 14:54:57.692356: step: 372/464, loss: 0.04506853222846985 2023-01-22 14:54:58.461925: step: 
374/464, loss: 0.02675584889948368 2023-01-22 14:54:59.270921: step: 376/464, loss: 0.021385207772254944 2023-01-22 14:54:59.997680: step: 378/464, loss: 0.03744923695921898 2023-01-22 14:55:00.701647: step: 380/464, loss: 0.017804961651563644 2023-01-22 14:55:01.440875: step: 382/464, loss: 0.031217500567436218 2023-01-22 14:55:02.140294: step: 384/464, loss: 0.003439696505665779 2023-01-22 14:55:02.870474: step: 386/464, loss: 0.020607760176062584 2023-01-22 14:55:03.625353: step: 388/464, loss: 0.02362990193068981 2023-01-22 14:55:04.366300: step: 390/464, loss: 0.06575938314199448 2023-01-22 14:55:05.079631: step: 392/464, loss: 0.08327308297157288 2023-01-22 14:55:05.918365: step: 394/464, loss: 0.05336488410830498 2023-01-22 14:55:06.641892: step: 396/464, loss: 0.028403114527463913 2023-01-22 14:55:07.443235: step: 398/464, loss: 0.004095328506082296 2023-01-22 14:55:08.156693: step: 400/464, loss: 0.017295170575380325 2023-01-22 14:55:08.918625: step: 402/464, loss: 0.01599576510488987 2023-01-22 14:55:09.666360: step: 404/464, loss: 0.04949192702770233 2023-01-22 14:55:10.410547: step: 406/464, loss: 0.005515251308679581 2023-01-22 14:55:11.181027: step: 408/464, loss: 0.01241600513458252 2023-01-22 14:55:11.926285: step: 410/464, loss: 0.009431494399905205 2023-01-22 14:55:12.613526: step: 412/464, loss: 0.025457508862018585 2023-01-22 14:55:13.315926: step: 414/464, loss: 0.1266927570104599 2023-01-22 14:55:14.003325: step: 416/464, loss: 0.0686836913228035 2023-01-22 14:55:14.859074: step: 418/464, loss: 0.005135274026542902 2023-01-22 14:55:15.548904: step: 420/464, loss: 0.032037705183029175 2023-01-22 14:55:16.315510: step: 422/464, loss: 0.05591385066509247 2023-01-22 14:55:17.037661: step: 424/464, loss: 0.0294948797672987 2023-01-22 14:55:17.742861: step: 426/464, loss: 0.00016107734700199217 2023-01-22 14:55:18.430515: step: 428/464, loss: 0.008500921539962292 2023-01-22 14:55:19.131822: step: 430/464, loss: 0.02550342306494713 2023-01-22 
14:55:19.862635: step: 432/464, loss: 0.0052842143923044205 2023-01-22 14:55:20.617119: step: 434/464, loss: 0.011525984853506088 2023-01-22 14:55:21.349457: step: 436/464, loss: 0.047036126255989075 2023-01-22 14:55:22.080058: step: 438/464, loss: 0.07947475463151932 2023-01-22 14:55:22.736612: step: 440/464, loss: 0.013226899318397045 2023-01-22 14:55:23.531362: step: 442/464, loss: 0.033193811774253845 2023-01-22 14:55:24.237118: step: 444/464, loss: 0.007622615899890661 2023-01-22 14:55:25.090427: step: 446/464, loss: 0.023654088377952576 2023-01-22 14:55:25.870807: step: 448/464, loss: 0.021606845781207085 2023-01-22 14:55:26.682060: step: 450/464, loss: 0.020649321377277374 2023-01-22 14:55:27.377686: step: 452/464, loss: 0.1784026324748993 2023-01-22 14:55:28.067245: step: 454/464, loss: 0.003845152212306857 2023-01-22 14:55:28.755272: step: 456/464, loss: 0.0008206103229895234 2023-01-22 14:55:29.449923: step: 458/464, loss: 0.011938661336898804 2023-01-22 14:55:30.170052: step: 460/464, loss: 0.06470710039138794 2023-01-22 14:55:30.897359: step: 462/464, loss: 0.07462415844202042 2023-01-22 14:55:31.645124: step: 464/464, loss: 0.010974127799272537 2023-01-22 14:55:32.422720: step: 466/464, loss: 0.054856494069099426 2023-01-22 14:55:33.181462: step: 468/464, loss: 0.10140056908130646 2023-01-22 14:55:33.893983: step: 470/464, loss: 0.10893024504184723 2023-01-22 14:55:34.656382: step: 472/464, loss: 0.14633715152740479 2023-01-22 14:55:35.438130: step: 474/464, loss: 0.007225559558719397 2023-01-22 14:55:36.121701: step: 476/464, loss: 0.13329027593135834 2023-01-22 14:55:36.900123: step: 478/464, loss: 0.0937078669667244 2023-01-22 14:55:37.654543: step: 480/464, loss: 0.10952486842870712 2023-01-22 14:55:38.350646: step: 482/464, loss: 0.048163872212171555 2023-01-22 14:55:39.154808: step: 484/464, loss: 0.0025707564782351255 2023-01-22 14:55:39.880508: step: 486/464, loss: 0.011743676848709583 2023-01-22 14:55:40.684932: step: 488/464, loss: 
0.01013951189815998 2023-01-22 14:55:41.440976: step: 490/464, loss: 0.030154544860124588 2023-01-22 14:55:42.196134: step: 492/464, loss: 0.011203690432012081 2023-01-22 14:55:42.951257: step: 494/464, loss: 0.09164541959762573 2023-01-22 14:55:43.681471: step: 496/464, loss: 0.06809283792972565 2023-01-22 14:55:44.522998: step: 498/464, loss: 0.07905484735965729 2023-01-22 14:55:45.247239: step: 500/464, loss: 0.1469649225473404 2023-01-22 14:55:45.981054: step: 502/464, loss: 0.03829202055931091 2023-01-22 14:55:46.759325: step: 504/464, loss: 0.20179483294487 2023-01-22 14:55:47.461778: step: 506/464, loss: 0.011438527144491673 2023-01-22 14:55:48.256886: step: 508/464, loss: 0.012148118577897549 2023-01-22 14:55:48.957808: step: 510/464, loss: 0.009396116249263287 2023-01-22 14:55:49.699506: step: 512/464, loss: 0.006800531875342131 2023-01-22 14:55:50.419234: step: 514/464, loss: 0.021514644846320152 2023-01-22 14:55:51.185811: step: 516/464, loss: 0.023618390783667564 2023-01-22 14:55:51.900782: step: 518/464, loss: 0.06186339259147644 2023-01-22 14:55:52.579429: step: 520/464, loss: 0.7646817564964294 2023-01-22 14:55:53.224685: step: 522/464, loss: 0.005214558448642492 2023-01-22 14:55:54.005013: step: 524/464, loss: 0.49541327357292175 2023-01-22 14:55:54.722876: step: 526/464, loss: 0.012311800383031368 2023-01-22 14:55:55.462655: step: 528/464, loss: 0.0192912295460701 2023-01-22 14:55:56.237411: step: 530/464, loss: 0.020535197108983994 2023-01-22 14:55:56.992832: step: 532/464, loss: 0.10043486952781677 2023-01-22 14:55:57.716952: step: 534/464, loss: 0.03558532893657684 2023-01-22 14:55:58.503520: step: 536/464, loss: 0.01853848062455654 2023-01-22 14:55:59.281674: step: 538/464, loss: 0.009096699766814709 2023-01-22 14:56:00.094140: step: 540/464, loss: 0.01217248011380434 2023-01-22 14:56:00.809470: step: 542/464, loss: 0.0004268806369509548 2023-01-22 14:56:01.607304: step: 544/464, loss: 0.06526768952608109 2023-01-22 14:56:02.313959: step: 
546/464, loss: 0.07047459483146667 2023-01-22 14:56:03.023752: step: 548/464, loss: 0.004196105059236288 2023-01-22 14:56:03.828951: step: 550/464, loss: 0.019067952409386635 2023-01-22 14:56:04.597038: step: 552/464, loss: 0.08304072916507721 2023-01-22 14:56:05.278126: step: 554/464, loss: 0.0038383365608751774 2023-01-22 14:56:06.089183: step: 556/464, loss: 0.03883401304483414 2023-01-22 14:56:06.825452: step: 558/464, loss: 0.06817010045051575 2023-01-22 14:56:07.546945: step: 560/464, loss: 0.16193494200706482 2023-01-22 14:56:08.303679: step: 562/464, loss: 0.043620552867650986 2023-01-22 14:56:09.021573: step: 564/464, loss: 0.037637632340192795 2023-01-22 14:56:09.705828: step: 566/464, loss: 0.00019179204537067562 2023-01-22 14:56:10.537769: step: 568/464, loss: 0.005051923915743828 2023-01-22 14:56:11.308726: step: 570/464, loss: 0.03895522281527519 2023-01-22 14:56:12.175471: step: 572/464, loss: 0.08474939316511154 2023-01-22 14:56:12.989118: step: 574/464, loss: 0.06918489933013916 2023-01-22 14:56:13.694803: step: 576/464, loss: 0.041286442428827286 2023-01-22 14:56:14.543218: step: 578/464, loss: 0.06198017671704292 2023-01-22 14:56:15.344323: step: 580/464, loss: 0.06712707132101059 2023-01-22 14:56:16.054241: step: 582/464, loss: 0.02944483980536461 2023-01-22 14:56:16.826391: step: 584/464, loss: 0.190678671002388 2023-01-22 14:56:17.501424: step: 586/464, loss: 0.013495603576302528 2023-01-22 14:56:18.235004: step: 588/464, loss: 0.07886006683111191 2023-01-22 14:56:18.988826: step: 590/464, loss: 0.2840629518032074 2023-01-22 14:56:19.708469: step: 592/464, loss: 0.004939667880535126 2023-01-22 14:56:20.425218: step: 594/464, loss: 0.0687856450676918 2023-01-22 14:56:21.122574: step: 596/464, loss: 0.003516688011586666 2023-01-22 14:56:21.815438: step: 598/464, loss: 0.05601877719163895 2023-01-22 14:56:22.505294: step: 600/464, loss: 0.1229490116238594 2023-01-22 14:56:23.347557: step: 602/464, loss: 0.034515511244535446 2023-01-22 
14:56:24.077269: step: 604/464, loss: 0.03390450030565262 2023-01-22 14:56:24.973395: step: 606/464, loss: 0.017845941707491875 2023-01-22 14:56:25.644849: step: 608/464, loss: 0.04194774106144905 2023-01-22 14:56:26.374073: step: 610/464, loss: 0.06085171550512314 2023-01-22 14:56:27.086743: step: 612/464, loss: 0.029198765754699707 2023-01-22 14:56:27.869821: step: 614/464, loss: 0.04369807988405228 2023-01-22 14:56:28.609751: step: 616/464, loss: 0.04185190424323082 2023-01-22 14:56:29.304704: step: 618/464, loss: 0.060502730309963226 2023-01-22 14:56:30.026572: step: 620/464, loss: 0.012164751067757607 2023-01-22 14:56:30.824168: step: 622/464, loss: 0.034595951437950134 2023-01-22 14:56:31.658966: step: 624/464, loss: 0.01952231489121914 2023-01-22 14:56:32.350071: step: 626/464, loss: 0.013642151840031147 2023-01-22 14:56:33.058141: step: 628/464, loss: 0.046000026166439056 2023-01-22 14:56:33.802296: step: 630/464, loss: 0.03781093284487724 2023-01-22 14:56:34.568119: step: 632/464, loss: 0.01693800650537014 2023-01-22 14:56:35.328606: step: 634/464, loss: 0.01331486739218235 2023-01-22 14:56:36.059143: step: 636/464, loss: 0.04176183044910431 2023-01-22 14:56:36.808111: step: 638/464, loss: 0.09693517535924911 2023-01-22 14:56:37.623538: step: 640/464, loss: 0.027495836839079857 2023-01-22 14:56:38.412945: step: 642/464, loss: 0.09919502586126328 2023-01-22 14:56:39.109366: step: 644/464, loss: 0.012383551336824894 2023-01-22 14:56:39.804389: step: 646/464, loss: 0.0037764981389045715 2023-01-22 14:56:40.549246: step: 648/464, loss: 0.03265717625617981 2023-01-22 14:56:41.333269: step: 650/464, loss: 0.06838352233171463 2023-01-22 14:56:42.165412: step: 652/464, loss: 0.034150850027799606 2023-01-22 14:56:42.879797: step: 654/464, loss: 0.023890497162938118 2023-01-22 14:56:43.599404: step: 656/464, loss: 0.3121313154697418 2023-01-22 14:56:44.375902: step: 658/464, loss: 0.0807483047246933 2023-01-22 14:56:45.073233: step: 660/464, loss: 1.5089443922042847 
2023-01-22 14:56:45.853305: step: 662/464, loss: 0.023560039699077606 2023-01-22 14:56:46.588826: step: 664/464, loss: 0.12527887523174286 2023-01-22 14:56:47.298326: step: 666/464, loss: 0.009700451046228409 2023-01-22 14:56:47.988714: step: 668/464, loss: 0.016948828473687172 2023-01-22 14:56:48.743642: step: 670/464, loss: 0.08660709857940674 2023-01-22 14:56:49.505440: step: 672/464, loss: 0.016221001744270325 2023-01-22 14:56:50.280936: step: 674/464, loss: 0.10388290137052536 2023-01-22 14:56:50.986424: step: 676/464, loss: 0.037814389914274216 2023-01-22 14:56:51.673715: step: 678/464, loss: 0.014574986882507801 2023-01-22 14:56:52.421817: step: 680/464, loss: 0.02477123774588108 2023-01-22 14:56:53.266087: step: 682/464, loss: 0.10053754597902298 2023-01-22 14:56:54.095714: step: 684/464, loss: 0.04018322750926018 2023-01-22 14:56:54.764835: step: 686/464, loss: 0.11917680501937866 2023-01-22 14:56:55.514477: step: 688/464, loss: 0.044764451682567596 2023-01-22 14:56:56.322547: step: 690/464, loss: 0.03544643893837929 2023-01-22 14:56:57.080408: step: 692/464, loss: 0.07452749460935593 2023-01-22 14:56:57.807716: step: 694/464, loss: 0.02271382510662079 2023-01-22 14:56:58.525103: step: 696/464, loss: 0.058031462132930756 2023-01-22 14:56:59.219041: step: 698/464, loss: 0.019309645518660545 2023-01-22 14:56:59.963714: step: 700/464, loss: 0.02176925353705883 2023-01-22 14:57:00.644302: step: 702/464, loss: 0.017063865438103676 2023-01-22 14:57:01.409231: step: 704/464, loss: 0.004486426245421171 2023-01-22 14:57:02.259522: step: 706/464, loss: 0.8419817686080933 2023-01-22 14:57:02.990393: step: 708/464, loss: 0.009446037001907825 2023-01-22 14:57:03.729593: step: 710/464, loss: 0.027423586696386337 2023-01-22 14:57:04.427294: step: 712/464, loss: 0.015616148710250854 2023-01-22 14:57:05.253470: step: 714/464, loss: 0.06453923135995865 2023-01-22 14:57:05.973086: step: 716/464, loss: 0.090809166431427 2023-01-22 14:57:06.737624: step: 718/464, loss: 
0.07786067575216293 2023-01-22 14:57:07.536250: step: 720/464, loss: 0.022998638451099396 2023-01-22 14:57:08.213862: step: 722/464, loss: 0.00016782456077635288 2023-01-22 14:57:08.937004: step: 724/464, loss: 0.06510668992996216 2023-01-22 14:57:09.770879: step: 726/464, loss: 0.03561001271009445 2023-01-22 14:57:10.487646: step: 728/464, loss: 0.09167484194040298 2023-01-22 14:57:11.248915: step: 730/464, loss: 0.08646490424871445 2023-01-22 14:57:11.970288: step: 732/464, loss: 0.027139829471707344 2023-01-22 14:57:12.671801: step: 734/464, loss: 0.10090488195419312 2023-01-22 14:57:13.467179: step: 736/464, loss: 0.0418844036757946 2023-01-22 14:57:14.153822: step: 738/464, loss: 0.04323262348771095 2023-01-22 14:57:14.954687: step: 740/464, loss: 0.011654680594801903 2023-01-22 14:57:15.659975: step: 742/464, loss: 0.036519430577754974 2023-01-22 14:57:16.335768: step: 744/464, loss: 3.667783260345459 2023-01-22 14:57:17.044171: step: 746/464, loss: 0.033932533115148544 2023-01-22 14:57:17.787872: step: 748/464, loss: 0.015561865642666817 2023-01-22 14:57:18.564018: step: 750/464, loss: 0.11233902722597122 2023-01-22 14:57:19.237743: step: 752/464, loss: 0.10170744359493256 2023-01-22 14:57:19.988793: step: 754/464, loss: 0.0084531893953681 2023-01-22 14:57:20.682756: step: 756/464, loss: 0.04653886705636978 2023-01-22 14:57:21.408698: step: 758/464, loss: 0.004033430479466915 2023-01-22 14:57:22.114572: step: 760/464, loss: 0.14108993113040924 2023-01-22 14:57:22.917486: step: 762/464, loss: 0.013614281080663204 2023-01-22 14:57:23.621867: step: 764/464, loss: 0.016145724803209305 2023-01-22 14:57:24.377731: step: 766/464, loss: 0.01627621054649353 2023-01-22 14:57:25.094524: step: 768/464, loss: 0.006332142744213343 2023-01-22 14:57:25.807260: step: 770/464, loss: 0.010430836118757725 2023-01-22 14:57:26.547568: step: 772/464, loss: 0.006132225971668959 2023-01-22 14:57:27.260057: step: 774/464, loss: 0.015794722363352776 2023-01-22 14:57:27.988811: step: 
776/464, loss: 0.015555287711322308 2023-01-22 14:57:28.615036: step: 778/464, loss: 0.006737166550010443 2023-01-22 14:57:29.453059: step: 780/464, loss: 0.03899889439344406 2023-01-22 14:57:30.184438: step: 782/464, loss: 0.061172544956207275 2023-01-22 14:57:30.937525: step: 784/464, loss: 0.03725861385464668 2023-01-22 14:57:31.618956: step: 786/464, loss: 0.03506654128432274 2023-01-22 14:57:32.296245: step: 788/464, loss: 0.008974473923444748 2023-01-22 14:57:33.026711: step: 790/464, loss: 0.03022157959640026 2023-01-22 14:57:33.750391: step: 792/464, loss: 0.06096022576093674 2023-01-22 14:57:34.485249: step: 794/464, loss: 0.014920524321496487 2023-01-22 14:57:35.167899: step: 796/464, loss: 0.0718575119972229 2023-01-22 14:57:35.860065: step: 798/464, loss: 0.018265772610902786 2023-01-22 14:57:36.600088: step: 800/464, loss: 0.006621572654694319 2023-01-22 14:57:37.420694: step: 802/464, loss: 0.008757734671235085 2023-01-22 14:57:38.109462: step: 804/464, loss: 0.009250654838979244 2023-01-22 14:57:38.867889: step: 806/464, loss: 0.03962739557027817 2023-01-22 14:57:39.593061: step: 808/464, loss: 0.007000217214226723 2023-01-22 14:57:40.303366: step: 810/464, loss: 0.020241227000951767 2023-01-22 14:57:41.066802: step: 812/464, loss: 0.013793728314340115 2023-01-22 14:57:41.749466: step: 814/464, loss: 0.0265958309173584 2023-01-22 14:57:42.537927: step: 816/464, loss: 0.020706037059426308 2023-01-22 14:57:43.305164: step: 818/464, loss: 0.0056082699447870255 2023-01-22 14:57:44.078243: step: 820/464, loss: 0.017334256321191788 2023-01-22 14:57:44.816142: step: 822/464, loss: 0.013314464129507542 2023-01-22 14:57:45.593548: step: 824/464, loss: 0.042475175112485886 2023-01-22 14:57:46.267536: step: 826/464, loss: 0.04164883866906166 2023-01-22 14:57:47.035936: step: 828/464, loss: 0.060853902250528336 2023-01-22 14:57:47.783937: step: 830/464, loss: 0.03174266591668129 2023-01-22 14:57:48.660268: step: 832/464, loss: 0.0642930120229721 2023-01-22 
14:57:49.368391: step: 834/464, loss: 0.05120242014527321 2023-01-22 14:57:50.091560: step: 836/464, loss: 0.02251666598021984 2023-01-22 14:57:50.738759: step: 838/464, loss: 0.01608583703637123 2023-01-22 14:57:51.465619: step: 840/464, loss: 0.003405903000384569 2023-01-22 14:57:52.260401: step: 842/464, loss: 0.10293716937303543 2023-01-22 14:57:53.007997: step: 844/464, loss: 0.005280649289488792 2023-01-22 14:57:53.809604: step: 846/464, loss: 0.040373336523771286 2023-01-22 14:57:54.513108: step: 848/464, loss: 0.0019112235167995095 2023-01-22 14:57:55.397561: step: 850/464, loss: 0.01604733057320118 2023-01-22 14:57:56.193710: step: 852/464, loss: 0.05956016480922699 2023-01-22 14:57:56.879136: step: 854/464, loss: 0.04864457622170448 2023-01-22 14:57:57.627610: step: 856/464, loss: 0.17723682522773743 2023-01-22 14:57:58.422610: step: 858/464, loss: 0.010293657891452312 2023-01-22 14:57:59.143182: step: 860/464, loss: 0.05993131175637245 2023-01-22 14:57:59.849909: step: 862/464, loss: 0.015174886211752892 2023-01-22 14:58:00.549490: step: 864/464, loss: 0.07661637663841248 2023-01-22 14:58:01.235249: step: 866/464, loss: 0.010754907503724098 2023-01-22 14:58:01.961016: step: 868/464, loss: 0.03574849292635918 2023-01-22 14:58:02.757752: step: 870/464, loss: 0.012999016791582108 2023-01-22 14:58:03.528271: step: 872/464, loss: 0.009559421800076962 2023-01-22 14:58:04.295583: step: 874/464, loss: 0.07725869119167328 2023-01-22 14:58:05.114361: step: 876/464, loss: 0.05234874412417412 2023-01-22 14:58:05.778457: step: 878/464, loss: 0.13046760857105255 2023-01-22 14:58:06.541761: step: 880/464, loss: 0.10775807499885559 2023-01-22 14:58:07.261838: step: 882/464, loss: 0.017713425680994987 2023-01-22 14:58:08.065712: step: 884/464, loss: 0.06358397006988525 2023-01-22 14:58:08.764208: step: 886/464, loss: 0.012726555578410625 2023-01-22 14:58:09.500446: step: 888/464, loss: 0.002486567012965679 2023-01-22 14:58:10.327874: step: 890/464, loss: 
0.0012825526064261794 2023-01-22 14:58:11.071490: step: 892/464, loss: 0.024141965433955193 2023-01-22 14:58:11.829709: step: 894/464, loss: 0.015300101600587368 2023-01-22 14:58:12.592481: step: 896/464, loss: 0.02721925638616085 2023-01-22 14:58:13.281641: step: 898/464, loss: 0.07990097254514694 2023-01-22 14:58:14.050327: step: 900/464, loss: 0.036514200270175934 2023-01-22 14:58:14.912115: step: 902/464, loss: 0.02834787406027317 2023-01-22 14:58:15.573118: step: 904/464, loss: 0.02153974585235119 2023-01-22 14:58:16.273101: step: 906/464, loss: 0.003995648585259914 2023-01-22 14:58:16.932751: step: 908/464, loss: 0.09679343551397324 2023-01-22 14:58:17.664109: step: 910/464, loss: 0.03607809543609619 2023-01-22 14:58:18.367338: step: 912/464, loss: 0.03484540060162544 2023-01-22 14:58:19.157523: step: 914/464, loss: 0.055608466267585754 2023-01-22 14:58:19.958685: step: 916/464, loss: 0.04974879324436188 2023-01-22 14:58:20.735412: step: 918/464, loss: 0.003665907308459282 2023-01-22 14:58:21.437792: step: 920/464, loss: 0.07038435339927673 2023-01-22 14:58:22.219522: step: 922/464, loss: 0.08866694569587708 2023-01-22 14:58:22.924700: step: 924/464, loss: 0.3629743158817291 2023-01-22 14:58:23.699930: step: 926/464, loss: 0.006601552478969097 2023-01-22 14:58:24.449304: step: 928/464, loss: 0.045003585517406464 2023-01-22 14:58:25.096145: step: 930/464, loss: 0.004139734897762537
==================================================
Loss: 0.067
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29856150793650793, 'r': 0.3569141366223909, 'f1': 0.32514044943820225}, 'combined': 0.23957717327025427, 'epoch': 25}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3007495867809209, 'r': 0.2939143688995363, 'f1': 0.29729269497884137}, 'combined': 0.18463441056580676, 'epoch': 25}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28317819148936174, 'r': 0.35356973434535105, 'f1': 0.3144831223628692}, 'combined': 0.2317244059515878, 'epoch': 25}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2893929128812115, 'r': 0.2865332991175632, 'f1': 0.2879560066603515}, 'combined': 0.1788358357153762, 'epoch': 25}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2947238299467022, 'r': 0.35344489663437534, 'f1': 0.3214244357658599}, 'combined': 0.23683905793273885, 'epoch': 25}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3096600668565005, 'r': 0.2968212486847601, 'f1': 0.3031047630218367}, 'combined': 0.1882440107188249, 'epoch': 25}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2925531914893617, 'r': 0.39285714285714285, 'f1': 0.3353658536585366}, 'combined': 0.22357723577235772, 'epoch': 25}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29375, 'r': 0.5108695652173914, 'f1': 0.3730158730158731}, 'combined': 0.18650793650793654, 'epoch': 25}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 25}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 26
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:01:05.449113: step: 2/464, loss: 0.03979220986366272 2023-01-22 15:01:06.079863: step: 4/464, loss: 0.0028875789139419794 2023-01-22 15:01:06.839140: step: 6/464, loss:
0.08068735897541046 2023-01-22 15:01:07.552141: step: 8/464, loss: 0.012192752212285995 2023-01-22 15:01:08.266419: step: 10/464, loss: 0.31130632758140564 2023-01-22 15:01:08.960134: step: 12/464, loss: 0.027175702154636383 2023-01-22 15:01:09.656685: step: 14/464, loss: 0.039750535041093826 2023-01-22 15:01:10.414484: step: 16/464, loss: 0.010211330838501453 2023-01-22 15:01:11.154495: step: 18/464, loss: 0.030984103679656982 2023-01-22 15:01:11.874742: step: 20/464, loss: 0.003631194122135639 2023-01-22 15:01:12.670970: step: 22/464, loss: 0.06238573417067528 2023-01-22 15:01:13.471891: step: 24/464, loss: 0.032683271914720535 2023-01-22 15:01:14.216219: step: 26/464, loss: 0.038223229348659515 2023-01-22 15:01:14.937393: step: 28/464, loss: 0.05012506991624832 2023-01-22 15:01:15.692813: step: 30/464, loss: 0.01869715377688408 2023-01-22 15:01:16.503719: step: 32/464, loss: 0.18817028403282166 2023-01-22 15:01:17.199322: step: 34/464, loss: 0.0033002072013914585 2023-01-22 15:01:17.969324: step: 36/464, loss: 0.0033029152546077967 2023-01-22 15:01:18.787633: step: 38/464, loss: 0.016620703041553497 2023-01-22 15:01:19.528991: step: 40/464, loss: 0.06675605475902557 2023-01-22 15:01:20.337654: step: 42/464, loss: 0.01723124273121357 2023-01-22 15:01:21.071203: step: 44/464, loss: 0.009953220374882221 2023-01-22 15:01:21.806115: step: 46/464, loss: 0.00018864394223783165 2023-01-22 15:01:22.587000: step: 48/464, loss: 0.003479225095361471 2023-01-22 15:01:23.339964: step: 50/464, loss: 0.016302935779094696 2023-01-22 15:01:24.042621: step: 52/464, loss: 0.02298845537006855 2023-01-22 15:01:24.741199: step: 54/464, loss: 0.013436194509267807 2023-01-22 15:01:25.465068: step: 56/464, loss: 0.01852521114051342 2023-01-22 15:01:26.120174: step: 58/464, loss: 0.03060929849743843 2023-01-22 15:01:26.842894: step: 60/464, loss: 0.02657526358962059 2023-01-22 15:01:27.601604: step: 62/464, loss: 0.057143684476614 2023-01-22 15:01:28.316710: step: 64/464, loss: 
0.013616350479424 2023-01-22 15:01:29.070312: step: 66/464, loss: 0.004383295774459839 2023-01-22 15:01:29.790973: step: 68/464, loss: 0.01339748129248619 2023-01-22 15:01:30.579757: step: 70/464, loss: 0.014214854687452316 2023-01-22 15:01:31.194814: step: 72/464, loss: 0.001885912730358541 2023-01-22 15:01:31.881960: step: 74/464, loss: 0.05567344278097153 2023-01-22 15:01:32.566184: step: 76/464, loss: 0.1963343620300293 2023-01-22 15:01:33.335260: step: 78/464, loss: 0.19187796115875244 2023-01-22 15:01:34.036859: step: 80/464, loss: 0.10916055738925934 2023-01-22 15:01:34.879821: step: 82/464, loss: 0.03477245569229126 2023-01-22 15:01:35.602745: step: 84/464, loss: 0.024640601128339767 2023-01-22 15:01:36.291509: step: 86/464, loss: 0.01551698986440897 2023-01-22 15:01:37.017696: step: 88/464, loss: 0.02397140860557556 2023-01-22 15:01:37.747709: step: 90/464, loss: 0.07582138478755951 2023-01-22 15:01:38.411094: step: 92/464, loss: 0.03233405947685242 2023-01-22 15:01:39.164297: step: 94/464, loss: 0.006312546785920858 2023-01-22 15:01:39.866498: step: 96/464, loss: 0.05362057313323021 2023-01-22 15:01:40.662851: step: 98/464, loss: 0.028060076758265495 2023-01-22 15:01:41.410253: step: 100/464, loss: 0.008015839383006096 2023-01-22 15:01:42.197433: step: 102/464, loss: 0.011111016385257244 2023-01-22 15:01:42.929735: step: 104/464, loss: 0.012221268378198147 2023-01-22 15:01:43.710587: step: 106/464, loss: 0.04969086870551109 2023-01-22 15:01:44.414409: step: 108/464, loss: 0.019485192373394966 2023-01-22 15:01:45.186563: step: 110/464, loss: 0.05747364088892937 2023-01-22 15:01:45.949567: step: 112/464, loss: 0.010442215949296951 2023-01-22 15:01:46.648995: step: 114/464, loss: 0.01553972065448761 2023-01-22 15:01:47.399959: step: 116/464, loss: 0.08189647644758224 2023-01-22 15:01:48.112726: step: 118/464, loss: 0.04713998734951019 2023-01-22 15:01:48.776663: step: 120/464, loss: 0.015384606085717678 2023-01-22 15:01:49.533311: step: 122/464, loss: 
0.06837371736764908 2023-01-22 15:01:50.230073: step: 124/464, loss: 0.0416836179792881 2023-01-22 15:01:50.973758: step: 126/464, loss: 0.010772022418677807 2023-01-22 15:01:51.778653: step: 128/464, loss: 0.006497004069387913 2023-01-22 15:01:52.521826: step: 130/464, loss: 0.005042895209044218 2023-01-22 15:01:53.291476: step: 132/464, loss: 0.009416569024324417 2023-01-22 15:01:54.083142: step: 134/464, loss: 0.03312671184539795 2023-01-22 15:01:54.823921: step: 136/464, loss: 0.023155786097049713 2023-01-22 15:01:55.560750: step: 138/464, loss: 0.08792765438556671 2023-01-22 15:01:56.282049: step: 140/464, loss: 0.01703648269176483 2023-01-22 15:01:57.080794: step: 142/464, loss: 0.006686758249998093 2023-01-22 15:01:57.814351: step: 144/464, loss: 0.00440249452367425 2023-01-22 15:01:58.463042: step: 146/464, loss: 0.002988354070112109 2023-01-22 15:01:59.202890: step: 148/464, loss: 0.00513819744810462 2023-01-22 15:01:59.966766: step: 150/464, loss: 0.025694118812680244 2023-01-22 15:02:00.698159: step: 152/464, loss: 0.021511996164917946 2023-01-22 15:02:01.423075: step: 154/464, loss: 0.0037316379602998495 2023-01-22 15:02:02.119019: step: 156/464, loss: 0.01730765588581562 2023-01-22 15:02:02.796884: step: 158/464, loss: 0.03143288195133209 2023-01-22 15:02:03.514282: step: 160/464, loss: 0.006385265849530697 2023-01-22 15:02:04.224771: step: 162/464, loss: 0.004676298703998327 2023-01-22 15:02:04.905134: step: 164/464, loss: 0.03778495639562607 2023-01-22 15:02:05.644973: step: 166/464, loss: 0.041014038026332855 2023-01-22 15:02:06.335535: step: 168/464, loss: 0.006660899613052607 2023-01-22 15:02:07.046860: step: 170/464, loss: 0.009321154095232487 2023-01-22 15:02:07.771022: step: 172/464, loss: 0.048160020262002945 2023-01-22 15:02:08.491929: step: 174/464, loss: 0.032669879496097565 2023-01-22 15:02:09.252613: step: 176/464, loss: 0.0007618823437951505 2023-01-22 15:02:09.955711: step: 178/464, loss: 0.018640637397766113 2023-01-22 15:02:10.680293: 
step: 180/464, loss: 0.03052210435271263 2023-01-22 15:02:11.405068: step: 182/464, loss: 0.03019343689084053 2023-01-22 15:02:12.181362: step: 184/464, loss: 0.09650694578886032 2023-01-22 15:02:12.875288: step: 186/464, loss: 0.022672880440950394 2023-01-22 15:02:13.582438: step: 188/464, loss: 0.006853120867162943 2023-01-22 15:02:14.303281: step: 190/464, loss: 0.11154012382030487 2023-01-22 15:02:15.137216: step: 192/464, loss: 0.02759392186999321 2023-01-22 15:02:15.828738: step: 194/464, loss: 0.010374793782830238 2023-01-22 15:02:16.484739: step: 196/464, loss: 0.0036212902050465345 2023-01-22 15:02:17.244580: step: 198/464, loss: 0.4379126727581024 2023-01-22 15:02:18.003222: step: 200/464, loss: 0.006713510025292635 2023-01-22 15:02:18.760556: step: 202/464, loss: 0.037869472056627274 2023-01-22 15:02:19.494176: step: 204/464, loss: 0.1598629355430603 2023-01-22 15:02:20.197300: step: 206/464, loss: 0.05931587144732475 2023-01-22 15:02:20.973602: step: 208/464, loss: 0.01667376235127449 2023-01-22 15:02:21.697191: step: 210/464, loss: 0.020716484636068344 2023-01-22 15:02:22.430876: step: 212/464, loss: 0.06653925031423569 2023-01-22 15:02:23.186409: step: 214/464, loss: 0.15032264590263367 2023-01-22 15:02:23.928662: step: 216/464, loss: 0.01681193895637989 2023-01-22 15:02:24.683478: step: 218/464, loss: 0.08305919915437698 2023-01-22 15:02:25.384233: step: 220/464, loss: 0.01702980510890484 2023-01-22 15:02:26.108980: step: 222/464, loss: 0.0005055239307694137 2023-01-22 15:02:26.844870: step: 224/464, loss: 0.004098755773156881 2023-01-22 15:02:27.578944: step: 226/464, loss: 0.028517749160528183 2023-01-22 15:02:28.358506: step: 228/464, loss: 0.02958236075937748 2023-01-22 15:02:29.102037: step: 230/464, loss: 0.010918030515313148 2023-01-22 15:02:29.847350: step: 232/464, loss: 0.01749340072274208 2023-01-22 15:02:30.613483: step: 234/464, loss: 0.08216807246208191 2023-01-22 15:02:31.313345: step: 236/464, loss: 0.13501933217048645 2023-01-22 
15:02:32.110757: step: 238/464, loss: 0.28800854086875916 2023-01-22 15:02:32.767893: step: 240/464, loss: 0.035141170024871826 2023-01-22 15:02:33.467698: step: 242/464, loss: 0.03113182634115219 2023-01-22 15:02:34.163333: step: 244/464, loss: 0.0638260543346405 2023-01-22 15:02:35.026114: step: 246/464, loss: 0.02870015986263752 2023-01-22 15:02:35.782285: step: 248/464, loss: 0.04577351361513138 2023-01-22 15:02:36.530287: step: 250/464, loss: 0.06411606073379517 2023-01-22 15:02:37.301506: step: 252/464, loss: 0.263606995344162 2023-01-22 15:02:38.122844: step: 254/464, loss: 0.02350165694952011 2023-01-22 15:02:38.916396: step: 256/464, loss: 0.1228107437491417 2023-01-22 15:02:39.692880: step: 258/464, loss: 0.008794655092060566 2023-01-22 15:02:40.464809: step: 260/464, loss: 0.16834703087806702 2023-01-22 15:02:41.304374: step: 262/464, loss: 0.003793665673583746 2023-01-22 15:02:42.084263: step: 264/464, loss: 0.009521029889583588 2023-01-22 15:02:42.788840: step: 266/464, loss: 0.007902123034000397 2023-01-22 15:02:43.474897: step: 268/464, loss: 0.016075551509857178 2023-01-22 15:02:44.221644: step: 270/464, loss: 0.004283635877072811 2023-01-22 15:02:44.972208: step: 272/464, loss: 0.03238300979137421 2023-01-22 15:02:45.699665: step: 274/464, loss: 0.023233672603964806 2023-01-22 15:02:46.408751: step: 276/464, loss: 0.011083599179983139 2023-01-22 15:02:47.196297: step: 278/464, loss: 0.009170886129140854 2023-01-22 15:02:47.904256: step: 280/464, loss: 0.19675010442733765 2023-01-22 15:02:48.632890: step: 282/464, loss: 0.04092198982834816 2023-01-22 15:02:49.351553: step: 284/464, loss: 0.08173315227031708 2023-01-22 15:02:50.074685: step: 286/464, loss: 0.022941935807466507 2023-01-22 15:02:50.872132: step: 288/464, loss: 0.027069993317127228 2023-01-22 15:02:51.637966: step: 290/464, loss: 0.054724760353565216 2023-01-22 15:02:52.363800: step: 292/464, loss: 0.02735094539821148 2023-01-22 15:02:53.048507: step: 294/464, loss: 0.04187670722603798 
2023-01-22 15:02:53.779413: step: 296/464, loss: 0.02951827645301819 2023-01-22 15:02:54.527543: step: 298/464, loss: 0.009366628713905811 2023-01-22 15:02:55.231432: step: 300/464, loss: 0.006680920720100403 2023-01-22 15:02:55.897968: step: 302/464, loss: 0.000542443071026355 2023-01-22 15:02:56.655983: step: 304/464, loss: 0.06525389105081558 2023-01-22 15:02:57.393266: step: 306/464, loss: 0.041585177183151245 2023-01-22 15:02:58.059047: step: 308/464, loss: 0.0009010765352286398 2023-01-22 15:02:58.757156: step: 310/464, loss: 0.043093081563711166 2023-01-22 15:02:59.582558: step: 312/464, loss: 0.08462139219045639 2023-01-22 15:03:00.313303: step: 314/464, loss: 0.029180768877267838 2023-01-22 15:03:01.074938: step: 316/464, loss: 0.2786836624145508 2023-01-22 15:03:01.822337: step: 318/464, loss: 0.007162583060562611 2023-01-22 15:03:02.483519: step: 320/464, loss: 0.0022578127682209015 2023-01-22 15:03:03.271475: step: 322/464, loss: 0.029035838320851326 2023-01-22 15:03:04.049595: step: 324/464, loss: 0.0048900507390499115 2023-01-22 15:03:04.859019: step: 326/464, loss: 0.019386572763323784 2023-01-22 15:03:05.602955: step: 328/464, loss: 0.07106909155845642 2023-01-22 15:03:06.375633: step: 330/464, loss: 0.010740678757429123 2023-01-22 15:03:07.174983: step: 332/464, loss: 0.03408215939998627 2023-01-22 15:03:07.872193: step: 334/464, loss: 0.10296069830656052 2023-01-22 15:03:08.531047: step: 336/464, loss: 0.02283811941742897 2023-01-22 15:03:09.258385: step: 338/464, loss: 0.00903315469622612 2023-01-22 15:03:10.046341: step: 340/464, loss: 0.1214098259806633 2023-01-22 15:03:10.713490: step: 342/464, loss: 0.0025513179134577513 2023-01-22 15:03:11.460278: step: 344/464, loss: 0.006067641545087099 2023-01-22 15:03:12.238737: step: 346/464, loss: 0.023233452811837196 2023-01-22 15:03:12.986358: step: 348/464, loss: 0.04339459165930748 2023-01-22 15:03:13.674795: step: 350/464, loss: 0.012552225030958652 2023-01-22 15:03:14.439717: step: 352/464, loss: 
0.0010526591213420033 2023-01-22 15:03:15.227059: step: 354/464, loss: 0.005083049647510052 2023-01-22 15:03:15.854321: step: 356/464, loss: 0.012266969308257103 2023-01-22 15:03:16.542343: step: 358/464, loss: 6.913995265960693 2023-01-22 15:03:17.458037: step: 360/464, loss: 0.02275826968252659 2023-01-22 15:03:18.148247: step: 362/464, loss: 0.04937801882624626 2023-01-22 15:03:18.980303: step: 364/464, loss: 0.04725099727511406 2023-01-22 15:03:19.854987: step: 366/464, loss: 0.04512450471520424 2023-01-22 15:03:20.666791: step: 368/464, loss: 0.07886477559804916 2023-01-22 15:03:21.362453: step: 370/464, loss: 0.017831819131970406 2023-01-22 15:03:22.057181: step: 372/464, loss: 0.006999008823186159 2023-01-22 15:03:22.711674: step: 374/464, loss: 0.1305447369813919 2023-01-22 15:03:23.497480: step: 376/464, loss: 0.005541624501347542 2023-01-22 15:03:24.213454: step: 378/464, loss: 0.02828267030417919 2023-01-22 15:03:24.997883: step: 380/464, loss: 0.007844929583370686 2023-01-22 15:03:25.774653: step: 382/464, loss: 0.014002975076436996 2023-01-22 15:03:26.499558: step: 384/464, loss: 0.018246008083224297 2023-01-22 15:03:27.231942: step: 386/464, loss: 0.044929563999176025 2023-01-22 15:03:27.882554: step: 388/464, loss: 0.011728995479643345 2023-01-22 15:03:28.511270: step: 390/464, loss: 0.08749885112047195 2023-01-22 15:03:29.241925: step: 392/464, loss: 0.09420124441385269 2023-01-22 15:03:29.951752: step: 394/464, loss: 0.0393458716571331 2023-01-22 15:03:30.744768: step: 396/464, loss: 0.4072987139225006 2023-01-22 15:03:31.545576: step: 398/464, loss: 0.003674858482554555 2023-01-22 15:03:32.293949: step: 400/464, loss: 0.01733958162367344 2023-01-22 15:03:33.004972: step: 402/464, loss: 0.010459842160344124 2023-01-22 15:03:33.756924: step: 404/464, loss: 0.09545612335205078 2023-01-22 15:03:34.455984: step: 406/464, loss: 0.07118836045265198 2023-01-22 15:03:35.207926: step: 408/464, loss: 0.08896943926811218 2023-01-22 15:03:35.916361: step: 
410/464, loss: 0.006611619610339403 2023-01-22 15:03:36.624833: step: 412/464, loss: 0.004229702055454254 2023-01-22 15:03:37.431973: step: 414/464, loss: 0.07376924157142639 2023-01-22 15:03:38.154912: step: 416/464, loss: 0.05804646015167236 2023-01-22 15:03:38.912322: step: 418/464, loss: 0.06894968450069427 2023-01-22 15:03:39.790167: step: 420/464, loss: 0.019484853371977806 2023-01-22 15:03:40.569226: step: 422/464, loss: 0.00924522802233696 2023-01-22 15:03:41.397246: step: 424/464, loss: 0.030628271400928497 2023-01-22 15:03:42.161940: step: 426/464, loss: 0.026602789759635925 2023-01-22 15:03:42.878018: step: 428/464, loss: 0.0014539804542437196 2023-01-22 15:03:43.601092: step: 430/464, loss: 0.013244767673313618 2023-01-22 15:03:44.325154: step: 432/464, loss: 0.01868750900030136 2023-01-22 15:03:44.994419: step: 434/464, loss: 0.01659456081688404 2023-01-22 15:03:45.857335: step: 436/464, loss: 0.13781066238880157 2023-01-22 15:03:46.635345: step: 438/464, loss: 0.0003242420207243413 2023-01-22 15:03:47.364283: step: 440/464, loss: 0.059889595955610275 2023-01-22 15:03:48.084889: step: 442/464, loss: 0.09946936368942261 2023-01-22 15:03:48.827280: step: 444/464, loss: 0.01946125365793705 2023-01-22 15:03:49.614404: step: 446/464, loss: 0.02001163363456726 2023-01-22 15:03:50.329855: step: 448/464, loss: 0.03295079246163368 2023-01-22 15:03:51.103739: step: 450/464, loss: 0.009511049836874008 2023-01-22 15:03:51.878405: step: 452/464, loss: 0.020733296871185303 2023-01-22 15:03:52.573144: step: 454/464, loss: 0.05650974065065384 2023-01-22 15:03:53.294828: step: 456/464, loss: 0.00144986214581877 2023-01-22 15:03:54.017169: step: 458/464, loss: 0.013950464315712452 2023-01-22 15:03:54.722049: step: 460/464, loss: 0.00036745882243849337 2023-01-22 15:03:55.497549: step: 462/464, loss: 0.046166226267814636 2023-01-22 15:03:56.436886: step: 464/464, loss: 0.004961703438311815 2023-01-22 15:03:57.221713: step: 466/464, loss: 0.027376022189855576 2023-01-22 
15:03:57.999617: step: 468/464, loss: 0.004060413688421249 2023-01-22 15:03:58.711802: step: 470/464, loss: 0.029101349413394928 2023-01-22 15:03:59.464369: step: 472/464, loss: 0.01377090159803629 2023-01-22 15:04:00.172315: step: 474/464, loss: 0.017810342833399773 2023-01-22 15:04:00.910445: step: 476/464, loss: 0.013420073315501213 2023-01-22 15:04:01.747504: step: 478/464, loss: 0.03463403135538101 2023-01-22 15:04:02.475483: step: 480/464, loss: 0.012036774307489395 2023-01-22 15:04:03.151522: step: 482/464, loss: 0.05726030468940735 2023-01-22 15:04:03.936845: step: 484/464, loss: 0.02741643413901329 2023-01-22 15:04:04.696236: step: 486/464, loss: 0.016417313367128372 2023-01-22 15:04:05.455314: step: 488/464, loss: 0.22834038734436035 2023-01-22 15:04:06.168663: step: 490/464, loss: 0.028050106018781662 2023-01-22 15:04:06.928068: step: 492/464, loss: 0.12751196324825287 2023-01-22 15:04:07.602052: step: 494/464, loss: 0.0044830976985394955 2023-01-22 15:04:08.316093: step: 496/464, loss: 0.0003841651196125895 2023-01-22 15:04:09.054737: step: 498/464, loss: 0.00019563184469006956 2023-01-22 15:04:09.837209: step: 500/464, loss: 0.022695457562804222 2023-01-22 15:04:10.551991: step: 502/464, loss: 0.02632814832031727 2023-01-22 15:04:11.219177: step: 504/464, loss: 0.03141283243894577 2023-01-22 15:04:11.934205: step: 506/464, loss: 0.021201007068157196 2023-01-22 15:04:12.713275: step: 508/464, loss: 0.044027432799339294 2023-01-22 15:04:13.407082: step: 510/464, loss: 0.0025048651732504368 2023-01-22 15:04:14.121386: step: 512/464, loss: 0.004569270648062229 2023-01-22 15:04:14.831287: step: 514/464, loss: 0.0010739141143858433 2023-01-22 15:04:15.621356: step: 516/464, loss: 0.017874712124466896 2023-01-22 15:04:16.400584: step: 518/464, loss: 0.019535720348358154 2023-01-22 15:04:17.278673: step: 520/464, loss: 0.05086197704076767 2023-01-22 15:04:17.965816: step: 522/464, loss: 0.044072188436985016 2023-01-22 15:04:18.706918: step: 524/464, loss: 
0.038728076964616776 2023-01-22 15:04:19.511126: step: 526/464, loss: 0.008920364081859589 2023-01-22 15:04:20.209637: step: 528/464, loss: 0.042976170778274536 2023-01-22 15:04:20.903922: step: 530/464, loss: 0.11609112471342087 2023-01-22 15:04:21.623158: step: 532/464, loss: 0.07067747414112091 2023-01-22 15:04:22.464366: step: 534/464, loss: 0.019455162808299065 2023-01-22 15:04:23.224536: step: 536/464, loss: 0.08025778830051422 2023-01-22 15:04:23.933824: step: 538/464, loss: 0.037985384464263916 2023-01-22 15:04:24.635343: step: 540/464, loss: 0.008334038779139519 2023-01-22 15:04:25.382169: step: 542/464, loss: 0.051863398402929306 2023-01-22 15:04:26.113815: step: 544/464, loss: 0.016841573640704155 2023-01-22 15:04:26.843477: step: 546/464, loss: 0.035958219319581985 2023-01-22 15:04:27.528130: step: 548/464, loss: 0.006521169561892748 2023-01-22 15:04:28.265185: step: 550/464, loss: 0.003640382084995508 2023-01-22 15:04:29.017609: step: 552/464, loss: 0.004826993215829134 2023-01-22 15:04:29.797997: step: 554/464, loss: 0.14086350798606873 2023-01-22 15:04:30.540894: step: 556/464, loss: 0.009819505736231804 2023-01-22 15:04:31.295314: step: 558/464, loss: 0.06159777566790581 2023-01-22 15:04:32.085409: step: 560/464, loss: 0.017736200243234634 2023-01-22 15:04:32.759340: step: 562/464, loss: 0.04474842548370361 2023-01-22 15:04:33.415755: step: 564/464, loss: 0.028115447610616684 2023-01-22 15:04:34.168649: step: 566/464, loss: 0.01623515598475933 2023-01-22 15:04:34.944760: step: 568/464, loss: 0.02251431718468666 2023-01-22 15:04:35.647619: step: 570/464, loss: 0.0011539249680936337 2023-01-22 15:04:36.364002: step: 572/464, loss: 0.0839838832616806 2023-01-22 15:04:37.045413: step: 574/464, loss: 0.006072110962122679 2023-01-22 15:04:37.798960: step: 576/464, loss: 0.0036800343077629805 2023-01-22 15:04:38.585316: step: 578/464, loss: 0.00892962608486414 2023-01-22 15:04:39.262914: step: 580/464, loss: 0.013826594687998295 2023-01-22 15:04:40.008538: 
step: 582/464, loss: 0.07251441478729248 2023-01-22 15:04:40.701330: step: 584/464, loss: 0.006227490957826376 2023-01-22 15:04:41.400918: step: 586/464, loss: 0.013700391165912151 2023-01-22 15:04:42.215659: step: 588/464, loss: 0.06363295018672943 2023-01-22 15:04:42.936175: step: 590/464, loss: 0.021341076120734215 2023-01-22 15:04:43.738737: step: 592/464, loss: 0.011852027848362923 2023-01-22 15:04:44.531029: step: 594/464, loss: 0.02045590803027153 2023-01-22 15:04:45.245133: step: 596/464, loss: 0.03566189855337143 2023-01-22 15:04:45.998273: step: 598/464, loss: 0.0173689853399992 2023-01-22 15:04:46.755219: step: 600/464, loss: 0.357320100069046 2023-01-22 15:04:47.490670: step: 602/464, loss: 0.11469513177871704 2023-01-22 15:04:48.182290: step: 604/464, loss: 0.007034547161310911 2023-01-22 15:04:48.883106: step: 606/464, loss: 0.01139721181243658 2023-01-22 15:04:49.691017: step: 608/464, loss: 0.032890621572732925 2023-01-22 15:04:50.389483: step: 610/464, loss: 0.01588939130306244 2023-01-22 15:04:51.155685: step: 612/464, loss: 0.11238836497068405 2023-01-22 15:04:51.889552: step: 614/464, loss: 0.24029462039470673 2023-01-22 15:04:52.678179: step: 616/464, loss: 0.03379188850522041 2023-01-22 15:04:53.386765: step: 618/464, loss: 0.09208420664072037 2023-01-22 15:04:54.135450: step: 620/464, loss: 0.6428366303443909 2023-01-22 15:04:54.870313: step: 622/464, loss: 0.13242271542549133 2023-01-22 15:04:55.613603: step: 624/464, loss: 0.08914587646722794 2023-01-22 15:04:56.383187: step: 626/464, loss: 0.01581208035349846 2023-01-22 15:04:57.143269: step: 628/464, loss: 0.2521286904811859 2023-01-22 15:04:57.915428: step: 630/464, loss: 0.1282821148633957 2023-01-22 15:04:58.643050: step: 632/464, loss: 0.07224276661872864 2023-01-22 15:04:59.465107: step: 634/464, loss: 0.028242819011211395 2023-01-22 15:05:00.302325: step: 636/464, loss: 0.008096271194517612 2023-01-22 15:05:01.033836: step: 638/464, loss: 0.04342019557952881 2023-01-22 
15:05:01.779316: step: 640/464, loss: 0.012415018863976002 2023-01-22 15:05:02.585658: step: 642/464, loss: 0.037182290107011795 2023-01-22 15:05:03.260616: step: 644/464, loss: 0.21031810343265533 2023-01-22 15:05:04.079048: step: 646/464, loss: 0.007701578550040722 2023-01-22 15:05:04.749847: step: 648/464, loss: 0.006078493315726519 2023-01-22 15:05:05.512422: step: 650/464, loss: 0.016446499153971672 2023-01-22 15:05:06.331208: step: 652/464, loss: 0.053291235119104385 2023-01-22 15:05:07.162120: step: 654/464, loss: 0.05071832984685898 2023-01-22 15:05:07.916128: step: 656/464, loss: 0.06600752472877502 2023-01-22 15:05:08.794095: step: 658/464, loss: 0.12854290008544922 2023-01-22 15:05:09.586249: step: 660/464, loss: 0.016413327306509018 2023-01-22 15:05:10.453113: step: 662/464, loss: 0.013318453915417194 2023-01-22 15:05:11.229637: step: 664/464, loss: 0.021495165303349495 2023-01-22 15:05:11.926565: step: 666/464, loss: 0.004696085583418608 2023-01-22 15:05:12.623003: step: 668/464, loss: 0.015222067944705486 2023-01-22 15:05:13.360280: step: 670/464, loss: 0.0056726341135799885 2023-01-22 15:05:14.270672: step: 672/464, loss: 0.06520474702119827 2023-01-22 15:05:15.015004: step: 674/464, loss: 0.025041626766324043 2023-01-22 15:05:15.726065: step: 676/464, loss: 0.01942654699087143 2023-01-22 15:05:16.447167: step: 678/464, loss: 0.009234144352376461 2023-01-22 15:05:17.175289: step: 680/464, loss: 0.024104246869683266 2023-01-22 15:05:17.932433: step: 682/464, loss: 0.0200289748609066 2023-01-22 15:05:18.681329: step: 684/464, loss: 0.043629322201013565 2023-01-22 15:05:19.433647: step: 686/464, loss: 0.05797417834401131 2023-01-22 15:05:20.201506: step: 688/464, loss: 0.08065325766801834 2023-01-22 15:05:20.964360: step: 690/464, loss: 0.03184974193572998 2023-01-22 15:05:21.687255: step: 692/464, loss: 0.015992360189557076 2023-01-22 15:05:22.373719: step: 694/464, loss: 0.010463027283549309 2023-01-22 15:05:23.175630: step: 696/464, loss: 
0.022422725334763527 2023-01-22 15:05:23.952183: step: 698/464, loss: 0.03707791492342949 2023-01-22 15:05:24.664024: step: 700/464, loss: 0.09422971308231354 2023-01-22 15:05:25.375325: step: 702/464, loss: 0.056585729122161865 2023-01-22 15:05:26.149730: step: 704/464, loss: 0.05239574611186981 2023-01-22 15:05:26.853989: step: 706/464, loss: 0.1525680422782898 2023-01-22 15:05:27.588993: step: 708/464, loss: 0.06405721604824066 2023-01-22 15:05:28.311201: step: 710/464, loss: 0.03996497765183449 2023-01-22 15:05:28.998172: step: 712/464, loss: 0.07094903290271759 2023-01-22 15:05:29.676192: step: 714/464, loss: 0.010407168418169022 2023-01-22 15:05:30.409875: step: 716/464, loss: 0.049233485013246536 2023-01-22 15:05:31.108518: step: 718/464, loss: 0.015937745571136475 2023-01-22 15:05:31.808195: step: 720/464, loss: 0.09470876306295395 2023-01-22 15:05:32.544730: step: 722/464, loss: 0.01487759593874216 2023-01-22 15:05:33.262505: step: 724/464, loss: 0.011564928106963634 2023-01-22 15:05:34.011707: step: 726/464, loss: 0.040957946330308914 2023-01-22 15:05:34.808702: step: 728/464, loss: 0.03968048840761185 2023-01-22 15:05:35.482929: step: 730/464, loss: 0.04555586352944374 2023-01-22 15:05:36.238257: step: 732/464, loss: 0.029980555176734924 2023-01-22 15:05:36.977104: step: 734/464, loss: 0.07169114798307419 2023-01-22 15:05:37.688613: step: 736/464, loss: 0.006400719750672579 2023-01-22 15:05:38.487299: step: 738/464, loss: 0.0033728305716067553 2023-01-22 15:05:39.173283: step: 740/464, loss: 0.03094770386815071 2023-01-22 15:05:39.918287: step: 742/464, loss: 0.023393241688609123 2023-01-22 15:05:40.716972: step: 744/464, loss: 0.05773283913731575 2023-01-22 15:05:41.402833: step: 746/464, loss: 0.0021309382282197475 2023-01-22 15:05:42.119426: step: 748/464, loss: 0.05149630084633827 2023-01-22 15:05:42.886845: step: 750/464, loss: 0.04184247553348541 2023-01-22 15:05:43.601527: step: 752/464, loss: 0.007670742925256491 2023-01-22 15:05:44.391840: step: 
754/464, loss: 0.3596675992012024 2023-01-22 15:05:45.151200: step: 756/464, loss: 0.003946928307414055 2023-01-22 15:05:45.820620: step: 758/464, loss: 0.09387727081775665 2023-01-22 15:05:46.472059: step: 760/464, loss: 0.002123955637216568 2023-01-22 15:05:47.255321: step: 762/464, loss: 0.03694995492696762 2023-01-22 15:05:47.974655: step: 764/464, loss: 0.02468699961900711 2023-01-22 15:05:48.705485: step: 766/464, loss: 0.04791996255517006 2023-01-22 15:05:49.415411: step: 768/464, loss: 0.0017564924200996757 2023-01-22 15:05:50.176458: step: 770/464, loss: 0.004304036498069763 2023-01-22 15:05:50.964209: step: 772/464, loss: 0.013958727940917015 2023-01-22 15:05:51.719978: step: 774/464, loss: 0.007403239607810974 2023-01-22 15:05:52.451901: step: 776/464, loss: 0.012045308947563171 2023-01-22 15:05:53.225808: step: 778/464, loss: 0.11195321381092072 2023-01-22 15:05:54.005666: step: 780/464, loss: 0.006845542695373297 2023-01-22 15:05:54.732928: step: 782/464, loss: 0.1624700129032135 2023-01-22 15:05:55.501359: step: 784/464, loss: 0.042097195982933044 2023-01-22 15:05:56.262657: step: 786/464, loss: 0.11464142799377441 2023-01-22 15:05:56.977401: step: 788/464, loss: 1.7824957370758057 2023-01-22 15:05:57.648641: step: 790/464, loss: 0.034144118428230286 2023-01-22 15:05:58.329016: step: 792/464, loss: 0.003950945101678371 2023-01-22 15:05:59.056477: step: 794/464, loss: 0.009137485176324844 2023-01-22 15:05:59.788295: step: 796/464, loss: 0.017941009253263474 2023-01-22 15:06:00.511810: step: 798/464, loss: 0.0004914068267680705 2023-01-22 15:06:01.296623: step: 800/464, loss: 0.026943765580654144 2023-01-22 15:06:02.094703: step: 802/464, loss: 0.07514145970344543 2023-01-22 15:06:02.804544: step: 804/464, loss: 1.424988865852356 2023-01-22 15:06:03.674591: step: 806/464, loss: 0.005588890518993139 2023-01-22 15:06:04.466260: step: 808/464, loss: 0.082185298204422 2023-01-22 15:06:05.141959: step: 810/464, loss: 0.005688100587576628 2023-01-22 
15:06:05.935682: step: 812/464, loss: 0.024905728176236153 2023-01-22 15:06:06.661737: step: 814/464, loss: 0.0005005030543543398 2023-01-22 15:06:07.396983: step: 816/464, loss: 0.001384145813062787 2023-01-22 15:06:08.032545: step: 818/464, loss: 0.016171080991625786 2023-01-22 15:06:08.749718: step: 820/464, loss: 0.0553651861846447 2023-01-22 15:06:09.454117: step: 822/464, loss: 0.02234969288110733 2023-01-22 15:06:10.193392: step: 824/464, loss: 0.03491181135177612 2023-01-22 15:06:10.997989: step: 826/464, loss: 0.0041991109028458595 2023-01-22 15:06:11.803934: step: 828/464, loss: 0.0028750444762408733 2023-01-22 15:06:12.513311: step: 830/464, loss: 0.016252554953098297 2023-01-22 15:06:13.293088: step: 832/464, loss: 0.09442304819822311 2023-01-22 15:06:14.126452: step: 834/464, loss: 0.00023691660317126662 2023-01-22 15:06:14.895752: step: 836/464, loss: 0.09088034927845001 2023-01-22 15:06:15.647963: step: 838/464, loss: 0.031029315665364265 2023-01-22 15:06:16.324343: step: 840/464, loss: 0.06107908859848976 2023-01-22 15:06:17.076092: step: 842/464, loss: 0.10602616518735886 2023-01-22 15:06:17.859909: step: 844/464, loss: 0.05736605450510979 2023-01-22 15:06:18.631166: step: 846/464, loss: 0.01999666728079319 2023-01-22 15:06:19.376295: step: 848/464, loss: 0.007230349816381931 2023-01-22 15:06:20.068864: step: 850/464, loss: 0.03765219822525978 2023-01-22 15:06:20.735438: step: 852/464, loss: 0.043977733701467514 2023-01-22 15:06:21.482372: step: 854/464, loss: 0.031765200197696686 2023-01-22 15:06:22.248413: step: 856/464, loss: 0.020025676116347313 2023-01-22 15:06:22.914997: step: 858/464, loss: 0.001896336441859603 2023-01-22 15:06:23.597061: step: 860/464, loss: 0.009502287954092026 2023-01-22 15:06:24.354432: step: 862/464, loss: 0.02049115113914013 2023-01-22 15:06:25.137108: step: 864/464, loss: 0.04266717657446861 2023-01-22 15:06:25.798048: step: 866/464, loss: 0.00879158079624176 2023-01-22 15:06:26.468332: step: 868/464, loss: 
0.14739181101322174 2023-01-22 15:06:27.296883: step: 870/464, loss: 0.044605158269405365 2023-01-22 15:06:28.023022: step: 872/464, loss: 0.45157331228256226 2023-01-22 15:06:28.893747: step: 874/464, loss: 0.027765942737460136 2023-01-22 15:06:29.571684: step: 876/464, loss: 0.03756709396839142 2023-01-22 15:06:30.316711: step: 878/464, loss: 0.0074466816149652 2023-01-22 15:06:31.015679: step: 880/464, loss: 0.03392532840371132 2023-01-22 15:06:31.868526: step: 882/464, loss: 0.0901467427611351 2023-01-22 15:06:32.623222: step: 884/464, loss: 0.021974651142954826 2023-01-22 15:06:33.357213: step: 886/464, loss: 0.002986679784953594 2023-01-22 15:06:34.208194: step: 888/464, loss: 0.004938908386975527 2023-01-22 15:06:34.893297: step: 890/464, loss: 0.14711670577526093 2023-01-22 15:06:35.579630: step: 892/464, loss: 0.10275810211896896 2023-01-22 15:06:36.326657: step: 894/464, loss: 0.02016407996416092 2023-01-22 15:06:37.043404: step: 896/464, loss: 0.019757527858018875 2023-01-22 15:06:37.738359: step: 898/464, loss: 0.0035847180988639593 2023-01-22 15:06:38.523308: step: 900/464, loss: 0.05121622979640961 2023-01-22 15:06:39.219833: step: 902/464, loss: 0.02605106495320797 2023-01-22 15:06:39.959555: step: 904/464, loss: 0.007822553627192974 2023-01-22 15:06:40.677487: step: 906/464, loss: 0.017992887645959854 2023-01-22 15:06:41.403714: step: 908/464, loss: 0.2159995585680008 2023-01-22 15:06:42.160417: step: 910/464, loss: 0.0006608831463381648 2023-01-22 15:06:42.892207: step: 912/464, loss: 0.005477035418152809 2023-01-22 15:06:43.627914: step: 914/464, loss: 0.021487193182110786 2023-01-22 15:06:44.403042: step: 916/464, loss: 0.013565809465944767 2023-01-22 15:06:45.097043: step: 918/464, loss: 0.048289552330970764 2023-01-22 15:06:45.821213: step: 920/464, loss: 1.4374891519546509 2023-01-22 15:06:46.555775: step: 922/464, loss: 0.02214394509792328 2023-01-22 15:06:47.299973: step: 924/464, loss: 0.01092542801052332 2023-01-22 15:06:48.049764: step: 
926/464, loss: 0.038795508444309235
2023-01-22 15:06:48.778925: step: 928/464, loss: 0.0012786659644916654
2023-01-22 15:06:49.384485: step: 930/464, loss: 0.14401409029960632
==================================================
Loss: 0.070
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30790000089747127, 'r': 0.35697704088871907, 'f1': 0.3306272417370034}, 'combined': 0.2436200728588446, 'epoch': 26}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29751116682849027, 'r': 0.28780011984002324, 'f1': 0.2925750841209286}, 'combined': 0.18170452592773462, 'epoch': 26}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2844825505964025, 'r': 0.3411631346810747, 'f1': 0.3102553442224786}, 'combined': 0.22860920100603682, 'epoch': 26}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29324255627819124, 'r': 0.289471880480351, 'f1': 0.2913450185820158}, 'combined': 0.18094059048777825, 'epoch': 26}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29774731763448564, 'r': 0.34520609312081735, 'f1': 0.3197251512735865}, 'combined': 0.2355869535700111, 'epoch': 26}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31260863279583656, 'r': 0.29745747255152793, 'f1': 0.30484491104875294}, 'combined': 0.18932473423027815, 'epoch': 26}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24847560975609756, 'r': 0.2910714285714286, 'f1': 0.2680921052631579}, 'combined': 0.1787280701754386, 'epoch': 26}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25595238095238093, 'r': 0.4673913043478261, 'f1': 0.33076923076923076}, 'combined': 0.16538461538461538, 'epoch': 26}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3804347826086957, 'r': 0.3017241379310345, 'f1': 0.3365384615384615}, 'combined': 0.22435897435897434, 'epoch': 26}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 27
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:09:29.094535: step: 2/464, loss: 0.004643636755645275
2023-01-22 15:09:29.936348: step: 4/464, loss: 0.05969075858592987
2023-01-22 15:09:30.735627: step: 6/464, loss: 0.05660129338502884
2023-01-22 15:09:31.440417: step: 8/464, loss: 0.009980532340705395
2023-01-22 15:09:32.118476: step: 10/464, loss: 0.008995480835437775
2023-01-22 15:09:32.793572: step: 12/464, loss: 0.011057031340897083
2023-01-22 15:09:33.515130: step: 14/464, loss: 0.04951007291674614
2023-01-22 15:09:34.261256: step: 16/464, loss: 0.00735006807371974
2023-01-22 15:09:35.017338: step: 18/464, loss: 0.0010468775872141123
2023-01-22 15:09:35.795350: step: 20/464, loss: 0.03526607155799866
2023-01-22 15:09:36.506241: step: 22/464, loss: 0.010531190782785416
2023-01-22 15:09:37.273560: step: 24/464, loss: 0.006157663185149431
2023-01-22 15:09:37.928425: step: 26/464, loss: 0.0033533810637891293
2023-01-22 15:09:38.612155: step: 28/464, loss: 0.011918006464838982
2023-01-22 15:09:39.278405: step: 30/464, loss: 0.01299851480871439
2023-01-22 15:09:39.971716: step: 32/464, loss: 0.0388779379427433
2023-01-22 15:09:40.679803: step: 34/464, loss: 0.04373202845454216
2023-01-22 15:09:41.343511: step: 36/464, loss: 0.011047535575926304
2023-01-22 15:09:42.143565: step: 38/464, loss: 0.030027193948626518
2023-01-22 15:09:42.840594: step: 40/464, loss: 0.0028982798103243113
2023-01-22
15:09:43.580268: step: 42/464, loss: 0.16715313494205475 2023-01-22 15:09:44.303775: step: 44/464, loss: 0.037980757653713226 2023-01-22 15:09:45.030572: step: 46/464, loss: 0.05919745936989784 2023-01-22 15:09:45.763612: step: 48/464, loss: 0.003755714511498809 2023-01-22 15:09:46.503965: step: 50/464, loss: 0.02528190053999424 2023-01-22 15:09:47.301447: step: 52/464, loss: 0.03390359506011009 2023-01-22 15:09:48.021643: step: 54/464, loss: 0.01960902288556099 2023-01-22 15:09:48.737028: step: 56/464, loss: 0.0009774720529094338 2023-01-22 15:09:49.521342: step: 58/464, loss: 0.07767617702484131 2023-01-22 15:09:50.293177: step: 60/464, loss: 0.009920057840645313 2023-01-22 15:09:50.984010: step: 62/464, loss: 0.05598536878824234 2023-01-22 15:09:51.774169: step: 64/464, loss: 0.002746915677562356 2023-01-22 15:09:52.549211: step: 66/464, loss: 0.13891775906085968 2023-01-22 15:09:53.254687: step: 68/464, loss: 0.012845741584897041 2023-01-22 15:09:54.006521: step: 70/464, loss: 0.0018104149494320154 2023-01-22 15:09:54.685550: step: 72/464, loss: 0.07344887405633926 2023-01-22 15:09:55.448924: step: 74/464, loss: 0.02678999863564968 2023-01-22 15:09:56.166139: step: 76/464, loss: 0.011106455698609352 2023-01-22 15:09:56.904462: step: 78/464, loss: 0.0064119272865355015 2023-01-22 15:09:57.593335: step: 80/464, loss: 0.014576753601431847 2023-01-22 15:09:58.373094: step: 82/464, loss: 0.08922789990901947 2023-01-22 15:09:59.123541: step: 84/464, loss: 0.002564490307122469 2023-01-22 15:09:59.843382: step: 86/464, loss: 0.042727261781692505 2023-01-22 15:10:00.597053: step: 88/464, loss: 0.02241584286093712 2023-01-22 15:10:01.349494: step: 90/464, loss: 0.03405474126338959 2023-01-22 15:10:02.106041: step: 92/464, loss: 0.02004759944975376 2023-01-22 15:10:02.779463: step: 94/464, loss: 0.007997794076800346 2023-01-22 15:10:03.558949: step: 96/464, loss: 0.005616101436316967 2023-01-22 15:10:04.268354: step: 98/464, loss: 0.015116574242711067 2023-01-22 
15:10:05.007146: step: 100/464, loss: 0.0059138028882443905 2023-01-22 15:10:05.744698: step: 102/464, loss: 0.005302782170474529 2023-01-22 15:10:06.481547: step: 104/464, loss: 0.0039724973030388355 2023-01-22 15:10:07.209843: step: 106/464, loss: 0.0138095049187541 2023-01-22 15:10:07.996476: step: 108/464, loss: 0.021086279302835464 2023-01-22 15:10:08.739383: step: 110/464, loss: 0.01628832332789898 2023-01-22 15:10:09.450686: step: 112/464, loss: 0.011404899880290031 2023-01-22 15:10:10.183811: step: 114/464, loss: 0.006049105431884527 2023-01-22 15:10:11.015382: step: 116/464, loss: 0.027632322162389755 2023-01-22 15:10:11.817218: step: 118/464, loss: 0.011812705546617508 2023-01-22 15:10:12.539259: step: 120/464, loss: 0.03694211319088936 2023-01-22 15:10:13.242349: step: 122/464, loss: 0.0032955994829535484 2023-01-22 15:10:13.944330: step: 124/464, loss: 0.032637108117341995 2023-01-22 15:10:14.692913: step: 126/464, loss: 0.008958675898611546 2023-01-22 15:10:15.524596: step: 128/464, loss: 0.0314217209815979 2023-01-22 15:10:16.285042: step: 130/464, loss: 0.05973093584179878 2023-01-22 15:10:17.009500: step: 132/464, loss: 0.0837894082069397 2023-01-22 15:10:17.819734: step: 134/464, loss: 2.221879243850708 2023-01-22 15:10:18.551905: step: 136/464, loss: 0.031027106568217278 2023-01-22 15:10:19.292350: step: 138/464, loss: 0.008257872425019741 2023-01-22 15:10:20.036222: step: 140/464, loss: 0.008987409062683582 2023-01-22 15:10:20.756867: step: 142/464, loss: 0.05197222903370857 2023-01-22 15:10:21.520364: step: 144/464, loss: 0.0036501206923276186 2023-01-22 15:10:22.263012: step: 146/464, loss: 0.23644611239433289 2023-01-22 15:10:23.025167: step: 148/464, loss: 0.01297426875680685 2023-01-22 15:10:23.715027: step: 150/464, loss: 0.0007067727274261415 2023-01-22 15:10:24.411087: step: 152/464, loss: 0.059547603130340576 2023-01-22 15:10:25.113872: step: 154/464, loss: 0.05646130442619324 2023-01-22 15:10:25.843444: step: 156/464, loss: 
0.08381831645965576 2023-01-22 15:10:26.647848: step: 158/464, loss: 0.01727456972002983 2023-01-22 15:10:27.423557: step: 160/464, loss: 0.019560186192393303 2023-01-22 15:10:28.199376: step: 162/464, loss: 0.036793388426303864 2023-01-22 15:10:28.897529: step: 164/464, loss: 0.020215477794408798 2023-01-22 15:10:29.651887: step: 166/464, loss: 0.0006976730655878782 2023-01-22 15:10:30.423829: step: 168/464, loss: 0.007520411163568497 2023-01-22 15:10:31.088024: step: 170/464, loss: 0.10326247662305832 2023-01-22 15:10:31.901593: step: 172/464, loss: 0.013687805272638798 2023-01-22 15:10:32.747694: step: 174/464, loss: 0.016566680744290352 2023-01-22 15:10:33.499488: step: 176/464, loss: 0.009976368397474289 2023-01-22 15:10:34.201292: step: 178/464, loss: 0.013758454471826553 2023-01-22 15:10:35.001578: step: 180/464, loss: 0.06455282121896744 2023-01-22 15:10:35.691854: step: 182/464, loss: 0.004204621072858572 2023-01-22 15:10:36.405567: step: 184/464, loss: 0.041804660111665726 2023-01-22 15:10:37.071019: step: 186/464, loss: 0.07819627225399017 2023-01-22 15:10:37.792655: step: 188/464, loss: 0.0023909374140203 2023-01-22 15:10:38.604039: step: 190/464, loss: 0.01398144755512476 2023-01-22 15:10:39.326180: step: 192/464, loss: 0.007653918582946062 2023-01-22 15:10:40.017680: step: 194/464, loss: 0.005307620856910944 2023-01-22 15:10:40.766260: step: 196/464, loss: 0.012562950141727924 2023-01-22 15:10:41.524405: step: 198/464, loss: 0.01593020185828209 2023-01-22 15:10:42.365424: step: 200/464, loss: 0.0009638724150136113 2023-01-22 15:10:43.228530: step: 202/464, loss: 0.048679254949092865 2023-01-22 15:10:43.985516: step: 204/464, loss: 0.006392453797161579 2023-01-22 15:10:44.710514: step: 206/464, loss: 0.01042192429304123 2023-01-22 15:10:45.402353: step: 208/464, loss: 0.01366907637566328 2023-01-22 15:10:46.097985: step: 210/464, loss: 0.005037717055529356 2023-01-22 15:10:46.827010: step: 212/464, loss: 1.1862741708755493 2023-01-22 15:10:47.542180: 
step: 214/464, loss: 0.02484513819217682 2023-01-22 15:10:48.221335: step: 216/464, loss: 0.07665397226810455 2023-01-22 15:10:49.002555: step: 218/464, loss: 0.029446804895997047 2023-01-22 15:10:49.706369: step: 220/464, loss: 0.042743612080812454 2023-01-22 15:10:50.476395: step: 222/464, loss: 0.0012197594624012709 2023-01-22 15:10:51.210640: step: 224/464, loss: 0.022778544574975967 2023-01-22 15:10:51.969679: step: 226/464, loss: 0.0036713809240609407 2023-01-22 15:10:52.681327: step: 228/464, loss: 0.0013936725445091724 2023-01-22 15:10:53.446238: step: 230/464, loss: 0.021703477948904037 2023-01-22 15:10:54.259274: step: 232/464, loss: 0.025985587388277054 2023-01-22 15:10:54.967317: step: 234/464, loss: 0.012457039207220078 2023-01-22 15:10:55.751156: step: 236/464, loss: 0.01900213584303856 2023-01-22 15:10:56.453277: step: 238/464, loss: 0.03523466736078262 2023-01-22 15:10:57.200080: step: 240/464, loss: 0.013470427133142948 2023-01-22 15:10:57.844648: step: 242/464, loss: 0.010749666020274162 2023-01-22 15:10:58.557283: step: 244/464, loss: 0.024799073114991188 2023-01-22 15:10:59.383822: step: 246/464, loss: 0.00432855449616909 2023-01-22 15:11:00.141585: step: 248/464, loss: 0.04193061962723732 2023-01-22 15:11:00.910714: step: 250/464, loss: 0.026824962347745895 2023-01-22 15:11:01.599677: step: 252/464, loss: 0.1815386265516281 2023-01-22 15:11:02.311095: step: 254/464, loss: 0.030335990712046623 2023-01-22 15:11:03.031061: step: 256/464, loss: 0.042864929884672165 2023-01-22 15:11:03.753523: step: 258/464, loss: 0.019191723316907883 2023-01-22 15:11:04.483928: step: 260/464, loss: 0.040823861956596375 2023-01-22 15:11:05.127573: step: 262/464, loss: 0.016286570578813553 2023-01-22 15:11:05.843628: step: 264/464, loss: 0.09178738296031952 2023-01-22 15:11:06.707306: step: 266/464, loss: 0.03739660233259201 2023-01-22 15:11:07.539341: step: 268/464, loss: 0.01867361180484295 2023-01-22 15:11:08.249309: step: 270/464, loss: 0.0024883486330509186 
2023-01-22 15:11:08.975131: step: 272/464, loss: 0.003901525866240263 2023-01-22 15:11:09.699014: step: 274/464, loss: 0.029833318665623665 2023-01-22 15:11:10.411428: step: 276/464, loss: 0.02278943918645382 2023-01-22 15:11:11.118447: step: 278/464, loss: 0.2309410125017166 2023-01-22 15:11:11.902392: step: 280/464, loss: 0.022811105474829674 2023-01-22 15:11:12.680001: step: 282/464, loss: 0.02011258341372013 2023-01-22 15:11:13.462303: step: 284/464, loss: 0.010888233780860901 2023-01-22 15:11:14.207097: step: 286/464, loss: 0.03074740245938301 2023-01-22 15:11:14.945753: step: 288/464, loss: 0.011113407090306282 2023-01-22 15:11:15.616092: step: 290/464, loss: 0.024814600124955177 2023-01-22 15:11:16.304651: step: 292/464, loss: 0.029330937191843987 2023-01-22 15:11:17.081593: step: 294/464, loss: 0.017294079065322876 2023-01-22 15:11:17.802210: step: 296/464, loss: 0.12338482588529587 2023-01-22 15:11:18.553740: step: 298/464, loss: 0.027676189318299294 2023-01-22 15:11:19.389014: step: 300/464, loss: 0.2536686360836029 2023-01-22 15:11:20.188055: step: 302/464, loss: 0.02364785224199295 2023-01-22 15:11:21.055153: step: 304/464, loss: 0.13542599976062775 2023-01-22 15:11:21.759640: step: 306/464, loss: 0.002939973957836628 2023-01-22 15:11:22.554864: step: 308/464, loss: 0.009112313389778137 2023-01-22 15:11:23.290073: step: 310/464, loss: 0.005944511387497187 2023-01-22 15:11:24.072512: step: 312/464, loss: 0.024802474305033684 2023-01-22 15:11:24.841051: step: 314/464, loss: 0.11303552240133286 2023-01-22 15:11:25.581756: step: 316/464, loss: 0.001480169128626585 2023-01-22 15:11:26.326370: step: 318/464, loss: 0.021034857258200645 2023-01-22 15:11:27.006301: step: 320/464, loss: 0.008457268588244915 2023-01-22 15:11:27.753647: step: 322/464, loss: 0.024015389382839203 2023-01-22 15:11:28.433335: step: 324/464, loss: 0.0032060788944363594 2023-01-22 15:11:29.117508: step: 326/464, loss: 0.00801069661974907 2023-01-22 15:11:29.825981: step: 328/464, loss: 
0.009024497121572495 2023-01-22 15:11:30.523477: step: 330/464, loss: 0.015846198424696922 2023-01-22 15:11:31.189709: step: 332/464, loss: 0.02931729145348072 2023-01-22 15:11:32.055284: step: 334/464, loss: 0.06721168756484985 2023-01-22 15:11:32.831515: step: 336/464, loss: 0.05309503898024559 2023-01-22 15:11:33.572584: step: 338/464, loss: 0.025797521695494652 2023-01-22 15:11:34.394479: step: 340/464, loss: 16.31406593322754 2023-01-22 15:11:35.078997: step: 342/464, loss: 0.032370854169130325 2023-01-22 15:11:35.879098: step: 344/464, loss: 0.011842369101941586 2023-01-22 15:11:36.631953: step: 346/464, loss: 0.07969319820404053 2023-01-22 15:11:37.370683: step: 348/464, loss: 0.0680510401725769 2023-01-22 15:11:38.067655: step: 350/464, loss: 0.0023781214840710163 2023-01-22 15:11:38.763341: step: 352/464, loss: 0.0967128649353981 2023-01-22 15:11:39.509119: step: 354/464, loss: 0.046918027102947235 2023-01-22 15:11:40.246976: step: 356/464, loss: 0.018872834742069244 2023-01-22 15:11:41.007426: step: 358/464, loss: 0.0149329649284482 2023-01-22 15:11:41.751287: step: 360/464, loss: 0.03147076815366745 2023-01-22 15:11:42.493736: step: 362/464, loss: 0.042903654277324677 2023-01-22 15:11:43.212088: step: 364/464, loss: 0.021718693897128105 2023-01-22 15:11:44.012005: step: 366/464, loss: 0.016806410625576973 2023-01-22 15:11:44.747958: step: 368/464, loss: 0.02991560660302639 2023-01-22 15:11:45.355265: step: 370/464, loss: 0.00047591744805686176 2023-01-22 15:11:46.102779: step: 372/464, loss: 0.018689189106225967 2023-01-22 15:11:46.912210: step: 374/464, loss: 0.0032049152068793774 2023-01-22 15:11:47.627862: step: 376/464, loss: 0.011647749692201614 2023-01-22 15:11:48.373339: step: 378/464, loss: 0.032072123140096664 2023-01-22 15:11:49.241682: step: 380/464, loss: 1.2236533164978027 2023-01-22 15:11:49.981920: step: 382/464, loss: 0.1918351650238037 2023-01-22 15:11:50.810472: step: 384/464, loss: 0.007488076109439135 2023-01-22 15:11:51.638127: step: 
386/464, loss: 0.029658658429980278 2023-01-22 15:11:52.351218: step: 388/464, loss: 0.0018123077461495996 2023-01-22 15:11:53.076409: step: 390/464, loss: 0.3252991735935211 2023-01-22 15:11:53.786640: step: 392/464, loss: 0.0037515009753406048 2023-01-22 15:11:54.450407: step: 394/464, loss: 0.01372498832643032 2023-01-22 15:11:55.160494: step: 396/464, loss: 0.02327573671936989 2023-01-22 15:11:55.875168: step: 398/464, loss: 0.02104596234858036 2023-01-22 15:11:56.654196: step: 400/464, loss: 0.006648695096373558 2023-01-22 15:11:57.454725: step: 402/464, loss: 0.013397648930549622 2023-01-22 15:11:58.241647: step: 404/464, loss: 0.08620434254407883 2023-01-22 15:11:59.023861: step: 406/464, loss: 0.061332013458013535 2023-01-22 15:11:59.786262: step: 408/464, loss: 0.006002719514071941 2023-01-22 15:12:00.554510: step: 410/464, loss: 0.03985309228301048 2023-01-22 15:12:01.279347: step: 412/464, loss: 0.054387107491493225 2023-01-22 15:12:01.991056: step: 414/464, loss: 0.025079520419239998 2023-01-22 15:12:02.688620: step: 416/464, loss: 0.004718593787401915 2023-01-22 15:12:03.474237: step: 418/464, loss: 0.03148522600531578 2023-01-22 15:12:04.241969: step: 420/464, loss: 0.010869407095015049 2023-01-22 15:12:05.068714: step: 422/464, loss: 0.019349634647369385 2023-01-22 15:12:05.800088: step: 424/464, loss: 0.067308709025383 2023-01-22 15:12:06.474539: step: 426/464, loss: 0.23457546532154083 2023-01-22 15:12:07.209421: step: 428/464, loss: 0.02192721515893936 2023-01-22 15:12:07.887922: step: 430/464, loss: 0.00015661267389077693 2023-01-22 15:12:08.569564: step: 432/464, loss: 0.007550378330051899 2023-01-22 15:12:09.326171: step: 434/464, loss: 0.0132135059684515 2023-01-22 15:12:10.050653: step: 436/464, loss: 0.003985149785876274 2023-01-22 15:12:10.785084: step: 438/464, loss: 0.07638142257928848 2023-01-22 15:12:11.462772: step: 440/464, loss: 0.013048535212874413 2023-01-22 15:12:12.217710: step: 442/464, loss: 0.015374964103102684 2023-01-22 
15:12:12.960839: step: 444/464, loss: 0.0020470223389565945 2023-01-22 15:12:13.702398: step: 446/464, loss: 0.006991243921220303 2023-01-22 15:12:14.410428: step: 448/464, loss: 0.02094842866063118 2023-01-22 15:12:15.086707: step: 450/464, loss: 0.054518863558769226 2023-01-22 15:12:15.831435: step: 452/464, loss: 0.06273539364337921 2023-01-22 15:12:16.598175: step: 454/464, loss: 0.006025554146617651 2023-01-22 15:12:17.283946: step: 456/464, loss: 0.05231739953160286 2023-01-22 15:12:18.063441: step: 458/464, loss: 0.001241020974703133 2023-01-22 15:12:18.773337: step: 460/464, loss: 0.0381249338388443 2023-01-22 15:12:19.499023: step: 462/464, loss: 0.0033619175665080547 2023-01-22 15:12:20.256485: step: 464/464, loss: 0.04729428142309189 2023-01-22 15:12:20.934136: step: 466/464, loss: 0.02667922154068947 2023-01-22 15:12:21.708729: step: 468/464, loss: 0.2248188853263855 2023-01-22 15:12:22.455697: step: 470/464, loss: 0.33987537026405334 2023-01-22 15:12:23.232719: step: 472/464, loss: 3.5491943359375 2023-01-22 15:12:23.903743: step: 474/464, loss: 0.017293350771069527 2023-01-22 15:12:24.649016: step: 476/464, loss: 0.022785795852541924 2023-01-22 15:12:25.400884: step: 478/464, loss: 0.017341703176498413 2023-01-22 15:12:26.070421: step: 480/464, loss: 0.0009825509041547775 2023-01-22 15:12:26.801881: step: 482/464, loss: 0.003864610567688942 2023-01-22 15:12:27.520539: step: 484/464, loss: 0.0037028638180345297 2023-01-22 15:12:28.274965: step: 486/464, loss: 0.017254330217838287 2023-01-22 15:12:29.084133: step: 488/464, loss: 0.06554201245307922 2023-01-22 15:12:29.899901: step: 490/464, loss: 0.23053628206253052 2023-01-22 15:12:30.628747: step: 492/464, loss: 0.08253292739391327 2023-01-22 15:12:31.312987: step: 494/464, loss: 0.012563752010464668 2023-01-22 15:12:32.078028: step: 496/464, loss: 0.004077407997101545 2023-01-22 15:12:32.825786: step: 498/464, loss: 0.013352013193070889 2023-01-22 15:12:33.515597: step: 500/464, loss: 
0.014727744273841381 2023-01-22 15:12:34.236448: step: 502/464, loss: 0.10225684940814972 2023-01-22 15:12:34.926431: step: 504/464, loss: 0.0051687248051166534 2023-01-22 15:12:35.655226: step: 506/464, loss: 0.00040302166598849 2023-01-22 15:12:36.361501: step: 508/464, loss: 0.025112995877861977 2023-01-22 15:12:37.213542: step: 510/464, loss: 0.08379174023866653 2023-01-22 15:12:38.003506: step: 512/464, loss: 0.009772256948053837 2023-01-22 15:12:38.726858: step: 514/464, loss: 0.004904484376311302 2023-01-22 15:12:39.499965: step: 516/464, loss: 0.005667173303663731 2023-01-22 15:12:40.262114: step: 518/464, loss: 0.0352930948138237 2023-01-22 15:12:41.041562: step: 520/464, loss: 0.03419114649295807 2023-01-22 15:12:41.859272: step: 522/464, loss: 0.0036346132401376963 2023-01-22 15:12:42.587426: step: 524/464, loss: 0.04692724719643593 2023-01-22 15:12:43.353581: step: 526/464, loss: 0.5098605155944824 2023-01-22 15:12:44.235541: step: 528/464, loss: 0.022590486332774162 2023-01-22 15:12:44.949156: step: 530/464, loss: 0.018303902819752693 2023-01-22 15:12:45.642634: step: 532/464, loss: 0.008179579861462116 2023-01-22 15:12:46.394127: step: 534/464, loss: 0.004101166967302561 2023-01-22 15:12:47.109050: step: 536/464, loss: 0.06652644276618958 2023-01-22 15:12:47.894923: step: 538/464, loss: 0.22143836319446564 2023-01-22 15:12:48.751084: step: 540/464, loss: 0.05717802420258522 2023-01-22 15:12:49.491802: step: 542/464, loss: 0.008329024538397789 2023-01-22 15:12:50.332837: step: 544/464, loss: 0.016076695173978806 2023-01-22 15:12:51.039297: step: 546/464, loss: 0.023170355707406998 2023-01-22 15:12:51.796824: step: 548/464, loss: 0.03448038920760155 2023-01-22 15:12:52.607659: step: 550/464, loss: 0.03607087582349777 2023-01-22 15:12:53.372015: step: 552/464, loss: 0.11305753141641617 2023-01-22 15:12:54.181200: step: 554/464, loss: 0.006824792828410864 2023-01-22 15:12:55.016198: step: 556/464, loss: 0.0006629024283029139 2023-01-22 15:12:55.789771: 
step: 558/464, loss: 0.05242105573415756 2023-01-22 15:12:56.488874: step: 560/464, loss: 0.017612749710679054 2023-01-22 15:12:57.146948: step: 562/464, loss: 0.036728858947753906 2023-01-22 15:12:57.870899: step: 564/464, loss: 0.002265622839331627 2023-01-22 15:12:58.531241: step: 566/464, loss: 0.0009037033887580037 2023-01-22 15:12:59.311435: step: 568/464, loss: 0.020430414006114006 2023-01-22 15:13:00.032271: step: 570/464, loss: 0.02256557159125805 2023-01-22 15:13:00.677083: step: 572/464, loss: 0.04098518192768097 2023-01-22 15:13:01.460046: step: 574/464, loss: 0.14411544799804688 2023-01-22 15:13:02.247719: step: 576/464, loss: 0.0064660003408789635 2023-01-22 15:13:02.937045: step: 578/464, loss: 0.016785940155386925 2023-01-22 15:13:03.677355: step: 580/464, loss: 0.15375134348869324 2023-01-22 15:13:04.410382: step: 582/464, loss: 0.008497925475239754 2023-01-22 15:13:05.136075: step: 584/464, loss: 0.18103736639022827 2023-01-22 15:13:05.794968: step: 586/464, loss: 0.004437592811882496 2023-01-22 15:13:06.662935: step: 588/464, loss: 0.022399378940463066 2023-01-22 15:13:07.476144: step: 590/464, loss: 0.03182051330804825 2023-01-22 15:13:08.226845: step: 592/464, loss: 0.06716489046812057 2023-01-22 15:13:08.927613: step: 594/464, loss: 0.029364589601755142 2023-01-22 15:13:09.633706: step: 596/464, loss: 0.028085386380553246 2023-01-22 15:13:10.388557: step: 598/464, loss: 0.047149620950222015 2023-01-22 15:13:11.142158: step: 600/464, loss: 0.6726506948471069 2023-01-22 15:13:11.864312: step: 602/464, loss: 0.021409453824162483 2023-01-22 15:13:12.542295: step: 604/464, loss: 0.027266275137662888 2023-01-22 15:13:13.347792: step: 606/464, loss: 0.012125412002205849 2023-01-22 15:13:14.044349: step: 608/464, loss: 0.004791960120201111 2023-01-22 15:13:14.757303: step: 610/464, loss: 0.021916117519140244 2023-01-22 15:13:15.504599: step: 612/464, loss: 0.07800301909446716 2023-01-22 15:13:16.300531: step: 614/464, loss: 0.06997323036193848 
2023-01-22 15:13:16.976254: step: 616/464, loss: 0.01541509572416544 2023-01-22 15:13:17.672881: step: 618/464, loss: 0.01439857017248869 2023-01-22 15:13:18.389467: step: 620/464, loss: 0.09722045809030533 2023-01-22 15:13:19.157424: step: 622/464, loss: 0.3760955035686493 2023-01-22 15:13:19.977694: step: 624/464, loss: 0.03853768855333328 2023-01-22 15:13:20.717087: step: 626/464, loss: 0.20559030771255493 2023-01-22 15:13:21.453244: step: 628/464, loss: 0.36281540989875793 2023-01-22 15:13:22.151126: step: 630/464, loss: 0.000528940639924258 2023-01-22 15:13:22.933393: step: 632/464, loss: 0.059164758771657944 2023-01-22 15:13:23.751769: step: 634/464, loss: 0.03127744048833847 2023-01-22 15:13:24.547032: step: 636/464, loss: 0.017752759158611298 2023-01-22 15:13:25.215809: step: 638/464, loss: 0.017974853515625 2023-01-22 15:13:25.952276: step: 640/464, loss: 0.004527505021542311 2023-01-22 15:13:26.672507: step: 642/464, loss: 0.8789519667625427 2023-01-22 15:13:27.365507: step: 644/464, loss: 0.022091975435614586 2023-01-22 15:13:28.112652: step: 646/464, loss: 0.029383093118667603 2023-01-22 15:13:28.863179: step: 648/464, loss: 0.014885574579238892 2023-01-22 15:13:29.595142: step: 650/464, loss: 0.09605132788419724 2023-01-22 15:13:30.490568: step: 652/464, loss: 0.015228205360472202 2023-01-22 15:13:31.196332: step: 654/464, loss: 0.03227443993091583 2023-01-22 15:13:31.969998: step: 656/464, loss: 0.02475246600806713 2023-01-22 15:13:32.764713: step: 658/464, loss: 0.05495090410113335 2023-01-22 15:13:33.537824: step: 660/464, loss: 0.023350244387984276 2023-01-22 15:13:34.290794: step: 662/464, loss: 0.00836243201047182 2023-01-22 15:13:35.013633: step: 664/464, loss: 0.05023346468806267 2023-01-22 15:13:35.780945: step: 666/464, loss: 0.019876714795827866 2023-01-22 15:13:36.529836: step: 668/464, loss: 0.04238751158118248 2023-01-22 15:13:37.272883: step: 670/464, loss: 0.00583220599219203 2023-01-22 15:13:38.071869: step: 672/464, loss: 
0.027970099821686745 2023-01-22 15:13:38.801986: step: 674/464, loss: 0.06498485803604126 2023-01-22 15:13:39.531546: step: 676/464, loss: 0.06316278874874115 2023-01-22 15:13:40.431370: step: 678/464, loss: 0.011693473905324936 2023-01-22 15:13:41.201825: step: 680/464, loss: 0.04206886515021324 2023-01-22 15:13:41.935931: step: 682/464, loss: 0.01298786886036396 2023-01-22 15:13:42.686552: step: 684/464, loss: 0.0017875705379992723 2023-01-22 15:13:43.419226: step: 686/464, loss: 0.001596881658770144 2023-01-22 15:13:44.106880: step: 688/464, loss: 0.04022899642586708 2023-01-22 15:13:44.822140: step: 690/464, loss: 0.02122717723250389 2023-01-22 15:13:45.490194: step: 692/464, loss: 0.01992715150117874 2023-01-22 15:13:46.297756: step: 694/464, loss: 0.01370147429406643 2023-01-22 15:13:46.986419: step: 696/464, loss: 0.0580449253320694 2023-01-22 15:13:47.697220: step: 698/464, loss: 0.042913101613521576 2023-01-22 15:13:48.420121: step: 700/464, loss: 0.0175530593842268 2023-01-22 15:13:49.180470: step: 702/464, loss: 0.024001671001315117 2023-01-22 15:13:49.828863: step: 704/464, loss: 0.3223625719547272 2023-01-22 15:13:50.537703: step: 706/464, loss: 0.05440326780080795 2023-01-22 15:13:51.261887: step: 708/464, loss: 0.04393898695707321 2023-01-22 15:13:51.945933: step: 710/464, loss: 0.0779217854142189 2023-01-22 15:13:52.785766: step: 712/464, loss: 0.010698176920413971 2023-01-22 15:13:53.510245: step: 714/464, loss: 0.020368900150060654 2023-01-22 15:13:54.216839: step: 716/464, loss: 0.003039130475372076 2023-01-22 15:13:54.893591: step: 718/464, loss: 0.0023006058763712645 2023-01-22 15:13:55.584698: step: 720/464, loss: 0.05625094100832939 2023-01-22 15:13:56.242703: step: 722/464, loss: 0.025456681847572327 2023-01-22 15:13:57.050803: step: 724/464, loss: 0.18045490980148315 2023-01-22 15:13:57.801861: step: 726/464, loss: 0.021235687658190727 2023-01-22 15:13:58.548257: step: 728/464, loss: 0.11662095040082932 2023-01-22 15:13:59.325846: step: 
730/464, loss: 0.013974427245557308 2023-01-22 15:14:00.058676: step: 732/464, loss: 0.01940869353711605 2023-01-22 15:14:00.831321: step: 734/464, loss: 0.24936962127685547 2023-01-22 15:14:01.583678: step: 736/464, loss: 0.0662817656993866 2023-01-22 15:14:02.266614: step: 738/464, loss: 0.005788351409137249 2023-01-22 15:14:02.976511: step: 740/464, loss: 0.00926015991717577 2023-01-22 15:14:03.690811: step: 742/464, loss: 0.03753805533051491 2023-01-22 15:14:04.468674: step: 744/464, loss: 0.06381073594093323 2023-01-22 15:14:05.206526: step: 746/464, loss: 0.025666510686278343 2023-01-22 15:14:05.950398: step: 748/464, loss: 0.012702060863375664 2023-01-22 15:14:06.773596: step: 750/464, loss: 0.05052117258310318 2023-01-22 15:14:07.460574: step: 752/464, loss: 0.0011634822003543377 2023-01-22 15:14:08.183495: step: 754/464, loss: 0.01186416856944561 2023-01-22 15:14:08.941844: step: 756/464, loss: 0.008127476088702679 2023-01-22 15:14:09.577748: step: 758/464, loss: 1.6381367444992065 2023-01-22 15:14:10.301450: step: 760/464, loss: 0.006272918079048395 2023-01-22 15:14:11.062916: step: 762/464, loss: 0.005116637796163559 2023-01-22 15:14:11.802922: step: 764/464, loss: 0.010884278453886509 2023-01-22 15:14:12.572731: step: 766/464, loss: 0.03772607445716858 2023-01-22 15:14:13.360625: step: 768/464, loss: 0.012753071263432503 2023-01-22 15:14:14.114160: step: 770/464, loss: 0.040373656898736954 2023-01-22 15:14:14.913255: step: 772/464, loss: 0.006438211537897587 2023-01-22 15:14:15.692995: step: 774/464, loss: 0.041876643896102905 2023-01-22 15:14:16.495998: step: 776/464, loss: 0.014445491135120392 2023-01-22 15:14:17.184877: step: 778/464, loss: 0.01640855334699154 2023-01-22 15:14:17.985999: step: 780/464, loss: 0.010428578592836857 2023-01-22 15:14:18.803326: step: 782/464, loss: 0.06605575233697891 2023-01-22 15:14:19.584200: step: 784/464, loss: 0.14504900574684143 2023-01-22 15:14:20.309360: step: 786/464, loss: 0.013085847720503807 2023-01-22 
15:14:21.067810: step: 788/464, loss: 0.05918606370687485 2023-01-22 15:14:21.879162: step: 790/464, loss: 0.009224585257470608 2023-01-22 15:14:22.551024: step: 792/464, loss: 0.01954095996916294 2023-01-22 15:14:23.268521: step: 794/464, loss: 0.005487607326358557 2023-01-22 15:14:23.996470: step: 796/464, loss: 0.027371792122721672 2023-01-22 15:14:24.739148: step: 798/464, loss: 0.043121811002492905 2023-01-22 15:14:25.487512: step: 800/464, loss: 0.13599644601345062 2023-01-22 15:14:26.246165: step: 802/464, loss: 0.15219353139400482 2023-01-22 15:14:26.919914: step: 804/464, loss: 0.003684660419821739 2023-01-22 15:14:27.636972: step: 806/464, loss: 0.05656978115439415 2023-01-22 15:14:28.334557: step: 808/464, loss: 0.017085609957575798 2023-01-22 15:14:29.108962: step: 810/464, loss: 0.02435958757996559 2023-01-22 15:14:29.819883: step: 812/464, loss: 0.001792858587577939 2023-01-22 15:14:30.622334: step: 814/464, loss: 0.005744806956499815 2023-01-22 15:14:31.354229: step: 816/464, loss: 0.014488141983747482 2023-01-22 15:14:32.096150: step: 818/464, loss: 0.01181106548756361 2023-01-22 15:14:32.808047: step: 820/464, loss: 0.01934506557881832 2023-01-22 15:14:33.552764: step: 822/464, loss: 0.009796754457056522 2023-01-22 15:14:34.361474: step: 824/464, loss: 0.00042967224726453424 2023-01-22 15:14:35.095698: step: 826/464, loss: 0.01379795465618372 2023-01-22 15:14:35.840838: step: 828/464, loss: 0.014302251860499382 2023-01-22 15:14:36.693494: step: 830/464, loss: 0.016044262796640396 2023-01-22 15:14:37.418542: step: 832/464, loss: 0.04799362272024155 2023-01-22 15:14:38.036612: step: 834/464, loss: 0.006738803815096617 2023-01-22 15:14:38.708600: step: 836/464, loss: 0.001439097453840077 2023-01-22 15:14:39.469631: step: 838/464, loss: 0.11067359149456024 2023-01-22 15:14:40.173173: step: 840/464, loss: 0.0274420827627182 2023-01-22 15:14:40.886669: step: 842/464, loss: 0.013492869213223457 2023-01-22 15:14:41.641326: step: 844/464, loss: 
0.0159356277436018 2023-01-22 15:14:42.377203: step: 846/464, loss: 0.0019269096665084362 2023-01-22 15:14:43.093022: step: 848/464, loss: 0.014913782477378845 2023-01-22 15:14:43.927407: step: 850/464, loss: 0.04737095534801483 2023-01-22 15:14:44.640479: step: 852/464, loss: 0.08423922210931778 2023-01-22 15:14:45.337379: step: 854/464, loss: 0.030404379591345787 2023-01-22 15:14:46.098726: step: 856/464, loss: 0.06620021164417267 2023-01-22 15:14:46.842629: step: 858/464, loss: 0.08976007252931595 2023-01-22 15:14:47.566811: step: 860/464, loss: 0.029361475259065628 2023-01-22 15:14:48.331149: step: 862/464, loss: 0.012592900544404984 2023-01-22 15:14:49.027120: step: 864/464, loss: 0.011326616629958153 2023-01-22 15:14:49.691916: step: 866/464, loss: 0.007855894044041634 2023-01-22 15:14:50.440921: step: 868/464, loss: 0.01106798741966486 2023-01-22 15:14:51.158141: step: 870/464, loss: 0.008292165584862232 2023-01-22 15:14:51.935294: step: 872/464, loss: 0.01627056859433651 2023-01-22 15:14:52.849088: step: 874/464, loss: 0.010699544101953506 2023-01-22 15:14:53.564852: step: 876/464, loss: 0.01734847202897072 2023-01-22 15:14:54.263761: step: 878/464, loss: 0.6863678097724915 2023-01-22 15:14:55.060498: step: 880/464, loss: 0.20707592368125916 2023-01-22 15:14:55.804508: step: 882/464, loss: 0.0022819163277745247 2023-01-22 15:14:56.523400: step: 884/464, loss: 0.013131760060787201 2023-01-22 15:14:57.167913: step: 886/464, loss: 0.008258351124823093 2023-01-22 15:14:57.939758: step: 888/464, loss: 0.10258307307958603 2023-01-22 15:14:58.672251: step: 890/464, loss: 0.010877292603254318 2023-01-22 15:14:59.431796: step: 892/464, loss: 0.02519080974161625 2023-01-22 15:15:00.134889: step: 894/464, loss: 0.02485913224518299 2023-01-22 15:15:00.807808: step: 896/464, loss: 0.00023697837605141103 2023-01-22 15:15:01.629236: step: 898/464, loss: 0.008679088205099106 2023-01-22 15:15:02.335290: step: 900/464, loss: 0.0034459216985851526 2023-01-22 15:15:03.061097: 
step: 902/464, loss: 0.0010122188832610846 2023-01-22 15:15:03.773797: step: 904/464, loss: 0.00626129936426878 2023-01-22 15:15:04.538704: step: 906/464, loss: 0.03434443101286888 2023-01-22 15:15:05.328192: step: 908/464, loss: 0.027134211733937263 2023-01-22 15:15:06.131987: step: 910/464, loss: 0.012275014072656631 2023-01-22 15:15:06.841232: step: 912/464, loss: 0.016073843464255333 2023-01-22 15:15:07.583055: step: 914/464, loss: 0.010795571841299534 2023-01-22 15:15:08.292261: step: 916/464, loss: 0.014998020604252815 2023-01-22 15:15:09.024814: step: 918/464, loss: 0.010157154873013496 2023-01-22 15:15:09.746797: step: 920/464, loss: 0.07434345781803131 2023-01-22 15:15:10.423627: step: 922/464, loss: 0.028125915676355362 2023-01-22 15:15:11.125041: step: 924/464, loss: 0.0008021766552701592 2023-01-22 15:15:11.880518: step: 926/464, loss: 0.023643728345632553 2023-01-22 15:15:12.584212: step: 928/464, loss: 0.009946424514055252 2023-01-22 15:15:13.268705: step: 930/464, loss: 0.0014948910102248192
==================================================
Loss: 0.099
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.308959307554993, 'r': 0.34589372193063733, 'f1': 0.32638494441798727}, 'combined': 0.2404941695711485, 'epoch': 27}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2945667319038624, 'r': 0.28203817654098795, 'f1': 0.28816634308533484}, 'combined': 0.17896646570562902, 'epoch': 27}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29534142861987556, 'r': 0.3340104202228574, 'f1': 0.3134879634148634}, 'combined': 0.23099113093726775, 'epoch': 27}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2865749384862607, 'r': 0.27948851567502775, 'f1': 0.28298737040305766}, 'combined': 0.1757500510924253, 'epoch': 27}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3087304984155321, 'r': 0.34212260165971675, 'f1': 0.3245699569301003}, 'combined': 0.23915681036954758, 'epoch': 27}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3027501670621189, 'r': 0.2872091910628788, 'f1': 0.2947749853563285}, 'combined': 0.1830707803791935, 'epoch': 27}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2702205882352941, 'r': 0.2625, 'f1': 0.266304347826087}, 'combined': 0.17753623188405798, 'epoch': 27}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27564102564102566, 'r': 0.4673913043478261, 'f1': 0.3467741935483871}, 'combined': 0.17338709677419356, 'epoch': 27}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4430921052631579, 'r': 0.3055807622504537, 'f1': 0.3617078410311493}, 'combined': 0.24113856068743283, 'epoch': 27}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 28
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:17:53.710522: step: 2/464, loss: 0.11802484840154648 2023-01-22 15:17:54.513098: step: 4/464, loss: 0.01309600193053484 2023-01-22 15:17:55.298117: step: 6/464, loss: 0.03771432489156723 2023-01-22 15:17:56.042130: step: 8/464, loss: 0.04672054946422577 2023-01-22 15:17:56.872192: step: 10/464, loss: 0.05573376640677452 2023-01-22 15:17:57.567059: step: 12/464, loss: 0.014672880060970783 2023-01-22 15:17:58.316633: step: 14/464, loss: 0.0027975246775895357 2023-01-22 15:17:58.990824: step: 16/464, loss: 0.011845177039504051 2023-01-22 15:17:59.690725:
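Note on reading the evaluation summaries above: each 'f1' is consistent with the usual harmonic mean of the logged 'p' and 'r', and each 'combined' value matches the product of the template F1 and slot F1 (e.g. Dev Chinese, epoch 27: 0.7368421052631579 × 0.32638494441798727 ≈ 0.2404941695711485). A minimal sketch of this relationship, inferred from the logged numbers rather than from the actual train.py code:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """Reproduce the 'combined' field: template F1 times slot F1
    (inferred from the logged values, not taken from train.py)."""
    return template["f1"] * slot["f1"]

# Dev Chinese, epoch 27, copied from the summary above.
template = {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579}
slot = {"p": 0.308959307554993, "r": 0.34589372193063733,
        "f1": 0.32638494441798727}

# Both relationships hold to floating-point precision.
assert abs(f1(template["p"], template["r"]) - template["f1"]) < 1e-12
assert abs(combined_score(template, slot) - 0.2404941695711485) < 1e-12
```

The same check holds for every Dev/Test/Sample entry in this log, so 'combined' can be read as a template-weighted slot score.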
step: 18/464, loss: 0.0013881749473512173 2023-01-22 15:18:00.416819: step: 20/464, loss: 0.02932727336883545 2023-01-22 15:18:01.083043: step: 22/464, loss: 0.005997839383780956 2023-01-22 15:18:01.764220: step: 24/464, loss: 0.0056749386712908745 2023-01-22 15:18:02.577903: step: 26/464, loss: 0.017113935202360153 2023-01-22 15:18:03.286236: step: 28/464, loss: 0.014125406742095947 2023-01-22 15:18:04.042450: step: 30/464, loss: 0.03675199672579765 2023-01-22 15:18:04.790393: step: 32/464, loss: 0.019042707979679108 2023-01-22 15:18:05.452805: step: 34/464, loss: 0.006434685550630093 2023-01-22 15:18:06.269000: step: 36/464, loss: 0.03755321726202965 2023-01-22 15:18:07.012463: step: 38/464, loss: 0.023875346407294273 2023-01-22 15:18:07.750144: step: 40/464, loss: 0.34607502818107605 2023-01-22 15:18:08.504991: step: 42/464, loss: 0.0430816151201725 2023-01-22 15:18:09.255559: step: 44/464, loss: 0.034060847014188766 2023-01-22 15:18:09.993150: step: 46/464, loss: 0.0461505763232708 2023-01-22 15:18:10.731630: step: 48/464, loss: 0.02029733918607235 2023-01-22 15:18:11.516472: step: 50/464, loss: 0.0033927804324775934 2023-01-22 15:18:12.236556: step: 52/464, loss: 0.011056846007704735 2023-01-22 15:18:13.050689: step: 54/464, loss: 0.00704342732205987 2023-01-22 15:18:13.766711: step: 56/464, loss: 0.00036773111787624657 2023-01-22 15:18:14.501744: step: 58/464, loss: 0.003657267428934574 2023-01-22 15:18:15.235370: step: 60/464, loss: 0.08691982924938202 2023-01-22 15:18:16.035415: step: 62/464, loss: 0.003310913685709238 2023-01-22 15:18:16.725334: step: 64/464, loss: 0.033577486872673035 2023-01-22 15:18:17.436338: step: 66/464, loss: 0.01103971991688013 2023-01-22 15:18:18.194487: step: 68/464, loss: 0.27612701058387756 2023-01-22 15:18:19.001252: step: 70/464, loss: 0.0013540396466851234 2023-01-22 15:18:19.693742: step: 72/464, loss: 0.01651098020374775 2023-01-22 15:18:20.482965: step: 74/464, loss: 0.027707085013389587 2023-01-22 15:18:21.278613: step: 
76/464, loss: 0.002544702496379614 2023-01-22 15:18:22.052904: step: 78/464, loss: 0.003023700788617134 2023-01-22 15:18:22.782465: step: 80/464, loss: 0.0047032893635332584 2023-01-22 15:18:23.471642: step: 82/464, loss: 0.00022587347484659404 2023-01-22 15:18:24.301306: step: 84/464, loss: 0.5387089252471924 2023-01-22 15:18:25.065773: step: 86/464, loss: 0.0016945467796176672 2023-01-22 15:18:25.845902: step: 88/464, loss: 0.010113071650266647 2023-01-22 15:18:26.558444: step: 90/464, loss: 0.0036650216206908226 2023-01-22 15:18:27.302520: step: 92/464, loss: 0.05071362480521202 2023-01-22 15:18:28.153740: step: 94/464, loss: 0.08658748120069504 2023-01-22 15:18:28.855534: step: 96/464, loss: 0.05931760370731354 2023-01-22 15:18:29.577042: step: 98/464, loss: 0.002367552602663636 2023-01-22 15:18:30.323872: step: 100/464, loss: 0.05622279644012451 2023-01-22 15:18:31.013266: step: 102/464, loss: 0.122266985476017 2023-01-22 15:18:31.750104: step: 104/464, loss: 0.10930424183607101 2023-01-22 15:18:32.535907: step: 106/464, loss: 0.01408810168504715 2023-01-22 15:18:33.275341: step: 108/464, loss: 6.7664475440979 2023-01-22 15:18:33.939667: step: 110/464, loss: 0.010170192457735538 2023-01-22 15:18:34.642990: step: 112/464, loss: 0.032275233417749405 2023-01-22 15:18:35.354432: step: 114/464, loss: 0.003709939308464527 2023-01-22 15:18:36.147796: step: 116/464, loss: 0.0035400777123868465 2023-01-22 15:18:36.899607: step: 118/464, loss: 0.1314106434583664 2023-01-22 15:18:37.746145: step: 120/464, loss: 0.0016641796100884676 2023-01-22 15:18:38.498150: step: 122/464, loss: 0.015162871219217777 2023-01-22 15:18:39.255117: step: 124/464, loss: 0.021675320342183113 2023-01-22 15:18:40.003481: step: 126/464, loss: 0.009907363913953304 2023-01-22 15:18:40.690592: step: 128/464, loss: 0.0007932535954751074 2023-01-22 15:18:41.378477: step: 130/464, loss: 0.011510932818055153 2023-01-22 15:18:42.271678: step: 132/464, loss: 0.06883779913187027 2023-01-22 
15:18:43.054886: step: 134/464, loss: 0.015645796433091164 2023-01-22 15:18:43.823310: step: 136/464, loss: 0.07446697354316711 2023-01-22 15:18:44.551485: step: 138/464, loss: 0.019342824816703796 2023-01-22 15:18:45.285697: step: 140/464, loss: 0.0017768784891813993 2023-01-22 15:18:45.972700: step: 142/464, loss: 0.00040327577153220773 2023-01-22 15:18:46.662646: step: 144/464, loss: 0.004029213450849056 2023-01-22 15:18:47.355197: step: 146/464, loss: 0.0003138831234537065 2023-01-22 15:18:48.037195: step: 148/464, loss: 0.07467584311962128 2023-01-22 15:18:48.742300: step: 150/464, loss: 0.020751923322677612 2023-01-22 15:18:49.426826: step: 152/464, loss: 0.00014375359751284122 2023-01-22 15:18:50.176971: step: 154/464, loss: 0.005737461615353823 2023-01-22 15:18:50.866035: step: 156/464, loss: 0.020710809156298637 2023-01-22 15:18:51.551244: step: 158/464, loss: 0.16590137779712677 2023-01-22 15:18:52.236154: step: 160/464, loss: 0.0020352245774120092 2023-01-22 15:18:52.982006: step: 162/464, loss: 0.020962907001376152 2023-01-22 15:18:53.763193: step: 164/464, loss: 0.1972610354423523 2023-01-22 15:18:54.522884: step: 166/464, loss: 4.758993625640869 2023-01-22 15:18:55.223218: step: 168/464, loss: 0.08988256752490997 2023-01-22 15:18:56.031152: step: 170/464, loss: 0.0248276200145483 2023-01-22 15:18:56.774392: step: 172/464, loss: 0.005605712067335844 2023-01-22 15:18:57.471033: step: 174/464, loss: 0.02840728685259819 2023-01-22 15:18:58.285086: step: 176/464, loss: 0.00861166138201952 2023-01-22 15:18:58.994312: step: 178/464, loss: 0.004647831432521343 2023-01-22 15:18:59.718059: step: 180/464, loss: 0.008284042589366436 2023-01-22 15:19:00.452611: step: 182/464, loss: 0.038902122527360916 2023-01-22 15:19:01.164053: step: 184/464, loss: 0.013399926014244556 2023-01-22 15:19:01.880641: step: 186/464, loss: 0.0023059824015945196 2023-01-22 15:19:02.677809: step: 188/464, loss: 0.04218491166830063 2023-01-22 15:19:03.424935: step: 190/464, loss: 
0.03554745763540268 2023-01-22 15:19:04.147976: step: 192/464, loss: 0.008880481123924255 2023-01-22 15:19:04.885740: step: 194/464, loss: 0.03854241967201233 2023-01-22 15:19:05.761322: step: 196/464, loss: 0.007150109391659498 2023-01-22 15:19:06.538668: step: 198/464, loss: 0.0255377609282732 2023-01-22 15:19:07.325787: step: 200/464, loss: 0.0677812471985817 2023-01-22 15:19:07.982543: step: 202/464, loss: 0.00023945064458530396 2023-01-22 15:19:08.782733: step: 204/464, loss: 0.3639880120754242 2023-01-22 15:19:09.581116: step: 206/464, loss: 0.0019438682356849313 2023-01-22 15:19:10.324497: step: 208/464, loss: 0.00396093400195241 2023-01-22 15:19:11.107437: step: 210/464, loss: 0.019185908138751984 2023-01-22 15:19:11.898837: step: 212/464, loss: 0.013483088463544846 2023-01-22 15:19:12.711445: step: 214/464, loss: 0.03135254606604576 2023-01-22 15:19:13.420732: step: 216/464, loss: 0.01842799037694931 2023-01-22 15:19:14.119903: step: 218/464, loss: 0.007716724649071693 2023-01-22 15:19:14.846383: step: 220/464, loss: 0.0006132380221970379 2023-01-22 15:19:15.572625: step: 222/464, loss: 0.025128891691565514 2023-01-22 15:19:16.318212: step: 224/464, loss: 0.015200333669781685 2023-01-22 15:19:17.117478: step: 226/464, loss: 0.033424243330955505 2023-01-22 15:19:17.864180: step: 228/464, loss: 0.01700105145573616 2023-01-22 15:19:18.578473: step: 230/464, loss: 0.45445510745048523 2023-01-22 15:19:19.297483: step: 232/464, loss: 0.00553317554295063 2023-01-22 15:19:20.116786: step: 234/464, loss: 0.06307711452245712 2023-01-22 15:19:20.830992: step: 236/464, loss: 0.024076784029603004 2023-01-22 15:19:21.690768: step: 238/464, loss: 0.049010127782821655 2023-01-22 15:19:22.415789: step: 240/464, loss: 0.003527000779286027 2023-01-22 15:19:23.206263: step: 242/464, loss: 0.04622814804315567 2023-01-22 15:19:23.975599: step: 244/464, loss: 0.0006396998069249094 2023-01-22 15:19:24.700428: step: 246/464, loss: 0.004830328747630119 2023-01-22 15:19:25.450895: 
step: 248/464, loss: 0.0254517775028944 2023-01-22 15:19:26.165350: step: 250/464, loss: 0.034026939421892166 2023-01-22 15:19:26.977509: step: 252/464, loss: 0.020695265382528305 2023-01-22 15:19:27.704617: step: 254/464, loss: 0.01600899174809456 2023-01-22 15:19:28.478402: step: 256/464, loss: 0.010306455194950104 2023-01-22 15:19:29.217857: step: 258/464, loss: 0.02375786192715168 2023-01-22 15:19:29.978616: step: 260/464, loss: 0.0194417554885149 2023-01-22 15:19:30.670471: step: 262/464, loss: 0.004774979781359434 2023-01-22 15:19:31.405350: step: 264/464, loss: 0.007717948406934738 2023-01-22 15:19:32.031983: step: 266/464, loss: 0.03315155208110809 2023-01-22 15:19:32.844816: step: 268/464, loss: 0.0035224573221057653 2023-01-22 15:19:33.632685: step: 270/464, loss: 0.0005481951520778239 2023-01-22 15:19:34.366592: step: 272/464, loss: 0.03926454111933708 2023-01-22 15:19:35.158023: step: 274/464, loss: 0.1650092452764511 2023-01-22 15:19:35.844240: step: 276/464, loss: 0.00951425638049841 2023-01-22 15:19:36.608116: step: 278/464, loss: 0.009358447976410389 2023-01-22 15:19:37.277678: step: 280/464, loss: 0.06159272417426109 2023-01-22 15:19:38.083264: step: 282/464, loss: 0.003746254835277796 2023-01-22 15:19:38.833630: step: 284/464, loss: 0.012985536828637123 2023-01-22 15:19:39.587824: step: 286/464, loss: 0.035654276609420776 2023-01-22 15:19:40.310526: step: 288/464, loss: 0.03835088014602661 2023-01-22 15:19:41.003416: step: 290/464, loss: 0.015234079211950302 2023-01-22 15:19:41.741909: step: 292/464, loss: 0.038829442113637924 2023-01-22 15:19:42.504904: step: 294/464, loss: 0.04349800571799278 2023-01-22 15:19:43.243639: step: 296/464, loss: 0.03376410901546478 2023-01-22 15:19:43.991818: step: 298/464, loss: 0.0038537858054041862 2023-01-22 15:19:44.717883: step: 300/464, loss: 0.0397925041615963 2023-01-22 15:19:45.418457: step: 302/464, loss: 0.013765391893684864 2023-01-22 15:19:46.117016: step: 304/464, loss: 0.008335323072969913 2023-01-22 
15:19:46.827587: step: 306/464, loss: 0.04323340952396393 2023-01-22 15:19:47.566581: step: 308/464, loss: 0.010128780268132687 2023-01-22 15:19:48.278092: step: 310/464, loss: 0.013416034169495106 2023-01-22 15:19:49.034795: step: 312/464, loss: 0.00013531593140214682 2023-01-22 15:19:49.763648: step: 314/464, loss: 0.09947311133146286 2023-01-22 15:19:50.542702: step: 316/464, loss: 0.017739834263920784 2023-01-22 15:19:51.357974: step: 318/464, loss: 0.0012596314772963524 2023-01-22 15:19:52.071667: step: 320/464, loss: 0.0014567991020157933 2023-01-22 15:19:52.798537: step: 322/464, loss: 0.008959956467151642 2023-01-22 15:19:53.496003: step: 324/464, loss: 0.019878460094332695 2023-01-22 15:19:54.197829: step: 326/464, loss: 0.0139300636947155 2023-01-22 15:19:54.983210: step: 328/464, loss: 0.032952453941106796 2023-01-22 15:19:55.723500: step: 330/464, loss: 0.025093283504247665 2023-01-22 15:19:56.499564: step: 332/464, loss: 0.024029474705457687 2023-01-22 15:19:57.202975: step: 334/464, loss: 0.0024554578121751547 2023-01-22 15:19:57.912369: step: 336/464, loss: 0.09956522285938263 2023-01-22 15:19:58.640330: step: 338/464, loss: 0.20657359063625336 2023-01-22 15:19:59.329814: step: 340/464, loss: 0.052003707736730576 2023-01-22 15:20:00.106933: step: 342/464, loss: 0.3483356535434723 2023-01-22 15:20:00.875317: step: 344/464, loss: 0.012462802231311798 2023-01-22 15:20:01.553001: step: 346/464, loss: 0.0649271160364151 2023-01-22 15:20:02.248525: step: 348/464, loss: 0.031710434705019 2023-01-22 15:20:02.995514: step: 350/464, loss: 0.00686945766210556 2023-01-22 15:20:03.775713: step: 352/464, loss: 0.31140774488449097 2023-01-22 15:20:04.498469: step: 354/464, loss: 0.012221035547554493 2023-01-22 15:20:05.151524: step: 356/464, loss: 0.03105190582573414 2023-01-22 15:20:05.980523: step: 358/464, loss: 0.0019033612916246057 2023-01-22 15:20:06.695932: step: 360/464, loss: 0.007821962237358093 2023-01-22 15:20:07.525292: step: 362/464, loss: 
0.1137755960226059 2023-01-22 15:20:08.217528: step: 364/464, loss: 0.007979289628565311 2023-01-22 15:20:08.880725: step: 366/464, loss: 0.002275730948895216 2023-01-22 15:20:09.539533: step: 368/464, loss: 0.05832372605800629 2023-01-22 15:20:10.269695: step: 370/464, loss: 0.03142813593149185 2023-01-22 15:20:10.976385: step: 372/464, loss: 0.016409264877438545 2023-01-22 15:20:11.700836: step: 374/464, loss: 0.010367317125201225 2023-01-22 15:20:12.430705: step: 376/464, loss: 0.001882447162643075 2023-01-22 15:20:13.111846: step: 378/464, loss: 0.00920257717370987 2023-01-22 15:20:13.854320: step: 380/464, loss: 0.025457818061113358 2023-01-22 15:20:14.613312: step: 382/464, loss: 0.007789787836372852 2023-01-22 15:20:15.347851: step: 384/464, loss: 0.057693831622600555 2023-01-22 15:20:16.054423: step: 386/464, loss: 0.008637432008981705 2023-01-22 15:20:16.756706: step: 388/464, loss: 0.014804269187152386 2023-01-22 15:20:17.472193: step: 390/464, loss: 0.03306438773870468 2023-01-22 15:20:18.172858: step: 392/464, loss: 0.013410833664238453 2023-01-22 15:20:18.899511: step: 394/464, loss: 0.0015661357901990414 2023-01-22 15:20:19.750988: step: 396/464, loss: 0.00610681576654315 2023-01-22 15:20:20.508455: step: 398/464, loss: 0.0018912320956587791 2023-01-22 15:20:21.151044: step: 400/464, loss: 0.005749634001404047 2023-01-22 15:20:21.860399: step: 402/464, loss: 0.008570604026317596 2023-01-22 15:20:22.658582: step: 404/464, loss: 0.05428850278258324 2023-01-22 15:20:23.414697: step: 406/464, loss: 0.02714327536523342 2023-01-22 15:20:24.203266: step: 408/464, loss: 0.005631400737911463 2023-01-22 15:20:25.016399: step: 410/464, loss: 0.02358373999595642 2023-01-22 15:20:25.822999: step: 412/464, loss: 0.011837883852422237 2023-01-22 15:20:26.484191: step: 414/464, loss: 0.010881095193326473 2023-01-22 15:20:27.195330: step: 416/464, loss: 0.009863450191915035 2023-01-22 15:20:27.905481: step: 418/464, loss: 0.11035460233688354 2023-01-22 15:20:28.573697: 
step: 420/464, loss: 0.017884111031889915 2023-01-22 15:20:29.344418: step: 422/464, loss: 0.05123107507824898 2023-01-22 15:20:30.204384: step: 424/464, loss: 0.02582388184964657 2023-01-22 15:20:30.950634: step: 426/464, loss: 0.044229090213775635 2023-01-22 15:20:31.639333: step: 428/464, loss: 0.03117489628493786 2023-01-22 15:20:32.345770: step: 430/464, loss: 0.011095764115452766 2023-01-22 15:20:33.113736: step: 432/464, loss: 0.6051144599914551 2023-01-22 15:20:33.874100: step: 434/464, loss: 0.018150007352232933 2023-01-22 15:20:34.614891: step: 436/464, loss: 0.0067179263569414616 2023-01-22 15:20:35.366592: step: 438/464, loss: 0.010419225320219994 2023-01-22 15:20:36.094103: step: 440/464, loss: 0.0008295488078147173 2023-01-22 15:20:36.792118: step: 442/464, loss: 0.0167152788490057 2023-01-22 15:20:37.594045: step: 444/464, loss: 0.024527952075004578 2023-01-22 15:20:38.324364: step: 446/464, loss: 0.0478508397936821 2023-01-22 15:20:39.009115: step: 448/464, loss: 0.00022361334413290024 2023-01-22 15:20:39.747069: step: 450/464, loss: 0.03885860741138458 2023-01-22 15:20:40.422578: step: 452/464, loss: 0.03442401811480522 2023-01-22 15:20:41.123456: step: 454/464, loss: 1.4656697511672974 2023-01-22 15:20:41.927098: step: 456/464, loss: 0.02225828543305397 2023-01-22 15:20:42.726139: step: 458/464, loss: 0.03477256000041962 2023-01-22 15:20:43.460488: step: 460/464, loss: 0.011446974240243435 2023-01-22 15:20:44.198446: step: 462/464, loss: 0.33820757269859314 2023-01-22 15:20:44.907465: step: 464/464, loss: 0.0659744068980217 2023-01-22 15:20:45.647397: step: 466/464, loss: 0.004842453636229038 2023-01-22 15:20:46.381515: step: 468/464, loss: 0.03637324273586273 2023-01-22 15:20:47.176295: step: 470/464, loss: 0.0970153734087944 2023-01-22 15:20:47.951151: step: 472/464, loss: 0.02549644187092781 2023-01-22 15:20:48.707026: step: 474/464, loss: 0.3136069178581238 2023-01-22 15:20:49.407289: step: 476/464, loss: 0.017255626618862152 2023-01-22 
15:20:50.158713: step: 478/464, loss: 0.18557414412498474 2023-01-22 15:20:50.908370: step: 480/464, loss: 0.010793224908411503 2023-01-22 15:20:51.603704: step: 482/464, loss: 0.028504427522420883 2023-01-22 15:20:52.338748: step: 484/464, loss: 0.016538219526410103 2023-01-22 15:20:53.070491: step: 486/464, loss: 0.013090861961245537 2023-01-22 15:20:53.769629: step: 488/464, loss: 0.0014066733419895172 2023-01-22 15:20:54.548954: step: 490/464, loss: 0.01350055355578661 2023-01-22 15:20:55.255912: step: 492/464, loss: 0.016357606276869774 2023-01-22 15:20:55.962181: step: 494/464, loss: 0.07683868706226349 2023-01-22 15:20:56.806791: step: 496/464, loss: 0.028659097850322723 2023-01-22 15:20:57.557647: step: 498/464, loss: 0.06337568908929825 2023-01-22 15:20:58.241262: step: 500/464, loss: 0.0016648870659992099 2023-01-22 15:20:58.939281: step: 502/464, loss: 0.012672177515923977 2023-01-22 15:20:59.659433: step: 504/464, loss: 0.0046038078144192696 2023-01-22 15:21:00.435796: step: 506/464, loss: 0.01888994127511978 2023-01-22 15:21:01.140198: step: 508/464, loss: 0.01926352269947529 2023-01-22 15:21:01.875581: step: 510/464, loss: 0.03320739045739174 2023-01-22 15:21:02.548204: step: 512/464, loss: 0.013302896171808243 2023-01-22 15:21:03.222402: step: 514/464, loss: 0.01819944754242897 2023-01-22 15:21:03.929229: step: 516/464, loss: 0.007101289462298155 2023-01-22 15:21:04.746550: step: 518/464, loss: 0.06670579314231873 2023-01-22 15:21:05.403259: step: 520/464, loss: 0.3770799934864044 2023-01-22 15:21:06.119185: step: 522/464, loss: 0.010539170354604721 2023-01-22 15:21:06.914342: step: 524/464, loss: 0.32410728931427 2023-01-22 15:21:07.587188: step: 526/464, loss: 0.004301424603909254 2023-01-22 15:21:08.302728: step: 528/464, loss: 0.01964602991938591 2023-01-22 15:21:08.994916: step: 530/464, loss: 0.005726655479520559 2023-01-22 15:21:09.673984: step: 532/464, loss: 0.005193687044084072 2023-01-22 15:21:10.417423: step: 534/464, loss: 
0.042894672602415085 2023-01-22 15:21:11.133258: step: 536/464, loss: 0.03241725265979767 2023-01-22 15:21:11.955122: step: 538/464, loss: 0.5052769780158997 2023-01-22 15:21:12.736283: step: 540/464, loss: 0.06744624674320221 2023-01-22 15:21:13.455400: step: 542/464, loss: 0.04684813320636749 2023-01-22 15:21:14.118225: step: 544/464, loss: 0.03330191969871521 2023-01-22 15:21:14.789499: step: 546/464, loss: 0.007320890203118324 2023-01-22 15:21:15.533060: step: 548/464, loss: 0.0037253601476550102 2023-01-22 15:21:16.364885: step: 550/464, loss: 0.21355178952217102 2023-01-22 15:21:17.076348: step: 552/464, loss: 0.05642886087298393 2023-01-22 15:21:17.785654: step: 554/464, loss: 0.02629709430038929 2023-01-22 15:21:18.495393: step: 556/464, loss: 0.017702855169773102 2023-01-22 15:21:19.208855: step: 558/464, loss: 0.003917295020073652 2023-01-22 15:21:20.015292: step: 560/464, loss: 0.0066395653411746025 2023-01-22 15:21:20.729723: step: 562/464, loss: 0.05594123154878616 2023-01-22 15:21:21.504009: step: 564/464, loss: 0.07240436226129532 2023-01-22 15:21:22.277214: step: 566/464, loss: 0.0046186018735170364 2023-01-22 15:21:23.081704: step: 568/464, loss: 0.00958124827593565 2023-01-22 15:21:23.793420: step: 570/464, loss: 0.04715810716152191 2023-01-22 15:21:24.697095: step: 572/464, loss: 0.011579863727092743 2023-01-22 15:21:25.390295: step: 574/464, loss: 0.02096959389746189 2023-01-22 15:21:26.129839: step: 576/464, loss: 0.033328570425510406 2023-01-22 15:21:26.860207: step: 578/464, loss: 0.004885158967226744 2023-01-22 15:21:27.663619: step: 580/464, loss: 0.04350057244300842 2023-01-22 15:21:28.428377: step: 582/464, loss: 0.020800787955522537 2023-01-22 15:21:29.098833: step: 584/464, loss: 0.005818777251988649 2023-01-22 15:21:29.821551: step: 586/464, loss: 0.1312158703804016 2023-01-22 15:21:30.556871: step: 588/464, loss: 0.008737059310078621 2023-01-22 15:21:31.230376: step: 590/464, loss: 0.03691103309392929 2023-01-22 15:21:32.003847: step: 
592/464, loss: 0.011228111572563648 2023-01-22 15:21:32.759217: step: 594/464, loss: 0.0035349982790648937 2023-01-22 15:21:33.456204: step: 596/464, loss: 0.009966027922928333 2023-01-22 15:21:34.307701: step: 598/464, loss: 0.030326692387461662 2023-01-22 15:21:34.956142: step: 600/464, loss: 0.010724910534918308 2023-01-22 15:21:35.663368: step: 602/464, loss: 0.005237458273768425 2023-01-22 15:21:36.434388: step: 604/464, loss: 0.003180861007422209 2023-01-22 15:21:37.165513: step: 606/464, loss: 0.1141379177570343 2023-01-22 15:21:37.888151: step: 608/464, loss: 0.04503513500094414 2023-01-22 15:21:38.703170: step: 610/464, loss: 0.009239607490599155 2023-01-22 15:21:39.470548: step: 612/464, loss: 0.0586148202419281 2023-01-22 15:21:40.237557: step: 614/464, loss: 0.07480555772781372 2023-01-22 15:21:40.935193: step: 616/464, loss: 0.0015483495080843568 2023-01-22 15:21:41.664728: step: 618/464, loss: 0.029120994731783867 2023-01-22 15:21:42.414525: step: 620/464, loss: 0.0019428718369454145 2023-01-22 15:21:43.169996: step: 622/464, loss: 0.1298806518316269 2023-01-22 15:21:43.952675: step: 624/464, loss: 0.008236655034124851 2023-01-22 15:21:44.664543: step: 626/464, loss: 0.008735090494155884 2023-01-22 15:21:45.458981: step: 628/464, loss: 0.0024231334682554007 2023-01-22 15:21:46.184389: step: 630/464, loss: 0.015083258971571922 2023-01-22 15:21:46.847573: step: 632/464, loss: 0.027918506413698196 2023-01-22 15:21:47.616012: step: 634/464, loss: 0.017090972512960434 2023-01-22 15:21:48.381876: step: 636/464, loss: 0.008627823553979397 2023-01-22 15:21:49.135747: step: 638/464, loss: 0.000644507585093379 2023-01-22 15:21:49.841627: step: 640/464, loss: 0.00273617310449481 2023-01-22 15:21:50.565607: step: 642/464, loss: 0.10561248660087585 2023-01-22 15:21:51.324270: step: 644/464, loss: 0.2085895985364914 2023-01-22 15:21:52.076130: step: 646/464, loss: 0.17949917912483215 2023-01-22 15:21:52.823912: step: 648/464, loss: 0.00046418761485256255 2023-01-22 
15:21:53.516151: step: 650/464, loss: 0.016573699191212654 2023-01-22 15:21:54.268737: step: 652/464, loss: 0.04553362727165222 2023-01-22 15:21:54.993700: step: 654/464, loss: 0.6091349720954895 2023-01-22 15:21:55.712217: step: 656/464, loss: 0.020109454169869423 2023-01-22 15:21:56.452991: step: 658/464, loss: 0.10786572843790054 2023-01-22 15:21:57.203806: step: 660/464, loss: 0.06860281527042389 2023-01-22 15:21:58.033158: step: 662/464, loss: 0.022475754842162132 2023-01-22 15:21:58.780075: step: 664/464, loss: 0.09202992916107178 2023-01-22 15:21:59.492127: step: 666/464, loss: 0.21183981001377106 2023-01-22 15:22:00.260286: step: 668/464, loss: 0.037260301411151886 2023-01-22 15:22:01.127120: step: 670/464, loss: 0.0022160960361361504 2023-01-22 15:22:01.865146: step: 672/464, loss: 0.0008184523903764784 2023-01-22 15:22:02.583547: step: 674/464, loss: 0.0066426536068320274 2023-01-22 15:22:03.323866: step: 676/464, loss: 0.03934881463646889 2023-01-22 15:22:04.019619: step: 678/464, loss: 0.03135685250163078 2023-01-22 15:22:04.733432: step: 680/464, loss: 0.003920204471796751 2023-01-22 15:22:05.594420: step: 682/464, loss: 0.05742049589753151 2023-01-22 15:22:06.352786: step: 684/464, loss: 0.006650918163359165 2023-01-22 15:22:07.075711: step: 686/464, loss: 0.00517960824072361 2023-01-22 15:22:07.801071: step: 688/464, loss: 0.027552543208003044 2023-01-22 15:22:08.566294: step: 690/464, loss: 0.01750790700316429 2023-01-22 15:22:09.441198: step: 692/464, loss: 0.21500498056411743 2023-01-22 15:22:10.188727: step: 694/464, loss: 0.0054095140658319 2023-01-22 15:22:10.937290: step: 696/464, loss: 0.36011627316474915 2023-01-22 15:22:11.659504: step: 698/464, loss: 0.04345450922846794 2023-01-22 15:22:12.455737: step: 700/464, loss: 3.994520664215088 2023-01-22 15:22:13.211001: step: 702/464, loss: 0.031575389206409454 2023-01-22 15:22:13.928781: step: 704/464, loss: 0.018297553062438965 2023-01-22 15:22:14.615052: step: 706/464, loss: 
0.006059832405298948 2023-01-22 15:22:15.350897: step: 708/464, loss: 0.04579806327819824 2023-01-22 15:22:16.057423: step: 710/464, loss: 0.000498030858580023 2023-01-22 15:22:16.759753: step: 712/464, loss: 0.02703806944191456 2023-01-22 15:22:17.458532: step: 714/464, loss: 0.013390639796853065 2023-01-22 15:22:18.376634: step: 716/464, loss: 0.01270141638815403 2023-01-22 15:22:19.020372: step: 718/464, loss: 0.016825536265969276 2023-01-22 15:22:19.797978: step: 720/464, loss: 0.022613557055592537 2023-01-22 15:22:20.537366: step: 722/464, loss: 0.018898295238614082 2023-01-22 15:22:21.313246: step: 724/464, loss: 0.14791519939899445 2023-01-22 15:22:22.063530: step: 726/464, loss: 0.011607524938881397 2023-01-22 15:22:22.855534: step: 728/464, loss: 0.01842688024044037 2023-01-22 15:22:23.594076: step: 730/464, loss: 0.023921027779579163 2023-01-22 15:22:24.322333: step: 732/464, loss: 0.015431285835802555 2023-01-22 15:22:25.127656: step: 734/464, loss: 0.01931067556142807 2023-01-22 15:22:25.891128: step: 736/464, loss: 0.05411611124873161 2023-01-22 15:22:26.597093: step: 738/464, loss: 0.005268834065645933 2023-01-22 15:22:27.327254: step: 740/464, loss: 0.04305311292409897 2023-01-22 15:22:28.082995: step: 742/464, loss: 0.008710721507668495 2023-01-22 15:22:28.782503: step: 744/464, loss: 0.007986211217939854 2023-01-22 15:22:29.582862: step: 746/464, loss: 0.03394348919391632 2023-01-22 15:22:30.266699: step: 748/464, loss: 0.00660968292504549 2023-01-22 15:22:30.981636: step: 750/464, loss: 0.015248794108629227 2023-01-22 15:22:31.713038: step: 752/464, loss: 0.049215108156204224 2023-01-22 15:22:32.383916: step: 754/464, loss: 0.0020705063361674547 2023-01-22 15:22:33.046775: step: 756/464, loss: 0.009546882472932339 2023-01-22 15:22:33.840503: step: 758/464, loss: 0.010692611336708069 2023-01-22 15:22:34.576347: step: 760/464, loss: 0.002319645369425416 2023-01-22 15:22:35.272834: step: 762/464, loss: 0.33690714836120605 2023-01-22 15:22:36.004061: 
step: 764/464, loss: 0.013944489881396294 2023-01-22 15:22:36.756888: step: 766/464, loss: 0.011860419996082783 2023-01-22 15:22:37.550510: step: 768/464, loss: 0.10408028960227966 2023-01-22 15:22:38.304005: step: 770/464, loss: 0.007643452845513821 2023-01-22 15:22:39.029975: step: 772/464, loss: 0.037697263062000275 2023-01-22 15:22:39.788001: step: 774/464, loss: 0.012840951792895794 2023-01-22 15:22:40.459012: step: 776/464, loss: 0.0415637344121933 2023-01-22 15:22:41.222251: step: 778/464, loss: 0.0444411039352417 2023-01-22 15:22:42.061074: step: 780/464, loss: 0.012284292839467525 2023-01-22 15:22:42.857152: step: 782/464, loss: 0.0014497153460979462 2023-01-22 15:22:43.641128: step: 784/464, loss: 0.4550762474536896 2023-01-22 15:22:44.345852: step: 786/464, loss: 0.009761333465576172 2023-01-22 15:22:45.048119: step: 788/464, loss: 0.0007776019629091024 2023-01-22 15:22:45.751359: step: 790/464, loss: 0.025154652073979378 2023-01-22 15:22:46.423327: step: 792/464, loss: 0.026213031262159348 2023-01-22 15:22:47.229908: step: 794/464, loss: 0.007201395928859711 2023-01-22 15:22:47.977603: step: 796/464, loss: 0.020271655172109604 2023-01-22 15:22:48.684993: step: 798/464, loss: 0.025903355330228806 2023-01-22 15:22:49.396991: step: 800/464, loss: 0.029932750388979912 2023-01-22 15:22:50.133160: step: 802/464, loss: 0.2351268082857132 2023-01-22 15:22:50.846418: step: 804/464, loss: 0.05926898866891861 2023-01-22 15:22:51.543684: step: 806/464, loss: 0.034163087606430054 2023-01-22 15:22:52.229695: step: 808/464, loss: 0.024908997118473053 2023-01-22 15:22:52.979834: step: 810/464, loss: 0.013467278331518173 2023-01-22 15:22:53.726392: step: 812/464, loss: 0.006014563608914614 2023-01-22 15:22:54.452253: step: 814/464, loss: 0.11496934294700623 2023-01-22 15:22:55.163251: step: 816/464, loss: 0.013000456616282463 2023-01-22 15:22:55.818499: step: 818/464, loss: 0.0011854059994220734 2023-01-22 15:22:56.485633: step: 820/464, loss: 0.007496209349483252 
2023-01-22 15:22:57.268893: step: 822/464, loss: 0.5041012167930603 2023-01-22 15:22:58.017654: step: 824/464, loss: 0.01389066781848669 2023-01-22 15:22:58.808935: step: 826/464, loss: 0.013370354659855366 2023-01-22 15:22:59.538012: step: 828/464, loss: 0.020678000524640083 2023-01-22 15:23:00.326532: step: 830/464, loss: 0.018191656097769737 2023-01-22 15:23:01.085488: step: 832/464, loss: 0.009238002821803093 2023-01-22 15:23:01.816842: step: 834/464, loss: 0.04901707544922829 2023-01-22 15:23:02.594008: step: 836/464, loss: 0.0111688869073987 2023-01-22 15:23:03.356917: step: 838/464, loss: 0.036923848092556 2023-01-22 15:23:04.152390: step: 840/464, loss: 0.003734600730240345 2023-01-22 15:23:04.938126: step: 842/464, loss: 0.0461021289229393 2023-01-22 15:23:05.698283: step: 844/464, loss: 0.17386293411254883 2023-01-22 15:23:06.407100: step: 846/464, loss: 0.31301191449165344 2023-01-22 15:23:07.255479: step: 848/464, loss: 0.04938540980219841 2023-01-22 15:23:08.041254: step: 850/464, loss: 0.0695425271987915 2023-01-22 15:23:08.828357: step: 852/464, loss: 0.043353304266929626 2023-01-22 15:23:09.652351: step: 854/464, loss: 0.021765680983662605 2023-01-22 15:23:10.539792: step: 856/464, loss: 0.10840235650539398 2023-01-22 15:23:11.302004: step: 858/464, loss: 0.010906664654612541 2023-01-22 15:23:12.005638: step: 860/464, loss: 0.0002889096213039011 2023-01-22 15:23:12.662528: step: 862/464, loss: 0.0025182217359542847 2023-01-22 15:23:13.391914: step: 864/464, loss: 0.1939670443534851 2023-01-22 15:23:14.191734: step: 866/464, loss: 0.038164377212524414 2023-01-22 15:23:14.848114: step: 868/464, loss: 0.0014197065029293299 2023-01-22 15:23:15.609004: step: 870/464, loss: 0.04917750880122185 2023-01-22 15:23:16.313762: step: 872/464, loss: 0.02373397909104824 2023-01-22 15:23:17.052404: step: 874/464, loss: 0.0019780031871050596 2023-01-22 15:23:17.809347: step: 876/464, loss: 0.005515122786164284 2023-01-22 15:23:18.530401: step: 878/464, loss: 
0.09901914745569229 2023-01-22 15:23:19.264226: step: 880/464, loss: 0.02087888866662979 2023-01-22 15:23:20.017489: step: 882/464, loss: 0.025759685784578323 2023-01-22 15:23:20.734159: step: 884/464, loss: 0.004240122158080339 2023-01-22 15:23:21.500068: step: 886/464, loss: 0.03213540464639664 2023-01-22 15:23:22.137523: step: 888/464, loss: 0.004417374264448881 2023-01-22 15:23:22.944885: step: 890/464, loss: 0.06850989907979965 2023-01-22 15:23:23.633849: step: 892/464, loss: 0.007251196075230837 2023-01-22 15:23:24.412442: step: 894/464, loss: 0.0011848622234538198 2023-01-22 15:23:25.119825: step: 896/464, loss: 0.011365242302417755 2023-01-22 15:23:25.864026: step: 898/464, loss: 0.019021768122911453 2023-01-22 15:23:26.685455: step: 900/464, loss: 4.568624496459961 2023-01-22 15:23:27.435720: step: 902/464, loss: 0.001643879571929574 2023-01-22 15:23:28.198426: step: 904/464, loss: 0.005207096692174673 2023-01-22 15:23:28.913738: step: 906/464, loss: 0.00832145381718874 2023-01-22 15:23:29.682543: step: 908/464, loss: 0.007098441943526268 2023-01-22 15:23:30.421609: step: 910/464, loss: 0.00881668645888567 2023-01-22 15:23:31.217574: step: 912/464, loss: 0.015401276759803295 2023-01-22 15:23:31.899292: step: 914/464, loss: 0.07423000037670135 2023-01-22 15:23:32.633561: step: 916/464, loss: 0.06360428035259247 2023-01-22 15:23:33.294658: step: 918/464, loss: 0.08443159610033035 2023-01-22 15:23:34.044749: step: 920/464, loss: 0.0022100761998444796 2023-01-22 15:23:34.859795: step: 922/464, loss: 0.027309654280543327 2023-01-22 15:23:35.551143: step: 924/464, loss: 0.02387995645403862 2023-01-22 15:23:36.317436: step: 926/464, loss: 0.009875823743641376 2023-01-22 15:23:37.078857: step: 928/464, loss: 0.06939181685447693 2023-01-22 15:23:37.711574: step: 930/464, loss: 0.011594748124480247
==================================================
Loss: 0.092
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31902359378677336, 'r': 0.36745222282461376, 'f1': 0.3415296674225246}, 'combined': 0.2516534391534392, 'epoch': 28}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2949031916590556, 'r': 0.2978201371749711, 'f1': 0.2963544868935982}, 'combined': 0.18405173396549784, 'epoch': 28}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2933220541401274, 'r': 0.3495374762808349, 'f1': 0.3189718614718615}, 'combined': 0.2350318979266348, 'epoch': 28}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28639419132251737, 'r': 0.2997082635204979, 'f1': 0.2929000042718447}, 'combined': 0.1819063184425141, 'epoch': 28}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30657779370685306, 'r': 0.35544408340585815, 'f1': 0.32920743753055753}, 'combined': 0.24257390133830553, 'epoch': 28}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3059142671244368, 'r': 0.3031909947168009, 'f1': 0.3045465431283514}, 'combined': 0.18913943204813402, 'epoch': 28}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.3357142857142857, 'f1': 0.28658536585365857}, 'combined': 0.1910569105691057, 'epoch': 28}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28205128205128205, 'r': 0.4782608695652174, 'f1': 0.3548387096774194}, 'combined': 0.1774193548387097, 'epoch': 28}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 28}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
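From the logged numbers, the `combined` metric appears to be the product of the template F1 and the slot F1, with each F1 the standard harmonic mean of the reported `p` and `r` (this is inferred from the log, not confirmed against the training code; the `f1` helper below is hypothetical). A minimal sketch checking this against the Dev Chinese figures for epoch 28:

```python
def f1(p: float, r: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Dev Chinese, epoch 28: template p = 1.0, r = 7/12 (0.5833...);
# slot F1 copied from the log line above.
template_f1 = f1(1.0, 7 / 12)          # 0.7368421052631579, as logged
slot_f1 = 0.3415296674225246
combined = template_f1 * slot_f1       # 0.2516534391534392, as logged
```

The same product reproduces the `combined` values for the other language/split rows as well.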
******************************
Epoch: 29
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:26:17.146966: step: 2/464, loss: 0.0029547016602009535 2023-01-22 15:26:17.855943: step: 4/464, loss: 0.0018905532779172063 2023-01-22 15:26:18.578971: step: 6/464, loss: 0.07960596680641174 2023-01-22 15:26:19.307298: step: 8/464, loss: 0.034294113516807556 2023-01-22 15:26:20.021350: step: 10/464, loss: 0.02181868441402912 2023-01-22 15:26:20.751010: step: 12/464, loss: 0.007761118467897177 2023-01-22 15:26:21.535523: step: 14/464, loss: 0.17579518258571625 2023-01-22 15:26:22.265414: step: 16/464, loss: 0.008423620834946632 2023-01-22 15:26:22.939019: step: 18/464, loss: 0.002608527196571231 2023-01-22 15:26:23.745348: step: 20/464, loss: 0.022364985197782516 2023-01-22 15:26:24.489302: step: 22/464, loss: 0.002348007168620825 2023-01-22 15:26:25.170967: step: 24/464, loss: 0.009676797315478325 2023-01-22 15:26:25.872377: step: 26/464, loss: 0.015824243426322937 2023-01-22 15:26:26.679867: step: 28/464, loss: 0.0011271298862993717 2023-01-22 15:26:27.358227: step: 30/464, loss: 0.015557849779725075 2023-01-22 15:26:28.108255: step: 32/464, loss: 0.0047828881070017815 2023-01-22 15:26:28.890505: step: 34/464, loss: 0.013710367493331432 2023-01-22 15:26:29.649381: step: 36/464, loss: 0.0010676380479708314 2023-01-22 15:26:30.295951: step: 38/464, loss: 0.006321811582893133 2023-01-22 15:26:31.064810: step: 40/464, loss: 0.03331097215414047 2023-01-22 15:26:31.884236: step: 42/464, loss: 0.05714160203933716 2023-01-22 15:26:32.598114: step: 44/464, loss: 0.010108050890266895 2023-01-22 15:26:33.321621: step: 46/464, loss: 0.0493391752243042 2023-01-22 15:26:34.087208: step: 48/464, loss: 0.029232880100607872 2023-01-22 15:26:34.771110: step: 50/464, loss: 0.025543780997395515 2023-01-22 
15:26:35.526061: step: 52/464, loss: 0.003706761635839939 2023-01-22 15:26:36.267337: step: 54/464, loss: 3.4575983590912074e-05 2023-01-22 15:26:37.016569: step: 56/464, loss: 0.0759250670671463 2023-01-22 15:26:37.742933: step: 58/464, loss: 0.016924351453781128 2023-01-22 15:26:38.444340: step: 60/464, loss: 0.00038268452044576406 2023-01-22 15:26:39.212385: step: 62/464, loss: 0.007408566307276487 2023-01-22 15:26:39.911981: step: 64/464, loss: 0.000976721872575581 2023-01-22 15:26:40.639617: step: 66/464, loss: 0.04439283907413483 2023-01-22 15:26:41.422119: step: 68/464, loss: 0.025368480011820793 2023-01-22 15:26:42.141605: step: 70/464, loss: 0.04029041528701782 2023-01-22 15:26:42.869191: step: 72/464, loss: 0.007327367551624775 2023-01-22 15:26:43.558446: step: 74/464, loss: 0.025254884734749794 2023-01-22 15:26:44.325682: step: 76/464, loss: 0.020946325734257698 2023-01-22 15:26:45.049078: step: 78/464, loss: 0.02133583091199398 2023-01-22 15:26:45.886634: step: 80/464, loss: 0.11129105091094971 2023-01-22 15:26:46.561998: step: 82/464, loss: 0.1564992368221283 2023-01-22 15:26:47.295122: step: 84/464, loss: 0.026285603642463684 2023-01-22 15:26:48.051349: step: 86/464, loss: 0.014006583951413631 2023-01-22 15:26:48.820574: step: 88/464, loss: 0.08667285740375519 2023-01-22 15:26:49.563652: step: 90/464, loss: 0.022317009046673775 2023-01-22 15:26:50.215305: step: 92/464, loss: 0.012138741090893745 2023-01-22 15:26:50.950901: step: 94/464, loss: 0.04680892825126648 2023-01-22 15:26:51.752805: step: 96/464, loss: 0.06309213489294052 2023-01-22 15:26:52.580722: step: 98/464, loss: 0.0006251411396078765 2023-01-22 15:26:53.341838: step: 100/464, loss: 0.06023294851183891 2023-01-22 15:26:53.985010: step: 102/464, loss: 0.012864368967711926 2023-01-22 15:26:54.693201: step: 104/464, loss: 0.00024760988890193403 2023-01-22 15:26:55.421049: step: 106/464, loss: 0.027757912874221802 2023-01-22 15:26:56.195211: step: 108/464, loss: 0.054326508194208145 
2023-01-22 15:26:56.850658: step: 110/464, loss: 0.0025361785665154457 2023-01-22 15:26:57.574822: step: 112/464, loss: 0.010986367240548134 2023-01-22 15:26:58.305195: step: 114/464, loss: 0.018608594313263893 2023-01-22 15:26:58.973812: step: 116/464, loss: 0.008475798182189465 2023-01-22 15:26:59.755743: step: 118/464, loss: 0.003442601067945361 2023-01-22 15:27:00.605596: step: 120/464, loss: 0.010374759323894978 2023-01-22 15:27:01.407270: step: 122/464, loss: 0.03338068723678589 2023-01-22 15:27:02.184709: step: 124/464, loss: 0.056789930909872055 2023-01-22 15:27:02.940116: step: 126/464, loss: 0.012232386507093906 2023-01-22 15:27:03.721906: step: 128/464, loss: 0.013899753801524639 2023-01-22 15:27:04.474474: step: 130/464, loss: 0.043581441044807434 2023-01-22 15:27:05.191971: step: 132/464, loss: 0.016672799363732338 2023-01-22 15:27:05.985872: step: 134/464, loss: 0.44271788001060486 2023-01-22 15:27:06.722115: step: 136/464, loss: 0.07831883430480957 2023-01-22 15:27:07.433843: step: 138/464, loss: 0.016645532101392746 2023-01-22 15:27:08.149323: step: 140/464, loss: 0.09777562320232391 2023-01-22 15:27:08.929519: step: 142/464, loss: 0.010654406622052193 2023-01-22 15:27:09.692352: step: 144/464, loss: 0.0020998416002839804 2023-01-22 15:27:10.409062: step: 146/464, loss: 0.030081741511821747 2023-01-22 15:27:11.114442: step: 148/464, loss: 0.013528715819120407 2023-01-22 15:27:11.884165: step: 150/464, loss: 0.002756628207862377 2023-01-22 15:27:12.586185: step: 152/464, loss: 0.02021130919456482 2023-01-22 15:27:13.352288: step: 154/464, loss: 0.026756566017866135 2023-01-22 15:27:14.021506: step: 156/464, loss: 0.005173517391085625 2023-01-22 15:27:14.790609: step: 158/464, loss: 0.20922529697418213 2023-01-22 15:27:15.471010: step: 160/464, loss: 0.0023500677198171616 2023-01-22 15:27:16.236983: step: 162/464, loss: 0.02557515911757946 2023-01-22 15:27:16.975855: step: 164/464, loss: 0.0018051753286272287 2023-01-22 15:27:17.782904: step: 166/464, 
loss: 0.03364339470863342 2023-01-22 15:27:18.554702: step: 168/464, loss: 0.019174523651599884 2023-01-22 15:27:19.245133: step: 170/464, loss: 0.06015171855688095 2023-01-22 15:27:19.908257: step: 172/464, loss: 0.002911254530772567 2023-01-22 15:27:20.730170: step: 174/464, loss: 0.028410404920578003 2023-01-22 15:27:21.474743: step: 176/464, loss: 0.012645215727388859 2023-01-22 15:27:22.317846: step: 178/464, loss: 0.004745079670101404 2023-01-22 15:27:22.997218: step: 180/464, loss: 0.01845906488597393 2023-01-22 15:27:23.709668: step: 182/464, loss: 0.03830547630786896 2023-01-22 15:27:24.457444: step: 184/464, loss: 0.014404712244868279 2023-01-22 15:27:25.199684: step: 186/464, loss: 0.008285674266517162 2023-01-22 15:27:25.852653: step: 188/464, loss: 0.0003698143409565091 2023-01-22 15:27:26.648400: step: 190/464, loss: 0.02062177285552025 2023-01-22 15:27:27.390699: step: 192/464, loss: 0.02673025242984295 2023-01-22 15:27:28.079981: step: 194/464, loss: 0.010820874013006687 2023-01-22 15:27:28.817917: step: 196/464, loss: 0.017097724601626396 2023-01-22 15:27:29.647870: step: 198/464, loss: 0.0009626204846426845 2023-01-22 15:27:30.473413: step: 200/464, loss: 0.00660901702940464 2023-01-22 15:27:31.297216: step: 202/464, loss: 0.008002715185284615 2023-01-22 15:27:32.029896: step: 204/464, loss: 0.00957464799284935 2023-01-22 15:27:32.721101: step: 206/464, loss: 0.0024384590797126293 2023-01-22 15:27:33.697796: step: 208/464, loss: 0.0061497800052165985 2023-01-22 15:27:34.448372: step: 210/464, loss: 0.007444137241691351 2023-01-22 15:27:35.149271: step: 212/464, loss: 0.026912638917565346 2023-01-22 15:27:35.838983: step: 214/464, loss: 0.0040068114176392555 2023-01-22 15:27:36.593395: step: 216/464, loss: 0.0004780768067575991 2023-01-22 15:27:37.328952: step: 218/464, loss: 0.3938106596469879 2023-01-22 15:27:38.010896: step: 220/464, loss: 0.009369696490466595 2023-01-22 15:27:38.766592: step: 222/464, loss: 0.0011333172442391515 2023-01-22 
15:27:39.468227: step: 224/464, loss: 0.02935289777815342 2023-01-22 15:27:40.121280: step: 226/464, loss: 0.0071406010538339615 2023-01-22 15:27:40.856413: step: 228/464, loss: 0.01660071313381195 2023-01-22 15:27:41.610230: step: 230/464, loss: 0.29751288890838623 2023-01-22 15:27:42.336478: step: 232/464, loss: 0.0016616106731817126 2023-01-22 15:27:43.115314: step: 234/464, loss: 0.04663500189781189 2023-01-22 15:27:43.800434: step: 236/464, loss: 0.011690995655953884 2023-01-22 15:27:44.583709: step: 238/464, loss: 0.000685846374835819 2023-01-22 15:27:45.363148: step: 240/464, loss: 0.015121880918741226 2023-01-22 15:27:46.091278: step: 242/464, loss: 0.010790755040943623 2023-01-22 15:27:46.806778: step: 244/464, loss: 0.00013384269550442696 2023-01-22 15:27:47.644105: step: 246/464, loss: 0.005154958460479975 2023-01-22 15:27:48.296779: step: 248/464, loss: 0.009239543229341507 2023-01-22 15:27:49.046658: step: 250/464, loss: 0.016478722915053368 2023-01-22 15:27:49.848277: step: 252/464, loss: 0.00974748283624649 2023-01-22 15:27:50.539291: step: 254/464, loss: 0.008561811409890652 2023-01-22 15:27:51.274887: step: 256/464, loss: 0.0009420202695764601 2023-01-22 15:27:52.022716: step: 258/464, loss: 0.020443571731448174 2023-01-22 15:27:52.855041: step: 260/464, loss: 0.0278155654668808 2023-01-22 15:27:53.707480: step: 262/464, loss: 0.004828313831239939 2023-01-22 15:27:54.427855: step: 264/464, loss: 0.001934122759848833 2023-01-22 15:27:55.119920: step: 266/464, loss: 0.024927956983447075 2023-01-22 15:27:55.848741: step: 268/464, loss: 0.045292165130376816 2023-01-22 15:27:56.538242: step: 270/464, loss: 0.020056474953889847 2023-01-22 15:27:57.286569: step: 272/464, loss: 0.020017310976982117 2023-01-22 15:27:57.984437: step: 274/464, loss: 0.011287958361208439 2023-01-22 15:27:58.697821: step: 276/464, loss: 0.003020766656845808 2023-01-22 15:27:59.500411: step: 278/464, loss: 0.8390296101570129 2023-01-22 15:28:00.299610: step: 280/464, loss: 
0.03157031536102295 2023-01-22 15:28:01.093679: step: 282/464, loss: 0.011981689371168613 2023-01-22 15:28:01.754307: step: 284/464, loss: 0.022621948271989822 2023-01-22 15:28:02.507159: step: 286/464, loss: 0.04245922714471817 2023-01-22 15:28:03.257037: step: 288/464, loss: 0.04538784548640251 2023-01-22 15:28:03.985612: step: 290/464, loss: 0.011972902342677116 2023-01-22 15:28:04.781235: step: 292/464, loss: 0.01967264525592327 2023-01-22 15:28:05.471030: step: 294/464, loss: 0.024140847846865654 2023-01-22 15:28:06.195073: step: 296/464, loss: 0.007386627607047558 2023-01-22 15:28:06.997130: step: 298/464, loss: 0.05708402022719383 2023-01-22 15:28:07.735305: step: 300/464, loss: 0.01678168959915638 2023-01-22 15:28:08.516690: step: 302/464, loss: 0.010171451605856419 2023-01-22 15:28:09.236489: step: 304/464, loss: 0.0012677903287112713 2023-01-22 15:28:09.969669: step: 306/464, loss: 0.0006428569904528558 2023-01-22 15:28:10.775335: step: 308/464, loss: 0.004791866987943649 2023-01-22 15:28:11.551079: step: 310/464, loss: 0.007664674427360296 2023-01-22 15:28:12.383577: step: 312/464, loss: 0.021277105435729027 2023-01-22 15:28:13.102373: step: 314/464, loss: 0.012961627915501595 2023-01-22 15:28:13.885994: step: 316/464, loss: 0.005071598570793867 2023-01-22 15:28:14.707571: step: 318/464, loss: 0.013653067871928215 2023-01-22 15:28:15.427969: step: 320/464, loss: 0.03530142456293106 2023-01-22 15:28:16.174303: step: 322/464, loss: 0.033330876380205154 2023-01-22 15:28:16.915954: step: 324/464, loss: 0.09162920713424683 2023-01-22 15:28:17.723627: step: 326/464, loss: 0.018649160861968994 2023-01-22 15:28:18.556809: step: 328/464, loss: 0.3532303273677826 2023-01-22 15:28:19.325663: step: 330/464, loss: 0.006986723281443119 2023-01-22 15:28:20.132359: step: 332/464, loss: 0.0072303530760109425 2023-01-22 15:28:20.929954: step: 334/464, loss: 0.04930386319756508 2023-01-22 15:28:21.629505: step: 336/464, loss: 0.02361915074288845 2023-01-22 15:28:22.350856: 
step: 338/464, loss: 0.05951722711324692 2023-01-22 15:28:23.142707: step: 340/464, loss: 0.00894229020923376 2023-01-22 15:28:23.898132: step: 342/464, loss: 0.022131511941552162 2023-01-22 15:28:24.586377: step: 344/464, loss: 0.10531365126371384 2023-01-22 15:28:25.260316: step: 346/464, loss: 0.002158042509108782 2023-01-22 15:28:26.013122: step: 348/464, loss: 0.01175218727439642 2023-01-22 15:28:26.753340: step: 350/464, loss: 0.003121676156297326 2023-01-22 15:28:27.518701: step: 352/464, loss: 0.0032621140126138926 2023-01-22 15:28:28.197079: step: 354/464, loss: 0.03148592263460159 2023-01-22 15:28:28.910506: step: 356/464, loss: 0.041832827031612396 2023-01-22 15:28:29.624996: step: 358/464, loss: 0.003561746794730425 2023-01-22 15:28:30.366426: step: 360/464, loss: 0.0007038118201307952 2023-01-22 15:28:31.031860: step: 362/464, loss: 0.005218755453824997 2023-01-22 15:28:31.772370: step: 364/464, loss: 0.05850491300225258 2023-01-22 15:28:32.524333: step: 366/464, loss: 0.01553135085850954 2023-01-22 15:28:33.325973: step: 368/464, loss: 0.0332174189388752 2023-01-22 15:28:33.988021: step: 370/464, loss: 0.009961692616343498 2023-01-22 15:28:34.715116: step: 372/464, loss: 0.03649236634373665 2023-01-22 15:28:35.420176: step: 374/464, loss: 0.022455479949712753 2023-01-22 15:28:36.217277: step: 376/464, loss: 0.07478629052639008 2023-01-22 15:28:36.962397: step: 378/464, loss: 0.020302915945649147 2023-01-22 15:28:37.760683: step: 380/464, loss: 0.3236561715602875 2023-01-22 15:28:38.513393: step: 382/464, loss: 0.005948520265519619 2023-01-22 15:28:39.227563: step: 384/464, loss: 0.019666900858283043 2023-01-22 15:28:39.991644: step: 386/464, loss: 0.003094746032729745 2023-01-22 15:28:40.840088: step: 388/464, loss: 0.010951261967420578 2023-01-22 15:28:41.644223: step: 390/464, loss: 0.004105293191969395 2023-01-22 15:28:42.528721: step: 392/464, loss: 0.016868796199560165 2023-01-22 15:28:43.260400: step: 394/464, loss: 0.020495926961302757 
2023-01-22 15:28:44.013522: step: 396/464, loss: 0.13992923498153687 2023-01-22 15:28:44.758007: step: 398/464, loss: 0.004130828194320202 2023-01-22 15:28:45.462930: step: 400/464, loss: 0.01311857532709837 2023-01-22 15:28:46.248421: step: 402/464, loss: 0.002984261605888605 2023-01-22 15:28:47.049379: step: 404/464, loss: 0.007484616246074438 2023-01-22 15:28:47.770596: step: 406/464, loss: 0.0018927458440884948 2023-01-22 15:28:48.489135: step: 408/464, loss: 0.04563937708735466 2023-01-22 15:28:49.234886: step: 410/464, loss: 0.43953967094421387 2023-01-22 15:28:49.982726: step: 412/464, loss: 0.012645837850868702 2023-01-22 15:28:50.749798: step: 414/464, loss: 0.8327988386154175 2023-01-22 15:28:51.485484: step: 416/464, loss: 0.018644271418452263 2023-01-22 15:28:52.201413: step: 418/464, loss: 0.015001460909843445 2023-01-22 15:28:52.880275: step: 420/464, loss: 0.007504230365157127 2023-01-22 15:28:53.663416: step: 422/464, loss: 0.029034053906798363 2023-01-22 15:28:54.424525: step: 424/464, loss: 0.005283354315906763 2023-01-22 15:28:55.144337: step: 426/464, loss: 0.03228038176894188 2023-01-22 15:28:55.915970: step: 428/464, loss: 0.051429443061351776 2023-01-22 15:28:56.638688: step: 430/464, loss: 0.0792945921421051 2023-01-22 15:28:57.330949: step: 432/464, loss: 0.010515585541725159 2023-01-22 15:28:58.003984: step: 434/464, loss: 0.044191498309373856 2023-01-22 15:28:58.730372: step: 436/464, loss: 0.002709238789975643 2023-01-22 15:28:59.616564: step: 438/464, loss: 0.07902417331933975 2023-01-22 15:29:00.385966: step: 440/464, loss: 0.28564509749412537 2023-01-22 15:29:01.162463: step: 442/464, loss: 0.004908657167106867 2023-01-22 15:29:01.896459: step: 444/464, loss: 0.015626709908246994 2023-01-22 15:29:02.693131: step: 446/464, loss: 0.134577214717865 2023-01-22 15:29:03.466754: step: 448/464, loss: 0.006557188928127289 2023-01-22 15:29:04.216170: step: 450/464, loss: 0.06196486949920654 2023-01-22 15:29:04.984212: step: 452/464, loss: 
0.009614722803235054 2023-01-22 15:29:05.747806: step: 454/464, loss: 0.32114338874816895 2023-01-22 15:29:06.488598: step: 456/464, loss: 0.08962027728557587 2023-01-22 15:29:07.218492: step: 458/464, loss: 0.0753081664443016 2023-01-22 15:29:07.958192: step: 460/464, loss: 0.02802230417728424 2023-01-22 15:29:08.648326: step: 462/464, loss: 0.006656539160758257 2023-01-22 15:29:09.317376: step: 464/464, loss: 0.009935715235769749 2023-01-22 15:29:10.018906: step: 466/464, loss: 0.025795668363571167 2023-01-22 15:29:10.722095: step: 468/464, loss: 0.0011418864596635103 2023-01-22 15:29:11.537410: step: 470/464, loss: 0.014790812507271767 2023-01-22 15:29:12.341520: step: 472/464, loss: 0.0025786729529500008 2023-01-22 15:29:13.186801: step: 474/464, loss: 0.08187508583068848 2023-01-22 15:29:13.877513: step: 476/464, loss: 0.021638842299580574 2023-01-22 15:29:14.605242: step: 478/464, loss: 0.004650462418794632 2023-01-22 15:29:15.294191: step: 480/464, loss: 0.031360264867544174 2023-01-22 15:29:16.052593: step: 482/464, loss: 0.0027292489539831877 2023-01-22 15:29:16.851324: step: 484/464, loss: 0.003940966445952654 2023-01-22 15:29:17.595850: step: 486/464, loss: 0.0316765122115612 2023-01-22 15:29:18.356735: step: 488/464, loss: 0.02473451755940914 2023-01-22 15:29:19.186050: step: 490/464, loss: 0.07920479029417038 2023-01-22 15:29:19.895335: step: 492/464, loss: 0.26752930879592896 2023-01-22 15:29:20.652226: step: 494/464, loss: 0.04645884409546852 2023-01-22 15:29:21.408551: step: 496/464, loss: 0.014812164008617401 2023-01-22 15:29:22.133267: step: 498/464, loss: 0.048786476254463196 2023-01-22 15:29:22.783466: step: 500/464, loss: 0.023341378197073936 2023-01-22 15:29:23.462696: step: 502/464, loss: 0.006561134476214647 2023-01-22 15:29:24.229345: step: 504/464, loss: 0.029558319598436356 2023-01-22 15:29:25.095209: step: 506/464, loss: 0.051057860255241394 2023-01-22 15:29:25.907440: step: 508/464, loss: 0.00839884765446186 2023-01-22 15:29:26.536721: 
step: 510/464, loss: 0.03669516742229462 2023-01-22 15:29:27.299690: step: 512/464, loss: 0.03794403746724129 2023-01-22 15:29:28.007613: step: 514/464, loss: 0.020145447924733162 2023-01-22 15:29:28.879123: step: 516/464, loss: 0.0722341388463974 2023-01-22 15:29:29.631085: step: 518/464, loss: 0.03425714001059532 2023-01-22 15:29:30.362611: step: 520/464, loss: 0.05341227725148201 2023-01-22 15:29:31.125976: step: 522/464, loss: 0.014097227714955807 2023-01-22 15:29:31.805753: step: 524/464, loss: 0.004033136647194624 2023-01-22 15:29:32.520694: step: 526/464, loss: 0.008113464340567589 2023-01-22 15:29:33.199156: step: 528/464, loss: 0.013366038911044598 2023-01-22 15:29:33.934944: step: 530/464, loss: 0.7725804448127747 2023-01-22 15:29:34.653674: step: 532/464, loss: 0.07331458479166031 2023-01-22 15:29:35.363064: step: 534/464, loss: 0.0014355615712702274 2023-01-22 15:29:36.074549: step: 536/464, loss: 0.06860658526420593 2023-01-22 15:29:36.783360: step: 538/464, loss: 0.025517934933304787 2023-01-22 15:29:37.522193: step: 540/464, loss: 0.009260217659175396 2023-01-22 15:29:38.265416: step: 542/464, loss: 0.019623389467597008 2023-01-22 15:29:38.890932: step: 544/464, loss: 0.008851874619722366 2023-01-22 15:29:39.573955: step: 546/464, loss: 0.02663077786564827 2023-01-22 15:29:40.351965: step: 548/464, loss: 0.020357387140393257 2023-01-22 15:29:41.113291: step: 550/464, loss: 0.002843934576958418 2023-01-22 15:29:41.823002: step: 552/464, loss: 0.0904221311211586 2023-01-22 15:29:42.615201: step: 554/464, loss: 0.01523869764059782 2023-01-22 15:29:43.259979: step: 556/464, loss: 0.03416203334927559 2023-01-22 15:29:43.974661: step: 558/464, loss: 0.03625953197479248 2023-01-22 15:29:44.721163: step: 560/464, loss: 0.30425795912742615 2023-01-22 15:29:45.486901: step: 562/464, loss: 0.023968152701854706 2023-01-22 15:29:46.242804: step: 564/464, loss: 0.06374843418598175 2023-01-22 15:29:46.984015: step: 566/464, loss: 0.03259289264678955 2023-01-22 
15:29:47.687953: step: 568/464, loss: 0.02984294667840004 2023-01-22 15:29:48.395543: step: 570/464, loss: 0.013688970357179642 2023-01-22 15:29:49.138148: step: 572/464, loss: 0.015799837186932564 2023-01-22 15:29:50.012583: step: 574/464, loss: 0.08788468688726425 2023-01-22 15:29:50.693808: step: 576/464, loss: 0.022226519882678986 2023-01-22 15:29:51.445276: step: 578/464, loss: 0.0028302574064582586 2023-01-22 15:29:52.119144: step: 580/464, loss: 0.0019513736478984356 2023-01-22 15:29:52.920772: step: 582/464, loss: 0.029709680005908012 2023-01-22 15:29:53.719500: step: 584/464, loss: 0.06255444884300232 2023-01-22 15:29:54.461790: step: 586/464, loss: 0.0240376777946949 2023-01-22 15:29:55.118267: step: 588/464, loss: 0.002091553993523121 2023-01-22 15:29:55.854121: step: 590/464, loss: 0.008803433738648891 2023-01-22 15:29:56.540765: step: 592/464, loss: 0.02129915915429592 2023-01-22 15:29:57.290149: step: 594/464, loss: 0.01786724478006363 2023-01-22 15:29:58.085868: step: 596/464, loss: 0.008490340784192085 2023-01-22 15:29:58.793705: step: 598/464, loss: 1.899601697921753 2023-01-22 15:29:59.579045: step: 600/464, loss: 0.014766247011721134 2023-01-22 15:30:00.297359: step: 602/464, loss: 0.004581265151500702 2023-01-22 15:30:00.981120: step: 604/464, loss: 0.008937230333685875 2023-01-22 15:30:01.864371: step: 606/464, loss: 0.014192801900207996 2023-01-22 15:30:02.672501: step: 608/464, loss: 0.0029364165384322405 2023-01-22 15:30:03.410660: step: 610/464, loss: 0.005007661413401365 2023-01-22 15:30:04.201398: step: 612/464, loss: 0.1304178237915039 2023-01-22 15:30:04.940192: step: 614/464, loss: 0.0005642864853143692 2023-01-22 15:30:05.608767: step: 616/464, loss: 0.0005818564095534384 2023-01-22 15:30:06.352952: step: 618/464, loss: 0.018873225897550583 2023-01-22 15:30:07.131908: step: 620/464, loss: 0.0623464472591877 2023-01-22 15:30:07.879859: step: 622/464, loss: 0.05250658467411995 2023-01-22 15:30:08.588730: step: 624/464, loss: 
0.20492692291736603 2023-01-22 15:30:09.289550: step: 626/464, loss: 0.007355514448136091 2023-01-22 15:30:09.984682: step: 628/464, loss: 0.006511658895760775 2023-01-22 15:30:10.685065: step: 630/464, loss: 0.43961629271507263 2023-01-22 15:30:11.456783: step: 632/464, loss: 0.02192610315978527 2023-01-22 15:30:12.145774: step: 634/464, loss: 0.011408335529267788 2023-01-22 15:30:12.903122: step: 636/464, loss: 0.03662848100066185 2023-01-22 15:30:13.647633: step: 638/464, loss: 0.00030731482547707856 2023-01-22 15:30:14.425991: step: 640/464, loss: 0.02446604333817959 2023-01-22 15:30:15.126098: step: 642/464, loss: 0.0036431632470339537 2023-01-22 15:30:15.902447: step: 644/464, loss: 0.0008363331435248256 2023-01-22 15:30:16.622688: step: 646/464, loss: 0.6851815581321716 2023-01-22 15:30:17.328969: step: 648/464, loss: 0.0022539652418345213 2023-01-22 15:30:17.986213: step: 650/464, loss: 0.004135349299758673 2023-01-22 15:30:18.764435: step: 652/464, loss: 0.01510525867342949 2023-01-22 15:30:19.485942: step: 654/464, loss: 0.02965255081653595 2023-01-22 15:30:20.283429: step: 656/464, loss: 0.015557816252112389 2023-01-22 15:30:20.999951: step: 658/464, loss: 0.02469242550432682 2023-01-22 15:30:21.688454: step: 660/464, loss: 0.2715980112552643 2023-01-22 15:30:22.564749: step: 662/464, loss: 0.01316728163510561 2023-01-22 15:30:23.367741: step: 664/464, loss: 0.004120450001209974 2023-01-22 15:30:24.095566: step: 666/464, loss: 0.004591710865497589 2023-01-22 15:30:24.732935: step: 668/464, loss: 0.03265814483165741 2023-01-22 15:30:25.443094: step: 670/464, loss: 0.0098827900364995 2023-01-22 15:30:26.140836: step: 672/464, loss: 0.026648204773664474 2023-01-22 15:30:26.875264: step: 674/464, loss: 0.010770948603749275 2023-01-22 15:30:27.587814: step: 676/464, loss: 0.012302640825510025 2023-01-22 15:30:28.403303: step: 678/464, loss: 0.013544720597565174 2023-01-22 15:30:29.200739: step: 680/464, loss: 0.01645328849554062 2023-01-22 15:30:29.935155: 
step: 682/464, loss: 0.015378853306174278 2023-01-22 15:30:30.645745: step: 684/464, loss: 0.08601805567741394 2023-01-22 15:30:31.407379: step: 686/464, loss: 0.001477090292610228 2023-01-22 15:30:32.099171: step: 688/464, loss: 0.08889120817184448 2023-01-22 15:30:32.786897: step: 690/464, loss: 0.014964031986892223 2023-01-22 15:30:33.524526: step: 692/464, loss: 0.027231650426983833 2023-01-22 15:30:34.293365: step: 694/464, loss: 0.08971980959177017 2023-01-22 15:30:35.128052: step: 696/464, loss: 0.2868679463863373 2023-01-22 15:30:35.917198: step: 698/464, loss: 0.0069223428145051 2023-01-22 15:30:36.558962: step: 700/464, loss: 0.017354296520352364 2023-01-22 15:30:37.233723: step: 702/464, loss: 0.00427975133061409 2023-01-22 15:30:37.970068: step: 704/464, loss: 0.0057968138717114925 2023-01-22 15:30:38.673912: step: 706/464, loss: 0.028853895142674446 2023-01-22 15:30:39.415930: step: 708/464, loss: 0.0894998162984848 2023-01-22 15:30:40.074422: step: 710/464, loss: 0.0027538700960576534 2023-01-22 15:30:40.842165: step: 712/464, loss: 0.0010236428352072835 2023-01-22 15:30:41.515401: step: 714/464, loss: 0.09417460858821869 2023-01-22 15:30:42.308970: step: 716/464, loss: 0.03869073837995529 2023-01-22 15:30:43.060602: step: 718/464, loss: 0.004569509066641331 2023-01-22 15:30:43.794649: step: 720/464, loss: 0.042331527918577194 2023-01-22 15:30:44.560872: step: 722/464, loss: 0.03878622502088547 2023-01-22 15:30:45.300778: step: 724/464, loss: 0.0014780150959268212 2023-01-22 15:30:45.955254: step: 726/464, loss: 0.0032074516639113426 2023-01-22 15:30:46.662653: step: 728/464, loss: 0.01672195829451084 2023-01-22 15:30:47.336504: step: 730/464, loss: 0.008378949947655201 2023-01-22 15:30:48.076204: step: 732/464, loss: 0.005625077523291111 2023-01-22 15:30:48.777668: step: 734/464, loss: 0.01687205582857132 2023-01-22 15:30:49.533193: step: 736/464, loss: 0.00461554154753685 2023-01-22 15:30:50.242076: step: 738/464, loss: 0.06471758335828781 
2023-01-22 15:30:50.962405: step: 740/464, loss: 0.009379612281918526 2023-01-22 15:30:51.632200: step: 742/464, loss: 0.00973975658416748 2023-01-22 15:30:52.437893: step: 744/464, loss: 0.0017307644011452794 2023-01-22 15:30:53.090364: step: 746/464, loss: 0.046656664460897446 2023-01-22 15:30:53.865964: step: 748/464, loss: 0.009203736670315266 2023-01-22 15:30:54.655259: step: 750/464, loss: 0.08026555925607681 2023-01-22 15:30:55.378173: step: 752/464, loss: 0.011526745744049549 2023-01-22 15:30:56.136493: step: 754/464, loss: 0.02811591513454914 2023-01-22 15:30:56.900347: step: 756/464, loss: 0.011061850935220718 2023-01-22 15:30:57.553566: step: 758/464, loss: 0.002207051729783416 2023-01-22 15:30:58.233638: step: 760/464, loss: 0.07346905767917633 2023-01-22 15:30:58.941363: step: 762/464, loss: 0.09290290623903275 2023-01-22 15:30:59.729233: step: 764/464, loss: 0.004080015700310469 2023-01-22 15:31:00.475170: step: 766/464, loss: 0.01330144889652729 2023-01-22 15:31:01.206857: step: 768/464, loss: 0.057446204125881195 2023-01-22 15:31:01.952155: step: 770/464, loss: 0.002051639137789607 2023-01-22 15:31:02.694625: step: 772/464, loss: 0.04154136776924133 2023-01-22 15:31:03.341864: step: 774/464, loss: 0.0038536693900823593 2023-01-22 15:31:04.080790: step: 776/464, loss: 0.05529932305216789 2023-01-22 15:31:04.827553: step: 778/464, loss: 0.05054425820708275 2023-01-22 15:31:05.519544: step: 780/464, loss: 0.09152450412511826 2023-01-22 15:31:06.298603: step: 782/464, loss: 0.07490304112434387 2023-01-22 15:31:07.056050: step: 784/464, loss: 0.015331946313381195 2023-01-22 15:31:07.769737: step: 786/464, loss: 0.004108444321900606 2023-01-22 15:31:08.536583: step: 788/464, loss: 0.005928609520196915 2023-01-22 15:31:09.226669: step: 790/464, loss: 0.0025638609658926725 2023-01-22 15:31:09.926517: step: 792/464, loss: 6.599909102078527e-05 2023-01-22 15:31:10.647527: step: 794/464, loss: 0.006902155466377735 2023-01-22 15:31:11.409758: step: 796/464, 
loss: 0.01019245758652687 2023-01-22 15:31:12.134410: step: 798/464, loss: 0.0029711266979575157 2023-01-22 15:31:12.848530: step: 800/464, loss: 0.01275500375777483 2023-01-22 15:31:13.579277: step: 802/464, loss: 0.004396913107484579 2023-01-22 15:31:14.349520: step: 804/464, loss: 0.013614018447697163 2023-01-22 15:31:15.044143: step: 806/464, loss: 0.0005398777429945767 2023-01-22 15:31:15.678457: step: 808/464, loss: 0.09579914063215256 2023-01-22 15:31:16.433828: step: 810/464, loss: 0.03894653171300888 2023-01-22 15:31:17.245600: step: 812/464, loss: 0.1491641104221344 2023-01-22 15:31:17.966995: step: 814/464, loss: 0.0009035103139467537 2023-01-22 15:31:18.726603: step: 816/464, loss: 0.06483941525220871 2023-01-22 15:31:19.436782: step: 818/464, loss: 0.039540067315101624 2023-01-22 15:31:20.074296: step: 820/464, loss: 0.0060846139676868916 2023-01-22 15:31:20.824520: step: 822/464, loss: 0.025557899847626686 2023-01-22 15:31:21.521423: step: 824/464, loss: 0.006232323590666056 2023-01-22 15:31:22.225906: step: 826/464, loss: 0.0024852799251675606 2023-01-22 15:31:22.892124: step: 828/464, loss: 0.06190106272697449 2023-01-22 15:31:23.566130: step: 830/464, loss: 0.007011691574007273 2023-01-22 15:31:24.332752: step: 832/464, loss: 0.012379252351820469 2023-01-22 15:31:25.027901: step: 834/464, loss: 0.3338109254837036 2023-01-22 15:31:25.750625: step: 836/464, loss: 0.005469049327075481 2023-01-22 15:31:26.462748: step: 838/464, loss: 0.00754694314673543 2023-01-22 15:31:27.210808: step: 840/464, loss: 0.037966564297676086 2023-01-22 15:31:27.926820: step: 842/464, loss: 0.41443148255348206 2023-01-22 15:31:28.665450: step: 844/464, loss: 0.018485354259610176 2023-01-22 15:31:29.390757: step: 846/464, loss: 0.025510603561997414 2023-01-22 15:31:30.156718: step: 848/464, loss: 0.013038084842264652 2023-01-22 15:31:30.900479: step: 850/464, loss: 0.00841561984270811 2023-01-22 15:31:31.647046: step: 852/464, loss: 0.06011233106255531 2023-01-22 
15:31:32.374579: step: 854/464, loss: 0.0247954148799181 2023-01-22 15:31:33.058059: step: 856/464, loss: 0.00681648775935173 2023-01-22 15:31:33.818527: step: 858/464, loss: 0.037794142961502075 2023-01-22 15:31:34.560691: step: 860/464, loss: 0.020910143852233887 2023-01-22 15:31:35.229053: step: 862/464, loss: 0.09857627749443054 2023-01-22 15:31:35.952397: step: 864/464, loss: 0.03198883682489395 2023-01-22 15:31:36.668358: step: 866/464, loss: 0.05377843603491783 2023-01-22 15:31:37.354611: step: 868/464, loss: 0.013491624034941196 2023-01-22 15:31:38.065336: step: 870/464, loss: 0.018908044323325157 2023-01-22 15:31:38.756866: step: 872/464, loss: 0.04457402229309082 2023-01-22 15:31:39.493775: step: 874/464, loss: 0.0008078064420260489 2023-01-22 15:31:40.193288: step: 876/464, loss: 0.07652704417705536 2023-01-22 15:31:40.992595: step: 878/464, loss: 0.018900269642472267 2023-01-22 15:31:41.740786: step: 880/464, loss: 2.967814725707285e-05 2023-01-22 15:31:42.696639: step: 882/464, loss: 0.04611194506287575 2023-01-22 15:31:43.409811: step: 884/464, loss: 0.01773378811776638 2023-01-22 15:31:44.209990: step: 886/464, loss: 0.01149496715515852 2023-01-22 15:31:44.969017: step: 888/464, loss: 0.04522504657506943 2023-01-22 15:31:45.668860: step: 890/464, loss: 0.0013022801140323281 2023-01-22 15:31:46.361961: step: 892/464, loss: 0.009706506505608559 2023-01-22 15:31:47.054305: step: 894/464, loss: 0.00707974610850215 2023-01-22 15:31:47.737123: step: 896/464, loss: 0.026814324781298637 2023-01-22 15:31:48.510874: step: 898/464, loss: 0.013630975037813187 2023-01-22 15:31:49.189332: step: 900/464, loss: 0.024357259273529053 2023-01-22 15:31:49.837624: step: 902/464, loss: 0.01049887202680111 2023-01-22 15:31:50.625446: step: 904/464, loss: 0.03636635094881058 2023-01-22 15:31:51.336526: step: 906/464, loss: 0.026952916756272316 2023-01-22 15:31:52.119823: step: 908/464, loss: 0.04131436347961426 2023-01-22 15:31:52.871886: step: 910/464, loss: 
0.010415855795145035 2023-01-22 15:31:53.583918: step: 912/464, loss: 0.0918290987610817 2023-01-22 15:31:54.287390: step: 914/464, loss: 0.028102584183216095 2023-01-22 15:31:55.007154: step: 916/464, loss: 0.01105829793959856 2023-01-22 15:31:55.673561: step: 918/464, loss: 0.044654421508312225 2023-01-22 15:31:56.432216: step: 920/464, loss: 0.011219141073524952 2023-01-22 15:31:57.234064: step: 922/464, loss: 0.039576154202222824 2023-01-22 15:31:57.936514: step: 924/464, loss: 0.12635676562786102 2023-01-22 15:31:58.597091: step: 926/464, loss: 0.03394423425197601 2023-01-22 15:31:59.381049: step: 928/464, loss: 0.043058667331933975 2023-01-22 15:32:00.015142: step: 930/464, loss: 0.0004498627968132496
==================================================
Loss: 0.047
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33209727112676063, 'r': 0.35793406072106265, 'f1': 0.3445319634703197}, 'combined': 0.2538656572939198, 'epoch': 29}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2955936103711571, 'r': 0.29150032595454367, 'f1': 0.2935326987450633}, 'combined': 0.18229925501009198, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31042227250489235, 'r': 0.34399735700731904, 'f1': 0.32634852770991385}, 'combined': 0.24046733620730493, 'epoch': 29}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2907350567680675, 'r': 0.29102262853539496, 'f1': 0.29087877157615843}, 'combined': 0.18065102655782472, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3249044403336383, 'r': 0.34956511891683667, 'f1': 0.33678394455059035}, 'combined': 0.2481565907214876, 'epoch': 29}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30685896666241314, 'r': 0.29927096056294694, 'f1': 0.30301746733013457}, 'combined': 0.1881897954997678, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.26121794871794873, 'r': 0.2910714285714286, 'f1': 0.27533783783783783}, 'combined': 0.18355855855855854, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28289473684210525, 'r': 0.4673913043478261, 'f1': 0.3524590163934426}, 'combined': 0.1762295081967213, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4097222222222222, 'r': 0.2543103448275862, 'f1': 0.31382978723404253}, 'combined': 0.20921985815602834, 'epoch': 29}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
****************************** Epoch: 30 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 15:34:40.718607: step: 2/464, loss: 0.2376316487789154 2023-01-22 15:34:41.563191: step: 4/464, loss: 0.021520117297768593 2023-01-22 15:34:42.353909: step: 6/464, loss: 0.025482257828116417 2023-01-22 15:34:43.045697: step: 8/464, loss: 0.026679327711462975 2023-01-22 15:34:43.788156: step: 10/464, loss: 0.11775459349155426 2023-01-22 15:34:44.528412: step: 12/464, loss: 0.011979639530181885 2023-01-22 15:34:45.296410: step: 14/464, loss: 0.016509508714079857 2023-01-22 15:34:45.970907: step: 16/464, loss: 0.001127360388636589 2023-01-22 15:34:46.702441: step: 18/464, loss: 0.0002913039061240852 2023-01-22 15:34:47.548332: step: 20/464, loss: 0.0073746065609157085 2023-01-22 15:34:48.279538: step: 22/464, loss: 0.0017119019757956266 2023-01-22 15:34:49.033917: step: 24/464, loss: 0.01746697537600994 2023-01-22 15:34:49.828678: step: 
26/464, loss: 0.003907793201506138 2023-01-22 15:34:50.613474: step: 28/464, loss: 0.03076671063899994 2023-01-22 15:34:51.329324: step: 30/464, loss: 0.02289789356291294 2023-01-22 15:34:52.116087: step: 32/464, loss: 1.358851671218872 2023-01-22 15:34:52.872873: step: 34/464, loss: 0.0058686695992946625 2023-01-22 15:34:53.655359: step: 36/464, loss: 0.000436199305113405 2023-01-22 15:34:54.357906: step: 38/464, loss: 0.08685613423585892 2023-01-22 15:34:55.111195: step: 40/464, loss: 0.013516401872038841 2023-01-22 15:34:55.785282: step: 42/464, loss: 0.005017260555177927 2023-01-22 15:34:56.552105: step: 44/464, loss: 0.007543819025158882 2023-01-22 15:34:57.354960: step: 46/464, loss: 0.23378436267375946 2023-01-22 15:34:58.015032: step: 48/464, loss: 0.024231471121311188 2023-01-22 15:34:58.729219: step: 50/464, loss: 0.021258708089590073 2023-01-22 15:34:59.464809: step: 52/464, loss: 0.011662538163363934 2023-01-22 15:35:00.195913: step: 54/464, loss: 0.2266659438610077 2023-01-22 15:35:01.008602: step: 56/464, loss: 0.024330206215381622 2023-01-22 15:35:01.719640: step: 58/464, loss: 0.0032375429291278124 2023-01-22 15:35:02.471725: step: 60/464, loss: 0.018813833594322205 2023-01-22 15:35:03.178104: step: 62/464, loss: 0.040911901742219925 2023-01-22 15:35:03.896850: step: 64/464, loss: 0.01895122602581978 2023-01-22 15:35:04.580002: step: 66/464, loss: 0.00867029745131731 2023-01-22 15:35:05.302263: step: 68/464, loss: 0.023470614105463028 2023-01-22 15:35:06.042931: step: 70/464, loss: 0.018438637256622314 2023-01-22 15:35:06.796604: step: 72/464, loss: 0.0023036121856421232 2023-01-22 15:35:07.531561: step: 74/464, loss: 0.002937216544523835 2023-01-22 15:35:08.272195: step: 76/464, loss: 0.5913159847259521 2023-01-22 15:35:09.035291: step: 78/464, loss: 0.007344986777752638 2023-01-22 15:35:09.784339: step: 80/464, loss: 0.009015419520437717 2023-01-22 15:35:10.567943: step: 82/464, loss: 0.10811667144298553 2023-01-22 15:35:11.237066: step: 84/464, 
loss: 0.010973623022437096 2023-01-22 15:35:11.957353: step: 86/464, loss: 0.00012188367691123858 2023-01-22 15:35:12.733186: step: 88/464, loss: 0.04058532416820526 2023-01-22 15:35:13.565342: step: 90/464, loss: 0.038530707359313965 2023-01-22 15:35:14.316187: step: 92/464, loss: 0.008467328734695911 2023-01-22 15:35:15.028429: step: 94/464, loss: 0.003021214623004198 2023-01-22 15:35:15.774992: step: 96/464, loss: 0.07035733014345169 2023-01-22 15:35:16.495860: step: 98/464, loss: 0.007195932790637016 2023-01-22 15:35:17.277442: step: 100/464, loss: 0.012332151643931866 2023-01-22 15:35:18.037701: step: 102/464, loss: 0.003940146416425705 2023-01-22 15:35:18.719705: step: 104/464, loss: 0.0043151951394975185 2023-01-22 15:35:19.449598: step: 106/464, loss: 0.062128935009241104 2023-01-22 15:35:20.198075: step: 108/464, loss: 0.021147824823856354 2023-01-22 15:35:20.982238: step: 110/464, loss: 0.004816403146833181 2023-01-22 15:35:21.759041: step: 112/464, loss: 0.07347775250673294 2023-01-22 15:35:22.511758: step: 114/464, loss: 0.0327020138502121 2023-01-22 15:35:23.215027: step: 116/464, loss: 0.0873594582080841 2023-01-22 15:35:23.927159: step: 118/464, loss: 0.03273965045809746 2023-01-22 15:35:24.616650: step: 120/464, loss: 0.03554631397128105 2023-01-22 15:35:25.437434: step: 122/464, loss: 0.14380039274692535 2023-01-22 15:35:26.203111: step: 124/464, loss: 0.05833486095070839 2023-01-22 15:35:26.928753: step: 126/464, loss: 0.006037155166268349 2023-01-22 15:35:27.584153: step: 128/464, loss: 0.0014223945327103138 2023-01-22 15:35:28.330741: step: 130/464, loss: 0.01266569271683693 2023-01-22 15:35:29.084309: step: 132/464, loss: 0.02276783436536789 2023-01-22 15:35:29.885391: step: 134/464, loss: 0.010872559621930122 2023-01-22 15:35:30.822437: step: 136/464, loss: 0.005215835757553577 2023-01-22 15:35:31.496007: step: 138/464, loss: 0.002627200447022915 2023-01-22 15:35:32.173934: step: 140/464, loss: 0.10269683599472046 2023-01-22 15:35:32.963659: 
step: 142/464, loss: 0.0073095522820949554 2023-01-22 15:35:33.722761: step: 144/464, loss: 0.13490694761276245 2023-01-22 15:35:34.475624: step: 146/464, loss: 0.008064567111432552 2023-01-22 15:35:35.154628: step: 148/464, loss: 0.0015113273402675986 2023-01-22 15:35:35.891440: step: 150/464, loss: 0.0027289805002510548 2023-01-22 15:35:36.635468: step: 152/464, loss: 0.012556140311062336 2023-01-22 15:35:37.320988: step: 154/464, loss: 0.013792785815894604 2023-01-22 15:35:38.098386: step: 156/464, loss: 0.037203893065452576 2023-01-22 15:35:38.840546: step: 158/464, loss: 0.13290239870548248 2023-01-22 15:35:39.616987: step: 160/464, loss: 0.03926850110292435 2023-01-22 15:35:40.377044: step: 162/464, loss: 0.09682679176330566 2023-01-22 15:35:41.102190: step: 164/464, loss: 0.05842824652791023 2023-01-22 15:35:41.860718: step: 166/464, loss: 0.00015586864901706576 2023-01-22 15:35:42.653055: step: 168/464, loss: 0.01858036406338215 2023-01-22 15:35:43.341558: step: 170/464, loss: 0.005358466412872076 2023-01-22 15:35:44.136417: step: 172/464, loss: 0.021037913858890533 2023-01-22 15:35:44.916548: step: 174/464, loss: 0.02369670197367668 2023-01-22 15:35:45.678836: step: 176/464, loss: 0.009025280363857746 2023-01-22 15:35:46.351680: step: 178/464, loss: 0.025579875335097313 2023-01-22 15:35:47.104977: step: 180/464, loss: 0.18517696857452393 2023-01-22 15:35:47.906262: step: 182/464, loss: 0.4869004189968109 2023-01-22 15:35:48.696067: step: 184/464, loss: 0.0266411192715168 2023-01-22 15:35:49.351086: step: 186/464, loss: 0.10192040354013443 2023-01-22 15:35:50.155061: step: 188/464, loss: 0.02971969172358513 2023-01-22 15:35:50.971725: step: 190/464, loss: 0.04456546530127525 2023-01-22 15:35:51.788838: step: 192/464, loss: 0.05243932828307152 2023-01-22 15:35:52.643087: step: 194/464, loss: 0.045412931591272354 2023-01-22 15:35:53.432305: step: 196/464, loss: 0.02161242440342903 2023-01-22 15:35:54.093008: step: 198/464, loss: 0.0315740630030632 2023-01-22 
15:35:54.838444: step: 200/464, loss: 0.051023464649915695 2023-01-22 15:35:55.570511: step: 202/464, loss: 0.010172654874622822 2023-01-22 15:35:56.420768: step: 204/464, loss: 0.0024958052672445774 2023-01-22 15:35:57.138993: step: 206/464, loss: 0.011833332479000092 2023-01-22 15:35:57.917842: step: 208/464, loss: 0.05117599666118622 2023-01-22 15:35:58.751504: step: 210/464, loss: 0.09916941076517105 2023-01-22 15:35:59.537418: step: 212/464, loss: 0.014036203734576702 2023-01-22 15:36:00.289669: step: 214/464, loss: 0.0511137992143631 2023-01-22 15:36:01.031630: step: 216/464, loss: 0.06060526520013809 2023-01-22 15:36:01.714583: step: 218/464, loss: 0.003811330534517765 2023-01-22 15:36:02.430052: step: 220/464, loss: 0.020781518891453743 2023-01-22 15:36:03.175007: step: 222/464, loss: 0.006058407947421074 2023-01-22 15:36:03.924628: step: 224/464, loss: 0.003760756691917777 2023-01-22 15:36:04.758392: step: 226/464, loss: 0.10233800113201141 2023-01-22 15:36:05.501759: step: 228/464, loss: 0.004481780342757702 2023-01-22 15:36:06.214117: step: 230/464, loss: 0.01326355617493391 2023-01-22 15:36:06.957247: step: 232/464, loss: 0.0025657941587269306 2023-01-22 15:36:07.669886: step: 234/464, loss: 0.005028776824474335 2023-01-22 15:36:08.436341: step: 236/464, loss: 0.0007116646156646311 2023-01-22 15:36:09.098721: step: 238/464, loss: 0.05916238948702812 2023-01-22 15:36:09.927966: step: 240/464, loss: 0.007669151294976473 2023-01-22 15:36:10.664403: step: 242/464, loss: 0.030830563977360725 2023-01-22 15:36:11.481496: step: 244/464, loss: 0.012058115564286709 2023-01-22 15:36:12.302653: step: 246/464, loss: 0.005145329050719738 2023-01-22 15:36:13.029673: step: 248/464, loss: 0.0026552604977041483 2023-01-22 15:36:13.769531: step: 250/464, loss: 0.015001723542809486 2023-01-22 15:36:14.555688: step: 252/464, loss: 0.07432577013969421 2023-01-22 15:36:15.279938: step: 254/464, loss: 0.011798490770161152 2023-01-22 15:36:16.023340: step: 256/464, loss: 
0.0030184625647962093 2023-01-22 15:36:16.775900: step: 258/464, loss: 0.012257128022611141 2023-01-22 15:36:17.427159: step: 260/464, loss: 0.0036082889419049025 2023-01-22 15:36:18.128801: step: 262/464, loss: 0.005945158191025257 2023-01-22 15:36:18.893402: step: 264/464, loss: 0.0023442127276211977 2023-01-22 15:36:19.569965: step: 266/464, loss: 0.02518412098288536 2023-01-22 15:36:20.314870: step: 268/464, loss: 0.01687704771757126 2023-01-22 15:36:21.111255: step: 270/464, loss: 0.06268581748008728 2023-01-22 15:36:21.811748: step: 272/464, loss: 0.3735574781894684 2023-01-22 15:36:22.515477: step: 274/464, loss: 0.011284289881587029 2023-01-22 15:36:23.279263: step: 276/464, loss: 0.014095884747803211 2023-01-22 15:36:23.925534: step: 278/464, loss: 0.018179569393396378 2023-01-22 15:36:24.632790: step: 280/464, loss: 0.04955065995454788 2023-01-22 15:36:25.394346: step: 282/464, loss: 0.003387266304343939 2023-01-22 15:36:26.085542: step: 284/464, loss: 0.01337368693202734 2023-01-22 15:36:26.859074: step: 286/464, loss: 0.024945693090558052 2023-01-22 15:36:27.557808: step: 288/464, loss: 0.013176217675209045 2023-01-22 15:36:28.334473: step: 290/464, loss: 0.00446714460849762 2023-01-22 15:36:29.087081: step: 292/464, loss: 0.011248644441366196 2023-01-22 15:36:29.892037: step: 294/464, loss: 0.012288033030927181 2023-01-22 15:36:30.636103: step: 296/464, loss: 0.017076879739761353 2023-01-22 15:36:31.422320: step: 298/464, loss: 0.2934398949146271 2023-01-22 15:36:32.105821: step: 300/464, loss: 0.034979406744241714 2023-01-22 15:36:32.842393: step: 302/464, loss: 0.06029561907052994 2023-01-22 15:36:33.569881: step: 304/464, loss: 0.001526201725937426 2023-01-22 15:36:34.284043: step: 306/464, loss: 0.021015586331486702 2023-01-22 15:36:34.987330: step: 308/464, loss: 0.005308941472321749 2023-01-22 15:36:35.726233: step: 310/464, loss: 0.26065701246261597 2023-01-22 15:36:36.369989: step: 312/464, loss: 0.0007154019549489021 2023-01-22 
15:36:37.130142: step: 314/464, loss: 0.03025280497968197 2023-01-22 15:36:37.828106: step: 316/464, loss: 0.08667907863855362 2023-01-22 15:36:38.533460: step: 318/464, loss: 0.08646897971630096 2023-01-22 15:36:39.310345: step: 320/464, loss: 0.005823382176458836 2023-01-22 15:36:40.043155: step: 322/464, loss: 0.07302051782608032 2023-01-22 15:36:40.787911: step: 324/464, loss: 0.014572813175618649 2023-01-22 15:36:41.547632: step: 326/464, loss: 0.03465670347213745 2023-01-22 15:36:42.320575: step: 328/464, loss: 0.027535736560821533 2023-01-22 15:36:43.045333: step: 330/464, loss: 0.0012932810932397842 2023-01-22 15:36:43.754471: step: 332/464, loss: 0.0062098451890051365 2023-01-22 15:36:44.417173: step: 334/464, loss: 0.02418820932507515 2023-01-22 15:36:45.177329: step: 336/464, loss: 0.03034006431698799 2023-01-22 15:36:46.010593: step: 338/464, loss: 0.02313992753624916 2023-01-22 15:36:46.708495: step: 340/464, loss: 0.004631619900465012 2023-01-22 15:36:47.436733: step: 342/464, loss: 0.08408119529485703 2023-01-22 15:36:48.217118: step: 344/464, loss: 0.005881492979824543 2023-01-22 15:36:48.897708: step: 346/464, loss: 0.020383819937705994 2023-01-22 15:36:49.650355: step: 348/464, loss: 0.00032111912150867283 2023-01-22 15:36:50.511251: step: 350/464, loss: 0.0056900642812252045 2023-01-22 15:36:51.183096: step: 352/464, loss: 0.0036925787571817636 2023-01-22 15:36:51.959373: step: 354/464, loss: 0.0058372109197080135 2023-01-22 15:36:52.704403: step: 356/464, loss: 0.025614218786358833 2023-01-22 15:36:53.511940: step: 358/464, loss: 0.10615309327840805 2023-01-22 15:36:54.268729: step: 360/464, loss: 0.06694373488426208 2023-01-22 15:36:55.009307: step: 362/464, loss: 0.015115071088075638 2023-01-22 15:36:55.719276: step: 364/464, loss: 0.6433719992637634 2023-01-22 15:36:56.454735: step: 366/464, loss: 0.02050706557929516 2023-01-22 15:36:57.129836: step: 368/464, loss: 0.009005571715533733 2023-01-22 15:36:57.773981: step: 370/464, loss: 
0.04691807180643082 2023-01-22 15:36:58.587861: step: 372/464, loss: 0.0010442383354529738 2023-01-22 15:36:59.290185: step: 374/464, loss: 0.014815493486821651 2023-01-22 15:37:00.069347: step: 376/464, loss: 0.9234954118728638 2023-01-22 15:37:00.821709: step: 378/464, loss: 0.05523635447025299 2023-01-22 15:37:01.702410: step: 380/464, loss: 0.008017596788704395 2023-01-22 15:37:02.428309: step: 382/464, loss: 0.006063917186111212 2023-01-22 15:37:03.134809: step: 384/464, loss: 0.005335265304893255 2023-01-22 15:37:03.838698: step: 386/464, loss: 0.010618263855576515 2023-01-22 15:37:04.580547: step: 388/464, loss: 0.00885799154639244 2023-01-22 15:37:05.353144: step: 390/464, loss: 0.026505211368203163 2023-01-22 15:37:06.066114: step: 392/464, loss: 0.0025265130680054426 2023-01-22 15:37:06.791085: step: 394/464, loss: 0.04610437527298927 2023-01-22 15:37:07.512906: step: 396/464, loss: 0.02633264847099781 2023-01-22 15:37:08.188764: step: 398/464, loss: 0.0013205071445554495 2023-01-22 15:37:08.928265: step: 400/464, loss: 0.006168350577354431 2023-01-22 15:37:09.623935: step: 402/464, loss: 0.00683568837121129 2023-01-22 15:37:10.355868: step: 404/464, loss: 0.05415549874305725 2023-01-22 15:37:11.088690: step: 406/464, loss: 0.019467797130346298 2023-01-22 15:37:11.807195: step: 408/464, loss: 0.1880302131175995 2023-01-22 15:37:12.584721: step: 410/464, loss: 0.059606973081827164 2023-01-22 15:37:13.268203: step: 412/464, loss: 0.010953270830214024 2023-01-22 15:37:14.034913: step: 414/464, loss: 0.017974497750401497 2023-01-22 15:37:14.826144: step: 416/464, loss: 0.021628299728035927 2023-01-22 15:37:15.531111: step: 418/464, loss: 0.0038115577772259712 2023-01-22 15:37:16.279910: step: 420/464, loss: 0.01009564008563757 2023-01-22 15:37:16.994334: step: 422/464, loss: 0.006931070238351822 2023-01-22 15:37:17.672571: step: 424/464, loss: 0.01273108460009098 2023-01-22 15:37:18.392194: step: 426/464, loss: 0.029455358162522316 2023-01-22 15:37:19.050116: 
step: 428/464, loss: 0.01086959894746542 2023-01-22 15:37:19.774952: step: 430/464, loss: 0.021176839247345924 2023-01-22 15:37:20.616790: step: 432/464, loss: 0.02633567713201046 2023-01-22 15:37:21.299888: step: 434/464, loss: 0.031902752816677094 2023-01-22 15:37:22.047923: step: 436/464, loss: 0.011048228479921818 2023-01-22 15:37:22.732248: step: 438/464, loss: 0.0007115676999092102 2023-01-22 15:37:23.528614: step: 440/464, loss: 0.07314342260360718 2023-01-22 15:37:24.208009: step: 442/464, loss: 0.0017838759813457727 2023-01-22 15:37:24.957820: step: 444/464, loss: 0.021234095096588135 2023-01-22 15:37:25.718942: step: 446/464, loss: 0.14134712517261505 2023-01-22 15:37:26.528576: step: 448/464, loss: 0.030335931107401848 2023-01-22 15:37:27.274303: step: 450/464, loss: 0.3117372691631317 2023-01-22 15:37:28.018499: step: 452/464, loss: 0.25308120250701904 2023-01-22 15:37:28.768876: step: 454/464, loss: 0.027010025456547737 2023-01-22 15:37:29.573370: step: 456/464, loss: 0.009400018490850925 2023-01-22 15:37:30.321232: step: 458/464, loss: 0.009423406794667244 2023-01-22 15:37:31.025126: step: 460/464, loss: 0.047979265451431274 2023-01-22 15:37:31.763751: step: 462/464, loss: 0.301637202501297 2023-01-22 15:37:32.505717: step: 464/464, loss: 0.5655904412269592 2023-01-22 15:37:33.211741: step: 466/464, loss: 0.005549916531890631 2023-01-22 15:37:33.944881: step: 468/464, loss: 0.040917810052633286 2023-01-22 15:37:34.625260: step: 470/464, loss: 0.03385911136865616 2023-01-22 15:37:35.365879: step: 472/464, loss: 0.0006872548838146031 2023-01-22 15:37:36.241772: step: 474/464, loss: 0.012125966139137745 2023-01-22 15:37:36.954699: step: 476/464, loss: 0.005242605693638325 2023-01-22 15:37:37.712716: step: 478/464, loss: 0.01875889115035534 2023-01-22 15:37:38.478190: step: 480/464, loss: 0.07527235895395279 2023-01-22 15:37:39.201900: step: 482/464, loss: 0.011973106302320957 2023-01-22 15:37:39.829849: step: 484/464, loss: 0.02894880622625351 2023-01-22 
15:37:40.581575: step: 486/464, loss: 0.016231101006269455 2023-01-22 15:37:41.345896: step: 488/464, loss: 0.10126560181379318 2023-01-22 15:37:42.083508: step: 490/464, loss: 0.040294162929058075 2023-01-22 15:37:42.791088: step: 492/464, loss: 0.012498761527240276 2023-01-22 15:37:43.460372: step: 494/464, loss: 0.006310733500868082 2023-01-22 15:37:44.234675: step: 496/464, loss: 0.011964485980570316 2023-01-22 15:37:44.955572: step: 498/464, loss: 0.0012873383238911629 2023-01-22 15:37:45.634569: step: 500/464, loss: 0.03405240550637245 2023-01-22 15:37:46.328486: step: 502/464, loss: 0.021336054429411888 2023-01-22 15:37:47.031328: step: 504/464, loss: 0.022539552301168442 2023-01-22 15:37:47.733807: step: 506/464, loss: 0.7507018446922302 2023-01-22 15:37:48.392209: step: 508/464, loss: 0.0006076518911868334 2023-01-22 15:37:49.181424: step: 510/464, loss: 0.009058432653546333 2023-01-22 15:37:49.904245: step: 512/464, loss: 0.05086961388587952 2023-01-22 15:37:50.618948: step: 514/464, loss: 0.41207343339920044 2023-01-22 15:37:51.300196: step: 516/464, loss: 0.0023046203423291445 2023-01-22 15:37:52.047934: step: 518/464, loss: 0.020794259384274483 2023-01-22 15:37:52.771884: step: 520/464, loss: 0.022838469594717026 2023-01-22 15:37:53.490159: step: 522/464, loss: 0.05027368664741516 2023-01-22 15:37:54.210877: step: 524/464, loss: 0.006262997165322304 2023-01-22 15:37:54.936583: step: 526/464, loss: 0.003343733726069331 2023-01-22 15:37:55.720042: step: 528/464, loss: 0.008441003039479256 2023-01-22 15:37:56.367390: step: 530/464, loss: 0.00423327274620533 2023-01-22 15:37:57.149504: step: 532/464, loss: 0.05196404829621315 2023-01-22 15:37:57.884890: step: 534/464, loss: 0.016537966206669807 2023-01-22 15:37:58.613245: step: 536/464, loss: 0.023798886686563492 2023-01-22 15:37:59.358628: step: 538/464, loss: 0.02561027929186821 2023-01-22 15:38:00.142369: step: 540/464, loss: 0.013788457959890366 2023-01-22 15:38:00.936969: step: 542/464, loss: 
0.0004529604921117425 2023-01-22 15:38:01.611897: step: 544/464, loss: 0.0018864155281335115 2023-01-22 15:38:02.413751: step: 546/464, loss: 0.013033518567681313 2023-01-22 15:38:03.150549: step: 548/464, loss: 0.061912618577480316 2023-01-22 15:38:03.851437: step: 550/464, loss: 0.06112837791442871 2023-01-22 15:38:04.679525: step: 552/464, loss: 0.005378578323870897 2023-01-22 15:38:05.461248: step: 554/464, loss: 0.06155424192547798 2023-01-22 15:38:06.228961: step: 556/464, loss: 0.03351220861077309 2023-01-22 15:38:06.954814: step: 558/464, loss: 0.003313565393909812 2023-01-22 15:38:07.757553: step: 560/464, loss: 0.006736339069902897 2023-01-22 15:38:08.499988: step: 562/464, loss: 0.3144710063934326 2023-01-22 15:38:09.229865: step: 564/464, loss: 0.02292277291417122 2023-01-22 15:38:09.990917: step: 566/464, loss: 0.0073122428730130196 2023-01-22 15:38:10.677732: step: 568/464, loss: 0.0016507483087480068 2023-01-22 15:38:11.442392: step: 570/464, loss: 0.016362829133868217 2023-01-22 15:38:12.163900: step: 572/464, loss: 0.027213526889681816 2023-01-22 15:38:12.867245: step: 574/464, loss: 0.0034306913148611784 2023-01-22 15:38:13.559934: step: 576/464, loss: 0.0063187661580741405 2023-01-22 15:38:14.316066: step: 578/464, loss: 0.002447170903906226 2023-01-22 15:38:15.066829: step: 580/464, loss: 0.05246298760175705 2023-01-22 15:38:15.767011: step: 582/464, loss: 0.002014671452343464 2023-01-22 15:38:16.557259: step: 584/464, loss: 0.0020586480386555195 2023-01-22 15:38:17.277927: step: 586/464, loss: 0.04257035627961159 2023-01-22 15:38:17.971918: step: 588/464, loss: 0.011674296110868454 2023-01-22 15:38:18.739539: step: 590/464, loss: 0.0045217531733214855 2023-01-22 15:38:19.453731: step: 592/464, loss: 0.03411796689033508 2023-01-22 15:38:20.202857: step: 594/464, loss: 0.002003189641982317 2023-01-22 15:38:20.943609: step: 596/464, loss: 0.009588696993887424 2023-01-22 15:38:21.655393: step: 598/464, loss: 0.02322070114314556 2023-01-22 
15:38:22.320720: step: 600/464, loss: 0.0009125259821303189 2023-01-22 15:38:23.077894: step: 602/464, loss: 0.0006416767719201744 2023-01-22 15:38:23.994035: step: 604/464, loss: 0.0335293784737587 2023-01-22 15:38:24.724767: step: 606/464, loss: 0.004608626943081617 2023-01-22 15:38:25.491154: step: 608/464, loss: 0.06503929942846298 2023-01-22 15:38:26.227860: step: 610/464, loss: 0.0037166120018810034 2023-01-22 15:38:26.983932: step: 612/464, loss: 0.002145248232409358 2023-01-22 15:38:27.742558: step: 614/464, loss: 0.012959728017449379 2023-01-22 15:38:28.485903: step: 616/464, loss: 0.008085524663329124 2023-01-22 15:38:29.095045: step: 618/464, loss: 0.005036820657551289 2023-01-22 15:38:29.883805: step: 620/464, loss: 0.08978661894798279 2023-01-22 15:38:30.581262: step: 622/464, loss: 0.0007188359159044921 2023-01-22 15:38:31.342848: step: 624/464, loss: 0.002841709880158305 2023-01-22 15:38:32.033918: step: 626/464, loss: 0.00509470934048295 2023-01-22 15:38:32.787495: step: 628/464, loss: 0.005124698393046856 2023-01-22 15:38:33.511094: step: 630/464, loss: 0.09914972633123398 2023-01-22 15:38:34.190024: step: 632/464, loss: 0.007541782688349485 2023-01-22 15:38:34.952330: step: 634/464, loss: 0.004804661963135004 2023-01-22 15:38:35.743108: step: 636/464, loss: 0.20996464788913727 2023-01-22 15:38:36.550002: step: 638/464, loss: 9.226617839885876e-05 2023-01-22 15:38:37.355907: step: 640/464, loss: 0.0010231471387669444 2023-01-22 15:38:38.069237: step: 642/464, loss: 0.007877390831708908 2023-01-22 15:38:38.805696: step: 644/464, loss: 0.020709160715341568 2023-01-22 15:38:39.503545: step: 646/464, loss: 0.18257126212120056 2023-01-22 15:38:40.222992: step: 648/464, loss: 0.005651933141052723 2023-01-22 15:38:40.935006: step: 650/464, loss: 0.0015648715198040009 2023-01-22 15:38:41.627559: step: 652/464, loss: 0.03712617978453636 2023-01-22 15:38:42.310368: step: 654/464, loss: 0.3934994637966156 2023-01-22 15:38:43.069316: step: 656/464, loss: 
0.053740598261356354 2023-01-22 15:38:43.853626: step: 658/464, loss: 0.005652363412082195 2023-01-22 15:38:44.676518: step: 660/464, loss: 0.04853597655892372 2023-01-22 15:38:45.430359: step: 662/464, loss: 0.045159582048654556 2023-01-22 15:38:46.189939: step: 664/464, loss: 0.08590636402368546 2023-01-22 15:38:46.822463: step: 666/464, loss: 0.03339572995901108 2023-01-22 15:38:47.550586: step: 668/464, loss: 0.005777023267000914 2023-01-22 15:38:48.288622: step: 670/464, loss: 0.0036592998076230288 2023-01-22 15:38:48.969640: step: 672/464, loss: 0.004354698583483696 2023-01-22 15:38:49.685933: step: 674/464, loss: 0.0054754349403083324 2023-01-22 15:38:50.375206: step: 676/464, loss: 0.002870019059628248 2023-01-22 15:38:51.130557: step: 678/464, loss: 0.04529373720288277 2023-01-22 15:38:51.895641: step: 680/464, loss: 0.0007131235906854272 2023-01-22 15:38:52.637007: step: 682/464, loss: 0.049643710255622864 2023-01-22 15:38:53.418128: step: 684/464, loss: 0.00527068879455328 2023-01-22 15:38:54.115251: step: 686/464, loss: 0.015814781188964844 2023-01-22 15:38:54.781865: step: 688/464, loss: 0.017705701291561127 2023-01-22 15:38:55.487655: step: 690/464, loss: 0.020682422444224358 2023-01-22 15:38:56.283996: step: 692/464, loss: 0.0031925418879836798 2023-01-22 15:38:57.022417: step: 694/464, loss: 0.008852974511682987 2023-01-22 15:38:57.779321: step: 696/464, loss: 0.0342039093375206 2023-01-22 15:38:58.493982: step: 698/464, loss: 0.1069117933511734 2023-01-22 15:38:59.321541: step: 700/464, loss: 0.04989028722047806 2023-01-22 15:38:59.991207: step: 702/464, loss: 0.0005538988625630736 2023-01-22 15:39:00.752501: step: 704/464, loss: 0.004752847366034985 2023-01-22 15:39:01.551041: step: 706/464, loss: 0.00048479001270607114 2023-01-22 15:39:02.321823: step: 708/464, loss: 0.011142611503601074 2023-01-22 15:39:03.063909: step: 710/464, loss: 0.0004893583245575428 2023-01-22 15:39:03.731893: step: 712/464, loss: 0.0023483308032155037 2023-01-22 
15:39:04.460954: step: 714/464, loss: 0.009256926365196705 2023-01-22 15:39:05.209362: step: 716/464, loss: 0.007569923531264067 2023-01-22 15:39:05.857085: step: 718/464, loss: 0.000603106280323118 2023-01-22 15:39:06.623641: step: 720/464, loss: 0.2925114631652832 2023-01-22 15:39:07.345471: step: 722/464, loss: 0.01142470259219408 2023-01-22 15:39:08.042199: step: 724/464, loss: 0.0077668167650699615 2023-01-22 15:39:08.759182: step: 726/464, loss: 0.011982940137386322 2023-01-22 15:39:09.435165: step: 728/464, loss: 0.03843516856431961 2023-01-22 15:39:10.156153: step: 730/464, loss: 0.0030147875659167767 2023-01-22 15:39:10.884330: step: 732/464, loss: 0.010897437110543251 2023-01-22 15:39:11.528687: step: 734/464, loss: 0.0892910584807396 2023-01-22 15:39:12.314714: step: 736/464, loss: 0.026463089510798454 2023-01-22 15:39:13.093082: step: 738/464, loss: 0.007182304281741381 2023-01-22 15:39:13.858754: step: 740/464, loss: 0.08904621750116348 2023-01-22 15:39:14.635979: step: 742/464, loss: 0.02075066789984703 2023-01-22 15:39:15.372804: step: 744/464, loss: 0.05011540278792381 2023-01-22 15:39:16.185467: step: 746/464, loss: 0.0033906898461282253 2023-01-22 15:39:17.038218: step: 748/464, loss: 0.02135995402932167 2023-01-22 15:39:17.811024: step: 750/464, loss: 0.03749980032444 2023-01-22 15:39:18.558190: step: 752/464, loss: 0.013032837770879269 2023-01-22 15:39:19.315509: step: 754/464, loss: 0.03569572791457176 2023-01-22 15:39:20.035900: step: 756/464, loss: 0.028139611706137657 2023-01-22 15:39:20.762217: step: 758/464, loss: 0.0023053884506225586 2023-01-22 15:39:21.530963: step: 760/464, loss: 0.037611622363328934 2023-01-22 15:39:22.194355: step: 762/464, loss: 0.1201668381690979 2023-01-22 15:39:22.981890: step: 764/464, loss: 0.18216541409492493 2023-01-22 15:39:23.702632: step: 766/464, loss: 0.006229735910892487 2023-01-22 15:39:24.404781: step: 768/464, loss: 0.004708054009824991 2023-01-22 15:39:25.146218: step: 770/464, loss: 
0.06688012182712555 2023-01-22 15:39:25.900884: step: 772/464, loss: 0.023829631507396698 2023-01-22 15:39:26.709123: step: 774/464, loss: 0.0007440192857757211 2023-01-22 15:39:27.553158: step: 776/464, loss: 0.0322493351995945 2023-01-22 15:39:28.359101: step: 778/464, loss: 0.013235216960310936 2023-01-22 15:39:29.139262: step: 780/464, loss: 0.001265358179807663 2023-01-22 15:39:29.836969: step: 782/464, loss: 0.017159154638648033 2023-01-22 15:39:30.626079: step: 784/464, loss: 0.0022521738428622484 2023-01-22 15:39:31.409590: step: 786/464, loss: 0.003812970593571663 2023-01-22 15:39:32.056918: step: 788/464, loss: 0.0006713416078127921 2023-01-22 15:39:32.800789: step: 790/464, loss: 0.04698442295193672 2023-01-22 15:39:33.541929: step: 792/464, loss: 0.23842373490333557 2023-01-22 15:39:34.246793: step: 794/464, loss: 0.03917529806494713 2023-01-22 15:39:34.925271: step: 796/464, loss: 0.018516667187213898 2023-01-22 15:39:35.639845: step: 798/464, loss: 0.07548412680625916 2023-01-22 15:39:36.349083: step: 800/464, loss: 0.021237986162304878 2023-01-22 15:39:37.041630: step: 802/464, loss: 0.01646965742111206 2023-01-22 15:39:37.743974: step: 804/464, loss: 0.01056175772100687 2023-01-22 15:39:38.376521: step: 806/464, loss: 0.024984072893857956 2023-01-22 15:39:39.072811: step: 808/464, loss: 0.31603842973709106 2023-01-22 15:39:39.811194: step: 810/464, loss: 0.012879881076514721 2023-01-22 15:39:40.560934: step: 812/464, loss: 0.024146106094121933 2023-01-22 15:39:41.294042: step: 814/464, loss: 0.03637917712330818 2023-01-22 15:39:42.091184: step: 816/464, loss: 0.04843752458691597 2023-01-22 15:39:42.737213: step: 818/464, loss: 0.013010178692638874 2023-01-22 15:39:43.552658: step: 820/464, loss: 0.008544894866645336 2023-01-22 15:39:44.212930: step: 822/464, loss: 0.2635175883769989 2023-01-22 15:39:44.876188: step: 824/464, loss: 0.13282142579555511 2023-01-22 15:39:45.563277: step: 826/464, loss: 0.002384813968092203 2023-01-22 15:39:46.330916: 
step: 828/464, loss: 0.04145323857665062 2023-01-22 15:39:46.952286: step: 830/464, loss: 0.10565589368343353 2023-01-22 15:39:47.673807: step: 832/464, loss: 0.001017981325276196 2023-01-22 15:39:48.395849: step: 834/464, loss: 0.0007453096332028508 2023-01-22 15:39:49.127202: step: 836/464, loss: 0.022320443764328957 2023-01-22 15:39:49.881355: step: 838/464, loss: 0.0025967489928007126 2023-01-22 15:39:50.550622: step: 840/464, loss: 0.035556502640247345 2023-01-22 15:39:51.302034: step: 842/464, loss: 0.013183936476707458 2023-01-22 15:39:52.067925: step: 844/464, loss: 0.0010444383369758725 2023-01-22 15:39:52.891988: step: 846/464, loss: 0.03276212140917778 2023-01-22 15:39:53.634055: step: 848/464, loss: 0.003905899589881301 2023-01-22 15:39:54.399237: step: 850/464, loss: 0.01465561706572771 2023-01-22 15:39:55.150108: step: 852/464, loss: 0.023813139647245407 2023-01-22 15:39:55.811753: step: 854/464, loss: 0.0006879746215417981 2023-01-22 15:39:56.573537: step: 856/464, loss: 0.03710510954260826 2023-01-22 15:39:57.212701: step: 858/464, loss: 0.003056836314499378 2023-01-22 15:39:57.933054: step: 860/464, loss: 0.06689032912254333 2023-01-22 15:39:58.576519: step: 862/464, loss: 0.030993498861789703 2023-01-22 15:39:59.342983: step: 864/464, loss: 0.44617322087287903 2023-01-22 15:40:00.021644: step: 866/464, loss: 0.040697306394577026 2023-01-22 15:40:00.744963: step: 868/464, loss: 0.0204264298081398 2023-01-22 15:40:01.443493: step: 870/464, loss: 0.012933028861880302 2023-01-22 15:40:02.196074: step: 872/464, loss: 0.0016678719548508525 2023-01-22 15:40:03.003941: step: 874/464, loss: 0.05385873094201088 2023-01-22 15:40:03.808175: step: 876/464, loss: 0.022039365023374557 2023-01-22 15:40:04.580241: step: 878/464, loss: 0.007482157554477453 2023-01-22 15:40:05.332247: step: 880/464, loss: 1.6804180145263672 2023-01-22 15:40:06.097245: step: 882/464, loss: 0.02266129106283188 2023-01-22 15:40:06.913182: step: 884/464, loss: 0.01153059396892786 
2023-01-22 15:40:07.628639: step: 886/464, loss: 0.2091490924358368 2023-01-22 15:40:08.350937: step: 888/464, loss: 0.015897979959845543 2023-01-22 15:40:09.096691: step: 890/464, loss: 0.0003611915453802794 2023-01-22 15:40:09.922954: step: 892/464, loss: 0.4301433265209198 2023-01-22 15:40:10.684056: step: 894/464, loss: 6.273853068705648e-05 2023-01-22 15:40:11.449896: step: 896/464, loss: 0.05613543838262558 2023-01-22 15:40:12.237987: step: 898/464, loss: 0.02761615626513958 2023-01-22 15:40:12.898929: step: 900/464, loss: 0.0015836823731660843 2023-01-22 15:40:13.563710: step: 902/464, loss: 0.047353874891996384 2023-01-22 15:40:14.287020: step: 904/464, loss: 0.05324863642454147 2023-01-22 15:40:15.035126: step: 906/464, loss: 0.03325873240828514 2023-01-22 15:40:15.728720: step: 908/464, loss: 0.1007314920425415 2023-01-22 15:40:16.598261: step: 910/464, loss: 0.051610033959150314 2023-01-22 15:40:17.303897: step: 912/464, loss: 0.001794341136701405 2023-01-22 15:40:18.018968: step: 914/464, loss: 0.009611096233129501 2023-01-22 15:40:18.745672: step: 916/464, loss: 0.014779017306864262 2023-01-22 15:40:19.396843: step: 918/464, loss: 0.019373752176761627 2023-01-22 15:40:20.153051: step: 920/464, loss: 0.0994553416967392 2023-01-22 15:40:20.933241: step: 922/464, loss: 0.0182206928730011 2023-01-22 15:40:21.666039: step: 924/464, loss: 0.04314351826906204 2023-01-22 15:40:22.356152: step: 926/464, loss: 0.0043644472025334835 2023-01-22 15:40:23.134708: step: 928/464, loss: 0.021911775693297386 2023-01-22 15:40:23.717520: step: 930/464, loss: 0.001663040486164391
==================================================
Loss: 0.053
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017339015151515, 'r': 0.34353005865102637, 'f1': 0.32127833346777446}, 'combined': 0.2367314036078338, 'epoch': 30}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30305615549668224, 'r': 0.2898667679181916, 'f1': 0.29631476477784807}, 'combined': 0.18402706444097935, 'epoch': 30}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2826279313839027, 'r': 0.33679381576677586, 'f1': 0.30734258166076345}, 'combined': 0.22646295490793095, 'epoch': 30}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3035112982049055, 'r': 0.29300398323243104, 'f1': 0.2981651001992831}, 'combined': 0.1851762201237653, 'epoch': 30}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3011806279312142, 'r': 0.3463291471087586, 'f1': 0.3221808658893483}, 'combined': 0.23739642749741452, 'epoch': 30}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31777228007557534, 'r': 0.29922770586740627, 'f1': 0.3082213047701964}, 'combined': 0.19142165243622725, 'epoch': 30}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2554347826086957, 'r': 0.3357142857142857, 'f1': 0.2901234567901234}, 'combined': 0.19341563786008226, 'epoch': 30}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2564102564102564, 'r': 0.43478260869565216, 'f1': 0.3225806451612903}, 'combined': 0.16129032258064516, 'epoch': 30}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4076086956521739, 'r': 0.3232758620689655, 'f1': 0.3605769230769231}, 'combined': 0.24038461538461536, 'epoch': 30}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 31
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:43:03.854110: step: 2/464, loss: 0.007450570352375507 2023-01-22 15:43:04.540818: step: 4/464, loss: 0.10737699270248413 2023-01-22 15:43:05.301327: step: 6/464, loss: 0.002511220285668969 2023-01-22 15:43:05.960120: step: 8/464, loss: 0.004273401573300362 2023-01-22 15:43:06.711153: step: 10/464, loss: 0.002521744230762124 2023-01-22 15:43:07.478383: step: 12/464, loss: 0.00019417490693740547 2023-01-22 15:43:08.202397: step: 14/464, loss: 0.002275019185617566 2023-01-22 15:43:08.940250: step: 16/464, loss: 0.12431767582893372 2023-01-22 15:43:09.648483: step: 18/464, loss: 0.005227986723184586 2023-01-22 15:43:10.407266: step: 20/464, loss: 0.056336916983127594 2023-01-22 15:43:11.220549: step: 22/464, loss: 0.0018891810905188322 2023-01-22 15:43:12.003584: step: 24/464, loss: 0.03804721310734749 2023-01-22 15:43:12.908705: step: 26/464, loss: 0.009079745039343834 2023-01-22 15:43:13.588379: step: 28/464, loss: 0.006622061599045992 2023-01-22 15:43:14.446791: step: 30/464, loss: 0.048646826297044754 2023-01-22 15:43:15.197641: step: 32/464, loss: 0.01823214814066887 2023-01-22 15:43:15.943872: step: 34/464, loss: 0.08220667392015457 2023-01-22 15:43:16.653129: step: 36/464, loss: 0.010882941074669361 2023-01-22 15:43:17.420805: step: 38/464, loss: 0.013139336369931698 2023-01-22 15:43:18.159541: step: 40/464, loss: 0.006461418699473143 2023-01-22 15:43:18.897623: step: 42/464, loss: 0.0024266575928777456 2023-01-22 15:43:19.597439: step: 44/464, loss: 0.5456418991088867 2023-01-22 15:43:20.273314: step: 46/464, loss: 0.00018644030205905437 2023-01-22 15:43:21.047109: step: 48/464, loss: 4.954357791575603e-05 2023-01-22 15:43:21.719597: step: 50/464, loss: 0.006846207659691572 2023-01-22 15:43:22.515845: step: 52/464, loss: 0.0027074338868260384 2023-01-22 15:43:23.231031: step: 54/464, loss: 0.01821725256741047 2023-01-22 15:43:23.893573: step: 56/464, loss: 0.007177872117608786 2023-01-22 15:43:24.658072:
step: 58/464, loss: 0.01788950525224209 2023-01-22 15:43:25.407765: step: 60/464, loss: 0.0005401679081842303 2023-01-22 15:43:26.238349: step: 62/464, loss: 0.05340760201215744 2023-01-22 15:43:26.873679: step: 64/464, loss: 4.108704160898924e-05 2023-01-22 15:43:27.535022: step: 66/464, loss: 0.001280111842788756 2023-01-22 15:43:28.190725: step: 68/464, loss: 0.002295039128512144 2023-01-22 15:43:28.914645: step: 70/464, loss: 0.00215263688005507 2023-01-22 15:43:29.749542: step: 72/464, loss: 0.06053628399968147 2023-01-22 15:43:30.466071: step: 74/464, loss: 0.005914163775742054 2023-01-22 15:43:31.214946: step: 76/464, loss: 2.9299533367156982 2023-01-22 15:43:32.033839: step: 78/464, loss: 0.03084677644073963 2023-01-22 15:43:32.800507: step: 80/464, loss: 0.0009321445832028985 2023-01-22 15:43:33.569427: step: 82/464, loss: 0.0003534965217113495 2023-01-22 15:43:34.358072: step: 84/464, loss: 0.004339384380728006 2023-01-22 15:43:35.037998: step: 86/464, loss: 2.9442649974953383e-05 2023-01-22 15:43:35.809008: step: 88/464, loss: 0.0012765272986143827 2023-01-22 15:43:36.568945: step: 90/464, loss: 0.008331461809575558 2023-01-22 15:43:37.359272: step: 92/464, loss: 0.016410961747169495 2023-01-22 15:43:38.067373: step: 94/464, loss: 0.003537172917276621 2023-01-22 15:43:38.748009: step: 96/464, loss: 0.01527501456439495 2023-01-22 15:43:39.487237: step: 98/464, loss: 0.0031059093307703733 2023-01-22 15:43:40.174343: step: 100/464, loss: 0.001596541958861053 2023-01-22 15:43:40.948102: step: 102/464, loss: 0.02693060413002968 2023-01-22 15:43:41.641269: step: 104/464, loss: 0.02274322509765625 2023-01-22 15:43:42.370920: step: 106/464, loss: 0.0003036792913917452 2023-01-22 15:43:43.063573: step: 108/464, loss: 0.002669687382876873 2023-01-22 15:43:43.790179: step: 110/464, loss: 0.006395821925252676 2023-01-22 15:43:44.523298: step: 112/464, loss: 0.0009263441315852106 2023-01-22 15:43:45.349192: step: 114/464, loss: 0.19540373980998993 2023-01-22 
15:43:46.053639: step: 116/464, loss: 0.019305413588881493 2023-01-22 15:43:46.813241: step: 118/464, loss: 0.01990027353167534 2023-01-22 15:43:47.529436: step: 120/464, loss: 0.3651050925254822 2023-01-22 15:43:48.226200: step: 122/464, loss: 0.0071814339607954025 2023-01-22 15:43:48.917607: step: 124/464, loss: 0.04094426706433296 2023-01-22 15:43:49.628426: step: 126/464, loss: 0.026430925354361534 2023-01-22 15:43:50.357863: step: 128/464, loss: 0.028471818193793297 2023-01-22 15:43:51.118542: step: 130/464, loss: 0.0014034638879820704 2023-01-22 15:43:51.889795: step: 132/464, loss: 0.026817435398697853 2023-01-22 15:43:52.611053: step: 134/464, loss: 0.011616945266723633 2023-01-22 15:43:53.245252: step: 136/464, loss: 0.013456545770168304 2023-01-22 15:43:53.932575: step: 138/464, loss: 0.00027974642580375075 2023-01-22 15:43:54.632655: step: 140/464, loss: 0.03479605168104172 2023-01-22 15:43:55.299111: step: 142/464, loss: 2.8743435905198567e-05 2023-01-22 15:43:56.179470: step: 144/464, loss: 0.014272580854594707 2023-01-22 15:43:56.970447: step: 146/464, loss: 0.009507148526608944 2023-01-22 15:43:57.706326: step: 148/464, loss: 0.11540583521127701 2023-01-22 15:43:58.488189: step: 150/464, loss: 0.03117658942937851 2023-01-22 15:43:59.167936: step: 152/464, loss: 0.05746026337146759 2023-01-22 15:43:59.915148: step: 154/464, loss: 0.03753349557518959 2023-01-22 15:44:00.662683: step: 156/464, loss: 0.030995476990938187 2023-01-22 15:44:01.467592: step: 158/464, loss: 0.020567825064063072 2023-01-22 15:44:02.158891: step: 160/464, loss: 0.008550383150577545 2023-01-22 15:44:02.912499: step: 162/464, loss: 0.032725293189287186 2023-01-22 15:44:03.591539: step: 164/464, loss: 0.05792173370718956 2023-01-22 15:44:04.307983: step: 166/464, loss: 0.015494300983846188 2023-01-22 15:44:05.011119: step: 168/464, loss: 0.016035746783018112 2023-01-22 15:44:05.740180: step: 170/464, loss: 0.017190538346767426 2023-01-22 15:44:06.427133: step: 172/464, loss: 
0.10798454284667969 2023-01-22 15:44:07.198519: step: 174/464, loss: 0.019811222329735756 2023-01-22 15:44:07.898482: step: 176/464, loss: 0.010970844887197018 2023-01-22 15:44:08.704281: step: 178/464, loss: 0.02815398946404457 2023-01-22 15:44:09.521721: step: 180/464, loss: 0.6478611826896667 2023-01-22 15:44:10.269328: step: 182/464, loss: 0.1307576447725296 2023-01-22 15:44:10.994734: step: 184/464, loss: 0.015528128482401371 2023-01-22 15:44:11.718617: step: 186/464, loss: 0.003247783752158284 2023-01-22 15:44:12.467448: step: 188/464, loss: 0.03964921832084656 2023-01-22 15:44:13.160636: step: 190/464, loss: 0.004766841884702444 2023-01-22 15:44:13.930191: step: 192/464, loss: 0.03177197650074959 2023-01-22 15:44:14.703533: step: 194/464, loss: 0.01902329921722412 2023-01-22 15:44:15.466156: step: 196/464, loss: 0.00788768008351326 2023-01-22 15:44:16.149287: step: 198/464, loss: 0.0015454021049663424 2023-01-22 15:44:16.871790: step: 200/464, loss: 0.01335904560983181 2023-01-22 15:44:17.556374: step: 202/464, loss: 0.012268331833183765 2023-01-22 15:44:18.307424: step: 204/464, loss: 0.03662366420030594 2023-01-22 15:44:19.041603: step: 206/464, loss: 0.02568664960563183 2023-01-22 15:44:19.693145: step: 208/464, loss: 0.029208391904830933 2023-01-22 15:44:20.425876: step: 210/464, loss: 0.005202494096010923 2023-01-22 15:44:21.110416: step: 212/464, loss: 0.022604580968618393 2023-01-22 15:44:21.808144: step: 214/464, loss: 0.04248643293976784 2023-01-22 15:44:22.620656: step: 216/464, loss: 0.05426526442170143 2023-01-22 15:44:23.371463: step: 218/464, loss: 0.171112060546875 2023-01-22 15:44:24.081272: step: 220/464, loss: 0.00690581975504756 2023-01-22 15:44:24.737736: step: 222/464, loss: 0.0018096556887030602 2023-01-22 15:44:25.441174: step: 224/464, loss: 0.013162568211555481 2023-01-22 15:44:26.168121: step: 226/464, loss: 0.03787326067686081 2023-01-22 15:44:26.928807: step: 228/464, loss: 19.952999114990234 2023-01-22 15:44:27.727336: step: 
230/464, loss: 0.010027474723756313 2023-01-22 15:44:28.461009: step: 232/464, loss: 0.0030535452533513308 2023-01-22 15:44:29.260462: step: 234/464, loss: 0.010311653837561607 2023-01-22 15:44:30.022297: step: 236/464, loss: 0.03960442915558815 2023-01-22 15:44:30.823945: step: 238/464, loss: 0.01425941288471222 2023-01-22 15:44:31.571396: step: 240/464, loss: 0.016366029158234596 2023-01-22 15:44:32.290204: step: 242/464, loss: 0.1724376380443573 2023-01-22 15:44:33.049830: step: 244/464, loss: 0.02939794212579727 2023-01-22 15:44:33.742038: step: 246/464, loss: 0.0029553743079304695 2023-01-22 15:44:34.403351: step: 248/464, loss: 0.00352324265986681 2023-01-22 15:44:35.157269: step: 250/464, loss: 0.012227809056639671 2023-01-22 15:44:35.915400: step: 252/464, loss: 0.046687014400959015 2023-01-22 15:44:36.583347: step: 254/464, loss: 0.0011639392469078302 2023-01-22 15:44:37.304887: step: 256/464, loss: 0.0010756740812212229 2023-01-22 15:44:37.947695: step: 258/464, loss: 0.026980912312865257 2023-01-22 15:44:38.676657: step: 260/464, loss: 0.03663994371891022 2023-01-22 15:44:39.345256: step: 262/464, loss: 0.17038607597351074 2023-01-22 15:44:40.082448: step: 264/464, loss: 0.01435135118663311 2023-01-22 15:44:40.753366: step: 266/464, loss: 0.008019298315048218 2023-01-22 15:44:41.502213: step: 268/464, loss: 0.0016009538667276502 2023-01-22 15:44:42.285076: step: 270/464, loss: 0.04047354310750961 2023-01-22 15:44:43.052127: step: 272/464, loss: 0.011036979034543037 2023-01-22 15:44:43.877077: step: 274/464, loss: 0.13029997050762177 2023-01-22 15:44:44.684964: step: 276/464, loss: 0.023806972429156303 2023-01-22 15:44:45.451473: step: 278/464, loss: 0.0013524888781830668 2023-01-22 15:44:46.233897: step: 280/464, loss: 0.0006496798596344888 2023-01-22 15:44:46.889981: step: 282/464, loss: 0.0014774593291804194 2023-01-22 15:44:47.585238: step: 284/464, loss: 0.20750577747821808 2023-01-22 15:44:48.304395: step: 286/464, loss: 0.042337559163570404 
2023-01-22 15:44:49.116368: step: 288/464, loss: 0.004411226604133844 2023-01-22 15:44:49.761263: step: 290/464, loss: 0.1359243392944336 2023-01-22 15:44:50.546631: step: 292/464, loss: 0.015041140839457512 2023-01-22 15:44:51.449571: step: 294/464, loss: 0.0005121452268213034 2023-01-22 15:44:52.134065: step: 296/464, loss: 0.013008585199713707 2023-01-22 15:44:52.883198: step: 298/464, loss: 0.04285898804664612 2023-01-22 15:44:53.657131: step: 300/464, loss: 0.02849295549094677 2023-01-22 15:44:54.376341: step: 302/464, loss: 0.018777921795845032 2023-01-22 15:44:55.172640: step: 304/464, loss: 0.07624068111181259 2023-01-22 15:44:55.909252: step: 306/464, loss: 0.00042426210711710155 2023-01-22 15:44:56.633739: step: 308/464, loss: 0.0076195960864424706 2023-01-22 15:44:57.339548: step: 310/464, loss: 0.0007941815420053899 2023-01-22 15:44:58.059030: step: 312/464, loss: 0.005111439619213343 2023-01-22 15:44:58.821348: step: 314/464, loss: 0.07643471658229828 2023-01-22 15:44:59.541179: step: 316/464, loss: 0.21657128632068634 2023-01-22 15:45:00.250213: step: 318/464, loss: 0.009065191261470318 2023-01-22 15:45:00.936646: step: 320/464, loss: 0.0010730769718065858 2023-01-22 15:45:01.704600: step: 322/464, loss: 0.09378761798143387 2023-01-22 15:45:02.419351: step: 324/464, loss: 0.005632663611322641 2023-01-22 15:45:03.089636: step: 326/464, loss: 0.3768630921840668 2023-01-22 15:45:03.898100: step: 328/464, loss: 0.6099423766136169 2023-01-22 15:45:04.584436: step: 330/464, loss: 0.022345110774040222 2023-01-22 15:45:05.395158: step: 332/464, loss: 0.0025050854310393333 2023-01-22 15:45:06.072954: step: 334/464, loss: 0.0053538125939667225 2023-01-22 15:45:06.811780: step: 336/464, loss: 0.009213737212121487 2023-01-22 15:45:07.497131: step: 338/464, loss: 0.00515262084081769 2023-01-22 15:45:08.226437: step: 340/464, loss: 0.000878259539604187 2023-01-22 15:45:08.922789: step: 342/464, loss: 0.003352927975356579 2023-01-22 15:45:09.573887: step: 344/464, 
loss: 0.0005760050262324512 2023-01-22 15:45:10.264140: step: 346/464, loss: 0.010878819040954113 2023-01-22 15:45:11.006741: step: 348/464, loss: 0.0002599854487925768 2023-01-22 15:45:11.913841: step: 350/464, loss: 0.0049443976022303104 2023-01-22 15:45:12.592567: step: 352/464, loss: 0.003746855305507779 2023-01-22 15:45:13.266582: step: 354/464, loss: 0.22560137510299683 2023-01-22 15:45:14.009861: step: 356/464, loss: 0.006116420961916447 2023-01-22 15:45:14.689354: step: 358/464, loss: 4.0767979953670874e-05 2023-01-22 15:45:15.375827: step: 360/464, loss: 0.0025461202021688223 2023-01-22 15:45:16.181899: step: 362/464, loss: 0.06831776350736618 2023-01-22 15:45:16.989722: step: 364/464, loss: 0.00244891457259655 2023-01-22 15:45:17.772076: step: 366/464, loss: 0.008710219524800777 2023-01-22 15:45:18.469643: step: 368/464, loss: 0.008920171298086643 2023-01-22 15:45:19.274263: step: 370/464, loss: 0.023013954982161522 2023-01-22 15:45:20.004122: step: 372/464, loss: 0.028991742059588432 2023-01-22 15:45:20.730215: step: 374/464, loss: 0.018598882481455803 2023-01-22 15:45:21.479042: step: 376/464, loss: 0.0032753553241491318 2023-01-22 15:45:22.146424: step: 378/464, loss: 0.0008013547048904002 2023-01-22 15:45:22.884488: step: 380/464, loss: 0.0356551855802536 2023-01-22 15:45:23.672520: step: 382/464, loss: 0.019524503499269485 2023-01-22 15:45:24.400195: step: 384/464, loss: 0.008192269131541252 2023-01-22 15:45:25.138416: step: 386/464, loss: 0.0015428874175995588 2023-01-22 15:45:25.811785: step: 388/464, loss: 0.02081419713795185 2023-01-22 15:45:26.650668: step: 390/464, loss: 0.01860181801021099 2023-01-22 15:45:27.409434: step: 392/464, loss: 2.680257603060454e-05 2023-01-22 15:45:28.056984: step: 394/464, loss: 0.015773866325616837 2023-01-22 15:45:28.783523: step: 396/464, loss: 0.021631063893437386 2023-01-22 15:45:29.490995: step: 398/464, loss: 0.0528394915163517 2023-01-22 15:45:30.296903: step: 400/464, loss: 0.11525987088680267 2023-01-22 
15:45:30.979694: step: 402/464, loss: 0.0013505503302440047 2023-01-22 15:45:31.690530: step: 404/464, loss: 0.05220959335565567 2023-01-22 15:45:32.418948: step: 406/464, loss: 0.3575024902820587 2023-01-22 15:45:33.154215: step: 408/464, loss: 0.15189459919929504 2023-01-22 15:45:33.832237: step: 410/464, loss: 0.030411798506975174 2023-01-22 15:45:34.558929: step: 412/464, loss: 0.0368804894387722 2023-01-22 15:45:35.353366: step: 414/464, loss: 0.011386717669665813 2023-01-22 15:45:36.121293: step: 416/464, loss: 0.027711808681488037 2023-01-22 15:45:36.795684: step: 418/464, loss: 0.00030285576940514147 2023-01-22 15:45:37.523187: step: 420/464, loss: 0.02967562898993492 2023-01-22 15:45:38.293136: step: 422/464, loss: 0.004899161402136087 2023-01-22 15:45:38.953237: step: 424/464, loss: 0.005439385771751404 2023-01-22 15:45:39.746176: step: 426/464, loss: 0.00611588079482317 2023-01-22 15:45:40.490012: step: 428/464, loss: 0.009580517187714577 2023-01-22 15:45:41.227653: step: 430/464, loss: 0.10024572908878326 2023-01-22 15:45:41.928105: step: 432/464, loss: 0.0046290247701108456 2023-01-22 15:45:42.617585: step: 434/464, loss: 0.004417914431542158 2023-01-22 15:45:43.393932: step: 436/464, loss: 0.014274193905293941 2023-01-22 15:45:44.074573: step: 438/464, loss: 0.0033748922869563103 2023-01-22 15:45:44.856040: step: 440/464, loss: 0.006222773343324661 2023-01-22 15:45:45.589551: step: 442/464, loss: 0.025498919188976288 2023-01-22 15:45:46.323901: step: 444/464, loss: 0.017806345596909523 2023-01-22 15:45:47.110235: step: 446/464, loss: 0.003274332731962204 2023-01-22 15:45:47.775191: step: 448/464, loss: 0.0009049939690157771 2023-01-22 15:45:48.603758: step: 450/464, loss: 0.005761545151472092 2023-01-22 15:45:49.349340: step: 452/464, loss: 0.013424933888018131 2023-01-22 15:45:50.070713: step: 454/464, loss: 0.009402374736964703 2023-01-22 15:45:50.747373: step: 456/464, loss: 0.0019368845969438553 2023-01-22 15:45:51.507532: step: 458/464, loss: 
0.029811447486281395 2023-01-22 15:45:52.285279: step: 460/464, loss: 0.00637458823621273 2023-01-22 15:45:53.050313: step: 462/464, loss: 0.035835716873407364 2023-01-22 15:45:53.788723: step: 464/464, loss: 0.08940885215997696 2023-01-22 15:45:54.540935: step: 466/464, loss: 0.055901557207107544 2023-01-22 15:45:55.368902: step: 468/464, loss: 0.001355905318632722 2023-01-22 15:45:56.035575: step: 470/464, loss: 0.0005913148052059114 2023-01-22 15:45:56.687559: step: 472/464, loss: 0.004807848483324051 2023-01-22 15:45:57.395404: step: 474/464, loss: 0.01984637789428234 2023-01-22 15:45:58.206758: step: 476/464, loss: 0.16941934823989868 2023-01-22 15:45:58.915355: step: 478/464, loss: 0.0008034526836127043 2023-01-22 15:45:59.703607: step: 480/464, loss: 0.019079819321632385 2023-01-22 15:46:00.534238: step: 482/464, loss: 0.06966307759284973 2023-01-22 15:46:01.249733: step: 484/464, loss: 0.11122968792915344 2023-01-22 15:46:02.042433: step: 486/464, loss: 0.05611448734998703 2023-01-22 15:46:02.821560: step: 488/464, loss: 0.05052759125828743 2023-01-22 15:46:03.565787: step: 490/464, loss: 0.309314489364624 2023-01-22 15:46:04.334351: step: 492/464, loss: 0.023992551490664482 2023-01-22 15:46:05.021425: step: 494/464, loss: 0.051077570766210556 2023-01-22 15:46:05.859875: step: 496/464, loss: 0.004689997062087059 2023-01-22 15:46:06.684911: step: 498/464, loss: 0.014873744919896126 2023-01-22 15:46:07.386181: step: 500/464, loss: 0.030518092215061188 2023-01-22 15:46:08.161280: step: 502/464, loss: 0.00024345805286429822 2023-01-22 15:46:08.878641: step: 504/464, loss: 0.02177843451499939 2023-01-22 15:46:09.605839: step: 506/464, loss: 0.011750375851988792 2023-01-22 15:46:10.352248: step: 508/464, loss: 0.024676991626620293 2023-01-22 15:46:11.094831: step: 510/464, loss: 0.018763341009616852 2023-01-22 15:46:11.912383: step: 512/464, loss: 0.014819656498730183 2023-01-22 15:46:12.607067: step: 514/464, loss: 0.0508464016020298 2023-01-22 15:46:13.392709: 
step: 516/464, loss: 0.11383599042892456 2023-01-22 15:46:14.142162: step: 518/464, loss: 0.009834877215325832 2023-01-22 15:46:14.950619: step: 520/464, loss: 0.004556347616016865 2023-01-22 15:46:15.738957: step: 522/464, loss: 0.0045471033081412315 2023-01-22 15:46:16.450398: step: 524/464, loss: 0.010128447785973549 2023-01-22 15:46:17.146570: step: 526/464, loss: 0.01945709064602852 2023-01-22 15:46:17.940619: step: 528/464, loss: 0.03735598549246788 2023-01-22 15:46:18.640767: step: 530/464, loss: 0.009542558342218399 2023-01-22 15:46:19.398366: step: 532/464, loss: 0.020316550508141518 2023-01-22 15:46:20.120941: step: 534/464, loss: 0.0042798384092748165 2023-01-22 15:46:20.833519: step: 536/464, loss: 0.027576670050621033 2023-01-22 15:46:21.609616: step: 538/464, loss: 0.005680317524820566 2023-01-22 15:46:22.320753: step: 540/464, loss: 0.005088218487799168 2023-01-22 15:46:23.002682: step: 542/464, loss: 0.0036994877737015486 2023-01-22 15:46:23.673049: step: 544/464, loss: 0.08250491321086884 2023-01-22 15:46:24.417762: step: 546/464, loss: 0.003032066859304905 2023-01-22 15:46:25.082499: step: 548/464, loss: 0.006267304066568613 2023-01-22 15:46:25.841694: step: 550/464, loss: 0.012118152342736721 2023-01-22 15:46:26.604933: step: 552/464, loss: 0.3567858040332794 2023-01-22 15:46:27.329080: step: 554/464, loss: 0.06748120486736298 2023-01-22 15:46:28.057838: step: 556/464, loss: 0.0005088126054033637 2023-01-22 15:46:28.748221: step: 558/464, loss: 0.05795755609869957 2023-01-22 15:46:29.448713: step: 560/464, loss: 0.0013993033207952976 2023-01-22 15:46:30.193057: step: 562/464, loss: 0.002470789710059762 2023-01-22 15:46:30.923914: step: 564/464, loss: 0.02224491722881794 2023-01-22 15:46:31.674891: step: 566/464, loss: 0.02115313708782196 2023-01-22 15:46:32.451618: step: 568/464, loss: 0.010152964852750301 2023-01-22 15:46:33.237244: step: 570/464, loss: 0.016146540641784668 2023-01-22 15:46:33.975609: step: 572/464, loss: 0.0042912825010716915 
2023-01-22 15:46:34.748508: step: 574/464, loss: 0.0017807194963097572 2023-01-22 15:46:35.533466: step: 576/464, loss: 0.04278545081615448 2023-01-22 15:46:36.299384: step: 578/464, loss: 0.006922352127730846 2023-01-22 15:46:37.028559: step: 580/464, loss: 0.006656852085143328 2023-01-22 15:46:37.765898: step: 582/464, loss: 0.0019682818092405796 2023-01-22 15:46:38.454376: step: 584/464, loss: 0.00039459459367208183 2023-01-22 15:46:39.220755: step: 586/464, loss: 0.0036324732936918736 2023-01-22 15:46:40.033145: step: 588/464, loss: 0.05086535960435867 2023-01-22 15:46:40.738882: step: 590/464, loss: 0.023409342393279076 2023-01-22 15:46:41.459336: step: 592/464, loss: 0.9159653186798096 2023-01-22 15:46:42.129797: step: 594/464, loss: 0.009631011635065079 2023-01-22 15:46:42.843219: step: 596/464, loss: 0.0059930006973445415 2023-01-22 15:46:43.570975: step: 598/464, loss: 0.010481104254722595 2023-01-22 15:46:44.323580: step: 600/464, loss: 0.2565106749534607 2023-01-22 15:46:45.009322: step: 602/464, loss: 0.003442235291004181 2023-01-22 15:46:45.735998: step: 604/464, loss: 0.01562698930501938 2023-01-22 15:46:46.373209: step: 606/464, loss: 0.012435711920261383 2023-01-22 15:46:47.095747: step: 608/464, loss: 0.00962899997830391 2023-01-22 15:46:47.852719: step: 610/464, loss: 0.007187160197645426 2023-01-22 15:46:48.489254: step: 612/464, loss: 0.008403594605624676 2023-01-22 15:46:49.160600: step: 614/464, loss: 0.001437154714949429 2023-01-22 15:46:49.944658: step: 616/464, loss: 0.013526865281164646 2023-01-22 15:46:50.606509: step: 618/464, loss: 0.00013178416702430695 2023-01-22 15:46:51.323712: step: 620/464, loss: 0.016230305656790733 2023-01-22 15:46:52.016031: step: 622/464, loss: 0.04665118828415871 2023-01-22 15:46:52.710081: step: 624/464, loss: 0.012508937157690525 2023-01-22 15:46:53.594516: step: 626/464, loss: 0.040033601224422455 2023-01-22 15:46:54.278067: step: 628/464, loss: 0.00021456902322825044 2023-01-22 15:46:54.999775: step: 
630/464, loss: 0.013318480923771858 2023-01-22 15:46:55.655390: step: 632/464, loss: 0.05723757669329643 2023-01-22 15:46:56.369926: step: 634/464, loss: 0.005295167677104473 2023-01-22 15:46:57.142769: step: 636/464, loss: 0.3125 2023-01-22 15:46:57.920675: step: 638/464, loss: 0.07070460170507431 2023-01-22 15:46:58.626921: step: 640/464, loss: 0.0036590429954230785 2023-01-22 15:46:59.325304: step: 642/464, loss: 0.00417109951376915 2023-01-22 15:47:00.059244: step: 644/464, loss: 0.0002276741579407826 2023-01-22 15:47:00.811252: step: 646/464, loss: 0.025444338098168373 2023-01-22 15:47:01.602804: step: 648/464, loss: 0.006258037872612476 2023-01-22 15:47:02.375399: step: 650/464, loss: 0.006045197602361441 2023-01-22 15:47:03.097725: step: 652/464, loss: 0.0074656312353909016 2023-01-22 15:47:03.754531: step: 654/464, loss: 0.003545462852343917 2023-01-22 15:47:04.541460: step: 656/464, loss: 0.09607963263988495 2023-01-22 15:47:05.260019: step: 658/464, loss: 0.014723972417414188 2023-01-22 15:47:05.939479: step: 660/464, loss: 0.025644179433584213 2023-01-22 15:47:06.608300: step: 662/464, loss: 0.11263712495565414 2023-01-22 15:47:07.293289: step: 664/464, loss: 0.001962358597666025 2023-01-22 15:47:08.065784: step: 666/464, loss: 0.004253838676959276 2023-01-22 15:47:08.715806: step: 668/464, loss: 0.0016341455047950149 2023-01-22 15:47:09.427006: step: 670/464, loss: 0.0016103158704936504 2023-01-22 15:47:10.144024: step: 672/464, loss: 0.005251705646514893 2023-01-22 15:47:10.834611: step: 674/464, loss: 0.1602022647857666 2023-01-22 15:47:11.515256: step: 676/464, loss: 0.007221348118036985 2023-01-22 15:47:12.218149: step: 678/464, loss: 0.03209817409515381 2023-01-22 15:47:12.902591: step: 680/464, loss: 0.06724380701780319 2023-01-22 15:47:13.658578: step: 682/464, loss: 0.0058819022960960865 2023-01-22 15:47:14.512496: step: 684/464, loss: 0.009005686268210411 2023-01-22 15:47:15.203101: step: 686/464, loss: 0.012575240805745125 2023-01-22 
15:47:15.900908: step: 688/464, loss: 0.008312860503792763 2023-01-22 15:47:16.733217: step: 690/464, loss: 0.08882316201925278 2023-01-22 15:47:17.618845: step: 692/464, loss: 0.011073197238147259 2023-01-22 15:47:18.286249: step: 694/464, loss: 0.07950153946876526 2023-01-22 15:47:18.888214: step: 696/464, loss: 0.011889828369021416 2023-01-22 15:47:19.651451: step: 698/464, loss: 0.23003815114498138 2023-01-22 15:47:20.372343: step: 700/464, loss: 0.00033969481592066586 2023-01-22 15:47:21.154198: step: 702/464, loss: 0.013609882444143295 2023-01-22 15:47:21.891821: step: 704/464, loss: 0.046654343605041504 2023-01-22 15:47:22.580789: step: 706/464, loss: 0.040340326726436615 2023-01-22 15:47:23.264635: step: 708/464, loss: 0.022635284811258316 2023-01-22 15:47:23.972002: step: 710/464, loss: 0.009701196104288101 2023-01-22 15:47:24.779948: step: 712/464, loss: 0.019521228969097137 2023-01-22 15:47:25.483658: step: 714/464, loss: 0.0022151279263198376 2023-01-22 15:47:26.264330: step: 716/464, loss: 0.008278511464595795 2023-01-22 15:47:26.966585: step: 718/464, loss: 0.025777986273169518 2023-01-22 15:47:27.699941: step: 720/464, loss: 0.01621449738740921 2023-01-22 15:47:28.465561: step: 722/464, loss: 0.004078149329870939 2023-01-22 15:47:29.149163: step: 724/464, loss: 0.0037733963690698147 2023-01-22 15:47:29.892796: step: 726/464, loss: 0.019777603447437286 2023-01-22 15:47:30.699208: step: 728/464, loss: 0.009691568091511726 2023-01-22 15:47:31.330160: step: 730/464, loss: 0.0017670301022008061 2023-01-22 15:47:32.033844: step: 732/464, loss: 0.0014202864840626717 2023-01-22 15:47:32.761849: step: 734/464, loss: 0.004550382494926453 2023-01-22 15:47:33.496931: step: 736/464, loss: 0.01193520799279213 2023-01-22 15:47:34.168822: step: 738/464, loss: 0.009102024137973785 2023-01-22 15:47:34.952701: step: 740/464, loss: 0.022574957460165024 2023-01-22 15:47:35.738560: step: 742/464, loss: 0.017012832686305046 2023-01-22 15:47:36.502056: step: 744/464, loss: 
0.07949044555425644 2023-01-22 15:47:37.247817: step: 746/464, loss: 0.03681956231594086 2023-01-22 15:47:37.992311: step: 748/464, loss: 0.0013021467020735145 2023-01-22 15:47:38.715315: step: 750/464, loss: 0.02038952335715294 2023-01-22 15:47:39.531719: step: 752/464, loss: 0.16530972719192505 2023-01-22 15:47:40.256869: step: 754/464, loss: 0.014058690518140793 2023-01-22 15:47:40.982595: step: 756/464, loss: 0.016745617613196373 2023-01-22 15:47:41.712888: step: 758/464, loss: 0.017155256122350693 2023-01-22 15:47:42.458689: step: 760/464, loss: 0.7648048400878906 2023-01-22 15:47:43.189106: step: 762/464, loss: 0.003873082809150219 2023-01-22 15:47:43.917573: step: 764/464, loss: 0.0006463755271397531 2023-01-22 15:47:44.812883: step: 766/464, loss: 0.014763910323381424 2023-01-22 15:47:45.547697: step: 768/464, loss: 0.02320656180381775 2023-01-22 15:47:46.281808: step: 770/464, loss: 0.04300972819328308 2023-01-22 15:47:46.981742: step: 772/464, loss: 0.012654591351747513 2023-01-22 15:47:47.784299: step: 774/464, loss: 0.004290216602385044 2023-01-22 15:47:48.599213: step: 776/464, loss: 0.007781681604683399 2023-01-22 15:47:49.335338: step: 778/464, loss: 0.006587921176105738 2023-01-22 15:47:50.108845: step: 780/464, loss: 0.0022355031687766314 2023-01-22 15:47:50.836982: step: 782/464, loss: 0.011383858509361744 2023-01-22 15:47:51.641132: step: 784/464, loss: 0.00821562111377716 2023-01-22 15:47:52.432823: step: 786/464, loss: 0.015217304229736328 2023-01-22 15:47:53.170142: step: 788/464, loss: 0.001672367099672556 2023-01-22 15:47:53.912822: step: 790/464, loss: 0.02738807536661625 2023-01-22 15:47:54.639945: step: 792/464, loss: 0.23105153441429138 2023-01-22 15:47:55.388787: step: 794/464, loss: 0.025443637743592262 2023-01-22 15:47:56.156882: step: 796/464, loss: 0.046076126396656036 2023-01-22 15:47:56.916006: step: 798/464, loss: 0.022370098158717155 2023-01-22 15:47:57.631289: step: 800/464, loss: 0.0020557662937790155 2023-01-22 
15:47:58.312329: step: 802/464, loss: 0.0009072026587091386 2023-01-22 15:47:59.025593: step: 804/464, loss: 0.002057021716609597 2023-01-22 15:47:59.769751: step: 806/464, loss: 0.05425307899713516 2023-01-22 15:48:00.529392: step: 808/464, loss: 0.082684725522995 2023-01-22 15:48:01.260923: step: 810/464, loss: 0.05836813524365425 2023-01-22 15:48:01.942959: step: 812/464, loss: 0.35637176036834717 2023-01-22 15:48:02.775411: step: 814/464, loss: 0.0005271787522360682 2023-01-22 15:48:03.461869: step: 816/464, loss: 0.0050505283288657665 2023-01-22 15:48:04.296513: step: 818/464, loss: 0.05915964022278786 2023-01-22 15:48:05.108956: step: 820/464, loss: 0.14169052243232727 2023-01-22 15:48:05.825452: step: 822/464, loss: 0.27492451667785645 2023-01-22 15:48:06.525927: step: 824/464, loss: 0.0031870307866483927 2023-01-22 15:48:07.245846: step: 826/464, loss: 0.04295702278614044 2023-01-22 15:48:07.964682: step: 828/464, loss: 0.28713443875312805 2023-01-22 15:48:08.694167: step: 830/464, loss: 0.00047704242751933634 2023-01-22 15:48:09.398199: step: 832/464, loss: 0.009593743830919266 2023-01-22 15:48:10.042844: step: 834/464, loss: 0.02768618054687977 2023-01-22 15:48:10.735811: step: 836/464, loss: 0.01568949967622757 2023-01-22 15:48:11.468084: step: 838/464, loss: 0.002845700131729245 2023-01-22 15:48:12.455312: step: 840/464, loss: 0.0011342305224388838 2023-01-22 15:48:13.164693: step: 842/464, loss: 0.0013543365057557821 2023-01-22 15:48:13.884973: step: 844/464, loss: 0.02245929092168808 2023-01-22 15:48:14.648784: step: 846/464, loss: 0.03169582411646843 2023-01-22 15:48:15.451164: step: 848/464, loss: 0.05935420095920563 2023-01-22 15:48:16.136307: step: 850/464, loss: 0.017007581889629364 2023-01-22 15:48:16.872812: step: 852/464, loss: 0.010829615406692028 2023-01-22 15:48:17.505708: step: 854/464, loss: 0.0015771217877045274 2023-01-22 15:48:18.173608: step: 856/464, loss: 0.05150710418820381 2023-01-22 15:48:18.887790: step: 858/464, loss: 
0.020031724125146866 2023-01-22 15:48:19.764148: step: 860/464, loss: 0.004909739829599857 2023-01-22 15:48:20.510396: step: 862/464, loss: 0.017233915627002716 2023-01-22 15:48:21.226753: step: 864/464, loss: 0.01338171400129795 2023-01-22 15:48:21.922392: step: 866/464, loss: 0.007492452394217253 2023-01-22 15:48:22.699665: step: 868/464, loss: 0.001295449328608811 2023-01-22 15:48:23.457210: step: 870/464, loss: 0.003779294900596142 2023-01-22 15:48:24.162065: step: 872/464, loss: 0.023648492991924286 2023-01-22 15:48:24.831479: step: 874/464, loss: 0.01128461305052042 2023-01-22 15:48:25.570001: step: 876/464, loss: 0.007469322998076677 2023-01-22 15:48:26.263436: step: 878/464, loss: 0.0035442670341581106 2023-01-22 15:48:27.036202: step: 880/464, loss: 0.024421758949756622 2023-01-22 15:48:27.820082: step: 882/464, loss: 0.0779152438044548 2023-01-22 15:48:28.681448: step: 884/464, loss: 0.03339240700006485 2023-01-22 15:48:29.386010: step: 886/464, loss: 4.4183776481077075e-05 2023-01-22 15:48:30.111382: step: 888/464, loss: 0.004640334751456976 2023-01-22 15:48:30.767042: step: 890/464, loss: 0.13504371047019958 2023-01-22 15:48:31.536164: step: 892/464, loss: 0.012566328048706055 2023-01-22 15:48:32.256783: step: 894/464, loss: 0.005399944726377726 2023-01-22 15:48:33.000246: step: 896/464, loss: 0.0016925518866628408 2023-01-22 15:48:33.747632: step: 898/464, loss: 0.007467552553862333 2023-01-22 15:48:34.550710: step: 900/464, loss: 0.15433648228645325 2023-01-22 15:48:35.355187: step: 902/464, loss: 0.024352390319108963 2023-01-22 15:48:36.183111: step: 904/464, loss: 0.009650146588683128 2023-01-22 15:48:36.904391: step: 906/464, loss: 0.03369938209652901 2023-01-22 15:48:37.663633: step: 908/464, loss: 0.001058822381310165 2023-01-22 15:48:38.401012: step: 910/464, loss: 0.010992859490215778 2023-01-22 15:48:39.139279: step: 912/464, loss: 0.0026332309935241938 2023-01-22 15:48:39.792561: step: 914/464, loss: 0.005801694467663765 2023-01-22 
15:48:40.595582: step: 916/464, loss: 0.017871178686618805 2023-01-22 15:48:41.325479: step: 918/464, loss: 0.04245564341545105 2023-01-22 15:48:42.252092: step: 920/464, loss: 0.26371610164642334 2023-01-22 15:48:43.015563: step: 922/464, loss: 0.027379389852285385 2023-01-22 15:48:43.827215: step: 924/464, loss: 0.025049660354852676 2023-01-22 15:48:44.596304: step: 926/464, loss: 0.015954000875353813 2023-01-22 15:48:45.295726: step: 928/464, loss: 0.0003321287513244897 2023-01-22 15:48:45.878752: step: 930/464, loss: 1.5560624888166785e-05 ================================================== Loss: 0.089 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3001277515723271, 'r': 0.3622035104364327, 'f1': 0.3282566638005159}, 'combined': 0.24187333122143273, 'epoch': 31} Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28824271626177206, 'r': 0.29023846207169535, 'f1': 0.28923714652980187}, 'combined': 0.17963149100271908, 'epoch': 31} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28312784522003037, 'r': 0.35404411764705884, 'f1': 0.314639544688027}, 'combined': 0.23183966450696727, 'epoch': 31} Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2829545153461047, 'r': 0.28883190883994075, 'f1': 0.28586300522484587}, 'combined': 0.17753597166595692, 'epoch': 31} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2998662350099933, 'r': 0.3596118795565765, 'f1': 0.3270327187684483}, 'combined': 0.2409714769872777, 'epoch': 31} Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2972606821036326, 'r': 0.29755412798922354, 'f1': 0.29740733266214453}, 'combined': 0.1847056066007003, 'epoch': 31} Sample 
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2897727272727273, 'r': 0.36428571428571427, 'f1': 0.3227848101265823}, 'combined': 0.2151898734177215, 'epoch': 31} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26973684210526316, 'r': 0.44565217391304346, 'f1': 0.3360655737704918}, 'combined': 0.1680327868852459, 'epoch': 31} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.40789473684210525, 'r': 0.2672413793103448, 'f1': 0.3229166666666667}, 'combined': 0.2152777777777778, 'epoch': 31} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12} Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23} Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 
0.20901639344262293, 'epoch': 23} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24} Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24} ****************************** Epoch: 32 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 15:51:26.990314: step: 2/464, loss: 0.006672237068414688 2023-01-22 15:51:27.755370: step: 4/464, loss: 0.00039798574289307 2023-01-22 15:51:28.417571: step: 6/464, loss: 0.005223429761826992 2023-01-22 15:51:29.097853: step: 8/464, loss: 0.0015860287239775062 2023-01-22 15:51:29.771983: step: 10/464, loss: 9.945368219632655e-05 2023-01-22 15:51:30.466709: step: 12/464, loss: 0.06839819252490997 2023-01-22 15:51:31.211845: step: 14/464, loss: 0.0004776528512593359 2023-01-22 15:51:31.901095: step: 16/464, loss: 0.003025172045454383 2023-01-22 15:51:32.741948: step: 18/464, loss: 0.07096771895885468 2023-01-22 15:51:33.510434: step: 20/464, loss: 0.0016611287137493491 2023-01-22 15:51:34.199945: step: 22/464, loss: 0.004521335940808058 2023-01-22 15:51:34.930826: step: 24/464, loss: 0.012017553672194481 2023-01-22 15:51:35.696871: step: 26/464, loss: 0.01229409035295248 2023-01-22 15:51:36.610552: step: 28/464, loss: 0.04657362774014473 2023-01-22 15:51:37.374982: step: 30/464, loss: 0.06783688068389893 
2023-01-22 15:51:38.027198: step: 32/464, loss: 0.0014619616558775306 2023-01-22 15:51:38.770339: step: 34/464, loss: 0.021388240158557892 2023-01-22 15:51:39.476058: step: 36/464, loss: 0.009178460575640202 2023-01-22 15:51:40.225355: step: 38/464, loss: 0.008435037918388844 2023-01-22 15:51:40.956754: step: 40/464, loss: 0.0066080521792173386 2023-01-22 15:51:41.678179: step: 42/464, loss: 0.0008859040099196136 2023-01-22 15:51:42.429722: step: 44/464, loss: 0.020066630095243454 2023-01-22 15:51:43.220540: step: 46/464, loss: 0.003385049058124423 2023-01-22 15:51:43.997801: step: 48/464, loss: 0.00659538246691227 2023-01-22 15:51:44.786867: step: 50/464, loss: 0.0007390569080598652 2023-01-22 15:51:45.505127: step: 52/464, loss: 0.03373732045292854 2023-01-22 15:51:46.254448: step: 54/464, loss: 0.09182209521532059 2023-01-22 15:51:46.980677: step: 56/464, loss: 0.007950812578201294 2023-01-22 15:51:47.666837: step: 58/464, loss: 0.03541431576013565 2023-01-22 15:51:48.329333: step: 60/464, loss: 1.0369291305541992 2023-01-22 15:51:49.062206: step: 62/464, loss: 0.008863094262778759 2023-01-22 15:51:49.788116: step: 64/464, loss: 0.06318230926990509 2023-01-22 15:51:50.645031: step: 66/464, loss: 0.0228151585906744 2023-01-22 15:51:51.361036: step: 68/464, loss: 0.0006504048360511661 2023-01-22 15:51:52.113667: step: 70/464, loss: 0.030889738351106644 2023-01-22 15:51:52.792359: step: 72/464, loss: 0.011693434789776802 2023-01-22 15:51:53.525313: step: 74/464, loss: 0.01864388957619667 2023-01-22 15:51:54.264527: step: 76/464, loss: 0.005030644591897726 2023-01-22 15:51:55.052138: step: 78/464, loss: 0.0004858130414504558 2023-01-22 15:51:55.703353: step: 80/464, loss: 0.7136760950088501 2023-01-22 15:51:56.506170: step: 82/464, loss: 0.13197855651378632 2023-01-22 15:51:57.267655: step: 84/464, loss: 0.08248142898082733 2023-01-22 15:51:57.984585: step: 86/464, loss: 0.07017794251441956 2023-01-22 15:51:58.712227: step: 88/464, loss: 0.0020354515872895718 
2023-01-22 15:51:59.495941: step: 90/464, loss: 0.015875235199928284 2023-01-22 15:52:00.328880: step: 92/464, loss: 0.05482667312026024 2023-01-22 15:52:01.026924: step: 94/464, loss: 0.005427805706858635 2023-01-22 15:52:01.693128: step: 96/464, loss: 0.03023233264684677 2023-01-22 15:52:02.394554: step: 98/464, loss: 0.005275893025100231 2023-01-22 15:52:03.274275: step: 100/464, loss: 0.7011404633522034 2023-01-22 15:52:04.054283: step: 102/464, loss: 0.0805116519331932 2023-01-22 15:52:04.853534: step: 104/464, loss: 0.0021324814297258854 2023-01-22 15:52:05.576341: step: 106/464, loss: 0.016958115622401237 2023-01-22 15:52:06.269445: step: 108/464, loss: 0.0028949484694749117 2023-01-22 15:52:07.010356: step: 110/464, loss: 0.01721436157822609 2023-01-22 15:52:07.771814: step: 112/464, loss: 0.001554527203552425 2023-01-22 15:52:08.481644: step: 114/464, loss: 0.0034987321123480797 2023-01-22 15:52:09.246653: step: 116/464, loss: 0.021231235936284065 2023-01-22 15:52:09.943814: step: 118/464, loss: 0.05176122859120369 2023-01-22 15:52:10.743087: step: 120/464, loss: 0.010055532678961754 2023-01-22 15:52:11.512282: step: 122/464, loss: 0.005222069099545479 2023-01-22 15:52:12.189421: step: 124/464, loss: 0.0016218442469835281 2023-01-22 15:52:12.976167: step: 126/464, loss: 0.0006480618030764163 2023-01-22 15:52:13.728106: step: 128/464, loss: 0.006303100846707821 2023-01-22 15:52:14.473374: step: 130/464, loss: 0.12113538384437561 2023-01-22 15:52:15.155018: step: 132/464, loss: 0.0008748953696340322 2023-01-22 15:52:15.880100: step: 134/464, loss: 0.03531118482351303 2023-01-22 15:52:16.683183: step: 136/464, loss: 0.015936831012368202 2023-01-22 15:52:17.383890: step: 138/464, loss: 0.05206011235713959 2023-01-22 15:52:18.154098: step: 140/464, loss: 0.052800193428993225 2023-01-22 15:52:18.867156: step: 142/464, loss: 0.019787365570664406 2023-01-22 15:52:19.533009: step: 144/464, loss: 0.0025979802012443542 2023-01-22 15:52:20.276923: step: 146/464, loss: 
0.006643963512033224 2023-01-22 15:52:20.949765: step: 148/464, loss: 0.001538788783363998 2023-01-22 15:52:21.718468: step: 150/464, loss: 0.007603897247463465 2023-01-22 15:52:22.488735: step: 152/464, loss: 0.023585248738527298 2023-01-22 15:52:23.268430: step: 154/464, loss: 0.019093260169029236 2023-01-22 15:52:23.934032: step: 156/464, loss: 0.005717174615710974 2023-01-22 15:52:24.625243: step: 158/464, loss: 0.009188220836222172 2023-01-22 15:52:25.343334: step: 160/464, loss: 0.0014259631279855967 2023-01-22 15:52:26.130330: step: 162/464, loss: 0.028234383091330528 2023-01-22 15:52:26.845249: step: 164/464, loss: 3.184197339578532e-05 2023-01-22 15:52:27.665022: step: 166/464, loss: 0.04168025776743889 2023-01-22 15:52:28.386968: step: 168/464, loss: 0.06091833859682083 2023-01-22 15:52:29.071105: step: 170/464, loss: 0.003078239969909191 2023-01-22 15:52:29.853486: step: 172/464, loss: 0.005105409771203995 2023-01-22 15:52:30.571104: step: 174/464, loss: 0.01845010742545128 2023-01-22 15:52:31.318772: step: 176/464, loss: 0.004629744682461023 2023-01-22 15:52:32.059842: step: 178/464, loss: 0.018534662202000618 2023-01-22 15:52:32.790776: step: 180/464, loss: 0.02453276515007019 2023-01-22 15:52:33.617509: step: 182/464, loss: 0.0069121792912483215 2023-01-22 15:52:34.318855: step: 184/464, loss: 0.012893215753138065 2023-01-22 15:52:35.061733: step: 186/464, loss: 0.007988216355443 2023-01-22 15:52:35.765283: step: 188/464, loss: 0.008411542512476444 2023-01-22 15:52:36.424869: step: 190/464, loss: 0.008808481507003307 2023-01-22 15:52:37.086275: step: 192/464, loss: 0.019918840378522873 2023-01-22 15:52:37.834088: step: 194/464, loss: 0.009231162257492542 2023-01-22 15:52:38.502887: step: 196/464, loss: 0.003246739273890853 2023-01-22 15:52:39.202191: step: 198/464, loss: 0.027443023398518562 2023-01-22 15:52:39.895901: step: 200/464, loss: 0.02603556402027607 2023-01-22 15:52:40.619869: step: 202/464, loss: 0.014729096554219723 2023-01-22 
15:52:41.309687: step: 204/464, loss: 0.02216481789946556 2023-01-22 15:52:42.092619: step: 206/464, loss: 0.017174361273646355 2023-01-22 15:52:42.810749: step: 208/464, loss: 0.017632676288485527 2023-01-22 15:52:43.525935: step: 210/464, loss: 0.000630052643828094 2023-01-22 15:52:44.232004: step: 212/464, loss: 0.0018898368580266833 2023-01-22 15:52:44.978949: step: 214/464, loss: 0.0033350172452628613 2023-01-22 15:52:45.803584: step: 216/464, loss: 0.014642687514424324 2023-01-22 15:52:46.507610: step: 218/464, loss: 0.0010271642822772264 2023-01-22 15:52:47.262180: step: 220/464, loss: 0.01610853709280491 2023-01-22 15:52:47.996841: step: 222/464, loss: 0.018197592347860336 2023-01-22 15:52:48.731107: step: 224/464, loss: 0.0016287442995235324 2023-01-22 15:52:49.425754: step: 226/464, loss: 0.001314380788244307 2023-01-22 15:52:50.163199: step: 228/464, loss: 0.004783174954354763 2023-01-22 15:52:50.890500: step: 230/464, loss: 0.0020701689645648003 2023-01-22 15:52:51.693709: step: 232/464, loss: 0.008220354095101357 2023-01-22 15:52:52.416123: step: 234/464, loss: 0.013796810060739517 2023-01-22 15:52:53.149133: step: 236/464, loss: 0.00972803309559822 2023-01-22 15:52:53.867745: step: 238/464, loss: 0.0003476463898550719 2023-01-22 15:52:54.659930: step: 240/464, loss: 0.029232390224933624 2023-01-22 15:52:55.330692: step: 242/464, loss: 0.007535560987889767 2023-01-22 15:52:56.033711: step: 244/464, loss: 0.038384776562452316 2023-01-22 15:52:56.719961: step: 246/464, loss: 0.00025465062935836613 2023-01-22 15:52:57.390031: step: 248/464, loss: 0.01523551158607006 2023-01-22 15:52:58.083970: step: 250/464, loss: 0.008116251789033413 2023-01-22 15:52:58.824416: step: 252/464, loss: 0.3224211037158966 2023-01-22 15:52:59.489130: step: 254/464, loss: 0.0007047428516671062 2023-01-22 15:53:00.217843: step: 256/464, loss: 0.024528512731194496 2023-01-22 15:53:00.933029: step: 258/464, loss: 0.10475929081439972 2023-01-22 15:53:01.600184: step: 260/464, loss: 
0.002466861391440034 2023-01-22 15:53:02.378350: step: 262/464, loss: 0.013222538866102695 2023-01-22 15:53:03.113239: step: 264/464, loss: 0.020091462880373 2023-01-22 15:53:03.875510: step: 266/464, loss: 0.008655213750898838 2023-01-22 15:53:04.545173: step: 268/464, loss: 0.4579889178276062 2023-01-22 15:53:05.354742: step: 270/464, loss: 0.01043748389929533 2023-01-22 15:53:06.097805: step: 272/464, loss: 0.002390442881733179 2023-01-22 15:53:06.875217: step: 274/464, loss: 0.01000374648720026 2023-01-22 15:53:07.597007: step: 276/464, loss: 0.07795630395412445 2023-01-22 15:53:08.296103: step: 278/464, loss: 0.13697943091392517 2023-01-22 15:53:09.064366: step: 280/464, loss: 0.025146078318357468 2023-01-22 15:53:09.795716: step: 282/464, loss: 0.015179144218564034 2023-01-22 15:53:10.461779: step: 284/464, loss: 0.0067111230455338955 2023-01-22 15:53:11.270082: step: 286/464, loss: 0.009315651841461658 2023-01-22 15:53:12.039412: step: 288/464, loss: 0.001091480371542275 2023-01-22 15:53:12.769561: step: 290/464, loss: 0.1570238173007965 2023-01-22 15:53:13.504078: step: 292/464, loss: 0.00019482392235659063 2023-01-22 15:53:14.234211: step: 294/464, loss: 0.017355773597955704 2023-01-22 15:53:14.859585: step: 296/464, loss: 0.009909744374454021 2023-01-22 15:53:15.634606: step: 298/464, loss: 0.11022976040840149 2023-01-22 15:53:16.478404: step: 300/464, loss: 0.032275114208459854 2023-01-22 15:53:17.209804: step: 302/464, loss: 0.09890097379684448 2023-01-22 15:53:17.949124: step: 304/464, loss: 0.030709289014339447 2023-01-22 15:53:18.717767: step: 306/464, loss: 0.011800462380051613 2023-01-22 15:53:19.475850: step: 308/464, loss: 0.024882722645998 2023-01-22 15:53:20.182756: step: 310/464, loss: 0.00911180954426527 2023-01-22 15:53:20.871928: step: 312/464, loss: 0.2190168797969818 2023-01-22 15:53:21.730433: step: 314/464, loss: 1.5261512994766235 2023-01-22 15:53:22.534842: step: 316/464, loss: 0.002229273086413741 2023-01-22 15:53:23.301490: step: 
318/464, loss: 0.025895819067955017 2023-01-22 15:53:23.998153: step: 320/464, loss: 0.0013171464670449495 2023-01-22 15:53:24.771031: step: 322/464, loss: 0.012127063237130642 2023-01-22 15:53:25.509256: step: 324/464, loss: 0.03169773146510124 2023-01-22 15:53:26.251695: step: 326/464, loss: 0.08925698697566986 2023-01-22 15:53:26.966658: step: 328/464, loss: 0.003148272167891264 2023-01-22 15:53:27.757577: step: 330/464, loss: 0.17901629209518433 2023-01-22 15:53:28.459660: step: 332/464, loss: 0.011668531224131584 2023-01-22 15:53:29.202283: step: 334/464, loss: 0.003586321836337447 2023-01-22 15:53:29.927137: step: 336/464, loss: 0.003102678107097745 2023-01-22 15:53:30.607324: step: 338/464, loss: 8.600712317274883e-05 2023-01-22 15:53:31.274992: step: 340/464, loss: 0.02279592677950859 2023-01-22 15:53:31.967196: step: 342/464, loss: 0.008252730593085289 2023-01-22 15:53:32.640421: step: 344/464, loss: 0.0008808693382889032 2023-01-22 15:53:33.369028: step: 346/464, loss: 0.0013259114930406213 2023-01-22 15:53:34.114784: step: 348/464, loss: 0.2572783827781677 2023-01-22 15:53:34.846173: step: 350/464, loss: 0.00201928592287004 2023-01-22 15:53:35.512497: step: 352/464, loss: 0.015110744163393974 2023-01-22 15:53:36.216420: step: 354/464, loss: 0.007167731411755085 2023-01-22 15:53:37.009380: step: 356/464, loss: 0.011743845418095589 2023-01-22 15:53:37.747933: step: 358/464, loss: 0.20097900927066803 2023-01-22 15:53:38.455512: step: 360/464, loss: 0.007075333036482334 2023-01-22 15:53:39.126593: step: 362/464, loss: 0.0007429586839862168 2023-01-22 15:53:39.844159: step: 364/464, loss: 0.0007310786750167608 2023-01-22 15:53:40.619855: step: 366/464, loss: 0.001109412987716496 2023-01-22 15:53:41.223902: step: 368/464, loss: 0.0006234599859453738 2023-01-22 15:53:41.939611: step: 370/464, loss: 0.0329633466899395 2023-01-22 15:53:42.588029: step: 372/464, loss: 0.00012382878048811108 2023-01-22 15:53:43.246835: step: 374/464, loss: 0.01997263915836811 
2023-01-22 15:53:44.056131: step: 376/464, loss: 0.0039342851378023624 2023-01-22 15:53:44.740213: step: 378/464, loss: 0.009925339370965958 2023-01-22 15:53:45.606920: step: 380/464, loss: 0.024046069011092186 2023-01-22 15:53:46.380136: step: 382/464, loss: 0.003556904848664999 2023-01-22 15:53:47.055315: step: 384/464, loss: 0.00487151462584734 2023-01-22 15:53:47.887915: step: 386/464, loss: 0.40563297271728516 2023-01-22 15:53:48.590242: step: 388/464, loss: 0.12907062470912933 2023-01-22 15:53:49.266832: step: 390/464, loss: 0.0023276321589946747 2023-01-22 15:53:50.036752: step: 392/464, loss: 0.027979984879493713 2023-01-22 15:53:50.758247: step: 394/464, loss: 0.00020647967176046222 2023-01-22 15:53:51.575367: step: 396/464, loss: 0.10996302962303162 2023-01-22 15:53:52.308312: step: 398/464, loss: 0.005133859347552061 2023-01-22 15:53:52.976364: step: 400/464, loss: 0.0011490934994071722 2023-01-22 15:53:53.838472: step: 402/464, loss: 0.04221979156136513 2023-01-22 15:53:54.644040: step: 404/464, loss: 0.021809222176671028 2023-01-22 15:53:55.429553: step: 406/464, loss: 7.231834888458252 2023-01-22 15:53:56.174481: step: 408/464, loss: 0.1555778831243515 2023-01-22 15:53:56.896067: step: 410/464, loss: 0.032972101122140884 2023-01-22 15:53:57.645582: step: 412/464, loss: 0.01077666599303484 2023-01-22 15:53:58.464447: step: 414/464, loss: 0.025415778160095215 2023-01-22 15:53:59.169464: step: 416/464, loss: 0.0023834318853914738 2023-01-22 15:53:59.975748: step: 418/464, loss: 0.020479848608374596 2023-01-22 15:54:00.651223: step: 420/464, loss: 0.02557503432035446 2023-01-22 15:54:01.353626: step: 422/464, loss: 0.005967243108898401 2023-01-22 15:54:02.131028: step: 424/464, loss: 0.007172387093305588 2023-01-22 15:54:02.898230: step: 426/464, loss: 0.029743533581495285 2023-01-22 15:54:03.604569: step: 428/464, loss: 0.019332105293869972 2023-01-22 15:54:04.374851: step: 430/464, loss: 5.191092895984184e-06 2023-01-22 15:54:05.152538: step: 432/464, 
loss: 0.06320548802614212 2023-01-22 15:54:05.873842: step: 434/464, loss: 0.008165508508682251 2023-01-22 15:54:06.536619: step: 436/464, loss: 0.0007337440620176494 2023-01-22 15:54:07.309579: step: 438/464, loss: 0.019695475697517395 2023-01-22 15:54:07.986707: step: 440/464, loss: 0.010529866442084312 2023-01-22 15:54:08.754915: step: 442/464, loss: 0.0012097518192604184 2023-01-22 15:54:09.465016: step: 444/464, loss: 0.039149560034275055 2023-01-22 15:54:10.147114: step: 446/464, loss: 0.00028673012275248766 2023-01-22 15:54:10.949293: step: 448/464, loss: 0.01631099171936512 2023-01-22 15:54:11.712522: step: 450/464, loss: 0.01560910977423191 2023-01-22 15:54:12.450515: step: 452/464, loss: 0.01941799744963646 2023-01-22 15:54:13.213748: step: 454/464, loss: 0.03485127538442612 2023-01-22 15:54:13.935412: step: 456/464, loss: 0.025263575837016106 2023-01-22 15:54:14.688524: step: 458/464, loss: 0.02511150948703289 2023-01-22 15:54:15.397393: step: 460/464, loss: 0.08032690733671188 2023-01-22 15:54:16.191029: step: 462/464, loss: 0.0008171582594513893 2023-01-22 15:54:16.892961: step: 464/464, loss: 0.001809613429941237 2023-01-22 15:54:17.642294: step: 466/464, loss: 0.0789065957069397 2023-01-22 15:54:18.359224: step: 468/464, loss: 0.10419817268848419 2023-01-22 15:54:19.114963: step: 470/464, loss: 0.009127458557486534 2023-01-22 15:54:19.805252: step: 472/464, loss: 0.13789282739162445 2023-01-22 15:54:20.510227: step: 474/464, loss: 0.01613314263522625 2023-01-22 15:54:21.274025: step: 476/464, loss: 0.0026753416750580072 2023-01-22 15:54:22.108103: step: 478/464, loss: 0.035527680069208145 2023-01-22 15:54:22.953058: step: 480/464, loss: 0.005595831666141748 2023-01-22 15:54:23.652861: step: 482/464, loss: 0.013181515969336033 2023-01-22 15:54:24.464417: step: 484/464, loss: 5.290010452270508 2023-01-22 15:54:25.158784: step: 486/464, loss: 0.00702305743470788 2023-01-22 15:54:25.833069: step: 488/464, loss: 0.014755690470337868 2023-01-22 
15:54:26.592183: step: 490/464, loss: 0.016292473301291466 2023-01-22 15:54:27.296465: step: 492/464, loss: 0.003482257016003132 2023-01-22 15:54:28.089564: step: 494/464, loss: 0.006003714632242918 2023-01-22 15:54:28.811015: step: 496/464, loss: 0.006600044202059507 2023-01-22 15:54:29.543567: step: 498/464, loss: 0.0008444880368188024 2023-01-22 15:54:30.187130: step: 500/464, loss: 0.003154363017529249 2023-01-22 15:54:30.948317: step: 502/464, loss: 0.006443787831813097 2023-01-22 15:54:31.703525: step: 504/464, loss: 0.01576988771557808 2023-01-22 15:54:32.472383: step: 506/464, loss: 0.058171436190605164 2023-01-22 15:54:33.154323: step: 508/464, loss: 0.06349492818117142 2023-01-22 15:54:33.893945: step: 510/464, loss: 0.0003439192078076303 2023-01-22 15:54:34.522982: step: 512/464, loss: 3.413219747017138e-05 2023-01-22 15:54:35.205600: step: 514/464, loss: 0.007753042504191399 2023-01-22 15:54:35.902619: step: 516/464, loss: 0.0002126079925801605 2023-01-22 15:54:36.649269: step: 518/464, loss: 0.22025060653686523 2023-01-22 15:54:37.446464: step: 520/464, loss: 0.03245438635349274 2023-01-22 15:54:38.202415: step: 522/464, loss: 0.00424537668004632 2023-01-22 15:54:38.953923: step: 524/464, loss: 0.22938542068004608 2023-01-22 15:54:39.696561: step: 526/464, loss: 0.0029693772085011005 2023-01-22 15:54:40.396497: step: 528/464, loss: 0.043420515954494476 2023-01-22 15:54:41.086303: step: 530/464, loss: 0.003392632585018873 2023-01-22 15:54:41.839509: step: 532/464, loss: 0.007097211200743914 2023-01-22 15:54:42.553168: step: 534/464, loss: 0.3268764019012451 2023-01-22 15:54:43.263265: step: 536/464, loss: 0.013606131076812744 2023-01-22 15:54:43.986967: step: 538/464, loss: 0.0022379527799785137 2023-01-22 15:54:44.708450: step: 540/464, loss: 0.010322199203073978 2023-01-22 15:54:45.412412: step: 542/464, loss: 0.143593430519104 2023-01-22 15:54:46.089558: step: 544/464, loss: 0.006676128134131432 2023-01-22 15:54:46.746634: step: 546/464, loss: 
0.10117863118648529 2023-01-22 15:54:47.492094: step: 548/464, loss: 0.0023868491407483816 2023-01-22 15:54:48.254438: step: 550/464, loss: 0.0019674180075526237 2023-01-22 15:54:48.952947: step: 552/464, loss: 0.0002450496540404856 2023-01-22 15:54:49.757231: step: 554/464, loss: 0.03220265731215477 2023-01-22 15:54:50.428956: step: 556/464, loss: 0.012878494337201118 2023-01-22 15:54:51.213288: step: 558/464, loss: 0.014794738963246346 2023-01-22 15:54:51.980514: step: 560/464, loss: 0.005773617420345545 2023-01-22 15:54:52.732444: step: 562/464, loss: 0.005836400203406811 2023-01-22 15:54:53.564339: step: 564/464, loss: 0.011275751516222954 2023-01-22 15:54:54.316446: step: 566/464, loss: 0.3081274628639221 2023-01-22 15:54:55.045381: step: 568/464, loss: 0.00132565398234874 2023-01-22 15:54:55.761803: step: 570/464, loss: 0.02459624595940113 2023-01-22 15:54:56.522094: step: 572/464, loss: 0.0017288029193878174 2023-01-22 15:54:57.294517: step: 574/464, loss: 0.029550323262810707 2023-01-22 15:54:58.105212: step: 576/464, loss: 0.0034652845934033394 2023-01-22 15:54:58.781776: step: 578/464, loss: 0.016499284654855728 2023-01-22 15:54:59.496427: step: 580/464, loss: 0.025307010859251022 2023-01-22 15:55:00.239538: step: 582/464, loss: 0.0004522954404819757 2023-01-22 15:55:01.046687: step: 584/464, loss: 0.04433143511414528 2023-01-22 15:55:01.887850: step: 586/464, loss: 0.029701635241508484 2023-01-22 15:55:02.614938: step: 588/464, loss: 0.0012609551195055246 2023-01-22 15:55:03.482008: step: 590/464, loss: 0.005977481137961149 2023-01-22 15:55:04.196405: step: 592/464, loss: 0.022268379107117653 2023-01-22 15:55:04.937240: step: 594/464, loss: 0.007316329050809145 2023-01-22 15:55:05.654379: step: 596/464, loss: 0.0010635869111865759 2023-01-22 15:55:06.312936: step: 598/464, loss: 0.004664260894060135 2023-01-22 15:55:07.030864: step: 600/464, loss: 0.013532687909901142 2023-01-22 15:55:07.790947: step: 602/464, loss: 0.09427149593830109 2023-01-22 
15:55:08.547122: step: 604/464, loss: 0.05134844407439232 2023-01-22 15:55:09.256745: step: 606/464, loss: 9.73513160715811e-05 2023-01-22 15:55:10.009004: step: 608/464, loss: 0.20104339718818665 2023-01-22 15:55:10.705693: step: 610/464, loss: 9.715192794799805 2023-01-22 15:55:11.380953: step: 612/464, loss: 0.5705389976501465 2023-01-22 15:55:12.139493: step: 614/464, loss: 0.01592349261045456 2023-01-22 15:55:12.857525: step: 616/464, loss: 0.04646773636341095 2023-01-22 15:55:13.540276: step: 618/464, loss: 0.11873049288988113 2023-01-22 15:55:14.226147: step: 620/464, loss: 0.24818703532218933 2023-01-22 15:55:14.935749: step: 622/464, loss: 0.09224333614110947 2023-01-22 15:55:15.709664: step: 624/464, loss: 0.052292946726083755 2023-01-22 15:55:16.485938: step: 626/464, loss: 0.03228820115327835 2023-01-22 15:55:17.258830: step: 628/464, loss: 0.004404714331030846 2023-01-22 15:55:17.987970: step: 630/464, loss: 0.32245051860809326 2023-01-22 15:55:18.675220: step: 632/464, loss: 0.016344305127859116 2023-01-22 15:55:19.431978: step: 634/464, loss: 0.0008270391263067722 2023-01-22 15:55:20.106201: step: 636/464, loss: 0.002304938854649663 2023-01-22 15:55:20.823060: step: 638/464, loss: 0.04042618349194527 2023-01-22 15:55:21.657363: step: 640/464, loss: 0.04226240888237953 2023-01-22 15:55:22.380896: step: 642/464, loss: 0.004248725716024637 2023-01-22 15:55:23.150700: step: 644/464, loss: 0.003116982989013195 2023-01-22 15:55:24.020345: step: 646/464, loss: 0.001212684321217239 2023-01-22 15:55:24.827325: step: 648/464, loss: 0.010079382918775082 2023-01-22 15:55:25.507078: step: 650/464, loss: 0.022457556799054146 2023-01-22 15:55:26.153147: step: 652/464, loss: 0.00903982575982809 2023-01-22 15:55:26.905843: step: 654/464, loss: 0.0014533146750181913 2023-01-22 15:55:27.591538: step: 656/464, loss: 0.07138672471046448 2023-01-22 15:55:28.288596: step: 658/464, loss: 0.00844450481235981 2023-01-22 15:55:29.036176: step: 660/464, loss: 
0.010615388862788677 2023-01-22 15:55:29.747644: step: 662/464, loss: 0.1854914128780365 2023-01-22 15:55:30.423950: step: 664/464, loss: 0.015586788766086102 2023-01-22 15:55:31.136353: step: 666/464, loss: 0.00446612574160099 2023-01-22 15:55:31.834175: step: 668/464, loss: 9.343422425445169e-05 2023-01-22 15:55:32.631296: step: 670/464, loss: 0.0018104618648067117 2023-01-22 15:55:33.491836: step: 672/464, loss: 0.03854462876915932 2023-01-22 15:55:34.198817: step: 674/464, loss: 0.009988157078623772 2023-01-22 15:55:34.990124: step: 676/464, loss: 0.054880864918231964 2023-01-22 15:55:35.686906: step: 678/464, loss: 0.09408848732709885 2023-01-22 15:55:36.431567: step: 680/464, loss: 0.0537644661962986 2023-01-22 15:55:37.280848: step: 682/464, loss: 0.005208852235227823 2023-01-22 15:55:38.033712: step: 684/464, loss: 0.002501447219401598 2023-01-22 15:55:38.809196: step: 686/464, loss: 0.05998269096016884 2023-01-22 15:55:39.567666: step: 688/464, loss: 0.15073977410793304 2023-01-22 15:55:40.308659: step: 690/464, loss: 0.001690528355538845 2023-01-22 15:55:41.046769: step: 692/464, loss: 0.0025427769869565964 2023-01-22 15:55:41.867525: step: 694/464, loss: 0.035794906318187714 2023-01-22 15:55:42.599145: step: 696/464, loss: 0.0347069650888443 2023-01-22 15:55:43.337530: step: 698/464, loss: 0.04867216572165489 2023-01-22 15:55:44.056712: step: 700/464, loss: 0.028771717101335526 2023-01-22 15:55:44.800261: step: 702/464, loss: 0.017442874610424042 2023-01-22 15:55:45.553486: step: 704/464, loss: 0.05742257088422775 2023-01-22 15:55:46.318296: step: 706/464, loss: 0.02653212659060955 2023-01-22 15:55:47.091171: step: 708/464, loss: 0.0015883547021076083 2023-01-22 15:55:47.799252: step: 710/464, loss: 0.012720917351543903 2023-01-22 15:55:48.541957: step: 712/464, loss: 0.0412566177546978 2023-01-22 15:55:49.282953: step: 714/464, loss: 0.00924015324562788 2023-01-22 15:55:50.096131: step: 716/464, loss: 0.0015413612127304077 2023-01-22 15:55:50.822001: 
step: 718/464, loss: 0.17302869260311127 2023-01-22 15:55:51.535492: step: 720/464, loss: 0.2881391644477844 2023-01-22 15:55:52.329003: step: 722/464, loss: 0.007352608256042004 2023-01-22 15:55:53.141986: step: 724/464, loss: 0.011552570387721062 2023-01-22 15:55:53.914159: step: 726/464, loss: 0.03673321753740311 2023-01-22 15:55:54.678362: step: 728/464, loss: 0.002769971964880824 2023-01-22 15:55:55.500658: step: 730/464, loss: 0.0043175918981432915 2023-01-22 15:55:56.186472: step: 732/464, loss: 0.02598220482468605 2023-01-22 15:55:57.031192: step: 734/464, loss: 0.0896417498588562 2023-01-22 15:55:57.811986: step: 736/464, loss: 0.07240792363882065 2023-01-22 15:55:58.588924: step: 738/464, loss: 0.004663995932787657 2023-01-22 15:55:59.318219: step: 740/464, loss: 0.007267426233738661 2023-01-22 15:56:00.084417: step: 742/464, loss: 0.026949474588036537 2023-01-22 15:56:00.732574: step: 744/464, loss: 0.02133816108107567 2023-01-22 15:56:01.481779: step: 746/464, loss: 0.0009054642869159579 2023-01-22 15:56:02.276062: step: 748/464, loss: 0.027352290228009224 2023-01-22 15:56:03.012397: step: 750/464, loss: 0.031512752175331116 2023-01-22 15:56:03.719993: step: 752/464, loss: 0.001044303411617875 2023-01-22 15:56:04.474474: step: 754/464, loss: 0.0032093217596411705 2023-01-22 15:56:05.299377: step: 756/464, loss: 0.0123143857344985 2023-01-22 15:56:05.997881: step: 758/464, loss: 0.0051696086302399635 2023-01-22 15:56:06.691343: step: 760/464, loss: 0.04265246167778969 2023-01-22 15:56:07.416337: step: 762/464, loss: 0.03394169732928276 2023-01-22 15:56:08.158935: step: 764/464, loss: 0.009011897258460522 2023-01-22 15:56:08.859861: step: 766/464, loss: 0.009601665660738945 2023-01-22 15:56:09.600976: step: 768/464, loss: 0.001267896848730743 2023-01-22 15:56:10.382133: step: 770/464, loss: 0.002758424961939454 2023-01-22 15:56:11.127466: step: 772/464, loss: 0.000714236288331449 2023-01-22 15:56:12.048558: step: 774/464, loss: 0.01285460963845253 
2023-01-22 15:56:12.838371: step: 776/464, loss: 0.03342447802424431 2023-01-22 15:56:13.489360: step: 778/464, loss: 0.013974891044199467 2023-01-22 15:56:14.198880: step: 780/464, loss: 0.021319733932614326 2023-01-22 15:56:14.931382: step: 782/464, loss: 0.0034458094742149115 2023-01-22 15:56:15.667856: step: 784/464, loss: 0.005771171301603317 2023-01-22 15:56:16.432775: step: 786/464, loss: 0.1669687032699585 2023-01-22 15:56:17.121318: step: 788/464, loss: 0.01793059892952442 2023-01-22 15:56:17.880434: step: 790/464, loss: 0.1300530731678009 2023-01-22 15:56:18.658008: step: 792/464, loss: 0.02867552451789379 2023-01-22 15:56:19.419409: step: 794/464, loss: 0.007103492971509695 2023-01-22 15:56:20.093122: step: 796/464, loss: 0.00393635593354702 2023-01-22 15:56:20.719677: step: 798/464, loss: 0.0006900137523189187 2023-01-22 15:56:21.436734: step: 800/464, loss: 0.007099061738699675 2023-01-22 15:56:22.232490: step: 802/464, loss: 0.09573821723461151 2023-01-22 15:56:22.958737: step: 804/464, loss: 0.05650990083813667 2023-01-22 15:56:23.630500: step: 806/464, loss: 0.01625436171889305 2023-01-22 15:56:24.399573: step: 808/464, loss: 0.006412264425307512 2023-01-22 15:56:25.114925: step: 810/464, loss: 0.011279560625553131 2023-01-22 15:56:25.922502: step: 812/464, loss: 0.004043412860482931 2023-01-22 15:56:26.645416: step: 814/464, loss: 0.022079112008213997 2023-01-22 15:56:27.339512: step: 816/464, loss: 0.024804426357150078 2023-01-22 15:56:27.999792: step: 818/464, loss: 0.0019032611744478345 2023-01-22 15:56:28.762118: step: 820/464, loss: 0.01402607373893261 2023-01-22 15:56:29.496612: step: 822/464, loss: 0.010601145215332508 2023-01-22 15:56:30.185300: step: 824/464, loss: 0.005030548200011253 2023-01-22 15:56:30.977792: step: 826/464, loss: 0.01895087957382202 2023-01-22 15:56:31.749869: step: 828/464, loss: 0.016737395897507668 2023-01-22 15:56:32.546210: step: 830/464, loss: 0.009662316180765629 2023-01-22 15:56:33.363308: step: 832/464, loss: 
0.13119490444660187 2023-01-22 15:56:34.143568: step: 834/464, loss: 0.03237944841384888 2023-01-22 15:56:34.831154: step: 836/464, loss: 0.07424513250589371 2023-01-22 15:56:35.601403: step: 838/464, loss: 0.00170231016818434 2023-01-22 15:56:36.365377: step: 840/464, loss: 0.024623574689030647 2023-01-22 15:56:37.096749: step: 842/464, loss: 0.005516501143574715 2023-01-22 15:56:37.838047: step: 844/464, loss: 0.00010877756722038612 2023-01-22 15:56:38.557794: step: 846/464, loss: 0.004584172740578651 2023-01-22 15:56:39.276893: step: 848/464, loss: 0.03591621667146683 2023-01-22 15:56:39.959525: step: 850/464, loss: 0.046530336141586304 2023-01-22 15:56:40.702225: step: 852/464, loss: 0.005217722151428461 2023-01-22 15:56:41.410633: step: 854/464, loss: 0.0305581483989954 2023-01-22 15:56:42.209982: step: 856/464, loss: 0.0009626333485357463 2023-01-22 15:56:42.958290: step: 858/464, loss: 0.05425672233104706 2023-01-22 15:56:43.674135: step: 860/464, loss: 0.0038780109025537968 2023-01-22 15:56:44.353183: step: 862/464, loss: 0.02388712950050831 2023-01-22 15:56:45.099530: step: 864/464, loss: 0.033917948603630066 2023-01-22 15:56:45.797342: step: 866/464, loss: 0.00023149006301537156 2023-01-22 15:56:46.654553: step: 868/464, loss: 0.039608728140592575 2023-01-22 15:56:47.436294: step: 870/464, loss: 0.015074286609888077 2023-01-22 15:56:48.111836: step: 872/464, loss: 0.0005526712047867477 2023-01-22 15:56:48.852740: step: 874/464, loss: 0.0229922104626894 2023-01-22 15:56:49.592302: step: 876/464, loss: 0.008480170741677284 2023-01-22 15:56:50.402435: step: 878/464, loss: 0.0006979976897127926 2023-01-22 15:56:51.097038: step: 880/464, loss: 0.03483644500374794 2023-01-22 15:56:51.822479: step: 882/464, loss: 0.1716504991054535 2023-01-22 15:56:52.488992: step: 884/464, loss: 0.0076842340640723705 2023-01-22 15:56:53.246514: step: 886/464, loss: 0.02543296478688717 2023-01-22 15:56:54.036985: step: 888/464, loss: 0.0012912801466882229 2023-01-22 
15:56:54.749056: step: 890/464, loss: 0.10062738507986069
2023-01-22 15:56:55.395704: step: 892/464, loss: 0.002663268242031336
2023-01-22 15:56:56.083736: step: 894/464, loss: 0.0084781963378191
2023-01-22 15:56:56.844557: step: 896/464, loss: 0.013621035031974316
2023-01-22 15:56:57.572464: step: 898/464, loss: 0.005025189369916916
2023-01-22 15:56:58.249051: step: 900/464, loss: 0.6844657063484192
2023-01-22 15:56:58.959253: step: 902/464, loss: 0.05140011012554169
2023-01-22 15:56:59.781085: step: 904/464, loss: 0.2284938097000122
2023-01-22 15:57:00.669493: step: 906/464, loss: 0.033565670251846313
2023-01-22 15:57:01.408554: step: 908/464, loss: 0.011310895904898643
2023-01-22 15:57:02.134947: step: 910/464, loss: 0.01956539787352085
2023-01-22 15:57:02.870301: step: 912/464, loss: 0.0008775049936957657
2023-01-22 15:57:03.649034: step: 914/464, loss: 0.0022613483015447855
2023-01-22 15:57:04.480205: step: 916/464, loss: 0.02137225866317749
2023-01-22 15:57:05.145988: step: 918/464, loss: 0.012067358009517193
2023-01-22 15:57:05.923422: step: 920/464, loss: 0.04696095362305641
2023-01-22 15:57:06.687300: step: 922/464, loss: 0.020845679566264153
2023-01-22 15:57:07.447522: step: 924/464, loss: 0.023146284744143486
2023-01-22 15:57:08.205149: step: 926/464, loss: 0.004508962854743004
2023-01-22 15:57:08.939587: step: 928/464, loss: 0.003973621409386396
2023-01-22 15:57:09.630877: step: 930/464, loss: 0.006215892732143402
==================================================
Loss: 0.091
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31191196380524755, 'r': 0.3563017119748748, 'f1': 0.33263242198540127}, 'combined': 0.2450975740945062, 'epoch': 32}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30615020061258347, 'r': 0.28586131491422234, 'f1': 0.2956580965506689}, 'combined': 0.18361923891041543, 'epoch': 32}
Dev Korean:
{'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28580708661417326, 'r': 0.34437855787476285, 'f1': 0.3123709122203098}, 'combined': 0.23016804058338616, 'epoch': 32}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2988013515182189, 'r': 0.28551473015624274, 'f1': 0.29200698021032606}, 'combined': 0.18135170349904461, 'epoch': 32}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31099782550812105, 'r': 0.3499463197842804, 'f1': 0.3293244830827068}, 'combined': 0.24266014542936287, 'epoch': 32}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31897212870203995, 'r': 0.294678504656484, 'f1': 0.3063444403164065}, 'combined': 0.19025602082808404, 'epoch': 32}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28488372093023256, 'r': 0.35, 'f1': 0.31410256410256415}, 'combined': 0.20940170940170943, 'epoch': 32}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27976190476190477, 'r': 0.5108695652173914, 'f1': 0.3615384615384616}, 'combined': 0.1807692307692308, 'epoch': 32}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3352272727272727, 'r': 0.2543103448275862, 'f1': 0.2892156862745098}, 'combined': 0.19281045751633985, 'epoch': 32}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693},
'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 33
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:59:48.285318: step: 2/464, loss: 0.00039098167326301336
2023-01-22 15:59:48.957106: step: 4/464, loss: 0.0019515041494742036
2023-01-22
15:59:49.652395: step: 6/464, loss: 0.00439158221706748 2023-01-22 15:59:50.294726: step: 8/464, loss: 0.001543243182823062 2023-01-22 15:59:51.069328: step: 10/464, loss: 0.012657595798373222 2023-01-22 15:59:51.810759: step: 12/464, loss: 0.012029886245727539 2023-01-22 15:59:52.517185: step: 14/464, loss: 0.0003889836370944977 2023-01-22 15:59:53.325581: step: 16/464, loss: 0.03859444707632065 2023-01-22 15:59:54.037645: step: 18/464, loss: 0.005608262028545141 2023-01-22 15:59:54.823289: step: 20/464, loss: 0.029072171077132225 2023-01-22 15:59:55.668857: step: 22/464, loss: 0.002594446297734976 2023-01-22 15:59:56.444021: step: 24/464, loss: 0.03561032935976982 2023-01-22 15:59:57.139613: step: 26/464, loss: 0.008360390551388264 2023-01-22 15:59:57.886926: step: 28/464, loss: 0.02092169225215912 2023-01-22 15:59:58.577466: step: 30/464, loss: 0.01666991412639618 2023-01-22 15:59:59.305488: step: 32/464, loss: 0.0022188760340213776 2023-01-22 16:00:00.052998: step: 34/464, loss: 0.03185074031352997 2023-01-22 16:00:00.816927: step: 36/464, loss: 0.027600795030593872 2023-01-22 16:00:01.508620: step: 38/464, loss: 0.16664382815361023 2023-01-22 16:00:02.232738: step: 40/464, loss: 0.0018672705627977848 2023-01-22 16:00:02.926338: step: 42/464, loss: 0.002315365942195058 2023-01-22 16:00:03.687477: step: 44/464, loss: 0.0025608143769204617 2023-01-22 16:00:04.362050: step: 46/464, loss: 0.025399038568139076 2023-01-22 16:00:05.143541: step: 48/464, loss: 0.008031615987420082 2023-01-22 16:00:05.905539: step: 50/464, loss: 0.13993428647518158 2023-01-22 16:00:06.570576: step: 52/464, loss: 0.02097419835627079 2023-01-22 16:00:07.374108: step: 54/464, loss: 0.00789231713861227 2023-01-22 16:00:08.056856: step: 56/464, loss: 0.0020788833498954773 2023-01-22 16:00:08.802173: step: 58/464, loss: 0.0011773493606597185 2023-01-22 16:00:09.511752: step: 60/464, loss: 0.0006447955383919179 2023-01-22 16:00:10.250352: step: 62/464, loss: 0.010896679945290089 2023-01-22 
16:00:11.045045: step: 64/464, loss: 0.005736788734793663 2023-01-22 16:00:11.880099: step: 66/464, loss: 0.0026948293671011925 2023-01-22 16:00:12.634300: step: 68/464, loss: 0.03583534061908722 2023-01-22 16:00:13.375687: step: 70/464, loss: 0.0006530498503707349 2023-01-22 16:00:14.161591: step: 72/464, loss: 0.1562230885028839 2023-01-22 16:00:14.856131: step: 74/464, loss: 0.0003401939757168293 2023-01-22 16:00:15.580714: step: 76/464, loss: 0.015257839113473892 2023-01-22 16:00:16.318249: step: 78/464, loss: 0.06005620211362839 2023-01-22 16:00:17.018484: step: 80/464, loss: 0.003746387083083391 2023-01-22 16:00:17.725304: step: 82/464, loss: 0.015048716217279434 2023-01-22 16:00:18.446369: step: 84/464, loss: 0.003595659276470542 2023-01-22 16:00:19.092280: step: 86/464, loss: 0.008061950094997883 2023-01-22 16:00:19.898990: step: 88/464, loss: 0.0037724985741078854 2023-01-22 16:00:20.590517: step: 90/464, loss: 0.00019585739937610924 2023-01-22 16:00:21.565952: step: 92/464, loss: 0.0028360611759126186 2023-01-22 16:00:22.311893: step: 94/464, loss: 0.0010991257149726152 2023-01-22 16:00:23.093634: step: 96/464, loss: 0.01256584469228983 2023-01-22 16:00:23.864056: step: 98/464, loss: 0.012041222304105759 2023-01-22 16:00:24.539765: step: 100/464, loss: 0.23359690606594086 2023-01-22 16:00:25.259319: step: 102/464, loss: 0.0024997605942189693 2023-01-22 16:00:26.034504: step: 104/464, loss: 0.005596070550382137 2023-01-22 16:00:26.782761: step: 106/464, loss: 0.026922399178147316 2023-01-22 16:00:27.497931: step: 108/464, loss: 0.00506148487329483 2023-01-22 16:00:28.344446: step: 110/464, loss: 0.09594239294528961 2023-01-22 16:00:29.047434: step: 112/464, loss: 0.017567602917551994 2023-01-22 16:00:29.737120: step: 114/464, loss: 0.04349970072507858 2023-01-22 16:00:30.401914: step: 116/464, loss: 0.0021890331991016865 2023-01-22 16:00:31.146208: step: 118/464, loss: 0.005069403443485498 2023-01-22 16:00:31.928278: step: 120/464, loss: 
0.08780425041913986 2023-01-22 16:00:32.706204: step: 122/464, loss: 0.2537309229373932 2023-01-22 16:00:33.413013: step: 124/464, loss: 2.541379690170288 2023-01-22 16:00:34.092592: step: 126/464, loss: 0.01147314440459013 2023-01-22 16:00:34.889306: step: 128/464, loss: 0.017582839354872704 2023-01-22 16:00:35.725751: step: 130/464, loss: 0.1737164407968521 2023-01-22 16:00:36.464259: step: 132/464, loss: 0.002044209511950612 2023-01-22 16:00:37.144728: step: 134/464, loss: 0.0007453494472429156 2023-01-22 16:00:37.878635: step: 136/464, loss: 0.0845332145690918 2023-01-22 16:00:38.646276: step: 138/464, loss: 0.04407544061541557 2023-01-22 16:00:39.388684: step: 140/464, loss: 0.09098831564188004 2023-01-22 16:00:40.120734: step: 142/464, loss: 0.011893196031451225 2023-01-22 16:00:40.975506: step: 144/464, loss: 0.009819947183132172 2023-01-22 16:00:41.740567: step: 146/464, loss: 0.009369265288114548 2023-01-22 16:00:42.416969: step: 148/464, loss: 0.008498580195009708 2023-01-22 16:00:43.078912: step: 150/464, loss: 0.0017171023646369576 2023-01-22 16:00:43.774575: step: 152/464, loss: 0.037098441272974014 2023-01-22 16:00:44.481014: step: 154/464, loss: 0.005075047258287668 2023-01-22 16:00:45.180108: step: 156/464, loss: 0.014594484120607376 2023-01-22 16:00:45.923489: step: 158/464, loss: 0.0008824823307804763 2023-01-22 16:00:46.606301: step: 160/464, loss: 1.8132606744766235 2023-01-22 16:00:47.337990: step: 162/464, loss: 0.002179122529923916 2023-01-22 16:00:47.977012: step: 164/464, loss: 0.033111296594142914 2023-01-22 16:00:48.699770: step: 166/464, loss: 0.008549502119421959 2023-01-22 16:00:49.367377: step: 168/464, loss: 0.011830088682472706 2023-01-22 16:00:50.063890: step: 170/464, loss: 0.0008865146664902568 2023-01-22 16:00:50.737995: step: 172/464, loss: 0.0002516806125640869 2023-01-22 16:00:51.429539: step: 174/464, loss: 0.0003081305476371199 2023-01-22 16:00:52.157622: step: 176/464, loss: 0.03597329556941986 2023-01-22 16:00:52.910377: 
step: 178/464, loss: 0.08819016069173813 2023-01-22 16:00:53.682062: step: 180/464, loss: 0.023465489968657494 2023-01-22 16:00:54.471025: step: 182/464, loss: 0.010968365706503391 2023-01-22 16:00:55.122378: step: 184/464, loss: 0.3380451500415802 2023-01-22 16:00:55.821169: step: 186/464, loss: 0.0003763749555218965 2023-01-22 16:00:56.475798: step: 188/464, loss: 0.011835026554763317 2023-01-22 16:00:57.383898: step: 190/464, loss: 0.0027454986702650785 2023-01-22 16:00:58.117041: step: 192/464, loss: 0.055424366146326065 2023-01-22 16:00:58.844974: step: 194/464, loss: 0.03973700851202011 2023-01-22 16:00:59.602141: step: 196/464, loss: 0.023937562480568886 2023-01-22 16:01:00.285973: step: 198/464, loss: 0.006588765420019627 2023-01-22 16:01:00.954977: step: 200/464, loss: 0.04747241362929344 2023-01-22 16:01:01.620344: step: 202/464, loss: 0.04055408015847206 2023-01-22 16:01:02.395877: step: 204/464, loss: 0.010046295821666718 2023-01-22 16:01:03.105263: step: 206/464, loss: 0.005254621617496014 2023-01-22 16:01:03.788748: step: 208/464, loss: 0.06961327791213989 2023-01-22 16:01:04.564571: step: 210/464, loss: 2.223823503300082e-05 2023-01-22 16:01:05.300018: step: 212/464, loss: 0.008146322332322598 2023-01-22 16:01:06.000237: step: 214/464, loss: 0.37217724323272705 2023-01-22 16:01:06.621813: step: 216/464, loss: 0.005295636132359505 2023-01-22 16:01:07.281150: step: 218/464, loss: 0.012367482297122478 2023-01-22 16:01:08.055236: step: 220/464, loss: 0.047387026250362396 2023-01-22 16:01:08.765099: step: 222/464, loss: 0.01395697146654129 2023-01-22 16:01:09.456492: step: 224/464, loss: 0.014371508732438087 2023-01-22 16:01:10.138427: step: 226/464, loss: 0.012212062254548073 2023-01-22 16:01:10.844672: step: 228/464, loss: 0.1137467548251152 2023-01-22 16:01:11.579300: step: 230/464, loss: 0.028719283640384674 2023-01-22 16:01:12.355209: step: 232/464, loss: 0.16531309485435486 2023-01-22 16:01:13.080772: step: 234/464, loss: 0.00018524908227846026 
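Note on the evaluation summaries in this log: each per-language entry reports a 'template' F1, a 'slot' F1, and a 'combined' score. The logged numbers are consistent with F1 being the usual harmonic mean of precision and recall, and 'combined' being the product of the template and slot F1 scores. The sketch below is an assumed reconstruction (not the actual train.py code), checked against the epoch-32 "Dev Chinese" entry above:

```python
def f1(p: float, r: float) -> float:
    # Standard F1: harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Precision/recall values copied from the epoch-32 "Dev Chinese" entry.
template = {"p": 1.0, "r": 0.5833333333333334}
slot = {"p": 0.31191196380524755, "r": 0.3563017119748748}

template_f1 = f1(template["p"], template["r"])  # ~0.7368, matches logged template f1
slot_f1 = f1(slot["p"], slot["r"])              # ~0.3326, matches logged slot f1
combined = template_f1 * slot_f1                # ~0.2451, matches logged 'combined'
```

The same product relation holds for every Dev/Test/Sample entry in the summaries, which is why 'combined' drops sharply whenever template recall falls.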
2023-01-22 16:01:13.840199: step: 236/464, loss: 0.008265960030257702 2023-01-22 16:01:14.688156: step: 238/464, loss: 0.015571714378893375 2023-01-22 16:01:15.410114: step: 240/464, loss: 0.09191140532493591 2023-01-22 16:01:16.107413: step: 242/464, loss: 0.01177075132727623 2023-01-22 16:01:16.808917: step: 244/464, loss: 0.005926132667809725 2023-01-22 16:01:17.434786: step: 246/464, loss: 0.27264249324798584 2023-01-22 16:01:18.126250: step: 248/464, loss: 0.09546758979558945 2023-01-22 16:01:18.937568: step: 250/464, loss: 0.01481780968606472 2023-01-22 16:01:19.685590: step: 252/464, loss: 0.17518430948257446 2023-01-22 16:01:20.544705: step: 254/464, loss: 0.09268547594547272 2023-01-22 16:01:21.282548: step: 256/464, loss: 0.04955311119556427 2023-01-22 16:01:21.975781: step: 258/464, loss: 0.008020931854844093 2023-01-22 16:01:22.695082: step: 260/464, loss: 0.0050893924199044704 2023-01-22 16:01:23.402758: step: 262/464, loss: 0.0004801676550414413 2023-01-22 16:01:24.090460: step: 264/464, loss: 0.012764999642968178 2023-01-22 16:01:24.828179: step: 266/464, loss: 0.008500345051288605 2023-01-22 16:01:25.675616: step: 268/464, loss: 0.007304369006305933 2023-01-22 16:01:26.407270: step: 270/464, loss: 0.03284875303506851 2023-01-22 16:01:27.084874: step: 272/464, loss: 0.007013040129095316 2023-01-22 16:01:27.886383: step: 274/464, loss: 0.029180755838751793 2023-01-22 16:01:28.573664: step: 276/464, loss: 0.011627051047980785 2023-01-22 16:01:29.253367: step: 278/464, loss: 0.0007608159794472158 2023-01-22 16:01:29.891212: step: 280/464, loss: 0.004587231669574976 2023-01-22 16:01:30.599307: step: 282/464, loss: 0.002692803042009473 2023-01-22 16:01:31.443441: step: 284/464, loss: 0.008794981054961681 2023-01-22 16:01:32.136506: step: 286/464, loss: 0.00858909823000431 2023-01-22 16:01:33.011741: step: 288/464, loss: 0.0035707568749785423 2023-01-22 16:01:33.697353: step: 290/464, loss: 0.00039932093932293355 2023-01-22 16:01:34.498137: step: 292/464, 
loss: 0.016323773190379143 2023-01-22 16:01:35.288909: step: 294/464, loss: 0.003390966448932886 2023-01-22 16:01:35.992660: step: 296/464, loss: 0.0008662844775244594 2023-01-22 16:01:36.720965: step: 298/464, loss: 0.00046012995881028473 2023-01-22 16:01:37.529576: step: 300/464, loss: 0.007354297209531069 2023-01-22 16:01:38.223804: step: 302/464, loss: 0.0022028302773833275 2023-01-22 16:01:38.958936: step: 304/464, loss: 0.07865099608898163 2023-01-22 16:01:39.665520: step: 306/464, loss: 0.03541542962193489 2023-01-22 16:01:40.448382: step: 308/464, loss: 0.03597771376371384 2023-01-22 16:01:41.191906: step: 310/464, loss: 0.0018971062963828444 2023-01-22 16:01:41.890820: step: 312/464, loss: 0.004431482870131731 2023-01-22 16:01:42.671694: step: 314/464, loss: 0.045061685144901276 2023-01-22 16:01:43.427377: step: 316/464, loss: 0.0004633663047570735 2023-01-22 16:01:44.150433: step: 318/464, loss: 0.014325146563351154 2023-01-22 16:01:44.917084: step: 320/464, loss: 0.016019832342863083 2023-01-22 16:01:45.606846: step: 322/464, loss: 0.005721123423427343 2023-01-22 16:01:46.381178: step: 324/464, loss: 0.0054297856986522675 2023-01-22 16:01:47.044573: step: 326/464, loss: 0.013479417189955711 2023-01-22 16:01:48.004082: step: 328/464, loss: 0.07097252458333969 2023-01-22 16:01:48.766762: step: 330/464, loss: 0.005652104038745165 2023-01-22 16:01:49.531813: step: 332/464, loss: 0.002012834884226322 2023-01-22 16:01:50.224284: step: 334/464, loss: 0.03151766210794449 2023-01-22 16:01:50.870385: step: 336/464, loss: 0.08759447932243347 2023-01-22 16:01:51.642444: step: 338/464, loss: 0.0022924193181097507 2023-01-22 16:01:52.380168: step: 340/464, loss: 0.0057031805627048016 2023-01-22 16:01:53.108945: step: 342/464, loss: 0.016630645841360092 2023-01-22 16:01:53.794130: step: 344/464, loss: 0.00012650784628931433 2023-01-22 16:01:54.480971: step: 346/464, loss: 0.052899766713380814 2023-01-22 16:01:55.190941: step: 348/464, loss: 0.055778276175260544 
2023-01-22 16:01:55.830377: step: 350/464, loss: 0.003215274540707469 2023-01-22 16:01:56.520218: step: 352/464, loss: 0.12934081256389618 2023-01-22 16:01:57.262079: step: 354/464, loss: 0.053472839295864105 2023-01-22 16:01:57.916343: step: 356/464, loss: 0.0236610546708107 2023-01-22 16:01:58.658683: step: 358/464, loss: 0.010390755720436573 2023-01-22 16:01:59.376948: step: 360/464, loss: 0.0007524581160396338 2023-01-22 16:02:00.149484: step: 362/464, loss: 0.0035021156072616577 2023-01-22 16:02:00.817729: step: 364/464, loss: 0.0011628000065684319 2023-01-22 16:02:01.541211: step: 366/464, loss: 0.00948494952172041 2023-01-22 16:02:02.278320: step: 368/464, loss: 0.09849855303764343 2023-01-22 16:02:03.007738: step: 370/464, loss: 0.03865564614534378 2023-01-22 16:02:03.770078: step: 372/464, loss: 0.04403558745980263 2023-01-22 16:02:04.588061: step: 374/464, loss: 0.05303948000073433 2023-01-22 16:02:05.308009: step: 376/464, loss: 0.01641235314309597 2023-01-22 16:02:06.079070: step: 378/464, loss: 0.4176364839076996 2023-01-22 16:02:06.850657: step: 380/464, loss: 0.03687671944499016 2023-01-22 16:02:07.605144: step: 382/464, loss: 0.007635573390871286 2023-01-22 16:02:08.307323: step: 384/464, loss: 0.03301868215203285 2023-01-22 16:02:09.076135: step: 386/464, loss: 0.03237801417708397 2023-01-22 16:02:09.792375: step: 388/464, loss: 0.0010952817974612117 2023-01-22 16:02:10.508758: step: 390/464, loss: 0.013851060532033443 2023-01-22 16:02:11.191177: step: 392/464, loss: 0.004478788003325462 2023-01-22 16:02:12.011576: step: 394/464, loss: 0.3380841612815857 2023-01-22 16:02:12.760027: step: 396/464, loss: 0.004648469388484955 2023-01-22 16:02:13.476035: step: 398/464, loss: 0.035101454704999924 2023-01-22 16:02:14.198017: step: 400/464, loss: 0.01252185832709074 2023-01-22 16:02:14.876348: step: 402/464, loss: 0.003056521527469158 2023-01-22 16:02:15.523285: step: 404/464, loss: 0.002402772894129157 2023-01-22 16:02:16.360264: step: 406/464, loss: 
0.1006186380982399 2023-01-22 16:02:17.111545: step: 408/464, loss: 0.02391218952834606 2023-01-22 16:02:17.885037: step: 410/464, loss: 0.055983979254961014 2023-01-22 16:02:18.587363: step: 412/464, loss: 0.013008703477680683 2023-01-22 16:02:19.317210: step: 414/464, loss: 0.23352257907390594 2023-01-22 16:02:20.044309: step: 416/464, loss: 0.07463035732507706 2023-01-22 16:02:20.722714: step: 418/464, loss: 0.002670279471203685 2023-01-22 16:02:21.482031: step: 420/464, loss: 0.013034089468419552 2023-01-22 16:02:22.235742: step: 422/464, loss: 0.06503226608037949 2023-01-22 16:02:23.058259: step: 424/464, loss: 0.0711727887392044 2023-01-22 16:02:23.852483: step: 426/464, loss: 0.016943490132689476 2023-01-22 16:02:24.595807: step: 428/464, loss: 0.017905110493302345 2023-01-22 16:02:25.327835: step: 430/464, loss: 0.010580254718661308 2023-01-22 16:02:26.084579: step: 432/464, loss: 0.0017730090767145157 2023-01-22 16:02:26.827433: step: 434/464, loss: 0.012521494179964066 2023-01-22 16:02:27.529568: step: 436/464, loss: 0.037310272455215454 2023-01-22 16:02:28.243534: step: 438/464, loss: 0.3737095296382904 2023-01-22 16:02:28.946625: step: 440/464, loss: 0.0009822368156164885 2023-01-22 16:02:29.720483: step: 442/464, loss: 0.034751612693071365 2023-01-22 16:02:30.373727: step: 444/464, loss: 0.18611420691013336 2023-01-22 16:02:31.069999: step: 446/464, loss: 0.004463686607778072 2023-01-22 16:02:31.774258: step: 448/464, loss: 0.014982398599386215 2023-01-22 16:02:32.471631: step: 450/464, loss: 0.025062285363674164 2023-01-22 16:02:33.189692: step: 452/464, loss: 0.02551797218620777 2023-01-22 16:02:33.910293: step: 454/464, loss: 0.010809944942593575 2023-01-22 16:02:34.664220: step: 456/464, loss: 0.019055066630244255 2023-01-22 16:02:35.363436: step: 458/464, loss: 0.00012230365246068686 2023-01-22 16:02:36.029994: step: 460/464, loss: 0.007807716727256775 2023-01-22 16:02:36.694192: step: 462/464, loss: 0.0007277731783688068 2023-01-22 
16:02:37.476879: step: 464/464, loss: 0.00027377461083233356 2023-01-22 16:02:38.204341: step: 466/464, loss: 0.0009555912110954523 2023-01-22 16:02:38.982766: step: 468/464, loss: 0.0014475013595074415 2023-01-22 16:02:39.763048: step: 470/464, loss: 0.019103819504380226 2023-01-22 16:02:40.414543: step: 472/464, loss: 0.006632484495639801 2023-01-22 16:02:41.239943: step: 474/464, loss: 0.015860576182603836 2023-01-22 16:02:42.177535: step: 476/464, loss: 0.005315141286700964 2023-01-22 16:02:42.890922: step: 478/464, loss: 0.0025896148290485144 2023-01-22 16:02:43.665886: step: 480/464, loss: 0.004105839412659407 2023-01-22 16:02:44.405356: step: 482/464, loss: 0.30062341690063477 2023-01-22 16:02:45.140149: step: 484/464, loss: 0.004436559975147247 2023-01-22 16:02:45.814386: step: 486/464, loss: 0.006108072120696306 2023-01-22 16:02:46.476509: step: 488/464, loss: 0.003864583559334278 2023-01-22 16:02:47.244226: step: 490/464, loss: 0.007178569212555885 2023-01-22 16:02:48.070765: step: 492/464, loss: 0.018330058082938194 2023-01-22 16:02:48.770735: step: 494/464, loss: 0.005468042101711035 2023-01-22 16:02:49.575546: step: 496/464, loss: 0.0365530401468277 2023-01-22 16:02:50.186869: step: 498/464, loss: 0.0017041832907125354 2023-01-22 16:02:50.916367: step: 500/464, loss: 0.03681609407067299 2023-01-22 16:02:51.648548: step: 502/464, loss: 0.19347411394119263 2023-01-22 16:02:52.529285: step: 504/464, loss: 0.01480608619749546 2023-01-22 16:02:53.276259: step: 506/464, loss: 0.0002922680287156254 2023-01-22 16:02:54.032821: step: 508/464, loss: 0.014807172119617462 2023-01-22 16:02:54.824530: step: 510/464, loss: 0.019604379311203957 2023-01-22 16:02:55.567155: step: 512/464, loss: 0.5452590584754944 2023-01-22 16:02:56.349757: step: 514/464, loss: 0.04370192810893059 2023-01-22 16:02:57.008106: step: 516/464, loss: 0.0015719156945124269 2023-01-22 16:02:57.738922: step: 518/464, loss: 0.017483972012996674 2023-01-22 16:02:58.445387: step: 520/464, loss: 
0.026621662080287933 2023-01-22 16:02:59.169673: step: 522/464, loss: 0.02510252222418785 2023-01-22 16:02:59.885909: step: 524/464, loss: 0.0642847940325737 2023-01-22 16:03:00.646331: step: 526/464, loss: 0.009867520071566105 2023-01-22 16:03:01.381049: step: 528/464, loss: 0.0023654180113226175 2023-01-22 16:03:02.223514: step: 530/464, loss: 0.0015580817125737667 2023-01-22 16:03:02.959117: step: 532/464, loss: 0.04824097454547882 2023-01-22 16:03:03.668136: step: 534/464, loss: 0.020770171657204628 2023-01-22 16:03:04.347855: step: 536/464, loss: 0.015186947770416737 2023-01-22 16:03:05.151907: step: 538/464, loss: 0.0014887871220707893 2023-01-22 16:03:05.874218: step: 540/464, loss: 0.0006223876844160259 2023-01-22 16:03:06.550140: step: 542/464, loss: 0.008479096926748753 2023-01-22 16:03:07.274632: step: 544/464, loss: 0.00052025041077286 2023-01-22 16:03:08.127816: step: 546/464, loss: 0.014327477663755417 2023-01-22 16:03:08.870034: step: 548/464, loss: 0.054649192839860916 2023-01-22 16:03:09.596189: step: 550/464, loss: 0.01268855668604374 2023-01-22 16:03:10.325243: step: 552/464, loss: 0.12560972571372986 2023-01-22 16:03:11.064586: step: 554/464, loss: 0.050108108669519424 2023-01-22 16:03:11.872293: step: 556/464, loss: 0.15157826244831085 2023-01-22 16:03:12.566760: step: 558/464, loss: 0.0012913616374135017 2023-01-22 16:03:13.339633: step: 560/464, loss: 0.02061178907752037 2023-01-22 16:03:14.133196: step: 562/464, loss: 0.0076378644444048405 2023-01-22 16:03:14.925422: step: 564/464, loss: 0.01353029441088438 2023-01-22 16:03:15.663952: step: 566/464, loss: 0.0009691324084997177 2023-01-22 16:03:16.395631: step: 568/464, loss: 0.29185059666633606 2023-01-22 16:03:17.154315: step: 570/464, loss: 0.0015059993602335453 2023-01-22 16:03:17.899540: step: 572/464, loss: 0.005179137922823429 2023-01-22 16:03:18.538225: step: 574/464, loss: 4.8018511733971536e-05 2023-01-22 16:03:19.322211: step: 576/464, loss: 0.17213767766952515 2023-01-22 
16:03:20.133669: step: 578/464, loss: 0.06760312616825104 2023-01-22 16:03:20.827904: step: 580/464, loss: 0.06528377532958984 2023-01-22 16:03:21.567912: step: 582/464, loss: 0.006387208588421345 2023-01-22 16:03:22.326746: step: 584/464, loss: 0.011440436355769634 2023-01-22 16:03:23.080235: step: 586/464, loss: 0.003430222626775503 2023-01-22 16:03:23.915513: step: 588/464, loss: 0.0009344254503957927 2023-01-22 16:03:24.719567: step: 590/464, loss: 0.002713944064453244 2023-01-22 16:03:25.436868: step: 592/464, loss: 0.002663680585101247 2023-01-22 16:03:26.199661: step: 594/464, loss: 0.001129120122641325 2023-01-22 16:03:26.950860: step: 596/464, loss: 0.0009066627826541662 2023-01-22 16:03:27.752484: step: 598/464, loss: 0.016030069440603256 2023-01-22 16:03:28.433417: step: 600/464, loss: 0.0125338826328516 2023-01-22 16:03:29.200509: step: 602/464, loss: 0.006337358150631189 2023-01-22 16:03:29.928134: step: 604/464, loss: 0.0018472378142178059 2023-01-22 16:03:30.621184: step: 606/464, loss: 0.0031487769447267056 2023-01-22 16:03:31.398347: step: 608/464, loss: 0.005827262531965971 2023-01-22 16:03:32.106782: step: 610/464, loss: 0.014079701155424118 2023-01-22 16:03:32.801568: step: 612/464, loss: 0.0038269374053925276 2023-01-22 16:03:33.629433: step: 614/464, loss: 0.0021537039428949356 2023-01-22 16:03:34.402137: step: 616/464, loss: 0.002285576891154051 2023-01-22 16:03:35.109644: step: 618/464, loss: 0.0056445905938744545 2023-01-22 16:03:35.826010: step: 620/464, loss: 0.06387317180633545 2023-01-22 16:03:36.520627: step: 622/464, loss: 0.0125731797888875 2023-01-22 16:03:37.326621: step: 624/464, loss: 0.04404031112790108 2023-01-22 16:03:38.013791: step: 626/464, loss: 0.006039231084287167 2023-01-22 16:03:38.774964: step: 628/464, loss: 0.018473368138074875 2023-01-22 16:03:39.543159: step: 630/464, loss: 0.000345776294125244 2023-01-22 16:03:40.375919: step: 632/464, loss: 0.048042964190244675 2023-01-22 16:03:41.069744: step: 634/464, loss: 
0.008211801759898663 2023-01-22 16:03:41.803256: step: 636/464, loss: 0.00013828724331688136 2023-01-22 16:03:42.485439: step: 638/464, loss: 0.02018279954791069 2023-01-22 16:03:43.208509: step: 640/464, loss: 0.029835864901542664 2023-01-22 16:03:43.858180: step: 642/464, loss: 0.004973770119249821 2023-01-22 16:03:44.592272: step: 644/464, loss: 0.002753573004156351 2023-01-22 16:03:45.269332: step: 646/464, loss: 0.006737233605235815 2023-01-22 16:03:45.944954: step: 648/464, loss: 0.05646848306059837 2023-01-22 16:03:46.621679: step: 650/464, loss: 0.01825702004134655 2023-01-22 16:03:47.310192: step: 652/464, loss: 0.10580414533615112 2023-01-22 16:03:48.072962: step: 654/464, loss: 0.00877810176461935 2023-01-22 16:03:48.795346: step: 656/464, loss: 0.004991421941667795 2023-01-22 16:03:49.547249: step: 658/464, loss: 0.00306524196639657 2023-01-22 16:03:50.332995: step: 660/464, loss: 0.011912654154002666 2023-01-22 16:03:51.116632: step: 662/464, loss: 0.0008568924968130887 2023-01-22 16:03:51.835981: step: 664/464, loss: 0.015382496640086174 2023-01-22 16:03:52.549123: step: 666/464, loss: 0.016268685460090637 2023-01-22 16:03:53.265679: step: 668/464, loss: 0.04140138998627663 2023-01-22 16:03:54.035928: step: 670/464, loss: 0.024702560156583786 2023-01-22 16:03:54.708380: step: 672/464, loss: 0.007844786159694195 2023-01-22 16:03:55.384659: step: 674/464, loss: 0.03346068784594536 2023-01-22 16:03:56.153908: step: 676/464, loss: 0.01021251454949379 2023-01-22 16:03:56.903667: step: 678/464, loss: 0.005769534967839718 2023-01-22 16:03:57.585048: step: 680/464, loss: 0.0011539680417627096 2023-01-22 16:03:58.316534: step: 682/464, loss: 0.010150428861379623 2023-01-22 16:03:59.071184: step: 684/464, loss: 0.013346421532332897 2023-01-22 16:03:59.863088: step: 686/464, loss: 0.014792312867939472 2023-01-22 16:04:00.689167: step: 688/464, loss: 0.0027319081127643585 2023-01-22 16:04:01.443083: step: 690/464, loss: 0.015660211443901062 2023-01-22 
16:04:02.262152: step: 692/464, loss: 0.006680083926767111 2023-01-22 16:04:02.996368: step: 694/464, loss: 0.012355759739875793 2023-01-22 16:04:03.731490: step: 696/464, loss: 0.002599672181531787 2023-01-22 16:04:04.424376: step: 698/464, loss: 0.0004135113849770278 2023-01-22 16:04:05.168941: step: 700/464, loss: 0.053030844777822495 2023-01-22 16:04:05.942763: step: 702/464, loss: 0.07157432287931442 2023-01-22 16:04:06.697751: step: 704/464, loss: 0.01568608172237873 2023-01-22 16:04:07.416837: step: 706/464, loss: 0.04196842387318611 2023-01-22 16:04:08.148124: step: 708/464, loss: 0.00017093642964027822 2023-01-22 16:04:08.824569: step: 710/464, loss: 0.048214443027973175 2023-01-22 16:04:09.519128: step: 712/464, loss: 0.005283031612634659 2023-01-22 16:04:10.230137: step: 714/464, loss: 0.0061831907369196415 2023-01-22 16:04:11.030140: step: 716/464, loss: 0.0009360117837786674 2023-01-22 16:04:11.795278: step: 718/464, loss: 0.018874213099479675 2023-01-22 16:04:12.667959: step: 720/464, loss: 0.0016444490756839514 2023-01-22 16:04:13.352433: step: 722/464, loss: 0.023046409711241722 2023-01-22 16:04:14.072479: step: 724/464, loss: 0.010292811319231987 2023-01-22 16:04:14.944893: step: 726/464, loss: 0.0054589444771409035 2023-01-22 16:04:15.625030: step: 728/464, loss: 0.10302948206663132 2023-01-22 16:04:16.337266: step: 730/464, loss: 0.0001261269935639575 2023-01-22 16:04:17.084934: step: 732/464, loss: 0.027284221723675728 2023-01-22 16:04:17.835014: step: 734/464, loss: 0.03829146549105644 2023-01-22 16:04:18.543849: step: 736/464, loss: 0.005423355381935835 2023-01-22 16:04:19.324996: step: 738/464, loss: 0.039719391614198685 2023-01-22 16:04:20.085853: step: 740/464, loss: 0.0002570571086835116 2023-01-22 16:04:20.897858: step: 742/464, loss: 0.0417335107922554 2023-01-22 16:04:21.605264: step: 744/464, loss: 0.0009967689402401447 2023-01-22 16:04:22.354960: step: 746/464, loss: 0.013086330145597458 2023-01-22 16:04:23.120157: step: 748/464, 
loss: 0.0364522710442543 2023-01-22 16:04:23.887567: step: 750/464, loss: 0.024580299854278564 2023-01-22 16:04:24.577327: step: 752/464, loss: 0.00044813542626798153 2023-01-22 16:04:25.291968: step: 754/464, loss: 0.005226737353950739 2023-01-22 16:04:26.133994: step: 756/464, loss: 0.012509177438914776 2023-01-22 16:04:26.824865: step: 758/464, loss: 0.004933376796543598 2023-01-22 16:04:27.532457: step: 760/464, loss: 0.0008160553989000618 2023-01-22 16:04:28.209008: step: 762/464, loss: 0.025479281321167946 2023-01-22 16:04:28.871395: step: 764/464, loss: 0.04945717751979828 2023-01-22 16:04:29.639367: step: 766/464, loss: 0.000553000601939857 2023-01-22 16:04:30.303632: step: 768/464, loss: 0.0039887516759335995 2023-01-22 16:04:31.022336: step: 770/464, loss: 0.004526576027274132 2023-01-22 16:04:31.850001: step: 772/464, loss: 0.006515008397400379 2023-01-22 16:04:32.607830: step: 774/464, loss: 0.004391736816614866 2023-01-22 16:04:33.304593: step: 776/464, loss: 0.0011453385232016444 2023-01-22 16:04:34.015836: step: 778/464, loss: 0.048640068620443344 2023-01-22 16:04:34.742639: step: 780/464, loss: 0.004788265563547611 2023-01-22 16:04:35.526187: step: 782/464, loss: 0.004345182795077562 2023-01-22 16:04:36.268343: step: 784/464, loss: 0.07326582074165344 2023-01-22 16:04:36.942848: step: 786/464, loss: 0.000605506356805563 2023-01-22 16:04:37.675031: step: 788/464, loss: 0.00305646238848567 2023-01-22 16:04:38.494719: step: 790/464, loss: 0.008802256546914577 2023-01-22 16:04:39.273315: step: 792/464, loss: 0.025788472965359688 2023-01-22 16:04:39.984917: step: 794/464, loss: 0.2637900114059448 2023-01-22 16:04:40.666299: step: 796/464, loss: 0.0015494396211579442 2023-01-22 16:04:41.352784: step: 798/464, loss: 0.00048792597954161465 2023-01-22 16:04:42.158405: step: 800/464, loss: 0.005383032839745283 2023-01-22 16:04:42.949892: step: 802/464, loss: 0.2706470191478729 2023-01-22 16:04:43.696131: step: 804/464, loss: 0.002237283391878009 2023-01-22 
16:04:44.580339: step: 806/464, loss: 0.23974576592445374 2023-01-22 16:04:45.349069: step: 808/464, loss: 0.00011458772496553138 2023-01-22 16:04:46.081734: step: 810/464, loss: 0.046538788825273514 2023-01-22 16:04:46.805068: step: 812/464, loss: 0.024802470579743385 2023-01-22 16:04:47.573785: step: 814/464, loss: 0.02206752821803093 2023-01-22 16:04:48.288302: step: 816/464, loss: 0.01053948700428009 2023-01-22 16:04:49.000308: step: 818/464, loss: 0.02718799002468586 2023-01-22 16:04:49.775323: step: 820/464, loss: 0.03310324624180794 2023-01-22 16:04:50.543562: step: 822/464, loss: 0.6390102505683899 2023-01-22 16:04:51.226487: step: 824/464, loss: 0.004658107180148363 2023-01-22 16:04:51.904652: step: 826/464, loss: 0.013531827367842197 2023-01-22 16:04:52.623534: step: 828/464, loss: 0.014665245078504086 2023-01-22 16:04:53.361543: step: 830/464, loss: 0.021113982424139977 2023-01-22 16:04:54.065841: step: 832/464, loss: 0.011678681708872318 2023-01-22 16:04:54.805643: step: 834/464, loss: 0.06600571423768997 2023-01-22 16:04:55.545607: step: 836/464, loss: 0.0012176345335319638 2023-01-22 16:04:56.360510: step: 838/464, loss: 0.04618173465132713 2023-01-22 16:04:57.117560: step: 840/464, loss: 0.013476877473294735 2023-01-22 16:04:57.849264: step: 842/464, loss: 0.03788414224982262 2023-01-22 16:04:58.457606: step: 844/464, loss: 0.005714490544050932 2023-01-22 16:04:59.206852: step: 846/464, loss: 0.0042811487801373005 2023-01-22 16:05:00.024643: step: 848/464, loss: 0.028491834178566933 2023-01-22 16:05:00.734194: step: 850/464, loss: 0.027468910440802574 2023-01-22 16:05:01.427261: step: 852/464, loss: 0.004805763252079487 2023-01-22 16:05:02.145589: step: 854/464, loss: 0.011337238363921642 2023-01-22 16:05:02.840462: step: 856/464, loss: 0.0019541652873158455 2023-01-22 16:05:03.564815: step: 858/464, loss: 0.000762142997700721 2023-01-22 16:05:04.314036: step: 860/464, loss: 0.0006989953690208495 2023-01-22 16:05:05.036427: step: 862/464, loss: 
0.056273747235536575 2023-01-22 16:05:05.784582: step: 864/464, loss: 0.015350564382970333 2023-01-22 16:05:06.606931: step: 866/464, loss: 0.004561841022223234 2023-01-22 16:05:07.298176: step: 868/464, loss: 0.022886553779244423 2023-01-22 16:05:08.031144: step: 870/464, loss: 0.003614415880292654 2023-01-22 16:05:08.786498: step: 872/464, loss: 0.06976622343063354 2023-01-22 16:05:09.547961: step: 874/464, loss: 0.014863256365060806 2023-01-22 16:05:10.245387: step: 876/464, loss: 0.04626696929335594 2023-01-22 16:05:11.009850: step: 878/464, loss: 0.02423112466931343 2023-01-22 16:05:11.807787: step: 880/464, loss: 1.710673213005066 2023-01-22 16:05:12.577978: step: 882/464, loss: 0.6897745132446289 2023-01-22 16:05:13.325784: step: 884/464, loss: 0.042005181312561035 2023-01-22 16:05:14.091763: step: 886/464, loss: 0.002239571651443839 2023-01-22 16:05:14.813375: step: 888/464, loss: 0.012205242179334164 2023-01-22 16:05:15.558484: step: 890/464, loss: 0.01918407529592514 2023-01-22 16:05:16.349731: step: 892/464, loss: 0.0025333764497190714 2023-01-22 16:05:17.157021: step: 894/464, loss: 0.008899757638573647 2023-01-22 16:05:17.891899: step: 896/464, loss: 0.0008298902539536357 2023-01-22 16:05:18.671430: step: 898/464, loss: 0.0368281826376915 2023-01-22 16:05:19.373604: step: 900/464, loss: 0.0003198850608896464 2023-01-22 16:05:20.163014: step: 902/464, loss: 0.0575554184615612 2023-01-22 16:05:20.988971: step: 904/464, loss: 0.01885542832314968 2023-01-22 16:05:21.711720: step: 906/464, loss: 0.09064995497465134 2023-01-22 16:05:22.562530: step: 908/464, loss: 0.047099050134420395 2023-01-22 16:05:23.308687: step: 910/464, loss: 0.027352536097168922 2023-01-22 16:05:24.045262: step: 912/464, loss: 0.008324525319039822 2023-01-22 16:05:24.768200: step: 914/464, loss: 0.006355096586048603 2023-01-22 16:05:25.432657: step: 916/464, loss: 0.002466668142005801 2023-01-22 16:05:26.091322: step: 918/464, loss: 0.003562645521014929 2023-01-22 16:05:26.815759: 
step: 920/464, loss: 0.01138092577457428 2023-01-22 16:05:27.581512: step: 922/464, loss: 0.0006305737770162523 2023-01-22 16:05:28.287933: step: 924/464, loss: 0.09922892600297928 2023-01-22 16:05:29.036327: step: 926/464, loss: 0.026989903301000595 2023-01-22 16:05:29.810713: step: 928/464, loss: 0.004400115460157394 2023-01-22 16:05:30.438908: step: 930/464, loss: 0.013907156884670258
==================================================
Loss: 0.048
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3091195573210205, 'r': 0.3495925164389529, 'f1': 0.3281126556782336}, 'combined': 0.24176721997343528, 'epoch': 33}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30342855880330655, 'r': 0.28633821507624285, 'f1': 0.2946357637591843}, 'combined': 0.18298431643991447, 'epoch': 33}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2916288971576566, 'r': 0.3381124405376246, 'f1': 0.3131551074926682}, 'combined': 0.23074586867880814, 'epoch': 33}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2961538542378364, 'r': 0.2829849575573002, 'f1': 0.2894196837271226}, 'combined': 0.17974485620947617, 'epoch': 33}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30937668765691917, 'r': 0.3481221551813151, 'f1': 0.3276078138938448}, 'combined': 0.2413952312902014, 'epoch': 33}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3126944616104076, 'r': 0.2907996817540575, 'f1': 0.30134990015187973}, 'combined': 0.18715414851537795, 'epoch': 33}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2897727272727273, 'r': 0.36428571428571427, 'f1': 0.3227848101265823}, 'combined': 0.2151898734177215, 'epoch': 33}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2894736842105263, 'r': 0.4782608695652174, 'f1': 0.360655737704918}, 'combined': 0.180327868852459, 'epoch': 33}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3875, 'r': 0.2672413793103448, 'f1': 0.3163265306122449}, 'combined': 0.2108843537414966, 'epoch': 33}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 34
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:08:09.110352: step: 2/464, loss: 0.004507106263190508 2023-01-22 16:08:09.874001: step: 4/464, loss: 0.0138564957305789 2023-01-22 16:08:10.582614: step: 6/464, loss: 0.012533614411950111 2023-01-22 16:08:11.314658: step: 8/464, loss: 0.028478601947426796 2023-01-22 16:08:12.053798: step: 10/464, loss: 0.023900024592876434 2023-01-22 16:08:12.795111: step: 12/464, loss: 0.03197980672121048 2023-01-22 16:08:13.513766: step: 14/464, loss: 0.0026648296043276787 2023-01-22 16:08:14.190661: step: 16/464, loss: 0.0032839514315128326 2023-01-22 16:08:15.089054: step: 18/464, loss: 0.13947302103042603 2023-01-22 16:08:15.857099: step: 20/464, loss: 0.004450374748557806 2023-01-22 16:08:16.553528: step: 22/464, loss: 0.00821272749453783 2023-01-22 16:08:17.354696: step: 24/464, loss: 0.05076976493000984 2023-01-22 16:08:18.127968: step: 26/464, loss: 0.0003773788339458406 2023-01-22 16:08:18.777202: step: 28/464, loss: 0.00023975061776582152 2023-01-22 16:08:19.517383: step: 30/464, loss: 0.0016178319929167628 2023-01-22 16:08:20.330958: step: 32/464, loss: 0.012625518254935741 2023-01-22 16:08:21.057448: step: 34/464, loss: 0.2688717544078827 2023-01-22 16:08:21.781298: step: 36/464, loss:
0.008189328014850616 2023-01-22 16:08:22.513095: step: 38/464, loss: 0.020980000495910645 2023-01-22 16:08:23.189762: step: 40/464, loss: 0.002992508001625538 2023-01-22 16:08:23.939257: step: 42/464, loss: 0.025903644040226936 2023-01-22 16:08:24.673696: step: 44/464, loss: 0.030666600912809372 2023-01-22 16:08:25.355498: step: 46/464, loss: 0.0001177599624497816 2023-01-22 16:08:26.089349: step: 48/464, loss: 0.006271448452025652 2023-01-22 16:08:26.819781: step: 50/464, loss: 0.0019224463030695915 2023-01-22 16:08:27.563685: step: 52/464, loss: 0.519623875617981 2023-01-22 16:08:28.284673: step: 54/464, loss: 0.029549848288297653 2023-01-22 16:08:29.014240: step: 56/464, loss: 0.008546078577637672 2023-01-22 16:08:29.808834: step: 58/464, loss: 0.0016022090567275882 2023-01-22 16:08:30.486698: step: 60/464, loss: 0.008452469483017921 2023-01-22 16:08:31.173307: step: 62/464, loss: 0.0006270703161135316 2023-01-22 16:08:31.854339: step: 64/464, loss: 0.03125409036874771 2023-01-22 16:08:32.599300: step: 66/464, loss: 0.00010930612916126847 2023-01-22 16:08:33.359944: step: 68/464, loss: 0.06612391024827957 2023-01-22 16:08:34.095380: step: 70/464, loss: 0.00494617223739624 2023-01-22 16:08:34.779397: step: 72/464, loss: 0.01734868995845318 2023-01-22 16:08:35.535974: step: 74/464, loss: 0.4363549053668976 2023-01-22 16:08:36.251351: step: 76/464, loss: 0.01373057346791029 2023-01-22 16:08:36.959746: step: 78/464, loss: 0.008064981549978256 2023-01-22 16:08:37.707748: step: 80/464, loss: 0.03656484931707382 2023-01-22 16:08:38.506747: step: 82/464, loss: 0.13750436902046204 2023-01-22 16:08:39.164003: step: 84/464, loss: 0.00047446993994526565 2023-01-22 16:08:39.939109: step: 86/464, loss: 0.006199462339282036 2023-01-22 16:08:40.696965: step: 88/464, loss: 0.0008233002736233175 2023-01-22 16:08:41.444904: step: 90/464, loss: 0.0014060763642191887 2023-01-22 16:08:42.181515: step: 92/464, loss: 0.009109342470765114 2023-01-22 16:08:42.956465: step: 94/464, loss: 
0.0016980544896796346 2023-01-22 16:08:43.641773: step: 96/464, loss: 0.0004116365744266659 2023-01-22 16:08:44.359183: step: 98/464, loss: 0.011986698023974895 2023-01-22 16:08:45.289294: step: 100/464, loss: 0.007710412610322237 2023-01-22 16:08:46.026111: step: 102/464, loss: 0.0005119118723087013 2023-01-22 16:08:46.785948: step: 104/464, loss: 0.020808612927794456 2023-01-22 16:08:47.568372: step: 106/464, loss: 0.007151258178055286 2023-01-22 16:08:48.296889: step: 108/464, loss: 0.009474542923271656 2023-01-22 16:08:49.107280: step: 110/464, loss: 4.1004888771567494e-05 2023-01-22 16:08:49.793951: step: 112/464, loss: 0.0347786508500576 2023-01-22 16:08:50.549196: step: 114/464, loss: 0.0015426678583025932 2023-01-22 16:08:51.391035: step: 116/464, loss: 0.0017722718184813857 2023-01-22 16:08:52.083348: step: 118/464, loss: 0.014240579679608345 2023-01-22 16:08:52.792483: step: 120/464, loss: 0.09115071594715118 2023-01-22 16:08:53.742647: step: 122/464, loss: 0.008522900752723217 2023-01-22 16:08:54.476849: step: 124/464, loss: 0.0022267361637204885 2023-01-22 16:08:55.294843: step: 126/464, loss: 0.0030100380536168814 2023-01-22 16:08:55.999953: step: 128/464, loss: 0.0011554791126400232 2023-01-22 16:08:56.840861: step: 130/464, loss: 0.0004891178687103093 2023-01-22 16:08:57.544203: step: 132/464, loss: 0.004247648175805807 2023-01-22 16:08:58.232683: step: 134/464, loss: 0.0036355378106236458 2023-01-22 16:08:58.977822: step: 136/464, loss: 0.14950382709503174 2023-01-22 16:08:59.717595: step: 138/464, loss: 0.006487012840807438 2023-01-22 16:09:00.447279: step: 140/464, loss: 0.0005286363302730024 2023-01-22 16:09:01.165791: step: 142/464, loss: 0.0003259664517827332 2023-01-22 16:09:01.884005: step: 144/464, loss: 0.0174573864787817 2023-01-22 16:09:02.584975: step: 146/464, loss: 0.0023650091607123613 2023-01-22 16:09:03.424567: step: 148/464, loss: 0.002246855990961194 2023-01-22 16:09:04.297308: step: 150/464, loss: 0.010467849671840668 2023-01-22 
16:09:05.001002: step: 152/464, loss: 0.0027248847763985395 2023-01-22 16:09:05.669502: step: 154/464, loss: 0.004155377391725779 2023-01-22 16:09:06.424248: step: 156/464, loss: 0.0012879250571131706 2023-01-22 16:09:07.141023: step: 158/464, loss: 0.035379212349653244 2023-01-22 16:09:07.825994: step: 160/464, loss: 0.005478814709931612 2023-01-22 16:09:08.491444: step: 162/464, loss: 0.0020441957749426365 2023-01-22 16:09:09.200079: step: 164/464, loss: 0.027075497433543205 2023-01-22 16:09:09.956185: step: 166/464, loss: 0.014531458728015423 2023-01-22 16:09:10.810430: step: 168/464, loss: 0.029720718041062355 2023-01-22 16:09:11.599741: step: 170/464, loss: 0.0023813534062355757 2023-01-22 16:09:12.432010: step: 172/464, loss: 0.8960830569267273 2023-01-22 16:09:13.150798: step: 174/464, loss: 0.04537244141101837 2023-01-22 16:09:13.883083: step: 176/464, loss: 0.08595781773328781 2023-01-22 16:09:14.653848: step: 178/464, loss: 0.022702278569340706 2023-01-22 16:09:15.383146: step: 180/464, loss: 0.0006193833542056382 2023-01-22 16:09:16.077218: step: 182/464, loss: 0.06762062013149261 2023-01-22 16:09:16.851823: step: 184/464, loss: 0.06323347240686417 2023-01-22 16:09:17.545837: step: 186/464, loss: 0.001289387815631926 2023-01-22 16:09:18.228814: step: 188/464, loss: 0.015292837284505367 2023-01-22 16:09:18.973364: step: 190/464, loss: 0.005967683624476194 2023-01-22 16:09:19.708095: step: 192/464, loss: 0.19376680254936218 2023-01-22 16:09:20.381006: step: 194/464, loss: 0.009574813768267632 2023-01-22 16:09:21.084148: step: 196/464, loss: 0.006247211713343859 2023-01-22 16:09:21.825250: step: 198/464, loss: 0.049863290041685104 2023-01-22 16:09:22.527840: step: 200/464, loss: 0.0009026709012687206 2023-01-22 16:09:23.264686: step: 202/464, loss: 0.009047025814652443 2023-01-22 16:09:23.928162: step: 204/464, loss: 0.003018538001924753 2023-01-22 16:09:24.665046: step: 206/464, loss: 0.06789993494749069 2023-01-22 16:09:25.408810: step: 208/464, loss: 
0.014406219124794006 2023-01-22 16:09:26.138506: step: 210/464, loss: 0.009798445738852024 2023-01-22 16:09:27.458469: step: 212/464, loss: 0.005850615445524454 2023-01-22 16:09:28.161664: step: 214/464, loss: 0.0489523746073246 2023-01-22 16:09:28.910525: step: 216/464, loss: 0.005099345929920673 2023-01-22 16:09:29.680584: step: 218/464, loss: 0.0015978205483406782 2023-01-22 16:09:30.343802: step: 220/464, loss: 0.04420878738164902 2023-01-22 16:09:31.117121: step: 222/464, loss: 0.020301537588238716 2023-01-22 16:09:31.811951: step: 224/464, loss: 0.0011024517007172108 2023-01-22 16:09:32.581489: step: 226/464, loss: 0.0043716104701161385 2023-01-22 16:09:33.250057: step: 228/464, loss: 0.0008457738440483809 2023-01-22 16:09:33.978017: step: 230/464, loss: 0.04437427967786789 2023-01-22 16:09:34.662450: step: 232/464, loss: 0.004700162913650274 2023-01-22 16:09:35.336816: step: 234/464, loss: 0.007807662710547447 2023-01-22 16:09:36.080482: step: 236/464, loss: 0.005388321820646524 2023-01-22 16:09:36.782261: step: 238/464, loss: 0.03318406641483307 2023-01-22 16:09:37.494997: step: 240/464, loss: 0.003423970425501466 2023-01-22 16:09:38.183674: step: 242/464, loss: 0.007497243583202362 2023-01-22 16:09:38.848511: step: 244/464, loss: 0.01163523830473423 2023-01-22 16:09:39.615006: step: 246/464, loss: 0.018005944788455963 2023-01-22 16:09:40.442515: step: 248/464, loss: 0.05900489538908005 2023-01-22 16:09:41.121322: step: 250/464, loss: 0.005102918948978186 2023-01-22 16:09:41.873273: step: 252/464, loss: 0.004294229205697775 2023-01-22 16:09:42.594682: step: 254/464, loss: 0.004841707646846771 2023-01-22 16:09:43.230438: step: 256/464, loss: 0.0008235534187406301 2023-01-22 16:09:43.931349: step: 258/464, loss: 1.809576315281447e-05 2023-01-22 16:09:44.683123: step: 260/464, loss: 0.0009198468178510666 2023-01-22 16:09:45.371559: step: 262/464, loss: 0.01337486132979393 2023-01-22 16:09:46.137780: step: 264/464, loss: 0.012098829261958599 2023-01-22 
16:09:46.940979: step: 266/464, loss: 0.007691856473684311 2023-01-22 16:09:47.629234: step: 268/464, loss: 0.18738287687301636 2023-01-22 16:09:48.390795: step: 270/464, loss: 0.01887785643339157 2023-01-22 16:09:49.167060: step: 272/464, loss: 0.03846096247434616 2023-01-22 16:09:49.891575: step: 274/464, loss: 0.012068726122379303 2023-01-22 16:09:50.576218: step: 276/464, loss: 0.029271041974425316 2023-01-22 16:09:51.313144: step: 278/464, loss: 0.00018030410865321755 2023-01-22 16:09:51.999655: step: 280/464, loss: 0.0576542504131794 2023-01-22 16:09:52.800442: step: 282/464, loss: 0.019819024950265884 2023-01-22 16:09:53.642603: step: 284/464, loss: 0.036222707480192184 2023-01-22 16:09:54.345926: step: 286/464, loss: 0.03119976446032524 2023-01-22 16:09:55.093669: step: 288/464, loss: 0.008279160596430302 2023-01-22 16:09:55.858874: step: 290/464, loss: 0.0022513007279485464 2023-01-22 16:09:56.558630: step: 292/464, loss: 1.0045217095466796e-05 2023-01-22 16:09:57.256040: step: 294/464, loss: 6.919210136402398e-05 2023-01-22 16:09:58.013534: step: 296/464, loss: 0.008782978169620037 2023-01-22 16:09:58.735035: step: 298/464, loss: 0.005144828464835882 2023-01-22 16:09:59.422408: step: 300/464, loss: 0.004423404578119516 2023-01-22 16:10:00.200258: step: 302/464, loss: 0.02502557262778282 2023-01-22 16:10:00.968522: step: 304/464, loss: 0.03563341870903969 2023-01-22 16:10:01.710199: step: 306/464, loss: 0.011802875436842442 2023-01-22 16:10:02.370766: step: 308/464, loss: 0.00023112045892048627 2023-01-22 16:10:03.046746: step: 310/464, loss: 0.07555259764194489 2023-01-22 16:10:03.744600: step: 312/464, loss: 0.011428655125200748 2023-01-22 16:10:04.536604: step: 314/464, loss: 0.004951578099280596 2023-01-22 16:10:05.311279: step: 316/464, loss: 0.282394677400589 2023-01-22 16:10:06.019844: step: 318/464, loss: 0.002324760193005204 2023-01-22 16:10:06.749811: step: 320/464, loss: 0.061712756752967834 2023-01-22 16:10:07.437512: step: 322/464, loss: 
0.0064103505574166775 2023-01-22 16:10:08.208404: step: 324/464, loss: 0.025882260873913765 2023-01-22 16:10:08.997296: step: 326/464, loss: 0.06493687629699707 2023-01-22 16:10:09.799319: step: 328/464, loss: 0.059337176382541656 2023-01-22 16:10:10.590702: step: 330/464, loss: 0.05586615949869156 2023-01-22 16:10:11.362489: step: 332/464, loss: 0.0006606185343116522 2023-01-22 16:10:12.103158: step: 334/464, loss: 0.019652051851153374 2023-01-22 16:10:12.936243: step: 336/464, loss: 0.06849510222673416 2023-01-22 16:10:13.725460: step: 338/464, loss: 0.027316365391016006 2023-01-22 16:10:14.403208: step: 340/464, loss: 0.008591653779149055 2023-01-22 16:10:15.202726: step: 342/464, loss: 0.02025693655014038 2023-01-22 16:10:15.874137: step: 344/464, loss: 0.02047627419233322 2023-01-22 16:10:16.564213: step: 346/464, loss: 0.003447972470894456 2023-01-22 16:10:17.298153: step: 348/464, loss: 0.020870212465524673 2023-01-22 16:10:18.021717: step: 350/464, loss: 0.049200639128685 2023-01-22 16:10:18.812087: step: 352/464, loss: 0.013158795423805714 2023-01-22 16:10:19.659057: step: 354/464, loss: 0.0025647184811532497 2023-01-22 16:10:20.375028: step: 356/464, loss: 0.028932079672813416 2023-01-22 16:10:21.065781: step: 358/464, loss: 0.0025970793794840574 2023-01-22 16:10:21.779805: step: 360/464, loss: 0.03821602091193199 2023-01-22 16:10:22.458414: step: 362/464, loss: 0.0049066864885389805 2023-01-22 16:10:23.178049: step: 364/464, loss: 0.0007283754530362785 2023-01-22 16:10:23.913303: step: 366/464, loss: 0.011494730599224567 2023-01-22 16:10:24.597157: step: 368/464, loss: 0.01839861087501049 2023-01-22 16:10:25.284687: step: 370/464, loss: 0.07493963837623596 2023-01-22 16:10:26.121875: step: 372/464, loss: 0.0021905205212533474 2023-01-22 16:10:26.847232: step: 374/464, loss: 0.0001082076778402552 2023-01-22 16:10:27.541868: step: 376/464, loss: 0.001402409397996962 2023-01-22 16:10:28.251474: step: 378/464, loss: 0.013940723612904549 2023-01-22 
16:10:28.960825: step: 380/464, loss: 0.0044783600606024265 2023-01-22 16:10:29.661613: step: 382/464, loss: 0.006135037634521723 2023-01-22 16:10:30.430035: step: 384/464, loss: 0.01858883909881115 2023-01-22 16:10:31.205275: step: 386/464, loss: 0.011167095974087715 2023-01-22 16:10:31.970311: step: 388/464, loss: 0.0011094283545389771 2023-01-22 16:10:32.731827: step: 390/464, loss: 0.003803882747888565 2023-01-22 16:10:33.568017: step: 392/464, loss: 0.0013371568638831377 2023-01-22 16:10:34.343648: step: 394/464, loss: 0.05424215644598007 2023-01-22 16:10:35.084341: step: 396/464, loss: 0.0014253269182518125 2023-01-22 16:10:35.875156: step: 398/464, loss: 0.0028973803855478764 2023-01-22 16:10:36.551547: step: 400/464, loss: 0.0002119986165780574 2023-01-22 16:10:37.314504: step: 402/464, loss: 0.005772217642515898 2023-01-22 16:10:38.101879: step: 404/464, loss: 0.19733327627182007 2023-01-22 16:10:38.778172: step: 406/464, loss: 0.005514085758477449 2023-01-22 16:10:39.482615: step: 408/464, loss: 0.005787344183772802 2023-01-22 16:10:40.175759: step: 410/464, loss: 0.007726243697106838 2023-01-22 16:10:40.979568: step: 412/464, loss: 0.811470091342926 2023-01-22 16:10:41.697922: step: 414/464, loss: 0.0013876872835680842 2023-01-22 16:10:42.469428: step: 416/464, loss: 0.002277060877531767 2023-01-22 16:10:43.198343: step: 418/464, loss: 0.00730202067643404 2023-01-22 16:10:43.980257: step: 420/464, loss: 0.007765179965645075 2023-01-22 16:10:44.698135: step: 422/464, loss: 0.00024765662965364754 2023-01-22 16:10:45.490873: step: 424/464, loss: 0.009965931065380573 2023-01-22 16:10:46.238252: step: 426/464, loss: 0.0009719174704514444 2023-01-22 16:10:46.867033: step: 428/464, loss: 0.015123143792152405 2023-01-22 16:10:47.623966: step: 430/464, loss: 0.008549573831260204 2023-01-22 16:10:48.400566: step: 432/464, loss: 0.009291200898587704 2023-01-22 16:10:49.094173: step: 434/464, loss: 0.0007830564863979816 2023-01-22 16:10:49.852259: step: 436/464, 
loss: 0.00584413344040513 2023-01-22 16:10:50.553467: step: 438/464, loss: 0.01247811783105135 2023-01-22 16:10:51.338591: step: 440/464, loss: 0.0721527636051178 2023-01-22 16:10:52.047987: step: 442/464, loss: 0.011251426301896572 2023-01-22 16:10:52.871657: step: 444/464, loss: 0.04375053942203522 2023-01-22 16:10:53.614395: step: 446/464, loss: 0.0027558563742786646 2023-01-22 16:10:54.346635: step: 448/464, loss: 0.00021523504983633757 2023-01-22 16:10:55.091606: step: 450/464, loss: 0.014090361073613167 2023-01-22 16:10:55.918325: step: 452/464, loss: 0.022461047396063805 2023-01-22 16:10:56.665631: step: 454/464, loss: 0.015601426362991333 2023-01-22 16:10:57.387634: step: 456/464, loss: 0.006117715500295162 2023-01-22 16:10:58.078823: step: 458/464, loss: 0.00012083905312465504 2023-01-22 16:10:58.838310: step: 460/464, loss: 0.0008911213371902704 2023-01-22 16:10:59.619217: step: 462/464, loss: 0.011780163273215294 2023-01-22 16:11:00.323017: step: 464/464, loss: 0.00481388159096241 2023-01-22 16:11:01.011049: step: 466/464, loss: 0.0182182714343071 2023-01-22 16:11:01.687786: step: 468/464, loss: 0.007232617121189833 2023-01-22 16:11:02.434268: step: 470/464, loss: 0.00524465087801218 2023-01-22 16:11:03.117037: step: 472/464, loss: 0.03622736409306526 2023-01-22 16:11:03.861118: step: 474/464, loss: 0.003495218697935343 2023-01-22 16:11:04.622371: step: 476/464, loss: 0.0007282923324964941 2023-01-22 16:11:05.363036: step: 478/464, loss: 0.018887333571910858 2023-01-22 16:11:06.090364: step: 480/464, loss: 0.03536510467529297 2023-01-22 16:11:06.795101: step: 482/464, loss: 0.0029719527810811996 2023-01-22 16:11:07.528449: step: 484/464, loss: 0.013268392533063889 2023-01-22 16:11:08.179330: step: 486/464, loss: 0.0023780071642249823 2023-01-22 16:11:08.937086: step: 488/464, loss: 0.00029961683321744204 2023-01-22 16:11:09.618437: step: 490/464, loss: 0.00504639558494091 2023-01-22 16:11:10.367965: step: 492/464, loss: 0.0035301174502819777 2023-01-22 
16:11:11.087484: step: 494/464, loss: 0.027179397642612457 2023-01-22 16:11:11.899900: step: 496/464, loss: 0.01423187181353569 2023-01-22 16:11:12.634347: step: 498/464, loss: 0.006988744717091322 2023-01-22 16:11:13.466461: step: 500/464, loss: 0.0017694245325401425 2023-01-22 16:11:14.247731: step: 502/464, loss: 0.0043494729325175285 2023-01-22 16:11:14.940169: step: 504/464, loss: 0.02056262083351612 2023-01-22 16:11:15.622533: step: 506/464, loss: 0.0001409807737218216 2023-01-22 16:11:16.365377: step: 508/464, loss: 0.007760221604257822 2023-01-22 16:11:17.043590: step: 510/464, loss: 0.04806235432624817 2023-01-22 16:11:17.818824: step: 512/464, loss: 0.00937352143228054 2023-01-22 16:11:18.502153: step: 514/464, loss: 0.0003969741228502244 2023-01-22 16:11:19.208033: step: 516/464, loss: 0.002457494381815195 2023-01-22 16:11:19.966661: step: 518/464, loss: 5.364713433664292e-05 2023-01-22 16:11:20.660844: step: 520/464, loss: 0.016665572300553322 2023-01-22 16:11:21.487060: step: 522/464, loss: 0.0065728141926229 2023-01-22 16:11:22.245572: step: 524/464, loss: 0.03412323817610741 2023-01-22 16:11:23.000830: step: 526/464, loss: 0.02081715129315853 2023-01-22 16:11:23.700154: step: 528/464, loss: 0.01984614133834839 2023-01-22 16:11:24.372054: step: 530/464, loss: 0.0027193299029022455 2023-01-22 16:11:25.059386: step: 532/464, loss: 0.06476839631795883 2023-01-22 16:11:25.763861: step: 534/464, loss: 0.023456497117877007 2023-01-22 16:11:26.532139: step: 536/464, loss: 0.014758966863155365 2023-01-22 16:11:27.320741: step: 538/464, loss: 0.018996067345142365 2023-01-22 16:11:27.995962: step: 540/464, loss: 0.001212551025673747 2023-01-22 16:11:28.660769: step: 542/464, loss: 0.015158602967858315 2023-01-22 16:11:29.526506: step: 544/464, loss: 0.051817577332258224 2023-01-22 16:11:30.203312: step: 546/464, loss: 0.08023293316364288 2023-01-22 16:11:30.883426: step: 548/464, loss: 0.006792579777538776 2023-01-22 16:11:31.648177: step: 550/464, loss: 
0.0006085671484470367 2023-01-22 16:11:32.408963: step: 552/464, loss: 0.0005508697358891368 2023-01-22 16:11:33.240097: step: 554/464, loss: 0.09090903401374817 2023-01-22 16:11:34.050191: step: 556/464, loss: 0.01320156641304493 2023-01-22 16:11:34.847000: step: 558/464, loss: 0.027111440896987915 2023-01-22 16:11:35.660684: step: 560/464, loss: 0.0351184718310833 2023-01-22 16:11:36.427802: step: 562/464, loss: 0.025485580787062645 2023-01-22 16:11:37.123384: step: 564/464, loss: 0.010311855003237724 2023-01-22 16:11:37.859767: step: 566/464, loss: 0.04128720611333847 2023-01-22 16:11:38.590065: step: 568/464, loss: 0.03193189948797226 2023-01-22 16:11:39.388870: step: 570/464, loss: 0.29782480001449585 2023-01-22 16:11:40.154616: step: 572/464, loss: 0.03099055029451847 2023-01-22 16:11:40.893327: step: 574/464, loss: 0.009467075578868389 2023-01-22 16:11:41.641123: step: 576/464, loss: 0.11119036376476288 2023-01-22 16:11:42.411183: step: 578/464, loss: 0.04529016092419624 2023-01-22 16:11:43.207106: step: 580/464, loss: 0.0063427952118217945 2023-01-22 16:11:43.952114: step: 582/464, loss: 0.006485740188509226 2023-01-22 16:11:44.674004: step: 584/464, loss: 0.007228881120681763 2023-01-22 16:11:45.429952: step: 586/464, loss: 0.021861482411623 2023-01-22 16:11:46.183471: step: 588/464, loss: 0.003522490616887808 2023-01-22 16:11:46.995202: step: 590/464, loss: 0.055823516100645065 2023-01-22 16:11:47.763983: step: 592/464, loss: 0.026966070756316185 2023-01-22 16:11:48.526169: step: 594/464, loss: 1.141110897064209 2023-01-22 16:11:49.298390: step: 596/464, loss: 0.01313886046409607 2023-01-22 16:11:50.038096: step: 598/464, loss: 0.009779131039977074 2023-01-22 16:11:50.818161: step: 600/464, loss: 0.0019635711796581745 2023-01-22 16:11:51.570899: step: 602/464, loss: 0.009792122058570385 2023-01-22 16:11:52.252169: step: 604/464, loss: 0.0413961298763752 2023-01-22 16:11:53.001082: step: 606/464, loss: 0.11280696839094162 2023-01-22 16:11:53.761947: step: 
608/464, loss: 0.044669028371572495 2023-01-22 16:11:54.514466: step: 610/464, loss: 0.009040174074470997 2023-01-22 16:11:55.359163: step: 612/464, loss: 0.03160270303487778 2023-01-22 16:11:56.053639: step: 614/464, loss: 0.014951037243008614 2023-01-22 16:11:56.701285: step: 616/464, loss: 0.01582406647503376 2023-01-22 16:11:57.520308: step: 618/464, loss: 0.010089858435094357 2023-01-22 16:11:58.238844: step: 620/464, loss: 0.05757782980799675 2023-01-22 16:11:58.960697: step: 622/464, loss: 0.002146498067304492 2023-01-22 16:11:59.683809: step: 624/464, loss: 0.008466691710054874 2023-01-22 16:12:00.597886: step: 626/464, loss: 0.016295647248625755 2023-01-22 16:12:01.294562: step: 628/464, loss: 0.04091161862015724 2023-01-22 16:12:02.038371: step: 630/464, loss: 0.0015703781973570585 2023-01-22 16:12:02.732212: step: 632/464, loss: 0.011625193059444427 2023-01-22 16:12:03.461944: step: 634/464, loss: 0.016939258202910423 2023-01-22 16:12:04.173277: step: 636/464, loss: 0.0007383174379356205 2023-01-22 16:12:05.022161: step: 638/464, loss: 0.00040101975901052356 2023-01-22 16:12:05.817096: step: 640/464, loss: 0.0008818531641736627 2023-01-22 16:12:06.639746: step: 642/464, loss: 0.024727830663323402 2023-01-22 16:12:07.340286: step: 644/464, loss: 0.03569096326828003 2023-01-22 16:12:08.142199: step: 646/464, loss: 0.0017474403139203787 2023-01-22 16:12:08.870401: step: 648/464, loss: 0.01313767209649086 2023-01-22 16:12:09.693424: step: 650/464, loss: 0.030891088768839836 2023-01-22 16:12:10.342049: step: 652/464, loss: 0.01417345367372036 2023-01-22 16:12:11.049700: step: 654/464, loss: 0.02216888591647148 2023-01-22 16:12:11.817070: step: 656/464, loss: 0.0041503203101456165 2023-01-22 16:12:12.492783: step: 658/464, loss: 0.024956727400422096 2023-01-22 16:12:13.215748: step: 660/464, loss: 0.03885156288743019 2023-01-22 16:12:13.930820: step: 662/464, loss: 0.034349534660577774 2023-01-22 16:12:14.631924: step: 664/464, loss: 0.005326046142727137 
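The loss stream above follows a fixed shape, `TIMESTAMP: step: N/TOTAL, loss: VALUE`. A minimal sketch for pulling `(timestamp, step, total, loss)` tuples out of a raw log blob like this one (the pattern and helper name are illustrative, not part of train.py):

```python
import re

# One log entry, e.g.
# "2023-01-22 16:12:14.631924: step: 664/464, loss: 0.005326046142727137"
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step: (\d+)/(\d+), loss: ([0-9.eE+-]+)"
)

def parse_entries(text):
    """Yield (timestamp, step, steps_per_epoch, loss) from a raw log blob."""
    for m in ENTRY.finditer(text):
        ts, step, total, loss = m.groups()
        yield ts, int(step), int(total), float(loss)

demo = "2023-01-22 16:12:14.631924: step: 664/464, loss: 0.005326046142727137"
rows = list(parse_entries(demo))
```

The character class `[0-9.eE+-]+` also covers scientific-notation losses such as `6.435057002818212e-05`, which appear throughout this log.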
2023-01-22 16:12:15.347580: step: 666/464, loss: 0.0045426227152347565 2023-01-22 16:12:16.102953: step: 668/464, loss: 0.007938658818602562 2023-01-22 16:12:16.766693: step: 670/464, loss: 0.0032222664449363947 2023-01-22 16:12:17.458357: step: 672/464, loss: 0.02680307626724243 2023-01-22 16:12:18.140728: step: 674/464, loss: 0.008386366069316864 2023-01-22 16:12:18.874157: step: 676/464, loss: 0.010037817060947418 2023-01-22 16:12:19.629682: step: 678/464, loss: 0.40986937284469604 2023-01-22 16:12:20.302430: step: 680/464, loss: 0.0359659306704998 2023-01-22 16:12:21.084284: step: 682/464, loss: 0.0025637580547481775 2023-01-22 16:12:21.794470: step: 684/464, loss: 0.0163812804967165 2023-01-22 16:12:22.502966: step: 686/464, loss: 0.0011752874124795198 2023-01-22 16:12:23.228214: step: 688/464, loss: 0.011594103649258614 2023-01-22 16:12:23.962412: step: 690/464, loss: 0.003088256809860468 2023-01-22 16:12:24.632700: step: 692/464, loss: 0.0018205085070803761 2023-01-22 16:12:25.381439: step: 694/464, loss: 0.005875764414668083 2023-01-22 16:12:26.223208: step: 696/464, loss: 0.0025446880608797073 2023-01-22 16:12:26.855128: step: 698/464, loss: 6.435057002818212e-05 2023-01-22 16:12:27.514518: step: 700/464, loss: 0.006112277507781982 2023-01-22 16:12:28.235503: step: 702/464, loss: 0.03258458524942398 2023-01-22 16:12:28.949442: step: 704/464, loss: 0.0015139655442908406 2023-01-22 16:12:29.689272: step: 706/464, loss: 0.0433700829744339 2023-01-22 16:12:30.416066: step: 708/464, loss: 0.04399729147553444 2023-01-22 16:12:31.156611: step: 710/464, loss: 0.01889374852180481 2023-01-22 16:12:31.900507: step: 712/464, loss: 0.0007756176055409014 2023-01-22 16:12:32.680221: step: 714/464, loss: 0.011662695556879044 2023-01-22 16:12:33.365992: step: 716/464, loss: 0.0009218156919814646 2023-01-22 16:12:34.077533: step: 718/464, loss: 0.01887955144047737 2023-01-22 16:12:34.849893: step: 720/464, loss: 0.1750974804162979 2023-01-22 16:12:35.504259: step: 722/464, 
loss: 0.0001728379138512537 2023-01-22 16:12:36.186122: step: 724/464, loss: 0.009151075035333633 2023-01-22 16:12:36.929672: step: 726/464, loss: 0.0006785045843571424 2023-01-22 16:12:37.688402: step: 728/464, loss: 0.0013461982598528266 2023-01-22 16:12:38.487433: step: 730/464, loss: 0.010900832712650299 2023-01-22 16:12:39.244253: step: 732/464, loss: 0.01641363464295864 2023-01-22 16:12:40.020514: step: 734/464, loss: 0.010521006770431995 2023-01-22 16:12:40.783222: step: 736/464, loss: 0.0008467060979455709 2023-01-22 16:12:41.443936: step: 738/464, loss: 0.015393110923469067 2023-01-22 16:12:42.159002: step: 740/464, loss: 0.0008853751933202147 2023-01-22 16:12:42.943189: step: 742/464, loss: 0.0014580088900402188 2023-01-22 16:12:43.677403: step: 744/464, loss: 0.22433538734912872 2023-01-22 16:12:44.424445: step: 746/464, loss: 0.0008134776726365089 2023-01-22 16:12:45.139918: step: 748/464, loss: 0.007577679585665464 2023-01-22 16:12:45.969212: step: 750/464, loss: 0.0440809428691864 2023-01-22 16:12:46.663615: step: 752/464, loss: 0.010974534787237644 2023-01-22 16:12:47.440789: step: 754/464, loss: 0.036295562982559204 2023-01-22 16:12:48.151511: step: 756/464, loss: 0.00034559096093289554 2023-01-22 16:12:48.864912: step: 758/464, loss: 0.006786028388887644 2023-01-22 16:12:49.669156: step: 760/464, loss: 0.43013352155685425 2023-01-22 16:12:50.413716: step: 762/464, loss: 0.0012140395119786263 2023-01-22 16:12:51.186395: step: 764/464, loss: 0.020293693989515305 2023-01-22 16:12:52.035297: step: 766/464, loss: 0.0025471991393715143 2023-01-22 16:12:52.726376: step: 768/464, loss: 0.0038294994737952948 2023-01-22 16:12:53.416801: step: 770/464, loss: 0.00017575285164639354 2023-01-22 16:12:54.147861: step: 772/464, loss: 0.0003553438582457602 2023-01-22 16:12:54.884698: step: 774/464, loss: 0.06080649048089981 2023-01-22 16:12:55.629449: step: 776/464, loss: 8.546032040612772e-05 2023-01-22 16:12:56.382040: step: 778/464, loss: 0.06992447376251221 
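The evaluation summaries printed at the end of each epoch report `p`, `r`, `f1` per component plus a `combined` score. The logged numbers are consistent with the standard F1 (harmonic mean of precision and recall) and with `combined` being the product of the template and slot F1 scores; this is an inference from the logged values, not confirmed from train.py. A sketch, using the epoch-34 "Dev Chinese" figures from this log:

```python
def f1(p, r):
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

# Values taken from the epoch-34 "Dev Chinese" summary in this log.
template_f1 = f1(1.0, 0.5833333333333334)            # logged f1: 0.7368421052631579
slot_f1 = f1(0.3138795954468544, 0.3519978005865103)  # logged f1: 0.33184765815579775

# 'combined' appears to be template_f1 * slot_f1 (logged: 0.24451932706216675).
combined = template_f1 * slot_f1
```

Under this reading, a combined score can only be as high as the weaker of the two F1 components, which matches how the "Current best result" entries track both template and slot quality.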
2023-01-22 16:12:57.044098: step: 780/464, loss: 0.004821065813302994 2023-01-22 16:12:57.763997: step: 782/464, loss: 0.03713594749569893 2023-01-22 16:12:58.550652: step: 784/464, loss: 0.006182161625474691 2023-01-22 16:12:59.281019: step: 786/464, loss: 0.0001664771552896127 2023-01-22 16:12:59.966697: step: 788/464, loss: 0.029279792681336403 2023-01-22 16:13:00.664163: step: 790/464, loss: 0.08578027039766312 2023-01-22 16:13:01.306625: step: 792/464, loss: 0.0020249513909220695 2023-01-22 16:13:02.037076: step: 794/464, loss: 0.013059469871222973 2023-01-22 16:13:02.725232: step: 796/464, loss: 0.0051474785432219505 2023-01-22 16:13:03.444804: step: 798/464, loss: 0.03641364723443985 2023-01-22 16:13:04.151287: step: 800/464, loss: 0.0007527320994995534 2023-01-22 16:13:04.891144: step: 802/464, loss: 0.012699021026492119 2023-01-22 16:13:05.642563: step: 804/464, loss: 0.00019273671205155551 2023-01-22 16:13:06.462246: step: 806/464, loss: 0.01091878954321146 2023-01-22 16:13:07.186395: step: 808/464, loss: 0.019247926771640778 2023-01-22 16:13:07.902214: step: 810/464, loss: 0.012168015353381634 2023-01-22 16:13:08.745743: step: 812/464, loss: 0.01567387580871582 2023-01-22 16:13:09.505291: step: 814/464, loss: 0.0009724770206958055 2023-01-22 16:13:10.179565: step: 816/464, loss: 0.0037853403482586145 2023-01-22 16:13:10.908332: step: 818/464, loss: 0.0049398913979530334 2023-01-22 16:13:11.604978: step: 820/464, loss: 0.006806234363466501 2023-01-22 16:13:12.363106: step: 822/464, loss: 0.005008451174944639 2023-01-22 16:13:13.165227: step: 824/464, loss: 0.16014228761196136 2023-01-22 16:13:13.962997: step: 826/464, loss: 0.0002244079951196909 2023-01-22 16:13:14.713654: step: 828/464, loss: 0.005260965786874294 2023-01-22 16:13:15.467541: step: 830/464, loss: 0.012760087847709656 2023-01-22 16:13:16.191329: step: 832/464, loss: 0.0007453107973560691 2023-01-22 16:13:16.942406: step: 834/464, loss: 0.02452758140861988 2023-01-22 16:13:17.668815: step: 
836/464, loss: 0.058134328573942184 2023-01-22 16:13:18.375755: step: 838/464, loss: 0.02890494465827942 2023-01-22 16:13:19.128693: step: 840/464, loss: 0.008339189924299717 2023-01-22 16:13:19.903527: step: 842/464, loss: 0.00544707989320159 2023-01-22 16:13:20.595230: step: 844/464, loss: 9.86708837444894e-05 2023-01-22 16:13:21.332231: step: 846/464, loss: 0.00672255689278245 2023-01-22 16:13:21.964744: step: 848/464, loss: 0.006709387991577387 2023-01-22 16:13:22.721155: step: 850/464, loss: 0.022981248795986176 2023-01-22 16:13:23.458943: step: 852/464, loss: 0.2067977786064148 2023-01-22 16:13:24.218425: step: 854/464, loss: 0.2143319994211197 2023-01-22 16:13:24.919257: step: 856/464, loss: 0.004001646768301725 2023-01-22 16:13:25.715208: step: 858/464, loss: 0.08282311260700226 2023-01-22 16:13:26.361854: step: 860/464, loss: 0.006189326755702496 2023-01-22 16:13:27.038083: step: 862/464, loss: 0.00050537777133286 2023-01-22 16:13:27.746113: step: 864/464, loss: 0.002543987240642309 2023-01-22 16:13:28.429823: step: 866/464, loss: 0.0006222067750059068 2023-01-22 16:13:29.147720: step: 868/464, loss: 0.0864039734005928 2023-01-22 16:13:29.862035: step: 870/464, loss: 0.009794293902814388 2023-01-22 16:13:30.620047: step: 872/464, loss: 0.0035909328144043684 2023-01-22 16:13:31.337815: step: 874/464, loss: 0.15241585671901703 2023-01-22 16:13:32.084408: step: 876/464, loss: 0.0015292530879378319 2023-01-22 16:13:32.765397: step: 878/464, loss: 0.017026687040925026 2023-01-22 16:13:33.504698: step: 880/464, loss: 0.011374955996870995 2023-01-22 16:13:34.255874: step: 882/464, loss: 0.01445054356008768 2023-01-22 16:13:34.931803: step: 884/464, loss: 0.01035305205732584 2023-01-22 16:13:35.666744: step: 886/464, loss: 0.003956570755690336 2023-01-22 16:13:36.395497: step: 888/464, loss: 0.1221495121717453 2023-01-22 16:13:37.212608: step: 890/464, loss: 0.013920644298195839 2023-01-22 16:13:37.996622: step: 892/464, loss: 0.002684567356482148 2023-01-22 
2023-01-22 16:13:38.763747: step: 894/464, loss: 0.07813320308923721
2023-01-22 16:13:39.512042: step: 896/464, loss: 0.01626390591263771
2023-01-22 16:13:40.254145: step: 898/464, loss: 0.5564307570457458
2023-01-22 16:13:40.973922: step: 900/464, loss: 0.0021616253070533276
2023-01-22 16:13:41.666662: step: 902/464, loss: 0.06270702183246613
2023-01-22 16:13:42.358036: step: 904/464, loss: 0.0008776888716965914
2023-01-22 16:13:43.086362: step: 906/464, loss: 0.004115441348403692
2023-01-22 16:13:43.841297: step: 908/464, loss: 0.004244999028742313
2023-01-22 16:13:44.594068: step: 910/464, loss: 0.035991959273815155
2023-01-22 16:13:45.327477: step: 912/464, loss: 0.005663156975060701
2023-01-22 16:13:46.043202: step: 914/464, loss: 0.03923436626791954
2023-01-22 16:13:46.741157: step: 916/464, loss: 0.0590827614068985
2023-01-22 16:13:47.441970: step: 918/464, loss: 0.019325165078043938
2023-01-22 16:13:48.143508: step: 920/464, loss: 0.009129509329795837
2023-01-22 16:13:48.883031: step: 922/464, loss: 0.13903824985027313
2023-01-22 16:13:49.613365: step: 924/464, loss: 0.0009124143980443478
2023-01-22 16:13:50.369164: step: 926/464, loss: 0.005049745086580515
2023-01-22 16:13:51.118114: step: 928/464, loss: 0.0027745855040848255
2023-01-22 16:13:51.810997: step: 930/464, loss: 0.04152434319257736
==================================================
Loss: 0.034
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3138795954468544, 'r': 0.3519978005865103, 'f1': 0.33184765815579775}, 'combined': 0.24451932706216675, 'epoch': 34}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2925853300648862, 'r': 0.28535028233825693, 'f1': 0.2889225192228119}, 'combined': 0.17943609088574636, 'epoch': 34}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3009871921556704, 'r': 0.3415376487838537, 'f1': 0.3199828282828282}, 'combined': 0.23577682083997867, 'epoch': 34}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.28340520866169455, 'r': 0.28368553033198307, 'f1': 0.28354530021318325}, 'combined': 0.17609655486924014, 'epoch': 34}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.313382637887527, 'r': 0.3472779137121742, 'f1': 0.32946077502487087}, 'combined': 0.24276057107095747, 'epoch': 34}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2925592375945749, 'r': 0.28678314208431677, 'f1': 0.2896423957441803}, 'combined': 0.1798831720937541, 'epoch': 34}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 34}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.23780487804878048, 'r': 0.42391304347826086, 'f1': 0.3046875}, 'combined': 0.15234375, 'epoch': 34}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.36904761904761907, 'r': 0.2672413793103448, 'f1': 0.31}, 'combined': 0.20666666666666667, 'epoch': 34}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 35
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:16:31.323555: step: 2/464, loss: 0.00206485646776855
2023-01-22 16:16:32.079857: step: 4/464, loss: 0.0008505339501425624
2023-01-22 16:16:32.718895: step: 6/464, loss: 0.021608030423521996
2023-01-22 16:16:33.425860: step: 8/464, loss: 0.010798071511089802
2023-01-22 16:16:34.135719: step: 10/464, loss:
0.024295778945088387 2023-01-22 16:16:34.800894: step: 12/464, loss: 0.003499263199046254 2023-01-22 16:16:35.566169: step: 14/464, loss: 0.03136338293552399 2023-01-22 16:16:36.279943: step: 16/464, loss: 0.00037076350417919457 2023-01-22 16:16:36.974634: step: 18/464, loss: 0.06392989307641983 2023-01-22 16:16:37.718300: step: 20/464, loss: 0.021260160952806473 2023-01-22 16:16:38.392393: step: 22/464, loss: 0.0022600204683840275 2023-01-22 16:16:39.061680: step: 24/464, loss: 0.0006521603208966553 2023-01-22 16:16:39.773951: step: 26/464, loss: 0.0018833683570846915 2023-01-22 16:16:40.711602: step: 28/464, loss: 0.004336627200245857 2023-01-22 16:16:41.485964: step: 30/464, loss: 0.11404263228178024 2023-01-22 16:16:42.214297: step: 32/464, loss: 0.0011401284718886018 2023-01-22 16:16:42.887302: step: 34/464, loss: 3.995103179477155e-05 2023-01-22 16:16:43.618694: step: 36/464, loss: 0.004084045998752117 2023-01-22 16:16:44.496734: step: 38/464, loss: 0.00012493340182118118 2023-01-22 16:16:45.236209: step: 40/464, loss: 0.03752082213759422 2023-01-22 16:16:45.952834: step: 42/464, loss: 0.002574938116595149 2023-01-22 16:16:46.621551: step: 44/464, loss: 0.00015688323765061796 2023-01-22 16:16:47.285311: step: 46/464, loss: 0.00012026409240206704 2023-01-22 16:16:48.091515: step: 48/464, loss: 0.0011076332302764058 2023-01-22 16:16:48.817254: step: 50/464, loss: 0.0035281500313431025 2023-01-22 16:16:49.604867: step: 52/464, loss: 0.0036910888738930225 2023-01-22 16:16:50.350880: step: 54/464, loss: 0.001720602624118328 2023-01-22 16:16:51.038308: step: 56/464, loss: 0.389969140291214 2023-01-22 16:16:51.817485: step: 58/464, loss: 0.003773375414311886 2023-01-22 16:16:52.576816: step: 60/464, loss: 0.0032824124209582806 2023-01-22 16:16:53.360599: step: 62/464, loss: 0.0008738187025301158 2023-01-22 16:16:54.079171: step: 64/464, loss: 8.97213103598915e-05 2023-01-22 16:16:54.773689: step: 66/464, loss: 0.00034595775650814176 2023-01-22 16:16:55.498651: step: 
68/464, loss: 0.0012479635188356042 2023-01-22 16:16:56.232094: step: 70/464, loss: 0.012148449197411537 2023-01-22 16:16:56.904616: step: 72/464, loss: 0.0016343119787052274 2023-01-22 16:16:57.649424: step: 74/464, loss: 0.012638113461434841 2023-01-22 16:16:58.392707: step: 76/464, loss: 9.724879055283964e-05 2023-01-22 16:16:59.082577: step: 78/464, loss: 0.07232683897018433 2023-01-22 16:16:59.817834: step: 80/464, loss: 0.0045994920656085014 2023-01-22 16:17:00.479571: step: 82/464, loss: 0.0067751286551356316 2023-01-22 16:17:01.241059: step: 84/464, loss: 0.004917151760309935 2023-01-22 16:17:01.999334: step: 86/464, loss: 0.0031492365524172783 2023-01-22 16:17:02.691533: step: 88/464, loss: 0.14419697225093842 2023-01-22 16:17:03.446182: step: 90/464, loss: 0.010157251730561256 2023-01-22 16:17:04.161628: step: 92/464, loss: 0.028758902102708817 2023-01-22 16:17:04.891798: step: 94/464, loss: 0.015158405527472496 2023-01-22 16:17:05.654225: step: 96/464, loss: 0.08756659179925919 2023-01-22 16:17:06.373134: step: 98/464, loss: 0.0077830045484006405 2023-01-22 16:17:07.108291: step: 100/464, loss: 0.11185619980096817 2023-01-22 16:17:07.859912: step: 102/464, loss: 0.18486009538173676 2023-01-22 16:17:08.653709: step: 104/464, loss: 0.011137974448502064 2023-01-22 16:17:09.394463: step: 106/464, loss: 0.0011875235941261053 2023-01-22 16:17:10.064702: step: 108/464, loss: 0.003935209009796381 2023-01-22 16:17:10.825700: step: 110/464, loss: 0.0009032696834765375 2023-01-22 16:17:11.500876: step: 112/464, loss: 0.0010450384579598904 2023-01-22 16:17:12.202905: step: 114/464, loss: 0.006690158508718014 2023-01-22 16:17:12.910870: step: 116/464, loss: 0.001348541583865881 2023-01-22 16:17:13.704626: step: 118/464, loss: 0.024258917197585106 2023-01-22 16:17:14.440218: step: 120/464, loss: 0.0007817841251380742 2023-01-22 16:17:15.126641: step: 122/464, loss: 0.000877066922839731 2023-01-22 16:17:15.794897: step: 124/464, loss: 0.002365712309256196 2023-01-22 
16:17:16.619975: step: 126/464, loss: 0.004180488176643848 2023-01-22 16:17:17.370256: step: 128/464, loss: 0.013827762566506863 2023-01-22 16:17:18.060062: step: 130/464, loss: 0.000862470711581409 2023-01-22 16:17:18.766672: step: 132/464, loss: 0.10408184677362442 2023-01-22 16:17:19.518706: step: 134/464, loss: 0.017195992171764374 2023-01-22 16:17:20.222018: step: 136/464, loss: 0.01907893270254135 2023-01-22 16:17:20.964421: step: 138/464, loss: 0.15260766446590424 2023-01-22 16:17:21.706692: step: 140/464, loss: 6.318865780485794e-05 2023-01-22 16:17:22.403900: step: 142/464, loss: 0.00449691666290164 2023-01-22 16:17:23.106999: step: 144/464, loss: 0.00045366917038336396 2023-01-22 16:17:23.869770: step: 146/464, loss: 1.1969776096520945e-05 2023-01-22 16:17:24.588067: step: 148/464, loss: 0.0013482400681823492 2023-01-22 16:17:25.422785: step: 150/464, loss: 0.03972357511520386 2023-01-22 16:17:26.117141: step: 152/464, loss: 0.004650574177503586 2023-01-22 16:17:26.781737: step: 154/464, loss: 0.0019915217999368906 2023-01-22 16:17:27.492836: step: 156/464, loss: 0.00637625390663743 2023-01-22 16:17:28.265584: step: 158/464, loss: 0.007026030216366053 2023-01-22 16:17:29.041312: step: 160/464, loss: 0.0008738907054066658 2023-01-22 16:17:29.859160: step: 162/464, loss: 0.003001864068210125 2023-01-22 16:17:30.609779: step: 164/464, loss: 0.0010084062814712524 2023-01-22 16:17:31.381627: step: 166/464, loss: 0.014043791219592094 2023-01-22 16:17:32.053499: step: 168/464, loss: 0.009049778804183006 2023-01-22 16:17:32.823125: step: 170/464, loss: 0.025290049612522125 2023-01-22 16:17:33.571894: step: 172/464, loss: 0.017468422651290894 2023-01-22 16:17:34.316283: step: 174/464, loss: 0.3379783034324646 2023-01-22 16:17:35.035474: step: 176/464, loss: 0.009229674004018307 2023-01-22 16:17:35.864787: step: 178/464, loss: 0.08187200129032135 2023-01-22 16:17:36.582670: step: 180/464, loss: 0.0059152403846383095 2023-01-22 16:17:37.307156: step: 182/464, loss: 
0.0016771698137745261 2023-01-22 16:17:38.053871: step: 184/464, loss: 0.0017269115196540952 2023-01-22 16:17:38.747941: step: 186/464, loss: 0.06236407905817032 2023-01-22 16:17:39.471937: step: 188/464, loss: 0.06140582635998726 2023-01-22 16:17:40.279494: step: 190/464, loss: 0.0031047058291733265 2023-01-22 16:17:40.956589: step: 192/464, loss: 0.006041200365871191 2023-01-22 16:17:41.662926: step: 194/464, loss: 0.00044914300087839365 2023-01-22 16:17:42.395650: step: 196/464, loss: 0.0009885894833132625 2023-01-22 16:17:43.041346: step: 198/464, loss: 0.009171642363071442 2023-01-22 16:17:43.702639: step: 200/464, loss: 0.0006789682083763182 2023-01-22 16:17:44.564173: step: 202/464, loss: 0.17516861855983734 2023-01-22 16:17:45.359652: step: 204/464, loss: 0.006859530229121447 2023-01-22 16:17:46.143957: step: 206/464, loss: 0.0049885171465575695 2023-01-22 16:17:46.824422: step: 208/464, loss: 0.0389438234269619 2023-01-22 16:17:47.627102: step: 210/464, loss: 0.004782094154506922 2023-01-22 16:17:48.409778: step: 212/464, loss: 0.00029921767418272793 2023-01-22 16:17:49.109059: step: 214/464, loss: 0.0008353688172064722 2023-01-22 16:17:49.880743: step: 216/464, loss: 0.0006163629586808383 2023-01-22 16:17:50.589675: step: 218/464, loss: 0.009952775202691555 2023-01-22 16:17:51.323958: step: 220/464, loss: 0.004259512759745121 2023-01-22 16:17:52.057485: step: 222/464, loss: 0.1413278877735138 2023-01-22 16:17:52.835158: step: 224/464, loss: 0.006144764833152294 2023-01-22 16:17:53.600968: step: 226/464, loss: 0.35328689217567444 2023-01-22 16:17:54.390958: step: 228/464, loss: 0.001853810390457511 2023-01-22 16:17:55.175050: step: 230/464, loss: 0.016084423288702965 2023-01-22 16:17:55.849772: step: 232/464, loss: 0.005866493564099073 2023-01-22 16:17:56.493709: step: 234/464, loss: 0.0011610686779022217 2023-01-22 16:17:57.224145: step: 236/464, loss: 0.0002700358454603702 2023-01-22 16:17:57.932348: step: 238/464, loss: 0.0011712840059772134 2023-01-22 
16:17:58.636909: step: 240/464, loss: 0.03383626788854599 2023-01-22 16:17:59.357522: step: 242/464, loss: 0.0025248515885323286 2023-01-22 16:18:00.120913: step: 244/464, loss: 0.007510875351727009 2023-01-22 16:18:00.853860: step: 246/464, loss: 0.02027537114918232 2023-01-22 16:18:01.603750: step: 248/464, loss: 0.0294361412525177 2023-01-22 16:18:02.280809: step: 250/464, loss: 0.14149460196495056 2023-01-22 16:18:03.063351: step: 252/464, loss: 0.002963610924780369 2023-01-22 16:18:03.898475: step: 254/464, loss: 0.004055642522871494 2023-01-22 16:18:04.650385: step: 256/464, loss: 0.004812467377632856 2023-01-22 16:18:05.392762: step: 258/464, loss: 0.009900541044771671 2023-01-22 16:18:06.083804: step: 260/464, loss: 0.007619260810315609 2023-01-22 16:18:06.780031: step: 262/464, loss: 0.0006539294845424592 2023-01-22 16:18:07.611166: step: 264/464, loss: 0.005958850029855967 2023-01-22 16:18:08.368793: step: 266/464, loss: 0.006719652563333511 2023-01-22 16:18:09.083207: step: 268/464, loss: 0.004569544456899166 2023-01-22 16:18:09.772172: step: 270/464, loss: 0.0017016378697007895 2023-01-22 16:18:10.517442: step: 272/464, loss: 0.005704191979020834 2023-01-22 16:18:11.321652: step: 274/464, loss: 0.002754301531240344 2023-01-22 16:18:12.077994: step: 276/464, loss: 0.0028462556656450033 2023-01-22 16:18:12.849472: step: 278/464, loss: 0.0028985131066292524 2023-01-22 16:18:13.595340: step: 280/464, loss: 0.019997483119368553 2023-01-22 16:18:14.288422: step: 282/464, loss: 0.0039010359905660152 2023-01-22 16:18:15.016778: step: 284/464, loss: 8.691203402122483e-05 2023-01-22 16:18:15.656053: step: 286/464, loss: 0.0056920647621154785 2023-01-22 16:18:16.393642: step: 288/464, loss: 0.009940583258867264 2023-01-22 16:18:17.183243: step: 290/464, loss: 0.26928314566612244 2023-01-22 16:18:17.886664: step: 292/464, loss: 0.00306792207993567 2023-01-22 16:18:18.595511: step: 294/464, loss: 0.0039005670696496964 2023-01-22 16:18:19.284899: step: 296/464, loss: 
0.0011833877069875598 2023-01-22 16:18:20.104619: step: 298/464, loss: 0.0015655200695618987 2023-01-22 16:18:20.805193: step: 300/464, loss: 0.1414550095796585 2023-01-22 16:18:21.484568: step: 302/464, loss: 0.00019204954151064157 2023-01-22 16:18:22.261890: step: 304/464, loss: 0.2077431082725525 2023-01-22 16:18:22.974169: step: 306/464, loss: 0.010231670923531055 2023-01-22 16:18:23.674695: step: 308/464, loss: 0.000895325792953372 2023-01-22 16:18:24.466423: step: 310/464, loss: 2.6799411898537073e-06 2023-01-22 16:18:25.221651: step: 312/464, loss: 0.04067198559641838 2023-01-22 16:18:25.914848: step: 314/464, loss: 0.002498132176697254 2023-01-22 16:18:26.664882: step: 316/464, loss: 0.015202999114990234 2023-01-22 16:18:27.409450: step: 318/464, loss: 0.0015499275177717209 2023-01-22 16:18:28.121451: step: 320/464, loss: 0.019704598933458328 2023-01-22 16:18:28.780509: step: 322/464, loss: 0.006374839227646589 2023-01-22 16:18:29.511131: step: 324/464, loss: 0.007268859073519707 2023-01-22 16:18:30.212147: step: 326/464, loss: 0.0003769928589463234 2023-01-22 16:18:30.992019: step: 328/464, loss: 0.30836057662963867 2023-01-22 16:18:31.786058: step: 330/464, loss: 0.005678210873156786 2023-01-22 16:18:32.480515: step: 332/464, loss: 0.025175319984555244 2023-01-22 16:18:33.138107: step: 334/464, loss: 0.00025786174228414893 2023-01-22 16:18:33.816175: step: 336/464, loss: 0.002200294751673937 2023-01-22 16:18:34.590930: step: 338/464, loss: 0.009255696088075638 2023-01-22 16:18:35.298442: step: 340/464, loss: 0.0015450895298272371 2023-01-22 16:18:36.125152: step: 342/464, loss: 0.0051421429961919785 2023-01-22 16:18:36.817304: step: 344/464, loss: 0.010502039454877377 2023-01-22 16:18:37.586065: step: 346/464, loss: 0.033631693571805954 2023-01-22 16:18:38.294463: step: 348/464, loss: 0.06362398713827133 2023-01-22 16:18:39.146414: step: 350/464, loss: 0.012094840407371521 2023-01-22 16:18:39.890607: step: 352/464, loss: 0.06655651330947876 2023-01-22 
16:18:40.565326: step: 354/464, loss: 0.048538390547037125 2023-01-22 16:18:41.274230: step: 356/464, loss: 0.0003698621585499495 2023-01-22 16:18:42.074600: step: 358/464, loss: 0.3159915804862976 2023-01-22 16:18:42.766119: step: 360/464, loss: 0.0037905359640717506 2023-01-22 16:18:43.514462: step: 362/464, loss: 0.0015841275453567505 2023-01-22 16:18:44.243988: step: 364/464, loss: 0.0017936860676854849 2023-01-22 16:18:45.004726: step: 366/464, loss: 0.0035860745701938868 2023-01-22 16:18:45.715550: step: 368/464, loss: 0.0042134555988013744 2023-01-22 16:18:46.452665: step: 370/464, loss: 0.02218709886074066 2023-01-22 16:18:47.173109: step: 372/464, loss: 1.334579348564148 2023-01-22 16:18:47.938294: step: 374/464, loss: 0.009962356649339199 2023-01-22 16:18:48.730299: step: 376/464, loss: 0.02284334972500801 2023-01-22 16:18:49.407535: step: 378/464, loss: 0.003422256326302886 2023-01-22 16:18:50.129476: step: 380/464, loss: 0.007594207767397165 2023-01-22 16:18:50.882426: step: 382/464, loss: 0.008857734501361847 2023-01-22 16:18:51.613400: step: 384/464, loss: 0.01347698550671339 2023-01-22 16:18:52.328540: step: 386/464, loss: 0.03327897563576698 2023-01-22 16:18:53.134569: step: 388/464, loss: 9.90178159554489e-05 2023-01-22 16:18:53.869967: step: 390/464, loss: 0.0015657603507861495 2023-01-22 16:18:54.632556: step: 392/464, loss: 0.01993647962808609 2023-01-22 16:18:55.330419: step: 394/464, loss: 3.237658893340267e-05 2023-01-22 16:18:56.097298: step: 396/464, loss: 0.00817184243351221 2023-01-22 16:18:56.798450: step: 398/464, loss: 0.0779031291604042 2023-01-22 16:18:57.528438: step: 400/464, loss: 0.0011360801290720701 2023-01-22 16:18:58.254804: step: 402/464, loss: 0.005148367024958134 2023-01-22 16:18:59.042128: step: 404/464, loss: 0.008539358153939247 2023-01-22 16:18:59.779156: step: 406/464, loss: 0.09758865833282471 2023-01-22 16:19:00.474694: step: 408/464, loss: 0.0017494140192866325 2023-01-22 16:19:01.139493: step: 410/464, loss: 
0.1648520678281784 2023-01-22 16:19:01.781826: step: 412/464, loss: 0.014005640521645546 2023-01-22 16:19:02.510217: step: 414/464, loss: 0.002599345985800028 2023-01-22 16:19:03.248896: step: 416/464, loss: 0.00011764621740439907 2023-01-22 16:19:04.011483: step: 418/464, loss: 0.0003326554433442652 2023-01-22 16:19:04.771129: step: 420/464, loss: 0.03486809507012367 2023-01-22 16:19:05.535360: step: 422/464, loss: 0.016408352181315422 2023-01-22 16:19:06.215909: step: 424/464, loss: 0.003920829389244318 2023-01-22 16:19:06.942328: step: 426/464, loss: 0.02062283083796501 2023-01-22 16:19:07.601880: step: 428/464, loss: 0.009015236981213093 2023-01-22 16:19:08.274339: step: 430/464, loss: 0.003978882450610399 2023-01-22 16:19:09.027450: step: 432/464, loss: 0.00880517903715372 2023-01-22 16:19:09.789259: step: 434/464, loss: 0.011073540896177292 2023-01-22 16:19:10.538855: step: 436/464, loss: 0.03089657984673977 2023-01-22 16:19:11.220305: step: 438/464, loss: 0.0012413024669513106 2023-01-22 16:19:11.953635: step: 440/464, loss: 0.01738613471388817 2023-01-22 16:19:12.670489: step: 442/464, loss: 0.0015985603677108884 2023-01-22 16:19:13.471002: step: 444/464, loss: 0.0005925216828472912 2023-01-22 16:19:14.197176: step: 446/464, loss: 0.17484651505947113 2023-01-22 16:19:14.944223: step: 448/464, loss: 0.0009653688175603747 2023-01-22 16:19:15.790135: step: 450/464, loss: 0.007736225612461567 2023-01-22 16:19:16.451321: step: 452/464, loss: 0.0001321414310950786 2023-01-22 16:19:17.153457: step: 454/464, loss: 0.11280196905136108 2023-01-22 16:19:17.874226: step: 456/464, loss: 0.07258933782577515 2023-01-22 16:19:18.511219: step: 458/464, loss: 0.0474422313272953 2023-01-22 16:19:19.295892: step: 460/464, loss: 0.024412963539361954 2023-01-22 16:19:20.093410: step: 462/464, loss: 0.028620421886444092 2023-01-22 16:19:20.816397: step: 464/464, loss: 0.010022708214819431 2023-01-22 16:19:21.656853: step: 466/464, loss: 0.04155333712697029 2023-01-22 
16:19:22.370720: step: 468/464, loss: 0.004220477305352688 2023-01-22 16:19:23.136110: step: 470/464, loss: 0.017753012478351593 2023-01-22 16:19:23.850075: step: 472/464, loss: 0.0043309456668794155 2023-01-22 16:19:24.627493: step: 474/464, loss: 0.0008676517754793167 2023-01-22 16:19:25.367191: step: 476/464, loss: 0.011539888568222523 2023-01-22 16:19:26.018468: step: 478/464, loss: 0.0016300047282129526 2023-01-22 16:19:26.748985: step: 480/464, loss: 0.0011122695868834853 2023-01-22 16:19:27.538187: step: 482/464, loss: 0.009475810453295708 2023-01-22 16:19:28.289016: step: 484/464, loss: 0.013957416638731956 2023-01-22 16:19:29.027765: step: 486/464, loss: 0.022875269874930382 2023-01-22 16:19:29.809297: step: 488/464, loss: 0.0001867082464741543 2023-01-22 16:19:30.502899: step: 490/464, loss: 0.5683570504188538 2023-01-22 16:19:31.203285: step: 492/464, loss: 0.0029813034925609827 2023-01-22 16:19:32.000915: step: 494/464, loss: 0.041646551340818405 2023-01-22 16:19:32.711717: step: 496/464, loss: 0.14465898275375366 2023-01-22 16:19:33.410861: step: 498/464, loss: 0.008280070498585701 2023-01-22 16:19:34.224251: step: 500/464, loss: 0.008318918757140636 2023-01-22 16:19:34.968477: step: 502/464, loss: 0.009130694903433323 2023-01-22 16:19:35.673127: step: 504/464, loss: 6.741621473338455e-05 2023-01-22 16:19:36.489388: step: 506/464, loss: 0.0035972786135971546 2023-01-22 16:19:37.195641: step: 508/464, loss: 0.022480538114905357 2023-01-22 16:19:37.925589: step: 510/464, loss: 0.07424750179052353 2023-01-22 16:19:38.706891: step: 512/464, loss: 0.15917329490184784 2023-01-22 16:19:39.422477: step: 514/464, loss: 0.0038534412160515785 2023-01-22 16:19:40.154789: step: 516/464, loss: 0.013208887539803982 2023-01-22 16:19:40.801965: step: 518/464, loss: 0.038714699447155 2023-01-22 16:19:41.521257: step: 520/464, loss: 0.0006091871182434261 2023-01-22 16:19:42.287590: step: 522/464, loss: 0.01111649814993143 2023-01-22 16:19:43.001874: step: 524/464, loss: 
0.0007218350074253976 2023-01-22 16:19:43.722259: step: 526/464, loss: 0.031764477491378784 2023-01-22 16:19:44.396789: step: 528/464, loss: 0.0005717077874578536 2023-01-22 16:19:45.183601: step: 530/464, loss: 0.048164039850234985 2023-01-22 16:19:45.814759: step: 532/464, loss: 0.001963170012459159 2023-01-22 16:19:46.532884: step: 534/464, loss: 0.008921549655497074 2023-01-22 16:19:47.223518: step: 536/464, loss: 0.00019145449914503843 2023-01-22 16:19:48.043161: step: 538/464, loss: 0.01980605162680149 2023-01-22 16:19:48.762321: step: 540/464, loss: 0.000405008060624823 2023-01-22 16:19:49.492062: step: 542/464, loss: 0.003804337466135621 2023-01-22 16:19:50.163760: step: 544/464, loss: 0.00018756087229121476 2023-01-22 16:19:50.918820: step: 546/464, loss: 0.012008019722998142 2023-01-22 16:19:51.648725: step: 548/464, loss: 0.0032865030225366354 2023-01-22 16:19:52.326671: step: 550/464, loss: 0.05710751563310623 2023-01-22 16:19:53.093926: step: 552/464, loss: 0.04066885635256767 2023-01-22 16:19:53.991522: step: 554/464, loss: 0.0033198478631675243 2023-01-22 16:19:54.685686: step: 556/464, loss: 0.00019103632075712085 2023-01-22 16:19:55.435938: step: 558/464, loss: 0.007582390680909157 2023-01-22 16:19:56.166511: step: 560/464, loss: 0.013868369162082672 2023-01-22 16:19:56.927335: step: 562/464, loss: 0.09755430370569229 2023-01-22 16:19:57.751283: step: 564/464, loss: 0.017535768449306488 2023-01-22 16:19:58.465877: step: 566/464, loss: 0.0010590796591714025 2023-01-22 16:19:59.202455: step: 568/464, loss: 0.032498300075531006 2023-01-22 16:19:59.989919: step: 570/464, loss: 0.055956095457077026 2023-01-22 16:20:00.822973: step: 572/464, loss: 0.010634099133312702 2023-01-22 16:20:01.600185: step: 574/464, loss: 0.015558160841464996 2023-01-22 16:20:02.306875: step: 576/464, loss: 0.010191281326115131 2023-01-22 16:20:03.052865: step: 578/464, loss: 0.0025147004052996635 2023-01-22 16:20:03.744395: step: 580/464, loss: 0.00410130200907588 2023-01-22 
16:20:04.525855: step: 582/464, loss: 0.09718958288431168 2023-01-22 16:20:05.232203: step: 584/464, loss: 0.04974781349301338 2023-01-22 16:20:06.046632: step: 586/464, loss: 0.018057700246572495 2023-01-22 16:20:06.765290: step: 588/464, loss: 0.7479690313339233 2023-01-22 16:20:07.505674: step: 590/464, loss: 0.01689426228404045 2023-01-22 16:20:08.214541: step: 592/464, loss: 0.00427657924592495 2023-01-22 16:20:09.032615: step: 594/464, loss: 0.0009600772173143923 2023-01-22 16:20:09.715637: step: 596/464, loss: 0.005607732106000185 2023-01-22 16:20:10.353676: step: 598/464, loss: 0.0006944190827198327 2023-01-22 16:20:11.049384: step: 600/464, loss: 0.022029733285307884 2023-01-22 16:20:11.886989: step: 602/464, loss: 0.07722200453281403 2023-01-22 16:20:12.645908: step: 604/464, loss: 0.011648462153971195 2023-01-22 16:20:13.375048: step: 606/464, loss: 0.051798105239868164 2023-01-22 16:20:14.105129: step: 608/464, loss: 0.10283471643924713 2023-01-22 16:20:14.872159: step: 610/464, loss: 0.03725217655301094 2023-01-22 16:20:15.630657: step: 612/464, loss: 0.017023751512169838 2023-01-22 16:20:16.338000: step: 614/464, loss: 0.00312893302179873 2023-01-22 16:20:17.050757: step: 616/464, loss: 0.22248615324497223 2023-01-22 16:20:17.959779: step: 618/464, loss: 0.0312360692769289 2023-01-22 16:20:18.655707: step: 620/464, loss: 0.0007553557516075671 2023-01-22 16:20:19.395279: step: 622/464, loss: 0.0033220225013792515 2023-01-22 16:20:20.079444: step: 624/464, loss: 0.021085388958454132 2023-01-22 16:20:20.779410: step: 626/464, loss: 0.009414348751306534 2023-01-22 16:20:21.478988: step: 628/464, loss: 0.0015694421017542481 2023-01-22 16:20:22.198830: step: 630/464, loss: 0.0058471160009503365 2023-01-22 16:20:22.845213: step: 632/464, loss: 0.00471019372344017 2023-01-22 16:20:23.511380: step: 634/464, loss: 0.0036878592800348997 2023-01-22 16:20:24.191776: step: 636/464, loss: 0.0015276795020326972 2023-01-22 16:20:24.873598: step: 638/464, loss: 
0.004554287530481815 2023-01-22 16:20:25.574749: step: 640/464, loss: 0.1497090756893158 2023-01-22 16:20:26.294123: step: 642/464, loss: 0.02090488001704216 2023-01-22 16:20:27.037832: step: 644/464, loss: 0.0030281986109912395 2023-01-22 16:20:27.727338: step: 646/464, loss: 0.04247652739286423 2023-01-22 16:20:28.474944: step: 648/464, loss: 0.006784561090171337 2023-01-22 16:20:29.194582: step: 650/464, loss: 0.00543493265286088 2023-01-22 16:20:29.839030: step: 652/464, loss: 0.004617628175765276 2023-01-22 16:20:30.559721: step: 654/464, loss: 0.005777742248028517 2023-01-22 16:20:31.287370: step: 656/464, loss: 0.02436165325343609 2023-01-22 16:20:32.010739: step: 658/464, loss: 0.005392088089138269 2023-01-22 16:20:32.728003: step: 660/464, loss: 1.3804744867229601e-06 2023-01-22 16:20:33.433896: step: 662/464, loss: 8.009708108147606e-05 2023-01-22 16:20:34.126674: step: 664/464, loss: 0.008129194378852844 2023-01-22 16:20:34.904891: step: 666/464, loss: 0.028133362531661987 2023-01-22 16:20:35.593949: step: 668/464, loss: 0.001996027771383524 2023-01-22 16:20:36.318898: step: 670/464, loss: 0.0009242978994734585 2023-01-22 16:20:37.057759: step: 672/464, loss: 0.09116656333208084 2023-01-22 16:20:37.911790: step: 674/464, loss: 0.008167148567736149 2023-01-22 16:20:38.621179: step: 676/464, loss: 0.00443984242156148 2023-01-22 16:20:39.406607: step: 678/464, loss: 0.0008679300080984831 2023-01-22 16:20:40.078585: step: 680/464, loss: 0.00025120421196334064 2023-01-22 16:20:40.799214: step: 682/464, loss: 0.00019450573017820716 2023-01-22 16:20:41.580454: step: 684/464, loss: 0.028216423466801643 2023-01-22 16:20:42.414118: step: 686/464, loss: 0.013654750771820545 2023-01-22 16:20:43.173641: step: 688/464, loss: 0.03024298883974552 2023-01-22 16:20:43.965069: step: 690/464, loss: 0.8305896520614624 2023-01-22 16:20:44.657125: step: 692/464, loss: 0.012249198742210865 2023-01-22 16:20:45.434676: step: 694/464, loss: 0.0007778406143188477 2023-01-22 
16:20:46.198567: step: 696/464, loss: 0.00023288335069082677 2023-01-22 16:20:46.991209: step: 698/464, loss: 0.01683759316802025 2023-01-22 16:20:47.713917: step: 700/464, loss: 0.0005743993096984923 2023-01-22 16:20:48.367710: step: 702/464, loss: 0.0045036799274384975 2023-01-22 16:20:49.115114: step: 704/464, loss: 0.0013135545887053013 2023-01-22 16:20:49.806559: step: 706/464, loss: 0.0026535000652074814 2023-01-22 16:20:50.506904: step: 708/464, loss: 0.0027274100575596094 2023-01-22 16:20:51.255216: step: 710/464, loss: 0.002692986512556672 2023-01-22 16:20:52.000206: step: 712/464, loss: 0.0008930904441513121 2023-01-22 16:20:52.738583: step: 714/464, loss: 0.015236853621900082 2023-01-22 16:20:53.483850: step: 716/464, loss: 0.02877141535282135 2023-01-22 16:20:54.284246: step: 718/464, loss: 0.004633526783436537 2023-01-22 16:20:55.065548: step: 720/464, loss: 0.14433535933494568 2023-01-22 16:20:55.765705: step: 722/464, loss: 0.0099337138235569 2023-01-22 16:20:56.461170: step: 724/464, loss: 0.03656081482768059 2023-01-22 16:20:57.131467: step: 726/464, loss: 0.0005096670356579125 2023-01-22 16:20:57.888456: step: 728/464, loss: 0.0008024107082746923 2023-01-22 16:20:58.728220: step: 730/464, loss: 0.0012601092457771301 2023-01-22 16:20:59.516817: step: 732/464, loss: 0.019106421619653702 2023-01-22 16:21:00.154229: step: 734/464, loss: 0.007535758428275585 2023-01-22 16:21:00.851052: step: 736/464, loss: 0.002676013857126236 2023-01-22 16:21:01.486913: step: 738/464, loss: 2.7624613721854985e-05 2023-01-22 16:21:02.265111: step: 740/464, loss: 0.0021485830657184124 2023-01-22 16:21:02.941040: step: 742/464, loss: 0.007855184376239777 2023-01-22 16:21:03.693943: step: 744/464, loss: 0.0017511058831587434 2023-01-22 16:21:04.412035: step: 746/464, loss: 0.0011880536330863833 2023-01-22 16:21:05.102700: step: 748/464, loss: 0.01896476000547409 2023-01-22 16:21:05.794165: step: 750/464, loss: 0.003101673675701022 2023-01-22 16:21:06.594371: step: 
752/464, loss: 0.12919455766677856 2023-01-22 16:21:07.321449: step: 754/464, loss: 0.030971094965934753 2023-01-22 16:21:08.069612: step: 756/464, loss: 0.009586269967257977 2023-01-22 16:21:08.799115: step: 758/464, loss: 0.011086368933320045 2023-01-22 16:21:09.508056: step: 760/464, loss: 0.015494439750909805 2023-01-22 16:21:10.211630: step: 762/464, loss: 0.004420689307153225 2023-01-22 16:21:11.108941: step: 764/464, loss: 0.02614568918943405 2023-01-22 16:21:12.032063: step: 766/464, loss: 0.0012165152002125978 2023-01-22 16:21:12.727517: step: 768/464, loss: 0.0009944859193637967 2023-01-22 16:21:13.454256: step: 770/464, loss: 0.008068513125181198 2023-01-22 16:21:14.148902: step: 772/464, loss: 0.0010496607283130288 2023-01-22 16:21:14.887733: step: 774/464, loss: 0.37772834300994873 2023-01-22 16:21:15.582173: step: 776/464, loss: 0.012436087243258953 2023-01-22 16:21:16.398603: step: 778/464, loss: 0.0002699866017792374 2023-01-22 16:21:17.050542: step: 780/464, loss: 3.1445986678591e-05 2023-01-22 16:21:17.874462: step: 782/464, loss: 0.006023346912115812 2023-01-22 16:21:18.569448: step: 784/464, loss: 0.006444807164371014 2023-01-22 16:21:19.349622: step: 786/464, loss: 0.04231029376387596 2023-01-22 16:21:20.094393: step: 788/464, loss: 0.0011864429106935859 2023-01-22 16:21:20.802937: step: 790/464, loss: 0.05042388290166855 2023-01-22 16:21:21.629470: step: 792/464, loss: 0.019405636936426163 2023-01-22 16:21:22.379020: step: 794/464, loss: 0.03940063714981079 2023-01-22 16:21:23.159215: step: 796/464, loss: 0.021257543936371803 2023-01-22 16:21:23.871769: step: 798/464, loss: 0.10211928188800812 2023-01-22 16:21:24.633705: step: 800/464, loss: 0.09499234706163406 2023-01-22 16:21:25.400617: step: 802/464, loss: 0.000208395067602396 2023-01-22 16:21:26.097551: step: 804/464, loss: 0.002811941783875227 2023-01-22 16:21:26.826640: step: 806/464, loss: 0.0010172117035835981 2023-01-22 16:21:27.541251: step: 808/464, loss: 0.02037464641034603 
2023-01-22 16:21:28.288379: step: 810/464, loss: 0.016519617289304733 2023-01-22 16:21:28.943892: step: 812/464, loss: 0.004088247194886208 2023-01-22 16:21:29.728087: step: 814/464, loss: 0.0362807996571064 2023-01-22 16:21:30.498041: step: 816/464, loss: 0.022565070539712906 2023-01-22 16:21:31.218064: step: 818/464, loss: 0.0001856112648965791 2023-01-22 16:21:31.923151: step: 820/464, loss: 0.0030008764006197453 2023-01-22 16:21:32.619585: step: 822/464, loss: 0.00351099600084126 2023-01-22 16:21:33.333922: step: 824/464, loss: 0.19104412198066711 2023-01-22 16:21:34.074188: step: 826/464, loss: 0.026989376172423363 2023-01-22 16:21:34.829442: step: 828/464, loss: 0.003458687337115407 2023-01-22 16:21:35.571713: step: 830/464, loss: 0.026364948600530624 2023-01-22 16:21:36.236901: step: 832/464, loss: 0.007490881253033876 2023-01-22 16:21:36.953724: step: 834/464, loss: 0.0003644174139481038 2023-01-22 16:21:37.621168: step: 836/464, loss: 0.012916052713990211 2023-01-22 16:21:38.305736: step: 838/464, loss: 0.045503780245780945 2023-01-22 16:21:39.050935: step: 840/464, loss: 0.012989304959774017 2023-01-22 16:21:39.773439: step: 842/464, loss: 0.021143782883882523 2023-01-22 16:21:40.541932: step: 844/464, loss: 0.0042087603360414505 2023-01-22 16:21:41.275353: step: 846/464, loss: 0.0018162406049668789 2023-01-22 16:21:42.114315: step: 848/464, loss: 0.0046364981681108475 2023-01-22 16:21:42.883610: step: 850/464, loss: 0.005943454336374998 2023-01-22 16:21:43.676647: step: 852/464, loss: 0.030280398204922676 2023-01-22 16:21:44.495111: step: 854/464, loss: 0.05643213912844658 2023-01-22 16:21:45.276367: step: 856/464, loss: 0.00012839706323575228 2023-01-22 16:21:45.966600: step: 858/464, loss: 7.575419294880703e-05 2023-01-22 16:21:46.728172: step: 860/464, loss: 0.3615998923778534 2023-01-22 16:21:47.448413: step: 862/464, loss: 0.006690404377877712 2023-01-22 16:21:48.203473: step: 864/464, loss: 1.8872606754302979 2023-01-22 16:21:48.935032: step: 
866/464, loss: 0.00042271538404747844 2023-01-22 16:21:49.700439: step: 868/464, loss: 0.006247181911021471 2023-01-22 16:21:50.451770: step: 870/464, loss: 0.06341104209423065 2023-01-22 16:21:51.178426: step: 872/464, loss: 0.2562231719493866 2023-01-22 16:21:51.860700: step: 874/464, loss: 0.002793203806504607 2023-01-22 16:21:52.575972: step: 876/464, loss: 0.008603735826909542 2023-01-22 16:21:53.348923: step: 878/464, loss: 1.310736894607544 2023-01-22 16:21:54.120745: step: 880/464, loss: 0.007669101003557444 2023-01-22 16:21:54.871894: step: 882/464, loss: 0.007660615257918835 2023-01-22 16:21:55.619727: step: 884/464, loss: 0.0009276815690100193 2023-01-22 16:21:56.403966: step: 886/464, loss: 0.006773849483579397 2023-01-22 16:21:57.141262: step: 888/464, loss: 0.009978453628718853 2023-01-22 16:21:57.876870: step: 890/464, loss: 0.01657634600996971 2023-01-22 16:21:58.685410: step: 892/464, loss: 0.011022034101188183 2023-01-22 16:21:59.419734: step: 894/464, loss: 0.00021206651581451297 2023-01-22 16:22:00.289692: step: 896/464, loss: 0.023269668221473694 2023-01-22 16:22:00.939609: step: 898/464, loss: 0.013293704949319363 2023-01-22 16:22:01.658791: step: 900/464, loss: 0.0005787741974927485 2023-01-22 16:22:02.363116: step: 902/464, loss: 0.056386083364486694 2023-01-22 16:22:03.116027: step: 904/464, loss: 0.007417832501232624 2023-01-22 16:22:03.829619: step: 906/464, loss: 0.00048767743282951415 2023-01-22 16:22:04.619452: step: 908/464, loss: 0.16414684057235718 2023-01-22 16:22:05.337330: step: 910/464, loss: 0.0373823456466198 2023-01-22 16:22:06.056844: step: 912/464, loss: 0.030118806287646294 2023-01-22 16:22:06.891857: step: 914/464, loss: 0.012832955457270145 2023-01-22 16:22:07.639121: step: 916/464, loss: 0.0005492176860570908 2023-01-22 16:22:08.361297: step: 918/464, loss: 0.00959594827145338 2023-01-22 16:22:09.101733: step: 920/464, loss: 0.00033759669167920947 2023-01-22 16:22:09.882175: step: 922/464, loss: 0.03864290192723274 
2023-01-22 16:22:10.548185: step: 924/464, loss: 0.0018645358504727483
2023-01-22 16:22:11.363036: step: 926/464, loss: 0.006714235059916973
2023-01-22 16:22:12.254689: step: 928/464, loss: 0.008423232473433018
2023-01-22 16:22:12.901056: step: 930/464, loss: 0.02039126679301262
==================================================
Loss: 0.040
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3110851371614686, 'r': 0.3577183930167931, 'f1': 0.3327759807940864}, 'combined': 0.2452033542693268, 'epoch': 35}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30443874328914766, 'r': 0.2857873578307216, 'f1': 0.29481835486716645}, 'combined': 0.1830977151280297, 'epoch': 35}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2891990112160566, 'r': 0.3380390719337588, 'f1': 0.3117175693947347}, 'combined': 0.22968663008033083, 'epoch': 35}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.294683708744612, 'r': 0.2807066158397292, 'f1': 0.28752540003016797}, 'combined': 0.1785684063345254, 'epoch': 35}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30693010369764434, 'r': 0.35235808868515145, 'f1': 0.32807899776868343}, 'combined': 0.24174241940850358, 'epoch': 35}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31562933379822433, 'r': 0.2900798714374599, 'f1': 0.3023157507882169}, 'combined': 0.18775399259478734, 'epoch': 35}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24107142857142858, 'r': 0.38571428571428573, 'f1': 0.2967032967032967}, 'combined': 0.1978021978021978, 'epoch': 35}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2558139534883721, 'r': 0.4782608695652174, 'f1': 0.33333333333333337}, 'combined': 0.16666666666666669, 'epoch': 35}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3489583333333333, 'r': 0.28879310344827586, 'f1': 0.3160377358490566}, 'combined': 0.21069182389937105, 'epoch': 35}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 36
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:24:51.787284: step: 2/464, loss: 0.25695911049842834
2023-01-22 16:24:52.590893: step: 4/464, loss: 0.007319469936192036
2023-01-22 16:24:53.334066: step: 6/464, loss: 0.002419233787804842
2023-01-22 16:24:54.063619: step: 8/464, loss: 0.005410829558968544
2023-01-22 16:24:54.825400: step: 10/464, loss: 0.01767798513174057
2023-01-22 16:24:55.593631: step: 12/464, loss: 0.0055710808373987675
2023-01-22 16:24:56.326415: step: 14/464, loss: 0.0014307210221886635
2023-01-22 16:24:56.989982: step: 16/464, loss: 0.01295340247452259
2023-01-22 16:24:57.654072: step: 18/464, loss: 0.003472167532891035
2023-01-22 16:24:58.359415: step: 20/464, loss: 0.012799971736967564
2023-01-22 16:24:59.056990: step: 22/464, loss: 0.0007676775567233562
2023-01-22 16:24:59.778845: step: 24/464, loss: 0.000440617382992059
2023-01-22 16:25:00.600780: step: 26/464, loss: 0.0155528225004673
2023-01-22 16:25:01.343815: step: 28/464, loss: 0.010228335857391357
2023-01-22 16:25:02.104708: step: 30/464, loss: 0.00015586575318593532
2023-01-22 16:25:02.764261: step: 32/464, loss: 0.0031636471394449472
2023-01-22 16:25:03.498979: step: 34/464, loss: 0.015593956224620342
2023-01-22 16:25:04.220750: step: 36/464, loss: 0.027952920645475388
2023-01-22 16:25:04.968763: step: 38/464, loss: 0.032278873026371
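The evaluation dicts above report precision/recall/F1 for the template and slot levels plus a 'combined' score. From the logged numbers, 'combined' appears to be the product of template F1 and slot F1 (e.g. for Dev Chinese at epoch 35: 0.7368421 × 0.3327760 ≈ 0.2452034). A minimal sketch of that relationship, assuming the product rule holds; the helper names are mine, not taken from train.py:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    # Assumption: 'combined' = template F1 * slot F1, inferred from the logged values.
    return template_f1 * slot_f1
```

For instance, `f1(1.0, 0.5833333333333334)` reproduces the logged template F1 of 0.7368421052631579, and `combined_score(0.7368421052631579, 0.3327759807940864)` reproduces the logged combined value 0.2452033542693268.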
2023-01-22 16:25:05.711194: step: 40/464, loss: 0.004284294322133064 2023-01-22 16:25:06.450209: step: 42/464, loss: 0.009568164125084877 2023-01-22 16:25:07.219072: step: 44/464, loss: 0.023043537512421608 2023-01-22 16:25:07.990829: step: 46/464, loss: 0.0008782692020758986 2023-01-22 16:25:08.772680: step: 48/464, loss: 0.001116393250413239 2023-01-22 16:25:09.520059: step: 50/464, loss: 0.00017436177586205304 2023-01-22 16:25:10.267725: step: 52/464, loss: 0.013959270901978016 2023-01-22 16:25:11.010858: step: 54/464, loss: 0.00013729554484598339 2023-01-22 16:25:11.750711: step: 56/464, loss: 0.03546988219022751 2023-01-22 16:25:12.499131: step: 58/464, loss: 0.02767900936305523 2023-01-22 16:25:13.252022: step: 60/464, loss: 0.000629195652436465 2023-01-22 16:25:14.011150: step: 62/464, loss: 0.014760047197341919 2023-01-22 16:25:14.793419: step: 64/464, loss: 0.01793184131383896 2023-01-22 16:25:15.543363: step: 66/464, loss: 0.038921162486076355 2023-01-22 16:25:16.426458: step: 68/464, loss: 0.0007313909009099007 2023-01-22 16:25:17.211591: step: 70/464, loss: 0.00042105818283744156 2023-01-22 16:25:17.938025: step: 72/464, loss: 0.0014784769155085087 2023-01-22 16:25:18.666871: step: 74/464, loss: 0.10685709118843079 2023-01-22 16:25:19.400650: step: 76/464, loss: 0.004993189591914415 2023-01-22 16:25:20.220379: step: 78/464, loss: 0.005299483425915241 2023-01-22 16:25:21.048474: step: 80/464, loss: 1.9907414753106423e-05 2023-01-22 16:25:21.936538: step: 82/464, loss: 0.04119177907705307 2023-01-22 16:25:22.615655: step: 84/464, loss: 0.02733980491757393 2023-01-22 16:25:23.368258: step: 86/464, loss: 0.0033457595854997635 2023-01-22 16:25:24.229689: step: 88/464, loss: 0.0589422769844532 2023-01-22 16:25:25.008955: step: 90/464, loss: 0.006607845425605774 2023-01-22 16:25:25.748175: step: 92/464, loss: 0.004729326348751783 2023-01-22 16:25:26.527228: step: 94/464, loss: 0.22634100914001465 2023-01-22 16:25:27.251355: step: 96/464, loss: 
0.005008341744542122 2023-01-22 16:25:27.980861: step: 98/464, loss: 7.832510164007545e-05 2023-01-22 16:25:28.753439: step: 100/464, loss: 0.01266252901405096 2023-01-22 16:25:29.475552: step: 102/464, loss: 0.0002192519896198064 2023-01-22 16:25:30.158265: step: 104/464, loss: 0.02024812437593937 2023-01-22 16:25:30.930435: step: 106/464, loss: 0.010026028379797935 2023-01-22 16:25:31.697130: step: 108/464, loss: 0.005464911926537752 2023-01-22 16:25:32.392536: step: 110/464, loss: 0.00029571220511570573 2023-01-22 16:25:33.167155: step: 112/464, loss: 0.0034745652228593826 2023-01-22 16:25:33.842550: step: 114/464, loss: 0.0015284789260476828 2023-01-22 16:25:34.648810: step: 116/464, loss: 0.014450756832957268 2023-01-22 16:25:35.395907: step: 118/464, loss: 0.06091814860701561 2023-01-22 16:25:36.150949: step: 120/464, loss: 0.004064726177603006 2023-01-22 16:25:36.832128: step: 122/464, loss: 0.002943573519587517 2023-01-22 16:25:37.568001: step: 124/464, loss: 0.0004433517169672996 2023-01-22 16:25:38.298685: step: 126/464, loss: 0.04658803343772888 2023-01-22 16:25:39.011556: step: 128/464, loss: 0.0030392904300242662 2023-01-22 16:25:39.697562: step: 130/464, loss: 0.001116839237511158 2023-01-22 16:25:40.368363: step: 132/464, loss: 0.0006100613973103464 2023-01-22 16:25:41.041924: step: 134/464, loss: 3.7895260902587324e-06 2023-01-22 16:25:41.733550: step: 136/464, loss: 0.012914041988551617 2023-01-22 16:25:42.475478: step: 138/464, loss: 0.0003613363951444626 2023-01-22 16:25:43.172241: step: 140/464, loss: 0.0018395596416667104 2023-01-22 16:25:43.866031: step: 142/464, loss: 0.02837635949254036 2023-01-22 16:25:44.565406: step: 144/464, loss: 0.016525769606232643 2023-01-22 16:25:45.258977: step: 146/464, loss: 0.004591218661516905 2023-01-22 16:25:46.031325: step: 148/464, loss: 0.29191404581069946 2023-01-22 16:25:46.920605: step: 150/464, loss: 0.006326992064714432 2023-01-22 16:25:47.681051: step: 152/464, loss: 0.023135047405958176 2023-01-22 
16:25:48.421407: step: 154/464, loss: 0.0006658299244008958 2023-01-22 16:25:49.119429: step: 156/464, loss: 0.0024545418564230204 2023-01-22 16:25:49.908865: step: 158/464, loss: 0.0008211713866330683 2023-01-22 16:25:50.683349: step: 160/464, loss: 0.000625859247520566 2023-01-22 16:25:51.438150: step: 162/464, loss: 0.009707236662507057 2023-01-22 16:25:52.181688: step: 164/464, loss: 0.0004584605630952865 2023-01-22 16:25:52.964095: step: 166/464, loss: 0.09940136224031448 2023-01-22 16:25:53.726667: step: 168/464, loss: 0.06070875748991966 2023-01-22 16:25:54.458807: step: 170/464, loss: 0.005534209311008453 2023-01-22 16:25:55.263667: step: 172/464, loss: 0.011373227462172508 2023-01-22 16:25:56.013034: step: 174/464, loss: 0.03093552775681019 2023-01-22 16:25:56.804616: step: 176/464, loss: 0.026455167680978775 2023-01-22 16:25:57.591527: step: 178/464, loss: 0.016621779650449753 2023-01-22 16:25:58.310870: step: 180/464, loss: 0.01870632730424404 2023-01-22 16:25:59.068475: step: 182/464, loss: 0.00010793719411594793 2023-01-22 16:25:59.858226: step: 184/464, loss: 0.0004974519833922386 2023-01-22 16:26:00.611267: step: 186/464, loss: 0.028530167415738106 2023-01-22 16:26:01.292995: step: 188/464, loss: 0.01475331000983715 2023-01-22 16:26:02.072617: step: 190/464, loss: 0.026886673644185066 2023-01-22 16:26:02.778983: step: 192/464, loss: 0.006785533856600523 2023-01-22 16:26:03.558806: step: 194/464, loss: 0.04471007362008095 2023-01-22 16:26:04.324434: step: 196/464, loss: 0.026023143902420998 2023-01-22 16:26:05.000102: step: 198/464, loss: 0.009699026122689247 2023-01-22 16:26:05.692642: step: 200/464, loss: 0.011059535667300224 2023-01-22 16:26:06.445175: step: 202/464, loss: 0.002562645822763443 2023-01-22 16:26:07.176097: step: 204/464, loss: 0.0005993549129925668 2023-01-22 16:26:07.869804: step: 206/464, loss: 0.0010731748770922422 2023-01-22 16:26:08.582285: step: 208/464, loss: 0.008336545899510384 2023-01-22 16:26:09.350630: step: 210/464, 
loss: 0.04709343984723091 2023-01-22 16:26:10.223600: step: 212/464, loss: 0.00975609477609396 2023-01-22 16:26:10.942973: step: 214/464, loss: 0.020269447937607765 2023-01-22 16:26:11.725466: step: 216/464, loss: 0.05424981191754341 2023-01-22 16:26:12.490244: step: 218/464, loss: 0.11086882650852203 2023-01-22 16:26:13.223275: step: 220/464, loss: 0.16429883241653442 2023-01-22 16:26:13.918866: step: 222/464, loss: 0.0005940513801760972 2023-01-22 16:26:14.608900: step: 224/464, loss: 0.008422630839049816 2023-01-22 16:26:15.340762: step: 226/464, loss: 0.0005509871407411993 2023-01-22 16:26:16.023843: step: 228/464, loss: 0.009248113259673119 2023-01-22 16:26:16.754955: step: 230/464, loss: 0.06530479341745377 2023-01-22 16:26:17.444366: step: 232/464, loss: 0.014671806246042252 2023-01-22 16:26:18.178471: step: 234/464, loss: 0.006998498924076557 2023-01-22 16:26:18.912191: step: 236/464, loss: 0.0015183803625404835 2023-01-22 16:26:19.605011: step: 238/464, loss: 0.004076081793755293 2023-01-22 16:26:20.318348: step: 240/464, loss: 0.08578099310398102 2023-01-22 16:26:21.098996: step: 242/464, loss: 0.0002759688359219581 2023-01-22 16:26:21.873152: step: 244/464, loss: 0.06304151564836502 2023-01-22 16:26:22.561055: step: 246/464, loss: 0.0003053829132113606 2023-01-22 16:26:23.247435: step: 248/464, loss: 0.013026222586631775 2023-01-22 16:26:23.973991: step: 250/464, loss: 0.04093734547495842 2023-01-22 16:26:24.714816: step: 252/464, loss: 0.0015117143047973514 2023-01-22 16:26:25.455923: step: 254/464, loss: 0.006387445144355297 2023-01-22 16:26:26.168183: step: 256/464, loss: 0.00026626704493537545 2023-01-22 16:26:26.898435: step: 258/464, loss: 0.005855533294379711 2023-01-22 16:26:27.565437: step: 260/464, loss: 0.03732157498598099 2023-01-22 16:26:28.377275: step: 262/464, loss: 0.000551620963960886 2023-01-22 16:26:29.209194: step: 264/464, loss: 0.017255334183573723 2023-01-22 16:26:29.868640: step: 266/464, loss: 0.01849391497671604 2023-01-22 
16:26:30.588929: step: 268/464, loss: 0.0038914477918297052 2023-01-22 16:26:31.353150: step: 270/464, loss: 0.018955865874886513 2023-01-22 16:26:32.155613: step: 272/464, loss: 0.01634865067899227 2023-01-22 16:26:32.764975: step: 274/464, loss: 0.0004928586422465742 2023-01-22 16:26:33.598209: step: 276/464, loss: 0.00017035173368640244 2023-01-22 16:26:34.307022: step: 278/464, loss: 0.02748502977192402 2023-01-22 16:26:35.036167: step: 280/464, loss: 0.004082882311195135 2023-01-22 16:26:35.755215: step: 282/464, loss: 0.0005174506222829223 2023-01-22 16:26:36.486208: step: 284/464, loss: 0.006498162169009447 2023-01-22 16:26:37.191619: step: 286/464, loss: 0.003868537489324808 2023-01-22 16:26:38.026479: step: 288/464, loss: 0.25979575514793396 2023-01-22 16:26:38.720746: step: 290/464, loss: 9.816534293349832e-05 2023-01-22 16:26:39.503138: step: 292/464, loss: 0.0020555928349494934 2023-01-22 16:26:40.310641: step: 294/464, loss: 0.04013385251164436 2023-01-22 16:26:41.040042: step: 296/464, loss: 0.12664145231246948 2023-01-22 16:26:41.742801: step: 298/464, loss: 0.0008692051051184535 2023-01-22 16:26:42.457562: step: 300/464, loss: 0.11506323516368866 2023-01-22 16:26:43.245967: step: 302/464, loss: 0.052584897726774216 2023-01-22 16:26:43.927339: step: 304/464, loss: 0.036980610340833664 2023-01-22 16:26:44.681277: step: 306/464, loss: 0.00859944149851799 2023-01-22 16:26:45.533793: step: 308/464, loss: 0.013367211446166039 2023-01-22 16:26:46.318147: step: 310/464, loss: 0.0016901890048757195 2023-01-22 16:26:47.127405: step: 312/464, loss: 0.004952017683535814 2023-01-22 16:26:47.945183: step: 314/464, loss: 0.005257464945316315 2023-01-22 16:26:48.674339: step: 316/464, loss: 0.06977822631597519 2023-01-22 16:26:49.454967: step: 318/464, loss: 0.03525172919034958 2023-01-22 16:26:50.176550: step: 320/464, loss: 0.0061018988490104675 2023-01-22 16:26:50.876182: step: 322/464, loss: 0.0011195632396265864 2023-01-22 16:26:51.593320: step: 324/464, loss: 
0.001332466141320765 2023-01-22 16:26:52.341650: step: 326/464, loss: 0.1197722926735878 2023-01-22 16:26:53.076793: step: 328/464, loss: 0.01823013462126255 2023-01-22 16:26:53.773570: step: 330/464, loss: 0.00781966932117939 2023-01-22 16:26:54.478276: step: 332/464, loss: 0.01433896366506815 2023-01-22 16:26:55.159817: step: 334/464, loss: 0.00025075027951970696 2023-01-22 16:26:55.907702: step: 336/464, loss: 0.01363467425107956 2023-01-22 16:26:56.651198: step: 338/464, loss: 0.08750608563423157 2023-01-22 16:26:57.399614: step: 340/464, loss: 0.009655867703258991 2023-01-22 16:26:58.066705: step: 342/464, loss: 0.0014510346809402108 2023-01-22 16:26:58.820749: step: 344/464, loss: 0.04058638960123062 2023-01-22 16:26:59.537745: step: 346/464, loss: 2.510811827960424e-05 2023-01-22 16:27:00.314296: step: 348/464, loss: 0.00536707928404212 2023-01-22 16:27:01.145502: step: 350/464, loss: 0.006009410135447979 2023-01-22 16:27:01.928753: step: 352/464, loss: 0.0005397546919994056 2023-01-22 16:27:02.760034: step: 354/464, loss: 0.026657378301024437 2023-01-22 16:27:03.508966: step: 356/464, loss: 0.022915314882993698 2023-01-22 16:27:04.364276: step: 358/464, loss: 0.008625758811831474 2023-01-22 16:27:05.120375: step: 360/464, loss: 0.05162311717867851 2023-01-22 16:27:05.908435: step: 362/464, loss: 0.010607045143842697 2023-01-22 16:27:06.671053: step: 364/464, loss: 0.0035178023390471935 2023-01-22 16:27:07.416949: step: 366/464, loss: 0.0001674180821282789 2023-01-22 16:27:08.183298: step: 368/464, loss: 0.01395686436444521 2023-01-22 16:27:08.984762: step: 370/464, loss: 0.134748175740242 2023-01-22 16:27:09.706752: step: 372/464, loss: 0.010596811771392822 2023-01-22 16:27:10.397343: step: 374/464, loss: 0.006897300481796265 2023-01-22 16:27:11.166007: step: 376/464, loss: 0.010236544534564018 2023-01-22 16:27:11.874058: step: 378/464, loss: 0.014542913995683193 2023-01-22 16:27:12.585343: step: 380/464, loss: 0.029503915458917618 2023-01-22 
16:27:13.407482: step: 382/464, loss: 0.0007390844402834773 2023-01-22 16:27:14.156421: step: 384/464, loss: 0.055452119559049606 2023-01-22 16:27:14.885870: step: 386/464, loss: 0.04913180321455002 2023-01-22 16:27:15.601917: step: 388/464, loss: 0.009322376921772957 2023-01-22 16:27:16.315977: step: 390/464, loss: 0.040665339678525925 2023-01-22 16:27:17.045254: step: 392/464, loss: 0.0012931600213050842 2023-01-22 16:27:17.759837: step: 394/464, loss: 0.012746023014187813 2023-01-22 16:27:18.596769: step: 396/464, loss: 0.022761639207601547 2023-01-22 16:27:19.283154: step: 398/464, loss: 0.002250086283311248 2023-01-22 16:27:19.984866: step: 400/464, loss: 0.10658414661884308 2023-01-22 16:27:20.619255: step: 402/464, loss: 1.0466003004694358e-05 2023-01-22 16:27:21.310993: step: 404/464, loss: 0.007765035144984722 2023-01-22 16:27:22.058224: step: 406/464, loss: 0.023086359724402428 2023-01-22 16:27:22.853585: step: 408/464, loss: 0.0032354777213186026 2023-01-22 16:27:23.568981: step: 410/464, loss: 8.746585808694363e-05 2023-01-22 16:27:24.346886: step: 412/464, loss: 0.011903539299964905 2023-01-22 16:27:25.031747: step: 414/464, loss: 0.00037599849747493863 2023-01-22 16:27:25.782968: step: 416/464, loss: 0.009331165812909603 2023-01-22 16:27:26.512718: step: 418/464, loss: 0.009211680851876736 2023-01-22 16:27:27.233581: step: 420/464, loss: 0.0010665275622159243 2023-01-22 16:27:28.063545: step: 422/464, loss: 0.025540944188833237 2023-01-22 16:27:28.771684: step: 424/464, loss: 0.003281003562733531 2023-01-22 16:27:29.502298: step: 426/464, loss: 0.000535392202436924 2023-01-22 16:27:30.250368: step: 428/464, loss: 0.0040351953357458115 2023-01-22 16:27:30.967437: step: 430/464, loss: 8.87652004166739e-06 2023-01-22 16:27:31.726417: step: 432/464, loss: 0.015301528386771679 2023-01-22 16:27:32.468396: step: 434/464, loss: 0.0045889755710959435 2023-01-22 16:27:33.226185: step: 436/464, loss: 0.0021863903384655714 2023-01-22 16:27:33.941593: step: 
438/464, loss: 0.0011723927455022931 2023-01-22 16:27:34.819789: step: 440/464, loss: 0.6480519771575928 2023-01-22 16:27:35.537366: step: 442/464, loss: 0.003675105283036828 2023-01-22 16:27:36.319065: step: 444/464, loss: 0.01643628627061844 2023-01-22 16:27:37.052351: step: 446/464, loss: 0.08712334930896759 2023-01-22 16:27:37.791160: step: 448/464, loss: 0.0016254110960289836 2023-01-22 16:27:38.491678: step: 450/464, loss: 0.013469139114022255 2023-01-22 16:27:39.299755: step: 452/464, loss: 0.01300358772277832 2023-01-22 16:27:39.948465: step: 454/464, loss: 0.00330551341176033 2023-01-22 16:27:40.713864: step: 456/464, loss: 0.058972395956516266 2023-01-22 16:27:41.424741: step: 458/464, loss: 0.6161903738975525 2023-01-22 16:27:42.240275: step: 460/464, loss: 0.00022773313685320318 2023-01-22 16:27:42.957933: step: 462/464, loss: 3.4806244373321533 2023-01-22 16:27:43.701859: step: 464/464, loss: 0.0018051972147077322 2023-01-22 16:27:44.425569: step: 466/464, loss: 0.001051753293722868 2023-01-22 16:27:45.187845: step: 468/464, loss: 0.0013541424414142966 2023-01-22 16:27:45.848782: step: 470/464, loss: 0.0036550683435052633 2023-01-22 16:27:46.555494: step: 472/464, loss: 0.04064280539751053 2023-01-22 16:27:47.317202: step: 474/464, loss: 0.000147580387420021 2023-01-22 16:27:47.976064: step: 476/464, loss: 3.274055416113697e-05 2023-01-22 16:27:48.636481: step: 478/464, loss: 0.05929414555430412 2023-01-22 16:27:49.356138: step: 480/464, loss: 0.0007548134890384972 2023-01-22 16:27:50.068712: step: 482/464, loss: 0.004415807314217091 2023-01-22 16:27:50.833775: step: 484/464, loss: 0.017077995464205742 2023-01-22 16:27:51.674543: step: 486/464, loss: 0.43001672625541687 2023-01-22 16:27:52.380608: step: 488/464, loss: 0.03174535930156708 2023-01-22 16:27:53.115741: step: 490/464, loss: 0.006033504381775856 2023-01-22 16:27:53.854620: step: 492/464, loss: 0.0073009757325053215 2023-01-22 16:27:54.522620: step: 494/464, loss: 0.0071230302564799786 
2023-01-22 16:27:55.183488: step: 496/464, loss: 0.00764810387045145 2023-01-22 16:27:55.905221: step: 498/464, loss: 0.00485795084387064 2023-01-22 16:27:56.589566: step: 500/464, loss: 0.0005850406014360487 2023-01-22 16:27:57.315682: step: 502/464, loss: 6.588442920474336e-05 2023-01-22 16:27:57.985935: step: 504/464, loss: 0.004987492226064205 2023-01-22 16:27:58.694620: step: 506/464, loss: 0.01008535549044609 2023-01-22 16:27:59.369272: step: 508/464, loss: 0.0010731680085882545 2023-01-22 16:28:00.091644: step: 510/464, loss: 0.0017684295307844877 2023-01-22 16:28:00.880445: step: 512/464, loss: 0.0022716608364135027 2023-01-22 16:28:01.701997: step: 514/464, loss: 0.012158623896539211 2023-01-22 16:28:02.374269: step: 516/464, loss: 0.018801629543304443 2023-01-22 16:28:03.169958: step: 518/464, loss: 0.03261919319629669 2023-01-22 16:28:03.909680: step: 520/464, loss: 0.00035173961077816784 2023-01-22 16:28:04.613830: step: 522/464, loss: 4.745915430248715e-05 2023-01-22 16:28:05.391637: step: 524/464, loss: 0.0011711184633895755 2023-01-22 16:28:06.141235: step: 526/464, loss: 0.05389797315001488 2023-01-22 16:28:06.887892: step: 528/464, loss: 0.0376632958650589 2023-01-22 16:28:07.633359: step: 530/464, loss: 0.002463065553456545 2023-01-22 16:28:08.341740: step: 532/464, loss: 0.01789242774248123 2023-01-22 16:28:09.001518: step: 534/464, loss: 0.02504398301243782 2023-01-22 16:28:09.646768: step: 536/464, loss: 0.0015117475995793939 2023-01-22 16:28:10.410558: step: 538/464, loss: 0.00411178357899189 2023-01-22 16:28:11.135061: step: 540/464, loss: 0.024579649791121483 2023-01-22 16:28:12.000678: step: 542/464, loss: 0.016789443790912628 2023-01-22 16:28:12.759405: step: 544/464, loss: 0.13599908351898193 2023-01-22 16:28:13.469476: step: 546/464, loss: 7.640109834028408e-05 2023-01-22 16:28:14.314393: step: 548/464, loss: 0.013117525726556778 2023-01-22 16:28:15.007221: step: 550/464, loss: 0.00526499655097723 2023-01-22 16:28:15.745907: step: 
552/464, loss: 0.0007741588051430881 2023-01-22 16:28:16.480505: step: 554/464, loss: 0.004953427240252495 2023-01-22 16:28:17.194539: step: 556/464, loss: 0.03301551938056946 2023-01-22 16:28:17.957780: step: 558/464, loss: 0.10400167852640152 2023-01-22 16:28:18.718483: step: 560/464, loss: 0.03694292530417442 2023-01-22 16:28:19.713787: step: 562/464, loss: 0.02041666954755783 2023-01-22 16:28:20.443895: step: 564/464, loss: 0.026419784873723984 2023-01-22 16:28:21.126397: step: 566/464, loss: 0.005885733757168055 2023-01-22 16:28:21.796540: step: 568/464, loss: 0.002397893462330103 2023-01-22 16:28:22.426104: step: 570/464, loss: 0.0011069076135754585 2023-01-22 16:28:23.150132: step: 572/464, loss: 0.007068040315061808 2023-01-22 16:28:23.887842: step: 574/464, loss: 0.022324377670884132 2023-01-22 16:28:24.611306: step: 576/464, loss: 0.007453723344951868 2023-01-22 16:28:25.349046: step: 578/464, loss: 0.0008849214063957334 2023-01-22 16:28:26.066144: step: 580/464, loss: 0.015994714573025703 2023-01-22 16:28:26.890531: step: 582/464, loss: 0.09704574197530746 2023-01-22 16:28:27.669543: step: 584/464, loss: 0.01191841159015894 2023-01-22 16:28:28.315031: step: 586/464, loss: 0.00012223394878674299 2023-01-22 16:28:29.062559: step: 588/464, loss: 0.004012149292975664 2023-01-22 16:28:29.794045: step: 590/464, loss: 0.006843290291726589 2023-01-22 16:28:30.491144: step: 592/464, loss: 0.0001453045551897958 2023-01-22 16:28:31.268668: step: 594/464, loss: 0.0007951535517349839 2023-01-22 16:28:31.944203: step: 596/464, loss: 0.00012174924631835893 2023-01-22 16:28:32.586410: step: 598/464, loss: 0.017362039536237717 2023-01-22 16:28:33.336827: step: 600/464, loss: 0.009249621070921421 2023-01-22 16:28:34.083892: step: 602/464, loss: 0.08007968962192535 2023-01-22 16:28:34.837294: step: 604/464, loss: 0.001187682501040399 2023-01-22 16:28:35.564670: step: 606/464, loss: 0.0017568308394402266 2023-01-22 16:28:36.303876: step: 608/464, loss: 0.03269410505890846 
2023-01-22 16:28:37.031603: step: 610/464, loss: 0.0011200553271919489 2023-01-22 16:28:37.738126: step: 612/464, loss: 0.060264717787504196 2023-01-22 16:28:38.473994: step: 614/464, loss: 0.0024317821953445673 2023-01-22 16:28:39.243417: step: 616/464, loss: 0.02085116133093834 2023-01-22 16:28:40.006872: step: 618/464, loss: 0.004324778914451599 2023-01-22 16:28:40.685642: step: 620/464, loss: 0.0014203897444531322 2023-01-22 16:28:41.390766: step: 622/464, loss: 0.0027653370052576065 2023-01-22 16:28:42.235559: step: 624/464, loss: 0.04954336956143379 2023-01-22 16:28:42.995551: step: 626/464, loss: 0.008402707986533642 2023-01-22 16:28:43.830979: step: 628/464, loss: 0.0007335083209909499 2023-01-22 16:28:44.531171: step: 630/464, loss: 0.002768837846815586 2023-01-22 16:28:45.271331: step: 632/464, loss: 0.0957309752702713 2023-01-22 16:28:45.989201: step: 634/464, loss: 0.003592605469748378 2023-01-22 16:28:46.813013: step: 636/464, loss: 0.017020680010318756 2023-01-22 16:28:47.618733: step: 638/464, loss: 0.006964333821088076 2023-01-22 16:28:48.325082: step: 640/464, loss: 0.008871006779372692 2023-01-22 16:28:49.058428: step: 642/464, loss: 0.00032925105188041925 2023-01-22 16:28:49.830919: step: 644/464, loss: 0.15270043909549713 2023-01-22 16:28:50.581991: step: 646/464, loss: 0.011862103827297688 2023-01-22 16:28:51.359666: step: 648/464, loss: 0.004576689563691616 2023-01-22 16:28:52.125330: step: 650/464, loss: 0.01853887178003788 2023-01-22 16:28:52.880896: step: 652/464, loss: 0.0003241170779801905 2023-01-22 16:28:53.480520: step: 654/464, loss: 0.005045429803431034 2023-01-22 16:28:54.185117: step: 656/464, loss: 0.0013937974581494927 2023-01-22 16:28:54.900748: step: 658/464, loss: 0.012597468681633472 2023-01-22 16:28:55.582622: step: 660/464, loss: 0.006145347375422716 2023-01-22 16:28:56.293858: step: 662/464, loss: 0.0005045664729550481 2023-01-22 16:28:57.007997: step: 664/464, loss: 0.013134666718542576 2023-01-22 16:28:57.726851: step: 
666/464, loss: 0.019119717180728912 2023-01-22 16:28:58.468057: step: 668/464, loss: 0.0007407998782582581 2023-01-22 16:28:59.133518: step: 670/464, loss: 0.011953119188547134 2023-01-22 16:28:59.806946: step: 672/464, loss: 0.004008065443485975 2023-01-22 16:29:00.476557: step: 674/464, loss: 0.009490004740655422 2023-01-22 16:29:01.117465: step: 676/464, loss: 0.02649562619626522 2023-01-22 16:29:01.880108: step: 678/464, loss: 0.057214513421058655 2023-01-22 16:29:02.679883: step: 680/464, loss: 0.05446144938468933 2023-01-22 16:29:03.438561: step: 682/464, loss: 0.014377056621015072 2023-01-22 16:29:04.311192: step: 684/464, loss: 0.0008804807439446449 2023-01-22 16:29:05.020305: step: 686/464, loss: 0.019741732627153397 2023-01-22 16:29:05.760266: step: 688/464, loss: 0.0017235928680747747 2023-01-22 16:29:06.528510: step: 690/464, loss: 0.013048955239355564 2023-01-22 16:29:07.320287: step: 692/464, loss: 0.027585506439208984 2023-01-22 16:29:07.981227: step: 694/464, loss: 0.0010335876140743494 2023-01-22 16:29:08.778042: step: 696/464, loss: 0.0006243172683753073 2023-01-22 16:29:09.486300: step: 698/464, loss: 0.0284845232963562 2023-01-22 16:29:10.237523: step: 700/464, loss: 0.0001900457573356107 2023-01-22 16:29:10.999832: step: 702/464, loss: 9.985039469029289e-06 2023-01-22 16:29:11.744510: step: 704/464, loss: 0.9770488142967224 2023-01-22 16:29:12.486584: step: 706/464, loss: 0.005254943389445543 2023-01-22 16:29:13.182615: step: 708/464, loss: 0.006244266871362925 2023-01-22 16:29:13.947002: step: 710/464, loss: 0.0006113231065683067 2023-01-22 16:29:14.631031: step: 712/464, loss: 0.0013864204520359635 2023-01-22 16:29:15.349373: step: 714/464, loss: 0.060835305601358414 2023-01-22 16:29:16.051419: step: 716/464, loss: 0.02436334826052189 2023-01-22 16:29:16.811079: step: 718/464, loss: 0.04365500807762146 2023-01-22 16:29:17.602241: step: 720/464, loss: 0.036159999668598175 2023-01-22 16:29:18.349651: step: 722/464, loss: 0.017852721735835075 
2023-01-22 16:29:19.120116: step: 724/464, loss: 0.005818805657327175 2023-01-22 16:29:19.756828: step: 726/464, loss: 0.006242651026695967 2023-01-22 16:29:20.477009: step: 728/464, loss: 0.002634809585288167 2023-01-22 16:29:21.220341: step: 730/464, loss: 0.007493005599826574 2023-01-22 16:29:21.983395: step: 732/464, loss: 0.000106641418824438 2023-01-22 16:29:22.747226: step: 734/464, loss: 0.0031143163796514273 2023-01-22 16:29:23.413999: step: 736/464, loss: 0.00782071053981781 2023-01-22 16:29:24.123883: step: 738/464, loss: 0.006812175270169973 2023-01-22 16:29:24.839000: step: 740/464, loss: 0.0028683652635663748 2023-01-22 16:29:25.528891: step: 742/464, loss: 0.006989945657551289 2023-01-22 16:29:26.218184: step: 744/464, loss: 0.018487777560949326 2023-01-22 16:29:26.986957: step: 746/464, loss: 0.0024340515956282616 2023-01-22 16:29:27.656393: step: 748/464, loss: 0.02723161317408085 2023-01-22 16:29:28.342315: step: 750/464, loss: 0.005367112345993519 2023-01-22 16:29:29.065127: step: 752/464, loss: 0.0003777325327973813 2023-01-22 16:29:29.831114: step: 754/464, loss: 0.005238216836005449 2023-01-22 16:29:30.508279: step: 756/464, loss: 0.0005308697000145912 2023-01-22 16:29:31.280220: step: 758/464, loss: 0.04467277601361275 2023-01-22 16:29:32.011441: step: 760/464, loss: 0.0029819693882018328 2023-01-22 16:29:32.846326: step: 762/464, loss: 0.006541172508150339 2023-01-22 16:29:33.570193: step: 764/464, loss: 0.012404871173202991 2023-01-22 16:29:34.233849: step: 766/464, loss: 0.0012938773725181818 2023-01-22 16:29:34.989019: step: 768/464, loss: 0.05380694568157196 2023-01-22 16:29:35.767245: step: 770/464, loss: 0.011378041468560696 2023-01-22 16:29:36.426702: step: 772/464, loss: 0.00021820772963110358 2023-01-22 16:29:37.149049: step: 774/464, loss: 0.01853141188621521 2023-01-22 16:29:37.868639: step: 776/464, loss: 0.15019969642162323 2023-01-22 16:29:38.631903: step: 778/464, loss: 0.028686635196208954 2023-01-22 16:29:39.393863: step: 
780/464, loss: 0.006159959360957146 2023-01-22 16:29:40.103243: step: 782/464, loss: 0.03424028679728508 2023-01-22 16:29:40.857481: step: 784/464, loss: 0.006407311651855707 2023-01-22 16:29:41.512785: step: 786/464, loss: 0.0034531159326434135 2023-01-22 16:29:42.363985: step: 788/464, loss: 0.0063906945288181305 2023-01-22 16:29:43.047513: step: 790/464, loss: 0.0017129798652604222 2023-01-22 16:29:43.789237: step: 792/464, loss: 0.03176811337471008 2023-01-22 16:29:44.597595: step: 794/464, loss: 0.004839983303099871 2023-01-22 16:29:45.268649: step: 796/464, loss: 0.0023881683591753244 2023-01-22 16:29:46.040862: step: 798/464, loss: 0.0019653679337352514 2023-01-22 16:29:46.740657: step: 800/464, loss: 0.025102870538830757 2023-01-22 16:29:47.433738: step: 802/464, loss: 0.07787781953811646 2023-01-22 16:29:48.119876: step: 804/464, loss: 0.0024989210069179535 2023-01-22 16:29:48.814647: step: 806/464, loss: 0.00024524348555132747 2023-01-22 16:29:49.456014: step: 808/464, loss: 0.026250962167978287 2023-01-22 16:29:50.260093: step: 810/464, loss: 0.009458690881729126 2023-01-22 16:29:50.899816: step: 812/464, loss: 0.0012966262875124812 2023-01-22 16:29:51.644601: step: 814/464, loss: 0.008395758457481861 2023-01-22 16:29:52.359978: step: 816/464, loss: 0.00576248113065958 2023-01-22 16:29:53.042083: step: 818/464, loss: 0.06546097248792648 2023-01-22 16:29:53.703030: step: 820/464, loss: 0.0005249642999842763 2023-01-22 16:29:54.453389: step: 822/464, loss: 0.011017367243766785 2023-01-22 16:29:55.183511: step: 824/464, loss: 0.0021830948535352945 2023-01-22 16:29:55.901141: step: 826/464, loss: 0.022510824725031853 2023-01-22 16:29:56.634463: step: 828/464, loss: 0.002732113003730774 2023-01-22 16:29:57.392021: step: 830/464, loss: 0.06168021261692047 2023-01-22 16:29:58.070062: step: 832/464, loss: 0.0061044213362038136 2023-01-22 16:29:58.766832: step: 834/464, loss: 0.004234203137457371 2023-01-22 16:29:59.474116: step: 836/464, loss: 
0.0001307379425270483 2023-01-22 16:30:00.224241: step: 838/464, loss: 0.0038768108934164047 2023-01-22 16:30:00.971350: step: 840/464, loss: 0.00010593992192298174 2023-01-22 16:30:01.732834: step: 842/464, loss: 0.00027795901405625045 2023-01-22 16:30:02.447816: step: 844/464, loss: 0.03145487606525421 2023-01-22 16:30:03.177729: step: 846/464, loss: 1.2001028060913086 2023-01-22 16:30:03.916549: step: 848/464, loss: 0.001613723929040134 2023-01-22 16:30:04.672305: step: 850/464, loss: 0.004396263509988785 2023-01-22 16:30:05.525224: step: 852/464, loss: 0.05548679083585739 2023-01-22 16:30:06.333862: step: 854/464, loss: 0.0015070741064846516 2023-01-22 16:30:07.039561: step: 856/464, loss: 0.00996997207403183 2023-01-22 16:30:07.757626: step: 858/464, loss: 0.04742727428674698 2023-01-22 16:30:08.527869: step: 860/464, loss: 0.036565475165843964 2023-01-22 16:30:09.250974: step: 862/464, loss: 0.00024810165632516146 2023-01-22 16:30:10.035798: step: 864/464, loss: 0.0038507478311657906 2023-01-22 16:30:10.805742: step: 866/464, loss: 0.018209144473075867 2023-01-22 16:30:11.501643: step: 868/464, loss: 0.06962604075670242 2023-01-22 16:30:12.138108: step: 870/464, loss: 0.0011270270915701985 2023-01-22 16:30:12.947573: step: 872/464, loss: 0.03728864714503288 2023-01-22 16:30:13.622246: step: 874/464, loss: 0.0010363421170040965 2023-01-22 16:30:14.332250: step: 876/464, loss: 0.05309408903121948 2023-01-22 16:30:14.994305: step: 878/464, loss: 0.05263727530837059 2023-01-22 16:30:15.774323: step: 880/464, loss: 1.1782157116613234e-06 2023-01-22 16:30:16.512043: step: 882/464, loss: 0.3160185217857361 2023-01-22 16:30:17.205906: step: 884/464, loss: 0.00017775286687538028 2023-01-22 16:30:17.898784: step: 886/464, loss: 0.0008852652972564101 2023-01-22 16:30:18.547951: step: 888/464, loss: 0.002755802357569337 2023-01-22 16:30:19.224236: step: 890/464, loss: 0.0003568526590242982 2023-01-22 16:30:19.996126: step: 892/464, loss: 0.005164716858416796 2023-01-22 
16:30:20.681677: step: 894/464, loss: 0.021540869027376175 2023-01-22 16:30:21.465855: step: 896/464, loss: 0.0017362685175612569 2023-01-22 16:30:22.176826: step: 898/464, loss: 0.001434554811567068 2023-01-22 16:30:22.942725: step: 900/464, loss: 0.010029935277998447 2023-01-22 16:30:23.719452: step: 902/464, loss: 0.005941364914178848 2023-01-22 16:30:24.456112: step: 904/464, loss: 0.0255038533359766 2023-01-22 16:30:25.125714: step: 906/464, loss: 0.022550631314516068 2023-01-22 16:30:25.828036: step: 908/464, loss: 0.024902237579226494 2023-01-22 16:30:26.434866: step: 910/464, loss: 0.14438927173614502 2023-01-22 16:30:27.069174: step: 912/464, loss: 0.0119105763733387 2023-01-22 16:30:27.802059: step: 914/464, loss: 0.014482015743851662 2023-01-22 16:30:28.569476: step: 916/464, loss: 0.04747198522090912 2023-01-22 16:30:29.338511: step: 918/464, loss: 0.025189634412527084 2023-01-22 16:30:30.019599: step: 920/464, loss: 0.0019587657880038023 2023-01-22 16:30:30.743883: step: 922/464, loss: 0.022820036858320236 2023-01-22 16:30:31.452870: step: 924/464, loss: 0.022515999153256416 2023-01-22 16:30:32.148819: step: 926/464, loss: 0.0030971961095929146 2023-01-22 16:30:33.002290: step: 928/464, loss: 0.06827942281961441 2023-01-22 16:30:33.635298: step: 930/464, loss: 4.529808575171046e-05
==================================================
Loss: 0.036
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3223564514189514, 'r': 0.3303083183419996, 'f1': 0.32628394332939786}, 'combined': 0.2404197477163984, 'epoch': 36}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3149965388942404, 'r': 0.2931866895148172, 'f1': 0.30370055645438543}, 'combined': 0.18861402979798675, 'epoch': 36}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30380801480934694, 'r': 0.3245615034870253, 'f1': 0.31384204098653634}, 'combined': 0.23125203020060572, 'epoch': 36}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30545271719769473, 'r': 0.2930654161046131, 'f1': 0.2991308790325733}, 'combined': 0.1857760196097034, 'epoch': 36}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31838743693800114, 'r': 0.32684554721718906, 'f1': 0.3225610550252034}, 'combined': 0.23767656686067617, 'epoch': 36}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3237106508222157, 'r': 0.2981461374305303, 'f1': 0.3104029159477156}, 'combined': 0.19277654779910758, 'epoch': 36}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3125, 'r': 0.3482142857142857, 'f1': 0.3293918918918919}, 'combined': 0.21959459459459457, 'epoch': 36}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26875, 'r': 0.4673913043478261, 'f1': 0.3412698412698412}, 'combined': 0.1706349206349206, 'epoch': 36}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3875, 'r': 0.2672413793103448, 'f1': 0.3163265306122449}, 'combined': 0.2108843537414966, 'epoch': 36}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 37
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:33:12.936666: step: 2/464, loss: 0.280551552772522 2023-01-22 16:33:13.680377: step: 4/464, loss: 0.008939560502767563 2023-01-22 16:33:14.514316: step: 6/464, loss: 0.002099171746522188 2023-01-22 16:33:15.229273: step: 8/464, loss: 0.0003868904896080494 2023-01-22 16:33:15.984467: step:
10/464, loss: 0.06558524817228317 2023-01-22 16:33:16.680698: step: 12/464, loss: 0.0013958881609141827 2023-01-22 16:33:17.337015: step: 14/464, loss: 0.0008803207892924547 2023-01-22 16:33:18.040696: step: 16/464, loss: 0.021780746057629585 2023-01-22 16:33:18.746364: step: 18/464, loss: 0.0008355473983101547 2023-01-22 16:33:19.521228: step: 20/464, loss: 0.7923649549484253 2023-01-22 16:33:20.181018: step: 22/464, loss: 0.0001320467854384333 2023-01-22 16:33:20.975774: step: 24/464, loss: 0.0014507113955914974 2023-01-22 16:33:21.709602: step: 26/464, loss: 0.011154986917972565 2023-01-22 16:33:22.507193: step: 28/464, loss: 0.07527535408735275 2023-01-22 16:33:23.206051: step: 30/464, loss: 0.006951568648219109 2023-01-22 16:33:23.918844: step: 32/464, loss: 9.29707384784706e-05 2023-01-22 16:33:24.679651: step: 34/464, loss: 0.005324409808963537 2023-01-22 16:33:25.401212: step: 36/464, loss: 0.5303267240524292 2023-01-22 16:33:26.047959: step: 38/464, loss: 4.769072256749496e-05 2023-01-22 16:33:26.910011: step: 40/464, loss: 0.014213241636753082 2023-01-22 16:33:27.682798: step: 42/464, loss: 0.011028922162950039 2023-01-22 16:33:28.476693: step: 44/464, loss: 0.002563275396823883 2023-01-22 16:33:29.253665: step: 46/464, loss: 1.1958390474319458 2023-01-22 16:33:29.991124: step: 48/464, loss: 0.3304753005504608 2023-01-22 16:33:30.704475: step: 50/464, loss: 0.0014656719285994768 2023-01-22 16:33:31.416230: step: 52/464, loss: 0.0022754385136067867 2023-01-22 16:33:32.128809: step: 54/464, loss: 0.0007055680034682155 2023-01-22 16:33:32.805369: step: 56/464, loss: 0.005791050381958485 2023-01-22 16:33:33.459989: step: 58/464, loss: 0.02034652605652809 2023-01-22 16:33:34.154891: step: 60/464, loss: 0.015613360330462456 2023-01-22 16:33:34.898723: step: 62/464, loss: 0.004705897066742182 2023-01-22 16:33:35.685139: step: 64/464, loss: 0.008632996119558811 2023-01-22 16:33:36.377503: step: 66/464, loss: 0.011131679639220238 2023-01-22 16:33:37.151378: step: 
68/464, loss: 0.009376917965710163 2023-01-22 16:33:37.908354: step: 70/464, loss: 0.008288220502436161 2023-01-22 16:33:38.699378: step: 72/464, loss: 0.001611040672287345 2023-01-22 16:33:39.478207: step: 74/464, loss: 0.01145377941429615 2023-01-22 16:33:40.237207: step: 76/464, loss: 0.0019228061428293586 2023-01-22 16:33:40.995407: step: 78/464, loss: 0.0053077698685228825 2023-01-22 16:33:41.734753: step: 80/464, loss: 0.0004243594885338098 2023-01-22 16:33:42.458355: step: 82/464, loss: 0.0004092931339982897 2023-01-22 16:33:43.133632: step: 84/464, loss: 0.0032144596334546804 2023-01-22 16:33:43.870586: step: 86/464, loss: 0.001705450122244656 2023-01-22 16:33:44.573384: step: 88/464, loss: 0.0032923638354986906 2023-01-22 16:33:45.230055: step: 90/464, loss: 0.00027621854678727686 2023-01-22 16:33:46.153009: step: 92/464, loss: 0.009215369820594788 2023-01-22 16:33:46.837474: step: 94/464, loss: 1.0244518307445105e-05 2023-01-22 16:33:47.586854: step: 96/464, loss: 0.0014645474730059505 2023-01-22 16:33:48.349470: step: 98/464, loss: 0.007156630512326956 2023-01-22 16:33:49.081408: step: 100/464, loss: 0.006801954470574856 2023-01-22 16:33:49.871317: step: 102/464, loss: 0.00554502522572875 2023-01-22 16:33:50.653992: step: 104/464, loss: 0.03765822947025299 2023-01-22 16:33:51.367512: step: 106/464, loss: 0.0032080982346087694 2023-01-22 16:33:52.022075: step: 108/464, loss: 0.00014240032760426402 2023-01-22 16:33:52.727468: step: 110/464, loss: 1.0253015756607056 2023-01-22 16:33:53.461885: step: 112/464, loss: 0.0023225920740514994 2023-01-22 16:33:54.236541: step: 114/464, loss: 0.45400118827819824 2023-01-22 16:33:54.938499: step: 116/464, loss: 0.008975445292890072 2023-01-22 16:33:55.662445: step: 118/464, loss: 0.011956234462559223 2023-01-22 16:33:56.475118: step: 120/464, loss: 0.011662127450108528 2023-01-22 16:33:57.266565: step: 122/464, loss: 0.02039065584540367 2023-01-22 16:33:58.026060: step: 124/464, loss: 0.03815082460641861 2023-01-22 
16:33:58.800888: step: 126/464, loss: 5.76664533582516e-05 2023-01-22 16:33:59.539896: step: 128/464, loss: 0.0028054367285221815 2023-01-22 16:34:00.371650: step: 130/464, loss: 0.45062610507011414 2023-01-22 16:34:01.038978: step: 132/464, loss: 0.028487809002399445 2023-01-22 16:34:01.763443: step: 134/464, loss: 0.7996045351028442 2023-01-22 16:34:02.578023: step: 136/464, loss: 0.00445803627371788 2023-01-22 16:34:03.333050: step: 138/464, loss: 0.001006808946840465 2023-01-22 16:34:04.035247: step: 140/464, loss: 0.10681931674480438 2023-01-22 16:34:04.679997: step: 142/464, loss: 0.002509431215003133 2023-01-22 16:34:05.392024: step: 144/464, loss: 0.0034187352284789085 2023-01-22 16:34:06.103242: step: 146/464, loss: 0.013594591058790684 2023-01-22 16:34:06.799691: step: 148/464, loss: 0.001105702482163906 2023-01-22 16:34:07.548981: step: 150/464, loss: 0.009012116119265556 2023-01-22 16:34:08.197202: step: 152/464, loss: 0.0008563753799535334 2023-01-22 16:34:08.911679: step: 154/464, loss: 0.013175604864954948 2023-01-22 16:34:09.684657: step: 156/464, loss: 0.014561844058334827 2023-01-22 16:34:10.460811: step: 158/464, loss: 0.05313951522111893 2023-01-22 16:34:11.338210: step: 160/464, loss: 0.0025458012241870165 2023-01-22 16:34:12.199061: step: 162/464, loss: 0.0032371277920901775 2023-01-22 16:34:12.947534: step: 164/464, loss: 0.34664544463157654 2023-01-22 16:34:13.745616: step: 166/464, loss: 0.01839899830520153 2023-01-22 16:34:14.489320: step: 168/464, loss: 0.031566642224788666 2023-01-22 16:34:15.107009: step: 170/464, loss: 0.01217776257544756 2023-01-22 16:34:15.831731: step: 172/464, loss: 1.7539546489715576 2023-01-22 16:34:16.601505: step: 174/464, loss: 0.029823634773492813 2023-01-22 16:34:17.324201: step: 176/464, loss: 0.005956373643130064 2023-01-22 16:34:18.113811: step: 178/464, loss: 0.0004605269932653755 2023-01-22 16:34:18.920534: step: 180/464, loss: 0.01578412391245365 2023-01-22 16:34:19.605883: step: 182/464, loss: 
0.007628194522112608 2023-01-22 16:34:20.343351: step: 184/464, loss: 0.0007106433040462434 2023-01-22 16:34:21.125317: step: 186/464, loss: 0.00034200208028778434 2023-01-22 16:34:21.829164: step: 188/464, loss: 0.003999216482043266 2023-01-22 16:34:22.558567: step: 190/464, loss: 0.0010963481618091464 2023-01-22 16:34:23.277814: step: 192/464, loss: 0.0002236723667010665 2023-01-22 16:34:23.973244: step: 194/464, loss: 0.0016451469855383039 2023-01-22 16:34:24.645133: step: 196/464, loss: 0.008220840245485306 2023-01-22 16:34:25.399918: step: 198/464, loss: 0.0009095626883208752 2023-01-22 16:34:26.074355: step: 200/464, loss: 0.4115036725997925 2023-01-22 16:34:26.830699: step: 202/464, loss: 0.002638082019984722 2023-01-22 16:34:27.534562: step: 204/464, loss: 4.80562994198408e-05 2023-01-22 16:34:28.355508: step: 206/464, loss: 0.01674598455429077 2023-01-22 16:34:29.042930: step: 208/464, loss: 0.0022681746631860733 2023-01-22 16:34:29.785626: step: 210/464, loss: 0.0005993362283334136 2023-01-22 16:34:30.472704: step: 212/464, loss: 0.0013340349541977048 2023-01-22 16:34:31.150054: step: 214/464, loss: 0.0249335877597332 2023-01-22 16:34:31.915776: step: 216/464, loss: 0.005155049730092287 2023-01-22 16:34:32.666750: step: 218/464, loss: 0.005427779629826546 2023-01-22 16:34:33.473196: step: 220/464, loss: 0.0010236561065539718 2023-01-22 16:34:34.147200: step: 222/464, loss: 2.951405076601077e-05 2023-01-22 16:34:34.890453: step: 224/464, loss: 0.00332543533295393 2023-01-22 16:34:35.548065: step: 226/464, loss: 0.003271919209510088 2023-01-22 16:34:36.225780: step: 228/464, loss: 0.05662386491894722 2023-01-22 16:34:37.044044: step: 230/464, loss: 0.005633444990962744 2023-01-22 16:34:37.818461: step: 232/464, loss: 0.0012898902641609311 2023-01-22 16:34:38.558455: step: 234/464, loss: 0.0004988862201571465 2023-01-22 16:34:39.276764: step: 236/464, loss: 0.000257426465395838 2023-01-22 16:34:39.965509: step: 238/464, loss: 0.017902931198477745 2023-01-22 
16:34:40.697721: step: 240/464, loss: 0.002358401892706752 2023-01-22 16:34:41.387499: step: 242/464, loss: 0.05138232186436653 2023-01-22 16:34:42.187291: step: 244/464, loss: 0.0016939010238274932 2023-01-22 16:34:42.931819: step: 246/464, loss: 0.011640515178442001 2023-01-22 16:34:43.639459: step: 248/464, loss: 0.000751245825085789 2023-01-22 16:34:44.491939: step: 250/464, loss: 0.0003486153727862984 2023-01-22 16:34:45.232364: step: 252/464, loss: 0.045683737844228745 2023-01-22 16:34:46.003445: step: 254/464, loss: 0.0002169935469282791 2023-01-22 16:34:46.671179: step: 256/464, loss: 0.012211199849843979 2023-01-22 16:34:47.466786: step: 258/464, loss: 0.008998665027320385 2023-01-22 16:34:48.245816: step: 260/464, loss: 0.04509185254573822 2023-01-22 16:34:48.949460: step: 262/464, loss: 0.016031892970204353 2023-01-22 16:34:49.677391: step: 264/464, loss: 0.00014143930457066745 2023-01-22 16:34:50.420719: step: 266/464, loss: 0.0026746096555143595 2023-01-22 16:34:51.126523: step: 268/464, loss: 0.06285201013088226 2023-01-22 16:34:51.804517: step: 270/464, loss: 0.025904759764671326 2023-01-22 16:34:52.527231: step: 272/464, loss: 0.0293950904160738 2023-01-22 16:34:53.217865: step: 274/464, loss: 0.00917926337569952 2023-01-22 16:34:53.983678: step: 276/464, loss: 0.00245084916241467 2023-01-22 16:34:54.759477: step: 278/464, loss: 0.06213583052158356 2023-01-22 16:34:55.455840: step: 280/464, loss: 0.00697915768250823 2023-01-22 16:34:56.213494: step: 282/464, loss: 0.026562105864286423 2023-01-22 16:34:56.918220: step: 284/464, loss: 0.017584990710020065 2023-01-22 16:34:57.614524: step: 286/464, loss: 0.00643016304820776 2023-01-22 16:34:58.348435: step: 288/464, loss: 0.004855574574321508 2023-01-22 16:34:59.077548: step: 290/464, loss: 0.0016783819301053882 2023-01-22 16:34:59.820579: step: 292/464, loss: 0.0005006135324947536 2023-01-22 16:35:00.549491: step: 294/464, loss: 0.0228210911154747 2023-01-22 16:35:01.288199: step: 296/464, loss: 
0.0024767278227955103 2023-01-22 16:35:01.976733: step: 298/464, loss: 0.00437591876834631 2023-01-22 16:35:02.680857: step: 300/464, loss: 0.28448402881622314 2023-01-22 16:35:03.339721: step: 302/464, loss: 0.00031451816903427243 2023-01-22 16:35:04.093021: step: 304/464, loss: 0.05789633467793465 2023-01-22 16:35:04.859361: step: 306/464, loss: 0.029583610594272614 2023-01-22 16:35:05.570656: step: 308/464, loss: 0.0015377000672742724 2023-01-22 16:35:06.339437: step: 310/464, loss: 0.029524048790335655 2023-01-22 16:35:07.135380: step: 312/464, loss: 0.012191911228001118 2023-01-22 16:35:07.851997: step: 314/464, loss: 0.0005353660089895129 2023-01-22 16:35:08.665603: step: 316/464, loss: 0.03353969007730484 2023-01-22 16:35:09.407862: step: 318/464, loss: 0.28003543615341187 2023-01-22 16:35:10.136032: step: 320/464, loss: 6.730011955369264e-05 2023-01-22 16:35:10.891862: step: 322/464, loss: 0.0003083897172473371 2023-01-22 16:35:11.631218: step: 324/464, loss: 0.0009723399416543543 2023-01-22 16:35:12.338997: step: 326/464, loss: 0.0031207154970616102 2023-01-22 16:35:13.024749: step: 328/464, loss: 0.004871489480137825 2023-01-22 16:35:13.725562: step: 330/464, loss: 0.006424302235245705 2023-01-22 16:35:14.466797: step: 332/464, loss: 0.028175652027130127 2023-01-22 16:35:15.172231: step: 334/464, loss: 0.00297021446749568 2023-01-22 16:35:15.844786: step: 336/464, loss: 0.012575359083712101 2023-01-22 16:35:16.526925: step: 338/464, loss: 0.0013873311690986156 2023-01-22 16:35:17.209053: step: 340/464, loss: 0.012714538723230362 2023-01-22 16:35:17.941706: step: 342/464, loss: 0.0001900491479318589 2023-01-22 16:35:18.683720: step: 344/464, loss: 0.013001831248402596 2023-01-22 16:35:19.446117: step: 346/464, loss: 0.00460166297852993 2023-01-22 16:35:20.170827: step: 348/464, loss: 0.004661817103624344 2023-01-22 16:35:20.895871: step: 350/464, loss: 0.03515256196260452 2023-01-22 16:35:21.603146: step: 352/464, loss: 0.01803845725953579 2023-01-22 
16:35:22.267693: step: 354/464, loss: 0.014347223564982414 2023-01-22 16:35:22.947864: step: 356/464, loss: 0.023120839148759842 2023-01-22 16:35:23.653470: step: 358/464, loss: 0.030444692820310593 2023-01-22 16:35:24.401183: step: 360/464, loss: 0.000849287782330066 2023-01-22 16:35:25.165828: step: 362/464, loss: 0.0013305057073011994 2023-01-22 16:35:25.916297: step: 364/464, loss: 0.0025766007602214813 2023-01-22 16:35:26.607594: step: 366/464, loss: 0.00023264545598067343 2023-01-22 16:35:27.420002: step: 368/464, loss: 0.008877119980752468 2023-01-22 16:35:28.190497: step: 370/464, loss: 0.03428717330098152 2023-01-22 16:35:28.863412: step: 372/464, loss: 0.05827299878001213 2023-01-22 16:35:29.610773: step: 374/464, loss: 0.00013529513671528548 2023-01-22 16:35:30.288859: step: 376/464, loss: 0.0038425836246460676 2023-01-22 16:35:31.042810: step: 378/464, loss: 0.11871061474084854 2023-01-22 16:35:31.769418: step: 380/464, loss: 0.0017034834017977118 2023-01-22 16:35:32.456668: step: 382/464, loss: 0.011208605021238327 2023-01-22 16:35:33.280769: step: 384/464, loss: 0.002513043349608779 2023-01-22 16:35:34.034959: step: 386/464, loss: 0.004687040578573942 2023-01-22 16:35:34.746205: step: 388/464, loss: 4.855592123931274e-05 2023-01-22 16:35:35.499568: step: 390/464, loss: 0.004560593515634537 2023-01-22 16:35:36.251433: step: 392/464, loss: 0.043515417724847794 2023-01-22 16:35:36.990497: step: 394/464, loss: 0.42079973220825195 2023-01-22 16:35:37.825244: step: 396/464, loss: 0.0009142745402641594 2023-01-22 16:35:38.512499: step: 398/464, loss: 0.0012250650906935334 2023-01-22 16:35:39.298506: step: 400/464, loss: 0.0012005030876025558 2023-01-22 16:35:40.094858: step: 402/464, loss: 0.03129834681749344 2023-01-22 16:35:40.790349: step: 404/464, loss: 0.00040537992026656866 2023-01-22 16:35:41.549435: step: 406/464, loss: 0.052299920469522476 2023-01-22 16:35:42.350645: step: 408/464, loss: 0.00806864257901907 2023-01-22 16:35:43.131129: step: 410/464, 
loss: 0.004822226706892252 2023-01-22 16:35:43.884309: step: 412/464, loss: 0.13479790091514587 2023-01-22 16:35:44.528154: step: 414/464, loss: 0.00043789477786049247 2023-01-22 16:35:45.274349: step: 416/464, loss: 0.016873696818947792 2023-01-22 16:35:46.022867: step: 418/464, loss: 0.0004592374316416681 2023-01-22 16:35:46.773617: step: 420/464, loss: 4.474152774491813e-06 2023-01-22 16:35:47.539212: step: 422/464, loss: 0.020167384296655655 2023-01-22 16:35:48.279817: step: 424/464, loss: 0.0105759147554636 2023-01-22 16:35:48.998052: step: 426/464, loss: 0.03869227319955826 2023-01-22 16:35:49.655505: step: 428/464, loss: 0.024213694036006927 2023-01-22 16:35:50.337361: step: 430/464, loss: 0.005004123318940401 2023-01-22 16:35:51.020276: step: 432/464, loss: 0.012551484629511833 2023-01-22 16:35:51.778292: step: 434/464, loss: 0.03439558297395706 2023-01-22 16:35:52.482774: step: 436/464, loss: 0.0023460935335606337 2023-01-22 16:35:53.218701: step: 438/464, loss: 0.0003566384839359671 2023-01-22 16:35:53.942918: step: 440/464, loss: 0.007467083632946014 2023-01-22 16:35:54.698546: step: 442/464, loss: 0.00042208057129755616 2023-01-22 16:35:55.404033: step: 444/464, loss: 0.0004894437151961029 2023-01-22 16:35:56.114132: step: 446/464, loss: 0.0005053331260569394 2023-01-22 16:35:56.887380: step: 448/464, loss: 0.0006129793473519385 2023-01-22 16:35:57.561438: step: 450/464, loss: 0.004137951415032148 2023-01-22 16:35:58.337711: step: 452/464, loss: 0.02457266114652157 2023-01-22 16:35:59.013354: step: 454/464, loss: 0.010156502947211266 2023-01-22 16:35:59.790736: step: 456/464, loss: 0.007887723855674267 2023-01-22 16:36:00.462802: step: 458/464, loss: 0.0018748701550066471 2023-01-22 16:36:01.140345: step: 460/464, loss: 0.0013414479326456785 2023-01-22 16:36:01.945079: step: 462/464, loss: 0.0009742419933900237 2023-01-22 16:36:02.787290: step: 464/464, loss: 0.013239771127700806 2023-01-22 16:36:03.548154: step: 466/464, loss: 0.023154204711318016 
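Each record in this log follows the pattern `<timestamp>: step: <i>/<total>, loss: <value>`, and the `Loss:` figure printed at the end of each epoch is an aggregate (plausibly the mean) of these per-step losses. A minimal sketch of extracting losses from such a log chunk; the regex and function name are illustrative, not part of `train.py`:

```python
import re

# Matches "step: <i>/<total>, loss: <value>" records as they appear in this log.
STEP_RE = re.compile(r"step: (\d+)/(\d+), loss: ([0-9.eE+-]+)")

def mean_epoch_loss(log_text: str) -> float:
    """Average the per-step losses found in a chunk of training log."""
    losses = [float(m.group(3)) for m in STEP_RE.finditer(log_text)]
    return sum(losses) / len(losses) if losses else 0.0

sample = ("2023-01-22 16:34:40.697721: step: 240/464, loss: 0.002358401892706752 "
          "2023-01-22 16:34:41.387499: step: 242/464, loss: 0.05138232186436653")
print(mean_epoch_loss(sample))  # mean of the two sample losses
```

The scientific-notation alternative (`6.730011955369264e-05`) is covered by the `[0-9.eE+-]+` value pattern.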
2023-01-22 16:36:04.241107: step: 468/464, loss: 0.0389837771654129 2023-01-22 16:36:04.934860: step: 470/464, loss: 0.005595343187451363 2023-01-22 16:36:05.638946: step: 472/464, loss: 0.04169272258877754 2023-01-22 16:36:06.376719: step: 474/464, loss: 0.033552419394254684 2023-01-22 16:36:07.101064: step: 476/464, loss: 0.01440391968935728 2023-01-22 16:36:07.873214: step: 478/464, loss: 1.0795247554779053 2023-01-22 16:36:08.615396: step: 480/464, loss: 0.019383076578378677 2023-01-22 16:36:09.346572: step: 482/464, loss: 0.006699536461383104 2023-01-22 16:36:10.047203: step: 484/464, loss: 0.006788499187678099 2023-01-22 16:36:10.812394: step: 486/464, loss: 0.004629552364349365 2023-01-22 16:36:11.596923: step: 488/464, loss: 0.0003677864442579448 2023-01-22 16:36:12.333143: step: 490/464, loss: 0.014383114874362946 2023-01-22 16:36:13.066528: step: 492/464, loss: 0.036328598856925964 2023-01-22 16:36:13.795510: step: 494/464, loss: 0.02989009954035282 2023-01-22 16:36:14.581259: step: 496/464, loss: 0.008138212375342846 2023-01-22 16:36:15.322193: step: 498/464, loss: 0.0007988435099832714 2023-01-22 16:36:16.023715: step: 500/464, loss: 0.006192335858941078 2023-01-22 16:36:16.730982: step: 502/464, loss: 0.009122030809521675 2023-01-22 16:36:17.482038: step: 504/464, loss: 0.0024027577601373196 2023-01-22 16:36:18.196896: step: 506/464, loss: 0.0008780999341979623 2023-01-22 16:36:18.927796: step: 508/464, loss: 0.05414319410920143 2023-01-22 16:36:19.664000: step: 510/464, loss: 0.15176241099834442 2023-01-22 16:36:20.342589: step: 512/464, loss: 0.0005607667262665927 2023-01-22 16:36:20.980602: step: 514/464, loss: 1.1074645954067819e-05 2023-01-22 16:36:21.777169: step: 516/464, loss: 0.004496861714869738 2023-01-22 16:36:22.522466: step: 518/464, loss: 0.0012348692398518324 2023-01-22 16:36:23.252420: step: 520/464, loss: 0.0013506343821063638 2023-01-22 16:36:23.949565: step: 522/464, loss: 0.0007885704399086535 2023-01-22 16:36:24.715765: step: 
524/464, loss: 0.003030086401849985 2023-01-22 16:36:25.446725: step: 526/464, loss: 0.001310868770815432 2023-01-22 16:36:26.114907: step: 528/464, loss: 0.00021499492868315428 2023-01-22 16:36:26.856105: step: 530/464, loss: 0.0947151705622673 2023-01-22 16:36:27.493838: step: 532/464, loss: 1.7647675122134387e-05 2023-01-22 16:36:28.234287: step: 534/464, loss: 0.0041124713607132435 2023-01-22 16:36:28.937864: step: 536/464, loss: 0.0023744269274175167 2023-01-22 16:36:29.671965: step: 538/464, loss: 0.02584262192249298 2023-01-22 16:36:30.390728: step: 540/464, loss: 0.003794726449996233 2023-01-22 16:36:31.120299: step: 542/464, loss: 0.007657110691070557 2023-01-22 16:36:31.808926: step: 544/464, loss: 0.0011781043140217662 2023-01-22 16:36:32.491384: step: 546/464, loss: 0.016155825927853584 2023-01-22 16:36:33.225880: step: 548/464, loss: 0.009724811650812626 2023-01-22 16:36:33.976371: step: 550/464, loss: 0.0010752100497484207 2023-01-22 16:36:34.804152: step: 552/464, loss: 0.00013952278823126107 2023-01-22 16:36:35.491117: step: 554/464, loss: 0.02432950958609581 2023-01-22 16:36:36.182920: step: 556/464, loss: 0.11144956946372986 2023-01-22 16:36:36.980396: step: 558/464, loss: 0.0349210649728775 2023-01-22 16:36:37.726386: step: 560/464, loss: 0.0006711905589327216 2023-01-22 16:36:38.492524: step: 562/464, loss: 0.0016934089362621307 2023-01-22 16:36:39.164420: step: 564/464, loss: 0.0023588351905345917 2023-01-22 16:36:39.883357: step: 566/464, loss: 0.003813401563093066 2023-01-22 16:36:40.568437: step: 568/464, loss: 0.07490542531013489 2023-01-22 16:36:41.293104: step: 570/464, loss: 0.007735707797110081 2023-01-22 16:36:42.049148: step: 572/464, loss: 0.0013534906320273876 2023-01-22 16:36:42.826030: step: 574/464, loss: 0.008723358623683453 2023-01-22 16:36:43.540148: step: 576/464, loss: 0.027125995606184006 2023-01-22 16:36:44.246214: step: 578/464, loss: 0.00965013075619936 2023-01-22 16:36:45.007257: step: 580/464, loss: 
0.010939667001366615 2023-01-22 16:36:45.729131: step: 582/464, loss: 0.00047597894445061684 2023-01-22 16:36:46.541097: step: 584/464, loss: 0.03781023249030113 2023-01-22 16:36:47.406201: step: 586/464, loss: 0.023593248799443245 2023-01-22 16:36:48.117571: step: 588/464, loss: 0.054472390562295914 2023-01-22 16:36:48.832910: step: 590/464, loss: 0.02335720881819725 2023-01-22 16:36:49.636194: step: 592/464, loss: 0.2685994505882263 2023-01-22 16:36:50.375589: step: 594/464, loss: 0.007727600634098053 2023-01-22 16:36:51.117320: step: 596/464, loss: 0.03737568482756615 2023-01-22 16:36:51.911575: step: 598/464, loss: 0.004133549984544516 2023-01-22 16:36:52.584930: step: 600/464, loss: 0.0037972447462379932 2023-01-22 16:36:53.326540: step: 602/464, loss: 0.012717491947114468 2023-01-22 16:36:54.083848: step: 604/464, loss: 0.002593568991869688 2023-01-22 16:36:54.824894: step: 606/464, loss: 0.00020642158051487058 2023-01-22 16:36:55.572504: step: 608/464, loss: 0.029949229210615158 2023-01-22 16:36:56.334258: step: 610/464, loss: 0.012447851710021496 2023-01-22 16:36:56.967182: step: 612/464, loss: 0.0029245391488075256 2023-01-22 16:36:57.709283: step: 614/464, loss: 0.014462887309491634 2023-01-22 16:36:58.532645: step: 616/464, loss: 0.026355883106589317 2023-01-22 16:36:59.184739: step: 618/464, loss: 0.007805028930306435 2023-01-22 16:36:59.879096: step: 620/464, loss: 0.012717099860310555 2023-01-22 16:37:00.568913: step: 622/464, loss: 0.007876298390328884 2023-01-22 16:37:01.332521: step: 624/464, loss: 0.00024364175624214113 2023-01-22 16:37:02.047069: step: 626/464, loss: 7.127138087525964e-05 2023-01-22 16:37:02.807279: step: 628/464, loss: 0.001971708843484521 2023-01-22 16:37:03.538515: step: 630/464, loss: 7.846429070923477e-05 2023-01-22 16:37:04.263800: step: 632/464, loss: 0.0007255128002725542 2023-01-22 16:37:05.026041: step: 634/464, loss: 0.014832047745585442 2023-01-22 16:37:05.688857: step: 636/464, loss: 0.0014034698251634836 2023-01-22 
16:37:06.577270: step: 638/464, loss: 0.011184017173945904 2023-01-22 16:37:07.338298: step: 640/464, loss: 0.001698363688774407 2023-01-22 16:37:08.181584: step: 642/464, loss: 0.018939031288027763 2023-01-22 16:37:08.917870: step: 644/464, loss: 0.03413274884223938 2023-01-22 16:37:09.671928: step: 646/464, loss: 0.008116284385323524 2023-01-22 16:37:10.472426: step: 648/464, loss: 0.0332346111536026 2023-01-22 16:37:11.209916: step: 650/464, loss: 0.17691725492477417 2023-01-22 16:37:11.962596: step: 652/464, loss: 0.007915996946394444 2023-01-22 16:37:12.692758: step: 654/464, loss: 0.001967285992577672 2023-01-22 16:37:13.404124: step: 656/464, loss: 0.003613883862271905 2023-01-22 16:37:14.250363: step: 658/464, loss: 0.034955792129039764 2023-01-22 16:37:14.926969: step: 660/464, loss: 0.003080249996855855 2023-01-22 16:37:15.655145: step: 662/464, loss: 0.0010708275949582458 2023-01-22 16:37:16.411348: step: 664/464, loss: 0.013194491155445576 2023-01-22 16:37:17.297278: step: 666/464, loss: 0.009825510904192924 2023-01-22 16:37:18.034726: step: 668/464, loss: 0.009700184687972069 2023-01-22 16:37:18.751475: step: 670/464, loss: 0.0038985528517514467 2023-01-22 16:37:19.469596: step: 672/464, loss: 0.002727124374359846 2023-01-22 16:37:20.163595: step: 674/464, loss: 0.002421292709186673 2023-01-22 16:37:20.813221: step: 676/464, loss: 0.002174343913793564 2023-01-22 16:37:21.562079: step: 678/464, loss: 0.03671526908874512 2023-01-22 16:37:22.299261: step: 680/464, loss: 0.00047577309305779636 2023-01-22 16:37:23.267485: step: 682/464, loss: 0.006250323727726936 2023-01-22 16:37:24.005472: step: 684/464, loss: 0.0016291936626657844 2023-01-22 16:37:24.797174: step: 686/464, loss: 0.09297700971364975 2023-01-22 16:37:25.544406: step: 688/464, loss: 0.001678445260040462 2023-01-22 16:37:26.324124: step: 690/464, loss: 0.011845313012599945 2023-01-22 16:37:27.063466: step: 692/464, loss: 0.024060700088739395 2023-01-22 16:37:27.777509: step: 694/464, loss: 
0.47590959072113037 2023-01-22 16:37:28.492973: step: 696/464, loss: 0.0004216399975121021 2023-01-22 16:37:29.207025: step: 698/464, loss: 0.002506182761862874 2023-01-22 16:37:29.972791: step: 700/464, loss: 0.008632753044366837 2023-01-22 16:37:30.619934: step: 702/464, loss: 0.020507795736193657 2023-01-22 16:37:31.291661: step: 704/464, loss: 0.23401466012001038 2023-01-22 16:37:32.031549: step: 706/464, loss: 0.0024203970097005367 2023-01-22 16:37:32.817156: step: 708/464, loss: 0.007572733331471682 2023-01-22 16:37:33.550626: step: 710/464, loss: 0.018590884283185005 2023-01-22 16:37:34.270476: step: 712/464, loss: 0.019573379307985306 2023-01-22 16:37:35.074432: step: 714/464, loss: 0.0003258471260778606 2023-01-22 16:37:35.804233: step: 716/464, loss: 0.003308718092739582 2023-01-22 16:37:36.572594: step: 718/464, loss: 0.025799686089158058 2023-01-22 16:37:37.540065: step: 720/464, loss: 0.0014121445128694177 2023-01-22 16:37:38.262646: step: 722/464, loss: 0.0010850804392248392 2023-01-22 16:37:38.977557: step: 724/464, loss: 0.027522243559360504 2023-01-22 16:37:39.774464: step: 726/464, loss: 0.015951845794916153 2023-01-22 16:37:40.487622: step: 728/464, loss: 0.006579286884516478 2023-01-22 16:37:41.237088: step: 730/464, loss: 0.021987447515130043 2023-01-22 16:37:42.036674: step: 732/464, loss: 0.009621849283576012 2023-01-22 16:37:42.676431: step: 734/464, loss: 0.006305316463112831 2023-01-22 16:37:43.422283: step: 736/464, loss: 0.0016307436162605882 2023-01-22 16:37:44.191486: step: 738/464, loss: 0.011331386864185333 2023-01-22 16:37:44.952274: step: 740/464, loss: 0.00675381300970912 2023-01-22 16:37:45.657498: step: 742/464, loss: 0.0013096178881824017 2023-01-22 16:37:46.400178: step: 744/464, loss: 0.007577427197247744 2023-01-22 16:37:47.125229: step: 746/464, loss: 0.022213483229279518 2023-01-22 16:37:47.861971: step: 748/464, loss: 0.02859911136329174 2023-01-22 16:37:48.580428: step: 750/464, loss: 0.00373831856995821 2023-01-22 
16:37:49.285085: step: 752/464, loss: 0.0947578027844429 2023-01-22 16:37:49.952755: step: 754/464, loss: 0.0009563664207234979 2023-01-22 16:37:50.625525: step: 756/464, loss: 0.022929368540644646 2023-01-22 16:37:51.369715: step: 758/464, loss: 0.005303644575178623 2023-01-22 16:37:52.247308: step: 760/464, loss: 0.002270778641104698 2023-01-22 16:37:52.969770: step: 762/464, loss: 0.04593295603990555 2023-01-22 16:37:53.698281: step: 764/464, loss: 0.00035218169796280563 2023-01-22 16:37:54.420420: step: 766/464, loss: 0.00043932118569500744 2023-01-22 16:37:55.147989: step: 768/464, loss: 0.01299199927598238 2023-01-22 16:37:55.915594: step: 770/464, loss: 0.00485680066049099 2023-01-22 16:37:56.599681: step: 772/464, loss: 0.005588888190686703 2023-01-22 16:37:57.372527: step: 774/464, loss: 6.823728654126171e-06 2023-01-22 16:37:58.126836: step: 776/464, loss: 0.007791449781507254 2023-01-22 16:37:58.845627: step: 778/464, loss: 0.0028351175133138895 2023-01-22 16:37:59.528390: step: 780/464, loss: 0.001013401080854237 2023-01-22 16:38:00.374048: step: 782/464, loss: 0.00967742782086134 2023-01-22 16:38:01.001411: step: 784/464, loss: 0.002139471471309662 2023-01-22 16:38:01.678563: step: 786/464, loss: 0.014759652316570282 2023-01-22 16:38:02.422891: step: 788/464, loss: 0.024528315290808678 2023-01-22 16:38:03.149389: step: 790/464, loss: 0.0002210488310083747 2023-01-22 16:38:03.846358: step: 792/464, loss: 0.07229010760784149 2023-01-22 16:38:04.604991: step: 794/464, loss: 0.005594093352556229 2023-01-22 16:38:05.325664: step: 796/464, loss: 0.00963997095823288 2023-01-22 16:38:06.008544: step: 798/464, loss: 0.00913211889564991 2023-01-22 16:38:06.809828: step: 800/464, loss: 0.014080382883548737 2023-01-22 16:38:07.633254: step: 802/464, loss: 0.010862361639738083 2023-01-22 16:38:08.366832: step: 804/464, loss: 0.15599840879440308 2023-01-22 16:38:09.126189: step: 806/464, loss: 0.05338187515735626 2023-01-22 16:38:09.756053: step: 808/464, loss: 
0.00022713113867212087 2023-01-22 16:38:10.449590: step: 810/464, loss: 0.007019891869276762 2023-01-22 16:38:11.256031: step: 812/464, loss: 0.033112138509750366 2023-01-22 16:38:12.146249: step: 814/464, loss: 0.021351516246795654 2023-01-22 16:38:12.838202: step: 816/464, loss: 0.004479506053030491 2023-01-22 16:38:13.547498: step: 818/464, loss: 0.0241513904184103 2023-01-22 16:38:14.198492: step: 820/464, loss: 0.001121489447541535 2023-01-22 16:38:14.931138: step: 822/464, loss: 0.0023313446436077356 2023-01-22 16:38:15.782093: step: 824/464, loss: 0.009491167962551117 2023-01-22 16:38:16.478022: step: 826/464, loss: 0.002323998138308525 2023-01-22 16:38:17.240974: step: 828/464, loss: 0.012958469800651073 2023-01-22 16:38:17.923387: step: 830/464, loss: 0.008748810738325119 2023-01-22 16:38:18.769805: step: 832/464, loss: 0.09143657237291336 2023-01-22 16:38:19.460747: step: 834/464, loss: 8.623718895250931e-05 2023-01-22 16:38:20.293660: step: 836/464, loss: 0.044922150671482086 2023-01-22 16:38:20.987375: step: 838/464, loss: 0.0027205843944102526 2023-01-22 16:38:21.743873: step: 840/464, loss: 0.0037673949263989925 2023-01-22 16:38:22.509755: step: 842/464, loss: 0.014279971830546856 2023-01-22 16:38:23.221306: step: 844/464, loss: 0.005875707138329744 2023-01-22 16:38:24.100899: step: 846/464, loss: 1.09738028049469 2023-01-22 16:38:24.795783: step: 848/464, loss: 0.001458100276067853 2023-01-22 16:38:25.559144: step: 850/464, loss: 0.004742420744150877 2023-01-22 16:38:26.391188: step: 852/464, loss: 0.015593930147588253 2023-01-22 16:38:27.145220: step: 854/464, loss: 0.0016010843683034182 2023-01-22 16:38:27.897412: step: 856/464, loss: 0.004820770584046841 2023-01-22 16:38:28.606501: step: 858/464, loss: 0.0033338917419314384 2023-01-22 16:38:29.325136: step: 860/464, loss: 0.025021173059940338 2023-01-22 16:38:30.007572: step: 862/464, loss: 0.0010791352251544595 2023-01-22 16:38:30.685852: step: 864/464, loss: 2.1274170875549316 2023-01-22 
16:38:31.333306: step: 866/464, loss: 0.0003470660303719342 2023-01-22 16:38:32.094015: step: 868/464, loss: 0.001641890499740839 2023-01-22 16:38:32.743704: step: 870/464, loss: 0.00018000038107857108 2023-01-22 16:38:33.461301: step: 872/464, loss: 0.0009995194850489497 2023-01-22 16:38:34.221055: step: 874/464, loss: 0.0012401562416926026 2023-01-22 16:38:34.921032: step: 876/464, loss: 0.0017109549371525645 2023-01-22 16:38:35.610166: step: 878/464, loss: 0.010825077071785927 2023-01-22 16:38:36.361141: step: 880/464, loss: 0.59823077917099 2023-01-22 16:38:37.027752: step: 882/464, loss: 0.0009296668577007949 2023-01-22 16:38:37.706998: step: 884/464, loss: 0.00184035359416157 2023-01-22 16:38:38.468045: step: 886/464, loss: 0.031071102246642113 2023-01-22 16:38:39.299295: step: 888/464, loss: 0.0023760192561894655 2023-01-22 16:38:40.013518: step: 890/464, loss: 0.004190961830317974 2023-01-22 16:38:40.726449: step: 892/464, loss: 0.0033965427428483963 2023-01-22 16:38:41.494227: step: 894/464, loss: 0.0021972153335809708 2023-01-22 16:38:42.221115: step: 896/464, loss: 0.00022150274890009314 2023-01-22 16:38:42.968281: step: 898/464, loss: 0.02371404506266117 2023-01-22 16:38:43.743158: step: 900/464, loss: 0.0433046817779541 2023-01-22 16:38:44.482800: step: 902/464, loss: 0.010654138401150703 2023-01-22 16:38:45.206034: step: 904/464, loss: 0.00342572876252234 2023-01-22 16:38:45.973722: step: 906/464, loss: 0.0011897621443495154 2023-01-22 16:38:46.755298: step: 908/464, loss: 0.04091949760913849 2023-01-22 16:38:47.440927: step: 910/464, loss: 0.008571179583668709 2023-01-22 16:38:48.159216: step: 912/464, loss: 0.006447421386837959 2023-01-22 16:38:48.851069: step: 914/464, loss: 0.00698639964684844 2023-01-22 16:38:49.556267: step: 916/464, loss: 0.004937224555760622 2023-01-22 16:38:50.266719: step: 918/464, loss: 0.017821161076426506 2023-01-22 16:38:51.034395: step: 920/464, loss: 0.2027270495891571 2023-01-22 16:38:51.741895: step: 922/464, loss: 
0.002561023458838463
2023-01-22 16:38:52.569408: step: 924/464, loss: 0.4017066955566406
2023-01-22 16:38:53.250780: step: 926/464, loss: 0.0035937901120632887
2023-01-22 16:38:53.987614: step: 928/464, loss: 0.004983225371688604
2023-01-22 16:38:54.651816: step: 930/464, loss: 0.021204371005296707
==================================================
Loss: 0.047
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3096353737007095, 'r': 0.3237364153872693, 'f1': 0.31652892561983476}, 'combined': 0.23323183993040456, 'epoch': 37}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30586635253091304, 'r': 0.27954550913804516, 'f1': 0.29211422195200376}, 'combined': 0.18141830626492866, 'epoch': 37}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2966524562394128, 'r': 0.3236720347963232, 'f1': 0.3095737973460297}, 'combined': 0.22810700857075872, 'epoch': 37}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29655483359692925, 'r': 0.2736752321918249, 'f1': 0.28465602854520056}, 'combined': 0.17678637562280877, 'epoch': 37}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3104519039969763, 'r': 0.3281246879057226, 'f1': 0.31904374635851623}, 'combined': 0.23508486573785406, 'epoch': 37}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3187482089388679, 'r': 0.2856437955475908, 'f1': 0.3012893868530144}, 'combined': 0.1871165665718721, 'epoch': 37}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2984375, 'r': 0.3410714285714286, 'f1': 0.31833333333333336}, 'combined': 0.21222222222222223, 'epoch': 37}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2894736842105263, 'r': 0.4782608695652174, 'f1': 0.360655737704918}, 'combined': 0.180327868852459, 'epoch': 37}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.41875, 'r': 0.28879310344827586, 'f1': 0.34183673469387754}, 'combined': 0.227891156462585, 'epoch': 37}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 38
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:41:33.817819: step: 2/464, loss: 0.004826576914638281
2023-01-22 16:41:34.569772: step: 4/464, loss: 0.00030840112594887614
2023-01-22 16:41:35.396639: step: 6/464, loss: 0.002776885172352195
2023-01-22 16:41:36.156942: step: 8/464, loss: 0.005393106956034899
2023-01-22 16:41:36.945015: step: 10/464, loss: 0.00719247292727232
2023-01-22 16:41:37.680703: step: 12/464, loss: 0.017119480296969414
2023-01-22 16:41:38.389432: step: 14/464, loss: 0.001992583042010665
2023-01-22 16:41:39.110289: step: 16/464, loss: 0.004189030267298222
2023-01-22 16:41:39.763240: step: 18/464, loss: 0.009218255989253521
2023-01-22 16:41:40.521909: step: 20/464, loss: 0.002750575076788664
2023-01-22 16:41:41.222110: step: 22/464, loss: 0.0023512584157288074
2023-01-22 16:41:42.043770: step: 24/464, loss: 0.013374313712120056
2023-01-22 16:41:42.769157: step: 26/464, loss: 0.0004921825020574033
2023-01-22 16:41:43.523165: step: 28/464, loss: 0.02349039912223816
2023-01-22 16:41:44.284787: step: 30/464, loss: 0.004341502673923969
2023-01-22 16:41:44.973598: step: 32/464, loss: 0.010627289302647114
2023-01-22 16:41:45.691259: step: 34/464, loss: 0.00019175864872522652
2023-01-22 16:41:46.390925: step: 36/464, loss: 0.001980855828151107
2023-01-22 16:41:47.201123: step: 38/464, loss: 3.581074270186946e-05
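The per-language scores in the evaluation summaries above follow a consistent pattern: each `f1` is the harmonic mean of its `p` and `r`, and `combined` is the product of the template F1 and the slot F1 (e.g. Dev Chinese, epoch 37: 0.7368421 x 0.3165289 = 0.2332318). A minimal sketch of that aggregation, assuming only this observed pattern; the function names are illustrative, not from `train.py`:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """'combined' in the log is template F1 multiplied by slot F1."""
    return template['f1'] * slot['f1']

# Dev Chinese, epoch 37 (p/r values copied from the log above):
template = {'p': 1.0, 'r': 0.5833333333333334}
template['f1'] = f1(template['p'], template['r'])
slot = {'p': 0.3096353737007095, 'r': 0.3237364153872693}
slot['f1'] = f1(slot['p'], slot['r'])
print(round(combined_score(template, slot), 6))  # → 0.233232
```

This reproduces the logged values: `f1` comes out as 0.7368421052631579 (template) and 0.31652892561983476 (slot), and `combined` as 0.23323183993040456.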
2023-01-22 16:41:47.971542: step: 40/464, loss: 0.001287630875594914 2023-01-22 16:41:48.675102: step: 42/464, loss: 0.005695367231965065 2023-01-22 16:41:49.376023: step: 44/464, loss: 0.00015220152272377163 2023-01-22 16:41:50.123016: step: 46/464, loss: 0.0023461703676730394 2023-01-22 16:41:50.824714: step: 48/464, loss: 0.0034109558910131454 2023-01-22 16:41:51.574994: step: 50/464, loss: 3.344095966895111e-05 2023-01-22 16:41:52.284580: step: 52/464, loss: 0.0007164563285186887 2023-01-22 16:41:53.057484: step: 54/464, loss: 0.020312059670686722 2023-01-22 16:41:53.809847: step: 56/464, loss: 0.008092692121863365 2023-01-22 16:41:54.584944: step: 58/464, loss: 0.11512520164251328 2023-01-22 16:41:55.251447: step: 60/464, loss: 4.2281106289010495e-05 2023-01-22 16:41:55.965834: step: 62/464, loss: 0.015612255781888962 2023-01-22 16:41:56.701350: step: 64/464, loss: 0.01599789410829544 2023-01-22 16:41:57.406540: step: 66/464, loss: 0.0010073683224618435 2023-01-22 16:41:58.208295: step: 68/464, loss: 2.1914695025770925e-05 2023-01-22 16:41:58.894889: step: 70/464, loss: 8.730077388463542e-05 2023-01-22 16:41:59.673141: step: 72/464, loss: 8.373921446036547e-05 2023-01-22 16:42:00.446849: step: 74/464, loss: 0.004494973458349705 2023-01-22 16:42:01.171447: step: 76/464, loss: 0.0002359635109314695 2023-01-22 16:42:01.933219: step: 78/464, loss: 0.11488990485668182 2023-01-22 16:42:02.750269: step: 80/464, loss: 0.00331053021363914 2023-01-22 16:42:03.543054: step: 82/464, loss: 0.08849333971738815 2023-01-22 16:42:04.270304: step: 84/464, loss: 0.0001819575554691255 2023-01-22 16:42:05.048158: step: 86/464, loss: 0.0004114827897865325 2023-01-22 16:42:05.752361: step: 88/464, loss: 0.014233220368623734 2023-01-22 16:42:06.566626: step: 90/464, loss: 0.0013636646326631308 2023-01-22 16:42:07.303742: step: 92/464, loss: 0.0010587204014882445 2023-01-22 16:42:08.125538: step: 94/464, loss: 0.005007532890886068 2023-01-22 16:42:08.956445: step: 96/464, loss: 
0.0003044283948838711 2023-01-22 16:42:09.652681: step: 98/464, loss: 0.6308871507644653 2023-01-22 16:42:10.436845: step: 100/464, loss: 0.01804986223578453 2023-01-22 16:42:11.194034: step: 102/464, loss: 0.0004457629984244704 2023-01-22 16:42:12.085636: step: 104/464, loss: 8.525677549187094e-05 2023-01-22 16:42:12.827580: step: 106/464, loss: 0.05905519053339958 2023-01-22 16:42:13.726601: step: 108/464, loss: 0.019224489107728004 2023-01-22 16:42:14.472727: step: 110/464, loss: 0.13390938937664032 2023-01-22 16:42:15.193772: step: 112/464, loss: 0.00015067239291965961 2023-01-22 16:42:15.910604: step: 114/464, loss: 0.0028867078945040703 2023-01-22 16:42:16.630412: step: 116/464, loss: 0.0004979309160262346 2023-01-22 16:42:17.410983: step: 118/464, loss: 0.0012932976242154837 2023-01-22 16:42:18.142258: step: 120/464, loss: 0.00414996175095439 2023-01-22 16:42:18.915968: step: 122/464, loss: 1.270747361559188e-05 2023-01-22 16:42:19.653764: step: 124/464, loss: 0.0025079974438995123 2023-01-22 16:42:20.517133: step: 126/464, loss: 0.3264477252960205 2023-01-22 16:42:21.231863: step: 128/464, loss: 0.01213053148239851 2023-01-22 16:42:21.887980: step: 130/464, loss: 0.0013147556455805898 2023-01-22 16:42:22.675005: step: 132/464, loss: 0.011113953776657581 2023-01-22 16:42:23.410556: step: 134/464, loss: 0.005514933727681637 2023-01-22 16:42:24.116506: step: 136/464, loss: 0.023089662194252014 2023-01-22 16:42:24.940693: step: 138/464, loss: 0.10404906421899796 2023-01-22 16:42:25.697973: step: 140/464, loss: 0.035954758524894714 2023-01-22 16:42:26.414590: step: 142/464, loss: 0.0005670940154232085 2023-01-22 16:42:27.196327: step: 144/464, loss: 0.009082665666937828 2023-01-22 16:42:28.073361: step: 146/464, loss: 0.012343711219727993 2023-01-22 16:42:28.826343: step: 148/464, loss: 6.339289029710926e-06 2023-01-22 16:42:29.547138: step: 150/464, loss: 0.0026830360293388367 2023-01-22 16:42:30.277720: step: 152/464, loss: 0.00025342742446810007 2023-01-22 
16:42:30.923423: step: 154/464, loss: 0.003277807030826807 2023-01-22 16:42:31.688993: step: 156/464, loss: 0.0008476334623992443 2023-01-22 16:42:32.418622: step: 158/464, loss: 0.007485872600227594 2023-01-22 16:42:33.145079: step: 160/464, loss: 0.9153180718421936 2023-01-22 16:42:33.883281: step: 162/464, loss: 0.003973011393100023 2023-01-22 16:42:34.623147: step: 164/464, loss: 0.01418553851544857 2023-01-22 16:42:35.289084: step: 166/464, loss: 0.0006633122102357447 2023-01-22 16:42:36.138319: step: 168/464, loss: 0.12291660159826279 2023-01-22 16:42:36.875999: step: 170/464, loss: 2.9597180400742218e-05 2023-01-22 16:42:37.644848: step: 172/464, loss: 0.00031909276731312275 2023-01-22 16:42:38.468388: step: 174/464, loss: 0.0021952923852950335 2023-01-22 16:42:39.184174: step: 176/464, loss: 0.8404306769371033 2023-01-22 16:42:39.978010: step: 178/464, loss: 0.0026680396404117346 2023-01-22 16:42:40.763144: step: 180/464, loss: 0.1378488689661026 2023-01-22 16:42:41.468029: step: 182/464, loss: 0.0038573513738811016 2023-01-22 16:42:42.172509: step: 184/464, loss: 1.6524886632396374e-06 2023-01-22 16:42:42.865655: step: 186/464, loss: 0.0010226487647742033 2023-01-22 16:42:43.601297: step: 188/464, loss: 0.0053894431330263615 2023-01-22 16:42:44.318128: step: 190/464, loss: 0.0068910811096429825 2023-01-22 16:42:45.075662: step: 192/464, loss: 0.014292379841208458 2023-01-22 16:42:45.768160: step: 194/464, loss: 0.015958290547132492 2023-01-22 16:42:46.472807: step: 196/464, loss: 0.003270985558629036 2023-01-22 16:42:47.351552: step: 198/464, loss: 0.019883891567587852 2023-01-22 16:42:48.113851: step: 200/464, loss: 0.0022445889189839363 2023-01-22 16:42:48.823496: step: 202/464, loss: 0.0007975028711371124 2023-01-22 16:42:49.552228: step: 204/464, loss: 0.00023076686193235219 2023-01-22 16:42:50.248162: step: 206/464, loss: 0.11212483048439026 2023-01-22 16:42:50.904110: step: 208/464, loss: 0.0010478850454092026 2023-01-22 16:42:51.648922: step: 
210/464, loss: 0.03414137288928032 2023-01-22 16:42:52.361598: step: 212/464, loss: 0.056933943182229996 2023-01-22 16:42:53.138979: step: 214/464, loss: 0.00579112721607089 2023-01-22 16:42:53.844219: step: 216/464, loss: 0.015554566867649555 2023-01-22 16:42:54.572431: step: 218/464, loss: 0.004184139892458916 2023-01-22 16:42:55.234807: step: 220/464, loss: 0.05660339072346687 2023-01-22 16:42:55.985382: step: 222/464, loss: 0.012609807774424553 2023-01-22 16:42:56.765310: step: 224/464, loss: 0.036066535860300064 2023-01-22 16:42:57.508322: step: 226/464, loss: 0.003133221063762903 2023-01-22 16:42:58.196960: step: 228/464, loss: 0.0016689972253516316 2023-01-22 16:42:58.940996: step: 230/464, loss: 0.026168620213866234 2023-01-22 16:42:59.725153: step: 232/464, loss: 0.029742280021309853 2023-01-22 16:43:00.509946: step: 234/464, loss: 0.11596041917800903 2023-01-22 16:43:01.266387: step: 236/464, loss: 0.010135364718735218 2023-01-22 16:43:02.087453: step: 238/464, loss: 0.03542346879839897 2023-01-22 16:43:02.900142: step: 240/464, loss: 0.0002479084942024201 2023-01-22 16:43:03.713198: step: 242/464, loss: 8.900500688469037e-05 2023-01-22 16:43:04.500554: step: 244/464, loss: 0.007964391261339188 2023-01-22 16:43:05.263580: step: 246/464, loss: 0.04299350827932358 2023-01-22 16:43:05.997753: step: 248/464, loss: 0.024968914687633514 2023-01-22 16:43:06.697275: step: 250/464, loss: 0.001517502241767943 2023-01-22 16:43:07.475257: step: 252/464, loss: 0.0010750810615718365 2023-01-22 16:43:08.348094: step: 254/464, loss: 0.00852601882070303 2023-01-22 16:43:09.077930: step: 256/464, loss: 0.0008214469416998327 2023-01-22 16:43:09.801975: step: 258/464, loss: 0.0049279495142400265 2023-01-22 16:43:10.576731: step: 260/464, loss: 0.03465091809630394 2023-01-22 16:43:11.314229: step: 262/464, loss: 0.000700597302056849 2023-01-22 16:43:12.068560: step: 264/464, loss: 0.00017623744497541338 2023-01-22 16:43:12.804147: step: 266/464, loss: 0.001508101588115096 
2023-01-22 16:43:13.551469: step: 268/464, loss: 0.03918704390525818 2023-01-22 16:43:14.244205: step: 270/464, loss: 0.003989690914750099 2023-01-22 16:43:14.982626: step: 272/464, loss: 0.007421809248626232 2023-01-22 16:43:15.700799: step: 274/464, loss: 0.03553994372487068 2023-01-22 16:43:16.611635: step: 276/464, loss: 0.005269336514174938 2023-01-22 16:43:17.376165: step: 278/464, loss: 0.001513259601779282 2023-01-22 16:43:18.094106: step: 280/464, loss: 0.005394801031798124 2023-01-22 16:43:18.749589: step: 282/464, loss: 0.005470091011375189 2023-01-22 16:43:19.587101: step: 284/464, loss: 0.00022049256949685514 2023-01-22 16:43:20.330960: step: 286/464, loss: 1.162865555670578e-05 2023-01-22 16:43:21.058856: step: 288/464, loss: 0.00864122062921524 2023-01-22 16:43:21.808986: step: 290/464, loss: 0.03521820902824402 2023-01-22 16:43:22.548418: step: 292/464, loss: 0.007795601151883602 2023-01-22 16:43:23.326688: step: 294/464, loss: 0.04232088848948479 2023-01-22 16:43:24.086415: step: 296/464, loss: 0.0011312862625345588 2023-01-22 16:43:24.806480: step: 298/464, loss: 0.006688180845230818 2023-01-22 16:43:25.543520: step: 300/464, loss: 0.00024246759130619466 2023-01-22 16:43:26.330952: step: 302/464, loss: 0.003744281828403473 2023-01-22 16:43:27.116568: step: 304/464, loss: 0.030908262357115746 2023-01-22 16:43:27.834341: step: 306/464, loss: 0.035456642508506775 2023-01-22 16:43:28.718508: step: 308/464, loss: 0.04970543459057808 2023-01-22 16:43:29.473064: step: 310/464, loss: 0.005955439060926437 2023-01-22 16:43:30.186121: step: 312/464, loss: 0.0006838430999778211 2023-01-22 16:43:31.029967: step: 314/464, loss: 0.0007688677869737148 2023-01-22 16:43:31.815399: step: 316/464, loss: 0.0032956611830741167 2023-01-22 16:43:32.514536: step: 318/464, loss: 0.0005155724938958883 2023-01-22 16:43:33.202112: step: 320/464, loss: 0.0029878311324864626 2023-01-22 16:43:33.891057: step: 322/464, loss: 0.0009836448589339852 2023-01-22 16:43:34.669560: step: 
324/464, loss: 0.0037996331229805946 2023-01-22 16:43:35.372336: step: 326/464, loss: 5.877662624698132e-05 2023-01-22 16:43:36.193511: step: 328/464, loss: 0.006655172444880009 2023-01-22 16:43:36.957029: step: 330/464, loss: 0.0013236195081844926 2023-01-22 16:43:37.770879: step: 332/464, loss: 0.00981642585247755 2023-01-22 16:43:38.480707: step: 334/464, loss: 0.0006703927647322416 2023-01-22 16:43:39.190499: step: 336/464, loss: 0.021209033206105232 2023-01-22 16:43:40.017548: step: 338/464, loss: 0.04230085760354996 2023-01-22 16:43:40.856285: step: 340/464, loss: 0.22357164323329926 2023-01-22 16:43:41.553748: step: 342/464, loss: 0.00016751704970374703 2023-01-22 16:43:42.347084: step: 344/464, loss: 0.056093987077474594 2023-01-22 16:43:43.014987: step: 346/464, loss: 0.0025848422665148973 2023-01-22 16:43:43.835885: step: 348/464, loss: 0.0039555374532938 2023-01-22 16:43:44.556259: step: 350/464, loss: 0.0008982414146885276 2023-01-22 16:43:45.310580: step: 352/464, loss: 0.06868308782577515 2023-01-22 16:43:46.006005: step: 354/464, loss: 0.00014734716387465596 2023-01-22 16:43:46.789609: step: 356/464, loss: 0.0005575288669206202 2023-01-22 16:43:47.513486: step: 358/464, loss: 0.007786013185977936 2023-01-22 16:43:48.275381: step: 360/464, loss: 0.03875568136572838 2023-01-22 16:43:49.055474: step: 362/464, loss: 0.034470029175281525 2023-01-22 16:43:49.772415: step: 364/464, loss: 0.00023473362671211362 2023-01-22 16:43:50.515575: step: 366/464, loss: 0.014822765253484249 2023-01-22 16:43:51.193064: step: 368/464, loss: 0.00023555981169920415 2023-01-22 16:43:52.009179: step: 370/464, loss: 0.0026356929447501898 2023-01-22 16:43:52.791652: step: 372/464, loss: 0.014596613124012947 2023-01-22 16:43:53.585295: step: 374/464, loss: 9.38707817113027e-05 2023-01-22 16:43:54.340325: step: 376/464, loss: 0.007376750465482473 2023-01-22 16:43:55.014349: step: 378/464, loss: 0.00013531959848478436 2023-01-22 16:43:55.848534: step: 380/464, loss: 
0.009358882904052734 2023-01-22 16:43:56.601951: step: 382/464, loss: 0.046712055802345276 2023-01-22 16:43:57.377711: step: 384/464, loss: 0.004233027808368206 2023-01-22 16:43:58.098463: step: 386/464, loss: 4.672848808695562e-05 2023-01-22 16:43:58.786476: step: 388/464, loss: 0.0007691225619055331 2023-01-22 16:43:59.714222: step: 390/464, loss: 0.01412983052432537 2023-01-22 16:44:00.451055: step: 392/464, loss: 0.0025889184325933456 2023-01-22 16:44:01.193327: step: 394/464, loss: 0.0006169418338686228 2023-01-22 16:44:01.930116: step: 396/464, loss: 0.02392732910811901 2023-01-22 16:44:02.657207: step: 398/464, loss: 0.02258199453353882 2023-01-22 16:44:03.478922: step: 400/464, loss: 0.00559289800003171 2023-01-22 16:44:04.184525: step: 402/464, loss: 0.0742979645729065 2023-01-22 16:44:04.868357: step: 404/464, loss: 0.010385122150182724 2023-01-22 16:44:05.638329: step: 406/464, loss: 0.0037662286777049303 2023-01-22 16:44:06.403347: step: 408/464, loss: 0.04555836319923401 2023-01-22 16:44:07.162647: step: 410/464, loss: 0.006724359467625618 2023-01-22 16:44:07.904938: step: 412/464, loss: 0.01723787561058998 2023-01-22 16:44:08.537692: step: 414/464, loss: 0.0011205865303054452 2023-01-22 16:44:09.283193: step: 416/464, loss: 0.30172547698020935 2023-01-22 16:44:10.197548: step: 418/464, loss: 0.0027903704904019833 2023-01-22 16:44:10.903402: step: 420/464, loss: 0.0023601017892360687 2023-01-22 16:44:11.611987: step: 422/464, loss: 0.0002794855972751975 2023-01-22 16:44:12.578181: step: 424/464, loss: 0.13339070975780487 2023-01-22 16:44:13.246169: step: 426/464, loss: 0.003943042363971472 2023-01-22 16:44:13.979746: step: 428/464, loss: 0.00010437369928695261 2023-01-22 16:44:14.704827: step: 430/464, loss: 0.017912743613123894 2023-01-22 16:44:15.444552: step: 432/464, loss: 0.004407365340739489 2023-01-22 16:44:16.236824: step: 434/464, loss: 0.005560706369578838 2023-01-22 16:44:16.967450: step: 436/464, loss: 0.03185200318694115 2023-01-22 
16:44:17.687365: step: 438/464, loss: 0.001232458045706153 2023-01-22 16:44:18.485793: step: 440/464, loss: 0.00033859541872516274 2023-01-22 16:44:19.315637: step: 442/464, loss: 6.870036304462701e-05 2023-01-22 16:44:20.161162: step: 444/464, loss: 0.14986322820186615 2023-01-22 16:44:20.883854: step: 446/464, loss: 0.0128685487434268 2023-01-22 16:44:21.696543: step: 448/464, loss: 0.0524282306432724 2023-01-22 16:44:22.499223: step: 450/464, loss: 0.0016600600210949779 2023-01-22 16:44:23.238616: step: 452/464, loss: 0.08632011711597443 2023-01-22 16:44:23.978388: step: 454/464, loss: 0.006322702392935753 2023-01-22 16:44:24.715423: step: 456/464, loss: 0.06686757504940033 2023-01-22 16:44:25.442978: step: 458/464, loss: 0.02065141499042511 2023-01-22 16:44:26.201903: step: 460/464, loss: 0.0003768194292206317 2023-01-22 16:44:26.939133: step: 462/464, loss: 0.002721271710470319 2023-01-22 16:44:27.666903: step: 464/464, loss: 3.4734282507997705e-06 2023-01-22 16:44:28.461963: step: 466/464, loss: 6.966947694309056e-05 2023-01-22 16:44:29.200383: step: 468/464, loss: 0.03409808874130249 2023-01-22 16:44:29.961948: step: 470/464, loss: 0.008349225856363773 2023-01-22 16:44:30.726257: step: 472/464, loss: 0.00819906685501337 2023-01-22 16:44:31.434367: step: 474/464, loss: 0.003774096257984638 2023-01-22 16:44:32.229429: step: 476/464, loss: 0.0038325139321386814 2023-01-22 16:44:33.011264: step: 478/464, loss: 0.0010969663271680474 2023-01-22 16:44:33.760230: step: 480/464, loss: 0.06739410012960434 2023-01-22 16:44:34.502878: step: 482/464, loss: 0.023230241611599922 2023-01-22 16:44:35.284225: step: 484/464, loss: 0.022559460252523422 2023-01-22 16:44:35.957014: step: 486/464, loss: 0.041261497884988785 2023-01-22 16:44:36.651844: step: 488/464, loss: 0.14989309012889862 2023-01-22 16:44:37.419663: step: 490/464, loss: 0.0018158291932195425 2023-01-22 16:44:38.107853: step: 492/464, loss: 0.0007690245984122157 2023-01-22 16:44:38.767003: step: 494/464, loss: 
0.0004796649154741317 2023-01-22 16:44:39.521561: step: 496/464, loss: 0.02845880761742592 2023-01-22 16:44:40.268384: step: 498/464, loss: 0.0004121836391277611 2023-01-22 16:44:40.995045: step: 500/464, loss: 6.842023867648095e-05 2023-01-22 16:44:41.714600: step: 502/464, loss: 0.037237752228975296 2023-01-22 16:44:42.491850: step: 504/464, loss: 0.01267719455063343 2023-01-22 16:44:43.278737: step: 506/464, loss: 0.024383530020713806 2023-01-22 16:44:44.117499: step: 508/464, loss: 0.0009457824053242803 2023-01-22 16:44:44.896225: step: 510/464, loss: 0.006350579205900431 2023-01-22 16:44:45.633062: step: 512/464, loss: 0.024768246337771416 2023-01-22 16:44:46.377658: step: 514/464, loss: 0.0007438718457706273 2023-01-22 16:44:47.217119: step: 516/464, loss: 0.008646605536341667 2023-01-22 16:44:48.038867: step: 518/464, loss: 0.0017969388281926513 2023-01-22 16:44:48.816398: step: 520/464, loss: 0.008778219111263752 2023-01-22 16:44:49.564930: step: 522/464, loss: 0.00014523882418870926 2023-01-22 16:44:50.378516: step: 524/464, loss: 0.0007008722168393433 2023-01-22 16:44:51.152363: step: 526/464, loss: 0.015383994206786156 2023-01-22 16:44:51.849842: step: 528/464, loss: 0.035820845514535904 2023-01-22 16:44:52.535797: step: 530/464, loss: 0.012869827449321747 2023-01-22 16:44:53.325165: step: 532/464, loss: 0.03324505686759949 2023-01-22 16:44:54.115251: step: 534/464, loss: 0.000605732318945229 2023-01-22 16:44:54.862806: step: 536/464, loss: 0.14915016293525696 2023-01-22 16:44:55.643840: step: 538/464, loss: 0.01700645498931408 2023-01-22 16:44:56.435044: step: 540/464, loss: 0.008300750516355038 2023-01-22 16:44:57.269094: step: 542/464, loss: 0.020550068467855453 2023-01-22 16:44:58.049885: step: 544/464, loss: 0.007534760981798172 2023-01-22 16:44:58.761718: step: 546/464, loss: 0.03769280016422272 2023-01-22 16:44:59.583747: step: 548/464, loss: 0.02654329501092434 2023-01-22 16:45:00.274260: step: 550/464, loss: 0.00033084870665334165 2023-01-22 
16:45:01.099059: step: 552/464, loss: 0.0725986659526825 2023-01-22 16:45:01.813507: step: 554/464, loss: 0.005369146820157766 2023-01-22 16:45:02.542717: step: 556/464, loss: 0.004675235599279404 2023-01-22 16:45:03.283665: step: 558/464, loss: 0.03285602480173111 2023-01-22 16:45:04.039782: step: 560/464, loss: 0.00970652885735035 2023-01-22 16:45:04.856026: step: 562/464, loss: 0.0004878818872384727 2023-01-22 16:45:05.650014: step: 564/464, loss: 0.003205288900062442 2023-01-22 16:45:06.287722: step: 566/464, loss: 0.0010604523122310638 2023-01-22 16:45:06.985887: step: 568/464, loss: 0.007997574284672737 2023-01-22 16:45:07.727302: step: 570/464, loss: 0.004095721058547497 2023-01-22 16:45:08.512331: step: 572/464, loss: 0.376163125038147 2023-01-22 16:45:09.223970: step: 574/464, loss: 0.004886925686150789 2023-01-22 16:45:10.002080: step: 576/464, loss: 0.0008122111321426928 2023-01-22 16:45:10.759106: step: 578/464, loss: 0.0018273144960403442 2023-01-22 16:45:11.494667: step: 580/464, loss: 0.00041048970888368785 2023-01-22 16:45:12.267020: step: 582/464, loss: 0.0002744023222476244 2023-01-22 16:45:12.988436: step: 584/464, loss: 0.000339846417773515 2023-01-22 16:45:13.680561: step: 586/464, loss: 0.05465429648756981 2023-01-22 16:45:14.385994: step: 588/464, loss: 0.006726523395627737 2023-01-22 16:45:15.103956: step: 590/464, loss: 0.005674843676388264 2023-01-22 16:45:15.950816: step: 592/464, loss: 0.0012488930951803923 2023-01-22 16:45:16.635755: step: 594/464, loss: 7.244118023663759e-05 2023-01-22 16:45:17.336780: step: 596/464, loss: 0.09878098219633102 2023-01-22 16:45:18.111594: step: 598/464, loss: 0.004160263109952211 2023-01-22 16:45:18.836438: step: 600/464, loss: 0.006494167726486921 2023-01-22 16:45:19.690216: step: 602/464, loss: 0.0021416889503598213 2023-01-22 16:45:20.483997: step: 604/464, loss: 0.030240392312407494 2023-01-22 16:45:21.120887: step: 606/464, loss: 0.0015008965274319053 2023-01-22 16:45:21.854799: step: 608/464, loss: 
0.0006604917580261827 2023-01-22 16:45:22.533612: step: 610/464, loss: 0.0010907500982284546 2023-01-22 16:45:23.331466: step: 612/464, loss: 0.003240359015762806 2023-01-22 16:45:24.043777: step: 614/464, loss: 0.002071299823001027 2023-01-22 16:45:24.752758: step: 616/464, loss: 0.06142830848693848 2023-01-22 16:45:25.544349: step: 618/464, loss: 0.014102988876402378 2023-01-22 16:45:26.192762: step: 620/464, loss: 0.003527525346726179 2023-01-22 16:45:26.823933: step: 622/464, loss: 0.007172831334173679 2023-01-22 16:45:27.645029: step: 624/464, loss: 0.009524202905595303 2023-01-22 16:45:28.505771: step: 626/464, loss: 0.012650671415030956 2023-01-22 16:45:29.223777: step: 628/464, loss: 0.023050807416439056 2023-01-22 16:45:29.991769: step: 630/464, loss: 0.00029439141508191824 2023-01-22 16:45:30.726201: step: 632/464, loss: 0.0003891981323249638 2023-01-22 16:45:31.430588: step: 634/464, loss: 0.06722453981637955 2023-01-22 16:45:32.290652: step: 636/464, loss: 0.006464850623160601 2023-01-22 16:45:33.038613: step: 638/464, loss: 0.019264616072177887 2023-01-22 16:45:33.762739: step: 640/464, loss: 0.03517953306436539 2023-01-22 16:45:34.500498: step: 642/464, loss: 0.0011486627627164125 2023-01-22 16:45:35.246154: step: 644/464, loss: 0.0007427395903505385 2023-01-22 16:45:35.959247: step: 646/464, loss: 0.014897801913321018 2023-01-22 16:45:36.707319: step: 648/464, loss: 0.000844307302031666 2023-01-22 16:45:37.399159: step: 650/464, loss: 0.31130504608154297 2023-01-22 16:45:38.213763: step: 652/464, loss: 0.007045819889754057 2023-01-22 16:45:38.939284: step: 654/464, loss: 0.006329555530101061 2023-01-22 16:45:39.736158: step: 656/464, loss: 0.20530587434768677 2023-01-22 16:45:40.508719: step: 658/464, loss: 0.013408103957772255 2023-01-22 16:45:41.233466: step: 660/464, loss: 0.0012683480745181441 2023-01-22 16:45:41.990918: step: 662/464, loss: 0.0009223067318089306 2023-01-22 16:45:42.786153: step: 664/464, loss: 0.011936173774302006 2023-01-22 
16:45:43.492983: step: 666/464, loss: 0.00199041492305696 2023-01-22 16:45:44.221203: step: 668/464, loss: 0.000733358261641115 2023-01-22 16:45:44.953250: step: 670/464, loss: 0.0031419494189321995 2023-01-22 16:45:45.728177: step: 672/464, loss: 0.0013263403670862317 2023-01-22 16:45:46.443822: step: 674/464, loss: 0.046067021787166595 2023-01-22 16:45:47.199154: step: 676/464, loss: 0.0020457401406019926 2023-01-22 16:45:47.988995: step: 678/464, loss: 0.02471664361655712 2023-01-22 16:45:48.788368: step: 680/464, loss: 0.022153720259666443 2023-01-22 16:45:49.526521: step: 682/464, loss: 0.0025153113529086113 2023-01-22 16:45:50.251716: step: 684/464, loss: 0.14750951528549194 2023-01-22 16:45:51.056014: step: 686/464, loss: 0.01097826100885868 2023-01-22 16:45:51.902293: step: 688/464, loss: 0.00044294664985500276 2023-01-22 16:45:52.730017: step: 690/464, loss: 0.010748908855021 2023-01-22 16:45:53.531292: step: 692/464, loss: 0.009270434267818928 2023-01-22 16:45:54.267240: step: 694/464, loss: 0.019096214324235916 2023-01-22 16:45:55.042912: step: 696/464, loss: 0.001764296437613666 2023-01-22 16:45:55.776150: step: 698/464, loss: 0.00897511001676321 2023-01-22 16:45:56.609184: step: 700/464, loss: 0.004141754005104303 2023-01-22 16:45:57.396939: step: 702/464, loss: 0.02591550722718239 2023-01-22 16:45:58.143389: step: 704/464, loss: 0.015690796077251434 2023-01-22 16:45:58.844623: step: 706/464, loss: 0.0023084969725459814 2023-01-22 16:45:59.559373: step: 708/464, loss: 0.000845626404043287 2023-01-22 16:46:00.326580: step: 710/464, loss: 0.0013910304987803102 2023-01-22 16:46:01.117521: step: 712/464, loss: 0.04212622344493866 2023-01-22 16:46:01.839148: step: 714/464, loss: 0.0007252685609273612 2023-01-22 16:46:02.657253: step: 716/464, loss: 0.012809190899133682 2023-01-22 16:46:03.459559: step: 718/464, loss: 0.004088373389095068 2023-01-22 16:46:04.151869: step: 720/464, loss: 0.01290807407349348 2023-01-22 16:46:04.936772: step: 722/464, loss: 
0.09836144000291824 2023-01-22 16:46:05.749677: step: 724/464, loss: 0.012062566354870796 2023-01-22 16:46:06.493673: step: 726/464, loss: 0.01598495803773403 2023-01-22 16:46:07.272255: step: 728/464, loss: 0.03479862958192825 2023-01-22 16:46:08.007699: step: 730/464, loss: 0.013840602710843086 2023-01-22 16:46:08.742109: step: 732/464, loss: 0.006491428706794977 2023-01-22 16:46:09.463979: step: 734/464, loss: 0.00132731010671705 2023-01-22 16:46:10.193456: step: 736/464, loss: 0.006518483627587557 2023-01-22 16:46:10.954452: step: 738/464, loss: 0.0020145985763520002 2023-01-22 16:46:11.611179: step: 740/464, loss: 0.14279569685459137 2023-01-22 16:46:12.349234: step: 742/464, loss: 6.712900358252227e-05 2023-01-22 16:46:13.049334: step: 744/464, loss: 0.019915182143449783 2023-01-22 16:46:13.750541: step: 746/464, loss: 0.002811913378536701 2023-01-22 16:46:14.355299: step: 748/464, loss: 0.007623051758855581 2023-01-22 16:46:15.124786: step: 750/464, loss: 9.78571260930039e-05 2023-01-22 16:46:15.809178: step: 752/464, loss: 0.008471081033349037 2023-01-22 16:46:16.499503: step: 754/464, loss: 0.0037404377944767475 2023-01-22 16:46:17.247641: step: 756/464, loss: 0.00015219133638311177 2023-01-22 16:46:17.992976: step: 758/464, loss: 2.966730244224891e-05 2023-01-22 16:46:18.864672: step: 760/464, loss: 0.01211349293589592 2023-01-22 16:46:19.638514: step: 762/464, loss: 0.0003824532323051244 2023-01-22 16:46:20.366490: step: 764/464, loss: 0.0011139989364892244 2023-01-22 16:46:21.106773: step: 766/464, loss: 0.008691009134054184 2023-01-22 16:46:21.919676: step: 768/464, loss: 0.0024587425868958235 2023-01-22 16:46:22.685111: step: 770/464, loss: 0.0020354236476123333 2023-01-22 16:46:23.370166: step: 772/464, loss: 0.0019010152900591493 2023-01-22 16:46:24.241838: step: 774/464, loss: 0.002046798123046756 2023-01-22 16:46:25.024876: step: 776/464, loss: 0.0019110854482278228 2023-01-22 16:46:25.747027: step: 778/464, loss: 0.010291693732142448 2023-01-22 
16:46:26.503324: step: 780/464, loss: 0.0003693166945595294 2023-01-22 16:46:27.204477: step: 782/464, loss: 0.002170081250369549 2023-01-22 16:46:27.939347: step: 784/464, loss: 0.05067446082830429 2023-01-22 16:46:28.646335: step: 786/464, loss: 0.006645172368735075 2023-01-22 16:46:29.354841: step: 788/464, loss: 0.01579456590116024 2023-01-22 16:46:30.032967: step: 790/464, loss: 0.0007638560491614044 2023-01-22 16:46:30.760786: step: 792/464, loss: 0.00043909618398174644 2023-01-22 16:46:31.496517: step: 794/464, loss: 0.0005379181820899248 2023-01-22 16:46:32.289806: step: 796/464, loss: 0.022501660510897636 2023-01-22 16:46:32.975794: step: 798/464, loss: 0.0009315320639871061 2023-01-22 16:46:33.740081: step: 800/464, loss: 0.1112055554986 2023-01-22 16:46:34.518957: step: 802/464, loss: 0.05598204582929611 2023-01-22 16:46:35.219293: step: 804/464, loss: 0.014602440409362316 2023-01-22 16:46:36.099889: step: 806/464, loss: 0.18087033927440643 2023-01-22 16:46:36.869679: step: 808/464, loss: 0.012002396397292614 2023-01-22 16:46:37.720532: step: 810/464, loss: 0.0077812401577830315 2023-01-22 16:46:38.476886: step: 812/464, loss: 9.820223203860223e-05 2023-01-22 16:46:39.247401: step: 814/464, loss: 0.0015867205802351236 2023-01-22 16:46:40.010926: step: 816/464, loss: 0.006540404632687569 2023-01-22 16:46:40.785912: step: 818/464, loss: 0.00014760888007003814 2023-01-22 16:46:41.616481: step: 820/464, loss: 0.002761528827250004 2023-01-22 16:46:42.359602: step: 822/464, loss: 0.03273645043373108 2023-01-22 16:46:43.109099: step: 824/464, loss: 0.003857521340250969 2023-01-22 16:46:43.851279: step: 826/464, loss: 0.0049551865085959435 2023-01-22 16:46:44.610246: step: 828/464, loss: 0.00504017248749733 2023-01-22 16:46:45.416395: step: 830/464, loss: 0.011408940888941288 2023-01-22 16:46:46.185499: step: 832/464, loss: 0.010786442086100578 2023-01-22 16:46:46.991350: step: 834/464, loss: 0.002086550695821643 2023-01-22 16:46:47.691567: step: 836/464, loss: 
0.0010599680244922638 2023-01-22 16:46:48.441561: step: 838/464, loss: 0.0013713724911212921 2023-01-22 16:46:49.130669: step: 840/464, loss: 0.0037390729412436485 2023-01-22 16:46:49.826536: step: 842/464, loss: 0.0004886464448645711 2023-01-22 16:46:50.613806: step: 844/464, loss: 0.000662407313939184 2023-01-22 16:46:51.401039: step: 846/464, loss: 0.0027486770413815975 2023-01-22 16:46:52.131929: step: 848/464, loss: 0.014897621236741543 2023-01-22 16:46:52.899946: step: 850/464, loss: 0.03529081121087074 2023-01-22 16:46:53.670634: step: 852/464, loss: 0.0020230154041200876 2023-01-22 16:46:54.470323: step: 854/464, loss: 0.0013419726165011525 2023-01-22 16:46:55.232393: step: 856/464, loss: 0.0024132877588272095 2023-01-22 16:46:55.924770: step: 858/464, loss: 7.780754094710574e-05 2023-01-22 16:46:56.646901: step: 860/464, loss: 0.02189820446074009 2023-01-22 16:46:57.437452: step: 862/464, loss: 0.018136218190193176 2023-01-22 16:46:58.234671: step: 864/464, loss: 0.008349993266165257 2023-01-22 16:46:58.966067: step: 866/464, loss: 0.01271690521389246 2023-01-22 16:46:59.776314: step: 868/464, loss: 0.0003639784117694944 2023-01-22 16:47:00.531941: step: 870/464, loss: 0.001440309570170939 2023-01-22 16:47:01.322030: step: 872/464, loss: 0.0009805704466998577 2023-01-22 16:47:01.960606: step: 874/464, loss: 0.033950306475162506 2023-01-22 16:47:02.649303: step: 876/464, loss: 0.0008513766224496067 2023-01-22 16:47:03.386214: step: 878/464, loss: 0.0002595721452962607 2023-01-22 16:47:04.128279: step: 880/464, loss: 8.58848579810001e-05 2023-01-22 16:47:04.946962: step: 882/464, loss: 0.018774444237351418 2023-01-22 16:47:05.681031: step: 884/464, loss: 0.006721880286931992 2023-01-22 16:47:06.401423: step: 886/464, loss: 0.0034052832052111626 2023-01-22 16:47:07.248856: step: 888/464, loss: 0.0030635748989880085 2023-01-22 16:47:07.974345: step: 890/464, loss: 0.014266138896346092 2023-01-22 16:47:08.729393: step: 892/464, loss: 0.00177770818118006 
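The epoch summaries in this log report precision/recall/F1 for the template and slot components plus a 'combined' score. From the numbers themselves, 'f1' is the usual harmonic mean of 'p' and 'r', and 'combined' equals template F1 × slot F1 (e.g. Dev Chinese at epoch 38: 0.7368421052631579 × 0.3327643171806167 = 0.24519476002782284). A minimal sketch reproducing that arithmetic — the function names are illustrative, not taken from the training script:

```python
def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if (p + r) else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    # The 'combined' field in the log matches the product of the two component F1s.
    return template_f1 * slot_f1

# Dev Chinese, epoch 38 (p/r values copied from the log below):
t = f1(1.0, 0.5833333333333334)                    # template F1 -> 0.7368421052631579
s = f1(0.31059827302631576, 0.35833728652751423)   # slot F1     -> 0.3327643171806167
print(combined(t, s))  # ≈ 0.24519476002782284, the 'combined' value reported in the log
```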
2023-01-22 16:47:09.458803: step: 894/464, loss: 0.008312602527439594
2023-01-22 16:47:10.259259: step: 896/464, loss: 0.012707074172794819
2023-01-22 16:47:11.066726: step: 898/464, loss: 0.0032947026193141937
2023-01-22 16:47:11.849416: step: 900/464, loss: 9.511190000921488e-05
2023-01-22 16:47:12.548699: step: 902/464, loss: 0.0001595946669112891
2023-01-22 16:47:13.277568: step: 904/464, loss: 0.0008967430330812931
2023-01-22 16:47:13.968949: step: 906/464, loss: 0.001551638706587255
2023-01-22 16:47:14.876003: step: 908/464, loss: 0.11213032901287079
2023-01-22 16:47:15.644900: step: 910/464, loss: 0.011793390847742558
2023-01-22 16:47:16.446474: step: 912/464, loss: 0.009459859691560268
2023-01-22 16:47:17.170566: step: 914/464, loss: 0.0012327745789662004
2023-01-22 16:47:17.905010: step: 916/464, loss: 0.017146775498986244
2023-01-22 16:47:18.720386: step: 918/464, loss: 0.01610678993165493
2023-01-22 16:47:19.538230: step: 920/464, loss: 5.426480493042618e-05
2023-01-22 16:47:20.328274: step: 922/464, loss: 0.0013343350728973746
2023-01-22 16:47:21.061556: step: 924/464, loss: 0.00017622199084144086
2023-01-22 16:47:21.807087: step: 926/464, loss: 0.004111295100301504
2023-01-22 16:47:22.625807: step: 928/464, loss: 0.020738869905471802
2023-01-22 16:47:23.349212: step: 930/464, loss: 0.0001751655072439462
==================================================
Loss: 0.024
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31059827302631576, 'r': 0.35833728652751423, 'f1': 0.3327643171806167}, 'combined': 0.24519476002782284, 'epoch': 38}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30368465229825575, 'r': 0.29287095548051373, 'f1': 0.298179794552668}, 'combined': 0.18518534609060436, 'epoch': 38}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28841664800716527, 'r': 0.3491647465437788, 'f1': 0.3158966891477621}, 'combined': 0.23276598147729838, 'epoch': 38}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2943777378458305, 'r': 0.2920506411039662, 'f1': 0.29320957221945815}, 'combined': 0.18209857643103192, 'epoch': 38}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3068623731899766, 'r': 0.35344489663437534, 'f1': 0.3285105123920914}, 'combined': 0.24206037755206733, 'epoch': 38}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.3106783814746556, 'r': 0.29505254572001316, 'f1': 0.3026639163986782}, 'combined': 0.18797022176338962, 'epoch': 38}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2604166666666667, 'r': 0.35714285714285715, 'f1': 0.3012048192771084}, 'combined': 0.2008032128514056, 'epoch': 38}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2712765957446808, 'r': 0.5543478260869565, 'f1': 0.36428571428571427}, 'combined': 0.18214285714285713, 'epoch': 38}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3472222222222222, 'r': 0.3232758620689655, 'f1': 0.33482142857142855}, 'combined': 0.2232142857142857, 'epoch': 38}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
******************************
Epoch: 39
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:50:10.156403: step: 2/464, loss: 0.012623732909560204
2023-01-22 16:50:10.951163: step: 4/464, loss: 0.06931336969137192
2023-01-22 16:50:11.655322: step: 6/464, loss: 3.5806838241114747e-06
2023-01-22 16:50:12.374473: step:
8/464, loss: 0.00021697515330743045 2023-01-22 16:50:13.121942: step: 10/464, loss: 0.0017598243430256844 2023-01-22 16:50:13.916014: step: 12/464, loss: 0.0003556575393304229 2023-01-22 16:50:14.656743: step: 14/464, loss: 0.01138739287853241 2023-01-22 16:50:15.412232: step: 16/464, loss: 0.006124960258603096 2023-01-22 16:50:16.131326: step: 18/464, loss: 0.033372364938259125 2023-01-22 16:50:16.756993: step: 20/464, loss: 0.01938818395137787 2023-01-22 16:50:17.428079: step: 22/464, loss: 0.0005049011087976396 2023-01-22 16:50:18.138973: step: 24/464, loss: 0.05342579632997513 2023-01-22 16:50:18.844976: step: 26/464, loss: 0.016973961144685745 2023-01-22 16:50:19.616988: step: 28/464, loss: 0.0019795908592641354 2023-01-22 16:50:20.332970: step: 30/464, loss: 9.77626841631718e-05 2023-01-22 16:50:21.069165: step: 32/464, loss: 0.0011293364223092794 2023-01-22 16:50:21.787703: step: 34/464, loss: 0.00017125466547440737 2023-01-22 16:50:22.562957: step: 36/464, loss: 0.024412035942077637 2023-01-22 16:50:23.294193: step: 38/464, loss: 0.27952486276626587 2023-01-22 16:50:24.134458: step: 40/464, loss: 0.027969282120466232 2023-01-22 16:50:24.815379: step: 42/464, loss: 0.0006921574240550399 2023-01-22 16:50:25.494771: step: 44/464, loss: 0.011047055013477802 2023-01-22 16:50:26.191095: step: 46/464, loss: 0.003937239293009043 2023-01-22 16:50:26.942951: step: 48/464, loss: 0.05621099844574928 2023-01-22 16:50:27.634076: step: 50/464, loss: 8.688573143444955e-05 2023-01-22 16:50:28.474437: step: 52/464, loss: 0.0035085054114460945 2023-01-22 16:50:29.196518: step: 54/464, loss: 0.00022192315373104066 2023-01-22 16:50:29.891148: step: 56/464, loss: 0.0003033808898180723 2023-01-22 16:50:30.634286: step: 58/464, loss: 0.044928643852472305 2023-01-22 16:50:31.455920: step: 60/464, loss: 0.044337864965200424 2023-01-22 16:50:32.235835: step: 62/464, loss: 0.0020050599705427885 2023-01-22 16:50:33.061179: step: 64/464, loss: 0.0005321039352566004 2023-01-22 
16:50:33.901304: step: 66/464, loss: 0.0008718542521819472 2023-01-22 16:50:34.544131: step: 68/464, loss: 0.00042165510240010917 2023-01-22 16:50:35.262542: step: 70/464, loss: 0.009614691138267517 2023-01-22 16:50:36.090838: step: 72/464, loss: 0.0006315399659797549 2023-01-22 16:50:36.857186: step: 74/464, loss: 0.013663442805409431 2023-01-22 16:50:37.654072: step: 76/464, loss: 8.693081326782703e-05 2023-01-22 16:50:38.384282: step: 78/464, loss: 0.006749124266207218 2023-01-22 16:50:39.121290: step: 80/464, loss: 0.017502861097455025 2023-01-22 16:50:39.873119: step: 82/464, loss: 0.033279284834861755 2023-01-22 16:50:40.589753: step: 84/464, loss: 7.313829701161012e-05 2023-01-22 16:50:41.392069: step: 86/464, loss: 0.02006073109805584 2023-01-22 16:50:42.165617: step: 88/464, loss: 0.0011353358859196305 2023-01-22 16:50:43.048162: step: 90/464, loss: 0.02507760189473629 2023-01-22 16:50:43.816876: step: 92/464, loss: 0.004654605407267809 2023-01-22 16:50:44.530424: step: 94/464, loss: 0.03477931395173073 2023-01-22 16:50:45.282410: step: 96/464, loss: 0.10145996510982513 2023-01-22 16:50:46.145190: step: 98/464, loss: 0.0018320104572921991 2023-01-22 16:50:46.802751: step: 100/464, loss: 6.049231524229981e-05 2023-01-22 16:50:47.543290: step: 102/464, loss: 0.0021832017228007317 2023-01-22 16:50:48.266504: step: 104/464, loss: 0.008522151038050652 2023-01-22 16:50:49.069216: step: 106/464, loss: 0.01118018664419651 2023-01-22 16:50:49.771648: step: 108/464, loss: 0.00034222868271172047 2023-01-22 16:50:50.462327: step: 110/464, loss: 0.0002311748976353556 2023-01-22 16:50:51.257217: step: 112/464, loss: 0.010870211757719517 2023-01-22 16:50:52.076002: step: 114/464, loss: 0.0024785962887108326 2023-01-22 16:50:52.935584: step: 116/464, loss: 0.0037308991886675358 2023-01-22 16:50:53.628868: step: 118/464, loss: 0.001720698899589479 2023-01-22 16:50:54.366939: step: 120/464, loss: 0.002680752892047167 2023-01-22 16:50:55.080216: step: 122/464, loss: 
0.0026074047200381756 2023-01-22 16:50:55.792879: step: 124/464, loss: 0.006068503018468618 2023-01-22 16:50:56.592926: step: 126/464, loss: 0.0024157632142305374 2023-01-22 16:50:57.301935: step: 128/464, loss: 0.11783696711063385 2023-01-22 16:50:58.055734: step: 130/464, loss: 0.006381072103977203 2023-01-22 16:50:58.815870: step: 132/464, loss: 0.004709702450782061 2023-01-22 16:50:59.626070: step: 134/464, loss: 0.006073821801692247 2023-01-22 16:51:00.422927: step: 136/464, loss: 0.016582611948251724 2023-01-22 16:51:01.243271: step: 138/464, loss: 0.000737413065508008 2023-01-22 16:51:01.991913: step: 140/464, loss: 0.017299091443419456 2023-01-22 16:51:02.750404: step: 142/464, loss: 0.0006641384679824114 2023-01-22 16:51:03.462265: step: 144/464, loss: 0.0037697700317949057 2023-01-22 16:51:04.202449: step: 146/464, loss: 0.0001279134739888832 2023-01-22 16:51:04.989922: step: 148/464, loss: 0.025734873488545418 2023-01-22 16:51:05.760768: step: 150/464, loss: 0.035919588059186935 2023-01-22 16:51:06.448728: step: 152/464, loss: 0.01290472038090229 2023-01-22 16:51:07.177336: step: 154/464, loss: 0.030983198434114456 2023-01-22 16:51:07.916330: step: 156/464, loss: 6.772899359930307e-05 2023-01-22 16:51:08.640121: step: 158/464, loss: 0.0029904814437031746 2023-01-22 16:51:09.407055: step: 160/464, loss: 7.27025544620119e-05 2023-01-22 16:51:10.144086: step: 162/464, loss: 0.0003170033742208034 2023-01-22 16:51:10.874446: step: 164/464, loss: 0.006331458222121 2023-01-22 16:51:11.663114: step: 166/464, loss: 0.0005581967998296022 2023-01-22 16:51:12.437975: step: 168/464, loss: 0.013321910053491592 2023-01-22 16:51:13.159857: step: 170/464, loss: 0.008421930484473705 2023-01-22 16:51:13.933859: step: 172/464, loss: 0.002395325107499957 2023-01-22 16:51:14.664023: step: 174/464, loss: 0.004580559674650431 2023-01-22 16:51:15.514352: step: 176/464, loss: 0.002185387536883354 2023-01-22 16:51:16.282215: step: 178/464, loss: 0.004770377185195684 2023-01-22 
16:51:17.023977: step: 180/464, loss: 0.0006089820526540279 2023-01-22 16:51:17.763218: step: 182/464, loss: 0.0010552399326115847 2023-01-22 16:51:18.553218: step: 184/464, loss: 0.0032529293093830347 2023-01-22 16:51:19.282178: step: 186/464, loss: 0.0037063981872051954 2023-01-22 16:51:19.964441: step: 188/464, loss: 0.001429045107215643 2023-01-22 16:51:20.797955: step: 190/464, loss: 0.053881268948316574 2023-01-22 16:51:21.495835: step: 192/464, loss: 0.0669993981719017 2023-01-22 16:51:22.274875: step: 194/464, loss: 0.0028011894319206476 2023-01-22 16:51:22.955623: step: 196/464, loss: 0.003625689772889018 2023-01-22 16:51:23.717566: step: 198/464, loss: 0.0032770622055977583 2023-01-22 16:51:24.530267: step: 200/464, loss: 0.004626928828656673 2023-01-22 16:51:25.338314: step: 202/464, loss: 0.0008296154555864632 2023-01-22 16:51:25.956447: step: 204/464, loss: 0.061222951859235764 2023-01-22 16:51:26.770644: step: 206/464, loss: 4.607578375726007e-05 2023-01-22 16:51:27.569129: step: 208/464, loss: 0.03948868438601494 2023-01-22 16:51:28.336996: step: 210/464, loss: 0.3414325416088104 2023-01-22 16:51:29.205769: step: 212/464, loss: 0.0017582981381565332 2023-01-22 16:51:30.046116: step: 214/464, loss: 0.0009569655521772802 2023-01-22 16:51:30.834053: step: 216/464, loss: 0.017099468037486076 2023-01-22 16:51:31.602284: step: 218/464, loss: 0.19633235037326813 2023-01-22 16:51:32.288678: step: 220/464, loss: 0.0007412639679387212 2023-01-22 16:51:32.998399: step: 222/464, loss: 0.00041419549961574376 2023-01-22 16:51:33.814834: step: 224/464, loss: 0.00018719013314694166 2023-01-22 16:51:34.516881: step: 226/464, loss: 0.0016826813807711005 2023-01-22 16:51:35.335348: step: 228/464, loss: 0.0009907839121297002 2023-01-22 16:51:36.104532: step: 230/464, loss: 0.00036048784386366606 2023-01-22 16:51:36.816296: step: 232/464, loss: 0.0005840350640937686 2023-01-22 16:51:37.585738: step: 234/464, loss: 0.04000601917505264 2023-01-22 16:51:38.263098: step: 
236/464, loss: 0.000522164162248373 2023-01-22 16:51:39.032113: step: 238/464, loss: 0.0007575997151434422 2023-01-22 16:51:39.786259: step: 240/464, loss: 0.029385874047875404 2023-01-22 16:51:40.496287: step: 242/464, loss: 0.0005397187196649611 2023-01-22 16:51:41.213254: step: 244/464, loss: 7.480083149857819e-05 2023-01-22 16:51:41.961655: step: 246/464, loss: 0.001386028598062694 2023-01-22 16:51:42.768703: step: 248/464, loss: 0.0008085906156338751 2023-01-22 16:51:43.486977: step: 250/464, loss: 0.0006489267107099295 2023-01-22 16:51:44.319549: step: 252/464, loss: 0.002511844737455249 2023-01-22 16:51:45.036498: step: 254/464, loss: 0.0003009107313118875 2023-01-22 16:51:45.769818: step: 256/464, loss: 9.194859012495726e-05 2023-01-22 16:51:46.597519: step: 258/464, loss: 0.0005083756987005472 2023-01-22 16:51:47.380569: step: 260/464, loss: 0.005527664441615343 2023-01-22 16:51:48.146143: step: 262/464, loss: 0.005268595647066832 2023-01-22 16:51:48.855358: step: 264/464, loss: 0.0018064638134092093 2023-01-22 16:51:49.626517: step: 266/464, loss: 0.00646596634760499 2023-01-22 16:51:50.350712: step: 268/464, loss: 0.0031213362235575914 2023-01-22 16:51:51.129566: step: 270/464, loss: 0.000955466297455132 2023-01-22 16:51:51.822504: step: 272/464, loss: 0.026391014456748962 2023-01-22 16:51:52.535755: step: 274/464, loss: 0.010174794122576714 2023-01-22 16:51:53.265716: step: 276/464, loss: 6.715762719977647e-05 2023-01-22 16:51:54.016589: step: 278/464, loss: 0.023771759122610092 2023-01-22 16:51:54.679797: step: 280/464, loss: 0.0007089504506438971 2023-01-22 16:51:55.435877: step: 282/464, loss: 0.009939974173903465 2023-01-22 16:51:56.166386: step: 284/464, loss: 3.812194790953072e-07 2023-01-22 16:51:56.851933: step: 286/464, loss: 0.013614516705274582 2023-01-22 16:51:57.639904: step: 288/464, loss: 0.012401481159031391 2023-01-22 16:51:58.529227: step: 290/464, loss: 0.002258396940305829 2023-01-22 16:51:59.323277: step: 292/464, loss: 
0.007057834882289171 2023-01-22 16:52:00.014705: step: 294/464, loss: 0.0009403349831700325 2023-01-22 16:52:00.695081: step: 296/464, loss: 0.004534940700978041 2023-01-22 16:52:01.459214: step: 298/464, loss: 0.008743273094296455 2023-01-22 16:52:02.276189: step: 300/464, loss: 0.018859921023249626 2023-01-22 16:52:03.012549: step: 302/464, loss: 0.0003946318756788969 2023-01-22 16:52:03.789830: step: 304/464, loss: 0.006023469381034374 2023-01-22 16:52:04.501051: step: 306/464, loss: 0.006886173039674759 2023-01-22 16:52:05.212229: step: 308/464, loss: 0.00015002640429884195 2023-01-22 16:52:05.965464: step: 310/464, loss: 0.0035786149092018604 2023-01-22 16:52:06.756161: step: 312/464, loss: 0.0021690481808036566 2023-01-22 16:52:07.426469: step: 314/464, loss: 0.0029878343921154737 2023-01-22 16:52:08.183360: step: 316/464, loss: 1.694048523902893 2023-01-22 16:52:08.849581: step: 318/464, loss: 0.003704532515257597 2023-01-22 16:52:09.590374: step: 320/464, loss: 0.004257072228938341 2023-01-22 16:52:10.350334: step: 322/464, loss: 0.0011549916816875339 2023-01-22 16:52:10.989880: step: 324/464, loss: 0.0013482326176017523 2023-01-22 16:52:11.747386: step: 326/464, loss: 0.01761588826775551 2023-01-22 16:52:12.523529: step: 328/464, loss: 0.018198247998952866 2023-01-22 16:52:13.244663: step: 330/464, loss: 0.0001296098344027996 2023-01-22 16:52:14.178660: step: 332/464, loss: 0.009164217859506607 2023-01-22 16:52:14.999736: step: 334/464, loss: 0.026571668684482574 2023-01-22 16:52:15.699520: step: 336/464, loss: 0.1765873283147812 2023-01-22 16:52:16.521206: step: 338/464, loss: 0.012295668944716454 2023-01-22 16:52:17.167847: step: 340/464, loss: 0.0002983546582981944 2023-01-22 16:52:18.036036: step: 342/464, loss: 0.005674338433891535 2023-01-22 16:52:18.763799: step: 344/464, loss: 0.6395951509475708 2023-01-22 16:52:19.473726: step: 346/464, loss: 0.0010086627444252372 2023-01-22 16:52:20.221922: step: 348/464, loss: 0.0005183350294828415 2023-01-22 
16:52:20.943153: step: 350/464, loss: 0.00011906491272384301 2023-01-22 16:52:21.726831: step: 352/464, loss: 7.418800669256598e-05 2023-01-22 16:52:22.550190: step: 354/464, loss: 0.02757696434855461 2023-01-22 16:52:23.309375: step: 356/464, loss: 0.00876846443861723 2023-01-22 16:52:24.034573: step: 358/464, loss: 0.004197806119918823 2023-01-22 16:52:24.845278: step: 360/464, loss: 0.04080234467983246 2023-01-22 16:52:25.611898: step: 362/464, loss: 0.015933020040392876 2023-01-22 16:52:26.405881: step: 364/464, loss: 0.05404546484351158 2023-01-22 16:52:27.133682: step: 366/464, loss: 0.01660231687128544 2023-01-22 16:52:27.901617: step: 368/464, loss: 0.006422176957130432 2023-01-22 16:52:28.772389: step: 370/464, loss: 0.03905533626675606 2023-01-22 16:52:29.513517: step: 372/464, loss: 0.0025848206132650375 2023-01-22 16:52:30.267304: step: 374/464, loss: 0.01470188982784748 2023-01-22 16:52:31.149890: step: 376/464, loss: 0.014380333013832569 2023-01-22 16:52:32.038274: step: 378/464, loss: 0.03706406056880951 2023-01-22 16:52:32.703597: step: 380/464, loss: 0.0020173077937215567 2023-01-22 16:52:33.400157: step: 382/464, loss: 0.0005429274751804769 2023-01-22 16:52:34.176025: step: 384/464, loss: 0.0015572941629216075 2023-01-22 16:52:34.944839: step: 386/464, loss: 0.020318368449807167 2023-01-22 16:52:35.701908: step: 388/464, loss: 0.006858357228338718 2023-01-22 16:52:36.382674: step: 390/464, loss: 0.004659565631300211 2023-01-22 16:52:37.127295: step: 392/464, loss: 0.004899164661765099 2023-01-22 16:52:37.905411: step: 394/464, loss: 0.0036024400033056736 2023-01-22 16:52:38.728094: step: 396/464, loss: 0.014771565794944763 2023-01-22 16:52:39.421182: step: 398/464, loss: 0.0004116464115213603 2023-01-22 16:52:40.215347: step: 400/464, loss: 0.004268472082912922 2023-01-22 16:52:40.880538: step: 402/464, loss: 0.013734730891883373 2023-01-22 16:52:41.606095: step: 404/464, loss: 0.0004078770871274173 2023-01-22 16:52:42.333936: step: 406/464, loss: 
0.027729980647563934 2023-01-22 16:52:43.083007: step: 408/464, loss: 0.003318031784147024 2023-01-22 16:52:43.893389: step: 410/464, loss: 0.0029189365450292826 2023-01-22 16:52:44.695703: step: 412/464, loss: 0.00035886603291146457 2023-01-22 16:52:45.463458: step: 414/464, loss: 0.0016576717607676983 2023-01-22 16:52:46.208872: step: 416/464, loss: 0.014981823973357677 2023-01-22 16:52:47.036021: step: 418/464, loss: 0.003958564717322588 2023-01-22 16:52:47.804859: step: 420/464, loss: 0.007210117299109697 2023-01-22 16:52:48.596233: step: 422/464, loss: 0.0018005042802542448 2023-01-22 16:52:49.249995: step: 424/464, loss: 0.0001036086687236093 2023-01-22 16:52:49.914628: step: 426/464, loss: 0.0024507339112460613 2023-01-22 16:52:50.771669: step: 428/464, loss: 0.038161493837833405 2023-01-22 16:52:51.509234: step: 430/464, loss: 0.00398477166891098 2023-01-22 16:52:52.288546: step: 432/464, loss: 2.654248964972794e-05 2023-01-22 16:52:53.123247: step: 434/464, loss: 0.022698312997817993 2023-01-22 16:52:53.860562: step: 436/464, loss: 0.00023635398247279227 2023-01-22 16:52:54.586783: step: 438/464, loss: 0.007394007872790098 2023-01-22 16:52:55.326807: step: 440/464, loss: 0.007277240045368671 2023-01-22 16:52:56.085837: step: 442/464, loss: 0.01735176146030426 2023-01-22 16:52:56.892140: step: 444/464, loss: 0.003354936372488737 2023-01-22 16:52:57.691133: step: 446/464, loss: 0.00018706907576415688 2023-01-22 16:52:58.467478: step: 448/464, loss: 0.016221504658460617 2023-01-22 16:52:59.185393: step: 450/464, loss: 0.00032600644044578075 2023-01-22 16:52:59.899668: step: 452/464, loss: 0.6398082971572876 2023-01-22 16:53:00.693757: step: 454/464, loss: 0.00451373215764761 2023-01-22 16:53:01.412132: step: 456/464, loss: 0.0015716877533122897 2023-01-22 16:53:02.235743: step: 458/464, loss: 0.0003451913653407246 2023-01-22 16:53:02.930713: step: 460/464, loss: 0.00024207618844229728 2023-01-22 16:53:03.745553: step: 462/464, loss: 0.045718129724264145 
2023-01-22 16:53:04.557707: step: 464/464, loss: 0.00918582733720541 2023-01-22 16:53:05.262249: step: 466/464, loss: 0.006854454055428505 2023-01-22 16:53:06.001160: step: 468/464, loss: 0.013512649573385715 2023-01-22 16:53:06.731412: step: 470/464, loss: 0.003024796023964882 2023-01-22 16:53:07.490820: step: 472/464, loss: 0.01615089550614357 2023-01-22 16:53:08.235951: step: 474/464, loss: 0.008264783769845963 2023-01-22 16:53:08.984483: step: 476/464, loss: 0.023860646411776543 2023-01-22 16:53:09.709663: step: 478/464, loss: 0.0016786216292530298 2023-01-22 16:53:10.457975: step: 480/464, loss: 0.05656686797738075 2023-01-22 16:53:11.197424: step: 482/464, loss: 0.008730104193091393 2023-01-22 16:53:11.953021: step: 484/464, loss: 0.0288022942841053 2023-01-22 16:53:12.667909: step: 486/464, loss: 0.04661581665277481 2023-01-22 16:53:13.463382: step: 488/464, loss: 0.004736119415611029 2023-01-22 16:53:14.224529: step: 490/464, loss: 0.002068700036033988 2023-01-22 16:53:14.977909: step: 492/464, loss: 0.004908754024654627 2023-01-22 16:53:15.718642: step: 494/464, loss: 0.004340517334640026 2023-01-22 16:53:16.421157: step: 496/464, loss: 0.0011247927322983742 2023-01-22 16:53:17.146644: step: 498/464, loss: 0.0042472220957279205 2023-01-22 16:53:17.875771: step: 500/464, loss: 0.0008127755718305707 2023-01-22 16:53:18.605637: step: 502/464, loss: 0.010160490870475769 2023-01-22 16:53:19.296360: step: 504/464, loss: 0.010667679831385612 2023-01-22 16:53:20.032832: step: 506/464, loss: 0.0014209101209416986 2023-01-22 16:53:20.811876: step: 508/464, loss: 0.0068949805572628975 2023-01-22 16:53:21.641533: step: 510/464, loss: 0.01052104588598013 2023-01-22 16:53:22.397999: step: 512/464, loss: 0.011864281259477139 2023-01-22 16:53:23.184355: step: 514/464, loss: 0.007104154676198959 2023-01-22 16:53:23.923339: step: 516/464, loss: 0.0002336573088541627 2023-01-22 16:53:24.743597: step: 518/464, loss: 0.02338859997689724 2023-01-22 16:53:25.573359: step: 
520/464, loss: 0.06585898250341415 2023-01-22 16:53:26.299278: step: 522/464, loss: 0.0012727356515824795 2023-01-22 16:53:27.058817: step: 524/464, loss: 0.001677141641266644 2023-01-22 16:53:27.833428: step: 526/464, loss: 0.007514504715800285 2023-01-22 16:53:28.596259: step: 528/464, loss: 0.007119217421859503 2023-01-22 16:53:29.374434: step: 530/464, loss: 0.002552604768425226 2023-01-22 16:53:30.089911: step: 532/464, loss: 0.0002722721255850047 2023-01-22 16:53:30.829544: step: 534/464, loss: 0.0024043868761509657 2023-01-22 16:53:31.547411: step: 536/464, loss: 0.0030759493820369244 2023-01-22 16:53:32.285123: step: 538/464, loss: 0.011626550927758217 2023-01-22 16:53:33.177296: step: 540/464, loss: 0.026739951223134995 2023-01-22 16:53:33.873420: step: 542/464, loss: 0.0007168625597842038 2023-01-22 16:53:34.689256: step: 544/464, loss: 0.005891432985663414 2023-01-22 16:53:35.463261: step: 546/464, loss: 0.0010922928340733051 2023-01-22 16:53:36.217082: step: 548/464, loss: 0.013535697013139725 2023-01-22 16:53:37.012514: step: 550/464, loss: 0.0210479274392128 2023-01-22 16:53:37.788343: step: 552/464, loss: 0.009432249702513218 2023-01-22 16:53:38.619174: step: 554/464, loss: 0.001195187564007938 2023-01-22 16:53:39.447868: step: 556/464, loss: 0.007210468873381615 2023-01-22 16:53:40.281002: step: 558/464, loss: 0.014359408989548683 2023-01-22 16:53:40.982147: step: 560/464, loss: 0.00511901406571269 2023-01-22 16:53:41.817553: step: 562/464, loss: 0.020551275461912155 2023-01-22 16:53:42.554126: step: 564/464, loss: 9.198107363772579e-06 2023-01-22 16:53:43.419201: step: 566/464, loss: 0.05644404888153076 2023-01-22 16:53:44.156516: step: 568/464, loss: 0.016529276967048645 2023-01-22 16:53:44.963951: step: 570/464, loss: 0.7873603701591492 2023-01-22 16:53:45.693568: step: 572/464, loss: 0.785995602607727 2023-01-22 16:53:46.414339: step: 574/464, loss: 0.03733735904097557 2023-01-22 16:53:47.129504: step: 576/464, loss: 4.771085878019221e-05 
2023-01-22 16:53:47.893103: step: 578/464, loss: 0.00013325363397598267 2023-01-22 16:53:48.632965: step: 580/464, loss: 1.1196244955062866 2023-01-22 16:53:49.431409: step: 582/464, loss: 0.001423005131073296 2023-01-22 16:53:50.136186: step: 584/464, loss: 0.004689953289926052 2023-01-22 16:53:50.876369: step: 586/464, loss: 0.004397342447191477 2023-01-22 16:53:51.728588: step: 588/464, loss: 0.046125661581754684 2023-01-22 16:53:52.454115: step: 590/464, loss: 0.026454763486981392 2023-01-22 16:53:53.283452: step: 592/464, loss: 0.006255241576582193 2023-01-22 16:53:54.076219: step: 594/464, loss: 0.037292227149009705 2023-01-22 16:53:54.807026: step: 596/464, loss: 0.0010081107029691339 2023-01-22 16:53:55.514566: step: 598/464, loss: 0.001386135583743453 2023-01-22 16:53:56.254996: step: 600/464, loss: 0.002910367678850889 2023-01-22 16:53:56.944188: step: 602/464, loss: 0.0006286511197686195 2023-01-22 16:53:57.767215: step: 604/464, loss: 0.004660574719309807 2023-01-22 16:53:58.474521: step: 606/464, loss: 0.00016427884111180902 2023-01-22 16:53:59.236894: step: 608/464, loss: 0.010264435783028603 2023-01-22 16:54:00.016594: step: 610/464, loss: 0.008379405364394188 2023-01-22 16:54:00.802405: step: 612/464, loss: 8.429507943219505e-06 2023-01-22 16:54:01.526731: step: 614/464, loss: 0.011425070464611053 2023-01-22 16:54:02.227665: step: 616/464, loss: 0.007392220664769411 2023-01-22 16:54:02.984880: step: 618/464, loss: 0.0074181947857141495 2023-01-22 16:54:03.708905: step: 620/464, loss: 0.03724908456206322 2023-01-22 16:54:04.423303: step: 622/464, loss: 0.020692970603704453 2023-01-22 16:54:05.119150: step: 624/464, loss: 0.0033561927266418934 2023-01-22 16:54:05.908396: step: 626/464, loss: 0.00014869822189211845 2023-01-22 16:54:06.691804: step: 628/464, loss: 0.015324524603784084 2023-01-22 16:54:07.416376: step: 630/464, loss: 0.012809859588742256 2023-01-22 16:54:08.217957: step: 632/464, loss: 0.07623506337404251 2023-01-22 16:54:08.948808: 
step: 634/464, loss: 0.0023513073101639748 2023-01-22 16:54:09.782297: step: 636/464, loss: 0.003813466290012002 2023-01-22 16:54:10.495694: step: 638/464, loss: 0.00017119161202572286 2023-01-22 16:54:11.165538: step: 640/464, loss: 0.00017306060180999339 2023-01-22 16:54:11.919097: step: 642/464, loss: 0.004058307968080044 2023-01-22 16:54:12.741529: step: 644/464, loss: 0.00024587870575487614 2023-01-22 16:54:13.417635: step: 646/464, loss: 0.02357977256178856 2023-01-22 16:54:14.160793: step: 648/464, loss: 0.0015318029327318072 2023-01-22 16:54:14.904280: step: 650/464, loss: 0.01352463848888874 2023-01-22 16:54:15.664250: step: 652/464, loss: 0.0049011362716555595 2023-01-22 16:54:16.381336: step: 654/464, loss: 0.021986398845911026 2023-01-22 16:54:17.131400: step: 656/464, loss: 0.0012051882222294807 2023-01-22 16:54:17.906849: step: 658/464, loss: 0.016538813710212708 2023-01-22 16:54:18.642896: step: 660/464, loss: 0.0037078920286148787 2023-01-22 16:54:19.395882: step: 662/464, loss: 0.0004478042246773839 2023-01-22 16:54:20.180699: step: 664/464, loss: 0.0003174376906827092 2023-01-22 16:54:20.910662: step: 666/464, loss: 0.3379771113395691 2023-01-22 16:54:21.669207: step: 668/464, loss: 0.009541943669319153 2023-01-22 16:54:22.371802: step: 670/464, loss: 0.0030752429738640785 2023-01-22 16:54:23.090040: step: 672/464, loss: 0.05253473296761513 2023-01-22 16:54:23.868122: step: 674/464, loss: 0.36408454179763794 2023-01-22 16:54:24.641910: step: 676/464, loss: 0.0048395427875220776 2023-01-22 16:54:25.362028: step: 678/464, loss: 0.00031773888622410595 2023-01-22 16:54:26.002180: step: 680/464, loss: 0.01639465056359768 2023-01-22 16:54:26.680621: step: 682/464, loss: 0.00026009962311945856 2023-01-22 16:54:27.391431: step: 684/464, loss: 0.13537754118442535 2023-01-22 16:54:28.104990: step: 686/464, loss: 0.007647455669939518 2023-01-22 16:54:28.781485: step: 688/464, loss: 0.013909382745623589 2023-01-22 16:54:29.574336: step: 690/464, loss: 
0.018954748287796974 2023-01-22 16:54:30.415373: step: 692/464, loss: 0.0004389037494547665 2023-01-22 16:54:31.247805: step: 694/464, loss: 0.038470931351184845 2023-01-22 16:54:31.951046: step: 696/464, loss: 0.0010190318571403623 2023-01-22 16:54:32.671787: step: 698/464, loss: 0.006811690982431173 2023-01-22 16:54:33.500692: step: 700/464, loss: 0.013463972136378288 2023-01-22 16:54:34.247895: step: 702/464, loss: 0.017721960321068764 2023-01-22 16:54:34.990253: step: 704/464, loss: 0.0021989597007632256 2023-01-22 16:54:35.736909: step: 706/464, loss: 0.00756972236558795 2023-01-22 16:54:36.504584: step: 708/464, loss: 0.15557102859020233 2023-01-22 16:54:37.268894: step: 710/464, loss: 0.013237417675554752 2023-01-22 16:54:37.973108: step: 712/464, loss: 0.0038034068420529366 2023-01-22 16:54:38.757755: step: 714/464, loss: 0.011569908820092678 2023-01-22 16:54:39.499141: step: 716/464, loss: 0.0009132428094744682 2023-01-22 16:54:40.164912: step: 718/464, loss: 0.0012660945067182183 2023-01-22 16:54:40.920233: step: 720/464, loss: 0.001009093364700675 2023-01-22 16:54:41.675301: step: 722/464, loss: 0.3648377060890198 2023-01-22 16:54:42.506706: step: 724/464, loss: 0.00179006636608392 2023-01-22 16:54:43.294568: step: 726/464, loss: 0.004186419770121574 2023-01-22 16:54:44.101832: step: 728/464, loss: 0.0023058911319822073 2023-01-22 16:54:44.865852: step: 730/464, loss: 0.001836820738390088 2023-01-22 16:54:45.520919: step: 732/464, loss: 0.0008673551492393017 2023-01-22 16:54:46.269920: step: 734/464, loss: 0.025690661743283272 2023-01-22 16:54:47.137908: step: 736/464, loss: 0.0021721988450735807 2023-01-22 16:54:47.959556: step: 738/464, loss: 0.003007357008755207 2023-01-22 16:54:48.789534: step: 740/464, loss: 0.005906921811401844 2023-01-22 16:54:49.445371: step: 742/464, loss: 0.002932978793978691 2023-01-22 16:54:50.159231: step: 744/464, loss: 3.0419625545619056e-05 2023-01-22 16:54:50.918503: step: 746/464, loss: 0.0017899353988468647 2023-01-22 
16:54:51.720867: step: 748/464, loss: 1.9426894141361117e-05 2023-01-22 16:54:52.461035: step: 750/464, loss: 0.03254895657300949 2023-01-22 16:54:53.228287: step: 752/464, loss: 0.014089247211813927 2023-01-22 16:54:53.989235: step: 754/464, loss: 7.201086555141956e-05 2023-01-22 16:54:54.795541: step: 756/464, loss: 0.007483582943677902 2023-01-22 16:54:55.469179: step: 758/464, loss: 0.002280889078974724 2023-01-22 16:54:56.257771: step: 760/464, loss: 0.041584137827157974 2023-01-22 16:54:56.985440: step: 762/464, loss: 0.02409733645617962 2023-01-22 16:54:57.708587: step: 764/464, loss: 0.03089963272213936 2023-01-22 16:54:58.496446: step: 766/464, loss: 0.003421128960326314 2023-01-22 16:54:59.326608: step: 768/464, loss: 0.011093460954725742 2023-01-22 16:55:00.058872: step: 770/464, loss: 0.010788795538246632 2023-01-22 16:55:00.861661: step: 772/464, loss: 0.00963507778942585 2023-01-22 16:55:01.609732: step: 774/464, loss: 0.0002834459883160889 2023-01-22 16:55:02.336416: step: 776/464, loss: 0.005152272526174784 2023-01-22 16:55:03.113142: step: 778/464, loss: 0.0010087155969813466 2023-01-22 16:55:03.821824: step: 780/464, loss: 0.0005180512671358883 2023-01-22 16:55:04.579671: step: 782/464, loss: 0.018876563757658005 2023-01-22 16:55:05.241654: step: 784/464, loss: 0.00018986199575010687 2023-01-22 16:55:05.953733: step: 786/464, loss: 0.0006348424940370023 2023-01-22 16:55:06.658947: step: 788/464, loss: 0.008300436660647392 2023-01-22 16:55:07.435218: step: 790/464, loss: 0.0004914195160381496 2023-01-22 16:55:08.176761: step: 792/464, loss: 0.03281170502305031 2023-01-22 16:55:08.922059: step: 794/464, loss: 0.030931316316127777 2023-01-22 16:55:09.638770: step: 796/464, loss: 0.009102153591811657 2023-01-22 16:55:10.418935: step: 798/464, loss: 0.0110086128115654 2023-01-22 16:55:11.090034: step: 800/464, loss: 0.022512733936309814 2023-01-22 16:55:11.946030: step: 802/464, loss: 0.00442664697766304 2023-01-22 16:55:12.738578: step: 804/464, loss: 
0.011539860628545284 2023-01-22 16:55:13.430516: step: 806/464, loss: 0.0057589225471019745 2023-01-22 16:55:14.202489: step: 808/464, loss: 0.001552989473566413 2023-01-22 16:55:14.984348: step: 810/464, loss: 0.0032748279627412558 2023-01-22 16:55:15.772261: step: 812/464, loss: 0.018966708332300186 2023-01-22 16:55:16.476846: step: 814/464, loss: 0.0037927976809442043 2023-01-22 16:55:17.213510: step: 816/464, loss: 0.00040608947165310383 2023-01-22 16:55:17.888692: step: 818/464, loss: 0.0003042019088752568 2023-01-22 16:55:18.644104: step: 820/464, loss: 0.008616427890956402 2023-01-22 16:55:19.344655: step: 822/464, loss: 0.0003781984851229936 2023-01-22 16:55:20.118983: step: 824/464, loss: 0.03239799290895462 2023-01-22 16:55:20.836710: step: 826/464, loss: 0.004440160468220711 2023-01-22 16:55:21.551434: step: 828/464, loss: 0.00048742041690275073 2023-01-22 16:55:22.275732: step: 830/464, loss: 0.07057930529117584 2023-01-22 16:55:22.969136: step: 832/464, loss: 0.004274421371519566 2023-01-22 16:55:23.714458: step: 834/464, loss: 0.0001647328754188493 2023-01-22 16:55:24.359496: step: 836/464, loss: 0.00017869319708552212 2023-01-22 16:55:25.085417: step: 838/464, loss: 0.008176120929419994 2023-01-22 16:55:25.859579: step: 840/464, loss: 0.018365247175097466 2023-01-22 16:55:26.604297: step: 842/464, loss: 0.017919452860951424 2023-01-22 16:55:27.446562: step: 844/464, loss: 0.012984522618353367 2023-01-22 16:55:28.250224: step: 846/464, loss: 0.005812663119286299 2023-01-22 16:55:28.979152: step: 848/464, loss: 0.008565914817154408 2023-01-22 16:55:29.766859: step: 850/464, loss: 1.5529409211012535e-05 2023-01-22 16:55:30.635293: step: 852/464, loss: 0.015416462905704975 2023-01-22 16:55:31.413618: step: 854/464, loss: 0.0015312719624489546 2023-01-22 16:55:32.213706: step: 856/464, loss: 0.005828104447573423 2023-01-22 16:55:32.916356: step: 858/464, loss: 0.0007630424806848168 2023-01-22 16:55:33.628062: step: 860/464, loss: 1.020262360572815 
2023-01-22 16:55:34.471807: step: 862/464, loss: 1.69687900779536e-05 2023-01-22 16:55:35.274951: step: 864/464, loss: 0.0002915971272159368 2023-01-22 16:55:36.064386: step: 866/464, loss: 0.0002090566122205928 2023-01-22 16:55:36.845529: step: 868/464, loss: 0.0440477691590786 2023-01-22 16:55:37.527412: step: 870/464, loss: 0.004202152136713266 2023-01-22 16:55:38.287110: step: 872/464, loss: 0.013471094891428947 2023-01-22 16:55:39.062361: step: 874/464, loss: 0.0003817703400272876 2023-01-22 16:55:39.763039: step: 876/464, loss: 0.024504436179995537 2023-01-22 16:55:40.509720: step: 878/464, loss: 0.006114604417234659 2023-01-22 16:55:41.321495: step: 880/464, loss: 0.0008290052646771073 2023-01-22 16:55:42.092425: step: 882/464, loss: 0.02669224515557289 2023-01-22 16:55:42.889307: step: 884/464, loss: 0.00015607234672643244 2023-01-22 16:55:43.627614: step: 886/464, loss: 11.929213523864746 2023-01-22 16:55:44.351308: step: 888/464, loss: 0.00027429041801951826 2023-01-22 16:55:45.057329: step: 890/464, loss: 0.0008549271733500063 2023-01-22 16:55:45.777709: step: 892/464, loss: 1.0526523510634433e-05 2023-01-22 16:55:46.539714: step: 894/464, loss: 0.056621868163347244 2023-01-22 16:55:47.363921: step: 896/464, loss: 0.001759719685651362 2023-01-22 16:55:48.081196: step: 898/464, loss: 0.006513232830911875 2023-01-22 16:55:48.853918: step: 900/464, loss: 0.006867598742246628 2023-01-22 16:55:49.702824: step: 902/464, loss: 0.10892080515623093 2023-01-22 16:55:50.455372: step: 904/464, loss: 0.016597241163253784 2023-01-22 16:55:51.234029: step: 906/464, loss: 0.030734620988368988 2023-01-22 16:55:52.018723: step: 908/464, loss: 1.996102582779713e-05 2023-01-22 16:55:52.768781: step: 910/464, loss: 0.2448931187391281 2023-01-22 16:55:53.496559: step: 912/464, loss: 0.0026010761503130198 2023-01-22 16:55:54.266976: step: 914/464, loss: 0.0009186618844978511 2023-01-22 16:55:54.982734: step: 916/464, loss: 0.10815151035785675 2023-01-22 16:55:55.687717: step: 
918/464, loss: 0.0037876542191952467
2023-01-22 16:55:56.472437: step: 920/464, loss: 0.018428770825266838
2023-01-22 16:55:57.236409: step: 922/464, loss: 0.020010873675346375
2023-01-22 16:55:58.037102: step: 924/464, loss: 0.1322910338640213
2023-01-22 16:55:58.729880: step: 926/464, loss: 0.009188034571707249
2023-01-22 16:55:59.520662: step: 928/464, loss: 0.011618698947131634
2023-01-22 16:56:00.187591: step: 930/464, loss: 0.0005447377334348857
==================================================
Loss: 0.056
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3181704629567324, 'r': 0.3495648919391804, 'f1': 0.3331296528968319}, 'combined': 0.24546395476608665, 'epoch': 39}
Test Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30351155567301247, 'r': 0.2825176733636144, 'f1': 0.2926385726141021}, 'combined': 0.181743955623495, 'epoch': 39}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3024466410267585, 'r': 0.3420459166071121, 'f1': 0.32102973829376324}, 'combined': 0.2365482282164571, 'epoch': 39}
Test Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.2958672710854116, 'r': 0.28329583565391686, 'f1': 0.28944511426730324}, 'combined': 0.1797606499133778, 'epoch': 39}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31104380219580097, 'r': 0.34114481531152363, 'f1': 0.3253996699894533}, 'combined': 0.23976817788696558, 'epoch': 39}
Test Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31120162852807604, 'r': 0.28998333567388906, 'f1': 0.30021804163884985}, 'combined': 0.18645120480728572, 'epoch': 39}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.266304347826087, 'r': 0.35, 'f1': 0.30246913580246915}, 'combined': 0.20164609053497942, 'epoch': 39}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3402777777777778, 'r': 0.532608695652174, 'f1': 0.4152542372881356}, 'combined': 0.2076271186440678, 'epoch': 39}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.39880952380952384, 'r': 0.28879310344827586, 'f1': 0.33499999999999996}, 'combined': 0.2233333333333333, 'epoch': 39}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31282801835726054, 'r': 0.3603161425860667, 'f1': 0.33489701436130004}, 'combined': 0.24676622110832633, 'epoch': 12}
Test for Chinese: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.30894966790503225, 'r': 0.2887808468548521, 'f1': 0.29852498585915693}, 'combined': 0.1853997280598975, 'epoch': 12}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3138888888888889, 'r': 0.4035714285714286, 'f1': 0.35312499999999997}, 'combined': 0.23541666666666664, 'epoch': 12}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3081533681870252, 'r': 0.3455761681186563, 'f1': 0.32579363255551325}, 'combined': 0.24005846609353607, 'epoch': 23}
Test for Korean: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.29327466083978504, 'r': 0.2883432372648332, 'f1': 0.2907880427678267}, 'combined': 0.1805946791926503, 'epoch': 23}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3355263157894737, 'r': 0.5543478260869565, 'f1': 0.41803278688524587}, 'combined': 0.20901639344262293, 'epoch': 23}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32184781639317267, 'r': 0.3346728716953674, 'f1': 0.32813507606224857}, 'combined': 0.24178374025639365, 'epoch': 24}
Test for Russian: {'template': {'p': 0.9516129032258065, 'r': 0.4609375, 'f1': 0.6210526315789474}, 'slot': {'p': 0.31977789344146773, 'r': 0.29610233370986844, 'f1': 0.30748504771716734}, 'combined': 0.190964398055925, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46875, 'r': 0.3232758620689655, 'f1': 0.38265306122448983}, 'combined': 0.25510204081632654, 'epoch': 24}
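A note on reading the summary dicts above: the numbers are internally consistent with 'combined' being the product of the template F1 and slot F1, where each F1 is the usual harmonic mean 2pr/(p+r). This is an inference from the logged values, not from the training code itself; the sketch below checks it against the epoch-39 "Dev Chinese" line.

```python
# Sketch (assumption): 'combined' in this log appears to equal
# template F1 x slot F1, with F1 = 2pr/(p+r). Checked against the
# epoch-39 Dev Chinese record copied from the log above.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Values copied verbatim from the "Dev Chinese" summary line.
template = {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}
slot = {'p': 0.3181704629567324, 'r': 0.3495648919391804, 'f1': 0.3331296528968319}
combined = 0.24546395476608665

# Each reported f1 matches the harmonic mean of its own p and r ...
assert abs(f1(template['p'], template['r']) - template['f1']) < 1e-9
assert abs(f1(slot['p'], slot['r']) - slot['f1']) < 1e-9
# ... and 'combined' matches the product of the two F1 scores.
assert abs(template['f1'] * slot['f1'] - combined) < 1e-9
```

The same identity holds for the other language blocks (e.g. Test Chinese: 0.6210526… × 0.2926385… ≈ 0.1817439…), so the current-best selection is effectively ranked by this product.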
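Each training record in this log follows a fixed shape, "<timestamp>: step: <i>/<total>, loss: <value>", so the per-step losses can be recovered from the raw text with a small parser. The regex and helper below are a hypothetical post-processing sketch (not part of train.py), useful for averaging losses or plotting a loss curve.

```python
import re

# Assumed record shape, taken from the log lines above:
#   "2023-01-22 16:56:00.187591: step: 930/464, loss: 0.0005447377334348857"
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.eE+-]+)")

def parse_losses(text: str) -> list[tuple[int, float]]:
    """Return (step, loss) pairs in the order they appear in the text."""
    return [(int(step), float(loss)) for step, loss in STEP_RE.findall(text)]

# Works on a single record or on the whole concatenated log body.
line = "2023-01-22 16:56:00.187591: step: 930/464, loss: 0.0005447377334348857"
pairs = parse_losses(line)   # one (step, loss) tuple per record found
print(pairs[0][0])           # 930
```

Note that the step counter in this epoch runs past the nominal total (930/464), so the denominator is best treated as opaque and ignored, as the regex above does.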