Command that produces this log:
python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
----------------------------------------------------------------------------------------------------
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024])
>>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024])
>>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024])
>>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024])
>>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.{0..23} — all 24 layers have identical parameter shapes:
    attention.self.{query,key,value}.weight: torch.Size([1024, 1024])
    attention.self.{query,key,value}.bias: torch.Size([1024])
    attention.output.dense.weight: torch.Size([1024, 1024])
    attention.output.dense.bias: torch.Size([1024])
    attention.output.LayerNorm.{weight,bias}: torch.Size([1024])
    intermediate.dense.weight: torch.Size([4096, 1024])
    intermediate.dense.bias: torch.Size([4096])
    output.dense.weight: torch.Size([1024, 4096])
    output.dense.bias: torch.Size([1024])
    output.LayerNorm.{weight,bias}: torch.Size([1024])
>>> xlmr.pooler.dense.weight: torch.Size([1024, 1024])
>>> xlmr.pooler.dense.bias: torch.Size([1024])
>>> basic_gcn.{T_T,T_E,E_T,E_E}.{0,1,2}.weight: torch.Size([1024, 1024])
>>> basic_gcn.{T_T,T_E,E_T,E_E}.{0,1,2}.bias: torch.Size([1024])
>>> basic_gcn.f_t.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_t.0.bias: torch.Size([1024])
>>> basic_gcn.f_e.0.weight: torch.Size([1024, 2048])
>>> basic_gcn.f_e.0.bias: torch.Size([1024])
>>> name2classifier.<role>-ffn — one two-layer FFN head per role, each with:
    layers.0.weight: torch.Size([350, 1024])
    layers.0.bias: torch.Size([350])
    layers.1.weight: torch.Size([2, 350])
    layers.1.bias: torch.Size([2])
    roles: occupy, outcome, protest-event, when, where, who, protest-against, protest-for, organizer, wounded, arrested, killed, imprisoned, corrupt-event, judicial-actions, charged-with, prison-term, fine, npi-events, disease, infected-cumulative, killed-count, killed-cumulative, outbreak-event, infected-count, killed-individuals
    (log truncated mid-entry at name2classifier.killed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]))
name2classifier.killed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.killed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.killed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.infected-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.infected-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.infected-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.infected-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-count-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.tested-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.tested-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.tested-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.tested-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.tested-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.vaccinated-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.vaccinated-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.vaccinated-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.exposed-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.exposed-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.exposed-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.hospitalized-cumulative-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.hospitalized-cumulative-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.recovered-individuals-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.recovered-individuals-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.recovered-individuals-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.recovered-individuals-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blamed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blamed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blamed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blamed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.claimed-by-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.claimed-by-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.claimed-by-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.claimed-by-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.terror-event-ffn.layers.0.weight: 
torch.Size([350, 1024]) >>> name2classifier.terror-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.terror-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.terror-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.kidnapped-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.kidnapped-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.kidnapped-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.kidnapped-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-org-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-org-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-perp-org-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-org-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-physical-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-physical-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-physical-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-physical-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-perp-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-perp-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-perp-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-perp-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-killed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-killed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-killed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-killed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.target-human-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.target-human-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.target-human-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.target-human-ffn.layers.1.bias: torch.Size([2]) >>> 
name2classifier.perp-captured-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-captured-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-captured-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-captured-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perp-objective-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perp-objective-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perp-objective-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perp-objective-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.weapon-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.weapon-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.weapon-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.weapon-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.named-organizer-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.named-organizer-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.named-organizer-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.named-organizer-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.affected-cumulative-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.affected-cumulative-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.damage-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.damage-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.damage-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.damage-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-events-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.human-displacement-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.major-disaster-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.major-disaster-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.major-disaster-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.major-disaster-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-natural-phenomena-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-natural-phenomena-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.responders-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.responders-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.responders-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.responders-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-provided-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-provided-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-provided-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-provided-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescue-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescue-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescue-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescue-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.individuals-affected-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.individuals-affected-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.individuals-affected-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.individuals-affected-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.missing-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.missing-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.missing-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.missing-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.injured-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.injured-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.injured-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.injured-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.assistance-needed-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.assistance-needed-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.assistance-needed-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.assistance-needed-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.repair-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.repair-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.repair-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.repair-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.rescued-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.rescued-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.rescued-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.rescued-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.declare-emergency-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.declare-emergency-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.declare-emergency-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.declare-emergency-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.announce-disaster-warnings-ffn.layers.0.weight: torch.Size([350, 1024]) >>> 
name2classifier.announce-disaster-warnings-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.announce-disaster-warnings-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.disease-outbreak-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.disease-outbreak-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.disease-outbreak-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.current-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.current-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.current-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.current-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.event-or-soa-at-origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.group-identity-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.group-identity-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.group-identity-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.group-identity-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.human-displacement-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.human-displacement-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.human-displacement-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.human-displacement-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.origin-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.origin-ffn.layers.0.bias: torch.Size([350]) 
>>> name2classifier.origin-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.origin-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.total-displaced-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.total-displaced-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.total-displaced-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.total-displaced-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transitory-events-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transitory-events-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transitory-events-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transitory-events-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.destination-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.destination-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.destination-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.destination-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.transiting-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.transiting-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.transiting-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.transiting-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.settlement-status-event-or-soa-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.detained-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.detained-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.detained-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> 
name2classifier.detained-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.blocked-migration-count-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.blocked-migration-count-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.blocked-migration-count-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.cybercrime-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.cybercrime-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.cybercrime-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.cybercrime-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.perpetrator-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.perpetrator-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.perpetrator-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.perpetrator-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.victim-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.response-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.response-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.response-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.response-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.related-crimes-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.related-crimes-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.related-crimes-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.related-crimes-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.information-stolen-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.information-stolen-ffn.layers.0.bias: torch.Size([350]) >>> 
name2classifier.information-stolen-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.information-stolen-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.victim-impact-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.victim-impact-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.victim-impact-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.victim-impact-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.etip-event-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.etip-event-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.etip-event-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.etip-event-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-location-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-location-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-location-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-location-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.project-name-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.project-name-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.project-name-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.project-name-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.signatories-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.signatories-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.signatories-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.signatories-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awardee-ffn.layers.0.weight: torch.Size([350, 1024]) 
>>> name2classifier.contract-awardee-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awardee-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awardee-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.overall-project-value-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.overall-project-value-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.overall-project-value-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.overall-project-value-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-amount-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-amount-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-amount-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-amount-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-recipient-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-recipient-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-recipient-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-recipient-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.funding-source-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.funding-source-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.funding-source-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.funding-source-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.contract-awarder-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.contract-awarder-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.contract-awarder-ffn.layers.1.weight: torch.Size([2, 350]) >>> name2classifier.contract-awarder-ffn.layers.1.bias: torch.Size([2]) >>> name2classifier.agreement-length-ffn.layers.0.weight: torch.Size([350, 1024]) >>> name2classifier.agreement-length-ffn.layers.0.bias: torch.Size([350]) >>> name2classifier.agreement-length-ffn.layers.1.weight: torch.Size([2, 
350]) >>> name2classifier.agreement-length-ffn.layers.1.bias: torch.Size([2]) >>> irrealis_classifier.layers.0.weight: torch.Size([350, 1128]) >>> irrealis_classifier.layers.0.bias: torch.Size([350]) >>> irrealis_classifier.layers.1.weight: torch.Size([7, 350]) >>> irrealis_classifier.layers.1.bias: torch.Size([7]) n_trainable_params: 614103147, n_nontrainable_params: 0 ---------------------------------------------------------------------------------------------------- ****************************** Epoch: 0 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:19:19.704446: step: 2/466, loss: 13.73521900177002 2023-01-22 11:19:20.550500: step: 4/466, loss: 27.88758087158203 2023-01-22 11:19:21.310430: step: 6/466, loss: 15.083219528198242 2023-01-22 11:19:22.050128: step: 8/466, loss: 9.161754608154297 2023-01-22 11:19:22.796112: step: 10/466, loss: 6.564452171325684 2023-01-22 11:19:23.512819: step: 12/466, loss: 5.852215766906738 2023-01-22 11:19:24.240413: step: 14/466, loss: 18.493228912353516 2023-01-22 11:19:24.940047: step: 16/466, loss: 9.561614990234375 2023-01-22 11:19:25.702550: step: 18/466, loss: 16.085447311401367 2023-01-22 11:19:26.466440: step: 20/466, loss: 16.76175308227539 2023-01-22 11:19:27.154264: step: 22/466, loss: 14.106352806091309 2023-01-22 11:19:27.918526: step: 24/466, loss: 18.976318359375 2023-01-22 11:19:28.707850: step: 26/466, loss: 6.381458759307861 2023-01-22 11:19:29.632991: step: 28/466, loss: 39.91090393066406 2023-01-22 11:19:30.400386: step: 30/466, loss: 15.294017791748047 2023-01-22 11:19:31.155732: step: 32/466, loss: 16.546710968017578 2023-01-22 11:19:31.890691: step: 34/466, loss: 26.49373435974121 2023-01-22 11:19:32.645849: step: 36/466, loss: 9.152485847473145 2023-01-22 11:19:33.381828: step: 38/466, loss: 13.001274108886719 
2023-01-22 11:19:34.240179: step: 40/466, loss: 7.560708045959473
2023-01-22 11:19:34.960078: step: 42/466, loss: 12.648591041564941
2023-01-22 11:19:35.661390: step: 44/466, loss: 18.493995666503906
2023-01-22 11:19:36.381194: step: 46/466, loss: 11.046037673950195
2023-01-22 11:19:37.113343: step: 48/466, loss: 13.510764122009277
2023-01-22 11:19:37.915704: step: 50/466, loss: 28.686752319335938
2023-01-22 11:19:38.651490: step: 52/466, loss: 5.125367641448975
2023-01-22 11:19:39.422149: step: 54/466, loss: 22.069759368896484
2023-01-22 11:19:40.095138: step: 56/466, loss: 8.92529296875
2023-01-22 11:19:40.904959: step: 58/466, loss: 16.351173400878906
2023-01-22 11:19:41.618536: step: 60/466, loss: 18.710268020629883
2023-01-22 11:19:42.334086: step: 62/466, loss: 32.937156677246094
2023-01-22 11:19:43.029605: step: 64/466, loss: 5.1975226402282715
2023-01-22 11:19:43.910802: step: 66/466, loss: 11.405698776245117
2023-01-22 11:19:44.637735: step: 68/466, loss: 8.95192813873291
2023-01-22 11:19:45.446492: step: 70/466, loss: 22.55552864074707
2023-01-22 11:19:46.260708: step: 72/466, loss: 9.215524673461914
2023-01-22 11:19:47.048800: step: 74/466, loss: 16.336763381958008
2023-01-22 11:19:47.785813: step: 76/466, loss: 13.137796401977539
2023-01-22 11:19:48.536812: step: 78/466, loss: 10.040982246398926
2023-01-22 11:19:49.282422: step: 80/466, loss: 15.665406227111816
2023-01-22 11:19:50.067422: step: 82/466, loss: 9.019591331481934
2023-01-22 11:19:50.854922: step: 84/466, loss: 5.707324028015137
2023-01-22 11:19:51.499749: step: 86/466, loss: 7.152530670166016
2023-01-22 11:19:52.189071: step: 88/466, loss: 11.222397804260254
2023-01-22 11:19:52.966119: step: 90/466, loss: 23.643239974975586
2023-01-22 11:19:53.765941: step: 92/466, loss: 13.26652717590332
2023-01-22 11:19:54.522371: step: 94/466, loss: 5.377017021179199
2023-01-22 11:19:55.409806: step: 96/466, loss: 20.393341064453125
2023-01-22 11:19:56.218894: step: 98/466, loss: 28.030920028686523
2023-01-22 11:19:56.996668: step: 100/466, loss: 8.011981010437012
2023-01-22 11:19:57.752548: step: 102/466, loss: 22.163223266601562
2023-01-22 11:19:58.511340: step: 104/466, loss: 16.60025978088379
2023-01-22 11:19:59.360295: step: 106/466, loss: 11.448785781860352
2023-01-22 11:20:00.130413: step: 108/466, loss: 5.252057075500488
2023-01-22 11:20:00.895825: step: 110/466, loss: 10.07214641571045
2023-01-22 11:20:01.716422: step: 112/466, loss: 9.368322372436523
2023-01-22 11:20:02.510900: step: 114/466, loss: 17.051410675048828
2023-01-22 11:20:03.310218: step: 116/466, loss: 11.597021102905273
2023-01-22 11:20:04.161146: step: 118/466, loss: 14.312658309936523
2023-01-22 11:20:04.942415: step: 120/466, loss: 19.73943328857422
2023-01-22 11:20:05.766223: step: 122/466, loss: 8.307870864868164
2023-01-22 11:20:06.546404: step: 124/466, loss: 16.454423904418945
2023-01-22 11:20:07.354604: step: 126/466, loss: 10.20925235748291
2023-01-22 11:20:08.105503: step: 128/466, loss: 6.411805152893066
2023-01-22 11:20:08.838478: step: 130/466, loss: 3.820077419281006
2023-01-22 11:20:09.652508: step: 132/466, loss: 11.879013061523438
2023-01-22 11:20:10.401613: step: 134/466, loss: 19.770090103149414
2023-01-22 11:20:11.294634: step: 136/466, loss: 19.46176528930664
2023-01-22 11:20:12.065748: step: 138/466, loss: 12.031578063964844
2023-01-22 11:20:12.809141: step: 140/466, loss: 12.311153411865234
2023-01-22 11:20:13.627743: step: 142/466, loss: 7.866918087005615
2023-01-22 11:20:14.355586: step: 144/466, loss: 13.755508422851562
2023-01-22 11:20:15.101804: step: 146/466, loss: 8.731468200683594
2023-01-22 11:20:15.818212: step: 148/466, loss: 14.674220085144043
2023-01-22 11:20:16.556987: step: 150/466, loss: 11.923317909240723
2023-01-22 11:20:17.223759: step: 152/466, loss: 2.6268796920776367
2023-01-22 11:20:18.011719: step: 154/466, loss: 8.184459686279297
2023-01-22 11:20:18.776007: step: 156/466, loss: 10.321874618530273
2023-01-22 11:20:19.525268: step: 158/466, loss: 12.427175521850586
2023-01-22 11:20:20.273072: step: 160/466, loss: 7.840033531188965
2023-01-22 11:20:21.048873: step: 162/466, loss: 6.400445938110352
2023-01-22 11:20:21.813404: step: 164/466, loss: 13.40277099609375
2023-01-22 11:20:22.588968: step: 166/466, loss: 3.168619155883789
2023-01-22 11:20:23.379509: step: 168/466, loss: 11.076648712158203
2023-01-22 11:20:24.198879: step: 170/466, loss: 3.6342873573303223
2023-01-22 11:20:24.990721: step: 172/466, loss: 8.787984848022461
2023-01-22 11:20:25.719580: step: 174/466, loss: 7.403632640838623
2023-01-22 11:20:26.542718: step: 176/466, loss: 4.207934379577637
2023-01-22 11:20:27.262363: step: 178/466, loss: 3.733793020248413
2023-01-22 11:20:27.991009: step: 180/466, loss: 6.416247367858887
2023-01-22 11:20:28.771918: step: 182/466, loss: 8.14820671081543
2023-01-22 11:20:29.490037: step: 184/466, loss: 7.063889503479004
2023-01-22 11:20:30.221819: step: 186/466, loss: 17.30549430847168
2023-01-22 11:20:31.055567: step: 188/466, loss: 28.624164581298828
2023-01-22 11:20:31.931681: step: 190/466, loss: 3.90836763381958
2023-01-22 11:20:32.692405: step: 192/466, loss: 5.004888534545898
2023-01-22 11:20:33.452018: step: 194/466, loss: 3.8708887100219727
2023-01-22 11:20:34.218475: step: 196/466, loss: 7.665855407714844
2023-01-22 11:20:35.070343: step: 198/466, loss: 13.581562042236328
2023-01-22 11:20:35.824432: step: 200/466, loss: 11.876445770263672
2023-01-22 11:20:36.757425: step: 202/466, loss: 6.107644081115723
2023-01-22 11:20:37.496016: step: 204/466, loss: 5.8673577308654785
2023-01-22 11:20:38.208310: step: 206/466, loss: 9.80533218383789
2023-01-22 11:20:38.928670: step: 208/466, loss: 12.336395263671875
2023-01-22 11:20:39.646945: step: 210/466, loss: 12.121980667114258
2023-01-22 11:20:40.353262: step: 212/466, loss: 2.9984421730041504
2023-01-22 11:20:41.157129: step: 214/466, loss: 9.93939208984375
2023-01-22 11:20:42.012209: step: 216/466, loss: 10.636306762695312
2023-01-22 11:20:42.737354: step: 218/466, loss: 8.842659950256348
2023-01-22 11:20:43.541757: step: 220/466, loss: 11.319211959838867
2023-01-22 11:20:44.342501: step: 222/466, loss: 7.915955066680908
2023-01-22 11:20:45.053447: step: 224/466, loss: 4.226457118988037
2023-01-22 11:20:45.758147: step: 226/466, loss: 7.111764907836914
2023-01-22 11:20:46.436840: step: 228/466, loss: 3.8790555000305176
2023-01-22 11:20:47.283330: step: 230/466, loss: 5.482752799987793
2023-01-22 11:20:47.930929: step: 232/466, loss: 6.5940842628479
2023-01-22 11:20:48.748625: step: 234/466, loss: 4.292502403259277
2023-01-22 11:20:49.507778: step: 236/466, loss: 4.059828758239746
2023-01-22 11:20:50.312633: step: 238/466, loss: 13.06025505065918
2023-01-22 11:20:51.024029: step: 240/466, loss: 3.3042235374450684
2023-01-22 11:20:51.685469: step: 242/466, loss: 6.879997253417969
2023-01-22 11:20:52.485286: step: 244/466, loss: 6.937021732330322
2023-01-22 11:20:53.314431: step: 246/466, loss: 11.337417602539062
2023-01-22 11:20:54.013258: step: 248/466, loss: 1.7524123191833496
2023-01-22 11:20:54.739621: step: 250/466, loss: 2.4118971824645996
2023-01-22 11:20:55.520043: step: 252/466, loss: 9.450803756713867
2023-01-22 11:20:56.307790: step: 254/466, loss: 6.877956867218018
2023-01-22 11:20:57.062749: step: 256/466, loss: 6.180779457092285
2023-01-22 11:20:57.831186: step: 258/466, loss: 4.517580986022949
2023-01-22 11:20:58.691498: step: 260/466, loss: 7.087218761444092
2023-01-22 11:20:59.525777: step: 262/466, loss: 17.341150283813477
2023-01-22 11:21:00.391701: step: 264/466, loss: 4.7916436195373535
2023-01-22 11:21:01.180060: step: 266/466, loss: 1.7789692878723145
2023-01-22 11:21:02.015231: step: 268/466, loss: 10.851947784423828
2023-01-22 11:21:02.778998: step: 270/466, loss: 2.1357803344726562
2023-01-22 11:21:03.532178: step: 272/466, loss: 11.99293327331543
2023-01-22 11:21:04.308451: step: 274/466, loss: 6.715605735778809
2023-01-22 11:21:05.133025: step: 276/466, loss: 
9.224824905395508 2023-01-22 11:21:05.918941: step: 278/466, loss: 6.91132926940918 2023-01-22 11:21:06.661073: step: 280/466, loss: 7.182122230529785 2023-01-22 11:21:07.434779: step: 282/466, loss: 3.719740390777588 2023-01-22 11:21:08.234562: step: 284/466, loss: 5.512752532958984 2023-01-22 11:21:08.986126: step: 286/466, loss: 4.903224468231201 2023-01-22 11:21:09.669401: step: 288/466, loss: 8.032452583312988 2023-01-22 11:21:10.428837: step: 290/466, loss: 6.836153507232666 2023-01-22 11:21:11.242388: step: 292/466, loss: 3.4128684997558594 2023-01-22 11:21:12.234023: step: 294/466, loss: 4.554594039916992 2023-01-22 11:21:12.981271: step: 296/466, loss: 3.2199997901916504 2023-01-22 11:21:13.798260: step: 298/466, loss: 5.25563907623291 2023-01-22 11:21:14.615153: step: 300/466, loss: 5.8102850914001465 2023-01-22 11:21:15.529252: step: 302/466, loss: 14.86589241027832 2023-01-22 11:21:16.210330: step: 304/466, loss: 7.137749195098877 2023-01-22 11:21:16.946865: step: 306/466, loss: 9.122910499572754 2023-01-22 11:21:17.769121: step: 308/466, loss: 10.541033744812012 2023-01-22 11:21:18.499227: step: 310/466, loss: 7.755885601043701 2023-01-22 11:21:19.276951: step: 312/466, loss: 4.080110549926758 2023-01-22 11:21:19.937982: step: 314/466, loss: 6.619048118591309 2023-01-22 11:21:20.749559: step: 316/466, loss: 2.064182996749878 2023-01-22 11:21:21.439989: step: 318/466, loss: 1.4975166320800781 2023-01-22 11:21:22.187100: step: 320/466, loss: 8.109560012817383 2023-01-22 11:21:22.959143: step: 322/466, loss: 9.021079063415527 2023-01-22 11:21:23.777181: step: 324/466, loss: 3.7328691482543945 2023-01-22 11:21:24.517188: step: 326/466, loss: 8.172685623168945 2023-01-22 11:21:25.284873: step: 328/466, loss: 3.3030261993408203 2023-01-22 11:21:26.077066: step: 330/466, loss: 4.315179824829102 2023-01-22 11:21:26.798852: step: 332/466, loss: 8.02114200592041 2023-01-22 11:21:27.640992: step: 334/466, loss: 14.793797492980957 2023-01-22 11:21:28.339008: step: 
336/466, loss: 3.144601821899414 2023-01-22 11:21:29.142790: step: 338/466, loss: 16.421733856201172 2023-01-22 11:21:29.947010: step: 340/466, loss: 7.852216720581055 2023-01-22 11:21:30.633714: step: 342/466, loss: 5.317742347717285 2023-01-22 11:21:31.376740: step: 344/466, loss: 3.8623104095458984 2023-01-22 11:21:32.164651: step: 346/466, loss: 13.985149383544922 2023-01-22 11:21:32.866126: step: 348/466, loss: 5.1254119873046875 2023-01-22 11:21:33.625983: step: 350/466, loss: 2.3744070529937744 2023-01-22 11:21:34.432492: step: 352/466, loss: 2.442744016647339 2023-01-22 11:21:35.173469: step: 354/466, loss: 9.244939804077148 2023-01-22 11:21:35.911778: step: 356/466, loss: 4.649564266204834 2023-01-22 11:21:36.674965: step: 358/466, loss: 9.64559555053711 2023-01-22 11:21:37.530315: step: 360/466, loss: 8.238361358642578 2023-01-22 11:21:38.215937: step: 362/466, loss: 8.38036823272705 2023-01-22 11:21:39.012598: step: 364/466, loss: 2.4607577323913574 2023-01-22 11:21:39.771289: step: 366/466, loss: 2.487901210784912 2023-01-22 11:21:40.571212: step: 368/466, loss: 9.80903148651123 2023-01-22 11:21:41.351250: step: 370/466, loss: 4.288197040557861 2023-01-22 11:21:42.091840: step: 372/466, loss: 8.693002700805664 2023-01-22 11:21:42.932300: step: 374/466, loss: 7.432981491088867 2023-01-22 11:21:43.743484: step: 376/466, loss: 4.512446403503418 2023-01-22 11:21:44.601543: step: 378/466, loss: 5.332370758056641 2023-01-22 11:21:45.401965: step: 380/466, loss: 2.3127150535583496 2023-01-22 11:21:46.188600: step: 382/466, loss: 3.342420816421509 2023-01-22 11:21:46.986285: step: 384/466, loss: 5.061471939086914 2023-01-22 11:21:47.695320: step: 386/466, loss: 3.820502996444702 2023-01-22 11:21:48.506509: step: 388/466, loss: 7.286428451538086 2023-01-22 11:21:49.310558: step: 390/466, loss: 3.637666702270508 2023-01-22 11:21:50.023601: step: 392/466, loss: 8.565852165222168 2023-01-22 11:21:50.744987: step: 394/466, loss: 3.7479686737060547 2023-01-22 
11:21:51.466477: step: 396/466, loss: 2.165762424468994 2023-01-22 11:21:52.241361: step: 398/466, loss: 8.176563262939453 2023-01-22 11:21:52.990933: step: 400/466, loss: 4.044155120849609 2023-01-22 11:21:53.747367: step: 402/466, loss: 4.002263069152832 2023-01-22 11:21:54.448553: step: 404/466, loss: 7.4720869064331055 2023-01-22 11:21:55.191052: step: 406/466, loss: 3.0752673149108887 2023-01-22 11:21:55.938514: step: 408/466, loss: 11.876520156860352 2023-01-22 11:21:56.799528: step: 410/466, loss: 1.0020477771759033 2023-01-22 11:21:57.567259: step: 412/466, loss: 2.2785210609436035 2023-01-22 11:21:58.315860: step: 414/466, loss: 2.8014371395111084 2023-01-22 11:21:59.048969: step: 416/466, loss: 4.996395111083984 2023-01-22 11:21:59.800595: step: 418/466, loss: 3.1675033569335938 2023-01-22 11:22:00.613818: step: 420/466, loss: 2.1907224655151367 2023-01-22 11:22:01.326641: step: 422/466, loss: 3.5931363105773926 2023-01-22 11:22:02.156661: step: 424/466, loss: 1.7587145566940308 2023-01-22 11:22:02.892342: step: 426/466, loss: 1.4456634521484375 2023-01-22 11:22:03.712929: step: 428/466, loss: 0.6929149627685547 2023-01-22 11:22:04.535080: step: 430/466, loss: 1.4032700061798096 2023-01-22 11:22:05.303194: step: 432/466, loss: 2.134281635284424 2023-01-22 11:22:06.061673: step: 434/466, loss: 2.9200680255889893 2023-01-22 11:22:06.804000: step: 436/466, loss: 6.406781196594238 2023-01-22 11:22:07.600764: step: 438/466, loss: 1.358239769935608 2023-01-22 11:22:08.453767: step: 440/466, loss: 2.697464942932129 2023-01-22 11:22:09.281499: step: 442/466, loss: 3.9414467811584473 2023-01-22 11:22:10.007504: step: 444/466, loss: 1.3173900842666626 2023-01-22 11:22:10.698885: step: 446/466, loss: 2.6215689182281494 2023-01-22 11:22:11.501212: step: 448/466, loss: 7.385811805725098 2023-01-22 11:22:12.252907: step: 450/466, loss: 2.411505699157715 2023-01-22 11:22:13.054931: step: 452/466, loss: 1.800626516342163 2023-01-22 11:22:13.818337: step: 454/466, loss: 
3.886320114135742 2023-01-22 11:22:14.658117: step: 456/466, loss: 1.4516246318817139 2023-01-22 11:22:15.385991: step: 458/466, loss: 2.2257678508758545 2023-01-22 11:22:16.170687: step: 460/466, loss: 2.3415117263793945 2023-01-22 11:22:16.885230: step: 462/466, loss: 1.9631130695343018 2023-01-22 11:22:17.611745: step: 464/466, loss: 4.780514240264893 2023-01-22 11:22:18.311029: step: 466/466, loss: 3.7143073081970215 2023-01-22 11:22:19.030540: step: 468/466, loss: 4.045224666595459 2023-01-22 11:22:19.759047: step: 470/466, loss: 3.0051512718200684 2023-01-22 11:22:20.648303: step: 472/466, loss: 5.037962913513184 2023-01-22 11:22:21.496024: step: 474/466, loss: 3.1078267097473145 2023-01-22 11:22:22.347521: step: 476/466, loss: 3.585723876953125 2023-01-22 11:22:23.125128: step: 478/466, loss: 1.0921087265014648 2023-01-22 11:22:23.837974: step: 480/466, loss: 6.455872535705566 2023-01-22 11:22:24.595067: step: 482/466, loss: 5.039011001586914 2023-01-22 11:22:25.431957: step: 484/466, loss: 5.26975154876709 2023-01-22 11:22:26.393791: step: 486/466, loss: 2.4059643745422363 2023-01-22 11:22:27.121601: step: 488/466, loss: 5.975546836853027 2023-01-22 11:22:27.973404: step: 490/466, loss: 0.2733404040336609 2023-01-22 11:22:28.745303: step: 492/466, loss: 2.014416217803955 2023-01-22 11:22:29.521456: step: 494/466, loss: 1.8301392793655396 2023-01-22 11:22:30.314021: step: 496/466, loss: 1.2460603713989258 2023-01-22 11:22:31.053981: step: 498/466, loss: 3.2944400310516357 2023-01-22 11:22:31.806087: step: 500/466, loss: 2.0257420539855957 2023-01-22 11:22:32.589944: step: 502/466, loss: 6.985259532928467 2023-01-22 11:22:33.346987: step: 504/466, loss: 1.8203388452529907 2023-01-22 11:22:34.040558: step: 506/466, loss: 1.344951868057251 2023-01-22 11:22:34.850566: step: 508/466, loss: 3.0250954627990723 2023-01-22 11:22:35.594955: step: 510/466, loss: 4.147697448730469 2023-01-22 11:22:36.354784: step: 512/466, loss: 1.4706324338912964 2023-01-22 
11:22:37.165515: step: 514/466, loss: 2.5658233165740967 2023-01-22 11:22:37.912867: step: 516/466, loss: 2.607825517654419 2023-01-22 11:22:38.627180: step: 518/466, loss: 1.3773353099822998 2023-01-22 11:22:39.439287: step: 520/466, loss: 12.196202278137207 2023-01-22 11:22:40.281336: step: 522/466, loss: 1.900867223739624 2023-01-22 11:22:41.025767: step: 524/466, loss: 1.8195151090621948 2023-01-22 11:22:41.767833: step: 526/466, loss: 0.8975844383239746 2023-01-22 11:22:42.484087: step: 528/466, loss: 2.051081657409668 2023-01-22 11:22:43.263830: step: 530/466, loss: 7.602580547332764 2023-01-22 11:22:44.025363: step: 532/466, loss: 1.4406651258468628 2023-01-22 11:22:44.978464: step: 534/466, loss: 2.3649492263793945 2023-01-22 11:22:45.723745: step: 536/466, loss: 1.9562591314315796 2023-01-22 11:22:46.490262: step: 538/466, loss: 3.06107497215271 2023-01-22 11:22:47.304645: step: 540/466, loss: 2.2868266105651855 2023-01-22 11:22:48.075821: step: 542/466, loss: 5.655248641967773 2023-01-22 11:22:48.789055: step: 544/466, loss: 3.627987861633301 2023-01-22 11:22:49.676489: step: 546/466, loss: 3.139216423034668 2023-01-22 11:22:50.558470: step: 548/466, loss: 2.9110352993011475 2023-01-22 11:22:51.371146: step: 550/466, loss: 0.9010717868804932 2023-01-22 11:22:52.112782: step: 552/466, loss: 1.010190725326538 2023-01-22 11:22:52.916357: step: 554/466, loss: 17.52189064025879 2023-01-22 11:22:53.660351: step: 556/466, loss: 0.635991096496582 2023-01-22 11:22:54.362522: step: 558/466, loss: 2.4419195652008057 2023-01-22 11:22:55.067991: step: 560/466, loss: 2.1627860069274902 2023-01-22 11:22:55.912338: step: 562/466, loss: 1.6352651119232178 2023-01-22 11:22:56.677984: step: 564/466, loss: 1.4795622825622559 2023-01-22 11:22:57.482286: step: 566/466, loss: 1.9066089391708374 2023-01-22 11:22:58.256063: step: 568/466, loss: 15.391820907592773 2023-01-22 11:22:58.983862: step: 570/466, loss: 6.963404655456543 2023-01-22 11:22:59.748185: step: 572/466, loss: 
12.239256858825684 2023-01-22 11:23:00.486836: step: 574/466, loss: 4.118880748748779 2023-01-22 11:23:01.254737: step: 576/466, loss: 5.32997989654541 2023-01-22 11:23:02.052502: step: 578/466, loss: 2.3911309242248535 2023-01-22 11:23:02.892021: step: 580/466, loss: 2.738361120223999 2023-01-22 11:23:03.559077: step: 582/466, loss: 3.573793888092041 2023-01-22 11:23:04.402075: step: 584/466, loss: 0.8956894874572754 2023-01-22 11:23:05.130130: step: 586/466, loss: 1.1735498905181885 2023-01-22 11:23:05.923299: step: 588/466, loss: 1.8540666103363037 2023-01-22 11:23:06.692443: step: 590/466, loss: 3.0030360221862793 2023-01-22 11:23:07.491899: step: 592/466, loss: 0.8821673393249512 2023-01-22 11:23:08.249722: step: 594/466, loss: 1.834218978881836 2023-01-22 11:23:09.013383: step: 596/466, loss: 4.334569931030273 2023-01-22 11:23:09.818566: step: 598/466, loss: 1.5742497444152832 2023-01-22 11:23:10.603955: step: 600/466, loss: 1.7871942520141602 2023-01-22 11:23:11.349932: step: 602/466, loss: 1.2830783128738403 2023-01-22 11:23:12.104668: step: 604/466, loss: 3.2601969242095947 2023-01-22 11:23:12.840715: step: 606/466, loss: 6.348970413208008 2023-01-22 11:23:13.567062: step: 608/466, loss: 4.323052883148193 2023-01-22 11:23:14.398188: step: 610/466, loss: 5.402087211608887 2023-01-22 11:23:15.105667: step: 612/466, loss: 6.45858097076416 2023-01-22 11:23:15.937391: step: 614/466, loss: 2.569106340408325 2023-01-22 11:23:16.724668: step: 616/466, loss: 0.705042839050293 2023-01-22 11:23:17.526329: step: 618/466, loss: 4.169329643249512 2023-01-22 11:23:18.275723: step: 620/466, loss: 1.1026544570922852 2023-01-22 11:23:19.099990: step: 622/466, loss: 2.5629806518554688 2023-01-22 11:23:19.887253: step: 624/466, loss: 1.0137592554092407 2023-01-22 11:23:20.637013: step: 626/466, loss: 0.3781975209712982 2023-01-22 11:23:21.498103: step: 628/466, loss: 4.098159313201904 2023-01-22 11:23:22.238160: step: 630/466, loss: 0.9161981344223022 2023-01-22 
11:23:22.993704: step: 632/466, loss: 1.3645033836364746 2023-01-22 11:23:23.723430: step: 634/466, loss: 3.8741254806518555 2023-01-22 11:23:24.467533: step: 636/466, loss: 1.3426021337509155 2023-01-22 11:23:25.218449: step: 638/466, loss: 1.9753289222717285 2023-01-22 11:23:26.003133: step: 640/466, loss: 6.594760894775391 2023-01-22 11:23:26.772605: step: 642/466, loss: 0.7493297457695007 2023-01-22 11:23:27.516120: step: 644/466, loss: 0.8289147615432739 2023-01-22 11:23:28.430038: step: 646/466, loss: 4.53216552734375 2023-01-22 11:23:29.151963: step: 648/466, loss: 2.303760051727295 2023-01-22 11:23:29.858256: step: 650/466, loss: 0.7914081811904907 2023-01-22 11:23:30.598622: step: 652/466, loss: 1.3922420740127563 2023-01-22 11:23:31.290879: step: 654/466, loss: 3.121603488922119 2023-01-22 11:23:32.079870: step: 656/466, loss: 5.185492515563965 2023-01-22 11:23:32.860536: step: 658/466, loss: 1.9905723333358765 2023-01-22 11:23:33.672650: step: 660/466, loss: 4.122713088989258 2023-01-22 11:23:34.409707: step: 662/466, loss: 0.9444369673728943 2023-01-22 11:23:35.267460: step: 664/466, loss: 6.511009693145752 2023-01-22 11:23:36.022243: step: 666/466, loss: 1.7498726844787598 2023-01-22 11:23:36.761167: step: 668/466, loss: 4.1698527336120605 2023-01-22 11:23:37.627010: step: 670/466, loss: 2.089411735534668 2023-01-22 11:23:38.512554: step: 672/466, loss: 3.105118751525879 2023-01-22 11:23:39.287075: step: 674/466, loss: 1.3752529621124268 2023-01-22 11:23:40.040504: step: 676/466, loss: 1.345578670501709 2023-01-22 11:23:40.875381: step: 678/466, loss: 3.6869633197784424 2023-01-22 11:23:41.684637: step: 680/466, loss: 1.509131669998169 2023-01-22 11:23:42.448913: step: 682/466, loss: 1.0781230926513672 2023-01-22 11:23:43.173464: step: 684/466, loss: 1.3234790563583374 2023-01-22 11:23:43.974345: step: 686/466, loss: 2.1727919578552246 2023-01-22 11:23:44.725583: step: 688/466, loss: 4.946380138397217 2023-01-22 11:23:45.524793: step: 690/466, loss: 
1.8786697387695312 2023-01-22 11:23:46.262772: step: 692/466, loss: 6.498932838439941 2023-01-22 11:23:47.065484: step: 694/466, loss: 6.784847259521484 2023-01-22 11:23:47.791819: step: 696/466, loss: 2.694711923599243 2023-01-22 11:23:48.576735: step: 698/466, loss: 11.929954528808594 2023-01-22 11:23:49.410301: step: 700/466, loss: 2.7109384536743164 2023-01-22 11:23:50.252199: step: 702/466, loss: 5.202466011047363 2023-01-22 11:23:51.022129: step: 704/466, loss: 3.2965590953826904 2023-01-22 11:23:51.828818: step: 706/466, loss: 1.7430288791656494 2023-01-22 11:23:52.568002: step: 708/466, loss: 3.1917104721069336 2023-01-22 11:23:53.271321: step: 710/466, loss: 0.3418167531490326 2023-01-22 11:23:53.985178: step: 712/466, loss: 1.3266034126281738 2023-01-22 11:23:54.788007: step: 714/466, loss: 2.886962652206421 2023-01-22 11:23:55.446993: step: 716/466, loss: 5.965574741363525 2023-01-22 11:23:56.249607: step: 718/466, loss: 8.379350662231445 2023-01-22 11:23:57.033631: step: 720/466, loss: 2.424729108810425 2023-01-22 11:23:57.697787: step: 722/466, loss: 4.797998428344727 2023-01-22 11:23:58.474192: step: 724/466, loss: 0.6181595325469971 2023-01-22 11:23:59.257010: step: 726/466, loss: 6.5584869384765625 2023-01-22 11:24:00.027248: step: 728/466, loss: 2.542454242706299 2023-01-22 11:24:00.780822: step: 730/466, loss: 4.692009449005127 2023-01-22 11:24:01.510120: step: 732/466, loss: 4.815932750701904 2023-01-22 11:24:02.286401: step: 734/466, loss: 1.2393872737884521 2023-01-22 11:24:03.014815: step: 736/466, loss: 1.4991346597671509 2023-01-22 11:24:03.723935: step: 738/466, loss: 1.4453717470169067 2023-01-22 11:24:04.528851: step: 740/466, loss: 5.386700630187988 2023-01-22 11:24:05.214651: step: 742/466, loss: 1.9336085319519043 2023-01-22 11:24:05.976561: step: 744/466, loss: 0.4682157039642334 2023-01-22 11:24:06.774424: step: 746/466, loss: 0.6913096904754639 2023-01-22 11:24:07.516569: step: 748/466, loss: 1.037575125694275 2023-01-22 
11:24:08.323861: step: 750/466, loss: 7.556648254394531 2023-01-22 11:24:09.087601: step: 752/466, loss: 2.2385308742523193 2023-01-22 11:24:09.845899: step: 754/466, loss: 1.7686642408370972 2023-01-22 11:24:10.585597: step: 756/466, loss: 1.0103527307510376 2023-01-22 11:24:11.310248: step: 758/466, loss: 1.3823200464248657 2023-01-22 11:24:12.069325: step: 760/466, loss: 5.599819183349609 2023-01-22 11:24:12.811234: step: 762/466, loss: 3.594778060913086 2023-01-22 11:24:13.580893: step: 764/466, loss: 1.2553966045379639 2023-01-22 11:24:14.367117: step: 766/466, loss: 0.6590578556060791 2023-01-22 11:24:15.166852: step: 768/466, loss: 1.9717657566070557 2023-01-22 11:24:15.880507: step: 770/466, loss: 1.6747913360595703 2023-01-22 11:24:16.657093: step: 772/466, loss: 1.52985680103302 2023-01-22 11:24:17.427823: step: 774/466, loss: 10.834771156311035 2023-01-22 11:24:18.194351: step: 776/466, loss: 0.7611805200576782 2023-01-22 11:24:18.971897: step: 778/466, loss: 1.46249258518219 2023-01-22 11:24:19.709181: step: 780/466, loss: 0.8268005847930908 2023-01-22 11:24:20.515570: step: 782/466, loss: 3.1205601692199707 2023-01-22 11:24:21.232129: step: 784/466, loss: 0.8482595682144165 2023-01-22 11:24:22.086488: step: 786/466, loss: 16.482038497924805 2023-01-22 11:24:22.818331: step: 788/466, loss: 0.9325035214424133 2023-01-22 11:24:23.609395: step: 790/466, loss: 2.5457522869110107 2023-01-22 11:24:24.327760: step: 792/466, loss: 1.3102258443832397 2023-01-22 11:24:25.092626: step: 794/466, loss: 2.163250207901001 2023-01-22 11:24:25.794564: step: 796/466, loss: 1.1271615028381348 2023-01-22 11:24:26.505529: step: 798/466, loss: 3.783787250518799 2023-01-22 11:24:27.372982: step: 800/466, loss: 2.997448205947876 2023-01-22 11:24:28.038610: step: 802/466, loss: 1.2927404642105103 2023-01-22 11:24:28.840850: step: 804/466, loss: 2.083723545074463 2023-01-22 11:24:29.703292: step: 806/466, loss: 1.656034231185913 2023-01-22 11:24:30.473988: step: 808/466, loss: 
5.2302069664001465 2023-01-22 11:24:31.225272: step: 810/466, loss: 2.1514246463775635 2023-01-22 11:24:31.995145: step: 812/466, loss: 0.6463648080825806 2023-01-22 11:24:32.726801: step: 814/466, loss: 1.2086327075958252 2023-01-22 11:24:33.491198: step: 816/466, loss: 1.9709173440933228 2023-01-22 11:24:34.322103: step: 818/466, loss: 3.217388153076172 2023-01-22 11:24:35.140503: step: 820/466, loss: 1.2259535789489746 2023-01-22 11:24:35.823198: step: 822/466, loss: 3.5686111450195312 2023-01-22 11:24:36.660012: step: 824/466, loss: 1.0073504447937012 2023-01-22 11:24:37.431583: step: 826/466, loss: 0.7077375054359436 2023-01-22 11:24:38.235948: step: 828/466, loss: 1.840885877609253 2023-01-22 11:24:38.942269: step: 830/466, loss: 2.9541375637054443 2023-01-22 11:24:39.700959: step: 832/466, loss: 0.7959054708480835 2023-01-22 11:24:40.398027: step: 834/466, loss: 1.8715145587921143 2023-01-22 11:24:41.208844: step: 836/466, loss: 4.219241142272949 2023-01-22 11:24:41.990830: step: 838/466, loss: 3.6296257972717285 2023-01-22 11:24:42.778259: step: 840/466, loss: 0.6891568303108215 2023-01-22 11:24:43.487315: step: 842/466, loss: 1.4817897081375122 2023-01-22 11:24:44.246024: step: 844/466, loss: 2.252312421798706 2023-01-22 11:24:45.003765: step: 846/466, loss: 0.2936214804649353 2023-01-22 11:24:45.822950: step: 848/466, loss: 2.450155735015869 2023-01-22 11:24:46.678100: step: 850/466, loss: 0.9076513051986694 2023-01-22 11:24:47.478699: step: 852/466, loss: 1.6361433267593384 2023-01-22 11:24:48.303749: step: 854/466, loss: 1.4986391067504883 2023-01-22 11:24:49.096168: step: 856/466, loss: 0.5485374331474304 2023-01-22 11:24:49.879125: step: 858/466, loss: 1.7036077976226807 2023-01-22 11:24:50.609810: step: 860/466, loss: 4.413128852844238 2023-01-22 11:24:51.333704: step: 862/466, loss: 2.4625015258789062 2023-01-22 11:24:52.068127: step: 864/466, loss: 0.6636428236961365 2023-01-22 11:24:52.828500: step: 866/466, loss: 6.692194938659668 2023-01-22 
11:24:53.545880: step: 868/466, loss: 7.346501350402832 2023-01-22 11:24:54.279682: step: 870/466, loss: 0.8011118769645691 2023-01-22 11:24:54.993697: step: 872/466, loss: 1.8357152938842773 2023-01-22 11:24:55.888168: step: 874/466, loss: 1.7394757270812988 2023-01-22 11:24:56.678482: step: 876/466, loss: 2.2443785667419434 2023-01-22 11:24:57.419420: step: 878/466, loss: 0.49604469537734985 2023-01-22 11:24:58.114437: step: 880/466, loss: 5.444522857666016 2023-01-22 11:24:58.902257: step: 882/466, loss: 0.5533512234687805 2023-01-22 11:24:59.732459: step: 884/466, loss: 0.8059678077697754 2023-01-22 11:25:00.491406: step: 886/466, loss: 2.289682626724243 2023-01-22 11:25:01.312687: step: 888/466, loss: 1.675082802772522 2023-01-22 11:25:02.141753: step: 890/466, loss: 1.1696981191635132 2023-01-22 11:25:02.924007: step: 892/466, loss: 2.2912344932556152 2023-01-22 11:25:03.693862: step: 894/466, loss: 5.5102949142456055 2023-01-22 11:25:04.406702: step: 896/466, loss: 1.666417121887207 2023-01-22 11:25:05.123514: step: 898/466, loss: 1.7959926128387451 2023-01-22 11:25:05.853205: step: 900/466, loss: 1.474005937576294 2023-01-22 11:25:06.592099: step: 902/466, loss: 1.0745131969451904 2023-01-22 11:25:07.418055: step: 904/466, loss: 1.2037309408187866 2023-01-22 11:25:08.187442: step: 906/466, loss: 1.165777564048767 2023-01-22 11:25:09.003722: step: 908/466, loss: 1.3042948246002197 2023-01-22 11:25:09.855219: step: 910/466, loss: 1.2583338022232056 2023-01-22 11:25:10.631940: step: 912/466, loss: 4.980291366577148 2023-01-22 11:25:11.439668: step: 914/466, loss: 0.7476434707641602 2023-01-22 11:25:12.188735: step: 916/466, loss: 3.4122965335845947 2023-01-22 11:25:12.980977: step: 918/466, loss: 0.9060400724411011 2023-01-22 11:25:13.778251: step: 920/466, loss: 5.214575290679932 2023-01-22 11:25:14.595102: step: 922/466, loss: 5.202361106872559 2023-01-22 11:25:15.363465: step: 924/466, loss: 2.5000698566436768 2023-01-22 11:25:16.202296: step: 926/466, 
loss: 3.709892988204956
2023-01-22 11:25:16.957217: step: 928/466, loss: 1.2923870086669922
2023-01-22 11:25:17.651639: step: 930/466, loss: 5.998931407928467
2023-01-22 11:25:18.470789: step: 932/466, loss: 4.255189895629883
==================================================
Loss: 5.868
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4245801033591731, 'r': 0.10353654694391935, 'f1': 0.16647669706180343}, 'combined': 0.12266703994027621, 'epoch': 0}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33713319907940165, 'r': 0.06893923052788455, 'f1': 0.11447072805418079}, 'combined': 0.07035761821866722, 'epoch': 0}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.459040346907994, 'r': 0.10152688201934838, 'f1': 0.16627777271899472}, 'combined': 0.12252046410873293, 'epoch': 0}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33204234730113635, 'r': 0.07334153659110518, 'f1': 0.1201453581753935}, 'combined': 0.07384543965902235, 'epoch': 0}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.44752358490566035, 'r': 0.08967391304347826, 'f1': 0.14940944881889762}, 'combined': 0.11009117281392455, 'epoch': 0}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34066256574004505, 'r': 0.0711305788689309, 'f1': 0.11768785283239244}, 'combined': 0.07268955616118358, 'epoch': 0}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.6666666666666666, 'r': 0.05714285714285714, 'f1': 0.10526315789473684}, 'combined': 0.07017543859649122, 'epoch': 0}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4245801033591731, 'r': 0.10353654694391935, 'f1': 0.16647669706180343}, 'combined': 0.12266703994027621, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33713319907940165, 'r': 0.06893923052788455, 'f1': 0.11447072805418079}, 'combined': 0.07035761821866722, 'epoch': 0}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.6666666666666666, 'r': 0.05714285714285714, 'f1': 0.10526315789473684}, 'combined': 0.07017543859649122, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.459040346907994, 'r': 0.10152688201934838, 'f1': 0.16627777271899472}, 'combined': 0.12252046410873293, 'epoch': 0}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33204234730113635, 'r': 0.07334153659110518, 'f1': 0.1201453581753935}, 'combined': 0.07384543965902235, 'epoch': 0}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.44752358490566035, 'r': 0.08967391304347826, 'f1': 0.14940944881889762}, 'combined': 0.11009117281392455, 'epoch': 0}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34066256574004505, 'r': 0.0711305788689309, 'f1': 0.11768785283239244}, 'combined': 0.07268955616118358, 'epoch': 0}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0}
******************************
Epoch: 1
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 11:28:27.159320: step: 2/466, loss: 9.140667915344238
2023-01-22 11:28:27.981729: step: 4/466, loss: 2.162443161010742
2023-01-22 11:28:28.719903: step: 6/466, loss: 2.352289915084839
2023-01-22 11:28:29.548321: step: 8/466, loss: 3.6142661571502686
2023-01-22 11:28:30.587218: step: 10/466, loss: 3.4739856719970703
2023-01-22 11:28:31.421583: step: 12/466, loss: 2.771876096725464
2023-01-22 11:28:32.191819: step: 14/466, loss: 1.1560986042022705
2023-01-22 11:28:32.944951: step: 16/466, loss: 6.923826694488525
2023-01-22 11:28:33.724583: step: 18/466, loss: 7.029289245605469
2023-01-22 11:28:34.495430: step: 20/466, loss: 3.2694921493530273
2023-01-22 11:28:35.314699: step: 22/466, loss: 2.314699649810791
2023-01-22 11:28:36.066717: step: 24/466, loss: 0.4758274555206299
2023-01-22 11:28:36.845212: step: 26/466, loss: 1.258528232574463
2023-01-22 11:28:37.671890: step: 28/466, loss: 1.7560293674468994
2023-01-22 11:28:38.422274: step: 30/466, loss: 0.5054268836975098
2023-01-22 11:28:39.170673: step: 32/466, loss: 1.688248634338379
2023-01-22 11:28:39.954692: step: 34/466, loss: 1.5769336223602295
2023-01-22 11:28:40.631749: step: 36/466, loss: 0.1769465059041977
2023-01-22 11:28:41.380254: step: 38/466, loss: 0.976089596748352
2023-01-22 11:28:42.122850: step: 40/466, loss: 2.1016931533813477
2023-01-22 11:28:42.815247: step:
42/466, loss: 4.128500461578369
2023-01-22 11:28:43.688087: step: 44/466, loss: 1.8174368143081665
2023-01-22 11:28:44.444305: step: 46/466, loss: 0.5061911940574646
2023-01-22 11:28:45.173156: step: 48/466, loss: 2.9876365661621094
2023-01-22 11:28:45.883161: step: 50/466, loss: 0.5157006978988647
2023-01-22 11:28:46.670220: step: 52/466, loss: 1.705984354019165
2023-01-22 11:28:47.459114: step: 54/466, loss: 0.3926768898963928
2023-01-22 11:28:48.178462: step: 56/466, loss: 1.8835524320602417
2023-01-22 11:28:48.990208: step: 58/466, loss: 2.549036979675293
2023-01-22 11:28:49.832985: step: 60/466, loss: 1.1022535562515259
2023-01-22 11:28:50.560015: step: 62/466, loss: 6.8992533683776855
2023-01-22 11:28:51.342198: step: 64/466, loss: 2.8540728092193604
2023-01-22 11:28:52.102947: step: 66/466, loss: 0.3880598545074463
2023-01-22 11:28:52.921577: step: 68/466, loss: 4.757377624511719
2023-01-22 11:28:53.647025: step: 70/466, loss: 1.5769944190979004
2023-01-22 11:28:54.407379: step: 72/466, loss: 2.267717123031616
2023-01-22 11:28:55.235385: step: 74/466, loss: 3.2595067024230957
2023-01-22 11:28:55.938116: step: 76/466, loss: 0.6588865518569946
2023-01-22 11:28:56.756718: step: 78/466, loss: 4.336224555969238
2023-01-22 11:28:57.473504: step: 80/466, loss: 1.129730224609375
2023-01-22 11:28:58.170186: step: 82/466, loss: 0.5617114901542664
2023-01-22 11:28:58.925358: step: 84/466, loss: 2.150784730911255
2023-01-22 11:28:59.660643: step: 86/466, loss: 1.9347360134124756
2023-01-22 11:29:00.408780: step: 88/466, loss: 1.529724359512329
2023-01-22 11:29:01.148277: step: 90/466, loss: 2.5717315673828125
2023-01-22 11:29:01.947893: step: 92/466, loss: 1.7477091550827026
2023-01-22 11:29:02.742663: step: 94/466, loss: 0.5505824089050293
2023-01-22 11:29:03.483823: step: 96/466, loss: 2.1601786613464355
2023-01-22 11:29:04.184879: step: 98/466, loss: 0.8771464228630066
2023-01-22 11:29:04.956883: step: 100/466, loss: 1.4452009201049805
2023-01-22 11:29:05.692451: step: 102/466, loss: 3.2847557067871094
2023-01-22 11:29:06.519151: step: 104/466, loss: 7.115804672241211
2023-01-22 11:29:07.368959: step: 106/466, loss: 1.1448769569396973
2023-01-22 11:29:08.114581: step: 108/466, loss: 1.1002720594406128
2023-01-22 11:29:08.877153: step: 110/466, loss: 1.6691850423812866
2023-01-22 11:29:09.731912: step: 112/466, loss: 4.029344081878662
2023-01-22 11:29:10.535632: step: 114/466, loss: 2.019430637359619
2023-01-22 11:29:11.268645: step: 116/466, loss: 1.26572847366333
2023-01-22 11:29:12.085843: step: 118/466, loss: 1.4695121049880981
2023-01-22 11:29:12.825087: step: 120/466, loss: 1.2214720249176025
2023-01-22 11:29:13.615337: step: 122/466, loss: 1.0011645555496216
2023-01-22 11:29:14.342230: step: 124/466, loss: 5.812806129455566
2023-01-22 11:29:15.052501: step: 126/466, loss: 0.43429237604141235
2023-01-22 11:29:15.749770: step: 128/466, loss: 0.8753689527511597
2023-01-22 11:29:16.494791: step: 130/466, loss: 0.5702555775642395
2023-01-22 11:29:17.240867: step: 132/466, loss: 2.444385528564453
2023-01-22 11:29:17.971077: step: 134/466, loss: 0.8410188555717468
2023-01-22 11:29:18.856026: step: 136/466, loss: 3.110745906829834
2023-01-22 11:29:19.718556: step: 138/466, loss: 4.094484329223633
2023-01-22 11:29:20.514161: step: 140/466, loss: 1.6371426582336426
2023-01-22 11:29:21.327843: step: 142/466, loss: 0.7541942596435547
2023-01-22 11:29:22.147803: step: 144/466, loss: 2.1750974655151367
2023-01-22 11:29:22.977007: step: 146/466, loss: 0.7115434408187866
2023-01-22 11:29:23.783920: step: 148/466, loss: 0.6685702800750732
2023-01-22 11:29:24.583741: step: 150/466, loss: 1.563443660736084
2023-01-22 11:29:25.283807: step: 152/466, loss: 2.2242989540100098
2023-01-22 11:29:26.098235: step: 154/466, loss: 1.2471766471862793
2023-01-22 11:29:26.885918: step: 156/466, loss: 1.1333625316619873
2023-01-22 11:29:27.496343: step: 158/466, loss: 2.8891983032226562
2023-01-22 11:29:28.359113: step: 160/466, loss: 1.4792773723602295
2023-01-22 11:29:29.124393: step: 162/466, loss: 0.8582035303115845
2023-01-22 11:29:29.910495: step: 164/466, loss: 1.1480525732040405
2023-01-22 11:29:30.666871: step: 166/466, loss: 0.9869524836540222
2023-01-22 11:29:31.428865: step: 168/466, loss: 2.1026687622070312
2023-01-22 11:29:32.153861: step: 170/466, loss: 1.9524791240692139
2023-01-22 11:29:32.905552: step: 172/466, loss: 1.8153622150421143
2023-01-22 11:29:33.740657: step: 174/466, loss: 1.347947359085083
2023-01-22 11:29:34.522001: step: 176/466, loss: 7.718930244445801
2023-01-22 11:29:35.235587: step: 178/466, loss: 1.651254415512085
2023-01-22 11:29:36.072804: step: 180/466, loss: 0.29022160172462463
2023-01-22 11:29:36.919817: step: 182/466, loss: 1.8764421939849854
2023-01-22 11:29:37.660358: step: 184/466, loss: 0.9963284730911255
2023-01-22 11:29:38.416753: step: 186/466, loss: 1.812087059020996
2023-01-22 11:29:39.104650: step: 188/466, loss: 2.3929829597473145
2023-01-22 11:29:39.827490: step: 190/466, loss: 1.4361763000488281
2023-01-22 11:29:40.610579: step: 192/466, loss: 0.53630131483078
2023-01-22 11:29:41.307127: step: 194/466, loss: 1.5942753553390503
2023-01-22 11:29:42.052230: step: 196/466, loss: 0.9954477548599243
2023-01-22 11:29:42.782512: step: 198/466, loss: 0.2992696464061737
2023-01-22 11:29:43.726934: step: 200/466, loss: 1.9530024528503418
2023-01-22 11:29:44.382951: step: 202/466, loss: 0.7404540777206421
2023-01-22 11:29:45.164205: step: 204/466, loss: 1.3093249797821045
2023-01-22 11:29:45.950073: step: 206/466, loss: 3.3123230934143066
2023-01-22 11:29:46.750417: step: 208/466, loss: 9.107421875
2023-01-22 11:29:47.529106: step: 210/466, loss: 2.1775870323181152
2023-01-22 11:29:48.290182: step: 212/466, loss: 0.7418627142906189
2023-01-22 11:29:49.073807: step: 214/466, loss: 1.2200261354446411
2023-01-22 11:29:49.847317: step: 216/466, loss: 1.1684129238128662
2023-01-22 11:29:50.624808: step: 218/466, loss: 6.310258865356445
2023-01-22
11:29:51.385984: step: 220/466, loss: 0.6867668032646179 2023-01-22 11:29:52.091909: step: 222/466, loss: 2.495737075805664 2023-01-22 11:29:52.854581: step: 224/466, loss: 1.954410433769226 2023-01-22 11:29:53.570527: step: 226/466, loss: 3.201202630996704 2023-01-22 11:29:54.341348: step: 228/466, loss: 2.4115254878997803 2023-01-22 11:29:55.080346: step: 230/466, loss: 1.5630335807800293 2023-01-22 11:29:55.881668: step: 232/466, loss: 0.6903105974197388 2023-01-22 11:29:56.642579: step: 234/466, loss: 2.7204911708831787 2023-01-22 11:29:57.344804: step: 236/466, loss: 2.1740503311157227 2023-01-22 11:29:58.001152: step: 238/466, loss: 1.7562813758850098 2023-01-22 11:29:58.719832: step: 240/466, loss: 1.4994159936904907 2023-01-22 11:29:59.481659: step: 242/466, loss: 1.1976405382156372 2023-01-22 11:30:00.240650: step: 244/466, loss: 0.43268853425979614 2023-01-22 11:30:00.927750: step: 246/466, loss: 2.2464475631713867 2023-01-22 11:30:01.657618: step: 248/466, loss: 1.5958398580551147 2023-01-22 11:30:02.482882: step: 250/466, loss: 3.0869741439819336 2023-01-22 11:30:03.196827: step: 252/466, loss: 0.9807233214378357 2023-01-22 11:30:03.931933: step: 254/466, loss: 3.9502501487731934 2023-01-22 11:30:04.713778: step: 256/466, loss: 0.6676345467567444 2023-01-22 11:30:05.474725: step: 258/466, loss: 0.45399269461631775 2023-01-22 11:30:06.243773: step: 260/466, loss: 6.352608680725098 2023-01-22 11:30:06.989616: step: 262/466, loss: 1.0999513864517212 2023-01-22 11:30:07.738572: step: 264/466, loss: 2.423794984817505 2023-01-22 11:30:08.490533: step: 266/466, loss: 1.3189613819122314 2023-01-22 11:30:09.276553: step: 268/466, loss: 2.981919050216675 2023-01-22 11:30:10.051260: step: 270/466, loss: 1.505730152130127 2023-01-22 11:30:10.821949: step: 272/466, loss: 1.8038147687911987 2023-01-22 11:30:11.681762: step: 274/466, loss: 3.5392653942108154 2023-01-22 11:30:12.465309: step: 276/466, loss: 0.6619697213172913 2023-01-22 11:30:13.288639: step: 278/466, 
loss: 0.9468562602996826 2023-01-22 11:30:14.161827: step: 280/466, loss: 3.8605504035949707 2023-01-22 11:30:15.009023: step: 282/466, loss: 1.1731696128845215 2023-01-22 11:30:15.800453: step: 284/466, loss: 1.08231520652771 2023-01-22 11:30:16.572056: step: 286/466, loss: 1.063538670539856 2023-01-22 11:30:17.460834: step: 288/466, loss: 9.27632999420166 2023-01-22 11:30:18.229451: step: 290/466, loss: 1.053228735923767 2023-01-22 11:30:19.119054: step: 292/466, loss: 1.1648130416870117 2023-01-22 11:30:19.940334: step: 294/466, loss: 2.8982105255126953 2023-01-22 11:30:20.801207: step: 296/466, loss: 0.8549784421920776 2023-01-22 11:30:21.625624: step: 298/466, loss: 2.659513235092163 2023-01-22 11:30:22.391471: step: 300/466, loss: 2.1829559803009033 2023-01-22 11:30:23.182342: step: 302/466, loss: 0.779899001121521 2023-01-22 11:30:23.914026: step: 304/466, loss: 0.4097400903701782 2023-01-22 11:30:24.677551: step: 306/466, loss: 1.7801828384399414 2023-01-22 11:30:25.391605: step: 308/466, loss: 0.7560756206512451 2023-01-22 11:30:26.201452: step: 310/466, loss: 0.5453673601150513 2023-01-22 11:30:26.986953: step: 312/466, loss: 0.4129677712917328 2023-01-22 11:30:27.739349: step: 314/466, loss: 4.737407207489014 2023-01-22 11:30:28.508326: step: 316/466, loss: 2.516684055328369 2023-01-22 11:30:29.220658: step: 318/466, loss: 0.49643924832344055 2023-01-22 11:30:29.943844: step: 320/466, loss: 1.9038251638412476 2023-01-22 11:30:30.774624: step: 322/466, loss: 0.6591343283653259 2023-01-22 11:30:31.592614: step: 324/466, loss: 3.3472065925598145 2023-01-22 11:30:32.357556: step: 326/466, loss: 2.4076545238494873 2023-01-22 11:30:33.191170: step: 328/466, loss: 3.0719268321990967 2023-01-22 11:30:33.858062: step: 330/466, loss: 0.3430599570274353 2023-01-22 11:30:34.679091: step: 332/466, loss: 1.3857271671295166 2023-01-22 11:30:35.448704: step: 334/466, loss: 0.30699753761291504 2023-01-22 11:30:36.245228: step: 336/466, loss: 0.47438302636146545 
2023-01-22 11:30:36.919872: step: 338/466, loss: 2.707505226135254 2023-01-22 11:30:37.656747: step: 340/466, loss: 1.6239722967147827 2023-01-22 11:30:38.325396: step: 342/466, loss: 0.57518070936203 2023-01-22 11:30:39.115332: step: 344/466, loss: 1.2875721454620361 2023-01-22 11:30:39.948256: step: 346/466, loss: 3.6670823097229004 2023-01-22 11:30:40.679506: step: 348/466, loss: 0.5334588289260864 2023-01-22 11:30:41.368663: step: 350/466, loss: 9.405436515808105 2023-01-22 11:30:42.215235: step: 352/466, loss: 1.1436501741409302 2023-01-22 11:30:43.019902: step: 354/466, loss: 0.8235857486724854 2023-01-22 11:30:43.737780: step: 356/466, loss: 5.213697910308838 2023-01-22 11:30:44.458432: step: 358/466, loss: 1.221557855606079 2023-01-22 11:30:45.153376: step: 360/466, loss: 4.103485584259033 2023-01-22 11:30:45.916970: step: 362/466, loss: 1.9460155963897705 2023-01-22 11:30:46.707194: step: 364/466, loss: 1.0022746324539185 2023-01-22 11:30:47.431947: step: 366/466, loss: 1.3515444993972778 2023-01-22 11:30:48.217146: step: 368/466, loss: 0.4751497209072113 2023-01-22 11:30:49.042367: step: 370/466, loss: 1.04304838180542 2023-01-22 11:30:49.753641: step: 372/466, loss: 0.44798514246940613 2023-01-22 11:30:50.604826: step: 374/466, loss: 0.4274475574493408 2023-01-22 11:30:51.381807: step: 376/466, loss: 2.295330286026001 2023-01-22 11:30:52.139509: step: 378/466, loss: 0.8329123258590698 2023-01-22 11:30:52.919625: step: 380/466, loss: 1.9693539142608643 2023-01-22 11:30:53.686726: step: 382/466, loss: 0.9746490120887756 2023-01-22 11:30:54.481110: step: 384/466, loss: 0.6659336090087891 2023-01-22 11:30:55.289706: step: 386/466, loss: 0.4551994800567627 2023-01-22 11:30:56.080944: step: 388/466, loss: 0.6308054327964783 2023-01-22 11:30:56.916104: step: 390/466, loss: 2.2916059494018555 2023-01-22 11:30:57.751617: step: 392/466, loss: 1.36789071559906 2023-01-22 11:30:58.543158: step: 394/466, loss: 0.6319479942321777 2023-01-22 11:30:59.307938: step: 
396/466, loss: 0.5163759589195251 2023-01-22 11:31:00.041490: step: 398/466, loss: 2.276012659072876 2023-01-22 11:31:00.800222: step: 400/466, loss: 1.0691869258880615 2023-01-22 11:31:01.585110: step: 402/466, loss: 2.319650650024414 2023-01-22 11:31:02.334216: step: 404/466, loss: 6.325422286987305 2023-01-22 11:31:03.030501: step: 406/466, loss: 0.4125528931617737 2023-01-22 11:31:03.775664: step: 408/466, loss: 0.8939810395240784 2023-01-22 11:31:04.540855: step: 410/466, loss: 0.45102182030677795 2023-01-22 11:31:05.317502: step: 412/466, loss: 1.2263315916061401 2023-01-22 11:31:06.082764: step: 414/466, loss: 0.9082082509994507 2023-01-22 11:31:06.827193: step: 416/466, loss: 1.0330703258514404 2023-01-22 11:31:07.544387: step: 418/466, loss: 0.951251208782196 2023-01-22 11:31:08.291712: step: 420/466, loss: 1.1964526176452637 2023-01-22 11:31:09.008707: step: 422/466, loss: 1.2323813438415527 2023-01-22 11:31:09.744887: step: 424/466, loss: 2.8224222660064697 2023-01-22 11:31:10.490707: step: 426/466, loss: 1.238973617553711 2023-01-22 11:31:11.386025: step: 428/466, loss: 0.3770541250705719 2023-01-22 11:31:12.159269: step: 430/466, loss: 0.6212438941001892 2023-01-22 11:31:12.976558: step: 432/466, loss: 0.5817193388938904 2023-01-22 11:31:13.847182: step: 434/466, loss: 1.3300286531448364 2023-01-22 11:31:14.635339: step: 436/466, loss: 1.2844303846359253 2023-01-22 11:31:15.412949: step: 438/466, loss: 1.8086202144622803 2023-01-22 11:31:16.152850: step: 440/466, loss: 8.946410179138184 2023-01-22 11:31:16.964039: step: 442/466, loss: 1.169613003730774 2023-01-22 11:31:17.823721: step: 444/466, loss: 1.6764874458312988 2023-01-22 11:31:18.578043: step: 446/466, loss: 4.7825117111206055 2023-01-22 11:31:19.359803: step: 448/466, loss: 12.170503616333008 2023-01-22 11:31:20.146281: step: 450/466, loss: 0.48194020986557007 2023-01-22 11:31:20.859951: step: 452/466, loss: 2.0715715885162354 2023-01-22 11:31:21.600941: step: 454/466, loss: 
2.2436349391937256 2023-01-22 11:31:22.459231: step: 456/466, loss: 1.7453405857086182 2023-01-22 11:31:23.217125: step: 458/466, loss: 0.601981520652771 2023-01-22 11:31:24.015913: step: 460/466, loss: 4.245301723480225 2023-01-22 11:31:24.773074: step: 462/466, loss: 1.8211416006088257 2023-01-22 11:31:25.484999: step: 464/466, loss: 1.1473311185836792 2023-01-22 11:31:26.253099: step: 466/466, loss: 3.5540287494659424 2023-01-22 11:31:27.005922: step: 468/466, loss: 0.6876145601272583 2023-01-22 11:31:27.833560: step: 470/466, loss: 4.3995513916015625 2023-01-22 11:31:28.611575: step: 472/466, loss: 1.2393075227737427 2023-01-22 11:31:29.377740: step: 474/466, loss: 0.9958049654960632 2023-01-22 11:31:30.086989: step: 476/466, loss: 0.3819100856781006 2023-01-22 11:31:30.835481: step: 478/466, loss: 0.6085644364356995 2023-01-22 11:31:31.586333: step: 480/466, loss: 8.003620147705078 2023-01-22 11:31:32.414112: step: 482/466, loss: 0.3538343906402588 2023-01-22 11:31:33.282797: step: 484/466, loss: 2.603562116622925 2023-01-22 11:31:34.096322: step: 486/466, loss: 1.1814818382263184 2023-01-22 11:31:34.830324: step: 488/466, loss: 1.0522490739822388 2023-01-22 11:31:35.571156: step: 490/466, loss: 0.9714776277542114 2023-01-22 11:31:36.312819: step: 492/466, loss: 5.451353549957275 2023-01-22 11:31:37.024775: step: 494/466, loss: 1.90500009059906 2023-01-22 11:31:37.768290: step: 496/466, loss: 2.5933101177215576 2023-01-22 11:31:38.499904: step: 498/466, loss: 1.0544312000274658 2023-01-22 11:31:39.260520: step: 500/466, loss: 5.469407081604004 2023-01-22 11:31:40.077932: step: 502/466, loss: 2.7074644565582275 2023-01-22 11:31:40.856489: step: 504/466, loss: 0.4564131200313568 2023-01-22 11:31:41.602535: step: 506/466, loss: 1.8417478799819946 2023-01-22 11:31:42.356499: step: 508/466, loss: 0.9633346199989319 2023-01-22 11:31:43.143392: step: 510/466, loss: 1.4366604089736938 2023-01-22 11:31:43.898357: step: 512/466, loss: 0.8661335110664368 2023-01-22 
11:31:44.725262: step: 514/466, loss: 0.5213767290115356 2023-01-22 11:31:45.437474: step: 516/466, loss: 0.5418296456336975 2023-01-22 11:31:46.168707: step: 518/466, loss: 2.071904182434082 2023-01-22 11:31:46.898265: step: 520/466, loss: 5.834861755371094 2023-01-22 11:31:47.686785: step: 522/466, loss: 1.318400263786316 2023-01-22 11:31:48.434599: step: 524/466, loss: 1.7108652591705322 2023-01-22 11:31:49.187650: step: 526/466, loss: 0.3789096772670746 2023-01-22 11:31:49.885380: step: 528/466, loss: 2.4232115745544434 2023-01-22 11:31:50.722491: step: 530/466, loss: 2.5445380210876465 2023-01-22 11:31:51.460819: step: 532/466, loss: 1.4125943183898926 2023-01-22 11:31:52.198084: step: 534/466, loss: 0.5616583228111267 2023-01-22 11:31:52.894241: step: 536/466, loss: 5.609212875366211 2023-01-22 11:31:53.691660: step: 538/466, loss: 7.8469367027282715 2023-01-22 11:31:54.576941: step: 540/466, loss: 1.772813081741333 2023-01-22 11:31:55.358208: step: 542/466, loss: 0.7231072187423706 2023-01-22 11:31:56.166737: step: 544/466, loss: 0.9882017374038696 2023-01-22 11:31:56.887361: step: 546/466, loss: 10.674277305603027 2023-01-22 11:31:57.690332: step: 548/466, loss: 0.8669376373291016 2023-01-22 11:31:58.472520: step: 550/466, loss: 0.8646388053894043 2023-01-22 11:31:59.278558: step: 552/466, loss: 4.222936153411865 2023-01-22 11:32:00.043245: step: 554/466, loss: 3.2410836219787598 2023-01-22 11:32:00.877546: step: 556/466, loss: 1.8248041868209839 2023-01-22 11:32:01.720266: step: 558/466, loss: 3.5040442943573 2023-01-22 11:32:02.479045: step: 560/466, loss: 1.596124291419983 2023-01-22 11:32:03.371850: step: 562/466, loss: 0.30306729674339294 2023-01-22 11:32:04.133029: step: 564/466, loss: 0.9529277682304382 2023-01-22 11:32:04.922262: step: 566/466, loss: 1.5184459686279297 2023-01-22 11:32:05.795118: step: 568/466, loss: 0.9084643721580505 2023-01-22 11:32:06.596095: step: 570/466, loss: 0.4883304834365845 2023-01-22 11:32:07.247291: step: 572/466, 
loss: 1.4788918495178223 2023-01-22 11:32:07.913356: step: 574/466, loss: 0.7763687968254089 2023-01-22 11:32:08.596106: step: 576/466, loss: 0.799630880355835 2023-01-22 11:32:09.315191: step: 578/466, loss: 2.9817371368408203 2023-01-22 11:32:10.083050: step: 580/466, loss: 1.0672000646591187 2023-01-22 11:32:10.924642: step: 582/466, loss: 1.8494359254837036 2023-01-22 11:32:11.673138: step: 584/466, loss: 1.0946604013442993 2023-01-22 11:32:12.424587: step: 586/466, loss: 2.053010940551758 2023-01-22 11:32:13.147470: step: 588/466, loss: 1.2595796585083008 2023-01-22 11:32:14.000458: step: 590/466, loss: 0.74193274974823 2023-01-22 11:32:14.712998: step: 592/466, loss: 1.6734983921051025 2023-01-22 11:32:15.385784: step: 594/466, loss: 0.6596741080284119 2023-01-22 11:32:16.196839: step: 596/466, loss: 1.8599157333374023 2023-01-22 11:32:16.984618: step: 598/466, loss: 0.7645819187164307 2023-01-22 11:32:17.689019: step: 600/466, loss: 2.639955759048462 2023-01-22 11:32:18.467653: step: 602/466, loss: 0.2884419560432434 2023-01-22 11:32:19.210884: step: 604/466, loss: 1.6108710765838623 2023-01-22 11:32:19.925189: step: 606/466, loss: 6.910267353057861 2023-01-22 11:32:20.676433: step: 608/466, loss: 2.028846263885498 2023-01-22 11:32:21.438363: step: 610/466, loss: 1.0209906101226807 2023-01-22 11:32:22.244032: step: 612/466, loss: 3.6873764991760254 2023-01-22 11:32:23.023538: step: 614/466, loss: 0.35416746139526367 2023-01-22 11:32:23.701468: step: 616/466, loss: 2.9250409603118896 2023-01-22 11:32:24.528397: step: 618/466, loss: 3.456394672393799 2023-01-22 11:32:25.262922: step: 620/466, loss: 2.000549554824829 2023-01-22 11:32:26.036912: step: 622/466, loss: 0.8701221942901611 2023-01-22 11:32:26.829877: step: 624/466, loss: 8.37614631652832 2023-01-22 11:32:27.537026: step: 626/466, loss: 0.4917764365673065 2023-01-22 11:32:28.248977: step: 628/466, loss: 0.5501490831375122 2023-01-22 11:32:29.043123: step: 630/466, loss: 1.4101099967956543 2023-01-22 
11:32:29.783702: step: 632/466, loss: 1.3804508447647095 2023-01-22 11:32:30.549078: step: 634/466, loss: 0.34051811695098877 2023-01-22 11:32:31.337500: step: 636/466, loss: 1.925154685974121 2023-01-22 11:32:32.036974: step: 638/466, loss: 0.4945259988307953 2023-01-22 11:32:32.791142: step: 640/466, loss: 3.1828784942626953 2023-01-22 11:32:33.575931: step: 642/466, loss: 0.3875892162322998 2023-01-22 11:32:34.416772: step: 644/466, loss: 2.7635698318481445 2023-01-22 11:32:35.126771: step: 646/466, loss: 1.7485973834991455 2023-01-22 11:32:35.872967: step: 648/466, loss: 0.5556420087814331 2023-01-22 11:32:36.592301: step: 650/466, loss: 1.3510385751724243 2023-01-22 11:32:37.475549: step: 652/466, loss: 0.8903523683547974 2023-01-22 11:32:38.222555: step: 654/466, loss: 3.023341178894043 2023-01-22 11:32:38.965305: step: 656/466, loss: 0.6053643226623535 2023-01-22 11:32:39.752562: step: 658/466, loss: 1.5766246318817139 2023-01-22 11:32:40.552455: step: 660/466, loss: 1.8168840408325195 2023-01-22 11:32:41.356826: step: 662/466, loss: 0.6825411915779114 2023-01-22 11:32:42.115100: step: 664/466, loss: 1.2695629596710205 2023-01-22 11:32:42.863968: step: 666/466, loss: 0.973310112953186 2023-01-22 11:32:43.611558: step: 668/466, loss: 0.8447468280792236 2023-01-22 11:32:44.401864: step: 670/466, loss: 4.895907402038574 2023-01-22 11:32:45.176452: step: 672/466, loss: 0.7715153098106384 2023-01-22 11:32:45.967001: step: 674/466, loss: 3.3655648231506348 2023-01-22 11:32:46.658817: step: 676/466, loss: 0.9991435408592224 2023-01-22 11:32:47.448188: step: 678/466, loss: 3.844038486480713 2023-01-22 11:32:48.174517: step: 680/466, loss: 0.338966429233551 2023-01-22 11:32:48.996904: step: 682/466, loss: 1.0828441381454468 2023-01-22 11:32:49.761911: step: 684/466, loss: 1.0340423583984375 2023-01-22 11:32:50.522695: step: 686/466, loss: 3.864016532897949 2023-01-22 11:32:51.270815: step: 688/466, loss: 0.5012548565864563 2023-01-22 11:32:52.037062: step: 690/466, 
loss: 1.569545030593872 2023-01-22 11:32:52.799682: step: 692/466, loss: 1.2501015663146973 2023-01-22 11:32:53.542403: step: 694/466, loss: 5.520549774169922 2023-01-22 11:32:54.400716: step: 696/466, loss: 0.9210854172706604 2023-01-22 11:32:55.197063: step: 698/466, loss: 1.2601354122161865 2023-01-22 11:32:55.951880: step: 700/466, loss: 0.6014167070388794 2023-01-22 11:32:56.730030: step: 702/466, loss: 2.394166946411133 2023-01-22 11:32:57.462230: step: 704/466, loss: 0.8901013135910034 2023-01-22 11:32:58.230025: step: 706/466, loss: 3.638638496398926 2023-01-22 11:32:59.019408: step: 708/466, loss: 2.6969895362854004 2023-01-22 11:32:59.801123: step: 710/466, loss: 2.8152315616607666 2023-01-22 11:33:00.576920: step: 712/466, loss: 0.5224518775939941 2023-01-22 11:33:01.451982: step: 714/466, loss: 2.062082052230835 2023-01-22 11:33:02.271523: step: 716/466, loss: 0.602192759513855 2023-01-22 11:33:03.045623: step: 718/466, loss: 1.2655690908432007 2023-01-22 11:33:03.906088: step: 720/466, loss: 1.3014322519302368 2023-01-22 11:33:04.683053: step: 722/466, loss: 0.5693433284759521 2023-01-22 11:33:05.439651: step: 724/466, loss: 0.5791750550270081 2023-01-22 11:33:06.211257: step: 726/466, loss: 0.7184494733810425 2023-01-22 11:33:07.022148: step: 728/466, loss: 0.5201277136802673 2023-01-22 11:33:07.794755: step: 730/466, loss: 0.6421585083007812 2023-01-22 11:33:08.567875: step: 732/466, loss: 4.414198398590088 2023-01-22 11:33:09.339798: step: 734/466, loss: 7.777451992034912 2023-01-22 11:33:10.181785: step: 736/466, loss: 0.8733360767364502 2023-01-22 11:33:10.949946: step: 738/466, loss: 0.46783724427223206 2023-01-22 11:33:11.770389: step: 740/466, loss: 0.41334855556488037 2023-01-22 11:33:12.672200: step: 742/466, loss: 0.6446312069892883 2023-01-22 11:33:13.449426: step: 744/466, loss: 3.32536244392395 2023-01-22 11:33:14.262066: step: 746/466, loss: 0.29394975304603577 2023-01-22 11:33:15.029428: step: 748/466, loss: 6.848193168640137 2023-01-22 
11:33:15.838492: step: 750/466, loss: 4.080060005187988 2023-01-22 11:33:16.566161: step: 752/466, loss: 1.6240911483764648 2023-01-22 11:33:17.367226: step: 754/466, loss: 0.9909073114395142 2023-01-22 11:33:18.140032: step: 756/466, loss: 1.2752301692962646 2023-01-22 11:33:18.894954: step: 758/466, loss: 1.7539832592010498 2023-01-22 11:33:19.639412: step: 760/466, loss: 0.8013241291046143 2023-01-22 11:33:20.430236: step: 762/466, loss: 3.9893579483032227 2023-01-22 11:33:21.252753: step: 764/466, loss: 0.5241023898124695 2023-01-22 11:33:21.995300: step: 766/466, loss: 2.236712694168091 2023-01-22 11:33:22.758898: step: 768/466, loss: 4.814619541168213 2023-01-22 11:33:23.579919: step: 770/466, loss: 1.8054533004760742 2023-01-22 11:33:24.394024: step: 772/466, loss: 1.2646602392196655 2023-01-22 11:33:25.194536: step: 774/466, loss: 1.7914949655532837 2023-01-22 11:33:25.973963: step: 776/466, loss: 0.44616395235061646 2023-01-22 11:33:26.820954: step: 778/466, loss: 0.9424298405647278 2023-01-22 11:33:27.648380: step: 780/466, loss: 4.644623279571533 2023-01-22 11:33:28.413323: step: 782/466, loss: 4.011265754699707 2023-01-22 11:33:29.134548: step: 784/466, loss: 0.446866512298584 2023-01-22 11:33:29.971190: step: 786/466, loss: 8.493598937988281 2023-01-22 11:33:30.770045: step: 788/466, loss: 1.4424996376037598 2023-01-22 11:33:31.613046: step: 790/466, loss: 1.4629288911819458 2023-01-22 11:33:32.340971: step: 792/466, loss: 2.3262360095977783 2023-01-22 11:33:33.094525: step: 794/466, loss: 0.5425397753715515 2023-01-22 11:33:33.874917: step: 796/466, loss: 0.9937711954116821 2023-01-22 11:33:34.616710: step: 798/466, loss: 5.378913402557373 2023-01-22 11:33:35.383988: step: 800/466, loss: 0.8644759654998779 2023-01-22 11:33:36.165500: step: 802/466, loss: 1.6354568004608154 2023-01-22 11:33:36.852351: step: 804/466, loss: 0.847054123878479 2023-01-22 11:33:37.585826: step: 806/466, loss: 11.03044319152832 2023-01-22 11:33:38.475298: step: 808/466, 
loss: 1.2004551887512207 2023-01-22 11:33:39.296180: step: 810/466, loss: 1.6788220405578613 2023-01-22 11:33:40.051301: step: 812/466, loss: 0.9433671832084656 2023-01-22 11:33:40.830254: step: 814/466, loss: 1.2230790853500366 2023-01-22 11:33:41.583051: step: 816/466, loss: 1.431276559829712 2023-01-22 11:33:42.279314: step: 818/466, loss: 0.7971752285957336 2023-01-22 11:33:43.017830: step: 820/466, loss: 1.0101542472839355 2023-01-22 11:33:43.725594: step: 822/466, loss: 1.1521886587142944 2023-01-22 11:33:44.468667: step: 824/466, loss: 1.9171631336212158 2023-01-22 11:33:45.328136: step: 826/466, loss: 0.4679133892059326 2023-01-22 11:33:46.148919: step: 828/466, loss: 3.038625717163086 2023-01-22 11:33:46.987568: step: 830/466, loss: 3.714357852935791 2023-01-22 11:33:47.686225: step: 832/466, loss: 1.6085307598114014 2023-01-22 11:33:48.447247: step: 834/466, loss: 1.6621365547180176 2023-01-22 11:33:49.249071: step: 836/466, loss: 4.6484551429748535 2023-01-22 11:33:50.156829: step: 838/466, loss: 2.3264412879943848 2023-01-22 11:33:50.921912: step: 840/466, loss: 0.3419003486633301 2023-01-22 11:33:51.693958: step: 842/466, loss: 0.6267896890640259 2023-01-22 11:33:52.591881: step: 844/466, loss: 0.9167214632034302 2023-01-22 11:33:53.320550: step: 846/466, loss: 3.188936710357666 2023-01-22 11:33:54.101485: step: 848/466, loss: 0.515191912651062 2023-01-22 11:33:54.856142: step: 850/466, loss: 1.8713839054107666 2023-01-22 11:33:55.710934: step: 852/466, loss: 0.22444993257522583 2023-01-22 11:33:56.431017: step: 854/466, loss: 0.7310824394226074 2023-01-22 11:33:57.188806: step: 856/466, loss: 2.4015276432037354 2023-01-22 11:33:57.908207: step: 858/466, loss: 3.1640782356262207 2023-01-22 11:33:58.712603: step: 860/466, loss: 0.4616748094558716 2023-01-22 11:33:59.482191: step: 862/466, loss: 1.9388368129730225 2023-01-22 11:34:00.175011: step: 864/466, loss: 0.47202399373054504 2023-01-22 11:34:00.926519: step: 866/466, loss: 1.1583646535873413 
2023-01-22 11:34:01.665447: step: 868/466, loss: 1.4272856712341309 2023-01-22 11:34:02.479834: step: 870/466, loss: 1.2141351699829102 2023-01-22 11:34:03.309219: step: 872/466, loss: 0.9213106036186218 2023-01-22 11:34:04.087126: step: 874/466, loss: 0.31684863567352295 2023-01-22 11:34:04.858693: step: 876/466, loss: 1.1014714241027832 2023-01-22 11:34:05.576204: step: 878/466, loss: 1.3665329217910767 2023-01-22 11:34:06.315762: step: 880/466, loss: 3.04831862449646 2023-01-22 11:34:07.170756: step: 882/466, loss: 0.9298765659332275 2023-01-22 11:34:08.072302: step: 884/466, loss: 7.784796237945557 2023-01-22 11:34:08.759746: step: 886/466, loss: 0.4797598123550415 2023-01-22 11:34:09.559005: step: 888/466, loss: 4.756036758422852 2023-01-22 11:34:10.245076: step: 890/466, loss: 2.582414388656616 2023-01-22 11:34:11.007367: step: 892/466, loss: 3.721204996109009 2023-01-22 11:34:11.799865: step: 894/466, loss: 1.1353154182434082 2023-01-22 11:34:12.520958: step: 896/466, loss: 0.5402342677116394 2023-01-22 11:34:13.275232: step: 898/466, loss: 1.3996034860610962 2023-01-22 11:34:14.116680: step: 900/466, loss: 0.44790565967559814 2023-01-22 11:34:14.861554: step: 902/466, loss: 2.234361410140991 2023-01-22 11:34:15.553577: step: 904/466, loss: 0.3241014778614044 2023-01-22 11:34:16.275167: step: 906/466, loss: 0.22168482840061188 2023-01-22 11:34:17.113649: step: 908/466, loss: 1.1810328960418701 2023-01-22 11:34:17.929382: step: 910/466, loss: 1.3574362993240356 2023-01-22 11:34:18.781058: step: 912/466, loss: 2.938023567199707 2023-01-22 11:34:19.542838: step: 914/466, loss: 1.8619043827056885 2023-01-22 11:34:20.227648: step: 916/466, loss: 1.254056692123413 2023-01-22 11:34:21.017756: step: 918/466, loss: 0.8026758432388306 2023-01-22 11:34:21.765618: step: 920/466, loss: 1.3725330829620361 2023-01-22 11:34:22.512783: step: 922/466, loss: 1.590737223625183 2023-01-22 11:34:23.304645: step: 924/466, loss: 0.4203554689884186 2023-01-22 11:34:24.071823: step: 
926/466, loss: 2.6803925037384033 2023-01-22 11:34:24.814868: step: 928/466, loss: 0.9792072176933289 2023-01-22 11:34:25.586427: step: 930/466, loss: 0.7705917954444885 2023-01-22 11:34:26.290561: step: 932/466, loss: 1.9272147417068481 ================================================== Loss: 1.989 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.24975729804732194, 'r': 0.22989024024810314, 'f1': 0.23941232120512518}, 'combined': 0.1764090787827238, 'epoch': 1} Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3076816302889428, 'r': 0.18838193080730609, 'f1': 0.23368642300467107}, 'combined': 0.14363165511506612, 'epoch': 1} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.24474242399662569, 'r': 0.23965326206096807, 'f1': 0.24217110913133164}, 'combined': 0.17844186988624436, 'epoch': 1} Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.28598942793466964, 'r': 0.18934131529641915, 'f1': 0.22783973145913355}, 'combined': 0.1400380788480528, 'epoch': 1} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2733346024977155, 'r': 0.24279153138528142, 'f1': 0.257159335148302}, 'combined': 0.18948582589874885, 'epoch': 1} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3071422824180177, 'r': 0.18380794621389143, 'f1': 0.22998336219955295}, 'combined': 0.14204854724090038, 'epoch': 1} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2631578947368421, 'r': 0.2857142857142857, 'f1': 0.273972602739726}, 'combined': 0.182648401826484, 'epoch': 1} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 
0.32407407407407407, 'r': 0.3804347826086957, 'f1': 0.35000000000000003}, 'combined': 0.17500000000000002, 'epoch': 1} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3958333333333333, 'r': 0.16379310344827586, 'f1': 0.23170731707317074}, 'combined': 0.15447154471544716, 'epoch': 1} New best chinese model... New best korean model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.24975729804732194, 'r': 0.22989024024810314, 'f1': 0.23941232120512518}, 'combined': 0.1764090787827238, 'epoch': 1} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3076816302889428, 'r': 0.18838193080730609, 'f1': 0.23368642300467107}, 'combined': 0.14363165511506612, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2631578947368421, 'r': 0.2857142857142857, 'f1': 0.273972602739726}, 'combined': 0.182648401826484, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.24474242399662569, 'r': 0.23965326206096807, 'f1': 0.24217110913133164}, 'combined': 0.17844186988624436, 'epoch': 1} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.28598942793466964, 'r': 0.18934131529641915, 'f1': 0.22783973145913355}, 'combined': 0.1400380788480528, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32407407407407407, 'r': 0.3804347826086957, 'f1': 0.35000000000000003}, 'combined': 0.17500000000000002, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2733346024977155, 'r': 
0.24279153138528142, 'f1': 0.257159335148302}, 'combined': 0.18948582589874885, 'epoch': 1} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3071422824180177, 'r': 0.18380794621389143, 'f1': 0.22998336219955295}, 'combined': 0.14204854724090038, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3958333333333333, 'r': 0.16379310344827586, 'f1': 0.23170731707317074}, 'combined': 0.15447154471544716, 'epoch': 1} ****************************** Epoch: 2 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:37:42.747505: step: 2/466, loss: 0.7782899141311646 2023-01-22 11:37:43.530382: step: 4/466, loss: 1.060962438583374 2023-01-22 11:37:44.264872: step: 6/466, loss: 0.5922555923461914 2023-01-22 11:37:45.034097: step: 8/466, loss: 0.4344215989112854 2023-01-22 11:37:45.856215: step: 10/466, loss: 1.029514193534851 2023-01-22 11:37:46.623617: step: 12/466, loss: 2.223616361618042 2023-01-22 11:37:47.435253: step: 14/466, loss: 1.798203706741333 2023-01-22 11:37:48.184432: step: 16/466, loss: 0.9643449783325195 2023-01-22 11:37:48.934858: step: 18/466, loss: 1.639407992362976 2023-01-22 11:37:49.715055: step: 20/466, loss: 0.7638140320777893 2023-01-22 11:37:50.559037: step: 22/466, loss: 0.561688244342804 2023-01-22 11:37:51.302068: step: 24/466, loss: 6.8926191329956055 2023-01-22 11:37:52.068141: step: 26/466, loss: 1.3338921070098877 2023-01-22 11:37:52.782908: step: 28/466, loss: 0.6786946058273315 2023-01-22 11:37:53.582901: step: 30/466, loss: 3.4620018005371094 2023-01-22 11:37:54.339227: step: 32/466, loss: 0.38275638222694397 2023-01-22 11:37:55.172867: step: 34/466, loss: 1.3385947942733765 2023-01-22 11:37:56.063942: step: 36/466, loss: 1.246219277381897 
2023-01-22 11:37:56.831372: step: 38/466, loss: 0.8457755446434021 2023-01-22 11:37:57.621831: step: 40/466, loss: 0.596068263053894 2023-01-22 11:37:58.427721: step: 42/466, loss: 3.452577590942383 2023-01-22 11:37:59.193680: step: 44/466, loss: 0.40949302911758423 2023-01-22 11:38:00.014955: step: 46/466, loss: 1.7453162670135498 2023-01-22 11:38:00.796974: step: 48/466, loss: 4.246631622314453 2023-01-22 11:38:01.548552: step: 50/466, loss: 0.25915151834487915 2023-01-22 11:38:02.289446: step: 52/466, loss: 4.343395233154297 2023-01-22 11:38:03.018756: step: 54/466, loss: 1.3142693042755127 2023-01-22 11:38:03.779812: step: 56/466, loss: 0.6851130723953247 2023-01-22 11:38:04.527752: step: 58/466, loss: 1.0555392503738403 2023-01-22 11:38:05.387659: step: 60/466, loss: 2.663140296936035 2023-01-22 11:38:06.100852: step: 62/466, loss: 3.964977741241455 2023-01-22 11:38:06.836848: step: 64/466, loss: 2.7914316654205322 2023-01-22 11:38:07.645324: step: 66/466, loss: 0.4234890937805176 2023-01-22 11:38:08.403967: step: 68/466, loss: 0.30918264389038086 2023-01-22 11:38:09.242114: step: 70/466, loss: 1.3355333805084229 2023-01-22 11:38:10.021972: step: 72/466, loss: 0.4983653426170349 2023-01-22 11:38:10.737590: step: 74/466, loss: 5.752483367919922 2023-01-22 11:38:11.504396: step: 76/466, loss: 0.2963273823261261 2023-01-22 11:38:12.217907: step: 78/466, loss: 3.8984785079956055 2023-01-22 11:38:13.005423: step: 80/466, loss: 1.214860439300537 2023-01-22 11:38:13.822907: step: 82/466, loss: 4.006836891174316 2023-01-22 11:38:14.616192: step: 84/466, loss: 4.287250995635986 2023-01-22 11:38:15.502811: step: 86/466, loss: 0.7098851203918457 2023-01-22 11:38:16.239496: step: 88/466, loss: 0.141645148396492 2023-01-22 11:38:16.926596: step: 90/466, loss: 1.717045545578003 2023-01-22 11:38:17.696850: step: 92/466, loss: 1.619040608406067 2023-01-22 11:38:18.392960: step: 94/466, loss: 2.095520257949829 2023-01-22 11:38:19.129037: step: 96/466, loss: 1.948998212814331 
2023-01-22 11:38:19.838055: step: 98/466, loss: 0.7309465408325195 2023-01-22 11:38:20.575040: step: 100/466, loss: 1.6109943389892578 2023-01-22 11:38:21.300920: step: 102/466, loss: 2.348635196685791 2023-01-22 11:38:22.089512: step: 104/466, loss: 1.4683918952941895 2023-01-22 11:38:22.848007: step: 106/466, loss: 3.8863635063171387 2023-01-22 11:38:23.634774: step: 108/466, loss: 1.4917595386505127 2023-01-22 11:38:24.343801: step: 110/466, loss: 4.646513938903809 2023-01-22 11:38:25.094365: step: 112/466, loss: 0.14392292499542236 2023-01-22 11:38:25.791656: step: 114/466, loss: 1.1361932754516602 2023-01-22 11:38:26.550878: step: 116/466, loss: 0.9295158386230469 2023-01-22 11:38:27.342860: step: 118/466, loss: 1.2146947383880615 2023-01-22 11:38:28.122697: step: 120/466, loss: 1.472395420074463 2023-01-22 11:38:28.873268: step: 122/466, loss: 2.7178173065185547 2023-01-22 11:38:29.777777: step: 124/466, loss: 0.7712768912315369 2023-01-22 11:38:30.552390: step: 126/466, loss: 0.6306812167167664 2023-01-22 11:38:31.266940: step: 128/466, loss: 1.8061941862106323 2023-01-22 11:38:32.066050: step: 130/466, loss: 1.7571167945861816 2023-01-22 11:38:32.779710: step: 132/466, loss: 1.4951488971710205 2023-01-22 11:38:33.621412: step: 134/466, loss: 8.245603561401367 2023-01-22 11:38:34.390603: step: 136/466, loss: 1.6827788352966309 2023-01-22 11:38:35.112488: step: 138/466, loss: 1.1124346256256104 2023-01-22 11:38:35.796933: step: 140/466, loss: 0.7739265561103821 2023-01-22 11:38:36.621259: step: 142/466, loss: 0.3751266896724701 2023-01-22 11:38:37.430978: step: 144/466, loss: 0.5696679353713989 2023-01-22 11:38:38.194139: step: 146/466, loss: 0.279913067817688 2023-01-22 11:38:38.977593: step: 148/466, loss: 0.7016226053237915 2023-01-22 11:38:39.728414: step: 150/466, loss: 0.8330613374710083 2023-01-22 11:38:40.512048: step: 152/466, loss: 0.6634538173675537 2023-01-22 11:38:41.444834: step: 154/466, loss: 0.32156264781951904 2023-01-22 11:38:42.196680: 
step: 156/466, loss: 1.308510661125183 2023-01-22 11:38:42.991987: step: 158/466, loss: 1.1710666418075562 2023-01-22 11:38:43.701485: step: 160/466, loss: 0.18871276080608368 2023-01-22 11:38:44.512587: step: 162/466, loss: 2.8034167289733887 2023-01-22 11:38:45.270356: step: 164/466, loss: 2.130894422531128 2023-01-22 11:38:46.045537: step: 166/466, loss: 1.1603654623031616 2023-01-22 11:38:46.850565: step: 168/466, loss: 2.9954476356506348 2023-01-22 11:38:47.748543: step: 170/466, loss: 1.5141651630401611 2023-01-22 11:38:48.586435: step: 172/466, loss: 1.787217378616333 2023-01-22 11:38:49.471839: step: 174/466, loss: 1.7773669958114624 2023-01-22 11:38:50.278245: step: 176/466, loss: 1.1324913501739502 2023-01-22 11:38:51.086840: step: 178/466, loss: 2.0455985069274902 2023-01-22 11:38:52.117400: step: 180/466, loss: 1.0055625438690186 2023-01-22 11:38:52.816968: step: 182/466, loss: 0.31716784834861755 2023-01-22 11:38:53.615398: step: 184/466, loss: 2.176579713821411 2023-01-22 11:38:54.392250: step: 186/466, loss: 1.2036975622177124 2023-01-22 11:38:55.250785: step: 188/466, loss: 1.7671464681625366 2023-01-22 11:38:56.079195: step: 190/466, loss: 0.5484171509742737 2023-01-22 11:38:56.795621: step: 192/466, loss: 0.4990229606628418 2023-01-22 11:38:57.536840: step: 194/466, loss: 1.0899813175201416 2023-01-22 11:38:58.309110: step: 196/466, loss: 1.1280584335327148 2023-01-22 11:38:59.136779: step: 198/466, loss: 1.153459072113037 2023-01-22 11:38:59.876659: step: 200/466, loss: 0.8218337297439575 2023-01-22 11:39:00.662392: step: 202/466, loss: 1.1493607759475708 2023-01-22 11:39:01.445234: step: 204/466, loss: 4.972820281982422 2023-01-22 11:39:02.226829: step: 206/466, loss: 0.9070379734039307 2023-01-22 11:39:02.946048: step: 208/466, loss: 0.2831861674785614 2023-01-22 11:39:03.852716: step: 210/466, loss: 0.7909908890724182 2023-01-22 11:39:04.689055: step: 212/466, loss: 2.180725336074829 2023-01-22 11:39:05.415462: step: 214/466, loss: 
0.5319482684135437 2023-01-22 11:39:06.131919: step: 216/466, loss: 1.8310894966125488 2023-01-22 11:39:06.906929: step: 218/466, loss: 1.070844292640686 2023-01-22 11:39:07.642834: step: 220/466, loss: 1.9309546947479248 2023-01-22 11:39:08.445868: step: 222/466, loss: 2.1100826263427734 2023-01-22 11:39:09.244592: step: 224/466, loss: 0.9473533630371094 2023-01-22 11:39:10.058929: step: 226/466, loss: 1.6386666297912598 2023-01-22 11:39:10.892979: step: 228/466, loss: 0.47829851508140564 2023-01-22 11:39:11.649168: step: 230/466, loss: 0.7959439158439636 2023-01-22 11:39:12.415743: step: 232/466, loss: 1.6609903573989868 2023-01-22 11:39:13.215736: step: 234/466, loss: 0.3416905701160431 2023-01-22 11:39:13.991007: step: 236/466, loss: 0.921734094619751 2023-01-22 11:39:14.799877: step: 238/466, loss: 1.387770175933838 2023-01-22 11:39:15.706860: step: 240/466, loss: 2.143401861190796 2023-01-22 11:39:16.400537: step: 242/466, loss: 0.17422008514404297 2023-01-22 11:39:17.128008: step: 244/466, loss: 0.9663944840431213 2023-01-22 11:39:17.887654: step: 246/466, loss: 2.27024507522583 2023-01-22 11:39:18.651997: step: 248/466, loss: 1.0249300003051758 2023-01-22 11:39:19.462616: step: 250/466, loss: 0.7960470914840698 2023-01-22 11:39:20.308272: step: 252/466, loss: 0.5432288646697998 2023-01-22 11:39:21.063921: step: 254/466, loss: 0.46682053804397583 2023-01-22 11:39:21.887593: step: 256/466, loss: 2.035393714904785 2023-01-22 11:39:22.578059: step: 258/466, loss: 0.5158565044403076 2023-01-22 11:39:23.420068: step: 260/466, loss: 1.2990877628326416 2023-01-22 11:39:24.222543: step: 262/466, loss: 0.953231692314148 2023-01-22 11:39:24.986655: step: 264/466, loss: 0.5877880454063416 2023-01-22 11:39:25.792574: step: 266/466, loss: 0.7475366592407227 2023-01-22 11:39:26.537524: step: 268/466, loss: 1.8516916036605835 2023-01-22 11:39:27.270958: step: 270/466, loss: 0.42474454641342163 2023-01-22 11:39:28.103607: step: 272/466, loss: 2.250751495361328 2023-01-22 
11:39:28.904748: step: 274/466, loss: 2.9599623680114746 2023-01-22 11:39:29.679950: step: 276/466, loss: 3.367034912109375 2023-01-22 11:39:30.425025: step: 278/466, loss: 2.1417951583862305 2023-01-22 11:39:31.244801: step: 280/466, loss: 0.539064347743988 2023-01-22 11:39:32.091033: step: 282/466, loss: 1.0810662508010864 2023-01-22 11:39:32.901241: step: 284/466, loss: 1.3193659782409668 2023-01-22 11:39:33.711577: step: 286/466, loss: 0.47315260767936707 2023-01-22 11:39:34.419600: step: 288/466, loss: 0.982465922832489 2023-01-22 11:39:35.120445: step: 290/466, loss: 1.0310883522033691 2023-01-22 11:39:35.802817: step: 292/466, loss: 0.8889856338500977 2023-01-22 11:39:36.522205: step: 294/466, loss: 1.234919786453247 2023-01-22 11:39:37.248093: step: 296/466, loss: 0.9779350757598877 2023-01-22 11:39:38.006665: step: 298/466, loss: 2.740853786468506 2023-01-22 11:39:38.810164: step: 300/466, loss: 0.8422730565071106 2023-01-22 11:39:39.530271: step: 302/466, loss: 1.5674970149993896 2023-01-22 11:39:40.345019: step: 304/466, loss: 0.2424909919500351 2023-01-22 11:39:41.130274: step: 306/466, loss: 1.2755539417266846 2023-01-22 11:39:41.894555: step: 308/466, loss: 1.3817476034164429 2023-01-22 11:39:42.756237: step: 310/466, loss: 3.8143482208251953 2023-01-22 11:39:43.524824: step: 312/466, loss: 1.4053281545639038 2023-01-22 11:39:44.209552: step: 314/466, loss: 0.4637737572193146 2023-01-22 11:39:44.957675: step: 316/466, loss: 0.3525203466415405 2023-01-22 11:39:45.701308: step: 318/466, loss: 3.6276497840881348 2023-01-22 11:39:46.449838: step: 320/466, loss: 1.2861909866333008 2023-01-22 11:39:47.201168: step: 322/466, loss: 7.183420181274414 2023-01-22 11:39:48.002374: step: 324/466, loss: 1.1658718585968018 2023-01-22 11:39:48.751785: step: 326/466, loss: 1.338639736175537 2023-01-22 11:39:49.434716: step: 328/466, loss: 1.1101205348968506 2023-01-22 11:39:50.170418: step: 330/466, loss: 2.1631531715393066 2023-01-22 11:39:50.919832: step: 332/466, 
loss: 1.582674503326416 2023-01-22 11:39:51.681925: step: 334/466, loss: 5.041405200958252 2023-01-22 11:39:52.425314: step: 336/466, loss: 2.207145929336548 2023-01-22 11:39:53.242744: step: 338/466, loss: 1.5536359548568726 2023-01-22 11:39:53.986422: step: 340/466, loss: 1.1272821426391602 2023-01-22 11:39:54.696573: step: 342/466, loss: 1.2651877403259277 2023-01-22 11:39:55.426712: step: 344/466, loss: 1.287071943283081 2023-01-22 11:39:56.150940: step: 346/466, loss: 0.9840033054351807 2023-01-22 11:39:56.888795: step: 348/466, loss: 0.834636390209198 2023-01-22 11:39:57.666555: step: 350/466, loss: 4.166690349578857 2023-01-22 11:39:58.411158: step: 352/466, loss: 1.4835872650146484 2023-01-22 11:39:59.123463: step: 354/466, loss: 1.3906885385513306 2023-01-22 11:39:59.897528: step: 356/466, loss: 1.5957164764404297 2023-01-22 11:40:00.649346: step: 358/466, loss: 1.3467870950698853 2023-01-22 11:40:01.373457: step: 360/466, loss: 1.778088927268982 2023-01-22 11:40:02.155918: step: 362/466, loss: 0.4566032290458679 2023-01-22 11:40:02.894058: step: 364/466, loss: 1.277762770652771 2023-01-22 11:40:03.695701: step: 366/466, loss: 0.898509681224823 2023-01-22 11:40:04.421392: step: 368/466, loss: 9.999470710754395 2023-01-22 11:40:05.182786: step: 370/466, loss: 4.15172815322876 2023-01-22 11:40:05.883176: step: 372/466, loss: 0.3042738735675812 2023-01-22 11:40:06.574415: step: 374/466, loss: 1.9609944820404053 2023-01-22 11:40:07.281713: step: 376/466, loss: 1.6575125455856323 2023-01-22 11:40:08.017394: step: 378/466, loss: 1.5117747783660889 2023-01-22 11:40:08.803944: step: 380/466, loss: 1.8306078910827637 2023-01-22 11:40:09.531100: step: 382/466, loss: 0.45823630690574646 2023-01-22 11:40:10.281910: step: 384/466, loss: 0.44007229804992676 2023-01-22 11:40:11.097829: step: 386/466, loss: 1.2810183763504028 2023-01-22 11:40:11.914017: step: 388/466, loss: 0.6912828683853149 2023-01-22 11:40:12.693468: step: 390/466, loss: 0.7221027612686157 2023-01-22 
11:40:13.442039: step: 392/466, loss: 0.22919021546840668 2023-01-22 11:40:14.161907: step: 394/466, loss: 4.120431423187256 2023-01-22 11:40:14.941933: step: 396/466, loss: 1.8976693153381348 2023-01-22 11:40:15.735226: step: 398/466, loss: 0.5101234912872314 2023-01-22 11:40:16.411664: step: 400/466, loss: 0.17438340187072754 2023-01-22 11:40:17.220441: step: 402/466, loss: 4.7474684715271 2023-01-22 11:40:18.026639: step: 404/466, loss: 1.2714061737060547 2023-01-22 11:40:18.806877: step: 406/466, loss: 5.490708827972412 2023-01-22 11:40:19.587604: step: 408/466, loss: 1.0497984886169434 2023-01-22 11:40:20.292470: step: 410/466, loss: 1.5065267086029053 2023-01-22 11:40:21.092669: step: 412/466, loss: 0.8111212253570557 2023-01-22 11:40:21.931134: step: 414/466, loss: 1.206809639930725 2023-01-22 11:40:22.681886: step: 416/466, loss: 1.1442924737930298 2023-01-22 11:40:23.432524: step: 418/466, loss: 0.4491981863975525 2023-01-22 11:40:24.223549: step: 420/466, loss: 1.607987403869629 2023-01-22 11:40:25.054802: step: 422/466, loss: 0.7260214686393738 2023-01-22 11:40:25.889716: step: 424/466, loss: 0.8898999691009521 2023-01-22 11:40:26.635398: step: 426/466, loss: 0.27232709527015686 2023-01-22 11:40:27.392510: step: 428/466, loss: 1.1090251207351685 2023-01-22 11:40:28.209779: step: 430/466, loss: 0.3265036642551422 2023-01-22 11:40:28.927765: step: 432/466, loss: 1.7151259183883667 2023-01-22 11:40:29.689138: step: 434/466, loss: 0.5194512605667114 2023-01-22 11:40:30.497781: step: 436/466, loss: 0.7662211656570435 2023-01-22 11:40:31.291970: step: 438/466, loss: 1.1287963390350342 2023-01-22 11:40:32.137288: step: 440/466, loss: 1.9674453735351562 2023-01-22 11:40:32.929661: step: 442/466, loss: 0.32331493496894836 2023-01-22 11:40:33.652325: step: 444/466, loss: 2.2679405212402344 2023-01-22 11:40:34.575934: step: 446/466, loss: 1.2494890689849854 2023-01-22 11:40:35.405572: step: 448/466, loss: 0.2581394910812378 2023-01-22 11:40:36.190776: step: 
450/466, loss: 0.34983840584754944 2023-01-22 11:40:36.961717: step: 452/466, loss: 0.41163530945777893 2023-01-22 11:40:37.677504: step: 454/466, loss: 2.9426956176757812 2023-01-22 11:40:38.344597: step: 456/466, loss: 1.1036453247070312 2023-01-22 11:40:39.137221: step: 458/466, loss: 1.412305235862732 2023-01-22 11:40:39.943375: step: 460/466, loss: 1.4345080852508545 2023-01-22 11:40:40.704993: step: 462/466, loss: 1.881574034690857 2023-01-22 11:40:41.483890: step: 464/466, loss: 0.3297499418258667 2023-01-22 11:40:42.266788: step: 466/466, loss: 0.8000501990318298 2023-01-22 11:40:43.149368: step: 468/466, loss: 1.1551730632781982 2023-01-22 11:40:43.960885: step: 470/466, loss: 2.107248306274414 2023-01-22 11:40:44.750526: step: 472/466, loss: 0.9202122092247009 2023-01-22 11:40:45.604314: step: 474/466, loss: 0.5258978009223938 2023-01-22 11:40:46.407260: step: 476/466, loss: 0.8387023210525513 2023-01-22 11:40:47.079995: step: 478/466, loss: 4.732879161834717 2023-01-22 11:40:47.833004: step: 480/466, loss: 3.042332172393799 2023-01-22 11:40:48.569510: step: 482/466, loss: 0.5810714960098267 2023-01-22 11:40:49.391623: step: 484/466, loss: 1.4622222185134888 2023-01-22 11:40:50.171515: step: 486/466, loss: 0.3892802298069 2023-01-22 11:40:50.857389: step: 488/466, loss: 3.730802059173584 2023-01-22 11:40:51.704725: step: 490/466, loss: 0.22228802740573883 2023-01-22 11:40:52.542833: step: 492/466, loss: 2.525942087173462 2023-01-22 11:40:53.345350: step: 494/466, loss: 1.179243564605713 2023-01-22 11:40:54.151177: step: 496/466, loss: 0.464471697807312 2023-01-22 11:40:54.935308: step: 498/466, loss: 1.479258418083191 2023-01-22 11:40:55.585080: step: 500/466, loss: 2.4078640937805176 2023-01-22 11:40:56.440665: step: 502/466, loss: 1.3903347253799438 2023-01-22 11:40:57.277151: step: 504/466, loss: 8.301027297973633 2023-01-22 11:40:58.088451: step: 506/466, loss: 0.9799388647079468 2023-01-22 11:40:58.832586: step: 508/466, loss: 1.767979621887207 
2023-01-22 11:40:59.518655: step: 510/466, loss: 1.7241212129592896 2023-01-22 11:41:00.261527: step: 512/466, loss: 3.429478168487549 2023-01-22 11:41:01.100025: step: 514/466, loss: 0.9211028814315796 2023-01-22 11:41:01.891568: step: 516/466, loss: 0.6944395899772644 2023-01-22 11:41:02.677891: step: 518/466, loss: 4.043740749359131 2023-01-22 11:41:03.516056: step: 520/466, loss: 3.6364011764526367 2023-01-22 11:41:04.230162: step: 522/466, loss: 1.1717474460601807 2023-01-22 11:41:04.974285: step: 524/466, loss: 1.412083387374878 2023-01-22 11:41:05.702582: step: 526/466, loss: 1.7653515338897705 2023-01-22 11:41:06.411320: step: 528/466, loss: 8.733113288879395 2023-01-22 11:41:07.108779: step: 530/466, loss: 0.6756694316864014 2023-01-22 11:41:07.860966: step: 532/466, loss: 0.860968291759491 2023-01-22 11:41:08.637912: step: 534/466, loss: 0.4007422626018524 2023-01-22 11:41:09.356000: step: 536/466, loss: 1.1231260299682617 2023-01-22 11:41:10.088082: step: 538/466, loss: 1.2358288764953613 2023-01-22 11:41:10.795621: step: 540/466, loss: 0.16142581403255463 2023-01-22 11:41:11.563212: step: 542/466, loss: 0.8205204606056213 2023-01-22 11:41:12.370483: step: 544/466, loss: 0.4645836651325226 2023-01-22 11:41:13.101739: step: 546/466, loss: 0.9839164018630981 2023-01-22 11:41:13.970589: step: 548/466, loss: 0.6886512637138367 2023-01-22 11:41:14.812181: step: 550/466, loss: 1.2889955043792725 2023-01-22 11:41:15.495197: step: 552/466, loss: 0.3291648328304291 2023-01-22 11:41:16.209479: step: 554/466, loss: 6.707035541534424 2023-01-22 11:41:17.029077: step: 556/466, loss: 0.2883372902870178 2023-01-22 11:41:17.787208: step: 558/466, loss: 0.17781074345111847 2023-01-22 11:41:18.593478: step: 560/466, loss: 0.9045490026473999 2023-01-22 11:41:19.355201: step: 562/466, loss: 1.085221290588379 2023-01-22 11:41:20.066982: step: 564/466, loss: 0.5101253986358643 2023-01-22 11:41:20.873347: step: 566/466, loss: 0.9176614880561829 2023-01-22 11:41:21.674175: 
step: 568/466, loss: 0.5486728549003601 2023-01-22 11:41:22.447366: step: 570/466, loss: 2.648871898651123 2023-01-22 11:41:23.195112: step: 572/466, loss: 1.380678653717041 2023-01-22 11:41:23.990836: step: 574/466, loss: 0.9654417037963867 2023-01-22 11:41:24.681244: step: 576/466, loss: 1.4313585758209229 2023-01-22 11:41:25.436038: step: 578/466, loss: 0.9260072708129883 2023-01-22 11:41:26.285117: step: 580/466, loss: 1.1708897352218628 2023-01-22 11:41:27.028223: step: 582/466, loss: 2.7310571670532227 2023-01-22 11:41:27.879625: step: 584/466, loss: 0.6548606157302856 2023-01-22 11:41:28.637955: step: 586/466, loss: 0.8693034648895264 2023-01-22 11:41:29.310037: step: 588/466, loss: 0.11078198999166489 2023-01-22 11:41:30.070244: step: 590/466, loss: 1.0300918817520142 2023-01-22 11:41:30.806342: step: 592/466, loss: 0.395668625831604 2023-01-22 11:41:31.562486: step: 594/466, loss: 2.4247560501098633 2023-01-22 11:41:32.380508: step: 596/466, loss: 1.270116925239563 2023-01-22 11:41:33.190922: step: 598/466, loss: 0.23916392028331757 2023-01-22 11:41:33.987843: step: 600/466, loss: 1.0338306427001953 2023-01-22 11:41:34.723060: step: 602/466, loss: 0.6416707634925842 2023-01-22 11:41:35.468692: step: 604/466, loss: 0.8634581565856934 2023-01-22 11:41:36.202920: step: 606/466, loss: 0.5713067650794983 2023-01-22 11:41:37.020809: step: 608/466, loss: 0.630683183670044 2023-01-22 11:41:37.805929: step: 610/466, loss: 0.7142400145530701 2023-01-22 11:41:38.638971: step: 612/466, loss: 6.908291816711426 2023-01-22 11:41:39.566300: step: 614/466, loss: 1.0872340202331543 2023-01-22 11:41:40.393365: step: 616/466, loss: 0.3696516156196594 2023-01-22 11:41:41.166412: step: 618/466, loss: 1.0836267471313477 2023-01-22 11:41:41.950675: step: 620/466, loss: 0.33955225348472595 2023-01-22 11:41:42.714127: step: 622/466, loss: 0.2656218111515045 2023-01-22 11:41:43.477329: step: 624/466, loss: 2.546243190765381 2023-01-22 11:41:44.251371: step: 626/466, loss: 
0.2593325972557068 2023-01-22 11:41:44.941873: step: 628/466, loss: 1.3277188539505005 2023-01-22 11:41:45.742969: step: 630/466, loss: 1.4344978332519531 2023-01-22 11:41:46.476667: step: 632/466, loss: 1.8098323345184326 2023-01-22 11:41:47.305561: step: 634/466, loss: 0.2669510841369629 2023-01-22 11:41:48.033916: step: 636/466, loss: 0.3324632942676544 2023-01-22 11:41:48.770930: step: 638/466, loss: 0.35632777214050293 2023-01-22 11:41:49.555168: step: 640/466, loss: 1.4985682964324951 2023-01-22 11:41:50.396976: step: 642/466, loss: 2.8833279609680176 2023-01-22 11:41:51.106007: step: 644/466, loss: 1.559320330619812 2023-01-22 11:41:51.957413: step: 646/466, loss: 1.1288251876831055 2023-01-22 11:41:52.734946: step: 648/466, loss: 1.512923002243042 2023-01-22 11:41:53.535319: step: 650/466, loss: 0.9332684278488159 2023-01-22 11:41:54.310810: step: 652/466, loss: 0.579450249671936 2023-01-22 11:41:55.113110: step: 654/466, loss: 0.6076934933662415 2023-01-22 11:41:55.924448: step: 656/466, loss: 4.72520637512207 2023-01-22 11:41:56.774953: step: 658/466, loss: 0.9739111661911011 2023-01-22 11:41:57.456807: step: 660/466, loss: 0.22468315064907074 2023-01-22 11:41:58.278457: step: 662/466, loss: 4.175695896148682 2023-01-22 11:41:59.137671: step: 664/466, loss: 1.0225563049316406 2023-01-22 11:41:59.882032: step: 666/466, loss: 0.8607184886932373 2023-01-22 11:42:00.701100: step: 668/466, loss: 2.4260406494140625 2023-01-22 11:42:01.389703: step: 670/466, loss: 0.7923364639282227 2023-01-22 11:42:02.146626: step: 672/466, loss: 1.084152340888977 2023-01-22 11:42:02.826166: step: 674/466, loss: 0.5868847370147705 2023-01-22 11:42:03.610263: step: 676/466, loss: 0.8401212692260742 2023-01-22 11:42:04.347316: step: 678/466, loss: 1.404846429824829 2023-01-22 11:42:05.086196: step: 680/466, loss: 0.825631856918335 2023-01-22 11:42:05.864535: step: 682/466, loss: 1.8228082656860352 2023-01-22 11:42:06.745259: step: 684/466, loss: 1.855266809463501 2023-01-22 
11:42:07.502007: step: 686/466, loss: 1.5666351318359375 2023-01-22 11:42:08.226099: step: 688/466, loss: 0.6711158752441406 2023-01-22 11:42:09.003310: step: 690/466, loss: 2.033846616744995 2023-01-22 11:42:09.825796: step: 692/466, loss: 0.7143286466598511 2023-01-22 11:42:10.603922: step: 694/466, loss: 0.57305508852005 2023-01-22 11:42:11.354132: step: 696/466, loss: 1.4142602682113647 2023-01-22 11:42:12.109674: step: 698/466, loss: 1.5018336772918701 2023-01-22 11:42:12.896281: step: 700/466, loss: 2.2561750411987305 2023-01-22 11:42:13.706021: step: 702/466, loss: 0.6018629670143127 2023-01-22 11:42:14.502416: step: 704/466, loss: 0.669532299041748 2023-01-22 11:42:15.264284: step: 706/466, loss: 1.3866297006607056 2023-01-22 11:42:15.993858: step: 708/466, loss: 1.3363064527511597 2023-01-22 11:42:16.817244: step: 710/466, loss: 0.15829530358314514 2023-01-22 11:42:17.625659: step: 712/466, loss: 0.6302545070648193 2023-01-22 11:42:18.508699: step: 714/466, loss: 3.307643413543701 2023-01-22 11:42:19.258355: step: 716/466, loss: 1.5970646142959595 2023-01-22 11:42:19.992099: step: 718/466, loss: 5.3095526695251465 2023-01-22 11:42:20.682984: step: 720/466, loss: 2.7509565353393555 2023-01-22 11:42:21.484543: step: 722/466, loss: 1.5113661289215088 2023-01-22 11:42:22.354351: step: 724/466, loss: 1.215741753578186 2023-01-22 11:42:23.148720: step: 726/466, loss: 0.8744622468948364 2023-01-22 11:42:23.898597: step: 728/466, loss: 1.2747092247009277 2023-01-22 11:42:24.651264: step: 730/466, loss: 1.9986467361450195 2023-01-22 11:42:25.493529: step: 732/466, loss: 0.8631834387779236 2023-01-22 11:42:26.210717: step: 734/466, loss: 0.5817731022834778 2023-01-22 11:42:27.021520: step: 736/466, loss: 4.832178115844727 2023-01-22 11:42:27.807535: step: 738/466, loss: 7.882238388061523 2023-01-22 11:42:28.590852: step: 740/466, loss: 0.30403223633766174 2023-01-22 11:42:29.397302: step: 742/466, loss: 1.302971363067627 2023-01-22 11:42:30.152146: step: 744/466, 
loss: 0.9712323546409607 2023-01-22 11:42:30.869391: step: 746/466, loss: 2.6433591842651367 2023-01-22 11:42:31.607182: step: 748/466, loss: 1.4316619634628296 2023-01-22 11:42:32.421649: step: 750/466, loss: 0.6358616948127747 2023-01-22 11:42:33.158800: step: 752/466, loss: 1.0994406938552856 2023-01-22 11:42:33.968225: step: 754/466, loss: 0.6324729323387146 2023-01-22 11:42:34.721538: step: 756/466, loss: 0.8818514943122864 2023-01-22 11:42:35.517460: step: 758/466, loss: 0.599430501461029 2023-01-22 11:42:36.333534: step: 760/466, loss: 1.258911371231079 2023-01-22 11:42:37.103626: step: 762/466, loss: 0.6301141381263733 2023-01-22 11:42:37.968198: step: 764/466, loss: 1.468433141708374 2023-01-22 11:42:38.710282: step: 766/466, loss: 0.8221463561058044 2023-01-22 11:42:39.501774: step: 768/466, loss: 1.4093496799468994 2023-01-22 11:42:40.300236: step: 770/466, loss: 2.1558051109313965 2023-01-22 11:42:41.137614: step: 772/466, loss: 5.0710296630859375 2023-01-22 11:42:41.942217: step: 774/466, loss: 1.9297935962677002 2023-01-22 11:42:42.732172: step: 776/466, loss: 1.6422162055969238 2023-01-22 11:42:43.483473: step: 778/466, loss: 0.5209250450134277 2023-01-22 11:42:44.261657: step: 780/466, loss: 0.2130260169506073 2023-01-22 11:42:45.010917: step: 782/466, loss: 1.0116814374923706 2023-01-22 11:42:45.755780: step: 784/466, loss: 0.807086169719696 2023-01-22 11:42:46.477774: step: 786/466, loss: 0.7357428073883057 2023-01-22 11:42:47.265940: step: 788/466, loss: 0.8373849391937256 2023-01-22 11:42:48.052410: step: 790/466, loss: 2.5438427925109863 2023-01-22 11:42:48.792906: step: 792/466, loss: 2.2465767860412598 2023-01-22 11:42:49.587818: step: 794/466, loss: 6.289047718048096 2023-01-22 11:42:50.401507: step: 796/466, loss: 1.2244832515716553 2023-01-22 11:42:51.147859: step: 798/466, loss: 0.9194517731666565 2023-01-22 11:42:51.914669: step: 800/466, loss: 0.5414337515830994 2023-01-22 11:42:52.636150: step: 802/466, loss: 0.45343855023384094 
2023-01-22 11:42:53.356577: step: 804/466, loss: 0.6903799772262573 2023-01-22 11:42:54.028567: step: 806/466, loss: 0.7730932831764221 2023-01-22 11:42:54.779511: step: 808/466, loss: 1.878928542137146 2023-01-22 11:42:55.428602: step: 810/466, loss: 0.9115766882896423 2023-01-22 11:42:56.302007: step: 812/466, loss: 5.4044294357299805 2023-01-22 11:42:57.098451: step: 814/466, loss: 1.1320093870162964 2023-01-22 11:42:57.795768: step: 816/466, loss: 1.3294376134872437 2023-01-22 11:42:58.569934: step: 818/466, loss: 2.8793349266052246 2023-01-22 11:42:59.391751: step: 820/466, loss: 11.329612731933594 2023-01-22 11:43:00.140786: step: 822/466, loss: 4.094210624694824 2023-01-22 11:43:00.962537: step: 824/466, loss: 2.4406044483184814 2023-01-22 11:43:01.715293: step: 826/466, loss: 1.6032664775848389 2023-01-22 11:43:02.462876: step: 828/466, loss: 0.45030879974365234 2023-01-22 11:43:03.299177: step: 830/466, loss: 1.1435775756835938 2023-01-22 11:43:04.031831: step: 832/466, loss: 0.5184001326560974 2023-01-22 11:43:04.804835: step: 834/466, loss: 1.1396267414093018 2023-01-22 11:43:05.547412: step: 836/466, loss: 2.8804054260253906 2023-01-22 11:43:06.284868: step: 838/466, loss: 12.526484489440918 2023-01-22 11:43:07.016101: step: 840/466, loss: 0.584930956363678 2023-01-22 11:43:07.795334: step: 842/466, loss: 1.3484601974487305 2023-01-22 11:43:08.590481: step: 844/466, loss: 0.5086191892623901 2023-01-22 11:43:09.445148: step: 846/466, loss: 0.2963380515575409 2023-01-22 11:43:10.221273: step: 848/466, loss: 0.8596228361129761 2023-01-22 11:43:10.931865: step: 850/466, loss: 5.564979553222656 2023-01-22 11:43:11.758490: step: 852/466, loss: 0.38722512125968933 2023-01-22 11:43:12.541325: step: 854/466, loss: 1.176393985748291 2023-01-22 11:43:13.290805: step: 856/466, loss: 0.8879210948944092 2023-01-22 11:43:14.038796: step: 858/466, loss: 4.816599369049072 2023-01-22 11:43:14.842705: step: 860/466, loss: 0.23427298665046692 2023-01-22 11:43:15.606212: 
step: 862/466, loss: 1.898385763168335 2023-01-22 11:43:16.292173: step: 864/466, loss: 0.7958246469497681 2023-01-22 11:43:17.162839: step: 866/466, loss: 1.7949293851852417 2023-01-22 11:43:17.891912: step: 868/466, loss: 1.4004677534103394 2023-01-22 11:43:18.632296: step: 870/466, loss: 0.6244205236434937 2023-01-22 11:43:19.500164: step: 872/466, loss: 0.2835509181022644 2023-01-22 11:43:20.396099: step: 874/466, loss: 1.0040892362594604 2023-01-22 11:43:21.107613: step: 876/466, loss: 2.8210952281951904 2023-01-22 11:43:21.803876: step: 878/466, loss: 0.45503246784210205 2023-01-22 11:43:22.584504: step: 880/466, loss: 0.5614334344863892 2023-01-22 11:43:23.309617: step: 882/466, loss: 2.8984780311584473 2023-01-22 11:43:24.034947: step: 884/466, loss: 0.786919355392456 2023-01-22 11:43:24.793043: step: 886/466, loss: 4.160480499267578 2023-01-22 11:43:25.579574: step: 888/466, loss: 2.23695707321167 2023-01-22 11:43:26.291870: step: 890/466, loss: 3.336459159851074 2023-01-22 11:43:27.102108: step: 892/466, loss: 0.5677067041397095 2023-01-22 11:43:27.879017: step: 894/466, loss: 0.22661782801151276 2023-01-22 11:43:28.628788: step: 896/466, loss: 1.7460044622421265 2023-01-22 11:43:29.422365: step: 898/466, loss: 0.900327205657959 2023-01-22 11:43:30.184702: step: 900/466, loss: 1.9194039106369019 2023-01-22 11:43:30.994779: step: 902/466, loss: 0.6010991334915161 2023-01-22 11:43:31.729849: step: 904/466, loss: 0.4171159863471985 2023-01-22 11:43:32.484827: step: 906/466, loss: 0.975546658039093 2023-01-22 11:43:33.218233: step: 908/466, loss: 0.8966106176376343 2023-01-22 11:43:34.004618: step: 910/466, loss: 1.053739309310913 2023-01-22 11:43:34.830702: step: 912/466, loss: 2.6362953186035156 2023-01-22 11:43:35.563998: step: 914/466, loss: 1.6291375160217285 2023-01-22 11:43:36.383205: step: 916/466, loss: 0.9473370313644409 2023-01-22 11:43:37.136903: step: 918/466, loss: 0.8623666763305664 2023-01-22 11:43:37.928993: step: 920/466, loss: 
0.6700409650802612 2023-01-22 11:43:38.662260: step: 922/466, loss: 0.7651454210281372 2023-01-22 11:43:39.415758: step: 924/466, loss: 0.7948426604270935 2023-01-22 11:43:40.125338: step: 926/466, loss: 1.1421416997909546 2023-01-22 11:43:40.927029: step: 928/466, loss: 0.7876683473587036 2023-01-22 11:43:41.683990: step: 930/466, loss: 0.4691295027732849 2023-01-22 11:43:42.458565: step: 932/466, loss: 0.23193420469760895 ================================================== Loss: 1.580 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28142266941491284, 'r': 0.22290377785415594, 'f1': 0.24876814026339344}, 'combined': 0.18330284019407936, 'epoch': 2} Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3433712454345975, 'r': 0.1910909864584257, 'f1': 0.2455370879216357}, 'combined': 0.1509154784298834, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2635567395407361, 'r': 0.2206289372670556, 'f1': 0.24018985335465023}, 'combined': 0.17698199720868962, 'epoch': 2} Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3295442934192746, 'r': 0.19533816184559247, 'f1': 0.2452837806923528}, 'combined': 0.150759787157251, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29145263218923934, 'r': 0.21597246090393538, 'f1': 0.24809865758562827}, 'combined': 0.18280953716835766, 'epoch': 2} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3430580857051445, 'r': 0.19624461675799032, 'f1': 0.2496679591904619}, 'combined': 0.1542066806764618, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3181818181818182, 
'r': 0.3, 'f1': 0.30882352941176466}, 'combined': 0.20588235294117643, 'epoch': 2} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.10576923076923077, 'r': 0.11956521739130435, 'f1': 0.11224489795918369}, 'combined': 0.056122448979591844, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.06896551724137931, 'f1': 0.1142857142857143}, 'combined': 0.0761904761904762, 'epoch': 2} New best chinese model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28142266941491284, 'r': 0.22290377785415594, 'f1': 0.24876814026339344}, 'combined': 0.18330284019407936, 'epoch': 2} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3433712454345975, 'r': 0.1910909864584257, 'f1': 0.2455370879216357}, 'combined': 0.1509154784298834, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3181818181818182, 'r': 0.3, 'f1': 0.30882352941176466}, 'combined': 0.20588235294117643, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.24474242399662569, 'r': 0.23965326206096807, 'f1': 0.24217110913133164}, 'combined': 0.17844186988624436, 'epoch': 1} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.28598942793466964, 'r': 0.18934131529641915, 'f1': 0.22783973145913355}, 'combined': 0.1400380788480528, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32407407407407407, 'r': 0.3804347826086957, 'f1': 0.35000000000000003}, 'combined': 0.17500000000000002, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 
'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2733346024977155, 'r': 0.24279153138528142, 'f1': 0.257159335148302}, 'combined': 0.18948582589874885, 'epoch': 1} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3071422824180177, 'r': 0.18380794621389143, 'f1': 0.22998336219955295}, 'combined': 0.14204854724090038, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3958333333333333, 'r': 0.16379310344827586, 'f1': 0.23170731707317074}, 'combined': 0.15447154471544716, 'epoch': 1} ****************************** Epoch: 3 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 11:46:36.909634: step: 2/466, loss: 0.7288981676101685 2023-01-22 11:46:37.761451: step: 4/466, loss: 1.0514813661575317 2023-01-22 11:46:38.561543: step: 6/466, loss: 1.2421069145202637 2023-01-22 11:46:39.285817: step: 8/466, loss: 0.5665553212165833 2023-01-22 11:46:40.053778: step: 10/466, loss: 1.139402151107788 2023-01-22 11:46:40.817835: step: 12/466, loss: 0.49877578020095825 2023-01-22 11:46:41.551238: step: 14/466, loss: 2.224588394165039 2023-01-22 11:46:42.344703: step: 16/466, loss: 0.5720035433769226 2023-01-22 11:46:43.062803: step: 18/466, loss: 1.1438024044036865 2023-01-22 11:46:43.812603: step: 20/466, loss: 1.1782970428466797 2023-01-22 11:46:44.597814: step: 22/466, loss: 2.60244083404541 2023-01-22 11:46:45.356614: step: 24/466, loss: 0.6441414952278137 2023-01-22 11:46:46.239022: step: 26/466, loss: 0.3326947093009949 2023-01-22 11:46:47.035813: step: 28/466, loss: 0.32072970271110535 2023-01-22 11:46:47.836734: step: 30/466, loss: 0.4882255494594574 2023-01-22 11:46:48.603092: step: 32/466, loss: 1.453972578048706 2023-01-22 11:46:49.419207: step: 34/466, 
loss: 0.5718610286712646 2023-01-22 11:46:50.275046: step: 36/466, loss: 0.7997777462005615 2023-01-22 11:46:50.957927: step: 38/466, loss: 0.3379814326763153 2023-01-22 11:46:51.656092: step: 40/466, loss: 0.7625579237937927 2023-01-22 11:46:52.371582: step: 42/466, loss: 0.29642653465270996 2023-01-22 11:46:53.120214: step: 44/466, loss: 1.0855971574783325 2023-01-22 11:46:53.835480: step: 46/466, loss: 1.693413496017456 2023-01-22 11:46:54.694719: step: 48/466, loss: 0.2988651394844055 2023-01-22 11:46:55.464233: step: 50/466, loss: 1.3019503355026245 2023-01-22 11:46:56.197919: step: 52/466, loss: 1.0249556303024292 2023-01-22 11:46:56.991010: step: 54/466, loss: 1.5618072748184204 2023-01-22 11:46:57.715173: step: 56/466, loss: 0.41827669739723206 2023-01-22 11:46:58.446854: step: 58/466, loss: 0.3538615107536316 2023-01-22 11:46:59.196301: step: 60/466, loss: 1.5650417804718018 2023-01-22 11:46:59.983479: step: 62/466, loss: 0.6613375544548035 2023-01-22 11:47:00.785485: step: 64/466, loss: 0.43019580841064453 2023-01-22 11:47:01.529397: step: 66/466, loss: 4.515617370605469 2023-01-22 11:47:02.299170: step: 68/466, loss: 1.161965250968933 2023-01-22 11:47:03.069641: step: 70/466, loss: 0.6723357439041138 2023-01-22 11:47:03.812137: step: 72/466, loss: 1.1802687644958496 2023-01-22 11:47:04.536924: step: 74/466, loss: 2.829439640045166 2023-01-22 11:47:05.340994: step: 76/466, loss: 0.6573655605316162 2023-01-22 11:47:06.046948: step: 78/466, loss: 0.21155864000320435 2023-01-22 11:47:06.938942: step: 80/466, loss: 1.0598139762878418 2023-01-22 11:47:07.747783: step: 82/466, loss: 0.7500097751617432 2023-01-22 11:47:08.565971: step: 84/466, loss: 1.260082721710205 2023-01-22 11:47:09.338016: step: 86/466, loss: 0.5332298278808594 2023-01-22 11:47:10.181831: step: 88/466, loss: 1.5100605487823486 2023-01-22 11:47:11.002646: step: 90/466, loss: 1.7919092178344727 2023-01-22 11:47:11.752918: step: 92/466, loss: 1.1008827686309814 2023-01-22 11:47:12.526588: 
step: 94/466, loss: 1.6091139316558838 2023-01-22 11:47:13.329985: step: 96/466, loss: 0.20046083629131317 2023-01-22 11:47:14.181797: step: 98/466, loss: 0.5788496136665344 2023-01-22 11:47:15.021380: step: 100/466, loss: 0.9448338150978088 2023-01-22 11:47:15.827858: step: 102/466, loss: 0.5384277105331421 2023-01-22 11:47:16.666191: step: 104/466, loss: 1.9115006923675537 2023-01-22 11:47:17.402834: step: 106/466, loss: 0.353412002325058 2023-01-22 11:47:18.158546: step: 108/466, loss: 0.2827959656715393 2023-01-22 11:47:18.892559: step: 110/466, loss: 0.7786230444908142 2023-01-22 11:47:19.619581: step: 112/466, loss: 0.6808529496192932 2023-01-22 11:47:20.348512: step: 114/466, loss: 0.3210010230541229 2023-01-22 11:47:21.107441: step: 116/466, loss: 1.32731032371521 2023-01-22 11:47:21.848159: step: 118/466, loss: 1.5391159057617188 2023-01-22 11:47:22.605705: step: 120/466, loss: 0.3658561408519745 2023-01-22 11:47:23.401906: step: 122/466, loss: 0.2683035433292389 2023-01-22 11:47:24.309234: step: 124/466, loss: 0.6769906282424927 2023-01-22 11:47:25.061009: step: 126/466, loss: 0.7051757574081421 2023-01-22 11:47:25.794477: step: 128/466, loss: 1.3545571565628052 2023-01-22 11:47:26.478947: step: 130/466, loss: 1.796260952949524 2023-01-22 11:47:27.245286: step: 132/466, loss: 0.980597198009491 2023-01-22 11:47:28.054527: step: 134/466, loss: 0.6104364395141602 2023-01-22 11:47:28.811216: step: 136/466, loss: 0.17875078320503235 2023-01-22 11:47:29.571086: step: 138/466, loss: 0.6792482137680054 2023-01-22 11:47:30.335262: step: 140/466, loss: 1.7077827453613281 2023-01-22 11:47:31.068321: step: 142/466, loss: 0.34921565651893616 2023-01-22 11:47:31.865186: step: 144/466, loss: 3.0503482818603516 2023-01-22 11:47:32.645505: step: 146/466, loss: 0.8565284013748169 2023-01-22 11:47:33.403516: step: 148/466, loss: 0.40316522121429443 2023-01-22 11:47:34.146408: step: 150/466, loss: 1.3210734128952026 2023-01-22 11:47:34.952497: step: 152/466, loss: 
3.314621686935425 2023-01-22 11:47:35.699569: step: 154/466, loss: 3.0272955894470215 2023-01-22 11:47:36.418179: step: 156/466, loss: 3.5029330253601074 2023-01-22 11:47:37.218392: step: 158/466, loss: 0.9041703939437866 2023-01-22 11:47:37.990444: step: 160/466, loss: 0.19255219399929047 2023-01-22 11:47:38.767909: step: 162/466, loss: 1.6574230194091797 2023-01-22 11:47:39.513078: step: 164/466, loss: 1.0237008333206177 2023-01-22 11:47:40.374849: step: 166/466, loss: 1.1813163757324219 2023-01-22 11:47:41.143445: step: 168/466, loss: 3.462944507598877 2023-01-22 11:47:41.831979: step: 170/466, loss: 0.7057815194129944 2023-01-22 11:47:42.608380: step: 172/466, loss: 0.3181644678115845 2023-01-22 11:47:43.400966: step: 174/466, loss: 1.4715278148651123 2023-01-22 11:47:44.123228: step: 176/466, loss: 1.6747549772262573 2023-01-22 11:47:44.878116: step: 178/466, loss: 0.20198678970336914 2023-01-22 11:47:45.681745: step: 180/466, loss: 0.48431089520454407 2023-01-22 11:47:46.494858: step: 182/466, loss: 1.4130982160568237 2023-01-22 11:47:47.246124: step: 184/466, loss: 0.7763614654541016 2023-01-22 11:47:47.977492: step: 186/466, loss: 1.540745496749878 2023-01-22 11:47:48.697601: step: 188/466, loss: 0.9028275609016418 2023-01-22 11:47:49.543576: step: 190/466, loss: 0.4500337243080139 2023-01-22 11:47:50.296907: step: 192/466, loss: 0.5722788572311401 2023-01-22 11:47:50.951830: step: 194/466, loss: 0.26888906955718994 2023-01-22 11:47:51.687029: step: 196/466, loss: 0.9907680749893188 2023-01-22 11:47:52.405577: step: 198/466, loss: 1.654240369796753 2023-01-22 11:47:53.215219: step: 200/466, loss: 0.21867898106575012 2023-01-22 11:47:54.048447: step: 202/466, loss: 3.248518228530884 2023-01-22 11:47:54.819940: step: 204/466, loss: 0.32449525594711304 2023-01-22 11:47:55.571879: step: 206/466, loss: 0.7105669379234314 2023-01-22 11:47:56.301614: step: 208/466, loss: 2.4783077239990234 2023-01-22 11:47:57.037942: step: 210/466, loss: 2.501128673553467 
2023-01-22 11:47:57.761280: step: 212/466, loss: 0.6300548315048218 2023-01-22 11:47:58.559314: step: 214/466, loss: 1.6797388792037964 2023-01-22 11:47:59.280401: step: 216/466, loss: 1.4843556880950928 2023-01-22 11:48:00.130663: step: 218/466, loss: 0.28681495785713196 2023-01-22 11:48:00.895991: step: 220/466, loss: 0.5974239110946655 2023-01-22 11:48:01.612894: step: 222/466, loss: 0.27215713262557983 2023-01-22 11:48:02.427960: step: 224/466, loss: 1.0060501098632812 2023-01-22 11:48:03.151898: step: 226/466, loss: 1.3134772777557373 2023-01-22 11:48:04.068881: step: 228/466, loss: 0.22591619193553925 2023-01-22 11:48:04.933985: step: 230/466, loss: 1.7979905605316162 2023-01-22 11:48:05.684302: step: 232/466, loss: 0.454550176858902 2023-01-22 11:48:06.468381: step: 234/466, loss: 0.7232887744903564 2023-01-22 11:48:07.203865: step: 236/466, loss: 2.3125431537628174 2023-01-22 11:48:07.977784: step: 238/466, loss: 1.082403302192688 2023-01-22 11:48:08.722333: step: 240/466, loss: 0.28135067224502563 2023-01-22 11:48:09.439991: step: 242/466, loss: 0.5144278407096863 2023-01-22 11:48:10.334277: step: 244/466, loss: 0.36738258600234985 2023-01-22 11:48:11.134726: step: 246/466, loss: 0.40780818462371826 2023-01-22 11:48:11.890645: step: 248/466, loss: 5.4448699951171875 2023-01-22 11:48:12.580879: step: 250/466, loss: 3.7832436561584473 2023-01-22 11:48:13.343104: step: 252/466, loss: 0.9051271080970764 2023-01-22 11:48:14.082207: step: 254/466, loss: 0.5934857726097107 2023-01-22 11:48:14.923148: step: 256/466, loss: 0.6028462648391724 2023-01-22 11:48:15.784818: step: 258/466, loss: 0.23693428933620453 2023-01-22 11:48:16.565812: step: 260/466, loss: 0.732874870300293 2023-01-22 11:48:17.442174: step: 262/466, loss: 1.4208934307098389 2023-01-22 11:48:18.150895: step: 264/466, loss: 1.9772591590881348 2023-01-22 11:48:18.824956: step: 266/466, loss: 1.3765772581100464 2023-01-22 11:48:19.616114: step: 268/466, loss: 2.3647990226745605 2023-01-22 
11:48:20.433778: step: 270/466, loss: 0.6535061001777649 2023-01-22 11:48:21.140012: step: 272/466, loss: 1.8163260221481323 2023-01-22 11:48:21.918476: step: 274/466, loss: 1.1725115776062012 2023-01-22 11:48:22.689001: step: 276/466, loss: 0.42887893319129944 2023-01-22 11:48:23.443857: step: 278/466, loss: 0.5292885899543762 2023-01-22 11:48:24.308533: step: 280/466, loss: 0.70928555727005 2023-01-22 11:48:25.035764: step: 282/466, loss: 0.512312650680542 2023-01-22 11:48:25.760502: step: 284/466, loss: 0.6690589189529419 2023-01-22 11:48:26.517230: step: 286/466, loss: 1.792073130607605 2023-01-22 11:48:27.286970: step: 288/466, loss: 2.455967426300049 2023-01-22 11:48:28.047552: step: 290/466, loss: 0.4556499421596527 2023-01-22 11:48:28.937409: step: 292/466, loss: 1.3838729858398438 2023-01-22 11:48:29.759148: step: 294/466, loss: 0.44971323013305664 2023-01-22 11:48:30.489940: step: 296/466, loss: 3.103240728378296 2023-01-22 11:48:31.179482: step: 298/466, loss: 0.699164867401123 2023-01-22 11:48:32.011831: step: 300/466, loss: 1.1217217445373535 2023-01-22 11:48:32.744654: step: 302/466, loss: 0.5284570455551147 2023-01-22 11:48:33.508708: step: 304/466, loss: 0.3365297317504883 2023-01-22 11:48:34.234485: step: 306/466, loss: 3.434846878051758 2023-01-22 11:48:35.096770: step: 308/466, loss: 1.6268279552459717 2023-01-22 11:48:35.987567: step: 310/466, loss: 0.2957576811313629 2023-01-22 11:48:36.747721: step: 312/466, loss: 0.7529169917106628 2023-01-22 11:48:37.536020: step: 314/466, loss: 1.4831832647323608 2023-01-22 11:48:38.267386: step: 316/466, loss: 0.6919432878494263 2023-01-22 11:48:39.024435: step: 318/466, loss: 1.175722599029541 2023-01-22 11:48:39.833888: step: 320/466, loss: 2.596788167953491 2023-01-22 11:48:40.557839: step: 322/466, loss: 0.6876751780509949 2023-01-22 11:48:41.331422: step: 324/466, loss: 1.394080638885498 2023-01-22 11:48:42.139185: step: 326/466, loss: 1.1835769414901733 2023-01-22 11:48:42.903415: step: 328/466, 
loss: 0.8824508190155029 2023-01-22 11:48:43.665411: step: 330/466, loss: 0.8156054019927979 2023-01-22 11:48:44.457582: step: 332/466, loss: 0.44112730026245117 2023-01-22 11:48:45.221627: step: 334/466, loss: 3.6188578605651855 2023-01-22 11:48:46.083719: step: 336/466, loss: 8.541708946228027 2023-01-22 11:48:46.830972: step: 338/466, loss: 0.5330957174301147 2023-01-22 11:48:47.690098: step: 340/466, loss: 1.2214475870132446 2023-01-22 11:48:48.479808: step: 342/466, loss: 0.2562291622161865 2023-01-22 11:48:49.236576: step: 344/466, loss: 0.5281143188476562 2023-01-22 11:48:50.046016: step: 346/466, loss: 0.6365292072296143 2023-01-22 11:48:50.843314: step: 348/466, loss: 0.5487217903137207 2023-01-22 11:48:51.626099: step: 350/466, loss: 0.427787184715271 2023-01-22 11:48:52.392359: step: 352/466, loss: 0.3734011650085449 2023-01-22 11:48:53.131000: step: 354/466, loss: 1.6304538249969482 2023-01-22 11:48:53.928153: step: 356/466, loss: 0.672451376914978 2023-01-22 11:48:54.714684: step: 358/466, loss: 0.26452600955963135 2023-01-22 11:48:55.426541: step: 360/466, loss: 1.5289206504821777 2023-01-22 11:48:56.248071: step: 362/466, loss: 0.3662125766277313 2023-01-22 11:48:56.966457: step: 364/466, loss: 1.2627530097961426 2023-01-22 11:48:57.654364: step: 366/466, loss: 0.7326240539550781 2023-01-22 11:48:58.401905: step: 368/466, loss: 2.4031803607940674 2023-01-22 11:48:59.156950: step: 370/466, loss: 0.7940996289253235 2023-01-22 11:48:59.869978: step: 372/466, loss: 0.6374187469482422 2023-01-22 11:49:00.667708: step: 374/466, loss: 0.5218483209609985 2023-01-22 11:49:01.408980: step: 376/466, loss: 0.6916596293449402 2023-01-22 11:49:02.319093: step: 378/466, loss: 2.400697708129883 2023-01-22 11:49:03.082758: step: 380/466, loss: 4.564194679260254 2023-01-22 11:49:03.780696: step: 382/466, loss: 0.934001088142395 2023-01-22 11:49:04.632039: step: 384/466, loss: 1.4111979007720947 2023-01-22 11:49:05.439075: step: 386/466, loss: 0.8816977143287659 
2023-01-22 11:49:06.275513: step: 388/466, loss: 0.3229823112487793 2023-01-22 11:49:06.988350: step: 390/466, loss: 1.5400192737579346 2023-01-22 11:49:07.766228: step: 392/466, loss: 0.622016191482544 2023-01-22 11:49:08.545454: step: 394/466, loss: 2.1396517753601074 2023-01-22 11:49:09.318236: step: 396/466, loss: 0.7787021398544312 2023-01-22 11:49:10.043424: step: 398/466, loss: 2.187607526779175 2023-01-22 11:49:10.761915: step: 400/466, loss: 0.5074290037155151 2023-01-22 11:49:11.473677: step: 402/466, loss: 1.3675506114959717 2023-01-22 11:49:12.250360: step: 404/466, loss: 0.653610110282898 2023-01-22 11:49:13.105576: step: 406/466, loss: 1.8830668926239014 2023-01-22 11:49:13.815357: step: 408/466, loss: 1.3166179656982422 2023-01-22 11:49:14.544724: step: 410/466, loss: 1.8001898527145386 2023-01-22 11:49:15.299380: step: 412/466, loss: 2.1397910118103027 2023-01-22 11:49:16.016380: step: 414/466, loss: 1.96988844871521 2023-01-22 11:49:16.801441: step: 416/466, loss: 2.072577476501465 2023-01-22 11:49:17.603364: step: 418/466, loss: 7.2129034996032715 2023-01-22 11:49:18.369355: step: 420/466, loss: 1.306861162185669 2023-01-22 11:49:19.137766: step: 422/466, loss: 1.119315505027771 2023-01-22 11:49:19.962306: step: 424/466, loss: 1.3532315492630005 2023-01-22 11:49:20.725476: step: 426/466, loss: 1.7584383487701416 2023-01-22 11:49:21.532641: step: 428/466, loss: 1.0544145107269287 2023-01-22 11:49:22.303112: step: 430/466, loss: 2.955805540084839 2023-01-22 11:49:23.019733: step: 432/466, loss: 0.6103273034095764 2023-01-22 11:49:23.859157: step: 434/466, loss: 0.7436304092407227 2023-01-22 11:49:24.628124: step: 436/466, loss: 1.3584591150283813 2023-01-22 11:49:25.262429: step: 438/466, loss: 1.8201582431793213 2023-01-22 11:49:25.989730: step: 440/466, loss: 0.9806408882141113 2023-01-22 11:49:26.756559: step: 442/466, loss: 1.0120773315429688 2023-01-22 11:49:27.494712: step: 444/466, loss: 1.465714931488037 2023-01-22 11:49:28.269428: step: 
446/466, loss: 1.6572892665863037 2023-01-22 11:49:29.123132: step: 448/466, loss: 1.9525045156478882 2023-01-22 11:49:29.905497: step: 450/466, loss: 1.0817575454711914 2023-01-22 11:49:30.598560: step: 452/466, loss: 1.029548168182373 2023-01-22 11:49:31.329248: step: 454/466, loss: 0.854709267616272 2023-01-22 11:49:32.097360: step: 456/466, loss: 1.9108412265777588 2023-01-22 11:49:32.953571: step: 458/466, loss: 0.26732340455055237 2023-01-22 11:49:33.746101: step: 460/466, loss: 0.5370470285415649 2023-01-22 11:49:34.531229: step: 462/466, loss: 7.576594829559326 2023-01-22 11:49:35.296215: step: 464/466, loss: 2.7582592964172363 2023-01-22 11:49:36.042976: step: 466/466, loss: 1.000118613243103 2023-01-22 11:49:36.908526: step: 468/466, loss: 0.4005153775215149 2023-01-22 11:49:37.654242: step: 470/466, loss: 0.6367888450622559 2023-01-22 11:49:38.430303: step: 472/466, loss: 3.750797748565674 2023-01-22 11:49:39.193663: step: 474/466, loss: 2.3002090454101562 2023-01-22 11:49:39.992699: step: 476/466, loss: 1.1184279918670654 2023-01-22 11:49:40.750918: step: 478/466, loss: 0.9593591690063477 2023-01-22 11:49:41.554909: step: 480/466, loss: 1.2366538047790527 2023-01-22 11:49:42.325364: step: 482/466, loss: 0.5184550881385803 2023-01-22 11:49:43.103092: step: 484/466, loss: 1.9456372261047363 2023-01-22 11:49:43.876989: step: 486/466, loss: 1.1863420009613037 2023-01-22 11:49:44.656683: step: 488/466, loss: 2.484473705291748 2023-01-22 11:49:45.402791: step: 490/466, loss: 4.424013137817383 2023-01-22 11:49:46.111759: step: 492/466, loss: 1.9624214172363281 2023-01-22 11:49:46.864896: step: 494/466, loss: 0.7419807314872742 2023-01-22 11:49:47.635933: step: 496/466, loss: 2.0698354244232178 2023-01-22 11:49:48.367336: step: 498/466, loss: 0.25887054204940796 2023-01-22 11:49:49.212174: step: 500/466, loss: 0.45883864164352417 2023-01-22 11:49:49.995716: step: 502/466, loss: 1.2493901252746582 2023-01-22 11:49:50.701043: step: 504/466, loss: 
0.8802942633628845 2023-01-22 11:49:51.559683: step: 506/466, loss: 0.30477413535118103 2023-01-22 11:49:52.342151: step: 508/466, loss: 2.7209062576293945 2023-01-22 11:49:53.039458: step: 510/466, loss: 1.2075614929199219 2023-01-22 11:49:53.804362: step: 512/466, loss: 0.5372397899627686 2023-01-22 11:49:54.833592: step: 514/466, loss: 1.5385627746582031 2023-01-22 11:49:55.579704: step: 516/466, loss: 1.5622609853744507 2023-01-22 11:49:56.265676: step: 518/466, loss: 1.6847238540649414 2023-01-22 11:49:57.059625: step: 520/466, loss: 0.8336131572723389 2023-01-22 11:49:57.810476: step: 522/466, loss: 1.57077956199646 2023-01-22 11:49:58.752457: step: 524/466, loss: 1.0002378225326538 2023-01-22 11:49:59.600511: step: 526/466, loss: 0.6085427403450012 2023-01-22 11:50:00.390583: step: 528/466, loss: 0.9000792503356934 2023-01-22 11:50:01.224726: step: 530/466, loss: 2.0616867542266846 2023-01-22 11:50:02.016709: step: 532/466, loss: 0.26455068588256836 2023-01-22 11:50:02.815337: step: 534/466, loss: 5.376952171325684 2023-01-22 11:50:03.585895: step: 536/466, loss: 1.079664945602417 2023-01-22 11:50:04.319818: step: 538/466, loss: 0.7638109922409058 2023-01-22 11:50:05.051482: step: 540/466, loss: 1.0749095678329468 2023-01-22 11:50:05.811157: step: 542/466, loss: 1.0012367963790894 2023-01-22 11:50:06.664094: step: 544/466, loss: 0.4597180187702179 2023-01-22 11:50:07.364389: step: 546/466, loss: 1.7872309684753418 2023-01-22 11:50:08.163722: step: 548/466, loss: 0.38119691610336304 2023-01-22 11:50:08.912984: step: 550/466, loss: 0.742358922958374 2023-01-22 11:50:09.699364: step: 552/466, loss: 1.5680882930755615 2023-01-22 11:50:10.489038: step: 554/466, loss: 0.574269711971283 2023-01-22 11:50:11.264132: step: 556/466, loss: 0.5157495141029358 2023-01-22 11:50:12.011472: step: 558/466, loss: 0.2620672285556793 2023-01-22 11:50:12.766340: step: 560/466, loss: 0.592666506767273 2023-01-22 11:50:13.581297: step: 562/466, loss: 0.42480170726776123 2023-01-22 
11:50:14.451347: step: 564/466, loss: 0.9412723183631897 2023-01-22 11:50:15.273677: step: 566/466, loss: 0.24113863706588745 2023-01-22 11:50:16.034538: step: 568/466, loss: 0.7110223770141602 2023-01-22 11:50:16.813992: step: 570/466, loss: 0.46712177991867065 2023-01-22 11:50:17.573344: step: 572/466, loss: 0.3590967655181885 2023-01-22 11:50:18.344191: step: 574/466, loss: 0.5464659929275513 2023-01-22 11:50:19.094629: step: 576/466, loss: 2.006821632385254 2023-01-22 11:50:19.866625: step: 578/466, loss: 0.6044853925704956 2023-01-22 11:50:20.655254: step: 580/466, loss: 1.7315723896026611 2023-01-22 11:50:21.399067: step: 582/466, loss: 0.8711855411529541 2023-01-22 11:50:22.147533: step: 584/466, loss: 0.8170568346977234 2023-01-22 11:50:22.999900: step: 586/466, loss: 1.6933026313781738 2023-01-22 11:50:23.727957: step: 588/466, loss: 1.312509536743164 2023-01-22 11:50:24.513937: step: 590/466, loss: 1.0134602785110474 2023-01-22 11:50:25.249420: step: 592/466, loss: 0.5736547708511353 2023-01-22 11:50:26.018557: step: 594/466, loss: 1.3567414283752441 2023-01-22 11:50:26.775205: step: 596/466, loss: 1.5852487087249756 2023-01-22 11:50:27.531855: step: 598/466, loss: 0.8714603185653687 2023-01-22 11:50:28.225552: step: 600/466, loss: 0.5756778120994568 2023-01-22 11:50:28.923954: step: 602/466, loss: 1.272370457649231 2023-01-22 11:50:29.725198: step: 604/466, loss: 1.356710433959961 2023-01-22 11:50:30.540477: step: 606/466, loss: 0.39522653818130493 2023-01-22 11:50:31.298164: step: 608/466, loss: 0.75468909740448 2023-01-22 11:50:32.105556: step: 610/466, loss: 0.3983873724937439 2023-01-22 11:50:33.012377: step: 612/466, loss: 1.1683955192565918 2023-01-22 11:50:33.830411: step: 614/466, loss: 1.212808609008789 2023-01-22 11:50:34.624597: step: 616/466, loss: 0.7545867562294006 2023-01-22 11:50:35.365948: step: 618/466, loss: 1.3609846830368042 2023-01-22 11:50:36.118306: step: 620/466, loss: 0.9695680737495422 2023-01-22 11:50:36.846570: step: 622/466, 
loss: 0.775894045829773 2023-01-22 11:50:37.596294: step: 624/466, loss: 1.3027985095977783 2023-01-22 11:50:38.449046: step: 626/466, loss: 0.5976076126098633 2023-01-22 11:50:39.195310: step: 628/466, loss: 0.483690470457077 2023-01-22 11:50:39.986111: step: 630/466, loss: 1.851898193359375 2023-01-22 11:50:40.728695: step: 632/466, loss: 1.1140954494476318 2023-01-22 11:50:41.521443: step: 634/466, loss: 0.3627002537250519 2023-01-22 11:50:42.309145: step: 636/466, loss: 4.963777542114258 2023-01-22 11:50:43.077359: step: 638/466, loss: 0.44731274247169495 2023-01-22 11:50:43.873479: step: 640/466, loss: 0.9005690813064575 2023-01-22 11:50:44.600554: step: 642/466, loss: 0.2202322781085968 2023-01-22 11:50:45.323819: step: 644/466, loss: 0.3271179795265198 2023-01-22 11:50:46.105448: step: 646/466, loss: 0.7696815133094788 2023-01-22 11:50:46.946238: step: 648/466, loss: 4.6255784034729 2023-01-22 11:50:47.661296: step: 650/466, loss: 0.17914696037769318 2023-01-22 11:50:48.396952: step: 652/466, loss: 0.9905416965484619 2023-01-22 11:50:49.150442: step: 654/466, loss: 0.5770795941352844 2023-01-22 11:50:49.948739: step: 656/466, loss: 2.672057628631592 2023-01-22 11:50:50.736843: step: 658/466, loss: 0.6635293960571289 2023-01-22 11:50:51.465116: step: 660/466, loss: 0.348417729139328 2023-01-22 11:50:52.236672: step: 662/466, loss: 1.1955702304840088 2023-01-22 11:50:53.032156: step: 664/466, loss: 0.9019771814346313 2023-01-22 11:50:53.775863: step: 666/466, loss: 1.5625412464141846 2023-01-22 11:50:54.531725: step: 668/466, loss: 0.8455973267555237 2023-01-22 11:50:55.305762: step: 670/466, loss: 3.6101083755493164 2023-01-22 11:50:56.098290: step: 672/466, loss: 0.7789993286132812 2023-01-22 11:50:56.782771: step: 674/466, loss: 3.386035919189453 2023-01-22 11:50:57.616821: step: 676/466, loss: 0.23694761097431183 2023-01-22 11:50:58.382459: step: 678/466, loss: 0.9110981225967407 2023-01-22 11:50:59.171582: step: 680/466, loss: 1.3185884952545166 
2023-01-22 11:51:00.021835: step: 682/466, loss: 3.4931812286376953 2023-01-22 11:51:00.831659: step: 684/466, loss: 0.4314889907836914 2023-01-22 11:51:01.678288: step: 686/466, loss: 0.441946804523468 2023-01-22 11:51:02.451372: step: 688/466, loss: 2.5487897396087646 2023-01-22 11:51:03.165716: step: 690/466, loss: 1.069916844367981 2023-01-22 11:51:03.897951: step: 692/466, loss: 1.4144566059112549 2023-01-22 11:51:04.715682: step: 694/466, loss: 2.8627562522888184 2023-01-22 11:51:05.491485: step: 696/466, loss: 0.24165946245193481 2023-01-22 11:51:06.187150: step: 698/466, loss: 0.5185352563858032 2023-01-22 11:51:07.013234: step: 700/466, loss: 0.8233320713043213 2023-01-22 11:51:07.758244: step: 702/466, loss: 0.9007467031478882 2023-01-22 11:51:08.477476: step: 704/466, loss: 0.9488805532455444 2023-01-22 11:51:09.271870: step: 706/466, loss: 2.8228328227996826 2023-01-22 11:51:10.032158: step: 708/466, loss: 0.9803735017776489 2023-01-22 11:51:10.809080: step: 710/466, loss: 0.8610438108444214 2023-01-22 11:51:11.552062: step: 712/466, loss: 1.4700127840042114 2023-01-22 11:51:12.282551: step: 714/466, loss: 0.8979735374450684 2023-01-22 11:51:13.270696: step: 716/466, loss: 0.7316739559173584 2023-01-22 11:51:14.019766: step: 718/466, loss: 0.3060496747493744 2023-01-22 11:51:14.789096: step: 720/466, loss: 0.6991862058639526 2023-01-22 11:51:15.568571: step: 722/466, loss: 1.6871004104614258 2023-01-22 11:51:16.412787: step: 724/466, loss: 1.376437783241272 2023-01-22 11:51:17.221911: step: 726/466, loss: 0.32766085863113403 2023-01-22 11:51:17.968252: step: 728/466, loss: 1.1724035739898682 2023-01-22 11:51:18.728453: step: 730/466, loss: 1.0425370931625366 2023-01-22 11:51:19.476056: step: 732/466, loss: 1.9704253673553467 2023-01-22 11:51:20.426135: step: 734/466, loss: 1.1451716423034668 2023-01-22 11:51:21.167118: step: 736/466, loss: 0.4858511686325073 2023-01-22 11:51:21.951097: step: 738/466, loss: 0.5664715766906738 2023-01-22 11:51:22.712082: 
step: 740/466, loss: 0.7027938961982727 2023-01-22 11:51:23.494540: step: 742/466, loss: 2.55523419380188 2023-01-22 11:51:24.292120: step: 744/466, loss: 0.21969053149223328 2023-01-22 11:51:25.176600: step: 746/466, loss: 0.8720443844795227 2023-01-22 11:51:25.982420: step: 748/466, loss: 1.728029727935791 2023-01-22 11:51:26.792503: step: 750/466, loss: 0.46208715438842773 2023-01-22 11:51:27.509674: step: 752/466, loss: 1.1562113761901855 2023-01-22 11:51:28.258365: step: 754/466, loss: 0.8755573034286499 2023-01-22 11:51:29.011101: step: 756/466, loss: 7.363640785217285 2023-01-22 11:51:29.744168: step: 758/466, loss: 1.521296739578247 2023-01-22 11:51:30.676233: step: 760/466, loss: 1.4269208908081055 2023-01-22 11:51:31.444914: step: 762/466, loss: 0.3243265151977539 2023-01-22 11:51:32.246792: step: 764/466, loss: 0.6574571132659912 2023-01-22 11:51:33.023310: step: 766/466, loss: 1.3891175985336304 2023-01-22 11:51:33.828491: step: 768/466, loss: 0.5049048662185669 2023-01-22 11:51:34.540608: step: 770/466, loss: 1.242082118988037 2023-01-22 11:51:35.278831: step: 772/466, loss: 3.593531608581543 2023-01-22 11:51:36.127825: step: 774/466, loss: 0.6223217844963074 2023-01-22 11:51:36.885491: step: 776/466, loss: 0.6413542628288269 2023-01-22 11:51:37.621973: step: 778/466, loss: 0.25577762722969055 2023-01-22 11:51:38.423150: step: 780/466, loss: 0.6415407657623291 2023-01-22 11:51:39.228646: step: 782/466, loss: 1.7713786363601685 2023-01-22 11:51:40.009413: step: 784/466, loss: 0.19394361972808838 2023-01-22 11:51:40.768108: step: 786/466, loss: 1.2882275581359863 2023-01-22 11:51:41.535891: step: 788/466, loss: 4.727072715759277 2023-01-22 11:51:42.327996: step: 790/466, loss: 7.86380672454834 2023-01-22 11:51:43.119105: step: 792/466, loss: 1.1790575981140137 2023-01-22 11:51:43.832444: step: 794/466, loss: 1.2071864604949951 2023-01-22 11:51:44.560112: step: 796/466, loss: 0.38711491227149963 2023-01-22 11:51:45.368506: step: 798/466, loss: 
4.495429992675781 2023-01-22 11:51:46.076045: step: 800/466, loss: 0.09176217019557953 2023-01-22 11:51:46.803924: step: 802/466, loss: 0.8673193454742432 2023-01-22 11:51:47.502545: step: 804/466, loss: 0.9799995422363281 2023-01-22 11:51:48.275853: step: 806/466, loss: 1.3106269836425781 2023-01-22 11:51:49.028640: step: 808/466, loss: 2.1999921798706055 2023-01-22 11:51:49.910463: step: 810/466, loss: 4.346255302429199 2023-01-22 11:51:50.717220: step: 812/466, loss: 1.0042102336883545 2023-01-22 11:51:51.564390: step: 814/466, loss: 1.5868946313858032 2023-01-22 11:51:52.388270: step: 816/466, loss: 0.7256672978401184 2023-01-22 11:51:53.110816: step: 818/466, loss: 0.5786141753196716 2023-01-22 11:51:53.846173: step: 820/466, loss: 0.5950932502746582 2023-01-22 11:51:54.650820: step: 822/466, loss: 0.7020928263664246 2023-01-22 11:51:55.415140: step: 824/466, loss: 0.9810025095939636 2023-01-22 11:51:56.132725: step: 826/466, loss: 2.0820469856262207 2023-01-22 11:51:56.878468: step: 828/466, loss: 0.6656241416931152 2023-01-22 11:51:57.601456: step: 830/466, loss: 0.782184362411499 2023-01-22 11:51:58.329763: step: 832/466, loss: 0.6237660050392151 2023-01-22 11:51:59.048385: step: 834/466, loss: 1.8134608268737793 2023-01-22 11:51:59.858542: step: 836/466, loss: 0.722774088382721 2023-01-22 11:52:00.580405: step: 838/466, loss: 0.8320057392120361 2023-01-22 11:52:01.314219: step: 840/466, loss: 9.907206535339355 2023-01-22 11:52:02.169495: step: 842/466, loss: 2.7266483306884766 2023-01-22 11:52:02.933990: step: 844/466, loss: 0.4295651316642761 2023-01-22 11:52:03.698056: step: 846/466, loss: 1.887283205986023 2023-01-22 11:52:04.381723: step: 848/466, loss: 1.4896022081375122 2023-01-22 11:52:05.102624: step: 850/466, loss: 0.31836339831352234 2023-01-22 11:52:05.861995: step: 852/466, loss: 1.7824702262878418 2023-01-22 11:52:06.590574: step: 854/466, loss: 3.638576030731201 2023-01-22 11:52:07.318646: step: 856/466, loss: 0.7391470670700073 2023-01-22 
11:52:08.035365: step: 858/466, loss: 1.0898805856704712 2023-01-22 11:52:08.839068: step: 860/466, loss: 0.36856454610824585 2023-01-22 11:52:09.660840: step: 862/466, loss: 2.7287936210632324 2023-01-22 11:52:10.432071: step: 864/466, loss: 0.2533276677131653 2023-01-22 11:52:11.181476: step: 866/466, loss: 0.35639724135398865 2023-01-22 11:52:11.967318: step: 868/466, loss: 0.5558145642280579 2023-01-22 11:52:12.710088: step: 870/466, loss: 0.34684237837791443 2023-01-22 11:52:13.509366: step: 872/466, loss: 1.4609794616699219 2023-01-22 11:52:14.266347: step: 874/466, loss: 0.4918462038040161 2023-01-22 11:52:14.999202: step: 876/466, loss: 0.9814804792404175 2023-01-22 11:52:15.760645: step: 878/466, loss: 0.5223897099494934 2023-01-22 11:52:16.536303: step: 880/466, loss: 1.834208369255066 2023-01-22 11:52:17.396142: step: 882/466, loss: 1.3285759687423706 2023-01-22 11:52:18.166712: step: 884/466, loss: 0.3768533766269684 2023-01-22 11:52:18.921067: step: 886/466, loss: 1.938004970550537 2023-01-22 11:52:19.680875: step: 888/466, loss: 0.7764403820037842 2023-01-22 11:52:20.422602: step: 890/466, loss: 0.6696011424064636 2023-01-22 11:52:21.262435: step: 892/466, loss: 0.8011062145233154 2023-01-22 11:52:21.946940: step: 894/466, loss: 1.0915991067886353 2023-01-22 11:52:22.698410: step: 896/466, loss: 1.013563871383667 2023-01-22 11:52:23.505485: step: 898/466, loss: 1.0057300329208374 2023-01-22 11:52:24.212720: step: 900/466, loss: 4.397829532623291 2023-01-22 11:52:24.931840: step: 902/466, loss: 1.8095632791519165 2023-01-22 11:52:25.677225: step: 904/466, loss: 3.2510788440704346 2023-01-22 11:52:26.526287: step: 906/466, loss: 1.2703807353973389 2023-01-22 11:52:27.273286: step: 908/466, loss: 0.4521823525428772 2023-01-22 11:52:27.987538: step: 910/466, loss: 0.5130724310874939 2023-01-22 11:52:28.714373: step: 912/466, loss: 0.23265279829502106 2023-01-22 11:52:29.553782: step: 914/466, loss: 1.25602388381958 2023-01-22 11:52:30.312407: step: 
916/466, loss: 0.9182683229446411
2023-01-22 11:52:31.068785: step: 918/466, loss: 2.160057306289673
2023-01-22 11:52:31.887085: step: 920/466, loss: 0.5430825352668762
2023-01-22 11:52:32.722282: step: 922/466, loss: 1.0246845483779907
2023-01-22 11:52:33.515746: step: 924/466, loss: 0.8522800207138062
2023-01-22 11:52:34.261868: step: 926/466, loss: 1.536653995513916
2023-01-22 11:52:35.005668: step: 928/466, loss: 1.9280500411987305
2023-01-22 11:52:35.843208: step: 930/466, loss: 0.48116955161094666
2023-01-22 11:52:36.573857: step: 932/466, loss: 0.2521522641181946
==================================================
Loss: 1.280
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28478231137979954, 'r': 0.26857079460296085, 'f1': 0.2764390796010945}, 'combined': 0.20369195339028012, 'epoch': 3}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32398281061971673, 'r': 0.2642870159294056, 'f1': 0.2911060413667393}, 'combined': 0.17892371323028852, 'epoch': 3}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791706942517267, 'r': 0.28552752220432764, 'f1': 0.28231332870859416}, 'combined': 0.2080203474694904, 'epoch': 3}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.306518116903463, 'r': 0.2717829323754754, 'f1': 0.28810735426506145}, 'combined': 0.1770806177434036, 'epoch': 3}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30090982880755607, 'r': 0.27635741393331525, 'f1': 0.2881114879186096}, 'combined': 0.21229267530844914, 'epoch': 3}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.33152736310705944, 'r': 0.26648266870316795, 'f1': 0.29546760679402523}, 'combined': 0.1824946983139568, 'epoch': 3}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.24127906976744187, 'r': 0.29642857142857143, 'f1': 0.266025641025641}, 'combined': 0.17735042735042733, 'epoch': 3}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29375, 'r': 0.5108695652173914, 'f1': 0.3730158730158731}, 'combined': 0.18650793650793654, 'epoch': 3}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35714285714285715, 'r': 0.1724137931034483, 'f1': 0.23255813953488377}, 'combined': 0.1550387596899225, 'epoch': 3}
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28142266941491284, 'r': 0.22290377785415594, 'f1': 0.24876814026339344}, 'combined': 0.18330284019407936, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3433712454345975, 'r': 0.1910909864584257, 'f1': 0.2455370879216357}, 'combined': 0.1509154784298834, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3181818181818182, 'r': 0.3, 'f1': 0.30882352941176466}, 'combined': 0.20588235294117643, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791706942517267, 'r': 0.28552752220432764, 'f1': 0.28231332870859416}, 'combined': 0.2080203474694904, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.306518116903463, 'r': 0.2717829323754754, 'f1': 0.28810735426506145}, 'combined': 0.1770806177434036, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29375, 'r': 0.5108695652173914, 'f1': 0.3730158730158731}, 'combined': 0.18650793650793654, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30090982880755607, 'r': 0.27635741393331525, 'f1': 0.2881114879186096}, 'combined': 0.21229267530844914, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.33152736310705944, 'r': 0.26648266870316795, 'f1': 0.29546760679402523}, 'combined': 0.1824946983139568, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35714285714285715, 'r': 0.1724137931034483, 'f1': 0.23255813953488377}, 'combined': 0.1550387596899225, 'epoch': 3}
******************************
Epoch: 4
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 11:55:38.753411: step: 2/466, loss: 1.2397675514221191
2023-01-22 11:55:39.523665: step: 4/466, loss: 1.0649540424346924
2023-01-22 11:55:40.229548: step: 6/466, loss: 0.5727893710136414
2023-01-22 11:55:40.921387: step: 8/466, loss: 0.516141951084137
2023-01-22 11:55:41.669124: step: 10/466, loss: 0.28951722383499146
2023-01-22 11:55:42.483562: step: 12/466, loss: 0.19822120666503906
2023-01-22 11:55:43.424363: step: 14/466, loss: 0.9063112139701843
2023-01-22 11:55:44.082986: step: 16/466, loss: 0.7773525714874268
2023-01-22 11:55:44.873424: step: 18/466, loss: 0.7677595019340515
2023-01-22 11:55:45.599054: step: 20/466, loss: 0.351625919342041
2023-01-22 11:55:46.291398: step: 22/466, loss: 0.45544004440307617
2023-01-22 11:55:47.090037: step: 24/466, loss: 0.7318277359008789
2023-01-22 11:55:47.846394: step: 26/466, loss: 4.886200904846191
2023-01-22 11:55:48.755461: step: 28/466, loss: 0.8121609687805176
2023-01-22 11:55:49.440661: step: 30/466, loss: 0.7934070825576782 2023-01-22 11:55:50.176189: step: 32/466, loss: 0.6122418642044067 2023-01-22 11:55:50.973099: step: 34/466, loss: 1.1905437707901 2023-01-22 11:55:51.739218: step: 36/466, loss: 1.7918663024902344 2023-01-22 11:55:52.416303: step: 38/466, loss: 0.3897348940372467 2023-01-22 11:55:53.179870: step: 40/466, loss: 2.976102828979492 2023-01-22 11:55:53.962763: step: 42/466, loss: 1.1098968982696533 2023-01-22 11:55:54.697205: step: 44/466, loss: 0.8799930810928345 2023-01-22 11:55:55.457210: step: 46/466, loss: 0.32722756266593933 2023-01-22 11:55:56.273498: step: 48/466, loss: 0.7166779637336731 2023-01-22 11:55:57.053259: step: 50/466, loss: 0.5385745763778687 2023-01-22 11:55:57.821051: step: 52/466, loss: 6.414247512817383 2023-01-22 11:55:58.538620: step: 54/466, loss: 0.9391055703163147 2023-01-22 11:55:59.377495: step: 56/466, loss: 1.185863971710205 2023-01-22 11:56:00.164115: step: 58/466, loss: 0.2553609311580658 2023-01-22 11:56:00.920478: step: 60/466, loss: 2.96231746673584 2023-01-22 11:56:01.671131: step: 62/466, loss: 0.5796050429344177 2023-01-22 11:56:02.474910: step: 64/466, loss: 1.1674344539642334 2023-01-22 11:56:03.217571: step: 66/466, loss: 0.18244606256484985 2023-01-22 11:56:03.946515: step: 68/466, loss: 1.3118362426757812 2023-01-22 11:56:04.739938: step: 70/466, loss: 1.1825385093688965 2023-01-22 11:56:05.475737: step: 72/466, loss: 0.4374426603317261 2023-01-22 11:56:06.202634: step: 74/466, loss: 0.6698932647705078 2023-01-22 11:56:06.907035: step: 76/466, loss: 0.5556724071502686 2023-01-22 11:56:07.628640: step: 78/466, loss: 0.7124918699264526 2023-01-22 11:56:08.446005: step: 80/466, loss: 0.8932681083679199 2023-01-22 11:56:09.236487: step: 82/466, loss: 0.7622426748275757 2023-01-22 11:56:10.118805: step: 84/466, loss: 0.9120004177093506 2023-01-22 11:56:10.855034: step: 86/466, loss: 0.16653411090373993 2023-01-22 11:56:11.665347: step: 88/466, loss: 
1.4499459266662598 2023-01-22 11:56:12.444720: step: 90/466, loss: 0.7019395232200623 2023-01-22 11:56:13.199388: step: 92/466, loss: 0.962942361831665 2023-01-22 11:56:13.963675: step: 94/466, loss: 0.7560228705406189 2023-01-22 11:56:14.867267: step: 96/466, loss: 0.6426159143447876 2023-01-22 11:56:15.587633: step: 98/466, loss: 0.941290020942688 2023-01-22 11:56:16.439371: step: 100/466, loss: 1.404322624206543 2023-01-22 11:56:17.222958: step: 102/466, loss: 0.30239057540893555 2023-01-22 11:56:17.938696: step: 104/466, loss: 0.9652661681175232 2023-01-22 11:56:18.760719: step: 106/466, loss: 0.870488166809082 2023-01-22 11:56:19.557712: step: 108/466, loss: 1.636003851890564 2023-01-22 11:56:20.340033: step: 110/466, loss: 1.6018774509429932 2023-01-22 11:56:21.103108: step: 112/466, loss: 0.7858424782752991 2023-01-22 11:56:21.880532: step: 114/466, loss: 0.997367262840271 2023-01-22 11:56:22.663618: step: 116/466, loss: 0.9120631217956543 2023-01-22 11:56:23.390814: step: 118/466, loss: 2.7179312705993652 2023-01-22 11:56:24.199309: step: 120/466, loss: 2.083798408508301 2023-01-22 11:56:24.964267: step: 122/466, loss: 0.18698270618915558 2023-01-22 11:56:25.755882: step: 124/466, loss: 0.3793947398662567 2023-01-22 11:56:26.531003: step: 126/466, loss: 0.8133951425552368 2023-01-22 11:56:27.326226: step: 128/466, loss: 0.3822079002857208 2023-01-22 11:56:28.124294: step: 130/466, loss: 0.2132749706506729 2023-01-22 11:56:28.906637: step: 132/466, loss: 1.0254547595977783 2023-01-22 11:56:29.680948: step: 134/466, loss: 2.9349710941314697 2023-01-22 11:56:30.470479: step: 136/466, loss: 0.7198674082756042 2023-01-22 11:56:31.192170: step: 138/466, loss: 0.6288474202156067 2023-01-22 11:56:31.993100: step: 140/466, loss: 0.686730146408081 2023-01-22 11:56:32.798215: step: 142/466, loss: 0.36508676409721375 2023-01-22 11:56:33.468629: step: 144/466, loss: 0.6521004438400269 2023-01-22 11:56:34.173160: step: 146/466, loss: 0.9777683615684509 2023-01-22 
11:56:34.919321: step: 148/466, loss: 1.2292046546936035 2023-01-22 11:56:35.709825: step: 150/466, loss: 0.512159526348114 2023-01-22 11:56:36.456019: step: 152/466, loss: 0.32447198033332825 2023-01-22 11:56:37.236823: step: 154/466, loss: 1.7976981401443481 2023-01-22 11:56:37.925087: step: 156/466, loss: 0.8636375069618225 2023-01-22 11:56:38.644439: step: 158/466, loss: 0.34068071842193604 2023-01-22 11:56:39.497147: step: 160/466, loss: 0.8749996423721313 2023-01-22 11:56:40.259299: step: 162/466, loss: 0.7675051689147949 2023-01-22 11:56:40.959322: step: 164/466, loss: 3.0022711753845215 2023-01-22 11:56:41.774585: step: 166/466, loss: 5.4731645584106445 2023-01-22 11:56:42.546795: step: 168/466, loss: 0.25334587693214417 2023-01-22 11:56:43.216377: step: 170/466, loss: 0.17516425251960754 2023-01-22 11:56:43.999596: step: 172/466, loss: 0.7780047059059143 2023-01-22 11:56:44.690348: step: 174/466, loss: 2.9079222679138184 2023-01-22 11:56:45.418690: step: 176/466, loss: 1.5633292198181152 2023-01-22 11:56:46.238050: step: 178/466, loss: 0.47790423035621643 2023-01-22 11:56:46.991424: step: 180/466, loss: 0.3932521343231201 2023-01-22 11:56:47.780655: step: 182/466, loss: 0.40621671080589294 2023-01-22 11:56:48.547331: step: 184/466, loss: 1.5518304109573364 2023-01-22 11:56:49.338748: step: 186/466, loss: 1.8420729637145996 2023-01-22 11:56:50.108501: step: 188/466, loss: 1.1289658546447754 2023-01-22 11:56:50.824149: step: 190/466, loss: 0.9936701059341431 2023-01-22 11:56:51.567499: step: 192/466, loss: 1.6482939720153809 2023-01-22 11:56:52.342601: step: 194/466, loss: 1.3317241668701172 2023-01-22 11:56:53.098458: step: 196/466, loss: 4.684289455413818 2023-01-22 11:56:53.856432: step: 198/466, loss: 0.5308117270469666 2023-01-22 11:56:54.646367: step: 200/466, loss: 0.6839389204978943 2023-01-22 11:56:55.390991: step: 202/466, loss: 0.8625197410583496 2023-01-22 11:56:56.188982: step: 204/466, loss: 0.6347617506980896 2023-01-22 11:56:56.966067: step: 
206/466, loss: 0.6131063103675842 2023-01-22 11:56:57.705663: step: 208/466, loss: 0.26425862312316895 2023-01-22 11:56:58.425036: step: 210/466, loss: 0.7124180793762207 2023-01-22 11:56:59.208137: step: 212/466, loss: 3.0077409744262695 2023-01-22 11:57:00.199576: step: 214/466, loss: 1.0376074314117432 2023-01-22 11:57:00.981477: step: 216/466, loss: 1.0748909711837769 2023-01-22 11:57:01.844184: step: 218/466, loss: 0.948563814163208 2023-01-22 11:57:02.586824: step: 220/466, loss: 0.6731117963790894 2023-01-22 11:57:03.397633: step: 222/466, loss: 1.0048935413360596 2023-01-22 11:57:04.175220: step: 224/466, loss: 0.5586294531822205 2023-01-22 11:57:04.934922: step: 226/466, loss: 0.4264393150806427 2023-01-22 11:57:05.732406: step: 228/466, loss: 0.28359612822532654 2023-01-22 11:57:06.477472: step: 230/466, loss: 4.026350021362305 2023-01-22 11:57:07.255407: step: 232/466, loss: 0.7377166152000427 2023-01-22 11:57:08.051442: step: 234/466, loss: 1.4324662685394287 2023-01-22 11:57:08.754951: step: 236/466, loss: 0.6396796703338623 2023-01-22 11:57:09.509953: step: 238/466, loss: 1.875848412513733 2023-01-22 11:57:10.307678: step: 240/466, loss: 2.0574350357055664 2023-01-22 11:57:11.011332: step: 242/466, loss: 0.4546733796596527 2023-01-22 11:57:11.787151: step: 244/466, loss: 0.5061032176017761 2023-01-22 11:57:12.608870: step: 246/466, loss: 3.3162288665771484 2023-01-22 11:57:13.364242: step: 248/466, loss: 0.8115181922912598 2023-01-22 11:57:14.095904: step: 250/466, loss: 3.5903172492980957 2023-01-22 11:57:14.799508: step: 252/466, loss: 0.9342566728591919 2023-01-22 11:57:15.689137: step: 254/466, loss: 3.475551128387451 2023-01-22 11:57:16.489473: step: 256/466, loss: 0.47010546922683716 2023-01-22 11:57:17.260550: step: 258/466, loss: 1.0341893434524536 2023-01-22 11:57:18.036227: step: 260/466, loss: 0.9817947149276733 2023-01-22 11:57:18.743356: step: 262/466, loss: 0.77620929479599 2023-01-22 11:57:19.457598: step: 264/466, loss: 
0.812580943107605 2023-01-22 11:57:20.287148: step: 266/466, loss: 1.2853327989578247 2023-01-22 11:57:21.059764: step: 268/466, loss: 0.25905662775039673 2023-01-22 11:57:21.887241: step: 270/466, loss: 1.620441198348999 2023-01-22 11:57:22.682682: step: 272/466, loss: 0.26145508885383606 2023-01-22 11:57:23.458927: step: 274/466, loss: 0.42695140838623047 2023-01-22 11:57:24.297426: step: 276/466, loss: 1.4670782089233398 2023-01-22 11:57:25.076485: step: 278/466, loss: 0.37129366397857666 2023-01-22 11:57:25.881390: step: 280/466, loss: 1.6939526796340942 2023-01-22 11:57:26.685768: step: 282/466, loss: 0.25103333592414856 2023-01-22 11:57:27.424675: step: 284/466, loss: 0.5259407758712769 2023-01-22 11:57:28.169439: step: 286/466, loss: 0.353412002325058 2023-01-22 11:57:28.958710: step: 288/466, loss: 1.4371587038040161 2023-01-22 11:57:29.712450: step: 290/466, loss: 1.5338389873504639 2023-01-22 11:57:30.479208: step: 292/466, loss: 3.5602529048919678 2023-01-22 11:57:31.286571: step: 294/466, loss: 3.602296829223633 2023-01-22 11:57:32.069739: step: 296/466, loss: 0.17957167327404022 2023-01-22 11:57:32.827271: step: 298/466, loss: 1.5521713495254517 2023-01-22 11:57:33.641922: step: 300/466, loss: 0.30981093645095825 2023-01-22 11:57:34.445232: step: 302/466, loss: 0.8085811734199524 2023-01-22 11:57:35.220426: step: 304/466, loss: 1.3801547288894653 2023-01-22 11:57:35.928655: step: 306/466, loss: 0.9548262357711792 2023-01-22 11:57:36.804941: step: 308/466, loss: 0.33303385972976685 2023-01-22 11:57:37.652850: step: 310/466, loss: 0.19228722155094147 2023-01-22 11:57:38.493049: step: 312/466, loss: 0.8639482855796814 2023-01-22 11:57:39.250206: step: 314/466, loss: 0.6498980522155762 2023-01-22 11:57:39.947391: step: 316/466, loss: 0.5746176242828369 2023-01-22 11:57:40.785522: step: 318/466, loss: 1.6305537223815918 2023-01-22 11:57:41.546283: step: 320/466, loss: 0.8195745944976807 2023-01-22 11:57:42.297173: step: 322/466, loss: 1.6166954040527344 
2023-01-22 11:57:43.084899: step: 324/466, loss: 2.4278018474578857 2023-01-22 11:57:43.795220: step: 326/466, loss: 0.8675681948661804 2023-01-22 11:57:44.508722: step: 328/466, loss: 0.37489765882492065 2023-01-22 11:57:45.226935: step: 330/466, loss: 2.4730303287506104 2023-01-22 11:57:46.003484: step: 332/466, loss: 0.8059287071228027 2023-01-22 11:57:46.768825: step: 334/466, loss: 0.3558503985404968 2023-01-22 11:57:47.530514: step: 336/466, loss: 0.8252885937690735 2023-01-22 11:57:48.240174: step: 338/466, loss: 0.5280299782752991 2023-01-22 11:57:48.974854: step: 340/466, loss: 0.7340704202651978 2023-01-22 11:57:49.763236: step: 342/466, loss: 0.8384386897087097 2023-01-22 11:57:50.561570: step: 344/466, loss: 1.6607086658477783 2023-01-22 11:57:51.322262: step: 346/466, loss: 0.3305199444293976 2023-01-22 11:57:52.060841: step: 348/466, loss: 0.8351885676383972 2023-01-22 11:57:52.830035: step: 350/466, loss: 1.1301592588424683 2023-01-22 11:57:53.637869: step: 352/466, loss: 1.4941134452819824 2023-01-22 11:57:54.394245: step: 354/466, loss: 0.6128787994384766 2023-01-22 11:57:55.139946: step: 356/466, loss: 0.5210601687431335 2023-01-22 11:57:55.918906: step: 358/466, loss: 1.4006214141845703 2023-01-22 11:57:56.642829: step: 360/466, loss: 0.4968298077583313 2023-01-22 11:57:57.362001: step: 362/466, loss: 0.39130163192749023 2023-01-22 11:57:58.143627: step: 364/466, loss: 0.5103227496147156 2023-01-22 11:57:58.904876: step: 366/466, loss: 0.204085111618042 2023-01-22 11:57:59.645496: step: 368/466, loss: 0.20469868183135986 2023-01-22 11:58:00.401653: step: 370/466, loss: 0.20196063816547394 2023-01-22 11:58:01.133567: step: 372/466, loss: 3.2092690467834473 2023-01-22 11:58:01.992612: step: 374/466, loss: 0.4920191168785095 2023-01-22 11:58:02.652903: step: 376/466, loss: 1.18999445438385 2023-01-22 11:58:03.311665: step: 378/466, loss: 3.111128568649292 2023-01-22 11:58:04.047730: step: 380/466, loss: 0.5756344795227051 2023-01-22 11:58:04.865147: 
step: 382/466, loss: 0.1449517458677292 2023-01-22 11:58:05.593530: step: 384/466, loss: 0.24137958884239197 2023-01-22 11:58:06.396052: step: 386/466, loss: 0.8756279945373535 2023-01-22 11:58:07.166157: step: 388/466, loss: 0.8774224519729614 2023-01-22 11:58:07.968290: step: 390/466, loss: 1.375420331954956 2023-01-22 11:58:08.740725: step: 392/466, loss: 0.25797238945961 2023-01-22 11:58:09.539666: step: 394/466, loss: 0.740105390548706 2023-01-22 11:58:10.438652: step: 396/466, loss: 0.4930828809738159 2023-01-22 11:58:11.272151: step: 398/466, loss: 0.40567928552627563 2023-01-22 11:58:12.013617: step: 400/466, loss: 0.37840887904167175 2023-01-22 11:58:12.852958: step: 402/466, loss: 4.936130046844482 2023-01-22 11:58:13.629127: step: 404/466, loss: 0.722000777721405 2023-01-22 11:58:14.352490: step: 406/466, loss: 0.5378820300102234 2023-01-22 11:58:15.182535: step: 408/466, loss: 3.1902055740356445 2023-01-22 11:58:16.009373: step: 410/466, loss: 1.2864125967025757 2023-01-22 11:58:16.788818: step: 412/466, loss: 1.8357758522033691 2023-01-22 11:58:17.593546: step: 414/466, loss: 0.48244792222976685 2023-01-22 11:58:18.387108: step: 416/466, loss: 0.43493497371673584 2023-01-22 11:58:19.179401: step: 418/466, loss: 1.562419056892395 2023-01-22 11:58:20.020425: step: 420/466, loss: 1.8326852321624756 2023-01-22 11:58:20.778511: step: 422/466, loss: 1.6586694717407227 2023-01-22 11:58:21.572704: step: 424/466, loss: 1.4311715364456177 2023-01-22 11:58:22.296783: step: 426/466, loss: 1.946704387664795 2023-01-22 11:58:23.157810: step: 428/466, loss: 2.4207661151885986 2023-01-22 11:58:23.892681: step: 430/466, loss: 1.1361252069473267 2023-01-22 11:58:24.588171: step: 432/466, loss: 2.117689371109009 2023-01-22 11:58:25.365504: step: 434/466, loss: 0.7453824877738953 2023-01-22 11:58:26.130260: step: 436/466, loss: 3.592252016067505 2023-01-22 11:58:26.983484: step: 438/466, loss: 0.7220326066017151 2023-01-22 11:58:27.744649: step: 440/466, loss: 
0.30927208065986633 2023-01-22 11:58:28.521812: step: 442/466, loss: 0.6411184668540955 2023-01-22 11:58:29.244512: step: 444/466, loss: 0.5024139881134033 2023-01-22 11:58:29.958583: step: 446/466, loss: 0.4438614845275879 2023-01-22 11:58:30.764545: step: 448/466, loss: 0.2669348418712616 2023-01-22 11:58:31.458453: step: 450/466, loss: 0.9279356002807617 2023-01-22 11:58:32.243147: step: 452/466, loss: 0.6318209767341614 2023-01-22 11:58:32.982652: step: 454/466, loss: 2.680860757827759 2023-01-22 11:58:33.774720: step: 456/466, loss: 1.1922527551651 2023-01-22 11:58:34.552882: step: 458/466, loss: 0.10912270098924637 2023-01-22 11:58:35.437920: step: 460/466, loss: 0.18991920351982117 2023-01-22 11:58:36.157942: step: 462/466, loss: 1.3336913585662842 2023-01-22 11:58:36.963547: step: 464/466, loss: 2.4666285514831543 2023-01-22 11:58:37.763961: step: 466/466, loss: 0.3230827748775482 2023-01-22 11:58:38.515463: step: 468/466, loss: 0.6548050045967102 2023-01-22 11:58:39.311772: step: 470/466, loss: 0.5973828434944153 2023-01-22 11:58:40.055643: step: 472/466, loss: 0.7570876479148865 2023-01-22 11:58:40.837680: step: 474/466, loss: 1.6824928522109985 2023-01-22 11:58:41.687505: step: 476/466, loss: 0.658035397529602 2023-01-22 11:58:42.427572: step: 478/466, loss: 0.2836608290672302 2023-01-22 11:58:43.182833: step: 480/466, loss: 1.383255124092102 2023-01-22 11:58:43.931408: step: 482/466, loss: 1.4585496187210083 2023-01-22 11:58:44.682679: step: 484/466, loss: 3.030864953994751 2023-01-22 11:58:45.449998: step: 486/466, loss: 0.32928135991096497 2023-01-22 11:58:46.243326: step: 488/466, loss: 2.591329574584961 2023-01-22 11:58:46.990806: step: 490/466, loss: 1.8271920680999756 2023-01-22 11:58:47.736874: step: 492/466, loss: 0.5807033181190491 2023-01-22 11:58:48.519864: step: 494/466, loss: 0.2876582741737366 2023-01-22 11:58:49.396308: step: 496/466, loss: 0.8153465986251831 2023-01-22 11:58:50.231659: step: 498/466, loss: 0.9287121891975403 2023-01-22 
11:58:50.991738: step: 500/466, loss: 1.139840006828308 2023-01-22 11:58:51.756714: step: 502/466, loss: 0.5210357308387756 2023-01-22 11:58:52.558831: step: 504/466, loss: 1.055281162261963 2023-01-22 11:58:53.360020: step: 506/466, loss: 0.5654680132865906 2023-01-22 11:58:54.169185: step: 508/466, loss: 0.5230228304862976 2023-01-22 11:58:54.980413: step: 510/466, loss: 1.373519778251648 2023-01-22 11:58:55.735027: step: 512/466, loss: 1.0065639019012451 2023-01-22 11:58:56.628233: step: 514/466, loss: 2.237562656402588 2023-01-22 11:58:57.423653: step: 516/466, loss: 1.6026716232299805 2023-01-22 11:58:58.097467: step: 518/466, loss: 0.37853115797042847 2023-01-22 11:58:58.814753: step: 520/466, loss: 0.3763364255428314 2023-01-22 11:58:59.624483: step: 522/466, loss: 0.8508153557777405 2023-01-22 11:59:00.430608: step: 524/466, loss: 0.9764190316200256 2023-01-22 11:59:01.255884: step: 526/466, loss: 0.9147853255271912 2023-01-22 11:59:02.083279: step: 528/466, loss: 5.266207218170166 2023-01-22 11:59:02.936606: step: 530/466, loss: 0.34616580605506897 2023-01-22 11:59:03.636717: step: 532/466, loss: 0.5940641760826111 2023-01-22 11:59:04.374331: step: 534/466, loss: 1.3932961225509644 2023-01-22 11:59:05.089540: step: 536/466, loss: 0.3049767017364502 2023-01-22 11:59:05.771378: step: 538/466, loss: 0.5108581185340881 2023-01-22 11:59:06.534689: step: 540/466, loss: 0.6778570413589478 2023-01-22 11:59:07.330699: step: 542/466, loss: 0.8873158693313599 2023-01-22 11:59:08.141509: step: 544/466, loss: 2.1264898777008057 2023-01-22 11:59:08.924609: step: 546/466, loss: 1.411435842514038 2023-01-22 11:59:09.646324: step: 548/466, loss: 0.3980066180229187 2023-01-22 11:59:10.389493: step: 550/466, loss: 1.2385305166244507 2023-01-22 11:59:11.222758: step: 552/466, loss: 1.7239978313446045 2023-01-22 11:59:12.037223: step: 554/466, loss: 0.9625904560089111 2023-01-22 11:59:12.858380: step: 556/466, loss: 0.47288978099823 2023-01-22 11:59:13.681831: step: 558/466, 
loss: 1.9926071166992188 2023-01-22 11:59:14.433557: step: 560/466, loss: 2.027879238128662 2023-01-22 11:59:15.188044: step: 562/466, loss: 0.6236366033554077 2023-01-22 11:59:15.938420: step: 564/466, loss: 0.5292195081710815 2023-01-22 11:59:16.712990: step: 566/466, loss: 1.7229830026626587 2023-01-22 11:59:17.489983: step: 568/466, loss: 1.3079602718353271 2023-01-22 11:59:18.275973: step: 570/466, loss: 0.18612471222877502 2023-01-22 11:59:19.058221: step: 572/466, loss: 0.966667890548706 2023-01-22 11:59:19.818239: step: 574/466, loss: 1.0524870157241821 2023-01-22 11:59:20.611967: step: 576/466, loss: 0.3565841019153595 2023-01-22 11:59:21.389518: step: 578/466, loss: 1.3353606462478638 2023-01-22 11:59:22.112011: step: 580/466, loss: 0.52093505859375 2023-01-22 11:59:22.908556: step: 582/466, loss: 0.28593239188194275 2023-01-22 11:59:23.667815: step: 584/466, loss: 1.649446725845337 2023-01-22 11:59:24.382965: step: 586/466, loss: 0.9959136843681335 2023-01-22 11:59:25.145308: step: 588/466, loss: 0.5500845313072205 2023-01-22 11:59:25.953179: step: 590/466, loss: 0.5372470021247864 2023-01-22 11:59:26.661255: step: 592/466, loss: 4.194912433624268 2023-01-22 11:59:27.482240: step: 594/466, loss: 0.7411658763885498 2023-01-22 11:59:28.148612: step: 596/466, loss: 0.494291216135025 2023-01-22 11:59:28.860643: step: 598/466, loss: 0.4416239559650421 2023-01-22 11:59:29.651239: step: 600/466, loss: 0.610022783279419 2023-01-22 11:59:30.460995: step: 602/466, loss: 0.22815054655075073 2023-01-22 11:59:31.235131: step: 604/466, loss: 2.0596587657928467 2023-01-22 11:59:32.021795: step: 606/466, loss: 2.188324451446533 2023-01-22 11:59:32.744908: step: 608/466, loss: 0.4296473264694214 2023-01-22 11:59:33.573751: step: 610/466, loss: 0.8137211799621582 2023-01-22 11:59:34.354950: step: 612/466, loss: 0.8927656412124634 2023-01-22 11:59:35.176277: step: 614/466, loss: 1.6651499271392822 2023-01-22 11:59:36.077556: step: 616/466, loss: 0.8619599938392639 
2023-01-22 11:59:36.907341: step: 618/466, loss: 0.9984352588653564 2023-01-22 11:59:37.710236: step: 620/466, loss: 0.28391364216804504 2023-01-22 11:59:38.493420: step: 622/466, loss: 0.7742931246757507 2023-01-22 11:59:39.320271: step: 624/466, loss: 1.2217891216278076 2023-01-22 11:59:40.074434: step: 626/466, loss: 0.5850755572319031 2023-01-22 11:59:40.829661: step: 628/466, loss: 1.6307834386825562 2023-01-22 11:59:41.633022: step: 630/466, loss: 0.39652201533317566 2023-01-22 11:59:42.446550: step: 632/466, loss: 0.7861143350601196 2023-01-22 11:59:43.074769: step: 634/466, loss: 0.38449627161026 2023-01-22 11:59:43.893980: step: 636/466, loss: 2.7891414165496826 2023-01-22 11:59:44.689891: step: 638/466, loss: 0.6628226041793823 2023-01-22 11:59:45.425321: step: 640/466, loss: 0.41520482301712036 2023-01-22 11:59:46.298497: step: 642/466, loss: 0.2679263949394226 2023-01-22 11:59:47.153660: step: 644/466, loss: 1.1348047256469727 2023-01-22 11:59:47.980659: step: 646/466, loss: 0.46422290802001953 2023-01-22 11:59:48.812019: step: 648/466, loss: 0.2260451465845108 2023-01-22 11:59:49.579085: step: 650/466, loss: 1.5560388565063477 2023-01-22 11:59:50.411188: step: 652/466, loss: 2.7813427448272705 2023-01-22 11:59:51.286284: step: 654/466, loss: 0.5180941224098206 2023-01-22 11:59:52.009263: step: 656/466, loss: 1.9066956043243408 2023-01-22 11:59:52.748338: step: 658/466, loss: 0.7163376808166504 2023-01-22 11:59:53.510161: step: 660/466, loss: 0.8216008543968201 2023-01-22 11:59:54.317945: step: 662/466, loss: 0.6733957529067993 2023-01-22 11:59:55.094252: step: 664/466, loss: 0.9260329008102417 2023-01-22 11:59:55.861570: step: 666/466, loss: 0.31455352902412415 2023-01-22 11:59:56.651768: step: 668/466, loss: 1.2638105154037476 2023-01-22 11:59:57.507694: step: 670/466, loss: 1.3063114881515503 2023-01-22 11:59:58.298237: step: 672/466, loss: 0.7214428186416626 2023-01-22 11:59:59.077104: step: 674/466, loss: 1.3825799226760864 2023-01-22 
11:59:59.820593: step: 676/466, loss: 0.8210784196853638 2023-01-22 12:00:00.554086: step: 678/466, loss: 4.343418598175049 2023-01-22 12:00:01.423187: step: 680/466, loss: 0.6992369294166565 2023-01-22 12:00:02.172466: step: 682/466, loss: 0.6307631731033325 2023-01-22 12:00:02.952170: step: 684/466, loss: 0.2862870991230011 2023-01-22 12:00:03.778813: step: 686/466, loss: 0.795841634273529 2023-01-22 12:00:04.565707: step: 688/466, loss: 0.35162660479545593 2023-01-22 12:00:05.273498: step: 690/466, loss: 0.24319665133953094 2023-01-22 12:00:06.038247: step: 692/466, loss: 0.5327470898628235 2023-01-22 12:00:06.767425: step: 694/466, loss: 0.3367350399494171 2023-01-22 12:00:07.569106: step: 696/466, loss: 0.2241695076227188 2023-01-22 12:00:08.312154: step: 698/466, loss: 0.36978334188461304 2023-01-22 12:00:09.159153: step: 700/466, loss: 1.1117911338806152 2023-01-22 12:00:09.904457: step: 702/466, loss: 0.6542451977729797 2023-01-22 12:00:10.730399: step: 704/466, loss: 0.7758954763412476 2023-01-22 12:00:11.492672: step: 706/466, loss: 1.0059378147125244 2023-01-22 12:00:12.308092: step: 708/466, loss: 1.8209892511367798 2023-01-22 12:00:13.073814: step: 710/466, loss: 1.159255027770996 2023-01-22 12:00:13.899645: step: 712/466, loss: 3.3368239402770996 2023-01-22 12:00:14.659248: step: 714/466, loss: 0.5656745433807373 2023-01-22 12:00:15.416223: step: 716/466, loss: 2.0161354541778564 2023-01-22 12:00:16.155909: step: 718/466, loss: 0.24916690587997437 2023-01-22 12:00:17.089792: step: 720/466, loss: 2.7012593746185303 2023-01-22 12:00:17.844260: step: 722/466, loss: 0.9164566993713379 2023-01-22 12:00:18.583319: step: 724/466, loss: 0.8179494738578796 2023-01-22 12:00:19.227127: step: 726/466, loss: 0.8097164630889893 2023-01-22 12:00:20.044071: step: 728/466, loss: 0.7410258054733276 2023-01-22 12:00:20.776630: step: 730/466, loss: 7.917463302612305 2023-01-22 12:00:21.525502: step: 732/466, loss: 7.717822074890137 2023-01-22 12:00:22.348152: step: 
734/466, loss: 1.1576268672943115 2023-01-22 12:00:23.065210: step: 736/466, loss: 0.34166157245635986 2023-01-22 12:00:23.857403: step: 738/466, loss: 1.1671950817108154 2023-01-22 12:00:24.646526: step: 740/466, loss: 0.6236756443977356 2023-01-22 12:00:25.433435: step: 742/466, loss: 0.6424854397773743 2023-01-22 12:00:26.211904: step: 744/466, loss: 3.8019018173217773 2023-01-22 12:00:27.039258: step: 746/466, loss: 0.8778786063194275 2023-01-22 12:00:27.775248: step: 748/466, loss: 0.9644218683242798 2023-01-22 12:00:28.592870: step: 750/466, loss: 0.9258589744567871 2023-01-22 12:00:29.395382: step: 752/466, loss: 1.182026743888855 2023-01-22 12:00:30.155413: step: 754/466, loss: 0.4859839975833893 2023-01-22 12:00:30.888701: step: 756/466, loss: 0.42185384035110474 2023-01-22 12:00:31.657978: step: 758/466, loss: 1.056136965751648 2023-01-22 12:00:32.389515: step: 760/466, loss: 1.8052279949188232 2023-01-22 12:00:33.173712: step: 762/466, loss: 0.25396454334259033 2023-01-22 12:00:33.814408: step: 764/466, loss: 0.984928548336029 2023-01-22 12:00:34.536012: step: 766/466, loss: 0.32059386372566223 2023-01-22 12:00:35.322958: step: 768/466, loss: 1.5475854873657227 2023-01-22 12:00:36.091477: step: 770/466, loss: 0.6480833292007446 2023-01-22 12:00:36.810350: step: 772/466, loss: 0.7749090790748596 2023-01-22 12:00:37.590254: step: 774/466, loss: 1.937896490097046 2023-01-22 12:00:38.333156: step: 776/466, loss: 1.2467749118804932 2023-01-22 12:00:39.037303: step: 778/466, loss: 0.29985183477401733 2023-01-22 12:00:39.797539: step: 780/466, loss: 0.7874993085861206 2023-01-22 12:00:40.674716: step: 782/466, loss: 0.5243517756462097 2023-01-22 12:00:41.450797: step: 784/466, loss: 1.106619119644165 2023-01-22 12:00:42.286871: step: 786/466, loss: 0.26240459084510803 2023-01-22 12:00:43.026482: step: 788/466, loss: 3.3374693393707275 2023-01-22 12:00:43.789282: step: 790/466, loss: 1.4162318706512451 2023-01-22 12:00:44.570374: step: 792/466, loss: 
1.6213815212249756 2023-01-22 12:00:45.393816: step: 794/466, loss: 3.1641407012939453 2023-01-22 12:00:46.217732: step: 796/466, loss: 1.010169267654419 2023-01-22 12:00:47.037407: step: 798/466, loss: 0.5032852292060852 2023-01-22 12:00:47.727223: step: 800/466, loss: 0.5835109949111938 2023-01-22 12:00:48.494665: step: 802/466, loss: 2.25563907623291 2023-01-22 12:00:49.169665: step: 804/466, loss: 1.6692702770233154 2023-01-22 12:00:49.967501: step: 806/466, loss: 1.2163097858428955 2023-01-22 12:00:50.678934: step: 808/466, loss: 0.15825329720973969 2023-01-22 12:00:51.452331: step: 810/466, loss: 2.151369571685791 2023-01-22 12:00:52.138786: step: 812/466, loss: 1.0254435539245605 2023-01-22 12:00:52.878288: step: 814/466, loss: 1.6680785417556763 2023-01-22 12:00:53.637304: step: 816/466, loss: 1.0526785850524902 2023-01-22 12:00:54.397023: step: 818/466, loss: 0.9085757732391357 2023-01-22 12:00:55.123635: step: 820/466, loss: 0.41828882694244385 2023-01-22 12:00:55.795765: step: 822/466, loss: 0.7217870354652405 2023-01-22 12:00:56.551822: step: 824/466, loss: 0.6949070692062378 2023-01-22 12:00:57.378657: step: 826/466, loss: 1.796603798866272 2023-01-22 12:00:58.114495: step: 828/466, loss: 1.1403634548187256 2023-01-22 12:00:58.840366: step: 830/466, loss: 6.5800652503967285 2023-01-22 12:00:59.679837: step: 832/466, loss: 1.1636403799057007 2023-01-22 12:01:00.455756: step: 834/466, loss: 0.7278444766998291 2023-01-22 12:01:01.259061: step: 836/466, loss: 0.711738646030426 2023-01-22 12:01:02.077513: step: 838/466, loss: 0.42984867095947266 2023-01-22 12:01:02.787196: step: 840/466, loss: 1.1451420783996582 2023-01-22 12:01:03.545158: step: 842/466, loss: 0.5262464284896851 2023-01-22 12:01:04.291327: step: 844/466, loss: 0.3005110025405884 2023-01-22 12:01:05.011288: step: 846/466, loss: 1.6486259698867798 2023-01-22 12:01:05.760477: step: 848/466, loss: 2.0297067165374756 2023-01-22 12:01:06.574275: step: 850/466, loss: 1.0829218626022339 2023-01-22 
12:01:07.325242: step: 852/466, loss: 0.3283446431159973 2023-01-22 12:01:08.101338: step: 854/466, loss: 0.4144214391708374 2023-01-22 12:01:08.903303: step: 856/466, loss: 1.1768455505371094 2023-01-22 12:01:09.608911: step: 858/466, loss: 0.47637295722961426 2023-01-22 12:01:10.385273: step: 860/466, loss: 1.35926353931427 2023-01-22 12:01:11.166496: step: 862/466, loss: 0.79765385389328 2023-01-22 12:01:11.931773: step: 864/466, loss: 2.7926888465881348 2023-01-22 12:01:12.664156: step: 866/466, loss: 1.0313583612442017 2023-01-22 12:01:13.447796: step: 868/466, loss: 0.9380786418914795 2023-01-22 12:01:14.245453: step: 870/466, loss: 1.7788538932800293 2023-01-22 12:01:15.030110: step: 872/466, loss: 12.130573272705078 2023-01-22 12:01:15.823684: step: 874/466, loss: 0.48626357316970825 2023-01-22 12:01:16.609740: step: 876/466, loss: 0.8553528189659119 2023-01-22 12:01:17.404353: step: 878/466, loss: 0.5449331998825073 2023-01-22 12:01:18.194061: step: 880/466, loss: 4.440307140350342 2023-01-22 12:01:18.963646: step: 882/466, loss: 2.0392606258392334 2023-01-22 12:01:19.775003: step: 884/466, loss: 0.5503759384155273 2023-01-22 12:01:20.573644: step: 886/466, loss: 0.6343894600868225 2023-01-22 12:01:21.318202: step: 888/466, loss: 0.6469244360923767 2023-01-22 12:01:22.113864: step: 890/466, loss: 1.1977115869522095 2023-01-22 12:01:22.954519: step: 892/466, loss: 0.896173357963562 2023-01-22 12:01:23.909882: step: 894/466, loss: 0.7121672630310059 2023-01-22 12:01:24.709591: step: 896/466, loss: 0.4969857633113861 2023-01-22 12:01:25.460706: step: 898/466, loss: 0.2636564075946808 2023-01-22 12:01:26.181431: step: 900/466, loss: 0.18109741806983948 2023-01-22 12:01:26.967904: step: 902/466, loss: 0.609916090965271 2023-01-22 12:01:27.773815: step: 904/466, loss: 1.431706428527832 2023-01-22 12:01:28.577577: step: 906/466, loss: 1.133394479751587 2023-01-22 12:01:29.294872: step: 908/466, loss: 0.38408052921295166 2023-01-22 12:01:30.048899: step: 910/466, 
loss: 0.21119973063468933 2023-01-22 12:01:30.755734: step: 912/466, loss: 4.096024036407471 2023-01-22 12:01:31.494239: step: 914/466, loss: 0.4555894732475281 2023-01-22 12:01:32.398798: step: 916/466, loss: 0.49550244212150574 2023-01-22 12:01:33.182075: step: 918/466, loss: 0.8274804949760437 2023-01-22 12:01:33.930923: step: 920/466, loss: 0.9337276220321655 2023-01-22 12:01:34.742196: step: 922/466, loss: 0.5701102614402771 2023-01-22 12:01:35.562364: step: 924/466, loss: 1.468475103378296 2023-01-22 12:01:36.429061: step: 926/466, loss: 0.5688779354095459 2023-01-22 12:01:37.178013: step: 928/466, loss: 0.829647958278656 2023-01-22 12:01:38.016919: step: 930/466, loss: 1.1818006038665771 2023-01-22 12:01:38.784518: step: 932/466, loss: 0.8422772884368896
==================================================
Loss: 1.159
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3067315481548155, 'r': 0.2939268155942729, 'f1': 0.30019269732205783}, 'combined': 0.22119461907941101, 'epoch': 4}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3301358185595464, 'r': 0.23397698463165778, 'f1': 0.27386077007468695}, 'combined': 0.168324180631271, 'epoch': 4}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2872336074821183, 'r': 0.30140452549831387, 'f1': 0.29414849062520626}, 'combined': 0.21674099309225722, 'epoch': 4}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3189267202303506, 'r': 0.24526457983112604, 'f1': 0.2772868990560212}, 'combined': 0.17042999649296914, 'epoch': 4}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3049337367947321, 'r': 0.2922040551828078, 'f1': 0.2984332113979452}, 'combined': 0.219898155766907, 'epoch': 4}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.32651807117430914, 'r': 0.23536393375918221, 'f1': 0.2735470330079094}, 'combined': 0.16895552038723818, 'epoch': 4}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2403846153846154, 'r': 0.35714285714285715, 'f1': 0.28735632183908044}, 'combined': 0.19157088122605362, 'epoch': 4}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29411764705882354, 'r': 0.43478260869565216, 'f1': 0.3508771929824562}, 'combined': 0.1754385964912281, 'epoch': 4}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3125, 'r': 0.1724137931034483, 'f1': 0.22222222222222224}, 'combined': 0.14814814814814814, 'epoch': 4}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3067315481548155, 'r': 0.2939268155942729, 'f1': 0.30019269732205783}, 'combined': 0.22119461907941101, 'epoch': 4}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3301358185595464, 'r': 0.23397698463165778, 'f1': 0.27386077007468695}, 'combined': 0.168324180631271, 'epoch': 4}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2403846153846154, 'r': 0.35714285714285715, 'f1': 0.28735632183908044}, 'combined': 0.19157088122605362, 'epoch': 4}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791706942517267, 'r': 0.28552752220432764, 'f1': 0.28231332870859416}, 'combined': 0.2080203474694904, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.306518116903463, 'r': 0.2717829323754754, 'f1': 0.28810735426506145}, 'combined': 0.1770806177434036, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29375, 'r': 0.5108695652173914, 'f1': 0.3730158730158731}, 'combined': 0.18650793650793654, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30090982880755607, 'r': 0.27635741393331525, 'f1': 0.2881114879186096}, 'combined': 0.21229267530844914, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.33152736310705944, 'r': 0.26648266870316795, 'f1': 0.29546760679402523}, 'combined': 0.1824946983139568, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.35714285714285715, 'r': 0.1724137931034483, 'f1': 0.23255813953488377}, 'combined': 0.1550387596899225, 'epoch': 3}
******************************
Epoch: 5
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:04:29.291048: step: 2/466, loss: 0.9809449911117554 2023-01-22 12:04:30.043224: step: 4/466, loss: 0.9840978980064392 2023-01-22 12:04:30.839005: step: 6/466, loss: 0.43336671590805054 2023-01-22 12:04:31.636094: step: 8/466, loss: 1.2338881492614746 2023-01-22 12:04:32.405999: step: 10/466, loss: 0.5673670172691345 2023-01-22 12:04:33.243416: step: 12/466, loss: 0.8279459476470947 2023-01-22 12:04:34.015762: step: 14/466, loss: 0.7558691501617432 2023-01-22 12:04:34.797778: step: 16/466, loss: 0.34479042887687683 2023-01-22 12:04:35.446403: step: 18/466, loss: 2.8193399906158447 2023-01-22 12:04:36.352243: step: 20/466, loss: 0.6758497953414917 2023-01-22 12:04:37.105725: step: 22/466, loss: 1.0843076705932617 2023-01-22
12:04:37.972955: step: 24/466, loss: 0.7437761425971985 2023-01-22 12:04:38.651563: step: 26/466, loss: 0.8486707806587219 2023-01-22 12:04:39.471786: step: 28/466, loss: 0.17709881067276 2023-01-22 12:04:40.237806: step: 30/466, loss: 0.9467667937278748 2023-01-22 12:04:41.016154: step: 32/466, loss: 0.27025699615478516 2023-01-22 12:04:41.736444: step: 34/466, loss: 0.20740818977355957 2023-01-22 12:04:42.486408: step: 36/466, loss: 0.583893358707428 2023-01-22 12:04:43.243687: step: 38/466, loss: 0.26786521077156067 2023-01-22 12:04:44.020146: step: 40/466, loss: 0.38853883743286133 2023-01-22 12:04:44.799002: step: 42/466, loss: 0.2629576325416565 2023-01-22 12:04:45.592066: step: 44/466, loss: 3.393561840057373 2023-01-22 12:04:46.332670: step: 46/466, loss: 0.9243699312210083 2023-01-22 12:04:47.080054: step: 48/466, loss: 0.3238946497440338 2023-01-22 12:04:47.830144: step: 50/466, loss: 0.34954404830932617 2023-01-22 12:04:48.641757: step: 52/466, loss: 1.080365538597107 2023-01-22 12:04:49.402003: step: 54/466, loss: 0.6035301089286804 2023-01-22 12:04:50.123404: step: 56/466, loss: 0.1855875700712204 2023-01-22 12:04:50.904367: step: 58/466, loss: 0.17876757681369781 2023-01-22 12:04:51.713541: step: 60/466, loss: 0.3543803095817566 2023-01-22 12:04:52.465368: step: 62/466, loss: 1.1884087324142456 2023-01-22 12:04:53.170690: step: 64/466, loss: 0.3980889320373535 2023-01-22 12:04:54.026744: step: 66/466, loss: 0.3353211283683777 2023-01-22 12:04:54.844082: step: 68/466, loss: 0.8751363754272461 2023-01-22 12:04:55.626921: step: 70/466, loss: 0.6269444227218628 2023-01-22 12:04:56.461278: step: 72/466, loss: 0.6170547604560852 2023-01-22 12:04:57.304032: step: 74/466, loss: 0.9494127631187439 2023-01-22 12:04:58.071376: step: 76/466, loss: 0.4018596112728119 2023-01-22 12:04:58.833332: step: 78/466, loss: 4.271050453186035 2023-01-22 12:04:59.590873: step: 80/466, loss: 0.2668708860874176 2023-01-22 12:05:00.400251: step: 82/466, loss: 0.1476602405309677 
2023-01-22 12:05:01.148114: step: 84/466, loss: 0.5470862984657288 2023-01-22 12:05:01.948380: step: 86/466, loss: 0.18264155089855194 2023-01-22 12:05:02.710251: step: 88/466, loss: 0.8571751117706299 2023-01-22 12:05:03.559146: step: 90/466, loss: 0.19562962651252747 2023-01-22 12:05:04.318334: step: 92/466, loss: 0.5648656487464905 2023-01-22 12:05:05.026472: step: 94/466, loss: 0.09153655171394348 2023-01-22 12:05:05.753384: step: 96/466, loss: 0.2601352334022522 2023-01-22 12:05:06.471382: step: 98/466, loss: 0.8876150250434875 2023-01-22 12:05:07.226656: step: 100/466, loss: 0.6579775214195251 2023-01-22 12:05:08.004891: step: 102/466, loss: 0.4219387173652649 2023-01-22 12:05:08.709413: step: 104/466, loss: 1.411496639251709 2023-01-22 12:05:09.469804: step: 106/466, loss: 0.7639531493186951 2023-01-22 12:05:10.296001: step: 108/466, loss: 0.375456303358078 2023-01-22 12:05:11.023007: step: 110/466, loss: 0.47441768646240234 2023-01-22 12:05:11.802131: step: 112/466, loss: 0.8059460520744324 2023-01-22 12:05:12.549354: step: 114/466, loss: 0.5893727540969849 2023-01-22 12:05:13.261421: step: 116/466, loss: 0.29101523756980896 2023-01-22 12:05:14.091837: step: 118/466, loss: 2.5849218368530273 2023-01-22 12:05:14.891242: step: 120/466, loss: 0.2512795925140381 2023-01-22 12:05:15.725825: step: 122/466, loss: 0.6373374462127686 2023-01-22 12:05:16.524842: step: 124/466, loss: 0.3568679988384247 2023-01-22 12:05:17.330811: step: 126/466, loss: 0.5779368281364441 2023-01-22 12:05:18.079239: step: 128/466, loss: 0.26844117045402527 2023-01-22 12:05:18.889925: step: 130/466, loss: 0.29416975378990173 2023-01-22 12:05:19.621426: step: 132/466, loss: 1.4655174016952515 2023-01-22 12:05:20.385070: step: 134/466, loss: 1.2914477586746216 2023-01-22 12:05:21.149421: step: 136/466, loss: 3.8348920345306396 2023-01-22 12:05:21.904830: step: 138/466, loss: 0.15896697342395782 2023-01-22 12:05:22.587532: step: 140/466, loss: 0.8044726848602295 2023-01-22 12:05:23.290759: 
step: 142/466, loss: 0.2973330616950989 2023-01-22 12:05:24.088334: step: 144/466, loss: 1.2005521059036255 2023-01-22 12:05:24.806018: step: 146/466, loss: 0.52372145652771 2023-01-22 12:05:25.561146: step: 148/466, loss: 0.33128097653388977 2023-01-22 12:05:26.340948: step: 150/466, loss: 0.3678876459598541 2023-01-22 12:05:27.104858: step: 152/466, loss: 0.6063940525054932 2023-01-22 12:05:27.846446: step: 154/466, loss: 1.0063389539718628 2023-01-22 12:05:28.679148: step: 156/466, loss: 2.022432327270508 2023-01-22 12:05:29.452433: step: 158/466, loss: 0.3305632174015045 2023-01-22 12:05:30.191992: step: 160/466, loss: 0.5475694537162781 2023-01-22 12:05:30.974804: step: 162/466, loss: 0.7513313293457031 2023-01-22 12:05:31.740600: step: 164/466, loss: 3.8323545455932617 2023-01-22 12:05:32.492486: step: 166/466, loss: 1.0958454608917236 2023-01-22 12:05:33.314390: step: 168/466, loss: 3.309696912765503 2023-01-22 12:05:34.068732: step: 170/466, loss: 0.9720345735549927 2023-01-22 12:05:34.821685: step: 172/466, loss: 1.1154394149780273 2023-01-22 12:05:35.610691: step: 174/466, loss: 0.9783151149749756 2023-01-22 12:05:36.429768: step: 176/466, loss: 0.3279179632663727 2023-01-22 12:05:37.179622: step: 178/466, loss: 1.1613959074020386 2023-01-22 12:05:38.045482: step: 180/466, loss: 0.2454126924276352 2023-01-22 12:05:38.833037: step: 182/466, loss: 0.409664124250412 2023-01-22 12:05:39.577597: step: 184/466, loss: 0.8884005546569824 2023-01-22 12:05:40.330803: step: 186/466, loss: 0.6842936277389526 2023-01-22 12:05:41.105569: step: 188/466, loss: 0.8189715147018433 2023-01-22 12:05:41.881052: step: 190/466, loss: 0.2442971169948578 2023-01-22 12:05:42.694917: step: 192/466, loss: 0.9736469984054565 2023-01-22 12:05:43.514110: step: 194/466, loss: 0.9577223062515259 2023-01-22 12:05:44.258881: step: 196/466, loss: 0.4329565763473511 2023-01-22 12:05:44.994877: step: 198/466, loss: 0.7579882144927979 2023-01-22 12:05:45.719523: step: 200/466, loss: 
1.7942726612091064 2023-01-22 12:05:46.523760: step: 202/466, loss: 0.906710684299469 2023-01-22 12:05:47.262126: step: 204/466, loss: 1.0168603658676147 2023-01-22 12:05:47.989216: step: 206/466, loss: 0.24140344560146332 2023-01-22 12:05:48.779367: step: 208/466, loss: 0.17037667334079742 2023-01-22 12:05:49.526993: step: 210/466, loss: 2.784151554107666 2023-01-22 12:05:50.336002: step: 212/466, loss: 0.17526978254318237 2023-01-22 12:05:51.042609: step: 214/466, loss: 0.5849495530128479 2023-01-22 12:05:51.843920: step: 216/466, loss: 0.19523198902606964 2023-01-22 12:05:52.595064: step: 218/466, loss: 0.44503164291381836 2023-01-22 12:05:53.455858: step: 220/466, loss: 0.33854418992996216 2023-01-22 12:05:54.206740: step: 222/466, loss: 0.9513136744499207 2023-01-22 12:05:55.022376: step: 224/466, loss: 1.576082706451416 2023-01-22 12:05:55.723320: step: 226/466, loss: 0.6007593870162964 2023-01-22 12:05:56.551141: step: 228/466, loss: 0.38531842827796936 2023-01-22 12:05:57.291807: step: 230/466, loss: 0.6435443758964539 2023-01-22 12:05:58.151604: step: 232/466, loss: 0.3349244296550751 2023-01-22 12:05:58.889859: step: 234/466, loss: 0.4726037383079529 2023-01-22 12:05:59.590698: step: 236/466, loss: 1.0641672611236572 2023-01-22 12:06:00.279691: step: 238/466, loss: 0.6876202821731567 2023-01-22 12:06:01.198508: step: 240/466, loss: 0.9148756265640259 2023-01-22 12:06:01.914853: step: 242/466, loss: 0.7800842523574829 2023-01-22 12:06:02.684769: step: 244/466, loss: 0.427781879901886 2023-01-22 12:06:03.466402: step: 246/466, loss: 1.4870994091033936 2023-01-22 12:06:04.406363: step: 248/466, loss: 0.8990872502326965 2023-01-22 12:06:05.296240: step: 250/466, loss: 1.415848970413208 2023-01-22 12:06:06.102372: step: 252/466, loss: 0.5700762867927551 2023-01-22 12:06:06.884478: step: 254/466, loss: 7.834717750549316 2023-01-22 12:06:07.674113: step: 256/466, loss: 1.9557559490203857 2023-01-22 12:06:08.402511: step: 258/466, loss: 1.1228989362716675 
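The evaluation blocks in this log report template and slot precision/recall/F1 plus a 'combined' score per language. From the logged numbers, 'combined' appears to be the product of template F1 and slot F1 (e.g. 0.7368421052… × 0.3001926973… ≈ 0.2211946190… for the epoch-4 Dev Chinese entry). A minimal sketch of that scoring, assuming standard F1 and with `combined_score` as a hypothetical name:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    # Assumption inferred from the log: 'combined' = template_f1 * slot_f1.
    return template["f1"] * slot["f1"]

# Epoch-4 Dev Chinese precision/recall values copied from the log above.
template = {"p": 1.0, "r": 0.5833333333333334,
            "f1": f1(1.0, 0.5833333333333334)}
slot = {"p": 0.3067315481548155, "r": 0.2939268155942729,
        "f1": f1(0.3067315481548155, 0.2939268155942729)}
print(combined_score(template, slot))  # ≈ 0.22119461907941101, matching the log
```

The recovered F1 values (0.7368421052631579 and 0.30019269732205783) agree with the dict fields printed in the log, which supports the product interpretation.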
2023-01-22 12:06:09.183910: step: 260/466, loss: 1.1477344036102295 2023-01-22 12:06:09.860134: step: 262/466, loss: 0.5251627564430237 2023-01-22 12:06:10.745615: step: 264/466, loss: 0.4231213927268982 2023-01-22 12:06:11.581339: step: 266/466, loss: 1.6289584636688232 2023-01-22 12:06:12.334526: step: 268/466, loss: 1.3237013816833496 2023-01-22 12:06:13.091151: step: 270/466, loss: 1.6523653268814087 2023-01-22 12:06:13.907849: step: 272/466, loss: 1.1716986894607544 2023-01-22 12:06:14.647553: step: 274/466, loss: 0.25267747044563293 2023-01-22 12:06:15.350053: step: 276/466, loss: 0.6228837370872498 2023-01-22 12:06:16.149749: step: 278/466, loss: 1.0324732065200806 2023-01-22 12:06:16.945700: step: 280/466, loss: 0.3614393472671509 2023-01-22 12:06:17.750407: step: 282/466, loss: 0.6208564639091492 2023-01-22 12:06:18.461968: step: 284/466, loss: 0.27702635526657104 2023-01-22 12:06:19.291199: step: 286/466, loss: 0.5526717901229858 2023-01-22 12:06:20.115235: step: 288/466, loss: 0.6723749041557312 2023-01-22 12:06:20.998774: step: 290/466, loss: 0.4218622148036957 2023-01-22 12:06:21.892551: step: 292/466, loss: 0.776480495929718 2023-01-22 12:06:22.768281: step: 294/466, loss: 0.6810406446456909 2023-01-22 12:06:23.495973: step: 296/466, loss: 1.0859265327453613 2023-01-22 12:06:24.247371: step: 298/466, loss: 0.3822017312049866 2023-01-22 12:06:25.026078: step: 300/466, loss: 1.7835806608200073 2023-01-22 12:06:25.765466: step: 302/466, loss: 0.3045477569103241 2023-01-22 12:06:26.589904: step: 304/466, loss: 0.27006521821022034 2023-01-22 12:06:27.275130: step: 306/466, loss: 0.22388720512390137 2023-01-22 12:06:28.036533: step: 308/466, loss: 1.2394423484802246 2023-01-22 12:06:28.740277: step: 310/466, loss: 0.9624764323234558 2023-01-22 12:06:29.412041: step: 312/466, loss: 2.5301806926727295 2023-01-22 12:06:30.227989: step: 314/466, loss: 1.4377882480621338 2023-01-22 12:06:30.996253: step: 316/466, loss: 1.201191782951355 2023-01-22 
12:06:31.776826: step: 318/466, loss: 0.6634922027587891 2023-01-22 12:06:32.639908: step: 320/466, loss: 1.0930447578430176 2023-01-22 12:06:33.350645: step: 322/466, loss: 0.34051230549812317 2023-01-22 12:06:34.132894: step: 324/466, loss: 0.29256054759025574 2023-01-22 12:06:34.972012: step: 326/466, loss: 2.6791293621063232 2023-01-22 12:06:35.717382: step: 328/466, loss: 0.7095626592636108 2023-01-22 12:06:36.386070: step: 330/466, loss: 1.0395307540893555 2023-01-22 12:06:37.221690: step: 332/466, loss: 0.4586993455886841 2023-01-22 12:06:37.953730: step: 334/466, loss: 0.5928367972373962 2023-01-22 12:06:38.652790: step: 336/466, loss: 0.4381711483001709 2023-01-22 12:06:39.497568: step: 338/466, loss: 6.912117958068848 2023-01-22 12:06:40.332176: step: 340/466, loss: 0.6078882217407227 2023-01-22 12:06:41.051871: step: 342/466, loss: 1.0685304403305054 2023-01-22 12:06:41.883288: step: 344/466, loss: 0.21866194903850555 2023-01-22 12:06:42.729889: step: 346/466, loss: 2.0231590270996094 2023-01-22 12:06:43.499953: step: 348/466, loss: 0.6598676443099976 2023-01-22 12:06:44.231040: step: 350/466, loss: 0.28906044363975525 2023-01-22 12:06:45.003714: step: 352/466, loss: 1.618789792060852 2023-01-22 12:06:45.744566: step: 354/466, loss: 1.2375966310501099 2023-01-22 12:06:46.460318: step: 356/466, loss: 0.8457720875740051 2023-01-22 12:06:47.221417: step: 358/466, loss: 0.2892751395702362 2023-01-22 12:06:48.010646: step: 360/466, loss: 0.2970247268676758 2023-01-22 12:06:48.776914: step: 362/466, loss: 0.39713749289512634 2023-01-22 12:06:49.641850: step: 364/466, loss: 0.9225302338600159 2023-01-22 12:06:50.413765: step: 366/466, loss: 0.5669893026351929 2023-01-22 12:06:51.160234: step: 368/466, loss: 1.1761888265609741 2023-01-22 12:06:51.941386: step: 370/466, loss: 1.6809601783752441 2023-01-22 12:06:52.667587: step: 372/466, loss: 0.9083019495010376 2023-01-22 12:06:53.408274: step: 374/466, loss: 1.4838683605194092 2023-01-22 12:06:54.268472: step: 
376/466, loss: 1.4812084436416626 2023-01-22 12:06:55.033694: step: 378/466, loss: 0.3336334228515625 2023-01-22 12:06:55.760616: step: 380/466, loss: 0.2552716135978699 2023-01-22 12:06:56.542576: step: 382/466, loss: 0.4424442946910858 2023-01-22 12:06:57.441487: step: 384/466, loss: 0.5649930238723755 2023-01-22 12:06:58.201288: step: 386/466, loss: 0.9104715585708618 2023-01-22 12:06:58.979124: step: 388/466, loss: 0.6600866913795471 2023-01-22 12:06:59.768227: step: 390/466, loss: 0.5454346537590027 2023-01-22 12:07:00.417556: step: 392/466, loss: 0.26466286182403564 2023-01-22 12:07:01.203634: step: 394/466, loss: 0.40282976627349854 2023-01-22 12:07:01.942211: step: 396/466, loss: 0.1367294043302536 2023-01-22 12:07:02.673249: step: 398/466, loss: 2.8048770427703857 2023-01-22 12:07:03.343583: step: 400/466, loss: 0.240428626537323 2023-01-22 12:07:04.073945: step: 402/466, loss: 0.4536705017089844 2023-01-22 12:07:04.886196: step: 404/466, loss: 0.6393134593963623 2023-01-22 12:07:05.657023: step: 406/466, loss: 0.4184322953224182 2023-01-22 12:07:06.427470: step: 408/466, loss: 3.7611472606658936 2023-01-22 12:07:07.392702: step: 410/466, loss: 0.3797491490840912 2023-01-22 12:07:08.122523: step: 412/466, loss: 0.8127273321151733 2023-01-22 12:07:08.848426: step: 414/466, loss: 0.5759282112121582 2023-01-22 12:07:09.611028: step: 416/466, loss: 0.8959246277809143 2023-01-22 12:07:10.364802: step: 418/466, loss: 0.698711633682251 2023-01-22 12:07:11.093011: step: 420/466, loss: 1.1171255111694336 2023-01-22 12:07:11.822466: step: 422/466, loss: 1.4768218994140625 2023-01-22 12:07:12.556724: step: 424/466, loss: 1.5017131567001343 2023-01-22 12:07:13.299084: step: 426/466, loss: 0.3037756681442261 2023-01-22 12:07:14.207482: step: 428/466, loss: 1.1463508605957031 2023-01-22 12:07:14.991699: step: 430/466, loss: 0.897317111492157 2023-01-22 12:07:15.740062: step: 432/466, loss: 1.296699047088623 2023-01-22 12:07:16.507612: step: 434/466, loss: 
0.975659191608429 2023-01-22 12:07:17.314808: step: 436/466, loss: 0.4669412076473236 2023-01-22 12:07:18.052364: step: 438/466, loss: 2.3514862060546875 2023-01-22 12:07:18.776970: step: 440/466, loss: 0.5584709048271179 2023-01-22 12:07:19.503671: step: 442/466, loss: 1.1041259765625 2023-01-22 12:07:20.286193: step: 444/466, loss: 0.24737270176410675 2023-01-22 12:07:21.118950: step: 446/466, loss: 0.5634731650352478 2023-01-22 12:07:21.850041: step: 448/466, loss: 0.5878589153289795 2023-01-22 12:07:22.572340: step: 450/466, loss: 0.5585016012191772 2023-01-22 12:07:23.357941: step: 452/466, loss: 0.5468478202819824 2023-01-22 12:07:24.174930: step: 454/466, loss: 0.30728453397750854 2023-01-22 12:07:24.912218: step: 456/466, loss: 0.6326717734336853 2023-01-22 12:07:25.674761: step: 458/466, loss: 2.432973861694336 2023-01-22 12:07:26.501427: step: 460/466, loss: 1.4347832202911377 2023-01-22 12:07:27.257139: step: 462/466, loss: 1.02238929271698 2023-01-22 12:07:28.059509: step: 464/466, loss: 0.8291895389556885 2023-01-22 12:07:28.876804: step: 466/466, loss: 0.7763910889625549 2023-01-22 12:07:29.779354: step: 468/466, loss: 1.2875326871871948 2023-01-22 12:07:30.597670: step: 470/466, loss: 0.9274290800094604 2023-01-22 12:07:31.403928: step: 472/466, loss: 0.7256883978843689 2023-01-22 12:07:32.352813: step: 474/466, loss: 0.2669588029384613 2023-01-22 12:07:33.092197: step: 476/466, loss: 0.35543161630630493 2023-01-22 12:07:33.851438: step: 478/466, loss: 0.21915604174137115 2023-01-22 12:07:34.590632: step: 480/466, loss: 1.8925755023956299 2023-01-22 12:07:35.407580: step: 482/466, loss: 0.35683223605155945 2023-01-22 12:07:36.201704: step: 484/466, loss: 1.3415595293045044 2023-01-22 12:07:36.980753: step: 486/466, loss: 1.1984599828720093 2023-01-22 12:07:37.758198: step: 488/466, loss: 5.187197208404541 2023-01-22 12:07:38.526054: step: 490/466, loss: 0.2753686308860779 2023-01-22 12:07:39.265057: step: 492/466, loss: 0.2652628421783447 2023-01-22 
12:07:39.968244: step: 494/466, loss: 1.2016327381134033 2023-01-22 12:07:40.764055: step: 496/466, loss: 0.6046364903450012 2023-01-22 12:07:41.547261: step: 498/466, loss: 0.8817658424377441 2023-01-22 12:07:42.358890: step: 500/466, loss: 0.38747915625572205 2023-01-22 12:07:43.213158: step: 502/466, loss: 0.7070812582969666 2023-01-22 12:07:43.932873: step: 504/466, loss: 0.3789001703262329 2023-01-22 12:07:44.639087: step: 506/466, loss: 0.9974977374076843 2023-01-22 12:07:45.386904: step: 508/466, loss: 0.4238806366920471 2023-01-22 12:07:46.124622: step: 510/466, loss: 0.3909887373447418 2023-01-22 12:07:46.860281: step: 512/466, loss: 0.32775312662124634 2023-01-22 12:07:47.615938: step: 514/466, loss: 0.47133827209472656 2023-01-22 12:07:48.347961: step: 516/466, loss: 0.8088892102241516 2023-01-22 12:07:49.136596: step: 518/466, loss: 0.7348328232765198 2023-01-22 12:07:49.900491: step: 520/466, loss: 2.851290225982666 2023-01-22 12:07:50.624984: step: 522/466, loss: 0.6305214166641235 2023-01-22 12:07:51.371690: step: 524/466, loss: 0.8245713710784912 2023-01-22 12:07:52.204870: step: 526/466, loss: 0.8373036980628967 2023-01-22 12:07:52.931513: step: 528/466, loss: 1.6329104900360107 2023-01-22 12:07:53.753565: step: 530/466, loss: 0.21834726631641388 2023-01-22 12:07:54.470581: step: 532/466, loss: 0.2542460262775421 2023-01-22 12:07:55.147866: step: 534/466, loss: 0.741761326789856 2023-01-22 12:07:55.897557: step: 536/466, loss: 0.15138760209083557 2023-01-22 12:07:56.672103: step: 538/466, loss: 1.3336399793624878 2023-01-22 12:07:57.457590: step: 540/466, loss: 3.1425106525421143 2023-01-22 12:07:58.271129: step: 542/466, loss: 0.24177254736423492 2023-01-22 12:07:59.112384: step: 544/466, loss: 18.67690086364746 2023-01-22 12:07:59.787218: step: 546/466, loss: 0.42344048619270325 2023-01-22 12:08:00.550316: step: 548/466, loss: 1.8342130184173584 2023-01-22 12:08:01.354070: step: 550/466, loss: 0.5240026116371155 2023-01-22 12:08:02.146144: step: 
552/466, loss: 0.2808763086795807 2023-01-22 12:08:03.028540: step: 554/466, loss: 1.0640990734100342 2023-01-22 12:08:03.805281: step: 556/466, loss: 0.8659460544586182 2023-01-22 12:08:04.547321: step: 558/466, loss: 0.5987705588340759 2023-01-22 12:08:05.293420: step: 560/466, loss: 1.1994832754135132 2023-01-22 12:08:06.129603: step: 562/466, loss: 0.2609291076660156 2023-01-22 12:08:06.962311: step: 564/466, loss: 0.9732968807220459 2023-01-22 12:08:07.721439: step: 566/466, loss: 1.5298984050750732 2023-01-22 12:08:08.487676: step: 568/466, loss: 0.4171451926231384 2023-01-22 12:08:09.246918: step: 570/466, loss: 0.2139699012041092 2023-01-22 12:08:10.007598: step: 572/466, loss: 0.9007517099380493 2023-01-22 12:08:10.731127: step: 574/466, loss: 0.17563261091709137 2023-01-22 12:08:11.494250: step: 576/466, loss: 1.719172716140747 2023-01-22 12:08:12.295666: step: 578/466, loss: 1.16365385055542 2023-01-22 12:08:12.975053: step: 580/466, loss: 0.7707199454307556 2023-01-22 12:08:13.677771: step: 582/466, loss: 0.9323350191116333 2023-01-22 12:08:14.371863: step: 584/466, loss: 0.7965724468231201 2023-01-22 12:08:15.105374: step: 586/466, loss: 1.6608428955078125 2023-01-22 12:08:15.847540: step: 588/466, loss: 1.4693870544433594 2023-01-22 12:08:16.708847: step: 590/466, loss: 1.5551859140396118 2023-01-22 12:08:17.479808: step: 592/466, loss: 0.7750275135040283 2023-01-22 12:08:18.252506: step: 594/466, loss: 0.2587835192680359 2023-01-22 12:08:18.960818: step: 596/466, loss: 0.2051980197429657 2023-01-22 12:08:19.772067: step: 598/466, loss: 0.41535842418670654 2023-01-22 12:08:20.614852: step: 600/466, loss: 0.62410968542099 2023-01-22 12:08:21.348469: step: 602/466, loss: 0.3837363123893738 2023-01-22 12:08:22.049641: step: 604/466, loss: 0.22830593585968018 2023-01-22 12:08:22.857474: step: 606/466, loss: 0.7377241253852844 2023-01-22 12:08:23.668475: step: 608/466, loss: 0.8769369125366211 2023-01-22 12:08:24.474927: step: 610/466, loss: 
0.6371053457260132 2023-01-22 12:08:25.200594: step: 612/466, loss: 0.5381723046302795 2023-01-22 12:08:25.985089: step: 614/466, loss: 0.21260666847229004 2023-01-22 12:08:26.766747: step: 616/466, loss: 0.7615000009536743 2023-01-22 12:08:27.551193: step: 618/466, loss: 0.9144954085350037 2023-01-22 12:08:28.274663: step: 620/466, loss: 0.6950018405914307 2023-01-22 12:08:28.948755: step: 622/466, loss: 1.596514105796814 2023-01-22 12:08:29.784602: step: 624/466, loss: 1.3464326858520508 2023-01-22 12:08:30.530720: step: 626/466, loss: 0.23497076332569122 2023-01-22 12:08:31.332267: step: 628/466, loss: 0.43337422609329224 2023-01-22 12:08:32.071988: step: 630/466, loss: 0.8914546966552734 2023-01-22 12:08:32.774077: step: 632/466, loss: 4.810173511505127 2023-01-22 12:08:33.633488: step: 634/466, loss: 1.0508265495300293 2023-01-22 12:08:34.364163: step: 636/466, loss: 0.9275634288787842 2023-01-22 12:08:35.096599: step: 638/466, loss: 1.0370503664016724 2023-01-22 12:08:35.899113: step: 640/466, loss: 0.1253279745578766 2023-01-22 12:08:36.686112: step: 642/466, loss: 0.9695085883140564 2023-01-22 12:08:37.508165: step: 644/466, loss: 0.6428408026695251 2023-01-22 12:08:38.229470: step: 646/466, loss: 0.29084670543670654 2023-01-22 12:08:39.022580: step: 648/466, loss: 0.23787152767181396 2023-01-22 12:08:39.780114: step: 650/466, loss: 0.9056107401847839 2023-01-22 12:08:40.477379: step: 652/466, loss: 0.588067889213562 2023-01-22 12:08:41.239725: step: 654/466, loss: 1.6399379968643188 2023-01-22 12:08:42.106093: step: 656/466, loss: 0.40053674578666687 2023-01-22 12:08:42.871504: step: 658/466, loss: 0.2362254559993744 2023-01-22 12:08:43.660871: step: 660/466, loss: 0.42605677247047424 2023-01-22 12:08:44.385831: step: 662/466, loss: 0.5030765533447266 2023-01-22 12:08:45.123958: step: 664/466, loss: 0.21947017312049866 2023-01-22 12:08:45.868429: step: 666/466, loss: 0.6463884115219116 2023-01-22 12:08:46.626589: step: 668/466, loss: 2.0798840522766113 
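Each training entry in this log follows the pattern `YYYY-MM-DD HH:MM:SS.micro: step: N/466, loss: X`, and the per-epoch `Loss:` summary looks like the mean of the step losses rounded to three decimals. A small hypothetical parser for pulling the loss curve back out of a log chunk, assuming only that line format:

```python
import re

# Matches entries like "... step: 612/466, loss: 0.5381723046302795"
STEP_RE = re.compile(r"step:\s*(\d+)/(\d+),\s*loss:\s*([0-9.]+)")

def parse_losses(log_text: str) -> list[float]:
    """Extract every per-step loss value, in order, from a log chunk."""
    return [float(m.group(3)) for m in STEP_RE.finditer(log_text)]

# Two entries copied from the log above; real usage would read the whole file.
sample = ("2023-01-22 12:08:25.200594: step: 612/466, loss: 0.5381723046302795 "
          "2023-01-22 12:08:25.985089: step: 614/466, loss: 0.21260666847229004")
losses = parse_losses(sample)
print(len(losses), sum(losses) / len(losses))
```

Because the extraction collapsed the original line breaks, a regex scan over the raw text like this is more robust here than splitting on newlines.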
2023-01-22 12:08:47.351612: step: 670/466, loss: 0.25600212812423706 2023-01-22 12:08:48.131840: step: 672/466, loss: 1.530958890914917 2023-01-22 12:08:48.899840: step: 674/466, loss: 0.5654168725013733 2023-01-22 12:08:49.716626: step: 676/466, loss: 0.47959205508232117 2023-01-22 12:08:50.452952: step: 678/466, loss: 0.8467852473258972 2023-01-22 12:08:51.212786: step: 680/466, loss: 0.7995301485061646 2023-01-22 12:08:51.948557: step: 682/466, loss: 0.8343058228492737 2023-01-22 12:08:52.685864: step: 684/466, loss: 0.2655002474784851 2023-01-22 12:08:53.386481: step: 686/466, loss: 0.5379340648651123 2023-01-22 12:08:54.090277: step: 688/466, loss: 0.17376302182674408 2023-01-22 12:08:54.874204: step: 690/466, loss: 0.7422475814819336 2023-01-22 12:08:55.709008: step: 692/466, loss: 0.5891343355178833 2023-01-22 12:08:56.426267: step: 694/466, loss: 0.6532750129699707 2023-01-22 12:08:57.275099: step: 696/466, loss: 0.570507287979126 2023-01-22 12:08:58.034562: step: 698/466, loss: 1.0127125978469849 2023-01-22 12:08:58.781609: step: 700/466, loss: 0.2279486060142517 2023-01-22 12:08:59.522442: step: 702/466, loss: 0.6048024892807007 2023-01-22 12:09:00.309452: step: 704/466, loss: 1.9379174709320068 2023-01-22 12:09:01.111346: step: 706/466, loss: 0.3877120614051819 2023-01-22 12:09:01.912034: step: 708/466, loss: 0.3327438235282898 2023-01-22 12:09:02.676531: step: 710/466, loss: 0.12455499172210693 2023-01-22 12:09:03.423603: step: 712/466, loss: 1.2360749244689941 2023-01-22 12:09:04.146130: step: 714/466, loss: 7.34765100479126 2023-01-22 12:09:04.953811: step: 716/466, loss: 0.48359206318855286 2023-01-22 12:09:05.769484: step: 718/466, loss: 0.2707675099372864 2023-01-22 12:09:06.558504: step: 720/466, loss: 0.4693552851676941 2023-01-22 12:09:07.272864: step: 722/466, loss: 0.7341639995574951 2023-01-22 12:09:08.037850: step: 724/466, loss: 0.7864471077919006 2023-01-22 12:09:08.708599: step: 726/466, loss: 0.5092653632164001 2023-01-22 
12:09:09.495764: step: 728/466, loss: 0.47015196084976196 2023-01-22 12:09:10.330900: step: 730/466, loss: 0.7482075095176697 2023-01-22 12:09:11.042740: step: 732/466, loss: 0.14449910819530487 2023-01-22 12:09:11.781590: step: 734/466, loss: 0.42004555463790894 2023-01-22 12:09:12.626663: step: 736/466, loss: 1.316415548324585 2023-01-22 12:09:13.505251: step: 738/466, loss: 1.8354363441467285 2023-01-22 12:09:14.358505: step: 740/466, loss: 0.6618088483810425 2023-01-22 12:09:15.145180: step: 742/466, loss: 0.41975152492523193 2023-01-22 12:09:15.966317: step: 744/466, loss: 1.0764285326004028 2023-01-22 12:09:16.710191: step: 746/466, loss: 0.16238969564437866 2023-01-22 12:09:17.500763: step: 748/466, loss: 1.6173679828643799 2023-01-22 12:09:18.374086: step: 750/466, loss: 2.1954853534698486 2023-01-22 12:09:19.193907: step: 752/466, loss: 0.3339827358722687 2023-01-22 12:09:19.937175: step: 754/466, loss: 0.7078976035118103 2023-01-22 12:09:20.689738: step: 756/466, loss: 0.5993287563323975 2023-01-22 12:09:21.492534: step: 758/466, loss: 0.6090127825737 2023-01-22 12:09:22.309335: step: 760/466, loss: 0.8635631799697876 2023-01-22 12:09:23.034759: step: 762/466, loss: 0.2963520288467407 2023-01-22 12:09:23.843179: step: 764/466, loss: 0.6128653883934021 2023-01-22 12:09:24.569465: step: 766/466, loss: 0.7596311569213867 2023-01-22 12:09:25.344696: step: 768/466, loss: 0.6596415638923645 2023-01-22 12:09:26.167385: step: 770/466, loss: 0.3747796416282654 2023-01-22 12:09:26.838580: step: 772/466, loss: 0.8948156237602234 2023-01-22 12:09:27.593806: step: 774/466, loss: 0.7336122989654541 2023-01-22 12:09:28.394532: step: 776/466, loss: 0.9042572379112244 2023-01-22 12:09:29.185157: step: 778/466, loss: 1.0166397094726562 2023-01-22 12:09:29.851403: step: 780/466, loss: 0.44153064489364624 2023-01-22 12:09:30.550748: step: 782/466, loss: 0.4000193178653717 2023-01-22 12:09:31.436151: step: 784/466, loss: 0.23604170978069305 2023-01-22 12:09:32.296735: step: 
786/466, loss: 0.6637749671936035 2023-01-22 12:09:33.097761: step: 788/466, loss: 1.5385891199111938 2023-01-22 12:09:33.930902: step: 790/466, loss: 0.2822076082229614 2023-01-22 12:09:34.673360: step: 792/466, loss: 0.3108176589012146 2023-01-22 12:09:35.339860: step: 794/466, loss: 0.38376277685165405 2023-01-22 12:09:36.109354: step: 796/466, loss: 0.9600449800491333 2023-01-22 12:09:36.835083: step: 798/466, loss: 0.49588266015052795 2023-01-22 12:09:37.705103: step: 800/466, loss: 0.7010972499847412 2023-01-22 12:09:38.448847: step: 802/466, loss: 0.604422926902771 2023-01-22 12:09:39.232320: step: 804/466, loss: 0.1080489307641983 2023-01-22 12:09:39.967236: step: 806/466, loss: 0.25231894850730896 2023-01-22 12:09:40.737328: step: 808/466, loss: 0.7223091721534729 2023-01-22 12:09:41.529710: step: 810/466, loss: 1.1986238956451416 2023-01-22 12:09:42.306052: step: 812/466, loss: 0.7812969088554382 2023-01-22 12:09:43.038604: step: 814/466, loss: 0.8877654671669006 2023-01-22 12:09:43.766593: step: 816/466, loss: 0.5675227642059326 2023-01-22 12:09:44.539279: step: 818/466, loss: 1.0517579317092896 2023-01-22 12:09:45.398441: step: 820/466, loss: 0.8483049273490906 2023-01-22 12:09:46.127512: step: 822/466, loss: 1.1111209392547607 2023-01-22 12:09:46.843644: step: 824/466, loss: 0.7873216271400452 2023-01-22 12:09:47.494682: step: 826/466, loss: 0.2162623107433319 2023-01-22 12:09:48.347028: step: 828/466, loss: 0.521052360534668 2023-01-22 12:09:49.156636: step: 830/466, loss: 0.7919299602508545 2023-01-22 12:09:49.893795: step: 832/466, loss: 1.0302027463912964 2023-01-22 12:09:50.597203: step: 834/466, loss: 0.3500521183013916 2023-01-22 12:09:51.344817: step: 836/466, loss: 0.7324089407920837 2023-01-22 12:09:52.117258: step: 838/466, loss: 0.26163384318351746 2023-01-22 12:09:52.794677: step: 840/466, loss: 0.4300655126571655 2023-01-22 12:09:53.563469: step: 842/466, loss: 0.7690464854240417 2023-01-22 12:09:54.380658: step: 844/466, loss: 
0.5059322118759155 2023-01-22 12:09:55.194251: step: 846/466, loss: 0.29137077927589417 2023-01-22 12:09:55.964960: step: 848/466, loss: 0.3070172965526581 2023-01-22 12:09:56.711357: step: 850/466, loss: 0.19828540086746216 2023-01-22 12:09:57.442544: step: 852/466, loss: 0.4829852283000946 2023-01-22 12:09:58.268701: step: 854/466, loss: 0.6273636817932129 2023-01-22 12:09:59.000260: step: 856/466, loss: 0.22892563045024872 2023-01-22 12:09:59.712752: step: 858/466, loss: 0.9977620244026184 2023-01-22 12:10:00.494168: step: 860/466, loss: 0.20873820781707764 2023-01-22 12:10:01.300181: step: 862/466, loss: 0.6448942422866821 2023-01-22 12:10:02.081071: step: 864/466, loss: 1.3498508930206299 2023-01-22 12:10:02.832248: step: 866/466, loss: 0.2051345258951187 2023-01-22 12:10:03.668180: step: 868/466, loss: 0.6440752744674683 2023-01-22 12:10:04.405015: step: 870/466, loss: 0.261584609746933 2023-01-22 12:10:05.252838: step: 872/466, loss: 0.2732155919075012 2023-01-22 12:10:06.039932: step: 874/466, loss: 0.8551644086837769 2023-01-22 12:10:06.836172: step: 876/466, loss: 1.28314208984375 2023-01-22 12:10:07.631267: step: 878/466, loss: 0.34755074977874756 2023-01-22 12:10:08.328786: step: 880/466, loss: 0.30099064111709595 2023-01-22 12:10:09.093645: step: 882/466, loss: 0.32799744606018066 2023-01-22 12:10:09.863355: step: 884/466, loss: 0.9017066955566406 2023-01-22 12:10:10.598516: step: 886/466, loss: 0.8086469173431396 2023-01-22 12:10:11.350686: step: 888/466, loss: 1.2836970090866089 2023-01-22 12:10:12.095468: step: 890/466, loss: 1.206702470779419 2023-01-22 12:10:12.866717: step: 892/466, loss: 0.3659389615058899 2023-01-22 12:10:13.690719: step: 894/466, loss: 2.131016731262207 2023-01-22 12:10:14.521452: step: 896/466, loss: 1.3239178657531738 2023-01-22 12:10:15.322015: step: 898/466, loss: 7.509538650512695 2023-01-22 12:10:16.099054: step: 900/466, loss: 0.4374963939189911 2023-01-22 12:10:16.836476: step: 902/466, loss: 0.9535677433013916 
2023-01-22 12:10:17.523758: step: 904/466, loss: 0.19704625010490417 2023-01-22 12:10:18.282516: step: 906/466, loss: 0.3004119396209717 2023-01-22 12:10:19.032207: step: 908/466, loss: 0.8080199956893921 2023-01-22 12:10:19.802941: step: 910/466, loss: 1.2489067316055298 2023-01-22 12:10:20.575634: step: 912/466, loss: 1.3066067695617676 2023-01-22 12:10:21.353876: step: 914/466, loss: 0.8604760766029358 2023-01-22 12:10:22.114623: step: 916/466, loss: 3.206796407699585 2023-01-22 12:10:22.747107: step: 918/466, loss: 0.5037591457366943 2023-01-22 12:10:23.470384: step: 920/466, loss: 0.28978392481803894 2023-01-22 12:10:24.194793: step: 922/466, loss: 1.1058472394943237 2023-01-22 12:10:24.991592: step: 924/466, loss: 0.2016083300113678 2023-01-22 12:10:25.700364: step: 926/466, loss: 1.3228625059127808 2023-01-22 12:10:26.440121: step: 928/466, loss: 1.0719577074050903 2023-01-22 12:10:27.203965: step: 930/466, loss: 2.4533298015594482 2023-01-22 12:10:28.004032: step: 932/466, loss: 10.044556617736816
==================================================
Loss: 0.938
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30050104216770884, 'r': 0.32269714187327825, 'f1': 0.3112038190120382}, 'combined': 0.22930807716676496, 'epoch': 5}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34487493229245325, 'r': 0.23120940529987133, 'f1': 0.2768286613429842}, 'combined': 0.17014834794739514, 'epoch': 5}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27841788354686137, 'r': 0.32165702455224515, 'f1': 0.298479629109992}, 'combined': 0.21993235829157304, 'epoch': 5}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3258544834057031, 'r': 0.2314243346678912, 'f1': 0.27063887797276903}, 'combined': 0.16634389572960437, 'epoch': 5}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32328770863505785, 'r': 0.3355566918849651, 'f1': 0.3293079639169025}, 'combined': 0.24264797341245445, 'epoch': 5}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3542723554470683, 'r': 0.23076737340107478, 'f1': 0.2794835868534756}, 'combined': 0.17262221540949965, 'epoch': 5}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23404255319148937, 'r': 0.3142857142857143, 'f1': 0.2682926829268293}, 'combined': 0.17886178861788618, 'epoch': 5}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2661290322580645, 'r': 0.358695652173913, 'f1': 0.30555555555555547}, 'combined': 0.15277777777777773, 'epoch': 5}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 5}
New best russian model...
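The evaluation blocks above follow a consistent pattern: each language reports precision, recall, and F1 for the `template` and `slot` sub-tasks, and the logged `combined` value matches the product of the two F1 scores. The sketch below reproduces those relationships and parses the per-step loss lines; it is inferred from the log output only (the helper names and the regex are assumptions, not code from `train.py`):

```python
import re

# Each training line has the form "<timestamp>: step: <n>/466, loss: <float>".
# STEP_RE is a hypothetical helper for pulling losses back out of the log.
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.]+)")

def parse_losses(text):
    """Return the list of per-step loss values found in a log fragment."""
    return [float(m.group(2)) for m in STEP_RE.finditer(text)]

def f1(p, r):
    """Harmonic mean of precision and recall, as reported per sub-task."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Cross-check against the epoch-5 "Dev Chinese" block above:
template_f1 = f1(1.0, 0.5833333333333334)               # ≈ 0.7368421052631579
slot_f1 = f1(0.30050104216770884, 0.32269714187327825)  # ≈ 0.3112038190120382
combined = template_f1 * slot_f1                        # ≈ 0.22930807716676496
```

Under this reading, the per-epoch `Loss: 0.938` line would be the mean of `parse_losses` over the epoch's steps, though that cannot be confirmed from this fragment alone.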
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3067315481548155, 'r': 0.2939268155942729, 'f1': 0.30019269732205783}, 'combined': 0.22119461907941101, 'epoch': 4}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3301358185595464, 'r': 0.23397698463165778, 'f1': 0.27386077007468695}, 'combined': 0.168324180631271, 'epoch': 4}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2403846153846154, 'r': 0.35714285714285715, 'f1': 0.28735632183908044}, 'combined': 0.19157088122605362, 'epoch': 4}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2791706942517267, 'r': 0.28552752220432764, 'f1': 0.28231332870859416}, 'combined': 0.2080203474694904, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.306518116903463, 'r': 0.2717829323754754, 'f1': 0.28810735426506145}, 'combined': 0.1770806177434036, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29375, 'r': 0.5108695652173914, 'f1': 0.3730158730158731}, 'combined': 0.18650793650793654, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32328770863505785, 'r': 0.3355566918849651, 'f1': 0.3293079639169025}, 'combined': 0.24264797341245445, 'epoch': 5}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3542723554470683, 'r': 0.23076737340107478, 'f1': 0.2794835868534756}, 'combined': 0.17262221540949965, 'epoch': 5}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p':
0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 5} ****************************** Epoch: 6 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 12:13:20.275796: step: 2/466, loss: 0.32218968868255615 2023-01-22 12:13:21.048378: step: 4/466, loss: 0.5556846857070923 2023-01-22 12:13:21.850969: step: 6/466, loss: 1.0528106689453125 2023-01-22 12:13:22.645962: step: 8/466, loss: 0.4053214192390442 2023-01-22 12:13:23.445970: step: 10/466, loss: 1.505388617515564 2023-01-22 12:13:24.199556: step: 12/466, loss: 0.5031132102012634 2023-01-22 12:13:24.974492: step: 14/466, loss: 0.7867810726165771 2023-01-22 12:13:25.699357: step: 16/466, loss: 0.9841118454933167 2023-01-22 12:13:26.466709: step: 18/466, loss: 0.7677281498908997 2023-01-22 12:13:27.189336: step: 20/466, loss: 0.23764349520206451 2023-01-22 12:13:27.935601: step: 22/466, loss: 0.7144334316253662 2023-01-22 12:13:28.638675: step: 24/466, loss: 0.38185957074165344 2023-01-22 12:13:29.464507: step: 26/466, loss: 2.423734188079834 2023-01-22 12:13:30.200600: step: 28/466, loss: 0.43167775869369507 2023-01-22 12:13:30.906768: step: 30/466, loss: 0.0954960510134697 2023-01-22 12:13:31.582941: step: 32/466, loss: 0.5020628571510315 2023-01-22 12:13:32.420499: step: 34/466, loss: 0.41035231947898865 2023-01-22 12:13:33.127616: step: 36/466, loss: 0.7318971753120422 2023-01-22 12:13:33.902292: step: 38/466, loss: 0.7467445135116577 2023-01-22 12:13:34.692241: step: 40/466, loss: 0.21532611548900604 2023-01-22 12:13:35.427381: step: 42/466, loss: 0.3341991901397705 2023-01-22 12:13:36.179371: step: 44/466, loss: 0.5254302620887756 2023-01-22 12:13:36.903707: step: 46/466, loss: 0.40966877341270447 2023-01-22 12:13:37.728879: step: 48/466, loss: 0.35710608959198 2023-01-22 
12:13:38.471669: step: 50/466, loss: 0.3491014242172241 2023-01-22 12:13:39.207942: step: 52/466, loss: 0.5511988997459412 2023-01-22 12:13:39.993982: step: 54/466, loss: 1.2386209964752197 2023-01-22 12:13:40.760037: step: 56/466, loss: 1.3478890657424927 2023-01-22 12:13:41.526622: step: 58/466, loss: 0.8968897461891174 2023-01-22 12:13:42.233404: step: 60/466, loss: 0.2680509388446808 2023-01-22 12:13:42.910731: step: 62/466, loss: 1.1116560697555542 2023-01-22 12:13:43.687294: step: 64/466, loss: 0.18260887265205383 2023-01-22 12:13:44.455068: step: 66/466, loss: 0.3070507347583771 2023-01-22 12:13:45.249207: step: 68/466, loss: 1.1425074338912964 2023-01-22 12:13:46.024004: step: 70/466, loss: 0.8123894333839417 2023-01-22 12:13:46.765192: step: 72/466, loss: 0.27354997396469116 2023-01-22 12:13:47.503511: step: 74/466, loss: 0.36695006489753723 2023-01-22 12:13:48.241491: step: 76/466, loss: 0.33018723130226135 2023-01-22 12:13:48.984281: step: 78/466, loss: 0.5599246621131897 2023-01-22 12:13:49.775988: step: 80/466, loss: 0.7603071928024292 2023-01-22 12:13:50.507988: step: 82/466, loss: 0.3653887212276459 2023-01-22 12:13:51.234945: step: 84/466, loss: 1.2102289199829102 2023-01-22 12:13:52.088593: step: 86/466, loss: 0.4492781162261963 2023-01-22 12:13:52.855297: step: 88/466, loss: 0.325231671333313 2023-01-22 12:13:53.534381: step: 90/466, loss: 0.2342669665813446 2023-01-22 12:13:54.248815: step: 92/466, loss: 0.4624057412147522 2023-01-22 12:13:55.154540: step: 94/466, loss: 0.43966928124427795 2023-01-22 12:13:55.942412: step: 96/466, loss: 0.6490041613578796 2023-01-22 12:13:56.698144: step: 98/466, loss: 0.9651503562927246 2023-01-22 12:13:57.449163: step: 100/466, loss: 0.7601068615913391 2023-01-22 12:13:58.225564: step: 102/466, loss: 0.22763891518115997 2023-01-22 12:13:59.020753: step: 104/466, loss: 0.3821936249732971 2023-01-22 12:13:59.747316: step: 106/466, loss: 0.581498384475708 2023-01-22 12:14:00.504129: step: 108/466, loss: 
0.23628266155719757 2023-01-22 12:14:01.199636: step: 110/466, loss: 0.5058366656303406 2023-01-22 12:14:01.951766: step: 112/466, loss: 0.2841312289237976 2023-01-22 12:14:02.673938: step: 114/466, loss: 0.5994552373886108 2023-01-22 12:14:03.450591: step: 116/466, loss: 0.16569878160953522 2023-01-22 12:14:04.192131: step: 118/466, loss: 0.16361653804779053 2023-01-22 12:14:04.966326: step: 120/466, loss: 0.3208037316799164 2023-01-22 12:14:05.720460: step: 122/466, loss: 0.606379508972168 2023-01-22 12:14:06.492751: step: 124/466, loss: 1.2331677675247192 2023-01-22 12:14:07.219169: step: 126/466, loss: 0.25890928506851196 2023-01-22 12:14:07.928385: step: 128/466, loss: 0.2537651062011719 2023-01-22 12:14:08.657720: step: 130/466, loss: 0.29682645201683044 2023-01-22 12:14:09.489214: step: 132/466, loss: 2.0517148971557617 2023-01-22 12:14:10.238399: step: 134/466, loss: 0.28343483805656433 2023-01-22 12:14:10.985321: step: 136/466, loss: 0.7250392436981201 2023-01-22 12:14:11.759415: step: 138/466, loss: 0.1914062350988388 2023-01-22 12:14:12.527369: step: 140/466, loss: 0.3541911840438843 2023-01-22 12:14:13.315274: step: 142/466, loss: 0.8470289707183838 2023-01-22 12:14:14.065138: step: 144/466, loss: 0.49261319637298584 2023-01-22 12:14:14.810091: step: 146/466, loss: 0.34677648544311523 2023-01-22 12:14:15.575262: step: 148/466, loss: 0.2643190920352936 2023-01-22 12:14:16.364235: step: 150/466, loss: 0.1552414745092392 2023-01-22 12:14:17.093368: step: 152/466, loss: 0.20740438997745514 2023-01-22 12:14:17.921490: step: 154/466, loss: 0.21313226222991943 2023-01-22 12:14:18.759639: step: 156/466, loss: 0.33110979199409485 2023-01-22 12:14:19.548126: step: 158/466, loss: 0.2816070318222046 2023-01-22 12:14:20.292822: step: 160/466, loss: 0.2219778299331665 2023-01-22 12:14:21.034972: step: 162/466, loss: 0.404222309589386 2023-01-22 12:14:21.829150: step: 164/466, loss: 0.22453781962394714 2023-01-22 12:14:22.543851: step: 166/466, loss: 
0.3454417884349823 2023-01-22 12:14:23.319375: step: 168/466, loss: 0.5886520743370056 2023-01-22 12:14:24.029081: step: 170/466, loss: 0.6885159015655518 2023-01-22 12:14:24.840241: step: 172/466, loss: 0.8449385166168213 2023-01-22 12:14:25.661191: step: 174/466, loss: 0.41053617000579834 2023-01-22 12:14:26.447509: step: 176/466, loss: 0.25612226128578186 2023-01-22 12:14:27.231467: step: 178/466, loss: 1.0309232473373413 2023-01-22 12:14:27.917394: step: 180/466, loss: 1.5213203430175781 2023-01-22 12:14:28.716719: step: 182/466, loss: 0.6834684610366821 2023-01-22 12:14:29.476253: step: 184/466, loss: 0.25764113664627075 2023-01-22 12:14:30.210720: step: 186/466, loss: 1.1139030456542969 2023-01-22 12:14:31.065574: step: 188/466, loss: 1.328330636024475 2023-01-22 12:14:31.866789: step: 190/466, loss: 0.6299300789833069 2023-01-22 12:14:32.625835: step: 192/466, loss: 0.47483086585998535 2023-01-22 12:14:33.585866: step: 194/466, loss: 0.7948506474494934 2023-01-22 12:14:34.261521: step: 196/466, loss: 0.7495373487472534 2023-01-22 12:14:35.042677: step: 198/466, loss: 0.3398696780204773 2023-01-22 12:14:35.770659: step: 200/466, loss: 0.25219038128852844 2023-01-22 12:14:36.567957: step: 202/466, loss: 0.2839111089706421 2023-01-22 12:14:37.335103: step: 204/466, loss: 0.2768838703632355 2023-01-22 12:14:38.142242: step: 206/466, loss: 0.4428100883960724 2023-01-22 12:14:38.902692: step: 208/466, loss: 0.7311434149742126 2023-01-22 12:14:39.711120: step: 210/466, loss: 0.8593393564224243 2023-01-22 12:14:40.492280: step: 212/466, loss: 0.7541153430938721 2023-01-22 12:14:41.250054: step: 214/466, loss: 0.6181634664535522 2023-01-22 12:14:41.986025: step: 216/466, loss: 0.88047856092453 2023-01-22 12:14:42.741493: step: 218/466, loss: 1.1687943935394287 2023-01-22 12:14:43.539567: step: 220/466, loss: 0.5406815409660339 2023-01-22 12:14:44.453403: step: 222/466, loss: 0.42107611894607544 2023-01-22 12:14:45.226662: step: 224/466, loss: 0.17114980518817902 
2023-01-22 12:14:46.007680: step: 226/466, loss: 0.4720715880393982 2023-01-22 12:14:46.735202: step: 228/466, loss: 0.37465938925743103 2023-01-22 12:14:47.452133: step: 230/466, loss: 0.20296700298786163 2023-01-22 12:14:48.306251: step: 232/466, loss: 0.747587263584137 2023-01-22 12:14:49.059998: step: 234/466, loss: 0.333487868309021 2023-01-22 12:14:49.872470: step: 236/466, loss: 1.3917906284332275 2023-01-22 12:14:50.697206: step: 238/466, loss: 0.7705444097518921 2023-01-22 12:14:51.451340: step: 240/466, loss: 0.5334435701370239 2023-01-22 12:14:52.180275: step: 242/466, loss: 0.6047327518463135 2023-01-22 12:14:53.100706: step: 244/466, loss: 0.7313522100448608 2023-01-22 12:14:53.840118: step: 246/466, loss: 0.6601555347442627 2023-01-22 12:14:54.571406: step: 248/466, loss: 0.19155901670455933 2023-01-22 12:14:55.397125: step: 250/466, loss: 0.16767148673534393 2023-01-22 12:14:56.106075: step: 252/466, loss: 0.27993425726890564 2023-01-22 12:14:56.908900: step: 254/466, loss: 1.27365243434906 2023-01-22 12:14:57.782971: step: 256/466, loss: 1.5138299465179443 2023-01-22 12:14:58.525780: step: 258/466, loss: 0.5258921980857849 2023-01-22 12:14:59.265430: step: 260/466, loss: 0.25220438838005066 2023-01-22 12:15:00.060875: step: 262/466, loss: 0.5644699335098267 2023-01-22 12:15:00.858340: step: 264/466, loss: 0.2249457985162735 2023-01-22 12:15:01.627804: step: 266/466, loss: 0.38131022453308105 2023-01-22 12:15:02.494341: step: 268/466, loss: 0.445722758769989 2023-01-22 12:15:03.171199: step: 270/466, loss: 1.9734113216400146 2023-01-22 12:15:03.905715: step: 272/466, loss: 0.46760833263397217 2023-01-22 12:15:04.636136: step: 274/466, loss: 0.5852957367897034 2023-01-22 12:15:05.497857: step: 276/466, loss: 1.8949702978134155 2023-01-22 12:15:06.181298: step: 278/466, loss: 1.1939055919647217 2023-01-22 12:15:06.960680: step: 280/466, loss: 1.2649636268615723 2023-01-22 12:15:07.765694: step: 282/466, loss: 0.39441004395484924 2023-01-22 
12:15:08.559295: step: 284/466, loss: 0.6390337944030762 2023-01-22 12:15:09.355367: step: 286/466, loss: 0.5137861371040344 2023-01-22 12:15:10.113748: step: 288/466, loss: 0.4395367205142975 2023-01-22 12:15:11.036185: step: 290/466, loss: 0.8883119225502014 2023-01-22 12:15:11.759430: step: 292/466, loss: 0.552345335483551 2023-01-22 12:15:12.561800: step: 294/466, loss: 0.723929762840271 2023-01-22 12:15:13.328505: step: 296/466, loss: 0.530340313911438 2023-01-22 12:15:14.153856: step: 298/466, loss: 0.6502615809440613 2023-01-22 12:15:14.893966: step: 300/466, loss: 2.418755054473877 2023-01-22 12:15:15.738612: step: 302/466, loss: 0.22726400196552277 2023-01-22 12:15:16.575488: step: 304/466, loss: 0.49434563517570496 2023-01-22 12:15:17.363738: step: 306/466, loss: 0.28831496834754944 2023-01-22 12:15:18.168964: step: 308/466, loss: 0.3543457090854645 2023-01-22 12:15:18.893932: step: 310/466, loss: 0.9062546491622925 2023-01-22 12:15:19.654418: step: 312/466, loss: 0.2703440189361572 2023-01-22 12:15:20.466773: step: 314/466, loss: 0.5241992473602295 2023-01-22 12:15:21.171513: step: 316/466, loss: 0.4552978575229645 2023-01-22 12:15:21.879220: step: 318/466, loss: 0.39680472016334534 2023-01-22 12:15:22.597713: step: 320/466, loss: 0.09683714807033539 2023-01-22 12:15:23.362949: step: 322/466, loss: 0.8749290108680725 2023-01-22 12:15:24.092614: step: 324/466, loss: 0.8422608971595764 2023-01-22 12:15:24.907349: step: 326/466, loss: 0.7888513803482056 2023-01-22 12:15:25.671090: step: 328/466, loss: 0.3374817967414856 2023-01-22 12:15:26.356477: step: 330/466, loss: 0.6404813528060913 2023-01-22 12:15:27.118962: step: 332/466, loss: 0.5628029704093933 2023-01-22 12:15:27.993554: step: 334/466, loss: 0.623656153678894 2023-01-22 12:15:28.782350: step: 336/466, loss: 0.7959246039390564 2023-01-22 12:15:29.502920: step: 338/466, loss: 0.8810027837753296 2023-01-22 12:15:30.265872: step: 340/466, loss: 0.5927742719650269 2023-01-22 12:15:31.002855: step: 
342/466, loss: 0.6585927605628967 2023-01-22 12:15:31.872337: step: 344/466, loss: 0.45125773549079895 2023-01-22 12:15:32.624349: step: 346/466, loss: 0.8539192080497742 2023-01-22 12:15:33.376974: step: 348/466, loss: 1.160231590270996 2023-01-22 12:15:34.084425: step: 350/466, loss: 1.3324898481369019 2023-01-22 12:15:34.841963: step: 352/466, loss: 0.45158857107162476 2023-01-22 12:15:35.634524: step: 354/466, loss: 0.3662046492099762 2023-01-22 12:15:36.457420: step: 356/466, loss: 0.6877017617225647 2023-01-22 12:15:37.205850: step: 358/466, loss: 0.5057063102722168 2023-01-22 12:15:37.999896: step: 360/466, loss: 0.7695874571800232 2023-01-22 12:15:38.777988: step: 362/466, loss: 0.49749425053596497 2023-01-22 12:15:39.626748: step: 364/466, loss: 0.14780224859714508 2023-01-22 12:15:40.433285: step: 366/466, loss: 0.7745498418807983 2023-01-22 12:15:41.248071: step: 368/466, loss: 1.1423521041870117 2023-01-22 12:15:42.052748: step: 370/466, loss: 0.5440044403076172 2023-01-22 12:15:42.733299: step: 372/466, loss: 3.054959297180176 2023-01-22 12:15:43.572232: step: 374/466, loss: 1.0194264650344849 2023-01-22 12:15:44.343199: step: 376/466, loss: 0.3306431174278259 2023-01-22 12:15:45.123875: step: 378/466, loss: 0.9671432971954346 2023-01-22 12:15:45.955928: step: 380/466, loss: 0.4413227438926697 2023-01-22 12:15:46.769076: step: 382/466, loss: 0.3477259576320648 2023-01-22 12:15:47.594497: step: 384/466, loss: 1.126652479171753 2023-01-22 12:15:48.396902: step: 386/466, loss: 0.8670724630355835 2023-01-22 12:15:49.200355: step: 388/466, loss: 1.109525442123413 2023-01-22 12:15:50.126434: step: 390/466, loss: 0.8821362257003784 2023-01-22 12:15:50.876969: step: 392/466, loss: 0.3569082021713257 2023-01-22 12:15:51.692078: step: 394/466, loss: 0.9964345693588257 2023-01-22 12:15:52.490124: step: 396/466, loss: 0.8517532348632812 2023-01-22 12:15:53.232753: step: 398/466, loss: 1.4055027961730957 2023-01-22 12:15:53.999682: step: 400/466, loss: 
0.6798644065856934 2023-01-22 12:15:54.686306: step: 402/466, loss: 0.5541855692863464 2023-01-22 12:15:55.497791: step: 404/466, loss: 0.7029849290847778 2023-01-22 12:15:56.184025: step: 406/466, loss: 0.3102688789367676 2023-01-22 12:15:57.028348: step: 408/466, loss: 1.1614198684692383 2023-01-22 12:15:57.807884: step: 410/466, loss: 1.6590417623519897 2023-01-22 12:15:58.581429: step: 412/466, loss: 1.6184443235397339 2023-01-22 12:15:59.441014: step: 414/466, loss: 0.819237470626831 2023-01-22 12:16:00.278364: step: 416/466, loss: 0.5732754468917847 2023-01-22 12:16:00.993780: step: 418/466, loss: 0.37938907742500305 2023-01-22 12:16:01.738192: step: 420/466, loss: 0.46679314970970154 2023-01-22 12:16:02.509737: step: 422/466, loss: 0.9363054633140564 2023-01-22 12:16:03.232288: step: 424/466, loss: 0.8060967326164246 2023-01-22 12:16:04.009344: step: 426/466, loss: 1.919229507446289 2023-01-22 12:16:04.782040: step: 428/466, loss: 0.39510074257850647 2023-01-22 12:16:05.575419: step: 430/466, loss: 0.3275643587112427 2023-01-22 12:16:06.334026: step: 432/466, loss: 0.41811782121658325 2023-01-22 12:16:07.094340: step: 434/466, loss: 1.048869013786316 2023-01-22 12:16:07.800839: step: 436/466, loss: 0.5930517315864563 2023-01-22 12:16:08.602380: step: 438/466, loss: 0.37204745411872864 2023-01-22 12:16:09.316622: step: 440/466, loss: 0.9435140490531921 2023-01-22 12:16:10.028841: step: 442/466, loss: 0.6689704656600952 2023-01-22 12:16:10.865195: step: 444/466, loss: 1.39583158493042 2023-01-22 12:16:11.621026: step: 446/466, loss: 0.33848732709884644 2023-01-22 12:16:12.318177: step: 448/466, loss: 0.5037708282470703 2023-01-22 12:16:13.052671: step: 450/466, loss: 0.2862124741077423 2023-01-22 12:16:13.797530: step: 452/466, loss: 0.3021622598171234 2023-01-22 12:16:14.643034: step: 454/466, loss: 0.35256338119506836 2023-01-22 12:16:15.504573: step: 456/466, loss: 0.30306318402290344 2023-01-22 12:16:16.218771: step: 458/466, loss: 0.4096969962120056 
2023-01-22 12:16:17.038477: step: 460/466, loss: 0.19336767494678497 2023-01-22 12:16:17.807472: step: 462/466, loss: 0.715903103351593 2023-01-22 12:16:18.571525: step: 464/466, loss: 0.6890691518783569 2023-01-22 12:16:19.422427: step: 466/466, loss: 0.416951060295105 2023-01-22 12:16:20.238056: step: 468/466, loss: 1.248590350151062 2023-01-22 12:16:20.968671: step: 470/466, loss: 0.6275112628936768 2023-01-22 12:16:21.754551: step: 472/466, loss: 0.7352412939071655 2023-01-22 12:16:22.589067: step: 474/466, loss: 0.2669612467288971 2023-01-22 12:16:23.366046: step: 476/466, loss: 2.033390998840332 2023-01-22 12:16:24.120995: step: 478/466, loss: 0.29588741064071655 2023-01-22 12:16:24.949713: step: 480/466, loss: 1.0971732139587402 2023-01-22 12:16:25.715608: step: 482/466, loss: 0.32912346720695496 2023-01-22 12:16:26.468093: step: 484/466, loss: 0.2324758917093277 2023-01-22 12:16:27.262529: step: 486/466, loss: 0.8204798102378845 2023-01-22 12:16:28.039633: step: 488/466, loss: 0.3005208671092987 2023-01-22 12:16:28.798870: step: 490/466, loss: 0.3320615589618683 2023-01-22 12:16:29.520101: step: 492/466, loss: 0.21812233328819275 2023-01-22 12:16:30.286910: step: 494/466, loss: 0.347537100315094 2023-01-22 12:16:31.092498: step: 496/466, loss: 0.3185756504535675 2023-01-22 12:16:31.892757: step: 498/466, loss: 0.8476412296295166 2023-01-22 12:16:32.641038: step: 500/466, loss: 3.0157365798950195 2023-01-22 12:16:33.464262: step: 502/466, loss: 0.45933496952056885 2023-01-22 12:16:34.251420: step: 504/466, loss: 0.8938225507736206 2023-01-22 12:16:34.957607: step: 506/466, loss: 0.782000720500946 2023-01-22 12:16:35.771655: step: 508/466, loss: 0.18423041701316833 2023-01-22 12:16:36.566416: step: 510/466, loss: 0.361537903547287 2023-01-22 12:16:37.352280: step: 512/466, loss: 0.2748829424381256 2023-01-22 12:16:38.137930: step: 514/466, loss: 0.6222298741340637 2023-01-22 12:16:38.853778: step: 516/466, loss: 0.24053962528705597 2023-01-22 12:16:39.771696: 
step: 518/466, loss: 0.37778693437576294 2023-01-22 12:16:40.560864: step: 520/466, loss: 0.8710047006607056 2023-01-22 12:16:41.312347: step: 522/466, loss: 0.853071928024292 2023-01-22 12:16:42.048077: step: 524/466, loss: 1.2134851217269897 2023-01-22 12:16:42.816657: step: 526/466, loss: 1.0508193969726562 2023-01-22 12:16:43.596023: step: 528/466, loss: 0.34795981645584106 2023-01-22 12:16:44.373692: step: 530/466, loss: 1.2121317386627197 2023-01-22 12:16:45.210265: step: 532/466, loss: 0.8486328721046448 2023-01-22 12:16:46.041736: step: 534/466, loss: 1.7330005168914795 2023-01-22 12:16:46.784919: step: 536/466, loss: 0.3037912845611572 2023-01-22 12:16:47.535129: step: 538/466, loss: 0.37141942977905273 2023-01-22 12:16:48.357793: step: 540/466, loss: 1.067850947380066 2023-01-22 12:16:49.151855: step: 542/466, loss: 0.240159809589386 2023-01-22 12:16:49.899688: step: 544/466, loss: 0.6632150411605835 2023-01-22 12:16:50.654629: step: 546/466, loss: 0.42939281463623047 2023-01-22 12:16:51.496107: step: 548/466, loss: 0.5655322670936584 2023-01-22 12:16:52.233602: step: 550/466, loss: 1.8335810899734497 2023-01-22 12:16:52.988940: step: 552/466, loss: 0.17235557734966278 2023-01-22 12:16:53.705304: step: 554/466, loss: 0.987075686454773 2023-01-22 12:16:54.630641: step: 556/466, loss: 0.2663685977458954 2023-01-22 12:16:55.391448: step: 558/466, loss: 0.42914122343063354 2023-01-22 12:16:56.197609: step: 560/466, loss: 0.4898962676525116 2023-01-22 12:16:56.958004: step: 562/466, loss: 0.20535391569137573 2023-01-22 12:16:57.719669: step: 564/466, loss: 0.27649348974227905 2023-01-22 12:16:58.472006: step: 566/466, loss: 0.2878555655479431 2023-01-22 12:16:59.260572: step: 568/466, loss: 0.5329242944717407 2023-01-22 12:17:00.042454: step: 570/466, loss: 1.488804578781128 2023-01-22 12:17:00.783501: step: 572/466, loss: 1.3856981992721558 2023-01-22 12:17:01.533965: step: 574/466, loss: 0.5450658798217773 2023-01-22 12:17:02.452668: step: 576/466, loss: 
0.3929985463619232 2023-01-22 12:17:03.162784: step: 578/466, loss: 1.7792253494262695 2023-01-22 12:17:03.789602: step: 580/466, loss: 0.5900392532348633 2023-01-22 12:17:04.598560: step: 582/466, loss: 1.271080493927002 2023-01-22 12:17:05.343535: step: 584/466, loss: 0.14637401700019836 2023-01-22 12:17:06.101601: step: 586/466, loss: 1.0173259973526 2023-01-22 12:17:06.807149: step: 588/466, loss: 0.28128859400749207 2023-01-22 12:17:07.601551: step: 590/466, loss: 0.8834716081619263 2023-01-22 12:17:08.358113: step: 592/466, loss: 0.4259241223335266 2023-01-22 12:17:09.134624: step: 594/466, loss: 0.5690405964851379 2023-01-22 12:17:09.819167: step: 596/466, loss: 1.0316557884216309 2023-01-22 12:17:10.572550: step: 598/466, loss: 0.29038065671920776 2023-01-22 12:17:11.433913: step: 600/466, loss: 0.8943952322006226 2023-01-22 12:17:12.182844: step: 602/466, loss: 0.619583785533905 2023-01-22 12:17:12.922791: step: 604/466, loss: 0.2445245087146759 2023-01-22 12:17:13.672508: step: 606/466, loss: 0.8926266431808472 2023-01-22 12:17:14.513516: step: 608/466, loss: 0.18985801935195923 2023-01-22 12:17:15.183673: step: 610/466, loss: 0.8654029369354248 2023-01-22 12:17:15.976361: step: 612/466, loss: 1.043053388595581 2023-01-22 12:17:16.700191: step: 614/466, loss: 0.35355132818222046 2023-01-22 12:17:17.514012: step: 616/466, loss: 1.2893407344818115 2023-01-22 12:17:18.269475: step: 618/466, loss: 0.8885557055473328 2023-01-22 12:17:19.068714: step: 620/466, loss: 1.7367770671844482 2023-01-22 12:17:19.851571: step: 622/466, loss: 1.368610143661499 2023-01-22 12:17:20.656603: step: 624/466, loss: 0.21645484864711761 2023-01-22 12:17:21.387122: step: 626/466, loss: 2.2032814025878906 2023-01-22 12:17:22.129019: step: 628/466, loss: 2.669538974761963 2023-01-22 12:17:22.894009: step: 630/466, loss: 0.5482318997383118 2023-01-22 12:17:23.750923: step: 632/466, loss: 0.754304051399231 2023-01-22 12:17:24.531183: step: 634/466, loss: 0.38112232089042664 2023-01-22 
12:17:25.280713: step: 636/466, loss: 1.5455775260925293 2023-01-22 12:17:25.986058: step: 638/466, loss: 0.33433040976524353 2023-01-22 12:17:26.781861: step: 640/466, loss: 0.6735621094703674 2023-01-22 12:17:27.577021: step: 642/466, loss: 0.19982242584228516 2023-01-22 12:17:28.383094: step: 644/466, loss: 0.4279138445854187 2023-01-22 12:17:29.119968: step: 646/466, loss: 0.5210850238800049 2023-01-22 12:17:29.801609: step: 648/466, loss: 1.5244331359863281 2023-01-22 12:17:30.532011: step: 650/466, loss: 0.7855544090270996 2023-01-22 12:17:31.332948: step: 652/466, loss: 0.25553497672080994 2023-01-22 12:17:32.105846: step: 654/466, loss: 0.19856083393096924 2023-01-22 12:17:32.910697: step: 656/466, loss: 0.2223929613828659 2023-01-22 12:17:33.711523: step: 658/466, loss: 0.5313467383384705 2023-01-22 12:17:34.432028: step: 660/466, loss: 0.24553091824054718 2023-01-22 12:17:35.194385: step: 662/466, loss: 1.6716569662094116 2023-01-22 12:17:35.954047: step: 664/466, loss: 0.6177428960800171 2023-01-22 12:17:36.769710: step: 666/466, loss: 0.20127610862255096 2023-01-22 12:17:37.500920: step: 668/466, loss: 2.8933916091918945 2023-01-22 12:17:38.304136: step: 670/466, loss: 0.5551666617393494 2023-01-22 12:17:39.137345: step: 672/466, loss: 0.2786451578140259 2023-01-22 12:17:39.962861: step: 674/466, loss: 0.5241547226905823 2023-01-22 12:17:40.695585: step: 676/466, loss: 0.46960535645484924 2023-01-22 12:17:41.427381: step: 678/466, loss: 0.9260277152061462 2023-01-22 12:17:42.247839: step: 680/466, loss: 0.505979597568512 2023-01-22 12:17:42.905389: step: 682/466, loss: 0.16141249239444733 2023-01-22 12:17:43.670512: step: 684/466, loss: 0.3264252543449402 2023-01-22 12:17:44.406927: step: 686/466, loss: 4.854494571685791 2023-01-22 12:17:45.174090: step: 688/466, loss: 0.3939049243927002 2023-01-22 12:17:46.015610: step: 690/466, loss: 4.195629596710205 2023-01-22 12:17:46.752660: step: 692/466, loss: 1.5494621992111206 2023-01-22 12:17:47.496178: step: 
694/466, loss: 1.9979078769683838 2023-01-22 12:17:48.264558: step: 696/466, loss: 0.22917333245277405 2023-01-22 12:17:48.990527: step: 698/466, loss: 0.1751905381679535 2023-01-22 12:17:49.743136: step: 700/466, loss: 0.24713370203971863 2023-01-22 12:17:50.531181: step: 702/466, loss: 0.4419696033000946 2023-01-22 12:17:51.242530: step: 704/466, loss: 0.25651171803474426 2023-01-22 12:17:52.022531: step: 706/466, loss: 1.2192009687423706 2023-01-22 12:17:52.764833: step: 708/466, loss: 0.5099461674690247 2023-01-22 12:17:53.524081: step: 710/466, loss: 1.9569056034088135 2023-01-22 12:17:54.275485: step: 712/466, loss: 0.6427992582321167 2023-01-22 12:17:55.014711: step: 714/466, loss: 0.9513710141181946 2023-01-22 12:17:55.734353: step: 716/466, loss: 0.1762647181749344 2023-01-22 12:17:56.475364: step: 718/466, loss: 0.33705297112464905 2023-01-22 12:17:57.320311: step: 720/466, loss: 0.40447694063186646 2023-01-22 12:17:58.084262: step: 722/466, loss: 0.25762277841567993 2023-01-22 12:17:58.842065: step: 724/466, loss: 0.14812979102134705 2023-01-22 12:17:59.578192: step: 726/466, loss: 0.8303240537643433 2023-01-22 12:18:00.312344: step: 728/466, loss: 3.1632981300354004 2023-01-22 12:18:01.061367: step: 730/466, loss: 0.8790704011917114 2023-01-22 12:18:01.823462: step: 732/466, loss: 0.6939700841903687 2023-01-22 12:18:02.612257: step: 734/466, loss: 4.361475944519043 2023-01-22 12:18:03.435049: step: 736/466, loss: 1.0758943557739258 2023-01-22 12:18:04.236143: step: 738/466, loss: 0.6111956834793091 2023-01-22 12:18:05.002339: step: 740/466, loss: 0.5156033039093018 2023-01-22 12:18:05.713467: step: 742/466, loss: 0.5572746992111206 2023-01-22 12:18:06.423164: step: 744/466, loss: 0.4452946186065674 2023-01-22 12:18:07.105574: step: 746/466, loss: 1.1491637229919434 2023-01-22 12:18:07.843173: step: 748/466, loss: 0.3944651186466217 2023-01-22 12:18:08.603621: step: 750/466, loss: 0.3764747977256775 2023-01-22 12:18:09.390471: step: 752/466, loss: 
0.48217853903770447 2023-01-22 12:18:10.209380: step: 754/466, loss: 0.41872501373291016 2023-01-22 12:18:10.964034: step: 756/466, loss: 0.6103464961051941 2023-01-22 12:18:11.699021: step: 758/466, loss: 0.599940836429596 2023-01-22 12:18:12.487618: step: 760/466, loss: 1.9976304769515991 2023-01-22 12:18:13.279951: step: 762/466, loss: 0.5040253400802612 2023-01-22 12:18:14.068884: step: 764/466, loss: 0.5511229038238525 2023-01-22 12:18:14.809012: step: 766/466, loss: 1.2951661348342896 2023-01-22 12:18:15.640617: step: 768/466, loss: 0.9320796728134155 2023-01-22 12:18:16.377389: step: 770/466, loss: 1.6455953121185303 2023-01-22 12:18:17.087804: step: 772/466, loss: 0.5544575452804565 2023-01-22 12:18:17.833303: step: 774/466, loss: 0.3264284133911133 2023-01-22 12:18:18.540394: step: 776/466, loss: 2.721331834793091 2023-01-22 12:18:19.306317: step: 778/466, loss: 0.29829540848731995 2023-01-22 12:18:20.101975: step: 780/466, loss: 0.3227888345718384 2023-01-22 12:18:20.880193: step: 782/466, loss: 0.20262525975704193 2023-01-22 12:18:21.609561: step: 784/466, loss: 0.2872578799724579 2023-01-22 12:18:22.463612: step: 786/466, loss: 0.34970372915267944 2023-01-22 12:18:23.301853: step: 788/466, loss: 3.8941845893859863 2023-01-22 12:18:24.046387: step: 790/466, loss: 1.0660080909729004 2023-01-22 12:18:24.802408: step: 792/466, loss: 0.4929357171058655 2023-01-22 12:18:25.585429: step: 794/466, loss: 0.7868713140487671 2023-01-22 12:18:26.355431: step: 796/466, loss: 0.8327652215957642 2023-01-22 12:18:27.057267: step: 798/466, loss: 1.2728098630905151 2023-01-22 12:18:27.760407: step: 800/466, loss: 0.6656249761581421 2023-01-22 12:18:28.420265: step: 802/466, loss: 0.7639499306678772 2023-01-22 12:18:29.193215: step: 804/466, loss: 0.8927834033966064 2023-01-22 12:18:29.956273: step: 806/466, loss: 0.2732463479042053 2023-01-22 12:18:30.752903: step: 808/466, loss: 0.4828619658946991 2023-01-22 12:18:31.499176: step: 810/466, loss: 1.1914703845977783 
2023-01-22 12:18:32.252052: step: 812/466, loss: 0.23179252445697784 2023-01-22 12:18:33.095268: step: 814/466, loss: 1.8329510688781738 2023-01-22 12:18:33.875933: step: 816/466, loss: 0.2556149661540985 2023-01-22 12:18:34.669309: step: 818/466, loss: 0.8473445773124695 2023-01-22 12:18:35.449046: step: 820/466, loss: 0.5069705247879028 2023-01-22 12:18:36.196867: step: 822/466, loss: 0.1641732007265091 2023-01-22 12:18:36.980805: step: 824/466, loss: 0.8144305348396301 2023-01-22 12:18:37.724344: step: 826/466, loss: 0.1993732899427414 2023-01-22 12:18:38.566734: step: 828/466, loss: 0.16626204550266266 2023-01-22 12:18:39.321962: step: 830/466, loss: 0.8103485107421875 2023-01-22 12:18:40.075527: step: 832/466, loss: 1.944689154624939 2023-01-22 12:18:40.911158: step: 834/466, loss: 0.39095911383628845 2023-01-22 12:18:41.658258: step: 836/466, loss: 0.46140947937965393 2023-01-22 12:18:42.403370: step: 838/466, loss: 0.3880549371242523 2023-01-22 12:18:43.163841: step: 840/466, loss: 1.5692269802093506 2023-01-22 12:18:43.877541: step: 842/466, loss: 0.4883279800415039 2023-01-22 12:18:44.622242: step: 844/466, loss: 0.4998663365840912 2023-01-22 12:18:45.411611: step: 846/466, loss: 0.20900923013687134 2023-01-22 12:18:46.225316: step: 848/466, loss: 0.28893211483955383 2023-01-22 12:18:47.025796: step: 850/466, loss: 0.2505047917366028 2023-01-22 12:18:47.760990: step: 852/466, loss: 0.21978043019771576 2023-01-22 12:18:48.528886: step: 854/466, loss: 0.24482445418834686 2023-01-22 12:18:49.324294: step: 856/466, loss: 0.6268995404243469 2023-01-22 12:18:50.746431: step: 858/466, loss: 2.1591267585754395 2023-01-22 12:18:51.540487: step: 860/466, loss: 0.3672623634338379 2023-01-22 12:18:52.292215: step: 862/466, loss: 0.5924174189567566 2023-01-22 12:18:53.113872: step: 864/466, loss: 0.4663420021533966 2023-01-22 12:18:54.028121: step: 866/466, loss: 2.4700162410736084 2023-01-22 12:18:54.773629: step: 868/466, loss: 0.6600844860076904 2023-01-22 
12:18:55.534733: step: 870/466, loss: 0.3065979778766632 2023-01-22 12:18:56.276605: step: 872/466, loss: 0.6033220291137695 2023-01-22 12:18:56.980874: step: 874/466, loss: 0.41608288884162903 2023-01-22 12:18:57.801328: step: 876/466, loss: 1.080448865890503 2023-01-22 12:18:58.596883: step: 878/466, loss: 0.5031520128250122 2023-01-22 12:18:59.376641: step: 880/466, loss: 0.2834298312664032 2023-01-22 12:19:00.209534: step: 882/466, loss: 0.28815382719039917 2023-01-22 12:19:01.014827: step: 884/466, loss: 0.6634982824325562 2023-01-22 12:19:01.794205: step: 886/466, loss: 0.40648210048675537 2023-01-22 12:19:02.581419: step: 888/466, loss: 0.31952717900276184 2023-01-22 12:19:03.408939: step: 890/466, loss: 19.538501739501953 2023-01-22 12:19:04.231818: step: 892/466, loss: 0.6794193387031555 2023-01-22 12:19:04.963905: step: 894/466, loss: 0.6911655068397522 2023-01-22 12:19:05.709143: step: 896/466, loss: 0.740145742893219 2023-01-22 12:19:06.470510: step: 898/466, loss: 0.2256525754928589 2023-01-22 12:19:07.315880: step: 900/466, loss: 0.22152604162693024 2023-01-22 12:19:08.107008: step: 902/466, loss: 0.4137997031211853 2023-01-22 12:19:08.777194: step: 904/466, loss: 0.3823583722114563 2023-01-22 12:19:09.529681: step: 906/466, loss: 0.3872143626213074 2023-01-22 12:19:10.247705: step: 908/466, loss: 2.1703059673309326 2023-01-22 12:19:10.995086: step: 910/466, loss: 0.579026460647583 2023-01-22 12:19:11.696456: step: 912/466, loss: 2.8523318767547607 2023-01-22 12:19:12.596475: step: 914/466, loss: 0.2878245711326599 2023-01-22 12:19:13.326893: step: 916/466, loss: 0.6961103677749634 2023-01-22 12:19:14.109184: step: 918/466, loss: 0.396592378616333 2023-01-22 12:19:14.889345: step: 920/466, loss: 0.8142573833465576 2023-01-22 12:19:15.731303: step: 922/466, loss: 0.35109513998031616 2023-01-22 12:19:16.564699: step: 924/466, loss: 0.20952901244163513 2023-01-22 12:19:17.366546: step: 926/466, loss: 1.3562030792236328 2023-01-22 12:19:18.183740: step: 
928/466, loss: 0.8991647958755493 2023-01-22 12:18:18.936939: step: 930/466, loss: 0.5274887084960938 2023-01-22 12:18:19.590361: step: 932/466, loss: 0.24364523589611053
==================================================
Loss: 0.759
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30341682879377435, 'r': 0.2959321631878558, 'f1': 0.29962776176753125}, 'combined': 0.22077835077607566, 'epoch': 6}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3189413328323066, 'r': 0.26702707259639863, 'f1': 0.2906845135238835}, 'combined': 0.17866462782443573, 'epoch': 6}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28090543453490296, 'r': 0.30435863969531235, 'f1': 0.2921621186146259}, 'combined': 0.21527735055814537, 'epoch': 6}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.30594728031398444, 'r': 0.2776286306655447, 'f1': 0.2911008590016959}, 'combined': 0.17892052797177407, 'epoch': 6}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3062257281553398, 'r': 0.29925284629981025, 'f1': 0.3026991362763916}, 'combined': 0.2230414688352359, 'epoch': 6}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.31919154407940425, 'r': 0.27225161112655066, 'f1': 0.2938588818508801}, 'combined': 0.18150107408436716, 'epoch': 6}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2755102040816326, 'r': 0.38571428571428573, 'f1': 0.32142857142857145}, 'combined': 0.2142857142857143, 'epoch': 6}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3125, 'r': 0.43478260869565216, 'f1': 0.36363636363636365}, 'combined': 0.18181818181818182, 'epoch': 6}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3076923076923077, 'r': 0.13793103448275862, 'f1': 0.1904761904761905}, 'combined': 0.12698412698412698, 'epoch': 6}
New best chinese model...
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30341682879377435, 'r': 0.2959321631878558, 'f1': 0.29962776176753125}, 'combined': 0.22077835077607566, 'epoch': 6}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3189413328323066, 'r': 0.26702707259639863, 'f1': 0.2906845135238835}, 'combined': 0.17866462782443573, 'epoch': 6}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2755102040816326, 'r': 0.38571428571428573, 'f1': 0.32142857142857145}, 'combined': 0.2142857142857143, 'epoch': 6}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28090543453490296, 'r': 0.30435863969531235, 'f1': 0.2921621186146259}, 'combined': 0.21527735055814537, 'epoch': 6}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.30594728031398444, 'r': 0.2776286306655447, 'f1': 0.2911008590016959}, 'combined': 0.17892052797177407, 'epoch': 6}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3125, 'r': 0.43478260869565216, 'f1': 0.36363636363636365}, 'combined': 0.18181818181818182, 'epoch': 6}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32328770863505785, 'r': 0.3355566918849651, 'f1': 0.3293079639169025}, 'combined': 0.24264797341245445, 'epoch': 5}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3542723554470683, 'r': 0.23076737340107478, 'f1': 0.2794835868534756}, 'combined': 0.17262221540949965, 'epoch': 5}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 5}
******************************
Epoch: 7
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:22:18.174549: step: 2/466, loss: 0.29334014654159546 2023-01-22 12:22:18.882413: step: 4/466, loss: 1.4716475009918213 2023-01-22 12:22:19.677113: step: 6/466, loss: 0.577203631401062 2023-01-22 12:22:20.502246: step: 8/466, loss: 0.7299621105194092 2023-01-22 12:22:21.241074: step: 10/466, loss: 0.1240634173154831 2023-01-22 12:22:22.026137: step: 12/466, loss: 0.44954681396484375 2023-01-22 12:22:22.699765: step: 14/466, loss: 0.3550652265548706 2023-01-22 12:22:23.420909: step: 16/466, loss: 1.110145092010498 2023-01-22 12:22:24.191849: step: 18/466, loss: 3.7168407440185547 2023-01-22 12:22:24.994250: step: 20/466, loss: 0.1842595785856247 2023-01-22 12:22:25.729390: step: 22/466, loss: 0.29075542092323303 2023-01-22 12:22:26.501123: step: 24/466, loss: 1.1452717781066895 2023-01-22 12:22:27.224003: step: 26/466, loss: 2.3476266860961914 2023-01-22 12:22:28.047054: step: 28/466, loss: 0.23660674691200256 2023-01-22 12:22:28.789884: step: 30/466, loss: 0.48752743005752563 2023-01-22 12:22:29.535377: step: 32/466, loss: 0.33039748668670654 2023-01-22 12:22:30.265077: step: 34/466, loss: 0.5670379400253296 2023-01-22 12:22:31.013829: step: 36/466, loss: 0.4027182459831238 2023-01-22 12:22:31.801346: step: 38/466, loss: 0.24127821624279022 2023-01-22 12:22:32.510199: step: 40/466, loss:
1.246384859085083 2023-01-22 12:22:33.298693: step: 42/466, loss: 0.19768026471138 2023-01-22 12:22:34.157532: step: 44/466, loss: 0.3364740014076233 2023-01-22 12:22:34.901035: step: 46/466, loss: 1.4924445152282715 2023-01-22 12:22:35.698757: step: 48/466, loss: 0.6510562896728516 2023-01-22 12:22:36.514621: step: 50/466, loss: 0.5539807081222534 2023-01-22 12:22:37.305229: step: 52/466, loss: 0.6930699348449707 2023-01-22 12:22:38.254253: step: 54/466, loss: 1.2030017375946045 2023-01-22 12:22:39.185234: step: 56/466, loss: 0.20161156356334686 2023-01-22 12:22:39.870477: step: 58/466, loss: 0.37259748578071594 2023-01-22 12:22:40.634802: step: 60/466, loss: 0.4586893320083618 2023-01-22 12:22:41.324168: step: 62/466, loss: 0.4073813855648041 2023-01-22 12:22:42.007501: step: 64/466, loss: 0.2164788842201233 2023-01-22 12:22:42.800222: step: 66/466, loss: 0.8396023511886597 2023-01-22 12:22:43.551030: step: 68/466, loss: 0.37763679027557373 2023-01-22 12:22:44.321298: step: 70/466, loss: 1.0030779838562012 2023-01-22 12:22:45.151948: step: 72/466, loss: 0.2680083215236664 2023-01-22 12:22:45.946997: step: 74/466, loss: 0.2280387282371521 2023-01-22 12:22:46.646247: step: 76/466, loss: 0.3881959915161133 2023-01-22 12:22:47.474029: step: 78/466, loss: 0.7834279537200928 2023-01-22 12:22:48.229005: step: 80/466, loss: 0.10552302747964859 2023-01-22 12:22:49.092803: step: 82/466, loss: 0.2874581813812256 2023-01-22 12:22:49.835840: step: 84/466, loss: 0.22320303320884705 2023-01-22 12:22:50.582187: step: 86/466, loss: 0.22684840857982635 2023-01-22 12:22:51.339188: step: 88/466, loss: 0.15201908349990845 2023-01-22 12:22:52.083125: step: 90/466, loss: 0.4113709628582001 2023-01-22 12:22:52.936245: step: 92/466, loss: 0.35881781578063965 2023-01-22 12:22:53.633758: step: 94/466, loss: 0.1826227754354477 2023-01-22 12:22:54.542392: step: 96/466, loss: 0.48937031626701355 2023-01-22 12:22:55.268221: step: 98/466, loss: 0.4062560498714447 2023-01-22 12:22:55.982928: 
step: 100/466, loss: 0.4596622884273529 2023-01-22 12:22:56.767909: step: 102/466, loss: 0.2600345015525818 2023-01-22 12:22:57.536264: step: 104/466, loss: 0.36462166905403137 2023-01-22 12:22:58.265069: step: 106/466, loss: 0.24826651811599731 2023-01-22 12:22:58.921535: step: 108/466, loss: 0.27323004603385925 2023-01-22 12:22:59.681773: step: 110/466, loss: 0.9805687069892883 2023-01-22 12:23:00.359137: step: 112/466, loss: 0.12850280106067657 2023-01-22 12:23:01.096076: step: 114/466, loss: 0.16463781893253326 2023-01-22 12:23:01.816873: step: 116/466, loss: 0.338055282831192 2023-01-22 12:23:02.557879: step: 118/466, loss: 0.4716726839542389 2023-01-22 12:23:03.267365: step: 120/466, loss: 0.1965942233800888 2023-01-22 12:23:04.001007: step: 122/466, loss: 0.2254495918750763 2023-01-22 12:23:04.769404: step: 124/466, loss: 0.5136730670928955 2023-01-22 12:23:05.622440: step: 126/466, loss: 0.1527450531721115 2023-01-22 12:23:06.407155: step: 128/466, loss: 0.8943018913269043 2023-01-22 12:23:07.160463: step: 130/466, loss: 0.545354962348938 2023-01-22 12:23:07.971458: step: 132/466, loss: 0.6113411784172058 2023-01-22 12:23:08.680684: step: 134/466, loss: 0.13481000065803528 2023-01-22 12:23:09.446754: step: 136/466, loss: 0.6991126537322998 2023-01-22 12:23:10.188175: step: 138/466, loss: 0.19909624755382538 2023-01-22 12:23:11.005585: step: 140/466, loss: 1.2885040044784546 2023-01-22 12:23:11.701673: step: 142/466, loss: 0.0988985151052475 2023-01-22 12:23:12.483384: step: 144/466, loss: 0.1780286580324173 2023-01-22 12:23:13.255638: step: 146/466, loss: 0.08792361617088318 2023-01-22 12:23:13.962650: step: 148/466, loss: 0.6058153510093689 2023-01-22 12:23:14.675540: step: 150/466, loss: 0.31548625230789185 2023-01-22 12:23:15.518949: step: 152/466, loss: 0.8992829918861389 2023-01-22 12:23:16.281183: step: 154/466, loss: 0.1858549416065216 2023-01-22 12:23:17.030197: step: 156/466, loss: 0.2229134887456894 2023-01-22 12:23:17.709070: step: 158/466, loss: 
0.4137539565563202 2023-01-22 12:23:18.494195: step: 160/466, loss: 0.47563108801841736 2023-01-22 12:23:19.165681: step: 162/466, loss: 0.1360701024532318 2023-01-22 12:23:19.885236: step: 164/466, loss: 0.13130144774913788 2023-01-22 12:23:20.661078: step: 166/466, loss: 0.6715490818023682 2023-01-22 12:23:21.396819: step: 168/466, loss: 0.28572914004325867 2023-01-22 12:23:22.099204: step: 170/466, loss: 0.2678864598274231 2023-01-22 12:23:22.900045: step: 172/466, loss: 0.509308397769928 2023-01-22 12:23:23.556898: step: 174/466, loss: 0.3618375062942505 2023-01-22 12:23:24.354772: step: 176/466, loss: 0.5390142798423767 2023-01-22 12:23:25.186932: step: 178/466, loss: 0.45017820596694946 2023-01-22 12:23:25.900726: step: 180/466, loss: 0.37983590364456177 2023-01-22 12:23:26.662229: step: 182/466, loss: 0.6454115509986877 2023-01-22 12:23:27.483629: step: 184/466, loss: 0.8248163461685181 2023-01-22 12:23:28.440530: step: 186/466, loss: 0.26074397563934326 2023-01-22 12:23:29.187378: step: 188/466, loss: 0.22187507152557373 2023-01-22 12:23:29.931924: step: 190/466, loss: 0.311760276556015 2023-01-22 12:23:30.668203: step: 192/466, loss: 0.3976534903049469 2023-01-22 12:23:31.404545: step: 194/466, loss: 0.8048707842826843 2023-01-22 12:23:32.189994: step: 196/466, loss: 1.1141060590744019 2023-01-22 12:23:33.019639: step: 198/466, loss: 0.9119713306427002 2023-01-22 12:23:33.865687: step: 200/466, loss: 0.4953900873661041 2023-01-22 12:23:34.558738: step: 202/466, loss: 0.08808280527591705 2023-01-22 12:23:35.231195: step: 204/466, loss: 0.44729456305503845 2023-01-22 12:23:36.080074: step: 206/466, loss: 1.191168189048767 2023-01-22 12:23:36.855770: step: 208/466, loss: 0.2583087384700775 2023-01-22 12:23:37.595267: step: 210/466, loss: 0.38636747002601624 2023-01-22 12:23:38.272693: step: 212/466, loss: 0.09954768419265747 2023-01-22 12:23:39.038996: step: 214/466, loss: 0.20439481735229492 2023-01-22 12:23:39.783420: step: 216/466, loss: 0.6137973070144653 
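A note on the evaluation dicts in this log: in every block, the 'combined' value equals the product of the template F1 and the slot F1, and each 'f1' is the usual harmonic mean of 'p' and 'r'. This is inferred from the logged numbers only (the training code itself is not shown here), so treat the sketch below as a reconstruction, not the actual implementation in train.py:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """Product of template F1 and slot F1, matching the logged 'combined' values."""
    return template['f1'] * slot['f1']

# Reproduce the epoch-6 "Dev Chinese" entry from the summary above.
dev_chinese = {
    'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579},
    'slot': {'p': 0.30341682879377435, 'r': 0.2959321631878558,
             'f1': 0.29962776176753125},
}
print(f1(1.0, 0.5833333333333334))                 # matches the template 'f1'
print(combined_score(dev_chinese['template'],
                     dev_chinese['slot']))          # matches 'combined'
```

The same relation holds for every Dev/Test/Sample block in the log (e.g. Test Chinese: 0.6146341463414634 × 0.2906845135238835 ≈ 0.17866462782443573).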
2023-01-22 12:23:40.573084: step: 218/466, loss: 0.5198429226875305 2023-01-22 12:23:41.381651: step: 220/466, loss: 0.39407676458358765 2023-01-22 12:23:42.145473: step: 222/466, loss: 0.9188137650489807 2023-01-22 12:23:42.874407: step: 224/466, loss: 0.1870497763156891 2023-01-22 12:23:43.639894: step: 226/466, loss: 0.20046013593673706 2023-01-22 12:23:44.484839: step: 228/466, loss: 0.7507703304290771 2023-01-22 12:23:45.332515: step: 230/466, loss: 0.23928441107273102 2023-01-22 12:23:46.084075: step: 232/466, loss: 1.1374417543411255 2023-01-22 12:23:46.819894: step: 234/466, loss: 0.1974034458398819 2023-01-22 12:23:47.612779: step: 236/466, loss: 0.793383002281189 2023-01-22 12:23:48.403108: step: 238/466, loss: 0.32755231857299805 2023-01-22 12:23:49.204073: step: 240/466, loss: 0.9163374304771423 2023-01-22 12:23:49.962433: step: 242/466, loss: 0.3684721887111664 2023-01-22 12:23:50.685386: step: 244/466, loss: 0.9938925504684448 2023-01-22 12:23:51.488381: step: 246/466, loss: 0.2630435824394226 2023-01-22 12:23:52.382779: step: 248/466, loss: 0.9959964752197266 2023-01-22 12:23:53.166686: step: 250/466, loss: 0.26044443249702454 2023-01-22 12:23:53.861933: step: 252/466, loss: 0.3744111955165863 2023-01-22 12:23:54.711754: step: 254/466, loss: 0.2715119421482086 2023-01-22 12:23:55.532359: step: 256/466, loss: 0.31871679425239563 2023-01-22 12:23:56.291873: step: 258/466, loss: 0.2549704313278198 2023-01-22 12:23:57.052690: step: 260/466, loss: 0.1475781798362732 2023-01-22 12:23:57.772974: step: 262/466, loss: 0.29128512740135193 2023-01-22 12:23:58.723546: step: 264/466, loss: 0.18899090588092804 2023-01-22 12:23:59.502208: step: 266/466, loss: 0.8161706924438477 2023-01-22 12:24:00.285671: step: 268/466, loss: 0.6499884724617004 2023-01-22 12:24:01.074050: step: 270/466, loss: 0.6960606575012207 2023-01-22 12:24:01.758067: step: 272/466, loss: 0.5593203902244568 2023-01-22 12:24:02.508343: step: 274/466, loss: 0.27781498432159424 2023-01-22 
12:24:03.295273: step: 276/466, loss: 0.3534137010574341 2023-01-22 12:24:04.069098: step: 278/466, loss: 0.29205551743507385 2023-01-22 12:24:04.844514: step: 280/466, loss: 0.846316397190094 2023-01-22 12:24:05.603761: step: 282/466, loss: 0.14797724783420563 2023-01-22 12:24:06.322616: step: 284/466, loss: 1.0052825212478638 2023-01-22 12:24:07.034925: step: 286/466, loss: 1.1280808448791504 2023-01-22 12:24:07.763389: step: 288/466, loss: 0.1440826803445816 2023-01-22 12:24:08.530272: step: 290/466, loss: 0.1783849447965622 2023-01-22 12:24:09.342798: step: 292/466, loss: 0.4393688142299652 2023-01-22 12:24:10.078314: step: 294/466, loss: 0.6713282465934753 2023-01-22 12:24:10.815572: step: 296/466, loss: 0.7822484374046326 2023-01-22 12:24:11.536064: step: 298/466, loss: 0.3282606899738312 2023-01-22 12:24:12.336244: step: 300/466, loss: 0.6223496794700623 2023-01-22 12:24:13.052323: step: 302/466, loss: 1.1325042247772217 2023-01-22 12:24:13.807691: step: 304/466, loss: 1.7857834100723267 2023-01-22 12:24:14.608287: step: 306/466, loss: 0.8472108840942383 2023-01-22 12:24:15.377710: step: 308/466, loss: 0.7359709739685059 2023-01-22 12:24:16.101313: step: 310/466, loss: 0.7195465564727783 2023-01-22 12:24:16.812438: step: 312/466, loss: 2.230241060256958 2023-01-22 12:24:17.715540: step: 314/466, loss: 0.25023576617240906 2023-01-22 12:24:18.489918: step: 316/466, loss: 0.30060118436813354 2023-01-22 12:24:19.259742: step: 318/466, loss: 0.2879800498485565 2023-01-22 12:24:20.112146: step: 320/466, loss: 0.44610944390296936 2023-01-22 12:24:20.926111: step: 322/466, loss: 0.23804304003715515 2023-01-22 12:24:21.689196: step: 324/466, loss: 0.3681897222995758 2023-01-22 12:24:22.483673: step: 326/466, loss: 2.706002712249756 2023-01-22 12:24:23.337503: step: 328/466, loss: 4.632607460021973 2023-01-22 12:24:24.072466: step: 330/466, loss: 0.1979474425315857 2023-01-22 12:24:24.878265: step: 332/466, loss: 0.9591643810272217 2023-01-22 12:24:25.628960: step: 
334/466, loss: 0.2160029262304306 2023-01-22 12:24:26.449693: step: 336/466, loss: 0.7066240310668945 2023-01-22 12:24:27.283277: step: 338/466, loss: 0.17320756614208221 2023-01-22 12:24:28.018996: step: 340/466, loss: 0.7314639091491699 2023-01-22 12:24:28.808622: step: 342/466, loss: 0.7041606307029724 2023-01-22 12:24:29.515067: step: 344/466, loss: 0.2489834874868393 2023-01-22 12:24:30.260770: step: 346/466, loss: 0.16589593887329102 2023-01-22 12:24:31.006856: step: 348/466, loss: 0.14014838635921478 2023-01-22 12:24:31.872405: step: 350/466, loss: 0.6268644332885742 2023-01-22 12:24:32.659383: step: 352/466, loss: 0.42653533816337585 2023-01-22 12:24:33.392098: step: 354/466, loss: 0.14306007325649261 2023-01-22 12:24:34.136239: step: 356/466, loss: 0.6861312985420227 2023-01-22 12:24:35.062616: step: 358/466, loss: 0.8640709519386292 2023-01-22 12:24:35.873073: step: 360/466, loss: 0.08862889558076859 2023-01-22 12:24:36.611921: step: 362/466, loss: 0.3018539249897003 2023-01-22 12:24:37.324439: step: 364/466, loss: 0.8258170485496521 2023-01-22 12:24:38.095235: step: 366/466, loss: 0.47512322664260864 2023-01-22 12:24:38.801462: step: 368/466, loss: 1.041671872138977 2023-01-22 12:24:39.680052: step: 370/466, loss: 3.5983853340148926 2023-01-22 12:24:40.506336: step: 372/466, loss: 0.19122132658958435 2023-01-22 12:24:41.208468: step: 374/466, loss: 0.1812903732061386 2023-01-22 12:24:41.925835: step: 376/466, loss: 1.144923210144043 2023-01-22 12:24:42.687737: step: 378/466, loss: 0.2440568506717682 2023-01-22 12:24:43.419321: step: 380/466, loss: 0.5372747182846069 2023-01-22 12:24:44.241632: step: 382/466, loss: 0.4147646427154541 2023-01-22 12:24:45.028808: step: 384/466, loss: 0.43357139825820923 2023-01-22 12:24:45.816910: step: 386/466, loss: 0.37777167558670044 2023-01-22 12:24:46.561870: step: 388/466, loss: 0.2658584415912628 2023-01-22 12:24:47.368031: step: 390/466, loss: 1.4009718894958496 2023-01-22 12:24:48.102990: step: 392/466, loss: 
1.075201392173767 2023-01-22 12:24:48.890175: step: 394/466, loss: 0.4344013035297394 2023-01-22 12:24:49.653833: step: 396/466, loss: 0.5745485424995422 2023-01-22 12:24:50.489082: step: 398/466, loss: 0.6046426296234131 2023-01-22 12:24:51.325848: step: 400/466, loss: 0.25382694602012634 2023-01-22 12:24:52.116758: step: 402/466, loss: 0.6577667593955994 2023-01-22 12:24:52.876140: step: 404/466, loss: 1.6412444114685059 2023-01-22 12:24:53.680003: step: 406/466, loss: 0.5959732532501221 2023-01-22 12:24:54.423762: step: 408/466, loss: 0.2520572245121002 2023-01-22 12:24:55.254826: step: 410/466, loss: 0.5502501130104065 2023-01-22 12:24:55.982054: step: 412/466, loss: 0.24335679411888123 2023-01-22 12:24:56.779839: step: 414/466, loss: 0.6526586413383484 2023-01-22 12:24:57.596188: step: 416/466, loss: 0.5386345386505127 2023-01-22 12:24:58.411305: step: 418/466, loss: 0.12898360192775726 2023-01-22 12:24:59.195351: step: 420/466, loss: 0.43213316798210144 2023-01-22 12:24:59.900611: step: 422/466, loss: 0.22789394855499268 2023-01-22 12:25:00.596508: step: 424/466, loss: 0.1911413073539734 2023-01-22 12:25:01.386047: step: 426/466, loss: 0.479383260011673 2023-01-22 12:25:02.182173: step: 428/466, loss: 0.24068160355091095 2023-01-22 12:25:02.896743: step: 430/466, loss: 0.36104321479797363 2023-01-22 12:25:03.623470: step: 432/466, loss: 0.24652445316314697 2023-01-22 12:25:04.365154: step: 434/466, loss: 0.4063998758792877 2023-01-22 12:25:05.160238: step: 436/466, loss: 0.4729330241680145 2023-01-22 12:25:05.937132: step: 438/466, loss: 0.48919734358787537 2023-01-22 12:25:06.645731: step: 440/466, loss: 0.23226898908615112 2023-01-22 12:25:07.409841: step: 442/466, loss: 0.22790449857711792 2023-01-22 12:25:08.179203: step: 444/466, loss: 1.0022603273391724 2023-01-22 12:25:08.962191: step: 446/466, loss: 0.36382633447647095 2023-01-22 12:25:09.758604: step: 448/466, loss: 0.5031176209449768 2023-01-22 12:25:10.584769: step: 450/466, loss: 
0.6753062009811401 2023-01-22 12:25:11.318894: step: 452/466, loss: 0.3563656806945801 2023-01-22 12:25:12.127989: step: 454/466, loss: 0.2503567039966583 2023-01-22 12:25:12.947607: step: 456/466, loss: 0.28261706233024597 2023-01-22 12:25:13.686075: step: 458/466, loss: 0.3007924258708954 2023-01-22 12:25:14.395614: step: 460/466, loss: 0.6131289601325989 2023-01-22 12:25:15.255328: step: 462/466, loss: 0.8440264463424683 2023-01-22 12:25:16.032975: step: 464/466, loss: 0.46553340554237366 2023-01-22 12:25:16.776898: step: 466/466, loss: 0.6256039142608643 2023-01-22 12:25:17.544687: step: 468/466, loss: 0.37587013840675354 2023-01-22 12:25:18.356453: step: 470/466, loss: 0.39992663264274597 2023-01-22 12:25:19.044168: step: 472/466, loss: 0.1894497126340866 2023-01-22 12:25:19.821661: step: 474/466, loss: 0.34886282682418823 2023-01-22 12:25:20.591911: step: 476/466, loss: 0.12024272233247757 2023-01-22 12:25:21.321893: step: 478/466, loss: 0.23778888583183289 2023-01-22 12:25:22.070168: step: 480/466, loss: 0.42834538221359253 2023-01-22 12:25:22.802979: step: 482/466, loss: 0.1702658087015152 2023-01-22 12:25:23.497546: step: 484/466, loss: 0.16342906653881073 2023-01-22 12:25:24.254307: step: 486/466, loss: 0.27307936549186707 2023-01-22 12:25:25.068819: step: 488/466, loss: 0.5578143000602722 2023-01-22 12:25:25.909561: step: 490/466, loss: 1.0116239786148071 2023-01-22 12:25:26.702731: step: 492/466, loss: 0.7227616906166077 2023-01-22 12:25:27.470675: step: 494/466, loss: 0.46264591813087463 2023-01-22 12:25:28.159043: step: 496/466, loss: 0.33290180563926697 2023-01-22 12:25:28.974057: step: 498/466, loss: 0.3694120943546295 2023-01-22 12:25:29.742027: step: 500/466, loss: 1.8301973342895508 2023-01-22 12:25:30.470545: step: 502/466, loss: 0.5415647029876709 2023-01-22 12:25:31.215641: step: 504/466, loss: 0.2798643708229065 2023-01-22 12:25:31.945674: step: 506/466, loss: 0.2467707097530365 2023-01-22 12:25:32.712823: step: 508/466, loss: 
0.6890528798103333 2023-01-22 12:25:33.442403: step: 510/466, loss: 1.0509618520736694 2023-01-22 12:25:34.183815: step: 512/466, loss: 0.878806471824646 2023-01-22 12:25:34.920224: step: 514/466, loss: 0.39082100987434387 2023-01-22 12:25:35.599379: step: 516/466, loss: 0.4234006404876709 2023-01-22 12:25:36.359927: step: 518/466, loss: 0.4060421288013458 2023-01-22 12:25:37.213645: step: 520/466, loss: 0.2576528489589691 2023-01-22 12:25:38.001672: step: 522/466, loss: 1.0605449676513672 2023-01-22 12:25:38.749710: step: 524/466, loss: 0.0857195034623146 2023-01-22 12:25:39.462967: step: 526/466, loss: 0.44455015659332275 2023-01-22 12:25:40.258095: step: 528/466, loss: 0.4080895483493805 2023-01-22 12:25:40.936700: step: 530/466, loss: 0.3870290219783783 2023-01-22 12:25:41.726704: step: 532/466, loss: 0.4184207320213318 2023-01-22 12:25:42.378451: step: 534/466, loss: 0.150475412607193 2023-01-22 12:25:43.153804: step: 536/466, loss: 0.482689768075943 2023-01-22 12:25:43.936355: step: 538/466, loss: 0.2870560884475708 2023-01-22 12:25:44.737266: step: 540/466, loss: 0.12939991056919098 2023-01-22 12:25:45.570666: step: 542/466, loss: 0.459214448928833 2023-01-22 12:25:46.349320: step: 544/466, loss: 0.36503612995147705 2023-01-22 12:25:47.100287: step: 546/466, loss: 0.24366234242916107 2023-01-22 12:25:47.880537: step: 548/466, loss: 0.9944776296615601 2023-01-22 12:25:48.645274: step: 550/466, loss: 0.49947139620780945 2023-01-22 12:25:49.394143: step: 552/466, loss: 0.5132686495780945 2023-01-22 12:25:50.122744: step: 554/466, loss: 0.8956375122070312 2023-01-22 12:25:50.931062: step: 556/466, loss: 2.3283026218414307 2023-01-22 12:25:51.753158: step: 558/466, loss: 0.12661069631576538 2023-01-22 12:25:52.493725: step: 560/466, loss: 0.18945397436618805 2023-01-22 12:25:53.271104: step: 562/466, loss: 0.8392725586891174 2023-01-22 12:25:54.064391: step: 564/466, loss: 0.1644848883152008 2023-01-22 12:25:54.826924: step: 566/466, loss: 0.18167704343795776 
2023-01-22 12:25:55.623399: step: 568/466, loss: 0.17718948423862457 2023-01-22 12:25:56.405613: step: 570/466, loss: 0.673308253288269 2023-01-22 12:25:57.147006: step: 572/466, loss: 0.12600082159042358 2023-01-22 12:25:57.987429: step: 574/466, loss: 0.592653751373291 2023-01-22 12:25:58.708998: step: 576/466, loss: 0.603567361831665 2023-01-22 12:25:59.507681: step: 578/466, loss: 0.3641091287136078 2023-01-22 12:26:00.228819: step: 580/466, loss: 0.2526349425315857 2023-01-22 12:26:00.953707: step: 582/466, loss: 0.4343501329421997 2023-01-22 12:26:01.764273: step: 584/466, loss: 0.2535267770290375 2023-01-22 12:26:02.560422: step: 586/466, loss: 0.3185502886772156 2023-01-22 12:26:03.356563: step: 588/466, loss: 0.13986481726169586 2023-01-22 12:26:04.119405: step: 590/466, loss: 0.9744532108306885 2023-01-22 12:26:04.931004: step: 592/466, loss: 0.2984525263309479 2023-01-22 12:26:05.735178: step: 594/466, loss: 0.25249266624450684 2023-01-22 12:26:06.449652: step: 596/466, loss: 0.13529881834983826 2023-01-22 12:26:07.185007: step: 598/466, loss: 1.1212987899780273 2023-01-22 12:26:07.864357: step: 600/466, loss: 0.8201851844787598 2023-01-22 12:26:08.701510: step: 602/466, loss: 0.25969186425209045 2023-01-22 12:26:09.384190: step: 604/466, loss: 0.19115287065505981 2023-01-22 12:26:10.134740: step: 606/466, loss: 0.11920665949583054 2023-01-22 12:26:10.816642: step: 608/466, loss: 0.09032613784074783 2023-01-22 12:26:11.637774: step: 610/466, loss: 0.7254239320755005 2023-01-22 12:26:12.336099: step: 612/466, loss: 0.2879403829574585 2023-01-22 12:26:13.103358: step: 614/466, loss: 0.5039555430412292 2023-01-22 12:26:13.805763: step: 616/466, loss: 0.3236883580684662 2023-01-22 12:26:14.547431: step: 618/466, loss: 0.45721375942230225 2023-01-22 12:26:15.336196: step: 620/466, loss: 0.883962869644165 2023-01-22 12:26:16.145130: step: 622/466, loss: 1.2909221649169922 2023-01-22 12:26:16.898345: step: 624/466, loss: 0.27289527654647827 2023-01-22 
12:26:17.658089: step: 626/466, loss: 0.45594915747642517 2023-01-22 12:26:18.372499: step: 628/466, loss: 0.2602394223213196 2023-01-22 12:26:19.142595: step: 630/466, loss: 0.2512398064136505 2023-01-22 12:26:19.877099: step: 632/466, loss: 0.1512545347213745 2023-01-22 12:26:20.642455: step: 634/466, loss: 0.2594332695007324 2023-01-22 12:26:21.435221: step: 636/466, loss: 0.26782238483428955 2023-01-22 12:26:22.112408: step: 638/466, loss: 0.28907614946365356 2023-01-22 12:26:22.905207: step: 640/466, loss: 0.3315509855747223 2023-01-22 12:26:23.709149: step: 642/466, loss: 0.6611424088478088 2023-01-22 12:26:24.611186: step: 644/466, loss: 0.27497512102127075 2023-01-22 12:26:25.323974: step: 646/466, loss: 0.729381799697876 2023-01-22 12:26:26.147024: step: 648/466, loss: 1.2339369058609009 2023-01-22 12:26:26.929470: step: 650/466, loss: 0.3203907012939453 2023-01-22 12:26:27.719825: step: 652/466, loss: 0.9287397861480713 2023-01-22 12:26:28.535996: step: 654/466, loss: 1.2124075889587402 2023-01-22 12:26:29.309319: step: 656/466, loss: 0.11122621595859528 2023-01-22 12:26:30.112428: step: 658/466, loss: 0.24004901945590973 2023-01-22 12:26:30.811712: step: 660/466, loss: 0.6141506433486938 2023-01-22 12:26:31.605749: step: 662/466, loss: 1.6464180946350098 2023-01-22 12:26:32.377258: step: 664/466, loss: 0.20204073190689087 2023-01-22 12:26:33.160702: step: 666/466, loss: 1.0337920188903809 2023-01-22 12:26:33.996613: step: 668/466, loss: 0.8603740930557251 2023-01-22 12:26:34.838464: step: 670/466, loss: 0.35860171914100647 2023-01-22 12:26:35.604750: step: 672/466, loss: 0.19825685024261475 2023-01-22 12:26:36.375645: step: 674/466, loss: 0.9691605567932129 2023-01-22 12:26:37.137686: step: 676/466, loss: 0.7861379981040955 2023-01-22 12:26:37.839186: step: 678/466, loss: 0.6022818088531494 2023-01-22 12:26:38.604768: step: 680/466, loss: 0.32521283626556396 2023-01-22 12:26:39.334406: step: 682/466, loss: 0.5121086835861206 2023-01-22 12:26:40.073462: 
step: 684/466, loss: 0.5897647142410278 2023-01-22 12:26:40.785424: step: 686/466, loss: 0.3449326753616333 2023-01-22 12:26:41.535463: step: 688/466, loss: 0.1923578381538391 2023-01-22 12:26:42.246169: step: 690/466, loss: 0.21271264553070068 2023-01-22 12:26:43.004291: step: 692/466, loss: 0.6504039168357849 2023-01-22 12:26:43.826533: step: 694/466, loss: 1.4073481559753418 2023-01-22 12:26:44.617849: step: 696/466, loss: 0.41430070996284485 2023-01-22 12:26:45.368656: step: 698/466, loss: 0.16300787031650543 2023-01-22 12:26:46.173967: step: 700/466, loss: 0.3126220405101776 2023-01-22 12:26:46.987662: step: 702/466, loss: 4.565176963806152 2023-01-22 12:26:47.723993: step: 704/466, loss: 1.1175744533538818 2023-01-22 12:26:48.618911: step: 706/466, loss: 0.32288840413093567 2023-01-22 12:26:49.339172: step: 708/466, loss: 0.24756182730197906 2023-01-22 12:26:50.101964: step: 710/466, loss: 0.26558712124824524 2023-01-22 12:26:50.865467: step: 712/466, loss: 0.2713622748851776 2023-01-22 12:26:51.692459: step: 714/466, loss: 0.3009144067764282 2023-01-22 12:26:52.400384: step: 716/466, loss: 0.9786598682403564 2023-01-22 12:26:53.292196: step: 718/466, loss: 0.5153631567955017 2023-01-22 12:26:54.072907: step: 720/466, loss: 0.2356444001197815 2023-01-22 12:26:54.770821: step: 722/466, loss: 1.4491140842437744 2023-01-22 12:26:55.522362: step: 724/466, loss: 0.16535405814647675 2023-01-22 12:26:56.282011: step: 726/466, loss: 0.4651423692703247 2023-01-22 12:26:57.039251: step: 728/466, loss: 0.1597919464111328 2023-01-22 12:26:57.798818: step: 730/466, loss: 0.18197688460350037 2023-01-22 12:26:58.567304: step: 732/466, loss: 0.11129917204380035 2023-01-22 12:26:59.295623: step: 734/466, loss: 1.0428858995437622 2023-01-22 12:26:59.983358: step: 736/466, loss: 0.7729167938232422 2023-01-22 12:27:00.781431: step: 738/466, loss: 0.12490782141685486 2023-01-22 12:27:01.612677: step: 740/466, loss: 0.971163809299469 2023-01-22 12:27:02.357972: step: 742/466, 
loss: 0.688123345375061 2023-01-22 12:27:03.160804: step: 744/466, loss: 1.1513824462890625 2023-01-22 12:27:03.961037: step: 746/466, loss: 0.2143803834915161 2023-01-22 12:27:04.676696: step: 748/466, loss: 0.5400157570838928 2023-01-22 12:27:05.380137: step: 750/466, loss: 0.19938690960407257 2023-01-22 12:27:06.125424: step: 752/466, loss: 1.1824733018875122 2023-01-22 12:27:06.818008: step: 754/466, loss: 0.8320244550704956 2023-01-22 12:27:07.684321: step: 756/466, loss: 0.6107889413833618 2023-01-22 12:27:08.409385: step: 758/466, loss: 0.11197170615196228 2023-01-22 12:27:09.216681: step: 760/466, loss: 0.34392377734184265 2023-01-22 12:27:10.051679: step: 762/466, loss: 0.3480488955974579 2023-01-22 12:27:10.892088: step: 764/466, loss: 0.20735707879066467 2023-01-22 12:27:11.673326: step: 766/466, loss: 0.49595823884010315 2023-01-22 12:27:12.454039: step: 768/466, loss: 1.4405468702316284 2023-01-22 12:27:13.209747: step: 770/466, loss: 0.26072534918785095 2023-01-22 12:27:13.967022: step: 772/466, loss: 0.48735663294792175 2023-01-22 12:27:14.757410: step: 774/466, loss: 0.4617172181606293 2023-01-22 12:27:15.502139: step: 776/466, loss: 0.6967835426330566 2023-01-22 12:27:16.355794: step: 778/466, loss: 1.1913807392120361 2023-01-22 12:27:17.062174: step: 780/466, loss: 0.21358001232147217 2023-01-22 12:27:17.773666: step: 782/466, loss: 0.9233065843582153 2023-01-22 12:27:18.538614: step: 784/466, loss: 0.71363365650177 2023-01-22 12:27:19.382284: step: 786/466, loss: 0.7549607753753662 2023-01-22 12:27:20.139280: step: 788/466, loss: 0.4810609817504883 2023-01-22 12:27:20.842132: step: 790/466, loss: 0.2233993262052536 2023-01-22 12:27:21.559544: step: 792/466, loss: 0.9135105609893799 2023-01-22 12:27:22.295203: step: 794/466, loss: 0.6643213033676147 2023-01-22 12:27:23.046160: step: 796/466, loss: 0.26859939098358154 2023-01-22 12:27:23.977354: step: 798/466, loss: 0.36576661467552185 2023-01-22 12:27:24.865638: step: 800/466, loss: 
0.09290008246898651 2023-01-22 12:27:25.631753: step: 802/466, loss: 0.8580803871154785 2023-01-22 12:27:26.399154: step: 804/466, loss: 0.3385680019855499 2023-01-22 12:27:27.163449: step: 806/466, loss: 0.715552806854248 2023-01-22 12:27:27.999288: step: 808/466, loss: 0.46964597702026367 2023-01-22 12:27:28.765975: step: 810/466, loss: 0.6051387786865234 2023-01-22 12:27:29.524263: step: 812/466, loss: 0.25436586141586304 2023-01-22 12:27:30.259797: step: 814/466, loss: 1.0981717109680176 2023-01-22 12:27:31.071249: step: 816/466, loss: 0.5780366063117981 2023-01-22 12:27:31.871501: step: 818/466, loss: 0.2242840677499771 2023-01-22 12:27:32.645141: step: 820/466, loss: 0.9424983263015747 2023-01-22 12:27:33.424633: step: 822/466, loss: 0.30817434191703796 2023-01-22 12:27:34.278060: step: 824/466, loss: 0.4366517663002014 2023-01-22 12:27:34.984057: step: 826/466, loss: 0.1335681676864624 2023-01-22 12:27:35.822940: step: 828/466, loss: 0.43123510479927063 2023-01-22 12:27:36.618215: step: 830/466, loss: 0.29615893959999084 2023-01-22 12:27:37.342117: step: 832/466, loss: 0.7336601614952087 2023-01-22 12:27:38.080832: step: 834/466, loss: 0.17370028793811798 2023-01-22 12:27:38.887415: step: 836/466, loss: 0.6856794357299805 2023-01-22 12:27:39.678667: step: 838/466, loss: 0.2515942454338074 2023-01-22 12:27:40.380012: step: 840/466, loss: 0.38089174032211304 2023-01-22 12:27:41.152163: step: 842/466, loss: 0.6962507963180542 2023-01-22 12:27:41.935519: step: 844/466, loss: 0.1630067229270935 2023-01-22 12:27:42.757952: step: 846/466, loss: 0.7469332218170166 2023-01-22 12:27:43.494120: step: 848/466, loss: 0.47416603565216064 2023-01-22 12:27:44.293712: step: 850/466, loss: 0.5374373197555542 2023-01-22 12:27:45.184295: step: 852/466, loss: 0.44624650478363037 2023-01-22 12:27:45.960907: step: 854/466, loss: 0.2969151735305786 2023-01-22 12:27:46.756724: step: 856/466, loss: 0.18856649100780487 2023-01-22 12:27:47.552122: step: 858/466, loss: 8.759953498840332 
2023-01-22 12:27:48.376398: step: 860/466, loss: 0.5982837677001953 2023-01-22 12:27:49.145996: step: 862/466, loss: 0.2622735798358917 2023-01-22 12:27:49.995224: step: 864/466, loss: 0.6516578197479248 2023-01-22 12:27:50.847554: step: 866/466, loss: 0.2897380590438843 2023-01-22 12:27:51.606786: step: 868/466, loss: 0.5647562742233276 2023-01-22 12:27:52.351020: step: 870/466, loss: 0.9220472574234009 2023-01-22 12:27:53.099118: step: 872/466, loss: 0.560003936290741 2023-01-22 12:27:53.887681: step: 874/466, loss: 0.5144690871238708 2023-01-22 12:27:54.730224: step: 876/466, loss: 1.207797884941101 2023-01-22 12:27:55.599247: step: 878/466, loss: 0.11351679265499115 2023-01-22 12:27:56.333759: step: 880/466, loss: 0.46011728048324585 2023-01-22 12:27:57.079672: step: 882/466, loss: 0.7599865198135376 2023-01-22 12:27:57.918234: step: 884/466, loss: 0.4350064694881439 2023-01-22 12:27:58.758192: step: 886/466, loss: 0.5173510909080505 2023-01-22 12:27:59.561468: step: 888/466, loss: 1.4208377599716187 2023-01-22 12:28:00.359261: step: 890/466, loss: 0.41764670610427856 2023-01-22 12:28:01.188784: step: 892/466, loss: 0.2334553599357605 2023-01-22 12:28:01.953079: step: 894/466, loss: 0.15015123784542084 2023-01-22 12:28:02.681506: step: 896/466, loss: 0.8183199167251587 2023-01-22 12:28:03.370722: step: 898/466, loss: 0.7500245571136475 2023-01-22 12:28:04.085717: step: 900/466, loss: 0.2415359616279602 2023-01-22 12:28:04.861722: step: 902/466, loss: 0.4030883312225342 2023-01-22 12:28:05.640813: step: 904/466, loss: 0.23466992378234863 2023-01-22 12:28:06.356926: step: 906/466, loss: 0.3660128116607666 2023-01-22 12:28:07.102570: step: 908/466, loss: 0.32721370458602905 2023-01-22 12:28:07.990025: step: 910/466, loss: 0.2336495816707611 2023-01-22 12:28:08.712541: step: 912/466, loss: 0.4234265983104706 2023-01-22 12:28:09.538077: step: 914/466, loss: 0.6359653472900391 2023-01-22 12:28:10.236206: step: 916/466, loss: 0.39385494589805603 2023-01-22 
12:28:10.971824: step: 918/466, loss: 0.32778307795524597 2023-01-22 12:28:11.810798: step: 920/466, loss: 0.3014666736125946 2023-01-22 12:28:12.592570: step: 922/466, loss: 0.2647492587566376 2023-01-22 12:28:13.308981: step: 924/466, loss: 0.47217094898223877 2023-01-22 12:28:14.064184: step: 926/466, loss: 1.2372679710388184 2023-01-22 12:28:14.847822: step: 928/466, loss: 0.43278536200523376 2023-01-22 12:28:15.684493: step: 930/466, loss: 0.4891684353351593 2023-01-22 12:28:16.437772: step: 932/466, loss: 0.6238055229187012
==================================================
Loss: 0.560
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30685797665369646, 'r': 0.2992884250474383, 'f1': 0.3030259365994236}, 'combined': 0.2232822690732595, 'epoch': 7}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3432060226061297, 'r': 0.26089600679496266, 'f1': 0.2964435689603363}, 'combined': 0.18220433994635304, 'epoch': 7}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2918760820300343, 'r': 0.3190144653686902, 'f1': 0.30484247189356256}, 'combined': 0.22462076876367768, 'epoch': 7}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3237627861218527, 'r': 0.26596804267202456, 'f1': 0.2920334169776559}, 'combined': 0.17949370994724215, 'epoch': 7}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31564761904761907, 'r': 0.314449715370019, 'f1': 0.31504752851711026}, 'combined': 0.23214028417050228, 'epoch': 7}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3387348958179314, 'r': 0.26036209063302007, 'f1': 0.2944221975409164}, 'combined': 0.18184900436350723, 'epoch': 7}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2727272727272727, 'r': 0.2571428571428571, 'f1': 0.2647058823529411}, 'combined': 0.17647058823529407, 'epoch': 7}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28846153846153844, 'r': 0.4891304347826087, 'f1': 0.3629032258064516}, 'combined': 0.1814516129032258, 'epoch': 7}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3333333333333333, 'r': 0.13793103448275862, 'f1': 0.1951219512195122}, 'combined': 0.13008130081300812, 'epoch': 7}
New best korean model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30341682879377435, 'r': 0.2959321631878558, 'f1': 0.29962776176753125}, 'combined': 0.22077835077607566, 'epoch': 6}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3189413328323066, 'r': 0.26702707259639863, 'f1': 0.2906845135238835}, 'combined': 0.17866462782443573, 'epoch': 6}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2755102040816326, 'r': 0.38571428571428573, 'f1': 0.32142857142857145}, 'combined': 0.2142857142857143, 'epoch': 6}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2918760820300343, 'r': 0.3190144653686902, 'f1': 0.30484247189356256}, 'combined': 0.22462076876367768, 'epoch': 7}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3237627861218527, 'r': 0.26596804267202456, 'f1': 0.2920334169776559}, 'combined': 0.17949370994724215, 'epoch': 7}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28846153846153844, 'r': 0.4891304347826087, 'f1': 0.3629032258064516}, 'combined': 0.1814516129032258, 'epoch': 7}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32328770863505785, 'r': 0.3355566918849651, 'f1': 0.3293079639169025}, 'combined': 0.24264797341245445, 'epoch': 5}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3542723554470683, 'r': 0.23076737340107478, 'f1': 0.2794835868534756}, 'combined': 0.17262221540949965, 'epoch': 5}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 5}
******************************
Epoch: 8
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:31:06.210778: step: 2/466, loss: 0.22759246826171875 2023-01-22 12:31:07.081417: step: 4/466, loss: 0.1430056244134903 2023-01-22 12:31:07.900967: step: 6/466, loss: 0.15150728821754456 2023-01-22 12:31:08.603287: step: 8/466, loss: 0.6790808439254761 2023-01-22 12:31:09.368479: step: 10/466, loss: 0.24653606116771698 2023-01-22 12:31:10.116502: step: 12/466, loss: 0.2990144193172455 2023-01-22 12:31:10.916957: step: 14/466, loss: 0.810178279876709 2023-01-22 12:31:11.707202: step: 16/466, loss: 0.4249630272388458 2023-01-22 12:31:12.412557: step: 18/466, loss: 0.3089500963687897 2023-01-22 12:31:13.202886: step: 20/466, loss: 0.5554283857345581 2023-01-22 12:31:13.976756: step: 22/466, loss: 0.2011716067790985 2023-01-22 12:31:14.745653: step: 24/466, loss: 0.1238151267170906 2023-01-22 12:31:15.585536: step: 26/466, loss: 0.14290007948875427 2023-01-22 12:31:16.286069: step: 28/466, loss: 0.538735568523407 2023-01-22 12:31:17.024842: step: 30/466,
loss: 0.13576442003250122 2023-01-22 12:31:17.751146: step: 32/466, loss: 0.461008220911026 2023-01-22 12:31:18.606159: step: 34/466, loss: 0.1841767132282257 2023-01-22 12:31:19.366872: step: 36/466, loss: 0.3797663450241089 2023-01-22 12:31:20.139545: step: 38/466, loss: 0.25566935539245605 2023-01-22 12:31:20.852838: step: 40/466, loss: 0.2620272636413574 2023-01-22 12:31:21.626210: step: 42/466, loss: 0.262869656085968 2023-01-22 12:31:22.348474: step: 44/466, loss: 0.325731486082077 2023-01-22 12:31:23.138267: step: 46/466, loss: 0.2767471671104431 2023-01-22 12:31:23.824200: step: 48/466, loss: 1.0482711791992188 2023-01-22 12:31:24.526416: step: 50/466, loss: 0.19650810956954956 2023-01-22 12:31:25.373256: step: 52/466, loss: 1.0581046342849731 2023-01-22 12:31:26.068266: step: 54/466, loss: 0.21606041491031647 2023-01-22 12:31:26.805857: step: 56/466, loss: 0.3756955564022064 2023-01-22 12:31:27.589850: step: 58/466, loss: 0.3466208279132843 2023-01-22 12:31:28.265664: step: 60/466, loss: 0.43687790632247925 2023-01-22 12:31:28.991585: step: 62/466, loss: 0.22791868448257446 2023-01-22 12:31:29.786025: step: 64/466, loss: 0.30758219957351685 2023-01-22 12:31:30.593312: step: 66/466, loss: 0.10821656882762909 2023-01-22 12:31:31.235744: step: 68/466, loss: 0.34948158264160156 2023-01-22 12:31:32.003319: step: 70/466, loss: 0.6015602350234985 2023-01-22 12:31:32.805495: step: 72/466, loss: 0.42821982502937317 2023-01-22 12:31:33.603172: step: 74/466, loss: 0.22986894845962524 2023-01-22 12:31:34.382289: step: 76/466, loss: 0.15815268456935883 2023-01-22 12:31:35.185128: step: 78/466, loss: 0.245316281914711 2023-01-22 12:31:36.042948: step: 80/466, loss: 0.23755723237991333 2023-01-22 12:31:36.786952: step: 82/466, loss: 0.08479206264019012 2023-01-22 12:31:37.496254: step: 84/466, loss: 0.0937538594007492 2023-01-22 12:31:38.263406: step: 86/466, loss: 0.35855525732040405 2023-01-22 12:31:39.038168: step: 88/466, loss: 0.6756817102432251 2023-01-22 
12:31:39.886079: step: 90/466, loss: 0.3196064233779907 2023-01-22 12:31:40.665653: step: 92/466, loss: 0.3062879145145416 2023-01-22 12:31:41.383076: step: 94/466, loss: 0.21980153024196625 2023-01-22 12:31:42.207025: step: 96/466, loss: 0.735496461391449 2023-01-22 12:31:42.992793: step: 98/466, loss: 0.3766806423664093 2023-01-22 12:31:43.771546: step: 100/466, loss: 0.39500221610069275 2023-01-22 12:31:44.514663: step: 102/466, loss: 0.05726594477891922 2023-01-22 12:31:45.360342: step: 104/466, loss: 0.28027504682540894 2023-01-22 12:31:46.071042: step: 106/466, loss: 0.20551623404026031 2023-01-22 12:31:46.878945: step: 108/466, loss: 0.6117175817489624 2023-01-22 12:31:47.655577: step: 110/466, loss: 0.8166786432266235 2023-01-22 12:31:48.402661: step: 112/466, loss: 0.64291912317276 2023-01-22 12:31:49.086947: step: 114/466, loss: 0.6314736008644104 2023-01-22 12:31:49.867884: step: 116/466, loss: 0.6598122119903564 2023-01-22 12:31:50.665990: step: 118/466, loss: 0.2853725254535675 2023-01-22 12:31:51.416532: step: 120/466, loss: 0.14809046685695648 2023-01-22 12:31:52.193990: step: 122/466, loss: 0.48133033514022827 2023-01-22 12:31:52.984373: step: 124/466, loss: 0.43819597363471985 2023-01-22 12:31:53.667000: step: 126/466, loss: 0.4271838068962097 2023-01-22 12:31:54.589191: step: 128/466, loss: 0.28445273637771606 2023-01-22 12:31:55.339189: step: 130/466, loss: 0.6659205555915833 2023-01-22 12:31:56.079742: step: 132/466, loss: 0.6517955660820007 2023-01-22 12:31:56.980341: step: 134/466, loss: 0.5717197060585022 2023-01-22 12:31:57.756524: step: 136/466, loss: 0.3504186272621155 2023-01-22 12:31:58.468130: step: 138/466, loss: 0.6698485016822815 2023-01-22 12:31:59.162459: step: 140/466, loss: 0.17458480596542358 2023-01-22 12:31:59.939168: step: 142/466, loss: 0.18910260498523712 2023-01-22 12:32:00.748693: step: 144/466, loss: 0.5898134112358093 2023-01-22 12:32:01.632884: step: 146/466, loss: 2.988463878631592 2023-01-22 12:32:02.412869: step: 
148/466, loss: 0.6820144057273865 2023-01-22 12:32:03.177947: step: 150/466, loss: 0.2095578908920288 2023-01-22 12:32:03.919791: step: 152/466, loss: 0.24290573596954346 2023-01-22 12:32:04.699060: step: 154/466, loss: 0.9977115392684937 2023-01-22 12:32:05.404923: step: 156/466, loss: 0.12446726858615875 2023-01-22 12:32:06.123581: step: 158/466, loss: 0.4961407780647278 2023-01-22 12:32:06.927170: step: 160/466, loss: 0.2016962319612503 2023-01-22 12:32:07.658230: step: 162/466, loss: 0.1211315244436264 2023-01-22 12:32:08.425580: step: 164/466, loss: 0.5762068629264832 2023-01-22 12:32:09.140471: step: 166/466, loss: 0.16279837489128113 2023-01-22 12:32:09.888901: step: 168/466, loss: 0.3130417764186859 2023-01-22 12:32:10.589973: step: 170/466, loss: 0.2116745114326477 2023-01-22 12:32:11.331364: step: 172/466, loss: 0.16255971789360046 2023-01-22 12:32:12.098870: step: 174/466, loss: 1.197072982788086 2023-01-22 12:32:12.875828: step: 176/466, loss: 0.5906725525856018 2023-01-22 12:32:13.729589: step: 178/466, loss: 0.38303136825561523 2023-01-22 12:32:14.493647: step: 180/466, loss: 1.2655754089355469 2023-01-22 12:32:15.281380: step: 182/466, loss: 0.44630518555641174 2023-01-22 12:32:16.163884: step: 184/466, loss: 0.31110793352127075 2023-01-22 12:32:16.863923: step: 186/466, loss: 0.3698071539402008 2023-01-22 12:32:17.621083: step: 188/466, loss: 0.3049424886703491 2023-01-22 12:32:18.376500: step: 190/466, loss: 0.13894884288311005 2023-01-22 12:32:19.202374: step: 192/466, loss: 0.29648882150650024 2023-01-22 12:32:20.083286: step: 194/466, loss: 0.1692948341369629 2023-01-22 12:32:20.799005: step: 196/466, loss: 0.155476376414299 2023-01-22 12:32:21.536899: step: 198/466, loss: 0.24735870957374573 2023-01-22 12:32:22.342250: step: 200/466, loss: 0.45558783411979675 2023-01-22 12:32:23.100367: step: 202/466, loss: 0.3680471181869507 2023-01-22 12:32:23.839245: step: 204/466, loss: 0.19752237200737 2023-01-22 12:32:24.641304: step: 206/466, loss: 
0.3782448172569275 2023-01-22 12:32:25.440775: step: 208/466, loss: 0.2489340454339981 2023-01-22 12:32:26.152434: step: 210/466, loss: 0.23533755540847778 2023-01-22 12:32:26.922177: step: 212/466, loss: 0.10373549908399582 2023-01-22 12:32:27.652396: step: 214/466, loss: 0.053315721452236176 2023-01-22 12:32:28.431229: step: 216/466, loss: 0.08812931180000305 2023-01-22 12:32:29.183494: step: 218/466, loss: 2.715826988220215 2023-01-22 12:32:29.995566: step: 220/466, loss: 0.1123843863606453 2023-01-22 12:32:30.673938: step: 222/466, loss: 0.5836272239685059 2023-01-22 12:32:31.426437: step: 224/466, loss: 0.13191816210746765 2023-01-22 12:32:32.294379: step: 226/466, loss: 0.3843967318534851 2023-01-22 12:32:32.976563: step: 228/466, loss: 0.1885620653629303 2023-01-22 12:32:33.700998: step: 230/466, loss: 0.1312829703092575 2023-01-22 12:32:34.466065: step: 232/466, loss: 0.7238270044326782 2023-01-22 12:32:35.258306: step: 234/466, loss: 0.049187153577804565 2023-01-22 12:32:35.964929: step: 236/466, loss: 0.09891631454229355 2023-01-22 12:32:36.792077: step: 238/466, loss: 0.6556817293167114 2023-01-22 12:32:37.508545: step: 240/466, loss: 0.2603532373905182 2023-01-22 12:32:38.279441: step: 242/466, loss: 0.8729942440986633 2023-01-22 12:32:38.950435: step: 244/466, loss: 0.3330722451210022 2023-01-22 12:32:39.727582: step: 246/466, loss: 0.11522451788187027 2023-01-22 12:32:40.416382: step: 248/466, loss: 0.24133557081222534 2023-01-22 12:32:41.123785: step: 250/466, loss: 1.1795192956924438 2023-01-22 12:32:41.922656: step: 252/466, loss: 0.5877870917320251 2023-01-22 12:32:42.631489: step: 254/466, loss: 1.9018089771270752 2023-01-22 12:32:43.399268: step: 256/466, loss: 0.2621552646160126 2023-01-22 12:32:44.118766: step: 258/466, loss: 0.20106241106987 2023-01-22 12:32:44.885056: step: 260/466, loss: 0.5996320247650146 2023-01-22 12:32:45.665477: step: 262/466, loss: 0.2576277554035187 2023-01-22 12:32:46.388385: step: 264/466, loss: 0.6318485736846924 
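Note on the epoch-7 evaluation summary above: each language block reports precision/recall/F1 for templates and slots, and the logged numbers are consistent with each 'f1' being the standard harmonic mean and 'combined' being the product of the template and slot F1 scores. A minimal sketch of that arithmetic (the helper names are assumptions for illustration, not the training script's actual code):

```python
# Sketch inferred from the logged numbers, not taken from train.py.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    """Product of template-level and slot-level F1, matching the logged 'combined'."""
    return template_f1 * slot_f1

# Dev Chinese, epoch 7 (p/r values copied from the log above)
template = f1(1.0, 0.5833333333333334)              # matches logged 0.7368421052631579
slot = f1(0.30685797665369646, 0.2992884250474383)  # matches logged 0.3030259365994236
print(combined_score(template, slot))               # matches logged 0.2232822690732595
```

The same product relation holds for the Test and Sample blocks, so 'combined' weights template detection and slot filling multiplicatively.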
2023-01-22 12:32:47.132648: step: 266/466, loss: 0.32296547293663025 2023-01-22 12:32:47.856287: step: 268/466, loss: 0.5395624041557312 2023-01-22 12:32:48.630403: step: 270/466, loss: 0.23238903284072876 2023-01-22 12:32:49.376476: step: 272/466, loss: 0.20087341964244843 2023-01-22 12:32:50.152718: step: 274/466, loss: 0.5104784965515137 2023-01-22 12:32:50.888048: step: 276/466, loss: 0.4179069995880127 2023-01-22 12:32:51.645941: step: 278/466, loss: 0.4076002836227417 2023-01-22 12:32:52.467036: step: 280/466, loss: 0.23009905219078064 2023-01-22 12:32:53.307672: step: 282/466, loss: 0.29668739438056946 2023-01-22 12:32:54.034216: step: 284/466, loss: 0.3764442503452301 2023-01-22 12:32:54.840390: step: 286/466, loss: 0.18076901137828827 2023-01-22 12:32:55.562015: step: 288/466, loss: 0.8756890296936035 2023-01-22 12:32:56.332893: step: 290/466, loss: 0.6863442659378052 2023-01-22 12:32:57.108478: step: 292/466, loss: 0.44005173444747925 2023-01-22 12:32:57.871268: step: 294/466, loss: 0.25520214438438416 2023-01-22 12:32:58.579724: step: 296/466, loss: 0.1529461294412613 2023-01-22 12:32:59.306016: step: 298/466, loss: 0.5102216601371765 2023-01-22 12:33:00.066588: step: 300/466, loss: 1.0082459449768066 2023-01-22 12:33:00.834257: step: 302/466, loss: 0.3374558687210083 2023-01-22 12:33:01.578466: step: 304/466, loss: 0.586357057094574 2023-01-22 12:33:02.415628: step: 306/466, loss: 0.3253418207168579 2023-01-22 12:33:03.191896: step: 308/466, loss: 0.5261020660400391 2023-01-22 12:33:03.983191: step: 310/466, loss: 0.12631230056285858 2023-01-22 12:33:04.737222: step: 312/466, loss: 0.21314901113510132 2023-01-22 12:33:05.496619: step: 314/466, loss: 0.26143577694892883 2023-01-22 12:33:06.253834: step: 316/466, loss: 0.24304695427417755 2023-01-22 12:33:06.996655: step: 318/466, loss: 0.9620099663734436 2023-01-22 12:33:07.733807: step: 320/466, loss: 0.1309376209974289 2023-01-22 12:33:08.521382: step: 322/466, loss: 0.4396485686302185 2023-01-22 
12:33:09.376541: step: 324/466, loss: 0.43423640727996826 2023-01-22 12:33:10.206856: step: 326/466, loss: 0.18821410834789276 2023-01-22 12:33:10.949923: step: 328/466, loss: 0.09666875749826431 2023-01-22 12:33:11.776507: step: 330/466, loss: 0.23626506328582764 2023-01-22 12:33:12.564341: step: 332/466, loss: 0.9110296964645386 2023-01-22 12:33:13.382837: step: 334/466, loss: 2.471574306488037 2023-01-22 12:33:14.045961: step: 336/466, loss: 0.2877243161201477 2023-01-22 12:33:14.851432: step: 338/466, loss: 0.6529865860939026 2023-01-22 12:33:15.535281: step: 340/466, loss: 0.9429927468299866 2023-01-22 12:33:16.326523: step: 342/466, loss: 0.32690343260765076 2023-01-22 12:33:16.997265: step: 344/466, loss: 0.10178147256374359 2023-01-22 12:33:17.823636: step: 346/466, loss: 0.2624507248401642 2023-01-22 12:33:18.579907: step: 348/466, loss: 0.27885669469833374 2023-01-22 12:33:19.278726: step: 350/466, loss: 0.15760451555252075 2023-01-22 12:33:20.027130: step: 352/466, loss: 0.46658897399902344 2023-01-22 12:33:20.779444: step: 354/466, loss: 0.7008988857269287 2023-01-22 12:33:21.590663: step: 356/466, loss: 0.30459436774253845 2023-01-22 12:33:22.320946: step: 358/466, loss: 0.6526221632957458 2023-01-22 12:33:23.046492: step: 360/466, loss: 0.608924925327301 2023-01-22 12:33:23.905084: step: 362/466, loss: 0.21239249408245087 2023-01-22 12:33:24.764991: step: 364/466, loss: 0.15510523319244385 2023-01-22 12:33:25.620893: step: 366/466, loss: 0.2997853755950928 2023-01-22 12:33:26.438125: step: 368/466, loss: 0.2863451838493347 2023-01-22 12:33:27.220894: step: 370/466, loss: 0.3021564483642578 2023-01-22 12:33:27.949401: step: 372/466, loss: 0.1663222461938858 2023-01-22 12:33:28.690355: step: 374/466, loss: 0.60276198387146 2023-01-22 12:33:29.377620: step: 376/466, loss: 0.2979196012020111 2023-01-22 12:33:30.058273: step: 378/466, loss: 0.11225948482751846 2023-01-22 12:33:30.826832: step: 380/466, loss: 0.5652484893798828 2023-01-22 12:33:31.548705: 
step: 382/466, loss: 0.4527016580104828 2023-01-22 12:33:32.321280: step: 384/466, loss: 2.857930898666382 2023-01-22 12:33:33.113328: step: 386/466, loss: 0.14337778091430664 2023-01-22 12:33:33.912417: step: 388/466, loss: 0.6897900700569153 2023-01-22 12:33:34.580173: step: 390/466, loss: 0.21116462349891663 2023-01-22 12:33:35.298797: step: 392/466, loss: 0.6865518689155579 2023-01-22 12:33:36.011722: step: 394/466, loss: 0.34384459257125854 2023-01-22 12:33:36.982237: step: 396/466, loss: 1.022497296333313 2023-01-22 12:33:37.654907: step: 398/466, loss: 0.2426058053970337 2023-01-22 12:33:38.378863: step: 400/466, loss: 0.13261792063713074 2023-01-22 12:33:39.089135: step: 402/466, loss: 0.6296444535255432 2023-01-22 12:33:39.835724: step: 404/466, loss: 0.42095842957496643 2023-01-22 12:33:40.607371: step: 406/466, loss: 0.2147996574640274 2023-01-22 12:33:41.332417: step: 408/466, loss: 0.2778421938419342 2023-01-22 12:33:42.124470: step: 410/466, loss: 0.13006922602653503 2023-01-22 12:33:42.875501: step: 412/466, loss: 0.6652418971061707 2023-01-22 12:33:43.655820: step: 414/466, loss: 0.5734185576438904 2023-01-22 12:33:44.386693: step: 416/466, loss: 0.2978045344352722 2023-01-22 12:33:45.123458: step: 418/466, loss: 0.3669905662536621 2023-01-22 12:33:45.850798: step: 420/466, loss: 0.3904774487018585 2023-01-22 12:33:46.633503: step: 422/466, loss: 1.5295345783233643 2023-01-22 12:33:47.364767: step: 424/466, loss: 0.2734212577342987 2023-01-22 12:33:48.117201: step: 426/466, loss: 0.708541989326477 2023-01-22 12:33:48.929258: step: 428/466, loss: 0.5671656131744385 2023-01-22 12:33:49.680285: step: 430/466, loss: 0.22312410175800323 2023-01-22 12:33:50.428077: step: 432/466, loss: 0.3741439878940582 2023-01-22 12:33:51.151052: step: 434/466, loss: 0.4038388431072235 2023-01-22 12:33:51.933873: step: 436/466, loss: 0.18085996806621552 2023-01-22 12:33:52.634804: step: 438/466, loss: 0.14758652448654175 2023-01-22 12:33:53.502928: step: 440/466, loss: 
0.18641526997089386 2023-01-22 12:33:54.200269: step: 442/466, loss: 0.2503521740436554 2023-01-22 12:33:54.909441: step: 444/466, loss: 0.31462979316711426 2023-01-22 12:33:55.687926: step: 446/466, loss: 1.0632514953613281 2023-01-22 12:33:56.517201: step: 448/466, loss: 0.3526846766471863 2023-01-22 12:33:57.277928: step: 450/466, loss: 0.40511059761047363 2023-01-22 12:33:57.997664: step: 452/466, loss: 0.9780833125114441 2023-01-22 12:33:58.721022: step: 454/466, loss: 0.6404275894165039 2023-01-22 12:33:59.471759: step: 456/466, loss: 0.6844438314437866 2023-01-22 12:34:00.178395: step: 458/466, loss: 0.4939207434654236 2023-01-22 12:34:00.916303: step: 460/466, loss: 0.9836992621421814 2023-01-22 12:34:01.697545: step: 462/466, loss: 1.2867162227630615 2023-01-22 12:34:02.624339: step: 464/466, loss: 0.14138145744800568 2023-01-22 12:34:03.403125: step: 466/466, loss: 0.5794249176979065 2023-01-22 12:34:04.214624: step: 468/466, loss: 0.9879024624824524 2023-01-22 12:34:04.972219: step: 470/466, loss: 0.4761776030063629 2023-01-22 12:34:05.741993: step: 472/466, loss: 0.43164268136024475 2023-01-22 12:34:06.631320: step: 474/466, loss: 0.5020178556442261 2023-01-22 12:34:07.392965: step: 476/466, loss: 0.28537100553512573 2023-01-22 12:34:08.187600: step: 478/466, loss: 0.5118150115013123 2023-01-22 12:34:09.066853: step: 480/466, loss: 0.15906476974487305 2023-01-22 12:34:09.863763: step: 482/466, loss: 0.6295519471168518 2023-01-22 12:34:10.648477: step: 484/466, loss: 0.49415600299835205 2023-01-22 12:34:11.352621: step: 486/466, loss: 0.17992109060287476 2023-01-22 12:34:12.101601: step: 488/466, loss: 0.07363350689411163 2023-01-22 12:34:12.838797: step: 490/466, loss: 0.33977508544921875 2023-01-22 12:34:13.622243: step: 492/466, loss: 0.3130842447280884 2023-01-22 12:34:14.431053: step: 494/466, loss: 1.355231523513794 2023-01-22 12:34:15.236587: step: 496/466, loss: 0.14459706842899323 2023-01-22 12:34:16.049177: step: 498/466, loss: 
0.5662770867347717 2023-01-22 12:34:16.776876: step: 500/466, loss: 0.4519438147544861 2023-01-22 12:34:17.557770: step: 502/466, loss: 0.21593526005744934 2023-01-22 12:34:18.355925: step: 504/466, loss: 0.5674519538879395 2023-01-22 12:34:19.113103: step: 506/466, loss: 0.18442323803901672 2023-01-22 12:34:19.872392: step: 508/466, loss: 0.5759652853012085 2023-01-22 12:34:20.582051: step: 510/466, loss: 0.3328949809074402 2023-01-22 12:34:21.319814: step: 512/466, loss: 1.5621113777160645 2023-01-22 12:34:22.065323: step: 514/466, loss: 0.3188912272453308 2023-01-22 12:34:22.788042: step: 516/466, loss: 0.2803928554058075 2023-01-22 12:34:23.523533: step: 518/466, loss: 0.1506417989730835 2023-01-22 12:34:24.450706: step: 520/466, loss: 0.3449791371822357 2023-01-22 12:34:25.171645: step: 522/466, loss: 0.21323703229427338 2023-01-22 12:34:25.915877: step: 524/466, loss: 0.2827951908111572 2023-01-22 12:34:26.643412: step: 526/466, loss: 0.2961046099662781 2023-01-22 12:34:27.442405: step: 528/466, loss: 0.4124855101108551 2023-01-22 12:34:28.336836: step: 530/466, loss: 0.31052014231681824 2023-01-22 12:34:29.139166: step: 532/466, loss: 0.7250269651412964 2023-01-22 12:34:29.891370: step: 534/466, loss: 0.502644956111908 2023-01-22 12:34:30.640450: step: 536/466, loss: 0.3006429374217987 2023-01-22 12:34:31.448853: step: 538/466, loss: 1.0074026584625244 2023-01-22 12:34:32.390865: step: 540/466, loss: 0.3936672806739807 2023-01-22 12:34:33.124859: step: 542/466, loss: 0.4782834053039551 2023-01-22 12:34:33.929812: step: 544/466, loss: 0.17096778750419617 2023-01-22 12:34:34.763305: step: 546/466, loss: 0.3335103690624237 2023-01-22 12:34:35.448891: step: 548/466, loss: 0.195723295211792 2023-01-22 12:34:36.214329: step: 550/466, loss: 0.42337173223495483 2023-01-22 12:34:36.934744: step: 552/466, loss: 0.3888344466686249 2023-01-22 12:34:37.760419: step: 554/466, loss: 1.6903038024902344 2023-01-22 12:34:38.504189: step: 556/466, loss: 0.21383579075336456 
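For post-hoc analysis (e.g. plotting the per-step loss curve), the step entries above can be recovered mechanically, since each follows the fixed `timestamp: step: N/M, loss: value` layout. A minimal parsing sketch, assuming only that visible format (note the step counter can exceed the `/466` denominator in this log, so both fields are treated as opaque integers):

```python
import re

# Regex for one log entry; the pattern is an assumption based on the
# "<timestamp>: step: <n>/<total>, loss: <value>" layout seen in this log.
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): step: (\d+)/(\d+), loss: ([\d.]+)"
)

# Example entry copied from the log above
line = "2023-01-22 12:34:39.201773: step: 558/466, loss: 0.23822146654129028"
m = STEP_RE.search(line)
timestamp = m.group(1)
step, total = int(m.group(2)), int(m.group(3))
loss = float(m.group(4))
print(step, total, loss)  # 558 466 0.23822146654129028
```

Applying `STEP_RE.finditer` over the whole file yields every step/loss pair, regardless of how the entries are wrapped across physical lines.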
2023-01-22 12:34:39.201773: step: 558/466, loss: 0.23822146654129028 2023-01-22 12:34:39.973735: step: 560/466, loss: 1.131753921508789 2023-01-22 12:34:40.786116: step: 562/466, loss: 0.2799283564090729 2023-01-22 12:34:41.530371: step: 564/466, loss: 0.43949148058891296 2023-01-22 12:34:42.278870: step: 566/466, loss: 0.23566792905330658 2023-01-22 12:34:43.022356: step: 568/466, loss: 0.7373307347297668 2023-01-22 12:34:43.828590: step: 570/466, loss: 1.1472512483596802 2023-01-22 12:34:44.627514: step: 572/466, loss: 0.5465192198753357 2023-01-22 12:34:45.432137: step: 574/466, loss: 0.3646984398365021 2023-01-22 12:34:46.205872: step: 576/466, loss: 0.1426057070493698 2023-01-22 12:34:46.964368: step: 578/466, loss: 0.3378341495990753 2023-01-22 12:34:47.751757: step: 580/466, loss: 0.14365246891975403 2023-01-22 12:34:48.528039: step: 582/466, loss: 0.37120455503463745 2023-01-22 12:34:49.332682: step: 584/466, loss: 0.8782157897949219 2023-01-22 12:34:50.140133: step: 586/466, loss: 1.1974824666976929 2023-01-22 12:34:50.921989: step: 588/466, loss: 0.3319639265537262 2023-01-22 12:34:51.719149: step: 590/466, loss: 0.37469062209129333 2023-01-22 12:34:52.469792: step: 592/466, loss: 0.21519650518894196 2023-01-22 12:34:53.265886: step: 594/466, loss: 0.3425082564353943 2023-01-22 12:34:53.999121: step: 596/466, loss: 0.3874545991420746 2023-01-22 12:34:54.788576: step: 598/466, loss: 0.5452808737754822 2023-01-22 12:34:55.545455: step: 600/466, loss: 0.14452871680259705 2023-01-22 12:34:56.293327: step: 602/466, loss: 0.23176631331443787 2023-01-22 12:34:57.075611: step: 604/466, loss: 0.39068472385406494 2023-01-22 12:34:57.885828: step: 606/466, loss: 0.4605758488178253 2023-01-22 12:34:58.622455: step: 608/466, loss: 0.21772919595241547 2023-01-22 12:34:59.432041: step: 610/466, loss: 0.4970461428165436 2023-01-22 12:35:00.235753: step: 612/466, loss: 0.46533507108688354 2023-01-22 12:35:00.995030: step: 614/466, loss: 0.7836638689041138 2023-01-22 
12:35:01.749207: step: 616/466, loss: 0.7685657143592834 2023-01-22 12:35:02.637849: step: 618/466, loss: 0.08030344545841217 2023-01-22 12:35:03.342272: step: 620/466, loss: 0.4773416519165039 2023-01-22 12:35:04.116421: step: 622/466, loss: 0.5082002282142639 2023-01-22 12:35:04.856527: step: 624/466, loss: 0.7404825091362 2023-01-22 12:35:05.598407: step: 626/466, loss: 0.2041405588388443 2023-01-22 12:35:06.305014: step: 628/466, loss: 0.2953926920890808 2023-01-22 12:35:07.020436: step: 630/466, loss: 0.10666435211896896 2023-01-22 12:35:07.930650: step: 632/466, loss: 0.16296805441379547 2023-01-22 12:35:08.725384: step: 634/466, loss: 0.7506895065307617 2023-01-22 12:35:09.386186: step: 636/466, loss: 0.4533143937587738 2023-01-22 12:35:10.109118: step: 638/466, loss: 2.788522481918335 2023-01-22 12:35:10.859666: step: 640/466, loss: 1.4357651472091675 2023-01-22 12:35:11.643346: step: 642/466, loss: 1.3311514854431152 2023-01-22 12:35:12.459387: step: 644/466, loss: 0.21705827116966248 2023-01-22 12:35:13.245923: step: 646/466, loss: 0.6914721131324768 2023-01-22 12:35:13.974191: step: 648/466, loss: 0.20655307173728943 2023-01-22 12:35:14.655843: step: 650/466, loss: 0.4660431742668152 2023-01-22 12:35:15.555307: step: 652/466, loss: 0.7570342421531677 2023-01-22 12:35:16.299319: step: 654/466, loss: 0.1994781494140625 2023-01-22 12:35:17.102543: step: 656/466, loss: 0.298532098531723 2023-01-22 12:35:17.889036: step: 658/466, loss: 0.6627851128578186 2023-01-22 12:35:18.611029: step: 660/466, loss: 0.330418199300766 2023-01-22 12:35:19.355408: step: 662/466, loss: 0.4699534773826599 2023-01-22 12:35:20.148504: step: 664/466, loss: 0.30207765102386475 2023-01-22 12:35:20.920279: step: 666/466, loss: 0.23757143318653107 2023-01-22 12:35:21.742801: step: 668/466, loss: 0.6265518069267273 2023-01-22 12:35:22.440551: step: 670/466, loss: 0.19305653870105743 2023-01-22 12:35:23.238653: step: 672/466, loss: 0.06330909579992294 2023-01-22 12:35:23.960249: step: 
674/466, loss: 0.2550206184387207 2023-01-22 12:35:24.790566: step: 676/466, loss: 0.7556263208389282 2023-01-22 12:35:25.517334: step: 678/466, loss: 0.1781693398952484 2023-01-22 12:35:26.361001: step: 680/466, loss: 0.521965503692627 2023-01-22 12:35:27.125288: step: 682/466, loss: 0.21839545667171478 2023-01-22 12:35:27.860313: step: 684/466, loss: 0.1758604794740677 2023-01-22 12:35:28.608189: step: 686/466, loss: 0.5475415587425232 2023-01-22 12:35:29.419844: step: 688/466, loss: 0.21490980684757233 2023-01-22 12:35:30.173188: step: 690/466, loss: 0.39477869868278503 2023-01-22 12:35:30.948617: step: 692/466, loss: 0.5012360215187073 2023-01-22 12:35:31.865544: step: 694/466, loss: 0.6030199527740479 2023-01-22 12:35:32.685150: step: 696/466, loss: 0.48296019434928894 2023-01-22 12:35:33.534763: step: 698/466, loss: 0.5750011205673218 2023-01-22 12:35:34.178663: step: 700/466, loss: 0.13368430733680725 2023-01-22 12:35:34.964626: step: 702/466, loss: 0.1267334520816803 2023-01-22 12:35:35.771560: step: 704/466, loss: 0.3822780251502991 2023-01-22 12:35:36.470995: step: 706/466, loss: 0.22256867587566376 2023-01-22 12:35:37.189306: step: 708/466, loss: 0.6377112865447998 2023-01-22 12:35:37.874436: step: 710/466, loss: 0.15049827098846436 2023-01-22 12:35:38.719423: step: 712/466, loss: 0.5327929854393005 2023-01-22 12:35:39.549511: step: 714/466, loss: 0.3772541880607605 2023-01-22 12:35:40.302704: step: 716/466, loss: 0.254294216632843 2023-01-22 12:35:41.113751: step: 718/466, loss: 1.3840159177780151 2023-01-22 12:35:41.918811: step: 720/466, loss: 0.3449174761772156 2023-01-22 12:35:42.606966: step: 722/466, loss: 1.406064510345459 2023-01-22 12:35:43.257196: step: 724/466, loss: 0.3576720058917999 2023-01-22 12:35:44.004559: step: 726/466, loss: 4.103994846343994 2023-01-22 12:35:44.851649: step: 728/466, loss: 0.2885291278362274 2023-01-22 12:35:45.653119: step: 730/466, loss: 0.16140568256378174 2023-01-22 12:35:46.401818: step: 732/466, loss: 
0.6693004965782166 2023-01-22 12:35:47.196832: step: 734/466, loss: 0.43078896403312683 2023-01-22 12:35:48.014544: step: 736/466, loss: 0.10041685402393341 2023-01-22 12:35:48.808379: step: 738/466, loss: 0.4233916699886322 2023-01-22 12:35:49.537217: step: 740/466, loss: 0.23425817489624023 2023-01-22 12:35:50.300268: step: 742/466, loss: 0.1729540079832077 2023-01-22 12:35:51.099587: step: 744/466, loss: 0.26733776926994324 2023-01-22 12:35:51.935301: step: 746/466, loss: 0.14470168948173523 2023-01-22 12:35:52.680074: step: 748/466, loss: 0.12287446856498718 2023-01-22 12:35:53.494122: step: 750/466, loss: 0.4447556138038635 2023-01-22 12:35:54.220713: step: 752/466, loss: 0.11110031604766846 2023-01-22 12:35:54.952746: step: 754/466, loss: 0.41622406244277954 2023-01-22 12:35:55.722219: step: 756/466, loss: 0.25940385460853577 2023-01-22 12:35:56.514591: step: 758/466, loss: 0.11219124495983124 2023-01-22 12:35:57.258102: step: 760/466, loss: 0.634303629398346 2023-01-22 12:35:57.981680: step: 762/466, loss: 0.40231674909591675 2023-01-22 12:35:58.692915: step: 764/466, loss: 0.33801835775375366 2023-01-22 12:35:59.435430: step: 766/466, loss: 0.1518666297197342 2023-01-22 12:36:00.196487: step: 768/466, loss: 0.1566280722618103 2023-01-22 12:36:00.957340: step: 770/466, loss: 0.5563924312591553 2023-01-22 12:36:01.684864: step: 772/466, loss: 0.3823583126068115 2023-01-22 12:36:02.582478: step: 774/466, loss: 0.20732498168945312 2023-01-22 12:36:03.423999: step: 776/466, loss: 0.2604410648345947 2023-01-22 12:36:04.191690: step: 778/466, loss: 0.8115687370300293 2023-01-22 12:36:04.931831: step: 780/466, loss: 0.21109074354171753 2023-01-22 12:36:05.650657: step: 782/466, loss: 0.4132833182811737 2023-01-22 12:36:06.389167: step: 784/466, loss: 0.3729992210865021 2023-01-22 12:36:07.221808: step: 786/466, loss: 0.166608989238739 2023-01-22 12:36:08.021783: step: 788/466, loss: 0.45757102966308594 2023-01-22 12:36:08.763984: step: 790/466, loss: 
0.31187009811401367 2023-01-22 12:36:09.573584: step: 792/466, loss: 0.35437431931495667 2023-01-22 12:36:10.434963: step: 794/466, loss: 0.15766958892345428 2023-01-22 12:36:11.140659: step: 796/466, loss: 1.3682160377502441 2023-01-22 12:36:11.904725: step: 798/466, loss: 0.23890097439289093 2023-01-22 12:36:12.719457: step: 800/466, loss: 0.0744379311800003 2023-01-22 12:36:13.489491: step: 802/466, loss: 0.1655988246202469 2023-01-22 12:36:14.296226: step: 804/466, loss: 0.5713003873825073 2023-01-22 12:36:15.128865: step: 806/466, loss: 0.26792052388191223 2023-01-22 12:36:15.905274: step: 808/466, loss: 0.6057258248329163 2023-01-22 12:36:16.793907: step: 810/466, loss: 0.22376419603824615 2023-01-22 12:36:17.564013: step: 812/466, loss: 0.15296228229999542 2023-01-22 12:36:18.463066: step: 814/466, loss: 0.4715817868709564 2023-01-22 12:36:19.297788: step: 816/466, loss: 0.2729172110557556 2023-01-22 12:36:20.031309: step: 818/466, loss: 0.10306338220834732 2023-01-22 12:36:20.795875: step: 820/466, loss: 0.6315586566925049 2023-01-22 12:36:21.540687: step: 822/466, loss: 0.21267792582511902 2023-01-22 12:36:22.368102: step: 824/466, loss: 0.26321327686309814 2023-01-22 12:36:23.249307: step: 826/466, loss: 0.3587478995323181 2023-01-22 12:36:23.997451: step: 828/466, loss: 0.3764268755912781 2023-01-22 12:36:24.687377: step: 830/466, loss: 1.0579025745391846 2023-01-22 12:36:25.419776: step: 832/466, loss: 0.4254581928253174 2023-01-22 12:36:26.157685: step: 834/466, loss: 0.6968392133712769 2023-01-22 12:36:26.912368: step: 836/466, loss: 3.7670774459838867 2023-01-22 12:36:27.641712: step: 838/466, loss: 0.36055028438568115 2023-01-22 12:36:28.445633: step: 840/466, loss: 0.2476329505443573 2023-01-22 12:36:29.179948: step: 842/466, loss: 0.2701147198677063 2023-01-22 12:36:29.920508: step: 844/466, loss: 0.19635479152202606 2023-01-22 12:36:30.638904: step: 846/466, loss: 0.7163576483726501 2023-01-22 12:36:31.382144: step: 848/466, loss: 
0.2725900709629059 2023-01-22 12:36:32.139062: step: 850/466, loss: 0.1970960944890976 2023-01-22 12:36:32.875796: step: 852/466, loss: 0.24540355801582336 2023-01-22 12:36:33.629846: step: 854/466, loss: 0.11632157117128372 2023-01-22 12:36:34.418985: step: 856/466, loss: 0.12717144191265106 2023-01-22 12:36:35.196616: step: 858/466, loss: 0.15465322136878967 2023-01-22 12:36:35.964016: step: 860/466, loss: 0.11278307437896729 2023-01-22 12:36:36.728883: step: 862/466, loss: 0.2683349847793579 2023-01-22 12:36:37.437724: step: 864/466, loss: 0.2555452585220337 2023-01-22 12:36:38.250480: step: 866/466, loss: 0.235856831073761 2023-01-22 12:36:38.983496: step: 868/466, loss: 0.2737419903278351 2023-01-22 12:36:39.794549: step: 870/466, loss: 0.2201920598745346 2023-01-22 12:36:40.516518: step: 872/466, loss: 0.7243627309799194 2023-01-22 12:36:41.244754: step: 874/466, loss: 0.18630512058734894 2023-01-22 12:36:42.039003: step: 876/466, loss: 0.24475634098052979 2023-01-22 12:36:42.775629: step: 878/466, loss: 0.18331408500671387 2023-01-22 12:36:43.525285: step: 880/466, loss: 0.37412190437316895 2023-01-22 12:36:44.239710: step: 882/466, loss: 0.10743524134159088 2023-01-22 12:36:44.880757: step: 884/466, loss: 0.7666282653808594 2023-01-22 12:36:45.662051: step: 886/466, loss: 0.25264352560043335 2023-01-22 12:36:46.444059: step: 888/466, loss: 0.2977299392223358 2023-01-22 12:36:47.288880: step: 890/466, loss: 0.25733569264411926 2023-01-22 12:36:48.093611: step: 892/466, loss: 1.2413746118545532 2023-01-22 12:36:48.837701: step: 894/466, loss: 0.2269682139158249 2023-01-22 12:36:49.613808: step: 896/466, loss: 0.4371106028556824 2023-01-22 12:36:50.474251: step: 898/466, loss: 1.4118807315826416 2023-01-22 12:36:51.266925: step: 900/466, loss: 0.7781394720077515 2023-01-22 12:36:52.022229: step: 902/466, loss: 0.42764055728912354 2023-01-22 12:36:52.822922: step: 904/466, loss: 0.604682207107544 2023-01-22 12:36:53.568066: step: 906/466, loss: 
0.25415295362472534 2023-01-22 12:36:54.337643: step: 908/466, loss: 0.31604674458503723 2023-01-22 12:36:55.131215: step: 910/466, loss: 0.3052256405353546 2023-01-22 12:36:55.942644: step: 912/466, loss: 0.21037594974040985 2023-01-22 12:36:56.713903: step: 914/466, loss: 0.47760850191116333 2023-01-22 12:36:57.506206: step: 916/466, loss: 0.4769919514656067 2023-01-22 12:36:58.256469: step: 918/466, loss: 0.4393412470817566 2023-01-22 12:36:59.030821: step: 920/466, loss: 0.25017476081848145 2023-01-22 12:36:59.820155: step: 922/466, loss: 5.702282905578613 2023-01-22 12:37:00.654951: step: 924/466, loss: 0.6336638331413269 2023-01-22 12:37:01.382206: step: 926/466, loss: 0.17593809962272644 2023-01-22 12:37:02.177410: step: 928/466, loss: 0.6609342098236084 2023-01-22 12:37:02.925829: step: 930/466, loss: 0.33767199516296387 2023-01-22 12:37:03.647344: step: 932/466, loss: 0.4533933401107788
==================================================
Loss: 0.466
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2908841841928836, 'r': 0.2842606354067079, 'f1': 0.287534270363407}, 'combined': 0.21186735710987883, 'epoch': 8}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35225738320380023, 'r': 0.27479125737370047, 'f1': 0.3087392045395176}, 'combined': 0.18976165742428885, 'epoch': 8}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27532384338032734, 'r': 0.29883346947542166, 'f1': 0.28659734015204225}, 'combined': 0.21117698748045216, 'epoch': 8}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33170262175026705, 'r': 0.2812011845287731, 'f1': 0.3043713195835783}, 'combined': 0.1870770061830774, 'epoch': 8}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3013006092920954, 'r': 0.2932964185329126, 'f1': 0.2972446395516249}, 'combined': 0.2190223659854078, 'epoch': 8}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3550773065704035, 'r': 0.2770585904208512, 'f1': 0.31125338243586387}, 'combined': 0.19224473621038654, 'epoch': 8}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.30357142857142855, 'r': 0.30357142857142855, 'f1': 0.30357142857142855}, 'combined': 0.20238095238095236, 'epoch': 8}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.30833333333333335, 'r': 0.40217391304347827, 'f1': 0.34905660377358494}, 'combined': 0.17452830188679247, 'epoch': 8}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.45454545454545453, 'r': 0.1724137931034483, 'f1': 0.25000000000000006}, 'combined': 0.16666666666666669, 'epoch': 8}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30341682879377435, 'r': 0.2959321631878558, 'f1': 0.29962776176753125}, 'combined': 0.22077835077607566, 'epoch': 6}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3189413328323066, 'r': 0.26702707259639863, 'f1': 0.2906845135238835}, 'combined': 0.17866462782443573, 'epoch': 6}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2755102040816326, 'r': 0.38571428571428573, 'f1': 0.32142857142857145}, 'combined': 0.2142857142857143, 'epoch': 6}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2918760820300343, 'r': 0.3190144653686902, 'f1': 0.30484247189356256}, 'combined': 0.22462076876367768, 'epoch': 7}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3237627861218527, 'r': 0.26596804267202456, 'f1': 0.2920334169776559}, 'combined': 0.17949370994724215, 'epoch': 7}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28846153846153844, 'r': 0.4891304347826087, 'f1': 0.3629032258064516}, 'combined': 0.1814516129032258, 'epoch': 7}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32328770863505785, 'r': 0.3355566918849651, 'f1': 0.3293079639169025}, 'combined': 0.24264797341245445, 'epoch': 5}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3542723554470683, 'r': 0.23076737340107478, 'f1': 0.2794835868534756}, 'combined': 0.17262221540949965, 'epoch': 5}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 5}
******************************
Epoch: 9
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:39:48.983519: step: 2/466, loss: 0.7617669105529785 2023-01-22 12:39:49.793713: step: 4/466, loss: 0.47193193435668945 2023-01-22 12:39:50.576887: step: 6/466, loss: 0.399549663066864 2023-01-22 12:39:51.343307: step: 8/466, loss: 1.0091228485107422 2023-01-22 12:39:52.064151: step: 10/466, loss: 0.16260093450546265 2023-01-22 12:39:52.969161: step: 12/466, loss: 0.24605731666088104 2023-01-22 12:39:53.719188: step: 14/466, loss: 0.4598664343357086 2023-01-22 12:39:54.439491: step: 16/466, loss: 0.22285336256027222 2023-01-22 12:39:55.248794: step: 18/466, loss: 0.2630546987056732 2023-01-22 12:39:56.022437:
step: 20/466, loss: 0.9710225462913513 2023-01-22 12:39:56.735299: step: 22/466, loss: 0.138289675116539 2023-01-22 12:39:57.442387: step: 24/466, loss: 0.18287083506584167 2023-01-22 12:39:58.193740: step: 26/466, loss: 1.1075892448425293 2023-01-22 12:39:58.912396: step: 28/466, loss: 0.21186289191246033 2023-01-22 12:39:59.634113: step: 30/466, loss: 0.7421730756759644 2023-01-22 12:40:00.396703: step: 32/466, loss: 0.25038522481918335 2023-01-22 12:40:01.177423: step: 34/466, loss: 0.8326919674873352 2023-01-22 12:40:01.947251: step: 36/466, loss: 0.11848055571317673 2023-01-22 12:40:02.781603: step: 38/466, loss: 0.5838706493377686 2023-01-22 12:40:03.491509: step: 40/466, loss: 0.5546318292617798 2023-01-22 12:40:04.284334: step: 42/466, loss: 0.2589946985244751 2023-01-22 12:40:04.998639: step: 44/466, loss: 0.3935251832008362 2023-01-22 12:40:05.749519: step: 46/466, loss: 0.4007113575935364 2023-01-22 12:40:06.586196: step: 48/466, loss: 0.1532585769891739 2023-01-22 12:40:07.358811: step: 50/466, loss: 0.24019591510295868 2023-01-22 12:40:08.111457: step: 52/466, loss: 0.16741792857646942 2023-01-22 12:40:08.839635: step: 54/466, loss: 0.1578289419412613 2023-01-22 12:40:09.591085: step: 56/466, loss: 0.24181082844734192 2023-01-22 12:40:10.478258: step: 58/466, loss: 0.2726386785507202 2023-01-22 12:40:11.166586: step: 60/466, loss: 0.2632262110710144 2023-01-22 12:40:11.912472: step: 62/466, loss: 0.382368803024292 2023-01-22 12:40:12.650360: step: 64/466, loss: 0.7976775169372559 2023-01-22 12:40:13.390066: step: 66/466, loss: 0.4883986711502075 2023-01-22 12:40:14.175834: step: 68/466, loss: 0.3494545519351959 2023-01-22 12:40:14.885691: step: 70/466, loss: 0.3140970468521118 2023-01-22 12:40:15.638816: step: 72/466, loss: 0.3909737765789032 2023-01-22 12:40:16.369022: step: 74/466, loss: 0.249564528465271 2023-01-22 12:40:17.109752: step: 76/466, loss: 0.03772410377860069 2023-01-22 12:40:17.832909: step: 78/466, loss: 0.4875684082508087 2023-01-22 
12:40:18.602862: step: 80/466, loss: 0.09279076755046844 2023-01-22 12:40:19.383701: step: 82/466, loss: 0.13388334214687347 2023-01-22 12:40:20.088388: step: 84/466, loss: 0.3358129858970642 2023-01-22 12:40:20.832080: step: 86/466, loss: 0.33978158235549927 2023-01-22 12:40:21.589802: step: 88/466, loss: 0.3743342161178589 2023-01-22 12:40:22.401027: step: 90/466, loss: 0.6772381067276001 2023-01-22 12:40:23.067756: step: 92/466, loss: 0.16973809897899628 2023-01-22 12:40:23.883587: step: 94/466, loss: 0.1664164513349533 2023-01-22 12:40:24.580980: step: 96/466, loss: 0.6884454488754272 2023-01-22 12:40:25.327774: step: 98/466, loss: 0.143169105052948 2023-01-22 12:40:26.146806: step: 100/466, loss: 0.5162626504898071 2023-01-22 12:40:26.934760: step: 102/466, loss: 0.24762707948684692 2023-01-22 12:40:27.803829: step: 104/466, loss: 0.1087818592786789 2023-01-22 12:40:28.538556: step: 106/466, loss: 0.1368941068649292 2023-01-22 12:40:29.319888: step: 108/466, loss: 0.3564227223396301 2023-01-22 12:40:30.073092: step: 110/466, loss: 0.36613261699676514 2023-01-22 12:40:30.832069: step: 112/466, loss: 0.2938078045845032 2023-01-22 12:40:31.601380: step: 114/466, loss: 0.7527884244918823 2023-01-22 12:40:32.396493: step: 116/466, loss: 0.15450149774551392 2023-01-22 12:40:33.237300: step: 118/466, loss: 1.102258563041687 2023-01-22 12:40:33.966403: step: 120/466, loss: 0.17737160623073578 2023-01-22 12:40:34.733312: step: 122/466, loss: 0.3365913927555084 2023-01-22 12:40:35.626535: step: 124/466, loss: 0.10386361926794052 2023-01-22 12:40:36.423886: step: 126/466, loss: 0.35903674364089966 2023-01-22 12:40:37.155590: step: 128/466, loss: 0.6990206241607666 2023-01-22 12:40:37.930198: step: 130/466, loss: 0.6001828908920288 2023-01-22 12:40:38.694506: step: 132/466, loss: 0.1223577931523323 2023-01-22 12:40:39.465468: step: 134/466, loss: 0.11910473555326462 2023-01-22 12:40:40.162657: step: 136/466, loss: 0.1736944168806076 2023-01-22 12:40:40.919742: step: 
138/466, loss: 0.25195005536079407 2023-01-22 12:40:41.634912: step: 140/466, loss: 0.24543677270412445 2023-01-22 12:40:42.360210: step: 142/466, loss: 0.21577134728431702 2023-01-22 12:40:43.113365: step: 144/466, loss: 0.13015221059322357 2023-01-22 12:40:43.854473: step: 146/466, loss: 0.0657990425825119 2023-01-22 12:40:44.614293: step: 148/466, loss: 0.22850392758846283 2023-01-22 12:40:45.464102: step: 150/466, loss: 0.22996000945568085 2023-01-22 12:40:46.296253: step: 152/466, loss: 0.10576073080301285 2023-01-22 12:40:47.098921: step: 154/466, loss: 0.2323477864265442 2023-01-22 12:40:47.819638: step: 156/466, loss: 0.269827663898468 2023-01-22 12:40:48.527777: step: 158/466, loss: 0.33273160457611084 2023-01-22 12:40:49.278053: step: 160/466, loss: 0.2759661078453064 2023-01-22 12:40:49.964708: step: 162/466, loss: 0.357607901096344 2023-01-22 12:40:50.717923: step: 164/466, loss: 0.24747194349765778 2023-01-22 12:40:51.449506: step: 166/466, loss: 0.24419671297073364 2023-01-22 12:40:52.222382: step: 168/466, loss: 0.3568973243236542 2023-01-22 12:40:52.912492: step: 170/466, loss: 0.43960994482040405 2023-01-22 12:40:53.679295: step: 172/466, loss: 0.2388741672039032 2023-01-22 12:40:54.444556: step: 174/466, loss: 0.13412341475486755 2023-01-22 12:40:55.205304: step: 176/466, loss: 0.17737331986427307 2023-01-22 12:40:55.922167: step: 178/466, loss: 0.04712221026420593 2023-01-22 12:40:56.612366: step: 180/466, loss: 0.44618692994117737 2023-01-22 12:40:57.367518: step: 182/466, loss: 0.4233967065811157 2023-01-22 12:40:58.199135: step: 184/466, loss: 0.4268166422843933 2023-01-22 12:40:58.934920: step: 186/466, loss: 0.3471412658691406 2023-01-22 12:40:59.700482: step: 188/466, loss: 0.1013251468539238 2023-01-22 12:41:00.453783: step: 190/466, loss: 0.47036483883857727 2023-01-22 12:41:01.208577: step: 192/466, loss: 0.1508309692144394 2023-01-22 12:41:02.017324: step: 194/466, loss: 0.5607315897941589 2023-01-22 12:41:02.786894: step: 196/466, 
loss: 0.18536363542079926 2023-01-22 12:41:03.526716: step: 198/466, loss: 0.29159408807754517 2023-01-22 12:41:04.231764: step: 200/466, loss: 0.18396173417568207 2023-01-22 12:41:05.027403: step: 202/466, loss: 0.0968318060040474 2023-01-22 12:41:05.798045: step: 204/466, loss: 0.09180274605751038 2023-01-22 12:41:06.517452: step: 206/466, loss: 0.2655928432941437 2023-01-22 12:41:07.252556: step: 208/466, loss: 0.1596938669681549 2023-01-22 12:41:08.003978: step: 210/466, loss: 0.3983745872974396 2023-01-22 12:41:08.730938: step: 212/466, loss: 0.2768838703632355 2023-01-22 12:41:09.506945: step: 214/466, loss: 0.325339674949646 2023-01-22 12:41:10.239981: step: 216/466, loss: 0.36471548676490784 2023-01-22 12:41:10.965242: step: 218/466, loss: 0.0601777583360672 2023-01-22 12:41:11.687454: step: 220/466, loss: 0.9182330369949341 2023-01-22 12:41:12.537953: step: 222/466, loss: 0.2803994119167328 2023-01-22 12:41:13.357997: step: 224/466, loss: 0.12595197558403015 2023-01-22 12:41:14.132559: step: 226/466, loss: 0.1642196774482727 2023-01-22 12:41:14.813382: step: 228/466, loss: 0.19627025723457336 2023-01-22 12:41:15.585333: step: 230/466, loss: 0.5533586144447327 2023-01-22 12:41:16.314559: step: 232/466, loss: 0.0949460119009018 2023-01-22 12:41:17.138834: step: 234/466, loss: 0.31896600127220154 2023-01-22 12:41:18.027247: step: 236/466, loss: 0.27299320697784424 2023-01-22 12:41:18.776997: step: 238/466, loss: 0.3383888304233551 2023-01-22 12:41:19.534853: step: 240/466, loss: 0.7563154101371765 2023-01-22 12:41:20.294393: step: 242/466, loss: 0.17838376760482788 2023-01-22 12:41:21.051753: step: 244/466, loss: 0.28580132126808167 2023-01-22 12:41:21.845853: step: 246/466, loss: 0.26450544595718384 2023-01-22 12:41:22.620997: step: 248/466, loss: 0.1821730136871338 2023-01-22 12:41:23.470686: step: 250/466, loss: 2.6174964904785156 2023-01-22 12:41:24.249857: step: 252/466, loss: 0.24531832337379456 2023-01-22 12:41:24.952550: step: 254/466, loss: 
0.1795777529478073 2023-01-22 12:41:25.740336: step: 256/466, loss: 0.2463114857673645 2023-01-22 12:41:26.430306: step: 258/466, loss: 0.43065595626831055 2023-01-22 12:41:27.243629: step: 260/466, loss: 0.23510803282260895 2023-01-22 12:41:28.043500: step: 262/466, loss: 0.3374987244606018 2023-01-22 12:41:28.859666: step: 264/466, loss: 0.2109946459531784 2023-01-22 12:41:29.602165: step: 266/466, loss: 0.4068831503391266 2023-01-22 12:41:30.458482: step: 268/466, loss: 0.2021590620279312 2023-01-22 12:41:31.160478: step: 270/466, loss: 0.468037486076355 2023-01-22 12:41:31.954188: step: 272/466, loss: 0.13995757699012756 2023-01-22 12:41:32.784235: step: 274/466, loss: 3.66815185546875 2023-01-22 12:41:33.584853: step: 276/466, loss: 1.1114652156829834 2023-01-22 12:41:34.383331: step: 278/466, loss: 0.22365108132362366 2023-01-22 12:41:35.230269: step: 280/466, loss: 0.40952688455581665 2023-01-22 12:41:36.030802: step: 282/466, loss: 0.44825875759124756 2023-01-22 12:41:36.763579: step: 284/466, loss: 0.308834969997406 2023-01-22 12:41:37.483453: step: 286/466, loss: 0.07672157138586044 2023-01-22 12:41:38.285309: step: 288/466, loss: 0.27245256304740906 2023-01-22 12:41:39.078195: step: 290/466, loss: 0.21029876172542572 2023-01-22 12:41:39.812907: step: 292/466, loss: 0.30801981687545776 2023-01-22 12:41:40.604331: step: 294/466, loss: 0.272743284702301 2023-01-22 12:41:41.393843: step: 296/466, loss: 1.1342663764953613 2023-01-22 12:41:42.110698: step: 298/466, loss: 0.2068939507007599 2023-01-22 12:41:42.917168: step: 300/466, loss: 0.4226273000240326 2023-01-22 12:41:43.741669: step: 302/466, loss: 0.35881322622299194 2023-01-22 12:41:44.579148: step: 304/466, loss: 0.0775582492351532 2023-01-22 12:41:45.330225: step: 306/466, loss: 0.2552018165588379 2023-01-22 12:41:46.076318: step: 308/466, loss: 0.20988473296165466 2023-01-22 12:41:46.807522: step: 310/466, loss: 0.09981327503919601 2023-01-22 12:41:47.552034: step: 312/466, loss: 0.26077914237976074 
2023-01-22 12:41:48.365953: step: 314/466, loss: 0.24229076504707336 2023-01-22 12:41:49.109822: step: 316/466, loss: 0.19912472367286682 2023-01-22 12:41:49.927905: step: 318/466, loss: 0.3713231682777405 2023-01-22 12:41:50.735253: step: 320/466, loss: 0.20230260491371155 2023-01-22 12:41:51.491398: step: 322/466, loss: 0.356352299451828 2023-01-22 12:41:52.237725: step: 324/466, loss: 0.22180378437042236 2023-01-22 12:41:53.008394: step: 326/466, loss: 1.2216380834579468 2023-01-22 12:41:53.835745: step: 328/466, loss: 0.4283653795719147 2023-01-22 12:41:54.614311: step: 330/466, loss: 0.16802679002285004 2023-01-22 12:41:55.372445: step: 332/466, loss: 0.458918035030365 2023-01-22 12:41:56.096206: step: 334/466, loss: 0.4632418751716614 2023-01-22 12:41:56.880976: step: 336/466, loss: 0.06715085357427597 2023-01-22 12:41:57.607452: step: 338/466, loss: 0.33738797903060913 2023-01-22 12:41:58.419192: step: 340/466, loss: 0.5246156454086304 2023-01-22 12:41:59.210436: step: 342/466, loss: 0.456807404756546 2023-01-22 12:41:59.980878: step: 344/466, loss: 0.8150843381881714 2023-01-22 12:42:00.769472: step: 346/466, loss: 0.6331696510314941 2023-01-22 12:42:01.540097: step: 348/466, loss: 0.23484434187412262 2023-01-22 12:42:02.334847: step: 350/466, loss: 0.9283328652381897 2023-01-22 12:42:03.072262: step: 352/466, loss: 0.20326565206050873 2023-01-22 12:42:03.885300: step: 354/466, loss: 0.13701821863651276 2023-01-22 12:42:04.666429: step: 356/466, loss: 0.47411486506462097 2023-01-22 12:42:05.431788: step: 358/466, loss: 0.41743534803390503 2023-01-22 12:42:06.197004: step: 360/466, loss: 0.1875670850276947 2023-01-22 12:42:06.944946: step: 362/466, loss: 0.2520168721675873 2023-01-22 12:42:07.660039: step: 364/466, loss: 0.33447718620300293 2023-01-22 12:42:08.454023: step: 366/466, loss: 0.4753393530845642 2023-01-22 12:42:09.242834: step: 368/466, loss: 0.13572938740253448 2023-01-22 12:42:10.173863: step: 370/466, loss: 0.19028696417808533 2023-01-22 
12:42:11.009997: step: 372/466, loss: 0.2276836782693863 2023-01-22 12:42:11.858816: step: 374/466, loss: 0.2572558522224426 2023-01-22 12:42:12.532175: step: 376/466, loss: 0.285245418548584 2023-01-22 12:42:13.339275: step: 378/466, loss: 0.061036914587020874 2023-01-22 12:42:14.149066: step: 380/466, loss: 0.48633047938346863 2023-01-22 12:42:14.890676: step: 382/466, loss: 0.20658385753631592 2023-01-22 12:42:15.606104: step: 384/466, loss: 0.11859611421823502 2023-01-22 12:42:16.312787: step: 386/466, loss: 0.7415170073509216 2023-01-22 12:42:17.100254: step: 388/466, loss: 0.27088481187820435 2023-01-22 12:42:17.945464: step: 390/466, loss: 0.29326778650283813 2023-01-22 12:42:18.755676: step: 392/466, loss: 0.3623054623603821 2023-01-22 12:42:19.512191: step: 394/466, loss: 0.6543350219726562 2023-01-22 12:42:20.242922: step: 396/466, loss: 0.7425453066825867 2023-01-22 12:42:21.063598: step: 398/466, loss: 0.28374147415161133 2023-01-22 12:42:21.815950: step: 400/466, loss: 0.1749078333377838 2023-01-22 12:42:22.585572: step: 402/466, loss: 0.21680453419685364 2023-01-22 12:42:23.321692: step: 404/466, loss: 0.09921462088823318 2023-01-22 12:42:24.089126: step: 406/466, loss: 0.2960980236530304 2023-01-22 12:42:24.834738: step: 408/466, loss: 0.2041424661874771 2023-01-22 12:42:25.629789: step: 410/466, loss: 1.829854965209961 2023-01-22 12:42:26.413654: step: 412/466, loss: 1.0091545581817627 2023-01-22 12:42:27.225166: step: 414/466, loss: 0.8316347002983093 2023-01-22 12:42:27.984921: step: 416/466, loss: 0.51683109998703 2023-01-22 12:42:28.707426: step: 418/466, loss: 0.44507837295532227 2023-01-22 12:42:29.504971: step: 420/466, loss: 0.3541668951511383 2023-01-22 12:42:30.298867: step: 422/466, loss: 0.09778696298599243 2023-01-22 12:42:31.046148: step: 424/466, loss: 0.27539190649986267 2023-01-22 12:42:31.794304: step: 426/466, loss: 0.11397480964660645 2023-01-22 12:42:32.655695: step: 428/466, loss: 0.44090360403060913 2023-01-22 12:42:33.429642: 
step: 430/466, loss: 0.1974068135023117 2023-01-22 12:42:34.165000: step: 432/466, loss: 0.5693545937538147 2023-01-22 12:42:34.915119: step: 434/466, loss: 0.3817470669746399 2023-01-22 12:42:35.657347: step: 436/466, loss: 0.17790962755680084 2023-01-22 12:42:36.354451: step: 438/466, loss: 0.22396616637706757 2023-01-22 12:42:37.094750: step: 440/466, loss: 0.3058200776576996 2023-01-22 12:42:37.874562: step: 442/466, loss: 0.21500913798809052 2023-01-22 12:42:38.661964: step: 444/466, loss: 0.25797945261001587 2023-01-22 12:42:39.400203: step: 446/466, loss: 0.3602447509765625 2023-01-22 12:42:40.099645: step: 448/466, loss: 1.1268690824508667 2023-01-22 12:42:40.892843: step: 450/466, loss: 0.12626045942306519 2023-01-22 12:42:41.666150: step: 452/466, loss: 0.523021399974823 2023-01-22 12:42:42.413583: step: 454/466, loss: 0.21925614774227142 2023-01-22 12:42:43.171422: step: 456/466, loss: 0.6794977188110352 2023-01-22 12:42:43.923238: step: 458/466, loss: 0.09772194921970367 2023-01-22 12:42:44.676175: step: 460/466, loss: 0.5687675476074219 2023-01-22 12:42:45.472034: step: 462/466, loss: 0.38782799243927 2023-01-22 12:42:46.204809: step: 464/466, loss: 0.35689419507980347 2023-01-22 12:42:46.932691: step: 466/466, loss: 0.6951501965522766 2023-01-22 12:42:47.683198: step: 468/466, loss: 0.12729394435882568 2023-01-22 12:42:48.455796: step: 470/466, loss: 0.6348598599433899 2023-01-22 12:42:49.178461: step: 472/466, loss: 1.082305669784546 2023-01-22 12:42:49.973454: step: 474/466, loss: 3.6716866493225098 2023-01-22 12:42:50.729751: step: 476/466, loss: 0.23165543377399445 2023-01-22 12:42:51.517441: step: 478/466, loss: 0.11158735305070877 2023-01-22 12:42:52.271349: step: 480/466, loss: 1.666973352432251 2023-01-22 12:42:53.019700: step: 482/466, loss: 0.3452420234680176 2023-01-22 12:42:53.703061: step: 484/466, loss: 0.5956166386604309 2023-01-22 12:42:54.430070: step: 486/466, loss: 0.13574360311031342 2023-01-22 12:42:55.271758: step: 488/466, loss: 
0.20491208136081696 2023-01-22 12:42:56.051155: step: 490/466, loss: 2.7010035514831543 2023-01-22 12:42:56.777448: step: 492/466, loss: 0.15974898636341095 2023-01-22 12:42:57.563673: step: 494/466, loss: 0.374811589717865 2023-01-22 12:42:58.297809: step: 496/466, loss: 0.21302439272403717 2023-01-22 12:42:59.082009: step: 498/466, loss: 0.21756063401699066 2023-01-22 12:42:59.822576: step: 500/466, loss: 0.2711212933063507 2023-01-22 12:43:00.674298: step: 502/466, loss: 0.30291667580604553 2023-01-22 12:43:01.552231: step: 504/466, loss: 0.26272809505462646 2023-01-22 12:43:02.308023: step: 506/466, loss: 0.8830516338348389 2023-01-22 12:43:03.060065: step: 508/466, loss: 0.17511196434497833 2023-01-22 12:43:03.776890: step: 510/466, loss: 0.2928393483161926 2023-01-22 12:43:04.705681: step: 512/466, loss: 0.9481021761894226 2023-01-22 12:43:05.481711: step: 514/466, loss: 0.22100970149040222 2023-01-22 12:43:06.286214: step: 516/466, loss: 0.42827150225639343 2023-01-22 12:43:07.070189: step: 518/466, loss: 0.686099112033844 2023-01-22 12:43:07.854871: step: 520/466, loss: 0.35829856991767883 2023-01-22 12:43:08.579816: step: 522/466, loss: 0.8147534728050232 2023-01-22 12:43:09.295047: step: 524/466, loss: 0.3499932587146759 2023-01-22 12:43:10.067879: step: 526/466, loss: 0.15522895753383636 2023-01-22 12:43:10.814454: step: 528/466, loss: 0.27899911999702454 2023-01-22 12:43:11.576292: step: 530/466, loss: 0.2625729441642761 2023-01-22 12:43:12.269906: step: 532/466, loss: 0.3538782000541687 2023-01-22 12:43:12.991932: step: 534/466, loss: 0.22810465097427368 2023-01-22 12:43:13.727887: step: 536/466, loss: 0.39624813199043274 2023-01-22 12:43:14.555838: step: 538/466, loss: 0.2748154401779175 2023-01-22 12:43:15.359769: step: 540/466, loss: 0.2919783294200897 2023-01-22 12:43:16.108137: step: 542/466, loss: 0.5167675614356995 2023-01-22 12:43:16.859892: step: 544/466, loss: 0.17415225505828857 2023-01-22 12:43:17.601971: step: 546/466, loss: 
0.11248274147510529 2023-01-22 12:43:18.287723: step: 548/466, loss: 0.5512552857398987 2023-01-22 12:43:19.085225: step: 550/466, loss: 0.3146078884601593 2023-01-22 12:43:19.952856: step: 552/466, loss: 1.2088881731033325 2023-01-22 12:43:20.759689: step: 554/466, loss: 0.17362341284751892 2023-01-22 12:43:21.671970: step: 556/466, loss: 0.16103479266166687 2023-01-22 12:43:22.428656: step: 558/466, loss: 0.6395021677017212 2023-01-22 12:43:23.130494: step: 560/466, loss: 0.22681139409542084 2023-01-22 12:43:23.879069: step: 562/466, loss: 0.6811214089393616 2023-01-22 12:43:24.658864: step: 564/466, loss: 0.3851960599422455 2023-01-22 12:43:25.462479: step: 566/466, loss: 0.13778236508369446 2023-01-22 12:43:26.318873: step: 568/466, loss: 0.2697453498840332 2023-01-22 12:43:27.039499: step: 570/466, loss: 0.15944616496562958 2023-01-22 12:43:27.816171: step: 572/466, loss: 0.06645628064870834 2023-01-22 12:43:28.549801: step: 574/466, loss: 0.16914452612400055 2023-01-22 12:43:29.199093: step: 576/466, loss: 0.38253024220466614 2023-01-22 12:43:30.037576: step: 578/466, loss: 1.3886417150497437 2023-01-22 12:43:30.826811: step: 580/466, loss: 0.1061757430434227 2023-01-22 12:43:31.675780: step: 582/466, loss: 0.3046887218952179 2023-01-22 12:43:32.520900: step: 584/466, loss: 0.42094412446022034 2023-01-22 12:43:33.343501: step: 586/466, loss: 0.09330093115568161 2023-01-22 12:43:34.097163: step: 588/466, loss: 0.40052181482315063 2023-01-22 12:43:34.814165: step: 590/466, loss: 0.22441357374191284 2023-01-22 12:43:35.553778: step: 592/466, loss: 0.09866520762443542 2023-01-22 12:43:36.371074: step: 594/466, loss: 0.37762102484703064 2023-01-22 12:43:37.092008: step: 596/466, loss: 0.102919802069664 2023-01-22 12:43:37.915723: step: 598/466, loss: 4.134809494018555 2023-01-22 12:43:38.681101: step: 600/466, loss: 0.17482620477676392 2023-01-22 12:43:39.425808: step: 602/466, loss: 0.9702067971229553 2023-01-22 12:43:40.165565: step: 604/466, loss: 
0.12988156080245972 2023-01-22 12:43:40.880317: step: 606/466, loss: 3.7468557357788086 2023-01-22 12:43:41.656304: step: 608/466, loss: 0.04453423246741295 2023-01-22 12:43:42.399013: step: 610/466, loss: 0.14342595636844635 2023-01-22 12:43:43.122235: step: 612/466, loss: 0.12794727087020874 2023-01-22 12:43:43.902347: step: 614/466, loss: 0.17742207646369934 2023-01-22 12:43:44.653636: step: 616/466, loss: 0.5601744651794434 2023-01-22 12:43:45.362803: step: 618/466, loss: 0.10394865274429321 2023-01-22 12:43:46.235573: step: 620/466, loss: 0.1495993435382843 2023-01-22 12:43:46.948313: step: 622/466, loss: 0.1939275562763214 2023-01-22 12:43:47.708013: step: 624/466, loss: 0.26793724298477173 2023-01-22 12:43:48.468604: step: 626/466, loss: 0.26115572452545166 2023-01-22 12:43:49.195844: step: 628/466, loss: 0.223658487200737 2023-01-22 12:43:49.907511: step: 630/466, loss: 0.21691951155662537 2023-01-22 12:43:50.633356: step: 632/466, loss: 0.16291409730911255 2023-01-22 12:43:51.415602: step: 634/466, loss: 0.11505762487649918 2023-01-22 12:43:52.126367: step: 636/466, loss: 0.5818736553192139 2023-01-22 12:43:52.877847: step: 638/466, loss: 0.35348910093307495 2023-01-22 12:43:53.665290: step: 640/466, loss: 0.6107161641120911 2023-01-22 12:43:54.437879: step: 642/466, loss: 0.5039299726486206 2023-01-22 12:43:55.138587: step: 644/466, loss: 0.4757615327835083 2023-01-22 12:43:55.958333: step: 646/466, loss: 0.09726975858211517 2023-01-22 12:43:56.766465: step: 648/466, loss: 0.644382655620575 2023-01-22 12:43:57.469541: step: 650/466, loss: 0.2625083923339844 2023-01-22 12:43:58.237026: step: 652/466, loss: 0.1770208179950714 2023-01-22 12:43:58.997343: step: 654/466, loss: 0.3707251250743866 2023-01-22 12:43:59.694020: step: 656/466, loss: 1.1968345642089844 2023-01-22 12:44:00.579127: step: 658/466, loss: 0.3440450429916382 2023-01-22 12:44:01.315661: step: 660/466, loss: 0.6458595395088196 2023-01-22 12:44:02.090834: step: 662/466, loss: 
0.09910833090543747 2023-01-22 12:44:02.807106: step: 664/466, loss: 0.13965816795825958 2023-01-22 12:44:03.509451: step: 666/466, loss: 0.15654996037483215 2023-01-22 12:44:04.320123: step: 668/466, loss: 0.36312490701675415 2023-01-22 12:44:05.079408: step: 670/466, loss: 0.820220947265625 2023-01-22 12:44:05.915506: step: 672/466, loss: 0.40227824449539185 2023-01-22 12:44:06.666501: step: 674/466, loss: 0.2854505479335785 2023-01-22 12:44:07.412471: step: 676/466, loss: 0.1591416746377945 2023-01-22 12:44:08.198993: step: 678/466, loss: 0.23695874214172363 2023-01-22 12:44:09.003166: step: 680/466, loss: 0.16190719604492188 2023-01-22 12:44:09.728687: step: 682/466, loss: 0.6181747913360596 2023-01-22 12:44:10.496666: step: 684/466, loss: 0.4995764493942261 2023-01-22 12:44:11.204766: step: 686/466, loss: 0.21558211743831635 2023-01-22 12:44:11.954305: step: 688/466, loss: 0.18158429861068726 2023-01-22 12:44:12.743093: step: 690/466, loss: 0.5408704876899719 2023-01-22 12:44:13.470226: step: 692/466, loss: 0.28732776641845703 2023-01-22 12:44:14.216399: step: 694/466, loss: 0.07246675342321396 2023-01-22 12:44:14.961526: step: 696/466, loss: 0.41384613513946533 2023-01-22 12:44:15.721682: step: 698/466, loss: 0.23722220957279205 2023-01-22 12:44:16.518980: step: 700/466, loss: 0.36485758423805237 2023-01-22 12:44:17.344827: step: 702/466, loss: 0.2078809142112732 2023-01-22 12:44:18.234788: step: 704/466, loss: 0.12964241206645966 2023-01-22 12:44:18.945350: step: 706/466, loss: 0.4992145895957947 2023-01-22 12:44:19.765813: step: 708/466, loss: 0.19601070880889893 2023-01-22 12:44:20.461573: step: 710/466, loss: 0.703117847442627 2023-01-22 12:44:21.198507: step: 712/466, loss: 0.39133965969085693 2023-01-22 12:44:22.044569: step: 714/466, loss: 0.5801993012428284 2023-01-22 12:44:22.829847: step: 716/466, loss: 0.15397311747074127 2023-01-22 12:44:23.552659: step: 718/466, loss: 0.06464719772338867 2023-01-22 12:44:24.424753: step: 720/466, loss: 
0.36740589141845703 2023-01-22 12:44:25.234759: step: 722/466, loss: 0.09521396458148956 2023-01-22 12:44:26.030061: step: 724/466, loss: 0.46077385544776917 2023-01-22 12:44:26.730279: step: 726/466, loss: 0.18440701067447662 2023-01-22 12:44:27.462844: step: 728/466, loss: 0.2911909520626068 2023-01-22 12:44:28.247582: step: 730/466, loss: 0.17030131816864014 2023-01-22 12:44:29.069306: step: 732/466, loss: 0.305722713470459 2023-01-22 12:44:29.827683: step: 734/466, loss: 0.20065052807331085 2023-01-22 12:44:30.629596: step: 736/466, loss: 0.5773312449455261 2023-01-22 12:44:31.472370: step: 738/466, loss: 0.6475232839584351 2023-01-22 12:44:32.291158: step: 740/466, loss: 0.29985684156417847 2023-01-22 12:44:33.005324: step: 742/466, loss: 0.573562502861023 2023-01-22 12:44:33.756451: step: 744/466, loss: 0.08346546441316605 2023-01-22 12:44:34.488506: step: 746/466, loss: 0.2765767276287079 2023-01-22 12:44:35.222178: step: 748/466, loss: 0.40180301666259766 2023-01-22 12:44:35.990841: step: 750/466, loss: 0.6855344176292419 2023-01-22 12:44:36.797795: step: 752/466, loss: 0.12331660836935043 2023-01-22 12:44:37.576553: step: 754/466, loss: 0.6254172325134277 2023-01-22 12:44:38.367603: step: 756/466, loss: 0.2580622136592865 2023-01-22 12:44:39.123301: step: 758/466, loss: 0.29950806498527527 2023-01-22 12:44:39.844999: step: 760/466, loss: 0.16383399069309235 2023-01-22 12:44:40.654300: step: 762/466, loss: 0.140916109085083 2023-01-22 12:44:41.391787: step: 764/466, loss: 0.15139026939868927 2023-01-22 12:44:42.135227: step: 766/466, loss: 0.11691779643297195 2023-01-22 12:44:42.873217: step: 768/466, loss: 0.2745065689086914 2023-01-22 12:44:43.627772: step: 770/466, loss: 0.5456153154373169 2023-01-22 12:44:44.380532: step: 772/466, loss: 1.0830720663070679 2023-01-22 12:44:45.140983: step: 774/466, loss: 0.3399311900138855 2023-01-22 12:44:45.909928: step: 776/466, loss: 0.7403802275657654 2023-01-22 12:44:46.617747: step: 778/466, loss: 
0.18672645092010498 2023-01-22 12:44:47.369407: step: 780/466, loss: 0.2659226953983307 2023-01-22 12:44:48.127488: step: 782/466, loss: 0.33941036462783813 2023-01-22 12:44:49.010818: step: 784/466, loss: 0.40195438265800476 2023-01-22 12:44:49.846666: step: 786/466, loss: 0.20171843469142914 2023-01-22 12:44:50.585825: step: 788/466, loss: 0.2638154625892639 2023-01-22 12:44:51.266710: step: 790/466, loss: 0.25721925497055054 2023-01-22 12:44:52.107135: step: 792/466, loss: 0.6966685056686401 2023-01-22 12:44:52.898998: step: 794/466, loss: 0.9811992049217224 2023-01-22 12:44:53.642284: step: 796/466, loss: 0.26549002528190613 2023-01-22 12:44:54.421715: step: 798/466, loss: 0.36354395747184753 2023-01-22 12:44:55.193571: step: 800/466, loss: 1.4915454387664795 2023-01-22 12:44:55.952887: step: 802/466, loss: 0.2320493459701538 2023-01-22 12:44:56.644013: step: 804/466, loss: 0.5532269477844238 2023-01-22 12:44:57.486976: step: 806/466, loss: 0.3147946000099182 2023-01-22 12:44:58.303742: step: 808/466, loss: 0.6030541062355042 2023-01-22 12:44:59.045454: step: 810/466, loss: 0.25675368309020996 2023-01-22 12:44:59.799966: step: 812/466, loss: 0.12279326468706131 2023-01-22 12:45:00.488422: step: 814/466, loss: 0.09616804122924805 2023-01-22 12:45:01.233799: step: 816/466, loss: 0.22024525701999664 2023-01-22 12:45:02.133452: step: 818/466, loss: 0.9077953100204468 2023-01-22 12:45:02.877290: step: 820/466, loss: 0.10984183847904205 2023-01-22 12:45:03.606986: step: 822/466, loss: 0.19995927810668945 2023-01-22 12:45:04.339629: step: 824/466, loss: 0.2009981870651245 2023-01-22 12:45:05.211461: step: 826/466, loss: 0.42771244049072266 2023-01-22 12:45:06.030484: step: 828/466, loss: 0.16702905297279358 2023-01-22 12:45:06.756172: step: 830/466, loss: 0.32878464460372925 2023-01-22 12:45:07.514628: step: 832/466, loss: 0.3727845549583435 2023-01-22 12:45:08.203248: step: 834/466, loss: 0.1593688279390335 2023-01-22 12:45:08.888480: step: 836/466, loss: 
0.15850010514259338 2023-01-22 12:45:09.655072: step: 838/466, loss: 0.1300395280122757 2023-01-22 12:45:10.375079: step: 840/466, loss: 0.29905185103416443 2023-01-22 12:45:11.145988: step: 842/466, loss: 0.1523512899875641 2023-01-22 12:45:11.888666: step: 844/466, loss: 2.24468994140625 2023-01-22 12:45:12.649141: step: 846/466, loss: 0.5050785541534424 2023-01-22 12:45:13.460185: step: 848/466, loss: 0.21456189453601837 2023-01-22 12:45:14.195826: step: 850/466, loss: 0.22099438309669495 2023-01-22 12:45:14.957318: step: 852/466, loss: 0.6934733986854553 2023-01-22 12:45:15.646455: step: 854/466, loss: 2.9839396476745605 2023-01-22 12:45:16.346462: step: 856/466, loss: 0.40766406059265137 2023-01-22 12:45:17.034889: step: 858/466, loss: 0.14863476157188416 2023-01-22 12:45:17.773164: step: 860/466, loss: 0.10019338876008987 2023-01-22 12:45:18.503440: step: 862/466, loss: 0.4024325907230377 2023-01-22 12:45:19.189451: step: 864/466, loss: 0.12153172492980957 2023-01-22 12:45:19.941855: step: 866/466, loss: 0.41037943959236145 2023-01-22 12:45:20.667320: step: 868/466, loss: 0.1819164752960205 2023-01-22 12:45:21.435167: step: 870/466, loss: 0.13573968410491943 2023-01-22 12:45:22.331474: step: 872/466, loss: 0.4748840630054474 2023-01-22 12:45:23.099946: step: 874/466, loss: 0.09928253293037415 2023-01-22 12:45:23.928111: step: 876/466, loss: 0.27209317684173584 2023-01-22 12:45:24.707202: step: 878/466, loss: 0.2346189022064209 2023-01-22 12:45:25.404835: step: 880/466, loss: 1.9735147953033447 2023-01-22 12:45:26.219341: step: 882/466, loss: 0.4819077253341675 2023-01-22 12:45:27.139298: step: 884/466, loss: 0.22298583388328552 2023-01-22 12:45:27.935745: step: 886/466, loss: 0.23651434481143951 2023-01-22 12:45:28.689595: step: 888/466, loss: 0.1375943422317505 2023-01-22 12:45:29.527567: step: 890/466, loss: 0.39386996626853943 2023-01-22 12:45:30.319510: step: 892/466, loss: 0.6610428094863892 2023-01-22 12:45:31.144842: step: 894/466, loss: 
0.33950868248939514
2023-01-22 12:45:31.998563: step: 896/466, loss: 0.41958087682724
2023-01-22 12:45:32.706268: step: 898/466, loss: 0.2866438031196594
2023-01-22 12:45:33.547466: step: 900/466, loss: 0.609981894493103
2023-01-22 12:45:34.350870: step: 902/466, loss: 0.7020177245140076
2023-01-22 12:45:35.114704: step: 904/466, loss: 0.34889107942581177
2023-01-22 12:45:35.883796: step: 906/466, loss: 0.7393907308578491
2023-01-22 12:45:36.622026: step: 908/466, loss: 0.34370091557502747
2023-01-22 12:45:37.287494: step: 910/466, loss: 0.3134104013442993
2023-01-22 12:45:38.064926: step: 912/466, loss: 0.3483279347419739
2023-01-22 12:45:38.796070: step: 914/466, loss: 0.38782113790512085
2023-01-22 12:45:39.575610: step: 916/466, loss: 0.118745356798172
2023-01-22 12:45:40.360886: step: 918/466, loss: 0.6202765107154846
2023-01-22 12:45:41.058962: step: 920/466, loss: 0.1659417301416397
2023-01-22 12:45:41.849840: step: 922/466, loss: 0.19849182665348053
2023-01-22 12:45:42.628213: step: 924/466, loss: 0.17553061246871948
2023-01-22 12:45:43.310628: step: 926/466, loss: 0.0439554899930954
2023-01-22 12:45:44.055576: step: 928/466, loss: 0.15662969648838043
2023-01-22 12:45:44.828608: step: 930/466, loss: 0.199647918343544
2023-01-22 12:45:45.689604: step: 932/466, loss: 0.6263249516487122
==================================================
Loss: 0.405
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30984143611899123, 'r': 0.2933792725301264, 'f1': 0.3013857244120402}, 'combined': 0.22207369167202962, 'epoch': 9}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3646924743811674, 'r': 0.30880453674872876, 'f1': 0.33442966708371474}, 'combined': 0.20555189293925882, 'epoch': 9}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2849824614530497, 'r': 0.30012384460425534, 'f1': 0.2923572386440713}, 'combined': 0.21542112321142096, 'epoch': 9}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3447983441318658, 'r': 0.31494567364425835, 'f1': 0.3291966091032746}, 'combined': 0.2023354768146956, 'epoch': 9}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.319515824622532, 'r': 0.29829560856600706, 'f1': 0.30854128697602695}, 'combined': 0.22734621145601985, 'epoch': 9}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3632546321952892, 'r': 0.30412747128384815, 'f1': 0.3310718466850562}, 'combined': 0.20448555236429944, 'epoch': 9}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2826086956521739, 'r': 0.37142857142857144, 'f1': 0.32098765432098764}, 'combined': 0.21399176954732507, 'epoch': 9}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3014705882352941, 'r': 0.44565217391304346, 'f1': 0.3596491228070175}, 'combined': 0.17982456140350875, 'epoch': 9}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 9}
New best chinese model...
New best russian model...
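Aside (not part of the original log): from the logged values, each 'f1' field appears to be the harmonic mean of its 'p' and 'r', and 'combined' matches the product of the template-level and slot-level F1. A minimal sketch checking this against the Dev Chinese numbers above; the function names are illustrative, not taken from train.py:

```python
# Illustrative reconstruction of the metric relationships inferred from
# the logged dicts; this is an assumption, not code from train.py.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """'combined' appears to be template F1 times slot F1."""
    return template["f1"] * slot["f1"]

# Dev Chinese values copied from the epoch-9 summary above:
template = {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579}
slot = {"p": 0.30984143611899123, "r": 0.2933792725301264,
        "f1": 0.3013857244120402}

assert abs(f1(template["p"], template["r"]) - template["f1"]) < 1e-9
assert abs(f1(slot["p"], slot["r"]) - slot["f1"]) < 1e-9
assert abs(combined_score(template, slot) - 0.22207369167202962) < 1e-9
```

Under this reading, the per-language "New best ... model..." lines would be triggered when a language's dev 'combined' exceeds its previous best, which is consistent with the epoch indices (9 vs. 7) in the "Current best result" block below.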
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30984143611899123, 'r': 0.2933792725301264, 'f1': 0.3013857244120402}, 'combined': 0.22207369167202962, 'epoch': 9}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3646924743811674, 'r': 0.30880453674872876, 'f1': 0.33442966708371474}, 'combined': 0.20555189293925882, 'epoch': 9}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2826086956521739, 'r': 0.37142857142857144, 'f1': 0.32098765432098764}, 'combined': 0.21399176954732507, 'epoch': 9}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2918760820300343, 'r': 0.3190144653686902, 'f1': 0.30484247189356256}, 'combined': 0.22462076876367768, 'epoch': 7}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3237627861218527, 'r': 0.26596804267202456, 'f1': 0.2920334169776559}, 'combined': 0.17949370994724215, 'epoch': 7}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28846153846153844, 'r': 0.4891304347826087, 'f1': 0.3629032258064516}, 'combined': 0.1814516129032258, 'epoch': 7}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.319515824622532, 'r': 0.29829560856600706, 'f1': 0.30854128697602695}, 'combined': 0.22734621145601985, 'epoch': 9}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3632546321952892, 'r': 0.30412747128384815, 'f1': 0.3310718466850562}, 'combined': 0.20448555236429944, 'epoch': 9}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 9}
******************************
Epoch: 10
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:48:50.580185: step: 2/466, loss: 0.2337709665298462
2023-01-22 12:48:51.397805: step: 4/466, loss: 0.26833581924438477
2023-01-22 12:48:52.138568: step: 6/466, loss: 0.14239007234573364
2023-01-22 12:48:53.009758: step: 8/466, loss: 0.1602407991886139
2023-01-22 12:48:53.846458: step: 10/466, loss: 0.30238527059555054
2023-01-22 12:48:54.690980: step: 12/466, loss: 0.4026970863342285
2023-01-22 12:48:55.447953: step: 14/466, loss: 1.2485120296478271
2023-01-22 12:48:56.240902: step: 16/466, loss: 0.3447570204734802
2023-01-22 12:48:57.058642: step: 18/466, loss: 0.09246478229761124
2023-01-22 12:48:57.891041: step: 20/466, loss: 0.26743602752685547
2023-01-22 12:48:58.612441: step: 22/466, loss: 0.35447368025779724
2023-01-22 12:48:59.434961: step: 24/466, loss: 0.17007564008235931
2023-01-22 12:49:00.158554: step: 26/466, loss: 0.1545749008655548
2023-01-22 12:49:00.840459: step: 28/466, loss: 0.11063441634178162
2023-01-22 12:49:01.650325: step: 30/466, loss: 1.414306879043579
2023-01-22 12:49:02.457857: step: 32/466, loss: 0.15004459023475647
2023-01-22 12:49:03.194620: step: 34/466, loss: 0.34166383743286133
2023-01-22 12:49:04.001186: step: 36/466, loss: 0.1305786520242691
2023-01-22 12:49:04.748623: step: 38/466, loss: 0.16415190696716309
2023-01-22 12:49:05.579357: step: 40/466, loss: 0.9625821709632874
2023-01-22 12:49:06.344910: step: 42/466, loss: 0.11631202697753906
2023-01-22 12:49:07.038630: step: 44/466, loss: 0.09589099138975143
2023-01-22 12:49:07.802082: step: 46/466, loss: 0.24927857518196106
2023-01-22 12:49:08.572827: step: 48/466, loss: 
0.0878019705414772 2023-01-22 12:49:09.333724: step: 50/466, loss: 0.19715267419815063 2023-01-22 12:49:10.123289: step: 52/466, loss: 0.13326187431812286 2023-01-22 12:49:10.857227: step: 54/466, loss: 0.07522831857204437 2023-01-22 12:49:11.578255: step: 56/466, loss: 0.09932447969913483 2023-01-22 12:49:12.320277: step: 58/466, loss: 0.2271547019481659 2023-01-22 12:49:13.066411: step: 60/466, loss: 0.9119616746902466 2023-01-22 12:49:13.800849: step: 62/466, loss: 0.13493891060352325 2023-01-22 12:49:14.595813: step: 64/466, loss: 0.21414585411548615 2023-01-22 12:49:15.373268: step: 66/466, loss: 1.22073233127594 2023-01-22 12:49:16.175072: step: 68/466, loss: 0.07765951752662659 2023-01-22 12:49:16.922167: step: 70/466, loss: 0.2033333033323288 2023-01-22 12:49:17.683490: step: 72/466, loss: 0.26286882162094116 2023-01-22 12:49:18.506400: step: 74/466, loss: 0.9073281288146973 2023-01-22 12:49:19.247679: step: 76/466, loss: 0.2290145456790924 2023-01-22 12:49:19.990089: step: 78/466, loss: 0.80869060754776 2023-01-22 12:49:20.722249: step: 80/466, loss: 0.14732767641544342 2023-01-22 12:49:21.481902: step: 82/466, loss: 0.10995560884475708 2023-01-22 12:49:22.276395: step: 84/466, loss: 0.27620553970336914 2023-01-22 12:49:23.032250: step: 86/466, loss: 0.10498054325580597 2023-01-22 12:49:23.726499: step: 88/466, loss: 0.2640683352947235 2023-01-22 12:49:24.483998: step: 90/466, loss: 0.143999382853508 2023-01-22 12:49:25.260140: step: 92/466, loss: 0.38430488109588623 2023-01-22 12:49:25.998587: step: 94/466, loss: 0.12708406150341034 2023-01-22 12:49:26.872092: step: 96/466, loss: 0.16920247673988342 2023-01-22 12:49:27.650837: step: 98/466, loss: 0.13652513921260834 2023-01-22 12:49:28.507389: step: 100/466, loss: 1.5320005416870117 2023-01-22 12:49:29.273514: step: 102/466, loss: 0.2467314749956131 2023-01-22 12:49:29.957226: step: 104/466, loss: 0.3934531807899475 2023-01-22 12:49:30.802558: step: 106/466, loss: 0.18988072872161865 2023-01-22 
12:49:31.599763: step: 108/466, loss: 0.16739220917224884 2023-01-22 12:49:32.428494: step: 110/466, loss: 0.1678118109703064 2023-01-22 12:49:33.228691: step: 112/466, loss: 0.0868569165468216 2023-01-22 12:49:33.988920: step: 114/466, loss: 0.2291712462902069 2023-01-22 12:49:34.908504: step: 116/466, loss: 0.264020711183548 2023-01-22 12:49:35.660201: step: 118/466, loss: 0.5153694748878479 2023-01-22 12:49:36.359392: step: 120/466, loss: 0.051767732948064804 2023-01-22 12:49:37.059082: step: 122/466, loss: 0.2249072939157486 2023-01-22 12:49:37.925491: step: 124/466, loss: 0.26452016830444336 2023-01-22 12:49:38.621535: step: 126/466, loss: 0.4013260304927826 2023-01-22 12:49:39.295681: step: 128/466, loss: 0.24689669907093048 2023-01-22 12:49:40.145833: step: 130/466, loss: 0.3453787863254547 2023-01-22 12:49:40.980220: step: 132/466, loss: 0.14687740802764893 2023-01-22 12:49:41.699824: step: 134/466, loss: 0.18214935064315796 2023-01-22 12:49:42.521928: step: 136/466, loss: 0.7668765187263489 2023-01-22 12:49:43.279228: step: 138/466, loss: 0.33379390835762024 2023-01-22 12:49:43.962704: step: 140/466, loss: 0.14274349808692932 2023-01-22 12:49:44.775794: step: 142/466, loss: 0.0996280089020729 2023-01-22 12:49:45.565228: step: 144/466, loss: 0.1344163566827774 2023-01-22 12:49:46.347255: step: 146/466, loss: 0.19428536295890808 2023-01-22 12:49:47.117737: step: 148/466, loss: 0.2296874076128006 2023-01-22 12:49:47.844537: step: 150/466, loss: 0.1182233989238739 2023-01-22 12:49:48.578751: step: 152/466, loss: 0.10741811245679855 2023-01-22 12:49:49.292186: step: 154/466, loss: 0.29721999168395996 2023-01-22 12:49:50.020872: step: 156/466, loss: 0.2870020568370819 2023-01-22 12:49:50.745078: step: 158/466, loss: 0.30465492606163025 2023-01-22 12:49:51.472796: step: 160/466, loss: 0.23985272645950317 2023-01-22 12:49:52.222670: step: 162/466, loss: 0.3943251371383667 2023-01-22 12:49:53.023662: step: 164/466, loss: 0.317905068397522 2023-01-22 
12:49:53.824659: step: 166/466, loss: 0.21616436541080475 2023-01-22 12:49:54.555397: step: 168/466, loss: 1.09933340549469 2023-01-22 12:49:55.380028: step: 170/466, loss: 0.13349558413028717 2023-01-22 12:49:56.077324: step: 172/466, loss: 0.2912863790988922 2023-01-22 12:49:56.795084: step: 174/466, loss: 0.11956477910280228 2023-01-22 12:49:57.620196: step: 176/466, loss: 0.1785416156053543 2023-01-22 12:49:58.396183: step: 178/466, loss: 0.17102572321891785 2023-01-22 12:49:59.202898: step: 180/466, loss: 0.27333664894104004 2023-01-22 12:50:00.003378: step: 182/466, loss: 0.1813071221113205 2023-01-22 12:50:00.726858: step: 184/466, loss: 0.20739442110061646 2023-01-22 12:50:01.544287: step: 186/466, loss: 0.1849229484796524 2023-01-22 12:50:02.287457: step: 188/466, loss: 0.048303648829460144 2023-01-22 12:50:03.101118: step: 190/466, loss: 0.0667133778333664 2023-01-22 12:50:03.845876: step: 192/466, loss: 0.38608261942863464 2023-01-22 12:50:04.597901: step: 194/466, loss: 0.15903745591640472 2023-01-22 12:50:05.382541: step: 196/466, loss: 0.3151916265487671 2023-01-22 12:50:06.238821: step: 198/466, loss: 0.6694477200508118 2023-01-22 12:50:06.997029: step: 200/466, loss: 0.3216400146484375 2023-01-22 12:50:07.729397: step: 202/466, loss: 0.16271351277828217 2023-01-22 12:50:08.539331: step: 204/466, loss: 0.5475163459777832 2023-01-22 12:50:09.280370: step: 206/466, loss: 0.12444935739040375 2023-01-22 12:50:10.052091: step: 208/466, loss: 0.3612194061279297 2023-01-22 12:50:10.819694: step: 210/466, loss: 0.19568414986133575 2023-01-22 12:50:11.607236: step: 212/466, loss: 0.23034150898456573 2023-01-22 12:50:12.432362: step: 214/466, loss: 0.21345168352127075 2023-01-22 12:50:13.181719: step: 216/466, loss: 0.3210800290107727 2023-01-22 12:50:14.030272: step: 218/466, loss: 0.17796467244625092 2023-01-22 12:50:14.812777: step: 220/466, loss: 0.19314827024936676 2023-01-22 12:50:15.638751: step: 222/466, loss: 0.15028682351112366 2023-01-22 
12:50:16.402122: step: 224/466, loss: 0.11573804914951324 2023-01-22 12:50:17.170644: step: 226/466, loss: 0.7474660873413086 2023-01-22 12:50:17.943158: step: 228/466, loss: 0.11379896104335785 2023-01-22 12:50:18.809411: step: 230/466, loss: 0.4897193908691406 2023-01-22 12:50:19.513105: step: 232/466, loss: 0.11260432004928589 2023-01-22 12:50:20.266538: step: 234/466, loss: 0.6882609128952026 2023-01-22 12:50:21.057247: step: 236/466, loss: 0.5560853481292725 2023-01-22 12:50:21.789628: step: 238/466, loss: 0.767204999923706 2023-01-22 12:50:22.514678: step: 240/466, loss: 0.34223759174346924 2023-01-22 12:50:23.266653: step: 242/466, loss: 0.08386547863483429 2023-01-22 12:50:24.006414: step: 244/466, loss: 0.22089608013629913 2023-01-22 12:50:24.709041: step: 246/466, loss: 0.17293940484523773 2023-01-22 12:50:25.439406: step: 248/466, loss: 0.8853538632392883 2023-01-22 12:50:26.236211: step: 250/466, loss: 0.696312665939331 2023-01-22 12:50:26.994290: step: 252/466, loss: 0.19818630814552307 2023-01-22 12:50:27.779487: step: 254/466, loss: 0.5379956364631653 2023-01-22 12:50:28.508240: step: 256/466, loss: 0.27241238951683044 2023-01-22 12:50:29.241670: step: 258/466, loss: 0.49338239431381226 2023-01-22 12:50:29.968086: step: 260/466, loss: 0.23142513632774353 2023-01-22 12:50:30.796736: step: 262/466, loss: 0.3157171607017517 2023-01-22 12:50:31.532221: step: 264/466, loss: 0.09880304336547852 2023-01-22 12:50:32.303106: step: 266/466, loss: 0.2063807100057602 2023-01-22 12:50:32.981363: step: 268/466, loss: 0.09504813700914383 2023-01-22 12:50:33.773332: step: 270/466, loss: 0.10063324868679047 2023-01-22 12:50:34.519430: step: 272/466, loss: 0.26294028759002686 2023-01-22 12:50:35.232288: step: 274/466, loss: 0.22391608357429504 2023-01-22 12:50:35.997269: step: 276/466, loss: 0.12411212176084518 2023-01-22 12:50:36.785417: step: 278/466, loss: 0.10606664419174194 2023-01-22 12:50:37.597864: step: 280/466, loss: 0.12701769173145294 2023-01-22 
12:50:38.353925: step: 282/466, loss: 0.32284846901893616 2023-01-22 12:50:39.058700: step: 284/466, loss: 0.13506066799163818 2023-01-22 12:50:39.776772: step: 286/466, loss: 0.29167723655700684 2023-01-22 12:50:40.517428: step: 288/466, loss: 0.2490895539522171 2023-01-22 12:50:41.313244: step: 290/466, loss: 0.17621193826198578 2023-01-22 12:50:42.060572: step: 292/466, loss: 0.09964149445295334 2023-01-22 12:50:42.745058: step: 294/466, loss: 0.08445818722248077 2023-01-22 12:50:43.520123: step: 296/466, loss: 0.07993566989898682 2023-01-22 12:50:44.259124: step: 298/466, loss: 0.10784658789634705 2023-01-22 12:50:44.998541: step: 300/466, loss: 0.16252484917640686 2023-01-22 12:50:45.730861: step: 302/466, loss: 0.24444356560707092 2023-01-22 12:50:46.467534: step: 304/466, loss: 0.09691224247217178 2023-01-22 12:50:47.315318: step: 306/466, loss: 0.08311986178159714 2023-01-22 12:50:48.078615: step: 308/466, loss: 0.19063793122768402 2023-01-22 12:50:48.916901: step: 310/466, loss: 0.24725835025310516 2023-01-22 12:50:49.609291: step: 312/466, loss: 0.29137295484542847 2023-01-22 12:50:50.392435: step: 314/466, loss: 0.1480897068977356 2023-01-22 12:50:51.119033: step: 316/466, loss: 0.2394569218158722 2023-01-22 12:50:51.913606: step: 318/466, loss: 0.4203174114227295 2023-01-22 12:50:52.687664: step: 320/466, loss: 0.034369274973869324 2023-01-22 12:50:53.446054: step: 322/466, loss: 0.10814063251018524 2023-01-22 12:50:54.263105: step: 324/466, loss: 0.14247576892375946 2023-01-22 12:50:54.980030: step: 326/466, loss: 1.1015814542770386 2023-01-22 12:50:55.699556: step: 328/466, loss: 0.10019680112600327 2023-01-22 12:50:56.572813: step: 330/466, loss: 0.4012758433818817 2023-01-22 12:50:57.350370: step: 332/466, loss: 0.39217323064804077 2023-01-22 12:50:58.110333: step: 334/466, loss: 0.27039122581481934 2023-01-22 12:50:58.941328: step: 336/466, loss: 0.40216994285583496 2023-01-22 12:50:59.739634: step: 338/466, loss: 0.24465128779411316 2023-01-22 
12:51:00.448404: step: 340/466, loss: 0.1533263623714447 2023-01-22 12:51:01.167282: step: 342/466, loss: 0.25162526965141296 2023-01-22 12:51:01.923316: step: 344/466, loss: 0.20700925588607788 2023-01-22 12:51:02.638498: step: 346/466, loss: 0.1375453919172287 2023-01-22 12:51:03.471384: step: 348/466, loss: 0.20698057115077972 2023-01-22 12:51:04.232340: step: 350/466, loss: 0.15939858555793762 2023-01-22 12:51:05.065299: step: 352/466, loss: 0.2313118427991867 2023-01-22 12:51:05.862767: step: 354/466, loss: 0.345257431268692 2023-01-22 12:51:06.619130: step: 356/466, loss: 0.5385666489601135 2023-01-22 12:51:07.473315: step: 358/466, loss: 0.5636390447616577 2023-01-22 12:51:08.178410: step: 360/466, loss: 0.20566923916339874 2023-01-22 12:51:08.976495: step: 362/466, loss: 0.16982313990592957 2023-01-22 12:51:09.716435: step: 364/466, loss: 0.1668519228696823 2023-01-22 12:51:10.428388: step: 366/466, loss: 0.5705475211143494 2023-01-22 12:51:11.199266: step: 368/466, loss: 0.19253084063529968 2023-01-22 12:51:11.944991: step: 370/466, loss: 0.1730499565601349 2023-01-22 12:51:12.733908: step: 372/466, loss: 0.17300289869308472 2023-01-22 12:51:13.451956: step: 374/466, loss: 0.29056882858276367 2023-01-22 12:51:14.219833: step: 376/466, loss: 0.07716875523328781 2023-01-22 12:51:15.198573: step: 378/466, loss: 0.6006758809089661 2023-01-22 12:51:15.900916: step: 380/466, loss: 0.26666271686553955 2023-01-22 12:51:16.668161: step: 382/466, loss: 0.9369261860847473 2023-01-22 12:51:17.375252: step: 384/466, loss: 0.18641993403434753 2023-01-22 12:51:18.113165: step: 386/466, loss: 0.12803220748901367 2023-01-22 12:51:18.882976: step: 388/466, loss: 0.24852482974529266 2023-01-22 12:51:19.616237: step: 390/466, loss: 0.20162753760814667 2023-01-22 12:51:20.337883: step: 392/466, loss: 0.22263064980506897 2023-01-22 12:51:21.182410: step: 394/466, loss: 0.5578153133392334 2023-01-22 12:51:21.924432: step: 396/466, loss: 0.43902575969696045 2023-01-22 
12:51:22.715553: step: 398/466, loss: 0.18315844237804413 2023-01-22 12:51:23.422655: step: 400/466, loss: 0.5645053386688232 2023-01-22 12:51:24.193064: step: 402/466, loss: 0.04956785589456558 2023-01-22 12:51:24.946367: step: 404/466, loss: 0.100751131772995 2023-01-22 12:51:25.663223: step: 406/466, loss: 0.15760697424411774 2023-01-22 12:51:26.436789: step: 408/466, loss: 0.5888820290565491 2023-01-22 12:51:27.139494: step: 410/466, loss: 0.15951798856258392 2023-01-22 12:51:27.845034: step: 412/466, loss: 1.0511513948440552 2023-01-22 12:51:28.619583: step: 414/466, loss: 0.20933635532855988 2023-01-22 12:51:29.314142: step: 416/466, loss: 0.22570045292377472 2023-01-22 12:51:30.050557: step: 418/466, loss: 0.5592967867851257 2023-01-22 12:51:30.856246: step: 420/466, loss: 21.276796340942383 2023-01-22 12:51:31.723167: step: 422/466, loss: 0.39449530839920044 2023-01-22 12:51:32.513766: step: 424/466, loss: 0.2394774854183197 2023-01-22 12:51:33.288937: step: 426/466, loss: 0.12702800333499908 2023-01-22 12:51:34.010580: step: 428/466, loss: 0.1220950037240982 2023-01-22 12:51:34.732235: step: 430/466, loss: 0.1692110002040863 2023-01-22 12:51:35.514622: step: 432/466, loss: 0.6647356748580933 2023-01-22 12:51:36.259574: step: 434/466, loss: 0.25400546193122864 2023-01-22 12:51:37.047679: step: 436/466, loss: 0.516399085521698 2023-01-22 12:51:37.799328: step: 438/466, loss: 0.2164851427078247 2023-01-22 12:51:38.624609: step: 440/466, loss: 0.7619146108627319 2023-01-22 12:51:39.371320: step: 442/466, loss: 0.09502183645963669 2023-01-22 12:51:40.190926: step: 444/466, loss: 0.16723360121250153 2023-01-22 12:51:40.949734: step: 446/466, loss: 0.18075765669345856 2023-01-22 12:51:41.665267: step: 448/466, loss: 0.24220411479473114 2023-01-22 12:51:42.437600: step: 450/466, loss: 0.08047514408826828 2023-01-22 12:51:43.174214: step: 452/466, loss: 0.186679407954216 2023-01-22 12:51:43.911491: step: 454/466, loss: 0.14323924481868744 2023-01-22 
12:51:44.711224: step: 456/466, loss: 0.3566244840621948 2023-01-22 12:51:45.483133: step: 458/466, loss: 0.20824034512043 2023-01-22 12:51:46.138684: step: 460/466, loss: 0.5536962151527405 2023-01-22 12:51:46.930018: step: 462/466, loss: 0.22645868360996246 2023-01-22 12:51:47.661255: step: 464/466, loss: 0.1923508644104004 2023-01-22 12:51:48.395375: step: 466/466, loss: 0.1315545290708542 2023-01-22 12:51:49.138972: step: 468/466, loss: 0.6798703670501709 2023-01-22 12:51:49.912586: step: 470/466, loss: 0.31831642985343933 2023-01-22 12:51:50.643950: step: 472/466, loss: 0.45602738857269287 2023-01-22 12:51:51.381439: step: 474/466, loss: 0.12411798536777496 2023-01-22 12:51:52.141596: step: 476/466, loss: 0.30964699387550354 2023-01-22 12:51:52.899266: step: 478/466, loss: 0.305565744638443 2023-01-22 12:51:53.691141: step: 480/466, loss: 0.11159797012805939 2023-01-22 12:51:54.430606: step: 482/466, loss: 0.44753023982048035 2023-01-22 12:51:55.116187: step: 484/466, loss: 0.1689191460609436 2023-01-22 12:51:55.890383: step: 486/466, loss: 0.1801457405090332 2023-01-22 12:51:56.738878: step: 488/466, loss: 0.7594742774963379 2023-01-22 12:51:57.438136: step: 490/466, loss: 0.3323618173599243 2023-01-22 12:51:58.228772: step: 492/466, loss: 0.17707262933254242 2023-01-22 12:51:59.040332: step: 494/466, loss: 0.3458724617958069 2023-01-22 12:51:59.887925: step: 496/466, loss: 0.2138400375843048 2023-01-22 12:52:00.710851: step: 498/466, loss: 0.22486338019371033 2023-01-22 12:52:01.523283: step: 500/466, loss: 0.6938997507095337 2023-01-22 12:52:02.289186: step: 502/466, loss: 0.4290332496166229 2023-01-22 12:52:03.063702: step: 504/466, loss: 0.25466781854629517 2023-01-22 12:52:03.789545: step: 506/466, loss: 0.11359059810638428 2023-01-22 12:52:04.608752: step: 508/466, loss: 0.15057611465454102 2023-01-22 12:52:05.335754: step: 510/466, loss: 0.14240585267543793 2023-01-22 12:52:06.111165: step: 512/466, loss: 0.8652889728546143 2023-01-22 12:52:06.950438: 
step: 514/466, loss: 0.25626131892204285 2023-01-22 12:52:07.683768: step: 516/466, loss: 0.865880012512207 2023-01-22 12:52:08.381837: step: 518/466, loss: 0.17434607446193695 2023-01-22 12:52:09.111291: step: 520/466, loss: 0.44801265001296997 2023-01-22 12:52:09.818075: step: 522/466, loss: 0.3766295313835144 2023-01-22 12:52:10.675140: step: 524/466, loss: 0.12937021255493164 2023-01-22 12:52:11.520092: step: 526/466, loss: 1.208333969116211 2023-01-22 12:52:12.262710: step: 528/466, loss: 0.1700485199689865 2023-01-22 12:52:13.018746: step: 530/466, loss: 0.27223414182662964 2023-01-22 12:52:13.866304: step: 532/466, loss: 0.32058337330818176 2023-01-22 12:52:14.548092: step: 534/466, loss: 0.2612878382205963 2023-01-22 12:52:15.319871: step: 536/466, loss: 0.24139459431171417 2023-01-22 12:52:16.118564: step: 538/466, loss: 0.1782309114933014 2023-01-22 12:52:16.971572: step: 540/466, loss: 1.4186257123947144 2023-01-22 12:52:17.830761: step: 542/466, loss: 0.8059676289558411 2023-01-22 12:52:18.596621: step: 544/466, loss: 0.14394286274909973 2023-01-22 12:52:19.317968: step: 546/466, loss: 0.18445007503032684 2023-01-22 12:52:20.194638: step: 548/466, loss: 0.42247700691223145 2023-01-22 12:52:20.953029: step: 550/466, loss: 0.19051989912986755 2023-01-22 12:52:21.845375: step: 552/466, loss: 0.5361776351928711 2023-01-22 12:52:22.593506: step: 554/466, loss: 0.2054612785577774 2023-01-22 12:52:23.303768: step: 556/466, loss: 0.15531039237976074 2023-01-22 12:52:24.012249: step: 558/466, loss: 0.3972584307193756 2023-01-22 12:52:24.673339: step: 560/466, loss: 0.12098507583141327 2023-01-22 12:52:25.564642: step: 562/466, loss: 0.1388738453388214 2023-01-22 12:52:26.293282: step: 564/466, loss: 0.4729807674884796 2023-01-22 12:52:27.061570: step: 566/466, loss: 0.17271478474140167 2023-01-22 12:52:27.754368: step: 568/466, loss: 0.16412785649299622 2023-01-22 12:52:28.482761: step: 570/466, loss: 0.31249192357063293 2023-01-22 12:52:29.208437: step: 
572/466, loss: 0.4589084982872009 2023-01-22 12:52:29.929234: step: 574/466, loss: 0.07088687270879745 2023-01-22 12:52:30.711212: step: 576/466, loss: 0.1241055577993393 2023-01-22 12:52:31.514831: step: 578/466, loss: 0.34827089309692383 2023-01-22 12:52:32.247197: step: 580/466, loss: 0.1643725484609604 2023-01-22 12:52:33.037709: step: 582/466, loss: 0.5115247964859009 2023-01-22 12:52:33.793264: step: 584/466, loss: 0.3221728801727295 2023-01-22 12:52:34.681731: step: 586/466, loss: 0.49480682611465454 2023-01-22 12:52:35.427703: step: 588/466, loss: 0.1289515346288681 2023-01-22 12:52:36.171807: step: 590/466, loss: 0.22824212908744812 2023-01-22 12:52:36.951726: step: 592/466, loss: 0.08253544569015503 2023-01-22 12:52:37.691865: step: 594/466, loss: 0.07791145145893097 2023-01-22 12:52:38.514799: step: 596/466, loss: 0.22855186462402344 2023-01-22 12:52:39.203995: step: 598/466, loss: 0.5620360374450684 2023-01-22 12:52:39.886708: step: 600/466, loss: 0.9512036442756653 2023-01-22 12:52:40.618128: step: 602/466, loss: 0.20425112545490265 2023-01-22 12:52:41.296574: step: 604/466, loss: 0.15751442313194275 2023-01-22 12:52:42.168971: step: 606/466, loss: 0.12556593120098114 2023-01-22 12:52:42.836564: step: 608/466, loss: 0.1434406191110611 2023-01-22 12:52:43.609826: step: 610/466, loss: 0.09092912077903748 2023-01-22 12:52:44.331428: step: 612/466, loss: 0.1337815523147583 2023-01-22 12:52:45.093668: step: 614/466, loss: 0.7102845311164856 2023-01-22 12:52:45.875039: step: 616/466, loss: 0.38301992416381836 2023-01-22 12:52:46.746621: step: 618/466, loss: 0.3137530982494354 2023-01-22 12:52:47.546210: step: 620/466, loss: 0.3401144742965698 2023-01-22 12:52:48.247461: step: 622/466, loss: 0.04092513769865036 2023-01-22 12:52:48.933962: step: 624/466, loss: 0.2598436772823334 2023-01-22 12:52:49.662011: step: 626/466, loss: 0.20543630421161652 2023-01-22 12:52:50.463689: step: 628/466, loss: 2.207848310470581 2023-01-22 12:52:51.232145: step: 630/466, loss: 
0.3994409441947937 2023-01-22 12:52:52.069108: step: 632/466, loss: 0.13118356466293335 2023-01-22 12:52:52.819669: step: 634/466, loss: 0.15899503231048584 2023-01-22 12:52:53.567501: step: 636/466, loss: 0.2652314007282257 2023-01-22 12:52:54.267821: step: 638/466, loss: 0.4009181559085846 2023-01-22 12:52:55.067679: step: 640/466, loss: 0.2774149775505066 2023-01-22 12:52:55.880620: step: 642/466, loss: 0.5266385674476624 2023-01-22 12:52:56.820334: step: 644/466, loss: 0.138621523976326 2023-01-22 12:52:57.618144: step: 646/466, loss: 0.20024169981479645 2023-01-22 12:52:58.410612: step: 648/466, loss: 0.604051947593689 2023-01-22 12:52:59.135365: step: 650/466, loss: 0.08497483283281326 2023-01-22 12:52:59.922888: step: 652/466, loss: 0.1440829187631607 2023-01-22 12:53:00.647239: step: 654/466, loss: 0.1202130913734436 2023-01-22 12:53:01.428030: step: 656/466, loss: 0.37918657064437866 2023-01-22 12:53:02.257183: step: 658/466, loss: 0.2415979504585266 2023-01-22 12:53:03.012217: step: 660/466, loss: 0.3267868757247925 2023-01-22 12:53:03.756859: step: 662/466, loss: 0.38840076327323914 2023-01-22 12:53:04.477972: step: 664/466, loss: 0.7931300401687622 2023-01-22 12:53:05.278849: step: 666/466, loss: 0.07995925843715668 2023-01-22 12:53:06.033828: step: 668/466, loss: 0.16449187695980072 2023-01-22 12:53:06.915370: step: 670/466, loss: 0.3676665723323822 2023-01-22 12:53:07.692176: step: 672/466, loss: 0.14478278160095215 2023-01-22 12:53:08.460097: step: 674/466, loss: 0.28152671456336975 2023-01-22 12:53:09.396927: step: 676/466, loss: 0.1067856177687645 2023-01-22 12:53:10.162247: step: 678/466, loss: 0.15938341617584229 2023-01-22 12:53:10.929491: step: 680/466, loss: 0.23811189830303192 2023-01-22 12:53:11.651129: step: 682/466, loss: 0.3324274718761444 2023-01-22 12:53:12.386902: step: 684/466, loss: 0.1873573213815689 2023-01-22 12:53:13.252356: step: 686/466, loss: 0.8957513570785522 2023-01-22 12:53:13.979503: step: 688/466, loss: 
1.4601404666900635 2023-01-22 12:53:14.752158: step: 690/466, loss: 0.18743014335632324 2023-01-22 12:53:15.498289: step: 692/466, loss: 0.14349307119846344 2023-01-22 12:53:16.277427: step: 694/466, loss: 0.16869373619556427 2023-01-22 12:53:17.007041: step: 696/466, loss: 0.15919862687587738 2023-01-22 12:53:17.759065: step: 698/466, loss: 0.25145259499549866 2023-01-22 12:53:18.476810: step: 700/466, loss: 0.16693904995918274 2023-01-22 12:53:19.153400: step: 702/466, loss: 0.1935565024614334 2023-01-22 12:53:19.957655: step: 704/466, loss: 0.2286834865808487 2023-01-22 12:53:20.785088: step: 706/466, loss: 0.20977507531642914 2023-01-22 12:53:21.580113: step: 708/466, loss: 9.643488883972168 2023-01-22 12:53:22.401839: step: 710/466, loss: 0.6023985147476196 2023-01-22 12:53:23.082331: step: 712/466, loss: 0.13850431144237518 2023-01-22 12:53:23.848663: step: 714/466, loss: 0.1854768544435501 2023-01-22 12:53:24.621358: step: 716/466, loss: 0.03611788526177406 2023-01-22 12:53:25.384969: step: 718/466, loss: 0.08938011527061462 2023-01-22 12:53:26.170732: step: 720/466, loss: 0.2490069717168808 2023-01-22 12:53:26.943110: step: 722/466, loss: 0.4968907833099365 2023-01-22 12:53:27.706360: step: 724/466, loss: 1.0069706439971924 2023-01-22 12:53:28.521827: step: 726/466, loss: 1.2807743549346924 2023-01-22 12:53:29.194717: step: 728/466, loss: 0.17751643061637878 2023-01-22 12:53:29.948740: step: 730/466, loss: 0.04696459323167801 2023-01-22 12:53:30.683760: step: 732/466, loss: 1.1253899335861206 2023-01-22 12:53:31.455467: step: 734/466, loss: 0.1938478946685791 2023-01-22 12:53:32.205312: step: 736/466, loss: 0.39816173911094666 2023-01-22 12:53:32.934761: step: 738/466, loss: 0.19523125886917114 2023-01-22 12:53:33.653390: step: 740/466, loss: 0.1278812736272812 2023-01-22 12:53:34.404250: step: 742/466, loss: 0.19035612046718597 2023-01-22 12:53:35.133065: step: 744/466, loss: 0.09937532991170883 2023-01-22 12:53:35.917767: step: 746/466, loss: 
0.1287962794303894 2023-01-22 12:53:36.596822: step: 748/466, loss: 0.6626227498054504 2023-01-22 12:53:37.373024: step: 750/466, loss: 0.18322832882404327 2023-01-22 12:53:38.081829: step: 752/466, loss: 1.757911205291748 2023-01-22 12:53:38.909789: step: 754/466, loss: 0.22883504629135132 2023-01-22 12:53:39.636267: step: 756/466, loss: 1.5663551092147827 2023-01-22 12:53:40.408203: step: 758/466, loss: 0.48100045323371887 2023-01-22 12:53:41.134243: step: 760/466, loss: 0.357120156288147 2023-01-22 12:53:41.933334: step: 762/466, loss: 0.16780208051204681 2023-01-22 12:53:42.758967: step: 764/466, loss: 0.2773645222187042 2023-01-22 12:53:43.579993: step: 766/466, loss: 0.21462607383728027 2023-01-22 12:53:44.238853: step: 768/466, loss: 0.30235159397125244 2023-01-22 12:53:44.930331: step: 770/466, loss: 0.1719624400138855 2023-01-22 12:53:45.715042: step: 772/466, loss: 0.4989997446537018 2023-01-22 12:53:46.588634: step: 774/466, loss: 0.49282678961753845 2023-01-22 12:53:47.468762: step: 776/466, loss: 1.6167389154434204 2023-01-22 12:53:48.257703: step: 778/466, loss: 0.3437347412109375 2023-01-22 12:53:49.028101: step: 780/466, loss: 0.27999237179756165 2023-01-22 12:53:49.754456: step: 782/466, loss: 0.10780219733715057 2023-01-22 12:53:50.532395: step: 784/466, loss: 0.2819616496562958 2023-01-22 12:53:51.236325: step: 786/466, loss: 0.3817901015281677 2023-01-22 12:53:51.974819: step: 788/466, loss: 0.20044463872909546 2023-01-22 12:53:52.781679: step: 790/466, loss: 0.06816612184047699 2023-01-22 12:53:53.636561: step: 792/466, loss: 0.2158653736114502 2023-01-22 12:53:54.460488: step: 794/466, loss: 0.2796313166618347 2023-01-22 12:53:55.112480: step: 796/466, loss: 0.08614753186702728 2023-01-22 12:53:55.903655: step: 798/466, loss: 0.6492806673049927 2023-01-22 12:53:56.657563: step: 800/466, loss: 0.08297394216060638 2023-01-22 12:53:57.404376: step: 802/466, loss: 0.09081264585256577 2023-01-22 12:53:58.096918: step: 804/466, loss: 
0.1946927309036255 2023-01-22 12:53:58.907459: step: 806/466, loss: 0.3332517743110657 2023-01-22 12:53:59.670444: step: 808/466, loss: 0.18870267271995544 2023-01-22 12:54:00.539127: step: 810/466, loss: 0.10972864180803299 2023-01-22 12:54:01.284236: step: 812/466, loss: 0.33559009432792664 2023-01-22 12:54:02.011496: step: 814/466, loss: 0.13296253979206085 2023-01-22 12:54:02.805556: step: 816/466, loss: 0.23679058253765106 2023-01-22 12:54:03.521455: step: 818/466, loss: 0.9596275091171265 2023-01-22 12:54:04.327007: step: 820/466, loss: 0.36564570665359497 2023-01-22 12:54:05.122111: step: 822/466, loss: 0.4165021479129791 2023-01-22 12:54:05.934332: step: 824/466, loss: 0.29203930497169495 2023-01-22 12:54:06.798978: step: 826/466, loss: 0.17134632170200348 2023-01-22 12:54:07.480409: step: 828/466, loss: 0.08343864232301712 2023-01-22 12:54:08.187391: step: 830/466, loss: 0.16457146406173706 2023-01-22 12:54:08.898155: step: 832/466, loss: 0.14254069328308105 2023-01-22 12:54:09.628352: step: 834/466, loss: 0.2560867965221405 2023-01-22 12:54:10.429592: step: 836/466, loss: 0.5032888054847717 2023-01-22 12:54:11.190017: step: 838/466, loss: 0.31330007314682007 2023-01-22 12:54:11.977897: step: 840/466, loss: 0.5630853176116943 2023-01-22 12:54:12.653535: step: 842/466, loss: 0.2139889895915985 2023-01-22 12:54:13.418923: step: 844/466, loss: 0.269824743270874 2023-01-22 12:54:14.157312: step: 846/466, loss: 0.4968046545982361 2023-01-22 12:54:14.949322: step: 848/466, loss: 0.10947266966104507 2023-01-22 12:54:15.731590: step: 850/466, loss: 0.3745446503162384 2023-01-22 12:54:16.710450: step: 852/466, loss: 0.14542731642723083 2023-01-22 12:54:17.463613: step: 854/466, loss: 0.0944216251373291 2023-01-22 12:54:18.265499: step: 856/466, loss: 0.5812382102012634 2023-01-22 12:54:19.033805: step: 858/466, loss: 0.15963409841060638 2023-01-22 12:54:19.807395: step: 860/466, loss: 0.3798730671405792 2023-01-22 12:54:20.535531: step: 862/466, loss: 
0.056237805634737015 2023-01-22 12:54:21.355305: step: 864/466, loss: 0.1034381166100502 2023-01-22 12:54:22.215519: step: 866/466, loss: 0.18533749878406525 2023-01-22 12:54:23.006719: step: 868/466, loss: 0.26592233777046204 2023-01-22 12:54:23.776472: step: 870/466, loss: 0.167646586894989 2023-01-22 12:54:24.534315: step: 872/466, loss: 0.15275219082832336 2023-01-22 12:54:25.345083: step: 874/466, loss: 0.5274092555046082 2023-01-22 12:54:26.110501: step: 876/466, loss: 0.17686963081359863 2023-01-22 12:54:26.850736: step: 878/466, loss: 0.24046087265014648 2023-01-22 12:54:27.653449: step: 880/466, loss: 0.5760931372642517 2023-01-22 12:54:28.495591: step: 882/466, loss: 0.12636913359165192 2023-01-22 12:54:29.395691: step: 884/466, loss: 0.13510404527187347 2023-01-22 12:54:30.204789: step: 886/466, loss: 1.0734281539916992 2023-01-22 12:54:30.864517: step: 888/466, loss: 0.16917142271995544 2023-01-22 12:54:31.653439: step: 890/466, loss: 0.29217538237571716 2023-01-22 12:54:32.382140: step: 892/466, loss: 0.25594523549079895 2023-01-22 12:54:33.153553: step: 894/466, loss: 1.6487802267074585 2023-01-22 12:54:33.893121: step: 896/466, loss: 0.13572891056537628 2023-01-22 12:54:34.608699: step: 898/466, loss: 0.516257107257843 2023-01-22 12:54:35.361562: step: 900/466, loss: 0.14448872208595276 2023-01-22 12:54:36.155887: step: 902/466, loss: 0.1506912112236023 2023-01-22 12:54:36.865121: step: 904/466, loss: 0.09637308120727539 2023-01-22 12:54:37.692457: step: 906/466, loss: 0.1871449053287506 2023-01-22 12:54:38.418188: step: 908/466, loss: 0.14197932183742523 2023-01-22 12:54:39.108021: step: 910/466, loss: 0.1988995522260666 2023-01-22 12:54:39.823331: step: 912/466, loss: 0.48657023906707764 2023-01-22 12:54:40.610147: step: 914/466, loss: 0.9628162384033203 2023-01-22 12:54:41.325405: step: 916/466, loss: 0.2303006649017334 2023-01-22 12:54:42.062382: step: 918/466, loss: 0.13100191950798035 2023-01-22 12:54:42.696653: step: 920/466, loss: 
0.16516250371932983 2023-01-22 12:54:43.475467: step: 922/466, loss: 0.24126259982585907 2023-01-22 12:54:44.260410: step: 924/466, loss: 0.6147301197052002 2023-01-22 12:54:45.100287: step: 926/466, loss: 0.26411890983581543 2023-01-22 12:54:45.797640: step: 928/466, loss: 0.18852995336055756 2023-01-22 12:54:46.571616: step: 930/466, loss: 0.4089183509349823 2023-01-22 12:54:47.349442: step: 932/466, loss: 0.4652842879295349
==================================================
Loss: 0.383
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3002202807646356, 'r': 0.31788029728020234, 'f1': 0.3087980030721966}, 'combined': 0.22753537068477644, 'epoch': 10}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3338687788748403, 'r': 0.2924968244735386, 'f1': 0.3118164761593196}, 'combined': 0.1916530536393867, 'epoch': 10}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27646197776496223, 'r': 0.32105261933995616, 'f1': 0.2970934686429445}, 'combined': 0.2189109768948012, 'epoch': 10}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.31352145653437247, 'r': 0.30672939725069887, 'f1': 0.3100882386572988}, 'combined': 0.1905908198576568, 'epoch': 10}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31376876563803174, 'r': 0.3244857253751941, 'f1': 0.3190372710312076}, 'combined': 0.2350800944440477, 'epoch': 10}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.32549754310573065, 'r': 0.29393414498638704, 'f1': 0.3089116810366488}, 'combined': 0.1907983912285184, 'epoch': 10}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2857142857142857, 'r': 0.34285714285714286, 'f1': 0.3116883116883117}, 'combined': 0.20779220779220778, 'epoch': 10}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27564102564102566, 'r': 0.4673913043478261, 'f1': 0.3467741935483871}, 'combined': 0.17338709677419356, 'epoch': 10}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 10}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30984143611899123, 'r': 0.2933792725301264, 'f1': 0.3013857244120402}, 'combined': 0.22207369167202962, 'epoch': 9}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3646924743811674, 'r': 0.30880453674872876, 'f1': 0.33442966708371474}, 'combined': 0.20555189293925882, 'epoch': 9}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2826086956521739, 'r': 0.37142857142857144, 'f1': 0.32098765432098764}, 'combined': 0.21399176954732507, 'epoch': 9}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2918760820300343, 'r': 0.3190144653686902, 'f1': 0.30484247189356256}, 'combined': 0.22462076876367768, 'epoch': 7}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3237627861218527, 'r': 0.26596804267202456, 'f1': 0.2920334169776559}, 'combined': 0.17949370994724215, 'epoch': 7}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28846153846153844, 'r': 0.4891304347826087, 'f1': 0.3629032258064516}, 'combined': 0.1814516129032258, 'epoch': 7}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31376876563803174, 'r': 0.3244857253751941, 'f1': 0.3190372710312076}, 'combined': 0.2350800944440477, 'epoch': 10}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.32549754310573065, 'r': 0.29393414498638704, 'f1': 0.3089116810366488}, 'combined': 0.1907983912285184, 'epoch': 10}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 10}
******************************
Epoch: 11
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 12:57:42.999595: step: 2/466, loss: 2.1529102325439453 2023-01-22 12:57:43.736760: step: 4/466, loss: 1.2928975820541382 2023-01-22 12:57:44.510512: step: 6/466, loss: 0.7491808533668518 2023-01-22 12:57:45.438922: step: 8/466, loss: 0.5165972113609314 2023-01-22 12:57:46.153798: step: 10/466, loss: 0.6967114210128784 2023-01-22 12:57:46.892701: step: 12/466, loss: 0.181780144572258 2023-01-22 12:57:47.651844: step: 14/466, loss: 0.2290743887424469 2023-01-22 12:57:48.429544: step: 16/466, loss: 0.10063283145427704 2023-01-22 12:57:49.188896: step: 18/466, loss: 0.24175231158733368 2023-01-22 12:57:49.919683: step: 20/466, loss: 0.14193516969680786 2023-01-22 12:57:50.716602: step: 22/466, loss: 0.355695515871048 2023-01-22 12:57:51.561020: step: 24/466, loss: 0.3076014816761017 2023-01-22 12:57:52.331935: step: 26/466, loss: 0.07201249897480011 2023-01-22 12:57:53.072800: step: 28/466, loss: 0.09827329963445663 2023-01-22 12:57:53.891040: step: 30/466, loss: 1.4906624555587769 2023-01-22 12:57:54.698512: step: 
32/466, loss: 0.16106005012989044 2023-01-22 12:57:55.459558: step: 34/466, loss: 0.2692556381225586 2023-01-22 12:57:56.246051: step: 36/466, loss: 0.6596773862838745 2023-01-22 12:57:57.108580: step: 38/466, loss: 0.7450871467590332 2023-01-22 12:57:57.796372: step: 40/466, loss: 0.16723114252090454 2023-01-22 12:57:58.601922: step: 42/466, loss: 0.17070387303829193 2023-01-22 12:57:59.402037: step: 44/466, loss: 4.699317932128906 2023-01-22 12:58:00.123301: step: 46/466, loss: 0.16521230340003967 2023-01-22 12:58:00.860749: step: 48/466, loss: 0.3028544783592224 2023-01-22 12:58:01.584879: step: 50/466, loss: 0.06732210516929626 2023-01-22 12:58:02.285342: step: 52/466, loss: 0.342848002910614 2023-01-22 12:58:03.065276: step: 54/466, loss: 0.1531759351491928 2023-01-22 12:58:03.881388: step: 56/466, loss: 0.38209885358810425 2023-01-22 12:58:04.669050: step: 58/466, loss: 0.11774308234453201 2023-01-22 12:58:05.432051: step: 60/466, loss: 0.24460706114768982 2023-01-22 12:58:06.200947: step: 62/466, loss: 0.5347704291343689 2023-01-22 12:58:06.978295: step: 64/466, loss: 0.15005452930927277 2023-01-22 12:58:07.752723: step: 66/466, loss: 0.31499019265174866 2023-01-22 12:58:08.535844: step: 68/466, loss: 0.29045701026916504 2023-01-22 12:58:09.356842: step: 70/466, loss: 0.49788540601730347 2023-01-22 12:58:10.073132: step: 72/466, loss: 0.1132664903998375 2023-01-22 12:58:10.788752: step: 74/466, loss: 1.393949031829834 2023-01-22 12:58:11.578138: step: 76/466, loss: 0.26555508375167847 2023-01-22 12:58:12.253784: step: 78/466, loss: 0.07243114709854126 2023-01-22 12:58:12.979663: step: 80/466, loss: 0.27598562836647034 2023-01-22 12:58:13.689006: step: 82/466, loss: 0.11839446425437927 2023-01-22 12:58:14.413722: step: 84/466, loss: 0.28125977516174316 2023-01-22 12:58:15.130882: step: 86/466, loss: 0.10161054879426956 2023-01-22 12:58:15.996302: step: 88/466, loss: 0.4249451160430908 2023-01-22 12:58:16.742752: step: 90/466, loss: 0.21938398480415344 
2023-01-22 12:58:17.473683: step: 92/466, loss: 0.11637397855520248 2023-01-22 12:58:18.281572: step: 94/466, loss: 0.10267498344182968 2023-01-22 12:58:19.059393: step: 96/466, loss: 0.17808358371257782 2023-01-22 12:58:19.905428: step: 98/466, loss: 0.22559547424316406 2023-01-22 12:58:20.771374: step: 100/466, loss: 0.08399305492639542 2023-01-22 12:58:21.515596: step: 102/466, loss: 0.1191306933760643 2023-01-22 12:58:22.232915: step: 104/466, loss: 0.4601157605648041 2023-01-22 12:58:22.927799: step: 106/466, loss: 0.10419661551713943 2023-01-22 12:58:23.716582: step: 108/466, loss: 0.20737458765506744 2023-01-22 12:58:24.451563: step: 110/466, loss: 0.0793699324131012 2023-01-22 12:58:25.222141: step: 112/466, loss: 0.0680171549320221 2023-01-22 12:58:25.983248: step: 114/466, loss: 0.2625187337398529 2023-01-22 12:58:26.735060: step: 116/466, loss: 0.762864887714386 2023-01-22 12:58:27.476356: step: 118/466, loss: 0.1961923986673355 2023-01-22 12:58:28.401895: step: 120/466, loss: 0.07939931005239487 2023-01-22 12:58:29.165674: step: 122/466, loss: 0.20566321909427643 2023-01-22 12:58:29.916027: step: 124/466, loss: 0.1146789938211441 2023-01-22 12:58:30.693917: step: 126/466, loss: 0.10061171650886536 2023-01-22 12:58:31.494775: step: 128/466, loss: 0.09796812385320663 2023-01-22 12:58:32.306929: step: 130/466, loss: 0.17923253774642944 2023-01-22 12:58:33.069576: step: 132/466, loss: 0.07315370440483093 2023-01-22 12:58:33.832537: step: 134/466, loss: 0.12449519336223602 2023-01-22 12:58:34.558391: step: 136/466, loss: 0.6119828820228577 2023-01-22 12:58:35.285470: step: 138/466, loss: 0.09807797521352768 2023-01-22 12:58:36.023528: step: 140/466, loss: 0.27021315693855286 2023-01-22 12:58:36.764033: step: 142/466, loss: 0.2910257577896118 2023-01-22 12:58:37.552282: step: 144/466, loss: 0.16442370414733887 2023-01-22 12:58:38.319636: step: 146/466, loss: 0.22531013190746307 2023-01-22 12:58:39.101266: step: 148/466, loss: 0.4174662232398987 2023-01-22 
12:58:39.894493: step: 150/466, loss: 0.17463970184326172 2023-01-22 12:58:40.596469: step: 152/466, loss: 0.12008628994226456 2023-01-22 12:58:41.318489: step: 154/466, loss: 0.09626641869544983 2023-01-22 12:58:42.135100: step: 156/466, loss: 0.16386191546916962 2023-01-22 12:58:42.874173: step: 158/466, loss: 0.09380415081977844 2023-01-22 12:58:43.627361: step: 160/466, loss: 0.1615980565547943 2023-01-22 12:58:44.368207: step: 162/466, loss: 0.12062174081802368 2023-01-22 12:58:45.182337: step: 164/466, loss: 0.04635758325457573 2023-01-22 12:58:45.943915: step: 166/466, loss: 0.11345676332712173 2023-01-22 12:58:46.789593: step: 168/466, loss: 0.34069615602493286 2023-01-22 12:58:47.568077: step: 170/466, loss: 0.11447493731975555 2023-01-22 12:58:48.388003: step: 172/466, loss: 0.19096854329109192 2023-01-22 12:58:49.134271: step: 174/466, loss: 0.1390686333179474 2023-01-22 12:58:49.920019: step: 176/466, loss: 0.15758000314235687 2023-01-22 12:58:50.711840: step: 178/466, loss: 0.24189817905426025 2023-01-22 12:58:51.467177: step: 180/466, loss: 0.1777925193309784 2023-01-22 12:58:52.169639: step: 182/466, loss: 0.09924599528312683 2023-01-22 12:58:52.937064: step: 184/466, loss: 0.06437882035970688 2023-01-22 12:58:53.683974: step: 186/466, loss: 0.24384798109531403 2023-01-22 12:58:54.405006: step: 188/466, loss: 0.09469848871231079 2023-01-22 12:58:55.216355: step: 190/466, loss: 0.27627843618392944 2023-01-22 12:58:56.009152: step: 192/466, loss: 0.09130019694566727 2023-01-22 12:58:56.795838: step: 194/466, loss: 0.1484045386314392 2023-01-22 12:58:57.501345: step: 196/466, loss: 0.10212335735559464 2023-01-22 12:58:58.354850: step: 198/466, loss: 0.42030757665634155 2023-01-22 12:58:59.126855: step: 200/466, loss: 0.19676528871059418 2023-01-22 12:58:59.803315: step: 202/466, loss: 0.09120253473520279 2023-01-22 12:59:00.605047: step: 204/466, loss: 0.29615336656570435 2023-01-22 12:59:01.476046: step: 206/466, loss: 0.4958255887031555 2023-01-22 
12:59:02.172699: step: 208/466, loss: 0.13587284088134766 2023-01-22 12:59:02.875994: step: 210/466, loss: 0.13223013281822205 2023-01-22 12:59:03.747000: step: 212/466, loss: 0.12652768194675446 2023-01-22 12:59:04.490575: step: 214/466, loss: 0.20821043848991394 2023-01-22 12:59:05.221709: step: 216/466, loss: 0.37757372856140137 2023-01-22 12:59:05.985783: step: 218/466, loss: 0.10182490944862366 2023-01-22 12:59:06.716978: step: 220/466, loss: 0.2610222101211548 2023-01-22 12:59:07.480698: step: 222/466, loss: 0.23877929151058197 2023-01-22 12:59:08.151668: step: 224/466, loss: 0.3908085227012634 2023-01-22 12:59:09.004734: step: 226/466, loss: 0.24849556386470795 2023-01-22 12:59:09.683019: step: 228/466, loss: 0.07584869116544724 2023-01-22 12:59:10.402565: step: 230/466, loss: 0.34698769450187683 2023-01-22 12:59:11.158063: step: 232/466, loss: 0.4174465835094452 2023-01-22 12:59:11.957765: step: 234/466, loss: 0.24441280961036682 2023-01-22 12:59:12.729700: step: 236/466, loss: 0.22616979479789734 2023-01-22 12:59:13.492913: step: 238/466, loss: 0.8674289584159851 2023-01-22 12:59:14.250498: step: 240/466, loss: 0.05681760236620903 2023-01-22 12:59:15.035060: step: 242/466, loss: 0.4447229504585266 2023-01-22 12:59:15.972109: step: 244/466, loss: 0.20784218609333038 2023-01-22 12:59:16.727598: step: 246/466, loss: 0.11302675306797028 2023-01-22 12:59:17.438555: step: 248/466, loss: 0.7658733129501343 2023-01-22 12:59:18.200240: step: 250/466, loss: 0.4308475852012634 2023-01-22 12:59:18.961119: step: 252/466, loss: 0.1710461974143982 2023-01-22 12:59:19.753835: step: 254/466, loss: 0.18237625062465668 2023-01-22 12:59:20.447347: step: 256/466, loss: 0.06273120641708374 2023-01-22 12:59:21.305031: step: 258/466, loss: 0.21046406030654907 2023-01-22 12:59:22.014742: step: 260/466, loss: 0.15667913854122162 2023-01-22 12:59:22.781606: step: 262/466, loss: 0.07241981476545334 2023-01-22 12:59:23.580227: step: 264/466, loss: 0.32974952459335327 2023-01-22 
12:59:24.348855: step: 266/466, loss: 0.26676201820373535 2023-01-22 12:59:25.120337: step: 268/466, loss: 0.0830921083688736 2023-01-22 12:59:25.819638: step: 270/466, loss: 0.1152021586894989 2023-01-22 12:59:26.527514: step: 272/466, loss: 0.23809200525283813 2023-01-22 12:59:27.335709: step: 274/466, loss: 0.4443970322608948 2023-01-22 12:59:28.095182: step: 276/466, loss: 0.06326255202293396 2023-01-22 12:59:28.916529: step: 278/466, loss: 0.15755707025527954 2023-01-22 12:59:29.741691: step: 280/466, loss: 1.1373233795166016 2023-01-22 12:59:30.522921: step: 282/466, loss: 0.1670621931552887 2023-01-22 12:59:31.359000: step: 284/466, loss: 6.769673824310303 2023-01-22 12:59:32.189081: step: 286/466, loss: 0.3943406343460083 2023-01-22 12:59:32.991028: step: 288/466, loss: 0.28536364436149597 2023-01-22 12:59:33.718607: step: 290/466, loss: 0.3271265923976898 2023-01-22 12:59:34.401478: step: 292/466, loss: 0.2132055014371872 2023-01-22 12:59:35.236008: step: 294/466, loss: 0.15132853388786316 2023-01-22 12:59:36.034965: step: 296/466, loss: 0.6087186932563782 2023-01-22 12:59:36.816377: step: 298/466, loss: 0.10347239673137665 2023-01-22 12:59:37.545127: step: 300/466, loss: 0.05397750809788704 2023-01-22 12:59:38.313664: step: 302/466, loss: 0.0957869216799736 2023-01-22 12:59:39.050873: step: 304/466, loss: 0.11784403026103973 2023-01-22 12:59:39.767962: step: 306/466, loss: 0.30135923624038696 2023-01-22 12:59:40.466934: step: 308/466, loss: 0.07710994780063629 2023-01-22 12:59:41.243903: step: 310/466, loss: 0.4046939015388489 2023-01-22 12:59:42.058668: step: 312/466, loss: 0.09228195995092392 2023-01-22 12:59:42.790328: step: 314/466, loss: 0.15427225828170776 2023-01-22 12:59:43.522290: step: 316/466, loss: 0.12082573771476746 2023-01-22 12:59:44.221357: step: 318/466, loss: 0.10438908636569977 2023-01-22 12:59:44.967406: step: 320/466, loss: 0.1299682855606079 2023-01-22 12:59:45.750672: step: 322/466, loss: 0.15788854658603668 2023-01-22 
12:59:46.514678: step: 324/466, loss: 0.5104916095733643 2023-01-22 12:59:47.233439: step: 326/466, loss: 0.7637230753898621 2023-01-22 12:59:47.982504: step: 328/466, loss: 0.24904048442840576 2023-01-22 12:59:48.703235: step: 330/466, loss: 0.09890061616897583 2023-01-22 12:59:49.397974: step: 332/466, loss: 0.08374074846506119 2023-01-22 12:59:50.140084: step: 334/466, loss: 0.14553312957286835 2023-01-22 12:59:50.920121: step: 336/466, loss: 0.10662180930376053 2023-01-22 12:59:51.646217: step: 338/466, loss: 0.2679828405380249 2023-01-22 12:59:52.415400: step: 340/466, loss: 0.11776454746723175 2023-01-22 12:59:53.135758: step: 342/466, loss: 0.1401228904724121 2023-01-22 12:59:53.868112: step: 344/466, loss: 0.40193748474121094 2023-01-22 12:59:54.655345: step: 346/466, loss: 0.08663631230592728 2023-01-22 12:59:55.423565: step: 348/466, loss: 0.23300756514072418 2023-01-22 12:59:56.110987: step: 350/466, loss: 0.034195512533187866 2023-01-22 12:59:56.817006: step: 352/466, loss: 0.1955024152994156 2023-01-22 12:59:57.518874: step: 354/466, loss: 0.15643028914928436 2023-01-22 12:59:58.283830: step: 356/466, loss: 1.1213116645812988 2023-01-22 12:59:59.118619: step: 358/466, loss: 0.07308322191238403 2023-01-22 12:59:59.855688: step: 360/466, loss: 0.8623815178871155 2023-01-22 13:00:00.568984: step: 362/466, loss: 0.13978946208953857 2023-01-22 13:00:01.385381: step: 364/466, loss: 0.4244544208049774 2023-01-22 13:00:02.201873: step: 366/466, loss: 0.38631099462509155 2023-01-22 13:00:03.002723: step: 368/466, loss: 0.5736966729164124 2023-01-22 13:00:03.771575: step: 370/466, loss: 0.23332534730434418 2023-01-22 13:00:04.518016: step: 372/466, loss: 0.07546471804380417 2023-01-22 13:00:05.288189: step: 374/466, loss: 0.2998636066913605 2023-01-22 13:00:06.060580: step: 376/466, loss: 0.1286058872938156 2023-01-22 13:00:06.896867: step: 378/466, loss: 0.37791845202445984 2023-01-22 13:00:07.617841: step: 380/466, loss: 0.14797884225845337 2023-01-22 
13:00:08.417317: step: 382/466, loss: 0.25256213545799255 2023-01-22 13:00:09.193007: step: 384/466, loss: 0.17319680750370026 2023-01-22 13:00:09.970627: step: 386/466, loss: 0.16688449680805206 2023-01-22 13:00:10.735634: step: 388/466, loss: 1.3531017303466797 2023-01-22 13:00:11.455336: step: 390/466, loss: 0.1990767866373062 2023-01-22 13:00:12.201135: step: 392/466, loss: 0.16120776534080505 2023-01-22 13:00:12.935093: step: 394/466, loss: 0.14971502125263214 2023-01-22 13:00:13.633811: step: 396/466, loss: 0.20626650750637054 2023-01-22 13:00:14.362618: step: 398/466, loss: 0.5205829739570618 2023-01-22 13:00:15.158478: step: 400/466, loss: 0.10441724210977554 2023-01-22 13:00:15.965054: step: 402/466, loss: 0.11944933980703354 2023-01-22 13:00:16.912291: step: 404/466, loss: 0.12934282422065735 2023-01-22 13:00:17.725534: step: 406/466, loss: 0.2357649952173233 2023-01-22 13:00:18.435123: step: 408/466, loss: 0.20092816650867462 2023-01-22 13:00:19.244919: step: 410/466, loss: 0.31943678855895996 2023-01-22 13:00:20.117236: step: 412/466, loss: 0.07797721028327942 2023-01-22 13:00:20.919733: step: 414/466, loss: 0.15454114973545074 2023-01-22 13:00:21.703242: step: 416/466, loss: 0.14024995267391205 2023-01-22 13:00:22.468346: step: 418/466, loss: 0.19680440425872803 2023-01-22 13:00:23.214505: step: 420/466, loss: 0.10583780705928802 2023-01-22 13:00:23.927970: step: 422/466, loss: 1.6686673164367676 2023-01-22 13:00:24.704671: step: 424/466, loss: 0.2630821466445923 2023-01-22 13:00:25.439788: step: 426/466, loss: 0.1577325165271759 2023-01-22 13:00:26.175393: step: 428/466, loss: 0.5519349575042725 2023-01-22 13:00:27.018525: step: 430/466, loss: 0.37767624855041504 2023-01-22 13:00:27.776734: step: 432/466, loss: 0.15149228274822235 2023-01-22 13:00:28.487108: step: 434/466, loss: 0.24351133406162262 2023-01-22 13:00:29.236711: step: 436/466, loss: 0.17260925471782684 2023-01-22 13:00:29.998186: step: 438/466, loss: 0.19760867953300476 2023-01-22 
13:00:30.691106: step: 440/466, loss: 0.2500152885913849 2023-01-22 13:00:31.398204: step: 442/466, loss: 0.15466606616973877 2023-01-22 13:00:32.193346: step: 444/466, loss: 0.15376563370227814 2023-01-22 13:00:33.019871: step: 446/466, loss: 0.09276437014341354 2023-01-22 13:00:33.791940: step: 448/466, loss: 0.10054701566696167 2023-01-22 13:00:34.546695: step: 450/466, loss: 0.23553961515426636 2023-01-22 13:00:35.391228: step: 452/466, loss: 0.20160697400569916 2023-01-22 13:00:36.172894: step: 454/466, loss: 0.24913400411605835 2023-01-22 13:00:36.993522: step: 456/466, loss: 0.11520318686962128 2023-01-22 13:00:37.729328: step: 458/466, loss: 0.32482796907424927 2023-01-22 13:00:38.515123: step: 460/466, loss: 0.5226231217384338 2023-01-22 13:00:39.258130: step: 462/466, loss: 0.4979248046875 2023-01-22 13:00:40.159282: step: 464/466, loss: 0.40230900049209595 2023-01-22 13:00:40.913510: step: 466/466, loss: 0.3771675229072571 2023-01-22 13:00:41.710922: step: 468/466, loss: 0.13193966448307037 2023-01-22 13:00:42.536624: step: 470/466, loss: 0.6421751976013184 2023-01-22 13:00:43.320192: step: 472/466, loss: 0.32884228229522705 2023-01-22 13:00:44.143737: step: 474/466, loss: 0.13494789600372314 2023-01-22 13:00:44.964192: step: 476/466, loss: 0.3239307999610901 2023-01-22 13:00:45.776735: step: 478/466, loss: 1.8106663227081299 2023-01-22 13:00:46.607372: step: 480/466, loss: 0.653140127658844 2023-01-22 13:00:47.424750: step: 482/466, loss: 0.11913179606199265 2023-01-22 13:00:48.169560: step: 484/466, loss: 0.038208846002817154 2023-01-22 13:00:48.943639: step: 486/466, loss: 0.11867286264896393 2023-01-22 13:00:49.729268: step: 488/466, loss: 0.11869148910045624 2023-01-22 13:00:50.447907: step: 490/466, loss: 0.05254372954368591 2023-01-22 13:00:51.204988: step: 492/466, loss: 0.23922109603881836 2023-01-22 13:00:51.975930: step: 494/466, loss: 0.16811156272888184 2023-01-22 13:00:52.753207: step: 496/466, loss: 0.10078416764736176 2023-01-22 
13:00:53.500583: step: 498/466, loss: 0.21667678654193878 2023-01-22 13:00:54.332988: step: 500/466, loss: 0.3497439920902252 2023-01-22 13:00:55.082868: step: 502/466, loss: 0.39367035031318665 2023-01-22 13:00:55.809711: step: 504/466, loss: 0.20031200349330902 2023-01-22 13:00:56.556654: step: 506/466, loss: 0.1277354657649994 2023-01-22 13:00:57.240617: step: 508/466, loss: 0.06326362490653992 2023-01-22 13:00:57.978497: step: 510/466, loss: 0.10629107058048248 2023-01-22 13:00:58.774304: step: 512/466, loss: 0.24034050107002258 2023-01-22 13:00:59.501849: step: 514/466, loss: 0.155228853225708 2023-01-22 13:01:00.236300: step: 516/466, loss: 0.5224460363388062 2023-01-22 13:01:01.038347: step: 518/466, loss: 0.10733848065137863 2023-01-22 13:01:01.772201: step: 520/466, loss: 0.152258038520813 2023-01-22 13:01:02.608479: step: 522/466, loss: 0.16879776120185852 2023-01-22 13:01:03.458904: step: 524/466, loss: 0.12189304083585739 2023-01-22 13:01:04.265155: step: 526/466, loss: 0.06921043246984482 2023-01-22 13:01:04.979067: step: 528/466, loss: 0.11970750242471695 2023-01-22 13:01:05.679438: step: 530/466, loss: 0.11582281440496445 2023-01-22 13:01:06.328022: step: 532/466, loss: 0.24655069410800934 2023-01-22 13:01:07.070444: step: 534/466, loss: 0.2609686255455017 2023-01-22 13:01:07.795746: step: 536/466, loss: 0.09174732863903046 2023-01-22 13:01:08.511022: step: 538/466, loss: 0.12777909636497498 2023-01-22 13:01:09.259759: step: 540/466, loss: 0.1914782077074051 2023-01-22 13:01:10.001957: step: 542/466, loss: 0.20091712474822998 2023-01-22 13:01:10.757848: step: 544/466, loss: 0.36752355098724365 2023-01-22 13:01:11.491752: step: 546/466, loss: 0.2852557599544525 2023-01-22 13:01:12.207442: step: 548/466, loss: 0.08599811047315598 2023-01-22 13:01:12.987978: step: 550/466, loss: 1.1437206268310547 2023-01-22 13:01:13.774144: step: 552/466, loss: 0.1743803322315216 2023-01-22 13:01:14.569097: step: 554/466, loss: 0.13963526487350464 2023-01-22 
13:01:15.335292: step: 556/466, loss: 0.2871672809123993 2023-01-22 13:01:16.080677: step: 558/466, loss: 0.10863711684942245 2023-01-22 13:01:16.801477: step: 560/466, loss: 0.2510855793952942 2023-01-22 13:01:17.612044: step: 562/466, loss: 0.5121045112609863 2023-01-22 13:01:18.480184: step: 564/466, loss: 0.15737612545490265 2023-01-22 13:01:19.282706: step: 566/466, loss: 0.5852841138839722 2023-01-22 13:01:20.117053: step: 568/466, loss: 0.17177380621433258 2023-01-22 13:01:20.781726: step: 570/466, loss: 1.1846047639846802 2023-01-22 13:01:21.504081: step: 572/466, loss: 0.14318576455116272 2023-01-22 13:01:22.261544: step: 574/466, loss: 0.10378453880548477 2023-01-22 13:01:23.026557: step: 576/466, loss: 0.11457131803035736 2023-01-22 13:01:23.768440: step: 578/466, loss: 0.08289653807878494 2023-01-22 13:01:24.514858: step: 580/466, loss: 0.732667863368988 2023-01-22 13:01:25.260679: step: 582/466, loss: 0.44659432768821716 2023-01-22 13:01:25.966897: step: 584/466, loss: 0.18047739565372467 2023-01-22 13:01:26.752691: step: 586/466, loss: 0.30189192295074463 2023-01-22 13:01:27.517766: step: 588/466, loss: 0.09621741622686386 2023-01-22 13:01:28.324212: step: 590/466, loss: 0.3058559000492096 2023-01-22 13:01:29.139259: step: 592/466, loss: 0.20640884339809418 2023-01-22 13:01:29.919204: step: 594/466, loss: 0.26927196979522705 2023-01-22 13:01:30.634822: step: 596/466, loss: 0.44683629274368286 2023-01-22 13:01:31.456246: step: 598/466, loss: 0.17201684415340424 2023-01-22 13:01:32.239227: step: 600/466, loss: 0.3946480453014374 2023-01-22 13:01:33.052421: step: 602/466, loss: 0.1997591257095337 2023-01-22 13:01:33.898262: step: 604/466, loss: 0.16095568239688873 2023-01-22 13:01:34.642725: step: 606/466, loss: 0.26406875252723694 2023-01-22 13:01:35.355364: step: 608/466, loss: 0.0596434623003006 2023-01-22 13:01:36.140525: step: 610/466, loss: 0.22998706996440887 2023-01-22 13:01:36.875986: step: 612/466, loss: 0.4247513711452484 2023-01-22 
13:01:37.665542: step: 614/466, loss: 0.2173185646533966 2023-01-22 13:01:38.363336: step: 616/466, loss: 0.15110743045806885 2023-01-22 13:01:39.143437: step: 618/466, loss: 0.22173559665679932 2023-01-22 13:01:40.002710: step: 620/466, loss: 0.16269834339618683 2023-01-22 13:01:40.799101: step: 622/466, loss: 0.8579102754592896 2023-01-22 13:01:41.601863: step: 624/466, loss: 0.29735979437828064 2023-01-22 13:01:42.293085: step: 626/466, loss: 0.28379639983177185 2023-01-22 13:01:42.996124: step: 628/466, loss: 0.22198012471199036 2023-01-22 13:01:43.819112: step: 630/466, loss: 0.36806726455688477 2023-01-22 13:01:44.565838: step: 632/466, loss: 0.07528834789991379 2023-01-22 13:01:45.437546: step: 634/466, loss: 0.5660086870193481 2023-01-22 13:01:46.201645: step: 636/466, loss: 0.28243446350097656 2023-01-22 13:01:46.960092: step: 638/466, loss: 0.8265565037727356 2023-01-22 13:01:47.723878: step: 640/466, loss: 0.17058542370796204 2023-01-22 13:01:48.503230: step: 642/466, loss: 1.0795800685882568 2023-01-22 13:01:49.200569: step: 644/466, loss: 0.17732802033424377 2023-01-22 13:01:50.017311: step: 646/466, loss: 0.6980006098747253 2023-01-22 13:01:50.792151: step: 648/466, loss: 0.745697557926178 2023-01-22 13:01:51.596965: step: 650/466, loss: 0.4132540822029114 2023-01-22 13:01:52.429112: step: 652/466, loss: 0.1508013904094696 2023-01-22 13:01:53.190115: step: 654/466, loss: 0.43576478958129883 2023-01-22 13:01:53.975259: step: 656/466, loss: 0.1567678302526474 2023-01-22 13:01:54.768576: step: 658/466, loss: 0.09988389909267426 2023-01-22 13:01:55.596552: step: 660/466, loss: 0.5101841688156128 2023-01-22 13:01:56.389685: step: 662/466, loss: 0.1614990085363388 2023-01-22 13:01:57.094158: step: 664/466, loss: 0.24654972553253174 2023-01-22 13:01:57.949521: step: 666/466, loss: 0.1733606457710266 2023-01-22 13:01:58.708392: step: 668/466, loss: 0.3766717314720154 2023-01-22 13:01:59.442261: step: 670/466, loss: 1.201568841934204 2023-01-22 
13:02:00.155828: step: 672/466, loss: 0.11440683901309967 2023-01-22 13:02:00.950198: step: 674/466, loss: 0.06100420653820038 2023-01-22 13:02:01.769410: step: 676/466, loss: 0.5432313084602356 2023-01-22 13:02:02.548307: step: 678/466, loss: 0.13945043087005615 2023-01-22 13:02:03.373805: step: 680/466, loss: 0.16164690256118774 2023-01-22 13:02:04.316619: step: 682/466, loss: 0.2367585301399231 2023-01-22 13:02:05.046016: step: 684/466, loss: 0.2855292856693268 2023-01-22 13:02:05.843037: step: 686/466, loss: 0.0582096241414547 2023-01-22 13:02:06.586671: step: 688/466, loss: 0.13798856735229492 2023-01-22 13:02:07.425998: step: 690/466, loss: 0.27871906757354736 2023-01-22 13:02:08.251968: step: 692/466, loss: 0.5407863855361938 2023-01-22 13:02:09.077941: step: 694/466, loss: 0.13680404424667358 2023-01-22 13:02:09.863318: step: 696/466, loss: 0.5465109348297119 2023-01-22 13:02:10.622112: step: 698/466, loss: 0.1029527336359024 2023-01-22 13:02:11.349736: step: 700/466, loss: 0.28009337186813354 2023-01-22 13:02:12.164876: step: 702/466, loss: 0.23946930468082428 2023-01-22 13:02:12.900153: step: 704/466, loss: 0.19813068211078644 2023-01-22 13:02:13.623845: step: 706/466, loss: 0.2371053397655487 2023-01-22 13:02:14.413633: step: 708/466, loss: 4.844532489776611 2023-01-22 13:02:15.201319: step: 710/466, loss: 0.2899446487426758 2023-01-22 13:02:15.960559: step: 712/466, loss: 0.3364908695220947 2023-01-22 13:02:16.701889: step: 714/466, loss: 0.16732342541217804 2023-01-22 13:02:17.465110: step: 716/466, loss: 0.31874820590019226 2023-01-22 13:02:18.199956: step: 718/466, loss: 0.24993471801280975 2023-01-22 13:02:18.902903: step: 720/466, loss: 0.14156503975391388 2023-01-22 13:02:19.599463: step: 722/466, loss: 0.06063876673579216 2023-01-22 13:02:20.621060: step: 724/466, loss: 0.1900947093963623 2023-01-22 13:02:21.388440: step: 726/466, loss: 0.10233768075704575 2023-01-22 13:02:22.105205: step: 728/466, loss: 0.12381672859191895 2023-01-22 
13:02:22.799006: step: 730/466, loss: 0.22875846922397614 2023-01-22 13:02:23.561654: step: 732/466, loss: 0.1475868672132492 2023-01-22 13:02:24.506289: step: 734/466, loss: 0.20701956748962402 2023-01-22 13:02:25.239564: step: 736/466, loss: 0.23301541805267334 2023-01-22 13:02:26.038772: step: 738/466, loss: 0.3107610046863556 2023-01-22 13:02:26.851932: step: 740/466, loss: 0.0913168340921402 2023-01-22 13:02:27.600005: step: 742/466, loss: 0.22894158959388733 2023-01-22 13:02:28.399234: step: 744/466, loss: 0.8507946729660034 2023-01-22 13:02:29.195078: step: 746/466, loss: 0.45970121026039124 2023-01-22 13:02:29.951723: step: 748/466, loss: 0.07593411207199097 2023-01-22 13:02:30.809006: step: 750/466, loss: 0.21511617302894592 2023-01-22 13:02:31.558634: step: 752/466, loss: 0.29151296615600586 2023-01-22 13:02:32.261041: step: 754/466, loss: 0.20864735543727875 2023-01-22 13:02:33.031979: step: 756/466, loss: 0.23786227405071259 2023-01-22 13:02:33.744081: step: 758/466, loss: 0.12289852648973465 2023-01-22 13:02:34.646679: step: 760/466, loss: 0.35148900747299194 2023-01-22 13:02:35.352043: step: 762/466, loss: 0.06791818141937256 2023-01-22 13:02:36.104545: step: 764/466, loss: 0.11666157096624374 2023-01-22 13:02:36.868169: step: 766/466, loss: 0.157499298453331 2023-01-22 13:02:37.643733: step: 768/466, loss: 0.14626558125019073 2023-01-22 13:02:38.380250: step: 770/466, loss: 0.2578248381614685 2023-01-22 13:02:39.133529: step: 772/466, loss: 0.32343581318855286 2023-01-22 13:02:39.889136: step: 774/466, loss: 0.29889240860939026 2023-01-22 13:02:40.696164: step: 776/466, loss: 0.1174267828464508 2023-01-22 13:02:41.439189: step: 778/466, loss: 0.14662732183933258 2023-01-22 13:02:42.215820: step: 780/466, loss: 0.21824069321155548 2023-01-22 13:02:42.957871: step: 782/466, loss: 0.15058116614818573 2023-01-22 13:02:43.649609: step: 784/466, loss: 0.1504792422056198 2023-01-22 13:02:44.453895: step: 786/466, loss: 0.16521036624908447 2023-01-22 
13:02:45.233275: step: 788/466, loss: 0.18496575951576233 2023-01-22 13:02:46.012246: step: 790/466, loss: 0.3798065781593323 2023-01-22 13:02:46.781332: step: 792/466, loss: 0.12481054663658142 2023-01-22 13:02:47.530324: step: 794/466, loss: 0.13360081613063812 2023-01-22 13:02:48.300109: step: 796/466, loss: 0.07607656717300415 2023-01-22 13:02:49.051102: step: 798/466, loss: 0.1492290198802948 2023-01-22 13:02:49.882348: step: 800/466, loss: 0.07505486160516739 2023-01-22 13:02:50.655812: step: 802/466, loss: 0.16869288682937622 2023-01-22 13:02:51.386710: step: 804/466, loss: 0.1634787917137146 2023-01-22 13:02:52.041422: step: 806/466, loss: 0.09497810155153275 2023-01-22 13:02:52.830426: step: 808/466, loss: 0.20512060821056366 2023-01-22 13:02:53.579212: step: 810/466, loss: 0.2127516269683838 2023-01-22 13:02:54.234171: step: 812/466, loss: 0.18286651372909546 2023-01-22 13:02:54.949435: step: 814/466, loss: 0.7895347476005554 2023-01-22 13:02:55.763791: step: 816/466, loss: 0.29801860451698303 2023-01-22 13:02:56.524242: step: 818/466, loss: 0.14022906124591827 2023-01-22 13:02:57.211883: step: 820/466, loss: 0.5178606510162354 2023-01-22 13:02:57.984826: step: 822/466, loss: 0.05689023435115814 2023-01-22 13:02:58.751377: step: 824/466, loss: 0.25089576840400696 2023-01-22 13:02:59.572677: step: 826/466, loss: 0.17668074369430542 2023-01-22 13:03:00.385061: step: 828/466, loss: 0.09522435069084167 2023-01-22 13:03:01.140228: step: 830/466, loss: 0.16681769490242004 2023-01-22 13:03:01.897451: step: 832/466, loss: 0.06689116358757019 2023-01-22 13:03:02.720700: step: 834/466, loss: 0.13664615154266357 2023-01-22 13:03:03.459898: step: 836/466, loss: 0.2148696333169937 2023-01-22 13:03:04.194072: step: 838/466, loss: 0.1512245088815689 2023-01-22 13:03:04.988952: step: 840/466, loss: 0.3020787537097931 2023-01-22 13:03:05.691253: step: 842/466, loss: 0.11236050724983215 2023-01-22 13:03:06.471084: step: 844/466, loss: 0.10167965292930603 2023-01-22 
13:03:07.194676: step: 846/466, loss: 0.1063070073723793 2023-01-22 13:03:07.873589: step: 848/466, loss: 0.4301778972148895 2023-01-22 13:03:08.608314: step: 850/466, loss: 0.2793833911418915 2023-01-22 13:03:09.485620: step: 852/466, loss: 0.06376925855875015 2023-01-22 13:03:10.189255: step: 854/466, loss: 0.20013092458248138 2023-01-22 13:03:11.015142: step: 856/466, loss: 0.09704309701919556 2023-01-22 13:03:11.742851: step: 858/466, loss: 0.1062927320599556 2023-01-22 13:03:12.628447: step: 860/466, loss: 0.249484583735466 2023-01-22 13:03:13.331909: step: 862/466, loss: 0.12209375947713852 2023-01-22 13:03:14.108899: step: 864/466, loss: 1.1528286933898926 2023-01-22 13:03:14.823779: step: 866/466, loss: 0.1208798959851265 2023-01-22 13:03:15.553923: step: 868/466, loss: 0.2088085114955902 2023-01-22 13:03:16.248245: step: 870/466, loss: 0.1554957628250122 2023-01-22 13:03:17.034351: step: 872/466, loss: 0.5452127456665039 2023-01-22 13:03:17.756754: step: 874/466, loss: 0.1534930318593979 2023-01-22 13:03:18.499676: step: 876/466, loss: 0.13185469806194305 2023-01-22 13:03:19.295361: step: 878/466, loss: 0.0991528108716011 2023-01-22 13:03:20.021471: step: 880/466, loss: 0.24972985684871674 2023-01-22 13:03:20.889332: step: 882/466, loss: 0.15756390988826752 2023-01-22 13:03:21.639289: step: 884/466, loss: 0.1563219279050827 2023-01-22 13:03:22.423976: step: 886/466, loss: 0.0598449744284153 2023-01-22 13:03:23.174113: step: 888/466, loss: 0.052008356899023056 2023-01-22 13:03:23.892783: step: 890/466, loss: 0.3348170518875122 2023-01-22 13:03:24.659466: step: 892/466, loss: 0.3671996295452118 2023-01-22 13:03:25.409421: step: 894/466, loss: 0.7377856969833374 2023-01-22 13:03:26.134693: step: 896/466, loss: 0.2085656374692917 2023-01-22 13:03:26.885114: step: 898/466, loss: 0.31188899278640747 2023-01-22 13:03:27.670923: step: 900/466, loss: 0.5782420635223389 2023-01-22 13:03:28.437113: step: 902/466, loss: 0.058877550065517426 2023-01-22 13:03:29.214611: 
step: 904/466, loss: 0.20558245480060577 2023-01-22 13:03:29.998118: step: 906/466, loss: 0.37686750292778015 2023-01-22 13:03:30.735255: step: 908/466, loss: 0.33963459730148315 2023-01-22 13:03:31.510099: step: 910/466, loss: 0.14356964826583862 2023-01-22 13:03:32.282482: step: 912/466, loss: 0.21850159764289856 2023-01-22 13:03:33.138326: step: 914/466, loss: 0.26700299978256226 2023-01-22 13:03:33.846453: step: 916/466, loss: 0.21178553998470306 2023-01-22 13:03:34.640271: step: 918/466, loss: 0.5047131180763245 2023-01-22 13:03:35.440845: step: 920/466, loss: 0.475437194108963 2023-01-22 13:03:36.284584: step: 922/466, loss: 0.04170737788081169 2023-01-22 13:03:37.058690: step: 924/466, loss: 0.03637902811169624 2023-01-22 13:03:37.822926: step: 926/466, loss: 0.349740594625473 2023-01-22 13:03:38.645340: step: 928/466, loss: 0.2327994704246521 2023-01-22 13:03:39.413905: step: 930/466, loss: 0.15047289431095123 2023-01-22 13:03:40.245310: step: 932/466, loss: 0.3883518874645233
==================================================
Loss: 0.304
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
New best chinese model...
New best korean model...
New best russian model...
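The 'combined' figure in each logged result dict is consistent with the product of the template F1 and the slot F1 (e.g. Dev Chinese: 0.7368421052631579 × 0.33340537067099557 ≈ 0.2456671152312599). A minimal sketch of that relation, checked against the logged epoch-11 numbers; the helper name is hypothetical, not taken from `train.py`:

```python
def combined_score(result: dict) -> float:
    """Product of template F1 and slot F1, matching the logged 'combined' field."""
    return result["template"]["f1"] * result["slot"]["f1"]

# Dev Chinese result as printed in the epoch-11 summary above.
dev_chinese = {
    "template": {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579},
    "slot": {"p": 0.3148516147989166, "r": 0.35428274682306937, "f1": 0.33340537067099557},
    "combined": 0.2456671152312599,
    "epoch": 11,
}
assert abs(combined_score(dev_chinese) - dev_chinese["combined"]) < 1e-9
```

The same check holds for the Test dicts (e.g. Test Chinese: 0.6146341463414634 × 0.31777011337470074 ≈ 0.19531236236688923), so lifting the template F1 raises every combined score proportionally.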
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 12
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:06:53.002918: step: 2/466, loss: 0.22970537841320038 2023-01-22 13:06:53.753000: step: 4/466, loss: 3.4017958641052246 2023-01-22 13:06:54.539319: step: 6/466, loss: 0.19457271695137024 2023-01-22 13:06:55.319312: step: 8/466, loss: 0.05014999583363533 2023-01-22 13:06:56.007693: step: 10/466, loss: 0.12878787517547607 2023-01-22 13:06:56.813304: step: 12/466, loss: 0.20097599923610687 2023-01-22 13:06:57.498623: step: 14/466, loss: 0.07929079234600067 2023-01-22 13:06:58.218566: step: 16/466, loss: 0.3970954418182373 2023-01-22 13:06:59.016554: step: 18/466, loss: 0.1011417880654335 2023-01-22 13:06:59.704173: step: 20/466, loss: 0.10177238285541534 2023-01-22 13:07:00.474552: step: 22/466, loss: 2.5332489013671875 2023-01-22 13:07:01.232713: step: 24/466, loss: 0.0751798003911972 2023-01-22 13:07:02.101624: step: 26/466, loss: 0.10535023361444473 2023-01-22 13:07:02.877149: step: 28/466, loss: 0.18587830662727356 2023-01-22 13:07:03.625707: step: 30/466, loss: 0.10108933597803116 2023-01-22 13:07:04.413259: step: 32/466, loss: 0.11795395612716675 2023-01-22 13:07:05.148622: step: 34/466, loss: 0.10065092146396637 2023-01-22 13:07:05.914335: step: 36/466, loss: 0.5440007448196411 2023-01-22 13:07:06.709224: step: 38/466, loss: 0.18395252525806427 2023-01-22 13:07:07.491354: step: 40/466, loss: 0.11210142821073532 2023-01-22 13:07:08.211957: step: 42/466, loss: 0.26954972743988037 2023-01-22 13:07:08.929540: step: 44/466, loss: 0.08725812286138535 2023-01-22 13:07:09.710652: step: 46/466, loss: 0.12701453268527985 2023-01-22 13:07:10.441742: step: 48/466, loss: 
0.14181667566299438 2023-01-22 13:07:11.288417: step: 50/466, loss: 0.1456719934940338 2023-01-22 13:07:12.036796: step: 52/466, loss: 0.10307493805885315 2023-01-22 13:07:12.878490: step: 54/466, loss: 0.14766325056552887 2023-01-22 13:07:13.652443: step: 56/466, loss: 0.06563329696655273 2023-01-22 13:07:14.436605: step: 58/466, loss: 0.08325640857219696 2023-01-22 13:07:15.177422: step: 60/466, loss: 0.3567356467247009 2023-01-22 13:07:15.963124: step: 62/466, loss: 0.0927126333117485 2023-01-22 13:07:16.782418: step: 64/466, loss: 0.8182318806648254 2023-01-22 13:07:17.563762: step: 66/466, loss: 0.22731924057006836 2023-01-22 13:07:18.291463: step: 68/466, loss: 0.1506653130054474 2023-01-22 13:07:19.035738: step: 70/466, loss: 0.14179442822933197 2023-01-22 13:07:19.761338: step: 72/466, loss: 0.05986696109175682 2023-01-22 13:07:20.515496: step: 74/466, loss: 0.26590049266815186 2023-01-22 13:07:21.252956: step: 76/466, loss: 0.05869297683238983 2023-01-22 13:07:21.986734: step: 78/466, loss: 0.0641389861702919 2023-01-22 13:07:22.716033: step: 80/466, loss: 0.15808598697185516 2023-01-22 13:07:23.515123: step: 82/466, loss: 0.1286565214395523 2023-01-22 13:07:24.298505: step: 84/466, loss: 0.1533057689666748 2023-01-22 13:07:25.083837: step: 86/466, loss: 0.16723588109016418 2023-01-22 13:07:25.842239: step: 88/466, loss: 0.12770910561084747 2023-01-22 13:07:26.505687: step: 90/466, loss: 0.2282274216413498 2023-01-22 13:07:27.259259: step: 92/466, loss: 0.39215442538261414 2023-01-22 13:07:28.085983: step: 94/466, loss: 0.40199926495552063 2023-01-22 13:07:28.833704: step: 96/466, loss: 0.0938708484172821 2023-01-22 13:07:29.770782: step: 98/466, loss: 0.1018475815653801 2023-01-22 13:07:30.425115: step: 100/466, loss: 0.04701365530490875 2023-01-22 13:07:31.192761: step: 102/466, loss: 0.17152880132198334 2023-01-22 13:07:32.012494: step: 104/466, loss: 0.10114247351884842 2023-01-22 13:07:32.851303: step: 106/466, loss: 0.1542118936777115 2023-01-22 
13:07:33.550077: step: 108/466, loss: 0.21466155350208282 2023-01-22 13:07:34.276326: step: 110/466, loss: 0.15568910539150238 2023-01-22 13:07:35.040164: step: 112/466, loss: 0.11884459853172302 2023-01-22 13:07:35.840640: step: 114/466, loss: 1.404669165611267 2023-01-22 13:07:36.546012: step: 116/466, loss: 0.6263872385025024 2023-01-22 13:07:37.280664: step: 118/466, loss: 0.10001714527606964 2023-01-22 13:07:37.981880: step: 120/466, loss: 0.2752031683921814 2023-01-22 13:07:38.760461: step: 122/466, loss: 0.21351397037506104 2023-01-22 13:07:39.506951: step: 124/466, loss: 0.33566728234291077 2023-01-22 13:07:40.251034: step: 126/466, loss: 0.0852104052901268 2023-01-22 13:07:41.102691: step: 128/466, loss: 0.1525503247976303 2023-01-22 13:07:42.055273: step: 130/466, loss: 5.630551815032959 2023-01-22 13:07:42.878680: step: 132/466, loss: 0.3894355595111847 2023-01-22 13:07:43.622869: step: 134/466, loss: 0.07047511637210846 2023-01-22 13:07:44.460390: step: 136/466, loss: 1.2956987619400024 2023-01-22 13:07:45.305161: step: 138/466, loss: 0.29221078753471375 2023-01-22 13:07:46.131369: step: 140/466, loss: 0.2097444236278534 2023-01-22 13:07:46.865069: step: 142/466, loss: 0.16492792963981628 2023-01-22 13:07:47.642699: step: 144/466, loss: 0.4171257019042969 2023-01-22 13:07:48.516956: step: 146/466, loss: 0.18491220474243164 2023-01-22 13:07:49.239309: step: 148/466, loss: 0.13337670266628265 2023-01-22 13:07:50.031146: step: 150/466, loss: 0.13845552504062653 2023-01-22 13:07:50.721444: step: 152/466, loss: 0.3049972653388977 2023-01-22 13:07:51.500716: step: 154/466, loss: 0.10930734872817993 2023-01-22 13:07:52.361142: step: 156/466, loss: 0.3641570806503296 2023-01-22 13:07:52.988966: step: 158/466, loss: 0.19187471270561218 2023-01-22 13:07:53.756788: step: 160/466, loss: 0.036166153848171234 2023-01-22 13:07:54.580681: step: 162/466, loss: 0.22070954740047455 2023-01-22 13:07:55.341526: step: 164/466, loss: 0.16025584936141968 2023-01-22 
13:07:56.059789: step: 166/466, loss: 0.1772567182779312 2023-01-22 13:07:56.827210: step: 168/466, loss: 0.27021461725234985 2023-01-22 13:07:57.588329: step: 170/466, loss: 0.1123526319861412 2023-01-22 13:07:58.439064: step: 172/466, loss: 0.15609031915664673 2023-01-22 13:07:59.103078: step: 174/466, loss: 0.5921962857246399 2023-01-22 13:07:59.898138: step: 176/466, loss: 0.17766670882701874 2023-01-22 13:08:00.773582: step: 178/466, loss: 0.1023736447095871 2023-01-22 13:08:01.618035: step: 180/466, loss: 0.5316795110702515 2023-01-22 13:08:02.384289: step: 182/466, loss: 0.07270383089780807 2023-01-22 13:08:03.186906: step: 184/466, loss: 0.19070985913276672 2023-01-22 13:08:03.960196: step: 186/466, loss: 0.4144967496395111 2023-01-22 13:08:04.703135: step: 188/466, loss: 0.046569474041461945 2023-01-22 13:08:05.482244: step: 190/466, loss: 0.13963574171066284 2023-01-22 13:08:06.280282: step: 192/466, loss: 0.15261642634868622 2023-01-22 13:08:07.045332: step: 194/466, loss: 0.11186033487319946 2023-01-22 13:08:07.846453: step: 196/466, loss: 0.2546413838863373 2023-01-22 13:08:08.539800: step: 198/466, loss: 0.5899943709373474 2023-01-22 13:08:09.246658: step: 200/466, loss: 0.04048832505941391 2023-01-22 13:08:09.978163: step: 202/466, loss: 0.10920899361371994 2023-01-22 13:08:10.702606: step: 204/466, loss: 0.28868985176086426 2023-01-22 13:08:11.468138: step: 206/466, loss: 0.12218084186315536 2023-01-22 13:08:12.220920: step: 208/466, loss: 0.05156202241778374 2023-01-22 13:08:12.988394: step: 210/466, loss: 0.10586714744567871 2023-01-22 13:08:13.630113: step: 212/466, loss: 0.1500854194164276 2023-01-22 13:08:14.405557: step: 214/466, loss: 0.09757547080516815 2023-01-22 13:08:15.177984: step: 216/466, loss: 0.028935061767697334 2023-01-22 13:08:15.935262: step: 218/466, loss: 0.028134481981396675 2023-01-22 13:08:16.712675: step: 220/466, loss: 0.12261322140693665 2023-01-22 13:08:17.531718: step: 222/466, loss: 0.09786012023687363 2023-01-22 
13:08:18.247300: step: 224/466, loss: 0.26669418811798096 2023-01-22 13:08:18.976216: step: 226/466, loss: 0.06629175692796707 2023-01-22 13:08:19.736752: step: 228/466, loss: 0.35725879669189453 2023-01-22 13:08:20.474004: step: 230/466, loss: 0.23570282757282257 2023-01-22 13:08:21.282947: step: 232/466, loss: 0.127173513174057 2023-01-22 13:08:22.101899: step: 234/466, loss: 0.10487955063581467 2023-01-22 13:08:22.834466: step: 236/466, loss: 0.06401833891868591 2023-01-22 13:08:23.630674: step: 238/466, loss: 0.3104228079319 2023-01-22 13:08:24.365185: step: 240/466, loss: 0.8340924978256226 2023-01-22 13:08:25.075754: step: 242/466, loss: 0.5680950880050659 2023-01-22 13:08:25.937812: step: 244/466, loss: 0.2969139814376831 2023-01-22 13:08:26.708919: step: 246/466, loss: 0.12126941233873367 2023-01-22 13:08:27.482734: step: 248/466, loss: 0.42154020071029663 2023-01-22 13:08:28.241906: step: 250/466, loss: 0.0735100582242012 2023-01-22 13:08:28.965155: step: 252/466, loss: 0.0969298928976059 2023-01-22 13:08:29.683063: step: 254/466, loss: 0.14617589116096497 2023-01-22 13:08:30.417259: step: 256/466, loss: 0.35524335503578186 2023-01-22 13:08:31.184612: step: 258/466, loss: 0.1707388311624527 2023-01-22 13:08:31.990720: step: 260/466, loss: 0.13604384660720825 2023-01-22 13:08:32.781607: step: 262/466, loss: 0.19494149088859558 2023-01-22 13:08:33.538474: step: 264/466, loss: 0.3368360698223114 2023-01-22 13:08:34.226045: step: 266/466, loss: 0.17181900143623352 2023-01-22 13:08:35.008862: step: 268/466, loss: 0.2624906301498413 2023-01-22 13:08:35.795335: step: 270/466, loss: 0.3313490152359009 2023-01-22 13:08:36.727414: step: 272/466, loss: 0.15030698478221893 2023-01-22 13:08:37.513707: step: 274/466, loss: 0.019664861261844635 2023-01-22 13:08:38.263949: step: 276/466, loss: 0.13571712374687195 2023-01-22 13:08:39.061043: step: 278/466, loss: 0.2538709342479706 2023-01-22 13:08:39.783731: step: 280/466, loss: 0.11656523495912552 2023-01-22 
13:08:40.584489: step: 282/466, loss: 0.11484860628843307 2023-01-22 13:08:41.308220: step: 284/466, loss: 0.1368667632341385 2023-01-22 13:08:42.104358: step: 286/466, loss: 0.16884687542915344 2023-01-22 13:08:42.881603: step: 288/466, loss: 0.13645488023757935 2023-01-22 13:08:43.601433: step: 290/466, loss: 0.1285877227783203 2023-01-22 13:08:44.387190: step: 292/466, loss: 0.16469940543174744 2023-01-22 13:08:45.175994: step: 294/466, loss: 0.14539150893688202 2023-01-22 13:08:45.929687: step: 296/466, loss: 0.09342527389526367 2023-01-22 13:08:46.755585: step: 298/466, loss: 0.11741075664758682 2023-01-22 13:08:47.538889: step: 300/466, loss: 0.1407267451286316 2023-01-22 13:08:48.418312: step: 302/466, loss: 0.2229546159505844 2023-01-22 13:08:49.143756: step: 304/466, loss: 0.2091650664806366 2023-01-22 13:08:49.862677: step: 306/466, loss: 0.15506994724273682 2023-01-22 13:08:50.626827: step: 308/466, loss: 0.24538929760456085 2023-01-22 13:08:51.409737: step: 310/466, loss: 0.1123206615447998 2023-01-22 13:08:52.187103: step: 312/466, loss: 0.15521487593650818 2023-01-22 13:08:52.947769: step: 314/466, loss: 0.48486581444740295 2023-01-22 13:08:53.722747: step: 316/466, loss: 0.07889270037412643 2023-01-22 13:08:54.501222: step: 318/466, loss: 0.18789488077163696 2023-01-22 13:08:55.291595: step: 320/466, loss: 0.13976812362670898 2023-01-22 13:08:56.016685: step: 322/466, loss: 0.12298387289047241 2023-01-22 13:08:56.748539: step: 324/466, loss: 0.024222377687692642 2023-01-22 13:08:57.509127: step: 326/466, loss: 0.0798078179359436 2023-01-22 13:08:58.238251: step: 328/466, loss: 0.12812082469463348 2023-01-22 13:08:59.150118: step: 330/466, loss: 0.14643944799900055 2023-01-22 13:08:59.947389: step: 332/466, loss: 0.07890229672193527 2023-01-22 13:09:00.697192: step: 334/466, loss: 0.03876285254955292 2023-01-22 13:09:01.465578: step: 336/466, loss: 0.0667174905538559 2023-01-22 13:09:02.334334: step: 338/466, loss: 0.11607632786035538 2023-01-22 
13:09:03.100753: step: 340/466, loss: 0.058539051562547684 2023-01-22 13:09:03.802168: step: 342/466, loss: 0.09484546631574631 2023-01-22 13:09:04.564293: step: 344/466, loss: 0.06643007695674896 2023-01-22 13:09:05.352734: step: 346/466, loss: 0.17572882771492004 2023-01-22 13:09:06.143638: step: 348/466, loss: 0.23891420662403107 2023-01-22 13:09:06.900541: step: 350/466, loss: 0.10195959359407425 2023-01-22 13:09:07.668030: step: 352/466, loss: 0.09479983896017075 2023-01-22 13:09:08.409462: step: 354/466, loss: 0.31346720457077026 2023-01-22 13:09:09.204863: step: 356/466, loss: 0.05729576200246811 2023-01-22 13:09:09.978914: step: 358/466, loss: 0.30581486225128174 2023-01-22 13:09:10.748710: step: 360/466, loss: 0.4423132538795471 2023-01-22 13:09:11.519598: step: 362/466, loss: 0.07643415778875351 2023-01-22 13:09:12.286992: step: 364/466, loss: 0.279776930809021 2023-01-22 13:09:13.024210: step: 366/466, loss: 0.27822229266166687 2023-01-22 13:09:13.784383: step: 368/466, loss: 0.389532208442688 2023-01-22 13:09:14.569184: step: 370/466, loss: 0.12308437377214432 2023-01-22 13:09:15.326244: step: 372/466, loss: 0.0807652547955513 2023-01-22 13:09:16.125622: step: 374/466, loss: 0.18257291615009308 2023-01-22 13:09:16.933126: step: 376/466, loss: 0.12374410033226013 2023-01-22 13:09:17.737773: step: 378/466, loss: 0.27145010232925415 2023-01-22 13:09:18.527234: step: 380/466, loss: 0.2303091585636139 2023-01-22 13:09:19.297310: step: 382/466, loss: 0.15865585207939148 2023-01-22 13:09:19.988233: step: 384/466, loss: 0.13504156470298767 2023-01-22 13:09:20.689123: step: 386/466, loss: 0.16256369650363922 2023-01-22 13:09:21.392164: step: 388/466, loss: 0.3930529057979584 2023-01-22 13:09:22.160955: step: 390/466, loss: 0.14539262652397156 2023-01-22 13:09:23.045870: step: 392/466, loss: 0.41078057885169983 2023-01-22 13:09:23.827313: step: 394/466, loss: 0.11577159911394119 2023-01-22 13:09:24.503383: step: 396/466, loss: 0.11931080371141434 2023-01-22 
13:09:25.345711: step: 398/466, loss: 0.1519489884376526 2023-01-22 13:09:26.073387: step: 400/466, loss: 0.18284772336483002 2023-01-22 13:09:26.783163: step: 402/466, loss: 0.05453243479132652 2023-01-22 13:09:27.548078: step: 404/466, loss: 0.1171683669090271 2023-01-22 13:09:28.280855: step: 406/466, loss: 0.15749157965183258 2023-01-22 13:09:29.013116: step: 408/466, loss: 0.13453346490859985 2023-01-22 13:09:29.819411: step: 410/466, loss: 0.16586779057979584 2023-01-22 13:09:30.692328: step: 412/466, loss: 0.1531343013048172 2023-01-22 13:09:31.522500: step: 414/466, loss: 0.24512223899364471 2023-01-22 13:09:32.328019: step: 416/466, loss: 0.1436801552772522 2023-01-22 13:09:33.160678: step: 418/466, loss: 0.12498243153095245 2023-01-22 13:09:33.882647: step: 420/466, loss: 0.5236779451370239 2023-01-22 13:09:34.624632: step: 422/466, loss: 1.3416144847869873 2023-01-22 13:09:35.324799: step: 424/466, loss: 0.21465624868869781 2023-01-22 13:09:36.073214: step: 426/466, loss: 0.40788188576698303 2023-01-22 13:09:36.830577: step: 428/466, loss: 0.10750317573547363 2023-01-22 13:09:37.560846: step: 430/466, loss: 0.4191490411758423 2023-01-22 13:09:38.279274: step: 432/466, loss: 0.20027561485767365 2023-01-22 13:09:39.010882: step: 434/466, loss: 0.05397174507379532 2023-01-22 13:09:39.694092: step: 436/466, loss: 0.07795599102973938 2023-01-22 13:09:40.508065: step: 438/466, loss: 0.21288935840129852 2023-01-22 13:09:41.250510: step: 440/466, loss: 0.4418155550956726 2023-01-22 13:09:41.975305: step: 442/466, loss: 0.3671300709247589 2023-01-22 13:09:42.714454: step: 444/466, loss: 0.06263580918312073 2023-01-22 13:09:43.458434: step: 446/466, loss: 0.15847350656986237 2023-01-22 13:09:44.186237: step: 448/466, loss: 0.2654990553855896 2023-01-22 13:09:44.976361: step: 450/466, loss: 0.10745230317115784 2023-01-22 13:09:45.695032: step: 452/466, loss: 0.17414699494838715 2023-01-22 13:09:46.364954: step: 454/466, loss: 0.15538115799427032 2023-01-22 
13:09:47.167004: step: 456/466, loss: 1.7021269798278809 2023-01-22 13:09:47.973945: step: 458/466, loss: 0.05081811919808388 2023-01-22 13:09:48.717859: step: 460/466, loss: 0.5242745876312256 2023-01-22 13:09:49.488755: step: 462/466, loss: 1.2987644672393799 2023-01-22 13:09:50.228692: step: 464/466, loss: 0.19540849328041077 2023-01-22 13:09:50.955542: step: 466/466, loss: 0.07759097963571548 2023-01-22 13:09:51.700389: step: 468/466, loss: 0.3391871154308319 2023-01-22 13:09:52.460250: step: 470/466, loss: 0.13671857118606567 2023-01-22 13:09:53.215992: step: 472/466, loss: 0.4496214985847473 2023-01-22 13:09:53.938505: step: 474/466, loss: 0.08343297243118286 2023-01-22 13:09:54.726052: step: 476/466, loss: 0.11408674716949463 2023-01-22 13:09:55.507576: step: 478/466, loss: 0.06497704982757568 2023-01-22 13:09:56.256792: step: 480/466, loss: 0.1774224489927292 2023-01-22 13:09:56.959413: step: 482/466, loss: 0.13437721133232117 2023-01-22 13:09:57.708294: step: 484/466, loss: 0.2139037400484085 2023-01-22 13:09:58.528807: step: 486/466, loss: 0.48550844192504883 2023-01-22 13:09:59.224775: step: 488/466, loss: 1.2269116640090942 2023-01-22 13:10:00.005365: step: 490/466, loss: 0.083436980843544 2023-01-22 13:10:00.822487: step: 492/466, loss: 0.26875337958335876 2023-01-22 13:10:01.577910: step: 494/466, loss: 0.10697463899850845 2023-01-22 13:10:02.301408: step: 496/466, loss: 0.0854906216263771 2023-01-22 13:10:03.017971: step: 498/466, loss: 0.5183632969856262 2023-01-22 13:10:03.823504: step: 500/466, loss: 0.3039102852344513 2023-01-22 13:10:04.514479: step: 502/466, loss: 0.12715557217597961 2023-01-22 13:10:05.195507: step: 504/466, loss: 0.16347670555114746 2023-01-22 13:10:05.994230: step: 506/466, loss: 0.14180181920528412 2023-01-22 13:10:06.699235: step: 508/466, loss: 0.08178799599409103 2023-01-22 13:10:07.462356: step: 510/466, loss: 0.1895528882741928 2023-01-22 13:10:08.174440: step: 512/466, loss: 0.2194291651248932 2023-01-22 
13:10:09.061378: step: 514/466, loss: 0.17503222823143005 2023-01-22 13:10:09.735252: step: 516/466, loss: 0.2034883201122284 2023-01-22 13:10:10.454177: step: 518/466, loss: 0.1403059959411621 2023-01-22 13:10:11.178152: step: 520/466, loss: 0.04848955571651459 2023-01-22 13:10:11.960274: step: 522/466, loss: 0.1673567295074463 2023-01-22 13:10:12.713549: step: 524/466, loss: 0.12073955684900284 2023-01-22 13:10:13.480029: step: 526/466, loss: 0.42349761724472046 2023-01-22 13:10:14.201609: step: 528/466, loss: 0.10180588066577911 2023-01-22 13:10:14.959236: step: 530/466, loss: 0.32483330368995667 2023-01-22 13:10:15.726336: step: 532/466, loss: 0.08331071585416794 2023-01-22 13:10:16.487309: step: 534/466, loss: 0.05236487463116646 2023-01-22 13:10:17.316075: step: 536/466, loss: 0.4884677529335022 2023-01-22 13:10:18.088644: step: 538/466, loss: 0.6043692827224731 2023-01-22 13:10:18.839986: step: 540/466, loss: 0.39901936054229736 2023-01-22 13:10:19.703217: step: 542/466, loss: 0.10290895402431488 2023-01-22 13:10:20.432194: step: 544/466, loss: 0.2578374445438385 2023-01-22 13:10:21.240926: step: 546/466, loss: 0.34540289640426636 2023-01-22 13:10:21.979927: step: 548/466, loss: 0.12169482558965683 2023-01-22 13:10:22.734974: step: 550/466, loss: 0.20996147394180298 2023-01-22 13:10:23.479279: step: 552/466, loss: 0.14128772914409637 2023-01-22 13:10:24.180686: step: 554/466, loss: 0.21965238451957703 2023-01-22 13:10:24.938377: step: 556/466, loss: 0.09792616218328476 2023-01-22 13:10:25.724855: step: 558/466, loss: 0.10246670246124268 2023-01-22 13:10:26.468413: step: 560/466, loss: 0.0716271847486496 2023-01-22 13:10:27.198644: step: 562/466, loss: 0.10876203328371048 2023-01-22 13:10:27.976668: step: 564/466, loss: 0.08965716511011124 2023-01-22 13:10:28.788564: step: 566/466, loss: 0.2969381511211395 2023-01-22 13:10:29.511390: step: 568/466, loss: 0.10171569138765335 2023-01-22 13:10:30.320460: step: 570/466, loss: 0.2210264503955841 2023-01-22 
13:10:31.031373: step: 572/466, loss: 0.1645101010799408 2023-01-22 13:10:31.741015: step: 574/466, loss: 0.15573878586292267 2023-01-22 13:10:32.622095: step: 576/466, loss: 0.15432459115982056 2023-01-22 13:10:33.432774: step: 578/466, loss: 0.10344555974006653 2023-01-22 13:10:34.147413: step: 580/466, loss: 0.08119934052228928 2023-01-22 13:10:34.937096: step: 582/466, loss: 0.03565558046102524 2023-01-22 13:10:35.656557: step: 584/466, loss: 0.0471508614718914 2023-01-22 13:10:36.403705: step: 586/466, loss: 0.30499204993247986 2023-01-22 13:10:37.138888: step: 588/466, loss: 0.07960199564695358 2023-01-22 13:10:37.885408: step: 590/466, loss: 0.07188259810209274 2023-01-22 13:10:38.635010: step: 592/466, loss: 0.20376189053058624 2023-01-22 13:10:39.437858: step: 594/466, loss: 0.08409322798252106 2023-01-22 13:10:40.217905: step: 596/466, loss: 0.1693294495344162 2023-01-22 13:10:41.026359: step: 598/466, loss: 0.3855529725551605 2023-01-22 13:10:41.750611: step: 600/466, loss: 0.1346471756696701 2023-01-22 13:10:42.523615: step: 602/466, loss: 0.05187489464879036 2023-01-22 13:10:43.219170: step: 604/466, loss: 0.09368216246366501 2023-01-22 13:10:43.984151: step: 606/466, loss: 0.16668803989887238 2023-01-22 13:10:44.810155: step: 608/466, loss: 0.4426323175430298 2023-01-22 13:10:45.531074: step: 610/466, loss: 0.05768425762653351 2023-01-22 13:10:46.370026: step: 612/466, loss: 0.46049514412879944 2023-01-22 13:10:47.115132: step: 614/466, loss: 0.11522240936756134 2023-01-22 13:10:47.888353: step: 616/466, loss: 0.17588311433792114 2023-01-22 13:10:48.692705: step: 618/466, loss: 0.11678693443536758 2023-01-22 13:10:49.416822: step: 620/466, loss: 0.05687686800956726 2023-01-22 13:10:50.187561: step: 622/466, loss: 0.17744433879852295 2023-01-22 13:10:50.934383: step: 624/466, loss: 0.07499007880687714 2023-01-22 13:10:51.630471: step: 626/466, loss: 0.0555715411901474 2023-01-22 13:10:52.428997: step: 628/466, loss: 0.08503931015729904 2023-01-22 
13:10:53.129807: step: 630/466, loss: 0.4157727360725403 2023-01-22 13:10:53.839288: step: 632/466, loss: 0.10179923474788666 2023-01-22 13:10:54.624608: step: 634/466, loss: 0.39125341176986694 2023-01-22 13:10:55.377051: step: 636/466, loss: 0.10476453602313995 2023-01-22 13:10:56.151563: step: 638/466, loss: 0.332133948802948 2023-01-22 13:10:56.983491: step: 640/466, loss: 4.197464942932129 2023-01-22 13:10:57.744356: step: 642/466, loss: 0.11542638391256332 2023-01-22 13:10:58.557386: step: 644/466, loss: 0.17988237738609314 2023-01-22 13:10:59.336284: step: 646/466, loss: 0.40194782614707947 2023-01-22 13:11:00.032273: step: 648/466, loss: 0.24533627927303314 2023-01-22 13:11:00.804836: step: 650/466, loss: 0.23072265088558197 2023-01-22 13:11:01.530781: step: 652/466, loss: 0.08752790838479996 2023-01-22 13:11:02.343071: step: 654/466, loss: 0.3541397154331207 2023-01-22 13:11:03.113108: step: 656/466, loss: 0.11867779493331909 2023-01-22 13:11:03.925011: step: 658/466, loss: 0.4159494936466217 2023-01-22 13:11:04.695566: step: 660/466, loss: 0.07973338663578033 2023-01-22 13:11:05.424701: step: 662/466, loss: 0.24685896933078766 2023-01-22 13:11:06.206551: step: 664/466, loss: 0.07399331033229828 2023-01-22 13:11:06.953577: step: 666/466, loss: 0.18225261569023132 2023-01-22 13:11:07.783618: step: 668/466, loss: 0.08262845873832703 2023-01-22 13:11:08.558307: step: 670/466, loss: 0.10257098078727722 2023-01-22 13:11:09.292862: step: 672/466, loss: 0.3494648039340973 2023-01-22 13:11:10.091377: step: 674/466, loss: 0.21490906178951263 2023-01-22 13:11:10.797359: step: 676/466, loss: 0.1882047951221466 2023-01-22 13:11:11.531637: step: 678/466, loss: 0.08048104494810104 2023-01-22 13:11:12.280223: step: 680/466, loss: 0.12107175588607788 2023-01-22 13:11:13.037881: step: 682/466, loss: 0.10289272665977478 2023-01-22 13:11:13.827859: step: 684/466, loss: 0.09581567347049713 2023-01-22 13:11:14.634440: step: 686/466, loss: 0.39118218421936035 2023-01-22 
13:11:15.380624: step: 688/466, loss: 0.14677421748638153 2023-01-22 13:11:16.140896: step: 690/466, loss: 0.18316148221492767 2023-01-22 13:11:16.903389: step: 692/466, loss: 0.07292038947343826 2023-01-22 13:11:17.640232: step: 694/466, loss: 0.04308658093214035 2023-01-22 13:11:18.372067: step: 696/466, loss: 0.09988658875226974 2023-01-22 13:11:19.111679: step: 698/466, loss: 0.2860415577888489 2023-01-22 13:11:19.909782: step: 700/466, loss: 0.24483168125152588 2023-01-22 13:11:20.744246: step: 702/466, loss: 0.12841112911701202 2023-01-22 13:11:21.564345: step: 704/466, loss: 0.18972891569137573 2023-01-22 13:11:22.375015: step: 706/466, loss: 0.9659270644187927 2023-01-22 13:11:23.121130: step: 708/466, loss: 0.1148655042052269 2023-01-22 13:11:23.922584: step: 710/466, loss: 0.1329548954963684 2023-01-22 13:11:24.713532: step: 712/466, loss: 0.2517092227935791 2023-01-22 13:11:25.517695: step: 714/466, loss: 0.32274332642555237 2023-01-22 13:11:26.326463: step: 716/466, loss: 0.3467109799385071 2023-01-22 13:11:27.037516: step: 718/466, loss: 0.9646264910697937 2023-01-22 13:11:27.872163: step: 720/466, loss: 0.06586325913667679 2023-01-22 13:11:28.655240: step: 722/466, loss: 0.16391877830028534 2023-01-22 13:11:29.373143: step: 724/466, loss: 0.11094661056995392 2023-01-22 13:11:30.085886: step: 726/466, loss: 0.06109832599759102 2023-01-22 13:11:30.872794: step: 728/466, loss: 0.18235573172569275 2023-01-22 13:11:31.618041: step: 730/466, loss: 0.24906662106513977 2023-01-22 13:11:32.403903: step: 732/466, loss: 0.1062256470322609 2023-01-22 13:11:33.287543: step: 734/466, loss: 0.20727825164794922 2023-01-22 13:11:34.036471: step: 736/466, loss: 0.197353795170784 2023-01-22 13:11:34.819204: step: 738/466, loss: 0.13762034475803375 2023-01-22 13:11:35.594750: step: 740/466, loss: 0.2729426622390747 2023-01-22 13:11:36.499578: step: 742/466, loss: 0.2156907469034195 2023-01-22 13:11:37.284150: step: 744/466, loss: 0.42388179898262024 2023-01-22 
13:11:37.980607: step: 746/466, loss: 0.10986457765102386 2023-01-22 13:11:38.757855: step: 748/466, loss: 0.2412114143371582 2023-01-22 13:11:39.500426: step: 750/466, loss: 0.16270828247070312 2023-01-22 13:11:40.337124: step: 752/466, loss: 0.26500195264816284 2023-01-22 13:11:41.002339: step: 754/466, loss: 0.7490832209587097 2023-01-22 13:11:41.745364: step: 756/466, loss: 0.10206577926874161 2023-01-22 13:11:42.561196: step: 758/466, loss: 0.11831139028072357 2023-01-22 13:11:43.306993: step: 760/466, loss: 0.6326764225959778 2023-01-22 13:11:44.089846: step: 762/466, loss: 0.03746001422405243 2023-01-22 13:11:44.812170: step: 764/466, loss: 0.40483012795448303 2023-01-22 13:11:45.572551: step: 766/466, loss: 0.09523598849773407 2023-01-22 13:11:46.342194: step: 768/466, loss: 0.10036483407020569 2023-01-22 13:11:47.079696: step: 770/466, loss: 0.07891888171434402 2023-01-22 13:11:47.920256: step: 772/466, loss: 0.12781232595443726 2023-01-22 13:11:48.752574: step: 774/466, loss: 0.13502788543701172 2023-01-22 13:11:49.509234: step: 776/466, loss: 0.29947882890701294 2023-01-22 13:11:50.290605: step: 778/466, loss: 0.2242499589920044 2023-01-22 13:11:51.021220: step: 780/466, loss: 0.15417909622192383 2023-01-22 13:11:51.783130: step: 782/466, loss: 0.08405672013759613 2023-01-22 13:11:52.469935: step: 784/466, loss: 0.051594078540802 2023-01-22 13:11:53.148185: step: 786/466, loss: 0.05530751869082451 2023-01-22 13:11:53.927685: step: 788/466, loss: 0.09490536153316498 2023-01-22 13:11:54.713901: step: 790/466, loss: 0.25493767857551575 2023-01-22 13:11:55.489699: step: 792/466, loss: 0.09009568393230438 2023-01-22 13:11:56.336341: step: 794/466, loss: 0.39803338050842285 2023-01-22 13:11:57.060124: step: 796/466, loss: 0.10020395368337631 2023-01-22 13:11:57.820904: step: 798/466, loss: 0.16874580085277557 2023-01-22 13:11:58.758169: step: 800/466, loss: 0.18689240515232086 2023-01-22 13:11:59.603472: step: 802/466, loss: 0.26988959312438965 2023-01-22 
13:12:00.395737: step: 804/466, loss: 0.07876058667898178 2023-01-22 13:12:01.130078: step: 806/466, loss: 0.056754451245069504 2023-01-22 13:12:01.794438: step: 808/466, loss: 0.12084650993347168 2023-01-22 13:12:02.532471: step: 810/466, loss: 0.16054895520210266 2023-01-22 13:12:03.282753: step: 812/466, loss: 0.03489147499203682 2023-01-22 13:12:04.092923: step: 814/466, loss: 0.2401704043149948 2023-01-22 13:12:04.836630: step: 816/466, loss: 0.08901477605104446 2023-01-22 13:12:05.533683: step: 818/466, loss: 0.19765466451644897 2023-01-22 13:12:06.316416: step: 820/466, loss: 0.08183137327432632 2023-01-22 13:12:07.030215: step: 822/466, loss: 0.08816594630479813 2023-01-22 13:12:07.814155: step: 824/466, loss: 0.077956922352314 2023-01-22 13:12:08.638897: step: 826/466, loss: 0.036140188574790955 2023-01-22 13:12:09.460299: step: 828/466, loss: 0.1455797553062439 2023-01-22 13:12:10.195857: step: 830/466, loss: 1.6206011772155762 2023-01-22 13:12:10.947824: step: 832/466, loss: 0.5392410159111023 2023-01-22 13:12:11.746156: step: 834/466, loss: 1.0381767749786377 2023-01-22 13:12:12.468281: step: 836/466, loss: 0.02795933187007904 2023-01-22 13:12:13.260630: step: 838/466, loss: 0.041050322353839874 2023-01-22 13:12:14.005929: step: 840/466, loss: 0.17207692563533783 2023-01-22 13:12:14.729721: step: 842/466, loss: 0.17557108402252197 2023-01-22 13:12:15.453912: step: 844/466, loss: 0.10534816980361938 2023-01-22 13:12:16.236215: step: 846/466, loss: 0.13698969781398773 2023-01-22 13:12:17.096610: step: 848/466, loss: 0.3698488473892212 2023-01-22 13:12:17.821038: step: 850/466, loss: 0.14198416471481323 2023-01-22 13:12:18.573649: step: 852/466, loss: 0.14994223415851593 2023-01-22 13:12:19.392982: step: 854/466, loss: 0.7133137583732605 2023-01-22 13:12:20.110637: step: 856/466, loss: 0.1454135626554489 2023-01-22 13:12:20.862144: step: 858/466, loss: 0.15097977221012115 2023-01-22 13:12:21.716964: step: 860/466, loss: 0.05939953401684761 2023-01-22 
13:12:22.540315: step: 862/466, loss: 0.6255478858947754 2023-01-22 13:12:23.332823: step: 864/466, loss: 0.21900486946105957 2023-01-22 13:12:24.179862: step: 866/466, loss: 0.14697346091270447 2023-01-22 13:12:25.140023: step: 868/466, loss: 0.13551542162895203 2023-01-22 13:12:25.988131: step: 870/466, loss: 0.13633784651756287 2023-01-22 13:12:26.828609: step: 872/466, loss: 0.22378049790859222 2023-01-22 13:12:27.612492: step: 874/466, loss: 0.290413498878479 2023-01-22 13:12:28.349124: step: 876/466, loss: 0.35654738545417786 2023-01-22 13:12:29.125884: step: 878/466, loss: 0.27310025691986084 2023-01-22 13:12:29.932096: step: 880/466, loss: 0.09667938202619553 2023-01-22 13:12:30.742952: step: 882/466, loss: 0.17184355854988098 2023-01-22 13:12:31.580099: step: 884/466, loss: 0.1590942144393921 2023-01-22 13:12:32.316137: step: 886/466, loss: 0.1663396954536438 2023-01-22 13:12:33.166708: step: 888/466, loss: 0.18245753645896912 2023-01-22 13:12:33.950798: step: 890/466, loss: 0.10235494375228882 2023-01-22 13:12:34.732480: step: 892/466, loss: 0.09549881517887115 2023-01-22 13:12:35.521843: step: 894/466, loss: 0.04158993065357208 2023-01-22 13:12:36.268061: step: 896/466, loss: 2.234426975250244 2023-01-22 13:12:37.025068: step: 898/466, loss: 0.29457899928092957 2023-01-22 13:12:37.833900: step: 900/466, loss: 0.21517448127269745 2023-01-22 13:12:38.573829: step: 902/466, loss: 0.06916851550340652 2023-01-22 13:12:39.360923: step: 904/466, loss: 0.21449489891529083 2023-01-22 13:12:40.270959: step: 906/466, loss: 0.15825435519218445 2023-01-22 13:12:41.059287: step: 908/466, loss: 0.09181555360555649 2023-01-22 13:12:41.872578: step: 910/466, loss: 0.17901822924613953 2023-01-22 13:12:42.630744: step: 912/466, loss: 0.25450900197029114 2023-01-22 13:12:43.417427: step: 914/466, loss: 0.06662459671497345 2023-01-22 13:12:44.292035: step: 916/466, loss: 0.06792336702346802 2023-01-22 13:12:45.081831: step: 918/466, loss: 0.10544149577617645 2023-01-22 
13:12:45.893171: step: 920/466, loss: 0.13937614858150482 2023-01-22 13:12:46.716536: step: 922/466, loss: 0.11666586250066757 2023-01-22 13:12:47.464576: step: 924/466, loss: 0.14547590911388397 2023-01-22 13:12:48.193000: step: 926/466, loss: 0.11564742773771286 2023-01-22 13:12:48.969420: step: 928/466, loss: 0.08576952666044235 2023-01-22 13:12:49.702738: step: 930/466, loss: 0.611613392829895 2023-01-22 13:12:50.437382: step: 932/466, loss: 0.5802958607673645
==================================================
Loss: 0.249
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3082034161490683, 'r': 0.3362750745459474, 'f1': 0.32162788436608764}, 'combined': 0.2369889674276435, 'epoch': 12}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.36108819526876657, 'r': 0.3004903493499597, 'f1': 0.32801401685415804}, 'combined': 0.20160861523718981, 'epoch': 12}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2809808249197547, 'r': 0.345494448857687, 'f1': 0.3099158715710656}, 'combined': 0.22835906326289043, 'epoch': 12}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3402471714380789, 'r': 0.30195095300781893, 'f1': 0.3199572024991109}, 'combined': 0.1966566220238438, 'epoch': 12}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31437067474048447, 'r': 0.3447936432637571, 'f1': 0.3288800904977376}, 'combined': 0.24233269826149084, 'epoch': 12}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3538798410677973, 'r': 0.29576647045165083, 'f1': 0.32222392308150655}, 'combined': 0.19902065837387173, 'epoch': 12}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2586734693877551, 'r': 0.36214285714285716, 'f1': 0.3017857142857143}, 'combined': 0.2011904761904762, 'epoch': 12}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26666666666666666, 'r': 0.5217391304347826, 'f1': 0.3529411764705882}, 'combined': 0.1764705882352941, 'epoch': 12}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 12}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 13
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:15:36.465587: step: 2/466, loss: 0.382373571395874 2023-01-22 13:15:37.246343: step: 4/466, loss: 0.1019504964351654 2023-01-22 13:15:38.180922: step: 6/466, loss: 0.0512908473610878 2023-01-22 13:15:38.981693: step: 8/466, loss: 0.2527649700641632 2023-01-22 13:15:39.761895: step: 10/466, loss: 0.5246418714523315 2023-01-22 13:15:40.582598: step: 12/466, loss: 0.10989820212125778 2023-01-22 13:15:41.307015: step: 14/466, loss: 0.05986408516764641 2023-01-22 13:15:42.039751: step: 16/466, loss: 0.18218691647052765 2023-01-22 13:15:42.824875: step: 18/466, loss: 0.0748194083571434 2023-01-22 13:15:43.592842: step: 20/466, loss: 0.2327689379453659 2023-01-22 13:15:44.393615: step: 22/466, loss: 0.4737565815448761 2023-01-22 13:15:45.176382: step: 24/466, loss: 0.10894951969385147 2023-01-22 13:15:46.001561: step: 26/466, loss: 0.17812326550483704 2023-01-22 13:15:46.825611: step: 28/466, loss: 0.12489855289459229 2023-01-22 13:15:47.576121: step: 30/466, loss: 0.13165737688541412 2023-01-22 13:15:48.316707: step: 32/466, loss: 0.12805214524269104 2023-01-22 
13:15:49.006941: step: 34/466, loss: 0.07385516166687012 2023-01-22 13:15:49.724869: step: 36/466, loss: 0.037543077021837234 2023-01-22 13:15:50.470564: step: 38/466, loss: 0.1814260482788086 2023-01-22 13:15:51.281177: step: 40/466, loss: 0.07305333018302917 2023-01-22 13:15:52.026132: step: 42/466, loss: 0.20866356790065765 2023-01-22 13:15:52.797808: step: 44/466, loss: 0.08367782831192017 2023-01-22 13:15:53.546473: step: 46/466, loss: 0.14819931983947754 2023-01-22 13:15:54.379878: step: 48/466, loss: 0.09913279116153717 2023-01-22 13:15:55.151602: step: 50/466, loss: 0.3708493113517761 2023-01-22 13:15:55.903679: step: 52/466, loss: 0.06336794793605804 2023-01-22 13:15:56.650990: step: 54/466, loss: 0.11002914607524872 2023-01-22 13:15:57.432324: step: 56/466, loss: 0.2921782433986664 2023-01-22 13:15:58.216967: step: 58/466, loss: 0.09743791818618774 2023-01-22 13:15:58.945370: step: 60/466, loss: 0.13039200007915497 2023-01-22 13:15:59.659472: step: 62/466, loss: 0.23460814356803894 2023-01-22 13:16:00.453246: step: 64/466, loss: 0.08859889954328537 2023-01-22 13:16:01.210836: step: 66/466, loss: 0.7760970592498779 2023-01-22 13:16:01.947078: step: 68/466, loss: 0.18919910490512848 2023-01-22 13:16:02.790152: step: 70/466, loss: 0.04972463101148605 2023-01-22 13:16:03.493177: step: 72/466, loss: 0.13533838093280792 2023-01-22 13:16:04.178940: step: 74/466, loss: 0.2750799357891083 2023-01-22 13:16:04.864007: step: 76/466, loss: 0.09845469892024994 2023-01-22 13:16:05.601286: step: 78/466, loss: 0.05216136947274208 2023-01-22 13:16:06.363354: step: 80/466, loss: 0.12190327048301697 2023-01-22 13:16:07.114026: step: 82/466, loss: 0.11941462010145187 2023-01-22 13:16:07.791798: step: 84/466, loss: 0.14103659987449646 2023-01-22 13:16:08.512803: step: 86/466, loss: 0.13708628714084625 2023-01-22 13:16:09.233141: step: 88/466, loss: 0.5550259947776794 2023-01-22 13:16:09.910081: step: 90/466, loss: 0.10664961487054825 2023-01-22 13:16:10.619716: step: 92/466, 
loss: 0.19297471642494202 2023-01-22 13:16:11.376092: step: 94/466, loss: 0.1483359932899475 2023-01-22 13:16:12.084531: step: 96/466, loss: 0.128028005361557 2023-01-22 13:16:12.861896: step: 98/466, loss: 0.17673911154270172 2023-01-22 13:16:13.780726: step: 100/466, loss: 0.11707588285207748 2023-01-22 13:16:14.592896: step: 102/466, loss: 0.38031941652297974 2023-01-22 13:16:15.333951: step: 104/466, loss: 0.06883509457111359 2023-01-22 13:16:16.123368: step: 106/466, loss: 0.20923328399658203 2023-01-22 13:16:16.754812: step: 108/466, loss: 0.08676006644964218 2023-01-22 13:16:17.545441: step: 110/466, loss: 0.04772452265024185 2023-01-22 13:16:18.305043: step: 112/466, loss: 0.6443043947219849 2023-01-22 13:16:19.125077: step: 114/466, loss: 0.12380842864513397 2023-01-22 13:16:19.856675: step: 116/466, loss: 0.09070102870464325 2023-01-22 13:16:20.575933: step: 118/466, loss: 0.07567242532968521 2023-01-22 13:16:21.338078: step: 120/466, loss: 0.12793821096420288 2023-01-22 13:16:22.056131: step: 122/466, loss: 0.11282803863286972 2023-01-22 13:16:22.769650: step: 124/466, loss: 0.049191106110811234 2023-01-22 13:16:23.607157: step: 126/466, loss: 0.3585178554058075 2023-01-22 13:16:24.326564: step: 128/466, loss: 0.25831490755081177 2023-01-22 13:16:25.119560: step: 130/466, loss: 0.07179155200719833 2023-01-22 13:16:25.943271: step: 132/466, loss: 0.0197906531393528 2023-01-22 13:16:26.650976: step: 134/466, loss: 0.03828202560544014 2023-01-22 13:16:27.407083: step: 136/466, loss: 0.10379919409751892 2023-01-22 13:16:28.191651: step: 138/466, loss: 0.16227275133132935 2023-01-22 13:16:28.929300: step: 140/466, loss: 0.17869989573955536 2023-01-22 13:16:29.693558: step: 142/466, loss: 0.16198000311851501 2023-01-22 13:16:30.441695: step: 144/466, loss: 0.08363251388072968 2023-01-22 13:16:31.267127: step: 146/466, loss: 1.946467399597168 2023-01-22 13:16:32.008512: step: 148/466, loss: 0.1306145340204239 2023-01-22 13:16:32.758355: step: 150/466, loss: 
0.162128284573555 2023-01-22 13:16:33.435169: step: 152/466, loss: 0.1288970559835434 2023-01-22 13:16:34.208229: step: 154/466, loss: 3.0442185401916504 2023-01-22 13:16:34.928582: step: 156/466, loss: 0.08906519412994385 2023-01-22 13:16:35.604125: step: 158/466, loss: 0.14633415639400482 2023-01-22 13:16:36.276664: step: 160/466, loss: 0.0988697037100792 2023-01-22 13:16:37.015081: step: 162/466, loss: 0.17721976339817047 2023-01-22 13:16:37.878682: step: 164/466, loss: 0.14720726013183594 2023-01-22 13:16:38.681272: step: 166/466, loss: 0.07897642999887466 2023-01-22 13:16:39.475501: step: 168/466, loss: 0.050784893333911896 2023-01-22 13:16:40.228569: step: 170/466, loss: 0.09608414769172668 2023-01-22 13:16:41.010754: step: 172/466, loss: 0.29185426235198975 2023-01-22 13:16:41.766215: step: 174/466, loss: 0.36591893434524536 2023-01-22 13:16:42.610771: step: 176/466, loss: 0.19042915105819702 2023-01-22 13:16:43.443083: step: 178/466, loss: 0.41831472516059875 2023-01-22 13:16:44.224099: step: 180/466, loss: 0.04827124997973442 2023-01-22 13:16:44.948991: step: 182/466, loss: 0.20153003931045532 2023-01-22 13:16:45.686227: step: 184/466, loss: 0.07360716909170151 2023-01-22 13:16:46.426039: step: 186/466, loss: 0.05430116504430771 2023-01-22 13:16:47.262137: step: 188/466, loss: 0.21173177659511566 2023-01-22 13:16:47.956612: step: 190/466, loss: 0.09703905880451202 2023-01-22 13:16:48.778023: step: 192/466, loss: 0.10637550055980682 2023-01-22 13:16:49.629734: step: 194/466, loss: 0.1917353719472885 2023-01-22 13:16:50.438497: step: 196/466, loss: 0.04204834625124931 2023-01-22 13:16:51.234633: step: 198/466, loss: 0.11851033568382263 2023-01-22 13:16:51.960847: step: 200/466, loss: 0.2094639390707016 2023-01-22 13:16:52.726622: step: 202/466, loss: 0.02567223832011223 2023-01-22 13:16:53.495532: step: 204/466, loss: 0.11470548808574677 2023-01-22 13:16:54.318101: step: 206/466, loss: 0.18110646307468414 2023-01-22 13:16:55.087780: step: 208/466, loss: 
0.041296664625406265 2023-01-22 13:16:55.836720: step: 210/466, loss: 0.15020208060741425 2023-01-22 13:16:56.607366: step: 212/466, loss: 0.16802366077899933 2023-01-22 13:16:57.447653: step: 214/466, loss: 0.06758114695549011 2023-01-22 13:16:58.224192: step: 216/466, loss: 0.2651681900024414 2023-01-22 13:16:58.978342: step: 218/466, loss: 0.37192392349243164 2023-01-22 13:16:59.745221: step: 220/466, loss: 0.1015489399433136 2023-01-22 13:17:00.461486: step: 222/466, loss: 0.15365496277809143 2023-01-22 13:17:01.264878: step: 224/466, loss: 0.11318421363830566 2023-01-22 13:17:02.077795: step: 226/466, loss: 0.3373357653617859 2023-01-22 13:17:02.855429: step: 228/466, loss: 0.24359995126724243 2023-01-22 13:17:03.573680: step: 230/466, loss: 0.41684919595718384 2023-01-22 13:17:04.293192: step: 232/466, loss: 0.015662597492337227 2023-01-22 13:17:05.068321: step: 234/466, loss: 0.12667682766914368 2023-01-22 13:17:05.842075: step: 236/466, loss: 0.192781463265419 2023-01-22 13:17:06.629194: step: 238/466, loss: 0.06522570550441742 2023-01-22 13:17:07.400260: step: 240/466, loss: 0.13598352670669556 2023-01-22 13:17:08.177764: step: 242/466, loss: 0.04600071534514427 2023-01-22 13:17:08.904185: step: 244/466, loss: 0.07195093482732773 2023-01-22 13:17:09.693349: step: 246/466, loss: 0.1215813159942627 2023-01-22 13:17:10.411544: step: 248/466, loss: 0.07753612101078033 2023-01-22 13:17:11.194504: step: 250/466, loss: 0.13835875689983368 2023-01-22 13:17:11.992216: step: 252/466, loss: 0.20455078780651093 2023-01-22 13:17:12.724641: step: 254/466, loss: 0.6708722114562988 2023-01-22 13:17:13.424920: step: 256/466, loss: 0.12679585814476013 2023-01-22 13:17:14.121837: step: 258/466, loss: 0.13701783120632172 2023-01-22 13:17:14.866919: step: 260/466, loss: 0.35117268562316895 2023-01-22 13:17:15.621439: step: 262/466, loss: 1.0611485242843628 2023-01-22 13:17:16.433569: step: 264/466, loss: 0.03565460443496704 2023-01-22 13:17:17.256873: step: 266/466, loss: 
0.11983101069927216 2023-01-22 13:17:17.961723: step: 268/466, loss: 0.11047709733247757 2023-01-22 13:17:18.680778: step: 270/466, loss: 0.5995880961418152 2023-01-22 13:17:19.386537: step: 272/466, loss: 0.07326005399227142 2023-01-22 13:17:20.154081: step: 274/466, loss: 0.039359912276268005 2023-01-22 13:17:20.937937: step: 276/466, loss: 0.038333382457494736 2023-01-22 13:17:21.675102: step: 278/466, loss: 0.1003439798951149 2023-01-22 13:17:22.520437: step: 280/466, loss: 0.5510913729667664 2023-01-22 13:17:23.296740: step: 282/466, loss: 0.1375330090522766 2023-01-22 13:17:24.030653: step: 284/466, loss: 0.09970781207084656 2023-01-22 13:17:24.785573: step: 286/466, loss: 0.08100002259016037 2023-01-22 13:17:25.523981: step: 288/466, loss: 0.24892517924308777 2023-01-22 13:17:26.371947: step: 290/466, loss: 0.14371928572654724 2023-01-22 13:17:27.136327: step: 292/466, loss: 0.22054694592952728 2023-01-22 13:17:27.917646: step: 294/466, loss: 0.06324626505374908 2023-01-22 13:17:28.760519: step: 296/466, loss: 0.2058716118335724 2023-01-22 13:17:29.487611: step: 298/466, loss: 0.0764944925904274 2023-01-22 13:17:30.236256: step: 300/466, loss: 0.17809653282165527 2023-01-22 13:17:31.003978: step: 302/466, loss: 0.10286303609609604 2023-01-22 13:17:31.740600: step: 304/466, loss: 0.14660319685935974 2023-01-22 13:17:32.651359: step: 306/466, loss: 9.334784507751465 2023-01-22 13:17:33.387932: step: 308/466, loss: 1.5064904689788818 2023-01-22 13:17:34.235293: step: 310/466, loss: 0.13779456913471222 2023-01-22 13:17:34.954569: step: 312/466, loss: 0.13367831707000732 2023-01-22 13:17:35.792833: step: 314/466, loss: 0.08212552964687347 2023-01-22 13:17:36.572528: step: 316/466, loss: 4.54255485534668 2023-01-22 13:17:37.387066: step: 318/466, loss: 0.23854738473892212 2023-01-22 13:17:38.184572: step: 320/466, loss: 0.17678216099739075 2023-01-22 13:17:38.865237: step: 322/466, loss: 0.42571911215782166 2023-01-22 13:17:39.636514: step: 324/466, loss: 
0.25691738724708557 2023-01-22 13:17:40.407838: step: 326/466, loss: 0.18332761526107788 2023-01-22 13:17:41.193213: step: 328/466, loss: 0.12483546882867813 2023-01-22 13:17:41.928264: step: 330/466, loss: 0.02915043570101261 2023-01-22 13:17:42.643144: step: 332/466, loss: 0.15927720069885254 2023-01-22 13:17:43.434756: step: 334/466, loss: 0.23633554577827454 2023-01-22 13:17:44.233268: step: 336/466, loss: 0.16368862986564636 2023-01-22 13:17:44.989047: step: 338/466, loss: 0.07220807671546936 2023-01-22 13:17:45.776842: step: 340/466, loss: 0.20814989507198334 2023-01-22 13:17:46.513746: step: 342/466, loss: 0.2206103801727295 2023-01-22 13:17:47.287621: step: 344/466, loss: 0.14292998611927032 2023-01-22 13:17:47.992449: step: 346/466, loss: 0.12245476245880127 2023-01-22 13:17:48.785437: step: 348/466, loss: 0.12565559148788452 2023-01-22 13:17:49.458668: step: 350/466, loss: 1.2255115509033203 2023-01-22 13:17:50.257478: step: 352/466, loss: 0.05467798188328743 2023-01-22 13:17:51.074754: step: 354/466, loss: 0.1506376415491104 2023-01-22 13:17:51.799111: step: 356/466, loss: 0.7821328639984131 2023-01-22 13:17:52.587606: step: 358/466, loss: 0.048696957528591156 2023-01-22 13:17:53.406599: step: 360/466, loss: 0.17461726069450378 2023-01-22 13:17:54.199272: step: 362/466, loss: 0.19472767412662506 2023-01-22 13:17:54.936379: step: 364/466, loss: 0.1759956032037735 2023-01-22 13:17:55.627649: step: 366/466, loss: 0.040069859474897385 2023-01-22 13:17:56.386490: step: 368/466, loss: 0.26998400688171387 2023-01-22 13:17:57.115953: step: 370/466, loss: 0.35295793414115906 2023-01-22 13:17:57.866361: step: 372/466, loss: 0.1004018560051918 2023-01-22 13:17:58.721874: step: 374/466, loss: 0.26358577609062195 2023-01-22 13:17:59.525917: step: 376/466, loss: 0.17438524961471558 2023-01-22 13:18:00.237544: step: 378/466, loss: 0.06375767290592194 2023-01-22 13:18:01.051890: step: 380/466, loss: 0.39776426553726196 2023-01-22 13:18:01.816364: step: 382/466, loss: 
0.049622464925050735 2023-01-22 13:18:02.572982: step: 384/466, loss: 0.0817975103855133 2023-01-22 13:18:03.558976: step: 386/466, loss: 0.08786547183990479 2023-01-22 13:18:04.355107: step: 388/466, loss: 0.14453229308128357 2023-01-22 13:18:05.073522: step: 390/466, loss: 0.12139902263879776 2023-01-22 13:18:05.785628: step: 392/466, loss: 0.08930321037769318 2023-01-22 13:18:06.501568: step: 394/466, loss: 0.19578050076961517 2023-01-22 13:18:07.277274: step: 396/466, loss: 0.07919494807720184 2023-01-22 13:18:08.088537: step: 398/466, loss: 0.11601810902357101 2023-01-22 13:18:08.896830: step: 400/466, loss: 0.19614189863204956 2023-01-22 13:18:09.645371: step: 402/466, loss: 0.1866857409477234 2023-01-22 13:18:10.421600: step: 404/466, loss: 0.12435347586870193 2023-01-22 13:18:11.152911: step: 406/466, loss: 0.07915801554918289 2023-01-22 13:18:11.824261: step: 408/466, loss: 0.07421186566352844 2023-01-22 13:18:12.605345: step: 410/466, loss: 0.08340541273355484 2023-01-22 13:18:13.368002: step: 412/466, loss: 1.495497703552246 2023-01-22 13:18:14.240473: step: 414/466, loss: 1.0178102254867554 2023-01-22 13:18:15.020687: step: 416/466, loss: 0.09015622735023499 2023-01-22 13:18:15.773895: step: 418/466, loss: 0.10119029134511948 2023-01-22 13:18:16.455713: step: 420/466, loss: 0.04814134165644646 2023-01-22 13:18:17.210748: step: 422/466, loss: 0.07919945567846298 2023-01-22 13:18:17.945120: step: 424/466, loss: 0.25124919414520264 2023-01-22 13:18:18.728076: step: 426/466, loss: 0.10618660598993301 2023-01-22 13:18:19.486341: step: 428/466, loss: 0.2987133860588074 2023-01-22 13:18:20.299003: step: 430/466, loss: 0.08233465254306793 2023-01-22 13:18:21.027379: step: 432/466, loss: 0.7407652139663696 2023-01-22 13:18:21.816428: step: 434/466, loss: 0.024903899058699608 2023-01-22 13:18:22.634839: step: 436/466, loss: 0.17847418785095215 2023-01-22 13:18:23.427374: step: 438/466, loss: 0.17466199398040771 2023-01-22 13:18:24.216031: step: 440/466, loss: 
0.1522059440612793 2023-01-22 13:18:24.996793: step: 442/466, loss: 0.07020722329616547 2023-01-22 13:18:25.955854: step: 444/466, loss: 0.7656755447387695 2023-01-22 13:18:26.754251: step: 446/466, loss: 0.06377051770687103 2023-01-22 13:18:27.461012: step: 448/466, loss: 0.1730051189661026 2023-01-22 13:18:28.198068: step: 450/466, loss: 0.09168146550655365 2023-01-22 13:18:28.978483: step: 452/466, loss: 0.1671973019838333 2023-01-22 13:18:29.774339: step: 454/466, loss: 0.39160823822021484 2023-01-22 13:18:30.603988: step: 456/466, loss: 0.06910758465528488 2023-01-22 13:18:31.361353: step: 458/466, loss: 0.24151785671710968 2023-01-22 13:18:32.126453: step: 460/466, loss: 0.24184173345565796 2023-01-22 13:18:32.912704: step: 462/466, loss: 0.06884318590164185 2023-01-22 13:18:33.740949: step: 464/466, loss: 0.18182960152626038 2023-01-22 13:18:34.486599: step: 466/466, loss: 0.18162575364112854 2023-01-22 13:18:35.395581: step: 468/466, loss: 0.062338173389434814 2023-01-22 13:18:36.129942: step: 470/466, loss: 0.13774867355823517 2023-01-22 13:18:36.877954: step: 472/466, loss: 0.06745638698339462 2023-01-22 13:18:37.727220: step: 474/466, loss: 0.04645160213112831 2023-01-22 13:18:38.475502: step: 476/466, loss: 0.14952293038368225 2023-01-22 13:18:39.262795: step: 478/466, loss: 0.09848786145448685 2023-01-22 13:18:40.029242: step: 480/466, loss: 0.2773403525352478 2023-01-22 13:18:40.725255: step: 482/466, loss: 0.03563261032104492 2023-01-22 13:18:41.441647: step: 484/466, loss: 0.30415090918540955 2023-01-22 13:18:42.377082: step: 486/466, loss: 0.44185060262680054 2023-01-22 13:18:43.164670: step: 488/466, loss: 0.6004049777984619 2023-01-22 13:18:43.907677: step: 490/466, loss: 0.5131444334983826 2023-01-22 13:18:44.676804: step: 492/466, loss: 0.11282426118850708 2023-01-22 13:18:45.424390: step: 494/466, loss: 0.07206586748361588 2023-01-22 13:18:46.151061: step: 496/466, loss: 0.11763086169958115 2023-01-22 13:18:46.862463: step: 498/466, loss: 
0.17410100996494293 2023-01-22 13:18:47.667301: step: 500/466, loss: 0.4359200894832611 2023-01-22 13:18:48.333116: step: 502/466, loss: 0.10102607309818268 2023-01-22 13:18:49.116705: step: 504/466, loss: 0.19833999872207642 2023-01-22 13:18:49.925861: step: 506/466, loss: 0.2305871993303299 2023-01-22 13:18:50.672270: step: 508/466, loss: 0.10187660902738571 2023-01-22 13:18:51.388821: step: 510/466, loss: 0.25623825192451477 2023-01-22 13:18:52.178883: step: 512/466, loss: 0.07170939445495605 2023-01-22 13:18:52.952052: step: 514/466, loss: 0.1135045662522316 2023-01-22 13:18:53.717562: step: 516/466, loss: 0.26936087012290955 2023-01-22 13:18:54.573042: step: 518/466, loss: 1.3775463104248047 2023-01-22 13:18:55.439631: step: 520/466, loss: 0.16075366735458374 2023-01-22 13:18:56.234259: step: 522/466, loss: 0.11880484968423843 2023-01-22 13:18:56.939132: step: 524/466, loss: 0.2878868877887726 2023-01-22 13:18:57.685129: step: 526/466, loss: 0.09803323447704315 2023-01-22 13:18:58.402053: step: 528/466, loss: 1.1583340167999268 2023-01-22 13:18:59.184209: step: 530/466, loss: 0.11727601289749146 2023-01-22 13:18:59.937906: step: 532/466, loss: 0.15182524919509888 2023-01-22 13:19:00.851132: step: 534/466, loss: 0.056869395077228546 2023-01-22 13:19:01.613404: step: 536/466, loss: 0.34635502099990845 2023-01-22 13:19:02.277570: step: 538/466, loss: 0.060244232416152954 2023-01-22 13:19:03.099532: step: 540/466, loss: 0.48993760347366333 2023-01-22 13:19:03.912385: step: 542/466, loss: 0.1992948055267334 2023-01-22 13:19:04.719762: step: 544/466, loss: 0.08805844187736511 2023-01-22 13:19:05.477062: step: 546/466, loss: 1.0619785785675049 2023-01-22 13:19:06.241762: step: 548/466, loss: 0.0847192108631134 2023-01-22 13:19:07.023487: step: 550/466, loss: 0.2873402535915375 2023-01-22 13:19:07.730716: step: 552/466, loss: 0.1951029747724533 2023-01-22 13:19:08.529107: step: 554/466, loss: 0.401314914226532 2023-01-22 13:19:09.346750: step: 556/466, loss: 
0.26130008697509766 2023-01-22 13:19:10.093590: step: 558/466, loss: 0.3713858425617218 2023-01-22 13:19:10.814707: step: 560/466, loss: 0.12568676471710205 2023-01-22 13:19:11.569128: step: 562/466, loss: 0.40286916494369507 2023-01-22 13:19:12.443496: step: 564/466, loss: 0.06690298020839691 2023-01-22 13:19:13.125844: step: 566/466, loss: 0.2356967180967331 2023-01-22 13:19:13.946754: step: 568/466, loss: 0.08960768580436707 2023-01-22 13:19:14.707463: step: 570/466, loss: 0.4388246536254883 2023-01-22 13:19:15.414965: step: 572/466, loss: 0.17967738211154938 2023-01-22 13:19:16.246484: step: 574/466, loss: 0.08611579239368439 2023-01-22 13:19:16.976372: step: 576/466, loss: 0.12125842273235321 2023-01-22 13:19:17.759980: step: 578/466, loss: 0.2086947113275528 2023-01-22 13:19:18.493845: step: 580/466, loss: 0.096670962870121 2023-01-22 13:19:19.210619: step: 582/466, loss: 0.12036548554897308 2023-01-22 13:19:19.861364: step: 584/466, loss: 0.024983666837215424 2023-01-22 13:19:20.659829: step: 586/466, loss: 0.3695588707923889 2023-01-22 13:19:21.406935: step: 588/466, loss: 0.16849878430366516 2023-01-22 13:19:22.190656: step: 590/466, loss: 0.15996284782886505 2023-01-22 13:19:23.064630: step: 592/466, loss: 0.08652857691049576 2023-01-22 13:19:23.904672: step: 594/466, loss: 0.0542147234082222 2023-01-22 13:19:24.815118: step: 596/466, loss: 0.034066833555698395 2023-01-22 13:19:25.533427: step: 598/466, loss: 0.21105864644050598 2023-01-22 13:19:26.350469: step: 600/466, loss: 0.1679028868675232 2023-01-22 13:19:27.062262: step: 602/466, loss: 0.27978837490081787 2023-01-22 13:19:27.829667: step: 604/466, loss: 0.5022321343421936 2023-01-22 13:19:28.595150: step: 606/466, loss: 0.10941528528928757 2023-01-22 13:19:29.337354: step: 608/466, loss: 0.14708594977855682 2023-01-22 13:19:30.023301: step: 610/466, loss: 0.09767206758260727 2023-01-22 13:19:30.754923: step: 612/466, loss: 0.0527835339307785 2023-01-22 13:19:31.512199: step: 614/466, loss: 
0.1914864331483841 2023-01-22 13:19:32.289850: step: 616/466, loss: 0.10214617848396301 2023-01-22 13:19:32.984981: step: 618/466, loss: 3.6446030139923096 2023-01-22 13:19:33.744532: step: 620/466, loss: 0.04150984436273575 2023-01-22 13:19:34.525815: step: 622/466, loss: 0.8228867650032043 2023-01-22 13:19:35.266990: step: 624/466, loss: 0.07708427309989929 2023-01-22 13:19:35.986898: step: 626/466, loss: 0.6810038685798645 2023-01-22 13:19:36.717126: step: 628/466, loss: 6.378917217254639 2023-01-22 13:19:37.512928: step: 630/466, loss: 0.3479134738445282 2023-01-22 13:19:38.258846: step: 632/466, loss: 0.06563637405633926 2023-01-22 13:19:39.070777: step: 634/466, loss: 0.29593950510025024 2023-01-22 13:19:39.795457: step: 636/466, loss: 0.5084291696548462 2023-01-22 13:19:40.521777: step: 638/466, loss: 0.21293936669826508 2023-01-22 13:19:41.231127: step: 640/466, loss: 0.05789226293563843 2023-01-22 13:19:42.040886: step: 642/466, loss: 0.2178761065006256 2023-01-22 13:19:42.870943: step: 644/466, loss: 0.18674539029598236 2023-01-22 13:19:43.546067: step: 646/466, loss: 0.28289663791656494 2023-01-22 13:19:44.318753: step: 648/466, loss: 0.05782013759016991 2023-01-22 13:19:45.078360: step: 650/466, loss: 0.12950697541236877 2023-01-22 13:19:45.937190: step: 652/466, loss: 0.044820286333560944 2023-01-22 13:19:46.740442: step: 654/466, loss: 0.08404266089200974 2023-01-22 13:19:47.466523: step: 656/466, loss: 0.1321253478527069 2023-01-22 13:19:48.158045: step: 658/466, loss: 0.16074199974536896 2023-01-22 13:19:48.889820: step: 660/466, loss: 0.26025640964508057 2023-01-22 13:19:49.711617: step: 662/466, loss: 0.7558488845825195 2023-01-22 13:19:50.438982: step: 664/466, loss: 0.05508873984217644 2023-01-22 13:19:51.243661: step: 666/466, loss: 0.5754992365837097 2023-01-22 13:19:52.008318: step: 668/466, loss: 0.10862943530082703 2023-01-22 13:19:52.853877: step: 670/466, loss: 0.2049148827791214 2023-01-22 13:19:53.613213: step: 672/466, loss: 
0.11638084799051285 2023-01-22 13:19:54.429351: step: 674/466, loss: 0.3473494052886963 2023-01-22 13:19:55.161676: step: 676/466, loss: 0.03944385051727295 2023-01-22 13:19:55.990658: step: 678/466, loss: 0.19364267587661743 2023-01-22 13:19:56.776949: step: 680/466, loss: 0.2050262987613678 2023-01-22 13:19:57.498976: step: 682/466, loss: 0.12509407103061676 2023-01-22 13:19:58.271972: step: 684/466, loss: 0.3392498791217804 2023-01-22 13:19:59.069975: step: 686/466, loss: 0.40512919425964355 2023-01-22 13:19:59.808090: step: 688/466, loss: 0.22947798669338226 2023-01-22 13:20:00.514985: step: 690/466, loss: 0.08351828902959824 2023-01-22 13:20:01.300660: step: 692/466, loss: 0.33065587282180786 2023-01-22 13:20:02.045432: step: 694/466, loss: 0.09166646748781204 2023-01-22 13:20:02.778594: step: 696/466, loss: 0.18775784969329834 2023-01-22 13:20:03.549985: step: 698/466, loss: 0.15353037416934967 2023-01-22 13:20:04.272538: step: 700/466, loss: 0.0546298585832119 2023-01-22 13:20:05.030214: step: 702/466, loss: 0.12193844467401505 2023-01-22 13:20:05.763891: step: 704/466, loss: 0.0375572144985199 2023-01-22 13:20:06.539273: step: 706/466, loss: 0.07901520282030106 2023-01-22 13:20:07.277341: step: 708/466, loss: 0.1349632441997528 2023-01-22 13:20:08.085793: step: 710/466, loss: 0.30271032452583313 2023-01-22 13:20:08.750655: step: 712/466, loss: 0.05177057161927223 2023-01-22 13:20:09.531778: step: 714/466, loss: 0.1005883440375328 2023-01-22 13:20:10.292900: step: 716/466, loss: 0.07584855705499649 2023-01-22 13:20:11.085485: step: 718/466, loss: 0.13204872608184814 2023-01-22 13:20:11.824426: step: 720/466, loss: 0.03609362989664078 2023-01-22 13:20:12.527063: step: 722/466, loss: 0.012614982202649117 2023-01-22 13:20:13.282895: step: 724/466, loss: 0.08021419495344162 2023-01-22 13:20:13.999528: step: 726/466, loss: 0.2547991871833801 2023-01-22 13:20:14.768301: step: 728/466, loss: 0.9453699588775635 2023-01-22 13:20:15.563443: step: 730/466, loss: 
0.25624772906303406 2023-01-22 13:20:16.373656: step: 732/466, loss: 0.1573476493358612 2023-01-22 13:20:17.267256: step: 734/466, loss: 0.14874303340911865 2023-01-22 13:20:18.066093: step: 736/466, loss: 0.11184926331043243 2023-01-22 13:20:18.792205: step: 738/466, loss: 0.2025962769985199 2023-01-22 13:20:19.507752: step: 740/466, loss: 0.04139607027173042 2023-01-22 13:20:20.259704: step: 742/466, loss: 0.08207973837852478 2023-01-22 13:20:20.996209: step: 744/466, loss: 0.18964071571826935 2023-01-22 13:20:21.783820: step: 746/466, loss: 0.3260148763656616 2023-01-22 13:20:22.517106: step: 748/466, loss: 0.13423208892345428 2023-01-22 13:20:23.253112: step: 750/466, loss: 0.06044828146696091 2023-01-22 13:20:24.036588: step: 752/466, loss: 0.2879416346549988 2023-01-22 13:20:24.787993: step: 754/466, loss: 0.08656201511621475 2023-01-22 13:20:25.531787: step: 756/466, loss: 0.14122888445854187 2023-01-22 13:20:26.289359: step: 758/466, loss: 0.07630780339241028 2023-01-22 13:20:26.994001: step: 760/466, loss: 0.031992603093385696 2023-01-22 13:20:27.768484: step: 762/466, loss: 0.040653154253959656 2023-01-22 13:20:28.538901: step: 764/466, loss: 0.040650829672813416 2023-01-22 13:20:29.304925: step: 766/466, loss: 0.2310640513896942 2023-01-22 13:20:30.030951: step: 768/466, loss: 0.3783543109893799 2023-01-22 13:20:30.771175: step: 770/466, loss: 0.6130688190460205 2023-01-22 13:20:31.600789: step: 772/466, loss: 0.09674729406833649 2023-01-22 13:20:32.541075: step: 774/466, loss: 0.19277401268482208 2023-01-22 13:20:33.308793: step: 776/466, loss: 0.4734606146812439 2023-01-22 13:20:34.063896: step: 778/466, loss: 0.010649221017956734 2023-01-22 13:20:34.824157: step: 780/466, loss: 0.18568529188632965 2023-01-22 13:20:35.525965: step: 782/466, loss: 0.08577094972133636 2023-01-22 13:20:36.252687: step: 784/466, loss: 0.1252678632736206 2023-01-22 13:20:37.069805: step: 786/466, loss: 0.07036635279655457 2023-01-22 13:20:37.780094: step: 788/466, loss: 
3.049663782119751 2023-01-22 13:20:38.602047: step: 790/466, loss: 0.2606522738933563 2023-01-22 13:20:39.420833: step: 792/466, loss: 0.11692289263010025 2023-01-22 13:20:40.235031: step: 794/466, loss: 0.09116509556770325 2023-01-22 13:20:40.986970: step: 796/466, loss: 0.1748039275407791 2023-01-22 13:20:41.844020: step: 798/466, loss: 0.06679972261190414 2023-01-22 13:20:42.579251: step: 800/466, loss: 0.7025142312049866 2023-01-22 13:20:43.304103: step: 802/466, loss: 0.11592617630958557 2023-01-22 13:20:44.097601: step: 804/466, loss: 0.9633765816688538 2023-01-22 13:20:44.976812: step: 806/466, loss: 0.1293555349111557 2023-01-22 13:20:45.916662: step: 808/466, loss: 0.1482734978199005 2023-01-22 13:20:46.687273: step: 810/466, loss: 0.028712695464491844 2023-01-22 13:20:47.446557: step: 812/466, loss: 0.22466611862182617 2023-01-22 13:20:48.168021: step: 814/466, loss: 0.3807362914085388 2023-01-22 13:20:48.915386: step: 816/466, loss: 0.11186535656452179 2023-01-22 13:20:49.758714: step: 818/466, loss: 0.15200169384479523 2023-01-22 13:20:50.568362: step: 820/466, loss: 0.13248823583126068 2023-01-22 13:20:51.322756: step: 822/466, loss: 0.5440301299095154 2023-01-22 13:20:52.040152: step: 824/466, loss: 0.033675309270620346 2023-01-22 13:20:52.756381: step: 826/466, loss: 0.13058869540691376 2023-01-22 13:20:53.529017: step: 828/466, loss: 0.09893123805522919 2023-01-22 13:20:54.310161: step: 830/466, loss: 0.5407994985580444 2023-01-22 13:20:55.083879: step: 832/466, loss: 0.11411946266889572 2023-01-22 13:20:55.770442: step: 834/466, loss: 0.06883563101291656 2023-01-22 13:20:56.557818: step: 836/466, loss: 0.26703953742980957 2023-01-22 13:20:57.393141: step: 838/466, loss: 0.9148061275482178 2023-01-22 13:20:58.179983: step: 840/466, loss: 0.0823197290301323 2023-01-22 13:20:59.026617: step: 842/466, loss: 0.054216478019952774 2023-01-22 13:20:59.812382: step: 844/466, loss: 0.17121522128582 2023-01-22 13:21:00.575698: step: 846/466, loss: 
0.27832192182540894 2023-01-22 13:21:01.340299: step: 848/466, loss: 0.2275869995355606 2023-01-22 13:21:02.106339: step: 850/466, loss: 0.07137225568294525 2023-01-22 13:21:02.913658: step: 852/466, loss: 0.054541222751140594 2023-01-22 13:21:03.629269: step: 854/466, loss: 0.3268347978591919 2023-01-22 13:21:04.317656: step: 856/466, loss: 0.21499894559383392 2023-01-22 13:21:05.073802: step: 858/466, loss: 0.1782616823911667 2023-01-22 13:21:05.890039: step: 860/466, loss: 0.44444623589515686 2023-01-22 13:21:06.683015: step: 862/466, loss: 0.31309425830841064 2023-01-22 13:21:07.422138: step: 864/466, loss: 0.16758395731449127 2023-01-22 13:21:08.207204: step: 866/466, loss: 0.06523101776838303 2023-01-22 13:21:08.985215: step: 868/466, loss: 0.2688615918159485 2023-01-22 13:21:09.830094: step: 870/466, loss: 0.18860988318920135 2023-01-22 13:21:10.631071: step: 872/466, loss: 0.07963003218173981 2023-01-22 13:21:11.438417: step: 874/466, loss: 0.1542077660560608 2023-01-22 13:21:12.198723: step: 876/466, loss: 0.0746893659234047 2023-01-22 13:21:12.981880: step: 878/466, loss: 0.18620958924293518 2023-01-22 13:21:13.675139: step: 880/466, loss: 0.21131157875061035 2023-01-22 13:21:14.353980: step: 882/466, loss: 0.020140519365668297 2023-01-22 13:21:15.153161: step: 884/466, loss: 0.09388389438390732 2023-01-22 13:21:15.941743: step: 886/466, loss: 0.36890143156051636 2023-01-22 13:21:16.712303: step: 888/466, loss: 0.0899510309100151 2023-01-22 13:21:17.451543: step: 890/466, loss: 0.0766301304101944 2023-01-22 13:21:18.113903: step: 892/466, loss: 0.22256304323673248 2023-01-22 13:21:18.843701: step: 894/466, loss: 0.10125160962343216 2023-01-22 13:21:19.695396: step: 896/466, loss: 0.03004208765923977 2023-01-22 13:21:20.558456: step: 898/466, loss: 0.10825519263744354 2023-01-22 13:21:21.272332: step: 900/466, loss: 0.03943290561437607 2023-01-22 13:21:22.027272: step: 902/466, loss: 0.25123417377471924 2023-01-22 13:21:22.794484: step: 904/466, loss: 
0.0888330340385437 2023-01-22 13:21:23.600895: step: 906/466, loss: 0.45769384503364563 2023-01-22 13:21:24.383328: step: 908/466, loss: 0.14918996393680573 2023-01-22 13:21:25.163653: step: 910/466, loss: 0.12507207691669464 2023-01-22 13:21:25.907387: step: 912/466, loss: 0.6189706921577454 2023-01-22 13:21:26.708959: step: 914/466, loss: 0.13033513724803925 2023-01-22 13:21:27.454830: step: 916/466, loss: 0.13682545721530914 2023-01-22 13:21:28.194180: step: 918/466, loss: 0.08202681690454483 2023-01-22 13:21:29.008102: step: 920/466, loss: 0.20634540915489197 2023-01-22 13:21:29.701987: step: 922/466, loss: 0.9963807463645935 2023-01-22 13:21:30.428578: step: 924/466, loss: 0.37003475427627563 2023-01-22 13:21:31.137713: step: 926/466, loss: 0.25558382272720337 2023-01-22 13:21:31.912188: step: 928/466, loss: 0.04824042692780495 2023-01-22 13:21:32.630621: step: 930/466, loss: 0.17482222616672516 2023-01-22 13:21:33.397256: step: 932/466, loss: 0.2238139510154724
==================================================
Loss: 0.270
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3229301325273193, 'r': 0.3523431237252535, 'f1': 0.3369960548152606}, 'combined': 0.2483128824954552, 'epoch': 13}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.359844362340429, 'r': 0.28594218393949167, 'f1': 0.318664683984716}, 'combined': 0.19586219601011812, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2940412553672126, 'r': 0.3548581374071105, 'f1': 0.3215997221213194}, 'combined': 0.23696821629991952, 'epoch': 13}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34440708830444183, 'r': 0.2975915793314571, 'f1': 0.31929240513500506}, 'combined': 0.19624801486346652, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3395252613240418, 'r': 0.3698055028462998, 'f1': 0.35401907356948226}, 'combined': 0.26085615947225005, 'epoch': 13}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3632055485629302, 'r': 0.29056443885034416, 'f1': 0.3228493765003824}, 'combined': 0.19940696783847153, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2554347826086957, 'r': 0.3357142857142857, 'f1': 0.2901234567901234}, 'combined': 0.19341563786008226, 'epoch': 13}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2361111111111111, 'r': 0.5543478260869565, 'f1': 0.3311688311688311}, 'combined': 0.16558441558441556, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 13}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 14
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:24:18.588599: step: 2/466, loss: 1.7948763370513916 2023-01-22 13:24:19.422521: step: 4/466, loss: 0.11857905983924866 2023-01-22 13:24:20.107100: step: 6/466, loss: 0.014733104966580868 2023-01-22 13:24:20.868572: step: 8/466, loss: 0.0764663890004158 2023-01-22 13:24:21.615822: step: 10/466, loss: 0.09650014340877533 2023-01-22 13:24:22.498934: step: 12/466, loss: 0.08068323135375977 2023-01-22 13:24:23.223767: step: 14/466, loss: 0.07612251490354538 2023-01-22 13:24:23.995681: step: 16/466, loss: 0.14870595932006836 2023-01-22 13:24:24.767429: step: 
18/466, loss: 0.01473545003682375 2023-01-22 13:24:25.595599: step: 20/466, loss: 0.13693945109844208 2023-01-22 13:24:26.415292: step: 22/466, loss: 0.09258382767438889 2023-01-22 13:24:27.102713: step: 24/466, loss: 0.0811443030834198 2023-01-22 13:24:27.852125: step: 26/466, loss: 0.08367400616407394 2023-01-22 13:24:28.651119: step: 28/466, loss: 0.04633022099733353 2023-01-22 13:24:29.403724: step: 30/466, loss: 0.14591136574745178 2023-01-22 13:24:30.186425: step: 32/466, loss: 0.2338826209306717 2023-01-22 13:24:30.949619: step: 34/466, loss: 0.0796242207288742 2023-01-22 13:24:31.688070: step: 36/466, loss: 0.04654328525066376 2023-01-22 13:24:32.506701: step: 38/466, loss: 0.2032957375049591 2023-01-22 13:24:33.278290: step: 40/466, loss: 0.15072587132453918 2023-01-22 13:24:34.055356: step: 42/466, loss: 0.03954103961586952 2023-01-22 13:24:34.830441: step: 44/466, loss: 0.16800735890865326 2023-01-22 13:24:35.586306: step: 46/466, loss: 0.08304127305746078 2023-01-22 13:24:36.306701: step: 48/466, loss: 1.2582231760025024 2023-01-22 13:24:37.013537: step: 50/466, loss: 0.17893579602241516 2023-01-22 13:24:37.764436: step: 52/466, loss: 0.14870749413967133 2023-01-22 13:24:38.493356: step: 54/466, loss: 0.052345480769872665 2023-01-22 13:24:39.306616: step: 56/466, loss: 0.3695552945137024 2023-01-22 13:24:40.132926: step: 58/466, loss: 0.05745256692171097 2023-01-22 13:24:40.822250: step: 60/466, loss: 0.05403880774974823 2023-01-22 13:24:41.517122: step: 62/466, loss: 0.09356357157230377 2023-01-22 13:24:42.253335: step: 64/466, loss: 0.06248932331800461 2023-01-22 13:24:43.046038: step: 66/466, loss: 0.04734089598059654 2023-01-22 13:24:43.860246: step: 68/466, loss: 0.11174242943525314 2023-01-22 13:24:44.645069: step: 70/466, loss: 0.059995852410793304 2023-01-22 13:24:45.477551: step: 72/466, loss: 0.09739916026592255 2023-01-22 13:24:46.279640: step: 74/466, loss: 0.14582285284996033 2023-01-22 13:24:47.057223: step: 76/466, loss: 
0.08362985402345657 2023-01-22 13:24:47.791493: step: 78/466, loss: 0.17142032086849213 2023-01-22 13:24:48.522765: step: 80/466, loss: 0.07356587052345276 2023-01-22 13:24:49.360314: step: 82/466, loss: 0.1305818408727646 2023-01-22 13:24:50.116592: step: 84/466, loss: 0.13178014755249023 2023-01-22 13:24:50.867580: step: 86/466, loss: 0.06631729006767273 2023-01-22 13:24:51.565193: step: 88/466, loss: 0.11189496517181396 2023-01-22 13:24:52.355623: step: 90/466, loss: 0.08014704287052155 2023-01-22 13:24:53.094601: step: 92/466, loss: 0.08639585971832275 2023-01-22 13:24:53.881060: step: 94/466, loss: 0.05464732274413109 2023-01-22 13:24:54.571479: step: 96/466, loss: 0.048334747552871704 2023-01-22 13:24:55.334133: step: 98/466, loss: 0.3570219576358795 2023-01-22 13:24:56.075599: step: 100/466, loss: 0.037182196974754333 2023-01-22 13:24:56.889525: step: 102/466, loss: 0.15509562194347382 2023-01-22 13:24:57.623702: step: 104/466, loss: 0.6345539093017578 2023-01-22 13:24:58.433390: step: 106/466, loss: 0.05826098099350929 2023-01-22 13:24:59.193984: step: 108/466, loss: 0.10702861845493317 2023-01-22 13:25:00.005444: step: 110/466, loss: 0.3654964566230774 2023-01-22 13:25:00.840141: step: 112/466, loss: 0.18933402001857758 2023-01-22 13:25:01.607185: step: 114/466, loss: 0.13730598986148834 2023-01-22 13:25:02.314853: step: 116/466, loss: 0.030305171385407448 2023-01-22 13:25:03.125474: step: 118/466, loss: 0.12774544954299927 2023-01-22 13:25:03.890120: step: 120/466, loss: 0.05463085323572159 2023-01-22 13:25:04.656678: step: 122/466, loss: 0.085533507168293 2023-01-22 13:25:05.424564: step: 124/466, loss: 0.09212604910135269 2023-01-22 13:25:06.149394: step: 126/466, loss: 0.048779770731925964 2023-01-22 13:25:06.931429: step: 128/466, loss: 0.05446954816579819 2023-01-22 13:25:07.644883: step: 130/466, loss: 0.16687119007110596 2023-01-22 13:25:08.451521: step: 132/466, loss: 0.06060370057821274 2023-01-22 13:25:09.217659: step: 134/466, loss: 
0.09922155737876892 2023-01-22 13:25:09.972344: step: 136/466, loss: 0.3732868731021881 2023-01-22 13:25:10.688462: step: 138/466, loss: 0.03313040733337402 2023-01-22 13:25:11.413075: step: 140/466, loss: 0.18710584938526154 2023-01-22 13:25:12.219790: step: 142/466, loss: 0.16831213235855103 2023-01-22 13:25:13.014656: step: 144/466, loss: 0.11231508105993271 2023-01-22 13:25:13.760450: step: 146/466, loss: 0.029249688610434532 2023-01-22 13:25:14.507782: step: 148/466, loss: 0.062171828001737595 2023-01-22 13:25:15.219481: step: 150/466, loss: 0.235786572098732 2023-01-22 13:25:15.990527: step: 152/466, loss: 0.06076842173933983 2023-01-22 13:25:16.764904: step: 154/466, loss: 0.13541647791862488 2023-01-22 13:25:17.545142: step: 156/466, loss: 5.351958751678467 2023-01-22 13:25:18.324479: step: 158/466, loss: 0.09068594127893448 2023-01-22 13:25:19.095529: step: 160/466, loss: 0.06700941920280457 2023-01-22 13:25:19.905163: step: 162/466, loss: 0.19954699277877808 2023-01-22 13:25:20.730128: step: 164/466, loss: 0.14579974114894867 2023-01-22 13:25:21.501507: step: 166/466, loss: 0.10689794272184372 2023-01-22 13:25:22.247230: step: 168/466, loss: 0.018217479810118675 2023-01-22 13:25:22.963438: step: 170/466, loss: 0.04633485898375511 2023-01-22 13:25:23.723799: step: 172/466, loss: 0.11263962835073471 2023-01-22 13:25:24.463169: step: 174/466, loss: 0.14510999619960785 2023-01-22 13:25:25.420199: step: 176/466, loss: 0.15911799669265747 2023-01-22 13:25:26.204303: step: 178/466, loss: 0.27858880162239075 2023-01-22 13:25:27.003064: step: 180/466, loss: 0.10577763617038727 2023-01-22 13:25:27.807718: step: 182/466, loss: 0.08947893977165222 2023-01-22 13:25:28.555457: step: 184/466, loss: 0.06114058196544647 2023-01-22 13:25:29.275109: step: 186/466, loss: 0.0358128622174263 2023-01-22 13:25:30.049264: step: 188/466, loss: 0.053163789212703705 2023-01-22 13:25:30.797566: step: 190/466, loss: 0.19581694900989532 2023-01-22 13:25:31.536854: step: 192/466, loss: 
0.30720922350883484 2023-01-22 13:25:32.401046: step: 194/466, loss: 0.16692815721035004 2023-01-22 13:25:33.101571: step: 196/466, loss: 0.019776280969381332 2023-01-22 13:25:33.857436: step: 198/466, loss: 1.4734183549880981 2023-01-22 13:25:34.600885: step: 200/466, loss: 0.07742732763290405 2023-01-22 13:25:35.325881: step: 202/466, loss: 0.0917370393872261 2023-01-22 13:25:36.214889: step: 204/466, loss: 0.21086423099040985 2023-01-22 13:25:37.058383: step: 206/466, loss: 0.13101069629192352 2023-01-22 13:25:37.824898: step: 208/466, loss: 0.12070529162883759 2023-01-22 13:25:38.577312: step: 210/466, loss: 0.6178359985351562 2023-01-22 13:25:39.353555: step: 212/466, loss: 0.14963263273239136 2023-01-22 13:25:40.127237: step: 214/466, loss: 0.17099174857139587 2023-01-22 13:25:40.824743: step: 216/466, loss: 0.1138356477022171 2023-01-22 13:25:41.554985: step: 218/466, loss: 0.12635937333106995 2023-01-22 13:25:42.349471: step: 220/466, loss: 0.2430417686700821 2023-01-22 13:25:43.013237: step: 222/466, loss: 0.009499253705143929 2023-01-22 13:25:43.810627: step: 224/466, loss: 0.7551463842391968 2023-01-22 13:25:44.588289: step: 226/466, loss: 0.08994847536087036 2023-01-22 13:25:45.394180: step: 228/466, loss: 0.10397832840681076 2023-01-22 13:25:46.189110: step: 230/466, loss: 0.29179275035858154 2023-01-22 13:25:46.979128: step: 232/466, loss: 0.036685217171907425 2023-01-22 13:25:47.723182: step: 234/466, loss: 0.060931991785764694 2023-01-22 13:25:48.464788: step: 236/466, loss: 0.07507771253585815 2023-01-22 13:25:49.189972: step: 238/466, loss: 0.1963038593530655 2023-01-22 13:25:49.947631: step: 240/466, loss: 0.5022599101066589 2023-01-22 13:25:50.759918: step: 242/466, loss: 1.7030168771743774 2023-01-22 13:25:51.535323: step: 244/466, loss: 0.944342851638794 2023-01-22 13:25:52.308174: step: 246/466, loss: 0.19377201795578003 2023-01-22 13:25:53.043056: step: 248/466, loss: 0.07232809066772461 2023-01-22 13:25:53.891436: step: 250/466, loss: 
0.1252468079328537 2023-01-22 13:25:54.620346: step: 252/466, loss: 0.06834293901920319 2023-01-22 13:25:55.443633: step: 254/466, loss: 0.09259863197803497 2023-01-22 13:25:56.138737: step: 256/466, loss: 0.15927770733833313 2023-01-22 13:25:56.965051: step: 258/466, loss: 0.08494063466787338 2023-01-22 13:25:57.724276: step: 260/466, loss: 0.19152744114398956 2023-01-22 13:25:58.527262: step: 262/466, loss: 0.09181059896945953 2023-01-22 13:25:59.289562: step: 264/466, loss: 0.19823607802391052 2023-01-22 13:26:00.025722: step: 266/466, loss: 0.09367074817419052 2023-01-22 13:26:00.872993: step: 268/466, loss: 0.4473530650138855 2023-01-22 13:26:01.618827: step: 270/466, loss: 0.35830551385879517 2023-01-22 13:26:02.487922: step: 272/466, loss: 0.143303781747818 2023-01-22 13:26:03.309164: step: 274/466, loss: 0.12418963015079498 2023-01-22 13:26:04.115106: step: 276/466, loss: 0.18688806891441345 2023-01-22 13:26:04.868435: step: 278/466, loss: 0.13030363619327545 2023-01-22 13:26:05.628686: step: 280/466, loss: 0.12113102525472641 2023-01-22 13:26:06.475850: step: 282/466, loss: 0.05125928670167923 2023-01-22 13:26:07.294043: step: 284/466, loss: 0.052229441702365875 2023-01-22 13:26:08.147397: step: 286/466, loss: 0.20199353992938995 2023-01-22 13:26:08.968310: step: 288/466, loss: 0.4484192430973053 2023-01-22 13:26:09.651363: step: 290/466, loss: 0.10080991685390472 2023-01-22 13:26:10.420629: step: 292/466, loss: 0.17666229605674744 2023-01-22 13:26:11.298239: step: 294/466, loss: 0.19794604182243347 2023-01-22 13:26:12.080729: step: 296/466, loss: 0.2858116626739502 2023-01-22 13:26:12.798648: step: 298/466, loss: 0.06229928135871887 2023-01-22 13:26:13.591457: step: 300/466, loss: 0.04586632549762726 2023-01-22 13:26:14.350232: step: 302/466, loss: 0.4339018166065216 2023-01-22 13:26:15.120393: step: 304/466, loss: 0.2180899679660797 2023-01-22 13:26:15.857989: step: 306/466, loss: 0.10189656913280487 2023-01-22 13:26:16.585422: step: 308/466, loss: 
0.08927162736654282 2023-01-22 13:26:17.312345: step: 310/466, loss: 0.05226528272032738 2023-01-22 13:26:18.038253: step: 312/466, loss: 0.08544369041919708 2023-01-22 13:26:18.703780: step: 314/466, loss: 0.10923538357019424 2023-01-22 13:26:19.497668: step: 316/466, loss: 0.059774525463581085 2023-01-22 13:26:20.175138: step: 318/466, loss: 0.07523495703935623 2023-01-22 13:26:20.860469: step: 320/466, loss: 0.13578970730304718 2023-01-22 13:26:21.824472: step: 322/466, loss: 0.046819448471069336 2023-01-22 13:26:22.611809: step: 324/466, loss: 0.37413427233695984 2023-01-22 13:26:23.433761: step: 326/466, loss: 0.057528331875801086 2023-01-22 13:26:24.156330: step: 328/466, loss: 0.07809772342443466 2023-01-22 13:26:24.892789: step: 330/466, loss: 0.20363347232341766 2023-01-22 13:26:25.685859: step: 332/466, loss: 0.20195549726486206 2023-01-22 13:26:26.437355: step: 334/466, loss: 0.18331404030323029 2023-01-22 13:26:27.126000: step: 336/466, loss: 0.11527480185031891 2023-01-22 13:26:27.795426: step: 338/466, loss: 0.08166998624801636 2023-01-22 13:26:28.609331: step: 340/466, loss: 0.06871407479047775 2023-01-22 13:26:29.378767: step: 342/466, loss: 0.23992781341075897 2023-01-22 13:26:30.324660: step: 344/466, loss: 0.10774870216846466 2023-01-22 13:26:31.194486: step: 346/466, loss: 0.08146621286869049 2023-01-22 13:26:31.949195: step: 348/466, loss: 0.15509775280952454 2023-01-22 13:26:32.655590: step: 350/466, loss: 0.2556403875350952 2023-01-22 13:26:33.416614: step: 352/466, loss: 0.10612188279628754 2023-01-22 13:26:34.310685: step: 354/466, loss: 0.1254601627588272 2023-01-22 13:26:35.006948: step: 356/466, loss: 0.09702162444591522 2023-01-22 13:26:35.751721: step: 358/466, loss: 0.07175100594758987 2023-01-22 13:26:36.488711: step: 360/466, loss: 0.10510864108800888 2023-01-22 13:26:37.212247: step: 362/466, loss: 0.407501220703125 2023-01-22 13:26:37.975127: step: 364/466, loss: 0.13762032985687256 2023-01-22 13:26:38.753577: step: 366/466, loss: 
0.11738991737365723 2023-01-22 13:26:39.500945: step: 368/466, loss: 0.07951025664806366 2023-01-22 13:26:40.277593: step: 370/466, loss: 0.351298063993454 2023-01-22 13:26:41.049626: step: 372/466, loss: 0.08246075361967087 2023-01-22 13:26:41.781854: step: 374/466, loss: 0.09868600219488144 2023-01-22 13:26:42.555705: step: 376/466, loss: 0.05318663269281387 2023-01-22 13:26:43.274901: step: 378/466, loss: 0.1378527283668518 2023-01-22 13:26:44.029152: step: 380/466, loss: 0.12403824925422668 2023-01-22 13:26:44.689275: step: 382/466, loss: 0.0518980547785759 2023-01-22 13:26:45.467603: step: 384/466, loss: 0.036360908299684525 2023-01-22 13:26:46.181422: step: 386/466, loss: 0.12337429821491241 2023-01-22 13:26:46.928802: step: 388/466, loss: 0.07348990440368652 2023-01-22 13:26:47.696527: step: 390/466, loss: 0.21546392142772675 2023-01-22 13:26:48.453780: step: 392/466, loss: 0.15438151359558105 2023-01-22 13:26:49.188307: step: 394/466, loss: 0.34767404198646545 2023-01-22 13:26:49.931771: step: 396/466, loss: 0.07150114327669144 2023-01-22 13:26:50.640258: step: 398/466, loss: 0.1381443440914154 2023-01-22 13:26:51.373690: step: 400/466, loss: 0.21789340674877167 2023-01-22 13:26:52.139209: step: 402/466, loss: 0.07714305073022842 2023-01-22 13:26:52.853286: step: 404/466, loss: 0.13851290941238403 2023-01-22 13:26:53.617378: step: 406/466, loss: 0.06918556243181229 2023-01-22 13:26:54.378013: step: 408/466, loss: 0.1138911172747612 2023-01-22 13:26:55.155877: step: 410/466, loss: 0.36651408672332764 2023-01-22 13:26:55.874364: step: 412/466, loss: 0.12470576912164688 2023-01-22 13:26:56.633307: step: 414/466, loss: 0.20686028897762299 2023-01-22 13:26:57.382301: step: 416/466, loss: 0.04287987947463989 2023-01-22 13:26:58.106058: step: 418/466, loss: 0.7809166312217712 2023-01-22 13:26:58.822187: step: 420/466, loss: 0.07137995958328247 2023-01-22 13:26:59.584898: step: 422/466, loss: 0.053179506212472916 2023-01-22 13:27:00.438314: step: 424/466, loss: 
0.6887791156768799 2023-01-22 13:27:01.256335: step: 426/466, loss: 0.1992788314819336 2023-01-22 13:27:02.081092: step: 428/466, loss: 0.16830606758594513 2023-01-22 13:27:02.823979: step: 430/466, loss: 0.19737331569194794 2023-01-22 13:27:03.649260: step: 432/466, loss: 0.1718023270368576 2023-01-22 13:27:04.436806: step: 434/466, loss: 0.10333762317895889 2023-01-22 13:27:05.095845: step: 436/466, loss: 0.07898228615522385 2023-01-22 13:27:05.838851: step: 438/466, loss: 0.0929722934961319 2023-01-22 13:27:06.672271: step: 440/466, loss: 0.07579855620861053 2023-01-22 13:27:07.423106: step: 442/466, loss: 0.07746347039937973 2023-01-22 13:27:08.131170: step: 444/466, loss: 0.07698897272348404 2023-01-22 13:27:08.839990: step: 446/466, loss: 0.08299127966165543 2023-01-22 13:27:09.631146: step: 448/466, loss: 0.14330194890499115 2023-01-22 13:27:10.412237: step: 450/466, loss: 0.0593377910554409 2023-01-22 13:27:11.206242: step: 452/466, loss: 0.2686997354030609 2023-01-22 13:27:11.945188: step: 454/466, loss: 0.18140004575252533 2023-01-22 13:27:12.675919: step: 456/466, loss: 0.3100077211856842 2023-01-22 13:27:13.393893: step: 458/466, loss: 0.036361850798130035 2023-01-22 13:27:14.064907: step: 460/466, loss: 0.15428365767002106 2023-01-22 13:27:14.947985: step: 462/466, loss: 0.3123149871826172 2023-01-22 13:27:15.799882: step: 464/466, loss: 0.043799079954624176 2023-01-22 13:27:16.575131: step: 466/466, loss: 0.33713820576667786 2023-01-22 13:27:17.394174: step: 468/466, loss: 0.09103430807590485 2023-01-22 13:27:18.170627: step: 470/466, loss: 0.1020338162779808 2023-01-22 13:27:18.927281: step: 472/466, loss: 0.12119297683238983 2023-01-22 13:27:19.644801: step: 474/466, loss: 0.17621614038944244 2023-01-22 13:27:20.354735: step: 476/466, loss: 0.008811583742499352 2023-01-22 13:27:21.128427: step: 478/466, loss: 0.5255606770515442 2023-01-22 13:27:21.971672: step: 480/466, loss: 0.48379725217819214 2023-01-22 13:27:22.727783: step: 482/466, loss: 
0.44753915071487427 2023-01-22 13:27:23.435589: step: 484/466, loss: 0.06750224530696869 2023-01-22 13:27:24.214453: step: 486/466, loss: 0.1774691641330719 2023-01-22 13:27:25.006745: step: 488/466, loss: 0.08389052003622055 2023-01-22 13:27:25.784631: step: 490/466, loss: 0.11968137323856354 2023-01-22 13:27:26.616045: step: 492/466, loss: 0.24325041472911835 2023-01-22 13:27:27.337760: step: 494/466, loss: 0.06990397721529007 2023-01-22 13:27:28.074311: step: 496/466, loss: 0.16294898092746735 2023-01-22 13:27:28.804367: step: 498/466, loss: 0.08192439377307892 2023-01-22 13:27:29.507672: step: 500/466, loss: 0.058084335178136826 2023-01-22 13:27:30.250081: step: 502/466, loss: 0.11363398283720016 2023-01-22 13:27:31.018158: step: 504/466, loss: 0.2798217535018921 2023-01-22 13:27:31.741543: step: 506/466, loss: 0.11963935196399689 2023-01-22 13:27:32.498266: step: 508/466, loss: 0.13938367366790771 2023-01-22 13:27:33.182458: step: 510/466, loss: 0.0852971151471138 2023-01-22 13:27:33.893645: step: 512/466, loss: 0.16567742824554443 2023-01-22 13:27:34.636157: step: 514/466, loss: 0.09919524937868118 2023-01-22 13:27:35.332158: step: 516/466, loss: 0.06752198934555054 2023-01-22 13:27:36.057883: step: 518/466, loss: 0.5552704930305481 2023-01-22 13:27:36.830411: step: 520/466, loss: 0.3073229193687439 2023-01-22 13:27:37.680551: step: 522/466, loss: 0.1807357370853424 2023-01-22 13:27:38.483254: step: 524/466, loss: 0.08142665773630142 2023-01-22 13:27:39.247106: step: 526/466, loss: 0.4611741304397583 2023-01-22 13:27:39.962853: step: 528/466, loss: 0.06939984858036041 2023-01-22 13:27:40.798619: step: 530/466, loss: 0.23970821499824524 2023-01-22 13:27:41.566875: step: 532/466, loss: 0.15627221763134003 2023-01-22 13:27:42.316885: step: 534/466, loss: 0.18242160975933075 2023-01-22 13:27:43.094093: step: 536/466, loss: 0.06681209057569504 2023-01-22 13:27:43.912858: step: 538/466, loss: 0.3042953312397003 2023-01-22 13:27:44.600531: step: 540/466, loss: 
0.03705006465315819 2023-01-22 13:27:45.365912: step: 542/466, loss: 0.030975710600614548 2023-01-22 13:27:46.049215: step: 544/466, loss: 0.047146331518888474 2023-01-22 13:27:46.733968: step: 546/466, loss: 0.1211782693862915 2023-01-22 13:27:47.552375: step: 548/466, loss: 0.04829741269350052 2023-01-22 13:27:48.416419: step: 550/466, loss: 0.06972920894622803 2023-01-22 13:27:49.148791: step: 552/466, loss: 0.033795811235904694 2023-01-22 13:27:49.947144: step: 554/466, loss: 0.03054910898208618 2023-01-22 13:27:50.774965: step: 556/466, loss: 0.14491669833660126 2023-01-22 13:27:51.556672: step: 558/466, loss: 0.11190073192119598 2023-01-22 13:27:52.351378: step: 560/466, loss: 0.3582926392555237 2023-01-22 13:27:53.036776: step: 562/466, loss: 0.0762907937169075 2023-01-22 13:27:53.779996: step: 564/466, loss: 0.6806654334068298 2023-01-22 13:27:54.541966: step: 566/466, loss: 0.18192371726036072 2023-01-22 13:27:55.302286: step: 568/466, loss: 0.02502519078552723 2023-01-22 13:27:56.043355: step: 570/466, loss: 0.3706054389476776 2023-01-22 13:27:56.780296: step: 572/466, loss: 0.0466209352016449 2023-01-22 13:27:57.556814: step: 574/466, loss: 0.06274183094501495 2023-01-22 13:27:58.351927: step: 576/466, loss: 0.09402534365653992 2023-01-22 13:27:59.140294: step: 578/466, loss: 0.06700240820646286 2023-01-22 13:27:59.964966: step: 580/466, loss: 0.2377341240644455 2023-01-22 13:28:00.839485: step: 582/466, loss: 0.6597070693969727 2023-01-22 13:28:01.562771: step: 584/466, loss: 0.08730103075504303 2023-01-22 13:28:02.447515: step: 586/466, loss: 0.11759735643863678 2023-01-22 13:28:03.164615: step: 588/466, loss: 0.0636318176984787 2023-01-22 13:28:03.920282: step: 590/466, loss: 0.08937416970729828 2023-01-22 13:28:04.594859: step: 592/466, loss: 0.0803782120347023 2023-01-22 13:28:05.343955: step: 594/466, loss: 0.09570712596178055 2023-01-22 13:28:06.141489: step: 596/466, loss: 0.2582423686981201 2023-01-22 13:28:06.906161: step: 598/466, loss: 
0.10370142012834549 2023-01-22 13:28:07.774053: step: 600/466, loss: 0.40866079926490784 2023-01-22 13:28:08.537880: step: 602/466, loss: 0.03324064239859581 2023-01-22 13:28:09.320450: step: 604/466, loss: 0.13952693343162537 2023-01-22 13:28:10.110534: step: 606/466, loss: 0.054963547736406326 2023-01-22 13:28:10.908676: step: 608/466, loss: 0.057081256061792374 2023-01-22 13:28:11.732063: step: 610/466, loss: 0.16398000717163086 2023-01-22 13:28:12.520085: step: 612/466, loss: 0.14669935405254364 2023-01-22 13:28:13.242827: step: 614/466, loss: 0.10255993157625198 2023-01-22 13:28:14.001943: step: 616/466, loss: 0.5874757170677185 2023-01-22 13:28:14.832176: step: 618/466, loss: 0.06501025706529617 2023-01-22 13:28:15.552466: step: 620/466, loss: 0.163362056016922 2023-01-22 13:28:16.291006: step: 622/466, loss: 0.06703763455152512 2023-01-22 13:28:17.169242: step: 624/466, loss: 4.936920642852783 2023-01-22 13:28:17.909590: step: 626/466, loss: 0.03846811503171921 2023-01-22 13:28:18.654097: step: 628/466, loss: 0.03890600800514221 2023-01-22 13:28:19.395869: step: 630/466, loss: 0.19648754596710205 2023-01-22 13:28:20.156189: step: 632/466, loss: 0.18167728185653687 2023-01-22 13:28:20.980265: step: 634/466, loss: 0.09970265626907349 2023-01-22 13:28:21.664402: step: 636/466, loss: 0.04559174180030823 2023-01-22 13:28:22.470208: step: 638/466, loss: 0.14223453402519226 2023-01-22 13:28:23.287789: step: 640/466, loss: 0.054243456572294235 2023-01-22 13:28:24.029193: step: 642/466, loss: 0.0635480135679245 2023-01-22 13:28:24.896430: step: 644/466, loss: 0.07275538891553879 2023-01-22 13:28:25.633415: step: 646/466, loss: 0.0707472488284111 2023-01-22 13:28:26.339790: step: 648/466, loss: 0.44649559259414673 2023-01-22 13:28:27.080979: step: 650/466, loss: 0.19551318883895874 2023-01-22 13:28:27.859957: step: 652/466, loss: 0.16433806717395782 2023-01-22 13:28:28.668886: step: 654/466, loss: 0.20102214813232422 2023-01-22 13:28:29.342133: step: 656/466, loss: 
0.07433684915304184 2023-01-22 13:28:30.152069: step: 658/466, loss: 0.17160046100616455 2023-01-22 13:28:30.917057: step: 660/466, loss: 0.1790246218442917 2023-01-22 13:28:31.896635: step: 662/466, loss: 0.15379475057125092 2023-01-22 13:28:32.648422: step: 664/466, loss: 0.15680082142353058 2023-01-22 13:28:33.408747: step: 666/466, loss: 0.23924851417541504 2023-01-22 13:28:34.139925: step: 668/466, loss: 0.0730941891670227 2023-01-22 13:28:34.960967: step: 670/466, loss: 0.2315608114004135 2023-01-22 13:28:35.809531: step: 672/466, loss: 0.04345105215907097 2023-01-22 13:28:36.578988: step: 674/466, loss: 0.32656824588775635 2023-01-22 13:28:37.428941: step: 676/466, loss: 0.16333246231079102 2023-01-22 13:28:38.166752: step: 678/466, loss: 0.302473783493042 2023-01-22 13:28:38.921123: step: 680/466, loss: 0.09493256360292435 2023-01-22 13:28:39.738048: step: 682/466, loss: 0.10015727579593658 2023-01-22 13:28:40.643387: step: 684/466, loss: 0.29031530022621155 2023-01-22 13:28:41.352925: step: 686/466, loss: 0.05644798278808594 2023-01-22 13:28:42.091776: step: 688/466, loss: 0.08461704850196838 2023-01-22 13:28:42.837600: step: 690/466, loss: 0.07248745113611221 2023-01-22 13:28:43.577262: step: 692/466, loss: 1.287148356437683 2023-01-22 13:28:44.395395: step: 694/466, loss: 0.10873213410377502 2023-01-22 13:28:45.235018: step: 696/466, loss: 0.10134746134281158 2023-01-22 13:28:46.022572: step: 698/466, loss: 0.4512856900691986 2023-01-22 13:28:46.763637: step: 700/466, loss: 0.17885048687458038 2023-01-22 13:28:47.539896: step: 702/466, loss: 0.19928614795207977 2023-01-22 13:28:48.245659: step: 704/466, loss: 0.02061893790960312 2023-01-22 13:28:48.960103: step: 706/466, loss: 0.11236510425806046 2023-01-22 13:28:49.775558: step: 708/466, loss: 0.11860001087188721 2023-01-22 13:28:50.496816: step: 710/466, loss: 0.09537634998559952 2023-01-22 13:28:51.190726: step: 712/466, loss: 0.9762457609176636 2023-01-22 13:28:51.918708: step: 714/466, loss: 
0.07860858738422394 2023-01-22 13:28:52.631564: step: 716/466, loss: 0.102113276720047 2023-01-22 13:28:53.409312: step: 718/466, loss: 0.6726066470146179 2023-01-22 13:28:54.179018: step: 720/466, loss: 0.12842540442943573 2023-01-22 13:28:54.941332: step: 722/466, loss: 0.04332917556166649 2023-01-22 13:28:55.754207: step: 724/466, loss: 0.08657079935073853 2023-01-22 13:28:56.576202: step: 726/466, loss: 0.9335070848464966 2023-01-22 13:28:57.341036: step: 728/466, loss: 0.04679827764630318 2023-01-22 13:28:58.130170: step: 730/466, loss: 0.2225496470928192 2023-01-22 13:28:58.930476: step: 732/466, loss: 0.2062837779521942 2023-01-22 13:28:59.704757: step: 734/466, loss: 0.06761132925748825 2023-01-22 13:29:00.484919: step: 736/466, loss: 0.12684382498264313 2023-01-22 13:29:01.287718: step: 738/466, loss: 0.11122657358646393 2023-01-22 13:29:02.109974: step: 740/466, loss: 0.17465952038764954 2023-01-22 13:29:02.868419: step: 742/466, loss: 0.10215871036052704 2023-01-22 13:29:03.641541: step: 744/466, loss: 0.047759778797626495 2023-01-22 13:29:04.451700: step: 746/466, loss: 0.10460063815116882 2023-01-22 13:29:05.229552: step: 748/466, loss: 0.1175285130739212 2023-01-22 13:29:05.921946: step: 750/466, loss: 0.3711400330066681 2023-01-22 13:29:06.634036: step: 752/466, loss: 0.10916385054588318 2023-01-22 13:29:07.492049: step: 754/466, loss: 0.09719958901405334 2023-01-22 13:29:08.266273: step: 756/466, loss: 0.1982584148645401 2023-01-22 13:29:09.067053: step: 758/466, loss: 0.15777599811553955 2023-01-22 13:29:09.835496: step: 760/466, loss: 0.0677880272269249 2023-01-22 13:29:10.691148: step: 762/466, loss: 0.08650727570056915 2023-01-22 13:29:11.491342: step: 764/466, loss: 0.06624545156955719 2023-01-22 13:29:12.355644: step: 766/466, loss: 0.05742808058857918 2023-01-22 13:29:13.085358: step: 768/466, loss: 0.7379640936851501 2023-01-22 13:29:13.881511: step: 770/466, loss: 0.16413678228855133 2023-01-22 13:29:14.641481: step: 772/466, loss: 
0.11502152681350708 2023-01-22 13:29:15.403548: step: 774/466, loss: 0.14976269006729126 2023-01-22 13:29:16.121052: step: 776/466, loss: 0.24236641824245453 2023-01-22 13:29:16.884362: step: 778/466, loss: 0.5738264918327332 2023-01-22 13:29:17.595895: step: 780/466, loss: 0.17038989067077637 2023-01-22 13:29:18.393073: step: 782/466, loss: 0.22923089563846588 2023-01-22 13:29:19.215149: step: 784/466, loss: 0.09711220115423203 2023-01-22 13:29:19.971758: step: 786/466, loss: 0.10119055956602097 2023-01-22 13:29:20.666796: step: 788/466, loss: 0.039102327078580856 2023-01-22 13:29:21.409582: step: 790/466, loss: 0.4208541512489319 2023-01-22 13:29:22.176010: step: 792/466, loss: 0.09367494285106659 2023-01-22 13:29:23.032192: step: 794/466, loss: 0.09790325909852982 2023-01-22 13:29:23.800734: step: 796/466, loss: 0.2581421732902527 2023-01-22 13:29:24.495955: step: 798/466, loss: 0.09112636744976044 2023-01-22 13:29:25.269778: step: 800/466, loss: 0.13206596672534943 2023-01-22 13:29:26.015424: step: 802/466, loss: 0.25599294900894165 2023-01-22 13:29:26.760161: step: 804/466, loss: 0.059491004794836044 2023-01-22 13:29:27.545483: step: 806/466, loss: 0.32217538356781006 2023-01-22 13:29:28.323814: step: 808/466, loss: 0.045017991214990616 2023-01-22 13:29:29.036742: step: 810/466, loss: 0.1430712193250656 2023-01-22 13:29:29.865742: step: 812/466, loss: 0.2719506025314331 2023-01-22 13:29:30.618297: step: 814/466, loss: 0.054632265120744705 2023-01-22 13:29:31.377320: step: 816/466, loss: 0.08650518208742142 2023-01-22 13:29:32.169264: step: 818/466, loss: 0.1173226535320282 2023-01-22 13:29:32.897133: step: 820/466, loss: 0.090061254799366 2023-01-22 13:29:33.694004: step: 822/466, loss: 0.07849156111478806 2023-01-22 13:29:34.502779: step: 824/466, loss: 0.41988658905029297 2023-01-22 13:29:35.234051: step: 826/466, loss: 0.0609484426677227 2023-01-22 13:29:35.996126: step: 828/466, loss: 0.2581329345703125 2023-01-22 13:29:36.767389: step: 830/466, loss: 
0.12320809066295624 2023-01-22 13:29:37.534113: step: 832/466, loss: 0.5082396864891052 2023-01-22 13:29:38.280926: step: 834/466, loss: 0.10686381906270981 2023-01-22 13:29:38.949108: step: 836/466, loss: 0.06501047313213348 2023-01-22 13:29:39.701848: step: 838/466, loss: 0.09969601035118103 2023-01-22 13:29:40.496994: step: 840/466, loss: 0.9600458145141602 2023-01-22 13:29:41.262927: step: 842/466, loss: 0.5189383029937744 2023-01-22 13:29:42.018586: step: 844/466, loss: 0.3567603826522827 2023-01-22 13:29:42.815448: step: 846/466, loss: 0.1487482786178589 2023-01-22 13:29:43.700528: step: 848/466, loss: 0.2470036894083023 2023-01-22 13:29:44.388198: step: 850/466, loss: 0.18263716995716095 2023-01-22 13:29:45.084382: step: 852/466, loss: 0.13464903831481934 2023-01-22 13:29:45.885280: step: 854/466, loss: 0.027120299637317657 2023-01-22 13:29:46.687143: step: 856/466, loss: 0.11120583117008209 2023-01-22 13:29:47.465052: step: 858/466, loss: 0.1276436448097229 2023-01-22 13:29:48.144913: step: 860/466, loss: 0.05004655197262764 2023-01-22 13:29:48.929733: step: 862/466, loss: 0.5223639607429504 2023-01-22 13:29:49.766425: step: 864/466, loss: 0.06866031885147095 2023-01-22 13:29:50.558808: step: 866/466, loss: 0.25382503867149353 2023-01-22 13:29:51.282643: step: 868/466, loss: 0.20900730788707733 2023-01-22 13:29:52.064643: step: 870/466, loss: 0.1347699910402298 2023-01-22 13:29:52.813794: step: 872/466, loss: 0.19099219143390656 2023-01-22 13:29:53.691900: step: 874/466, loss: 0.050141043961048126 2023-01-22 13:29:54.409847: step: 876/466, loss: 0.168554425239563 2023-01-22 13:29:55.231289: step: 878/466, loss: 0.05101846903562546 2023-01-22 13:29:56.000758: step: 880/466, loss: 0.0809459388256073 2023-01-22 13:29:56.710222: step: 882/466, loss: 0.265508770942688 2023-01-22 13:29:57.509926: step: 884/466, loss: 0.06232677027583122 2023-01-22 13:29:58.403833: step: 886/466, loss: 0.24122491478919983 2023-01-22 13:29:59.240907: step: 888/466, loss: 
0.07205895334482193 2023-01-22 13:29:59.952938: step: 890/466, loss: 0.07484681904315948 2023-01-22 13:30:00.644103: step: 892/466, loss: 0.09441087394952774 2023-01-22 13:30:01.422971: step: 894/466, loss: 0.6755983829498291 2023-01-22 13:30:02.192123: step: 896/466, loss: 0.034746140241622925 2023-01-22 13:30:02.930824: step: 898/466, loss: 0.08571317791938782 2023-01-22 13:30:03.695567: step: 900/466, loss: 0.24418793618679047 2023-01-22 13:30:04.463540: step: 902/466, loss: 0.0639563649892807 2023-01-22 13:30:05.254630: step: 904/466, loss: 0.080899678170681 2023-01-22 13:30:06.002827: step: 906/466, loss: 0.41414880752563477 2023-01-22 13:30:06.751928: step: 908/466, loss: 0.13303065299987793 2023-01-22 13:30:07.564524: step: 910/466, loss: 0.06787938624620438 2023-01-22 13:30:08.299942: step: 912/466, loss: 0.08366450667381287 2023-01-22 13:30:09.028452: step: 914/466, loss: 0.08863049000501633 2023-01-22 13:30:09.782028: step: 916/466, loss: 0.11649972200393677 2023-01-22 13:30:10.581608: step: 918/466, loss: 0.020747818052768707 2023-01-22 13:30:11.295420: step: 920/466, loss: 0.10685139894485474 2023-01-22 13:30:12.025351: step: 922/466, loss: 0.09153237193822861 2023-01-22 13:30:12.736654: step: 924/466, loss: 0.1741751730442047 2023-01-22 13:30:13.524989: step: 926/466, loss: 0.16743814945220947 2023-01-22 13:30:14.282756: step: 928/466, loss: 0.07640694826841354 2023-01-22 13:30:15.104329: step: 930/466, loss: 0.20740586519241333 2023-01-22 13:30:15.911282: step: 932/466, loss: 0.128241166472435
==================================================
Loss: 0.198
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3041842327150085, 'r': 0.34227941176470594, 'f1': 0.32210937500000003}, 'combined': 0.23734375000000002, 'epoch': 14}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3458826249346568, 'r': 0.31084516422699027, 'f1': 0.32742924275620044}, 'combined': 0.20124919310868905, 'epoch': 14}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27470964758612926, 'r': 0.34560245986642063, 'f1': 0.30610503588168686}, 'combined': 0.22555107907071661, 'epoch': 14}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3285542059871037, 'r': 0.3180290928948762, 'f1': 0.32320598530011607}, 'combined': 0.1986534348673884, 'epoch': 14}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3128483952702703, 'r': 0.35143500948766604, 'f1': 0.33102100089365505}, 'combined': 0.24391021118479844, 'epoch': 14}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3416881680567298, 'r': 0.3091464377656127, 'f1': 0.3246037596538933}, 'combined': 0.2004905574332871, 'epoch': 14}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2616279069767442, 'r': 0.32142857142857145, 'f1': 0.28846153846153855}, 'combined': 0.19230769230769235, 'epoch': 14}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2553191489361702, 'r': 0.5217391304347826, 'f1': 0.3428571428571428}, 'combined': 0.1714285714285714, 'epoch': 14}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4166666666666667, 'r': 0.1724137931034483, 'f1': 0.2439024390243903}, 'combined': 0.1626016260162602, 'epoch': 14}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 15
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1
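A note on how the summary numbers in this log fit together: in every record, 'f1' is the harmonic mean of its 'p' and 'r', and 'combined' equals the product of the template and slot F1 scores. This is inferred purely from the logged values (train.py's scoring code is not shown here), so treat it as an assumption; a minimal sketch:

```python
# Assumed relations, reverse-engineered from the logged metric dicts above
# (not confirmed against train.py itself).

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    """Product of template F1 and slot F1, matching the logged 'combined'."""
    return template_f1 * slot_f1

# Dev Chinese at epoch 14, values copied from the log:
t_f1 = f1(1.0, 0.5833333333333334)   # ~0.7368421052631579
s_f1 = 0.32210937500000003
print(combined(t_f1, s_f1))          # ~0.23734375, as logged
```

Checked against the epoch-14 Dev Chinese record: 0.7368421… × 0.3221094… reproduces the logged 'combined' of 0.23734375….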
--learning_rate 9e-4 2023-01-22 13:33:02.133318: step: 2/466, loss: 0.05210694670677185 2023-01-22 13:33:02.892706: step: 4/466, loss: 0.018808016553521156 2023-01-22 13:33:03.709431: step: 6/466, loss: 0.24722591042518616 2023-01-22 13:33:04.504815: step: 8/466, loss: 0.07486678659915924 2023-01-22 13:33:05.340613: step: 10/466, loss: 0.49716898798942566 2023-01-22 13:33:06.113964: step: 12/466, loss: 0.09887971729040146 2023-01-22 13:33:06.909878: step: 14/466, loss: 0.26036396622657776 2023-01-22 13:33:07.608524: step: 16/466, loss: 0.08663032948970795 2023-01-22 13:33:08.388760: step: 18/466, loss: 0.11957651376724243 2023-01-22 13:33:09.214418: step: 20/466, loss: 0.08551111817359924 2023-01-22 13:33:09.995022: step: 22/466, loss: 0.15386079251766205 2023-01-22 13:33:10.807691: step: 24/466, loss: 0.06206200644373894 2023-01-22 13:33:11.510952: step: 26/466, loss: 0.4639705717563629 2023-01-22 13:33:12.241713: step: 28/466, loss: 0.06706234812736511 2023-01-22 13:33:13.011078: step: 30/466, loss: 0.0980425551533699 2023-01-22 13:33:13.852176: step: 32/466, loss: 0.07218775898218155 2023-01-22 13:33:14.673003: step: 34/466, loss: 0.0752536952495575 2023-01-22 13:33:15.389485: step: 36/466, loss: 0.05118757486343384 2023-01-22 13:33:16.081111: step: 38/466, loss: 0.026366863399744034 2023-01-22 13:33:16.903427: step: 40/466, loss: 0.11569967120885849 2023-01-22 13:33:17.633734: step: 42/466, loss: 0.11791163682937622 2023-01-22 13:33:18.497010: step: 44/466, loss: 0.05543734133243561 2023-01-22 13:33:19.236978: step: 46/466, loss: 0.11040161550045013 2023-01-22 13:33:20.001281: step: 48/466, loss: 0.14670924842357635 2023-01-22 13:33:20.735186: step: 50/466, loss: 0.0680384710431099 2023-01-22 13:33:21.472827: step: 52/466, loss: 0.04696614667773247 2023-01-22 13:33:22.170045: step: 54/466, loss: 0.08242233097553253 2023-01-22 13:33:22.925836: step: 56/466, loss: 0.3122200071811676 2023-01-22 13:33:23.625693: step: 58/466, loss: 0.02097785286605358 2023-01-22 
13:33:24.467654: step: 60/466, loss: 0.03667465224862099 2023-01-22 13:33:25.147277: step: 62/466, loss: 0.027150847017765045 2023-01-22 13:33:25.874220: step: 64/466, loss: 0.10212913900613785 2023-01-22 13:33:26.607141: step: 66/466, loss: 0.09340247511863708 2023-01-22 13:33:27.298942: step: 68/466, loss: 0.13316011428833008 2023-01-22 13:33:28.026903: step: 70/466, loss: 0.05900976061820984 2023-01-22 13:33:28.745974: step: 72/466, loss: 0.065632164478302 2023-01-22 13:33:29.521028: step: 74/466, loss: 0.07361248880624771 2023-01-22 13:33:30.231429: step: 76/466, loss: 0.05891217291355133 2023-01-22 13:33:30.993857: step: 78/466, loss: 0.01654049940407276 2023-01-22 13:33:31.748997: step: 80/466, loss: 0.06702452152967453 2023-01-22 13:33:32.572242: step: 82/466, loss: 0.3018524944782257 2023-01-22 13:33:33.352863: step: 84/466, loss: 0.3223186731338501 2023-01-22 13:33:34.063667: step: 86/466, loss: 0.028132904320955276 2023-01-22 13:33:34.885370: step: 88/466, loss: 0.32144275307655334 2023-01-22 13:33:35.575190: step: 90/466, loss: 0.11325498670339584 2023-01-22 13:33:36.308173: step: 92/466, loss: 0.06781657785177231 2023-01-22 13:33:37.191768: step: 94/466, loss: 0.020502163097262383 2023-01-22 13:33:37.931550: step: 96/466, loss: 0.029743617400527 2023-01-22 13:33:38.688582: step: 98/466, loss: 0.6470071077346802 2023-01-22 13:33:39.418383: step: 100/466, loss: 0.08369266241788864 2023-01-22 13:33:40.126611: step: 102/466, loss: 0.1308288723230362 2023-01-22 13:33:40.831273: step: 104/466, loss: 0.1260998249053955 2023-01-22 13:33:41.595757: step: 106/466, loss: 0.09659962356090546 2023-01-22 13:33:42.347309: step: 108/466, loss: 0.04915435239672661 2023-01-22 13:33:43.102405: step: 110/466, loss: 0.0700543075799942 2023-01-22 13:33:43.840361: step: 112/466, loss: 0.06394853442907333 2023-01-22 13:33:44.610855: step: 114/466, loss: 0.1292410045862198 2023-01-22 13:33:45.369259: step: 116/466, loss: 0.09758917987346649 2023-01-22 13:33:46.303422: step: 
118/466, loss: 1.9617502689361572 2023-01-22 13:33:47.096768: step: 120/466, loss: 0.06086967885494232 2023-01-22 13:33:47.788593: step: 122/466, loss: 0.059228383004665375 2023-01-22 13:33:48.516542: step: 124/466, loss: 0.19763793051242828 2023-01-22 13:33:49.380321: step: 126/466, loss: 0.050040893256664276 2023-01-22 13:33:50.078245: step: 128/466, loss: 0.08526965230703354 2023-01-22 13:33:50.825527: step: 130/466, loss: 0.1221991628408432 2023-01-22 13:33:51.564193: step: 132/466, loss: 0.06798665970563889 2023-01-22 13:33:52.273967: step: 134/466, loss: 0.1459055095911026 2023-01-22 13:33:53.048428: step: 136/466, loss: 0.08939648419618607 2023-01-22 13:33:53.894856: step: 138/466, loss: 0.14021599292755127 2023-01-22 13:33:54.636761: step: 140/466, loss: 0.04460885748267174 2023-01-22 13:33:55.433296: step: 142/466, loss: 0.09865774214267731 2023-01-22 13:33:56.266080: step: 144/466, loss: 0.03751792013645172 2023-01-22 13:33:56.996273: step: 146/466, loss: 0.025001395493745804 2023-01-22 13:33:57.797037: step: 148/466, loss: 0.02535279095172882 2023-01-22 13:33:58.610694: step: 150/466, loss: 0.17281517386436462 2023-01-22 13:33:59.371340: step: 152/466, loss: 0.07138783484697342 2023-01-22 13:34:00.073010: step: 154/466, loss: 0.14332233369350433 2023-01-22 13:34:00.845239: step: 156/466, loss: 0.05977214500308037 2023-01-22 13:34:01.686872: step: 158/466, loss: 0.02197592705488205 2023-01-22 13:34:02.423884: step: 160/466, loss: 0.05003592371940613 2023-01-22 13:34:03.132349: step: 162/466, loss: 0.1948169469833374 2023-01-22 13:34:03.932517: step: 164/466, loss: 0.06136566773056984 2023-01-22 13:34:04.709414: step: 166/466, loss: 0.054979994893074036 2023-01-22 13:34:05.530197: step: 168/466, loss: 0.1970011591911316 2023-01-22 13:34:06.298450: step: 170/466, loss: 0.21378380060195923 2023-01-22 13:34:07.023410: step: 172/466, loss: 0.0801737830042839 2023-01-22 13:34:07.778999: step: 174/466, loss: 0.09108097851276398 2023-01-22 13:34:08.492220: step: 
176/466, loss: 0.031183486804366112 2023-01-22 13:34:09.248826: step: 178/466, loss: 0.03109663724899292 2023-01-22 13:34:10.075788: step: 180/466, loss: 0.12059830874204636 2023-01-22 13:34:10.862701: step: 182/466, loss: 0.14663125574588776 2023-01-22 13:34:11.602056: step: 184/466, loss: 0.0532103069126606 2023-01-22 13:34:12.380263: step: 186/466, loss: 0.08011765033006668 2023-01-22 13:34:13.140471: step: 188/466, loss: 0.05597788095474243 2023-01-22 13:34:13.893273: step: 190/466, loss: 0.14271435141563416 2023-01-22 13:34:14.576275: step: 192/466, loss: 0.033926501870155334 2023-01-22 13:34:15.447446: step: 194/466, loss: 0.1532621681690216 2023-01-22 13:34:16.215239: step: 196/466, loss: 0.11758232861757278 2023-01-22 13:34:17.029590: step: 198/466, loss: 0.16431008279323578 2023-01-22 13:34:17.800663: step: 200/466, loss: 0.03531648963689804 2023-01-22 13:34:18.634939: step: 202/466, loss: 0.07027041167020798 2023-01-22 13:34:19.398550: step: 204/466, loss: 0.04011674225330353 2023-01-22 13:34:20.213875: step: 206/466, loss: 0.11462801694869995 2023-01-22 13:34:21.009500: step: 208/466, loss: 0.42639032006263733 2023-01-22 13:34:21.809817: step: 210/466, loss: 0.09091605991125107 2023-01-22 13:34:22.472893: step: 212/466, loss: 0.1382666379213333 2023-01-22 13:34:23.300011: step: 214/466, loss: 0.07938707619905472 2023-01-22 13:34:24.044712: step: 216/466, loss: 0.03487627953290939 2023-01-22 13:34:24.919643: step: 218/466, loss: 0.07560927420854568 2023-01-22 13:34:25.618200: step: 220/466, loss: 0.1018633246421814 2023-01-22 13:34:26.429701: step: 222/466, loss: 0.2556101679801941 2023-01-22 13:34:27.230786: step: 224/466, loss: 0.2421107292175293 2023-01-22 13:34:27.931227: step: 226/466, loss: 0.029341835528612137 2023-01-22 13:34:28.656577: step: 228/466, loss: 0.051736246794462204 2023-01-22 13:34:29.361779: step: 230/466, loss: 0.09494776278734207 2023-01-22 13:34:30.180743: step: 232/466, loss: 0.06730318069458008 2023-01-22 13:34:30.923894: step: 
234/466, loss: 0.2819020748138428 2023-01-22 13:34:31.729676: step: 236/466, loss: 0.09525111317634583 2023-01-22 13:34:32.511328: step: 238/466, loss: 0.09886281937360764 2023-01-22 13:34:33.237529: step: 240/466, loss: 0.06698231399059296 2023-01-22 13:34:34.056664: step: 242/466, loss: 0.07156889885663986 2023-01-22 13:34:34.852663: step: 244/466, loss: 0.036325473338365555 2023-01-22 13:34:35.574399: step: 246/466, loss: 0.10516617447137833 2023-01-22 13:34:36.287628: step: 248/466, loss: 0.1363402009010315 2023-01-22 13:34:37.071718: step: 250/466, loss: 0.05505356565117836 2023-01-22 13:34:37.783429: step: 252/466, loss: 0.06005243584513664 2023-01-22 13:34:38.526582: step: 254/466, loss: 0.09491105377674103 2023-01-22 13:34:39.287538: step: 256/466, loss: 0.03337240591645241 2023-01-22 13:34:39.976254: step: 258/466, loss: 0.015260940417647362 2023-01-22 13:34:40.721146: step: 260/466, loss: 0.182101309299469 2023-01-22 13:34:41.558769: step: 262/466, loss: 0.12483559548854828 2023-01-22 13:34:42.341651: step: 264/466, loss: 0.05799545347690582 2023-01-22 13:34:43.203205: step: 266/466, loss: 0.12771588563919067 2023-01-22 13:34:43.967018: step: 268/466, loss: 0.05148211494088173 2023-01-22 13:34:44.692337: step: 270/466, loss: 0.025062330067157745 2023-01-22 13:34:45.424627: step: 272/466, loss: 0.07430432736873627 2023-01-22 13:34:46.206361: step: 274/466, loss: 0.10613179206848145 2023-01-22 13:34:47.060809: step: 276/466, loss: 0.1614530235528946 2023-01-22 13:34:47.776843: step: 278/466, loss: 0.0584588497877121 2023-01-22 13:34:48.562045: step: 280/466, loss: 0.13125962018966675 2023-01-22 13:34:49.359850: step: 282/466, loss: 0.0455959290266037 2023-01-22 13:34:50.107824: step: 284/466, loss: 0.0269328560680151 2023-01-22 13:34:50.823951: step: 286/466, loss: 0.24000252783298492 2023-01-22 13:34:51.650652: step: 288/466, loss: 0.19251394271850586 2023-01-22 13:34:52.463364: step: 290/466, loss: 0.2961283326148987 2023-01-22 13:34:53.209319: step: 
292/466, loss: 0.0695134773850441 2023-01-22 13:34:54.005295: step: 294/466, loss: 0.05912279337644577 2023-01-22 13:34:54.815331: step: 296/466, loss: 0.08277720957994461 2023-01-22 13:34:55.591329: step: 298/466, loss: 0.08698936551809311 2023-01-22 13:34:56.365527: step: 300/466, loss: 0.18971477448940277 2023-01-22 13:34:57.123705: step: 302/466, loss: 0.06840641796588898 2023-01-22 13:34:57.825908: step: 304/466, loss: 0.08240868151187897 2023-01-22 13:34:58.647461: step: 306/466, loss: 0.19505003094673157 2023-01-22 13:34:59.509666: step: 308/466, loss: 0.04841731861233711 2023-01-22 13:35:00.280025: step: 310/466, loss: 0.09289427101612091 2023-01-22 13:35:00.991749: step: 312/466, loss: 10.075116157531738 2023-01-22 13:35:01.777307: step: 314/466, loss: 0.2554793357849121 2023-01-22 13:35:02.626558: step: 316/466, loss: 0.19938571751117706 2023-01-22 13:35:03.355452: step: 318/466, loss: 0.05021106079220772 2023-01-22 13:35:04.070320: step: 320/466, loss: 0.03854874148964882 2023-01-22 13:35:04.833789: step: 322/466, loss: 0.08314540982246399 2023-01-22 13:35:05.581055: step: 324/466, loss: 0.11936472356319427 2023-01-22 13:35:06.321183: step: 326/466, loss: 0.15186713635921478 2023-01-22 13:35:07.096473: step: 328/466, loss: 0.07621181011199951 2023-01-22 13:35:07.828495: step: 330/466, loss: 0.016436375677585602 2023-01-22 13:35:08.596761: step: 332/466, loss: 0.14319784939289093 2023-01-22 13:35:09.393338: step: 334/466, loss: 0.07255373150110245 2023-01-22 13:35:10.117126: step: 336/466, loss: 0.0869244858622551 2023-01-22 13:35:10.924540: step: 338/466, loss: 0.06745993345975876 2023-01-22 13:35:11.640205: step: 340/466, loss: 0.10919850319623947 2023-01-22 13:35:12.418176: step: 342/466, loss: 0.20024287700653076 2023-01-22 13:35:13.183374: step: 344/466, loss: 0.5972100496292114 2023-01-22 13:35:13.919515: step: 346/466, loss: 0.08542009443044662 2023-01-22 13:35:14.742605: step: 348/466, loss: 0.07332447171211243 2023-01-22 13:35:15.452928: step: 
350/466, loss: 0.10761795938014984 2023-01-22 13:35:16.199833: step: 352/466, loss: 0.4194304943084717 2023-01-22 13:35:16.946278: step: 354/466, loss: 0.12218812108039856 2023-01-22 13:35:17.797103: step: 356/466, loss: 0.09939373284578323 2023-01-22 13:35:18.556258: step: 358/466, loss: 0.20191505551338196 2023-01-22 13:35:19.369381: step: 360/466, loss: 0.14228050410747528 2023-01-22 13:35:20.197674: step: 362/466, loss: 0.09845307469367981 2023-01-22 13:35:21.052367: step: 364/466, loss: 1.1167224645614624 2023-01-22 13:35:21.830841: step: 366/466, loss: 0.0897873193025589 2023-01-22 13:35:22.608952: step: 368/466, loss: 0.19432386755943298 2023-01-22 13:35:23.370708: step: 370/466, loss: 0.04567892104387283 2023-01-22 13:35:24.138334: step: 372/466, loss: 0.09775812178850174 2023-01-22 13:35:24.817460: step: 374/466, loss: 0.04945585876703262 2023-01-22 13:35:25.582698: step: 376/466, loss: 0.2018977403640747 2023-01-22 13:35:26.451425: step: 378/466, loss: 0.05446924269199371 2023-01-22 13:35:27.183386: step: 380/466, loss: 0.7429501414299011 2023-01-22 13:35:27.956923: step: 382/466, loss: 0.7492738366127014 2023-01-22 13:35:28.699856: step: 384/466, loss: 0.016909055411815643 2023-01-22 13:35:29.530089: step: 386/466, loss: 0.12241260707378387 2023-01-22 13:35:30.331823: step: 388/466, loss: 0.12036903947591782 2023-01-22 13:35:31.151813: step: 390/466, loss: 0.2424386888742447 2023-01-22 13:35:31.886867: step: 392/466, loss: 0.022720765322446823 2023-01-22 13:35:32.634647: step: 394/466, loss: 0.1325964778661728 2023-01-22 13:35:33.353598: step: 396/466, loss: 0.05196783319115639 2023-01-22 13:35:34.118133: step: 398/466, loss: 0.0683765709400177 2023-01-22 13:35:34.861059: step: 400/466, loss: 0.1732609122991562 2023-01-22 13:35:35.681643: step: 402/466, loss: 0.1777394562959671 2023-01-22 13:35:36.413591: step: 404/466, loss: 0.06969244033098221 2023-01-22 13:35:37.150756: step: 406/466, loss: 0.04348913952708244 2023-01-22 13:35:37.942050: step: 
408/466, loss: 0.05620116740465164 2023-01-22 13:35:38.751770: step: 410/466, loss: 0.13990871608257294 2023-01-22 13:35:39.563446: step: 412/466, loss: 0.16890095174312592 2023-01-22 13:35:40.354610: step: 414/466, loss: 0.05791330337524414 2023-01-22 13:35:41.126617: step: 416/466, loss: 0.0948304757475853 2023-01-22 13:35:41.866401: step: 418/466, loss: 0.2736469805240631 2023-01-22 13:35:42.627182: step: 420/466, loss: 0.034205324947834015 2023-01-22 13:35:43.427521: step: 422/466, loss: 0.8601592183113098 2023-01-22 13:35:44.161179: step: 424/466, loss: 0.11530404537916183 2023-01-22 13:35:44.945862: step: 426/466, loss: 0.18276342749595642 2023-01-22 13:35:45.694851: step: 428/466, loss: 0.4518357515335083 2023-01-22 13:35:46.451991: step: 430/466, loss: 0.10191506147384644 2023-01-22 13:35:47.177646: step: 432/466, loss: 0.4678365886211395 2023-01-22 13:35:47.923093: step: 434/466, loss: 0.23182491958141327 2023-01-22 13:35:48.646621: step: 436/466, loss: 0.13566505908966064 2023-01-22 13:35:49.397253: step: 438/466, loss: 0.1481209546327591 2023-01-22 13:35:50.161852: step: 440/466, loss: 0.08209249377250671 2023-01-22 13:35:50.882083: step: 442/466, loss: 0.043845757842063904 2023-01-22 13:35:51.652086: step: 444/466, loss: 0.04230104759335518 2023-01-22 13:35:52.448147: step: 446/466, loss: 0.10302285850048065 2023-01-22 13:35:53.148821: step: 448/466, loss: 0.14680084586143494 2023-01-22 13:35:53.975291: step: 450/466, loss: 0.28017646074295044 2023-01-22 13:35:54.767547: step: 452/466, loss: 0.09054071456193924 2023-01-22 13:35:55.479708: step: 454/466, loss: 0.10634226351976395 2023-01-22 13:35:56.255874: step: 456/466, loss: 0.07919905334711075 2023-01-22 13:35:57.013216: step: 458/466, loss: 0.04398207366466522 2023-01-22 13:35:57.711374: step: 460/466, loss: 0.13439683616161346 2023-01-22 13:35:58.469421: step: 462/466, loss: 0.17098841071128845 2023-01-22 13:35:59.155059: step: 464/466, loss: 0.10975412279367447 2023-01-22 13:35:59.906091: step: 
466/466, loss: 0.19899797439575195 2023-01-22 13:36:00.714520: step: 468/466, loss: 0.23727427423000336 2023-01-22 13:36:01.479279: step: 470/466, loss: 0.04510289058089256 2023-01-22 13:36:02.346750: step: 472/466, loss: 0.8150736093521118 2023-01-22 13:36:03.023180: step: 474/466, loss: 0.09007147699594498 2023-01-22 13:36:03.815410: step: 476/466, loss: 0.6509134769439697 2023-01-22 13:36:04.555801: step: 478/466, loss: 0.07442466914653778 2023-01-22 13:36:05.343694: step: 480/466, loss: 0.0767611488699913 2023-01-22 13:36:06.102257: step: 482/466, loss: 0.036706291139125824 2023-01-22 13:36:06.790419: step: 484/466, loss: 0.0735660120844841 2023-01-22 13:36:07.472230: step: 486/466, loss: 0.10951712727546692 2023-01-22 13:36:08.179918: step: 488/466, loss: 0.8974351286888123 2023-01-22 13:36:08.931236: step: 490/466, loss: 0.05478595569729805 2023-01-22 13:36:09.671262: step: 492/466, loss: 0.10811832547187805 2023-01-22 13:36:10.434683: step: 494/466, loss: 0.09113547205924988 2023-01-22 13:36:11.268877: step: 496/466, loss: 0.043976835906505585 2023-01-22 13:36:12.107192: step: 498/466, loss: 0.33339035511016846 2023-01-22 13:36:12.863226: step: 500/466, loss: 0.08161080628633499 2023-01-22 13:36:13.663485: step: 502/466, loss: 0.04528295621275902 2023-01-22 13:36:14.445266: step: 504/466, loss: 0.07120140641927719 2023-01-22 13:36:15.179300: step: 506/466, loss: 0.11000839620828629 2023-01-22 13:36:15.950287: step: 508/466, loss: 0.1413177251815796 2023-01-22 13:36:16.720126: step: 510/466, loss: 0.27769792079925537 2023-01-22 13:36:17.579155: step: 512/466, loss: 0.03286886215209961 2023-01-22 13:36:18.326004: step: 514/466, loss: 0.1395481377840042 2023-01-22 13:36:19.144287: step: 516/466, loss: 0.14255927503108978 2023-01-22 13:36:19.935934: step: 518/466, loss: 0.10735977441072464 2023-01-22 13:36:20.727972: step: 520/466, loss: 0.4797162711620331 2023-01-22 13:36:21.484374: step: 522/466, loss: 0.12145961076021194 2023-01-22 13:36:22.256834: step: 
524/466, loss: 0.05048738792538643 2023-01-22 13:36:23.002454: step: 526/466, loss: 0.14986084401607513 2023-01-22 13:36:23.794374: step: 528/466, loss: 1.2179806232452393 2023-01-22 13:36:24.575967: step: 530/466, loss: 0.482911616563797 2023-01-22 13:36:25.432899: step: 532/466, loss: 0.2774769961833954 2023-01-22 13:36:26.212688: step: 534/466, loss: 0.0876186266541481 2023-01-22 13:36:27.014470: step: 536/466, loss: 0.7969873547554016 2023-01-22 13:36:27.753470: step: 538/466, loss: 0.33211031556129456 2023-01-22 13:36:28.462579: step: 540/466, loss: 0.11518576741218567 2023-01-22 13:36:29.245571: step: 542/466, loss: 0.07682258635759354 2023-01-22 13:36:30.008964: step: 544/466, loss: 0.03671961650252342 2023-01-22 13:36:30.705580: step: 546/466, loss: 0.26386862993240356 2023-01-22 13:36:31.545427: step: 548/466, loss: 0.10726254433393478 2023-01-22 13:36:32.368244: step: 550/466, loss: 0.11239303648471832 2023-01-22 13:36:33.308833: step: 552/466, loss: 0.04918292164802551 2023-01-22 13:36:34.082137: step: 554/466, loss: 0.21519367396831512 2023-01-22 13:36:34.909863: step: 556/466, loss: 0.17288942635059357 2023-01-22 13:36:35.653037: step: 558/466, loss: 0.04853741452097893 2023-01-22 13:36:36.421958: step: 560/466, loss: 0.4925331473350525 2023-01-22 13:36:37.183045: step: 562/466, loss: 1.027655005455017 2023-01-22 13:36:37.907984: step: 564/466, loss: 0.045533567667007446 2023-01-22 13:36:38.756599: step: 566/466, loss: 0.05125496909022331 2023-01-22 13:36:39.554615: step: 568/466, loss: 0.13714663684368134 2023-01-22 13:36:40.307707: step: 570/466, loss: 0.14521551132202148 2023-01-22 13:36:41.016815: step: 572/466, loss: 0.06423871219158173 2023-01-22 13:36:41.908935: step: 574/466, loss: 0.09547813981771469 2023-01-22 13:36:42.624295: step: 576/466, loss: 0.1754569262266159 2023-01-22 13:36:43.443977: step: 578/466, loss: 0.14660441875457764 2023-01-22 13:36:44.176328: step: 580/466, loss: 0.04251260310411453 2023-01-22 13:36:44.907254: step: 
582/466, loss: 0.0846245139837265 2023-01-22 13:36:45.683694: step: 584/466, loss: 0.25155988335609436 2023-01-22 13:36:46.483444: step: 586/466, loss: 0.0799722746014595 2023-01-22 13:36:47.344618: step: 588/466, loss: 0.2427946925163269 2023-01-22 13:36:48.121160: step: 590/466, loss: 0.08180870860815048 2023-01-22 13:36:48.842030: step: 592/466, loss: 0.20674677193164825 2023-01-22 13:36:49.609177: step: 594/466, loss: 0.1549101173877716 2023-01-22 13:36:50.385031: step: 596/466, loss: 0.13579262793064117 2023-01-22 13:36:51.119207: step: 598/466, loss: 0.04477398842573166 2023-01-22 13:36:51.863120: step: 600/466, loss: 0.0795651227235794 2023-01-22 13:36:52.622864: step: 602/466, loss: 0.07534074783325195 2023-01-22 13:36:53.345345: step: 604/466, loss: 0.12016644328832626 2023-01-22 13:36:54.209694: step: 606/466, loss: 0.2995753288269043 2023-01-22 13:36:54.969435: step: 608/466, loss: 0.09187112748622894 2023-01-22 13:36:55.708544: step: 610/466, loss: 0.09526897221803665 2023-01-22 13:36:56.420162: step: 612/466, loss: 0.11181322485208511 2023-01-22 13:36:57.238760: step: 614/466, loss: 0.05556326359510422 2023-01-22 13:36:58.040592: step: 616/466, loss: 0.23775111138820648 2023-01-22 13:36:58.812883: step: 618/466, loss: 0.9687085747718811 2023-01-22 13:36:59.647523: step: 620/466, loss: 0.5194430947303772 2023-01-22 13:37:00.422395: step: 622/466, loss: 0.22122785449028015 2023-01-22 13:37:01.206326: step: 624/466, loss: 0.10294534265995026 2023-01-22 13:37:01.976679: step: 626/466, loss: 0.7381730675697327 2023-01-22 13:37:02.715653: step: 628/466, loss: 0.09413321316242218 2023-01-22 13:37:03.446859: step: 630/466, loss: 0.05440503731369972 2023-01-22 13:37:04.176215: step: 632/466, loss: 0.0769491195678711 2023-01-22 13:37:04.979902: step: 634/466, loss: 0.11340983211994171 2023-01-22 13:37:05.778028: step: 636/466, loss: 0.11693871766328812 2023-01-22 13:37:06.446844: step: 638/466, loss: 0.14776034653186798 2023-01-22 13:37:07.354997: step: 640/466, 
loss: 0.1705910563468933 2023-01-22 13:37:08.098119: step: 642/466, loss: 0.037550777196884155 2023-01-22 13:37:08.775377: step: 644/466, loss: 0.0886608213186264 2023-01-22 13:37:09.582563: step: 646/466, loss: 0.11776689440011978 2023-01-22 13:37:10.383138: step: 648/466, loss: 0.0356062576174736 2023-01-22 13:37:11.185027: step: 650/466, loss: 0.05151690915226936 2023-01-22 13:37:12.033568: step: 652/466, loss: 0.2012919932603836 2023-01-22 13:37:12.806743: step: 654/466, loss: 0.09088915586471558 2023-01-22 13:37:13.529442: step: 656/466, loss: 0.06062782183289528 2023-01-22 13:37:14.340486: step: 658/466, loss: 0.09774986654520035 2023-01-22 13:37:15.106184: step: 660/466, loss: 0.15430264174938202 2023-01-22 13:37:15.847751: step: 662/466, loss: 0.09477879106998444 2023-01-22 13:37:16.636510: step: 664/466, loss: 0.14804497361183167 2023-01-22 13:37:17.404595: step: 666/466, loss: 0.2251739799976349 2023-01-22 13:37:18.138104: step: 668/466, loss: 0.029697343707084656 2023-01-22 13:37:18.914358: step: 670/466, loss: 0.05504726991057396 2023-01-22 13:37:19.688819: step: 672/466, loss: 0.06437207758426666 2023-01-22 13:37:20.499381: step: 674/466, loss: 0.06237734109163284 2023-01-22 13:37:21.235229: step: 676/466, loss: 0.21999919414520264 2023-01-22 13:37:21.881074: step: 678/466, loss: 0.062139302492141724 2023-01-22 13:37:22.724917: step: 680/466, loss: 0.11696934700012207 2023-01-22 13:37:23.515230: step: 682/466, loss: 0.06388358771800995 2023-01-22 13:37:24.344665: step: 684/466, loss: 0.06830720603466034 2023-01-22 13:37:25.078602: step: 686/466, loss: 0.04317271709442139 2023-01-22 13:37:25.802966: step: 688/466, loss: 0.05816539004445076 2023-01-22 13:37:26.621546: step: 690/466, loss: 0.033409614115953445 2023-01-22 13:37:27.394034: step: 692/466, loss: 0.20011916756629944 2023-01-22 13:37:28.098437: step: 694/466, loss: 0.14553718268871307 2023-01-22 13:37:28.876117: step: 696/466, loss: 0.15632416307926178 2023-01-22 13:37:29.594234: step: 698/466, 
loss: 0.07469306886196136 2023-01-22 13:37:30.381014: step: 700/466, loss: 0.14653725922107697 2023-01-22 13:37:31.101859: step: 702/466, loss: 0.10972332954406738 2023-01-22 13:37:31.866724: step: 704/466, loss: 0.16227854788303375 2023-01-22 13:37:32.653043: step: 706/466, loss: 0.02626815065741539 2023-01-22 13:37:33.408683: step: 708/466, loss: 0.06898698210716248 2023-01-22 13:37:34.128762: step: 710/466, loss: 0.0700167715549469 2023-01-22 13:37:34.879255: step: 712/466, loss: 0.07130112498998642 2023-01-22 13:37:35.685702: step: 714/466, loss: 0.06511213630437851 2023-01-22 13:37:36.387202: step: 716/466, loss: 0.06653962284326553 2023-01-22 13:37:37.065134: step: 718/466, loss: 0.08015184104442596 2023-01-22 13:37:37.876314: step: 720/466, loss: 0.06690537929534912 2023-01-22 13:37:38.602368: step: 722/466, loss: 0.04099201411008835 2023-01-22 13:37:39.354375: step: 724/466, loss: 0.06964768469333649 2023-01-22 13:37:40.112868: step: 726/466, loss: 0.2873595356941223 2023-01-22 13:37:40.812352: step: 728/466, loss: 0.12969258427619934 2023-01-22 13:37:41.625951: step: 730/466, loss: 0.12038178741931915 2023-01-22 13:37:42.337948: step: 732/466, loss: 0.09994393587112427 2023-01-22 13:37:43.125795: step: 734/466, loss: 0.04484577104449272 2023-01-22 13:37:43.854550: step: 736/466, loss: 0.06055283918976784 2023-01-22 13:37:44.564671: step: 738/466, loss: 0.05349590629339218 2023-01-22 13:37:45.320940: step: 740/466, loss: 0.10076868534088135 2023-01-22 13:37:46.123628: step: 742/466, loss: 0.10146256536245346 2023-01-22 13:37:46.865286: step: 744/466, loss: 0.41469091176986694 2023-01-22 13:37:47.601203: step: 746/466, loss: 0.13279423117637634 2023-01-22 13:37:48.440397: step: 748/466, loss: 0.10934709012508392 2023-01-22 13:37:49.213660: step: 750/466, loss: 0.012673179619014263 2023-01-22 13:37:49.933561: step: 752/466, loss: 0.05210401490330696 2023-01-22 13:37:50.747504: step: 754/466, loss: 0.13793087005615234 2023-01-22 13:37:51.619385: step: 756/466, 
loss: 0.11023826152086258 2023-01-22 13:37:52.535025: step: 758/466, loss: 0.17428648471832275 2023-01-22 13:37:53.201954: step: 760/466, loss: 0.026478417217731476 2023-01-22 13:37:53.934120: step: 762/466, loss: 0.08447739481925964 2023-01-22 13:37:54.742080: step: 764/466, loss: 0.06560572981834412 2023-01-22 13:37:55.516183: step: 766/466, loss: 0.07490119338035583 2023-01-22 13:37:56.302545: step: 768/466, loss: 0.11548873037099838 2023-01-22 13:37:57.048126: step: 770/466, loss: 0.09563510119915009 2023-01-22 13:37:57.853382: step: 772/466, loss: 0.09963957965373993 2023-01-22 13:37:58.674315: step: 774/466, loss: 0.16085292398929596 2023-01-22 13:37:59.490496: step: 776/466, loss: 0.10022449493408203 2023-01-22 13:38:00.219672: step: 778/466, loss: 0.06054616719484329 2023-01-22 13:38:00.906657: step: 780/466, loss: 1.848663091659546 2023-01-22 13:38:01.661744: step: 782/466, loss: 0.1872957944869995 2023-01-22 13:38:02.553283: step: 784/466, loss: 0.05718646198511124 2023-01-22 13:38:03.336065: step: 786/466, loss: 0.09180223941802979 2023-01-22 13:38:04.188770: step: 788/466, loss: 0.3840520679950714 2023-01-22 13:38:04.958310: step: 790/466, loss: 0.09134998917579651 2023-01-22 13:38:05.774189: step: 792/466, loss: 0.05993478000164032 2023-01-22 13:38:06.614982: step: 794/466, loss: 0.14188416302204132 2023-01-22 13:38:07.380173: step: 796/466, loss: 0.02756846323609352 2023-01-22 13:38:08.145187: step: 798/466, loss: 0.04655991867184639 2023-01-22 13:38:08.839912: step: 800/466, loss: 0.02538241073489189 2023-01-22 13:38:09.668225: step: 802/466, loss: 0.14205986261367798 2023-01-22 13:38:10.409315: step: 804/466, loss: 0.24629098176956177 2023-01-22 13:38:11.099950: step: 806/466, loss: 0.25697317719459534 2023-01-22 13:38:11.884817: step: 808/466, loss: 0.2990754246711731 2023-01-22 13:38:12.622890: step: 810/466, loss: 0.042557764798402786 2023-01-22 13:38:13.403968: step: 812/466, loss: 0.3225131332874298 2023-01-22 13:38:14.131478: step: 814/466, 
loss: 0.027747908607125282 2023-01-22 13:38:14.873559: step: 816/466, loss: 6.1168413162231445 2023-01-22 13:38:15.613210: step: 818/466, loss: 0.0687987357378006 2023-01-22 13:38:16.404866: step: 820/466, loss: 0.09931518882513046 2023-01-22 13:38:17.230042: step: 822/466, loss: 0.07825738191604614 2023-01-22 13:38:18.017380: step: 824/466, loss: 0.08366437256336212 2023-01-22 13:38:18.802774: step: 826/466, loss: 0.07200953364372253 2023-01-22 13:38:19.554604: step: 828/466, loss: 0.07085266709327698 2023-01-22 13:38:20.468563: step: 830/466, loss: 0.014983849599957466 2023-01-22 13:38:21.273473: step: 832/466, loss: 0.2933443486690521 2023-01-22 13:38:22.023491: step: 834/466, loss: 0.122842937707901 2023-01-22 13:38:22.851025: step: 836/466, loss: 0.16995428502559662 2023-01-22 13:38:23.598028: step: 838/466, loss: 0.06800254434347153 2023-01-22 13:38:24.375470: step: 840/466, loss: 0.03221948444843292 2023-01-22 13:38:25.035425: step: 842/466, loss: 0.4852493107318878 2023-01-22 13:38:25.779371: step: 844/466, loss: 0.28303274512290955 2023-01-22 13:38:26.572502: step: 846/466, loss: 0.24812698364257812 2023-01-22 13:38:27.335470: step: 848/466, loss: 0.020885517820715904 2023-01-22 13:38:28.067432: step: 850/466, loss: 0.3325272500514984 2023-01-22 13:38:28.804628: step: 852/466, loss: 0.09081543236970901 2023-01-22 13:38:29.623068: step: 854/466, loss: 0.0565243735909462 2023-01-22 13:38:30.364543: step: 856/466, loss: 0.1972397118806839 2023-01-22 13:38:31.079484: step: 858/466, loss: 0.11067622900009155 2023-01-22 13:38:31.912600: step: 860/466, loss: 0.027421843260526657 2023-01-22 13:38:32.652876: step: 862/466, loss: 0.060354817658662796 2023-01-22 13:38:33.403737: step: 864/466, loss: 0.1397075653076172 2023-01-22 13:38:34.153874: step: 866/466, loss: 0.05992557108402252 2023-01-22 13:38:34.847881: step: 868/466, loss: 0.00546844070777297 2023-01-22 13:38:35.668865: step: 870/466, loss: 0.07609057426452637 2023-01-22 13:38:36.395014: step: 872/466, 
loss: 0.05730225145816803 2023-01-22 13:38:37.162649: step: 874/466, loss: 0.10852661728858948 2023-01-22 13:38:38.029222: step: 876/466, loss: 0.09243988990783691 2023-01-22 13:38:38.804014: step: 878/466, loss: 0.026159387081861496 2023-01-22 13:38:39.613909: step: 880/466, loss: 0.17023548483848572 2023-01-22 13:38:40.411177: step: 882/466, loss: 0.15736925601959229 2023-01-22 13:38:41.219398: step: 884/466, loss: 0.7499971985816956 2023-01-22 13:38:41.900229: step: 886/466, loss: 0.1022542342543602 2023-01-22 13:38:42.621124: step: 888/466, loss: 0.15656821429729462 2023-01-22 13:38:43.455413: step: 890/466, loss: 0.11236383765935898 2023-01-22 13:38:44.171545: step: 892/466, loss: 0.1907481998205185 2023-01-22 13:38:44.883137: step: 894/466, loss: 1.1647937297821045 2023-01-22 13:38:45.627011: step: 896/466, loss: 0.031657878309488297 2023-01-22 13:38:46.434025: step: 898/466, loss: 0.07947023212909698 2023-01-22 13:38:47.213795: step: 900/466, loss: 0.12622502446174622 2023-01-22 13:38:47.914274: step: 902/466, loss: 0.19517450034618378 2023-01-22 13:38:48.756695: step: 904/466, loss: 0.024985037744045258 2023-01-22 13:38:49.484741: step: 906/466, loss: 0.4478173851966858 2023-01-22 13:38:50.175847: step: 908/466, loss: 0.09397806227207184 2023-01-22 13:38:50.964551: step: 910/466, loss: 0.3781964182853699 2023-01-22 13:38:51.816303: step: 912/466, loss: 0.08397955447435379 2023-01-22 13:38:52.592722: step: 914/466, loss: 0.05961551517248154 2023-01-22 13:38:53.365299: step: 916/466, loss: 0.08746632933616638 2023-01-22 13:38:54.067177: step: 918/466, loss: 0.03498067334294319 2023-01-22 13:38:54.848356: step: 920/466, loss: 0.020834237337112427 2023-01-22 13:38:55.595607: step: 922/466, loss: 0.15319500863552094 2023-01-22 13:38:56.345034: step: 924/466, loss: 0.3376273512840271 2023-01-22 13:38:57.126024: step: 926/466, loss: 0.12826432287693024 2023-01-22 13:38:57.875515: step: 928/466, loss: 0.18537111580371857 2023-01-22 13:38:58.704135: step: 930/466, 
loss: 0.12460990250110626 2023-01-22 13:38:59.490238: step: 932/466, loss: 0.07324258983135223
==================================================
Loss: 0.187
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2937074161425577, 'r': 0.35445524984187227, 'f1': 0.32123459443966756}, 'combined': 0.23669917485028136, 'epoch': 15}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3367341560703846, 'r': 0.30288566204597855, 'f1': 0.3189142828476818}, 'combined': 0.1960156079941849, 'epoch': 15}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2713159689987903, 'r': 0.35883724932098077, 'f1': 0.30899874247084447}, 'combined': 0.22768328392588538, 'epoch': 15}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3217764401742104, 'r': 0.31480554675622496, 'f1': 0.3182528260680539}, 'combined': 0.19560905407109652, 'epoch': 15}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30557476794204214, 'r': 0.3658779479533749, 'f1': 0.3330184431285468}, 'combined': 0.24538201072629762, 'epoch': 15}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3429815446149785, 'r': 0.3073162193517225, 'f1': 0.3241708566105007}, 'combined': 0.2002231761417799, 'epoch': 15}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22877358490566038, 'r': 0.3464285714285714, 'f1': 0.2755681818181818}, 'combined': 0.18371212121212122, 'epoch': 15}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28, 'r': 0.6086956521739131, 'f1': 0.3835616438356165}, 'combined': 0.19178082191780824, 'epoch': 15}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 15}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3148516147989166, 'r': 0.35428274682306937, 'f1': 0.33340537067099557}, 'combined': 0.2456671152312599, 'epoch': 11}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34260585369857516, 'r': 0.29629171749668803, 'f1': 0.31777011337470074}, 'combined': 0.19531236236688923, 'epoch': 11}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.275, 'r': 0.3535714285714286, 'f1': 0.309375}, 'combined': 0.20625, 'epoch': 11}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 16
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:41:45.852033: step: 2/466, loss: 0.0018624652875587344 2023-01-22 13:41:46.571839: step: 4/466, loss: 0.14262737333774567 2023-01-22 13:41:47.385343: step: 6/466, loss: 0.12517815828323364 2023-01-22 13:41:48.167174: step: 8/466, loss: 0.07878816872835159 2023-01-22 13:41:48.880147: step: 10/466, loss: 0.6276178359985352 2023-01-22 13:41:49.670745: step: 12/466, loss: 0.0350370928645134 2023-01-22 13:41:50.409981: step: 14/466, loss: 0.1565471738576889 2023-01-22 13:41:51.146058: step: 16/466, loss: 0.1140841469168663 2023-01-22 13:41:51.880634: step: 18/466, loss: 0.05684506148099899 2023-01-22 13:41:52.590661: step: 20/466, loss: 0.20923495292663574 2023-01-22 13:41:53.357961: step: 22/466, loss: 0.08959781378507614 2023-01-22 13:41:54.055866: step: 24/466, loss: 0.08080139756202698 2023-01-22 13:41:54.772694: step: 26/466, loss: 0.03367632254958153 2023-01-22 13:41:55.506201: step: 28/466, loss: 0.06738610565662384 2023-01-22 13:41:56.274497: step: 30/466, loss: 0.05645789951086044 2023-01-22 13:41:57.067135: step: 32/466, loss: 0.06814518570899963 2023-01-22 13:41:57.870656: step: 34/466, loss: 0.03460216894745827 2023-01-22 13:41:58.594494: step: 36/466, loss: 0.015818607062101364 2023-01-22 13:41:59.291731: step: 38/466, loss: 0.19628256559371948 2023-01-22 13:42:00.042060: step: 40/466, loss: 0.1271037459373474 2023-01-22 13:42:00.752602: step: 42/466, loss: 0.02907964028418064 2023-01-22 13:42:01.451664: step: 44/466, 
loss: 1.326354742050171 2023-01-22 13:42:02.230031: step: 46/466, loss: 0.10550469160079956 2023-01-22 13:42:03.033838: step: 48/466, loss: 0.09840810298919678 2023-01-22 13:42:03.905170: step: 50/466, loss: 0.037362001836299896 2023-01-22 13:42:04.739449: step: 52/466, loss: 0.028449734672904015 2023-01-22 13:42:05.536660: step: 54/466, loss: 0.007634575013071299 2023-01-22 13:42:06.181569: step: 56/466, loss: 0.06437010318040848 2023-01-22 13:42:06.959321: step: 58/466, loss: 0.05331692099571228 2023-01-22 13:42:07.779994: step: 60/466, loss: 0.044606491923332214 2023-01-22 13:42:08.541925: step: 62/466, loss: 0.03616861253976822 2023-01-22 13:42:09.342326: step: 64/466, loss: 0.09568973630666733 2023-01-22 13:42:10.140774: step: 66/466, loss: 0.08516858518123627 2023-01-22 13:42:10.965438: step: 68/466, loss: 0.06501974165439606 2023-01-22 13:42:11.742217: step: 70/466, loss: 0.10325935482978821 2023-01-22 13:42:12.521337: step: 72/466, loss: 0.09397002309560776 2023-01-22 13:42:13.308609: step: 74/466, loss: 0.016955891624093056 2023-01-22 13:42:14.036221: step: 76/466, loss: 0.07174551486968994 2023-01-22 13:42:14.845028: step: 78/466, loss: 0.08200935274362564 2023-01-22 13:42:15.579946: step: 80/466, loss: 0.02655804343521595 2023-01-22 13:42:16.411581: step: 82/466, loss: 0.13291595876216888 2023-01-22 13:42:17.178592: step: 84/466, loss: 0.06990199536085129 2023-01-22 13:42:17.920804: step: 86/466, loss: 0.08203735202550888 2023-01-22 13:42:18.697857: step: 88/466, loss: 0.07973272353410721 2023-01-22 13:42:19.402944: step: 90/466, loss: 0.0974380373954773 2023-01-22 13:42:20.277301: step: 92/466, loss: 0.053154356777668 2023-01-22 13:42:21.007506: step: 94/466, loss: 0.14800702035427094 2023-01-22 13:42:21.672651: step: 96/466, loss: 0.07176072895526886 2023-01-22 13:42:22.487559: step: 98/466, loss: 0.055861346423625946 2023-01-22 13:42:23.240467: step: 100/466, loss: 0.3699154257774353 2023-01-22 13:42:23.903828: step: 102/466, loss: 0.12106865644454956 
2023-01-22 13:42:24.599903: step: 104/466, loss: 0.0461328960955143 2023-01-22 13:42:25.355304: step: 106/466, loss: 0.060555242002010345 2023-01-22 13:42:26.113255: step: 108/466, loss: 0.39660879969596863 2023-01-22 13:42:26.830274: step: 110/466, loss: 0.027628857642412186 2023-01-22 13:42:27.601888: step: 112/466, loss: 0.05876636505126953 2023-01-22 13:42:28.379785: step: 114/466, loss: 0.01485330518335104 2023-01-22 13:42:29.150838: step: 116/466, loss: 0.06990094482898712 2023-01-22 13:42:29.921475: step: 118/466, loss: 0.09951207041740417 2023-01-22 13:42:30.809538: step: 120/466, loss: 0.05669524893164635 2023-01-22 13:42:31.559206: step: 122/466, loss: 0.11176761239767075 2023-01-22 13:42:32.315722: step: 124/466, loss: 0.14566385746002197 2023-01-22 13:42:33.091625: step: 126/466, loss: 0.016346879303455353 2023-01-22 13:42:33.900447: step: 128/466, loss: 0.2126549333333969 2023-01-22 13:42:34.583709: step: 130/466, loss: 0.07509668916463852 2023-01-22 13:42:35.334139: step: 132/466, loss: 0.1204756647348404 2023-01-22 13:42:36.097101: step: 134/466, loss: 0.0901917964220047 2023-01-22 13:42:36.900822: step: 136/466, loss: 0.049818042665719986 2023-01-22 13:42:37.716243: step: 138/466, loss: 0.05061956122517586 2023-01-22 13:42:38.472457: step: 140/466, loss: 0.06672712415456772 2023-01-22 13:42:39.265634: step: 142/466, loss: 0.07314980030059814 2023-01-22 13:42:39.968840: step: 144/466, loss: 0.0676993653178215 2023-01-22 13:42:40.766383: step: 146/466, loss: 0.09283817559480667 2023-01-22 13:42:41.547339: step: 148/466, loss: 0.08164266496896744 2023-01-22 13:42:42.293296: step: 150/466, loss: 0.07620931416749954 2023-01-22 13:42:43.259696: step: 152/466, loss: 0.12517696619033813 2023-01-22 13:42:44.055666: step: 154/466, loss: 0.03965863585472107 2023-01-22 13:42:44.847698: step: 156/466, loss: 0.13473178446292877 2023-01-22 13:42:45.690015: step: 158/466, loss: 0.02908577024936676 2023-01-22 13:42:46.481238: step: 160/466, loss: 0.12194164097309113 
2023-01-22 13:42:47.304202: step: 162/466, loss: 0.11548975110054016 2023-01-22 13:42:48.113007: step: 164/466, loss: 0.10065381228923798 2023-01-22 13:42:48.844866: step: 166/466, loss: 0.1245603933930397 2023-01-22 13:42:49.522232: step: 168/466, loss: 0.1796674132347107 2023-01-22 13:42:50.285217: step: 170/466, loss: 0.09850645065307617 2023-01-22 13:42:51.129333: step: 172/466, loss: 0.08459986746311188 2023-01-22 13:42:51.916679: step: 174/466, loss: 0.005720966961234808 2023-01-22 13:42:52.623460: step: 176/466, loss: 0.22306662797927856 2023-01-22 13:42:53.467111: step: 178/466, loss: 0.07315231114625931 2023-01-22 13:42:54.225921: step: 180/466, loss: 0.10266924649477005 2023-01-22 13:42:54.915303: step: 182/466, loss: 0.02916317619383335 2023-01-22 13:42:55.647478: step: 184/466, loss: 0.03219641372561455 2023-01-22 13:42:56.387470: step: 186/466, loss: 0.05558239668607712 2023-01-22 13:42:57.173212: step: 188/466, loss: 0.07634638249874115 2023-01-22 13:42:57.974411: step: 190/466, loss: 0.19770660996437073 2023-01-22 13:42:58.684125: step: 192/466, loss: 0.18504159152507782 2023-01-22 13:42:59.400822: step: 194/466, loss: 0.07066548615694046 2023-01-22 13:43:00.152817: step: 196/466, loss: 0.05562664568424225 2023-01-22 13:43:00.923029: step: 198/466, loss: 0.14733237028121948 2023-01-22 13:43:01.669603: step: 200/466, loss: 0.12928493320941925 2023-01-22 13:43:02.421866: step: 202/466, loss: 0.056578729301691055 2023-01-22 13:43:03.235859: step: 204/466, loss: 0.05881396308541298 2023-01-22 13:43:04.028674: step: 206/466, loss: 0.19485962390899658 2023-01-22 13:43:04.881017: step: 208/466, loss: 0.04371248558163643 2023-01-22 13:43:05.626315: step: 210/466, loss: 0.04043707624077797 2023-01-22 13:43:06.333432: step: 212/466, loss: 0.0322677306830883 2023-01-22 13:43:07.052706: step: 214/466, loss: 0.04926469177007675 2023-01-22 13:43:07.821114: step: 216/466, loss: 0.06370534747838974 2023-01-22 13:43:08.583248: step: 218/466, loss: 
0.017677977681159973 2023-01-22 13:43:09.377174: step: 220/466, loss: 0.05885789915919304 2023-01-22 13:43:10.198581: step: 222/466, loss: 0.07734831422567368 2023-01-22 13:43:10.940299: step: 224/466, loss: 0.1633378267288208 2023-01-22 13:43:11.667504: step: 226/466, loss: 0.12721040844917297 2023-01-22 13:43:12.468033: step: 228/466, loss: 0.07709307223558426 2023-01-22 13:43:13.264884: step: 230/466, loss: 0.06658844649791718 2023-01-22 13:43:13.988382: step: 232/466, loss: 0.06309105455875397 2023-01-22 13:43:14.822861: step: 234/466, loss: 0.5354830026626587 2023-01-22 13:43:15.592695: step: 236/466, loss: 0.0737801343202591 2023-01-22 13:43:16.487979: step: 238/466, loss: 0.04832407832145691 2023-01-22 13:43:17.203596: step: 240/466, loss: 0.028802694752812386 2023-01-22 13:43:17.944852: step: 242/466, loss: 0.32804444432258606 2023-01-22 13:43:18.733485: step: 244/466, loss: 0.1064852774143219 2023-01-22 13:43:19.523676: step: 246/466, loss: 0.10216860473155975 2023-01-22 13:43:20.258395: step: 248/466, loss: 0.022415174171328545 2023-01-22 13:43:21.067151: step: 250/466, loss: 0.3452892005443573 2023-01-22 13:43:21.853081: step: 252/466, loss: 0.13802990317344666 2023-01-22 13:43:22.605140: step: 254/466, loss: 0.09531251341104507 2023-01-22 13:43:23.359431: step: 256/466, loss: 0.08211075514554977 2023-01-22 13:43:24.133508: step: 258/466, loss: 0.08147446811199188 2023-01-22 13:43:24.890541: step: 260/466, loss: 0.11451318114995956 2023-01-22 13:43:25.622251: step: 262/466, loss: 0.39439642429351807 2023-01-22 13:43:26.455511: step: 264/466, loss: 0.19546227157115936 2023-01-22 13:43:27.235038: step: 266/466, loss: 0.036694370210170746 2023-01-22 13:43:28.028888: step: 268/466, loss: 0.038847874850034714 2023-01-22 13:43:28.772197: step: 270/466, loss: 0.12380710244178772 2023-01-22 13:43:29.607365: step: 272/466, loss: 0.2092113494873047 2023-01-22 13:43:30.439604: step: 274/466, loss: 0.2461710274219513 2023-01-22 13:43:31.165055: step: 276/466, loss: 
0.19365479052066803 2023-01-22 13:43:31.933336: step: 278/466, loss: 0.07007157057523727 2023-01-22 13:43:32.688870: step: 280/466, loss: 0.029620543122291565 2023-01-22 13:43:33.498855: step: 282/466, loss: 0.21002694964408875 2023-01-22 13:43:34.296806: step: 284/466, loss: 0.1598701775074005 2023-01-22 13:43:35.001791: step: 286/466, loss: 1.290482997894287 2023-01-22 13:43:35.806362: step: 288/466, loss: 0.06177673488855362 2023-01-22 13:43:36.484695: step: 290/466, loss: 0.12041433155536652 2023-01-22 13:43:37.259172: step: 292/466, loss: 0.4185667037963867 2023-01-22 13:43:38.180069: step: 294/466, loss: 0.1689419448375702 2023-01-22 13:43:38.923964: step: 296/466, loss: 0.14349795877933502 2023-01-22 13:43:39.613976: step: 298/466, loss: 0.10513242334127426 2023-01-22 13:43:40.385014: step: 300/466, loss: 0.2011110484600067 2023-01-22 13:43:41.186679: step: 302/466, loss: 0.04485170543193817 2023-01-22 13:43:42.045196: step: 304/466, loss: 0.035167694091796875 2023-01-22 13:43:42.821661: step: 306/466, loss: 0.00759429857134819 2023-01-22 13:43:43.569000: step: 308/466, loss: 0.31545257568359375 2023-01-22 13:43:44.279877: step: 310/466, loss: 0.09402605146169662 2023-01-22 13:43:45.000786: step: 312/466, loss: 0.07007033377885818 2023-01-22 13:43:45.782243: step: 314/466, loss: 0.06480089575052261 2023-01-22 13:43:46.572762: step: 316/466, loss: 0.2533351182937622 2023-01-22 13:43:47.343376: step: 318/466, loss: 0.3919064998626709 2023-01-22 13:43:48.028003: step: 320/466, loss: 0.13647937774658203 2023-01-22 13:43:48.782634: step: 322/466, loss: 0.12979894876480103 2023-01-22 13:43:49.598636: step: 324/466, loss: 0.24850276112556458 2023-01-22 13:43:50.404022: step: 326/466, loss: 0.06813384592533112 2023-01-22 13:43:51.193666: step: 328/466, loss: 0.2158774584531784 2023-01-22 13:43:51.915318: step: 330/466, loss: 0.0738043487071991 2023-01-22 13:43:52.690507: step: 332/466, loss: 0.1032731756567955 2023-01-22 13:43:53.484047: step: 334/466, loss: 
0.24884411692619324 2023-01-22 13:43:54.258597: step: 336/466, loss: 0.10134509950876236 2023-01-22 13:43:55.087497: step: 338/466, loss: 0.10196374356746674 2023-01-22 13:43:55.818482: step: 340/466, loss: 0.10488557815551758 2023-01-22 13:43:56.562646: step: 342/466, loss: 0.019537772983312607 2023-01-22 13:43:57.353178: step: 344/466, loss: 0.12553223967552185 2023-01-22 13:43:58.028601: step: 346/466, loss: 0.035885635763406754 2023-01-22 13:43:58.814467: step: 348/466, loss: 0.044970910996198654 2023-01-22 13:43:59.556094: step: 350/466, loss: 0.056709982454776764 2023-01-22 13:44:00.376020: step: 352/466, loss: 0.11772741377353668 2023-01-22 13:44:01.132868: step: 354/466, loss: 0.5492798089981079 2023-01-22 13:44:02.030055: step: 356/466, loss: 0.016146494075655937 2023-01-22 13:44:02.784611: step: 358/466, loss: 0.21240819990634918 2023-01-22 13:44:03.533190: step: 360/466, loss: 0.12465915828943253 2023-01-22 13:44:04.345760: step: 362/466, loss: 0.7712176442146301 2023-01-22 13:44:05.088476: step: 364/466, loss: 0.13183462619781494 2023-01-22 13:44:05.801579: step: 366/466, loss: 0.10105898231267929 2023-01-22 13:44:06.585745: step: 368/466, loss: 0.015195309184491634 2023-01-22 13:44:07.369806: step: 370/466, loss: 0.04254454746842384 2023-01-22 13:44:08.114194: step: 372/466, loss: 0.044255319982767105 2023-01-22 13:44:08.912110: step: 374/466, loss: 0.08767160028219223 2023-01-22 13:44:09.693955: step: 376/466, loss: 0.1139039471745491 2023-01-22 13:44:10.386791: step: 378/466, loss: 0.23791880905628204 2023-01-22 13:44:11.120832: step: 380/466, loss: 0.05610283091664314 2023-01-22 13:44:11.872250: step: 382/466, loss: 0.10261404514312744 2023-01-22 13:44:12.675922: step: 384/466, loss: 0.05635293200612068 2023-01-22 13:44:13.478441: step: 386/466, loss: 0.11902878433465958 2023-01-22 13:44:14.184741: step: 388/466, loss: 0.023966550827026367 2023-01-22 13:44:14.891279: step: 390/466, loss: 0.059337567538022995 2023-01-22 13:44:15.623076: step: 
392/466, loss: 1.2344927787780762 2023-01-22 13:44:16.391939: step: 394/466, loss: 0.04720328375697136 2023-01-22 13:44:17.168894: step: 396/466, loss: 0.11827465891838074 2023-01-22 13:44:17.975699: step: 398/466, loss: 0.15723110735416412 2023-01-22 13:44:18.904473: step: 400/466, loss: 0.8636144995689392 2023-01-22 13:44:19.708209: step: 402/466, loss: 1.0105373859405518 2023-01-22 13:44:20.482161: step: 404/466, loss: 0.1483166366815567 2023-01-22 13:44:21.247927: step: 406/466, loss: 0.019055398181080818 2023-01-22 13:44:22.046189: step: 408/466, loss: 0.0768151581287384 2023-01-22 13:44:22.746254: step: 410/466, loss: 0.13478273153305054 2023-01-22 13:44:23.473208: step: 412/466, loss: 0.035327523946762085 2023-01-22 13:44:24.196400: step: 414/466, loss: 0.040573496371507645 2023-01-22 13:44:24.910300: step: 416/466, loss: 0.10551556944847107 2023-01-22 13:44:25.725180: step: 418/466, loss: 0.11085692793130875 2023-01-22 13:44:26.488503: step: 420/466, loss: 0.13599808514118195 2023-01-22 13:44:27.310083: step: 422/466, loss: 0.1653360277414322 2023-01-22 13:44:28.079794: step: 424/466, loss: 0.6205252408981323 2023-01-22 13:44:28.836761: step: 426/466, loss: 0.43873658776283264 2023-01-22 13:44:29.573104: step: 428/466, loss: 0.1369527131319046 2023-01-22 13:44:30.386288: step: 430/466, loss: 0.2972129285335541 2023-01-22 13:44:31.091115: step: 432/466, loss: 0.07241196185350418 2023-01-22 13:44:31.906445: step: 434/466, loss: 0.028385912999510765 2023-01-22 13:44:32.638617: step: 436/466, loss: 0.05662926286458969 2023-01-22 13:44:33.374282: step: 438/466, loss: 0.10984829068183899 2023-01-22 13:44:34.132214: step: 440/466, loss: 0.16127750277519226 2023-01-22 13:44:34.883893: step: 442/466, loss: 0.33046212792396545 2023-01-22 13:44:35.653329: step: 444/466, loss: 0.023356251418590546 2023-01-22 13:44:36.400578: step: 446/466, loss: 0.04761826992034912 2023-01-22 13:44:37.266423: step: 448/466, loss: 0.0881882831454277 2023-01-22 13:44:38.077688: step: 
450/466, loss: 0.059739310294389725 2023-01-22 13:44:38.901128: step: 452/466, loss: 0.07543253898620605 2023-01-22 13:44:39.772612: step: 454/466, loss: 0.010121147148311138 2023-01-22 13:44:40.577227: step: 456/466, loss: 0.08570060133934021 2023-01-22 13:44:41.492520: step: 458/466, loss: 0.1192619800567627 2023-01-22 13:44:42.201219: step: 460/466, loss: 0.6350327730178833 2023-01-22 13:44:43.005051: step: 462/466, loss: 0.028908485546708107 2023-01-22 13:44:43.835821: step: 464/466, loss: 0.07640614360570908 2023-01-22 13:44:44.651016: step: 466/466, loss: 0.09877616912126541 2023-01-22 13:44:45.327242: step: 468/466, loss: 0.017190825194120407 2023-01-22 13:44:45.986390: step: 470/466, loss: 0.16334563493728638 2023-01-22 13:44:46.630425: step: 472/466, loss: 0.06626991927623749 2023-01-22 13:44:47.397500: step: 474/466, loss: 0.04550163075327873 2023-01-22 13:44:48.153199: step: 476/466, loss: 0.03090309165418148 2023-01-22 13:44:48.925839: step: 478/466, loss: 0.045029982924461365 2023-01-22 13:44:49.689685: step: 480/466, loss: 0.1302035003900528 2023-01-22 13:44:50.496348: step: 482/466, loss: 0.08502025902271271 2023-01-22 13:44:51.292455: step: 484/466, loss: 0.034057505428791046 2023-01-22 13:44:52.005670: step: 486/466, loss: 0.026281673461198807 2023-01-22 13:44:52.888958: step: 488/466, loss: 0.07515764981508255 2023-01-22 13:44:53.574730: step: 490/466, loss: 0.029807021841406822 2023-01-22 13:44:54.321220: step: 492/466, loss: 0.08880341798067093 2023-01-22 13:44:55.190551: step: 494/466, loss: 0.13791541755199432 2023-01-22 13:44:56.016379: step: 496/466, loss: 0.08253327012062073 2023-01-22 13:44:56.800300: step: 498/466, loss: 0.0802103653550148 2023-01-22 13:44:57.575327: step: 500/466, loss: 0.08541009575128555 2023-01-22 13:44:58.377409: step: 502/466, loss: 0.24436721205711365 2023-01-22 13:44:59.139499: step: 504/466, loss: 0.2751705050468445 2023-01-22 13:44:59.921182: step: 506/466, loss: 0.24562475085258484 2023-01-22 13:45:00.661110: 
step: 508/466, loss: 0.48396503925323486 2023-01-22 13:45:01.447212: step: 510/466, loss: 0.8047950863838196 2023-01-22 13:45:02.336778: step: 512/466, loss: 0.057647477835416794 2023-01-22 13:45:03.132643: step: 514/466, loss: 0.08173815906047821 2023-01-22 13:45:03.891675: step: 516/466, loss: 0.0712982639670372 2023-01-22 13:45:04.628861: step: 518/466, loss: 0.041921310126781464 2023-01-22 13:45:05.420257: step: 520/466, loss: 0.18059973418712616 2023-01-22 13:45:06.147333: step: 522/466, loss: 0.0878557562828064 2023-01-22 13:45:07.000596: step: 524/466, loss: 0.1671873927116394 2023-01-22 13:45:07.755367: step: 526/466, loss: 0.025004588067531586 2023-01-22 13:45:08.486935: step: 528/466, loss: 0.08039911836385727 2023-01-22 13:45:09.233773: step: 530/466, loss: 0.3033719062805176 2023-01-22 13:45:10.027633: step: 532/466, loss: 0.08958456665277481 2023-01-22 13:45:10.786013: step: 534/466, loss: 0.09387919306755066 2023-01-22 13:45:11.502633: step: 536/466, loss: 0.09632495045661926 2023-01-22 13:45:12.305046: step: 538/466, loss: 0.1713729053735733 2023-01-22 13:45:13.071074: step: 540/466, loss: 0.07129695266485214 2023-01-22 13:45:13.780846: step: 542/466, loss: 0.12720653414726257 2023-01-22 13:45:14.516640: step: 544/466, loss: 0.060755349695682526 2023-01-22 13:45:15.256718: step: 546/466, loss: 1.95603346824646 2023-01-22 13:45:16.076176: step: 548/466, loss: 0.03766784816980362 2023-01-22 13:45:16.790277: step: 550/466, loss: 0.07923099398612976 2023-01-22 13:45:17.512450: step: 552/466, loss: 0.04662496969103813 2023-01-22 13:45:18.225992: step: 554/466, loss: 0.20221984386444092 2023-01-22 13:45:19.061339: step: 556/466, loss: 0.07292622327804565 2023-01-22 13:45:19.750946: step: 558/466, loss: 0.3139210045337677 2023-01-22 13:45:20.501911: step: 560/466, loss: 0.23124876618385315 2023-01-22 13:45:21.317299: step: 562/466, loss: 0.08138064295053482 2023-01-22 13:45:22.091439: step: 564/466, loss: 0.16122810542583466 2023-01-22 13:45:22.770483: 
step: 566/466, loss: 0.026222899556159973 2023-01-22 13:45:23.639598: step: 568/466, loss: 0.11673318594694138 2023-01-22 13:45:24.388765: step: 570/466, loss: 0.4448487460613251 2023-01-22 13:45:25.146773: step: 572/466, loss: 0.10425546020269394 2023-01-22 13:45:26.019449: step: 574/466, loss: 0.3005366325378418 2023-01-22 13:45:26.776329: step: 576/466, loss: 0.15741752088069916 2023-01-22 13:45:27.561896: step: 578/466, loss: 0.2272447943687439 2023-01-22 13:45:28.298769: step: 580/466, loss: 0.10790374875068665 2023-01-22 13:45:29.035854: step: 582/466, loss: 0.03616996109485626 2023-01-22 13:45:29.780154: step: 584/466, loss: 1.3382568359375 2023-01-22 13:45:30.485375: step: 586/466, loss: 0.0517452172935009 2023-01-22 13:45:31.242577: step: 588/466, loss: 0.0477430522441864 2023-01-22 13:45:32.002534: step: 590/466, loss: 0.14656464755535126 2023-01-22 13:45:32.694108: step: 592/466, loss: 0.0638587474822998 2023-01-22 13:45:33.447532: step: 594/466, loss: 0.14820857346057892 2023-01-22 13:45:34.212269: step: 596/466, loss: 0.028037745505571365 2023-01-22 13:45:34.960406: step: 598/466, loss: 0.8110078573226929 2023-01-22 13:45:35.713923: step: 600/466, loss: 0.02175460010766983 2023-01-22 13:45:36.460788: step: 602/466, loss: 1.047738790512085 2023-01-22 13:45:37.151320: step: 604/466, loss: 0.06620313227176666 2023-01-22 13:45:37.961080: step: 606/466, loss: 0.2300693392753601 2023-01-22 13:45:38.685625: step: 608/466, loss: 0.14889350533485413 2023-01-22 13:45:39.478306: step: 610/466, loss: 0.06209733709692955 2023-01-22 13:45:40.328560: step: 612/466, loss: 0.2901458442211151 2023-01-22 13:45:41.092160: step: 614/466, loss: 0.06201798841357231 2023-01-22 13:45:41.936625: step: 616/466, loss: 0.14276079833507538 2023-01-22 13:45:42.735846: step: 618/466, loss: 1.0708260536193848 2023-01-22 13:45:43.509293: step: 620/466, loss: 0.07404862344264984 2023-01-22 13:45:44.303651: step: 622/466, loss: 0.023511115461587906 2023-01-22 13:45:45.025284: step: 
624/466, loss: 0.20319001376628876 2023-01-22 13:45:45.748397: step: 626/466, loss: 0.08493014425039291 2023-01-22 13:45:46.546906: step: 628/466, loss: 0.040104154497385025 2023-01-22 13:45:47.349379: step: 630/466, loss: 0.338765412569046 2023-01-22 13:45:48.151449: step: 632/466, loss: 0.1115725114941597 2023-01-22 13:45:48.929516: step: 634/466, loss: 0.09772010147571564 2023-01-22 13:45:49.612061: step: 636/466, loss: 0.07879616320133209 2023-01-22 13:45:50.323114: step: 638/466, loss: 0.05774553120136261 2023-01-22 13:45:51.055030: step: 640/466, loss: 0.09257499128580093 2023-01-22 13:45:51.786355: step: 642/466, loss: 0.17606566846370697 2023-01-22 13:45:52.501181: step: 644/466, loss: 0.07357846200466156 2023-01-22 13:45:53.303364: step: 646/466, loss: 0.10471224784851074 2023-01-22 13:45:54.076039: step: 648/466, loss: 0.1160171777009964 2023-01-22 13:45:54.790057: step: 650/466, loss: 0.0974593535065651 2023-01-22 13:45:55.544037: step: 652/466, loss: 0.18082386255264282 2023-01-22 13:45:56.255215: step: 654/466, loss: 0.6907148361206055 2023-01-22 13:45:57.089454: step: 656/466, loss: 0.07257720082998276 2023-01-22 13:45:57.826699: step: 658/466, loss: 0.222214937210083 2023-01-22 13:45:58.584564: step: 660/466, loss: 0.13155943155288696 2023-01-22 13:45:59.271858: step: 662/466, loss: 0.14486588537693024 2023-01-22 13:46:00.070550: step: 664/466, loss: 0.06527838110923767 2023-01-22 13:46:00.773322: step: 666/466, loss: 0.11442571878433228 2023-01-22 13:46:01.595799: step: 668/466, loss: 0.05018752068281174 2023-01-22 13:46:02.447362: step: 670/466, loss: 0.03921462595462799 2023-01-22 13:46:03.198518: step: 672/466, loss: 0.28432151675224304 2023-01-22 13:46:03.872294: step: 674/466, loss: 0.3747497797012329 2023-01-22 13:46:04.668712: step: 676/466, loss: 0.0581992045044899 2023-01-22 13:46:05.436408: step: 678/466, loss: 0.019291166216135025 2023-01-22 13:46:06.197138: step: 680/466, loss: 0.06049179658293724 2023-01-22 13:46:06.950266: step: 
682/466, loss: 0.0835191160440445 2023-01-22 13:46:07.685955: step: 684/466, loss: 0.10527876764535904 2023-01-22 13:46:08.560250: step: 686/466, loss: 0.030664782971143723 2023-01-22 13:46:09.367700: step: 688/466, loss: 0.03258570656180382 2023-01-22 13:46:10.228568: step: 690/466, loss: 0.37244319915771484 2023-01-22 13:46:10.957974: step: 692/466, loss: 0.07915109395980835 2023-01-22 13:46:11.674379: step: 694/466, loss: 0.5006815791130066 2023-01-22 13:46:12.424861: step: 696/466, loss: 0.0645032450556755 2023-01-22 13:46:13.242465: step: 698/466, loss: 0.13112987577915192 2023-01-22 13:46:14.112021: step: 700/466, loss: 0.22272424399852753 2023-01-22 13:46:14.913094: step: 702/466, loss: 0.04174065217375755 2023-01-22 13:46:15.642068: step: 704/466, loss: 0.0813443660736084 2023-01-22 13:46:16.363326: step: 706/466, loss: 0.14078551530838013 2023-01-22 13:46:17.139112: step: 708/466, loss: 0.17905648052692413 2023-01-22 13:46:17.878400: step: 710/466, loss: 0.0903608500957489 2023-01-22 13:46:18.593797: step: 712/466, loss: 0.10948151350021362 2023-01-22 13:46:19.375517: step: 714/466, loss: 0.04178426414728165 2023-01-22 13:46:20.154875: step: 716/466, loss: 0.029289964586496353 2023-01-22 13:46:20.861243: step: 718/466, loss: 0.09051091969013214 2023-01-22 13:46:21.675876: step: 720/466, loss: 0.025737447664141655 2023-01-22 13:46:22.569396: step: 722/466, loss: 0.2563604712486267 2023-01-22 13:46:23.279777: step: 724/466, loss: 0.09010884910821915 2023-01-22 13:46:24.012937: step: 726/466, loss: 0.052926205098629 2023-01-22 13:46:24.697722: step: 728/466, loss: 0.07164259254932404 2023-01-22 13:46:25.513782: step: 730/466, loss: 0.10287351161241531 2023-01-22 13:46:26.309535: step: 732/466, loss: 0.12792713940143585 2023-01-22 13:46:27.090982: step: 734/466, loss: 0.06267481297254562 2023-01-22 13:46:27.821670: step: 736/466, loss: 0.06375842541456223 2023-01-22 13:46:28.617320: step: 738/466, loss: 0.05980125442147255 2023-01-22 13:46:29.387184: step: 
740/466, loss: 0.049242787063121796 2023-01-22 13:46:30.198846: step: 742/466, loss: 0.0792761817574501 2023-01-22 13:46:31.027304: step: 744/466, loss: 0.018085921183228493 2023-01-22 13:46:31.847263: step: 746/466, loss: 0.0719904825091362 2023-01-22 13:46:32.551656: step: 748/466, loss: 0.09041845053434372 2023-01-22 13:46:33.310871: step: 750/466, loss: 0.018958715721964836 2023-01-22 13:46:34.083389: step: 752/466, loss: 0.08061391115188599 2023-01-22 13:46:34.898203: step: 754/466, loss: 0.07450126111507416 2023-01-22 13:46:35.620688: step: 756/466, loss: 0.07704184949398041 2023-01-22 13:46:36.407520: step: 758/466, loss: 0.23017098009586334 2023-01-22 13:46:37.144004: step: 760/466, loss: 0.07831332832574844 2023-01-22 13:46:37.866097: step: 762/466, loss: 0.07351720333099365 2023-01-22 13:46:38.701848: step: 764/466, loss: 0.15741848945617676 2023-01-22 13:46:39.520502: step: 766/466, loss: 0.1123637706041336 2023-01-22 13:46:40.286083: step: 768/466, loss: 0.05872133746743202 2023-01-22 13:46:41.060492: step: 770/466, loss: 0.07663730531930923 2023-01-22 13:46:41.746336: step: 772/466, loss: 0.15055827796459198 2023-01-22 13:46:42.512899: step: 774/466, loss: 0.1707383543252945 2023-01-22 13:46:43.239486: step: 776/466, loss: 0.5265418887138367 2023-01-22 13:46:44.069110: step: 778/466, loss: 0.5557186603546143 2023-01-22 13:46:45.018696: step: 780/466, loss: 0.06326576322317123 2023-01-22 13:46:45.833226: step: 782/466, loss: 0.1562936007976532 2023-01-22 13:46:46.563676: step: 784/466, loss: 0.03154328465461731 2023-01-22 13:46:47.338932: step: 786/466, loss: 0.04813500493764877 2023-01-22 13:46:48.107287: step: 788/466, loss: 0.07419311255216599 2023-01-22 13:46:48.888771: step: 790/466, loss: 0.16150015592575073 2023-01-22 13:46:49.635790: step: 792/466, loss: 0.6626380085945129 2023-01-22 13:46:50.423122: step: 794/466, loss: 0.05854855850338936 2023-01-22 13:46:51.252814: step: 796/466, loss: 0.07601359486579895 2023-01-22 13:46:52.031045: step: 
798/466, loss: 0.316074937582016 2023-01-22 13:46:52.873280: step: 800/466, loss: 0.055279359221458435 2023-01-22 13:46:53.609296: step: 802/466, loss: 0.30370137095451355 2023-01-22 13:46:54.363113: step: 804/466, loss: 2.408041477203369 2023-01-22 13:46:55.087348: step: 806/466, loss: 0.7589643597602844 2023-01-22 13:46:55.784343: step: 808/466, loss: 0.14986556768417358 2023-01-22 13:46:56.534928: step: 810/466, loss: 0.0806887075304985 2023-01-22 13:46:57.318006: step: 812/466, loss: 0.06354395300149918 2023-01-22 13:46:58.109015: step: 814/466, loss: 0.30655643343925476 2023-01-22 13:46:58.862756: step: 816/466, loss: 0.09369249641895294 2023-01-22 13:46:59.678361: step: 818/466, loss: 0.1615120768547058 2023-01-22 13:47:00.436177: step: 820/466, loss: 0.06296786665916443 2023-01-22 13:47:01.166125: step: 822/466, loss: 0.025020739063620567 2023-01-22 13:47:01.931947: step: 824/466, loss: 0.15517769753932953 2023-01-22 13:47:02.642374: step: 826/466, loss: 0.08489702641963959 2023-01-22 13:47:03.438161: step: 828/466, loss: 0.2150644212961197 2023-01-22 13:47:04.215169: step: 830/466, loss: 0.2859443128108978 2023-01-22 13:47:04.963605: step: 832/466, loss: 0.09254368394613266 2023-01-22 13:47:05.760686: step: 834/466, loss: 0.011415778659284115 2023-01-22 13:47:06.450784: step: 836/466, loss: 0.14637251198291779 2023-01-22 13:47:07.301088: step: 838/466, loss: 0.01589520275592804 2023-01-22 13:47:08.062186: step: 840/466, loss: 0.10257188230752945 2023-01-22 13:47:08.843131: step: 842/466, loss: 0.03776068240404129 2023-01-22 13:47:09.539863: step: 844/466, loss: 0.029468653723597527 2023-01-22 13:47:10.319451: step: 846/466, loss: 0.12233823537826538 2023-01-22 13:47:11.115284: step: 848/466, loss: 0.8273965716362 2023-01-22 13:47:11.809751: step: 850/466, loss: 0.0944732055068016 2023-01-22 13:47:12.583793: step: 852/466, loss: 0.11218609660863876 2023-01-22 13:47:13.482184: step: 854/466, loss: 0.027405105531215668 2023-01-22 13:47:14.221635: step: 
856/466, loss: 0.05875711888074875 2023-01-22 13:47:15.062746: step: 858/466, loss: 0.1434674710035324 2023-01-22 13:47:15.777420: step: 860/466, loss: 0.03289871662855148 2023-01-22 13:47:16.531517: step: 862/466, loss: 0.11220621317625046 2023-01-22 13:47:17.290029: step: 864/466, loss: 0.17699798941612244 2023-01-22 13:47:18.035277: step: 866/466, loss: 0.1084531918168068 2023-01-22 13:47:18.794253: step: 868/466, loss: 0.06856013089418411 2023-01-22 13:47:19.655763: step: 870/466, loss: 0.09868394583463669 2023-01-22 13:47:20.336520: step: 872/466, loss: 0.039851948618888855 2023-01-22 13:47:21.112084: step: 874/466, loss: 0.03650224953889847 2023-01-22 13:47:21.917181: step: 876/466, loss: 0.04264580085873604 2023-01-22 13:47:22.611938: step: 878/466, loss: 0.018047701567411423 2023-01-22 13:47:23.336296: step: 880/466, loss: 0.041126005351543427 2023-01-22 13:47:24.122695: step: 882/466, loss: 0.03496141731739044 2023-01-22 13:47:24.875165: step: 884/466, loss: 0.22649994492530823 2023-01-22 13:47:25.641985: step: 886/466, loss: 0.11513878405094147 2023-01-22 13:47:26.398530: step: 888/466, loss: 1.339849829673767 2023-01-22 13:47:27.177912: step: 890/466, loss: 0.05769219622015953 2023-01-22 13:47:28.031526: step: 892/466, loss: 0.04100324586033821 2023-01-22 13:47:28.823535: step: 894/466, loss: 0.015059271827340126 2023-01-22 13:47:29.527346: step: 896/466, loss: 0.1016344279050827 2023-01-22 13:47:30.264534: step: 898/466, loss: 0.20613186061382294 2023-01-22 13:47:31.041587: step: 900/466, loss: 0.12245035916566849 2023-01-22 13:47:31.878474: step: 902/466, loss: 0.0788796916604042 2023-01-22 13:47:32.558809: step: 904/466, loss: 0.3605306148529053 2023-01-22 13:47:33.325910: step: 906/466, loss: 0.03475223854184151 2023-01-22 13:47:34.045809: step: 908/466, loss: 1.0116491317749023 2023-01-22 13:47:34.768228: step: 910/466, loss: 0.10003989189863205 2023-01-22 13:47:35.496869: step: 912/466, loss: 0.062395673245191574 2023-01-22 13:47:36.288942: step: 
914/466, loss: 0.211165651679039 2023-01-22 13:47:37.050611: step: 916/466, loss: 0.1046147421002388 2023-01-22 13:47:37.796447: step: 918/466, loss: 0.1511353999376297 2023-01-22 13:47:38.496674: step: 920/466, loss: 0.026441054418683052 2023-01-22 13:47:39.350032: step: 922/466, loss: 1.8698264360427856 2023-01-22 13:47:40.180464: step: 924/466, loss: 0.17580001056194305 2023-01-22 13:47:40.973370: step: 926/466, loss: 0.07652544230222702 2023-01-22 13:47:41.737595: step: 928/466, loss: 0.09471571445465088 2023-01-22 13:47:42.463511: step: 930/466, loss: 0.1395745873451233 2023-01-22 13:47:43.174590: step: 932/466, loss: 0.09615115821361542
==================================================
Loss: 0.161
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28936766254685126, 'r': 0.34757064590542097, 'f1': 0.31580987998647736}, 'combined': 0.2327020168321412, 'epoch': 16}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32514816020784904, 'r': 0.29866295478363947, 'f1': 0.31134331510417346}, 'combined': 0.1913622326981749, 'epoch': 16}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31313870788302606, 'r': 0.33987730722787646, 'f1': 0.32596058400198524}, 'combined': 0.2401814829488312, 'epoch': 16}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34224108926331404, 'r': 0.2956796932370226, 'f1': 0.31726114922875326}, 'combined': 0.19595541570011235, 'epoch': 16}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2765957446808511, 'r': 0.5652173913043478, 'f1': 0.37142857142857144}, 'combined': 0.18571428571428572, 'epoch': 16}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 16}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 
11} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11} ****************************** Epoch: 17 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 13:50:35.610653: step: 2/466, loss: 0.04191667214035988 2023-01-22 13:50:36.336201: step: 4/466, loss: 0.5147823691368103 2023-01-22 13:50:37.057618: step: 6/466, loss: 0.10745085775852203 2023-01-22 13:50:37.771881: step: 8/466, loss: 0.09878411144018173 2023-01-22 13:50:38.498885: step: 10/466, loss: 0.1570444107055664 2023-01-22 13:50:39.290559: step: 12/466, loss: 0.040436554700136185 2023-01-22 13:50:40.091368: step: 14/466, loss: 0.0783068835735321 2023-01-22 13:50:40.832666: step: 16/466, loss: 0.07695181667804718 2023-01-22 13:50:41.724026: step: 18/466, loss: 0.09340395033359528 2023-01-22 13:50:42.377504: step: 20/466, loss: 0.19459334015846252 2023-01-22 13:50:43.166086: step: 22/466, loss: 0.09914836287498474 2023-01-22 13:50:43.920947: step: 24/466, loss: 0.1062348410487175 2023-01-22 13:50:44.686355: step: 26/466, 
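The 'combined' value in the evaluation blocks above is consistent with being the product of the template F1 and the slot F1, where each F1 is the harmonic mean of the logged precision and recall (e.g. 0.7368 x 0.3165 = 0.2332 for the epoch-16 Dev Chinese entry). A minimal sketch reconstructing this from the logged numbers (an inference from the log, not the training script's actual code):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Epoch-16 "Dev Chinese" entry from the log above.
template_f1 = f1(1.0, 0.5833333333333334)               # logged f1: 0.7368421052631579
slot_f1 = f1(0.3027851084501864, 0.33151234834109594)   # logged f1: 0.3164982021299956
combined = template_f1 * slot_f1                        # logged combined: 0.23320920156947042
```

The same product relation holds for the other language/split entries, which suggests the template match acts as a gate on the slot score.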
loss: 0.0334099642932415 2023-01-22 13:50:45.373057: step: 28/466, loss: 0.07453502714633942 2023-01-22 13:50:46.137672: step: 30/466, loss: 0.23181527853012085 2023-01-22 13:50:46.838694: step: 32/466, loss: 0.038020357489585876 2023-01-22 13:50:47.678095: step: 34/466, loss: 0.4997105002403259 2023-01-22 13:50:48.458182: step: 36/466, loss: 0.09167198091745377 2023-01-22 13:50:49.173184: step: 38/466, loss: 0.3927045166492462 2023-01-22 13:50:49.911950: step: 40/466, loss: 0.012292815372347832 2023-01-22 13:50:50.672109: step: 42/466, loss: 0.33926093578338623 2023-01-22 13:50:51.430032: step: 44/466, loss: 0.05959390103816986 2023-01-22 13:50:52.165164: step: 46/466, loss: 0.05834142491221428 2023-01-22 13:50:52.942533: step: 48/466, loss: 0.04409831017255783 2023-01-22 13:50:53.665995: step: 50/466, loss: 0.08244270831346512 2023-01-22 13:50:54.428338: step: 52/466, loss: 0.08154886215925217 2023-01-22 13:50:55.321545: step: 54/466, loss: 0.5688467025756836 2023-01-22 13:50:56.118317: step: 56/466, loss: 0.8010183572769165 2023-01-22 13:50:56.901687: step: 58/466, loss: 0.09396448731422424 2023-01-22 13:50:57.702387: step: 60/466, loss: 0.18453781306743622 2023-01-22 13:50:58.461928: step: 62/466, loss: 0.041124697774648666 2023-01-22 13:50:59.226157: step: 64/466, loss: 0.0507088266313076 2023-01-22 13:50:59.947925: step: 66/466, loss: 0.8169896602630615 2023-01-22 13:51:00.830158: step: 68/466, loss: 0.11679229885339737 2023-01-22 13:51:01.588376: step: 70/466, loss: 0.034613557159900665 2023-01-22 13:51:02.404849: step: 72/466, loss: 0.08543278276920319 2023-01-22 13:51:03.178420: step: 74/466, loss: 0.06375060230493546 2023-01-22 13:51:03.918008: step: 76/466, loss: 0.03974442556500435 2023-01-22 13:51:04.672912: step: 78/466, loss: 0.05513717606663704 2023-01-22 13:51:05.441293: step: 80/466, loss: 0.02512964978814125 2023-01-22 13:51:06.164771: step: 82/466, loss: 0.12997297942638397 2023-01-22 13:51:06.914861: step: 84/466, loss: 0.05820440128445625 
2023-01-22 13:51:07.732723: step: 86/466, loss: 0.25973352789878845 2023-01-22 13:51:08.504908: step: 88/466, loss: 0.03171452507376671 2023-01-22 13:51:09.253202: step: 90/466, loss: 0.010969765484333038 2023-01-22 13:51:09.998933: step: 92/466, loss: 0.05218341201543808 2023-01-22 13:51:10.811617: step: 94/466, loss: 0.035453230142593384 2023-01-22 13:51:11.536955: step: 96/466, loss: 0.03244630992412567 2023-01-22 13:51:12.313992: step: 98/466, loss: 0.09304963797330856 2023-01-22 13:51:13.111017: step: 100/466, loss: 0.02452549710869789 2023-01-22 13:51:13.892099: step: 102/466, loss: 0.06358002126216888 2023-01-22 13:51:14.682143: step: 104/466, loss: 0.26026400923728943 2023-01-22 13:51:15.415953: step: 106/466, loss: 0.02341284416615963 2023-01-22 13:51:16.188660: step: 108/466, loss: 0.03641442582011223 2023-01-22 13:51:17.001739: step: 110/466, loss: 0.1473744809627533 2023-01-22 13:51:17.667777: step: 112/466, loss: 0.010492063127458096 2023-01-22 13:51:18.451272: step: 114/466, loss: 0.04320019856095314 2023-01-22 13:51:19.177540: step: 116/466, loss: 0.020513758063316345 2023-01-22 13:51:19.931856: step: 118/466, loss: 0.018752820789813995 2023-01-22 13:51:20.640815: step: 120/466, loss: 0.12152481079101562 2023-01-22 13:51:21.408794: step: 122/466, loss: 0.12580151855945587 2023-01-22 13:51:22.114420: step: 124/466, loss: 0.027853872627019882 2023-01-22 13:51:22.889288: step: 126/466, loss: 0.2141602486371994 2023-01-22 13:51:23.655068: step: 128/466, loss: 0.01891392096877098 2023-01-22 13:51:24.431671: step: 130/466, loss: 0.08879508823156357 2023-01-22 13:51:25.168966: step: 132/466, loss: 0.07886034995317459 2023-01-22 13:51:25.954753: step: 134/466, loss: 0.14440208673477173 2023-01-22 13:51:26.700012: step: 136/466, loss: 0.09427177906036377 2023-01-22 13:51:27.444203: step: 138/466, loss: 0.1376834511756897 2023-01-22 13:51:28.228695: step: 140/466, loss: 0.6417616605758667 2023-01-22 13:51:28.999100: step: 142/466, loss: 0.041780468076467514 
2023-01-22 13:51:29.810381: step: 144/466, loss: 0.06160823255777359 2023-01-22 13:51:30.583561: step: 146/466, loss: 0.03103504702448845 2023-01-22 13:51:31.318061: step: 148/466, loss: 0.3792994022369385 2023-01-22 13:51:32.127703: step: 150/466, loss: 0.08279043436050415 2023-01-22 13:51:32.843483: step: 152/466, loss: 0.07768604159355164 2023-01-22 13:51:33.726155: step: 154/466, loss: 0.04201593995094299 2023-01-22 13:51:34.484758: step: 156/466, loss: 0.18667390942573547 2023-01-22 13:51:35.220174: step: 158/466, loss: 0.018082771450281143 2023-01-22 13:51:35.984706: step: 160/466, loss: 0.07044660300016403 2023-01-22 13:51:36.703128: step: 162/466, loss: 0.01237710751593113 2023-01-22 13:51:37.464664: step: 164/466, loss: 0.05196648836135864 2023-01-22 13:51:38.213244: step: 166/466, loss: 0.015970459207892418 2023-01-22 13:51:38.904845: step: 168/466, loss: 0.04284393787384033 2023-01-22 13:51:39.634978: step: 170/466, loss: 0.13483668863773346 2023-01-22 13:51:40.369293: step: 172/466, loss: 0.03902854397892952 2023-01-22 13:51:41.097513: step: 174/466, loss: 0.010558566078543663 2023-01-22 13:51:41.871834: step: 176/466, loss: 0.13913175463676453 2023-01-22 13:51:42.619265: step: 178/466, loss: 0.08197152614593506 2023-01-22 13:51:43.314551: step: 180/466, loss: 0.03208223730325699 2023-01-22 13:51:44.079455: step: 182/466, loss: 0.05784508213400841 2023-01-22 13:51:44.982295: step: 184/466, loss: 0.0503309890627861 2023-01-22 13:51:45.687956: step: 186/466, loss: 0.054225169122219086 2023-01-22 13:51:46.451470: step: 188/466, loss: 0.11491405963897705 2023-01-22 13:51:47.171568: step: 190/466, loss: 0.1225811094045639 2023-01-22 13:51:47.935719: step: 192/466, loss: 0.05776900798082352 2023-01-22 13:51:48.706828: step: 194/466, loss: 0.15298278629779816 2023-01-22 13:51:49.448504: step: 196/466, loss: 0.26484397053718567 2023-01-22 13:51:50.232771: step: 198/466, loss: 0.03561859950423241 2023-01-22 13:51:50.960349: step: 200/466, loss: 
0.15547730028629303 2023-01-22 13:51:51.746680: step: 202/466, loss: 0.06322099268436432 2023-01-22 13:51:52.479868: step: 204/466, loss: 0.0602981299161911 2023-01-22 13:51:53.213570: step: 206/466, loss: 0.04739827662706375 2023-01-22 13:51:53.979087: step: 208/466, loss: 0.09399860352277756 2023-01-22 13:51:54.836703: step: 210/466, loss: 0.07635460048913956 2023-01-22 13:51:55.595399: step: 212/466, loss: 0.03263647481799126 2023-01-22 13:51:56.375582: step: 214/466, loss: 0.09039748460054398 2023-01-22 13:51:57.082937: step: 216/466, loss: 0.10271228849887848 2023-01-22 13:51:57.833923: step: 218/466, loss: 0.04623044654726982 2023-01-22 13:51:58.542166: step: 220/466, loss: 0.02166702412068844 2023-01-22 13:51:59.283951: step: 222/466, loss: 0.23645983636379242 2023-01-22 13:52:00.059634: step: 224/466, loss: 0.09341173619031906 2023-01-22 13:52:00.809250: step: 226/466, loss: 0.06331127882003784 2023-01-22 13:52:01.589945: step: 228/466, loss: 0.041292134672403336 2023-01-22 13:52:02.307684: step: 230/466, loss: 0.16387499868869781 2023-01-22 13:52:03.084550: step: 232/466, loss: 0.1170310378074646 2023-01-22 13:52:03.860212: step: 234/466, loss: 0.10173127800226212 2023-01-22 13:52:04.669918: step: 236/466, loss: 0.012713445350527763 2023-01-22 13:52:05.369166: step: 238/466, loss: 0.03802645951509476 2023-01-22 13:52:06.150224: step: 240/466, loss: 0.08751117438077927 2023-01-22 13:52:06.870567: step: 242/466, loss: 0.055097710341215134 2023-01-22 13:52:07.579999: step: 244/466, loss: 0.09168267250061035 2023-01-22 13:52:08.299669: step: 246/466, loss: 0.10639012604951859 2023-01-22 13:52:09.072662: step: 248/466, loss: 0.05749332159757614 2023-01-22 13:52:09.780278: step: 250/466, loss: 0.05517353489995003 2023-01-22 13:52:10.530756: step: 252/466, loss: 0.02633926272392273 2023-01-22 13:52:11.228863: step: 254/466, loss: 0.13642267882823944 2023-01-22 13:52:11.977891: step: 256/466, loss: 0.043375641107559204 2023-01-22 13:52:12.730123: step: 258/466, 
loss: 0.02250557206571102 2023-01-22 13:52:13.510027: step: 260/466, loss: 0.05615265294909477 2023-01-22 13:52:14.284788: step: 262/466, loss: 0.1533478945493698 2023-01-22 13:52:15.071750: step: 264/466, loss: 0.023662343621253967 2023-01-22 13:52:15.936938: step: 266/466, loss: 0.10575315356254578 2023-01-22 13:52:16.757900: step: 268/466, loss: 0.07846268266439438 2023-01-22 13:52:17.532716: step: 270/466, loss: 0.06294034421443939 2023-01-22 13:52:18.324412: step: 272/466, loss: 0.06830111891031265 2023-01-22 13:52:19.090769: step: 274/466, loss: 0.10785622894763947 2023-01-22 13:52:19.791074: step: 276/466, loss: 0.045360252261161804 2023-01-22 13:52:20.535918: step: 278/466, loss: 0.06354059278964996 2023-01-22 13:52:21.332689: step: 280/466, loss: 0.023901205509901047 2023-01-22 13:52:22.114526: step: 282/466, loss: 0.0633501261472702 2023-01-22 13:52:22.832917: step: 284/466, loss: 0.021562637761235237 2023-01-22 13:52:23.689342: step: 286/466, loss: 0.037073515355587006 2023-01-22 13:52:24.544735: step: 288/466, loss: 0.15239278972148895 2023-01-22 13:52:25.339098: step: 290/466, loss: 0.14575928449630737 2023-01-22 13:52:26.169415: step: 292/466, loss: 0.05778518319129944 2023-01-22 13:52:26.959762: step: 294/466, loss: 0.05821090191602707 2023-01-22 13:52:27.769859: step: 296/466, loss: 0.39319416880607605 2023-01-22 13:52:28.581879: step: 298/466, loss: 0.0335598960518837 2023-01-22 13:52:29.322850: step: 300/466, loss: 0.10477923601865768 2023-01-22 13:52:30.151736: step: 302/466, loss: 0.018370570614933968 2023-01-22 13:52:31.018622: step: 304/466, loss: 0.2609383463859558 2023-01-22 13:52:31.774329: step: 306/466, loss: 0.03143971413373947 2023-01-22 13:52:32.581897: step: 308/466, loss: 0.08613347262144089 2023-01-22 13:52:33.315019: step: 310/466, loss: 0.029043348506093025 2023-01-22 13:52:34.030252: step: 312/466, loss: 0.09032338112592697 2023-01-22 13:52:34.808753: step: 314/466, loss: 0.03294346109032631 2023-01-22 13:52:35.581454: step: 
316/466, loss: 0.04086478054523468 2023-01-22 13:52:36.347155: step: 318/466, loss: 0.09302819520235062 2023-01-22 13:52:37.120170: step: 320/466, loss: 0.1359679102897644 2023-01-22 13:52:37.926710: step: 322/466, loss: 0.05803457275032997 2023-01-22 13:52:38.763723: step: 324/466, loss: 0.08316502720117569 2023-01-22 13:52:39.515138: step: 326/466, loss: 0.025853624567389488 2023-01-22 13:52:40.203489: step: 328/466, loss: 0.10935669392347336 2023-01-22 13:52:40.924394: step: 330/466, loss: 0.10221309214830399 2023-01-22 13:52:41.698108: step: 332/466, loss: 0.023362543433904648 2023-01-22 13:52:42.557062: step: 334/466, loss: 0.10492447018623352 2023-01-22 13:52:43.307914: step: 336/466, loss: 0.012983668595552444 2023-01-22 13:52:44.063954: step: 338/466, loss: 0.06686433404684067 2023-01-22 13:52:44.820502: step: 340/466, loss: 0.6464180946350098 2023-01-22 13:52:45.562900: step: 342/466, loss: 0.14225780963897705 2023-01-22 13:52:46.312021: step: 344/466, loss: 0.02568856254220009 2023-01-22 13:52:47.093505: step: 346/466, loss: 0.11996930837631226 2023-01-22 13:52:47.828051: step: 348/466, loss: 0.07267715036869049 2023-01-22 13:52:48.541211: step: 350/466, loss: 0.05870138481259346 2023-01-22 13:52:49.286453: step: 352/466, loss: 0.21511659026145935 2023-01-22 13:52:49.989746: step: 354/466, loss: 0.22571636736392975 2023-01-22 13:52:50.667855: step: 356/466, loss: 0.0684482753276825 2023-01-22 13:52:51.434765: step: 358/466, loss: 0.07619721442461014 2023-01-22 13:52:52.146339: step: 360/466, loss: 0.07417423278093338 2023-01-22 13:52:52.852449: step: 362/466, loss: 0.33287811279296875 2023-01-22 13:52:53.606797: step: 364/466, loss: 0.044003941118717194 2023-01-22 13:52:54.312660: step: 366/466, loss: 0.015192612074315548 2023-01-22 13:52:55.164753: step: 368/466, loss: 0.3871832489967346 2023-01-22 13:52:55.932905: step: 370/466, loss: 0.03940851613879204 2023-01-22 13:52:56.703026: step: 372/466, loss: 0.44761019945144653 2023-01-22 13:52:57.438093: 
step: 374/466, loss: 0.09093191474676132 2023-01-22 13:52:58.110707: step: 376/466, loss: 0.2419499009847641 2023-01-22 13:52:58.867741: step: 378/466, loss: 0.09732669591903687 2023-01-22 13:52:59.678255: step: 380/466, loss: 0.1610875129699707 2023-01-22 13:53:00.479858: step: 382/466, loss: 0.048162516206502914 2023-01-22 13:53:01.276640: step: 384/466, loss: 0.05992291122674942 2023-01-22 13:53:02.032424: step: 386/466, loss: 0.0989617109298706 2023-01-22 13:53:02.754382: step: 388/466, loss: 0.05073726549744606 2023-01-22 13:53:03.714867: step: 390/466, loss: 0.260887086391449 2023-01-22 13:53:04.495920: step: 392/466, loss: 0.06377539038658142 2023-01-22 13:53:05.290950: step: 394/466, loss: 0.0285948496311903 2023-01-22 13:53:06.115602: step: 396/466, loss: 0.3953843116760254 2023-01-22 13:53:06.841328: step: 398/466, loss: 0.015707774087786674 2023-01-22 13:53:07.695316: step: 400/466, loss: 0.00765496538951993 2023-01-22 13:53:08.414884: step: 402/466, loss: 0.023716503754258156 2023-01-22 13:53:09.163220: step: 404/466, loss: 0.08415870368480682 2023-01-22 13:53:09.945545: step: 406/466, loss: 0.01940714195370674 2023-01-22 13:53:10.719892: step: 408/466, loss: 0.014717105776071548 2023-01-22 13:53:11.537261: step: 410/466, loss: 0.44251349568367004 2023-01-22 13:53:12.423008: step: 412/466, loss: 0.03738182783126831 2023-01-22 13:53:13.205415: step: 414/466, loss: 0.0421098917722702 2023-01-22 13:53:14.093211: step: 416/466, loss: 0.11240135133266449 2023-01-22 13:53:14.905079: step: 418/466, loss: 0.9660542607307434 2023-01-22 13:53:15.716645: step: 420/466, loss: 0.06278149783611298 2023-01-22 13:53:16.496326: step: 422/466, loss: 0.04971605911850929 2023-01-22 13:53:17.270881: step: 424/466, loss: 1.2707209587097168 2023-01-22 13:53:18.039990: step: 426/466, loss: 0.1101389154791832 2023-01-22 13:53:18.778324: step: 428/466, loss: 0.02900017239153385 2023-01-22 13:53:19.558736: step: 430/466, loss: 0.10317675769329071 2023-01-22 13:53:20.325889: step: 
432/466, loss: 0.07279238849878311 2023-01-22 13:53:21.126837: step: 434/466, loss: 0.07171858102083206 2023-01-22 13:53:21.953599: step: 436/466, loss: 0.13305605947971344 2023-01-22 13:53:22.733705: step: 438/466, loss: 0.10145549476146698 2023-01-22 13:53:23.675865: step: 440/466, loss: 0.11025592684745789 2023-01-22 13:53:24.425596: step: 442/466, loss: 0.03324136510491371 2023-01-22 13:53:25.183022: step: 444/466, loss: 0.01895085908472538 2023-01-22 13:53:25.882425: step: 446/466, loss: 0.06639997661113739 2023-01-22 13:53:26.615264: step: 448/466, loss: 0.0384809672832489 2023-01-22 13:53:27.379571: step: 450/466, loss: 0.0413355752825737 2023-01-22 13:53:28.129592: step: 452/466, loss: 0.04279356077313423 2023-01-22 13:53:28.898698: step: 454/466, loss: 0.03977316617965698 2023-01-22 13:53:29.651239: step: 456/466, loss: 0.05302877724170685 2023-01-22 13:53:30.466687: step: 458/466, loss: 0.01950419880449772 2023-01-22 13:53:31.190861: step: 460/466, loss: 0.10128819197416306 2023-01-22 13:53:31.981844: step: 462/466, loss: 0.12025299668312073 2023-01-22 13:53:32.763072: step: 464/466, loss: 0.1661524772644043 2023-01-22 13:53:33.519219: step: 466/466, loss: 0.17702050507068634 2023-01-22 13:53:34.306852: step: 468/466, loss: 0.09875538945198059 2023-01-22 13:53:35.016796: step: 470/466, loss: 0.19672173261642456 2023-01-22 13:53:35.720318: step: 472/466, loss: 0.07020531594753265 2023-01-22 13:53:36.459015: step: 474/466, loss: 0.4568902850151062 2023-01-22 13:53:37.220413: step: 476/466, loss: 0.17930543422698975 2023-01-22 13:53:37.982295: step: 478/466, loss: 0.08048145473003387 2023-01-22 13:53:38.733395: step: 480/466, loss: 0.04901100695133209 2023-01-22 13:53:39.486627: step: 482/466, loss: 0.17724567651748657 2023-01-22 13:53:40.211491: step: 484/466, loss: 0.04301964491605759 2023-01-22 13:53:40.983451: step: 486/466, loss: 0.23543091118335724 2023-01-22 13:53:41.730724: step: 488/466, loss: 0.06778578460216522 2023-01-22 13:53:42.453039: step: 
490/466, loss: 1.0197558403015137 2023-01-22 13:53:43.211226: step: 492/466, loss: 0.06885930895805359 2023-01-22 13:53:44.049463: step: 494/466, loss: 0.14625200629234314 2023-01-22 13:53:44.869481: step: 496/466, loss: 0.04098017141222954 2023-01-22 13:53:45.672793: step: 498/466, loss: 0.42735791206359863 2023-01-22 13:53:46.428380: step: 500/466, loss: 0.08337932080030441 2023-01-22 13:53:47.151487: step: 502/466, loss: 0.11113856732845306 2023-01-22 13:53:47.842294: step: 504/466, loss: 0.07578197866678238 2023-01-22 13:53:48.581287: step: 506/466, loss: 0.04823679476976395 2023-01-22 13:53:49.389355: step: 508/466, loss: 0.6794129610061646 2023-01-22 13:53:50.125905: step: 510/466, loss: 5.463428020477295 2023-01-22 13:53:50.916958: step: 512/466, loss: 0.019826870411634445 2023-01-22 13:53:51.685112: step: 514/466, loss: 0.298760324716568 2023-01-22 13:53:52.454459: step: 516/466, loss: 0.07626804709434509 2023-01-22 13:53:53.170926: step: 518/466, loss: 0.016729634255170822 2023-01-22 13:53:53.971669: step: 520/466, loss: 0.12583647668361664 2023-01-22 13:53:54.733210: step: 522/466, loss: 0.056236542761325836 2023-01-22 13:53:55.656806: step: 524/466, loss: 0.09457962214946747 2023-01-22 13:53:56.464681: step: 526/466, loss: 0.06622578203678131 2023-01-22 13:53:57.289997: step: 528/466, loss: 0.2225954830646515 2023-01-22 13:53:58.097038: step: 530/466, loss: 0.04802557826042175 2023-01-22 13:53:58.826220: step: 532/466, loss: 0.026738611981272697 2023-01-22 13:53:59.699027: step: 534/466, loss: 0.07819850742816925 2023-01-22 13:54:00.575410: step: 536/466, loss: 0.22115235030651093 2023-01-22 13:54:01.372280: step: 538/466, loss: 0.10885797441005707 2023-01-22 13:54:02.095082: step: 540/466, loss: 0.038471028208732605 2023-01-22 13:54:02.801959: step: 542/466, loss: 0.02333487570285797 2023-01-22 13:54:03.577373: step: 544/466, loss: 0.07428286224603653 2023-01-22 13:54:04.310218: step: 546/466, loss: 0.04125874862074852 2023-01-22 13:54:05.124695: step: 
548/466, loss: 0.04482059180736542 2023-01-22 13:54:05.882607: step: 550/466, loss: 0.024111980572342873 2023-01-22 13:54:06.689185: step: 552/466, loss: 0.24125415086746216 2023-01-22 13:54:07.505218: step: 554/466, loss: 0.19689196348190308 2023-01-22 13:54:08.378006: step: 556/466, loss: 0.1388603299856186 2023-01-22 13:54:09.155242: step: 558/466, loss: 0.02577141858637333 2023-01-22 13:54:09.885838: step: 560/466, loss: 0.09173966199159622 2023-01-22 13:54:10.585245: step: 562/466, loss: 0.0680513009428978 2023-01-22 13:54:11.353414: step: 564/466, loss: 0.057585883885622025 2023-01-22 13:54:12.092890: step: 566/466, loss: 0.011553946882486343 2023-01-22 13:54:12.921862: step: 568/466, loss: 0.059238385409116745 2023-01-22 13:54:13.611942: step: 570/466, loss: 0.07379874587059021 2023-01-22 13:54:14.444352: step: 572/466, loss: 0.012319848872721195 2023-01-22 13:54:15.240952: step: 574/466, loss: 0.043450977653265 2023-01-22 13:54:16.296036: step: 576/466, loss: 0.13805872201919556 2023-01-22 13:54:17.019689: step: 578/466, loss: 0.022836795076727867 2023-01-22 13:54:17.710894: step: 580/466, loss: 0.07715574651956558 2023-01-22 13:54:18.490548: step: 582/466, loss: 0.12319004535675049 2023-01-22 13:54:19.328612: step: 584/466, loss: 0.10562030225992203 2023-01-22 13:54:20.176752: step: 586/466, loss: 0.06464926153421402 2023-01-22 13:54:20.929717: step: 588/466, loss: 0.06526906043291092 2023-01-22 13:54:21.736991: step: 590/466, loss: 0.021421361714601517 2023-01-22 13:54:22.601422: step: 592/466, loss: 0.03983045369386673 2023-01-22 13:54:23.449716: step: 594/466, loss: 0.10576335340738297 2023-01-22 13:54:24.247129: step: 596/466, loss: 6.454718589782715 2023-01-22 13:54:25.016202: step: 598/466, loss: 0.08060228824615479 2023-01-22 13:54:25.782170: step: 600/466, loss: 0.28485438227653503 2023-01-22 13:54:26.538247: step: 602/466, loss: 0.04787430539727211 2023-01-22 13:54:27.285506: step: 604/466, loss: 0.06249743700027466 2023-01-22 13:54:28.049761: 
step: 606/466, loss: 0.28890371322631836 2023-01-22 13:54:28.859159: step: 608/466, loss: 0.06000453978776932 2023-01-22 13:54:29.577434: step: 610/466, loss: 0.048165880143642426 2023-01-22 13:54:30.385917: step: 612/466, loss: 0.048403237015008926 2023-01-22 13:54:31.200145: step: 614/466, loss: 0.021623866632580757 2023-01-22 13:54:32.016152: step: 616/466, loss: 0.07675395905971527 2023-01-22 13:54:32.727106: step: 618/466, loss: 0.04885542392730713 2023-01-22 13:54:33.514609: step: 620/466, loss: 0.10137677192687988 2023-01-22 13:54:34.278834: step: 622/466, loss: 0.10909571498632431 2023-01-22 13:54:35.001877: step: 624/466, loss: 0.05348493903875351 2023-01-22 13:54:35.810131: step: 626/466, loss: 0.05490071326494217 2023-01-22 13:54:36.613178: step: 628/466, loss: 0.2063806653022766 2023-01-22 13:54:37.425977: step: 630/466, loss: 0.11666103452444077 2023-01-22 13:54:38.250365: step: 632/466, loss: 0.08298062533140182 2023-01-22 13:54:39.095839: step: 634/466, loss: 0.1579156070947647 2023-01-22 13:54:39.807483: step: 636/466, loss: 0.00876704789698124 2023-01-22 13:54:40.533521: step: 638/466, loss: 0.1683746874332428 2023-01-22 13:54:41.314499: step: 640/466, loss: 0.5691487193107605 2023-01-22 13:54:42.012288: step: 642/466, loss: 0.08855330944061279 2023-01-22 13:54:42.785742: step: 644/466, loss: 0.039370011538267136 2023-01-22 13:54:43.603715: step: 646/466, loss: 0.2885306179523468 2023-01-22 13:54:44.426778: step: 648/466, loss: 0.0817800760269165 2023-01-22 13:54:45.222222: step: 650/466, loss: 0.15116387605667114 2023-01-22 13:54:45.933396: step: 652/466, loss: 0.3503814935684204 2023-01-22 13:54:46.719710: step: 654/466, loss: 0.3401745855808258 2023-01-22 13:54:47.442611: step: 656/466, loss: 0.1454222947359085 2023-01-22 13:54:48.173628: step: 658/466, loss: 0.08071214705705643 2023-01-22 13:54:48.964869: step: 660/466, loss: 0.12128698825836182 2023-01-22 13:54:49.792702: step: 662/466, loss: 0.07182589918375015 2023-01-22 13:54:50.559417: 
step: 664/466, loss: 0.10454913228750229 2023-01-22 13:54:51.378504: step: 666/466, loss: 0.03323966637253761 2023-01-22 13:54:52.197893: step: 668/466, loss: 0.2441716492176056 2023-01-22 13:54:52.958878: step: 670/466, loss: 0.10043670982122421 2023-01-22 13:54:53.675964: step: 672/466, loss: 0.05408914014697075 2023-01-22 13:54:54.471840: step: 674/466, loss: 0.06823401153087616 2023-01-22 13:54:55.214540: step: 676/466, loss: 0.0523863360285759 2023-01-22 13:54:56.010845: step: 678/466, loss: 0.26087960600852966 2023-01-22 13:54:56.757032: step: 680/466, loss: 0.033554527908563614 2023-01-22 13:54:57.479004: step: 682/466, loss: 0.22630837559700012 2023-01-22 13:54:58.301279: step: 684/466, loss: 0.4421272575855255 2023-01-22 13:54:59.024287: step: 686/466, loss: 0.910071074962616 2023-01-22 13:54:59.808642: step: 688/466, loss: 0.015432994812726974 2023-01-22 13:55:00.659487: step: 690/466, loss: 0.1441815048456192 2023-01-22 13:55:01.341735: step: 692/466, loss: 0.2625843584537506 2023-01-22 13:55:02.197519: step: 694/466, loss: 0.10949409753084183 2023-01-22 13:55:03.019377: step: 696/466, loss: 1.3729277849197388 2023-01-22 13:55:03.812896: step: 698/466, loss: 0.07162289321422577 2023-01-22 13:55:04.684386: step: 700/466, loss: 0.049353040754795074 2023-01-22 13:55:05.468109: step: 702/466, loss: 0.08966630697250366 2023-01-22 13:55:06.201127: step: 704/466, loss: 0.015086804516613483 2023-01-22 13:55:07.002284: step: 706/466, loss: 0.10388107597827911 2023-01-22 13:55:07.754668: step: 708/466, loss: 0.14753474295139313 2023-01-22 13:55:08.601235: step: 710/466, loss: 0.10368377715349197 2023-01-22 13:55:09.297694: step: 712/466, loss: 0.02157321386039257 2023-01-22 13:55:10.037722: step: 714/466, loss: 0.08537424355745316 2023-01-22 13:55:10.845125: step: 716/466, loss: 0.03927518427371979 2023-01-22 13:55:11.625609: step: 718/466, loss: 0.1469808667898178 2023-01-22 13:55:12.433548: step: 720/466, loss: 0.13254177570343018 2023-01-22 13:55:13.136598: 
step: 722/466, loss: 0.019300326704978943 2023-01-22 13:55:13.899770: step: 724/466, loss: 0.09577452391386032 2023-01-22 13:55:14.701160: step: 726/466, loss: 0.09022750705480576 2023-01-22 13:55:15.515520: step: 728/466, loss: 0.05232621356844902 2023-01-22 13:55:16.280712: step: 730/466, loss: 0.18306727707386017 2023-01-22 13:55:17.036299: step: 732/466, loss: 0.0629793256521225 2023-01-22 13:55:17.865612: step: 734/466, loss: 0.20556853711605072 2023-01-22 13:55:18.549478: step: 736/466, loss: 0.022572429850697517 2023-01-22 13:55:19.295180: step: 738/466, loss: 0.1834261119365692 2023-01-22 13:55:20.072830: step: 740/466, loss: 0.6015645265579224 2023-01-22 13:55:20.889845: step: 742/466, loss: 0.04095159471035004 2023-01-22 13:55:21.627020: step: 744/466, loss: 0.07869676500558853 2023-01-22 13:55:22.328244: step: 746/466, loss: 0.0747077688574791 2023-01-22 13:55:23.046624: step: 748/466, loss: 0.21149002015590668 2023-01-22 13:55:23.830275: step: 750/466, loss: 0.05114706978201866 2023-01-22 13:55:24.580604: step: 752/466, loss: 0.10323463380336761 2023-01-22 13:55:25.373897: step: 754/466, loss: 0.24099045991897583 2023-01-22 13:55:26.165082: step: 756/466, loss: 0.08624985814094543 2023-01-22 13:55:26.972674: step: 758/466, loss: 0.10293315351009369 2023-01-22 13:55:27.714453: step: 760/466, loss: 0.010724215768277645 2023-01-22 13:55:28.512417: step: 762/466, loss: 0.242195725440979 2023-01-22 13:55:29.237552: step: 764/466, loss: 0.14631156623363495 2023-01-22 13:55:30.128349: step: 766/466, loss: 0.15860383212566376 2023-01-22 13:55:30.819497: step: 768/466, loss: 0.14705486595630646 2023-01-22 13:55:31.598837: step: 770/466, loss: 0.017465418204665184 2023-01-22 13:55:32.403542: step: 772/466, loss: 0.08447589725255966 2023-01-22 13:55:33.160979: step: 774/466, loss: 0.8887242078781128 2023-01-22 13:55:33.890712: step: 776/466, loss: 0.5055169463157654 2023-01-22 13:55:34.626035: step: 778/466, loss: 0.0075454795733094215 2023-01-22 13:55:35.315082: 
step: 780/466, loss: 0.07857229560613632 2023-01-22 13:55:36.049547: step: 782/466, loss: 0.07658013701438904 2023-01-22 13:55:36.709764: step: 784/466, loss: 0.14287102222442627 2023-01-22 13:55:37.538410: step: 786/466, loss: 0.13052555918693542 2023-01-22 13:55:38.282947: step: 788/466, loss: 0.5707105994224548 2023-01-22 13:55:39.032355: step: 790/466, loss: 0.0379331149160862 2023-01-22 13:55:39.727841: step: 792/466, loss: 0.02855241671204567 2023-01-22 13:55:40.502840: step: 794/466, loss: 0.08172359317541122 2023-01-22 13:55:41.175638: step: 796/466, loss: 0.00676638213917613 2023-01-22 13:55:41.953357: step: 798/466, loss: 0.18921613693237305 2023-01-22 13:55:42.748133: step: 800/466, loss: 0.10927959531545639 2023-01-22 13:55:43.477921: step: 802/466, loss: 0.3304141163825989 2023-01-22 13:55:44.265608: step: 804/466, loss: 0.16432689130306244 2023-01-22 13:55:45.022825: step: 806/466, loss: 0.035742390900850296 2023-01-22 13:55:45.759899: step: 808/466, loss: 0.11775850504636765 2023-01-22 13:55:46.566075: step: 810/466, loss: 0.2532691955566406 2023-01-22 13:55:47.370713: step: 812/466, loss: 0.17157238721847534 2023-01-22 13:55:48.220490: step: 814/466, loss: 0.031197600066661835 2023-01-22 13:55:48.936936: step: 816/466, loss: 0.1423412412405014 2023-01-22 13:55:49.783260: step: 818/466, loss: 0.036518923938274384 2023-01-22 13:55:50.508069: step: 820/466, loss: 0.023789752274751663 2023-01-22 13:55:51.313242: step: 822/466, loss: 0.057291388511657715 2023-01-22 13:55:52.060285: step: 824/466, loss: 0.09669878333806992 2023-01-22 13:55:52.896899: step: 826/466, loss: 0.529859185218811 2023-01-22 13:55:53.666591: step: 828/466, loss: 0.051852673292160034 2023-01-22 13:55:54.399023: step: 830/466, loss: 0.06867492944002151 2023-01-22 13:55:55.124595: step: 832/466, loss: 0.6612184047698975 2023-01-22 13:55:55.874638: step: 834/466, loss: 0.12671461701393127 2023-01-22 13:55:56.580603: step: 836/466, loss: 0.11311160027980804 2023-01-22 13:55:57.407391: 
step: 838/466, loss: 0.7831020951271057 2023-01-22 13:55:58.090415: step: 840/466, loss: 0.04771916940808296 2023-01-22 13:55:58.914839: step: 842/466, loss: 0.03580975905060768 2023-01-22 13:55:59.771175: step: 844/466, loss: 0.07718465477228165 2023-01-22 13:56:00.589672: step: 846/466, loss: 0.10081658512353897 2023-01-22 13:56:01.356172: step: 848/466, loss: 0.25340622663497925 2023-01-22 13:56:02.167846: step: 850/466, loss: 0.0629146546125412 2023-01-22 13:56:02.915178: step: 852/466, loss: 0.13862603902816772 2023-01-22 13:56:03.685556: step: 854/466, loss: 0.05720217898488045 2023-01-22 13:56:04.399645: step: 856/466, loss: 0.03626011312007904 2023-01-22 13:56:05.197642: step: 858/466, loss: 0.059770986437797546 2023-01-22 13:56:05.911752: step: 860/466, loss: 0.03019898012280464 2023-01-22 13:56:06.667431: step: 862/466, loss: 0.038489192724227905 2023-01-22 13:56:07.381287: step: 864/466, loss: 0.3332316279411316 2023-01-22 13:56:08.200423: step: 866/466, loss: 0.09183098375797272 2023-01-22 13:56:08.899125: step: 868/466, loss: 0.021056165918707848 2023-01-22 13:56:09.687717: step: 870/466, loss: 0.018868671730160713 2023-01-22 13:56:10.526121: step: 872/466, loss: 0.08701654523611069 2023-01-22 13:56:11.239701: step: 874/466, loss: 0.027373263612389565 2023-01-22 13:56:11.985429: step: 876/466, loss: 0.06184190884232521 2023-01-22 13:56:12.866818: step: 878/466, loss: 0.17068737745285034 2023-01-22 13:56:13.592650: step: 880/466, loss: 0.8000120520591736 2023-01-22 13:56:14.489676: step: 882/466, loss: 0.12541207671165466 2023-01-22 13:56:15.240395: step: 884/466, loss: 0.09928561747074127 2023-01-22 13:56:16.038846: step: 886/466, loss: 0.04651761054992676 2023-01-22 13:56:16.675722: step: 888/466, loss: 0.16089124977588654 2023-01-22 13:56:17.544809: step: 890/466, loss: 0.07804146409034729 2023-01-22 13:56:18.281200: step: 892/466, loss: 0.1407633274793625 2023-01-22 13:56:19.007298: step: 894/466, loss: 0.15471599996089935 2023-01-22 
13:56:19.717608: step: 896/466, loss: 0.25648272037506104 2023-01-22 13:56:20.449263: step: 898/466, loss: 0.033358488231897354 2023-01-22 13:56:21.176607: step: 900/466, loss: 0.05572345107793808 2023-01-22 13:56:21.983193: step: 902/466, loss: 0.05574074015021324 2023-01-22 13:56:22.732491: step: 904/466, loss: 0.2865733504295349 2023-01-22 13:56:23.507063: step: 906/466, loss: 0.09892137348651886 2023-01-22 13:56:24.277939: step: 908/466, loss: 0.021022509783506393 2023-01-22 13:56:25.069244: step: 910/466, loss: 0.17897191643714905 2023-01-22 13:56:25.827071: step: 912/466, loss: 0.08533147722482681 2023-01-22 13:56:26.545246: step: 914/466, loss: 0.057386137545108795 2023-01-22 13:56:27.357761: step: 916/466, loss: 0.11319765448570251 2023-01-22 13:56:28.106549: step: 918/466, loss: 0.1495431363582611 2023-01-22 13:56:28.834937: step: 920/466, loss: 0.974583625793457 2023-01-22 13:56:29.695334: step: 922/466, loss: 0.04371223598718643 2023-01-22 13:56:30.389075: step: 924/466, loss: 0.06663193553686142 2023-01-22 13:56:31.115993: step: 926/466, loss: 0.173104390501976 2023-01-22 13:56:31.845048: step: 928/466, loss: 0.24562832713127136 2023-01-22 13:56:32.576110: step: 930/466, loss: 0.1565488576889038 2023-01-22 13:56:33.368165: step: 932/466, loss: 0.10224548727273941
==================================================
Loss: 0.157
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2978166390728477, 'r': 0.34133064516129036, 'f1': 0.3180923961096375}, 'combined': 0.23438387081762763, 'epoch': 17}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33336874133889766, 'r': 0.31112489984574765, 'f1': 0.32186296227879224}, 'combined': 0.19782796705916011, 'epoch': 17}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.26824143975097453, 'r': 0.34306400453919705, 'f1': 0.30107365594031116}, 'combined': 0.22184374648233451, 'epoch': 17}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32069202095693583, 'r': 0.3201362288928857, 'f1': 0.3204138839049351}, 'combined': 0.19693731400986253, 'epoch': 17}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30089890016920473, 'r': 0.33744070208728655, 'f1': 0.31812388193202146}, 'combined': 0.23440707089727897, 'epoch': 17}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.33691290815754443, 'r': 0.30890975734964465, 'f1': 0.3223042183729355}, 'combined': 0.1990702525244602, 'epoch': 17}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.234375, 'r': 0.375, 'f1': 0.28846153846153844}, 'combined': 0.1923076923076923, 'epoch': 17}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2, 'r': 0.391304347826087, 'f1': 0.2647058823529412}, 'combined': 0.1323529411764706, 'epoch': 17}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.38461538461538464, 'r': 0.1724137931034483, 'f1': 0.23809523809523808}, 'combined': 0.15873015873015872, 'epoch': 17}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch': 11}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11}
******************************
Epoch: 18
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 13:59:19.678023: step: 2/466, loss: 0.19651255011558533 2023-01-22 13:59:20.432830: step: 4/466, loss: 0.10556065291166306 2023-01-22 13:59:21.184630: step: 6/466, loss: 0.14724516868591309 2023-01-22 13:59:21.940542: step: 8/466,
loss: 0.052163269370794296 2023-01-22 13:59:22.774558: step: 10/466, loss: 0.16769303381443024 2023-01-22 13:59:23.520043: step: 12/466, loss: 0.01563209481537342 2023-01-22 13:59:24.258840: step: 14/466, loss: 0.10199791193008423 2023-01-22 13:59:25.019762: step: 16/466, loss: 0.05524469166994095 2023-01-22 13:59:25.717093: step: 18/466, loss: 0.011698777787387371 2023-01-22 13:59:26.438469: step: 20/466, loss: 0.054983723908662796 2023-01-22 13:59:27.174898: step: 22/466, loss: 0.007442728150635958 2023-01-22 13:59:27.952491: step: 24/466, loss: 0.046553682535886765 2023-01-22 13:59:29.422028: step: 26/466, loss: 0.47874942421913147 2023-01-22 13:59:30.162319: step: 28/466, loss: 0.078194759786129 2023-01-22 13:59:30.907249: step: 30/466, loss: 0.011652514338493347 2023-01-22 13:59:31.650773: step: 32/466, loss: 0.09648700803518295 2023-01-22 13:59:32.410613: step: 34/466, loss: 0.10296231508255005 2023-01-22 13:59:33.139441: step: 36/466, loss: 0.11012774705886841 2023-01-22 13:59:33.868027: step: 38/466, loss: 0.02304697409272194 2023-01-22 13:59:34.655889: step: 40/466, loss: 0.1753087341785431 2023-01-22 13:59:35.404962: step: 42/466, loss: 0.039917685091495514 2023-01-22 13:59:36.165547: step: 44/466, loss: 0.13597364723682404 2023-01-22 13:59:36.874746: step: 46/466, loss: 0.08333782851696014 2023-01-22 13:59:37.592073: step: 48/466, loss: 0.07603185623884201 2023-01-22 13:59:38.295108: step: 50/466, loss: 0.009517199359834194 2023-01-22 13:59:39.016143: step: 52/466, loss: 0.023968877270817757 2023-01-22 13:59:39.796290: step: 54/466, loss: 0.0795581191778183 2023-01-22 13:59:40.547307: step: 56/466, loss: 0.036575641483068466 2023-01-22 13:59:41.295218: step: 58/466, loss: 0.024914629757404327 2023-01-22 13:59:42.067458: step: 60/466, loss: 1.6338603496551514 2023-01-22 13:59:42.793454: step: 62/466, loss: 0.03559728339314461 2023-01-22 13:59:43.537593: step: 64/466, loss: 0.028533292934298515 2023-01-22 13:59:44.210528: step: 66/466, loss: 
0.04734528437256813 2023-01-22 13:59:44.956259: step: 68/466, loss: 0.42479050159454346 2023-01-22 13:59:45.856000: step: 70/466, loss: 0.0825815424323082 2023-01-22 13:59:46.560930: step: 72/466, loss: 0.02216201275587082 2023-01-22 13:59:47.250027: step: 74/466, loss: 0.37110376358032227 2023-01-22 13:59:47.959343: step: 76/466, loss: 0.044285114854574203 2023-01-22 13:59:48.706997: step: 78/466, loss: 0.00979369506239891 2023-01-22 13:59:49.448629: step: 80/466, loss: 0.1044786274433136 2023-01-22 13:59:50.185695: step: 82/466, loss: 0.014085735194385052 2023-01-22 13:59:50.901828: step: 84/466, loss: 0.10190731287002563 2023-01-22 13:59:51.670194: step: 86/466, loss: 0.06342112272977829 2023-01-22 13:59:52.393920: step: 88/466, loss: 0.04446432739496231 2023-01-22 13:59:53.257383: step: 90/466, loss: 0.17387893795967102 2023-01-22 13:59:53.985949: step: 92/466, loss: 0.03474210202693939 2023-01-22 13:59:54.702382: step: 94/466, loss: 0.03775353729724884 2023-01-22 13:59:55.507626: step: 96/466, loss: 0.30774736404418945 2023-01-22 13:59:56.218608: step: 98/466, loss: 0.035334642976522446 2023-01-22 13:59:56.925462: step: 100/466, loss: 0.01667802408337593 2023-01-22 13:59:57.700637: step: 102/466, loss: 0.05429663509130478 2023-01-22 13:59:58.435038: step: 104/466, loss: 0.3033483922481537 2023-01-22 13:59:59.219377: step: 106/466, loss: 0.004159913398325443 2023-01-22 13:59:59.932575: step: 108/466, loss: 0.024774271994829178 2023-01-22 14:00:00.782248: step: 110/466, loss: 0.07045798003673553 2023-01-22 14:00:01.556676: step: 112/466, loss: 0.05230962857604027 2023-01-22 14:00:02.300820: step: 114/466, loss: 0.020267944782972336 2023-01-22 14:00:03.092713: step: 116/466, loss: 0.11753572523593903 2023-01-22 14:00:03.856590: step: 118/466, loss: 0.02458942122757435 2023-01-22 14:00:04.593348: step: 120/466, loss: 0.6894834637641907 2023-01-22 14:00:05.381970: step: 122/466, loss: 0.22759802639484406 2023-01-22 14:00:06.103548: step: 124/466, loss: 
0.01465371623635292 2023-01-22 14:00:06.938352: step: 126/466, loss: 0.0885717123746872 2023-01-22 14:00:07.681346: step: 128/466, loss: 0.06091802567243576 2023-01-22 14:00:08.590733: step: 130/466, loss: 0.13552923500537872 2023-01-22 14:00:09.379661: step: 132/466, loss: 0.061866942793130875 2023-01-22 14:00:10.111882: step: 134/466, loss: 0.2426026165485382 2023-01-22 14:00:10.832176: step: 136/466, loss: 0.024837233126163483 2023-01-22 14:00:11.593012: step: 138/466, loss: 0.06240249425172806 2023-01-22 14:00:12.365066: step: 140/466, loss: 0.05201075226068497 2023-01-22 14:00:13.135573: step: 142/466, loss: 0.13489706814289093 2023-01-22 14:00:13.902287: step: 144/466, loss: 0.0347241647541523 2023-01-22 14:00:14.684549: step: 146/466, loss: 0.15452273190021515 2023-01-22 14:00:15.490512: step: 148/466, loss: 0.3027150630950928 2023-01-22 14:00:16.333128: step: 150/466, loss: 0.06525728106498718 2023-01-22 14:00:17.125243: step: 152/466, loss: 0.05072518065571785 2023-01-22 14:00:17.942204: step: 154/466, loss: 0.025421161204576492 2023-01-22 14:00:18.776275: step: 156/466, loss: 0.07371964305639267 2023-01-22 14:00:19.525960: step: 158/466, loss: 0.05787751078605652 2023-01-22 14:00:20.283123: step: 160/466, loss: 0.056591957807540894 2023-01-22 14:00:21.089695: step: 162/466, loss: 0.06800525635480881 2023-01-22 14:00:21.809311: step: 164/466, loss: 0.24484489858150482 2023-01-22 14:00:22.612192: step: 166/466, loss: 0.10593988746404648 2023-01-22 14:00:23.424013: step: 168/466, loss: 0.03584035485982895 2023-01-22 14:00:24.147487: step: 170/466, loss: 0.04015500470995903 2023-01-22 14:00:24.994444: step: 172/466, loss: 0.16253499686717987 2023-01-22 14:00:25.739788: step: 174/466, loss: 0.04112454131245613 2023-01-22 14:00:26.541397: step: 176/466, loss: 0.04308474063873291 2023-01-22 14:00:27.427468: step: 178/466, loss: 0.28503429889678955 2023-01-22 14:00:28.237398: step: 180/466, loss: 0.188304603099823 2023-01-22 14:00:29.007918: step: 182/466, loss: 
0.03568580374121666 2023-01-22 14:00:29.739877: step: 184/466, loss: 0.08562834560871124 2023-01-22 14:00:30.434023: step: 186/466, loss: 0.029489878565073013 2023-01-22 14:00:31.235641: step: 188/466, loss: 0.041018228977918625 2023-01-22 14:00:32.156789: step: 190/466, loss: 0.4618982672691345 2023-01-22 14:00:32.861766: step: 192/466, loss: 0.11267931014299393 2023-01-22 14:00:33.664639: step: 194/466, loss: 0.10493389517068863 2023-01-22 14:00:34.323871: step: 196/466, loss: 0.0027219559997320175 2023-01-22 14:00:35.109517: step: 198/466, loss: 0.03727314621210098 2023-01-22 14:00:35.879660: step: 200/466, loss: 0.14151844382286072 2023-01-22 14:00:36.586288: step: 202/466, loss: 0.03493554890155792 2023-01-22 14:00:37.326617: step: 204/466, loss: 0.02788936160504818 2023-01-22 14:00:38.078507: step: 206/466, loss: 0.01529090479016304 2023-01-22 14:00:38.874551: step: 208/466, loss: 0.07066036015748978 2023-01-22 14:00:39.657465: step: 210/466, loss: 0.04624416306614876 2023-01-22 14:00:40.396797: step: 212/466, loss: 0.0838155597448349 2023-01-22 14:00:41.116942: step: 214/466, loss: 0.05787842348217964 2023-01-22 14:00:41.912313: step: 216/466, loss: 0.3357498049736023 2023-01-22 14:00:42.757661: step: 218/466, loss: 0.15849146246910095 2023-01-22 14:00:43.631488: step: 220/466, loss: 0.08705505728721619 2023-01-22 14:00:44.327720: step: 222/466, loss: 0.0284771379083395 2023-01-22 14:00:45.082055: step: 224/466, loss: 0.04284011945128441 2023-01-22 14:00:45.877440: step: 226/466, loss: 0.06858966499567032 2023-01-22 14:00:46.600635: step: 228/466, loss: 0.17510446906089783 2023-01-22 14:00:47.284261: step: 230/466, loss: 0.08641013503074646 2023-01-22 14:00:48.154975: step: 232/466, loss: 0.3538026809692383 2023-01-22 14:00:48.894097: step: 234/466, loss: 0.02047703228890896 2023-01-22 14:00:49.631469: step: 236/466, loss: 0.1350494623184204 2023-01-22 14:00:50.372003: step: 238/466, loss: 0.012271600775420666 2023-01-22 14:00:51.174807: step: 240/466, loss: 
0.044202882796525955 2023-01-22 14:00:52.135983: step: 242/466, loss: 0.07256503403186798 2023-01-22 14:00:52.890125: step: 244/466, loss: 0.04749925807118416 2023-01-22 14:00:53.641452: step: 246/466, loss: 0.1495182067155838 2023-01-22 14:00:54.437484: step: 248/466, loss: 0.06420061737298965 2023-01-22 14:00:55.204387: step: 250/466, loss: 0.2077827900648117 2023-01-22 14:00:55.927934: step: 252/466, loss: 0.051569852977991104 2023-01-22 14:00:56.733394: step: 254/466, loss: 0.05237215757369995 2023-01-22 14:00:57.506069: step: 256/466, loss: 0.07263194024562836 2023-01-22 14:00:58.229107: step: 258/466, loss: 0.11570609360933304 2023-01-22 14:00:59.018443: step: 260/466, loss: 0.053916919976472855 2023-01-22 14:00:59.868222: step: 262/466, loss: 0.06967765837907791 2023-01-22 14:01:00.573403: step: 264/466, loss: 0.053337082266807556 2023-01-22 14:01:01.312517: step: 266/466, loss: 0.1144920140504837 2023-01-22 14:01:02.090494: step: 268/466, loss: 0.036679599434137344 2023-01-22 14:01:02.851048: step: 270/466, loss: 0.043247222900390625 2023-01-22 14:01:03.579140: step: 272/466, loss: 0.12316104024648666 2023-01-22 14:01:04.332571: step: 274/466, loss: 0.03358753025531769 2023-01-22 14:01:05.088946: step: 276/466, loss: 1.4639196395874023 2023-01-22 14:01:05.839477: step: 278/466, loss: 0.1559915691614151 2023-01-22 14:01:06.610353: step: 280/466, loss: 0.12618787586688995 2023-01-22 14:01:07.412050: step: 282/466, loss: 0.042094483971595764 2023-01-22 14:01:08.226101: step: 284/466, loss: 0.1456632912158966 2023-01-22 14:01:08.943939: step: 286/466, loss: 0.06955816596746445 2023-01-22 14:01:09.834181: step: 288/466, loss: 0.18508176505565643 2023-01-22 14:01:10.602003: step: 290/466, loss: 0.38990962505340576 2023-01-22 14:01:11.417181: step: 292/466, loss: 0.15841281414031982 2023-01-22 14:01:12.160488: step: 294/466, loss: 5.494202613830566 2023-01-22 14:01:12.898131: step: 296/466, loss: 0.04548550769686699 2023-01-22 14:01:13.618748: step: 298/466, loss: 
0.08657371997833252 2023-01-22 14:01:14.410443: step: 300/466, loss: 0.09277087450027466 2023-01-22 14:01:15.185947: step: 302/466, loss: 0.06852469593286514 2023-01-22 14:01:15.948912: step: 304/466, loss: 0.09778723120689392 2023-01-22 14:01:16.732338: step: 306/466, loss: 0.1429487019777298 2023-01-22 14:01:17.533969: step: 308/466, loss: 0.20687298476696014 2023-01-22 14:01:18.416479: step: 310/466, loss: 0.11959525942802429 2023-01-22 14:01:19.175658: step: 312/466, loss: 0.10952453315258026 2023-01-22 14:01:19.943054: step: 314/466, loss: 0.02904977649450302 2023-01-22 14:01:20.774640: step: 316/466, loss: 0.020986704155802727 2023-01-22 14:01:21.483408: step: 318/466, loss: 0.11110341548919678 2023-01-22 14:01:22.324070: step: 320/466, loss: 0.059407394379377365 2023-01-22 14:01:23.117305: step: 322/466, loss: 0.06788373738527298 2023-01-22 14:01:23.959341: step: 324/466, loss: 0.16677653789520264 2023-01-22 14:01:24.694755: step: 326/466, loss: 0.09506413340568542 2023-01-22 14:01:25.479899: step: 328/466, loss: 0.0655602365732193 2023-01-22 14:01:26.224662: step: 330/466, loss: 0.04528603330254555 2023-01-22 14:01:27.000898: step: 332/466, loss: 0.05972367525100708 2023-01-22 14:01:27.837965: step: 334/466, loss: 0.10645157843828201 2023-01-22 14:01:28.612572: step: 336/466, loss: 0.09351608157157898 2023-01-22 14:01:29.387955: step: 338/466, loss: 0.13656413555145264 2023-01-22 14:01:30.238319: step: 340/466, loss: 0.3023928701877594 2023-01-22 14:01:31.033679: step: 342/466, loss: 0.11083973199129105 2023-01-22 14:01:31.783118: step: 344/466, loss: 0.04845494404435158 2023-01-22 14:01:32.524418: step: 346/466, loss: 0.08185308426618576 2023-01-22 14:01:33.295562: step: 348/466, loss: 0.01976921781897545 2023-01-22 14:01:34.069044: step: 350/466, loss: 0.026596803218126297 2023-01-22 14:01:34.946812: step: 352/466, loss: 0.025404812768101692 2023-01-22 14:01:35.656453: step: 354/466, loss: 0.06451868265867233 2023-01-22 14:01:36.412611: step: 356/466, 
loss: 0.09652923792600632 2023-01-22 14:01:37.145667: step: 358/466, loss: 0.07527028024196625 2023-01-22 14:01:37.906738: step: 360/466, loss: 0.132870152592659 2023-01-22 14:01:38.612053: step: 362/466, loss: 0.049478884786367416 2023-01-22 14:01:39.355160: step: 364/466, loss: 0.5129496455192566 2023-01-22 14:01:40.126320: step: 366/466, loss: 0.12872274219989777 2023-01-22 14:01:40.907629: step: 368/466, loss: 0.03989960625767708 2023-01-22 14:01:41.652493: step: 370/466, loss: 0.0571373850107193 2023-01-22 14:01:42.424793: step: 372/466, loss: 0.16923730075359344 2023-01-22 14:01:43.096204: step: 374/466, loss: 0.060910288244485855 2023-01-22 14:01:43.881528: step: 376/466, loss: 0.14860039949417114 2023-01-22 14:01:44.659140: step: 378/466, loss: 0.14830780029296875 2023-01-22 14:01:45.387591: step: 380/466, loss: 0.057242076843976974 2023-01-22 14:01:46.087989: step: 382/466, loss: 0.046780772507190704 2023-01-22 14:01:46.793363: step: 384/466, loss: 0.35573306679725647 2023-01-22 14:01:47.556054: step: 386/466, loss: 0.14995847642421722 2023-01-22 14:01:48.255301: step: 388/466, loss: 0.30169206857681274 2023-01-22 14:01:49.036349: step: 390/466, loss: 0.031957320868968964 2023-01-22 14:01:49.751342: step: 392/466, loss: 0.08432843536138535 2023-01-22 14:01:50.684266: step: 394/466, loss: 0.08463986963033676 2023-01-22 14:01:51.512069: step: 396/466, loss: 0.0620778426527977 2023-01-22 14:01:52.322227: step: 398/466, loss: 0.02887018956243992 2023-01-22 14:01:53.062471: step: 400/466, loss: 0.037249885499477386 2023-01-22 14:01:53.795381: step: 402/466, loss: 0.15724727511405945 2023-01-22 14:01:54.586017: step: 404/466, loss: 0.05442766472697258 2023-01-22 14:01:55.341397: step: 406/466, loss: 0.08185989409685135 2023-01-22 14:01:56.050409: step: 408/466, loss: 0.11819741874933243 2023-01-22 14:01:56.821637: step: 410/466, loss: 0.02740086242556572 2023-01-22 14:01:57.748652: step: 412/466, loss: 0.061571717262268066 2023-01-22 14:01:58.603202: step: 
414/466, loss: 0.08704986423254013 2023-01-22 14:01:59.430378: step: 416/466, loss: 0.0622885562479496 2023-01-22 14:02:00.138243: step: 418/466, loss: 0.4812530279159546 2023-01-22 14:02:00.878533: step: 420/466, loss: 0.014615857042372227 2023-01-22 14:02:01.581496: step: 422/466, loss: 0.09261064231395721 2023-01-22 14:02:02.328544: step: 424/466, loss: 0.016920937225222588 2023-01-22 14:02:03.161131: step: 426/466, loss: 0.08510475605726242 2023-01-22 14:02:03.885315: step: 428/466, loss: 0.07296749949455261 2023-01-22 14:02:04.656279: step: 430/466, loss: 0.051430556923151016 2023-01-22 14:02:05.397645: step: 432/466, loss: 0.03637000918388367 2023-01-22 14:02:06.165120: step: 434/466, loss: 0.02003244124352932 2023-01-22 14:02:06.966945: step: 436/466, loss: 0.044223539531230927 2023-01-22 14:02:07.795764: step: 438/466, loss: 0.13895899057388306 2023-01-22 14:02:08.507768: step: 440/466, loss: 0.015650153160095215 2023-01-22 14:02:09.339785: step: 442/466, loss: 0.0724409818649292 2023-01-22 14:02:10.080156: step: 444/466, loss: 0.040596701204776764 2023-01-22 14:02:10.806744: step: 446/466, loss: 0.11499258875846863 2023-01-22 14:02:11.578053: step: 448/466, loss: 1.051138162612915 2023-01-22 14:02:12.359997: step: 450/466, loss: 0.036972444504499435 2023-01-22 14:02:13.139255: step: 452/466, loss: 0.06746082752943039 2023-01-22 14:02:13.938112: step: 454/466, loss: 0.2163953334093094 2023-01-22 14:02:14.655294: step: 456/466, loss: 0.040912926197052 2023-01-22 14:02:15.404474: step: 458/466, loss: 0.14414256811141968 2023-01-22 14:02:16.172598: step: 460/466, loss: 0.024899596348404884 2023-01-22 14:02:17.013209: step: 462/466, loss: 0.22652199864387512 2023-01-22 14:02:17.852189: step: 464/466, loss: 0.15495027601718903 2023-01-22 14:02:18.691021: step: 466/466, loss: 0.027327006682753563 2023-01-22 14:02:19.488568: step: 468/466, loss: 0.020539091899991035 2023-01-22 14:02:20.251960: step: 470/466, loss: 0.2805376648902893 2023-01-22 14:02:20.979733: 
step: 472/466, loss: 0.06690191477537155 2023-01-22 14:02:21.760273: step: 474/466, loss: 0.03993724286556244 2023-01-22 14:02:22.546681: step: 476/466, loss: 0.04442617669701576 2023-01-22 14:02:23.312855: step: 478/466, loss: 0.039570402354002 2023-01-22 14:02:24.062538: step: 480/466, loss: 0.09646327793598175 2023-01-22 14:02:24.777929: step: 482/466, loss: 0.33058470487594604 2023-01-22 14:02:25.511089: step: 484/466, loss: 0.05744529142975807 2023-01-22 14:02:26.302141: step: 486/466, loss: 0.07537069916725159 2023-01-22 14:02:26.942528: step: 488/466, loss: 0.03509717062115669 2023-01-22 14:02:27.724422: step: 490/466, loss: 0.04638931155204773 2023-01-22 14:02:28.470795: step: 492/466, loss: 0.03242679685354233 2023-01-22 14:02:29.277224: step: 494/466, loss: 0.007616049610078335 2023-01-22 14:02:29.978372: step: 496/466, loss: 0.021879682317376137 2023-01-22 14:02:30.762536: step: 498/466, loss: 0.08258692920207977 2023-01-22 14:02:31.412126: step: 500/466, loss: 0.016804566606879234 2023-01-22 14:02:32.202133: step: 502/466, loss: 0.1131386086344719 2023-01-22 14:02:32.921366: step: 504/466, loss: 0.012570555321872234 2023-01-22 14:02:33.639552: step: 506/466, loss: 0.1193646639585495 2023-01-22 14:02:34.425954: step: 508/466, loss: 0.04365543648600578 2023-01-22 14:02:35.159274: step: 510/466, loss: 0.05366494134068489 2023-01-22 14:02:35.905030: step: 512/466, loss: 0.28607046604156494 2023-01-22 14:02:36.683777: step: 514/466, loss: 0.06540153175592422 2023-01-22 14:02:37.429089: step: 516/466, loss: 0.030185092240571976 2023-01-22 14:02:38.177256: step: 518/466, loss: 0.14537490904331207 2023-01-22 14:02:39.033932: step: 520/466, loss: 0.08515751361846924 2023-01-22 14:02:39.724201: step: 522/466, loss: 0.056321412324905396 2023-01-22 14:02:40.549976: step: 524/466, loss: 0.09485920518636703 2023-01-22 14:02:41.366789: step: 526/466, loss: 0.06368310749530792 2023-01-22 14:02:42.030101: step: 528/466, loss: 0.030335480347275734 2023-01-22 
14:02:42.845828: step: 530/466, loss: 0.06590232998132706 2023-01-22 14:02:43.607223: step: 532/466, loss: 0.012064045295119286 2023-01-22 14:02:44.388246: step: 534/466, loss: 0.051502350717782974 2023-01-22 14:02:45.144245: step: 536/466, loss: 0.08099795877933502 2023-01-22 14:02:45.934291: step: 538/466, loss: 0.29615336656570435 2023-01-22 14:02:46.679410: step: 540/466, loss: 0.060773301869630814 2023-01-22 14:02:47.360647: step: 542/466, loss: 0.07017336785793304 2023-01-22 14:02:48.207351: step: 544/466, loss: 0.45528674125671387 2023-01-22 14:02:49.011281: step: 546/466, loss: 0.07251206040382385 2023-01-22 14:02:49.776294: step: 548/466, loss: 0.10662711411714554 2023-01-22 14:02:50.537597: step: 550/466, loss: 0.420543909072876 2023-01-22 14:02:51.335633: step: 552/466, loss: 0.12031018733978271 2023-01-22 14:02:52.095660: step: 554/466, loss: 0.05517619475722313 2023-01-22 14:02:52.859389: step: 556/466, loss: 0.05768076702952385 2023-01-22 14:02:53.618149: step: 558/466, loss: 0.12633764743804932 2023-01-22 14:02:54.424550: step: 560/466, loss: 0.07834474742412567 2023-01-22 14:02:55.218952: step: 562/466, loss: 0.03466970846056938 2023-01-22 14:02:56.015884: step: 564/466, loss: 0.06017180532217026 2023-01-22 14:02:56.814612: step: 566/466, loss: 1.0684977769851685 2023-01-22 14:02:57.577460: step: 568/466, loss: 0.046178270131349564 2023-01-22 14:02:58.320245: step: 570/466, loss: 0.04773535206913948 2023-01-22 14:02:59.042767: step: 572/466, loss: 0.45768627524375916 2023-01-22 14:02:59.753358: step: 574/466, loss: 0.08987529575824738 2023-01-22 14:03:00.512626: step: 576/466, loss: 0.06217389926314354 2023-01-22 14:03:01.290456: step: 578/466, loss: 0.08079030364751816 2023-01-22 14:03:02.124557: step: 580/466, loss: 0.33118191361427307 2023-01-22 14:03:02.876919: step: 582/466, loss: 0.022694973275065422 2023-01-22 14:03:03.603406: step: 584/466, loss: 0.024916207417845726 2023-01-22 14:03:04.366929: step: 586/466, loss: 0.09800676256418228 
2023-01-22 14:03:05.146982: step: 588/466, loss: 0.07045701891183853 2023-01-22 14:03:05.915783: step: 590/466, loss: 0.09660731256008148 2023-01-22 14:03:06.685231: step: 592/466, loss: 0.061863940209150314 2023-01-22 14:03:07.417195: step: 594/466, loss: 0.048730917274951935 2023-01-22 14:03:08.185256: step: 596/466, loss: 0.025510050356388092 2023-01-22 14:03:08.904547: step: 598/466, loss: 0.16134203970432281 2023-01-22 14:03:09.609756: step: 600/466, loss: 0.04883525148034096 2023-01-22 14:03:10.336831: step: 602/466, loss: 0.07418007403612137 2023-01-22 14:03:11.045157: step: 604/466, loss: 0.035486262291669846 2023-01-22 14:03:11.757536: step: 606/466, loss: 0.008110095746815205 2023-01-22 14:03:12.478378: step: 608/466, loss: 0.7604119777679443 2023-01-22 14:03:13.191750: step: 610/466, loss: 0.16966570913791656 2023-01-22 14:03:13.937226: step: 612/466, loss: 0.0026693926192820072 2023-01-22 14:03:14.701604: step: 614/466, loss: 0.1456402987241745 2023-01-22 14:03:15.438291: step: 616/466, loss: 0.01035115122795105 2023-01-22 14:03:16.287900: step: 618/466, loss: 0.1391374170780182 2023-01-22 14:03:17.075110: step: 620/466, loss: 0.12024839222431183 2023-01-22 14:03:17.835362: step: 622/466, loss: 0.02679123915731907 2023-01-22 14:03:18.632763: step: 624/466, loss: 0.18648961186408997 2023-01-22 14:03:19.532862: step: 626/466, loss: 6.7376532554626465 2023-01-22 14:03:20.340387: step: 628/466, loss: 0.04615697264671326 2023-01-22 14:03:21.109653: step: 630/466, loss: 0.1045772135257721 2023-01-22 14:03:21.864615: step: 632/466, loss: 0.0611422024667263 2023-01-22 14:03:22.574133: step: 634/466, loss: 0.002596375299617648 2023-01-22 14:03:23.393964: step: 636/466, loss: 0.06962011754512787 2023-01-22 14:03:24.176219: step: 638/466, loss: 0.04378426820039749 2023-01-22 14:03:24.970313: step: 640/466, loss: 0.05531560257077217 2023-01-22 14:03:25.693778: step: 642/466, loss: 0.04268931224942207 2023-01-22 14:03:26.515197: step: 644/466, loss: 
0.09115175157785416 2023-01-22 14:03:27.248166: step: 646/466, loss: 0.1862109899520874 2023-01-22 14:03:27.970328: step: 648/466, loss: 0.0526001863181591 2023-01-22 14:03:28.737715: step: 650/466, loss: 0.9602542519569397 2023-01-22 14:03:29.501844: step: 652/466, loss: 0.026968909427523613 2023-01-22 14:03:30.238334: step: 654/466, loss: 0.06525703519582748 2023-01-22 14:03:31.140301: step: 656/466, loss: 0.09172939509153366 2023-01-22 14:03:31.919040: step: 658/466, loss: 0.08876709640026093 2023-01-22 14:03:32.758407: step: 660/466, loss: 0.0551137700676918 2023-01-22 14:03:33.496189: step: 662/466, loss: 0.04312824830412865 2023-01-22 14:03:34.307283: step: 664/466, loss: 0.06608863174915314 2023-01-22 14:03:35.052904: step: 666/466, loss: 0.03778354823589325 2023-01-22 14:03:35.882183: step: 668/466, loss: 0.057987332344055176 2023-01-22 14:03:36.686000: step: 670/466, loss: 0.02959345281124115 2023-01-22 14:03:37.375557: step: 672/466, loss: 0.04160600155591965 2023-01-22 14:03:38.163642: step: 674/466, loss: 0.04615752771496773 2023-01-22 14:03:39.004407: step: 676/466, loss: 2.3827602863311768 2023-01-22 14:03:39.756950: step: 678/466, loss: 0.12117211520671844 2023-01-22 14:03:40.548780: step: 680/466, loss: 0.021775022149086 2023-01-22 14:03:41.284449: step: 682/466, loss: 0.04267100989818573 2023-01-22 14:03:42.056828: step: 684/466, loss: 0.14708609879016876 2023-01-22 14:03:42.813633: step: 686/466, loss: 0.021864986047148705 2023-01-22 14:03:43.512050: step: 688/466, loss: 0.1048935055732727 2023-01-22 14:03:44.280596: step: 690/466, loss: 0.06021007522940636 2023-01-22 14:03:45.053711: step: 692/466, loss: 0.07317975908517838 2023-01-22 14:03:45.823142: step: 694/466, loss: 0.05775444954633713 2023-01-22 14:03:46.560583: step: 696/466, loss: 0.33284351229667664 2023-01-22 14:03:47.372836: step: 698/466, loss: 0.06062760576605797 2023-01-22 14:03:48.114965: step: 700/466, loss: 0.1711597740650177 2023-01-22 14:03:48.861999: step: 702/466, loss: 
0.03754610940814018 2023-01-22 14:03:49.584580: step: 704/466, loss: 0.05172597989439964 2023-01-22 14:03:50.325942: step: 706/466, loss: 0.0590791180729866 2023-01-22 14:03:51.118784: step: 708/466, loss: 0.04053680971264839 2023-01-22 14:03:51.924401: step: 710/466, loss: 0.07455841451883316 2023-01-22 14:03:52.682875: step: 712/466, loss: 0.05597339943051338 2023-01-22 14:03:53.421155: step: 714/466, loss: 0.07891589403152466 2023-01-22 14:03:54.198110: step: 716/466, loss: 0.1539381742477417 2023-01-22 14:03:55.029105: step: 718/466, loss: 0.19610744714736938 2023-01-22 14:03:55.831955: step: 720/466, loss: 0.1722133606672287 2023-01-22 14:03:56.661049: step: 722/466, loss: 0.025108935311436653 2023-01-22 14:03:57.393216: step: 724/466, loss: 0.07888443768024445 2023-01-22 14:03:58.112486: step: 726/466, loss: 0.06843721121549606 2023-01-22 14:03:58.858341: step: 728/466, loss: 0.06443135440349579 2023-01-22 14:03:59.685342: step: 730/466, loss: 0.16125111281871796 2023-01-22 14:04:00.426031: step: 732/466, loss: 0.1632445901632309 2023-01-22 14:04:01.412660: step: 734/466, loss: 0.2908114194869995 2023-01-22 14:04:02.121246: step: 736/466, loss: 0.5993254780769348 2023-01-22 14:04:02.909238: step: 738/466, loss: 0.07324164360761642 2023-01-22 14:04:03.718262: step: 740/466, loss: 0.03170664981007576 2023-01-22 14:04:04.482675: step: 742/466, loss: 0.03607878088951111 2023-01-22 14:04:05.188072: step: 744/466, loss: 0.0928855612874031 2023-01-22 14:04:06.008249: step: 746/466, loss: 0.07790108770132065 2023-01-22 14:04:06.737779: step: 748/466, loss: 0.04500148817896843 2023-01-22 14:04:07.509055: step: 750/466, loss: 0.08000680059194565 2023-01-22 14:04:08.251711: step: 752/466, loss: 0.21646849811077118 2023-01-22 14:04:08.964194: step: 754/466, loss: 0.13076795637607574 2023-01-22 14:04:09.760795: step: 756/466, loss: 0.6503867506980896 2023-01-22 14:04:10.488401: step: 758/466, loss: 0.20445503294467926 2023-01-22 14:04:11.299037: step: 760/466, loss: 
0.1860615313053131 2023-01-22 14:04:12.122621: step: 762/466, loss: 0.05238658934831619 2023-01-22 14:04:13.014332: step: 764/466, loss: 0.15584617853164673 2023-01-22 14:04:13.777765: step: 766/466, loss: 0.11111781746149063 2023-01-22 14:04:14.499718: step: 768/466, loss: 0.05492393672466278 2023-01-22 14:04:15.295502: step: 770/466, loss: 0.015201396308839321 2023-01-22 14:04:16.078472: step: 772/466, loss: 0.015985164791345596 2023-01-22 14:04:16.850643: step: 774/466, loss: 0.05980583652853966 2023-01-22 14:04:17.538917: step: 776/466, loss: 0.10083527117967606 2023-01-22 14:04:18.314071: step: 778/466, loss: 0.5878909826278687 2023-01-22 14:04:19.217689: step: 780/466, loss: 0.07962486147880554 2023-01-22 14:04:19.976131: step: 782/466, loss: 0.18757252395153046 2023-01-22 14:04:20.962661: step: 784/466, loss: 0.19431068003177643 2023-01-22 14:04:21.737904: step: 786/466, loss: 0.011340529657900333 2023-01-22 14:04:22.456562: step: 788/466, loss: 0.9627336859703064 2023-01-22 14:04:23.257858: step: 790/466, loss: 0.030145462602376938 2023-01-22 14:04:24.061607: step: 792/466, loss: 0.04864287003874779 2023-01-22 14:04:24.796158: step: 794/466, loss: 0.02933620661497116 2023-01-22 14:04:25.479993: step: 796/466, loss: 0.02567676641047001 2023-01-22 14:04:26.232428: step: 798/466, loss: 0.05721559375524521 2023-01-22 14:04:27.083976: step: 800/466, loss: 0.1796366423368454 2023-01-22 14:04:27.893206: step: 802/466, loss: 0.09617342054843903 2023-01-22 14:04:28.651065: step: 804/466, loss: 0.08915657550096512 2023-01-22 14:04:29.428913: step: 806/466, loss: 0.056109026074409485 2023-01-22 14:04:30.181459: step: 808/466, loss: 0.8052629828453064 2023-01-22 14:04:30.997572: step: 810/466, loss: 0.1469426453113556 2023-01-22 14:04:31.710931: step: 812/466, loss: 0.011045991443097591 2023-01-22 14:04:32.434383: step: 814/466, loss: 0.020637711510062218 2023-01-22 14:04:33.153248: step: 816/466, loss: 0.15615445375442505 2023-01-22 14:04:33.878167: step: 818/466, 
loss: 0.41356489062309265 2023-01-22 14:04:34.608044: step: 820/466, loss: 0.09979367256164551 2023-01-22 14:04:35.315156: step: 822/466, loss: 0.019931530579924583 2023-01-22 14:04:36.163683: step: 824/466, loss: 0.024886123836040497 2023-01-22 14:04:36.930237: step: 826/466, loss: 0.0668100118637085 2023-01-22 14:04:37.648849: step: 828/466, loss: 0.13887907564640045 2023-01-22 14:04:38.446763: step: 830/466, loss: 0.04640275612473488 2023-01-22 14:04:39.141253: step: 832/466, loss: 0.08429426699876785 2023-01-22 14:04:39.907144: step: 834/466, loss: 0.05646169185638428 2023-01-22 14:04:40.575271: step: 836/466, loss: 0.07541132718324661 2023-01-22 14:04:41.324455: step: 838/466, loss: 0.021639710292220116 2023-01-22 14:04:42.086191: step: 840/466, loss: 0.03935703635215759 2023-01-22 14:04:42.868064: step: 842/466, loss: 0.04783592000603676 2023-01-22 14:04:43.592098: step: 844/466, loss: 0.3185446262359619 2023-01-22 14:04:44.407022: step: 846/466, loss: 0.0960950031876564 2023-01-22 14:04:45.129445: step: 848/466, loss: 0.032344914972782135 2023-01-22 14:04:45.892028: step: 850/466, loss: 0.13070397078990936 2023-01-22 14:04:46.667762: step: 852/466, loss: 0.03631366789340973 2023-01-22 14:04:47.386660: step: 854/466, loss: 0.0661098062992096 2023-01-22 14:04:48.278295: step: 856/466, loss: 0.035494230687618256 2023-01-22 14:04:49.091694: step: 858/466, loss: 0.13137997686862946 2023-01-22 14:04:49.868951: step: 860/466, loss: 0.026316052302718163 2023-01-22 14:04:50.654614: step: 862/466, loss: 0.08850234746932983 2023-01-22 14:04:51.513009: step: 864/466, loss: 0.10195964574813843 2023-01-22 14:04:52.265573: step: 866/466, loss: 0.07952665537595749 2023-01-22 14:04:53.100478: step: 868/466, loss: 0.06374545395374298 2023-01-22 14:04:53.828582: step: 870/466, loss: 0.0911751389503479 2023-01-22 14:04:54.585490: step: 872/466, loss: 0.07596917450428009 2023-01-22 14:04:55.253040: step: 874/466, loss: 0.15608255565166473 2023-01-22 14:04:55.909981: step: 
876/466, loss: 0.03925901651382446 2023-01-22 14:04:56.603986: step: 878/466, loss: 0.06515513360500336 2023-01-22 14:04:57.405907: step: 880/466, loss: 0.07271917909383774 2023-01-22 14:04:58.160885: step: 882/466, loss: 0.20699360966682434 2023-01-22 14:04:58.914569: step: 884/466, loss: 0.05623829364776611 2023-01-22 14:04:59.785237: step: 886/466, loss: 0.05620102211833 2023-01-22 14:05:00.599660: step: 888/466, loss: 0.05816996097564697 2023-01-22 14:05:01.360232: step: 890/466, loss: 0.0925942212343216 2023-01-22 14:05:02.091362: step: 892/466, loss: 0.0655890479683876 2023-01-22 14:05:02.903890: step: 894/466, loss: 0.8882849216461182 2023-01-22 14:05:03.652063: step: 896/466, loss: 0.11083611845970154 2023-01-22 14:05:04.452653: step: 898/466, loss: 0.15663686394691467 2023-01-22 14:05:05.191150: step: 900/466, loss: 0.07761363685131073 2023-01-22 14:05:05.943476: step: 902/466, loss: 0.05530351772904396 2023-01-22 14:05:06.807743: step: 904/466, loss: 0.053331077098846436 2023-01-22 14:05:07.557076: step: 906/466, loss: 0.04591398313641548 2023-01-22 14:05:08.353151: step: 908/466, loss: 0.028425119817256927 2023-01-22 14:05:09.098197: step: 910/466, loss: 0.018302908167243004 2023-01-22 14:05:09.888019: step: 912/466, loss: 0.040488116443157196 2023-01-22 14:05:10.565322: step: 914/466, loss: 0.05558871850371361 2023-01-22 14:05:11.310136: step: 916/466, loss: 0.06911761313676834 2023-01-22 14:05:12.106536: step: 918/466, loss: 0.060869622975587845 2023-01-22 14:05:12.807079: step: 920/466, loss: 0.031241292133927345 2023-01-22 14:05:13.529213: step: 922/466, loss: 0.048789240419864655 2023-01-22 14:05:14.296472: step: 924/466, loss: 0.08579286932945251 2023-01-22 14:05:15.074702: step: 926/466, loss: 0.053416553884744644 2023-01-22 14:05:15.840690: step: 928/466, loss: 0.04905460402369499 2023-01-22 14:05:16.570999: step: 930/466, loss: 0.0519297830760479 2023-01-22 14:05:17.414833: step: 932/466, loss: 0.05160481110215187 
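The evaluation blocks in this log report `p`, `r`, `f1`, and a `combined` score for each language split. A minimal sketch of how these numbers appear to relate, inferred purely from the logged values (the function and variable names here are my own, not from the training script): each `f1` is the harmonic mean of `p` and `r`, and `combined` is the product of the template f1 and the slot f1.

```python
# Sketch inferred from the logged metrics; names are illustrative, not from train.py.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """The 'combined' value appears to be template f1 times slot f1."""
    return f1(template["p"], template["r"]) * f1(slot["p"], slot["r"])

# Values taken from the "Dev Chinese" entry for epoch 18 in this log:
template = {"p": 1.0, "r": 0.5833333333333334}
slot = {"p": 0.29717021355479756, "r": 0.3456647835087114}
print(combined_score(template, slot))  # ≈ 0.23548612859900944, matching the log
```

Checking a few other entries (e.g. Test Russian: 0.6176470588235295 × 0.3183923267885838 ≈ 0.19665408419294886) is consistent with this reading.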
==================================================
Loss: 0.147
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29717021355479756, 'r': 0.3456647835087114, 'f1': 0.31958831738437}, 'combined': 0.23548612859900944, 'epoch': 18}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3424999777789339, 'r': 0.29564716696588494, 'f1': 0.3173536039457222}, 'combined': 0.19505636144956584, 'epoch': 18}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2752954937665506, 'r': 0.34424996279346654, 'f1': 0.30593546440498626}, 'combined': 0.22542613166683195, 'epoch': 18}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3139289823016391, 'r': 0.29815092253257924, 'f1': 0.3058365907578635}, 'combined': 0.18797761188044293, 'epoch': 18}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3223753728999843, 'r': 0.35418470760738313, 'f1': 0.3375322620417557}, 'combined': 0.24870798255708312, 'epoch': 18}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34306773211469904, 'r': 0.2970283394932459, 'f1': 0.3183923267885838}, 'combined': 0.19665408419294886, 'epoch': 18}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.34285714285714286, 'f1': 0.2891566265060241}, 'combined': 0.19277108433734938, 'epoch': 18}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24444444444444444, 'r': 0.4782608695652174, 'f1': 0.32352941176470584}, 'combined': 0.16176470588235292, 'epoch': 18}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 18}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32387621960662843, 'r': 0.3607501725030188, 'f1': 0.34132018116533375}, 'combined': 0.25149908085866696, 'epoch': 11}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3437797535475426, 'r': 0.2928383862541026, 'f1': 0.3162709384531908}, 'combined': 0.19534381492697084, 'epoch':
11} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 11} ****************************** Epoch: 19 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 14:08:03.307850: step: 2/466, loss: 0.07707686722278595 2023-01-22 14:08:04.141804: step: 4/466, loss: 0.07725892961025238 2023-01-22 14:08:04.929330: step: 6/466, loss: 0.03708843141794205 2023-01-22 14:08:05.706816: step: 8/466, loss: 0.036257512867450714 2023-01-22 14:08:06.506820: step: 10/466, loss: 0.017364487051963806 2023-01-22 14:08:07.216755: step: 12/466, loss: 0.04551801085472107 2023-01-22 14:08:07.911340: step: 14/466, loss: 0.06593490391969681 2023-01-22 14:08:08.728067: step: 16/466, loss: 0.10358086228370667 2023-01-22 14:08:09.577494: step: 18/466, loss: 0.19388654828071594 2023-01-22 14:08:10.341237: step: 20/466, loss: 0.066173255443573 2023-01-22 14:08:11.122992: step: 22/466, loss: 0.019162917509675026 2023-01-22 14:08:11.871932: step: 24/466, loss: 0.08188024908304214 2023-01-22 14:08:12.584534: step: 26/466, loss: 0.05701979249715805 2023-01-22 14:08:13.311467: step: 28/466, loss: 0.007061370182782412 2023-01-22 14:08:14.019988: step: 30/466, loss: 0.12841834127902985 2023-01-22 14:08:14.804859: step: 32/466, loss: 0.050115082412958145 2023-01-22 14:08:15.660645: step: 34/466, loss: 0.02729623392224312 2023-01-22 14:08:16.379430: step: 36/466, loss: 0.08978027105331421 2023-01-22 14:08:17.115749: step: 38/466, loss: 0.06600786000490189 2023-01-22 14:08:17.933197: step: 40/466, loss: 0.04679092392325401 2023-01-22 14:08:18.773485: step: 42/466, loss: 0.017778532579541206 2023-01-22 14:08:19.632280: step: 44/466, loss: 0.07528173178434372 2023-01-22 14:08:20.395119: 
step: 46/466, loss: 0.04950503632426262 2023-01-22 14:08:21.139232: step: 48/466, loss: 0.0496654212474823 2023-01-22 14:08:21.889813: step: 50/466, loss: 0.0461617186665535 2023-01-22 14:08:22.702867: step: 52/466, loss: 0.03887036815285683 2023-01-22 14:08:23.505939: step: 54/466, loss: 0.04674646258354187 2023-01-22 14:08:24.163489: step: 56/466, loss: 0.06008967384696007 2023-01-22 14:08:24.926281: step: 58/466, loss: 0.04620389640331268 2023-01-22 14:08:25.687687: step: 60/466, loss: 0.1120811179280281 2023-01-22 14:08:26.447700: step: 62/466, loss: 0.03864070773124695 2023-01-22 14:08:27.147488: step: 64/466, loss: 0.05045907199382782 2023-01-22 14:08:27.916985: step: 66/466, loss: 0.24931874871253967 2023-01-22 14:08:28.684759: step: 68/466, loss: 0.12652790546417236 2023-01-22 14:08:29.455581: step: 70/466, loss: 0.015776850283145905 2023-01-22 14:08:30.195272: step: 72/466, loss: 0.023323623463511467 2023-01-22 14:08:30.872864: step: 74/466, loss: 0.019689107313752174 2023-01-22 14:08:31.630863: step: 76/466, loss: 0.010504703037440777 2023-01-22 14:08:32.343653: step: 78/466, loss: 0.853032112121582 2023-01-22 14:08:33.258187: step: 80/466, loss: 0.06561411172151566 2023-01-22 14:08:34.029585: step: 82/466, loss: 0.24904441833496094 2023-01-22 14:08:34.860038: step: 84/466, loss: 0.04590437933802605 2023-01-22 14:08:35.638787: step: 86/466, loss: 0.027438893914222717 2023-01-22 14:08:36.376442: step: 88/466, loss: 0.033957649022340775 2023-01-22 14:08:37.232647: step: 90/466, loss: 0.04507846385240555 2023-01-22 14:08:37.930852: step: 92/466, loss: 0.01910211518406868 2023-01-22 14:08:38.724238: step: 94/466, loss: 0.01948835887014866 2023-01-22 14:08:39.628677: step: 96/466, loss: 0.023070571944117546 2023-01-22 14:08:40.392780: step: 98/466, loss: 0.035979241132736206 2023-01-22 14:08:41.145225: step: 100/466, loss: 0.07960603386163712 2023-01-22 14:08:41.960172: step: 102/466, loss: 0.0930582657456398 2023-01-22 14:08:42.690338: step: 104/466, loss: 
0.007280975580215454 2023-01-22 14:08:43.417841: step: 106/466, loss: 0.09409566968679428 2023-01-22 14:08:44.184295: step: 108/466, loss: 0.03311523422598839 2023-01-22 14:08:44.962191: step: 110/466, loss: 0.03451240807771683 2023-01-22 14:08:45.806641: step: 112/466, loss: 0.10993362218141556 2023-01-22 14:08:46.530712: step: 114/466, loss: 0.12965621054172516 2023-01-22 14:08:47.309624: step: 116/466, loss: 0.03790228068828583 2023-01-22 14:08:48.016608: step: 118/466, loss: 0.009076929651200771 2023-01-22 14:08:48.798139: step: 120/466, loss: 0.04168053716421127 2023-01-22 14:08:49.577911: step: 122/466, loss: 0.05517464131116867 2023-01-22 14:08:50.283242: step: 124/466, loss: 0.07232671231031418 2023-01-22 14:08:50.963307: step: 126/466, loss: 0.05364637449383736 2023-01-22 14:08:51.684483: step: 128/466, loss: 0.048415087163448334 2023-01-22 14:08:52.361742: step: 130/466, loss: 0.0030281036160886288 2023-01-22 14:08:53.037008: step: 132/466, loss: 0.07231735438108444 2023-01-22 14:08:53.844238: step: 134/466, loss: 0.006062332075089216 2023-01-22 14:08:54.523168: step: 136/466, loss: 0.010802896693348885 2023-01-22 14:08:55.286056: step: 138/466, loss: 0.039841752499341965 2023-01-22 14:08:56.095067: step: 140/466, loss: 0.10794732719659805 2023-01-22 14:08:56.834629: step: 142/466, loss: 0.08357289433479309 2023-01-22 14:08:57.604474: step: 144/466, loss: 0.43468421697616577 2023-01-22 14:08:58.410165: step: 146/466, loss: 0.0399923138320446 2023-01-22 14:08:59.150942: step: 148/466, loss: 0.06643982976675034 2023-01-22 14:08:59.918601: step: 150/466, loss: 0.013120495714247227 2023-01-22 14:09:00.632283: step: 152/466, loss: 0.12737302482128143 2023-01-22 14:09:01.492976: step: 154/466, loss: 0.06080927327275276 2023-01-22 14:09:02.318767: step: 156/466, loss: 0.0551338791847229 2023-01-22 14:09:03.164068: step: 158/466, loss: 0.08971381932497025 2023-01-22 14:09:03.888408: step: 160/466, loss: 0.07730238884687424 2023-01-22 14:09:04.649253: step: 
162/466, loss: 0.012164798565208912 2023-01-22 14:09:05.386952: step: 164/466, loss: 0.07556217908859253 2023-01-22 14:09:06.115303: step: 166/466, loss: 0.0272072684019804 2023-01-22 14:09:06.901374: step: 168/466, loss: 0.0506470613181591 2023-01-22 14:09:07.741461: step: 170/466, loss: 0.06211550906300545 2023-01-22 14:09:08.544531: step: 172/466, loss: 0.05883381515741348 2023-01-22 14:09:09.266050: step: 174/466, loss: 0.027972858399152756 2023-01-22 14:09:10.052227: step: 176/466, loss: 0.03771953657269478 2023-01-22 14:09:10.862947: step: 178/466, loss: 0.05252067372202873 2023-01-22 14:09:11.656418: step: 180/466, loss: 0.07096660882234573 2023-01-22 14:09:12.394877: step: 182/466, loss: 0.005565475672483444 2023-01-22 14:09:13.210236: step: 184/466, loss: 0.053429294377565384 2023-01-22 14:09:13.991859: step: 186/466, loss: 0.06567326188087463 2023-01-22 14:09:14.705779: step: 188/466, loss: 0.04723641648888588 2023-01-22 14:09:15.440089: step: 190/466, loss: 0.040017325431108475 2023-01-22 14:09:16.187739: step: 192/466, loss: 0.1834365278482437 2023-01-22 14:09:16.921768: step: 194/466, loss: 0.06563407927751541 2023-01-22 14:09:17.713378: step: 196/466, loss: 0.013922265730798244 2023-01-22 14:09:18.579479: step: 198/466, loss: 0.03886845335364342 2023-01-22 14:09:19.307747: step: 200/466, loss: 0.03570050001144409 2023-01-22 14:09:20.047146: step: 202/466, loss: 0.09004824608564377 2023-01-22 14:09:21.011401: step: 204/466, loss: 0.0846221074461937 2023-01-22 14:09:21.784709: step: 206/466, loss: 0.0383138544857502 2023-01-22 14:09:22.501047: step: 208/466, loss: 0.033008407801389694 2023-01-22 14:09:23.232411: step: 210/466, loss: 0.011824160814285278 2023-01-22 14:09:24.002017: step: 212/466, loss: 0.0965164303779602 2023-01-22 14:09:24.799072: step: 214/466, loss: 0.04126487672328949 2023-01-22 14:09:25.511999: step: 216/466, loss: 0.027900833636522293 2023-01-22 14:09:26.235359: step: 218/466, loss: 0.01755242981016636 2023-01-22 14:09:26.989953: 
step: 220/466, loss: 0.30426502227783203 2023-01-22 14:09:27.823172: step: 222/466, loss: 0.03312382474541664 2023-01-22 14:09:28.612127: step: 224/466, loss: 0.06691645085811615 2023-01-22 14:09:29.402935: step: 226/466, loss: 0.024335216730833054 2023-01-22 14:09:30.307497: step: 228/466, loss: 0.060176149010658264 2023-01-22 14:09:31.065303: step: 230/466, loss: 0.11157464981079102 2023-01-22 14:09:31.916870: step: 232/466, loss: 0.05215362459421158 2023-01-22 14:09:32.679902: step: 234/466, loss: 0.0009370064362883568 2023-01-22 14:09:33.409630: step: 236/466, loss: 0.06926261633634567 2023-01-22 14:09:34.127858: step: 238/466, loss: 0.04493451490998268 2023-01-22 14:09:34.850943: step: 240/466, loss: 0.3802512288093567 2023-01-22 14:09:35.699016: step: 242/466, loss: 0.03863971307873726 2023-01-22 14:09:36.388139: step: 244/466, loss: 0.024827376008033752 2023-01-22 14:09:37.083851: step: 246/466, loss: 0.04101860523223877 2023-01-22 14:09:37.782953: step: 248/466, loss: 0.03435073420405388 2023-01-22 14:09:38.530032: step: 250/466, loss: 0.04637880250811577 2023-01-22 14:09:39.376213: step: 252/466, loss: 0.011644534766674042 2023-01-22 14:09:40.163352: step: 254/466, loss: 0.48259320855140686 2023-01-22 14:09:40.883621: step: 256/466, loss: 0.4899650812149048 2023-01-22 14:09:41.604424: step: 258/466, loss: 0.07497600466012955 2023-01-22 14:09:42.339926: step: 260/466, loss: 0.11135821789503098 2023-01-22 14:09:43.083830: step: 262/466, loss: 0.09177925437688828 2023-01-22 14:09:43.803433: step: 264/466, loss: 0.03555946797132492 2023-01-22 14:09:44.524742: step: 266/466, loss: 0.03181109204888344 2023-01-22 14:09:45.290703: step: 268/466, loss: 0.05030002444982529 2023-01-22 14:09:46.062239: step: 270/466, loss: 0.06433197855949402 2023-01-22 14:09:46.840252: step: 272/466, loss: 0.14185172319412231 2023-01-22 14:09:47.664594: step: 274/466, loss: 0.06486351042985916 2023-01-22 14:09:48.454989: step: 276/466, loss: 0.13831187784671783 2023-01-22 
14:09:49.249110: step: 278/466, loss: 0.07320712506771088 2023-01-22 14:09:50.037297: step: 280/466, loss: 0.04056846350431442 2023-01-22 14:09:50.814348: step: 282/466, loss: 0.0762358158826828 2023-01-22 14:09:51.600916: step: 284/466, loss: 0.13892656564712524 2023-01-22 14:09:52.316394: step: 286/466, loss: 0.14571847021579742 2023-01-22 14:09:53.035487: step: 288/466, loss: 0.056351128965616226 2023-01-22 14:09:53.837421: step: 290/466, loss: 0.47937020659446716 2023-01-22 14:09:54.620264: step: 292/466, loss: 0.017977619543671608 2023-01-22 14:09:55.501144: step: 294/466, loss: 0.022253964096307755 2023-01-22 14:09:56.388659: step: 296/466, loss: 0.021573202684521675 2023-01-22 14:09:57.150667: step: 298/466, loss: 0.10714520514011383 2023-01-22 14:09:57.911248: step: 300/466, loss: 0.20172281563282013 2023-01-22 14:09:58.573901: step: 302/466, loss: 0.010043538175523281 2023-01-22 14:09:59.334165: step: 304/466, loss: 0.13285380601882935 2023-01-22 14:10:00.115092: step: 306/466, loss: 0.11064116656780243 2023-01-22 14:10:00.833515: step: 308/466, loss: 0.014111662283539772 2023-01-22 14:10:01.648619: step: 310/466, loss: 0.026592249050736427 2023-01-22 14:10:02.488436: step: 312/466, loss: 0.5118361711502075 2023-01-22 14:10:03.257870: step: 314/466, loss: 0.07001351565122604 2023-01-22 14:10:04.008241: step: 316/466, loss: 0.08644430339336395 2023-01-22 14:10:04.694577: step: 318/466, loss: 0.03326092287898064 2023-01-22 14:10:05.440939: step: 320/466, loss: 0.045971401035785675 2023-01-22 14:10:06.146283: step: 322/466, loss: 0.05814187228679657 2023-01-22 14:10:06.981220: step: 324/466, loss: 0.05052163079380989 2023-01-22 14:10:07.822720: step: 326/466, loss: 0.09300139546394348 2023-01-22 14:10:08.704211: step: 328/466, loss: 0.08645815402269363 2023-01-22 14:10:09.481950: step: 330/466, loss: 0.043647561222314835 2023-01-22 14:10:10.270448: step: 332/466, loss: 0.05537186563014984 2023-01-22 14:10:11.043135: step: 334/466, loss: 0.07523120939731598 
2023-01-22 14:10:11.704829: step: 336/466, loss: 0.0026097306981682777 2023-01-22 14:10:12.440810: step: 338/466, loss: 0.08779795467853546 2023-01-22 14:10:13.278278: step: 340/466, loss: 0.042835015803575516 2023-01-22 14:10:14.108520: step: 342/466, loss: 0.38242512941360474 2023-01-22 14:10:14.895134: step: 344/466, loss: 0.0324886180460453 2023-01-22 14:10:15.677997: step: 346/466, loss: 0.08252550661563873 2023-01-22 14:10:16.442200: step: 348/466, loss: 0.051319487392902374 2023-01-22 14:10:17.346687: step: 350/466, loss: 0.13323046267032623 2023-01-22 14:10:18.136463: step: 352/466, loss: 0.05508129298686981 2023-01-22 14:10:18.917136: step: 354/466, loss: 0.013235564343631268 2023-01-22 14:10:19.654314: step: 356/466, loss: 0.05842670053243637 2023-01-22 14:10:20.422682: step: 358/466, loss: 0.18304871022701263 2023-01-22 14:10:21.199391: step: 360/466, loss: 0.09490270912647247 2023-01-22 14:10:21.858299: step: 362/466, loss: 0.05648095905780792 2023-01-22 14:10:22.595343: step: 364/466, loss: 0.03505036234855652 2023-01-22 14:10:23.354290: step: 366/466, loss: 0.03604874759912491 2023-01-22 14:10:24.110480: step: 368/466, loss: 0.03895943611860275 2023-01-22 14:10:24.848034: step: 370/466, loss: 0.17214235663414001 2023-01-22 14:10:25.534309: step: 372/466, loss: 0.011825804598629475 2023-01-22 14:10:26.262298: step: 374/466, loss: 0.05837767571210861 2023-01-22 14:10:27.021678: step: 376/466, loss: 0.004728924483060837 2023-01-22 14:10:27.837890: step: 378/466, loss: 0.07020552456378937 2023-01-22 14:10:28.630585: step: 380/466, loss: 0.19587011635303497 2023-01-22 14:10:29.380561: step: 382/466, loss: 0.0404064804315567 2023-01-22 14:10:30.115036: step: 384/466, loss: 0.014871107414364815 2023-01-22 14:10:30.842376: step: 386/466, loss: 0.2635861337184906 2023-01-22 14:10:31.592637: step: 388/466, loss: 0.08111178874969482 2023-01-22 14:10:32.450190: step: 390/466, loss: 0.016424495726823807 2023-01-22 14:10:33.175603: step: 392/466, loss: 
0.15529052913188934 2023-01-22 14:10:33.956306: step: 394/466, loss: 0.16692110896110535 2023-01-22 14:10:34.623746: step: 396/466, loss: 0.12100377678871155 2023-01-22 14:10:35.385293: step: 398/466, loss: 0.014978073537349701 2023-01-22 14:10:36.118866: step: 400/466, loss: 0.004079705569893122 2023-01-22 14:10:36.892595: step: 402/466, loss: 0.06326240301132202 2023-01-22 14:10:37.700786: step: 404/466, loss: 0.07025929540395737 2023-01-22 14:10:38.544056: step: 406/466, loss: 0.06290563941001892 2023-01-22 14:10:39.310838: step: 408/466, loss: 0.010200410149991512 2023-01-22 14:10:40.043046: step: 410/466, loss: 0.12491553276777267 2023-01-22 14:10:40.836930: step: 412/466, loss: 0.06633555889129639 2023-01-22 14:10:41.580338: step: 414/466, loss: 0.0457879975438118 2023-01-22 14:10:42.366887: step: 416/466, loss: 0.20419169962406158 2023-01-22 14:10:43.102489: step: 418/466, loss: 0.16641490161418915 2023-01-22 14:10:43.943291: step: 420/466, loss: 0.08982834964990616 2023-01-22 14:10:44.667170: step: 422/466, loss: 0.17228271067142487 2023-01-22 14:10:45.413323: step: 424/466, loss: 0.06556040048599243 2023-01-22 14:10:46.168578: step: 426/466, loss: 0.0518675372004509 2023-01-22 14:10:46.996508: step: 428/466, loss: 0.13832581043243408 2023-01-22 14:10:47.812092: step: 430/466, loss: 0.09893655776977539 2023-01-22 14:10:48.655790: step: 432/466, loss: 0.023732980713248253 2023-01-22 14:10:49.391430: step: 434/466, loss: 0.06245775148272514 2023-01-22 14:10:50.227186: step: 436/466, loss: 0.0800282433629036 2023-01-22 14:10:50.977685: step: 438/466, loss: 0.06479812413454056 2023-01-22 14:10:51.691849: step: 440/466, loss: 0.010589199140667915 2023-01-22 14:10:52.401063: step: 442/466, loss: 0.22055262327194214 2023-01-22 14:10:53.114226: step: 444/466, loss: 0.04050662741065025 2023-01-22 14:10:53.823088: step: 446/466, loss: 0.04848470911383629 2023-01-22 14:10:54.574850: step: 448/466, loss: 0.040982309728860855 2023-01-22 14:10:55.328089: step: 450/466, 
loss: 0.03166522458195686 2023-01-22 14:10:56.093451: step: 452/466, loss: 0.08621339499950409 2023-01-22 14:10:56.869391: step: 454/466, loss: 0.043064236640930176 2023-01-22 14:10:57.624750: step: 456/466, loss: 0.07764985412359238 2023-01-22 14:10:58.326430: step: 458/466, loss: 0.04215572401881218 2023-01-22 14:10:59.102371: step: 460/466, loss: 0.04419616982340813 2023-01-22 14:10:59.854682: step: 462/466, loss: 0.012509040534496307 2023-01-22 14:11:00.582086: step: 464/466, loss: 0.10873904824256897 2023-01-22 14:11:01.351353: step: 466/466, loss: 0.04994974657893181 2023-01-22 14:11:02.226221: step: 468/466, loss: 0.05436602234840393 2023-01-22 14:11:02.929292: step: 470/466, loss: 0.008949813432991505 2023-01-22 14:11:03.702320: step: 472/466, loss: 0.9225003123283386 2023-01-22 14:11:04.494560: step: 474/466, loss: 0.3254028856754303 2023-01-22 14:11:05.279015: step: 476/466, loss: 0.06196141242980957 2023-01-22 14:11:06.053330: step: 478/466, loss: 0.04715636372566223 2023-01-22 14:11:06.836502: step: 480/466, loss: 0.022531533613801003 2023-01-22 14:11:07.611114: step: 482/466, loss: 0.01798596791923046 2023-01-22 14:11:08.484126: step: 484/466, loss: 0.076285719871521 2023-01-22 14:11:09.251944: step: 486/466, loss: 0.09098166972398758 2023-01-22 14:11:09.966992: step: 488/466, loss: 0.04622410237789154 2023-01-22 14:11:10.713704: step: 490/466, loss: 0.045328788459300995 2023-01-22 14:11:11.686838: step: 492/466, loss: 0.03675145283341408 2023-01-22 14:11:12.410563: step: 494/466, loss: 3.8558883666992188 2023-01-22 14:11:13.287538: step: 496/466, loss: 0.11598634719848633 2023-01-22 14:11:13.982033: step: 498/466, loss: 0.009622450917959213 2023-01-22 14:11:14.683576: step: 500/466, loss: 0.02720300666987896 2023-01-22 14:11:15.408637: step: 502/466, loss: 0.06930980086326599 2023-01-22 14:11:16.201393: step: 504/466, loss: 0.10974381119012833 2023-01-22 14:11:16.986045: step: 506/466, loss: 0.041853148490190506 2023-01-22 14:11:17.725106: step: 
508/466, loss: 0.053615596145391464 2023-01-22 14:11:18.349638: step: 510/466, loss: 0.023178689181804657 2023-01-22 14:11:19.159239: step: 512/466, loss: 0.10109889507293701 2023-01-22 14:11:19.871057: step: 514/466, loss: 0.020574018359184265 2023-01-22 14:11:20.700903: step: 516/466, loss: 0.11134310066699982 2023-01-22 14:11:21.406358: step: 518/466, loss: 0.02680542692542076 2023-01-22 14:11:22.158386: step: 520/466, loss: 0.08171765506267548 2023-01-22 14:11:22.945923: step: 522/466, loss: 0.04899270087480545 2023-01-22 14:11:23.657997: step: 524/466, loss: 0.04399009793996811 2023-01-22 14:11:24.450683: step: 526/466, loss: 0.27386415004730225 2023-01-22 14:11:25.211668: step: 528/466, loss: 0.07502332329750061 2023-01-22 14:11:25.954061: step: 530/466, loss: 0.02962682582437992 2023-01-22 14:11:26.673490: step: 532/466, loss: 0.18936896324157715 2023-01-22 14:11:27.462924: step: 534/466, loss: 0.08553482592105865 2023-01-22 14:11:28.142006: step: 536/466, loss: 0.011363615281879902 2023-01-22 14:11:28.805243: step: 538/466, loss: 0.017759401351213455 2023-01-22 14:11:29.613976: step: 540/466, loss: 0.041266944259405136 2023-01-22 14:11:30.395921: step: 542/466, loss: 0.08004105091094971 2023-01-22 14:11:31.083227: step: 544/466, loss: 0.02674659714102745 2023-01-22 14:11:31.799604: step: 546/466, loss: 0.0036398719530552626 2023-01-22 14:11:32.544853: step: 548/466, loss: 0.027790717780590057 2023-01-22 14:11:33.379279: step: 550/466, loss: 0.06974704563617706 2023-01-22 14:11:34.151815: step: 552/466, loss: 0.010123740881681442 2023-01-22 14:11:34.885226: step: 554/466, loss: 0.03165018931031227 2023-01-22 14:11:35.652292: step: 556/466, loss: 0.07956918329000473 2023-01-22 14:11:36.428497: step: 558/466, loss: 0.12074112147092819 2023-01-22 14:11:37.248631: step: 560/466, loss: 0.16099034249782562 2023-01-22 14:11:38.068578: step: 562/466, loss: 0.05869884043931961 2023-01-22 14:11:38.799221: step: 564/466, loss: 0.011771907098591328 2023-01-22 
14:11:39.518058: step: 566/466, loss: 0.004809739533811808 2023-01-22 14:11:40.255846: step: 568/466, loss: 0.06332944333553314 2023-01-22 14:11:41.011689: step: 570/466, loss: 0.26421257853507996 2023-01-22 14:11:41.744239: step: 572/466, loss: 0.817116916179657 2023-01-22 14:11:42.438102: step: 574/466, loss: 0.31787359714508057 2023-01-22 14:11:43.201433: step: 576/466, loss: 0.05923830345273018 2023-01-22 14:11:44.064143: step: 578/466, loss: 0.37814825773239136 2023-01-22 14:11:44.835094: step: 580/466, loss: 0.08397063612937927 2023-01-22 14:11:45.644030: step: 582/466, loss: 0.03459261357784271 2023-01-22 14:11:46.458375: step: 584/466, loss: 0.046183399856090546 2023-01-22 14:11:47.161276: step: 586/466, loss: 0.0778326541185379 2023-01-22 14:11:47.938828: step: 588/466, loss: 0.025441646575927734 2023-01-22 14:11:48.668995: step: 590/466, loss: 0.02456839382648468 2023-01-22 14:11:49.510072: step: 592/466, loss: 0.04389163479208946 2023-01-22 14:11:50.249431: step: 594/466, loss: 0.009534381330013275 2023-01-22 14:11:50.983076: step: 596/466, loss: 0.019304398447275162 2023-01-22 14:11:51.773388: step: 598/466, loss: 0.03234853595495224 2023-01-22 14:11:52.559412: step: 600/466, loss: 0.03297156095504761 2023-01-22 14:11:53.355649: step: 602/466, loss: 0.05493513494729996 2023-01-22 14:11:54.143810: step: 604/466, loss: 0.6842228770256042 2023-01-22 14:11:54.895199: step: 606/466, loss: 0.07774440199136734 2023-01-22 14:11:55.665954: step: 608/466, loss: 1.7763886451721191 2023-01-22 14:11:56.426404: step: 610/466, loss: 0.1671164333820343 2023-01-22 14:11:57.182778: step: 612/466, loss: 0.458658903837204 2023-01-22 14:11:58.014388: step: 614/466, loss: 0.11497402936220169 2023-01-22 14:11:58.801467: step: 616/466, loss: 0.1912817507982254 2023-01-22 14:11:59.501627: step: 618/466, loss: 0.09462592750787735 2023-01-22 14:12:00.317191: step: 620/466, loss: 0.06380286812782288 2023-01-22 14:12:01.045967: step: 622/466, loss: 0.028667034581303596 2023-01-22 
14:12:01.797813: step: 624/466, loss: 0.016529444605112076 2023-01-22 14:12:02.603734: step: 626/466, loss: 0.025493420660495758 2023-01-22 14:12:03.359852: step: 628/466, loss: 0.13471835851669312 2023-01-22 14:12:04.084062: step: 630/466, loss: 0.07315074652433395 2023-01-22 14:12:04.855970: step: 632/466, loss: 0.030087953433394432 2023-01-22 14:12:05.650535: step: 634/466, loss: 0.06007075682282448 2023-01-22 14:12:06.379562: step: 636/466, loss: 0.03061388060450554 2023-01-22 14:12:07.106409: step: 638/466, loss: 0.04862212762236595 2023-01-22 14:12:07.870684: step: 640/466, loss: 0.028137506917119026 2023-01-22 14:12:08.591908: step: 642/466, loss: 0.04800652340054512 2023-01-22 14:12:09.283603: step: 644/466, loss: 0.01501480583101511 2023-01-22 14:12:10.034009: step: 646/466, loss: 0.05946161225438118 2023-01-22 14:12:10.770300: step: 648/466, loss: 0.078636534512043 2023-01-22 14:12:11.527690: step: 650/466, loss: 0.04473862797021866 2023-01-22 14:12:12.243199: step: 652/466, loss: 0.06858037412166595 2023-01-22 14:12:13.018007: step: 654/466, loss: 0.030643390491604805 2023-01-22 14:12:13.728125: step: 656/466, loss: 0.07060644775629044 2023-01-22 14:12:14.513538: step: 658/466, loss: 0.02271956019103527 2023-01-22 14:12:15.220100: step: 660/466, loss: 0.051643069833517075 2023-01-22 14:12:15.995076: step: 662/466, loss: 0.07489115744829178 2023-01-22 14:12:16.724376: step: 664/466, loss: 0.0777987614274025 2023-01-22 14:12:17.455165: step: 666/466, loss: 0.11620701104402542 2023-01-22 14:12:18.280415: step: 668/466, loss: 0.045370180159807205 2023-01-22 14:12:19.018465: step: 670/466, loss: 0.033944014459848404 2023-01-22 14:12:19.808819: step: 672/466, loss: 0.18008968234062195 2023-01-22 14:12:20.581057: step: 674/466, loss: 0.01612461917102337 2023-01-22 14:12:21.349648: step: 676/466, loss: 0.2847628593444824 2023-01-22 14:12:22.138420: step: 678/466, loss: 0.0479779876768589 2023-01-22 14:12:22.851194: step: 680/466, loss: 0.048540275543928146 
2023-01-22 14:12:23.602712: step: 682/466, loss: 0.028347892686724663 2023-01-22 14:12:24.388422: step: 684/466, loss: 0.042811281979084015 2023-01-22 14:12:25.082133: step: 686/466, loss: 0.03289152681827545 2023-01-22 14:12:25.783554: step: 688/466, loss: 0.18194958567619324 2023-01-22 14:12:26.509490: step: 690/466, loss: 0.1148754134774208 2023-01-22 14:12:27.302823: step: 692/466, loss: 0.0545729398727417 2023-01-22 14:12:28.067604: step: 694/466, loss: 0.03039471060037613 2023-01-22 14:12:28.840400: step: 696/466, loss: 0.0326075479388237 2023-01-22 14:12:29.556191: step: 698/466, loss: 0.0679233968257904 2023-01-22 14:12:30.278219: step: 700/466, loss: 0.09583453088998795 2023-01-22 14:12:31.162599: step: 702/466, loss: 0.05406005680561066 2023-01-22 14:12:31.999848: step: 704/466, loss: 0.02857596054673195 2023-01-22 14:12:32.751116: step: 706/466, loss: 0.057948268949985504 2023-01-22 14:12:33.464916: step: 708/466, loss: 0.36460480093955994 2023-01-22 14:12:34.259031: step: 710/466, loss: 0.026436137035489082 2023-01-22 14:12:35.061248: step: 712/466, loss: 0.38908836245536804 2023-01-22 14:12:35.942654: step: 714/466, loss: 0.08434242010116577 2023-01-22 14:12:36.735511: step: 716/466, loss: 0.02646796405315399 2023-01-22 14:12:37.471600: step: 718/466, loss: 0.02969576232135296 2023-01-22 14:12:38.255411: step: 720/466, loss: 0.051515594124794006 2023-01-22 14:12:39.114623: step: 722/466, loss: 0.04458548128604889 2023-01-22 14:12:39.881656: step: 724/466, loss: 0.2114185094833374 2023-01-22 14:12:40.647403: step: 726/466, loss: 0.10188092291355133 2023-01-22 14:12:41.368651: step: 728/466, loss: 0.10047049075365067 2023-01-22 14:12:42.090031: step: 730/466, loss: 0.09574826061725616 2023-01-22 14:12:42.834678: step: 732/466, loss: 0.1061575785279274 2023-01-22 14:12:43.647886: step: 734/466, loss: 0.09384144842624664 2023-01-22 14:12:44.485105: step: 736/466, loss: 0.054589297622442245 2023-01-22 14:12:45.195859: step: 738/466, loss: 
0.16012555360794067 2023-01-22 14:12:45.990989: step: 740/466, loss: 0.012316581793129444 2023-01-22 14:12:46.713030: step: 742/466, loss: 0.006048239301890135 2023-01-22 14:12:47.488912: step: 744/466, loss: 0.04163842648267746 2023-01-22 14:12:48.232701: step: 746/466, loss: 0.012698384933173656 2023-01-22 14:12:48.987552: step: 748/466, loss: 0.033502642065286636 2023-01-22 14:12:49.814947: step: 750/466, loss: 0.045448169112205505 2023-01-22 14:12:50.610970: step: 752/466, loss: 0.0731145441532135 2023-01-22 14:12:51.352324: step: 754/466, loss: 0.024190278723835945 2023-01-22 14:12:52.208451: step: 756/466, loss: 0.08338966220617294 2023-01-22 14:12:52.991284: step: 758/466, loss: 0.031140204519033432 2023-01-22 14:12:53.686598: step: 760/466, loss: 0.06121518090367317 2023-01-22 14:12:54.397660: step: 762/466, loss: 0.11387481540441513 2023-01-22 14:12:55.188101: step: 764/466, loss: 0.07723309099674225 2023-01-22 14:12:55.963814: step: 766/466, loss: 0.046022143214941025 2023-01-22 14:12:56.704093: step: 768/466, loss: 0.045637596398591995 2023-01-22 14:12:57.501889: step: 770/466, loss: 0.04150305688381195 2023-01-22 14:12:58.264840: step: 772/466, loss: 0.045610833913087845 2023-01-22 14:12:59.068487: step: 774/466, loss: 0.09559348225593567 2023-01-22 14:12:59.785504: step: 776/466, loss: 0.14263160526752472 2023-01-22 14:13:00.550745: step: 778/466, loss: 0.03592607378959656 2023-01-22 14:13:01.295648: step: 780/466, loss: 0.019835738465189934 2023-01-22 14:13:02.042222: step: 782/466, loss: 0.10670880228281021 2023-01-22 14:13:02.947660: step: 784/466, loss: 0.02934003621339798 2023-01-22 14:13:03.680100: step: 786/466, loss: 0.052224624902009964 2023-01-22 14:13:04.411920: step: 788/466, loss: 0.017788240686058998 2023-01-22 14:13:05.222641: step: 790/466, loss: 0.1800827831029892 2023-01-22 14:13:06.041198: step: 792/466, loss: 0.4770299196243286 2023-01-22 14:13:06.767028: step: 794/466, loss: 0.037891022861003876 2023-01-22 14:13:07.467410: step: 
796/466, loss: 0.025547461584210396 2023-01-22 14:13:08.255918: step: 798/466, loss: 0.02444528415799141 2023-01-22 14:13:09.012587: step: 800/466, loss: 0.305006206035614 2023-01-22 14:13:09.752519: step: 802/466, loss: 0.024854356423020363 2023-01-22 14:13:10.567850: step: 804/466, loss: 0.040885668247938156 2023-01-22 14:13:11.313537: step: 806/466, loss: 4.361806392669678 2023-01-22 14:13:12.010464: step: 808/466, loss: 0.0032287875656038523 2023-01-22 14:13:12.858996: step: 810/466, loss: 0.1244574561715126 2023-01-22 14:13:13.621965: step: 812/466, loss: 0.021160701289772987 2023-01-22 14:13:14.307346: step: 814/466, loss: 0.04389666020870209 2023-01-22 14:13:15.078731: step: 816/466, loss: 0.14484992623329163 2023-01-22 14:13:15.838774: step: 818/466, loss: 0.09680794924497604 2023-01-22 14:13:16.584452: step: 820/466, loss: 0.1459362655878067 2023-01-22 14:13:17.373275: step: 822/466, loss: 0.4532628655433655 2023-01-22 14:13:18.070247: step: 824/466, loss: 0.05273761972784996 2023-01-22 14:13:18.815374: step: 826/466, loss: 0.037020210176706314 2023-01-22 14:13:19.535633: step: 828/466, loss: 0.11131815612316132 2023-01-22 14:13:20.263271: step: 830/466, loss: 0.028645120561122894 2023-01-22 14:13:21.025254: step: 832/466, loss: 0.10022107511758804 2023-01-22 14:13:21.821274: step: 834/466, loss: 0.10250432044267654 2023-01-22 14:13:22.556633: step: 836/466, loss: 0.005558273755013943 2023-01-22 14:13:23.351261: step: 838/466, loss: 0.06928959488868713 2023-01-22 14:13:24.078017: step: 840/466, loss: 0.4579852521419525 2023-01-22 14:13:24.889341: step: 842/466, loss: 0.0566575787961483 2023-01-22 14:13:25.709442: step: 844/466, loss: 0.014095243066549301 2023-01-22 14:13:26.481529: step: 846/466, loss: 0.041836559772491455 2023-01-22 14:13:27.221353: step: 848/466, loss: 0.08909980207681656 2023-01-22 14:13:27.990285: step: 850/466, loss: 0.032618314027786255 2023-01-22 14:13:28.750990: step: 852/466, loss: 0.20079179108142853 2023-01-22 14:13:29.489431: 
step: 854/466, loss: 0.04353923723101616 2023-01-22 14:13:30.252443: step: 856/466, loss: 0.08984728157520294 2023-01-22 14:13:30.946555: step: 858/466, loss: 0.05157919600605965 2023-01-22 14:13:31.618967: step: 860/466, loss: 0.027703218162059784 2023-01-22 14:13:32.553107: step: 862/466, loss: 0.13330939412117004 2023-01-22 14:13:33.307503: step: 864/466, loss: 0.03492619842290878 2023-01-22 14:13:34.039827: step: 866/466, loss: 0.013390794396400452 2023-01-22 14:13:34.845459: step: 868/466, loss: 0.007219783030450344 2023-01-22 14:13:35.609795: step: 870/466, loss: 0.6684727668762207 2023-01-22 14:13:36.405882: step: 872/466, loss: 0.028121741488575935 2023-01-22 14:13:37.160798: step: 874/466, loss: 0.042943116277456284 2023-01-22 14:13:37.959913: step: 876/466, loss: 0.1596052497625351 2023-01-22 14:13:38.747901: step: 878/466, loss: 0.05617586523294449 2023-01-22 14:13:39.518029: step: 880/466, loss: 0.03300650045275688 2023-01-22 14:13:40.260852: step: 882/466, loss: 0.02836972288787365 2023-01-22 14:13:41.043223: step: 884/466, loss: 0.0764867439866066 2023-01-22 14:13:41.802814: step: 886/466, loss: 0.3197813630104065 2023-01-22 14:13:42.621197: step: 888/466, loss: 0.08093675225973129 2023-01-22 14:13:43.406880: step: 890/466, loss: 0.08463520556688309 2023-01-22 14:13:44.182699: step: 892/466, loss: 0.025558089837431908 2023-01-22 14:13:44.882016: step: 894/466, loss: 0.01471740286797285 2023-01-22 14:13:45.703773: step: 896/466, loss: 0.0714501366019249 2023-01-22 14:13:46.434786: step: 898/466, loss: 0.08137260377407074 2023-01-22 14:13:47.199669: step: 900/466, loss: 0.08838675916194916 2023-01-22 14:13:47.929690: step: 902/466, loss: 0.014560588635504246 2023-01-22 14:13:48.764349: step: 904/466, loss: 0.1300331950187683 2023-01-22 14:13:49.562725: step: 906/466, loss: 0.13205008208751678 2023-01-22 14:13:50.321390: step: 908/466, loss: 0.033016860485076904 2023-01-22 14:13:51.132558: step: 910/466, loss: 0.04948754981160164 2023-01-22 
14:13:51.878224: step: 912/466, loss: 0.01946226879954338
2023-01-22 14:13:52.664859: step: 914/466, loss: 0.13564474880695343
2023-01-22 14:13:53.534547: step: 916/466, loss: 0.09711972624063492
2023-01-22 14:13:54.315589: step: 918/466, loss: 0.10797549039125443
2023-01-22 14:13:55.094787: step: 920/466, loss: 0.0589141882956028
2023-01-22 14:13:55.830590: step: 922/466, loss: 0.24523958563804626
2023-01-22 14:13:56.615810: step: 924/466, loss: 0.09199422597885132
2023-01-22 14:13:57.386900: step: 926/466, loss: 0.05299725756049156
2023-01-22 14:13:58.125933: step: 928/466, loss: 0.22056227922439575
2023-01-22 14:13:58.970081: step: 930/466, loss: 0.03502621129155159
2023-01-22 14:13:59.747711: step: 932/466, loss: 0.011367511935532093
==================================================
Loss: 0.108
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29369308852067477, 'r': 0.33939106434362604, 'f1': 0.3148927656850192}, 'combined': 0.23202624839948785, 'epoch': 19}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3323483427992056, 'r': 0.31797341447744065, 'f1': 0.3250020045410446}, 'combined': 0.19975732962034937, 'epoch': 19}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2705198534637736, 'r': 0.3439246713865812, 'f1': 0.302837597027115}, 'combined': 0.22314349254629526, 'epoch': 19}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.30988475526013426, 'r': 0.3187386054104238, 'f1': 0.3142493292778826}, 'combined': 0.19314836823908882, 'epoch': 19}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017706576728499, 'r': 0.3395635673624288, 'f1': 0.3195535714285714}, 'combined': 0.23546052631578943, 'epoch': 19}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3321627543028552, 'r': 0.3129111067783818, 'f1': 0.32224965651297044}, 'combined': 0.19903655255212885, 'epoch': 19}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2712765957446808, 'r': 0.36428571428571427, 'f1': 0.31097560975609756}, 'combined': 0.2073170731707317, 'epoch': 19}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.26666666666666666, 'r': 0.5217391304347826, 'f1': 0.3529411764705882}, 'combined': 0.1764705882352941, 'epoch': 19}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 19}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017706576728499, 'r': 0.3395635673624288, 'f1': 0.3195535714285714}, 'combined': 0.23546052631578943, 'epoch': 19}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3321627543028552, 'r': 0.3129111067783818, 'f1': 0.32224965651297044}, 'combined': 0.19903655255212885, 'epoch': 19}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 19}
******************************
Epoch: 20
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:16:55.725865: step: 2/466, loss: 0.018460728228092194
2023-01-22 14:16:56.445285: step: 4/466, loss: 0.03275569900870323
2023-01-22 14:16:57.263146: step: 6/466, loss: 0.03537336736917496
2023-01-22 14:16:58.134815: step: 8/466, loss: 0.19849158823490143
2023-01-22 14:16:58.855511: step: 10/466, loss: 0.05351646617054939
2023-01-22 14:16:59.595765: step: 12/466, loss: 0.032131731510162354
2023-01-22 14:17:00.356506: step: 14/466, loss: 0.014430488459765911
2023-01-22 14:17:01.060473: step: 16/466, loss: 0.08124786615371704
2023-01-22 14:17:01.748455: step: 18/466, loss: 0.09149978309869766
2023-01-22 14:17:02.412142: step: 20/466, loss: 0.021502234041690826
2023-01-22 14:17:03.244594: step: 22/466, loss: 0.014477944932878017
2023-01-22 14:17:03.955043:
step: 24/466, loss: 0.04463953897356987 2023-01-22 14:17:04.652434: step: 26/466, loss: 0.07978744804859161 2023-01-22 14:17:05.358214: step: 28/466, loss: 0.022446228191256523 2023-01-22 14:17:06.116987: step: 30/466, loss: 0.06083933636546135 2023-01-22 14:17:06.830892: step: 32/466, loss: 0.03639407828450203 2023-01-22 14:17:07.612439: step: 34/466, loss: 0.05580973997712135 2023-01-22 14:17:08.347787: step: 36/466, loss: 0.027402512729167938 2023-01-22 14:17:09.125863: step: 38/466, loss: 0.04815574362874031 2023-01-22 14:17:09.901299: step: 40/466, loss: 0.008505039848387241 2023-01-22 14:17:10.676402: step: 42/466, loss: 0.011803867295384407 2023-01-22 14:17:11.475864: step: 44/466, loss: 0.0648321807384491 2023-01-22 14:17:12.179304: step: 46/466, loss: 0.06518565118312836 2023-01-22 14:17:12.937029: step: 48/466, loss: 0.4239426553249359 2023-01-22 14:17:13.637266: step: 50/466, loss: 0.11440339684486389 2023-01-22 14:17:14.358609: step: 52/466, loss: 0.026564881205558777 2023-01-22 14:17:15.179263: step: 54/466, loss: 0.1800045222043991 2023-01-22 14:17:16.010930: step: 56/466, loss: 0.08924289047718048 2023-01-22 14:17:16.767853: step: 58/466, loss: 0.01246184203773737 2023-01-22 14:17:17.558038: step: 60/466, loss: 0.12218394130468369 2023-01-22 14:17:18.328983: step: 62/466, loss: 0.08775953203439713 2023-01-22 14:17:19.077157: step: 64/466, loss: 0.06456584483385086 2023-01-22 14:17:19.822227: step: 66/466, loss: 0.02095860429108143 2023-01-22 14:17:20.544418: step: 68/466, loss: 0.29399341344833374 2023-01-22 14:17:21.271498: step: 70/466, loss: 0.06159337982535362 2023-01-22 14:17:22.101505: step: 72/466, loss: 0.0189979188144207 2023-01-22 14:17:22.959795: step: 74/466, loss: 0.01161477155983448 2023-01-22 14:17:23.724732: step: 76/466, loss: 0.06281650811433792 2023-01-22 14:17:24.431524: step: 78/466, loss: 0.05198509618639946 2023-01-22 14:17:25.170987: step: 80/466, loss: 0.029480332508683205 2023-01-22 14:17:25.956151: step: 82/466, loss: 
0.03114878572523594 2023-01-22 14:17:26.724622: step: 84/466, loss: 0.17855964601039886 2023-01-22 14:17:27.514356: step: 86/466, loss: 2.124892234802246 2023-01-22 14:17:28.297189: step: 88/466, loss: 0.03497857227921486 2023-01-22 14:17:29.064077: step: 90/466, loss: 0.10798290371894836 2023-01-22 14:17:29.817168: step: 92/466, loss: 0.030730247497558594 2023-01-22 14:17:30.525675: step: 94/466, loss: 0.05837418511509895 2023-01-22 14:17:31.221162: step: 96/466, loss: 0.037210673093795776 2023-01-22 14:17:32.125216: step: 98/466, loss: 0.02605927363038063 2023-01-22 14:17:32.923620: step: 100/466, loss: 0.05797327309846878 2023-01-22 14:17:33.618354: step: 102/466, loss: 0.014021651819348335 2023-01-22 14:17:34.368544: step: 104/466, loss: 0.008650483563542366 2023-01-22 14:17:35.082105: step: 106/466, loss: 0.27503061294555664 2023-01-22 14:17:35.845204: step: 108/466, loss: 0.05865919962525368 2023-01-22 14:17:36.712182: step: 110/466, loss: 0.32320258021354675 2023-01-22 14:17:37.453000: step: 112/466, loss: 0.03165920823812485 2023-01-22 14:17:38.232765: step: 114/466, loss: 0.3183203637599945 2023-01-22 14:17:38.986526: step: 116/466, loss: 0.3369978368282318 2023-01-22 14:17:39.798509: step: 118/466, loss: 0.049105823040008545 2023-01-22 14:17:40.575436: step: 120/466, loss: 0.01705557107925415 2023-01-22 14:17:41.306427: step: 122/466, loss: 0.10555698722600937 2023-01-22 14:17:41.982324: step: 124/466, loss: 0.10458479821681976 2023-01-22 14:17:42.758096: step: 126/466, loss: 0.02696489356458187 2023-01-22 14:17:43.433242: step: 128/466, loss: 0.02288011461496353 2023-01-22 14:17:44.339236: step: 130/466, loss: 0.06803611665964127 2023-01-22 14:17:45.118030: step: 132/466, loss: 0.022408053278923035 2023-01-22 14:17:45.861894: step: 134/466, loss: 0.11165321618318558 2023-01-22 14:17:46.600782: step: 136/466, loss: 0.02976180799305439 2023-01-22 14:17:47.385813: step: 138/466, loss: 0.05110258609056473 2023-01-22 14:17:48.100803: step: 140/466, loss: 
0.04810798540711403 2023-01-22 14:17:48.833212: step: 142/466, loss: 0.04758574068546295 2023-01-22 14:17:49.548575: step: 144/466, loss: 0.0897674486041069 2023-01-22 14:17:50.347236: step: 146/466, loss: 0.13839790225028992 2023-01-22 14:17:51.177592: step: 148/466, loss: 0.026050128042697906 2023-01-22 14:17:51.969684: step: 150/466, loss: 0.07832131534814835 2023-01-22 14:17:52.794109: step: 152/466, loss: 0.03038802742958069 2023-01-22 14:17:53.501653: step: 154/466, loss: 0.010046614333987236 2023-01-22 14:17:54.239988: step: 156/466, loss: 0.028194040060043335 2023-01-22 14:17:54.978065: step: 158/466, loss: 0.05797536298632622 2023-01-22 14:17:55.849204: step: 160/466, loss: 0.06366293877363205 2023-01-22 14:17:56.602796: step: 162/466, loss: 0.06145820766687393 2023-01-22 14:17:57.404934: step: 164/466, loss: 0.006274975370615721 2023-01-22 14:17:58.149006: step: 166/466, loss: 0.19345133006572723 2023-01-22 14:17:58.943094: step: 168/466, loss: 0.054684944450855255 2023-01-22 14:17:59.661512: step: 170/466, loss: 0.031118638813495636 2023-01-22 14:18:00.344207: step: 172/466, loss: 0.07022649049758911 2023-01-22 14:18:01.091408: step: 174/466, loss: 0.034614648669958115 2023-01-22 14:18:01.947073: step: 176/466, loss: 0.07210703939199448 2023-01-22 14:18:02.693092: step: 178/466, loss: 0.0653044581413269 2023-01-22 14:18:03.361860: step: 180/466, loss: 0.08951914310455322 2023-01-22 14:18:04.069349: step: 182/466, loss: 0.09185580909252167 2023-01-22 14:18:04.789005: step: 184/466, loss: 0.0031846435740590096 2023-01-22 14:18:05.541275: step: 186/466, loss: 0.06330131739377975 2023-01-22 14:18:06.278982: step: 188/466, loss: 0.011010921560227871 2023-01-22 14:18:07.074741: step: 190/466, loss: 0.018669040873646736 2023-01-22 14:18:07.840025: step: 192/466, loss: 0.03512444719672203 2023-01-22 14:18:08.672868: step: 194/466, loss: 0.05187104269862175 2023-01-22 14:18:09.476535: step: 196/466, loss: 0.062332749366760254 2023-01-22 14:18:10.232711: step: 
198/466, loss: 0.46412545442581177 2023-01-22 14:18:11.009094: step: 200/466, loss: 0.02101816236972809 2023-01-22 14:18:11.793509: step: 202/466, loss: 0.10228997468948364 2023-01-22 14:18:12.521943: step: 204/466, loss: 0.2514619529247284 2023-01-22 14:18:13.367140: step: 206/466, loss: 0.09877415746450424 2023-01-22 14:18:14.157370: step: 208/466, loss: 0.013377824798226357 2023-01-22 14:18:14.919235: step: 210/466, loss: 0.057463813573122025 2023-01-22 14:18:15.574492: step: 212/466, loss: 0.01609615981578827 2023-01-22 14:18:16.246806: step: 214/466, loss: 0.019593212753534317 2023-01-22 14:18:16.879044: step: 216/466, loss: 0.034630268812179565 2023-01-22 14:18:17.598968: step: 218/466, loss: 0.6699469089508057 2023-01-22 14:18:18.382360: step: 220/466, loss: 0.041075363755226135 2023-01-22 14:18:19.221350: step: 222/466, loss: 0.0339871346950531 2023-01-22 14:18:19.966373: step: 224/466, loss: 0.09721852093935013 2023-01-22 14:18:20.684041: step: 226/466, loss: 0.138138547539711 2023-01-22 14:18:21.441188: step: 228/466, loss: 0.02130807936191559 2023-01-22 14:18:22.210095: step: 230/466, loss: 0.04031668230891228 2023-01-22 14:18:22.976520: step: 232/466, loss: 0.020303938537836075 2023-01-22 14:18:23.769667: step: 234/466, loss: 0.05183028429746628 2023-01-22 14:18:24.550803: step: 236/466, loss: 0.07668639719486237 2023-01-22 14:18:25.342510: step: 238/466, loss: 0.019458649680018425 2023-01-22 14:18:26.144012: step: 240/466, loss: 0.1946711540222168 2023-01-22 14:18:26.949764: step: 242/466, loss: 0.059943996369838715 2023-01-22 14:18:27.774865: step: 244/466, loss: 0.08493660390377045 2023-01-22 14:18:28.500939: step: 246/466, loss: 0.0389455184340477 2023-01-22 14:18:29.243751: step: 248/466, loss: 0.03665849566459656 2023-01-22 14:18:29.993617: step: 250/466, loss: 0.10476026684045792 2023-01-22 14:18:30.801155: step: 252/466, loss: 0.017195170745253563 2023-01-22 14:18:31.550733: step: 254/466, loss: 0.0854298323392868 2023-01-22 14:18:32.300802: 
step: 256/466, loss: 0.020137647166848183 2023-01-22 14:18:33.099158: step: 258/466, loss: 0.026926511898636818 2023-01-22 14:18:33.816812: step: 260/466, loss: 0.14978930354118347 2023-01-22 14:18:34.602450: step: 262/466, loss: 0.04770808294415474 2023-01-22 14:18:35.356122: step: 264/466, loss: 0.03563295304775238 2023-01-22 14:18:36.075997: step: 266/466, loss: 0.03312264755368233 2023-01-22 14:18:36.856524: step: 268/466, loss: 0.05192991718649864 2023-01-22 14:18:37.618878: step: 270/466, loss: 0.4052007496356964 2023-01-22 14:18:38.335610: step: 272/466, loss: 0.08925099670886993 2023-01-22 14:18:39.050187: step: 274/466, loss: 0.05085451528429985 2023-01-22 14:18:39.950897: step: 276/466, loss: 0.10984174907207489 2023-01-22 14:18:40.688053: step: 278/466, loss: 0.020902059972286224 2023-01-22 14:18:41.419859: step: 280/466, loss: 0.33469197154045105 2023-01-22 14:18:42.211379: step: 282/466, loss: 0.05572224780917168 2023-01-22 14:18:43.024621: step: 284/466, loss: 0.06708931177854538 2023-01-22 14:18:43.768666: step: 286/466, loss: 0.03186662867665291 2023-01-22 14:18:44.492849: step: 288/466, loss: 0.0212254598736763 2023-01-22 14:18:45.294397: step: 290/466, loss: 0.0902729481458664 2023-01-22 14:18:46.021897: step: 292/466, loss: 0.020895853638648987 2023-01-22 14:18:46.680586: step: 294/466, loss: 0.046829309314489365 2023-01-22 14:18:47.434900: step: 296/466, loss: 0.10319173336029053 2023-01-22 14:18:48.148892: step: 298/466, loss: 0.03693874925374985 2023-01-22 14:18:48.868881: step: 300/466, loss: 0.014299717731773853 2023-01-22 14:18:49.686956: step: 302/466, loss: 0.6914137005805969 2023-01-22 14:18:50.449077: step: 304/466, loss: 0.07307901233434677 2023-01-22 14:18:51.232209: step: 306/466, loss: 0.04094526544213295 2023-01-22 14:18:52.035093: step: 308/466, loss: 0.047497380524873734 2023-01-22 14:18:52.849758: step: 310/466, loss: 0.057354554533958435 2023-01-22 14:18:53.637889: step: 312/466, loss: 0.026758970692753792 2023-01-22 
14:18:54.364770: step: 314/466, loss: 0.019884929060935974 2023-01-22 14:18:55.204515: step: 316/466, loss: 0.0628701001405716 2023-01-22 14:18:55.981607: step: 318/466, loss: 0.19064365327358246 2023-01-22 14:18:56.751165: step: 320/466, loss: 0.12521429359912872 2023-01-22 14:18:57.419267: step: 322/466, loss: 0.04952094703912735 2023-01-22 14:18:58.226402: step: 324/466, loss: 0.09516461938619614 2023-01-22 14:18:59.067535: step: 326/466, loss: 0.08458506315946579 2023-01-22 14:18:59.876033: step: 328/466, loss: 0.03529912978410721 2023-01-22 14:19:00.686541: step: 330/466, loss: 5.178069114685059 2023-01-22 14:19:01.439033: step: 332/466, loss: 0.21790748834609985 2023-01-22 14:19:02.182716: step: 334/466, loss: 0.01928175799548626 2023-01-22 14:19:02.899465: step: 336/466, loss: 0.004510574974119663 2023-01-22 14:19:03.618673: step: 338/466, loss: 0.029111620038747787 2023-01-22 14:19:04.411452: step: 340/466, loss: 0.0197187177836895 2023-01-22 14:19:05.172040: step: 342/466, loss: 0.06134937331080437 2023-01-22 14:19:06.027835: step: 344/466, loss: 0.03785282000899315 2023-01-22 14:19:06.743944: step: 346/466, loss: 0.011232880875468254 2023-01-22 14:19:07.525456: step: 348/466, loss: 0.0719815343618393 2023-01-22 14:19:08.330629: step: 350/466, loss: 0.08595134317874908 2023-01-22 14:19:09.115586: step: 352/466, loss: 0.04742708057165146 2023-01-22 14:19:09.874815: step: 354/466, loss: 0.19618232548236847 2023-01-22 14:19:10.624152: step: 356/466, loss: 0.11792128533124924 2023-01-22 14:19:11.359606: step: 358/466, loss: 0.05558709800243378 2023-01-22 14:19:12.075929: step: 360/466, loss: 0.046625006943941116 2023-01-22 14:19:12.845055: step: 362/466, loss: 0.024011583998799324 2023-01-22 14:19:13.547364: step: 364/466, loss: 0.20778243243694305 2023-01-22 14:19:14.331976: step: 366/466, loss: 0.8859660029411316 2023-01-22 14:19:15.148765: step: 368/466, loss: 0.25369954109191895 2023-01-22 14:19:15.943651: step: 370/466, loss: 0.003959276247769594 
2023-01-22 14:19:16.724288: step: 372/466, loss: 0.05117206275463104 2023-01-22 14:19:17.544578: step: 374/466, loss: 0.038259174674749374 2023-01-22 14:19:18.321376: step: 376/466, loss: 0.03823187202215195 2023-01-22 14:19:19.160622: step: 378/466, loss: 0.1659523993730545 2023-01-22 14:19:19.995141: step: 380/466, loss: 0.02509915828704834 2023-01-22 14:19:20.731815: step: 382/466, loss: 0.06251364201307297 2023-01-22 14:19:21.481665: step: 384/466, loss: 0.02569531463086605 2023-01-22 14:19:22.232993: step: 386/466, loss: 0.10763271898031235 2023-01-22 14:19:23.158308: step: 388/466, loss: 0.1218181699514389 2023-01-22 14:19:23.858301: step: 390/466, loss: 0.10014048218727112 2023-01-22 14:19:24.662794: step: 392/466, loss: 0.032277580350637436 2023-01-22 14:19:25.442660: step: 394/466, loss: 0.025713779032230377 2023-01-22 14:19:26.302199: step: 396/466, loss: 0.037789396941661835 2023-01-22 14:19:27.022983: step: 398/466, loss: 0.002991467248648405 2023-01-22 14:19:27.760056: step: 400/466, loss: 0.06556622684001923 2023-01-22 14:19:28.530114: step: 402/466, loss: 0.2638910710811615 2023-01-22 14:19:29.247591: step: 404/466, loss: 0.043477851897478104 2023-01-22 14:19:30.098366: step: 406/466, loss: 0.12631016969680786 2023-01-22 14:19:30.815000: step: 408/466, loss: 0.009239846840500832 2023-01-22 14:19:31.653076: step: 410/466, loss: 0.26313164830207825 2023-01-22 14:19:32.451663: step: 412/466, loss: 0.029916265979409218 2023-01-22 14:19:33.374377: step: 414/466, loss: 0.042880747467279434 2023-01-22 14:19:34.258386: step: 416/466, loss: 0.04426199197769165 2023-01-22 14:19:34.978230: step: 418/466, loss: 0.0630863681435585 2023-01-22 14:19:35.736329: step: 420/466, loss: 0.7282978296279907 2023-01-22 14:19:36.495238: step: 422/466, loss: 0.0994003489613533 2023-01-22 14:19:37.228509: step: 424/466, loss: 0.08450376987457275 2023-01-22 14:19:38.084288: step: 426/466, loss: 0.003517021657899022 2023-01-22 14:19:38.867973: step: 428/466, loss: 
0.02976216748356819 2023-01-22 14:19:39.651694: step: 430/466, loss: 0.015140297822654247 2023-01-22 14:19:40.422474: step: 432/466, loss: 0.04440704360604286 2023-01-22 14:19:41.143376: step: 434/466, loss: 0.031749606132507324 2023-01-22 14:19:41.859671: step: 436/466, loss: 0.04850994795560837 2023-01-22 14:19:42.595750: step: 438/466, loss: 0.02641472965478897 2023-01-22 14:19:43.294279: step: 440/466, loss: 0.00908383633941412 2023-01-22 14:19:44.088826: step: 442/466, loss: 0.06254004687070847 2023-01-22 14:19:44.824915: step: 444/466, loss: 0.04309402033686638 2023-01-22 14:19:45.647533: step: 446/466, loss: 0.9316014051437378 2023-01-22 14:19:46.409188: step: 448/466, loss: 0.059038687497377396 2023-01-22 14:19:47.169657: step: 450/466, loss: 0.08485215902328491 2023-01-22 14:19:47.917250: step: 452/466, loss: 0.0382041372358799 2023-01-22 14:19:48.711468: step: 454/466, loss: 0.03592300042510033 2023-01-22 14:19:49.497366: step: 456/466, loss: 0.06754646450281143 2023-01-22 14:19:50.256791: step: 458/466, loss: 0.004598780535161495 2023-01-22 14:19:51.115759: step: 460/466, loss: 0.04046473652124405 2023-01-22 14:19:51.798268: step: 462/466, loss: 0.20700432360172272 2023-01-22 14:19:52.575936: step: 464/466, loss: 0.009331752546131611 2023-01-22 14:19:53.315287: step: 466/466, loss: 0.038695015013217926 2023-01-22 14:19:54.062411: step: 468/466, loss: 0.03941889852285385 2023-01-22 14:19:54.828955: step: 470/466, loss: 0.08039289712905884 2023-01-22 14:19:55.612034: step: 472/466, loss: 0.06569116562604904 2023-01-22 14:19:56.392510: step: 474/466, loss: 0.026711666956543922 2023-01-22 14:19:57.288390: step: 476/466, loss: 0.07863642275333405 2023-01-22 14:19:58.109102: step: 478/466, loss: 0.018847770988941193 2023-01-22 14:19:58.870297: step: 480/466, loss: 0.0439763106405735 2023-01-22 14:19:59.675537: step: 482/466, loss: 0.029851358383893967 2023-01-22 14:20:00.368909: step: 484/466, loss: 0.007683632429689169 2023-01-22 14:20:01.147621: step: 
486/466, loss: 0.14108704030513763 2023-01-22 14:20:02.029389: step: 488/466, loss: 0.03667457029223442 2023-01-22 14:20:02.741301: step: 490/466, loss: 0.03250245749950409 2023-01-22 14:20:03.430827: step: 492/466, loss: 0.06343529373407364 2023-01-22 14:20:04.215119: step: 494/466, loss: 0.0005856614443473518 2023-01-22 14:20:04.936007: step: 496/466, loss: 0.11331066489219666 2023-01-22 14:20:05.692037: step: 498/466, loss: 0.037126459181308746 2023-01-22 14:20:06.469381: step: 500/466, loss: 0.06286133080720901 2023-01-22 14:20:07.213708: step: 502/466, loss: 0.030627667903900146 2023-01-22 14:20:08.001448: step: 504/466, loss: 0.060949284583330154 2023-01-22 14:20:08.810201: step: 506/466, loss: 0.039407823234796524 2023-01-22 14:20:09.552338: step: 508/466, loss: 0.030075030401349068 2023-01-22 14:20:10.291981: step: 510/466, loss: 0.1829003244638443 2023-01-22 14:20:11.059569: step: 512/466, loss: 0.03227916359901428 2023-01-22 14:20:11.874129: step: 514/466, loss: 0.06386967748403549 2023-01-22 14:20:12.576645: step: 516/466, loss: 0.029427075758576393 2023-01-22 14:20:13.389753: step: 518/466, loss: 0.019630271941423416 2023-01-22 14:20:14.153741: step: 520/466, loss: 0.1718192994594574 2023-01-22 14:20:14.910736: step: 522/466, loss: 0.10603654384613037 2023-01-22 14:20:15.599967: step: 524/466, loss: 0.07974043488502502 2023-01-22 14:20:16.389584: step: 526/466, loss: 0.11766904592514038 2023-01-22 14:20:17.131224: step: 528/466, loss: 0.025251492857933044 2023-01-22 14:20:17.915643: step: 530/466, loss: 0.008425693958997726 2023-01-22 14:20:18.668391: step: 532/466, loss: 0.015584269538521767 2023-01-22 14:20:19.436323: step: 534/466, loss: 0.05043382942676544 2023-01-22 14:20:20.201695: step: 536/466, loss: 0.0699617862701416 2023-01-22 14:20:20.978289: step: 538/466, loss: 0.05954763665795326 2023-01-22 14:20:21.697676: step: 540/466, loss: 0.027190033346414566 2023-01-22 14:20:22.409242: step: 542/466, loss: 0.04364040866494179 2023-01-22 
14:20:23.136055: step: 544/466, loss: 0.06381090730428696 2023-01-22 14:20:23.881152: step: 546/466, loss: 0.05105084925889969 2023-01-22 14:20:24.600545: step: 548/466, loss: 0.05020805820822716 2023-01-22 14:20:25.395230: step: 550/466, loss: 0.17512200772762299 2023-01-22 14:20:26.242085: step: 552/466, loss: 0.0026450727600604296 2023-01-22 14:20:27.090601: step: 554/466, loss: 0.13461250066757202 2023-01-22 14:20:27.845651: step: 556/466, loss: 0.033968906849622726 2023-01-22 14:20:28.638629: step: 558/466, loss: 0.037686340510845184 2023-01-22 14:20:29.436422: step: 560/466, loss: 0.05116521194577217 2023-01-22 14:20:30.207278: step: 562/466, loss: 0.1257532835006714 2023-01-22 14:20:30.960383: step: 564/466, loss: 0.07375410944223404 2023-01-22 14:20:31.650276: step: 566/466, loss: 0.06227536499500275 2023-01-22 14:20:32.423707: step: 568/466, loss: 0.043114885687828064 2023-01-22 14:20:33.194800: step: 570/466, loss: 0.08072521537542343 2023-01-22 14:20:33.964546: step: 572/466, loss: 0.02490549348294735 2023-01-22 14:20:34.677405: step: 574/466, loss: 0.057185281068086624 2023-01-22 14:20:35.386566: step: 576/466, loss: 0.021186070516705513 2023-01-22 14:20:36.164731: step: 578/466, loss: 0.03023500367999077 2023-01-22 14:20:36.859011: step: 580/466, loss: 0.08721121400594711 2023-01-22 14:20:37.616720: step: 582/466, loss: 0.20383666455745697 2023-01-22 14:20:38.377574: step: 584/466, loss: 0.03602638468146324 2023-01-22 14:20:39.128856: step: 586/466, loss: 0.09032157063484192 2023-01-22 14:20:39.944560: step: 588/466, loss: 0.04010477289557457 2023-01-22 14:20:40.741234: step: 590/466, loss: 0.10806169360876083 2023-01-22 14:20:41.480779: step: 592/466, loss: 0.33428919315338135 2023-01-22 14:20:42.287172: step: 594/466, loss: 0.003103163791820407 2023-01-22 14:20:43.004183: step: 596/466, loss: 0.057396624237298965 2023-01-22 14:20:43.762664: step: 598/466, loss: 0.03389997407793999 2023-01-22 14:20:44.546883: step: 600/466, loss: 0.1480042040348053 
2023-01-22 14:20:45.266640: step: 602/466, loss: 0.11612707376480103 2023-01-22 14:20:46.080610: step: 604/466, loss: 0.14691177010536194 2023-01-22 14:20:46.872751: step: 606/466, loss: 0.05263170599937439 2023-01-22 14:20:47.541471: step: 608/466, loss: 0.026127617806196213 2023-01-22 14:20:48.250494: step: 610/466, loss: 0.009142450988292694 2023-01-22 14:20:49.056752: step: 612/466, loss: 0.13319078087806702 2023-01-22 14:20:49.794533: step: 614/466, loss: 0.027096861973404884 2023-01-22 14:20:50.504687: step: 616/466, loss: 0.10066401213407516 2023-01-22 14:20:51.323867: step: 618/466, loss: 0.25737836956977844 2023-01-22 14:20:52.069102: step: 620/466, loss: 0.01815767213702202 2023-01-22 14:20:52.959216: step: 622/466, loss: 0.039933137595653534 2023-01-22 14:20:53.724369: step: 624/466, loss: 0.008822445757687092 2023-01-22 14:20:54.561665: step: 626/466, loss: 0.04307998716831207 2023-01-22 14:20:55.328918: step: 628/466, loss: 0.04361351579427719 2023-01-22 14:20:56.033253: step: 630/466, loss: 0.06477247178554535 2023-01-22 14:20:56.817267: step: 632/466, loss: 0.09567868709564209 2023-01-22 14:20:57.523352: step: 634/466, loss: 0.05815883353352547 2023-01-22 14:20:58.255448: step: 636/466, loss: 0.021055176854133606 2023-01-22 14:20:59.073721: step: 638/466, loss: 0.07199005782604218 2023-01-22 14:20:59.850968: step: 640/466, loss: 0.03393526375293732 2023-01-22 14:21:00.622532: step: 642/466, loss: 0.1396375149488449 2023-01-22 14:21:01.395139: step: 644/466, loss: 0.05805453285574913 2023-01-22 14:21:02.126358: step: 646/466, loss: 0.00800328515470028 2023-01-22 14:21:02.911695: step: 648/466, loss: 0.12152507156133652 2023-01-22 14:21:03.625509: step: 650/466, loss: 0.20110099017620087 2023-01-22 14:21:04.488718: step: 652/466, loss: 0.08574999123811722 2023-01-22 14:21:05.269419: step: 654/466, loss: 0.02011111192405224 2023-01-22 14:21:06.042999: step: 656/466, loss: 0.07749201357364655 2023-01-22 14:21:06.845006: step: 658/466, loss: 
0.12124801427125931 2023-01-22 14:21:07.661647: step: 660/466, loss: 0.03949256241321564 2023-01-22 14:21:08.593657: step: 662/466, loss: 0.010345552116632462 2023-01-22 14:21:09.297401: step: 664/466, loss: 0.021447787061333656 2023-01-22 14:21:10.070847: step: 666/466, loss: 0.08892843127250671 2023-01-22 14:21:10.839140: step: 668/466, loss: 0.058989591896533966 2023-01-22 14:21:11.559022: step: 670/466, loss: 0.016034310683608055 2023-01-22 14:21:12.275678: step: 672/466, loss: 0.01991070993244648 2023-01-22 14:21:13.221993: step: 674/466, loss: 0.05193669721484184 2023-01-22 14:21:14.042903: step: 676/466, loss: 0.0062385061755776405 2023-01-22 14:21:14.846780: step: 678/466, loss: 0.06463981419801712 2023-01-22 14:21:15.671381: step: 680/466, loss: 0.18236199021339417 2023-01-22 14:21:16.387056: step: 682/466, loss: 0.019208496436476707 2023-01-22 14:21:17.093774: step: 684/466, loss: 0.010285614989697933 2023-01-22 14:21:17.861780: step: 686/466, loss: 0.03297063335776329 2023-01-22 14:21:18.566701: step: 688/466, loss: 0.1760704517364502 2023-01-22 14:21:19.396483: step: 690/466, loss: 0.01781740039587021 2023-01-22 14:21:20.202186: step: 692/466, loss: 0.03579838573932648 2023-01-22 14:21:20.968201: step: 694/466, loss: 0.040287796407938004 2023-01-22 14:21:21.800182: step: 696/466, loss: 0.03944031521677971 2023-01-22 14:21:22.565871: step: 698/466, loss: 0.0013104267418384552 2023-01-22 14:21:23.346591: step: 700/466, loss: 0.17801056802272797 2023-01-22 14:21:24.154444: step: 702/466, loss: 0.0916200578212738 2023-01-22 14:21:25.032809: step: 704/466, loss: 0.033320989459753036 2023-01-22 14:21:25.811686: step: 706/466, loss: 0.04283035546541214 2023-01-22 14:21:26.507645: step: 708/466, loss: 0.1713336706161499 2023-01-22 14:21:27.264956: step: 710/466, loss: 0.016873905435204506 2023-01-22 14:21:28.045936: step: 712/466, loss: 0.05669533833861351 2023-01-22 14:21:28.797626: step: 714/466, loss: 0.2237488031387329 2023-01-22 14:21:29.567649: step: 
716/466, loss: 0.05956115201115608 2023-01-22 14:21:30.346106: step: 718/466, loss: 0.45502448081970215 2023-01-22 14:21:31.139828: step: 720/466, loss: 0.06795157492160797 2023-01-22 14:21:31.976477: step: 722/466, loss: 0.12121880799531937 2023-01-22 14:21:32.686902: step: 724/466, loss: 0.05142972618341446 2023-01-22 14:21:33.544264: step: 726/466, loss: 0.08417578041553497 2023-01-22 14:21:34.316655: step: 728/466, loss: 0.1284528374671936 2023-01-22 14:21:35.048524: step: 730/466, loss: 0.09293889999389648 2023-01-22 14:21:35.799028: step: 732/466, loss: 0.06237662583589554 2023-01-22 14:21:36.528049: step: 734/466, loss: 0.08169272541999817 2023-01-22 14:21:37.259760: step: 736/466, loss: 0.015537315979599953 2023-01-22 14:21:38.049714: step: 738/466, loss: 0.018973803147673607 2023-01-22 14:21:38.812080: step: 740/466, loss: 0.019834930077195168 2023-01-22 14:21:39.553658: step: 742/466, loss: 0.024088917300105095 2023-01-22 14:21:40.363530: step: 744/466, loss: 0.10155860334634781 2023-01-22 14:21:41.150565: step: 746/466, loss: 0.024841653183102608 2023-01-22 14:21:41.926757: step: 748/466, loss: 0.03416226804256439 2023-01-22 14:21:42.748227: step: 750/466, loss: 0.0373273529112339 2023-01-22 14:21:43.426996: step: 752/466, loss: 0.03173748031258583 2023-01-22 14:21:44.206109: step: 754/466, loss: 0.035857170820236206 2023-01-22 14:21:44.917736: step: 756/466, loss: 0.0462789386510849 2023-01-22 14:21:45.684250: step: 758/466, loss: 0.2072424590587616 2023-01-22 14:21:46.457022: step: 760/466, loss: 0.15545901656150818 2023-01-22 14:21:47.246630: step: 762/466, loss: 0.09147053956985474 2023-01-22 14:21:48.076200: step: 764/466, loss: 0.1897769570350647 2023-01-22 14:21:48.850483: step: 766/466, loss: 0.040288910269737244 2023-01-22 14:21:49.628527: step: 768/466, loss: 0.014940326102077961 2023-01-22 14:21:50.329339: step: 770/466, loss: 0.037154488265514374 2023-01-22 14:21:51.076025: step: 772/466, loss: 0.052049510180950165 2023-01-22 14:21:51.834623: 
step: 774/466, loss: 0.1100940853357315 2023-01-22 14:21:52.624031: step: 776/466, loss: 0.033307034522295 2023-01-22 14:21:53.437747: step: 778/466, loss: 0.03242477402091026 2023-01-22 14:21:54.183790: step: 780/466, loss: 0.02020042948424816 2023-01-22 14:21:54.853548: step: 782/466, loss: 0.03318953886628151 2023-01-22 14:21:55.643509: step: 784/466, loss: 0.029286310076713562 2023-01-22 14:21:56.292541: step: 786/466, loss: 0.04063934460282326 2023-01-22 14:21:57.079452: step: 788/466, loss: 0.05453144386410713 2023-01-22 14:21:57.870112: step: 790/466, loss: 0.05044718086719513 2023-01-22 14:21:58.691333: step: 792/466, loss: 0.03858442232012749 2023-01-22 14:21:59.402568: step: 794/466, loss: 0.07941761612892151 2023-01-22 14:22:00.156231: step: 796/466, loss: 0.022421518340706825 2023-01-22 14:22:00.878123: step: 798/466, loss: 0.16522493958473206 2023-01-22 14:22:01.706404: step: 800/466, loss: 0.036643315106630325 2023-01-22 14:22:02.502625: step: 802/466, loss: 0.05247822403907776 2023-01-22 14:22:03.272806: step: 804/466, loss: 0.14290215075016022 2023-01-22 14:22:03.987841: step: 806/466, loss: 0.002410825341939926 2023-01-22 14:22:04.740627: step: 808/466, loss: 0.00896370504051447 2023-01-22 14:22:05.484839: step: 810/466, loss: 0.21436962485313416 2023-01-22 14:22:06.226511: step: 812/466, loss: 0.01603039540350437 2023-01-22 14:22:07.013588: step: 814/466, loss: 0.046455636620521545 2023-01-22 14:22:07.815352: step: 816/466, loss: 0.015781737864017487 2023-01-22 14:22:08.532370: step: 818/466, loss: 0.4075051248073578 2023-01-22 14:22:09.257075: step: 820/466, loss: 0.0431065633893013 2023-01-22 14:22:10.034823: step: 822/466, loss: 0.039836108684539795 2023-01-22 14:22:10.825161: step: 824/466, loss: 0.36340564489364624 2023-01-22 14:22:11.620433: step: 826/466, loss: 0.020049631595611572 2023-01-22 14:22:12.363382: step: 828/466, loss: 0.36558952927589417 2023-01-22 14:22:13.178307: step: 830/466, loss: 0.2860356867313385 2023-01-22 
14:22:13.884624: step: 832/466, loss: 0.10677853226661682 2023-01-22 14:22:14.720960: step: 834/466, loss: 0.2021457999944687 2023-01-22 14:22:15.495871: step: 836/466, loss: 0.011159212328493595 2023-01-22 14:22:16.195230: step: 838/466, loss: 0.0147927301004529 2023-01-22 14:22:17.006190: step: 840/466, loss: 0.8715988993644714 2023-01-22 14:22:17.837937: step: 842/466, loss: 0.03797810524702072 2023-01-22 14:22:18.799882: step: 844/466, loss: 0.056683532893657684 2023-01-22 14:22:19.569619: step: 846/466, loss: 0.00796705111861229 2023-01-22 14:22:20.349373: step: 848/466, loss: 0.03010483831167221 2023-01-22 14:22:21.161349: step: 850/466, loss: 0.02800583280622959 2023-01-22 14:22:21.856862: step: 852/466, loss: 0.09175509959459305 2023-01-22 14:22:22.674273: step: 854/466, loss: 0.08602626621723175 2023-01-22 14:22:23.453671: step: 856/466, loss: 0.04435117170214653 2023-01-22 14:22:24.239958: step: 858/466, loss: 0.266827255487442 2023-01-22 14:22:24.951682: step: 860/466, loss: 0.09036989510059357 2023-01-22 14:22:25.698853: step: 862/466, loss: 0.06072517856955528 2023-01-22 14:22:26.506225: step: 864/466, loss: 0.08128470182418823 2023-01-22 14:22:27.244023: step: 866/466, loss: 0.2942475378513336 2023-01-22 14:22:27.965896: step: 868/466, loss: 0.07774436473846436 2023-01-22 14:22:28.712151: step: 870/466, loss: 0.04398656636476517 2023-01-22 14:22:29.545016: step: 872/466, loss: 0.03236667811870575 2023-01-22 14:22:30.308668: step: 874/466, loss: 0.013491833582520485 2023-01-22 14:22:31.058831: step: 876/466, loss: 0.186244398355484 2023-01-22 14:22:31.787874: step: 878/466, loss: 0.02830761857330799 2023-01-22 14:22:32.572222: step: 880/466, loss: 0.14868517220020294 2023-01-22 14:22:33.393143: step: 882/466, loss: 0.009692513383924961 2023-01-22 14:22:34.122611: step: 884/466, loss: 0.009559868834912777 2023-01-22 14:22:34.913806: step: 886/466, loss: 0.1469455510377884 2023-01-22 14:22:35.638183: step: 888/466, loss: 0.0737396776676178 2023-01-22 
14:22:36.667918: step: 890/466, loss: 0.05260159447789192
2023-01-22 14:22:37.412339: step: 892/466, loss: 0.02190791442990303
2023-01-22 14:22:38.156913: step: 894/466, loss: 0.05156734585762024
2023-01-22 14:22:38.937339: step: 896/466, loss: 0.09620752185583115
2023-01-22 14:22:39.622686: step: 898/466, loss: 0.1586761176586151
2023-01-22 14:22:40.353572: step: 900/466, loss: 0.10566425323486328
2023-01-22 14:22:41.338627: step: 902/466, loss: 0.06891264021396637
2023-01-22 14:22:42.073308: step: 904/466, loss: 0.04123598709702492
2023-01-22 14:22:42.869970: step: 906/466, loss: 0.07150737196207047
2023-01-22 14:22:43.773044: step: 908/466, loss: 0.04209690913558006
2023-01-22 14:22:44.411146: step: 910/466, loss: 0.0065148635767400265
2023-01-22 14:22:45.193392: step: 912/466, loss: 0.041471704840660095
2023-01-22 14:22:45.889205: step: 914/466, loss: 0.01814841665327549
2023-01-22 14:22:46.637755: step: 916/466, loss: 0.04168889299035072
2023-01-22 14:22:47.468843: step: 918/466, loss: 0.13480067253112793
2023-01-22 14:22:48.208427: step: 920/466, loss: 0.09623857587575912
2023-01-22 14:22:48.925662: step: 922/466, loss: 0.08783978968858719
2023-01-22 14:22:49.740635: step: 924/466, loss: 0.7061190009117126
2023-01-22 14:22:50.559849: step: 926/466, loss: 0.1090499609708786
2023-01-22 14:22:51.228632: step: 928/466, loss: 0.06472326815128326
2023-01-22 14:22:52.058918: step: 930/466, loss: 0.3719378709793091
2023-01-22 14:22:52.962018: step: 932/466, loss: 0.028679050505161285
==================================================
Loss: 0.100
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3022810665362035, 'r': 0.3349756031444836, 'f1': 0.31778963610646777}, 'combined': 0.23416078449950256, 'epoch': 20}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.347533523150502, 'r': 0.3035648105162097, 'f1': 0.3240645618276651}, 'combined': 0.1991811453184673, 'epoch': 20}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28564455907558645, 'r': 0.34635080312960104, 'f1': 0.3130821153504284}, 'combined': 0.2306920849950525, 'epoch': 20}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.330687293364747, 'r': 0.31005515721027405, 'f1': 0.3200390442045226}, 'combined': 0.19670692473058463, 'epoch': 20}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3127766599597585, 'r': 0.33711032800216856, 'f1': 0.324487932159165}, 'combined': 0.23909637106464787, 'epoch': 20}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3523853818787053, 'r': 0.3093296289801806, 'f1': 0.32945675297012317}, 'combined': 0.2034879944815467, 'epoch': 20}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28532608695652173, 'r': 0.375, 'f1': 0.32407407407407407}, 'combined': 0.21604938271604937, 'epoch': 20}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28191489361702127, 'r': 0.5760869565217391, 'f1': 0.3785714285714286}, 'combined': 0.1892857142857143, 'epoch': 20}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4117647058823529, 'r': 0.2413793103448276, 'f1': 0.3043478260869565}, 'combined': 0.20289855072463764, 'epoch': 20}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017706576728499, 'r': 0.3395635673624288, 'f1': 0.3195535714285714}, 'combined': 0.23546052631578943, 'epoch': 19}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3321627543028552, 'r': 0.3129111067783818, 'f1': 0.32224965651297044}, 'combined': 0.19903655255212885, 'epoch': 19}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 19}
******************************
Epoch: 21
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22
14:25:39.176880: step: 2/466, loss: 0.02420053258538246 2023-01-22 14:25:39.914311: step: 4/466, loss: 0.021157976239919662 2023-01-22 14:25:40.645970: step: 6/466, loss: 0.027058003470301628 2023-01-22 14:25:41.346778: step: 8/466, loss: 0.025552958250045776 2023-01-22 14:25:42.065095: step: 10/466, loss: 0.017578069120645523 2023-01-22 14:25:42.807986: step: 12/466, loss: 0.02473308891057968 2023-01-22 14:25:43.568813: step: 14/466, loss: 0.054351478815078735 2023-01-22 14:25:44.298537: step: 16/466, loss: 0.015328841283917427 2023-01-22 14:25:45.053340: step: 18/466, loss: 0.05652279034256935 2023-01-22 14:25:45.944074: step: 20/466, loss: 0.0549527108669281 2023-01-22 14:25:46.672748: step: 22/466, loss: 0.03255992382764816 2023-01-22 14:25:47.448418: step: 24/466, loss: 0.014854073524475098 2023-01-22 14:25:48.147468: step: 26/466, loss: 1.2386393547058105 2023-01-22 14:25:49.003826: step: 28/466, loss: 0.21185410022735596 2023-01-22 14:25:49.712514: step: 30/466, loss: 0.03914286196231842 2023-01-22 14:25:50.483683: step: 32/466, loss: 0.010245121084153652 2023-01-22 14:25:51.179381: step: 34/466, loss: 0.028347767889499664 2023-01-22 14:25:51.940126: step: 36/466, loss: 0.03141717240214348 2023-01-22 14:25:52.726238: step: 38/466, loss: 0.017886755988001823 2023-01-22 14:25:53.447517: step: 40/466, loss: 0.003856037277728319 2023-01-22 14:25:54.194585: step: 42/466, loss: 0.0017376018222421408 2023-01-22 14:25:54.955033: step: 44/466, loss: 0.15095050632953644 2023-01-22 14:25:55.740383: step: 46/466, loss: 0.05372604727745056 2023-01-22 14:25:56.488109: step: 48/466, loss: 0.012925044633448124 2023-01-22 14:25:57.337847: step: 50/466, loss: 0.01901458576321602 2023-01-22 14:25:58.022105: step: 52/466, loss: 0.026867816224694252 2023-01-22 14:25:58.788321: step: 54/466, loss: 0.030041607096791267 2023-01-22 14:25:59.507231: step: 56/466, loss: 0.042561452835798264 2023-01-22 14:26:00.198881: step: 58/466, loss: 0.027939572930336 2023-01-22 14:26:01.024305: 
step: 60/466, loss: 0.09363628923892975 2023-01-22 14:26:01.877329: step: 62/466, loss: 0.031050391495227814 2023-01-22 14:26:02.657426: step: 64/466, loss: 0.006767976563423872 2023-01-22 14:26:03.476256: step: 66/466, loss: 0.036450713872909546 2023-01-22 14:26:04.282836: step: 68/466, loss: 0.00916550774127245 2023-01-22 14:26:05.021688: step: 70/466, loss: 0.023364635184407234 2023-01-22 14:26:05.814340: step: 72/466, loss: 0.006844291463494301 2023-01-22 14:26:06.543912: step: 74/466, loss: 0.0034316659439355135 2023-01-22 14:26:07.396805: step: 76/466, loss: 0.6681513786315918 2023-01-22 14:26:08.168014: step: 78/466, loss: 0.04171931743621826 2023-01-22 14:26:08.986568: step: 80/466, loss: 0.09284312278032303 2023-01-22 14:26:09.710048: step: 82/466, loss: 0.05913330242037773 2023-01-22 14:26:10.532762: step: 84/466, loss: 0.05039322376251221 2023-01-22 14:26:11.260986: step: 86/466, loss: 0.8253809809684753 2023-01-22 14:26:12.072111: step: 88/466, loss: 0.06900697201490402 2023-01-22 14:26:12.942647: step: 90/466, loss: 0.01916087418794632 2023-01-22 14:26:13.689752: step: 92/466, loss: 0.006514217238873243 2023-01-22 14:26:14.465666: step: 94/466, loss: 0.015011530369520187 2023-01-22 14:26:15.194526: step: 96/466, loss: 0.017279941588640213 2023-01-22 14:26:16.069117: step: 98/466, loss: 0.07186633348464966 2023-01-22 14:26:16.895339: step: 100/466, loss: 0.024932416155934334 2023-01-22 14:26:17.716138: step: 102/466, loss: 0.05825050175189972 2023-01-22 14:26:18.452210: step: 104/466, loss: 0.05546033754944801 2023-01-22 14:26:19.316356: step: 106/466, loss: 0.13620232045650482 2023-01-22 14:26:20.101028: step: 108/466, loss: 0.039917781949043274 2023-01-22 14:26:20.837102: step: 110/466, loss: 0.020082594826817513 2023-01-22 14:26:21.603017: step: 112/466, loss: 0.09066887199878693 2023-01-22 14:26:22.405879: step: 114/466, loss: 0.0343233123421669 2023-01-22 14:26:23.128946: step: 116/466, loss: 0.017364807426929474 2023-01-22 14:26:23.840609: step: 
118/466, loss: 0.010822913609445095 2023-01-22 14:26:24.628350: step: 120/466, loss: 0.06861522793769836 2023-01-22 14:26:25.491488: step: 122/466, loss: 0.11362996697425842 2023-01-22 14:26:26.308592: step: 124/466, loss: 0.020435810089111328 2023-01-22 14:26:27.027446: step: 126/466, loss: 0.07974167913198471 2023-01-22 14:26:27.871013: step: 128/466, loss: 0.04215514659881592 2023-01-22 14:26:28.636243: step: 130/466, loss: 0.08239579200744629 2023-01-22 14:26:29.360768: step: 132/466, loss: 0.049038395285606384 2023-01-22 14:26:30.150052: step: 134/466, loss: 0.060617074370384216 2023-01-22 14:26:30.909234: step: 136/466, loss: 0.01857878267765045 2023-01-22 14:26:31.730508: step: 138/466, loss: 0.04355144873261452 2023-01-22 14:26:32.469702: step: 140/466, loss: 0.05773269385099411 2023-01-22 14:26:33.176549: step: 142/466, loss: 0.0333169624209404 2023-01-22 14:26:33.981086: step: 144/466, loss: 0.07793013006448746 2023-01-22 14:26:34.660224: step: 146/466, loss: 0.12647201120853424 2023-01-22 14:26:35.421526: step: 148/466, loss: 0.01892600767314434 2023-01-22 14:26:36.206987: step: 150/466, loss: 0.045988768339157104 2023-01-22 14:26:37.046196: step: 152/466, loss: 0.0888764038681984 2023-01-22 14:26:37.824515: step: 154/466, loss: 0.010339142754673958 2023-01-22 14:26:38.653430: step: 156/466, loss: 0.03397432714700699 2023-01-22 14:26:39.386709: step: 158/466, loss: 0.02980886958539486 2023-01-22 14:26:40.166924: step: 160/466, loss: 0.07075031101703644 2023-01-22 14:26:40.940420: step: 162/466, loss: 0.004570677876472473 2023-01-22 14:26:41.711642: step: 164/466, loss: 0.0559922493994236 2023-01-22 14:26:42.408575: step: 166/466, loss: 0.023972397670149803 2023-01-22 14:26:43.131010: step: 168/466, loss: 0.0040644872933626175 2023-01-22 14:26:43.849652: step: 170/466, loss: 0.06624700874090195 2023-01-22 14:26:44.602897: step: 172/466, loss: 0.08573484420776367 2023-01-22 14:26:45.405482: step: 174/466, loss: 0.02354901283979416 2023-01-22 
14:26:46.103274: step: 176/466, loss: 0.0014065414434298873 2023-01-22 14:26:46.851646: step: 178/466, loss: 0.009613445028662682 2023-01-22 14:26:47.655829: step: 180/466, loss: 0.022372951731085777 2023-01-22 14:26:48.388237: step: 182/466, loss: 0.03555876389145851 2023-01-22 14:26:49.050572: step: 184/466, loss: 0.0734868124127388 2023-01-22 14:26:49.813879: step: 186/466, loss: 0.03425343707203865 2023-01-22 14:26:50.532110: step: 188/466, loss: 0.02357129566371441 2023-01-22 14:26:51.222716: step: 190/466, loss: 0.06736937165260315 2023-01-22 14:26:51.944287: step: 192/466, loss: 0.095945343375206 2023-01-22 14:26:52.803085: step: 194/466, loss: 0.10415996611118317 2023-01-22 14:26:53.497688: step: 196/466, loss: 0.402849942445755 2023-01-22 14:26:54.323326: step: 198/466, loss: 0.04138009622693062 2023-01-22 14:26:55.119810: step: 200/466, loss: 0.017961658537387848 2023-01-22 14:26:55.834078: step: 202/466, loss: 0.039096955209970474 2023-01-22 14:26:56.571745: step: 204/466, loss: 0.21321329474449158 2023-01-22 14:26:57.364517: step: 206/466, loss: 0.04853019863367081 2023-01-22 14:26:58.082211: step: 208/466, loss: 0.014309985563158989 2023-01-22 14:26:58.783888: step: 210/466, loss: 0.042209725826978683 2023-01-22 14:26:59.514659: step: 212/466, loss: 0.0016783431638032198 2023-01-22 14:27:00.265114: step: 214/466, loss: 0.006007462274283171 2023-01-22 14:27:01.042358: step: 216/466, loss: 8.656112670898438 2023-01-22 14:27:01.863376: step: 218/466, loss: 0.04152832552790642 2023-01-22 14:27:02.697320: step: 220/466, loss: 0.031420446932315826 2023-01-22 14:27:03.444026: step: 222/466, loss: 0.039263706654310226 2023-01-22 14:27:04.221384: step: 224/466, loss: 0.1400732547044754 2023-01-22 14:27:04.987865: step: 226/466, loss: 0.0901549756526947 2023-01-22 14:27:05.771429: step: 228/466, loss: 0.0026076301001012325 2023-01-22 14:27:06.548956: step: 230/466, loss: 0.16132469475269318 2023-01-22 14:27:07.291937: step: 232/466, loss: 0.002261529676616192 
2023-01-22 14:27:08.041280: step: 234/466, loss: 0.018132373690605164 2023-01-22 14:27:08.801812: step: 236/466, loss: 0.01911015622317791 2023-01-22 14:27:09.596209: step: 238/466, loss: 0.07900919765233994 2023-01-22 14:27:10.347419: step: 240/466, loss: 0.02429911494255066 2023-01-22 14:27:11.163866: step: 242/466, loss: 0.0636911392211914 2023-01-22 14:27:11.859662: step: 244/466, loss: 0.032838374376297 2023-01-22 14:27:12.703548: step: 246/466, loss: 0.09503611922264099 2023-01-22 14:27:13.443990: step: 248/466, loss: 0.10877351462841034 2023-01-22 14:27:14.273407: step: 250/466, loss: 0.021549206227064133 2023-01-22 14:27:14.955630: step: 252/466, loss: 0.7020695805549622 2023-01-22 14:27:15.759133: step: 254/466, loss: 0.014177825301885605 2023-01-22 14:27:16.438241: step: 256/466, loss: 0.05610502511262894 2023-01-22 14:27:17.174179: step: 258/466, loss: 0.02227671630680561 2023-01-22 14:27:18.030641: step: 260/466, loss: 0.013494499027729034 2023-01-22 14:27:18.755321: step: 262/466, loss: 0.014789719134569168 2023-01-22 14:27:19.469292: step: 264/466, loss: 0.006463128607720137 2023-01-22 14:27:20.262451: step: 266/466, loss: 0.06511010974645615 2023-01-22 14:27:21.058455: step: 268/466, loss: 0.10073195397853851 2023-01-22 14:27:21.842042: step: 270/466, loss: 0.0873476192355156 2023-01-22 14:27:22.596424: step: 272/466, loss: 0.014236886985599995 2023-01-22 14:27:23.324921: step: 274/466, loss: 0.0363636277616024 2023-01-22 14:27:24.045418: step: 276/466, loss: 0.026089193299412727 2023-01-22 14:27:24.801545: step: 278/466, loss: 0.02656644769012928 2023-01-22 14:27:25.532153: step: 280/466, loss: 0.0171302892267704 2023-01-22 14:27:26.326912: step: 282/466, loss: 0.06511224061250687 2023-01-22 14:27:27.083414: step: 284/466, loss: 0.013560446910560131 2023-01-22 14:27:27.819256: step: 286/466, loss: 0.011944221332669258 2023-01-22 14:27:28.636734: step: 288/466, loss: 0.01988859660923481 2023-01-22 14:27:29.286827: step: 290/466, loss: 
0.01531740091741085 2023-01-22 14:27:29.995185: step: 292/466, loss: 0.09434042870998383 2023-01-22 14:27:30.789400: step: 294/466, loss: 0.011777781881392002 2023-01-22 14:27:31.474849: step: 296/466, loss: 0.07179064303636551 2023-01-22 14:27:32.246587: step: 298/466, loss: 0.07993976771831512 2023-01-22 14:27:33.049648: step: 300/466, loss: 1.4223395586013794 2023-01-22 14:27:33.927524: step: 302/466, loss: 0.013697385787963867 2023-01-22 14:27:34.709863: step: 304/466, loss: 0.022314228117465973 2023-01-22 14:27:35.495129: step: 306/466, loss: 0.03576982766389847 2023-01-22 14:27:36.252827: step: 308/466, loss: 0.054183077067136765 2023-01-22 14:27:37.085915: step: 310/466, loss: 0.08912408351898193 2023-01-22 14:27:37.877636: step: 312/466, loss: 0.048739008605480194 2023-01-22 14:27:38.595663: step: 314/466, loss: 0.00755567429587245 2023-01-22 14:27:39.317442: step: 316/466, loss: 0.12594488263130188 2023-01-22 14:27:40.091940: step: 318/466, loss: 0.01600790023803711 2023-01-22 14:27:40.836152: step: 320/466, loss: 0.04415088891983032 2023-01-22 14:27:41.547466: step: 322/466, loss: 0.03756783530116081 2023-01-22 14:27:42.288372: step: 324/466, loss: 0.011490639299154282 2023-01-22 14:27:43.075224: step: 326/466, loss: 0.02899995446205139 2023-01-22 14:27:43.935023: step: 328/466, loss: 0.15357692539691925 2023-01-22 14:27:44.696030: step: 330/466, loss: 0.016842082142829895 2023-01-22 14:27:45.480614: step: 332/466, loss: 0.14622445404529572 2023-01-22 14:27:46.271710: step: 334/466, loss: 0.03253905102610588 2023-01-22 14:27:47.027647: step: 336/466, loss: 0.17871153354644775 2023-01-22 14:27:47.794762: step: 338/466, loss: 0.01588170975446701 2023-01-22 14:27:48.576297: step: 340/466, loss: 0.02543899230659008 2023-01-22 14:27:49.328126: step: 342/466, loss: 0.5410417914390564 2023-01-22 14:27:50.154699: step: 344/466, loss: 0.006712695118039846 2023-01-22 14:27:50.861864: step: 346/466, loss: 0.03204884007573128 2023-01-22 14:27:51.648109: step: 
348/466, loss: 0.03173366189002991 2023-01-22 14:27:52.636343: step: 350/466, loss: 0.07975351065397263 2023-01-22 14:27:53.313760: step: 352/466, loss: 0.08007784187793732 2023-01-22 14:27:54.111166: step: 354/466, loss: 0.051084209233522415 2023-01-22 14:27:54.877221: step: 356/466, loss: 0.003182527609169483 2023-01-22 14:27:55.621859: step: 358/466, loss: 0.05374673381447792 2023-01-22 14:27:56.368685: step: 360/466, loss: 0.04119402915239334 2023-01-22 14:27:57.145913: step: 362/466, loss: 0.031509507447481155 2023-01-22 14:27:57.962590: step: 364/466, loss: 0.02087160013616085 2023-01-22 14:27:58.680886: step: 366/466, loss: 0.01806093193590641 2023-01-22 14:27:59.391116: step: 368/466, loss: 0.050688087940216064 2023-01-22 14:28:00.092388: step: 370/466, loss: 0.06753750890493393 2023-01-22 14:28:00.818969: step: 372/466, loss: 0.08971969038248062 2023-01-22 14:28:01.535236: step: 374/466, loss: 0.010876539163291454 2023-01-22 14:28:02.262725: step: 376/466, loss: 0.02585495449602604 2023-01-22 14:28:03.109436: step: 378/466, loss: 0.49682512879371643 2023-01-22 14:28:03.824291: step: 380/466, loss: 0.021598313003778458 2023-01-22 14:28:04.463897: step: 382/466, loss: 0.9013242721557617 2023-01-22 14:28:05.198277: step: 384/466, loss: 0.029887091368436813 2023-01-22 14:28:06.005471: step: 386/466, loss: 0.06689973920583725 2023-01-22 14:28:06.812953: step: 388/466, loss: 0.020853828638792038 2023-01-22 14:28:07.584117: step: 390/466, loss: 0.6124326586723328 2023-01-22 14:28:08.443709: step: 392/466, loss: 0.03974926099181175 2023-01-22 14:28:09.288250: step: 394/466, loss: 0.04044046252965927 2023-01-22 14:28:10.029114: step: 396/466, loss: 0.025991858914494514 2023-01-22 14:28:10.809171: step: 398/466, loss: 0.03772665187716484 2023-01-22 14:28:11.530945: step: 400/466, loss: 0.04145622253417969 2023-01-22 14:28:12.292357: step: 402/466, loss: 0.028987523168325424 2023-01-22 14:28:13.022187: step: 404/466, loss: 0.014267069287598133 2023-01-22 
14:28:13.790082: step: 406/466, loss: 0.09248199313879013 2023-01-22 14:28:14.568450: step: 408/466, loss: 0.018743878230452538 2023-01-22 14:28:15.447723: step: 410/466, loss: 0.10698945820331573 2023-01-22 14:28:16.294150: step: 412/466, loss: 0.014625504612922668 2023-01-22 14:28:17.096707: step: 414/466, loss: 0.036891624331474304 2023-01-22 14:28:17.817117: step: 416/466, loss: 0.0296842772513628 2023-01-22 14:28:18.657684: step: 418/466, loss: 0.06785248965024948 2023-01-22 14:28:19.413844: step: 420/466, loss: 0.020969906821846962 2023-01-22 14:28:20.121754: step: 422/466, loss: 0.0571817010641098 2023-01-22 14:28:20.898035: step: 424/466, loss: 0.2317938655614853 2023-01-22 14:28:21.711616: step: 426/466, loss: 0.035724662244319916 2023-01-22 14:28:22.489753: step: 428/466, loss: 0.03899478167295456 2023-01-22 14:28:23.272695: step: 430/466, loss: 0.019177095964550972 2023-01-22 14:28:23.978961: step: 432/466, loss: 0.09212999790906906 2023-01-22 14:28:24.745649: step: 434/466, loss: 0.08950886875391006 2023-01-22 14:28:25.531704: step: 436/466, loss: 0.047341521829366684 2023-01-22 14:28:26.249183: step: 438/466, loss: 0.06883928179740906 2023-01-22 14:28:26.972180: step: 440/466, loss: 0.03817324712872505 2023-01-22 14:28:27.774842: step: 442/466, loss: 0.2911444902420044 2023-01-22 14:28:28.545530: step: 444/466, loss: 0.04138999804854393 2023-01-22 14:28:29.290658: step: 446/466, loss: 0.024322273209691048 2023-01-22 14:28:30.012568: step: 448/466, loss: 0.05159672722220421 2023-01-22 14:28:30.761523: step: 450/466, loss: 0.08194046467542648 2023-01-22 14:28:31.468403: step: 452/466, loss: 0.038597095757722855 2023-01-22 14:28:32.318575: step: 454/466, loss: 0.06234016641974449 2023-01-22 14:28:33.092179: step: 456/466, loss: 0.20821239054203033 2023-01-22 14:28:33.933595: step: 458/466, loss: 0.2597760558128357 2023-01-22 14:28:34.699528: step: 460/466, loss: 1.241673469543457 2023-01-22 14:28:35.439887: step: 462/466, loss: 0.13873669505119324 
2023-01-22 14:28:36.236260: step: 464/466, loss: 0.07433667033910751 2023-01-22 14:28:36.918467: step: 466/466, loss: 0.052657343447208405 2023-01-22 14:28:37.712777: step: 468/466, loss: 0.09154356271028519 2023-01-22 14:28:38.457411: step: 470/466, loss: 0.10661379992961884 2023-01-22 14:28:39.228402: step: 472/466, loss: 0.17577272653579712 2023-01-22 14:28:39.965513: step: 474/466, loss: 0.04618869721889496 2023-01-22 14:28:40.781447: step: 476/466, loss: 0.05479388311505318 2023-01-22 14:28:41.451810: step: 478/466, loss: 0.05641501024365425 2023-01-22 14:28:42.165367: step: 480/466, loss: 0.01575690694153309 2023-01-22 14:28:42.901921: step: 482/466, loss: 0.16104817390441895 2023-01-22 14:28:43.815848: step: 484/466, loss: 0.010711174458265305 2023-01-22 14:28:44.513757: step: 486/466, loss: 0.046062011271715164 2023-01-22 14:28:45.304549: step: 488/466, loss: 0.06522417068481445 2023-01-22 14:28:46.038756: step: 490/466, loss: 0.020062437281012535 2023-01-22 14:28:46.845878: step: 492/466, loss: 0.037860166281461716 2023-01-22 14:28:47.614986: step: 494/466, loss: 0.09146386384963989 2023-01-22 14:28:48.351860: step: 496/466, loss: 0.01153822336345911 2023-01-22 14:28:49.040743: step: 498/466, loss: 0.16628001630306244 2023-01-22 14:28:49.840127: step: 500/466, loss: 0.08503858000040054 2023-01-22 14:28:50.576950: step: 502/466, loss: 0.025091633200645447 2023-01-22 14:28:51.364023: step: 504/466, loss: 0.3517615497112274 2023-01-22 14:28:52.124780: step: 506/466, loss: 0.01056545227766037 2023-01-22 14:28:52.857197: step: 508/466, loss: 0.023207852616906166 2023-01-22 14:28:53.629094: step: 510/466, loss: 0.47248417139053345 2023-01-22 14:28:54.370051: step: 512/466, loss: 0.058128539472818375 2023-01-22 14:28:55.120433: step: 514/466, loss: 0.09256549179553986 2023-01-22 14:28:55.873858: step: 516/466, loss: 0.06956575065851212 2023-01-22 14:28:56.572657: step: 518/466, loss: 0.01629825495183468 2023-01-22 14:28:57.369353: step: 520/466, loss: 
0.044001027941703796 2023-01-22 14:28:58.059293: step: 522/466, loss: 0.023603804409503937 2023-01-22 14:28:58.799546: step: 524/466, loss: 0.09090606123209 2023-01-22 14:28:59.559117: step: 526/466, loss: 0.030994001775979996 2023-01-22 14:29:00.379475: step: 528/466, loss: 0.08065718412399292 2023-01-22 14:29:01.027342: step: 530/466, loss: 0.07034345716238022 2023-01-22 14:29:01.762788: step: 532/466, loss: 0.017793208360671997 2023-01-22 14:29:02.628280: step: 534/466, loss: 0.07686349004507065 2023-01-22 14:29:03.430737: step: 536/466, loss: 0.039163898676633835 2023-01-22 14:29:04.264059: step: 538/466, loss: 0.14436209201812744 2023-01-22 14:29:04.997132: step: 540/466, loss: 0.0816352590918541 2023-01-22 14:29:05.799404: step: 542/466, loss: 0.09122592955827713 2023-01-22 14:29:06.550743: step: 544/466, loss: 0.04344628378748894 2023-01-22 14:29:07.302414: step: 546/466, loss: 0.024878213182091713 2023-01-22 14:29:08.011800: step: 548/466, loss: 0.07749707251787186 2023-01-22 14:29:08.765739: step: 550/466, loss: 0.09194403886795044 2023-01-22 14:29:09.588169: step: 552/466, loss: 0.048549845814704895 2023-01-22 14:29:10.436399: step: 554/466, loss: 0.09204865992069244 2023-01-22 14:29:11.170608: step: 556/466, loss: 0.0338570736348629 2023-01-22 14:29:11.859612: step: 558/466, loss: 0.09505950659513474 2023-01-22 14:29:12.634740: step: 560/466, loss: 0.11021043360233307 2023-01-22 14:29:13.316486: step: 562/466, loss: 0.0074821156449615955 2023-01-22 14:29:14.108896: step: 564/466, loss: 0.0620780810713768 2023-01-22 14:29:14.871917: step: 566/466, loss: 0.04471885412931442 2023-01-22 14:29:15.671848: step: 568/466, loss: 0.062284309417009354 2023-01-22 14:29:16.459441: step: 570/466, loss: 0.014837083406746387 2023-01-22 14:29:17.185132: step: 572/466, loss: 0.10218457132577896 2023-01-22 14:29:17.919775: step: 574/466, loss: 0.013688577339053154 2023-01-22 14:29:18.575524: step: 576/466, loss: 0.05932674929499626 2023-01-22 14:29:19.374758: step: 
578/466, loss: 0.015389678999781609 2023-01-22 14:29:20.073289: step: 580/466, loss: 0.20691397786140442 2023-01-22 14:29:20.773647: step: 582/466, loss: 0.008128570392727852 2023-01-22 14:29:21.490031: step: 584/466, loss: 0.13989268243312836 2023-01-22 14:29:22.210035: step: 586/466, loss: 0.06592102348804474 2023-01-22 14:29:22.859311: step: 588/466, loss: 0.020636077970266342 2023-01-22 14:29:23.556803: step: 590/466, loss: 0.045246824622154236 2023-01-22 14:29:24.273889: step: 592/466, loss: 0.018867220729589462 2023-01-22 14:29:24.989464: step: 594/466, loss: 0.022000886499881744 2023-01-22 14:29:25.693262: step: 596/466, loss: 0.03726113215088844 2023-01-22 14:29:26.464447: step: 598/466, loss: 0.008649222552776337 2023-01-22 14:29:27.240441: step: 600/466, loss: 0.027857929468154907 2023-01-22 14:29:28.023645: step: 602/466, loss: 0.061211489140987396 2023-01-22 14:29:28.850214: step: 604/466, loss: 0.01916958950459957 2023-01-22 14:29:29.568169: step: 606/466, loss: 0.060166411101818085 2023-01-22 14:29:30.339259: step: 608/466, loss: 0.0379047766327858 2023-01-22 14:29:31.169289: step: 610/466, loss: 0.03668516129255295 2023-01-22 14:29:32.004193: step: 612/466, loss: 0.020177414640784264 2023-01-22 14:29:32.722928: step: 614/466, loss: 0.03892548382282257 2023-01-22 14:29:33.447995: step: 616/466, loss: 0.058562684804201126 2023-01-22 14:29:34.148798: step: 618/466, loss: 0.023050406947731972 2023-01-22 14:29:34.946470: step: 620/466, loss: 0.08460335433483124 2023-01-22 14:29:35.686455: step: 622/466, loss: 0.011020708829164505 2023-01-22 14:29:36.535245: step: 624/466, loss: 0.017338331788778305 2023-01-22 14:29:37.297659: step: 626/466, loss: 0.034749772399663925 2023-01-22 14:29:38.009640: step: 628/466, loss: 0.011427835561335087 2023-01-22 14:29:38.739761: step: 630/466, loss: 0.016694651916623116 2023-01-22 14:29:39.512688: step: 632/466, loss: 0.004482876975089312 2023-01-22 14:29:40.368835: step: 634/466, loss: 0.08318014442920685 2023-01-22 
14:29:41.053235: step: 636/466, loss: 0.05684254318475723 2023-01-22 14:29:41.889512: step: 638/466, loss: 0.09137509018182755 2023-01-22 14:29:42.618834: step: 640/466, loss: 0.09939432144165039 2023-01-22 14:29:43.417792: step: 642/466, loss: 0.02181725762784481 2023-01-22 14:29:44.189132: step: 644/466, loss: 0.015538212843239307 2023-01-22 14:29:44.962423: step: 646/466, loss: 0.026405729353427887 2023-01-22 14:29:45.668202: step: 648/466, loss: 0.15074358880519867 2023-01-22 14:29:46.433043: step: 650/466, loss: 0.03875117376446724 2023-01-22 14:29:47.145745: step: 652/466, loss: 0.10887783020734787 2023-01-22 14:29:47.871796: step: 654/466, loss: 0.0982695072889328 2023-01-22 14:29:48.673658: step: 656/466, loss: 0.137633815407753 2023-01-22 14:29:49.459172: step: 658/466, loss: 0.03798063471913338 2023-01-22 14:29:50.209473: step: 660/466, loss: 0.046758100390434265 2023-01-22 14:29:50.917058: step: 662/466, loss: 0.030941788107156754 2023-01-22 14:29:51.719465: step: 664/466, loss: 0.010282697156071663 2023-01-22 14:29:52.516611: step: 666/466, loss: 0.09982012957334518 2023-01-22 14:29:53.477115: step: 668/466, loss: 0.03435714170336723 2023-01-22 14:29:54.199189: step: 670/466, loss: 0.018653811886906624 2023-01-22 14:29:55.078358: step: 672/466, loss: 0.6183578372001648 2023-01-22 14:29:55.857361: step: 674/466, loss: 0.028816204518079758 2023-01-22 14:29:56.603279: step: 676/466, loss: 0.0988229289650917 2023-01-22 14:29:57.317530: step: 678/466, loss: 0.06322664022445679 2023-01-22 14:29:58.076197: step: 680/466, loss: 0.044840168207883835 2023-01-22 14:29:58.856644: step: 682/466, loss: 0.1099725216627121 2023-01-22 14:29:59.652211: step: 684/466, loss: 0.05679919198155403 2023-01-22 14:30:00.475636: step: 686/466, loss: 0.02459697611629963 2023-01-22 14:30:01.235627: step: 688/466, loss: 0.05390128120779991 2023-01-22 14:30:01.994701: step: 690/466, loss: 0.2060244232416153 2023-01-22 14:30:02.701197: step: 692/466, loss: 0.07413557171821594 
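Every entry in this log follows the same `timestamp: step: i/466, loss: x` pattern, and each epoch ends with a `Loss:` summary that is the mean of the per-step losses. As a hedged sketch (this parser is illustrative and not part of `train.py`), the mean can be recovered from the raw text like so:

```python
# Illustrative sketch: parse "step: i/466, loss: x" entries out of this
# log's flattened text and average the losses, mirroring the per-epoch
# "Loss: ..." summary. Not part of train.py; the regex is an assumption
# based on the observed log format.
import re

LINE_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.]+)")

def epoch_mean_loss(log_text: str) -> float:
    """Average all per-step losses found in one epoch's worth of log text."""
    losses = [float(m.group(2)) for m in LINE_RE.finditer(log_text)]
    return sum(losses) / len(losses)

# Two entries copied verbatim from this log:
sample = (
    "2023-01-22 14:28:36.236260: step: 464/466, loss: 0.07433667033910751 "
    "2023-01-22 14:28:36.918467: step: 466/466, loss: 0.052657343447208405"
)
print(epoch_mean_loss(sample))  # mean of the two logged losses
```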
2023-01-22 14:30:03.464364: step: 694/466, loss: 0.02973468042910099 2023-01-22 14:30:04.257620: step: 696/466, loss: 0.08305200934410095 2023-01-22 14:30:05.016752: step: 698/466, loss: 0.04088933765888214 2023-01-22 14:30:05.781377: step: 700/466, loss: 0.07874433696269989 2023-01-22 14:30:06.545303: step: 702/466, loss: 0.09399781376123428 2023-01-22 14:30:07.361666: step: 704/466, loss: 0.023996638134121895 2023-01-22 14:30:08.106372: step: 706/466, loss: 0.05157013610005379 2023-01-22 14:30:08.758347: step: 708/466, loss: 0.047712381929159164 2023-01-22 14:30:09.526724: step: 710/466, loss: 0.047248851507902145 2023-01-22 14:30:10.317587: step: 712/466, loss: 0.033434826880693436 2023-01-22 14:30:11.143917: step: 714/466, loss: 0.03782849386334419 2023-01-22 14:30:11.767532: step: 716/466, loss: 0.016999607905745506 2023-01-22 14:30:12.500714: step: 718/466, loss: 0.01766519993543625 2023-01-22 14:30:13.302829: step: 720/466, loss: 0.018042655661702156 2023-01-22 14:30:14.152915: step: 722/466, loss: 0.024905243888497353 2023-01-22 14:30:14.919895: step: 724/466, loss: 0.4071979224681854 2023-01-22 14:30:15.691541: step: 726/466, loss: 0.005354071501642466 2023-01-22 14:30:16.439935: step: 728/466, loss: 0.06616143137216568 2023-01-22 14:30:17.214808: step: 730/466, loss: 0.03915643319487572 2023-01-22 14:30:18.072484: step: 732/466, loss: 0.010733548551797867 2023-01-22 14:30:18.879397: step: 734/466, loss: 0.02929450199007988 2023-01-22 14:30:19.620299: step: 736/466, loss: 0.03881808742880821 2023-01-22 14:30:20.481896: step: 738/466, loss: 0.03280079364776611 2023-01-22 14:30:21.218128: step: 740/466, loss: 2.8601508140563965 2023-01-22 14:30:21.962527: step: 742/466, loss: 0.10068678855895996 2023-01-22 14:30:22.759998: step: 744/466, loss: 0.07240951806306839 2023-01-22 14:30:23.534455: step: 746/466, loss: 0.03611631318926811 2023-01-22 14:30:24.270844: step: 748/466, loss: 0.008190032094717026 2023-01-22 14:30:25.035874: step: 750/466, loss: 
0.040565237402915955 2023-01-22 14:30:25.770061: step: 752/466, loss: 0.029234912246465683 2023-01-22 14:30:26.566767: step: 754/466, loss: 0.1493801325559616 2023-01-22 14:30:27.333810: step: 756/466, loss: 0.26373106241226196 2023-01-22 14:30:28.063473: step: 758/466, loss: 0.007217098958790302 2023-01-22 14:30:28.804450: step: 760/466, loss: 0.028150340542197227 2023-01-22 14:30:29.617050: step: 762/466, loss: 0.04375817999243736 2023-01-22 14:30:30.361736: step: 764/466, loss: 0.04241231828927994 2023-01-22 14:30:31.073042: step: 766/466, loss: 0.06087270751595497 2023-01-22 14:30:31.901476: step: 768/466, loss: 0.06415510922670364 2023-01-22 14:30:32.679577: step: 770/466, loss: 0.06608094274997711 2023-01-22 14:30:33.446596: step: 772/466, loss: 0.09900546818971634 2023-01-22 14:30:34.185915: step: 774/466, loss: 0.013609092682600021 2023-01-22 14:30:34.958979: step: 776/466, loss: 0.1549600213766098 2023-01-22 14:30:35.722276: step: 778/466, loss: 0.0728570744395256 2023-01-22 14:30:36.481565: step: 780/466, loss: 0.007868712767958641 2023-01-22 14:30:37.252931: step: 782/466, loss: 0.012212309055030346 2023-01-22 14:30:38.031768: step: 784/466, loss: 0.043318044394254684 2023-01-22 14:30:38.778364: step: 786/466, loss: 0.11056303232908249 2023-01-22 14:30:39.505410: step: 788/466, loss: 0.631340742111206 2023-01-22 14:30:40.221457: step: 790/466, loss: 0.01474701426923275 2023-01-22 14:30:41.109252: step: 792/466, loss: 0.10303998738527298 2023-01-22 14:30:41.780000: step: 794/466, loss: 0.02286025695502758 2023-01-22 14:30:42.504782: step: 796/466, loss: 0.025005998089909554 2023-01-22 14:30:43.290210: step: 798/466, loss: 0.03820018097758293 2023-01-22 14:30:44.134158: step: 800/466, loss: 0.2768549919128418 2023-01-22 14:30:44.883973: step: 802/466, loss: 0.07749795913696289 2023-01-22 14:30:45.623588: step: 804/466, loss: 0.05952690541744232 2023-01-22 14:30:46.320926: step: 806/466, loss: 0.08135931938886642 2023-01-22 14:30:47.109941: step: 808/466, 
loss: 0.08084471523761749 2023-01-22 14:30:47.844240: step: 810/466, loss: 0.06380957365036011 2023-01-22 14:30:48.650130: step: 812/466, loss: 0.1317671686410904 2023-01-22 14:30:49.469622: step: 814/466, loss: 0.08646216243505478 2023-01-22 14:30:50.220037: step: 816/466, loss: 0.07240697741508484 2023-01-22 14:30:50.928831: step: 818/466, loss: 0.07382383197546005 2023-01-22 14:30:51.745918: step: 820/466, loss: 0.0447627417743206 2023-01-22 14:30:52.470423: step: 822/466, loss: 0.01566709578037262 2023-01-22 14:30:53.287520: step: 824/466, loss: 0.04784432798624039 2023-01-22 14:30:54.132015: step: 826/466, loss: 0.051131147891283035 2023-01-22 14:30:54.922441: step: 828/466, loss: 0.06012752279639244 2023-01-22 14:30:55.611952: step: 830/466, loss: 0.014375496655702591 2023-01-22 14:30:56.353478: step: 832/466, loss: 0.0393557995557785 2023-01-22 14:30:57.110997: step: 834/466, loss: 0.0965733602643013 2023-01-22 14:30:57.867702: step: 836/466, loss: 0.11071842908859253 2023-01-22 14:30:58.637112: step: 838/466, loss: 0.03374619781970978 2023-01-22 14:30:59.356840: step: 840/466, loss: 0.08582701534032822 2023-01-22 14:31:00.070167: step: 842/466, loss: 0.08291257917881012 2023-01-22 14:31:00.835195: step: 844/466, loss: 0.25133976340293884 2023-01-22 14:31:01.665696: step: 846/466, loss: 0.09083317965269089 2023-01-22 14:31:02.462559: step: 848/466, loss: 0.04542336240410805 2023-01-22 14:31:03.304440: step: 850/466, loss: 0.2875499725341797 2023-01-22 14:31:04.027865: step: 852/466, loss: 0.10486269742250443 2023-01-22 14:31:04.829910: step: 854/466, loss: 0.03992059826850891 2023-01-22 14:31:05.583653: step: 856/466, loss: 0.018067266792058945 2023-01-22 14:31:06.298887: step: 858/466, loss: 0.0018403325229883194 2023-01-22 14:31:07.042404: step: 860/466, loss: 0.0541539341211319 2023-01-22 14:31:07.823928: step: 862/466, loss: 0.019118212163448334 2023-01-22 14:31:08.601594: step: 864/466, loss: 0.05498569458723068 2023-01-22 14:31:09.267669: step: 
866/466, loss: 0.02778913825750351 2023-01-22 14:31:10.135755: step: 868/466, loss: 0.06923171132802963 2023-01-22 14:31:10.960050: step: 870/466, loss: 0.039038099348545074 2023-01-22 14:31:11.696756: step: 872/466, loss: 0.039004988968372345 2023-01-22 14:31:12.479960: step: 874/466, loss: 0.09032604098320007 2023-01-22 14:31:13.268189: step: 876/466, loss: 0.07266921550035477 2023-01-22 14:31:14.035664: step: 878/466, loss: 0.11244278401136398 2023-01-22 14:31:14.716318: step: 880/466, loss: 0.035093631595373154 2023-01-22 14:31:15.463194: step: 882/466, loss: 0.03499899059534073 2023-01-22 14:31:16.173519: step: 884/466, loss: 0.054025325924158096 2023-01-22 14:31:17.075258: step: 886/466, loss: 0.045914176851511 2023-01-22 14:31:17.862981: step: 888/466, loss: 0.013009753078222275 2023-01-22 14:31:18.666353: step: 890/466, loss: 0.050087813287973404 2023-01-22 14:31:19.447541: step: 892/466, loss: 0.05891994759440422 2023-01-22 14:31:20.222507: step: 894/466, loss: 0.04082118719816208 2023-01-22 14:31:20.988991: step: 896/466, loss: 0.03301764652132988 2023-01-22 14:31:21.728716: step: 898/466, loss: 0.06859487295150757 2023-01-22 14:31:22.491238: step: 900/466, loss: 0.04476600140333176 2023-01-22 14:31:23.247398: step: 902/466, loss: 0.06929872930049896 2023-01-22 14:31:24.000606: step: 904/466, loss: 0.46976327896118164 2023-01-22 14:31:24.758277: step: 906/466, loss: 0.04072031006217003 2023-01-22 14:31:25.536119: step: 908/466, loss: 0.010995362885296345 2023-01-22 14:31:26.285898: step: 910/466, loss: 0.05154285952448845 2023-01-22 14:31:27.179624: step: 912/466, loss: 0.03599536046385765 2023-01-22 14:31:27.933277: step: 914/466, loss: 0.0024055049289017916 2023-01-22 14:31:28.683953: step: 916/466, loss: 0.009881271980702877 2023-01-22 14:31:29.487737: step: 918/466, loss: 0.08806942403316498 2023-01-22 14:31:30.237917: step: 920/466, loss: 0.038152310997247696 2023-01-22 14:31:30.972574: step: 922/466, loss: 0.041223522275686264 2023-01-22 
14:31:31.733236: step: 924/466, loss: 0.18694542348384857 2023-01-22 14:31:32.517548: step: 926/466, loss: 0.06713179498910904 2023-01-22 14:31:33.330798: step: 928/466, loss: 0.18050426244735718 2023-01-22 14:31:34.147646: step: 930/466, loss: 0.05860520899295807 2023-01-22 14:31:34.844690: step: 932/466, loss: 0.03831728547811508
==================================================
Loss: 0.103
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30074375583566765, 'r': 0.3081624822604564, 'f1': 0.30440792530695504}, 'combined': 0.22430057654196686, 'epoch': 21}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.355600780324783, 'r': 0.30053802973816, 'f1': 0.3257589895708513}, 'combined': 0.20022259846793788, 'epoch': 21}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27681382453147646, 'r': 0.315683317919198, 'f1': 0.29497359670818685}, 'combined': 0.2173489659955061, 'epoch': 21}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34021325462628416, 'r': 0.3048969998207875, 'f1': 0.321588441416816}, 'combined': 0.1976592371635064, 'epoch': 21}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3164016894925909, 'r': 0.31460054135506194, 'f1': 0.31549854480326855}, 'combined': 0.23247261196030314, 'epoch': 21}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.36083100941622814, 'r': 0.29876932326771527, 'f1': 0.3268804794522426}, 'combined': 0.2018967667205028, 'epoch': 21}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2819767441860465, 'r': 0.3464285714285714, 'f1': 0.31089743589743596}, 'combined': 0.2072649572649573, 'epoch': 21}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2554347826086957, 'r': 0.5108695652173914, 'f1': 0.3405797101449276}, 'combined': 0.1702898550724638, 'epoch': 21}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 21}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3027851084501864, 'r': 0.33151234834109594, 'f1': 0.3164982021299956}, 'combined': 0.23320920156947042, 'epoch': 16}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34794201805038116, 'r': 0.299398980870042, 'f1': 0.32185041818726456}, 'combined': 0.19782025703217238, 'epoch': 16}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.29347826086956524, 'r': 0.38571428571428573, 'f1': 0.33333333333333337}, 'combined': 0.22222222222222224, 'epoch': 16}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017706576728499, 'r': 0.3395635673624288, 'f1': 0.3195535714285714}, 'combined': 0.23546052631578943, 'epoch': 19}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3321627543028552, 'r': 0.3129111067783818, 'f1': 0.32224965651297044}, 'combined': 0.19903655255212885, 'epoch': 19}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 19}
******************************
Epoch: 22
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:34:19.358284: step: 2/466, loss: 0.01801205240190029 2023-01-22 14:34:20.105013: step: 4/466, loss: 0.009650541469454765 2023-01-22 14:34:20.867468: step: 6/466, loss: 0.07053247094154358 2023-01-22 14:34:21.581889: step: 8/466, loss: 0.001292285742238164 2023-01-22 14:34:22.338832: step: 10/466, loss: 0.03261404484510422 2023-01-22 14:34:23.079931: step: 12/466, loss: 0.05088101327419281 2023-01-22 14:34:23.750099: step: 14/466, loss: 0.004686241038143635 2023-01-22 14:34:24.456502: step: 16/466, loss: 0.057643257081508636 2023-01-22 14:34:25.208107: step: 18/466, loss: 0.025192510336637497 2023-01-22 14:34:25.956927: step: 20/466, loss: 0.03585965558886528 2023-01-22 14:34:26.723233: step: 22/466, loss: 0.00812239944934845 2023-01-22 14:34:27.481395: step: 24/466, loss: 0.1684376448392868 2023-01-22 14:34:28.204305: step: 26/466, loss: 0.012567630037665367 2023-01-22 14:34:28.992855: step: 28/466, loss: 0.008991257287561893 2023-01-22 14:34:29.765713: step: 30/466, loss: 0.010555324144661427 2023-01-22 14:34:30.577998: step: 32/466, loss: 0.04046061635017395 2023-01-22 14:34:31.327506: step: 34/466, loss: 0.02395005337893963 2023-01-22 14:34:32.084468: step: 36/466, loss:
0.45707467198371887 2023-01-22 14:34:32.794749: step: 38/466, loss: 0.042842619121074677 2023-01-22 14:34:33.646427: step: 40/466, loss: 0.04697068780660629 2023-01-22 14:34:34.415595: step: 42/466, loss: 0.02398432418704033 2023-01-22 14:34:35.132619: step: 44/466, loss: 0.035828251391649246 2023-01-22 14:34:35.902660: step: 46/466, loss: 0.015519676730036736 2023-01-22 14:34:36.765351: step: 48/466, loss: 0.02766352891921997 2023-01-22 14:34:37.494993: step: 50/466, loss: 0.04843959957361221 2023-01-22 14:34:38.225773: step: 52/466, loss: 0.04603329673409462 2023-01-22 14:34:39.097029: step: 54/466, loss: 0.05231574550271034 2023-01-22 14:34:39.900047: step: 56/466, loss: 0.0449846126139164 2023-01-22 14:34:40.699922: step: 58/466, loss: 0.05117674544453621 2023-01-22 14:34:41.492911: step: 60/466, loss: 0.025951247662305832 2023-01-22 14:34:42.291579: step: 62/466, loss: 0.2809740900993347 2023-01-22 14:34:42.985746: step: 64/466, loss: 0.001973114674910903 2023-01-22 14:34:43.754055: step: 66/466, loss: 0.028032252565026283 2023-01-22 14:34:44.543643: step: 68/466, loss: 0.24467821419239044 2023-01-22 14:34:45.394633: step: 70/466, loss: 0.033111125230789185 2023-01-22 14:34:46.115802: step: 72/466, loss: 0.028844568878412247 2023-01-22 14:34:46.836106: step: 74/466, loss: 0.02851303108036518 2023-01-22 14:34:47.625344: step: 76/466, loss: 0.08196083456277847 2023-01-22 14:34:48.334359: step: 78/466, loss: 0.03638365492224693 2023-01-22 14:34:49.104072: step: 80/466, loss: 0.01183218415826559 2023-01-22 14:34:49.866722: step: 82/466, loss: 0.014731865376234055 2023-01-22 14:34:50.590760: step: 84/466, loss: 0.037835411727428436 2023-01-22 14:34:51.363522: step: 86/466, loss: 0.01963892951607704 2023-01-22 14:34:52.058985: step: 88/466, loss: 0.12295688688755035 2023-01-22 14:34:52.816439: step: 90/466, loss: 0.07707997411489487 2023-01-22 14:34:53.504063: step: 92/466, loss: 0.06897678971290588 2023-01-22 14:34:54.406840: step: 94/466, loss: 
0.049370914697647095 2023-01-22 14:34:55.141492: step: 96/466, loss: 0.06621725857257843 2023-01-22 14:34:55.891932: step: 98/466, loss: 0.0824403166770935 2023-01-22 14:34:56.692441: step: 100/466, loss: 0.0027513070963323116 2023-01-22 14:34:57.389282: step: 102/466, loss: 0.06183888390660286 2023-01-22 14:34:58.266314: step: 104/466, loss: 0.10273338854312897 2023-01-22 14:34:58.974222: step: 106/466, loss: 0.005924407858401537 2023-01-22 14:34:59.730995: step: 108/466, loss: 0.029296522960066795 2023-01-22 14:35:00.509532: step: 110/466, loss: 0.0434553362429142 2023-01-22 14:35:01.323793: step: 112/466, loss: 0.01007161196321249 2023-01-22 14:35:02.069458: step: 114/466, loss: 0.03057682141661644 2023-01-22 14:35:02.855904: step: 116/466, loss: 0.12594719231128693 2023-01-22 14:35:03.539724: step: 118/466, loss: 0.027035461738705635 2023-01-22 14:35:04.329433: step: 120/466, loss: 0.03666792809963226 2023-01-22 14:35:05.122023: step: 122/466, loss: 0.04613330587744713 2023-01-22 14:35:05.843366: step: 124/466, loss: 0.06358573585748672 2023-01-22 14:35:06.543233: step: 126/466, loss: 0.027978744357824326 2023-01-22 14:35:07.330572: step: 128/466, loss: 0.0762336328625679 2023-01-22 14:35:08.070433: step: 130/466, loss: 0.0326845683157444 2023-01-22 14:35:08.799391: step: 132/466, loss: 0.011000591330230236 2023-01-22 14:35:09.568789: step: 134/466, loss: 0.963161826133728 2023-01-22 14:35:10.246608: step: 136/466, loss: 0.15354757010936737 2023-01-22 14:35:10.955830: step: 138/466, loss: 0.1283196657896042 2023-01-22 14:35:11.672622: step: 140/466, loss: 0.008298342116177082 2023-01-22 14:35:12.355250: step: 142/466, loss: 0.03561526536941528 2023-01-22 14:35:13.060554: step: 144/466, loss: 0.0693802759051323 2023-01-22 14:35:13.885161: step: 146/466, loss: 0.05710234120488167 2023-01-22 14:35:14.642822: step: 148/466, loss: 0.0016815579729154706 2023-01-22 14:35:15.457991: step: 150/466, loss: 0.033275868743658066 2023-01-22 14:35:16.246516: step: 152/466, 
loss: 0.0874953642487526 2023-01-22 14:35:17.036684: step: 154/466, loss: 0.03488751873373985 2023-01-22 14:35:17.788690: step: 156/466, loss: 0.19325657188892365 2023-01-22 14:35:18.562232: step: 158/466, loss: 2.2976832389831543 2023-01-22 14:35:19.378039: step: 160/466, loss: 0.037470534443855286 2023-01-22 14:35:20.147054: step: 162/466, loss: 0.03140714764595032 2023-01-22 14:35:20.962627: step: 164/466, loss: 0.0779411792755127 2023-01-22 14:35:21.700003: step: 166/466, loss: 0.052832603454589844 2023-01-22 14:35:22.504695: step: 168/466, loss: 0.028959710150957108 2023-01-22 14:35:23.159927: step: 170/466, loss: 0.05092615261673927 2023-01-22 14:35:23.976844: step: 172/466, loss: 0.039884038269519806 2023-01-22 14:35:24.667783: step: 174/466, loss: 0.023970339447259903 2023-01-22 14:35:25.709943: step: 176/466, loss: 0.015574868768453598 2023-01-22 14:35:26.443448: step: 178/466, loss: 0.014533845707774162 2023-01-22 14:35:27.217663: step: 180/466, loss: 0.03421838954091072 2023-01-22 14:35:27.976871: step: 182/466, loss: 0.0036980807781219482 2023-01-22 14:35:28.800188: step: 184/466, loss: 0.10932143777608871 2023-01-22 14:35:29.517341: step: 186/466, loss: 0.09876124560832977 2023-01-22 14:35:30.276999: step: 188/466, loss: 0.03024807572364807 2023-01-22 14:35:31.073900: step: 190/466, loss: 0.04149920493364334 2023-01-22 14:35:31.815951: step: 192/466, loss: 0.025310944765806198 2023-01-22 14:35:32.582785: step: 194/466, loss: 0.02424207702279091 2023-01-22 14:35:33.429683: step: 196/466, loss: 0.04799162968993187 2023-01-22 14:35:34.162781: step: 198/466, loss: 0.017459945753216743 2023-01-22 14:35:34.919973: step: 200/466, loss: 0.010713450610637665 2023-01-22 14:35:35.711247: step: 202/466, loss: 0.06455336511135101 2023-01-22 14:35:36.454639: step: 204/466, loss: 0.0043992772698402405 2023-01-22 14:35:37.194071: step: 206/466, loss: 0.02925746887922287 2023-01-22 14:35:37.988690: step: 208/466, loss: 0.08175531029701233 2023-01-22 14:35:38.779026: 
step: 210/466, loss: 0.07611624896526337 2023-01-22 14:35:39.592110: step: 212/466, loss: 0.08095847815275192 2023-01-22 14:35:40.255954: step: 214/466, loss: 0.09799374639987946 2023-01-22 14:35:40.975491: step: 216/466, loss: 0.3005771338939667 2023-01-22 14:35:41.665189: step: 218/466, loss: 0.009296614676713943 2023-01-22 14:35:42.469709: step: 220/466, loss: 0.022954193875193596 2023-01-22 14:35:43.215747: step: 222/466, loss: 0.05615520477294922 2023-01-22 14:35:43.992432: step: 224/466, loss: 0.07679169625043869 2023-01-22 14:35:44.747985: step: 226/466, loss: 0.021268269047141075 2023-01-22 14:35:45.539682: step: 228/466, loss: 0.017214687541127205 2023-01-22 14:35:46.281585: step: 230/466, loss: 0.01152557972818613 2023-01-22 14:35:47.008610: step: 232/466, loss: 0.06563958525657654 2023-01-22 14:35:47.837212: step: 234/466, loss: 0.005681390408426523 2023-01-22 14:35:48.577331: step: 236/466, loss: 0.013118581846356392 2023-01-22 14:35:49.366505: step: 238/466, loss: 0.04298659414052963 2023-01-22 14:35:50.173198: step: 240/466, loss: 0.05200590193271637 2023-01-22 14:35:50.908146: step: 242/466, loss: 0.002855125116184354 2023-01-22 14:35:51.712525: step: 244/466, loss: 0.014720942825078964 2023-01-22 14:35:52.409748: step: 246/466, loss: 0.07755979895591736 2023-01-22 14:35:53.274976: step: 248/466, loss: 0.846014142036438 2023-01-22 14:35:54.029390: step: 250/466, loss: 0.030162004753947258 2023-01-22 14:35:54.808746: step: 252/466, loss: 0.0932496190071106 2023-01-22 14:35:55.524913: step: 254/466, loss: 0.06960610300302505 2023-01-22 14:35:56.259219: step: 256/466, loss: 0.016617776826024055 2023-01-22 14:35:56.994298: step: 258/466, loss: 0.0021575915161520243 2023-01-22 14:35:57.712283: step: 260/466, loss: 0.04989505559206009 2023-01-22 14:35:58.496355: step: 262/466, loss: 0.0046344357542693615 2023-01-22 14:35:59.227985: step: 264/466, loss: 0.008433621376752853 2023-01-22 14:36:00.008147: step: 266/466, loss: 0.021122919395565987 2023-01-22 
14:36:00.792618: step: 268/466, loss: 0.07885141670703888 2023-01-22 14:36:01.507784: step: 270/466, loss: 0.05262148380279541 2023-01-22 14:36:02.299374: step: 272/466, loss: 0.0057984814047813416 2023-01-22 14:36:03.070943: step: 274/466, loss: 0.0534018948674202 2023-01-22 14:36:03.796839: step: 276/466, loss: 0.04419717565178871 2023-01-22 14:36:04.626662: step: 278/466, loss: 0.016852280125021935 2023-01-22 14:36:05.361131: step: 280/466, loss: 0.0230227243155241 2023-01-22 14:36:06.060064: step: 282/466, loss: 0.008835147134959698 2023-01-22 14:36:06.772127: step: 284/466, loss: 0.070997454226017 2023-01-22 14:36:07.423566: step: 286/466, loss: 0.014292260631918907 2023-01-22 14:36:08.108306: step: 288/466, loss: 0.00638886634260416 2023-01-22 14:36:08.913006: step: 290/466, loss: 0.08430958539247513 2023-01-22 14:36:09.683884: step: 292/466, loss: 0.12293495237827301 2023-01-22 14:36:10.521430: step: 294/466, loss: 0.04928538203239441 2023-01-22 14:36:11.255319: step: 296/466, loss: 0.051817622035741806 2023-01-22 14:36:12.026919: step: 298/466, loss: 0.07045499235391617 2023-01-22 14:36:12.764126: step: 300/466, loss: 0.03849097341299057 2023-01-22 14:36:13.546658: step: 302/466, loss: 0.032904352992773056 2023-01-22 14:36:14.225477: step: 304/466, loss: 0.14624471962451935 2023-01-22 14:36:14.856978: step: 306/466, loss: 0.006894540973007679 2023-01-22 14:36:15.662784: step: 308/466, loss: 0.021506957709789276 2023-01-22 14:36:16.463385: step: 310/466, loss: 0.03344562277197838 2023-01-22 14:36:17.237004: step: 312/466, loss: 0.0015532016986981034 2023-01-22 14:36:17.946284: step: 314/466, loss: 0.011609219945967197 2023-01-22 14:36:18.693642: step: 316/466, loss: 0.08411398530006409 2023-01-22 14:36:19.501110: step: 318/466, loss: 0.024583594873547554 2023-01-22 14:36:20.333720: step: 320/466, loss: 0.011126959696412086 2023-01-22 14:36:21.130145: step: 322/466, loss: 0.050845298916101456 2023-01-22 14:36:21.873697: step: 324/466, loss: 
0.004713057540357113 2023-01-22 14:36:22.722411: step: 326/466, loss: 0.04157442972064018 2023-01-22 14:36:23.442825: step: 328/466, loss: 0.08843721449375153 2023-01-22 14:36:24.184944: step: 330/466, loss: 0.011852320283651352 2023-01-22 14:36:24.926017: step: 332/466, loss: 0.08430391550064087 2023-01-22 14:36:25.706538: step: 334/466, loss: 0.007539176847785711 2023-01-22 14:36:26.546921: step: 336/466, loss: 0.14938679337501526 2023-01-22 14:36:27.341149: step: 338/466, loss: 0.051123447716236115 2023-01-22 14:36:28.110368: step: 340/466, loss: 0.01445749681442976 2023-01-22 14:36:28.915698: step: 342/466, loss: 0.09141960740089417 2023-01-22 14:36:29.630137: step: 344/466, loss: 0.028653070330619812 2023-01-22 14:36:30.456999: step: 346/466, loss: 0.12943173944950104 2023-01-22 14:36:31.331846: step: 348/466, loss: 0.06502082943916321 2023-01-22 14:36:32.096787: step: 350/466, loss: 0.04192391782999039 2023-01-22 14:36:32.933140: step: 352/466, loss: 0.03628654032945633 2023-01-22 14:36:33.653514: step: 354/466, loss: 0.036534518003463745 2023-01-22 14:36:34.377836: step: 356/466, loss: 0.003242996521294117 2023-01-22 14:36:35.138498: step: 358/466, loss: 0.01773170195519924 2023-01-22 14:36:35.946629: step: 360/466, loss: 0.04168696328997612 2023-01-22 14:36:36.749737: step: 362/466, loss: 0.05508454144001007 2023-01-22 14:36:37.427770: step: 364/466, loss: 0.041718918830156326 2023-01-22 14:36:38.192717: step: 366/466, loss: 0.048134926706552505 2023-01-22 14:36:38.896470: step: 368/466, loss: 0.06911245733499527 2023-01-22 14:36:39.623819: step: 370/466, loss: 0.0356752835214138 2023-01-22 14:36:40.433560: step: 372/466, loss: 0.06430232524871826 2023-01-22 14:36:41.163868: step: 374/466, loss: 0.021607208997011185 2023-01-22 14:36:41.904447: step: 376/466, loss: 0.1649162322282791 2023-01-22 14:36:42.691485: step: 378/466, loss: 0.015220481902360916 2023-01-22 14:36:43.510182: step: 380/466, loss: 0.04231276363134384 2023-01-22 14:36:44.331125: step: 
382/466, loss: 0.027485070750117302 2023-01-22 14:36:45.135717: step: 384/466, loss: 0.055276788771152496 2023-01-22 14:36:45.920510: step: 386/466, loss: 0.009534847922623158 2023-01-22 14:36:46.720187: step: 388/466, loss: 0.058055032044649124 2023-01-22 14:36:47.398184: step: 390/466, loss: 0.02162765897810459 2023-01-22 14:36:48.096658: step: 392/466, loss: 0.09184350818395615 2023-01-22 14:36:48.927752: step: 394/466, loss: 0.017593739554286003 2023-01-22 14:36:49.671641: step: 396/466, loss: 0.0031726297456771135 2023-01-22 14:36:50.452064: step: 398/466, loss: 0.017662903293967247 2023-01-22 14:36:51.269466: step: 400/466, loss: 0.05700105056166649 2023-01-22 14:36:52.054849: step: 402/466, loss: 0.00686702411621809 2023-01-22 14:36:52.865197: step: 404/466, loss: 0.02812480553984642 2023-01-22 14:36:53.653809: step: 406/466, loss: 0.03714694455265999 2023-01-22 14:36:54.445090: step: 408/466, loss: 0.004215636756271124 2023-01-22 14:36:55.233227: step: 410/466, loss: 0.01110562402755022 2023-01-22 14:36:56.058199: step: 412/466, loss: 0.07172445207834244 2023-01-22 14:36:56.758204: step: 414/466, loss: 0.024139823392033577 2023-01-22 14:36:57.552426: step: 416/466, loss: 0.05758264660835266 2023-01-22 14:36:58.286397: step: 418/466, loss: 0.020258810371160507 2023-01-22 14:36:59.040498: step: 420/466, loss: 0.060611262917518616 2023-01-22 14:36:59.800501: step: 422/466, loss: 0.03956054151058197 2023-01-22 14:37:00.606365: step: 424/466, loss: 0.02907363325357437 2023-01-22 14:37:01.452552: step: 426/466, loss: 0.05078652501106262 2023-01-22 14:37:02.221096: step: 428/466, loss: 0.01063844095915556 2023-01-22 14:37:02.966324: step: 430/466, loss: 0.004772020969539881 2023-01-22 14:37:03.693231: step: 432/466, loss: 0.019598014652729034 2023-01-22 14:37:04.379792: step: 434/466, loss: 0.0007601032848469913 2023-01-22 14:37:05.148720: step: 436/466, loss: 0.010198225267231464 2023-01-22 14:37:05.880333: step: 438/466, loss: 0.18594855070114136 2023-01-22 
14:37:06.713292: step: 440/466, loss: 0.08964129537343979 2023-01-22 14:37:07.407123: step: 442/466, loss: 0.007389713078737259 2023-01-22 14:37:08.228431: step: 444/466, loss: 0.04020103067159653 2023-01-22 14:37:08.910244: step: 446/466, loss: 0.0469764843583107 2023-01-22 14:37:09.634886: step: 448/466, loss: 0.01508344803005457 2023-01-22 14:37:10.360639: step: 450/466, loss: 0.03385445475578308 2023-01-22 14:37:11.070148: step: 452/466, loss: 0.0008108800393529236 2023-01-22 14:37:11.821066: step: 454/466, loss: 0.06956829875707626 2023-01-22 14:37:12.661990: step: 456/466, loss: 0.017208613455295563 2023-01-22 14:37:13.479880: step: 458/466, loss: 0.06156182661652565 2023-01-22 14:37:14.353666: step: 460/466, loss: 0.02708219736814499 2023-01-22 14:37:15.159309: step: 462/466, loss: 0.026894785463809967 2023-01-22 14:37:15.944699: step: 464/466, loss: 0.3982813358306885 2023-01-22 14:37:16.719137: step: 466/466, loss: 0.07892703264951706 2023-01-22 14:37:17.508223: step: 468/466, loss: 0.02989260107278824 2023-01-22 14:37:18.282209: step: 470/466, loss: 0.01170523650944233 2023-01-22 14:37:18.983027: step: 472/466, loss: 0.03892693296074867 2023-01-22 14:37:19.749885: step: 474/466, loss: 0.05771756172180176 2023-01-22 14:37:20.454532: step: 476/466, loss: 0.17019394040107727 2023-01-22 14:37:21.233651: step: 478/466, loss: 0.07544895261526108 2023-01-22 14:37:21.979880: step: 480/466, loss: 0.1152661144733429 2023-01-22 14:37:22.816817: step: 482/466, loss: 0.021521558985114098 2023-01-22 14:37:23.579675: step: 484/466, loss: 0.06444356590509415 2023-01-22 14:37:24.265676: step: 486/466, loss: 0.0069647375494241714 2023-01-22 14:37:25.082782: step: 488/466, loss: 0.14533472061157227 2023-01-22 14:37:25.855285: step: 490/466, loss: 0.02806529402732849 2023-01-22 14:37:26.595484: step: 492/466, loss: 0.04729820415377617 2023-01-22 14:37:27.342607: step: 494/466, loss: 0.04835475608706474 2023-01-22 14:37:28.158042: step: 496/466, loss: 0.3793639540672302 
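The per-language evaluation dicts in this log report a `template` score, a `slot` score, and a `combined` score. From the logged numbers themselves (not from the `train.py` source), each `f1` is the standard harmonic mean of `p` and `r`, and `combined` matches the product of the template and slot F1 in every entry checked. A hedged sketch of that arithmetic:

```python
# Illustrative sketch of the scoring arithmetic the evaluation dicts in
# this log appear to follow, inferred from the logged values rather than
# taken from the actual train.py evaluation code.

def f1(p: float, r: float) -> float:
    """Standard F1 (harmonic mean of precision and recall); 0.0 if p + r == 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    """The 'combined' field in the log equals template F1 times slot F1."""
    return template_f1 * slot_f1

# Values copied from the epoch-21 "Dev Chinese" block in this log.
tpl_f1 = f1(1.0, 0.5833333333333334)      # ~0.7368421052631579, as logged
slot_f1 = 0.30440792530695504
print(combined_score(tpl_f1, slot_f1))     # ~0.22430057654196686, as logged
```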
2023-01-22 14:37:28.953568: step: 498/466, loss: 0.08105266094207764 2023-01-22 14:37:29.683139: step: 500/466, loss: 0.026257047429680824 2023-01-22 14:37:30.610766: step: 502/466, loss: 0.04633655026555061 2023-01-22 14:37:31.350622: step: 504/466, loss: 0.047211624681949615 2023-01-22 14:37:32.135247: step: 506/466, loss: 0.0601053461432457 2023-01-22 14:37:32.878750: step: 508/466, loss: 0.057245686650276184 2023-01-22 14:37:33.703627: step: 510/466, loss: 0.015812266618013382 2023-01-22 14:37:34.578005: step: 512/466, loss: 0.05898257717490196 2023-01-22 14:37:35.321172: step: 514/466, loss: 0.0006638577906414866 2023-01-22 14:37:35.998234: step: 516/466, loss: 0.001410032738931477 2023-01-22 14:37:36.770874: step: 518/466, loss: 0.0344972088932991 2023-01-22 14:37:37.480431: step: 520/466, loss: 0.004521101713180542 2023-01-22 14:37:38.190277: step: 522/466, loss: 0.2457745522260666 2023-01-22 14:37:38.933201: step: 524/466, loss: 0.3025051951408386 2023-01-22 14:37:39.694649: step: 526/466, loss: 0.02308790571987629 2023-01-22 14:37:40.494433: step: 528/466, loss: 0.03719723969697952 2023-01-22 14:37:41.245779: step: 530/466, loss: 0.1055179238319397 2023-01-22 14:37:41.953440: step: 532/466, loss: 0.03662850335240364 2023-01-22 14:37:42.631041: step: 534/466, loss: 0.7362351417541504 2023-01-22 14:37:43.372540: step: 536/466, loss: 0.16023731231689453 2023-01-22 14:37:44.102682: step: 538/466, loss: 0.07985933125019073 2023-01-22 14:37:44.915098: step: 540/466, loss: 0.006084410939365625 2023-01-22 14:37:45.633609: step: 542/466, loss: 0.056004833430051804 2023-01-22 14:37:46.380115: step: 544/466, loss: 0.0024009014014154673 2023-01-22 14:37:47.139527: step: 546/466, loss: 0.017100302502512932 2023-01-22 14:37:47.903872: step: 548/466, loss: 0.051608506590127945 2023-01-22 14:37:48.663121: step: 550/466, loss: 0.015401377342641354 2023-01-22 14:37:49.399661: step: 552/466, loss: 0.012614244595170021 2023-01-22 14:37:50.057473: step: 554/466, loss: 
0.012515553273260593 2023-01-22 14:37:50.781651: step: 556/466, loss: 0.005670727230608463 2023-01-22 14:37:51.532777: step: 558/466, loss: 0.016312582418322563 2023-01-22 14:37:52.265490: step: 560/466, loss: 0.028784558176994324 2023-01-22 14:37:52.993507: step: 562/466, loss: 0.35075005888938904 2023-01-22 14:37:53.755957: step: 564/466, loss: 0.49358564615249634 2023-01-22 14:37:54.616580: step: 566/466, loss: 0.047686539590358734 2023-01-22 14:37:55.412442: step: 568/466, loss: 0.07998033612966537 2023-01-22 14:37:56.164411: step: 570/466, loss: 0.022427715361118317 2023-01-22 14:37:57.044889: step: 572/466, loss: 0.09979384392499924 2023-01-22 14:37:57.811103: step: 574/466, loss: 0.08363737165927887 2023-01-22 14:37:58.557891: step: 576/466, loss: 0.0687449499964714 2023-01-22 14:37:59.350383: step: 578/466, loss: 0.03900950402021408 2023-01-22 14:38:00.172020: step: 580/466, loss: 0.04187082126736641 2023-01-22 14:38:00.904075: step: 582/466, loss: 0.01600123941898346 2023-01-22 14:38:01.669593: step: 584/466, loss: 0.023255644366145134 2023-01-22 14:38:02.410429: step: 586/466, loss: 0.0417775884270668 2023-01-22 14:38:03.235084: step: 588/466, loss: 0.22014985978603363 2023-01-22 14:38:04.012511: step: 590/466, loss: 0.07747036218643188 2023-01-22 14:38:04.773176: step: 592/466, loss: 0.061081413179636 2023-01-22 14:38:05.509993: step: 594/466, loss: 0.04934444651007652 2023-01-22 14:38:06.270701: step: 596/466, loss: 0.01672072522342205 2023-01-22 14:38:07.036319: step: 598/466, loss: 0.14264918863773346 2023-01-22 14:38:07.811073: step: 600/466, loss: 0.014670551754534245 2023-01-22 14:38:08.570745: step: 602/466, loss: 0.09861317276954651 2023-01-22 14:38:09.365766: step: 604/466, loss: 0.07894841581583023 2023-01-22 14:38:10.125900: step: 606/466, loss: 0.09325024485588074 2023-01-22 14:38:10.923627: step: 608/466, loss: 0.13695885241031647 2023-01-22 14:38:11.702561: step: 610/466, loss: 0.033022522926330566 2023-01-22 14:38:12.520285: step: 612/466, 
loss: 0.01769273914396763 2023-01-22 14:38:13.319819: step: 614/466, loss: 0.05903501808643341 2023-01-22 14:38:14.058663: step: 616/466, loss: 0.03772374242544174 2023-01-22 14:38:14.829806: step: 618/466, loss: 0.13912038505077362 2023-01-22 14:38:15.651866: step: 620/466, loss: 0.0425943098962307 2023-01-22 14:38:16.364075: step: 622/466, loss: 0.0020012203603982925 2023-01-22 14:38:17.146028: step: 624/466, loss: 0.10077203065156937 2023-01-22 14:38:17.940992: step: 626/466, loss: 0.1686987429857254 2023-01-22 14:38:18.625584: step: 628/466, loss: 0.011166021227836609 2023-01-22 14:38:19.428146: step: 630/466, loss: 0.8890079259872437 2023-01-22 14:38:20.166065: step: 632/466, loss: 0.04895270988345146 2023-01-22 14:38:20.945950: step: 634/466, loss: 0.004432227462530136 2023-01-22 14:38:21.711352: step: 636/466, loss: 0.01525797601789236 2023-01-22 14:38:22.517652: step: 638/466, loss: 0.08175188302993774 2023-01-22 14:38:23.263850: step: 640/466, loss: 0.04717142507433891 2023-01-22 14:38:24.093700: step: 642/466, loss: 0.15646132826805115 2023-01-22 14:38:24.859720: step: 644/466, loss: 0.01614035665988922 2023-01-22 14:38:25.685410: step: 646/466, loss: 0.029348144307732582 2023-01-22 14:38:26.477165: step: 648/466, loss: 0.040207263082265854 2023-01-22 14:38:27.259538: step: 650/466, loss: 0.004076416604220867 2023-01-22 14:38:27.992036: step: 652/466, loss: 0.024327151477336884 2023-01-22 14:38:28.949921: step: 654/466, loss: 0.04616496339440346 2023-01-22 14:38:29.723737: step: 656/466, loss: 0.08406134694814682 2023-01-22 14:38:30.506757: step: 658/466, loss: 0.04835755378007889 2023-01-22 14:38:31.215742: step: 660/466, loss: 0.017609048634767532 2023-01-22 14:38:32.003900: step: 662/466, loss: 0.011756598949432373 2023-01-22 14:38:32.723675: step: 664/466, loss: 0.013616996817290783 2023-01-22 14:38:33.431362: step: 666/466, loss: 0.0014534511137753725 2023-01-22 14:38:34.188441: step: 668/466, loss: 0.021534455940127373 2023-01-22 14:38:34.983566: 
step: 670/466, loss: 0.057125575840473175 2023-01-22 14:38:35.822185: step: 672/466, loss: 0.1229112297296524 2023-01-22 14:38:36.603765: step: 674/466, loss: 0.05047660693526268 2023-01-22 14:38:37.326680: step: 676/466, loss: 0.002109276596456766 2023-01-22 14:38:38.158986: step: 678/466, loss: 0.03988087549805641 2023-01-22 14:38:38.976907: step: 680/466, loss: 0.022907430306077003 2023-01-22 14:38:39.743335: step: 682/466, loss: 0.008905788883566856 2023-01-22 14:38:40.456561: step: 684/466, loss: 0.03231525421142578 2023-01-22 14:38:41.260984: step: 686/466, loss: 0.010490043088793755 2023-01-22 14:38:41.990666: step: 688/466, loss: 0.053193338215351105 2023-01-22 14:38:42.761478: step: 690/466, loss: 0.09256144613027573 2023-01-22 14:38:43.507354: step: 692/466, loss: 0.01077374629676342 2023-01-22 14:38:44.275644: step: 694/466, loss: 0.02683631144464016 2023-01-22 14:38:44.971601: step: 696/466, loss: 0.15495428442955017 2023-01-22 14:38:45.657604: step: 698/466, loss: 0.23386859893798828 2023-01-22 14:38:46.462052: step: 700/466, loss: 0.053017597645521164 2023-01-22 14:38:47.257365: step: 702/466, loss: 0.001763337291777134 2023-01-22 14:38:48.152483: step: 704/466, loss: 0.05142718553543091 2023-01-22 14:38:48.939537: step: 706/466, loss: 0.007640931289643049 2023-01-22 14:38:49.781747: step: 708/466, loss: 0.006928376853466034 2023-01-22 14:38:50.545360: step: 710/466, loss: 1.302974820137024 2023-01-22 14:38:51.329923: step: 712/466, loss: 0.14583726227283478 2023-01-22 14:38:52.059353: step: 714/466, loss: 0.08601471781730652 2023-01-22 14:38:52.804611: step: 716/466, loss: 0.04965193197131157 2023-01-22 14:38:53.561906: step: 718/466, loss: 0.03268514946103096 2023-01-22 14:38:54.369585: step: 720/466, loss: 0.044559597969055176 2023-01-22 14:38:55.075366: step: 722/466, loss: 0.036287739872932434 2023-01-22 14:38:55.779915: step: 724/466, loss: 0.17567983269691467 2023-01-22 14:38:56.504701: step: 726/466, loss: 0.08489609509706497 2023-01-22 
14:38:57.379244: step: 728/466, loss: 0.06397829204797745 2023-01-22 14:38:58.166691: step: 730/466, loss: 0.08759113401174545 2023-01-22 14:38:58.941450: step: 732/466, loss: 0.03365446254611015 2023-01-22 14:38:59.716255: step: 734/466, loss: 0.09580977261066437 2023-01-22 14:39:00.428063: step: 736/466, loss: 0.0405992791056633 2023-01-22 14:39:01.133549: step: 738/466, loss: 0.03734927996993065 2023-01-22 14:39:01.846452: step: 740/466, loss: 0.033360805362463 2023-01-22 14:39:02.726302: step: 742/466, loss: 0.007269763853400946 2023-01-22 14:39:03.476531: step: 744/466, loss: 0.01740163378417492 2023-01-22 14:39:04.257072: step: 746/466, loss: 0.19998213648796082 2023-01-22 14:39:05.054460: step: 748/466, loss: 0.03510475531220436 2023-01-22 14:39:05.750073: step: 750/466, loss: 0.03854229673743248 2023-01-22 14:39:06.528081: step: 752/466, loss: 0.0308972354978323 2023-01-22 14:39:07.312865: step: 754/466, loss: 0.10259910672903061 2023-01-22 14:39:08.117421: step: 756/466, loss: 0.05756578966975212 2023-01-22 14:39:08.884179: step: 758/466, loss: 0.1863379031419754 2023-01-22 14:39:09.653462: step: 760/466, loss: 0.048143643885850906 2023-01-22 14:39:10.427399: step: 762/466, loss: 0.0705944150686264 2023-01-22 14:39:11.194701: step: 764/466, loss: 0.0309490617364645 2023-01-22 14:39:11.953835: step: 766/466, loss: 0.029179180040955544 2023-01-22 14:39:12.711561: step: 768/466, loss: 0.10912593454122543 2023-01-22 14:39:13.484970: step: 770/466, loss: 0.07420755177736282 2023-01-22 14:39:14.143802: step: 772/466, loss: 0.00907058548182249 2023-01-22 14:39:14.922961: step: 774/466, loss: 0.042753975838422775 2023-01-22 14:39:15.684826: step: 776/466, loss: 0.12360195070505142 2023-01-22 14:39:16.506672: step: 778/466, loss: 0.05424632504582405 2023-01-22 14:39:17.249650: step: 780/466, loss: 0.2799844443798065 2023-01-22 14:39:18.038184: step: 782/466, loss: 0.02033020369708538 2023-01-22 14:39:18.792307: step: 784/466, loss: 0.09192941337823868 2023-01-22 
14:39:19.521078: step: 786/466, loss: 0.5942068696022034 2023-01-22 14:39:20.262501: step: 788/466, loss: 0.20447811484336853 2023-01-22 14:39:20.955149: step: 790/466, loss: 0.021643927320837975 2023-01-22 14:39:21.668377: step: 792/466, loss: 0.06576592475175858 2023-01-22 14:39:22.534902: step: 794/466, loss: 0.047535400837659836 2023-01-22 14:39:23.320738: step: 796/466, loss: 0.009031837806105614 2023-01-22 14:39:24.056077: step: 798/466, loss: 0.12657393515110016 2023-01-22 14:39:24.815959: step: 800/466, loss: 0.013000398874282837 2023-01-22 14:39:25.712305: step: 802/466, loss: 0.03871172294020653 2023-01-22 14:39:26.427608: step: 804/466, loss: 0.045810580253601074 2023-01-22 14:39:27.256918: step: 806/466, loss: 0.03326374664902687 2023-01-22 14:39:28.134160: step: 808/466, loss: 0.009571857750415802 2023-01-22 14:39:28.989088: step: 810/466, loss: 0.02861849032342434 2023-01-22 14:39:29.716246: step: 812/466, loss: 0.01981373131275177 2023-01-22 14:39:30.455968: step: 814/466, loss: 0.004079751670360565 2023-01-22 14:39:31.123522: step: 816/466, loss: 0.008983705192804337 2023-01-22 14:39:31.923071: step: 818/466, loss: 0.02426239661872387 2023-01-22 14:39:32.725157: step: 820/466, loss: 0.05498094856739044 2023-01-22 14:39:33.546589: step: 822/466, loss: 0.04358714073896408 2023-01-22 14:39:34.302917: step: 824/466, loss: 0.23724323511123657 2023-01-22 14:39:35.045135: step: 826/466, loss: 0.10743313282728195 2023-01-22 14:39:35.791641: step: 828/466, loss: 0.13279399275779724 2023-01-22 14:39:36.598722: step: 830/466, loss: 0.05824432894587517 2023-01-22 14:39:37.311503: step: 832/466, loss: 0.41754406690597534 2023-01-22 14:39:38.148388: step: 834/466, loss: 0.04348764568567276 2023-01-22 14:39:39.090407: step: 836/466, loss: 0.20859596133232117 2023-01-22 14:39:39.757794: step: 838/466, loss: 0.02455979771912098 2023-01-22 14:39:40.535361: step: 840/466, loss: 0.10641927272081375 2023-01-22 14:39:41.322340: step: 842/466, loss: 0.04542750120162964 
2023-01-22 14:39:42.001253: step: 844/466, loss: 0.0511869378387928 2023-01-22 14:39:42.768890: step: 846/466, loss: 0.009238695725798607 2023-01-22 14:39:43.550222: step: 848/466, loss: 0.04486138001084328 2023-01-22 14:39:44.246840: step: 850/466, loss: 0.05206843838095665 2023-01-22 14:39:45.050846: step: 852/466, loss: 0.07903794199228287 2023-01-22 14:39:45.848318: step: 854/466, loss: 0.013895289972424507 2023-01-22 14:39:46.664104: step: 856/466, loss: 0.02503076009452343 2023-01-22 14:39:47.376492: step: 858/466, loss: 0.876710832118988 2023-01-22 14:39:48.138459: step: 860/466, loss: 0.048670414835214615 2023-01-22 14:39:48.976791: step: 862/466, loss: 0.021664870902895927 2023-01-22 14:39:49.702445: step: 864/466, loss: 0.03425924852490425 2023-01-22 14:39:50.493841: step: 866/466, loss: 0.025804908946156502 2023-01-22 14:39:51.201064: step: 868/466, loss: 0.03486303240060806 2023-01-22 14:39:51.903160: step: 870/466, loss: 0.01872149109840393 2023-01-22 14:39:52.616416: step: 872/466, loss: 0.03468276932835579 2023-01-22 14:39:53.369188: step: 874/466, loss: 0.03930312767624855 2023-01-22 14:39:54.052327: step: 876/466, loss: 0.014858567155897617 2023-01-22 14:39:54.798068: step: 878/466, loss: 0.06579507142305374 2023-01-22 14:39:55.566128: step: 880/466, loss: 0.060320112854242325 2023-01-22 14:39:56.269010: step: 882/466, loss: 0.004157866816967726 2023-01-22 14:39:57.016283: step: 884/466, loss: 0.11918888986110687 2023-01-22 14:39:57.722412: step: 886/466, loss: 0.09611214697360992 2023-01-22 14:39:58.464572: step: 888/466, loss: 0.03245295584201813 2023-01-22 14:39:59.189367: step: 890/466, loss: 0.03113154135644436 2023-01-22 14:40:00.055461: step: 892/466, loss: 0.028303897008299828 2023-01-22 14:40:00.754456: step: 894/466, loss: 0.09134092926979065 2023-01-22 14:40:01.617829: step: 896/466, loss: 0.05562019348144531 2023-01-22 14:40:02.390869: step: 898/466, loss: 0.019667336717247963 2023-01-22 14:40:03.211974: step: 900/466, loss: 
0.03709959238767624 2023-01-22 14:40:03.989640: step: 902/466, loss: 0.02994011528789997 2023-01-22 14:40:04.737160: step: 904/466, loss: 0.006237436085939407 2023-01-22 14:40:05.537846: step: 906/466, loss: 0.014765221625566483 2023-01-22 14:40:06.421924: step: 908/466, loss: 0.0391659215092659 2023-01-22 14:40:07.137028: step: 910/466, loss: 0.6814891695976257 2023-01-22 14:40:07.938151: step: 912/466, loss: 0.10867901891469955 2023-01-22 14:40:08.766902: step: 914/466, loss: 0.10372887551784515 2023-01-22 14:40:09.617023: step: 916/466, loss: 0.0682687982916832 2023-01-22 14:40:10.412578: step: 918/466, loss: 0.021598655730485916 2023-01-22 14:40:11.123389: step: 920/466, loss: 0.004696763586252928 2023-01-22 14:40:11.926901: step: 922/466, loss: 0.09874019771814346 2023-01-22 14:40:12.673832: step: 924/466, loss: 0.029687268659472466 2023-01-22 14:40:13.401155: step: 926/466, loss: 0.04235919564962387 2023-01-22 14:40:14.168820: step: 928/466, loss: 0.008453264832496643 2023-01-22 14:40:14.997465: step: 930/466, loss: 0.049025628715753555 2023-01-22 14:40:15.792847: step: 932/466, loss: 0.11063175648450851
==================================================
Loss: 0.075
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28828729700602984, 'r': 0.3511962897113305, 'f1': 0.3166474673701817}, 'combined': 0.23331918648329175, 'epoch': 22}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34180584398363073, 'r': 0.3116203928266348, 'f1': 0.3260159001039522}, 'combined': 0.20038050445413647, 'epoch': 22}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31867456748198797, 'r': 0.342257694866803, 'f1': 0.33004538919452003}, 'combined': 0.24319133940648843, 'epoch': 22}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3536886552576399, 'r': 0.2992279453650275, 'f1': 0.3241869773589238}, 'combined': 0.20023313307462948, 'epoch': 22}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2608695652173913, 'r': 0.5217391304347826, 'f1': 0.3478260869565218}, 'combined': 0.1739130434782609, 'epoch': 22}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 22}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017706576728499, 'r': 0.3395635673624288, 'f1': 0.3195535714285714}, 'combined': 0.23546052631578943, 'epoch': 19}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3321627543028552, 'r': 0.3129111067783818, 'f1': 0.32224965651297044}, 'combined': 0.19903655255212885, 'epoch': 19}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 19}
******************************
Epoch: 23
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:43:10.969707: step: 2/466, loss: 0.03882645443081856 2023-01-22 14:43:11.660232: step: 4/466, loss: 0.013151943683624268 2023-01-22 14:43:12.451640: step: 6/466, loss: 0.009563574567437172 2023-01-22 14:43:13.263127: step: 8/466, loss: 0.008155950345098972 2023-01-22 14:43:14.021925: step: 10/466, loss: 0.020659472793340683 2023-01-22 14:43:14.890566: step: 12/466, loss: 0.29554811120033264 2023-01-22 14:43:15.713189: step: 14/466, loss: 0.023778563365340233 2023-01-22 14:43:16.462838: step: 16/466, loss: 0.0215211883187294 2023-01-22 14:43:17.197902: step: 18/466, loss: 0.023544669151306152 2023-01-22 14:43:17.909048: step: 20/466, loss: 0.01035761646926403 2023-01-22 14:43:18.748034: step: 22/466, loss: 0.010739500634372234 2023-01-22 14:43:19.493462: step: 24/466, loss: 0.0321514792740345 2023-01-22 14:43:20.247562: step: 26/466, loss: 0.019760802388191223 2023-01-22 14:43:20.952410: step: 28/466, loss: 0.020527001470327377 2023-01-22 14:43:21.696705: step: 30/466, loss: 0.011523504741489887 2023-01-22 14:43:22.425164: step: 32/466, loss: 0.07324656844139099 2023-01-22 14:43:23.192420: step: 34/466, loss: 0.1542568802833557 2023-01-22 14:43:23.856698: step: 36/466, loss: 0.0032089881133288145 2023-01-22 14:43:24.571759: step: 38/466, loss: 0.008545069955289364 2023-01-22 14:43:25.372924: step: 40/466, loss: 0.028171386569738388 2023-01-22 14:43:26.163565: step: 42/466, loss: 0.02147684618830681 2023-01-22 14:43:26.847302: step: 44/466, loss: 0.020654737949371338 2023-01-22 14:43:27.580926: step: 46/466, loss: 0.012604920193552971 2023-01-22 14:43:28.381269: step: 48/466, loss:
0.14834651350975037 2023-01-22 14:43:29.124398: step: 50/466, loss: 0.05847810581326485 2023-01-22 14:43:29.834295: step: 52/466, loss: 0.14463350176811218 2023-01-22 14:43:30.585842: step: 54/466, loss: 0.01718887686729431 2023-01-22 14:43:31.330171: step: 56/466, loss: 0.02093418687582016 2023-01-22 14:43:32.077711: step: 58/466, loss: 0.11011721938848495 2023-01-22 14:43:32.834451: step: 60/466, loss: 0.01010575145483017 2023-01-22 14:43:33.589178: step: 62/466, loss: 0.0032435916364192963 2023-01-22 14:43:34.285399: step: 64/466, loss: 0.036733150482177734 2023-01-22 14:43:35.013777: step: 66/466, loss: 0.6481682062149048 2023-01-22 14:43:35.801616: step: 68/466, loss: 0.017467249184846878 2023-01-22 14:43:36.575245: step: 70/466, loss: 0.009255862794816494 2023-01-22 14:43:37.436437: step: 72/466, loss: 0.025226807221770287 2023-01-22 14:43:38.204469: step: 74/466, loss: 0.02177729830145836 2023-01-22 14:43:38.936276: step: 76/466, loss: 0.008818302303552628 2023-01-22 14:43:39.790194: step: 78/466, loss: 0.010397948324680328 2023-01-22 14:43:40.481987: step: 80/466, loss: 0.04460631310939789 2023-01-22 14:43:41.237844: step: 82/466, loss: 0.05118514597415924 2023-01-22 14:43:41.965069: step: 84/466, loss: 0.006314792670309544 2023-01-22 14:43:42.715541: step: 86/466, loss: 0.03415378928184509 2023-01-22 14:43:43.412020: step: 88/466, loss: 0.0070980386808514595 2023-01-22 14:43:44.052218: step: 90/466, loss: 0.023878734558820724 2023-01-22 14:43:44.822456: step: 92/466, loss: 0.05339059233665466 2023-01-22 14:43:45.630969: step: 94/466, loss: 0.02996997907757759 2023-01-22 14:43:46.419922: step: 96/466, loss: 0.01937534101307392 2023-01-22 14:43:47.185205: step: 98/466, loss: 0.07422344386577606 2023-01-22 14:43:47.994819: step: 100/466, loss: 0.6109414100646973 2023-01-22 14:43:48.721699: step: 102/466, loss: 0.03483252227306366 2023-01-22 14:43:49.421843: step: 104/466, loss: 0.10521746426820755 2023-01-22 14:43:50.199690: step: 106/466, loss: 
0.014691787771880627 2023-01-22 14:43:50.973326: step: 108/466, loss: 0.02977118454873562 2023-01-22 14:43:51.687766: step: 110/466, loss: 0.202758327126503 2023-01-22 14:43:52.490951: step: 112/466, loss: 0.024275628849864006 2023-01-22 14:43:53.301034: step: 114/466, loss: 0.4476814270019531 2023-01-22 14:43:54.046767: step: 116/466, loss: 0.012520572170615196 2023-01-22 14:43:54.731645: step: 118/466, loss: 0.04074344038963318 2023-01-22 14:43:55.490953: step: 120/466, loss: 0.08878043293952942 2023-01-22 14:43:56.299191: step: 122/466, loss: 0.07741080969572067 2023-01-22 14:43:57.060252: step: 124/466, loss: 0.014786334708333015 2023-01-22 14:43:57.780939: step: 126/466, loss: 0.09789146482944489 2023-01-22 14:43:58.520880: step: 128/466, loss: 0.018605459481477737 2023-01-22 14:43:59.319161: step: 130/466, loss: 0.045480720698833466 2023-01-22 14:44:00.121316: step: 132/466, loss: 0.0738578662276268 2023-01-22 14:44:00.922356: step: 134/466, loss: 0.02474837936460972 2023-01-22 14:44:01.647515: step: 136/466, loss: 0.02559565380215645 2023-01-22 14:44:02.351982: step: 138/466, loss: 0.08189298957586288 2023-01-22 14:44:03.156356: step: 140/466, loss: 0.09571664035320282 2023-01-22 14:44:03.893046: step: 142/466, loss: 0.013432195410132408 2023-01-22 14:44:04.737064: step: 144/466, loss: 0.028093885630369186 2023-01-22 14:44:05.447843: step: 146/466, loss: 0.01979084312915802 2023-01-22 14:44:06.261950: step: 148/466, loss: 0.07469996809959412 2023-01-22 14:44:07.002658: step: 150/466, loss: 0.027421394363045692 2023-01-22 14:44:07.769008: step: 152/466, loss: 0.03365630656480789 2023-01-22 14:44:08.480302: step: 154/466, loss: 0.035850051790475845 2023-01-22 14:44:09.221514: step: 156/466, loss: 0.0052161808125674725 2023-01-22 14:44:10.014073: step: 158/466, loss: 0.18874676525592804 2023-01-22 14:44:10.739453: step: 160/466, loss: 0.10076171159744263 2023-01-22 14:44:11.554311: step: 162/466, loss: 0.07173901796340942 2023-01-22 14:44:12.282669: step: 
164/466, loss: 0.003141289809718728 2023-01-22 14:44:12.986927: step: 166/466, loss: 0.10055757313966751 2023-01-22 14:44:13.736465: step: 168/466, loss: 0.006264520343393087 2023-01-22 14:44:14.514493: step: 170/466, loss: 0.07405374199151993 2023-01-22 14:44:15.248990: step: 172/466, loss: 0.022403467446565628 2023-01-22 14:44:15.967371: step: 174/466, loss: 0.14631879329681396 2023-01-22 14:44:16.652675: step: 176/466, loss: 0.015352722257375717 2023-01-22 14:44:17.441248: step: 178/466, loss: 0.07186131924390793 2023-01-22 14:44:18.135630: step: 180/466, loss: 0.04182245582342148 2023-01-22 14:44:18.927953: step: 182/466, loss: 0.012946860864758492 2023-01-22 14:44:19.680619: step: 184/466, loss: 0.05952528491616249 2023-01-22 14:44:20.462601: step: 186/466, loss: 0.02963799051940441 2023-01-22 14:44:21.214166: step: 188/466, loss: 0.0191253200173378 2023-01-22 14:44:21.972075: step: 190/466, loss: 0.02037331834435463 2023-01-22 14:44:22.740769: step: 192/466, loss: 0.026460129767656326 2023-01-22 14:44:23.432453: step: 194/466, loss: 0.038798943161964417 2023-01-22 14:44:24.114474: step: 196/466, loss: 0.033809561282396317 2023-01-22 14:44:24.832286: step: 198/466, loss: 0.015106773935258389 2023-01-22 14:44:25.582049: step: 200/466, loss: 0.02011779323220253 2023-01-22 14:44:26.425634: step: 202/466, loss: 0.011515239253640175 2023-01-22 14:44:27.211392: step: 204/466, loss: 0.014304769225418568 2023-01-22 14:44:27.915902: step: 206/466, loss: 0.04641815647482872 2023-01-22 14:44:28.691565: step: 208/466, loss: 0.012536582536995411 2023-01-22 14:44:29.488595: step: 210/466, loss: 0.005806042347103357 2023-01-22 14:44:30.223198: step: 212/466, loss: 0.0254743043333292 2023-01-22 14:44:30.969580: step: 214/466, loss: 0.06325501203536987 2023-01-22 14:44:31.731820: step: 216/466, loss: 0.0032372265122830868 2023-01-22 14:44:32.550925: step: 218/466, loss: 0.011696635745465755 2023-01-22 14:44:33.285267: step: 220/466, loss: 0.09282013028860092 2023-01-22 
14:44:33.964738: step: 222/466, loss: 0.034505460411310196 2023-01-22 14:44:34.717741: step: 224/466, loss: 0.31562793254852295 2023-01-22 14:44:35.444409: step: 226/466, loss: 0.0380619652569294 2023-01-22 14:44:36.253239: step: 228/466, loss: 0.03315262496471405 2023-01-22 14:44:37.039506: step: 230/466, loss: 0.1067478284239769 2023-01-22 14:44:37.845921: step: 232/466, loss: 0.03125142306089401 2023-01-22 14:44:38.669496: step: 234/466, loss: 0.14881651103496552 2023-01-22 14:44:39.406473: step: 236/466, loss: 0.04718930646777153 2023-01-22 14:44:40.183996: step: 238/466, loss: 0.07604897022247314 2023-01-22 14:44:40.883528: step: 240/466, loss: 0.06238268315792084 2023-01-22 14:44:41.552934: step: 242/466, loss: 0.006522961892187595 2023-01-22 14:44:42.282363: step: 244/466, loss: 0.3242731988430023 2023-01-22 14:44:43.021128: step: 246/466, loss: 0.01007294375449419 2023-01-22 14:44:43.779910: step: 248/466, loss: 0.0399547815322876 2023-01-22 14:44:44.496484: step: 250/466, loss: 0.03402939811348915 2023-01-22 14:44:45.335176: step: 252/466, loss: 0.0879935547709465 2023-01-22 14:44:46.062486: step: 254/466, loss: 0.013537813909351826 2023-01-22 14:44:46.841956: step: 256/466, loss: 0.03015812113881111 2023-01-22 14:44:47.686860: step: 258/466, loss: 0.0662144348025322 2023-01-22 14:44:48.417374: step: 260/466, loss: 0.019768787547945976 2023-01-22 14:44:49.194632: step: 262/466, loss: 0.044151514768600464 2023-01-22 14:44:49.904787: step: 264/466, loss: 0.008428049273788929 2023-01-22 14:44:50.532112: step: 266/466, loss: 0.028973642736673355 2023-01-22 14:44:51.306547: step: 268/466, loss: 0.04836282134056091 2023-01-22 14:44:52.214297: step: 270/466, loss: 0.042037785053253174 2023-01-22 14:44:52.934505: step: 272/466, loss: 0.022262291982769966 2023-01-22 14:44:53.744052: step: 274/466, loss: 0.09610701352357864 2023-01-22 14:44:54.539368: step: 276/466, loss: 0.0048899780958890915 2023-01-22 14:44:55.476877: step: 278/466, loss: 0.034485138952732086 
2023-01-22 14:44:56.277585: step: 280/466, loss: 0.027752559632062912 2023-01-22 14:44:57.102680: step: 282/466, loss: 0.09731708467006683 2023-01-22 14:44:57.818908: step: 284/466, loss: 0.008839517831802368 2023-01-22 14:44:58.551013: step: 286/466, loss: 0.01897738315165043 2023-01-22 14:44:59.348790: step: 288/466, loss: 0.021372944116592407 2023-01-22 14:45:00.110705: step: 290/466, loss: 0.0011967141181230545 2023-01-22 14:45:00.928699: step: 292/466, loss: 0.002933601150289178 2023-01-22 14:45:01.747213: step: 294/466, loss: 0.07908368110656738 2023-01-22 14:45:02.561148: step: 296/466, loss: 0.027145352214574814 2023-01-22 14:45:03.346970: step: 298/466, loss: 0.009775962680578232 2023-01-22 14:45:04.224393: step: 300/466, loss: 0.05080359801650047 2023-01-22 14:45:04.962794: step: 302/466, loss: 0.03475033491849899 2023-01-22 14:45:05.660821: step: 304/466, loss: 0.008489892818033695 2023-01-22 14:45:06.385186: step: 306/466, loss: 0.016335856169462204 2023-01-22 14:45:07.233956: step: 308/466, loss: 0.02795318141579628 2023-01-22 14:45:07.987607: step: 310/466, loss: 0.03940548375248909 2023-01-22 14:45:08.724714: step: 312/466, loss: 0.03053591586649418 2023-01-22 14:45:09.469552: step: 314/466, loss: 0.01822226122021675 2023-01-22 14:45:10.227992: step: 316/466, loss: 0.0020739452447742224 2023-01-22 14:45:10.986487: step: 318/466, loss: 0.01575908437371254 2023-01-22 14:45:11.907774: step: 320/466, loss: 0.3569834232330322 2023-01-22 14:45:12.729947: step: 322/466, loss: 0.038145650178194046 2023-01-22 14:45:13.428152: step: 324/466, loss: 0.0722237229347229 2023-01-22 14:45:14.218277: step: 326/466, loss: 0.005113533232361078 2023-01-22 14:45:14.964259: step: 328/466, loss: 0.013926339335739613 2023-01-22 14:45:15.698801: step: 330/466, loss: 0.17054429650306702 2023-01-22 14:45:16.464013: step: 332/466, loss: 0.04146190360188484 2023-01-22 14:45:17.318099: step: 334/466, loss: 0.1249271035194397 2023-01-22 14:45:18.078969: step: 336/466, loss: 
0.00410437723621726 2023-01-22 14:45:18.827580: step: 338/466, loss: 0.0191140566021204 2023-01-22 14:45:19.559961: step: 340/466, loss: 0.028116457164287567 2023-01-22 14:45:20.313069: step: 342/466, loss: 0.09836491197347641 2023-01-22 14:45:21.105149: step: 344/466, loss: 0.036641448736190796 2023-01-22 14:45:21.901120: step: 346/466, loss: 0.11809373646974564 2023-01-22 14:45:22.652350: step: 348/466, loss: 0.013707575388252735 2023-01-22 14:45:23.401984: step: 350/466, loss: 0.0044649383053183556 2023-01-22 14:45:24.225026: step: 352/466, loss: 0.03884003683924675 2023-01-22 14:45:24.957376: step: 354/466, loss: 0.058105651289224625 2023-01-22 14:45:25.702104: step: 356/466, loss: 0.0760202631354332 2023-01-22 14:45:26.448312: step: 358/466, loss: 0.0017490936443209648 2023-01-22 14:45:27.168541: step: 360/466, loss: 0.06289663910865784 2023-01-22 14:45:27.992754: step: 362/466, loss: 0.13581669330596924 2023-01-22 14:45:28.736960: step: 364/466, loss: 0.010210997425019741 2023-01-22 14:45:29.515683: step: 366/466, loss: 0.0055528851225972176 2023-01-22 14:45:30.167682: step: 368/466, loss: 0.024322085082530975 2023-01-22 14:45:30.912175: step: 370/466, loss: 0.07646214962005615 2023-01-22 14:45:31.692954: step: 372/466, loss: 0.01779305562376976 2023-01-22 14:45:32.463421: step: 374/466, loss: 0.6838045120239258 2023-01-22 14:45:33.232680: step: 376/466, loss: 0.015555030666291714 2023-01-22 14:45:34.065195: step: 378/466, loss: 0.005967474076896906 2023-01-22 14:45:34.893149: step: 380/466, loss: 0.06998220831155777 2023-01-22 14:45:35.861197: step: 382/466, loss: 0.0267830528318882 2023-01-22 14:45:36.613130: step: 384/466, loss: 0.09789497405290604 2023-01-22 14:45:37.368519: step: 386/466, loss: 0.025804026052355766 2023-01-22 14:45:38.113553: step: 388/466, loss: 0.009077346883714199 2023-01-22 14:45:38.868949: step: 390/466, loss: 0.003902832977473736 2023-01-22 14:45:39.645466: step: 392/466, loss: 0.009882147423923016 2023-01-22 14:45:40.376630: step: 
394/466, loss: 0.012400878593325615 2023-01-22 14:45:41.118045: step: 396/466, loss: 0.006915054749697447 2023-01-22 14:45:41.901874: step: 398/466, loss: 0.029605450108647346 2023-01-22 14:45:42.862899: step: 400/466, loss: 0.020665552467107773 2023-01-22 14:45:43.641883: step: 402/466, loss: 0.04576048627495766 2023-01-22 14:45:44.405191: step: 404/466, loss: 0.029126698151230812 2023-01-22 14:45:45.198512: step: 406/466, loss: 0.05250953882932663 2023-01-22 14:45:45.935872: step: 408/466, loss: 0.09095323085784912 2023-01-22 14:45:46.658686: step: 410/466, loss: 0.37631654739379883 2023-01-22 14:45:47.360460: step: 412/466, loss: 0.03619658201932907 2023-01-22 14:45:48.123780: step: 414/466, loss: 0.15473692119121552 2023-01-22 14:45:48.896032: step: 416/466, loss: 0.06378611922264099 2023-01-22 14:45:49.582380: step: 418/466, loss: 0.027144750580191612 2023-01-22 14:45:50.380091: step: 420/466, loss: 0.06826693564653397 2023-01-22 14:45:51.141298: step: 422/466, loss: 0.7054346203804016 2023-01-22 14:45:51.832221: step: 424/466, loss: 0.010629300028085709 2023-01-22 14:45:52.545080: step: 426/466, loss: 0.018608879297971725 2023-01-22 14:45:53.286709: step: 428/466, loss: 0.034134648740291595 2023-01-22 14:45:54.037983: step: 430/466, loss: 0.025236694142222404 2023-01-22 14:45:54.769799: step: 432/466, loss: 0.06735636293888092 2023-01-22 14:45:55.548489: step: 434/466, loss: 0.04542417451739311 2023-01-22 14:45:56.359510: step: 436/466, loss: 0.11845054477453232 2023-01-22 14:45:57.090387: step: 438/466, loss: 0.034908097237348557 2023-01-22 14:45:57.801205: step: 440/466, loss: 0.01451411284506321 2023-01-22 14:45:58.587944: step: 442/466, loss: 0.2690739631652832 2023-01-22 14:45:59.417406: step: 444/466, loss: 0.037572916597127914 2023-01-22 14:46:00.186040: step: 446/466, loss: 0.05552734434604645 2023-01-22 14:46:00.957836: step: 448/466, loss: 0.052332472056150436 2023-01-22 14:46:01.791102: step: 450/466, loss: 0.023011988028883934 2023-01-22 
14:46:02.593846: step: 452/466, loss: 0.024651074782013893 2023-01-22 14:46:03.388698: step: 454/466, loss: 0.19281929731369019 2023-01-22 14:46:04.183692: step: 456/466, loss: 0.013870988972485065 2023-01-22 14:46:04.935423: step: 458/466, loss: 0.01165796909481287 2023-01-22 14:46:05.638741: step: 460/466, loss: 0.015610731206834316 2023-01-22 14:46:06.381257: step: 462/466, loss: 0.021320592612028122 2023-01-22 14:46:07.153903: step: 464/466, loss: 0.01695173606276512 2023-01-22 14:46:07.891343: step: 466/466, loss: 0.012832703068852425 2023-01-22 14:46:08.638326: step: 468/466, loss: 0.03932918980717659 2023-01-22 14:46:09.349785: step: 470/466, loss: 0.011094323359429836 2023-01-22 14:46:10.143397: step: 472/466, loss: 0.03536432236433029 2023-01-22 14:46:10.923275: step: 474/466, loss: 0.013472940772771835 2023-01-22 14:46:11.766099: step: 476/466, loss: 0.0204615518450737 2023-01-22 14:46:12.627571: step: 478/466, loss: 0.011578397825360298 2023-01-22 14:46:13.354798: step: 480/466, loss: 0.004967503249645233 2023-01-22 14:46:14.013896: step: 482/466, loss: 0.020800841972231865 2023-01-22 14:46:14.754385: step: 484/466, loss: 0.027728581801056862 2023-01-22 14:46:15.547063: step: 486/466, loss: 0.026497885584831238 2023-01-22 14:46:16.321795: step: 488/466, loss: 0.1051311269402504 2023-01-22 14:46:17.041332: step: 490/466, loss: 0.0763152688741684 2023-01-22 14:46:17.885750: step: 492/466, loss: 0.23005081713199615 2023-01-22 14:46:18.664509: step: 494/466, loss: 0.011912272311747074 2023-01-22 14:46:19.388110: step: 496/466, loss: 0.010407421737909317 2023-01-22 14:46:20.101796: step: 498/466, loss: 0.0014458999503403902 2023-01-22 14:46:20.808742: step: 500/466, loss: 0.12212016433477402 2023-01-22 14:46:21.590015: step: 502/466, loss: 0.024932388216257095 2023-01-22 14:46:22.300533: step: 504/466, loss: 0.05399378016591072 2023-01-22 14:46:23.020017: step: 506/466, loss: 0.047883279621601105 2023-01-22 14:46:23.781674: step: 508/466, loss: 
0.02004711516201496 2023-01-22 14:46:24.547153: step: 510/466, loss: 0.00319908419623971 2023-01-22 14:46:25.308707: step: 512/466, loss: 1.257390022277832 2023-01-22 14:46:26.039552: step: 514/466, loss: 0.07255180180072784 2023-01-22 14:46:26.716107: step: 516/466, loss: 0.009753470309078693 2023-01-22 14:46:27.479795: step: 518/466, loss: 0.09241458028554916 2023-01-22 14:46:28.209822: step: 520/466, loss: 0.024641767144203186 2023-01-22 14:46:28.924317: step: 522/466, loss: 0.028830142691731453 2023-01-22 14:46:29.686658: step: 524/466, loss: 0.0035950529854744673 2023-01-22 14:46:30.489132: step: 526/466, loss: 0.02892039716243744 2023-01-22 14:46:31.268297: step: 528/466, loss: 0.24303272366523743 2023-01-22 14:46:32.023436: step: 530/466, loss: 0.2736338973045349 2023-01-22 14:46:32.819240: step: 532/466, loss: 0.038426872342824936 2023-01-22 14:46:33.586316: step: 534/466, loss: 0.017501261085271835 2023-01-22 14:46:34.338795: step: 536/466, loss: 0.14486098289489746 2023-01-22 14:46:35.070671: step: 538/466, loss: 0.0013692817883566022 2023-01-22 14:46:35.854589: step: 540/466, loss: 0.09689339995384216 2023-01-22 14:46:36.621351: step: 542/466, loss: 0.03417300805449486 2023-01-22 14:46:37.380316: step: 544/466, loss: 0.023501494899392128 2023-01-22 14:46:38.083526: step: 546/466, loss: 0.00033460595295764506 2023-01-22 14:46:38.826289: step: 548/466, loss: 0.03630896285176277 2023-01-22 14:46:39.727971: step: 550/466, loss: 0.10656667500734329 2023-01-22 14:46:40.520753: step: 552/466, loss: 0.01857171766459942 2023-01-22 14:46:41.221365: step: 554/466, loss: 0.004209447186440229 2023-01-22 14:46:41.976979: step: 556/466, loss: 0.038103025406599045 2023-01-22 14:46:42.720935: step: 558/466, loss: 0.0565560944378376 2023-01-22 14:46:43.493289: step: 560/466, loss: 0.024387158453464508 2023-01-22 14:46:44.218609: step: 562/466, loss: 0.1290791779756546 2023-01-22 14:46:44.965880: step: 564/466, loss: 0.20599617063999176 2023-01-22 14:46:45.820678: step: 
566/466, loss: 0.022683100774884224 2023-01-22 14:46:46.639380: step: 568/466, loss: 0.020000936463475227 2023-01-22 14:46:47.404770: step: 570/466, loss: 0.0779845118522644 2023-01-22 14:46:48.145250: step: 572/466, loss: 0.00268647656776011 2023-01-22 14:46:48.877724: step: 574/466, loss: 0.07933858782052994 2023-01-22 14:46:49.724699: step: 576/466, loss: 0.03418390080332756 2023-01-22 14:46:50.494270: step: 578/466, loss: 0.06472688168287277 2023-01-22 14:46:51.332512: step: 580/466, loss: 0.020349211990833282 2023-01-22 14:46:52.043201: step: 582/466, loss: 0.041946232318878174 2023-01-22 14:46:52.783599: step: 584/466, loss: 0.02541203796863556 2023-01-22 14:46:53.499295: step: 586/466, loss: 0.02173200063407421 2023-01-22 14:46:54.208765: step: 588/466, loss: 0.034484103322029114 2023-01-22 14:46:54.925097: step: 590/466, loss: 0.03421414643526077 2023-01-22 14:46:55.707534: step: 592/466, loss: 0.029985573142766953 2023-01-22 14:46:56.549632: step: 594/466, loss: 0.03662244975566864 2023-01-22 14:46:57.316133: step: 596/466, loss: 0.14605748653411865 2023-01-22 14:46:58.050377: step: 598/466, loss: 0.014268821105360985 2023-01-22 14:46:58.773331: step: 600/466, loss: 0.015629857778549194 2023-01-22 14:46:59.556299: step: 602/466, loss: 0.0037779496051371098 2023-01-22 14:47:00.314053: step: 604/466, loss: 0.0434369295835495 2023-01-22 14:47:01.135506: step: 606/466, loss: 0.05154235288500786 2023-01-22 14:47:01.848068: step: 608/466, loss: 0.05169665068387985 2023-01-22 14:47:02.669566: step: 610/466, loss: 0.12863320112228394 2023-01-22 14:47:03.454175: step: 612/466, loss: 0.0178390946239233 2023-01-22 14:47:04.299664: step: 614/466, loss: 0.2360614687204361 2023-01-22 14:47:05.058917: step: 616/466, loss: 0.017110150307416916 2023-01-22 14:47:05.882979: step: 618/466, loss: 1.5750479698181152 2023-01-22 14:47:06.629312: step: 620/466, loss: 0.0319860503077507 2023-01-22 14:47:07.444885: step: 622/466, loss: 0.03700065240263939 2023-01-22 14:47:08.224342: 
step: 624/466, loss: 0.023093944415450096 2023-01-22 14:47:08.996091: step: 626/466, loss: 0.03461216017603874 2023-01-22 14:47:09.771645: step: 628/466, loss: 4.815478801727295 2023-01-22 14:47:10.505791: step: 630/466, loss: 0.005494902841746807 2023-01-22 14:47:11.247841: step: 632/466, loss: 0.08448342978954315 2023-01-22 14:47:11.929456: step: 634/466, loss: 0.015122022479772568 2023-01-22 14:47:12.643628: step: 636/466, loss: 0.054206009954214096 2023-01-22 14:47:13.414702: step: 638/466, loss: 0.07040295749902725 2023-01-22 14:47:14.163251: step: 640/466, loss: 0.009515570476651192 2023-01-22 14:47:15.046611: step: 642/466, loss: 0.24127714335918427 2023-01-22 14:47:15.829902: step: 644/466, loss: 0.06542390584945679 2023-01-22 14:47:16.551790: step: 646/466, loss: 0.13807275891304016 2023-01-22 14:47:17.302092: step: 648/466, loss: 0.05299195647239685 2023-01-22 14:47:17.957076: step: 650/466, loss: 0.00545166851952672 2023-01-22 14:47:18.678455: step: 652/466, loss: 0.055834099650382996 2023-01-22 14:47:19.522598: step: 654/466, loss: 0.03985896334052086 2023-01-22 14:47:20.183028: step: 656/466, loss: 0.05284273624420166 2023-01-22 14:47:20.924880: step: 658/466, loss: 0.06045251339673996 2023-01-22 14:47:21.631155: step: 660/466, loss: 0.0345003679394722 2023-01-22 14:47:22.435834: step: 662/466, loss: 0.06754326075315475 2023-01-22 14:47:23.180270: step: 664/466, loss: 0.20084981620311737 2023-01-22 14:47:23.890540: step: 666/466, loss: 0.44011354446411133 2023-01-22 14:47:24.675873: step: 668/466, loss: 0.012255952693521976 2023-01-22 14:47:25.428884: step: 670/466, loss: 0.048434365540742874 2023-01-22 14:47:26.208148: step: 672/466, loss: 0.01924644410610199 2023-01-22 14:47:26.957809: step: 674/466, loss: 0.03586205840110779 2023-01-22 14:47:27.673653: step: 676/466, loss: 0.019450347870588303 2023-01-22 14:47:28.445230: step: 678/466, loss: 0.04268264025449753 2023-01-22 14:47:29.236798: step: 680/466, loss: 0.06356403231620789 2023-01-22 
14:47:29.966555: step: 682/466, loss: 0.06838064640760422 2023-01-22 14:47:30.712854: step: 684/466, loss: 0.033811304718256 2023-01-22 14:47:31.558056: step: 686/466, loss: 0.020641760900616646 2023-01-22 14:47:32.347677: step: 688/466, loss: 0.03898869827389717 2023-01-22 14:47:33.035833: step: 690/466, loss: 0.06437399983406067 2023-01-22 14:47:33.763193: step: 692/466, loss: 0.03513655439019203 2023-01-22 14:47:34.640367: step: 694/466, loss: 0.028021251782774925 2023-01-22 14:47:35.369845: step: 696/466, loss: 0.032173994928598404 2023-01-22 14:47:36.142056: step: 698/466, loss: 0.144153892993927 2023-01-22 14:47:36.992858: step: 700/466, loss: 2.666944980621338 2023-01-22 14:47:37.780412: step: 702/466, loss: 0.03742769733071327 2023-01-22 14:47:38.536211: step: 704/466, loss: 0.007199693471193314 2023-01-22 14:47:39.358735: step: 706/466, loss: 0.04252056032419205 2023-01-22 14:47:40.108360: step: 708/466, loss: 0.030529698356986046 2023-01-22 14:47:40.860130: step: 710/466, loss: 0.0010512126609683037 2023-01-22 14:47:41.606415: step: 712/466, loss: 0.005386349279433489 2023-01-22 14:47:42.352682: step: 714/466, loss: 0.056022871285676956 2023-01-22 14:47:43.018131: step: 716/466, loss: 0.0016162166139110923 2023-01-22 14:47:43.768822: step: 718/466, loss: 0.0325670950114727 2023-01-22 14:47:44.473912: step: 720/466, loss: 0.04420344531536102 2023-01-22 14:47:45.207745: step: 722/466, loss: 0.10789467394351959 2023-01-22 14:47:45.968875: step: 724/466, loss: 0.03549986705183983 2023-01-22 14:47:46.765437: step: 726/466, loss: 0.08880554139614105 2023-01-22 14:47:47.504934: step: 728/466, loss: 0.11527568101882935 2023-01-22 14:47:48.205985: step: 730/466, loss: 0.02176312729716301 2023-01-22 14:47:48.974665: step: 732/466, loss: 0.5555728077888489 2023-01-22 14:47:49.760833: step: 734/466, loss: 0.3058893382549286 2023-01-22 14:47:50.525738: step: 736/466, loss: 0.037982527166604996 2023-01-22 14:47:51.248815: step: 738/466, loss: 0.07793950289487839 
2023-01-22 14:47:51.998012: step: 740/466, loss: 0.031146274879574776 2023-01-22 14:47:52.780317: step: 742/466, loss: 0.049540840089321136 2023-01-22 14:47:53.491459: step: 744/466, loss: 0.0020056506618857384 2023-01-22 14:47:54.253986: step: 746/466, loss: 0.10663189738988876 2023-01-22 14:47:55.010414: step: 748/466, loss: 0.004785729572176933 2023-01-22 14:47:55.870915: step: 750/466, loss: 0.03479137644171715 2023-01-22 14:47:56.745289: step: 752/466, loss: 0.06022670120000839 2023-01-22 14:47:57.546058: step: 754/466, loss: 0.09813099354505539 2023-01-22 14:47:58.271015: step: 756/466, loss: 0.05356891453266144 2023-01-22 14:47:59.013338: step: 758/466, loss: 0.016268325969576836 2023-01-22 14:47:59.815405: step: 760/466, loss: 0.06473978608846664 2023-01-22 14:48:00.593910: step: 762/466, loss: 0.06507152318954468 2023-01-22 14:48:01.425312: step: 764/466, loss: 0.030582893639802933 2023-01-22 14:48:02.221635: step: 766/466, loss: 0.3056143522262573 2023-01-22 14:48:03.087332: step: 768/466, loss: 0.034017164260149 2023-01-22 14:48:03.901790: step: 770/466, loss: 0.05874023959040642 2023-01-22 14:48:04.653825: step: 772/466, loss: 0.07668693363666534 2023-01-22 14:48:05.391129: step: 774/466, loss: 0.024750353768467903 2023-01-22 14:48:06.191354: step: 776/466, loss: 0.01619192771613598 2023-01-22 14:48:07.020894: step: 778/466, loss: 0.10239271819591522 2023-01-22 14:48:07.749587: step: 780/466, loss: 0.035204604268074036 2023-01-22 14:48:08.516375: step: 782/466, loss: 0.03336874395608902 2023-01-22 14:48:09.314011: step: 784/466, loss: 0.03149819001555443 2023-01-22 14:48:10.038326: step: 786/466, loss: 0.03700670972466469 2023-01-22 14:48:10.843845: step: 788/466, loss: 0.07515472173690796 2023-01-22 14:48:11.630586: step: 790/466, loss: 0.016940707340836525 2023-01-22 14:48:12.373008: step: 792/466, loss: 0.052269306033849716 2023-01-22 14:48:13.178259: step: 794/466, loss: 0.05127988010644913 2023-01-22 14:48:13.922248: step: 796/466, loss: 
0.021262122318148613 2023-01-22 14:48:14.623445: step: 798/466, loss: 0.004020551685243845 2023-01-22 14:48:15.333636: step: 800/466, loss: 0.006356806959956884 2023-01-22 14:48:16.128775: step: 802/466, loss: 0.06390430778265 2023-01-22 14:48:16.982908: step: 804/466, loss: 0.03449377417564392 2023-01-22 14:48:17.795672: step: 806/466, loss: 0.05775659158825874 2023-01-22 14:48:18.588292: step: 808/466, loss: 1.9378232955932617 2023-01-22 14:48:19.316479: step: 810/466, loss: 0.028914159163832664 2023-01-22 14:48:20.120099: step: 812/466, loss: 0.08311357349157333 2023-01-22 14:48:20.966151: step: 814/466, loss: 0.0034366236068308353 2023-01-22 14:48:21.710090: step: 816/466, loss: 0.024715717881917953 2023-01-22 14:48:22.528645: step: 818/466, loss: 0.05565216392278671 2023-01-22 14:48:23.350943: step: 820/466, loss: 0.1202254593372345 2023-01-22 14:48:24.156069: step: 822/466, loss: 0.02393370307981968 2023-01-22 14:48:24.896116: step: 824/466, loss: 0.028989041224122047 2023-01-22 14:48:25.606019: step: 826/466, loss: 0.01382619421929121 2023-01-22 14:48:26.419208: step: 828/466, loss: 0.05490459129214287 2023-01-22 14:48:27.187386: step: 830/466, loss: 0.043450355529785156 2023-01-22 14:48:28.001050: step: 832/466, loss: 0.015542100183665752 2023-01-22 14:48:28.709882: step: 834/466, loss: 0.0452614389359951 2023-01-22 14:48:29.555390: step: 836/466, loss: 0.030005039647221565 2023-01-22 14:48:30.288660: step: 838/466, loss: 0.008008835837244987 2023-01-22 14:48:31.029462: step: 840/466, loss: 0.02006283588707447 2023-01-22 14:48:31.751522: step: 842/466, loss: 0.057198040187358856 2023-01-22 14:48:32.542097: step: 844/466, loss: 0.12752194702625275 2023-01-22 14:48:33.241274: step: 846/466, loss: 0.014229381456971169 2023-01-22 14:48:33.976754: step: 848/466, loss: 0.014348013326525688 2023-01-22 14:48:34.804651: step: 850/466, loss: 0.03122992441058159 2023-01-22 14:48:35.584952: step: 852/466, loss: 0.06689751148223877 2023-01-22 14:48:36.288616: step: 
854/466, loss: 0.08557271957397461 2023-01-22 14:48:37.067892: step: 856/466, loss: 0.07315085828304291 2023-01-22 14:48:37.800677: step: 858/466, loss: 0.011412195861339569 2023-01-22 14:48:38.568097: step: 860/466, loss: 0.18121466040611267 2023-01-22 14:48:39.355132: step: 862/466, loss: 0.047808241099119186 2023-01-22 14:48:40.164990: step: 864/466, loss: 0.039932798594236374 2023-01-22 14:48:40.966045: step: 866/466, loss: 0.06648283451795578 2023-01-22 14:48:41.742809: step: 868/466, loss: 0.03634491562843323 2023-01-22 14:48:42.532098: step: 870/466, loss: 0.040794167667627335 2023-01-22 14:48:43.284344: step: 872/466, loss: 0.03335542976856232 2023-01-22 14:48:44.071482: step: 874/466, loss: 0.3126262128353119 2023-01-22 14:48:44.797955: step: 876/466, loss: 0.006511006038635969 2023-01-22 14:48:45.592952: step: 878/466, loss: 0.0005270981346257031 2023-01-22 14:48:46.317545: step: 880/466, loss: 0.03149344399571419 2023-01-22 14:48:47.196201: step: 882/466, loss: 0.1612115204334259 2023-01-22 14:48:48.029491: step: 884/466, loss: 0.02418387308716774 2023-01-22 14:48:48.788819: step: 886/466, loss: 0.04643942415714264 2023-01-22 14:48:49.483785: step: 888/466, loss: 0.07254820317029953 2023-01-22 14:48:50.219611: step: 890/466, loss: 0.024736449122428894 2023-01-22 14:48:50.985671: step: 892/466, loss: 0.015713006258010864 2023-01-22 14:48:51.765820: step: 894/466, loss: 0.002715908456593752 2023-01-22 14:48:52.518493: step: 896/466, loss: 0.06559024006128311 2023-01-22 14:48:53.279698: step: 898/466, loss: 0.029943106696009636 2023-01-22 14:48:54.129380: step: 900/466, loss: 0.01815328374505043 2023-01-22 14:48:54.975250: step: 902/466, loss: 0.030919160693883896 2023-01-22 14:48:55.703632: step: 904/466, loss: 0.0003961712936870754 2023-01-22 14:48:56.492122: step: 906/466, loss: 0.007957972586154938 2023-01-22 14:48:57.246351: step: 908/466, loss: 0.3450794517993927 2023-01-22 14:48:57.985026: step: 910/466, loss: 0.04624255374073982 2023-01-22 
14:48:58.749507: step: 912/466, loss: 0.03033428080379963 2023-01-22 14:48:59.561820: step: 914/466, loss: 0.11813104897737503 2023-01-22 14:49:00.334017: step: 916/466, loss: 0.04941783472895622 2023-01-22 14:49:01.117021: step: 918/466, loss: 0.02447052113711834 2023-01-22 14:49:01.947459: step: 920/466, loss: 0.05541486293077469 2023-01-22 14:49:02.768671: step: 922/466, loss: 0.04049112647771835 2023-01-22 14:49:03.552742: step: 924/466, loss: 0.014253446832299232 2023-01-22 14:49:04.358788: step: 926/466, loss: 0.190837562084198 2023-01-22 14:49:05.141007: step: 928/466, loss: 0.06502583622932434 2023-01-22 14:49:05.938779: step: 930/466, loss: 0.018064746633172035 2023-01-22 14:49:06.714027: step: 932/466, loss: 0.1262487918138504
==================================================
Loss: 0.085
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30854679802955665, 'r': 0.33957712117104905, 'f1': 0.3233191379532843}, 'combined': 0.23823515428136738, 'epoch': 23}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3640722295351131, 'r': 0.3048119878445492, 'f1': 0.33181700844529155}, 'combined': 0.2039460637273499, 'epoch': 23}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28666496819284765, 'r': 0.349763898573057, 'f1': 0.3150864522188052}, 'combined': 0.23216896479280383, 'epoch': 23}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.339203402320094, 'r': 0.30657638528583886, 'f1': 0.32206567921698503}, 'combined': 0.19795256381141518, 'epoch': 23}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32394897959183677, 'r': 0.3442342098129575, 'f1': 0.33378367722433966}, 'combined': 0.24594586742846078, 'epoch': 23}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.36357328023250146, 'r': 0.3043942528007177, 'f1': 0.3313622638876804}, 'combined': 0.20466492769533207, 'epoch': 23}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2703488372093023, 'r': 0.33214285714285713, 'f1': 0.2980769230769231}, 'combined': 0.1987179487179487, 'epoch': 23}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2074468085106383, 'r': 0.42391304347826086, 'f1': 0.2785714285714285}, 'combined': 0.13928571428571426, 'epoch': 23}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 23}
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32394897959183677, 'r': 0.3442342098129575, 'f1': 0.33378367722433966}, 'combined': 0.24594586742846078, 'epoch': 23}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.36357328023250146, 'r': 0.3043942528007177, 'f1': 0.3313622638876804}, 'combined': 0.20466492769533207, 'epoch': 23}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 23}
******************************
Epoch: 24
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 14:51:58.785987: step: 2/466, loss: 0.012442237697541714 2023-01-22 14:51:59.474826: step: 4/466, loss: 0.022447844967246056 2023-01-22 14:52:00.234079: step: 6/466, loss: 0.16383084654808044 2023-01-22 14:52:01.020192: step: 8/466, loss: 0.06788397580385208 2023-01-22 14:52:01.787342: step: 10/466, loss: 0.012289896607398987 2023-01-22 14:52:02.580795: step: 12/466, loss: 0.030877836048603058 2023-01-22 14:52:03.357031: step: 14/466, loss: 0.02345413900911808 2023-01-22 14:52:04.081954: step: 16/466, loss: 0.00726708397269249 2023-01-22 14:52:04.852827: step: 18/466, loss: 0.005459288135170937 2023-01-22 14:52:05.614724: step: 20/466, loss: 0.01712528057396412 2023-01-22 14:52:06.354971: step: 22/466, loss: 0.007801868487149477 2023-01-22
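Note for anyone post-processing these summaries: in every Dev/Test/Sample block of this log, the 'combined' value equals the product of the template F1 and the slot F1. This relation is inferred from the logged numbers (the scoring code itself is not shown here), so treat it as an assumption. A minimal sketch:

```python
# Assumed relation, checked against the logged values: combined = template_f1 * slot_f1.
def combined_score(template_f1: float, slot_f1: float) -> float:
    """Combined metric as it appears to be computed in this log."""
    return template_f1 * slot_f1

# Dev Chinese, epoch 23, values copied from the summary above:
dev_zh = combined_score(0.7368421052631579, 0.3233191379532843)
assert abs(dev_zh - 0.23823515428136738) < 1e-9
```

The same product reproduces the Test Chinese value (0.6146341463414634 * 0.33181700844529155 ≈ 0.2039460637273499) and the other languages' entries.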
14:52:07.184984: step: 24/466, loss: 0.01762966439127922 2023-01-22 14:52:07.923033: step: 26/466, loss: 0.015811754390597343 2023-01-22 14:52:08.732837: step: 28/466, loss: 0.03606037050485611 2023-01-22 14:52:09.583162: step: 30/466, loss: 0.01512200478464365 2023-01-22 14:52:10.334665: step: 32/466, loss: 0.004144949372857809 2023-01-22 14:52:11.167840: step: 34/466, loss: 0.019165851175785065 2023-01-22 14:52:11.932267: step: 36/466, loss: 0.01260423380881548 2023-01-22 14:52:12.661534: step: 38/466, loss: 0.01843561790883541 2023-01-22 14:52:13.439921: step: 40/466, loss: 0.08869102597236633 2023-01-22 14:52:14.108716: step: 42/466, loss: 0.008993967436254025 2023-01-22 14:52:14.849880: step: 44/466, loss: 0.010097336955368519 2023-01-22 14:52:15.751232: step: 46/466, loss: 0.195277601480484 2023-01-22 14:52:16.542576: step: 48/466, loss: 0.02338215708732605 2023-01-22 14:52:17.357674: step: 50/466, loss: 0.0014439165825024247 2023-01-22 14:52:18.128122: step: 52/466, loss: 0.023032324388623238 2023-01-22 14:52:18.811865: step: 54/466, loss: 0.02563190460205078 2023-01-22 14:52:19.520037: step: 56/466, loss: 0.03805235028266907 2023-01-22 14:52:20.560606: step: 58/466, loss: 0.07855221629142761 2023-01-22 14:52:21.297147: step: 60/466, loss: 0.025932323187589645 2023-01-22 14:52:22.025490: step: 62/466, loss: 0.10620784759521484 2023-01-22 14:52:22.797593: step: 64/466, loss: 0.0058013866655528545 2023-01-22 14:52:23.603010: step: 66/466, loss: 0.3046168386936188 2023-01-22 14:52:24.319700: step: 68/466, loss: 0.01990501582622528 2023-01-22 14:52:25.042337: step: 70/466, loss: 0.05764950439333916 2023-01-22 14:52:25.869416: step: 72/466, loss: 0.030430598184466362 2023-01-22 14:52:26.607494: step: 74/466, loss: 0.009992522187530994 2023-01-22 14:52:27.471259: step: 76/466, loss: 0.11593419313430786 2023-01-22 14:52:28.257928: step: 78/466, loss: 0.04402603954076767 2023-01-22 14:52:29.025812: step: 80/466, loss: 0.006121122743934393 2023-01-22 14:52:29.759008: 
step: 82/466, loss: 0.0049262214452028275 2023-01-22 14:52:30.554958: step: 84/466, loss: 0.0043501779437065125 2023-01-22 14:52:31.264654: step: 86/466, loss: 0.010162044316530228 2023-01-22 14:52:32.004203: step: 88/466, loss: 0.0474172867834568 2023-01-22 14:52:32.629854: step: 90/466, loss: 0.015527402982115746 2023-01-22 14:52:33.410912: step: 92/466, loss: 0.03818926587700844 2023-01-22 14:52:34.261687: step: 94/466, loss: 0.023612646386027336 2023-01-22 14:52:35.027835: step: 96/466, loss: 0.07392007857561111 2023-01-22 14:52:35.826211: step: 98/466, loss: 0.0026107749436050653 2023-01-22 14:52:36.607137: step: 100/466, loss: 0.0592830553650856 2023-01-22 14:52:37.411310: step: 102/466, loss: 0.026403101161122322 2023-01-22 14:52:38.226127: step: 104/466, loss: 0.13619732856750488 2023-01-22 14:52:38.980402: step: 106/466, loss: 0.021160613745450974 2023-01-22 14:52:39.657143: step: 108/466, loss: 0.3664078414440155 2023-01-22 14:52:40.405839: step: 110/466, loss: 0.030345715582370758 2023-01-22 14:52:41.157084: step: 112/466, loss: 0.0480051189661026 2023-01-22 14:52:41.903466: step: 114/466, loss: 0.028134595602750778 2023-01-22 14:52:42.736147: step: 116/466, loss: 0.032205626368522644 2023-01-22 14:52:43.500031: step: 118/466, loss: 0.062361881136894226 2023-01-22 14:52:44.354721: step: 120/466, loss: 0.05839879810810089 2023-01-22 14:52:45.163335: step: 122/466, loss: 0.017225315794348717 2023-01-22 14:52:45.898221: step: 124/466, loss: 0.02745700441300869 2023-01-22 14:52:46.638300: step: 126/466, loss: 0.06077186018228531 2023-01-22 14:52:47.429459: step: 128/466, loss: 0.03275851905345917 2023-01-22 14:52:48.179316: step: 130/466, loss: 0.014129195362329483 2023-01-22 14:52:48.899442: step: 132/466, loss: 0.018745819106698036 2023-01-22 14:52:49.668819: step: 134/466, loss: 0.008557483553886414 2023-01-22 14:52:50.386577: step: 136/466, loss: 0.013914437033236027 2023-01-22 14:52:51.268594: step: 138/466, loss: 0.06996177136898041 2023-01-22 
14:52:51.948033: step: 140/466, loss: 0.01704161800444126 2023-01-22 14:52:52.680462: step: 142/466, loss: 0.0320325568318367 2023-01-22 14:52:53.503513: step: 144/466, loss: 0.34453633427619934 2023-01-22 14:52:54.265085: step: 146/466, loss: 0.23159049451351166 2023-01-22 14:52:54.915040: step: 148/466, loss: 0.00032263854518532753 2023-01-22 14:52:55.615489: step: 150/466, loss: 0.3419460356235504 2023-01-22 14:52:56.456909: step: 152/466, loss: 0.15512454509735107 2023-01-22 14:52:57.190899: step: 154/466, loss: 0.05583605542778969 2023-01-22 14:52:57.927455: step: 156/466, loss: 0.0402413085103035 2023-01-22 14:52:58.694541: step: 158/466, loss: 0.021404564380645752 2023-01-22 14:52:59.452329: step: 160/466, loss: 0.7854284048080444 2023-01-22 14:53:00.201950: step: 162/466, loss: 0.03168287128210068 2023-01-22 14:53:00.914816: step: 164/466, loss: 0.038331855088472366 2023-01-22 14:53:01.619773: step: 166/466, loss: 0.0038885714020580053 2023-01-22 14:53:02.255245: step: 168/466, loss: 0.01354452408850193 2023-01-22 14:53:02.991663: step: 170/466, loss: 0.413261353969574 2023-01-22 14:53:03.815560: step: 172/466, loss: 0.016408804804086685 2023-01-22 14:53:04.584333: step: 174/466, loss: 0.028176261112093925 2023-01-22 14:53:05.444319: step: 176/466, loss: 0.039729684591293335 2023-01-22 14:53:06.303912: step: 178/466, loss: 0.02926718443632126 2023-01-22 14:53:07.183043: step: 180/466, loss: 0.003973286598920822 2023-01-22 14:53:07.849298: step: 182/466, loss: 0.014472606591880322 2023-01-22 14:53:08.647101: step: 184/466, loss: 0.0029143274296075106 2023-01-22 14:53:09.276768: step: 186/466, loss: 0.00688170874491334 2023-01-22 14:53:09.950113: step: 188/466, loss: 0.004136112052947283 2023-01-22 14:53:10.653866: step: 190/466, loss: 0.09447738528251648 2023-01-22 14:53:11.340690: step: 192/466, loss: 0.0017223696922883391 2023-01-22 14:53:12.011112: step: 194/466, loss: 0.017475929111242294 2023-01-22 14:53:12.975479: step: 196/466, loss: 
0.2981652021408081 2023-01-22 14:53:13.762251: step: 198/466, loss: 0.05993760749697685 2023-01-22 14:53:14.469995: step: 200/466, loss: 0.011921750381588936 2023-01-22 14:53:15.139619: step: 202/466, loss: 0.008036823011934757 2023-01-22 14:53:15.906643: step: 204/466, loss: 0.0012072388781234622 2023-01-22 14:53:16.675540: step: 206/466, loss: 0.3928010165691376 2023-01-22 14:53:17.456668: step: 208/466, loss: 0.034359339624643326 2023-01-22 14:53:18.180290: step: 210/466, loss: 0.03412700816988945 2023-01-22 14:53:18.984296: step: 212/466, loss: 0.035888466984033585 2023-01-22 14:53:19.762384: step: 214/466, loss: 0.028900718316435814 2023-01-22 14:53:20.505819: step: 216/466, loss: 0.060817234218120575 2023-01-22 14:53:21.317277: step: 218/466, loss: 0.047735925763845444 2023-01-22 14:53:22.087922: step: 220/466, loss: 0.0510413721203804 2023-01-22 14:53:22.865764: step: 222/466, loss: 0.014088866300880909 2023-01-22 14:53:23.626122: step: 224/466, loss: 0.09515645354986191 2023-01-22 14:53:24.451502: step: 226/466, loss: 0.01028447411954403 2023-01-22 14:53:25.189751: step: 228/466, loss: 0.03016878291964531 2023-01-22 14:53:25.928663: step: 230/466, loss: 0.008594774641096592 2023-01-22 14:53:26.634512: step: 232/466, loss: 0.02591010369360447 2023-01-22 14:53:27.388823: step: 234/466, loss: 0.006289259530603886 2023-01-22 14:53:28.224239: step: 236/466, loss: 0.06279350072145462 2023-01-22 14:53:29.073178: step: 238/466, loss: 0.0070212590508162975 2023-01-22 14:53:29.752330: step: 240/466, loss: 0.010945099405944347 2023-01-22 14:53:30.502540: step: 242/466, loss: 0.012712801806628704 2023-01-22 14:53:31.235448: step: 244/466, loss: 0.007867317646741867 2023-01-22 14:53:32.096139: step: 246/466, loss: 0.0334496907889843 2023-01-22 14:53:32.895654: step: 248/466, loss: 0.004537621047347784 2023-01-22 14:53:33.724569: step: 250/466, loss: 0.03142962604761124 2023-01-22 14:53:34.389752: step: 252/466, loss: 0.0010505338432267308 2023-01-22 14:53:35.181732: 
step: 254/466, loss: 0.017321258783340454 2023-01-22 14:53:35.922750: step: 256/466, loss: 0.14158202707767487 2023-01-22 14:53:36.699643: step: 258/466, loss: 0.047040604054927826 2023-01-22 14:53:37.407982: step: 260/466, loss: 0.047133516520261765 2023-01-22 14:53:38.100979: step: 262/466, loss: 0.03368383273482323 2023-01-22 14:53:38.846718: step: 264/466, loss: 0.030273066833615303 2023-01-22 14:53:39.664986: step: 266/466, loss: 0.0032059112563729286 2023-01-22 14:53:40.396402: step: 268/466, loss: 0.004067208617925644 2023-01-22 14:53:41.149948: step: 270/466, loss: 0.012437507510185242 2023-01-22 14:53:41.990585: step: 272/466, loss: 0.12539102137088776 2023-01-22 14:53:42.762252: step: 274/466, loss: 0.02563825249671936 2023-01-22 14:53:43.502814: step: 276/466, loss: 0.0011749654076993465 2023-01-22 14:53:44.238430: step: 278/466, loss: 0.014036266133189201 2023-01-22 14:53:44.910454: step: 280/466, loss: 0.026236526668071747 2023-01-22 14:53:45.614351: step: 282/466, loss: 1.3633739948272705 2023-01-22 14:53:46.389945: step: 284/466, loss: 0.018139546737074852 2023-01-22 14:53:47.159856: step: 286/466, loss: 0.0016358124557882547 2023-01-22 14:53:47.901628: step: 288/466, loss: 0.09358922392129898 2023-01-22 14:53:48.643186: step: 290/466, loss: 0.022072920575737953 2023-01-22 14:53:49.504631: step: 292/466, loss: 0.008230620995163918 2023-01-22 14:53:50.262461: step: 294/466, loss: 0.03175661712884903 2023-01-22 14:53:50.954857: step: 296/466, loss: 0.0832146480679512 2023-01-22 14:53:51.672358: step: 298/466, loss: 0.006687816698104143 2023-01-22 14:53:52.382101: step: 300/466, loss: 0.00434772577136755 2023-01-22 14:53:53.063447: step: 302/466, loss: 0.1016339585185051 2023-01-22 14:53:53.740530: step: 304/466, loss: 0.0026131209451705217 2023-01-22 14:53:54.525990: step: 306/466, loss: 0.04954265058040619 2023-01-22 14:53:55.236068: step: 308/466, loss: 0.024544520303606987 2023-01-22 14:53:56.042089: step: 310/466, loss: 0.023840585723519325 
2023-01-22 14:53:56.849168: step: 312/466, loss: 0.1760261058807373 2023-01-22 14:53:57.555261: step: 314/466, loss: 0.09610689431428909 2023-01-22 14:53:58.396753: step: 316/466, loss: 0.0886836126446724 2023-01-22 14:53:59.242352: step: 318/466, loss: 0.019128063693642616 2023-01-22 14:54:00.037509: step: 320/466, loss: 0.1219346672296524 2023-01-22 14:54:00.756161: step: 322/466, loss: 0.0004176282382104546 2023-01-22 14:54:01.546170: step: 324/466, loss: 0.01580335572361946 2023-01-22 14:54:02.272978: step: 326/466, loss: 0.0009826215682551265 2023-01-22 14:54:03.008951: step: 328/466, loss: 0.024065284058451653 2023-01-22 14:54:03.763280: step: 330/466, loss: 0.0028311621863394976 2023-01-22 14:54:04.550490: step: 332/466, loss: 0.04555172100663185 2023-01-22 14:54:05.340591: step: 334/466, loss: 0.12674173712730408 2023-01-22 14:54:06.095923: step: 336/466, loss: 0.03965069726109505 2023-01-22 14:54:06.891991: step: 338/466, loss: 0.03905881196260452 2023-01-22 14:54:07.616282: step: 340/466, loss: 0.011771049350500107 2023-01-22 14:54:08.275145: step: 342/466, loss: 0.0562756285071373 2023-01-22 14:54:09.003462: step: 344/466, loss: 0.01050628162920475 2023-01-22 14:54:09.793008: step: 346/466, loss: 0.15805859863758087 2023-01-22 14:54:10.547786: step: 348/466, loss: 0.030752331018447876 2023-01-22 14:54:11.325820: step: 350/466, loss: 0.016457442194223404 2023-01-22 14:54:12.197225: step: 352/466, loss: 0.024749215692281723 2023-01-22 14:54:13.001239: step: 354/466, loss: 0.03467196226119995 2023-01-22 14:54:13.722119: step: 356/466, loss: 0.055692195892333984 2023-01-22 14:54:14.484700: step: 358/466, loss: 0.05298139527440071 2023-01-22 14:54:15.189461: step: 360/466, loss: 0.020247722044587135 2023-01-22 14:54:15.998435: step: 362/466, loss: 0.009396249428391457 2023-01-22 14:54:16.767937: step: 364/466, loss: 0.022053968161344528 2023-01-22 14:54:17.437800: step: 366/466, loss: 0.021803874522447586 2023-01-22 14:54:18.261595: step: 368/466, loss: 
0.02070603333413601 2023-01-22 14:54:18.948919: step: 370/466, loss: 0.013286711648106575 2023-01-22 14:54:19.614060: step: 372/466, loss: 0.012056940235197544 2023-01-22 14:54:20.400653: step: 374/466, loss: 0.17006126046180725 2023-01-22 14:54:21.171709: step: 376/466, loss: 0.026832759380340576 2023-01-22 14:54:21.927023: step: 378/466, loss: 0.010535070672631264 2023-01-22 14:54:22.702304: step: 380/466, loss: 0.02465672977268696 2023-01-22 14:54:23.490550: step: 382/466, loss: 0.031762540340423584 2023-01-22 14:54:24.230857: step: 384/466, loss: 0.017145009711384773 2023-01-22 14:54:25.029099: step: 386/466, loss: 0.024914514273405075 2023-01-22 14:54:25.773359: step: 388/466, loss: 0.036123715341091156 2023-01-22 14:54:26.503276: step: 390/466, loss: 0.009714765474200249 2023-01-22 14:54:27.273589: step: 392/466, loss: 0.017705464735627174 2023-01-22 14:54:28.035200: step: 394/466, loss: 0.02804849110543728 2023-01-22 14:54:28.901602: step: 396/466, loss: 0.05484087020158768 2023-01-22 14:54:29.584343: step: 398/466, loss: 0.03912174701690674 2023-01-22 14:54:30.344748: step: 400/466, loss: 0.020371561869978905 2023-01-22 14:54:31.126651: step: 402/466, loss: 0.02459416352212429 2023-01-22 14:54:31.900467: step: 404/466, loss: 0.006846928503364325 2023-01-22 14:54:32.669163: step: 406/466, loss: 0.007404484786093235 2023-01-22 14:54:33.382644: step: 408/466, loss: 0.022011689841747284 2023-01-22 14:54:34.111749: step: 410/466, loss: 0.0031958348117768764 2023-01-22 14:54:34.869511: step: 412/466, loss: 0.012686365284025669 2023-01-22 14:54:35.658511: step: 414/466, loss: 0.03906247019767761 2023-01-22 14:54:36.404913: step: 416/466, loss: 0.15825800597667694 2023-01-22 14:54:37.257663: step: 418/466, loss: 1.7678616046905518 2023-01-22 14:54:37.983893: step: 420/466, loss: 0.009127304889261723 2023-01-22 14:54:38.748376: step: 422/466, loss: 0.029808560386300087 2023-01-22 14:54:39.534819: step: 424/466, loss: 0.05525938421487808 2023-01-22 14:54:40.328248: 
step: 426/466, loss: 0.026894461363554 2023-01-22 14:54:41.122732: step: 428/466, loss: 0.1667776256799698 2023-01-22 14:54:41.822548: step: 430/466, loss: 0.02654326520860195 2023-01-22 14:54:42.548736: step: 432/466, loss: 0.004692245740443468 2023-01-22 14:54:43.339283: step: 434/466, loss: 0.053976550698280334 2023-01-22 14:54:44.156873: step: 436/466, loss: 0.03274490311741829 2023-01-22 14:54:44.866284: step: 438/466, loss: 0.026439087465405464 2023-01-22 14:54:45.665516: step: 440/466, loss: 0.056312721222639084 2023-01-22 14:54:46.328424: step: 442/466, loss: 0.025374772027134895 2023-01-22 14:54:47.106462: step: 444/466, loss: 0.0024430155754089355 2023-01-22 14:54:47.829296: step: 446/466, loss: 0.012750299647450447 2023-01-22 14:54:48.603202: step: 448/466, loss: 0.31890374422073364 2023-01-22 14:54:49.362697: step: 450/466, loss: 0.0671519860625267 2023-01-22 14:54:50.172090: step: 452/466, loss: 0.07847802340984344 2023-01-22 14:54:50.938289: step: 454/466, loss: 0.04167680814862251 2023-01-22 14:54:51.646829: step: 456/466, loss: 0.010705829598009586 2023-01-22 14:54:52.421659: step: 458/466, loss: 0.053796254098415375 2023-01-22 14:54:53.121490: step: 460/466, loss: 0.00658042635768652 2023-01-22 14:54:53.896011: step: 462/466, loss: 0.017308924347162247 2023-01-22 14:54:54.640280: step: 464/466, loss: 0.034351151436567307 2023-01-22 14:54:55.405893: step: 466/466, loss: 0.00565611245110631 2023-01-22 14:54:56.204079: step: 468/466, loss: 0.061411306262016296 2023-01-22 14:54:56.934737: step: 470/466, loss: 0.002887856913730502 2023-01-22 14:54:57.645000: step: 472/466, loss: 0.04241650179028511 2023-01-22 14:54:58.387994: step: 474/466, loss: 0.04883456975221634 2023-01-22 14:54:59.131978: step: 476/466, loss: 0.012651508674025536 2023-01-22 14:54:59.861177: step: 478/466, loss: 0.012203642167150974 2023-01-22 14:55:00.680341: step: 480/466, loss: 0.024727782234549522 2023-01-22 14:55:01.332535: step: 482/466, loss: 0.012281586416065693 2023-01-22 
14:55:02.071332: step: 484/466, loss: 0.0012142674531787634 2023-01-22 14:55:02.809069: step: 486/466, loss: 0.06945552676916122 2023-01-22 14:55:03.538203: step: 488/466, loss: 0.04210897535085678 2023-01-22 14:55:04.373374: step: 490/466, loss: 0.3636569678783417 2023-01-22 14:55:05.129960: step: 492/466, loss: 0.027368493378162384 2023-01-22 14:55:05.845514: step: 494/466, loss: 0.012000566348433495 2023-01-22 14:55:06.598110: step: 496/466, loss: 0.0015399146359413862 2023-01-22 14:55:07.271210: step: 498/466, loss: 0.09635155647993088 2023-01-22 14:55:07.985932: step: 500/466, loss: 0.060200709849596024 2023-01-22 14:55:08.699577: step: 502/466, loss: 0.03938752040266991 2023-01-22 14:55:09.435352: step: 504/466, loss: 0.006137209013104439 2023-01-22 14:55:10.282649: step: 506/466, loss: 0.04328594356775284 2023-01-22 14:55:11.045795: step: 508/466, loss: 0.05471671745181084 2023-01-22 14:55:11.836891: step: 510/466, loss: 0.07688561826944351 2023-01-22 14:55:12.617864: step: 512/466, loss: 0.05143206939101219 2023-01-22 14:55:13.422964: step: 514/466, loss: 0.023206721991300583 2023-01-22 14:55:14.278908: step: 516/466, loss: 0.9250187277793884 2023-01-22 14:55:15.154825: step: 518/466, loss: 0.017074864357709885 2023-01-22 14:55:15.969435: step: 520/466, loss: 0.005635548382997513 2023-01-22 14:55:16.772167: step: 522/466, loss: 0.021923230960965157 2023-01-22 14:55:17.554375: step: 524/466, loss: 0.023732662200927734 2023-01-22 14:55:18.248316: step: 526/466, loss: 0.05662060156464577 2023-01-22 14:55:19.044223: step: 528/466, loss: 0.010736081749200821 2023-01-22 14:55:19.820737: step: 530/466, loss: 0.05972069129347801 2023-01-22 14:55:20.516537: step: 532/466, loss: 0.012271245010197163 2023-01-22 14:55:21.215134: step: 534/466, loss: 0.011294533498585224 2023-01-22 14:55:22.000361: step: 536/466, loss: 0.011247357353568077 2023-01-22 14:55:22.714145: step: 538/466, loss: 0.06178859621286392 2023-01-22 14:55:23.437347: step: 540/466, loss: 
0.14391258358955383 2023-01-22 14:55:24.218599: step: 542/466, loss: 0.05927535891532898 2023-01-22 14:55:24.961257: step: 544/466, loss: 0.5981643795967102 2023-01-22 14:55:25.841633: step: 546/466, loss: 0.017532022669911385 2023-01-22 14:55:26.609384: step: 548/466, loss: 0.029608314856886864 2023-01-22 14:55:27.331757: step: 550/466, loss: 0.03293095901608467 2023-01-22 14:55:28.048978: step: 552/466, loss: 0.07986298203468323 2023-01-22 14:55:28.785356: step: 554/466, loss: 0.07265043258666992 2023-01-22 14:55:29.565505: step: 556/466, loss: 0.07248475402593613 2023-01-22 14:55:30.549338: step: 558/466, loss: 0.011308040469884872 2023-01-22 14:55:31.295688: step: 560/466, loss: 0.020601406693458557 2023-01-22 14:55:32.144349: step: 562/466, loss: 0.08716525137424469 2023-01-22 14:55:32.824900: step: 564/466, loss: 0.00024699614732526243 2023-01-22 14:55:33.562948: step: 566/466, loss: 0.057305797934532166 2023-01-22 14:55:34.308739: step: 568/466, loss: 0.029795540496706963 2023-01-22 14:55:34.999066: step: 570/466, loss: 0.03908243775367737 2023-01-22 14:55:35.769621: step: 572/466, loss: 0.016558783128857613 2023-01-22 14:55:36.598265: step: 574/466, loss: 0.6643418073654175 2023-01-22 14:55:37.427108: step: 576/466, loss: 0.10121889412403107 2023-01-22 14:55:38.280345: step: 578/466, loss: 0.021568842232227325 2023-01-22 14:55:39.074274: step: 580/466, loss: 0.09610091149806976 2023-01-22 14:55:39.848599: step: 582/466, loss: 0.0005419608787633479 2023-01-22 14:55:40.650645: step: 584/466, loss: 0.004922699648886919 2023-01-22 14:55:41.413266: step: 586/466, loss: 0.04917950928211212 2023-01-22 14:55:42.191643: step: 588/466, loss: 0.03489411249756813 2023-01-22 14:55:42.956764: step: 590/466, loss: 0.0357198603451252 2023-01-22 14:55:43.659678: step: 592/466, loss: 0.003807668574154377 2023-01-22 14:55:44.374924: step: 594/466, loss: 0.03101886436343193 2023-01-22 14:55:45.101951: step: 596/466, loss: 0.007635398767888546 2023-01-22 14:55:45.982204: step: 
598/466, loss: 0.08052492141723633 2023-01-22 14:55:46.764536: step: 600/466, loss: 0.0572916604578495 2023-01-22 14:55:47.497535: step: 602/466, loss: 0.0198881383985281 2023-01-22 14:55:48.324534: step: 604/466, loss: 0.04244111105799675 2023-01-22 14:55:49.142036: step: 606/466, loss: 0.014936062507331371 2023-01-22 14:55:49.903014: step: 608/466, loss: 0.007002471946179867 2023-01-22 14:55:50.736196: step: 610/466, loss: 0.015491640195250511 2023-01-22 14:55:51.542698: step: 612/466, loss: 0.02755819819867611 2023-01-22 14:55:52.293734: step: 614/466, loss: 0.0076740216463804245 2023-01-22 14:55:53.124300: step: 616/466, loss: 0.02576759085059166 2023-01-22 14:55:53.846381: step: 618/466, loss: 0.026631657034158707 2023-01-22 14:55:54.629196: step: 620/466, loss: 0.0049104285426437855 2023-01-22 14:55:55.372415: step: 622/466, loss: 0.0017827756237238646 2023-01-22 14:55:56.148490: step: 624/466, loss: 0.039846137166023254 2023-01-22 14:55:56.943969: step: 626/466, loss: 0.0736478939652443 2023-01-22 14:55:57.684386: step: 628/466, loss: 0.026989759877324104 2023-01-22 14:55:58.493885: step: 630/466, loss: 0.0672062411904335 2023-01-22 14:55:59.234819: step: 632/466, loss: 0.03965132683515549 2023-01-22 14:55:59.975963: step: 634/466, loss: 0.047246597707271576 2023-01-22 14:56:00.746040: step: 636/466, loss: 0.012512200511991978 2023-01-22 14:56:01.542251: step: 638/466, loss: 0.07491574436426163 2023-01-22 14:56:02.362678: step: 640/466, loss: 0.21360142529010773 2023-01-22 14:56:03.118094: step: 642/466, loss: 0.021787557750940323 2023-01-22 14:56:03.845867: step: 644/466, loss: 0.08531290292739868 2023-01-22 14:56:04.678893: step: 646/466, loss: 0.09931932389736176 2023-01-22 14:56:05.453264: step: 648/466, loss: 0.04829653725028038 2023-01-22 14:56:06.210190: step: 650/466, loss: 0.028179535642266273 2023-01-22 14:56:06.984584: step: 652/466, loss: 0.06829191744327545 2023-01-22 14:56:07.818627: step: 654/466, loss: 0.007168058305978775 2023-01-22 
14:56:08.607792: step: 656/466, loss: 0.01279025711119175 2023-01-22 14:56:09.334818: step: 658/466, loss: 0.0425628125667572 2023-01-22 14:56:10.140664: step: 660/466, loss: 0.025735652074217796 2023-01-22 14:56:10.957854: step: 662/466, loss: 0.03173866868019104 2023-01-22 14:56:11.740429: step: 664/466, loss: 0.03260715678334236 2023-01-22 14:56:12.449868: step: 666/466, loss: 0.02390960231423378 2023-01-22 14:56:13.242056: step: 668/466, loss: 0.02179085463285446 2023-01-22 14:56:14.054869: step: 670/466, loss: 0.014968041330575943 2023-01-22 14:56:14.766283: step: 672/466, loss: 0.02386454865336418 2023-01-22 14:56:15.445118: step: 674/466, loss: 0.10030235350131989 2023-01-22 14:56:16.179492: step: 676/466, loss: 0.008777577430009842 2023-01-22 14:56:16.973777: step: 678/466, loss: 0.028794730082154274 2023-01-22 14:56:17.734909: step: 680/466, loss: 0.021851878613233566 2023-01-22 14:56:18.513530: step: 682/466, loss: 0.020717613399028778 2023-01-22 14:56:19.260414: step: 684/466, loss: 0.026879925280809402 2023-01-22 14:56:20.106862: step: 686/466, loss: 0.029374847188591957 2023-01-22 14:56:20.906119: step: 688/466, loss: 0.0072135343216359615 2023-01-22 14:56:21.691014: step: 690/466, loss: 0.049425311386585236 2023-01-22 14:56:22.499264: step: 692/466, loss: 0.02336275950074196 2023-01-22 14:56:23.264266: step: 694/466, loss: 0.43844372034072876 2023-01-22 14:56:23.972102: step: 696/466, loss: 0.004980398342013359 2023-01-22 14:56:24.758466: step: 698/466, loss: 0.03083825670182705 2023-01-22 14:56:25.553972: step: 700/466, loss: 0.014827440492808819 2023-01-22 14:56:26.284917: step: 702/466, loss: 0.03472888842225075 2023-01-22 14:56:27.057633: step: 704/466, loss: 0.11552385240793228 2023-01-22 14:56:27.906333: step: 706/466, loss: 0.012255324050784111 2023-01-22 14:56:28.615596: step: 708/466, loss: 0.008407890796661377 2023-01-22 14:56:29.363584: step: 710/466, loss: 0.07686987519264221 2023-01-22 14:56:30.172172: step: 712/466, loss: 
0.03082488477230072 2023-01-22 14:56:30.888632: step: 714/466, loss: 0.02362615428864956 2023-01-22 14:56:31.655421: step: 716/466, loss: 0.04834354668855667 2023-01-22 14:56:32.539113: step: 718/466, loss: 0.012062999419867992 2023-01-22 14:56:33.292541: step: 720/466, loss: 0.02275286428630352 2023-01-22 14:56:34.021661: step: 722/466, loss: 0.15338103473186493 2023-01-22 14:56:34.713852: step: 724/466, loss: 0.013424837961792946 2023-01-22 14:56:35.523696: step: 726/466, loss: 0.09174531698226929 2023-01-22 14:56:36.347315: step: 728/466, loss: 0.05888718366622925 2023-01-22 14:56:37.109225: step: 730/466, loss: 0.014505615457892418 2023-01-22 14:56:37.965140: step: 732/466, loss: 0.01789144054055214 2023-01-22 14:56:38.804764: step: 734/466, loss: 0.011775941587984562 2023-01-22 14:56:39.531480: step: 736/466, loss: 0.000976826879195869 2023-01-22 14:56:40.249743: step: 738/466, loss: 0.00819521676748991 2023-01-22 14:56:41.089349: step: 740/466, loss: 0.37824034690856934 2023-01-22 14:56:41.909972: step: 742/466, loss: 0.008182219229638577 2023-01-22 14:56:42.715177: step: 744/466, loss: 0.002007798058912158 2023-01-22 14:56:43.493630: step: 746/466, loss: 0.0034525133669376373 2023-01-22 14:56:44.198493: step: 748/466, loss: 0.011354020796716213 2023-01-22 14:56:44.958881: step: 750/466, loss: 0.010496980510652065 2023-01-22 14:56:45.749965: step: 752/466, loss: 0.040494054555892944 2023-01-22 14:56:46.571197: step: 754/466, loss: 0.02089555747807026 2023-01-22 14:56:47.367236: step: 756/466, loss: 0.04060179740190506 2023-01-22 14:56:48.239230: step: 758/466, loss: 0.04249696433544159 2023-01-22 14:56:48.993461: step: 760/466, loss: 0.017529740929603577 2023-01-22 14:56:49.791067: step: 762/466, loss: 0.03140799328684807 2023-01-22 14:56:50.484626: step: 764/466, loss: 0.026097454130649567 2023-01-22 14:56:51.257544: step: 766/466, loss: 0.008137117139995098 2023-01-22 14:56:52.012885: step: 768/466, loss: 0.03194596618413925 2023-01-22 14:56:52.828032: 
step: 770/466, loss: 0.03512399643659592 2023-01-22 14:56:53.558960: step: 772/466, loss: 0.08083723485469818 2023-01-22 14:56:54.392161: step: 774/466, loss: 0.10024670511484146 2023-01-22 14:56:55.256726: step: 776/466, loss: 0.01750977709889412 2023-01-22 14:56:55.972935: step: 778/466, loss: 0.0022526816464960575 2023-01-22 14:56:56.647423: step: 780/466, loss: 0.002027718350291252 2023-01-22 14:56:57.383174: step: 782/466, loss: 0.015744207426905632 2023-01-22 14:56:58.114119: step: 784/466, loss: 0.0060505992732942104 2023-01-22 14:56:58.749558: step: 786/466, loss: 0.0058455681428313255 2023-01-22 14:56:59.520479: step: 788/466, loss: 0.0794205516576767 2023-01-22 14:57:00.206274: step: 790/466, loss: 0.05062644183635712 2023-01-22 14:57:00.971964: step: 792/466, loss: 0.10680859535932541 2023-01-22 14:57:01.807774: step: 794/466, loss: 0.4264127016067505 2023-01-22 14:57:02.600472: step: 796/466, loss: 0.04735902324318886 2023-01-22 14:57:03.411684: step: 798/466, loss: 0.29138079285621643 2023-01-22 14:57:04.150920: step: 800/466, loss: 0.06728032231330872 2023-01-22 14:57:04.984888: step: 802/466, loss: 0.0150164058431983 2023-01-22 14:57:05.800054: step: 804/466, loss: 0.03291695564985275 2023-01-22 14:57:06.607976: step: 806/466, loss: 0.030138498172163963 2023-01-22 14:57:07.267096: step: 808/466, loss: 0.022056685760617256 2023-01-22 14:57:07.985014: step: 810/466, loss: 0.0335700586438179 2023-01-22 14:57:08.781731: step: 812/466, loss: 0.028037378564476967 2023-01-22 14:57:09.524845: step: 814/466, loss: 0.20931771397590637 2023-01-22 14:57:10.356004: step: 816/466, loss: 0.15887659788131714 2023-01-22 14:57:11.140446: step: 818/466, loss: 0.05127459764480591 2023-01-22 14:57:11.949026: step: 820/466, loss: 0.025693120434880257 2023-01-22 14:57:12.701796: step: 822/466, loss: 0.05340283364057541 2023-01-22 14:57:13.512684: step: 824/466, loss: 0.08578217029571533 2023-01-22 14:57:14.323489: step: 826/466, loss: 0.15850238502025604 2023-01-22 
14:57:15.063218: step: 828/466, loss: 0.0478825606405735 2023-01-22 14:57:15.843132: step: 830/466, loss: 0.013694366440176964 2023-01-22 14:57:16.584993: step: 832/466, loss: 0.205677792429924 2023-01-22 14:57:17.360558: step: 834/466, loss: 0.20604534447193146 2023-01-22 14:57:18.160378: step: 836/466, loss: 0.0973101332783699 2023-01-22 14:57:19.012918: step: 838/466, loss: 0.1259995847940445 2023-01-22 14:57:19.719459: step: 840/466, loss: 0.006384965963661671 2023-01-22 14:57:20.471562: step: 842/466, loss: 0.22864051163196564 2023-01-22 14:57:21.305268: step: 844/466, loss: 0.015518547967076302 2023-01-22 14:57:22.024764: step: 846/466, loss: 0.02271226979792118 2023-01-22 14:57:22.776073: step: 848/466, loss: 0.11442562937736511 2023-01-22 14:57:23.461775: step: 850/466, loss: 0.08857929706573486 2023-01-22 14:57:24.214085: step: 852/466, loss: 0.021136639639735222 2023-01-22 14:57:25.001144: step: 854/466, loss: 0.02740650251507759 2023-01-22 14:57:25.701714: step: 856/466, loss: 0.02360440418124199 2023-01-22 14:57:26.479764: step: 858/466, loss: 0.09424004703760147 2023-01-22 14:57:27.237056: step: 860/466, loss: 0.006110194604843855 2023-01-22 14:57:28.011531: step: 862/466, loss: 0.04518682882189751 2023-01-22 14:57:28.800501: step: 864/466, loss: 0.03704385831952095 2023-01-22 14:57:29.593173: step: 866/466, loss: 0.024167396128177643 2023-01-22 14:57:30.372659: step: 868/466, loss: 0.04980127885937691 2023-01-22 14:57:31.089641: step: 870/466, loss: 0.003607484046369791 2023-01-22 14:57:31.863511: step: 872/466, loss: 0.03922824189066887 2023-01-22 14:57:32.622642: step: 874/466, loss: 0.03403126075863838 2023-01-22 14:57:33.407103: step: 876/466, loss: 0.02481072209775448 2023-01-22 14:57:34.127255: step: 878/466, loss: 0.012723129242658615 2023-01-22 14:57:34.951253: step: 880/466, loss: 0.023607250303030014 2023-01-22 14:57:35.684773: step: 882/466, loss: 0.01605868898332119 2023-01-22 14:57:36.352269: step: 884/466, loss: 0.017645277082920074 
2023-01-22 14:57:37.126594: step: 886/466, loss: 0.02243782766163349 2023-01-22 14:57:37.860946: step: 888/466, loss: 0.40102073550224304 2023-01-22 14:57:38.590290: step: 890/466, loss: 0.03958764299750328 2023-01-22 14:57:39.335018: step: 892/466, loss: 0.006702665239572525 2023-01-22 14:57:40.093145: step: 894/466, loss: 0.025415126234292984 2023-01-22 14:57:40.832199: step: 896/466, loss: 0.13924889266490936 2023-01-22 14:57:41.623116: step: 898/466, loss: 0.00555342948064208 2023-01-22 14:57:42.368209: step: 900/466, loss: 0.012884487397968769 2023-01-22 14:57:43.081366: step: 902/466, loss: 1.2119906386942603e-05 2023-01-22 14:57:43.806597: step: 904/466, loss: 0.25289541482925415 2023-01-22 14:57:44.553119: step: 906/466, loss: 0.023039881139993668 2023-01-22 14:57:45.334746: step: 908/466, loss: 0.027683792635798454 2023-01-22 14:57:46.007798: step: 910/466, loss: 0.07846502214670181 2023-01-22 14:57:46.700909: step: 912/466, loss: 0.003875490976497531 2023-01-22 14:57:47.424270: step: 914/466, loss: 0.0930880457162857 2023-01-22 14:57:48.382395: step: 916/466, loss: 0.011779635213315487 2023-01-22 14:57:49.179329: step: 918/466, loss: 0.039599090814590454 2023-01-22 14:57:49.897202: step: 920/466, loss: 0.002111016307026148 2023-01-22 14:57:50.640919: step: 922/466, loss: 0.007790920324623585 2023-01-22 14:57:51.393504: step: 924/466, loss: 0.0693824514746666 2023-01-22 14:57:52.204408: step: 926/466, loss: 0.13453641533851624 2023-01-22 14:57:53.006610: step: 928/466, loss: 0.16406899690628052 2023-01-22 14:57:53.798945: step: 930/466, loss: 0.04843205586075783 2023-01-22 14:57:54.476631: step: 932/466, loss: 0.07312513887882233 ================================================== Loss: 0.062 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3017218521421108, 'r': 0.33206579552642174, 'f1': 0.31616743313897794}, 'combined': 0.23296547704977322, 'epoch': 24} Test Chinese: 
{'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34102868261607494, 'r': 0.2982155579586456, 'f1': 0.3181884244270076}, 'combined': 0.19556947062342908, 'epoch': 24} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.29074851855299083, 'r': 0.34867754027607245, 'f1': 0.3170889796816052}, 'combined': 0.23364451134434067, 'epoch': 24} Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.324051962200322, 'r': 0.3027105851403355, 'f1': 0.3130179348135727}, 'combined': 0.1923915111537081, 'epoch': 24} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2755102040816326, 'r': 0.38571428571428573, 'f1': 0.32142857142857145}, 'combined': 0.2142857142857143, 'epoch': 24} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2712765957446808, 'r': 0.5543478260869565, 'f1': 0.36428571428571427}, 'combined': 0.18214285714285713, 'epoch': 24} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24} New best russian model... 
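The 'combined' values in the evaluation dicts above are consistent with multiplying the template F1 by the slot F1, where each F1 is the usual harmonic mean of precision and recall. A minimal sketch of that computation (function names are illustrative, not taken from the training code), checked against the Dev Chinese numbers for epoch 24:

```python
# Sketch of how the per-language scores in this log appear to be combined.
# Assumption: 'combined' = template_f1 * slot_f1; names below are hypothetical.

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def combined_score(template_f1: float, slot_f1: float) -> float:
    # The 'combined' fields in the log match the product of the two F1 scores.
    return template_f1 * slot_f1

# Dev Chinese, epoch 24, values taken from the log above:
tpl = f1(1.0, 0.5833333333333334)                    # ~0.7368421052631579
slot = f1(0.3017218521421108, 0.33206579552642174)   # ~0.31616743313897794
print(combined_score(tpl, slot))                     # ~0.23296547704977322
```

This is only a reconstruction from the logged numbers; the actual scorer may compute the same quantities in a different order.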
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': 
{'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24} ****************************** Epoch: 25 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 15:00:45.867243: step: 2/466, loss: 0.0048638260923326015 2023-01-22 15:00:46.559570: step: 4/466, loss: 1.9489420652389526 2023-01-22 15:00:47.318894: step: 6/466, loss: 0.01325779128819704 2023-01-22 15:00:48.226530: step: 8/466, loss: 0.06886614114046097 2023-01-22 15:00:48.884192: step: 10/466, loss: 0.0005495556979440153 2023-01-22 15:00:49.614953: step: 12/466, loss: 0.0020770521368831396 2023-01-22 15:00:50.435021: step: 14/466, loss: 0.009379208087921143 2023-01-22 15:00:51.204373: step: 16/466, loss: 0.044607535004615784 2023-01-22 15:00:51.962194: step: 18/466, loss: 0.05780674144625664 2023-01-22 15:00:52.680782: step: 20/466, loss: 0.0162581168115139 2023-01-22 15:00:53.441841: step: 22/466, loss: 0.008655939251184464 2023-01-22 15:00:54.425411: step: 24/466, loss: 0.0031684592831879854 2023-01-22 15:00:55.148996: step: 26/466, loss: 0.01678566262125969 2023-01-22 15:00:55.909609: step: 28/466, loss: 0.013557526282966137 2023-01-22 15:00:56.629076: step: 30/466, loss: 0.0748433992266655 2023-01-22 15:00:57.317853: step: 32/466, loss: 0.009114764630794525 2023-01-22 15:00:58.126086: step: 34/466, loss: 0.02550913766026497 2023-01-22 15:00:58.811879: step: 36/466, loss: 0.004271187819540501 2023-01-22 15:00:59.563404: step: 38/466, loss: 0.009756631217896938 2023-01-22 15:01:00.296411: step: 40/466, loss: 0.044250279664993286 2023-01-22 15:01:01.074517: step: 42/466, loss: 0.030406568199396133 2023-01-22 15:01:01.727011: step: 44/466, loss: 0.04159717634320259 2023-01-22 15:01:02.444266: step: 46/466, loss: 0.03348518908023834 2023-01-22 15:01:03.257376: step: 48/466, loss: 
0.0271090529859066 2023-01-22 15:01:03.977863: step: 50/466, loss: 0.04847847297787666 2023-01-22 15:01:04.719781: step: 52/466, loss: 0.003952601924538612 2023-01-22 15:01:05.431688: step: 54/466, loss: 0.0021677513141185045 2023-01-22 15:01:06.154928: step: 56/466, loss: 0.004650316201150417 2023-01-22 15:01:06.895346: step: 58/466, loss: 0.13764320313930511 2023-01-22 15:01:07.637482: step: 60/466, loss: 0.023560674861073494 2023-01-22 15:01:08.485220: step: 62/466, loss: 0.01893126778304577 2023-01-22 15:01:09.261165: step: 64/466, loss: 0.1631479263305664 2023-01-22 15:01:10.000026: step: 66/466, loss: 0.004469368141144514 2023-01-22 15:01:10.697430: step: 68/466, loss: 0.05100572481751442 2023-01-22 15:01:11.419987: step: 70/466, loss: 0.0174653809517622 2023-01-22 15:01:12.202592: step: 72/466, loss: 0.01131537463515997 2023-01-22 15:01:12.895865: step: 74/466, loss: 0.014669515192508698 2023-01-22 15:01:13.729815: step: 76/466, loss: 0.39461952447891235 2023-01-22 15:01:14.390753: step: 78/466, loss: 0.004725391045212746 2023-01-22 15:01:15.162331: step: 80/466, loss: 0.16633807122707367 2023-01-22 15:01:15.887123: step: 82/466, loss: 0.03558366745710373 2023-01-22 15:01:16.623625: step: 84/466, loss: 0.03334732726216316 2023-01-22 15:01:17.401093: step: 86/466, loss: 0.01573135145008564 2023-01-22 15:01:18.145391: step: 88/466, loss: 0.01893058978021145 2023-01-22 15:01:18.868445: step: 90/466, loss: 0.05342453345656395 2023-01-22 15:01:19.610565: step: 92/466, loss: 0.004875754471868277 2023-01-22 15:01:20.352315: step: 94/466, loss: 0.02153726853430271 2023-01-22 15:01:21.104506: step: 96/466, loss: 0.021911753341555595 2023-01-22 15:01:21.909186: step: 98/466, loss: 0.001786972163245082 2023-01-22 15:01:22.590194: step: 100/466, loss: 0.0024792884942144156 2023-01-22 15:01:23.403483: step: 102/466, loss: 0.008267298340797424 2023-01-22 15:01:24.301250: step: 104/466, loss: 0.011938805691897869 2023-01-22 15:01:24.989413: step: 106/466, loss: 
0.01350637711584568 2023-01-22 15:01:25.751254: step: 108/466, loss: 0.014575090259313583 2023-01-22 15:01:26.514867: step: 110/466, loss: 0.007169341668486595 2023-01-22 15:01:27.319906: step: 112/466, loss: 0.010071353055536747 2023-01-22 15:01:28.047832: step: 114/466, loss: 0.01568801887333393 2023-01-22 15:01:28.872256: step: 116/466, loss: 0.00023800335475243628 2023-01-22 15:01:29.599520: step: 118/466, loss: 0.0017856850754469633 2023-01-22 15:01:30.439965: step: 120/466, loss: 0.015911098569631577 2023-01-22 15:01:31.239051: step: 122/466, loss: 0.005675219465047121 2023-01-22 15:01:31.989353: step: 124/466, loss: 0.001905403914861381 2023-01-22 15:01:32.698400: step: 126/466, loss: 0.03077961876988411 2023-01-22 15:01:33.451553: step: 128/466, loss: 0.024013573303818703 2023-01-22 15:01:34.166735: step: 130/466, loss: 0.010623539797961712 2023-01-22 15:01:34.940643: step: 132/466, loss: 0.018636681139469147 2023-01-22 15:01:35.712059: step: 134/466, loss: 0.05404209718108177 2023-01-22 15:01:36.512499: step: 136/466, loss: 0.1491851508617401 2023-01-22 15:01:37.208518: step: 138/466, loss: 0.13492342829704285 2023-01-22 15:01:38.012450: step: 140/466, loss: 0.04337337613105774 2023-01-22 15:01:38.815465: step: 142/466, loss: 0.021991444751620293 2023-01-22 15:01:39.564718: step: 144/466, loss: 0.02849423885345459 2023-01-22 15:01:40.320756: step: 146/466, loss: 0.048355210572481155 2023-01-22 15:01:41.038493: step: 148/466, loss: 0.011831236071884632 2023-01-22 15:01:41.783132: step: 150/466, loss: 0.026754720136523247 2023-01-22 15:01:42.493682: step: 152/466, loss: 0.003418155713006854 2023-01-22 15:01:43.230431: step: 154/466, loss: 0.0065348828211426735 2023-01-22 15:01:44.084429: step: 156/466, loss: 0.034028515219688416 2023-01-22 15:01:44.871315: step: 158/466, loss: 0.00662602111697197 2023-01-22 15:01:45.553666: step: 160/466, loss: 0.12418445199728012 2023-01-22 15:01:46.344275: step: 162/466, loss: 0.006915715057402849 2023-01-22 
15:01:47.191013: step: 164/466, loss: 0.01204296387732029 2023-01-22 15:01:47.994924: step: 166/466, loss: 0.04335113987326622 2023-01-22 15:01:48.724901: step: 168/466, loss: 0.06921584904193878 2023-01-22 15:01:49.545010: step: 170/466, loss: 0.002505009062588215 2023-01-22 15:01:50.331606: step: 172/466, loss: 0.02193089760839939 2023-01-22 15:01:51.081920: step: 174/466, loss: 0.001341413240879774 2023-01-22 15:01:51.764378: step: 176/466, loss: 0.029483824968338013 2023-01-22 15:01:52.502680: step: 178/466, loss: 0.0035928364377468824 2023-01-22 15:01:53.303911: step: 180/466, loss: 0.0035043051466345787 2023-01-22 15:01:54.132809: step: 182/466, loss: 0.01978055201470852 2023-01-22 15:01:55.005735: step: 184/466, loss: 0.2719780504703522 2023-01-22 15:01:55.762233: step: 186/466, loss: 0.01555517129600048 2023-01-22 15:01:56.641021: step: 188/466, loss: 0.02810639515519142 2023-01-22 15:01:57.428885: step: 190/466, loss: 0.01859775185585022 2023-01-22 15:01:58.198274: step: 192/466, loss: 0.024280639365315437 2023-01-22 15:01:58.955635: step: 194/466, loss: 0.000761770352255553 2023-01-22 15:01:59.757320: step: 196/466, loss: 0.08165077865123749 2023-01-22 15:02:00.472830: step: 198/466, loss: 0.0318489633500576 2023-01-22 15:02:01.169949: step: 200/466, loss: 0.0034272021148353815 2023-01-22 15:02:01.947425: step: 202/466, loss: 0.006649025250226259 2023-01-22 15:02:02.754572: step: 204/466, loss: 0.028749840334057808 2023-01-22 15:02:03.571174: step: 206/466, loss: 0.020333321765065193 2023-01-22 15:02:04.326062: step: 208/466, loss: 0.015936290845274925 2023-01-22 15:02:05.011986: step: 210/466, loss: 0.37533944845199585 2023-01-22 15:02:05.825678: step: 212/466, loss: 0.03870442137122154 2023-01-22 15:02:06.591926: step: 214/466, loss: 0.07513385266065598 2023-01-22 15:02:07.367061: step: 216/466, loss: 0.035680994391441345 2023-01-22 15:02:08.065706: step: 218/466, loss: 0.044503144919872284 2023-01-22 15:02:08.904728: step: 220/466, loss: 
0.011080436408519745 2023-01-22 15:02:09.705771: step: 222/466, loss: 0.08355198055505753 2023-01-22 15:02:10.482532: step: 224/466, loss: 0.005143460351973772 2023-01-22 15:02:11.200906: step: 226/466, loss: 0.05669359862804413 2023-01-22 15:02:11.960782: step: 228/466, loss: 0.007362133823335171 2023-01-22 15:02:12.726807: step: 230/466, loss: 0.1185983419418335 2023-01-22 15:02:13.501560: step: 232/466, loss: 0.019225867465138435 2023-01-22 15:02:14.256034: step: 234/466, loss: 0.0384087935090065 2023-01-22 15:02:15.017825: step: 236/466, loss: 0.009993866086006165 2023-01-22 15:02:15.794280: step: 238/466, loss: 0.007030665874481201 2023-01-22 15:02:16.501365: step: 240/466, loss: 0.002258468419313431 2023-01-22 15:02:17.364473: step: 242/466, loss: 0.005683743394911289 2023-01-22 15:02:18.119670: step: 244/466, loss: 0.0037180185317993164 2023-01-22 15:02:18.929765: step: 246/466, loss: 0.07294444739818573 2023-01-22 15:02:19.594204: step: 248/466, loss: 0.002524001756682992 2023-01-22 15:02:20.347341: step: 250/466, loss: 0.017223449423909187 2023-01-22 15:02:21.065814: step: 252/466, loss: 0.03425934538245201 2023-01-22 15:02:21.901268: step: 254/466, loss: 0.0761067345738411 2023-01-22 15:02:22.691245: step: 256/466, loss: 0.0428534634411335 2023-01-22 15:02:23.384151: step: 258/466, loss: 0.0031506107188761234 2023-01-22 15:02:24.280968: step: 260/466, loss: 0.04430555924773216 2023-01-22 15:02:25.005941: step: 262/466, loss: 0.04317544400691986 2023-01-22 15:02:25.799041: step: 264/466, loss: 0.018620461225509644 2023-01-22 15:02:26.598373: step: 266/466, loss: 0.015939462929964066 2023-01-22 15:02:27.311246: step: 268/466, loss: 0.013469705358147621 2023-01-22 15:02:28.081823: step: 270/466, loss: 0.031078225001692772 2023-01-22 15:02:28.887576: step: 272/466, loss: 0.038427598774433136 2023-01-22 15:02:29.674586: step: 274/466, loss: 0.001793315983377397 2023-01-22 15:02:30.434622: step: 276/466, loss: 0.010245811194181442 2023-01-22 15:02:31.132165: 
step: 278/466, loss: 0.009740750305354595 2023-01-22 15:02:31.881970: step: 280/466, loss: 0.032210443168878555 2023-01-22 15:02:32.736537: step: 282/466, loss: 0.30516213178634644 2023-01-22 15:02:33.496818: step: 284/466, loss: 0.05225621536374092 2023-01-22 15:02:34.266548: step: 286/466, loss: 0.0587480403482914 2023-01-22 15:02:35.025894: step: 288/466, loss: 0.004880858585238457 2023-01-22 15:02:35.800120: step: 290/466, loss: 0.001988980220630765 2023-01-22 15:02:36.480737: step: 292/466, loss: 0.00037727158633060753 2023-01-22 15:02:37.255598: step: 294/466, loss: 0.010188672691583633 2023-01-22 15:02:37.997424: step: 296/466, loss: 0.023995572701096535 2023-01-22 15:02:38.710220: step: 298/466, loss: 0.4974815547466278 2023-01-22 15:02:39.526075: step: 300/466, loss: 0.004259779583662748 2023-01-22 15:02:40.269804: step: 302/466, loss: 0.025984996929764748 2023-01-22 15:02:40.989777: step: 304/466, loss: 0.08418071269989014 2023-01-22 15:02:41.756263: step: 306/466, loss: 0.06732352077960968 2023-01-22 15:02:42.523543: step: 308/466, loss: 0.028239449486136436 2023-01-22 15:02:43.243825: step: 310/466, loss: 0.011483085341751575 2023-01-22 15:02:44.012573: step: 312/466, loss: 0.07952386140823364 2023-01-22 15:02:44.754176: step: 314/466, loss: 0.19841830432415009 2023-01-22 15:02:45.481306: step: 316/466, loss: 0.006301587913185358 2023-01-22 15:02:46.195632: step: 318/466, loss: 0.018144994974136353 2023-01-22 15:02:46.911225: step: 320/466, loss: 0.059368591755628586 2023-01-22 15:02:47.585836: step: 322/466, loss: 0.001227022847160697 2023-01-22 15:02:48.333125: step: 324/466, loss: 0.03310411050915718 2023-01-22 15:02:49.040795: step: 326/466, loss: 0.012070560827851295 2023-01-22 15:02:49.779916: step: 328/466, loss: 0.369498610496521 2023-01-22 15:02:50.535142: step: 330/466, loss: 0.05131317675113678 2023-01-22 15:02:51.289951: step: 332/466, loss: 0.012179000303149223 2023-01-22 15:02:52.104036: step: 334/466, loss: 0.011431191116571426 2023-01-22 
15:02:52.953237: step: 336/466, loss: 0.09756392985582352 2023-01-22 15:02:53.692782: step: 338/466, loss: 0.3518053889274597 2023-01-22 15:02:54.440833: step: 340/466, loss: 0.0009419164853170514 2023-01-22 15:02:55.218978: step: 342/466, loss: 0.003747928887605667 2023-01-22 15:02:55.988170: step: 344/466, loss: 0.0055611394345760345 2023-01-22 15:02:56.745433: step: 346/466, loss: 0.010563087649643421 2023-01-22 15:02:57.488199: step: 348/466, loss: 0.03362543135881424 2023-01-22 15:02:58.273744: step: 350/466, loss: 0.0033438573591411114 2023-01-22 15:02:58.992051: step: 352/466, loss: 0.050573818385601044 2023-01-22 15:02:59.734611: step: 354/466, loss: 0.22646935284137726 2023-01-22 15:03:00.492975: step: 356/466, loss: 0.026994843035936356 2023-01-22 15:03:01.293354: step: 358/466, loss: 0.027258573099970818 2023-01-22 15:03:02.002354: step: 360/466, loss: 0.027738217264413834 2023-01-22 15:03:02.721294: step: 362/466, loss: 0.008679354563355446 2023-01-22 15:03:03.434562: step: 364/466, loss: 0.025107435882091522 2023-01-22 15:03:04.179168: step: 366/466, loss: 0.575447142124176 2023-01-22 15:03:04.917481: step: 368/466, loss: 0.018051810562610626 2023-01-22 15:03:05.829701: step: 370/466, loss: 0.009606994688510895 2023-01-22 15:03:06.628053: step: 372/466, loss: 0.10423516482114792 2023-01-22 15:03:07.385205: step: 374/466, loss: 0.0657115951180458 2023-01-22 15:03:08.215365: step: 376/466, loss: 0.002465600613504648 2023-01-22 15:03:09.044917: step: 378/466, loss: 0.0224370826035738 2023-01-22 15:03:09.846758: step: 380/466, loss: 0.05165081098675728 2023-01-22 15:03:10.649134: step: 382/466, loss: 0.037007659673690796 2023-01-22 15:03:11.501341: step: 384/466, loss: 0.020858481526374817 2023-01-22 15:03:12.249449: step: 386/466, loss: 0.012187846004962921 2023-01-22 15:03:12.956490: step: 388/466, loss: 0.020809736102819443 2023-01-22 15:03:13.751692: step: 390/466, loss: 0.015969304367899895 2023-01-22 15:03:14.545955: step: 392/466, loss: 
0.0657949447631836 2023-01-22 15:03:15.329999: step: 394/466, loss: 0.005116751417517662 2023-01-22 15:03:16.048898: step: 396/466, loss: 0.012202229350805283 2023-01-22 15:03:16.777602: step: 398/466, loss: 0.02426939085125923 2023-01-22 15:03:17.491686: step: 400/466, loss: 0.0019004530040547252 2023-01-22 15:03:18.211718: step: 402/466, loss: 0.04898487403988838 2023-01-22 15:03:18.899801: step: 404/466, loss: 0.13768237829208374 2023-01-22 15:03:19.777312: step: 406/466, loss: 0.040279969573020935 2023-01-22 15:03:20.488739: step: 408/466, loss: 0.020777981728315353 2023-01-22 15:03:21.176654: step: 410/466, loss: 0.003094709012657404 2023-01-22 15:03:21.936157: step: 412/466, loss: 0.04134169965982437 2023-01-22 15:03:22.783494: step: 414/466, loss: 0.051051847636699677 2023-01-22 15:03:23.563249: step: 416/466, loss: 0.032947517931461334 2023-01-22 15:03:24.362297: step: 418/466, loss: 0.022777795791625977 2023-01-22 15:03:25.128636: step: 420/466, loss: 0.03082258068025112 2023-01-22 15:03:25.928511: step: 422/466, loss: 0.017670484259724617 2023-01-22 15:03:26.615463: step: 424/466, loss: 0.05445479974150658 2023-01-22 15:03:27.370058: step: 426/466, loss: 0.02118140459060669 2023-01-22 15:03:28.054700: step: 428/466, loss: 0.037338707596063614 2023-01-22 15:03:28.815325: step: 430/466, loss: 0.05138538032770157 2023-01-22 15:03:29.552864: step: 432/466, loss: 0.04034169390797615 2023-01-22 15:03:30.450762: step: 434/466, loss: 0.19528257846832275 2023-01-22 15:03:31.179545: step: 436/466, loss: 0.016089381650090218 2023-01-22 15:03:31.948633: step: 438/466, loss: 0.015455513261258602 2023-01-22 15:03:32.680958: step: 440/466, loss: 0.01979210413992405 2023-01-22 15:03:33.369179: step: 442/466, loss: 0.004141667392104864 2023-01-22 15:03:34.178024: step: 444/466, loss: 0.040368158370256424 2023-01-22 15:03:34.927359: step: 446/466, loss: 0.030878448858857155 2023-01-22 15:03:35.691805: step: 448/466, loss: 0.014130041003227234 2023-01-22 15:03:36.505601: 
step: 450/466, loss: 0.043796684592962265 2023-01-22 15:03:37.311966: step: 452/466, loss: 0.1610839068889618 2023-01-22 15:03:38.058874: step: 454/466, loss: 0.013636937364935875 2023-01-22 15:03:38.840536: step: 456/466, loss: 0.06379681825637817 2023-01-22 15:03:39.706437: step: 458/466, loss: 0.0052261208184063435 2023-01-22 15:03:40.613485: step: 460/466, loss: 0.6101828217506409 2023-01-22 15:03:41.394700: step: 462/466, loss: 0.03779454901814461 2023-01-22 15:03:42.194597: step: 464/466, loss: 0.03427768871188164 2023-01-22 15:03:42.963859: step: 466/466, loss: 0.16591612994670868 2023-01-22 15:03:43.745983: step: 468/466, loss: 0.03611454367637634 2023-01-22 15:03:44.475219: step: 470/466, loss: 0.033405475318431854 2023-01-22 15:03:45.272244: step: 472/466, loss: 0.05486998334527016 2023-01-22 15:03:46.108165: step: 474/466, loss: 0.05461619049310684 2023-01-22 15:03:46.862092: step: 476/466, loss: 0.33534881472587585 2023-01-22 15:03:47.641006: step: 478/466, loss: 0.020561659708619118 2023-01-22 15:03:48.414354: step: 480/466, loss: 0.008116367273032665 2023-01-22 15:03:49.203276: step: 482/466, loss: 0.22204464673995972 2023-01-22 15:03:49.931969: step: 484/466, loss: 0.01090270560234785 2023-01-22 15:03:50.581782: step: 486/466, loss: 0.004103204235434532 2023-01-22 15:03:51.328651: step: 488/466, loss: 0.005206770729273558 2023-01-22 15:03:52.039707: step: 490/466, loss: 0.00061570800608024 2023-01-22 15:03:52.769816: step: 492/466, loss: 0.014579207636415958 2023-01-22 15:03:53.623353: step: 494/466, loss: 0.08810116350650787 2023-01-22 15:03:54.427683: step: 496/466, loss: 0.058866944164037704 2023-01-22 15:03:55.232275: step: 498/466, loss: 0.04130866751074791 2023-01-22 15:03:55.933890: step: 500/466, loss: 0.032959774136543274 2023-01-22 15:03:56.655355: step: 502/466, loss: 0.026765989139676094 2023-01-22 15:03:57.330092: step: 504/466, loss: 0.042060643434524536 2023-01-22 15:03:58.094099: step: 506/466, loss: 0.015514878556132317 2023-01-22 
15:03:58.852429: step: 508/466, loss: 0.009979259222745895 2023-01-22 15:03:59.580299: step: 510/466, loss: 0.08473718911409378 2023-01-22 15:04:00.322538: step: 512/466, loss: 0.010136888362467289 2023-01-22 15:04:01.113825: step: 514/466, loss: 0.15655040740966797 2023-01-22 15:04:01.933919: step: 516/466, loss: 0.009329917840659618 2023-01-22 15:04:02.649891: step: 518/466, loss: 0.049568429589271545 2023-01-22 15:04:03.481127: step: 520/466, loss: 0.04900914058089256 2023-01-22 15:04:04.225071: step: 522/466, loss: 0.0602293498814106 2023-01-22 15:04:05.118198: step: 524/466, loss: 0.031564585864543915 2023-01-22 15:04:05.849068: step: 526/466, loss: 0.015625080093741417 2023-01-22 15:04:06.570664: step: 528/466, loss: 0.822575569152832 2023-01-22 15:04:07.311911: step: 530/466, loss: 0.02063736692070961 2023-01-22 15:04:08.037498: step: 532/466, loss: 0.023112384602427483 2023-01-22 15:04:08.790405: step: 534/466, loss: 0.013427951373159885 2023-01-22 15:04:09.685392: step: 536/466, loss: 0.0413489006459713 2023-01-22 15:04:10.503477: step: 538/466, loss: 0.0034788185730576515 2023-01-22 15:04:11.249072: step: 540/466, loss: 0.018048470839858055 2023-01-22 15:04:11.978615: step: 542/466, loss: 0.029570063576102257 2023-01-22 15:04:12.780477: step: 544/466, loss: 0.0032845090609043837 2023-01-22 15:04:13.531505: step: 546/466, loss: 0.028534725308418274 2023-01-22 15:04:14.283443: step: 548/466, loss: 0.13171236217021942 2023-01-22 15:04:15.023593: step: 550/466, loss: 0.002139911288395524 2023-01-22 15:04:15.729445: step: 552/466, loss: 0.020025255158543587 2023-01-22 15:04:16.483649: step: 554/466, loss: 0.1817851960659027 2023-01-22 15:04:17.235610: step: 556/466, loss: 0.014328244142234325 2023-01-22 15:04:18.048789: step: 558/466, loss: 0.060344427824020386 2023-01-22 15:04:19.007647: step: 560/466, loss: 0.014726397581398487 2023-01-22 15:04:19.815848: step: 562/466, loss: 0.0007497974438592792 2023-01-22 15:04:20.601405: step: 564/466, loss: 
0.007038436364382505 2023-01-22 15:04:21.411880: step: 566/466, loss: 0.020402414724230766 2023-01-22 15:04:22.190695: step: 568/466, loss: 0.005311600863933563 2023-01-22 15:04:22.894891: step: 570/466, loss: 0.001791072660125792 2023-01-22 15:04:23.608214: step: 572/466, loss: 0.0005361451185308397 2023-01-22 15:04:24.433181: step: 574/466, loss: 0.03888686001300812 2023-01-22 15:04:25.163214: step: 576/466, loss: 0.07183265686035156 2023-01-22 15:04:25.929877: step: 578/466, loss: 0.009103440679609776 2023-01-22 15:04:26.625712: step: 580/466, loss: 0.007811566349118948 2023-01-22 15:04:27.336163: step: 582/466, loss: 0.02086419053375721 2023-01-22 15:04:28.100679: step: 584/466, loss: 0.029495568946003914 2023-01-22 15:04:28.794129: step: 586/466, loss: 0.0748797282576561 2023-01-22 15:04:29.534255: step: 588/466, loss: 0.015807198360562325 2023-01-22 15:04:30.214136: step: 590/466, loss: 0.004969421774148941 2023-01-22 15:04:30.935099: step: 592/466, loss: 0.02605554275214672 2023-01-22 15:04:31.749960: step: 594/466, loss: 0.07038115710020065 2023-01-22 15:04:32.506953: step: 596/466, loss: 0.025732913985848427 2023-01-22 15:04:33.296713: step: 598/466, loss: 0.05297987535595894 2023-01-22 15:04:34.127444: step: 600/466, loss: 0.0939839631319046 2023-01-22 15:04:34.898848: step: 602/466, loss: 0.06241846829652786 2023-01-22 15:04:35.694869: step: 604/466, loss: 0.010820649564266205 2023-01-22 15:04:36.404692: step: 606/466, loss: 0.028127994388341904 2023-01-22 15:04:37.165240: step: 608/466, loss: 0.019125230610370636 2023-01-22 15:04:38.031454: step: 610/466, loss: 0.0562971793115139 2023-01-22 15:04:38.865966: step: 612/466, loss: 0.026095090433955193 2023-01-22 15:04:39.617163: step: 614/466, loss: 0.03391076251864433 2023-01-22 15:04:40.339948: step: 616/466, loss: 0.8686801791191101 2023-01-22 15:04:41.104774: step: 618/466, loss: 0.011446290649473667 2023-01-22 15:04:41.860988: step: 620/466, loss: 0.12559671700000763 2023-01-22 15:04:42.575546: step: 
622/466, loss: 0.037010177969932556 2023-01-22 15:04:43.431598: step: 624/466, loss: 0.03859866037964821 2023-01-22 15:04:44.242945: step: 626/466, loss: 0.013249721378087997 2023-01-22 15:04:45.045575: step: 628/466, loss: 0.013942176476120949 2023-01-22 15:04:45.694164: step: 630/466, loss: 0.019450657069683075 2023-01-22 15:04:46.461510: step: 632/466, loss: 0.01676585152745247 2023-01-22 15:04:47.142200: step: 634/466, loss: 0.002062909072265029 2023-01-22 15:04:47.978440: step: 636/466, loss: 0.03258739411830902 2023-01-22 15:04:48.750064: step: 638/466, loss: 0.04516744613647461 2023-01-22 15:04:49.544588: step: 640/466, loss: 0.03640659898519516 2023-01-22 15:04:50.220620: step: 642/466, loss: 0.03564944118261337 2023-01-22 15:04:50.974122: step: 644/466, loss: 0.008748043328523636 2023-01-22 15:04:51.833516: step: 646/466, loss: 0.032235562801361084 2023-01-22 15:04:52.643954: step: 648/466, loss: 0.07947038114070892 2023-01-22 15:04:53.421223: step: 650/466, loss: 0.03167068213224411 2023-01-22 15:04:54.222433: step: 652/466, loss: 0.011534439399838448 2023-01-22 15:04:54.976318: step: 654/466, loss: 0.19016428291797638 2023-01-22 15:04:55.722523: step: 656/466, loss: 0.16548942029476166 2023-01-22 15:04:56.522848: step: 658/466, loss: 0.02085087075829506 2023-01-22 15:04:57.208476: step: 660/466, loss: 0.01081312820315361 2023-01-22 15:04:57.955737: step: 662/466, loss: 0.016461463645100594 2023-01-22 15:04:58.616228: step: 664/466, loss: 0.04506843909621239 2023-01-22 15:04:59.330998: step: 666/466, loss: 0.018984554335474968 2023-01-22 15:05:00.075443: step: 668/466, loss: 0.010795004665851593 2023-01-22 15:05:00.845245: step: 670/466, loss: 0.00987847801297903 2023-01-22 15:05:01.620895: step: 672/466, loss: 0.08290861546993256 2023-01-22 15:05:02.400573: step: 674/466, loss: 0.0027949621435254812 2023-01-22 15:05:03.158449: step: 676/466, loss: 0.04666345939040184 2023-01-22 15:05:03.944978: step: 678/466, loss: 0.015100638382136822 2023-01-22 
15:05:04.713557: step: 680/466, loss: 0.03496446833014488 2023-01-22 15:05:05.498034: step: 682/466, loss: 0.03858804330229759 2023-01-22 15:05:06.272314: step: 684/466, loss: 0.02645958960056305 2023-01-22 15:05:07.094940: step: 686/466, loss: 0.06158650666475296 2023-01-22 15:05:07.895685: step: 688/466, loss: 0.024011608213186264 2023-01-22 15:05:08.612449: step: 690/466, loss: 0.010011572390794754 2023-01-22 15:05:09.342035: step: 692/466, loss: 0.0005916806985624135 2023-01-22 15:05:10.082404: step: 694/466, loss: 0.02358367294073105 2023-01-22 15:05:10.900974: step: 696/466, loss: 0.11142602562904358 2023-01-22 15:05:11.649600: step: 698/466, loss: 0.037874944508075714 2023-01-22 15:05:12.515321: step: 700/466, loss: 0.012585917487740517 2023-01-22 15:05:13.207675: step: 702/466, loss: 0.011423270218074322 2023-01-22 15:05:13.952081: step: 704/466, loss: 0.020701391622424126 2023-01-22 15:05:14.765094: step: 706/466, loss: 0.04733499139547348 2023-01-22 15:05:15.510841: step: 708/466, loss: 0.29787299036979675 2023-01-22 15:05:16.236051: step: 710/466, loss: 0.03941582143306732 2023-01-22 15:05:16.939778: step: 712/466, loss: 0.011034637689590454 2023-01-22 15:05:17.712913: step: 714/466, loss: 0.0179904717952013 2023-01-22 15:05:18.495115: step: 716/466, loss: 0.06562364846467972 2023-01-22 15:05:19.284784: step: 718/466, loss: 0.018577704206109047 2023-01-22 15:05:19.944827: step: 720/466, loss: 0.007526410277932882 2023-01-22 15:05:20.684169: step: 722/466, loss: 0.041528038680553436 2023-01-22 15:05:21.398954: step: 724/466, loss: 0.001094332430511713 2023-01-22 15:05:22.261876: step: 726/466, loss: 0.013120784424245358 2023-01-22 15:05:23.057244: step: 728/466, loss: 0.017572740092873573 2023-01-22 15:05:23.758764: step: 730/466, loss: 0.001205058186315 2023-01-22 15:05:24.486339: step: 732/466, loss: 0.06151802837848663 2023-01-22 15:05:25.241480: step: 734/466, loss: 0.014945479109883308 2023-01-22 15:05:25.886567: step: 736/466, loss: 
0.03552056849002838 2023-01-22 15:05:26.616800: step: 738/466, loss: 0.034059032797813416 2023-01-22 15:05:27.361856: step: 740/466, loss: 0.049566540867090225 2023-01-22 15:05:28.088205: step: 742/466, loss: 0.008085060864686966 2023-01-22 15:05:28.895897: step: 744/466, loss: 0.009791059419512749 2023-01-22 15:05:29.743376: step: 746/466, loss: 0.014964860863983631 2023-01-22 15:05:30.492670: step: 748/466, loss: 0.06384597718715668 2023-01-22 15:05:31.262071: step: 750/466, loss: 0.0009342418634332716 2023-01-22 15:05:32.115092: step: 752/466, loss: 0.035021040588617325 2023-01-22 15:05:32.885800: step: 754/466, loss: 0.0690048411488533 2023-01-22 15:05:33.659825: step: 756/466, loss: 0.00836377963423729 2023-01-22 15:05:34.474084: step: 758/466, loss: 0.0028185443952679634 2023-01-22 15:05:35.295980: step: 760/466, loss: 0.007473757956176996 2023-01-22 15:05:36.016665: step: 762/466, loss: 0.01888253726065159 2023-01-22 15:05:36.735930: step: 764/466, loss: 0.07119555026292801 2023-01-22 15:05:37.450326: step: 766/466, loss: 0.05385569855570793 2023-01-22 15:05:38.198403: step: 768/466, loss: 0.01937274821102619 2023-01-22 15:05:38.918037: step: 770/466, loss: 0.010562130250036716 2023-01-22 15:05:39.718203: step: 772/466, loss: 0.02259805239737034 2023-01-22 15:05:40.593188: step: 774/466, loss: 0.044527675956487656 2023-01-22 15:05:41.441567: step: 776/466, loss: 0.06347750872373581 2023-01-22 15:05:42.335295: step: 778/466, loss: 0.06932666897773743 2023-01-22 15:05:43.094643: step: 780/466, loss: 0.04155116528272629 2023-01-22 15:05:43.834256: step: 782/466, loss: 0.023821156471967697 2023-01-22 15:05:44.602193: step: 784/466, loss: 0.04671480134129524 2023-01-22 15:05:45.374439: step: 786/466, loss: 0.004834890365600586 2023-01-22 15:05:46.061489: step: 788/466, loss: 0.0016203763661906123 2023-01-22 15:05:46.770995: step: 790/466, loss: 0.002012968761846423 2023-01-22 15:05:47.475544: step: 792/466, loss: 0.030111519619822502 2023-01-22 15:05:48.238147: 
step: 794/466, loss: 0.013528553768992424 2023-01-22 15:05:49.063392: step: 796/466, loss: 0.013709068298339844 2023-01-22 15:05:49.820535: step: 798/466, loss: 0.004014394711703062 2023-01-22 15:05:50.612659: step: 800/466, loss: 0.011794732883572578 2023-01-22 15:05:51.368341: step: 802/466, loss: 0.01672566682100296 2023-01-22 15:05:52.102355: step: 804/466, loss: 0.0033746538683772087 2023-01-22 15:05:52.820844: step: 806/466, loss: 0.01093566045165062 2023-01-22 15:05:53.668995: step: 808/466, loss: 0.0042954096570611 2023-01-22 15:05:54.442335: step: 810/466, loss: 0.01798640564084053 2023-01-22 15:05:55.133123: step: 812/466, loss: 0.12018779665231705 2023-01-22 15:05:55.949655: step: 814/466, loss: 0.06234927847981453 2023-01-22 15:05:56.782097: step: 816/466, loss: 0.0871921107172966 2023-01-22 15:05:57.525929: step: 818/466, loss: 0.03763734549283981 2023-01-22 15:05:58.338445: step: 820/466, loss: 0.03183059021830559 2023-01-22 15:05:59.073104: step: 822/466, loss: 0.03925763815641403 2023-01-22 15:05:59.738375: step: 824/466, loss: 0.010070499032735825 2023-01-22 15:06:00.526866: step: 826/466, loss: 0.08802307397127151 2023-01-22 15:06:01.275584: step: 828/466, loss: 0.05833996832370758 2023-01-22 15:06:02.008915: step: 830/466, loss: 0.0007282199221663177 2023-01-22 15:06:02.749198: step: 832/466, loss: 0.016765417531132698 2023-01-22 15:06:03.523100: step: 834/466, loss: 0.000603331602178514 2023-01-22 15:06:04.322749: step: 836/466, loss: 0.015151295810937881 2023-01-22 15:06:05.022293: step: 838/466, loss: 0.04270085692405701 2023-01-22 15:06:05.856912: step: 840/466, loss: 0.06138002872467041 2023-01-22 15:06:06.591527: step: 842/466, loss: 0.03053216077387333 2023-01-22 15:06:07.289829: step: 844/466, loss: 0.09704993665218353 2023-01-22 15:06:08.058590: step: 846/466, loss: 0.034290920943021774 2023-01-22 15:06:08.855326: step: 848/466, loss: 0.037506867200136185 2023-01-22 15:06:09.579788: step: 850/466, loss: 0.000346437533153221 2023-01-22 
15:06:10.248486: step: 852/466, loss: 0.03536347299814224 2023-01-22 15:06:10.988444: step: 854/466, loss: 0.01068208273500204 2023-01-22 15:06:11.769218: step: 856/466, loss: 0.1030912697315216 2023-01-22 15:06:12.531119: step: 858/466, loss: 0.023195527493953705 2023-01-22 15:06:13.214283: step: 860/466, loss: 0.04605022817850113 2023-01-22 15:06:13.851890: step: 862/466, loss: 0.029108798131346703 2023-01-22 15:06:14.646986: step: 864/466, loss: 0.017726074904203415 2023-01-22 15:06:15.387135: step: 866/466, loss: 0.009071829728782177 2023-01-22 15:06:16.114359: step: 868/466, loss: 0.05669238418340683 2023-01-22 15:06:16.908440: step: 870/466, loss: 0.048009276390075684 2023-01-22 15:06:17.692092: step: 872/466, loss: 0.04615802317857742 2023-01-22 15:06:18.355111: step: 874/466, loss: 0.0086721396073699 2023-01-22 15:06:19.110410: step: 876/466, loss: 0.016087956726551056 2023-01-22 15:06:19.856843: step: 878/466, loss: 0.09400956332683563 2023-01-22 15:06:20.615146: step: 880/466, loss: 0.04891110584139824 2023-01-22 15:06:21.358558: step: 882/466, loss: 0.050575532019138336 2023-01-22 15:06:22.072867: step: 884/466, loss: 0.06017523258924484 2023-01-22 15:06:22.755049: step: 886/466, loss: 0.010187262669205666 2023-01-22 15:06:23.538226: step: 888/466, loss: 1.000565528869629 2023-01-22 15:06:24.269623: step: 890/466, loss: 0.03032972663640976 2023-01-22 15:06:25.055079: step: 892/466, loss: 0.002549918135628104 2023-01-22 15:06:25.757591: step: 894/466, loss: 0.33435216546058655 2023-01-22 15:06:26.410766: step: 896/466, loss: 0.005525761283934116 2023-01-22 15:06:27.073512: step: 898/466, loss: 0.030115347355604172 2023-01-22 15:06:27.777174: step: 900/466, loss: 0.029519736766815186 2023-01-22 15:06:28.484898: step: 902/466, loss: 0.04865584522485733 2023-01-22 15:06:29.218248: step: 904/466, loss: 0.04875032231211662 2023-01-22 15:06:29.987287: step: 906/466, loss: 0.03611094132065773 2023-01-22 15:06:30.665194: step: 908/466, loss: 0.16952019929885864 
2023-01-22 15:06:31.461670: step: 910/466, loss: 0.04466065391898155 2023-01-22 15:06:32.330509: step: 912/466, loss: 0.010506573133170605 2023-01-22 15:06:33.029339: step: 914/466, loss: 0.0016740434803068638 2023-01-22 15:06:33.763475: step: 916/466, loss: 0.019911987707018852 2023-01-22 15:06:34.522557: step: 918/466, loss: 0.018170898780226707 2023-01-22 15:06:35.303630: step: 920/466, loss: 0.18965911865234375 2023-01-22 15:06:36.040582: step: 922/466, loss: 0.027456559240818024 2023-01-22 15:06:36.732154: step: 924/466, loss: 0.03433636948466301 2023-01-22 15:06:37.545556: step: 926/466, loss: 0.0002444031124468893 2023-01-22 15:06:38.338116: step: 928/466, loss: 0.02833491750061512 2023-01-22 15:06:39.110868: step: 930/466, loss: 0.015947267413139343 2023-01-22 15:06:39.821157: step: 932/466, loss: 0.029022732749581337
==================================================
Loss: 0.053
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2864857456140351, 'r': 0.3305186590765338, 'f1': 0.3069309838472834}, 'combined': 0.2261596723085246, 'epoch': 25}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.340469763932497, 'r': 0.2989059226212571, 'f1': 0.3183368747142019}, 'combined': 0.19566071323897288, 'epoch': 25}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2766338014825131, 'r': 0.3558969969737075, 'f1': 0.3112991160251351}, 'combined': 0.22937829601852058, 'epoch': 25}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3233418639367549, 'r': 0.31074412897818005, 'f1': 0.3169178533949651}, 'combined': 0.19478853428178342, 'epoch': 25}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3057808468659888, 'r': 0.3475573572537519, 'f1': 0.3253334409817536}, 'combined': 0.23971937756550263, 'epoch': 25}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3433366017632886, 'r': 0.30346909341295875, 'f1': 0.32217418012746496}, 'combined': 0.19898993478461074, 'epoch': 25}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.22916666666666666, 'r': 0.3142857142857143, 'f1': 0.26506024096385544}, 'combined': 0.17670682730923695, 'epoch': 25}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.255, 'r': 0.5543478260869565, 'f1': 0.34931506849315075}, 'combined': 0.17465753424657537, 'epoch': 25}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 25}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 26
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:09:26.564635: step: 2/466, loss: 0.00031524227233603597 2023-01-22 15:09:27.310444: step: 4/466, loss: 0.015762172639369965 2023-01-22 15:09:28.112766: step: 6/466, loss: 0.0012802548008039594 2023-01-22 15:09:28.879040: step: 8/466, loss: 0.009078371338546276 2023-01-22 15:09:29.588676: step: 10/466, loss: 0.0006099325837567449 2023-01-22 15:09:30.344656: step: 12/466, loss: 0.45666512846946716 2023-01-22 15:09:31.099119: step: 14/466, loss: 0.010864143259823322 2023-01-22 15:09:31.927055: step: 16/466, loss: 0.012954924255609512 2023-01-22 15:09:32.774054: step: 18/466, loss: 0.028923632577061653 2023-01-22 15:09:33.526902: step: 20/466, loss: 0.0014648281503468752 2023-01-22
15:09:34.371742: step: 22/466, loss: 0.07409375160932541 2023-01-22 15:09:35.140310: step: 24/466, loss: 0.019660940393805504 2023-01-22 15:09:35.913253: step: 26/466, loss: 0.04426463320851326 2023-01-22 15:09:36.637790: step: 28/466, loss: 0.06421208381652832 2023-01-22 15:09:37.416527: step: 30/466, loss: 0.06367946416139603 2023-01-22 15:09:38.201906: step: 32/466, loss: 0.04807402938604355 2023-01-22 15:09:39.026123: step: 34/466, loss: 0.0189279243350029 2023-01-22 15:09:39.875695: step: 36/466, loss: 2.7010581493377686 2023-01-22 15:09:40.631113: step: 38/466, loss: 0.020814199000597 2023-01-22 15:09:41.417421: step: 40/466, loss: 0.03045761026442051 2023-01-22 15:09:42.276297: step: 42/466, loss: 0.019807307049632072 2023-01-22 15:09:43.051549: step: 44/466, loss: 0.12432827800512314 2023-01-22 15:09:43.754419: step: 46/466, loss: 0.004209555219858885 2023-01-22 15:09:44.547689: step: 48/466, loss: 0.09685204923152924 2023-01-22 15:09:45.302111: step: 50/466, loss: 0.008693347685039043 2023-01-22 15:09:46.207148: step: 52/466, loss: 0.02438419498503208 2023-01-22 15:09:46.959066: step: 54/466, loss: 0.004380271770060062 2023-01-22 15:09:47.717157: step: 56/466, loss: 0.029782719910144806 2023-01-22 15:09:48.510783: step: 58/466, loss: 1.0801372528076172 2023-01-22 15:09:49.238853: step: 60/466, loss: 0.013642487116158009 2023-01-22 15:09:49.946512: step: 62/466, loss: 0.19044984877109528 2023-01-22 15:09:50.724491: step: 64/466, loss: 0.020978759974241257 2023-01-22 15:09:51.447212: step: 66/466, loss: 0.3767179846763611 2023-01-22 15:09:52.202976: step: 68/466, loss: 0.012379830703139305 2023-01-22 15:09:53.021938: step: 70/466, loss: 0.0017019481165334582 2023-01-22 15:09:53.818102: step: 72/466, loss: 0.006129485089331865 2023-01-22 15:09:54.584789: step: 74/466, loss: 0.050105977803468704 2023-01-22 15:09:55.398553: step: 76/466, loss: 0.00046785204904153943 2023-01-22 15:09:56.074870: step: 78/466, loss: 0.07924487441778183 2023-01-22 15:09:56.776777: 
step: 80/466, loss: 0.061756521463394165 2023-01-22 15:09:57.505394: step: 82/466, loss: 0.003477144753560424 2023-01-22 15:09:58.319494: step: 84/466, loss: 0.05212853476405144 2023-01-22 15:09:59.015571: step: 86/466, loss: 0.07153777778148651 2023-01-22 15:09:59.717966: step: 88/466, loss: 0.0008990956703200936 2023-01-22 15:10:00.417095: step: 90/466, loss: 0.004675067961215973 2023-01-22 15:10:01.229627: step: 92/466, loss: 0.037258706986904144 2023-01-22 15:10:02.006958: step: 94/466, loss: 0.012425399385392666 2023-01-22 15:10:02.786697: step: 96/466, loss: 0.0023102019913494587 2023-01-22 15:10:03.535027: step: 98/466, loss: 0.060890063643455505 2023-01-22 15:10:04.298800: step: 100/466, loss: 0.06003446504473686 2023-01-22 15:10:05.070486: step: 102/466, loss: 0.15408965945243835 2023-01-22 15:10:05.822836: step: 104/466, loss: 0.0013166368007659912 2023-01-22 15:10:06.596442: step: 106/466, loss: 0.016299203038215637 2023-01-22 15:10:07.342456: step: 108/466, loss: 0.0035430581774562597 2023-01-22 15:10:08.021566: step: 110/466, loss: 0.12147834151983261 2023-01-22 15:10:08.711674: step: 112/466, loss: 0.019816666841506958 2023-01-22 15:10:09.416655: step: 114/466, loss: 0.017354421317577362 2023-01-22 15:10:10.189750: step: 116/466, loss: 0.0008715191506780684 2023-01-22 15:10:10.961442: step: 118/466, loss: 0.001017579110339284 2023-01-22 15:10:11.713222: step: 120/466, loss: 0.0178241990506649 2023-01-22 15:10:12.468135: step: 122/466, loss: 0.02976835146546364 2023-01-22 15:10:13.231162: step: 124/466, loss: 0.014377070590853691 2023-01-22 15:10:13.979670: step: 126/466, loss: 0.041578684002161026 2023-01-22 15:10:14.749030: step: 128/466, loss: 0.002998858457431197 2023-01-22 15:10:15.451953: step: 130/466, loss: 0.0054906210862100124 2023-01-22 15:10:16.223030: step: 132/466, loss: 0.0224401094019413 2023-01-22 15:10:17.014238: step: 134/466, loss: 0.019206833094358444 2023-01-22 15:10:17.791813: step: 136/466, loss: 0.0006435486720874906 2023-01-22 
15:10:18.555022: step: 138/466, loss: 0.22176651656627655 2023-01-22 15:10:19.256312: step: 140/466, loss: 0.023877454921603203 2023-01-22 15:10:19.994767: step: 142/466, loss: 0.00489756790921092 2023-01-22 15:10:20.760763: step: 144/466, loss: 0.008302816189825535 2023-01-22 15:10:21.480957: step: 146/466, loss: 0.0002597762504592538 2023-01-22 15:10:22.329546: step: 148/466, loss: 0.023236991837620735 2023-01-22 15:10:23.115957: step: 150/466, loss: 0.0020385209936648607 2023-01-22 15:10:23.795561: step: 152/466, loss: 0.03643738478422165 2023-01-22 15:10:24.517396: step: 154/466, loss: 0.0633644387125969 2023-01-22 15:10:25.301988: step: 156/466, loss: 0.044321831315755844 2023-01-22 15:10:26.047060: step: 158/466, loss: 0.00029187617474235594 2023-01-22 15:10:26.791152: step: 160/466, loss: 0.02037622407078743 2023-01-22 15:10:27.523671: step: 162/466, loss: 0.010865447111427784 2023-01-22 15:10:28.313059: step: 164/466, loss: 0.03551540896296501 2023-01-22 15:10:29.083549: step: 166/466, loss: 0.044380009174346924 2023-01-22 15:10:29.955258: step: 168/466, loss: 0.09230950474739075 2023-01-22 15:10:30.696581: step: 170/466, loss: 0.01594698242843151 2023-01-22 15:10:31.551307: step: 172/466, loss: 0.04631970077753067 2023-01-22 15:10:32.303805: step: 174/466, loss: 0.0324825793504715 2023-01-22 15:10:33.086636: step: 176/466, loss: 0.021742789074778557 2023-01-22 15:10:33.802436: step: 178/466, loss: 0.014791291207075119 2023-01-22 15:10:34.609624: step: 180/466, loss: 0.02440674975514412 2023-01-22 15:10:35.390134: step: 182/466, loss: 0.03153248503804207 2023-01-22 15:10:36.181841: step: 184/466, loss: 0.002631034003570676 2023-01-22 15:10:37.002811: step: 186/466, loss: 0.06895671784877777 2023-01-22 15:10:37.761151: step: 188/466, loss: 0.03477528318762779 2023-01-22 15:10:38.453212: step: 190/466, loss: 0.001194530283100903 2023-01-22 15:10:39.223994: step: 192/466, loss: 0.005461663007736206 2023-01-22 15:10:40.057705: step: 194/466, loss: 
0.012678248807787895 2023-01-22 15:10:40.893310: step: 196/466, loss: 0.012411870993673801 2023-01-22 15:10:41.687695: step: 198/466, loss: 0.06810742616653442 2023-01-22 15:10:42.506765: step: 200/466, loss: 0.020279204472899437 2023-01-22 15:10:43.347297: step: 202/466, loss: 0.00167837121989578 2023-01-22 15:10:44.043056: step: 204/466, loss: 0.014692934229969978 2023-01-22 15:10:44.844858: step: 206/466, loss: 0.07064501196146011 2023-01-22 15:10:45.606429: step: 208/466, loss: 0.39062148332595825 2023-01-22 15:10:46.374933: step: 210/466, loss: 0.0009400771232321858 2023-01-22 15:10:47.157792: step: 212/466, loss: 0.031239798292517662 2023-01-22 15:10:47.905176: step: 214/466, loss: 0.045270588248968124 2023-01-22 15:10:48.676592: step: 216/466, loss: 0.03298955038189888 2023-01-22 15:10:49.443107: step: 218/466, loss: 0.0049577741883695126 2023-01-22 15:10:50.208737: step: 220/466, loss: 0.008334346115589142 2023-01-22 15:10:50.937425: step: 222/466, loss: 0.03591045364737511 2023-01-22 15:10:51.616451: step: 224/466, loss: 0.012412887066602707 2023-01-22 15:10:52.305305: step: 226/466, loss: 0.03839350864291191 2023-01-22 15:10:53.077305: step: 228/466, loss: 0.0374600924551487 2023-01-22 15:10:53.805059: step: 230/466, loss: 0.003158966079354286 2023-01-22 15:10:54.532171: step: 232/466, loss: 0.00404371228069067 2023-01-22 15:10:55.288097: step: 234/466, loss: 0.01028430461883545 2023-01-22 15:10:56.058404: step: 236/466, loss: 0.06801458448171616 2023-01-22 15:10:56.793010: step: 238/466, loss: 0.03367387130856514 2023-01-22 15:10:57.548776: step: 240/466, loss: 0.004517871420830488 2023-01-22 15:10:58.288566: step: 242/466, loss: 0.10528568923473358 2023-01-22 15:10:59.019538: step: 244/466, loss: 0.03185072913765907 2023-01-22 15:10:59.745074: step: 246/466, loss: 0.025077687576413155 2023-01-22 15:11:00.613714: step: 248/466, loss: 0.020623821765184402 2023-01-22 15:11:01.422373: step: 250/466, loss: 0.0257880799472332 2023-01-22 15:11:02.177597: step: 
252/466, loss: 0.027103710919618607 2023-01-22 15:11:02.973190: step: 254/466, loss: 0.02695123478770256 2023-01-22 15:11:03.800487: step: 256/466, loss: 0.04977225139737129 2023-01-22 15:11:04.616885: step: 258/466, loss: 0.014846911653876305 2023-01-22 15:11:05.405450: step: 260/466, loss: 0.010261930525302887 2023-01-22 15:11:06.184008: step: 262/466, loss: 0.007246529217809439 2023-01-22 15:11:07.123737: step: 264/466, loss: 0.026592286303639412 2023-01-22 15:11:07.838525: step: 266/466, loss: 0.02041914314031601 2023-01-22 15:11:08.626662: step: 268/466, loss: 0.037839245051145554 2023-01-22 15:11:09.456043: step: 270/466, loss: 0.006573710590600967 2023-01-22 15:11:10.226458: step: 272/466, loss: 0.012767232023179531 2023-01-22 15:11:10.916210: step: 274/466, loss: 0.03212396055459976 2023-01-22 15:11:11.659227: step: 276/466, loss: 0.014931570738554 2023-01-22 15:11:12.392020: step: 278/466, loss: 0.05406482145190239 2023-01-22 15:11:13.200301: step: 280/466, loss: 0.009635468013584614 2023-01-22 15:11:13.975524: step: 282/466, loss: 0.007484063971787691 2023-01-22 15:11:14.718250: step: 284/466, loss: 0.006852707825601101 2023-01-22 15:11:15.404723: step: 286/466, loss: 0.0074304440058767796 2023-01-22 15:11:16.135797: step: 288/466, loss: 0.040117863565683365 2023-01-22 15:11:16.891471: step: 290/466, loss: 0.06904040277004242 2023-01-22 15:11:17.642114: step: 292/466, loss: 0.005339875817298889 2023-01-22 15:11:18.387148: step: 294/466, loss: 0.000554997066501528 2023-01-22 15:11:19.192346: step: 296/466, loss: 0.002527383156120777 2023-01-22 15:11:19.938625: step: 298/466, loss: 0.013496562838554382 2023-01-22 15:11:20.611934: step: 300/466, loss: 0.0026911115273833275 2023-01-22 15:11:21.488429: step: 302/466, loss: 0.9188671112060547 2023-01-22 15:11:22.297118: step: 304/466, loss: 0.05199922248721123 2023-01-22 15:11:23.011723: step: 306/466, loss: 0.012454289011657238 2023-01-22 15:11:23.782153: step: 308/466, loss: 0.012339092791080475 2023-01-22 
15:11:24.474974: step: 310/466, loss: 0.002921469509601593 2023-01-22 15:11:25.219424: step: 312/466, loss: 0.0036970670334994793 2023-01-22 15:11:26.009066: step: 314/466, loss: 0.03886334225535393 2023-01-22 15:11:26.761091: step: 316/466, loss: 0.02411399781703949 2023-01-22 15:11:27.537627: step: 318/466, loss: 0.062258653342723846 2023-01-22 15:11:28.220510: step: 320/466, loss: 0.04542381316423416 2023-01-22 15:11:28.937888: step: 322/466, loss: 0.096859410405159 2023-01-22 15:11:29.649508: step: 324/466, loss: 0.009544518776237965 2023-01-22 15:11:30.423262: step: 326/466, loss: 0.16951487958431244 2023-01-22 15:11:31.188774: step: 328/466, loss: 0.1383233368396759 2023-01-22 15:11:31.992846: step: 330/466, loss: 0.014902369119226933 2023-01-22 15:11:32.921004: step: 332/466, loss: 0.07071257382631302 2023-01-22 15:11:33.705551: step: 334/466, loss: 0.0656275525689125 2023-01-22 15:11:34.493559: step: 336/466, loss: 0.04864136502146721 2023-01-22 15:11:35.279965: step: 338/466, loss: 0.003996263723820448 2023-01-22 15:11:35.991757: step: 340/466, loss: 0.009085068479180336 2023-01-22 15:11:36.705936: step: 342/466, loss: 0.001105593633837998 2023-01-22 15:11:37.449676: step: 344/466, loss: 0.05922761932015419 2023-01-22 15:11:38.226963: step: 346/466, loss: 0.04883921891450882 2023-01-22 15:11:38.941774: step: 348/466, loss: 0.025686321780085564 2023-01-22 15:11:39.720568: step: 350/466, loss: 0.058305688202381134 2023-01-22 15:11:40.531299: step: 352/466, loss: 0.28009021282196045 2023-01-22 15:11:41.353275: step: 354/466, loss: 0.31456565856933594 2023-01-22 15:11:42.114822: step: 356/466, loss: 0.020045241340994835 2023-01-22 15:11:42.769967: step: 358/466, loss: 0.0017619299469515681 2023-01-22 15:11:43.531290: step: 360/466, loss: 0.012018238194286823 2023-01-22 15:11:44.215828: step: 362/466, loss: 0.032567549496889114 2023-01-22 15:11:44.933372: step: 364/466, loss: 0.02509705349802971 2023-01-22 15:11:45.602795: step: 366/466, loss: 
0.0745500698685646 2023-01-22 15:11:46.258064: step: 368/466, loss: 0.017987968400120735 2023-01-22 15:11:46.975057: step: 370/466, loss: 0.013996962457895279 2023-01-22 15:11:47.731188: step: 372/466, loss: 0.0067430599592626095 2023-01-22 15:11:48.589792: step: 374/466, loss: 0.012736006639897823 2023-01-22 15:11:49.287194: step: 376/466, loss: 0.013728760182857513 2023-01-22 15:11:50.048531: step: 378/466, loss: 0.016943395137786865 2023-01-22 15:11:50.860087: step: 380/466, loss: 0.007950839586555958 2023-01-22 15:11:51.619644: step: 382/466, loss: 0.011489784345030785 2023-01-22 15:11:52.433563: step: 384/466, loss: 0.05696876347064972 2023-01-22 15:11:53.194721: step: 386/466, loss: 0.0263433326035738 2023-01-22 15:11:54.003647: step: 388/466, loss: 0.05339128524065018 2023-01-22 15:11:54.711887: step: 390/466, loss: 0.012195846997201443 2023-01-22 15:11:55.446637: step: 392/466, loss: 0.01823616772890091 2023-01-22 15:11:56.185660: step: 394/466, loss: 0.012851156294345856 2023-01-22 15:11:56.862150: step: 396/466, loss: 0.003447320545092225 2023-01-22 15:11:57.621782: step: 398/466, loss: 0.019181225448846817 2023-01-22 15:11:58.392579: step: 400/466, loss: 0.0008523253491148353 2023-01-22 15:11:59.309953: step: 402/466, loss: 0.01056719571352005 2023-01-22 15:12:00.095424: step: 404/466, loss: 0.005507184658199549 2023-01-22 15:12:00.743042: step: 406/466, loss: 0.0025642230175435543 2023-01-22 15:12:01.492896: step: 408/466, loss: 0.00011972729407716542 2023-01-22 15:12:02.212253: step: 410/466, loss: 0.0066248211078345776 2023-01-22 15:12:02.961027: step: 412/466, loss: 0.0039841653779149055 2023-01-22 15:12:03.758702: step: 414/466, loss: 0.05562155693769455 2023-01-22 15:12:04.547313: step: 416/466, loss: 0.005581381265074015 2023-01-22 15:12:05.263717: step: 418/466, loss: 0.13449573516845703 2023-01-22 15:12:06.031169: step: 420/466, loss: 0.037772390991449356 2023-01-22 15:12:06.733810: step: 422/466, loss: 0.11866430193185806 2023-01-22 
15:12:07.529554: step: 424/466, loss: 0.0028409764636307955 2023-01-22 15:12:08.277344: step: 426/466, loss: 0.019561611115932465 2023-01-22 15:12:08.973668: step: 428/466, loss: 0.001310934778302908 2023-01-22 15:12:09.735229: step: 430/466, loss: 0.017845844849944115 2023-01-22 15:12:10.502242: step: 432/466, loss: 7.294760143849999e-05 2023-01-22 15:12:11.282142: step: 434/466, loss: 0.016323518007993698 2023-01-22 15:12:12.078719: step: 436/466, loss: 0.0048583317548036575 2023-01-22 15:12:12.874956: step: 438/466, loss: 0.005380266811698675 2023-01-22 15:12:13.662255: step: 440/466, loss: 0.019052177667617798 2023-01-22 15:12:14.456211: step: 442/466, loss: 0.021041641011834145 2023-01-22 15:12:15.174712: step: 444/466, loss: 1.4660183191299438 2023-01-22 15:12:15.959278: step: 446/466, loss: 0.3665899634361267 2023-01-22 15:12:16.723539: step: 448/466, loss: 0.30819717049598694 2023-01-22 15:12:17.479964: step: 450/466, loss: 0.009679583832621574 2023-01-22 15:12:18.242035: step: 452/466, loss: 0.8086559772491455 2023-01-22 15:12:19.010315: step: 454/466, loss: 0.04148883745074272 2023-01-22 15:12:19.770238: step: 456/466, loss: 0.04705173894762993 2023-01-22 15:12:20.486155: step: 458/466, loss: 0.00868227705359459 2023-01-22 15:12:21.361432: step: 460/466, loss: 0.003283077385276556 2023-01-22 15:12:22.171971: step: 462/466, loss: 0.0010576589265838265 2023-01-22 15:12:22.961562: step: 464/466, loss: 0.004205800127238035 2023-01-22 15:12:23.649624: step: 466/466, loss: 0.0158492773771286 2023-01-22 15:12:24.371530: step: 468/466, loss: 0.0006152232526801527 2023-01-22 15:12:25.236560: step: 470/466, loss: 0.010952494107186794 2023-01-22 15:12:25.994132: step: 472/466, loss: 0.02950318530201912 2023-01-22 15:12:26.760645: step: 474/466, loss: 0.136695995926857 2023-01-22 15:12:27.486778: step: 476/466, loss: 0.054866403341293335 2023-01-22 15:12:28.205438: step: 478/466, loss: 0.16068577766418457 2023-01-22 15:12:28.971889: step: 480/466, loss: 
0.002518736757338047 2023-01-22 15:12:29.719896: step: 482/466, loss: 0.010239574126899242 2023-01-22 15:12:30.539054: step: 484/466, loss: 0.01268444862216711 2023-01-22 15:12:31.273953: step: 486/466, loss: 0.000964211649261415 2023-01-22 15:12:31.982162: step: 488/466, loss: 0.021816113963723183 2023-01-22 15:12:32.777062: step: 490/466, loss: 0.0027528852224349976 2023-01-22 15:12:33.583015: step: 492/466, loss: 0.08536282181739807 2023-01-22 15:12:34.309155: step: 494/466, loss: 0.006941162049770355 2023-01-22 15:12:35.047651: step: 496/466, loss: 0.1977323293685913 2023-01-22 15:12:35.793925: step: 498/466, loss: 0.004771554376929998 2023-01-22 15:12:36.678224: step: 500/466, loss: 0.12913726270198822 2023-01-22 15:12:37.452871: step: 502/466, loss: 0.008971030823886395 2023-01-22 15:12:38.325770: step: 504/466, loss: 0.029957806691527367 2023-01-22 15:12:39.057865: step: 506/466, loss: 1.8468188047409058 2023-01-22 15:12:39.871962: step: 508/466, loss: 0.03492060303688049 2023-01-22 15:12:40.715205: step: 510/466, loss: 0.042045388370752335 2023-01-22 15:12:41.409384: step: 512/466, loss: 0.052512023597955704 2023-01-22 15:12:42.198800: step: 514/466, loss: 0.03430125117301941 2023-01-22 15:12:42.989924: step: 516/466, loss: 0.008694916032254696 2023-01-22 15:12:43.755536: step: 518/466, loss: 0.01283710915595293 2023-01-22 15:12:44.514460: step: 520/466, loss: 0.026418892666697502 2023-01-22 15:12:45.209173: step: 522/466, loss: 0.0028335973620414734 2023-01-22 15:12:45.965935: step: 524/466, loss: 0.0052917106077075005 2023-01-22 15:12:46.690183: step: 526/466, loss: 0.011922935955226421 2023-01-22 15:12:47.368731: step: 528/466, loss: 0.015131733380258083 2023-01-22 15:12:48.081222: step: 530/466, loss: 0.0294162817299366 2023-01-22 15:12:48.774354: step: 532/466, loss: 0.005694697145372629 2023-01-22 15:12:49.574398: step: 534/466, loss: 0.16320542991161346 2023-01-22 15:12:50.318907: step: 536/466, loss: 0.79264235496521 2023-01-22 15:12:51.051413: 
step: 538/466, loss: 0.019455431029200554 2023-01-22 15:12:51.803902: step: 540/466, loss: 0.0004865480586886406 2023-01-22 15:12:52.560659: step: 542/466, loss: 3.612785577774048 2023-01-22 15:12:53.320154: step: 544/466, loss: 0.16626045107841492 2023-01-22 15:12:54.071330: step: 546/466, loss: 0.027813207358121872 2023-01-22 15:12:54.813798: step: 548/466, loss: 0.006704711355268955 2023-01-22 15:12:55.557372: step: 550/466, loss: 0.011759432032704353 2023-01-22 15:12:56.332067: step: 552/466, loss: 0.01733456179499626 2023-01-22 15:12:57.114087: step: 554/466, loss: 0.02009851485490799 2023-01-22 15:12:57.817400: step: 556/466, loss: 0.015334980562329292 2023-01-22 15:12:58.706062: step: 558/466, loss: 0.11342606693506241 2023-01-22 15:12:59.473409: step: 560/466, loss: 0.005834941752254963 2023-01-22 15:13:00.222936: step: 562/466, loss: 0.12517115473747253 2023-01-22 15:13:00.974698: step: 564/466, loss: 0.002540087793022394 2023-01-22 15:13:01.691548: step: 566/466, loss: 0.025609837844967842 2023-01-22 15:13:02.501707: step: 568/466, loss: 0.013363751582801342 2023-01-22 15:13:03.221884: step: 570/466, loss: 0.06508602946996689 2023-01-22 15:13:04.019458: step: 572/466, loss: 0.08497834205627441 2023-01-22 15:13:04.720369: step: 574/466, loss: 0.3588958978652954 2023-01-22 15:13:05.457449: step: 576/466, loss: 1.6701912879943848 2023-01-22 15:13:06.240479: step: 578/466, loss: 0.10761582106351852 2023-01-22 15:13:07.022051: step: 580/466, loss: 0.07897034287452698 2023-01-22 15:13:07.719162: step: 582/466, loss: 0.20795206725597382 2023-01-22 15:13:08.427139: step: 584/466, loss: 1.709664225578308 2023-01-22 15:13:09.170039: step: 586/466, loss: 0.029606152325868607 2023-01-22 15:13:09.907721: step: 588/466, loss: 0.02808300219476223 2023-01-22 15:13:10.697154: step: 590/466, loss: 0.12041884660720825 2023-01-22 15:13:11.486465: step: 592/466, loss: 0.04004635289311409 2023-01-22 15:13:12.255058: step: 594/466, loss: 0.18785744905471802 2023-01-22 
15:13:13.026291: step: 596/466, loss: 0.07838691025972366 2023-01-22 15:13:13.781336: step: 598/466, loss: 0.010151730850338936 2023-01-22 15:13:14.552844: step: 600/466, loss: 0.05482480302453041 2023-01-22 15:13:15.330175: step: 602/466, loss: 0.2743576169013977 2023-01-22 15:13:16.040765: step: 604/466, loss: 0.05277765542268753 2023-01-22 15:13:16.818833: step: 606/466, loss: 0.03785436227917671 2023-01-22 15:13:17.550725: step: 608/466, loss: 0.013006249442696571 2023-01-22 15:13:18.278924: step: 610/466, loss: 0.03416730836033821 2023-01-22 15:13:19.037066: step: 612/466, loss: 0.00351434713229537 2023-01-22 15:13:19.812552: step: 614/466, loss: 0.47471874952316284 2023-01-22 15:13:20.532206: step: 616/466, loss: 0.08355723321437836 2023-01-22 15:13:21.275177: step: 618/466, loss: 0.06152806803584099 2023-01-22 15:13:22.037174: step: 620/466, loss: 0.012297751381993294 2023-01-22 15:13:22.833940: step: 622/466, loss: 0.018981346860527992 2023-01-22 15:13:23.524922: step: 624/466, loss: 0.005464603658765554 2023-01-22 15:13:24.290774: step: 626/466, loss: 0.031064271926879883 2023-01-22 15:13:25.087890: step: 628/466, loss: 0.1384933441877365 2023-01-22 15:13:25.790921: step: 630/466, loss: 0.0034443712793290615 2023-01-22 15:13:26.574875: step: 632/466, loss: 0.04686906933784485 2023-01-22 15:13:27.357066: step: 634/466, loss: 0.03641004487872124 2023-01-22 15:13:28.062573: step: 636/466, loss: 0.01964748091995716 2023-01-22 15:13:28.761089: step: 638/466, loss: 0.0178031288087368 2023-01-22 15:13:29.506400: step: 640/466, loss: 0.022527659311890602 2023-01-22 15:13:30.287542: step: 642/466, loss: 0.38067346811294556 2023-01-22 15:13:31.011229: step: 644/466, loss: 0.010407094843685627 2023-01-22 15:13:31.744127: step: 646/466, loss: 0.006377980578690767 2023-01-22 15:13:32.561394: step: 648/466, loss: 0.06780447065830231 2023-01-22 15:13:33.397189: step: 650/466, loss: 0.18436500430107117 2023-01-22 15:13:34.158768: step: 652/466, loss: 0.02482200786471367 
2023-01-22 15:13:34.892512: step: 654/466, loss: 0.019808098673820496 2023-01-22 15:13:35.757571: step: 656/466, loss: 0.07673881947994232 2023-01-22 15:13:36.508980: step: 658/466, loss: 0.09009893983602524 2023-01-22 15:13:37.238363: step: 660/466, loss: 0.0017024546395987272 2023-01-22 15:13:37.983072: step: 662/466, loss: 0.002340559847652912 2023-01-22 15:13:38.768784: step: 664/466, loss: 0.011884375475347042 2023-01-22 15:13:39.537865: step: 666/466, loss: 3.983797550201416 2023-01-22 15:13:40.282744: step: 668/466, loss: 0.04353933781385422 2023-01-22 15:13:40.991037: step: 670/466, loss: 0.006013147532939911 2023-01-22 15:13:41.760843: step: 672/466, loss: 0.05005199462175369 2023-01-22 15:13:42.522653: step: 674/466, loss: 0.0044931103475391865 2023-01-22 15:13:43.344275: step: 676/466, loss: 0.05605557933449745 2023-01-22 15:13:44.139200: step: 678/466, loss: 0.008381667546927929 2023-01-22 15:13:44.834225: step: 680/466, loss: 0.4631694257259369 2023-01-22 15:13:45.557032: step: 682/466, loss: 0.0026287208311259747 2023-01-22 15:13:46.257396: step: 684/466, loss: 0.11688584834337234 2023-01-22 15:13:46.958701: step: 686/466, loss: 0.0012536462163552642 2023-01-22 15:13:47.657715: step: 688/466, loss: 0.0023633234668523073 2023-01-22 15:13:48.423483: step: 690/466, loss: 0.03221137076616287 2023-01-22 15:13:49.160085: step: 692/466, loss: 0.01213639322668314 2023-01-22 15:13:49.946608: step: 694/466, loss: 0.004290018230676651 2023-01-22 15:13:50.858449: step: 696/466, loss: 0.03194379061460495 2023-01-22 15:13:51.681965: step: 698/466, loss: 0.033809881657361984 2023-01-22 15:13:52.432496: step: 700/466, loss: 0.045655541121959686 2023-01-22 15:13:53.155591: step: 702/466, loss: 0.0050416202284395695 2023-01-22 15:13:53.948972: step: 704/466, loss: 0.041732631623744965 2023-01-22 15:13:54.763054: step: 706/466, loss: 0.015561552718281746 2023-01-22 15:13:55.519081: step: 708/466, loss: 0.005210685543715954 2023-01-22 15:13:56.313605: step: 710/466, 
loss: 2.3857340812683105 2023-01-22 15:13:57.006640: step: 712/466, loss: 0.006181302480399609 2023-01-22 15:13:57.818315: step: 714/466, loss: 0.01496371254324913 2023-01-22 15:13:58.611864: step: 716/466, loss: 0.2443210780620575 2023-01-22 15:13:59.454547: step: 718/466, loss: 0.033997077494859695 2023-01-22 15:14:00.202389: step: 720/466, loss: 0.006322094239294529 2023-01-22 15:14:00.957767: step: 722/466, loss: 0.0022107360418885946 2023-01-22 15:14:01.744234: step: 724/466, loss: 0.01752101257443428 2023-01-22 15:14:02.592301: step: 726/466, loss: 0.11376980692148209 2023-01-22 15:14:03.318598: step: 728/466, loss: 0.011744904331862926 2023-01-22 15:14:04.129832: step: 730/466, loss: 0.005395747721195221 2023-01-22 15:14:04.910520: step: 732/466, loss: 0.04061632230877876 2023-01-22 15:14:05.626662: step: 734/466, loss: 0.026483291760087013 2023-01-22 15:14:06.306981: step: 736/466, loss: 0.026913270354270935 2023-01-22 15:14:07.106604: step: 738/466, loss: 0.018517345190048218 2023-01-22 15:14:07.869212: step: 740/466, loss: 0.32893139123916626 2023-01-22 15:14:08.583528: step: 742/466, loss: 0.00489374715834856 2023-01-22 15:14:09.374556: step: 744/466, loss: 0.017249496653676033 2023-01-22 15:14:10.063833: step: 746/466, loss: 0.01171032153069973 2023-01-22 15:14:10.781829: step: 748/466, loss: 0.016885356977581978 2023-01-22 15:14:11.597654: step: 750/466, loss: 2.5229568481445312 2023-01-22 15:14:12.289390: step: 752/466, loss: 0.016198089346289635 2023-01-22 15:14:12.963003: step: 754/466, loss: 0.0012358203530311584 2023-01-22 15:14:13.685573: step: 756/466, loss: 0.004425358027219772 2023-01-22 15:14:14.433654: step: 758/466, loss: 0.01746828854084015 2023-01-22 15:14:15.304707: step: 760/466, loss: 0.019856218248605728 2023-01-22 15:14:16.166642: step: 762/466, loss: 0.0028859437443315983 2023-01-22 15:14:16.897356: step: 764/466, loss: 0.014554270543158054 2023-01-22 15:14:17.644827: step: 766/466, loss: 0.028981972485780716 2023-01-22 
15:14:18.324522: step: 768/466, loss: 0.0030971853993833065 2023-01-22 15:14:19.105979: step: 770/466, loss: 0.013405009172856808 2023-01-22 15:14:19.779845: step: 772/466, loss: 0.06289155781269073 2023-01-22 15:14:20.575925: step: 774/466, loss: 0.05634074658155441 2023-01-22 15:14:21.358973: step: 776/466, loss: 0.014514373615384102 2023-01-22 15:14:22.078964: step: 778/466, loss: 0.0527653768658638 2023-01-22 15:14:22.806464: step: 780/466, loss: 0.012727830559015274 2023-01-22 15:14:23.590384: step: 782/466, loss: 0.006672831252217293 2023-01-22 15:14:24.338039: step: 784/466, loss: 0.20137040317058563 2023-01-22 15:14:25.088028: step: 786/466, loss: 0.02964707463979721 2023-01-22 15:14:25.781279: step: 788/466, loss: 0.013826957903802395 2023-01-22 15:14:26.560057: step: 790/466, loss: 0.0012463852763175964 2023-01-22 15:14:27.341478: step: 792/466, loss: 0.022246506065130234 2023-01-22 15:14:28.109239: step: 794/466, loss: 0.14593353867530823 2023-01-22 15:14:28.838439: step: 796/466, loss: 0.0034577662590891123 2023-01-22 15:14:29.592591: step: 798/466, loss: 0.013602708466351032 2023-01-22 15:14:30.326260: step: 800/466, loss: 0.13226613402366638 2023-01-22 15:14:31.087838: step: 802/466, loss: 0.004429709631949663 2023-01-22 15:14:31.899198: step: 804/466, loss: 0.007877406664192677 2023-01-22 15:14:32.668887: step: 806/466, loss: 0.011734271422028542 2023-01-22 15:14:33.372961: step: 808/466, loss: 0.011828706599771976 2023-01-22 15:14:34.177969: step: 810/466, loss: 0.3028920590877533 2023-01-22 15:14:34.935995: step: 812/466, loss: 0.01951698027551174 2023-01-22 15:14:35.707743: step: 814/466, loss: 0.005089250858873129 2023-01-22 15:14:36.453610: step: 816/466, loss: 0.018252501264214516 2023-01-22 15:14:37.254367: step: 818/466, loss: 0.027791481465101242 2023-01-22 15:14:38.040521: step: 820/466, loss: 0.015957748517394066 2023-01-22 15:14:38.832521: step: 822/466, loss: 0.28690820932388306 2023-01-22 15:14:39.674678: step: 824/466, loss: 
0.07727184146642685 2023-01-22 15:14:40.438680: step: 826/466, loss: 0.012947415001690388 2023-01-22 15:14:41.176731: step: 828/466, loss: 0.16258849203586578 2023-01-22 15:14:42.065170: step: 830/466, loss: 0.02158299833536148 2023-01-22 15:14:42.793290: step: 832/466, loss: 0.09814544767141342 2023-01-22 15:14:43.508536: step: 834/466, loss: 0.0013304640306159854 2023-01-22 15:14:44.214551: step: 836/466, loss: 0.037100616842508316 2023-01-22 15:14:45.068217: step: 838/466, loss: 0.008366475813090801 2023-01-22 15:14:45.821909: step: 840/466, loss: 0.08925455808639526 2023-01-22 15:14:46.625897: step: 842/466, loss: 0.011081385426223278 2023-01-22 15:14:47.370303: step: 844/466, loss: 0.03384479507803917 2023-01-22 15:14:48.155815: step: 846/466, loss: 0.02601746656000614 2023-01-22 15:14:48.903449: step: 848/466, loss: 0.030930351465940475 2023-01-22 15:14:49.636839: step: 850/466, loss: 0.015173490159213543 2023-01-22 15:14:50.391599: step: 852/466, loss: 0.02299688011407852 2023-01-22 15:14:51.215431: step: 854/466, loss: 0.0044409967958927155 2023-01-22 15:14:51.998977: step: 856/466, loss: 0.00046178355114534497 2023-01-22 15:14:52.981538: step: 858/466, loss: 0.013453601859509945 2023-01-22 15:14:53.763343: step: 860/466, loss: 0.15057678520679474 2023-01-22 15:14:54.493737: step: 862/466, loss: 0.13931144773960114 2023-01-22 15:14:55.264978: step: 864/466, loss: 0.015180341899394989 2023-01-22 15:14:56.037503: step: 866/466, loss: 0.03371549770236015 2023-01-22 15:14:56.792389: step: 868/466, loss: 0.01678081788122654 2023-01-22 15:14:57.567712: step: 870/466, loss: 0.06515835970640182 2023-01-22 15:14:58.293152: step: 872/466, loss: 0.04113662987947464 2023-01-22 15:14:59.084422: step: 874/466, loss: 0.0009553478448651731 2023-01-22 15:14:59.808189: step: 876/466, loss: 0.4426819980144501 2023-01-22 15:15:00.530515: step: 878/466, loss: 0.03187278285622597 2023-01-22 15:15:01.295504: step: 880/466, loss: 0.01739988848567009 2023-01-22 15:15:02.035936: 
step: 882/466, loss: 0.017624996602535248 2023-01-22 15:15:02.811447: step: 884/466, loss: 0.0038380082696676254 2023-01-22 15:15:03.617174: step: 886/466, loss: 0.03622297942638397 2023-01-22 15:15:04.393331: step: 888/466, loss: 0.006211594678461552 2023-01-22 15:15:05.183973: step: 890/466, loss: 0.04902772605419159 2023-01-22 15:15:05.899793: step: 892/466, loss: 0.030732234939932823 2023-01-22 15:15:06.573566: step: 894/466, loss: 0.06138194352388382 2023-01-22 15:15:07.299196: step: 896/466, loss: 0.03699515387415886 2023-01-22 15:15:08.067534: step: 898/466, loss: 0.022731278091669083 2023-01-22 15:15:08.810786: step: 900/466, loss: 0.02300518937408924 2023-01-22 15:15:09.509756: step: 902/466, loss: 0.012142459861934185 2023-01-22 15:15:10.404854: step: 904/466, loss: 0.006614815443754196 2023-01-22 15:15:11.221844: step: 906/466, loss: 0.06262435764074326 2023-01-22 15:15:11.960663: step: 908/466, loss: 0.012923638336360455 2023-01-22 15:15:12.680191: step: 910/466, loss: 0.06749510020017624 2023-01-22 15:15:13.399144: step: 912/466, loss: 0.019686004146933556 2023-01-22 15:15:14.142901: step: 914/466, loss: 0.01593194529414177 2023-01-22 15:15:14.936151: step: 916/466, loss: 0.01698552817106247 2023-01-22 15:15:15.624035: step: 918/466, loss: 0.012973749078810215 2023-01-22 15:15:16.385593: step: 920/466, loss: 0.06905797868967056 2023-01-22 15:15:17.256032: step: 922/466, loss: 0.055974330753088 2023-01-22 15:15:17.993973: step: 924/466, loss: 0.009109921753406525 2023-01-22 15:15:18.702725: step: 926/466, loss: 0.035047173500061035 2023-01-22 15:15:19.361545: step: 928/466, loss: 0.018603753298521042 2023-01-22 15:15:20.168580: step: 930/466, loss: 0.10363016277551651 2023-01-22 15:15:20.946417: step: 932/466, loss: 0.09053977578878403
==================================================
Loss: 0.099
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30726317122593716, 'r': 0.30143275051956264, 'f1': 0.3043200374019339}, 'combined': 0.22423581703300394, 'epoch': 26}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.356021144962126, 'r': 0.28420042913859755, 'f1': 0.31608232610022163}, 'combined': 0.19427499067623377, 'epoch': 26}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28986853722109607, 'r': 0.3173702959707257, 'f1': 0.3029966412619066}, 'combined': 0.22326068303508909, 'epoch': 26}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33296610409620386, 'r': 0.2848229531143285, 'f1': 0.3070186755455431}, 'combined': 0.1887041615548216, 'epoch': 26}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3247170275590551, 'r': 0.3130099620493359, 'f1': 0.31875603864734303}, 'combined': 0.23487287058225276, 'epoch': 26}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3511413062270572, 'r': 0.2791512633414062, 'f1': 0.3110350461905211}, 'combined': 0.192109881470616, 'epoch': 26}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.3055555555555556, 'r': 0.3142857142857143, 'f1': 0.3098591549295775}, 'combined': 0.20657276995305165, 'epoch': 26}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2857142857142857, 'r': 0.43478260869565216, 'f1': 0.3448275862068965}, 'combined': 0.17241379310344826, 'epoch': 26}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.359375, 'r': 0.19827586206896552, 'f1': 0.2555555555555556}, 'combined': 0.1703703703703704, 'epoch': 26}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
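As an aside on how the summary numbers above relate to one another: the logged scores are numerically consistent with the standard F1 formula and with each 'combined' score being the product of the template F1 and the slot F1. This is a reading inferred from the logged numbers themselves, not taken from the training code, so treat the sketch below as an assumption:

```python
# Sketch (inferred from the logged values, NOT from train.py itself):
# each 'f1' matches the harmonic mean of 'p' and 'r', and each
# 'combined' score matches template_f1 * slot_f1.

def f1(p: float, r: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Dev Chinese numbers copied from the epoch-26 summary above.
template_f1 = f1(1.0, 0.5833333333333334)               # logged as 0.7368421052631579
slot_f1 = f1(0.30726317122593716, 0.30143275051956264)  # logged as 0.3043200374019339
combined = template_f1 * slot_f1                        # logged as 0.22423581703300394

assert abs(template_f1 - 0.7368421052631579) < 1e-6
assert abs(slot_f1 - 0.3043200374019339) < 1e-6
assert abs(combined - 0.22423581703300394) < 1e-6
```

Under this reading, the low 'combined' scores are driven almost entirely by slot F1 (~0.30-0.33), since template F1 is comparatively high across all three languages.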
Epoch: 27
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:18:05.689350: step: 2/466, loss: 0.0002814007457345724 2023-01-22 15:18:06.452267: step: 4/466, loss: 0.005627429112792015 2023-01-22 15:18:07.280735: step: 6/466, loss: 0.0013453942956402898 2023-01-22 15:18:08.041542: step: 8/466, loss: 0.026067892089486122 2023-01-22 15:18:08.798258: step: 10/466, loss: 6.268831202760339e-05 2023-01-22 15:18:09.513753: step: 12/466, loss: 0.029802482575178146 2023-01-22 15:18:10.376208: step: 14/466, loss: 0.0027200065087527037 2023-01-22 15:18:11.172142: step: 16/466, loss: 0.005693104583770037 2023-01-22 15:18:11.882004: step: 18/466, loss: 0.0035440269857645035 2023-01-22 15:18:12.796717: step: 20/466, loss: 0.11614356935024261 2023-01-22 15:18:13.623338: step: 22/466, loss: 0.021779097616672516 2023-01-22 15:18:14.366436: step: 24/466, loss: 1.1219671964645386 2023-01-22 15:18:15.116970: step: 26/466, loss: 0.006852391641587019 2023-01-22 15:18:15.861296: step: 28/466, loss: 0.031410735100507736 2023-01-22 15:18:16.658876: step: 30/466, loss: 0.013507161289453506 2023-01-22 15:18:17.315167: step: 32/466, loss: 0.0035800025798380375 2023-01-22 15:18:18.072193: step: 34/466, loss: 0.0002280927001265809 2023-01-22 15:18:18.806215: step: 36/466, loss: 0.002048708265647292 2023-01-22 15:18:19.623418: step: 38/466, loss: 0.03521179407835007 2023-01-22 15:18:20.375995: step: 40/466, loss: 0.00017881934763863683 2023-01-22 15:18:21.044228: step: 42/466, loss: 0.01024035457521677 2023-01-22 15:18:21.838798: step: 44/466, loss: 0.02134045958518982 2023-01-22 15:18:22.659495: step: 46/466, loss: 0.004514336585998535 2023-01-22 15:18:23.453229: step: 48/466, loss: 0.004240673966705799 2023-01-22 15:18:24.215020: step: 50/466, loss: 0.058541689068078995 2023-01-22 15:18:25.011873: step: 52/466,
loss: 0.043757617473602295 2023-01-22 15:18:25.766239: step: 54/466, loss: 0.02244669944047928 2023-01-22 15:18:26.486624: step: 56/466, loss: 0.022564508020877838 2023-01-22 15:18:27.230920: step: 58/466, loss: 0.0769951343536377 2023-01-22 15:18:27.893807: step: 60/466, loss: 0.000647237931843847 2023-01-22 15:18:28.618299: step: 62/466, loss: 0.006991761736571789 2023-01-22 15:18:29.457346: step: 64/466, loss: 3.979682683944702 2023-01-22 15:18:30.202048: step: 66/466, loss: 0.006263590883463621 2023-01-22 15:18:30.948305: step: 68/466, loss: 0.22763699293136597 2023-01-22 15:18:31.726026: step: 70/466, loss: 0.023374492302536964 2023-01-22 15:18:32.464447: step: 72/466, loss: 0.005913762375712395 2023-01-22 15:18:33.254791: step: 74/466, loss: 0.026825105771422386 2023-01-22 15:18:33.948843: step: 76/466, loss: 0.029739806428551674 2023-01-22 15:18:34.768428: step: 78/466, loss: 0.0008574838866479695 2023-01-22 15:18:35.487950: step: 80/466, loss: 0.006026826333254576 2023-01-22 15:18:36.183594: step: 82/466, loss: 0.04565891996026039 2023-01-22 15:18:36.930543: step: 84/466, loss: 0.026821792125701904 2023-01-22 15:18:37.619021: step: 86/466, loss: 0.0022763977758586407 2023-01-22 15:18:38.343274: step: 88/466, loss: 0.03604477643966675 2023-01-22 15:18:39.055614: step: 90/466, loss: 0.05254192650318146 2023-01-22 15:18:39.770756: step: 92/466, loss: 0.048561904579401016 2023-01-22 15:18:40.562874: step: 94/466, loss: 0.07324948161840439 2023-01-22 15:18:41.344569: step: 96/466, loss: 0.0026386980898678303 2023-01-22 15:18:42.170182: step: 98/466, loss: 0.024073511362075806 2023-01-22 15:18:42.891293: step: 100/466, loss: 0.0024970003869384527 2023-01-22 15:18:43.634847: step: 102/466, loss: 0.031353145837783813 2023-01-22 15:18:44.409475: step: 104/466, loss: 0.038495924323797226 2023-01-22 15:18:45.117402: step: 106/466, loss: 0.0358666330575943 2023-01-22 15:18:45.831588: step: 108/466, loss: 0.008217512629926205 2023-01-22 15:18:46.659713: step: 110/466, 
loss: 0.020881768316030502 2023-01-22 15:18:47.416613: step: 112/466, loss: 0.016404718160629272 2023-01-22 15:18:48.149820: step: 114/466, loss: 0.06059930473566055 2023-01-22 15:18:48.972491: step: 116/466, loss: 0.060039568692445755 2023-01-22 15:18:49.745142: step: 118/466, loss: 0.02221786417067051 2023-01-22 15:18:50.448659: step: 120/466, loss: 0.007926841266453266 2023-01-22 15:18:51.197880: step: 122/466, loss: 0.06578806042671204 2023-01-22 15:18:51.966624: step: 124/466, loss: 0.025813933461904526 2023-01-22 15:18:52.759390: step: 126/466, loss: 0.003890152322128415 2023-01-22 15:18:53.425453: step: 128/466, loss: 0.08698451519012451 2023-01-22 15:18:54.160336: step: 130/466, loss: 0.028302686288952827 2023-01-22 15:18:54.975058: step: 132/466, loss: 0.026748361065983772 2023-01-22 15:18:55.678869: step: 134/466, loss: 0.0011858418583869934 2023-01-22 15:18:56.505535: step: 136/466, loss: 0.0025568141136318445 2023-01-22 15:18:57.231126: step: 138/466, loss: 0.09356352686882019 2023-01-22 15:18:57.952477: step: 140/466, loss: 0.005866044666618109 2023-01-22 15:18:58.706465: step: 142/466, loss: 0.01024040300399065 2023-01-22 15:18:59.474095: step: 144/466, loss: 0.00738569488748908 2023-01-22 15:19:00.192626: step: 146/466, loss: 0.04120223969221115 2023-01-22 15:19:00.935796: step: 148/466, loss: 0.0715554878115654 2023-01-22 15:19:01.734851: step: 150/466, loss: 0.027549341320991516 2023-01-22 15:19:02.602539: step: 152/466, loss: 0.014845267869532108 2023-01-22 15:19:03.444067: step: 154/466, loss: 0.03739255666732788 2023-01-22 15:19:04.089007: step: 156/466, loss: 0.012479184195399284 2023-01-22 15:19:04.921887: step: 158/466, loss: 0.0043110898695886135 2023-01-22 15:19:05.621315: step: 160/466, loss: 0.020059850066900253 2023-01-22 15:19:06.412511: step: 162/466, loss: 0.01724727638065815 2023-01-22 15:19:07.250779: step: 164/466, loss: 0.010996063239872456 2023-01-22 15:19:08.006177: step: 166/466, loss: 0.04114590212702751 2023-01-22 
15:19:08.760385: step: 168/466, loss: 0.03185999393463135 2023-01-22 15:19:09.541296: step: 170/466, loss: 0.039986953139305115 2023-01-22 15:19:10.280890: step: 172/466, loss: 0.003128435928374529 2023-01-22 15:19:11.031207: step: 174/466, loss: 0.0022137167397886515 2023-01-22 15:19:11.812316: step: 176/466, loss: 0.008617050014436245 2023-01-22 15:19:12.586661: step: 178/466, loss: 0.0037066603545099497 2023-01-22 15:19:13.392324: step: 180/466, loss: 0.0012156914453953505 2023-01-22 15:19:14.198892: step: 182/466, loss: 0.013374599628150463 2023-01-22 15:19:15.039639: step: 184/466, loss: 0.02156377211213112 2023-01-22 15:19:15.744049: step: 186/466, loss: 0.002165052341297269 2023-01-22 15:19:16.582195: step: 188/466, loss: 0.03450706973671913 2023-01-22 15:19:17.346816: step: 190/466, loss: 0.00039736999315209687 2023-01-22 15:19:18.113912: step: 192/466, loss: 0.008332728408277035 2023-01-22 15:19:18.919436: step: 194/466, loss: 0.019774070009589195 2023-01-22 15:19:19.624367: step: 196/466, loss: 0.0986318439245224 2023-01-22 15:19:20.446090: step: 198/466, loss: 0.0016368558863177896 2023-01-22 15:19:21.177540: step: 200/466, loss: 0.044423457235097885 2023-01-22 15:19:21.849648: step: 202/466, loss: 0.029822947457432747 2023-01-22 15:19:22.659811: step: 204/466, loss: 0.07450821995735168 2023-01-22 15:19:23.381192: step: 206/466, loss: 0.03527345880866051 2023-01-22 15:19:24.165609: step: 208/466, loss: 0.04506620019674301 2023-01-22 15:19:24.835847: step: 210/466, loss: 0.014406089670956135 2023-01-22 15:19:25.655723: step: 212/466, loss: 0.05044175311923027 2023-01-22 15:19:26.460194: step: 214/466, loss: 0.031075172126293182 2023-01-22 15:19:27.167635: step: 216/466, loss: 0.0378032922744751 2023-01-22 15:19:27.872141: step: 218/466, loss: 0.0018708063289523125 2023-01-22 15:19:28.770307: step: 220/466, loss: 0.06911762803792953 2023-01-22 15:19:29.509393: step: 222/466, loss: 0.036743901669979095 2023-01-22 15:19:30.249404: step: 224/466, loss: 
0.02297184430062771 2023-01-22 15:19:31.008784: step: 226/466, loss: 0.006796710193157196 2023-01-22 15:19:31.686470: step: 228/466, loss: 0.01702827587723732 2023-01-22 15:19:32.383321: step: 230/466, loss: 0.025485971942543983 2023-01-22 15:19:33.149260: step: 232/466, loss: 0.021808648481965065 2023-01-22 15:19:33.903539: step: 234/466, loss: 0.0580766499042511 2023-01-22 15:19:34.749736: step: 236/466, loss: 0.01121596060693264 2023-01-22 15:19:35.444671: step: 238/466, loss: 0.00727180577814579 2023-01-22 15:19:36.289532: step: 240/466, loss: 0.016058299690485 2023-01-22 15:19:37.148544: step: 242/466, loss: 0.010801510885357857 2023-01-22 15:19:37.961132: step: 244/466, loss: 0.014843948185443878 2023-01-22 15:19:38.724409: step: 246/466, loss: 0.0010358254658058286 2023-01-22 15:19:39.466716: step: 248/466, loss: 0.021506547927856445 2023-01-22 15:19:40.286230: step: 250/466, loss: 0.014431829564273357 2023-01-22 15:19:40.985914: step: 252/466, loss: 0.0032666679471731186 2023-01-22 15:19:41.722009: step: 254/466, loss: 0.0034550423733890057 2023-01-22 15:19:42.447000: step: 256/466, loss: 0.03942679986357689 2023-01-22 15:19:43.168410: step: 258/466, loss: 0.01921839639544487 2023-01-22 15:19:44.060682: step: 260/466, loss: 0.006756064482033253 2023-01-22 15:19:44.898005: step: 262/466, loss: 0.013162982650101185 2023-01-22 15:19:45.623956: step: 264/466, loss: 0.01315717026591301 2023-01-22 15:19:46.382195: step: 266/466, loss: 0.004986981861293316 2023-01-22 15:19:47.104099: step: 268/466, loss: 0.0030614964198321104 2023-01-22 15:19:47.913718: step: 270/466, loss: 0.043415263295173645 2023-01-22 15:19:48.654435: step: 272/466, loss: 0.006690033245831728 2023-01-22 15:19:49.374367: step: 274/466, loss: 0.0012875624233856797 2023-01-22 15:19:50.158073: step: 276/466, loss: 0.03934786096215248 2023-01-22 15:19:50.859716: step: 278/466, loss: 0.00044055673060938716 2023-01-22 15:19:51.716617: step: 280/466, loss: 0.016125798225402832 2023-01-22 
15:19:52.391518: step: 282/466, loss: 0.0022109998390078545 2023-01-22 15:19:53.230782: step: 284/466, loss: 0.016009317710995674 2023-01-22 15:19:54.033820: step: 286/466, loss: 0.051596127450466156 2023-01-22 15:19:54.750189: step: 288/466, loss: 0.01629328913986683 2023-01-22 15:19:55.512938: step: 290/466, loss: 0.03265248239040375 2023-01-22 15:19:56.193225: step: 292/466, loss: 0.005238786339759827 2023-01-22 15:19:56.951641: step: 294/466, loss: 0.011197719722986221 2023-01-22 15:19:57.730396: step: 296/466, loss: 0.06267337501049042 2023-01-22 15:19:58.475914: step: 298/466, loss: 0.006046592723578215 2023-01-22 15:19:59.141215: step: 300/466, loss: 0.02558089606463909 2023-01-22 15:19:59.802678: step: 302/466, loss: 0.0013408252270892262 2023-01-22 15:20:00.580395: step: 304/466, loss: 0.001228883396834135 2023-01-22 15:20:01.506661: step: 306/466, loss: 0.018673928454518318 2023-01-22 15:20:02.424504: step: 308/466, loss: 0.008346027694642544 2023-01-22 15:20:03.206649: step: 310/466, loss: 0.004616781137883663 2023-01-22 15:20:03.994729: step: 312/466, loss: 0.042829494923353195 2023-01-22 15:20:04.737209: step: 314/466, loss: 0.003077883506193757 2023-01-22 15:20:05.494711: step: 316/466, loss: 0.02967197820544243 2023-01-22 15:20:06.208697: step: 318/466, loss: 0.02122427523136139 2023-01-22 15:20:07.038635: step: 320/466, loss: 0.01630318909883499 2023-01-22 15:20:07.828265: step: 322/466, loss: 0.0977238118648529 2023-01-22 15:20:08.570426: step: 324/466, loss: 0.02882983162999153 2023-01-22 15:20:09.325003: step: 326/466, loss: 0.1824617236852646 2023-01-22 15:20:10.041275: step: 328/466, loss: 0.0032043601386249065 2023-01-22 15:20:10.712101: step: 330/466, loss: 0.002083337400108576 2023-01-22 15:20:11.505168: step: 332/466, loss: 0.0006186572136357427 2023-01-22 15:20:12.271706: step: 334/466, loss: 0.08247829973697662 2023-01-22 15:20:13.136059: step: 336/466, loss: 0.010414715856313705 2023-01-22 15:20:13.997585: step: 338/466, loss: 
0.05850432068109512 2023-01-22 15:20:14.705958: step: 340/466, loss: 0.011604820378124714 2023-01-22 15:20:15.505083: step: 342/466, loss: 0.00664097722619772 2023-01-22 15:20:16.362040: step: 344/466, loss: 0.3574390709400177 2023-01-22 15:20:17.315058: step: 346/466, loss: 0.2645539939403534 2023-01-22 15:20:18.149154: step: 348/466, loss: 0.00910822581499815 2023-01-22 15:20:18.915674: step: 350/466, loss: 0.012043699622154236 2023-01-22 15:20:19.690231: step: 352/466, loss: 0.0147309685125947 2023-01-22 15:20:20.371057: step: 354/466, loss: 0.0217413492500782 2023-01-22 15:20:21.120743: step: 356/466, loss: 0.006431126035749912 2023-01-22 15:20:21.807872: step: 358/466, loss: 0.02148101106286049 2023-01-22 15:20:22.557747: step: 360/466, loss: 0.062402743846178055 2023-01-22 15:20:23.260937: step: 362/466, loss: 0.01829441823065281 2023-01-22 15:20:24.018885: step: 364/466, loss: 0.06198782101273537 2023-01-22 15:20:24.766383: step: 366/466, loss: 0.02825883962213993 2023-01-22 15:20:25.574034: step: 368/466, loss: 0.025904733687639236 2023-01-22 15:20:26.357744: step: 370/466, loss: 0.08099622279405594 2023-01-22 15:20:27.123827: step: 372/466, loss: 0.023321600630879402 2023-01-22 15:20:27.867917: step: 374/466, loss: 0.025583907961845398 2023-01-22 15:20:28.584028: step: 376/466, loss: 0.0027648108080029488 2023-01-22 15:20:29.384544: step: 378/466, loss: 0.004878263454884291 2023-01-22 15:20:30.204156: step: 380/466, loss: 0.053702231496572495 2023-01-22 15:20:30.942054: step: 382/466, loss: 0.0037088815588504076 2023-01-22 15:20:31.659450: step: 384/466, loss: 0.0027306238189339638 2023-01-22 15:20:32.383894: step: 386/466, loss: 0.029041174799203873 2023-01-22 15:20:33.128780: step: 388/466, loss: 0.024189729243516922 2023-01-22 15:20:33.953811: step: 390/466, loss: 0.06975167244672775 2023-01-22 15:20:34.628857: step: 392/466, loss: 0.09673363715410233 2023-01-22 15:20:35.407464: step: 394/466, loss: 0.021510764956474304 2023-01-22 15:20:36.120639: step: 
396/466, loss: 0.0034308144822716713 2023-01-22 15:20:36.870221: step: 398/466, loss: 0.353545606136322 2023-01-22 15:20:37.603264: step: 400/466, loss: 0.1800854206085205 2023-01-22 15:20:38.403923: step: 402/466, loss: 0.0428028479218483 2023-01-22 15:20:39.209705: step: 404/466, loss: 0.05135725438594818 2023-01-22 15:20:40.002378: step: 406/466, loss: 0.018109343945980072 2023-01-22 15:20:40.781046: step: 408/466, loss: 0.012806784361600876 2023-01-22 15:20:41.549839: step: 410/466, loss: 0.13045519590377808 2023-01-22 15:20:42.319969: step: 412/466, loss: 0.0860825628042221 2023-01-22 15:20:43.098650: step: 414/466, loss: 0.005686786957085133 2023-01-22 15:20:43.797855: step: 416/466, loss: 0.013734704814851284 2023-01-22 15:20:44.539737: step: 418/466, loss: 0.011565088294446468 2023-01-22 15:20:45.311084: step: 420/466, loss: 0.0019097490003332496 2023-01-22 15:20:45.917191: step: 422/466, loss: 0.004762308672070503 2023-01-22 15:20:46.713768: step: 424/466, loss: 0.009719901718199253 2023-01-22 15:20:47.452642: step: 426/466, loss: 0.01680520363152027 2023-01-22 15:20:48.129154: step: 428/466, loss: 0.04889770224690437 2023-01-22 15:20:48.894568: step: 430/466, loss: 0.04680616408586502 2023-01-22 15:20:49.669473: step: 432/466, loss: 0.0015351184410974383 2023-01-22 15:20:50.402830: step: 434/466, loss: 0.017736423760652542 2023-01-22 15:20:51.222188: step: 436/466, loss: 0.015004309825599194 2023-01-22 15:20:52.000284: step: 438/466, loss: 0.0015911321388557553 2023-01-22 15:20:52.672753: step: 440/466, loss: 0.1260666847229004 2023-01-22 15:20:53.450971: step: 442/466, loss: 0.05454300716519356 2023-01-22 15:20:54.154174: step: 444/466, loss: 0.029246529564261436 2023-01-22 15:20:54.850546: step: 446/466, loss: 0.009855160489678383 2023-01-22 15:20:55.550780: step: 448/466, loss: 0.0028746260795742273 2023-01-22 15:20:56.236936: step: 450/466, loss: 0.005132277961820364 2023-01-22 15:20:57.108913: step: 452/466, loss: 0.008061932399868965 2023-01-22 
15:20:57.777879: step: 454/466, loss: 0.044358186423778534 2023-01-22 15:20:58.533885: step: 456/466, loss: 0.029368244111537933 2023-01-22 15:20:59.192202: step: 458/466, loss: 0.005659927614033222 2023-01-22 15:20:59.948007: step: 460/466, loss: 0.06808044016361237 2023-01-22 15:21:00.707955: step: 462/466, loss: 0.0015544743509963155 2023-01-22 15:21:01.492879: step: 464/466, loss: 0.0021502196323126554 2023-01-22 15:21:02.206191: step: 466/466, loss: 0.021343419328331947 2023-01-22 15:21:02.985842: step: 468/466, loss: 0.002755869412794709 2023-01-22 15:21:03.825152: step: 470/466, loss: 0.007736708037555218 2023-01-22 15:21:04.557047: step: 472/466, loss: 0.005857081618160009 2023-01-22 15:21:05.307707: step: 474/466, loss: 0.046281494200229645 2023-01-22 15:21:06.039805: step: 476/466, loss: 0.041273847222328186 2023-01-22 15:21:06.804377: step: 478/466, loss: 0.0016959874192252755 2023-01-22 15:21:07.609115: step: 480/466, loss: 0.5545917749404907 2023-01-22 15:21:08.450927: step: 482/466, loss: 0.3655919134616852 2023-01-22 15:21:09.240568: step: 484/466, loss: 0.0027146392967551947 2023-01-22 15:21:09.979924: step: 486/466, loss: 0.09501084685325623 2023-01-22 15:21:10.735568: step: 488/466, loss: 0.007031205575913191 2023-01-22 15:21:11.446907: step: 490/466, loss: 0.05125099793076515 2023-01-22 15:21:12.133420: step: 492/466, loss: 0.04403087496757507 2023-01-22 15:21:12.885150: step: 494/466, loss: 0.04046096280217171 2023-01-22 15:21:13.553322: step: 496/466, loss: 0.006504176650196314 2023-01-22 15:21:14.273846: step: 498/466, loss: 0.01574026048183441 2023-01-22 15:21:14.979516: step: 500/466, loss: 0.00379569036886096 2023-01-22 15:21:15.707707: step: 502/466, loss: 0.02817787230014801 2023-01-22 15:21:16.480371: step: 504/466, loss: 0.0014384161913767457 2023-01-22 15:21:17.201021: step: 506/466, loss: 0.0026149300392717123 2023-01-22 15:21:17.902468: step: 508/466, loss: 0.06416348367929459 2023-01-22 15:21:18.609714: step: 510/466, loss: 
0.0029873487073928118 2023-01-22 15:21:19.466052: step: 512/466, loss: 0.004441775381565094 2023-01-22 15:21:20.184817: step: 514/466, loss: 0.037997715175151825 2023-01-22 15:21:21.016029: step: 516/466, loss: 0.004481486044824123 2023-01-22 15:21:21.698827: step: 518/466, loss: 0.0038423393853008747 2023-01-22 15:21:22.467086: step: 520/466, loss: 0.012804090976715088 2023-01-22 15:21:23.155963: step: 522/466, loss: 1.297904372215271 2023-01-22 15:21:23.870849: step: 524/466, loss: 0.016443302854895592 2023-01-22 15:21:24.648324: step: 526/466, loss: 0.009351842105388641 2023-01-22 15:21:25.396350: step: 528/466, loss: 0.012461633421480656 2023-01-22 15:21:26.171616: step: 530/466, loss: 0.05990233272314072 2023-01-22 15:21:27.018039: step: 532/466, loss: 0.00506456708535552 2023-01-22 15:21:27.726425: step: 534/466, loss: 0.00388672505505383 2023-01-22 15:21:28.471630: step: 536/466, loss: 0.00025387172354385257 2023-01-22 15:21:29.106985: step: 538/466, loss: 0.03435612469911575 2023-01-22 15:21:29.846787: step: 540/466, loss: 0.07269947230815887 2023-01-22 15:21:30.642306: step: 542/466, loss: 0.007448124699294567 2023-01-22 15:21:31.302660: step: 544/466, loss: 0.03401390090584755 2023-01-22 15:21:32.153092: step: 546/466, loss: 0.060475341975688934 2023-01-22 15:21:32.893344: step: 548/466, loss: 0.010061251930892467 2023-01-22 15:21:33.761981: step: 550/466, loss: 0.06197541579604149 2023-01-22 15:21:34.475831: step: 552/466, loss: 0.010373301804065704 2023-01-22 15:21:35.293401: step: 554/466, loss: 0.20099429786205292 2023-01-22 15:21:36.122557: step: 556/466, loss: 0.010779143311083317 2023-01-22 15:21:36.887939: step: 558/466, loss: 0.022128930315375328 2023-01-22 15:21:37.611572: step: 560/466, loss: 0.025541655719280243 2023-01-22 15:21:38.345715: step: 562/466, loss: 0.012778117321431637 2023-01-22 15:21:39.013714: step: 564/466, loss: 0.048351503908634186 2023-01-22 15:21:39.778578: step: 566/466, loss: 0.02952878549695015 2023-01-22 
15:21:40.569424: step: 568/466, loss: 0.012676065787672997 2023-01-22 15:21:41.324316: step: 570/466, loss: 0.20164689421653748 2023-01-22 15:21:42.128539: step: 572/466, loss: 0.04054632782936096 2023-01-22 15:21:42.956068: step: 574/466, loss: 0.12637276947498322 2023-01-22 15:21:43.730516: step: 576/466, loss: 0.07379309087991714 2023-01-22 15:21:44.465832: step: 578/466, loss: 0.009364991448819637 2023-01-22 15:21:45.203606: step: 580/466, loss: 0.24854327738285065 2023-01-22 15:21:45.910210: step: 582/466, loss: 0.0017289548413828015 2023-01-22 15:21:46.680655: step: 584/466, loss: 0.029230041429400444 2023-01-22 15:21:47.421086: step: 586/466, loss: 0.01573510281741619 2023-01-22 15:21:48.204529: step: 588/466, loss: 0.030351882800459862 2023-01-22 15:21:48.883273: step: 590/466, loss: 0.08203182369470596 2023-01-22 15:21:49.681029: step: 592/466, loss: 0.05282087251543999 2023-01-22 15:21:50.413823: step: 594/466, loss: 0.00015412727952934802 2023-01-22 15:21:51.185956: step: 596/466, loss: 0.006411698181182146 2023-01-22 15:21:51.936707: step: 598/466, loss: 0.052745647728443146 2023-01-22 15:21:52.635785: step: 600/466, loss: 0.017853165045380592 2023-01-22 15:21:53.449026: step: 602/466, loss: 0.07674769312143326 2023-01-22 15:21:54.164133: step: 604/466, loss: 0.04152761772274971 2023-01-22 15:21:54.898376: step: 606/466, loss: 0.05247914791107178 2023-01-22 15:21:55.607168: step: 608/466, loss: 0.03986712917685509 2023-01-22 15:21:56.390443: step: 610/466, loss: 0.0596633218228817 2023-01-22 15:21:57.127006: step: 612/466, loss: 0.0019677767995744944 2023-01-22 15:21:57.923606: step: 614/466, loss: 0.09488515555858612 2023-01-22 15:21:58.640709: step: 616/466, loss: 0.028042137622833252 2023-01-22 15:21:59.430620: step: 618/466, loss: 0.06155802309513092 2023-01-22 15:22:00.252041: step: 620/466, loss: 0.017027219757437706 2023-01-22 15:22:01.151312: step: 622/466, loss: 0.0045541031286120415 2023-01-22 15:22:01.959941: step: 624/466, loss: 
0.06806767731904984 2023-01-22 15:22:02.756041: step: 626/466, loss: 0.012947620823979378 2023-01-22 15:22:03.557952: step: 628/466, loss: 0.008482400327920914 2023-01-22 15:22:04.420985: step: 630/466, loss: 0.008181189186871052 2023-01-22 15:22:05.098068: step: 632/466, loss: 0.003240967635065317 2023-01-22 15:22:05.820177: step: 634/466, loss: 0.002436768962070346 2023-01-22 15:22:06.566071: step: 636/466, loss: 0.004692391492426395 2023-01-22 15:22:07.436748: step: 638/466, loss: 0.006729860324412584 2023-01-22 15:22:08.158451: step: 640/466, loss: 0.014092906378209591 2023-01-22 15:22:08.972709: step: 642/466, loss: 0.04447159916162491 2023-01-22 15:22:09.803392: step: 644/466, loss: 0.016485348343849182 2023-01-22 15:22:10.546565: step: 646/466, loss: 0.04199531674385071 2023-01-22 15:22:11.275687: step: 648/466, loss: 0.018296226859092712 2023-01-22 15:22:12.093882: step: 650/466, loss: 0.027963055297732353 2023-01-22 15:22:12.843422: step: 652/466, loss: 0.00020877993665635586 2023-01-22 15:22:13.573175: step: 654/466, loss: 0.006149666849523783 2023-01-22 15:22:14.211932: step: 656/466, loss: 0.0022929850965738297 2023-01-22 15:22:14.889647: step: 658/466, loss: 0.013425644487142563 2023-01-22 15:22:15.699820: step: 660/466, loss: 0.002585696056485176 2023-01-22 15:22:16.450155: step: 662/466, loss: 0.018005739897489548 2023-01-22 15:22:17.273082: step: 664/466, loss: 0.01200743205845356 2023-01-22 15:22:17.998389: step: 666/466, loss: 0.051574330776929855 2023-01-22 15:22:18.735664: step: 668/466, loss: 0.02668682672083378 2023-01-22 15:22:19.448426: step: 670/466, loss: 0.02358873188495636 2023-01-22 15:22:20.263277: step: 672/466, loss: 0.025741780176758766 2023-01-22 15:22:20.984262: step: 674/466, loss: 0.0008982737781479955 2023-01-22 15:22:21.774027: step: 676/466, loss: 0.013151212595403194 2023-01-22 15:22:22.546837: step: 678/466, loss: 0.05363275855779648 2023-01-22 15:22:23.367132: step: 680/466, loss: 0.198809415102005 2023-01-22 
15:22:24.083068: step: 682/466, loss: 0.14452099800109863 2023-01-22 15:22:24.834652: step: 684/466, loss: 0.7772390246391296 2023-01-22 15:22:25.680222: step: 686/466, loss: 0.014220787212252617 2023-01-22 15:22:26.475708: step: 688/466, loss: 0.006663024891167879 2023-01-22 15:22:27.266065: step: 690/466, loss: 1.1887304782867432 2023-01-22 15:22:28.085653: step: 692/466, loss: 0.056968383491039276 2023-01-22 15:22:28.825660: step: 694/466, loss: 0.010404759086668491 2023-01-22 15:22:29.596255: step: 696/466, loss: 0.15037499368190765 2023-01-22 15:22:30.376080: step: 698/466, loss: 0.006027761846780777 2023-01-22 15:22:31.123802: step: 700/466, loss: 0.013600733131170273 2023-01-22 15:22:31.991678: step: 702/466, loss: 0.03774077072739601 2023-01-22 15:22:32.687591: step: 704/466, loss: 0.006597063969820738 2023-01-22 15:22:33.498324: step: 706/466, loss: 0.005939188413321972 2023-01-22 15:22:34.258486: step: 708/466, loss: 0.016573762521147728 2023-01-22 15:22:34.999902: step: 710/466, loss: 0.08055862039327621 2023-01-22 15:22:35.956516: step: 712/466, loss: 0.013921476900577545 2023-01-22 15:22:36.759337: step: 714/466, loss: 0.010552269406616688 2023-01-22 15:22:37.631223: step: 716/466, loss: 0.03871457651257515 2023-01-22 15:22:38.419628: step: 718/466, loss: 0.017797337844967842 2023-01-22 15:22:39.130743: step: 720/466, loss: 0.0022084051743149757 2023-01-22 15:22:39.923157: step: 722/466, loss: 0.33818894624710083 2023-01-22 15:22:40.650928: step: 724/466, loss: 0.016701536253094673 2023-01-22 15:22:41.394689: step: 726/466, loss: 0.005369607359170914 2023-01-22 15:22:42.128777: step: 728/466, loss: 0.01709195412695408 2023-01-22 15:22:42.860300: step: 730/466, loss: 0.03921685367822647 2023-01-22 15:22:43.572356: step: 732/466, loss: 0.002594085643067956 2023-01-22 15:22:44.323344: step: 734/466, loss: 0.05583988502621651 2023-01-22 15:22:45.010418: step: 736/466, loss: 0.007267627865076065 2023-01-22 15:22:45.782959: step: 738/466, loss: 
0.017063172534108162 2023-01-22 15:22:46.524764: step: 740/466, loss: 0.003912733867764473 2023-01-22 15:22:47.289224: step: 742/466, loss: 0.0022849079687148333 2023-01-22 15:22:47.957988: step: 744/466, loss: 0.0050977920182049274 2023-01-22 15:22:48.785112: step: 746/466, loss: 0.008796028792858124 2023-01-22 15:22:49.558915: step: 748/466, loss: 0.033342909067869186 2023-01-22 15:22:50.308472: step: 750/466, loss: 0.00955971609801054 2023-01-22 15:22:51.100759: step: 752/466, loss: 0.015114396810531616 2023-01-22 15:22:51.814097: step: 754/466, loss: 1.1177012920379639 2023-01-22 15:22:52.539063: step: 756/466, loss: 0.013771473430097103 2023-01-22 15:22:53.398100: step: 758/466, loss: 0.00066518341191113 2023-01-22 15:22:54.117250: step: 760/466, loss: 0.009900188073515892 2023-01-22 15:22:54.858914: step: 762/466, loss: 0.011259110644459724 2023-01-22 15:22:55.594574: step: 764/466, loss: 0.011412428691983223 2023-01-22 15:22:56.322969: step: 766/466, loss: 0.23429237306118011 2023-01-22 15:22:57.128228: step: 768/466, loss: 0.008612806908786297 2023-01-22 15:22:57.942569: step: 770/466, loss: 0.0014791837893426418 2023-01-22 15:22:58.713337: step: 772/466, loss: 0.005687447264790535 2023-01-22 15:22:59.488456: step: 774/466, loss: 0.051985085010528564 2023-01-22 15:23:00.186768: step: 776/466, loss: 0.0002448662417009473 2023-01-22 15:23:00.906278: step: 778/466, loss: 0.02175869606435299 2023-01-22 15:23:01.552475: step: 780/466, loss: 0.025300780311226845 2023-01-22 15:23:02.352862: step: 782/466, loss: 0.04623227193951607 2023-01-22 15:23:03.077573: step: 784/466, loss: 0.004041609354317188 2023-01-22 15:23:03.764946: step: 786/466, loss: 0.02221454679965973 2023-01-22 15:23:04.580326: step: 788/466, loss: 0.00526766199618578 2023-01-22 15:23:05.288131: step: 790/466, loss: 0.03262951225042343 2023-01-22 15:23:06.085895: step: 792/466, loss: 0.0682738646864891 2023-01-22 15:23:06.843006: step: 794/466, loss: 0.0276656411588192 2023-01-22 15:23:07.591942: 
step: 796/466, loss: 0.041441094130277634 2023-01-22 15:23:08.348093: step: 798/466, loss: 0.1558869183063507 2023-01-22 15:23:09.160474: step: 800/466, loss: 0.013775553554296494 2023-01-22 15:23:09.902944: step: 802/466, loss: 0.03571357578039169 2023-01-22 15:23:10.669265: step: 804/466, loss: 0.047355767339468 2023-01-22 15:23:11.543859: step: 806/466, loss: 0.01611384190618992 2023-01-22 15:23:12.521672: step: 808/466, loss: 0.03671904653310776 2023-01-22 15:23:13.318417: step: 810/466, loss: 0.14984849095344543 2023-01-22 15:23:14.122741: step: 812/466, loss: 0.04282063618302345 2023-01-22 15:23:14.882031: step: 814/466, loss: 1.5589016675949097 2023-01-22 15:23:15.624107: step: 816/466, loss: 0.007101284805685282 2023-01-22 15:23:16.383677: step: 818/466, loss: 0.7906081080436707 2023-01-22 15:23:17.155009: step: 820/466, loss: 0.008964190259575844 2023-01-22 15:23:17.908819: step: 822/466, loss: 0.0995369628071785 2023-01-22 15:23:18.805858: step: 824/466, loss: 0.16250324249267578 2023-01-22 15:23:19.530863: step: 826/466, loss: 0.01729477196931839 2023-01-22 15:23:20.292067: step: 828/466, loss: 0.0016664626309648156 2023-01-22 15:23:20.966476: step: 830/466, loss: 0.02467237040400505 2023-01-22 15:23:21.703939: step: 832/466, loss: 0.8608911037445068 2023-01-22 15:23:22.413248: step: 834/466, loss: 0.015265265479683876 2023-01-22 15:23:23.174135: step: 836/466, loss: 0.04130096361041069 2023-01-22 15:23:23.962318: step: 838/466, loss: 0.02975635416805744 2023-01-22 15:23:24.659198: step: 840/466, loss: 0.06842043995857239 2023-01-22 15:23:25.470269: step: 842/466, loss: 0.005540946964174509 2023-01-22 15:23:26.201373: step: 844/466, loss: 0.06130402162671089 2023-01-22 15:23:26.924638: step: 846/466, loss: 0.030608119443058968 2023-01-22 15:23:27.698359: step: 848/466, loss: 0.0076465848833322525 2023-01-22 15:23:28.455533: step: 850/466, loss: 0.033506572246551514 2023-01-22 15:23:29.205382: step: 852/466, loss: 0.0032035030890256166 2023-01-22 
15:23:29.907895: step: 854/466, loss: 0.02916513755917549 2023-01-22 15:23:30.651079: step: 856/466, loss: 0.004772978834807873 2023-01-22 15:23:31.431688: step: 858/466, loss: 4.942668601870537e-05 2023-01-22 15:23:32.327230: step: 860/466, loss: 0.009504769928753376 2023-01-22 15:23:33.105099: step: 862/466, loss: 0.1233750507235527 2023-01-22 15:23:33.763202: step: 864/466, loss: 0.0032240746077150106 2023-01-22 15:23:34.571734: step: 866/466, loss: 0.038527362048625946 2023-01-22 15:23:35.305249: step: 868/466, loss: 0.4711555242538452 2023-01-22 15:23:36.077274: step: 870/466, loss: 0.012075904756784439 2023-01-22 15:23:36.795245: step: 872/466, loss: 0.32846665382385254 2023-01-22 15:23:37.627116: step: 874/466, loss: 0.1630748063325882 2023-01-22 15:23:38.451535: step: 876/466, loss: 0.02237537130713463 2023-01-22 15:23:39.299125: step: 878/466, loss: 0.03680684044957161 2023-01-22 15:23:40.074465: step: 880/466, loss: 0.05501696467399597 2023-01-22 15:23:40.866868: step: 882/466, loss: 0.04206734150648117 2023-01-22 15:23:41.579100: step: 884/466, loss: 0.1176481768488884 2023-01-22 15:23:42.457688: step: 886/466, loss: 0.005531415343284607 2023-01-22 15:23:43.204816: step: 888/466, loss: 0.06910202652215958 2023-01-22 15:23:43.859764: step: 890/466, loss: 0.0008402147796005011 2023-01-22 15:23:44.596742: step: 892/466, loss: 0.0012181682977825403 2023-01-22 15:23:45.430334: step: 894/466, loss: 0.07599858939647675 2023-01-22 15:23:46.207531: step: 896/466, loss: 0.0009424823219887912 2023-01-22 15:23:47.055397: step: 898/466, loss: 0.04718204215168953 2023-01-22 15:23:47.818620: step: 900/466, loss: 0.032861463725566864 2023-01-22 15:23:48.498099: step: 902/466, loss: 0.017354751005768776 2023-01-22 15:23:49.208232: step: 904/466, loss: 0.032805535942316055 2023-01-22 15:23:49.966976: step: 906/466, loss: 0.04162294790148735 2023-01-22 15:23:50.715714: step: 908/466, loss: 0.0005440631066448987 2023-01-22 15:23:51.474013: step: 910/466, loss: 
0.2701677680015564 2023-01-22 15:23:52.254589: step: 912/466, loss: 0.3160749673843384 2023-01-22 15:23:52.973273: step: 914/466, loss: 0.004833092913031578 2023-01-22 15:23:53.724510: step: 916/466, loss: 0.11880436539649963 2023-01-22 15:23:54.500528: step: 918/466, loss: 0.00848419964313507 2023-01-22 15:23:55.151403: step: 920/466, loss: 0.029202762991189957 2023-01-22 15:23:55.991398: step: 922/466, loss: 0.0001330649247393012 2023-01-22 15:23:56.735853: step: 924/466, loss: 0.011204649694263935 2023-01-22 15:23:57.588723: step: 926/466, loss: 0.0641421526670456 2023-01-22 15:23:58.347591: step: 928/466, loss: 0.016704823821783066 2023-01-22 15:23:59.024612: step: 930/466, loss: 0.004538416862487793 2023-01-22 15:23:59.835178: step: 932/466, loss: 0.002167261205613613
==================================================
Loss: 0.064
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2922560582162317, 'r': 0.3515945747800586, 'f1': 0.3191909404118706}, 'combined': 0.23519332451400993, 'epoch': 27}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3258434408761855, 'r': 0.31116072083670404, 'f1': 0.31833286511130887}, 'combined': 0.19565824880012156, 'epoch': 27}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2774637971234339, 'r': 0.3674947445771477, 'f1': 0.31619547819127647}, 'combined': 0.23298614182515107, 'epoch': 27}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3079207585626001, 'r': 0.31832709269079884, 'f1': 0.3130374648190728}, 'combined': 0.1924035149619667, 'epoch': 27}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31270701319614364, 'r': 0.36778608938410073, 'f1': 0.3380175025148916}, 'combined': 0.24906552816886748, 'epoch': 27}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.32073328362257175, 'r': 0.3057249670579107, 'f1': 0.31304934515069116}, 'combined': 0.19335400729895635, 'epoch': 27}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.25, 'r': 0.35714285714285715, 'f1': 0.2941176470588235}, 'combined': 0.196078431372549, 'epoch': 27}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.235, 'r': 0.5108695652173914, 'f1': 0.32191780821917804}, 'combined': 0.16095890410958902, 'epoch': 27}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.375, 'r': 0.20689655172413793, 'f1': 0.26666666666666666}, 'combined': 0.17777777777777776, 'epoch': 27}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28298284686781866, 'r': 0.3527888622242065, 'f1': 0.31405359863540006}, 'combined': 0.23140791478397899, 'epoch': 11}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32660396071805126, 'r': 0.30226432413074417, 'f1': 0.3139631233545263}, 'combined': 0.19297245630570886, 'epoch': 11}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.28888888888888886, 'r': 0.5652173913043478, 'f1': 0.38235294117647056}, 'combined': 0.19117647058823528, 'epoch': 11}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 28 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:26:47.680907: step: 2/466, loss: 0.06499077379703522 2023-01-22 15:26:48.399085: step: 4/466, loss: 0.00030605480424128473 2023-01-22 15:26:49.158060: step: 6/466, loss: 0.048321682959795 2023-01-22 15:26:49.920105: step: 8/466, loss: 0.03479750081896782 2023-01-22 15:26:50.652018: step: 10/466, loss: 0.03079729899764061 2023-01-22 15:26:51.543671: step: 12/466, loss: 0.02482619881629944 2023-01-22 15:26:52.240418: step: 14/466, loss: 0.004145639017224312 2023-01-22 15:26:52.985464: step: 16/466, loss: 0.05340668186545372 2023-01-22 15:26:53.748275: step: 18/466, loss: 0.0020916545763611794 2023-01-22 15:26:54.462430: step: 20/466, loss: 0.00204270938411355 2023-01-22 15:26:55.278862: step: 22/466, loss: 0.028095096349716187 2023-01-22 15:26:56.009400: step: 24/466, loss:
0.0031982657965272665 2023-01-22 15:26:56.827654: step: 26/466, loss: 0.00482933921739459 2023-01-22 15:26:57.561849: step: 28/466, loss: 0.01134143304079771 2023-01-22 15:26:58.341709: step: 30/466, loss: 0.004522103816270828 2023-01-22 15:26:59.179368: step: 32/466, loss: 0.005846688058227301 2023-01-22 15:26:59.972856: step: 34/466, loss: 0.0363161526620388 2023-01-22 15:27:00.697420: step: 36/466, loss: 0.010997610166668892 2023-01-22 15:27:01.499698: step: 38/466, loss: 0.0009249352151528001 2023-01-22 15:27:02.260274: step: 40/466, loss: 0.0014230635715648532 2023-01-22 15:27:03.106411: step: 42/466, loss: 0.08632513135671616 2023-01-22 15:27:03.824242: step: 44/466, loss: 0.028771214187145233 2023-01-22 15:27:04.527770: step: 46/466, loss: 0.01563280262053013 2023-01-22 15:27:05.305329: step: 48/466, loss: 0.059959497302770615 2023-01-22 15:27:06.024846: step: 50/466, loss: 0.0029923501424491405 2023-01-22 15:27:06.709031: step: 52/466, loss: 0.015609527938067913 2023-01-22 15:27:07.489223: step: 54/466, loss: 0.005849896464496851 2023-01-22 15:27:08.268194: step: 56/466, loss: 0.02084076590836048 2023-01-22 15:27:09.030557: step: 58/466, loss: 0.028277015313506126 2023-01-22 15:27:09.785915: step: 60/466, loss: 0.022914856672286987 2023-01-22 15:27:10.601872: step: 62/466, loss: 0.002642903244122863 2023-01-22 15:27:11.334586: step: 64/466, loss: 0.0164827611297369 2023-01-22 15:27:12.054672: step: 66/466, loss: 0.0020859059877693653 2023-01-22 15:27:12.840160: step: 68/466, loss: 0.0057837022468447685 2023-01-22 15:27:13.479703: step: 70/466, loss: 0.031920988112688065 2023-01-22 15:27:14.222919: step: 72/466, loss: 0.048194173723459244 2023-01-22 15:27:14.947950: step: 74/466, loss: 0.037404634058475494 2023-01-22 15:27:15.798847: step: 76/466, loss: 0.0703200027346611 2023-01-22 15:27:16.501172: step: 78/466, loss: 0.0007235261728055775 2023-01-22 15:27:17.283557: step: 80/466, loss: 0.005455330945551395 2023-01-22 15:27:17.987958: step: 82/466, loss: 
0.030146855860948563 2023-01-22 15:27:18.734565: step: 84/466, loss: 0.019173510372638702 2023-01-22 15:27:19.511429: step: 86/466, loss: 0.31309449672698975 2023-01-22 15:27:20.422659: step: 88/466, loss: 0.0006127296946942806 2023-01-22 15:27:21.148619: step: 90/466, loss: 0.010310042649507523 2023-01-22 15:27:21.882515: step: 92/466, loss: 0.006365275010466576 2023-01-22 15:27:22.629574: step: 94/466, loss: 0.019108332693576813 2023-01-22 15:27:23.328705: step: 96/466, loss: 0.005697840824723244 2023-01-22 15:27:24.160954: step: 98/466, loss: 0.02297283336520195 2023-01-22 15:27:24.950678: step: 100/466, loss: 0.00931552518159151 2023-01-22 15:27:25.659132: step: 102/466, loss: 0.0034823233727365732 2023-01-22 15:27:26.382013: step: 104/466, loss: 0.0037601529620587826 2023-01-22 15:27:27.119033: step: 106/466, loss: 0.10511481761932373 2023-01-22 15:27:27.842833: step: 108/466, loss: 0.03389532491564751 2023-01-22 15:27:28.643128: step: 110/466, loss: 0.036362238228321075 2023-01-22 15:27:29.336363: step: 112/466, loss: 0.032383453100919724 2023-01-22 15:27:30.082011: step: 114/466, loss: 0.04542417451739311 2023-01-22 15:27:30.849753: step: 116/466, loss: 0.017092755064368248 2023-01-22 15:27:31.548048: step: 118/466, loss: 0.002053190255537629 2023-01-22 15:27:32.340790: step: 120/466, loss: 0.024400081485509872 2023-01-22 15:27:33.115754: step: 122/466, loss: 0.08548931777477264 2023-01-22 15:27:33.833628: step: 124/466, loss: 0.004342307336628437 2023-01-22 15:27:34.544425: step: 126/466, loss: 0.0299422238022089 2023-01-22 15:27:35.274428: step: 128/466, loss: 0.0469515398144722 2023-01-22 15:27:36.092529: step: 130/466, loss: 0.0016981277149170637 2023-01-22 15:27:36.931339: step: 132/466, loss: 0.02915450558066368 2023-01-22 15:27:37.763141: step: 134/466, loss: 0.1387241780757904 2023-01-22 15:27:38.543977: step: 136/466, loss: 0.03041171096265316 2023-01-22 15:27:39.264280: step: 138/466, loss: 0.000548867043107748 2023-01-22 15:27:39.961089: step: 
140/466, loss: 0.001786780427210033 2023-01-22 15:27:40.789825: step: 142/466, loss: 0.01875191740691662 2023-01-22 15:27:41.454566: step: 144/466, loss: 2.2222907543182373 2023-01-22 15:27:42.196876: step: 146/466, loss: 0.1311049610376358 2023-01-22 15:27:42.991046: step: 148/466, loss: 0.14866508543491364 2023-01-22 15:27:43.685249: step: 150/466, loss: 0.00995594821870327 2023-01-22 15:27:44.384904: step: 152/466, loss: 0.0277316402643919 2023-01-22 15:27:45.209253: step: 154/466, loss: 0.0845852717757225 2023-01-22 15:27:45.965109: step: 156/466, loss: 0.014286580495536327 2023-01-22 15:27:46.743845: step: 158/466, loss: 0.02495592273771763 2023-01-22 15:27:47.580217: step: 160/466, loss: 0.028022564947605133 2023-01-22 15:27:48.297192: step: 162/466, loss: 0.02017914690077305 2023-01-22 15:27:49.069596: step: 164/466, loss: 0.0022570958826690912 2023-01-22 15:27:49.854322: step: 166/466, loss: 0.013435564003884792 2023-01-22 15:27:50.655115: step: 168/466, loss: 0.013166406191885471 2023-01-22 15:27:51.361036: step: 170/466, loss: 0.0075103770941495895 2023-01-22 15:27:52.126779: step: 172/466, loss: 0.041951779276132584 2023-01-22 15:27:52.888402: step: 174/466, loss: 0.029832616448402405 2023-01-22 15:27:53.697937: step: 176/466, loss: 0.05610651150345802 2023-01-22 15:27:54.410489: step: 178/466, loss: 0.033889614045619965 2023-01-22 15:27:55.109779: step: 180/466, loss: 0.12283995002508163 2023-01-22 15:27:55.866769: step: 182/466, loss: 0.00535232201218605 2023-01-22 15:27:56.573761: step: 184/466, loss: 0.05301175266504288 2023-01-22 15:27:57.359910: step: 186/466, loss: 0.004555729683488607 2023-01-22 15:27:58.065314: step: 188/466, loss: 0.016074948012828827 2023-01-22 15:27:58.824504: step: 190/466, loss: 0.004225687589496374 2023-01-22 15:27:59.556031: step: 192/466, loss: 0.0014638496795669198 2023-01-22 15:28:00.335472: step: 194/466, loss: 0.014197608456015587 2023-01-22 15:28:01.096984: step: 196/466, loss: 0.03307262435555458 2023-01-22 
15:28:01.798835: step: 198/466, loss: 0.008677857927978039 2023-01-22 15:28:02.551003: step: 200/466, loss: 0.005501512438058853 2023-01-22 15:28:03.415047: step: 202/466, loss: 0.02331840991973877 2023-01-22 15:28:04.182458: step: 204/466, loss: 0.0021949547808617353 2023-01-22 15:28:04.903076: step: 206/466, loss: 0.09439667314291 2023-01-22 15:28:05.701649: step: 208/466, loss: 0.051236849278211594 2023-01-22 15:28:06.449296: step: 210/466, loss: 0.03168812766671181 2023-01-22 15:28:07.239911: step: 212/466, loss: 0.030869079753756523 2023-01-22 15:28:08.012129: step: 214/466, loss: 0.12366204708814621 2023-01-22 15:28:08.731012: step: 216/466, loss: 0.021326301619410515 2023-01-22 15:28:09.434100: step: 218/466, loss: 0.07159413397312164 2023-01-22 15:28:10.164297: step: 220/466, loss: 0.013327101245522499 2023-01-22 15:28:10.933997: step: 222/466, loss: 0.007618334610015154 2023-01-22 15:28:11.670341: step: 224/466, loss: 0.009082970209419727 2023-01-22 15:28:12.338646: step: 226/466, loss: 0.0043992577120661736 2023-01-22 15:28:13.169650: step: 228/466, loss: 0.8100372552871704 2023-01-22 15:28:13.932048: step: 230/466, loss: 0.009599895216524601 2023-01-22 15:28:14.676827: step: 232/466, loss: 0.0001887738617369905 2023-01-22 15:28:15.423225: step: 234/466, loss: 0.03161048889160156 2023-01-22 15:28:16.208181: step: 236/466, loss: 0.013339421711862087 2023-01-22 15:28:16.980799: step: 238/466, loss: 0.5421699285507202 2023-01-22 15:28:17.833470: step: 240/466, loss: 0.0341641902923584 2023-01-22 15:28:18.616735: step: 242/466, loss: 0.0005802357918582857 2023-01-22 15:28:19.337680: step: 244/466, loss: 0.013874661177396774 2023-01-22 15:28:20.072355: step: 246/466, loss: 0.038976095616817474 2023-01-22 15:28:20.784609: step: 248/466, loss: 0.015630293637514114 2023-01-22 15:28:21.636143: step: 250/466, loss: 0.07874112576246262 2023-01-22 15:28:22.478236: step: 252/466, loss: 0.011092585511505604 2023-01-22 15:28:23.262022: step: 254/466, loss: 
0.06348495930433273 2023-01-22 15:28:23.998898: step: 256/466, loss: 0.03054478019475937 2023-01-22 15:28:24.700851: step: 258/466, loss: 0.012458411045372486 2023-01-22 15:28:25.435650: step: 260/466, loss: 0.03350565582513809 2023-01-22 15:28:26.140389: step: 262/466, loss: 0.02486339397728443 2023-01-22 15:28:26.881767: step: 264/466, loss: 0.007128008641302586 2023-01-22 15:28:27.700420: step: 266/466, loss: 0.03835118189454079 2023-01-22 15:28:28.459772: step: 268/466, loss: 0.382914662361145 2023-01-22 15:28:29.208680: step: 270/466, loss: 0.3014521896839142 2023-01-22 15:28:29.989832: step: 272/466, loss: 0.08240548521280289 2023-01-22 15:28:30.758704: step: 274/466, loss: 0.008726700209081173 2023-01-22 15:28:31.483652: step: 276/466, loss: 0.01880439557135105 2023-01-22 15:28:32.229979: step: 278/466, loss: 0.011851079761981964 2023-01-22 15:28:33.041956: step: 280/466, loss: 0.0475764237344265 2023-01-22 15:28:33.812901: step: 282/466, loss: 0.023751405999064445 2023-01-22 15:28:34.571548: step: 284/466, loss: 0.019384315237402916 2023-01-22 15:28:35.354308: step: 286/466, loss: 0.006504491437226534 2023-01-22 15:28:36.042981: step: 288/466, loss: 0.00377333490177989 2023-01-22 15:28:36.761422: step: 290/466, loss: 0.005131383426487446 2023-01-22 15:28:37.508643: step: 292/466, loss: 0.004204194992780685 2023-01-22 15:28:38.278039: step: 294/466, loss: 0.008302164264023304 2023-01-22 15:28:39.125056: step: 296/466, loss: 0.02942466177046299 2023-01-22 15:28:39.989061: step: 298/466, loss: 0.002165537793189287 2023-01-22 15:28:40.796561: step: 300/466, loss: 0.1387287974357605 2023-01-22 15:28:41.601953: step: 302/466, loss: 0.027791647240519524 2023-01-22 15:28:42.361259: step: 304/466, loss: 0.02116568200290203 2023-01-22 15:28:43.152078: step: 306/466, loss: 0.029098939150571823 2023-01-22 15:28:43.904177: step: 308/466, loss: 0.0016982073429971933 2023-01-22 15:28:44.635732: step: 310/466, loss: 0.0003676303313113749 2023-01-22 15:28:45.405716: step: 
312/466, loss: 0.06481810659170151 2023-01-22 15:28:46.125077: step: 314/466, loss: 0.0015671990113332868 2023-01-22 15:28:46.948393: step: 316/466, loss: 0.036887165158987045 2023-01-22 15:28:47.618254: step: 318/466, loss: 0.00625829491764307 2023-01-22 15:28:48.420503: step: 320/466, loss: 1.9453067779541016 2023-01-22 15:28:49.123849: step: 322/466, loss: 0.00021286096307449043 2023-01-22 15:28:49.838286: step: 324/466, loss: 2.494503974914551 2023-01-22 15:28:50.558773: step: 326/466, loss: 0.04123297706246376 2023-01-22 15:28:51.264533: step: 328/466, loss: 0.030818086117506027 2023-01-22 15:28:52.037592: step: 330/466, loss: 0.009971718303859234 2023-01-22 15:28:52.839053: step: 332/466, loss: 0.03440983593463898 2023-01-22 15:28:53.539026: step: 334/466, loss: 0.0012510116212069988 2023-01-22 15:28:54.286993: step: 336/466, loss: 0.3301555812358856 2023-01-22 15:28:55.067382: step: 338/466, loss: 0.01839314214885235 2023-01-22 15:28:55.817498: step: 340/466, loss: 0.0001671733771217987 2023-01-22 15:28:56.522414: step: 342/466, loss: 0.03994448855519295 2023-01-22 15:28:57.283744: step: 344/466, loss: 0.004159613512456417 2023-01-22 15:28:58.008067: step: 346/466, loss: 0.022638363763689995 2023-01-22 15:28:58.710771: step: 348/466, loss: 0.0002808289136737585 2023-01-22 15:28:59.523832: step: 350/466, loss: 0.010448471643030643 2023-01-22 15:29:00.253498: step: 352/466, loss: 0.03703666105866432 2023-01-22 15:29:00.979897: step: 354/466, loss: 0.019218124449253082 2023-01-22 15:29:01.710210: step: 356/466, loss: 0.16720406711101532 2023-01-22 15:29:02.400170: step: 358/466, loss: 0.04546864703297615 2023-01-22 15:29:03.160925: step: 360/466, loss: 0.0613517202436924 2023-01-22 15:29:03.988512: step: 362/466, loss: 0.06387585401535034 2023-01-22 15:29:04.714899: step: 364/466, loss: 0.00040805654134601355 2023-01-22 15:29:05.421660: step: 366/466, loss: 0.03251107037067413 2023-01-22 15:29:06.218623: step: 368/466, loss: 0.031106477603316307 2023-01-22 
15:29:07.025351: step: 370/466, loss: 0.017203882336616516 2023-01-22 15:29:07.893236: step: 372/466, loss: 0.2872418761253357 2023-01-22 15:29:08.685763: step: 374/466, loss: 0.016309423372149467 2023-01-22 15:29:09.536042: step: 376/466, loss: 0.016587570309638977 2023-01-22 15:29:10.305599: step: 378/466, loss: 0.010483672842383385 2023-01-22 15:29:11.066940: step: 380/466, loss: 0.004580613691359758 2023-01-22 15:29:11.859868: step: 382/466, loss: 0.05155673250555992 2023-01-22 15:29:12.601721: step: 384/466, loss: 0.019467420876026154 2023-01-22 15:29:13.377143: step: 386/466, loss: 0.029801692813634872 2023-01-22 15:29:14.173973: step: 388/466, loss: 0.008031142875552177 2023-01-22 15:29:14.972218: step: 390/466, loss: 0.02196209877729416 2023-01-22 15:29:15.644342: step: 392/466, loss: 0.0014395922189578414 2023-01-22 15:29:16.419998: step: 394/466, loss: 0.021542318165302277 2023-01-22 15:29:17.168983: step: 396/466, loss: 0.009997592307627201 2023-01-22 15:29:17.990918: step: 398/466, loss: 0.05296003073453903 2023-01-22 15:29:18.786486: step: 400/466, loss: 0.007914634421467781 2023-01-22 15:29:19.569445: step: 402/466, loss: 0.670566201210022 2023-01-22 15:29:20.417754: step: 404/466, loss: 0.038577012717723846 2023-01-22 15:29:21.182982: step: 406/466, loss: 0.026723712682724 2023-01-22 15:29:21.925079: step: 408/466, loss: 0.0045420206151902676 2023-01-22 15:29:22.707944: step: 410/466, loss: 0.006481709890067577 2023-01-22 15:29:23.458219: step: 412/466, loss: 0.014898900873959064 2023-01-22 15:29:24.191582: step: 414/466, loss: 0.015765273943543434 2023-01-22 15:29:24.924749: step: 416/466, loss: 0.06360304355621338 2023-01-22 15:29:25.662635: step: 418/466, loss: 0.007630039472132921 2023-01-22 15:29:26.380123: step: 420/466, loss: 0.014952539466321468 2023-01-22 15:29:27.223356: step: 422/466, loss: 0.4710962474346161 2023-01-22 15:29:28.097751: step: 424/466, loss: 0.05748264491558075 2023-01-22 15:29:28.798206: step: 426/466, loss: 
0.000547609175555408 2023-01-22 15:29:29.584113: step: 428/466, loss: 0.08756930381059647 2023-01-22 15:29:30.369583: step: 430/466, loss: 0.023667046800255775 2023-01-22 15:29:31.169399: step: 432/466, loss: 0.015044069848954678 2023-01-22 15:29:31.941831: step: 434/466, loss: 3.511402610456571e-05 2023-01-22 15:29:32.770105: step: 436/466, loss: 0.06586025655269623 2023-01-22 15:29:33.498586: step: 438/466, loss: 0.008547937497496605 2023-01-22 15:29:34.239978: step: 440/466, loss: 0.017574824392795563 2023-01-22 15:29:35.011368: step: 442/466, loss: 0.17807810008525848 2023-01-22 15:29:35.765214: step: 444/466, loss: 0.01362221036106348 2023-01-22 15:29:36.665191: step: 446/466, loss: 0.00949972402304411 2023-01-22 15:29:37.539492: step: 448/466, loss: 0.4360661804676056 2023-01-22 15:29:38.280353: step: 450/466, loss: 0.10850492119789124 2023-01-22 15:29:39.041041: step: 452/466, loss: 0.005768525879830122 2023-01-22 15:29:39.781977: step: 454/466, loss: 0.012159998528659344 2023-01-22 15:29:40.499465: step: 456/466, loss: 0.0253811776638031 2023-01-22 15:29:41.373208: step: 458/466, loss: 0.019882716238498688 2023-01-22 15:29:42.161292: step: 460/466, loss: 0.02412649616599083 2023-01-22 15:29:42.888933: step: 462/466, loss: 0.004687316715717316 2023-01-22 15:29:43.713784: step: 464/466, loss: 0.004203404299914837 2023-01-22 15:29:44.402568: step: 466/466, loss: 0.002235273364931345 2023-01-22 15:29:45.123155: step: 468/466, loss: 0.015087980777025223 2023-01-22 15:29:45.858541: step: 470/466, loss: 0.11413915455341339 2023-01-22 15:29:46.688317: step: 472/466, loss: 0.14809414744377136 2023-01-22 15:29:47.372220: step: 474/466, loss: 0.004887235816568136 2023-01-22 15:29:48.143078: step: 476/466, loss: 0.016122309491038322 2023-01-22 15:29:48.862075: step: 478/466, loss: 0.01863691955804825 2023-01-22 15:29:49.674885: step: 480/466, loss: 0.08607181906700134 2023-01-22 15:29:50.476203: step: 482/466, loss: 0.4806332290172577 2023-01-22 15:29:51.254298: step: 
484/466, loss: 0.033233266323804855 2023-01-22 15:29:52.110180: step: 486/466, loss: 0.024094315245747566 2023-01-22 15:29:52.885005: step: 488/466, loss: 0.3318593502044678 2023-01-22 15:29:53.563115: step: 490/466, loss: 0.01980016566812992 2023-01-22 15:29:54.243530: step: 492/466, loss: 0.009421803057193756 2023-01-22 15:29:55.070039: step: 494/466, loss: 0.07890952378511429 2023-01-22 15:29:55.815541: step: 496/466, loss: 0.03698199242353439 2023-01-22 15:29:56.692339: step: 498/466, loss: 0.015461564064025879 2023-01-22 15:29:57.434269: step: 500/466, loss: 0.030055489391088486 2023-01-22 15:29:58.171190: step: 502/466, loss: 0.014831021428108215 2023-01-22 15:29:58.865089: step: 504/466, loss: 0.011799895204603672 2023-01-22 15:29:59.533595: step: 506/466, loss: 0.00011390951840439811 2023-01-22 15:30:00.304735: step: 508/466, loss: 0.02746775932610035 2023-01-22 15:30:01.055154: step: 510/466, loss: 0.09469226002693176 2023-01-22 15:30:01.895268: step: 512/466, loss: 0.04830900952219963 2023-01-22 15:30:02.703481: step: 514/466, loss: 0.004041132051497698 2023-01-22 15:30:03.423396: step: 516/466, loss: 0.019389452412724495 2023-01-22 15:30:04.198869: step: 518/466, loss: 0.05219132825732231 2023-01-22 15:30:04.939412: step: 520/466, loss: 0.0018481820588931441 2023-01-22 15:30:05.713835: step: 522/466, loss: 0.023905931040644646 2023-01-22 15:30:06.440380: step: 524/466, loss: 0.009195341728627682 2023-01-22 15:30:07.193613: step: 526/466, loss: 0.0813000500202179 2023-01-22 15:30:07.921890: step: 528/466, loss: 0.0018000929849222302 2023-01-22 15:30:08.685827: step: 530/466, loss: 0.007307850290089846 2023-01-22 15:30:09.438213: step: 532/466, loss: 0.06790435314178467 2023-01-22 15:30:10.411632: step: 534/466, loss: 0.05543149635195732 2023-01-22 15:30:11.185406: step: 536/466, loss: 0.0454183965921402 2023-01-22 15:30:11.978519: step: 538/466, loss: 0.04881078749895096 2023-01-22 15:30:12.697718: step: 540/466, loss: 0.010402346029877663 2023-01-22 
15:30:13.406299: step: 542/466, loss: 0.008386926725506783 2023-01-22 15:30:14.090583: step: 544/466, loss: 0.002387000946328044 2023-01-22 15:30:14.845238: step: 546/466, loss: 0.0064791422337293625 2023-01-22 15:30:15.561084: step: 548/466, loss: 0.014200640842318535 2023-01-22 15:30:16.265817: step: 550/466, loss: 0.0030625786166638136 2023-01-22 15:30:17.000813: step: 552/466, loss: 0.007705213502049446 2023-01-22 15:30:17.694977: step: 554/466, loss: 0.37956467270851135 2023-01-22 15:30:18.439726: step: 556/466, loss: 0.054494984447956085 2023-01-22 15:30:19.219836: step: 558/466, loss: 0.043674640357494354 2023-01-22 15:30:19.980546: step: 560/466, loss: 0.008610519580543041 2023-01-22 15:30:20.683798: step: 562/466, loss: 0.037072937935590744 2023-01-22 15:30:21.447103: step: 564/466, loss: 0.033782653510570526 2023-01-22 15:30:22.131020: step: 566/466, loss: 0.005472981370985508 2023-01-22 15:30:22.830807: step: 568/466, loss: 0.010912074707448483 2023-01-22 15:30:23.553056: step: 570/466, loss: 0.0232261773198843 2023-01-22 15:30:24.207945: step: 572/466, loss: 0.010280226357281208 2023-01-22 15:30:24.945152: step: 574/466, loss: 0.01418902724981308 2023-01-22 15:30:25.726889: step: 576/466, loss: 0.011683930642902851 2023-01-22 15:30:26.480248: step: 578/466, loss: 0.004620610736310482 2023-01-22 15:30:27.271482: step: 580/466, loss: 0.12468191981315613 2023-01-22 15:30:28.125009: step: 582/466, loss: 0.03425385430455208 2023-01-22 15:30:28.939406: step: 584/466, loss: 0.011958101764321327 2023-01-22 15:30:29.709974: step: 586/466, loss: 0.012401238083839417 2023-01-22 15:30:30.576314: step: 588/466, loss: 0.03542652353644371 2023-01-22 15:30:31.361317: step: 590/466, loss: 0.003349336562678218 2023-01-22 15:30:32.192054: step: 592/466, loss: 0.0013430201215669513 2023-01-22 15:30:32.912246: step: 594/466, loss: 0.008202009834349155 2023-01-22 15:30:33.695428: step: 596/466, loss: 0.011623851023614407 2023-01-22 15:30:34.409122: step: 598/466, loss: 
0.016427496448159218 2023-01-22 15:30:35.187091: step: 600/466, loss: 0.0034484562929719687 2023-01-22 15:30:35.970641: step: 602/466, loss: 0.007288047112524509 2023-01-22 15:30:36.743720: step: 604/466, loss: 0.0009411592618562281 2023-01-22 15:30:37.420274: step: 606/466, loss: 0.0017976914532482624 2023-01-22 15:30:38.136320: step: 608/466, loss: 0.0015250653959810734 2023-01-22 15:30:38.848715: step: 610/466, loss: 0.0024311088491231203 2023-01-22 15:30:39.622795: step: 612/466, loss: 0.0039992425590753555 2023-01-22 15:30:40.479557: step: 614/466, loss: 0.05181852728128433 2023-01-22 15:30:41.253902: step: 616/466, loss: 0.02542915567755699 2023-01-22 15:30:41.995581: step: 618/466, loss: 0.10837765038013458 2023-01-22 15:30:42.652909: step: 620/466, loss: 0.02626294456422329 2023-01-22 15:30:43.421071: step: 622/466, loss: 0.01962939277291298 2023-01-22 15:30:44.257778: step: 624/466, loss: 0.0007828868110664189 2023-01-22 15:30:44.926143: step: 626/466, loss: 0.022047756239771843 2023-01-22 15:30:45.687961: step: 628/466, loss: 0.026129741221666336 2023-01-22 15:30:46.646857: step: 630/466, loss: 0.07522108405828476 2023-01-22 15:30:47.431102: step: 632/466, loss: 0.05754267796874046 2023-01-22 15:30:48.197192: step: 634/466, loss: 0.030316900461912155 2023-01-22 15:30:48.948432: step: 636/466, loss: 0.009969084523618221 2023-01-22 15:30:49.659720: step: 638/466, loss: 0.03778718039393425 2023-01-22 15:30:50.353475: step: 640/466, loss: 0.0012821757700294256 2023-01-22 15:30:51.096144: step: 642/466, loss: 0.006132754497230053 2023-01-22 15:30:51.852277: step: 644/466, loss: 0.018835240975022316 2023-01-22 15:30:52.592802: step: 646/466, loss: 0.006913262885063887 2023-01-22 15:30:53.407776: step: 648/466, loss: 0.001851757988333702 2023-01-22 15:30:54.258705: step: 650/466, loss: 0.039734967052936554 2023-01-22 15:30:55.101440: step: 652/466, loss: 0.01049152109771967 2023-01-22 15:30:55.846784: step: 654/466, loss: 0.028401654213666916 2023-01-22 
15:30:56.568256: step: 656/466, loss: 0.013274271972477436 2023-01-22 15:30:57.358176: step: 658/466, loss: 0.012449268251657486 2023-01-22 15:30:58.060851: step: 660/466, loss: 0.05933975428342819 2023-01-22 15:30:58.837324: step: 662/466, loss: 0.00489422120153904 2023-01-22 15:30:59.581081: step: 664/466, loss: 0.022365255281329155 2023-01-22 15:31:00.368312: step: 666/466, loss: 0.010959116742014885 2023-01-22 15:31:01.184088: step: 668/466, loss: 0.0012225598329678178 2023-01-22 15:31:02.008701: step: 670/466, loss: 0.025942856445908546 2023-01-22 15:31:02.703202: step: 672/466, loss: 0.027114970609545708 2023-01-22 15:31:03.461964: step: 674/466, loss: 0.0978192389011383 2023-01-22 15:31:04.313655: step: 676/466, loss: 0.0022755172103643417 2023-01-22 15:31:05.162360: step: 678/466, loss: 0.0016027453821152449 2023-01-22 15:31:05.977214: step: 680/466, loss: 0.8143453598022461 2023-01-22 15:31:06.858801: step: 682/466, loss: 0.2201130986213684 2023-01-22 15:31:07.604229: step: 684/466, loss: 0.0033785824198275805 2023-01-22 15:31:08.356163: step: 686/466, loss: 0.008999837562441826 2023-01-22 15:31:09.153691: step: 688/466, loss: 0.00800447165966034 2023-01-22 15:31:09.925303: step: 690/466, loss: 0.024964090436697006 2023-01-22 15:31:10.714526: step: 692/466, loss: 0.0023913022596389055 2023-01-22 15:31:11.453239: step: 694/466, loss: 0.002989932894706726 2023-01-22 15:31:12.212191: step: 696/466, loss: 0.019625093787908554 2023-01-22 15:31:12.960893: step: 698/466, loss: 0.027111845090985298 2023-01-22 15:31:13.800968: step: 700/466, loss: 0.06732188165187836 2023-01-22 15:31:14.557211: step: 702/466, loss: 0.025172458961606026 2023-01-22 15:31:15.279124: step: 704/466, loss: 0.013280685059726238 2023-01-22 15:31:16.047490: step: 706/466, loss: 0.007359291426837444 2023-01-22 15:31:16.812239: step: 708/466, loss: 0.24157093465328217 2023-01-22 15:31:17.542281: step: 710/466, loss: 0.3335108757019043 2023-01-22 15:31:18.411871: step: 712/466, loss: 
0.009455726481974125 2023-01-22 15:31:19.192020: step: 714/466, loss: 0.028666729107499123 2023-01-22 15:31:19.971150: step: 716/466, loss: 0.01238565519452095 2023-01-22 15:31:20.735597: step: 718/466, loss: 0.05475013330578804 2023-01-22 15:31:21.506928: step: 720/466, loss: 0.003509529633447528 2023-01-22 15:31:22.213862: step: 722/466, loss: 0.02143089286983013 2023-01-22 15:31:22.957091: step: 724/466, loss: 0.049197278916835785 2023-01-22 15:31:23.720122: step: 726/466, loss: 0.029327843338251114 2023-01-22 15:31:24.493701: step: 728/466, loss: 0.012504545971751213 2023-01-22 15:31:25.225541: step: 730/466, loss: 0.01041465625166893 2023-01-22 15:31:25.948160: step: 732/466, loss: 0.0908796489238739 2023-01-22 15:31:26.713671: step: 734/466, loss: 0.011886252090334892 2023-01-22 15:31:27.414220: step: 736/466, loss: 0.001345496391877532 2023-01-22 15:31:28.116317: step: 738/466, loss: 0.0027429629117250443 2023-01-22 15:31:28.805917: step: 740/466, loss: 0.013406159356236458 2023-01-22 15:31:29.608729: step: 742/466, loss: 0.004068335052579641 2023-01-22 15:31:30.447364: step: 744/466, loss: 0.020365413278341293 2023-01-22 15:31:31.277546: step: 746/466, loss: 0.051735859364271164 2023-01-22 15:31:32.018392: step: 748/466, loss: 0.0009359294781461358 2023-01-22 15:31:32.760288: step: 750/466, loss: 0.01213749311864376 2023-01-22 15:31:33.520224: step: 752/466, loss: 0.01617128774523735 2023-01-22 15:31:34.261072: step: 754/466, loss: 0.0006245630793273449 2023-01-22 15:31:35.039421: step: 756/466, loss: 0.09827623516321182 2023-01-22 15:31:35.844849: step: 758/466, loss: 0.020308727398514748 2023-01-22 15:31:36.580762: step: 760/466, loss: 0.016715632751584053 2023-01-22 15:31:37.285485: step: 762/466, loss: 0.02906610816717148 2023-01-22 15:31:38.052113: step: 764/466, loss: 0.032195430248975754 2023-01-22 15:31:38.791681: step: 766/466, loss: 0.0008314056321978569 2023-01-22 15:31:39.553147: step: 768/466, loss: 0.023581545799970627 2023-01-22 
15:31:40.330257: step: 770/466, loss: 0.013966499827802181 2023-01-22 15:31:41.084603: step: 772/466, loss: 0.035196367651224136 2023-01-22 15:31:41.823966: step: 774/466, loss: 0.026951663196086884 2023-01-22 15:31:42.638188: step: 776/466, loss: 0.08619563281536102 2023-01-22 15:31:43.435175: step: 778/466, loss: 0.053176235407590866 2023-01-22 15:31:44.256877: step: 780/466, loss: 0.006441871169954538 2023-01-22 15:31:45.001690: step: 782/466, loss: 0.0042631844989955425 2023-01-22 15:31:45.789361: step: 784/466, loss: 0.06781040132045746 2023-01-22 15:31:46.532948: step: 786/466, loss: 0.046221598982810974 2023-01-22 15:31:47.238614: step: 788/466, loss: 0.02901509776711464 2023-01-22 15:31:47.950485: step: 790/466, loss: 0.0003212362644262612 2023-01-22 15:31:48.717414: step: 792/466, loss: 0.012461671605706215 2023-01-22 15:31:49.563954: step: 794/466, loss: 0.08590605109930038 2023-01-22 15:31:50.305779: step: 796/466, loss: 0.010314743034541607 2023-01-22 15:31:51.040593: step: 798/466, loss: 0.0017901716055348516 2023-01-22 15:31:51.783700: step: 800/466, loss: 0.015061999671161175 2023-01-22 15:31:52.536386: step: 802/466, loss: 0.005251292604953051 2023-01-22 15:31:53.211819: step: 804/466, loss: 0.06770678609609604 2023-01-22 15:31:54.011243: step: 806/466, loss: 0.1236957386136055 2023-01-22 15:31:54.772734: step: 808/466, loss: 0.010861529037356377 2023-01-22 15:31:55.478237: step: 810/466, loss: 0.01007386390119791 2023-01-22 15:31:56.218067: step: 812/466, loss: 0.001310806954279542 2023-01-22 15:31:56.941764: step: 814/466, loss: 0.018976766616106033 2023-01-22 15:31:57.782463: step: 816/466, loss: 0.06041739508509636 2023-01-22 15:31:58.589099: step: 818/466, loss: 0.04153867065906525 2023-01-22 15:31:59.340325: step: 820/466, loss: 0.017491161823272705 2023-01-22 15:32:00.089537: step: 822/466, loss: 0.017895622178912163 2023-01-22 15:32:00.971670: step: 824/466, loss: 0.06135671213269234 2023-01-22 15:32:01.703348: step: 826/466, loss: 
0.021287666633725166 2023-01-22 15:32:02.490873: step: 828/466, loss: 0.0024881153367459774 2023-01-22 15:32:03.298685: step: 830/466, loss: 0.0017700603930279613 2023-01-22 15:32:04.038093: step: 832/466, loss: 0.04236576706171036 2023-01-22 15:32:04.709164: step: 834/466, loss: 0.014054981991648674 2023-01-22 15:32:05.430389: step: 836/466, loss: 0.013621608726680279 2023-01-22 15:32:06.179607: step: 838/466, loss: 0.04358559846878052 2023-01-22 15:32:06.945687: step: 840/466, loss: 0.08273734897375107 2023-01-22 15:32:07.689478: step: 842/466, loss: 0.004501787014305592 2023-01-22 15:32:08.418757: step: 844/466, loss: 0.010186144150793552 2023-01-22 15:32:09.195550: step: 846/466, loss: 0.014502090401947498 2023-01-22 15:32:09.934006: step: 848/466, loss: 0.0019662058912217617 2023-01-22 15:32:10.706120: step: 850/466, loss: 0.02114732936024666 2023-01-22 15:32:11.496089: step: 852/466, loss: 0.008611384779214859 2023-01-22 15:32:12.287265: step: 854/466, loss: 0.08735426515340805 2023-01-22 15:32:12.981129: step: 856/466, loss: 0.0077768550254404545 2023-01-22 15:32:13.782780: step: 858/466, loss: 0.0015171892009675503 2023-01-22 15:32:14.530030: step: 860/466, loss: 0.03444715589284897 2023-01-22 15:32:15.356734: step: 862/466, loss: 0.00725650554522872 2023-01-22 15:32:16.072102: step: 864/466, loss: 0.00915346760302782 2023-01-22 15:32:16.836526: step: 866/466, loss: 0.04942712560296059 2023-01-22 15:32:17.550739: step: 868/466, loss: 0.023312222212553024 2023-01-22 15:32:18.337211: step: 870/466, loss: 0.0256651621311903 2023-01-22 15:32:19.118016: step: 872/466, loss: 0.00711500458419323 2023-01-22 15:32:19.847116: step: 874/466, loss: 0.0023541359696537256 2023-01-22 15:32:20.582867: step: 876/466, loss: 0.11871694773435593 2023-01-22 15:32:21.349959: step: 878/466, loss: 0.012433771975338459 2023-01-22 15:32:22.093112: step: 880/466, loss: 0.005337044131010771 2023-01-22 15:32:22.798113: step: 882/466, loss: 0.12921734154224396 2023-01-22 
15:32:23.621045: step: 884/466, loss: 0.03527417778968811 2023-01-22 15:32:24.403648: step: 886/466, loss: 0.005586323793977499 2023-01-22 15:32:25.190653: step: 888/466, loss: 0.010646643117070198 2023-01-22 15:32:26.091922: step: 890/466, loss: 0.02303536795079708 2023-01-22 15:32:26.857436: step: 892/466, loss: 0.012876001186668873 2023-01-22 15:32:27.690661: step: 894/466, loss: 0.01227349042892456 2023-01-22 15:32:28.454132: step: 896/466, loss: 0.009249640628695488 2023-01-22 15:32:29.181025: step: 898/466, loss: 0.011361805722117424 2023-01-22 15:32:29.928011: step: 900/466, loss: 0.0027859059628099203 2023-01-22 15:32:30.714767: step: 902/466, loss: 0.0026826439425349236 2023-01-22 15:32:31.458993: step: 904/466, loss: 0.015420181676745415 2023-01-22 15:32:32.251857: step: 906/466, loss: 0.01276139635592699 2023-01-22 15:32:33.027529: step: 908/466, loss: 0.08549144864082336 2023-01-22 15:32:33.956488: step: 910/466, loss: 0.05327073484659195 2023-01-22 15:32:34.709013: step: 912/466, loss: 0.09231462329626083 2023-01-22 15:32:35.478419: step: 914/466, loss: 0.03827878087759018 2023-01-22 15:32:36.211655: step: 916/466, loss: 0.00647725211456418 2023-01-22 15:32:37.008371: step: 918/466, loss: 0.011623353697359562 2023-01-22 15:32:37.755046: step: 920/466, loss: 0.15060193836688995 2023-01-22 15:32:38.530718: step: 922/466, loss: 0.05920673534274101 2023-01-22 15:32:39.312549: step: 924/466, loss: 0.027103755623102188 2023-01-22 15:32:40.103722: step: 926/466, loss: 0.035496946424245834 2023-01-22 15:32:40.847446: step: 928/466, loss: 0.001616903580725193 2023-01-22 15:32:41.656519: step: 930/466, loss: 0.014035423286259174 2023-01-22 15:32:42.408844: step: 932/466, loss: 0.002209370955824852 ================================================== Loss: 0.056 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31108763345195733, 'r': 0.3317481024667932, 'f1': 0.3210858585858586}, 
'combined': 0.23658958001063263, 'epoch': 28}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35497351171885894, 'r': 0.30298865601653036, 'f1': 0.32692745118567185}, 'combined': 0.20094077487509587, 'epoch': 28}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3266358303249098, 'r': 0.3433704933586338, 'f1': 0.3347941720629047}, 'combined': 0.2466904425726666, 'epoch': 28}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3560351033242133, 'r': 0.30733073421146373, 'f1': 0.3298949795671381}, 'combined': 0.20375866385029123, 'epoch': 28}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27325581395348836, 'r': 0.3357142857142857, 'f1': 0.30128205128205127}, 'combined': 0.20085470085470084, 'epoch': 28}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 28}
New best korean model...
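As a reading aid: the `'combined'` value in these summaries is consistent with the product of the template F1 and the slot F1, where each F1 is the usual harmonic mean of precision and recall. The sketch below reproduces the Dev Chinese epoch-28 numbers from the log above; it is an inference from the logged values, and the helper names are illustrative, not taken from `train.py`.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

# Dev Chinese, epoch 28, precision/recall values copied from the log:
template_f1 = f1(1.0, 0.5833333333333334)              # ~0.7368421052631579
slot_f1 = f1(0.31108763345195733, 0.3317481024667932)  # ~0.3210858585858586

# 'combined' matches template_f1 * slot_f1:
combined = template_f1 * slot_f1
print(combined)  # ~0.23658958001063263, the logged 'combined' value
```

This explains why a perfect template F1 still yields a modest combined score when slot extraction lags behind.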
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r':
0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24} ****************************** Epoch: 29 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 15:35:37.722149: step: 2/466, loss: 0.014562848955392838 2023-01-22 15:35:38.553847: step: 4/466, loss: 0.004749378655105829 2023-01-22 15:35:39.381319: step: 6/466, loss: 0.024124283343553543 2023-01-22 15:35:40.121801: step: 8/466, loss: 0.04171415790915489 2023-01-22 15:35:40.864238: step: 10/466, loss: 0.006787622347474098 2023-01-22 15:35:41.609313: step: 12/466, loss: 0.0034408585634082556 2023-01-22 15:35:42.568243: step: 14/466, loss: 0.017816148698329926 2023-01-22 15:35:43.408832: step: 16/466, loss: 0.037109583616256714 2023-01-22 15:35:44.230720: step: 18/466, loss: 0.004230950027704239 2023-01-22 15:35:44.925196: step: 20/466, loss: 0.047705430537462234 2023-01-22 15:35:45.720121: step: 22/466, loss: 0.07174509763717651 2023-01-22 15:35:46.568399: step: 24/466, loss: 0.024512307718396187 2023-01-22 15:35:47.285724: step: 26/466, loss: 0.00012372205674182624 2023-01-22 15:35:48.076915: step: 28/466, loss: 0.004223550204187632 2023-01-22 15:35:48.809772: step: 30/466, loss: 0.030073709785938263 2023-01-22 15:35:49.479461: step: 32/466, loss: 0.005309813655912876 2023-01-22 15:35:50.287752: step: 34/466, loss: 0.004884167108684778 2023-01-22 15:35:51.084500: step: 36/466, loss: 0.07684051245450974 2023-01-22 15:35:51.806923: step: 38/466, loss: 0.0611296109855175 2023-01-22 15:35:52.573517: step: 40/466, loss: 0.043774817138910294 2023-01-22 15:35:53.303998: step: 42/466, loss: 0.002685240935534239 2023-01-22 15:35:54.072870: step: 44/466, loss: 0.005610991735011339 2023-01-22 15:35:54.818806: step: 46/466, loss: 0.13258616626262665 2023-01-22 15:35:55.628129: step: 48/466, loss: 
0.019392507150769234 2023-01-22 15:35:56.434241: step: 50/466, loss: 0.017014844343066216 2023-01-22 15:35:57.178422: step: 52/466, loss: 0.030051592737436295 2023-01-22 15:35:57.955585: step: 54/466, loss: 0.003113041166216135 2023-01-22 15:35:58.690637: step: 56/466, loss: 0.007894654758274555 2023-01-22 15:35:59.547212: step: 58/466, loss: 0.020002620294690132 2023-01-22 15:36:00.298597: step: 60/466, loss: 2.020765542984009 2023-01-22 15:36:01.063285: step: 62/466, loss: 0.016793828457593918 2023-01-22 15:36:01.751645: step: 64/466, loss: 0.004198416136205196 2023-01-22 15:36:02.556865: step: 66/466, loss: 0.01203058660030365 2023-01-22 15:36:03.400260: step: 68/466, loss: 0.5350288152694702 2023-01-22 15:36:04.110731: step: 70/466, loss: 0.0024632588028907776 2023-01-22 15:36:04.904400: step: 72/466, loss: 0.023018931970000267 2023-01-22 15:36:05.616417: step: 74/466, loss: 0.005367817357182503 2023-01-22 15:36:06.560914: step: 76/466, loss: 0.02340589463710785 2023-01-22 15:36:07.364886: step: 78/466, loss: 0.002251754282042384 2023-01-22 15:36:08.120421: step: 80/466, loss: 0.0617501474916935 2023-01-22 15:36:08.900920: step: 82/466, loss: 0.004363054409623146 2023-01-22 15:36:09.697388: step: 84/466, loss: 0.008134029805660248 2023-01-22 15:36:10.340521: step: 86/466, loss: 0.0098897535353899 2023-01-22 15:36:11.121382: step: 88/466, loss: 0.00210469588637352 2023-01-22 15:36:11.902212: step: 90/466, loss: 0.019904451444745064 2023-01-22 15:36:12.634113: step: 92/466, loss: 0.013698196038603783 2023-01-22 15:36:13.398312: step: 94/466, loss: 0.0030992329120635986 2023-01-22 15:36:14.125318: step: 96/466, loss: 0.0009557530866004527 2023-01-22 15:36:14.971050: step: 98/466, loss: 0.06982994079589844 2023-01-22 15:36:15.721127: step: 100/466, loss: 0.015310260467231274 2023-01-22 15:36:16.539992: step: 102/466, loss: 0.03443342447280884 2023-01-22 15:36:17.329281: step: 104/466, loss: 0.02068159729242325 2023-01-22 15:36:18.134264: step: 106/466, loss: 
0.011591408401727676 2023-01-22 15:36:18.823749: step: 108/466, loss: 0.012503300793468952 2023-01-22 15:36:19.573331: step: 110/466, loss: 0.04376915469765663 2023-01-22 15:36:20.295544: step: 112/466, loss: 0.012198442593216896 2023-01-22 15:36:21.000623: step: 114/466, loss: 0.005721665918827057 2023-01-22 15:36:21.738715: step: 116/466, loss: 0.0016128283459693193 2023-01-22 15:36:22.488251: step: 118/466, loss: 0.004805414471775293 2023-01-22 15:36:23.284673: step: 120/466, loss: 0.002220995258539915 2023-01-22 15:36:23.936418: step: 122/466, loss: 0.0013943741796538234 2023-01-22 15:36:24.695615: step: 124/466, loss: 0.010333836078643799 2023-01-22 15:36:25.487341: step: 126/466, loss: 0.01896827109158039 2023-01-22 15:36:26.177037: step: 128/466, loss: 0.0013134570326656103 2023-01-22 15:36:26.920694: step: 130/466, loss: 0.08043935149908066 2023-01-22 15:36:27.636848: step: 132/466, loss: 0.020539134740829468 2023-01-22 15:36:28.315133: step: 134/466, loss: 0.025884512811899185 2023-01-22 15:36:29.032815: step: 136/466, loss: 0.0005941173294559121 2023-01-22 15:36:29.774185: step: 138/466, loss: 0.024872949346899986 2023-01-22 15:36:30.523589: step: 140/466, loss: 0.010048205964267254 2023-01-22 15:36:31.419353: step: 142/466, loss: 0.006408365443348885 2023-01-22 15:36:32.183681: step: 144/466, loss: 0.005972626153379679 2023-01-22 15:36:32.905005: step: 146/466, loss: 0.07775118947029114 2023-01-22 15:36:33.650846: step: 148/466, loss: 0.014798705466091633 2023-01-22 15:36:34.453018: step: 150/466, loss: 0.009567567147314548 2023-01-22 15:36:35.328947: step: 152/466, loss: 0.009304573759436607 2023-01-22 15:36:36.176410: step: 154/466, loss: 0.01903250254690647 2023-01-22 15:36:37.064281: step: 156/466, loss: 0.08058947324752808 2023-01-22 15:36:37.828569: step: 158/466, loss: 0.06970684230327606 2023-01-22 15:36:38.569390: step: 160/466, loss: 0.058795832097530365 2023-01-22 15:36:39.285682: step: 162/466, loss: 0.003970024641603231 2023-01-22 
15:36:39.954852: step: 164/466, loss: 0.0026512411423027515 2023-01-22 15:36:40.641561: step: 166/466, loss: 0.00022080437338445336 2023-01-22 15:36:41.404236: step: 168/466, loss: 0.0018055520486086607 2023-01-22 15:36:42.158809: step: 170/466, loss: 0.15070666372776031 2023-01-22 15:36:42.917722: step: 172/466, loss: 7.411004543304443 2023-01-22 15:36:43.618635: step: 174/466, loss: 0.08071133494377136 2023-01-22 15:36:44.345352: step: 176/466, loss: 0.01623368076980114 2023-01-22 15:36:45.062169: step: 178/466, loss: 0.0038287367206066847 2023-01-22 15:36:45.751206: step: 180/466, loss: 0.011720797047019005 2023-01-22 15:36:46.479736: step: 182/466, loss: 0.02469259686768055 2023-01-22 15:36:47.268216: step: 184/466, loss: 0.006625003181397915 2023-01-22 15:36:48.085955: step: 186/466, loss: 0.02409050427377224 2023-01-22 15:36:48.976638: step: 188/466, loss: 0.022759372368454933 2023-01-22 15:36:49.730869: step: 190/466, loss: 0.00044331394019536674 2023-01-22 15:36:50.446772: step: 192/466, loss: 0.08735527843236923 2023-01-22 15:36:51.177877: step: 194/466, loss: 0.007105493452399969 2023-01-22 15:36:51.941345: step: 196/466, loss: 0.009793892502784729 2023-01-22 15:36:52.793065: step: 198/466, loss: 0.037961721420288086 2023-01-22 15:36:53.573144: step: 200/466, loss: 0.01566535048186779 2023-01-22 15:36:54.347016: step: 202/466, loss: 0.05330275371670723 2023-01-22 15:36:55.023537: step: 204/466, loss: 0.12019774317741394 2023-01-22 15:36:55.769037: step: 206/466, loss: 0.028496457263827324 2023-01-22 15:36:56.455896: step: 208/466, loss: 0.015642846003174782 2023-01-22 15:36:57.185853: step: 210/466, loss: 0.01874103955924511 2023-01-22 15:36:57.882557: step: 212/466, loss: 0.008261686190962791 2023-01-22 15:36:58.648550: step: 214/466, loss: 0.013057042844593525 2023-01-22 15:36:59.432540: step: 216/466, loss: 0.02398427575826645 2023-01-22 15:37:00.151778: step: 218/466, loss: 0.002774237422272563 2023-01-22 15:37:00.890044: step: 220/466, loss: 
0.027512196451425552 2023-01-22 15:37:01.596512: step: 222/466, loss: 0.004296524450182915 2023-01-22 15:37:02.362315: step: 224/466, loss: 0.10447569191455841 2023-01-22 15:37:03.027182: step: 226/466, loss: 0.013976804912090302 2023-01-22 15:37:03.829561: step: 228/466, loss: 0.04611645266413689 2023-01-22 15:37:04.550917: step: 230/466, loss: 0.12256401777267456 2023-01-22 15:37:05.244338: step: 232/466, loss: 0.0076015181839466095 2023-01-22 15:37:06.080622: step: 234/466, loss: 0.008073708973824978 2023-01-22 15:37:06.772982: step: 236/466, loss: 0.030057324096560478 2023-01-22 15:37:07.508230: step: 238/466, loss: 0.003706106450408697 2023-01-22 15:37:08.299724: step: 240/466, loss: 0.11645306646823883 2023-01-22 15:37:09.151437: step: 242/466, loss: 0.01141782570630312 2023-01-22 15:37:09.870882: step: 244/466, loss: 0.009234164841473103 2023-01-22 15:37:10.624364: step: 246/466, loss: 0.00295763136819005 2023-01-22 15:37:11.299621: step: 248/466, loss: 0.008472193032503128 2023-01-22 15:37:11.948054: step: 250/466, loss: 0.0012959379237145185 2023-01-22 15:37:12.658050: step: 252/466, loss: 0.008279495872557163 2023-01-22 15:37:13.428455: step: 254/466, loss: 0.055974967777729034 2023-01-22 15:37:14.161352: step: 256/466, loss: 0.0235903263092041 2023-01-22 15:37:14.932027: step: 258/466, loss: 0.008927579037845135 2023-01-22 15:37:15.691095: step: 260/466, loss: 0.0002345545799471438 2023-01-22 15:37:16.513883: step: 262/466, loss: 0.020156797021627426 2023-01-22 15:37:17.379867: step: 264/466, loss: 0.4345041811466217 2023-01-22 15:37:18.061627: step: 266/466, loss: 0.008493071421980858 2023-01-22 15:37:18.832575: step: 268/466, loss: 0.01874028518795967 2023-01-22 15:37:19.589694: step: 270/466, loss: 0.008288813754916191 2023-01-22 15:37:20.308059: step: 272/466, loss: 0.00955624133348465 2023-01-22 15:37:21.067185: step: 274/466, loss: 0.013928272761404514 2023-01-22 15:37:21.785633: step: 276/466, loss: 0.00287935184314847 2023-01-22 15:37:22.422599: 
step: 278/466, loss: 0.04072069376707077 2023-01-22 15:37:23.285395: step: 280/466, loss: 0.01191603485494852 2023-01-22 15:37:24.025843: step: 282/466, loss: 0.029306577518582344 2023-01-22 15:37:24.757369: step: 284/466, loss: 4.033674240112305 2023-01-22 15:37:25.523033: step: 286/466, loss: 0.0014825006946921349 2023-01-22 15:37:26.189266: step: 288/466, loss: 0.0002236310683656484 2023-01-22 15:37:26.873825: step: 290/466, loss: 0.03675243258476257 2023-01-22 15:37:27.592443: step: 292/466, loss: 0.013011719100177288 2023-01-22 15:37:28.331693: step: 294/466, loss: 0.0008605459006503224 2023-01-22 15:37:29.072876: step: 296/466, loss: 0.005612206179648638 2023-01-22 15:37:29.833842: step: 298/466, loss: 0.09135919064283371 2023-01-22 15:37:30.609993: step: 300/466, loss: 0.029406633228063583 2023-01-22 15:37:31.327445: step: 302/466, loss: 0.011398477479815483 2023-01-22 15:37:32.122382: step: 304/466, loss: 0.007227979600429535 2023-01-22 15:37:32.865235: step: 306/466, loss: 0.00731433741748333 2023-01-22 15:37:33.546107: step: 308/466, loss: 0.000489395868498832 2023-01-22 15:37:34.409198: step: 310/466, loss: 0.013301452621817589 2023-01-22 15:37:35.160875: step: 312/466, loss: 0.008809504099190235 2023-01-22 15:37:35.851431: step: 314/466, loss: 0.022511150687932968 2023-01-22 15:37:36.513962: step: 316/466, loss: 0.0037717344239354134 2023-01-22 15:37:37.298622: step: 318/466, loss: 0.005426387302577496 2023-01-22 15:37:38.001762: step: 320/466, loss: 0.05997195839881897 2023-01-22 15:37:38.731882: step: 322/466, loss: 0.020946403965353966 2023-01-22 15:37:39.395527: step: 324/466, loss: 0.009468648582696915 2023-01-22 15:37:40.212339: step: 326/466, loss: 0.005700098816305399 2023-01-22 15:37:40.882827: step: 328/466, loss: 0.02063606120646 2023-01-22 15:37:41.682032: step: 330/466, loss: 0.023080935701727867 2023-01-22 15:37:42.446928: step: 332/466, loss: 0.0172546599060297 2023-01-22 15:37:43.174381: step: 334/466, loss: 0.01309296116232872 
2023-01-22 15:37:43.888847: step: 336/466, loss: 0.021634528413414955 2023-01-22 15:37:44.596963: step: 338/466, loss: 0.03849001228809357 2023-01-22 15:37:45.354504: step: 340/466, loss: 0.010629228316247463 2023-01-22 15:37:46.047460: step: 342/466, loss: 0.035972755402326584 2023-01-22 15:37:46.712378: step: 344/466, loss: 0.00018546557112131268 2023-01-22 15:37:47.506090: step: 346/466, loss: 0.0004207846650388092 2023-01-22 15:37:48.267564: step: 348/466, loss: 0.004989704582840204 2023-01-22 15:37:49.015288: step: 350/466, loss: 0.03478853777050972 2023-01-22 15:37:49.777193: step: 352/466, loss: 0.04219071939587593 2023-01-22 15:37:50.519688: step: 354/466, loss: 0.01165375579148531 2023-01-22 15:37:51.157764: step: 356/466, loss: 0.005104635842144489 2023-01-22 15:37:51.899185: step: 358/466, loss: 0.09205269813537598 2023-01-22 15:37:52.644272: step: 360/466, loss: 0.002441998338326812 2023-01-22 15:37:53.300221: step: 362/466, loss: 0.08703027665615082 2023-01-22 15:37:54.104712: step: 364/466, loss: 0.03320421278476715 2023-01-22 15:37:54.860396: step: 366/466, loss: 0.012047701515257359 2023-01-22 15:37:55.601630: step: 368/466, loss: 0.0029933189507573843 2023-01-22 15:37:56.304908: step: 370/466, loss: 0.0032152493949979544 2023-01-22 15:37:57.080845: step: 372/466, loss: 0.18299029767513275 2023-01-22 15:37:57.842607: step: 374/466, loss: 0.0006103235646151006 2023-01-22 15:37:58.573724: step: 376/466, loss: 0.00045644465717487037 2023-01-22 15:37:59.256227: step: 378/466, loss: 0.008309612050652504 2023-01-22 15:37:59.922649: step: 380/466, loss: 0.030501268804073334 2023-01-22 15:38:00.666700: step: 382/466, loss: 0.014829293824732304 2023-01-22 15:38:01.418571: step: 384/466, loss: 0.39675837755203247 2023-01-22 15:38:02.173674: step: 386/466, loss: 0.017327308654785156 2023-01-22 15:38:02.950811: step: 388/466, loss: 0.03276915103197098 2023-01-22 15:38:03.697197: step: 390/466, loss: 0.0019592309836298227 2023-01-22 15:38:04.541573: step: 
392/466, loss: 0.0044213952496647835 2023-01-22 15:38:05.308560: step: 394/466, loss: 0.03716374561190605 2023-01-22 15:38:06.061740: step: 396/466, loss: 0.044696152210235596 2023-01-22 15:38:06.803789: step: 398/466, loss: 0.023156609386205673 2023-01-22 15:38:07.521074: step: 400/466, loss: 0.0072260950691998005 2023-01-22 15:38:08.269122: step: 402/466, loss: 6.983886123634875e-05 2023-01-22 15:38:08.971308: step: 404/466, loss: 0.001076485961675644 2023-01-22 15:38:09.711109: step: 406/466, loss: 0.056770969182252884 2023-01-22 15:38:10.468490: step: 408/466, loss: 0.0008376673213206232 2023-01-22 15:38:11.082075: step: 410/466, loss: 0.00022090923448558897 2023-01-22 15:38:11.876784: step: 412/466, loss: 0.010586812160909176 2023-01-22 15:38:12.646984: step: 414/466, loss: 0.005161886103451252 2023-01-22 15:38:13.357996: step: 416/466, loss: 0.004693008493632078 2023-01-22 15:38:14.069453: step: 418/466, loss: 0.007503495551645756 2023-01-22 15:38:14.782097: step: 420/466, loss: 0.0016249733744189143 2023-01-22 15:38:15.549398: step: 422/466, loss: 0.019222719594836235 2023-01-22 15:38:16.288546: step: 424/466, loss: 0.019780205562710762 2023-01-22 15:38:17.002013: step: 426/466, loss: 0.03432883322238922 2023-01-22 15:38:17.662274: step: 428/466, loss: 0.016445733606815338 2023-01-22 15:38:18.473037: step: 430/466, loss: 0.001471205847337842 2023-01-22 15:38:19.201325: step: 432/466, loss: 0.707700252532959 2023-01-22 15:38:19.909943: step: 434/466, loss: 0.01202785037457943 2023-01-22 15:38:20.644286: step: 436/466, loss: 0.020741842687129974 2023-01-22 15:38:21.318323: step: 438/466, loss: 0.02251707948744297 2023-01-22 15:38:22.001404: step: 440/466, loss: 0.00407326640561223 2023-01-22 15:38:22.836638: step: 442/466, loss: 0.03363305330276489 2023-01-22 15:38:23.623494: step: 444/466, loss: 0.009023510850965977 2023-01-22 15:38:24.374028: step: 446/466, loss: 0.030631855130195618 2023-01-22 15:38:25.105943: step: 448/466, loss: 0.0005966068711131811 
2023-01-22 15:38:25.844538: step: 450/466, loss: 0.019086359068751335 2023-01-22 15:38:26.536170: step: 452/466, loss: 0.0014762079808861017 2023-01-22 15:38:27.296981: step: 454/466, loss: 0.0017003518296405673 2023-01-22 15:38:28.156885: step: 456/466, loss: 0.036129433661699295 2023-01-22 15:38:28.968157: step: 458/466, loss: 0.05053102597594261 2023-01-22 15:38:29.687765: step: 460/466, loss: 0.013784201815724373 2023-01-22 15:38:30.416593: step: 462/466, loss: 0.003369506448507309 2023-01-22 15:38:31.192859: step: 464/466, loss: 0.021329237148165703 2023-01-22 15:38:31.997639: step: 466/466, loss: 0.03785283491015434 2023-01-22 15:38:32.709529: step: 468/466, loss: 0.08001746237277985 2023-01-22 15:38:33.576470: step: 470/466, loss: 0.0024658029433339834 2023-01-22 15:38:34.276910: step: 472/466, loss: 0.03431737795472145 2023-01-22 15:38:34.938899: step: 474/466, loss: 0.021164868026971817 2023-01-22 15:38:35.747289: step: 476/466, loss: 0.032166868448257446 2023-01-22 15:38:36.498978: step: 478/466, loss: 0.09220929443836212 2023-01-22 15:38:37.221458: step: 480/466, loss: 0.0032411126885563135 2023-01-22 15:38:37.917167: step: 482/466, loss: 0.005250800866633654 2023-01-22 15:38:38.717920: step: 484/466, loss: 0.0031083542853593826 2023-01-22 15:38:39.432983: step: 486/466, loss: 0.008143751882016659 2023-01-22 15:38:40.262677: step: 488/466, loss: 0.01995263621211052 2023-01-22 15:38:41.090635: step: 490/466, loss: 0.07909484952688217 2023-01-22 15:38:41.882584: step: 492/466, loss: 0.027795715257525444 2023-01-22 15:38:42.617686: step: 494/466, loss: 0.01582321524620056 2023-01-22 15:38:43.358346: step: 496/466, loss: 0.009483584202826023 2023-01-22 15:38:44.118929: step: 498/466, loss: 0.004073833581060171 2023-01-22 15:38:44.870875: step: 500/466, loss: 0.03123282827436924 2023-01-22 15:38:45.554067: step: 502/466, loss: 0.14656507968902588 2023-01-22 15:38:46.285702: step: 504/466, loss: 0.043541837483644485 2023-01-22 15:38:47.069808: step: 506/466, 
loss: 0.0035433454904705286 2023-01-22 15:38:47.841797: step: 508/466, loss: 0.023579321801662445 2023-01-22 15:38:48.576008: step: 510/466, loss: 0.007745890412479639 2023-01-22 15:38:49.353226: step: 512/466, loss: 0.044587232172489166 2023-01-22 15:38:50.061318: step: 514/466, loss: 0.05123988911509514 2023-01-22 15:38:50.745602: step: 516/466, loss: 0.037435825914144516 2023-01-22 15:38:51.495180: step: 518/466, loss: 0.06502517312765121 2023-01-22 15:38:52.276630: step: 520/466, loss: 0.021820692345499992 2023-01-22 15:38:52.965908: step: 522/466, loss: 0.033464353531599045 2023-01-22 15:38:53.703295: step: 524/466, loss: 0.05899073928594589 2023-01-22 15:38:54.428481: step: 526/466, loss: 0.025905214250087738 2023-01-22 15:38:55.195540: step: 528/466, loss: 0.028305260464549065 2023-01-22 15:38:55.973180: step: 530/466, loss: 0.3684265911579132 2023-01-22 15:38:56.662519: step: 532/466, loss: 0.05670701712369919 2023-01-22 15:38:57.348240: step: 534/466, loss: 0.00019752232765313238 2023-01-22 15:38:58.025687: step: 536/466, loss: 0.011019325815141201 2023-01-22 15:38:58.812288: step: 538/466, loss: 0.010654658079147339 2023-01-22 15:38:59.486867: step: 540/466, loss: 0.005419179797172546 2023-01-22 15:39:00.239937: step: 542/466, loss: 0.1257794350385666 2023-01-22 15:39:00.951169: step: 544/466, loss: 0.0339568629860878 2023-01-22 15:39:01.668999: step: 546/466, loss: 0.01898978278040886 2023-01-22 15:39:02.420359: step: 548/466, loss: 0.003557452466338873 2023-01-22 15:39:03.142999: step: 550/466, loss: 0.013799430802464485 2023-01-22 15:39:03.891595: step: 552/466, loss: 0.0009106646757572889 2023-01-22 15:39:04.673838: step: 554/466, loss: 0.010609936900436878 2023-01-22 15:39:05.481153: step: 556/466, loss: 0.02367284893989563 2023-01-22 15:39:06.214036: step: 558/466, loss: 0.019093787297606468 2023-01-22 15:39:06.946685: step: 560/466, loss: 0.012392389588057995 2023-01-22 15:39:07.809497: step: 562/466, loss: 0.0020131270866841078 2023-01-22 
15:39:08.586278: step: 564/466, loss: 0.04604710638523102 2023-01-22 15:39:09.373488: step: 566/466, loss: 0.0003468830545898527 2023-01-22 15:39:10.180186: step: 568/466, loss: 0.054404839873313904 2023-01-22 15:39:10.867134: step: 570/466, loss: 0.0021197283640503883 2023-01-22 15:39:11.550339: step: 572/466, loss: 0.06230594962835312 2023-01-22 15:39:12.285515: step: 574/466, loss: 0.028980540111660957 2023-01-22 15:39:13.003849: step: 576/466, loss: 0.002518613124266267 2023-01-22 15:39:13.720534: step: 578/466, loss: 0.03597286343574524 2023-01-22 15:39:14.432277: step: 580/466, loss: 0.014612109400331974 2023-01-22 15:39:15.187651: step: 582/466, loss: 0.02597770281136036 2023-01-22 15:39:15.914351: step: 584/466, loss: 0.006040181033313274 2023-01-22 15:39:16.620708: step: 586/466, loss: 0.0018081383313983679 2023-01-22 15:39:17.340090: step: 588/466, loss: 0.004814724437892437 2023-01-22 15:39:18.069632: step: 590/466, loss: 0.026073535904288292 2023-01-22 15:39:18.795349: step: 592/466, loss: 0.011254251934587955 2023-01-22 15:39:19.534367: step: 594/466, loss: 0.051905952394008636 2023-01-22 15:39:20.211207: step: 596/466, loss: 0.278527170419693 2023-01-22 15:39:20.972057: step: 598/466, loss: 0.004648893140256405 2023-01-22 15:39:21.670565: step: 600/466, loss: 0.02231455221772194 2023-01-22 15:39:22.371618: step: 602/466, loss: 0.00720774894580245 2023-01-22 15:39:23.035898: step: 604/466, loss: 0.025287862867116928 2023-01-22 15:39:23.809487: step: 606/466, loss: 0.0005568001070059836 2023-01-22 15:39:24.553428: step: 608/466, loss: 0.0004496572364587337 2023-01-22 15:39:25.373393: step: 610/466, loss: 0.008366197347640991 2023-01-22 15:39:26.155344: step: 612/466, loss: 0.09989674389362335 2023-01-22 15:39:26.929924: step: 614/466, loss: 0.036410532891750336 2023-01-22 15:39:27.680017: step: 616/466, loss: 0.014120755717158318 2023-01-22 15:39:28.417027: step: 618/466, loss: 0.03373485058546066 2023-01-22 15:39:29.260487: step: 620/466, loss: 
0.056192394345998764 2023-01-22 15:39:29.954439: step: 622/466, loss: 0.005703099071979523 2023-01-22 15:39:30.755162: step: 624/466, loss: 0.039040133357048035 2023-01-22 15:39:31.507205: step: 626/466, loss: 0.004824712872505188 2023-01-22 15:39:32.218413: step: 628/466, loss: 0.02520856261253357 2023-01-22 15:39:33.016649: step: 630/466, loss: 0.008502018637955189 2023-01-22 15:39:33.760079: step: 632/466, loss: 0.01860482059419155 2023-01-22 15:39:34.432203: step: 634/466, loss: 0.0026551811024546623 2023-01-22 15:39:35.120944: step: 636/466, loss: 0.004953205585479736 2023-01-22 15:39:35.888240: step: 638/466, loss: 0.0004915996687486768 2023-01-22 15:39:36.705078: step: 640/466, loss: 0.03200971707701683 2023-01-22 15:39:37.467381: step: 642/466, loss: 0.02307523973286152 2023-01-22 15:39:38.165293: step: 644/466, loss: 0.0009260879596695304 2023-01-22 15:39:38.913245: step: 646/466, loss: 0.036600060760974884 2023-01-22 15:39:39.746641: step: 648/466, loss: 0.009183496236801147 2023-01-22 15:39:40.506911: step: 650/466, loss: 0.022826118394732475 2023-01-22 15:39:41.267945: step: 652/466, loss: 0.05116073787212372 2023-01-22 15:39:41.892488: step: 654/466, loss: 0.04589260369539261 2023-01-22 15:39:42.615943: step: 656/466, loss: 0.0015795762883499265 2023-01-22 15:39:43.369100: step: 658/466, loss: 0.00048648411757312715 2023-01-22 15:39:44.208974: step: 660/466, loss: 0.08884550631046295 2023-01-22 15:39:44.885758: step: 662/466, loss: 0.0012818826362490654 2023-01-22 15:39:45.625654: step: 664/466, loss: 0.39736467599868774 2023-01-22 15:39:46.395828: step: 666/466, loss: 0.01191942859441042 2023-01-22 15:39:47.097557: step: 668/466, loss: 0.03434213995933533 2023-01-22 15:39:47.730514: step: 670/466, loss: 0.0016377634601667523 2023-01-22 15:39:48.425383: step: 672/466, loss: 0.023413589224219322 2023-01-22 15:39:49.134437: step: 674/466, loss: 0.004105696454644203 2023-01-22 15:39:49.885918: step: 676/466, loss: 0.014797130599617958 2023-01-22 
15:39:50.651785: step: 678/466, loss: 0.0023186160251498222 2023-01-22 15:39:51.350440: step: 680/466, loss: 0.023976871743798256 2023-01-22 15:39:52.069579: step: 682/466, loss: 0.0022996345069259405 2023-01-22 15:39:52.814644: step: 684/466, loss: 0.019209645688533783 2023-01-22 15:39:53.564110: step: 686/466, loss: 0.0162705909460783 2023-01-22 15:39:54.368050: step: 688/466, loss: 0.4258826971054077 2023-01-22 15:39:55.059151: step: 690/466, loss: 0.012139077298343182 2023-01-22 15:39:55.717871: step: 692/466, loss: 0.003205450950190425 2023-01-22 15:39:56.457652: step: 694/466, loss: 0.008439544588327408 2023-01-22 15:39:57.113090: step: 696/466, loss: 0.004512060433626175 2023-01-22 15:39:57.902469: step: 698/466, loss: 0.0006865372997708619 2023-01-22 15:39:58.694977: step: 700/466, loss: 0.006161834113299847 2023-01-22 15:39:59.428300: step: 702/466, loss: 0.011986282654106617 2023-01-22 15:40:00.185880: step: 704/466, loss: 0.17363914847373962 2023-01-22 15:40:00.928610: step: 706/466, loss: 0.03010350465774536 2023-01-22 15:40:01.701059: step: 708/466, loss: 0.02479240484535694 2023-01-22 15:40:02.504602: step: 710/466, loss: 0.009199201129376888 2023-01-22 15:40:03.278750: step: 712/466, loss: 0.513110339641571 2023-01-22 15:40:04.078301: step: 714/466, loss: 0.006795932538807392 2023-01-22 15:40:04.902171: step: 716/466, loss: 0.017703594639897346 2023-01-22 15:40:05.624850: step: 718/466, loss: 0.005541081074625254 2023-01-22 15:40:06.362055: step: 720/466, loss: 0.010504513047635555 2023-01-22 15:40:07.078199: step: 722/466, loss: 0.011415023356676102 2023-01-22 15:40:07.789021: step: 724/466, loss: 0.05735626071691513 2023-01-22 15:40:08.560888: step: 726/466, loss: 0.07205421477556229 2023-01-22 15:40:09.222673: step: 728/466, loss: 0.04259423911571503 2023-01-22 15:40:09.943153: step: 730/466, loss: 0.1016535609960556 2023-01-22 15:40:10.682257: step: 732/466, loss: 0.003966829739511013 2023-01-22 15:40:11.337617: step: 734/466, loss: 
0.017321214079856873 2023-01-22 15:40:12.065114: step: 736/466, loss: 0.05669906735420227 2023-01-22 15:40:12.860954: step: 738/466, loss: 0.0017302916385233402 2023-01-22 15:40:13.559648: step: 740/466, loss: 0.003923018462955952 2023-01-22 15:40:14.263592: step: 742/466, loss: 0.016879770904779434 2023-01-22 15:40:15.023845: step: 744/466, loss: 2.7057571060140617e-05 2023-01-22 15:40:15.750767: step: 746/466, loss: 0.0012021064758300781 2023-01-22 15:40:16.531057: step: 748/466, loss: 0.010458866134285927 2023-01-22 15:40:17.251766: step: 750/466, loss: 0.05704216659069061 2023-01-22 15:40:17.963220: step: 752/466, loss: 0.018793189898133278 2023-01-22 15:40:18.737551: step: 754/466, loss: 0.0011075339280068874 2023-01-22 15:40:19.520372: step: 756/466, loss: 0.05403747782111168 2023-01-22 15:40:20.329799: step: 758/466, loss: 0.05381093919277191 2023-01-22 15:40:20.985858: step: 760/466, loss: 0.01267674658447504 2023-01-22 15:40:21.756072: step: 762/466, loss: 0.009959987364709377 2023-01-22 15:40:22.489658: step: 764/466, loss: 0.01536529790610075 2023-01-22 15:40:23.144762: step: 766/466, loss: 0.0005679702153429389 2023-01-22 15:40:23.821236: step: 768/466, loss: 0.014717141166329384 2023-01-22 15:40:24.569023: step: 770/466, loss: 0.024072684347629547 2023-01-22 15:40:25.328223: step: 772/466, loss: 0.04764863848686218 2023-01-22 15:40:26.057767: step: 774/466, loss: 2.5167637431877665e-06 2023-01-22 15:40:26.873180: step: 776/466, loss: 0.018080471083521843 2023-01-22 15:40:27.676913: step: 778/466, loss: 0.08379022777080536 2023-01-22 15:40:28.346177: step: 780/466, loss: 0.012476377189159393 2023-01-22 15:40:29.079102: step: 782/466, loss: 0.40508633852005005 2023-01-22 15:40:29.834969: step: 784/466, loss: 0.17186686396598816 2023-01-22 15:40:30.634549: step: 786/466, loss: 0.011937204748392105 2023-01-22 15:40:31.412238: step: 788/466, loss: 0.001307436847127974 2023-01-22 15:40:32.169816: step: 790/466, loss: 0.020467674359679222 2023-01-22 
15:40:32.945327: step: 792/466, loss: 0.06974517554044724 2023-01-22 15:40:33.617803: step: 794/466, loss: 0.006880198139697313 2023-01-22 15:40:34.366125: step: 796/466, loss: 0.0707889124751091 2023-01-22 15:40:35.078485: step: 798/466, loss: 0.0010480673518031836 2023-01-22 15:40:35.869212: step: 800/466, loss: 0.03564343601465225 2023-01-22 15:40:36.572550: step: 802/466, loss: 0.026431893929839134 2023-01-22 15:40:37.376144: step: 804/466, loss: 0.030676953494548798 2023-01-22 15:40:38.061660: step: 806/466, loss: 0.05701667442917824 2023-01-22 15:40:38.855756: step: 808/466, loss: 0.2994881272315979 2023-01-22 15:40:39.622410: step: 810/466, loss: 0.022319236770272255 2023-01-22 15:40:40.292490: step: 812/466, loss: 0.004756301175802946 2023-01-22 15:40:41.077521: step: 814/466, loss: 0.052547529339790344 2023-01-22 15:40:41.866024: step: 816/466, loss: 0.00465787248685956 2023-01-22 15:40:42.521631: step: 818/466, loss: 0.014711483381688595 2023-01-22 15:40:43.277336: step: 820/466, loss: 0.005789772141724825 2023-01-22 15:40:44.002596: step: 822/466, loss: 0.007069493178278208 2023-01-22 15:40:44.732601: step: 824/466, loss: 0.6527952551841736 2023-01-22 15:40:45.455584: step: 826/466, loss: 0.0037681246176362038 2023-01-22 15:40:46.185757: step: 828/466, loss: 0.010844358243048191 2023-01-22 15:40:46.869401: step: 830/466, loss: 0.009292328730225563 2023-01-22 15:40:47.637309: step: 832/466, loss: 0.00010707331966841593 2023-01-22 15:40:48.390227: step: 834/466, loss: 0.0015324027044698596 2023-01-22 15:40:49.174256: step: 836/466, loss: 0.005519147031009197 2023-01-22 15:40:50.032825: step: 838/466, loss: 0.036789048463106155 2023-01-22 15:40:50.767824: step: 840/466, loss: 0.017365090548992157 2023-01-22 15:40:51.545519: step: 842/466, loss: 0.03410469368100166 2023-01-22 15:40:52.228405: step: 844/466, loss: 0.030230188742280006 2023-01-22 15:40:53.006000: step: 846/466, loss: 0.0429796427488327 2023-01-22 15:40:53.635068: step: 848/466, loss: 
0.0044829887337982655 2023-01-22 15:40:54.441385: step: 850/466, loss: 0.49169623851776123 2023-01-22 15:40:55.203440: step: 852/466, loss: 0.002044479828327894 2023-01-22 15:40:55.969820: step: 854/466, loss: 0.0002728290855884552 2023-01-22 15:40:56.789208: step: 856/466, loss: 0.0035052099265158176 2023-01-22 15:40:57.559504: step: 858/466, loss: 0.0392683781683445 2023-01-22 15:40:58.308015: step: 860/466, loss: 0.05269757658243179 2023-01-22 15:40:58.897674: step: 862/466, loss: 0.000325383385643363 2023-01-22 15:40:59.675132: step: 864/466, loss: 0.0006225865217857063 2023-01-22 15:41:00.406174: step: 866/466, loss: 0.021601015701889992 2023-01-22 15:41:01.182055: step: 868/466, loss: 0.39107298851013184 2023-01-22 15:41:01.927370: step: 870/466, loss: 0.0002772028965409845 2023-01-22 15:41:02.691672: step: 872/466, loss: 0.021326979622244835 2023-01-22 15:41:03.527817: step: 874/466, loss: 0.346127450466156 2023-01-22 15:41:04.310370: step: 876/466, loss: 0.008033477701246738 2023-01-22 15:41:05.011224: step: 878/466, loss: 0.07257543504238129 2023-01-22 15:41:05.677248: step: 880/466, loss: 0.008361267857253551 2023-01-22 15:41:06.532815: step: 882/466, loss: 0.5972766876220703 2023-01-22 15:41:07.186315: step: 884/466, loss: 0.008155311457812786 2023-01-22 15:41:07.945463: step: 886/466, loss: 0.04740725830197334 2023-01-22 15:41:08.660468: step: 888/466, loss: 0.7012631297111511 2023-01-22 15:41:09.378096: step: 890/466, loss: 0.01653783954679966 2023-01-22 15:41:10.140608: step: 892/466, loss: 0.016953716054558754 2023-01-22 15:41:10.872331: step: 894/466, loss: 0.9254959225654602 2023-01-22 15:41:11.694278: step: 896/466, loss: 0.047596968710422516 2023-01-22 15:41:12.477521: step: 898/466, loss: 0.01078968495130539 2023-01-22 15:41:13.296376: step: 900/466, loss: 0.020899973809719086 2023-01-22 15:41:13.971519: step: 902/466, loss: 0.009160463698208332 2023-01-22 15:41:14.723947: step: 904/466, loss: 0.030137361958622932 2023-01-22 15:41:15.510794: 
step: 906/466, loss: 0.028629913926124573
2023-01-22 15:41:16.295385: step: 908/466, loss: 0.04084698110818863
2023-01-22 15:41:17.061744: step: 910/466, loss: 0.07203912734985352
2023-01-22 15:41:17.730341: step: 912/466, loss: 0.002872600918635726
2023-01-22 15:41:18.410493: step: 914/466, loss: 0.022516105324029922
2023-01-22 15:41:19.196515: step: 916/466, loss: 0.022481245920062065
2023-01-22 15:41:19.967686: step: 918/466, loss: 0.016639167442917824
2023-01-22 15:41:20.745184: step: 920/466, loss: 0.013560550287365913
2023-01-22 15:41:21.586968: step: 922/466, loss: 0.06358363479375839
2023-01-22 15:41:22.357188: step: 924/466, loss: 0.004070811904966831
2023-01-22 15:41:23.054231: step: 926/466, loss: 0.0028404947370290756
2023-01-22 15:41:23.771407: step: 928/466, loss: 0.01054287701845169
2023-01-22 15:41:24.485366: step: 930/466, loss: 0.017033016309142113
2023-01-22 15:41:25.176201: step: 932/466, loss: 0.014236865565180779
==================================================
Loss: 0.070
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30537003610108304, 'r': 0.3210151802656547, 'f1': 0.3129972247918594}, 'combined': 0.23062953405715952, 'epoch': 29}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3503491975729577, 'r': 0.3018159753983488, 'f1': 0.3242766991489236}, 'combined': 0.19931153215982622, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28686088890250344, 'r': 0.33258444614692523, 'f1': 0.30803515486718736}, 'combined': 0.2269732720074012, 'epoch': 29}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32597067725008727, 'r': 0.3093049320527258, 'f1': 0.31741920105722143}, 'combined': 0.19509667967419464, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.33026802218114604, 'r': 0.3390417457305503, 'f1': 0.3345973782771535}, 'combined': 0.246545436625271, 'epoch': 29}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3449618409391938, 'r': 0.3002003563882603, 'f1': 0.32102831820983246}, 'combined': 0.19828219654136717, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23369565217391305, 'r': 0.30714285714285716, 'f1': 0.2654320987654321}, 'combined': 0.17695473251028807, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.24404761904761904, 'r': 0.44565217391304346, 'f1': 0.3153846153846154}, 'combined': 0.1576923076923077, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.46153846153846156, 'r': 0.20689655172413793, 'f1': 0.28571428571428575}, 'combined': 0.1904761904761905, 'epoch': 29}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 30
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 15:44:05.366235: step: 2/466, loss: 0.0021244355011731386
2023-01-22 15:44:06.113917: step: 4/466, loss: 0.01118533406406641
2023-01-22 15:44:06.802643: step: 6/466, loss: 0.004588813055306673
2023-01-22 15:44:07.476929: step: 8/466, loss: 0.008610702119767666
2023-01-22 15:44:08.195937: step: 10/466, loss: 0.016871176660060883
2023-01-22 15:44:08.914358: step: 12/466, loss: 0.011558053083717823
2023-01-22 15:44:09.799270: step: 14/466, loss: 0.02135724015533924
2023-01-22 15:44:10.556602: step: 16/466, loss: 0.005487372167408466
2023-01-22 15:44:11.339370: step: 18/466, loss: 0.010226622223854065
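Note on reading the evaluation dicts above: the logged numbers are consistent with each 'f1' being the harmonic mean of its 'p' and 'r', and 'combined' being the product of the template F1 and the slot F1. This is a minimal sketch of that inferred relationship (the function names `f1` and `combined` are hypothetical, not from the training code):

```python
# Hypothetical reconstruction of the scoring arithmetic, inferred from the
# logged metrics above -- not taken from train.py itself.

def f1(p, r):
    # Harmonic mean of precision and recall; 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template_pr, slot_pr):
    # The logged 'combined' value matches template_f1 * slot_f1.
    return f1(*template_pr) * f1(*slot_pr)

# Dev Chinese, epoch 29: template p=1.0, r=0.5833...; slot p/r as logged.
score = combined((1.0, 0.5833333333333334),
                 (0.30537003610108304, 0.3210151802656547))
# score reproduces the logged 'combined' of 0.23062953405715952
```

For example, the Dev Chinese template F1 of 0.7368421052631579 is exactly 2·(1.0·0.5833...)/(1.0+0.5833...), and multiplying it by the slot F1 of 0.3129972247918594 gives the logged combined score of 0.23062953405715952.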
2023-01-22 15:44:12.069430: step: 20/466, loss: 0.004341086372733116 2023-01-22 15:44:12.811870: step: 22/466, loss: 0.0025654153432697058 2023-01-22 15:44:13.566148: step: 24/466, loss: 0.08291388303041458 2023-01-22 15:44:14.315890: step: 26/466, loss: 0.02562730386853218 2023-01-22 15:44:15.002993: step: 28/466, loss: 0.011620445176959038 2023-01-22 15:44:15.730995: step: 30/466, loss: 0.029996804893016815 2023-01-22 15:44:16.489338: step: 32/466, loss: 0.0023577671963721514 2023-01-22 15:44:17.272332: step: 34/466, loss: 0.35937052965164185 2023-01-22 15:44:18.003941: step: 36/466, loss: 0.02752179652452469 2023-01-22 15:44:18.768594: step: 38/466, loss: 0.00525928009301424 2023-01-22 15:44:19.519123: step: 40/466, loss: 0.04858270287513733 2023-01-22 15:44:20.237062: step: 42/466, loss: 0.09995172917842865 2023-01-22 15:44:20.968884: step: 44/466, loss: 0.0004004990041721612 2023-01-22 15:44:21.754722: step: 46/466, loss: 0.05577151104807854 2023-01-22 15:44:22.563505: step: 48/466, loss: 0.09152302145957947 2023-01-22 15:44:23.270180: step: 50/466, loss: 0.0006231117877177894 2023-01-22 15:44:24.047802: step: 52/466, loss: 0.00014527323946822435 2023-01-22 15:44:24.851244: step: 54/466, loss: 0.003497096709907055 2023-01-22 15:44:25.584858: step: 56/466, loss: 5.375292676035315e-05 2023-01-22 15:44:26.379073: step: 58/466, loss: 0.02126486599445343 2023-01-22 15:44:27.065698: step: 60/466, loss: 0.04060050845146179 2023-01-22 15:44:27.753977: step: 62/466, loss: 0.035850729793310165 2023-01-22 15:44:28.522254: step: 64/466, loss: 0.00572303868830204 2023-01-22 15:44:29.210592: step: 66/466, loss: 0.009731757454574108 2023-01-22 15:44:29.987314: step: 68/466, loss: 0.1124640703201294 2023-01-22 15:44:30.632473: step: 70/466, loss: 0.01516781747341156 2023-01-22 15:44:31.372190: step: 72/466, loss: 0.00239353789947927 2023-01-22 15:44:32.194453: step: 74/466, loss: 0.1437223106622696 2023-01-22 15:44:32.877300: step: 76/466, loss: 0.006624910980463028 
2023-01-22 15:44:33.639900: step: 78/466, loss: 0.008828969672322273 2023-01-22 15:44:34.399602: step: 80/466, loss: 0.12912790477275848 2023-01-22 15:44:35.214065: step: 82/466, loss: 0.0045758625492453575 2023-01-22 15:44:35.891707: step: 84/466, loss: 0.009333760477602482 2023-01-22 15:44:36.600964: step: 86/466, loss: 0.01652279868721962 2023-01-22 15:44:37.310416: step: 88/466, loss: 0.022450599819421768 2023-01-22 15:44:37.989723: step: 90/466, loss: 0.027127673849463463 2023-01-22 15:44:38.652744: step: 92/466, loss: 0.0020243090111762285 2023-01-22 15:44:39.406646: step: 94/466, loss: 0.003077466506510973 2023-01-22 15:44:40.090813: step: 96/466, loss: 0.0033880032133311033 2023-01-22 15:44:40.986096: step: 98/466, loss: 0.06507313996553421 2023-01-22 15:44:41.714052: step: 100/466, loss: 0.024417519569396973 2023-01-22 15:44:42.604029: step: 102/466, loss: 0.04545615240931511 2023-01-22 15:44:43.361968: step: 104/466, loss: 0.009091264568269253 2023-01-22 15:44:44.067801: step: 106/466, loss: 0.02733277529478073 2023-01-22 15:44:44.780014: step: 108/466, loss: 0.008861579932272434 2023-01-22 15:44:45.524818: step: 110/466, loss: 0.019939109683036804 2023-01-22 15:44:46.304036: step: 112/466, loss: 0.13370677828788757 2023-01-22 15:44:47.006202: step: 114/466, loss: 0.002368063433095813 2023-01-22 15:44:47.752513: step: 116/466, loss: 0.001968635944649577 2023-01-22 15:44:48.456382: step: 118/466, loss: 0.002208284568041563 2023-01-22 15:44:49.199854: step: 120/466, loss: 0.0009069097577594221 2023-01-22 15:44:49.903180: step: 122/466, loss: 0.01672951504588127 2023-01-22 15:44:50.615909: step: 124/466, loss: 0.013292327523231506 2023-01-22 15:44:51.393570: step: 126/466, loss: 0.02614814229309559 2023-01-22 15:44:52.148287: step: 128/466, loss: 0.02433999814093113 2023-01-22 15:44:52.849831: step: 130/466, loss: 0.0002162880846299231 2023-01-22 15:44:53.527782: step: 132/466, loss: 0.016634326428174973 2023-01-22 15:44:54.262354: step: 134/466, loss: 
0.0033332400489598513 2023-01-22 15:44:54.995965: step: 136/466, loss: 0.0007578277145512402 2023-01-22 15:44:55.687240: step: 138/466, loss: 0.001614563399925828 2023-01-22 15:44:56.444021: step: 140/466, loss: 0.014187711291015148 2023-01-22 15:44:57.268516: step: 142/466, loss: 0.017399389296770096 2023-01-22 15:44:58.003983: step: 144/466, loss: 0.0022756143007427454 2023-01-22 15:44:58.786316: step: 146/466, loss: 0.030932974070310593 2023-01-22 15:44:59.504694: step: 148/466, loss: 0.010401797480881214 2023-01-22 15:45:00.390583: step: 150/466, loss: 0.038547106087207794 2023-01-22 15:45:01.105468: step: 152/466, loss: 0.004876116290688515 2023-01-22 15:45:01.788617: step: 154/466, loss: 0.04435432702302933 2023-01-22 15:45:02.449093: step: 156/466, loss: 0.013212048448622227 2023-01-22 15:45:03.205761: step: 158/466, loss: 0.0058258832432329655 2023-01-22 15:45:03.908738: step: 160/466, loss: 0.006186299957334995 2023-01-22 15:45:04.659578: step: 162/466, loss: 0.023623663932085037 2023-01-22 15:45:05.365085: step: 164/466, loss: 0.1430150866508484 2023-01-22 15:45:06.006300: step: 166/466, loss: 0.02687109261751175 2023-01-22 15:45:06.682724: step: 168/466, loss: 0.019926466047763824 2023-01-22 15:45:07.396192: step: 170/466, loss: 0.005367781035602093 2023-01-22 15:45:08.105829: step: 172/466, loss: 0.02796401083469391 2023-01-22 15:45:09.027067: step: 174/466, loss: 0.0037129439879208803 2023-01-22 15:45:09.686418: step: 176/466, loss: 0.0015536812134087086 2023-01-22 15:45:10.380555: step: 178/466, loss: 0.016774123534560204 2023-01-22 15:45:11.081807: step: 180/466, loss: 0.0027977072168141603 2023-01-22 15:45:11.781408: step: 182/466, loss: 0.00121384731028229 2023-01-22 15:45:12.551056: step: 184/466, loss: 0.05058206245303154 2023-01-22 15:45:13.263403: step: 186/466, loss: 0.006122369784861803 2023-01-22 15:45:13.995762: step: 188/466, loss: 0.006557526532560587 2023-01-22 15:45:14.800150: step: 190/466, loss: 0.0757964700460434 2023-01-22 
15:45:15.710551: step: 192/466, loss: 0.05410829558968544 2023-01-22 15:45:16.538184: step: 194/466, loss: 0.001608781749382615 2023-01-22 15:45:17.355780: step: 196/466, loss: 0.03472140058875084 2023-01-22 15:45:18.101987: step: 198/466, loss: 0.011966650374233723 2023-01-22 15:45:18.776157: step: 200/466, loss: 0.003614018438383937 2023-01-22 15:45:19.578428: step: 202/466, loss: 0.0036884110886603594 2023-01-22 15:45:20.365099: step: 204/466, loss: 0.003991882316768169 2023-01-22 15:45:21.109930: step: 206/466, loss: 0.001457979902625084 2023-01-22 15:45:21.983081: step: 208/466, loss: 0.060240548104047775 2023-01-22 15:45:22.698923: step: 210/466, loss: 0.012254497967660427 2023-01-22 15:45:23.427876: step: 212/466, loss: 0.0019502416253089905 2023-01-22 15:45:24.148859: step: 214/466, loss: 0.043386176228523254 2023-01-22 15:45:24.899696: step: 216/466, loss: 0.007872240617871284 2023-01-22 15:45:25.617982: step: 218/466, loss: 0.0024368164595216513 2023-01-22 15:45:26.334225: step: 220/466, loss: 0.021933574229478836 2023-01-22 15:45:27.170009: step: 222/466, loss: 0.06005479395389557 2023-01-22 15:45:27.966306: step: 224/466, loss: 1.3206063508987427 2023-01-22 15:45:28.699423: step: 226/466, loss: 0.003284501377493143 2023-01-22 15:45:29.478099: step: 228/466, loss: 0.0009186320821754634 2023-01-22 15:45:30.185566: step: 230/466, loss: 0.001569428015500307 2023-01-22 15:45:30.918325: step: 232/466, loss: 0.0002581567969173193 2023-01-22 15:45:31.607442: step: 234/466, loss: 0.004090083763003349 2023-01-22 15:45:32.366609: step: 236/466, loss: 0.0028223921544849873 2023-01-22 15:45:33.104861: step: 238/466, loss: 0.00975649245083332 2023-01-22 15:45:33.818334: step: 240/466, loss: 0.02854604460299015 2023-01-22 15:45:34.625550: step: 242/466, loss: 0.014190112240612507 2023-01-22 15:45:35.363987: step: 244/466, loss: 0.47467219829559326 2023-01-22 15:45:36.077836: step: 246/466, loss: 0.17608670890331268 2023-01-22 15:45:36.772311: step: 248/466, loss: 
0.040568236261606216 2023-01-22 15:45:37.515968: step: 250/466, loss: 0.009722213260829449 2023-01-22 15:45:38.195453: step: 252/466, loss: 0.006252758204936981 2023-01-22 15:45:38.971151: step: 254/466, loss: 0.014669787138700485 2023-01-22 15:45:39.633752: step: 256/466, loss: 0.001633264822885394 2023-01-22 15:45:40.373073: step: 258/466, loss: 0.025014188140630722 2023-01-22 15:45:41.116757: step: 260/466, loss: 0.006355243269354105 2023-01-22 15:45:41.838272: step: 262/466, loss: 0.06679047644138336 2023-01-22 15:45:42.498929: step: 264/466, loss: 0.015492793172597885 2023-01-22 15:45:43.215380: step: 266/466, loss: 0.023767642676830292 2023-01-22 15:45:43.973587: step: 268/466, loss: 0.004239395260810852 2023-01-22 15:45:44.698814: step: 270/466, loss: 0.009462187997996807 2023-01-22 15:45:45.441467: step: 272/466, loss: 0.03155773505568504 2023-01-22 15:45:46.172764: step: 274/466, loss: 0.021629920229315758 2023-01-22 15:45:46.922199: step: 276/466, loss: 0.0036083217710256577 2023-01-22 15:45:47.689350: step: 278/466, loss: 0.010857186280190945 2023-01-22 15:45:48.413874: step: 280/466, loss: 0.0001240857964148745 2023-01-22 15:45:49.251746: step: 282/466, loss: 0.02709275670349598 2023-01-22 15:45:49.946933: step: 284/466, loss: 0.000314620032440871 2023-01-22 15:45:50.633847: step: 286/466, loss: 0.01560733187943697 2023-01-22 15:45:51.493670: step: 288/466, loss: 0.1660701185464859 2023-01-22 15:45:52.316027: step: 290/466, loss: 0.0015013131778687239 2023-01-22 15:45:52.971894: step: 292/466, loss: 0.030416211113333702 2023-01-22 15:45:53.762461: step: 294/466, loss: 0.027990423142910004 2023-01-22 15:45:54.529907: step: 296/466, loss: 0.04260283708572388 2023-01-22 15:45:55.283286: step: 298/466, loss: 0.010933523997664452 2023-01-22 15:45:56.078982: step: 300/466, loss: 0.07986847311258316 2023-01-22 15:45:56.930757: step: 302/466, loss: 0.01713685505092144 2023-01-22 15:45:57.719227: step: 304/466, loss: 0.04338371008634567 2023-01-22 
15:45:58.479864: step: 306/466, loss: 0.130144402384758 2023-01-22 15:45:59.193020: step: 308/466, loss: 0.030209003016352654 2023-01-22 15:45:59.845719: step: 310/466, loss: 3.997722524218261e-05 2023-01-22 15:46:00.590978: step: 312/466, loss: 0.012881987728178501 2023-01-22 15:46:01.407542: step: 314/466, loss: 0.015983590856194496 2023-01-22 15:46:02.175197: step: 316/466, loss: 0.00582438800483942 2023-01-22 15:46:02.974162: step: 318/466, loss: 0.08379048109054565 2023-01-22 15:46:03.761889: step: 320/466, loss: 0.005372151732444763 2023-01-22 15:46:04.458435: step: 322/466, loss: 0.004701228812336922 2023-01-22 15:46:05.298815: step: 324/466, loss: 0.012270765379071236 2023-01-22 15:46:05.982351: step: 326/466, loss: 1.4472018847300205e-05 2023-01-22 15:46:06.682533: step: 328/466, loss: 0.002292596735060215 2023-01-22 15:46:07.445387: step: 330/466, loss: 0.019992461428046227 2023-01-22 15:46:08.183393: step: 332/466, loss: 0.0058171385899186134 2023-01-22 15:46:08.897210: step: 334/466, loss: 0.0016815406270325184 2023-01-22 15:46:09.612704: step: 336/466, loss: 0.015655461698770523 2023-01-22 15:46:10.303876: step: 338/466, loss: 0.12064294517040253 2023-01-22 15:46:11.049828: step: 340/466, loss: 0.001417037914507091 2023-01-22 15:46:11.917149: step: 342/466, loss: 0.03021203726530075 2023-01-22 15:46:12.617888: step: 344/466, loss: 0.019122948870062828 2023-01-22 15:46:13.384042: step: 346/466, loss: 0.0028658395167440176 2023-01-22 15:46:14.075015: step: 348/466, loss: 0.03777375444769859 2023-01-22 15:46:14.799591: step: 350/466, loss: 0.0848744586110115 2023-01-22 15:46:15.545822: step: 352/466, loss: 0.007534432224929333 2023-01-22 15:46:16.345567: step: 354/466, loss: 0.030503325164318085 2023-01-22 15:46:17.047004: step: 356/466, loss: 0.061688248068094254 2023-01-22 15:46:17.714163: step: 358/466, loss: 0.002680381527170539 2023-01-22 15:46:18.493186: step: 360/466, loss: 0.020026415586471558 2023-01-22 15:46:19.229834: step: 362/466, loss: 
0.01093121524900198 2023-01-22 15:46:19.953503: step: 364/466, loss: 0.009857839904725552 2023-01-22 15:46:20.759623: step: 366/466, loss: 0.0017632795497775078 2023-01-22 15:46:21.608798: step: 368/466, loss: 0.005583813413977623 2023-01-22 15:46:22.349536: step: 370/466, loss: 0.0028505113441497087 2023-01-22 15:46:23.194759: step: 372/466, loss: 0.007300149649381638 2023-01-22 15:46:23.964157: step: 374/466, loss: 0.3070095181465149 2023-01-22 15:46:24.739391: step: 376/466, loss: 0.037516094744205475 2023-01-22 15:46:25.466995: step: 378/466, loss: 0.0004592242185026407 2023-01-22 15:46:26.213763: step: 380/466, loss: 0.034510307013988495 2023-01-22 15:46:26.863446: step: 382/466, loss: 0.0066200257278978825 2023-01-22 15:46:27.573560: step: 384/466, loss: 0.002755881519988179 2023-01-22 15:46:28.267202: step: 386/466, loss: 0.003863600315526128 2023-01-22 15:46:28.996419: step: 388/466, loss: 0.006329555530101061 2023-01-22 15:46:29.739249: step: 390/466, loss: 0.0013895792653784156 2023-01-22 15:46:30.456514: step: 392/466, loss: 0.008196533657610416 2023-01-22 15:46:31.241267: step: 394/466, loss: 0.024616515263915062 2023-01-22 15:46:32.025740: step: 396/466, loss: 0.008718425408005714 2023-01-22 15:46:32.808743: step: 398/466, loss: 0.011273697949945927 2023-01-22 15:46:33.504139: step: 400/466, loss: 0.010760881938040257 2023-01-22 15:46:34.194151: step: 402/466, loss: 0.035104431211948395 2023-01-22 15:46:34.954176: step: 404/466, loss: 0.0038427747786045074 2023-01-22 15:46:35.694010: step: 406/466, loss: 0.11372831463813782 2023-01-22 15:46:36.447411: step: 408/466, loss: 0.02684140019118786 2023-01-22 15:46:37.203252: step: 410/466, loss: 0.075205959379673 2023-01-22 15:46:37.923690: step: 412/466, loss: 0.1784360706806183 2023-01-22 15:46:38.749113: step: 414/466, loss: 0.02089475654065609 2023-01-22 15:46:39.577881: step: 416/466, loss: 0.0006547744851559401 2023-01-22 15:46:40.261103: step: 418/466, loss: 0.018957240507006645 2023-01-22 
15:46:41.088954: step: 420/466, loss: 0.025710005313158035 2023-01-22 15:46:41.802368: step: 422/466, loss: 0.036043502390384674 2023-01-22 15:46:42.516735: step: 424/466, loss: 0.0004383635532576591 2023-01-22 15:46:43.352422: step: 426/466, loss: 0.09480899572372437 2023-01-22 15:46:44.050773: step: 428/466, loss: 0.17611780762672424 2023-01-22 15:46:44.760343: step: 430/466, loss: 0.017583386972546577 2023-01-22 15:46:45.486457: step: 432/466, loss: 0.009646718390285969 2023-01-22 15:46:46.131689: step: 434/466, loss: 0.0010779794538393617 2023-01-22 15:46:46.885613: step: 436/466, loss: 0.0019341098377481103 2023-01-22 15:46:47.600460: step: 438/466, loss: 0.05516645312309265 2023-01-22 15:46:48.382913: step: 440/466, loss: 0.004943589214235544 2023-01-22 15:46:49.072651: step: 442/466, loss: 0.03822134807705879 2023-01-22 15:46:49.843879: step: 444/466, loss: 0.005139884538948536 2023-01-22 15:46:50.544196: step: 446/466, loss: 0.03897429630160332 2023-01-22 15:46:51.311097: step: 448/466, loss: 0.004841428250074387 2023-01-22 15:46:51.993590: step: 450/466, loss: 0.001536195632070303 2023-01-22 15:46:52.728968: step: 452/466, loss: 0.01192461047321558 2023-01-22 15:46:53.456812: step: 454/466, loss: 0.016185369342565536 2023-01-22 15:46:54.251262: step: 456/466, loss: 0.007985131815075874 2023-01-22 15:46:54.965462: step: 458/466, loss: 0.005550692789256573 2023-01-22 15:46:55.675036: step: 460/466, loss: 0.0020444553811103106 2023-01-22 15:46:56.484991: step: 462/466, loss: 0.005862903781235218 2023-01-22 15:46:57.147248: step: 464/466, loss: 0.024352507665753365 2023-01-22 15:46:57.865817: step: 466/466, loss: 0.002665027743205428 2023-01-22 15:46:58.566451: step: 468/466, loss: 0.0034131731372326612 2023-01-22 15:46:59.335215: step: 470/466, loss: 0.011681389063596725 2023-01-22 15:47:00.024659: step: 472/466, loss: 0.004937691614031792 2023-01-22 15:47:00.894289: step: 474/466, loss: 0.00297374720685184 2023-01-22 15:47:01.676365: step: 476/466, loss: 
0.030566386878490448 2023-01-22 15:47:02.473933: step: 478/466, loss: 0.01928372122347355 2023-01-22 15:47:03.246147: step: 480/466, loss: 0.07454699277877808 2023-01-22 15:47:03.919867: step: 482/466, loss: 0.02079848386347294 2023-01-22 15:47:04.592702: step: 484/466, loss: 0.01979495771229267 2023-01-22 15:47:05.318521: step: 486/466, loss: 0.004218693822622299 2023-01-22 15:47:06.052983: step: 488/466, loss: 0.022521065548062325 2023-01-22 15:47:06.917302: step: 490/466, loss: 0.07107479870319366 2023-01-22 15:47:07.578161: step: 492/466, loss: 0.0097987474873662 2023-01-22 15:47:08.303458: step: 494/466, loss: 0.00504559138789773 2023-01-22 15:47:08.996904: step: 496/466, loss: 0.004600842017680407 2023-01-22 15:47:09.702376: step: 498/466, loss: 0.08354654908180237 2023-01-22 15:47:10.421576: step: 500/466, loss: 0.0246112197637558 2023-01-22 15:47:11.110849: step: 502/466, loss: 0.0013995743356645107 2023-01-22 15:47:11.767253: step: 504/466, loss: 0.00629635201767087 2023-01-22 15:47:12.538144: step: 506/466, loss: 0.00231743766926229 2023-01-22 15:47:13.167697: step: 508/466, loss: 0.04809323325753212 2023-01-22 15:47:13.951373: step: 510/466, loss: 0.01773303560912609 2023-01-22 15:47:14.701722: step: 512/466, loss: 0.004355045035481453 2023-01-22 15:47:15.498222: step: 514/466, loss: 8.419121877523139e-05 2023-01-22 15:47:16.331388: step: 516/466, loss: 0.0020344394724816084 2023-01-22 15:47:17.024519: step: 518/466, loss: 0.007168296258896589 2023-01-22 15:47:17.768477: step: 520/466, loss: 0.014897344633936882 2023-01-22 15:47:18.527955: step: 522/466, loss: 0.00015005006571300328 2023-01-22 15:47:19.217939: step: 524/466, loss: 0.0010431658010929823 2023-01-22 15:47:19.932065: step: 526/466, loss: 0.0006886800401844084 2023-01-22 15:47:20.689980: step: 528/466, loss: 0.4004390835762024 2023-01-22 15:47:21.583014: step: 530/466, loss: 0.02424515038728714 2023-01-22 15:47:22.317119: step: 532/466, loss: 0.04411615431308746 2023-01-22 15:47:23.148290: 
step: 534/466, loss: 0.014269612729549408 2023-01-22 15:47:23.965911: step: 536/466, loss: 0.00011343916412442923 2023-01-22 15:47:24.689346: step: 538/466, loss: 0.023898938670754433 2023-01-22 15:47:25.458724: step: 540/466, loss: 0.028368115425109863 2023-01-22 15:47:26.177609: step: 542/466, loss: 0.0327945202589035 2023-01-22 15:47:26.839720: step: 544/466, loss: 0.0063098277896642685 2023-01-22 15:47:27.511811: step: 546/466, loss: 0.01982375793159008 2023-01-22 15:47:28.309527: step: 548/466, loss: 0.02865714766085148 2023-01-22 15:47:29.133443: step: 550/466, loss: 0.35284262895584106 2023-01-22 15:47:29.934844: step: 552/466, loss: 0.0012877887347713113 2023-01-22 15:47:30.683138: step: 554/466, loss: 0.002769460901618004 2023-01-22 15:47:31.396590: step: 556/466, loss: 0.026639414951205254 2023-01-22 15:47:32.164683: step: 558/466, loss: 0.2336762398481369 2023-01-22 15:47:32.877685: step: 560/466, loss: 0.010595796629786491 2023-01-22 15:47:33.537653: step: 562/466, loss: 0.023905830457806587 2023-01-22 15:47:34.331835: step: 564/466, loss: 0.0017976739909499884 2023-01-22 15:47:35.072385: step: 566/466, loss: 0.009105820208787918 2023-01-22 15:47:35.812173: step: 568/466, loss: 0.023006802424788475 2023-01-22 15:47:36.572669: step: 570/466, loss: 0.032034676522016525 2023-01-22 15:47:37.319338: step: 572/466, loss: 0.01416066288948059 2023-01-22 15:47:38.140195: step: 574/466, loss: 0.006038175895810127 2023-01-22 15:47:38.895551: step: 576/466, loss: 0.00783812440931797 2023-01-22 15:47:39.603732: step: 578/466, loss: 0.00037845782935619354 2023-01-22 15:47:40.524832: step: 580/466, loss: 0.17190302908420563 2023-01-22 15:47:41.221215: step: 582/466, loss: 0.0016727076144888997 2023-01-22 15:47:41.882895: step: 584/466, loss: 0.01053509209305048 2023-01-22 15:47:42.683025: step: 586/466, loss: 0.001931222970597446 2023-01-22 15:47:43.391450: step: 588/466, loss: 0.01740656979382038 2023-01-22 15:47:44.113212: step: 590/466, loss: 0.38087451457977295 
2023-01-22 15:47:44.867180: step: 592/466, loss: 0.0052076056599617004 2023-01-22 15:47:45.599491: step: 594/466, loss: 0.04615316540002823 2023-01-22 15:47:46.343627: step: 596/466, loss: 0.04258754849433899 2023-01-22 15:47:47.011337: step: 598/466, loss: 0.03115232288837433 2023-01-22 15:47:47.741156: step: 600/466, loss: 0.0031860233284533024 2023-01-22 15:47:48.488257: step: 602/466, loss: 0.025794459506869316 2023-01-22 15:47:49.213692: step: 604/466, loss: 0.009873680770397186 2023-01-22 15:47:49.912762: step: 606/466, loss: 0.0011760890483856201 2023-01-22 15:47:50.660181: step: 608/466, loss: 0.022366557270288467 2023-01-22 15:47:51.394500: step: 610/466, loss: 0.09690196812152863 2023-01-22 15:47:52.141838: step: 612/466, loss: 0.02277245558798313 2023-01-22 15:47:52.840182: step: 614/466, loss: 0.1421499401330948 2023-01-22 15:47:53.561780: step: 616/466, loss: 0.030725853517651558 2023-01-22 15:47:54.273621: step: 618/466, loss: 0.0011555668897926807 2023-01-22 15:47:55.015116: step: 620/466, loss: 0.008171006105840206 2023-01-22 15:47:55.723712: step: 622/466, loss: 0.3574966788291931 2023-01-22 15:47:56.476514: step: 624/466, loss: 0.0005149018252268434 2023-01-22 15:47:57.142779: step: 626/466, loss: 0.08907132595777512 2023-01-22 15:47:57.858163: step: 628/466, loss: 0.003892571199685335 2023-01-22 15:47:58.638709: step: 630/466, loss: 0.047221384942531586 2023-01-22 15:47:59.300511: step: 632/466, loss: 0.0013371066888794303 2023-01-22 15:48:00.020277: step: 634/466, loss: 0.09941345453262329 2023-01-22 15:48:00.720008: step: 636/466, loss: 0.03925804793834686 2023-01-22 15:48:01.481611: step: 638/466, loss: 0.02559492364525795 2023-01-22 15:48:02.228603: step: 640/466, loss: 0.06727918982505798 2023-01-22 15:48:02.996049: step: 642/466, loss: 0.13944895565509796 2023-01-22 15:48:03.671438: step: 644/466, loss: 0.024638397619128227 2023-01-22 15:48:04.385239: step: 646/466, loss: 0.001768801361322403 2023-01-22 15:48:05.114395: step: 648/466, loss: 
0.004804654978215694 2023-01-22 15:48:05.854126: step: 650/466, loss: 0.040076322853565216 2023-01-22 15:48:06.647161: step: 652/466, loss: 0.00041040178621187806 2023-01-22 15:48:07.393928: step: 654/466, loss: 0.049572572112083435 2023-01-22 15:48:08.092263: step: 656/466, loss: 0.0026933804620057344 2023-01-22 15:48:08.880859: step: 658/466, loss: 0.02136015146970749 2023-01-22 15:48:09.605037: step: 660/466, loss: 0.0073014600202441216 2023-01-22 15:48:10.382407: step: 662/466, loss: 0.012390895746648312 2023-01-22 15:48:11.109525: step: 664/466, loss: 0.027630962431430817 2023-01-22 15:48:11.915167: step: 666/466, loss: 0.021774159744381905 2023-01-22 15:48:12.633119: step: 668/466, loss: 0.011346152052283287 2023-01-22 15:48:13.408312: step: 670/466, loss: 0.1491927206516266 2023-01-22 15:48:14.155890: step: 672/466, loss: 0.021823476999998093 2023-01-22 15:48:14.894702: step: 674/466, loss: 0.015337795950472355 2023-01-22 15:48:15.588485: step: 676/466, loss: 0.0159169789403677 2023-01-22 15:48:16.317719: step: 678/466, loss: 0.0011591583024710417 2023-01-22 15:48:17.012701: step: 680/466, loss: 0.0018728708382695913 2023-01-22 15:48:17.855178: step: 682/466, loss: 0.03095311112701893 2023-01-22 15:48:18.699916: step: 684/466, loss: 0.4649326205253601 2023-01-22 15:48:19.448725: step: 686/466, loss: 0.027271777391433716 2023-01-22 15:48:20.202542: step: 688/466, loss: 0.010774504393339157 2023-01-22 15:48:20.906565: step: 690/466, loss: 0.0026382540818303823 2023-01-22 15:48:21.659564: step: 692/466, loss: 0.0033890206832438707 2023-01-22 15:48:22.386647: step: 694/466, loss: 0.012756789103150368 2023-01-22 15:48:23.033012: step: 696/466, loss: 0.0001581639371579513 2023-01-22 15:48:23.831702: step: 698/466, loss: 0.017527710646390915 2023-01-22 15:48:24.570315: step: 700/466, loss: 0.022267047315835953 2023-01-22 15:48:25.261983: step: 702/466, loss: 0.004587067756801844 2023-01-22 15:48:26.001033: step: 704/466, loss: 0.07056285440921783 2023-01-22 
15:48:26.689938: step: 706/466, loss: 0.029599877074360847 2023-01-22 15:48:27.328376: step: 708/466, loss: 0.02155863121151924 2023-01-22 15:48:28.098998: step: 710/466, loss: 0.008911887183785439 2023-01-22 15:48:28.917366: step: 712/466, loss: 0.14788082242012024 2023-01-22 15:48:29.661319: step: 714/466, loss: 0.00010579630179563537 2023-01-22 15:48:30.515592: step: 716/466, loss: 0.0151499779894948 2023-01-22 15:48:31.229794: step: 718/466, loss: 0.01636345311999321 2023-01-22 15:48:31.977191: step: 720/466, loss: 0.008408303372561932 2023-01-22 15:48:32.744782: step: 722/466, loss: 0.01703471876680851 2023-01-22 15:48:33.469785: step: 724/466, loss: 0.027688410133123398 2023-01-22 15:48:34.355273: step: 726/466, loss: 0.1199537143111229 2023-01-22 15:48:35.121520: step: 728/466, loss: 0.012684978544712067 2023-01-22 15:48:35.826646: step: 730/466, loss: 0.014809907414019108 2023-01-22 15:48:36.598572: step: 732/466, loss: 0.02575354091823101 2023-01-22 15:48:37.412382: step: 734/466, loss: 0.4311109781265259 2023-01-22 15:48:38.137072: step: 736/466, loss: 0.011944697238504887 2023-01-22 15:48:38.862684: step: 738/466, loss: 0.0004669880145229399 2023-01-22 15:48:39.620218: step: 740/466, loss: 0.021837793290615082 2023-01-22 15:48:40.323054: step: 742/466, loss: 0.001047839061357081 2023-01-22 15:48:41.127156: step: 744/466, loss: 0.001859480980783701 2023-01-22 15:48:41.812797: step: 746/466, loss: 0.0020473224576562643 2023-01-22 15:48:42.604490: step: 748/466, loss: 0.010712344199419022 2023-01-22 15:48:43.273501: step: 750/466, loss: 0.003235904034227133 2023-01-22 15:48:44.055027: step: 752/466, loss: 0.015598940663039684 2023-01-22 15:48:44.789381: step: 754/466, loss: 0.01041698083281517 2023-01-22 15:48:45.597539: step: 756/466, loss: 0.02159099653363228 2023-01-22 15:48:46.375364: step: 758/466, loss: 0.005649959202855825 2023-01-22 15:48:47.260747: step: 760/466, loss: 0.2386442869901657 2023-01-22 15:48:47.992428: step: 762/466, loss: 
0.001370615209452808 2023-01-22 15:48:48.751437: step: 764/466, loss: 0.002000574953854084 2023-01-22 15:48:49.515853: step: 766/466, loss: 0.011318007484078407 2023-01-22 15:48:50.412920: step: 768/466, loss: 0.0028875821735709906 2023-01-22 15:48:51.184660: step: 770/466, loss: 0.017588937655091286 2023-01-22 15:48:51.848666: step: 772/466, loss: 0.0038840211927890778 2023-01-22 15:48:52.608994: step: 774/466, loss: 0.00015768868615850806 2023-01-22 15:48:53.412754: step: 776/466, loss: 0.012369371019303799 2023-01-22 15:48:54.180703: step: 778/466, loss: 0.006792670115828514 2023-01-22 15:48:54.932551: step: 780/466, loss: 0.2840106189250946 2023-01-22 15:48:55.722807: step: 782/466, loss: 0.01502148900181055 2023-01-22 15:48:56.523772: step: 784/466, loss: 0.024477217346429825 2023-01-22 15:48:57.336937: step: 786/466, loss: 0.05349123477935791 2023-01-22 15:48:58.069346: step: 788/466, loss: 0.00770318740978837 2023-01-22 15:48:58.907436: step: 790/466, loss: 0.07015214115381241 2023-01-22 15:48:59.711856: step: 792/466, loss: 0.0007451863493770361 2023-01-22 15:49:00.490986: step: 794/466, loss: 0.017964649945497513 2023-01-22 15:49:01.172476: step: 796/466, loss: 0.04023731127381325 2023-01-22 15:49:01.886376: step: 798/466, loss: 0.001383331953547895 2023-01-22 15:49:02.601877: step: 800/466, loss: 0.10122286528348923 2023-01-22 15:49:03.369018: step: 802/466, loss: 0.014714941382408142 2023-01-22 15:49:04.054346: step: 804/466, loss: 0.008850020356476307 2023-01-22 15:49:04.828512: step: 806/466, loss: 0.07820001244544983 2023-01-22 15:49:05.598126: step: 808/466, loss: 0.014597302302718163 2023-01-22 15:49:06.414289: step: 810/466, loss: 0.003665260039269924 2023-01-22 15:49:07.168994: step: 812/466, loss: 0.09740063548088074 2023-01-22 15:49:07.819522: step: 814/466, loss: 0.021296529099345207 2023-01-22 15:49:08.636229: step: 816/466, loss: 0.07871195673942566 2023-01-22 15:49:09.359579: step: 818/466, loss: 0.034476276487112045 2023-01-22 
15:49:10.084632: step: 820/466, loss: 0.01530569139868021 2023-01-22 15:49:10.834094: step: 822/466, loss: 0.0636683851480484 2023-01-22 15:49:11.514988: step: 824/466, loss: 0.002796899527311325 2023-01-22 15:49:12.269664: step: 826/466, loss: 0.014578304253518581 2023-01-22 15:49:12.941686: step: 828/466, loss: 0.06623385101556778 2023-01-22 15:49:13.694923: step: 830/466, loss: 0.012619067914783955 2023-01-22 15:49:14.433447: step: 832/466, loss: 0.003821233520284295 2023-01-22 15:49:15.127337: step: 834/466, loss: 0.014842454344034195 2023-01-22 15:49:15.887518: step: 836/466, loss: 0.027856387197971344 2023-01-22 15:49:16.656944: step: 838/466, loss: 0.015989501029253006 2023-01-22 15:49:17.402751: step: 840/466, loss: 0.0018745105480775237 2023-01-22 15:49:18.129931: step: 842/466, loss: 0.011679274961352348 2023-01-22 15:49:18.911208: step: 844/466, loss: 0.05877411365509033 2023-01-22 15:49:19.604895: step: 846/466, loss: 0.056277453899383545 2023-01-22 15:49:20.268739: step: 848/466, loss: 0.020335165783762932 2023-01-22 15:49:21.005653: step: 850/466, loss: 0.004025626461952925 2023-01-22 15:49:21.757753: step: 852/466, loss: 0.006656455807387829 2023-01-22 15:49:22.593527: step: 854/466, loss: 0.043784309178590775 2023-01-22 15:49:23.247244: step: 856/466, loss: 0.005373646505177021 2023-01-22 15:49:23.989087: step: 858/466, loss: 0.030034927651286125 2023-01-22 15:49:24.731920: step: 860/466, loss: 0.0360288992524147 2023-01-22 15:49:25.486472: step: 862/466, loss: 0.003689620876684785 2023-01-22 15:49:26.206516: step: 864/466, loss: 0.0027504567988216877 2023-01-22 15:49:26.982613: step: 866/466, loss: 0.010373730212450027 2023-01-22 15:49:27.717900: step: 868/466, loss: 0.013135841116309166 2023-01-22 15:49:28.444200: step: 870/466, loss: 9.005170431919396e-05 2023-01-22 15:49:29.119809: step: 872/466, loss: 0.03494952619075775 2023-01-22 15:49:29.972809: step: 874/466, loss: 0.007895917631685734 2023-01-22 15:49:30.689699: step: 876/466, loss: 
0.0025822457391768694 2023-01-22 15:49:31.383156: step: 878/466, loss: 0.022746095433831215 2023-01-22 15:49:32.142980: step: 880/466, loss: 0.044358815997838974 2023-01-22 15:49:32.892229: step: 882/466, loss: 0.019776368513703346 2023-01-22 15:49:33.663654: step: 884/466, loss: 0.0091138556599617 2023-01-22 15:49:34.344558: step: 886/466, loss: 4.838859604205936e-05 2023-01-22 15:49:35.065697: step: 888/466, loss: 0.0009778942912817001 2023-01-22 15:49:35.735532: step: 890/466, loss: 2.5119843485299498e-05 2023-01-22 15:49:36.457699: step: 892/466, loss: 0.0066669778898358345 2023-01-22 15:49:37.187522: step: 894/466, loss: 0.050770752131938934 2023-01-22 15:49:37.871683: step: 896/466, loss: 0.0034824397880584 2023-01-22 15:49:38.527312: step: 898/466, loss: 0.007224982138723135 2023-01-22 15:49:39.268696: step: 900/466, loss: 0.0012097193393856287 2023-01-22 15:49:40.035782: step: 902/466, loss: 0.008332643657922745 2023-01-22 15:49:40.767798: step: 904/466, loss: 0.02265145443379879 2023-01-22 15:49:41.438547: step: 906/466, loss: 0.1696164309978485 2023-01-22 15:49:42.267890: step: 908/466, loss: 0.0021269218996167183 2023-01-22 15:49:43.042510: step: 910/466, loss: 0.6533875465393066 2023-01-22 15:49:43.830030: step: 912/466, loss: 0.03232376649975777 2023-01-22 15:49:44.529627: step: 914/466, loss: 0.0012437463738024235 2023-01-22 15:49:45.205416: step: 916/466, loss: 0.06054536998271942 2023-01-22 15:49:45.908775: step: 918/466, loss: 0.001823619706556201 2023-01-22 15:49:46.598196: step: 920/466, loss: 0.008600911125540733 2023-01-22 15:49:47.460318: step: 922/466, loss: 0.027335256338119507 2023-01-22 15:49:48.240044: step: 924/466, loss: 0.10726413130760193 2023-01-22 15:49:48.970176: step: 926/466, loss: 0.004401802085340023 2023-01-22 15:49:49.693212: step: 928/466, loss: 0.01408754289150238 2023-01-22 15:49:50.566622: step: 930/466, loss: 0.023488802835345268 2023-01-22 15:49:51.312934: step: 932/466, loss: 0.08783465623855591 
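Each epoch above is logged as a stream of timestamped step records of the form `<timestamp>: step: <n>/466, loss: <value>`. Note the counter advances by 2 and runs to 932, exactly 2 × 466, so the `/466` denominator appears to count batches while two steps are logged per batch. A small illustrative parser for such records (the regex and helper name are assumptions, not taken from train.py) that recovers the per-step losses so they can be averaged into an epoch-level summary like the `Loss:` line below:

```python
import re

# Matches records like "step: 930/466, loss: 0.023488802835345268",
# including scientific notation such as "9.005170431919396e-05".
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.eE+-]+)")

def epoch_losses(log_text: str) -> list[float]:
    """Extract every logged loss value from a chunk of the training log."""
    return [float(m.group(2)) for m in STEP_RE.finditer(log_text)]

# Two records copied verbatim from the end of the epoch above.
sample = ("2023-01-22 15:49:50.566622: step: 930/466, loss: 0.023488802835345268 "
          "2023-01-22 15:49:51.312934: step: 932/466, loss: 0.08783465623855591")
losses = epoch_losses(sample)
print(sum(losses) / len(losses))  # mean of the two sampled losses
```

Averaging over all 466 records of an epoch this way would reproduce the epoch summary figure.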
==================================================
Loss: 0.037
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3114988714196762, 'r': 0.3451904002070036, 'f1': 0.32748036167253086}, 'combined': 0.241301319127128, 'epoch': 30}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3472140440441078, 'r': 0.3126429487496728, 'f1': 0.3290228754495418}, 'combined': 0.20222869417874276, 'epoch': 30}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28552977575485433, 'r': 0.35488046132719087, 'f1': 0.31645008988059153}, 'combined': 0.23317375043833058, 'epoch': 30}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32422569481168945, 'r': 0.31832557384891175, 'f1': 0.3212485458868773}, 'combined': 0.19745032576461727, 'epoch': 30}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.323116911300505, 'r': 0.3531600396756943, 'f1': 0.3374711530536553}, 'combined': 0.24866295488164075, 'epoch': 30}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3436973441237516, 'r': 0.3118569841053607, 'f1': 0.32700391887579816}, 'combined': 0.20197300871740478, 'epoch': 30}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.265625, 'r': 0.36428571428571427, 'f1': 0.30722891566265054}, 'combined': 0.20481927710843367, 'epoch': 30}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2602040816326531, 'r': 0.5543478260869565, 'f1': 0.3541666666666667}, 'combined': 0.17708333333333334, 'epoch': 30}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 30}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r':
0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24} ****************************** Epoch: 31 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 15:52:31.421126: step: 2/466, loss: 0.0021119576413184404 2023-01-22 15:52:32.188204: step: 4/466, loss: 0.003949753474444151 2023-01-22 15:52:32.899532: step: 6/466, loss: 0.023282263427972794 2023-01-22 15:52:33.671596: step: 8/466, loss: 0.013862239196896553 2023-01-22 15:52:34.364311: step: 10/466, loss: 0.10889230668544769 2023-01-22 15:52:35.084769: step: 12/466, loss: 0.012851127423346043 2023-01-22 15:52:35.830519: step: 14/466, loss: 0.17665880918502808 2023-01-22 15:52:36.566271: step: 16/466, loss: 0.03574291244149208 2023-01-22 15:52:37.357974: step: 18/466, loss: 0.0005902175325900316 2023-01-22 15:52:38.211215: step: 20/466, loss: 0.0087283318862319 2023-01-22 15:52:38.902939: step: 22/466, loss: 0.006415482610464096 2023-01-22 15:52:39.709814: step: 24/466, loss: 0.04369017109274864 2023-01-22 15:52:40.389752: step: 26/466, loss: 0.005777179729193449 2023-01-22 15:52:41.157271: step: 28/466, loss: 0.013707002624869347 2023-01-22 15:52:41.864424: step: 30/466, loss: 0.02515527606010437 2023-01-22 15:52:42.581349: step: 32/466, loss: 0.005246358923614025 2023-01-22 15:52:43.294748: step: 34/466, loss: 0.005221569444984198 2023-01-22 15:52:43.992272: step: 36/466, loss: 0.006976307835429907 2023-01-22 15:52:44.785526: step: 38/466, loss: 0.0067510176450014114 2023-01-22 15:52:45.486917: step: 40/466, loss: 0.018617145717144012 2023-01-22 15:52:46.161569: step: 42/466, loss: 0.013348113745450974 2023-01-22 15:52:46.792643: step: 44/466, loss: 0.001935301930643618 2023-01-22 15:52:47.472164: step: 46/466, loss: 0.0013969270512461662 
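In the evaluation summaries above, each language's 'combined' figure equals the product of its template F1 and slot F1, and each F1 is the usual harmonic mean of the listed p and r. A minimal sketch re-deriving the epoch-30 Dev Chinese figures (the helper names `f1` and `combined_score` are illustrative, not taken from train.py):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    """'combined' as it appears in the log: template F1 times slot F1."""
    return template_f1 * slot_f1

# Precision/recall copied from the "Dev Chinese" summary at epoch 30.
template = {"p": 1.0, "r": 0.5833333333333334}
slot = {"p": 0.3114988714196762, "r": 0.3451904002070036}

t_f1 = f1(template["p"], template["r"])  # ~0.7368421052631579
s_f1 = f1(slot["p"], slot["r"])          # ~0.32748036167253086
print(combined_score(t_f1, s_f1))        # ~0.241301319127128
```

The same identity holds for the other entries, e.g. Test Chinese: 0.6146341463414634 × 0.3290228754495418 ≈ 0.20222869417874276.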
2023-01-22 15:52:48.117006: step: 48/466, loss: 0.00413518724963069 2023-01-22 15:52:48.801814: step: 50/466, loss: 0.0023705631028860807 2023-01-22 15:52:49.486850: step: 52/466, loss: 0.004117606673389673 2023-01-22 15:52:50.216698: step: 54/466, loss: 0.06089463829994202 2023-01-22 15:52:50.908726: step: 56/466, loss: 0.038913093507289886 2023-01-22 15:52:51.620289: step: 58/466, loss: 0.004359858576208353 2023-01-22 15:52:52.397818: step: 60/466, loss: 0.019135290756821632 2023-01-22 15:52:53.269003: step: 62/466, loss: 0.029449697583913803 2023-01-22 15:52:53.975692: step: 64/466, loss: 0.0006406322354450822 2023-01-22 15:52:54.692269: step: 66/466, loss: 0.013644445687532425 2023-01-22 15:52:55.488924: step: 68/466, loss: 0.028983967378735542 2023-01-22 15:52:56.222491: step: 70/466, loss: 0.0011641232995316386 2023-01-22 15:52:56.844037: step: 72/466, loss: 0.008698353543877602 2023-01-22 15:52:57.645750: step: 74/466, loss: 0.005477309226989746 2023-01-22 15:52:58.381810: step: 76/466, loss: 0.008166252635419369 2023-01-22 15:52:59.132978: step: 78/466, loss: 0.008764025755226612 2023-01-22 15:52:59.855294: step: 80/466, loss: 0.03627845272421837 2023-01-22 15:53:00.659803: step: 82/466, loss: 0.0008104249718599021 2023-01-22 15:53:01.343579: step: 84/466, loss: 0.027701199054718018 2023-01-22 15:53:02.140691: step: 86/466, loss: 0.0011427691206336021 2023-01-22 15:53:02.873433: step: 88/466, loss: 0.004959442652761936 2023-01-22 15:53:03.574469: step: 90/466, loss: 0.0018794368952512741 2023-01-22 15:53:04.219862: step: 92/466, loss: 0.0008936856174841523 2023-01-22 15:53:04.964411: step: 94/466, loss: 0.0008424253901466727 2023-01-22 15:53:05.684038: step: 96/466, loss: 0.0719582661986351 2023-01-22 15:53:06.401090: step: 98/466, loss: 0.0009956827852874994 2023-01-22 15:53:07.136143: step: 100/466, loss: 0.021351397037506104 2023-01-22 15:53:07.858124: step: 102/466, loss: 0.018605533987283707 2023-01-22 15:53:08.581470: step: 104/466, loss: 
0.005402641836553812 2023-01-22 15:53:09.337476: step: 106/466, loss: 0.05505645275115967 2023-01-22 15:53:10.093849: step: 108/466, loss: 0.009099229238927364 2023-01-22 15:53:10.795084: step: 110/466, loss: 0.019879935309290886 2023-01-22 15:53:11.546886: step: 112/466, loss: 0.016863131895661354 2023-01-22 15:53:12.213305: step: 114/466, loss: 0.015752049162983894 2023-01-22 15:53:12.987653: step: 116/466, loss: 0.0007166631985455751 2023-01-22 15:53:13.694424: step: 118/466, loss: 0.0003036932903341949 2023-01-22 15:53:14.446115: step: 120/466, loss: 0.047832999378442764 2023-01-22 15:53:15.203131: step: 122/466, loss: 0.11420870572328568 2023-01-22 15:53:16.018351: step: 124/466, loss: 0.0011902085971087217 2023-01-22 15:53:16.737074: step: 126/466, loss: 0.002473770873621106 2023-01-22 15:53:17.446824: step: 128/466, loss: 0.0057588787749409676 2023-01-22 15:53:18.161942: step: 130/466, loss: 0.005072615575045347 2023-01-22 15:53:18.982154: step: 132/466, loss: 0.032617054879665375 2023-01-22 15:53:19.673377: step: 134/466, loss: 0.00042786714038811624 2023-01-22 15:53:20.403083: step: 136/466, loss: 0.004805801901966333 2023-01-22 15:53:21.196234: step: 138/466, loss: 0.00014069517783354968 2023-01-22 15:53:22.005884: step: 140/466, loss: 0.00585377961397171 2023-01-22 15:53:22.709949: step: 142/466, loss: 0.027600059285759926 2023-01-22 15:53:23.395232: step: 144/466, loss: 0.00015916845586616546 2023-01-22 15:53:24.123120: step: 146/466, loss: 0.012031443417072296 2023-01-22 15:53:24.822838: step: 148/466, loss: 0.0018937510903924704 2023-01-22 15:53:25.546162: step: 150/466, loss: 0.00020203908206894994 2023-01-22 15:53:26.253380: step: 152/466, loss: 0.01154989656060934 2023-01-22 15:53:26.981458: step: 154/466, loss: 0.016319116577506065 2023-01-22 15:53:27.705674: step: 156/466, loss: 1.0363105535507202 2023-01-22 15:53:28.355702: step: 158/466, loss: 0.0228382907807827 2023-01-22 15:53:29.095344: step: 160/466, loss: 0.015598494559526443 2023-01-22 
15:53:29.829071: step: 162/466, loss: 0.0042947870679199696 2023-01-22 15:53:30.525045: step: 164/466, loss: 0.010354123078286648 2023-01-22 15:53:31.376547: step: 166/466, loss: 0.13110190629959106 2023-01-22 15:53:32.083657: step: 168/466, loss: 0.04255157709121704 2023-01-22 15:53:32.755357: step: 170/466, loss: 0.01943907141685486 2023-01-22 15:53:33.462248: step: 172/466, loss: 0.0006557477172464132 2023-01-22 15:53:34.133491: step: 174/466, loss: 0.016860581934452057 2023-01-22 15:53:34.861883: step: 176/466, loss: 0.00529084075242281 2023-01-22 15:53:35.570274: step: 178/466, loss: 0.010471130721271038 2023-01-22 15:53:36.351566: step: 180/466, loss: 0.04970330744981766 2023-01-22 15:53:37.118167: step: 182/466, loss: 0.003287299070507288 2023-01-22 15:53:37.842520: step: 184/466, loss: 0.09002115577459335 2023-01-22 15:53:38.631908: step: 186/466, loss: 0.011329753324389458 2023-01-22 15:53:39.382951: step: 188/466, loss: 0.0005845078267157078 2023-01-22 15:53:40.091888: step: 190/466, loss: 0.02527262084186077 2023-01-22 15:53:40.835973: step: 192/466, loss: 0.0019953681621700525 2023-01-22 15:53:41.570459: step: 194/466, loss: 0.0026451025623828173 2023-01-22 15:53:42.254999: step: 196/466, loss: 0.001436003134585917 2023-01-22 15:53:43.073874: step: 198/466, loss: 0.0031633705366402864 2023-01-22 15:53:43.792452: step: 200/466, loss: 0.0010856760200113058 2023-01-22 15:53:44.447625: step: 202/466, loss: 0.0021711231674999 2023-01-22 15:53:45.157895: step: 204/466, loss: 0.0067644547671079636 2023-01-22 15:53:45.863502: step: 206/466, loss: 0.0005549822235479951 2023-01-22 15:53:46.710029: step: 208/466, loss: 0.06292501091957092 2023-01-22 15:53:47.370496: step: 210/466, loss: 0.002018541330471635 2023-01-22 15:53:48.056070: step: 212/466, loss: 0.019470317289233208 2023-01-22 15:53:48.806787: step: 214/466, loss: 0.008875842206180096 2023-01-22 15:53:49.580854: step: 216/466, loss: 0.024308178573846817 2023-01-22 15:53:50.414699: step: 218/466, loss: 
0.27486276626586914 2023-01-22 15:53:51.252479: step: 220/466, loss: 0.004777147900313139 2023-01-22 15:53:51.991751: step: 222/466, loss: 0.0049628280103206635 2023-01-22 15:53:52.721796: step: 224/466, loss: 0.0005460731917992234 2023-01-22 15:53:53.466480: step: 226/466, loss: 0.001001058961264789 2023-01-22 15:53:54.207369: step: 228/466, loss: 0.03983161970973015 2023-01-22 15:53:54.978891: step: 230/466, loss: 0.11256561428308487 2023-01-22 15:53:55.728256: step: 232/466, loss: 0.006923142354935408 2023-01-22 15:53:56.410756: step: 234/466, loss: 0.028024213388562202 2023-01-22 15:53:57.228175: step: 236/466, loss: 0.03840837627649307 2023-01-22 15:53:57.997231: step: 238/466, loss: 0.04603128880262375 2023-01-22 15:53:58.780762: step: 240/466, loss: 0.004212265834212303 2023-01-22 15:53:59.464342: step: 242/466, loss: 0.027081385254859924 2023-01-22 15:54:00.288126: step: 244/466, loss: 0.002239649184048176 2023-01-22 15:54:01.041410: step: 246/466, loss: 0.036767009645700455 2023-01-22 15:54:01.801791: step: 248/466, loss: 0.012738176621496677 2023-01-22 15:54:02.534965: step: 250/466, loss: 0.004856251645833254 2023-01-22 15:54:03.381811: step: 252/466, loss: 0.006001537665724754 2023-01-22 15:54:04.159695: step: 254/466, loss: 0.03852864354848862 2023-01-22 15:54:04.969908: step: 256/466, loss: 0.0019090332789346576 2023-01-22 15:54:05.792768: step: 258/466, loss: 0.0013001691550016403 2023-01-22 15:54:06.634964: step: 260/466, loss: 0.007493459153920412 2023-01-22 15:54:07.410539: step: 262/466, loss: 0.041227787733078 2023-01-22 15:54:08.159854: step: 264/466, loss: 0.09032338857650757 2023-01-22 15:54:08.854151: step: 266/466, loss: 0.028157012537121773 2023-01-22 15:54:09.578707: step: 268/466, loss: 0.040503330528736115 2023-01-22 15:54:10.429183: step: 270/466, loss: 0.0026988424360752106 2023-01-22 15:54:11.178303: step: 272/466, loss: 0.47166407108306885 2023-01-22 15:54:12.013351: step: 274/466, loss: 0.0011497886152938008 2023-01-22 
15:54:12.753815: step: 276/466, loss: 0.00470120320096612 2023-01-22 15:54:13.430940: step: 278/466, loss: 0.0031544240191578865 2023-01-22 15:54:14.137261: step: 280/466, loss: 0.002785501768812537 2023-01-22 15:54:14.832681: step: 282/466, loss: 0.003542037680745125 2023-01-22 15:54:15.508761: step: 284/466, loss: 0.005036836955696344 2023-01-22 15:54:16.247449: step: 286/466, loss: 0.010355237871408463 2023-01-22 15:54:16.928740: step: 288/466, loss: 0.23964522778987885 2023-01-22 15:54:17.671486: step: 290/466, loss: 0.015845391899347305 2023-01-22 15:54:18.359174: step: 292/466, loss: 0.01506973896175623 2023-01-22 15:54:19.146633: step: 294/466, loss: 0.009640523232519627 2023-01-22 15:54:19.850467: step: 296/466, loss: 0.0004169405438005924 2023-01-22 15:54:20.631808: step: 298/466, loss: 0.21283280849456787 2023-01-22 15:54:21.404754: step: 300/466, loss: 0.003256069030612707 2023-01-22 15:54:22.122003: step: 302/466, loss: 0.034554723650217056 2023-01-22 15:54:22.963207: step: 304/466, loss: 0.00668883603066206 2023-01-22 15:54:23.658157: step: 306/466, loss: 0.002489378210157156 2023-01-22 15:54:24.338307: step: 308/466, loss: 0.01227173674851656 2023-01-22 15:54:25.096642: step: 310/466, loss: 0.007570523303002119 2023-01-22 15:54:25.844919: step: 312/466, loss: 0.009839157573878765 2023-01-22 15:54:26.732871: step: 314/466, loss: 0.03226277232170105 2023-01-22 15:54:27.530092: step: 316/466, loss: 0.004619147628545761 2023-01-22 15:54:28.328594: step: 318/466, loss: 0.03376193344593048 2023-01-22 15:54:29.047505: step: 320/466, loss: 0.0007725472096353769 2023-01-22 15:54:29.779769: step: 322/466, loss: 0.00034353527007624507 2023-01-22 15:54:30.558900: step: 324/466, loss: 0.015253189019858837 2023-01-22 15:54:31.269226: step: 326/466, loss: 0.0205977950245142 2023-01-22 15:54:32.029831: step: 328/466, loss: 4.4319975131656975e-05 2023-01-22 15:54:32.664689: step: 330/466, loss: 0.00025746741448529065 2023-01-22 15:54:33.428395: step: 332/466, loss: 
0.002880845917388797 2023-01-22 15:54:34.192952: step: 334/466, loss: 0.20469032227993011 2023-01-22 15:54:34.918537: step: 336/466, loss: 0.001424286630935967 2023-01-22 15:54:35.697633: step: 338/466, loss: 0.028453297913074493 2023-01-22 15:54:36.421531: step: 340/466, loss: 0.03045223280787468 2023-01-22 15:54:37.185823: step: 342/466, loss: 0.10660101473331451 2023-01-22 15:54:37.940730: step: 344/466, loss: 0.037694379687309265 2023-01-22 15:54:38.699347: step: 346/466, loss: 0.8764700889587402 2023-01-22 15:54:39.477566: step: 348/466, loss: 0.016882333904504776 2023-01-22 15:54:40.133290: step: 350/466, loss: 0.007390057668089867 2023-01-22 15:54:40.980174: step: 352/466, loss: 0.0014865277335047722 2023-01-22 15:54:41.707079: step: 354/466, loss: 0.001074317959137261 2023-01-22 15:54:42.617365: step: 356/466, loss: 0.016747722402215004 2023-01-22 15:54:43.477702: step: 358/466, loss: 0.000478467351058498 2023-01-22 15:54:44.213403: step: 360/466, loss: 0.009283812716603279 2023-01-22 15:54:44.992681: step: 362/466, loss: 0.09626930207014084 2023-01-22 15:54:45.688188: step: 364/466, loss: 0.1485295444726944 2023-01-22 15:54:46.461312: step: 366/466, loss: 0.028950830921530724 2023-01-22 15:54:47.330854: step: 368/466, loss: 0.1542282998561859 2023-01-22 15:54:48.090480: step: 370/466, loss: 0.01782785914838314 2023-01-22 15:54:48.755063: step: 372/466, loss: 0.03975386917591095 2023-01-22 15:54:49.557998: step: 374/466, loss: 0.02707074210047722 2023-01-22 15:54:50.285195: step: 376/466, loss: 0.003940037917345762 2023-01-22 15:54:51.027486: step: 378/466, loss: 0.0002613243996165693 2023-01-22 15:54:51.786319: step: 380/466, loss: 0.003921948838979006 2023-01-22 15:54:52.546524: step: 382/466, loss: 0.0002668288070708513 2023-01-22 15:54:53.359548: step: 384/466, loss: 0.0003136082086712122 2023-01-22 15:54:54.121897: step: 386/466, loss: 0.00648491270840168 2023-01-22 15:54:54.837439: step: 388/466, loss: 0.684620201587677 2023-01-22 15:54:55.466533: 
step: 390/466, loss: 0.0011094522196799517 2023-01-22 15:54:56.179648: step: 392/466, loss: 0.061549510806798935 2023-01-22 15:54:56.992546: step: 394/466, loss: 0.0019716816022992134 2023-01-22 15:54:57.745053: step: 396/466, loss: 0.37270647287368774 2023-01-22 15:54:58.470674: step: 398/466, loss: 0.002892756834626198 2023-01-22 15:54:59.259398: step: 400/466, loss: 0.007956295274198055 2023-01-22 15:55:00.082345: step: 402/466, loss: 0.0010787455830723047 2023-01-22 15:55:00.855913: step: 404/466, loss: 0.056287217885255814 2023-01-22 15:55:01.587829: step: 406/466, loss: 0.029866410419344902 2023-01-22 15:55:02.434599: step: 408/466, loss: 0.04059431701898575 2023-01-22 15:55:03.191861: step: 410/466, loss: 0.013068633154034615 2023-01-22 15:55:03.946817: step: 412/466, loss: 0.023778825998306274 2023-01-22 15:55:04.662805: step: 414/466, loss: 0.008005055598914623 2023-01-22 15:55:05.543147: step: 416/466, loss: 0.042082663625478745 2023-01-22 15:55:06.308946: step: 418/466, loss: 0.02101258747279644 2023-01-22 15:55:07.000536: step: 420/466, loss: 0.003775278339162469 2023-01-22 15:55:07.728393: step: 422/466, loss: 7.323760655708611e-05 2023-01-22 15:55:08.470855: step: 424/466, loss: 0.0001701459987089038 2023-01-22 15:55:09.223960: step: 426/466, loss: 0.013946725986897945 2023-01-22 15:55:09.967595: step: 428/466, loss: 0.010495316237211227 2023-01-22 15:55:10.678938: step: 430/466, loss: 0.0005667632794938982 2023-01-22 15:55:11.379525: step: 432/466, loss: 0.08862043917179108 2023-01-22 15:55:12.115632: step: 434/466, loss: 0.0020771543495357037 2023-01-22 15:55:12.861760: step: 436/466, loss: 0.0011141763534396887 2023-01-22 15:55:13.549624: step: 438/466, loss: 0.20008327066898346 2023-01-22 15:55:14.397770: step: 440/466, loss: 0.02583439089357853 2023-01-22 15:55:15.146227: step: 442/466, loss: 0.009787281975150108 2023-01-22 15:55:15.812439: step: 444/466, loss: 0.010717857629060745 2023-01-22 15:55:16.569318: step: 446/466, loss: 
0.0028945282101631165 2023-01-22 15:55:17.379599: step: 448/466, loss: 0.011020984500646591 2023-01-22 15:55:18.079971: step: 450/466, loss: 0.011754285544157028 2023-01-22 15:55:18.947670: step: 452/466, loss: 0.00658143125474453 2023-01-22 15:55:19.765375: step: 454/466, loss: 0.0025760687422007322 2023-01-22 15:55:20.516803: step: 456/466, loss: 0.012058288790285587 2023-01-22 15:55:21.236804: step: 458/466, loss: 0.0031139289494603872 2023-01-22 15:55:21.961780: step: 460/466, loss: 0.002164160367101431 2023-01-22 15:55:22.693238: step: 462/466, loss: 0.00015576444275211543 2023-01-22 15:55:23.464532: step: 464/466, loss: 2.8912174457218498e-05 2023-01-22 15:55:24.271914: step: 466/466, loss: 0.008501943200826645 2023-01-22 15:55:25.060161: step: 468/466, loss: 0.021276382729411125 2023-01-22 15:55:25.861777: step: 470/466, loss: 0.013494429178535938 2023-01-22 15:55:26.632650: step: 472/466, loss: 0.01890728250145912 2023-01-22 15:55:27.349517: step: 474/466, loss: 0.0011015519266948104 2023-01-22 15:55:28.047304: step: 476/466, loss: 0.019345303997397423 2023-01-22 15:55:28.722078: step: 478/466, loss: 0.0008879891829565167 2023-01-22 15:55:29.519490: step: 480/466, loss: 0.07371818274259567 2023-01-22 15:55:30.262385: step: 482/466, loss: 0.02774639055132866 2023-01-22 15:55:31.008008: step: 484/466, loss: 0.0016840663738548756 2023-01-22 15:55:31.771378: step: 486/466, loss: 0.04393770173192024 2023-01-22 15:55:32.525931: step: 488/466, loss: 0.007707908283919096 2023-01-22 15:55:33.352783: step: 490/466, loss: 0.10483228415250778 2023-01-22 15:55:34.026559: step: 492/466, loss: 0.014694461598992348 2023-01-22 15:55:34.731060: step: 494/466, loss: 0.00035671706427820027 2023-01-22 15:55:35.426063: step: 496/466, loss: 0.0007526023546233773 2023-01-22 15:55:36.161306: step: 498/466, loss: 0.005434651393443346 2023-01-22 15:55:36.882258: step: 500/466, loss: 0.0034361332654953003 2023-01-22 15:55:37.665890: step: 502/466, loss: 0.012881132774055004 2023-01-22 
15:55:38.426253: step: 504/466, loss: 0.029119957238435745 2023-01-22 15:55:39.238163: step: 506/466, loss: 0.054783616214990616 2023-01-22 15:55:40.084094: step: 508/466, loss: 0.004122794605791569 2023-01-22 15:55:40.834905: step: 510/466, loss: 0.01581837795674801 2023-01-22 15:55:41.517191: step: 512/466, loss: 0.0014336195308715105 2023-01-22 15:55:42.236205: step: 514/466, loss: 0.00047153281047940254 2023-01-22 15:55:43.000550: step: 516/466, loss: 0.04796868935227394 2023-01-22 15:55:43.711993: step: 518/466, loss: 0.0025941431522369385 2023-01-22 15:55:44.406766: step: 520/466, loss: 0.022482289001345634 2023-01-22 15:55:45.165626: step: 522/466, loss: 0.026926374062895775 2023-01-22 15:55:45.988823: step: 524/466, loss: 0.011951807886362076 2023-01-22 15:55:46.729139: step: 526/466, loss: 0.03389818221330643 2023-01-22 15:55:47.527983: step: 528/466, loss: 0.12745307385921478 2023-01-22 15:55:48.240670: step: 530/466, loss: 19.91595458984375 2023-01-22 15:55:48.950690: step: 532/466, loss: 0.011495614424347878 2023-01-22 15:55:49.719429: step: 534/466, loss: 0.1675240844488144 2023-01-22 15:55:50.422670: step: 536/466, loss: 0.0011641534510999918 2023-01-22 15:55:51.165317: step: 538/466, loss: 0.012672476470470428 2023-01-22 15:55:51.960600: step: 540/466, loss: 0.01503435242921114 2023-01-22 15:55:52.758342: step: 542/466, loss: 0.27556371688842773 2023-01-22 15:55:53.564796: step: 544/466, loss: 0.016031546518206596 2023-01-22 15:55:54.313891: step: 546/466, loss: 0.02003948763012886 2023-01-22 15:55:55.091698: step: 548/466, loss: 0.0018222718499600887 2023-01-22 15:55:55.803049: step: 550/466, loss: 0.01665370538830757 2023-01-22 15:55:56.531126: step: 552/466, loss: 0.00013914398732595146 2023-01-22 15:55:57.373094: step: 554/466, loss: 0.0043900106102228165 2023-01-22 15:55:58.091601: step: 556/466, loss: 0.6020170450210571 2023-01-22 15:55:58.800382: step: 558/466, loss: 0.025239666923880577 2023-01-22 15:55:59.569679: step: 560/466, loss: 
0.026258215308189392 2023-01-22 15:56:00.274819: step: 562/466, loss: 0.036790695041418076 2023-01-22 15:56:01.064698: step: 564/466, loss: 0.12813805043697357 2023-01-22 15:56:01.857021: step: 566/466, loss: 0.021396158263087273 2023-01-22 15:56:02.596877: step: 568/466, loss: 0.008455985225737095 2023-01-22 15:56:03.417922: step: 570/466, loss: 0.017628345638513565 2023-01-22 15:56:04.151662: step: 572/466, loss: 1.1817795038223267 2023-01-22 15:56:04.984058: step: 574/466, loss: 0.0001591477048350498 2023-01-22 15:56:05.675827: step: 576/466, loss: 0.017058134078979492 2023-01-22 15:56:06.493616: step: 578/466, loss: 0.016770707443356514 2023-01-22 15:56:07.263548: step: 580/466, loss: 0.011051390320062637 2023-01-22 15:56:08.025028: step: 582/466, loss: 0.3217681050300598 2023-01-22 15:56:08.722793: step: 584/466, loss: 0.012333834543824196 2023-01-22 15:56:09.422680: step: 586/466, loss: 0.11268869787454605 2023-01-22 15:56:10.217471: step: 588/466, loss: 0.013719167560338974 2023-01-22 15:56:11.013064: step: 590/466, loss: 0.018657803535461426 2023-01-22 15:56:11.827231: step: 592/466, loss: 0.019210534170269966 2023-01-22 15:56:12.634196: step: 594/466, loss: 0.8516140580177307 2023-01-22 15:56:13.356065: step: 596/466, loss: 0.00032440427457913756 2023-01-22 15:56:14.154215: step: 598/466, loss: 0.05943857878446579 2023-01-22 15:56:14.870915: step: 600/466, loss: 0.0002008227165788412 2023-01-22 15:56:15.604612: step: 602/466, loss: 0.0069481548853218555 2023-01-22 15:56:16.301157: step: 604/466, loss: 0.04889529198408127 2023-01-22 15:56:16.993141: step: 606/466, loss: 0.0031339051201939583 2023-01-22 15:56:17.764836: step: 608/466, loss: 0.07272807508707047 2023-01-22 15:56:18.500868: step: 610/466, loss: 0.029821842908859253 2023-01-22 15:56:19.248650: step: 612/466, loss: 0.0008868348668329418 2023-01-22 15:56:20.011628: step: 614/466, loss: 0.0166871827095747 2023-01-22 15:56:20.827575: step: 616/466, loss: 0.041613008826971054 2023-01-22 
15:56:21.523796: step: 618/466, loss: 0.005279239267110825 2023-01-22 15:56:22.293008: step: 620/466, loss: 0.00036231501144357026 2023-01-22 15:56:22.948790: step: 622/466, loss: 0.0006214659078978002 2023-01-22 15:56:23.696997: step: 624/466, loss: 0.020328793674707413 2023-01-22 15:56:24.413321: step: 626/466, loss: 0.00814978126436472 2023-01-22 15:56:25.218676: step: 628/466, loss: 0.0077381557784974575 2023-01-22 15:56:26.016508: step: 630/466, loss: 0.058892786502838135 2023-01-22 15:56:26.759174: step: 632/466, loss: 0.007260517682880163 2023-01-22 15:56:27.594584: step: 634/466, loss: 0.12914660573005676 2023-01-22 15:56:28.330010: step: 636/466, loss: 0.0056647504679858685 2023-01-22 15:56:29.100590: step: 638/466, loss: 0.011224090121686459 2023-01-22 15:56:29.744510: step: 640/466, loss: 0.02375660464167595 2023-01-22 15:56:30.450334: step: 642/466, loss: 0.017788778990507126 2023-01-22 15:56:31.179482: step: 644/466, loss: 0.0006204199744388461 2023-01-22 15:56:31.943634: step: 646/466, loss: 0.0039198934100568295 2023-01-22 15:56:32.780478: step: 648/466, loss: 0.006853078491985798 2023-01-22 15:56:33.554403: step: 650/466, loss: 0.012796571478247643 2023-01-22 15:56:34.321643: step: 652/466, loss: 0.0021514042746275663 2023-01-22 15:56:35.084920: step: 654/466, loss: 10.409101486206055 2023-01-22 15:56:35.864271: step: 656/466, loss: 0.03252642601728439 2023-01-22 15:56:36.657913: step: 658/466, loss: 0.048868775367736816 2023-01-22 15:56:37.356866: step: 660/466, loss: 0.009728114120662212 2023-01-22 15:56:38.199897: step: 662/466, loss: 0.0049397689290344715 2023-01-22 15:56:38.972299: step: 664/466, loss: 0.0002577665145508945 2023-01-22 15:56:39.745914: step: 666/466, loss: 0.004854061175137758 2023-01-22 15:56:40.485841: step: 668/466, loss: 0.007043915335088968 2023-01-22 15:56:41.189204: step: 670/466, loss: 0.07948621362447739 2023-01-22 15:56:42.017665: step: 672/466, loss: 0.0197658259421587 2023-01-22 15:56:42.796224: step: 674/466, loss: 
0.19008009135723114 2023-01-22 15:56:43.566172: step: 676/466, loss: 0.002508687088266015 2023-01-22 15:56:44.316560: step: 678/466, loss: 0.026402266696095467 2023-01-22 15:56:45.131245: step: 680/466, loss: 0.0015470042126253247 2023-01-22 15:56:45.871539: step: 682/466, loss: 0.0009299801313318312 2023-01-22 15:56:46.541360: step: 684/466, loss: 0.0004883540677838027 2023-01-22 15:56:47.264574: step: 686/466, loss: 0.019189296290278435 2023-01-22 15:56:48.017507: step: 688/466, loss: 0.04578102380037308 2023-01-22 15:56:48.812271: step: 690/466, loss: 0.0030191573314368725 2023-01-22 15:56:49.601059: step: 692/466, loss: 0.005563544109463692 2023-01-22 15:56:50.412291: step: 694/466, loss: 0.3574367165565491 2023-01-22 15:56:51.115695: step: 696/466, loss: 0.010373993776738644 2023-01-22 15:56:51.864568: step: 698/466, loss: 0.12190108746290207 2023-01-22 15:56:52.561219: step: 700/466, loss: 0.0032017051707953215 2023-01-22 15:56:53.309134: step: 702/466, loss: 0.017082445323467255 2023-01-22 15:56:54.066048: step: 704/466, loss: 0.006890237331390381 2023-01-22 15:56:54.873415: step: 706/466, loss: 0.0007328407955355942 2023-01-22 15:56:55.613124: step: 708/466, loss: 0.01048083696514368 2023-01-22 15:56:56.305943: step: 710/466, loss: 0.046984102576971054 2023-01-22 15:56:57.040808: step: 712/466, loss: 0.24207662045955658 2023-01-22 15:56:57.766165: step: 714/466, loss: 0.012634899467229843 2023-01-22 15:56:58.565686: step: 716/466, loss: 0.03797135129570961 2023-01-22 15:56:59.354656: step: 718/466, loss: 0.019333388656377792 2023-01-22 15:57:00.177762: step: 720/466, loss: 0.0033987818751484156 2023-01-22 15:57:00.902904: step: 722/466, loss: 0.001986326416954398 2023-01-22 15:57:01.707312: step: 724/466, loss: 0.050077229738235474 2023-01-22 15:57:02.485566: step: 726/466, loss: 0.012111474759876728 2023-01-22 15:57:03.277989: step: 728/466, loss: 0.0119631951674819 2023-01-22 15:57:04.070862: step: 730/466, loss: 0.03998979181051254 2023-01-22 
15:57:04.837314: step: 732/466, loss: 0.07630390673875809 2023-01-22 15:57:05.529598: step: 734/466, loss: 0.011790863238275051 2023-01-22 15:57:06.224882: step: 736/466, loss: 0.0075907232239842415 2023-01-22 15:57:07.027735: step: 738/466, loss: 0.0070110103115439415 2023-01-22 15:57:07.756025: step: 740/466, loss: 0.02544263005256653 2023-01-22 15:57:08.502321: step: 742/466, loss: 0.0033943578600883484 2023-01-22 15:57:09.239920: step: 744/466, loss: 0.0024433997459709644 2023-01-22 15:57:09.927059: step: 746/466, loss: 0.08233567327260971 2023-01-22 15:57:10.715269: step: 748/466, loss: 0.0069228848442435265 2023-01-22 15:57:11.542939: step: 750/466, loss: 0.04751036688685417 2023-01-22 15:57:12.341014: step: 752/466, loss: 0.011901628226041794 2023-01-22 15:57:13.052566: step: 754/466, loss: 0.19462859630584717 2023-01-22 15:57:13.800619: step: 756/466, loss: 0.011455833911895752 2023-01-22 15:57:14.545986: step: 758/466, loss: 0.09103307127952576 2023-01-22 15:57:15.316332: step: 760/466, loss: 0.0032544638961553574 2023-01-22 15:57:16.107280: step: 762/466, loss: 0.018020547926425934 2023-01-22 15:57:16.872496: step: 764/466, loss: 0.013624212704598904 2023-01-22 15:57:17.702780: step: 766/466, loss: 0.01824241690337658 2023-01-22 15:57:18.405867: step: 768/466, loss: 0.009417587891221046 2023-01-22 15:57:19.201157: step: 770/466, loss: 0.02756587788462639 2023-01-22 15:57:19.937656: step: 772/466, loss: 0.00769635196775198 2023-01-22 15:57:20.738185: step: 774/466, loss: 0.01789870485663414 2023-01-22 15:57:21.559115: step: 776/466, loss: 0.007817920297384262 2023-01-22 15:57:22.336297: step: 778/466, loss: 0.010252301581203938 2023-01-22 15:57:23.152459: step: 780/466, loss: 0.01401289738714695 2023-01-22 15:57:23.832147: step: 782/466, loss: 0.008630058728158474 2023-01-22 15:57:24.530812: step: 784/466, loss: 0.04713859409093857 2023-01-22 15:57:25.339493: step: 786/466, loss: 0.00617067189887166 2023-01-22 15:57:26.147541: step: 788/466, loss: 
0.06235240399837494 2023-01-22 15:57:26.895636: step: 790/466, loss: 0.00010020119952969253 2023-01-22 15:57:27.661165: step: 792/466, loss: 0.012778275646269321 2023-01-22 15:57:28.394616: step: 794/466, loss: 0.0429111085832119 2023-01-22 15:57:29.276356: step: 796/466, loss: 0.004496218170970678 2023-01-22 15:57:30.027285: step: 798/466, loss: 0.016231169924139977 2023-01-22 15:57:30.758556: step: 800/466, loss: 0.008096449077129364 2023-01-22 15:57:31.553916: step: 802/466, loss: 0.0170602947473526 2023-01-22 15:57:32.262321: step: 804/466, loss: 0.016482815146446228 2023-01-22 15:57:32.982149: step: 806/466, loss: 0.20911498367786407 2023-01-22 15:57:33.674620: step: 808/466, loss: 0.013375013135373592 2023-01-22 15:57:34.428292: step: 810/466, loss: 0.008207838982343674 2023-01-22 15:57:35.166536: step: 812/466, loss: 0.021015865728259087 2023-01-22 15:57:35.955277: step: 814/466, loss: 0.0028387221973389387 2023-01-22 15:57:36.738153: step: 816/466, loss: 0.06083836778998375 2023-01-22 15:57:37.516769: step: 818/466, loss: 0.01898682489991188 2023-01-22 15:57:38.276929: step: 820/466, loss: 0.02709573693573475 2023-01-22 15:57:39.034294: step: 822/466, loss: 0.04168681800365448 2023-01-22 15:57:39.786064: step: 824/466, loss: 0.10127529501914978 2023-01-22 15:57:40.533756: step: 826/466, loss: 0.019474346190690994 2023-01-22 15:57:41.307651: step: 828/466, loss: 0.047736093401908875 2023-01-22 15:57:42.101414: step: 830/466, loss: 0.06600034236907959 2023-01-22 15:57:42.902353: step: 832/466, loss: 0.008739227429032326 2023-01-22 15:57:43.636328: step: 834/466, loss: 2.102041721343994 2023-01-22 15:57:44.439433: step: 836/466, loss: 0.041133981198072433 2023-01-22 15:57:45.288553: step: 838/466, loss: 0.027052856981754303 2023-01-22 15:57:45.998841: step: 840/466, loss: 0.07918006181716919 2023-01-22 15:57:46.699510: step: 842/466, loss: 0.0011637471616268158 2023-01-22 15:57:47.489738: step: 844/466, loss: 0.008314933627843857 2023-01-22 15:57:48.219299: 
step: 846/466, loss: 0.02303183451294899 2023-01-22 15:57:49.003267: step: 848/466, loss: 0.04594963416457176 2023-01-22 15:57:49.763602: step: 850/466, loss: 0.0016797209391370416 2023-01-22 15:57:50.524261: step: 852/466, loss: 0.045600421726703644 2023-01-22 15:57:51.443906: step: 854/466, loss: 0.0024260608479380608 2023-01-22 15:57:52.243781: step: 856/466, loss: 0.014224640093743801 2023-01-22 15:57:52.959405: step: 858/466, loss: 0.0008135157404467463 2023-01-22 15:57:53.725085: step: 860/466, loss: 0.011436758562922478 2023-01-22 15:57:54.453213: step: 862/466, loss: 0.00037625571712851524 2023-01-22 15:57:55.146190: step: 864/466, loss: 0.003735210048034787 2023-01-22 15:57:55.920748: step: 866/466, loss: 0.02508869767189026 2023-01-22 15:57:56.660745: step: 868/466, loss: 0.012735653668642044 2023-01-22 15:57:57.400516: step: 870/466, loss: 0.017817430198192596 2023-01-22 15:57:58.157909: step: 872/466, loss: 0.0025312600191682577 2023-01-22 15:57:58.967696: step: 874/466, loss: 0.00425300607457757 2023-01-22 15:57:59.680212: step: 876/466, loss: 0.12249448150396347 2023-01-22 15:58:00.422394: step: 878/466, loss: 0.006648956798017025 2023-01-22 15:58:01.132082: step: 880/466, loss: 0.04195516183972359 2023-01-22 15:58:01.811297: step: 882/466, loss: 0.0004320128355175257 2023-01-22 15:58:02.640139: step: 884/466, loss: 0.022593876346945763 2023-01-22 15:58:03.426090: step: 886/466, loss: 0.0333063080906868 2023-01-22 15:58:04.144492: step: 888/466, loss: 0.0011965618468821049 2023-01-22 15:58:04.813756: step: 890/466, loss: 0.0011643233010545373 2023-01-22 15:58:05.607531: step: 892/466, loss: 0.0015710833249613643 2023-01-22 15:58:06.448878: step: 894/466, loss: 0.0017799792112782598 2023-01-22 15:58:07.226221: step: 896/466, loss: 0.011207039467990398 2023-01-22 15:58:07.974691: step: 898/466, loss: 0.001885769423097372 2023-01-22 15:58:08.861476: step: 900/466, loss: 0.00987928081303835 2023-01-22 15:58:09.574197: step: 902/466, loss: 
0.029370475560426712 2023-01-22 15:58:10.343587: step: 904/466, loss: 0.019185485318303108 2023-01-22 15:58:11.071437: step: 906/466, loss: 0.01172910537570715 2023-01-22 15:58:11.767123: step: 908/466, loss: 0.0012697859201580286 2023-01-22 15:58:12.539657: step: 910/466, loss: 0.00720939738675952 2023-01-22 15:58:13.239294: step: 912/466, loss: 0.0008965595043264329 2023-01-22 15:58:14.016809: step: 914/466, loss: 0.005025980528444052 2023-01-22 15:58:14.927539: step: 916/466, loss: 0.00024496426340192556 2023-01-22 15:58:15.869117: step: 918/466, loss: 0.029787475243210793 2023-01-22 15:58:16.596586: step: 920/466, loss: 0.046670347452163696 2023-01-22 15:58:17.254870: step: 922/466, loss: 0.005342547781765461 2023-01-22 15:58:17.989967: step: 924/466, loss: 0.0036600539460778236 2023-01-22 15:58:18.715941: step: 926/466, loss: 0.003369266865774989 2023-01-22 15:58:19.462287: step: 928/466, loss: 0.02210007980465889 2023-01-22 15:58:20.173724: step: 930/466, loss: 0.06508655846118927 2023-01-22 15:58:20.942407: step: 932/466, loss: 0.002537196036428213
==================================================
Loss: 0.108
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30574650912996776, 'r': 0.3086473298997018, 'f1': 0.3071900714960205}, 'combined': 0.22635057899706773, 'epoch': 31}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35761492008807666, 'r': 0.3018344299530214, 'f1': 0.32736553774979954}, 'combined': 0.20121003783646216, 'epoch': 31}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28186678434979695, 'r': 0.3171669888414224, 'f1': 0.2984767912846957}, 'combined': 0.21993026726240733, 'epoch': 31}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34086434132989524, 'r': 0.30512379947468093, 'f1': 0.32200536314017536}, 'combined': 0.1979154914910346, 'epoch': 31}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3233716475095786, 'r': 0.320303605313093, 'f1': 0.32183031458531935}, 'combined': 0.23713812653655109, 'epoch': 31}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.35439677734385905, 'r': 0.2963543242086863, 'f1': 0.32278706006307123}, 'combined': 0.19936847827424992, 'epoch': 31}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.251453488372093, 'r': 0.30892857142857144, 'f1': 0.27724358974358976}, 'combined': 0.18482905982905984, 'epoch': 31}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.25, 'r': 0.5108695652173914, 'f1': 0.33571428571428574}, 'combined': 0.16785714285714287, 'epoch': 31}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4666666666666667, 'r': 0.2413793103448276, 'f1': 0.3181818181818182}, 'combined': 0.2121212121212121, 'epoch': 31}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 32
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:01:06.277279: step: 2/466, loss: 5.15372994414065e-05 2023-01-22 16:01:07.039494: step: 4/466, loss: 0.06286688148975372 2023-01-22 16:01:07.845810: step: 6/466, loss: 0.143926739692688 2023-01-22 16:01:08.606896: step: 8/466, loss: 0.04372314363718033 2023-01-22 16:01:09.340161: step: 10/466, loss: 0.05686626583337784 2023-01-22 16:01:10.052937: step: 12/466, loss: 0.0015271787997335196 2023-01-22 16:01:10.880940: step: 14/466, loss: 0.0009403207222931087 2023-01-22
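[editor note] In the evaluation dicts printed by this log, the 'combined' value is consistently the product of the template F1 and the slot F1, and each 'f1' matches the standard harmonic mean of the printed 'p' and 'r'. A minimal sketch (not the training code itself; values copied from the Dev Chinese, epoch-31 entry above) that reproduces those numbers:

```python
def f1(p, r):
    # Standard F1: harmonic mean of precision and recall; 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Dev Chinese, epoch 31 (p/r values copied from the summary above).
template_f1 = f1(1.0, 0.5833333333333334)               # ~0.7368421052631579
slot_f1 = f1(0.30574650912996776, 0.3086473298997018)   # ~0.3071900714960205
combined = template_f1 * slot_f1                        # ~0.22635057899706773
```

The same relationship holds for the other language/split entries (e.g. Test Chinese: 0.6146341463414634 x 0.32736553774979954 ~ 0.20121003783646216).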
16:01:11.715752: step: 16/466, loss: 0.003312920220196247 2023-01-22 16:01:12.466856: step: 18/466, loss: 0.0007510375580750406 2023-01-22 16:01:13.185937: step: 20/466, loss: 0.01566898077726364 2023-01-22 16:01:13.893410: step: 22/466, loss: 0.009687750600278378 2023-01-22 16:01:14.930688: step: 24/466, loss: 0.021713193506002426 2023-01-22 16:01:15.677520: step: 26/466, loss: 0.05182737484574318 2023-01-22 16:01:16.409588: step: 28/466, loss: 0.02287333831191063 2023-01-22 16:01:17.238585: step: 30/466, loss: 0.03470327705144882 2023-01-22 16:01:17.971443: step: 32/466, loss: 0.027573635801672935 2023-01-22 16:01:18.742178: step: 34/466, loss: 0.0009407330653630197 2023-01-22 16:01:19.459389: step: 36/466, loss: 0.02692416124045849 2023-01-22 16:01:20.226617: step: 38/466, loss: 0.028733985498547554 2023-01-22 16:01:20.916088: step: 40/466, loss: 0.0004459666379261762 2023-01-22 16:01:21.658990: step: 42/466, loss: 0.016065159812569618 2023-01-22 16:01:22.444366: step: 44/466, loss: 0.007120391353964806 2023-01-22 16:01:23.176520: step: 46/466, loss: 0.0037614544853568077 2023-01-22 16:01:23.952157: step: 48/466, loss: 0.006791574414819479 2023-01-22 16:01:24.715532: step: 50/466, loss: 0.0027036392129957676 2023-01-22 16:01:25.523401: step: 52/466, loss: 0.02349858544766903 2023-01-22 16:01:26.326569: step: 54/466, loss: 0.04122764989733696 2023-01-22 16:01:27.035982: step: 56/466, loss: 0.006118678953498602 2023-01-22 16:01:27.823979: step: 58/466, loss: 0.03250905126333237 2023-01-22 16:01:28.552099: step: 60/466, loss: 0.0003115589206572622 2023-01-22 16:01:29.305300: step: 62/466, loss: 0.033898092806339264 2023-01-22 16:01:30.110016: step: 64/466, loss: 0.0010923035442829132 2023-01-22 16:01:30.840658: step: 66/466, loss: 0.016168424859642982 2023-01-22 16:01:31.568838: step: 68/466, loss: 0.0022301855497062206 2023-01-22 16:01:32.320001: step: 70/466, loss: 0.009409364312887192 2023-01-22 16:01:33.102848: step: 72/466, loss: 0.026693210005760193 
2023-01-22 16:01:33.827469: step: 74/466, loss: 0.009221694432199001 2023-01-22 16:01:34.529886: step: 76/466, loss: 0.0024394956417381763 2023-01-22 16:01:35.280236: step: 78/466, loss: 0.0045931036584079266 2023-01-22 16:01:36.081414: step: 80/466, loss: 0.00041875254828482866 2023-01-22 16:01:36.924505: step: 82/466, loss: 0.0021951228845864534 2023-01-22 16:01:37.614587: step: 84/466, loss: 0.0005747769610024989 2023-01-22 16:01:38.356225: step: 86/466, loss: 0.0007826169021427631 2023-01-22 16:01:39.053145: step: 88/466, loss: 0.00010447607201058418 2023-01-22 16:01:39.839737: step: 90/466, loss: 0.003849487518891692 2023-01-22 16:01:40.612916: step: 92/466, loss: 0.693045437335968 2023-01-22 16:01:41.362288: step: 94/466, loss: 0.37063682079315186 2023-01-22 16:01:42.027234: step: 96/466, loss: 0.08187183737754822 2023-01-22 16:01:42.773821: step: 98/466, loss: 0.007195903453975916 2023-01-22 16:01:43.597427: step: 100/466, loss: 0.08682779222726822 2023-01-22 16:01:44.369534: step: 102/466, loss: 0.016507381573319435 2023-01-22 16:01:45.207703: step: 104/466, loss: 0.007591415196657181 2023-01-22 16:01:46.012210: step: 106/466, loss: 0.010487313382327557 2023-01-22 16:01:46.794881: step: 108/466, loss: 0.007627226412296295 2023-01-22 16:01:47.572789: step: 110/466, loss: 0.014922836795449257 2023-01-22 16:01:48.326082: step: 112/466, loss: 0.023592745885252953 2023-01-22 16:01:49.064825: step: 114/466, loss: 0.0010895115556195378 2023-01-22 16:01:49.789951: step: 116/466, loss: 0.0007410570979118347 2023-01-22 16:01:50.534191: step: 118/466, loss: 0.004684681538492441 2023-01-22 16:01:51.323579: step: 120/466, loss: 0.001338861184194684 2023-01-22 16:01:52.140893: step: 122/466, loss: 0.02555706538259983 2023-01-22 16:01:53.066933: step: 124/466, loss: 0.017490247264504433 2023-01-22 16:01:53.967711: step: 126/466, loss: 0.031979188323020935 2023-01-22 16:01:54.724380: step: 128/466, loss: 0.02571520209312439 2023-01-22 16:01:55.421838: step: 130/466, loss: 
0.01015580352395773 2023-01-22 16:01:56.163345: step: 132/466, loss: 0.0033969765063375235 2023-01-22 16:01:56.974718: step: 134/466, loss: 0.003304438665509224 2023-01-22 16:01:57.689602: step: 136/466, loss: 0.008730168454349041 2023-01-22 16:01:58.470891: step: 138/466, loss: 0.007856298238039017 2023-01-22 16:01:59.224570: step: 140/466, loss: 0.06530187278985977 2023-01-22 16:02:00.043594: step: 142/466, loss: 0.01866009458899498 2023-01-22 16:02:00.858482: step: 144/466, loss: 0.034205153584480286 2023-01-22 16:02:01.574981: step: 146/466, loss: 0.013796073384582996 2023-01-22 16:02:02.273536: step: 148/466, loss: 0.034375231713056564 2023-01-22 16:02:03.091412: step: 150/466, loss: 0.028176940977573395 2023-01-22 16:02:03.841089: step: 152/466, loss: 0.04444631189107895 2023-01-22 16:02:04.589856: step: 154/466, loss: 0.0017726004589349031 2023-01-22 16:02:05.347816: step: 156/466, loss: 0.006879525724798441 2023-01-22 16:02:06.132439: step: 158/466, loss: 0.0015480904839932919 2023-01-22 16:02:06.894212: step: 160/466, loss: 0.021863074973225594 2023-01-22 16:02:07.567485: step: 162/466, loss: 0.003709911135956645 2023-01-22 16:02:08.298850: step: 164/466, loss: 0.0016268891049548984 2023-01-22 16:02:09.065872: step: 166/466, loss: 0.005751646589487791 2023-01-22 16:02:09.946469: step: 168/466, loss: 0.06992904841899872 2023-01-22 16:02:10.721345: step: 170/466, loss: 0.549974262714386 2023-01-22 16:02:11.485299: step: 172/466, loss: 0.016287731006741524 2023-01-22 16:02:12.166700: step: 174/466, loss: 0.0040813288651406765 2023-01-22 16:02:12.881575: step: 176/466, loss: 0.8938317894935608 2023-01-22 16:02:13.709101: step: 178/466, loss: 0.061728768050670624 2023-01-22 16:02:14.422222: step: 180/466, loss: 0.1072620376944542 2023-01-22 16:02:15.117167: step: 182/466, loss: 0.00487162871286273 2023-01-22 16:02:15.919457: step: 184/466, loss: 0.15155097842216492 2023-01-22 16:02:16.671830: step: 186/466, loss: 0.15379630029201508 2023-01-22 16:02:17.399534: 
step: 188/466, loss: 0.04495794698596001 2023-01-22 16:02:18.203121: step: 190/466, loss: 0.0044115460477769375 2023-01-22 16:02:18.934774: step: 192/466, loss: 0.0011705057695508003 2023-01-22 16:02:19.740449: step: 194/466, loss: 0.0030282491352409124 2023-01-22 16:02:20.540842: step: 196/466, loss: 0.004321829881519079 2023-01-22 16:02:21.316033: step: 198/466, loss: 0.015348169021308422 2023-01-22 16:02:22.082510: step: 200/466, loss: 0.01626484841108322 2023-01-22 16:02:22.895383: step: 202/466, loss: 0.03765762597322464 2023-01-22 16:02:23.601187: step: 204/466, loss: 0.002001148881390691 2023-01-22 16:02:24.375955: step: 206/466, loss: 0.0076205311343073845 2023-01-22 16:02:25.177603: step: 208/466, loss: 0.026217781007289886 2023-01-22 16:02:25.873942: step: 210/466, loss: 0.014518878422677517 2023-01-22 16:02:26.690840: step: 212/466, loss: 3.662377275759354e-05 2023-01-22 16:02:27.501548: step: 214/466, loss: 0.0026546171866357327 2023-01-22 16:02:28.149835: step: 216/466, loss: 0.0005058245151303709 2023-01-22 16:02:28.954563: step: 218/466, loss: 0.057214125990867615 2023-01-22 16:02:29.739662: step: 220/466, loss: 0.0011696420842781663 2023-01-22 16:02:30.522698: step: 222/466, loss: 0.0006456954870373011 2023-01-22 16:02:31.277604: step: 224/466, loss: 0.007920566946268082 2023-01-22 16:02:32.000691: step: 226/466, loss: 0.0024360103998333216 2023-01-22 16:02:32.806501: step: 228/466, loss: 0.0069953678175807 2023-01-22 16:02:33.490671: step: 230/466, loss: 0.002587408060207963 2023-01-22 16:02:34.186546: step: 232/466, loss: 0.06114426627755165 2023-01-22 16:02:34.917267: step: 234/466, loss: 0.021435175091028214 2023-01-22 16:02:35.692029: step: 236/466, loss: 0.013130392879247665 2023-01-22 16:02:36.470787: step: 238/466, loss: 0.04634890332818031 2023-01-22 16:02:37.238910: step: 240/466, loss: 0.0003861374862026423 2023-01-22 16:02:38.012511: step: 242/466, loss: 0.018936574459075928 2023-01-22 16:02:38.780082: step: 244/466, loss: 
0.07208773493766785 2023-01-22 16:02:39.510733: step: 246/466, loss: 0.01850968413054943 2023-01-22 16:02:40.186584: step: 248/466, loss: 0.0026213873643428087 2023-01-22 16:02:40.918993: step: 250/466, loss: 0.000575187848880887 2023-01-22 16:02:41.635679: step: 252/466, loss: 0.010430560447275639 2023-01-22 16:02:42.429195: step: 254/466, loss: 0.02402244694530964 2023-01-22 16:02:43.147604: step: 256/466, loss: 0.4019804894924164 2023-01-22 16:02:43.867753: step: 258/466, loss: 0.002005958929657936 2023-01-22 16:02:44.587622: step: 260/466, loss: 0.005219338461756706 2023-01-22 16:02:45.287133: step: 262/466, loss: 0.007266393397003412 2023-01-22 16:02:46.020297: step: 264/466, loss: 0.011721421964466572 2023-01-22 16:02:46.765513: step: 266/466, loss: 0.01827792450785637 2023-01-22 16:02:47.498293: step: 268/466, loss: 0.04693610966205597 2023-01-22 16:02:48.289547: step: 270/466, loss: 0.01974484883248806 2023-01-22 16:02:49.169746: step: 272/466, loss: 0.09845882654190063 2023-01-22 16:02:49.863825: step: 274/466, loss: 0.4635373651981354 2023-01-22 16:02:50.629526: step: 276/466, loss: 0.6648179292678833 2023-01-22 16:02:51.351313: step: 278/466, loss: 0.035920653492212296 2023-01-22 16:02:52.058375: step: 280/466, loss: 0.007692283485084772 2023-01-22 16:02:52.744763: step: 282/466, loss: 0.032454658299684525 2023-01-22 16:02:53.516814: step: 284/466, loss: 0.001869324711151421 2023-01-22 16:02:54.220503: step: 286/466, loss: 0.0011435500346124172 2023-01-22 16:02:55.065457: step: 288/466, loss: 0.01853904314339161 2023-01-22 16:02:55.707912: step: 290/466, loss: 0.05531471222639084 2023-01-22 16:02:56.492485: step: 292/466, loss: 0.13581180572509766 2023-01-22 16:02:57.253303: step: 294/466, loss: 0.043170515447854996 2023-01-22 16:02:57.919366: step: 296/466, loss: 0.000367786327842623 2023-01-22 16:02:58.730919: step: 298/466, loss: 0.0017710586544126272 2023-01-22 16:02:59.479220: step: 300/466, loss: 0.0012787083396688104 2023-01-22 16:03:00.256983: 
step: 302/466, loss: 0.009321542456746101 2023-01-22 16:03:00.978820: step: 304/466, loss: 0.04730689898133278 2023-01-22 16:03:01.752717: step: 306/466, loss: 0.07945113629102707 2023-01-22 16:03:02.545382: step: 308/466, loss: 0.0039559840224683285 2023-01-22 16:03:03.337157: step: 310/466, loss: 0.00273063569329679 2023-01-22 16:03:04.039412: step: 312/466, loss: 0.24587573111057281 2023-01-22 16:03:04.736515: step: 314/466, loss: 0.010982971638441086 2023-01-22 16:03:05.465821: step: 316/466, loss: 0.0003546890802681446 2023-01-22 16:03:06.298456: step: 318/466, loss: 0.21356633305549622 2023-01-22 16:03:07.046502: step: 320/466, loss: 0.00359840365126729 2023-01-22 16:03:07.822116: step: 322/466, loss: 0.0009715608903206885 2023-01-22 16:03:08.657154: step: 324/466, loss: 0.003452699165791273 2023-01-22 16:03:09.416645: step: 326/466, loss: 0.008932402357459068 2023-01-22 16:03:10.165286: step: 328/466, loss: 0.008256279863417149 2023-01-22 16:03:10.914435: step: 330/466, loss: 0.01303121168166399 2023-01-22 16:03:11.671353: step: 332/466, loss: 0.012796856462955475 2023-01-22 16:03:12.481178: step: 334/466, loss: 0.005467746406793594 2023-01-22 16:03:13.260022: step: 336/466, loss: 0.028389442712068558 2023-01-22 16:03:14.021205: step: 338/466, loss: 0.015709152445197105 2023-01-22 16:03:14.792290: step: 340/466, loss: 0.08886820822954178 2023-01-22 16:03:15.579928: step: 342/466, loss: 0.03574886545538902 2023-01-22 16:03:16.277826: step: 344/466, loss: 0.06135426089167595 2023-01-22 16:03:17.026927: step: 346/466, loss: 0.008707728236913681 2023-01-22 16:03:17.828423: step: 348/466, loss: 0.09459453821182251 2023-01-22 16:03:18.648422: step: 350/466, loss: 0.002856465056538582 2023-01-22 16:03:19.380731: step: 352/466, loss: 0.004025793168693781 2023-01-22 16:03:20.125339: step: 354/466, loss: 0.01567946933209896 2023-01-22 16:03:20.832849: step: 356/466, loss: 0.020442336797714233 2023-01-22 16:03:21.575829: step: 358/466, loss: 0.0505521185696125 
2023-01-22 16:03:22.416833: step: 360/466, loss: 0.018883030861616135 2023-01-22 16:03:23.216720: step: 362/466, loss: 0.12958039343357086 2023-01-22 16:03:23.928356: step: 364/466, loss: 0.005207682028412819 2023-01-22 16:03:24.733246: step: 366/466, loss: 0.004336261190474033 2023-01-22 16:03:25.515325: step: 368/466, loss: 0.03535531833767891 2023-01-22 16:03:26.192640: step: 370/466, loss: 0.012042288668453693 2023-01-22 16:03:26.978218: step: 372/466, loss: 0.06027388945221901 2023-01-22 16:03:27.814321: step: 374/466, loss: 0.0804840475320816 2023-01-22 16:03:28.572436: step: 376/466, loss: 0.0014674561098217964 2023-01-22 16:03:29.272958: step: 378/466, loss: 0.05795981362462044 2023-01-22 16:03:30.001192: step: 380/466, loss: 0.00041502335807308555 2023-01-22 16:03:30.758874: step: 382/466, loss: 0.0005715118604712188 2023-01-22 16:03:31.562161: step: 384/466, loss: 0.06926651298999786 2023-01-22 16:03:32.339890: step: 386/466, loss: 0.03368639945983887 2023-01-22 16:03:33.143623: step: 388/466, loss: 0.011914392933249474 2023-01-22 16:03:33.948264: step: 390/466, loss: 0.01934785209596157 2023-01-22 16:03:34.726001: step: 392/466, loss: 0.017015738412737846 2023-01-22 16:03:35.485358: step: 394/466, loss: 0.0006627896218560636 2023-01-22 16:03:36.354309: step: 396/466, loss: 0.000683549209497869 2023-01-22 16:03:37.062486: step: 398/466, loss: 0.02618207037448883 2023-01-22 16:03:37.912132: step: 400/466, loss: 0.008119520731270313 2023-01-22 16:03:38.548613: step: 402/466, loss: 0.007547646760940552 2023-01-22 16:03:39.258571: step: 404/466, loss: 0.000945351435802877 2023-01-22 16:03:40.164378: step: 406/466, loss: 0.021251145750284195 2023-01-22 16:03:40.893954: step: 408/466, loss: 0.008209510706365108 2023-01-22 16:03:41.598953: step: 410/466, loss: 0.021134501323103905 2023-01-22 16:03:42.337138: step: 412/466, loss: 0.1187206283211708 2023-01-22 16:03:43.139260: step: 414/466, loss: 0.011240017600357533 2023-01-22 16:03:43.924940: step: 416/466, 
loss: 0.01897376775741577 2023-01-22 16:03:44.662388: step: 418/466, loss: 0.017858237028121948 2023-01-22 16:03:45.403850: step: 420/466, loss: 0.01124533824622631 2023-01-22 16:03:46.125076: step: 422/466, loss: 0.007958485744893551 2023-01-22 16:03:46.848479: step: 424/466, loss: 0.019424919039011 2023-01-22 16:03:47.584165: step: 426/466, loss: 0.0003154211735818535 2023-01-22 16:03:48.338049: step: 428/466, loss: 0.0011403568787500262 2023-01-22 16:03:49.060747: step: 430/466, loss: 0.0004980181693099439 2023-01-22 16:03:49.795808: step: 432/466, loss: 0.011258955113589764 2023-01-22 16:03:50.613585: step: 434/466, loss: 0.04757676273584366 2023-01-22 16:03:51.418631: step: 436/466, loss: 0.048159319907426834 2023-01-22 16:03:52.221080: step: 438/466, loss: 0.01208664383739233 2023-01-22 16:03:52.942647: step: 440/466, loss: 0.06914924085140228 2023-01-22 16:03:53.713110: step: 442/466, loss: 0.03010084666311741 2023-01-22 16:03:54.455849: step: 444/466, loss: 0.008922251872718334 2023-01-22 16:03:55.216648: step: 446/466, loss: 0.01286082249134779 2023-01-22 16:03:55.996725: step: 448/466, loss: 0.014049254357814789 2023-01-22 16:03:56.751797: step: 450/466, loss: 0.007900919765233994 2023-01-22 16:03:57.521000: step: 452/466, loss: 0.001978015759959817 2023-01-22 16:03:58.250819: step: 454/466, loss: 0.03158547729253769 2023-01-22 16:03:59.021933: step: 456/466, loss: 0.031130792573094368 2023-01-22 16:03:59.753024: step: 458/466, loss: 0.05741674825549126 2023-01-22 16:04:00.540086: step: 460/466, loss: 0.031495485454797745 2023-01-22 16:04:01.301561: step: 462/466, loss: 0.10114021599292755 2023-01-22 16:04:02.013710: step: 464/466, loss: 0.010003827512264252 2023-01-22 16:04:02.728632: step: 466/466, loss: 0.0012490164954215288 2023-01-22 16:04:03.456443: step: 468/466, loss: 0.0016084901290014386 2023-01-22 16:04:04.225333: step: 470/466, loss: 0.06793329864740372 2023-01-22 16:04:04.990471: step: 472/466, loss: 0.0016267584869638085 2023-01-22 
16:04:05.755982: step: 474/466, loss: 0.04030391946434975 2023-01-22 16:04:06.514791: step: 476/466, loss: 0.004462345503270626 2023-01-22 16:04:07.251073: step: 478/466, loss: 0.007912660017609596 2023-01-22 16:04:08.051699: step: 480/466, loss: 0.0009775584330782294 2023-01-22 16:04:08.779025: step: 482/466, loss: 0.007115909829735756 2023-01-22 16:04:09.498140: step: 484/466, loss: 0.002461756346747279 2023-01-22 16:04:10.327302: step: 486/466, loss: 0.00989474169909954 2023-01-22 16:04:11.075063: step: 488/466, loss: 0.014085104689002037 2023-01-22 16:04:11.785479: step: 490/466, loss: 0.009780194610357285 2023-01-22 16:04:12.542260: step: 492/466, loss: 0.006210966035723686 2023-01-22 16:04:13.279998: step: 494/466, loss: 0.014013716019690037 2023-01-22 16:04:13.994673: step: 496/466, loss: 0.0018455483950674534 2023-01-22 16:04:14.643231: step: 498/466, loss: 0.0010371323442086577 2023-01-22 16:04:15.417686: step: 500/466, loss: 0.00023467614664696157 2023-01-22 16:04:16.143628: step: 502/466, loss: 0.00018893061496783048 2023-01-22 16:04:16.891979: step: 504/466, loss: 0.03937339410185814 2023-01-22 16:04:17.622544: step: 506/466, loss: 0.0881049633026123 2023-01-22 16:04:18.412365: step: 508/466, loss: 0.00766101386398077 2023-01-22 16:04:19.129870: step: 510/466, loss: 0.012610274367034435 2023-01-22 16:04:19.901404: step: 512/466, loss: 0.18752272427082062 2023-01-22 16:04:20.683047: step: 514/466, loss: 0.02388453483581543 2023-01-22 16:04:21.486739: step: 516/466, loss: 0.17496129870414734 2023-01-22 16:04:22.312655: step: 518/466, loss: 0.5000375509262085 2023-01-22 16:04:23.024266: step: 520/466, loss: 0.04777266085147858 2023-01-22 16:04:23.874538: step: 522/466, loss: 0.03219824656844139 2023-01-22 16:04:24.617112: step: 524/466, loss: 0.005659267771989107 2023-01-22 16:04:25.306608: step: 526/466, loss: 0.01143594179302454 2023-01-22 16:04:26.195372: step: 528/466, loss: 0.007292766124010086 2023-01-22 16:04:26.909801: step: 530/466, loss: 
0.029840456321835518 2023-01-22 16:04:27.648930: step: 532/466, loss: 0.0034368287306278944 2023-01-22 16:04:28.297204: step: 534/466, loss: 0.016929039731621742 2023-01-22 16:04:28.966610: step: 536/466, loss: 0.013368791900575161 2023-01-22 16:04:29.696006: step: 538/466, loss: 0.08718257397413254 2023-01-22 16:04:30.501271: step: 540/466, loss: 0.0047819907777011395 2023-01-22 16:04:31.265597: step: 542/466, loss: 0.008324574679136276 2023-01-22 16:04:31.967248: step: 544/466, loss: 0.0233193039894104 2023-01-22 16:04:32.751075: step: 546/466, loss: 0.017024584114551544 2023-01-22 16:04:33.561975: step: 548/466, loss: 0.5465644001960754 2023-01-22 16:04:34.296129: step: 550/466, loss: 0.07017926126718521 2023-01-22 16:04:35.052202: step: 552/466, loss: 0.01461838185787201 2023-01-22 16:04:35.866223: step: 554/466, loss: 0.0295786764472723 2023-01-22 16:04:36.702617: step: 556/466, loss: 0.0040992312133312225 2023-01-22 16:04:37.433472: step: 558/466, loss: 0.06020784378051758 2023-01-22 16:04:38.144063: step: 560/466, loss: 0.018876129761338234 2023-01-22 16:04:38.823334: step: 562/466, loss: 0.006293662823736668 2023-01-22 16:04:39.519749: step: 564/466, loss: 0.006968436297029257 2023-01-22 16:04:40.170254: step: 566/466, loss: 0.0026204369496554136 2023-01-22 16:04:41.021959: step: 568/466, loss: 0.003658514702692628 2023-01-22 16:04:41.764800: step: 570/466, loss: 0.0021792915649712086 2023-01-22 16:04:42.494970: step: 572/466, loss: 0.22942087054252625 2023-01-22 16:04:43.254915: step: 574/466, loss: 0.006244773976504803 2023-01-22 16:04:43.998324: step: 576/466, loss: 0.0036456582602113485 2023-01-22 16:04:44.757537: step: 578/466, loss: 0.030625438317656517 2023-01-22 16:04:45.505676: step: 580/466, loss: 0.010105275548994541 2023-01-22 16:04:46.271789: step: 582/466, loss: 0.0260683111846447 2023-01-22 16:04:47.023571: step: 584/466, loss: 0.031024497002363205 2023-01-22 16:04:47.771794: step: 586/466, loss: 0.009688056074082851 2023-01-22 
16:04:48.548409: step: 588/466, loss: 0.02166881412267685 2023-01-22 16:04:49.344995: step: 590/466, loss: 0.010086444206535816 2023-01-22 16:04:50.092042: step: 592/466, loss: 0.0005045717116445303 2023-01-22 16:04:50.783485: step: 594/466, loss: 0.0016831730026751757 2023-01-22 16:04:51.510711: step: 596/466, loss: 0.0021537907887250185 2023-01-22 16:04:52.299915: step: 598/466, loss: 0.06127836927771568 2023-01-22 16:04:53.049434: step: 600/466, loss: 0.00690503278747201 2023-01-22 16:04:53.791676: step: 602/466, loss: 0.001794489799067378 2023-01-22 16:04:54.663544: step: 604/466, loss: 0.0634308010339737 2023-01-22 16:04:55.545795: step: 606/466, loss: 0.018846353515982628 2023-01-22 16:04:56.444900: step: 608/466, loss: 0.009228182956576347 2023-01-22 16:04:57.129490: step: 610/466, loss: 0.015570910647511482 2023-01-22 16:04:57.938399: step: 612/466, loss: 0.001527966232970357 2023-01-22 16:04:58.723309: step: 614/466, loss: 0.009683456271886826 2023-01-22 16:04:59.521605: step: 616/466, loss: 0.003080724971368909 2023-01-22 16:05:00.276746: step: 618/466, loss: 0.033230457454919815 2023-01-22 16:05:01.017586: step: 620/466, loss: 0.0030834791250526905 2023-01-22 16:05:01.778455: step: 622/466, loss: 0.010715140029788017 2023-01-22 16:05:02.579693: step: 624/466, loss: 0.06441494822502136 2023-01-22 16:05:03.336252: step: 626/466, loss: 0.006688097957521677 2023-01-22 16:05:04.019907: step: 628/466, loss: 0.01309305801987648 2023-01-22 16:05:04.861054: step: 630/466, loss: 0.057056933641433716 2023-01-22 16:05:05.635662: step: 632/466, loss: 0.005508675705641508 2023-01-22 16:05:06.430514: step: 634/466, loss: 0.020695330575108528 2023-01-22 16:05:07.122826: step: 636/466, loss: 0.002080296166241169 2023-01-22 16:05:07.752786: step: 638/466, loss: 0.0001161619002232328 2023-01-22 16:05:08.520563: step: 640/466, loss: 0.022997191175818443 2023-01-22 16:05:09.275982: step: 642/466, loss: 0.004008348099887371 2023-01-22 16:05:09.967643: step: 644/466, loss: 
0.018378565087914467 2023-01-22 16:05:10.769411: step: 646/466, loss: 0.05346864089369774 2023-01-22 16:05:11.556347: step: 648/466, loss: 0.028049439191818237 2023-01-22 16:05:12.349114: step: 650/466, loss: 0.01510514598339796 2023-01-22 16:05:13.032784: step: 652/466, loss: 0.012815630063414574 2023-01-22 16:05:13.804025: step: 654/466, loss: 0.3772117495536804 2023-01-22 16:05:14.610412: step: 656/466, loss: 0.0002463465789332986 2023-01-22 16:05:15.418123: step: 658/466, loss: 0.00645784754306078 2023-01-22 16:05:16.173327: step: 660/466, loss: 0.002490751910954714 2023-01-22 16:05:16.904177: step: 662/466, loss: 0.0003407985786907375 2023-01-22 16:05:17.590750: step: 664/466, loss: 0.0006427292246371508 2023-01-22 16:05:18.357814: step: 666/466, loss: 0.3637174665927887 2023-01-22 16:05:19.199975: step: 668/466, loss: 0.04300692677497864 2023-01-22 16:05:19.950918: step: 670/466, loss: 0.07485006749629974 2023-01-22 16:05:20.713602: step: 672/466, loss: 0.011753874830901623 2023-01-22 16:05:21.544314: step: 674/466, loss: 0.011387944221496582 2023-01-22 16:05:22.299200: step: 676/466, loss: 0.0013695204397663474 2023-01-22 16:05:23.063881: step: 678/466, loss: 0.0006198549526743591 2023-01-22 16:05:23.849307: step: 680/466, loss: 0.03149678558111191 2023-01-22 16:05:24.573720: step: 682/466, loss: 0.06519640982151031 2023-01-22 16:05:25.231683: step: 684/466, loss: 0.0003538952150847763 2023-01-22 16:05:26.034321: step: 686/466, loss: 6.4656453132629395 2023-01-22 16:05:26.813803: step: 688/466, loss: 0.04752850532531738 2023-01-22 16:05:27.540656: step: 690/466, loss: 0.014662222005426884 2023-01-22 16:05:28.335969: step: 692/466, loss: 0.00392181845381856 2023-01-22 16:05:29.068222: step: 694/466, loss: 0.005736898630857468 2023-01-22 16:05:29.846399: step: 696/466, loss: 0.005092841573059559 2023-01-22 16:05:30.583047: step: 698/466, loss: 0.006942718289792538 2023-01-22 16:05:31.355489: step: 700/466, loss: 0.014697852544486523 2023-01-22 16:05:32.262456: 
step: 702/466, loss: 0.005182123742997646 2023-01-22 16:05:32.972931: step: 704/466, loss: 0.0068984078243374825 2023-01-22 16:05:33.755533: step: 706/466, loss: 0.00032237821142189205 2023-01-22 16:05:34.467545: step: 708/466, loss: 0.009095367044210434 2023-01-22 16:05:35.235065: step: 710/466, loss: 0.00819557998329401 2023-01-22 16:05:36.033052: step: 712/466, loss: 0.01657663844525814 2023-01-22 16:05:36.821779: step: 714/466, loss: 0.018086804077029228 2023-01-22 16:05:37.498311: step: 716/466, loss: 0.013770289719104767 2023-01-22 16:05:38.281790: step: 718/466, loss: 0.021133458241820335 2023-01-22 16:05:39.077497: step: 720/466, loss: 0.07392483204603195 2023-01-22 16:05:39.782521: step: 722/466, loss: 0.002734451787546277 2023-01-22 16:05:40.600358: step: 724/466, loss: 0.0009729207376949489 2023-01-22 16:05:41.303357: step: 726/466, loss: 0.03714947775006294 2023-01-22 16:05:41.997294: step: 728/466, loss: 0.02827623300254345 2023-01-22 16:05:42.754990: step: 730/466, loss: 0.0007589849410578609 2023-01-22 16:05:43.613279: step: 732/466, loss: 0.02701164409518242 2023-01-22 16:05:44.351077: step: 734/466, loss: 0.011115744709968567 2023-01-22 16:05:45.087515: step: 736/466, loss: 0.0006149124819785357 2023-01-22 16:05:45.743771: step: 738/466, loss: 0.0014990844065323472 2023-01-22 16:05:46.567904: step: 740/466, loss: 0.007459838874638081 2023-01-22 16:05:47.418481: step: 742/466, loss: 0.006524314172565937 2023-01-22 16:05:48.165211: step: 744/466, loss: 0.006916932761669159 2023-01-22 16:05:48.854593: step: 746/466, loss: 0.0017152292421087623 2023-01-22 16:05:49.614338: step: 748/466, loss: 0.10608824342489243 2023-01-22 16:05:50.349231: step: 750/466, loss: 0.011215800419449806 2023-01-22 16:05:51.106238: step: 752/466, loss: 0.00193583476357162 2023-01-22 16:05:51.910879: step: 754/466, loss: 0.0022075334563851357 2023-01-22 16:05:52.637438: step: 756/466, loss: 0.14459285140037537 2023-01-22 16:05:53.431098: step: 758/466, loss: 
0.01011571940034628 2023-01-22 16:05:54.232145: step: 760/466, loss: 0.01083743292838335 2023-01-22 16:05:55.092168: step: 762/466, loss: 0.0022419507149606943 2023-01-22 16:05:55.797551: step: 764/466, loss: 0.0012138750171288848 2023-01-22 16:05:56.441256: step: 766/466, loss: 0.0013211843324825168 2023-01-22 16:05:57.180429: step: 768/466, loss: 0.05762624740600586 2023-01-22 16:05:57.880360: step: 770/466, loss: 0.006621016189455986 2023-01-22 16:05:58.550794: step: 772/466, loss: 0.01991111785173416 2023-01-22 16:05:59.284330: step: 774/466, loss: 0.010002116672694683 2023-01-22 16:06:00.012827: step: 776/466, loss: 0.006916820537298918 2023-01-22 16:06:00.744667: step: 778/466, loss: 0.012478718534111977 2023-01-22 16:06:01.540056: step: 780/466, loss: 0.05239632725715637 2023-01-22 16:06:02.309359: step: 782/466, loss: 0.0016832815017551184 2023-01-22 16:06:03.118776: step: 784/466, loss: 0.03918365761637688 2023-01-22 16:06:03.894671: step: 786/466, loss: 0.006064407993108034 2023-01-22 16:06:04.679627: step: 788/466, loss: 0.007597712334245443 2023-01-22 16:06:05.399995: step: 790/466, loss: 0.0005627631326206028 2023-01-22 16:06:06.248296: step: 792/466, loss: 0.03723245859146118 2023-01-22 16:06:07.022161: step: 794/466, loss: 0.02122754231095314 2023-01-22 16:06:07.729733: step: 796/466, loss: 3.544481296557933e-05 2023-01-22 16:06:08.508965: step: 798/466, loss: 0.08708756417036057 2023-01-22 16:06:09.202835: step: 800/466, loss: 0.03159240633249283 2023-01-22 16:06:09.967303: step: 802/466, loss: 0.014514084905385971 2023-01-22 16:06:10.764159: step: 804/466, loss: 0.05164254456758499 2023-01-22 16:06:11.487062: step: 806/466, loss: 0.0032412000000476837 2023-01-22 16:06:12.355864: step: 808/466, loss: 0.033188801258802414 2023-01-22 16:06:13.137888: step: 810/466, loss: 0.039974067360162735 2023-01-22 16:06:13.861937: step: 812/466, loss: 0.0007288824999704957 2023-01-22 16:06:14.552470: step: 814/466, loss: 0.034891486167907715 2023-01-22 
16:06:15.346677: step: 816/466, loss: 0.007756277918815613 2023-01-22 16:06:16.181504: step: 818/466, loss: 0.007071830797940493 2023-01-22 16:06:16.898572: step: 820/466, loss: 0.0004805404460057616 2023-01-22 16:06:17.593244: step: 822/466, loss: 0.001777339493855834 2023-01-22 16:06:18.328959: step: 824/466, loss: 0.015256262384355068 2023-01-22 16:06:19.036436: step: 826/466, loss: 0.0020203653257340193 2023-01-22 16:06:19.727600: step: 828/466, loss: 0.0001835294533520937 2023-01-22 16:06:20.577871: step: 830/466, loss: 0.006643530912697315 2023-01-22 16:06:21.280413: step: 832/466, loss: 0.011377407237887383 2023-01-22 16:06:21.957385: step: 834/466, loss: 3.36966036229569e-06 2023-01-22 16:06:22.748221: step: 836/466, loss: 0.009512822143733501 2023-01-22 16:06:23.644037: step: 838/466, loss: 0.00871200580149889 2023-01-22 16:06:24.387204: step: 840/466, loss: 0.0054242052137851715 2023-01-22 16:06:25.115732: step: 842/466, loss: 0.01555589772760868 2023-01-22 16:06:25.925397: step: 844/466, loss: 0.00017592482618056238 2023-01-22 16:06:26.679044: step: 846/466, loss: 0.0005512124043889344 2023-01-22 16:06:27.395189: step: 848/466, loss: 0.0009457360720261931 2023-01-22 16:06:28.188706: step: 850/466, loss: 0.0033583471085876226 2023-01-22 16:06:28.929638: step: 852/466, loss: 0.007074189838021994 2023-01-22 16:06:29.654775: step: 854/466, loss: 0.024136371910572052 2023-01-22 16:06:30.415906: step: 856/466, loss: 0.06818605214357376 2023-01-22 16:06:31.149463: step: 858/466, loss: 0.0001580503158038482 2023-01-22 16:06:31.882596: step: 860/466, loss: 0.026508208364248276 2023-01-22 16:06:32.700203: step: 862/466, loss: 0.000835965562146157 2023-01-22 16:06:33.514899: step: 864/466, loss: 0.06645052880048752 2023-01-22 16:06:34.273513: step: 866/466, loss: 0.00231425859965384 2023-01-22 16:06:35.005280: step: 868/466, loss: 0.002165533835068345 2023-01-22 16:06:35.790184: step: 870/466, loss: 0.00231713755056262 2023-01-22 16:06:36.634158: step: 872/466, 
loss: 0.0013481914065778255 2023-01-22 16:06:37.398550: step: 874/466, loss: 0.010383480228483677 2023-01-22 16:06:38.165774: step: 876/466, loss: 0.016029154881834984 2023-01-22 16:06:38.936719: step: 878/466, loss: 0.05106610804796219 2023-01-22 16:06:39.814640: step: 880/466, loss: 0.02417607232928276 2023-01-22 16:06:40.547109: step: 882/466, loss: 0.007658890448510647 2023-01-22 16:06:41.286746: step: 884/466, loss: 0.021339895203709602 2023-01-22 16:06:42.027187: step: 886/466, loss: 0.013452693819999695 2023-01-22 16:06:42.794770: step: 888/466, loss: 0.014736589044332504 2023-01-22 16:06:43.462598: step: 890/466, loss: 0.23204010725021362 2023-01-22 16:06:44.215812: step: 892/466, loss: 0.005269082263112068 2023-01-22 16:06:44.940321: step: 894/466, loss: 0.010547298938035965 2023-01-22 16:06:45.756985: step: 896/466, loss: 0.01578962802886963 2023-01-22 16:06:46.599251: step: 898/466, loss: 0.012825076468288898 2023-01-22 16:06:47.345313: step: 900/466, loss: 5.141352448845282e-05 2023-01-22 16:06:48.145653: step: 902/466, loss: 0.014206547290086746 2023-01-22 16:06:49.005266: step: 904/466, loss: 0.18870453536510468 2023-01-22 16:06:49.759019: step: 906/466, loss: 0.0011588651686906815 2023-01-22 16:06:50.463061: step: 908/466, loss: 0.00014258567534852773 2023-01-22 16:06:51.215812: step: 910/466, loss: 0.025311194360256195 2023-01-22 16:06:51.999910: step: 912/466, loss: 0.011471842415630817 2023-01-22 16:06:52.741084: step: 914/466, loss: 0.01299799419939518 2023-01-22 16:06:53.592911: step: 916/466, loss: 0.03922403231263161 2023-01-22 16:06:54.312019: step: 918/466, loss: 0.0002841950918082148 2023-01-22 16:06:55.185336: step: 920/466, loss: 0.0006640457431785762 2023-01-22 16:06:55.968556: step: 922/466, loss: 0.0005649054073728621 2023-01-22 16:06:56.764586: step: 924/466, loss: 0.0057207574136555195 2023-01-22 16:06:57.416824: step: 926/466, loss: 0.00019332297961227596 2023-01-22 16:06:58.149885: step: 928/466, loss: 0.004037438426166773 
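Each training step above is logged in the fixed form `<timestamp>: step: <n>/466, loss: <value>`. A minimal sketch for pulling the step losses back out of such a log text (the regex is an assumption fitted to the lines shown here, not part of train.py):

```python
import re

# One step record looks like:
# "2023-01-22 16:04:05.755982: step: 474/466, loss: 0.04030391946434975"
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): step: (\d+)/(\d+), loss: ([0-9.e+-]+)"
)

def parse_losses(text):
    """Return a list of (step, loss) pairs found in a raw log string."""
    return [(int(m.group(2)), float(m.group(4))) for m in STEP_RE.finditer(text)]

sample = "2023-01-22 16:04:05.755982: step: 474/466, loss: 0.04030391946434975"
print(parse_losses(sample))  # [(474, 0.04030391946434975)]
```

The numeric group `[0-9.e+-]+` also covers the scientific-notation losses that appear in this log (e.g. `3.544481296557933e-05`).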
2023-01-22 16:06:58.921513: step: 930/466, loss: 1.7265130281448364 2023-01-22 16:06:59.597577: step: 932/466, loss: 0.00019567122217267752
==================================================
Loss: 0.052
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3126052948255114, 'r': 0.3286211258697027, 'f1': 0.3204131976564909}, 'combined': 0.23609393511530907, 'epoch': 32}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.34476676326975836, 'r': 0.308915800018137, 'f1': 0.3258581656498447}, 'combined': 0.20028355547258747, 'epoch': 32}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28950632035051554, 'r': 0.33345414886672287, 'f1': 0.30993004665390295}, 'combined': 0.22836950806077058, 'epoch': 32}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32465699394756864, 'r': 0.3131223867102634, 'f1': 0.31878538532302064}, 'combined': 0.19593638317414927, 'epoch': 32}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3253357753357754, 'r': 0.33706514864010123, 'f1': 0.33109661385523464}, 'combined': 0.24396592599859393, 'epoch': 32}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34419191077898653, 'r': 0.3092955038802505, 'f1': 0.32581196848727434}, 'combined': 0.2012368040656695, 'epoch': 32}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.27717391304347827, 'r': 0.36428571428571427, 'f1': 0.3148148148148148}, 'combined': 0.20987654320987653, 'epoch': 32}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.27717391304347827, 'r': 0.5543478260869565, 'f1': 0.3695652173913043}, 'combined': 0.18478260869565216, 'epoch': 32}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4117647058823529, 'r': 0.2413793103448276, 'f1': 0.3043478260869565}, 'combined': 0.20289855072463764, 'epoch': 32}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3112994890724527, 'r': 0.3408345449616797, 'f1': 0.3253981978166761}, 'combined': 0.23976709312807712, 'epoch': 22}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.354901590881547, 'r': 0.30635228234537004, 'f1': 0.3288446896922884}, 'combined': 0.2021191751279431, 'epoch': 22}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.28125, 'r': 0.38571428571428573, 'f1': 0.32530120481927716}, 'combined': 0.2168674698795181, 'epoch': 22}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 33
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:09:45.705238: step: 2/466, loss: 0.019329171627759933 2023-01-22 16:09:46.442546: step: 4/466, loss: 1.0869028568267822 2023-01-22 16:09:47.241422: step: 6/466, loss: 0.14202919602394104 2023-01-22 16:09:47.893958: step: 8/466, loss: 0.0003967805823776871 2023-01-22 16:09:48.675342: step: 10/466, loss: 0.010574987158179283 2023-01-22 16:09:49.453078: step: 12/466, loss: 0.002730625681579113 2023-01-22 16:09:50.161486: step: 14/466, loss: 0.03457174077630043 2023-01-22 16:09:50.915654: step: 16/466, loss: 0.004452232271432877 2023-01-22 16:09:51.688123: step: 18/466, loss: 0.00018031950457952917 2023-01-22 16:09:52.443764: step: 20/466, loss: 0.021097086369991302 2023-01-22 16:09:53.203713: step: 22/466, loss: 0.0024196114391088486 2023-01-22 16:09:53.880615: step: 24/466, loss: 0.0004382265906315297 2023-01-22 16:09:54.814111: step: 26/466, loss: 0.027091480791568756 2023-01-22 16:09:55.629449: step: 28/466, loss: 0.0008335936581715941 2023-01-22 16:09:56.396814: step: 30/466, loss: 0.02781643532216549 2023-01-22 16:09:57.140358: step: 32/466, loss: 0.0023864025715738535 2023-01-22 16:09:57.858490: step: 34/466, loss: 0.0026451097801327705 2023-01-22 16:09:58.628773: step: 36/466, loss: 0.0025420142337679863 2023-01-22 16:09:59.447102: step: 38/466, loss: 0.0019144106190651655 2023-01-22 16:10:00.148949: step: 40/466, loss: 0.03274979442358017 2023-01-22 16:10:00.901683: step: 42/466, loss:
0.005071322433650494 2023-01-22 16:10:01.597545: step: 44/466, loss: 0.3955758213996887 2023-01-22 16:10:02.383491: step: 46/466, loss: 0.0027988648507744074 2023-01-22 16:10:03.164220: step: 48/466, loss: 0.05165601149201393 2023-01-22 16:10:03.961996: step: 50/466, loss: 0.019785480573773384 2023-01-22 16:10:04.647378: step: 52/466, loss: 0.007184821646660566 2023-01-22 16:10:05.423202: step: 54/466, loss: 0.019311172887682915 2023-01-22 16:10:06.246926: step: 56/466, loss: 0.0055588241666555405 2023-01-22 16:10:06.964934: step: 58/466, loss: 0.0013960510259494185 2023-01-22 16:10:07.760689: step: 60/466, loss: 0.028932584449648857 2023-01-22 16:10:08.527690: step: 62/466, loss: 0.0005302618956193328 2023-01-22 16:10:09.304987: step: 64/466, loss: 0.0064671579748392105 2023-01-22 16:10:10.102916: step: 66/466, loss: 0.00612952746450901 2023-01-22 16:10:10.836751: step: 68/466, loss: 0.08574938774108887 2023-01-22 16:10:11.464170: step: 70/466, loss: 0.004624377936124802 2023-01-22 16:10:12.279063: step: 72/466, loss: 0.002651211339980364 2023-01-22 16:10:12.935987: step: 74/466, loss: 0.0033132026437669992 2023-01-22 16:10:13.718709: step: 76/466, loss: 0.013541768305003643 2023-01-22 16:10:14.514205: step: 78/466, loss: 0.005505187902599573 2023-01-22 16:10:15.221142: step: 80/466, loss: 0.28946197032928467 2023-01-22 16:10:15.997883: step: 82/466, loss: 0.0002097875694744289 2023-01-22 16:10:16.662534: step: 84/466, loss: 0.001316684065386653 2023-01-22 16:10:17.354759: step: 86/466, loss: 0.01842031069099903 2023-01-22 16:10:18.199574: step: 88/466, loss: 0.02234254591166973 2023-01-22 16:10:19.010201: step: 90/466, loss: 0.007133010309189558 2023-01-22 16:10:19.780997: step: 92/466, loss: 0.0002999457356054336 2023-01-22 16:10:20.551981: step: 94/466, loss: 0.04126357287168503 2023-01-22 16:10:21.315703: step: 96/466, loss: 0.0036950150970369577 2023-01-22 16:10:22.053026: step: 98/466, loss: 0.002125050174072385 2023-01-22 16:10:22.795242: step: 100/466, 
loss: 0.010867869481444359 2023-01-22 16:10:23.544771: step: 102/466, loss: 0.023277217522263527 2023-01-22 16:10:24.272219: step: 104/466, loss: 0.0017478482332080603 2023-01-22 16:10:25.032684: step: 106/466, loss: 0.019343003630638123 2023-01-22 16:10:25.852870: step: 108/466, loss: 0.0020293628331273794 2023-01-22 16:10:26.590387: step: 110/466, loss: 0.0034124022349715233 2023-01-22 16:10:27.332596: step: 112/466, loss: 0.004740450065582991 2023-01-22 16:10:28.045198: step: 114/466, loss: 0.012356654740869999 2023-01-22 16:10:28.770280: step: 116/466, loss: 0.007313753943890333 2023-01-22 16:10:29.560690: step: 118/466, loss: 0.04777335375547409 2023-01-22 16:10:30.322344: step: 120/466, loss: 0.05528281256556511 2023-01-22 16:10:31.031659: step: 122/466, loss: 0.006278525106608868 2023-01-22 16:10:31.746561: step: 124/466, loss: 0.0035879218485206366 2023-01-22 16:10:32.472902: step: 126/466, loss: 0.006362420506775379 2023-01-22 16:10:33.251203: step: 128/466, loss: 0.0008267164812423289 2023-01-22 16:10:34.032088: step: 130/466, loss: 0.006699729710817337 2023-01-22 16:10:34.893983: step: 132/466, loss: 0.008580324240028858 2023-01-22 16:10:35.575918: step: 134/466, loss: 0.0008092653006315231 2023-01-22 16:10:36.287275: step: 136/466, loss: 0.00058770488249138 2023-01-22 16:10:37.132509: step: 138/466, loss: 0.06414636969566345 2023-01-22 16:10:37.976225: step: 140/466, loss: 1.0099543333053589 2023-01-22 16:10:38.669049: step: 142/466, loss: 0.0010454514995217323 2023-01-22 16:10:39.396659: step: 144/466, loss: 0.0009470380609855056 2023-01-22 16:10:40.202482: step: 146/466, loss: 0.006444782484322786 2023-01-22 16:10:40.962677: step: 148/466, loss: 0.12940701842308044 2023-01-22 16:10:41.722303: step: 150/466, loss: 0.02667311765253544 2023-01-22 16:10:42.469730: step: 152/466, loss: 0.00028765155002474785 2023-01-22 16:10:43.200534: step: 154/466, loss: 0.0047229896299541 2023-01-22 16:10:44.061328: step: 156/466, loss: 0.002519721630960703 2023-01-22 
16:10:44.747287: step: 158/466, loss: 0.008568288758397102 2023-01-22 16:10:45.443661: step: 160/466, loss: 0.0011734687723219395 2023-01-22 16:10:46.255440: step: 162/466, loss: 0.017831817269325256 2023-01-22 16:10:46.968814: step: 164/466, loss: 0.004616545047610998 2023-01-22 16:10:47.727855: step: 166/466, loss: 1.8955062627792358 2023-01-22 16:10:48.504751: step: 168/466, loss: 0.10970849543809891 2023-01-22 16:10:49.191816: step: 170/466, loss: 0.004755795001983643 2023-01-22 16:10:50.008862: step: 172/466, loss: 0.0019417761359363794 2023-01-22 16:10:50.692878: step: 174/466, loss: 0.0001321727322647348 2023-01-22 16:10:51.459796: step: 176/466, loss: 0.263048380613327 2023-01-22 16:10:52.182195: step: 178/466, loss: 0.0063339099287986755 2023-01-22 16:10:52.937976: step: 180/466, loss: 0.041270166635513306 2023-01-22 16:10:53.802819: step: 182/466, loss: 1.3670841455459595 2023-01-22 16:10:54.651657: step: 184/466, loss: 0.0091679897159338 2023-01-22 16:10:55.388743: step: 186/466, loss: 0.0008498663082718849 2023-01-22 16:10:56.167705: step: 188/466, loss: 0.0013691213680431247 2023-01-22 16:10:56.937443: step: 190/466, loss: 0.029640894383192062 2023-01-22 16:10:57.687168: step: 192/466, loss: 0.0028292567003518343 2023-01-22 16:10:58.482627: step: 194/466, loss: 0.016314124688506126 2023-01-22 16:10:59.228906: step: 196/466, loss: 0.000857730396091938 2023-01-22 16:10:59.986394: step: 198/466, loss: 0.010542549192905426 2023-01-22 16:11:00.700631: step: 200/466, loss: 0.019882574677467346 2023-01-22 16:11:01.429352: step: 202/466, loss: 0.003643724601715803 2023-01-22 16:11:02.213324: step: 204/466, loss: 0.0018131457036361098 2023-01-22 16:11:03.020961: step: 206/466, loss: 0.025027979165315628 2023-01-22 16:11:03.791550: step: 208/466, loss: 0.0016643248964101076 2023-01-22 16:11:04.647035: step: 210/466, loss: 0.006344994530081749 2023-01-22 16:11:05.414371: step: 212/466, loss: 0.0005301121855154634 2023-01-22 16:11:06.220454: step: 214/466, loss: 
0.04560599476099014 2023-01-22 16:11:07.034938: step: 216/466, loss: 0.0034986361861228943 2023-01-22 16:11:07.784006: step: 218/466, loss: 0.3330421447753906 2023-01-22 16:11:08.516047: step: 220/466, loss: 0.012391243129968643 2023-01-22 16:11:09.261579: step: 222/466, loss: 0.0019246727460995317 2023-01-22 16:11:10.052393: step: 224/466, loss: 0.02810569852590561 2023-01-22 16:11:10.764095: step: 226/466, loss: 4.9588707042858005e-05 2023-01-22 16:11:11.604779: step: 228/466, loss: 0.014155753888189793 2023-01-22 16:11:12.315593: step: 230/466, loss: 0.014453071169555187 2023-01-22 16:11:13.041518: step: 232/466, loss: 0.0010232643689960241 2023-01-22 16:11:13.849818: step: 234/466, loss: 0.06366181373596191 2023-01-22 16:11:14.657362: step: 236/466, loss: 0.0008305907249450684 2023-01-22 16:11:15.373504: step: 238/466, loss: 0.012591608799993992 2023-01-22 16:11:16.080148: step: 240/466, loss: 0.0005710592959076166 2023-01-22 16:11:16.791566: step: 242/466, loss: 0.0009176023304462433 2023-01-22 16:11:17.539358: step: 244/466, loss: 0.0002121399447787553 2023-01-22 16:11:18.287868: step: 246/466, loss: 0.0004354036063887179 2023-01-22 16:11:19.054722: step: 248/466, loss: 0.0007151217432692647 2023-01-22 16:11:19.894401: step: 250/466, loss: 0.0004262593574821949 2023-01-22 16:11:20.719315: step: 252/466, loss: 0.021437974646687508 2023-01-22 16:11:21.672865: step: 254/466, loss: 0.07851515710353851 2023-01-22 16:11:22.400868: step: 256/466, loss: 0.04753747954964638 2023-01-22 16:11:23.119636: step: 258/466, loss: 0.010555329732596874 2023-01-22 16:11:23.792378: step: 260/466, loss: 0.0038088206201791763 2023-01-22 16:11:24.728879: step: 262/466, loss: 0.03028637170791626 2023-01-22 16:11:25.519423: step: 264/466, loss: 0.007273413706570864 2023-01-22 16:11:26.342417: step: 266/466, loss: 0.017216956242918968 2023-01-22 16:11:27.093741: step: 268/466, loss: 0.048719123005867004 2023-01-22 16:11:27.862581: step: 270/466, loss: 0.0027608266100287437 2023-01-22 
16:11:28.652912: step: 272/466, loss: 0.04781614616513252 2023-01-22 16:11:29.407885: step: 274/466, loss: 0.013051237910985947 2023-01-22 16:11:30.176761: step: 276/466, loss: 0.2669077515602112 2023-01-22 16:11:30.975864: step: 278/466, loss: 0.0018297962378710508 2023-01-22 16:11:31.764531: step: 280/466, loss: 0.011756602674722672 2023-01-22 16:11:32.537824: step: 282/466, loss: 0.014492223039269447 2023-01-22 16:11:33.342402: step: 284/466, loss: 0.009878966957330704 2023-01-22 16:11:34.100314: step: 286/466, loss: 0.06663220375776291 2023-01-22 16:11:34.854338: step: 288/466, loss: 0.007071053143590689 2023-01-22 16:11:35.577151: step: 290/466, loss: 0.031226148828864098 2023-01-22 16:11:36.317656: step: 292/466, loss: 0.023413024842739105 2023-01-22 16:11:37.068580: step: 294/466, loss: 0.09983746707439423 2023-01-22 16:11:37.746123: step: 296/466, loss: 0.08914503455162048 2023-01-22 16:11:38.481751: step: 298/466, loss: 0.001067915465682745 2023-01-22 16:11:39.254367: step: 300/466, loss: 0.025817418470978737 2023-01-22 16:11:39.982465: step: 302/466, loss: 0.014062085188925266 2023-01-22 16:11:40.771717: step: 304/466, loss: 0.0725163072347641 2023-01-22 16:11:41.560140: step: 306/466, loss: 0.029730848968029022 2023-01-22 16:11:42.297659: step: 308/466, loss: 0.025710735470056534 2023-01-22 16:11:43.022739: step: 310/466, loss: 0.007411581929773092 2023-01-22 16:11:43.749974: step: 312/466, loss: 0.0043141599744558334 2023-01-22 16:11:44.541558: step: 314/466, loss: 2.60351824760437 2023-01-22 16:11:45.285152: step: 316/466, loss: 0.010887504555284977 2023-01-22 16:11:45.970311: step: 318/466, loss: 0.07423939555883408 2023-01-22 16:11:46.728755: step: 320/466, loss: 0.0009458021959289908 2023-01-22 16:11:47.537806: step: 322/466, loss: 0.017295386642217636 2023-01-22 16:11:48.260651: step: 324/466, loss: 0.0011327442480251193 2023-01-22 16:11:49.100948: step: 326/466, loss: 0.01153257954865694 2023-01-22 16:11:49.755345: step: 328/466, loss: 
0.0021202610805630684 2023-01-22 16:11:50.498308: step: 330/466, loss: 0.0021633352153003216 2023-01-22 16:11:51.322275: step: 332/466, loss: 0.0060116020031273365 2023-01-22 16:11:52.096297: step: 334/466, loss: 0.0008030192693695426 2023-01-22 16:11:52.836867: step: 336/466, loss: 0.001924928743392229 2023-01-22 16:11:53.550543: step: 338/466, loss: 0.0022883862257003784 2023-01-22 16:11:54.338248: step: 340/466, loss: 0.011111796833574772 2023-01-22 16:11:55.138004: step: 342/466, loss: 0.01955723576247692 2023-01-22 16:11:55.920418: step: 344/466, loss: 0.013217715546488762 2023-01-22 16:11:56.692951: step: 346/466, loss: 0.0002433011686662212 2023-01-22 16:11:57.403231: step: 348/466, loss: 0.011723429895937443 2023-01-22 16:11:58.206268: step: 350/466, loss: 0.194093719124794 2023-01-22 16:11:58.942617: step: 352/466, loss: 0.008440490812063217 2023-01-22 16:11:59.685081: step: 354/466, loss: 0.0025795488618314266 2023-01-22 16:12:00.389404: step: 356/466, loss: 0.0007878990145400167 2023-01-22 16:12:01.149783: step: 358/466, loss: 0.004591137170791626 2023-01-22 16:12:01.941278: step: 360/466, loss: 0.012769699096679688 2023-01-22 16:12:02.638350: step: 362/466, loss: 0.005436086095869541 2023-01-22 16:12:03.485004: step: 364/466, loss: 0.0008671208051964641 2023-01-22 16:12:04.200326: step: 366/466, loss: 0.005928007420152426 2023-01-22 16:12:04.923851: step: 368/466, loss: 0.002474587643519044 2023-01-22 16:12:05.644638: step: 370/466, loss: 0.050172630697488785 2023-01-22 16:12:06.358800: step: 372/466, loss: 0.008266598917543888 2023-01-22 16:12:07.168048: step: 374/466, loss: 0.5412901639938354 2023-01-22 16:12:07.978543: step: 376/466, loss: 0.03616834059357643 2023-01-22 16:12:08.711530: step: 378/466, loss: 0.009188574738800526 2023-01-22 16:12:09.419868: step: 380/466, loss: 0.009282363578677177 2023-01-22 16:12:10.157094: step: 382/466, loss: 0.0029915831983089447 2023-01-22 16:12:10.871076: step: 384/466, loss: 0.002243778435513377 2023-01-22 
16:12:11.590645: step: 386/466, loss: 0.00568827148526907 2023-01-22 16:12:12.428549: step: 388/466, loss: 0.013106818310916424 2023-01-22 16:12:13.187752: step: 390/466, loss: 0.0006954700802452862 2023-01-22 16:12:13.902277: step: 392/466, loss: 0.03494095802307129 2023-01-22 16:12:14.649484: step: 394/466, loss: 0.001071300357580185 2023-01-22 16:12:15.342260: step: 396/466, loss: 0.003233947092667222 2023-01-22 16:12:16.183188: step: 398/466, loss: 0.0007663085707463324 2023-01-22 16:12:16.937921: step: 400/466, loss: 0.002137871226295829 2023-01-22 16:12:17.642409: step: 402/466, loss: 0.005546262953430414 2023-01-22 16:12:18.373137: step: 404/466, loss: 0.02801138162612915 2023-01-22 16:12:19.147091: step: 406/466, loss: 0.007742196787148714 2023-01-22 16:12:19.809832: step: 408/466, loss: 0.007167985662817955 2023-01-22 16:12:20.591038: step: 410/466, loss: 0.03604840487241745 2023-01-22 16:12:21.425285: step: 412/466, loss: 0.05173136666417122 2023-01-22 16:12:22.165024: step: 414/466, loss: 0.0077619957737624645 2023-01-22 16:12:22.897060: step: 416/466, loss: 0.10107298940420151 2023-01-22 16:12:23.703600: step: 418/466, loss: 0.02580912411212921 2023-01-22 16:12:24.553884: step: 420/466, loss: 0.002658374607563019 2023-01-22 16:12:25.521363: step: 422/466, loss: 0.0006400442798621953 2023-01-22 16:12:26.312448: step: 424/466, loss: 0.0002528093755245209 2023-01-22 16:12:27.071114: step: 426/466, loss: 0.002691936446353793 2023-01-22 16:12:27.839974: step: 428/466, loss: 0.03798580914735794 2023-01-22 16:12:28.624680: step: 430/466, loss: 0.023139648139476776 2023-01-22 16:12:29.387743: step: 432/466, loss: 0.0007980384398251772 2023-01-22 16:12:30.169363: step: 434/466, loss: 0.03384041413664818 2023-01-22 16:12:30.917151: step: 436/466, loss: 0.00030431634513661265 2023-01-22 16:12:31.725651: step: 438/466, loss: 0.031009627506136894 2023-01-22 16:12:32.494433: step: 440/466, loss: 0.0068207900039851665 2023-01-22 16:12:33.321896: step: 442/466, loss: 
0.07781907171010971 2023-01-22 16:12:34.050319: step: 444/466, loss: 0.04145520552992821 2023-01-22 16:12:34.788654: step: 446/466, loss: 0.028595779091119766 2023-01-22 16:12:35.670900: step: 448/466, loss: 0.01590229943394661 2023-01-22 16:12:36.613851: step: 450/466, loss: 0.0007838807068765163 2023-01-22 16:12:37.355656: step: 452/466, loss: 0.002579035935923457 2023-01-22 16:12:38.100744: step: 454/466, loss: 0.1090591549873352 2023-01-22 16:12:38.872009: step: 456/466, loss: 0.04496876895427704 2023-01-22 16:12:39.651330: step: 458/466, loss: 0.008547370322048664 2023-01-22 16:12:40.474143: step: 460/466, loss: 0.058345843106508255 2023-01-22 16:12:41.189580: step: 462/466, loss: 9.482367750024423e-05 2023-01-22 16:12:41.974388: step: 464/466, loss: 0.029485873878002167 2023-01-22 16:12:42.710841: step: 466/466, loss: 0.006282643880695105 2023-01-22 16:12:43.496032: step: 468/466, loss: 0.060245078057050705 2023-01-22 16:12:44.271372: step: 470/466, loss: 0.004400915931910276 2023-01-22 16:12:44.956922: step: 472/466, loss: 0.05323219299316406 2023-01-22 16:12:45.686081: step: 474/466, loss: 0.004026977811008692 2023-01-22 16:12:46.411697: step: 476/466, loss: 0.008997835218906403 2023-01-22 16:12:47.140370: step: 478/466, loss: 0.0028516759630292654 2023-01-22 16:12:47.970371: step: 480/466, loss: 0.059091534465551376 2023-01-22 16:12:48.751542: step: 482/466, loss: 0.3593703508377075 2023-01-22 16:12:49.471211: step: 484/466, loss: 0.00869487039744854 2023-01-22 16:12:50.205270: step: 486/466, loss: 0.05974646285176277 2023-01-22 16:12:50.969452: step: 488/466, loss: 0.00953682791441679 2023-01-22 16:12:51.760619: step: 490/466, loss: 0.05419261381030083 2023-01-22 16:12:52.469594: step: 492/466, loss: 0.023732004687190056 2023-01-22 16:12:53.153279: step: 494/466, loss: 0.003286184510216117 2023-01-22 16:12:54.060571: step: 496/466, loss: 0.14133943617343903 2023-01-22 16:12:54.809835: step: 498/466, loss: 0.026308268308639526 2023-01-22 16:12:55.703544: 
step: 500/466, loss: 0.0018950769444927573 2023-01-22 16:12:56.463240: step: 502/466, loss: 0.0204459298402071 2023-01-22 16:12:57.235165: step: 504/466, loss: 0.2291094958782196 2023-01-22 16:12:58.076668: step: 506/466, loss: 0.025624655187129974 2023-01-22 16:12:58.825609: step: 508/466, loss: 7.082007505232468e-05 2023-01-22 16:12:59.589679: step: 510/466, loss: 0.003036458045244217 2023-01-22 16:13:00.394807: step: 512/466, loss: 0.015858786180615425 2023-01-22 16:13:01.204409: step: 514/466, loss: 0.009644883684813976 2023-01-22 16:13:01.945170: step: 516/466, loss: 0.071327805519104 2023-01-22 16:13:02.812027: step: 518/466, loss: 0.009094729088246822 2023-01-22 16:13:03.553926: step: 520/466, loss: 0.07092177867889404 2023-01-22 16:13:04.390890: step: 522/466, loss: 0.018077434971928596 2023-01-22 16:13:05.157824: step: 524/466, loss: 0.00019851350225508213 2023-01-22 16:13:05.932933: step: 526/466, loss: 0.020662939175963402 2023-01-22 16:13:06.676009: step: 528/466, loss: 0.0026878654025495052 2023-01-22 16:13:07.460218: step: 530/466, loss: 0.0020404020324349403 2023-01-22 16:13:08.295412: step: 532/466, loss: 0.0069939009845256805 2023-01-22 16:13:09.084078: step: 534/466, loss: 0.0030733118765056133 2023-01-22 16:13:09.952449: step: 536/466, loss: 0.014625020325183868 2023-01-22 16:13:10.694752: step: 538/466, loss: 0.0217538233846426 2023-01-22 16:13:11.378591: step: 540/466, loss: 0.03101446107029915 2023-01-22 16:13:12.121796: step: 542/466, loss: 0.00269150803796947 2023-01-22 16:13:12.906342: step: 544/466, loss: 0.005434195511043072 2023-01-22 16:13:13.639714: step: 546/466, loss: 0.42131295800209045 2023-01-22 16:13:14.360630: step: 548/466, loss: 0.005677481181919575 2023-01-22 16:13:15.068067: step: 550/466, loss: 0.017299525439739227 2023-01-22 16:13:15.830024: step: 552/466, loss: 0.0023705079220235348 2023-01-22 16:13:16.570188: step: 554/466, loss: 0.0041278693825006485 2023-01-22 16:13:17.346135: step: 556/466, loss: 0.005961044691503048 
2023-01-22 16:13:18.015024: step: 558/466, loss: 0.01285065058618784 2023-01-22 16:13:18.724202: step: 560/466, loss: 0.003253075759857893 2023-01-22 16:13:19.491284: step: 562/466, loss: 0.03593744710087776 2023-01-22 16:13:20.224042: step: 564/466, loss: 0.000874812831170857 2023-01-22 16:13:20.963613: step: 566/466, loss: 0.00014044287672732025 2023-01-22 16:13:21.685948: step: 568/466, loss: 0.0013199172681197524 2023-01-22 16:13:22.450141: step: 570/466, loss: 0.006412571761757135 2023-01-22 16:13:23.256152: step: 572/466, loss: 0.03231941536068916 2023-01-22 16:13:24.019941: step: 574/466, loss: 0.003953961189836264 2023-01-22 16:13:24.752150: step: 576/466, loss: 0.00022906172671355307 2023-01-22 16:13:25.521117: step: 578/466, loss: 0.010939808562397957 2023-01-22 16:13:26.242180: step: 580/466, loss: 0.009735723957419395 2023-01-22 16:13:26.959784: step: 582/466, loss: 0.003221513470634818 2023-01-22 16:13:27.683141: step: 584/466, loss: 0.019574739038944244 2023-01-22 16:13:28.506933: step: 586/466, loss: 0.01354447565972805 2023-01-22 16:13:29.303508: step: 588/466, loss: 0.07380802184343338 2023-01-22 16:13:30.044858: step: 590/466, loss: 0.00236341031268239 2023-01-22 16:13:30.760687: step: 592/466, loss: 0.004276533145457506 2023-01-22 16:13:31.483033: step: 594/466, loss: 0.0004312160308472812 2023-01-22 16:13:32.172006: step: 596/466, loss: 0.0008968772599473596 2023-01-22 16:13:32.870479: step: 598/466, loss: 0.03833211213350296 2023-01-22 16:13:33.664506: step: 600/466, loss: 0.012998173013329506 2023-01-22 16:13:34.478139: step: 602/466, loss: 0.03347988799214363 2023-01-22 16:13:35.306392: step: 604/466, loss: 0.014307713136076927 2023-01-22 16:13:36.019963: step: 606/466, loss: 0.0022048174869269133 2023-01-22 16:13:36.656961: step: 608/466, loss: 0.00022488315880764276 2023-01-22 16:13:37.383155: step: 610/466, loss: 0.0007502142107114196 2023-01-22 16:13:38.072319: step: 612/466, loss: 0.4031977951526642 2023-01-22 16:13:38.754488: step: 
614/466, loss: 0.006559982430189848 2023-01-22 16:13:39.439909: step: 616/466, loss: 0.0009073030669242144 2023-01-22 16:13:40.180749: step: 618/466, loss: 0.03044399805366993 2023-01-22 16:13:40.957978: step: 620/466, loss: 0.04347848892211914 2023-01-22 16:13:41.803752: step: 622/466, loss: 0.015411065891385078 2023-01-22 16:13:42.527664: step: 624/466, loss: 0.00570999551564455 2023-01-22 16:13:43.236601: step: 626/466, loss: 0.00024472447694279253 2023-01-22 16:13:43.980525: step: 628/466, loss: 0.03959134966135025 2023-01-22 16:13:44.793509: step: 630/466, loss: 0.04707716777920723 2023-01-22 16:13:45.576568: step: 632/466, loss: 0.028172489255666733 2023-01-22 16:13:46.277467: step: 634/466, loss: 0.004007617477327585 2023-01-22 16:13:47.017175: step: 636/466, loss: 0.0007090616854839027 2023-01-22 16:13:47.785460: step: 638/466, loss: 0.05749453231692314 2023-01-22 16:13:48.510288: step: 640/466, loss: 0.005273169372230768 2023-01-22 16:13:49.298992: step: 642/466, loss: 0.005346581339836121 2023-01-22 16:13:50.068493: step: 644/466, loss: 0.024699220433831215 2023-01-22 16:13:50.800069: step: 646/466, loss: 0.00016093281737994403 2023-01-22 16:13:51.466312: step: 648/466, loss: 0.06450103223323822 2023-01-22 16:13:52.179672: step: 650/466, loss: 0.030389755964279175 2023-01-22 16:13:53.030183: step: 652/466, loss: 0.023881230503320694 2023-01-22 16:13:53.738665: step: 654/466, loss: 0.004552459344267845 2023-01-22 16:13:54.416088: step: 656/466, loss: 0.0003522764891386032 2023-01-22 16:13:55.195766: step: 658/466, loss: 0.06739848852157593 2023-01-22 16:13:55.905494: step: 660/466, loss: 0.0010958056664094329 2023-01-22 16:13:56.703033: step: 662/466, loss: 0.02298307977616787 2023-01-22 16:13:57.481469: step: 664/466, loss: 0.0022193677723407745 2023-01-22 16:13:58.231924: step: 666/466, loss: 0.024256214499473572 2023-01-22 16:13:58.962360: step: 668/466, loss: 0.01720893569290638 2023-01-22 16:13:59.669064: step: 670/466, loss: 0.0009447914198972285 
2023-01-22 16:14:00.378262: step: 672/466, loss: 0.017255809158086777 2023-01-22 16:14:01.080571: step: 674/466, loss: 0.011553768999874592 2023-01-22 16:14:01.815344: step: 676/466, loss: 0.000498365901876241 2023-01-22 16:14:02.674561: step: 678/466, loss: 0.005982627626508474 2023-01-22 16:14:03.535039: step: 680/466, loss: 0.018326004967093468 2023-01-22 16:14:04.254823: step: 682/466, loss: 0.002403522375971079 2023-01-22 16:14:05.095518: step: 684/466, loss: 0.015257167629897594 2023-01-22 16:14:05.866984: step: 686/466, loss: 0.0046143620274960995 2023-01-22 16:14:06.684427: step: 688/466, loss: 0.021672574803233147 2023-01-22 16:14:07.381175: step: 690/466, loss: 0.03176340088248253 2023-01-22 16:14:08.216203: step: 692/466, loss: 0.010413500480353832 2023-01-22 16:14:09.002002: step: 694/466, loss: 0.004266362637281418 2023-01-22 16:14:09.692083: step: 696/466, loss: 0.003301274497061968 2023-01-22 16:14:10.441608: step: 698/466, loss: 0.015240832231938839 2023-01-22 16:14:11.202488: step: 700/466, loss: 0.015804987400770187 2023-01-22 16:14:11.984330: step: 702/466, loss: 0.012961991131305695 2023-01-22 16:14:12.718566: step: 704/466, loss: 0.0013605301501229405 2023-01-22 16:14:13.493543: step: 706/466, loss: 0.13008445501327515 2023-01-22 16:14:14.220715: step: 708/466, loss: 0.005032180342823267 2023-01-22 16:14:14.897332: step: 710/466, loss: 0.0004104235558770597 2023-01-22 16:14:15.639955: step: 712/466, loss: 0.009015659801661968 2023-01-22 16:14:16.363433: step: 714/466, loss: 0.009201515465974808 2023-01-22 16:14:17.156154: step: 716/466, loss: 0.0029313755221664906 2023-01-22 16:14:17.974924: step: 718/466, loss: 0.0174380112439394 2023-01-22 16:14:18.699102: step: 720/466, loss: 0.005240896716713905 2023-01-22 16:14:19.501749: step: 722/466, loss: 0.011967699974775314 2023-01-22 16:14:20.201204: step: 724/466, loss: 0.0059184362180531025 2023-01-22 16:14:20.969394: step: 726/466, loss: 0.024834871292114258 2023-01-22 16:14:21.705124: step: 
728/466, loss: 0.031435705721378326 2023-01-22 16:14:22.427414: step: 730/466, loss: 0.019492171704769135 2023-01-22 16:14:23.229500: step: 732/466, loss: 0.006483915727585554 2023-01-22 16:14:23.980153: step: 734/466, loss: 0.04379876330494881 2023-01-22 16:14:24.946159: step: 736/466, loss: 0.06285839527845383 2023-01-22 16:14:25.756304: step: 738/466, loss: 0.0008022096590138972 2023-01-22 16:14:26.489523: step: 740/466, loss: 0.0004810819518752396 2023-01-22 16:14:27.174042: step: 742/466, loss: 0.0028347100596874952 2023-01-22 16:14:27.851348: step: 744/466, loss: 0.01241873949766159 2023-01-22 16:14:28.614545: step: 746/466, loss: 0.02304881066083908 2023-01-22 16:14:29.384880: step: 748/466, loss: 0.013269875198602676 2023-01-22 16:14:30.131585: step: 750/466, loss: 0.06503497809171677 2023-01-22 16:14:30.898021: step: 752/466, loss: 0.007842977531254292 2023-01-22 16:14:31.686977: step: 754/466, loss: 0.004290265962481499 2023-01-22 16:14:32.471504: step: 756/466, loss: 0.003009806852787733 2023-01-22 16:14:33.299150: step: 758/466, loss: 0.030276000499725342 2023-01-22 16:14:34.078834: step: 760/466, loss: 0.012533142231404781 2023-01-22 16:14:34.900658: step: 762/466, loss: 0.0018880884163081646 2023-01-22 16:14:35.688370: step: 764/466, loss: 0.7592372894287109 2023-01-22 16:14:36.547873: step: 766/466, loss: 0.01072653941810131 2023-01-22 16:14:37.315128: step: 768/466, loss: 0.0017617539269849658 2023-01-22 16:14:38.003655: step: 770/466, loss: 0.003082792041823268 2023-01-22 16:14:38.757750: step: 772/466, loss: 0.09356488287448883 2023-01-22 16:14:39.467074: step: 774/466, loss: 0.01210882980376482 2023-01-22 16:14:40.268239: step: 776/466, loss: 0.10668861865997314 2023-01-22 16:14:41.004364: step: 778/466, loss: 0.009636450558900833 2023-01-22 16:14:41.762101: step: 780/466, loss: 0.020347947254776955 2023-01-22 16:14:42.399834: step: 782/466, loss: 0.02077638916671276 2023-01-22 16:14:43.114190: step: 784/466, loss: 0.00040024143527261913 
2023-01-22 16:14:43.902544: step: 786/466, loss: 0.006125684827566147 2023-01-22 16:14:44.675204: step: 788/466, loss: 0.017130881547927856 2023-01-22 16:14:45.414764: step: 790/466, loss: 0.005997342057526112 2023-01-22 16:14:46.143711: step: 792/466, loss: 0.030117981135845184 2023-01-22 16:14:46.829949: step: 794/466, loss: 0.0025042355991899967 2023-01-22 16:14:47.585040: step: 796/466, loss: 0.021588584408164024 2023-01-22 16:14:48.317518: step: 798/466, loss: 0.004534664563834667 2023-01-22 16:14:49.051427: step: 800/466, loss: 0.013607624918222427 2023-01-22 16:14:49.813171: step: 802/466, loss: 0.006609110161662102 2023-01-22 16:14:50.551980: step: 804/466, loss: 0.00018314311455469579 2023-01-22 16:14:51.402982: step: 806/466, loss: 0.00016394034901168197 2023-01-22 16:14:52.215055: step: 808/466, loss: 0.016977539286017418 2023-01-22 16:14:52.826553: step: 810/466, loss: 0.0015409706393256783 2023-01-22 16:14:53.560815: step: 812/466, loss: 0.0012982721673324704 2023-01-22 16:14:54.339418: step: 814/466, loss: 0.04246694967150688 2023-01-22 16:14:55.082823: step: 816/466, loss: 0.0236224215477705 2023-01-22 16:14:55.896524: step: 818/466, loss: 0.0008643745095469058 2023-01-22 16:14:56.614827: step: 820/466, loss: 0.0011652348330244422 2023-01-22 16:14:57.380098: step: 822/466, loss: 0.004280023276805878 2023-01-22 16:14:58.173868: step: 824/466, loss: 0.8916021585464478 2023-01-22 16:14:58.929819: step: 826/466, loss: 0.017976932227611542 2023-01-22 16:14:59.739110: step: 828/466, loss: 0.01639091596007347 2023-01-22 16:15:00.602548: step: 830/466, loss: 0.014316645450890064 2023-01-22 16:15:01.374037: step: 832/466, loss: 0.007979301735758781 2023-01-22 16:15:02.155015: step: 834/466, loss: 0.027367806062102318 2023-01-22 16:15:02.929496: step: 836/466, loss: 0.0001122185931308195 2023-01-22 16:15:03.732892: step: 838/466, loss: 2.0951132682967e-05 2023-01-22 16:15:04.512987: step: 840/466, loss: 0.0003048728685826063 2023-01-22 16:15:05.222934: step: 
842/466, loss: 0.003254613606259227 2023-01-22 16:15:05.952249: step: 844/466, loss: 0.01293003000319004 2023-01-22 16:15:06.804114: step: 846/466, loss: 0.013491634279489517 2023-01-22 16:15:07.679646: step: 848/466, loss: 0.01315171830356121 2023-01-22 16:15:08.484779: step: 850/466, loss: 0.0018613581778481603 2023-01-22 16:15:09.212405: step: 852/466, loss: 0.0012655846076086164 2023-01-22 16:15:09.957635: step: 854/466, loss: 0.01288510486483574 2023-01-22 16:15:10.619106: step: 856/466, loss: 0.003832991700619459 2023-01-22 16:15:11.355636: step: 858/466, loss: 0.0038616168312728405 2023-01-22 16:15:12.107933: step: 860/466, loss: 0.00018942559836432338 2023-01-22 16:15:12.899190: step: 862/466, loss: 0.008290973491966724 2023-01-22 16:15:13.625803: step: 864/466, loss: 0.01464066468179226 2023-01-22 16:15:14.389562: step: 866/466, loss: 0.007042956072837114 2023-01-22 16:15:15.188524: step: 868/466, loss: 0.02514898031949997 2023-01-22 16:15:15.857771: step: 870/466, loss: 0.0013938520569354296 2023-01-22 16:15:16.587684: step: 872/466, loss: 0.00681871734559536 2023-01-22 16:15:17.317173: step: 874/466, loss: 0.0021813763305544853 2023-01-22 16:15:18.059042: step: 876/466, loss: 0.009251315146684647 2023-01-22 16:15:18.838123: step: 878/466, loss: 0.011302115395665169 2023-01-22 16:15:19.574717: step: 880/466, loss: 0.0006099729798734188 2023-01-22 16:15:20.361772: step: 882/466, loss: 0.0019406548235565424 2023-01-22 16:15:21.116506: step: 884/466, loss: 0.04168505594134331 2023-01-22 16:15:21.815005: step: 886/466, loss: 0.0035822519566863775 2023-01-22 16:15:22.598022: step: 888/466, loss: 0.01661309041082859 2023-01-22 16:15:23.435887: step: 890/466, loss: 0.002855786122381687 2023-01-22 16:15:24.214259: step: 892/466, loss: 0.7210519313812256 2023-01-22 16:15:25.067479: step: 894/466, loss: 0.09248753637075424 2023-01-22 16:15:25.768944: step: 896/466, loss: 0.003342061536386609 2023-01-22 16:15:26.530982: step: 898/466, loss: 0.004297652281820774 
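The records above follow a fixed shape: an ISO timestamp, a `step: N/466` counter, and a scalar loss (note the counter advances by 2 per logged batch, so it overruns the `/466` total within an epoch, reaching 932). A minimal, hedged parsing sketch for this record format (the regex and function name are illustrative, not part of the original code):

```python
import re

# One record, taken verbatim from this log:
#   2023-01-22 16:15:30.240391: step: 908/466, loss: 5.768853225163184e-05
# Records are concatenated on long wrapped lines, so we scan the raw text
# with finditer instead of splitting on newlines.
STEP_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step: (?P<step>\d+)/(?P<total>\d+), "
    r"loss: (?P<loss>[\d.]+(?:e-?\d+)?)"
)

def parse_steps(text):
    """Yield (timestamp, step, total, loss) tuples from raw log text."""
    for m in STEP_RE.finditer(text):
        yield m["ts"], int(m["step"]), int(m["total"]), float(m["loss"])
```

This makes it easy to, for example, average the per-step losses and compare against the `Loss:` figure printed at the end of each epoch.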
2023-01-22 16:15:27.267957: step: 900/466, loss: 0.030501289293169975 2023-01-22 16:15:28.013622: step: 902/466, loss: 0.00955427996814251 2023-01-22 16:15:28.792010: step: 904/466, loss: 0.39566174149513245 2023-01-22 16:15:29.559912: step: 906/466, loss: 0.002507054479792714 2023-01-22 16:15:30.240391: step: 908/466, loss: 5.768853225163184e-05 2023-01-22 16:15:31.109353: step: 910/466, loss: 0.01109325885772705 2023-01-22 16:15:31.891642: step: 912/466, loss: 0.004711176734417677 2023-01-22 16:15:32.630821: step: 914/466, loss: 0.1867443025112152 2023-01-22 16:15:33.396521: step: 916/466, loss: 0.005634450353682041 2023-01-22 16:15:34.091017: step: 918/466, loss: 0.003459567204117775 2023-01-22 16:15:34.790254: step: 920/466, loss: 0.0048894197680056095 2023-01-22 16:15:35.508753: step: 922/466, loss: 0.010429268702864647 2023-01-22 16:15:36.249623: step: 924/466, loss: 0.009192475117743015 2023-01-22 16:15:37.065676: step: 926/466, loss: 0.018873605877161026 2023-01-22 16:15:37.812910: step: 928/466, loss: 0.004493965767323971 2023-01-22 16:15:38.541746: step: 930/466, loss: 0.00835600309073925 2023-01-22 16:15:39.318476: step: 932/466, loss: 0.033090561628341675
==================================================
Loss: 0.047
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2914950051990493, 'r': 0.33187287119436354, 'f1': 0.3103762255890498}, 'combined': 0.22869827148666827, 'epoch': 33}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.33862044197675734, 'r': 0.3051692024747207, 'f1': 0.321025760853079}, 'combined': 0.197313394475551, 'epoch': 33}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3318795787545788, 'r': 0.3438448766603416, 'f1': 0.33775629077353214}, 'combined': 0.24887305635944473, 'epoch': 33}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3586126286771368, 'r': 0.302152854958072, 'f1': 0.3279706106399354}, 'combined': 0.20257008304231308, 'epoch': 33}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2441860465116279, 'r': 0.45652173913043476, 'f1': 0.3181818181818182}, 'combined': 0.1590909090909091, 'epoch': 33}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4375, 'r': 0.2413793103448276, 'f1': 0.3111111111111111}, 'combined': 0.2074074074074074, 'epoch': 33}
New best chinese model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 34
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:18:31.043295: step: 2/466, loss: 0.012847150675952435 2023-01-22 16:18:31.794189: step: 4/466, loss: 0.009129938669502735 2023-01-22 16:18:32.636142: step: 6/466, loss: 0.3339959383010864 2023-01-22 16:18:33.398245: step: 8/466, loss: 0.009937900118529797 2023-01-22 16:18:34.131479: step: 10/466, loss: 0.0019312704680487514 2023-01-22 16:18:34.906166: step: 12/466, loss: 0.011718211695551872 2023-01-22 16:18:35.719865: step: 14/466, loss: 0.001805720617994666 2023-01-22 16:18:36.429724: step: 16/466, loss: 0.008648179471492767 2023-01-22 16:18:37.280904: step: 18/466, loss: 0.0250471830368042 2023-01-22 16:18:37.976763: step: 20/466, loss: 0.0016264470759779215 2023-01-22 16:18:38.763296: step: 22/466, loss: 0.004471521824598312 2023-01-22 16:18:39.571893: step: 24/466, loss: 0.2154838740825653 2023-01-22 16:18:40.333049: step: 26/466, loss: 0.15590013563632965 2023-01-22 16:18:41.089416: step: 28/466, loss: 0.002068422269076109 2023-01-22 16:18:41.817126: step: 30/466, loss: 0.031578317284584045 2023-01-22 16:18:42.606392: step: 32/466, loss: 0.0032016118057072163 2023-01-22 16:18:43.353117: step: 34/466, loss: 0.0034452658146619797 2023-01-22 16:18:44.051151: step: 36/466, loss: 0.002187530044466257 2023-01-22 16:18:44.739123: step: 38/466, loss: 0.048533741384744644 2023-01-22 16:18:45.608930: step: 40/466, loss: 0.08503540605306625 2023-01-22 16:18:46.370539: step: 42/466, loss: 0.0007634704816155136 2023-01-22 16:18:47.151503: step: 44/466, loss: 7.228372123790905e-05 2023-01-22 16:18:48.003044: step: 46/466, loss: 0.07904239743947983 2023-01-22 16:18:48.783634: step: 48/466, loss: 0.011844335123896599 2023-01-22
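The summaries above don't state how the scores relate, but the numbers are consistent with the usual F1 = 2pr/(p+r) for each of `template` and `slot`, and with `combined` being the product of the two F1 values (e.g. 0.7368421 × 0.3210099 ≈ 0.2365336 for Dev Chinese). A minimal sketch under that inferred assumption (function names are illustrative):

```python
def f1(p, r):
    """Harmonic mean of precision and recall, guarding against p + r == 0."""
    return 2 * p * r / (p + r) if p + r else 0.0

def combined(template_f1, slot_f1):
    """'combined' in these epoch summaries matches template_f1 * slot_f1."""
    return template_f1 * slot_f1
```

For instance, the Dev Chinese template scores give `f1(1.0, 0.5833333333333334)` ≈ 0.7368421052631579, and multiplying that by the slot F1 reproduces the reported `combined` value.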
16:18:49.476007: step: 50/466, loss: 0.02444412000477314 2023-01-22 16:18:50.289706: step: 52/466, loss: 0.0446506068110466 2023-01-22 16:18:51.101074: step: 54/466, loss: 0.0055224960669875145 2023-01-22 16:18:51.885154: step: 56/466, loss: 0.004008932039141655 2023-01-22 16:18:52.650125: step: 58/466, loss: 0.01605677790939808 2023-01-22 16:18:53.498052: step: 60/466, loss: 0.021131541579961777 2023-01-22 16:18:54.256717: step: 62/466, loss: 0.018017586320638657 2023-01-22 16:18:54.996021: step: 64/466, loss: 0.005739844869822264 2023-01-22 16:18:55.752110: step: 66/466, loss: 0.02099965140223503 2023-01-22 16:18:56.560415: step: 68/466, loss: 0.0035145084839314222 2023-01-22 16:18:57.288249: step: 70/466, loss: 0.04764877259731293 2023-01-22 16:18:58.051161: step: 72/466, loss: 0.004909931216388941 2023-01-22 16:18:58.819104: step: 74/466, loss: 2.8016456781188026e-05 2023-01-22 16:18:59.542750: step: 76/466, loss: 0.011395934037864208 2023-01-22 16:19:00.435673: step: 78/466, loss: 0.00350642460398376 2023-01-22 16:19:01.398906: step: 80/466, loss: 0.010868704877793789 2023-01-22 16:19:02.196052: step: 82/466, loss: 0.02996048517525196 2023-01-22 16:19:02.982874: step: 84/466, loss: 0.002716638846322894 2023-01-22 16:19:03.695457: step: 86/466, loss: 0.00235667172819376 2023-01-22 16:19:04.486595: step: 88/466, loss: 0.003214767901226878 2023-01-22 16:19:05.272400: step: 90/466, loss: 0.06307035684585571 2023-01-22 16:19:05.978119: step: 92/466, loss: 0.020926734432578087 2023-01-22 16:19:06.622673: step: 94/466, loss: 0.00556641211733222 2023-01-22 16:19:07.397866: step: 96/466, loss: 0.0022218553349375725 2023-01-22 16:19:08.069321: step: 98/466, loss: 0.0005310930428095162 2023-01-22 16:19:08.742120: step: 100/466, loss: 0.007593709509819746 2023-01-22 16:19:09.451534: step: 102/466, loss: 7.28668092051521e-05 2023-01-22 16:19:10.151026: step: 104/466, loss: 0.012354198843240738 2023-01-22 16:19:10.938245: step: 106/466, loss: 0.0002113124937750399 
2023-01-22 16:19:11.675182: step: 108/466, loss: 0.0017441267846152186 2023-01-22 16:19:12.439122: step: 110/466, loss: 0.006075084675103426 2023-01-22 16:19:13.210108: step: 112/466, loss: 0.005706703290343285 2023-01-22 16:19:13.978710: step: 114/466, loss: 0.00020052462059538811 2023-01-22 16:19:14.723475: step: 116/466, loss: 0.009616516530513763 2023-01-22 16:19:15.508330: step: 118/466, loss: 0.002562432549893856 2023-01-22 16:19:16.270883: step: 120/466, loss: 0.0027562868781387806 2023-01-22 16:19:16.958962: step: 122/466, loss: 0.0033706985414028168 2023-01-22 16:19:17.722651: step: 124/466, loss: 0.0015648682601749897 2023-01-22 16:19:18.434781: step: 126/466, loss: 0.046386826783418655 2023-01-22 16:19:19.121224: step: 128/466, loss: 0.00020761314954143018 2023-01-22 16:19:19.883217: step: 130/466, loss: 0.015245960094034672 2023-01-22 16:19:20.682548: step: 132/466, loss: 0.023303933441638947 2023-01-22 16:19:21.418883: step: 134/466, loss: 0.00248493580147624 2023-01-22 16:19:22.235888: step: 136/466, loss: 0.007371986750513315 2023-01-22 16:19:22.999929: step: 138/466, loss: 0.0011075693182647228 2023-01-22 16:19:23.703835: step: 140/466, loss: 0.0002633397525642067 2023-01-22 16:19:24.508210: step: 142/466, loss: 0.015081997029483318 2023-01-22 16:19:25.236095: step: 144/466, loss: 0.0006969335372559726 2023-01-22 16:19:25.952112: step: 146/466, loss: 2.260213477711659e-05 2023-01-22 16:19:26.721942: step: 148/466, loss: 0.004124639555811882 2023-01-22 16:19:27.538966: step: 150/466, loss: 0.000673601112794131 2023-01-22 16:19:28.274470: step: 152/466, loss: 0.02401557005941868 2023-01-22 16:19:28.999539: step: 154/466, loss: 0.0060744090005755424 2023-01-22 16:19:29.779385: step: 156/466, loss: 0.0022823396138846874 2023-01-22 16:19:30.574078: step: 158/466, loss: 0.006407279521226883 2023-01-22 16:19:31.321790: step: 160/466, loss: 0.010667653754353523 2023-01-22 16:19:32.091756: step: 162/466, loss: 0.022322572767734528 2023-01-22 16:19:32.852325: 
step: 164/466, loss: 0.0059477174654603004 2023-01-22 16:19:33.555799: step: 166/466, loss: 7.60244220145978e-05 2023-01-22 16:19:34.263584: step: 168/466, loss: 0.00989892240613699 2023-01-22 16:19:35.050752: step: 170/466, loss: 0.06553342938423157 2023-01-22 16:19:35.776108: step: 172/466, loss: 0.0013819290325045586 2023-01-22 16:19:36.493395: step: 174/466, loss: 0.0024518663994967937 2023-01-22 16:19:37.327509: step: 176/466, loss: 0.003064458491280675 2023-01-22 16:19:38.090792: step: 178/466, loss: 0.5088443160057068 2023-01-22 16:19:38.836428: step: 180/466, loss: 0.004672045353800058 2023-01-22 16:19:39.579404: step: 182/466, loss: 0.0006907099741511047 2023-01-22 16:19:40.409325: step: 184/466, loss: 0.038431692868471146 2023-01-22 16:19:41.202255: step: 186/466, loss: 0.0007816114812158048 2023-01-22 16:19:41.962837: step: 188/466, loss: 0.012371831573545933 2023-01-22 16:19:42.784103: step: 190/466, loss: 0.019759811460971832 2023-01-22 16:19:43.557905: step: 192/466, loss: 0.010684781707823277 2023-01-22 16:19:44.298799: step: 194/466, loss: 0.0015153209678828716 2023-01-22 16:19:45.093915: step: 196/466, loss: 0.002155774272978306 2023-01-22 16:19:45.847851: step: 198/466, loss: 0.0011441691312938929 2023-01-22 16:19:46.565932: step: 200/466, loss: 0.007144713308662176 2023-01-22 16:19:47.354333: step: 202/466, loss: 0.025695007294416428 2023-01-22 16:19:48.141253: step: 204/466, loss: 0.0004251356585882604 2023-01-22 16:19:48.937555: step: 206/466, loss: 0.014349368400871754 2023-01-22 16:19:49.670667: step: 208/466, loss: 0.061793696135282516 2023-01-22 16:19:50.448976: step: 210/466, loss: 0.0003937285509891808 2023-01-22 16:19:51.264336: step: 212/466, loss: 0.01733911596238613 2023-01-22 16:19:51.971983: step: 214/466, loss: 0.003946369048207998 2023-01-22 16:19:52.734792: step: 216/466, loss: 0.003022226504981518 2023-01-22 16:19:53.548257: step: 218/466, loss: 0.01839815080165863 2023-01-22 16:19:54.383566: step: 220/466, loss: 
0.01183647383004427 2023-01-22 16:19:55.192839: step: 222/466, loss: 0.030242666602134705 2023-01-22 16:19:55.967786: step: 224/466, loss: 0.0011466683354228735 2023-01-22 16:19:56.767928: step: 226/466, loss: 0.0030892265494912863 2023-01-22 16:19:57.488693: step: 228/466, loss: 9.504199988441542e-05 2023-01-22 16:19:58.212235: step: 230/466, loss: 0.000864933361299336 2023-01-22 16:19:58.995207: step: 232/466, loss: 0.018242958933115005 2023-01-22 16:19:59.773364: step: 234/466, loss: 0.006578810978680849 2023-01-22 16:20:00.527375: step: 236/466, loss: 0.0023516130167990923 2023-01-22 16:20:01.409510: step: 238/466, loss: 0.02836759015917778 2023-01-22 16:20:02.197051: step: 240/466, loss: 0.0028621014207601547 2023-01-22 16:20:02.950521: step: 242/466, loss: 0.0016216520452871919 2023-01-22 16:20:03.699173: step: 244/466, loss: 0.023854777216911316 2023-01-22 16:20:04.338894: step: 246/466, loss: 0.008852701634168625 2023-01-22 16:20:05.086078: step: 248/466, loss: 9.626195242162794e-05 2023-01-22 16:20:05.738396: step: 250/466, loss: 0.0032315291464328766 2023-01-22 16:20:06.456196: step: 252/466, loss: 0.0030613539274781942 2023-01-22 16:20:07.134391: step: 254/466, loss: 0.0009762575500644743 2023-01-22 16:20:07.926219: step: 256/466, loss: 0.0010766517370939255 2023-01-22 16:20:08.626441: step: 258/466, loss: 0.00029321524198167026 2023-01-22 16:20:09.341702: step: 260/466, loss: 0.0014581572031602263 2023-01-22 16:20:10.069228: step: 262/466, loss: 0.014903835952281952 2023-01-22 16:20:10.870828: step: 264/466, loss: 0.011509068310260773 2023-01-22 16:20:11.541124: step: 266/466, loss: 0.001634338404983282 2023-01-22 16:20:12.262129: step: 268/466, loss: 0.016303768381476402 2023-01-22 16:20:12.935901: step: 270/466, loss: 0.010090984404087067 2023-01-22 16:20:13.719023: step: 272/466, loss: 0.0007559580262750387 2023-01-22 16:20:14.535217: step: 274/466, loss: 0.006533884909003973 2023-01-22 16:20:15.259667: step: 276/466, loss: 0.001971112797036767 
2023-01-22 16:20:15.998646: step: 278/466, loss: 0.007579253986477852 2023-01-22 16:20:16.723949: step: 280/466, loss: 0.012597577646374702 2023-01-22 16:20:17.506658: step: 282/466, loss: 0.0018686820985749364 2023-01-22 16:20:18.276801: step: 284/466, loss: 0.024334682151675224 2023-01-22 16:20:18.995158: step: 286/466, loss: 0.0003072105173487216 2023-01-22 16:20:19.748695: step: 288/466, loss: 0.00048379477811977267 2023-01-22 16:20:20.475482: step: 290/466, loss: 0.0029958547092974186 2023-01-22 16:20:21.164291: step: 292/466, loss: 5.4033156629884616e-05 2023-01-22 16:20:21.892524: step: 294/466, loss: 0.002091967035084963 2023-01-22 16:20:22.751904: step: 296/466, loss: 0.009681585244834423 2023-01-22 16:20:23.488066: step: 298/466, loss: 0.0075980499386787415 2023-01-22 16:20:24.201234: step: 300/466, loss: 0.007685400079935789 2023-01-22 16:20:24.916941: step: 302/466, loss: 0.0036172426771372557 2023-01-22 16:20:25.643644: step: 304/466, loss: 0.00325842946767807 2023-01-22 16:20:26.424214: step: 306/466, loss: 0.0005344321252778172 2023-01-22 16:20:27.188065: step: 308/466, loss: 0.06320324540138245 2023-01-22 16:20:27.953566: step: 310/466, loss: 0.00019163635442964733 2023-01-22 16:20:28.762975: step: 312/466, loss: 0.0513727143406868 2023-01-22 16:20:29.489763: step: 314/466, loss: 0.06122292950749397 2023-01-22 16:20:30.238152: step: 316/466, loss: 0.0025486303493380547 2023-01-22 16:20:30.899454: step: 318/466, loss: 0.0069191232323646545 2023-01-22 16:20:31.630361: step: 320/466, loss: 0.038638561964035034 2023-01-22 16:20:32.416204: step: 322/466, loss: 0.03692803159356117 2023-01-22 16:20:33.154400: step: 324/466, loss: 0.15488755702972412 2023-01-22 16:20:33.832072: step: 326/466, loss: 0.001069761230610311 2023-01-22 16:20:34.715000: step: 328/466, loss: 0.021730933338403702 2023-01-22 16:20:35.340180: step: 330/466, loss: 0.01111496239900589 2023-01-22 16:20:36.088211: step: 332/466, loss: 0.00878437515348196 2023-01-22 16:20:36.826406: step: 
334/466, loss: 0.015419178642332554 2023-01-22 16:20:37.619015: step: 336/466, loss: 0.0022405844647437334 2023-01-22 16:20:38.394725: step: 338/466, loss: 0.13187530636787415 2023-01-22 16:20:39.134898: step: 340/466, loss: 0.00011246054782532156 2023-01-22 16:20:39.948603: step: 342/466, loss: 0.04561132937669754 2023-01-22 16:20:40.794580: step: 344/466, loss: 0.029233692213892937 2023-01-22 16:20:41.585239: step: 346/466, loss: 0.03551267087459564 2023-01-22 16:20:42.368564: step: 348/466, loss: 0.0002540225104894489 2023-01-22 16:20:43.219845: step: 350/466, loss: 0.06430414319038391 2023-01-22 16:20:43.930544: step: 352/466, loss: 0.00018176485900767148 2023-01-22 16:20:44.722731: step: 354/466, loss: 0.014430318959057331 2023-01-22 16:20:45.491870: step: 356/466, loss: 0.0014982149004936218 2023-01-22 16:20:46.259709: step: 358/466, loss: 0.08484133332967758 2023-01-22 16:20:47.047133: step: 360/466, loss: 0.003684528172016144 2023-01-22 16:20:47.820853: step: 362/466, loss: 0.003811977803707123 2023-01-22 16:20:48.540715: step: 364/466, loss: 0.008543026633560658 2023-01-22 16:20:49.326312: step: 366/466, loss: 0.005676996428519487 2023-01-22 16:20:50.014725: step: 368/466, loss: 0.006688602734357119 2023-01-22 16:20:50.831290: step: 370/466, loss: 0.024656053632497787 2023-01-22 16:20:51.570095: step: 372/466, loss: 0.00386234768666327 2023-01-22 16:20:52.275851: step: 374/466, loss: 0.0007061361102387309 2023-01-22 16:20:53.024732: step: 376/466, loss: 0.005585248116403818 2023-01-22 16:20:53.832737: step: 378/466, loss: 0.000482331175589934 2023-01-22 16:20:54.534420: step: 380/466, loss: 0.0001822631456889212 2023-01-22 16:20:55.215474: step: 382/466, loss: 0.00896023865789175 2023-01-22 16:20:55.930552: step: 384/466, loss: 0.001212327741086483 2023-01-22 16:20:56.718310: step: 386/466, loss: 0.022206583991646767 2023-01-22 16:20:57.463333: step: 388/466, loss: 0.009799190796911716 2023-01-22 16:20:58.101510: step: 390/466, loss: 0.0009703595424070954 
2023-01-22 16:20:58.825811: step: 392/466, loss: 0.019333017989993095 2023-01-22 16:20:59.584222: step: 394/466, loss: 0.00568966893479228 2023-01-22 16:21:00.355611: step: 396/466, loss: 0.002850270364433527 2023-01-22 16:21:01.159097: step: 398/466, loss: 0.007840417325496674 2023-01-22 16:21:01.999639: step: 400/466, loss: 0.036355625838041306 2023-01-22 16:21:02.735754: step: 402/466, loss: 0.004576034378260374 2023-01-22 16:21:03.535762: step: 404/466, loss: 0.04875103011727333 2023-01-22 16:21:04.261278: step: 406/466, loss: 0.24837371706962585 2023-01-22 16:21:05.090269: step: 408/466, loss: 0.0003534825809765607 2023-01-22 16:21:05.802526: step: 410/466, loss: 0.022242676466703415 2023-01-22 16:21:06.589211: step: 412/466, loss: 1.1368690729141235 2023-01-22 16:21:07.344951: step: 414/466, loss: 0.000881312764249742 2023-01-22 16:21:08.056249: step: 416/466, loss: 0.02301531471312046 2023-01-22 16:21:08.833471: step: 418/466, loss: 0.00803961418569088 2023-01-22 16:21:09.472744: step: 420/466, loss: 0.01750531978905201 2023-01-22 16:21:10.243956: step: 422/466, loss: 0.03382166847586632 2023-01-22 16:21:11.067283: step: 424/466, loss: 0.06351270526647568 2023-01-22 16:21:11.935345: step: 426/466, loss: 0.014697426930069923 2023-01-22 16:21:12.642864: step: 428/466, loss: 0.0015871673822402954 2023-01-22 16:21:13.425090: step: 430/466, loss: 0.01852606236934662 2023-01-22 16:21:14.133985: step: 432/466, loss: 0.0006288749864324927 2023-01-22 16:21:14.843768: step: 434/466, loss: 0.07311911135911942 2023-01-22 16:21:15.604010: step: 436/466, loss: 0.016755113378167152 2023-01-22 16:21:16.402265: step: 438/466, loss: 0.0033878744579851627 2023-01-22 16:21:17.172734: step: 440/466, loss: 0.003291395725682378 2023-01-22 16:21:17.906877: step: 442/466, loss: 0.7009214162826538 2023-01-22 16:21:18.675234: step: 444/466, loss: 0.05325298756361008 2023-01-22 16:21:19.468356: step: 446/466, loss: 0.002627542708069086 2023-01-22 16:21:20.222213: step: 448/466, loss: 
0.006730484776198864 2023-01-22 16:21:21.038811: step: 450/466, loss: 0.018681922927498817 2023-01-22 16:21:21.781600: step: 452/466, loss: 0.006985391955822706 2023-01-22 16:21:22.539372: step: 454/466, loss: 0.005801316816359758 2023-01-22 16:21:23.256247: step: 456/466, loss: 0.000606681453064084 2023-01-22 16:21:23.976967: step: 458/466, loss: 0.029049178585410118 2023-01-22 16:21:24.699416: step: 460/466, loss: 0.021959295496344566 2023-01-22 16:21:25.411041: step: 462/466, loss: 0.0020662713795900345 2023-01-22 16:21:26.198262: step: 464/466, loss: 0.010049436241388321 2023-01-22 16:21:26.968470: step: 466/466, loss: 0.004619895480573177 2023-01-22 16:21:27.727008: step: 468/466, loss: 0.0024331167805939913 2023-01-22 16:21:28.456600: step: 470/466, loss: 0.0027067656628787518 2023-01-22 16:21:29.248870: step: 472/466, loss: 0.00027432592469267547 2023-01-22 16:21:30.028984: step: 474/466, loss: 0.06490861624479294 2023-01-22 16:21:30.899320: step: 476/466, loss: 0.0030483517330139875 2023-01-22 16:21:31.574207: step: 478/466, loss: 0.03932291641831398 2023-01-22 16:21:32.402759: step: 480/466, loss: 0.0018771301256492734 2023-01-22 16:21:33.154168: step: 482/466, loss: 0.005720630753785372 2023-01-22 16:21:33.971256: step: 484/466, loss: 0.03505473956465721 2023-01-22 16:21:34.759267: step: 486/466, loss: 0.04068222641944885 2023-01-22 16:21:35.421461: step: 488/466, loss: 0.0047709825448691845 2023-01-22 16:21:36.166348: step: 490/466, loss: 0.02563410997390747 2023-01-22 16:21:36.953020: step: 492/466, loss: 0.008590064011514187 2023-01-22 16:21:37.630136: step: 494/466, loss: 0.0010861967457458377 2023-01-22 16:21:38.425642: step: 496/466, loss: 0.08789907395839691 2023-01-22 16:21:39.146167: step: 498/466, loss: 0.01686914451420307 2023-01-22 16:21:39.856795: step: 500/466, loss: 0.0014343768125399947 2023-01-22 16:21:40.765642: step: 502/466, loss: 0.04180929437279701 2023-01-22 16:21:41.506058: step: 504/466, loss: 0.022112946957349777 2023-01-22 
16:21:42.259524: step: 506/466, loss: 0.03575494885444641 2023-01-22 16:21:42.989916: step: 508/466, loss: 0.0023700930178165436 2023-01-22 16:21:43.800651: step: 510/466, loss: 0.009916874580085278 2023-01-22 16:21:44.580789: step: 512/466, loss: 0.03779168799519539 2023-01-22 16:21:45.335115: step: 514/466, loss: 0.0007968175923451781 2023-01-22 16:21:46.039801: step: 516/466, loss: 0.030672218650579453 2023-01-22 16:21:46.776535: step: 518/466, loss: 0.0027551238890737295 2023-01-22 16:21:47.504786: step: 520/466, loss: 0.009607107378542423 2023-01-22 16:21:48.246627: step: 522/466, loss: 0.0026895857881754637 2023-01-22 16:21:48.992675: step: 524/466, loss: 0.007058565504848957 2023-01-22 16:21:49.771169: step: 526/466, loss: 0.015240203589200974 2023-01-22 16:21:50.506636: step: 528/466, loss: 0.0023862060625106096 2023-01-22 16:21:51.304373: step: 530/466, loss: 0.012481776997447014 2023-01-22 16:21:52.034641: step: 532/466, loss: 0.0006766861770302057 2023-01-22 16:21:52.836221: step: 534/466, loss: 0.06194831430912018 2023-01-22 16:21:53.704272: step: 536/466, loss: 0.019686013460159302 2023-01-22 16:21:54.432070: step: 538/466, loss: 7.519257633248344e-05 2023-01-22 16:21:55.248619: step: 540/466, loss: 0.008502397686243057 2023-01-22 16:21:55.998158: step: 542/466, loss: 0.014276986010372639 2023-01-22 16:21:56.749454: step: 544/466, loss: 0.0015168474055826664 2023-01-22 16:21:57.451659: step: 546/466, loss: 0.01749262772500515 2023-01-22 16:21:58.283321: step: 548/466, loss: 0.0004611381737049669 2023-01-22 16:21:59.011825: step: 550/466, loss: 0.0009669710998423398 2023-01-22 16:21:59.738589: step: 552/466, loss: 0.004210221581161022 2023-01-22 16:22:00.531079: step: 554/466, loss: 0.00013607698201667517 2023-01-22 16:22:01.212246: step: 556/466, loss: 0.0025229689199477434 2023-01-22 16:22:01.977426: step: 558/466, loss: 0.036629319190979004 2023-01-22 16:22:02.763091: step: 560/466, loss: 0.008265677839517593 2023-01-22 16:22:03.731589: step: 
562/466, loss: 0.003423975547775626 2023-01-22 16:22:04.491808: step: 564/466, loss: 0.027235476300120354 2023-01-22 16:22:05.186014: step: 566/466, loss: 0.018567977473139763 2023-01-22 16:22:05.961632: step: 568/466, loss: 0.00036482102586887777 2023-01-22 16:22:06.660343: step: 570/466, loss: 0.004040045198053122 2023-01-22 16:22:07.525228: step: 572/466, loss: 0.06701383739709854 2023-01-22 16:22:08.279212: step: 574/466, loss: 0.033170491456985474 2023-01-22 16:22:09.112441: step: 576/466, loss: 0.01985347270965576 2023-01-22 16:22:09.881652: step: 578/466, loss: 7.628784806001931e-05 2023-01-22 16:22:10.600755: step: 580/466, loss: 0.13127250969409943 2023-01-22 16:22:11.324098: step: 582/466, loss: 0.005728569347411394 2023-01-22 16:22:12.056633: step: 584/466, loss: 0.018810346722602844 2023-01-22 16:22:12.789228: step: 586/466, loss: 0.031102297827601433 2023-01-22 16:22:13.527186: step: 588/466, loss: 0.02655886299908161 2023-01-22 16:22:14.247899: step: 590/466, loss: 0.0003828817280009389 2023-01-22 16:22:15.001486: step: 592/466, loss: 0.010496832430362701 2023-01-22 16:22:15.700320: step: 594/466, loss: 0.01843322440981865 2023-01-22 16:22:16.457452: step: 596/466, loss: 0.00011009426816599444 2023-01-22 16:22:17.192611: step: 598/466, loss: 0.0060088844038546085 2023-01-22 16:22:18.018813: step: 600/466, loss: 0.04839334264397621 2023-01-22 16:22:18.730799: step: 602/466, loss: 0.005110643804073334 2023-01-22 16:22:19.410139: step: 604/466, loss: 0.012522554956376553 2023-01-22 16:22:20.179339: step: 606/466, loss: 0.050412654876708984 2023-01-22 16:22:20.976884: step: 608/466, loss: 0.0001513346505817026 2023-01-22 16:22:21.848852: step: 610/466, loss: 0.025511352345347404 2023-01-22 16:22:22.667420: step: 612/466, loss: 0.0007206370355561376 2023-01-22 16:22:23.505426: step: 614/466, loss: 0.012813889421522617 2023-01-22 16:22:24.291405: step: 616/466, loss: 0.11198767274618149 2023-01-22 16:22:25.084534: step: 618/466, loss: 0.02641315758228302 
2023-01-22 16:22:25.969498: step: 620/466, loss: 0.0035171855706721544 2023-01-22 16:22:26.812991: step: 622/466, loss: 0.0036150936502963305 2023-01-22 16:22:27.689118: step: 624/466, loss: 0.014979987405240536 2023-01-22 16:22:28.434156: step: 626/466, loss: 0.05457896739244461 2023-01-22 16:22:29.159150: step: 628/466, loss: 0.0005795444594696164 2023-01-22 16:22:29.946116: step: 630/466, loss: 0.013990444131195545 2023-01-22 16:22:30.764344: step: 632/466, loss: 0.034092966467142105 2023-01-22 16:22:31.435806: step: 634/466, loss: 0.01346777006983757 2023-01-22 16:22:32.226324: step: 636/466, loss: 0.011793782003223896 2023-01-22 16:22:32.960931: step: 638/466, loss: 0.010652897879481316 2023-01-22 16:22:33.737048: step: 640/466, loss: 0.02251473255455494 2023-01-22 16:22:34.486309: step: 642/466, loss: 0.03246127441525459 2023-01-22 16:22:35.314618: step: 644/466, loss: 0.000812268815934658 2023-01-22 16:22:36.087810: step: 646/466, loss: 0.2501620948314667 2023-01-22 16:22:36.916454: step: 648/466, loss: 0.0055265543051064014 2023-01-22 16:22:37.797490: step: 650/466, loss: 0.01182998064905405 2023-01-22 16:22:38.516955: step: 652/466, loss: 0.006487314589321613 2023-01-22 16:22:39.265474: step: 654/466, loss: 0.016985846683382988 2023-01-22 16:22:40.033770: step: 656/466, loss: 0.009141712449491024 2023-01-22 16:22:40.894377: step: 658/466, loss: 0.0020782635547220707 2023-01-22 16:22:41.698802: step: 660/466, loss: 0.001583437668159604 2023-01-22 16:22:42.502000: step: 662/466, loss: 0.003407202661037445 2023-01-22 16:22:43.283843: step: 664/466, loss: 0.0012576496228575706 2023-01-22 16:22:44.010607: step: 666/466, loss: 0.0018089022487401962 2023-01-22 16:22:44.731654: step: 668/466, loss: 2.7493411835166626e-05 2023-01-22 16:22:45.471426: step: 670/466, loss: 0.0005983322625979781 2023-01-22 16:22:46.292080: step: 672/466, loss: 0.00021879436098970473 2023-01-22 16:22:47.006698: step: 674/466, loss: 0.01310392189770937 2023-01-22 16:22:47.713933: step: 
676/466, loss: 0.0006318899104371667 2023-01-22 16:22:48.414237: step: 678/466, loss: 0.031030451878905296 2023-01-22 16:22:49.163380: step: 680/466, loss: 0.012557669542729855 2023-01-22 16:22:49.955384: step: 682/466, loss: 0.007738407235592604 2023-01-22 16:22:50.667763: step: 684/466, loss: 0.008845807053148746 2023-01-22 16:22:51.417687: step: 686/466, loss: 0.05530845746397972 2023-01-22 16:22:52.171730: step: 688/466, loss: 0.0006768331513740122 2023-01-22 16:22:52.879946: step: 690/466, loss: 0.0006791841005906463 2023-01-22 16:22:53.667655: step: 692/466, loss: 0.0006975700962357223 2023-01-22 16:22:54.438893: step: 694/466, loss: 0.020109187811613083 2023-01-22 16:22:55.241904: step: 696/466, loss: 0.003456049831584096 2023-01-22 16:22:55.938092: step: 698/466, loss: 0.0036004381254315376 2023-01-22 16:22:56.635330: step: 700/466, loss: 0.015351896174252033 2023-01-22 16:22:57.396515: step: 702/466, loss: 0.0023220102302730083 2023-01-22 16:22:58.131044: step: 704/466, loss: 0.0014163806336000562 2023-01-22 16:22:58.878946: step: 706/466, loss: 0.01341936830431223 2023-01-22 16:22:59.561161: step: 708/466, loss: 0.06239281967282295 2023-01-22 16:23:00.183205: step: 710/466, loss: 4.801032543182373 2023-01-22 16:23:00.965750: step: 712/466, loss: 0.5246943831443787 2023-01-22 16:23:01.896881: step: 714/466, loss: 0.013656568713486195 2023-01-22 16:23:02.674000: step: 716/466, loss: 0.00019670475739985704 2023-01-22 16:23:03.386586: step: 718/466, loss: 0.0020928506273776293 2023-01-22 16:23:04.199570: step: 720/466, loss: 0.0014590020291507244 2023-01-22 16:23:04.908067: step: 722/466, loss: 0.007673331536352634 2023-01-22 16:23:05.660160: step: 724/466, loss: 0.04808073118329048 2023-01-22 16:23:06.408248: step: 726/466, loss: 0.0033867366146296263 2023-01-22 16:23:07.186384: step: 728/466, loss: 0.004553688690066338 2023-01-22 16:23:07.940784: step: 730/466, loss: 0.01876821555197239 2023-01-22 16:23:08.647121: step: 732/466, loss: 0.004025513771921396 
2023-01-22 16:23:09.435538: step: 734/466, loss: 0.021394729614257812 2023-01-22 16:23:10.181686: step: 736/466, loss: 0.042647164314985275 2023-01-22 16:23:10.977419: step: 738/466, loss: 0.010004247538745403 2023-01-22 16:23:11.691145: step: 740/466, loss: 0.005100119858980179 2023-01-22 16:23:12.505943: step: 742/466, loss: 0.0197369996458292 2023-01-22 16:23:13.288200: step: 744/466, loss: 0.0205406304448843 2023-01-22 16:23:14.040844: step: 746/466, loss: 0.00041430548299103975 2023-01-22 16:23:14.816817: step: 748/466, loss: 0.02803983725607395 2023-01-22 16:23:15.598800: step: 750/466, loss: 0.008702714927494526 2023-01-22 16:23:16.405690: step: 752/466, loss: 0.008735493756830692 2023-01-22 16:23:17.121156: step: 754/466, loss: 0.004952584858983755 2023-01-22 16:23:17.812671: step: 756/466, loss: 1.1424092008383013e-05 2023-01-22 16:23:18.495754: step: 758/466, loss: 0.006848689168691635 2023-01-22 16:23:19.264785: step: 760/466, loss: 0.037673287093639374 2023-01-22 16:23:20.053498: step: 762/466, loss: 0.00044622819405049086 2023-01-22 16:23:20.767921: step: 764/466, loss: 0.0030297415796667337 2023-01-22 16:23:21.624147: step: 766/466, loss: 0.038348063826560974 2023-01-22 16:23:22.458296: step: 768/466, loss: 0.0030770660378038883 2023-01-22 16:23:23.302712: step: 770/466, loss: 5.194837649469264e-05 2023-01-22 16:23:24.054019: step: 772/466, loss: 0.02890346758067608 2023-01-22 16:23:24.842472: step: 774/466, loss: 0.0009718273649923503 2023-01-22 16:23:25.663840: step: 776/466, loss: 0.0007162726833485067 2023-01-22 16:23:26.359612: step: 778/466, loss: 0.0012545500649139285 2023-01-22 16:23:27.100853: step: 780/466, loss: 0.003798744175583124 2023-01-22 16:23:27.884351: step: 782/466, loss: 0.004894550424069166 2023-01-22 16:23:28.710834: step: 784/466, loss: 0.0067248838022351265 2023-01-22 16:23:29.479109: step: 786/466, loss: 0.006345099303871393 2023-01-22 16:23:30.227289: step: 788/466, loss: 0.006225190591067076 2023-01-22 16:23:30.988350: 
step: 790/466, loss: 0.008393198251724243 2023-01-22 16:23:31.709259: step: 792/466, loss: 0.021510576829314232 2023-01-22 16:23:32.524059: step: 794/466, loss: 9.573287388775498e-05 2023-01-22 16:23:33.284010: step: 796/466, loss: 0.02046097069978714 2023-01-22 16:23:33.997060: step: 798/466, loss: 0.029701031744480133 2023-01-22 16:23:34.770039: step: 800/466, loss: 0.0017636730335652828 2023-01-22 16:23:35.551645: step: 802/466, loss: 0.008908066898584366 2023-01-22 16:23:36.375938: step: 804/466, loss: 0.010303734801709652 2023-01-22 16:23:37.151807: step: 806/466, loss: 0.02440200001001358 2023-01-22 16:23:37.972768: step: 808/466, loss: 0.03921886533498764 2023-01-22 16:23:38.678910: step: 810/466, loss: 0.0026704040355980396 2023-01-22 16:23:39.439019: step: 812/466, loss: 0.004355969838798046 2023-01-22 16:23:40.182706: step: 814/466, loss: 0.020293502137064934 2023-01-22 16:23:40.974868: step: 816/466, loss: 0.010316620580852032 2023-01-22 16:23:41.685097: step: 818/466, loss: 0.002169701736420393 2023-01-22 16:23:42.469750: step: 820/466, loss: 0.07257067412137985 2023-01-22 16:23:43.276814: step: 822/466, loss: 0.2700299918651581 2023-01-22 16:23:44.010392: step: 824/466, loss: 0.009220699779689312 2023-01-22 16:23:44.767995: step: 826/466, loss: 0.0862964317202568 2023-01-22 16:23:45.426176: step: 828/466, loss: 2.2523789084516466e-05 2023-01-22 16:23:46.182921: step: 830/466, loss: 0.002025639172643423 2023-01-22 16:23:46.908299: step: 832/466, loss: 0.017040148377418518 2023-01-22 16:23:47.730259: step: 834/466, loss: 0.016010645776987076 2023-01-22 16:23:48.380098: step: 836/466, loss: 0.011474061757326126 2023-01-22 16:23:49.157104: step: 838/466, loss: 0.013029249384999275 2023-01-22 16:23:49.879376: step: 840/466, loss: 0.030176879838109016 2023-01-22 16:23:50.658111: step: 842/466, loss: 0.0135821383446455 2023-01-22 16:23:51.441385: step: 844/466, loss: 0.0947868824005127 2023-01-22 16:23:52.173652: step: 846/466, loss: 0.0001525956904515624 
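Every training entry in this log follows the fixed pattern `<timestamp>: step: <k>/466, loss: <value>`. A hypothetical helper (not part of `train.py`) for pulling the (step, loss) pairs out of a chunk of this log might look like the following sketch; the regex and function name are assumptions for illustration only.

```python
import re

# Matches the "step: <k>/<total>, loss: <value>" portion of each log entry;
# the loss may be in scientific notation (e.g. 2.448061786708422e-05).
STEP_RE = re.compile(r"step: (\d+)/\d+, loss: ([0-9.eE+-]+)")

def parse_steps(text: str) -> list[tuple[int, float]]:
    """Return (step, loss) pairs for every training entry found in text."""
    return [(int(step), float(loss)) for step, loss in STEP_RE.findall(text)]

# Two entries copied verbatim from this log.
sample = ("2023-01-22 16:23:52.874629: step: 848/466, loss: 0.0007349324878305197 "
          "2023-01-22 16:23:53.585903: step: 850/466, loss: 2.448061786708422e-05")
assert parse_steps(sample) == [(848, 0.0007349324878305197),
                               (850, 2.448061786708422e-05)]
```

Such a parser is useful for plotting the per-step loss curve, since the raw log interleaves entries without line breaks.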
2023-01-22 16:23:52.874629: step: 848/466, loss: 0.0007349324878305197 2023-01-22 16:23:53.585903: step: 850/466, loss: 2.448061786708422e-05 2023-01-22 16:23:54.244829: step: 852/466, loss: 0.007771041709929705 2023-01-22 16:23:54.978213: step: 854/466, loss: 0.0008471178589388728 2023-01-22 16:23:55.751177: step: 856/466, loss: 0.003764393040910363 2023-01-22 16:23:56.471459: step: 858/466, loss: 0.001283894875086844 2023-01-22 16:23:57.135538: step: 860/466, loss: 0.00010542834206717089 2023-01-22 16:23:57.885140: step: 862/466, loss: 0.011902322061359882 2023-01-22 16:23:58.673091: step: 864/466, loss: 0.003066282719373703 2023-01-22 16:23:59.429255: step: 866/466, loss: 0.0017051482573151588 2023-01-22 16:24:00.101599: step: 868/466, loss: 0.0004545479314401746 2023-01-22 16:24:00.830325: step: 870/466, loss: 0.004616321064531803 2023-01-22 16:24:01.563838: step: 872/466, loss: 0.024221308529376984 2023-01-22 16:24:02.238134: step: 874/466, loss: 3.606598329497501e-05 2023-01-22 16:24:03.008874: step: 876/466, loss: 0.00022110596182756126 2023-01-22 16:24:03.753517: step: 878/466, loss: 0.017143752425909042 2023-01-22 16:24:04.535996: step: 880/466, loss: 0.006943912245333195 2023-01-22 16:24:05.300487: step: 882/466, loss: 0.0264279842376709 2023-01-22 16:24:06.024938: step: 884/466, loss: 0.0035525208804756403 2023-01-22 16:24:06.778568: step: 886/466, loss: 0.003330837469547987 2023-01-22 16:24:07.569336: step: 888/466, loss: 0.07850099354982376 2023-01-22 16:24:08.235881: step: 890/466, loss: 0.001839144853875041 2023-01-22 16:24:09.013839: step: 892/466, loss: 0.0007886227685958147 2023-01-22 16:24:09.759511: step: 894/466, loss: 0.0025404752232134342 2023-01-22 16:24:10.499528: step: 896/466, loss: 0.013902636244893074 2023-01-22 16:24:11.235950: step: 898/466, loss: 0.0199846550822258 2023-01-22 16:24:11.991600: step: 900/466, loss: 0.002748046535998583 2023-01-22 16:24:12.732143: step: 902/466, loss: 0.00542854517698288 2023-01-22 16:24:13.451351: 
step: 904/466, loss: 0.0008315120358020067 2023-01-22 16:24:14.163219: step: 906/466, loss: 0.004181792959570885 2023-01-22 16:24:14.940002: step: 908/466, loss: 0.0013797288993373513 2023-01-22 16:24:15.703922: step: 910/466, loss: 0.03098655864596367 2023-01-22 16:24:16.626359: step: 912/466, loss: 0.022968340665102005 2023-01-22 16:24:17.405336: step: 914/466, loss: 0.0047401487827301025 2023-01-22 16:24:18.137913: step: 916/466, loss: 0.061698734760284424 2023-01-22 16:24:18.863823: step: 918/466, loss: 0.10695232450962067 2023-01-22 16:24:19.624690: step: 920/466, loss: 0.5546409487724304 2023-01-22 16:24:20.351511: step: 922/466, loss: 0.026372479274868965 2023-01-22 16:24:21.135469: step: 924/466, loss: 0.0012187837855890393 2023-01-22 16:24:21.855003: step: 926/466, loss: 0.060460496693849564 2023-01-22 16:24:22.597611: step: 928/466, loss: 0.013208061456680298 2023-01-22 16:24:23.364228: step: 930/466, loss: 0.038146454840898514 2023-01-22 16:24:24.214607: step: 932/466, loss: 0.019662169739603996
==================================================
Loss: 0.035
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.293395902090209, 'r': 0.3373774509803921, 'f1': 0.31385333921741687}, 'combined': 0.23126035521283347, 'epoch': 34}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.342535659384776, 'r': 0.313743667218118, 'f1': 0.32750808862026975}, 'combined': 0.20129765446904385, 'epoch': 34}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2733180511788048, 'r': 0.3485194125088365, 'f1': 0.3063715269260331}, 'combined': 0.22574744089286647, 'epoch': 34}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3177183541533826, 'r': 0.31331324699007745, 'f1': 0.315500424979537}, 'combined': 0.19391733437766664, 'epoch': 34}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30901859504132234, 'r': 0.35475569259962053, 'f1': 0.3303113957597173}, 'combined': 0.2433873442440022, 'epoch': 34}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3397558626298769, 'r': 0.30854778512661263, 'f1': 0.3234006757821171}, 'combined': 0.1997474762183665, 'epoch': 34}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2471590909090909, 'r': 0.3107142857142857, 'f1': 0.2753164556962025}, 'combined': 0.18354430379746833, 'epoch': 34}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.22448979591836735, 'r': 0.4782608695652174, 'f1': 0.3055555555555556}, 'combined': 0.1527777777777778, 'epoch': 34}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4117647058823529, 'r': 0.2413793103448276, 'f1': 0.3043478260869565}, 'combined': 0.20289855072463764, 'epoch': 34}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 35
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:27:11.131823: step: 2/466, loss: 0.0016653207130730152 2023-01-22 16:27:11.898574: step: 4/466, loss: 0.009674855507910252 2023-01-22 16:27:12.614336: step: 6/466, loss: 0.003571053734049201 2023-01-22 16:27:13.356264: step: 8/466, loss: 0.0011327445972710848 2023-01-22 16:27:14.184436: step: 10/466, loss: 0.010663919150829315 2023-01-22 16:27:14.932344: step: 12/466, loss: 0.004768662620335817 2023-01-22 16:27:15.735107: step: 14/466, loss: 2.741898775100708 2023-01-22 16:27:16.530198: step: 16/466, loss: 0.03675287589430809 2023-01-22 
16:27:17.251247: step: 18/466, loss: 0.022872446104884148 2023-01-22 16:27:18.045381: step: 20/466, loss: 0.011350450105965137 2023-01-22 16:27:18.804280: step: 22/466, loss: 0.11710529774427414 2023-01-22 16:27:19.583066: step: 24/466, loss: 0.01858246885240078 2023-01-22 16:27:20.250915: step: 26/466, loss: 0.0015298037324100733 2023-01-22 16:27:20.948050: step: 28/466, loss: 0.053046926856040955 2023-01-22 16:27:21.681205: step: 30/466, loss: 0.009396509267389774 2023-01-22 16:27:22.447678: step: 32/466, loss: 0.0038175564259290695 2023-01-22 16:27:23.110254: step: 34/466, loss: 0.00033454614458605647 2023-01-22 16:27:23.892376: step: 36/466, loss: 0.03061281517148018 2023-01-22 16:27:24.650203: step: 38/466, loss: 0.00020051149476785213 2023-01-22 16:27:25.338420: step: 40/466, loss: 0.14205829799175262 2023-01-22 16:27:26.161015: step: 42/466, loss: 0.000653352471999824 2023-01-22 16:27:26.887603: step: 44/466, loss: 0.0044058533385396 2023-01-22 16:27:27.617968: step: 46/466, loss: 0.020861299708485603 2023-01-22 16:27:28.359566: step: 48/466, loss: 0.011016382835805416 2023-01-22 16:27:29.172091: step: 50/466, loss: 0.026647163555026054 2023-01-22 16:27:29.912643: step: 52/466, loss: 0.00011628265929175541 2023-01-22 16:27:30.609737: step: 54/466, loss: 0.004194340668618679 2023-01-22 16:27:31.368643: step: 56/466, loss: 0.0391090102493763 2023-01-22 16:27:32.064306: step: 58/466, loss: 0.009294603951275349 2023-01-22 16:27:32.852000: step: 60/466, loss: 0.029620099812746048 2023-01-22 16:27:33.612456: step: 62/466, loss: 0.00043678830843418837 2023-01-22 16:27:34.379185: step: 64/466, loss: 0.016344642266631126 2023-01-22 16:27:35.108836: step: 66/466, loss: 0.01553379651159048 2023-01-22 16:27:35.826427: step: 68/466, loss: 0.010907202027738094 2023-01-22 16:27:36.577316: step: 70/466, loss: 0.005213810596615076 2023-01-22 16:27:37.401677: step: 72/466, loss: 0.05356352776288986 2023-01-22 16:27:38.140943: step: 74/466, loss: 0.000370173278497532 
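The scorer itself is not shown in this log, but the epoch-34 evaluation numbers above are mutually consistent with each `f1` being the harmonic mean of `p` and `r`, and each `combined` score being the product of the template f1 and the slot f1. A minimal sketch checking that inferred relationship (an assumption, not the training script's actual code) against the "Dev Chinese" entry of epoch 34:

```python
# Inferred from the logged numbers, not taken from train.py:
# f1 = harmonic mean of p and r; combined = template_f1 * slot_f1.

def f1(p: float, r: float) -> float:
    # Standard F1; returns 0.0 when p + r == 0 to avoid division by zero.
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Values copied from the "Dev Chinese" evaluation entry of epoch 34.
template_f1 = f1(1.0, 0.5833333333333334)
slot_f1 = f1(0.293395902090209, 0.3373774509803921)
combined = template_f1 * slot_f1

assert abs(template_f1 - 0.7368421052631579) < 1e-9
assert abs(slot_f1 - 0.31385333921741687) < 1e-9
assert abs(combined - 0.23126035521283347) < 1e-9
```

The same product relationship holds for the other logged entries, e.g. Sample Korean: 0.5 × 0.3055555555555556 = 0.1527777777777778.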
2023-01-22 16:27:38.875380: step: 76/466, loss: 9.902358578983694e-05 2023-01-22 16:27:39.589548: step: 78/466, loss: 0.004199262708425522 2023-01-22 16:27:40.343688: step: 80/466, loss: 0.0001537224161438644 2023-01-22 16:27:41.039407: step: 82/466, loss: 0.0006503834738396108 2023-01-22 16:27:41.821821: step: 84/466, loss: 0.021211711689829826 2023-01-22 16:27:42.630177: step: 86/466, loss: 0.018842682242393494 2023-01-22 16:27:43.454031: step: 88/466, loss: 1.0714588165283203 2023-01-22 16:27:44.270494: step: 90/466, loss: 0.0007923658122308552 2023-01-22 16:27:45.031545: step: 92/466, loss: 0.004402614664286375 2023-01-22 16:27:45.795418: step: 94/466, loss: 0.14788812398910522 2023-01-22 16:27:46.541264: step: 96/466, loss: 0.8132479786872864 2023-01-22 16:27:47.280981: step: 98/466, loss: 0.0061640930362045765 2023-01-22 16:27:48.133350: step: 100/466, loss: 0.005263158120214939 2023-01-22 16:27:48.967588: step: 102/466, loss: 0.004776317160576582 2023-01-22 16:27:49.734426: step: 104/466, loss: 0.013818934559822083 2023-01-22 16:27:50.436327: step: 106/466, loss: 0.006237700115889311 2023-01-22 16:27:51.148382: step: 108/466, loss: 0.0023206511978060007 2023-01-22 16:27:51.948893: step: 110/466, loss: 0.009178460575640202 2023-01-22 16:27:52.662674: step: 112/466, loss: 0.009001265279948711 2023-01-22 16:27:53.428505: step: 114/466, loss: 0.03055627830326557 2023-01-22 16:27:54.198662: step: 116/466, loss: 0.0008209957159124315 2023-01-22 16:27:54.990827: step: 118/466, loss: 0.00911964476108551 2023-01-22 16:27:55.708013: step: 120/466, loss: 5.4802025260869414e-05 2023-01-22 16:27:56.391933: step: 122/466, loss: 0.008433597162365913 2023-01-22 16:27:57.151122: step: 124/466, loss: 0.0014055280480533838 2023-01-22 16:27:57.908699: step: 126/466, loss: 0.00755557045340538 2023-01-22 16:27:58.576558: step: 128/466, loss: 0.003355384338647127 2023-01-22 16:27:59.317522: step: 130/466, loss: 0.003536886302754283 2023-01-22 16:28:00.046153: step: 132/466, loss: 
0.06227538734674454 2023-01-22 16:28:00.828754: step: 134/466, loss: 0.009922748431563377 2023-01-22 16:28:01.571730: step: 136/466, loss: 0.00438820431008935 2023-01-22 16:28:02.371645: step: 138/466, loss: 0.01125524751842022 2023-01-22 16:28:03.126121: step: 140/466, loss: 0.008828779682517052 2023-01-22 16:28:03.811277: step: 142/466, loss: 0.007119577843695879 2023-01-22 16:28:04.619245: step: 144/466, loss: 0.015063256956636906 2023-01-22 16:28:05.420932: step: 146/466, loss: 0.08099152892827988 2023-01-22 16:28:06.146707: step: 148/466, loss: 0.004164229147136211 2023-01-22 16:28:06.969399: step: 150/466, loss: 0.004260669928044081 2023-01-22 16:28:07.862360: step: 152/466, loss: 0.017557373270392418 2023-01-22 16:28:08.693522: step: 154/466, loss: 0.021458711475133896 2023-01-22 16:28:09.468282: step: 156/466, loss: 0.01932448521256447 2023-01-22 16:28:10.178764: step: 158/466, loss: 0.012501021847128868 2023-01-22 16:28:10.996193: step: 160/466, loss: 0.00018512348469812423 2023-01-22 16:28:11.819325: step: 162/466, loss: 0.012105180881917477 2023-01-22 16:28:12.620359: step: 164/466, loss: 0.04477028176188469 2023-01-22 16:28:13.488103: step: 166/466, loss: 0.008865798823535442 2023-01-22 16:28:14.257438: step: 168/466, loss: 0.0018770851893350482 2023-01-22 16:28:14.964740: step: 170/466, loss: 0.002944796811789274 2023-01-22 16:28:15.784157: step: 172/466, loss: 0.005974494852125645 2023-01-22 16:28:16.487181: step: 174/466, loss: 1.3817396393278614e-05 2023-01-22 16:28:17.162082: step: 176/466, loss: 0.022418662905693054 2023-01-22 16:28:17.932197: step: 178/466, loss: 0.0034530037082731724 2023-01-22 16:28:18.633357: step: 180/466, loss: 0.0007848286768421531 2023-01-22 16:28:19.410060: step: 182/466, loss: 0.03065107949078083 2023-01-22 16:28:20.104772: step: 184/466, loss: 0.024792812764644623 2023-01-22 16:28:20.849125: step: 186/466, loss: 0.003852969268336892 2023-01-22 16:28:21.611750: step: 188/466, loss: 0.0042743380181491375 2023-01-22 
16:28:22.316210: step: 190/466, loss: 0.0022542106453329325 2023-01-22 16:28:23.053312: step: 192/466, loss: 0.029596656560897827 2023-01-22 16:28:23.794593: step: 194/466, loss: 0.0017251368844881654 2023-01-22 16:28:24.476669: step: 196/466, loss: 0.08572795987129211 2023-01-22 16:28:25.281997: step: 198/466, loss: 0.005135236773639917 2023-01-22 16:28:26.051649: step: 200/466, loss: 8.483673445880413e-05 2023-01-22 16:28:26.809284: step: 202/466, loss: 0.002134964568540454 2023-01-22 16:28:27.620775: step: 204/466, loss: 0.022524390369653702 2023-01-22 16:28:28.418707: step: 206/466, loss: 0.014563613571226597 2023-01-22 16:28:29.245331: step: 208/466, loss: 0.06287034600973129 2023-01-22 16:28:30.073303: step: 210/466, loss: 0.015115310437977314 2023-01-22 16:28:30.822896: step: 212/466, loss: 0.0005720432964153588 2023-01-22 16:28:31.530380: step: 214/466, loss: 0.003158562583848834 2023-01-22 16:28:32.306260: step: 216/466, loss: 0.08497099578380585 2023-01-22 16:28:33.012286: step: 218/466, loss: 0.000755978049710393 2023-01-22 16:28:33.767014: step: 220/466, loss: 0.0012348828604444861 2023-01-22 16:28:34.507918: step: 222/466, loss: 0.0018827618332579732 2023-01-22 16:28:35.249148: step: 224/466, loss: 0.001642904942855239 2023-01-22 16:28:36.005124: step: 226/466, loss: 0.0006373273790813982 2023-01-22 16:28:36.810503: step: 228/466, loss: 0.0010114375036209822 2023-01-22 16:28:37.582302: step: 230/466, loss: 0.016211893409490585 2023-01-22 16:28:38.317717: step: 232/466, loss: 0.00019392998365219682 2023-01-22 16:28:39.087615: step: 234/466, loss: 0.00047139558591879904 2023-01-22 16:28:39.777122: step: 236/466, loss: 0.013145842589437962 2023-01-22 16:28:40.448581: step: 238/466, loss: 0.00024978467263281345 2023-01-22 16:28:41.164870: step: 240/466, loss: 0.00021391009795479476 2023-01-22 16:28:41.955354: step: 242/466, loss: 0.0013800224987789989 2023-01-22 16:28:42.707322: step: 244/466, loss: 0.012048288248479366 2023-01-22 16:28:43.475155: step: 
246/466, loss: 0.0030543492175638676 2023-01-22 16:28:44.265845: step: 248/466, loss: 0.010977456346154213 2023-01-22 16:28:45.016298: step: 250/466, loss: 0.00030336013878695667 2023-01-22 16:28:45.771793: step: 252/466, loss: 0.018870405852794647 2023-01-22 16:28:46.539129: step: 254/466, loss: 0.0030181799083948135 2023-01-22 16:28:47.306757: step: 256/466, loss: 0.0009396415553055704 2023-01-22 16:28:48.102349: step: 258/466, loss: 0.0036075881216675043 2023-01-22 16:28:48.889547: step: 260/466, loss: 0.01562928967177868 2023-01-22 16:28:49.699538: step: 262/466, loss: 0.007501538842916489 2023-01-22 16:28:50.498417: step: 264/466, loss: 0.018335195258259773 2023-01-22 16:28:51.274218: step: 266/466, loss: 0.0022711025085300207 2023-01-22 16:28:52.053882: step: 268/466, loss: 0.007673630956560373 2023-01-22 16:28:52.856799: step: 270/466, loss: 0.00035795767325907946 2023-01-22 16:28:53.529377: step: 272/466, loss: 0.0024007440079003572 2023-01-22 16:28:54.330994: step: 274/466, loss: 0.009524238295853138 2023-01-22 16:28:55.011663: step: 276/466, loss: 0.0010717433178797364 2023-01-22 16:28:55.749397: step: 278/466, loss: 0.0006339615792967379 2023-01-22 16:28:56.506868: step: 280/466, loss: 0.00019898975733667612 2023-01-22 16:28:57.242532: step: 282/466, loss: 0.00895773060619831 2023-01-22 16:28:57.943669: step: 284/466, loss: 0.0007097829948179424 2023-01-22 16:28:58.807764: step: 286/466, loss: 5.67880088055972e-05 2023-01-22 16:28:59.584320: step: 288/466, loss: 0.00515703996643424 2023-01-22 16:29:00.411243: step: 290/466, loss: 0.021021075546741486 2023-01-22 16:29:01.160622: step: 292/466, loss: 0.01693740487098694 2023-01-22 16:29:01.946958: step: 294/466, loss: 8.952400821726769e-05 2023-01-22 16:29:02.746884: step: 296/466, loss: 0.0016920892521739006 2023-01-22 16:29:03.547329: step: 298/466, loss: 0.005159687716513872 2023-01-22 16:29:04.279641: step: 300/466, loss: 0.01627163589000702 2023-01-22 16:29:05.083938: step: 302/466, loss: 
0.00884185079485178 2023-01-22 16:29:05.762426: step: 304/466, loss: 0.0006348793976940215 2023-01-22 16:29:06.536620: step: 306/466, loss: 0.00011097556125605479 2023-01-22 16:29:07.289421: step: 308/466, loss: 0.021979600191116333 2023-01-22 16:29:08.025451: step: 310/466, loss: 0.0020497848745435476 2023-01-22 16:29:08.797231: step: 312/466, loss: 0.000507302291225642 2023-01-22 16:29:09.625999: step: 314/466, loss: 0.024476177990436554 2023-01-22 16:29:10.327321: step: 316/466, loss: 0.0002447162114549428 2023-01-22 16:29:11.101802: step: 318/466, loss: 0.00038294721161946654 2023-01-22 16:29:11.797549: step: 320/466, loss: 0.0008296699961647391 2023-01-22 16:29:12.469028: step: 322/466, loss: 2.2966776214161655e-06 2023-01-22 16:29:13.225131: step: 324/466, loss: 0.023119645193219185 2023-01-22 16:29:13.982434: step: 326/466, loss: 0.1485862284898758 2023-01-22 16:29:14.804650: step: 328/466, loss: 0.002185945864766836 2023-01-22 16:29:15.559242: step: 330/466, loss: 0.008323568850755692 2023-01-22 16:29:16.281339: step: 332/466, loss: 0.006141643971204758 2023-01-22 16:29:16.948979: step: 334/466, loss: 0.021487636491656303 2023-01-22 16:29:17.672944: step: 336/466, loss: 0.0013316937256604433 2023-01-22 16:29:18.411219: step: 338/466, loss: 0.051021821796894073 2023-01-22 16:29:19.167247: step: 340/466, loss: 0.005893372930586338 2023-01-22 16:29:19.905052: step: 342/466, loss: 0.007925250567495823 2023-01-22 16:29:20.609213: step: 344/466, loss: 0.0031972848810255527 2023-01-22 16:29:21.319073: step: 346/466, loss: 0.010418121702969074 2023-01-22 16:29:22.281473: step: 348/466, loss: 0.0009884964674711227 2023-01-22 16:29:22.980058: step: 350/466, loss: 7.996588465175591e-06 2023-01-22 16:29:23.812413: step: 352/466, loss: 0.008195837959647179 2023-01-22 16:29:24.560850: step: 354/466, loss: 0.0049437265843153 2023-01-22 16:29:25.266627: step: 356/466, loss: 8.494260691804811e-05 2023-01-22 16:29:25.956428: step: 358/466, loss: 4.478584014577791e-05 
2023-01-22 16:29:26.779536: step: 360/466, loss: 0.47110632061958313 2023-01-22 16:29:27.526796: step: 362/466, loss: 0.007744072936475277 2023-01-22 16:29:28.222182: step: 364/466, loss: 0.01543444860726595 2023-01-22 16:29:28.957211: step: 366/466, loss: 0.0016744795721024275 2023-01-22 16:29:29.715485: step: 368/466, loss: 0.001099259126931429 2023-01-22 16:29:30.492093: step: 370/466, loss: 9.564329957356676e-05 2023-01-22 16:29:31.293240: step: 372/466, loss: 0.0027040676213800907 2023-01-22 16:29:32.014737: step: 374/466, loss: 4.3052299588453025e-05 2023-01-22 16:29:32.790701: step: 376/466, loss: 0.00027351133758202195 2023-01-22 16:29:33.512369: step: 378/466, loss: 0.006958000361919403 2023-01-22 16:29:34.221019: step: 380/466, loss: 0.00037304943543858826 2023-01-22 16:29:34.983871: step: 382/466, loss: 6.726358697051182e-05 2023-01-22 16:29:35.729782: step: 384/466, loss: 0.0019613406620919704 2023-01-22 16:29:36.612598: step: 386/466, loss: 0.042300328612327576 2023-01-22 16:29:37.430329: step: 388/466, loss: 0.023403432220220566 2023-01-22 16:29:38.142226: step: 390/466, loss: 0.0028738633263856173 2023-01-22 16:29:38.937507: step: 392/466, loss: 0.006875161547213793 2023-01-22 16:29:39.645394: step: 394/466, loss: 0.003601239761337638 2023-01-22 16:29:40.353485: step: 396/466, loss: 0.007054249756038189 2023-01-22 16:29:41.116607: step: 398/466, loss: 0.006960573140531778 2023-01-22 16:29:41.845703: step: 400/466, loss: 0.001701996778137982 2023-01-22 16:29:42.663609: step: 402/466, loss: 0.012502540834248066 2023-01-22 16:29:43.380557: step: 404/466, loss: 0.0007834290154278278 2023-01-22 16:29:44.173898: step: 406/466, loss: 1.070505142211914 2023-01-22 16:29:44.987187: step: 408/466, loss: 0.21781472861766815 2023-01-22 16:29:45.706602: step: 410/466, loss: 0.0198195967823267 2023-01-22 16:29:46.446804: step: 412/466, loss: 0.0037474900018423796 2023-01-22 16:29:47.199660: step: 414/466, loss: 0.000687834806740284 2023-01-22 16:29:47.978850: step: 
416/466, loss: 0.028054993599653244 2023-01-22 16:29:48.767571: step: 418/466, loss: 0.016235293820500374 2023-01-22 16:29:49.535336: step: 420/466, loss: 0.0015493419487029314 2023-01-22 16:29:50.279848: step: 422/466, loss: 0.02799062617123127 2023-01-22 16:29:50.942257: step: 424/466, loss: 0.002889692084863782 2023-01-22 16:29:51.682219: step: 426/466, loss: 0.026743553578853607 2023-01-22 16:29:52.417178: step: 428/466, loss: 0.005009463522583246 2023-01-22 16:29:53.181297: step: 430/466, loss: 7.741060107946396e-05 2023-01-22 16:29:53.895183: step: 432/466, loss: 0.009120047092437744 2023-01-22 16:29:54.741613: step: 434/466, loss: 0.030081048607826233 2023-01-22 16:29:55.444561: step: 436/466, loss: 0.0034114550799131393 2023-01-22 16:29:56.163771: step: 438/466, loss: 0.0009074569679796696 2023-01-22 16:29:56.910598: step: 440/466, loss: 0.0015783591661602259 2023-01-22 16:29:57.689853: step: 442/466, loss: 0.0004006644303444773 2023-01-22 16:29:58.421711: step: 444/466, loss: 0.010972261428833008 2023-01-22 16:29:59.249971: step: 446/466, loss: 0.019794577732682228 2023-01-22 16:30:00.040881: step: 448/466, loss: 0.0050488668493926525 2023-01-22 16:30:00.790052: step: 450/466, loss: 0.09048167616128922 2023-01-22 16:30:01.616427: step: 452/466, loss: 0.09162473678588867 2023-01-22 16:30:02.329088: step: 454/466, loss: 0.007137281354516745 2023-01-22 16:30:03.046993: step: 456/466, loss: 0.03742596507072449 2023-01-22 16:30:03.833699: step: 458/466, loss: 0.01613294705748558 2023-01-22 16:30:04.594643: step: 460/466, loss: 0.000966592924669385 2023-01-22 16:30:05.353903: step: 462/466, loss: 0.010030020028352737 2023-01-22 16:30:06.142641: step: 464/466, loss: 0.015448026359081268 2023-01-22 16:30:06.826054: step: 466/466, loss: 0.0014781695790588856 2023-01-22 16:30:07.607294: step: 468/466, loss: 0.06412964314222336 2023-01-22 16:30:08.365152: step: 470/466, loss: 0.002992757363244891 2023-01-22 16:30:09.001195: step: 472/466, loss: 0.00024858355754986405 
2023-01-22 16:30:09.700675: step: 474/466, loss: 0.002858598018065095 2023-01-22 16:30:10.483839: step: 476/466, loss: 0.010369786061346531 2023-01-22 16:30:11.246798: step: 478/466, loss: 0.0003029539075214416 2023-01-22 16:30:11.928174: step: 480/466, loss: 0.014699235558509827 2023-01-22 16:30:12.765880: step: 482/466, loss: 0.06207871064543724 2023-01-22 16:30:13.523614: step: 484/466, loss: 0.0003268167783971876 2023-01-22 16:30:14.234847: step: 486/466, loss: 0.04465591162443161 2023-01-22 16:30:15.022030: step: 488/466, loss: 0.010574414394795895 2023-01-22 16:30:15.696291: step: 490/466, loss: 8.157742558978498e-05 2023-01-22 16:30:16.499525: step: 492/466, loss: 0.033157918602228165 2023-01-22 16:30:17.311076: step: 494/466, loss: 0.001525003812275827 2023-01-22 16:30:18.160300: step: 496/466, loss: 8.820889343041927e-05 2023-01-22 16:30:18.978919: step: 498/466, loss: 0.05314020812511444 2023-01-22 16:30:19.803789: step: 500/466, loss: 2.079984188079834 2023-01-22 16:30:20.580182: step: 502/466, loss: 2.2129243006929755e-05 2023-01-22 16:30:21.341792: step: 504/466, loss: 0.03799187391996384 2023-01-22 16:30:22.049408: step: 506/466, loss: 0.0069936420768499374 2023-01-22 16:30:22.811031: step: 508/466, loss: 0.08669064193964005 2023-01-22 16:30:23.525787: step: 510/466, loss: 0.0012966989306733012 2023-01-22 16:30:24.236948: step: 512/466, loss: 0.7818902730941772 2023-01-22 16:30:24.961465: step: 514/466, loss: 0.00016538219642825425 2023-01-22 16:30:25.708855: step: 516/466, loss: 0.017313748598098755 2023-01-22 16:30:26.436217: step: 518/466, loss: 0.0070531475357711315 2023-01-22 16:30:27.169900: step: 520/466, loss: 0.00020073176710866392 2023-01-22 16:30:27.919471: step: 522/466, loss: 0.003936620429158211 2023-01-22 16:30:28.687087: step: 524/466, loss: 0.0012498348951339722 2023-01-22 16:30:29.436150: step: 526/466, loss: 0.025787923485040665 2023-01-22 16:30:30.089534: step: 528/466, loss: 0.00017150120402220637 2023-01-22 16:30:30.771408: step: 
530/466, loss: 0.008530835621058941 2023-01-22 16:30:31.581713: step: 532/466, loss: 0.0306110680103302 2023-01-22 16:30:32.370464: step: 534/466, loss: 0.013904067687690258 2023-01-22 16:30:33.201710: step: 536/466, loss: 0.2676345109939575 2023-01-22 16:30:33.988812: step: 538/466, loss: 0.029973825439810753 2023-01-22 16:30:34.787153: step: 540/466, loss: 0.0027498030103743076 2023-01-22 16:30:35.598118: step: 542/466, loss: 0.0015113947447389364 2023-01-22 16:30:36.399008: step: 544/466, loss: 0.01900675520300865 2023-01-22 16:30:37.127893: step: 546/466, loss: 0.0037392114754766226 2023-01-22 16:30:37.864721: step: 548/466, loss: 0.0005670114187523723 2023-01-22 16:30:38.647578: step: 550/466, loss: 0.012769817374646664 2023-01-22 16:30:39.430789: step: 552/466, loss: 0.006172207649797201 2023-01-22 16:30:40.170902: step: 554/466, loss: 0.07989545166492462 2023-01-22 16:30:40.948041: step: 556/466, loss: 0.0014416680205613375 2023-01-22 16:30:41.701160: step: 558/466, loss: 0.0365539975464344 2023-01-22 16:30:42.437210: step: 560/466, loss: 0.02450944483280182 2023-01-22 16:30:43.140945: step: 562/466, loss: 0.0024222354404628277 2023-01-22 16:30:43.860587: step: 564/466, loss: 0.023588059470057487 2023-01-22 16:30:44.564697: step: 566/466, loss: 0.014570425264537334 2023-01-22 16:30:45.282214: step: 568/466, loss: 0.004594041034579277 2023-01-22 16:30:46.102757: step: 570/466, loss: 5.841199163114652e-05 2023-01-22 16:30:46.932627: step: 572/466, loss: 0.11277756839990616 2023-01-22 16:30:47.672622: step: 574/466, loss: 0.0013348526554182172 2023-01-22 16:30:48.365468: step: 576/466, loss: 0.0006081808242015541 2023-01-22 16:30:49.073090: step: 578/466, loss: 0.0010401842882856727 2023-01-22 16:30:49.828490: step: 580/466, loss: 0.03048798255622387 2023-01-22 16:30:50.601749: step: 582/466, loss: 0.011533768847584724 2023-01-22 16:30:51.327191: step: 584/466, loss: 0.02726082131266594 2023-01-22 16:30:52.062751: step: 586/466, loss: 0.003118099644780159 
2023-01-22 16:30:52.777370: step: 588/466, loss: 0.004390220623463392 2023-01-22 16:30:53.581168: step: 590/466, loss: 0.004966802895069122 2023-01-22 16:30:54.325065: step: 592/466, loss: 0.0012968675000593066 2023-01-22 16:30:55.032618: step: 594/466, loss: 0.001784640597179532 2023-01-22 16:30:55.855979: step: 596/466, loss: 0.00022365168842952698 2023-01-22 16:30:56.621711: step: 598/466, loss: 0.33438318967819214 2023-01-22 16:30:57.605843: step: 600/466, loss: 3.4098749893018976e-05 2023-01-22 16:30:58.394379: step: 602/466, loss: 0.43303313851356506 2023-01-22 16:30:59.094413: step: 604/466, loss: 0.0001014567751553841 2023-01-22 16:30:59.865320: step: 606/466, loss: 0.00971634965389967 2023-01-22 16:31:00.597775: step: 608/466, loss: 0.0004996024654246867 2023-01-22 16:31:01.367164: step: 610/466, loss: 0.0024302061647176743 2023-01-22 16:31:02.145601: step: 612/466, loss: 0.0526101216673851 2023-01-22 16:31:02.944411: step: 614/466, loss: 0.00908119697123766 2023-01-22 16:31:03.692387: step: 616/466, loss: 0.0009205329697579145 2023-01-22 16:31:04.420513: step: 618/466, loss: 0.02464241161942482 2023-01-22 16:31:05.331681: step: 620/466, loss: 0.006661287043243647 2023-01-22 16:31:06.065114: step: 622/466, loss: 0.024558162316679955 2023-01-22 16:31:06.730073: step: 624/466, loss: 0.0010383835760876536 2023-01-22 16:31:07.553395: step: 626/466, loss: 0.0002190878294641152 2023-01-22 16:31:08.275426: step: 628/466, loss: 0.01551087200641632 2023-01-22 16:31:09.015645: step: 630/466, loss: 0.013792922720313072 2023-01-22 16:31:09.713198: step: 632/466, loss: 0.011820226907730103 2023-01-22 16:31:10.443144: step: 634/466, loss: 0.009754060767591 2023-01-22 16:31:11.231258: step: 636/466, loss: 0.0007316062110476196 2023-01-22 16:31:11.993201: step: 638/466, loss: 0.007642973214387894 2023-01-22 16:31:12.688225: step: 640/466, loss: 0.001830677385441959 2023-01-22 16:31:13.528091: step: 642/466, loss: 0.0008155608084052801 2023-01-22 16:31:14.233213: step: 
644/466, loss: 0.002872632583603263 2023-01-22 16:31:14.939802: step: 646/466, loss: 0.00423818826675415 2023-01-22 16:31:15.724603: step: 648/466, loss: 0.0069534857757389545 2023-01-22 16:31:16.511286: step: 650/466, loss: 0.0025988135021179914 2023-01-22 16:31:17.281332: step: 652/466, loss: 0.02148084156215191 2023-01-22 16:31:18.063902: step: 654/466, loss: 0.003581245429813862 2023-01-22 16:31:18.768446: step: 656/466, loss: 0.0006101642502471805 2023-01-22 16:31:19.659283: step: 658/466, loss: 0.013008118607103825 2023-01-22 16:31:20.491155: step: 660/466, loss: 0.011089122854173183 2023-01-22 16:31:21.277519: step: 662/466, loss: 7.312051457120106e-05 2023-01-22 16:31:22.112590: step: 664/466, loss: 0.000744556135032326 2023-01-22 16:31:22.843270: step: 666/466, loss: 0.0013610776513814926 2023-01-22 16:31:23.584849: step: 668/466, loss: 1.1327815055847168 2023-01-22 16:31:24.277229: step: 670/466, loss: 0.0048438976518809795 2023-01-22 16:31:25.022875: step: 672/466, loss: 8.07375690783374e-05 2023-01-22 16:31:25.747407: step: 674/466, loss: 0.041891444474458694 2023-01-22 16:31:26.535332: step: 676/466, loss: 0.0002708226384129375 2023-01-22 16:31:27.267152: step: 678/466, loss: 0.0003661640512291342 2023-01-22 16:31:27.990538: step: 680/466, loss: 0.0008033191552385688 2023-01-22 16:31:28.706429: step: 682/466, loss: 0.0003179586201440543 2023-01-22 16:31:29.383724: step: 684/466, loss: 0.04391786456108093 2023-01-22 16:31:30.281475: step: 686/466, loss: 0.0032935445196926594 2023-01-22 16:31:31.058165: step: 688/466, loss: 0.0052387891337275505 2023-01-22 16:31:31.904307: step: 690/466, loss: 0.031956274062395096 2023-01-22 16:31:32.681848: step: 692/466, loss: 0.032561078667640686 2023-01-22 16:31:33.420946: step: 694/466, loss: 0.030473001301288605 2023-01-22 16:31:34.205801: step: 696/466, loss: 0.004874146543443203 2023-01-22 16:31:34.899790: step: 698/466, loss: 0.00477445125579834 2023-01-22 16:31:35.716724: step: 700/466, loss: 0.0238727405667305 
2023-01-22 16:31:36.358839: step: 702/466, loss: 0.001801260863430798 2023-01-22 16:31:37.041422: step: 704/466, loss: 0.0008925177971832454 2023-01-22 16:31:37.785841: step: 706/466, loss: 0.00024452884099446237 2023-01-22 16:31:38.533219: step: 708/466, loss: 0.0016748437192291021 2023-01-22 16:31:39.360761: step: 710/466, loss: 0.00023085040447767824 2023-01-22 16:31:40.027653: step: 712/466, loss: 0.008219665847718716 2023-01-22 16:31:40.898168: step: 714/466, loss: 0.07780952006578445 2023-01-22 16:31:41.825809: step: 716/466, loss: 0.025093533098697662 2023-01-22 16:31:42.523753: step: 718/466, loss: 0.002227090997621417 2023-01-22 16:31:43.275091: step: 720/466, loss: 0.042245469987392426 2023-01-22 16:31:44.118692: step: 722/466, loss: 0.03828979283571243 2023-01-22 16:31:44.881871: step: 724/466, loss: 0.0036812573671340942 2023-01-22 16:31:45.653598: step: 726/466, loss: 0.01815827749669552 2023-01-22 16:31:46.463103: step: 728/466, loss: 0.0014923752751201391 2023-01-22 16:31:47.236095: step: 730/466, loss: 0.0003263081598561257 2023-01-22 16:31:47.961854: step: 732/466, loss: 0.012029914185404778 2023-01-22 16:31:48.728690: step: 734/466, loss: 2.6921272365143523e-05 2023-01-22 16:31:49.488168: step: 736/466, loss: 0.21548262238502502 2023-01-22 16:31:50.314472: step: 738/466, loss: 0.0009322597761638463 2023-01-22 16:31:50.975043: step: 740/466, loss: 9.577017044648528e-05 2023-01-22 16:31:51.622649: step: 742/466, loss: 0.012453495524823666 2023-01-22 16:31:52.338866: step: 744/466, loss: 0.0031533828005194664 2023-01-22 16:31:53.177775: step: 746/466, loss: 0.0029514674097299576 2023-01-22 16:31:53.914115: step: 748/466, loss: 0.00029309350065886974 2023-01-22 16:31:54.750096: step: 750/466, loss: 0.033656831830739975 2023-01-22 16:31:55.489012: step: 752/466, loss: 0.009833025746047497 2023-01-22 16:31:56.192323: step: 754/466, loss: 0.020311174914240837 2023-01-22 16:31:56.906210: step: 756/466, loss: 0.007364567369222641 2023-01-22 
16:31:57.643824: step: 758/466, loss: 0.03312944993376732 2023-01-22 16:31:58.441989: step: 760/466, loss: 0.002991229295730591 2023-01-22 16:31:59.190648: step: 762/466, loss: 0.0003226816188544035 2023-01-22 16:31:59.881652: step: 764/466, loss: 0.0016692840727046132 2023-01-22 16:32:00.550706: step: 766/466, loss: 0.0013053424190729856 2023-01-22 16:32:01.365671: step: 768/466, loss: 0.0428982712328434 2023-01-22 16:32:02.259993: step: 770/466, loss: 0.7088618278503418 2023-01-22 16:32:03.012924: step: 772/466, loss: 0.014309249818325043 2023-01-22 16:32:03.727574: step: 774/466, loss: 0.037123847752809525 2023-01-22 16:32:04.483653: step: 776/466, loss: 0.19105984270572662 2023-01-22 16:32:05.141406: step: 778/466, loss: 0.0012500167358666658 2023-01-22 16:32:05.879451: step: 780/466, loss: 0.01712871342897415 2023-01-22 16:32:06.556514: step: 782/466, loss: 0.005755425896495581 2023-01-22 16:32:07.352982: step: 784/466, loss: 0.012732706032693386 2023-01-22 16:32:08.143235: step: 786/466, loss: 0.010109102353453636 2023-01-22 16:32:08.855866: step: 788/466, loss: 0.8606966733932495 2023-01-22 16:32:09.598674: step: 790/466, loss: 0.012393337674438953 2023-01-22 16:32:10.452103: step: 792/466, loss: 0.0015637580072507262 2023-01-22 16:32:11.210875: step: 794/466, loss: 0.0003238733916077763 2023-01-22 16:32:11.895231: step: 796/466, loss: 0.007022204343229532 2023-01-22 16:32:12.685787: step: 798/466, loss: 0.017238151282072067 2023-01-22 16:32:13.448262: step: 800/466, loss: 0.003413428319618106 2023-01-22 16:32:14.227302: step: 802/466, loss: 0.00270936731249094 2023-01-22 16:32:15.054212: step: 804/466, loss: 0.0003040628507733345 2023-01-22 16:32:15.809983: step: 806/466, loss: 0.010912488214671612 2023-01-22 16:32:16.581559: step: 808/466, loss: 0.019548660144209862 2023-01-22 16:32:17.294893: step: 810/466, loss: 0.0020851590670645237 2023-01-22 16:32:18.188897: step: 812/466, loss: 0.01595069281756878 2023-01-22 16:32:19.035842: step: 814/466, loss: 
0.006865901872515678 2023-01-22 16:32:19.839952: step: 816/466, loss: 0.10417701303958893 2023-01-22 16:32:20.593868: step: 818/466, loss: 0.017476389184594154 2023-01-22 16:32:21.433165: step: 820/466, loss: 0.0005563123850151896 2023-01-22 16:32:22.313491: step: 822/466, loss: 0.003879000199958682 2023-01-22 16:32:23.042656: step: 824/466, loss: 0.023705052211880684 2023-01-22 16:32:23.778242: step: 826/466, loss: 0.028963197022676468 2023-01-22 16:32:24.523498: step: 828/466, loss: 0.02524581365287304 2023-01-22 16:32:25.364060: step: 830/466, loss: 0.1719159483909607 2023-01-22 16:32:26.233552: step: 832/466, loss: 0.0010757212294265628 2023-01-22 16:32:26.954042: step: 834/466, loss: 0.025246674194931984 2023-01-22 16:32:27.702569: step: 836/466, loss: 0.020809084177017212 2023-01-22 16:32:28.494001: step: 838/466, loss: 0.012798292562365532 2023-01-22 16:32:29.252318: step: 840/466, loss: 0.0154691431671381 2023-01-22 16:32:29.991165: step: 842/466, loss: 0.004644252825528383 2023-01-22 16:32:30.712169: step: 844/466, loss: 0.00046820956049486995 2023-01-22 16:32:31.489705: step: 846/466, loss: 2.369272470474243 2023-01-22 16:32:32.314165: step: 848/466, loss: 0.006457947660237551 2023-01-22 16:32:33.036814: step: 850/466, loss: 0.020441725850105286 2023-01-22 16:32:33.830360: step: 852/466, loss: 0.03185777738690376 2023-01-22 16:32:34.637567: step: 854/466, loss: 0.0019530951976776123 2023-01-22 16:32:35.454129: step: 856/466, loss: 0.0021320057567209005 2023-01-22 16:32:36.201423: step: 858/466, loss: 0.6590021252632141 2023-01-22 16:32:36.897402: step: 860/466, loss: 0.028135813772678375 2023-01-22 16:32:37.716234: step: 862/466, loss: 0.00792783871293068 2023-01-22 16:32:38.465296: step: 864/466, loss: 0.030260713770985603 2023-01-22 16:32:39.171649: step: 866/466, loss: 0.010048595257103443 2023-01-22 16:32:39.966610: step: 868/466, loss: 0.020413830876350403 2023-01-22 16:32:40.755583: step: 870/466, loss: 0.006550361402332783 2023-01-22 
16:32:41.407937: step: 872/466, loss: 0.003479516599327326 2023-01-22 16:32:42.129603: step: 874/466, loss: 0.1767842024564743 2023-01-22 16:32:42.863141: step: 876/466, loss: 0.00718360161408782 2023-01-22 16:32:43.672229: step: 878/466, loss: 0.0019658098462969065 2023-01-22 16:32:44.422662: step: 880/466, loss: 0.04372788965702057 2023-01-22 16:32:45.153691: step: 882/466, loss: 0.15487949550151825 2023-01-22 16:32:45.842753: step: 884/466, loss: 0.0005945255979895592 2023-01-22 16:32:46.568125: step: 886/466, loss: 0.0019250859040766954 2023-01-22 16:32:47.344172: step: 888/466, loss: 0.011378058232367039 2023-01-22 16:32:48.061309: step: 890/466, loss: 0.0009098451700992882 2023-01-22 16:32:48.841993: step: 892/466, loss: 0.20266282558441162 2023-01-22 16:32:49.657399: step: 894/466, loss: 0.0019502755021676421 2023-01-22 16:32:50.401555: step: 896/466, loss: 0.0023257662542164326 2023-01-22 16:32:51.073500: step: 898/466, loss: 0.002615791978314519 2023-01-22 16:32:51.807382: step: 900/466, loss: 0.6581990122795105 2023-01-22 16:32:52.565453: step: 902/466, loss: 0.023657280951738358 2023-01-22 16:32:53.449360: step: 904/466, loss: 0.02864796854555607 2023-01-22 16:32:54.188302: step: 906/466, loss: 0.0019437010632827878 2023-01-22 16:32:54.953910: step: 908/466, loss: 0.012087050825357437 2023-01-22 16:32:55.709198: step: 910/466, loss: 0.02876484952867031 2023-01-22 16:32:56.498508: step: 912/466, loss: 0.007884223945438862 2023-01-22 16:32:57.108882: step: 914/466, loss: 0.0017012872267514467 2023-01-22 16:32:57.812070: step: 916/466, loss: 0.0029429281130433083 2023-01-22 16:32:58.555033: step: 918/466, loss: 0.08260404318571091 2023-01-22 16:32:59.306327: step: 920/466, loss: 0.022113988175988197 2023-01-22 16:33:00.033448: step: 922/466, loss: 0.0013095721369609237 2023-01-22 16:33:00.702693: step: 924/466, loss: 0.001993898767977953 2023-01-22 16:33:01.570757: step: 926/466, loss: 0.0019014434656128287 2023-01-22 16:33:02.269475: step: 928/466, loss: 
0.004830385558307171 2023-01-22 16:33:02.996096: step: 930/466, loss: 0.0004450043197721243 2023-01-22 16:33:03.665369: step: 932/466, loss: 0.0031805401667952538
==================================================
Loss: 0.051
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3035618279569892, 'r': 0.33927498418722324, 'f1': 0.3204263739545997}, 'combined': 0.23610364396654715, 'epoch': 35}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3473702817047227, 'r': 0.3154627861581884, 'f1': 0.3306485515227515}, 'combined': 0.20322789020422777, 'epoch': 35}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.27562587535014005, 'r': 0.33681795773337797, 'f1': 0.30316492523567923}, 'combined': 0.22338468175260573, 'epoch': 35}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3357384209870081, 'r': 0.32177356465479284, 'f1': 0.3286076934616203}, 'combined': 0.20197350915202025, 'epoch': 35}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.314317733266146, 'r': 0.33817486672088193, 'f1': 0.3258101549577784}, 'combined': 0.2400706404952051, 'epoch': 35}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3451213497303603, 'r': 0.30923351440311314, 'f1': 0.32619330495538623}, 'combined': 0.20147233541362095, 'epoch': 35}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2602040816326531, 'r': 0.36428571428571427, 'f1': 0.3035714285714286}, 'combined': 0.20238095238095238, 'epoch': 35}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2833333333333333, 'r': 0.5543478260869565, 'f1': 0.375}, 'combined': 0.1875, 'epoch': 35}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 35}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.28845615818674736, 'r': 0.34045489637980425, 'f1': 0.3123058840594549}, 'combined': 0.23012012509644045, 'epoch': 28}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3314220112372907, 'r': 0.30844648186208856, 'f1': 0.3195217594872982}, 'combined': 0.19638898387999792, 'epoch': 28}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.3170731707317073, 'r': 0.5652173913043478, 'f1': 0.40625}, 'combined': 0.203125, 'epoch': 28}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 36
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:35:49.571577: step: 2/466, loss: 6.959749589441344e-05 2023-01-22 16:35:50.259721: step: 4/466, loss: 0.052603382617235184 2023-01-22 16:35:51.096744: step: 6/466, loss: 0.007829297333955765 2023-01-22 16:35:51.885479: step: 8/466, loss: 0.0017035205382853746 2023-01-22 16:35:52.591848: step: 10/466, loss: 2.6053298824990634e-06 2023-01-22 16:35:53.310318: step: 12/466, loss: 0.0007733256788924336 2023-01-22 16:35:54.025785: step: 14/466, loss: 0.0004052332369610667 2023-01-22 16:35:54.808008: step: 16/466, loss: 1.8812017515301704e-05 2023-01-22 16:35:55.498193: step: 18/466, loss: 0.01579943485558033 2023-01-22 16:35:56.193878: step: 20/466, loss: 0.0032925379928201437 2023-01-22 16:35:57.058098: step: 22/466, loss: 0.09599200636148453 2023-01-22 16:35:57.872191: step: 24/466, loss: 0.03901425004005432 2023-01-22 16:35:58.694184: step: 26/466, loss: 0.00021684799867216498 2023-01-22 16:35:59.457031: step: 28/466, loss: 0.04132521525025368 2023-01-22 16:36:00.190683: step: 30/466, loss: 0.001680803601630032 2023-01-22 16:36:00.865878: step: 32/466, loss: 0.02639612928032875 2023-01-22 16:36:01.633750: step: 34/466, loss: 0.009096384048461914 2023-01-22 16:36:02.476131: step: 36/466, loss: 0.000210379614145495 2023-01-22 16:36:03.253310: step: 38/466, loss: 0.007331050466746092 2023-01-22 16:36:04.051178: step: 40/466, loss: 0.00036271740100346506 2023-01-22 16:36:04.761900: step: 42/466, loss:
0.003784711705520749 2023-01-22 16:36:05.500340: step: 44/466, loss: 0.006800322327762842 2023-01-22 16:36:06.211399: step: 46/466, loss: 0.0009321786346845329 2023-01-22 16:36:06.908339: step: 48/466, loss: 0.005665746051818132 2023-01-22 16:36:07.601450: step: 50/466, loss: 0.093864306807518 2023-01-22 16:36:08.372075: step: 52/466, loss: 0.003616434521973133 2023-01-22 16:36:09.142873: step: 54/466, loss: 0.02172437310218811 2023-01-22 16:36:09.909359: step: 56/466, loss: 0.010583080351352692 2023-01-22 16:36:10.689920: step: 58/466, loss: 0.14625628292560577 2023-01-22 16:36:11.469893: step: 60/466, loss: 0.003461432410404086 2023-01-22 16:36:12.282415: step: 62/466, loss: 0.05907382443547249 2023-01-22 16:36:13.040546: step: 64/466, loss: 0.0032302553299814463 2023-01-22 16:36:13.783297: step: 66/466, loss: 0.02619274891912937 2023-01-22 16:36:14.676282: step: 68/466, loss: 0.30181601643562317 2023-01-22 16:36:15.409372: step: 70/466, loss: 0.0006808125763200223 2023-01-22 16:36:16.192265: step: 72/466, loss: 0.0056705838069319725 2023-01-22 16:36:16.909776: step: 74/466, loss: 0.0020186409819871187 2023-01-22 16:36:17.649203: step: 76/466, loss: 0.001368081197142601 2023-01-22 16:36:18.381846: step: 78/466, loss: 0.0015106059145182371 2023-01-22 16:36:19.094997: step: 80/466, loss: 0.0026107397861778736 2023-01-22 16:36:19.839469: step: 82/466, loss: 0.0015447793994098902 2023-01-22 16:36:20.597114: step: 84/466, loss: 0.002406883519142866 2023-01-22 16:36:21.299423: step: 86/466, loss: 0.0089812520891428 2023-01-22 16:36:22.041161: step: 88/466, loss: 0.005512211471796036 2023-01-22 16:36:22.720554: step: 90/466, loss: 0.002036134712398052 2023-01-22 16:36:23.514775: step: 92/466, loss: 0.010609150864183903 2023-01-22 16:36:24.221446: step: 94/466, loss: 8.391762094106525e-05 2023-01-22 16:36:24.940804: step: 96/466, loss: 0.001140699489042163 2023-01-22 16:36:25.836857: step: 98/466, loss: 0.08036769926548004 2023-01-22 16:36:26.666175: step: 100/466, loss: 
0.08868908882141113 2023-01-22 16:36:27.463638: step: 102/466, loss: 0.03348981961607933 2023-01-22 16:36:28.215165: step: 104/466, loss: 0.004189977888017893 2023-01-22 16:36:29.058995: step: 106/466, loss: 0.27573361992836 2023-01-22 16:36:29.878303: step: 108/466, loss: 0.003680627327412367 2023-01-22 16:36:30.712739: step: 110/466, loss: 0.0007431924459524453 2023-01-22 16:36:31.443429: step: 112/466, loss: 0.0027506130281835794 2023-01-22 16:36:32.174779: step: 114/466, loss: 0.0016330421203747392 2023-01-22 16:36:32.877282: step: 116/466, loss: 0.016300100833177567 2023-01-22 16:36:33.634155: step: 118/466, loss: 3.5442230000626296e-05 2023-01-22 16:36:34.442467: step: 120/466, loss: 0.06424372643232346 2023-01-22 16:36:35.168414: step: 122/466, loss: 0.016632402315735817 2023-01-22 16:36:35.927412: step: 124/466, loss: 0.00016443005006294698 2023-01-22 16:36:36.665126: step: 126/466, loss: 0.0009986787335947156 2023-01-22 16:36:37.340483: step: 128/466, loss: 0.0072743832133710384 2023-01-22 16:36:38.117787: step: 130/466, loss: 0.0005441168905235827 2023-01-22 16:36:38.910889: step: 132/466, loss: 0.0016753192758187652 2023-01-22 16:36:39.703945: step: 134/466, loss: 0.013735419139266014 2023-01-22 16:36:40.512298: step: 136/466, loss: 0.013947011902928352 2023-01-22 16:36:41.247571: step: 138/466, loss: 7.987304707057774e-05 2023-01-22 16:36:42.048719: step: 140/466, loss: 6.819606642238796e-05 2023-01-22 16:36:42.855925: step: 142/466, loss: 0.0016738786362111568 2023-01-22 16:36:43.598452: step: 144/466, loss: 0.03004748746752739 2023-01-22 16:36:44.456663: step: 146/466, loss: 0.007570676505565643 2023-01-22 16:36:45.308432: step: 148/466, loss: 0.00446693692356348 2023-01-22 16:36:46.048936: step: 150/466, loss: 0.0004110346781089902 2023-01-22 16:36:46.810456: step: 152/466, loss: 0.0028237239457666874 2023-01-22 16:36:47.546429: step: 154/466, loss: 0.0074403490871191025 2023-01-22 16:36:48.231097: step: 156/466, loss: 0.0017213046085089445 
2023-01-22 16:36:48.931479: step: 158/466, loss: 0.004246914759278297 2023-01-22 16:36:49.782501: step: 160/466, loss: 6.689901056233793e-05 2023-01-22 16:36:50.510639: step: 162/466, loss: 0.13011637330055237 2023-01-22 16:36:51.332357: step: 164/466, loss: 0.01242199819535017 2023-01-22 16:36:52.033598: step: 166/466, loss: 0.0013810870004817843 2023-01-22 16:36:52.717694: step: 168/466, loss: 0.0005429077427834272 2023-01-22 16:36:53.445697: step: 170/466, loss: 0.02831660956144333 2023-01-22 16:36:54.151407: step: 172/466, loss: 0.17097602784633636 2023-01-22 16:36:54.915246: step: 174/466, loss: 0.17554019391536713 2023-01-22 16:36:55.622811: step: 176/466, loss: 0.0023959302343428135 2023-01-22 16:36:56.295523: step: 178/466, loss: 0.0022369904909282923 2023-01-22 16:36:57.083988: step: 180/466, loss: 0.0011512894416227937 2023-01-22 16:36:57.789653: step: 182/466, loss: 0.0018296147463843226 2023-01-22 16:36:58.501077: step: 184/466, loss: 0.00896126963198185 2023-01-22 16:36:59.282245: step: 186/466, loss: 0.014478500932455063 2023-01-22 16:37:00.164616: step: 188/466, loss: 0.016855215653777122 2023-01-22 16:37:00.933891: step: 190/466, loss: 0.00267539219930768 2023-01-22 16:37:01.708546: step: 192/466, loss: 0.006435306742787361 2023-01-22 16:37:02.662007: step: 194/466, loss: 0.006483323406428099 2023-01-22 16:37:03.508244: step: 196/466, loss: 0.0008203312754631042 2023-01-22 16:37:04.196645: step: 198/466, loss: 0.001442829379811883 2023-01-22 16:37:04.938890: step: 200/466, loss: 0.010103004053235054 2023-01-22 16:37:05.656419: step: 202/466, loss: 0.09546792507171631 2023-01-22 16:37:06.442081: step: 204/466, loss: 0.0025321985594928265 2023-01-22 16:37:07.185296: step: 206/466, loss: 0.11714517325162888 2023-01-22 16:37:07.985775: step: 208/466, loss: 0.00020245101768523455 2023-01-22 16:37:08.666181: step: 210/466, loss: 0.040933359414339066 2023-01-22 16:37:09.390975: step: 212/466, loss: 2.2363874450093135e-05 2023-01-22 16:37:10.189985: step: 
214/466, loss: 0.022035591304302216 2023-01-22 16:37:10.994336: step: 216/466, loss: 0.032153207808732986 2023-01-22 16:37:11.709709: step: 218/466, loss: 0.0018491385271772742 2023-01-22 16:37:12.429901: step: 220/466, loss: 0.0022317946422845125 2023-01-22 16:37:13.147946: step: 222/466, loss: 0.0013667675666511059 2023-01-22 16:37:13.937566: step: 224/466, loss: 0.003354973392561078 2023-01-22 16:37:14.709064: step: 226/466, loss: 0.02265651896595955 2023-01-22 16:37:15.438588: step: 228/466, loss: 0.00097334646852687 2023-01-22 16:37:16.205111: step: 230/466, loss: 0.0008900273824110627 2023-01-22 16:37:17.016315: step: 232/466, loss: 0.0070862616412341595 2023-01-22 16:37:17.688838: step: 234/466, loss: 0.0008075744262896478 2023-01-22 16:37:18.382632: step: 236/466, loss: 0.006123298313468695 2023-01-22 16:37:19.149463: step: 238/466, loss: 0.006236757151782513 2023-01-22 16:37:19.923573: step: 240/466, loss: 0.004826977849006653 2023-01-22 16:37:20.644736: step: 242/466, loss: 0.0019498377805575728 2023-01-22 16:37:21.380609: step: 244/466, loss: 0.0002522287250030786 2023-01-22 16:37:22.166826: step: 246/466, loss: 0.02219093032181263 2023-01-22 16:37:22.993562: step: 248/466, loss: 0.000634671829175204 2023-01-22 16:37:23.738206: step: 250/466, loss: 0.0149555504322052 2023-01-22 16:37:24.516609: step: 252/466, loss: 0.04596257209777832 2023-01-22 16:37:25.244989: step: 254/466, loss: 0.022731617093086243 2023-01-22 16:37:25.987806: step: 256/466, loss: 0.009812143631279469 2023-01-22 16:37:26.726476: step: 258/466, loss: 0.10636556148529053 2023-01-22 16:37:27.568610: step: 260/466, loss: 0.0033983385656028986 2023-01-22 16:37:28.291684: step: 262/466, loss: 0.01362368743866682 2023-01-22 16:37:29.037526: step: 264/466, loss: 0.06555137783288956 2023-01-22 16:37:29.829412: step: 266/466, loss: 0.014873780310153961 2023-01-22 16:37:30.547208: step: 268/466, loss: 0.02528764307498932 2023-01-22 16:37:31.292187: step: 270/466, loss: 0.0005162880406714976 
2023-01-22 16:37:32.088152: step: 272/466, loss: 0.006696953438222408 2023-01-22 16:37:32.807253: step: 274/466, loss: 0.01726466603577137 2023-01-22 16:37:33.561711: step: 276/466, loss: 0.012693598866462708 2023-01-22 16:37:34.324737: step: 278/466, loss: 1.0962568521499634 2023-01-22 16:37:35.090406: step: 280/466, loss: 0.0010377082508057356 2023-01-22 16:37:35.855699: step: 282/466, loss: 0.003022135468199849 2023-01-22 16:37:36.568914: step: 284/466, loss: 0.0012633508304134011 2023-01-22 16:37:37.412999: step: 286/466, loss: 0.012754724361002445 2023-01-22 16:37:38.083178: step: 288/466, loss: 0.0007646095473319292 2023-01-22 16:37:38.846739: step: 290/466, loss: 0.0002800831862259656 2023-01-22 16:37:39.563237: step: 292/466, loss: 0.0013062810758128762 2023-01-22 16:37:40.334107: step: 294/466, loss: 0.0002482525887899101 2023-01-22 16:37:41.159057: step: 296/466, loss: 2.8109747290727682e-05 2023-01-22 16:37:41.824939: step: 298/466, loss: 0.006925472058355808 2023-01-22 16:37:42.645509: step: 300/466, loss: 0.28649452328681946 2023-01-22 16:37:43.406901: step: 302/466, loss: 0.0009704851545393467 2023-01-22 16:37:44.125437: step: 304/466, loss: 0.005734934937208891 2023-01-22 16:37:44.837477: step: 306/466, loss: 0.019645430147647858 2023-01-22 16:37:45.626293: step: 308/466, loss: 0.0017498302040621638 2023-01-22 16:37:46.451341: step: 310/466, loss: 0.0007500092615373433 2023-01-22 16:37:47.158891: step: 312/466, loss: 0.010207954794168472 2023-01-22 16:37:47.913103: step: 314/466, loss: 0.08134476840496063 2023-01-22 16:37:48.700533: step: 316/466, loss: 0.010409084148705006 2023-01-22 16:37:49.377016: step: 318/466, loss: 0.0070197382010519505 2023-01-22 16:37:50.100246: step: 320/466, loss: 0.0004017484898213297 2023-01-22 16:37:50.867702: step: 322/466, loss: 0.051938921213150024 2023-01-22 16:37:51.603269: step: 324/466, loss: 0.012206167913973331 2023-01-22 16:37:52.322748: step: 326/466, loss: 0.03690803796052933 2023-01-22 16:37:53.032011: 
step: 328/466, loss: 0.0020243311300873756 2023-01-22 16:37:53.766901: step: 330/466, loss: 0.031494155526161194 2023-01-22 16:37:54.454182: step: 332/466, loss: 0.013773403130471706 2023-01-22 16:37:55.204195: step: 334/466, loss: 0.006675016600638628 2023-01-22 16:37:55.970842: step: 336/466, loss: 0.08140300959348679 2023-01-22 16:37:56.648540: step: 338/466, loss: 0.00034415972186252475 2023-01-22 16:37:57.376593: step: 340/466, loss: 0.002552608260884881 2023-01-22 16:37:58.072660: step: 342/466, loss: 0.00010834995919140056 2023-01-22 16:37:58.808105: step: 344/466, loss: 0.00013172495528124273 2023-01-22 16:37:59.567322: step: 346/466, loss: 0.008109035901725292 2023-01-22 16:38:00.341192: step: 348/466, loss: 0.31519660353660583 2023-01-22 16:38:01.154664: step: 350/466, loss: 0.002458364935591817 2023-01-22 16:38:01.945951: step: 352/466, loss: 0.010051483288407326 2023-01-22 16:38:02.721201: step: 354/466, loss: 0.015387998893857002 2023-01-22 16:38:03.474439: step: 356/466, loss: 0.015859654173254967 2023-01-22 16:38:04.274559: step: 358/466, loss: 0.00026099529350176454 2023-01-22 16:38:05.039387: step: 360/466, loss: 0.02173546329140663 2023-01-22 16:38:05.773523: step: 362/466, loss: 0.03111352026462555 2023-01-22 16:38:06.469862: step: 364/466, loss: 0.030289320275187492 2023-01-22 16:38:07.244593: step: 366/466, loss: 0.00595248956233263 2023-01-22 16:38:07.967718: step: 368/466, loss: 0.0007932390435598791 2023-01-22 16:38:08.688450: step: 370/466, loss: 0.05523402616381645 2023-01-22 16:38:09.571614: step: 372/466, loss: 0.011178974062204361 2023-01-22 16:38:10.337043: step: 374/466, loss: 0.03265348821878433 2023-01-22 16:38:11.068914: step: 376/466, loss: 2.4180924892425537 2023-01-22 16:38:11.829567: step: 378/466, loss: 0.06094865873456001 2023-01-22 16:38:12.616269: step: 380/466, loss: 0.0009674673201516271 2023-01-22 16:38:13.381485: step: 382/466, loss: 0.0025440987665206194 2023-01-22 16:38:14.110774: step: 384/466, loss: 
0.04209550470113754 2023-01-22 16:38:14.853300: step: 386/466, loss: 0.003342913230881095 2023-01-22 16:38:15.510412: step: 388/466, loss: 0.004386276006698608 2023-01-22 16:38:16.289412: step: 390/466, loss: 0.0031542626675218344 2023-01-22 16:38:17.111536: step: 392/466, loss: 0.008323580026626587 2023-01-22 16:38:17.863004: step: 394/466, loss: 0.028583411127328873 2023-01-22 16:38:18.615236: step: 396/466, loss: 0.021640345454216003 2023-01-22 16:38:19.326766: step: 398/466, loss: 0.003628394566476345 2023-01-22 16:38:20.000113: step: 400/466, loss: 0.0007856853189878166 2023-01-22 16:38:20.760700: step: 402/466, loss: 0.006474267691373825 2023-01-22 16:38:21.418390: step: 404/466, loss: 0.0037609722930938005 2023-01-22 16:38:22.235954: step: 406/466, loss: 0.04697392135858536 2023-01-22 16:38:23.000897: step: 408/466, loss: 0.005808677524328232 2023-01-22 16:38:23.810740: step: 410/466, loss: 0.017602592706680298 2023-01-22 16:38:24.517790: step: 412/466, loss: 0.004466408398002386 2023-01-22 16:38:25.328748: step: 414/466, loss: 0.030358077958226204 2023-01-22 16:38:26.113564: step: 416/466, loss: 0.00018989270029123873 2023-01-22 16:38:26.811355: step: 418/466, loss: 0.019630558788776398 2023-01-22 16:38:27.607810: step: 420/466, loss: 0.008626433089375496 2023-01-22 16:38:28.352652: step: 422/466, loss: 0.013188624754548073 2023-01-22 16:38:29.053128: step: 424/466, loss: 0.18110044300556183 2023-01-22 16:38:29.841300: step: 426/466, loss: 0.15420110523700714 2023-01-22 16:38:30.642166: step: 428/466, loss: 0.00033983562025241554 2023-01-22 16:38:31.436331: step: 430/466, loss: 0.004736277740448713 2023-01-22 16:38:32.170673: step: 432/466, loss: 0.0006059578736312687 2023-01-22 16:38:32.914097: step: 434/466, loss: 0.014155655167996883 2023-01-22 16:38:33.696695: step: 436/466, loss: 0.01840216852724552 2023-01-22 16:38:34.453498: step: 438/466, loss: 0.008797436952590942 2023-01-22 16:38:35.089503: step: 440/466, loss: 2.0591582142515108e-05 2023-01-22 
16:38:35.844364: step: 442/466, loss: 0.002799784764647484 2023-01-22 16:38:36.647023: step: 444/466, loss: 0.003289586864411831 2023-01-22 16:38:37.437101: step: 446/466, loss: 0.0030177352018654346 2023-01-22 16:38:38.200598: step: 448/466, loss: 0.0005092833307571709 2023-01-22 16:38:38.918380: step: 450/466, loss: 0.0026675830595195293 2023-01-22 16:38:39.637975: step: 452/466, loss: 0.00010813782137120143 2023-01-22 16:38:40.362417: step: 454/466, loss: 0.030953940004110336 2023-01-22 16:38:41.156311: step: 456/466, loss: 0.044374607503414154 2023-01-22 16:38:41.963983: step: 458/466, loss: 0.13990625739097595 2023-01-22 16:38:42.702291: step: 460/466, loss: 0.00537828216329217 2023-01-22 16:38:43.462852: step: 462/466, loss: 0.022325586527585983 2023-01-22 16:38:44.244520: step: 464/466, loss: 0.006508402526378632 2023-01-22 16:38:45.007107: step: 466/466, loss: 0.0010946118272840977 2023-01-22 16:38:45.766876: step: 468/466, loss: 0.0019944056402891874 2023-01-22 16:38:46.547308: step: 470/466, loss: 0.005054292269051075 2023-01-22 16:38:47.335839: step: 472/466, loss: 0.09031184017658234 2023-01-22 16:38:48.060443: step: 474/466, loss: 0.13491177558898926 2023-01-22 16:38:48.781814: step: 476/466, loss: 7.156516949180514e-05 2023-01-22 16:38:49.521496: step: 478/466, loss: 0.024453453719615936 2023-01-22 16:38:50.373059: step: 480/466, loss: 0.006817124783992767 2023-01-22 16:38:51.169757: step: 482/466, loss: 0.021428994834423065 2023-01-22 16:38:51.929947: step: 484/466, loss: 0.004524318967014551 2023-01-22 16:38:52.778322: step: 486/466, loss: 0.0009695948101580143 2023-01-22 16:38:53.565584: step: 488/466, loss: 0.019664600491523743 2023-01-22 16:38:54.244437: step: 490/466, loss: 0.00011819535575341433 2023-01-22 16:38:54.940111: step: 492/466, loss: 0.0010263827862218022 2023-01-22 16:38:55.699652: step: 494/466, loss: 9.183879592455924e-05 2023-01-22 16:38:56.443226: step: 496/466, loss: 0.028333380818367004 2023-01-22 16:38:57.187788: step: 
498/466, loss: 0.00033218032331205904 2023-01-22 16:38:57.962935: step: 500/466, loss: 0.00035989010939374566 2023-01-22 16:38:58.703871: step: 502/466, loss: 0.016311464831233025 2023-01-22 16:38:59.441404: step: 504/466, loss: 0.020886670798063278 2023-01-22 16:39:00.154152: step: 506/466, loss: 0.0010080871870741248 2023-01-22 16:39:00.918646: step: 508/466, loss: 0.04013931751251221 2023-01-22 16:39:01.641712: step: 510/466, loss: 0.01586906984448433 2023-01-22 16:39:02.436691: step: 512/466, loss: 4.607872009277344 2023-01-22 16:39:03.218227: step: 514/466, loss: 0.0015745365526527166 2023-01-22 16:39:04.031871: step: 516/466, loss: 0.07102135568857193 2023-01-22 16:39:04.769344: step: 518/466, loss: 0.0023052708711475134 2023-01-22 16:39:05.474992: step: 520/466, loss: 0.044435564428567886 2023-01-22 16:39:06.189352: step: 522/466, loss: 0.01360018365085125 2023-01-22 16:39:06.997362: step: 524/466, loss: 0.00858695525676012 2023-01-22 16:39:07.826416: step: 526/466, loss: 0.0014044283889234066 2023-01-22 16:39:08.567102: step: 528/466, loss: 0.0015609466936439276 2023-01-22 16:39:09.300492: step: 530/466, loss: 0.002609415678307414 2023-01-22 16:39:10.001950: step: 532/466, loss: 0.03356180340051651 2023-01-22 16:39:10.800207: step: 534/466, loss: 0.00122835545334965 2023-01-22 16:39:11.497657: step: 536/466, loss: 0.004466540180146694 2023-01-22 16:39:12.322715: step: 538/466, loss: 0.019091887399554253 2023-01-22 16:39:13.100996: step: 540/466, loss: 0.05677692964673042 2023-01-22 16:39:13.817935: step: 542/466, loss: 0.02603871561586857 2023-01-22 16:39:14.533878: step: 544/466, loss: 4.200865078018978e-05 2023-01-22 16:39:15.262103: step: 546/466, loss: 0.057331353425979614 2023-01-22 16:39:16.039942: step: 548/466, loss: 0.00982821173965931 2023-01-22 16:39:16.842581: step: 550/466, loss: 0.060289330780506134 2023-01-22 16:39:17.601955: step: 552/466, loss: 0.012156656943261623 2023-01-22 16:39:18.298216: step: 554/466, loss: 0.0003722023102454841 
2023-01-22 16:39:19.071166: step: 556/466, loss: 0.018028881400823593 2023-01-22 16:39:19.814643: step: 558/466, loss: 0.023712920024991035 2023-01-22 16:39:20.648154: step: 560/466, loss: 0.003337965114042163 2023-01-22 16:39:21.417481: step: 562/466, loss: 0.00031802200828678906 2023-01-22 16:39:22.111269: step: 564/466, loss: 0.001543916412629187 2023-01-22 16:39:22.858784: step: 566/466, loss: 0.0645618662238121 2023-01-22 16:39:23.663403: step: 568/466, loss: 0.009318874217569828 2023-01-22 16:39:24.386080: step: 570/466, loss: 0.016917699947953224 2023-01-22 16:39:25.156133: step: 572/466, loss: 0.09670083969831467 2023-01-22 16:39:25.939279: step: 574/466, loss: 0.0034049071837216616 2023-01-22 16:39:26.624042: step: 576/466, loss: 0.0009502943139523268 2023-01-22 16:39:27.470939: step: 578/466, loss: 0.025513457134366035 2023-01-22 16:39:28.200151: step: 580/466, loss: 0.02725810743868351 2023-01-22 16:39:28.968242: step: 582/466, loss: 0.020639048889279366 2023-01-22 16:39:29.770422: step: 584/466, loss: 0.016725769266486168 2023-01-22 16:39:30.548083: step: 586/466, loss: 0.004354603588581085 2023-01-22 16:39:31.346835: step: 588/466, loss: 0.00044333579717203975 2023-01-22 16:39:32.184330: step: 590/466, loss: 0.02828321047127247 2023-01-22 16:39:32.909229: step: 592/466, loss: 0.053278740495443344 2023-01-22 16:39:33.637532: step: 594/466, loss: 0.002482444979250431 2023-01-22 16:39:34.478265: step: 596/466, loss: 0.006808500736951828 2023-01-22 16:39:35.286885: step: 598/466, loss: 0.007787493988871574 2023-01-22 16:39:36.052280: step: 600/466, loss: 0.002453563967719674 2023-01-22 16:39:36.754609: step: 602/466, loss: 0.0010509646963328123 2023-01-22 16:39:37.587488: step: 604/466, loss: 0.010503526777029037 2023-01-22 16:39:38.327695: step: 606/466, loss: 0.020881492644548416 2023-01-22 16:39:38.995415: step: 608/466, loss: 0.0014220715966075659 2023-01-22 16:39:39.747661: step: 610/466, loss: 0.004759980831295252 2023-01-22 16:39:40.476279: step: 
612/466, loss: 0.012297210283577442 2023-01-22 16:39:41.207077: step: 614/466, loss: 0.002239755354821682 2023-01-22 16:39:41.972002: step: 616/466, loss: 0.04295475035905838 2023-01-22 16:39:42.820813: step: 618/466, loss: 0.01774514839053154 2023-01-22 16:39:43.540429: step: 620/466, loss: 3.909325823769905e-05 2023-01-22 16:39:44.269373: step: 622/466, loss: 0.018953876569867134 2023-01-22 16:39:44.989651: step: 624/466, loss: 0.0008622193709015846 2023-01-22 16:39:45.733514: step: 626/466, loss: 0.021152107045054436 2023-01-22 16:39:46.464260: step: 628/466, loss: 0.00045716180466115475 2023-01-22 16:39:47.292266: step: 630/466, loss: 0.05074857920408249 2023-01-22 16:39:48.085740: step: 632/466, loss: 3.691697202157229e-05 2023-01-22 16:39:48.861207: step: 634/466, loss: 0.004377929028123617 2023-01-22 16:39:49.677315: step: 636/466, loss: 0.0053786421194672585 2023-01-22 16:39:50.494693: step: 638/466, loss: 0.003944407217204571 2023-01-22 16:39:51.158015: step: 640/466, loss: 0.018148386850953102 2023-01-22 16:39:51.912197: step: 642/466, loss: 0.02679787389934063 2023-01-22 16:39:52.826676: step: 644/466, loss: 0.0066886176355183125 2023-01-22 16:39:53.594938: step: 646/466, loss: 0.002962264697998762 2023-01-22 16:39:54.287793: step: 648/466, loss: 0.0009095259010791779 2023-01-22 16:39:54.993909: step: 650/466, loss: 0.0002648793160915375 2023-01-22 16:39:55.742120: step: 652/466, loss: 0.0013989545404911041 2023-01-22 16:39:56.458265: step: 654/466, loss: 0.1868881732225418 2023-01-22 16:39:57.166779: step: 656/466, loss: 0.015928208827972412 2023-01-22 16:39:57.848379: step: 658/466, loss: 0.0023805315140634775 2023-01-22 16:39:58.525581: step: 660/466, loss: 0.004664386156946421 2023-01-22 16:39:59.257666: step: 662/466, loss: 1.4249751984607428e-05 2023-01-22 16:39:59.984293: step: 664/466, loss: 0.000596735393628478 2023-01-22 16:40:00.714520: step: 666/466, loss: 0.002551491605117917 2023-01-22 16:40:01.498229: step: 668/466, loss: 
0.03295866400003433 2023-01-22 16:40:02.353442: step: 670/466, loss: 0.009886063635349274 2023-01-22 16:40:03.116832: step: 672/466, loss: 0.01368051115423441 2023-01-22 16:40:03.871203: step: 674/466, loss: 0.003591829678043723 2023-01-22 16:40:04.573548: step: 676/466, loss: 0.00019231434271205217 2023-01-22 16:40:05.288241: step: 678/466, loss: 0.00796997919678688 2023-01-22 16:40:06.154223: step: 680/466, loss: 0.009898564778268337 2023-01-22 16:40:06.865702: step: 682/466, loss: 0.002351475181058049 2023-01-22 16:40:07.647046: step: 684/466, loss: 0.035061463713645935 2023-01-22 16:40:08.452757: step: 686/466, loss: 0.004485884215682745 2023-01-22 16:40:09.259842: step: 688/466, loss: 0.01222966331988573 2023-01-22 16:40:10.029208: step: 690/466, loss: 0.01499281357973814 2023-01-22 16:40:10.848139: step: 692/466, loss: 0.009576707147061825 2023-01-22 16:40:11.634119: step: 694/466, loss: 0.0007713002851232886 2023-01-22 16:40:12.383494: step: 696/466, loss: 8.355027966899797e-05 2023-01-22 16:40:13.187180: step: 698/466, loss: 0.00462744478136301 2023-01-22 16:40:13.944683: step: 700/466, loss: 0.6917054653167725 2023-01-22 16:40:14.656492: step: 702/466, loss: 0.0004787829238921404 2023-01-22 16:40:15.434519: step: 704/466, loss: 0.006655456963926554 2023-01-22 16:40:16.142395: step: 706/466, loss: 0.018385127186775208 2023-01-22 16:40:16.912428: step: 708/466, loss: 0.059344708919525146 2023-01-22 16:40:17.689884: step: 710/466, loss: 0.0037663299590349197 2023-01-22 16:40:18.386731: step: 712/466, loss: 0.0011843966785818338 2023-01-22 16:40:19.071001: step: 714/466, loss: 0.00032077066134661436 2023-01-22 16:40:19.832193: step: 716/466, loss: 0.00018971598183270544 2023-01-22 16:40:20.545192: step: 718/466, loss: 0.0027237460017204285 2023-01-22 16:40:21.275674: step: 720/466, loss: 0.0007768472423776984 2023-01-22 16:40:21.950026: step: 722/466, loss: 0.02395699918270111 2023-01-22 16:40:22.737477: step: 724/466, loss: 0.009225727058947086 2023-01-22 
16:40:23.431241: step: 726/466, loss: 0.00010193004709435627 2023-01-22 16:40:24.170598: step: 728/466, loss: 0.0001947433629538864 2023-01-22 16:40:24.929382: step: 730/466, loss: 0.0005076914094388485 2023-01-22 16:40:25.619930: step: 732/466, loss: 0.0006210353458300233 2023-01-22 16:40:26.370826: step: 734/466, loss: 0.0017323088832199574 2023-01-22 16:40:27.121147: step: 736/466, loss: 0.0036521030124276876 2023-01-22 16:40:27.877625: step: 738/466, loss: 0.027026157826185226 2023-01-22 16:40:28.648494: step: 740/466, loss: 0.0278612170368433 2023-01-22 16:40:29.413504: step: 742/466, loss: 0.009088082239031792 2023-01-22 16:40:30.246775: step: 744/466, loss: 0.1705555021762848 2023-01-22 16:40:31.009964: step: 746/466, loss: 0.0037146315444260836 2023-01-22 16:40:31.811106: step: 748/466, loss: 5.9328271163394675e-05 2023-01-22 16:40:32.591321: step: 750/466, loss: 0.009150792844593525 2023-01-22 16:40:33.379317: step: 752/466, loss: 0.003557102754712105 2023-01-22 16:40:34.123532: step: 754/466, loss: 0.07643107324838638 2023-01-22 16:40:34.948760: step: 756/466, loss: 0.023097380995750427 2023-01-22 16:40:35.669862: step: 758/466, loss: 0.0860399603843689 2023-01-22 16:40:36.465432: step: 760/466, loss: 0.011937241069972515 2023-01-22 16:40:37.178808: step: 762/466, loss: 0.002413738053292036 2023-01-22 16:40:37.905272: step: 764/466, loss: 0.0005132968653924763 2023-01-22 16:40:38.668268: step: 766/466, loss: 0.01412399671971798 2023-01-22 16:40:39.484141: step: 768/466, loss: 0.007267692591995001 2023-01-22 16:40:40.199368: step: 770/466, loss: 0.002236289670690894 2023-01-22 16:40:40.902911: step: 772/466, loss: 0.37281978130340576 2023-01-22 16:40:41.699146: step: 774/466, loss: 0.01073366403579712 2023-01-22 16:40:42.465812: step: 776/466, loss: 0.025290386751294136 2023-01-22 16:40:43.341679: step: 778/466, loss: 0.018467970192432404 2023-01-22 16:40:44.044899: step: 780/466, loss: 0.0012066556373611093 2023-01-22 16:40:44.791092: step: 782/466, loss: 
0.0006555514992214739 2023-01-22 16:40:45.554070: step: 784/466, loss: 0.02548740617930889 2023-01-22 16:40:46.293058: step: 786/466, loss: 0.4201527237892151 2023-01-22 16:40:47.062667: step: 788/466, loss: 0.014074753038585186 2023-01-22 16:40:47.844372: step: 790/466, loss: 0.017719129100441933 2023-01-22 16:40:48.521687: step: 792/466, loss: 0.002532778074964881 2023-01-22 16:40:49.368439: step: 794/466, loss: 0.0023974813520908356 2023-01-22 16:40:50.130404: step: 796/466, loss: 0.0025688474997878075 2023-01-22 16:40:50.861285: step: 798/466, loss: 0.0006184555240906775 2023-01-22 16:40:51.575861: step: 800/466, loss: 0.029398677870631218 2023-01-22 16:40:52.350382: step: 802/466, loss: 0.00013305842003319412 2023-01-22 16:40:53.157219: step: 804/466, loss: 0.00010052655125036836 2023-01-22 16:40:53.939662: step: 806/466, loss: 0.02093655802309513 2023-01-22 16:40:54.635192: step: 808/466, loss: 0.0018348832381889224 2023-01-22 16:40:55.316216: step: 810/466, loss: 0.01457955688238144 2023-01-22 16:40:56.116425: step: 812/466, loss: 0.032794274389743805 2023-01-22 16:40:56.907826: step: 814/466, loss: 0.35781222581863403 2023-01-22 16:40:57.659811: step: 816/466, loss: 0.0023967118468135595 2023-01-22 16:40:58.309601: step: 818/466, loss: 0.01247766800224781 2023-01-22 16:40:59.094819: step: 820/466, loss: 0.004071477800607681 2023-01-22 16:40:59.785756: step: 822/466, loss: 3.02156840916723e-05 2023-01-22 16:41:00.484380: step: 824/466, loss: 0.02758164145052433 2023-01-22 16:41:01.235938: step: 826/466, loss: 0.11011549085378647 2023-01-22 16:41:02.133838: step: 828/466, loss: 0.0051184226758778095 2023-01-22 16:41:02.937450: step: 830/466, loss: 0.003817720804363489 2023-01-22 16:41:03.665518: step: 832/466, loss: 0.00034997283364646137 2023-01-22 16:41:04.415983: step: 834/466, loss: 0.00047167122829705477 2023-01-22 16:41:05.170805: step: 836/466, loss: 0.005143610294908285 2023-01-22 16:41:05.939941: step: 838/466, loss: 0.01582680456340313 2023-01-22 
16:41:06.683079: step: 840/466, loss: 0.0280031468719244 2023-01-22 16:41:07.600893: step: 842/466, loss: 0.0013992044841870666 2023-01-22 16:41:08.357942: step: 844/466, loss: 0.0012508267536759377 2023-01-22 16:41:09.138811: step: 846/466, loss: 0.005863756872713566 2023-01-22 16:41:09.804218: step: 848/466, loss: 0.0010132716270163655 2023-01-22 16:41:10.638314: step: 850/466, loss: 0.0030189540702849627 2023-01-22 16:41:11.373011: step: 852/466, loss: 0.03408302739262581 2023-01-22 16:41:12.194369: step: 854/466, loss: 0.005213402211666107 2023-01-22 16:41:12.899865: step: 856/466, loss: 0.03750526160001755 2023-01-22 16:41:13.654869: step: 858/466, loss: 0.019979437813162804 2023-01-22 16:41:14.354264: step: 860/466, loss: 0.010737068951129913 2023-01-22 16:41:15.086521: step: 862/466, loss: 0.003372674575075507 2023-01-22 16:41:15.826945: step: 864/466, loss: 0.029829924926161766 2023-01-22 16:41:16.577361: step: 866/466, loss: 0.00047801234177313745 2023-01-22 16:41:17.272092: step: 868/466, loss: 0.002308440860360861 2023-01-22 16:41:18.063029: step: 870/466, loss: 0.007953722961246967 2023-01-22 16:41:18.850803: step: 872/466, loss: 0.00415385514497757 2023-01-22 16:41:19.628701: step: 874/466, loss: 0.002917190082371235 2023-01-22 16:41:20.372206: step: 876/466, loss: 0.10013389587402344 2023-01-22 16:41:21.012083: step: 878/466, loss: 0.00955923367291689 2023-01-22 16:41:21.735324: step: 880/466, loss: 0.0020114178769290447 2023-01-22 16:41:22.628076: step: 882/466, loss: 0.02671821415424347 2023-01-22 16:41:23.351872: step: 884/466, loss: 0.00014646655472461134 2023-01-22 16:41:24.092580: step: 886/466, loss: 0.0035071747843176126 2023-01-22 16:41:24.828117: step: 888/466, loss: 0.00027901571593247354 2023-01-22 16:41:25.534534: step: 890/466, loss: 0.007693085819482803 2023-01-22 16:41:26.405319: step: 892/466, loss: 0.008314241655170918 2023-01-22 16:41:27.118160: step: 894/466, loss: 0.0002768787962850183 2023-01-22 16:41:27.991165: step: 896/466, 
loss: 0.045164354145526886 2023-01-22 16:41:28.817233: step: 898/466, loss: 0.0007990582962520421 2023-01-22 16:41:29.523822: step: 900/466, loss: 0.014751252718269825 2023-01-22 16:41:30.293975: step: 902/466, loss: 0.0013847840018570423 2023-01-22 16:41:31.031588: step: 904/466, loss: 0.006584350951015949 2023-01-22 16:41:31.757365: step: 906/466, loss: 0.00011031017493223771 2023-01-22 16:41:32.584461: step: 908/466, loss: 0.0004208995378576219 2023-01-22 16:41:33.418277: step: 910/466, loss: 0.0004874985897913575 2023-01-22 16:41:34.170222: step: 912/466, loss: 0.26059654355049133 2023-01-22 16:41:35.041316: step: 914/466, loss: 0.03792070969939232 2023-01-22 16:41:35.983167: step: 916/466, loss: 0.0002478975220583379 2023-01-22 16:41:36.874204: step: 918/466, loss: 0.0019585671834647655 2023-01-22 16:41:37.606149: step: 920/466, loss: 0.0002386308478889987 2023-01-22 16:41:38.312827: step: 922/466, loss: 0.014447817578911781 2023-01-22 16:41:39.040364: step: 924/466, loss: 0.017603475600481033 2023-01-22 16:41:39.897967: step: 926/466, loss: 0.03096870332956314 2023-01-22 16:41:40.638870: step: 928/466, loss: 0.016626933589577675 2023-01-22 16:41:41.408839: step: 930/466, loss: 0.001796972006559372 2023-01-22 16:41:42.167306: step: 932/466, loss: 0.011337809264659882
==================================================
Loss: 0.041
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3228707107843137, 'r': 0.3332858950031625, 'f1': 0.32799564270152504}, 'combined': 0.24168099988533423, 'epoch': 36}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35285055877312077, 'r': 0.3079033905411894, 'f1': 0.328848230156902}, 'combined': 0.20212135121838856, 'epoch': 36}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30156788595932055, 'r': 0.3456299869438892, 'f1': 0.322099032925605}, 'combined': 0.23733612952412997, 'epoch': 36}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.331509942889869, 'r': 0.31226283182087317, 'f1': 0.3215986683813365}, 'combined': 0.19766552300511414, 'epoch': 36}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.34297597042513867, 'r': 0.35208728652751425, 'f1': 0.34747191011235956}, 'combined': 0.2560319337670018, 'epoch': 36}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3525127497709, 'r': 0.30424843914368843, 'f1': 0.3266071616482013}, 'combined': 0.2017279527827126, 'epoch': 36}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2528409090909091, 'r': 0.31785714285714284, 'f1': 0.28164556962025317}, 'combined': 0.18776371308016876, 'epoch': 36}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32432432432432434, 'r': 0.5217391304347826, 'f1': 0.4}, 'combined': 0.2, 'epoch': 36}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.42857142857142855, 'r': 0.20689655172413793, 'f1': 0.2790697674418604}, 'combined': 0.18604651162790692, 'epoch': 36}
New best korean model...
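The 'combined' figure in each result dict above matches the product of the template F1 and the slot F1 (e.g. for Dev Chinese, 0.7368421052631579 × 0.32799564270152504 ≈ 0.24168099988533423, and each F1 is the usual harmonic mean of p and r). A minimal sketch of that scoring, assuming only the product rule observed in these numbers (function names are illustrative, not from the training code):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """Combined score as it appears in the log: template F1 times slot F1."""
    return template["f1"] * slot["f1"]

# Dev Chinese, epoch 36 (values copied from the log above)
template = {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579}
slot = {"p": 0.3228707107843137, "r": 0.3332858950031625, "f1": 0.32799564270152504}

assert abs(f1(template["p"], template["r"]) - template["f1"]) < 1e-9
assert abs(combined_score(template, slot) - 0.24168099988533423) < 1e-8
```

This explains why a high template F1 with a weak slot F1 (or vice versa) still yields a low combined number: both factors must be high at once.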
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30156788595932055, 'r': 0.3456299869438892, 'f1': 0.322099032925605}, 'combined': 0.23733612952412997, 'epoch': 36}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.331509942889869, 'r': 0.31226283182087317, 'f1': 0.3215986683813365}, 'combined': 0.19766552300511414, 'epoch': 36}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32432432432432434, 'r': 0.5217391304347826, 'f1': 0.4}, 'combined': 0.2, 'epoch': 36}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 37
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:44:37.208404: step: 2/466, loss: 0.0048588840290904045 2023-01-22 16:44:37.928174: step: 4/466, loss: 0.00700045982375741 2023-01-22 16:44:38.661455: step: 6/466, loss: 0.0010578229557722807 2023-01-22 16:44:39.475987: step: 8/466, loss: 0.019440356642007828 2023-01-22 16:44:40.243480: step: 10/466, loss: 0.01578276976943016 2023-01-22 16:44:41.017626: step: 12/466, loss: 0.00025084824301302433 2023-01-22 16:44:41.700030: step: 14/466, loss: 0.0112862978130579 2023-01-22 16:44:42.387277: step: 16/466, loss: 0.00468059116974473 2023-01-22 16:44:43.157871: step: 18/466, loss: 0.0021089408546686172 2023-01-22 16:44:43.927383: step: 20/466, loss: 0.00011019927478628233 2023-01-22 16:44:44.689071: step: 22/466, loss: 0.08023007959127426 2023-01-22 16:44:45.461491: step: 24/466, loss: 0.0026985728181898594 2023-01-22 16:44:46.233352: step: 26/466, loss: 0.0008987162145785987 2023-01-22 16:44:46.969889: step: 28/466, loss: 5.221124229137786e-05 2023-01-22 16:44:47.755733: step: 30/466, loss: 4.487218757276423e-05 2023-01-22 16:44:48.471905: step: 32/466, loss: 0.007680452428758144 2023-01-22 16:44:49.251521: step: 34/466, loss: 0.025444507598876953 2023-01-22 16:44:50.092470: step: 36/466, loss: 0.03113366663455963 2023-01-22 16:44:50.829113: step: 38/466, loss: 0.014559357427060604 2023-01-22 16:44:51.640550: step: 40/466, loss: 0.0157176461070776 2023-01-22 16:44:52.408810: step: 42/466, loss: 0.032705437391996384 2023-01-22 16:44:53.167272: step: 44/466, loss: 0.0012881396105512977 2023-01-22 16:44:53.972828: step: 46/466, loss: 0.0005634765257127583 2023-01-22 16:44:54.683402: step: 48/466, loss: 0.00015636030002497137
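Every step entry in this log has the fixed shape `<timestamp>: step: N/466, loss: X`, and each epoch's `Loss:` summary line appears consistent with the mean of those per-step values. A sketch of recovering such a mean from raw log text, assuming that format (the regex, helper name, and two-entry sample snippet are illustrative, not part of the training code):

```python
import re

# Matches the loss value in entries like "step: 2/466, loss: 0.0048588840290904045"
STEP_RE = re.compile(r"step: \d+/\d+, loss: ([0-9eE.+-]+)")

def mean_epoch_loss(log_text: str) -> float:
    """Average every 'step: N/M, loss: X' entry found in the given text."""
    losses = [float(m.group(1)) for m in STEP_RE.finditer(log_text)]
    return sum(losses) / len(losses)

# Illustrative two-entry snippet in the same format as the log above
sample = ("2023-01-22 16:44:37.208404: step: 2/466, loss: 0.004 "
          "2023-01-22 16:44:37.928174: step: 4/466, loss: 0.006")
print(round(mean_epoch_loss(sample), 3))  # -> 0.005
```

The character class in the pattern also accepts scientific-notation losses such as `5.221124229137786e-05`, which occur throughout the log.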
2023-01-22 16:44:55.403583: step: 50/466, loss: 0.0008369830320589244 2023-01-22 16:44:56.230640: step: 52/466, loss: 0.0010885618394240737 2023-01-22 16:44:57.018892: step: 54/466, loss: 0.017781171947717667 2023-01-22 16:44:57.773394: step: 56/466, loss: 0.03731287643313408 2023-01-22 16:44:58.516725: step: 58/466, loss: 0.00025140156503766775 2023-01-22 16:44:59.348030: step: 60/466, loss: 0.008262161165475845 2023-01-22 16:45:00.133540: step: 62/466, loss: 0.012372287921607494 2023-01-22 16:45:00.910864: step: 64/466, loss: 0.0029393951408565044 2023-01-22 16:45:01.684261: step: 66/466, loss: 0.003331736195832491 2023-01-22 16:45:02.441080: step: 68/466, loss: 0.0029272697865962982 2023-01-22 16:45:03.120159: step: 70/466, loss: 0.010165790095925331 2023-01-22 16:45:03.872000: step: 72/466, loss: 0.17820614576339722 2023-01-22 16:45:04.613505: step: 74/466, loss: 0.03329198807477951 2023-01-22 16:45:05.378977: step: 76/466, loss: 0.0048468150198459625 2023-01-22 16:45:06.145080: step: 78/466, loss: 0.0005752610741183162 2023-01-22 16:45:06.930196: step: 80/466, loss: 0.0031736392993479967 2023-01-22 16:45:07.679168: step: 82/466, loss: 0.009133272804319859 2023-01-22 16:45:08.427895: step: 84/466, loss: 4.2883766582235694e-05 2023-01-22 16:45:09.194039: step: 86/466, loss: 0.004588339943438768 2023-01-22 16:45:09.928046: step: 88/466, loss: 0.005838914308696985 2023-01-22 16:45:10.651067: step: 90/466, loss: 0.028857816010713577 2023-01-22 16:45:11.419446: step: 92/466, loss: 0.001227057189680636 2023-01-22 16:45:12.138191: step: 94/466, loss: 0.001945412135683 2023-01-22 16:45:12.828391: step: 96/466, loss: 0.006568763870745897 2023-01-22 16:45:13.625332: step: 98/466, loss: 0.00024487273185513914 2023-01-22 16:45:14.389529: step: 100/466, loss: 0.0056565627455711365 2023-01-22 16:45:15.042485: step: 102/466, loss: 0.0008967267931438982 2023-01-22 16:45:15.704883: step: 104/466, loss: 0.007367671467363834 2023-01-22 16:45:16.487638: step: 106/466, loss: 
0.024896690621972084 2023-01-22 16:45:17.243311: step: 108/466, loss: 0.026141609996557236 2023-01-22 16:45:17.948232: step: 110/466, loss: 0.0013241652632132173 2023-01-22 16:45:18.718321: step: 112/466, loss: 0.0025367976631969213 2023-01-22 16:45:19.406586: step: 114/466, loss: 0.004370294511318207 2023-01-22 16:45:20.187403: step: 116/466, loss: 0.006824452430009842 2023-01-22 16:45:20.952368: step: 118/466, loss: 0.2507859766483307 2023-01-22 16:45:21.648839: step: 120/466, loss: 0.0023692583199590445 2023-01-22 16:45:22.462592: step: 122/466, loss: 0.003840988501906395 2023-01-22 16:45:23.212779: step: 124/466, loss: 0.0017793461447581649 2023-01-22 16:45:24.036179: step: 126/466, loss: 0.0038497764617204666 2023-01-22 16:45:24.732220: step: 128/466, loss: 0.017459649592638016 2023-01-22 16:45:25.550307: step: 130/466, loss: 0.01397742610424757 2023-01-22 16:45:26.299310: step: 132/466, loss: 0.0017786064418032765 2023-01-22 16:45:27.097451: step: 134/466, loss: 0.019823169335722923 2023-01-22 16:45:27.814930: step: 136/466, loss: 9.185528324451298e-05 2023-01-22 16:45:28.601873: step: 138/466, loss: 0.0022579084616154432 2023-01-22 16:45:29.359378: step: 140/466, loss: 0.003933996427804232 2023-01-22 16:45:30.087883: step: 142/466, loss: 0.016194989904761314 2023-01-22 16:45:30.747654: step: 144/466, loss: 0.003658501198515296 2023-01-22 16:45:31.482666: step: 146/466, loss: 0.0016831440152600408 2023-01-22 16:45:32.247105: step: 148/466, loss: 0.007074498571455479 2023-01-22 16:45:32.978678: step: 150/466, loss: 0.02480948716402054 2023-01-22 16:45:33.703818: step: 152/466, loss: 4.1526051063556224e-05 2023-01-22 16:45:34.448126: step: 154/466, loss: 0.0003189579292666167 2023-01-22 16:45:35.232861: step: 156/466, loss: 0.005066257435828447 2023-01-22 16:45:36.037415: step: 158/466, loss: 0.02136784978210926 2023-01-22 16:45:36.726322: step: 160/466, loss: 0.0036368719302117825 2023-01-22 16:45:37.480338: step: 162/466, loss: 0.024072490632534027 2023-01-22 
16:45:38.239886: step: 164/466, loss: 0.0027644268702715635 2023-01-22 16:45:38.918017: step: 166/466, loss: 0.0003109975077677518 2023-01-22 16:45:39.636836: step: 168/466, loss: 0.004258118104189634 2023-01-22 16:45:40.362114: step: 170/466, loss: 0.002872900338843465 2023-01-22 16:45:41.087716: step: 172/466, loss: 0.012874235399067402 2023-01-22 16:45:41.898624: step: 174/466, loss: 0.015394738875329494 2023-01-22 16:45:42.559255: step: 176/466, loss: 0.008090341463685036 2023-01-22 16:45:43.279440: step: 178/466, loss: 0.12440593540668488 2023-01-22 16:45:43.973800: step: 180/466, loss: 0.07188346982002258 2023-01-22 16:45:44.800543: step: 182/466, loss: 0.0001309849467361346 2023-01-22 16:45:45.523792: step: 184/466, loss: 0.031471334397792816 2023-01-22 16:45:46.224876: step: 186/466, loss: 0.08152177929878235 2023-01-22 16:45:46.996704: step: 188/466, loss: 0.0021092984825372696 2023-01-22 16:45:47.729470: step: 190/466, loss: 0.00014183452003635466 2023-01-22 16:45:48.505874: step: 192/466, loss: 0.006583390291780233 2023-01-22 16:45:49.202940: step: 194/466, loss: 7.404685311485082e-05 2023-01-22 16:45:49.894799: step: 196/466, loss: 0.025401262566447258 2023-01-22 16:45:50.773791: step: 198/466, loss: 0.009899438358843327 2023-01-22 16:45:51.653511: step: 200/466, loss: 0.008677742443978786 2023-01-22 16:45:52.365079: step: 202/466, loss: 0.0013968941057100892 2023-01-22 16:45:53.152883: step: 204/466, loss: 0.07518845051527023 2023-01-22 16:45:53.937966: step: 206/466, loss: 0.001217787736095488 2023-01-22 16:45:54.683289: step: 208/466, loss: 0.01642589643597603 2023-01-22 16:45:55.367046: step: 210/466, loss: 0.0003180429630447179 2023-01-22 16:45:56.102163: step: 212/466, loss: 0.004087543115019798 2023-01-22 16:45:56.816848: step: 214/466, loss: 0.002301363507285714 2023-01-22 16:45:57.570011: step: 216/466, loss: 0.0014160939026623964 2023-01-22 16:45:58.221408: step: 218/466, loss: 0.036056406795978546 2023-01-22 16:45:58.954555: step: 220/466, 
loss: 0.06661252677440643 2023-01-22 16:45:59.680805: step: 222/466, loss: 0.0015557609731331468 2023-01-22 16:46:00.405026: step: 224/466, loss: 0.0019310928182676435 2023-01-22 16:46:01.255237: step: 226/466, loss: 0.05399719625711441 2023-01-22 16:46:02.039671: step: 228/466, loss: 0.002947665983811021 2023-01-22 16:46:02.752522: step: 230/466, loss: 0.014477964490652084 2023-01-22 16:46:03.525210: step: 232/466, loss: 0.019763614982366562 2023-01-22 16:46:04.282637: step: 234/466, loss: 0.006132103502750397 2023-01-22 16:46:05.037257: step: 236/466, loss: 0.010499351657927036 2023-01-22 16:46:05.790385: step: 238/466, loss: 0.0016905704978853464 2023-01-22 16:46:06.521352: step: 240/466, loss: 0.025621794164180756 2023-01-22 16:46:07.329941: step: 242/466, loss: 0.013779666274785995 2023-01-22 16:46:08.072086: step: 244/466, loss: 0.0009651576983742416 2023-01-22 16:46:08.886245: step: 246/466, loss: 0.022902846336364746 2023-01-22 16:46:09.624551: step: 248/466, loss: 0.009824266657233238 2023-01-22 16:46:10.325816: step: 250/466, loss: 0.0001699442946119234 2023-01-22 16:46:11.054403: step: 252/466, loss: 0.0057142386212944984 2023-01-22 16:46:11.798133: step: 254/466, loss: 0.021286005154252052 2023-01-22 16:46:12.520245: step: 256/466, loss: 0.001488046138547361 2023-01-22 16:46:13.184569: step: 258/466, loss: 0.0026199682615697384 2023-01-22 16:46:13.932466: step: 260/466, loss: 0.012011692859232426 2023-01-22 16:46:14.700121: step: 262/466, loss: 0.035835057497024536 2023-01-22 16:46:15.381868: step: 264/466, loss: 0.0019202068215236068 2023-01-22 16:46:16.132439: step: 266/466, loss: 0.02158491127192974 2023-01-22 16:46:16.857899: step: 268/466, loss: 0.00046308664605021477 2023-01-22 16:46:17.543644: step: 270/466, loss: 0.02859680913388729 2023-01-22 16:46:18.196049: step: 272/466, loss: 0.00014170895155984908 2023-01-22 16:46:18.957494: step: 274/466, loss: 0.00041168491588905454 2023-01-22 16:46:19.722174: step: 276/466, loss: 0.003559098346158862 
2023-01-22 16:46:20.540295: step: 278/466, loss: 0.0903628021478653 2023-01-22 16:46:21.345513: step: 280/466, loss: 0.017044108361005783 2023-01-22 16:46:22.088513: step: 282/466, loss: 0.011135498993098736 2023-01-22 16:46:22.835084: step: 284/466, loss: 0.16792574524879456 2023-01-22 16:46:23.690603: step: 286/466, loss: 0.0021677750628441572 2023-01-22 16:46:24.430706: step: 288/466, loss: 0.003123561153188348 2023-01-22 16:46:25.178941: step: 290/466, loss: 0.001427959301508963 2023-01-22 16:46:25.926560: step: 292/466, loss: 0.0020495569333434105 2023-01-22 16:46:26.624457: step: 294/466, loss: 0.0013727025361731648 2023-01-22 16:46:27.358972: step: 296/466, loss: 0.07193455845117569 2023-01-22 16:46:28.104274: step: 298/466, loss: 0.002000251319259405 2023-01-22 16:46:29.073084: step: 300/466, loss: 0.0009960554307326674 2023-01-22 16:46:29.852395: step: 302/466, loss: 0.0006270164158195257 2023-01-22 16:46:30.618264: step: 304/466, loss: 0.005108023062348366 2023-01-22 16:46:31.376953: step: 306/466, loss: 0.004545687232166529 2023-01-22 16:46:32.175697: step: 308/466, loss: 0.21969875693321228 2023-01-22 16:46:32.917382: step: 310/466, loss: 0.013188119977712631 2023-01-22 16:46:33.694951: step: 312/466, loss: 0.04590911045670509 2023-01-22 16:46:34.428507: step: 314/466, loss: 0.0009027286432683468 2023-01-22 16:46:35.258954: step: 316/466, loss: 0.1313788741827011 2023-01-22 16:46:35.995123: step: 318/466, loss: 0.0030443756841123104 2023-01-22 16:46:36.773308: step: 320/466, loss: 0.01420830562710762 2023-01-22 16:46:37.475885: step: 322/466, loss: 0.029201209545135498 2023-01-22 16:46:38.233144: step: 324/466, loss: 1.162377953529358 2023-01-22 16:46:38.972035: step: 326/466, loss: 0.0018761560786515474 2023-01-22 16:46:39.750014: step: 328/466, loss: 0.03066885843873024 2023-01-22 16:46:40.510681: step: 330/466, loss: 0.002510959981009364 2023-01-22 16:46:41.233591: step: 332/466, loss: 0.00010499132622499019 2023-01-22 16:46:41.961382: step: 334/466, 
loss: 0.009663326665759087 2023-01-22 16:46:42.669086: step: 336/466, loss: 0.00018944129988085479 2023-01-22 16:46:43.446476: step: 338/466, loss: 0.0010537714697420597 2023-01-22 16:46:44.283065: step: 340/466, loss: 0.0030080315191298723 2023-01-22 16:46:45.089076: step: 342/466, loss: 0.07661273330450058 2023-01-22 16:46:45.857105: step: 344/466, loss: 0.02163214050233364 2023-01-22 16:46:46.627585: step: 346/466, loss: 0.0030055749230086803 2023-01-22 16:46:47.375574: step: 348/466, loss: 0.01387910358607769 2023-01-22 16:46:48.019199: step: 350/466, loss: 0.002649215515702963 2023-01-22 16:46:48.777568: step: 352/466, loss: 0.00022997992346063256 2023-01-22 16:46:49.568196: step: 354/466, loss: 0.0016065045492723584 2023-01-22 16:46:50.270006: step: 356/466, loss: 0.0029339250177145004 2023-01-22 16:46:50.951284: step: 358/466, loss: 0.004379142541438341 2023-01-22 16:46:51.649526: step: 360/466, loss: 0.00419607711955905 2023-01-22 16:46:52.409031: step: 362/466, loss: 0.11295860260725021 2023-01-22 16:46:53.179699: step: 364/466, loss: 0.0018885949393734336 2023-01-22 16:46:53.877883: step: 366/466, loss: 0.001850526430644095 2023-01-22 16:46:54.622595: step: 368/466, loss: 0.015120510943233967 2023-01-22 16:46:55.382990: step: 370/466, loss: 0.009983471594750881 2023-01-22 16:46:56.216098: step: 372/466, loss: 0.011431551538407803 2023-01-22 16:46:56.903558: step: 374/466, loss: 0.009747300297021866 2023-01-22 16:46:57.578278: step: 376/466, loss: 0.0008253143751062453 2023-01-22 16:46:58.279572: step: 378/466, loss: 0.000229777317144908 2023-01-22 16:46:59.119640: step: 380/466, loss: 0.04102977365255356 2023-01-22 16:46:59.917688: step: 382/466, loss: 0.001671002828516066 2023-01-22 16:47:00.678822: step: 384/466, loss: 0.00015503703616559505 2023-01-22 16:47:01.466010: step: 386/466, loss: 3.818309778580442e-05 2023-01-22 16:47:02.259531: step: 388/466, loss: 0.008570864796638489 2023-01-22 16:47:03.042570: step: 390/466, loss: 0.004652679897844791 
2023-01-22 16:47:03.757667: step: 392/466, loss: 0.00042334620957262814 2023-01-22 16:47:04.482840: step: 394/466, loss: 0.0036026700399816036 2023-01-22 16:47:05.244035: step: 396/466, loss: 0.00144184532109648 2023-01-22 16:47:05.997313: step: 398/466, loss: 0.009980946779251099 2023-01-22 16:47:06.737115: step: 400/466, loss: 0.000343825900927186 2023-01-22 16:47:07.541348: step: 402/466, loss: 0.021086854860186577 2023-01-22 16:47:08.255804: step: 404/466, loss: 0.0005226695793680847 2023-01-22 16:47:08.992837: step: 406/466, loss: 0.03247930482029915 2023-01-22 16:47:09.701696: step: 408/466, loss: 0.0001630575570743531 2023-01-22 16:47:10.551101: step: 410/466, loss: 0.016545020043849945 2023-01-22 16:47:11.280158: step: 412/466, loss: 0.0021113858092576265 2023-01-22 16:47:11.991259: step: 414/466, loss: 0.013716679066419601 2023-01-22 16:47:12.764004: step: 416/466, loss: 0.01107280608266592 2023-01-22 16:47:13.574357: step: 418/466, loss: 0.0002520505804568529 2023-01-22 16:47:14.538957: step: 420/466, loss: 0.034739550203084946 2023-01-22 16:47:15.200604: step: 422/466, loss: 0.006240678019821644 2023-01-22 16:47:15.901920: step: 424/466, loss: 0.011412478983402252 2023-01-22 16:47:16.594427: step: 426/466, loss: 0.008855105377733707 2023-01-22 16:47:17.344810: step: 428/466, loss: 0.013068322092294693 2023-01-22 16:47:18.000209: step: 430/466, loss: 0.0018111380049958825 2023-01-22 16:47:18.744997: step: 432/466, loss: 0.0008868640870787203 2023-01-22 16:47:19.505486: step: 434/466, loss: 0.00042411615140736103 2023-01-22 16:47:20.238048: step: 436/466, loss: 0.15449778735637665 2023-01-22 16:47:21.021903: step: 438/466, loss: 0.010791739448904991 2023-01-22 16:47:21.821548: step: 440/466, loss: 0.0001165992216556333 2023-01-22 16:47:22.544215: step: 442/466, loss: 0.00017539116379339248 2023-01-22 16:47:23.359688: step: 444/466, loss: 0.004423872102051973 2023-01-22 16:47:23.996920: step: 446/466, loss: 0.00048000356764532626 2023-01-22 16:47:24.845256: 
step: 448/466, loss: 0.007207691669464111 2023-01-22 16:47:25.674772: step: 450/466, loss: 0.05481301620602608 2023-01-22 16:47:26.359216: step: 452/466, loss: 0.1583072394132614 2023-01-22 16:47:27.057117: step: 454/466, loss: 0.0013135368935763836 2023-01-22 16:47:27.747211: step: 456/466, loss: 0.02863306924700737 2023-01-22 16:47:28.484666: step: 458/466, loss: 0.010624400340020657 2023-01-22 16:47:29.252226: step: 460/466, loss: 0.0007287510670721531 2023-01-22 16:47:30.095886: step: 462/466, loss: 0.006504683755338192 2023-01-22 16:47:30.804780: step: 464/466, loss: 0.008223538286983967 2023-01-22 16:47:31.574868: step: 466/466, loss: 0.01025502197444439 2023-01-22 16:47:32.339094: step: 468/466, loss: 0.0026410671416670084 2023-01-22 16:47:33.141756: step: 470/466, loss: 0.0350828617811203 2023-01-22 16:47:33.962172: step: 472/466, loss: 0.01333890575915575 2023-01-22 16:47:34.804917: step: 474/466, loss: 0.04505704715847969 2023-01-22 16:47:35.488324: step: 476/466, loss: 0.052273474633693695 2023-01-22 16:47:36.237114: step: 478/466, loss: 9.465384937357157e-05 2023-01-22 16:47:36.970366: step: 480/466, loss: 0.027441969141364098 2023-01-22 16:47:37.729032: step: 482/466, loss: 0.001578064518980682 2023-01-22 16:47:38.463989: step: 484/466, loss: 0.0018492471426725388 2023-01-22 16:47:39.185617: step: 486/466, loss: 0.016192801296710968 2023-01-22 16:47:39.970959: step: 488/466, loss: 0.005125532392412424 2023-01-22 16:47:40.809939: step: 490/466, loss: 0.0839591696858406 2023-01-22 16:47:41.626443: step: 492/466, loss: 0.0011096717789769173 2023-01-22 16:47:42.357640: step: 494/466, loss: 0.08878828585147858 2023-01-22 16:47:43.286040: step: 496/466, loss: 0.00310177868232131 2023-01-22 16:47:44.005324: step: 498/466, loss: 0.017978543415665627 2023-01-22 16:47:44.695639: step: 500/466, loss: 0.001203813822939992 2023-01-22 16:47:45.450073: step: 502/466, loss: 0.0003878503921441734 2023-01-22 16:47:46.175772: step: 504/466, loss: 0.0007280244608409703 
2023-01-22 16:47:46.910037: step: 506/466, loss: 0.018997440114617348 2023-01-22 16:47:47.609176: step: 508/466, loss: 0.00015789296594448388 2023-01-22 16:47:48.414048: step: 510/466, loss: 0.00040656939381733537 2023-01-22 16:47:49.131522: step: 512/466, loss: 0.04028857499361038 2023-01-22 16:47:49.851783: step: 514/466, loss: 0.014796644449234009 2023-01-22 16:47:50.630263: step: 516/466, loss: 0.020934315398335457 2023-01-22 16:47:51.349823: step: 518/466, loss: 0.0004342313332017511 2023-01-22 16:47:52.135967: step: 520/466, loss: 0.0016563318204134703 2023-01-22 16:47:52.906553: step: 522/466, loss: 0.011757887899875641 2023-01-22 16:47:53.661934: step: 524/466, loss: 0.016990801319479942 2023-01-22 16:47:54.473687: step: 526/466, loss: 0.013895918615162373 2023-01-22 16:47:55.181981: step: 528/466, loss: 0.002518326509743929 2023-01-22 16:47:55.906708: step: 530/466, loss: 0.00970179121941328 2023-01-22 16:47:56.727689: step: 532/466, loss: 0.02301209419965744 2023-01-22 16:47:57.522571: step: 534/466, loss: 0.03104967251420021 2023-01-22 16:47:58.277308: step: 536/466, loss: 0.0026576619129627943 2023-01-22 16:47:59.028629: step: 538/466, loss: 0.004054277669638395 2023-01-22 16:47:59.792612: step: 540/466, loss: 0.7004632949829102 2023-01-22 16:48:00.515045: step: 542/466, loss: 0.0006315643549896777 2023-01-22 16:48:01.261757: step: 544/466, loss: 0.01206926442682743 2023-01-22 16:48:02.057011: step: 546/466, loss: 0.006809778045862913 2023-01-22 16:48:02.799753: step: 548/466, loss: 0.0025723904836922884 2023-01-22 16:48:03.626956: step: 550/466, loss: 0.0011128417681902647 2023-01-22 16:48:04.417172: step: 552/466, loss: 0.0028901181649416685 2023-01-22 16:48:05.181663: step: 554/466, loss: 0.012472089380025864 2023-01-22 16:48:05.964550: step: 556/466, loss: 0.03356686979532242 2023-01-22 16:48:06.737982: step: 558/466, loss: 0.00046890118392184377 2023-01-22 16:48:07.572883: step: 560/466, loss: 0.08391924202442169 2023-01-22 16:48:08.359360: step: 
562/466, loss: 0.006775359157472849 2023-01-22 16:48:09.110976: step: 564/466, loss: 0.011288279667496681 2023-01-22 16:48:09.965703: step: 566/466, loss: 0.03128872066736221 2023-01-22 16:48:10.689171: step: 568/466, loss: 0.00021249732526484877 2023-01-22 16:48:11.421111: step: 570/466, loss: 0.007754562888294458 2023-01-22 16:48:12.097916: step: 572/466, loss: 0.00020670304365921766 2023-01-22 16:48:12.894940: step: 574/466, loss: 0.00020100167603231966 2023-01-22 16:48:13.627280: step: 576/466, loss: 0.0014069074532017112 2023-01-22 16:48:14.432324: step: 578/466, loss: 0.05859680846333504 2023-01-22 16:48:15.182075: step: 580/466, loss: 0.017105508595705032 2023-01-22 16:48:15.868824: step: 582/466, loss: 0.0008416476775892079 2023-01-22 16:48:16.619549: step: 584/466, loss: 0.00212100800126791 2023-01-22 16:48:17.452707: step: 586/466, loss: 0.007836099714040756 2023-01-22 16:48:18.232720: step: 588/466, loss: 0.006854955572634935 2023-01-22 16:48:18.983524: step: 590/466, loss: 0.0015867829788476229 2023-01-22 16:48:19.747321: step: 592/466, loss: 1.9550052456906997e-06 2023-01-22 16:48:20.509601: step: 594/466, loss: 0.00018021200958173722 2023-01-22 16:48:21.274857: step: 596/466, loss: 0.03990541771054268 2023-01-22 16:48:22.040534: step: 598/466, loss: 0.0500204972922802 2023-01-22 16:48:22.734202: step: 600/466, loss: 0.1216036006808281 2023-01-22 16:48:23.532270: step: 602/466, loss: 0.010651436634361744 2023-01-22 16:48:24.279945: step: 604/466, loss: 0.002248160308226943 2023-01-22 16:48:25.016387: step: 606/466, loss: 0.07864465564489365 2023-01-22 16:48:25.836667: step: 608/466, loss: 0.007119298912584782 2023-01-22 16:48:26.549413: step: 610/466, loss: 0.0018475898541510105 2023-01-22 16:48:27.298343: step: 612/466, loss: 0.0014052045298740268 2023-01-22 16:48:28.945757: step: 614/466, loss: 0.0003670759324450046 2023-01-22 16:48:29.712958: step: 616/466, loss: 0.047144003212451935 2023-01-22 16:48:30.434954: step: 618/466, loss: 
0.014714747667312622 2023-01-22 16:48:31.181355: step: 620/466, loss: 0.0009839566191658378 2023-01-22 16:48:32.036342: step: 622/466, loss: 0.0046821883879601955 2023-01-22 16:48:32.822124: step: 624/466, loss: 0.01558632217347622 2023-01-22 16:48:33.558220: step: 626/466, loss: 0.022430801764130592 2023-01-22 16:48:34.337017: step: 628/466, loss: 4.678280674852431e-05 2023-01-22 16:48:35.067527: step: 630/466, loss: 0.008924842812120914 2023-01-22 16:48:35.819198: step: 632/466, loss: 0.36126741766929626 2023-01-22 16:48:36.639873: step: 634/466, loss: 0.0009075882844626904 2023-01-22 16:48:37.358805: step: 636/466, loss: 0.0497039258480072 2023-01-22 16:48:38.184742: step: 638/466, loss: 0.0003013135283254087 2023-01-22 16:48:38.927142: step: 640/466, loss: 0.003686620155349374 2023-01-22 16:48:39.658101: step: 642/466, loss: 9.751777542987838e-05 2023-01-22 16:48:40.477750: step: 644/466, loss: 0.0009560614125803113 2023-01-22 16:48:41.375510: step: 646/466, loss: 0.002088146051391959 2023-01-22 16:48:42.093893: step: 648/466, loss: 0.003502498846501112 2023-01-22 16:48:42.836921: step: 650/466, loss: 0.001366930897347629 2023-01-22 16:48:43.588188: step: 652/466, loss: 0.020780501887202263 2023-01-22 16:48:44.313093: step: 654/466, loss: 0.0038583616260439157 2023-01-22 16:48:45.001757: step: 656/466, loss: 0.0011789824347943068 2023-01-22 16:48:45.818784: step: 658/466, loss: 0.003159801010042429 2023-01-22 16:48:46.569603: step: 660/466, loss: 0.013112317770719528 2023-01-22 16:48:47.368425: step: 662/466, loss: 0.0060791620053350925 2023-01-22 16:48:48.103280: step: 664/466, loss: 0.0005515529774129391 2023-01-22 16:48:48.825939: step: 666/466, loss: 0.010279749520123005 2023-01-22 16:48:49.675619: step: 668/466, loss: 0.02576580084860325 2023-01-22 16:48:50.425703: step: 670/466, loss: 0.004443008918315172 2023-01-22 16:48:51.129016: step: 672/466, loss: 0.0020656739361584187 2023-01-22 16:48:51.761173: step: 674/466, loss: 0.059588123112916946 2023-01-22 
16:48:52.567013: step: 676/466, loss: 0.06571090221405029 2023-01-22 16:48:53.284685: step: 678/466, loss: 0.0033334565814584494 2023-01-22 16:48:53.961217: step: 680/466, loss: 3.0374652851605788e-05 2023-01-22 16:48:54.730809: step: 682/466, loss: 0.002010779222473502 2023-01-22 16:48:55.412890: step: 684/466, loss: 0.00045487828901968896 2023-01-22 16:48:56.119818: step: 686/466, loss: 0.009617500938475132 2023-01-22 16:48:56.865604: step: 688/466, loss: 1.1015014933946077e-05 2023-01-22 16:48:57.576803: step: 690/466, loss: 0.01960671693086624 2023-01-22 16:48:58.376340: step: 692/466, loss: 0.004229302518069744 2023-01-22 16:48:59.137566: step: 694/466, loss: 0.0006667460547760129 2023-01-22 16:48:59.867719: step: 696/466, loss: 0.0023100117687135935 2023-01-22 16:49:00.571914: step: 698/466, loss: 0.010152764618396759 2023-01-22 16:49:01.245188: step: 700/466, loss: 0.03864798694849014 2023-01-22 16:49:02.066978: step: 702/466, loss: 0.08513421565294266 2023-01-22 16:49:02.838529: step: 704/466, loss: 0.08132486790418625 2023-01-22 16:49:03.646536: step: 706/466, loss: 0.0360155887901783 2023-01-22 16:49:04.431773: step: 708/466, loss: 0.029786350205540657 2023-01-22 16:49:05.156789: step: 710/466, loss: 0.00011806087422883138 2023-01-22 16:49:05.891630: step: 712/466, loss: 0.0663982704281807 2023-01-22 16:49:06.566556: step: 714/466, loss: 0.0035313130356371403 2023-01-22 16:49:07.283203: step: 716/466, loss: 0.00889684446156025 2023-01-22 16:49:07.969857: step: 718/466, loss: 0.0003816070093307644 2023-01-22 16:49:08.902378: step: 720/466, loss: 0.06195018067955971 2023-01-22 16:49:09.717275: step: 722/466, loss: 0.04165364056825638 2023-01-22 16:49:10.416517: step: 724/466, loss: 0.005327022168785334 2023-01-22 16:49:11.148160: step: 726/466, loss: 0.005947711877524853 2023-01-22 16:49:11.865295: step: 728/466, loss: 0.0034825638867914677 2023-01-22 16:49:12.628867: step: 730/466, loss: 0.018863987177610397 2023-01-22 16:49:13.340005: step: 732/466, loss: 
0.016013626009225845 2023-01-22 16:49:14.026278: step: 734/466, loss: 0.0028963927179574966 2023-01-22 16:49:14.765901: step: 736/466, loss: 0.007359576877206564 2023-01-22 16:49:15.580036: step: 738/466, loss: 0.003223610110580921 2023-01-22 16:49:16.291630: step: 740/466, loss: 0.08692073076963425 2023-01-22 16:49:17.046353: step: 742/466, loss: 0.00045874129864387214 2023-01-22 16:49:17.809277: step: 744/466, loss: 0.00390154798515141 2023-01-22 16:49:18.548529: step: 746/466, loss: 0.1868094652891159 2023-01-22 16:49:19.326290: step: 748/466, loss: 0.058032043278217316 2023-01-22 16:49:20.172766: step: 750/466, loss: 0.0044331420212984085 2023-01-22 16:49:20.891640: step: 752/466, loss: 0.0005825747502967715 2023-01-22 16:49:21.729945: step: 754/466, loss: 0.019781772047281265 2023-01-22 16:49:22.494812: step: 756/466, loss: 0.013738803565502167 2023-01-22 16:49:23.243470: step: 758/466, loss: 0.0019082339713349938 2023-01-22 16:49:23.924362: step: 760/466, loss: 0.0057865045964717865 2023-01-22 16:49:24.663842: step: 762/466, loss: 0.002451360924169421 2023-01-22 16:49:25.370390: step: 764/466, loss: 0.005069708917289972 2023-01-22 16:49:26.103783: step: 766/466, loss: 0.0019848656374961138 2023-01-22 16:49:26.893209: step: 768/466, loss: 0.010547601617872715 2023-01-22 16:49:27.724149: step: 770/466, loss: 0.045725855976343155 2023-01-22 16:49:28.401971: step: 772/466, loss: 0.004928800743073225 2023-01-22 16:49:29.199574: step: 774/466, loss: 0.0019239893881604075 2023-01-22 16:49:29.942080: step: 776/466, loss: 0.1870308816432953 2023-01-22 16:49:30.692814: step: 778/466, loss: 0.08053610473871231 2023-01-22 16:49:31.509147: step: 780/466, loss: 0.006630890071392059 2023-01-22 16:49:32.348892: step: 782/466, loss: 0.006986789871007204 2023-01-22 16:49:33.044629: step: 784/466, loss: 0.001514837727881968 2023-01-22 16:49:33.761453: step: 786/466, loss: 0.0006803342257626355 2023-01-22 16:49:34.518348: step: 788/466, loss: 0.004624820314347744 2023-01-22 
16:49:35.259962: step: 790/466, loss: 0.01107484195381403 2023-01-22 16:49:36.039470: step: 792/466, loss: 0.00035859248600900173 2023-01-22 16:49:36.936583: step: 794/466, loss: 0.004311860539019108 2023-01-22 16:49:37.640449: step: 796/466, loss: 0.006726352032274008 2023-01-22 16:49:38.349301: step: 798/466, loss: 0.02447984181344509 2023-01-22 16:49:39.100265: step: 800/466, loss: 0.026846712455153465 2023-01-22 16:49:39.888665: step: 802/466, loss: 0.031323280185461044 2023-01-22 16:49:40.704408: step: 804/466, loss: 0.008843212388455868 2023-01-22 16:49:41.500144: step: 806/466, loss: 0.0007169311284087598 2023-01-22 16:49:42.257185: step: 808/466, loss: 0.002318635117262602 2023-01-22 16:49:43.052975: step: 810/466, loss: 0.06627952307462692 2023-01-22 16:49:43.740643: step: 812/466, loss: 0.00226763472892344 2023-01-22 16:49:44.585999: step: 814/466, loss: 0.0014524434227496386 2023-01-22 16:49:45.432750: step: 816/466, loss: 0.000614943157415837 2023-01-22 16:49:46.188291: step: 818/466, loss: 0.009615597315132618 2023-01-22 16:49:46.933390: step: 820/466, loss: 0.058534231036901474 2023-01-22 16:49:47.737201: step: 822/466, loss: 0.037576254457235336 2023-01-22 16:49:48.485983: step: 824/466, loss: 0.5030120611190796 2023-01-22 16:49:49.276384: step: 826/466, loss: 0.026011312380433083 2023-01-22 16:49:50.037943: step: 828/466, loss: 0.012016969732940197 2023-01-22 16:49:50.712154: step: 830/466, loss: 0.000505308504216373 2023-01-22 16:49:51.418546: step: 832/466, loss: 0.013352626003324986 2023-01-22 16:49:52.105274: step: 834/466, loss: 0.004660810809582472 2023-01-22 16:49:52.912047: step: 836/466, loss: 0.013114354573190212 2023-01-22 16:49:53.650596: step: 838/466, loss: 0.021583227440714836 2023-01-22 16:49:54.406818: step: 840/466, loss: 0.0003550250257831067 2023-01-22 16:49:55.180045: step: 842/466, loss: 0.001051700091920793 2023-01-22 16:49:55.872752: step: 844/466, loss: 0.0002409262233413756 2023-01-22 16:49:56.582552: step: 846/466, loss: 
0.0005547681939788163 2023-01-22 16:49:57.322922: step: 848/466, loss: 0.007193288300186396 2023-01-22 16:49:58.084165: step: 850/466, loss: 0.00018879900744650513 2023-01-22 16:49:58.909166: step: 852/466, loss: 0.00012669037096202374 2023-01-22 16:49:59.642694: step: 854/466, loss: 0.010259213857352734 2023-01-22 16:50:00.389371: step: 856/466, loss: 0.0011524403234943748 2023-01-22 16:50:01.114212: step: 858/466, loss: 0.42043399810791016 2023-01-22 16:50:01.873106: step: 860/466, loss: 0.0032950653694570065 2023-01-22 16:50:02.655131: step: 862/466, loss: 0.004293408710509539 2023-01-22 16:50:03.377114: step: 864/466, loss: 0.014959764666855335 2023-01-22 16:50:04.207627: step: 866/466, loss: 0.02132979966700077 2023-01-22 16:50:05.043281: step: 868/466, loss: 0.020063387230038643 2023-01-22 16:50:05.944781: step: 870/466, loss: 0.005825446918606758 2023-01-22 16:50:06.648807: step: 872/466, loss: 0.0006191516295075417 2023-01-22 16:50:07.426123: step: 874/466, loss: 0.04492638260126114 2023-01-22 16:50:08.192503: step: 876/466, loss: 0.0017336331075057387 2023-01-22 16:50:08.959308: step: 878/466, loss: 0.03967329487204552 2023-01-22 16:50:09.729241: step: 880/466, loss: 0.008279534988105297 2023-01-22 16:50:10.515449: step: 882/466, loss: 0.004151599947363138 2023-01-22 16:50:11.271321: step: 884/466, loss: 0.014451387338340282 2023-01-22 16:50:11.968402: step: 886/466, loss: 0.0016142098465934396 2023-01-22 16:50:12.825705: step: 888/466, loss: 0.009346477687358856 2023-01-22 16:50:13.584387: step: 890/466, loss: 0.010483842343091965 2023-01-22 16:50:14.331435: step: 892/466, loss: 3.388475670362823e-05 2023-01-22 16:50:15.142261: step: 894/466, loss: 0.007849263958632946 2023-01-22 16:50:15.913163: step: 896/466, loss: 0.06408103555440903 2023-01-22 16:50:16.776146: step: 898/466, loss: 0.008165200240910053 2023-01-22 16:50:17.512369: step: 900/466, loss: 0.02221984788775444 2023-01-22 16:50:18.307160: step: 902/466, loss: 0.0095894830301404 2023-01-22 
16:50:19.083125: step: 904/466, loss: 0.015169389545917511
2023-01-22 16:50:19.851500: step: 906/466, loss: 0.001172103569842875
2023-01-22 16:50:20.559038: step: 908/466, loss: 0.009952932596206665
2023-01-22 16:50:21.413283: step: 910/466, loss: 0.007936015725135803
2023-01-22 16:50:22.197412: step: 912/466, loss: 0.011334918439388275
2023-01-22 16:50:22.980448: step: 914/466, loss: 0.004115479066967964
2023-01-22 16:50:23.682498: step: 916/466, loss: 5.65770678804256e-05
2023-01-22 16:50:24.403881: step: 918/466, loss: 0.004586285445839167
2023-01-22 16:50:25.125692: step: 920/466, loss: 0.000270793039817363
2023-01-22 16:50:25.830269: step: 922/466, loss: 0.00014199443103279918
2023-01-22 16:50:26.522875: step: 924/466, loss: 0.0424426905810833
2023-01-22 16:50:27.377116: step: 926/466, loss: 0.023249058052897453
2023-01-22 16:50:28.109043: step: 928/466, loss: 0.022267363965511322
2023-01-22 16:50:28.823378: step: 930/466, loss: 0.008111781440675259
2023-01-22 16:50:29.539876: step: 932/466, loss: 0.003675048239529133
==================================================
Loss: 0.024
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2956869009584665, 'r': 0.35123339658444025, 'f1': 0.3210754553339116}, 'combined': 0.2365819144565664, 'epoch': 37}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3447676827785396, 'r': 0.3095141415585503, 'f1': 0.3261911592315681}, 'combined': 0.20048822469842723, 'epoch': 37}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2784739175643907, 'r': 0.3646053190122004, 'f1': 0.315771574559457}, 'combined': 0.2326737917806525, 'epoch': 37}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.32908061039328323, 'r': 0.314822351710732, 'f1': 0.3217936172490564}, 'combined': 0.1977853452360054, 'epoch': 37}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3172019867549669, 'r': 0.3635483870967742, 'f1': 0.3387975243147657}, 'combined': 0.24964028107403785, 'epoch': 37}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.35100506997519976, 'r': 0.3123762624475998, 'f1': 0.33056598520360403}, 'combined': 0.2041731085081084, 'epoch': 37}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2181372549019608, 'r': 0.31785714285714284, 'f1': 0.25872093023255816}, 'combined': 0.17248062015503876, 'epoch': 37}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2925531914893617, 'r': 0.5978260869565217, 'f1': 0.39285714285714285}, 'combined': 0.19642857142857142, 'epoch': 37}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4, 'r': 0.20689655172413793, 'f1': 0.2727272727272727}, 'combined': 0.1818181818181818, 'epoch': 37}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30156788595932055, 'r': 0.3456299869438892, 'f1': 0.322099032925605}, 'combined': 0.23733612952412997, 'epoch': 36}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.331509942889869, 'r': 0.31226283182087317, 'f1': 0.3215986683813365}, 'combined': 0.19766552300511414, 'epoch': 36}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32432432432432434, 'r': 0.5217391304347826, 'f1': 0.4}, 'combined': 0.2, 'epoch': 36}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
******************************
Epoch: 38
command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4
2023-01-22 16:53:15.818076: step: 2/466, loss: 0.00015852558135520667
2023-01-22 16:53:16.496506: step: 4/466, loss: 0.00033445178996771574
2023-01-22 16:53:17.297870: step: 6/466, loss: 0.0018999316962435842
2023-01-22 16:53:18.129054: step: 8/466, loss: 0.002631555078551173
2023-01-22 16:53:18.803174: step: 10/466, loss: 0.01319513376802206
2023-01-22 16:53:19.560941: step: 12/466, loss: 0.004481269977986813
2023-01-22 16:53:20.386474: step: 14/466, loss: 0.00275377556681633
2023-01-22 16:53:21.119454: step: 16/466, loss: 0.003091811668127775
2023-01-22
16:53:21.818146: step: 18/466, loss: 2.619748830795288 2023-01-22 16:53:22.557021: step: 20/466, loss: 0.012885436415672302 2023-01-22 16:53:23.260947: step: 22/466, loss: 0.04020577296614647 2023-01-22 16:53:23.948757: step: 24/466, loss: 0.015233595855534077 2023-01-22 16:53:24.801807: step: 26/466, loss: 0.0010679669212549925 2023-01-22 16:53:25.572616: step: 28/466, loss: 0.014175614342093468 2023-01-22 16:53:26.312993: step: 30/466, loss: 0.0001996166247408837 2023-01-22 16:53:27.032913: step: 32/466, loss: 0.011988112702965736 2023-01-22 16:53:27.706427: step: 34/466, loss: 0.00021556120191235095 2023-01-22 16:53:28.413148: step: 36/466, loss: 0.0013716488610953093 2023-01-22 16:53:29.174475: step: 38/466, loss: 0.001971025485545397 2023-01-22 16:53:29.937656: step: 40/466, loss: 9.164853690890595e-05 2023-01-22 16:53:30.699371: step: 42/466, loss: 0.0010934865567833185 2023-01-22 16:53:31.418608: step: 44/466, loss: 0.022326651960611343 2023-01-22 16:53:32.222388: step: 46/466, loss: 0.0021874187514185905 2023-01-22 16:53:33.005214: step: 48/466, loss: 0.0008808678830973804 2023-01-22 16:53:33.748397: step: 50/466, loss: 0.10775995254516602 2023-01-22 16:53:34.467532: step: 52/466, loss: 0.00046264220145531 2023-01-22 16:53:35.178434: step: 54/466, loss: 0.006234243977814913 2023-01-22 16:53:35.942225: step: 56/466, loss: 0.010173077695071697 2023-01-22 16:53:36.745201: step: 58/466, loss: 0.04318838566541672 2023-01-22 16:53:37.450646: step: 60/466, loss: 0.0006960787577554584 2023-01-22 16:53:38.183022: step: 62/466, loss: 0.45456135272979736 2023-01-22 16:53:38.963752: step: 64/466, loss: 0.007998891174793243 2023-01-22 16:53:39.799788: step: 66/466, loss: 0.116494320333004 2023-01-22 16:53:40.571219: step: 68/466, loss: 0.0003614113375078887 2023-01-22 16:53:41.302265: step: 70/466, loss: 0.0005324063822627068 2023-01-22 16:53:42.208821: step: 72/466, loss: 0.0016135982004925609 2023-01-22 16:53:42.897253: step: 74/466, loss: 0.0055488417856395245 
2023-01-22 16:53:43.673905: step: 76/466, loss: 0.05743572115898132 2023-01-22 16:53:44.440902: step: 78/466, loss: 0.28396084904670715 2023-01-22 16:53:45.180740: step: 80/466, loss: 0.010283631272614002 2023-01-22 16:53:45.984666: step: 82/466, loss: 0.008233075961470604 2023-01-22 16:53:46.686169: step: 84/466, loss: 0.03372426703572273 2023-01-22 16:53:47.408342: step: 86/466, loss: 0.00029308663215488195 2023-01-22 16:53:48.193038: step: 88/466, loss: 0.07326936721801758 2023-01-22 16:53:49.072516: step: 90/466, loss: 0.008953921496868134 2023-01-22 16:53:49.789694: step: 92/466, loss: 5.050439722253941e-05 2023-01-22 16:53:50.597708: step: 94/466, loss: 0.008332867175340652 2023-01-22 16:53:51.322207: step: 96/466, loss: 0.06297744810581207 2023-01-22 16:53:52.030155: step: 98/466, loss: 0.002277067629620433 2023-01-22 16:53:52.858320: step: 100/466, loss: 0.020023813471198082 2023-01-22 16:53:53.632863: step: 102/466, loss: 0.0022804755717515945 2023-01-22 16:53:54.384111: step: 104/466, loss: 0.004442311357706785 2023-01-22 16:53:55.143168: step: 106/466, loss: 0.0021702053491026163 2023-01-22 16:53:55.830754: step: 108/466, loss: 9.901211342366878e-06 2023-01-22 16:53:56.624642: step: 110/466, loss: 0.012853569351136684 2023-01-22 16:53:57.429330: step: 112/466, loss: 0.0051322742365300655 2023-01-22 16:53:58.137305: step: 114/466, loss: 0.0065353913232684135 2023-01-22 16:53:58.937988: step: 116/466, loss: 0.00728453928604722 2023-01-22 16:53:59.606701: step: 118/466, loss: 0.003233555005863309 2023-01-22 16:54:00.281228: step: 120/466, loss: 0.0018335931235924363 2023-01-22 16:54:01.034724: step: 122/466, loss: 0.0021310176234692335 2023-01-22 16:54:01.768362: step: 124/466, loss: 3.200375067535788e-05 2023-01-22 16:54:02.475160: step: 126/466, loss: 0.046245038509368896 2023-01-22 16:54:03.201953: step: 128/466, loss: 0.03275144472718239 2023-01-22 16:54:03.982497: step: 130/466, loss: 0.012383184395730495 2023-01-22 16:54:04.712648: step: 132/466, 
loss: 0.6921976804733276 2023-01-22 16:54:05.497266: step: 134/466, loss: 0.023298662155866623 2023-01-22 16:54:06.268252: step: 136/466, loss: 0.0016130059957504272 2023-01-22 16:54:07.076167: step: 138/466, loss: 0.0016224872088059783 2023-01-22 16:54:07.798392: step: 140/466, loss: 0.0008939092513173819 2023-01-22 16:54:08.573086: step: 142/466, loss: 0.0006672238232567906 2023-01-22 16:54:09.304018: step: 144/466, loss: 0.00451664999127388 2023-01-22 16:54:09.974247: step: 146/466, loss: 0.00010222888522548601 2023-01-22 16:54:10.666083: step: 148/466, loss: 0.014285118319094181 2023-01-22 16:54:11.431370: step: 150/466, loss: 0.007723046466708183 2023-01-22 16:54:12.232827: step: 152/466, loss: 0.013501619920134544 2023-01-22 16:54:13.034337: step: 154/466, loss: 0.010910639539361 2023-01-22 16:54:13.791028: step: 156/466, loss: 0.9693259596824646 2023-01-22 16:54:14.473504: step: 158/466, loss: 0.13675321638584137 2023-01-22 16:54:15.248340: step: 160/466, loss: 0.0007152618491090834 2023-01-22 16:54:15.984262: step: 162/466, loss: 0.009600437246263027 2023-01-22 16:54:16.740438: step: 164/466, loss: 0.0026017329655587673 2023-01-22 16:54:17.436042: step: 166/466, loss: 0.0006253659958019853 2023-01-22 16:54:18.136106: step: 168/466, loss: 0.0031114050652831793 2023-01-22 16:54:18.816536: step: 170/466, loss: 0.002069795271381736 2023-01-22 16:54:19.567319: step: 172/466, loss: 0.006407567299902439 2023-01-22 16:54:20.281094: step: 174/466, loss: 0.005875298287719488 2023-01-22 16:54:20.957771: step: 176/466, loss: 0.014285593293607235 2023-01-22 16:54:21.752572: step: 178/466, loss: 0.011498566716909409 2023-01-22 16:54:22.477969: step: 180/466, loss: 0.004969505127519369 2023-01-22 16:54:23.211072: step: 182/466, loss: 0.0030212486162781715 2023-01-22 16:54:23.963040: step: 184/466, loss: 0.0011963235447183251 2023-01-22 16:54:24.750157: step: 186/466, loss: 0.0015245783142745495 2023-01-22 16:54:25.506420: step: 188/466, loss: 0.0013877090532332659 
2023-01-22 16:54:26.348377: step: 190/466, loss: 0.03716769814491272 2023-01-22 16:54:27.119556: step: 192/466, loss: 0.00022803526371717453 2023-01-22 16:54:27.886400: step: 194/466, loss: 0.019137857481837273 2023-01-22 16:54:28.554247: step: 196/466, loss: 0.0002453289635013789 2023-01-22 16:54:29.294216: step: 198/466, loss: 0.003092587925493717 2023-01-22 16:54:30.165074: step: 200/466, loss: 0.0006169404368847609 2023-01-22 16:54:30.880291: step: 202/466, loss: 0.004191730171442032 2023-01-22 16:54:31.693332: step: 204/466, loss: 0.00016001469339244068 2023-01-22 16:54:32.501406: step: 206/466, loss: 0.0030550588853657246 2023-01-22 16:54:33.181019: step: 208/466, loss: 0.0962376520037651 2023-01-22 16:54:33.881693: step: 210/466, loss: 0.02584371156990528 2023-01-22 16:54:34.603297: step: 212/466, loss: 0.01125494297593832 2023-01-22 16:54:35.365158: step: 214/466, loss: 4.2713403672678396e-05 2023-01-22 16:54:36.161477: step: 216/466, loss: 0.039989691227674484 2023-01-22 16:54:36.905670: step: 218/466, loss: 0.00022126472322270274 2023-01-22 16:54:37.722758: step: 220/466, loss: 0.012783754616975784 2023-01-22 16:54:38.454069: step: 222/466, loss: 0.5585067868232727 2023-01-22 16:54:39.201880: step: 224/466, loss: 0.14413070678710938 2023-01-22 16:54:39.883150: step: 226/466, loss: 0.04216240718960762 2023-01-22 16:54:40.598125: step: 228/466, loss: 0.0006263788091018796 2023-01-22 16:54:41.357075: step: 230/466, loss: 0.00011739307956304401 2023-01-22 16:54:42.033415: step: 232/466, loss: 0.0006011001532897353 2023-01-22 16:54:42.724195: step: 234/466, loss: 0.00026434441679157317 2023-01-22 16:54:43.506278: step: 236/466, loss: 0.03465807065367699 2023-01-22 16:54:44.310869: step: 238/466, loss: 0.03530780225992203 2023-01-22 16:54:45.101118: step: 240/466, loss: 0.007487792056053877 2023-01-22 16:54:45.871383: step: 242/466, loss: 0.027928482741117477 2023-01-22 16:54:46.581293: step: 244/466, loss: 0.0005281373159959912 2023-01-22 16:54:47.324315: 
step: 246/466, loss: 2.9038224965916015e-05 2023-01-22 16:54:48.103356: step: 248/466, loss: 0.015432695858180523 2023-01-22 16:54:48.969856: step: 250/466, loss: 0.0829966738820076 2023-01-22 16:54:49.628555: step: 252/466, loss: 0.0011326514650136232 2023-01-22 16:54:50.371650: step: 254/466, loss: 0.001102382899262011 2023-01-22 16:54:51.117397: step: 256/466, loss: 5.16331747348886e-05 2023-01-22 16:54:51.875695: step: 258/466, loss: 0.002248742850497365 2023-01-22 16:54:52.644682: step: 260/466, loss: 0.0016092954901978374 2023-01-22 16:54:53.340860: step: 262/466, loss: 0.0007166212890297174 2023-01-22 16:54:54.049001: step: 264/466, loss: 6.099096208345145e-05 2023-01-22 16:54:54.800535: step: 266/466, loss: 0.04940228536725044 2023-01-22 16:54:55.648450: step: 268/466, loss: 0.00033056834945455194 2023-01-22 16:54:56.358868: step: 270/466, loss: 0.00013981727533973753 2023-01-22 16:54:57.093267: step: 272/466, loss: 0.004971823655068874 2023-01-22 16:54:57.857554: step: 274/466, loss: 0.000589183415286243 2023-01-22 16:54:58.584328: step: 276/466, loss: 0.0033571096137166023 2023-01-22 16:54:59.313624: step: 278/466, loss: 0.02178746648132801 2023-01-22 16:55:00.114524: step: 280/466, loss: 1.2804764537577285e-06 2023-01-22 16:55:00.874626: step: 282/466, loss: 0.004462054930627346 2023-01-22 16:55:01.620574: step: 284/466, loss: 0.003304133890196681 2023-01-22 16:55:02.299460: step: 286/466, loss: 0.007589017506688833 2023-01-22 16:55:02.975035: step: 288/466, loss: 5.5421369324903935e-06 2023-01-22 16:55:03.794303: step: 290/466, loss: 0.05665628984570503 2023-01-22 16:55:04.599526: step: 292/466, loss: 0.04041202366352081 2023-01-22 16:55:05.444802: step: 294/466, loss: 0.425844669342041 2023-01-22 16:55:06.192363: step: 296/466, loss: 0.010454821400344372 2023-01-22 16:55:06.946493: step: 298/466, loss: 0.0012415708042681217 2023-01-22 16:55:07.687321: step: 300/466, loss: 0.049156442284584045 2023-01-22 16:55:08.507771: step: 302/466, loss: 
0.0007102354429662228 2023-01-22 16:55:09.225465: step: 304/466, loss: 0.0025498925242573023 2023-01-22 16:55:09.957874: step: 306/466, loss: 0.0020394527819007635 2023-01-22 16:55:10.707754: step: 308/466, loss: 0.0024311868473887444 2023-01-22 16:55:11.445005: step: 310/466, loss: 0.026064248755574226 2023-01-22 16:55:12.251601: step: 312/466, loss: 0.05080139636993408 2023-01-22 16:55:12.985798: step: 314/466, loss: 0.003608688246458769 2023-01-22 16:55:13.707276: step: 316/466, loss: 4.1068644350161776e-05 2023-01-22 16:55:14.431114: step: 318/466, loss: 0.04948783665895462 2023-01-22 16:55:15.184345: step: 320/466, loss: 0.00613086624071002 2023-01-22 16:55:15.880236: step: 322/466, loss: 0.014376579783856869 2023-01-22 16:55:16.632460: step: 324/466, loss: 0.007867695763707161 2023-01-22 16:55:17.300394: step: 326/466, loss: 0.011521144770085812 2023-01-22 16:55:18.063430: step: 328/466, loss: 0.005872808862477541 2023-01-22 16:55:18.854586: step: 330/466, loss: 0.0047467658296227455 2023-01-22 16:55:19.752087: step: 332/466, loss: 0.003558957949280739 2023-01-22 16:55:20.545757: step: 334/466, loss: 0.006062633357942104 2023-01-22 16:55:21.375948: step: 336/466, loss: 0.0010586964199319482 2023-01-22 16:55:22.110529: step: 338/466, loss: 0.0005861878162249923 2023-01-22 16:55:22.921295: step: 340/466, loss: 0.0028245861176401377 2023-01-22 16:55:23.841578: step: 342/466, loss: 0.03435346484184265 2023-01-22 16:55:24.592182: step: 344/466, loss: 0.022072521969676018 2023-01-22 16:55:25.329947: step: 346/466, loss: 0.00022341452131513506 2023-01-22 16:55:26.102361: step: 348/466, loss: 0.025059375911951065 2023-01-22 16:55:26.897614: step: 350/466, loss: 0.00031253520864993334 2023-01-22 16:55:27.682951: step: 352/466, loss: 0.00012384621368255466 2023-01-22 16:55:28.471464: step: 354/466, loss: 0.004707323852926493 2023-01-22 16:55:29.237787: step: 356/466, loss: 0.1190241277217865 2023-01-22 16:55:29.987986: step: 358/466, loss: 0.00039719868800602853 
2023-01-22 16:55:30.756705: step: 360/466, loss: 0.029429111629724503 2023-01-22 16:55:31.527570: step: 362/466, loss: 0.01712273247539997 2023-01-22 16:55:32.337138: step: 364/466, loss: 0.009178683161735535 2023-01-22 16:55:33.095103: step: 366/466, loss: 0.00032894068863242865 2023-01-22 16:55:33.782291: step: 368/466, loss: 0.0008436114294454455 2023-01-22 16:55:34.604079: step: 370/466, loss: 0.0006354791112244129 2023-01-22 16:55:35.433401: step: 372/466, loss: 0.05993903428316116 2023-01-22 16:55:36.218103: step: 374/466, loss: 0.00040928684757091105 2023-01-22 16:55:37.011398: step: 376/466, loss: 0.0014766182284802198 2023-01-22 16:55:37.727945: step: 378/466, loss: 3.832852598861791e-05 2023-01-22 16:55:38.531719: step: 380/466, loss: 0.06406915932893753 2023-01-22 16:55:39.265785: step: 382/466, loss: 0.00010840618779184297 2023-01-22 16:55:40.043522: step: 384/466, loss: 0.007566460873931646 2023-01-22 16:55:40.809406: step: 386/466, loss: 0.0007533340249210596 2023-01-22 16:55:41.541697: step: 388/466, loss: 0.00020608464546967298 2023-01-22 16:55:42.290225: step: 390/466, loss: 0.03024054691195488 2023-01-22 16:55:43.126806: step: 392/466, loss: 0.04676214978098869 2023-01-22 16:55:43.826466: step: 394/466, loss: 0.0019371528178453445 2023-01-22 16:55:44.612320: step: 396/466, loss: 0.00275464728474617 2023-01-22 16:55:45.299616: step: 398/466, loss: 0.0007575393537990749 2023-01-22 16:55:46.015160: step: 400/466, loss: 0.02955714613199234 2023-01-22 16:55:46.812259: step: 402/466, loss: 0.021825818344950676 2023-01-22 16:55:47.630343: step: 404/466, loss: 0.0011155270040035248 2023-01-22 16:55:48.441393: step: 406/466, loss: 7.650330371689051e-05 2023-01-22 16:55:49.199192: step: 408/466, loss: 0.01209378894418478 2023-01-22 16:55:49.946085: step: 410/466, loss: 0.09773796796798706 2023-01-22 16:55:50.757891: step: 412/466, loss: 0.007413296960294247 2023-01-22 16:55:51.504197: step: 414/466, loss: 0.010011570528149605 2023-01-22 16:55:52.302449: 
step: 416/466, loss: 0.0029658616986125708 2023-01-22 16:55:53.122514: step: 418/466, loss: 0.018716806545853615 2023-01-22 16:55:53.860239: step: 420/466, loss: 3.0199102184269577e-05 2023-01-22 16:55:54.600125: step: 422/466, loss: 0.0003793942742049694 2023-01-22 16:55:55.353085: step: 424/466, loss: 0.0004991032765246928 2023-01-22 16:55:56.111837: step: 426/466, loss: 0.00234232097864151 2023-01-22 16:55:56.854881: step: 428/466, loss: 0.015742633491754532 2023-01-22 16:55:57.578644: step: 430/466, loss: 0.022508902475237846 2023-01-22 16:55:58.284689: step: 432/466, loss: 6.772443884983659e-05 2023-01-22 16:55:59.074066: step: 434/466, loss: 0.0015357910888269544 2023-01-22 16:55:59.827150: step: 436/466, loss: 0.0032131534535437822 2023-01-22 16:56:00.578124: step: 438/466, loss: 0.0007034862646833062 2023-01-22 16:56:01.325250: step: 440/466, loss: 0.0008740437333472073 2023-01-22 16:56:02.111023: step: 442/466, loss: 0.9496735334396362 2023-01-22 16:56:02.882710: step: 444/466, loss: 0.016295237466692924 2023-01-22 16:56:03.609032: step: 446/466, loss: 0.0013185566058382392 2023-01-22 16:56:04.295345: step: 448/466, loss: 0.01054754201322794 2023-01-22 16:56:05.023852: step: 450/466, loss: 0.01293019950389862 2023-01-22 16:56:05.885390: step: 452/466, loss: 0.0074695199728012085 2023-01-22 16:56:06.582795: step: 454/466, loss: 0.01076709944754839 2023-01-22 16:56:07.351399: step: 456/466, loss: 0.005442376714199781 2023-01-22 16:56:08.075709: step: 458/466, loss: 0.005675592925399542 2023-01-22 16:56:08.894818: step: 460/466, loss: 0.005998608190566301 2023-01-22 16:56:09.626449: step: 462/466, loss: 0.0031393063254654408 2023-01-22 16:56:10.347620: step: 464/466, loss: 0.0003654154425021261 2023-01-22 16:56:11.120267: step: 466/466, loss: 0.007295468356460333 2023-01-22 16:56:11.818304: step: 468/466, loss: 0.002882494358345866 2023-01-22 16:56:12.675857: step: 470/466, loss: 0.002005321439355612 2023-01-22 16:56:13.450920: step: 472/466, loss: 
0.10671700537204742 2023-01-22 16:56:14.173300: step: 474/466, loss: 0.00037021772004663944 2023-01-22 16:56:14.933051: step: 476/466, loss: 0.001452124328352511 2023-01-22 16:56:15.694994: step: 478/466, loss: 0.012873172760009766 2023-01-22 16:56:16.415824: step: 480/466, loss: 0.00556617695838213 2023-01-22 16:56:17.151013: step: 482/466, loss: 0.003965499345213175 2023-01-22 16:56:17.954085: step: 484/466, loss: 0.026193661615252495 2023-01-22 16:56:18.694557: step: 486/466, loss: 0.03303632140159607 2023-01-22 16:56:19.458570: step: 488/466, loss: 0.006419027224183083 2023-01-22 16:56:20.224384: step: 490/466, loss: 0.011763244867324829 2023-01-22 16:56:20.923004: step: 492/466, loss: 0.0004500457434915006 2023-01-22 16:56:21.727042: step: 494/466, loss: 0.000631829840131104 2023-01-22 16:56:22.507114: step: 496/466, loss: 0.01301645953208208 2023-01-22 16:56:23.183265: step: 498/466, loss: 0.002552577992901206 2023-01-22 16:56:23.884091: step: 500/466, loss: 0.002179771428927779 2023-01-22 16:56:24.597730: step: 502/466, loss: 0.003267711028456688 2023-01-22 16:56:25.303063: step: 504/466, loss: 0.003001777222380042 2023-01-22 16:56:26.058617: step: 506/466, loss: 0.2626116871833801 2023-01-22 16:56:26.854712: step: 508/466, loss: 4.068400085088797e-05 2023-01-22 16:56:27.613545: step: 510/466, loss: 0.0008522938587702811 2023-01-22 16:56:28.423937: step: 512/466, loss: 0.004108110908418894 2023-01-22 16:56:29.091925: step: 514/466, loss: 0.0018746658461168408 2023-01-22 16:56:29.874760: step: 516/466, loss: 0.1399819254875183 2023-01-22 16:56:30.580498: step: 518/466, loss: 0.008463362231850624 2023-01-22 16:56:31.411444: step: 520/466, loss: 0.0003495319979265332 2023-01-22 16:56:32.260578: step: 522/466, loss: 0.011042834259569645 2023-01-22 16:56:33.114509: step: 524/466, loss: 0.026595329865813255 2023-01-22 16:56:33.906265: step: 526/466, loss: 0.00015499211440328509 2023-01-22 16:56:34.631668: step: 528/466, loss: 0.0020827208645641804 2023-01-22 
16:56:35.407456: step: 530/466, loss: 0.000841870962176472 2023-01-22 16:56:36.246287: step: 532/466, loss: 0.0030897376127541065 2023-01-22 16:56:37.058971: step: 534/466, loss: 0.017255224287509918 2023-01-22 16:56:37.829427: step: 536/466, loss: 0.026173541322350502 2023-01-22 16:56:38.567025: step: 538/466, loss: 0.0008655735873617232 2023-01-22 16:56:39.420674: step: 540/466, loss: 0.002662037266418338 2023-01-22 16:56:40.185669: step: 542/466, loss: 0.6328471899032593 2023-01-22 16:56:40.920042: step: 544/466, loss: 0.002544648479670286 2023-01-22 16:56:41.740994: step: 546/466, loss: 0.0013098662020638585 2023-01-22 16:56:42.424335: step: 548/466, loss: 0.0001053504747687839 2023-01-22 16:56:43.231414: step: 550/466, loss: 3.780534098041244e-05 2023-01-22 16:56:43.943353: step: 552/466, loss: 0.014702056534588337 2023-01-22 16:56:44.724208: step: 554/466, loss: 0.0015839324332773685 2023-01-22 16:56:45.532264: step: 556/466, loss: 0.009505736641585827 2023-01-22 16:56:46.246273: step: 558/466, loss: 0.011248605325818062 2023-01-22 16:56:46.952918: step: 560/466, loss: 0.0002523681614547968 2023-01-22 16:56:47.688113: step: 562/466, loss: 0.008334793150424957 2023-01-22 16:56:48.408311: step: 564/466, loss: 0.0003578077012207359 2023-01-22 16:56:49.275470: step: 566/466, loss: 0.039999186992645264 2023-01-22 16:56:50.069742: step: 568/466, loss: 0.004132232163101435 2023-01-22 16:56:50.850880: step: 570/466, loss: 0.0009728502482175827 2023-01-22 16:56:51.678304: step: 572/466, loss: 0.01512494869530201 2023-01-22 16:56:52.445009: step: 574/466, loss: 3.994491999037564e-05 2023-01-22 16:56:53.286643: step: 576/466, loss: 0.005956718698143959 2023-01-22 16:56:54.007923: step: 578/466, loss: 0.018015773966908455 2023-01-22 16:56:54.664412: step: 580/466, loss: 0.0006239361246116459 2023-01-22 16:56:55.372611: step: 582/466, loss: 0.027943609282374382 2023-01-22 16:56:56.144413: step: 584/466, loss: 0.1084119901061058 2023-01-22 16:56:56.915538: step: 586/466, 
loss: 0.008794408291578293 2023-01-22 16:56:57.683363: step: 588/466, loss: 0.0004625521833077073 2023-01-22 16:56:58.637960: step: 590/466, loss: 0.018595660105347633 2023-01-22 16:56:59.409017: step: 592/466, loss: 0.002094178693369031 2023-01-22 16:57:00.304078: step: 594/466, loss: 0.0010547454003244638 2023-01-22 16:57:00.972360: step: 596/466, loss: 0.05371860787272453 2023-01-22 16:57:01.758134: step: 598/466, loss: 0.0032484375406056643 2023-01-22 16:57:02.499146: step: 600/466, loss: 0.01112589705735445 2023-01-22 16:57:03.273458: step: 602/466, loss: 0.0007691808859817684 2023-01-22 16:57:04.005619: step: 604/466, loss: 0.015428180806338787 2023-01-22 16:57:04.808256: step: 606/466, loss: 0.0009691608138382435 2023-01-22 16:57:05.528214: step: 608/466, loss: 0.0001645991433179006 2023-01-22 16:57:06.192170: step: 610/466, loss: 0.009261749684810638 2023-01-22 16:57:06.930331: step: 612/466, loss: 0.0037737779784947634 2023-01-22 16:57:07.681313: step: 614/466, loss: 0.0011043829144909978 2023-01-22 16:57:08.373980: step: 616/466, loss: 0.0018476293189451098 2023-01-22 16:57:09.132913: step: 618/466, loss: 0.00040778619586490095 2023-01-22 16:57:09.923392: step: 620/466, loss: 0.007349770981818438 2023-01-22 16:57:10.694939: step: 622/466, loss: 0.0008918479434214532 2023-01-22 16:57:11.447950: step: 624/466, loss: 0.0003366429591551423 2023-01-22 16:57:12.157182: step: 626/466, loss: 0.003023396944627166 2023-01-22 16:57:13.022096: step: 628/466, loss: 0.016576239839196205 2023-01-22 16:57:13.805253: step: 630/466, loss: 0.04712849110364914 2023-01-22 16:57:14.582554: step: 632/466, loss: 0.20594191551208496 2023-01-22 16:57:15.364760: step: 634/466, loss: 0.00019367921049706638 2023-01-22 16:57:16.162015: step: 636/466, loss: 0.006996245123445988 2023-01-22 16:57:16.959837: step: 638/466, loss: 0.0027225457597523928 2023-01-22 16:57:17.743143: step: 640/466, loss: 0.042146023362874985 2023-01-22 16:57:18.488606: step: 642/466, loss: 0.06693723797798157 
2023-01-22 16:57:19.232146: step: 644/466, loss: 0.007613023277372122 2023-01-22 16:57:19.961604: step: 646/466, loss: 0.008608619682490826 2023-01-22 16:57:20.713667: step: 648/466, loss: 0.016024397686123848 2023-01-22 16:57:21.520247: step: 650/466, loss: 0.0009257309720851481 2023-01-22 16:57:22.242371: step: 652/466, loss: 0.08387546241283417 2023-01-22 16:57:22.973559: step: 654/466, loss: 0.0019786653574556112 2023-01-22 16:57:23.648538: step: 656/466, loss: 0.0012632932048290968 2023-01-22 16:57:24.402843: step: 658/466, loss: 0.0005319062620401382 2023-01-22 16:57:25.301538: step: 660/466, loss: 0.016213281080126762 2023-01-22 16:57:26.087651: step: 662/466, loss: 0.01797870174050331 2023-01-22 16:57:26.908130: step: 664/466, loss: 0.0009241351508535445 2023-01-22 16:57:27.700141: step: 666/466, loss: 0.0023297558072954416 2023-01-22 16:57:28.514849: step: 668/466, loss: 0.0005481429398059845 2023-01-22 16:57:29.277342: step: 670/466, loss: 0.0035553527995944023 2023-01-22 16:57:30.058868: step: 672/466, loss: 0.03022400662302971 2023-01-22 16:57:30.771058: step: 674/466, loss: 5.107448669150472e-05 2023-01-22 16:57:31.446057: step: 676/466, loss: 7.539623038610443e-05 2023-01-22 16:57:32.220829: step: 678/466, loss: 0.00021213498257566243 2023-01-22 16:57:32.985086: step: 680/466, loss: 0.025467250496149063 2023-01-22 16:57:33.728344: step: 682/466, loss: 0.06051236391067505 2023-01-22 16:57:34.396923: step: 684/466, loss: 0.0003113978891633451 2023-01-22 16:57:35.140847: step: 686/466, loss: 0.00044558229274116457 2023-01-22 16:57:35.892339: step: 688/466, loss: 0.009315641596913338 2023-01-22 16:57:36.721696: step: 690/466, loss: 0.0009563063504174352 2023-01-22 16:57:37.438850: step: 692/466, loss: 0.0002467456506565213 2023-01-22 16:57:38.259588: step: 694/466, loss: 0.2347729653120041 2023-01-22 16:57:39.065992: step: 696/466, loss: 0.00039162777829915285 2023-01-22 16:57:39.880333: step: 698/466, loss: 0.26982438564300537 2023-01-22 16:57:40.802890: 
step: 700/466, loss: 0.08687514811754227 2023-01-22 16:57:41.628771: step: 702/466, loss: 0.02728988230228424 2023-01-22 16:57:42.374474: step: 704/466, loss: 0.012942968867719173 2023-01-22 16:57:43.173482: step: 706/466, loss: 0.006773222703486681 2023-01-22 16:57:43.920141: step: 708/466, loss: 0.00023235747357830405 2023-01-22 16:57:44.732500: step: 710/466, loss: 0.0025265063159167767 2023-01-22 16:57:45.470589: step: 712/466, loss: 0.002385400701314211 2023-01-22 16:57:46.191422: step: 714/466, loss: 0.005350553430616856 2023-01-22 16:57:47.020549: step: 716/466, loss: 0.5845271944999695 2023-01-22 16:57:47.838770: step: 718/466, loss: 0.0005377520574256778 2023-01-22 16:57:48.593971: step: 720/466, loss: 0.0005495331133715808 2023-01-22 16:57:49.414485: step: 722/466, loss: 0.0012658092891797423 2023-01-22 16:57:50.071423: step: 724/466, loss: 0.00034427945502102375 2023-01-22 16:57:50.803098: step: 726/466, loss: 0.018899939954280853 2023-01-22 16:57:51.530911: step: 728/466, loss: 0.03183247894048691 2023-01-22 16:57:52.303034: step: 730/466, loss: 0.02796352282166481 2023-01-22 16:57:53.068649: step: 732/466, loss: 0.017680644989013672 2023-01-22 16:57:53.780769: step: 734/466, loss: 0.000501900096423924 2023-01-22 16:57:54.534126: step: 736/466, loss: 0.08719105273485184 2023-01-22 16:57:55.395149: step: 738/466, loss: 0.19477348029613495 2023-01-22 16:57:56.139524: step: 740/466, loss: 0.005550151690840721 2023-01-22 16:57:56.899971: step: 742/466, loss: 0.009086458012461662 2023-01-22 16:57:57.655249: step: 744/466, loss: 0.018472852185368538 2023-01-22 16:57:58.410765: step: 746/466, loss: 0.007351238746196032 2023-01-22 16:57:59.294526: step: 748/466, loss: 0.002588978037238121 2023-01-22 16:58:00.053383: step: 750/466, loss: 0.10875791311264038 2023-01-22 16:58:00.874308: step: 752/466, loss: 0.06240355968475342 2023-01-22 16:58:01.685573: step: 754/466, loss: 0.48639747500419617 2023-01-22 16:58:02.411865: step: 756/466, loss: 0.0014076440129429102 
2023-01-22 16:58:03.154673: step: 758/466, loss: 0.00018148223171010613 2023-01-22 16:58:03.861598: step: 760/466, loss: 0.018693551421165466 2023-01-22 16:58:04.569043: step: 762/466, loss: 0.000626800290774554 2023-01-22 16:58:05.351301: step: 764/466, loss: 0.0006411468493752182 2023-01-22 16:58:06.001464: step: 766/466, loss: 0.00027139694429934025 2023-01-22 16:58:06.767461: step: 768/466, loss: 0.000939077464863658 2023-01-22 16:58:07.545904: step: 770/466, loss: 0.00024035456590354443 2023-01-22 16:58:08.270072: step: 772/466, loss: 0.007670735474675894 2023-01-22 16:58:09.060669: step: 774/466, loss: 0.0011248189257457852 2023-01-22 16:58:09.865473: step: 776/466, loss: 0.0717734843492508 2023-01-22 16:58:10.552412: step: 778/466, loss: 0.02742578461766243 2023-01-22 16:58:11.323800: step: 780/466, loss: 0.007576515898108482 2023-01-22 16:58:12.051428: step: 782/466, loss: 0.0005250229733064771 2023-01-22 16:58:12.891809: step: 784/466, loss: 0.012105558067560196 2023-01-22 16:58:13.650707: step: 786/466, loss: 0.006359845399856567 2023-01-22 16:58:14.350451: step: 788/466, loss: 0.24488550424575806 2023-01-22 16:58:15.106767: step: 790/466, loss: 0.0007657191599719226 2023-01-22 16:58:15.942444: step: 792/466, loss: 0.036015670746564865 2023-01-22 16:58:16.703451: step: 794/466, loss: 0.001616243040189147 2023-01-22 16:58:17.427547: step: 796/466, loss: 0.0015849809860810637 2023-01-22 16:58:18.174472: step: 798/466, loss: 0.0016199310775846243 2023-01-22 16:58:18.999128: step: 800/466, loss: 0.0002885865105781704 2023-01-22 16:58:19.706473: step: 802/466, loss: 0.009428664110600948 2023-01-22 16:58:20.482972: step: 804/466, loss: 0.009232708252966404 2023-01-22 16:58:21.164883: step: 806/466, loss: 0.005409192759543657 2023-01-22 16:58:21.990480: step: 808/466, loss: 0.0034889320377260447 2023-01-22 16:58:22.783384: step: 810/466, loss: 0.017431536689400673 2023-01-22 16:58:23.548267: step: 812/466, loss: 0.01968550868332386 2023-01-22 16:58:24.244263: 
step: 814/466, loss: 0.33501455187797546 2023-01-22 16:58:25.023317: step: 816/466, loss: 0.05295717343688011 2023-01-22 16:58:25.755559: step: 818/466, loss: 0.017188014462590218 2023-01-22 16:58:26.573804: step: 820/466, loss: 0.07719063013792038 2023-01-22 16:58:27.287695: step: 822/466, loss: 0.0049071419052779675 2023-01-22 16:58:28.052060: step: 824/466, loss: 0.006922825239598751 2023-01-22 16:58:28.801676: step: 826/466, loss: 0.000999518553726375 2023-01-22 16:58:29.532021: step: 828/466, loss: 0.0007073359447531402 2023-01-22 16:58:30.179724: step: 830/466, loss: 0.01538429781794548 2023-01-22 16:58:30.935694: step: 832/466, loss: 0.031232627108693123 2023-01-22 16:58:31.694602: step: 834/466, loss: 0.006306334864348173 2023-01-22 16:58:32.453115: step: 836/466, loss: 0.028609514236450195 2023-01-22 16:58:33.210063: step: 838/466, loss: 0.019720768555998802 2023-01-22 16:58:33.831931: step: 840/466, loss: 0.006836059037595987 2023-01-22 16:58:34.623520: step: 842/466, loss: 0.0017908071167767048 2023-01-22 16:58:35.336389: step: 844/466, loss: 0.015740016475319862 2023-01-22 16:58:36.118809: step: 846/466, loss: 0.002415057271718979 2023-01-22 16:58:36.929965: step: 848/466, loss: 0.20730531215667725 2023-01-22 16:58:37.678572: step: 850/466, loss: 0.0008651576936244965 2023-01-22 16:58:38.475091: step: 852/466, loss: 0.030669640749692917 2023-01-22 16:58:39.186526: step: 854/466, loss: 0.14007040858268738 2023-01-22 16:58:40.014311: step: 856/466, loss: 0.0048508713953197 2023-01-22 16:58:40.630126: step: 858/466, loss: 0.0003013678069692105 2023-01-22 16:58:41.358421: step: 860/466, loss: 0.02404596656560898 2023-01-22 16:58:42.133875: step: 862/466, loss: 0.006639343686401844 2023-01-22 16:58:42.864866: step: 864/466, loss: 0.007302514743059874 2023-01-22 16:58:43.580946: step: 866/466, loss: 5.850410889252089e-05 2023-01-22 16:58:44.332607: step: 868/466, loss: 0.029857581481337547 2023-01-22 16:58:45.055078: step: 870/466, loss: 
1.3937670701125171e-05 2023-01-22 16:58:45.891244: step: 872/466, loss: 0.004357927944511175 2023-01-22 16:58:46.621338: step: 874/466, loss: 0.04697030410170555 2023-01-22 16:58:47.370076: step: 876/466, loss: 0.0013890161644667387 2023-01-22 16:58:48.086834: step: 878/466, loss: 0.0010503502562642097 2023-01-22 16:58:48.834505: step: 880/466, loss: 0.010377208702266216 2023-01-22 16:58:49.574486: step: 882/466, loss: 0.0020709082018584013 2023-01-22 16:58:50.314043: step: 884/466, loss: 0.001266932813450694 2023-01-22 16:58:51.028980: step: 886/466, loss: 0.008883432485163212 2023-01-22 16:58:51.729625: step: 888/466, loss: 0.011375979520380497 2023-01-22 16:58:52.519331: step: 890/466, loss: 0.01430260669440031 2023-01-22 16:58:53.350289: step: 892/466, loss: 0.0017887807916849852 2023-01-22 16:58:54.075787: step: 894/466, loss: 0.0024639118928462267 2023-01-22 16:58:54.900646: step: 896/466, loss: 0.00041694813990034163 2023-01-22 16:58:55.637560: step: 898/466, loss: 0.0011450136080384254 2023-01-22 16:58:56.355188: step: 900/466, loss: 0.004664968233555555 2023-01-22 16:58:57.084836: step: 902/466, loss: 0.11045849323272705 2023-01-22 16:58:57.821599: step: 904/466, loss: 0.00037624945980496705 2023-01-22 16:58:58.536614: step: 906/466, loss: 0.03347666561603546 2023-01-22 16:58:59.296075: step: 908/466, loss: 0.0009400354465469718 2023-01-22 16:59:00.056104: step: 910/466, loss: 0.026599857956171036 2023-01-22 16:59:00.732522: step: 912/466, loss: 0.044138990342617035 2023-01-22 16:59:01.511699: step: 914/466, loss: 0.006451373919844627 2023-01-22 16:59:02.307875: step: 916/466, loss: 0.002460882533341646 2023-01-22 16:59:02.998765: step: 918/466, loss: 0.005053863860666752 2023-01-22 16:59:03.806221: step: 920/466, loss: 0.007943828590214252 2023-01-22 16:59:04.628081: step: 922/466, loss: 0.0019944910891354084 2023-01-22 16:59:05.272374: step: 924/466, loss: 1.8738961443887092e-05 2023-01-22 16:59:06.041129: step: 926/466, loss: 0.02110712230205536 
2023-01-22 16:59:06.750302: step: 928/466, loss: 0.027644284069538116 2023-01-22 16:59:07.510538: step: 930/466, loss: 0.09916841238737106 2023-01-22 16:59:08.271554: step: 932/466, loss: 0.008947809226810932 ================================================== Loss: 0.036 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30089584828906435, 'r': 0.3408630387638927, 'f1': 0.3196349135739705}, 'combined': 0.23552046263345194, 'epoch': 38} Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3590430962459348, 'r': 0.3077512253536584, 'f1': 0.33142439653470906}, 'combined': 0.20370475104084557, 'epoch': 38} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2807403246792544, 'r': 0.34519872939688206, 'f1': 0.3096506049228202}, 'combined': 0.2281636036273412, 'epoch': 38} Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3437080943600536, 'r': 0.3118667384323257, 'f1': 0.32701414697170783}, 'combined': 0.2009940610655375, 'epoch': 38} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.32846996753246754, 'r': 0.3565176877202494, 'f1': 0.34191960223579876}, 'combined': 0.2519407595421675, 'epoch': 38} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.36197168252177087, 'r': 0.3068140928041677, 'f1': 0.3321183478808001}, 'combined': 0.20513192074990597, 'epoch': 38} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.2647058823529412, 'r': 0.38571428571428573, 'f1': 0.313953488372093}, 'combined': 0.20930232558139533, 'epoch': 38} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.29651162790697677, 'r': 
0.5543478260869565, 'f1': 0.38636363636363635}, 'combined': 0.19318181818181818, 'epoch': 38} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4117647058823529, 'r': 0.2413793103448276, 'f1': 0.3043478260869565}, 'combined': 0.20289855072463764, 'epoch': 38} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30156788595932055, 'r': 0.3456299869438892, 'f1': 0.322099032925605}, 'combined': 0.23733612952412997, 'epoch': 36} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.331509942889869, 'r': 0.31226283182087317, 'f1': 0.3215986683813365}, 'combined': 0.19766552300511414, 'epoch': 36} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32432432432432434, 'r': 0.5217391304347826, 'f1': 0.4}, 'combined': 0.2, 'epoch': 36} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 
0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24} ****************************** Epoch: 39 command: python train.py --model_name slot --xlmr_model_name xlm-roberta-large --batch_size 16 --xlmr_learning_rate 2e-5 --max_epoch 40 --event_hidden_num 450 --role_hidden_num 350 --p1_data_weight 0.1 --learning_rate 9e-4 2023-01-22 17:01:54.002899: step: 2/466, loss: 0.0008740079356357455 2023-01-22 17:01:54.754362: step: 4/466, loss: 0.00029015541076660156 2023-01-22 17:01:55.590391: step: 6/466, loss: 0.005229489412158728 2023-01-22 17:01:56.448244: step: 8/466, loss: 0.005481514148414135 2023-01-22 17:01:57.260656: step: 10/466, loss: 0.0002174789842683822 2023-01-22 17:01:57.924532: step: 12/466, loss: 0.00012509390944615006 2023-01-22 17:01:58.632633: step: 14/466, loss: 0.019930889829993248 2023-01-22 17:01:59.294656: step: 16/466, loss: 0.00010953243327094242 2023-01-22 17:02:00.052276: step: 18/466, loss: 0.009445936419069767 2023-01-22 17:02:00.891029: step: 20/466, loss: 0.012281639501452446 2023-01-22 17:02:01.618402: step: 22/466, loss: 0.0023591876961290836 2023-01-22 17:02:02.453590: step: 24/466, loss: 0.008492972701787949 2023-01-22 17:02:03.193708: step: 26/466, loss: 6.964744898141362e-06 2023-01-22 17:02:03.997251: step: 28/466, loss: 0.00017814691818784922 2023-01-22 17:02:04.691087: step: 30/466, loss: 0.0022147567942738533 2023-01-22 17:02:05.431950: step: 32/466, loss: 0.0043442039750516415 2023-01-22 17:02:06.187463: step: 34/466, loss: 0.004584559705108404 2023-01-22 17:02:06.952925: step: 36/466, loss: 0.013966542668640614 2023-01-22 17:02:07.864317: step: 38/466, loss: 0.0012340530520305037 2023-01-22 17:02:08.583161: step: 40/466, 
loss: 0.028278250247240067 2023-01-22 17:02:09.367620: step: 42/466, loss: 0.4269455671310425 2023-01-22 17:02:10.150665: step: 44/466, loss: 0.008779562078416348 2023-01-22 17:02:10.908645: step: 46/466, loss: 0.00039897681563161314 2023-01-22 17:02:11.651240: step: 48/466, loss: 8.907252777135e-05 2023-01-22 17:02:12.427473: step: 50/466, loss: 0.007853885181248188 2023-01-22 17:02:13.186421: step: 52/466, loss: 0.00019616371719166636 2023-01-22 17:02:13.892498: step: 54/466, loss: 0.11887061595916748 2023-01-22 17:02:14.597053: step: 56/466, loss: 0.00011372471635695547 2023-01-22 17:02:15.300231: step: 58/466, loss: 0.013960963115096092 2023-01-22 17:02:16.138668: step: 60/466, loss: 0.03438699245452881 2023-01-22 17:02:16.903897: step: 62/466, loss: 0.011784059926867485 2023-01-22 17:02:17.632095: step: 64/466, loss: 0.014181561768054962 2023-01-22 17:02:18.331188: step: 66/466, loss: 0.05932268872857094 2023-01-22 17:02:19.059601: step: 68/466, loss: 0.0005954782827757299 2023-01-22 17:02:19.748756: step: 70/466, loss: 0.0001969360455404967 2023-01-22 17:02:20.478388: step: 72/466, loss: 0.007253388874232769 2023-01-22 17:02:21.179883: step: 74/466, loss: 0.001776547054760158 2023-01-22 17:02:21.953763: step: 76/466, loss: 0.0010972290765494108 2023-01-22 17:02:22.655370: step: 78/466, loss: 0.0016925893723964691 2023-01-22 17:02:23.386514: step: 80/466, loss: 0.0105961998924613 2023-01-22 17:02:24.114350: step: 82/466, loss: 0.001080236746929586 2023-01-22 17:02:24.868446: step: 84/466, loss: 0.00023807617253623903 2023-01-22 17:02:25.665649: step: 86/466, loss: 0.0045480867847800255 2023-01-22 17:02:26.428444: step: 88/466, loss: 0.007902023382484913 2023-01-22 17:02:27.140446: step: 90/466, loss: 0.1550598293542862 2023-01-22 17:02:27.890222: step: 92/466, loss: 0.0016592772444710135 2023-01-22 17:02:28.556167: step: 94/466, loss: 0.001391400583088398 2023-01-22 17:02:29.313020: step: 96/466, loss: 0.0004946725675836205 2023-01-22 17:02:30.050659: step: 
98/466, loss: 0.008832229301333427 2023-01-22 17:02:30.781084: step: 100/466, loss: 0.013128861784934998 2023-01-22 17:02:31.568511: step: 102/466, loss: 0.000519815890584141 2023-01-22 17:02:32.319108: step: 104/466, loss: 0.001086760894395411 2023-01-22 17:02:33.034539: step: 106/466, loss: 0.0441858246922493 2023-01-22 17:02:33.847380: step: 108/466, loss: 0.0010125155095010996 2023-01-22 17:02:34.598891: step: 110/466, loss: 0.004666542634367943 2023-01-22 17:02:35.316937: step: 112/466, loss: 0.02774696797132492 2023-01-22 17:02:36.058761: step: 114/466, loss: 0.021488526836037636 2023-01-22 17:02:36.901292: step: 116/466, loss: 0.004335596691817045 2023-01-22 17:02:37.642803: step: 118/466, loss: 0.01922680251300335 2023-01-22 17:02:38.358267: step: 120/466, loss: 0.00018423274741508067 2023-01-22 17:02:39.120617: step: 122/466, loss: 1.6785366824478842e-05 2023-01-22 17:02:39.869209: step: 124/466, loss: 0.03141075372695923 2023-01-22 17:02:40.620833: step: 126/466, loss: 0.00644349679350853 2023-01-22 17:02:41.410998: step: 128/466, loss: 0.03776460886001587 2023-01-22 17:02:42.154110: step: 130/466, loss: 0.0003045888734050095 2023-01-22 17:02:42.906295: step: 132/466, loss: 0.04505644366145134 2023-01-22 17:02:43.718847: step: 134/466, loss: 0.00014811464643571526 2023-01-22 17:02:44.459889: step: 136/466, loss: 0.0005231269751675427 2023-01-22 17:02:45.172953: step: 138/466, loss: 0.002811503829434514 2023-01-22 17:02:45.885953: step: 140/466, loss: 0.0004333566757850349 2023-01-22 17:02:46.618302: step: 142/466, loss: 0.011612461879849434 2023-01-22 17:02:47.307997: step: 144/466, loss: 4.0755899135547224e-06 2023-01-22 17:02:48.341220: step: 146/466, loss: 0.04024361073970795 2023-01-22 17:02:49.064127: step: 148/466, loss: 0.013624968938529491 2023-01-22 17:02:49.775277: step: 150/466, loss: 0.009867950342595577 2023-01-22 17:02:50.508895: step: 152/466, loss: 0.0022369185462594032 2023-01-22 17:02:51.259476: step: 154/466, loss: 
0.00014311866834759712 2023-01-22 17:02:52.170632: step: 156/466, loss: 2.0264893464627676e-05 2023-01-22 17:02:52.987691: step: 158/466, loss: 0.2612498700618744 2023-01-22 17:02:53.815397: step: 160/466, loss: 0.0024147075600922108 2023-01-22 17:02:54.533968: step: 162/466, loss: 0.0004107421846129 2023-01-22 17:02:55.279741: step: 164/466, loss: 0.000620778591837734 2023-01-22 17:02:56.005735: step: 166/466, loss: 0.47788673639297485 2023-01-22 17:02:56.806948: step: 168/466, loss: 0.010592091828584671 2023-01-22 17:02:57.607284: step: 170/466, loss: 0.0020647207275032997 2023-01-22 17:02:58.374973: step: 172/466, loss: 0.001345380092971027 2023-01-22 17:02:59.132025: step: 174/466, loss: 0.0003394253144506365 2023-01-22 17:02:59.940389: step: 176/466, loss: 0.014013102278113365 2023-01-22 17:03:00.681995: step: 178/466, loss: 0.001352597028017044 2023-01-22 17:03:01.495764: step: 180/466, loss: 0.015661533921957016 2023-01-22 17:03:02.335419: step: 182/466, loss: 0.0011373377637937665 2023-01-22 17:03:03.105309: step: 184/466, loss: 0.0079009048640728 2023-01-22 17:03:03.862698: step: 186/466, loss: 0.03485744819045067 2023-01-22 17:03:04.685385: step: 188/466, loss: 0.0019052948337048292 2023-01-22 17:03:05.424685: step: 190/466, loss: 0.0438397042453289 2023-01-22 17:03:06.127178: step: 192/466, loss: 0.0030667998362332582 2023-01-22 17:03:06.889669: step: 194/466, loss: 0.0016462020576000214 2023-01-22 17:03:07.641012: step: 196/466, loss: 0.0005860528908669949 2023-01-22 17:03:08.385062: step: 198/466, loss: 0.002719779498875141 2023-01-22 17:03:09.192522: step: 200/466, loss: 0.25626522302627563 2023-01-22 17:03:09.992668: step: 202/466, loss: 0.0031375677790492773 2023-01-22 17:03:10.773403: step: 204/466, loss: 0.009526461362838745 2023-01-22 17:03:11.463021: step: 206/466, loss: 0.0008482421399094164 2023-01-22 17:03:12.243969: step: 208/466, loss: 0.005725763738155365 2023-01-22 17:03:12.956567: step: 210/466, loss: 0.0013842338230460882 2023-01-22 
17:03:13.644582: step: 212/466, loss: 0.024162253364920616 2023-01-22 17:03:14.395145: step: 214/466, loss: 0.04524444043636322 2023-01-22 17:03:15.141917: step: 216/466, loss: 0.014798237942159176 2023-01-22 17:03:15.899664: step: 218/466, loss: 0.008126954548060894 2023-01-22 17:03:16.629464: step: 220/466, loss: 0.0012192793656140566 2023-01-22 17:03:17.383514: step: 222/466, loss: 0.00041413892176933587 2023-01-22 17:03:18.105087: step: 224/466, loss: 0.011747845448553562 2023-01-22 17:03:18.891339: step: 226/466, loss: 0.00561458058655262 2023-01-22 17:03:19.673912: step: 228/466, loss: 9.295693598687649e-05 2023-01-22 17:03:20.444062: step: 230/466, loss: 0.011560462415218353 2023-01-22 17:03:21.163193: step: 232/466, loss: 0.009240617975592613 2023-01-22 17:03:21.883336: step: 234/466, loss: 0.009746174328029156 2023-01-22 17:03:22.596239: step: 236/466, loss: 8.335171878570691e-05 2023-01-22 17:03:23.171027: step: 238/466, loss: 0.005442628636956215 2023-01-22 17:03:24.030668: step: 240/466, loss: 0.04420081153512001 2023-01-22 17:03:24.829941: step: 242/466, loss: 0.0012216203613206744 2023-01-22 17:03:25.557190: step: 244/466, loss: 0.0006596589810214937 2023-01-22 17:03:26.280364: step: 246/466, loss: 2.5175178961944766e-05 2023-01-22 17:03:26.993959: step: 248/466, loss: 0.00014090738841332495 2023-01-22 17:03:27.707076: step: 250/466, loss: 0.012872757390141487 2023-01-22 17:03:28.542017: step: 252/466, loss: 0.0034725533332675695 2023-01-22 17:03:29.399283: step: 254/466, loss: 0.0004913858720101416 2023-01-22 17:03:30.140648: step: 256/466, loss: 0.0009157611057162285 2023-01-22 17:03:30.928951: step: 258/466, loss: 0.031524062156677246 2023-01-22 17:03:31.747677: step: 260/466, loss: 0.018843840807676315 2023-01-22 17:03:32.529469: step: 262/466, loss: 0.00017298573220614344 2023-01-22 17:03:33.279304: step: 264/466, loss: 0.0018737587379291654 2023-01-22 17:03:33.999894: step: 266/466, loss: 0.0009831200586631894 2023-01-22 17:03:34.776878: step: 
268/466, loss: 0.0018375710351392627 2023-01-22 17:03:35.557670: step: 270/466, loss: 0.02300839126110077 2023-01-22 17:03:36.434858: step: 272/466, loss: 0.04054649546742439 2023-01-22 17:03:37.204682: step: 274/466, loss: 0.003835452953353524 2023-01-22 17:03:37.958413: step: 276/466, loss: 0.004952685441821814 2023-01-22 17:03:38.685811: step: 278/466, loss: 0.1168748065829277 2023-01-22 17:03:39.379384: step: 280/466, loss: 0.0005180786829441786 2023-01-22 17:03:40.218585: step: 282/466, loss: 0.04683459550142288 2023-01-22 17:03:40.930829: step: 284/466, loss: 0.0003267234133090824 2023-01-22 17:03:41.589345: step: 286/466, loss: 0.0005253761191852391 2023-01-22 17:03:42.323626: step: 288/466, loss: 0.0009431827929802239 2023-01-22 17:03:43.040253: step: 290/466, loss: 0.00791383907198906 2023-01-22 17:03:43.825805: step: 292/466, loss: 0.06768930703401566 2023-01-22 17:03:44.654554: step: 294/466, loss: 0.006540664006024599 2023-01-22 17:03:45.453565: step: 296/466, loss: 0.005072426982223988 2023-01-22 17:03:46.177387: step: 298/466, loss: 0.002139729680493474 2023-01-22 17:03:46.924257: step: 300/466, loss: 0.032697562128305435 2023-01-22 17:03:47.664685: step: 302/466, loss: 0.08820069581270218 2023-01-22 17:03:48.390138: step: 304/466, loss: 0.00013937368930783123 2023-01-22 17:03:49.183180: step: 306/466, loss: 0.0013572914758697152 2023-01-22 17:03:49.843978: step: 308/466, loss: 0.0005084231379441917 2023-01-22 17:03:50.565510: step: 310/466, loss: 0.006255794316530228 2023-01-22 17:03:51.316925: step: 312/466, loss: 0.00024360780662391335 2023-01-22 17:03:52.014765: step: 314/466, loss: 0.0010025992523878813 2023-01-22 17:03:52.795253: step: 316/466, loss: 0.0012603605864569545 2023-01-22 17:03:53.650066: step: 318/466, loss: 0.0074287052266299725 2023-01-22 17:03:54.474982: step: 320/466, loss: 0.0029444722458720207 2023-01-22 17:03:55.224179: step: 322/466, loss: 0.005362970754504204 2023-01-22 17:03:56.002258: step: 324/466, loss: 
0.016828326508402824 2023-01-22 17:03:56.752858: step: 326/466, loss: 0.09386395663022995 2023-01-22 17:03:57.493357: step: 328/466, loss: 0.000387120118830353 2023-01-22 17:03:58.322433: step: 330/466, loss: 0.004994371440261602 2023-01-22 17:03:59.070346: step: 332/466, loss: 0.013452321290969849 2023-01-22 17:03:59.840407: step: 334/466, loss: 0.00022745579190086573 2023-01-22 17:04:00.633346: step: 336/466, loss: 0.0002610105730127543 2023-01-22 17:04:01.439592: step: 338/466, loss: 0.12716002762317657 2023-01-22 17:04:02.275044: step: 340/466, loss: 0.061778996139764786 2023-01-22 17:04:03.120182: step: 342/466, loss: 0.010173936374485493 2023-01-22 17:04:03.813418: step: 344/466, loss: 0.0002518314286135137 2023-01-22 17:04:04.550423: step: 346/466, loss: 0.0007772438693791628 2023-01-22 17:04:05.351824: step: 348/466, loss: 9.038544521899894e-05 2023-01-22 17:04:06.121326: step: 350/466, loss: 0.00029960667598061264 2023-01-22 17:04:06.860764: step: 352/466, loss: 0.048271242529153824 2023-01-22 17:04:07.615569: step: 354/466, loss: 0.004015645012259483 2023-01-22 17:04:08.397635: step: 356/466, loss: 0.14496587216854095 2023-01-22 17:04:09.155908: step: 358/466, loss: 0.006087936460971832 2023-01-22 17:04:09.953450: step: 360/466, loss: 0.009134674444794655 2023-01-22 17:04:10.682350: step: 362/466, loss: 0.02341679111123085 2023-01-22 17:04:11.365275: step: 364/466, loss: 4.412142880028114e-05 2023-01-22 17:04:12.104761: step: 366/466, loss: 0.008231754414737225 2023-01-22 17:04:12.745535: step: 368/466, loss: 0.0006990503752604127 2023-01-22 17:04:13.471394: step: 370/466, loss: 0.00015528335643466562 2023-01-22 17:04:14.286635: step: 372/466, loss: 0.011209993623197079 2023-01-22 17:04:15.116048: step: 374/466, loss: 0.0003509388188831508 2023-01-22 17:04:15.951657: step: 376/466, loss: 0.0019555650651454926 2023-01-22 17:04:16.778778: step: 378/466, loss: 0.009965005330741405 2023-01-22 17:04:17.507440: step: 380/466, loss: 0.0002666927466634661 
2023-01-22 17:04:18.320746: step: 382/466, loss: 0.05020932853221893 2023-01-22 17:04:19.068325: step: 384/466, loss: 0.018008515238761902 2023-01-22 17:04:19.856439: step: 386/466, loss: 5.9866510127903894e-05 2023-01-22 17:04:20.621042: step: 388/466, loss: 0.052257269620895386 2023-01-22 17:04:21.373228: step: 390/466, loss: 0.0010209325700998306 2023-01-22 17:04:22.120821: step: 392/466, loss: 0.008724289946258068 2023-01-22 17:04:22.935661: step: 394/466, loss: 0.05061984807252884 2023-01-22 17:04:23.691053: step: 396/466, loss: 0.2913358509540558 2023-01-22 17:04:24.429951: step: 398/466, loss: 0.004957552067935467 2023-01-22 17:04:25.138763: step: 400/466, loss: 0.0007999493391253054 2023-01-22 17:04:25.917947: step: 402/466, loss: 0.2070377767086029 2023-01-22 17:04:26.623633: step: 404/466, loss: 0.0007287136395461857 2023-01-22 17:04:27.388078: step: 406/466, loss: 2.076130112982355e-05 2023-01-22 17:04:28.195436: step: 408/466, loss: 0.04241722449660301 2023-01-22 17:04:29.062591: step: 410/466, loss: 0.06292695552110672 2023-01-22 17:04:29.857483: step: 412/466, loss: 0.0008323242655023932 2023-01-22 17:04:30.540322: step: 414/466, loss: 0.0010838143061846495 2023-01-22 17:04:31.301613: step: 416/466, loss: 0.004325313027948141 2023-01-22 17:04:32.079425: step: 418/466, loss: 0.007513652089983225 2023-01-22 17:04:32.896309: step: 420/466, loss: 0.023051241412758827 2023-01-22 17:04:33.641822: step: 422/466, loss: 0.00151598802767694 2023-01-22 17:04:34.394436: step: 424/466, loss: 0.08067677170038223 2023-01-22 17:04:35.149319: step: 426/466, loss: 0.00105815299320966 2023-01-22 17:04:35.799757: step: 428/466, loss: 0.011436041444540024 2023-01-22 17:04:36.519000: step: 430/466, loss: 0.0014418803621083498 2023-01-22 17:04:37.210607: step: 432/466, loss: 0.0001352071121800691 2023-01-22 17:04:37.962576: step: 434/466, loss: 0.0011846505803987384 2023-01-22 17:04:38.655293: step: 436/466, loss: 1.4939187167328782e-05 2023-01-22 17:04:39.424071: step: 
438/466, loss: 0.0004720861034002155 2023-01-22 17:04:40.075816: step: 440/466, loss: 9.890823275782168e-05 2023-01-22 17:04:40.782133: step: 442/466, loss: 1.8691593140829355e-05 2023-01-22 17:04:41.551825: step: 444/466, loss: 0.006000523455440998 2023-01-22 17:04:42.337177: step: 446/466, loss: 0.014063859358429909 2023-01-22 17:04:43.149306: step: 448/466, loss: 0.002362149301916361 2023-01-22 17:04:43.924529: step: 450/466, loss: 0.007006255444139242 2023-01-22 17:04:44.679300: step: 452/466, loss: 0.041615139693021774 2023-01-22 17:04:45.393222: step: 454/466, loss: 0.0002590412914287299 2023-01-22 17:04:46.136875: step: 456/466, loss: 0.000590867770370096 2023-01-22 17:04:46.887626: step: 458/466, loss: 0.0014481049729511142 2023-01-22 17:04:47.639140: step: 460/466, loss: 0.014301631599664688 2023-01-22 17:04:48.446057: step: 462/466, loss: 0.002033855300396681 2023-01-22 17:04:49.207452: step: 464/466, loss: 0.003797616111114621 2023-01-22 17:04:49.985332: step: 466/466, loss: 8.198487921617925e-05 2023-01-22 17:04:50.742297: step: 468/466, loss: 0.015080071054399014 2023-01-22 17:04:51.429849: step: 470/466, loss: 0.0005228759837336838 2023-01-22 17:04:52.210499: step: 472/466, loss: 0.0002708770043682307 2023-01-22 17:04:52.968859: step: 474/466, loss: 0.07632824778556824 2023-01-22 17:04:53.705070: step: 476/466, loss: 0.010276122018694878 2023-01-22 17:04:54.473880: step: 478/466, loss: 0.0030062233563512564 2023-01-22 17:04:55.220444: step: 480/466, loss: 0.08177800476551056 2023-01-22 17:04:55.954632: step: 482/466, loss: 0.0020473485346883535 2023-01-22 17:04:56.686924: step: 484/466, loss: 0.017333390191197395 2023-01-22 17:04:57.484270: step: 486/466, loss: 0.0031148327980190516 2023-01-22 17:04:58.180772: step: 488/466, loss: 9.554430289426818e-05 2023-01-22 17:04:58.887301: step: 490/466, loss: 0.00386765762232244 2023-01-22 17:04:59.622291: step: 492/466, loss: 0.012081998400390148 2023-01-22 17:05:00.383915: step: 494/466, loss: 
4.6635068429168314e-05 2023-01-22 17:05:01.077360: step: 496/466, loss: 0.016525747254490852 2023-01-22 17:05:01.816713: step: 498/466, loss: 0.04223182424902916 2023-01-22 17:05:02.648745: step: 500/466, loss: 0.0017167312325909734 2023-01-22 17:05:03.364429: step: 502/466, loss: 0.002390617271885276 2023-01-22 17:05:04.179174: step: 504/466, loss: 0.010442443192005157 2023-01-22 17:05:04.933196: step: 506/466, loss: 0.004288220778107643 2023-01-22 17:05:05.652869: step: 508/466, loss: 0.0003184019587934017 2023-01-22 17:05:06.394170: step: 510/466, loss: 0.0012993603013455868 2023-01-22 17:05:07.142984: step: 512/466, loss: 0.014843190088868141 2023-01-22 17:05:07.896764: step: 514/466, loss: 0.006183616816997528 2023-01-22 17:05:08.613460: step: 516/466, loss: 0.00975726917386055 2023-01-22 17:05:09.460338: step: 518/466, loss: 0.007801082916557789 2023-01-22 17:05:10.315165: step: 520/466, loss: 0.00010823424236150458 2023-01-22 17:05:11.081183: step: 522/466, loss: 0.027075359597802162 2023-01-22 17:05:11.836744: step: 524/466, loss: 0.022378744557499886 2023-01-22 17:05:12.548435: step: 526/466, loss: 0.001080716261640191 2023-01-22 17:05:13.224797: step: 528/466, loss: 0.0001632870698813349 2023-01-22 17:05:14.004938: step: 530/466, loss: 1.4522252058668528e-05 2023-01-22 17:05:14.781327: step: 532/466, loss: 0.0007839349564164877 2023-01-22 17:05:15.447904: step: 534/466, loss: 1.9176708519808017e-05 2023-01-22 17:05:16.190059: step: 536/466, loss: 9.760348802956287e-06 2023-01-22 17:05:16.950464: step: 538/466, loss: 0.0004075799079146236 2023-01-22 17:05:17.739708: step: 540/466, loss: 0.0017076395452022552 2023-01-22 17:05:18.543585: step: 542/466, loss: 0.030729996040463448 2023-01-22 17:05:19.284161: step: 544/466, loss: 0.0003406803007237613 2023-01-22 17:05:20.036766: step: 546/466, loss: 0.02445288561284542 2023-01-22 17:05:20.691687: step: 548/466, loss: 0.024324113503098488 2023-01-22 17:05:21.459907: step: 550/466, loss: 0.11394055932760239 
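Every training entry in this stream has the fixed form `TIMESTAMP: step: N/466, loss: X`. A minimal sketch of extracting step/loss pairs from raw log text, e.g. to plot a loss curve (the regex is an assumption based only on the format seen here):

```python
import re

# Matches entries like:
# "2023-01-22 17:05:21.459907: step: 550/466, loss: 0.11394055932760239"
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"step: (\d+)/(\d+), loss: ([\d.eE+-]+)"
)

def parse_steps(text: str):
    """Yield (step, total, loss) tuples from a raw log string."""
    for m in STEP_RE.finditer(text):
        yield int(m.group(2)), int(m.group(3)), float(m.group(4))

line = "2023-01-22 17:05:21.459907: step: 550/466, loss: 0.11394055932760239"
print(list(parse_steps(line)))  # [(550, 466, 0.11394055932760239)]
```

Note that the step counter runs past the nominal total (e.g. `932/466`), since steps advance by 2 per batch in this log, so the total should not be used as a loop bound when consuming parsed entries.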
2023-01-22 17:05:22.274063: step: 552/466, loss: 0.01220634113997221 2023-01-22 17:05:22.990692: step: 554/466, loss: 0.00687334593385458 2023-01-22 17:05:23.689314: step: 556/466, loss: 0.0009453383972868323 2023-01-22 17:05:24.433737: step: 558/466, loss: 0.00467744842171669 2023-01-22 17:05:25.222042: step: 560/466, loss: 0.001324250246398151 2023-01-22 17:05:26.009976: step: 562/466, loss: 0.005736398510634899 2023-01-22 17:05:26.722547: step: 564/466, loss: 0.00043739372631534934 2023-01-22 17:05:27.520287: step: 566/466, loss: 1.2006373405456543 2023-01-22 17:05:28.231964: step: 568/466, loss: 0.0038406013045459986 2023-01-22 17:05:28.971511: step: 570/466, loss: 0.00694508571177721 2023-01-22 17:05:29.712549: step: 572/466, loss: 0.05925419181585312 2023-01-22 17:05:30.444192: step: 574/466, loss: 0.0008464656420983374 2023-01-22 17:05:31.289618: step: 576/466, loss: 0.004853491205722094 2023-01-22 17:05:32.027545: step: 578/466, loss: 0.06622689217329025 2023-01-22 17:05:32.770245: step: 580/466, loss: 0.002883708104491234 2023-01-22 17:05:33.536952: step: 582/466, loss: 0.007973263040184975 2023-01-22 17:05:34.418012: step: 584/466, loss: 0.004370575770735741 2023-01-22 17:05:35.131516: step: 586/466, loss: 0.001321446499787271 2023-01-22 17:05:35.801563: step: 588/466, loss: 0.2518007457256317 2023-01-22 17:05:36.505254: step: 590/466, loss: 0.0006912379176355898 2023-01-22 17:05:37.224733: step: 592/466, loss: 0.007234994322061539 2023-01-22 17:05:37.932388: step: 594/466, loss: 0.001353597966954112 2023-01-22 17:05:38.632571: step: 596/466, loss: 0.00021362333791330457 2023-01-22 17:05:39.322270: step: 598/466, loss: 0.005170780699700117 2023-01-22 17:05:40.227400: step: 600/466, loss: 0.001198322745040059 2023-01-22 17:05:41.011277: step: 602/466, loss: 0.017721228301525116 2023-01-22 17:05:41.708879: step: 604/466, loss: 0.0025628963485360146 2023-01-22 17:05:42.427384: step: 606/466, loss: 0.07840071618556976 2023-01-22 17:05:43.164353: step: 
608/466, loss: 0.035325538367033005 2023-01-22 17:05:43.968824: step: 610/466, loss: 0.00014155724784359336 2023-01-22 17:05:44.697256: step: 612/466, loss: 0.004721387289464474 2023-01-22 17:05:45.487803: step: 614/466, loss: 0.020070061087608337 2023-01-22 17:05:46.344708: step: 616/466, loss: 0.0019762504380196333 2023-01-22 17:05:47.221091: step: 618/466, loss: 0.00040442211320623755 2023-01-22 17:05:47.922628: step: 620/466, loss: 0.8801062107086182 2023-01-22 17:05:48.757949: step: 622/466, loss: 0.0013993729371577501 2023-01-22 17:05:49.584361: step: 624/466, loss: 0.05443740636110306 2023-01-22 17:05:50.383138: step: 626/466, loss: 0.0011238008737564087 2023-01-22 17:05:51.093090: step: 628/466, loss: 0.0003994059225078672 2023-01-22 17:05:51.810704: step: 630/466, loss: 0.01006253995001316 2023-01-22 17:05:52.635073: step: 632/466, loss: 0.0012169163674116135 2023-01-22 17:05:53.386238: step: 634/466, loss: 0.14805126190185547 2023-01-22 17:05:54.129596: step: 636/466, loss: 0.0948682650923729 2023-01-22 17:05:54.961936: step: 638/466, loss: 0.0009406930766999722 2023-01-22 17:05:55.720945: step: 640/466, loss: 0.019608452916145325 2023-01-22 17:05:56.470234: step: 642/466, loss: 0.01036733016371727 2023-01-22 17:05:57.196125: step: 644/466, loss: 0.012805589474737644 2023-01-22 17:05:57.981000: step: 646/466, loss: 0.0072302124463021755 2023-01-22 17:05:58.672825: step: 648/466, loss: 0.0005278410390019417 2023-01-22 17:05:59.497994: step: 650/466, loss: 0.0015965031925588846 2023-01-22 17:06:00.199096: step: 652/466, loss: 0.00028344758902676404 2023-01-22 17:06:00.977923: step: 654/466, loss: 0.03176767751574516 2023-01-22 17:06:01.791225: step: 656/466, loss: 0.003983621019870043 2023-01-22 17:06:02.568500: step: 658/466, loss: 0.050124142318964005 2023-01-22 17:06:03.270847: step: 660/466, loss: 0.003998556639999151 2023-01-22 17:06:03.967201: step: 662/466, loss: 0.0013740364229306579 2023-01-22 17:06:04.635519: step: 664/466, loss: 
0.005735581275075674 2023-01-22 17:06:05.375685: step: 666/466, loss: 0.0006358802784234285 2023-01-22 17:06:06.095777: step: 668/466, loss: 0.0014941692352294922 2023-01-22 17:06:06.803519: step: 670/466, loss: 0.001266355742700398 2023-01-22 17:06:07.520850: step: 672/466, loss: 0.004563577938824892 2023-01-22 17:06:08.200038: step: 674/466, loss: 0.0006881391745992005 2023-01-22 17:06:08.900970: step: 676/466, loss: 0.5590264797210693 2023-01-22 17:06:09.650999: step: 678/466, loss: 0.016854893416166306 2023-01-22 17:06:10.396647: step: 680/466, loss: 0.009256008081138134 2023-01-22 17:06:11.144388: step: 682/466, loss: 0.002191653475165367 2023-01-22 17:06:11.801050: step: 684/466, loss: 0.00032921944512054324 2023-01-22 17:06:12.497042: step: 686/466, loss: 0.0008804819080978632 2023-01-22 17:06:13.218050: step: 688/466, loss: 0.0013804353075101972 2023-01-22 17:06:13.977645: step: 690/466, loss: 0.002117349300533533 2023-01-22 17:06:14.838064: step: 692/466, loss: 0.0751693993806839 2023-01-22 17:06:15.592934: step: 694/466, loss: 0.005391803570091724 2023-01-22 17:06:16.354141: step: 696/466, loss: 0.0005898470990359783 2023-01-22 17:06:17.070122: step: 698/466, loss: 0.003721448592841625 2023-01-22 17:06:17.733729: step: 700/466, loss: 0.08029770106077194 2023-01-22 17:06:18.502802: step: 702/466, loss: 0.003063736716285348 2023-01-22 17:06:19.198416: step: 704/466, loss: 0.000615855969954282 2023-01-22 17:06:19.926405: step: 706/466, loss: 0.0162990503013134 2023-01-22 17:06:20.597803: step: 708/466, loss: 0.004094661679118872 2023-01-22 17:06:21.343361: step: 710/466, loss: 0.007667486555874348 2023-01-22 17:06:22.102644: step: 712/466, loss: 0.003508082590997219 2023-01-22 17:06:22.854914: step: 714/466, loss: 0.002244040835648775 2023-01-22 17:06:23.600524: step: 716/466, loss: 0.010504397563636303 2023-01-22 17:06:24.404564: step: 718/466, loss: 0.0021802643314003944 2023-01-22 17:06:25.145348: step: 720/466, loss: 1.154409646987915 2023-01-22 
17:06:25.903624: step: 722/466, loss: 0.004091985523700714 2023-01-22 17:06:26.637030: step: 724/466, loss: 0.05520002916455269 2023-01-22 17:06:27.404386: step: 726/466, loss: 0.4075619876384735 2023-01-22 17:06:28.209221: step: 728/466, loss: 0.023832345381379128 2023-01-22 17:06:28.950857: step: 730/466, loss: 0.024165647104382515 2023-01-22 17:06:29.667048: step: 732/466, loss: 0.010060613043606281 2023-01-22 17:06:30.424467: step: 734/466, loss: 0.00020264496561139822 2023-01-22 17:06:31.165155: step: 736/466, loss: 0.11557676643133163 2023-01-22 17:06:31.977494: step: 738/466, loss: 0.00019353475363459438 2023-01-22 17:06:32.712178: step: 740/466, loss: 0.0036374281626194715 2023-01-22 17:06:33.419464: step: 742/466, loss: 0.0026445419061928988 2023-01-22 17:06:34.071068: step: 744/466, loss: 0.004334342200309038 2023-01-22 17:06:34.758274: step: 746/466, loss: 1.0070466995239258 2023-01-22 17:06:35.428498: step: 748/466, loss: 0.04262397810816765 2023-01-22 17:06:36.147165: step: 750/466, loss: 0.011616052128374577 2023-01-22 17:06:36.985410: step: 752/466, loss: 0.00277088675647974 2023-01-22 17:06:37.757814: step: 754/466, loss: 0.015158126130700111 2023-01-22 17:06:38.511464: step: 756/466, loss: 0.004447794985026121 2023-01-22 17:06:39.404306: step: 758/466, loss: 0.0005685980431735516 2023-01-22 17:06:40.135798: step: 760/466, loss: 0.00014420298975892365 2023-01-22 17:06:40.931632: step: 762/466, loss: 0.005476310383528471 2023-01-22 17:06:41.738248: step: 764/466, loss: 0.0002213385159848258 2023-01-22 17:06:42.470537: step: 766/466, loss: 0.2885030210018158 2023-01-22 17:06:43.235606: step: 768/466, loss: 0.0647197961807251 2023-01-22 17:06:44.021220: step: 770/466, loss: 0.002829147269949317 2023-01-22 17:06:44.726872: step: 772/466, loss: 0.0008479239768348634 2023-01-22 17:06:45.418001: step: 774/466, loss: 0.007131071761250496 2023-01-22 17:06:46.238756: step: 776/466, loss: 0.004595254082232714 2023-01-22 17:06:46.938016: step: 778/466, loss: 
5.081822564534377e-06
2023-01-22 17:06:47.764702: step: 780/466, loss: 0.015721395611763
2023-01-22 17:06:48.481900: step: 782/466, loss: 3.051890598726459e-05
2023-01-22 17:06:49.231879: step: 784/466, loss: 0.000999096198938787
2023-01-22 17:06:49.975200: step: 786/466, loss: 0.025470739230513573
2023-01-22 17:06:50.665958: step: 788/466, loss: 0.0016792990500107408
2023-01-22 17:06:51.398645: step: 790/466, loss: 0.10520414263010025
2023-01-22 17:06:52.115586: step: 792/466, loss: 0.00962340272963047
2023-01-22 17:06:52.781766: step: 794/466, loss: 0.00011330540291965008
2023-01-22 17:06:53.526008: step: 796/466, loss: 0.0818401500582695
2023-01-22 17:06:54.333973: step: 798/466, loss: 0.012172805145382881
2023-01-22 17:06:55.130532: step: 800/466, loss: 0.08613215386867523
2023-01-22 17:06:55.968577: step: 802/466, loss: 0.0011135649401694536
2023-01-22 17:06:56.732874: step: 804/466, loss: 0.003869681851938367
2023-01-22 17:06:57.563522: step: 806/466, loss: 0.030655736103653908
2023-01-22 17:06:58.297685: step: 808/466, loss: 0.09402773529291153
2023-01-22 17:06:59.094335: step: 810/466, loss: 0.10320362448692322
2023-01-22 17:06:59.857981: step: 812/466, loss: 5.8628080296330154e-05
2023-01-22 17:07:00.642542: step: 814/466, loss: 0.007948066107928753
2023-01-22 17:07:01.434101: step: 816/466, loss: 0.0008257722365669906
2023-01-22 17:07:02.177062: step: 818/466, loss: 0.013069583103060722
2023-01-22 17:07:02.925939: step: 820/466, loss: 0.0013827694347128272
2023-01-22 17:07:03.691937: step: 822/466, loss: 0.019912388175725937
2023-01-22 17:07:04.440996: step: 824/466, loss: 0.014426201581954956
2023-01-22 17:07:05.276137: step: 826/466, loss: 0.057453930377960205
2023-01-22 17:07:06.089560: step: 828/466, loss: 0.014677576720714569
2023-01-22 17:07:06.838410: step: 830/466, loss: 0.004854255355894566
2023-01-22 17:07:07.688608: step: 832/466, loss: 0.001058097812347114
2023-01-22 17:07:08.519553: step: 834/466, loss: 0.0003384735609870404
2023-01-22 17:07:09.279803: step: 836/466, loss: 0.0007604488637298346
2023-01-22 17:07:09.987204: step: 838/466, loss: 0.0034899867605417967
2023-01-22 17:07:10.750452: step: 840/466, loss: 0.02199520915746689
2023-01-22 17:07:11.620402: step: 842/466, loss: 0.00045649081584997475
2023-01-22 17:07:12.333016: step: 844/466, loss: 0.0009873228846117854
2023-01-22 17:07:13.380018: step: 846/466, loss: 0.00015023874584585428
2023-01-22 17:07:14.152007: step: 848/466, loss: 0.03509344533085823
2023-01-22 17:07:14.991154: step: 850/466, loss: 0.025919271633028984
2023-01-22 17:07:15.685509: step: 852/466, loss: 0.00026613284717313945
2023-01-22 17:07:16.441570: step: 854/466, loss: 0.0004146205901633948
2023-01-22 17:07:17.225544: step: 856/466, loss: 0.015960287302732468
2023-01-22 17:07:18.007940: step: 858/466, loss: 0.014743988402187824
2023-01-22 17:07:18.796425: step: 860/466, loss: 0.00048502220306545496
2023-01-22 17:07:19.534163: step: 862/466, loss: 0.003604623256251216
2023-01-22 17:07:20.385566: step: 864/466, loss: 0.0007024814840406179
2023-01-22 17:07:21.142111: step: 866/466, loss: 0.025188926607370377
2023-01-22 17:07:21.891883: step: 868/466, loss: 0.0032895857002586126
2023-01-22 17:07:22.665178: step: 870/466, loss: 0.00014311580162029713
2023-01-22 17:07:23.349817: step: 872/466, loss: 0.01836252771317959
2023-01-22 17:07:24.062420: step: 874/466, loss: 0.0012512169778347015
2023-01-22 17:07:24.889849: step: 876/466, loss: 0.10940083861351013
2023-01-22 17:07:25.617203: step: 878/466, loss: 0.0006442145677283406
2023-01-22 17:07:26.381753: step: 880/466, loss: 0.000579558894969523
2023-01-22 17:07:27.166771: step: 882/466, loss: 0.0004485124663915485
2023-01-22 17:07:27.802767: step: 884/466, loss: 0.0012692167656496167
2023-01-22 17:07:28.569168: step: 886/466, loss: 0.4959481656551361
2023-01-22 17:07:29.335079: step: 888/466, loss: 0.031073397025465965
2023-01-22 17:07:30.110409: step: 890/466, loss: 0.001908604521304369
2023-01-22 17:07:30.783465: step: 892/466, loss: 0.0033289012499153614
2023-01-22 17:07:31.543941: step: 894/466, loss: 0.0019436075817793608
2023-01-22 17:07:32.342694: step: 896/466, loss: 0.0030539692379534245
2023-01-22 17:07:33.187269: step: 898/466, loss: 0.005594159010797739
2023-01-22 17:07:34.026635: step: 900/466, loss: 0.00043902089237235487
2023-01-22 17:07:34.800560: step: 902/466, loss: 0.0003640690119937062
2023-01-22 17:07:35.570031: step: 904/466, loss: 0.0005915475194342434
2023-01-22 17:07:36.318305: step: 906/466, loss: 0.013333105482161045
2023-01-22 17:07:37.018099: step: 908/466, loss: 0.00022897178132552654
2023-01-22 17:07:37.718593: step: 910/466, loss: 0.0003777458332479
2023-01-22 17:07:38.522032: step: 912/466, loss: 0.0017597827827557921
2023-01-22 17:07:39.247893: step: 914/466, loss: 3.888283026753925e-05
2023-01-22 17:07:40.001845: step: 916/466, loss: 2.9405186069197953e-05
2023-01-22 17:07:40.773890: step: 918/466, loss: 0.012713199481368065
2023-01-22 17:07:41.521972: step: 920/466, loss: 7.25022327969782e-05
2023-01-22 17:07:42.240921: step: 922/466, loss: 0.15175369381904602
2023-01-22 17:07:43.027215: step: 924/466, loss: 0.00019238461391068995
2023-01-22 17:07:43.861266: step: 926/466, loss: 0.01744076795876026
2023-01-22 17:07:44.574615: step: 928/466, loss: 0.0002738804614637047
2023-01-22 17:07:45.343508: step: 930/466, loss: 3.8340131141012534e-05
2023-01-22 17:07:46.163743: step: 932/466, loss: 0.15691609680652618
==================================================
Loss: 0.031
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2937243605212355, 'r': 0.32995222282461373, 'f1': 0.3107860972807353}, 'combined': 0.22900028220685759, 'epoch': 39}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3567806928416761, 'r': 0.3107145548577855, 'f1': 0.3321580327057753}, 'combined': 0.20415566888257408, 'epoch': 39}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.2796014734837577, 'r': 0.3406150777544069, 'f1': 0.30710717874520527}, 'combined': 0.22628950012804597, 'epoch': 39}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.3378873366053381, 'r': 0.3165131809968548, 'f1': 0.32685119540972746}, 'combined': 0.20089390547134467, 'epoch': 39}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31467126623376623, 'r': 0.34154072919490375, 'f1': 0.3275558949694527}, 'combined': 0.24135697524064934, 'epoch': 39}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.35437621251822377, 'r': 0.3046284253189584, 'f1': 0.32762460654061326}, 'combined': 0.2023563746280259, 'epoch': 39}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.23148148148148148, 'r': 0.35714285714285715, 'f1': 0.2808988764044944}, 'combined': 0.18726591760299627, 'epoch': 39}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.2611111111111111, 'r': 0.5108695652173914, 'f1': 0.34558823529411775}, 'combined': 0.17279411764705888, 'epoch': 39}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.4666666666666667, 'r': 0.2413793103448276, 'f1': 0.3181818181818182}, 'combined': 0.2121212121212121, 'epoch': 39}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.31374061853002067, 'r': 0.3286239495798319, 'f1': 0.3210098636303455}, 'combined': 0.23653358372762298, 'epoch': 33}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.35580409301213317, 'r': 0.30097021547433256, 'f1': 0.32609812277003203}, 'combined': 0.20043104131231237, 'epoch': 33}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.31875, 'r': 0.36428571428571427, 'f1': 0.34}, 'combined': 0.22666666666666668, 'epoch': 33}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.30156788595932055, 'r': 0.3456299869438892, 'f1': 0.322099032925605}, 'combined': 0.23733612952412997, 'epoch': 36}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.331509942889869, 'r': 0.31226283182087317, 'f1': 0.3215986683813365}, 'combined': 0.19766552300511414, 'epoch': 36}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.32432432432432434, 'r': 0.5217391304347826, 'f1': 0.4}, 'combined': 0.2, 'epoch': 36}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.3161637354106435, 'r': 0.3395610516934046, 'f1': 0.3274449665918101}, 'combined': 0.24127523854133376, 'epoch': 24}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.34766580316129947, 'r': 0.3010093533864065, 'f1': 0.32265967810793456}, 'combined': 0.19928980118431255, 'epoch': 24}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.2413793103448276, 'f1': 0.32558139534883723}, 'combined': 0.21705426356589147, 'epoch': 24}
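The evaluation dicts above follow a consistent arithmetic pattern that can be checked against the logged numbers: each 'f1' is the harmonic mean of 'p' and 'r', and 'combined' is the template F1 multiplied by the slot F1. A minimal sketch reproducing the Dev Chinese result from epoch 39 (the function names `f1` and `combined_score` are illustrative, not taken from `train.py`):

```python
def f1(p, r):
    """Harmonic mean of precision and recall; defined as 0 when p + r = 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template, slot):
    """'combined' in the log matches template F1 times slot F1."""
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

# Dev Chinese, epoch 39, values copied from the log above:
template = {'p': 1.0, 'r': 0.5833333333333334}
slot = {'p': 0.2937243605212355, 'r': 0.32995222282461373}
print(combined_score(template, slot))  # ≈ 0.229000, the logged 'combined' value
```

The same product relation holds for every other language/split in the summary, e.g. Sample Korean: 0.5 × 0.34558823529411775 = 0.17279411764705888.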