Command that produces this log: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 Initialized Template model with checkpoint at logs/template/template-model.mdl.lang-chinese ---------------------------------------------------------------------------------------------------- > trainable params: >>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024]) >>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024]) >>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024]) >>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024]) >>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.1.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.1.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.1.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.value.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.2.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.2.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.2.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.2.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.3.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.3.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.3.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.4.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.4.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.4.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.value.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.5.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.5.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.5.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.5.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.6.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.6.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.6.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.7.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.7.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.7.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.value.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.8.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.8.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.8.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.8.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.9.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.9.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.9.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.10.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.10.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.10.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.value.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.11.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.11.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.11.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.11.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.12.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.12.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.12.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.13.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.13.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.13.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.value.weight: torch.Size([1024, 
1024]) >>> xlmr.encoder.layer.14.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.14.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.14.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.14.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.15.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.15.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.15.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.16.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.16.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.16.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.value.weight: 
torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.17.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.17.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.17.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.18.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.18.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.18.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.19.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.19.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.19.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.key.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.20.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.20.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.20.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.20.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.21.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.21.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.21.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.22.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.22.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.22.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.key.bias: 
torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.23.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.23.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.23.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.pooler.dense.weight: torch.Size([1024, 1024]) >>> xlmr.pooler.dense.bias: torch.Size([1024]) >>> trans_rep.weight: torch.Size([1024, 2048]) >>> trans_rep.bias: torch.Size([1024]) >>> hidden_ffns.Corruplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Corruplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Cybercrimeplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Cybercrimeplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Disasterplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Disasterplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Displacementplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Displacementplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Epidemiplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Epidemiplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Etiplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Etiplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Protestplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Protestplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Terrorplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Terrorplate.layers.0.bias: torch.Size([768]) >>> template_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> 
template_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Terrorplate.layers.1.bias: torch.Size([2]) >>> type_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Corruplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Corruplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Disasterplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Disasterplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Displacementplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Displacementplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Epidemiplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Epidemiplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Etiplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Etiplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Protestplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Protestplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Terrorplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Terrorplate.layers.1.bias: torch.Size([6]) >>> completion_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Corruplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Corruplate.layers.1.bias: torch.Size([4]) >>> 
completion_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Disasterplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Disasterplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Displacementplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Displacementplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Epidemiplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Epidemiplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Etiplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Etiplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Protestplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Protestplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Terrorplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Terrorplate.layers.1.bias: torch.Size([4]) >>> overtime_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) 
>>> overtime_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Terrorplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Terrorplate.layers.1.bias: torch.Size([2]) n_trainable_params: 582185936, n_nontrainable_params: 0 ---------------------------------------------------------------------------------------------------- 
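Note: the parameter dump above pins down the head architecture sitting on top of xlm-roberta-large: trans_rep projects a 2048-dim (two concatenated 1024-dim) representation to 1024, one 1024->768 hidden FFN exists per template type, and five groups of per-template classifiers go 768 -> 450 -> {2, 6, 4, 2, 2}, where 450 is --event_hidden_num. The following is a minimal sketch reconstructing those heads from the shapes alone; only the shapes, the ModuleDict key names, and the "layers.N" naming come from the log, while the class names (MLP, TemplateHeads), the activation, and the wiring are assumptions and may differ from the actual train.py.

import torch
import torch.nn as nn

TEMPLATES = [
    "Corruplate", "Cybercrimeplate", "Disasterplate", "Displacementplate",
    "Epidemiplate", "Etiplate", "Protestplate", "Terrorplate",
]

class MLP(nn.Module):
    # Stack of Linear layers; parameters appear as layers.0.weight, layers.1.weight, ...
    def __init__(self, sizes):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(sizes[:-1], sizes[1:]))

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)  # activation between layers is an assumption
        return x

class TemplateHeads(nn.Module):
    def __init__(self, xlmr_dim=1024, ffn_dim=768, event_hidden_num=450):
        super().__init__()
        # trans_rep.weight is [1024, 2048]: two concatenated 1024-d vectors -> 1024-d.
        self.trans_rep = nn.Linear(2 * xlmr_dim, xlmr_dim)
        # One 1024 -> 768 hidden FFN per template type.
        self.hidden_ffns = nn.ModuleDict(
            {t: MLP([xlmr_dim, ffn_dim]) for t in TEMPLATES})
        # Output widths read off the final-layer weight shapes: 2, 6, 4, 2, 2.
        self.template_classifiers = nn.ModuleDict(
            {t: MLP([ffn_dim, event_hidden_num, 2]) for t in TEMPLATES})
        self.type_classifiers = nn.ModuleDict(
            {t: MLP([ffn_dim, event_hidden_num, 6]) for t in TEMPLATES})
        self.completion_classifiers = nn.ModuleDict(
            {t: MLP([ffn_dim, event_hidden_num, 4]) for t in TEMPLATES})
        self.overtime_classifiers = nn.ModuleDict(
            {t: MLP([ffn_dim, event_hidden_num, 2]) for t in TEMPLATES})
        self.coordinated_classifiers = nn.ModuleDict(
            {t: MLP([ffn_dim, event_hidden_num, 2]) for t in TEMPLATES})

heads = TemplateHeads()
# n_trainable_params in the log is the standard requires_grad sum; the 582,185,936
# total also covers the ~560M parameters of the xlm-roberta-large encoder itself.
print(sum(p.numel() for p in heads.parameters() if p.requires_grad))

The two learning rates on the command line (--xlmr_learning_rate 2e-5 for xlmr.* and --learning_rate 2e-4, presumably for the heads) suggest two optimizer parameter groups, and the step counter below advancing by --accumulate_step (4) per logged line is consistent with gradient accumulation over --batch_size 10, i.e. an effective batch of about 40; both readings are inferences from the log, not confirmed from the code.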
****************************** Epoch: 0 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:27:49.234307: step: 4/77, loss: 0.014562876895070076 2023-01-22 14:27:50.663021: step: 8/77, loss: 0.07707087695598602 2023-01-22 14:27:52.132951: step: 12/77, loss: 0.038736917078495026 2023-01-22 14:27:53.596164: step: 16/77, loss: 0.038666725158691406 2023-01-22 14:27:55.072942: step: 20/77, loss: 0.07345042377710342 2023-01-22 14:27:56.516333: step: 24/77, loss: 0.06268668919801712 2023-01-22 14:27:57.958337: step: 28/77, loss: 0.014778295531868935 2023-01-22 14:27:59.379378: step: 32/77, loss: 0.027883129194378853 2023-01-22 14:28:00.839037: step: 36/77, loss: 0.07212666422128677 2023-01-22 14:28:02.290261: step: 40/77, loss: 0.16982421278953552 2023-01-22 14:28:03.737561: step: 44/77, loss: 0.008777172304689884 2023-01-22 14:28:05.254756: step: 48/77, loss: 0.07277092337608337 2023-01-22 14:28:06.711830: step: 52/77, loss: 0.011910630390048027 2023-01-22 14:28:08.196786: step: 56/77, loss: 0.053266726434230804 2023-01-22 14:28:09.607602: step: 60/77, loss: 0.025452883914113045 2023-01-22 14:28:11.085524: step: 64/77, loss: 0.013593180105090141 2023-01-22 14:28:12.531051: step: 68/77, loss: 0.027945440262556076 2023-01-22 14:28:13.932028: step: 72/77, loss: 0.03266098350286484 2023-01-22 14:28:15.403187: step: 76/77, loss: 0.1456146091222763 2023-01-22 14:28:16.901815: step: 80/77, loss: 0.028023535385727882 2023-01-22 14:28:18.341525: step: 84/77, loss: 0.01657877117395401 2023-01-22 14:28:19.779598: step: 88/77, loss: 0.042818792164325714 2023-01-22 14:28:21.253687: step: 92/77, loss: 0.00685079675167799 2023-01-22 14:28:22.733957: step: 96/77, loss: 0.03217457979917526 2023-01-22 14:28:24.125467: step: 100/77, loss: 0.013648118823766708 2023-01-22 14:28:25.526668: step: 104/77, loss: 0.020677845925092697 2023-01-22 14:28:27.054212: step: 108/77, loss: 0.024796517565846443 2023-01-22 14:28:28.559311: step: 112/77, loss: 0.13599920272827148 2023-01-22 14:28:29.984678: step: 116/77, loss: 0.015736805275082588 2023-01-22 14:28:31.437113: step: 120/77, loss: 0.04636046290397644 2023-01-22 14:28:32.847320: step: 124/77, loss: 0.020304324105381966 2023-01-22 14:28:34.271991: step: 128/77, loss: 0.014318431727588177 2023-01-22 14:28:35.672024: step: 132/77, loss: 0.017291415482759476 2023-01-22 14:28:37.138243: step: 136/77, loss: 0.051120441406965256 2023-01-22 14:28:38.540688: step: 140/77, loss: 0.010047174990177155 2023-01-22 14:28:39.990102: step: 144/77, loss: 0.0051663643680512905 2023-01-22 14:28:41.427088: step: 148/77, loss: 0.026196736842393875 2023-01-22 14:28:42.904560: step: 152/77, loss: 0.013795826584100723 2023-01-22 14:28:44.299833: step: 156/77, loss: 0.02978861704468727 2023-01-22 14:28:45.721854: step: 160/77, loss: 0.018997054547071457 2023-01-22 14:28:47.197100: step: 164/77, loss: 0.010798189789056778 2023-01-22 14:28:48.609916: step: 168/77, loss: 0.015195745974779129 2023-01-22 14:28:50.084647: step: 172/77, loss: 0.13473385572433472 2023-01-22 14:28:51.524099: step: 176/77, loss: 0.0038908179849386215 2023-01-22 14:28:52.991497: step: 180/77, loss: 0.031229078769683838 2023-01-22 14:28:54.453887: step: 184/77, loss: 0.0022021837066859007 2023-01-22 14:28:55.912331: step: 188/77, loss: 0.028069159016013145 2023-01-22 14:28:57.273980: step: 192/77, loss: 0.008518392220139503 2023-01-22 
14:28:58.736132: step: 196/77, loss: 0.04296804592013359 2023-01-22 14:29:00.143173: step: 200/77, loss: 0.057969074696302414 2023-01-22 14:29:01.607282: step: 204/77, loss: 0.01200239546597004 2023-01-22 14:29:03.073996: step: 208/77, loss: 0.022103039547801018 2023-01-22 14:29:04.474418: step: 212/77, loss: 0.02972140721976757 2023-01-22 14:29:05.908608: step: 216/77, loss: 0.020930558443069458 2023-01-22 14:29:07.278956: step: 220/77, loss: 0.07869278639554977 2023-01-22 14:29:08.711007: step: 224/77, loss: 0.016481993719935417 2023-01-22 14:29:10.127742: step: 228/77, loss: 0.04365432634949684 2023-01-22 14:29:11.581917: step: 232/77, loss: 0.010739430785179138 2023-01-22 14:29:13.037339: step: 236/77, loss: 0.029147211462259293 2023-01-22 14:29:14.507032: step: 240/77, loss: 0.03926050662994385 2023-01-22 14:29:15.961555: step: 244/77, loss: 0.004719543270766735 2023-01-22 14:29:17.324675: step: 248/77, loss: 0.028567254543304443 2023-01-22 14:29:18.729851: step: 252/77, loss: 0.008478794246912003 2023-01-22 14:29:20.120149: step: 256/77, loss: 0.2654775083065033 2023-01-22 14:29:21.588580: step: 260/77, loss: 0.062417369335889816 2023-01-22 14:29:23.038968: step: 264/77, loss: 0.04207262396812439 2023-01-22 14:29:24.519240: step: 268/77, loss: 0.04787707328796387 2023-01-22 14:29:25.986608: step: 272/77, loss: 0.010792815126478672 2023-01-22 14:29:27.475392: step: 276/77, loss: 0.012335541658103466 2023-01-22 14:29:28.869817: step: 280/77, loss: 0.016102604568004608 2023-01-22 14:29:30.347990: step: 284/77, loss: 0.07793550938367844 2023-01-22 14:29:31.800546: step: 288/77, loss: 0.0172465518116951 2023-01-22 14:29:33.284115: step: 292/77, loss: 0.019006339833140373 2023-01-22 14:29:34.719983: step: 296/77, loss: 0.02185669168829918 2023-01-22 14:29:36.116355: step: 300/77, loss: 0.020377611741423607 2023-01-22 14:29:37.496028: step: 304/77, loss: 0.010171851143240929 2023-01-22 14:29:38.943505: step: 308/77, loss: 0.006339873652905226 2023-01-22 14:29:40.400831: step: 312/77, loss: 0.02055288292467594 2023-01-22 14:29:41.787243: step: 316/77, loss: 0.07479239255189896 2023-01-22 14:29:43.230082: step: 320/77, loss: 0.08557580411434174 2023-01-22 14:29:44.677231: step: 324/77, loss: 0.026895936578512192 2023-01-22 14:29:46.140752: step: 328/77, loss: 0.04614676535129547 2023-01-22 14:29:47.559203: step: 332/77, loss: 0.028118902817368507 2023-01-22 14:29:49.041538: step: 336/77, loss: 0.03481002897024155 2023-01-22 14:29:50.514909: step: 340/77, loss: 0.009712321683764458 2023-01-22 14:29:51.913337: step: 344/77, loss: 0.02002524957060814 2023-01-22 14:29:53.335128: step: 348/77, loss: 0.014818353578448296 2023-01-22 14:29:54.754470: step: 352/77, loss: 0.009152144193649292 2023-01-22 14:29:56.181803: step: 356/77, loss: 0.027603866532444954 2023-01-22 14:29:57.663712: step: 360/77, loss: 0.006066218018531799 2023-01-22 14:29:59.118490: step: 364/77, loss: 0.013059323653578758 2023-01-22 14:30:00.577896: step: 368/77, loss: 0.0033606139477342367 2023-01-22 14:30:02.031543: step: 372/77, loss: 0.02507396973669529 2023-01-22 14:30:03.406781: step: 376/77, loss: 0.006337035447359085 2023-01-22 14:30:04.842053: step: 380/77, loss: 0.17721346020698547 2023-01-22 14:30:06.274126: step: 384/77, loss: 0.012289897538721561 2023-01-22 14:30:07.668538: step: 388/77, loss: 0.011396890506148338 ================================================== Loss: 0.036 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 
0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test Chinese: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030223114477346356, 'epoch': 0} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test Korean: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030223114477346356, 'epoch': 0} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test Russian: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.028963818040790255, 'epoch': 0} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 0} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0} New best chinese model... New best korean model... New best russian model... 
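Note on the metric dicts: each result pairs a template-level and a slot-level precision/recall/F1, and the logged 'combined' figure matches the product of the two F1 values (e.g. Dev Chinese above: 0.7234042553 x 0.0702987698 = 0.0508544292). A small arithmetic sketch follows; the helper names f1 and combined_score are mine, not taken from train.py.

def f1(p: float, r: float) -> float:
    # Standard harmonic mean of precision and recall.
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def combined_score(template_f1: float, slot_f1: float) -> float:
    # 'combined' in the log equals template F1 multiplied by slot F1.
    return template_f1 * slot_f1

template = f1(1.0, 0.5666666666666667)   # -> 0.7234042553191489
slot = f1(0.5, 0.03780718336483932)      # -> 0.07029876977152899
print(combined_score(template, slot))    # -> 0.05085442919642522 (Dev Chinese, epoch 0)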
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test for Chinese: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030223114477346356, 'epoch': 0} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test for Korean: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030223114477346356, 'epoch': 0} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 0} Test for Russian: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.028963818040790255, 'epoch': 0} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 0} ****************************** Epoch: 1 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:32:01.326419: step: 4/77, loss: 0.03395911306142807 2023-01-22 14:32:02.772426: step: 8/77, loss: 0.03220266103744507 2023-01-22 14:32:04.157052: step: 12/77, loss: 0.034749917685985565 2023-01-22 14:32:05.569920: step: 16/77, loss: 0.0028172857128083706 2023-01-22 14:32:07.024858: step: 20/77, loss: 0.011881925165653229 2023-01-22 14:32:08.452867: step: 24/77, loss: 0.0772692933678627 2023-01-22 14:32:09.883155: step: 28/77, loss: 0.022852960973978043 2023-01-22 14:32:11.337698: step: 32/77, loss: 0.0349646732211113 2023-01-22 14:32:12.745315: step: 36/77, loss: 0.027902746573090553 2023-01-22 14:32:14.196349: step: 40/77, loss: 0.04271809384226799 2023-01-22 14:32:15.637284: step: 44/77, loss: 0.014981645159423351 2023-01-22 14:32:17.087829: step: 48/77, loss: 0.002633021678775549 2023-01-22 14:32:18.514355: step: 52/77, loss: 0.00447846669703722 2023-01-22 14:32:19.930796: step: 56/77, loss: 0.042827967554330826 2023-01-22 14:32:21.369285: step: 60/77, loss: 0.009535769000649452 2023-01-22 14:32:22.817884: step: 64/77, loss: 0.015919271856546402 2023-01-22 14:32:24.301844: step: 68/77, loss: 0.0035544466227293015 2023-01-22 14:32:25.744356: step: 72/77, loss: 0.013386842794716358 2023-01-22 14:32:27.198195: step: 76/77, loss: 0.004437257535755634 2023-01-22 14:32:28.630472: step: 80/77, loss: 
0.017227396368980408 2023-01-22 14:32:30.077004: step: 84/77, loss: 0.00327373412437737 2023-01-22 14:32:31.511519: step: 88/77, loss: 0.012131031602621078 2023-01-22 14:32:32.989862: step: 92/77, loss: 0.06616261601448059 2023-01-22 14:32:34.423521: step: 96/77, loss: 0.004690750036388636 2023-01-22 14:32:35.839126: step: 100/77, loss: 0.0018397839739918709 2023-01-22 14:32:37.286262: step: 104/77, loss: 0.02620423585176468 2023-01-22 14:32:38.739807: step: 108/77, loss: 0.0009788009338080883 2023-01-22 14:32:40.235447: step: 112/77, loss: 0.0021083469036966562 2023-01-22 14:32:41.726444: step: 116/77, loss: 0.0069288769736886024 2023-01-22 14:32:43.146686: step: 120/77, loss: 0.024865340441465378 2023-01-22 14:32:44.586767: step: 124/77, loss: 0.001671614940278232 2023-01-22 14:32:46.037198: step: 128/77, loss: 0.020936831831932068 2023-01-22 14:32:47.496828: step: 132/77, loss: 0.018269537016749382 2023-01-22 14:32:48.851172: step: 136/77, loss: 0.0369483083486557 2023-01-22 14:32:50.300831: step: 140/77, loss: 0.10529951751232147 2023-01-22 14:32:51.812562: step: 144/77, loss: 0.001719652209430933 2023-01-22 14:32:53.261473: step: 148/77, loss: 0.020288731902837753 2023-01-22 14:32:54.727691: step: 152/77, loss: 0.05669817328453064 2023-01-22 14:32:56.141875: step: 156/77, loss: 0.06675045192241669 2023-01-22 14:32:57.533284: step: 160/77, loss: 0.0026104215066879988 2023-01-22 14:32:59.000829: step: 164/77, loss: 0.205818772315979 2023-01-22 14:33:00.479167: step: 168/77, loss: 0.009255688637495041 2023-01-22 14:33:01.992227: step: 172/77, loss: 0.004980894271284342 2023-01-22 14:33:03.454752: step: 176/77, loss: 0.016286468133330345 2023-01-22 14:33:04.920962: step: 180/77, loss: 0.05188234895467758 2023-01-22 14:33:06.334578: step: 184/77, loss: 0.02211000584065914 2023-01-22 14:33:07.768134: step: 188/77, loss: 0.0801074355840683 2023-01-22 14:33:09.219688: step: 192/77, loss: 0.009789615869522095 2023-01-22 14:33:10.644804: step: 196/77, loss: 0.05917605385184288 2023-01-22 14:33:12.083620: step: 200/77, loss: 0.009454208426177502 2023-01-22 14:33:13.442438: step: 204/77, loss: 0.015430726110935211 2023-01-22 14:33:14.818269: step: 208/77, loss: 0.022764410823583603 2023-01-22 14:33:16.288290: step: 212/77, loss: 0.0405598022043705 2023-01-22 14:33:17.722880: step: 216/77, loss: 0.022373605519533157 2023-01-22 14:33:19.159191: step: 220/77, loss: 0.05269388109445572 2023-01-22 14:33:20.546927: step: 224/77, loss: 0.0030973779503256083 2023-01-22 14:33:21.995607: step: 228/77, loss: 0.005300295539200306 2023-01-22 14:33:23.447529: step: 232/77, loss: 0.10702791810035706 2023-01-22 14:33:24.855647: step: 236/77, loss: 0.007571746129542589 2023-01-22 14:33:26.353623: step: 240/77, loss: 0.00989251397550106 2023-01-22 14:33:27.822108: step: 244/77, loss: 0.04054154083132744 2023-01-22 14:33:29.356101: step: 248/77, loss: 0.12586620450019836 2023-01-22 14:33:30.806569: step: 252/77, loss: 0.0805317759513855 2023-01-22 14:33:32.254690: step: 256/77, loss: 0.005477371159940958 2023-01-22 14:33:33.725833: step: 260/77, loss: 0.060180533677339554 2023-01-22 14:33:35.150033: step: 264/77, loss: 0.005746157839894295 2023-01-22 14:33:36.604231: step: 268/77, loss: 0.003750124480575323 2023-01-22 14:33:38.061078: step: 272/77, loss: 0.0038480362854897976 2023-01-22 14:33:39.460090: step: 276/77, loss: 0.10809177160263062 2023-01-22 14:33:40.886245: step: 280/77, loss: 0.02390853688120842 2023-01-22 14:33:42.316232: step: 284/77, loss: 0.0072253309190273285 2023-01-22 14:33:43.829358: step: 
288/77, loss: 0.010628428310155869 2023-01-22 14:33:45.307250: step: 292/77, loss: 0.002955624833703041 2023-01-22 14:33:46.728345: step: 296/77, loss: 0.01289620716124773 2023-01-22 14:33:48.181922: step: 300/77, loss: 0.030839625746011734 2023-01-22 14:33:49.611372: step: 304/77, loss: 0.02036554366350174 2023-01-22 14:33:51.132795: step: 308/77, loss: 0.016886930912733078 2023-01-22 14:33:52.526659: step: 312/77, loss: 0.024544425308704376 2023-01-22 14:33:53.922369: step: 316/77, loss: 0.022640999406576157 2023-01-22 14:33:55.394501: step: 320/77, loss: 0.010831587947905064 2023-01-22 14:33:56.882735: step: 324/77, loss: 0.024883292615413666 2023-01-22 14:33:58.326810: step: 328/77, loss: 0.019034339115023613 2023-01-22 14:33:59.767125: step: 332/77, loss: 0.018680687993764877 2023-01-22 14:34:01.242509: step: 336/77, loss: 0.05518527701497078 2023-01-22 14:34:02.681020: step: 340/77, loss: 0.0008263073395937681 2023-01-22 14:34:04.099507: step: 344/77, loss: 0.06957840919494629 2023-01-22 14:34:05.566146: step: 348/77, loss: 0.012340852059423923 2023-01-22 14:34:07.032249: step: 352/77, loss: 0.0025068745017051697 2023-01-22 14:34:08.416341: step: 356/77, loss: 0.010693700052797794 2023-01-22 14:34:09.899814: step: 360/77, loss: 0.06091368570923805 2023-01-22 14:34:11.381342: step: 364/77, loss: 0.004978595767170191 2023-01-22 14:34:12.895153: step: 368/77, loss: 0.032167889177799225 2023-01-22 14:34:14.371538: step: 372/77, loss: 0.0032911431044340134 2023-01-22 14:34:15.792663: step: 376/77, loss: 0.0158363226801157 2023-01-22 14:34:17.193432: step: 380/77, loss: 0.015163689851760864 2023-01-22 14:34:18.655743: step: 384/77, loss: 0.049741633236408234 2023-01-22 14:34:20.059071: step: 388/77, loss: 0.045977696776390076 ================================================== Loss: 0.028 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 
'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} New best chinese model... New best korean model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 2 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:36:09.664875: step: 4/77, loss: 0.0074023776687681675 2023-01-22 14:36:11.135379: step: 8/77, loss: 0.03714536502957344 2023-01-22 14:36:12.599486: step: 12/77, loss: 0.029741812497377396 2023-01-22 14:36:14.072180: step: 16/77, loss: 0.002494464162737131 2023-01-22 14:36:15.566418: step: 20/77, loss: 0.04519050195813179 2023-01-22 14:36:16.986305: step: 24/77, loss: 0.008096764795482159 2023-01-22 14:36:18.380090: step: 28/77, loss: 0.008026783354580402 2023-01-22 14:36:19.862344: step: 32/77, loss: 0.15679682791233063 2023-01-22 14:36:21.268374: step: 36/77, loss: 0.021045060828328133 2023-01-22 14:36:22.689374: step: 40/77, loss: 0.05507553368806839 2023-01-22 14:36:24.115651: step: 44/77, loss: 0.0116695836186409 2023-01-22 14:36:25.559771: step: 48/77, loss: 0.005859737750142813 2023-01-22 14:36:27.035090: step: 52/77, loss: 0.008646172471344471 2023-01-22 14:36:28.482692: step: 56/77, loss: 0.05019918829202652 2023-01-22 14:36:29.908416: step: 60/77, loss: 0.0009840942220762372 2023-01-22 14:36:31.350511: step: 64/77, loss: 
0.02366241067647934 2023-01-22 14:36:32.738380: step: 68/77, loss: 0.015323936007916927 2023-01-22 14:36:34.175650: step: 72/77, loss: 0.0016412574332207441 2023-01-22 14:36:35.582983: step: 76/77, loss: 0.07329888641834259 2023-01-22 14:36:37.016246: step: 80/77, loss: 0.0036670793779194355 2023-01-22 14:36:38.409518: step: 84/77, loss: 0.16012811660766602 2023-01-22 14:36:39.859992: step: 88/77, loss: 0.03313121199607849 2023-01-22 14:36:41.243327: step: 92/77, loss: 0.03533478081226349 2023-01-22 14:36:42.667342: step: 96/77, loss: 0.020638640969991684 2023-01-22 14:36:44.079828: step: 100/77, loss: 0.01495118997991085 2023-01-22 14:36:45.506039: step: 104/77, loss: 0.0386812798678875 2023-01-22 14:36:46.922869: step: 108/77, loss: 0.036018919199705124 2023-01-22 14:36:48.346025: step: 112/77, loss: 0.013223566114902496 2023-01-22 14:36:49.824518: step: 116/77, loss: 0.04730243980884552 2023-01-22 14:36:51.282241: step: 120/77, loss: 0.024481404572725296 2023-01-22 14:36:52.708164: step: 124/77, loss: 0.00974772684276104 2023-01-22 14:36:54.156563: step: 128/77, loss: 0.025366447865962982 2023-01-22 14:36:55.512205: step: 132/77, loss: 0.002794741652905941 2023-01-22 14:36:56.968125: step: 136/77, loss: 0.0063951462507247925 2023-01-22 14:36:58.361633: step: 140/77, loss: 0.023222854360938072 2023-01-22 14:36:59.847001: step: 144/77, loss: 0.08070119470357895 2023-01-22 14:37:01.297699: step: 148/77, loss: 0.003176407888531685 2023-01-22 14:37:02.665775: step: 152/77, loss: 0.0008568878984078765 2023-01-22 14:37:04.107501: step: 156/77, loss: 0.0002963995502796024 2023-01-22 14:37:05.548329: step: 160/77, loss: 0.02169245108962059 2023-01-22 14:37:06.993149: step: 164/77, loss: 0.003114581573754549 2023-01-22 14:37:08.471491: step: 168/77, loss: 0.0032922057434916496 2023-01-22 14:37:09.948478: step: 172/77, loss: 0.006550137884914875 2023-01-22 14:37:11.371438: step: 176/77, loss: 0.017594965174794197 2023-01-22 14:37:12.825235: step: 180/77, loss: 0.051499150693416595 2023-01-22 14:37:14.255908: step: 184/77, loss: 0.022368844598531723 2023-01-22 14:37:15.735621: step: 188/77, loss: 0.03106348216533661 2023-01-22 14:37:17.266614: step: 192/77, loss: 0.0034394976682960987 2023-01-22 14:37:18.723501: step: 196/77, loss: 0.013026936911046505 2023-01-22 14:37:20.117134: step: 200/77, loss: 0.009313576854765415 2023-01-22 14:37:21.577773: step: 204/77, loss: 0.02236253395676613 2023-01-22 14:37:23.090413: step: 208/77, loss: 0.07646773755550385 2023-01-22 14:37:24.514509: step: 212/77, loss: 0.0029299717862159014 2023-01-22 14:37:25.921268: step: 216/77, loss: 0.014075284823775291 2023-01-22 14:37:27.339285: step: 220/77, loss: 0.08892801403999329 2023-01-22 14:37:28.755217: step: 224/77, loss: 0.015748074278235435 2023-01-22 14:37:30.202980: step: 228/77, loss: 0.008066429756581783 2023-01-22 14:37:31.706472: step: 232/77, loss: 0.04055601358413696 2023-01-22 14:37:33.180202: step: 236/77, loss: 0.061504434794187546 2023-01-22 14:37:34.664995: step: 240/77, loss: 0.007917475886642933 2023-01-22 14:37:36.225300: step: 244/77, loss: 0.0018510303925722837 2023-01-22 14:37:37.610175: step: 248/77, loss: 0.03056379035115242 2023-01-22 14:37:39.099858: step: 252/77, loss: 0.008867012336850166 2023-01-22 14:37:40.507239: step: 256/77, loss: 0.01966426521539688 2023-01-22 14:37:42.065220: step: 260/77, loss: 0.03805197775363922 2023-01-22 14:37:43.500209: step: 264/77, loss: 0.00914710108190775 2023-01-22 14:37:44.967204: step: 268/77, loss: 0.003982956521213055 2023-01-22 14:37:46.403883: step: 
272/77, loss: 0.0023249429650604725 2023-01-22 14:37:47.827479: step: 276/77, loss: 0.01259232684969902 2023-01-22 14:37:49.307769: step: 280/77, loss: 0.005942876450717449 2023-01-22 14:37:50.677509: step: 284/77, loss: 0.015731485560536385 2023-01-22 14:37:52.122482: step: 288/77, loss: 0.014402685686945915 2023-01-22 14:37:53.565487: step: 292/77, loss: 0.07678796350955963 2023-01-22 14:37:55.068657: step: 296/77, loss: 0.011266544461250305 2023-01-22 14:37:56.425039: step: 300/77, loss: 0.013650190085172653 2023-01-22 14:37:57.907567: step: 304/77, loss: 0.01796167902648449 2023-01-22 14:37:59.371391: step: 308/77, loss: 0.04054342582821846 2023-01-22 14:38:00.866474: step: 312/77, loss: 0.18366539478302002 2023-01-22 14:38:02.333328: step: 316/77, loss: 0.0310624148696661 2023-01-22 14:38:03.787100: step: 320/77, loss: 0.010575903579592705 2023-01-22 14:38:05.245080: step: 324/77, loss: 0.0009654787136241794 2023-01-22 14:38:06.685166: step: 328/77, loss: 0.015507823787629604 2023-01-22 14:38:08.094860: step: 332/77, loss: 0.016178598627448082 2023-01-22 14:38:09.563382: step: 336/77, loss: 0.0037763467989861965 2023-01-22 14:38:11.031221: step: 340/77, loss: 0.03331249579787254 2023-01-22 14:38:12.496984: step: 344/77, loss: 0.12362132221460342 2023-01-22 14:38:13.968640: step: 348/77, loss: 0.02561323344707489 2023-01-22 14:38:15.473442: step: 352/77, loss: 0.03250039368867874 2023-01-22 14:38:16.937294: step: 356/77, loss: 0.030868258327245712 2023-01-22 14:38:18.413112: step: 360/77, loss: 0.02164987474679947 2023-01-22 14:38:19.900978: step: 364/77, loss: 0.0348210446536541 2023-01-22 14:38:21.329110: step: 368/77, loss: 0.025197172537446022 2023-01-22 14:38:22.809401: step: 372/77, loss: 0.014328612014651299 2023-01-22 14:38:24.234216: step: 376/77, loss: 0.05011274293065071 2023-01-22 14:38:25.665616: step: 380/77, loss: 0.029031846672296524 2023-01-22 14:38:27.128124: step: 384/77, loss: 0.025708947330713272 2023-01-22 14:38:28.625993: step: 388/77, loss: 0.008589334785938263 ================================================== Loss: 0.028 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Chinese: {'template': {'p': 0.9210526315789473, 'r': 0.5511811023622047, 'f1': 0.6896551724137933}, 'slot': {'p': 0.5641025641025641, 'r': 0.0210727969348659, 'f1': 0.04062788550323177}, 'combined': 0.02801923138153916, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Korean: {'template': {'p': 0.9210526315789473, 'r': 0.5511811023622047, 'f1': 0.6896551724137933}, 'slot': {'p': 0.5641025641025641, 'r': 0.0210727969348659, 'f1': 0.04062788550323177}, 'combined': 0.02801923138153916, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Russian: {'template': {'p': 0.9090909090909091, 'r': 0.5511811023622047, 'f1': 0.6862745098039216}, 'slot': {'p': 0.5526315789473685, 'r': 0.020114942528735632, 'f1': 0.03881700554528651}, 'combined': 0.026639121452647605, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 
0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 3 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:40:01.945657: step: 4/77, loss: 0.0035422546789050102 2023-01-22 14:40:03.376656: step: 8/77, loss: 0.0262028556317091 2023-01-22 14:40:04.878781: step: 12/77, loss: 0.021963641047477722 2023-01-22 14:40:06.336219: step: 16/77, loss: 0.007140520494431257 2023-01-22 14:40:07.761217: step: 20/77, loss: 0.0059587387368083 2023-01-22 14:40:09.183422: step: 24/77, loss: 0.01787441037595272 2023-01-22 14:40:10.624765: step: 28/77, loss: 0.006597748026251793 2023-01-22 14:40:12.022103: step: 32/77, loss: 0.0036535533145070076 2023-01-22 14:40:13.462571: step: 36/77, loss: 0.004095820244401693 2023-01-22 14:40:14.899266: step: 40/77, loss: 0.01996089518070221 2023-01-22 14:40:16.281340: step: 44/77, loss: 0.13529127836227417 2023-01-22 14:40:17.716279: step: 48/77, loss: 0.0120127834379673 2023-01-22 14:40:19.133561: step: 52/77, loss: 
0.004720134660601616 2023-01-22 14:40:20.541117: step: 56/77, loss: 0.0007113558822311461 2023-01-22 14:40:22.042233: step: 60/77, loss: 0.0045652734115719795 2023-01-22 14:40:23.469752: step: 64/77, loss: 0.009211833588778973 2023-01-22 14:40:24.895306: step: 68/77, loss: 0.0020150267519056797 2023-01-22 14:40:26.319511: step: 72/77, loss: 0.002670627785846591 2023-01-22 14:40:27.790217: step: 76/77, loss: 0.009097274392843246 2023-01-22 14:40:29.337897: step: 80/77, loss: 0.009357105009257793 2023-01-22 14:40:30.827611: step: 84/77, loss: 0.0012333837803453207 2023-01-22 14:40:32.278288: step: 88/77, loss: 0.01648041605949402 2023-01-22 14:40:33.745715: step: 92/77, loss: 0.02276625856757164 2023-01-22 14:40:35.218722: step: 96/77, loss: 0.029029421508312225 2023-01-22 14:40:36.670049: step: 100/77, loss: 0.07483517378568649 2023-01-22 14:40:38.114110: step: 104/77, loss: 0.005736944731324911 2023-01-22 14:40:39.542802: step: 108/77, loss: 0.02961890585720539 2023-01-22 14:40:40.922944: step: 112/77, loss: 0.02841648831963539 2023-01-22 14:40:42.375874: step: 116/77, loss: 0.07503394037485123 2023-01-22 14:40:43.810718: step: 120/77, loss: 0.0740741491317749 2023-01-22 14:40:45.203386: step: 124/77, loss: 0.0006337867816910148 2023-01-22 14:40:46.703469: step: 128/77, loss: 0.06943807750940323 2023-01-22 14:40:48.118371: step: 132/77, loss: 0.003671524580568075 2023-01-22 14:40:49.537486: step: 136/77, loss: 0.004404841456562281 2023-01-22 14:40:50.963276: step: 140/77, loss: 0.03815484791994095 2023-01-22 14:40:52.423114: step: 144/77, loss: 0.005773406475782394 2023-01-22 14:40:53.897446: step: 148/77, loss: 0.0005300405318848789 2023-01-22 14:40:55.306045: step: 152/77, loss: 0.11488598585128784 2023-01-22 14:40:56.777485: step: 156/77, loss: 0.045080967247486115 2023-01-22 14:40:58.118283: step: 160/77, loss: 0.002348395064473152 2023-01-22 14:40:59.630829: step: 164/77, loss: 0.0007226568413898349 2023-01-22 14:41:01.072730: step: 168/77, loss: 0.016137652099132538 2023-01-22 14:41:02.528520: step: 172/77, loss: 0.0228740107268095 2023-01-22 14:41:03.973696: step: 176/77, loss: 0.016735028475522995 2023-01-22 14:41:05.391929: step: 180/77, loss: 0.004764287732541561 2023-01-22 14:41:06.807799: step: 184/77, loss: 0.009152491576969624 2023-01-22 14:41:08.290597: step: 188/77, loss: 0.02768898196518421 2023-01-22 14:41:09.747376: step: 192/77, loss: 0.02774052694439888 2023-01-22 14:41:11.171455: step: 196/77, loss: 0.009748372249305248 2023-01-22 14:41:12.683395: step: 200/77, loss: 0.017457593232393265 2023-01-22 14:41:14.068862: step: 204/77, loss: 0.0895286351442337 2023-01-22 14:41:15.479381: step: 208/77, loss: 0.09868789464235306 2023-01-22 14:41:16.962820: step: 212/77, loss: 0.06212780624628067 2023-01-22 14:41:18.381592: step: 216/77, loss: 0.03467656672000885 2023-01-22 14:41:19.788398: step: 220/77, loss: 0.006487079430371523 2023-01-22 14:41:21.196028: step: 224/77, loss: 0.017196647822856903 2023-01-22 14:41:22.604962: step: 228/77, loss: 0.009787725284695625 2023-01-22 14:41:24.044490: step: 232/77, loss: 0.04158326983451843 2023-01-22 14:41:25.481118: step: 236/77, loss: 0.00034005980705842376 2023-01-22 14:41:26.878208: step: 240/77, loss: 0.0076415762305259705 2023-01-22 14:41:28.318410: step: 244/77, loss: 0.003000237513333559 2023-01-22 14:41:29.790572: step: 248/77, loss: 0.01648474857211113 2023-01-22 14:41:31.217154: step: 252/77, loss: 0.05669962614774704 2023-01-22 14:41:32.574835: step: 256/77, loss: 0.030298719182610512 2023-01-22 14:41:34.013534: step: 
260/77, loss: 0.029475966468453407 2023-01-22 14:41:35.510069: step: 264/77, loss: 0.011790428310632706 2023-01-22 14:41:36.914280: step: 268/77, loss: 0.035432107746601105 2023-01-22 14:41:38.368824: step: 272/77, loss: 0.06394211202859879 2023-01-22 14:41:39.807226: step: 276/77, loss: 0.2172403484582901 2023-01-22 14:41:41.281620: step: 280/77, loss: 0.05942591279745102 2023-01-22 14:41:42.720061: step: 284/77, loss: 0.007334072142839432 2023-01-22 14:41:44.149904: step: 288/77, loss: 0.03417358174920082 2023-01-22 14:41:45.580191: step: 292/77, loss: 0.04306645691394806 2023-01-22 14:41:47.063588: step: 296/77, loss: 0.082725889980793 2023-01-22 14:41:48.500111: step: 300/77, loss: 0.10342706739902496 2023-01-22 14:41:49.943130: step: 304/77, loss: 0.055656421929597855 2023-01-22 14:41:51.415485: step: 308/77, loss: 0.005309468135237694 2023-01-22 14:41:52.847233: step: 312/77, loss: 0.08096519112586975 2023-01-22 14:41:54.278192: step: 316/77, loss: 0.0552254281938076 2023-01-22 14:41:55.698644: step: 320/77, loss: 0.0011638941941782832 2023-01-22 14:41:57.073685: step: 324/77, loss: 0.005165431182831526 2023-01-22 14:41:58.599757: step: 328/77, loss: 0.02403011918067932 2023-01-22 14:42:00.057083: step: 332/77, loss: 0.005875763017684221 2023-01-22 14:42:01.501486: step: 336/77, loss: 0.005361301824450493 2023-01-22 14:42:02.937303: step: 340/77, loss: 0.021805912256240845 2023-01-22 14:42:04.336831: step: 344/77, loss: 0.03941915184259415 2023-01-22 14:42:05.836462: step: 348/77, loss: 0.044607698917388916 2023-01-22 14:42:07.248280: step: 352/77, loss: 0.006965402513742447 2023-01-22 14:42:08.709091: step: 356/77, loss: 0.037422556430101395 2023-01-22 14:42:10.132841: step: 360/77, loss: 0.03731338679790497 2023-01-22 14:42:11.598711: step: 364/77, loss: 0.043560612946748734 2023-01-22 14:42:13.055868: step: 368/77, loss: 0.0171118825674057 2023-01-22 14:42:14.506205: step: 372/77, loss: 0.005094244610518217 2023-01-22 14:42:16.031572: step: 376/77, loss: 0.01857147552073002 2023-01-22 14:42:17.500867: step: 380/77, loss: 0.009790465235710144 2023-01-22 14:42:18.974839: step: 384/77, loss: 0.010630027391016483 2023-01-22 14:42:20.434599: step: 388/77, loss: 0.03435883671045303 ================================================== Loss: 0.029 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Chinese: {'template': {'p': 0.918918918918919, 'r': 0.5354330708661418, 'f1': 0.6766169154228856}, 'slot': {'p': 0.46153846153846156, 'r': 0.017241379310344827, 'f1': 0.0332409972299169}, 'combined': 0.022491421011287056, 'epoch': 3} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Korean: {'template': {'p': 0.9066666666666666, 'r': 0.5354330708661418, 'f1': 0.6732673267326733}, 'slot': {'p': 0.4634146341463415, 'r': 0.018199233716475097, 'f1': 0.035023041474654376}, 'combined': 0.023579869507688096, 'epoch': 3} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Russian: {'template': {'p': 0.9066666666666666, 'r': 0.5354330708661418, 'f1': 0.6732673267326733}, 'slot': {'p': 0.475, 'r': 
0.018199233716475097, 'f1': 0.03505535055350554}, 'combined': 0.023601622154835415, 'epoch': 3} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 4 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:43:53.114213: step: 4/77, loss: 0.017117882147431374 2023-01-22 14:43:54.587728: step: 8/77, loss: 0.021694406867027283 2023-01-22 14:43:55.961233: step: 12/77, loss: 0.09319684654474258 2023-01-22 14:43:57.429140: step: 16/77, loss: 0.03276555985212326 2023-01-22 14:43:58.884673: step: 20/77, loss: 0.001923754345625639 2023-01-22 14:44:00.303005: step: 24/77, loss: 0.036008577793836594 2023-01-22 14:44:01.813565: step: 28/77, loss: 0.0005280501791276038 2023-01-22 14:44:03.314362: step: 32/77, loss: 0.039218269288539886 2023-01-22 14:44:04.734157: step: 36/77, loss: 0.03408980369567871 2023-01-22 14:44:06.167022: step: 40/77, loss: 0.016865989193320274 
2023-01-22 14:44:07.628346: step: 44/77, loss: 0.020239533856511116 2023-01-22 14:44:09.094193: step: 48/77, loss: 0.0007147272117435932 2023-01-22 14:44:10.536559: step: 52/77, loss: 0.004312576726078987 2023-01-22 14:44:12.018459: step: 56/77, loss: 0.004660824779421091 2023-01-22 14:44:13.503841: step: 60/77, loss: 0.013897586613893509 2023-01-22 14:44:14.957979: step: 64/77, loss: 0.020500633865594864 2023-01-22 14:44:16.424132: step: 68/77, loss: 0.0009045275510288775 2023-01-22 14:44:17.853294: step: 72/77, loss: 0.06182611733675003 2023-01-22 14:44:19.320344: step: 76/77, loss: 0.02604839950799942 2023-01-22 14:44:20.773128: step: 80/77, loss: 0.15799608826637268 2023-01-22 14:44:22.257305: step: 84/77, loss: 0.01904124580323696 2023-01-22 14:44:23.644006: step: 88/77, loss: 0.0020622629672288895 2023-01-22 14:44:25.059785: step: 92/77, loss: 0.008743856102228165 2023-01-22 14:44:26.463974: step: 96/77, loss: 0.0007341898744925857 2023-01-22 14:44:27.906874: step: 100/77, loss: 0.01710500195622444 2023-01-22 14:44:29.352282: step: 104/77, loss: 0.0022848332300782204 2023-01-22 14:44:30.799246: step: 108/77, loss: 0.003402404487133026 2023-01-22 14:44:32.264915: step: 112/77, loss: 0.0211468618363142 2023-01-22 14:44:33.646127: step: 116/77, loss: 0.004930790048092604 2023-01-22 14:44:35.107298: step: 120/77, loss: 0.014979320578277111 2023-01-22 14:44:36.560053: step: 124/77, loss: 0.03163864463567734 2023-01-22 14:44:38.013954: step: 128/77, loss: 0.013047886081039906 2023-01-22 14:44:39.440901: step: 132/77, loss: 0.0053689442574977875 2023-01-22 14:44:40.908622: step: 136/77, loss: 0.04405604675412178 2023-01-22 14:44:42.364247: step: 140/77, loss: 0.0545283742249012 2023-01-22 14:44:43.776811: step: 144/77, loss: 0.006766856648027897 2023-01-22 14:44:45.256790: step: 148/77, loss: 0.019583430141210556 2023-01-22 14:44:46.725606: step: 152/77, loss: 0.08986619859933853 2023-01-22 14:44:48.167652: step: 156/77, loss: 0.01181843876838684 2023-01-22 14:44:49.617025: step: 160/77, loss: 0.014085713773965836 2023-01-22 14:44:51.029826: step: 164/77, loss: 0.00812592078000307 2023-01-22 14:44:52.444520: step: 168/77, loss: 0.037390850484371185 2023-01-22 14:44:53.912135: step: 172/77, loss: 0.010096865706145763 2023-01-22 14:44:55.290279: step: 176/77, loss: 0.017443949356675148 2023-01-22 14:44:56.669321: step: 180/77, loss: 0.011668755672872066 2023-01-22 14:44:58.128965: step: 184/77, loss: 0.11201060563325882 2023-01-22 14:44:59.565676: step: 188/77, loss: 0.01760982722043991 2023-01-22 14:45:00.936389: step: 192/77, loss: 0.015110615640878677 2023-01-22 14:45:02.432352: step: 196/77, loss: 0.00934512633830309 2023-01-22 14:45:03.885094: step: 200/77, loss: 0.005648928228765726 2023-01-22 14:45:05.322329: step: 204/77, loss: 0.007891605608165264 2023-01-22 14:45:06.698944: step: 208/77, loss: 0.008117000572383404 2023-01-22 14:45:08.166729: step: 212/77, loss: 0.021162116900086403 2023-01-22 14:45:09.662200: step: 216/77, loss: 0.07232334464788437 2023-01-22 14:45:11.126251: step: 220/77, loss: 0.010912317782640457 2023-01-22 14:45:12.546020: step: 224/77, loss: 0.00976177304983139 2023-01-22 14:45:14.039867: step: 228/77, loss: 0.00580084603279829 2023-01-22 14:45:15.473672: step: 232/77, loss: 0.010606599971652031 2023-01-22 14:45:16.903466: step: 236/77, loss: 0.0038156877271831036 2023-01-22 14:45:18.367015: step: 240/77, loss: 0.03545666113495827 2023-01-22 14:45:19.717301: step: 244/77, loss: 0.018850870430469513 2023-01-22 14:45:21.135727: step: 248/77, loss: 
0.002749336650595069 2023-01-22 14:45:22.625922: step: 252/77, loss: 0.06369539350271225 2023-01-22 14:45:24.083040: step: 256/77, loss: 0.012162135913968086 2023-01-22 14:45:25.542368: step: 260/77, loss: 0.03183136507868767 2023-01-22 14:45:27.014449: step: 264/77, loss: 0.0014387741684913635 2023-01-22 14:45:28.425008: step: 268/77, loss: 0.0636252835392952 2023-01-22 14:45:29.929449: step: 272/77, loss: 0.05702805146574974 2023-01-22 14:45:31.368307: step: 276/77, loss: 0.014315210282802582 2023-01-22 14:45:32.818327: step: 280/77, loss: 0.0736774429678917 2023-01-22 14:45:34.231795: step: 284/77, loss: 0.00806544627994299 2023-01-22 14:45:35.645842: step: 288/77, loss: 0.09108346700668335 2023-01-22 14:45:37.090765: step: 292/77, loss: 0.032689716666936874 2023-01-22 14:45:38.499337: step: 296/77, loss: 0.00193894119001925 2023-01-22 14:45:39.891747: step: 300/77, loss: 0.0016112093580886722 2023-01-22 14:45:41.314954: step: 304/77, loss: 0.02805289253592491 2023-01-22 14:45:42.748039: step: 308/77, loss: 0.021633636206388474 2023-01-22 14:45:44.224392: step: 312/77, loss: 0.08681163191795349 2023-01-22 14:45:45.587910: step: 316/77, loss: 0.008203232660889626 2023-01-22 14:45:47.053623: step: 320/77, loss: 0.09448211640119553 2023-01-22 14:45:48.469642: step: 324/77, loss: 0.13245467841625214 2023-01-22 14:45:49.847163: step: 328/77, loss: 0.034202151000499725 2023-01-22 14:45:51.285031: step: 332/77, loss: 0.036005809903144836 2023-01-22 14:45:52.680424: step: 336/77, loss: 0.07581266760826111 2023-01-22 14:45:54.096101: step: 340/77, loss: 0.0025374547112733126 2023-01-22 14:45:55.519898: step: 344/77, loss: 0.01466337963938713 2023-01-22 14:45:56.973316: step: 348/77, loss: 0.08328230679035187 2023-01-22 14:45:58.388172: step: 352/77, loss: 0.010121517814695835 2023-01-22 14:45:59.845673: step: 356/77, loss: 0.015084447339177132 2023-01-22 14:46:01.259813: step: 360/77, loss: 0.011139405891299248 2023-01-22 14:46:02.672241: step: 364/77, loss: 0.06497897952795029 2023-01-22 14:46:04.176412: step: 368/77, loss: 0.11260268092155457 2023-01-22 14:46:05.617915: step: 372/77, loss: 0.08779604732990265 2023-01-22 14:46:07.076290: step: 376/77, loss: 0.038710471242666245 2023-01-22 14:46:08.484298: step: 380/77, loss: 0.06987497955560684 2023-01-22 14:46:09.919589: step: 384/77, loss: 0.03459371253848076 2023-01-22 14:46:11.336410: step: 388/77, loss: 0.006907064002007246 ================================================== Loss: 0.031 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 4} Test Chinese: {'template': {'p': 0.9206349206349206, 'r': 0.4566929133858268, 'f1': 0.6105263157894737}, 'slot': {'p': 0.5862068965517241, 'r': 0.016283524904214558, 'f1': 0.031686859273066165}, 'combined': 0.019345661450924607, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.034026465028355386, 'f1': 0.06371681415929203}, 'combined': 0.04521838424207822, 'epoch': 4} Test Korean: {'template': {'p': 0.9344262295081968, 'r': 0.44881889763779526, 'f1': 0.6063829787234042}, 'slot': {'p': 0.5862068965517241, 'r': 0.016283524904214558, 'f1': 0.031686859273066165}, 'combined': 0.019214372112391184, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.034026465028355386, 'f1': 0.06371681415929203}, 'combined': 
0.04521838424207822, 'epoch': 4} Test Russian: {'template': {'p': 0.9193548387096774, 'r': 0.44881889763779526, 'f1': 0.6031746031746031}, 'slot': {'p': 0.5862068965517241, 'r': 0.016283524904214558, 'f1': 0.031686859273066165}, 'combined': 0.019112708767881178, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 4} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 4} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 4} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 5 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:47:45.547525: step: 4/77, loss: 0.04267946630716324 2023-01-22 14:47:46.964619: step: 8/77, loss: 0.019253859296441078 2023-01-22 14:47:48.388768: step: 12/77, loss: 0.0003734159399755299 2023-01-22 14:47:49.822555: step: 16/77, loss: 0.0033347937278449535 2023-01-22 14:47:51.262968: step: 20/77, loss: 0.0369611494243145 2023-01-22 14:47:52.657742: step: 24/77, loss: 0.023462975397706032 2023-01-22 14:47:54.096647: step: 28/77, loss: 0.034193411469459534 2023-01-22 14:47:55.561612: step: 32/77, loss: 0.008454875089228153 2023-01-22 
14:47:57.026755: step: 36/77, loss: 0.00033789267763495445 2023-01-22 14:47:58.478334: step: 40/77, loss: 0.1968812197446823 2023-01-22 14:47:59.977540: step: 44/77, loss: 0.11563257873058319 2023-01-22 14:48:01.428530: step: 48/77, loss: 0.00573181826621294 2023-01-22 14:48:02.890599: step: 52/77, loss: 0.10677900910377502 2023-01-22 14:48:04.422785: step: 56/77, loss: 0.0739724338054657 2023-01-22 14:48:05.936863: step: 60/77, loss: 0.018925180658698082 2023-01-22 14:48:07.392247: step: 64/77, loss: 0.005690715275704861 2023-01-22 14:48:08.869560: step: 68/77, loss: 0.006786242127418518 2023-01-22 14:48:10.346308: step: 72/77, loss: 0.023970942944288254 2023-01-22 14:48:11.834000: step: 76/77, loss: 0.0038931886665523052 2023-01-22 14:48:13.318169: step: 80/77, loss: 0.0009363566059619188 2023-01-22 14:48:14.747444: step: 84/77, loss: 0.07831157743930817 2023-01-22 14:48:16.156567: step: 88/77, loss: 0.02357906848192215 2023-01-22 14:48:17.666827: step: 92/77, loss: 0.09391173720359802 2023-01-22 14:48:19.128494: step: 96/77, loss: 0.013351823203265667 2023-01-22 14:48:20.611462: step: 100/77, loss: 0.0017684220802038908 2023-01-22 14:48:22.091524: step: 104/77, loss: 0.0012956967111676931 2023-01-22 14:48:23.538746: step: 108/77, loss: 0.004678148310631514 2023-01-22 14:48:25.032655: step: 112/77, loss: 0.0752822533249855 2023-01-22 14:48:26.555445: step: 116/77, loss: 0.020514022558927536 2023-01-22 14:48:27.984769: step: 120/77, loss: 0.008730904199182987 2023-01-22 14:48:29.418518: step: 124/77, loss: 0.010091107338666916 2023-01-22 14:48:30.841539: step: 128/77, loss: 0.005734878592193127 2023-01-22 14:48:32.262104: step: 132/77, loss: 0.025970421731472015 2023-01-22 14:48:33.765027: step: 136/77, loss: 0.006229419726878405 2023-01-22 14:48:35.161221: step: 140/77, loss: 0.007926160469651222 2023-01-22 14:48:36.579718: step: 144/77, loss: 0.00817070435732603 2023-01-22 14:48:38.023918: step: 148/77, loss: 0.030918531119823456 2023-01-22 14:48:39.494274: step: 152/77, loss: 0.014971529133617878 2023-01-22 14:48:41.034056: step: 156/77, loss: 0.07296043634414673 2023-01-22 14:48:42.433956: step: 160/77, loss: 0.014134378172457218 2023-01-22 14:48:43.973160: step: 164/77, loss: 0.0986773669719696 2023-01-22 14:48:45.439238: step: 168/77, loss: 0.03894984722137451 2023-01-22 14:48:46.855539: step: 172/77, loss: 0.08615649491548538 2023-01-22 14:48:48.372221: step: 176/77, loss: 0.008352591656148434 2023-01-22 14:48:49.857848: step: 180/77, loss: 0.021340519189834595 2023-01-22 14:48:51.295901: step: 184/77, loss: 0.00957648828625679 2023-01-22 14:48:52.773602: step: 188/77, loss: 0.016516223549842834 2023-01-22 14:48:54.257623: step: 192/77, loss: 0.012467145919799805 2023-01-22 14:48:55.728186: step: 196/77, loss: 0.14558720588684082 2023-01-22 14:48:57.176974: step: 200/77, loss: 0.0184561125934124 2023-01-22 14:48:58.592603: step: 204/77, loss: 0.03879745304584503 2023-01-22 14:49:00.116176: step: 208/77, loss: 0.024213481694459915 2023-01-22 14:49:01.589119: step: 212/77, loss: 0.0375804677605629 2023-01-22 14:49:03.066614: step: 216/77, loss: 0.09432969242334366 2023-01-22 14:49:04.539458: step: 220/77, loss: 0.0190828088670969 2023-01-22 14:49:05.988572: step: 224/77, loss: 0.018747573718428612 2023-01-22 14:49:07.405010: step: 228/77, loss: 0.0056815785355865955 2023-01-22 14:49:08.827437: step: 232/77, loss: 0.03290817141532898 2023-01-22 14:49:10.286201: step: 236/77, loss: 0.006787025835365057 2023-01-22 14:49:11.693008: step: 240/77, loss: 0.008653461933135986 2023-01-22 
14:49:13.242400: step: 244/77, loss: 0.010657123290002346 2023-01-22 14:49:14.698195: step: 248/77, loss: 0.009114827029407024 2023-01-22 14:49:16.175150: step: 252/77, loss: 0.011150985024869442 2023-01-22 14:49:17.678818: step: 256/77, loss: 0.004261985421180725 2023-01-22 14:49:19.194877: step: 260/77, loss: 0.007893332280218601 2023-01-22 14:49:20.695804: step: 264/77, loss: 0.022277070209383965 2023-01-22 14:49:22.159394: step: 268/77, loss: 0.006228423677384853 2023-01-22 14:49:23.609667: step: 272/77, loss: 0.07288344204425812 2023-01-22 14:49:25.081248: step: 276/77, loss: 0.005302801262587309 2023-01-22 14:49:26.604012: step: 280/77, loss: 0.03222210332751274 2023-01-22 14:49:28.096645: step: 284/77, loss: 0.06637255102396011 2023-01-22 14:49:29.580712: step: 288/77, loss: 0.00120446365326643 2023-01-22 14:49:31.114618: step: 292/77, loss: 0.10510259866714478 2023-01-22 14:49:32.537673: step: 296/77, loss: 0.03763652220368385 2023-01-22 14:49:34.093210: step: 300/77, loss: 0.005131114274263382 2023-01-22 14:49:35.512772: step: 304/77, loss: 0.0008660835446789861 2023-01-22 14:49:36.994648: step: 308/77, loss: 0.07221066951751709 2023-01-22 14:49:38.508303: step: 312/77, loss: 0.04162096232175827 2023-01-22 14:49:39.948196: step: 316/77, loss: 0.03379117324948311 2023-01-22 14:49:41.409151: step: 320/77, loss: 0.1290118247270584 2023-01-22 14:49:42.833933: step: 324/77, loss: 0.056446172297000885 2023-01-22 14:49:44.261153: step: 328/77, loss: 0.01602139323949814 2023-01-22 14:49:45.729174: step: 332/77, loss: 0.038806505501270294 2023-01-22 14:49:47.194992: step: 336/77, loss: 0.036852702498435974 2023-01-22 14:49:48.652483: step: 340/77, loss: 0.011991115286946297 2023-01-22 14:49:50.120990: step: 344/77, loss: 0.06942974776029587 2023-01-22 14:49:51.670885: step: 348/77, loss: 0.0012034145183861256 2023-01-22 14:49:53.195383: step: 352/77, loss: 0.013952319510281086 2023-01-22 14:49:54.657811: step: 356/77, loss: 0.00228273612447083 2023-01-22 14:49:56.113882: step: 360/77, loss: 0.06070905923843384 2023-01-22 14:49:57.621437: step: 364/77, loss: 0.0008847748977132142 2023-01-22 14:49:59.124833: step: 368/77, loss: 0.010776164010167122 2023-01-22 14:50:00.641334: step: 372/77, loss: 0.015828417614102364 2023-01-22 14:50:02.097406: step: 376/77, loss: 0.017704883590340614 2023-01-22 14:50:03.603045: step: 380/77, loss: 0.058441806584596634 2023-01-22 14:50:05.033798: step: 384/77, loss: 0.029390690848231316 2023-01-22 14:50:06.472105: step: 388/77, loss: 0.009416073560714722 ================================================== Loss: 0.032 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Chinese: {'template': {'p': 0.8875, 'r': 0.5590551181102362, 'f1': 0.6859903381642511}, 'slot': {'p': 0.575, 'r': 0.022030651340996167, 'f1': 0.042435424354243544}, 'combined': 0.029110291102911027, 'epoch': 5} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Korean: {'template': {'p': 0.8987341772151899, 'r': 0.5590551181102362, 'f1': 0.6893203883495145}, 'slot': {'p': 0.575, 'r': 0.022030651340996167, 'f1': 0.042435424354243544}, 'combined': 0.029251603195643603, 'epoch': 5} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 
0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Russian: {'template': {'p': 0.8987341772151899, 'r': 0.5590551181102362, 'f1': 0.6893203883495145}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.029224643192698307, 'epoch': 5} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 5} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 5} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 5} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 6 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:51:41.725284: step: 4/77, loss: 0.0007412336417473853 2023-01-22 14:51:43.192105: step: 8/77, loss: 0.021313954144716263 2023-01-22 14:51:44.604821: step: 12/77, loss: 0.018716149032115936 2023-01-22 14:51:45.975274: step: 16/77, loss: 0.0013814292615279555 2023-01-22 14:51:47.417259: step: 20/77, loss: 0.015931401401758194 2023-01-22 14:51:48.873069: step: 24/77, loss: 
0.0033333045430481434 2023-01-22 14:51:50.300738: step: 28/77, loss: 0.0068678054958581924 2023-01-22 14:51:51.780499: step: 32/77, loss: 0.023146910592913628 2023-01-22 14:51:53.250146: step: 36/77, loss: 0.004310089163482189 2023-01-22 14:51:54.695312: step: 40/77, loss: 0.0234291460365057 2023-01-22 14:51:56.145666: step: 44/77, loss: 0.024650773033499718 2023-01-22 14:51:57.593494: step: 48/77, loss: 0.01612193137407303 2023-01-22 14:51:59.118424: step: 52/77, loss: 0.034444257616996765 2023-01-22 14:52:00.623222: step: 56/77, loss: 0.00041520228842273355 2023-01-22 14:52:01.993957: step: 60/77, loss: 0.000145029291161336 2023-01-22 14:52:03.491617: step: 64/77, loss: 0.027829859405755997 2023-01-22 14:52:04.911120: step: 68/77, loss: 0.03428872674703598 2023-01-22 14:52:06.463006: step: 72/77, loss: 0.09775977581739426 2023-01-22 14:52:07.984286: step: 76/77, loss: 0.005572082940489054 2023-01-22 14:52:09.491113: step: 80/77, loss: 0.026036527007818222 2023-01-22 14:52:10.977789: step: 84/77, loss: 0.01301370095461607 2023-01-22 14:52:12.446482: step: 88/77, loss: 0.052194904536008835 2023-01-22 14:52:13.895328: step: 92/77, loss: 0.08800213038921356 2023-01-22 14:52:15.374261: step: 96/77, loss: 0.04067125916481018 2023-01-22 14:52:16.882232: step: 100/77, loss: 0.04747018963098526 2023-01-22 14:52:18.346485: step: 104/77, loss: 0.018774326890707016 2023-01-22 14:52:19.807159: step: 108/77, loss: 0.01722778007388115 2023-01-22 14:52:21.318413: step: 112/77, loss: 0.015616234391927719 2023-01-22 14:52:22.752499: step: 116/77, loss: 0.012200826779007912 2023-01-22 14:52:24.190510: step: 120/77, loss: 0.034464579075574875 2023-01-22 14:52:25.675424: step: 124/77, loss: 0.0038082152605056763 2023-01-22 14:52:27.117564: step: 128/77, loss: 0.056725479662418365 2023-01-22 14:52:28.581379: step: 132/77, loss: 0.0006684925174340606 2023-01-22 14:52:30.053248: step: 136/77, loss: 0.101505808532238 2023-01-22 14:52:31.568329: step: 140/77, loss: 0.01190338283777237 2023-01-22 14:52:32.999379: step: 144/77, loss: 0.021941600367426872 2023-01-22 14:52:34.467713: step: 148/77, loss: 0.08529230207204819 2023-01-22 14:52:35.882472: step: 152/77, loss: 0.009804932400584221 2023-01-22 14:52:37.345944: step: 156/77, loss: 0.025796718895435333 2023-01-22 14:52:38.860700: step: 160/77, loss: 0.33812057971954346 2023-01-22 14:52:40.322257: step: 164/77, loss: 0.004532980732619762 2023-01-22 14:52:41.804254: step: 168/77, loss: 0.00879757385700941 2023-01-22 14:52:43.274648: step: 172/77, loss: 0.040550507605075836 2023-01-22 14:52:44.727882: step: 176/77, loss: 0.02198246493935585 2023-01-22 14:52:46.170715: step: 180/77, loss: 0.030445173382759094 2023-01-22 14:52:47.625798: step: 184/77, loss: 0.003669434692710638 2023-01-22 14:52:49.061644: step: 188/77, loss: 0.005453969817608595 2023-01-22 14:52:50.522597: step: 192/77, loss: 0.01447716448456049 2023-01-22 14:52:51.950709: step: 196/77, loss: 0.14574000239372253 2023-01-22 14:52:53.467865: step: 200/77, loss: 0.03189618140459061 2023-01-22 14:52:55.046403: step: 204/77, loss: 0.037016451358795166 2023-01-22 14:52:56.510591: step: 208/77, loss: 0.06251507997512817 2023-01-22 14:52:57.897083: step: 212/77, loss: 0.0025996535550802946 2023-01-22 14:52:59.380580: step: 216/77, loss: 0.021184764802455902 2023-01-22 14:53:00.901701: step: 220/77, loss: 0.02039853297173977 2023-01-22 14:53:02.373392: step: 224/77, loss: 0.044906072318553925 2023-01-22 14:53:03.799253: step: 228/77, loss: 0.003273381618782878 2023-01-22 14:53:05.251971: step: 232/77, loss: 
0.07473050057888031 2023-01-22 14:53:06.641845: step: 236/77, loss: 0.008918672800064087 2023-01-22 14:53:08.111023: step: 240/77, loss: 0.03857048228383064 2023-01-22 14:53:09.558033: step: 244/77, loss: 0.0027572987601161003 2023-01-22 14:53:10.975197: step: 248/77, loss: 0.04240492358803749 2023-01-22 14:53:12.400654: step: 252/77, loss: 0.0041396236047148705 2023-01-22 14:53:13.947649: step: 256/77, loss: 0.025141380727291107 2023-01-22 14:53:15.490583: step: 260/77, loss: 0.18390804529190063 2023-01-22 14:53:16.917287: step: 264/77, loss: 0.11329423636198044 2023-01-22 14:53:18.376228: step: 268/77, loss: 0.0057776011526584625 2023-01-22 14:53:19.805253: step: 272/77, loss: 0.0006380232516676188 2023-01-22 14:53:21.292351: step: 276/77, loss: 0.017560191452503204 2023-01-22 14:53:22.746399: step: 280/77, loss: 0.01942804455757141 2023-01-22 14:53:24.250108: step: 284/77, loss: 0.0005399800720624626 2023-01-22 14:53:25.716437: step: 288/77, loss: 0.010437647812068462 2023-01-22 14:53:27.159306: step: 292/77, loss: 0.004900635220110416 2023-01-22 14:53:28.651649: step: 296/77, loss: 0.00043453832040540874 2023-01-22 14:53:30.089721: step: 300/77, loss: 0.02212320826947689 2023-01-22 14:53:31.608582: step: 304/77, loss: 0.001968685071915388 2023-01-22 14:53:33.027320: step: 308/77, loss: 0.009211909025907516 2023-01-22 14:53:34.481579: step: 312/77, loss: 0.000495576998218894 2023-01-22 14:53:35.961240: step: 316/77, loss: 0.016290096566081047 2023-01-22 14:53:37.456818: step: 320/77, loss: 0.0007329899817705154 2023-01-22 14:53:38.925654: step: 324/77, loss: 0.0068076783791184425 2023-01-22 14:53:40.396547: step: 328/77, loss: 0.00383429741486907 2023-01-22 14:53:41.873644: step: 332/77, loss: 0.020514294505119324 2023-01-22 14:53:43.339787: step: 336/77, loss: 0.005620032548904419 2023-01-22 14:53:44.818752: step: 340/77, loss: 0.000713431159965694 2023-01-22 14:53:46.307130: step: 344/77, loss: 0.0004864736692979932 2023-01-22 14:53:47.824250: step: 348/77, loss: 0.014447920955717564 2023-01-22 14:53:49.260668: step: 352/77, loss: 0.013495909981429577 2023-01-22 14:53:50.735056: step: 356/77, loss: 0.07911933213472366 2023-01-22 14:53:52.211129: step: 360/77, loss: 0.004805286414921284 2023-01-22 14:53:53.594692: step: 364/77, loss: 0.003314794972538948 2023-01-22 14:53:55.050960: step: 368/77, loss: 0.0014681564643979073 2023-01-22 14:53:56.516194: step: 372/77, loss: 0.0097124632447958 2023-01-22 14:53:57.994160: step: 376/77, loss: 0.019356781616806984 2023-01-22 14:53:59.465739: step: 380/77, loss: 0.0035740383900702 2023-01-22 14:54:00.961637: step: 384/77, loss: 0.0015122373588383198 2023-01-22 14:54:02.419798: step: 388/77, loss: 0.00045621368917636573 ================================================== Loss: 0.028 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Chinese: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.027971254836926484, 'epoch': 6} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Korean: {'template': {'p': 0.9078947368421053, 'r': 0.5433070866141733, 'f1': 0.6798029556650248}, 
'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.02769567597153805, 'epoch': 6} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Russian: {'template': {'p': 0.9078947368421053, 'r': 0.5433070866141733, 'f1': 0.6798029556650248}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.02769567597153805, 'epoch': 6} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 6} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 6} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 6} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 7 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 14:55:37.165488: step: 4/77, loss: 0.21330411732196808 2023-01-22 14:55:38.682087: step: 8/77, loss: 0.030803874135017395 2023-01-22 14:55:40.125388: step: 12/77, loss: 
0.05952095985412598 2023-01-22 14:55:41.570925: step: 16/77, loss: 0.01660231500864029 2023-01-22 14:55:43.093008: step: 20/77, loss: 0.02132391557097435 2023-01-22 14:55:44.594497: step: 24/77, loss: 0.014290052466094494 2023-01-22 14:55:46.033648: step: 28/77, loss: 0.0021367145236581564 2023-01-22 14:55:47.530073: step: 32/77, loss: 0.0003844766179099679 2023-01-22 14:55:49.052211: step: 36/77, loss: 0.007141927722841501 2023-01-22 14:55:50.482134: step: 40/77, loss: 0.002050957642495632 2023-01-22 14:55:51.925467: step: 44/77, loss: 0.044860195368528366 2023-01-22 14:55:53.366973: step: 48/77, loss: 0.013183414936065674 2023-01-22 14:55:54.776519: step: 52/77, loss: 0.06017829850316048 2023-01-22 14:55:56.201850: step: 56/77, loss: 0.002314754296094179 2023-01-22 14:55:57.698707: step: 60/77, loss: 2.7088251954410225e-05 2023-01-22 14:55:59.187766: step: 64/77, loss: 0.00023267108190339059 2023-01-22 14:56:00.545732: step: 68/77, loss: 0.006338904611766338 2023-01-22 14:56:01.998759: step: 72/77, loss: 0.005614170804619789 2023-01-22 14:56:03.422875: step: 76/77, loss: 0.0006589738768525422 2023-01-22 14:56:04.859155: step: 80/77, loss: 0.001101326779462397 2023-01-22 14:56:06.277593: step: 84/77, loss: 0.03517903760075569 2023-01-22 14:56:07.762041: step: 88/77, loss: 0.015235664322972298 2023-01-22 14:56:09.289450: step: 92/77, loss: 0.05550249293446541 2023-01-22 14:56:10.709531: step: 96/77, loss: 0.09422709047794342 2023-01-22 14:56:12.189848: step: 100/77, loss: 0.008238008245825768 2023-01-22 14:56:13.645944: step: 104/77, loss: 0.023327726870775223 2023-01-22 14:56:15.080319: step: 108/77, loss: 0.00034248470910824835 2023-01-22 14:56:16.537014: step: 112/77, loss: 0.004488734062761068 2023-01-22 14:56:17.984379: step: 116/77, loss: 0.0054784854874014854 2023-01-22 14:56:19.471665: step: 120/77, loss: 0.038537733256816864 2023-01-22 14:56:20.966877: step: 124/77, loss: 0.04649285599589348 2023-01-22 14:56:22.451500: step: 128/77, loss: 0.02765333093702793 2023-01-22 14:56:23.907671: step: 132/77, loss: 0.06148548796772957 2023-01-22 14:56:25.401592: step: 136/77, loss: 0.06763473898172379 2023-01-22 14:56:26.861217: step: 140/77, loss: 0.007297105621546507 2023-01-22 14:56:28.367306: step: 144/77, loss: 0.00465565687045455 2023-01-22 14:56:29.761875: step: 148/77, loss: 0.008826401084661484 2023-01-22 14:56:31.236461: step: 152/77, loss: 0.11514346301555634 2023-01-22 14:56:32.675362: step: 156/77, loss: 0.013632182963192463 2023-01-22 14:56:34.182563: step: 160/77, loss: 0.006214080844074488 2023-01-22 14:56:35.632342: step: 164/77, loss: 0.014393839053809643 2023-01-22 14:56:37.100778: step: 168/77, loss: 0.01078635174781084 2023-01-22 14:56:38.510176: step: 172/77, loss: 0.007676491513848305 2023-01-22 14:56:39.980488: step: 176/77, loss: 0.031217725947499275 2023-01-22 14:56:41.451986: step: 180/77, loss: 0.05517624318599701 2023-01-22 14:56:42.932098: step: 184/77, loss: 0.06647980958223343 2023-01-22 14:56:44.416031: step: 188/77, loss: 0.06443522870540619 2023-01-22 14:56:45.891116: step: 192/77, loss: 0.012056811712682247 2023-01-22 14:56:47.358884: step: 196/77, loss: 0.007966795936226845 2023-01-22 14:56:48.877406: step: 200/77, loss: 0.022729909047484398 2023-01-22 14:56:50.275705: step: 204/77, loss: 0.0014283591881394386 2023-01-22 14:56:51.661183: step: 208/77, loss: 0.04743092879652977 2023-01-22 14:56:53.134135: step: 212/77, loss: 0.003068970050662756 2023-01-22 14:56:54.562451: step: 216/77, loss: 0.0051091015338897705 2023-01-22 14:56:56.041569: step: 220/77, 
loss: 0.03613844886422157 2023-01-22 14:56:57.502017: step: 224/77, loss: 0.03835630416870117 2023-01-22 14:56:58.969499: step: 228/77, loss: 0.01869058422744274 2023-01-22 14:57:00.401185: step: 232/77, loss: 0.008179155178368092 2023-01-22 14:57:01.832681: step: 236/77, loss: 0.0018846002640202641 2023-01-22 14:57:03.370094: step: 240/77, loss: 0.02894376404583454 2023-01-22 14:57:04.850492: step: 244/77, loss: 0.0083406250923872 2023-01-22 14:57:06.323478: step: 248/77, loss: 0.015316452831029892 2023-01-22 14:57:07.831697: step: 252/77, loss: 0.07307127118110657 2023-01-22 14:57:09.231150: step: 256/77, loss: 0.05162765085697174 2023-01-22 14:57:10.717073: step: 260/77, loss: 0.1208905279636383 2023-01-22 14:57:12.160636: step: 264/77, loss: 0.01847473904490471 2023-01-22 14:57:13.666812: step: 268/77, loss: 0.02602345682680607 2023-01-22 14:57:15.122322: step: 272/77, loss: 0.07284299284219742 2023-01-22 14:57:16.592413: step: 276/77, loss: 0.06580187380313873 2023-01-22 14:57:18.084356: step: 280/77, loss: 0.02773948945105076 2023-01-22 14:57:19.528515: step: 284/77, loss: 0.0016939353663474321 2023-01-22 14:57:21.032269: step: 288/77, loss: 0.04821263253688812 2023-01-22 14:57:22.476663: step: 292/77, loss: 0.029789648950099945 2023-01-22 14:57:23.938273: step: 296/77, loss: 0.005959080997854471 2023-01-22 14:57:25.496610: step: 300/77, loss: 0.0001107029092963785 2023-01-22 14:57:26.972202: step: 304/77, loss: 0.003340965835377574 2023-01-22 14:57:28.365888: step: 308/77, loss: 0.021251888945698738 2023-01-22 14:57:29.773437: step: 312/77, loss: 0.01795397326350212 2023-01-22 14:57:31.211944: step: 316/77, loss: 0.048583999276161194 2023-01-22 14:57:32.645059: step: 320/77, loss: 0.014873946085572243 2023-01-22 14:57:34.122307: step: 324/77, loss: 0.04811349883675575 2023-01-22 14:57:35.544479: step: 328/77, loss: 0.0026686962228268385 2023-01-22 14:57:36.984111: step: 332/77, loss: 0.013501567766070366 2023-01-22 14:57:38.484713: step: 336/77, loss: 0.015222469344735146 2023-01-22 14:57:40.023241: step: 340/77, loss: 0.026556754484772682 2023-01-22 14:57:41.504007: step: 344/77, loss: 0.0219095591455698 2023-01-22 14:57:42.991278: step: 348/77, loss: 0.00573897548019886 2023-01-22 14:57:44.399958: step: 352/77, loss: 0.002142177429050207 2023-01-22 14:57:45.852520: step: 356/77, loss: 0.025653140619397163 2023-01-22 14:57:47.250899: step: 360/77, loss: 0.022537333890795708 2023-01-22 14:57:48.706474: step: 364/77, loss: 0.01239784061908722 2023-01-22 14:57:50.233503: step: 368/77, loss: 0.010640754364430904 2023-01-22 14:57:51.738702: step: 372/77, loss: 0.010664643719792366 2023-01-22 14:57:53.184685: step: 376/77, loss: 0.005104396026581526 2023-01-22 14:57:54.647582: step: 380/77, loss: 0.01083988044410944 2023-01-22 14:57:56.175426: step: 384/77, loss: 0.0008839995134621859 2023-01-22 14:57:57.650742: step: 388/77, loss: 0.00999844167381525 ================================================== Loss: 0.026 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 7} Test Chinese: {'template': {'p': 0.9264705882352942, 'r': 0.49606299212598426, 'f1': 0.6461538461538462}, 'slot': {'p': 0.631578947368421, 'r': 0.022988505747126436, 'f1': 0.04436229205175601}, 'combined': 0.028664865633442345, 'epoch': 7} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 
0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 7} Test Korean: {'template': {'p': 0.9264705882352942, 'r': 0.49606299212598426, 'f1': 0.6461538461538462}, 'slot': {'p': 0.631578947368421, 'r': 0.022988505747126436, 'f1': 0.04436229205175601}, 'combined': 0.028664865633442345, 'epoch': 7} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 7} Test Russian: {'template': {'p': 0.9253731343283582, 'r': 0.4881889763779528, 'f1': 0.6391752577319587}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.027173809478438168, 'epoch': 7} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 7} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 7} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 7} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 8 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 
--learning_rate 2e-4 2023-01-22 14:59:32.746737: step: 4/77, loss: 0.00851517915725708 2023-01-22 14:59:34.193670: step: 8/77, loss: 0.020248152315616608 2023-01-22 14:59:35.639435: step: 12/77, loss: 0.03845569118857384 2023-01-22 14:59:37.085876: step: 16/77, loss: 0.017375899478793144 2023-01-22 14:59:38.524855: step: 20/77, loss: 0.018196651712059975 2023-01-22 14:59:39.934995: step: 24/77, loss: 0.0115358866751194 2023-01-22 14:59:41.361089: step: 28/77, loss: 0.011068487539887428 2023-01-22 14:59:42.834674: step: 32/77, loss: 0.031230010092258453 2023-01-22 14:59:44.295548: step: 36/77, loss: 0.008000523783266544 2023-01-22 14:59:45.781608: step: 40/77, loss: 0.006209453102201223 2023-01-22 14:59:47.197950: step: 44/77, loss: 0.0002952442446257919 2023-01-22 14:59:48.638724: step: 48/77, loss: 0.00988959800451994 2023-01-22 14:59:50.111997: step: 52/77, loss: 0.029006047174334526 2023-01-22 14:59:51.592297: step: 56/77, loss: 0.05109583958983421 2023-01-22 14:59:53.037279: step: 60/77, loss: 0.00737263448536396 2023-01-22 14:59:54.485621: step: 64/77, loss: 0.002218138426542282 2023-01-22 14:59:56.019601: step: 68/77, loss: 0.007417859975248575 2023-01-22 14:59:57.485830: step: 72/77, loss: 0.0025550522841513157 2023-01-22 14:59:58.986882: step: 76/77, loss: 3.655156888271449e-06 2023-01-22 15:00:00.506819: step: 80/77, loss: 0.0004421906196512282 2023-01-22 15:00:01.888635: step: 84/77, loss: 0.004834357649087906 2023-01-22 15:00:03.388474: step: 88/77, loss: 0.029413413256406784 2023-01-22 15:00:04.835741: step: 92/77, loss: 0.041734565049409866 2023-01-22 15:00:06.263037: step: 96/77, loss: 0.011020495556294918 2023-01-22 15:00:07.716727: step: 100/77, loss: 0.00971745140850544 2023-01-22 15:00:09.214448: step: 104/77, loss: 0.008658653125166893 2023-01-22 15:00:10.653738: step: 108/77, loss: 0.008677625097334385 2023-01-22 15:00:12.103290: step: 112/77, loss: 0.016906946897506714 2023-01-22 15:00:13.604380: step: 116/77, loss: 0.0006555644213221967 2023-01-22 15:00:15.080543: step: 120/77, loss: 0.00020362350915092975 2023-01-22 15:00:16.582327: step: 124/77, loss: 0.0008139436249621212 2023-01-22 15:00:18.027948: step: 128/77, loss: 0.007392987608909607 2023-01-22 15:00:19.426196: step: 132/77, loss: 0.02124861069023609 2023-01-22 15:00:20.947390: step: 136/77, loss: 0.004259026609361172 2023-01-22 15:00:22.381726: step: 140/77, loss: 0.06056196615099907 2023-01-22 15:00:23.826984: step: 144/77, loss: 0.0033684358932077885 2023-01-22 15:00:25.343628: step: 148/77, loss: 0.03498908504843712 2023-01-22 15:00:26.809008: step: 152/77, loss: 0.01483969297260046 2023-01-22 15:00:28.302875: step: 156/77, loss: 0.03756939247250557 2023-01-22 15:00:29.788442: step: 160/77, loss: 0.010021629743278027 2023-01-22 15:00:31.292187: step: 164/77, loss: 0.05477229505777359 2023-01-22 15:00:32.774266: step: 168/77, loss: 0.02709217183291912 2023-01-22 15:00:34.256532: step: 172/77, loss: 0.008434438146650791 2023-01-22 15:00:35.711023: step: 176/77, loss: 0.06009257212281227 2023-01-22 15:00:37.189786: step: 180/77, loss: 0.00014400421059690416 2023-01-22 15:00:38.561404: step: 184/77, loss: 0.00010034604201791808 2023-01-22 15:00:40.040402: step: 188/77, loss: 0.017458287999033928 2023-01-22 15:00:41.468412: step: 192/77, loss: 0.03429003059864044 2023-01-22 15:00:42.931126: step: 196/77, loss: 0.017773982137441635 2023-01-22 15:00:44.493432: step: 200/77, loss: 0.00906282663345337 2023-01-22 15:00:45.892605: step: 204/77, loss: 0.0068331267684698105 2023-01-22 15:00:47.391620: step: 208/77, 
loss: 0.040438201278448105 2023-01-22 15:00:48.857852: step: 212/77, loss: 0.05152969807386398 2023-01-22 15:00:50.314678: step: 216/77, loss: 0.010812604799866676 2023-01-22 15:00:51.798277: step: 220/77, loss: 0.015838824212551117 2023-01-22 15:00:53.208488: step: 224/77, loss: 0.0056254188530147076 2023-01-22 15:00:54.719343: step: 228/77, loss: 0.010532453656196594 2023-01-22 15:00:56.175170: step: 232/77, loss: 0.04069099202752113 2023-01-22 15:00:57.557101: step: 236/77, loss: 0.0039288196712732315 2023-01-22 15:00:59.048376: step: 240/77, loss: 0.015094153583049774 2023-01-22 15:01:00.524362: step: 244/77, loss: 0.061751965433359146 2023-01-22 15:01:01.918822: step: 248/77, loss: 0.02132176049053669 2023-01-22 15:01:03.416272: step: 252/77, loss: 0.00747132021933794 2023-01-22 15:01:04.941948: step: 256/77, loss: 0.03146218881011009 2023-01-22 15:01:06.441955: step: 260/77, loss: 0.029169270768761635 2023-01-22 15:01:07.888193: step: 264/77, loss: 0.09212566167116165 2023-01-22 15:01:09.341556: step: 268/77, loss: 0.026289554312825203 2023-01-22 15:01:10.799984: step: 272/77, loss: 0.0020351808052510023 2023-01-22 15:01:12.224192: step: 276/77, loss: 0.01215108297765255 2023-01-22 15:01:13.759950: step: 280/77, loss: 0.13030201196670532 2023-01-22 15:01:15.178804: step: 284/77, loss: 0.010139960795640945 2023-01-22 15:01:16.644993: step: 288/77, loss: 0.018669992685317993 2023-01-22 15:01:18.132709: step: 292/77, loss: 0.004342895466834307 2023-01-22 15:01:19.622576: step: 296/77, loss: 0.008130617439746857 2023-01-22 15:01:21.078730: step: 300/77, loss: 0.0022651138715445995 2023-01-22 15:01:22.540775: step: 304/77, loss: 0.01521453820168972 2023-01-22 15:01:24.000017: step: 308/77, loss: 0.012047030963003635 2023-01-22 15:01:25.479804: step: 312/77, loss: 0.04839862510561943 2023-01-22 15:01:26.885792: step: 316/77, loss: 0.033473189920186996 2023-01-22 15:01:28.372000: step: 320/77, loss: 0.0014845503028482199 2023-01-22 15:01:29.838314: step: 324/77, loss: 0.01895192079246044 2023-01-22 15:01:31.305365: step: 328/77, loss: 0.015420947223901749 2023-01-22 15:01:32.726681: step: 332/77, loss: 0.001918652793392539 2023-01-22 15:01:34.223728: step: 336/77, loss: 0.010518055409193039 2023-01-22 15:01:35.702254: step: 340/77, loss: 0.015032559633255005 2023-01-22 15:01:37.193348: step: 344/77, loss: 0.06734556704759598 2023-01-22 15:01:38.606793: step: 348/77, loss: 0.0430232472717762 2023-01-22 15:01:40.110039: step: 352/77, loss: 0.003731168108060956 2023-01-22 15:01:41.586783: step: 356/77, loss: 0.01779318042099476 2023-01-22 15:01:43.062844: step: 360/77, loss: 0.0022112352307885885 2023-01-22 15:01:44.597193: step: 364/77, loss: 0.005814543925225735 2023-01-22 15:01:46.089707: step: 368/77, loss: 0.0222287829965353 2023-01-22 15:01:47.502825: step: 372/77, loss: 0.040803465992212296 2023-01-22 15:01:48.962012: step: 376/77, loss: 0.07254386693239212 2023-01-22 15:01:50.390954: step: 380/77, loss: 0.003615192137658596 2023-01-22 15:01:51.922656: step: 384/77, loss: 0.016530876979231834 2023-01-22 15:01:53.392088: step: 388/77, loss: 0.0025354349054396152 ================================================== Loss: 0.020 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Chinese: {'template': {'p': 0.96, 'r': 0.5669291338582677, 'f1': 0.712871287128713}, 'slot': {'p': 0.65625, 'r': 0.020114942528735632, 
'f1': 0.03903345724907063}, 'combined': 0.027825830910228572, 'epoch': 8} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Korean: {'template': {'p': 0.972972972972973, 'r': 0.5669291338582677, 'f1': 0.7164179104477612}, 'slot': {'p': 0.65625, 'r': 0.020114942528735632, 'f1': 0.03903345724907063}, 'combined': 0.0279642678799312, 'epoch': 8} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Russian: {'template': {'p': 0.9594594594594594, 'r': 0.5590551181102362, 'f1': 0.7064676616915422}, 'slot': {'p': 0.65625, 'r': 0.020114942528735632, 'f1': 0.03903345724907063}, 'combined': 0.027575875270487705, 'epoch': 8} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 8} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 8} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 9 command: python train.py --model_name template 
--xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:03:28.703144: step: 4/77, loss: 0.004793758504092693 2023-01-22 15:03:30.154488: step: 8/77, loss: 0.04453736171126366 2023-01-22 15:03:31.619474: step: 12/77, loss: 0.026036223396658897 2023-01-22 15:03:33.076190: step: 16/77, loss: 0.0021551132667809725 2023-01-22 15:03:34.563747: step: 20/77, loss: 0.08646086603403091 2023-01-22 15:03:36.006037: step: 24/77, loss: 0.013802998699247837 2023-01-22 15:03:37.492126: step: 28/77, loss: 0.003359440714120865 2023-01-22 15:03:38.878866: step: 32/77, loss: 0.00038291505188681185 2023-01-22 15:03:40.334971: step: 36/77, loss: 0.0017134477384388447 2023-01-22 15:03:41.762346: step: 40/77, loss: 0.0022804250475019217 2023-01-22 15:03:43.257118: step: 44/77, loss: 0.001309186452999711 2023-01-22 15:03:44.668168: step: 48/77, loss: 0.002290336648002267 2023-01-22 15:03:46.138731: step: 52/77, loss: 0.0020765713416039944 2023-01-22 15:03:47.570244: step: 56/77, loss: 2.3651067749597132e-05 2023-01-22 15:03:49.032408: step: 60/77, loss: 0.0045333001762628555 2023-01-22 15:03:50.557672: step: 64/77, loss: 9.704382682684809e-05 2023-01-22 15:03:51.950191: step: 68/77, loss: 0.017325766384601593 2023-01-22 15:03:53.417236: step: 72/77, loss: 0.009218841791152954 2023-01-22 15:03:54.854781: step: 76/77, loss: 0.023138979449868202 2023-01-22 15:03:56.376142: step: 80/77, loss: 0.04861721396446228 2023-01-22 15:03:57.862844: step: 84/77, loss: 2.545259485486895e-05 2023-01-22 15:03:59.250252: step: 88/77, loss: 0.01106626633554697 2023-01-22 15:04:00.686883: step: 92/77, loss: 0.007451111450791359 2023-01-22 15:04:02.141290: step: 96/77, loss: 0.006898709572851658 2023-01-22 15:04:03.650533: step: 100/77, loss: 0.05518203601241112 2023-01-22 15:04:05.045736: step: 104/77, loss: 0.0033061346039175987 2023-01-22 15:04:06.571661: step: 108/77, loss: 0.014921912923455238 2023-01-22 15:04:08.060379: step: 112/77, loss: 0.01143279206007719 2023-01-22 15:04:09.517504: step: 116/77, loss: 0.0007015878800302744 2023-01-22 15:04:10.963094: step: 120/77, loss: 0.004604281857609749 2023-01-22 15:04:12.311355: step: 124/77, loss: 0.032683178782463074 2023-01-22 15:04:13.804466: step: 128/77, loss: 0.037563107907772064 2023-01-22 15:04:15.295977: step: 132/77, loss: 0.018027130514383316 2023-01-22 15:04:16.786319: step: 136/77, loss: 0.008253769017755985 2023-01-22 15:04:18.260144: step: 140/77, loss: 0.01780082657933235 2023-01-22 15:04:19.677241: step: 144/77, loss: 0.005574891343712807 2023-01-22 15:04:21.136926: step: 148/77, loss: 0.02170383185148239 2023-01-22 15:04:22.622931: step: 152/77, loss: 0.0022256188094615936 2023-01-22 15:04:24.057237: step: 156/77, loss: 0.005868262145668268 2023-01-22 15:04:25.547902: step: 160/77, loss: 0.0019104223465546966 2023-01-22 15:04:26.999278: step: 164/77, loss: 0.008093828335404396 2023-01-22 15:04:28.447034: step: 168/77, loss: 0.0259143877774477 2023-01-22 15:04:29.956787: step: 172/77, loss: 0.0027095580007880926 2023-01-22 15:04:31.446355: step: 176/77, loss: 0.0044714659452438354 2023-01-22 15:04:32.916572: step: 180/77, loss: 0.01789674162864685 2023-01-22 15:04:34.361697: step: 184/77, loss: 0.00848174188286066 2023-01-22 15:04:35.815417: step: 188/77, loss: 0.014940755441784859 2023-01-22 15:04:37.298441: step: 192/77, loss: 0.003137253224849701 2023-01-22 15:04:38.733274: step: 196/77, loss: 0.011194746010005474 2023-01-22 
15:04:40.187997: step: 200/77, loss: 0.03405408561229706 2023-01-22 15:04:41.710224: step: 204/77, loss: 0.03816353529691696 2023-01-22 15:04:43.167622: step: 208/77, loss: 0.008965753018856049 2023-01-22 15:04:44.621843: step: 212/77, loss: 0.22834959626197815 2023-01-22 15:04:46.072561: step: 216/77, loss: 0.011904070153832436 2023-01-22 15:04:47.490569: step: 220/77, loss: 0.00048413369222544134 2023-01-22 15:04:48.949599: step: 224/77, loss: 0.01909520849585533 2023-01-22 15:04:50.486823: step: 228/77, loss: 0.013929789885878563 2023-01-22 15:04:51.954422: step: 232/77, loss: 0.05527324974536896 2023-01-22 15:04:53.449621: step: 236/77, loss: 0.007887788116931915 2023-01-22 15:04:54.909688: step: 240/77, loss: 0.0007678931578993797 2023-01-22 15:04:56.305587: step: 244/77, loss: 0.031782060861587524 2023-01-22 15:04:57.753036: step: 248/77, loss: 0.01913481205701828 2023-01-22 15:04:59.225710: step: 252/77, loss: 0.0002617061254568398 2023-01-22 15:05:00.702293: step: 256/77, loss: 0.06358633935451508 2023-01-22 15:05:02.204027: step: 260/77, loss: 0.009755901992321014 2023-01-22 15:05:03.698175: step: 264/77, loss: 0.0011176634579896927 2023-01-22 15:05:05.166190: step: 268/77, loss: 0.0020514694042503834 2023-01-22 15:05:06.601049: step: 272/77, loss: 0.0011599418940022588 2023-01-22 15:05:07.997428: step: 276/77, loss: 0.0018144859932363033 2023-01-22 15:05:09.391156: step: 280/77, loss: 0.0006134199211373925 2023-01-22 15:05:10.891837: step: 284/77, loss: 0.00042076807585544884 2023-01-22 15:05:12.444423: step: 288/77, loss: 0.04851783812046051 2023-01-22 15:05:13.957145: step: 292/77, loss: 0.0026194823440164328 2023-01-22 15:05:15.407232: step: 296/77, loss: 0.003914463333785534 2023-01-22 15:05:16.835166: step: 300/77, loss: 0.007222973275929689 2023-01-22 15:05:18.262727: step: 304/77, loss: 0.0073476675897836685 2023-01-22 15:05:19.731634: step: 308/77, loss: 0.04063144698739052 2023-01-22 15:05:21.217069: step: 312/77, loss: 0.020086605101823807 2023-01-22 15:05:22.681980: step: 316/77, loss: 0.007355029694736004 2023-01-22 15:05:24.125509: step: 320/77, loss: 0.0014275667490437627 2023-01-22 15:05:25.548409: step: 324/77, loss: 0.00330280396156013 2023-01-22 15:05:26.949917: step: 328/77, loss: 0.011860538274049759 2023-01-22 15:05:28.399945: step: 332/77, loss: 0.05505289137363434 2023-01-22 15:05:29.919727: step: 336/77, loss: 2.5052808268810622e-05 2023-01-22 15:05:31.383373: step: 340/77, loss: 0.0030943909659981728 2023-01-22 15:05:32.880417: step: 344/77, loss: 0.0008039247477427125 2023-01-22 15:05:34.329997: step: 348/77, loss: 0.003440692089498043 2023-01-22 15:05:35.855122: step: 352/77, loss: 0.01987139880657196 2023-01-22 15:05:37.331721: step: 356/77, loss: 0.0022112764418125153 2023-01-22 15:05:38.804760: step: 360/77, loss: 0.01719844341278076 2023-01-22 15:05:40.294330: step: 364/77, loss: 0.014845077879726887 2023-01-22 15:05:41.782100: step: 368/77, loss: 0.0005380196962505579 2023-01-22 15:05:43.212155: step: 372/77, loss: 3.5617107641883194e-05 2023-01-22 15:05:44.698648: step: 376/77, loss: 9.74436043179594e-05 2023-01-22 15:05:46.140310: step: 380/77, loss: 0.04860334098339081 2023-01-22 15:05:47.695322: step: 384/77, loss: 0.05483109503984451 2023-01-22 15:05:49.106293: step: 388/77, loss: 0.004063018597662449 ================================================== Loss: 0.016 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 
'combined': 0.05179909351586346, 'epoch': 9} Test Chinese: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.5454545454545454, 'r': 0.022988505747126436, 'f1': 0.04411764705882353}, 'combined': 0.03015075376884422, 'epoch': 9} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Korean: {'template': {'p': 0.9436619718309859, 'r': 0.5275590551181102, 'f1': 0.6767676767676767}, 'slot': {'p': 0.5581395348837209, 'r': 0.022988505747126436, 'f1': 0.04415823367065318}, 'combined': 0.029884865211452147, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Russian: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.5581395348837209, 'r': 0.022988505747126436, 'f1': 0.04415823367065318}, 'combined': 0.0301784913528082, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 9} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 
0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 10 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:07:24.115413: step: 4/77, loss: 5.528076599148335e-06 2023-01-22 15:07:25.590496: step: 8/77, loss: 0.0010808947263285518 2023-01-22 15:07:27.066052: step: 12/77, loss: 0.016046933829784393 2023-01-22 15:07:28.511079: step: 16/77, loss: 5.9023699577664956e-05 2023-01-22 15:07:29.897866: step: 20/77, loss: 0.007344263605773449 2023-01-22 15:07:31.409999: step: 24/77, loss: 0.0006989211542531848 2023-01-22 15:07:32.855401: step: 28/77, loss: 0.0009152927668765187 2023-01-22 15:07:34.294866: step: 32/77, loss: 0.009391102939844131 2023-01-22 15:07:35.754579: step: 36/77, loss: 0.021809352561831474 2023-01-22 15:07:37.206694: step: 40/77, loss: 0.004310915246605873 2023-01-22 15:07:38.632163: step: 44/77, loss: 0.00021066641784273088 2023-01-22 15:07:40.158207: step: 48/77, loss: 0.0020009807776659727 2023-01-22 15:07:41.641151: step: 52/77, loss: 7.372568506980315e-05 2023-01-22 15:07:43.099478: step: 56/77, loss: 0.01635999046266079 2023-01-22 15:07:44.534770: step: 60/77, loss: 0.0426853708922863 2023-01-22 15:07:45.964549: step: 64/77, loss: 6.578583997907117e-05 2023-01-22 15:07:47.372161: step: 68/77, loss: 0.0005468910094350576 2023-01-22 15:07:48.795930: step: 72/77, loss: 0.001052496605552733 2023-01-22 15:07:50.262889: step: 76/77, loss: 0.02301090583205223 2023-01-22 15:07:51.690363: step: 80/77, loss: 0.000436722650192678 2023-01-22 15:07:53.127118: step: 84/77, loss: 0.0007260671118274331 2023-01-22 15:07:54.509423: step: 88/77, loss: 0.0007697568507865071 2023-01-22 15:07:55.927382: step: 92/77, loss: 0.07233330607414246 2023-01-22 15:07:57.397521: step: 96/77, loss: 9.478777064941823e-05 2023-01-22 15:07:58.862315: step: 100/77, loss: 0.0001304300967603922 2023-01-22 15:08:00.353810: step: 104/77, loss: 0.0009545194334350526 2023-01-22 15:08:01.813745: step: 108/77, loss: 0.0008118526893667877 2023-01-22 15:08:03.296963: step: 112/77, loss: 0.02061503566801548 2023-01-22 15:08:04.768187: step: 116/77, loss: 0.0003421950386837125 2023-01-22 15:08:06.244475: step: 120/77, loss: 0.09420838952064514 2023-01-22 15:08:07.752922: step: 124/77, loss: 0.0009122472838498652 2023-01-22 15:08:09.251882: step: 128/77, loss: 0.0015252233715727925 2023-01-22 15:08:10.718608: step: 132/77, loss: 0.004896394442766905 2023-01-22 15:08:12.153336: step: 136/77, loss: 0.049466781318187714 2023-01-22 15:08:13.645541: step: 140/77, loss: 0.014073391444981098 2023-01-22 15:08:15.133900: step: 144/77, loss: 0.0003406977921258658 2023-01-22 15:08:16.565023: step: 148/77, loss: 0.0033174047712236643 2023-01-22 15:08:17.971827: step: 152/77, loss: 0.011274401098489761 2023-01-22 15:08:19.522869: step: 156/77, loss: 0.003861426142975688 2023-01-22 15:08:20.987780: step: 160/77, loss: 0.0491793267428875 2023-01-22 15:08:22.455508: step: 164/77, loss: 0.0009002528968267143 2023-01-22 15:08:23.931281: step: 168/77, loss: 0.003091795602813363 2023-01-22 15:08:25.418807: step: 172/77, loss: 0.10658681392669678 2023-01-22 15:08:26.902756: step: 176/77, loss: 0.0006332556949928403 2023-01-22 15:08:28.315306: step: 180/77, loss: 0.0009711352176964283 2023-01-22 15:08:29.805444: step: 184/77, loss: 
0.02992566116154194 2023-01-22 15:08:31.231805: step: 188/77, loss: 0.002015071688219905 2023-01-22 15:08:32.701520: step: 192/77, loss: 0.0011653820984065533 2023-01-22 15:08:34.185584: step: 196/77, loss: 0.001487490488216281 2023-01-22 15:08:35.640006: step: 200/77, loss: 0.0009206479880958796 2023-01-22 15:08:37.099276: step: 204/77, loss: 0.0006215562461875379 2023-01-22 15:08:38.494952: step: 208/77, loss: 0.00022190042363945395 2023-01-22 15:08:39.971895: step: 212/77, loss: 0.010531798005104065 2023-01-22 15:08:41.464528: step: 216/77, loss: 0.0007350501837208867 2023-01-22 15:08:42.941812: step: 220/77, loss: 0.0008300531771965325 2023-01-22 15:08:44.438600: step: 224/77, loss: 0.016436023637652397 2023-01-22 15:08:45.893154: step: 228/77, loss: 0.05484340339899063 2023-01-22 15:08:47.297027: step: 232/77, loss: 0.015066299587488174 2023-01-22 15:08:48.695504: step: 236/77, loss: 0.0019667416345328093 2023-01-22 15:08:50.144491: step: 240/77, loss: 0.019391246140003204 2023-01-22 15:08:51.633690: step: 244/77, loss: 0.008214287459850311 2023-01-22 15:08:53.081370: step: 248/77, loss: 0.053598154336214066 2023-01-22 15:08:54.605574: step: 252/77, loss: 0.00010274317901348695 2023-01-22 15:08:56.091129: step: 256/77, loss: 0.009660336188971996 2023-01-22 15:08:57.564975: step: 260/77, loss: 0.003455250756815076 2023-01-22 15:08:59.034257: step: 264/77, loss: 0.005311280023306608 2023-01-22 15:09:00.570694: step: 268/77, loss: 0.040916331112384796 2023-01-22 15:09:02.055314: step: 272/77, loss: 0.0018766542198136449 2023-01-22 15:09:03.481758: step: 276/77, loss: 8.171782974386588e-05 2023-01-22 15:09:04.939866: step: 280/77, loss: 0.026160767301917076 2023-01-22 15:09:06.405496: step: 284/77, loss: 0.0308120958507061 2023-01-22 15:09:07.831280: step: 288/77, loss: 0.012294838204979897 2023-01-22 15:09:09.323301: step: 292/77, loss: 0.0015243733068928123 2023-01-22 15:09:10.698974: step: 296/77, loss: 0.012461314909160137 2023-01-22 15:09:12.218069: step: 300/77, loss: 0.0002546489122323692 2023-01-22 15:09:13.709605: step: 304/77, loss: 0.04105209559202194 2023-01-22 15:09:15.138210: step: 308/77, loss: 0.010356339626014233 2023-01-22 15:09:16.521725: step: 312/77, loss: 0.19176003336906433 2023-01-22 15:09:18.002422: step: 316/77, loss: 0.019898800179362297 2023-01-22 15:09:19.439979: step: 320/77, loss: 0.0013566407142207026 2023-01-22 15:09:20.891201: step: 324/77, loss: 0.01040252111852169 2023-01-22 15:09:22.323356: step: 328/77, loss: 0.007182074710726738 2023-01-22 15:09:23.745495: step: 332/77, loss: 0.0037522786296904087 2023-01-22 15:09:25.185379: step: 336/77, loss: 0.0006864179158583283 2023-01-22 15:09:26.694957: step: 340/77, loss: 0.0036263717338442802 2023-01-22 15:09:28.142815: step: 344/77, loss: 1.552778485347517e-05 2023-01-22 15:09:29.654440: step: 348/77, loss: 0.05568019300699234 2023-01-22 15:09:31.144594: step: 352/77, loss: 1.6686997696524486e-05 2023-01-22 15:09:32.651719: step: 356/77, loss: 8.73190515449096e-07 2023-01-22 15:09:34.034690: step: 360/77, loss: 0.01214410737156868 2023-01-22 15:09:35.518458: step: 364/77, loss: 5.7248958910349756e-05 2023-01-22 15:09:37.026246: step: 368/77, loss: 2.339025195396971e-05 2023-01-22 15:09:38.425656: step: 372/77, loss: 0.005132255144417286 2023-01-22 15:09:39.820351: step: 376/77, loss: 0.014740485697984695 2023-01-22 15:09:41.307349: step: 380/77, loss: 0.017130665481090546 2023-01-22 15:09:42.834238: step: 384/77, loss: 0.03507642447948456 2023-01-22 15:09:44.257362: step: 388/77, loss: 0.005500377155840397 
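Note on the evaluation blocks: the 'combined' figures are consistent, to floating-point precision, with the product of the template F1 and the slot F1. The sketch below is an inference from the logged numbers, not a quote of the scorer in train.py; the constants are copied from the epoch-10 Dev/Test Chinese results reported just after this point.

# Hedged reconstruction of the 'combined' metric as template_f1 * slot_f1,
# checked against the epoch-10 evaluation block that follows.
dev_template_f1, dev_slot_f1 = 0.7368421052631579, 0.07029876977152899
test_template_f1, test_slot_f1 = 0.7053140096618357, 0.04047838086476541
print(abs(dev_template_f1 * dev_slot_f1 - 0.05179909351586346) < 1e-12)     # True
print(abs(test_template_f1 * test_slot_f1 - 0.028549969112346616) < 1e-12)  # True

Reading the blocks with that in mind: the Dev 'combined' score is identical (0.05179909351586346) in every epoch shown here, so the 'Current best result' blocks keep pointing at the epoch-1 checkpoint even though the training loss keeps falling.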
================================================== Loss: 0.014 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Chinese: {'template': {'p': 0.9125, 'r': 0.5748031496062992, 'f1': 0.7053140096618357}, 'slot': {'p': 0.5116279069767442, 'r': 0.0210727969348659, 'f1': 0.04047838086476541}, 'combined': 0.028549969112346616, 'epoch': 10} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Korean: {'template': {'p': 0.9125, 'r': 0.5748031496062992, 'f1': 0.7053140096618357}, 'slot': {'p': 0.5116279069767442, 'r': 0.0210727969348659, 'f1': 0.04047838086476541}, 'combined': 0.028549969112346616, 'epoch': 10} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Russian: {'template': {'p': 0.9125, 'r': 0.5748031496062992, 'f1': 0.7053140096618357}, 'slot': {'p': 0.5238095238095238, 'r': 0.0210727969348659, 'f1': 0.04051565377532229}, 'combined': 0.028576258218343253, 'epoch': 10} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 10} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 10} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 10} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 
0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 11 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:11:19.140825: step: 4/77, loss: 0.0006003312300890684 2023-01-22 15:11:20.586566: step: 8/77, loss: 0.021204061806201935 2023-01-22 15:11:22.074844: step: 12/77, loss: 0.031505286693573 2023-01-22 15:11:23.533663: step: 16/77, loss: 0.0012095154961571097 2023-01-22 15:11:25.013636: step: 20/77, loss: 0.00016371729725506157 2023-01-22 15:11:26.507783: step: 24/77, loss: 0.016680900007486343 2023-01-22 15:11:28.002126: step: 28/77, loss: 0.0002608389186207205 2023-01-22 15:11:29.501670: step: 32/77, loss: 0.0022526036482304335 2023-01-22 15:11:30.941446: step: 36/77, loss: 0.0081439558416605 2023-01-22 15:11:32.419178: step: 40/77, loss: 0.004838872700929642 2023-01-22 15:11:33.838466: step: 44/77, loss: 0.0236189067363739 2023-01-22 15:11:35.262855: step: 48/77, loss: 0.00751454895362258 2023-01-22 15:11:36.730434: step: 52/77, loss: 0.008133837953209877 2023-01-22 15:11:38.238404: step: 56/77, loss: 0.0143471984192729 2023-01-22 15:11:39.689984: step: 60/77, loss: 0.0037688438314944506 2023-01-22 15:11:41.115887: step: 64/77, loss: 0.0002274762955494225 2023-01-22 15:11:42.586304: step: 68/77, loss: 0.0012080915039405227 2023-01-22 15:11:44.091856: step: 72/77, loss: 0.05504855513572693 2023-01-22 15:11:45.545335: step: 76/77, loss: 0.00021534219558816403 2023-01-22 15:11:46.979683: step: 80/77, loss: 0.005196057725697756 2023-01-22 15:11:48.406109: step: 84/77, loss: 0.020581921562552452 2023-01-22 15:11:49.843298: step: 88/77, loss: 0.023325707763433456 2023-01-22 15:11:51.295744: step: 92/77, loss: 0.04466782882809639 2023-01-22 15:11:52.804169: step: 96/77, loss: 2.0177296391921118e-05 2023-01-22 15:11:54.277267: step: 100/77, loss: 0.024103665724396706 2023-01-22 15:11:55.687535: step: 104/77, loss: 0.00651308661326766 2023-01-22 15:11:57.162005: step: 108/77, loss: 0.05946511775255203 2023-01-22 15:11:58.653761: step: 112/77, loss: 0.0014207500498741865 2023-01-22 15:12:00.067906: step: 116/77, loss: 0.03702997416257858 2023-01-22 15:12:01.537773: step: 120/77, loss: 0.02262197434902191 2023-01-22 15:12:03.019813: step: 124/77, loss: 0.0028150691650807858 2023-01-22 15:12:04.571263: step: 128/77, loss: 0.00901339203119278 2023-01-22 15:12:05.988536: step: 132/77, loss: 0.002175933215767145 2023-01-22 15:12:07.384839: step: 136/77, loss: 0.002201940631493926 2023-01-22 15:12:08.817461: step: 140/77, loss: 0.03656301647424698 2023-01-22 15:12:10.375275: step: 144/77, loss: 0.004955732729285955 2023-01-22 15:12:11.830631: step: 148/77, loss: 0.019709350541234016 2023-01-22 15:12:13.241504: step: 152/77, loss: 0.007641168776899576 2023-01-22 15:12:14.648264: step: 156/77, loss: 0.009615294635295868 2023-01-22 15:12:16.103861: step: 160/77, loss: 0.03807912394404411 2023-01-22 15:12:17.612393: step: 164/77, loss: 0.06894589215517044 2023-01-22 15:12:19.113728: step: 168/77, loss: 0.0022699374239891768 2023-01-22 15:12:20.576747: step: 172/77, loss: 
0.008004303090274334 2023-01-22 15:12:22.024454: step: 176/77, loss: 0.0213845893740654 2023-01-22 15:12:23.500135: step: 180/77, loss: 0.017815865576267242 2023-01-22 15:12:25.031757: step: 184/77, loss: 0.0009117791196331382 2023-01-22 15:12:26.518149: step: 188/77, loss: 0.035853758454322815 2023-01-22 15:12:27.970805: step: 192/77, loss: 0.03394676372408867 2023-01-22 15:12:29.443993: step: 196/77, loss: 0.045160550624132156 2023-01-22 15:12:30.858793: step: 200/77, loss: 0.03711787238717079 2023-01-22 15:12:32.332510: step: 204/77, loss: 0.15094727277755737 2023-01-22 15:12:33.832915: step: 208/77, loss: 0.03208165243268013 2023-01-22 15:12:35.326747: step: 212/77, loss: 0.012918131425976753 2023-01-22 15:12:36.804111: step: 216/77, loss: 9.924051119014621e-05 2023-01-22 15:12:38.285153: step: 220/77, loss: 0.0001036229805322364 2023-01-22 15:12:39.680060: step: 224/77, loss: 0.003947919700294733 2023-01-22 15:12:41.065933: step: 228/77, loss: 0.033685386180877686 2023-01-22 15:12:42.510081: step: 232/77, loss: 0.0008310351986438036 2023-01-22 15:12:43.973845: step: 236/77, loss: 0.001969542121514678 2023-01-22 15:12:45.403941: step: 240/77, loss: 1.5080418961588293e-05 2023-01-22 15:12:46.857674: step: 244/77, loss: 0.006269903853535652 2023-01-22 15:12:48.287543: step: 248/77, loss: 0.000716530135832727 2023-01-22 15:12:49.779671: step: 252/77, loss: 0.002246389165520668 2023-01-22 15:12:51.270482: step: 256/77, loss: 0.0006060154992155731 2023-01-22 15:12:52.732041: step: 260/77, loss: 0.010807817801833153 2023-01-22 15:12:54.206300: step: 264/77, loss: 0.01716885343194008 2023-01-22 15:12:55.647242: step: 268/77, loss: 0.009795032441616058 2023-01-22 15:12:57.073715: step: 272/77, loss: 0.004241094458848238 2023-01-22 15:12:58.570343: step: 276/77, loss: 0.015581032261252403 2023-01-22 15:13:00.035418: step: 280/77, loss: 0.0036227544769644737 2023-01-22 15:13:01.537395: step: 284/77, loss: 0.00013760387082584202 2023-01-22 15:13:02.982200: step: 288/77, loss: 0.03543272614479065 2023-01-22 15:13:04.459770: step: 292/77, loss: 0.0492778979241848 2023-01-22 15:13:05.968495: step: 296/77, loss: 0.000604078231845051 2023-01-22 15:13:07.423699: step: 300/77, loss: 0.040834154933691025 2023-01-22 15:13:08.899912: step: 304/77, loss: 0.0001600382529431954 2023-01-22 15:13:10.362852: step: 308/77, loss: 0.0005247821100056171 2023-01-22 15:13:11.841990: step: 312/77, loss: 0.044942084699869156 2023-01-22 15:13:13.282241: step: 316/77, loss: 0.0047583202831447124 2023-01-22 15:13:14.771257: step: 320/77, loss: 0.0006811682833358645 2023-01-22 15:13:16.287861: step: 324/77, loss: 0.0006450171931646764 2023-01-22 15:13:17.743874: step: 328/77, loss: 3.7787394830957055e-06 2023-01-22 15:13:19.221384: step: 332/77, loss: 0.00300615094602108 2023-01-22 15:13:20.677873: step: 336/77, loss: 0.0020990371704101562 2023-01-22 15:13:22.201216: step: 340/77, loss: 0.001745673012919724 2023-01-22 15:13:23.725048: step: 344/77, loss: 0.0016484896186739206 2023-01-22 15:13:25.215765: step: 348/77, loss: 0.00247292872518301 2023-01-22 15:13:26.699814: step: 352/77, loss: 0.01175768580287695 2023-01-22 15:13:28.137298: step: 356/77, loss: 0.002395451068878174 2023-01-22 15:13:29.559259: step: 360/77, loss: 0.0027588410302996635 2023-01-22 15:13:31.026245: step: 364/77, loss: 1.4766420690648374e-06 2023-01-22 15:13:32.573504: step: 368/77, loss: 0.0001886768004624173 2023-01-22 15:13:34.032037: step: 372/77, loss: 0.00016802757454570383 2023-01-22 15:13:35.514338: step: 376/77, loss: 0.0002669918758329004 
2023-01-22 15:13:36.909705: step: 380/77, loss: 0.02825094386935234 2023-01-22 15:13:38.288017: step: 384/77, loss: 0.0053184181451797485 2023-01-22 15:13:39.760490: step: 388/77, loss: 1.3606122593046166e-05 ================================================== Loss: 0.014 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 11} Test Chinese: {'template': {'p': 0.9130434782608695, 'r': 0.49606299212598426, 'f1': 0.6428571428571428}, 'slot': {'p': 0.5675675675675675, 'r': 0.020114942528735632, 'f1': 0.03885291396854764}, 'combined': 0.02497687326549491, 'epoch': 11} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 11} Test Korean: {'template': {'p': 0.9130434782608695, 'r': 0.49606299212598426, 'f1': 0.6428571428571428}, 'slot': {'p': 0.5555555555555556, 'r': 0.019157088122605363, 'f1': 0.037037037037037035}, 'combined': 0.023809523809523805, 'epoch': 11} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 11} Test Russian: {'template': {'p': 0.8985507246376812, 'r': 0.4881889763779528, 'f1': 0.6326530612244898}, 'slot': {'p': 0.5675675675675675, 'r': 0.020114942528735632, 'f1': 0.03885291396854764}, 'combined': 0.024580414959693406, 'epoch': 11} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 11} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 
'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 12 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:15:14.885648: step: 4/77, loss: 0.00011447127326391637 2023-01-22 15:15:16.373743: step: 8/77, loss: 0.005873774643987417 2023-01-22 15:15:17.825696: step: 12/77, loss: 0.013873615302145481 2023-01-22 15:15:19.334304: step: 16/77, loss: 0.09532775729894638 2023-01-22 15:15:20.763738: step: 20/77, loss: 0.019224464893341064 2023-01-22 15:15:22.245939: step: 24/77, loss: 0.014952284283936024 2023-01-22 15:15:23.637750: step: 28/77, loss: 0.011354008689522743 2023-01-22 15:15:25.186397: step: 32/77, loss: 0.005646710749715567 2023-01-22 15:15:26.697136: step: 36/77, loss: 0.006470432039350271 2023-01-22 15:15:28.149616: step: 40/77, loss: 0.002005266258493066 2023-01-22 15:15:29.638110: step: 44/77, loss: 0.02737969532608986 2023-01-22 15:15:31.063600: step: 48/77, loss: 0.09160150587558746 2023-01-22 15:15:32.562259: step: 52/77, loss: 0.015574988909065723 2023-01-22 15:15:33.974338: step: 56/77, loss: 0.013402441516518593 2023-01-22 15:15:35.422561: step: 60/77, loss: 0.0028528012335300446 2023-01-22 15:15:36.851407: step: 64/77, loss: 0.11721368134021759 2023-01-22 15:15:38.240995: step: 68/77, loss: 0.04519982635974884 2023-01-22 15:15:39.720963: step: 72/77, loss: 0.007908130064606667 2023-01-22 15:15:41.254480: step: 76/77, loss: 0.000419805379351601 2023-01-22 15:15:42.724984: step: 80/77, loss: 0.029239173978567123 2023-01-22 15:15:44.206951: step: 84/77, loss: 1.4387949704541825e-05 2023-01-22 15:15:45.617583: step: 88/77, loss: 0.0032032665330916643 2023-01-22 15:15:47.083489: step: 92/77, loss: 0.03187604248523712 2023-01-22 15:15:48.534884: step: 96/77, loss: 0.0003781506384257227 2023-01-22 15:15:50.032521: step: 100/77, loss: 0.0025725041050463915 2023-01-22 15:15:51.526348: step: 104/77, loss: 0.005759204737842083 2023-01-22 15:15:52.987589: step: 108/77, loss: 0.0004913319135084748 2023-01-22 15:15:54.406701: step: 112/77, loss: 0.017344143241643906 2023-01-22 15:15:55.805973: step: 116/77, loss: 0.006315631791949272 2023-01-22 15:15:57.303075: step: 120/77, loss: 0.009309705346822739 2023-01-22 15:15:58.757814: step: 124/77, loss: 0.04846097528934479 2023-01-22 15:16:00.177308: step: 128/77, loss: 0.0011693534906953573 2023-01-22 15:16:01.665383: step: 132/77, loss: 0.00011401664232835174 2023-01-22 15:16:03.042604: step: 136/77, loss: 7.084720436978387e-06 2023-01-22 15:16:04.554204: step: 140/77, loss: 0.0003985275106970221 2023-01-22 15:16:06.012779: step: 144/77, loss: 0.01128348894417286 2023-01-22 15:16:07.491748: step: 148/77, loss: 0.004289138596504927 2023-01-22 15:16:09.002201: step: 152/77, loss: 0.013749208301305771 2023-01-22 15:16:10.442522: step: 156/77, loss: 0.0183701254427433 2023-01-22 15:16:11.841962: step: 160/77, loss: 0.0014028212754055858 
2023-01-22 15:16:13.301067: step: 164/77, loss: 0.039471760392189026 2023-01-22 15:16:14.749848: step: 168/77, loss: 0.00030573649564757943 2023-01-22 15:16:16.238987: step: 172/77, loss: 0.0019688790198415518 2023-01-22 15:16:17.690839: step: 176/77, loss: 0.0018431775970384479 2023-01-22 15:16:19.184358: step: 180/77, loss: 0.004778346978127956 2023-01-22 15:16:20.630627: step: 184/77, loss: 0.0010927257826551795 2023-01-22 15:16:22.091795: step: 188/77, loss: 0.0025949692353606224 2023-01-22 15:16:23.595236: step: 192/77, loss: 0.01726909726858139 2023-01-22 15:16:25.089181: step: 196/77, loss: 0.00552397733554244 2023-01-22 15:16:26.521876: step: 200/77, loss: 0.024718482047319412 2023-01-22 15:16:27.977224: step: 204/77, loss: 0.003021220676600933 2023-01-22 15:16:29.454394: step: 208/77, loss: 0.0008731871494092047 2023-01-22 15:16:30.899921: step: 212/77, loss: 0.009300929494202137 2023-01-22 15:16:32.397526: step: 216/77, loss: 3.7789632187923416e-05 2023-01-22 15:16:33.813748: step: 220/77, loss: 0.0013306228211149573 2023-01-22 15:16:35.298024: step: 224/77, loss: 0.006890836171805859 2023-01-22 15:16:36.727497: step: 228/77, loss: 0.0002653613919392228 2023-01-22 15:16:38.194218: step: 232/77, loss: 0.0014646729687228799 2023-01-22 15:16:39.601311: step: 236/77, loss: 0.010130883194506168 2023-01-22 15:16:41.056879: step: 240/77, loss: 0.003048648126423359 2023-01-22 15:16:42.522829: step: 244/77, loss: 0.0001677718391874805 2023-01-22 15:16:44.056027: step: 248/77, loss: 6.348369788611308e-05 2023-01-22 15:16:45.500109: step: 252/77, loss: 0.0007760244188830256 2023-01-22 15:16:47.001404: step: 256/77, loss: 0.002921057166531682 2023-01-22 15:16:48.559755: step: 260/77, loss: 0.021878495812416077 2023-01-22 15:16:50.028172: step: 264/77, loss: 0.0029452943708747625 2023-01-22 15:16:51.559577: step: 268/77, loss: 0.00043145238305442035 2023-01-22 15:16:53.027345: step: 272/77, loss: 0.008674006909132004 2023-01-22 15:16:54.537000: step: 276/77, loss: 0.007851340807974339 2023-01-22 15:16:55.981293: step: 280/77, loss: 0.007585796527564526 2023-01-22 15:16:57.449281: step: 284/77, loss: 0.020777950063347816 2023-01-22 15:16:58.940741: step: 288/77, loss: 0.010870959609746933 2023-01-22 15:17:00.342934: step: 292/77, loss: 0.001444056280888617 2023-01-22 15:17:01.811773: step: 296/77, loss: 0.0014722751220688224 2023-01-22 15:17:03.275850: step: 300/77, loss: 0.00020505535940174013 2023-01-22 15:17:04.827909: step: 304/77, loss: 5.0112106691813096e-05 2023-01-22 15:17:06.328171: step: 308/77, loss: 6.33996824035421e-05 2023-01-22 15:17:07.751243: step: 312/77, loss: 0.0021618446335196495 2023-01-22 15:17:09.194195: step: 316/77, loss: 0.003228449262678623 2023-01-22 15:17:10.605577: step: 320/77, loss: 0.003389625111594796 2023-01-22 15:17:12.081384: step: 324/77, loss: 0.028880221769213676 2023-01-22 15:17:13.506691: step: 328/77, loss: 4.9275131459580734e-06 2023-01-22 15:17:14.981104: step: 332/77, loss: 0.0005603508907370269 2023-01-22 15:17:16.438020: step: 336/77, loss: 0.061530083417892456 2023-01-22 15:17:17.904637: step: 340/77, loss: 0.015641264617443085 2023-01-22 15:17:19.328367: step: 344/77, loss: 0.00047594981151632965 2023-01-22 15:17:20.762761: step: 348/77, loss: 0.000820385292172432 2023-01-22 15:17:22.299098: step: 352/77, loss: 0.004266134463250637 2023-01-22 15:17:23.736273: step: 356/77, loss: 0.03275083750486374 2023-01-22 15:17:25.198511: step: 360/77, loss: 0.001963577000424266 2023-01-22 15:17:26.688316: step: 364/77, loss: 0.008813240565359592 
2023-01-22 15:17:28.145283: step: 368/77, loss: 0.00036346548586152494 2023-01-22 15:17:29.593983: step: 372/77, loss: 0.00025794553221203387 2023-01-22 15:17:31.026837: step: 376/77, loss: 0.00023035166668705642 2023-01-22 15:17:32.470242: step: 380/77, loss: 0.00019547718693502247 2023-01-22 15:17:34.001685: step: 384/77, loss: 0.003217963268980384 2023-01-22 15:17:35.408266: step: 388/77, loss: 0.020210573449730873 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Chinese: {'template': {'p': 0.8875, 'r': 0.5590551181102362, 'f1': 0.6859903381642511}, 'slot': {'p': 0.46808510638297873, 'r': 0.0210727969348659, 'f1': 0.04032997250229148}, 'combined': 0.027665971475001883, 'epoch': 12} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Korean: {'template': {'p': 0.8875, 'r': 0.5590551181102362, 'f1': 0.6859903381642511}, 'slot': {'p': 0.4791666666666667, 'r': 0.022030651340996167, 'f1': 0.04212454212454212}, 'combined': 0.02889702889702889, 'epoch': 12} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Russian: {'template': {'p': 0.8987341772151899, 'r': 0.5590551181102362, 'f1': 0.6893203883495145}, 'slot': {'p': 0.4782608695652174, 'r': 0.0210727969348659, 'f1': 0.04036697247706422}, 'combined': 0.027825777144384074, 'epoch': 12} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 12} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 12} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 12} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 
1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 13 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:19:10.508272: step: 4/77, loss: 3.243347237003036e-05 2023-01-22 15:19:12.002651: step: 8/77, loss: 0.009209323674440384 2023-01-22 15:19:13.421584: step: 12/77, loss: 0.0021235691383481026 2023-01-22 15:19:14.905492: step: 16/77, loss: 0.0395694337785244 2023-01-22 15:19:16.338537: step: 20/77, loss: 0.0002502222778275609 2023-01-22 15:19:17.856143: step: 24/77, loss: 3.197432306478731e-05 2023-01-22 15:19:19.320442: step: 28/77, loss: 0.005346166435629129 2023-01-22 15:19:20.715674: step: 32/77, loss: 0.07942058891057968 2023-01-22 15:19:22.241525: step: 36/77, loss: 0.0475703701376915 2023-01-22 15:19:23.718779: step: 40/77, loss: 0.0304610263556242 2023-01-22 15:19:25.157612: step: 44/77, loss: 0.002349685877561569 2023-01-22 15:19:26.586518: step: 48/77, loss: 0.0011101409327238798 2023-01-22 15:19:27.993495: step: 52/77, loss: 0.008202255703508854 2023-01-22 15:19:29.446149: step: 56/77, loss: 0.03519538417458534 2023-01-22 15:19:30.923247: step: 60/77, loss: 0.00024283432867377996 2023-01-22 15:19:32.385187: step: 64/77, loss: 0.0005841474048793316 2023-01-22 15:19:33.830250: step: 68/77, loss: 0.01869087666273117 2023-01-22 15:19:35.289005: step: 72/77, loss: 0.00029711565002799034 2023-01-22 15:19:36.757121: step: 76/77, loss: 0.0028815760742872953 2023-01-22 15:19:38.235013: step: 80/77, loss: 0.0017954572103917599 2023-01-22 15:19:39.705802: step: 84/77, loss: 0.000594982469920069 2023-01-22 15:19:41.200393: step: 88/77, loss: 0.014209384098649025 2023-01-22 15:19:42.686996: step: 92/77, loss: 0.0020865437109023333 2023-01-22 15:19:44.122381: step: 96/77, loss: 0.029563093557953835 2023-01-22 15:19:45.596908: step: 100/77, loss: 0.007532364688813686 2023-01-22 15:19:47.084235: step: 104/77, loss: 0.00436977855861187 2023-01-22 15:19:48.510948: step: 108/77, loss: 4.12298913943232e-06 2023-01-22 15:19:49.981361: step: 112/77, loss: 0.0019220001995563507 2023-01-22 15:19:51.471593: step: 116/77, loss: 0.005670643411576748 2023-01-22 15:19:52.923848: step: 120/77, loss: 0.032279904931783676 2023-01-22 15:19:54.396391: step: 124/77, loss: 0.002251417376101017 2023-01-22 15:19:55.800854: step: 128/77, loss: 0.005260172300040722 2023-01-22 15:19:57.268044: step: 132/77, loss: 0.018494149670004845 2023-01-22 15:19:58.723027: step: 136/77, loss: 3.3229426321668143e-07 2023-01-22 15:20:00.199723: step: 140/77, loss: 0.012185169383883476 2023-01-22 15:20:01.711695: step: 144/77, loss: 0.009393458254635334 2023-01-22 15:20:03.179322: step: 
148/77, loss: 0.0041645425371825695 2023-01-22 15:20:04.651160: step: 152/77, loss: 0.00458143837749958 2023-01-22 15:20:06.113726: step: 156/77, loss: 0.0005006591673009098 2023-01-22 15:20:07.513747: step: 160/77, loss: 0.001993057783693075 2023-01-22 15:20:08.902241: step: 164/77, loss: 0.0012036071857437491 2023-01-22 15:20:10.388901: step: 168/77, loss: 0.0001990405871765688 2023-01-22 15:20:11.854868: step: 172/77, loss: 0.0018202560022473335 2023-01-22 15:20:13.381117: step: 176/77, loss: 0.00022611633175984025 2023-01-22 15:20:14.784049: step: 180/77, loss: 0.058546602725982666 2023-01-22 15:20:16.248253: step: 184/77, loss: 0.00033171792165376246 2023-01-22 15:20:17.621943: step: 188/77, loss: 2.6165966119151562e-05 2023-01-22 15:20:19.137471: step: 192/77, loss: 0.00013848446542397141 2023-01-22 15:20:20.571637: step: 196/77, loss: 0.010520029813051224 2023-01-22 15:20:22.028579: step: 200/77, loss: 4.337416612543166e-05 2023-01-22 15:20:23.489410: step: 204/77, loss: 0.00015028423513285816 2023-01-22 15:20:24.917057: step: 208/77, loss: 0.022912312299013138 2023-01-22 15:20:26.322110: step: 212/77, loss: 0.001989134354516864 2023-01-22 15:20:27.737672: step: 216/77, loss: 6.8852045842504594e-06 2023-01-22 15:20:29.146059: step: 220/77, loss: 0.0010125949047505856 2023-01-22 15:20:30.651418: step: 224/77, loss: 4.541853195405565e-05 2023-01-22 15:20:32.136383: step: 228/77, loss: 0.0025225854478776455 2023-01-22 15:20:33.658577: step: 232/77, loss: 0.0011962627759203315 2023-01-22 15:20:35.109822: step: 236/77, loss: 0.016047481447458267 2023-01-22 15:20:36.599127: step: 240/77, loss: 4.664024402245559e-07 2023-01-22 15:20:38.039421: step: 244/77, loss: 7.396838554996066e-06 2023-01-22 15:20:39.479593: step: 248/77, loss: 0.0001585302670719102 2023-01-22 15:20:40.965242: step: 252/77, loss: 0.022302206605672836 2023-01-22 15:20:42.379046: step: 256/77, loss: 0.01055676769465208 2023-01-22 15:20:43.774948: step: 260/77, loss: 0.009294010698795319 2023-01-22 15:20:45.257624: step: 264/77, loss: 0.0014928595628589392 2023-01-22 15:20:46.756237: step: 268/77, loss: 0.007394532207399607 2023-01-22 15:20:48.225260: step: 272/77, loss: 0.061641938984394073 2023-01-22 15:20:49.696506: step: 276/77, loss: 0.09055077284574509 2023-01-22 15:20:51.148914: step: 280/77, loss: 0.0005149097414687276 2023-01-22 15:20:52.609333: step: 284/77, loss: 0.017879195511341095 2023-01-22 15:20:54.106983: step: 288/77, loss: 0.03737274184823036 2023-01-22 15:20:55.572253: step: 292/77, loss: 4.744202669826336e-05 2023-01-22 15:20:56.993900: step: 296/77, loss: 0.03490385785698891 2023-01-22 15:20:58.466898: step: 300/77, loss: 6.847387703601271e-05 2023-01-22 15:20:59.907542: step: 304/77, loss: 0.04560726135969162 2023-01-22 15:21:01.365251: step: 308/77, loss: 7.1887475314724725e-06 2023-01-22 15:21:02.866546: step: 312/77, loss: 0.0036362677346915007 2023-01-22 15:21:04.286052: step: 316/77, loss: 5.381370556278853e-06 2023-01-22 15:21:05.735406: step: 320/77, loss: 0.0035514039918780327 2023-01-22 15:21:07.205672: step: 324/77, loss: 0.0012405157322064042 2023-01-22 15:21:08.680734: step: 328/77, loss: 0.0016681693959981203 2023-01-22 15:21:10.205918: step: 332/77, loss: 1.5273243434421602e-06 2023-01-22 15:21:11.674010: step: 336/77, loss: 0.000838981126435101 2023-01-22 15:21:13.108208: step: 340/77, loss: 0.010036369785666466 2023-01-22 15:21:14.571167: step: 344/77, loss: 0.023637622594833374 2023-01-22 15:21:16.022661: step: 348/77, loss: 1.2234897440066561e-05 2023-01-22 15:21:17.485230: step: 
352/77, loss: 2.741796834015986e-07 2023-01-22 15:21:18.962298: step: 356/77, loss: 0.017329927533864975 2023-01-22 15:21:20.487982: step: 360/77, loss: 0.005737886298447847 2023-01-22 15:21:21.969193: step: 364/77, loss: 0.0019417705480009317 2023-01-22 15:21:23.423856: step: 368/77, loss: 0.0009935015114024282 2023-01-22 15:21:24.835508: step: 372/77, loss: 0.010892514139413834 2023-01-22 15:21:26.292946: step: 376/77, loss: 0.0005522985593415797 2023-01-22 15:21:27.803127: step: 380/77, loss: 0.01548667810857296 2023-01-22 15:21:29.279538: step: 384/77, loss: 0.0015371019253507257 2023-01-22 15:21:30.802574: step: 388/77, loss: 4.670552152674645e-05 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13} Test Chinese: {'template': {'p': 0.9154929577464789, 'r': 0.5118110236220472, 'f1': 0.6565656565656565}, 'slot': {'p': 0.5952380952380952, 'r': 0.023946360153256706, 'f1': 0.04604051565377532}, 'combined': 0.03022862138884238, 'epoch': 13} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13} Test Korean: {'template': {'p': 0.9154929577464789, 'r': 0.5118110236220472, 'f1': 0.6565656565656565}, 'slot': {'p': 0.6046511627906976, 'r': 0.02490421455938697, 'f1': 0.047838086476540934}, 'combined': 0.03140884465631475, 'epoch': 13} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13} Test Russian: {'template': {'p': 0.9154929577464789, 'r': 0.5118110236220472, 'f1': 0.6565656565656565}, 'slot': {'p': 0.6046511627906976, 'r': 0.02490421455938697, 'f1': 0.047838086476540934}, 'combined': 0.03140884465631475, 'epoch': 13} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 13} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 13} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 13} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 
0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 14 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:23:05.523180: step: 4/77, loss: 0.0025656498037278652 2023-01-22 15:23:06.954249: step: 8/77, loss: 0.0011140801943838596 2023-01-22 15:23:08.443195: step: 12/77, loss: 0.017641501501202583 2023-01-22 15:23:09.942646: step: 16/77, loss: 0.041499342769384384 2023-01-22 15:23:11.404041: step: 20/77, loss: 0.017789943143725395 2023-01-22 15:23:12.859838: step: 24/77, loss: 0.015781359747052193 2023-01-22 15:23:14.317755: step: 28/77, loss: 0.002352119656279683 2023-01-22 15:23:15.792883: step: 32/77, loss: 0.0006025315378792584 2023-01-22 15:23:17.280991: step: 36/77, loss: 0.00033516521216370165 2023-01-22 15:23:18.722395: step: 40/77, loss: 0.007510177791118622 2023-01-22 15:23:20.237117: step: 44/77, loss: 0.0020927973091602325 2023-01-22 15:23:21.701343: step: 48/77, loss: 0.00037008029175922275 2023-01-22 15:23:23.206281: step: 52/77, loss: 0.00429653562605381 2023-01-22 15:23:24.639002: step: 56/77, loss: 0.0029666456393897533 2023-01-22 15:23:26.066067: step: 60/77, loss: 0.031368691474199295 2023-01-22 15:23:27.524596: step: 64/77, loss: 0.00030429562320932746 2023-01-22 15:23:29.047834: step: 68/77, loss: 0.003161044092848897 2023-01-22 15:23:30.517811: step: 72/77, loss: 0.0025112165603786707 2023-01-22 15:23:31.935734: step: 76/77, loss: 0.0005638190777972341 2023-01-22 15:23:33.364753: step: 80/77, loss: 0.0009943522745743394 2023-01-22 15:23:34.764774: step: 84/77, loss: 0.008401062339544296 2023-01-22 15:23:36.219618: step: 88/77, loss: 0.00024580463650636375 2023-01-22 15:23:37.668212: step: 92/77, loss: 0.002747837919741869 2023-01-22 15:23:39.145839: step: 96/77, loss: 0.0003098023298662156 2023-01-22 15:23:40.612373: step: 100/77, loss: 0.012885620817542076 2023-01-22 15:23:42.030193: step: 104/77, loss: 0.0001532408205093816 2023-01-22 15:23:43.456612: step: 108/77, loss: 0.0010461807250976562 2023-01-22 15:23:44.894785: step: 112/77, loss: 0.003930769395083189 2023-01-22 15:23:46.406588: step: 116/77, loss: 3.6715005080623087e-06 2023-01-22 15:23:47.889985: step: 120/77, loss: 0.011873658746480942 2023-01-22 15:23:49.350200: step: 124/77, loss: 0.022898582741618156 2023-01-22 15:23:50.795819: step: 128/77, loss: 0.002420936245471239 2023-01-22 
15:23:52.235077: step: 132/77, loss: 0.24361535906791687 2023-01-22 15:23:53.693343: step: 136/77, loss: 0.004778902977705002 2023-01-22 15:23:55.090997: step: 140/77, loss: 0.011483339592814445 2023-01-22 15:23:56.549133: step: 144/77, loss: 8.191740926122293e-05 2023-01-22 15:23:57.999807: step: 148/77, loss: 6.910040974617004e-05 2023-01-22 15:23:59.435858: step: 152/77, loss: 2.73572004516609e-06 2023-01-22 15:24:00.874388: step: 156/77, loss: 0.03097393363714218 2023-01-22 15:24:02.283724: step: 160/77, loss: 0.014213638380169868 2023-01-22 15:24:03.740415: step: 164/77, loss: 1.4160813407215755e-05 2023-01-22 15:24:05.212529: step: 168/77, loss: 0.0016870114486664534 2023-01-22 15:24:06.650502: step: 172/77, loss: 0.006039629690349102 2023-01-22 15:24:08.066025: step: 176/77, loss: 0.00044651172356680036 2023-01-22 15:24:09.521468: step: 180/77, loss: 0.014716248959302902 2023-01-22 15:24:11.000002: step: 184/77, loss: 1.01327799484352e-07 2023-01-22 15:24:12.505236: step: 188/77, loss: 0.008014782331883907 2023-01-22 15:24:13.949845: step: 192/77, loss: 0.03508007898926735 2023-01-22 15:24:15.399075: step: 196/77, loss: 0.008259987458586693 2023-01-22 15:24:16.835038: step: 200/77, loss: 0.0011896166251972318 2023-01-22 15:24:18.210230: step: 204/77, loss: 0.0035644685849547386 2023-01-22 15:24:19.670016: step: 208/77, loss: 0.00013127514102961868 2023-01-22 15:24:21.092993: step: 212/77, loss: 0.00935064721852541 2023-01-22 15:24:22.533747: step: 216/77, loss: 0.0010801522294059396 2023-01-22 15:24:23.907652: step: 220/77, loss: 0.00014615019608754665 2023-01-22 15:24:25.350602: step: 224/77, loss: 0.002576695755124092 2023-01-22 15:24:26.790662: step: 228/77, loss: 2.3486076315748505e-05 2023-01-22 15:24:28.221946: step: 232/77, loss: 0.004159575328230858 2023-01-22 15:24:29.638723: step: 236/77, loss: 0.005562157835811377 2023-01-22 15:24:31.082062: step: 240/77, loss: 0.003928885329514742 2023-01-22 15:24:32.514517: step: 244/77, loss: 0.01359700970351696 2023-01-22 15:24:33.964321: step: 248/77, loss: 0.00575456116348505 2023-01-22 15:24:35.404145: step: 252/77, loss: 0.051128089427948 2023-01-22 15:24:36.840857: step: 256/77, loss: 1.993682872125646e-06 2023-01-22 15:24:38.320337: step: 260/77, loss: 0.0014774063602089882 2023-01-22 15:24:39.727603: step: 264/77, loss: 0.0020001463126391172 2023-01-22 15:24:41.130439: step: 268/77, loss: 0.0023930747993290424 2023-01-22 15:24:42.546083: step: 272/77, loss: 0.0007885292870923877 2023-01-22 15:24:43.918814: step: 276/77, loss: 1.6910353224375285e-05 2023-01-22 15:24:45.329230: step: 280/77, loss: 0.019321920350193977 2023-01-22 15:24:46.807491: step: 284/77, loss: 0.0010869840625673532 2023-01-22 15:24:48.281562: step: 288/77, loss: 0.0006996995653025806 2023-01-22 15:24:49.701015: step: 292/77, loss: 0.00034761650022119284 2023-01-22 15:24:51.084424: step: 296/77, loss: 0.0013859684113413095 2023-01-22 15:24:52.528220: step: 300/77, loss: 0.0007678261026740074 2023-01-22 15:24:53.973103: step: 304/77, loss: 0.006496158894151449 2023-01-22 15:24:55.384082: step: 308/77, loss: 0.0008472168119624257 2023-01-22 15:24:56.834394: step: 312/77, loss: 0.0007091138977557421 2023-01-22 15:24:58.314756: step: 316/77, loss: 0.002391271060332656 2023-01-22 15:24:59.790420: step: 320/77, loss: 0.00010580530943116173 2023-01-22 15:25:01.186878: step: 324/77, loss: 0.013863954693078995 2023-01-22 15:25:02.646438: step: 328/77, loss: 0.036960527300834656 2023-01-22 15:25:04.085861: step: 332/77, loss: 0.003463400062173605 2023-01-22 
15:25:05.491092: step: 336/77, loss: 0.037611398845911026 2023-01-22 15:25:06.927138: step: 340/77, loss: 6.835413387307199e-06 2023-01-22 15:25:08.329057: step: 344/77, loss: 0.010961107909679413 2023-01-22 15:25:09.786807: step: 348/77, loss: 0.002686964115127921 2023-01-22 15:25:11.258450: step: 352/77, loss: 0.02801627665758133 2023-01-22 15:25:12.727300: step: 356/77, loss: 3.1587826470058644e-06 2023-01-22 15:25:14.227088: step: 360/77, loss: 0.010630859062075615 2023-01-22 15:25:15.640659: step: 364/77, loss: 0.00018246965191792697 2023-01-22 15:25:17.078841: step: 368/77, loss: 0.0010099156061187387 2023-01-22 15:25:18.479764: step: 372/77, loss: 0.09281604737043381 2023-01-22 15:25:19.896174: step: 376/77, loss: 7.082978845573962e-05 2023-01-22 15:25:21.374075: step: 380/77, loss: 0.008714951574802399 2023-01-22 15:25:22.829126: step: 384/77, loss: 0.0015550435055047274 2023-01-22 15:25:24.283483: step: 388/77, loss: 0.000112787245598156 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Chinese: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.5813953488372093, 'r': 0.023946360153256706, 'f1': 0.045998160073597055}, 'combined': 0.031580826319186045, 'epoch': 14} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Korean: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.5813953488372093, 'r': 0.023946360153256706, 'f1': 0.045998160073597055}, 'combined': 0.031580826319186045, 'epoch': 14} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Russian: {'template': {'p': 0.9333333333333333, 'r': 0.5511811023622047, 'f1': 0.693069306930693}, 'slot': {'p': 0.5813953488372093, 'r': 0.023946360153256706, 'f1': 0.045998160073597055}, 'combined': 0.03187991292229499, 'epoch': 14} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 14} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 14} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 14} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 
'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 15 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:26:57.126665: step: 4/77, loss: 5.726131712435745e-05 2023-01-22 15:26:58.529146: step: 8/77, loss: 1.0668932191038039e-06 2023-01-22 15:27:00.017089: step: 12/77, loss: 0.010979668237268925 2023-01-22 15:27:01.462814: step: 16/77, loss: 0.00023395358584821224 2023-01-22 15:27:02.931936: step: 20/77, loss: 3.398751141503453e-05 2023-01-22 15:27:04.359976: step: 24/77, loss: 0.00014000166265759617 2023-01-22 15:27:05.831910: step: 28/77, loss: 0.004639660008251667 2023-01-22 15:27:07.319773: step: 32/77, loss: 0.00677479337900877 2023-01-22 15:27:08.770505: step: 36/77, loss: 0.00034749071346595883 2023-01-22 15:27:10.123695: step: 40/77, loss: 4.043898297823034e-05 2023-01-22 15:27:11.555183: step: 44/77, loss: 7.587048457935452e-05 2023-01-22 15:27:13.050640: step: 48/77, loss: 0.0009193161968141794 2023-01-22 15:27:14.526400: step: 52/77, loss: 0.12788920104503632 2023-01-22 15:27:16.109900: step: 56/77, loss: 0.021221794188022614 2023-01-22 15:27:17.591241: step: 60/77, loss: 0.00216305092908442 2023-01-22 15:27:19.061806: step: 64/77, loss: 0.0004200602415949106 2023-01-22 15:27:20.436713: step: 68/77, loss: 0.00044831325067207217 2023-01-22 15:27:21.842891: step: 72/77, loss: 0.00012400533887557685 2023-01-22 15:27:23.286184: step: 76/77, loss: 0.01535642147064209 2023-01-22 15:27:24.698759: step: 80/77, loss: 0.005503884982317686 2023-01-22 15:27:26.177253: step: 84/77, loss: 0.0016196584329009056 2023-01-22 15:27:27.712792: step: 88/77, loss: 0.0006185656529851258 2023-01-22 15:27:29.151087: step: 92/77, loss: 0.03347449004650116 2023-01-22 15:27:30.580075: step: 96/77, loss: 0.002699061296880245 2023-01-22 15:27:31.972063: step: 100/77, loss: 0.013255964033305645 2023-01-22 15:27:33.355154: step: 104/77, loss: 0.012433169409632683 2023-01-22 15:27:34.755487: step: 108/77, loss: 0.011180263012647629 2023-01-22 15:27:36.204084: step: 112/77, loss: 6.892575038364157e-05 
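[Editorial note on "Current best result"] From epoch 1 onward the Dev 'combined' score in this run stays fixed at 0.05179909351586346, which is why every "Current best result" block keeps pointing back to epoch 1 even as the test-set numbers fluctuate. A plausible reading is that the best checkpoint is retained only when the dev 'combined' score strictly improves; the snippet below sketches that selection rule as an assumption, not as the actual train.py logic.

# Hypothetical keep-best-by-dev-score rule (assumed, not copied from train.py).
best = {"combined": float("-inf"), "epoch": None}

def update_best(dev_combined: float, epoch: int) -> bool:
    """Keep the checkpoint only when the dev 'combined' score strictly improves."""
    if dev_combined > best["combined"]:
        best["combined"], best["epoch"] = dev_combined, epoch
        return True
    return False

update_best(0.05179909351586346, epoch=1)    # True  -> epoch 1 becomes the best
update_best(0.05179909351586346, epoch=12)   # False -> "Current best result" still reports epoch 1
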
2023-01-22 15:27:37.678777: step: 116/77, loss: 0.006360053550451994 2023-01-22 15:27:39.159094: step: 120/77, loss: 4.398596865939908e-05 2023-01-22 15:27:40.662615: step: 124/77, loss: 8.79165753531197e-08 2023-01-22 15:27:42.116888: step: 128/77, loss: 0.0015617008320987225 2023-01-22 15:27:43.566497: step: 132/77, loss: 0.0014235922135412693 2023-01-22 15:27:45.065633: step: 136/77, loss: 1.4796501091041137e-06 2023-01-22 15:27:46.511188: step: 140/77, loss: 0.00906049832701683 2023-01-22 15:27:47.923056: step: 144/77, loss: 0.09630610793828964 2023-01-22 15:27:49.305317: step: 148/77, loss: 0.006877141539007425 2023-01-22 15:27:50.828960: step: 152/77, loss: 3.0073049856582657e-05 2023-01-22 15:27:52.361225: step: 156/77, loss: 0.0069508543238043785 2023-01-22 15:27:53.788160: step: 160/77, loss: 0.00041814552969299257 2023-01-22 15:27:55.300987: step: 164/77, loss: 0.0021190126426517963 2023-01-22 15:27:56.724444: step: 168/77, loss: 0.019956491887569427 2023-01-22 15:27:58.179981: step: 172/77, loss: 0.0012716761557385325 2023-01-22 15:27:59.540209: step: 176/77, loss: 7.605523569509387e-05 2023-01-22 15:28:00.988338: step: 180/77, loss: 0.0010978883365169168 2023-01-22 15:28:02.484336: step: 184/77, loss: 0.0022109723649919033 2023-01-22 15:28:03.881774: step: 188/77, loss: 2.864640009647701e-05 2023-01-22 15:28:05.338316: step: 192/77, loss: 0.00612494396045804 2023-01-22 15:28:06.770613: step: 196/77, loss: 5.0919388741021976e-05 2023-01-22 15:28:08.214126: step: 200/77, loss: 0.08609358966350555 2023-01-22 15:28:09.686336: step: 204/77, loss: 0.008768275380134583 2023-01-22 15:28:11.061026: step: 208/77, loss: 0.0034861420281231403 2023-01-22 15:28:12.475878: step: 212/77, loss: 0.0052656326442956924 2023-01-22 15:28:13.967128: step: 216/77, loss: 0.00081259710714221 2023-01-22 15:28:15.443082: step: 220/77, loss: 9.202796354657039e-05 2023-01-22 15:28:16.862681: step: 224/77, loss: 0.00012689913273788989 2023-01-22 15:28:18.283139: step: 228/77, loss: 0.00024755820049904287 2023-01-22 15:28:19.780016: step: 232/77, loss: 0.01208973303437233 2023-01-22 15:28:21.194063: step: 236/77, loss: 9.298473742092028e-06 2023-01-22 15:28:22.627046: step: 240/77, loss: 0.0006560988258570433 2023-01-22 15:28:24.108267: step: 244/77, loss: 0.017205968499183655 2023-01-22 15:28:25.512967: step: 248/77, loss: 0.000252981495577842 2023-01-22 15:28:26.945245: step: 252/77, loss: 0.009462903253734112 2023-01-22 15:28:28.452566: step: 256/77, loss: 1.2990919458388817e-05 2023-01-22 15:28:29.850037: step: 260/77, loss: 0.007943823002278805 2023-01-22 15:28:31.232348: step: 264/77, loss: 0.00019092296133749187 2023-01-22 15:28:32.717534: step: 268/77, loss: 0.00018750665185507387 2023-01-22 15:28:34.107534: step: 272/77, loss: 0.0015075051924213767 2023-01-22 15:28:35.598621: step: 276/77, loss: 2.0175839381408878e-06 2023-01-22 15:28:37.018159: step: 280/77, loss: 0.006988356821238995 2023-01-22 15:28:38.374240: step: 284/77, loss: 1.0245306839351542e-05 2023-01-22 15:28:39.813851: step: 288/77, loss: 3.008390649483772e-06 2023-01-22 15:28:41.257092: step: 292/77, loss: 0.0007987078279256821 2023-01-22 15:28:42.738573: step: 296/77, loss: 0.02645926922559738 2023-01-22 15:28:44.220733: step: 300/77, loss: 0.05119650065898895 2023-01-22 15:28:45.658515: step: 304/77, loss: 0.0066785989329218864 2023-01-22 15:28:47.163828: step: 308/77, loss: 0.00010360310989199206 2023-01-22 15:28:48.594403: step: 312/77, loss: 5.426846837508492e-05 2023-01-22 15:28:49.990595: step: 316/77, loss: 
0.0004964179242961109 2023-01-22 15:28:51.450867: step: 320/77, loss: 0.004674556199461222 2023-01-22 15:28:52.931400: step: 324/77, loss: 0.014335057698190212 2023-01-22 15:28:54.358623: step: 328/77, loss: 0.005834832787513733 2023-01-22 15:28:55.825841: step: 332/77, loss: 1.1280006901870365e-06 2023-01-22 15:28:57.206649: step: 336/77, loss: 0.001265389146283269 2023-01-22 15:28:58.534171: step: 340/77, loss: 0.01683085784316063 2023-01-22 15:28:59.995223: step: 344/77, loss: 0.002331322291865945 2023-01-22 15:29:01.447034: step: 348/77, loss: 0.06121343374252319 2023-01-22 15:29:02.869035: step: 352/77, loss: 0.07166502624750137 2023-01-22 15:29:04.336616: step: 356/77, loss: 0.07953915745019913 2023-01-22 15:29:05.731349: step: 360/77, loss: 0.0010253022192046046 2023-01-22 15:29:07.146014: step: 364/77, loss: 0.0006745964055880904 2023-01-22 15:29:08.596730: step: 368/77, loss: 0.05614553391933441 2023-01-22 15:29:10.020626: step: 372/77, loss: 0.0018698268104344606 2023-01-22 15:29:11.519196: step: 376/77, loss: 0.01316850259900093 2023-01-22 15:29:13.024120: step: 380/77, loss: 0.0011994204251095653 2023-01-22 15:29:14.464741: step: 384/77, loss: 0.002858781488612294 2023-01-22 15:29:15.900650: step: 388/77, loss: 0.0025763108860701323 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Chinese: {'template': {'p': 0.9113924050632911, 'r': 0.5669291338582677, 'f1': 0.6990291262135924}, 'slot': {'p': 0.5476190476190477, 'r': 0.022030651340996167, 'f1': 0.04235727440147329}, 'combined': 0.02960896851365124, 'epoch': 15} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Korean: {'template': {'p': 0.9113924050632911, 'r': 0.5669291338582677, 'f1': 0.6990291262135924}, 'slot': {'p': 0.5476190476190477, 'r': 0.022030651340996167, 'f1': 0.04235727440147329}, 'combined': 0.02960896851365124, 'epoch': 15} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Russian: {'template': {'p': 0.9113924050632911, 'r': 0.5669291338582677, 'f1': 0.6990291262135924}, 'slot': {'p': 0.5476190476190477, 'r': 0.022030651340996167, 'f1': 0.04235727440147329}, 'combined': 0.02960896851365124, 'epoch': 15} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 15} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 15} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 15} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': 
{'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 16 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:30:48.974441: step: 4/77, loss: 0.0017317838501185179 2023-01-22 15:30:50.442226: step: 8/77, loss: 0.00017449873848818243 2023-01-22 15:30:51.924511: step: 12/77, loss: 0.0019176813075318933 2023-01-22 15:30:53.426960: step: 16/77, loss: 0.0006079123122617602 2023-01-22 15:30:54.829472: step: 20/77, loss: 0.02931329235434532 2023-01-22 15:30:56.185729: step: 24/77, loss: 0.0022387776989489794 2023-01-22 15:30:57.684489: step: 28/77, loss: 0.013994835317134857 2023-01-22 15:30:59.062153: step: 32/77, loss: 0.0007644444704055786 2023-01-22 15:31:00.478352: step: 36/77, loss: 0.001288002822548151 2023-01-22 15:31:01.858623: step: 40/77, loss: 0.000167182763107121 2023-01-22 15:31:03.308850: step: 44/77, loss: 0.1361227184534073 2023-01-22 15:31:04.753196: step: 48/77, loss: 0.036910489201545715 2023-01-22 15:31:06.181938: step: 52/77, loss: 0.0011050781467929482 2023-01-22 15:31:07.599379: step: 56/77, loss: 0.0003216075128875673 2023-01-22 15:31:09.083022: step: 60/77, loss: 0.0005953653017058969 2023-01-22 15:31:10.481318: step: 64/77, loss: 0.0030550111550837755 2023-01-22 15:31:11.856227: step: 68/77, loss: 0.00010496083996258676 2023-01-22 15:31:13.275584: step: 72/77, loss: 1.6167392686838866e-06 2023-01-22 15:31:14.692557: step: 76/77, loss: 5.200486157264095e-07 2023-01-22 15:31:16.076612: step: 80/77, loss: 0.0435759499669075 2023-01-22 15:31:17.546175: step: 84/77, loss: 4.8785190301714465e-05 2023-01-22 15:31:19.045159: step: 88/77, loss: 1.1175652616657317e-06 2023-01-22 15:31:20.533193: step: 92/77, loss: 0.02145608328282833 2023-01-22 15:31:21.933840: step: 96/77, loss: 
0.00851452723145485 2023-01-22 15:31:23.378075: step: 100/77, loss: 0.010028223507106304 2023-01-22 15:31:24.832017: step: 104/77, loss: 0.002004031091928482 2023-01-22 15:31:26.246577: step: 108/77, loss: 0.0022274067159742117 2023-01-22 15:31:27.637767: step: 112/77, loss: 3.978437234763987e-06 2023-01-22 15:31:29.113259: step: 116/77, loss: 0.0087112532928586 2023-01-22 15:31:30.579017: step: 120/77, loss: 0.08700189739465714 2023-01-22 15:31:31.957179: step: 124/77, loss: 3.882900728058303e-06 2023-01-22 15:31:33.470474: step: 128/77, loss: 0.04807797074317932 2023-01-22 15:31:34.937566: step: 132/77, loss: 7.73815845604986e-05 2023-01-22 15:31:36.326326: step: 136/77, loss: 0.006546195596456528 2023-01-22 15:31:37.791917: step: 140/77, loss: 2.905712506162672e-07 2023-01-22 15:31:39.191308: step: 144/77, loss: 0.005281643010675907 2023-01-22 15:31:40.636521: step: 148/77, loss: 0.02344752661883831 2023-01-22 15:31:42.084914: step: 152/77, loss: 3.428779382375069e-05 2023-01-22 15:31:43.499105: step: 156/77, loss: 1.6352558304788545e-05 2023-01-22 15:31:44.967585: step: 160/77, loss: 2.0265525790819083e-07 2023-01-22 15:31:46.379808: step: 164/77, loss: 0.006970589514821768 2023-01-22 15:31:47.848890: step: 168/77, loss: 1.9060935301240534e-05 2023-01-22 15:31:49.357632: step: 172/77, loss: 9.926483471645042e-06 2023-01-22 15:31:50.821358: step: 176/77, loss: 0.006762489676475525 2023-01-22 15:31:52.306205: step: 180/77, loss: 1.1130903203593334e-06 2023-01-22 15:31:53.745473: step: 184/77, loss: 0.0023053919430822134 2023-01-22 15:31:55.190628: step: 188/77, loss: 3.5974600905319676e-05 2023-01-22 15:31:56.589533: step: 192/77, loss: 0.002892887219786644 2023-01-22 15:31:57.990542: step: 196/77, loss: 8.694143616594374e-05 2023-01-22 15:31:59.452655: step: 200/77, loss: 6.555604340974241e-05 2023-01-22 15:32:00.918406: step: 204/77, loss: 2.2872816771268845e-05 2023-01-22 15:32:02.355775: step: 208/77, loss: 0.1039767637848854 2023-01-22 15:32:03.756764: step: 212/77, loss: 0.00019675935618579388 2023-01-22 15:32:05.185994: step: 216/77, loss: 0.0009639389463700354 2023-01-22 15:32:06.622358: step: 220/77, loss: 0.04210824519395828 2023-01-22 15:32:08.026394: step: 224/77, loss: 0.0059523447416722775 2023-01-22 15:32:09.491295: step: 228/77, loss: 0.031761154532432556 2023-01-22 15:32:11.029961: step: 232/77, loss: 0.0009827163303270936 2023-01-22 15:32:12.462307: step: 236/77, loss: 0.006473634857684374 2023-01-22 15:32:13.868826: step: 240/77, loss: 0.0009058440336957574 2023-01-22 15:32:15.286084: step: 244/77, loss: 0.0001385786454193294 2023-01-22 15:32:16.676080: step: 248/77, loss: 0.07064634561538696 2023-01-22 15:32:18.088182: step: 252/77, loss: 0.012696346268057823 2023-01-22 15:32:19.468712: step: 256/77, loss: 0.009758692234754562 2023-01-22 15:32:20.916295: step: 260/77, loss: 0.03342214971780777 2023-01-22 15:32:22.299028: step: 264/77, loss: 0.0009017561678774655 2023-01-22 15:32:23.745052: step: 268/77, loss: 2.8461090550990775e-07 2023-01-22 15:32:25.237649: step: 272/77, loss: 0.003868951927870512 2023-01-22 15:32:26.633932: step: 276/77, loss: 0.015315387398004532 2023-01-22 15:32:28.053087: step: 280/77, loss: 0.017557790502905846 2023-01-22 15:32:29.464740: step: 284/77, loss: 0.00023310637334361672 2023-01-22 15:32:30.916596: step: 288/77, loss: 0.019398299977183342 2023-01-22 15:32:32.370170: step: 292/77, loss: 0.0005344029632396996 2023-01-22 15:32:33.820290: step: 296/77, loss: 0.06150564178824425 2023-01-22 15:32:35.243816: step: 300/77, loss: 
0.00043789047049358487 2023-01-22 15:32:36.587222: step: 304/77, loss: 0.03489608317613602 2023-01-22 15:32:38.073816: step: 308/77, loss: 0.0021844443399459124 2023-01-22 15:32:39.579949: step: 312/77, loss: 0.0009032661910168827 2023-01-22 15:32:41.036932: step: 316/77, loss: 0.0002276599989272654 2023-01-22 15:32:42.476404: step: 320/77, loss: 0.01691807247698307 2023-01-22 15:32:43.882790: step: 324/77, loss: 0.0012815390946343541 2023-01-22 15:32:45.314443: step: 328/77, loss: 0.03847289830446243 2023-01-22 15:32:46.841369: step: 332/77, loss: 0.00035395106533542275 2023-01-22 15:32:48.285387: step: 336/77, loss: 0.03352927416563034 2023-01-22 15:32:49.807703: step: 340/77, loss: 0.0032856473699212074 2023-01-22 15:32:51.211519: step: 344/77, loss: 0.013948741368949413 2023-01-22 15:32:52.662818: step: 348/77, loss: 0.00172902038320899 2023-01-22 15:32:54.135947: step: 352/77, loss: 0.06885801255702972 2023-01-22 15:32:55.626027: step: 356/77, loss: 1.0430716201881296e-06 2023-01-22 15:32:56.977429: step: 360/77, loss: 0.0005146845942363143 2023-01-22 15:32:58.420092: step: 364/77, loss: 0.0019680324476212263 2023-01-22 15:32:59.824532: step: 368/77, loss: 6.528956873808056e-05 2023-01-22 15:33:01.312464: step: 372/77, loss: 0.04392620548605919 2023-01-22 15:33:02.732759: step: 376/77, loss: 0.0037286796141415834 2023-01-22 15:33:04.248005: step: 380/77, loss: 0.11266662180423737 2023-01-22 15:33:05.687115: step: 384/77, loss: 0.0025158552452921867 2023-01-22 15:33:07.124145: step: 388/77, loss: 0.013057614676654339 ================================================== Loss: 0.014 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16} Test Chinese: {'template': {'p': 0.8888888888888888, 'r': 0.5669291338582677, 'f1': 0.6923076923076924}, 'slot': {'p': 0.5, 'r': 0.023946360153256706, 'f1': 0.045703839122486295}, 'combined': 0.031641119392490515, 'epoch': 16} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16} Test Korean: {'template': {'p': 0.8888888888888888, 'r': 0.5669291338582677, 'f1': 0.6923076923076924}, 'slot': {'p': 0.5, 'r': 0.023946360153256706, 'f1': 0.045703839122486295}, 'combined': 0.031641119392490515, 'epoch': 16} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16} Test Russian: {'template': {'p': 0.8888888888888888, 'r': 0.5669291338582677, 'f1': 0.6923076923076924}, 'slot': {'p': 0.5, 'r': 0.023946360153256706, 'f1': 0.045703839122486295}, 'combined': 0.031641119392490515, 'epoch': 16} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 16} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 16} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 16} ================================================== Current best result: -------------------- 
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 17 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:34:39.646862: step: 4/77, loss: 0.0005133425584062934 2023-01-22 15:34:41.056488: step: 8/77, loss: 0.25700217485427856 2023-01-22 15:34:42.456048: step: 12/77, loss: 1.1277808880549856e-05 2023-01-22 15:34:43.887958: step: 16/77, loss: 0.0005257083103060722 2023-01-22 15:34:45.328456: step: 20/77, loss: 2.32608708756743e-05 2023-01-22 15:34:46.758846: step: 24/77, loss: 1.5358520613517612e-05 2023-01-22 15:34:48.218742: step: 28/77, loss: 0.00023268604127224535 2023-01-22 15:34:49.681991: step: 32/77, loss: 0.00011589200585149229 2023-01-22 15:34:51.134377: step: 36/77, loss: 6.274050065258052e-06 2023-01-22 15:34:52.620398: step: 40/77, loss: 0.004648436326533556 2023-01-22 15:34:54.039023: step: 44/77, loss: 0.01249587070196867 2023-01-22 15:34:55.507521: step: 48/77, loss: 2.8521142667159438e-05 2023-01-22 15:34:56.970652: step: 52/77, loss: 2.7245456294622272e-05 2023-01-22 15:34:58.458611: step: 56/77, loss: 0.006364437751471996 2023-01-22 15:34:59.954507: step: 60/77, loss: 0.006027623545378447 2023-01-22 15:35:01.396385: step: 64/77, loss: 0.01018624659627676 2023-01-22 15:35:02.896148: step: 68/77, loss: 0.0036279428750276566 2023-01-22 15:35:04.354780: step: 72/77, loss: 0.0003061688330490142 2023-01-22 15:35:05.783649: step: 76/77, loss: 6.950273018446751e-06 2023-01-22 15:35:07.207013: step: 80/77, loss: 0.0008913867641240358 2023-01-22 
15:35:08.665205: step: 84/77, loss: 0.0008618387510068715 2023-01-22 15:35:10.086380: step: 88/77, loss: 0.0005809272988699377 2023-01-22 15:35:11.472140: step: 92/77, loss: 9.566392691340297e-07 2023-01-22 15:35:12.877521: step: 96/77, loss: 9.808649338083342e-06 2023-01-22 15:35:14.324977: step: 100/77, loss: 0.0003643590025603771 2023-01-22 15:35:15.799898: step: 104/77, loss: 2.4079831746348646e-06 2023-01-22 15:35:17.224613: step: 108/77, loss: 0.030138032510876656 2023-01-22 15:35:18.619521: step: 112/77, loss: 1.3186123396735638e-05 2023-01-22 15:35:20.082306: step: 116/77, loss: 0.0002886793517973274 2023-01-22 15:35:21.560257: step: 120/77, loss: 0.00036135007394477725 2023-01-22 15:35:23.049870: step: 124/77, loss: 0.00021528334764298052 2023-01-22 15:35:24.513765: step: 128/77, loss: 0.0020658867433667183 2023-01-22 15:35:25.966360: step: 132/77, loss: 0.015242397785186768 2023-01-22 15:35:27.419384: step: 136/77, loss: 1.6509604392922483e-05 2023-01-22 15:35:28.885530: step: 140/77, loss: 0.06758764386177063 2023-01-22 15:35:30.329756: step: 144/77, loss: 0.035183727741241455 2023-01-22 15:35:31.798085: step: 148/77, loss: 0.0010059047490358353 2023-01-22 15:35:33.252080: step: 152/77, loss: 0.00012491221423260868 2023-01-22 15:35:34.658622: step: 156/77, loss: 0.0017725983634591103 2023-01-22 15:35:36.030128: step: 160/77, loss: 0.007730530109256506 2023-01-22 15:35:37.495367: step: 164/77, loss: 0.026737932115793228 2023-01-22 15:35:38.936394: step: 168/77, loss: 0.001432826858945191 2023-01-22 15:35:40.362028: step: 172/77, loss: 0.019602090120315552 2023-01-22 15:35:41.822826: step: 176/77, loss: 0.00027316712657921016 2023-01-22 15:35:43.255638: step: 180/77, loss: 0.023925138637423515 2023-01-22 15:35:44.666797: step: 184/77, loss: 8.213570254156366e-05 2023-01-22 15:35:46.149305: step: 188/77, loss: 0.007336677052080631 2023-01-22 15:35:47.602017: step: 192/77, loss: 0.022841282188892365 2023-01-22 15:35:49.097686: step: 196/77, loss: 0.0019216712098568678 2023-01-22 15:35:50.562085: step: 200/77, loss: 0.039149291813373566 2023-01-22 15:35:51.988211: step: 204/77, loss: 0.0003298427618574351 2023-01-22 15:35:53.426439: step: 208/77, loss: 0.0005000152159482241 2023-01-22 15:35:54.876011: step: 212/77, loss: 0.006282121874392033 2023-01-22 15:35:56.300138: step: 216/77, loss: 0.029873473569750786 2023-01-22 15:35:57.723186: step: 220/77, loss: 0.0012654258171096444 2023-01-22 15:35:59.172019: step: 224/77, loss: 0.007508468348532915 2023-01-22 15:36:00.606477: step: 228/77, loss: 9.598219185136259e-05 2023-01-22 15:36:02.043696: step: 232/77, loss: 0.00014405997353605926 2023-01-22 15:36:03.466815: step: 236/77, loss: 0.009656374342739582 2023-01-22 15:36:04.907903: step: 240/77, loss: 0.00014686529175378382 2023-01-22 15:36:06.310617: step: 244/77, loss: 0.11297740042209625 2023-01-22 15:36:07.751314: step: 248/77, loss: 0.00041073394822888076 2023-01-22 15:36:09.174064: step: 252/77, loss: 0.0016093285521492362 2023-01-22 15:36:10.581638: step: 256/77, loss: 0.003659533802419901 2023-01-22 15:36:11.954796: step: 260/77, loss: 0.008596748113632202 2023-01-22 15:36:13.402109: step: 264/77, loss: 1.2040032970617176e-06 2023-01-22 15:36:14.860201: step: 268/77, loss: 0.011089028790593147 2023-01-22 15:36:16.332673: step: 272/77, loss: 0.0005412441096268594 2023-01-22 15:36:17.761470: step: 276/77, loss: 0.000450003775767982 2023-01-22 15:36:19.214376: step: 280/77, loss: 0.0003825658932328224 2023-01-22 15:36:20.679038: step: 284/77, loss: 0.01967022567987442 2023-01-22 
15:36:22.096080: step: 288/77, loss: 0.02259012684226036 2023-01-22 15:36:23.536707: step: 292/77, loss: 0.0008718750905245543 2023-01-22 15:36:24.956145: step: 296/77, loss: 0.024363992735743523 2023-01-22 15:36:26.351682: step: 300/77, loss: 0.00037443527253344655 2023-01-22 15:36:27.762955: step: 304/77, loss: 0.0005880811950191855 2023-01-22 15:36:29.196650: step: 308/77, loss: 0.009968264028429985 2023-01-22 15:36:30.628421: step: 312/77, loss: 0.000298752129310742 2023-01-22 15:36:32.048777: step: 316/77, loss: 6.792229396523908e-05 2023-01-22 15:36:33.477937: step: 320/77, loss: 1.5058786630106624e-05 2023-01-22 15:36:34.955704: step: 324/77, loss: 0.0006238286150619388 2023-01-22 15:36:36.410931: step: 328/77, loss: 0.0024789697490632534 2023-01-22 15:36:37.848814: step: 332/77, loss: 5.27570364283747e-06 2023-01-22 15:36:39.249676: step: 336/77, loss: 7.577707583550364e-05 2023-01-22 15:36:40.682204: step: 340/77, loss: 0.00022022971825208515 2023-01-22 15:36:42.127976: step: 344/77, loss: 0.00014018295041751117 2023-01-22 15:36:43.478976: step: 348/77, loss: 0.001186000881716609 2023-01-22 15:36:44.892758: step: 352/77, loss: 5.855800282006385e-06 2023-01-22 15:36:46.345184: step: 356/77, loss: 0.001700332504697144 2023-01-22 15:36:47.808787: step: 360/77, loss: 8.14079976407811e-05 2023-01-22 15:36:49.307072: step: 364/77, loss: 1.452830360904045e-06 2023-01-22 15:36:50.796834: step: 368/77, loss: 1.4275110515882261e-06 2023-01-22 15:36:52.219270: step: 372/77, loss: 2.2887088562129065e-06 2023-01-22 15:36:53.654328: step: 376/77, loss: 0.01788274385035038 2023-01-22 15:36:55.130627: step: 380/77, loss: 0.0008733969880267978 2023-01-22 15:36:56.605277: step: 384/77, loss: 0.032448168843984604 2023-01-22 15:36:58.052584: step: 388/77, loss: 0.02830406092107296 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Chinese: {'template': {'p': 0.9210526315789473, 'r': 0.5511811023622047, 'f1': 0.6896551724137933}, 'slot': {'p': 0.5952380952380952, 'r': 0.023946360153256706, 'f1': 0.04604051565377532}, 'combined': 0.03175207976122437, 'epoch': 17} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Korean: {'template': {'p': 0.9210526315789473, 'r': 0.5511811023622047, 'f1': 0.6896551724137933}, 'slot': {'p': 0.5813953488372093, 'r': 0.023946360153256706, 'f1': 0.045998160073597055}, 'combined': 0.03172286901627384, 'epoch': 17} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Russian: {'template': {'p': 0.922077922077922, 'r': 0.5590551181102362, 'f1': 0.696078431372549}, 'slot': {'p': 0.5714285714285714, 'r': 0.022988505747126436, 'f1': 0.044198895027624314}, 'combined': 0.030765897519228688, 'epoch': 17} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 17} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 
17} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 17} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 18 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:38:31.400554: step: 4/77, loss: 0.0035507383290678263 2023-01-22 15:38:32.874910: step: 8/77, loss: 0.023755474016070366 2023-01-22 15:38:34.353149: step: 12/77, loss: 0.00019814765255432576 2023-01-22 15:38:35.849806: step: 16/77, loss: 0.1547447144985199 2023-01-22 15:38:37.308129: step: 20/77, loss: 0.0035823448561131954 2023-01-22 15:38:38.704162: step: 24/77, loss: 0.053595662117004395 2023-01-22 15:38:40.111048: step: 28/77, loss: 0.05061209201812744 2023-01-22 15:38:41.542895: step: 32/77, loss: 1.4544233636115678e-05 2023-01-22 15:38:43.048091: step: 36/77, loss: 0.00016624647832941264 2023-01-22 15:38:44.547110: step: 40/77, loss: 0.0011924817226827145 2023-01-22 15:38:46.024338: step: 44/77, loss: 1.3604301329905866e-06 2023-01-22 15:38:47.509235: step: 48/77, loss: 0.00012266525300219655 2023-01-22 15:38:49.030209: step: 52/77, loss: 0.002537267515435815 2023-01-22 15:38:50.399832: step: 56/77, loss: 2.703541576920543e-05 2023-01-22 15:38:51.888390: step: 60/77, loss: 0.0020698516163975 2023-01-22 15:38:53.330500: step: 64/77, loss: 
0.0038298233412206173 2023-01-22 15:38:54.686916: step: 68/77, loss: 0.0011645182967185974 2023-01-22 15:38:56.123354: step: 72/77, loss: 0.001665165415033698 2023-01-22 15:38:57.531535: step: 76/77, loss: 5.182696622796357e-05 2023-01-22 15:38:59.034884: step: 80/77, loss: 4.138336589676328e-05 2023-01-22 15:39:00.470529: step: 84/77, loss: 0.008358081802725792 2023-01-22 15:39:01.948106: step: 88/77, loss: 6.886433402542025e-05 2023-01-22 15:39:03.421511: step: 92/77, loss: 6.711319201713195e-06 2023-01-22 15:39:04.946408: step: 96/77, loss: 0.00394531711935997 2023-01-22 15:39:06.353064: step: 100/77, loss: 0.008956738747656345 2023-01-22 15:39:07.767020: step: 104/77, loss: 0.004136328119784594 2023-01-22 15:39:09.239273: step: 108/77, loss: 0.007444991730153561 2023-01-22 15:39:10.678619: step: 112/77, loss: 0.008587652817368507 2023-01-22 15:39:12.106802: step: 116/77, loss: 1.8953192920889705e-05 2023-01-22 15:39:13.505437: step: 120/77, loss: 1.6370984667446464e-05 2023-01-22 15:39:14.959006: step: 124/77, loss: 0.005422429647296667 2023-01-22 15:39:16.453642: step: 128/77, loss: 0.0005618298891931772 2023-01-22 15:39:17.880295: step: 132/77, loss: 0.001508870511315763 2023-01-22 15:39:19.272480: step: 136/77, loss: 5.82295615458861e-06 2023-01-22 15:39:20.685312: step: 140/77, loss: 2.2798707277615904e-07 2023-01-22 15:39:22.081283: step: 144/77, loss: 0.00012311548925936222 2023-01-22 15:39:23.527932: step: 148/77, loss: 0.0022769425995647907 2023-01-22 15:39:24.910985: step: 152/77, loss: 0.0014799899654462934 2023-01-22 15:39:26.344252: step: 156/77, loss: 0.0490715391933918 2023-01-22 15:39:27.758097: step: 160/77, loss: 0.0048894137144088745 2023-01-22 15:39:29.203316: step: 164/77, loss: 0.020756013691425323 2023-01-22 15:39:30.655782: step: 168/77, loss: 0.15315866470336914 2023-01-22 15:39:31.995845: step: 172/77, loss: 0.00010488485713722184 2023-01-22 15:39:33.489113: step: 176/77, loss: 0.0033859512768685818 2023-01-22 15:39:34.835473: step: 180/77, loss: 0.0009017707780003548 2023-01-22 15:39:36.247680: step: 184/77, loss: 2.3831764337955974e-05 2023-01-22 15:39:37.782003: step: 188/77, loss: 0.0002648954978212714 2023-01-22 15:39:39.269893: step: 192/77, loss: 0.000819819571916014 2023-01-22 15:39:40.775624: step: 196/77, loss: 0.0005232224939391017 2023-01-22 15:39:42.240122: step: 200/77, loss: 0.01611267402768135 2023-01-22 15:39:43.697909: step: 204/77, loss: 9.189666343445424e-06 2023-01-22 15:39:45.118719: step: 208/77, loss: 5.754145240643993e-05 2023-01-22 15:39:46.528920: step: 212/77, loss: 0.00015138850721996278 2023-01-22 15:39:47.929407: step: 216/77, loss: 8.713976421859115e-05 2023-01-22 15:39:49.402746: step: 220/77, loss: 0.0026574949733912945 2023-01-22 15:39:50.806662: step: 224/77, loss: 0.01463887095451355 2023-01-22 15:39:52.187461: step: 228/77, loss: 0.0003013443492818624 2023-01-22 15:39:53.663603: step: 232/77, loss: 0.0028723650611937046 2023-01-22 15:39:55.106017: step: 236/77, loss: 0.0006100260652601719 2023-01-22 15:39:56.525754: step: 240/77, loss: 9.001885337056592e-05 2023-01-22 15:39:57.984435: step: 244/77, loss: 0.11137441545724869 2023-01-22 15:39:59.423811: step: 248/77, loss: 5.242784027359448e-06 2023-01-22 15:40:00.874134: step: 252/77, loss: 1.0018793545896187e-05 2023-01-22 15:40:02.387067: step: 256/77, loss: 0.0025258022360503674 2023-01-22 15:40:03.826311: step: 260/77, loss: 4.1723239974089665e-08 2023-01-22 15:40:05.217116: step: 264/77, loss: 0.00875135324895382 2023-01-22 15:40:06.661273: step: 268/77, loss: 
4.1276001638834714e-07 2023-01-22 15:40:08.163474: step: 272/77, loss: 0.0002078042452922091 2023-01-22 15:40:09.639375: step: 276/77, loss: 1.3261702633826644e-06 2023-01-22 15:40:11.091813: step: 280/77, loss: 3.9751048461766914e-05 2023-01-22 15:40:12.531789: step: 284/77, loss: 0.0003330234612803906 2023-01-22 15:40:14.019492: step: 288/77, loss: 0.00946515891700983 2023-01-22 15:40:15.520645: step: 292/77, loss: 1.3010629118070938e-05 2023-01-22 15:40:17.014485: step: 296/77, loss: 0.00032809883123263717 2023-01-22 15:40:18.402214: step: 300/77, loss: 0.03821895644068718 2023-01-22 15:40:19.863359: step: 304/77, loss: 7.003527002780174e-08 2023-01-22 15:40:21.279048: step: 308/77, loss: 0.0028851618990302086 2023-01-22 15:40:22.761983: step: 312/77, loss: 0.00011543244181666523 2023-01-22 15:40:24.230127: step: 316/77, loss: 0.03925507888197899 2023-01-22 15:40:25.670195: step: 320/77, loss: 0.0011718106688931584 2023-01-22 15:40:27.115464: step: 324/77, loss: 0.00011222171451663598 2023-01-22 15:40:28.568984: step: 328/77, loss: 0.0006643851520493627 2023-01-22 15:40:30.016278: step: 332/77, loss: 3.662090966827236e-05 2023-01-22 15:40:31.465439: step: 336/77, loss: 0.05327032506465912 2023-01-22 15:40:32.984660: step: 340/77, loss: 1.6509970919287298e-06 2023-01-22 15:40:34.385422: step: 344/77, loss: 0.00027803683769889176 2023-01-22 15:40:35.790598: step: 348/77, loss: 5.260006901153247e-07 2023-01-22 15:40:37.220621: step: 352/77, loss: 0.031084762886166573 2023-01-22 15:40:38.624921: step: 356/77, loss: 0.00046038682921789587 2023-01-22 15:40:40.029142: step: 360/77, loss: 1.0281779339038621e-07 2023-01-22 15:40:41.420776: step: 364/77, loss: 2.38418529363571e-08 2023-01-22 15:40:42.893787: step: 368/77, loss: 0.02806309424340725 2023-01-22 15:40:44.287439: step: 372/77, loss: 0.0004971831804141402 2023-01-22 15:40:45.692804: step: 376/77, loss: 8.940695295223122e-09 2023-01-22 15:40:47.133597: step: 380/77, loss: 0.01742289401590824 2023-01-22 15:40:48.604472: step: 384/77, loss: 0.03783995285630226 2023-01-22 15:40:50.046917: step: 388/77, loss: 0.0002357118937652558 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Chinese: {'template': {'p': 0.9178082191780822, 'r': 0.5275590551181102, 'f1': 0.6699999999999999}, 'slot': {'p': 0.6571428571428571, 'r': 0.022030651340996167, 'f1': 0.04263206672845227}, 'combined': 0.028563484708063015, 'epoch': 18} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Korean: {'template': {'p': 0.9178082191780822, 'r': 0.5275590551181102, 'f1': 0.6699999999999999}, 'slot': {'p': 0.6388888888888888, 'r': 0.022030651340996167, 'f1': 0.04259259259259259}, 'combined': 0.028537037037037034, 'epoch': 18} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Russian: {'template': {'p': 0.9166666666666666, 'r': 0.5196850393700787, 'f1': 0.6633165829145728}, 'slot': {'p': 0.6571428571428571, 'r': 0.022030651340996167, 'f1': 0.04263206672845227}, 'combined': 0.02827855682490301, 'epoch': 18} Sample 
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 18} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 18} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 18} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 19 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:42:23.548419: step: 4/77, loss: 9.256213161279447e-06 2023-01-22 15:42:24.949627: step: 8/77, loss: 1.4101290616963524e-05 2023-01-22 15:42:26.392398: step: 12/77, loss: 0.00030928992782719433 2023-01-22 15:42:27.800071: step: 16/77, loss: 0.032265227288007736 2023-01-22 15:42:29.255389: step: 20/77, loss: 0.004820591304451227 2023-01-22 15:42:30.641758: step: 24/77, loss: 0.0005670119426213205 2023-01-22 15:42:32.080822: step: 28/77, loss: 0.041379909962415695 2023-01-22 15:42:33.557145: step: 32/77, loss: 0.00010742421000031754 2023-01-22 15:42:35.015476: step: 36/77, loss: 0.000136702336021699 2023-01-22 15:42:36.374984: step: 40/77, loss: 5.685820724465884e-05 2023-01-22 15:42:37.847964: step: 44/77, loss: 2.3468689960282063e-06 2023-01-22 
15:42:39.265572: step: 48/77, loss: 1.533298245703918e-06 2023-01-22 15:42:40.690805: step: 52/77, loss: 0.0012926801573485136 2023-01-22 15:42:42.088369: step: 56/77, loss: 0.004278821405023336 2023-01-22 15:42:43.551307: step: 60/77, loss: 0.0011613968526944518 2023-01-22 15:42:45.031407: step: 64/77, loss: 6.727749132551253e-05 2023-01-22 15:42:46.443877: step: 68/77, loss: 0.0437166802585125 2023-01-22 15:42:47.857705: step: 72/77, loss: 0.007670319639146328 2023-01-22 15:42:49.285625: step: 76/77, loss: 0.00045689233229495585 2023-01-22 15:42:50.743697: step: 80/77, loss: 0.0001816729491110891 2023-01-22 15:42:52.109832: step: 84/77, loss: 0.00024046326871030033 2023-01-22 15:42:53.531188: step: 88/77, loss: 0.00014642243331763893 2023-01-22 15:42:54.946604: step: 92/77, loss: 6.224327080417424e-05 2023-01-22 15:42:56.411803: step: 96/77, loss: 0.01375646609812975 2023-01-22 15:42:57.797336: step: 100/77, loss: 0.012539714574813843 2023-01-22 15:42:59.296056: step: 104/77, loss: 0.09291870146989822 2023-01-22 15:43:00.711604: step: 108/77, loss: 0.01918274164199829 2023-01-22 15:43:02.169703: step: 112/77, loss: 0.00028531564748845994 2023-01-22 15:43:03.653290: step: 116/77, loss: 0.0111021026968956 2023-01-22 15:43:05.158026: step: 120/77, loss: 0.043655481189489365 2023-01-22 15:43:06.674040: step: 124/77, loss: 4.4703465817974575e-08 2023-01-22 15:43:08.150323: step: 128/77, loss: 0.00031893019331619143 2023-01-22 15:43:09.568939: step: 132/77, loss: 0.05560840666294098 2023-01-22 15:43:10.965246: step: 136/77, loss: 0.006308365147560835 2023-01-22 15:43:12.389601: step: 140/77, loss: 0.0373319573700428 2023-01-22 15:43:13.832169: step: 144/77, loss: 0.023444000631570816 2023-01-22 15:43:15.333940: step: 148/77, loss: 0.1702856570482254 2023-01-22 15:43:16.779657: step: 152/77, loss: 6.703201506752521e-06 2023-01-22 15:43:18.168401: step: 156/77, loss: 0.0005974058294668794 2023-01-22 15:43:19.610003: step: 160/77, loss: 0.003022658172994852 2023-01-22 15:43:21.026718: step: 164/77, loss: 0.00225867610424757 2023-01-22 15:43:22.465181: step: 168/77, loss: 0.0022307855542749166 2023-01-22 15:43:23.843687: step: 172/77, loss: 7.827709487173706e-05 2023-01-22 15:43:25.226573: step: 176/77, loss: 0.0010960374493151903 2023-01-22 15:43:26.664676: step: 180/77, loss: 0.009178534150123596 2023-01-22 15:43:28.128086: step: 184/77, loss: 0.006633516401052475 2023-01-22 15:43:29.542177: step: 188/77, loss: 2.8188369469717145e-05 2023-01-22 15:43:30.938833: step: 192/77, loss: 3.6253209145797882e-06 2023-01-22 15:43:32.357283: step: 196/77, loss: 0.00032620763522572815 2023-01-22 15:43:33.834146: step: 200/77, loss: 2.422863190076896e-06 2023-01-22 15:43:35.239426: step: 204/77, loss: 0.019036108627915382 2023-01-22 15:43:36.717898: step: 208/77, loss: 0.0015177300665527582 2023-01-22 15:43:38.162493: step: 212/77, loss: 0.04273262992501259 2023-01-22 15:43:39.672689: step: 216/77, loss: 0.0003340786788612604 2023-01-22 15:43:41.166064: step: 220/77, loss: 0.005269180051982403 2023-01-22 15:43:42.646801: step: 224/77, loss: 1.6048367115217843e-06 2023-01-22 15:43:44.129036: step: 228/77, loss: 2.676109488675138e-06 2023-01-22 15:43:45.556532: step: 232/77, loss: 1.460295152355684e-06 2023-01-22 15:43:46.979356: step: 236/77, loss: 4.855666247749468e-06 2023-01-22 15:43:48.412918: step: 240/77, loss: 3.206528344890103e-05 2023-01-22 15:43:49.818652: step: 244/77, loss: 0.001754141179844737 2023-01-22 15:43:51.299199: step: 248/77, loss: 0.00017196766566485167 2023-01-22 15:43:52.733852: step: 
252/77, loss: 0.0006510470993816853 2023-01-22 15:43:54.119544: step: 256/77, loss: 0.0002761613577604294 2023-01-22 15:43:55.580238: step: 260/77, loss: 0.03558443859219551 2023-01-22 15:43:57.052682: step: 264/77, loss: 0.02022094652056694 2023-01-22 15:43:58.472660: step: 268/77, loss: 0.0002620831655804068 2023-01-22 15:43:59.897617: step: 272/77, loss: 3.918588390661171e-06 2023-01-22 15:44:01.347581: step: 276/77, loss: 0.029164662584662437 2023-01-22 15:44:02.799282: step: 280/77, loss: 0.0025813865941017866 2023-01-22 15:44:04.192483: step: 284/77, loss: 0.00010808744264068082 2023-01-22 15:44:05.665952: step: 288/77, loss: 0.022075019776821136 2023-01-22 15:44:07.139370: step: 292/77, loss: 0.0035410590935498476 2023-01-22 15:44:08.526928: step: 296/77, loss: 5.375422188080847e-05 2023-01-22 15:44:10.002402: step: 300/77, loss: 0.00038338624290190637 2023-01-22 15:44:11.402608: step: 304/77, loss: 9.834757008775341e-08 2023-01-22 15:44:12.825025: step: 308/77, loss: 0.020338894799351692 2023-01-22 15:44:14.258936: step: 312/77, loss: 0.00010872414713958278 2023-01-22 15:44:15.736382: step: 316/77, loss: 0.0019463624339550734 2023-01-22 15:44:17.162205: step: 320/77, loss: 9.368202881887555e-06 2023-01-22 15:44:18.651125: step: 324/77, loss: 0.035205304622650146 2023-01-22 15:44:20.147718: step: 328/77, loss: 1.2039956800435903e-06 2023-01-22 15:44:21.595625: step: 332/77, loss: 0.026829516515135765 2023-01-22 15:44:23.034107: step: 336/77, loss: 6.202932581800269e-06 2023-01-22 15:44:24.522093: step: 340/77, loss: 0.002215564949437976 2023-01-22 15:44:25.967684: step: 344/77, loss: 8.572654223826248e-06 2023-01-22 15:44:27.446855: step: 348/77, loss: 0.004908408038318157 2023-01-22 15:44:28.874158: step: 352/77, loss: 0.0034107582177966833 2023-01-22 15:44:30.307216: step: 356/77, loss: 3.1850118830334395e-05 2023-01-22 15:44:31.817092: step: 360/77, loss: 1.673349743214203e-06 2023-01-22 15:44:33.286353: step: 364/77, loss: 8.198283467208967e-05 2023-01-22 15:44:34.722987: step: 368/77, loss: 1.381803576805396e-05 2023-01-22 15:44:36.116781: step: 372/77, loss: 2.139659727617982e-06 2023-01-22 15:44:37.544500: step: 376/77, loss: 0.0018171846168115735 2023-01-22 15:44:38.966847: step: 380/77, loss: 1.126479673985159e-06 2023-01-22 15:44:40.350535: step: 384/77, loss: 3.2567124435445294e-05 2023-01-22 15:44:41.862838: step: 388/77, loss: 1.2770144621754298e-06 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19} Test Chinese: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029054700489508537, 'epoch': 19} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19} Test Korean: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029054700489508537, 'epoch': 19} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 
'combined': 0.05179909351586346, 'epoch': 19} Test Russian: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029054700489508537, 'epoch': 19} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 19} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 19} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 19} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 20 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:46:15.250915: step: 4/77, loss: 5.769972995040007e-05 2023-01-22 15:46:16.665874: step: 8/77, loss: 7.988006109371781e-05 2023-01-22 15:46:18.108319: step: 12/77, loss: 7.671146158827469e-05 2023-01-22 15:46:19.499280: step: 16/77, loss: 6.30311774330039e-07 2023-01-22 15:46:20.967487: step: 20/77, loss: 4.095068288734183e-05 2023-01-22 15:46:22.393488: step: 24/77, loss: 2.0861618210687993e-08 2023-01-22 15:46:23.922324: step: 28/77, loss: 0.032684728503227234 
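[Editor's note] The step counter in the surrounding lines advances by 4 per entry while the run uses --accumulate_step 4, so each printed line plausibly corresponds to one gradient-accumulation window; the "/77" denominator does not obviously match the printed step numbers and is left unexplained. train.py itself is not part of this log, so the following is only a minimal, hypothetical sketch of a loop that would emit lines in this "timestamp: step: N/M, loss: x" shape - every name in it (model, loader, optimizer, total) is an assumption, not the project's actual code.

from datetime import datetime

def train_one_epoch(model, loader, optimizer, total, accumulate_step=4):
    # Hypothetical reconstruction of the logging pattern seen above; not taken from train.py.
    model.train()
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(batch)                  # assumed to return a scalar loss tensor
        (loss / accumulate_step).backward()  # average gradients across the accumulation window
        if i % accumulate_step == 0:         # one optimizer update per accumulate_step micro-batches
            optimizer.step()
            optimizer.zero_grad()
            # str(datetime.now()) matches the "2023-01-22 15:46:23.922324" timestamp format above
            print(f"{datetime.now()}: step: {i}/{total}, loss: {loss.item()}")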
2023-01-22 15:46:25.348510: step: 32/77, loss: 0.00028799797291867435 2023-01-22 15:46:26.826772: step: 36/77, loss: 6.835158274043351e-06 2023-01-22 15:46:28.219381: step: 40/77, loss: 0.003463412867859006 2023-01-22 15:46:29.664899: step: 44/77, loss: 0.010342842899262905 2023-01-22 15:46:31.095562: step: 48/77, loss: 0.011868762783706188 2023-01-22 15:46:32.624258: step: 52/77, loss: 0.0004890954005531967 2023-01-22 15:46:34.025374: step: 56/77, loss: 0.009687275625765324 2023-01-22 15:46:35.484001: step: 60/77, loss: 0.0014125150628387928 2023-01-22 15:46:36.955469: step: 64/77, loss: 0.0009855409152805805 2023-01-22 15:46:38.411269: step: 68/77, loss: 1.3102288903610315e-05 2023-01-22 15:46:39.883053: step: 72/77, loss: 0.06307441741228104 2023-01-22 15:46:41.228272: step: 76/77, loss: 1.7925975043908693e-05 2023-01-22 15:46:42.657437: step: 80/77, loss: 1.2376327504171059e-05 2023-01-22 15:46:44.041108: step: 84/77, loss: 0.008977088145911694 2023-01-22 15:46:45.422608: step: 88/77, loss: 0.008874515071511269 2023-01-22 15:46:46.851207: step: 92/77, loss: 4.9061251047533005e-05 2023-01-22 15:46:48.218121: step: 96/77, loss: 0.00129993655718863 2023-01-22 15:46:49.722055: step: 100/77, loss: 0.010317339561879635 2023-01-22 15:46:51.205714: step: 104/77, loss: 7.665769953746349e-05 2023-01-22 15:46:52.704827: step: 108/77, loss: 0.004125499166548252 2023-01-22 15:46:54.187706: step: 112/77, loss: 3.3793583043006947e-06 2023-01-22 15:46:55.583350: step: 116/77, loss: 0.00041454663733020425 2023-01-22 15:46:56.977536: step: 120/77, loss: 0.07228752970695496 2023-01-22 15:46:58.396854: step: 124/77, loss: 1.8492028175387532e-06 2023-01-22 15:46:59.830796: step: 128/77, loss: 3.3302158044534735e-06 2023-01-22 15:47:01.377036: step: 132/77, loss: 4.243364674039185e-05 2023-01-22 15:47:02.795686: step: 136/77, loss: 5.568213055084925e-06 2023-01-22 15:47:04.237357: step: 140/77, loss: 2.5114768504863605e-05 2023-01-22 15:47:05.721117: step: 144/77, loss: 3.042675416509155e-06 2023-01-22 15:47:07.170661: step: 148/77, loss: 5.813580173708033e-06 2023-01-22 15:47:08.599459: step: 152/77, loss: 0.00027511370717547834 2023-01-22 15:47:10.073542: step: 156/77, loss: 0.0017471505561843514 2023-01-22 15:47:11.528112: step: 160/77, loss: 0.00016745386528782547 2023-01-22 15:47:12.919041: step: 164/77, loss: 0.0017140633426606655 2023-01-22 15:47:14.385282: step: 168/77, loss: 0.0057662054896354675 2023-01-22 15:47:15.769853: step: 172/77, loss: 0.0008460861281491816 2023-01-22 15:47:17.202397: step: 176/77, loss: 0.011489558964967728 2023-01-22 15:47:18.665406: step: 180/77, loss: 0.0393817164003849 2023-01-22 15:47:20.210824: step: 184/77, loss: 5.8730965974973515e-06 2023-01-22 15:47:21.643569: step: 188/77, loss: 0.0033056659158319235 2023-01-22 15:47:23.179099: step: 192/77, loss: 0.0006920627201907337 2023-01-22 15:47:24.626946: step: 196/77, loss: 0.012154542841017246 2023-01-22 15:47:26.096758: step: 200/77, loss: 0.005420149303972721 2023-01-22 15:47:27.573214: step: 204/77, loss: 0.0007564575644209981 2023-01-22 15:47:28.971085: step: 208/77, loss: 0.00395285664126277 2023-01-22 15:47:30.465573: step: 212/77, loss: 8.369226270588115e-05 2023-01-22 15:47:31.954150: step: 216/77, loss: 4.007963525509695e-06 2023-01-22 15:47:33.465634: step: 220/77, loss: 0.0003974889114033431 2023-01-22 15:47:34.891179: step: 224/77, loss: 0.0003319445240776986 2023-01-22 15:47:36.320048: step: 228/77, loss: 0.030031228438019753 2023-01-22 15:47:37.791217: step: 232/77, loss: 3.814684532699175e-07 2023-01-22 
15:47:39.194862: step: 236/77, loss: 3.6237568110664142e-06 2023-01-22 15:47:40.664589: step: 240/77, loss: 0.0003080276947002858 2023-01-22 15:47:42.044806: step: 244/77, loss: 0.00026390611310489476 2023-01-22 15:47:43.463706: step: 248/77, loss: 0.017976175993680954 2023-01-22 15:47:44.975535: step: 252/77, loss: 0.0005334357265383005 2023-01-22 15:47:46.449869: step: 256/77, loss: 0.000413315137848258 2023-01-22 15:47:47.902339: step: 260/77, loss: 1.2889391882708878e-06 2023-01-22 15:47:49.368727: step: 264/77, loss: 1.4996231584518682e-05 2023-01-22 15:47:50.873999: step: 268/77, loss: 0.0005102431168779731 2023-01-22 15:47:52.377896: step: 272/77, loss: 0.012166595086455345 2023-01-22 15:47:53.806669: step: 276/77, loss: 0.0011194954859092832 2023-01-22 15:47:55.271024: step: 280/77, loss: 0.01433619949966669 2023-01-22 15:47:56.677021: step: 284/77, loss: 0.1242150291800499 2023-01-22 15:47:58.152405: step: 288/77, loss: 0.00017780723283067346 2023-01-22 15:47:59.580513: step: 292/77, loss: 0.0014415099285542965 2023-01-22 15:48:01.065930: step: 296/77, loss: 0.11751821637153625 2023-01-22 15:48:02.467894: step: 300/77, loss: 5.6452710850862786e-05 2023-01-22 15:48:03.917770: step: 304/77, loss: 2.5166477826132905e-06 2023-01-22 15:48:05.416933: step: 308/77, loss: 2.8236513571755495e-06 2023-01-22 15:48:06.920105: step: 312/77, loss: 1.400704690013299e-07 2023-01-22 15:48:08.343035: step: 316/77, loss: 0.004095098003745079 2023-01-22 15:48:09.830677: step: 320/77, loss: 8.229284139815718e-05 2023-01-22 15:48:11.253015: step: 324/77, loss: 0.00040705205174162984 2023-01-22 15:48:12.733402: step: 328/77, loss: 0.0004594254423864186 2023-01-22 15:48:14.162505: step: 332/77, loss: 0.0009449953213334084 2023-01-22 15:48:15.665377: step: 336/77, loss: 7.680407725274563e-05 2023-01-22 15:48:17.085462: step: 340/77, loss: 0.013569314032793045 2023-01-22 15:48:18.539251: step: 344/77, loss: 0.014457895420491695 2023-01-22 15:48:19.977860: step: 348/77, loss: 0.01852329447865486 2023-01-22 15:48:21.445094: step: 352/77, loss: 0.012983414344489574 2023-01-22 15:48:22.860830: step: 356/77, loss: 8.676545257912949e-05 2023-01-22 15:48:24.355986: step: 360/77, loss: 0.009477004408836365 2023-01-22 15:48:25.817878: step: 364/77, loss: 2.6095040084328502e-05 2023-01-22 15:48:27.183770: step: 368/77, loss: 2.5815836124820635e-05 2023-01-22 15:48:28.654067: step: 372/77, loss: 9.088053047889844e-06 2023-01-22 15:48:30.091035: step: 376/77, loss: 0.0020199657883495092 2023-01-22 15:48:31.519754: step: 380/77, loss: 2.0801255686819786e-06 2023-01-22 15:48:32.943508: step: 384/77, loss: 0.005283838137984276 2023-01-22 15:48:34.409816: step: 388/77, loss: 0.0010309144854545593 ================================================== Loss: 0.008 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Chinese: {'template': {'p': 0.9428571428571428, 'r': 0.5196850393700787, 'f1': 0.6700507614213197}, 'slot': {'p': 0.6571428571428571, 'r': 0.022030651340996167, 'f1': 0.04263206672845227}, 'combined': 0.02856564877236395, 'epoch': 20} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Korean: {'template': {'p': 0.9571428571428572, 'r': 0.5275590551181102, 'f1': 0.680203045685279}, 
'slot': {'p': 0.7222222222222222, 'r': 0.02490421455938697, 'f1': 0.04814814814814814}, 'combined': 0.03275051701447639, 'epoch': 20} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Russian: {'template': {'p': 0.9577464788732394, 'r': 0.5354330708661418, 'f1': 0.6868686868686869}, 'slot': {'p': 0.6944444444444444, 'r': 0.023946360153256706, 'f1': 0.046296296296296294}, 'combined': 0.03179947624392068, 'epoch': 20} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 20} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 20} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 20} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 21 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:50:07.707997: step: 4/77, loss: 1.31126046198915e-06 2023-01-22 15:50:09.190392: step: 8/77, loss: 0.05096353590488434 2023-01-22 15:50:10.689058: step: 12/77, loss: 
0.0193883515894413 2023-01-22 15:50:12.149324: step: 16/77, loss: 1.359822454105597e-05 2023-01-22 15:50:13.669440: step: 20/77, loss: 4.39577860333884e-07 2023-01-22 15:50:15.102377: step: 24/77, loss: 0.0006825162563472986 2023-01-22 15:50:16.582939: step: 28/77, loss: 0.005380017217248678 2023-01-22 15:50:17.937030: step: 32/77, loss: 0.006203534547239542 2023-01-22 15:50:19.465381: step: 36/77, loss: 0.0006514445412904024 2023-01-22 15:50:20.892433: step: 40/77, loss: 0.012692932970821857 2023-01-22 15:50:22.265006: step: 44/77, loss: 0.04812805354595184 2023-01-22 15:50:23.674885: step: 48/77, loss: 7.57672096369788e-05 2023-01-22 15:50:25.109045: step: 52/77, loss: 2.6029541913885623e-05 2023-01-22 15:50:26.562155: step: 56/77, loss: 0.0015134100103750825 2023-01-22 15:50:28.019692: step: 60/77, loss: 0.001756713492795825 2023-01-22 15:50:29.515163: step: 64/77, loss: 1.2253251043148339e-05 2023-01-22 15:50:30.979039: step: 68/77, loss: 0.0010077590122818947 2023-01-22 15:50:32.413259: step: 72/77, loss: 8.106019777187612e-06 2023-01-22 15:50:33.955794: step: 76/77, loss: 4.258016633684747e-05 2023-01-22 15:50:35.433020: step: 80/77, loss: 0.06083912402391434 2023-01-22 15:50:36.893606: step: 84/77, loss: 3.637679401435889e-05 2023-01-22 15:50:38.315153: step: 88/77, loss: 3.169894989696331e-05 2023-01-22 15:50:39.712517: step: 92/77, loss: 3.851655492326245e-06 2023-01-22 15:50:41.135285: step: 96/77, loss: 0.04184136912226677 2023-01-22 15:50:42.558365: step: 100/77, loss: 3.517176446621306e-05 2023-01-22 15:50:44.016633: step: 104/77, loss: 4.321118922234746e-06 2023-01-22 15:50:45.431102: step: 108/77, loss: 0.005314297042787075 2023-01-22 15:50:46.893463: step: 112/77, loss: 4.785529017681256e-05 2023-01-22 15:50:48.334970: step: 116/77, loss: 0.001644266420044005 2023-01-22 15:50:49.758234: step: 120/77, loss: 0.00012852785584982485 2023-01-22 15:50:51.126154: step: 124/77, loss: 0.11777294427156448 2023-01-22 15:50:52.606152: step: 128/77, loss: 0.004322417080402374 2023-01-22 15:50:54.087671: step: 132/77, loss: 0.0012313274201005697 2023-01-22 15:50:55.600766: step: 136/77, loss: 2.3772885469952598e-05 2023-01-22 15:50:57.042927: step: 140/77, loss: 1.1486989933473524e-05 2023-01-22 15:50:58.469054: step: 144/77, loss: 7.071543950587511e-05 2023-01-22 15:50:59.872710: step: 148/77, loss: 0.01367159467190504 2023-01-22 15:51:01.290638: step: 152/77, loss: 3.1236049835570157e-05 2023-01-22 15:51:02.734852: step: 156/77, loss: 5.200461146159796e-07 2023-01-22 15:51:04.161400: step: 160/77, loss: 7.232960342662409e-06 2023-01-22 15:51:05.584848: step: 164/77, loss: 6.551224942086264e-05 2023-01-22 15:51:07.065514: step: 168/77, loss: 0.0006813441286794841 2023-01-22 15:51:08.482857: step: 172/77, loss: 0.0335300974547863 2023-01-22 15:51:09.891830: step: 176/77, loss: 0.001984866801649332 2023-01-22 15:51:11.327766: step: 180/77, loss: 0.0005086352466605604 2023-01-22 15:51:12.750216: step: 184/77, loss: 0.02132435329258442 2023-01-22 15:51:14.134356: step: 188/77, loss: 0.018065014854073524 2023-01-22 15:51:15.610717: step: 192/77, loss: 0.04263356328010559 2023-01-22 15:51:17.103595: step: 196/77, loss: 3.0246617825469002e-05 2023-01-22 15:51:18.523297: step: 200/77, loss: 0.0005162729066796601 2023-01-22 15:51:19.950280: step: 204/77, loss: 0.0020963940769433975 2023-01-22 15:51:21.440039: step: 208/77, loss: 0.0417579784989357 2023-01-22 15:51:22.853794: step: 212/77, loss: 0.0011837168131023645 2023-01-22 15:51:24.335016: step: 216/77, loss: 0.0004774730477947742 2023-01-22 
15:51:25.755556: step: 220/77, loss: 7.731681762379594e-06 2023-01-22 15:51:27.194659: step: 224/77, loss: 0.00030340399825945497 2023-01-22 15:51:28.650454: step: 228/77, loss: 0.006991207133978605 2023-01-22 15:51:30.112873: step: 232/77, loss: 8.052716111706104e-06 2023-01-22 15:51:31.576458: step: 236/77, loss: 2.5629941546867485e-07 2023-01-22 15:51:33.040033: step: 240/77, loss: 1.451134176022606e-05 2023-01-22 15:51:34.447716: step: 244/77, loss: 0.001297237235121429 2023-01-22 15:51:35.953596: step: 248/77, loss: 9.238686260459872e-08 2023-01-22 15:51:37.364224: step: 252/77, loss: 3.3953885576920584e-05 2023-01-22 15:51:38.833005: step: 256/77, loss: 0.004436683841049671 2023-01-22 15:51:40.279982: step: 260/77, loss: 9.387719046571874e-08 2023-01-22 15:51:41.715005: step: 264/77, loss: 4.773236651089974e-05 2023-01-22 15:51:43.210256: step: 268/77, loss: 0.0002866130380425602 2023-01-22 15:51:44.674858: step: 272/77, loss: 0.001864187652245164 2023-01-22 15:51:46.097260: step: 276/77, loss: 0.0011797643965110183 2023-01-22 15:51:47.497686: step: 280/77, loss: 0.0037937427405267954 2023-01-22 15:51:48.990300: step: 284/77, loss: 3.380151974852197e-05 2023-01-22 15:51:50.390106: step: 288/77, loss: 9.968531458071084e-07 2023-01-22 15:51:51.847185: step: 292/77, loss: 5.363777745515108e-06 2023-01-22 15:51:53.282847: step: 296/77, loss: 0.0018334905616939068 2023-01-22 15:51:54.705721: step: 300/77, loss: 0.00016567608690820634 2023-01-22 15:51:56.200537: step: 304/77, loss: 3.127428499283269e-06 2023-01-22 15:51:57.656809: step: 308/77, loss: 0.028649095445871353 2023-01-22 15:51:59.112525: step: 312/77, loss: 1.5367693777079694e-05 2023-01-22 15:52:00.575071: step: 316/77, loss: 0.00010744504834292457 2023-01-22 15:52:02.040090: step: 320/77, loss: 0.026186620816588402 2023-01-22 15:52:03.464928: step: 324/77, loss: 0.02414688467979431 2023-01-22 15:52:04.956972: step: 328/77, loss: 5.617644092126284e-07 2023-01-22 15:52:06.363242: step: 332/77, loss: 4.2616611040102725e-07 2023-01-22 15:52:07.819957: step: 336/77, loss: 0.02477031946182251 2023-01-22 15:52:09.324438: step: 340/77, loss: 0.0008762071374803782 2023-01-22 15:52:10.774512: step: 344/77, loss: 0.005971649195998907 2023-01-22 15:52:12.235212: step: 348/77, loss: 4.418044409248978e-05 2023-01-22 15:52:13.701065: step: 352/77, loss: 1.7881387037732566e-08 2023-01-22 15:52:15.199659: step: 356/77, loss: 5.572966301770066e-07 2023-01-22 15:52:16.692144: step: 360/77, loss: 0.06752141565084457 2023-01-22 15:52:18.170021: step: 364/77, loss: 0.007526802364736795 2023-01-22 15:52:19.674872: step: 368/77, loss: 1.2833291293645743e-05 2023-01-22 15:52:21.119854: step: 372/77, loss: 0.06438238173723221 2023-01-22 15:52:22.604018: step: 376/77, loss: 0.0006343497079797089 2023-01-22 15:52:24.008158: step: 380/77, loss: 7.463126530637965e-05 2023-01-22 15:52:25.432073: step: 384/77, loss: 1.4692012655359576e-06 2023-01-22 15:52:26.816831: step: 388/77, loss: 0.03346980735659599 ================================================== Loss: 0.009 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 21} Test Chinese: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.5952380952380952, 'r': 0.023946360153256706, 'f1': 0.04604051565377532}, 'combined': 0.03192759376995475, 'epoch': 21} Dev Korean: 
{'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 21} Test Korean: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.5952380952380952, 'r': 0.023946360153256706, 'f1': 0.04604051565377532}, 'combined': 0.03192759376995475, 'epoch': 21} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 21} Test Russian: {'template': {'p': 0.958904109589041, 'r': 0.5511811023622047, 'f1': 0.7000000000000001}, 'slot': {'p': 0.6097560975609756, 'r': 0.023946360153256706, 'f1': 0.04608294930875576}, 'combined': 0.03225806451612904, 'epoch': 21} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 21} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 21} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 21} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 22 command: python train.py --model_name template --xlmr_model_name 
xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:53:58.696304: step: 4/77, loss: 0.002346228575333953 2023-01-22 15:54:00.165281: step: 8/77, loss: 1.2919023902213667e-05 2023-01-22 15:54:01.575732: step: 12/77, loss: 0.012675482779741287 2023-01-22 15:54:02.985521: step: 16/77, loss: 3.2986914447974414e-05 2023-01-22 15:54:04.344751: step: 20/77, loss: 4.645839908334892e-06 2023-01-22 15:54:05.786875: step: 24/77, loss: 4.511363840720151e-06 2023-01-22 15:54:07.181805: step: 28/77, loss: 3.475850462564267e-05 2023-01-22 15:54:08.615787: step: 32/77, loss: 1.1518498467921745e-05 2023-01-22 15:54:10.038447: step: 36/77, loss: 0.00021341952378861606 2023-01-22 15:54:11.466579: step: 40/77, loss: 5.108970981382299e-06 2023-01-22 15:54:12.941979: step: 44/77, loss: 0.06483907252550125 2023-01-22 15:54:14.396756: step: 48/77, loss: 7.261518021550728e-06 2023-01-22 15:54:15.804279: step: 52/77, loss: 0.007329550571739674 2023-01-22 15:54:17.215220: step: 56/77, loss: 2.389142355241347e-05 2023-01-22 15:54:18.701232: step: 60/77, loss: 0.009565358981490135 2023-01-22 15:54:20.148737: step: 64/77, loss: 0.002138290321454406 2023-01-22 15:54:21.583963: step: 68/77, loss: 9.102316835196689e-05 2023-01-22 15:54:23.037301: step: 72/77, loss: 2.745499114098493e-05 2023-01-22 15:54:24.433374: step: 76/77, loss: 0.0014320156769827008 2023-01-22 15:54:25.877712: step: 80/77, loss: 0.0011423993855714798 2023-01-22 15:54:27.282965: step: 84/77, loss: 1.2819808944186661e-05 2023-01-22 15:54:28.785191: step: 88/77, loss: 1.2093358236597851e-05 2023-01-22 15:54:30.200048: step: 92/77, loss: 7.361109055636916e-07 2023-01-22 15:54:31.630548: step: 96/77, loss: 0.00036043798900209367 2023-01-22 15:54:33.024342: step: 100/77, loss: 9.220741776516661e-05 2023-01-22 15:54:34.405041: step: 104/77, loss: 2.308128841832513e-06 2023-01-22 15:54:35.905757: step: 108/77, loss: 3.493248004815541e-05 2023-01-22 15:54:37.365230: step: 112/77, loss: 0.00014362957153934985 2023-01-22 15:54:38.849832: step: 116/77, loss: 0.00013110990403220057 2023-01-22 15:54:40.328941: step: 120/77, loss: 0.00011647500650724396 2023-01-22 15:54:41.734652: step: 124/77, loss: 0.04890933632850647 2023-01-22 15:54:43.113754: step: 128/77, loss: 2.6746215553430375e-06 2023-01-22 15:54:44.533563: step: 132/77, loss: 1.9445298676146194e-05 2023-01-22 15:54:46.031203: step: 136/77, loss: 5.40605433343444e-05 2023-01-22 15:54:47.450380: step: 140/77, loss: 0.01657833531498909 2023-01-22 15:54:48.881586: step: 144/77, loss: 9.147735909209587e-06 2023-01-22 15:54:50.357510: step: 148/77, loss: 2.9487748633982847e-06 2023-01-22 15:54:51.780987: step: 152/77, loss: 0.00011792431178037077 2023-01-22 15:54:53.225775: step: 156/77, loss: 0.0013862337218597531 2023-01-22 15:54:54.674837: step: 160/77, loss: 0.001244641374796629 2023-01-22 15:54:56.119477: step: 164/77, loss: 3.2183825169340707e-06 2023-01-22 15:54:57.556276: step: 168/77, loss: 6.24346569111367e-07 2023-01-22 15:54:59.025976: step: 172/77, loss: 0.010498532094061375 2023-01-22 15:55:00.569053: step: 176/77, loss: 0.05483954772353172 2023-01-22 15:55:01.943054: step: 180/77, loss: 0.010027075186371803 2023-01-22 15:55:03.369154: step: 184/77, loss: 0.059060029685497284 2023-01-22 15:55:04.770714: step: 188/77, loss: 0.000712457753252238 2023-01-22 15:55:06.263457: step: 192/77, loss: 2.9802306400483758e-08 2023-01-22 15:55:07.651482: step: 196/77, loss: 
3.177811231580563e-05 2023-01-22 15:55:09.067035: step: 200/77, loss: 0.00011299244943074882 2023-01-22 15:55:10.488375: step: 204/77, loss: 5.519812020793324e-06 2023-01-22 15:55:11.957721: step: 208/77, loss: 0.0051557752303779125 2023-01-22 15:55:13.396633: step: 212/77, loss: 3.4318709367653355e-05 2023-01-22 15:55:14.825328: step: 216/77, loss: 8.940563702708459e-07 2023-01-22 15:55:16.290372: step: 220/77, loss: 0.0005939690163359046 2023-01-22 15:55:17.734284: step: 224/77, loss: 0.02324836514890194 2023-01-22 15:55:19.204613: step: 228/77, loss: 6.6844377215602435e-06 2023-01-22 15:55:20.606909: step: 232/77, loss: 0.00013106105325277895 2023-01-22 15:55:22.010247: step: 236/77, loss: 5.5134261600642276e-08 2023-01-22 15:55:23.434725: step: 240/77, loss: 7.84103904152289e-05 2023-01-22 15:55:24.899954: step: 244/77, loss: 0.00877442304044962 2023-01-22 15:55:26.283652: step: 248/77, loss: 9.517231956124306e-05 2023-01-22 15:55:27.682166: step: 252/77, loss: 4.619357341084651e-08 2023-01-22 15:55:29.122577: step: 256/77, loss: 2.509105570425163e-06 2023-01-22 15:55:30.639491: step: 260/77, loss: 2.665648707989021e-06 2023-01-22 15:55:32.070829: step: 264/77, loss: 0.005608798936009407 2023-01-22 15:55:33.544206: step: 268/77, loss: 4.577345407597022e-06 2023-01-22 15:55:35.025584: step: 272/77, loss: 0.0003482149331830442 2023-01-22 15:55:36.448556: step: 276/77, loss: 2.5331969411013233e-08 2023-01-22 15:55:37.926915: step: 280/77, loss: 9.00785562407691e-06 2023-01-22 15:55:39.268319: step: 284/77, loss: 0.0003431773220654577 2023-01-22 15:55:40.774488: step: 288/77, loss: 3.666664724732982e-06 2023-01-22 15:55:42.173673: step: 292/77, loss: 0.005634919740259647 2023-01-22 15:55:43.566420: step: 296/77, loss: 5.662578041665256e-05 2023-01-22 15:55:45.039916: step: 300/77, loss: 0.0016856964211910963 2023-01-22 15:55:46.472503: step: 304/77, loss: 0.08939161896705627 2023-01-22 15:55:47.931478: step: 308/77, loss: 7.993636245373636e-06 2023-01-22 15:55:49.384376: step: 312/77, loss: 0.004754865076392889 2023-01-22 15:55:50.873260: step: 316/77, loss: 0.056578945368528366 2023-01-22 15:55:52.340977: step: 320/77, loss: 1.2442321349226404e-05 2023-01-22 15:55:53.801558: step: 324/77, loss: 0.005984515883028507 2023-01-22 15:55:55.248702: step: 328/77, loss: 6.34786033515411e-07 2023-01-22 15:55:56.699516: step: 332/77, loss: 4.058571903442498e-06 2023-01-22 15:55:58.112716: step: 336/77, loss: 0.0010738220298662782 2023-01-22 15:55:59.570450: step: 340/77, loss: 0.003154112957417965 2023-01-22 15:56:01.026459: step: 344/77, loss: 0.01419751811772585 2023-01-22 15:56:02.422401: step: 348/77, loss: 4.372875082481187e-06 2023-01-22 15:56:03.879745: step: 352/77, loss: 2.5379265935043804e-05 2023-01-22 15:56:05.311266: step: 356/77, loss: 6.651124567724764e-05 2023-01-22 15:56:06.789199: step: 360/77, loss: 1.2218924894114025e-07 2023-01-22 15:56:08.282246: step: 364/77, loss: 0.05573984608054161 2023-01-22 15:56:09.740090: step: 368/77, loss: 0.001033936394378543 2023-01-22 15:56:11.214611: step: 372/77, loss: 1.2790627806680277e-05 2023-01-22 15:56:12.693032: step: 376/77, loss: 3.5762759864610416e-08 2023-01-22 15:56:14.117729: step: 380/77, loss: 1.7750164261087775e-05 2023-01-22 15:56:15.573843: step: 384/77, loss: 0.000243719870923087 2023-01-22 15:56:17.027897: step: 388/77, loss: 0.00042630312964320183 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': 
{'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 22} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.5714285714285714, 'r': 0.022988505747126436, 'f1': 0.044198895027624314}, 'combined': 0.030497237569060774, 'epoch': 22} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 22} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.5714285714285714, 'r': 0.022988505747126436, 'f1': 0.044198895027624314}, 'combined': 0.030497237569060774, 'epoch': 22} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 22} Test Russian: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030525345622119813, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 22} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 
'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 23 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 15:57:51.945554: step: 4/77, loss: 0.004018413834273815 2023-01-22 15:57:53.365005: step: 8/77, loss: 1.7020693121594377e-05 2023-01-22 15:57:54.734494: step: 12/77, loss: 0.015627073124051094 2023-01-22 15:57:56.210391: step: 16/77, loss: 0.006086463574320078 2023-01-22 15:57:57.638320: step: 20/77, loss: 1.9505432646838017e-05 2023-01-22 15:57:59.118355: step: 24/77, loss: 6.6522948145575356e-06 2023-01-22 15:58:00.571779: step: 28/77, loss: 6.474445399362594e-05 2023-01-22 15:58:02.055337: step: 32/77, loss: 0.012845698744058609 2023-01-22 15:58:03.407153: step: 36/77, loss: 0.00020875573682133108 2023-01-22 15:58:04.808942: step: 40/77, loss: 4.506836376094725e-06 2023-01-22 15:58:06.310218: step: 44/77, loss: 7.345504855038598e-06 2023-01-22 15:58:07.753370: step: 48/77, loss: 2.7742883048631484e-06 2023-01-22 15:58:09.259110: step: 52/77, loss: 4.792918844032101e-05 2023-01-22 15:58:10.793983: step: 56/77, loss: 0.05235208570957184 2023-01-22 15:58:12.290394: step: 60/77, loss: 1.9073429768923233e-07 2023-01-22 15:58:13.693344: step: 64/77, loss: 6.526562401631963e-07 2023-01-22 15:58:15.199182: step: 68/77, loss: 0.00017328829562757164 2023-01-22 15:58:16.626581: step: 72/77, loss: 0.0016752263763919473 2023-01-22 15:58:18.095902: step: 76/77, loss: 3.942399416700937e-06 2023-01-22 15:58:19.628480: step: 80/77, loss: 8.374709977942985e-06 2023-01-22 15:58:21.080245: step: 84/77, loss: 0.004228753969073296 2023-01-22 15:58:22.568316: step: 88/77, loss: 1.5771105609019287e-05 2023-01-22 15:58:23.948933: step: 92/77, loss: 0.002886112779378891 2023-01-22 15:58:25.332174: step: 96/77, loss: 9.655738040237338e-07 2023-01-22 15:58:26.884890: step: 100/77, loss: 5.8039957366418093e-05 2023-01-22 15:58:28.350080: step: 104/77, loss: 0.001335770240984857 2023-01-22 15:58:29.785555: step: 108/77, loss: 3.960403319069883e-06 2023-01-22 15:58:31.191692: step: 112/77, loss: 0.0 2023-01-22 15:58:32.628388: step: 116/77, loss: 3.2748478133726167e-06 2023-01-22 15:58:34.145709: step: 120/77, loss: 5.155774829290749e-07 2023-01-22 15:58:35.606147: step: 124/77, loss: 0.001645140815526247 2023-01-22 15:58:37.090877: step: 128/77, loss: 0.00016478932229802012 2023-01-22 15:58:38.577383: step: 132/77, loss: 5.9025991504313424e-05 2023-01-22 15:58:39.974008: step: 136/77, loss: 8.807150152279064e-05 2023-01-22 15:58:41.391483: step: 140/77, loss: 0.002511009108275175 2023-01-22 15:58:42.835450: step: 144/77, loss: 0.0015820222906768322 2023-01-22 15:58:44.294694: step: 148/77, loss: 0.032705824822187424 2023-01-22 15:58:45.792430: step: 152/77, loss: 1.7299270211879048e-06 2023-01-22 15:58:47.238445: step: 156/77, loss: 2.5331967634656394e-08 2023-01-22 15:58:48.728773: step: 160/77, loss: 2.3841838725502384e-08 2023-01-22 15:58:50.267038: step: 164/77, loss: 5.960610906186048e-06 2023-01-22 15:58:51.776628: step: 168/77, loss: 4.598395025823265e-05 2023-01-22 15:58:53.291149: step: 172/77, loss: 3.252502210671082e-06 2023-01-22 15:58:54.759898: step: 176/77, loss: 0.00021018394909333438 2023-01-22 15:58:56.188456: step: 180/77, loss: 1.5469189747818746e-05 2023-01-22 15:58:57.689414: step: 184/77, loss: 
0.0065690637566149235 2023-01-22 15:58:59.110205: step: 188/77, loss: 0.0002731600252445787 2023-01-22 15:59:00.582442: step: 192/77, loss: 1.3991452760819811e-06 2023-01-22 15:59:02.039130: step: 196/77, loss: 0.025875654071569443 2023-01-22 15:59:03.536184: step: 200/77, loss: 3.8444750316557474e-07 2023-01-22 15:59:04.956106: step: 204/77, loss: 0.0007766537601128221 2023-01-22 15:59:06.446512: step: 208/77, loss: 3.4117889299523085e-05 2023-01-22 15:59:07.880944: step: 212/77, loss: 0.0001626727607799694 2023-01-22 15:59:09.420201: step: 216/77, loss: 0.006734051275998354 2023-01-22 15:59:10.910862: step: 220/77, loss: 3.4272638060883764e-08 2023-01-22 15:59:12.335214: step: 224/77, loss: 0.00013148770085535944 2023-01-22 15:59:13.768160: step: 228/77, loss: 2.4614346330054104e-06 2023-01-22 15:59:15.168592: step: 232/77, loss: 0.014583967626094818 2023-01-22 15:59:16.601337: step: 236/77, loss: 3.872397428494878e-06 2023-01-22 15:59:18.079614: step: 240/77, loss: 3.6316836485639215e-05 2023-01-22 15:59:19.543794: step: 244/77, loss: 1.9578749288484687e-06 2023-01-22 15:59:20.992475: step: 248/77, loss: 0.0009526479407213628 2023-01-22 15:59:22.468992: step: 252/77, loss: 0.0006445134640671313 2023-01-22 15:59:23.980609: step: 256/77, loss: 0.00042405526619404554 2023-01-22 15:59:25.435511: step: 260/77, loss: 8.889746095519513e-05 2023-01-22 15:59:26.965142: step: 264/77, loss: 0.004766326397657394 2023-01-22 15:59:28.452779: step: 268/77, loss: 3.2186309795179113e-07 2023-01-22 15:59:29.891890: step: 272/77, loss: 0.004515407141298056 2023-01-22 15:59:31.366012: step: 276/77, loss: 0.0015589980175718665 2023-01-22 15:59:32.854044: step: 280/77, loss: 3.427264516631112e-08 2023-01-22 15:59:34.344336: step: 284/77, loss: 0.00012403355503920466 2023-01-22 15:59:35.812045: step: 288/77, loss: 1.7135522512035095e-06 2023-01-22 15:59:37.309338: step: 292/77, loss: 3.7550361753346806e-07 2023-01-22 15:59:38.787407: step: 296/77, loss: 3.028061291843187e-05 2023-01-22 15:59:40.307207: step: 300/77, loss: 1.2870046703028493e-05 2023-01-22 15:59:41.812654: step: 304/77, loss: 0.00011317481403239071 2023-01-22 15:59:43.264213: step: 308/77, loss: 0.00015770268510095775 2023-01-22 15:59:44.787399: step: 312/77, loss: 3.4272638060883764e-08 2023-01-22 15:59:46.216372: step: 316/77, loss: 7.152545578037461e-08 2023-01-22 15:59:47.707382: step: 320/77, loss: 5.289880391501356e-07 2023-01-22 15:59:49.118141: step: 324/77, loss: 1.821878322516568e-05 2023-01-22 15:59:50.573100: step: 328/77, loss: 4.470347647611561e-09 2023-01-22 15:59:51.972086: step: 332/77, loss: 0.00013496009341906756 2023-01-22 15:59:53.424421: step: 336/77, loss: 1.2665930171351647e-07 2023-01-22 15:59:54.885089: step: 340/77, loss: 4.4703405421842035e-08 2023-01-22 15:59:56.366839: step: 344/77, loss: 8.780878852121532e-05 2023-01-22 15:59:57.832837: step: 348/77, loss: 0.0527520626783371 2023-01-22 15:59:59.233015: step: 352/77, loss: 5.677258627656556e-07 2023-01-22 16:00:00.664279: step: 356/77, loss: 2.3482186861656373e-06 2023-01-22 16:00:02.096166: step: 360/77, loss: 0.0025349108036607504 2023-01-22 16:00:03.605395: step: 364/77, loss: 0.007929286919534206 2023-01-22 16:00:05.037714: step: 368/77, loss: 5.9745958424173295e-05 2023-01-22 16:00:06.497503: step: 372/77, loss: 0.00023206451442092657 2023-01-22 16:00:07.967823: step: 376/77, loss: 0.012654569931328297 2023-01-22 16:00:09.441545: step: 380/77, loss: 1.6390407608923852e-06 2023-01-22 16:00:10.914753: step: 384/77, loss: 0.02201502025127411 2023-01-22 
16:00:12.402249: step: 388/77, loss: 6.341918378893752e-06 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.46808510638297873, 'r': 0.0210727969348659, 'f1': 0.04032997250229148}, 'combined': 0.02782768102658112, 'epoch': 23} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.46938775510204084, 'r': 0.022030651340996167, 'f1': 0.042086001829826164}, 'combined': 0.02931363311530181, 'epoch': 23} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.46938775510204084, 'r': 0.022030651340996167, 'f1': 0.042086001829826164}, 'combined': 0.02931363311530181, 'epoch': 23} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 23} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 23} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 23} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 
'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 24 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:01:47.139666: step: 4/77, loss: 3.88887065128074e-06 2023-01-22 16:01:48.632727: step: 8/77, loss: 0.006326351780444384 2023-01-22 16:01:50.120245: step: 12/77, loss: 0.007369318976998329 2023-01-22 16:01:51.564981: step: 16/77, loss: 2.0563523150940455e-07 2023-01-22 16:01:53.052661: step: 20/77, loss: 6.288262284215307e-07 2023-01-22 16:01:54.541904: step: 24/77, loss: 0.0005139397107996047 2023-01-22 16:01:55.980358: step: 28/77, loss: 0.0001693215745035559 2023-01-22 16:01:57.447906: step: 32/77, loss: 1.2026353033434134e-05 2023-01-22 16:01:58.913742: step: 36/77, loss: 0.00014785393432248384 2023-01-22 16:02:00.387771: step: 40/77, loss: 0.03438704460859299 2023-01-22 16:02:01.771426: step: 44/77, loss: 2.99509792967001e-07 2023-01-22 16:02:03.221527: step: 48/77, loss: 0.0003298703522887081 2023-01-22 16:02:04.641897: step: 52/77, loss: 2.6872790840570815e-05 2023-01-22 16:02:06.143578: step: 56/77, loss: 0.0006581316119991243 2023-01-22 16:02:07.616709: step: 60/77, loss: 4.749282743432559e-05 2023-01-22 16:02:09.107521: step: 64/77, loss: 0.027956020087003708 2023-01-22 16:02:10.577002: step: 68/77, loss: 0.00013323355233296752 2023-01-22 16:02:12.070654: step: 72/77, loss: 0.00024044552992563695 2023-01-22 16:02:13.532971: step: 76/77, loss: 3.887216280418215e-06 2023-01-22 16:02:15.057425: step: 80/77, loss: 0.007220818195492029 2023-01-22 16:02:16.555747: step: 84/77, loss: 1.748294562275987e-05 2023-01-22 16:02:18.014795: step: 88/77, loss: 6.63890823489055e-05 2023-01-22 16:02:19.463475: step: 92/77, loss: 0.00035221839789301157 2023-01-22 16:02:20.915509: step: 96/77, loss: 2.1529145669774152e-05 2023-01-22 16:02:22.386265: step: 100/77, loss: 0.00010100056533701718 2023-01-22 16:02:23.850747: step: 104/77, loss: 0.011001895181834698 2023-01-22 16:02:25.318828: step: 108/77, loss: 0.010459198616445065 2023-01-22 16:02:26.810106: step: 112/77, loss: 0.013695194385945797 2023-01-22 16:02:28.293351: step: 116/77, loss: 1.0759786164271645e-05 2023-01-22 16:02:29.730107: step: 120/77, loss: 1.4424141227209475e-05 2023-01-22 16:02:31.131091: step: 124/77, loss: 2.251449132018024e-06 2023-01-22 16:02:32.584869: step: 128/77, loss: 4.187168087810278e-07 2023-01-22 16:02:34.040307: step: 132/77, loss: 0.029633767902851105 2023-01-22 16:02:35.484711: step: 136/77, loss: 1.026679456117563e-06 2023-01-22 16:02:36.894020: step: 140/77, loss: 3.769959562305303e-07 2023-01-22 16:02:38.432266: step: 144/77, loss: 0.020879343152046204 2023-01-22 16:02:39.952593: step: 148/77, loss: 4.973839168087579e-06 2023-01-22 16:02:41.426911: step: 152/77, loss: 1.1628907486738171e-05 2023-01-22 16:02:42.905555: step: 156/77, loss: 0.00039154087426140904 2023-01-22 16:02:44.335878: step: 160/77, loss: 0.010981038212776184 2023-01-22 16:02:45.798482: step: 164/77, loss: 0.0009109620586968958 
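Each evaluation block above reports template and slot precision/recall/F1 per language plus a 'combined' figure. The printed numbers are consistent with each f1 being the usual harmonic mean of p and r, and with 'combined' being the product of the template F1 and the slot F1. A minimal sketch of that relationship, inferred from the logged values rather than taken from the project's actual scorer:

def f1(p, r):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template, slot):
    # 'combined' in the log appears to equal template F1 times slot F1 (assumed rule).
    return template["f1"] * slot["f1"]

# Epoch 23 Dev block recomputed from its p/r values:
template = {"p": 1.0, "r": 0.5833333333333334, "f1": f1(1.0, 0.5833333333333334)}
slot = {"p": 0.5, "r": 0.03780718336483932, "f1": f1(0.5, 0.03780718336483932)}
print(combined(template, slot))  # ~0.0518, matching the logged 'combined' up to rounding
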
2023-01-22 16:02:47.285924: step: 168/77, loss: 8.858768524078187e-06 2023-01-22 16:02:48.792026: step: 172/77, loss: 0.0049967714585363865 2023-01-22 16:02:50.267758: step: 176/77, loss: 2.4480655156366993e-06 2023-01-22 16:02:51.725716: step: 180/77, loss: 0.0005453022895380855 2023-01-22 16:02:53.188405: step: 184/77, loss: 0.05702085793018341 2023-01-22 16:02:54.633297: step: 188/77, loss: 0.0013953285524621606 2023-01-22 16:02:56.079869: step: 192/77, loss: 0.24813003838062286 2023-01-22 16:02:57.535503: step: 196/77, loss: 2.2082331270212308e-05 2023-01-22 16:02:58.986269: step: 200/77, loss: 0.0030158653389662504 2023-01-22 16:03:00.402317: step: 204/77, loss: 1.544322003610432e-05 2023-01-22 16:03:01.848337: step: 208/77, loss: 1.0132757211067656e-07 2023-01-22 16:03:03.378723: step: 212/77, loss: 7.412156992359087e-05 2023-01-22 16:03:04.865320: step: 216/77, loss: 0.00033129850635305047 2023-01-22 16:03:06.331834: step: 220/77, loss: 1.4110939901001984e-06 2023-01-22 16:03:07.798321: step: 224/77, loss: 0.005589210893958807 2023-01-22 16:03:09.253107: step: 228/77, loss: 7.162740803323686e-05 2023-01-22 16:03:10.665896: step: 232/77, loss: 7.00352771332291e-08 2023-01-22 16:03:12.126849: step: 236/77, loss: 8.71100837684935e-06 2023-01-22 16:03:13.636679: step: 240/77, loss: 4.6280310925794765e-06 2023-01-22 16:03:15.070942: step: 244/77, loss: 5.504771706910105e-06 2023-01-22 16:03:16.501647: step: 248/77, loss: 0.02656180039048195 2023-01-22 16:03:17.971658: step: 252/77, loss: 7.738712156424299e-05 2023-01-22 16:03:19.463914: step: 256/77, loss: 2.162016016882262e-06 2023-01-22 16:03:20.924729: step: 260/77, loss: 4.1723215105093914e-08 2023-01-22 16:03:22.385591: step: 264/77, loss: 9.425855751032941e-06 2023-01-22 16:03:23.794319: step: 268/77, loss: 2.7044243324780837e-06 2023-01-22 16:03:25.241588: step: 272/77, loss: 8.58701878314605e-06 2023-01-22 16:03:26.757642: step: 276/77, loss: 1.0579811515754045e-07 2023-01-22 16:03:28.310763: step: 280/77, loss: 1.4398860002984293e-05 2023-01-22 16:03:29.815909: step: 284/77, loss: 1.3559998990331223e-07 2023-01-22 16:03:31.294519: step: 288/77, loss: 0.0024777958169579506 2023-01-22 16:03:32.822462: step: 292/77, loss: 0.0005366320256143808 2023-01-22 16:03:34.259180: step: 296/77, loss: 8.940685347624822e-08 2023-01-22 16:03:35.674834: step: 300/77, loss: 0.01648043282330036 2023-01-22 16:03:37.063276: step: 304/77, loss: 0.0007264897576533258 2023-01-22 16:03:38.569777: step: 308/77, loss: 1.92816060007317e-06 2023-01-22 16:03:39.977011: step: 312/77, loss: 1.0430576367070898e-06 2023-01-22 16:03:41.384547: step: 316/77, loss: 6.313736776064616e-06 2023-01-22 16:03:42.869337: step: 320/77, loss: 7.53976621581387e-07 2023-01-22 16:03:44.310057: step: 324/77, loss: 1.5497926142415963e-05 2023-01-22 16:03:45.725011: step: 328/77, loss: 1.5797953892615624e-05 2023-01-22 16:03:47.164423: step: 332/77, loss: 0.09263993054628372 2023-01-22 16:03:48.607373: step: 336/77, loss: 0.023720627650618553 2023-01-22 16:03:50.111139: step: 340/77, loss: 1.2665947224377305e-07 2023-01-22 16:03:51.571203: step: 344/77, loss: 8.791669614538478e-08 2023-01-22 16:03:53.047558: step: 348/77, loss: 2.0414438495208742e-07 2023-01-22 16:03:54.438530: step: 352/77, loss: 3.772464424400823e-06 2023-01-22 16:03:55.852737: step: 356/77, loss: 0.0005213633412495255 2023-01-22 16:03:57.305357: step: 360/77, loss: 1.1175710596944555e-06 2023-01-22 16:03:58.734876: step: 364/77, loss: 0.0011136091779917479 2023-01-22 16:04:00.226932: step: 368/77, loss: 
4.046501089760568e-06 2023-01-22 16:04:01.687909: step: 372/77, loss: 0.01180601678788662 2023-01-22 16:04:03.190332: step: 376/77, loss: 1.0385722362116212e-06 2023-01-22 16:04:04.717417: step: 380/77, loss: 3.2825400921865366e-06 2023-01-22 16:04:06.215533: step: 384/77, loss: 0.0018417052924633026 2023-01-22 16:04:07.636664: step: 388/77, loss: 3.2559481041971594e-05 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 24} Test Chinese: {'template': {'p': 0.922077922077922, 'r': 0.5590551181102362, 'f1': 0.696078431372549}, 'slot': {'p': 0.6, 'r': 0.022988505747126436, 'f1': 0.04428044280442805}, 'combined': 0.03082266116778815, 'epoch': 24} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 24} Test Korean: {'template': {'p': 0.922077922077922, 'r': 0.5590551181102362, 'f1': 0.696078431372549}, 'slot': {'p': 0.6, 'r': 0.022988505747126436, 'f1': 0.04428044280442805}, 'combined': 0.03082266116778815, 'epoch': 24} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 24} Test Russian: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029738761461624613, 'epoch': 24} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 24} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 
'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 25 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:05:42.329208: step: 4/77, loss: 0.0004749819927383214 2023-01-22 16:05:43.799802: step: 8/77, loss: 0.0065675717778503895 2023-01-22 16:05:45.186201: step: 12/77, loss: 4.678925336065731e-07 2023-01-22 16:05:46.669729: step: 16/77, loss: 7.094155535014579e-06 2023-01-22 16:05:48.170524: step: 20/77, loss: 1.415608039678773e-07 2023-01-22 16:05:49.650103: step: 24/77, loss: 0.010887443087995052 2023-01-22 16:05:51.131775: step: 28/77, loss: 4.455406497072545e-07 2023-01-22 16:05:52.638050: step: 32/77, loss: 9.79992728389334e-06 2023-01-22 16:05:54.120289: step: 36/77, loss: 0.0015145114157348871 2023-01-22 16:05:55.525185: step: 40/77, loss: 0.002752321772277355 2023-01-22 16:05:57.016178: step: 44/77, loss: 0.0026633806992322206 2023-01-22 16:05:58.488840: step: 48/77, loss: 2.826468971761642e-06 2023-01-22 16:05:59.990898: step: 52/77, loss: 0.001166272908449173 2023-01-22 16:06:01.492603: step: 56/77, loss: 0.12087506800889969 2023-01-22 16:06:02.933788: step: 60/77, loss: 1.0534906778048025e-06 2023-01-22 16:06:04.424187: step: 64/77, loss: 5.738164873037022e-06 2023-01-22 16:06:05.840100: step: 68/77, loss: 0.0012781772529706359 2023-01-22 16:06:07.318814: step: 72/77, loss: 2.6516407160670497e-05 2023-01-22 16:06:08.821064: step: 76/77, loss: 1.9073439716521534e-07 2023-01-22 16:06:10.264306: step: 80/77, loss: 9.314483577327337e-06 2023-01-22 16:06:11.763244: step: 84/77, loss: 0.00012070268712705001 2023-01-22 16:06:13.266866: step: 88/77, loss: 1.4487904991256073e-05 2023-01-22 16:06:14.739141: step: 92/77, loss: 0.00048729003174230456 2023-01-22 16:06:16.197024: step: 96/77, loss: 1.2893399798485916e-05 2023-01-22 16:06:17.668847: step: 100/77, loss: 0.028256123885512352 2023-01-22 16:06:19.145325: step: 104/77, loss: 3.774100832742988e-06 2023-01-22 16:06:20.678418: step: 108/77, loss: 2.673522976692766e-05 2023-01-22 16:06:22.102579: step: 112/77, loss: 8.290071491501294e-06 2023-01-22 16:06:23.573636: step: 116/77, loss: 0.0036806620191782713 2023-01-22 16:06:24.990468: step: 120/77, loss: 5.2154035756757366e-08 2023-01-22 16:06:26.443948: step: 124/77, loss: 3.05472553918662e-07 2023-01-22 16:06:27.887408: step: 128/77, loss: 1.528542816231493e-05 2023-01-22 16:06:29.288780: step: 132/77, loss: 7.803183689247817e-05 2023-01-22 16:06:30.704970: step: 136/77, loss: 3.98997553929803e-06 2023-01-22 16:06:32.114594: step: 140/77, loss: 1.0728808774729259e-07 2023-01-22 16:06:33.545572: step: 144/77, loss: 4.582446854328737e-05 2023-01-22 16:06:34.948051: step: 148/77, loss: 2.6567186068859883e-06 2023-01-22 16:06:36.382494: step: 152/77, loss: 
0.00017029664013534784 2023-01-22 16:06:37.834262: step: 156/77, loss: 3.7401821373350685e-07 2023-01-22 16:06:39.242670: step: 160/77, loss: 6.106628279667348e-05 2023-01-22 16:06:40.674930: step: 164/77, loss: 0.0004939200589433312 2023-01-22 16:06:42.171970: step: 168/77, loss: 0.00017524884606245905 2023-01-22 16:06:43.629954: step: 172/77, loss: 0.00010956765618175268 2023-01-22 16:06:45.137717: step: 176/77, loss: 0.016837604343891144 2023-01-22 16:06:46.564416: step: 180/77, loss: 0.003015865571796894 2023-01-22 16:06:48.110656: step: 184/77, loss: 0.02175990305840969 2023-01-22 16:06:49.565361: step: 188/77, loss: 0.00017583520093467087 2023-01-22 16:06:51.058013: step: 192/77, loss: 6.720335932186572e-07 2023-01-22 16:06:52.483810: step: 196/77, loss: 1.510552829131484e-05 2023-01-22 16:06:53.982386: step: 200/77, loss: 0.007230991963297129 2023-01-22 16:06:55.495715: step: 204/77, loss: 0.015933791175484657 2023-01-22 16:06:56.939652: step: 208/77, loss: 0.0009870578069239855 2023-01-22 16:06:58.394293: step: 212/77, loss: 1.0266803656122647e-06 2023-01-22 16:06:59.828084: step: 216/77, loss: 0.008149145171046257 2023-01-22 16:07:01.266809: step: 220/77, loss: 0.02759827859699726 2023-01-22 16:07:02.739456: step: 224/77, loss: 2.8312179267686588e-08 2023-01-22 16:07:04.222063: step: 228/77, loss: 0.008469237014651299 2023-01-22 16:07:05.726271: step: 232/77, loss: 2.3519085516454652e-05 2023-01-22 16:07:07.137600: step: 236/77, loss: 0.0003704246773850173 2023-01-22 16:07:08.663305: step: 240/77, loss: 1.2814975036690157e-07 2023-01-22 16:07:10.184377: step: 244/77, loss: 0.045821186155080795 2023-01-22 16:07:11.550392: step: 248/77, loss: 0.004043907392770052 2023-01-22 16:07:12.963882: step: 252/77, loss: 0.0261415746062994 2023-01-22 16:07:14.413257: step: 256/77, loss: 1.6988531569950283e-05 2023-01-22 16:07:15.882507: step: 260/77, loss: 0.0014448239235207438 2023-01-22 16:07:17.391601: step: 264/77, loss: 0.002613488817587495 2023-01-22 16:07:18.867850: step: 268/77, loss: 7.580941201013047e-06 2023-01-22 16:07:20.384954: step: 272/77, loss: 0.003058519447222352 2023-01-22 16:07:21.850978: step: 276/77, loss: 5.662332682732085e-07 2023-01-22 16:07:23.292244: step: 280/77, loss: 0.03081653267145157 2023-01-22 16:07:24.816099: step: 284/77, loss: 4.783949407283217e-05 2023-01-22 16:07:26.391175: step: 288/77, loss: 0.0008776055765338242 2023-01-22 16:07:27.775828: step: 292/77, loss: 0.0 2023-01-22 16:07:29.234000: step: 296/77, loss: 0.00010541600931901485 2023-01-22 16:07:30.654246: step: 300/77, loss: 0.0012375026708468795 2023-01-22 16:07:32.106970: step: 304/77, loss: 1.2069654076185543e-06 2023-01-22 16:07:33.522727: step: 308/77, loss: 1.6283647710224614e-05 2023-01-22 16:07:34.963332: step: 312/77, loss: 1.6277537724818103e-05 2023-01-22 16:07:36.398567: step: 316/77, loss: 0.00013059706543572247 2023-01-22 16:07:37.906353: step: 320/77, loss: 0.0023866556584835052 2023-01-22 16:07:39.365559: step: 324/77, loss: 8.467126463074237e-06 2023-01-22 16:07:40.737838: step: 328/77, loss: 4.461559001356363e-05 2023-01-22 16:07:42.200075: step: 332/77, loss: 5.4754286793468054e-06 2023-01-22 16:07:43.662686: step: 336/77, loss: 2.611947274999693e-05 2023-01-22 16:07:45.078869: step: 340/77, loss: 0.0005549070192500949 2023-01-22 16:07:46.564401: step: 344/77, loss: 0.008930765092372894 2023-01-22 16:07:48.054421: step: 348/77, loss: 0.03252748027443886 2023-01-22 16:07:49.458953: step: 352/77, loss: 0.010262150317430496 2023-01-22 16:07:50.921769: step: 356/77, loss: 
2.0828412743867375e-05 2023-01-22 16:07:52.438920: step: 360/77, loss: 0.0024927284102886915 2023-01-22 16:07:53.930540: step: 364/77, loss: 1.2567315025080461e-05 2023-01-22 16:07:55.373079: step: 368/77, loss: 3.479706356301904e-05 2023-01-22 16:07:56.852915: step: 372/77, loss: 1.90722937531973e-06 2023-01-22 16:07:58.388918: step: 376/77, loss: 8.076311814875226e-07 2023-01-22 16:07:59.763756: step: 380/77, loss: 0.03902401402592659 2023-01-22 16:08:01.243929: step: 384/77, loss: 5.200436135055497e-07 2023-01-22 16:08:02.705023: step: 388/77, loss: 0.0009910853113979101 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Chinese: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.6153846153846154, 'r': 0.022988505747126436, 'f1': 0.0443213296398892}, 'combined': 0.03100309758061215, 'epoch': 25} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Korean: {'template': {'p': 0.9333333333333333, 'r': 0.5511811023622047, 'f1': 0.693069306930693}, 'slot': {'p': 0.625, 'r': 0.023946360153256706, 'f1': 0.04612546125461255}, 'combined': 0.03196814146359286, 'epoch': 25} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Russian: {'template': {'p': 0.9333333333333333, 'r': 0.5511811023622047, 'f1': 0.693069306930693}, 'slot': {'p': 0.65, 'r': 0.02490421455938697, 'f1': 0.047970479704797044}, 'combined': 0.033246867122136564, 'epoch': 25} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 25} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 25} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 
'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 26 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:09:37.363925: step: 4/77, loss: 0.04784005880355835 2023-01-22 16:09:38.793998: step: 8/77, loss: 2.8473150450736284e-06 2023-01-22 16:09:40.331801: step: 12/77, loss: 9.548091838951223e-06 2023-01-22 16:09:41.818414: step: 16/77, loss: 0.0014276622096076608 2023-01-22 16:09:43.256990: step: 20/77, loss: 2.6076907033711905e-07 2023-01-22 16:09:44.633106: step: 24/77, loss: 3.7040456390968757e-06 2023-01-22 16:09:46.095142: step: 28/77, loss: 6.224661774467677e-05 2023-01-22 16:09:47.547662: step: 32/77, loss: 7.467347131751012e-06 2023-01-22 16:09:48.997007: step: 36/77, loss: 5.811447323367247e-08 2023-01-22 16:09:50.434663: step: 40/77, loss: 9.983402378566097e-06 2023-01-22 16:09:51.855124: step: 44/77, loss: 0.0008675174904055893 2023-01-22 16:09:53.345686: step: 48/77, loss: 7.077368354657665e-05 2023-01-22 16:09:54.804709: step: 52/77, loss: 2.8470356483012438e-05 2023-01-22 16:09:56.277356: step: 56/77, loss: 3.979622488259338e-05 2023-01-22 16:09:57.703698: step: 60/77, loss: 9.252169547835365e-06 2023-01-22 16:09:59.157171: step: 64/77, loss: 4.431839261087589e-05 2023-01-22 16:10:00.587363: step: 68/77, loss: 0.02976342663168907 2023-01-22 16:10:02.008896: step: 72/77, loss: 8.807202902971767e-06 2023-01-22 16:10:03.487949: step: 76/77, loss: 6.207458227436291e-06 2023-01-22 16:10:04.986819: step: 80/77, loss: 0.0013492414727807045 2023-01-22 16:10:06.538293: step: 84/77, loss: 4.9216712795896456e-05 2023-01-22 16:10:08.068097: step: 88/77, loss: 0.00016324827447533607 2023-01-22 16:10:09.471731: step: 92/77, loss: 9.795940059120767e-06 2023-01-22 16:10:10.874757: step: 96/77, loss: 0.0 2023-01-22 16:10:12.290573: step: 100/77, loss: 0.0014232625253498554 2023-01-22 16:10:13.791104: step: 104/77, loss: 0.001880752039141953 2023-01-22 16:10:15.262237: step: 108/77, loss: 0.01114499382674694 2023-01-22 16:10:16.691764: step: 112/77, loss: 7.269035450008232e-06 2023-01-22 16:10:18.087781: step: 116/77, loss: 4.693801542998699e-07 2023-01-22 16:10:19.587162: step: 120/77, loss: 4.696446922025643e-05 2023-01-22 16:10:21.070695: step: 124/77, loss: 9.22371896194818e-07 2023-01-22 16:10:22.443146: step: 128/77, loss: 2.6138286557397805e-05 2023-01-22 16:10:23.833887: step: 132/77, loss: 1.3712205145566259e-05 2023-01-22 16:10:25.320045: step: 136/77, loss: 0.000178498710738495 
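The per-step losses above are printed at every fourth micro-batch (step 4, 8, 12, ...), which matches --accumulate_step 4 in the command line; with --batch_size 10 that gives an effective batch of 40 examples per optimizer update. A minimal sketch of such an accumulation loop, assuming train.py follows the standard pattern (the real loop may differ):

accumulate_step = 4  # from --accumulate_step in the command line above

def train_one_epoch(model, loader, optimizer):
    # Standard gradient accumulation: average gradients over `accumulate_step`
    # micro-batches, then take one optimizer step.
    model.train()
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(**batch)                 # hypothetical: model returns a scalar loss
        (loss / accumulate_step).backward()
        if i % accumulate_step == 0:
            optimizer.step()
            optimizer.zero_grad()
            # a "step: <i>/..., loss: ..." line would be printed at this cadence
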
2023-01-22 16:10:26.770395: step: 140/77, loss: 0.0001281556615140289 2023-01-22 16:10:28.254433: step: 144/77, loss: 0.03462947532534599 2023-01-22 16:10:29.679366: step: 148/77, loss: 0.001991844270378351 2023-01-22 16:10:31.157175: step: 152/77, loss: 0.0002328802802367136 2023-01-22 16:10:32.641127: step: 156/77, loss: 4.421785706654191e-05 2023-01-22 16:10:34.126810: step: 160/77, loss: 0.00010313611710444093 2023-01-22 16:10:35.640433: step: 164/77, loss: 6.162692443467677e-05 2023-01-22 16:10:37.104463: step: 168/77, loss: 5.409025902736175e-07 2023-01-22 16:10:38.585935: step: 172/77, loss: 2.4480957563355332e-06 2023-01-22 16:10:40.067150: step: 176/77, loss: 1.3261976050671365e-07 2023-01-22 16:10:41.543910: step: 180/77, loss: 0.0001113208127208054 2023-01-22 16:10:43.002297: step: 184/77, loss: 0.0019749528728425503 2023-01-22 16:10:44.409748: step: 188/77, loss: 2.3482689357479103e-06 2023-01-22 16:10:45.881661: step: 192/77, loss: 3.1944089187163627e-06 2023-01-22 16:10:47.391918: step: 196/77, loss: 1.287994109588908e-05 2023-01-22 16:10:48.866601: step: 200/77, loss: 0.0002692329871933907 2023-01-22 16:10:50.388211: step: 204/77, loss: 2.9279453883646056e-06 2023-01-22 16:10:51.871843: step: 208/77, loss: 1.5795144747698942e-07 2023-01-22 16:10:53.398142: step: 212/77, loss: 3.730396201717667e-05 2023-01-22 16:10:54.831927: step: 216/77, loss: 2.549501459725434e-06 2023-01-22 16:10:56.298430: step: 220/77, loss: 9.834749903347983e-08 2023-01-22 16:10:57.701419: step: 224/77, loss: 0.00016617476649116725 2023-01-22 16:10:59.230416: step: 228/77, loss: 0.0018007175531238317 2023-01-22 16:11:00.671014: step: 232/77, loss: 0.001583755249157548 2023-01-22 16:11:02.138625: step: 236/77, loss: 3.902427124558017e-06 2023-01-22 16:11:03.675496: step: 240/77, loss: 4.989836452296004e-05 2023-01-22 16:11:05.106586: step: 244/77, loss: 5.3792897233506665e-05 2023-01-22 16:11:06.503395: step: 248/77, loss: 1.5794988712514169e-06 2023-01-22 16:11:07.926721: step: 252/77, loss: 0.0005338468472473323 2023-01-22 16:11:09.389115: step: 256/77, loss: 8.940695295223122e-09 2023-01-22 16:11:10.878825: step: 260/77, loss: 8.642645354939305e-08 2023-01-22 16:11:12.313456: step: 264/77, loss: 0.019210506230592728 2023-01-22 16:11:13.732839: step: 268/77, loss: 0.0313275121152401 2023-01-22 16:11:15.190088: step: 272/77, loss: 2.1308578368461895e-07 2023-01-22 16:11:16.625569: step: 276/77, loss: 5.051455786997394e-07 2023-01-22 16:11:18.039583: step: 280/77, loss: 0.00013217749074101448 2023-01-22 16:11:19.483400: step: 284/77, loss: 4.2323437810409814e-05 2023-01-22 16:11:20.941456: step: 288/77, loss: 5.2535860959324054e-06 2023-01-22 16:11:22.389448: step: 292/77, loss: 4.424926828505704e-06 2023-01-22 16:11:23.904623: step: 296/77, loss: 1.7269662748731207e-06 2023-01-22 16:11:25.428349: step: 300/77, loss: 0.0019558239728212357 2023-01-22 16:11:26.891960: step: 304/77, loss: 3.26333918110322e-07 2023-01-22 16:11:28.320658: step: 308/77, loss: 0.0022490471601486206 2023-01-22 16:11:29.762651: step: 312/77, loss: 0.011483880691230297 2023-01-22 16:11:31.240716: step: 316/77, loss: 0.00025594650651328266 2023-01-22 16:11:32.680984: step: 320/77, loss: 0.002967665670439601 2023-01-22 16:11:34.125302: step: 324/77, loss: 0.0038033199962228537 2023-01-22 16:11:35.613631: step: 328/77, loss: 6.2326298575499095e-06 2023-01-22 16:11:37.040080: step: 332/77, loss: 0.0007190780597738922 2023-01-22 16:11:38.511713: step: 336/77, loss: 2.8578956516867038e-06 2023-01-22 16:11:39.970209: step: 340/77, loss: 
7.960732182255015e-05 2023-01-22 16:11:41.515133: step: 344/77, loss: 1.039443668560125e-05 2023-01-22 16:11:42.938111: step: 348/77, loss: 0.00019749348575714976 2023-01-22 16:11:44.416666: step: 352/77, loss: 0.006660194601863623 2023-01-22 16:11:45.834446: step: 356/77, loss: 4.912525582767557e-06 2023-01-22 16:11:47.317328: step: 360/77, loss: 8.645362868264783e-06 2023-01-22 16:11:48.828038: step: 364/77, loss: 2.507797944417689e-06 2023-01-22 16:11:50.282987: step: 368/77, loss: 1.5254820937116165e-05 2023-01-22 16:11:51.706903: step: 372/77, loss: 6.432225200114772e-05 2023-01-22 16:11:53.146728: step: 376/77, loss: 5.948862053628545e-06 2023-01-22 16:11:54.654397: step: 380/77, loss: 5.07792537973728e-05 2023-01-22 16:11:56.080066: step: 384/77, loss: 9.581846097717062e-05 2023-01-22 16:11:57.553894: step: 388/77, loss: 0.00010153924813494086 ================================================== Loss: 0.002 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.48717948717948717, 'r': 0.035916824196597356, 'f1': 0.06690140845070423}, 'combined': 0.04839676356008391, 'epoch': 26} Test Chinese: {'template': {'p': 0.9, 'r': 0.5669291338582677, 'f1': 0.6956521739130436}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030775395712282112, 'epoch': 26} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.48717948717948717, 'r': 0.035916824196597356, 'f1': 0.06690140845070423}, 'combined': 0.04839676356008391, 'epoch': 26} Test Korean: {'template': {'p': 0.9, 'r': 0.5669291338582677, 'f1': 0.6956521739130436}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030775395712282112, 'epoch': 26} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.48717948717948717, 'r': 0.035916824196597356, 'f1': 0.06690140845070423}, 'combined': 0.04839676356008391, 'epoch': 26} Test Russian: {'template': {'p': 0.9113924050632911, 'r': 0.5669291338582677, 'f1': 0.6990291262135924}, 'slot': {'p': 0.5853658536585366, 'r': 0.022988505747126436, 'f1': 0.04423963133640553}, 'combined': 0.030924790837099016, 'epoch': 26} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 26} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 26} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 26} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 
0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 27 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:13:32.434995: step: 4/77, loss: 0.0020111622288823128 2023-01-22 16:13:33.831589: step: 8/77, loss: 0.005289367865771055 2023-01-22 16:13:35.221065: step: 12/77, loss: 0.009806794114410877 2023-01-22 16:13:36.662435: step: 16/77, loss: 3.1394179131893907e-06 2023-01-22 16:13:38.095592: step: 20/77, loss: 0.00015450039063580334 2023-01-22 16:13:39.479265: step: 24/77, loss: 6.839962679805467e-06 2023-01-22 16:13:40.969704: step: 28/77, loss: 2.7089104150945786e-06 2023-01-22 16:13:42.366835: step: 32/77, loss: 6.594689330086112e-05 2023-01-22 16:13:43.750611: step: 36/77, loss: 9.025496183312498e-06 2023-01-22 16:13:45.243344: step: 40/77, loss: 3.0696125463691715e-07 2023-01-22 16:13:46.760031: step: 44/77, loss: 9.000206091513974e-07 2023-01-22 16:13:48.200342: step: 48/77, loss: 0.00011323521175654605 2023-01-22 16:13:49.681871: step: 52/77, loss: 8.859327863319777e-06 2023-01-22 16:13:51.118420: step: 56/77, loss: 2.0056279481650563e-06 2023-01-22 16:13:52.601510: step: 60/77, loss: 3.904079335370625e-07 2023-01-22 16:13:54.075283: step: 64/77, loss: 9.086877980735153e-06 2023-01-22 16:13:55.593589: step: 68/77, loss: 0.0001458696642657742 2023-01-22 16:13:57.079082: step: 72/77, loss: 3.5315466107022075e-07 2023-01-22 16:13:58.552867: step: 76/77, loss: 0.0003111056284978986 2023-01-22 16:13:59.969920: step: 80/77, loss: 9.596192285243887e-07 2023-01-22 16:14:01.389576: step: 84/77, loss: 0.001855726120993495 2023-01-22 16:14:02.799259: step: 88/77, loss: 5.6624362088086855e-08 2023-01-22 16:14:04.328852: step: 92/77, loss: 0.010057814419269562 2023-01-22 16:14:05.805836: step: 96/77, loss: 1.5428369806613773e-05 2023-01-22 16:14:07.246456: step: 100/77, loss: 2.6822010568139376e-07 2023-01-22 16:14:08.746731: step: 104/77, loss: 2.0606764792319154e-06 2023-01-22 16:14:10.190278: step: 108/77, loss: 2.8383819881128147e-06 2023-01-22 16:14:11.677548: step: 112/77, loss: 2.1321848180377856e-06 2023-01-22 16:14:13.163964: step: 116/77, loss: 
3.6209658560437674e-07 2023-01-22 16:14:14.697813: step: 120/77, loss: 9.953644166671438e-07 2023-01-22 16:14:16.101705: step: 124/77, loss: 1.6166787872862187e-06 2023-01-22 16:14:17.532925: step: 128/77, loss: 0.0011151016224175692 2023-01-22 16:14:18.978210: step: 132/77, loss: 0.00011054376227548346 2023-01-22 16:14:20.444710: step: 136/77, loss: 0.00026065309066325426 2023-01-22 16:14:21.850432: step: 140/77, loss: 1.5362244312200346e-06 2023-01-22 16:14:23.332616: step: 144/77, loss: 0.0011059599928557873 2023-01-22 16:14:24.819068: step: 148/77, loss: 8.669017915963195e-06 2023-01-22 16:14:26.280265: step: 152/77, loss: 2.066708020720398e-06 2023-01-22 16:14:27.782004: step: 156/77, loss: 1.4453820540438755e-06 2023-01-22 16:14:29.268619: step: 160/77, loss: 0.0002578452986199409 2023-01-22 16:14:30.692660: step: 164/77, loss: 1.2174112953289296e-06 2023-01-22 16:14:32.121233: step: 168/77, loss: 2.4884829485927185e-07 2023-01-22 16:14:33.600200: step: 172/77, loss: 1.3260229025036097e-05 2023-01-22 16:14:35.053601: step: 176/77, loss: 1.1512148375913966e-05 2023-01-22 16:14:36.485452: step: 180/77, loss: 6.968080015212763e-06 2023-01-22 16:14:37.964536: step: 184/77, loss: 7.897610032614466e-08 2023-01-22 16:14:39.428868: step: 188/77, loss: 4.321317987887596e-07 2023-01-22 16:14:40.903670: step: 192/77, loss: 0.00020230596419423819 2023-01-22 16:14:42.339855: step: 196/77, loss: 0.0015696781920269132 2023-01-22 16:14:43.825781: step: 200/77, loss: 0.00011516185622895136 2023-01-22 16:14:45.301428: step: 204/77, loss: 2.7409183530835435e-05 2023-01-22 16:14:46.819678: step: 208/77, loss: 0.0013656478840857744 2023-01-22 16:14:48.298769: step: 212/77, loss: 0.1737237274646759 2023-01-22 16:14:49.764008: step: 216/77, loss: 0.0015293348114937544 2023-01-22 16:14:51.182581: step: 220/77, loss: 1.0832799262061599e-06 2023-01-22 16:14:52.644660: step: 224/77, loss: 0.04633089154958725 2023-01-22 16:14:54.096421: step: 228/77, loss: 2.4570554160163738e-06 2023-01-22 16:14:55.583165: step: 232/77, loss: 2.358338315389119e-05 2023-01-22 16:14:57.048518: step: 236/77, loss: 0.0001362025795970112 2023-01-22 16:14:58.485592: step: 240/77, loss: 0.0022872784174978733 2023-01-22 16:14:59.934375: step: 244/77, loss: 0.0002901510742958635 2023-01-22 16:15:01.396416: step: 248/77, loss: 4.127592490021925e-07 2023-01-22 16:15:02.862788: step: 252/77, loss: 0.0012548953527584672 2023-01-22 16:15:04.330636: step: 256/77, loss: 8.046128641581163e-06 2023-01-22 16:15:05.854354: step: 260/77, loss: 2.9128809728717897e-06 2023-01-22 16:15:07.318502: step: 264/77, loss: 3.427696356084198e-05 2023-01-22 16:15:08.753931: step: 268/77, loss: 0.0005216635181568563 2023-01-22 16:15:10.195000: step: 272/77, loss: 0.0001378964225295931 2023-01-22 16:15:11.666771: step: 276/77, loss: 0.04889054223895073 2023-01-22 16:15:13.104393: step: 280/77, loss: 4.67103700430016e-06 2023-01-22 16:15:14.541702: step: 284/77, loss: 0.006070047616958618 2023-01-22 16:15:16.017831: step: 288/77, loss: 6.64962426526472e-05 2023-01-22 16:15:17.515767: step: 292/77, loss: 8.231154060922563e-05 2023-01-22 16:15:18.985872: step: 296/77, loss: 2.6422974769957364e-05 2023-01-22 16:15:20.432904: step: 300/77, loss: 1.3237780876806937e-05 2023-01-22 16:15:21.880998: step: 304/77, loss: 0.00024358625523746014 2023-01-22 16:15:23.431496: step: 308/77, loss: 0.00024345442943740636 2023-01-22 16:15:24.859376: step: 312/77, loss: 0.0005456964136101305 2023-01-22 16:15:26.318571: step: 316/77, loss: 2.260304881929187e-06 2023-01-22 
16:15:27.727941: step: 320/77, loss: 1.0200102224189322e-05 2023-01-22 16:15:29.208121: step: 324/77, loss: 4.4255710918150726e-07 2023-01-22 16:15:30.661420: step: 328/77, loss: 0.0003438709245529026 2023-01-22 16:15:32.110113: step: 332/77, loss: 1.2218927736284968e-07 2023-01-22 16:15:33.589615: step: 336/77, loss: 0.0007437565363943577 2023-01-22 16:15:35.062764: step: 340/77, loss: 1.1518320661707548e-06 2023-01-22 16:15:36.489660: step: 344/77, loss: 2.518285100450157e-07 2023-01-22 16:15:37.979839: step: 348/77, loss: 1.6122501165227732e-06 2023-01-22 16:15:39.451613: step: 352/77, loss: 2.1308495945504546e-07 2023-01-22 16:15:41.011025: step: 356/77, loss: 4.947146408085246e-07 2023-01-22 16:15:42.394751: step: 360/77, loss: 1.2606059272002312e-06 2023-01-22 16:15:43.902132: step: 364/77, loss: 2.786503614515823e-07 2023-01-22 16:15:45.372277: step: 368/77, loss: 0.00010207617015112191 2023-01-22 16:15:46.850861: step: 372/77, loss: 5.129635610501282e-06 2023-01-22 16:15:48.380812: step: 376/77, loss: 1.1533345514180837e-06 2023-01-22 16:15:49.813094: step: 380/77, loss: 2.965313683489512e-07 2023-01-22 16:15:51.267319: step: 384/77, loss: 1.5348166471085278e-07 2023-01-22 16:15:52.712615: step: 388/77, loss: 0.00011031147732865065 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 27} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6470588235294118, 'r': 0.0210727969348659, 'f1': 0.04081632653061225}, 'combined': 0.02816326530612245, 'epoch': 27} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 27} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6470588235294118, 'r': 0.0210727969348659, 'f1': 0.04081632653061225}, 'combined': 0.02816326530612245, 'epoch': 27} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 27} Test Russian: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6666666666666666, 'r': 0.0210727969348659, 'f1': 0.040854224698235846}, 'combined': 0.028189415041782732, 'epoch': 27} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 27} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 
'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 28 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:17:27.440904: step: 4/77, loss: 2.8942846256541088e-05 2023-01-22 16:17:28.847171: step: 8/77, loss: 6.383044819813222e-06 2023-01-22 16:17:30.255593: step: 12/77, loss: 0.38529524207115173 2023-01-22 16:17:31.759524: step: 16/77, loss: 1.1622874040995157e-07 2023-01-22 16:17:33.262354: step: 20/77, loss: 8.282003545900807e-05 2023-01-22 16:17:34.789535: step: 24/77, loss: 0.00024234222655650228 2023-01-22 16:17:36.264678: step: 28/77, loss: 4.962017214893422e-07 2023-01-22 16:17:37.781345: step: 32/77, loss: 0.00011762435315176845 2023-01-22 16:17:39.299719: step: 36/77, loss: 1.34330575747299e-05 2023-01-22 16:17:40.783070: step: 40/77, loss: 3.195981889803079e-06 2023-01-22 16:17:42.210274: step: 44/77, loss: 2.91780070256209e-05 2023-01-22 16:17:43.617401: step: 48/77, loss: 1.639127233943327e-08 2023-01-22 16:17:45.101012: step: 52/77, loss: 1.8924416167465097e-07 2023-01-22 16:17:46.563951: step: 56/77, loss: 2.447695442242548e-05 2023-01-22 16:17:48.015775: step: 60/77, loss: 0.00011267983791185543 2023-01-22 16:17:49.488583: step: 64/77, loss: 2.6079122108058073e-05 2023-01-22 16:17:50.945725: step: 68/77, loss: 0.00016295979730784893 2023-01-22 16:17:52.409869: step: 72/77, loss: 0.0003402327129151672 2023-01-22 16:17:53.836531: step: 76/77, loss: 0.00013205315917730331 2023-01-22 16:17:55.289259: step: 80/77, loss: 0.0006189257837831974 2023-01-22 16:17:56.734365: step: 84/77, loss: 0.06094278395175934 2023-01-22 16:17:58.209234: step: 88/77, loss: 0.0015897268895059824 2023-01-22 16:17:59.669688: step: 92/77, loss: 9.85594124358613e-06 2023-01-22 16:18:01.231502: step: 96/77, loss: 0.003623634809628129 2023-01-22 16:18:02.739670: step: 100/77, loss: 2.8410710001480766e-05 2023-01-22 
16:18:04.139605: step: 104/77, loss: 1.7216219930560328e-05 2023-01-22 16:18:05.632336: step: 108/77, loss: 6.97291616233997e-05 2023-01-22 16:18:07.079343: step: 112/77, loss: 8.642653881452134e-08 2023-01-22 16:18:08.466998: step: 116/77, loss: 0.00021646979439537972 2023-01-22 16:18:09.909711: step: 120/77, loss: 1.060941031028051e-06 2023-01-22 16:18:11.331391: step: 124/77, loss: 2.460104269630392e-06 2023-01-22 16:18:12.821697: step: 128/77, loss: 3.4941976991831325e-06 2023-01-22 16:18:14.297182: step: 132/77, loss: 1.1175850289646405e-07 2023-01-22 16:18:15.785958: step: 136/77, loss: 1.7627659190111444e-06 2023-01-22 16:18:17.271119: step: 140/77, loss: 3.652025043265894e-06 2023-01-22 16:18:18.675816: step: 144/77, loss: 5.454878646560246e-06 2023-01-22 16:18:20.144782: step: 148/77, loss: 5.781562322226819e-07 2023-01-22 16:18:21.641335: step: 152/77, loss: 0.005288339219987392 2023-01-22 16:18:23.084131: step: 156/77, loss: 2.9308903322089463e-06 2023-01-22 16:18:24.551691: step: 160/77, loss: 0.0009818144608289003 2023-01-22 16:18:25.985193: step: 164/77, loss: 1.9466653611743823e-05 2023-01-22 16:18:27.438589: step: 168/77, loss: 9.548571688355878e-05 2023-01-22 16:18:28.862378: step: 172/77, loss: 0.00023758788302075118 2023-01-22 16:18:30.299413: step: 176/77, loss: 8.720810001250356e-05 2023-01-22 16:18:31.792983: step: 180/77, loss: 2.7718302590074018e-05 2023-01-22 16:18:33.247896: step: 184/77, loss: 0.0006940797902643681 2023-01-22 16:18:34.724907: step: 188/77, loss: 0.00012655448517762125 2023-01-22 16:18:36.197884: step: 192/77, loss: 7.862450001994148e-05 2023-01-22 16:18:37.639126: step: 196/77, loss: 6.022163506713696e-05 2023-01-22 16:18:39.149408: step: 200/77, loss: 1.6852578710313537e-06 2023-01-22 16:18:40.559732: step: 204/77, loss: 0.00011896291835000739 2023-01-22 16:18:42.030772: step: 208/77, loss: 0.00020110560581088066 2023-01-22 16:18:43.482086: step: 212/77, loss: 0.00014157082478050143 2023-01-22 16:18:44.913663: step: 216/77, loss: 0.012474425137043 2023-01-22 16:18:46.347255: step: 220/77, loss: 0.012295715510845184 2023-01-22 16:18:47.853471: step: 224/77, loss: 4.8905312723945826e-05 2023-01-22 16:18:49.268167: step: 228/77, loss: 0.013076450675725937 2023-01-22 16:18:50.669289: step: 232/77, loss: 0.00601159455254674 2023-01-22 16:18:52.159806: step: 236/77, loss: 0.00045055043301545084 2023-01-22 16:18:53.597874: step: 240/77, loss: 0.00036595933488570154 2023-01-22 16:18:55.080231: step: 244/77, loss: 7.813594493200071e-06 2023-01-22 16:18:56.615448: step: 248/77, loss: 0.016424184665083885 2023-01-22 16:18:58.108681: step: 252/77, loss: 1.2740142665279564e-06 2023-01-22 16:18:59.504766: step: 256/77, loss: 0.00037757365498691797 2023-01-22 16:19:00.950732: step: 260/77, loss: 4.6495690185111016e-05 2023-01-22 16:19:02.369294: step: 264/77, loss: 2.9504150234060944e-07 2023-01-22 16:19:03.799010: step: 268/77, loss: 1.7438815120840445e-05 2023-01-22 16:19:05.308525: step: 272/77, loss: 7.411535989376716e-06 2023-01-22 16:19:06.829093: step: 276/77, loss: 0.03300153464078903 2023-01-22 16:19:08.232280: step: 280/77, loss: 0.0009415661334060133 2023-01-22 16:19:09.655062: step: 284/77, loss: 3.7816782878508093e-06 2023-01-22 16:19:11.139849: step: 288/77, loss: 1.4901160305669237e-09 2023-01-22 16:19:12.535397: step: 292/77, loss: 7.241833941407094e-07 2023-01-22 16:19:14.068569: step: 296/77, loss: 0.000881119049154222 2023-01-22 16:19:15.555089: step: 300/77, loss: 0.05586878955364227 2023-01-22 16:19:17.002450: step: 304/77, loss: 
3.1915890303935157e-06 2023-01-22 16:19:18.443518: step: 308/77, loss: 2.7564183255890384e-06 2023-01-22 16:19:19.926620: step: 312/77, loss: 0.03436660394072533 2023-01-22 16:19:21.378782: step: 316/77, loss: 0.06724284589290619 2023-01-22 16:19:22.825105: step: 320/77, loss: 2.246089934487827e-05 2023-01-22 16:19:24.246546: step: 324/77, loss: 5.081255949335173e-07 2023-01-22 16:19:25.723454: step: 328/77, loss: 0.0034213995095342398 2023-01-22 16:19:27.211609: step: 332/77, loss: 1.2901400623377413e-05 2023-01-22 16:19:28.722608: step: 336/77, loss: 2.0954481442458928e-05 2023-01-22 16:19:30.200882: step: 340/77, loss: 3.9483088585257065e-06 2023-01-22 16:19:31.711208: step: 344/77, loss: 0.0020019214134663343 2023-01-22 16:19:33.122612: step: 348/77, loss: 1.9796116248471662e-05 2023-01-22 16:19:34.590532: step: 352/77, loss: 1.0555484550422989e-05 2023-01-22 16:19:36.099517: step: 356/77, loss: 0.0013992226449772716 2023-01-22 16:19:37.519591: step: 360/77, loss: 0.01580088399350643 2023-01-22 16:19:38.933509: step: 364/77, loss: 7.182278523032437e-07 2023-01-22 16:19:40.432910: step: 368/77, loss: 0.00010402529733255506 2023-01-22 16:19:41.886200: step: 372/77, loss: 2.1308616737769626e-07 2023-01-22 16:19:43.320606: step: 376/77, loss: 0.0032554087229073048 2023-01-22 16:19:44.818548: step: 380/77, loss: 7.881976489443332e-05 2023-01-22 16:19:46.261549: step: 384/77, loss: 7.672096398891881e-05 2023-01-22 16:19:47.710682: step: 388/77, loss: 3.7518868339248e-05 ================================================== Loss: 0.008 -------------------- Dev Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5833333333333334, 'f1': 0.7291666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05125951962507322, 'epoch': 28} Test Chinese: {'template': {'p': 0.958904109589041, 'r': 0.5511811023622047, 'f1': 0.7000000000000001}, 'slot': {'p': 0.7222222222222222, 'r': 0.02490421455938697, 'f1': 0.04814814814814814}, 'combined': 0.0337037037037037, 'epoch': 28} Dev Korean: {'template': {'p': 0.9722222222222222, 'r': 0.5833333333333334, 'f1': 0.7291666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05125951962507322, 'epoch': 28} Test Korean: {'template': {'p': 0.9594594594594594, 'r': 0.5590551181102362, 'f1': 0.7064676616915422}, 'slot': {'p': 0.7222222222222222, 'r': 0.02490421455938697, 'f1': 0.04814814814814814}, 'combined': 0.03401510963700018, 'epoch': 28} Dev Russian: {'template': {'p': 0.9722222222222222, 'r': 0.5833333333333334, 'f1': 0.7291666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05125951962507322, 'epoch': 28} Test Russian: {'template': {'p': 0.958904109589041, 'r': 0.5511811023622047, 'f1': 0.7000000000000001}, 'slot': {'p': 0.7222222222222222, 'r': 0.02490421455938697, 'f1': 0.04814814814814814}, 'combined': 0.0337037037037037, 'epoch': 28} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 28} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} ================================================== Current best result: -------------------- Dev for Chinese: 
{'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1} ****************************** Epoch: 29 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 16:21:22.609541: step: 4/77, loss: 0.0005631856038235128 2023-01-22 16:21:24.062697: step: 8/77, loss: 4.136122515774332e-05 2023-01-22 16:21:25.505505: step: 12/77, loss: 0.0001225726882694289 2023-01-22 16:21:26.938808: step: 16/77, loss: 7.301562732209277e-08 2023-01-22 16:21:28.396818: step: 20/77, loss: 0.0005650555831380188 2023-01-22 16:21:29.849746: step: 24/77, loss: 0.0009148400276899338 2023-01-22 16:21:31.316229: step: 28/77, loss: 0.00013402664626482874 2023-01-22 16:21:32.775476: step: 32/77, loss: 1.793707451724913e-05 2023-01-22 16:21:34.258335: step: 36/77, loss: 0.0007169814198277891 2023-01-22 16:21:35.674137: step: 40/77, loss: 0.0024766293354332447 2023-01-22 16:21:37.160122: step: 44/77, loss: 0.003547999542206526 2023-01-22 16:21:38.599874: step: 48/77, loss: 8.046615107559774e-08 2023-01-22 16:21:40.016977: step: 52/77, loss: 0.0014436140190809965 2023-01-22 16:21:41.461718: step: 56/77, loss: 5.613188477582298e-05 2023-01-22 16:21:42.864178: step: 60/77, loss: 2.6672881858758046e-07 2023-01-22 16:21:44.340932: step: 64/77, loss: 1.982338471862022e-05 2023-01-22 16:21:45.795938: step: 68/77, loss: 4.73556065117009e-05 2023-01-22 16:21:47.310584: step: 72/77, loss: 0.021486656740307808 2023-01-22 16:21:48.750899: step: 76/77, loss: 7.748585062472557e-08 2023-01-22 16:21:50.177191: step: 80/77, loss: 0.00015511229867115617 2023-01-22 
16:21:51.640688: step: 84/77, loss: 0.013580088503658772 2023-01-22 16:21:53.072986: step: 88/77, loss: 0.00035687192576006055 2023-01-22 16:21:54.532724: step: 92/77, loss: 0.00039333925815299153 2023-01-22 16:21:56.011660: step: 96/77, loss: 0.025791311636567116 2023-01-22 16:21:57.430648: step: 100/77, loss: 0.00138597481418401 2023-01-22 16:21:58.928348: step: 104/77, loss: 0.00127062585670501 2023-01-22 16:22:00.397719: step: 108/77, loss: 0.10522889345884323 2023-01-22 16:22:01.851866: step: 112/77, loss: 5.688203600584529e-06 2023-01-22 16:22:03.279165: step: 116/77, loss: 4.932891442877008e-06 2023-01-22 16:22:04.827236: step: 120/77, loss: 9.68143194768345e-06 2023-01-22 16:22:06.382931: step: 124/77, loss: 0.0003503952466417104 2023-01-22 16:22:07.914219: step: 128/77, loss: 0.10227132588624954 2023-01-22 16:22:09.375157: step: 132/77, loss: 0.005473359487950802 2023-01-22 16:22:10.858753: step: 136/77, loss: 5.8472727687330917e-05 2023-01-22 16:22:12.334191: step: 140/77, loss: 2.466635669406969e-05 2023-01-22 16:22:13.823405: step: 144/77, loss: 3.632585139712319e-05 2023-01-22 16:22:15.277773: step: 148/77, loss: 1.9609051378211007e-06 2023-01-22 16:22:16.738154: step: 152/77, loss: 1.266596143523202e-07 2023-01-22 16:22:18.215106: step: 156/77, loss: 1.301693009736482e-05 2023-01-22 16:22:19.642340: step: 160/77, loss: 0.00011166852345922962 2023-01-22 16:22:21.135150: step: 164/77, loss: 0.0001841976190917194 2023-01-22 16:22:22.594820: step: 168/77, loss: 7.271597723956802e-07 2023-01-22 16:22:24.068675: step: 172/77, loss: 1.0699464837671258e-05 2023-01-22 16:22:25.512514: step: 176/77, loss: 6.024013782734983e-05 2023-01-22 16:22:26.993575: step: 180/77, loss: 0.00010401514009572566 2023-01-22 16:22:28.448635: step: 184/77, loss: 1.3306503205967601e-06 2023-01-22 16:22:29.886853: step: 188/77, loss: 4.654767508327495e-06 2023-01-22 16:22:31.349650: step: 192/77, loss: 0.04118737578392029 2023-01-22 16:22:32.814716: step: 196/77, loss: 4.619356985813283e-08 2023-01-22 16:22:34.255614: step: 200/77, loss: 6.482191383838654e-05 2023-01-22 16:22:35.707466: step: 204/77, loss: 0.00015450670616701245 2023-01-22 16:22:37.195987: step: 208/77, loss: 0.020734887570142746 2023-01-22 16:22:38.690524: step: 212/77, loss: 1.4778873264731374e-05 2023-01-22 16:22:40.085539: step: 216/77, loss: 0.00012442428851500154 2023-01-22 16:22:41.562828: step: 220/77, loss: 0.07303691655397415 2023-01-22 16:22:43.046495: step: 224/77, loss: 4.962039383826777e-07 2023-01-22 16:22:44.472978: step: 228/77, loss: 1.813038943510037e-05 2023-01-22 16:22:45.943539: step: 232/77, loss: 3.216864797650487e-06 2023-01-22 16:22:47.333536: step: 236/77, loss: 2.3240434529725462e-05 2023-01-22 16:22:48.832634: step: 240/77, loss: 6.098870653659105e-05 2023-01-22 16:22:50.368288: step: 244/77, loss: 0.0034880172461271286 2023-01-22 16:22:51.818934: step: 248/77, loss: 7.400177764793625e-06 2023-01-22 16:22:53.261120: step: 252/77, loss: 3.127576292172307e-06 2023-01-22 16:22:54.704595: step: 256/77, loss: 0.010403799824416637 2023-01-22 16:22:56.210419: step: 260/77, loss: 9.33471710595768e-06 2023-01-22 16:22:57.667690: step: 264/77, loss: 0.008462383411824703 2023-01-22 16:22:59.048655: step: 268/77, loss: 0.00030241074273362756 2023-01-22 16:23:00.572894: step: 272/77, loss: 2.279755335621303e-06 2023-01-22 16:23:02.084737: step: 276/77, loss: 0.002295043785125017 2023-01-22 16:23:03.526390: step: 280/77, loss: 0.000481647060951218 2023-01-22 16:23:04.914536: step: 284/77, loss: 8.996405085781589e-06 
2023-01-22 16:23:06.359067: step: 288/77, loss: 1.1264480235695373e-05 2023-01-22 16:23:07.818486: step: 292/77, loss: 6.616064069930871e-07 2023-01-22 16:23:09.272105: step: 296/77, loss: 0.0005194866680540144 2023-01-22 16:23:10.654379: step: 300/77, loss: 2.9397268008324318e-05 2023-01-22 16:23:12.134673: step: 304/77, loss: 2.837587999238167e-05 2023-01-22 16:23:13.587207: step: 308/77, loss: 1.9426523067522794e-05 2023-01-22 16:23:15.059063: step: 312/77, loss: 8.68957613420207e-06 2023-01-22 16:23:16.527186: step: 316/77, loss: 0.006418840028345585 2023-01-22 16:23:18.016330: step: 320/77, loss: 0.00017917045624926686 2023-01-22 16:23:19.420945: step: 324/77, loss: 0.00012857986439485103 2023-01-22 16:23:20.914945: step: 328/77, loss: 4.828783494303934e-05 2023-01-22 16:23:22.370708: step: 332/77, loss: 0.0035517828073352575 2023-01-22 16:23:23.845603: step: 336/77, loss: 0.00045442316331900656 2023-01-22 16:23:25.325245: step: 340/77, loss: 0.00037013780092820525 2023-01-22 16:23:26.790688: step: 344/77, loss: 2.3277403670363128e-05 2023-01-22 16:23:28.247956: step: 348/77, loss: 2.0234533621987794e-06 2023-01-22 16:23:29.728729: step: 352/77, loss: 7.673408617847599e-06 2023-01-22 16:23:31.175195: step: 356/77, loss: 5.6945304095279425e-05 2023-01-22 16:23:32.704312: step: 360/77, loss: 5.298125324770808e-05 2023-01-22 16:23:34.180901: step: 364/77, loss: 0.00014197189011611044 2023-01-22 16:23:35.651686: step: 368/77, loss: 1.4955718143028207e-05 2023-01-22 16:23:37.049035: step: 372/77, loss: 6.67990098008886e-05 2023-01-22 16:23:38.518544: step: 376/77, loss: 0.0001283240708289668 2023-01-22 16:23:40.014713: step: 380/77, loss: 0.0003709930751938373 2023-01-22 16:23:41.512335: step: 384/77, loss: 4.221945710014552e-05 2023-01-22 16:23:43.032216: step: 388/77, loss: 3.209413989679888e-06 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 29} Test Chinese: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.75, 'r': 0.020114942528735632, 'f1': 0.03917910447761194}, 'combined': 0.0271694292357309, 'epoch': 29} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29} Test Korean: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.7241379310344828, 'r': 0.020114942528735632, 'f1': 0.0391425908667288}, 'combined': 0.02714410823923907, 'epoch': 29} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29} Test Russian: {'template': {'p': 0.958904109589041, 'r': 0.5511811023622047, 'f1': 0.7000000000000001}, 'slot': {'p': 0.7241379310344828, 'r': 0.020114942528735632, 'f1': 0.0391425908667288}, 'combined': 0.02739981360671016, 'epoch': 29} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 29} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 
'combined': 0.0, 'epoch': 29} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 29} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 1} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 1} Test for Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028376635341809474, 'epoch': 1} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 1}
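
Note on the evaluation dictionaries above: each 'template' and 'slot' entry reports precision (p), recall (r), and their harmonic mean (f1), and the 'combined' score matches the product of the two F1 values (e.g. epoch 29 Dev Chinese: 0.7368421052631579 × 0.07029876977152899 ≈ 0.05179909351586346; the same relation holds for epoch 28 Dev Chinese, 0.7291666… × 0.0702987… ≈ 0.0512595…). The scoring code itself is not part of this log, so the snippet below is only a minimal sketch that re-derives the epoch-29 Dev Chinese figures from the logged p/r values; the helper name f1 is illustrative and not taken from train.py.

```python
import math

def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Logged epoch-29 "Dev Chinese" values (copied from the log above).
template_p, template_r = 1.0, 0.5833333333333334
slot_p, slot_r = 0.5, 0.03780718336483932

template_f1 = f1(template_p, template_r)   # -> 0.7368421052631579
slot_f1 = f1(slot_p, slot_r)               # -> 0.07029876977152899
combined = template_f1 * slot_f1           # -> 0.0517990935158634...

assert math.isclose(template_f1, 0.7368421052631579)
assert math.isclose(slot_f1, 0.07029876977152899)
assert math.isclose(combined, 0.05179909351586346)
print(f"template f1={template_f1:.6f}, slot f1={slot_f1:.6f}, combined={combined:.6f}")
```

If this reading is right, 'combined' is not an independent metric but simply template F1 scaled by slot F1, which explains why it collapses to 0.0 whenever the slot F1 is 0.0 (as in the Sample Korean rows).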
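Because the per-step loss entries in this log are emitted as a flat stream of the form "<timestamp>: step: N/77, loss: <value>", a small parser is convenient for plotting or averaging them. The sketch below matches that observed format with a regular expression; the "train.log" path is a hypothetical placeholder, and nothing here comes from train.py itself.

```python
import re
from datetime import datetime

# Matches entries like:
#   "2023-01-22 16:23:04.914536: step: 284/77, loss: 8.996405085781589e-06"
STEP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): step: (\d+)/\d+, loss: ([0-9.eE+-]+)"
)

def parse_steps(text: str):
    """Yield (timestamp, step, loss) tuples from a raw training-log string."""
    for m in STEP_RE.finditer(text):
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        yield ts, int(m.group(2)), float(m.group(3))

if __name__ == "__main__":
    # "train.log" is a placeholder; point it at wherever this log is saved.
    with open("train.log", encoding="utf-8") as fh:
        losses = [loss for _, _, loss in parse_steps(fh.read())]
    if losses:
        print(f"{len(losses)} logged steps, mean loss {sum(losses) / len(losses):.6f}")
```

finditer works on the raw text even though the entries are not newline-separated, so the parser does not depend on how the log was wrapped when it was saved.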