Command that produces this log: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
----------------------------------------------------------------------------------------------------
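For reference, the command line above corresponds to a hyperparameter configuration along these lines. This is a minimal sketch assuming a standard argparse setup: the flag names are taken from the command itself, while the types and defaults are assumptions, not the actual train.py source.

import argparse

# Hypothetical parser consistent with the flags in the command above;
# types and defaults are assumptions, not the real train.py definitions.
parser = argparse.ArgumentParser()
parser.add_argument("--model_name", type=str, default="template")
parser.add_argument("--xlmr_model_name", type=str, default="xlm-roberta-large")
parser.add_argument("--batch_size", type=int, default=10)
parser.add_argument("--xlmr_learning_rate", type=float, default=2e-5)  # LR for the XLM-R encoder
parser.add_argument("--learning_rate", type=float, default=2e-4)       # LR, presumably for the task heads
parser.add_argument("--event_hidden_num", type=int, default=450)       # matches the 450-dim classifier layers below
parser.add_argument("--accumulate_step", type=int, default=4)          # gradient-accumulation window
parser.add_argument("--max_epoch", type=int, default=30)
parser.add_argument("--p1_data_weight", type=float, default=0.1)
args = parser.parse_args()

Passing two learning rates is the usual pattern of a small LR for the pretrained encoder and a larger one for the randomly initialized heads.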
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024])
>>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024])
>>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024])
>>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024])
>>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024])
>>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096])
>>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096])
>>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024])
>>> [xlmr.encoder.layer.1 through xlmr.encoder.layer.23: each of the remaining 23 encoder layers lists the same 16 parameters with the same shapes as layer 0]
>>> xlmr.pooler.dense.weight: torch.Size([1024, 1024])
>>> xlmr.pooler.dense.bias: torch.Size([1024])
>>> trans_rep.weight: torch.Size([1024, 2048])
>>> trans_rep.bias: torch.Size([1024])
>>> hidden_ffns.Corruplate.layers.0.weight: torch.Size([768, 1024])
>>> hidden_ffns.Corruplate.layers.0.bias: torch.Size([768])
>>> [hidden_ffns.<template>.*: the same two parameters, with the same shapes, for Cybercrimeplate, Disasterplate, Displacementplate, Epidemiplate, Etiplate, Protestplate, Terrorplate]
>>> template_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768])
>>> template_classifiers.Corruplate.layers.0.bias: torch.Size([450])
>>> template_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450])
>>> template_classifiers.Corruplate.layers.1.bias: torch.Size([2])
>>> [template_classifiers.<template>.*: the same four parameters, with the same shapes, for the other seven templates]
>>> type_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768])
>>> type_classifiers.Corruplate.layers.0.bias: torch.Size([450])
>>> type_classifiers.Corruplate.layers.1.weight: torch.Size([6, 450])
>>> type_classifiers.Corruplate.layers.1.bias: torch.Size([6])
>>> [type_classifiers.<template>.*: the same four parameters, with the same shapes, for the other seven templates]
>>> completion_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768])
>>> completion_classifiers.Corruplate.layers.0.bias: torch.Size([450])
>>> completion_classifiers.Corruplate.layers.1.weight: torch.Size([4, 450])
>>> completion_classifiers.Corruplate.layers.1.bias: torch.Size([4])
>>> [completion_classifiers.<template>.*: the same four parameters, with the same shapes, for the other seven templates]
>>> overtime_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768])
>>> overtime_classifiers.Corruplate.layers.0.bias: torch.Size([450])
>>> overtime_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450])
>>> overtime_classifiers.Corruplate.layers.1.bias: torch.Size([2])
>>> [overtime_classifiers.<template>.*: the same four parameters, with the same shapes, for the other seven templates]
>>> coordinated_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768])
>>> coordinated_classifiers.Corruplate.layers.0.bias: torch.Size([450])
>>> coordinated_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450])
>>> coordinated_classifiers.Corruplate.layers.1.bias: torch.Size([2])
>>> [coordinated_classifiers.<template>.*: the same four parameters, with the same shapes, for the other seven templates]
n_trainable_params: 582185936, n_nontrainable_params: 0
----------------------------------------------------------------------------------------------------
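The dump above follows the standard PyTorch pattern of iterating over named_parameters(); the total of 582,185,936 is consistent with roughly 560M parameters for the xlm-roberta-large encoder stack plus roughly 22M for the listed task heads. A minimal sketch that reproduces this report format, assuming any torch.nn.Module:

import torch

def report_trainable(model: torch.nn.Module) -> None:
    # One ">>> name: shape" line per tensor with requires_grad=True,
    # followed by the trainable/non-trainable totals, as in the dump above.
    print("> trainable params:")
    n_trainable = n_nontrainable = 0
    for name, param in model.named_parameters():
        if param.requires_grad:
            print(f">>> {name}: {param.shape}")  # str(param.shape) renders as torch.Size([...])
            n_trainable += param.numel()
        else:
            n_nontrainable += param.numel()
    print(f"n_trainable_params: {n_trainable}, n_nontrainable_params: {n_nontrainable}")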
******************************
Epoch: 0
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:28:18.951742: step: 4/77, loss: 1.049109935760498
2023-01-23 23:28:20.383950: step: 8/77, loss: 1.0610548257827759
2023-01-23 23:28:21.869999: step: 12/77, loss: 1.0632109642028809
2023-01-23 23:28:23.362375: step: 16/77, loss: 1.0482066869735718
2023-01-23 23:28:24.845112: step: 20/77, loss: 1.0590332746505737
2023-01-23 23:28:26.263819: step: 24/77, loss: 1.0530996322631836
2023-01-23 23:28:27.704810: step: 28/77, loss: 1.0462825298309326
2023-01-23 23:28:29.131750: step: 32/77, loss: 1.0212979316711426
2023-01-23 23:28:30.593187: step: 36/77, loss: 1.0319734811782837
2023-01-23 23:28:32.041434: step: 40/77, loss: 1.0269728899002075
2023-01-23 23:28:33.490562: step: 44/77, loss: 1.013390064239502
2023-01-23 23:28:35.028146: step: 48/77, loss: 1.0231409072875977
2023-01-23 23:28:36.516722: step: 52/77, loss: 1.013485312461853
2023-01-23 23:28:37.997338: step: 56/77, loss: 0.9836816787719727
2023-01-23 23:28:39.403067: step: 60/77, loss: 0.9867761135101318
2023-01-23 23:28:40.865612: step: 64/77, loss: 0.9728533029556274
2023-01-23 23:28:42.300124: step: 68/77, loss: 0.9669691324234009
2023-01-23 23:28:43.702718: step: 72/77, loss: 0.9389466047286987
2023-01-23 23:28:45.141563: step: 76/77, loss: 0.9677916169166565
2023-01-23 23:28:46.640692: step: 80/77, loss: 0.9292247295379639
2023-01-23 23:28:48.106388: step: 84/77, loss: 0.9160377979278564
2023-01-23 23:28:49.588958: step: 88/77, loss: 0.9040213227272034
2023-01-23 23:28:51.047192: step: 92/77, loss: 0.898930549621582
2023-01-23 23:28:52.513958: step: 96/77, loss: 0.8593113422393799
2023-01-23 23:28:53.918349: step: 100/77, loss: 0.8631086945533752
2023-01-23 23:28:55.356944: step: 104/77, loss: 0.8304163217544556
2023-01-23 23:28:56.938642: step: 108/77, loss: 0.8112912178039551
2023-01-23 23:28:58.484132: step: 112/77, loss: 0.8210156559944153
2023-01-23 23:28:59.944420: step: 116/77, loss: 0.770197331905365
2023-01-23 23:29:01.418966: step: 120/77, loss: 0.7747995853424072
2023-01-23 23:29:02.844904: step: 124/77, loss: 0.7348337173461914
2023-01-23 23:29:04.281136: step: 128/77, loss: 0.7300294637680054
2023-01-23 23:29:05.694653: step: 132/77, loss: 0.6968427896499634
2023-01-23 23:29:07.164858: step: 136/77, loss: 0.6787471771240234
2023-01-23 23:29:08.582662: step: 140/77, loss: 0.6531374454498291
2023-01-23 23:29:10.028360: step: 144/77, loss: 0.6747608184814453
2023-01-23 23:29:11.484633: step: 148/77, loss: 0.6466701626777649
2023-01-23 23:29:12.989856: step: 152/77, loss: 0.5864278674125671
2023-01-23 23:29:14.429565: step: 156/77, loss: 0.5997405052185059
2023-01-23 23:29:15.888691: step: 160/77, loss: 0.514805793762207
2023-01-23 23:29:17.366970: step: 164/77, loss: 0.5388752222061157
2023-01-23 23:29:18.794391: step: 168/77, loss: 0.5424827337265015
2023-01-23 23:29:20.302598: step: 172/77, loss: 0.4766864776611328
2023-01-23 23:29:21.780238: step: 176/77, loss: 0.4511604905128479
2023-01-23 23:29:23.286704: step: 180/77, loss: 0.43818655610084534
2023-01-23 23:29:24.783416: step: 184/77, loss: 0.36380887031555176
2023-01-23 23:29:26.249781: step: 188/77, loss: 0.40842753648757935
2023-01-23 23:29:27.621221: step: 192/77, loss: 0.3447669744491577
2023-01-23 23:29:29.089645: step: 196/77, loss: 0.37230125069618225
2023-01-23 23:29:30.504637: step: 200/77, loss: 0.35574567317962646
2023-01-23 23:29:31.963935: step: 204/77, loss: 0.2980503737926483
2023-01-23 23:29:33.439462: step: 208/77, loss: 0.25997936725616455
2023-01-23 23:29:34.842929: step: 212/77, loss: 0.23611074686050415
2023-01-23 23:29:36.288284: step: 216/77, loss: 0.23891782760620117
2023-01-23 23:29:37.659612: step: 220/77, loss: 0.372081458568573
2023-01-23 23:29:39.113659: step: 224/77, loss: 0.18111208081245422
2023-01-23 23:29:40.554021: step: 228/77, loss: 0.2535715103149414
2023-01-23 23:29:41.987141: step: 232/77, loss: 0.21446116268634796
2023-01-23 23:29:43.427468: step: 236/77, loss: 0.13724784553050995
2023-01-23 23:29:44.910693: step: 240/77, loss: 0.11958912014961243
2023-01-23 23:29:46.404155: step: 244/77, loss: 0.11895409226417542
2023-01-23 23:29:47.804649: step: 248/77, loss: 0.1942715346813202
2023-01-23 23:29:49.258951: step: 252/77, loss: 0.17173314094543457
2023-01-23 23:29:50.651225: step: 256/77, loss: 0.44372865557670593
2023-01-23 23:29:52.120009: step: 260/77, loss: 0.12320338934659958
2023-01-23 23:29:53.587823: step: 264/77, loss: 0.08322672545909882
2023-01-23 23:29:55.070447: step: 268/77, loss: 0.11128869652748108
2023-01-23 23:29:56.550275: step: 272/77, loss: 0.1094222366809845
2023-01-23 23:29:58.050331: step: 276/77, loss: 0.07033980637788773
2023-01-23 23:29:59.441636: step: 280/77, loss: 0.1360885351896286
2023-01-23 23:30:00.937192: step: 284/77, loss: 0.05895698070526123
2023-01-23 23:30:02.395350: step: 288/77, loss: 0.08663325011730194
2023-01-23 23:30:03.887363: step: 292/77, loss: 0.051292695105075836
2023-01-23 23:30:05.338662: step: 296/77, loss: 0.0824483186006546
2023-01-23 23:30:06.740503: step: 300/77, loss: 0.07237327843904495
2023-01-23 23:30:08.120073: step: 304/77, loss: 0.07325203716754913
2023-01-23 23:30:09.575293: step: 308/77, loss: 0.13217729330062866
2023-01-23 23:30:11.031281: step: 312/77, loss: 0.03724844008684158
2023-01-23 23:30:12.418955: step: 316/77, loss: 0.08004172146320343
2023-01-23 23:30:13.875014: step: 320/77, loss: 0.2094523310661316
2023-01-23 23:30:15.331558: step: 324/77, loss: 0.08899670839309692
2023-01-23 23:30:16.800684: step: 328/77, loss: 0.09950350224971771
2023-01-23 23:30:18.225031: step: 332/77, loss: 0.12434931099414825
2023-01-23 23:30:19.713165: step: 336/77, loss: 0.26316243410110474
2023-01-23 23:30:21.198805: step: 340/77, loss: 0.09677863121032715
2023-01-23 23:30:22.601877: step: 344/77, loss: 0.061031147837638855
2023-01-23 23:30:24.025830: step: 348/77, loss: 0.034742482006549835
2023-01-23 23:30:25.447097: step: 352/77, loss: 0.07141724228858948
2023-01-23 23:30:26.873774: step: 356/77, loss: 0.14917293190956116
2023-01-23 23:30:28.355041: step: 360/77, loss: 0.12509801983833313
2023-01-23 23:30:29.786396: step: 364/77, loss: 0.1556428074836731
2023-01-23 23:30:31.239950: step: 368/77, loss: 0.1296318918466568
2023-01-23 23:30:32.669795: step: 372/77, loss: 0.05960507690906525
2023-01-23 23:30:34.055301: step: 376/77, loss: 0.0793580412864685
2023-01-23 23:30:35.481220: step: 380/77, loss: 0.32634079456329346
2023-01-23 23:30:36.919804: step: 384/77, loss: 0.07845164090394974
2023-01-23 23:30:38.330913: step: 388/77, loss: 0.15920400619506836
==================================================
Loss: 0.478
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
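Each loss line above carries a timestamp, a step counter, and the batch loss; with --accumulate_step 4 the counter advances by 4, presumably one line per optimizer step (note that in this run the counter keeps counting past the printed /77 denominator, so the denominator is evidently computed from a different quantity than the loop index). Below is a minimal sketch of such a loop with stand-in model and data; all names here are illustrative, not the actual train.py code. The epoch summary line (e.g. "Loss: 0.478") is then just the mean of the logged step losses.

import datetime
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the real model and data, which train.py builds elsewhere.
model = nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
loader = DataLoader(TensorDataset(torch.randn(40, 8), torch.randn(40, 1)), batch_size=10)
criterion = nn.MSELoss()
accumulate_step = 4

step_losses = []
optimizer.zero_grad()
for i, (x, y) in enumerate(loader, start=1):
    loss = criterion(model(x), y)
    (loss / accumulate_step).backward()  # scale so gradients average over the window
    if i % accumulate_step == 0:         # one optimizer step per accumulate_step batches
        optimizer.step()
        optimizer.zero_grad()
        step_losses.append(loss.item())
        # str(datetime.datetime.now()) matches the "2023-01-23 23:28:18.951742" format above
        print(f"{datetime.datetime.now()}: step: {i}/{len(loader)}, loss: {loss.item()}")
print("==================================================")
print(f"Loss: {sum(step_losses) / len(step_losses):.3f}")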
******************************
Epoch: 1
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:32:31.797942: step: 4/77, loss: 0.12096530944108963
2023-01-23 23:32:33.227481: step: 8/77, loss: 0.12882250547409058
2023-01-23 23:32:34.590079: step: 12/77, loss: 0.09012199193239212
2023-01-23 23:32:36.021189: step: 16/77, loss: 0.08511804044246674
2023-01-23 23:32:37.473050: step: 20/77, loss: 0.11885780096054077
2023-01-23 23:32:38.872494: step: 24/77, loss: 0.05991317704319954
2023-01-23 23:32:40.328905: step: 28/77, loss: 0.1036485880613327
2023-01-23 23:32:41.800627: step: 32/77, loss: 0.07189082354307175
2023-01-23 23:32:43.207141: step: 36/77, loss: 0.08334918320178986
2023-01-23 23:32:44.663238: step: 40/77, loss: 0.10813970863819122
2023-01-23 23:32:46.093807: step: 44/77, loss: 0.03178555145859718
2023-01-23 23:32:47.539480: step: 48/77, loss: 0.05043605715036392
2023-01-23 23:32:48.984699: step: 52/77, loss: 0.08065943419933319
2023-01-23 23:32:50.351511: step: 56/77, loss: 0.14960873126983643
2023-01-23 23:32:51.792650: step: 60/77, loss: 0.08098729699850082
2023-01-23 23:32:53.270740: step: 64/77, loss: 0.08247891813516617
2023-01-23 23:32:54.760508: step: 68/77, loss: 0.09875787049531937
2023-01-23 23:32:56.197179: step: 72/77, loss: 0.12581631541252136
2023-01-23 23:32:57.665216: step: 76/77, loss: 0.13350044190883636
2023-01-23 23:32:59.091855: step: 80/77, loss: 0.07994288206100464
2023-01-23 23:33:00.548055: step: 84/77, loss: 0.04435095936059952
2023-01-23 23:33:01.995610: step: 88/77, loss: 0.08299937844276428
2023-01-23 23:33:03.472181: step: 92/77, loss: 0.29303139448165894
2023-01-23 23:33:04.897801: step: 96/77, loss: 0.10736545920372009
2023-01-23 23:33:06.308936: step: 100/77, loss: 0.040313832461833954
2023-01-23 23:33:07.754487: step: 104/77, loss: 0.1304677128791809
2023-01-23 23:33:09.210471: step: 108/77, loss: 0.13540500402450562
2023-01-23 23:33:10.704451: step: 112/77, loss: 0.04073725640773773
2023-01-23 23:33:12.181508: step: 116/77, loss: 0.10316254198551178
2023-01-23 23:33:13.598869: step: 120/77, loss: 0.06302158534526825
2023-01-23 23:33:15.040643: step: 124/77, loss: 0.13651029765605927
2023-01-23 23:33:16.489280: step: 128/77, loss: 0.13189075887203217
2023-01-23 23:33:17.945356: step: 132/77, loss: 0.05164668709039688
2023-01-23 23:33:19.298558: step: 136/77, loss: 0.0451040118932724
2023-01-23 23:33:20.738495: step: 140/77, loss: 0.49520280957221985
2023-01-23 23:33:22.239122: step: 144/77, loss: 0.1343473643064499
2023-01-23 23:33:23.699767: step: 148/77, loss: 0.09947903454303741
2023-01-23 23:33:25.170081: step: 152/77, loss: 0.19668103754520416
2023-01-23 23:33:26.589208: step: 156/77, loss: 0.1421450674533844
2023-01-23 23:33:27.987360: step: 160/77, loss: 0.11359035968780518
2023-01-23 23:33:29.442017: step: 164/77, loss: 0.3328215479850769
2023-01-23 23:33:30.919197: step: 168/77, loss: 0.06621459871530533
2023-01-23 23:33:32.424002: step: 172/77, loss: 0.12150992453098297
2023-01-23 23:33:33.894247: step: 176/77, loss: 0.04091096669435501
2023-01-23 23:33:35.365771: step: 180/77, loss: 0.09361355006694794
2023-01-23 23:33:36.785858: step: 184/77, loss: 0.15000203251838684
2023-01-23 23:33:38.216806: step: 188/77, loss: 0.03671369329094887
2023-01-23 23:33:39.672620: step: 192/77, loss: 0.07287999987602234
2023-01-23 23:33:41.080351: step: 196/77, loss: 0.16896918416023254
2023-01-23 23:33:42.503411: step: 200/77, loss: 0.0673668384552002
2023-01-23 23:33:43.858318: step: 204/77, loss: 0.09352917224168777
2023-01-23 23:33:45.247705: step: 208/77, loss: 0.08409743010997772
2023-01-23 23:33:46.724049: step: 212/77, loss: 0.10592477023601532
2023-01-23 23:33:48.164269: step: 216/77, loss: 0.0648825615644455
2023-01-23 23:33:49.608622: step: 220/77, loss: 0.09974086284637451
2023-01-23 23:33:50.983169: step: 224/77, loss: 0.12693047523498535
2023-01-23 23:33:52.435129: step: 228/77, loss: 0.05584800988435745
2023-01-23 23:33:53.861366: step: 232/77, loss: 0.057105544954538345
2023-01-23 23:33:55.254716: step: 236/77, loss: 0.10636867582798004
2023-01-23 23:33:56.736627: step: 240/77, loss: 0.09843003749847412
2023-01-23 23:33:58.175765: step: 244/77, loss: 0.13390299677848816
2023-01-23 23:33:59.686274: step: 248/77, loss: 0.15013137459754944
2023-01-23 23:34:01.147498: step: 252/77, loss: 0.2896338999271393
2023-01-23 23:34:02.601980: step: 256/77, loss: 0.09500610828399658
2023-01-23 23:34:04.081575: step: 260/77, loss: 0.13277631998062134
2023-01-23 23:34:05.509191: step: 264/77, loss: 0.08409351855516434
2023-01-23 23:34:06.970738: step: 268/77, loss: 0.09715725481510162
2023-01-23 23:34:08.430487: step: 272/77, loss: 0.07679928839206696
2023-01-23 23:34:09.838073: step: 276/77, loss: 0.11415556818246841
2023-01-23 23:34:11.255994: step: 280/77, loss: 0.044916920363903046
2023-01-23 23:34:12.684635: step: 284/77, loss: 0.06441016495227814
2023-01-23 23:34:14.197970: step: 288/77, loss: 0.04012870043516159
2023-01-23 23:34:15.681222: step: 292/77, loss: 0.09672766923904419
2023-01-23 23:34:17.094719: step: 296/77, loss: 0.03515460342168808
2023-01-23 23:34:18.557858: step: 300/77, loss: 0.1350592076778412
2023-01-23 23:34:20.000222: step: 304/77, loss: 0.1378142237663269
2023-01-23 23:34:21.519263: step: 308/77, loss: 0.06225842237472534
2023-01-23 23:34:22.918915: step: 312/77, loss: 0.11226215213537216
2023-01-23 23:34:24.325265: step: 316/77, loss: 0.1410035789012909
2023-01-23 23:34:25.795108: step: 320/77, loss: 0.10280730575323105
2023-01-23 23:34:27.285636: step: 324/77, loss: 0.12254294008016586
2023-01-23 23:34:28.730420: step: 328/77, loss: 0.08656258881092072
2023-01-23 23:34:30.165982: step: 332/77, loss: 0.09989891946315765
2023-01-23 23:34:31.648595: step: 336/77, loss: 0.17827792465686798
2023-01-23 23:34:33.095877: step: 340/77, loss: 0.06311401724815369
2023-01-23 23:34:34.507035: step: 344/77, loss: 0.14989399909973145
2023-01-23 23:34:35.960852: step: 348/77, loss: 0.1893458217382431
2023-01-23 23:34:37.433245: step: 352/77, loss: 0.0510341040790081
2023-01-23 23:34:38.808222: step: 356/77, loss: 0.09770865738391876
2023-01-23 23:34:40.290687: step: 360/77, loss: 0.11236228048801422
2023-01-23 23:34:41.758503: step: 364/77, loss: 0.07876268029212952
2023-01-23 23:34:43.279728: step: 368/77, loss: 0.0608571320772171
2023-01-23 23:34:44.762920: step: 372/77, loss: 0.0839594379067421
2023-01-23 23:34:46.192876: step: 376/77, loss: 0.1273816078901291
2023-01-23 23:34:47.608654: step: 380/77, loss: 0.10643083602190018
2023-01-23 23:34:49.064288: step: 384/77, loss: 0.11674681305885315
2023-01-23 23:34:50.458589: step: 388/77, loss: 0.05906044691801071
==================================================
Loss: 0.108
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
==================================================
'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} ****************************** Epoch: 2 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:36:25.526782: step: 4/77, loss: 0.10029887408018112 2023-01-23 23:36:26.986604: step: 8/77, loss: 0.2068578004837036 2023-01-23 23:36:28.450346: step: 12/77, loss: 0.15776383876800537 2023-01-23 23:36:29.931536: step: 16/77, loss: 0.07037387043237686 2023-01-23 23:36:31.448612: step: 20/77, loss: 0.2502818703651428 2023-01-23 23:36:32.887850: step: 24/77, loss: 0.05786603316664696 2023-01-23 23:36:34.313763: step: 28/77, loss: 0.0692940503358841 2023-01-23 23:36:35.825820: step: 32/77, loss: 0.20832131803035736 2023-01-23 23:36:37.260446: step: 36/77, loss: 0.122525155544281 2023-01-23 23:36:38.708867: step: 40/77, loss: 0.22703193128108978 2023-01-23 23:36:40.166752: step: 44/77, loss: 0.06502736359834671 2023-01-23 23:36:41.623047: step: 48/77, loss: 0.051964953541755676 2023-01-23 23:36:43.126439: step: 52/77, loss: 0.04871102795004845 2023-01-23 23:36:44.593257: step: 56/77, loss: 0.21226727962493896 2023-01-23 23:36:46.028678: step: 60/77, loss: 0.07580306380987167 2023-01-23 23:36:47.496038: step: 64/77, loss: 0.03063333034515381 2023-01-23 23:36:48.909533: step: 68/77, loss: 0.03472524136304855 2023-01-23 23:36:50.382509: step: 72/77, loss: 0.0658077821135521 2023-01-23 23:36:51.818254: step: 76/77, loss: 0.18776100873947144 2023-01-23 23:36:53.281647: step: 80/77, loss: 0.07431071996688843 2023-01-23 23:36:54.690267: step: 84/77, loss: 0.23396560549736023 2023-01-23 23:36:56.170247: step: 88/77, loss: 0.13445770740509033 2023-01-23 23:36:57.578083: step: 92/77, loss: 0.12409112602472305 2023-01-23 23:36:59.031017: step: 96/77, loss: 0.08051728457212448 2023-01-23 23:37:00.455995: step: 100/77, loss: 0.0654061883687973 2023-01-23 23:37:01.902467: step: 104/77, loss: 0.0790565013885498 2023-01-23 23:37:03.358499: step: 108/77, loss: 0.10501079261302948 2023-01-23 23:37:04.829367: step: 112/77, loss: 0.08053272217512131 2023-01-23 23:37:06.337336: step: 116/77, loss: 0.18260684609413147 2023-01-23 23:37:07.820755: step: 120/77, loss: 0.0984596386551857 2023-01-23 23:37:09.265167: step: 124/77, loss: 0.04549170657992363 2023-01-23 23:37:10.731038: step: 128/77, loss: 0.07625871896743774 2023-01-23 23:37:12.117761: step: 132/77, loss: 0.0222511924803257 2023-01-23 23:37:13.605972: step: 136/77, loss: 0.05851700156927109 2023-01-23 23:37:15.014471: step: 140/77, loss: 0.0521090030670166 2023-01-23 23:37:16.548338: step: 144/77, loss: 
0.1650797724723816 2023-01-23 23:37:18.033702: step: 148/77, loss: 0.039547763764858246 2023-01-23 23:37:19.430468: step: 152/77, loss: 0.005704985931515694 2023-01-23 23:37:20.903534: step: 156/77, loss: 0.018853096291422844 2023-01-23 23:37:22.372780: step: 160/77, loss: 0.05911945924162865 2023-01-23 23:37:23.849819: step: 164/77, loss: 0.012836996465921402 2023-01-23 23:37:25.359891: step: 168/77, loss: 0.01420543435961008 2023-01-23 23:37:26.855461: step: 172/77, loss: 0.0073021515272557735 2023-01-23 23:37:28.286950: step: 176/77, loss: 0.05086870491504669 2023-01-23 23:37:29.753857: step: 180/77, loss: 0.03042435087263584 2023-01-23 23:37:31.224003: step: 184/77, loss: 0.026089057326316833 2023-01-23 23:37:32.725516: step: 188/77, loss: 0.009571501985192299 2023-01-23 23:37:34.306503: step: 192/77, loss: 0.027215342968702316 2023-01-23 23:37:35.786381: step: 196/77, loss: 0.037625692784786224 2023-01-23 23:37:37.211447: step: 200/77, loss: 0.0072413114830851555 2023-01-23 23:37:38.708346: step: 204/77, loss: 0.02484329231083393 2023-01-23 23:37:40.233151: step: 208/77, loss: 0.014388437382876873 2023-01-23 23:37:41.683977: step: 212/77, loss: 0.036508869379758835 2023-01-23 23:37:43.112752: step: 216/77, loss: 0.02174089103937149 2023-01-23 23:37:44.560270: step: 220/77, loss: 0.07830284535884857 2023-01-23 23:37:46.001711: step: 224/77, loss: 0.07827137410640717 2023-01-23 23:37:47.479478: step: 228/77, loss: 0.018267545849084854 2023-01-23 23:37:48.988763: step: 232/77, loss: 0.07455648481845856 2023-01-23 23:37:50.501304: step: 236/77, loss: 0.09529100358486176 2023-01-23 23:37:52.030201: step: 240/77, loss: 0.0072893425822257996 2023-01-23 23:37:53.579655: step: 244/77, loss: 0.007173581048846245 2023-01-23 23:37:54.978243: step: 248/77, loss: 0.084358349442482 2023-01-23 23:37:56.501172: step: 252/77, loss: 0.039768461138010025 2023-01-23 23:37:57.917198: step: 256/77, loss: 0.08220826089382172 2023-01-23 23:37:59.491341: step: 260/77, loss: 0.05335283279418945 2023-01-23 23:38:00.965660: step: 264/77, loss: 0.03304270654916763 2023-01-23 23:38:02.466642: step: 268/77, loss: 0.019193505868315697 2023-01-23 23:38:03.934504: step: 272/77, loss: 0.02530151978135109 2023-01-23 23:38:05.368779: step: 276/77, loss: 0.02418283000588417 2023-01-23 23:38:06.851041: step: 280/77, loss: 0.022593803703784943 2023-01-23 23:38:08.235701: step: 284/77, loss: 0.03843046352267265 2023-01-23 23:38:09.720355: step: 288/77, loss: 0.03328024595975876 2023-01-23 23:38:11.181453: step: 292/77, loss: 0.15002712607383728 2023-01-23 23:38:12.703811: step: 296/77, loss: 0.04646942391991615 2023-01-23 23:38:14.043798: step: 300/77, loss: 0.02150649204850197 2023-01-23 23:38:15.522717: step: 304/77, loss: 0.026621028780937195 2023-01-23 23:38:17.002296: step: 308/77, loss: 0.05212383344769478 2023-01-23 23:38:18.510198: step: 312/77, loss: 0.1913083791732788 2023-01-23 23:38:20.004507: step: 316/77, loss: 0.08444274216890335 2023-01-23 23:38:21.482432: step: 320/77, loss: 0.027384981513023376 2023-01-23 23:38:22.966879: step: 324/77, loss: 0.00819011777639389 2023-01-23 23:38:24.431636: step: 328/77, loss: 0.024473778903484344 2023-01-23 23:38:25.855084: step: 332/77, loss: 0.018421344459056854 2023-01-23 23:38:27.354737: step: 336/77, loss: 0.00830911286175251 2023-01-23 23:38:28.868988: step: 340/77, loss: 0.08067379146814346 2023-01-23 23:38:30.363756: step: 344/77, loss: 0.09274131804704666 2023-01-23 23:38:31.857505: step: 348/77, loss: 0.02914857119321823 2023-01-23 23:38:33.352291: step: 352/77, 
loss: 0.09893452376127243 2023-01-23 23:38:34.795539: step: 356/77, loss: 0.06000610813498497 2023-01-23 23:38:36.274396: step: 360/77, loss: 0.025374433025717735 2023-01-23 23:38:37.748570: step: 364/77, loss: 0.029576759785413742 2023-01-23 23:38:39.183727: step: 368/77, loss: 0.04382958263158798 2023-01-23 23:38:40.686771: step: 372/77, loss: 0.03420332074165344 2023-01-23 23:38:42.118101: step: 376/77, loss: 0.20685172080993652 2023-01-23 23:38:43.562665: step: 380/77, loss: 0.055401045829057693 2023-01-23 23:38:45.027125: step: 384/77, loss: 0.032930437475442886 2023-01-23 23:38:46.527826: step: 388/77, loss: 0.031787943094968796 ================================================== Loss: 0.069 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} Sample Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} New best chinese model... New best korean model... New best russian model... 
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 3 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:40:38.539320: step: 4/77, loss: 0.01282687857747078 2023-01-23 23:40:39.985314: step: 8/77, loss: 0.030225640162825584 2023-01-23 23:40:41.511710: step: 12/77, loss: 0.03756564110517502 2023-01-23 23:40:42.985337: step: 16/77, loss: 0.016713924705982208 2023-01-23 23:40:44.427279: step: 20/77, loss: 0.003959559369832277 2023-01-23 23:40:45.890340: step: 24/77, loss: 0.01994737796485424 2023-01-23 23:40:47.355630: step: 28/77, loss: 0.016755791381001472 2023-01-23 23:40:48.758219: step: 32/77, loss: 0.007182381581515074 2023-01-23 23:40:50.245734: step: 36/77, loss: 0.0026024668477475643 2023-01-23 23:40:51.715414: step: 40/77, loss: 0.03754647821187973 2023-01-23 23:40:53.102736: step: 44/77, loss: 0.06936914473772049 2023-01-23 23:40:54.567326: step: 48/77, loss: 0.025536412373185158 2023-01-23 23:40:56.000554: step: 52/77, loss: 0.04573522135615349 2023-01-23 23:40:57.422513: step: 56/77, loss: 0.005215484648942947 2023-01-23 23:40:58.942453: step: 60/77, loss: 0.0038853264413774014 2023-01-23 23:41:00.388321: step: 64/77, loss: 0.002377648837864399 2023-01-23 23:41:01.832690: step: 68/77, loss: 0.0057394178584218025 2023-01-23 23:41:03.279054: step: 72/77, loss: 0.007972998544573784 2023-01-23 23:41:04.767131: step: 76/77, loss: 
0.01843724027276039 2023-01-23 23:41:06.317987: step: 80/77, loss: 0.007290308829396963 2023-01-23 23:41:07.839304: step: 84/77, loss: 0.09505050629377365 2023-01-23 23:41:09.311742: step: 88/77, loss: 0.1004708856344223 2023-01-23 23:41:10.795162: step: 92/77, loss: 0.00934943649917841 2023-01-23 23:41:12.275641: step: 96/77, loss: 0.016220057383179665 2023-01-23 23:41:13.755972: step: 100/77, loss: 0.08072739839553833 2023-01-23 23:41:15.234988: step: 104/77, loss: 0.005210316739976406 2023-01-23 23:41:16.698033: step: 108/77, loss: 0.1485469937324524 2023-01-23 23:41:18.099529: step: 112/77, loss: 0.01634364202618599 2023-01-23 23:41:19.580767: step: 116/77, loss: 0.02327132597565651 2023-01-23 23:41:21.024876: step: 120/77, loss: 0.03943256288766861 2023-01-23 23:41:22.433389: step: 124/77, loss: 0.0033520772121846676 2023-01-23 23:41:23.943209: step: 128/77, loss: 0.0930645614862442 2023-01-23 23:41:25.375655: step: 132/77, loss: 0.11543893814086914 2023-01-23 23:41:26.811832: step: 136/77, loss: 0.020337283611297607 2023-01-23 23:41:28.251697: step: 140/77, loss: 0.025304848328232765 2023-01-23 23:41:29.729629: step: 144/77, loss: 0.019276432693004608 2023-01-23 23:41:31.224452: step: 148/77, loss: 0.004194684326648712 2023-01-23 23:41:32.648944: step: 152/77, loss: 0.13625293970108032 2023-01-23 23:41:34.141129: step: 156/77, loss: 0.04223461076617241 2023-01-23 23:41:35.502706: step: 160/77, loss: 0.009019285440444946 2023-01-23 23:41:37.026636: step: 164/77, loss: 0.006462970748543739 2023-01-23 23:41:38.485775: step: 168/77, loss: 0.04934345930814743 2023-01-23 23:41:39.961173: step: 172/77, loss: 0.025653602555394173 2023-01-23 23:41:41.421498: step: 176/77, loss: 0.02020188234746456 2023-01-23 23:41:42.857475: step: 180/77, loss: 0.046197310090065 2023-01-23 23:41:44.281002: step: 184/77, loss: 0.024325251579284668 2023-01-23 23:41:45.797065: step: 188/77, loss: 0.06169934570789337 2023-01-23 23:41:47.276312: step: 192/77, loss: 0.037528544664382935 2023-01-23 23:41:48.721568: step: 196/77, loss: 0.011451170779764652 2023-01-23 23:41:50.295422: step: 200/77, loss: 0.014984739013016224 2023-01-23 23:41:51.685853: step: 204/77, loss: 0.09119526296854019 2023-01-23 23:41:53.105848: step: 208/77, loss: 0.19242313504219055 2023-01-23 23:41:54.601488: step: 212/77, loss: 0.0497591570019722 2023-01-23 23:41:56.040812: step: 216/77, loss: 0.032276079058647156 2023-01-23 23:41:57.465780: step: 220/77, loss: 0.055153004825115204 2023-01-23 23:41:58.892557: step: 224/77, loss: 0.04976733401417732 2023-01-23 23:42:00.329368: step: 228/77, loss: 0.032615918666124344 2023-01-23 23:42:01.788184: step: 232/77, loss: 0.05644465982913971 2023-01-23 23:42:03.249482: step: 236/77, loss: 0.01722615212202072 2023-01-23 23:42:04.670823: step: 240/77, loss: 0.01583724468946457 2023-01-23 23:42:06.117837: step: 244/77, loss: 0.01847304217517376 2023-01-23 23:42:07.607713: step: 248/77, loss: 0.022014932706952095 2023-01-23 23:42:09.045265: step: 252/77, loss: 0.09517721831798553 2023-01-23 23:42:10.424830: step: 256/77, loss: 0.028819341212511063 2023-01-23 23:42:11.881348: step: 260/77, loss: 0.031781651079654694 2023-01-23 23:42:13.398759: step: 264/77, loss: 0.023588500916957855 2023-01-23 23:42:14.811140: step: 268/77, loss: 0.060687996447086334 2023-01-23 23:42:16.298303: step: 272/77, loss: 0.09491574764251709 2023-01-23 23:42:17.747578: step: 276/77, loss: 0.11196808516979218 2023-01-23 23:42:19.262782: step: 280/77, loss: 0.1340647041797638 2023-01-23 23:42:20.737364: step: 284/77, loss: 
0.014040486887097359 2023-01-23 23:42:22.190524: step: 288/77, loss: 0.045541491359472275 2023-01-23 23:42:23.621722: step: 292/77, loss: 0.06804996728897095 2023-01-23 23:42:25.118049: step: 296/77, loss: 0.029908739030361176 2023-01-23 23:42:26.578133: step: 300/77, loss: 0.1693681925535202 2023-01-23 23:42:28.027847: step: 304/77, loss: 0.009015132673084736 2023-01-23 23:42:29.522320: step: 308/77, loss: 0.0091236662119627 2023-01-23 23:42:30.973840: step: 312/77, loss: 0.14016039669513702 2023-01-23 23:42:32.413444: step: 316/77, loss: 0.05180913209915161 2023-01-23 23:42:33.869038: step: 320/77, loss: 0.014581729657948017 2023-01-23 23:42:35.275262: step: 324/77, loss: 0.02925998717546463 2023-01-23 23:42:36.807616: step: 328/77, loss: 0.052554160356521606 2023-01-23 23:42:38.275853: step: 332/77, loss: 0.0042217811569571495 2023-01-23 23:42:39.732704: step: 336/77, loss: 0.016710154712200165 2023-01-23 23:42:41.196513: step: 340/77, loss: 0.01677018776535988 2023-01-23 23:42:42.623510: step: 344/77, loss: 0.06060976907610893 2023-01-23 23:42:44.170976: step: 348/77, loss: 0.11510007828474045 2023-01-23 23:42:45.621778: step: 352/77, loss: 0.012733589857816696 2023-01-23 23:42:47.114662: step: 356/77, loss: 0.06944231688976288 2023-01-23 23:42:48.581983: step: 360/77, loss: 0.031336460262537 2023-01-23 23:42:50.067169: step: 364/77, loss: 0.049135543406009674 2023-01-23 23:42:51.568542: step: 368/77, loss: 0.017881611362099648 2023-01-23 23:42:53.055365: step: 372/77, loss: 0.04368804395198822 2023-01-23 23:42:54.619531: step: 376/77, loss: 0.021342255175113678 2023-01-23 23:42:56.139480: step: 380/77, loss: 0.042040806263685226 2023-01-23 23:42:57.634254: step: 384/77, loss: 0.017206795513629913 2023-01-23 23:42:59.120352: step: 388/77, loss: 0.03838924318552017 ================================================== Loss: 0.042 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Chinese: {'template': {'p': 0.9358974358974359, 'r': 0.5748031496062992, 'f1': 0.7121951219512195}, 'slot': {'p': 0.5121951219512195, 'r': 0.020114942528735632, 'f1': 0.03870967741935484}, 'combined': 0.027568843430369788, 'epoch': 3} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Korean: {'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.5128205128205128, 'r': 0.019157088122605363, 'f1': 0.036934441366574325}, 'combined': 0.026071370376405407, 'epoch': 3} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Russian: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.5128205128205128, 'r': 0.019157088122605363, 'f1': 0.036934441366574325}, 'combined': 0.02583591465051012, 'epoch': 3} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} Sample 
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 4 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:44:34.398206: step: 4/77, loss: 0.013566895388066769 2023-01-23 23:44:35.898510: step: 8/77, loss: 0.027779357507824898 2023-01-23 23:44:37.285068: step: 12/77, loss: 0.08235408365726471 2023-01-23 23:44:38.756056: step: 16/77, loss: 0.035187553614377975 2023-01-23 23:44:40.247834: step: 20/77, loss: 0.004929692950099707 2023-01-23 23:44:41.681791: step: 24/77, loss: 0.04051845520734787 2023-01-23 23:44:43.211621: step: 28/77, loss: 0.002299446612596512 2023-01-23 23:44:44.739804: step: 32/77, loss: 0.05444331839680672 2023-01-23 23:44:46.186438: step: 36/77, loss: 0.019775696098804474 2023-01-23 23:44:47.625889: step: 40/77, loss: 0.009626384824514389 2023-01-23 23:44:49.077664: step: 44/77, loss: 0.006166198290884495 2023-01-23 23:44:50.534423: step: 48/77, loss: 0.001957635162398219 2023-01-23 23:44:51.952911: step: 52/77, loss: 0.008192313835024834 2023-01-23 23:44:53.432720: step: 56/77, loss: 0.027922047302126884 2023-01-23 23:44:54.907846: step: 60/77, loss: 0.007568768225610256 2023-01-23 23:44:56.364648: step: 64/77, loss: 0.03756123036146164 
2023-01-23 23:44:57.836624: step: 68/77, loss: 0.003064778633415699 2023-01-23 23:44:59.304368: step: 72/77, loss: 0.06505947560071945 2023-01-23 23:45:00.800521: step: 76/77, loss: 0.10298222303390503 2023-01-23 23:45:02.276065: step: 80/77, loss: 0.05957907810807228 2023-01-23 23:45:03.777035: step: 84/77, loss: 0.025468256324529648 2023-01-23 23:45:05.201475: step: 88/77, loss: 0.008218479342758656 2023-01-23 23:45:06.639851: step: 92/77, loss: 0.0029105516150593758 2023-01-23 23:45:08.073271: step: 96/77, loss: 0.008502427488565445 2023-01-23 23:45:09.547370: step: 100/77, loss: 0.04820306599140167 2023-01-23 23:45:11.017728: step: 104/77, loss: 0.04766437038779259 2023-01-23 23:45:12.492891: step: 108/77, loss: 0.00987369567155838 2023-01-23 23:45:13.969060: step: 112/77, loss: 0.023743394762277603 2023-01-23 23:45:15.390080: step: 116/77, loss: 0.03195509687066078 2023-01-23 23:45:16.870961: step: 120/77, loss: 0.05533948540687561 2023-01-23 23:45:18.350712: step: 124/77, loss: 0.01357505563646555 2023-01-23 23:45:19.867981: step: 128/77, loss: 0.012264851480722427 2023-01-23 23:45:21.322627: step: 132/77, loss: 0.007977421395480633 2023-01-23 23:45:22.819802: step: 136/77, loss: 0.0331607423722744 2023-01-23 23:45:24.286548: step: 140/77, loss: 0.044529691338539124 2023-01-23 23:45:25.722867: step: 144/77, loss: 0.006739187054336071 2023-01-23 23:45:27.230310: step: 148/77, loss: 0.013966698199510574 2023-01-23 23:45:28.728904: step: 152/77, loss: 0.05022020637989044 2023-01-23 23:45:30.198745: step: 156/77, loss: 0.018023833632469177 2023-01-23 23:45:31.652131: step: 160/77, loss: 0.026353932917118073 2023-01-23 23:45:33.093928: step: 164/77, loss: 0.014032849110662937 2023-01-23 23:45:34.534444: step: 168/77, loss: 0.04134642705321312 2023-01-23 23:45:36.030498: step: 172/77, loss: 0.008170973509550095 2023-01-23 23:45:37.446183: step: 176/77, loss: 0.06170719861984253 2023-01-23 23:45:38.845543: step: 180/77, loss: 0.03655800595879555 2023-01-23 23:45:40.314593: step: 184/77, loss: 0.05532701313495636 2023-01-23 23:45:41.784437: step: 188/77, loss: 0.018792198970913887 2023-01-23 23:45:43.177213: step: 192/77, loss: 0.015395881608128548 2023-01-23 23:45:44.694343: step: 196/77, loss: 0.008521834388375282 2023-01-23 23:45:46.188071: step: 200/77, loss: 0.011309171095490456 2023-01-23 23:45:47.636956: step: 204/77, loss: 0.009101646952331066 2023-01-23 23:45:49.050915: step: 208/77, loss: 0.007351151201874018 2023-01-23 23:45:50.544667: step: 212/77, loss: 0.009427372366189957 2023-01-23 23:45:52.059969: step: 216/77, loss: 0.043822627514600754 2023-01-23 23:45:53.551002: step: 220/77, loss: 0.015992596745491028 2023-01-23 23:45:54.993288: step: 224/77, loss: 0.008451610803604126 2023-01-23 23:45:56.516130: step: 228/77, loss: 0.027462324127554893 2023-01-23 23:45:57.966126: step: 232/77, loss: 0.008471709676086903 2023-01-23 23:45:59.419411: step: 236/77, loss: 0.007476914674043655 2023-01-23 23:46:00.912775: step: 240/77, loss: 0.08036915212869644 2023-01-23 23:46:02.284356: step: 244/77, loss: 0.02178315259516239 2023-01-23 23:46:03.726632: step: 248/77, loss: 0.003713016165420413 2023-01-23 23:46:05.241073: step: 252/77, loss: 0.04187320917844772 2023-01-23 23:46:06.728630: step: 256/77, loss: 0.021082069724798203 2023-01-23 23:46:08.226377: step: 260/77, loss: 0.17483514547348022 2023-01-23 23:46:09.737705: step: 264/77, loss: 0.001595023088157177 2023-01-23 23:46:11.172869: step: 268/77, loss: 0.014456291683018208 2023-01-23 23:46:12.650310: step: 272/77, loss: 
0.054439082741737366 2023-01-23 23:46:14.094204: step: 276/77, loss: 0.009083086624741554 2023-01-23 23:46:15.560148: step: 280/77, loss: 0.05662960931658745 2023-01-23 23:46:16.999388: step: 284/77, loss: 0.046708956360816956 2023-01-23 23:46:18.438247: step: 288/77, loss: 0.09646288305521011 2023-01-23 23:46:19.935333: step: 292/77, loss: 0.10016967356204987 2023-01-23 23:46:21.368820: step: 296/77, loss: 0.003909575752913952 2023-01-23 23:46:22.787769: step: 300/77, loss: 0.00645021628588438 2023-01-23 23:46:24.233723: step: 304/77, loss: 0.04972463846206665 2023-01-23 23:46:25.699269: step: 308/77, loss: 0.03274954482913017 2023-01-23 23:46:27.216981: step: 312/77, loss: 0.046244800090789795 2023-01-23 23:46:28.612837: step: 316/77, loss: 0.007613649591803551 2023-01-23 23:46:30.104247: step: 320/77, loss: 0.11676406115293503 2023-01-23 23:46:31.550442: step: 324/77, loss: 0.11077509075403214 2023-01-23 23:46:32.921600: step: 328/77, loss: 0.030607830733060837 2023-01-23 23:46:34.399332: step: 332/77, loss: 0.024657350033521652 2023-01-23 23:46:35.844204: step: 336/77, loss: 0.2149878740310669 2023-01-23 23:46:37.289512: step: 340/77, loss: 0.008865756914019585 2023-01-23 23:46:38.742982: step: 344/77, loss: 0.027378270402550697 2023-01-23 23:46:40.236159: step: 348/77, loss: 0.21051861345767975 2023-01-23 23:46:41.677691: step: 352/77, loss: 0.011898575350642204 2023-01-23 23:46:43.149497: step: 356/77, loss: 0.13614597916603088 2023-01-23 23:46:44.579608: step: 360/77, loss: 0.05477931350469589 2023-01-23 23:46:46.011657: step: 364/77, loss: 0.03741421177983284 2023-01-23 23:46:47.547533: step: 368/77, loss: 0.08789347857236862 2023-01-23 23:46:49.035999: step: 372/77, loss: 0.1083749309182167 2023-01-23 23:46:50.524936: step: 376/77, loss: 0.040636055171489716 2023-01-23 23:46:51.980777: step: 380/77, loss: 0.06483837217092514 2023-01-23 23:46:53.452566: step: 384/77, loss: 0.044129446148872375 2023-01-23 23:46:54.895395: step: 388/77, loss: 0.017115775495767593 ================================================== Loss: 0.038 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 4} Test Chinese: {'template': {'p': 0.9666666666666667, 'r': 0.4566929133858268, 'f1': 0.6203208556149733}, 'slot': {'p': 0.5862068965517241, 'r': 0.016283524904214558, 'f1': 0.031686859273066165}, 'combined': 0.019656019656019656, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 4} Test Korean: {'template': {'p': 0.9666666666666667, 'r': 0.4566929133858268, 'f1': 0.6203208556149733}, 'slot': {'p': 0.6206896551724138, 'r': 0.017241379310344827, 'f1': 0.033550792171481825}, 'combined': 0.020812256106373755, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 4} Test Russian: {'template': {'p': 0.9666666666666667, 'r': 0.4566929133858268, 'f1': 0.6203208556149733}, 'slot': {'p': 0.5862068965517241, 'r': 0.016283524904214558, 'f1': 0.031686859273066165}, 'combined': 0.019656019656019656, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 
'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 4} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 4} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 4} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 5 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:48:30.293244: step: 4/77, loss: 0.036248791962862015 2023-01-23 23:48:31.708192: step: 8/77, loss: 0.009122615680098534 2023-01-23 23:48:33.113804: step: 12/77, loss: 0.0009317334624938667 2023-01-23 23:48:34.533345: step: 16/77, loss: 0.01651964709162712 2023-01-23 23:48:35.975309: step: 20/77, loss: 0.02240084484219551 2023-01-23 23:48:37.370725: step: 24/77, loss: 0.023490462452173233 2023-01-23 23:48:38.814497: step: 28/77, loss: 0.04273740202188492 2023-01-23 23:48:40.286046: step: 32/77, loss: 0.015850331634283066 2023-01-23 23:48:41.747113: step: 36/77, loss: 0.0007848864188417792 2023-01-23 23:48:43.192912: step: 40/77, loss: 0.02575860172510147 2023-01-23 23:48:44.675815: step: 44/77, loss: 0.17241136729717255 2023-01-23 23:48:46.133382: step: 48/77, loss: 0.02504340559244156 2023-01-23 23:48:47.600406: step: 52/77, loss: 
0.10933104902505875 2023-01-23 23:48:49.135448: step: 56/77, loss: 0.101542167365551 2023-01-23 23:48:50.643108: step: 60/77, loss: 0.022160649299621582 2023-01-23 23:48:52.103909: step: 64/77, loss: 0.017529074102640152 2023-01-23 23:48:53.579517: step: 68/77, loss: 0.013295048847794533 2023-01-23 23:48:55.060307: step: 72/77, loss: 0.03185984119772911 2023-01-23 23:48:56.547214: step: 76/77, loss: 0.0016769357025623322 2023-01-23 23:48:58.038680: step: 80/77, loss: 0.010549943894147873 2023-01-23 23:48:59.460125: step: 84/77, loss: 0.07163932919502258 2023-01-23 23:49:00.884223: step: 88/77, loss: 0.025985006242990494 2023-01-23 23:49:02.396862: step: 92/77, loss: 0.021814458072185516 2023-01-23 23:49:03.877429: step: 96/77, loss: 0.014404713176190853 2023-01-23 23:49:05.354624: step: 100/77, loss: 0.0023366985842585564 2023-01-23 23:49:06.839098: step: 104/77, loss: 0.0054720123298466206 2023-01-23 23:49:08.288631: step: 108/77, loss: 0.004138094373047352 2023-01-23 23:49:09.802205: step: 112/77, loss: 0.0848890095949173 2023-01-23 23:49:11.319723: step: 116/77, loss: 0.015406105667352676 2023-01-23 23:49:12.753107: step: 120/77, loss: 0.007413563784211874 2023-01-23 23:49:14.195092: step: 124/77, loss: 0.0059564122930169106 2023-01-23 23:49:15.623397: step: 128/77, loss: 0.06725480407476425 2023-01-23 23:49:17.051949: step: 132/77, loss: 0.016986869275569916 2023-01-23 23:49:18.560609: step: 136/77, loss: 0.004946116358041763 2023-01-23 23:49:19.961458: step: 140/77, loss: 0.013666262850165367 2023-01-23 23:49:21.385781: step: 144/77, loss: 0.012062680907547474 2023-01-23 23:49:22.833613: step: 148/77, loss: 0.06930902600288391 2023-01-23 23:49:24.309018: step: 152/77, loss: 0.03199886903166771 2023-01-23 23:49:25.849611: step: 156/77, loss: 0.08720900118350983 2023-01-23 23:49:27.263463: step: 160/77, loss: 0.03612701594829559 2023-01-23 23:49:28.811625: step: 164/77, loss: 0.02299576997756958 2023-01-23 23:49:30.282747: step: 168/77, loss: 0.03409970924258232 2023-01-23 23:49:31.703422: step: 172/77, loss: 0.0749649628996849 2023-01-23 23:49:33.224264: step: 176/77, loss: 0.003535109106451273 2023-01-23 23:49:34.710955: step: 180/77, loss: 0.00820176862180233 2023-01-23 23:49:36.152407: step: 184/77, loss: 0.005655103363096714 2023-01-23 23:49:37.649590: step: 188/77, loss: 0.005554571747779846 2023-01-23 23:49:39.140695: step: 192/77, loss: 0.0024093766696751118 2023-01-23 23:49:40.616848: step: 196/77, loss: 0.04333335533738136 2023-01-23 23:49:42.056399: step: 200/77, loss: 0.03227036073803902 2023-01-23 23:49:43.480693: step: 204/77, loss: 0.03613203763961792 2023-01-23 23:49:44.989397: step: 208/77, loss: 0.02152984030544758 2023-01-23 23:49:46.477055: step: 212/77, loss: 0.03411626070737839 2023-01-23 23:49:47.943084: step: 216/77, loss: 0.20100802183151245 2023-01-23 23:49:49.420625: step: 220/77, loss: 0.03604685515165329 2023-01-23 23:49:50.866607: step: 224/77, loss: 0.011609859764575958 2023-01-23 23:49:52.290691: step: 228/77, loss: 0.007372325751930475 2023-01-23 23:49:53.706825: step: 232/77, loss: 0.026432327926158905 2023-01-23 23:49:55.170132: step: 236/77, loss: 0.014586721546947956 2023-01-23 23:49:56.575398: step: 240/77, loss: 0.003138307249173522 2023-01-23 23:49:58.116875: step: 244/77, loss: 0.013584158383309841 2023-01-23 23:49:59.563823: step: 248/77, loss: 0.020674923434853554 2023-01-23 23:50:01.038024: step: 252/77, loss: 0.10567165166139603 2023-01-23 23:50:02.541342: step: 256/77, loss: 0.005264618434011936 2023-01-23 23:50:04.044041: step: 260/77, 
loss: 0.011365783400833607 2023-01-23 23:50:05.555983: step: 264/77, loss: 0.01712346449494362 2023-01-23 23:50:07.023764: step: 268/77, loss: 0.007783028297126293 2023-01-23 23:50:08.461859: step: 272/77, loss: 0.05575403571128845 2023-01-23 23:50:09.939981: step: 276/77, loss: 0.0054480587132275105 2023-01-23 23:50:11.462952: step: 280/77, loss: 0.05855439975857735 2023-01-23 23:50:12.957819: step: 284/77, loss: 0.09475122392177582 2023-01-23 23:50:14.443349: step: 288/77, loss: 0.01217370480298996 2023-01-23 23:50:15.968345: step: 292/77, loss: 0.02303715981543064 2023-01-23 23:50:17.387407: step: 296/77, loss: 0.07080667465925217 2023-01-23 23:50:18.976966: step: 300/77, loss: 0.012814212590456009 2023-01-23 23:50:20.397175: step: 304/77, loss: 0.00098832743242383 2023-01-23 23:50:21.872712: step: 308/77, loss: 0.05255415663123131 2023-01-23 23:50:23.397791: step: 312/77, loss: 0.043752521276474 2023-01-23 23:50:24.834464: step: 316/77, loss: 0.04130847752094269 2023-01-23 23:50:26.281210: step: 320/77, loss: 0.024371299892663956 2023-01-23 23:50:27.724764: step: 324/77, loss: 0.024236444383859634 2023-01-23 23:50:29.174020: step: 328/77, loss: 0.08609409630298615 2023-01-23 23:50:30.641384: step: 332/77, loss: 0.03747723996639252 2023-01-23 23:50:32.097434: step: 336/77, loss: 0.06281092762947083 2023-01-23 23:50:33.547009: step: 340/77, loss: 0.02270662412047386 2023-01-23 23:50:34.991006: step: 344/77, loss: 0.010117853060364723 2023-01-23 23:50:36.540557: step: 348/77, loss: 0.0036736996844410896 2023-01-23 23:50:38.079483: step: 352/77, loss: 0.037932757288217545 2023-01-23 23:50:39.533825: step: 356/77, loss: 0.007890701293945312 2023-01-23 23:50:40.986548: step: 360/77, loss: 0.04061102494597435 2023-01-23 23:50:42.492076: step: 364/77, loss: 0.002191794803366065 2023-01-23 23:50:44.010640: step: 368/77, loss: 0.007790668867528439 2023-01-23 23:50:45.526255: step: 372/77, loss: 0.010762091726064682 2023-01-23 23:50:46.994089: step: 376/77, loss: 0.024026095867156982 2023-01-23 23:50:48.520213: step: 380/77, loss: 0.040220487862825394 2023-01-23 23:50:49.954314: step: 384/77, loss: 0.022526392713189125 2023-01-23 23:50:51.383460: step: 388/77, loss: 0.004733316134661436 ================================================== Loss: 0.031 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 5} Test Chinese: {'template': {'p': 0.972972972972973, 'r': 0.5669291338582677, 'f1': 0.7164179104477612}, 'slot': {'p': 0.6285714285714286, 'r': 0.0210727969348659, 'f1': 0.04077849860982392}, 'combined': 0.029214446765246982, 'epoch': 5} Dev Korean: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 5} Test Korean: {'template': {'p': 0.972972972972973, 'r': 0.5669291338582677, 'f1': 0.7164179104477612}, 'slot': {'p': 0.6, 'r': 0.020114942528735632, 'f1': 0.03892493049119555}, 'combined': 0.027886517366826662, 'epoch': 5} Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 5} Test Russian: {'template': {'p': 0.972972972972973, 'r': 0.5669291338582677, 'f1': 0.7164179104477612}, 'slot': {'p': 0.6, 'r': 0.020114942528735632, 'f1': 0.03892493049119555}, 'combined': 0.027886517366826662, 
'epoch': 5} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 5} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 5} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 5} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 6 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:52:27.136268: step: 4/77, loss: 0.0024796538054943085 2023-01-23 23:52:28.611698: step: 8/77, loss: 0.041127145290374756 2023-01-23 23:52:30.023661: step: 12/77, loss: 0.02096676081418991 2023-01-23 23:52:31.393967: step: 16/77, loss: 0.008376974612474442 2023-01-23 23:52:32.838461: step: 20/77, loss: 0.024190135300159454 2023-01-23 23:52:34.297431: step: 24/77, loss: 0.004426180850714445 2023-01-23 23:52:35.727977: step: 28/77, loss: 0.013920667581260204 2023-01-23 23:52:37.218690: step: 32/77, loss: 0.02847531996667385 2023-01-23 23:52:38.689319: step: 36/77, loss: 0.006405944470316172 2023-01-23 23:52:40.125856: step: 40/77, loss: 0.010968439280986786 2023-01-23 23:52:41.577707: step: 44/77, loss: 
0.0225000511854887 2023-01-23 23:52:43.012224: step: 48/77, loss: 0.009205841459333897 2023-01-23 23:52:44.523168: step: 52/77, loss: 0.05182967707514763 2023-01-23 23:52:46.028619: step: 56/77, loss: 0.000634967815130949 2023-01-23 23:52:47.418663: step: 60/77, loss: 0.0024788244627416134 2023-01-23 23:52:48.945230: step: 64/77, loss: 0.06443198770284653 2023-01-23 23:52:50.366422: step: 68/77, loss: 0.02163878083229065 2023-01-23 23:52:51.907836: step: 72/77, loss: 0.1100260317325592 2023-01-23 23:52:53.431877: step: 76/77, loss: 0.004883392248302698 2023-01-23 23:52:54.941725: step: 80/77, loss: 0.02596667967736721 2023-01-23 23:52:56.419451: step: 84/77, loss: 0.02585577964782715 2023-01-23 23:52:57.872981: step: 88/77, loss: 0.05215982347726822 2023-01-23 23:52:59.325877: step: 92/77, loss: 0.09464490413665771 2023-01-23 23:53:00.795951: step: 96/77, loss: 0.024899493902921677 2023-01-23 23:53:02.306455: step: 100/77, loss: 0.06541687250137329 2023-01-23 23:53:03.760787: step: 104/77, loss: 0.03352929651737213 2023-01-23 23:53:05.221370: step: 108/77, loss: 0.01297876238822937 2023-01-23 23:53:06.723368: step: 112/77, loss: 0.008127512410283089 2023-01-23 23:53:08.156073: step: 116/77, loss: 0.004411404021084309 2023-01-23 23:53:09.605830: step: 120/77, loss: 0.01010224036872387 2023-01-23 23:53:11.098155: step: 124/77, loss: 0.004716739524155855 2023-01-23 23:53:12.530032: step: 128/77, loss: 0.02865005098283291 2023-01-23 23:53:13.994633: step: 132/77, loss: 0.0030319523066282272 2023-01-23 23:53:15.451375: step: 136/77, loss: 0.07615530490875244 2023-01-23 23:53:16.965858: step: 140/77, loss: 0.03446655720472336 2023-01-23 23:53:18.400978: step: 144/77, loss: 0.03845930099487305 2023-01-23 23:53:19.880339: step: 148/77, loss: 0.020856691524386406 2023-01-23 23:53:21.298545: step: 152/77, loss: 0.004411679692566395 2023-01-23 23:53:22.764628: step: 156/77, loss: 0.021663689985871315 2023-01-23 23:53:24.280958: step: 160/77, loss: 0.1191597655415535 2023-01-23 23:53:25.743843: step: 164/77, loss: 0.008801175281405449 2023-01-23 23:53:27.229028: step: 168/77, loss: 0.009983634576201439 2023-01-23 23:53:28.702034: step: 172/77, loss: 0.036132607609033585 2023-01-23 23:53:30.146354: step: 176/77, loss: 0.02146175503730774 2023-01-23 23:53:31.594807: step: 180/77, loss: 0.0011503604473546147 2023-01-23 23:53:33.055381: step: 184/77, loss: 0.015455983579158783 2023-01-23 23:53:34.505638: step: 188/77, loss: 0.007145550101995468 2023-01-23 23:53:35.956132: step: 192/77, loss: 0.015583626925945282 2023-01-23 23:53:37.388487: step: 196/77, loss: 0.21648414433002472 2023-01-23 23:53:38.907703: step: 200/77, loss: 0.05748395621776581 2023-01-23 23:53:40.487848: step: 204/77, loss: 0.023034442216157913 2023-01-23 23:53:41.953440: step: 208/77, loss: 0.024447085335850716 2023-01-23 23:53:43.357584: step: 212/77, loss: 0.005049333907663822 2023-01-23 23:53:44.827855: step: 216/77, loss: 0.06323403120040894 2023-01-23 23:53:46.347228: step: 220/77, loss: 0.02320742793381214 2023-01-23 23:53:47.808456: step: 224/77, loss: 0.008724575862288475 2023-01-23 23:53:49.253615: step: 228/77, loss: 0.005981612019240856 2023-01-23 23:53:50.693760: step: 232/77, loss: 0.055108070373535156 2023-01-23 23:53:52.096525: step: 236/77, loss: 0.00531669519841671 2023-01-23 23:53:53.589918: step: 240/77, loss: 0.028439946472644806 2023-01-23 23:53:55.045175: step: 244/77, loss: 0.015837809070944786 2023-01-23 23:53:56.471593: step: 248/77, loss: 0.04547914117574692 2023-01-23 23:53:57.911181: step: 252/77, loss: 
0.05782359465956688 2023-01-23 23:53:59.464871: step: 256/77, loss: 0.02527480386197567 2023-01-23 23:54:01.012884: step: 260/77, loss: 0.0378800593316555 2023-01-23 23:54:02.463478: step: 264/77, loss: 0.02843671292066574 2023-01-23 23:54:03.930107: step: 268/77, loss: 0.010607427917420864 2023-01-23 23:54:05.364750: step: 272/77, loss: 0.005313307046890259 2023-01-23 23:54:06.863324: step: 276/77, loss: 0.0413510799407959 2023-01-23 23:54:08.320519: step: 280/77, loss: 0.016609536483883858 2023-01-23 23:54:09.821924: step: 284/77, loss: 0.001440250314772129 2023-01-23 23:54:11.306378: step: 288/77, loss: 0.023595498874783516 2023-01-23 23:54:12.755964: step: 292/77, loss: 0.004608721937984228 2023-01-23 23:54:14.248906: step: 296/77, loss: 0.00027793479966931045 2023-01-23 23:54:15.677312: step: 300/77, loss: 0.019556252285838127 2023-01-23 23:54:17.205575: step: 304/77, loss: 0.03160078078508377 2023-01-23 23:54:18.641474: step: 308/77, loss: 0.014493018388748169 2023-01-23 23:54:20.103444: step: 312/77, loss: 0.0004999942029826343 2023-01-23 23:54:21.594332: step: 316/77, loss: 0.019626328721642494 2023-01-23 23:54:23.111349: step: 320/77, loss: 0.0005291143897920847 2023-01-23 23:54:24.598428: step: 324/77, loss: 0.010973026975989342 2023-01-23 23:54:26.064596: step: 328/77, loss: 0.0021628625690937042 2023-01-23 23:54:27.569550: step: 332/77, loss: 0.011647387407720089 2023-01-23 23:54:29.057842: step: 336/77, loss: 0.0008302477071993053 2023-01-23 23:54:30.555745: step: 340/77, loss: 0.005226088687777519 2023-01-23 23:54:32.057767: step: 344/77, loss: 0.0061736274510622025 2023-01-23 23:54:33.582705: step: 348/77, loss: 0.014224600046873093 2023-01-23 23:54:35.003229: step: 352/77, loss: 0.009641819633543491 2023-01-23 23:54:36.495046: step: 356/77, loss: 0.01288052648305893 2023-01-23 23:54:37.976542: step: 360/77, loss: 0.004281376022845507 2023-01-23 23:54:39.347306: step: 364/77, loss: 0.0023920051753520966 2023-01-23 23:54:40.804970: step: 368/77, loss: 0.003546684980392456 2023-01-23 23:54:42.273051: step: 372/77, loss: 0.00038135904469527304 2023-01-23 23:54:43.752946: step: 376/77, loss: 0.0384867787361145 2023-01-23 23:54:45.222862: step: 380/77, loss: 0.006556495558470488 2023-01-23 23:54:46.726039: step: 384/77, loss: 0.0005675572901964188 2023-01-23 23:54:48.188773: step: 388/77, loss: 0.0004982967511750758 ================================================== Loss: 0.024 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Chinese: {'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.575, 'r': 0.022030651340996167, 'f1': 0.042435424354243544}, 'combined': 0.02995441719123074, 'epoch': 6} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Korean: {'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.0299268094334508, 'epoch': 6} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Russian: 
{'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.575, 'r': 0.022030651340996167, 'f1': 0.042435424354243544}, 'combined': 0.02995441719123074, 'epoch': 6} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 6} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 6} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 6} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 7 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:56:25.347488: step: 4/77, loss: 0.08273284882307053 2023-01-23 23:56:26.895680: step: 8/77, loss: 0.04297233000397682 2023-01-23 23:56:28.366828: step: 12/77, loss: 0.0591762512922287 2023-01-23 23:56:29.823405: step: 16/77, loss: 0.018232570961117744 2023-01-23 23:56:31.351411: step: 20/77, loss: 0.10887392610311508 2023-01-23 23:56:32.848700: step: 24/77, loss: 0.015558844432234764 2023-01-23 23:56:34.296221: step: 28/77, loss: 0.0031128046102821827 2023-01-23 23:56:35.807691: step: 32/77, loss: 0.005159935913980007 
2023-01-23 23:56:37.353689: step: 36/77, loss: 0.004948144778609276 2023-01-23 23:56:38.811641: step: 40/77, loss: 0.00809083878993988 2023-01-23 23:56:40.295706: step: 44/77, loss: 0.09721863269805908 2023-01-23 23:56:41.765193: step: 48/77, loss: 0.009527448564767838 2023-01-23 23:56:43.181399: step: 52/77, loss: 0.04058494791388512 2023-01-23 23:56:44.631077: step: 56/77, loss: 0.00038469943683594465 2023-01-23 23:56:46.160010: step: 60/77, loss: 0.007827687077224255 2023-01-23 23:56:47.688568: step: 64/77, loss: 0.0031181438826024532 2023-01-23 23:56:49.066763: step: 68/77, loss: 0.03353375568985939 2023-01-23 23:56:50.545473: step: 72/77, loss: 0.008835235610604286 2023-01-23 23:56:51.995720: step: 76/77, loss: 0.0019557795021682978 2023-01-23 23:56:53.442746: step: 80/77, loss: 0.07034602016210556 2023-01-23 23:56:54.870574: step: 84/77, loss: 0.01971951127052307 2023-01-23 23:56:56.347704: step: 88/77, loss: 0.01234626304358244 2023-01-23 23:56:57.869449: step: 92/77, loss: 0.015013743191957474 2023-01-23 23:56:59.279581: step: 96/77, loss: 0.007682151161134243 2023-01-23 23:57:00.754051: step: 100/77, loss: 0.011956706643104553 2023-01-23 23:57:02.199313: step: 104/77, loss: 0.021283064037561417 2023-01-23 23:57:03.652289: step: 108/77, loss: 0.0031635239720344543 2023-01-23 23:57:05.118222: step: 112/77, loss: 0.01382882334291935 2023-01-23 23:57:06.570670: step: 116/77, loss: 0.021016880869865417 2023-01-23 23:57:08.047666: step: 120/77, loss: 0.015348262153565884 2023-01-23 23:57:09.538379: step: 124/77, loss: 0.03651634231209755 2023-01-23 23:57:11.053255: step: 128/77, loss: 0.023910239338874817 2023-01-23 23:57:12.530588: step: 132/77, loss: 0.04460494965314865 2023-01-23 23:57:14.029387: step: 136/77, loss: 0.04335037246346474 2023-01-23 23:57:15.494452: step: 140/77, loss: 0.009747933596372604 2023-01-23 23:57:16.999628: step: 144/77, loss: 0.007545825093984604 2023-01-23 23:57:18.398476: step: 148/77, loss: 0.01542804017663002 2023-01-23 23:57:19.902452: step: 152/77, loss: 0.10653623193502426 2023-01-23 23:57:21.359105: step: 156/77, loss: 0.015858223661780357 2023-01-23 23:57:22.870596: step: 160/77, loss: 0.03192513436079025 2023-01-23 23:57:24.329816: step: 164/77, loss: 0.017632409930229187 2023-01-23 23:57:25.796003: step: 168/77, loss: 0.008900660090148449 2023-01-23 23:57:27.210508: step: 172/77, loss: 0.006193622946739197 2023-01-23 23:57:28.688076: step: 176/77, loss: 0.026168471202254295 2023-01-23 23:57:30.164538: step: 180/77, loss: 0.010774902068078518 2023-01-23 23:57:31.625642: step: 184/77, loss: 0.1376255452632904 2023-01-23 23:57:33.102606: step: 188/77, loss: 0.03331328183412552 2023-01-23 23:57:34.572667: step: 192/77, loss: 0.042599745094776154 2023-01-23 23:57:36.038075: step: 196/77, loss: 0.006560549605637789 2023-01-23 23:57:37.568791: step: 200/77, loss: 0.009247076697647572 2023-01-23 23:57:38.977211: step: 204/77, loss: 0.0025599943473935127 2023-01-23 23:57:40.375329: step: 208/77, loss: 0.044314295053482056 2023-01-23 23:57:41.859591: step: 212/77, loss: 0.00999920442700386 2023-01-23 23:57:43.294157: step: 216/77, loss: 0.007233984302729368 2023-01-23 23:57:44.779685: step: 220/77, loss: 0.02284615859389305 2023-01-23 23:57:46.249397: step: 224/77, loss: 0.038655318319797516 2023-01-23 23:57:47.717219: step: 228/77, loss: 0.020562071353197098 2023-01-23 23:57:49.153741: step: 232/77, loss: 0.001989179290831089 2023-01-23 23:57:50.577094: step: 236/77, loss: 0.002037348924204707 2023-01-23 23:57:52.122755: step: 240/77, loss: 
0.03317081183195114 2023-01-23 23:57:53.612690: step: 244/77, loss: 0.004017078783363104 2023-01-23 23:57:55.094973: step: 248/77, loss: 0.011808084324002266 2023-01-23 23:57:56.596464: step: 252/77, loss: 0.0727214366197586 2023-01-23 23:57:58.014559: step: 256/77, loss: 0.05245751142501831 2023-01-23 23:57:59.513763: step: 260/77, loss: 0.10336805135011673 2023-01-23 23:58:00.953903: step: 264/77, loss: 0.019474662840366364 2023-01-23 23:58:02.455688: step: 268/77, loss: 0.012679839506745338 2023-01-23 23:58:03.922160: step: 272/77, loss: 0.031886640936136246 2023-01-23 23:58:05.416014: step: 276/77, loss: 0.05295976251363754 2023-01-23 23:58:06.916267: step: 280/77, loss: 0.03312786668539047 2023-01-23 23:58:08.353921: step: 284/77, loss: 0.0002425676502753049 2023-01-23 23:58:09.865882: step: 288/77, loss: 0.047083549201488495 2023-01-23 23:58:11.318816: step: 292/77, loss: 0.02535715512931347 2023-01-23 23:58:12.779222: step: 296/77, loss: 0.011999406851828098 2023-01-23 23:58:14.340625: step: 300/77, loss: 0.00013501528883352876 2023-01-23 23:58:15.812098: step: 304/77, loss: 0.003165546339005232 2023-01-23 23:58:17.227598: step: 308/77, loss: 0.032844021916389465 2023-01-23 23:58:18.638444: step: 312/77, loss: 0.010812544263899326 2023-01-23 23:58:20.077203: step: 316/77, loss: 0.006718991324305534 2023-01-23 23:58:21.520468: step: 320/77, loss: 0.005521384999155998 2023-01-23 23:58:22.991952: step: 324/77, loss: 0.016020435839891434 2023-01-23 23:58:24.423217: step: 328/77, loss: 0.0027800784446299076 2023-01-23 23:58:25.868574: step: 332/77, loss: 0.025032440200448036 2023-01-23 23:58:27.376210: step: 336/77, loss: 0.025155704468488693 2023-01-23 23:58:28.908345: step: 340/77, loss: 0.004684568382799625 2023-01-23 23:58:30.396584: step: 344/77, loss: 0.024152573198080063 2023-01-23 23:58:31.894167: step: 348/77, loss: 0.004911579191684723 2023-01-23 23:58:33.312328: step: 352/77, loss: 0.0006968736997805536 2023-01-23 23:58:34.787658: step: 356/77, loss: 0.018942473456263542 2023-01-23 23:58:36.197723: step: 360/77, loss: 0.020946066826581955 2023-01-23 23:58:37.660354: step: 364/77, loss: 0.013985851779580116 2023-01-23 23:58:39.193911: step: 368/77, loss: 0.0010878103785216808 2023-01-23 23:58:40.710270: step: 372/77, loss: 0.0079379016533494 2023-01-23 23:58:42.162155: step: 376/77, loss: 0.003763859858736396 2023-01-23 23:58:43.645692: step: 380/77, loss: 0.011913584545254707 2023-01-23 23:58:45.174673: step: 384/77, loss: 0.0014786208048462868 2023-01-23 23:58:46.674712: step: 388/77, loss: 0.14149004220962524 ================================================== Loss: 0.025 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 7} Test Chinese: {'template': {'p': 0.9558823529411765, 'r': 0.5118110236220472, 'f1': 0.6666666666666666}, 'slot': {'p': 0.71875, 'r': 0.022030651340996167, 'f1': 0.04275092936802974}, 'combined': 0.028500619578686492, 'epoch': 7} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 7} Test Korean: {'template': {'p': 0.9558823529411765, 'r': 0.5118110236220472, 'f1': 0.6666666666666666}, 'slot': {'p': 0.71875, 'r': 0.022030651340996167, 'f1': 0.04275092936802974}, 'combined': 0.028500619578686492, 'epoch': 7} Dev Russian: {'template': 
{'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 7} Test Russian: {'template': {'p': 0.9558823529411765, 'r': 0.5118110236220472, 'f1': 0.6666666666666666}, 'slot': {'p': 0.7096774193548387, 'r': 0.0210727969348659, 'f1': 0.04093023255813954}, 'combined': 0.02728682170542636, 'epoch': 7} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 7} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 7} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 7} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 8 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:00:21.785043: step: 4/77, loss: 0.0117162074893713 2023-01-24 00:00:23.235050: step: 8/77, loss: 0.016873063519597054 2023-01-24 00:00:24.681125: step: 12/77, loss: 0.02453618496656418 2023-01-24 00:00:26.125308: step: 16/77, loss: 0.012327548116445541 2023-01-24 00:00:27.570360: step: 20/77, loss: 0.019070779904723167 
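The Current best result block just above is unchanged from epoch 2: epoch 7's dev combined scores only tied it for Chinese and trailed it for Korean and Russian, so nothing was displaced. A self-contained sketch of per-language best tracking consistent with that behavior; the strict '>' comparison is an inference from the log, not confirmed against train.py:

    # Stored bests as reprinted above, one per evaluation language.
    best = {lang: {'combined': 0.05179909351586346, 'epoch': 2}
            for lang in ('Chinese', 'Korean', 'Russian')}

    # Epoch-7 dev 'combined' scores from the records above.
    epoch7_dev = {'Chinese': 0.05179909351586346,   # exact tie with the stored best
                  'Korean': 0.05085442919642522,
                  'Russian': 0.05085442919642522}

    for lang, score in epoch7_dev.items():
        if score > best[lang]['combined']:   # ties and regressions keep epoch 2
            best[lang] = {'combined': score, 'epoch': 7}

    print(best['Chinese'])   # -> {'combined': 0.05179909351586346, 'epoch': 2}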
2023-01-24 00:00:28.991405: step: 24/77, loss: 0.01693248189985752 2023-01-24 00:00:30.422765: step: 28/77, loss: 0.009283289313316345 2023-01-24 00:00:31.894345: step: 32/77, loss: 0.012527648359537125 2023-01-24 00:00:33.362405: step: 36/77, loss: 0.019595034420490265 2023-01-24 00:00:34.854071: step: 40/77, loss: 0.01977097988128662 2023-01-24 00:00:36.274412: step: 44/77, loss: 0.00018120244203601032 2023-01-24 00:00:37.723168: step: 48/77, loss: 0.07907330244779587 2023-01-24 00:00:39.201429: step: 52/77, loss: 0.10135805606842041 2023-01-24 00:00:40.688938: step: 56/77, loss: 0.07528279721736908 2023-01-24 00:00:42.144129: step: 60/77, loss: 0.0033129402436316013 2023-01-24 00:00:43.589918: step: 64/77, loss: 0.002514893189072609 2023-01-24 00:00:45.134461: step: 68/77, loss: 0.011508330702781677 2023-01-24 00:00:46.614879: step: 72/77, loss: 0.0036744915414601564 2023-01-24 00:00:48.113448: step: 76/77, loss: 0.0002770134015008807 2023-01-24 00:00:49.652691: step: 80/77, loss: 0.0005424823611974716 2023-01-24 00:00:51.036088: step: 84/77, loss: 0.004335353150963783 2023-01-24 00:00:52.542974: step: 88/77, loss: 0.02489081397652626 2023-01-24 00:00:54.010062: step: 92/77, loss: 0.043442342430353165 2023-01-24 00:00:55.456148: step: 96/77, loss: 0.02136102318763733 2023-01-24 00:00:56.930189: step: 100/77, loss: 0.00951157882809639 2023-01-24 00:00:58.437341: step: 104/77, loss: 0.007268311455845833 2023-01-24 00:00:59.897189: step: 108/77, loss: 0.009832844138145447 2023-01-24 00:01:01.354831: step: 112/77, loss: 0.011927825398743153 2023-01-24 00:01:02.867098: step: 116/77, loss: 0.01002841629087925 2023-01-24 00:01:04.349920: step: 120/77, loss: 0.0016436288133263588 2023-01-24 00:01:05.858979: step: 124/77, loss: 0.005534032825380564 2023-01-24 00:01:07.310081: step: 128/77, loss: 0.013640894554555416 2023-01-24 00:01:08.718167: step: 132/77, loss: 0.00495021790266037 2023-01-24 00:01:10.246200: step: 136/77, loss: 0.011084084399044514 2023-01-24 00:01:11.689176: step: 140/77, loss: 0.03978944569826126 2023-01-24 00:01:13.159065: step: 144/77, loss: 0.014671813696622849 2023-01-24 00:01:14.690868: step: 148/77, loss: 0.041456300765275955 2023-01-24 00:01:16.168824: step: 152/77, loss: 0.05762626230716705 2023-01-24 00:01:17.693083: step: 156/77, loss: 0.03742530941963196 2023-01-24 00:01:19.201154: step: 160/77, loss: 0.026271728798747063 2023-01-24 00:01:20.706365: step: 164/77, loss: 0.06438086181879044 2023-01-24 00:01:22.186375: step: 168/77, loss: 0.002691475907340646 2023-01-24 00:01:23.681218: step: 172/77, loss: 0.015857674181461334 2023-01-24 00:01:25.167253: step: 176/77, loss: 0.121455118060112 2023-01-24 00:01:26.660238: step: 180/77, loss: 0.000552354205865413 2023-01-24 00:01:28.039851: step: 184/77, loss: 0.0006054186960682273 2023-01-24 00:01:29.526800: step: 188/77, loss: 0.01104812417179346 2023-01-24 00:01:30.963667: step: 192/77, loss: 0.027507148683071136 2023-01-24 00:01:32.438514: step: 196/77, loss: 0.0430024117231369 2023-01-24 00:01:34.015715: step: 200/77, loss: 0.008453615009784698 2023-01-24 00:01:35.427033: step: 204/77, loss: 0.005616758018732071 2023-01-24 00:01:36.921924: step: 208/77, loss: 0.01384175568819046 2023-01-24 00:01:38.397219: step: 212/77, loss: 0.05978897213935852 2023-01-24 00:01:39.860565: step: 216/77, loss: 0.005040631163865328 2023-01-24 00:01:41.343143: step: 220/77, loss: 0.02630782499909401 2023-01-24 00:01:42.761439: step: 224/77, loss: 0.019113430753350258 2023-01-24 00:01:44.280601: step: 228/77, loss: 0.005212479270994663 
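The command line reprinted at each epoch pairs --batch_size 10 with --accumulate_step 4 (an effective batch of 40) and passes two learning rates: --xlmr_learning_rate 2e-5 for the XLM-R encoder and --learning_rate 2e-4 for the task-specific layers. The step counter advancing in fours (4, 8, ..., 388) is consistent with one logged loss per accumulation cycle. A runnable sketch of that setup with a toy stand-in model; the actual wiring in train.py may differ:

    import torch
    import torch.nn as nn

    class DummyModel(nn.Module):
        # Toy stand-in: 'xlmr' mirrors the encoder prefix in the parameter dump,
        # 'head' stands in for the event/template layers.
        def __init__(self):
            super().__init__()
            self.xlmr = nn.Linear(8, 8)
            self.head = nn.Linear(8, 1)

        def forward(self, x):
            return self.head(self.xlmr(x)).mean()

    model = DummyModel()
    xlmr_params = [p for n, p in model.named_parameters() if n.startswith("xlmr.")]
    head_params = [p for n, p in model.named_parameters() if not n.startswith("xlmr.")]
    optimizer = torch.optim.AdamW([
        {"params": xlmr_params, "lr": 2e-5},  # --xlmr_learning_rate 2e-5
        {"params": head_params, "lr": 2e-4},  # --learning_rate 2e-4
    ])

    accumulate_step = 4                        # --accumulate_step 4
    for step in range(1, 389):                 # the log counts 4, 8, ..., 388 per epoch
        loss = model(torch.randn(10, 8)) / accumulate_step  # --batch_size 10
        loss.backward()
        if step % accumulate_step == 0:        # one optimizer update per logged step
            optimizer.step()
            optimizer.zero_grad()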
2023-01-24 00:01:45.756002: step: 232/77, loss: 0.03740738332271576 2023-01-24 00:01:47.149884: step: 236/77, loss: 0.002008106792345643 2023-01-24 00:01:48.648829: step: 240/77, loss: 0.017777230590581894 2023-01-24 00:01:50.129626: step: 244/77, loss: 0.031976018100976944 2023-01-24 00:01:51.534516: step: 248/77, loss: 0.016905630007386208 2023-01-24 00:01:53.037822: step: 252/77, loss: 0.0038639819249510765 2023-01-24 00:01:54.549861: step: 256/77, loss: 0.06911718100309372 2023-01-24 00:01:56.050826: step: 260/77, loss: 0.010147958062589169 2023-01-24 00:01:57.488989: step: 264/77, loss: 0.07028476893901825 2023-01-24 00:01:58.950383: step: 268/77, loss: 0.008182062767446041 2023-01-24 00:02:00.410760: step: 272/77, loss: 0.0007549443980678916 2023-01-24 00:02:01.835148: step: 276/77, loss: 0.0041632880456745625 2023-01-24 00:02:03.344770: step: 280/77, loss: 0.12215571850538254 2023-01-24 00:02:04.751009: step: 284/77, loss: 0.010168886743485928 2023-01-24 00:02:06.207396: step: 288/77, loss: 0.01675845868885517 2023-01-24 00:02:07.677296: step: 292/77, loss: 0.011644527316093445 2023-01-24 00:02:09.167050: step: 296/77, loss: 0.02150135114789009 2023-01-24 00:02:10.624379: step: 300/77, loss: 0.0020920918323099613 2023-01-24 00:02:12.084437: step: 304/77, loss: 0.021450433880090714 2023-01-24 00:02:13.542612: step: 308/77, loss: 0.036505237221717834 2023-01-24 00:02:15.026592: step: 312/77, loss: 0.04061291366815567 2023-01-24 00:02:16.437848: step: 316/77, loss: 0.04255300015211105 2023-01-24 00:02:17.965387: step: 320/77, loss: 0.05172652751207352 2023-01-24 00:02:19.436661: step: 324/77, loss: 0.018251018598675728 2023-01-24 00:02:20.882684: step: 328/77, loss: 0.006169596686959267 2023-01-24 00:02:22.304639: step: 332/77, loss: 0.011672554537653923 2023-01-24 00:02:23.805265: step: 336/77, loss: 0.010357705876231194 2023-01-24 00:02:25.290933: step: 340/77, loss: 0.009683008305728436 2023-01-24 00:02:26.794967: step: 344/77, loss: 0.10765119642019272 2023-01-24 00:02:28.205160: step: 348/77, loss: 0.02400229312479496 2023-01-24 00:02:29.718381: step: 352/77, loss: 0.003297572722658515 2023-01-24 00:02:31.181959: step: 356/77, loss: 0.014272533357143402 2023-01-24 00:02:32.655964: step: 360/77, loss: 0.041691526770591736 2023-01-24 00:02:34.198302: step: 364/77, loss: 0.001685146358795464 2023-01-24 00:02:35.694086: step: 368/77, loss: 0.02277713268995285 2023-01-24 00:02:37.097233: step: 372/77, loss: 0.039153438061475754 2023-01-24 00:02:38.546806: step: 376/77, loss: 0.09712395817041397 2023-01-24 00:02:39.984773: step: 380/77, loss: 0.0038834463339298964 2023-01-24 00:02:41.524487: step: 384/77, loss: 0.035216622054576874 2023-01-24 00:02:42.991227: step: 388/77, loss: 0.0019067820394411683 ================================================== Loss: 0.024 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Chinese: {'template': {'p': 0.9305555555555556, 'r': 0.5275590551181102, 'f1': 0.6733668341708542}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.027433463614368134, 'epoch': 8} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Korean: {'template': {'p': 0.9305555555555556, 'r': 
0.5275590551181102, 'f1': 0.6733668341708542}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.027433463614368134, 'epoch': 8} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Russian: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.027971254836926484, 'epoch': 8} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 8} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 8} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 9 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:04:18.289070: step: 4/77, loss: 0.004329627845436335 2023-01-24 00:04:19.765301: step: 8/77, loss: 
0.1144450381398201 2023-01-24 00:04:21.254771: step: 12/77, loss: 0.007480972912162542 2023-01-24 00:04:22.736347: step: 16/77, loss: 0.005383838899433613 2023-01-24 00:04:24.242253: step: 20/77, loss: 0.07345490902662277 2023-01-24 00:04:25.728356: step: 24/77, loss: 0.023349113762378693 2023-01-24 00:04:27.237508: step: 28/77, loss: 0.0055717844516038895 2023-01-24 00:04:28.634357: step: 32/77, loss: 0.00044566497672349215 2023-01-24 00:04:30.091999: step: 36/77, loss: 0.005972985178232193 2023-01-24 00:04:31.525471: step: 40/77, loss: 0.017099654302001 2023-01-24 00:04:33.026135: step: 44/77, loss: 0.0007056721369735897 2023-01-24 00:04:34.447187: step: 48/77, loss: 0.0044279154390096664 2023-01-24 00:04:35.925734: step: 52/77, loss: 0.0030765619594603777 2023-01-24 00:04:37.368478: step: 56/77, loss: 0.0002256929292343557 2023-01-24 00:04:38.833124: step: 60/77, loss: 0.018481384962797165 2023-01-24 00:04:40.352070: step: 64/77, loss: 0.0009991687256842852 2023-01-24 00:04:41.740119: step: 68/77, loss: 0.008599960245192051 2023-01-24 00:04:43.213459: step: 72/77, loss: 0.016015585511922836 2023-01-24 00:04:44.655773: step: 76/77, loss: 0.0316038504242897 2023-01-24 00:04:46.196630: step: 80/77, loss: 0.05238572135567665 2023-01-24 00:04:47.691573: step: 84/77, loss: 9.742747351992875e-05 2023-01-24 00:04:49.096772: step: 88/77, loss: 0.007090198807418346 2023-01-24 00:04:50.517946: step: 92/77, loss: 0.04278424382209778 2023-01-24 00:04:51.976884: step: 96/77, loss: 0.004927410744130611 2023-01-24 00:04:53.473716: step: 100/77, loss: 0.008587077260017395 2023-01-24 00:04:54.871758: step: 104/77, loss: 0.0022492746356874704 2023-01-24 00:04:56.408614: step: 108/77, loss: 0.0069634742103517056 2023-01-24 00:04:57.905504: step: 112/77, loss: 0.013344593346118927 2023-01-24 00:04:59.358654: step: 116/77, loss: 0.006937253288924694 2023-01-24 00:05:00.811343: step: 120/77, loss: 0.0035367405507713556 2023-01-24 00:05:02.162930: step: 124/77, loss: 0.020305421203374863 2023-01-24 00:05:03.648963: step: 128/77, loss: 0.011044786311686039 2023-01-24 00:05:05.134895: step: 132/77, loss: 0.020416777580976486 2023-01-24 00:05:06.634366: step: 136/77, loss: 0.009008553810417652 2023-01-24 00:05:08.119247: step: 140/77, loss: 0.013809466734528542 2023-01-24 00:05:09.543183: step: 144/77, loss: 0.01881277747452259 2023-01-24 00:05:10.998315: step: 148/77, loss: 0.009183013811707497 2023-01-24 00:05:12.509457: step: 152/77, loss: 0.030898435041308403 2023-01-24 00:05:13.951867: step: 156/77, loss: 0.021500788629055023 2023-01-24 00:05:15.451299: step: 160/77, loss: 0.005511270835995674 2023-01-24 00:05:16.901366: step: 164/77, loss: 0.016400450840592384 2023-01-24 00:05:18.359896: step: 168/77, loss: 0.02405705861747265 2023-01-24 00:05:19.895057: step: 172/77, loss: 0.001751496223732829 2023-01-24 00:05:21.380158: step: 176/77, loss: 0.004747659899294376 2023-01-24 00:05:22.871025: step: 180/77, loss: 0.00525059225037694 2023-01-24 00:05:24.322828: step: 184/77, loss: 0.019080569967627525 2023-01-24 00:05:25.771452: step: 188/77, loss: 0.007594104390591383 2023-01-24 00:05:27.265538: step: 192/77, loss: 0.0002163934987038374 2023-01-24 00:05:28.709434: step: 196/77, loss: 0.007122086361050606 2023-01-24 00:05:30.172519: step: 200/77, loss: 0.01763790473341942 2023-01-24 00:05:31.690026: step: 204/77, loss: 0.02837509661912918 2023-01-24 00:05:33.155557: step: 208/77, loss: 0.004397459328174591 2023-01-24 00:05:34.608346: step: 212/77, loss: 0.010287810117006302 2023-01-24 00:05:36.070413: step: 
216/77, loss: 0.026625284925103188 2023-01-24 00:05:37.498854: step: 220/77, loss: 0.00010709754860727116 2023-01-24 00:05:38.956805: step: 224/77, loss: 0.028379343450069427 2023-01-24 00:05:40.504542: step: 228/77, loss: 0.001870258478447795 2023-01-24 00:05:41.982603: step: 232/77, loss: 0.03586038574576378 2023-01-24 00:05:43.485568: step: 236/77, loss: 0.0033279252238571644 2023-01-24 00:05:44.955725: step: 240/77, loss: 7.378146256087348e-05 2023-01-24 00:05:46.371354: step: 244/77, loss: 0.0011043344857171178 2023-01-24 00:05:47.824228: step: 248/77, loss: 0.005896029528230429 2023-01-24 00:05:49.325988: step: 252/77, loss: 0.0006844383897259831 2023-01-24 00:05:50.806157: step: 256/77, loss: 0.0564330518245697 2023-01-24 00:05:52.302721: step: 260/77, loss: 0.0017695273272693157 2023-01-24 00:05:53.802946: step: 264/77, loss: 0.023450348526239395 2023-01-24 00:05:55.292991: step: 268/77, loss: 0.0073721930384635925 2023-01-24 00:05:56.736060: step: 272/77, loss: 0.001965705305337906 2023-01-24 00:05:58.140977: step: 276/77, loss: 0.0022502404171973467 2023-01-24 00:05:59.540674: step: 280/77, loss: 0.0019588125869631767 2023-01-24 00:06:01.055792: step: 284/77, loss: 0.006595402956008911 2023-01-24 00:06:02.608262: step: 288/77, loss: 0.04348934814333916 2023-01-24 00:06:04.133482: step: 292/77, loss: 0.0033084216993302107 2023-01-24 00:06:05.593005: step: 296/77, loss: 0.0061813537031412125 2023-01-24 00:06:07.030556: step: 300/77, loss: 0.0027494626119732857 2023-01-24 00:06:08.456956: step: 304/77, loss: 0.04956817626953125 2023-01-24 00:06:09.943854: step: 308/77, loss: 0.06495145708322525 2023-01-24 00:06:11.444197: step: 312/77, loss: 0.046590082347393036 2023-01-24 00:06:12.922548: step: 316/77, loss: 0.01466241106390953 2023-01-24 00:06:14.377692: step: 320/77, loss: 0.003367375349625945 2023-01-24 00:06:15.807587: step: 324/77, loss: 0.000668396707624197 2023-01-24 00:06:17.227953: step: 328/77, loss: 0.04692220315337181 2023-01-24 00:06:18.687238: step: 332/77, loss: 0.047929201275110245 2023-01-24 00:06:20.219154: step: 336/77, loss: 6.350577314151451e-05 2023-01-24 00:06:21.688532: step: 340/77, loss: 0.0008273457060568035 2023-01-24 00:06:23.186656: step: 344/77, loss: 0.001057411776855588 2023-01-24 00:06:24.648160: step: 348/77, loss: 0.0007184810237959027 2023-01-24 00:06:26.187097: step: 352/77, loss: 0.02137991413474083 2023-01-24 00:06:27.676910: step: 356/77, loss: 0.005427772644907236 2023-01-24 00:06:29.160833: step: 360/77, loss: 0.016476700082421303 2023-01-24 00:06:30.661279: step: 364/77, loss: 0.020422574132680893 2023-01-24 00:06:32.174171: step: 368/77, loss: 0.0037095905281603336 2023-01-24 00:06:33.618024: step: 372/77, loss: 6.116175063652918e-05 2023-01-24 00:06:35.115504: step: 376/77, loss: 0.0017591891810297966 2023-01-24 00:06:36.581366: step: 380/77, loss: 0.04851994290947914 2023-01-24 00:06:38.137262: step: 384/77, loss: 0.0023985635489225388 2023-01-24 00:06:39.561789: step: 388/77, loss: 0.022556450217962265 ================================================== Loss: 0.015 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Chinese: {'template': {'p': 0.9552238805970149, 'r': 0.5039370078740157, 'f1': 0.6597938144329897}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.027972825312366383, 'epoch': 9} Dev Korean: 
{'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 9} Test Korean: {'template': {'p': 0.9558823529411765, 'r': 0.5118110236220472, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.028264208909370196, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 9} Test Russian: {'template': {'p': 0.9552238805970149, 'r': 0.5039370078740157, 'f1': 0.6597938144329897}, 'slot': {'p': 0.5609756097560976, 'r': 0.022030651340996167, 'f1': 0.0423963133640553}, 'combined': 0.027972825312366383, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 9} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 10 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 
--accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:08:14.488748: step: 4/77, loss: 4.589699165080674e-05 2023-01-24 00:08:15.959026: step: 8/77, loss: 0.00024144344206433743 2023-01-24 00:08:17.437200: step: 12/77, loss: 0.005603502504527569 2023-01-24 00:08:18.869923: step: 16/77, loss: 5.645120108965784e-05 2023-01-24 00:08:20.247506: step: 20/77, loss: 0.0016491918358951807 2023-01-24 00:08:21.740476: step: 24/77, loss: 0.0020571216009557247 2023-01-24 00:08:23.164486: step: 28/77, loss: 0.00506344810128212 2023-01-24 00:08:24.606879: step: 32/77, loss: 0.014536905102431774 2023-01-24 00:08:26.074511: step: 36/77, loss: 0.020713821053504944 2023-01-24 00:08:27.506671: step: 40/77, loss: 0.00030312419403344393 2023-01-24 00:08:28.933976: step: 44/77, loss: 0.0010036167223006487 2023-01-24 00:08:30.457957: step: 48/77, loss: 0.006692064926028252 2023-01-24 00:08:31.960181: step: 52/77, loss: 0.009225301444530487 2023-01-24 00:08:33.412686: step: 56/77, loss: 0.01339406706392765 2023-01-24 00:08:34.851331: step: 60/77, loss: 0.02862321399152279 2023-01-24 00:08:36.305782: step: 64/77, loss: 4.114151670364663e-05 2023-01-24 00:08:37.709500: step: 68/77, loss: 0.0011684123892337084 2023-01-24 00:08:39.135744: step: 72/77, loss: 0.0053319199942052364 2023-01-24 00:08:40.594275: step: 76/77, loss: 0.0463118739426136 2023-01-24 00:08:42.024485: step: 80/77, loss: 0.0003504931228235364 2023-01-24 00:08:43.469784: step: 84/77, loss: 0.012099414132535458 2023-01-24 00:08:44.851713: step: 88/77, loss: 0.003216678276658058 2023-01-24 00:08:46.259004: step: 92/77, loss: 0.016689065843820572 2023-01-24 00:08:47.743507: step: 96/77, loss: 2.5759411073522642e-05 2023-01-24 00:08:49.248799: step: 100/77, loss: 0.00016509568376932293 2023-01-24 00:08:50.734887: step: 104/77, loss: 0.00431009940803051 2023-01-24 00:08:52.185942: step: 108/77, loss: 0.0017709459643810987 2023-01-24 00:08:53.651243: step: 112/77, loss: 0.007031287532299757 2023-01-24 00:08:55.118130: step: 116/77, loss: 0.006091595161706209 2023-01-24 00:08:56.579629: step: 120/77, loss: 0.030307577922940254 2023-01-24 00:08:58.089022: step: 124/77, loss: 0.1605072170495987 2023-01-24 00:08:59.584207: step: 128/77, loss: 0.005444565322250128 2023-01-24 00:09:01.055926: step: 132/77, loss: 0.00503362575545907 2023-01-24 00:09:02.495941: step: 136/77, loss: 0.04614949971437454 2023-01-24 00:09:03.995815: step: 140/77, loss: 0.008267560042440891 2023-01-24 00:09:05.483936: step: 144/77, loss: 0.0012018970446661115 2023-01-24 00:09:06.911936: step: 148/77, loss: 0.0025487628299742937 2023-01-24 00:09:08.333104: step: 152/77, loss: 0.012709196656942368 2023-01-24 00:09:09.868427: step: 156/77, loss: 0.008082669228315353 2023-01-24 00:09:11.335993: step: 160/77, loss: 0.0749572366476059 2023-01-24 00:09:12.817571: step: 164/77, loss: 0.00012748232984449714 2023-01-24 00:09:14.296413: step: 168/77, loss: 0.004599342588335276 2023-01-24 00:09:15.784644: step: 172/77, loss: 0.12452461570501328 2023-01-24 00:09:17.261696: step: 176/77, loss: 0.002565551083534956 2023-01-24 00:09:18.660726: step: 180/77, loss: 0.019924625754356384 2023-01-24 00:09:20.138327: step: 184/77, loss: 0.03820423781871796 2023-01-24 00:09:21.550600: step: 188/77, loss: 0.004104171879589558 2023-01-24 00:09:23.018175: step: 192/77, loss: 0.0008288765093311667 2023-01-24 00:09:24.495169: step: 196/77, loss: 0.009139486588537693 2023-01-24 00:09:25.938232: step: 200/77, loss: 0.0002491409541107714 2023-01-24 00:09:27.361520: step: 204/77, 
loss: 0.0016183594707399607 2023-01-24 00:09:28.754728: step: 208/77, loss: 0.005521266255527735 2023-01-24 00:09:30.228221: step: 212/77, loss: 0.008927865885198116 2023-01-24 00:09:31.764999: step: 216/77, loss: 0.005285509862005711 2023-01-24 00:09:33.250477: step: 220/77, loss: 0.06506399065256119 2023-01-24 00:09:34.740700: step: 224/77, loss: 0.025701530277729034 2023-01-24 00:09:36.171986: step: 228/77, loss: 0.08451960980892181 2023-01-24 00:09:37.553014: step: 232/77, loss: 0.00516652874648571 2023-01-24 00:09:38.931734: step: 236/77, loss: 0.00426349975168705 2023-01-24 00:09:40.357019: step: 240/77, loss: 0.010390719398856163 2023-01-24 00:09:41.829762: step: 244/77, loss: 0.00514764990657568 2023-01-24 00:09:43.253743: step: 248/77, loss: 0.034158460795879364 2023-01-24 00:09:44.764402: step: 252/77, loss: 0.0057203867472708225 2023-01-24 00:09:46.224315: step: 256/77, loss: 0.003704886185005307 2023-01-24 00:09:47.688671: step: 260/77, loss: 0.00412334781140089 2023-01-24 00:09:49.137373: step: 264/77, loss: 0.005762157030403614 2023-01-24 00:09:50.641976: step: 268/77, loss: 0.04966042935848236 2023-01-24 00:09:52.109929: step: 272/77, loss: 0.025200556963682175 2023-01-24 00:09:53.513512: step: 276/77, loss: 0.0036996318958699703 2023-01-24 00:09:54.960609: step: 280/77, loss: 0.02489420771598816 2023-01-24 00:09:56.391213: step: 284/77, loss: 0.04982485622167587 2023-01-24 00:09:57.797780: step: 288/77, loss: 0.029103586450219154 2023-01-24 00:09:59.265628: step: 292/77, loss: 0.003530132584273815 2023-01-24 00:10:00.633071: step: 296/77, loss: 0.01538523007184267 2023-01-24 00:10:02.139474: step: 300/77, loss: 0.0024071214720606804 2023-01-24 00:10:03.606857: step: 304/77, loss: 0.02392967790365219 2023-01-24 00:10:05.014434: step: 308/77, loss: 0.0115028265863657 2023-01-24 00:10:06.367446: step: 312/77, loss: 0.004025975242257118 2023-01-24 00:10:07.819249: step: 316/77, loss: 0.009813313372433186 2023-01-24 00:10:09.228557: step: 320/77, loss: 0.0030793596524745226 2023-01-24 00:10:10.655425: step: 324/77, loss: 0.012550530955195427 2023-01-24 00:10:12.066426: step: 328/77, loss: 0.019397376105189323 2023-01-24 00:10:13.466818: step: 332/77, loss: 0.00932474248111248 2023-01-24 00:10:14.870320: step: 336/77, loss: 0.0037380498833954334 2023-01-24 00:10:16.358601: step: 340/77, loss: 0.0163428895175457 2023-01-24 00:10:17.790451: step: 344/77, loss: 0.0024034634698182344 2023-01-24 00:10:19.303127: step: 348/77, loss: 0.05375993996858597 2023-01-24 00:10:20.763699: step: 352/77, loss: 0.003951383754611015 2023-01-24 00:10:22.249767: step: 356/77, loss: 2.102347934851423e-05 2023-01-24 00:10:23.598777: step: 360/77, loss: 0.025249600410461426 2023-01-24 00:10:25.055475: step: 364/77, loss: 0.0028448286466300488 2023-01-24 00:10:26.581314: step: 368/77, loss: 0.00012489972868934274 2023-01-24 00:10:27.977111: step: 372/77, loss: 0.009151318110525608 2023-01-24 00:10:29.354097: step: 376/77, loss: 0.006536440923810005 2023-01-24 00:10:30.819353: step: 380/77, loss: 0.02696499414741993 2023-01-24 00:10:32.330355: step: 384/77, loss: 0.03662192076444626 2023-01-24 00:10:33.745492: step: 388/77, loss: 0.003093214239925146 ================================================== Loss: 0.016 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Chinese: {'template': {'p': 0.9358974358974359, 'r': 
0.5748031496062992, 'f1': 0.7121951219512195}, 'slot': {'p': 0.4897959183673469, 'r': 0.022988505747126436, 'f1': 0.04391582799634035}, 'combined': 0.03127663847544239, 'epoch': 10} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Korean: {'template': {'p': 0.9358974358974359, 'r': 0.5748031496062992, 'f1': 0.7121951219512195}, 'slot': {'p': 0.4897959183673469, 'r': 0.022988505747126436, 'f1': 0.04391582799634035}, 'combined': 0.03127663847544239, 'epoch': 10} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Russian: {'template': {'p': 0.9358974358974359, 'r': 0.5748031496062992, 'f1': 0.7121951219512195}, 'slot': {'p': 0.4897959183673469, 'r': 0.022988505747126436, 'f1': 0.04391582799634035}, 'combined': 0.03127663847544239, 'epoch': 10} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 10} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 10} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 10} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 
0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 11 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:12:07.532719: step: 4/77, loss: 0.004117776174098253 2023-01-24 00:12:08.972991: step: 8/77, loss: 0.004915441386401653 2023-01-24 00:12:10.482277: step: 12/77, loss: 0.01415804773569107 2023-01-24 00:12:11.944881: step: 16/77, loss: 0.02904348261654377 2023-01-24 00:12:13.429272: step: 20/77, loss: 0.004187597427517176 2023-01-24 00:12:14.919746: step: 24/77, loss: 0.02014974318444729 2023-01-24 00:12:16.408318: step: 28/77, loss: 0.0004997641663067043 2023-01-24 00:12:17.913393: step: 32/77, loss: 0.0051374551840126514 2023-01-24 00:12:19.349195: step: 36/77, loss: 0.002386154141277075 2023-01-24 00:12:20.812222: step: 40/77, loss: 0.0029787034727633 2023-01-24 00:12:22.227679: step: 44/77, loss: 0.0478215366601944 2023-01-24 00:12:23.645606: step: 48/77, loss: 0.0031992436852306128 2023-01-24 00:12:25.101998: step: 52/77, loss: 0.01320080365985632 2023-01-24 00:12:26.602492: step: 56/77, loss: 0.016016095876693726 2023-01-24 00:12:28.029688: step: 60/77, loss: 0.004994070157408714 2023-01-24 00:12:29.430569: step: 64/77, loss: 0.0011320551857352257 2023-01-24 00:12:30.887182: step: 68/77, loss: 0.004726284649223089 2023-01-24 00:12:32.389455: step: 72/77, loss: 0.1585463434457779 2023-01-24 00:12:33.831146: step: 76/77, loss: 0.0007189574535004795 2023-01-24 00:12:35.250563: step: 80/77, loss: 0.002605971647426486 2023-01-24 00:12:36.677751: step: 84/77, loss: 0.045810721814632416 2023-01-24 00:12:38.099212: step: 88/77, loss: 0.023271696642041206 2023-01-24 00:12:39.539790: step: 92/77, loss: 0.01812889613211155 2023-01-24 00:12:41.038230: step: 96/77, loss: 0.006418706849217415 2023-01-24 00:12:42.503152: step: 100/77, loss: 0.03154192864894867 2023-01-24 00:12:43.897687: step: 104/77, loss: 0.005148966331034899 2023-01-24 00:12:45.360192: step: 108/77, loss: 0.005101452115923166 2023-01-24 00:12:46.839790: step: 112/77, loss: 0.010542785748839378 2023-01-24 00:12:48.234151: step: 116/77, loss: 0.0051417420618236065 2023-01-24 00:12:49.680939: step: 120/77, loss: 0.01077075581997633 2023-01-24 00:12:51.149483: step: 124/77, loss: 0.0030551054514944553 2023-01-24 00:12:52.681726: step: 128/77, loss: 0.010427115485072136 2023-01-24 00:12:54.083876: step: 132/77, loss: 0.0038118772208690643 2023-01-24 00:12:55.467136: step: 136/77, loss: 0.0009647954138927162 2023-01-24 00:12:56.886571: step: 140/77, loss: 0.03399214148521423 2023-01-24 00:12:58.443134: step: 144/77, loss: 0.006156741175800562 2023-01-24 00:12:59.884401: step: 148/77, loss: 0.022563472390174866 2023-01-24 00:13:01.301250: step: 152/77, loss: 0.09333071857690811 2023-01-24 00:13:02.702557: step: 156/77, loss: 0.010883467271924019 2023-01-24 00:13:04.153830: step: 160/77, loss: 0.03193432465195656 2023-01-24 00:13:05.640401: step: 164/77, loss: 0.002262625377625227 2023-01-24 00:13:07.126129: step: 168/77, loss: 0.001045061624608934 2023-01-24 00:13:08.575888: step: 172/77, loss: 0.004644421394914389 2023-01-24 00:13:10.012884: step: 176/77, loss: 0.016477003693580627 2023-01-24 00:13:11.480644: step: 180/77, loss: 0.012041272595524788 2023-01-24 00:13:12.998398: step: 184/77, loss: 0.0005915906513109803 2023-01-24 00:13:14.470668: step: 188/77, loss: 0.025882355868816376 2023-01-24 
00:13:15.900638: step: 192/77, loss: 0.00848508719354868 2023-01-24 00:13:17.358222: step: 196/77, loss: 0.0398591086268425 2023-01-24 00:13:18.758304: step: 200/77, loss: 0.028594810515642166 2023-01-24 00:13:20.226565: step: 204/77, loss: 0.00010791288514155895 2023-01-24 00:13:21.697442: step: 208/77, loss: 0.02433749847114086 2023-01-24 00:13:23.167887: step: 212/77, loss: 0.007587941829115152 2023-01-24 00:13:24.634849: step: 216/77, loss: 0.000461955729406327 2023-01-24 00:13:26.099325: step: 220/77, loss: 0.003147740848362446 2023-01-24 00:13:27.481022: step: 224/77, loss: 0.02163620851933956 2023-01-24 00:13:28.856958: step: 228/77, loss: 0.014736814424395561 2023-01-24 00:13:30.288408: step: 232/77, loss: 0.004088567104190588 2023-01-24 00:13:31.741052: step: 236/77, loss: 0.001286070910282433 2023-01-24 00:13:33.162056: step: 240/77, loss: 3.1217612558975816e-05 2023-01-24 00:13:34.603044: step: 244/77, loss: 0.022283362224698067 2023-01-24 00:13:36.027295: step: 248/77, loss: 0.0012130478862673044 2023-01-24 00:13:37.508319: step: 252/77, loss: 0.0005672836559824646 2023-01-24 00:13:38.988052: step: 256/77, loss: 0.0002956362732220441 2023-01-24 00:13:40.442483: step: 260/77, loss: 0.016247715801000595 2023-01-24 00:13:41.891706: step: 264/77, loss: 0.012342410162091255 2023-01-24 00:13:43.318378: step: 268/77, loss: 0.0045818742364645 2023-01-24 00:13:44.734496: step: 272/77, loss: 0.0005448129377327859 2023-01-24 00:13:46.221675: step: 276/77, loss: 0.058306984603405 2023-01-24 00:13:47.663825: step: 280/77, loss: 0.0006328775198198855 2023-01-24 00:13:49.146099: step: 284/77, loss: 8.370068098884076e-05 2023-01-24 00:13:50.582704: step: 288/77, loss: 0.031446944922208786 2023-01-24 00:13:52.032025: step: 292/77, loss: 0.06117931753396988 2023-01-24 00:13:53.510953: step: 296/77, loss: 0.0010637836530804634 2023-01-24 00:13:54.941710: step: 300/77, loss: 0.044221919029951096 2023-01-24 00:13:56.399205: step: 304/77, loss: 0.000338422047207132 2023-01-24 00:13:57.851864: step: 308/77, loss: 0.00033849888131953776 2023-01-24 00:13:59.321915: step: 312/77, loss: 0.5784204006195068 2023-01-24 00:14:00.749577: step: 316/77, loss: 0.02844500169157982 2023-01-24 00:14:02.228428: step: 320/77, loss: 0.0025478312745690346 2023-01-24 00:14:03.724745: step: 324/77, loss: 0.019952338188886642 2023-01-24 00:14:05.168878: step: 328/77, loss: 6.558552558999509e-05 2023-01-24 00:14:06.632407: step: 332/77, loss: 0.008837895467877388 2023-01-24 00:14:08.078865: step: 336/77, loss: 0.029670370742678642 2023-01-24 00:14:09.564058: step: 340/77, loss: 0.003721608780324459 2023-01-24 00:14:11.072943: step: 344/77, loss: 0.005711579695343971 2023-01-24 00:14:12.540258: step: 348/77, loss: 0.0020196978002786636 2023-01-24 00:14:13.995956: step: 352/77, loss: 0.030364107340574265 2023-01-24 00:14:15.409798: step: 356/77, loss: 0.0035150665789842606 2023-01-24 00:14:16.822750: step: 360/77, loss: 0.023433202877640724 2023-01-24 00:14:18.257566: step: 364/77, loss: 8.302327478304505e-05 2023-01-24 00:14:19.792287: step: 368/77, loss: 0.04784025624394417 2023-01-24 00:14:21.231576: step: 372/77, loss: 0.00016430678078904748 2023-01-24 00:14:22.696232: step: 376/77, loss: 0.002062925137579441 2023-01-24 00:14:24.078729: step: 380/77, loss: 0.017002275213599205 2023-01-24 00:14:25.434842: step: 384/77, loss: 0.009056108072400093 2023-01-24 00:14:26.874358: step: 388/77, loss: 0.00011864644329762086 ================================================== Loss: 0.021 -------------------- Dev Chinese: 
{'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 11} Test Chinese: {'template': {'p': 0.9358974358974359, 'r': 0.5748031496062992, 'f1': 0.7121951219512195}, 'slot': {'p': 0.7037037037037037, 'r': 0.018199233716475097, 'f1': 0.035480859010270774}, 'combined': 0.02526929470975382, 'epoch': 11} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 11} Test Korean: {'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.64, 'r': 0.01532567049808429, 'f1': 0.02993451824134705}, 'combined': 0.021130248170362624, 'epoch': 11} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 11} Test Russian: {'template': {'p': 0.935064935064935, 'r': 0.5669291338582677, 'f1': 0.7058823529411765}, 'slot': {'p': 0.625, 'r': 0.014367816091954023, 'f1': 0.028089887640449437}, 'combined': 0.019828155981493723, 'epoch': 11} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 11} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 
'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 12 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:15:59.835234: step: 4/77, loss: 0.0005247407825663686 2023-01-24 00:16:01.305348: step: 8/77, loss: 0.002567094750702381 2023-01-24 00:16:02.769227: step: 12/77, loss: 0.017465317621827126 2023-01-24 00:16:04.272158: step: 16/77, loss: 0.12676353752613068 2023-01-24 00:16:05.679589: step: 20/77, loss: 0.02119380608201027 2023-01-24 00:16:07.135942: step: 24/77, loss: 0.005282233469188213 2023-01-24 00:16:08.505144: step: 28/77, loss: 0.02207837998867035 2023-01-24 00:16:10.045228: step: 32/77, loss: 0.00440692575648427 2023-01-24 00:16:11.525984: step: 36/77, loss: 0.0044909631833434105 2023-01-24 00:16:12.960560: step: 40/77, loss: 0.0014111119089648128 2023-01-24 00:16:14.438389: step: 44/77, loss: 0.0404345728456974 2023-01-24 00:16:15.823268: step: 48/77, loss: 0.019730795174837112 2023-01-24 00:16:17.302306: step: 52/77, loss: 0.01563037931919098 2023-01-24 00:16:18.711467: step: 56/77, loss: 0.016580674797296524 2023-01-24 00:16:20.142272: step: 60/77, loss: 0.002576342085376382 2023-01-24 00:16:21.550391: step: 64/77, loss: 0.1102462187409401 2023-01-24 00:16:22.922602: step: 68/77, loss: 0.11612618714570999 2023-01-24 00:16:24.376432: step: 72/77, loss: 0.019472433254122734 2023-01-24 00:16:25.900875: step: 76/77, loss: 0.006306190975010395 2023-01-24 00:16:27.339264: step: 80/77, loss: 0.034560639411211014 2023-01-24 00:16:28.801467: step: 84/77, loss: 4.036349128000438e-05 2023-01-24 00:16:30.197312: step: 88/77, loss: 0.0003321681288070977 2023-01-24 00:16:31.643745: step: 92/77, loss: 0.0636209100484848 2023-01-24 00:16:33.080436: step: 96/77, loss: 0.00023842410882934928 2023-01-24 00:16:34.554062: step: 100/77, loss: 0.003968775738030672 2023-01-24 00:16:36.025933: step: 104/77, loss: 0.005073751788586378 2023-01-24 00:16:37.469401: step: 108/77, loss: 0.006253676023334265 2023-01-24 00:16:38.858604: step: 112/77, loss: 0.010998057201504707 2023-01-24 00:16:40.241543: step: 116/77, loss: 0.011545268818736076 2023-01-24 00:16:41.725957: step: 120/77, loss: 0.01722642220556736 2023-01-24 00:16:43.164688: step: 124/77, loss: 0.0311063751578331 2023-01-24 00:16:44.553612: step: 128/77, loss: 0.0013293633237481117 2023-01-24 00:16:46.007969: step: 132/77, loss: 0.0005748384282924235 2023-01-24 00:16:47.371992: step: 136/77, loss: 2.1748521248809993e-05 2023-01-24 00:16:48.865020: step: 140/77, loss: 0.00010100715007865801 2023-01-24 00:16:50.310162: step: 144/77, loss: 0.008385705761611462 2023-01-24 00:16:51.772124: step: 148/77, loss: 0.0017346638487651944 2023-01-24 00:16:53.260458: step: 152/77, loss: 0.018460754305124283 2023-01-24 00:16:54.686485: step: 156/77, loss: 0.036916207522153854 2023-01-24 00:16:56.073724: step: 160/77, loss: 0.008006674237549305 2023-01-24 00:16:57.520272: step: 164/77, loss: 0.007975637912750244 2023-01-24 00:16:58.969499: step: 168/77, loss: 0.0025035978760570288 2023-01-24 00:17:00.441222: step: 172/77, loss: 0.002373643219470978 2023-01-24 00:17:01.901976: step: 176/77, loss: 0.00031880938331596553 2023-01-24 00:17:03.390139: step: 180/77, 
loss: 0.0016650452744215727 2023-01-24 00:17:04.845283: step: 184/77, loss: 0.006842002738267183 2023-01-24 00:17:06.312370: step: 188/77, loss: 0.0019103825325146317 2023-01-24 00:17:07.820689: step: 192/77, loss: 0.019019873812794685 2023-01-24 00:17:09.306590: step: 196/77, loss: 0.012821578420698643 2023-01-24 00:17:10.723545: step: 200/77, loss: 0.06491917371749878 2023-01-24 00:17:12.164325: step: 204/77, loss: 0.003583463840186596 2023-01-24 00:17:13.618745: step: 208/77, loss: 0.004135860130190849 2023-01-24 00:17:15.045122: step: 212/77, loss: 0.000807414238806814 2023-01-24 00:17:16.540730: step: 216/77, loss: 5.906960359425284e-05 2023-01-24 00:17:17.966754: step: 220/77, loss: 0.00628303550183773 2023-01-24 00:17:19.447539: step: 224/77, loss: 0.0019145627738907933 2023-01-24 00:17:20.867079: step: 228/77, loss: 7.338388968491927e-05 2023-01-24 00:17:22.306978: step: 232/77, loss: 0.0041746823117136955 2023-01-24 00:17:23.698471: step: 236/77, loss: 0.009081336669623852 2023-01-24 00:17:25.128442: step: 240/77, loss: 0.016627684235572815 2023-01-24 00:17:26.575093: step: 244/77, loss: 0.0008930110489018261 2023-01-24 00:17:28.082099: step: 248/77, loss: 0.0002453183406032622 2023-01-24 00:17:29.511724: step: 252/77, loss: 0.0023990191984921694 2023-01-24 00:17:30.999299: step: 256/77, loss: 0.050463445484638214 2023-01-24 00:17:32.539731: step: 260/77, loss: 0.013038757257163525 2023-01-24 00:17:33.992387: step: 264/77, loss: 0.0045187515206635 2023-01-24 00:17:35.506696: step: 268/77, loss: 0.002060836646705866 2023-01-24 00:17:36.967287: step: 272/77, loss: 3.999815817223862e-05 2023-01-24 00:17:38.463341: step: 276/77, loss: 0.043776314705610275 2023-01-24 00:17:39.882296: step: 280/77, loss: 0.026300055906176567 2023-01-24 00:17:41.332006: step: 284/77, loss: 0.000944439263548702 2023-01-24 00:17:42.784063: step: 288/77, loss: 0.04076274111866951 2023-01-24 00:17:44.172036: step: 292/77, loss: 0.002136054914444685 2023-01-24 00:17:45.614738: step: 296/77, loss: 0.0010709648486226797 2023-01-24 00:17:47.053689: step: 300/77, loss: 0.0005529043264687061 2023-01-24 00:17:48.590778: step: 304/77, loss: 0.004681096877902746 2023-01-24 00:17:50.091934: step: 308/77, loss: 0.013274089433252811 2023-01-24 00:17:51.497930: step: 312/77, loss: 0.004294515587389469 2023-01-24 00:17:52.918570: step: 316/77, loss: 0.003923433367162943 2023-01-24 00:17:54.318013: step: 320/77, loss: 0.00102332909591496 2023-01-24 00:17:55.775520: step: 324/77, loss: 0.015580816194415092 2023-01-24 00:17:57.185376: step: 328/77, loss: 0.00026385392993688583 2023-01-24 00:17:58.633130: step: 332/77, loss: 2.9556349545600824e-05 2023-01-24 00:18:00.077956: step: 336/77, loss: 0.017724979668855667 2023-01-24 00:18:01.533857: step: 340/77, loss: 0.007643330376595259 2023-01-24 00:18:02.947596: step: 344/77, loss: 0.02229895628988743 2023-01-24 00:18:04.359822: step: 348/77, loss: 0.00037963211070746183 2023-01-24 00:18:05.878258: step: 352/77, loss: 0.0010418157326057553 2023-01-24 00:18:07.309687: step: 356/77, loss: 0.0025185563135892153 2023-01-24 00:18:08.784601: step: 360/77, loss: 0.0009169665281660855 2023-01-24 00:18:10.259330: step: 364/77, loss: 0.007604700978845358 2023-01-24 00:18:11.709283: step: 368/77, loss: 0.00028573302552103996 2023-01-24 00:18:13.153160: step: 372/77, loss: 0.0028571230359375477 2023-01-24 00:18:14.574162: step: 376/77, loss: 6.98276562616229e-05 2023-01-24 00:18:16.007048: step: 380/77, loss: 0.0016387768555432558 2023-01-24 00:18:17.552591: step: 384/77, loss: 
0.0003549756947904825 2023-01-24 00:18:18.990390: step: 388/77, loss: 0.014541917480528355 ================================================== Loss: 0.013 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Chinese: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.5897435897435898, 'r': 0.022030651340996167, 'f1': 0.04247460757156048}, 'combined': 0.02945475298932335, 'epoch': 12} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Korean: {'template': {'p': 0.9722222222222222, 'r': 0.5511811023622047, 'f1': 0.7035175879396985}, 'slot': {'p': 0.5897435897435898, 'r': 0.022030651340996167, 'f1': 0.04247460757156048}, 'combined': 0.02988163346742948, 'epoch': 12} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Russian: {'template': {'p': 0.958904109589041, 'r': 0.5511811023622047, 'f1': 0.7000000000000001}, 'slot': {'p': 0.5897435897435898, 'r': 0.022030651340996167, 'f1': 0.04247460757156048}, 'combined': 0.029732225300092337, 'epoch': 12} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 12} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 12} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 12} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 
'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 13 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:19:52.780623: step: 4/77, loss: 8.652826363686472e-05 2023-01-24 00:19:54.256853: step: 8/77, loss: 0.02977108582854271 2023-01-24 00:19:55.648331: step: 12/77, loss: 0.004708381835371256 2023-01-24 00:19:57.109083: step: 16/77, loss: 0.03685312718153 2023-01-24 00:19:58.519167: step: 20/77, loss: 0.0016763614257797599 2023-01-24 00:20:00.006998: step: 24/77, loss: 0.00012752984184771776 2023-01-24 00:20:01.464281: step: 28/77, loss: 0.0006036722334101796 2023-01-24 00:20:02.860055: step: 32/77, loss: 0.050690364092588425 2023-01-24 00:20:04.365850: step: 36/77, loss: 0.03537668660283089 2023-01-24 00:20:05.812174: step: 40/77, loss: 0.009171489626169205 2023-01-24 00:20:07.234361: step: 44/77, loss: 4.9030128138838336e-05 2023-01-24 00:20:08.651330: step: 48/77, loss: 0.029300745576620102 2023-01-24 00:20:10.045676: step: 52/77, loss: 0.017077535390853882 2023-01-24 00:20:11.482343: step: 56/77, loss: 0.03321102634072304 2023-01-24 00:20:12.944549: step: 60/77, loss: 0.000311605486785993 2023-01-24 00:20:14.391865: step: 64/77, loss: 0.0014301147311925888 2023-01-24 00:20:15.817193: step: 68/77, loss: 0.03102892078459263 2023-01-24 00:20:17.262537: step: 72/77, loss: 0.0002303342189406976 2023-01-24 00:20:18.720061: step: 76/77, loss: 0.0001876634923974052 2023-01-24 00:20:20.183919: step: 80/77, loss: 0.0005235670832917094 2023-01-24 00:20:21.637069: step: 84/77, loss: 0.03420034423470497 2023-01-24 00:20:23.127325: step: 88/77, loss: 0.017702868208289146 2023-01-24 00:20:24.599390: step: 92/77, loss: 0.0008840985246933997 2023-01-24 00:20:26.021589: step: 96/77, loss: 0.01722489297389984 2023-01-24 00:20:27.476526: step: 100/77, loss: 0.0024477271363139153 2023-01-24 00:20:28.949899: step: 104/77, loss: 0.005323044955730438 2023-01-24 00:20:30.362357: step: 108/77, loss: 8.686994260642678e-05 2023-01-24 00:20:31.816130: step: 112/77, loss: 0.012361899949610233 2023-01-24 00:20:33.281556: step: 116/77, loss: 0.01226483378559351 2023-01-24 00:20:34.719314: step: 120/77, loss: 0.004511932842433453 2023-01-24 00:20:36.157668: step: 124/77, loss: 0.001093173515982926 2023-01-24 00:20:37.561329: step: 128/77, loss: 0.005858981050550938 2023-01-24 00:20:39.002088: step: 132/77, loss: 0.03214491531252861 2023-01-24 00:20:40.430761: step: 136/77, loss: 0.0001557179057272151 2023-01-24 00:20:41.895850: step: 140/77, loss: 0.0070053753443062305 2023-01-24 00:20:43.368044: step: 144/77, loss: 0.010695857927203178 2023-01-24 00:20:44.831093: step: 148/77, loss: 0.03357268497347832 2023-01-24 00:20:46.296094: step: 152/77, loss: 0.0016243371646851301 2023-01-24 00:20:47.748952: step: 156/77, loss: 0.0026706468779593706 2023-01-24 00:20:49.138283: step: 160/77, loss: 0.0007189217139966786 2023-01-24 00:20:50.518434: step: 164/77, 
loss: 0.0003048968792427331 2023-01-24 00:20:51.989224: step: 168/77, loss: 0.0005347698461264372 2023-01-24 00:20:53.445009: step: 172/77, loss: 0.0015633282018825412 2023-01-24 00:20:54.955452: step: 176/77, loss: 0.0007745270850136876 2023-01-24 00:20:56.360701: step: 180/77, loss: 0.08212535083293915 2023-01-24 00:20:57.811578: step: 184/77, loss: 0.0016086840769276023 2023-01-24 00:20:59.176642: step: 188/77, loss: 0.003522861050441861 2023-01-24 00:21:00.677319: step: 192/77, loss: 0.0009818300604820251 2023-01-24 00:21:02.088567: step: 196/77, loss: 0.03784959018230438 2023-01-24 00:21:03.518373: step: 200/77, loss: 0.0001309435610892251 2023-01-24 00:21:04.966844: step: 204/77, loss: 0.000350303016602993 2023-01-24 00:21:06.379964: step: 208/77, loss: 0.03144621104001999 2023-01-24 00:21:07.774258: step: 212/77, loss: 0.0029899291694164276 2023-01-24 00:21:09.181779: step: 216/77, loss: 7.75201478973031e-05 2023-01-24 00:21:10.576005: step: 220/77, loss: 0.009032079018652439 2023-01-24 00:21:12.069968: step: 224/77, loss: 0.00041380090988241136 2023-01-24 00:21:13.532352: step: 228/77, loss: 0.0018120076274499297 2023-01-24 00:21:15.036535: step: 232/77, loss: 0.0006691589951515198 2023-01-24 00:21:16.461397: step: 236/77, loss: 0.03671402111649513 2023-01-24 00:21:17.948310: step: 240/77, loss: 0.00020888636936433613 2023-01-24 00:21:19.400646: step: 244/77, loss: 0.00014447855937760323 2023-01-24 00:21:20.819121: step: 248/77, loss: 4.1640640120022e-05 2023-01-24 00:21:22.293179: step: 252/77, loss: 0.013404018245637417 2023-01-24 00:21:23.694907: step: 256/77, loss: 0.003983436618000269 2023-01-24 00:21:25.081131: step: 260/77, loss: 0.0011014521587640047 2023-01-24 00:21:26.551596: step: 264/77, loss: 0.002696492476388812 2023-01-24 00:21:28.037706: step: 268/77, loss: 0.014381970278918743 2023-01-24 00:21:29.498938: step: 272/77, loss: 0.015419892966747284 2023-01-24 00:21:30.962115: step: 276/77, loss: 0.05354885011911392 2023-01-24 00:21:32.428945: step: 280/77, loss: 0.00037291960325092077 2023-01-24 00:21:33.892629: step: 284/77, loss: 0.0010739283170551062 2023-01-24 00:21:35.402683: step: 288/77, loss: 0.0356796458363533 2023-01-24 00:21:36.882115: step: 292/77, loss: 0.0005537212709896266 2023-01-24 00:21:38.322795: step: 296/77, loss: 0.023459943011403084 2023-01-24 00:21:39.823241: step: 300/77, loss: 0.0007196709048002958 2023-01-24 00:21:41.265514: step: 304/77, loss: 0.05698677524924278 2023-01-24 00:21:42.693209: step: 308/77, loss: 2.0601302821887657e-05 2023-01-24 00:21:44.183005: step: 312/77, loss: 0.0007258675759658217 2023-01-24 00:21:45.593489: step: 316/77, loss: 0.012658772990107536 2023-01-24 00:21:47.029804: step: 320/77, loss: 0.0019717190880328417 2023-01-24 00:21:48.498725: step: 324/77, loss: 0.0018236006144434214 2023-01-24 00:21:49.984631: step: 328/77, loss: 0.005118303466588259 2023-01-24 00:21:51.507326: step: 332/77, loss: 0.007175174541771412 2023-01-24 00:21:52.965945: step: 336/77, loss: 0.006737024523317814 2023-01-24 00:21:54.386603: step: 340/77, loss: 0.002637992613017559 2023-01-24 00:21:55.827896: step: 344/77, loss: 0.007260086480528116 2023-01-24 00:21:57.274621: step: 348/77, loss: 3.884471425408265e-06 2023-01-24 00:21:58.716778: step: 352/77, loss: 2.6254192562191747e-06 2023-01-24 00:22:00.155086: step: 356/77, loss: 0.01232851017266512 2023-01-24 00:22:01.652541: step: 360/77, loss: 0.02668759785592556 2023-01-24 00:22:03.123100: step: 364/77, loss: 0.0018975171260535717 2023-01-24 00:22:04.571919: step: 368/77, loss: 
8.679232996655628e-05 2023-01-24 00:22:05.975190: step: 372/77, loss: 0.0029303967021405697 2023-01-24 00:22:07.410598: step: 376/77, loss: 0.0003926019126083702 2023-01-24 00:22:08.907917: step: 380/77, loss: 0.017120422795414925 2023-01-24 00:22:10.385828: step: 384/77, loss: 0.054849255830049515 2023-01-24 00:22:11.902961: step: 388/77, loss: 0.0002151026128558442 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 13} Test Chinese: {'template': {'p': 0.922077922077922, 'r': 0.5590551181102362, 'f1': 0.696078431372549}, 'slot': {'p': 0.5121951219512195, 'r': 0.020114942528735632, 'f1': 0.03870967741935484}, 'combined': 0.026944971537001896, 'epoch': 13} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 13} Test Korean: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.525, 'r': 0.020114942528735632, 'f1': 0.03874538745387454}, 'combined': 0.02710268481995165, 'epoch': 13} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 13} Test Russian: {'template': {'p': 0.922077922077922, 'r': 0.5590551181102362, 'f1': 0.696078431372549}, 'slot': {'p': 0.5121951219512195, 'r': 0.020114942528735632, 'f1': 0.03870967741935484}, 'combined': 0.026944971537001896, 'epoch': 13} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 13} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 13} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 13} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 
0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 14 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:23:44.628029: step: 4/77, loss: 0.0005923768039792776 2023-01-24 00:23:46.048710: step: 8/77, loss: 0.0010371538810431957 2023-01-24 00:23:47.527845: step: 12/77, loss: 0.0025780892465263605 2023-01-24 00:23:49.047387: step: 16/77, loss: 0.014587003737688065 2023-01-24 00:23:50.489813: step: 20/77, loss: 0.009667899459600449 2023-01-24 00:23:51.919502: step: 24/77, loss: 0.015686804428696632 2023-01-24 00:23:53.347369: step: 28/77, loss: 0.001794680254533887 2023-01-24 00:23:54.801667: step: 32/77, loss: 0.0003042321477551013 2023-01-24 00:23:56.268130: step: 36/77, loss: 2.3017515559331514e-05 2023-01-24 00:23:57.690622: step: 40/77, loss: 0.02376677840948105 2023-01-24 00:23:59.155382: step: 44/77, loss: 0.0007356961723417044 2023-01-24 00:24:00.605394: step: 48/77, loss: 0.0004205175209790468 2023-01-24 00:24:02.092219: step: 52/77, loss: 0.0027321891393512487 2023-01-24 00:24:03.510382: step: 56/77, loss: 0.0027330503799021244 2023-01-24 00:24:04.929191: step: 60/77, loss: 0.17966850101947784 2023-01-24 00:24:06.371726: step: 64/77, loss: 2.972300171677489e-05 2023-01-24 00:24:07.861575: step: 68/77, loss: 0.002235070103779435 2023-01-24 00:24:09.312587: step: 72/77, loss: 0.007015921175479889 2023-01-24 00:24:10.715684: step: 76/77, loss: 0.003534244140610099 2023-01-24 00:24:12.151029: step: 80/77, loss: 0.0008546484168618917 2023-01-24 00:24:13.551133: step: 84/77, loss: 0.007646994665265083 2023-01-24 00:24:15.010977: step: 88/77, loss: 0.0015440911520272493 2023-01-24 00:24:16.464252: step: 92/77, loss: 0.0008783871307969093 2023-01-24 00:24:17.945073: step: 96/77, loss: 0.000569098221603781 2023-01-24 00:24:19.430007: step: 100/77, loss: 0.025969691574573517 2023-01-24 00:24:20.852543: step: 104/77, loss: 0.003922209609299898 2023-01-24 00:24:22.285078: step: 108/77, loss: 0.002353658201172948 2023-01-24 00:24:23.737229: step: 112/77, loss: 0.003125975374132395 2023-01-24 00:24:25.260233: step: 116/77, loss: 1.4664983609691262e-05 2023-01-24 00:24:26.754071: step: 120/77, loss: 0.009418785572052002 2023-01-24 00:24:28.219654: step: 124/77, loss: 0.03296668082475662 2023-01-24 00:24:29.672337: step: 128/77, loss: 0.00045859161764383316 2023-01-24 00:24:31.111316: step: 132/77, loss: 0.02482902631163597 2023-01-24 00:24:32.576388: step: 136/77, loss: 0.00011510573676787317 2023-01-24 00:24:33.993628: step: 140/77, loss: 0.010969472117722034 2023-01-24 00:24:35.454117: step: 144/77, loss: 0.0002006545546464622 2023-01-24 00:24:36.908892: step: 148/77, 
loss: 0.000147586441016756 2023-01-24 00:24:38.346754: step: 152/77, loss: 0.00021865824237465858 2023-01-24 00:24:39.764139: step: 156/77, loss: 0.014760902151465416 2023-01-24 00:24:41.174191: step: 160/77, loss: 0.03611365333199501 2023-01-24 00:24:42.637715: step: 164/77, loss: 6.745725113432854e-05 2023-01-24 00:24:44.102534: step: 168/77, loss: 0.004985267762094736 2023-01-24 00:24:45.531306: step: 172/77, loss: 0.0003679897345136851 2023-01-24 00:24:46.934010: step: 176/77, loss: 6.150172703200951e-05 2023-01-24 00:24:48.408115: step: 180/77, loss: 0.005789898335933685 2023-01-24 00:24:49.913283: step: 184/77, loss: 1.4037606888450682e-05 2023-01-24 00:24:51.420186: step: 188/77, loss: 0.0030449635814875364 2023-01-24 00:24:52.868302: step: 192/77, loss: 0.0323932059109211 2023-01-24 00:24:54.319543: step: 196/77, loss: 0.006242392584681511 2023-01-24 00:24:55.755538: step: 200/77, loss: 0.00563141331076622 2023-01-24 00:24:57.134243: step: 204/77, loss: 0.00011276640725554898 2023-01-24 00:24:58.593616: step: 208/77, loss: 0.0010092456359416246 2023-01-24 00:25:00.015112: step: 212/77, loss: 0.0007759323343634605 2023-01-24 00:25:01.464224: step: 216/77, loss: 0.0009462046436965466 2023-01-24 00:25:02.871267: step: 220/77, loss: 0.001489174086600542 2023-01-24 00:25:04.314803: step: 224/77, loss: 0.010666913352906704 2023-01-24 00:25:05.756513: step: 228/77, loss: 0.00035105596180073917 2023-01-24 00:25:07.182869: step: 232/77, loss: 0.0017263833433389664 2023-01-24 00:25:08.600067: step: 236/77, loss: 0.00438380753621459 2023-01-24 00:25:10.031662: step: 240/77, loss: 0.0013219165848568082 2023-01-24 00:25:11.464051: step: 244/77, loss: 0.0049150073900818825 2023-01-24 00:25:12.924761: step: 248/77, loss: 0.001922804513014853 2023-01-24 00:25:14.353350: step: 252/77, loss: 0.008862883783876896 2023-01-24 00:25:15.788213: step: 256/77, loss: 3.397425462026149e-05 2023-01-24 00:25:17.268803: step: 260/77, loss: 0.0238387081772089 2023-01-24 00:25:18.680608: step: 264/77, loss: 0.001543010352179408 2023-01-24 00:25:20.083962: step: 268/77, loss: 0.0015451196813955903 2023-01-24 00:25:21.503097: step: 272/77, loss: 0.001354982960037887 2023-01-24 00:25:22.878068: step: 276/77, loss: 1.9417417206568643e-05 2023-01-24 00:25:24.287117: step: 280/77, loss: 0.011326556093990803 2023-01-24 00:25:25.750320: step: 284/77, loss: 0.0003216102486476302 2023-01-24 00:25:27.215120: step: 288/77, loss: 7.104573160177097e-05 2023-01-24 00:25:28.620431: step: 292/77, loss: 7.505870598834008e-05 2023-01-24 00:25:29.998239: step: 296/77, loss: 0.005343265365809202 2023-01-24 00:25:31.445280: step: 300/77, loss: 0.0016696308739483356 2023-01-24 00:25:32.874877: step: 304/77, loss: 0.011962641961872578 2023-01-24 00:25:34.257958: step: 308/77, loss: 0.004443937446922064 2023-01-24 00:25:35.671967: step: 312/77, loss: 0.0005993850063532591 2023-01-24 00:25:37.143828: step: 316/77, loss: 0.0010810885578393936 2023-01-24 00:25:38.601126: step: 320/77, loss: 0.0004884044174104929 2023-01-24 00:25:39.978032: step: 324/77, loss: 0.006879044696688652 2023-01-24 00:25:41.443861: step: 328/77, loss: 0.2006436139345169 2023-01-24 00:25:42.868545: step: 332/77, loss: 0.0019754248205572367 2023-01-24 00:25:44.286558: step: 336/77, loss: 0.06549952179193497 2023-01-24 00:25:45.751459: step: 340/77, loss: 1.5091065506567247e-05 2023-01-24 00:25:47.183826: step: 344/77, loss: 0.012617451138794422 2023-01-24 00:25:48.646354: step: 348/77, loss: 0.0666639506816864 2023-01-24 00:25:50.084613: step: 352/77, loss: 
0.023786582052707672 2023-01-24 00:25:51.532138: step: 356/77, loss: 0.004085585009306669 2023-01-24 00:25:53.029135: step: 360/77, loss: 0.02184930071234703 2023-01-24 00:25:54.444208: step: 364/77, loss: 0.00239895679987967 2023-01-24 00:25:55.913251: step: 368/77, loss: 0.004362748935818672 2023-01-24 00:25:57.318615: step: 372/77, loss: 0.0003628632111940533 2023-01-24 00:25:58.716667: step: 376/77, loss: 0.0006062775501050055 2023-01-24 00:26:00.192713: step: 380/77, loss: 0.0021805637516081333 2023-01-24 00:26:01.637502: step: 384/77, loss: 0.0005585667095147073 2023-01-24 00:26:03.084032: step: 388/77, loss: 0.002515989588573575 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Chinese: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6, 'r': 0.022988505747126436, 'f1': 0.04428044280442805}, 'combined': 0.0304014980448312, 'epoch': 14} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Korean: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6, 'r': 0.022988505747126436, 'f1': 0.04428044280442805}, 'combined': 0.0304014980448312, 'epoch': 14} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 14} Test Russian: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6, 'r': 0.022988505747126436, 'f1': 0.04428044280442805}, 'combined': 0.0304014980448312, 'epoch': 14} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 14} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 14} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 14} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 
0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 15 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:27:35.840771: step: 4/77, loss: 0.0027438707184046507 2023-01-24 00:27:37.255791: step: 8/77, loss: 0.00026642318698577583 2023-01-24 00:27:38.720221: step: 12/77, loss: 0.004584559239447117 2023-01-24 00:27:40.153828: step: 16/77, loss: 0.007085051387548447 2023-01-24 00:27:41.618748: step: 20/77, loss: 0.001897596986964345 2023-01-24 00:27:43.022464: step: 24/77, loss: 0.0015989484963938594 2023-01-24 00:27:44.470300: step: 28/77, loss: 0.004994917660951614 2023-01-24 00:27:45.936317: step: 32/77, loss: 0.001282883808016777 2023-01-24 00:27:47.368220: step: 36/77, loss: 0.003742544911801815 2023-01-24 00:27:48.701301: step: 40/77, loss: 0.00034317385870963335 2023-01-24 00:27:50.115697: step: 44/77, loss: 0.014128199778497219 2023-01-24 00:27:51.587186: step: 48/77, loss: 0.002062295563519001 2023-01-24 00:27:53.053206: step: 52/77, loss: 0.005334725137799978 2023-01-24 00:27:54.616057: step: 56/77, loss: 0.02750406041741371 2023-01-24 00:27:56.088000: step: 60/77, loss: 1.7550912161823362e-05 2023-01-24 00:27:57.546834: step: 64/77, loss: 0.0007373357657343149 2023-01-24 00:27:58.923402: step: 68/77, loss: 0.011681796982884407 2023-01-24 00:28:00.336933: step: 72/77, loss: 0.0003210146678611636 2023-01-24 00:28:01.783908: step: 76/77, loss: 0.027712354436516762 2023-01-24 00:28:03.200808: step: 80/77, loss: 0.013353580608963966 2023-01-24 00:28:04.667936: step: 84/77, loss: 0.0004966604756191373 2023-01-24 00:28:06.234941: step: 88/77, loss: 0.0008741347119212151 2023-01-24 00:28:07.673644: step: 92/77, loss: 0.017185816541314125 2023-01-24 00:28:09.109107: step: 96/77, loss: 0.0003531465772539377 2023-01-24 00:28:10.499597: step: 100/77, loss: 0.01242710743099451 2023-01-24 00:28:11.883485: step: 104/77, loss: 0.01390456035733223 2023-01-24 00:28:13.299049: step: 108/77, loss: 0.046912193298339844 2023-01-24 00:28:14.742763: step: 112/77, loss: 0.0011105500161647797 2023-01-24 00:28:16.185333: step: 116/77, loss: 0.0133155956864357 2023-01-24 00:28:17.639952: step: 120/77, loss: 0.00010007782111642882 2023-01-24 00:28:19.123064: step: 124/77, loss: 9.07158937479835e-06 2023-01-24 00:28:20.561268: step: 128/77, loss: 8.05234958534129e-05 2023-01-24 00:28:21.979636: step: 132/77, loss: 0.00122557720169425 2023-01-24 
00:28:23.458244: step: 136/77, loss: 4.57392798125511e-06 2023-01-24 00:28:24.899781: step: 140/77, loss: 0.030837003141641617 2023-01-24 00:28:26.313229: step: 144/77, loss: 0.033023592084646225 2023-01-24 00:28:27.692634: step: 148/77, loss: 0.028124654665589333 2023-01-24 00:28:29.198105: step: 152/77, loss: 1.0030650628323201e-05 2023-01-24 00:28:30.716923: step: 156/77, loss: 0.0028765795286744833 2023-01-24 00:28:32.140043: step: 160/77, loss: 0.00033796275965869427 2023-01-24 00:28:33.621648: step: 164/77, loss: 0.00021561034373007715 2023-01-24 00:28:35.032198: step: 168/77, loss: 0.0023136413656175137 2023-01-24 00:28:36.505544: step: 172/77, loss: 0.07497882843017578 2023-01-24 00:28:37.869406: step: 176/77, loss: 0.0006167854298837483 2023-01-24 00:28:39.318027: step: 180/77, loss: 0.0005940793198533356 2023-01-24 00:28:40.798986: step: 184/77, loss: 0.0018082348397001624 2023-01-24 00:28:42.199697: step: 188/77, loss: 0.0010285713942721486 2023-01-24 00:28:43.674527: step: 192/77, loss: 0.03155481815338135 2023-01-24 00:28:45.111343: step: 196/77, loss: 4.493755113799125e-05 2023-01-24 00:28:46.552308: step: 200/77, loss: 0.08632223308086395 2023-01-24 00:28:48.033849: step: 204/77, loss: 0.020001355558633804 2023-01-24 00:28:49.429947: step: 208/77, loss: 0.0036979829892516136 2023-01-24 00:28:50.849691: step: 212/77, loss: 0.005464925896376371 2023-01-24 00:28:52.337672: step: 216/77, loss: 0.0006584033253602684 2023-01-24 00:28:53.827405: step: 220/77, loss: 0.0031981179490685463 2023-01-24 00:28:55.263263: step: 224/77, loss: 0.0001279626740142703 2023-01-24 00:28:56.691217: step: 228/77, loss: 0.0026379525661468506 2023-01-24 00:28:58.198397: step: 232/77, loss: 0.02259586565196514 2023-01-24 00:28:59.612761: step: 236/77, loss: 5.061298725195229e-05 2023-01-24 00:29:01.045402: step: 240/77, loss: 0.0009586418746039271 2023-01-24 00:29:02.544079: step: 244/77, loss: 0.03353344276547432 2023-01-24 00:29:03.967504: step: 248/77, loss: 4.8978930863086134e-05 2023-01-24 00:29:05.395672: step: 252/77, loss: 0.0018730126321315765 2023-01-24 00:29:06.915656: step: 256/77, loss: 0.0020813527517020702 2023-01-24 00:29:08.324060: step: 260/77, loss: 0.012264858931303024 2023-01-24 00:29:09.712786: step: 264/77, loss: 0.000144295918289572 2023-01-24 00:29:11.206496: step: 268/77, loss: 0.0005735751474276185 2023-01-24 00:29:12.606652: step: 272/77, loss: 8.950501796789467e-05 2023-01-24 00:29:14.111564: step: 276/77, loss: 1.8721562810242176e-05 2023-01-24 00:29:15.542571: step: 280/77, loss: 0.0009570186375640333 2023-01-24 00:29:16.908753: step: 284/77, loss: 0.0001999553933274001 2023-01-24 00:29:18.364662: step: 288/77, loss: 1.228625205840217e-05 2023-01-24 00:29:19.811242: step: 292/77, loss: 0.005362714175134897 2023-01-24 00:29:21.294439: step: 296/77, loss: 0.003985617309808731 2023-01-24 00:29:22.773952: step: 300/77, loss: 0.0010382856708019972 2023-01-24 00:29:24.221760: step: 304/77, loss: 0.001597329042851925 2023-01-24 00:29:25.726386: step: 308/77, loss: 0.0002656100841704756 2023-01-24 00:29:27.171671: step: 312/77, loss: 0.0005380258662626147 2023-01-24 00:29:28.564448: step: 316/77, loss: 0.0003440924920141697 2023-01-24 00:29:30.044627: step: 320/77, loss: 0.025249456986784935 2023-01-24 00:29:31.531674: step: 324/77, loss: 0.004531663376837969 2023-01-24 00:29:32.978447: step: 328/77, loss: 0.0008072768105193973 2023-01-24 00:29:34.450243: step: 332/77, loss: 0.0002947875182144344 2023-01-24 00:29:35.841195: step: 336/77, loss: 2.8490705517469905e-05 2023-01-24 
00:29:37.179169: step: 340/77, loss: 0.006270504556596279 2023-01-24 00:29:38.661966: step: 344/77, loss: 0.00047301093582063913 2023-01-24 00:29:40.116533: step: 348/77, loss: 0.04062270745635033 2023-01-24 00:29:41.543603: step: 352/77, loss: 0.08819930255413055 2023-01-24 00:29:43.019505: step: 356/77, loss: 0.03620404750108719 2023-01-24 00:29:44.435999: step: 360/77, loss: 0.009372718632221222 2023-01-24 00:29:45.872147: step: 364/77, loss: 0.00039701780769973993 2023-01-24 00:29:47.338603: step: 368/77, loss: 0.010207166895270348 2023-01-24 00:29:48.772579: step: 372/77, loss: 0.00714827049523592 2023-01-24 00:29:50.304391: step: 376/77, loss: 0.005693482235074043 2023-01-24 00:29:51.813546: step: 380/77, loss: 4.395148062030785e-05 2023-01-24 00:29:53.249686: step: 384/77, loss: 0.0007585534476675093 2023-01-24 00:29:54.713066: step: 388/77, loss: 0.0034739826805889606 ================================================== Loss: 0.009 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6538461538461539, 'r': 0.016283524904214558, 'f1': 0.03177570093457943}, 'combined': 0.021925233644859807, 'epoch': 15} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6538461538461539, 'r': 0.016283524904214558, 'f1': 0.03177570093457943}, 'combined': 0.021925233644859807, 'epoch': 15} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 15} Test Russian: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6538461538461539, 'r': 0.016283524904214558, 'f1': 0.03177570093457943}, 'combined': 0.021925233644859807, 'epoch': 15} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 15} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 15} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 15} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} 
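A note on how the evaluation dicts above fit together. This is an inference from the logged numbers themselves, not from the training code (which is not shown here): each 'f1' is the usual harmonic mean of 'p' and 'r', and each 'combined' score equals the template f1 multiplied by the slot f1. For example, in the epoch-15 Dev rows, 2·1.0·0.58333/(1.0+0.58333) ≈ 0.73684 and 0.73684·0.070299 ≈ 0.0517991, matching the printed values. A minimal sketch:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    """'combined' in this log equals template f1 * slot f1 (inferred from the numbers)."""
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

# Reproduces the epoch-15 Dev rows above:
template = {'p': 1.0, 'r': 0.5833333333333334}
slot = {'p': 0.5, 'r': 0.03780718336483932}
print(combined_score(template, slot))  # ~0.05179909351586346
```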
-------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 16 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:31:27.615061: step: 4/77, loss: 4.4154916395200416e-05 2023-01-24 00:31:29.078524: step: 8/77, loss: 0.0002105018065776676 2023-01-24 00:31:30.543953: step: 12/77, loss: 0.00098187115509063 2023-01-24 00:31:32.039007: step: 16/77, loss: 6.890665099490434e-05 2023-01-24 00:31:33.440403: step: 20/77, loss: 0.0364018939435482 2023-01-24 00:31:34.789704: step: 24/77, loss: 0.0007282377337105572 2023-01-24 00:31:36.285061: step: 28/77, loss: 0.0036674540024250746 2023-01-24 00:31:37.679663: step: 32/77, loss: 0.001136477687396109 2023-01-24 00:31:39.124781: step: 36/77, loss: 0.002714781789109111 2023-01-24 00:31:40.512095: step: 40/77, loss: 0.006572374142706394 2023-01-24 00:31:41.966614: step: 44/77, loss: 0.06183435767889023 2023-01-24 00:31:43.430833: step: 48/77, loss: 0.012780948542058468 2023-01-24 00:31:44.867747: step: 52/77, loss: 0.03482649475336075 2023-01-24 00:31:46.293216: step: 56/77, loss: 0.0003644750395324081 2023-01-24 00:31:47.794235: step: 60/77, loss: 0.02861851267516613 2023-01-24 00:31:49.234401: step: 64/77, loss: 0.004674635361880064 2023-01-24 00:31:50.618883: step: 68/77, loss: 0.00026710546808317304 2023-01-24 00:31:52.044049: step: 72/77, loss: 0.010100537911057472 2023-01-24 00:31:53.448004: step: 76/77, loss: 3.099114110227674e-05 2023-01-24 00:31:54.832115: step: 80/77, loss: 0.0859188586473465 2023-01-24 00:31:56.299755: step: 84/77, loss: 0.012669489718973637 2023-01-24 00:31:57.783285: step: 88/77, loss: 2.87165203189943e-05 2023-01-24 00:31:59.245808: step: 92/77, loss: 0.00016863422933965921 2023-01-24 00:32:00.672143: step: 96/77, loss: 0.004398399963974953 2023-01-24 00:32:02.120049: step: 100/77, loss: 0.011153515428304672 2023-01-24 00:32:03.572898: step: 104/77, loss: 0.008227325044572353 2023-01-24 00:32:05.003158: step: 108/77, loss: 0.0030601287726312876 2023-01-24 00:32:06.410823: step: 112/77, loss: 0.0019659164827317 2023-01-24 00:32:07.899268: step: 116/77, loss: 0.008243720047175884 2023-01-24 00:32:09.364506: 
step: 120/77, loss: 0.03656342998147011 2023-01-24 00:32:10.754632: step: 124/77, loss: 9.505009802524e-05 2023-01-24 00:32:12.269361: step: 128/77, loss: 0.0013397192815318704 2023-01-24 00:32:13.753136: step: 132/77, loss: 0.0007053818553686142 2023-01-24 00:32:15.151430: step: 136/77, loss: 0.014095144346356392 2023-01-24 00:32:16.627366: step: 140/77, loss: 6.120680609456031e-06 2023-01-24 00:32:18.054965: step: 144/77, loss: 0.005919267889112234 2023-01-24 00:32:19.563230: step: 148/77, loss: 0.03168538212776184 2023-01-24 00:32:21.022012: step: 152/77, loss: 0.0001610640756553039 2023-01-24 00:32:22.455909: step: 156/77, loss: 0.0004475472669582814 2023-01-24 00:32:23.945995: step: 160/77, loss: 3.547245796653442e-05 2023-01-24 00:32:25.368226: step: 164/77, loss: 0.0024135210551321507 2023-01-24 00:32:26.846649: step: 168/77, loss: 3.7103816907801956e-07 2023-01-24 00:32:28.369281: step: 172/77, loss: 4.891722710453905e-06 2023-01-24 00:32:29.839788: step: 176/77, loss: 2.0519268218777142e-05 2023-01-24 00:32:31.324964: step: 180/77, loss: 1.7564783775014803e-05 2023-01-24 00:32:32.771228: step: 184/77, loss: 0.0010788263753056526 2023-01-24 00:32:34.222614: step: 188/77, loss: 1.704473288555164e-05 2023-01-24 00:32:35.641892: step: 192/77, loss: 0.0025848494842648506 2023-01-24 00:32:37.033430: step: 196/77, loss: 0.0004149794112890959 2023-01-24 00:32:38.490534: step: 200/77, loss: 0.004008912947028875 2023-01-24 00:32:39.947519: step: 204/77, loss: 0.0009068038780242205 2023-01-24 00:32:41.376372: step: 208/77, loss: 0.006564724259078503 2023-01-24 00:32:42.781620: step: 212/77, loss: 0.002046438166871667 2023-01-24 00:32:44.230956: step: 216/77, loss: 0.0060040769167244434 2023-01-24 00:32:45.662179: step: 220/77, loss: 0.028682604432106018 2023-01-24 00:32:47.087545: step: 224/77, loss: 0.01584686152637005 2023-01-24 00:32:48.565641: step: 228/77, loss: 0.021169526502490044 2023-01-24 00:32:50.111501: step: 232/77, loss: 0.0028272352647036314 2023-01-24 00:32:51.543915: step: 236/77, loss: 0.00097802618984133 2023-01-24 00:32:52.961282: step: 240/77, loss: 0.0007845730287954211 2023-01-24 00:32:54.371950: step: 244/77, loss: 3.649089649115922e-06 2023-01-24 00:32:55.769973: step: 248/77, loss: 0.043560940772295 2023-01-24 00:32:57.199904: step: 252/77, loss: 0.011906352825462818 2023-01-24 00:32:58.585697: step: 256/77, loss: 0.014353277161717415 2023-01-24 00:33:00.031301: step: 260/77, loss: 0.026976825669407845 2023-01-24 00:33:01.424154: step: 264/77, loss: 0.00015734511543996632 2023-01-24 00:33:02.868241: step: 268/77, loss: 6.051371747162193e-05 2023-01-24 00:33:04.371436: step: 272/77, loss: 0.001316477544605732 2023-01-24 00:33:05.775353: step: 276/77, loss: 0.0126840490847826 2023-01-24 00:33:07.216532: step: 280/77, loss: 0.04170496016740799 2023-01-24 00:33:08.634205: step: 284/77, loss: 0.0032664278987795115 2023-01-24 00:33:10.103563: step: 288/77, loss: 0.004903575871139765 2023-01-24 00:33:11.570195: step: 292/77, loss: 0.0004226120363455266 2023-01-24 00:33:13.022986: step: 296/77, loss: 0.04508967697620392 2023-01-24 00:33:14.464597: step: 300/77, loss: 7.970984006533399e-05 2023-01-24 00:33:15.808698: step: 304/77, loss: 0.0002352129085920751 2023-01-24 00:33:17.302165: step: 308/77, loss: 0.014296479523181915 2023-01-24 00:33:18.831614: step: 312/77, loss: 0.0003451265802141279 2023-01-24 00:33:20.330008: step: 316/77, loss: 0.00031777145341038704 2023-01-24 00:33:21.778475: step: 320/77, loss: 0.03641887754201889 2023-01-24 00:33:23.190056: step: 324/77, 
loss: 8.758921467233449e-05 2023-01-24 00:33:24.647775: step: 328/77, loss: 0.0032292057294398546 2023-01-24 00:33:26.187645: step: 332/77, loss: 0.0006023455644026399 2023-01-24 00:33:27.633089: step: 336/77, loss: 0.009644701145589352 2023-01-24 00:33:29.190448: step: 340/77, loss: 3.1262068659998477e-05 2023-01-24 00:33:30.617967: step: 344/77, loss: 0.032040685415267944 2023-01-24 00:33:32.084645: step: 348/77, loss: 0.01243653241544962 2023-01-24 00:33:33.605624: step: 352/77, loss: 0.0280932430177927 2023-01-24 00:33:35.131311: step: 356/77, loss: 0.0001937321067089215 2023-01-24 00:33:36.532743: step: 360/77, loss: 0.015767639502882957 2023-01-24 00:33:38.007655: step: 364/77, loss: 0.0038702921010553837 2023-01-24 00:33:39.437640: step: 368/77, loss: 0.0022727069444954395 2023-01-24 00:33:40.920377: step: 372/77, loss: 0.056999169290065765 2023-01-24 00:33:42.324350: step: 376/77, loss: 0.0014027263969182968 2023-01-24 00:33:43.811212: step: 380/77, loss: 0.08449889719486237 2023-01-24 00:33:45.231021: step: 384/77, loss: 0.002161977579817176 2023-01-24 00:33:46.646400: step: 388/77, loss: 0.004340730607509613 ================================================== Loss: 0.011 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16} Test Chinese: {'template': {'p': 0.9487179487179487, 'r': 0.5826771653543307, 'f1': 0.721951219512195}, 'slot': {'p': 0.5581395348837209, 'r': 0.022988505747126436, 'f1': 0.04415823367065318}, 'combined': 0.031880090650032535, 'epoch': 16} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 16} Test Korean: {'template': {'p': 0.9487179487179487, 'r': 0.5826771653543307, 'f1': 0.721951219512195}, 'slot': {'p': 0.5581395348837209, 'r': 0.022988505747126436, 'f1': 0.04415823367065318}, 'combined': 0.031880090650032535, 'epoch': 16} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16} Test Russian: {'template': {'p': 0.9487179487179487, 'r': 0.5826771653543307, 'f1': 0.721951219512195}, 'slot': {'p': 0.5581395348837209, 'r': 0.022988505747126436, 'f1': 0.04415823367065318}, 'combined': 0.031880090650032535, 'epoch': 16} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 16} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 16} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 16} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 
0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 17 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:35:19.426676: step: 4/77, loss: 0.002037170808762312 2023-01-24 00:35:20.828887: step: 8/77, loss: 0.02480945736169815 2023-01-24 00:35:22.241379: step: 12/77, loss: 0.00014272028056439012 2023-01-24 00:35:23.667909: step: 16/77, loss: 0.0006749244057573378 2023-01-24 00:35:25.115290: step: 20/77, loss: 0.00039547350024804473 2023-01-24 00:35:26.552023: step: 24/77, loss: 1.3559053513745312e-05 2023-01-24 00:35:28.020636: step: 28/77, loss: 0.00037224931293167174 2023-01-24 00:35:29.483928: step: 32/77, loss: 1.5073310350999236e-05 2023-01-24 00:35:30.941118: step: 36/77, loss: 0.0002469658793415874 2023-01-24 00:35:32.434327: step: 40/77, loss: 0.005605190992355347 2023-01-24 00:35:33.859289: step: 44/77, loss: 0.004595811478793621 2023-01-24 00:35:35.332980: step: 48/77, loss: 1.3607714208774269e-05 2023-01-24 00:35:36.804882: step: 52/77, loss: 2.6351059204898775e-05 2023-01-24 00:35:38.299817: step: 56/77, loss: 0.0015755100175738335 2023-01-24 00:35:39.790671: step: 60/77, loss: 0.0048150732181966305 2023-01-24 00:35:41.226003: step: 64/77, loss: 0.001235154690220952 2023-01-24 00:35:42.720440: step: 68/77, loss: 0.015772752463817596 2023-01-24 00:35:44.181141: step: 72/77, loss: 5.922800482949242e-06 2023-01-24 00:35:45.610685: step: 76/77, loss: 3.9877650124253705e-05 2023-01-24 00:35:47.030060: step: 80/77, loss: 0.0043601421639323235 2023-01-24 00:35:48.505227: step: 84/77, loss: 0.002000096021220088 2023-01-24 00:35:49.945151: step: 88/77, loss: 2.683079037524294e-05 2023-01-24 00:35:51.337847: step: 92/77, loss: 5.596005394181702e-06 2023-01-24 00:35:52.739273: step: 96/77, loss: 0.003744155401363969 2023-01-24 00:35:54.195026: step: 100/77, loss: 0.0109467888250947 
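Each training entry above prints a timestamp, a step counter, and that step's loss, and the 'Loss: …' line closing each epoch block is consistent with the mean of the epoch's per-step losses rounded to three decimals (this epoch, 17, closes with 'Loss: 0.006'). A small sketch of that bookkeeping, assuming a plain running mean; the '/77' denominator is reproduced verbatim, as in the log, where the counter keeps climbing past 77 without the denominator changing:

```python
from datetime import datetime

class EpochLossMeter:
    """Accumulates per-step losses and emits log lines in the format seen above.

    The averaging rule is an assumption consistent with the printed
    three-decimal epoch summaries, not taken from the training code.
    """
    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def log_step(self, step: int, steps_per_epoch: int, loss: float) -> None:
        self.total += loss
        self.count += 1
        # datetime.now() prints as "2023-01-24 00:17:04.845283", matching the log.
        print(f"{datetime.now()}: step: {step}/{steps_per_epoch}, loss: {loss}")

    def epoch_summary(self) -> None:
        print("=" * 50)
        print(f"Loss: {self.total / max(self.count, 1):.3f}")
```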
2023-01-24 00:35:55.660858: step: 104/77, loss: 2.821162524924148e-05 2023-01-24 00:35:57.091255: step: 108/77, loss: 0.04504281282424927 2023-01-24 00:35:58.505470: step: 112/77, loss: 5.929253893555142e-05 2023-01-24 00:35:59.980474: step: 116/77, loss: 0.0003192286239936948 2023-01-24 00:36:01.474335: step: 120/77, loss: 0.001598101807758212 2023-01-24 00:36:02.960069: step: 124/77, loss: 0.002815796062350273 2023-01-24 00:36:04.432625: step: 128/77, loss: 0.0006713587790727615 2023-01-24 00:36:05.889107: step: 132/77, loss: 0.01400033850222826 2023-01-24 00:36:07.363802: step: 136/77, loss: 0.0009075433481484652 2023-01-24 00:36:08.830873: step: 140/77, loss: 0.008522202260792255 2023-01-24 00:36:10.271341: step: 144/77, loss: 0.0021871919743716717 2023-01-24 00:36:11.736712: step: 148/77, loss: 0.005150264594703913 2023-01-24 00:36:13.189147: step: 152/77, loss: 3.062113682972267e-05 2023-01-24 00:36:14.602299: step: 156/77, loss: 0.009946542792022228 2023-01-24 00:36:15.990288: step: 160/77, loss: 0.002842171583324671 2023-01-24 00:36:17.469992: step: 164/77, loss: 0.00044835201697424054 2023-01-24 00:36:18.947519: step: 168/77, loss: 0.000633113319054246 2023-01-24 00:36:20.398285: step: 172/77, loss: 0.06409518420696259 2023-01-24 00:36:21.889423: step: 176/77, loss: 0.0008590835495851934 2023-01-24 00:36:23.337907: step: 180/77, loss: 0.0012458937708288431 2023-01-24 00:36:24.765362: step: 184/77, loss: 0.00060157326515764 2023-01-24 00:36:26.254608: step: 188/77, loss: 1.6427084119641222e-05 2023-01-24 00:36:27.702048: step: 192/77, loss: 0.012870164588093758 2023-01-24 00:36:29.189893: step: 196/77, loss: 0.0018701556837186217 2023-01-24 00:36:30.648567: step: 200/77, loss: 0.03447765111923218 2023-01-24 00:36:32.077930: step: 204/77, loss: 0.0004893302684649825 2023-01-24 00:36:33.531801: step: 208/77, loss: 0.0004636928206309676 2023-01-24 00:36:34.997830: step: 212/77, loss: 0.0019563462119549513 2023-01-24 00:36:36.424525: step: 216/77, loss: 0.0025333473458886147 2023-01-24 00:36:37.863124: step: 220/77, loss: 0.0001070392390829511 2023-01-24 00:36:39.327600: step: 224/77, loss: 0.02948172204196453 2023-01-24 00:36:40.773777: step: 228/77, loss: 0.00033519353019073606 2023-01-24 00:36:42.217011: step: 232/77, loss: 0.000515027204528451 2023-01-24 00:36:43.656723: step: 236/77, loss: 0.00027163440245203674 2023-01-24 00:36:45.110251: step: 240/77, loss: 6.183959158079233e-07 2023-01-24 00:36:46.525299: step: 244/77, loss: 0.000521102047059685 2023-01-24 00:36:47.975233: step: 248/77, loss: 0.0015060872538015246 2023-01-24 00:36:49.422852: step: 252/77, loss: 3.615822424762882e-05 2023-01-24 00:36:50.838484: step: 256/77, loss: 0.00032852476579137146 2023-01-24 00:36:52.237094: step: 260/77, loss: 0.0005770521820522845 2023-01-24 00:36:53.707094: step: 264/77, loss: 1.873048290690349e-06 2023-01-24 00:36:55.161388: step: 268/77, loss: 0.02278841845691204 2023-01-24 00:36:56.657117: step: 272/77, loss: 0.0035157110542058945 2023-01-24 00:36:58.084932: step: 276/77, loss: 0.00018249360437039286 2023-01-24 00:36:59.544396: step: 280/77, loss: 0.000337746343575418 2023-01-24 00:37:01.012501: step: 284/77, loss: 0.0062423390336334705 2023-01-24 00:37:02.442211: step: 288/77, loss: 0.021998530253767967 2023-01-24 00:37:03.894872: step: 292/77, loss: 0.007070072460919619 2023-01-24 00:37:05.323967: step: 296/77, loss: 0.03983669728040695 2023-01-24 00:37:06.733390: step: 300/77, loss: 0.005213170312345028 2023-01-24 00:37:08.155390: step: 304/77, loss: 0.006226182449609041 
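The step counters above advance in increments of 4, matching --accumulate_step 4 from the command line: gradients from four batch_size-10 batches are summed before each optimizer update, for an effective batch size of 40, and (on the assumption that a line is logged once per accumulation window) every printed step number is a multiple of 4. A minimal PyTorch-style sketch of such a loop; model, loader, and optimizer are placeholders, not the project's actual objects:

```python
ACCUMULATE_STEP = 4  # --accumulate_step 4; effective batch = 4 * batch_size

def train_epoch(model, loader, optimizer):
    """One epoch with gradient accumulation; assumes model(batch) returns a scalar loss."""
    model.train()
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(batch)
        # Scale each partial gradient so the accumulated sum matches one large batch.
        (loss / ACCUMULATE_STEP).backward()
        if i % ACCUMULATE_STEP == 0:
            optimizer.step()       # one update per 4 batches,
            optimizer.zero_grad()  # hence step counters rising by 4
```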
2023-01-24 00:37:09.613636: step: 308/77, loss: 0.002841666806489229 2023-01-24 00:37:11.055828: step: 312/77, loss: 0.00035264910547994077 2023-01-24 00:37:12.465154: step: 316/77, loss: 0.0008341956418007612 2023-01-24 00:37:13.915525: step: 320/77, loss: 0.03190179541707039 2023-01-24 00:37:15.396128: step: 324/77, loss: 0.0018410655902698636 2023-01-24 00:37:16.859804: step: 328/77, loss: 0.019059965386986732 2023-01-24 00:37:18.317129: step: 332/77, loss: 8.135671669151634e-05 2023-01-24 00:37:19.767377: step: 336/77, loss: 0.011301896534860134 2023-01-24 00:37:21.245124: step: 340/77, loss: 6.137183663668111e-05 2023-01-24 00:37:22.698740: step: 344/77, loss: 9.129245154326782e-05 2023-01-24 00:37:24.057197: step: 348/77, loss: 0.007659727707505226 2023-01-24 00:37:25.473588: step: 352/77, loss: 0.00029633031226694584 2023-01-24 00:37:26.930301: step: 356/77, loss: 4.890608761343174e-05 2023-01-24 00:37:28.417134: step: 360/77, loss: 0.005314785521477461 2023-01-24 00:37:29.905420: step: 364/77, loss: 6.02393538429169e-06 2023-01-24 00:37:31.369101: step: 368/77, loss: 7.892806752352044e-05 2023-01-24 00:37:32.770114: step: 372/77, loss: 3.2073781767394394e-05 2023-01-24 00:37:34.205462: step: 376/77, loss: 0.03660866618156433 2023-01-24 00:37:35.655268: step: 380/77, loss: 0.000222804446821101 2023-01-24 00:37:37.131750: step: 384/77, loss: 0.00884947832673788 2023-01-24 00:37:38.590124: step: 388/77, loss: 0.0360332727432251 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Chinese: {'template': {'p': 0.9102564102564102, 'r': 0.5590551181102362, 'f1': 0.6926829268292682}, 'slot': {'p': 0.5416666666666666, 'r': 0.02490421455938697, 'f1': 0.047619047619047616}, 'combined': 0.0329849012775842, 'epoch': 17} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Korean: {'template': {'p': 0.9090909090909091, 'r': 0.5511811023622047, 'f1': 0.6862745098039216}, 'slot': {'p': 0.5416666666666666, 'r': 0.02490421455938697, 'f1': 0.047619047619047616}, 'combined': 0.032679738562091505, 'epoch': 17} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 17} Test Russian: {'template': {'p': 0.9090909090909091, 'r': 0.5511811023622047, 'f1': 0.6862745098039216}, 'slot': {'p': 0.5416666666666666, 'r': 0.02490421455938697, 'f1': 0.047619047619047616}, 'combined': 0.032679738562091505, 'epoch': 17} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 17} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 17} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 17} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 18 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:39:10.779596: step: 4/77, loss: 0.0006858706474304199 2023-01-24 00:39:12.221622: step: 8/77, loss: 0.044688817113637924 2023-01-24 00:39:13.699506: step: 12/77, loss: 0.0038130078464746475 2023-01-24 00:39:15.172988: step: 16/77, loss: 0.029383910819888115 2023-01-24 00:39:16.625528: step: 20/77, loss: 0.004666387569159269 2023-01-24 00:39:18.019400: step: 24/77, loss: 0.03362457454204559 2023-01-24 00:39:19.432923: step: 28/77, loss: 0.03850243240594864 2023-01-24 00:39:20.873320: step: 32/77, loss: 0.00010418617603136227 2023-01-24 00:39:22.408110: step: 36/77, loss: 1.208678622788284e-05 2023-01-24 00:39:23.910689: step: 40/77, loss: 0.0004281321889720857 2023-01-24 00:39:25.394597: step: 44/77, loss: 1.9828887161565945e-05 2023-01-24 00:39:26.882527: step: 48/77, loss: 0.0009721000678837299 2023-01-24 00:39:28.403992: step: 52/77, loss: 0.0015299199149012566 2023-01-24 00:39:29.787667: step: 56/77, loss: 0.0002908943861257285 2023-01-24 00:39:31.272672: step: 60/77, loss: 0.0012947148643434048 2023-01-24 00:39:32.716213: step: 64/77, loss: 0.00016326965123880655 2023-01-24 00:39:34.087008: step: 68/77, loss: 0.0013171466998755932 2023-01-24 00:39:35.526701: step: 72/77, loss: 0.0010111655574291945 2023-01-24 00:39:36.912487: step: 76/77, loss: 4.4684516069537494e-06 2023-01-24 00:39:38.391819: step: 80/77, loss: 0.001246205298230052 2023-01-24 00:39:39.801461: step: 
84/77, loss: 0.03493797406554222 2023-01-24 00:39:41.244498: step: 88/77, loss: 4.621453263098374e-05 2023-01-24 00:39:42.683338: step: 92/77, loss: 8.632710523670539e-05 2023-01-24 00:39:44.179513: step: 96/77, loss: 0.02589493989944458 2023-01-24 00:39:45.564971: step: 100/77, loss: 0.00997802522033453 2023-01-24 00:39:46.964730: step: 104/77, loss: 0.012017711065709591 2023-01-24 00:39:48.435410: step: 108/77, loss: 0.0037646391429007053 2023-01-24 00:39:49.902114: step: 112/77, loss: 0.0018711037700995803 2023-01-24 00:39:51.328359: step: 116/77, loss: 0.0008302136557176709 2023-01-24 00:39:52.711543: step: 120/77, loss: 0.004413140472024679 2023-01-24 00:39:54.161463: step: 124/77, loss: 0.0008827412966638803 2023-01-24 00:39:55.654238: step: 128/77, loss: 0.007443072274327278 2023-01-24 00:39:57.080500: step: 132/77, loss: 0.00017162800941150635 2023-01-24 00:39:58.472078: step: 136/77, loss: 1.1584797903196886e-05 2023-01-24 00:39:59.895495: step: 140/77, loss: 0.040182989090681076 2023-01-24 00:40:01.316044: step: 144/77, loss: 0.00107326521538198 2023-01-24 00:40:02.788357: step: 148/77, loss: 0.00023638608399778605 2023-01-24 00:40:04.198130: step: 152/77, loss: 0.0006994745344854891 2023-01-24 00:40:05.641463: step: 156/77, loss: 0.0009634314919821918 2023-01-24 00:40:07.078022: step: 160/77, loss: 3.515436401357874e-05 2023-01-24 00:40:08.559775: step: 164/77, loss: 0.03037864901125431 2023-01-24 00:40:10.026188: step: 168/77, loss: 0.0026817091275006533 2023-01-24 00:40:11.391634: step: 172/77, loss: 0.000941793026868254 2023-01-24 00:40:12.900911: step: 176/77, loss: 0.00016398500883951783 2023-01-24 00:40:14.267236: step: 180/77, loss: 0.00012539573071990162 2023-01-24 00:40:15.713850: step: 184/77, loss: 0.0005771261057816446 2023-01-24 00:40:17.249136: step: 188/77, loss: 7.127389835659415e-05 2023-01-24 00:40:18.731768: step: 192/77, loss: 0.0008534117951057851 2023-01-24 00:40:20.270508: step: 196/77, loss: 0.00011146351607749239 2023-01-24 00:40:21.748643: step: 200/77, loss: 0.0001266698382096365 2023-01-24 00:40:23.245991: step: 204/77, loss: 3.1347470212494954e-05 2023-01-24 00:40:24.687655: step: 208/77, loss: 6.562614726135507e-05 2023-01-24 00:40:26.121379: step: 212/77, loss: 3.7813962990185246e-05 2023-01-24 00:40:27.547931: step: 216/77, loss: 0.00014214005204848945 2023-01-24 00:40:29.038738: step: 220/77, loss: 2.7556823624763638e-05 2023-01-24 00:40:30.464214: step: 224/77, loss: 0.012045787647366524 2023-01-24 00:40:31.867304: step: 228/77, loss: 0.0008810462313704193 2023-01-24 00:40:33.374852: step: 232/77, loss: 0.00255215005017817 2023-01-24 00:40:34.862347: step: 236/77, loss: 0.03862829506397247 2023-01-24 00:40:36.298452: step: 240/77, loss: 0.0023982999846339226 2023-01-24 00:40:37.768685: step: 244/77, loss: 4.224439544486813e-05 2023-01-24 00:40:39.246740: step: 248/77, loss: 9.605423110770062e-05 2023-01-24 00:40:40.706331: step: 252/77, loss: 7.658920367248356e-05 2023-01-24 00:40:42.213839: step: 256/77, loss: 0.0006383816944435239 2023-01-24 00:40:43.681098: step: 260/77, loss: 3.0640898330602795e-05 2023-01-24 00:40:45.103019: step: 264/77, loss: 0.01589365489780903 2023-01-24 00:40:46.579169: step: 268/77, loss: 3.102341725025326e-05 2023-01-24 00:40:48.084220: step: 272/77, loss: 0.0015564777422696352 2023-01-24 00:40:49.612473: step: 276/77, loss: 2.715253685892094e-05 2023-01-24 00:40:51.105142: step: 280/77, loss: 0.00046800231211818755 2023-01-24 00:40:52.581662: step: 284/77, loss: 0.003451876575127244 2023-01-24 00:40:54.108221: step: 
288/77, loss: 0.034866366535425186 2023-01-24 00:40:55.644977: step: 292/77, loss: 0.0010681836865842342 2023-01-24 00:40:57.180848: step: 296/77, loss: 0.0018033330561593175 2023-01-24 00:40:58.595511: step: 300/77, loss: 0.001033065840601921 2023-01-24 00:41:00.083834: step: 304/77, loss: 0.003283113706856966 2023-01-24 00:41:01.532934: step: 308/77, loss: 0.0062102884985506535 2023-01-24 00:41:03.053757: step: 312/77, loss: 7.772250683046877e-05 2023-01-24 00:41:04.553717: step: 316/77, loss: 0.054183512926101685 2023-01-24 00:41:06.026668: step: 320/77, loss: 0.018427947536110878 2023-01-24 00:41:07.501281: step: 324/77, loss: 0.0007778811268508434 2023-01-24 00:41:08.984607: step: 328/77, loss: 0.0024008373729884624 2023-01-24 00:41:10.455639: step: 332/77, loss: 3.1286188459489495e-05 2023-01-24 00:41:11.941032: step: 336/77, loss: 0.05209927260875702 2023-01-24 00:41:13.471869: step: 340/77, loss: 5.062124546384439e-05 2023-01-24 00:41:14.898395: step: 344/77, loss: 0.0043088700622320175 2023-01-24 00:41:16.337994: step: 348/77, loss: 0.0004078157071489841 2023-01-24 00:41:17.796494: step: 352/77, loss: 0.011750757694244385 2023-01-24 00:41:19.273151: step: 356/77, loss: 0.0011366137769073248 2023-01-24 00:41:20.708192: step: 360/77, loss: 6.086245412006974e-05 2023-01-24 00:41:22.141049: step: 364/77, loss: 0.00013936932373326272 2023-01-24 00:41:23.649979: step: 368/77, loss: 0.0029132000636309385 2023-01-24 00:41:25.084405: step: 372/77, loss: 0.00029768256354145706 2023-01-24 00:41:26.511029: step: 376/77, loss: 9.392637366545387e-06 2023-01-24 00:41:27.980107: step: 380/77, loss: 3.185265086358413e-05 2023-01-24 00:41:29.490138: step: 384/77, loss: 0.00016818696167320013 2023-01-24 00:41:30.970256: step: 388/77, loss: 0.000533089623786509 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Chinese: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.6428571428571429, 'r': 0.017241379310344827, 'f1': 0.033582089552238806}, 'combined': 0.023288082202055055, 'epoch': 18} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6785714285714286, 'r': 0.018199233716475097, 'f1': 0.03544776119402985}, 'combined': 0.024458955223880596, 'epoch': 18} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5511811023622047, 'f1': 0.6965174129353234}, 'slot': {'p': 0.6666666666666666, 'r': 0.017241379310344827, 'f1': 0.03361344537815126}, 'combined': 0.02341235001463272, 'epoch': 18} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 18} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 18} Sample Russian: {'template': 
{'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 18} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 19 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:43:06.207846: step: 4/77, loss: 0.04768310487270355 2023-01-24 00:43:07.620395: step: 8/77, loss: 5.905014768359251e-05 2023-01-24 00:43:09.095233: step: 12/77, loss: 8.530027116648853e-05 2023-01-24 00:43:10.531702: step: 16/77, loss: 0.027549365535378456 2023-01-24 00:43:12.003214: step: 20/77, loss: 0.002080831676721573 2023-01-24 00:43:13.394019: step: 24/77, loss: 0.00027371218311600387 2023-01-24 00:43:14.823078: step: 28/77, loss: 0.03933372348546982 2023-01-24 00:43:16.316029: step: 32/77, loss: 0.00011253123375354335 2023-01-24 00:43:17.801447: step: 36/77, loss: 1.3273269360070117e-05 2023-01-24 00:43:19.216477: step: 40/77, loss: 0.0007404336356557906 2023-01-24 00:43:20.724273: step: 44/77, loss: 2.1038144041085616e-05 2023-01-24 00:43:22.156322: step: 48/77, loss: 4.754773271997692e-06 2023-01-24 00:43:23.581949: step: 52/77, loss: 0.020821966230869293 2023-01-24 00:43:25.006790: step: 56/77, loss: 0.016195418313145638 2023-01-24 00:43:26.490741: step: 60/77, loss: 0.003978441935032606 2023-01-24 00:43:27.989341: step: 64/77, loss: 1.6718449842301197e-05 2023-01-24 00:43:29.423654: step: 68/77, loss: 0.03040507063269615 
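Each evaluation summary above reports precision ('p'), recall ('r'), and F1 per language for the template and slot subtasks, and the 'combined' figure is, as far as the logged numbers indicate, the product of the two F1 scores. A quick plain-Python check against the epoch-18 Dev Chinese entry (nothing below comes from the repository itself):

def f1(p, r):
    # Harmonic mean of precision and recall; defined as 0.0 when both are 0.
    return 0.0 if p + r == 0.0 else 2 * p * r / (p + r)

template_f1 = f1(1.0, 0.5833333333333334)   # ~0.7368421052631579, as logged
slot_f1 = f1(0.5, 0.03780718336483932)      # ~0.07029876977152899, as logged
print(template_f1 * slot_f1)                # ~0.05179909351586346, the logged 'combined'

Because 'combined' is a product of two values each at most 1, it is capped by the smaller of the two F1 scores; with slot recall hovering around 0.02-0.04 throughout, slot extraction is clearly the bottleneck in every epoch shown here.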
2023-01-24 00:43:30.867059: step: 72/77, loss: 0.014016987755894661 2023-01-24 00:43:32.314208: step: 76/77, loss: 0.001976592233404517 2023-01-24 00:43:33.805309: step: 80/77, loss: 0.00010204622958553955 2023-01-24 00:43:35.198357: step: 84/77, loss: 0.006804922129958868 2023-01-24 00:43:36.650150: step: 88/77, loss: 0.00038014521123841405 2023-01-24 00:43:38.098720: step: 92/77, loss: 0.00024286401458084583 2023-01-24 00:43:39.611671: step: 96/77, loss: 0.0007883305079303682 2023-01-24 00:43:41.027714: step: 100/77, loss: 0.004189997911453247 2023-01-24 00:43:42.552450: step: 104/77, loss: 0.029285039752721786 2023-01-24 00:43:43.972263: step: 108/77, loss: 0.010122988373041153 2023-01-24 00:43:45.437492: step: 112/77, loss: 0.009222110733389854 2023-01-24 00:43:46.925686: step: 116/77, loss: 0.014736614190042019 2023-01-24 00:43:48.425317: step: 120/77, loss: 0.0032707280479371548 2023-01-24 00:43:49.967067: step: 124/77, loss: 1.443083874619333e-05 2023-01-24 00:43:51.441232: step: 128/77, loss: 0.0006309268064796925 2023-01-24 00:43:52.886094: step: 132/77, loss: 0.003975137136876583 2023-01-24 00:43:54.308361: step: 136/77, loss: 5.537320248549804e-05 2023-01-24 00:43:55.754682: step: 140/77, loss: 0.012196928262710571 2023-01-24 00:43:57.228706: step: 144/77, loss: 0.007411782164126635 2023-01-24 00:43:58.767615: step: 148/77, loss: 0.07958276569843292 2023-01-24 00:44:00.246117: step: 152/77, loss: 0.0004275135288480669 2023-01-24 00:44:01.668760: step: 156/77, loss: 6.887994095450267e-05 2023-01-24 00:44:03.138359: step: 160/77, loss: 0.0016891647828742862 2023-01-24 00:44:04.591774: step: 164/77, loss: 0.000604730099439621 2023-01-24 00:44:06.064261: step: 168/77, loss: 0.00016131930169649422 2023-01-24 00:44:07.472704: step: 172/77, loss: 0.0014135086676105857 2023-01-24 00:44:08.888628: step: 176/77, loss: 0.024117151275277138 2023-01-24 00:44:10.379146: step: 180/77, loss: 0.00032136618392542005 2023-01-24 00:44:11.877796: step: 184/77, loss: 0.020940173417329788 2023-01-24 00:44:13.311616: step: 188/77, loss: 6.748281157342717e-05 2023-01-24 00:44:14.739777: step: 192/77, loss: 0.014003898948431015 2023-01-24 00:44:16.179530: step: 196/77, loss: 0.009558777324855328 2023-01-24 00:44:17.688680: step: 200/77, loss: 1.5671652363380417e-05 2023-01-24 00:44:19.136356: step: 204/77, loss: 0.04960835725069046 2023-01-24 00:44:20.657481: step: 208/77, loss: 1.4638470020145178e-05 2023-01-24 00:44:22.151031: step: 212/77, loss: 0.11005327105522156 2023-01-24 00:44:23.702091: step: 216/77, loss: 2.261926965729799e-06 2023-01-24 00:44:25.229142: step: 220/77, loss: 0.0045501659624278545 2023-01-24 00:44:26.749044: step: 224/77, loss: 0.00013695968664251268 2023-01-24 00:44:28.256608: step: 228/77, loss: 1.2972234799235594e-05 2023-01-24 00:44:29.705152: step: 232/77, loss: 1.9106813851976767e-05 2023-01-24 00:44:31.175394: step: 236/77, loss: 6.309180753305554e-05 2023-01-24 00:44:32.640155: step: 240/77, loss: 0.00010856376320589334 2023-01-24 00:44:34.067436: step: 244/77, loss: 0.0081355981528759 2023-01-24 00:44:35.584396: step: 248/77, loss: 0.0007701553986407816 2023-01-24 00:44:37.053412: step: 252/77, loss: 0.00026954078930430114 2023-01-24 00:44:38.468627: step: 256/77, loss: 7.595164061058313e-05 2023-01-24 00:44:39.967349: step: 260/77, loss: 0.06677436083555222 2023-01-24 00:44:41.488103: step: 264/77, loss: 0.0021454612724483013 2023-01-24 00:44:42.943322: step: 268/77, loss: 2.3908436560304835e-05 2023-01-24 00:44:44.401807: step: 272/77, loss: 0.004027285147458315 
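The step counter in these entries advances by four between loss prints, which lines up with --accumulate_step 4: gradients from four micro-batches of --batch_size 10 are accumulated before each optimizer update, for an effective batch size of 40. A minimal sketch of a loop that would emit records of this shape (the toy model, optimizer, and print format are illustrative assumptions, not the repository's actual train.py code; the fixed "/77" denominator does not match the counter's range, which runs to 388, so its exact meaning is not recoverable from the log alone, and the sketch simply prints the batch index):

import datetime
import torch

ACCUMULATE_STEP = 4   # --accumulate_step 4
BATCH_SIZE = 10       # --batch_size 10

# Toy stand-ins so the sketch runs on its own; the real loop would wrap
# the template model and its data loader instead.
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
batches = [torch.randn(BATCH_SIZE, 8) for _ in range(16)]

optimizer.zero_grad()
for i, batch in enumerate(batches, start=1):
    loss = model(batch).pow(2).mean()     # placeholder loss
    (loss / ACCUMULATE_STEP).backward()   # scale so accumulated grads average
    if i % ACCUMULATE_STEP == 0:
        optimizer.step()                  # one update every 4 micro-batches
        optimizer.zero_grad()
        print(f"{datetime.datetime.now()}: step: {i}/{len(batches)}, loss: {loss.item()}")

Under a scheme like this, only the last micro-batch's loss is printed at each logged step, which would also explain the large step-to-step variance (1e-7 up to 1e-1) in the values above.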
2023-01-24 00:44:45.863158: step: 276/77, loss: 0.00013086202670820057 2023-01-24 00:44:47.346444: step: 280/77, loss: 0.009245424531400204 2023-01-24 00:44:48.774783: step: 284/77, loss: 0.00020366633543744683 2023-01-24 00:44:50.323614: step: 288/77, loss: 0.02489008940756321 2023-01-24 00:44:51.832701: step: 292/77, loss: 0.00040090453694574535 2023-01-24 00:44:53.248673: step: 296/77, loss: 9.466216579312459e-05 2023-01-24 00:44:54.774566: step: 300/77, loss: 0.009035969153046608 2023-01-24 00:44:56.205847: step: 304/77, loss: 0.0009680857183411717 2023-01-24 00:44:57.658075: step: 308/77, loss: 0.04155074805021286 2023-01-24 00:44:59.121194: step: 312/77, loss: 0.00012125683133490384 2023-01-24 00:45:00.606532: step: 316/77, loss: 0.00019658671226352453 2023-01-24 00:45:02.043197: step: 320/77, loss: 0.00034607492852956057 2023-01-24 00:45:03.547499: step: 324/77, loss: 0.03478804603219032 2023-01-24 00:45:05.057080: step: 328/77, loss: 5.577044703386491e-06 2023-01-24 00:45:06.520096: step: 332/77, loss: 0.0053550852462649345 2023-01-24 00:45:07.993811: step: 336/77, loss: 0.00036130042281001806 2023-01-24 00:45:09.532250: step: 340/77, loss: 0.08390412479639053 2023-01-24 00:45:11.015951: step: 344/77, loss: 0.00020369903359096497 2023-01-24 00:45:12.526272: step: 348/77, loss: 0.010599369183182716 2023-01-24 00:45:13.972877: step: 352/77, loss: 0.0090256929397583 2023-01-24 00:45:15.439737: step: 356/77, loss: 3.7848014471819624e-05 2023-01-24 00:45:16.979274: step: 360/77, loss: 2.1635869416058995e-06 2023-01-24 00:45:18.472750: step: 364/77, loss: 2.361656333960127e-05 2023-01-24 00:45:19.971688: step: 368/77, loss: 1.8357882254349533e-06 2023-01-24 00:45:21.380768: step: 372/77, loss: 3.522189945215359e-05 2023-01-24 00:45:22.839443: step: 376/77, loss: 0.00022090724087320268 2023-01-24 00:45:24.304148: step: 380/77, loss: 2.6850680114876013e-06 2023-01-24 00:45:25.721541: step: 384/77, loss: 0.0015824445290490985 2023-01-24 00:45:27.269278: step: 388/77, loss: 1.2197881005704403e-05 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.45454545454545453, 'r': 0.03780718336483932, 'f1': 0.06980802792321117}, 'combined': 0.051437494259208225, 'epoch': 19} Test Chinese: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5, 'r': 0.023946360153256706, 'f1': 0.045703839122486295}, 'combined': 0.031223414846054995, 'epoch': 19} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.4444444444444444, 'r': 0.03780718336483932, 'f1': 0.06968641114982578}, 'combined': 0.05041144636370375, 'epoch': 19} Test Korean: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5102040816326531, 'r': 0.023946360153256706, 'f1': 0.04574565416285453}, 'combined': 0.03125198155680162, 'epoch': 19} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.45454545454545453, 'r': 0.03780718336483932, 'f1': 0.06980802792321117}, 'combined': 0.050499424455088926, 'epoch': 19} Test Russian: {'template': {'p': 0.92, 'r': 0.5433070866141733, 'f1': 0.6831683168316832}, 'slot': {'p': 0.5, 'r': 0.023946360153256706, 'f1': 0.045703839122486295}, 'combined': 0.031223414846054995, 'epoch': 19} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 
0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 19} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 19} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 19} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 20 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:47:02.497413: step: 4/77, loss: 1.3192928236094303e-05 2023-01-24 00:47:03.941701: step: 8/77, loss: 2.463090822857339e-06 2023-01-24 00:47:05.411589: step: 12/77, loss: 0.00011500025721034035 2023-01-24 00:47:06.801953: step: 16/77, loss: 1.0528083294047974e-05 2023-01-24 00:47:08.288434: step: 20/77, loss: 1.1484605238365475e-05 2023-01-24 00:47:09.748743: step: 24/77, loss: 1.9093291484750807e-05 2023-01-24 00:47:11.314723: step: 28/77, loss: 0.029310494661331177 2023-01-24 00:47:12.772179: step: 32/77, loss: 0.0016498463228344917 2023-01-24 00:47:14.278861: step: 36/77, loss: 0.00037180885556153953 2023-01-24 00:47:15.700176: step: 40/77, loss: 0.0028189048171043396 2023-01-24 00:47:17.157793: step: 44/77, loss: 0.023563450202345848 2023-01-24 00:47:18.592721: step: 48/77, loss: 0.004786632489413023 2023-01-24 
00:47:20.152877: step: 52/77, loss: 0.005417908541858196 2023-01-24 00:47:21.575492: step: 56/77, loss: 0.003728205803781748 2023-01-24 00:47:23.069495: step: 60/77, loss: 0.07977019250392914 2023-01-24 00:47:24.560834: step: 64/77, loss: 0.002021214459091425 2023-01-24 00:47:26.048337: step: 68/77, loss: 5.231113391346298e-06 2023-01-24 00:47:27.541873: step: 72/77, loss: 0.03847013786435127 2023-01-24 00:47:28.923677: step: 76/77, loss: 0.007580421399325132 2023-01-24 00:47:30.385819: step: 80/77, loss: 3.203808591933921e-05 2023-01-24 00:47:31.795460: step: 84/77, loss: 0.0034962797071784735 2023-01-24 00:47:33.187160: step: 88/77, loss: 0.016852017492055893 2023-01-24 00:47:34.645564: step: 92/77, loss: 0.0004227173049002886 2023-01-24 00:47:36.054470: step: 96/77, loss: 0.005792779847979546 2023-01-24 00:47:37.573353: step: 100/77, loss: 0.002475955756381154 2023-01-24 00:47:39.053575: step: 104/77, loss: 0.0022028908133506775 2023-01-24 00:47:40.561054: step: 108/77, loss: 0.012379514053463936 2023-01-24 00:47:42.062627: step: 112/77, loss: 0.00023423394304700196 2023-01-24 00:47:43.486961: step: 116/77, loss: 0.0012709060683846474 2023-01-24 00:47:44.899756: step: 120/77, loss: 0.0036717234179377556 2023-01-24 00:47:46.348099: step: 124/77, loss: 6.576021405635402e-05 2023-01-24 00:47:47.794681: step: 128/77, loss: 0.00034538499312475324 2023-01-24 00:47:49.384973: step: 132/77, loss: 0.00018151012773159891 2023-01-24 00:47:50.791091: step: 136/77, loss: 0.008112723007798195 2023-01-24 00:47:52.246839: step: 140/77, loss: 0.07242359966039658 2023-01-24 00:47:53.721945: step: 144/77, loss: 0.008109656162559986 2023-01-24 00:47:55.180383: step: 148/77, loss: 8.539374903193675e-06 2023-01-24 00:47:56.616973: step: 152/77, loss: 0.0005678643356077373 2023-01-24 00:47:58.095346: step: 156/77, loss: 4.22751072619576e-05 2023-01-24 00:47:59.560212: step: 160/77, loss: 3.292236215202138e-05 2023-01-24 00:48:00.982683: step: 164/77, loss: 0.03325174003839493 2023-01-24 00:48:02.468644: step: 168/77, loss: 0.018970109522342682 2023-01-24 00:48:03.860719: step: 172/77, loss: 0.00023700771271251142 2023-01-24 00:48:05.303959: step: 176/77, loss: 0.013335713185369968 2023-01-24 00:48:06.801436: step: 180/77, loss: 0.03609664738178253 2023-01-24 00:48:08.337097: step: 184/77, loss: 6.33083691354841e-05 2023-01-24 00:48:09.774796: step: 188/77, loss: 0.0002317598118679598 2023-01-24 00:48:11.331288: step: 192/77, loss: 0.007055716589093208 2023-01-24 00:48:12.780364: step: 196/77, loss: 0.04417296499013901 2023-01-24 00:48:14.280909: step: 200/77, loss: 0.014277588576078415 2023-01-24 00:48:15.792704: step: 204/77, loss: 0.0032836194150149822 2023-01-24 00:48:17.219461: step: 208/77, loss: 0.010912610217928886 2023-01-24 00:48:18.734811: step: 212/77, loss: 0.0008074995130300522 2023-01-24 00:48:20.264268: step: 216/77, loss: 3.4301144751225365e-06 2023-01-24 00:48:21.802380: step: 220/77, loss: 0.0002279908221680671 2023-01-24 00:48:23.242074: step: 224/77, loss: 0.00021480608847923577 2023-01-24 00:48:24.684703: step: 228/77, loss: 0.00034418608993291855 2023-01-24 00:48:26.191431: step: 232/77, loss: 0.016895901411771774 2023-01-24 00:48:27.629836: step: 236/77, loss: 0.0010526860132813454 2023-01-24 00:48:29.099420: step: 240/77, loss: 0.007215662859380245 2023-01-24 00:48:30.514445: step: 244/77, loss: 0.0023093121126294136 2023-01-24 00:48:31.966579: step: 248/77, loss: 0.03414314240217209 2023-01-24 00:48:33.508701: step: 252/77, loss: 0.00805766973644495 2023-01-24 00:48:34.980102: step: 
256/77, loss: 0.001133862417191267 2023-01-24 00:48:36.441337: step: 260/77, loss: 7.707732584094629e-05 2023-01-24 00:48:37.908485: step: 264/77, loss: 0.0010598760563880205 2023-01-24 00:48:39.421254: step: 268/77, loss: 0.0011756059248000383 2023-01-24 00:48:40.922913: step: 272/77, loss: 0.006573973689228296 2023-01-24 00:48:42.366415: step: 276/77, loss: 0.00018033068045042455 2023-01-24 00:48:43.878758: step: 280/77, loss: 0.007437266409397125 2023-01-24 00:48:45.310897: step: 284/77, loss: 0.0010263456497341394 2023-01-24 00:48:46.811082: step: 288/77, loss: 0.008123028092086315 2023-01-24 00:48:48.280250: step: 292/77, loss: 0.0013863637577742338 2023-01-24 00:48:49.778414: step: 296/77, loss: 0.016317684203386307 2023-01-24 00:48:51.197189: step: 300/77, loss: 4.197472208034014e-06 2023-01-24 00:48:52.659956: step: 304/77, loss: 1.459309169149492e-05 2023-01-24 00:48:54.190654: step: 308/77, loss: 1.002541284833569e-05 2023-01-24 00:48:55.752614: step: 312/77, loss: 1.635883018025197e-05 2023-01-24 00:48:57.213354: step: 316/77, loss: 0.0006573385326191783 2023-01-24 00:48:58.730639: step: 320/77, loss: 2.3008291464066133e-05 2023-01-24 00:49:00.160807: step: 324/77, loss: 6.158732867334038e-05 2023-01-24 00:49:01.652589: step: 328/77, loss: 1.1455389540060423e-05 2023-01-24 00:49:03.120086: step: 332/77, loss: 8.295646694023162e-05 2023-01-24 00:49:04.661497: step: 336/77, loss: 0.00028201111126691103 2023-01-24 00:49:06.124744: step: 340/77, loss: 0.0009026177576743066 2023-01-24 00:49:07.623698: step: 344/77, loss: 0.00747287692502141 2023-01-24 00:49:09.074332: step: 348/77, loss: 0.0015667448751628399 2023-01-24 00:49:10.543439: step: 352/77, loss: 0.0016064089722931385 2023-01-24 00:49:11.972860: step: 356/77, loss: 0.0004564714909065515 2023-01-24 00:49:13.487567: step: 360/77, loss: 0.013650856912136078 2023-01-24 00:49:14.988945: step: 364/77, loss: 5.7811830629361793e-05 2023-01-24 00:49:16.386161: step: 368/77, loss: 0.006817900110036135 2023-01-24 00:49:17.905157: step: 372/77, loss: 0.005823828279972076 2023-01-24 00:49:19.376053: step: 376/77, loss: 0.000923436542507261 2023-01-24 00:49:20.849211: step: 380/77, loss: 0.00016840689931996167 2023-01-24 00:49:22.290721: step: 384/77, loss: 0.002086574910208583 2023-01-24 00:49:23.776483: step: 388/77, loss: 0.0021831956692039967 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Chinese: {'template': {'p': 0.9315068493150684, 'r': 0.5354330708661418, 'f1': 0.6799999999999999}, 'slot': {'p': 0.6216216216216216, 'r': 0.022030651340996167, 'f1': 0.0425531914893617}, 'combined': 0.028936170212765955, 'epoch': 20} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Korean: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.6216216216216216, 'r': 0.022030651340996167, 'f1': 0.0425531914893617}, 'combined': 0.029081578103282366, 'epoch': 20} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Russian: {'template': 
{'p': 0.918918918918919, 'r': 0.5354330708661418, 'f1': 0.6766169154228856}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.02756587433204349, 'epoch': 20} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 20} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 20} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 20} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 21 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:50:59.287308: step: 4/77, loss: 4.391517722979188e-05 2023-01-24 00:51:00.792778: step: 8/77, loss: 0.011269661597907543 2023-01-24 00:51:02.282160: step: 12/77, loss: 0.00026010849978774786 2023-01-24 00:51:03.752800: step: 16/77, loss: 4.358853038866073e-05 2023-01-24 00:51:05.307290: step: 20/77, loss: 2.2985019313637167e-05 2023-01-24 00:51:06.751340: step: 24/77, loss: 0.0008312297286465764 2023-01-24 00:51:08.257811: step: 28/77, loss: 0.012758411467075348 2023-01-24 00:51:09.641931: step: 32/77, loss: 
0.0011191972298547626 2023-01-24 00:51:11.192222: step: 36/77, loss: 0.0036675273440778255 2023-01-24 00:51:12.641740: step: 40/77, loss: 0.0017228196375072002 2023-01-24 00:51:14.032303: step: 44/77, loss: 0.014871697872877121 2023-01-24 00:51:15.464621: step: 48/77, loss: 0.0004049563722219318 2023-01-24 00:51:16.912861: step: 52/77, loss: 3.213894160580821e-05 2023-01-24 00:51:18.397860: step: 56/77, loss: 0.0005126817850396037 2023-01-24 00:51:19.898254: step: 60/77, loss: 0.00032478783396072686 2023-01-24 00:51:21.438137: step: 64/77, loss: 0.0002003766712732613 2023-01-24 00:51:22.947758: step: 68/77, loss: 0.01953509822487831 2023-01-24 00:51:24.402148: step: 72/77, loss: 0.006982152815908194 2023-01-24 00:51:25.961341: step: 76/77, loss: 0.00012618339678738266 2023-01-24 00:51:27.440222: step: 80/77, loss: 0.0071145957335829735 2023-01-24 00:51:28.894912: step: 84/77, loss: 6.516471785289468e-06 2023-01-24 00:51:30.320617: step: 88/77, loss: 4.911324867862277e-05 2023-01-24 00:51:31.719327: step: 92/77, loss: 0.00011751156125683337 2023-01-24 00:51:33.158540: step: 96/77, loss: 0.0031759554985910654 2023-01-24 00:51:34.612772: step: 100/77, loss: 0.0003278399526607245 2023-01-24 00:51:36.104312: step: 104/77, loss: 3.380963107701973e-06 2023-01-24 00:51:37.541545: step: 108/77, loss: 0.0030511897057294846 2023-01-24 00:51:39.038527: step: 112/77, loss: 0.0023707051295787096 2023-01-24 00:51:40.512965: step: 116/77, loss: 0.010366893373429775 2023-01-24 00:51:41.955226: step: 120/77, loss: 7.0356600190280005e-06 2023-01-24 00:51:43.318046: step: 124/77, loss: 8.97041218195227e-07 2023-01-24 00:51:44.805964: step: 128/77, loss: 0.008515083231031895 2023-01-24 00:51:46.279304: step: 132/77, loss: 1.7076557696782402e-06 2023-01-24 00:51:47.795510: step: 136/77, loss: 0.000308698188746348 2023-01-24 00:51:49.268710: step: 140/77, loss: 1.6256635717581958e-06 2023-01-24 00:51:50.709087: step: 144/77, loss: 0.002256601583212614 2023-01-24 00:51:52.117286: step: 148/77, loss: 0.013435864821076393 2023-01-24 00:51:53.557998: step: 152/77, loss: 2.1047762857051566e-05 2023-01-24 00:51:55.031432: step: 156/77, loss: 8.036449798964895e-06 2023-01-24 00:51:56.486294: step: 160/77, loss: 1.6093207477752003e-07 2023-01-24 00:51:57.943502: step: 164/77, loss: 1.1622243619058281e-05 2023-01-24 00:51:59.458993: step: 168/77, loss: 7.565349278593203e-06 2023-01-24 00:52:00.907969: step: 172/77, loss: 0.04428644850850105 2023-01-24 00:52:02.349714: step: 176/77, loss: 0.041636187583208084 2023-01-24 00:52:03.810880: step: 180/77, loss: 0.0004688084591180086 2023-01-24 00:52:05.251525: step: 184/77, loss: 0.0008889245218597353 2023-01-24 00:52:06.665992: step: 188/77, loss: 0.017766296863555908 2023-01-24 00:52:08.138639: step: 192/77, loss: 0.003885013749822974 2023-01-24 00:52:09.640055: step: 196/77, loss: 0.0003852845693472773 2023-01-24 00:52:11.068531: step: 200/77, loss: 0.0001461335486965254 2023-01-24 00:52:12.501271: step: 204/77, loss: 0.01028933934867382 2023-01-24 00:52:14.006655: step: 208/77, loss: 0.027263857424259186 2023-01-24 00:52:15.430789: step: 212/77, loss: 4.07897423428949e-05 2023-01-24 00:52:16.921008: step: 216/77, loss: 0.00028045265935361385 2023-01-24 00:52:18.362058: step: 220/77, loss: 9.948672959581017e-05 2023-01-24 00:52:19.819026: step: 224/77, loss: 0.00154870527330786 2023-01-24 00:52:21.282900: step: 228/77, loss: 8.737082680454478e-05 2023-01-24 00:52:22.793019: step: 232/77, loss: 4.500118109262985e-07 2023-01-24 00:52:24.273306: step: 236/77, loss: 
2.4586870495113544e-07 2023-01-24 00:52:25.762439: step: 240/77, loss: 0.00015047851775307208 2023-01-24 00:52:27.194991: step: 244/77, loss: 0.0026609927881509066 2023-01-24 00:52:28.706818: step: 248/77, loss: 2.2798730014983448e-07 2023-01-24 00:52:30.152494: step: 252/77, loss: 2.4057619157247245e-05 2023-01-24 00:52:31.652638: step: 256/77, loss: 0.0002987839106936008 2023-01-24 00:52:33.096427: step: 260/77, loss: 6.951633167773252e-06 2023-01-24 00:52:34.558025: step: 264/77, loss: 1.9087817690888187e-06 2023-01-24 00:52:36.088474: step: 268/77, loss: 0.00012575466826092452 2023-01-24 00:52:37.563352: step: 272/77, loss: 1.6970532669802196e-05 2023-01-24 00:52:38.998984: step: 276/77, loss: 0.00428348034620285 2023-01-24 00:52:40.406791: step: 280/77, loss: 0.00021226488752290606 2023-01-24 00:52:41.907680: step: 284/77, loss: 2.315082929271739e-05 2023-01-24 00:52:43.346726: step: 288/77, loss: 2.339491038583219e-05 2023-01-24 00:52:44.838888: step: 292/77, loss: 1.760100531100761e-05 2023-01-24 00:52:46.297632: step: 296/77, loss: 0.0004236962995491922 2023-01-24 00:52:47.737916: step: 300/77, loss: 0.0034623718820512295 2023-01-24 00:52:49.262868: step: 304/77, loss: 2.373845381953288e-05 2023-01-24 00:52:50.729247: step: 308/77, loss: 0.021428581327199936 2023-01-24 00:52:52.182759: step: 312/77, loss: 3.681535963551141e-05 2023-01-24 00:52:53.650044: step: 316/77, loss: 0.0003865455510094762 2023-01-24 00:52:55.160136: step: 320/77, loss: 0.000635168980807066 2023-01-24 00:52:56.608216: step: 324/77, loss: 0.0008278586901724339 2023-01-24 00:52:58.134137: step: 328/77, loss: 9.744359704200178e-05 2023-01-24 00:52:59.572727: step: 332/77, loss: 3.93389314012893e-07 2023-01-24 00:53:01.074412: step: 336/77, loss: 0.058941058814525604 2023-01-24 00:53:02.623080: step: 340/77, loss: 0.0012770385947078466 2023-01-24 00:53:04.080821: step: 344/77, loss: 4.457090108189732e-05 2023-01-24 00:53:05.548523: step: 348/77, loss: 0.0015952158719301224 2023-01-24 00:53:07.008327: step: 352/77, loss: 7.524989769081003e-07 2023-01-24 00:53:08.515582: step: 356/77, loss: 2.8592396574822487e-06 2023-01-24 00:53:10.005504: step: 360/77, loss: 0.016292478889226913 2023-01-24 00:53:11.516367: step: 364/77, loss: 0.0033385471906512976 2023-01-24 00:53:13.053844: step: 368/77, loss: 0.002890442730858922 2023-01-24 00:53:14.500349: step: 372/77, loss: 0.026279674842953682 2023-01-24 00:53:15.991243: step: 376/77, loss: 8.808910934021696e-05 2023-01-24 00:53:17.421321: step: 380/77, loss: 5.139029599376954e-05 2023-01-24 00:53:18.875780: step: 384/77, loss: 1.6525015098523e-06 2023-01-24 00:53:20.305948: step: 388/77, loss: 0.04984476417303085 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Chinese: {'template': {'p': 0.9466666666666667, 'r': 0.5590551181102362, 'f1': 0.7029702970297029}, 'slot': {'p': 0.6388888888888888, 'r': 0.022030651340996167, 'f1': 0.04259259259259259}, 'combined': 0.02994132746607994, 'epoch': 21} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Korean: {'template': {'p': 0.9473684210526315, 'r': 0.5669291338582677, 'f1': 0.70935960591133}, 'slot': {'p': 0.6111111111111112, 
'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.02889983579638752, 'epoch': 21} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Russian: {'template': {'p': 0.9473684210526315, 'r': 0.5669291338582677, 'f1': 0.70935960591133}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.02889983579638752, 'epoch': 21} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 21} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 21} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 21} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 22 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:54:55.785851: step: 4/77, loss: 0.001251904759556055 2023-01-24 00:54:57.277718: step: 8/77, loss: 4.813056193597731e-07 2023-01-24 00:54:58.723146: step: 12/77, loss: 0.028468012809753418 
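The 'Current best result' block has stayed pinned at epoch 2 for this entire stretch, which is consistent with selecting on the dev 'combined' score with a strictly-greater comparison: epochs 18, 20, and 21 tie the epoch-2 dev value of 0.05179909351586346 exactly (their test scores differ), and no epoch exceeds it. An illustrative tracker with that behavior (the function, its name, and the dict layout are assumptions mirroring the log, not the actual train.py logic):

def update_best(best, epoch, dev, test):
    # Keep, per language, the epoch whose dev 'combined' score is strictly
    # higher than the incumbent's; ties leave the earlier epoch in place.
    for lang in dev:
        incumbent = best.get(lang)
        if incumbent is None or dev[lang]['combined'] > incumbent['dev']['combined']:
            best[lang] = {'dev': dev[lang], 'test': test[lang], 'epoch': epoch}
    return best

best = {}
update_best(best, 2, {'Chinese': {'combined': 0.05179909351586346}},
            {'Chinese': {'combined': 0.018639328984156562}})
update_best(best, 21, {'Chinese': {'combined': 0.05179909351586346}},
            {'Chinese': {'combined': 0.02994132746607994}})
print(best['Chinese']['epoch'])  # 2 -- the exact tie does not displace epoch 2

Note that a higher test score alone (e.g., epoch 21's Test Chinese combined of 0.02994 vs. epoch 2's 0.01864) never updates the block under this rule, since selection is driven by the dev split.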
2023-01-24 00:55:00.160844: step: 16/77, loss: 4.5322199184738565e-06 2023-01-24 00:55:01.539328: step: 20/77, loss: 0.0034898424055427313 2023-01-24 00:55:03.018891: step: 24/77, loss: 2.628446054586675e-06 2023-01-24 00:55:04.427323: step: 28/77, loss: 0.0006907194619998336 2023-01-24 00:55:05.896894: step: 32/77, loss: 3.6535427625494776e-06 2023-01-24 00:55:07.335526: step: 36/77, loss: 2.296172169735655e-06 2023-01-24 00:55:08.766945: step: 40/77, loss: 1.3321690857992508e-05 2023-01-24 00:55:10.242403: step: 44/77, loss: 0.0019014282152056694 2023-01-24 00:55:11.724272: step: 48/77, loss: 0.0014974469086155295 2023-01-24 00:55:13.139252: step: 52/77, loss: 0.029981238767504692 2023-01-24 00:55:14.564821: step: 56/77, loss: 0.004113981034606695 2023-01-24 00:55:16.074561: step: 60/77, loss: 0.009812380187213421 2023-01-24 00:55:17.531298: step: 64/77, loss: 0.00020641175797209144 2023-01-24 00:55:18.983238: step: 68/77, loss: 7.629348033333372e-07 2023-01-24 00:55:20.440434: step: 72/77, loss: 6.504666089313105e-05 2023-01-24 00:55:21.840070: step: 76/77, loss: 0.007216009311378002 2023-01-24 00:55:23.302156: step: 80/77, loss: 0.0006816386012360454 2023-01-24 00:55:24.749577: step: 84/77, loss: 2.8358892450341955e-05 2023-01-24 00:55:26.287097: step: 88/77, loss: 0.000104675127658993 2023-01-24 00:55:27.733983: step: 92/77, loss: 0.00655527925118804 2023-01-24 00:55:29.192763: step: 96/77, loss: 0.023317817598581314 2023-01-24 00:55:30.596706: step: 100/77, loss: 0.003784725908190012 2023-01-24 00:55:31.991240: step: 104/77, loss: 1.2789068932761438e-05 2023-01-24 00:55:33.497836: step: 108/77, loss: 4.066258043167181e-06 2023-01-24 00:55:34.976412: step: 112/77, loss: 0.0010824999772012234 2023-01-24 00:55:36.499240: step: 116/77, loss: 0.048643819987773895 2023-01-24 00:55:37.991073: step: 120/77, loss: 0.00024930958170443773 2023-01-24 00:55:39.413984: step: 124/77, loss: 0.0014804255915805697 2023-01-24 00:55:40.815099: step: 128/77, loss: 8.126140528474934e-06 2023-01-24 00:55:42.255148: step: 132/77, loss: 5.98239876126172e-06 2023-01-24 00:55:43.752254: step: 136/77, loss: 8.287557284347713e-05 2023-01-24 00:55:45.212517: step: 140/77, loss: 0.04968878626823425 2023-01-24 00:55:46.671508: step: 144/77, loss: 3.498396108625457e-05 2023-01-24 00:55:48.152435: step: 148/77, loss: 0.0015079396544024348 2023-01-24 00:55:49.633710: step: 152/77, loss: 0.02270987629890442 2023-01-24 00:55:51.119455: step: 156/77, loss: 0.0007338325376622379 2023-01-24 00:55:52.587984: step: 160/77, loss: 4.485302633838728e-05 2023-01-24 00:55:54.056065: step: 164/77, loss: 0.010014176368713379 2023-01-24 00:55:55.516086: step: 168/77, loss: 9.416887769475579e-05 2023-01-24 00:55:57.016504: step: 172/77, loss: 0.00011647852807072923 2023-01-24 00:55:58.558449: step: 176/77, loss: 0.0059953853487968445 2023-01-24 00:55:59.944505: step: 180/77, loss: 1.4442720384977292e-05 2023-01-24 00:56:01.390417: step: 184/77, loss: 0.03647876903414726 2023-01-24 00:56:02.811783: step: 188/77, loss: 0.0004354465054348111 2023-01-24 00:56:04.334620: step: 192/77, loss: 6.023783043929143e-06 2023-01-24 00:56:05.748399: step: 196/77, loss: 4.984165570931509e-06 2023-01-24 00:56:07.177563: step: 200/77, loss: 0.0002774551685433835 2023-01-24 00:56:08.621828: step: 204/77, loss: 0.00017141942225862294 2023-01-24 00:56:10.133025: step: 208/77, loss: 0.0011483619455248117 2023-01-24 00:56:11.590790: step: 212/77, loss: 0.0008126638713292778 2023-01-24 00:56:13.040802: step: 216/77, loss: 0.000159363160491921 2023-01-24 
00:56:14.496686: step: 220/77, loss: 0.00397157110273838 2023-01-24 00:56:15.966265: step: 224/77, loss: 0.021798763424158096 2023-01-24 00:56:17.451675: step: 228/77, loss: 3.5462057894619647e-06 2023-01-24 00:56:18.876312: step: 232/77, loss: 0.0014899511588737369 2023-01-24 00:56:20.303958: step: 236/77, loss: 1.7540756743983366e-05 2023-01-24 00:56:21.767687: step: 240/77, loss: 0.020959466695785522 2023-01-24 00:56:23.268940: step: 244/77, loss: 0.004268889781087637 2023-01-24 00:56:24.693409: step: 248/77, loss: 0.003262277226895094 2023-01-24 00:56:26.125122: step: 252/77, loss: 8.32986697787419e-05 2023-01-24 00:56:27.586518: step: 256/77, loss: 6.029654286976438e-06 2023-01-24 00:56:29.142464: step: 260/77, loss: 0.00020383202354423702 2023-01-24 00:56:30.624811: step: 264/77, loss: 0.0087860943749547 2023-01-24 00:56:32.121386: step: 268/77, loss: 2.563952875789255e-05 2023-01-24 00:56:33.640797: step: 272/77, loss: 0.0010856754379346967 2023-01-24 00:56:35.082748: step: 276/77, loss: 1.0841111361514777e-05 2023-01-24 00:56:36.610100: step: 280/77, loss: 0.013169659301638603 2023-01-24 00:56:37.980962: step: 284/77, loss: 0.0002249402750749141 2023-01-24 00:56:39.523727: step: 288/77, loss: 4.219470065436326e-05 2023-01-24 00:56:40.960120: step: 292/77, loss: 0.02478584460914135 2023-01-24 00:56:42.385455: step: 296/77, loss: 0.0014029683079570532 2023-01-24 00:56:43.895865: step: 300/77, loss: 0.000522875867318362 2023-01-24 00:56:45.360801: step: 304/77, loss: 1.6998614228214137e-05 2023-01-24 00:56:46.869853: step: 308/77, loss: 8.205435005947948e-05 2023-01-24 00:56:48.340921: step: 312/77, loss: 0.03432891145348549 2023-01-24 00:56:49.831682: step: 316/77, loss: 0.05371011793613434 2023-01-24 00:56:51.340795: step: 320/77, loss: 9.02634346857667e-05 2023-01-24 00:56:52.845035: step: 324/77, loss: 0.00013596990902442485 2023-01-24 00:56:54.314379: step: 328/77, loss: 3.2078220101539046e-05 2023-01-24 00:56:55.782618: step: 332/77, loss: 3.0337028874782845e-05 2023-01-24 00:56:57.207608: step: 336/77, loss: 0.0001057839544955641 2023-01-24 00:56:58.676908: step: 340/77, loss: 0.01671535149216652 2023-01-24 00:57:00.159006: step: 344/77, loss: 0.0042815255001187325 2023-01-24 00:57:01.591197: step: 348/77, loss: 7.020933117019013e-05 2023-01-24 00:57:03.083196: step: 352/77, loss: 0.00011571750656003132 2023-01-24 00:57:04.541255: step: 356/77, loss: 0.0004864287911914289 2023-01-24 00:57:06.027684: step: 360/77, loss: 3.163710061926395e-05 2023-01-24 00:57:07.532429: step: 364/77, loss: 0.07282278686761856 2023-01-24 00:57:09.021098: step: 368/77, loss: 0.003499907674267888 2023-01-24 00:57:10.506478: step: 372/77, loss: 1.0058472980745137e-05 2023-01-24 00:57:12.011378: step: 376/77, loss: 2.708286228880752e-05 2023-01-24 00:57:13.470734: step: 380/77, loss: 1.9398390577407554e-05 2023-01-24 00:57:14.939573: step: 384/77, loss: 0.004925777204334736 2023-01-24 00:57:16.406347: step: 388/77, loss: 0.015260990709066391 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 0.9722222222222222, 'r': 0.5833333333333334, 'f1': 0.7291666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05125951962507322, 'epoch': 22} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.5897435897435898, 'r': 0.022030651340996167, 'f1': 0.04247460757156048}, 'combined': 0.029307479224376726, 'epoch': 22} Dev Korean: {'template': {'p': 
0.9714285714285714, 'r': 0.5666666666666667, 'f1': 0.7157894736842105}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.04797178130511465, 'epoch': 22} Test Korean: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.6111111111111112, 'r': 0.0210727969348659, 'f1': 0.040740740740740744}, 'combined': 0.028252372975991074, 'epoch': 22} Dev Russian: {'template': {'p': 0.9722222222222222, 'r': 0.5833333333333334, 'f1': 0.7291666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05125951962507322, 'epoch': 22} Test Russian: {'template': {'p': 0.9583333333333334, 'r': 0.5433070866141733, 'f1': 0.6934673366834172}, 'slot': {'p': 0.6216216216216216, 'r': 0.022030651340996167, 'f1': 0.0425531914893617}, 'combined': 0.029509248369507114, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 22} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 23 command: python train.py --model_name template --xlmr_model_name 
xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:58:50.672311: step: 4/77, loss: 6.476036651292816e-05 2023-01-24 00:58:52.075986: step: 8/77, loss: 0.0008242184412665665 2023-01-24 00:58:53.428422: step: 12/77, loss: 0.029622716829180717 2023-01-24 00:58:54.911932: step: 16/77, loss: 0.0022930025588721037 2023-01-24 00:58:56.347126: step: 20/77, loss: 2.899443279602565e-05 2023-01-24 00:58:57.810486: step: 24/77, loss: 0.000592827214859426 2023-01-24 00:58:59.239936: step: 28/77, loss: 8.819873619358987e-05 2023-01-24 00:59:00.693788: step: 32/77, loss: 0.0006759160896763206 2023-01-24 00:59:02.024079: step: 36/77, loss: 5.535118816624163e-06 2023-01-24 00:59:03.413273: step: 40/77, loss: 1.5288499071175465e-06 2023-01-24 00:59:04.907382: step: 44/77, loss: 2.2268908651312813e-05 2023-01-24 00:59:06.333730: step: 48/77, loss: 0.00036133662797510624 2023-01-24 00:59:07.820068: step: 52/77, loss: 0.0005304719670675695 2023-01-24 00:59:09.334136: step: 56/77, loss: 0.03949795290827751 2023-01-24 00:59:10.813118: step: 60/77, loss: 9.679071808932349e-06 2023-01-24 00:59:12.203345: step: 64/77, loss: 7.986901096046495e-07 2023-01-24 00:59:13.694753: step: 68/77, loss: 3.198342164978385e-05 2023-01-24 00:59:15.109231: step: 72/77, loss: 0.0005049941828474402 2023-01-24 00:59:16.548330: step: 76/77, loss: 0.00011991035717073828 2023-01-24 00:59:18.047437: step: 80/77, loss: 9.753071935847402e-06 2023-01-24 00:59:19.496323: step: 84/77, loss: 0.001151953125372529 2023-01-24 00:59:20.977035: step: 88/77, loss: 3.5002187360078096e-05 2023-01-24 00:59:22.355494: step: 92/77, loss: 0.06682904809713364 2023-01-24 00:59:23.721044: step: 96/77, loss: 0.00020730571122840047 2023-01-24 00:59:25.240189: step: 100/77, loss: 4.214938235236332e-05 2023-01-24 00:59:26.687846: step: 104/77, loss: 4.819598325411789e-05 2023-01-24 00:59:28.106944: step: 108/77, loss: 0.00024913056404329836 2023-01-24 00:59:29.499315: step: 112/77, loss: 4.713028829428367e-06 2023-01-24 00:59:30.916193: step: 116/77, loss: 1.797926779545378e-05 2023-01-24 00:59:32.422497: step: 120/77, loss: 5.1061497288174e-06 2023-01-24 00:59:33.866017: step: 124/77, loss: 4.534593244898133e-05 2023-01-24 00:59:35.331524: step: 128/77, loss: 0.05518024042248726 2023-01-24 00:59:36.814541: step: 132/77, loss: 0.000779474969021976 2023-01-24 00:59:38.201180: step: 136/77, loss: 1.6099438653327525e-05 2023-01-24 00:59:39.599244: step: 140/77, loss: 4.6170545829227194e-05 2023-01-24 00:59:41.043449: step: 144/77, loss: 4.403839920996688e-05 2023-01-24 00:59:42.485891: step: 148/77, loss: 0.012226158753037453 2023-01-24 00:59:43.954544: step: 152/77, loss: 0.0005230201641097665 2023-01-24 00:59:45.375666: step: 156/77, loss: 3.545012077665888e-05 2023-01-24 00:59:46.846212: step: 160/77, loss: 3.968955934396945e-05 2023-01-24 00:59:48.351025: step: 164/77, loss: 0.00028362390003167093 2023-01-24 00:59:49.856897: step: 168/77, loss: 0.0006454983376897871 2023-01-24 00:59:51.355093: step: 172/77, loss: 6.407515320461243e-05 2023-01-24 00:59:52.810357: step: 176/77, loss: 0.0016403638292104006 2023-01-24 00:59:54.224693: step: 180/77, loss: 0.002352670766413212 2023-01-24 00:59:55.720316: step: 184/77, loss: 0.04686371609568596 2023-01-24 00:59:57.129540: step: 188/77, loss: 0.0003184076922480017 2023-01-24 00:59:58.580326: step: 192/77, loss: 3.0171644539223053e-05 2023-01-24 01:00:00.039766: step: 196/77, loss: 
0.03708694502711296 2023-01-24 01:00:01.541597: step: 200/77, loss: 7.464420195901766e-05 2023-01-24 01:00:02.954234: step: 204/77, loss: 0.002194175496697426 2023-01-24 01:00:04.427407: step: 208/77, loss: 7.220302450150484e-06 2023-01-24 01:00:05.838404: step: 212/77, loss: 0.00476179551333189 2023-01-24 01:00:07.373379: step: 216/77, loss: 0.020851392298936844 2023-01-24 01:00:08.861878: step: 220/77, loss: 6.299009692156687e-05 2023-01-24 01:00:10.277562: step: 224/77, loss: 0.0009391269995830953 2023-01-24 01:00:11.700203: step: 228/77, loss: 0.0010091927833855152 2023-01-24 01:00:13.091883: step: 232/77, loss: 0.009036161936819553 2023-01-24 01:00:14.513880: step: 236/77, loss: 0.0027202453929930925 2023-01-24 01:00:15.978878: step: 240/77, loss: 0.0002560011635068804 2023-01-24 01:00:17.424012: step: 244/77, loss: 0.0003327125741634518 2023-01-24 01:00:18.860174: step: 248/77, loss: 0.000369568879250437 2023-01-24 01:00:20.341812: step: 252/77, loss: 0.00020267318177502602 2023-01-24 01:00:21.841171: step: 256/77, loss: 0.000802332884632051 2023-01-24 01:00:23.291826: step: 260/77, loss: 0.015739187598228455 2023-01-24 01:00:24.834233: step: 264/77, loss: 0.002607054775580764 2023-01-24 01:00:26.321197: step: 268/77, loss: 4.639992766897194e-05 2023-01-24 01:00:27.735164: step: 272/77, loss: 5.869245796930045e-05 2023-01-24 01:00:29.196904: step: 276/77, loss: 0.00129833968821913 2023-01-24 01:00:30.670280: step: 280/77, loss: 4.291278855816927e-06 2023-01-24 01:00:32.142425: step: 284/77, loss: 0.0011007684515789151 2023-01-24 01:00:33.600216: step: 288/77, loss: 0.00020736704755108804 2023-01-24 01:00:35.083415: step: 292/77, loss: 2.8625487175304443e-05 2023-01-24 01:00:36.552268: step: 296/77, loss: 0.0003702756075654179 2023-01-24 01:00:38.051907: step: 300/77, loss: 9.267129644285887e-05 2023-01-24 01:00:39.529374: step: 304/77, loss: 0.0012066512135788798 2023-01-24 01:00:40.949527: step: 308/77, loss: 0.004357015714049339 2023-01-24 01:00:42.461095: step: 312/77, loss: 0.0010678028920665383 2023-01-24 01:00:43.889217: step: 316/77, loss: 0.00029340051696635783 2023-01-24 01:00:45.381819: step: 320/77, loss: 0.00046228180872276425 2023-01-24 01:00:46.784626: step: 324/77, loss: 0.00031857320573180914 2023-01-24 01:00:48.227468: step: 328/77, loss: 0.00023108372988644987 2023-01-24 01:00:49.621947: step: 332/77, loss: 8.038013766054064e-05 2023-01-24 01:00:51.070975: step: 336/77, loss: 4.702713340520859e-05 2023-01-24 01:00:52.524305: step: 340/77, loss: 5.826320830237819e-07 2023-01-24 01:00:53.981681: step: 344/77, loss: 0.0004363069892860949 2023-01-24 01:00:55.420452: step: 348/77, loss: 0.031395308673381805 2023-01-24 01:00:56.821786: step: 352/77, loss: 3.939534508390352e-06 2023-01-24 01:00:58.249877: step: 356/77, loss: 0.00011500406253617257 2023-01-24 01:00:59.651265: step: 360/77, loss: 8.872351463651285e-05 2023-01-24 01:01:01.135607: step: 364/77, loss: 0.0006411436479538679 2023-01-24 01:01:02.573363: step: 368/77, loss: 6.815737287979573e-05 2023-01-24 01:01:04.026522: step: 372/77, loss: 0.0013687231112271547 2023-01-24 01:01:05.465284: step: 376/77, loss: 3.338500755489804e-05 2023-01-24 01:01:06.933328: step: 380/77, loss: 0.00010364050103817135 2023-01-24 01:01:08.381805: step: 384/77, loss: 0.009905654937028885 2023-01-24 01:01:09.860156: step: 388/77, loss: 0.00033722905209288 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 0.9714285714285714, 'r': 0.5666666666666667, 'f1': 
0.7157894736842105}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05031911941541022, 'epoch': 23} Test Chinese: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.631578947368421, 'r': 0.022988505747126436, 'f1': 0.04436229205175601}, 'combined': 0.03060998151571164, 'epoch': 23} Dev Korean: {'template': {'p': 0.9705882352941176, 'r': 0.55, 'f1': 0.7021276595744681}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.047056174715749195, 'epoch': 23} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029334565619223655, 'epoch': 23} Dev Russian: {'template': {'p': 0.9714285714285714, 'r': 0.5666666666666667, 'f1': 0.7157894736842105}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05031911941541022, 'epoch': 23} Test Russian: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.029334565619223655, 'epoch': 23} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 23} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 23} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 23} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: 
{'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 24 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:02:42.918238: step: 4/77, loss: 3.1171319278655574e-05 2023-01-24 01:02:44.394945: step: 8/77, loss: 0.0011873997282236814 2023-01-24 01:02:45.881164: step: 12/77, loss: 0.003962877206504345 2023-01-24 01:02:47.311858: step: 16/77, loss: 2.2723768324794946e-06 2023-01-24 01:02:48.786142: step: 20/77, loss: 0.00022218370577320457 2023-01-24 01:02:50.294039: step: 24/77, loss: 0.00029364702641032636 2023-01-24 01:02:51.717748: step: 28/77, loss: 7.54345310269855e-05 2023-01-24 01:02:53.178782: step: 32/77, loss: 0.021652810275554657 2023-01-24 01:02:54.610076: step: 36/77, loss: 0.0008724477957002819 2023-01-24 01:02:56.073606: step: 40/77, loss: 0.12781716883182526 2023-01-24 01:02:57.436530: step: 44/77, loss: 1.7628128262003884e-05 2023-01-24 01:02:58.860204: step: 48/77, loss: 0.0005912228371016681 2023-01-24 01:03:00.255877: step: 52/77, loss: 0.0004211801278870553 2023-01-24 01:03:01.722829: step: 56/77, loss: 9.959259477909654e-05 2023-01-24 01:03:03.184150: step: 60/77, loss: 0.004942398518323898 2023-01-24 01:03:04.653694: step: 64/77, loss: 0.010029622353613377 2023-01-24 01:03:06.098060: step: 68/77, loss: 7.413248386001214e-05 2023-01-24 01:03:07.566176: step: 72/77, loss: 0.00029315147548913956 2023-01-24 01:03:09.003601: step: 76/77, loss: 9.57174961513374e-06 2023-01-24 01:03:10.488138: step: 80/77, loss: 0.0011903179110959172 2023-01-24 01:03:11.958324: step: 84/77, loss: 0.00022463969071395695 2023-01-24 01:03:13.397698: step: 88/77, loss: 2.155514812329784e-05 2023-01-24 01:03:14.815722: step: 92/77, loss: 0.005188435316085815 2023-01-24 01:03:16.237467: step: 96/77, loss: 3.63168292096816e-05 2023-01-24 01:03:17.688645: step: 100/77, loss: 0.001799465506337583 2023-01-24 01:03:19.130121: step: 104/77, loss: 0.0014674203703179955 2023-01-24 01:03:20.573571: step: 108/77, loss: 0.0011961114360019565 2023-01-24 01:03:22.022849: step: 112/77, loss: 0.0003475307603366673 2023-01-24 01:03:23.469834: step: 116/77, loss: 9.936955393641256e-06 2023-01-24 01:03:24.894365: step: 120/77, loss: 4.879153857473284e-05 2023-01-24 01:03:26.269925: step: 124/77, loss: 5.204067565500736e-05 2023-01-24 01:03:27.697209: step: 128/77, loss: 2.216498614870943e-05 2023-01-24 01:03:29.139575: step: 132/77, loss: 0.003418202279135585 2023-01-24 01:03:30.560129: step: 136/77, loss: 2.672818118298892e-05 2023-01-24 01:03:31.947517: step: 140/77, loss: 4.202239870210178e-05 2023-01-24 01:03:33.462474: step: 144/77, loss: 0.037262678146362305 2023-01-24 01:03:34.958034: step: 148/77, loss: 9.831888746703044e-06 2023-01-24 01:03:36.406795: step: 152/77, loss: 4.6100030886009336e-05 2023-01-24 01:03:37.860699: step: 156/77, loss: 0.0016948329284787178 2023-01-24 01:03:39.260411: step: 160/77, loss: 2.138233185178251e-06 2023-01-24 01:03:40.702916: step: 164/77, loss: 0.0002200250164605677 2023-01-24 01:03:42.177668: step: 168/77, loss: 7.85043157520704e-05 2023-01-24 01:03:43.647564: step: 172/77, loss: 1.8322307369089685e-05 2023-01-24 01:03:45.083478: step: 176/77, loss: 0.0002504443982616067 2023-01-24 01:03:46.519545: step: 180/77, loss: 
6.690556801913772e-07 2023-01-24 01:03:47.985263: step: 184/77, loss: 0.016600850969552994 2023-01-24 01:03:49.437662: step: 188/77, loss: 0.003764849854633212 2023-01-24 01:03:50.862832: step: 192/77, loss: 0.0019138616044074297 2023-01-24 01:03:52.314726: step: 196/77, loss: 1.1861472557939123e-05 2023-01-24 01:03:53.753691: step: 200/77, loss: 9.53390626818873e-05 2023-01-24 01:03:55.168714: step: 204/77, loss: 1.658840301388409e-05 2023-01-24 01:03:56.610157: step: 208/77, loss: 0.0001371430407743901 2023-01-24 01:03:58.149302: step: 212/77, loss: 0.0035580710973590612 2023-01-24 01:03:59.648680: step: 216/77, loss: 0.004919815808534622 2023-01-24 01:04:01.113043: step: 220/77, loss: 2.8758281587215606e-06 2023-01-24 01:04:02.546564: step: 224/77, loss: 8.045551658142358e-05 2023-01-24 01:04:03.968796: step: 228/77, loss: 0.00023129928740672767 2023-01-24 01:04:05.365767: step: 232/77, loss: 1.6018514088500524e-06 2023-01-24 01:04:06.802649: step: 236/77, loss: 3.5045354707108345e-06 2023-01-24 01:04:08.306915: step: 240/77, loss: 0.0006813876680098474 2023-01-24 01:04:09.725056: step: 244/77, loss: 9.24681153264828e-05 2023-01-24 01:04:11.132503: step: 248/77, loss: 0.03565293177962303 2023-01-24 01:04:12.582256: step: 252/77, loss: 3.8891698750376236e-07 2023-01-24 01:04:14.047878: step: 256/77, loss: 5.0910502977785654e-06 2023-01-24 01:04:15.520245: step: 260/77, loss: 9.142967428488191e-06 2023-01-24 01:04:16.982239: step: 264/77, loss: 0.01213124766945839 2023-01-24 01:04:18.374569: step: 268/77, loss: 3.512187322485261e-05 2023-01-24 01:04:19.814494: step: 272/77, loss: 0.000551240926142782 2023-01-24 01:04:21.306416: step: 276/77, loss: 0.00014770854613743722 2023-01-24 01:04:22.814955: step: 280/77, loss: 0.0006746418075636029 2023-01-24 01:04:24.303263: step: 284/77, loss: 1.341101665275346e-07 2023-01-24 01:04:25.802063: step: 288/77, loss: 0.0007360518793575466 2023-01-24 01:04:27.303185: step: 292/77, loss: 0.0007057063630782068 2023-01-24 01:04:28.720952: step: 296/77, loss: 1.5855710444157012e-05 2023-01-24 01:04:30.123339: step: 300/77, loss: 0.0071692271158099174 2023-01-24 01:04:31.512687: step: 304/77, loss: 0.013543746434152126 2023-01-24 01:04:32.988424: step: 308/77, loss: 7.214854849735275e-05 2023-01-24 01:04:34.395005: step: 312/77, loss: 0.009257547557353973 2023-01-24 01:04:35.791517: step: 316/77, loss: 5.399264773586765e-05 2023-01-24 01:04:37.267359: step: 320/77, loss: 0.00014896635548211634 2023-01-24 01:04:38.701490: step: 324/77, loss: 0.0029846744146198034 2023-01-24 01:04:40.094422: step: 328/77, loss: 8.863164111971855e-06 2023-01-24 01:04:41.502840: step: 332/77, loss: 0.04843698441982269 2023-01-24 01:04:42.930014: step: 336/77, loss: 0.009433270432054996 2023-01-24 01:04:44.427467: step: 340/77, loss: 0.00015376076044049114 2023-01-24 01:04:45.875982: step: 344/77, loss: 1.597374875927926e-06 2023-01-24 01:04:47.334057: step: 348/77, loss: 5.695552317774855e-06 2023-01-24 01:04:48.709859: step: 352/77, loss: 3.231646405765787e-05 2023-01-24 01:04:50.131490: step: 356/77, loss: 0.0005682723131030798 2023-01-24 01:04:51.563099: step: 360/77, loss: 3.5998855310026556e-06 2023-01-24 01:04:52.976421: step: 364/77, loss: 0.0009214639430865645 2023-01-24 01:04:54.495450: step: 368/77, loss: 1.7061327071132837e-06 2023-01-24 01:04:55.941395: step: 372/77, loss: 0.026705704629421234 2023-01-24 01:04:57.420380: step: 376/77, loss: 4.4256216824578587e-07 2023-01-24 01:04:58.929788: step: 380/77, loss: 3.592257144191535e-06 2023-01-24 01:05:00.425176: step: 
384/77, loss: 0.002922291401773691 2023-01-24 01:05:01.847797: step: 388/77, loss: 0.012621428817510605 ================================================== Loss: 0.005 -------------------- Dev Chinese: {'template': {'p': 0.9705882352941176, 'r': 0.55, 'f1': 0.7021276595744681}, 'slot': {'p': 0.5, 'r': 0.034026465028355386, 'f1': 0.06371681415929203}, 'combined': 0.044737337601205046, 'epoch': 24} Test Chinese: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.7241379310344828, 'r': 0.020114942528735632, 'f1': 0.0391425908667288}, 'combined': 0.026750715366206615, 'epoch': 24} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 24} Test Korean: {'template': {'p': 0.9452054794520548, 'r': 0.5433070866141733, 'f1': 0.69}, 'slot': {'p': 0.7586206896551724, 'r': 0.0210727969348659, 'f1': 0.04100652376514446}, 'combined': 0.028294501397949673, 'epoch': 24} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 24} Test Russian: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.7241379310344828, 'r': 0.020114942528735632, 'f1': 0.0391425908667288}, 'combined': 0.026750715366206615, 'epoch': 24} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 24} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 24} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 
'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 25 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:06:34.859589: step: 4/77, loss: 0.009819536469876766 2023-01-24 01:06:36.345621: step: 8/77, loss: 0.014220115728676319 2023-01-24 01:06:37.739938: step: 12/77, loss: 4.107958739041351e-05 2023-01-24 01:06:39.213437: step: 16/77, loss: 1.9505171167111257e-06 2023-01-24 01:06:40.695059: step: 20/77, loss: 0.0007240984123200178 2023-01-24 01:06:42.140968: step: 24/77, loss: 0.007082062307745218 2023-01-24 01:06:43.603241: step: 28/77, loss: 0.0007487640250474215 2023-01-24 01:06:45.101873: step: 32/77, loss: 7.519516202592058e-06 2023-01-24 01:06:46.565547: step: 36/77, loss: 0.00028292066417634487 2023-01-24 01:06:47.953206: step: 40/77, loss: 0.00017292556003667414 2023-01-24 01:06:49.426516: step: 44/77, loss: 0.05107611045241356 2023-01-24 01:06:50.893525: step: 48/77, loss: 9.529092494631186e-06 2023-01-24 01:06:52.370251: step: 52/77, loss: 0.00024360073439311236 2023-01-24 01:06:53.838695: step: 56/77, loss: 0.00033219667966477573 2023-01-24 01:06:55.275757: step: 60/77, loss: 0.00029517264920286834 2023-01-24 01:06:56.739885: step: 64/77, loss: 0.00020420033251866698 2023-01-24 01:06:58.135662: step: 68/77, loss: 2.9449969588313252e-05 2023-01-24 01:06:59.593395: step: 72/77, loss: 0.010752936825156212 2023-01-24 01:07:01.095421: step: 76/77, loss: 1.7533868231112137e-05 2023-01-24 01:07:02.551654: step: 80/77, loss: 8.120994607452303e-05 2023-01-24 01:07:04.056555: step: 84/77, loss: 3.9463975554099306e-05 2023-01-24 01:07:05.556841: step: 88/77, loss: 0.000144461722811684 2023-01-24 01:07:07.020541: step: 92/77, loss: 0.006458427291363478 2023-01-24 01:07:08.460409: step: 96/77, loss: 0.0009248307324014604 2023-01-24 01:07:09.902827: step: 100/77, loss: 0.0016990250442177057 2023-01-24 01:07:11.357570: step: 104/77, loss: 3.797957469942048e-05 2023-01-24 01:07:12.868716: step: 108/77, loss: 0.00011431700841058046 2023-01-24 01:07:14.276279: step: 112/77, loss: 0.0004391880356706679 2023-01-24 01:07:15.743549: step: 116/77, loss: 0.031691160053014755 2023-01-24 01:07:17.146109: step: 120/77, loss: 4.013401849078946e-05 2023-01-24 01:07:18.563649: step: 124/77, loss: 1.1909891327377409e-05 2023-01-24 01:07:19.993404: step: 128/77, loss: 3.6935871321475133e-06 2023-01-24 01:07:21.398535: step: 132/77, loss: 8.292774873552844e-05 2023-01-24 01:07:22.783717: step: 136/77, loss: 7.226949492178392e-07 2023-01-24 01:07:24.174383: step: 140/77, loss: 3.6549467949953396e-06 2023-01-24 01:07:25.580034: step: 144/77, loss: 2.932922143372707e-05 2023-01-24 01:07:26.964202: step: 148/77, loss: 6.541444008689723e-07 2023-01-24 01:07:28.366946: step: 152/77, loss: 0.00017043561092577875 2023-01-24 01:07:29.800877: step: 156/77, loss: 0.00010256864334223792 2023-01-24 01:07:31.192293: step: 160/77, loss: 0.005601917859166861 
2023-01-24 01:07:32.607149: step: 164/77, loss: 0.0007875625742599368 2023-01-24 01:07:34.072874: step: 168/77, loss: 1.0448028660903219e-05 2023-01-24 01:07:35.512247: step: 172/77, loss: 0.006522939540445805 2023-01-24 01:07:37.001782: step: 176/77, loss: 0.07718431949615479 2023-01-24 01:07:38.437197: step: 180/77, loss: 0.001776683609932661 2023-01-24 01:07:39.988514: step: 184/77, loss: 0.029603658244013786 2023-01-24 01:07:41.419856: step: 188/77, loss: 0.013666364364326 2023-01-24 01:07:42.910097: step: 192/77, loss: 1.081784375855932e-06 2023-01-24 01:07:44.320972: step: 196/77, loss: 5.108955519972369e-06 2023-01-24 01:07:45.802428: step: 200/77, loss: 0.000783779367338866 2023-01-24 01:07:47.285324: step: 204/77, loss: 5.707119044018327e-07 2023-01-24 01:07:48.702733: step: 208/77, loss: 7.573235961899627e-06 2023-01-24 01:07:50.131455: step: 212/77, loss: 0.00014532128989230841 2023-01-24 01:07:51.541285: step: 216/77, loss: 0.008629778400063515 2023-01-24 01:07:52.971265: step: 220/77, loss: 0.0024603274650871754 2023-01-24 01:07:54.427685: step: 224/77, loss: 6.228598294910626e-07 2023-01-24 01:07:55.885001: step: 228/77, loss: 0.0012935721315443516 2023-01-24 01:07:57.371568: step: 232/77, loss: 3.5904915421269834e-05 2023-01-24 01:07:58.753762: step: 236/77, loss: 8.25760216685012e-05 2023-01-24 01:08:00.246622: step: 240/77, loss: 1.0371200005465653e-05 2023-01-24 01:08:01.744896: step: 244/77, loss: 0.03577936068177223 2023-01-24 01:08:03.094788: step: 248/77, loss: 0.009609689936041832 2023-01-24 01:08:04.495876: step: 252/77, loss: 0.04909440502524376 2023-01-24 01:08:05.939465: step: 256/77, loss: 6.557624146807939e-05 2023-01-24 01:08:07.403374: step: 260/77, loss: 1.0412841220386326e-05 2023-01-24 01:08:08.901344: step: 264/77, loss: 3.9196598663693294e-05 2023-01-24 01:08:10.361826: step: 268/77, loss: 2.399073650849459e-07 2023-01-24 01:08:11.860936: step: 272/77, loss: 5.692206173080194e-07 2023-01-24 01:08:13.297962: step: 276/77, loss: 2.4139782794918574e-07 2023-01-24 01:08:14.716197: step: 280/77, loss: 1.2278434951440431e-05 2023-01-24 01:08:16.233591: step: 284/77, loss: 0.00012191428686492145 2023-01-24 01:08:17.781242: step: 288/77, loss: 0.09169664233922958 2023-01-24 01:08:19.152052: step: 292/77, loss: 3.4272244420208153e-07 2023-01-24 01:08:20.606806: step: 296/77, loss: 5.785742905572988e-05 2023-01-24 01:08:22.026906: step: 300/77, loss: 0.01666434481739998 2023-01-24 01:08:23.461297: step: 304/77, loss: 9.44926287047565e-05 2023-01-24 01:08:24.859962: step: 308/77, loss: 0.00016875761502888054 2023-01-24 01:08:26.283547: step: 312/77, loss: 5.314128065947443e-05 2023-01-24 01:08:27.715737: step: 316/77, loss: 0.004922967404127121 2023-01-24 01:08:29.206206: step: 320/77, loss: 1.8414266378385946e-05 2023-01-24 01:08:30.641133: step: 324/77, loss: 0.00015346906729973853 2023-01-24 01:08:32.001233: step: 328/77, loss: 0.00015929968503769487 2023-01-24 01:08:33.447920: step: 332/77, loss: 0.00010385241330368444 2023-01-24 01:08:34.898306: step: 336/77, loss: 0.000952318892814219 2023-01-24 01:08:36.311431: step: 340/77, loss: 0.015849722549319267 2023-01-24 01:08:37.781159: step: 344/77, loss: 0.059916820377111435 2023-01-24 01:08:39.241467: step: 348/77, loss: 0.00011897022341145203 2023-01-24 01:08:40.629592: step: 352/77, loss: 0.00013665176811628044 2023-01-24 01:08:42.077266: step: 356/77, loss: 0.0005663028568960726 2023-01-24 01:08:43.589561: step: 360/77, loss: 0.42172589898109436 2023-01-24 01:08:45.066014: step: 364/77, loss: 
5.4338863265002146e-05 2023-01-24 01:08:46.493912: step: 368/77, loss: 1.3788014257443137e-05 2023-01-24 01:08:47.959742: step: 372/77, loss: 2.5643452318035997e-05 2023-01-24 01:08:49.480730: step: 376/77, loss: 0.0001552673347759992 2023-01-24 01:08:50.847228: step: 380/77, loss: 6.0173853853484616e-05 2023-01-24 01:08:52.310722: step: 384/77, loss: 5.203439650358632e-05 2023-01-24 01:08:53.756083: step: 388/77, loss: 0.00010354188270866871 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Chinese: {'template': {'p': 0.9285714285714286, 'r': 0.5118110236220472, 'f1': 0.6598984771573604}, 'slot': {'p': 0.5, 'r': 0.022030651340996167, 'f1': 0.04220183486238532}, 'combined': 0.027848926558934478, 'epoch': 25} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Korean: {'template': {'p': 0.9428571428571428, 'r': 0.5196850393700787, 'f1': 0.6700507614213197}, 'slot': {'p': 0.5111111111111111, 'r': 0.022030651340996167, 'f1': 0.04224058769513315}, 'combined': 0.028303337948007993, 'epoch': 25} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 25} Test Russian: {'template': {'p': 0.9428571428571428, 'r': 0.5196850393700787, 'f1': 0.6700507614213197}, 'slot': {'p': 0.5, 'r': 0.022030651340996167, 'f1': 0.04220183486238532}, 'combined': 0.02827737158291808, 'epoch': 25} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 25} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 
'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 26 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:10:26.324591: step: 4/77, loss: 0.0017838947242125869 2023-01-24 01:10:27.733539: step: 8/77, loss: 7.226975071716879e-07 2023-01-24 01:10:29.247412: step: 12/77, loss: 5.246474756859243e-05 2023-01-24 01:10:30.721223: step: 16/77, loss: 0.000598268408793956 2023-01-24 01:10:32.154118: step: 20/77, loss: 1.639127411579011e-08 2023-01-24 01:10:33.518916: step: 24/77, loss: 4.097351848031394e-05 2023-01-24 01:10:34.956706: step: 28/77, loss: 0.0009339273674413562 2023-01-24 01:10:36.414962: step: 32/77, loss: 2.0804567611776292e-05 2023-01-24 01:10:37.842526: step: 36/77, loss: 0.00034928921377286315 2023-01-24 01:10:39.262138: step: 40/77, loss: 0.00030408421298488975 2023-01-24 01:10:40.672777: step: 44/77, loss: 8.966489258455113e-05 2023-01-24 01:10:42.139139: step: 48/77, loss: 3.918986806183966e-07 2023-01-24 01:10:43.583025: step: 52/77, loss: 1.739860454108566e-05 2023-01-24 01:10:45.034759: step: 56/77, loss: 0.00017790854326449335 2023-01-24 01:10:46.454100: step: 60/77, loss: 4.157096554990858e-06 2023-01-24 01:10:47.900984: step: 64/77, loss: 0.018841929733753204 2023-01-24 01:10:49.308325: step: 68/77, loss: 0.00646100752055645 2023-01-24 01:10:50.707963: step: 72/77, loss: 0.00015355581126641482 2023-01-24 01:10:52.176729: step: 76/77, loss: 1.0102801297762198e-06 2023-01-24 01:10:53.665568: step: 80/77, loss: 1.714569043542724e-05 2023-01-24 01:10:55.187652: step: 84/77, loss: 0.001359621761366725 2023-01-24 01:10:56.678675: step: 88/77, loss: 0.00012873602099716663 2023-01-24 01:10:58.058693: step: 92/77, loss: 0.0003332077758386731 2023-01-24 01:10:59.431039: step: 96/77, loss: 2.5836798158707097e-06 2023-01-24 01:11:00.827023: step: 100/77, loss: 0.0013962970115244389 2023-01-24 01:11:02.303642: step: 104/77, loss: 0.0006795570370741189 2023-01-24 01:11:03.756038: step: 108/77, loss: 0.005097075365483761 2023-01-24 01:11:05.161015: step: 112/77, loss: 1.3976770105728065e-06 2023-01-24 01:11:06.537107: step: 116/77, loss: 7.601392098877113e-06 2023-01-24 01:11:08.004377: step: 120/77, loss: 0.00032862581429071724 2023-01-24 01:11:09.498006: step: 124/77, loss: 1.247330874321051e-05 2023-01-24 01:11:10.859204: step: 128/77, loss: 2.391489260844537e-06 2023-01-24 01:11:12.235101: step: 132/77, loss: 3.6128487408859655e-05 2023-01-24 01:11:13.709175: step: 136/77, loss: 0.00016953838348854333 2023-01-24 01:11:15.145228: step: 140/77, loss: 0.0012672448065131903 2023-01-24 01:11:16.620648: step: 144/77, loss: 0.0003668934223242104 2023-01-24 
01:11:18.062820: step: 148/77, loss: 0.011562679894268513 2023-01-24 01:11:19.573427: step: 152/77, loss: 2.9200720746302977e-05 2023-01-24 01:11:21.053545: step: 156/77, loss: 3.244157414883375e-05 2023-01-24 01:11:22.509415: step: 160/77, loss: 2.2947481284063542e-06 2023-01-24 01:11:23.988690: step: 164/77, loss: 0.0020991831552237272 2023-01-24 01:11:25.424651: step: 168/77, loss: 0.0002871434553526342 2023-01-24 01:11:26.884582: step: 172/77, loss: 0.00023810111451894045 2023-01-24 01:11:28.343559: step: 176/77, loss: 4.5447964680533914e-07 2023-01-24 01:11:29.795505: step: 180/77, loss: 6.748407031409442e-05 2023-01-24 01:11:31.235060: step: 184/77, loss: 7.154821651056409e-05 2023-01-24 01:11:32.620980: step: 188/77, loss: 3.4152094485762063e-06 2023-01-24 01:11:34.068060: step: 192/77, loss: 3.905124685843475e-05 2023-01-24 01:11:35.588473: step: 196/77, loss: 0.0036262294743210077 2023-01-24 01:11:37.060340: step: 200/77, loss: 0.0002491704362910241 2023-01-24 01:11:38.547784: step: 204/77, loss: 5.036541779190884e-07 2023-01-24 01:11:40.017816: step: 208/77, loss: 9.896850315271877e-06 2023-01-24 01:11:41.531441: step: 212/77, loss: 2.8555174139910378e-05 2023-01-24 01:11:42.962966: step: 216/77, loss: 0.003979911096394062 2023-01-24 01:11:44.405921: step: 220/77, loss: 0.0016700468258932233 2023-01-24 01:11:45.817813: step: 224/77, loss: 1.7285307762904267e-07 2023-01-24 01:11:47.360364: step: 228/77, loss: 0.023326512426137924 2023-01-24 01:11:48.788537: step: 232/77, loss: 7.369111699517816e-05 2023-01-24 01:11:50.266902: step: 236/77, loss: 1.3097933333483525e-06 2023-01-24 01:11:51.796173: step: 240/77, loss: 1.3172389117244165e-06 2023-01-24 01:11:53.213493: step: 244/77, loss: 0.0009034214308485389 2023-01-24 01:11:54.597936: step: 248/77, loss: 0.015626205131411552 2023-01-24 01:11:56.000938: step: 252/77, loss: 1.3142660009179963e-06 2023-01-24 01:11:57.455986: step: 256/77, loss: 5.230222654972749e-07 2023-01-24 01:11:58.941155: step: 260/77, loss: 3.874300347206372e-08 2023-01-24 01:12:00.371429: step: 264/77, loss: 0.014988576993346214 2023-01-24 01:12:01.755926: step: 268/77, loss: 0.028431393206119537 2023-01-24 01:12:03.200926: step: 272/77, loss: 2.6538033125689253e-05 2023-01-24 01:12:04.617451: step: 276/77, loss: 7.4582731031114236e-06 2023-01-24 01:12:06.042589: step: 280/77, loss: 0.004460113123059273 2023-01-24 01:12:07.490226: step: 284/77, loss: 6.880648925289279e-06 2023-01-24 01:12:08.947219: step: 288/77, loss: 5.675950160366483e-05 2023-01-24 01:12:10.395506: step: 292/77, loss: 0.02451791800558567 2023-01-24 01:12:11.915721: step: 296/77, loss: 0.0002563257294241339 2023-01-24 01:12:13.437424: step: 300/77, loss: 0.004377185832709074 2023-01-24 01:12:14.899507: step: 304/77, loss: 0.04975339025259018 2023-01-24 01:12:16.342957: step: 308/77, loss: 0.011360619217157364 2023-01-24 01:12:17.784403: step: 312/77, loss: 0.03482261672616005 2023-01-24 01:12:19.255252: step: 316/77, loss: 5.094198786537163e-05 2023-01-24 01:12:20.706923: step: 320/77, loss: 0.0037432571407407522 2023-01-24 01:12:22.165646: step: 324/77, loss: 0.0006270411540754139 2023-01-24 01:12:23.656970: step: 328/77, loss: 3.8891886333658476e-07 2023-01-24 01:12:25.078004: step: 332/77, loss: 0.007590716704726219 2023-01-24 01:12:26.529725: step: 336/77, loss: 3.2466082302562427e-06 2023-01-24 01:12:27.970887: step: 340/77, loss: 1.8978684238390997e-05 2023-01-24 01:12:29.490457: step: 344/77, loss: 8.508499149684212e-07 2023-01-24 01:12:30.897344: step: 348/77, loss: 
4.011272721982095e-06 2023-01-24 01:12:32.353596: step: 352/77, loss: 0.03435808792710304 2023-01-24 01:12:33.774594: step: 356/77, loss: 4.8348661039199214e-06 2023-01-24 01:12:35.250391: step: 360/77, loss: 6.67527929181233e-05 2023-01-24 01:12:36.742108: step: 364/77, loss: 0.0005355605971999466 2023-01-24 01:12:38.170506: step: 368/77, loss: 5.0092576202587225e-06 2023-01-24 01:12:39.574804: step: 372/77, loss: 7.11991333446349e-06 2023-01-24 01:12:41.000213: step: 376/77, loss: 0.0002545999304857105 2023-01-24 01:12:42.487758: step: 380/77, loss: 5.245151442068163e-07 2023-01-24 01:12:43.888612: step: 384/77, loss: 6.656296318396926e-05 2023-01-24 01:12:45.326312: step: 388/77, loss: 7.972573075676337e-05 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 26} Test Chinese: {'template': {'p': 0.9444444444444444, 'r': 0.5354330708661418, 'f1': 0.6834170854271356}, 'slot': {'p': 0.65625, 'r': 0.020114942528735632, 'f1': 0.03903345724907063}, 'combined': 0.02667613158730455, 'epoch': 26} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 26} Test Korean: {'template': {'p': 0.9315068493150684, 'r': 0.5354330708661418, 'f1': 0.6799999999999999}, 'slot': {'p': 0.65625, 'r': 0.020114942528735632, 'f1': 0.03903345724907063}, 'combined': 0.026542750929368027, 'epoch': 26} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 26} Test Russian: {'template': {'p': 0.9315068493150684, 'r': 0.5354330708661418, 'f1': 0.6799999999999999}, 'slot': {'p': 0.6666666666666666, 'r': 0.019157088122605363, 'f1': 0.037243947858472994}, 'combined': 0.025325884543761633, 'epoch': 26} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 26} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 26} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 26} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 
2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 27 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:14:18.262923: step: 4/77, loss: 0.0021314637269824743 2023-01-24 01:14:19.634800: step: 8/77, loss: 0.04611368477344513 2023-01-24 01:14:21.029068: step: 12/77, loss: 2.6280285965185612e-05 2023-01-24 01:14:22.469678: step: 16/77, loss: 5.7614823163021356e-05 2023-01-24 01:14:23.903737: step: 20/77, loss: 8.972358045866713e-06 2023-01-24 01:14:25.289401: step: 24/77, loss: 0.0002448845189064741 2023-01-24 01:14:26.796547: step: 28/77, loss: 8.642667381764113e-08 2023-01-24 01:14:28.187626: step: 32/77, loss: 0.00010978860518662259 2023-01-24 01:14:29.560109: step: 36/77, loss: 0.0001491555303800851 2023-01-24 01:14:31.061505: step: 40/77, loss: 4.802901821676642e-05 2023-01-24 01:14:32.581123: step: 44/77, loss: 2.7653884444589494e-06 2023-01-24 01:14:34.023611: step: 48/77, loss: 7.958367496030405e-05 2023-01-24 01:14:35.511102: step: 52/77, loss: 2.556142135290429e-05 2023-01-24 01:14:36.929186: step: 56/77, loss: 4.932249453304394e-07 2023-01-24 01:14:38.381412: step: 60/77, loss: 1.4454097652105702e-07 2023-01-24 01:14:39.828188: step: 64/77, loss: 5.4347357945516706e-05 2023-01-24 01:14:41.337049: step: 68/77, loss: 0.10340731590986252 2023-01-24 01:14:42.803015: step: 72/77, loss: 0.0005586078623309731 2023-01-24 01:14:44.257631: step: 76/77, loss: 0.0001581439282745123 2023-01-24 01:14:45.659205: step: 80/77, loss: 0.00101377721875906 2023-01-24 01:14:47.057023: step: 84/77, loss: 0.0022370575461536646 2023-01-24 01:14:48.446917: step: 88/77, loss: 3.117080996162258e-05 2023-01-24 01:14:49.976459: step: 92/77, loss: 0.07382217049598694 2023-01-24 01:14:51.453243: step: 96/77, loss: 0.021854978054761887 2023-01-24 01:14:52.877297: step: 100/77, loss: 0.00010129621659871191 2023-01-24 01:14:54.357489: step: 104/77, loss: 7.38961753086187e-05 2023-01-24 01:14:55.783375: step: 108/77, loss: 7.603023732372094e-06 2023-01-24 01:14:57.254087: step: 112/77, loss: 2.898144884966314e-06 2023-01-24 01:14:58.732332: step: 116/77, loss: 0.028281545266509056 2023-01-24 01:15:00.247314: step: 120/77, loss: 0.00011244900088058785 2023-01-24 01:15:01.637167: step: 124/77, loss: 1.1423481737438124e-05 2023-01-24 01:15:03.066011: step: 128/77, loss: 0.0001512135931989178 2023-01-24 
01:15:04.502439: step: 132/77, loss: 0.00036550781805999577 2023-01-24 01:15:05.949887: step: 136/77, loss: 2.6029840228147805e-05 2023-01-24 01:15:07.326083: step: 140/77, loss: 5.220618913881481e-05 2023-01-24 01:15:08.786367: step: 144/77, loss: 0.00010639047832228243 2023-01-24 01:15:10.250818: step: 148/77, loss: 1.2359154425212182e-05 2023-01-24 01:15:11.694722: step: 152/77, loss: 1.0415762972115772e-06 2023-01-24 01:15:13.177299: step: 156/77, loss: 2.846110191967455e-07 2023-01-24 01:15:14.643677: step: 160/77, loss: 0.00021485608885996044 2023-01-24 01:15:16.053067: step: 164/77, loss: 0.00016367128409910947 2023-01-24 01:15:17.461768: step: 168/77, loss: 0.00012849051563534886 2023-01-24 01:15:18.939891: step: 172/77, loss: 0.0001794326672097668 2023-01-24 01:15:20.384220: step: 176/77, loss: 9.309967572335154e-05 2023-01-24 01:15:21.804973: step: 180/77, loss: 3.740165652743599e-07 2023-01-24 01:15:23.250297: step: 184/77, loss: 8.236393114202656e-06 2023-01-24 01:15:24.695604: step: 188/77, loss: 0.0007826384389773011 2023-01-24 01:15:26.154864: step: 192/77, loss: 6.202798977028579e-05 2023-01-24 01:15:27.571691: step: 196/77, loss: 0.011609578505158424 2023-01-24 01:15:29.040433: step: 200/77, loss: 0.00024659387418068945 2023-01-24 01:15:30.514386: step: 204/77, loss: 0.00010214914073003456 2023-01-24 01:15:32.033860: step: 208/77, loss: 0.0020770388655364513 2023-01-24 01:15:33.483420: step: 212/77, loss: 0.00011197337880730629 2023-01-24 01:15:34.939940: step: 216/77, loss: 0.0004414325812831521 2023-01-24 01:15:36.343430: step: 220/77, loss: 2.2782662654208252e-06 2023-01-24 01:15:37.796362: step: 224/77, loss: 0.07051575183868408 2023-01-24 01:15:39.229541: step: 228/77, loss: 7.942214210743259e-07 2023-01-24 01:15:40.697669: step: 232/77, loss: 0.0005741835338994861 2023-01-24 01:15:42.140457: step: 236/77, loss: 1.4741096492798533e-05 2023-01-24 01:15:43.555155: step: 240/77, loss: 0.000563853420317173 2023-01-24 01:15:44.985626: step: 244/77, loss: 0.001484802458435297 2023-01-24 01:15:46.431789: step: 248/77, loss: 7.956744229886681e-05 2023-01-24 01:15:47.871618: step: 252/77, loss: 0.001037311041727662 2023-01-24 01:15:49.316192: step: 256/77, loss: 1.990406417462509e-05 2023-01-24 01:15:50.828446: step: 260/77, loss: 1.7225285091626574e-06 2023-01-24 01:15:52.280253: step: 264/77, loss: 2.6508356313570403e-05 2023-01-24 01:15:53.696298: step: 268/77, loss: 0.00012444915773812681 2023-01-24 01:15:55.125614: step: 272/77, loss: 3.8260586734395474e-05 2023-01-24 01:15:56.584125: step: 276/77, loss: 0.053769297897815704 2023-01-24 01:15:58.008596: step: 280/77, loss: 2.1546213702094974e-06 2023-01-24 01:15:59.431019: step: 284/77, loss: 0.003493919502943754 2023-01-24 01:16:00.897081: step: 288/77, loss: 9.377560672874097e-06 2023-01-24 01:16:02.374090: step: 292/77, loss: 1.3846120054950006e-05 2023-01-24 01:16:03.808983: step: 296/77, loss: 0.09929952770471573 2023-01-24 01:16:05.242154: step: 300/77, loss: 1.4438862763199722e-06 2023-01-24 01:16:06.669331: step: 304/77, loss: 0.00014174054376780987 2023-01-24 01:16:08.203748: step: 308/77, loss: 2.1268220734782517e-05 2023-01-24 01:16:09.618636: step: 312/77, loss: 4.914067631034413e-06 2023-01-24 01:16:11.060334: step: 316/77, loss: 0.00031243939884006977 2023-01-24 01:16:12.456442: step: 320/77, loss: 2.980193130497355e-05 2023-01-24 01:16:13.906568: step: 324/77, loss: 5.915725296290475e-07 2023-01-24 01:16:15.346716: step: 328/77, loss: 0.0004401046026032418 2023-01-24 01:16:16.779489: step: 332/77, loss: 
7.772417302476242e-05 2023-01-24 01:16:18.250982: step: 336/77, loss: 0.015740012750029564 2023-01-24 01:16:19.689273: step: 340/77, loss: 5.46865976502886e-07 2023-01-24 01:16:21.096788: step: 344/77, loss: 2.771384060906712e-06 2023-01-24 01:16:22.561652: step: 348/77, loss: 8.50096967042191e-06 2023-01-24 01:16:24.004394: step: 352/77, loss: 1.4289958016888704e-06 2023-01-24 01:16:25.533914: step: 356/77, loss: 2.460886025801301e-05 2023-01-24 01:16:26.891074: step: 360/77, loss: 1.0281792128807865e-07 2023-01-24 01:16:28.379865: step: 364/77, loss: 1.9967509956586582e-07 2023-01-24 01:16:29.834214: step: 368/77, loss: 0.0002714739239308983 2023-01-24 01:16:31.286757: step: 372/77, loss: 5.9639728533511516e-06 2023-01-24 01:16:32.799572: step: 376/77, loss: 6.159111217129976e-05 2023-01-24 01:16:34.199932: step: 380/77, loss: 8.031515221773589e-07 2023-01-24 01:16:35.637117: step: 384/77, loss: 2.7013338694814593e-06 2023-01-24 01:16:37.074012: step: 388/77, loss: 0.000659904966596514 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Chinese: {'template': {'p': 0.9305555555555556, 'r': 0.5275590551181102, 'f1': 0.6733668341708542}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.02862742548230988, 'epoch': 27} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Korean: {'template': {'p': 0.9324324324324325, 'r': 0.5433070866141733, 'f1': 0.6865671641791046}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.02918862250669021, 'epoch': 27} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Russian: {'template': {'p': 0.9315068493150684, 'r': 0.5354330708661418, 'f1': 0.6799999999999999}, 'slot': {'p': 0.6052631578947368, 'r': 0.022030651340996167, 'f1': 0.04251386321626617}, 'combined': 0.028909426987060994, 'epoch': 27} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 27} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 
'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 28 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:18:10.538784: step: 4/77, loss: 0.00040027956129051745 2023-01-24 01:18:11.958167: step: 8/77, loss: 2.1859127627976704e-06 2023-01-24 01:18:13.359673: step: 12/77, loss: 0.00019009897368960083 2023-01-24 01:18:14.865922: step: 16/77, loss: 3.5464603342916234e-07 2023-01-24 01:18:16.376304: step: 20/77, loss: 0.0005413616891019046 2023-01-24 01:18:17.902098: step: 24/77, loss: 0.001072825863957405 2023-01-24 01:18:19.382362: step: 28/77, loss: 3.041150193894282e-06 2023-01-24 01:18:20.891018: step: 32/77, loss: 0.0005392186576500535 2023-01-24 01:18:22.386372: step: 36/77, loss: 0.00021960663434583694 2023-01-24 01:18:23.848555: step: 40/77, loss: 4.476705362321809e-05 2023-01-24 01:18:25.241699: step: 44/77, loss: 0.00011493064812384546 2023-01-24 01:18:26.611058: step: 48/77, loss: 7.882613317633513e-07 2023-01-24 01:18:28.072030: step: 52/77, loss: 4.191356310911942e-06 2023-01-24 01:18:29.511773: step: 56/77, loss: 0.0014918299857527018 2023-01-24 01:18:30.966126: step: 60/77, loss: 6.311079050647095e-05 2023-01-24 01:18:32.464013: step: 64/77, loss: 2.693668648134917e-05 2023-01-24 01:18:33.916563: step: 68/77, loss: 2.789504469546955e-05 2023-01-24 01:18:35.377695: step: 72/77, loss: 0.00019142587552778423 2023-01-24 01:18:36.818339: step: 76/77, loss: 0.0008312238496728241 2023-01-24 01:18:38.253710: step: 80/77, loss: 0.0001014574954751879 2023-01-24 01:18:39.667807: step: 84/77, loss: 0.008465916849672794 2023-01-24 01:18:41.135608: step: 88/77, loss: 0.0009976828005164862 2023-01-24 01:18:42.584030: step: 92/77, loss: 1.341102517926629e-07 2023-01-24 01:18:44.114693: step: 96/77, loss: 0.00011098245158791542 2023-01-24 01:18:45.600141: step: 100/77, loss: 9.53789203776978e-05 2023-01-24 01:18:46.981334: step: 104/77, loss: 1.5348143733717734e-07 2023-01-24 01:18:48.455498: step: 108/77, loss: 1.4706993169966154e-06 2023-01-24 01:18:49.903217: step: 112/77, loss: 
2.4139754373209144e-07 2023-01-24 01:18:51.279274: step: 116/77, loss: 7.577970245620236e-05 2023-01-24 01:18:52.707477: step: 120/77, loss: 1.9295155652798712e-05 2023-01-24 01:18:54.110381: step: 124/77, loss: 0.0001189530739793554 2023-01-24 01:18:55.578299: step: 128/77, loss: 1.7593152733752504e-05 2023-01-24 01:18:57.028683: step: 132/77, loss: 5.364407229535573e-08 2023-01-24 01:18:58.518524: step: 136/77, loss: 1.4920466128387488e-05 2023-01-24 01:19:00.015648: step: 140/77, loss: 2.756704304829327e-07 2023-01-24 01:19:01.417576: step: 144/77, loss: 0.0006070179515518248 2023-01-24 01:19:02.872535: step: 148/77, loss: 8.448770927316218e-07 2023-01-24 01:19:04.339669: step: 152/77, loss: 0.00011783481750171632 2023-01-24 01:19:05.778884: step: 156/77, loss: 3.100753019680269e-06 2023-01-24 01:19:07.249673: step: 160/77, loss: 1.0468844266142696e-05 2023-01-24 01:19:08.693557: step: 164/77, loss: 8.910707265386009e-07 2023-01-24 01:19:10.153627: step: 168/77, loss: 7.314847607631236e-05 2023-01-24 01:19:11.584644: step: 172/77, loss: 0.00037385508767329156 2023-01-24 01:19:13.024788: step: 176/77, loss: 4.618462480721064e-05 2023-01-24 01:19:14.497957: step: 180/77, loss: 2.781278089969419e-05 2023-01-24 01:19:15.949421: step: 184/77, loss: 0.03798334300518036 2023-01-24 01:19:17.412116: step: 188/77, loss: 7.233463111333549e-05 2023-01-24 01:19:18.864397: step: 192/77, loss: 2.7957617930951528e-05 2023-01-24 01:19:20.294392: step: 196/77, loss: 3.992303754785098e-05 2023-01-24 01:19:21.786464: step: 200/77, loss: 7.301559890038334e-08 2023-01-24 01:19:23.181425: step: 204/77, loss: 0.00419108010828495 2023-01-24 01:19:24.628026: step: 208/77, loss: 8.37369680084521e-06 2023-01-24 01:19:26.046090: step: 212/77, loss: 0.000886766065377742 2023-01-24 01:19:27.447352: step: 216/77, loss: 0.019571105018258095 2023-01-24 01:19:28.887202: step: 220/77, loss: 0.007960694842040539 2023-01-24 01:19:30.384707: step: 224/77, loss: 1.624222250029561e-07 2023-01-24 01:19:31.794387: step: 228/77, loss: 3.8517246139235795e-06 2023-01-24 01:19:33.187599: step: 232/77, loss: 0.012839417904615402 2023-01-24 01:19:34.655396: step: 236/77, loss: 0.0006180446944199502 2023-01-24 01:19:36.076618: step: 240/77, loss: 0.0004989909357391298 2023-01-24 01:19:37.539185: step: 244/77, loss: 2.1411585748865036e-06 2023-01-24 01:19:39.104918: step: 248/77, loss: 0.015560336410999298 2023-01-24 01:19:40.596115: step: 252/77, loss: 1.972132668015547e-05 2023-01-24 01:19:42.001303: step: 256/77, loss: 2.7699634301825427e-06 2023-01-24 01:19:43.455721: step: 260/77, loss: 1.7081476471503265e-05 2023-01-24 01:19:44.875420: step: 264/77, loss: 1.7463208905610372e-06 2023-01-24 01:19:46.299825: step: 268/77, loss: 8.59771489558625e-07 2023-01-24 01:19:47.786646: step: 272/77, loss: 7.122570195861044e-07 2023-01-24 01:19:49.303150: step: 276/77, loss: 0.00022912969870958477 2023-01-24 01:19:50.697303: step: 280/77, loss: 0.00013790567754767835 2023-01-24 01:19:52.110545: step: 284/77, loss: 2.7252042855252512e-05 2023-01-24 01:19:53.566678: step: 288/77, loss: 3.2927439406194026e-06 2023-01-24 01:19:54.937130: step: 292/77, loss: 1.9038379832636565e-05 2023-01-24 01:19:56.473617: step: 296/77, loss: 5.681869515683502e-05 2023-01-24 01:19:57.957855: step: 300/77, loss: 0.000779482361394912 2023-01-24 01:19:59.415879: step: 304/77, loss: 8.301656635012478e-06 2023-01-24 01:20:00.864642: step: 308/77, loss: 9.468827192904428e-05 2023-01-24 01:20:02.321971: step: 312/77, loss: 0.03950297087430954 2023-01-24 01:20:03.753069: 
step: 316/77, loss: 0.08819811046123505 2023-01-24 01:20:05.175347: step: 320/77, loss: 2.816292123952735e-07 2023-01-24 01:20:06.597532: step: 324/77, loss: 5.2443148888414726e-05 2023-01-24 01:20:08.079518: step: 328/77, loss: 0.0021318369545042515 2023-01-24 01:20:09.556982: step: 332/77, loss: 7.926521357148886e-05 2023-01-24 01:20:11.035926: step: 336/77, loss: 3.8356803997885436e-05 2023-01-24 01:20:12.498818: step: 340/77, loss: 1.7150616713479394e-06 2023-01-24 01:20:13.997400: step: 344/77, loss: 0.004695745650678873 2023-01-24 01:20:15.392287: step: 348/77, loss: 1.2293284044062602e-06 2023-01-24 01:20:16.817341: step: 352/77, loss: 0.0001591688342159614 2023-01-24 01:20:18.308270: step: 356/77, loss: 1.3376004972087685e-05 2023-01-24 01:20:19.704137: step: 360/77, loss: 3.653626663435716e-06 2023-01-24 01:20:21.096394: step: 364/77, loss: 3.874299636663636e-08 2023-01-24 01:20:22.579432: step: 368/77, loss: 0.0008065245347097516 2023-01-24 01:20:24.017341: step: 372/77, loss: 9.514586417935789e-05 2023-01-24 01:20:25.431992: step: 376/77, loss: 0.00012280340888537467 2023-01-24 01:20:26.912362: step: 380/77, loss: 1.7434284416140144e-07 2023-01-24 01:20:28.341732: step: 384/77, loss: 4.4804314711655024e-06 2023-01-24 01:20:29.774770: step: 388/77, loss: 4.538383109320421e-06 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 28} Test Chinese: {'template': {'p': 0.9473684210526315, 'r': 0.5669291338582677, 'f1': 0.70935960591133}, 'slot': {'p': 0.6666666666666666, 'r': 0.022988505747126436, 'f1': 0.044444444444444446}, 'combined': 0.03152709359605911, 'epoch': 28} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.048482119404105226, 'epoch': 28} Test Korean: {'template': {'p': 0.9466666666666667, 'r': 0.5590551181102362, 'f1': 0.7029702970297029}, 'slot': {'p': 0.6764705882352942, 'r': 0.022030651340996167, 'f1': 0.04267161410018552}, 'combined': 0.029996877238744276, 'epoch': 28} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 28} Test Russian: {'template': {'p': 0.9466666666666667, 'r': 0.5590551181102362, 'f1': 0.7029702970297029}, 'slot': {'p': 0.6764705882352942, 'r': 0.022030651340996167, 'f1': 0.04267161410018552}, 'combined': 0.029996877238744276, 'epoch': 28} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 28} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Chinese: {'template': {'p': 
0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2} Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 29 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:22:01.967831: step: 4/77, loss: 1.0981790410369285e-06 2023-01-24 01:22:03.427952: step: 8/77, loss: 9.417245792064932e-07 2023-01-24 01:22:04.864453: step: 12/77, loss: 9.58396412897855e-06 2023-01-24 01:22:06.300745: step: 16/77, loss: 4.61935414364234e-08 2023-01-24 01:22:07.764482: step: 20/77, loss: 0.002150250133126974 2023-01-24 01:22:09.207819: step: 24/77, loss: 1.5589535905746743e-05 2023-01-24 01:22:10.652509: step: 28/77, loss: 0.00014798429037909955 2023-01-24 01:22:12.092232: step: 32/77, loss: 1.4901160305669237e-09 2023-01-24 01:22:13.549631: step: 36/77, loss: 0.0005514497170224786 2023-01-24 01:22:14.925837: step: 40/77, loss: 1.990706550714094e-06 2023-01-24 01:22:16.390588: step: 44/77, loss: 0.004755903966724873 2023-01-24 01:22:17.797554: step: 48/77, loss: 5.960463234089275e-09 2023-01-24 01:22:19.218791: step: 52/77, loss: 0.0008213441469706595 2023-01-24 01:22:20.658244: step: 56/77, loss: 2.7505743673827965e-06 2023-01-24 01:22:22.073490: step: 60/77, loss: 6.258481732857035e-08 2023-01-24 01:22:23.541269: step: 64/77, loss: 7.01549424775294e-06 2023-01-24 01:22:24.980249: step: 68/77, loss: 0.00027408479945734143 2023-01-24 01:22:26.494075: step: 72/77, loss: 0.001076017739251256 2023-01-24 01:22:27.901692: step: 76/77, loss: 7.748595010070858e-08 2023-01-24 01:22:29.297001: step: 80/77, loss: 0.0008492418564856052 2023-01-24 01:22:30.733032: step: 84/77, loss: 1.4919985915184952e-05 2023-01-24 01:22:32.161320: step: 88/77, loss: 0.03549102321267128 2023-01-24 01:22:33.607489: step: 92/77, loss: 1.9016228179680184e-05 2023-01-24 01:22:35.070822: 
step: 96/77, loss: 2.816211235767696e-06 2023-01-24 01:22:36.463514: step: 100/77, loss: 6.242699782887939e-06 2023-01-24 01:22:37.944947: step: 104/77, loss: 0.0019745193421840668 2023-01-24 01:22:39.387714: step: 108/77, loss: 6.2407048062596004e-06 2023-01-24 01:22:40.829965: step: 112/77, loss: 7.97189500190143e-07 2023-01-24 01:22:42.243692: step: 116/77, loss: 9.354756002721842e-06 2023-01-24 01:22:43.771429: step: 120/77, loss: 3.090270547545515e-05 2023-01-24 01:22:45.312811: step: 124/77, loss: 0.0001383114722557366 2023-01-24 01:22:46.815934: step: 128/77, loss: 0.0009392743231728673 2023-01-24 01:22:48.254138: step: 132/77, loss: 0.009995796717703342 2023-01-24 01:22:49.726654: step: 136/77, loss: 5.677227932210371e-07 2023-01-24 01:22:51.194728: step: 140/77, loss: 5.5413238442270085e-06 2023-01-24 01:22:52.669291: step: 144/77, loss: 4.6998102334327996e-05 2023-01-24 01:22:54.114086: step: 148/77, loss: 6.750182706127816e-07 2023-01-24 01:22:55.574537: step: 152/77, loss: 8.619653272035066e-06 2023-01-24 01:22:57.030033: step: 156/77, loss: 3.4633274481166154e-05 2023-01-24 01:22:58.445949: step: 160/77, loss: 1.1918426935153548e-05 2023-01-24 01:22:59.921534: step: 164/77, loss: 7.30147974081774e-07 2023-01-24 01:23:01.386845: step: 168/77, loss: 4.495015673455782e-06 2023-01-24 01:23:02.842333: step: 172/77, loss: 5.5530435929540545e-06 2023-01-24 01:23:04.278004: step: 176/77, loss: 3.898432260029949e-05 2023-01-24 01:23:05.747640: step: 180/77, loss: 1.0834314707608428e-05 2023-01-24 01:23:07.189335: step: 184/77, loss: 1.4603087095110823e-07 2023-01-24 01:23:08.619329: step: 188/77, loss: 1.0279748494212981e-05 2023-01-24 01:23:10.073493: step: 192/77, loss: 0.026890480890870094 2023-01-24 01:23:11.530172: step: 196/77, loss: 1.2561100675156922e-06 2023-01-24 01:23:12.957489: step: 200/77, loss: 2.3390979549731128e-05 2023-01-24 01:23:14.387126: step: 204/77, loss: 0.00014223478501662612 2023-01-24 01:23:15.866960: step: 208/77, loss: 0.07313355058431625 2023-01-24 01:23:17.345738: step: 212/77, loss: 0.021757755428552628 2023-01-24 01:23:18.735224: step: 216/77, loss: 3.6715177884616423e-06 2023-01-24 01:23:20.212275: step: 220/77, loss: 1.1026828161675439e-07 2023-01-24 01:23:21.691561: step: 224/77, loss: 2.980229574234272e-08 2023-01-24 01:23:23.109959: step: 228/77, loss: 1.2755149327858817e-06 2023-01-24 01:23:24.573354: step: 232/77, loss: 3.090401378358365e-06 2023-01-24 01:23:25.953735: step: 236/77, loss: 1.416515533492202e-05 2023-01-24 01:23:27.437535: step: 240/77, loss: 7.53992139834736e-07 2023-01-24 01:23:28.909365: step: 244/77, loss: 0.00016382562171202153 2023-01-24 01:23:30.348258: step: 248/77, loss: 1.59735122906568e-06 2023-01-24 01:23:31.777561: step: 252/77, loss: 1.4706193724123295e-05 2023-01-24 01:23:33.208345: step: 256/77, loss: 0.024192549288272858 2023-01-24 01:23:34.701142: step: 260/77, loss: 7.3840710683725774e-06 2023-01-24 01:23:36.144507: step: 264/77, loss: 0.04937880113720894 2023-01-24 01:23:37.510669: step: 268/77, loss: 0.015833653509616852 2023-01-24 01:23:39.016002: step: 272/77, loss: 5.93059667153284e-07 2023-01-24 01:23:40.517579: step: 276/77, loss: 0.005859335884451866 2023-01-24 01:23:41.943445: step: 280/77, loss: 0.0004879350890405476 2023-01-24 01:23:43.316220: step: 284/77, loss: 1.2471202353481203e-05 2023-01-24 01:23:44.752005: step: 288/77, loss: 0.0008277146844193339 2023-01-24 01:23:46.211339: step: 292/77, loss: 7.241902721943916e-07 2023-01-24 01:23:47.641273: step: 296/77, loss: 0.0003251023299526423 
2023-01-24 01:23:49.041334: step: 300/77, loss: 3.8586495065828785e-05
2023-01-24 01:23:50.507569: step: 304/77, loss: 2.430009590170812e-05
2023-01-24 01:23:51.952805: step: 308/77, loss: 5.81522035645321e-05
2023-01-24 01:23:53.412918: step: 312/77, loss: 0.00013343266618903726
2023-01-24 01:23:54.886182: step: 316/77, loss: 2.306269198015798e-05
2023-01-24 01:23:56.344907: step: 320/77, loss: 9.536656762065832e-07
2023-01-24 01:23:57.725638: step: 324/77, loss: 0.0009279365767724812
2023-01-24 01:23:59.209888: step: 328/77, loss: 6.693446266581304e-06
2023-01-24 01:24:00.657820: step: 332/77, loss: 0.005179972387850285
2023-01-24 01:24:02.119204: step: 336/77, loss: 0.00029751280089840293
2023-01-24 01:24:03.583212: step: 340/77, loss: 9.581345921105822e-07
2023-01-24 01:24:05.036568: step: 344/77, loss: 2.428880350180407e-07
2023-01-24 01:24:06.481581: step: 348/77, loss: 0.00010887684766203165
2023-01-24 01:24:07.950084: step: 352/77, loss: 3.5044804462813772e-06
2023-01-24 01:24:09.385190: step: 356/77, loss: 1.8073638784699142e-05
2023-01-24 01:24:10.890472: step: 360/77, loss: 4.5219483581604436e-05
2023-01-24 01:24:12.341220: step: 364/77, loss: 1.6982190572889522e-05
2023-01-24 01:24:13.790143: step: 368/77, loss: 7.688842629249848e-07
2023-01-24 01:24:15.160949: step: 372/77, loss: 1.617308589629829e-05
2023-01-24 01:24:16.616207: step: 376/77, loss: 0.00010206141450908035
2023-01-24 01:24:18.110816: step: 380/77, loss: 1.5071236703079194e-05
2023-01-24 01:24:19.612854: step: 384/77, loss: 1.8000201862378162e-06
2023-01-24 01:24:21.119383: step: 388/77, loss: 0.0009697185014374554
==================================================
Loss: 0.003
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.47619047619047616, 'r': 0.03780718336483932, 'f1': 0.07005253940455342}, 'combined': 0.05161766061388146, 'epoch': 29}
Test Chinese: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.6571428571428571, 'r': 0.022030651340996167, 'f1': 0.04263206672845227}, 'combined': 0.029821445691823753, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.47619047619047616, 'r': 0.03780718336483932, 'f1': 0.07005253940455342}, 'combined': 0.05161766061388146, 'epoch': 29}
Test Korean: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.6571428571428571, 'r': 0.022030651340996167, 'f1': 0.04263206672845227}, 'combined': 0.029821445691823753, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.47619047619047616, 'r': 0.03780718336483932, 'f1': 0.07005253940455342}, 'combined': 0.05161766061388146, 'epoch': 29}
Test Russian: {'template': {'p': 0.9342105263157895, 'r': 0.5590551181102362, 'f1': 0.6995073891625616}, 'slot': {'p': 0.6470588235294118, 'r': 0.0210727969348659, 'f1': 0.04081632653061225}, 'combined': 0.02855132200663517, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 29}
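--------------------
Note: the 'combined' figures in these blocks are consistent with the product of the template F1 and the slot F1 (e.g. 0.7368421052631579 x 0.07005253940455342 = 0.05161766061388146 for Dev Chinese at epoch 29). A minimal sketch of that arithmetic follows; the helper names are hypothetical and the actual scorer in train.py may differ.

def f1_score(p: float, r: float) -> float:
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template_f1: float, slot_f1: float) -> float:
    # The logged 'combined' field matches template F1 times slot F1.
    return template_f1 * slot_f1

# Reproducing the epoch-29 "Dev Chinese" line above:
template_f1 = f1_score(1.0, 0.5833333333333334)                # 0.7368421052631579
slot_f1 = f1_score(0.47619047619047616, 0.03780718336483932)   # 0.07005253940455342
print(combined_score(template_f1, slot_f1))                    # 0.05161766061388146
--------------------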
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2}
Test for Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2}
Korean: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 2}
Test for Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5196850393700787, 'f1': 0.6666666666666665}, 'slot': {'p': 0.5172413793103449, 'r': 0.014367816091954023, 'f1': 0.02795899347623485}, 'combined': 0.018639328984156562, 'epoch': 2}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2}
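--------------------
Note: the "Current best result" block still reports epoch 2 because its dev 'combined' score (0.05179909351586346) is higher than epoch 29's (0.05161766061388146). A minimal sketch of that selection logic follows, assuming the dev 'combined' score is the model-selection criterion; the names below are illustrative, not train.py's own.

# Track the epoch whose dev 'combined' score is highest.
best = {'combined': float('-inf'), 'epoch': None}

def update_best(dev_result: dict) -> bool:
    global best
    if dev_result['combined'] > best['combined']:
        best = dev_result
        return True
    return False

update_best({'combined': 0.05179909351586346, 'epoch': 2})   # True: new best
update_best({'combined': 0.05161766061388146, 'epoch': 29})  # False: epoch 2 remains best
print(best['epoch'])                                         # 2
--------------------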