Command that produces this log: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
----------------------------------------------------------------------------------------------------
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024]) >>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024]) >>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024]) >>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024]) >>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.0.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.0.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.0.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.0.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.0.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.1.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.1.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.1.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.1.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.1.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.dense.weight:
torch.Size([1024, 1024]) >>> xlmr.encoder.layer.2.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.2.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.2.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.2.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.2.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.3.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.3.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.3.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.3.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.3.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.4.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.4.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.4.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.4.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.4.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.5.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.dense.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.5.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.5.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.5.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.5.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.5.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.6.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.6.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.6.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.6.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.6.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.7.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.7.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.7.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.7.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.7.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.8.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.dense.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.8.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.8.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.8.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.8.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.8.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.9.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.9.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.9.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.9.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.9.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.10.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.10.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.10.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.10.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.10.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.11.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.dense.weight: torch.Size([1024, 1024]) >>> 
xlmr.encoder.layer.11.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.11.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.11.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.11.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.11.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.12.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.12.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.12.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.12.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.12.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.13.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.13.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.13.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.13.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.13.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.14.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.dense.weight: torch.Size([1024, 
1024]) >>> xlmr.encoder.layer.14.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.14.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.14.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.14.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.14.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.15.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.15.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.15.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.15.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.15.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.16.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.16.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.16.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.16.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.16.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.dense.weight: 
torch.Size([1024, 1024]) >>> xlmr.encoder.layer.17.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.17.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.17.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.17.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.17.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.18.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.18.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.18.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.18.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.18.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.19.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.19.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.19.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.19.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.19.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.self.value.bias: torch.Size([1024]) >>> 
xlmr.encoder.layer.20.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.20.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.20.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.20.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.20.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.20.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.21.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.21.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.21.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.21.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.21.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.self.value.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.22.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.22.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.22.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.22.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.22.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.query.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.query.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.key.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.key.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.self.value.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.self.value.bias: 
torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.dense.weight: torch.Size([1024, 1024]) >>> xlmr.encoder.layer.23.attention.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.attention.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.intermediate.dense.weight: torch.Size([4096, 1024]) >>> xlmr.encoder.layer.23.intermediate.dense.bias: torch.Size([4096]) >>> xlmr.encoder.layer.23.output.dense.weight: torch.Size([1024, 4096]) >>> xlmr.encoder.layer.23.output.dense.bias: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.weight: torch.Size([1024]) >>> xlmr.encoder.layer.23.output.LayerNorm.bias: torch.Size([1024]) >>> xlmr.pooler.dense.weight: torch.Size([1024, 1024]) >>> xlmr.pooler.dense.bias: torch.Size([1024]) >>> trans_rep.weight: torch.Size([1024, 2048]) >>> trans_rep.bias: torch.Size([1024]) >>> hidden_ffns.Corruplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Corruplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Cybercrimeplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Cybercrimeplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Disasterplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Disasterplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Displacementplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Displacementplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Epidemiplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Epidemiplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Etiplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Etiplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Protestplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Protestplate.layers.0.bias: torch.Size([768]) >>> hidden_ffns.Terrorplate.layers.0.weight: torch.Size([768, 1024]) >>> hidden_ffns.Terrorplate.layers.0.bias: torch.Size([768]) >>> template_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Etiplate.layers.0.weight: 
torch.Size([450, 768]) >>> template_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> template_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> template_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> template_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> template_classifiers.Terrorplate.layers.1.bias: torch.Size([2]) >>> type_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Corruplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Corruplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Disasterplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Disasterplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Displacementplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Displacementplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Epidemiplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Epidemiplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Etiplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Etiplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Protestplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Protestplate.layers.1.bias: torch.Size([6]) >>> type_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> type_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> type_classifiers.Terrorplate.layers.1.weight: torch.Size([6, 450]) >>> type_classifiers.Terrorplate.layers.1.bias: torch.Size([6]) >>> completion_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Corruplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Corruplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> 
completion_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Disasterplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Disasterplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Displacementplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Displacementplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Epidemiplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Epidemiplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Etiplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Etiplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Protestplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Protestplate.layers.1.bias: torch.Size([4]) >>> completion_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> completion_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> completion_classifiers.Terrorplate.layers.1.weight: torch.Size([4, 450]) >>> completion_classifiers.Terrorplate.layers.1.bias: torch.Size([4]) >>> overtime_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> 
overtime_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> overtime_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> overtime_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> overtime_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> overtime_classifiers.Terrorplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Corruplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Corruplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Corruplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Corruplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Cybercrimeplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Cybercrimeplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Cybercrimeplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Cybercrimeplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Disasterplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Disasterplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Disasterplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Disasterplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Displacementplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Displacementplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Displacementplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Displacementplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Epidemiplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Epidemiplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Epidemiplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Epidemiplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Etiplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Etiplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Etiplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Etiplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Protestplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Protestplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Protestplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Protestplate.layers.1.bias: torch.Size([2]) >>> coordinated_classifiers.Terrorplate.layers.0.weight: torch.Size([450, 768]) >>> coordinated_classifiers.Terrorplate.layers.0.bias: torch.Size([450]) >>> coordinated_classifiers.Terrorplate.layers.1.weight: torch.Size([2, 450]) >>> coordinated_classifiers.Terrorplate.layers.1.bias: torch.Size([2])
n_trainable_params: 582185936, n_nontrainable_params: 0
----------------------------------------------------------------------------------------------------
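A listing like the one above is what iterating a PyTorch module's `named_parameters()` produces. As a minimal sketch (assuming a generic `nn.Module` named `model`; this is not the actual code in train.py), both the per-tensor lines and the two totals can be reproduced like this:

```python
from torch import nn

def summarize_parameters(model: nn.Module) -> None:
    """Print one '>>> name: shape' line per tensor, then totals split by requires_grad."""
    n_trainable, n_nontrainable = 0, 0
    print("> trainable params:")
    for name, p in model.named_parameters():
        if p.requires_grad:
            # str(p.size()) renders as e.g. 'torch.Size([250002, 1024])', matching the log
            print(f">>> {name}: {p.size()}")
            n_trainable += p.numel()
        else:
            n_nontrainable += p.numel()
    print(f"n_trainable_params: {n_trainable}, n_nontrainable_params: {n_nontrainable}")
```

The reported total is consistent with the shapes in the dump: the XLM-R large encoder accounts for 559,890,432 parameters (256,002,048 word embeddings, 526,336 position embeddings, 12,596,224 per encoder layer for 24 layers = 302,309,376, plus the small embedding LayerNorm, token-type, and pooler tensors), and the task heads add 22,295,504 (trans_rep 2,098,176; the eight per-template 1024→768 hidden FFNs 6,297,600; and the five banks of eight 768→450→{2, 6, 4, 2, 2} classifiers 13,899,728), summing to the logged 582,185,936.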
****************************** Epoch: 0
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:27:50.308200: step: 4/77, loss: 1.0449273586273193 2023-01-23 23:27:51.781189: step: 8/77, loss: 1.0661876201629639 2023-01-23 23:27:53.275219: step: 12/77, loss: 1.0697906017303467 2023-01-23 23:27:54.640745: step: 16/77, loss: 1.0578088760375977 2023-01-23 23:27:56.095450: step: 20/77, loss: 1.0446879863739014 2023-01-23 23:27:57.553884: step: 24/77, loss: 1.0503871440887451 2023-01-23 23:27:59.010439: step: 28/77, loss: 1.04624342918396 2023-01-23 23:28:00.496046: step: 32/77, loss: 1.0361480712890625 2023-01-23 23:28:01.972049: step: 36/77, loss: 1.041550874710083 2023-01-23 23:28:03.447476: step: 40/77, loss: 1.0250319242477417 2023-01-23 23:28:04.896035: step: 44/77, loss: 1.0202736854553223 2023-01-23 23:28:06.375572: step: 48/77, loss: 1.0159680843353271 2023-01-23 23:28:07.955805: step: 52/77, loss: 1.002465844154358 2023-01-23 23:28:09.454072: step: 56/77, loss: 0.977909505367279 2023-01-23 23:28:10.896526: step: 60/77, loss: 0.9835793972015381 2023-01-23 23:28:12.360018: step: 64/77, loss: 0.9763692617416382 2023-01-23 23:28:13.833606: step: 68/77, loss: 0.9826990962028503 2023-01-23 23:28:15.297418: step: 72/77, loss: 0.9448894262313843 2023-01-23 23:28:16.775652: step: 76/77, loss: 0.9651656150817871 2023-01-23 23:28:18.310877: step: 80/77, loss: 0.9191781282424927 2023-01-23 23:28:19.823674: step: 84/77, loss: 0.9262199401855469 2023-01-23 23:28:21.221156: step: 88/77, loss: 0.9188696146011353 2023-01-23 23:28:22.687685: step: 92/77, loss: 0.8682029247283936 2023-01-23 23:28:24.167043: step: 96/77, loss: 0.8716950416564941 2023-01-23 23:28:25.583509: step: 100/77, loss: 0.8641495108604431 2023-01-23 23:28:27.043990: step: 104/77, loss: 0.8422443866729736 2023-01-23 23:28:28.556973: step: 108/77, loss: 0.7932794094085693 2023-01-23 23:28:30.117760: step: 112/77, loss: 0.8285765647888184 2023-01-23 23:28:31.545110: step: 116/77, loss: 0.7687060832977295 2023-01-23 23:28:33.053119: step: 120/77, loss: 0.7670192718505859 2023-01-23 23:28:34.546411: step: 124/77, loss: 0.7816425561904907 2023-01-23 23:28:35.929349: step: 128/77, loss: 0.7292256355285645 2023-01-23 23:28:37.422411: step: 132/77, loss: 0.7331589460372925 2023-01-23 23:28:38.972048: step: 136/77, loss: 0.6675155162811279 2023-01-23 23:28:40.382365: step: 140/77, loss: 0.6493780016899109 2023-01-23 23:28:41.900067: step: 144/77, loss: 0.646944522857666 2023-01-23 23:28:43.396888: step: 148/77, loss: 0.6240701675415039 2023-01-23 23:28:44.977178: step: 152/77, loss: 0.5725189447402954 2023-01-23 23:28:46.499443: step: 156/77, loss: 0.5819498896598816 2023-01-23 23:28:47.990259: step: 160/77, loss: 0.5860099792480469 2023-01-23 23:28:49.505009: step: 164/77, loss: 0.5213608145713806 2023-01-23 23:28:51.021622: step: 168/77, loss: 0.5680379867553711 2023-01-23 23:28:52.491168: step: 172/77, loss: 0.45404040813446045 2023-01-23 23:28:53.937873: step: 176/77, loss: 0.44066500663757324 2023-01-23 23:28:55.422339: step: 180/77, loss: 0.4716237783432007 2023-01-23 23:28:56.887471: step: 184/77, loss: 0.3929561376571655 2023-01-23 23:28:58.416686: step: 188/77, loss: 0.4217242896556854 2023-01-23 23:28:59.854170: step: 192/77, loss: 0.39566516876220703 2023-01-23 23:29:01.317115: step: 196/77, loss: 0.41571173071861267 2023-01-23 23:29:02.797011: step: 200/77, loss: 0.2986573874950409 2023-01-23 23:29:04.241631: step: 204/77, loss: 0.44773268699645996
2023-01-23 23:29:05.707911: step: 208/77, loss: 0.2613578140735626 2023-01-23 23:29:07.172267: step: 212/77, loss: 0.22704322636127472 2023-01-23 23:29:08.611283: step: 216/77, loss: 0.1780972182750702 2023-01-23 23:29:10.087270: step: 220/77, loss: 0.20675839483737946 2023-01-23 23:29:11.626163: step: 224/77, loss: 0.19478635489940643 2023-01-23 23:29:12.997693: step: 228/77, loss: 0.2304016649723053 2023-01-23 23:29:14.492380: step: 232/77, loss: 0.2229248285293579 2023-01-23 23:29:16.002940: step: 236/77, loss: 0.13515512645244598 2023-01-23 23:29:17.498400: step: 240/77, loss: 0.11883608996868134 2023-01-23 23:29:19.000048: step: 244/77, loss: 0.1275760531425476 2023-01-23 23:29:20.532413: step: 248/77, loss: 0.33747437596321106 2023-01-23 23:29:22.022373: step: 252/77, loss: 0.12381209433078766 2023-01-23 23:29:23.552745: step: 256/77, loss: 0.40654462575912476 2023-01-23 23:29:25.100815: step: 260/77, loss: 0.14962953329086304 2023-01-23 23:29:26.526112: step: 264/77, loss: 0.044936202466487885 2023-01-23 23:29:27.966233: step: 268/77, loss: 0.06391699612140656 2023-01-23 23:29:29.450030: step: 272/77, loss: 0.20007681846618652 2023-01-23 23:29:30.989405: step: 276/77, loss: 0.2122235894203186 2023-01-23 23:29:32.401900: step: 280/77, loss: 0.13911424577236176 2023-01-23 23:29:33.895540: step: 284/77, loss: 0.06268332153558731 2023-01-23 23:29:35.403972: step: 288/77, loss: 0.11234302073717117 2023-01-23 23:29:36.846064: step: 292/77, loss: 0.04873437061905861 2023-01-23 23:29:38.303902: step: 296/77, loss: 0.1151905208826065 2023-01-23 23:29:39.804857: step: 300/77, loss: 0.20775069296360016 2023-01-23 23:29:41.284303: step: 304/77, loss: 0.1022971048951149 2023-01-23 23:29:42.770548: step: 308/77, loss: 0.14672891795635223 2023-01-23 23:29:44.199381: step: 312/77, loss: 0.036175355315208435 2023-01-23 23:29:45.640670: step: 316/77, loss: 0.06932605803012848 2023-01-23 23:29:47.092955: step: 320/77, loss: 0.24824608862400055 2023-01-23 23:29:48.608993: step: 324/77, loss: 0.0985107496380806 2023-01-23 23:29:50.044418: step: 328/77, loss: 0.27712494134902954 2023-01-23 23:29:51.484223: step: 332/77, loss: 0.080054372549057 2023-01-23 23:29:52.889146: step: 336/77, loss: 0.24405285716056824 2023-01-23 23:29:54.350378: step: 340/77, loss: 0.11317433416843414 2023-01-23 23:29:55.845021: step: 344/77, loss: 0.13220016658306122 2023-01-23 23:29:57.290141: step: 348/77, loss: 0.04351910203695297 2023-01-23 23:29:58.809086: step: 352/77, loss: 0.08148898184299469 2023-01-23 23:30:00.244856: step: 356/77, loss: 0.25758957862854004 2023-01-23 23:30:01.752687: step: 360/77, loss: 0.19760257005691528 2023-01-23 23:30:03.285192: step: 364/77, loss: 0.08745350688695908 2023-01-23 23:30:04.848610: step: 368/77, loss: 0.20615199208259583 2023-01-23 23:30:06.348058: step: 372/77, loss: 0.04876667261123657 2023-01-23 23:30:07.794212: step: 376/77, loss: 0.1216202825307846 2023-01-23 23:30:09.204357: step: 380/77, loss: 0.16642817854881287 2023-01-23 23:30:10.708248: step: 384/77, loss: 0.06036415696144104 2023-01-23 23:30:12.161387: step: 388/77, loss: 0.1854311227798462
==================================================
Loss: 0.487
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
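A note on reading these dictionaries: a row with 'p': 1.0 but 'r': 0.0 is the usual signature of an empty prediction set scored under the convention that precision over zero predictions is vacuously 1.0, whereas 'p': 0.0 with 'r': 0.0 suggests predictions were made but none matched gold; either way F1 is 0 whenever recall is 0, and 'combined' here is 0 whenever both component F1 scores are 0. A sketch of that convention (an illustrative helper, not the project's actual scorer):

```python
def prf(n_correct: int, n_predicted: int, n_gold: int) -> dict:
    # Empty prediction set => vacuous precision of 1.0, matching the
    # "Test" rows above; zero gold instances would need its own rule.
    p = n_correct / n_predicted if n_predicted else 1.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return {"p": p, "r": r, "f1": f1}

print(prf(0, 0, 12))  # {'p': 1.0, 'r': 0.0, 'f1': 0.0} -- the "Test" pattern
print(prf(0, 7, 12))  # {'p': 0.0, 'r': 0.0, 'f1': 0.0} -- the "Dev" pattern
```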
****************************** Epoch: 1
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:32:10.868328: step: 4/77, loss: 0.08808518946170807 2023-01-23 23:32:12.461181: step: 8/77, loss: 0.07598349452018738 2023-01-23 23:32:13.847302: step: 12/77, loss: 0.07537257671356201 2023-01-23 23:32:15.262022: step: 16/77, loss: 0.19350022077560425 2023-01-23 23:32:16.653349: step: 20/77, loss: 0.08845332264900208 2023-01-23 23:32:18.112356: step: 24/77, loss: 0.06127769500017166 2023-01-23 23:32:19.604704: step: 28/77, loss: 0.08885109424591064 2023-01-23 23:32:21.082092: step: 32/77, loss: 0.21233563125133514 2023-01-23 23:32:22.560208: step: 36/77, loss: 0.3040357232093811 2023-01-23 23:32:24.035328: step: 40/77, loss: 0.27482870221138 2023-01-23 23:32:25.560304: step: 44/77, loss: 0.08601843565702438 2023-01-23 23:32:27.002967: step: 48/77, loss: 0.07719548791646957 2023-01-23 23:32:28.467607: step: 52/77, loss: 0.06267912685871124 2023-01-23 23:32:29.903551: step: 56/77, loss: 0.07540792971849442 2023-01-23 23:32:31.348818: step: 60/77, loss: 0.04689452797174454 2023-01-23 23:32:32.851797: step: 64/77, loss: 0.15594510734081268 2023-01-23 23:32:34.390345: step: 68/77, loss: 
0.14404773712158203 2023-01-23 23:32:35.873270: step: 72/77, loss: 0.07119323313236237 2023-01-23 23:32:37.379676: step: 76/77, loss: 0.16119280457496643 2023-01-23 23:32:38.838106: step: 80/77, loss: 0.09215322881937027 2023-01-23 23:32:40.395579: step: 84/77, loss: 0.3043191432952881 2023-01-23 23:32:41.879220: step: 88/77, loss: 0.3175913393497467 2023-01-23 23:32:43.344440: step: 92/77, loss: 0.08320172131061554 2023-01-23 23:32:44.819616: step: 96/77, loss: 0.03388125076889992 2023-01-23 23:32:46.220018: step: 100/77, loss: 0.03213505819439888 2023-01-23 23:32:47.691347: step: 104/77, loss: 0.11765853315591812 2023-01-23 23:32:49.127343: step: 108/77, loss: 0.16065680980682373 2023-01-23 23:32:50.615642: step: 112/77, loss: 0.031914323568344116 2023-01-23 23:32:52.039909: step: 116/77, loss: 0.1371314525604248 2023-01-23 23:32:53.551454: step: 120/77, loss: 0.057588137686252594 2023-01-23 23:32:54.990548: step: 124/77, loss: 0.17369845509529114 2023-01-23 23:32:56.432721: step: 128/77, loss: 0.14050480723381042 2023-01-23 23:32:57.931769: step: 132/77, loss: 0.06881752610206604 2023-01-23 23:32:59.330986: step: 136/77, loss: 0.08079089224338531 2023-01-23 23:33:00.743169: step: 140/77, loss: 0.11471383273601532 2023-01-23 23:33:02.227829: step: 144/77, loss: 0.07265495508909225 2023-01-23 23:33:03.687428: step: 148/77, loss: 0.0584573820233345 2023-01-23 23:33:05.249962: step: 152/77, loss: 0.20290088653564453 2023-01-23 23:33:06.710451: step: 156/77, loss: 0.08977239578962326 2023-01-23 23:33:08.186960: step: 160/77, loss: 0.2409203052520752 2023-01-23 23:33:09.631092: step: 164/77, loss: 0.31462815403938293 2023-01-23 23:33:11.085394: step: 168/77, loss: 0.09223669022321701 2023-01-23 23:33:12.603708: step: 172/77, loss: 0.15644097328186035 2023-01-23 23:33:14.084944: step: 176/77, loss: 0.03518672287464142 2023-01-23 23:33:15.576529: step: 180/77, loss: 0.0733434408903122 2023-01-23 23:33:17.082105: step: 184/77, loss: 0.19737818837165833 2023-01-23 23:33:18.543377: step: 188/77, loss: 0.07022371888160706 2023-01-23 23:33:20.048972: step: 192/77, loss: 0.08941012620925903 2023-01-23 23:33:21.531594: step: 196/77, loss: 0.17177169024944305 2023-01-23 23:33:22.990324: step: 200/77, loss: 0.0818762332201004 2023-01-23 23:33:24.474436: step: 204/77, loss: 0.1051383763551712 2023-01-23 23:33:25.843404: step: 208/77, loss: 0.13907712697982788 2023-01-23 23:33:27.267195: step: 212/77, loss: 0.050016190856695175 2023-01-23 23:33:28.696983: step: 216/77, loss: 0.05056443437933922 2023-01-23 23:33:30.126654: step: 220/77, loss: 0.07941845059394836 2023-01-23 23:33:31.607073: step: 224/77, loss: 0.08878391236066818 2023-01-23 23:33:33.112762: step: 228/77, loss: 0.0625339150428772 2023-01-23 23:33:34.490923: step: 232/77, loss: 0.06150152161717415 2023-01-23 23:33:35.956479: step: 236/77, loss: 0.11748947948217392 2023-01-23 23:33:37.449234: step: 240/77, loss: 0.18033303320407867 2023-01-23 23:33:38.943754: step: 244/77, loss: 0.14193198084831238 2023-01-23 23:33:40.463939: step: 248/77, loss: 0.08232773840427399 2023-01-23 23:33:41.927624: step: 252/77, loss: 0.2342931628227234 2023-01-23 23:33:43.378750: step: 256/77, loss: 0.08909574151039124 2023-01-23 23:33:44.861586: step: 260/77, loss: 0.05348915234208107 2023-01-23 23:33:46.313313: step: 264/77, loss: 0.1067211776971817 2023-01-23 23:33:47.737354: step: 268/77, loss: 0.23183506727218628 2023-01-23 23:33:49.214113: step: 272/77, loss: 0.1415650099515915 2023-01-23 23:33:50.700163: step: 276/77, loss: 0.06152913719415665 2023-01-23 
23:33:52.150143: step: 280/77, loss: 0.041706740856170654 2023-01-23 23:33:53.639683: step: 284/77, loss: 0.10872595757246017 2023-01-23 23:33:55.133551: step: 288/77, loss: 0.043247997760772705 2023-01-23 23:33:56.597813: step: 292/77, loss: 0.12283715605735779 2023-01-23 23:33:58.064506: step: 296/77, loss: 0.03415264934301376 2023-01-23 23:33:59.490512: step: 300/77, loss: 0.17462123930454254 2023-01-23 23:34:00.975033: step: 304/77, loss: 0.1322343945503235 2023-01-23 23:34:02.530140: step: 308/77, loss: 0.06682181358337402 2023-01-23 23:34:04.073628: step: 312/77, loss: 0.07804687321186066 2023-01-23 23:34:05.490909: step: 316/77, loss: 0.36672520637512207 2023-01-23 23:34:06.921733: step: 320/77, loss: 0.06110672280192375 2023-01-23 23:34:08.394960: step: 324/77, loss: 0.1718875765800476 2023-01-23 23:34:09.903380: step: 328/77, loss: 0.13002590835094452 2023-01-23 23:34:11.358830: step: 332/77, loss: 0.10468481481075287 2023-01-23 23:34:12.843493: step: 336/77, loss: 0.1538790762424469 2023-01-23 23:34:14.228880: step: 340/77, loss: 0.026036838069558144 2023-01-23 23:34:15.727332: step: 344/77, loss: 0.11194135248661041 2023-01-23 23:34:17.228261: step: 348/77, loss: 0.07212279736995697 2023-01-23 23:34:18.740280: step: 352/77, loss: 0.06141982227563858 2023-01-23 23:34:20.230809: step: 356/77, loss: 0.10051367431879044 2023-01-23 23:34:21.744089: step: 360/77, loss: 0.08243393898010254 2023-01-23 23:34:23.187910: step: 364/77, loss: 0.08698487281799316 2023-01-23 23:34:24.661126: step: 368/77, loss: 0.06616237759590149 2023-01-23 23:34:26.155015: step: 372/77, loss: 0.08207230269908905 2023-01-23 23:34:27.660984: step: 376/77, loss: 0.13694001734256744 2023-01-23 23:34:29.061262: step: 380/77, loss: 0.06030358374118805 2023-01-23 23:34:30.567030: step: 384/77, loss: 0.07378076016902924 2023-01-23 23:34:32.045066: step: 388/77, loss: 0.0624019056558609
==================================================
Loss: 0.115
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
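One detail worth decoding from the command line repeated at each epoch: --batch_size 10 with --accumulate_step 4 means gradients from four micro-batches of 10 are accumulated before each optimizer update, for an effective batch size of 40, which would also explain the logged step counter advancing in increments of 4. A minimal sketch of that pattern (toy model and data for illustration only; the real loop lives in train.py, which presumably also applies --xlmr_learning_rate 2e-5 to the encoder and --learning_rate 2e-4 to the heads via separate parameter groups):

```python
import torch
from torch import nn

# Toy stand-ins; train.py builds the real model, optimizer, and loader from its flags.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.AdamW([
    {"params": model[0].parameters(), "lr": 2e-5},  # e.g. encoder (--xlmr_learning_rate)
    {"params": model[2].parameters(), "lr": 2e-4},  # e.g. task heads (--learning_rate)
])
loss_fn = nn.CrossEntropyLoss()
loader = [(torch.randn(10, 8), torch.randint(0, 2, (10,)))  # micro-batches of 10
          for _ in range(8)]
accumulate_step = 4  # --accumulate_step 4

optimizer.zero_grad()
for i, (x, y) in enumerate(loader, start=1):
    loss = loss_fn(model(x), y)
    (loss / accumulate_step).backward()  # scale so the accumulated grads average out
    if i % accumulate_step == 0:
        optimizer.step()   # one update per 4 micro-batches => effective batch of 40
        optimizer.zero_grad()
        print(f"step: {i}, loss: {loss.item():.4f}")
```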
0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} ****************************** Epoch: 2 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-23 23:36:09.129312: step: 4/77, loss: 0.13258124887943268 2023-01-23 23:36:10.577785: step: 8/77, loss: 0.06252895295619965 2023-01-23 23:36:11.996906: step: 12/77, loss: 0.10674737393856049 2023-01-23 23:36:13.512513: step: 16/77, loss: 0.09641920775175095 2023-01-23 23:36:14.957593: step: 20/77, loss: 0.11784550547599792 2023-01-23 23:36:16.416633: step: 24/77, loss: 0.05828585475683212 2023-01-23 23:36:17.917464: step: 28/77, loss: 0.058067552745342255 2023-01-23 23:36:19.460760: step: 32/77, loss: 0.09018880873918533 2023-01-23 23:36:20.860125: step: 36/77, loss: 0.22561988234519958 2023-01-23 23:36:22.277391: step: 40/77, loss: 0.14366985857486725 2023-01-23 23:36:23.776621: step: 44/77, loss: 0.08453567326068878 2023-01-23 23:36:25.180114: step: 48/77, loss: 0.03233639895915985 2023-01-23 23:36:26.628705: step: 52/77, loss: 0.07216435670852661 2023-01-23 23:36:28.056594: step: 56/77, loss: 0.24645262956619263 2023-01-23 23:36:29.556949: step: 60/77, loss: 0.050122007727622986 2023-01-23 23:36:31.029372: step: 64/77, loss: 0.12721282243728638 2023-01-23 23:36:32.442436: step: 68/77, loss: 0.05591835081577301 2023-01-23 23:36:33.941752: step: 72/77, loss: 0.06773615628480911 2023-01-23 23:36:35.351874: step: 76/77, loss: 0.14394433796405792 2023-01-23 23:36:36.750477: step: 80/77, loss: 0.0904054343700409 2023-01-23 23:36:38.272191: step: 84/77, loss: 0.23689547181129456 2023-01-23 23:36:39.777705: step: 88/77, loss: 0.1101958304643631 2023-01-23 23:36:41.237829: step: 92/77, loss: 0.08307532966136932 2023-01-23 23:36:42.695909: step: 96/77, loss: 0.15602004528045654 2023-01-23 23:36:44.135456: step: 100/77, loss: 0.04147651046514511 2023-01-23 23:36:45.639104: step: 104/77, loss: 0.15626007318496704 2023-01-23 23:36:47.194385: step: 108/77, loss: 0.08543263375759125 2023-01-23 23:36:48.589917: step: 112/77, loss: 0.05249679833650589 2023-01-23 23:36:50.143820: step: 116/77, loss: 0.13839921355247498 2023-01-23 23:36:51.621077: step: 120/77, loss: 0.1184510886669159 2023-01-23 23:36:53.071821: step: 124/77, loss: 0.12318076193332672 2023-01-23 23:36:54.543665: step: 128/77, loss: 0.06913600862026215 2023-01-23 23:36:55.999748: step: 132/77, loss: 0.07552852481603622 2023-01-23 23:36:57.486152: step: 136/77, loss: 0.1511237621307373 2023-01-23 23:36:58.939025: step: 140/77, loss: 0.11312386393547058 2023-01-23 23:37:00.394427: step: 
144/77, loss: 0.07943280786275864 2023-01-23 23:37:01.941845: step: 148/77, loss: 0.03293665871024132 2023-01-23 23:37:03.393571: step: 152/77, loss: 0.06744928658008575 2023-01-23 23:37:04.873852: step: 156/77, loss: 0.08263631165027618 2023-01-23 23:37:06.332997: step: 160/77, loss: 0.079817034304142 2023-01-23 23:37:07.822606: step: 164/77, loss: 0.03467145934700966 2023-01-23 23:37:09.262295: step: 168/77, loss: 0.025395743548870087 2023-01-23 23:37:10.741141: step: 172/77, loss: 0.06069394573569298 2023-01-23 23:37:12.245243: step: 176/77, loss: 0.01779540255665779 2023-01-23 23:37:13.680980: step: 180/77, loss: 0.03509168699383736 2023-01-23 23:37:15.137775: step: 184/77, loss: 0.03671012073755264 2023-01-23 23:37:16.646012: step: 188/77, loss: 0.04279708117246628 2023-01-23 23:37:18.154221: step: 192/77, loss: 0.031519681215286255 2023-01-23 23:37:19.626296: step: 196/77, loss: 0.03178836405277252 2023-01-23 23:37:21.087946: step: 200/77, loss: 0.03438074141740799 2023-01-23 23:37:22.522183: step: 204/77, loss: 0.013872837647795677 2023-01-23 23:37:24.000657: step: 208/77, loss: 0.0239429734647274 2023-01-23 23:37:25.397951: step: 212/77, loss: 0.04659315571188927 2023-01-23 23:37:26.853240: step: 216/77, loss: 0.00971127487719059 2023-01-23 23:37:28.369144: step: 220/77, loss: 0.06695039570331573 2023-01-23 23:37:29.817957: step: 224/77, loss: 0.06293109059333801 2023-01-23 23:37:31.363350: step: 228/77, loss: 0.0710817202925682 2023-01-23 23:37:32.902219: step: 232/77, loss: 0.018637798726558685 2023-01-23 23:37:34.420152: step: 236/77, loss: 0.11581932008266449 2023-01-23 23:37:35.869309: step: 240/77, loss: 0.04097363352775574 2023-01-23 23:37:37.348305: step: 244/77, loss: 0.014920370653271675 2023-01-23 23:37:38.846506: step: 248/77, loss: 0.01983986236155033 2023-01-23 23:37:40.264890: step: 252/77, loss: 0.024595849215984344 2023-01-23 23:37:41.749769: step: 256/77, loss: 0.04437322914600372 2023-01-23 23:37:43.236024: step: 260/77, loss: 0.014827568084001541 2023-01-23 23:37:44.749902: step: 264/77, loss: 0.05651269108057022 2023-01-23 23:37:46.202214: step: 268/77, loss: 0.05368447303771973 2023-01-23 23:37:47.695949: step: 272/77, loss: 0.008875405415892601 2023-01-23 23:37:49.174777: step: 276/77, loss: 0.08118952065706253 2023-01-23 23:37:50.620132: step: 280/77, loss: 0.044285401701927185 2023-01-23 23:37:52.069510: step: 284/77, loss: 0.021020062267780304 2023-01-23 23:37:53.555948: step: 288/77, loss: 0.04102979600429535 2023-01-23 23:37:55.040880: step: 292/77, loss: 0.08634628355503082 2023-01-23 23:37:56.479407: step: 296/77, loss: 0.10369566082954407 2023-01-23 23:37:57.979407: step: 300/77, loss: 0.1591092050075531 2023-01-23 23:37:59.503625: step: 304/77, loss: 0.0445716418325901 2023-01-23 23:38:00.984973: step: 308/77, loss: 0.044002607464790344 2023-01-23 23:38:02.509455: step: 312/77, loss: 0.12348321080207825 2023-01-23 23:38:03.958419: step: 316/77, loss: 0.04513373598456383 2023-01-23 23:38:05.444442: step: 320/77, loss: 0.07740610092878342 2023-01-23 23:38:06.964946: step: 324/77, loss: 0.017783014103770256 2023-01-23 23:38:08.424258: step: 328/77, loss: 0.016946876421570778 2023-01-23 23:38:10.004385: step: 332/77, loss: 0.011803146451711655 2023-01-23 23:38:11.496922: step: 336/77, loss: 0.07661925256252289 2023-01-23 23:38:12.939333: step: 340/77, loss: 0.02661910280585289 2023-01-23 23:38:14.429561: step: 344/77, loss: 0.03145609050989151 2023-01-23 23:38:15.923455: step: 348/77, loss: 0.019618254154920578 2023-01-23 23:38:17.405481: step: 352/77, 
loss: 0.07848714292049408 2023-01-23 23:38:18.876099: step: 356/77, loss: 0.04304105043411255 2023-01-23 23:38:20.356927: step: 360/77, loss: 0.04550347849726677 2023-01-23 23:38:21.860912: step: 364/77, loss: 0.019600503146648407 2023-01-23 23:38:23.377094: step: 368/77, loss: 0.0321541503071785 2023-01-23 23:38:24.982288: step: 372/77, loss: 0.08753525465726852 2023-01-23 23:38:26.511340: step: 376/77, loss: 0.041867706924676895 2023-01-23 23:38:28.004078: step: 380/77, loss: 0.03870141878724098 2023-01-23 23:38:29.413386: step: 384/77, loss: 0.05822906270623207 2023-01-23 23:38:30.874642: step: 388/77, loss: 0.03057796321809292
==================================================
Loss: 0.069
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test Chinese: {'template': {'p': 0.9384615384615385, 'r': 0.46564885496183206, 'f1': 0.6224489795918368}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016808748815982093, 'epoch': 2}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test Korean: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2}
New best chinese model...
New best korean model...
New best russian model...
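A note on how these dictionaries fit together: each 'f1' is the harmonic mean of its 'p' and 'r', and 'combined' is the product of the template F1 and the slot F1, which is why it stays at 0.0 until both sub-tasks score hits. The sketch below reproduces the epoch-2 Dev Chinese entry; the "New best ... model" trigger is an assumption (strict improvement of that language's dev combined score), not confirmed from train.py:

    def f1(p, r):
        # Harmonic mean of precision and recall; 0.0 when p + r == 0.
        return 2 * p * r / (p + r) if p + r > 0 else 0.0

    def combined_score(template, slot):
        # The logged 'combined' value equals template F1 * slot F1.
        return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

    # Reproduces the epoch-2 Dev Chinese numbers above.
    template = {'p': 1.0, 'r': 0.5666666666666667}
    slot = {'p': 0.5, 'r': 0.03780718336483932}
    assert abs(f1(**template) - 0.7234042553191489) < 1e-9
    assert abs(combined_score(template, slot) - 0.05085442919642522) < 1e-9

    # Assumed checkpoint rule: "New best <language> model..." fires on a strict
    # improvement of that language's dev combined score.
    best_combined = {'chinese': 0.0, 'korean': 0.0, 'russian': 0.0}
    dev = combined_score(template, slot)
    if dev > best_combined['chinese']:
        best_combined['chinese'] = dev
        print('New best chinese model...')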
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test for Chinese: {'template': {'p': 0.9384615384615385, 'r': 0.46564885496183206, 'f1': 0.6224489795918368}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016808748815982093, 'epoch': 2}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test for Korean: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2}
Test for Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2}
******************************
Epoch: 3
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:40:25.845211: step: 4/77, loss: 0.012912862002849579 2023-01-23 23:40:27.186394: step: 8/77, loss: 0.04591010510921478 2023-01-23 23:40:28.743221: step: 12/77, loss: 0.02694067917764187 2023-01-23 23:40:30.233963: step: 16/77, loss: 0.01571756601333618 2023-01-23 23:40:31.608664: step: 20/77, loss: 0.01355262566357851 2023-01-23 23:40:33.065818: step: 24/77, loss: 0.03795113414525986 2023-01-23 23:40:34.550582: step: 28/77, loss: 0.03535859286785126 2023-01-23 23:40:35.972152: step: 32/77, loss: 0.015470536425709724 2023-01-23 23:40:37.417864: step: 36/77, loss: 0.015092505142092705 2023-01-23 23:40:38.915236: step: 40/77, loss: 0.1292186826467514 2023-01-23 23:40:40.322766: step: 44/77, loss: 0.07205881923437119 2023-01-23 23:40:41.822637: step: 48/77, loss: 0.03737745061516762 2023-01-23 23:40:43.307648: step: 52/77, loss: 0.011360120959579945 2023-01-23 23:40:44.809139: step: 56/77, loss: 0.007079091854393482 2023-01-23 23:40:46.345196: step: 60/77, loss: 0.021989651024341583 2023-01-23 23:40:47.773292: step: 64/77, loss: 0.11524832248687744 2023-01-23 23:40:49.253002: step: 68/77, loss: 0.029087066650390625 2023-01-23 23:40:50.774939: step: 72/77, loss: 0.025555476546287537 2023-01-23 23:40:52.198202: step: 76/77, loss: 0.05321405082941055
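The step counter in these lines advances in fours because --accumulate_step 4 folds four micro-batches of --batch_size 10 into one optimizer update (an effective batch of 40), and a line is logged per update. A minimal sketch of that pattern, with the model, data, and the log's /77 denominator as stand-ins rather than train.py's actual objects:

    import datetime
    import torch

    model = torch.nn.Linear(8, 1)    # stand-in for the real template model
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
    loader = [(torch.randn(10, 8), torch.randn(10, 1)) for _ in range(388)]  # batch_size 10
    accumulate_step = 4              # from the command line above

    for i, (x, y) in enumerate(loader, start=1):
        loss = torch.nn.functional.mse_loss(model(x), y)
        (loss / accumulate_step).backward()   # scale so gradients average over the window
        if i % accumulate_step == 0:          # update fires at steps 4, 8, ..., 388
            optimizer.step()
            optimizer.zero_grad()
            print(f"{datetime.datetime.now()}: step: {i}/77, loss: {loss.item()}")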
2023-01-23 23:40:53.724177: step: 80/77, loss: 0.021694185212254524 2023-01-23 23:40:55.254645: step: 84/77, loss: 0.026616228744387627 2023-01-23 23:40:56.700086: step: 88/77, loss: 0.025496210902929306 2023-01-23 23:40:58.142076: step: 92/77, loss: 0.02640456147491932 2023-01-23 23:40:59.717069: step: 96/77, loss: 0.014219501987099648 2023-01-23 23:41:01.111777: step: 100/77, loss: 0.04648306965827942 2023-01-23 23:41:02.640827: step: 104/77, loss: 0.003235041629523039 2023-01-23 23:41:04.118918: step: 108/77, loss: 0.016798511147499084 2023-01-23 23:41:05.597059: step: 112/77, loss: 0.033600855618715286 2023-01-23 23:41:07.105766: step: 116/77, loss: 0.015783516690135002 2023-01-23 23:41:08.545626: step: 120/77, loss: 0.05280846357345581 2023-01-23 23:41:10.033646: step: 124/77, loss: 0.00289800763130188 2023-01-23 23:41:11.500914: step: 128/77, loss: 0.034957654774188995 2023-01-23 23:41:12.973945: step: 132/77, loss: 0.08969595283269882 2023-01-23 23:41:14.424966: step: 136/77, loss: 0.04882393777370453 2023-01-23 23:41:15.912125: step: 140/77, loss: 0.02632911317050457 2023-01-23 23:41:17.387071: step: 144/77, loss: 0.002386486390605569 2023-01-23 23:41:18.918136: step: 148/77, loss: 0.04905037209391594 2023-01-23 23:41:20.430951: step: 152/77, loss: 0.007842171005904675 2023-01-23 23:41:21.896738: step: 156/77, loss: 0.02815202623605728 2023-01-23 23:41:23.399196: step: 160/77, loss: 0.005176716484129429 2023-01-23 23:41:24.894782: step: 164/77, loss: 0.017684893682599068 2023-01-23 23:41:26.332967: step: 168/77, loss: 0.015900595113635063 2023-01-23 23:41:27.891567: step: 172/77, loss: 0.006918495055288076 2023-01-23 23:41:29.330903: step: 176/77, loss: 0.02818692848086357 2023-01-23 23:41:30.824152: step: 180/77, loss: 0.03534059599041939 2023-01-23 23:41:32.313698: step: 184/77, loss: 0.026743004098534584 2023-01-23 23:41:33.845954: step: 188/77, loss: 0.027218960225582123 2023-01-23 23:41:35.295846: step: 192/77, loss: 0.00607309490442276 2023-01-23 23:41:36.846255: step: 196/77, loss: 0.009721008129417896 2023-01-23 23:41:38.402871: step: 200/77, loss: 0.026713203638792038 2023-01-23 23:41:39.842013: step: 204/77, loss: 0.04019925370812416 2023-01-23 23:41:41.263222: step: 208/77, loss: 0.08135680109262466 2023-01-23 23:41:42.727409: step: 212/77, loss: 0.07656152546405792 2023-01-23 23:41:44.205533: step: 216/77, loss: 0.024162959307432175 2023-01-23 23:41:45.666865: step: 220/77, loss: 0.01666862890124321 2023-01-23 23:41:47.148780: step: 224/77, loss: 0.03890030086040497 2023-01-23 23:41:48.576962: step: 228/77, loss: 0.045673124492168427 2023-01-23 23:41:50.073326: step: 232/77, loss: 0.010527916252613068 2023-01-23 23:41:51.558869: step: 236/77, loss: 0.006743155419826508 2023-01-23 23:41:53.046453: step: 240/77, loss: 0.011435626074671745 2023-01-23 23:41:54.562523: step: 244/77, loss: 0.0381687693297863 2023-01-23 23:41:56.076447: step: 248/77, loss: 0.022242587059736252 2023-01-23 23:41:57.514969: step: 252/77, loss: 0.04305797815322876 2023-01-23 23:41:58.979840: step: 256/77, loss: 0.009767288342118263 2023-01-23 23:42:00.513701: step: 260/77, loss: 0.12234492599964142 2023-01-23 23:42:02.024249: step: 264/77, loss: 0.03097294643521309 2023-01-23 23:42:03.506143: step: 268/77, loss: 0.0573321133852005 2023-01-23 23:42:04.957997: step: 272/77, loss: 0.22089190781116486 2023-01-23 23:42:06.428593: step: 276/77, loss: 0.05211479216814041 2023-01-23 23:42:07.967031: step: 280/77, loss: 0.02089191973209381 2023-01-23 23:42:09.359747: step: 284/77, loss: 0.06037844344973564 
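The "Loss:" figure in each ================== banner is consistent with the arithmetic mean of the per-step losses printed for that epoch, rounded to three decimals (0.115, 0.069, 0.036 so far); a plausible sketch of that bookkeeping, not necessarily train.py's exact code:

    epoch_losses = []

    def log_step(step, loss):
        # One call per logged optimizer step, mirroring the lines above.
        epoch_losses.append(loss)

    def end_epoch():
        print("=" * 50)
        print(f"Loss: {sum(epoch_losses) / len(epoch_losses):.3f}")  # e.g. 0.115, 0.069, 0.036
        epoch_losses.clear()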
2023-01-23 23:42:10.816868: step: 288/77, loss: 0.018641719594597816 2023-01-23 23:42:12.325413: step: 292/77, loss: 0.010771493427455425 2023-01-23 23:42:13.781625: step: 296/77, loss: 0.02911381423473358 2023-01-23 23:42:15.259751: step: 300/77, loss: 0.056641578674316406 2023-01-23 23:42:16.726558: step: 304/77, loss: 0.020282302051782608 2023-01-23 23:42:18.216766: step: 308/77, loss: 0.10549724102020264 2023-01-23 23:42:19.668734: step: 312/77, loss: 0.12837551534175873 2023-01-23 23:42:21.145108: step: 316/77, loss: 0.008430239744484425 2023-01-23 23:42:22.594243: step: 320/77, loss: 0.022032450884580612 2023-01-23 23:42:24.019804: step: 324/77, loss: 0.07047325372695923 2023-01-23 23:42:25.455172: step: 328/77, loss: 0.018198523670434952 2023-01-23 23:42:26.933106: step: 332/77, loss: 0.016754793003201485 2023-01-23 23:42:28.384511: step: 336/77, loss: 0.042492322623729706 2023-01-23 23:42:29.840482: step: 340/77, loss: 0.029950454831123352 2023-01-23 23:42:31.273982: step: 344/77, loss: 0.05285784602165222 2023-01-23 23:42:32.770054: step: 348/77, loss: 0.06085269898176193 2023-01-23 23:42:34.209730: step: 352/77, loss: 0.014680081978440285 2023-01-23 23:42:35.696223: step: 356/77, loss: 0.050214581191539764 2023-01-23 23:42:37.119586: step: 360/77, loss: 0.03656889870762825 2023-01-23 23:42:38.556128: step: 364/77, loss: 0.014042508788406849 2023-01-23 23:42:40.084942: step: 368/77, loss: 0.04989948868751526 2023-01-23 23:42:41.592895: step: 372/77, loss: 0.048607222735881805 2023-01-23 23:42:43.089278: step: 376/77, loss: 0.04302738979458809 2023-01-23 23:42:44.531426: step: 380/77, loss: 0.02379428781569004 2023-01-23 23:42:46.025053: step: 384/77, loss: 0.013878141529858112 2023-01-23 23:42:47.556970: step: 388/77, loss: 0.008634086698293686
==================================================
Loss: 0.036
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 4
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:44:46.122030: step: 4/77, loss: 0.01413442101329565 2023-01-23 23:44:47.601198: step: 8/77, loss: 0.03007185272872448 2023-01-23 23:44:49.061175: step: 12/77, loss: 0.03499831259250641 2023-01-23 23:44:50.478936: step: 16/77, loss: 0.05546431988477707 2023-01-23 23:44:51.914479: step: 20/77, loss: 0.002329618204385042 2023-01-23 23:44:53.384209: step: 24/77, loss: 0.03021444007754326 2023-01-23 23:44:54.909226: step: 28/77, loss: 0.002550596371293068 2023-01-23 23:44:56.401885: step: 32/77, loss: 0.06376893818378448 2023-01-23 23:44:57.872103: step: 36/77, loss: 0.013936012983322144 2023-01-23 23:44:59.312265: step: 40/77, loss: 0.024293333292007446 2023-01-23 23:45:00.794380: step: 44/77, loss: 0.003448844887316227 2023-01-23 23:45:02.262477: step: 48/77, loss: 0.04759862273931503 2023-01-23 23:45:03.714181: step: 52/77, loss: 0.02035803720355034 2023-01-23 23:45:05.247447: step: 56/77, loss: 0.006649450398981571 2023-01-23 23:45:06.688095: step: 60/77, loss: 0.010570206679403782 2023-01-23 
23:45:08.132450: step: 64/77, loss: 0.05664711818099022 2023-01-23 23:45:09.583298: step: 68/77, loss: 0.015202310867607594 2023-01-23 23:45:11.109509: step: 72/77, loss: 0.03192824497818947 2023-01-23 23:45:12.554951: step: 76/77, loss: 0.04990002512931824 2023-01-23 23:45:14.045804: step: 80/77, loss: 0.004372069146484137 2023-01-23 23:45:15.533188: step: 84/77, loss: 0.06885036081075668 2023-01-23 23:45:17.042998: step: 88/77, loss: 0.002442281460389495 2023-01-23 23:45:18.516265: step: 92/77, loss: 0.003796561621129513 2023-01-23 23:45:19.967700: step: 96/77, loss: 0.04918007180094719 2023-01-23 23:45:21.462719: step: 100/77, loss: 0.0023976361844688654 2023-01-23 23:45:22.890935: step: 104/77, loss: 0.060220494866371155 2023-01-23 23:45:24.394911: step: 108/77, loss: 0.019232070073485374 2023-01-23 23:45:25.798712: step: 112/77, loss: 0.03376259282231331 2023-01-23 23:45:27.286811: step: 116/77, loss: 0.03330276161432266 2023-01-23 23:45:28.767138: step: 120/77, loss: 0.01623927801847458 2023-01-23 23:45:30.327568: step: 124/77, loss: 0.008382521569728851 2023-01-23 23:45:31.812683: step: 128/77, loss: 0.025504330173134804 2023-01-23 23:45:33.299650: step: 132/77, loss: 0.016718082129955292 2023-01-23 23:45:34.778139: step: 136/77, loss: 0.03711515665054321 2023-01-23 23:45:36.278272: step: 140/77, loss: 0.04266016557812691 2023-01-23 23:45:37.782434: step: 144/77, loss: 0.05167564004659653 2023-01-23 23:45:39.235002: step: 148/77, loss: 0.003722875379025936 2023-01-23 23:45:40.693089: step: 152/77, loss: 0.009192441590130329 2023-01-23 23:45:42.189144: step: 156/77, loss: 0.03469576686620712 2023-01-23 23:45:43.688446: step: 160/77, loss: 0.034946829080581665 2023-01-23 23:45:45.109695: step: 164/77, loss: 0.007201574742794037 2023-01-23 23:45:46.622332: step: 168/77, loss: 0.004851900972425938 2023-01-23 23:45:48.105542: step: 172/77, loss: 0.03253195434808731 2023-01-23 23:45:49.548529: step: 176/77, loss: 0.02325173281133175 2023-01-23 23:45:51.009732: step: 180/77, loss: 0.026802418753504753 2023-01-23 23:45:52.529329: step: 184/77, loss: 0.01016119122505188 2023-01-23 23:45:53.958158: step: 188/77, loss: 0.03971070796251297 2023-01-23 23:45:55.423714: step: 192/77, loss: 0.018513280898332596 2023-01-23 23:45:56.993639: step: 196/77, loss: 0.017288243398070335 2023-01-23 23:45:58.505067: step: 200/77, loss: 0.021497106179594994 2023-01-23 23:45:59.959630: step: 204/77, loss: 0.01803465560078621 2023-01-23 23:46:01.456480: step: 208/77, loss: 0.01829645223915577 2023-01-23 23:46:02.933692: step: 212/77, loss: 0.005914963781833649 2023-01-23 23:46:04.445092: step: 216/77, loss: 0.0278038140386343 2023-01-23 23:46:05.892822: step: 220/77, loss: 0.016222145408391953 2023-01-23 23:46:07.314588: step: 224/77, loss: 0.010350292548537254 2023-01-23 23:46:08.770799: step: 228/77, loss: 0.034529589116573334 2023-01-23 23:46:10.232838: step: 232/77, loss: 0.006905150134116411 2023-01-23 23:46:11.640444: step: 236/77, loss: 0.014340376481413841 2023-01-23 23:46:13.156000: step: 240/77, loss: 0.043415144085884094 2023-01-23 23:46:14.573480: step: 244/77, loss: 0.0488872304558754 2023-01-23 23:46:16.058581: step: 248/77, loss: 0.003966958727687597 2023-01-23 23:46:17.521997: step: 252/77, loss: 0.030302058905363083 2023-01-23 23:46:18.944488: step: 256/77, loss: 0.012804752215743065 2023-01-23 23:46:20.455165: step: 260/77, loss: 0.08690030872821808 2023-01-23 23:46:21.949582: step: 264/77, loss: 0.01370689831674099 2023-01-23 23:46:23.493925: step: 268/77, loss: 0.14590376615524292 2023-01-23 
23:46:24.925457: step: 272/77, loss: 0.003179072868078947 2023-01-23 23:46:26.378456: step: 276/77, loss: 0.0042409347370266914 2023-01-23 23:46:27.864997: step: 280/77, loss: 0.007665436249226332 2023-01-23 23:46:29.297615: step: 284/77, loss: 0.061556752771139145 2023-01-23 23:46:30.832059: step: 288/77, loss: 0.12139555811882019 2023-01-23 23:46:32.360122: step: 292/77, loss: 0.10933557152748108 2023-01-23 23:46:33.820780: step: 296/77, loss: 0.0040985699743032455 2023-01-23 23:46:35.248335: step: 300/77, loss: 0.012230083346366882 2023-01-23 23:46:36.738104: step: 304/77, loss: 0.11386236548423767 2023-01-23 23:46:38.240333: step: 308/77, loss: 0.0003976405132561922 2023-01-23 23:46:39.798482: step: 312/77, loss: 0.043285876512527466 2023-01-23 23:46:41.267070: step: 316/77, loss: 0.001523602637462318 2023-01-23 23:46:42.718490: step: 320/77, loss: 0.1256551891565323 2023-01-23 23:46:44.200889: step: 324/77, loss: 0.07823194563388824 2023-01-23 23:46:45.698322: step: 328/77, loss: 0.0037024905905127525 2023-01-23 23:46:47.227083: step: 332/77, loss: 0.05983438342809677 2023-01-23 23:46:48.645439: step: 336/77, loss: 0.052032772451639175 2023-01-23 23:46:50.164275: step: 340/77, loss: 0.0024174668360501528 2023-01-23 23:46:51.696430: step: 344/77, loss: 0.0021779723465442657 2023-01-23 23:46:53.217838: step: 348/77, loss: 0.13080300390720367 2023-01-23 23:46:54.698045: step: 352/77, loss: 0.0091065289452672 2023-01-23 23:46:56.139849: step: 356/77, loss: 0.030503802001476288 2023-01-23 23:46:57.627815: step: 360/77, loss: 0.00729318056255579 2023-01-23 23:46:59.130108: step: 364/77, loss: 0.05925387516617775 2023-01-23 23:47:00.645210: step: 368/77, loss: 0.015410172753036022 2023-01-23 23:47:02.197206: step: 372/77, loss: 0.013694683089852333 2023-01-23 23:47:03.679815: step: 376/77, loss: 0.09178860485553741 2023-01-23 23:47:05.061947: step: 380/77, loss: 0.022821731865406036 2023-01-23 23:47:06.594781: step: 384/77, loss: 0.10924479365348816 2023-01-23 23:47:08.109464: step: 388/77, loss: 0.020602600648999214
==================================================
Loss: 0.032
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4}
Test Chinese: {'template': {'p': 0.8767123287671232, 'r': 0.48854961832061067, 'f1': 0.6274509803921569}, 'slot': {'p': 0.45454545454545453, 'r': 0.012942191544434857, 'f1': 0.025167785234899324}, 'combined': 0.01579155151993683, 'epoch': 4}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4}
Test Korean: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 0.46875, 'r': 0.012942191544434857, 'f1': 0.025188916876574305}, 'combined': 0.015882666799022224, 'epoch': 4}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4}
Test Russian: {'template': {'p': 0.8767123287671232, 'r': 0.48854961832061067, 'f1': 0.6274509803921569}, 'slot': {'p': 0.45454545454545453, 'r': 0.012942191544434857, 'f1': 0.025167785234899324}, 'combined': 0.01579155151993683, 'epoch': 4}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 4}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 4}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 4}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 5
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:48:45.884853: step: 4/77, loss: 0.024476751685142517 2023-01-23 23:48:47.413453: step: 8/77, loss: 0.11043231189250946 2023-01-23 23:48:48.911635: step: 12/77, loss: 0.040273018181324005 2023-01-23 23:48:50.394853: step: 16/77, loss: 0.007355161011219025 2023-01-23 23:48:51.869770: step: 20/77, loss: 0.02030954509973526 2023-01-23 23:48:53.310646: step: 24/77, loss: 0.009412623941898346 2023-01-23 23:48:54.796690: step: 28/77, loss: 0.030328869819641113 2023-01-23 23:48:56.240299: step: 32/77, loss: 0.019532401114702225 2023-01-23 23:48:57.700842: step: 36/77, loss: 0.0017457769718021154 2023-01-23 23:48:59.110644: step: 40/77, loss: 0.028483323752880096 2023-01-23 23:49:00.564416: step: 44/77, loss: 0.18267452716827393 2023-01-23 23:49:02.126554: step: 48/77, loss: 0.01519215852022171 2023-01-23 23:49:03.618986: step: 
52/77, loss: 0.023144662380218506 2023-01-23 23:49:05.112780: step: 56/77, loss: 0.02538256347179413 2023-01-23 23:49:06.598128: step: 60/77, loss: 0.008262258023023605 2023-01-23 23:49:08.120220: step: 64/77, loss: 0.019567331299185753 2023-01-23 23:49:09.542740: step: 68/77, loss: 0.004387851804494858 2023-01-23 23:49:10.991240: step: 72/77, loss: 0.03609474003314972 2023-01-23 23:49:12.467700: step: 76/77, loss: 0.027372196316719055 2023-01-23 23:49:13.926823: step: 80/77, loss: 0.08391016721725464 2023-01-23 23:49:15.399802: step: 84/77, loss: 0.05477209761738777 2023-01-23 23:49:16.865887: step: 88/77, loss: 0.018585750833153725 2023-01-23 23:49:18.312036: step: 92/77, loss: 0.03730151802301407 2023-01-23 23:49:19.805137: step: 96/77, loss: 0.04512316733598709 2023-01-23 23:49:21.259342: step: 100/77, loss: 0.057060979306697845 2023-01-23 23:49:22.709622: step: 104/77, loss: 0.032657966017723083 2023-01-23 23:49:24.217441: step: 108/77, loss: 0.016146738082170486 2023-01-23 23:49:25.714145: step: 112/77, loss: 0.0069486647844314575 2023-01-23 23:49:27.193740: step: 116/77, loss: 0.1186465322971344 2023-01-23 23:49:28.627467: step: 120/77, loss: 0.017615636810660362 2023-01-23 23:49:30.079996: step: 124/77, loss: 0.009101053699851036 2023-01-23 23:49:31.553100: step: 128/77, loss: 0.002567005343735218 2023-01-23 23:49:33.020141: step: 132/77, loss: 0.027679648250341415 2023-01-23 23:49:34.460191: step: 136/77, loss: 0.016381870955228806 2023-01-23 23:49:35.908496: step: 140/77, loss: 0.10870490223169327 2023-01-23 23:49:37.408187: step: 144/77, loss: 0.02034395933151245 2023-01-23 23:49:38.904371: step: 148/77, loss: 0.02121562510728836 2023-01-23 23:49:40.374546: step: 152/77, loss: 0.01823749952018261 2023-01-23 23:49:41.911876: step: 156/77, loss: 0.0024375556968152523 2023-01-23 23:49:43.387132: step: 160/77, loss: 0.022238213568925858 2023-01-23 23:49:44.843364: step: 164/77, loss: 0.018150899559259415 2023-01-23 23:49:46.348556: step: 168/77, loss: 0.062192972749471664 2023-01-23 23:49:47.820323: step: 172/77, loss: 0.04013078287243843 2023-01-23 23:49:49.313573: step: 176/77, loss: 0.033303748816251755 2023-01-23 23:49:50.860702: step: 180/77, loss: 0.019885744899511337 2023-01-23 23:49:52.344526: step: 184/77, loss: 0.02491719275712967 2023-01-23 23:49:53.891458: step: 188/77, loss: 0.013048075139522552 2023-01-23 23:49:55.426366: step: 192/77, loss: 0.023345254361629486 2023-01-23 23:49:56.893501: step: 196/77, loss: 0.037435151636600494 2023-01-23 23:49:58.332739: step: 200/77, loss: 0.018141061067581177 2023-01-23 23:49:59.785044: step: 204/77, loss: 0.03752421215176582 2023-01-23 23:50:01.287025: step: 208/77, loss: 0.01851789653301239 2023-01-23 23:50:02.712371: step: 212/77, loss: 0.0015066369669511914 2023-01-23 23:50:04.170477: step: 216/77, loss: 0.050621166825294495 2023-01-23 23:50:05.723056: step: 220/77, loss: 0.08110310137271881 2023-01-23 23:50:07.210223: step: 224/77, loss: 0.04713662341237068 2023-01-23 23:50:08.736189: step: 228/77, loss: 0.017320964485406876 2023-01-23 23:50:10.135904: step: 232/77, loss: 0.025410648435354233 2023-01-23 23:50:11.682584: step: 236/77, loss: 0.011799340136349201 2023-01-23 23:50:13.162573: step: 240/77, loss: 0.10319054871797562 2023-01-23 23:50:14.653551: step: 244/77, loss: 0.01371192466467619 2023-01-23 23:50:16.066794: step: 248/77, loss: 0.011762048117816448 2023-01-23 23:50:17.547910: step: 252/77, loss: 0.07716451585292816 2023-01-23 23:50:19.051607: step: 256/77, loss: 0.006613610312342644 2023-01-23 23:50:20.504496: 
step: 260/77, loss: 0.04181970655918121 2023-01-23 23:50:21.999688: step: 264/77, loss: 0.00952971912920475 2023-01-23 23:50:23.516354: step: 268/77, loss: 0.007909356616437435 2023-01-23 23:50:24.916224: step: 272/77, loss: 0.025044672191143036 2023-01-23 23:50:26.354306: step: 276/77, loss: 0.003298920812085271 2023-01-23 23:50:27.849670: step: 280/77, loss: 0.07326558977365494 2023-01-23 23:50:29.298920: step: 284/77, loss: 0.07228539884090424 2023-01-23 23:50:30.713197: step: 288/77, loss: 0.018365979194641113 2023-01-23 23:50:32.184684: step: 292/77, loss: 0.0013023103820160031 2023-01-23 23:50:33.683030: step: 296/77, loss: 0.01624615490436554 2023-01-23 23:50:35.168774: step: 300/77, loss: 0.03355495631694794 2023-01-23 23:50:36.612623: step: 304/77, loss: 0.0013003923231735826 2023-01-23 23:50:38.102513: step: 308/77, loss: 0.007164771668612957 2023-01-23 23:50:39.546239: step: 312/77, loss: 0.05500699207186699 2023-01-23 23:50:41.009879: step: 316/77, loss: 0.019462941214442253 2023-01-23 23:50:42.448020: step: 320/77, loss: 0.0029178541153669357 2023-01-23 23:50:43.896715: step: 324/77, loss: 0.06747382134199142 2023-01-23 23:50:45.337043: step: 328/77, loss: 0.013694136403501034 2023-01-23 23:50:46.824961: step: 332/77, loss: 0.028184257447719574 2023-01-23 23:50:48.235733: step: 336/77, loss: 0.1709933876991272 2023-01-23 23:50:49.661483: step: 340/77, loss: 0.013454165309667587 2023-01-23 23:50:51.135008: step: 344/77, loss: 0.019300248473882675 2023-01-23 23:50:52.689294: step: 348/77, loss: 0.007079022936522961 2023-01-23 23:50:54.178395: step: 352/77, loss: 0.06331252306699753 2023-01-23 23:50:55.648768: step: 356/77, loss: 0.018890781328082085 2023-01-23 23:50:57.081073: step: 360/77, loss: 0.07067049294710159 2023-01-23 23:50:58.608358: step: 364/77, loss: 0.004420126788318157 2023-01-23 23:51:00.128197: step: 368/77, loss: 0.02071470208466053 2023-01-23 23:51:01.591450: step: 372/77, loss: 0.03721703961491585 2023-01-23 23:51:03.117904: step: 376/77, loss: 0.0021472936496138573 2023-01-23 23:51:04.622905: step: 380/77, loss: 0.0019225344294682145 2023-01-23 23:51:06.084751: step: 384/77, loss: 0.0833422914147377 2023-01-23 23:51:07.542706: step: 388/77, loss: 0.019889041781425476
==================================================
Loss: 0.033
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5}
Test Chinese: {'template': {'p': 0.9027777777777778, 'r': 0.4961832061068702, 'f1': 0.6403940886699507}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.018189974114267607, 'epoch': 5}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5}
Test Korean: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.018103462620010315, 'epoch': 5}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5}
Test Russian: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 0.42105263157894735, 'r': 0.013805004314063849, 'f1': 0.026733500417710946}, 'combined': 0.01685659139638917, 'epoch': 5}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 5}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 5}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 5}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 6
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:52:45.224782: step: 4/77, loss: 0.03039437159895897 2023-01-23 23:52:46.726110: step: 8/77, loss: 0.0026546369772404432 2023-01-23 23:52:48.222268: step: 12/77, loss: 0.028612440451979637 2023-01-23 23:52:49.676279: step: 16/77, loss: 0.030664879828691483 2023-01-23 23:52:51.114078: step: 20/77, loss: 0.0023875534534454346 2023-01-23 23:52:52.596554: step: 24/77, loss: 0.0006607776740565896 2023-01-23 23:52:54.026799: step: 28/77, loss: 0.007618624716997147 2023-01-23 23:52:55.514341: step: 32/77, loss: 0.028325766324996948 2023-01-23 23:52:56.987181: step: 36/77, loss: 0.012421460822224617 2023-01-23 23:52:58.412548: 
step: 40/77, loss: 0.016435008496046066 2023-01-23 23:52:59.856047: step: 44/77, loss: 0.0659736841917038 2023-01-23 23:53:01.371647: step: 48/77, loss: 0.010327223688364029 2023-01-23 23:53:02.892507: step: 52/77, loss: 0.0065464754588902 2023-01-23 23:53:04.433289: step: 56/77, loss: 0.057871297001838684 2023-01-23 23:53:05.828560: step: 60/77, loss: 0.002542986534535885 2023-01-23 23:53:07.374439: step: 64/77, loss: 0.006201583426445723 2023-01-23 23:53:08.886126: step: 68/77, loss: 0.01481205690652132 2023-01-23 23:53:10.339697: step: 72/77, loss: 0.0031775743700563908 2023-01-23 23:53:11.811806: step: 76/77, loss: 0.026160014793276787 2023-01-23 23:53:13.222720: step: 80/77, loss: 0.012620776891708374 2023-01-23 23:53:14.723628: step: 84/77, loss: 0.0677712932229042 2023-01-23 23:53:16.176403: step: 88/77, loss: 0.0394502654671669 2023-01-23 23:53:17.648502: step: 92/77, loss: 0.07249009609222412 2023-01-23 23:53:19.097906: step: 96/77, loss: 0.009052613750100136 2023-01-23 23:53:20.618383: step: 100/77, loss: 0.08241299539804459 2023-01-23 23:53:22.127762: step: 104/77, loss: 0.02526727318763733 2023-01-23 23:53:23.609287: step: 108/77, loss: 0.023446206003427505 2023-01-23 23:53:25.131718: step: 112/77, loss: 0.003657972440123558 2023-01-23 23:53:26.643847: step: 116/77, loss: 0.04917879030108452 2023-01-23 23:53:28.129240: step: 120/77, loss: 0.029167205095291138 2023-01-23 23:53:29.577784: step: 124/77, loss: 0.00030399844399653375 2023-01-23 23:53:31.057855: step: 128/77, loss: 0.02055046521127224 2023-01-23 23:53:32.572386: step: 132/77, loss: 0.03628047555685043 2023-01-23 23:53:34.049513: step: 136/77, loss: 0.000510257261339575 2023-01-23 23:53:35.585857: step: 140/77, loss: 1.6263384168269113e-05 2023-01-23 23:53:37.036709: step: 144/77, loss: 0.04925939813256264 2023-01-23 23:53:38.592925: step: 148/77, loss: 0.01939420960843563 2023-01-23 23:53:40.093687: step: 152/77, loss: 0.023076839745044708 2023-01-23 23:53:41.534910: step: 156/77, loss: 0.019336925819516182 2023-01-23 23:53:43.086961: step: 160/77, loss: 0.052915360778570175 2023-01-23 23:53:44.614030: step: 164/77, loss: 0.010208374820649624 2023-01-23 23:53:46.081239: step: 168/77, loss: 0.010052263736724854 2023-01-23 23:53:47.621051: step: 172/77, loss: 0.018969284370541573 2023-01-23 23:53:49.103332: step: 176/77, loss: 0.011579119600355625 2023-01-23 23:53:50.532895: step: 180/77, loss: 0.0002825237170327455 2023-01-23 23:53:52.074071: step: 184/77, loss: 0.0627252608537674 2023-01-23 23:53:53.558941: step: 188/77, loss: 0.005596342496573925 2023-01-23 23:53:55.025775: step: 192/77, loss: 0.07881352305412292 2023-01-23 23:53:56.480202: step: 196/77, loss: 0.10441139340400696 2023-01-23 23:53:57.922625: step: 200/77, loss: 0.06772737950086594 2023-01-23 23:53:59.491526: step: 204/77, loss: 0.018566543236374855 2023-01-23 23:54:00.985820: step: 208/77, loss: 0.08514667302370071 2023-01-23 23:54:02.535892: step: 212/77, loss: 0.003423793241381645 2023-01-23 23:54:03.962423: step: 216/77, loss: 0.017923181876540184 2023-01-23 23:54:05.455518: step: 220/77, loss: 0.05222643166780472 2023-01-23 23:54:06.885858: step: 224/77, loss: 0.008815684355795383 2023-01-23 23:54:08.384441: step: 228/77, loss: 0.0019420962780714035 2023-01-23 23:54:09.831177: step: 232/77, loss: 0.11282327771186829 2023-01-23 23:54:11.339466: step: 236/77, loss: 0.01948855072259903 2023-01-23 23:54:12.821394: step: 240/77, loss: 0.06035421043634415 2023-01-23 23:54:14.274937: step: 244/77, loss: 0.09459991753101349 2023-01-23 23:54:15.705465: 
step: 248/77, loss: 0.006792946252971888 2023-01-23 23:54:17.140793: step: 252/77, loss: 0.06218063831329346 2023-01-23 23:54:18.569052: step: 256/77, loss: 0.023452896624803543 2023-01-23 23:54:20.080096: step: 260/77, loss: 0.0018449525814503431 2023-01-23 23:54:21.476324: step: 264/77, loss: 0.05207361653447151 2023-01-23 23:54:22.950071: step: 268/77, loss: 0.00742388516664505 2023-01-23 23:54:24.403583: step: 272/77, loss: 0.00440608337521553 2023-01-23 23:54:25.863594: step: 276/77, loss: 0.013198236003518105 2023-01-23 23:54:27.303509: step: 280/77, loss: 0.011632833629846573 2023-01-23 23:54:28.852438: step: 284/77, loss: 0.006646803580224514 2023-01-23 23:54:30.299603: step: 288/77, loss: 0.010615387000143528 2023-01-23 23:54:31.858683: step: 292/77, loss: 0.012356102466583252 2023-01-23 23:54:33.358597: step: 296/77, loss: 0.0011919524986296892 2023-01-23 23:54:34.822095: step: 300/77, loss: 0.04278775304555893 2023-01-23 23:54:36.310947: step: 304/77, loss: 0.05970532447099686 2023-01-23 23:54:37.779585: step: 308/77, loss: 0.03060147911310196 2023-01-23 23:54:39.290235: step: 312/77, loss: 0.01697326824069023 2023-01-23 23:54:40.723476: step: 316/77, loss: 0.040285106748342514 2023-01-23 23:54:42.179367: step: 320/77, loss: 0.020163346081972122 2023-01-23 23:54:43.707203: step: 324/77, loss: 0.007350889965891838 2023-01-23 23:54:45.153718: step: 328/77, loss: 0.01338155660778284 2023-01-23 23:54:46.648418: step: 332/77, loss: 0.022605765610933304 2023-01-23 23:54:48.180287: step: 336/77, loss: 0.03450581803917885 2023-01-23 23:54:49.649156: step: 340/77, loss: 0.03401753678917885 2023-01-23 23:54:51.116876: step: 344/77, loss: 0.011614415794610977 2023-01-23 23:54:52.662947: step: 348/77, loss: 0.04379738122224808 2023-01-23 23:54:54.114034: step: 352/77, loss: 0.16224682331085205 2023-01-23 23:54:55.582446: step: 356/77, loss: 0.00017778918845579028 2023-01-23 23:54:57.058420: step: 360/77, loss: 0.002048301976174116 2023-01-23 23:54:58.469935: step: 364/77, loss: 0.02142212726175785 2023-01-23 23:54:59.963017: step: 368/77, loss: 0.007760262116789818 2023-01-23 23:55:01.516570: step: 372/77, loss: 0.017139358446002007 2023-01-23 23:55:02.986908: step: 376/77, loss: 0.007925149984657764 2023-01-23 23:55:04.473530: step: 380/77, loss: 0.028815656900405884 2023-01-23 23:55:05.893345: step: 384/77, loss: 0.0073576332069933414 2023-01-23 23:55:07.330062: step: 388/77, loss: 0.0014327471144497395
==================================================
Loss: 0.028
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6}
Test Chinese: {'template': {'p': 0.875, 'r': 0.48091603053435117, 'f1': 0.6206896551724138}, 'slot': {'p': 0.3695652173913043, 'r': 0.014667817083692839, 'f1': 0.028215767634854772}, 'combined': 0.017513235083702963, 'epoch': 6}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6}
Test Korean: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.391304347826087, 'r': 0.015530629853321829, 'f1': 0.029875518672199168}, 'combined': 0.018452526238711256, 'epoch': 6}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6}
Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3695652173913043, 'r': 0.014667817083692839, 'f1': 0.028215767634854772}, 'combined': 0.017427385892116187, 'epoch': 6}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 6}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 6}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 6}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 7
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-23 23:56:44.957857: step: 4/77, loss: 0.012997529469430447 2023-01-23 23:56:46.428146: step: 8/77, loss: 0.020570045337080956 2023-01-23 23:56:47.894662: step: 12/77, loss: 0.04421515017747879 2023-01-23 23:56:49.296098: step: 16/77, loss: 0.032468900084495544 2023-01-23 23:56:50.820837: step: 20/77, loss: 0.03809467703104019 2023-01-23 23:56:52.357323: step: 24/77, loss: 0.011496270075440407 2023-01-23 23:56:53.823136: step: 28/77, loss: 
0.019316416233778 2023-01-23 23:56:55.289707: step: 32/77, loss: 0.005474533885717392 2023-01-23 23:56:56.799390: step: 36/77, loss: 0.005868466105312109 2023-01-23 23:56:58.245065: step: 40/77, loss: 0.013153919950127602 2023-01-23 23:56:59.746041: step: 44/77, loss: 0.0031642599496990442 2023-01-23 23:57:01.178386: step: 48/77, loss: 0.005004220642149448 2023-01-23 23:57:02.636647: step: 52/77, loss: 0.01386738196015358 2023-01-23 23:57:04.085719: step: 56/77, loss: 0.013531411997973919 2023-01-23 23:57:05.664294: step: 60/77, loss: 0.01102381944656372 2023-01-23 23:57:07.120490: step: 64/77, loss: 0.012573636136949062 2023-01-23 23:57:08.563173: step: 68/77, loss: 0.0019415427232161164 2023-01-23 23:57:10.014178: step: 72/77, loss: 0.12153831124305725 2023-01-23 23:57:11.478981: step: 76/77, loss: 0.01277611218392849 2023-01-23 23:57:12.932506: step: 80/77, loss: 0.02256181091070175 2023-01-23 23:57:14.405180: step: 84/77, loss: 0.0024817378725856543 2023-01-23 23:57:15.864274: step: 88/77, loss: 0.012858950532972813 2023-01-23 23:57:17.365541: step: 92/77, loss: 6.024163303663954e-05 2023-01-23 23:57:18.826530: step: 96/77, loss: 0.012603029608726501 2023-01-23 23:57:20.288694: step: 100/77, loss: 0.00898762233555317 2023-01-23 23:57:21.779938: step: 104/77, loss: 0.01040840707719326 2023-01-23 23:57:23.183034: step: 108/77, loss: 0.009314004331827164 2023-01-23 23:57:24.701262: step: 112/77, loss: 0.002158994786441326 2023-01-23 23:57:26.157143: step: 116/77, loss: 0.001862498465925455 2023-01-23 23:57:27.565670: step: 120/77, loss: 0.004027359187602997 2023-01-23 23:57:29.092287: step: 124/77, loss: 0.0019363107858225703 2023-01-23 23:57:30.529642: step: 128/77, loss: 0.0674593448638916 2023-01-23 23:57:31.962130: step: 132/77, loss: 0.07844030112028122 2023-01-23 23:57:33.471843: step: 136/77, loss: 0.0004071201547048986 2023-01-23 23:57:34.988054: step: 140/77, loss: 0.040196869522333145 2023-01-23 23:57:36.423596: step: 144/77, loss: 0.02462776005268097 2023-01-23 23:57:37.876458: step: 148/77, loss: 0.018356427550315857 2023-01-23 23:57:39.324935: step: 152/77, loss: 0.02162858285009861 2023-01-23 23:57:40.727369: step: 156/77, loss: 0.009433625265955925 2023-01-23 23:57:42.239253: step: 160/77, loss: 0.009734027087688446 2023-01-23 23:57:43.754764: step: 164/77, loss: 0.0023181047290563583 2023-01-23 23:57:45.274433: step: 168/77, loss: 0.017527658492326736 2023-01-23 23:57:46.760560: step: 172/77, loss: 0.00703967921435833 2023-01-23 23:57:48.216793: step: 176/77, loss: 0.008034227415919304 2023-01-23 23:57:49.708692: step: 180/77, loss: 0.06797732412815094 2023-01-23 23:57:51.168749: step: 184/77, loss: 0.028809472918510437 2023-01-23 23:57:52.657639: step: 188/77, loss: 0.01280111912637949 2023-01-23 23:57:54.157714: step: 192/77, loss: 0.006291474215686321 2023-01-23 23:57:55.609570: step: 196/77, loss: 0.019357847049832344 2023-01-23 23:57:57.058711: step: 200/77, loss: 0.01437158789485693 2023-01-23 23:57:58.503530: step: 204/77, loss: 0.004847629461437464 2023-01-23 23:57:59.956426: step: 208/77, loss: 0.04459906369447708 2023-01-23 23:58:01.422130: step: 212/77, loss: 0.0019956021569669247 2023-01-23 23:58:02.924588: step: 216/77, loss: 0.008743159472942352 2023-01-23 23:58:04.372818: step: 220/77, loss: 0.06492185592651367 2023-01-23 23:58:05.798972: step: 224/77, loss: 0.04172681272029877 2023-01-23 23:58:07.320918: step: 228/77, loss: 0.03955743834376335 2023-01-23 23:58:08.724582: step: 232/77, loss: 0.0006398952100425959 2023-01-23 23:58:10.221933: step: 236/77, 
loss: 0.009599082171916962 2023-01-23 23:58:11.753878: step: 240/77, loss: 0.01301715150475502 2023-01-23 23:58:13.251671: step: 244/77, loss: 0.00035458861384540796 2023-01-23 23:58:14.716646: step: 248/77, loss: 0.007781412452459335 2023-01-23 23:58:16.229882: step: 252/77, loss: 0.02035202831029892 2023-01-23 23:58:17.732115: step: 256/77, loss: 0.012641222216188908 2023-01-23 23:58:19.184466: step: 260/77, loss: 0.05043390765786171 2023-01-23 23:58:20.621634: step: 264/77, loss: 0.011204622685909271 2023-01-23 23:58:22.124217: step: 268/77, loss: 0.004016530700027943 2023-01-23 23:58:23.537315: step: 272/77, loss: 0.002317877020686865 2023-01-23 23:58:25.105685: step: 276/77, loss: 0.0009432684746570885 2023-01-23 23:58:26.599532: step: 280/77, loss: 0.0006387169123627245 2023-01-23 23:58:28.131554: step: 284/77, loss: 0.0016859809402376413 2023-01-23 23:58:29.654003: step: 288/77, loss: 0.03399414196610451 2023-01-23 23:58:31.187174: step: 292/77, loss: 0.015151074156165123 2023-01-23 23:58:32.690069: step: 296/77, loss: 0.02047363668680191 2023-01-23 23:58:34.119540: step: 300/77, loss: 0.038063161075115204 2023-01-23 23:58:35.519568: step: 304/77, loss: 0.00330452062189579 2023-01-23 23:58:36.947677: step: 308/77, loss: 0.011480795219540596 2023-01-23 23:58:38.384768: step: 312/77, loss: 0.019310910254716873 2023-01-23 23:58:39.824614: step: 316/77, loss: 0.015391256660223007 2023-01-23 23:58:41.351166: step: 320/77, loss: 0.010779057629406452 2023-01-23 23:58:42.773912: step: 324/77, loss: 0.007522165309637785 2023-01-23 23:58:44.267190: step: 328/77, loss: 0.02136208489537239 2023-01-23 23:58:45.738412: step: 332/77, loss: 0.01059940829873085 2023-01-23 23:58:47.215686: step: 336/77, loss: 0.00015911197988316417 2023-01-23 23:58:48.725816: step: 340/77, loss: 0.050910405814647675 2023-01-23 23:58:50.142129: step: 344/77, loss: 0.00024434231454506516 2023-01-23 23:58:51.598258: step: 348/77, loss: 0.010583743453025818 2023-01-23 23:58:53.093410: step: 352/77, loss: 0.025950804352760315 2023-01-23 23:58:54.514480: step: 356/77, loss: 0.024648727849125862 2023-01-23 23:58:55.965483: step: 360/77, loss: 0.010220241732895374 2023-01-23 23:58:57.456495: step: 364/77, loss: 0.042353857308626175 2023-01-23 23:58:58.888959: step: 368/77, loss: 0.01584051176905632 2023-01-23 23:59:00.365115: step: 372/77, loss: 0.02006104774773121 2023-01-23 23:59:01.914740: step: 376/77, loss: 0.020328937098383904 2023-01-23 23:59:03.400287: step: 380/77, loss: 0.020396513864398003 2023-01-23 23:59:04.879332: step: 384/77, loss: 0.03346937522292137 2023-01-23 23:59:06.406951: step: 388/77, loss: 0.061174191534519196 ================================================== Loss: 0.019 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 7} Test Chinese: {'template': {'p': 0.8955223880597015, 'r': 0.4580152671755725, 'f1': 0.6060606060606061}, 'slot': {'p': 0.45, 'r': 0.015530629853321829, 'f1': 0.030025020850708923}, 'combined': 0.018196982333762983, 'epoch': 7} Dev Korean: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.04756215508903682, 'epoch': 7} Test Korean: {'template': {'p': 0.8939393939393939, 'r': 0.45038167938931295, 'f1': 0.598984771573604}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 
'combined': 0.017027995178513826, 'epoch': 7} Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 7} Test Russian: {'template': {'p': 0.9076923076923077, 'r': 0.45038167938931295, 'f1': 0.6020408163265306}, 'slot': {'p': 0.4864864864864865, 'r': 0.015530629853321829, 'f1': 0.030100334448160532}, 'combined': 0.018121629922872157, 'epoch': 7} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 7} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 7} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 7} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 8 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:00:44.066731: step: 4/77, loss: 0.0067916191183030605 2023-01-24 00:00:45.501920: step: 8/77, loss: 0.013048367574810982 2023-01-24 00:00:46.977999: step: 12/77, loss: 0.03453684225678444 2023-01-24 00:00:48.426429: step: 16/77, loss: 0.00477053876966238 2023-01-24 
00:00:49.895100: step: 20/77, loss: 0.022574584931135178 2023-01-24 00:00:51.382235: step: 24/77, loss: 0.03624827042222023 2023-01-24 00:00:52.835385: step: 28/77, loss: 0.013998491689562798 2023-01-24 00:00:54.347832: step: 32/77, loss: 0.007175574544817209 2023-01-24 00:00:55.866514: step: 36/77, loss: 0.029678918421268463 2023-01-24 00:00:57.345706: step: 40/77, loss: 0.026228811591863632 2023-01-24 00:00:58.767706: step: 44/77, loss: 0.0012878580018877983 2023-01-24 00:01:00.250201: step: 48/77, loss: 0.010943735018372536 2023-01-24 00:01:01.747157: step: 52/77, loss: 0.0012138265883550048 2023-01-24 00:01:03.237429: step: 56/77, loss: 0.017435984686017036 2023-01-24 00:01:04.739310: step: 60/77, loss: 0.00820908322930336 2023-01-24 00:01:06.237898: step: 64/77, loss: 0.21635589003562927 2023-01-24 00:01:07.796313: step: 68/77, loss: 0.09472758322954178 2023-01-24 00:01:09.230142: step: 72/77, loss: 0.030867278575897217 2023-01-24 00:01:10.717878: step: 76/77, loss: 0.008024647831916809 2023-01-24 00:01:12.186678: step: 80/77, loss: 0.0015849823830649257 2023-01-24 00:01:13.658566: step: 84/77, loss: 0.019332479685544968 2023-01-24 00:01:15.129086: step: 88/77, loss: 0.003665961092337966 2023-01-24 00:01:16.564622: step: 92/77, loss: 0.04837939888238907 2023-01-24 00:01:18.042526: step: 96/77, loss: 0.007896811701357365 2023-01-24 00:01:19.561236: step: 100/77, loss: 0.017443090677261353 2023-01-24 00:01:21.184592: step: 104/77, loss: 0.04705725982785225 2023-01-24 00:01:22.595504: step: 108/77, loss: 0.0034224099945276976 2023-01-24 00:01:24.040364: step: 112/77, loss: 0.0312179122120142 2023-01-24 00:01:25.557600: step: 116/77, loss: 0.029161768034100533 2023-01-24 00:01:27.034353: step: 120/77, loss: 0.005436992272734642 2023-01-24 00:01:28.507568: step: 124/77, loss: 0.008618153631687164 2023-01-24 00:01:29.973844: step: 128/77, loss: 0.0025659631937742233 2023-01-24 00:01:31.447640: step: 132/77, loss: 0.004739833064377308 2023-01-24 00:01:32.927568: step: 136/77, loss: 0.12532849609851837 2023-01-24 00:01:34.377944: step: 140/77, loss: 0.012651849538087845 2023-01-24 00:01:35.791332: step: 144/77, loss: 0.0029987136367708445 2023-01-24 00:01:37.291043: step: 148/77, loss: 0.0017156063113361597 2023-01-24 00:01:38.774332: step: 152/77, loss: 0.011197465471923351 2023-01-24 00:01:40.208534: step: 156/77, loss: 0.05191802978515625 2023-01-24 00:01:41.733409: step: 160/77, loss: 0.01135370321571827 2023-01-24 00:01:43.232464: step: 164/77, loss: 0.03425750881433487 2023-01-24 00:01:44.699957: step: 168/77, loss: 0.009411108680069447 2023-01-24 00:01:46.133633: step: 172/77, loss: 0.010142726823687553 2023-01-24 00:01:47.541991: step: 176/77, loss: 0.0025873896665871143 2023-01-24 00:01:49.083171: step: 180/77, loss: 0.017334995791316032 2023-01-24 00:01:50.578849: step: 184/77, loss: 0.004583724774420261 2023-01-24 00:01:52.039899: step: 188/77, loss: 0.006915468256920576 2023-01-24 00:01:53.411961: step: 192/77, loss: 0.013365627266466618 2023-01-24 00:01:54.843560: step: 196/77, loss: 0.0026318528689444065 2023-01-24 00:01:56.288844: step: 200/77, loss: 0.013050251640379429 2023-01-24 00:01:57.782450: step: 204/77, loss: 0.01102149672806263 2023-01-24 00:01:59.285125: step: 208/77, loss: 0.005182032473385334 2023-01-24 00:02:00.824503: step: 212/77, loss: 0.032337501645088196 2023-01-24 00:02:02.325330: step: 216/77, loss: 0.025232627987861633 2023-01-24 00:02:03.854110: step: 220/77, loss: 0.026668911799788475 2023-01-24 00:02:05.305480: step: 224/77, loss: 0.0707787573337555 
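The losses in this log are printed at steps 4, 8, 12, ..., i.e. every fourth micro-batch, which matches --accumulate_step 4 in the command above: with --batch_size 10, each optimizer update appears to average gradients over 40 examples, and the counter running to 388 micro-batches per epoch would mean 97 updates. (The "/77" denominator does not match the 388 steps actually logged, so it presumably counts something other than micro-batches in the real script.) A minimal sketch of this logging cadence, assuming a standard PyTorch gradient-accumulation loop; model, optimizer, and loader here are illustrative stand-ins, not objects from the repository's train.py:

import torch

# Illustrative stand-ins only; the real train.py builds these from its flags.
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
loader = [(torch.randn(10, 8), torch.randn(10, 1)) for _ in range(388)]
accumulate_step = 4  # --accumulate_step 4

for i, (x, y) in enumerate(loader, start=1):
    # Scale each micro-batch loss so the accumulated gradient is an average.
    loss = torch.nn.functional.mse_loss(model(x), y) / accumulate_step
    loss.backward()                # gradients accumulate across micro-batches
    if i % accumulate_step == 0:   # one update (and one log line) per 4 batches
        optimizer.step()
        optimizer.zero_grad()
        print(f"step: {i}/77, loss: {loss.item() * accumulate_step:.6f}")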
2023-01-24 00:02:06.797214: step: 228/77, loss: 0.018818672746419907 2023-01-24 00:02:08.198397: step: 232/77, loss: 0.0017937154043465853 2023-01-24 00:02:09.687770: step: 236/77, loss: 0.0008982113795354962 2023-01-24 00:02:11.243836: step: 240/77, loss: 0.08587608486413956 2023-01-24 00:02:12.703054: step: 244/77, loss: 0.021022209897637367 2023-01-24 00:02:14.140595: step: 248/77, loss: 0.05474882945418358 2023-01-24 00:02:15.676072: step: 252/77, loss: 0.004854440223425627 2023-01-24 00:02:17.096175: step: 256/77, loss: 0.013054034672677517 2023-01-24 00:02:18.590679: step: 260/77, loss: 0.022247303277254105 2023-01-24 00:02:20.092617: step: 264/77, loss: 0.010746228508651257 2023-01-24 00:02:21.484150: step: 268/77, loss: 0.06552846729755402 2023-01-24 00:02:22.993942: step: 272/77, loss: 0.005899173207581043 2023-01-24 00:02:24.505943: step: 276/77, loss: 0.006159973330795765 2023-01-24 00:02:26.012343: step: 280/77, loss: 0.12346875667572021 2023-01-24 00:02:27.433652: step: 284/77, loss: 0.011204426176846027 2023-01-24 00:02:28.847211: step: 288/77, loss: 0.018768085166811943 2023-01-24 00:02:30.249683: step: 292/77, loss: 0.008486710488796234 2023-01-24 00:02:31.704581: step: 296/77, loss: 0.036354124546051025 2023-01-24 00:02:33.213608: step: 300/77, loss: 0.0038927635177969933 2023-01-24 00:02:34.634432: step: 304/77, loss: 0.002151659457013011 2023-01-24 00:02:36.149081: step: 308/77, loss: 0.0006010127253830433 2023-01-24 00:02:37.683307: step: 312/77, loss: 0.038890399038791656 2023-01-24 00:02:39.086658: step: 316/77, loss: 0.015262553468346596 2023-01-24 00:02:40.569050: step: 320/77, loss: 0.004743649158626795 2023-01-24 00:02:42.079703: step: 324/77, loss: 0.05304583162069321 2023-01-24 00:02:43.558564: step: 328/77, loss: 0.021255943924188614 2023-01-24 00:02:44.985889: step: 332/77, loss: 0.004364157095551491 2023-01-24 00:02:46.445743: step: 336/77, loss: 0.010830073617398739 2023-01-24 00:02:47.927522: step: 340/77, loss: 0.013078930787742138 2023-01-24 00:02:49.472508: step: 344/77, loss: 0.10247281193733215 2023-01-24 00:02:50.921500: step: 348/77, loss: 0.04736286401748657 2023-01-24 00:02:52.338842: step: 352/77, loss: 0.033875543624162674 2023-01-24 00:02:53.822884: step: 356/77, loss: 0.0046717822551727295 2023-01-24 00:02:55.281960: step: 360/77, loss: 0.011814633384346962 2023-01-24 00:02:56.674308: step: 364/77, loss: 0.0285545215010643 2023-01-24 00:02:58.036647: step: 368/77, loss: 0.015754615887999535 2023-01-24 00:02:59.505902: step: 372/77, loss: 0.05176156014204025 2023-01-24 00:03:01.055159: step: 376/77, loss: 0.01782143861055374 2023-01-24 00:03:02.586349: step: 380/77, loss: 0.011716475710272789 2023-01-24 00:03:04.117498: step: 384/77, loss: 0.0015753823099657893 2023-01-24 00:03:05.600287: step: 388/77, loss: 0.04279303923249245 ================================================== Loss: 0.024 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Chinese: {'template': {'p': 0.8805970149253731, 'r': 0.45038167938931295, 'f1': 0.5959595959595959}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.015972116474629035, 'epoch': 8} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 
'epoch': 8} Test Korean: {'template': {'p': 0.8939393939393939, 'r': 0.45038167938931295, 'f1': 0.598984771573604}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016053193208002785, 'epoch': 8} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Russian: {'template': {'p': 0.8823529411764706, 'r': 0.4580152671755725, 'f1': 0.6030150753768845}, 'slot': {'p': 0.43243243243243246, 'r': 0.013805004314063849, 'f1': 0.026755852842809368}, 'combined': 0.01613418261877952, 'epoch': 8} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 8} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 9 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:04:43.240170: step: 4/77, loss: 0.00931814406067133 2023-01-24 00:04:44.713253: step: 8/77, loss: 
0.016727566719055176 2023-01-24 00:04:46.155335: step: 12/77, loss: 0.02211870439350605 2023-01-24 00:04:47.592891: step: 16/77, loss: 0.011608324013650417 2023-01-24 00:04:49.094585: step: 20/77, loss: 0.010061085224151611 2023-01-24 00:04:50.559488: step: 24/77, loss: 0.010383963584899902 2023-01-24 00:04:52.018643: step: 28/77, loss: 0.03394642099738121 2023-01-24 00:04:53.508359: step: 32/77, loss: 0.0005709524848498404 2023-01-24 00:04:55.014273: step: 36/77, loss: 0.010349091142416 2023-01-24 00:04:56.465876: step: 40/77, loss: 0.002377279568463564 2023-01-24 00:04:57.941779: step: 44/77, loss: 0.009859619662165642 2023-01-24 00:04:59.435974: step: 48/77, loss: 0.010785480029881 2023-01-24 00:05:00.906952: step: 52/77, loss: 0.021909940987825394 2023-01-24 00:05:02.372048: step: 56/77, loss: 0.0010997147765010595 2023-01-24 00:05:03.839914: step: 60/77, loss: 0.005814129486680031 2023-01-24 00:05:05.349035: step: 64/77, loss: 0.0011625312035903335 2023-01-24 00:05:06.770540: step: 68/77, loss: 0.02374696359038353 2023-01-24 00:05:08.266463: step: 72/77, loss: 0.010308654978871346 2023-01-24 00:05:09.731014: step: 76/77, loss: 0.0102377999573946 2023-01-24 00:05:11.242470: step: 80/77, loss: 0.012174428440630436 2023-01-24 00:05:12.767209: step: 84/77, loss: 0.00158389238640666 2023-01-24 00:05:14.236935: step: 88/77, loss: 0.0022693369537591934 2023-01-24 00:05:15.703307: step: 92/77, loss: 0.0003267589781899005 2023-01-24 00:05:17.133082: step: 96/77, loss: 0.03900589048862457 2023-01-24 00:05:18.611165: step: 100/77, loss: 0.01670156605541706 2023-01-24 00:05:20.061564: step: 104/77, loss: 0.001030667801387608 2023-01-24 00:05:21.498775: step: 108/77, loss: 0.00393590796738863 2023-01-24 00:05:22.952775: step: 112/77, loss: 0.04139762371778488 2023-01-24 00:05:24.417215: step: 116/77, loss: 0.0035628098994493484 2023-01-24 00:05:25.874964: step: 120/77, loss: 0.027284199371933937 2023-01-24 00:05:27.315537: step: 124/77, loss: 0.010536571964621544 2023-01-24 00:05:28.759322: step: 128/77, loss: 0.012623676098883152 2023-01-24 00:05:30.231799: step: 132/77, loss: 0.13224123418331146 2023-01-24 00:05:31.664943: step: 136/77, loss: 0.11464852094650269 2023-01-24 00:05:33.102741: step: 140/77, loss: 0.0009070251835510135 2023-01-24 00:05:34.596166: step: 144/77, loss: 0.034760989248752594 2023-01-24 00:05:36.066176: step: 148/77, loss: 0.025749383494257927 2023-01-24 00:05:37.589558: step: 152/77, loss: 0.026630792766809464 2023-01-24 00:05:38.980976: step: 156/77, loss: 0.00897565670311451 2023-01-24 00:05:40.439601: step: 160/77, loss: 0.029134489595890045 2023-01-24 00:05:41.897326: step: 164/77, loss: 0.0009648214327171445 2023-01-24 00:05:43.406247: step: 168/77, loss: 0.0089799165725708 2023-01-24 00:05:44.936024: step: 172/77, loss: 0.06454553455114365 2023-01-24 00:05:46.359529: step: 176/77, loss: 0.0012069199001416564 2023-01-24 00:05:47.844637: step: 180/77, loss: 0.05223080515861511 2023-01-24 00:05:49.294388: step: 184/77, loss: 0.015568587929010391 2023-01-24 00:05:50.746407: step: 188/77, loss: 0.011848854832351208 2023-01-24 00:05:52.203195: step: 192/77, loss: 0.013729307800531387 2023-01-24 00:05:53.663625: step: 196/77, loss: 0.012604327872395515 2023-01-24 00:05:55.127257: step: 200/77, loss: 0.032689254730939865 2023-01-24 00:05:56.572556: step: 204/77, loss: 0.009877309203147888 2023-01-24 00:05:58.043640: step: 208/77, loss: 0.008456312119960785 2023-01-24 00:05:59.485702: step: 212/77, loss: 0.009990466758608818 2023-01-24 00:06:00.968522: step: 216/77, loss: 
0.012410677969455719 2023-01-24 00:06:02.551051: step: 220/77, loss: 0.0003021705197170377 2023-01-24 00:06:04.085865: step: 224/77, loss: 0.07351236045360565 2023-01-24 00:06:05.611575: step: 228/77, loss: 0.0010626473231241107 2023-01-24 00:06:07.045031: step: 232/77, loss: 0.030719848349690437 2023-01-24 00:06:08.474368: step: 236/77, loss: 0.017361463978886604 2023-01-24 00:06:09.923900: step: 240/77, loss: 0.005429819226264954 2023-01-24 00:06:11.417216: step: 244/77, loss: 0.0007189378957264125 2023-01-24 00:06:12.890395: step: 248/77, loss: 0.02332794852554798 2023-01-24 00:06:14.255266: step: 252/77, loss: 6.61729573039338e-05 2023-01-24 00:06:15.744463: step: 256/77, loss: 0.017682623118162155 2023-01-24 00:06:17.276787: step: 260/77, loss: 0.00020931556355208158 2023-01-24 00:06:18.756199: step: 264/77, loss: 0.004470092244446278 2023-01-24 00:06:20.309865: step: 268/77, loss: 0.0027932701632380486 2023-01-24 00:06:21.785434: step: 272/77, loss: 0.00921584665775299 2023-01-24 00:06:23.227747: step: 276/77, loss: 0.0022591277956962585 2023-01-24 00:06:24.708354: step: 280/77, loss: 0.00729965977370739 2023-01-24 00:06:26.198819: step: 284/77, loss: 0.06238330900669098 2023-01-24 00:06:27.675034: step: 288/77, loss: 0.04654904827475548 2023-01-24 00:06:29.200137: step: 292/77, loss: 0.017025865614414215 2023-01-24 00:06:30.751482: step: 296/77, loss: 0.010817298665642738 2023-01-24 00:06:32.222519: step: 300/77, loss: 0.009884810075163841 2023-01-24 00:06:33.669949: step: 304/77, loss: 0.0028474931605160236 2023-01-24 00:06:35.172380: step: 308/77, loss: 0.054710544645786285 2023-01-24 00:06:36.631092: step: 312/77, loss: 0.0037204334512352943 2023-01-24 00:06:38.139344: step: 316/77, loss: 0.014074725098907948 2023-01-24 00:06:39.625962: step: 320/77, loss: 0.014103882946074009 2023-01-24 00:06:41.160465: step: 324/77, loss: 0.01592918299138546 2023-01-24 00:06:42.685354: step: 328/77, loss: 0.018581323325634003 2023-01-24 00:06:44.182095: step: 332/77, loss: 0.015413613058626652 2023-01-24 00:06:45.629521: step: 336/77, loss: 0.011257076635956764 2023-01-24 00:06:47.118379: step: 340/77, loss: 0.006358277052640915 2023-01-24 00:06:48.581805: step: 344/77, loss: 0.00376639561727643 2023-01-24 00:06:50.020700: step: 348/77, loss: 0.0034146499820053577 2023-01-24 00:06:51.550137: step: 352/77, loss: 0.0217830128967762 2023-01-24 00:06:53.036496: step: 356/77, loss: 0.008566494099795818 2023-01-24 00:06:54.477939: step: 360/77, loss: 0.024732094258069992 2023-01-24 00:06:55.920741: step: 364/77, loss: 0.020356018096208572 2023-01-24 00:06:57.385879: step: 368/77, loss: 5.2446688641794026e-05 2023-01-24 00:06:58.696748: step: 372/77, loss: 0.0001511536247562617 2023-01-24 00:07:00.160299: step: 376/77, loss: 0.01291133277118206 2023-01-24 00:07:01.682690: step: 380/77, loss: 0.006426115054637194 2023-01-24 00:07:03.221744: step: 384/77, loss: 0.15697544813156128 2023-01-24 00:07:04.738247: step: 388/77, loss: 0.08485838770866394 ================================================== Loss: 0.019 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Chinese: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5, 'r': 0.013805004314063849, 'f1': 0.026868178001679264}, 'combined': 0.016366402843662493, 'epoch': 9} Dev Korean: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Korean: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.4838709677419355, 'r': 0.012942191544434857, 'f1': 0.025210084033613443}, 'combined': 0.015533486121721413, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Russian: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01757507965561657, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 9} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 10 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 
--event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:08:42.920790: step: 4/77, loss: 0.0037124394439160824 2023-01-24 00:08:44.333561: step: 8/77, loss: 0.018964888527989388 2023-01-24 00:08:45.827443: step: 12/77, loss: 0.010961982421576977 2023-01-24 00:08:47.223396: step: 16/77, loss: 0.00022207949950825423 2023-01-24 00:08:48.719287: step: 20/77, loss: 0.02525768056511879 2023-01-24 00:08:50.248384: step: 24/77, loss: 0.006949287373572588 2023-01-24 00:08:51.682101: step: 28/77, loss: 0.017542913556098938 2023-01-24 00:08:53.150085: step: 32/77, loss: 0.007600500714033842 2023-01-24 00:08:54.661630: step: 36/77, loss: 0.07663308084011078 2023-01-24 00:08:56.144157: step: 40/77, loss: 0.01322341151535511 2023-01-24 00:08:57.623931: step: 44/77, loss: 0.0020390732679516077 2023-01-24 00:08:59.100552: step: 48/77, loss: 0.01980689913034439 2023-01-24 00:09:00.551276: step: 52/77, loss: 0.00036896448000334203 2023-01-24 00:09:02.050061: step: 56/77, loss: 0.012697389349341393 2023-01-24 00:09:03.563800: step: 60/77, loss: 0.04382698982954025 2023-01-24 00:09:05.095766: step: 64/77, loss: 0.009921489283442497 2023-01-24 00:09:06.480419: step: 68/77, loss: 0.0007633566856384277 2023-01-24 00:09:07.910699: step: 72/77, loss: 0.004064435604959726 2023-01-24 00:09:09.338963: step: 76/77, loss: 0.02458122745156288 2023-01-24 00:09:10.837922: step: 80/77, loss: 0.023752877488732338 2023-01-24 00:09:12.273417: step: 84/77, loss: 0.0001711217628326267 2023-01-24 00:09:13.820343: step: 88/77, loss: 0.06998871266841888 2023-01-24 00:09:15.276266: step: 92/77, loss: 0.009247813373804092 2023-01-24 00:09:16.751828: step: 96/77, loss: 0.005743591580539942 2023-01-24 00:09:18.156451: step: 100/77, loss: 0.01699553057551384 2023-01-24 00:09:19.633258: step: 104/77, loss: 0.011396847665309906 2023-01-24 00:09:21.186503: step: 108/77, loss: 0.08990509808063507 2023-01-24 00:09:22.618634: step: 112/77, loss: 0.004237010609358549 2023-01-24 00:09:24.082505: step: 116/77, loss: 0.0958765372633934 2023-01-24 00:09:25.577082: step: 120/77, loss: 0.014597264118492603 2023-01-24 00:09:27.113391: step: 124/77, loss: 0.021818269044160843 2023-01-24 00:09:28.619356: step: 128/77, loss: 0.0006683270330540836 2023-01-24 00:09:30.045957: step: 132/77, loss: 0.016099683940410614 2023-01-24 00:09:31.636491: step: 136/77, loss: 0.004529166035354137 2023-01-24 00:09:33.198579: step: 140/77, loss: 0.007304504048079252 2023-01-24 00:09:34.689008: step: 144/77, loss: 0.0021621109917759895 2023-01-24 00:09:36.180853: step: 148/77, loss: 0.038148678839206696 2023-01-24 00:09:37.685069: step: 152/77, loss: 0.002339947270229459 2023-01-24 00:09:39.188934: step: 156/77, loss: 0.010671212337911129 2023-01-24 00:09:40.598819: step: 160/77, loss: 0.0012018646812066436 2023-01-24 00:09:42.092162: step: 164/77, loss: 0.0007029086700640619 2023-01-24 00:09:43.533299: step: 168/77, loss: 0.05930791795253754 2023-01-24 00:09:45.061684: step: 172/77, loss: 0.06313134729862213 2023-01-24 00:09:46.610845: step: 176/77, loss: 0.0035771233960986137 2023-01-24 00:09:48.137394: step: 180/77, loss: 0.033747103065252304 2023-01-24 00:09:49.639525: step: 184/77, loss: 0.01955796591937542 2023-01-24 00:09:51.124806: step: 188/77, loss: 0.0012469518696889281 2023-01-24 00:09:52.594421: step: 192/77, loss: 0.03592196851968765 2023-01-24 00:09:54.062116: step: 196/77, loss: 0.009159698151051998 2023-01-24 00:09:55.520748: step: 200/77, loss: 0.05379951000213623 2023-01-24 00:09:56.922205: 
step: 204/77, loss: 0.0014175847172737122 2023-01-24 00:09:58.410315: step: 208/77, loss: 0.0003584384103305638 2023-01-24 00:09:59.899873: step: 212/77, loss: 0.0024745934642851353 2023-01-24 00:10:01.375064: step: 216/77, loss: 0.003314611967653036 2023-01-24 00:10:02.977574: step: 220/77, loss: 0.028728632256388664 2023-01-24 00:10:04.378495: step: 224/77, loss: 0.00189878954552114 2023-01-24 00:10:05.859607: step: 228/77, loss: 0.001966055715456605 2023-01-24 00:10:07.272744: step: 232/77, loss: 0.01825111173093319 2023-01-24 00:10:08.789216: step: 236/77, loss: 0.09419666230678558 2023-01-24 00:10:10.201131: step: 240/77, loss: 0.019170410931110382 2023-01-24 00:10:11.674993: step: 244/77, loss: 0.020276600494980812 2023-01-24 00:10:13.079797: step: 248/77, loss: 0.0032611675560474396 2023-01-24 00:10:14.609568: step: 252/77, loss: 0.09802025556564331 2023-01-24 00:10:15.996351: step: 256/77, loss: 0.011103980243206024 2023-01-24 00:10:17.410325: step: 260/77, loss: 0.016340192407369614 2023-01-24 00:10:18.928650: step: 264/77, loss: 0.001242226455360651 2023-01-24 00:10:20.404173: step: 268/77, loss: 0.03747468814253807 2023-01-24 00:10:21.837643: step: 272/77, loss: 0.028045082464814186 2023-01-24 00:10:23.311624: step: 276/77, loss: 0.016944503411650658 2023-01-24 00:10:24.816034: step: 280/77, loss: 0.01631280407309532 2023-01-24 00:10:26.254984: step: 284/77, loss: 0.030796894803643227 2023-01-24 00:10:27.706545: step: 288/77, loss: 0.005280392710119486 2023-01-24 00:10:29.142324: step: 292/77, loss: 0.10407906025648117 2023-01-24 00:10:30.616153: step: 296/77, loss: 0.00925945583730936 2023-01-24 00:10:32.134588: step: 300/77, loss: 0.005232213530689478 2023-01-24 00:10:33.630148: step: 304/77, loss: 0.03128921985626221 2023-01-24 00:10:35.098014: step: 308/77, loss: 0.00042872564517892897 2023-01-24 00:10:36.607953: step: 312/77, loss: 0.06519351154565811 2023-01-24 00:10:38.126525: step: 316/77, loss: 0.0005023244884796441 2023-01-24 00:10:39.553573: step: 320/77, loss: 0.0007901216158643365 2023-01-24 00:10:41.078204: step: 324/77, loss: 0.038823582231998444 2023-01-24 00:10:42.506689: step: 328/77, loss: 0.03915676474571228 2023-01-24 00:10:43.970083: step: 332/77, loss: 0.004828822799026966 2023-01-24 00:10:45.415430: step: 336/77, loss: 0.008032601326704025 2023-01-24 00:10:46.930818: step: 340/77, loss: 0.02326280064880848 2023-01-24 00:10:48.402920: step: 344/77, loss: 0.0006932779215276241 2023-01-24 00:10:49.888168: step: 348/77, loss: 0.0041923802345991135 2023-01-24 00:10:51.344100: step: 352/77, loss: 0.021920569241046906 2023-01-24 00:10:52.854844: step: 356/77, loss: 0.0004425476072356105 2023-01-24 00:10:54.307789: step: 360/77, loss: 0.05860942602157593 2023-01-24 00:10:55.773900: step: 364/77, loss: 0.010078839026391506 2023-01-24 00:10:57.310352: step: 368/77, loss: 0.00035972020123153925 2023-01-24 00:10:58.839745: step: 372/77, loss: 0.02206575870513916 2023-01-24 00:11:00.276795: step: 376/77, loss: 0.06646309792995453 2023-01-24 00:11:01.734462: step: 380/77, loss: 0.07609245181083679 2023-01-24 00:11:03.207268: step: 384/77, loss: 0.05032084882259369 2023-01-24 00:11:04.694489: step: 388/77, loss: 0.009962482377886772 ================================================== Loss: 0.022 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Chinese: {'template': {'p': 
0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01743410932212081, 'epoch': 10} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Korean: {'template': {'p': 0.9242424242424242, 'r': 0.46564885496183206, 'f1': 0.6192893401015228}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.01761994775184249, 'epoch': 10} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01743410932212081, 'epoch': 10} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 10} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 10} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 10} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 
0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 11 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:12:42.703882: step: 4/77, loss: 0.018249448388814926 2023-01-24 00:12:44.129823: step: 8/77, loss: 0.02742091938853264 2023-01-24 00:12:45.574885: step: 12/77, loss: 0.008343767374753952 2023-01-24 00:12:47.055738: step: 16/77, loss: 0.024018079042434692 2023-01-24 00:12:48.586580: step: 20/77, loss: 0.025947896763682365 2023-01-24 00:12:50.066486: step: 24/77, loss: 0.02267858199775219 2023-01-24 00:12:51.511878: step: 28/77, loss: 3.0612878617830575e-05 2023-01-24 00:12:53.001980: step: 32/77, loss: 0.0003211384464520961 2023-01-24 00:12:54.465488: step: 36/77, loss: 0.05688820034265518 2023-01-24 00:12:55.959046: step: 40/77, loss: 0.006101340986788273 2023-01-24 00:12:57.350922: step: 44/77, loss: 0.050070710480213165 2023-01-24 00:12:58.808112: step: 48/77, loss: 0.004720164462924004 2023-01-24 00:13:00.284753: step: 52/77, loss: 0.015022655948996544 2023-01-24 00:13:01.759949: step: 56/77, loss: 0.03563885763287544 2023-01-24 00:13:03.247052: step: 60/77, loss: 0.0330636128783226 2023-01-24 00:13:04.689611: step: 64/77, loss: 0.0014286059886217117 2023-01-24 00:13:06.170219: step: 68/77, loss: 0.000816542305983603 2023-01-24 00:13:07.637432: step: 72/77, loss: 0.01442760694772005 2023-01-24 00:13:09.098455: step: 76/77, loss: 9.378043614560738e-05 2023-01-24 00:13:10.565751: step: 80/77, loss: 0.0005923286080360413 2023-01-24 00:13:12.033302: step: 84/77, loss: 0.020105646923184395 2023-01-24 00:13:13.530811: step: 88/77, loss: 0.009484020993113518 2023-01-24 00:13:15.006572: step: 92/77, loss: 0.012684566900134087 2023-01-24 00:13:16.499624: step: 96/77, loss: 0.04148320108652115 2023-01-24 00:13:18.065390: step: 100/77, loss: 7.08875959389843e-05 2023-01-24 00:13:19.516831: step: 104/77, loss: 0.005405884236097336 2023-01-24 00:13:21.023926: step: 108/77, loss: 0.07155608385801315 2023-01-24 00:13:22.524451: step: 112/77, loss: 0.0010453937575221062 2023-01-24 00:13:23.954054: step: 116/77, loss: 0.009059806354343891 2023-01-24 00:13:25.336707: step: 120/77, loss: 0.02621476724743843 2023-01-24 00:13:26.823396: step: 124/77, loss: 0.002740699565038085 2023-01-24 00:13:28.250848: step: 128/77, loss: 0.010486319661140442 2023-01-24 00:13:29.751040: step: 132/77, loss: 0.01720418594777584 2023-01-24 00:13:31.217673: step: 136/77, loss: 0.00957572739571333 2023-01-24 00:13:32.687250: step: 140/77, loss: 0.003676429158076644 2023-01-24 00:13:34.197336: step: 144/77, loss: 0.009357225149869919 2023-01-24 00:13:35.676069: step: 148/77, loss: 0.03653610125184059 2023-01-24 00:13:37.122431: step: 152/77, loss: 0.04308353364467621 2023-01-24 00:13:38.607594: step: 156/77, loss: 0.05171763524413109 2023-01-24 00:13:40.074337: step: 160/77, loss: 0.004753998946398497 2023-01-24 00:13:41.544725: step: 164/77, loss: 3.090988684562035e-05 2023-01-24 00:13:43.049190: step: 168/77, loss: 0.013740334659814835 2023-01-24 00:13:44.546321: step: 172/77, loss: 0.015785954892635345 2023-01-24 00:13:46.003698: step: 176/77, loss: 0.019747678190469742 2023-01-24 00:13:47.526488: step: 180/77, loss: 0.009812500327825546 2023-01-24 00:13:49.057863: step: 184/77, loss: 0.022340651601552963 2023-01-24 00:13:50.522004: step: 188/77, loss: 
0.013165976852178574 2023-01-24 00:13:51.957185: step: 192/77, loss: 0.0026761272456496954 2023-01-24 00:13:53.518846: step: 196/77, loss: 0.0722554549574852 2023-01-24 00:13:54.952868: step: 200/77, loss: 0.014125137589871883 2023-01-24 00:13:56.347422: step: 204/77, loss: 0.00012538768351078033 2023-01-24 00:13:57.767214: step: 208/77, loss: 0.015420390293002129 2023-01-24 00:13:59.156093: step: 212/77, loss: 0.035006795078516006 2023-01-24 00:14:00.663082: step: 216/77, loss: 0.016030533239245415 2023-01-24 00:14:02.173449: step: 220/77, loss: 0.00042447782470844686 2023-01-24 00:14:03.681423: step: 224/77, loss: 0.011341025121510029 2023-01-24 00:14:05.142237: step: 228/77, loss: 0.01798064447939396 2023-01-24 00:14:06.641195: step: 232/77, loss: 0.03821370005607605 2023-01-24 00:14:08.090171: step: 236/77, loss: 0.08387462049722672 2023-01-24 00:14:09.558365: step: 240/77, loss: 0.011262631975114346 2023-01-24 00:14:11.012780: step: 244/77, loss: 0.04161141812801361 2023-01-24 00:14:12.449239: step: 248/77, loss: 0.023375479504466057 2023-01-24 00:14:13.934644: step: 252/77, loss: 0.03386523202061653 2023-01-24 00:14:15.388684: step: 256/77, loss: 0.03562305495142937 2023-01-24 00:14:16.817835: step: 260/77, loss: 0.008851949125528336 2023-01-24 00:14:18.276540: step: 264/77, loss: 0.009420386515557766 2023-01-24 00:14:19.732788: step: 268/77, loss: 0.01851673051714897 2023-01-24 00:14:21.116982: step: 272/77, loss: 0.008063890039920807 2023-01-24 00:14:22.606145: step: 276/77, loss: 0.004955152980983257 2023-01-24 00:14:24.126850: step: 280/77, loss: 0.11314801871776581 2023-01-24 00:14:25.627637: step: 284/77, loss: 0.009548168629407883 2023-01-24 00:14:27.028380: step: 288/77, loss: 0.00019546764087863266 2023-01-24 00:14:28.598477: step: 292/77, loss: 0.09462368488311768 2023-01-24 00:14:30.113333: step: 296/77, loss: 7.842419290682301e-05 2023-01-24 00:14:31.567716: step: 300/77, loss: 0.0013920800993219018 2023-01-24 00:14:33.073892: step: 304/77, loss: 0.04698015749454498 2023-01-24 00:14:34.544895: step: 308/77, loss: 0.002782592084258795 2023-01-24 00:14:35.960715: step: 312/77, loss: 2.5230790924979374e-05 2023-01-24 00:14:37.397309: step: 316/77, loss: 0.007181983441114426 2023-01-24 00:14:38.898581: step: 320/77, loss: 0.022983960807323456 2023-01-24 00:14:40.363372: step: 324/77, loss: 0.00940174050629139 2023-01-24 00:14:41.772351: step: 328/77, loss: 0.02167385257780552 2023-01-24 00:14:43.187644: step: 332/77, loss: 0.009350062347948551 2023-01-24 00:14:44.687149: step: 336/77, loss: 0.063703753054142 2023-01-24 00:14:46.197703: step: 340/77, loss: 0.002021544612944126 2023-01-24 00:14:47.711625: step: 344/77, loss: 0.0050396379083395 2023-01-24 00:14:49.183080: step: 348/77, loss: 0.008820513263344765 2023-01-24 00:14:50.630152: step: 352/77, loss: 0.003946100827306509 2023-01-24 00:14:52.090967: step: 356/77, loss: 0.003359707072377205 2023-01-24 00:14:53.594757: step: 360/77, loss: 0.008270454593002796 2023-01-24 00:14:55.119548: step: 364/77, loss: 9.894469985738397e-05 2023-01-24 00:14:56.626621: step: 368/77, loss: 0.002662702463567257 2023-01-24 00:14:58.159491: step: 372/77, loss: 0.002308095572516322 2023-01-24 00:14:59.611812: step: 376/77, loss: 0.013296185992658138 2023-01-24 00:15:01.008537: step: 380/77, loss: 0.049899984151124954 2023-01-24 00:15:02.552380: step: 384/77, loss: 0.0031746437307447195 2023-01-24 00:15:04.056371: step: 388/77, loss: 0.08022551983594894 ================================================== Loss: 0.020 -------------------- Dev 
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04686584651435266, 'epoch': 11} Test Chinese: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.42424242424242425, 'r': 0.012079378774805867, 'f1': 0.02348993288590604}, 'combined': 0.014473595010507762, 'epoch': 11} Dev Korean: {'template': {'p': 1.0, 'r': 0.4666666666666667, 'f1': 0.6363636363636364}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04473558076370027, 'epoch': 11} Test Korean: {'template': {'p': 0.9253731343283582, 'r': 0.4732824427480916, 'f1': 0.6262626262626263}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016784257990288143, 'epoch': 11} Dev Russian: {'template': {'p': 1.0, 'r': 0.48333333333333334, 'f1': 0.6516853932584269}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.045812681424142486, 'epoch': 11} Test Russian: {'template': {'p': 0.8970588235294118, 'r': 0.46564885496183206, 'f1': 0.6130653266331658}, 'slot': {'p': 0.46875, 'r': 0.012942191544434857, 'f1': 0.025188916876574305}, 'combined': 0.015442451552472689, 'epoch': 11} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 11} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 11} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 
0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 12 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:16:39.524765: step: 4/77, loss: 0.04436550661921501 2023-01-24 00:16:40.966065: step: 8/77, loss: 0.004802200943231583 2023-01-24 00:16:42.404709: step: 12/77, loss: 0.11278200894594193 2023-01-24 00:16:43.837584: step: 16/77, loss: 0.004204769618809223 2023-01-24 00:16:45.315296: step: 20/77, loss: 0.002979584503918886 2023-01-24 00:16:46.743014: step: 24/77, loss: 0.011945674195885658 2023-01-24 00:16:48.156918: step: 28/77, loss: 0.004289418924599886 2023-01-24 00:16:49.684222: step: 32/77, loss: 0.005253539886325598 2023-01-24 00:16:51.098423: step: 36/77, loss: 0.0008780001080594957 2023-01-24 00:16:52.535581: step: 40/77, loss: 0.001079891575500369 2023-01-24 00:16:54.032694: step: 44/77, loss: 0.015719100832939148 2023-01-24 00:16:55.454729: step: 48/77, loss: 0.032325148582458496 2023-01-24 00:16:56.917158: step: 52/77, loss: 0.0008553061634302139 2023-01-24 00:16:58.417532: step: 56/77, loss: 0.001794558484107256 2023-01-24 00:16:59.934980: step: 60/77, loss: 0.03624221682548523 2023-01-24 00:17:01.413295: step: 64/77, loss: 0.0045928796753287315 2023-01-24 00:17:02.889687: step: 68/77, loss: 0.05722226947546005 2023-01-24 00:17:04.339541: step: 72/77, loss: 0.0012267765123397112 2023-01-24 00:17:05.846166: step: 76/77, loss: 0.0031581628136336803 2023-01-24 00:17:07.287557: step: 80/77, loss: 0.01436734851449728 2023-01-24 00:17:08.786153: step: 84/77, loss: 0.013682027347385883 2023-01-24 00:17:10.268456: step: 88/77, loss: 0.004203316755592823 2023-01-24 00:17:11.697691: step: 92/77, loss: 0.005488214548677206 2023-01-24 00:17:13.159018: step: 96/77, loss: 0.0018630167469382286 2023-01-24 00:17:14.577795: step: 100/77, loss: 0.00012121098552597687 2023-01-24 00:17:16.012396: step: 104/77, loss: 0.0013276045210659504 2023-01-24 00:17:17.533566: step: 108/77, loss: 0.005005714017897844 2023-01-24 00:17:18.884324: step: 112/77, loss: 0.006716449744999409 2023-01-24 00:17:20.360998: step: 116/77, loss: 0.009621203877031803 2023-01-24 00:17:21.823936: step: 120/77, loss: 0.003736691316589713 2023-01-24 00:17:23.273697: step: 124/77, loss: 0.02548018842935562 2023-01-24 00:17:24.686804: step: 128/77, loss: 0.015529114753007889 2023-01-24 00:17:26.162022: step: 132/77, loss: 0.0020801310893148184 2023-01-24 00:17:27.598563: step: 136/77, loss: 0.0016847447259351611 2023-01-24 00:17:29.108636: step: 140/77, loss: 0.0012095924466848373 2023-01-24 00:17:30.528381: step: 144/77, loss: 0.0006464039324782789 2023-01-24 00:17:31.981822: step: 148/77, loss: 0.0005958712426945567 2023-01-24 00:17:33.465153: step: 152/77, loss: 0.00440417742356658 2023-01-24 00:17:34.915816: step: 156/77, loss: 0.024716690182685852 2023-01-24 00:17:36.355201: step: 160/77, loss: 0.009818075224757195 2023-01-24 00:17:37.830576: step: 164/77, loss: 0.01651581935584545 2023-01-24 00:17:39.226890: step: 168/77, loss: 0.0026334517169743776 2023-01-24 00:17:40.608856: step: 172/77, loss: 7.838180317776278e-05 2023-01-24 00:17:42.041405: 
step: 176/77, loss: 0.0017036596545949578 2023-01-24 00:17:43.443801: step: 180/77, loss: 0.0021797381341457367 2023-01-24 00:17:44.892568: step: 184/77, loss: 0.002597886137664318 2023-01-24 00:17:46.312762: step: 188/77, loss: 0.0011595187243074179 2023-01-24 00:17:47.696022: step: 192/77, loss: 0.0011745744850486517 2023-01-24 00:17:49.160428: step: 196/77, loss: 0.0181453675031662 2023-01-24 00:17:50.598379: step: 200/77, loss: 0.011830981820821762 2023-01-24 00:17:52.127470: step: 204/77, loss: 0.022019436582922935 2023-01-24 00:17:53.591768: step: 208/77, loss: 0.04811529070138931 2023-01-24 00:17:55.033327: step: 212/77, loss: 0.020062996074557304 2023-01-24 00:17:56.507320: step: 216/77, loss: 0.0033560858573764563 2023-01-24 00:17:57.974345: step: 220/77, loss: 0.012346186675131321 2023-01-24 00:17:59.372187: step: 224/77, loss: 0.00652111042290926 2023-01-24 00:18:00.839493: step: 228/77, loss: 3.144413494737819e-05 2023-01-24 00:18:02.316250: step: 232/77, loss: 0.018586190417408943 2023-01-24 00:18:03.721111: step: 236/77, loss: 0.046308234333992004 2023-01-24 00:18:05.161409: step: 240/77, loss: 0.0010744313476607203 2023-01-24 00:18:06.589621: step: 244/77, loss: 0.013715185225009918 2023-01-24 00:18:08.098285: step: 248/77, loss: 0.0011830773437395692 2023-01-24 00:18:09.531339: step: 252/77, loss: 0.035007935017347336 2023-01-24 00:18:10.955984: step: 256/77, loss: 0.00036713789450004697 2023-01-24 00:18:12.410128: step: 260/77, loss: 0.003876642556861043 2023-01-24 00:18:13.832818: step: 264/77, loss: 0.03178451210260391 2023-01-24 00:18:15.292543: step: 268/77, loss: 0.004363094922155142 2023-01-24 00:18:16.718084: step: 272/77, loss: 0.00016332468658220023 2023-01-24 00:18:18.193793: step: 276/77, loss: 2.408213185844943e-05 2023-01-24 00:18:19.609817: step: 280/77, loss: 0.016927000135183334 2023-01-24 00:18:21.096792: step: 284/77, loss: 0.025983309373259544 2023-01-24 00:18:22.544019: step: 288/77, loss: 0.0042741927318274975 2023-01-24 00:18:23.992311: step: 292/77, loss: 0.017417028546333313 2023-01-24 00:18:25.434136: step: 296/77, loss: 0.014634665101766586 2023-01-24 00:18:26.866350: step: 300/77, loss: 0.009720695205032825 2023-01-24 00:18:28.376661: step: 304/77, loss: 0.0002194459520978853 2023-01-24 00:18:29.871307: step: 308/77, loss: 6.72009409754537e-05 2023-01-24 00:18:31.324981: step: 312/77, loss: 0.05025641992688179 2023-01-24 00:18:32.790835: step: 316/77, loss: 0.012442845851182938 2023-01-24 00:18:34.282607: step: 320/77, loss: 0.0018316482892259955 2023-01-24 00:18:35.816139: step: 324/77, loss: 0.0017850275617092848 2023-01-24 00:18:37.319574: step: 328/77, loss: 0.00014143706357572228 2023-01-24 00:18:38.835393: step: 332/77, loss: 0.003508616704493761 2023-01-24 00:18:40.301319: step: 336/77, loss: 0.046058665961027145 2023-01-24 00:18:41.757767: step: 340/77, loss: 0.03320928290486336 2023-01-24 00:18:43.221592: step: 344/77, loss: 0.004275294486433268 2023-01-24 00:18:44.641272: step: 348/77, loss: 0.001723662600852549 2023-01-24 00:18:46.080547: step: 352/77, loss: 0.002751779742538929 2023-01-24 00:18:47.473069: step: 356/77, loss: 0.0015267680864781141 2023-01-24 00:18:48.931448: step: 360/77, loss: 0.009214522317051888 2023-01-24 00:18:50.369605: step: 364/77, loss: 0.0674782544374466 2023-01-24 00:18:51.801037: step: 368/77, loss: 0.01894235797226429 2023-01-24 00:18:53.328431: step: 372/77, loss: 0.00815523136407137 2023-01-24 00:18:54.764771: step: 376/77, loss: 0.010504646226763725 2023-01-24 00:18:56.226162: step: 380/77, loss: 
0.004112462513148785 2023-01-24 00:18:57.699026: step: 384/77, loss: 0.003164243418723345 2023-01-24 00:18:59.223675: step: 388/77, loss: 0.02831357903778553 ================================================== Loss: 0.013 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Chinese: {'template': {'p': 0.9402985074626866, 'r': 0.48091603053435117, 'f1': 0.6363636363636365}, 'slot': {'p': 0.5, 'r': 0.013805004314063849, 'f1': 0.026868178001679264}, 'combined': 0.01709793145561408, 'epoch': 12} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Korean: {'template': {'p': 0.9402985074626866, 'r': 0.48091603053435117, 'f1': 0.6363636363636365}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018181818181818184, 'epoch': 12} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12} Test Russian: {'template': {'p': 0.9411764705882353, 'r': 0.48854961832061067, 'f1': 0.6432160804020101}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.01837760229720029, 'epoch': 12} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 12} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 12} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 12} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 
'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 13 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:20:35.365488: step: 4/77, loss: 0.02979923039674759 2023-01-24 00:20:36.818494: step: 8/77, loss: 0.0011581765720620751 2023-01-24 00:20:38.316441: step: 12/77, loss: 0.01394204143434763 2023-01-24 00:20:39.661375: step: 16/77, loss: 0.016473228111863136 2023-01-24 00:20:41.113921: step: 20/77, loss: 0.003298100782558322 2023-01-24 00:20:42.548462: step: 24/77, loss: 0.02194811776280403 2023-01-24 00:20:44.011240: step: 28/77, loss: 0.01043167244642973 2023-01-24 00:20:45.417721: step: 32/77, loss: 0.06836826354265213 2023-01-24 00:20:46.885003: step: 36/77, loss: 0.000960124540142715 2023-01-24 00:20:48.322367: step: 40/77, loss: 0.06304540485143661 2023-01-24 00:20:49.777372: step: 44/77, loss: 0.00409977650269866 2023-01-24 00:20:51.230496: step: 48/77, loss: 0.00018478457059245557 2023-01-24 00:20:52.668945: step: 52/77, loss: 0.046891722828149796 2023-01-24 00:20:54.089304: step: 56/77, loss: 0.01876625046133995 2023-01-24 00:20:55.541646: step: 60/77, loss: 3.593548899516463e-05 2023-01-24 00:20:56.990259: step: 64/77, loss: 6.849043711554259e-05 2023-01-24 00:20:58.446686: step: 68/77, loss: 0.0011330159613862634 2023-01-24 00:20:59.862678: step: 72/77, loss: 0.0002272444253321737 2023-01-24 00:21:01.291199: step: 76/77, loss: 1.507402066636132e-05 2023-01-24 00:21:02.842076: step: 80/77, loss: 0.002101314952597022 2023-01-24 00:21:04.272761: step: 84/77, loss: 0.033360376954078674 2023-01-24 00:21:05.709586: step: 88/77, loss: 0.10507829487323761 2023-01-24 00:21:07.207678: step: 92/77, loss: 0.00254506035707891 2023-01-24 00:21:08.606158: step: 96/77, loss: 0.015117666684091091 2023-01-24 00:21:10.127847: step: 100/77, loss: 0.011987771838903427 2023-01-24 00:21:11.566784: step: 104/77, loss: 0.0020428854040801525 2023-01-24 00:21:12.956733: step: 108/77, loss: 1.1954606634390075e-05 2023-01-24 00:21:14.382769: step: 112/77, loss: 0.0017031385796144605 2023-01-24 00:21:15.826043: step: 116/77, loss: 0.04721391946077347 2023-01-24 00:21:17.238037: step: 120/77, loss: 0.00435302872210741 2023-01-24 00:21:18.690945: step: 124/77, loss: 0.006813234183937311 2023-01-24 00:21:20.099777: step: 128/77, loss: 0.0010583556722849607 2023-01-24 00:21:21.609463: step: 132/77, loss: 0.0035780100151896477 2023-01-24 00:21:23.040056: step: 136/77, loss: 0.00271525327116251 2023-01-24 00:21:24.525443: step: 140/77, loss: 0.0007694312371313572 2023-01-24 00:21:26.010170: step: 144/77, loss: 0.0012259716168045998 2023-01-24 00:21:27.446504: step: 148/77, loss: 0.0017704784404486418 2023-01-24 00:21:28.923207: step: 152/77, loss: 0.00048166397027671337 2023-01-24 00:21:30.424655: step: 156/77, loss: 0.01880849339067936 2023-01-24 00:21:31.858920: step: 160/77, loss: 
0.008621509186923504 2023-01-24 00:21:33.243427: step: 164/77, loss: 0.010005577467381954 2023-01-24 00:21:34.707589: step: 168/77, loss: 0.011141312308609486 2023-01-24 00:21:36.165553: step: 172/77, loss: 0.0018681518267840147 2023-01-24 00:21:37.671282: step: 176/77, loss: 0.0004037081089336425 2023-01-24 00:21:39.055967: step: 180/77, loss: 0.009404388256371021 2023-01-24 00:21:40.549639: step: 184/77, loss: 0.0016339367721229792 2023-01-24 00:21:42.000634: step: 188/77, loss: 0.014390097931027412 2023-01-24 00:21:43.467817: step: 192/77, loss: 0.0010596985230222344 2023-01-24 00:21:44.895554: step: 196/77, loss: 0.054352860897779465 2023-01-24 00:21:46.326151: step: 200/77, loss: 0.03959896042943001 2023-01-24 00:21:47.826747: step: 204/77, loss: 9.916317503666505e-05 2023-01-24 00:21:49.264180: step: 208/77, loss: 0.013791847974061966 2023-01-24 00:21:50.707897: step: 212/77, loss: 0.0007248240290209651 2023-01-24 00:21:52.221490: step: 216/77, loss: 0.0002708295651245862 2023-01-24 00:21:53.636425: step: 220/77, loss: 0.0004911398864351213 2023-01-24 00:21:55.130501: step: 224/77, loss: 0.0047632548958063126 2023-01-24 00:21:56.536203: step: 228/77, loss: 0.0011701977346092463 2023-01-24 00:21:57.994720: step: 232/77, loss: 0.00042671553092077374 2023-01-24 00:21:59.487643: step: 236/77, loss: 0.03621303662657738 2023-01-24 00:22:00.987048: step: 240/77, loss: 0.0006495914421975613 2023-01-24 00:22:02.451629: step: 244/77, loss: 0.0029841335490345955 2023-01-24 00:22:03.995936: step: 248/77, loss: 0.02200494334101677 2023-01-24 00:22:05.533568: step: 252/77, loss: 0.0032038982026278973 2023-01-24 00:22:06.965086: step: 256/77, loss: 0.004941218066960573 2023-01-24 00:22:08.315897: step: 260/77, loss: 0.027301592752337456 2023-01-24 00:22:09.812194: step: 264/77, loss: 0.010927285067737103 2023-01-24 00:22:11.307393: step: 268/77, loss: 0.010645708069205284 2023-01-24 00:22:12.788466: step: 272/77, loss: 0.00907566212117672 2023-01-24 00:22:14.313325: step: 276/77, loss: 0.0008718777680769563 2023-01-24 00:22:15.744967: step: 280/77, loss: 0.004993719980120659 2023-01-24 00:22:17.201510: step: 284/77, loss: 0.0007454871665686369 2023-01-24 00:22:18.664344: step: 288/77, loss: 0.05532342940568924 2023-01-24 00:22:20.134140: step: 292/77, loss: 0.00028806107002310455 2023-01-24 00:22:21.625918: step: 296/77, loss: 0.00247851456515491 2023-01-24 00:22:23.107266: step: 300/77, loss: 0.012744927778840065 2023-01-24 00:22:24.587249: step: 304/77, loss: 0.033811114728450775 2023-01-24 00:22:26.026997: step: 308/77, loss: 0.0008661256870254874 2023-01-24 00:22:27.458243: step: 312/77, loss: 0.0006619691848754883 2023-01-24 00:22:28.969074: step: 316/77, loss: 0.006505975965410471 2023-01-24 00:22:30.461446: step: 320/77, loss: 0.03772088512778282 2023-01-24 00:22:31.889042: step: 324/77, loss: 0.015472686849534512 2023-01-24 00:22:33.327790: step: 328/77, loss: 0.006314371712505817 2023-01-24 00:22:34.814544: step: 332/77, loss: 2.2619480660068803e-05 2023-01-24 00:22:36.320465: step: 336/77, loss: 0.021693747490644455 2023-01-24 00:22:37.773468: step: 340/77, loss: 0.00468218931928277 2023-01-24 00:22:39.261984: step: 344/77, loss: 0.001238841563463211 2023-01-24 00:22:40.734466: step: 348/77, loss: 0.0005457285442389548 2023-01-24 00:22:42.242399: step: 352/77, loss: 0.008669009432196617 2023-01-24 00:22:43.689397: step: 356/77, loss: 0.004476443864405155 2023-01-24 00:22:45.085078: step: 360/77, loss: 0.01310722716152668 2023-01-24 00:22:46.584710: step: 364/77, loss: 0.0666121393442154 
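[Editor's note on the metric dictionaries in this log] Judging purely from the numbers recorded above, each 'f1' field is the standard F1 score (harmonic mean) of the logged 'p' and 'r', and each 'combined' field equals the template F1 multiplied by the slot F1 of the same split. The sketch below reproduces those fields; it is inferred from the logged values and is not taken from train.py.

```python
# Minimal sketch inferred from the logged numbers; NOT the project's
# evaluation code. It reproduces the 'f1' and 'combined' fields that
# appear in every Dev/Test/Sample dictionary in this log.

def f1(p: float, r: float) -> float:
    """Standard F1 score: harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template_f1: float, slot_f1: float) -> float:
    """The logged 'combined' value matches template_f1 * slot_f1."""
    return template_f1 * slot_f1

# Dev Chinese, epoch 12, values copied from the summary above:
t_f1 = f1(1.0, 0.5833333333333334)    # -> 0.7368421052631579
s_f1 = f1(0.5, 0.03780718336483932)   # -> 0.07029876977152899
print(combined(t_f1, s_f1))           # -> 0.05179909351586346
```

Two further readings of the log, both inferences rather than documented behavior: the 'Current best result' block appears to update only on a strict improvement of the dev 'combined' score, since later epochs that tie the epoch-3 value of 0.05179909351586346 do not replace it; and given --batch_size 10 with --accumulate_step 4, the effective optimizer batch size is presumably 40, i.e. one parameter update every four batches.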
2023-01-24 00:22:48.021813: step: 368/77, loss: 0.0067118373699486256 2023-01-24 00:22:49.416947: step: 372/77, loss: 0.00446122232824564 2023-01-24 00:22:50.863761: step: 376/77, loss: 0.0023765151854604483 2023-01-24 00:22:52.344990: step: 380/77, loss: 0.007924961857497692 2023-01-24 00:22:53.799319: step: 384/77, loss: 0.010871789418160915 2023-01-24 00:22:55.226015: step: 388/77, loss: 0.00019569540745578706
==================================================
Loss: 0.012
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Chinese: {'template': {'p': 0.8625, 'r': 0.5267175572519084, 'f1': 0.6540284360189573}, 'slot': {'p': 0.42857142857142855, 'r': 0.015530629853321829, 'f1': 0.029975020815986676}, 'combined': 0.019604515983915455, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Korean: {'template': {'p': 0.875, 'r': 0.5343511450381679, 'f1': 0.6635071090047393}, 'slot': {'p': 0.4318181818181818, 'r': 0.01639344262295082, 'f1': 0.03158769742310889}, 'combined': 0.020958661797323436, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Russian: {'template': {'p': 0.8625, 'r': 0.5267175572519084, 'f1': 0.6540284360189573}, 'slot': {'p': 0.42857142857142855, 'r': 0.015530629853321829, 'f1': 0.029975020815986676}, 'combined': 0.019604515983915455, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 13}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 13}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 14 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:24:30.297290: step: 4/77, loss: 0.05021757259964943 2023-01-24 00:24:31.784525: step: 8/77, loss: 0.013492027297616005 2023-01-24 00:24:33.176804: step: 12/77, loss: 0.01188771240413189 2023-01-24 00:24:34.594222: step: 16/77, loss: 0.004332687705755234 2023-01-24 00:24:36.049669: step: 20/77, loss: 0.0068397969007492065 2023-01-24 00:24:37.439623: step: 24/77, loss: 0.03639312461018562 2023-01-24 00:24:38.895359: step: 28/77, loss: 0.010193396359682083 2023-01-24 00:24:40.343832: step: 32/77, loss: 0.0002720890333876014 2023-01-24 00:24:41.788508: step: 36/77, loss: 0.0033012309577316046 2023-01-24 00:24:43.179238: step: 40/77, loss: 0.06803090870380402 2023-01-24 00:24:44.613310: step: 44/77, loss: 0.002303973538801074 2023-01-24 00:24:46.031412: step: 48/77, loss: 0.01426534354686737 2023-01-24 00:24:47.446722: step: 52/77, loss: 0.01106639951467514 2023-01-24 00:24:48.906959: step: 56/77, loss: 0.013820632360875607 2023-01-24 00:24:50.343974: step: 60/77, loss: 0.001012542168609798 2023-01-24 00:24:51.806060: step: 64/77, loss: 0.003073731204494834 2023-01-24 00:24:53.289203: step: 68/77, loss: 0.0074550556018948555 2023-01-24 00:24:54.745176: step: 72/77, loss: 0.03673708066344261 2023-01-24 00:24:56.235762: step: 76/77, loss: 0.009585902094841003 2023-01-24 00:24:57.696683: step: 80/77, loss: 0.0049670119769871235 2023-01-24 00:24:59.114005: step: 84/77, loss: 0.0012508973013609648 2023-01-24 00:25:00.561924: step: 88/77, loss: 0.04550894722342491 2023-01-24 00:25:02.115692: step: 92/77, loss: 0.0476241372525692 2023-01-24 00:25:03.557794: step: 96/77, loss: 0.0003463767934590578 2023-01-24 00:25:05.014109: step: 100/77, loss: 0.0013855427969247103 2023-01-24 00:25:06.464574: step: 104/77, loss: 6.948120426386595e-05 2023-01-24 00:25:07.896322: step: 108/77, loss: 0.0025392724201083183 2023-01-24 00:25:09.327938: step: 112/77, loss: 0.0023203599266707897 2023-01-24 00:25:10.834309: step: 116/77, loss: 0.002155708149075508 2023-01-24 00:25:12.267598: step: 120/77, loss: 0.010432385839521885 2023-01-24 00:25:13.710364: step: 124/77, loss: 0.0015561548061668873 2023-01-24 00:25:15.143284: step: 128/77, loss: 0.0022515016607940197 2023-01-24 00:25:16.548990: step: 132/77, loss: 0.0063758837059140205 2023-01-24 00:25:17.987449: step: 136/77, loss: 0.00816910620778799 2023-01-24 00:25:19.408644: step: 140/77, loss: 0.0004892628057859838 2023-01-24 00:25:20.864750: step: 144/77, loss: 0.006460327655076981 2023-01-24 00:25:22.325441: step: 
148/77, loss: 0.0022336977999657393 2023-01-24 00:25:23.788385: step: 152/77, loss: 0.04381607472896576 2023-01-24 00:25:25.193799: step: 156/77, loss: 0.01606643944978714 2023-01-24 00:25:26.644315: step: 160/77, loss: 0.01276947371661663 2023-01-24 00:25:28.174136: step: 164/77, loss: 8.410248847212642e-05 2023-01-24 00:25:29.637161: step: 168/77, loss: 0.01344869751483202 2023-01-24 00:25:31.073515: step: 172/77, loss: 0.01566564477980137 2023-01-24 00:25:32.564390: step: 176/77, loss: 3.885049591190182e-05 2023-01-24 00:25:34.028351: step: 180/77, loss: 0.002805543364956975 2023-01-24 00:25:35.559153: step: 184/77, loss: 5.492310265253764e-06 2023-01-24 00:25:36.946997: step: 188/77, loss: 0.004818313755095005 2023-01-24 00:25:38.426206: step: 192/77, loss: 0.00019668742606882006 2023-01-24 00:25:39.862685: step: 196/77, loss: 0.0005715168663300574 2023-01-24 00:25:41.390160: step: 200/77, loss: 0.013614819385111332 2023-01-24 00:25:42.869303: step: 204/77, loss: 0.006921195425093174 2023-01-24 00:25:44.341037: step: 208/77, loss: 0.000296951737254858 2023-01-24 00:25:45.790953: step: 212/77, loss: 0.000808404351118952 2023-01-24 00:25:47.296253: step: 216/77, loss: 0.0011310662375763059 2023-01-24 00:25:48.772470: step: 220/77, loss: 0.1592380404472351 2023-01-24 00:25:50.220390: step: 224/77, loss: 0.0025794643443077803 2023-01-24 00:25:51.637992: step: 228/77, loss: 0.0009930891683325171 2023-01-24 00:25:53.128147: step: 232/77, loss: 0.0015125039499253035 2023-01-24 00:25:54.547723: step: 236/77, loss: 0.005088160280138254 2023-01-24 00:25:56.082927: step: 240/77, loss: 0.0073650190606713295 2023-01-24 00:25:57.547739: step: 244/77, loss: 0.007640082389116287 2023-01-24 00:25:59.027126: step: 248/77, loss: 0.0062585556879639626 2023-01-24 00:26:00.486542: step: 252/77, loss: 0.02538384683430195 2023-01-24 00:26:01.973035: step: 256/77, loss: 0.0007471991702914238 2023-01-24 00:26:03.460136: step: 260/77, loss: 0.000744312594179064 2023-01-24 00:26:04.905918: step: 264/77, loss: 6.348014721879736e-05 2023-01-24 00:26:06.438372: step: 268/77, loss: 0.0054208822548389435 2023-01-24 00:26:07.911755: step: 272/77, loss: 0.004874257370829582 2023-01-24 00:26:09.373699: step: 276/77, loss: 0.0018353211926296353 2023-01-24 00:26:10.775005: step: 280/77, loss: 0.01595592312514782 2023-01-24 00:26:12.317121: step: 284/77, loss: 0.010417342185974121 2023-01-24 00:26:13.773977: step: 288/77, loss: 0.015512584708631039 2023-01-24 00:26:15.216136: step: 292/77, loss: 0.05964883044362068 2023-01-24 00:26:16.642420: step: 296/77, loss: 0.021162962540984154 2023-01-24 00:26:18.085219: step: 300/77, loss: 0.005686973687261343 2023-01-24 00:26:19.506100: step: 304/77, loss: 0.007818142883479595 2023-01-24 00:26:20.955565: step: 308/77, loss: 0.006607674062252045 2023-01-24 00:26:22.382840: step: 312/77, loss: 0.00150693254545331 2023-01-24 00:26:23.844698: step: 316/77, loss: 0.014010453596711159 2023-01-24 00:26:25.305935: step: 320/77, loss: 0.0010493621230125427 2023-01-24 00:26:26.796003: step: 324/77, loss: 0.004597794730216265 2023-01-24 00:26:28.318030: step: 328/77, loss: 0.001323473872616887 2023-01-24 00:26:29.811037: step: 332/77, loss: 0.003567654872313142 2023-01-24 00:26:31.305230: step: 336/77, loss: 0.023809565231204033 2023-01-24 00:26:32.774191: step: 340/77, loss: 0.0202918890863657 2023-01-24 00:26:34.314457: step: 344/77, loss: 0.05709415674209595 2023-01-24 00:26:35.704834: step: 348/77, loss: 0.06244867295026779 2023-01-24 00:26:37.200326: step: 352/77, loss: 
0.0023825361859053373 2023-01-24 00:26:38.601742: step: 356/77, loss: 0.015587534755468369 2023-01-24 00:26:40.083597: step: 360/77, loss: 0.003471288364380598 2023-01-24 00:26:41.537401: step: 364/77, loss: 0.004319621250033379 2023-01-24 00:26:43.006434: step: 368/77, loss: 0.014541847631335258 2023-01-24 00:26:44.501619: step: 372/77, loss: 0.00017313337593805045 2023-01-24 00:26:45.862818: step: 376/77, loss: 0.002016652375459671 2023-01-24 00:26:47.328842: step: 380/77, loss: 0.021287081763148308 2023-01-24 00:26:48.813844: step: 384/77, loss: 0.0034362124279141426 2023-01-24 00:26:50.246439: step: 388/77, loss: 0.002922603627666831
==================================================
Loss: 0.013
--------------------
Dev Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Chinese: {'template': {'p': 0.9014084507042254, 'r': 0.48854961832061067, 'f1': 0.6336633663366336}, 'slot': {'p': 0.4, 'r': 0.015530629853321829, 'f1': 0.029900332225913623}, 'combined': 0.018946745172856154, 'epoch': 14}
Dev Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Korean: {'template': {'p': 0.9014084507042254, 'r': 0.48854961832061067, 'f1': 0.6336633663366336}, 'slot': {'p': 0.391304347826087, 'r': 0.015530629853321829, 'f1': 0.029875518672199168}, 'combined': 0.018931021732878677, 'epoch': 14}
Dev Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Russian: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01762919434088047, 'epoch': 14}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 14}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 14}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 14}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r':
0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 15 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:28:25.327238: step: 4/77, loss: 0.008644331246614456 2023-01-24 00:28:26.749932: step: 8/77, loss: 0.011979562230408192 2023-01-24 00:28:28.267479: step: 12/77, loss: 0.014013068750500679 2023-01-24 00:28:29.685847: step: 16/77, loss: 0.008621701039373875 2023-01-24 00:28:31.114413: step: 20/77, loss: 0.021316878497600555 2023-01-24 00:28:32.599212: step: 24/77, loss: 0.051223039627075195 2023-01-24 00:28:34.090778: step: 28/77, loss: 0.002289854222908616 2023-01-24 00:28:35.588914: step: 32/77, loss: 0.004393209703266621 2023-01-24 00:28:37.068688: step: 36/77, loss: 5.282249730953481e-06 2023-01-24 00:28:38.557824: step: 40/77, loss: 0.01075073517858982 2023-01-24 00:28:40.045362: step: 44/77, loss: 0.0020622683223336935 2023-01-24 00:28:41.516125: step: 48/77, loss: 0.03781045228242874 2023-01-24 00:28:42.945526: step: 52/77, loss: 0.0880887433886528 2023-01-24 00:28:44.440647: step: 56/77, loss: 0.03132542222738266 2023-01-24 00:28:45.928658: step: 60/77, loss: 0.00321400398388505 2023-01-24 00:28:47.353437: step: 64/77, loss: 0.004773072432726622 2023-01-24 00:28:48.721702: step: 68/77, loss: 0.017901595681905746 2023-01-24 00:28:50.163636: step: 72/77, loss: 0.038539398461580276 2023-01-24 00:28:51.641199: step: 76/77, loss: 0.133880615234375 2023-01-24 00:28:53.111537: step: 80/77, loss: 0.0034761850256472826 2023-01-24 00:28:54.587281: step: 84/77, loss: 0.013166949152946472 2023-01-24 00:28:56.042731: step: 88/77, loss: 0.00681588239967823 2023-01-24 00:28:57.476669: step: 92/77, loss: 0.009442516602575779 2023-01-24 00:28:58.944404: step: 96/77, loss: 0.023162730038166046 2023-01-24 00:29:00.414415: step: 100/77, loss: 0.017544515430927277 2023-01-24 00:29:01.891539: step: 104/77, loss: 0.016073524951934814 2023-01-24 00:29:03.380072: step: 108/77, loss: 0.03755786269903183 2023-01-24 00:29:04.914410: step: 112/77, loss: 0.0028731636703014374 2023-01-24 00:29:06.388965: step: 116/77, loss: 0.006997175980359316 2023-01-24 00:29:07.867982: step: 120/77, loss: 0.002053946955129504 2023-01-24 00:29:09.329468: step: 124/77, loss: 0.0005283099017105997 2023-01-24 00:29:10.790810: step: 128/77, loss: 
0.007918376475572586 2023-01-24 00:29:12.250987: step: 132/77, loss: 0.0013505109818652272 2023-01-24 00:29:13.730840: step: 136/77, loss: 0.006453365087509155 2023-01-24 00:29:15.158902: step: 140/77, loss: 0.008280211128294468 2023-01-24 00:29:16.605634: step: 144/77, loss: 0.02169647254049778 2023-01-24 00:29:18.075067: step: 148/77, loss: 0.008906039409339428 2023-01-24 00:29:19.541904: step: 152/77, loss: 9.285704436479136e-05 2023-01-24 00:29:20.961277: step: 156/77, loss: 0.061494454741477966 2023-01-24 00:29:22.395381: step: 160/77, loss: 0.0007124262629076838 2023-01-24 00:29:23.856792: step: 164/77, loss: 0.0020129838958382607 2023-01-24 00:29:25.300798: step: 168/77, loss: 0.00560009153559804 2023-01-24 00:29:26.768789: step: 172/77, loss: 0.030477937310934067 2023-01-24 00:29:28.195577: step: 176/77, loss: 0.005781487561762333 2023-01-24 00:29:29.592831: step: 180/77, loss: 0.002156471135094762 2023-01-24 00:29:31.036405: step: 184/77, loss: 0.01489571388810873 2023-01-24 00:29:32.546484: step: 188/77, loss: 0.01714280992746353 2023-01-24 00:29:33.994396: step: 192/77, loss: 0.014948004856705666 2023-01-24 00:29:35.479130: step: 196/77, loss: 0.0004343294131103903 2023-01-24 00:29:36.924268: step: 200/77, loss: 0.0468200147151947 2023-01-24 00:29:38.389602: step: 204/77, loss: 0.00364921847358346 2023-01-24 00:29:39.797670: step: 208/77, loss: 0.006669571157544851 2023-01-24 00:29:41.280919: step: 212/77, loss: 0.03793036937713623 2023-01-24 00:29:42.759769: step: 216/77, loss: 0.0024284652899950743 2023-01-24 00:29:44.204539: step: 220/77, loss: 0.002885939320549369 2023-01-24 00:29:45.629232: step: 224/77, loss: 0.05066582188010216 2023-01-24 00:29:47.039439: step: 228/77, loss: 0.0012308210134506226 2023-01-24 00:29:48.523665: step: 232/77, loss: 0.03428790718317032 2023-01-24 00:29:49.962746: step: 236/77, loss: 0.005835465621203184 2023-01-24 00:29:51.387403: step: 240/77, loss: 0.035248078405857086 2023-01-24 00:29:52.840850: step: 244/77, loss: 0.011382810771465302 2023-01-24 00:29:54.266445: step: 248/77, loss: 0.0028067068196833134 2023-01-24 00:29:55.714296: step: 252/77, loss: 0.00026669781072996557 2023-01-24 00:29:57.146143: step: 256/77, loss: 0.0037234588526189327 2023-01-24 00:29:58.640022: step: 260/77, loss: 0.004366753157228231 2023-01-24 00:30:00.076872: step: 264/77, loss: 0.01697009615600109 2023-01-24 00:30:01.524602: step: 268/77, loss: 0.06928707659244537 2023-01-24 00:30:02.967471: step: 272/77, loss: 0.00224512442946434 2023-01-24 00:30:04.404487: step: 276/77, loss: 0.00021704388200305402 2023-01-24 00:30:05.912382: step: 280/77, loss: 0.007994461804628372 2023-01-24 00:30:07.303273: step: 284/77, loss: 0.0010824492201209068 2023-01-24 00:30:08.723126: step: 288/77, loss: 0.005412569735199213 2023-01-24 00:30:10.282883: step: 292/77, loss: 0.014690631069242954 2023-01-24 00:30:11.699913: step: 296/77, loss: 0.07107909023761749 2023-01-24 00:30:13.209006: step: 300/77, loss: 0.003367634490132332 2023-01-24 00:30:14.646574: step: 304/77, loss: 0.0009482861496508121 2023-01-24 00:30:16.083584: step: 308/77, loss: 0.0010064172092825174 2023-01-24 00:30:17.521423: step: 312/77, loss: 0.03311848267912865 2023-01-24 00:30:19.021004: step: 316/77, loss: 0.00782004464417696 2023-01-24 00:30:20.478098: step: 320/77, loss: 0.00015082456229720265 2023-01-24 00:30:21.948067: step: 324/77, loss: 0.00022852692927699536 2023-01-24 00:30:23.362720: step: 328/77, loss: 0.012201538309454918 2023-01-24 00:30:24.862662: step: 332/77, loss: 0.04965365678071976 2023-01-24 
00:30:26.303173: step: 336/77, loss: 0.00021343353728298098 2023-01-24 00:30:27.734381: step: 340/77, loss: 0.0032562892884016037 2023-01-24 00:30:29.115162: step: 344/77, loss: 0.0028177325148135424 2023-01-24 00:30:30.544552: step: 348/77, loss: 0.00022388799698092043 2023-01-24 00:30:31.983608: step: 352/77, loss: 0.00048760930076241493 2023-01-24 00:30:33.444183: step: 356/77, loss: 0.033975739032030106 2023-01-24 00:30:34.868972: step: 360/77, loss: 0.07614095509052277 2023-01-24 00:30:36.311528: step: 364/77, loss: 0.020101221278309822 2023-01-24 00:30:37.810245: step: 368/77, loss: 0.013422614894807339 2023-01-24 00:30:39.210830: step: 372/77, loss: 0.01529695838689804 2023-01-24 00:30:40.722979: step: 376/77, loss: 0.0003953626146540046 2023-01-24 00:30:42.257966: step: 380/77, loss: 0.026907581835985184 2023-01-24 00:30:43.625659: step: 384/77, loss: 0.0025897109881043434 2023-01-24 00:30:45.020571: step: 388/77, loss: 0.020302483811974525
==================================================
Loss: 0.017
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04686584651435266, 'epoch': 15}
Test Chinese: {'template': {'p': 0.8783783783783784, 'r': 0.4961832061068702, 'f1': 0.6341463414634146}, 'slot': {'p': 0.3829787234042553, 'r': 0.015530629853321829, 'f1': 0.029850746268656716}, 'combined': 0.01892974153622133, 'epoch': 15}
Dev Korean: {'template': {'p': 1.0, 'r': 0.4666666666666667, 'f1': 0.6363636363636364}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04473558076370027, 'epoch': 15}
Test Korean: {'template': {'p': 0.8666666666666667, 'r': 0.4961832061068702, 'f1': 0.6310679611650485}, 'slot': {'p': 0.40425531914893614, 'r': 0.01639344262295082, 'f1': 0.03150912106135987}, 'combined': 0.019884396786295062, 'epoch': 15}
Dev Russian: {'template': {'p': 1.0, 'r': 0.48333333333333334, 'f1': 0.6516853932584269}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.045812681424142486, 'epoch': 15}
Test Russian: {'template': {'p': 0.8666666666666667, 'r': 0.4961832061068702, 'f1': 0.6310679611650485}, 'slot': {'p': 0.40425531914893614, 'r': 0.01639344262295082, 'f1': 0.03150912106135987}, 'combined': 0.019884396786295062, 'epoch': 15}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 15}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 15}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 15}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r':
0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 16 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:32:19.634890: step: 4/77, loss: 0.02240251749753952 2023-01-24 00:32:21.096679: step: 8/77, loss: 0.02771635353565216 2023-01-24 00:32:22.563240: step: 12/77, loss: 0.00028855435084551573 2023-01-24 00:32:23.996168: step: 16/77, loss: 0.0013435601722449064 2023-01-24 00:32:25.467620: step: 20/77, loss: 0.04604283347725868 2023-01-24 00:32:26.863970: step: 24/77, loss: 0.00026520941173657775 2023-01-24 00:32:28.247667: step: 28/77, loss: 0.00022504819207824767 2023-01-24 00:32:29.677365: step: 32/77, loss: 0.0024469781201332808 2023-01-24 00:32:31.145941: step: 36/77, loss: 0.06819023936986923 2023-01-24 00:32:32.574856: step: 40/77, loss: 0.004971928894519806 2023-01-24 00:32:34.032866: step: 44/77, loss: 0.016732260584831238 2023-01-24 00:32:35.477268: step: 48/77, loss: 0.0035637859255075455 2023-01-24 00:32:36.909355: step: 52/77, loss: 2.8777996703865938e-05 2023-01-24 00:32:38.326693: step: 56/77, loss: 0.00130385288503021 2023-01-24 00:32:39.834460: step: 60/77, loss: 0.011374658904969692 2023-01-24 00:32:41.299577: step: 64/77, loss: 0.0029410512652248144 2023-01-24 00:32:42.776903: step: 68/77, loss: 0.023517031222581863 2023-01-24 00:32:44.214010: step: 72/77, loss: 0.029339302331209183 2023-01-24 00:32:45.730383: step: 76/77, loss: 0.01185949519276619 2023-01-24 00:32:47.159675: step: 80/77, loss: 0.05707313492894173 2023-01-24 00:32:48.626706: step: 84/77, loss: 0.06443598121404648 2023-01-24 00:32:50.130107: step: 88/77, loss: 0.000236800653510727 2023-01-24 00:32:51.597679: step: 92/77, loss: 0.0030893548391759396 2023-01-24 00:32:53.027023: step: 96/77, loss: 0.005407319869846106 2023-01-24 00:32:54.421824: step: 100/77, loss: 0.0009325853898189962 2023-01-24 00:32:55.880266: step: 104/77, loss: 0.0016836941940709949 2023-01-24 00:32:57.360244: step: 108/77, loss: 0.0005430675228126347 2023-01-24 00:32:58.849743: step: 112/77, loss: 0.0018001548014581203 2023-01-24 
00:33:00.393508: step: 116/77, loss: 0.011139596812427044 2023-01-24 00:33:01.903069: step: 120/77, loss: 0.009026577696204185 2023-01-24 00:33:03.335994: step: 124/77, loss: 0.013764426112174988 2023-01-24 00:33:04.829764: step: 128/77, loss: 0.13721467554569244 2023-01-24 00:33:06.384132: step: 132/77, loss: 0.05707092583179474 2023-01-24 00:33:07.897744: step: 136/77, loss: 0.002023660810664296 2023-01-24 00:33:09.381731: step: 140/77, loss: 0.0009388293838128448 2023-01-24 00:33:10.798820: step: 144/77, loss: 1.7927572116605006e-05 2023-01-24 00:33:12.264096: step: 148/77, loss: 0.007441079709678888 2023-01-24 00:33:13.723911: step: 152/77, loss: 1.1405087207094766e-05 2023-01-24 00:33:15.224725: step: 156/77, loss: 0.01594318449497223 2023-01-24 00:33:16.679219: step: 160/77, loss: 0.009689363650977612 2023-01-24 00:33:18.083943: step: 164/77, loss: 0.00965337734669447 2023-01-24 00:33:19.537605: step: 168/77, loss: 0.00018008879851549864 2023-01-24 00:33:21.021571: step: 172/77, loss: 0.0013620827812701464 2023-01-24 00:33:22.493048: step: 176/77, loss: 0.009706384502351284 2023-01-24 00:33:23.910777: step: 180/77, loss: 0.00011257622099947184 2023-01-24 00:33:25.369268: step: 184/77, loss: 5.58066058147233e-05 2023-01-24 00:33:26.764013: step: 188/77, loss: 0.0010068108094856143 2023-01-24 00:33:28.226146: step: 192/77, loss: 0.0033298954367637634 2023-01-24 00:33:29.685000: step: 196/77, loss: 0.002340085804462433 2023-01-24 00:33:31.123796: step: 200/77, loss: 4.1069710277952254e-05 2023-01-24 00:33:32.620247: step: 204/77, loss: 4.162640470894985e-05 2023-01-24 00:33:34.081896: step: 208/77, loss: 0.0057720597833395 2023-01-24 00:33:35.567613: step: 212/77, loss: 0.045091234147548676 2023-01-24 00:33:37.037859: step: 216/77, loss: 0.0005430678138509393 2023-01-24 00:33:38.549811: step: 220/77, loss: 0.02244877815246582 2023-01-24 00:33:39.961941: step: 224/77, loss: 0.015751518309116364 2023-01-24 00:33:41.474819: step: 228/77, loss: 0.03594440221786499 2023-01-24 00:33:43.056112: step: 232/77, loss: 0.001045037410221994 2023-01-24 00:33:44.582460: step: 236/77, loss: 0.00036588916555047035 2023-01-24 00:33:46.061882: step: 240/77, loss: 0.0008532066131010652 2023-01-24 00:33:47.535589: step: 244/77, loss: 1.4190770343702752e-05 2023-01-24 00:33:49.040232: step: 248/77, loss: 0.002486072713509202 2023-01-24 00:33:50.478733: step: 252/77, loss: 0.0003959749301429838 2023-01-24 00:33:51.964898: step: 256/77, loss: 4.7969617298804224e-05 2023-01-24 00:33:53.419530: step: 260/77, loss: 0.02460763230919838 2023-01-24 00:33:54.886827: step: 264/77, loss: 0.026778748258948326 2023-01-24 00:33:56.374087: step: 268/77, loss: 0.00020856253104284406 2023-01-24 00:33:57.844816: step: 272/77, loss: 0.06297913193702698 2023-01-24 00:33:59.326380: step: 276/77, loss: 0.02976994216442108 2023-01-24 00:34:00.744100: step: 280/77, loss: 0.09210612624883652 2023-01-24 00:34:02.247130: step: 284/77, loss: 0.0030472474172711372 2023-01-24 00:34:03.722962: step: 288/77, loss: 0.0022248579189181328 2023-01-24 00:34:05.145091: step: 292/77, loss: 0.0329422801733017 2023-01-24 00:34:06.605096: step: 296/77, loss: 0.0012546322541311383 2023-01-24 00:34:08.091905: step: 300/77, loss: 0.012546733021736145 2023-01-24 00:34:09.500222: step: 304/77, loss: 0.002945370739325881 2023-01-24 00:34:10.958169: step: 308/77, loss: 0.0017267671646550298 2023-01-24 00:34:12.461964: step: 312/77, loss: 0.0015203878283500671 2023-01-24 00:34:13.855173: step: 316/77, loss: 0.009436001069843769 2023-01-24 00:34:15.247842: 
step: 320/77, loss: 0.00017988021136261523 2023-01-24 00:34:16.677751: step: 324/77, loss: 0.0006251703598536551 2023-01-24 00:34:18.138442: step: 328/77, loss: 0.004496883600950241 2023-01-24 00:34:19.629394: step: 332/77, loss: 0.0008476347429677844 2023-01-24 00:34:21.033673: step: 336/77, loss: 0.0237599965184927 2023-01-24 00:34:22.517026: step: 340/77, loss: 0.0004581378889270127 2023-01-24 00:34:23.957236: step: 344/77, loss: 0.0019197032088413835 2023-01-24 00:34:25.445591: step: 348/77, loss: 5.6227523600682616e-05 2023-01-24 00:34:26.912572: step: 352/77, loss: 2.4058379494817927e-05 2023-01-24 00:34:28.380618: step: 356/77, loss: 0.048158254474401474 2023-01-24 00:34:29.780393: step: 360/77, loss: 0.07158225774765015 2023-01-24 00:34:31.283913: step: 364/77, loss: 0.006104794796556234 2023-01-24 00:34:32.796152: step: 368/77, loss: 0.010368650779128075 2023-01-24 00:34:34.215468: step: 372/77, loss: 0.018325114622712135 2023-01-24 00:34:35.675312: step: 376/77, loss: 0.07363635301589966 2023-01-24 00:34:37.130970: step: 380/77, loss: 0.002189134480431676 2023-01-24 00:34:38.524127: step: 384/77, loss: 0.0005988850025460124 2023-01-24 00:34:39.915686: step: 388/77, loss: 0.00013736364780925214
==================================================
Loss: 0.014
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Chinese: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.3953488372093023, 'r': 0.014667817083692839, 'f1': 0.028286189683860232}, 'combined': 0.018489509354328148, 'epoch': 16}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Korean: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.3953488372093023, 'r': 0.014667817083692839, 'f1': 0.028286189683860232}, 'combined': 0.018489509354328148, 'epoch': 16}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Russian: {'template': {'p': 0.8918918918918919, 'r': 0.5038167938931297, 'f1': 0.6439024390243903}, 'slot': {'p': 0.3902439024390244, 'r': 0.013805004314063849, 'f1': 0.02666666666666667}, 'combined': 0.017170731707317075, 'epoch': 16}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 16}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 16}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 16}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r':
0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 17 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:36:14.702093: step: 4/77, loss: 0.004096877295523882 2023-01-24 00:36:16.136356: step: 8/77, loss: 0.004776291083544493 2023-01-24 00:36:17.576818: step: 12/77, loss: 0.005598829127848148 2023-01-24 00:36:19.104617: step: 16/77, loss: 0.002439585980027914 2023-01-24 00:36:20.525770: step: 20/77, loss: 3.45842563547194e-05 2023-01-24 00:36:22.013568: step: 24/77, loss: 0.0013430280378088355 2023-01-24 00:36:23.489400: step: 28/77, loss: 0.0018386875744909048 2023-01-24 00:36:24.964231: step: 32/77, loss: 0.0015910633374005556 2023-01-24 00:36:26.426845: step: 36/77, loss: 0.0009483025060035288 2023-01-24 00:36:27.877135: step: 40/77, loss: 0.007047180086374283 2023-01-24 00:36:29.324993: step: 44/77, loss: 0.00018717434431891888 2023-01-24 00:36:30.814155: step: 48/77, loss: 0.001299167750403285 2023-01-24 00:36:32.249221: step: 52/77, loss: 0.0006834648665972054 2023-01-24 00:36:33.731496: step: 56/77, loss: 0.002170323161408305 2023-01-24 00:36:35.221388: step: 60/77, loss: 0.002382697071880102 2023-01-24 00:36:36.644534: step: 64/77, loss: 0.000798621098510921 2023-01-24 00:36:38.158916: step: 68/77, loss: 0.003686311189085245 2023-01-24 00:36:39.572143: step: 72/77, loss: 0.06565132737159729 2023-01-24 00:36:41.035207: step: 76/77, loss: 0.005340840667486191 2023-01-24 00:36:42.462013: step: 80/77, loss: 0.0011033548507839441 2023-01-24 00:36:43.956656: step: 84/77, loss: 0.0021340290550142527 2023-01-24 00:36:45.359742: step: 88/77, loss: 0.0003320540417917073 2023-01-24 00:36:46.828330: step: 92/77, loss: 0.001845193444751203 2023-01-24 00:36:48.250784: step: 96/77, loss: 0.0006272225291468203 2023-01-24 
00:36:49.704582: step: 100/77, loss: 0.0001363737101200968 2023-01-24 00:36:51.194931: step: 104/77, loss: 5.750182026531547e-05 2023-01-24 00:36:52.655761: step: 108/77, loss: 0.05694460868835449 2023-01-24 00:36:54.098475: step: 112/77, loss: 0.050074946135282516 2023-01-24 00:36:55.523498: step: 116/77, loss: 0.0011947468155995011 2023-01-24 00:36:56.993317: step: 120/77, loss: 0.0027722204104065895 2023-01-24 00:36:58.414515: step: 124/77, loss: 0.0018663202645257115 2023-01-24 00:36:59.932964: step: 128/77, loss: 0.0026401153299957514 2023-01-24 00:37:01.347368: step: 132/77, loss: 0.006284075789153576 2023-01-24 00:37:02.812546: step: 136/77, loss: 0.0007562717655673623 2023-01-24 00:37:04.348652: step: 140/77, loss: 0.035824716091156006 2023-01-24 00:37:05.821325: step: 144/77, loss: 0.026698850095272064 2023-01-24 00:37:07.336301: step: 148/77, loss: 0.00026182507281191647 2023-01-24 00:37:08.849471: step: 152/77, loss: 0.00024711183505132794 2023-01-24 00:37:10.272352: step: 156/77, loss: 0.042546238750219345 2023-01-24 00:37:11.765900: step: 160/77, loss: 0.13341380655765533 2023-01-24 00:37:13.284167: step: 164/77, loss: 0.00846744142472744 2023-01-24 00:37:14.842440: step: 168/77, loss: 0.08409827202558517 2023-01-24 00:37:16.301501: step: 172/77, loss: 0.0067952219396829605 2023-01-24 00:37:17.710800: step: 176/77, loss: 0.0042352499440312386 2023-01-24 00:37:19.048924: step: 180/77, loss: 0.00029177599935792387 2023-01-24 00:37:20.518072: step: 184/77, loss: 0.00332057336345315 2023-01-24 00:37:22.018417: step: 188/77, loss: 0.002562582725659013 2023-01-24 00:37:23.492095: step: 192/77, loss: 0.14900614321231842 2023-01-24 00:37:24.925781: step: 196/77, loss: 0.006481688003987074 2023-01-24 00:37:26.353653: step: 200/77, loss: 0.00015579882892780006 2023-01-24 00:37:27.760664: step: 204/77, loss: 0.019360896199941635 2023-01-24 00:37:29.230330: step: 208/77, loss: 0.004088334273546934 2023-01-24 00:37:30.672125: step: 212/77, loss: 0.00019180560775566846 2023-01-24 00:37:32.019074: step: 216/77, loss: 0.0003879494033753872 2023-01-24 00:37:33.475321: step: 220/77, loss: 0.0012808794854208827 2023-01-24 00:37:34.927707: step: 224/77, loss: 0.00027431672788225114 2023-01-24 00:37:36.434401: step: 228/77, loss: 0.00030285323737189174 2023-01-24 00:37:37.969649: step: 232/77, loss: 9.394896915182471e-05 2023-01-24 00:37:39.426838: step: 236/77, loss: 0.005379479378461838 2023-01-24 00:37:40.876884: step: 240/77, loss: 0.00025858887238427997 2023-01-24 00:37:42.321009: step: 244/77, loss: 0.020550856366753578 2023-01-24 00:37:43.807554: step: 248/77, loss: 0.005961176007986069 2023-01-24 00:37:45.330286: step: 252/77, loss: 0.0003792982315644622 2023-01-24 00:37:46.821255: step: 256/77, loss: 0.00023473313194699585 2023-01-24 00:37:48.286805: step: 260/77, loss: 0.0024014206137508154 2023-01-24 00:37:49.745349: step: 264/77, loss: 2.4127570213750005e-05 2023-01-24 00:37:51.148421: step: 268/77, loss: 0.0009776921942830086 2023-01-24 00:37:52.569015: step: 272/77, loss: 0.010743441991508007 2023-01-24 00:37:54.034001: step: 276/77, loss: 0.007811921648681164 2023-01-24 00:37:55.505034: step: 280/77, loss: 0.002057426143437624 2023-01-24 00:37:57.005340: step: 284/77, loss: 0.0005255554569885135 2023-01-24 00:37:58.528404: step: 288/77, loss: 9.64771315921098e-05 2023-01-24 00:38:00.007400: step: 292/77, loss: 0.0022660386748611927 2023-01-24 00:38:01.528808: step: 296/77, loss: 0.0022366391494870186 2023-01-24 00:38:02.924161: step: 300/77, loss: 0.0016711915377527475 2023-01-24 
00:38:04.383031: step: 304/77, loss: 0.0007188515737652779 2023-01-24 00:38:05.857623: step: 308/77, loss: 0.020181728526949883 2023-01-24 00:38:07.333689: step: 312/77, loss: 0.00023877693456597626 2023-01-24 00:38:08.739652: step: 316/77, loss: 0.002911501796916127 2023-01-24 00:38:10.194756: step: 320/77, loss: 0.0006735376664437354 2023-01-24 00:38:11.618293: step: 324/77, loss: 0.0001844169746618718 2023-01-24 00:38:13.056828: step: 328/77, loss: 0.06680993735790253 2023-01-24 00:38:14.487978: step: 332/77, loss: 0.00025244487915188074 2023-01-24 00:38:15.994543: step: 336/77, loss: 0.0013559891376644373 2023-01-24 00:38:17.472743: step: 340/77, loss: 0.021186305209994316 2023-01-24 00:38:18.946140: step: 344/77, loss: 0.0040461840108036995 2023-01-24 00:38:20.369887: step: 348/77, loss: 0.0058979676105082035 2023-01-24 00:38:21.829862: step: 352/77, loss: 1.0261072020512074e-05 2023-01-24 00:38:23.318774: step: 356/77, loss: 0.002143705729395151 2023-01-24 00:38:24.811568: step: 360/77, loss: 0.06320623308420181 2023-01-24 00:38:26.281354: step: 364/77, loss: 0.00015051935042720288 2023-01-24 00:38:27.727652: step: 368/77, loss: 1.2711274393950589e-05 2023-01-24 00:38:29.150223: step: 372/77, loss: 0.21453514695167542 2023-01-24 00:38:30.605695: step: 376/77, loss: 0.014170726761221886 2023-01-24 00:38:31.994398: step: 380/77, loss: 6.4277210185537115e-06 2023-01-24 00:38:33.480290: step: 384/77, loss: 0.022878451272845268 2023-01-24 00:38:34.948625: step: 388/77, loss: 0.02404708042740822
==================================================
Loss: 0.013
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 17}
Test Chinese: {'template': {'p': 0.8552631578947368, 'r': 0.4961832061068702, 'f1': 0.6280193236714976}, 'slot': {'p': 0.4166666666666667, 'r': 0.012942191544434857, 'f1': 0.02510460251046025}, 'combined': 0.015766175489661027, 'epoch': 17}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5333333333333333, 'f1': 0.6956521739130436}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04890349201497669, 'epoch': 17}
Test Korean: {'template': {'p': 0.8648648648648649, 'r': 0.48854961832061067, 'f1': 0.624390243902439}, 'slot': {'p': 0.43243243243243246, 'r': 0.013805004314063849, 'f1': 0.026755852842809368}, 'combined': 0.016706093482339507, 'epoch': 17}
Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 17}
Test Russian: {'template': {'p': 0.8648648648648649, 'r': 0.48854961832061067, 'f1': 0.624390243902439}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.017750224324985724, 'epoch': 17}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 17}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 17}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 17}
==================================================
Current best result:
-------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 18 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:40:09.569515: step: 4/77, loss: 0.0028756712563335896 2023-01-24 00:40:11.043663: step: 8/77, loss: 0.045157160609960556 2023-01-24 00:40:12.471578: step: 12/77, loss: 0.012379593215882778 2023-01-24 00:40:13.908461: step: 16/77, loss: 0.02100391685962677 2023-01-24 00:40:15.367082: step: 20/77, loss: 0.0064195566810667515 2023-01-24 00:40:16.809601: step: 24/77, loss: 0.003383409697562456 2023-01-24 00:40:18.212313: step: 28/77, loss: 0.000816413841675967 2023-01-24 00:40:19.625584: step: 32/77, loss: 0.013582956977188587 2023-01-24 00:40:21.115333: step: 36/77, loss: 0.010575263760983944 2023-01-24 00:40:22.511134: step: 40/77, loss: 0.0007761925226077437 2023-01-24 00:40:23.949635: step: 44/77, loss: 0.005285558756440878 2023-01-24 00:40:25.423208: step: 48/77, loss: 0.001977279782295227 2023-01-24 00:40:26.856953: step: 52/77, loss: 0.001641746610403061 2023-01-24 00:40:28.299019: step: 56/77, loss: 6.50401216262253e-06 2023-01-24 00:40:29.763054: step: 60/77, loss: 0.02756490930914879 2023-01-24 00:40:31.191555: step: 64/77, loss: 0.0005078827380202711 2023-01-24 00:40:32.570003: step: 68/77, loss: 0.005699621979147196 2023-01-24 00:40:34.003587: step: 72/77, loss: 0.0027254566084593534 2023-01-24 00:40:35.479431: step: 76/77, loss: 2.5649595045251772e-05 2023-01-24 00:40:36.949497: step: 80/77, loss: 
1.5124951460165903e-05 2023-01-24 00:40:38.360419: step: 84/77, loss: 0.03806731477379799 2023-01-24 00:40:39.858108: step: 88/77, loss: 0.001081199967302382 2023-01-24 00:40:41.287106: step: 92/77, loss: 1.783421430445742e-05 2023-01-24 00:40:42.645910: step: 96/77, loss: 0.04326368868350983 2023-01-24 00:40:44.128242: step: 100/77, loss: 0.006358175538480282 2023-01-24 00:40:45.622819: step: 104/77, loss: 0.0016182229155674577 2023-01-24 00:40:47.062931: step: 108/77, loss: 0.007719434332102537 2023-01-24 00:40:48.514495: step: 112/77, loss: 0.08727987110614777 2023-01-24 00:40:49.946664: step: 116/77, loss: 6.309882155619562e-05 2023-01-24 00:40:51.362526: step: 120/77, loss: 0.009384812787175179 2023-01-24 00:40:52.830541: step: 124/77, loss: 0.0004629144095815718 2023-01-24 00:40:54.329462: step: 128/77, loss: 0.0019581338856369257 2023-01-24 00:40:55.753032: step: 132/77, loss: 0.0017693155677989125 2023-01-24 00:40:57.159451: step: 136/77, loss: 5.7300036132801324e-05 2023-01-24 00:40:58.643248: step: 140/77, loss: 0.0002848027506843209 2023-01-24 00:41:00.062018: step: 144/77, loss: 0.0009141655173152685 2023-01-24 00:41:01.547849: step: 148/77, loss: 0.0034261501859873533 2023-01-24 00:41:03.048497: step: 152/77, loss: 0.09311222285032272 2023-01-24 00:41:04.465511: step: 156/77, loss: 0.0011829708237200975 2023-01-24 00:41:05.891159: step: 160/77, loss: 0.0006447265041060746 2023-01-24 00:41:07.389629: step: 164/77, loss: 0.007240463979542255 2023-01-24 00:41:08.842688: step: 168/77, loss: 0.020239055156707764 2023-01-24 00:41:10.275359: step: 172/77, loss: 0.00021344999549910426 2023-01-24 00:41:11.697567: step: 176/77, loss: 0.0005965419695712626 2023-01-24 00:41:13.170251: step: 180/77, loss: 0.0009922974277287722 2023-01-24 00:41:14.690981: step: 184/77, loss: 0.0048019420355558395 2023-01-24 00:41:16.155596: step: 188/77, loss: 0.0009401412680745125 2023-01-24 00:41:17.632762: step: 192/77, loss: 0.0022180110681802034 2023-01-24 00:41:19.163069: step: 196/77, loss: 0.00930736307054758 2023-01-24 00:41:20.622087: step: 200/77, loss: 0.0036351527087390423 2023-01-24 00:41:22.118866: step: 204/77, loss: 0.00014229273074306548 2023-01-24 00:41:23.494224: step: 208/77, loss: 0.0007904646918177605 2023-01-24 00:41:24.935537: step: 212/77, loss: 0.003755400190129876 2023-01-24 00:41:26.367241: step: 216/77, loss: 9.308937296736985e-05 2023-01-24 00:41:27.804599: step: 220/77, loss: 3.324115095892921e-05 2023-01-24 00:41:29.244705: step: 224/77, loss: 0.0007430835394188762 2023-01-24 00:41:30.642070: step: 228/77, loss: 0.01505276933312416 2023-01-24 00:41:32.189918: step: 232/77, loss: 0.0008096634410321712 2023-01-24 00:41:33.596709: step: 236/77, loss: 0.0005306456587277353 2023-01-24 00:41:35.052576: step: 240/77, loss: 0.0006985433283261955 2023-01-24 00:41:36.601595: step: 244/77, loss: 0.0006532074767164886 2023-01-24 00:41:38.051698: step: 248/77, loss: 1.2684857210842893e-05 2023-01-24 00:41:39.491556: step: 252/77, loss: 3.6554571124725044e-05 2023-01-24 00:41:40.962625: step: 256/77, loss: 0.0003153039433527738 2023-01-24 00:41:42.466849: step: 260/77, loss: 0.0004511611768975854 2023-01-24 00:41:43.929063: step: 264/77, loss: 0.04475802928209305 2023-01-24 00:41:45.364083: step: 268/77, loss: 8.685257489560172e-05 2023-01-24 00:41:46.821887: step: 272/77, loss: 0.00020022218814119697 2023-01-24 00:41:48.293417: step: 276/77, loss: 0.004787555430084467 2023-01-24 00:41:49.735692: step: 280/77, loss: 0.0038822393398731947 2023-01-24 00:41:51.219456: step: 284/77, loss: 
0.05055629462003708 2023-01-24 00:41:52.655803: step: 288/77, loss: 0.002044880762696266 2023-01-24 00:41:54.126753: step: 292/77, loss: 0.00030714101740159094 2023-01-24 00:41:55.582536: step: 296/77, loss: 0.0073666879907250404 2023-01-24 00:41:57.080350: step: 300/77, loss: 0.000893754418939352 2023-01-24 00:41:58.511272: step: 304/77, loss: 0.004460779018700123 2023-01-24 00:41:59.998226: step: 308/77, loss: 0.0016178349032998085 2023-01-24 00:42:01.388913: step: 312/77, loss: 0.0005586580373346806 2023-01-24 00:42:02.828321: step: 316/77, loss: 0.0004077023477293551 2023-01-24 00:42:04.320574: step: 320/77, loss: 0.0034647120628505945 2023-01-24 00:42:05.808055: step: 324/77, loss: 0.0008419217774644494 2023-01-24 00:42:07.227750: step: 328/77, loss: 0.00011264411295996979 2023-01-24 00:42:08.784637: step: 332/77, loss: 0.0011876357020810246 2023-01-24 00:42:10.351080: step: 336/77, loss: 0.011139290407299995 2023-01-24 00:42:11.812261: step: 340/77, loss: 0.0006338097155094147 2023-01-24 00:42:13.337816: step: 344/77, loss: 0.00022002437617629766 2023-01-24 00:42:14.797644: step: 348/77, loss: 4.005714799859561e-05 2023-01-24 00:42:16.251264: step: 352/77, loss: 0.0015249974094331264 2023-01-24 00:42:17.714062: step: 356/77, loss: 0.003730625845491886 2023-01-24 00:42:19.145395: step: 360/77, loss: 0.0007901267963461578 2023-01-24 00:42:20.564362: step: 364/77, loss: 0.00011056935181841254 2023-01-24 00:42:22.062104: step: 368/77, loss: 0.005171317607164383 2023-01-24 00:42:23.545546: step: 372/77, loss: 0.005288761109113693 2023-01-24 00:42:25.073118: step: 376/77, loss: 0.006044285371899605 2023-01-24 00:42:26.570060: step: 380/77, loss: 0.0016455070581287146 2023-01-24 00:42:28.074455: step: 384/77, loss: 0.0005599971045739949 2023-01-24 00:42:29.477123: step: 388/77, loss: 0.003552208887413144 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Chinese: {'template': {'p': 0.881578947368421, 'r': 0.5114503816793893, 'f1': 0.6473429951690821}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.01838735324623959, 'epoch': 18} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Korean: {'template': {'p': 0.8933333333333333, 'r': 0.5114503816793893, 'f1': 0.6504854368932039}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018507535442986556, 'epoch': 18} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Russian: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01861339216407239, 'epoch': 18} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 18} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 
0.0}, 'combined': 0.0, 'epoch': 18} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 18} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 19 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:44:06.201831: step: 4/77, loss: 0.00014098669635131955 2023-01-24 00:44:07.664949: step: 8/77, loss: 0.037430573254823685 2023-01-24 00:44:09.086063: step: 12/77, loss: 9.992123523261398e-06 2023-01-24 00:44:10.500533: step: 16/77, loss: 0.027012495324015617 2023-01-24 00:44:11.946289: step: 20/77, loss: 0.0012325807474553585 2023-01-24 00:44:13.427876: step: 24/77, loss: 0.00010384414781583473 2023-01-24 00:44:14.874528: step: 28/77, loss: 0.05970339477062225 2023-01-24 00:44:16.398690: step: 32/77, loss: 0.010155356489121914 2023-01-24 00:44:17.886806: step: 36/77, loss: 0.019072813913226128 2023-01-24 00:44:19.332267: step: 40/77, loss: 0.00041514128679409623 2023-01-24 00:44:20.797588: step: 44/77, loss: 0.0005081351846456528 2023-01-24 00:44:22.190160: step: 48/77, loss: 5.959239206276834e-05 2023-01-24 00:44:23.678582: step: 52/77, loss: 0.00889910850673914 2023-01-24 00:44:25.112936: step: 56/77, loss: 0.03261871263384819 2023-01-24 00:44:26.528148: step: 60/77, loss: 0.00019574278849177063 2023-01-24 
00:44:27.954294: step: 64/77, loss: 7.374716369668022e-05 2023-01-24 00:44:29.398600: step: 68/77, loss: 0.001230890746228397 2023-01-24 00:44:30.854234: step: 72/77, loss: 0.03803974390029907 2023-01-24 00:44:32.269807: step: 76/77, loss: 0.00023354985751211643 2023-01-24 00:44:33.773444: step: 80/77, loss: 0.0221809484064579 2023-01-24 00:44:35.148451: step: 84/77, loss: 0.00018142740009352565 2023-01-24 00:44:36.615280: step: 88/77, loss: 0.027372796088457108 2023-01-24 00:44:38.095742: step: 92/77, loss: 0.0009225103422068059 2023-01-24 00:44:39.570822: step: 96/77, loss: 0.0002753528533503413 2023-01-24 00:44:40.995057: step: 100/77, loss: 0.04084169119596481 2023-01-24 00:44:42.428424: step: 104/77, loss: 0.033095426857471466 2023-01-24 00:44:43.877510: step: 108/77, loss: 0.019844137132167816 2023-01-24 00:44:45.280645: step: 112/77, loss: 0.00010307527554687113 2023-01-24 00:44:46.660117: step: 116/77, loss: 0.07973741739988327 2023-01-24 00:44:48.133385: step: 120/77, loss: 0.007755426689982414 2023-01-24 00:44:49.531431: step: 124/77, loss: 0.0020310066174715757 2023-01-24 00:44:50.975577: step: 128/77, loss: 0.0014131821226328611 2023-01-24 00:44:52.402847: step: 132/77, loss: 0.0007622442790307105 2023-01-24 00:44:53.855152: step: 136/77, loss: 0.0013501873472705483 2023-01-24 00:44:55.256625: step: 140/77, loss: 0.003476389916613698 2023-01-24 00:44:56.765976: step: 144/77, loss: 0.004717133939266205 2023-01-24 00:44:58.201267: step: 148/77, loss: 0.034023940563201904 2023-01-24 00:44:59.709165: step: 152/77, loss: 0.0025565975811332464 2023-01-24 00:45:01.290872: step: 156/77, loss: 0.0021379715763032436 2023-01-24 00:45:02.736738: step: 160/77, loss: 0.008405786007642746 2023-01-24 00:45:04.191090: step: 164/77, loss: 0.0002632577088661492 2023-01-24 00:45:05.712264: step: 168/77, loss: 0.007901491597294807 2023-01-24 00:45:07.214483: step: 172/77, loss: 0.0023893998004496098 2023-01-24 00:45:08.728963: step: 176/77, loss: 0.00036748748971149325 2023-01-24 00:45:10.192830: step: 180/77, loss: 6.505564670078456e-06 2023-01-24 00:45:11.695992: step: 184/77, loss: 0.002341366373002529 2023-01-24 00:45:13.125459: step: 188/77, loss: 0.00022118315973784775 2023-01-24 00:45:14.537004: step: 192/77, loss: 1.2627208889171015e-05 2023-01-24 00:45:16.035452: step: 196/77, loss: 0.0001342456671409309 2023-01-24 00:45:17.582536: step: 200/77, loss: 0.028776878491044044 2023-01-24 00:45:19.067749: step: 204/77, loss: 0.00026175566017627716 2023-01-24 00:45:20.549577: step: 208/77, loss: 0.0006376546807587147 2023-01-24 00:45:21.986525: step: 212/77, loss: 0.0004763914621435106 2023-01-24 00:45:23.429609: step: 216/77, loss: 3.821781137958169e-05 2023-01-24 00:45:24.898564: step: 220/77, loss: 0.00015172931307461113 2023-01-24 00:45:26.381414: step: 224/77, loss: 1.782135996108991e-06 2023-01-24 00:45:27.841066: step: 228/77, loss: 6.226929144759197e-06 2023-01-24 00:45:29.315939: step: 232/77, loss: 0.0005442426190711558 2023-01-24 00:45:30.704271: step: 236/77, loss: 6.345907604554668e-05 2023-01-24 00:45:32.204551: step: 240/77, loss: 8.847442222759128e-05 2023-01-24 00:45:33.644778: step: 244/77, loss: 0.0030105954501777887 2023-01-24 00:45:35.078245: step: 248/77, loss: 1.6285044694086537e-05 2023-01-24 00:45:36.618718: step: 252/77, loss: 0.00028308553737588227 2023-01-24 00:45:38.037149: step: 256/77, loss: 7.794688281137496e-05 2023-01-24 00:45:39.465519: step: 260/77, loss: 0.004675406496971846 2023-01-24 00:45:40.888620: step: 264/77, loss: 0.009595395065844059 2023-01-24 
00:45:42.369142: step: 268/77, loss: 0.09094759076833725
2023-01-24 00:45:43.873221: step: 272/77, loss: 0.00025511524290777743
2023-01-24 00:45:45.349029: step: 276/77, loss: 8.626453200122342e-05
2023-01-24 00:45:46.774030: step: 280/77, loss: 2.5752302462933585e-05
2023-01-24 00:45:48.222307: step: 284/77, loss: 0.00047939157229848206
2023-01-24 00:45:49.719802: step: 288/77, loss: 0.0007043445948511362
2023-01-24 00:45:51.182756: step: 292/77, loss: 3.5091293284494895e-06
2023-01-24 00:45:52.631958: step: 296/77, loss: 0.004549146164208651
2023-01-24 00:45:54.137970: step: 300/77, loss: 0.0013034878065809608
2023-01-24 00:45:55.567528: step: 304/77, loss: 7.059721247060224e-05
2023-01-24 00:45:56.952702: step: 308/77, loss: 0.028981227427721024
2023-01-24 00:45:58.379270: step: 312/77, loss: 5.120259811519645e-05
2023-01-24 00:45:59.768891: step: 316/77, loss: 0.0703129917383194
2023-01-24 00:46:01.249906: step: 320/77, loss: 8.140176760207396e-06
2023-01-24 00:46:02.724036: step: 324/77, loss: 0.02896547131240368
2023-01-24 00:46:04.189691: step: 328/77, loss: 1.8530841771280393e-05
2023-01-24 00:46:05.658155: step: 332/77, loss: 0.0004325577465351671
2023-01-24 00:46:07.109734: step: 336/77, loss: 0.0001609406026545912
2023-01-24 00:46:08.606696: step: 340/77, loss: 0.0032217581756412983
2023-01-24 00:46:10.096379: step: 344/77, loss: 0.00014991796342656016
2023-01-24 00:46:11.516900: step: 348/77, loss: 0.041126806288957596
2023-01-24 00:46:12.981566: step: 352/77, loss: 0.00013102588127367198
2023-01-24 00:46:14.487435: step: 356/77, loss: 0.00045040063560009
2023-01-24 00:46:15.988164: step: 360/77, loss: 5.666089418809861e-05
2023-01-24 00:46:17.402380: step: 364/77, loss: 0.0007076942129060626
2023-01-24 00:46:18.806814: step: 368/77, loss: 0.0002744776720646769
2023-01-24 00:46:20.317335: step: 372/77, loss: 1.4699688108521514e-05
2023-01-24 00:46:21.756924: step: 376/77, loss: 0.006560564041137695
2023-01-24 00:46:23.213686: step: 380/77, loss: 0.017250539734959602
2023-01-24 00:46:24.635375: step: 384/77, loss: 0.007945503108203411
2023-01-24 00:46:26.077506: step: 388/77, loss: 0.0006101108156144619
==================================================
Loss: 0.009
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19}
Test Chinese: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.018704634282523728, 'epoch': 19}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19}
Test Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018623488501406722, 'epoch': 19}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19}
Test Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018623488501406722, 'epoch': 19}
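[Editor's note] Every 'f1' in these dicts is the harmonic mean of its 'p' and 'r', and every 'combined' value is the template F1 multiplied by the slot F1 (e.g. 0.7368421052631579 × 0.07029876977152899 ≈ 0.05179909351586346 for the dev rows above). A minimal sketch of that arithmetic, reconstructed from the logged numbers rather than taken from train.py:

```python
# Minimal sketch of the scoring arithmetic recoverable from the log.
# combined = F1(template) * F1(slot); illustrative, not actual train.py code.
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined(template: dict, slot: dict) -> float:
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

# Dev values copied from the epoch-19 rows above:
print(combined({'p': 1.0, 'r': 0.5833333333333334},
               {'p': 0.5, 'r': 0.03780718336483932}))
# -> ~0.05179909351586346, matching the logged 'combined' up to rounding
```

Note also that the dev rows are identical across Chinese, Korean, and Russian in every epoch, which suggests all three evaluations share the same dev split.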
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 19}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 19}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 19}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 20
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-24 00:48:03.449733: step: 4/77, loss: 8.427352440776303e-05
2023-01-24 00:48:04.899526: step: 8/77, loss: 0.003391423961147666
2023-01-24 00:48:06.449535: step: 12/77, loss: 0.00022610797896049917
2023-01-24 00:48:07.828967: step: 16/77, loss: 0.012872123159468174
2023-01-24 00:48:09.255424: step: 20/77, loss: 7.742694288026541e-05
2023-01-24 00:48:10.698815: step: 24/77, loss: 0.00036254245787858963
2023-01-24 00:48:12.199252: step: 28/77, loss: 0.0004164370766375214
2023-01-24 00:48:13.627547: step: 32/77, loss: 0.0007234312943182886
2023-01-24 00:48:15.134298: step: 36/77, loss: 0.005771125201135874
2023-01-24 00:48:16.499338: step: 40/77, loss: 0.003106000367552042
2023-01-24 00:48:17.887421: step: 44/77, loss: 0.017629720270633698
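[Editor's note] The step counter in these lines advances in increments of 4, matching --accumulate_step 4: each logged line corresponds to one optimizer update accumulated over four micro-batches of 10 (an effective batch of 40), and the 'Loss:' line closing each epoch block is presumably the mean of these per-update losses. A runnable sketch of that accumulation pattern; the model, data, and print format below are stand-ins, not code from train.py:

```python
import torch

# Hedged sketch of the gradient-accumulation cadence implied by the log.
# Tiny stand-in model and synthetic data; only the step arithmetic mirrors
# the logged output (in the real log the counter eventually exceeds the
# displayed total, e.g. 388/77, so the two evidently count different units).
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
data = [(torch.randn(10, 8), torch.randn(10, 1)) for _ in range(12)]

accumulate_step = 4
optimizer.zero_grad()
for step, (x, y) in enumerate(data, start=1):
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accumulate_step).backward()  # scale so gradients average over the 4 micro-batches
    if step % accumulate_step == 0:
        optimizer.step()                 # one parameter update per 4 batches of 10 = 40 examples
        optimizer.zero_grad()
        print(f'step: {step}/{len(data)}, loss: {loss.item():.6f}')  # one line per update, as above
```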
2023-01-24 00:48:19.352400: step: 48/77, loss: 0.02136806771159172 2023-01-24 00:48:20.814818: step: 52/77, loss: 0.00046486067003570497 2023-01-24 00:48:22.272798: step: 56/77, loss: 0.001176183228380978 2023-01-24 00:48:23.732016: step: 60/77, loss: 0.006411808542907238 2023-01-24 00:48:25.163490: step: 64/77, loss: 0.0006058060680516064 2023-01-24 00:48:26.671104: step: 68/77, loss: 2.3365575543721206e-05 2023-01-24 00:48:28.135006: step: 72/77, loss: 0.008123316802084446 2023-01-24 00:48:29.599323: step: 76/77, loss: 1.5268948118318804e-05 2023-01-24 00:48:31.073196: step: 80/77, loss: 6.315793143585324e-05 2023-01-24 00:48:32.563274: step: 84/77, loss: 0.00023585847520735115 2023-01-24 00:48:34.013018: step: 88/77, loss: 0.00752745708450675 2023-01-24 00:48:35.450872: step: 92/77, loss: 0.0002199528826167807 2023-01-24 00:48:36.896556: step: 96/77, loss: 0.0004972168244421482 2023-01-24 00:48:38.397200: step: 100/77, loss: 0.0022977537009865046 2023-01-24 00:48:39.869366: step: 104/77, loss: 8.510050975019112e-05 2023-01-24 00:48:41.452649: step: 108/77, loss: 0.000818159431219101 2023-01-24 00:48:42.915038: step: 112/77, loss: 3.944763739127666e-05 2023-01-24 00:48:44.376813: step: 116/77, loss: 0.005611030850559473 2023-01-24 00:48:45.860421: step: 120/77, loss: 0.027019493281841278 2023-01-24 00:48:47.325321: step: 124/77, loss: 0.00016905099619179964 2023-01-24 00:48:48.751248: step: 128/77, loss: 0.000452701176982373 2023-01-24 00:48:50.253361: step: 132/77, loss: 0.001107914256863296 2023-01-24 00:48:51.732456: step: 136/77, loss: 3.4447726648068056e-05 2023-01-24 00:48:53.145244: step: 140/77, loss: 0.0005560338613577187 2023-01-24 00:48:54.638287: step: 144/77, loss: 0.00014413423195946962 2023-01-24 00:48:56.117067: step: 148/77, loss: 0.00014533651119563729 2023-01-24 00:48:57.527527: step: 152/77, loss: 0.005465753376483917 2023-01-24 00:48:59.045040: step: 156/77, loss: 1.3785129340249114e-05 2023-01-24 00:49:00.505544: step: 160/77, loss: 0.0049934606067836285 2023-01-24 00:49:01.983888: step: 164/77, loss: 0.001129616517573595 2023-01-24 00:49:03.439894: step: 168/77, loss: 0.003189380746334791 2023-01-24 00:49:04.836384: step: 172/77, loss: 6.476558610302163e-06 2023-01-24 00:49:06.293009: step: 176/77, loss: 0.02472682110965252 2023-01-24 00:49:07.777449: step: 180/77, loss: 5.072112344350899e-06 2023-01-24 00:49:09.231871: step: 184/77, loss: 0.0020358862821012735 2023-01-24 00:49:10.639766: step: 188/77, loss: 0.0002398270444246009 2023-01-24 00:49:12.126269: step: 192/77, loss: 0.05871470272541046 2023-01-24 00:49:13.703680: step: 196/77, loss: 0.002355430740863085 2023-01-24 00:49:15.212399: step: 200/77, loss: 0.0007327854400500655 2023-01-24 00:49:16.739522: step: 204/77, loss: 0.0006290523451752961 2023-01-24 00:49:18.236724: step: 208/77, loss: 0.00018614571308717132 2023-01-24 00:49:19.715177: step: 212/77, loss: 0.00010260358249070123 2023-01-24 00:49:21.189240: step: 216/77, loss: 0.022568479180336 2023-01-24 00:49:22.607141: step: 220/77, loss: 0.0005458329687826335 2023-01-24 00:49:24.043362: step: 224/77, loss: 0.0008564339368604124 2023-01-24 00:49:25.447606: step: 228/77, loss: 0.0035173415672034025 2023-01-24 00:49:26.862917: step: 232/77, loss: 0.016759483143687248 2023-01-24 00:49:28.366854: step: 236/77, loss: 0.03251827880740166 2023-01-24 00:49:29.748178: step: 240/77, loss: 0.00995706394314766 2023-01-24 00:49:31.152422: step: 244/77, loss: 0.0006280227098613977 2023-01-24 00:49:32.598784: step: 248/77, loss: 0.04479119926691055 2023-01-24 
00:49:34.112384: step: 252/77, loss: 0.00023071595933288336 2023-01-24 00:49:35.566278: step: 256/77, loss: 0.0007097484776750207 2023-01-24 00:49:37.060019: step: 260/77, loss: 0.00010665479203453287 2023-01-24 00:49:38.482101: step: 264/77, loss: 0.0001831296249292791 2023-01-24 00:49:39.921946: step: 268/77, loss: 0.0060518416576087475 2023-01-24 00:49:41.373656: step: 272/77, loss: 0.0017663311446085572 2023-01-24 00:49:42.867508: step: 276/77, loss: 0.0004453969595488161 2023-01-24 00:49:44.286190: step: 280/77, loss: 0.0013352977111935616 2023-01-24 00:49:45.724677: step: 284/77, loss: 0.002567255636677146 2023-01-24 00:49:47.164285: step: 288/77, loss: 1.810535104596056e-05 2023-01-24 00:49:48.621814: step: 292/77, loss: 0.0020445568952709436 2023-01-24 00:49:50.126874: step: 296/77, loss: 1.9323746528243646e-05 2023-01-24 00:49:51.580857: step: 300/77, loss: 0.002380709396675229 2023-01-24 00:49:53.000932: step: 304/77, loss: 0.0007935311878100038 2023-01-24 00:49:54.478914: step: 308/77, loss: 0.0022404869087040424 2023-01-24 00:49:55.948045: step: 312/77, loss: 0.05067255347967148 2023-01-24 00:49:57.387560: step: 316/77, loss: 0.00011900630488526076 2023-01-24 00:49:58.814718: step: 320/77, loss: 0.0002168344653910026 2023-01-24 00:50:00.263344: step: 324/77, loss: 0.0004526789125520736 2023-01-24 00:50:01.749836: step: 328/77, loss: 0.000879471015650779 2023-01-24 00:50:03.263319: step: 332/77, loss: 0.0030491615179926157 2023-01-24 00:50:04.727232: step: 336/77, loss: 0.006456117145717144 2023-01-24 00:50:06.160235: step: 340/77, loss: 0.010428737848997116 2023-01-24 00:50:07.581220: step: 344/77, loss: 0.0005070596816949546 2023-01-24 00:50:09.130907: step: 348/77, loss: 0.00023114164650905877 2023-01-24 00:50:10.621260: step: 352/77, loss: 0.03697359934449196 2023-01-24 00:50:12.030672: step: 356/77, loss: 0.011408147402107716 2023-01-24 00:50:13.537721: step: 360/77, loss: 0.015400312840938568 2023-01-24 00:50:14.992452: step: 364/77, loss: 1.4918558008503169e-05 2023-01-24 00:50:16.413001: step: 368/77, loss: 0.0003141985216643661 2023-01-24 00:50:17.894272: step: 372/77, loss: 0.03331225365400314 2023-01-24 00:50:19.306712: step: 376/77, loss: 0.0009111135732382536 2023-01-24 00:50:20.721077: step: 380/77, loss: 0.0017461779061704874 2023-01-24 00:50:22.123863: step: 384/77, loss: 1.2481615158321802e-05 2023-01-24 00:50:23.571738: step: 388/77, loss: 0.032015111297369 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Chinese: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018118611961114927, 'epoch': 20} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Korean: {'template': {'p': 0.9130434782608695, 'r': 0.48091603053435117, 'f1': 0.63}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01793969849246231, 'epoch': 20} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 
0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Russian: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.018103462620010315, 'epoch': 20} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 20} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 20} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 20} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 21 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:51:58.435779: step: 4/77, loss: 8.27891708468087e-05 2023-01-24 00:51:59.879578: step: 8/77, loss: 0.00043526801164261997 2023-01-24 00:52:01.350695: step: 12/77, loss: 9.745121133164503e-06 2023-01-24 00:52:02.889017: step: 16/77, loss: 2.9559068934759125e-05 2023-01-24 00:52:04.385132: step: 20/77, loss: 0.0003522019542288035 2023-01-24 00:52:05.871333: step: 24/77, loss: 0.00022706623713020235 2023-01-24 00:52:07.328953: step: 
28/77, loss: 0.009877260774374008 2023-01-24 00:52:08.788175: step: 32/77, loss: 0.006384177133440971 2023-01-24 00:52:10.291639: step: 36/77, loss: 0.00014239516167435795 2023-01-24 00:52:11.763232: step: 40/77, loss: 0.0005055277142673731 2023-01-24 00:52:13.245719: step: 44/77, loss: 0.05409137159585953 2023-01-24 00:52:14.748417: step: 48/77, loss: 3.8041166590119246e-06 2023-01-24 00:52:16.258267: step: 52/77, loss: 8.18027911009267e-06 2023-01-24 00:52:17.745046: step: 56/77, loss: 9.932726243278012e-05 2023-01-24 00:52:19.202002: step: 60/77, loss: 0.00014925358118489385 2023-01-24 00:52:20.639084: step: 64/77, loss: 0.0002240837930003181 2023-01-24 00:52:22.059462: step: 68/77, loss: 5.057529415353201e-05 2023-01-24 00:52:23.523631: step: 72/77, loss: 9.591747584636323e-06 2023-01-24 00:52:25.095943: step: 76/77, loss: 6.70036433803034e-06 2023-01-24 00:52:26.582626: step: 80/77, loss: 0.0002989994827657938 2023-01-24 00:52:28.033502: step: 84/77, loss: 0.0007450602715834975 2023-01-24 00:52:29.535152: step: 88/77, loss: 0.02040993794798851 2023-01-24 00:52:30.950724: step: 92/77, loss: 1.1034421731892508e-05 2023-01-24 00:52:32.351926: step: 96/77, loss: 6.633730663452297e-05 2023-01-24 00:52:33.766822: step: 100/77, loss: 0.00017172204388771206 2023-01-24 00:52:35.237044: step: 104/77, loss: 2.5287281459895894e-05 2023-01-24 00:52:36.626318: step: 108/77, loss: 0.0019874691497534513 2023-01-24 00:52:38.115817: step: 112/77, loss: 0.0028940557967871428 2023-01-24 00:52:39.604742: step: 116/77, loss: 1.3265988854982425e-05 2023-01-24 00:52:41.032715: step: 120/77, loss: 1.3763572496827692e-05 2023-01-24 00:52:42.463725: step: 124/77, loss: 5.456419967231341e-05 2023-01-24 00:52:43.924263: step: 128/77, loss: 3.199049024260603e-05 2023-01-24 00:52:45.449268: step: 132/77, loss: 0.00015985312347766012 2023-01-24 00:52:46.877834: step: 136/77, loss: 0.0018992533441632986 2023-01-24 00:52:48.337180: step: 140/77, loss: 1.3396806025411934e-05 2023-01-24 00:52:49.814098: step: 144/77, loss: 0.0006888278294354677 2023-01-24 00:52:51.200364: step: 148/77, loss: 0.001850732951425016 2023-01-24 00:52:52.607996: step: 152/77, loss: 0.0010139790829271078 2023-01-24 00:52:54.081866: step: 156/77, loss: 4.4166750740259886e-05 2023-01-24 00:52:55.483041: step: 160/77, loss: 3.8711961678927764e-05 2023-01-24 00:52:56.922003: step: 164/77, loss: 4.7252186050172895e-05 2023-01-24 00:52:58.361034: step: 168/77, loss: 0.0015337056247517467 2023-01-24 00:52:59.759612: step: 172/77, loss: 1.5749639715068042e-05 2023-01-24 00:53:01.248160: step: 176/77, loss: 2.1206040401011705e-05 2023-01-24 00:53:02.665567: step: 180/77, loss: 0.00034399217111058533 2023-01-24 00:53:04.113639: step: 184/77, loss: 5.2080049499636516e-05 2023-01-24 00:53:05.647397: step: 188/77, loss: 0.017733346670866013 2023-01-24 00:53:07.129857: step: 192/77, loss: 0.07597655057907104 2023-01-24 00:53:08.590167: step: 196/77, loss: 0.24772121012210846 2023-01-24 00:53:10.013837: step: 200/77, loss: 9.717977081891149e-05 2023-01-24 00:53:11.500798: step: 204/77, loss: 0.00013973054592497647 2023-01-24 00:53:13.004031: step: 208/77, loss: 0.024061929434537888 2023-01-24 00:53:14.409649: step: 212/77, loss: 0.0001402836642228067 2023-01-24 00:53:15.895741: step: 216/77, loss: 0.00044021164649166167 2023-01-24 00:53:17.345622: step: 220/77, loss: 4.2444888094905764e-05 2023-01-24 00:53:18.779610: step: 224/77, loss: 0.00022847886430099607 2023-01-24 00:53:20.245390: step: 228/77, loss: 0.0002529481425881386 2023-01-24 00:53:21.650131: 
step: 232/77, loss: 2.9905932024121284e-06 2023-01-24 00:53:23.142004: step: 236/77, loss: 0.002287093782797456 2023-01-24 00:53:24.638084: step: 240/77, loss: 5.147743650013581e-06 2023-01-24 00:53:26.088463: step: 244/77, loss: 0.0014095694059506059 2023-01-24 00:53:27.489685: step: 248/77, loss: 2.9975240977364592e-05 2023-01-24 00:53:29.057799: step: 252/77, loss: 0.013966652564704418 2023-01-24 00:53:30.488066: step: 256/77, loss: 0.019837338477373123 2023-01-24 00:53:31.994355: step: 260/77, loss: 0.0011463196715340018 2023-01-24 00:53:33.467304: step: 264/77, loss: 9.620159835321829e-05 2023-01-24 00:53:34.906462: step: 268/77, loss: 0.00427041482180357 2023-01-24 00:53:36.295866: step: 272/77, loss: 0.00024166949151549488 2023-01-24 00:53:37.773806: step: 276/77, loss: 0.0034298989921808243 2023-01-24 00:53:39.200847: step: 280/77, loss: 2.2885571524966508e-05 2023-01-24 00:53:40.669916: step: 284/77, loss: 0.0002919553080573678 2023-01-24 00:53:42.147583: step: 288/77, loss: 0.0018753650365397334 2023-01-24 00:53:43.639061: step: 292/77, loss: 0.0005949947517365217 2023-01-24 00:53:45.065651: step: 296/77, loss: 0.003798791905865073 2023-01-24 00:53:46.508343: step: 300/77, loss: 1.837530180637259e-05 2023-01-24 00:53:47.977671: step: 304/77, loss: 9.359028808830772e-06 2023-01-24 00:53:49.360618: step: 308/77, loss: 0.03410165011882782 2023-01-24 00:53:50.793660: step: 312/77, loss: 0.005747949704527855 2023-01-24 00:53:52.195486: step: 316/77, loss: 0.0071901543997228146 2023-01-24 00:53:53.581408: step: 320/77, loss: 0.014834891073405743 2023-01-24 00:53:55.079244: step: 324/77, loss: 3.8532862163265236e-06 2023-01-24 00:53:56.498863: step: 328/77, loss: 0.0001279481512028724 2023-01-24 00:53:57.890545: step: 332/77, loss: 0.0006787201855331659 2023-01-24 00:53:59.345788: step: 336/77, loss: 0.00013982687960378826 2023-01-24 00:54:00.813347: step: 340/77, loss: 2.8490972908912227e-05 2023-01-24 00:54:02.277600: step: 344/77, loss: 0.014455622062087059 2023-01-24 00:54:03.778934: step: 348/77, loss: 0.13285669684410095 2023-01-24 00:54:05.200080: step: 352/77, loss: 0.00019437828450463712 2023-01-24 00:54:06.697478: step: 356/77, loss: 0.002372733550146222 2023-01-24 00:54:08.147814: step: 360/77, loss: 0.009212651289999485 2023-01-24 00:54:09.612323: step: 364/77, loss: 0.060634076595306396 2023-01-24 00:54:11.041547: step: 368/77, loss: 1.0017057320510503e-05 2023-01-24 00:54:12.490469: step: 372/77, loss: 0.0002461381664033979 2023-01-24 00:54:13.941554: step: 376/77, loss: 2.5817340429057367e-05 2023-01-24 00:54:15.432196: step: 380/77, loss: 0.0008293167338706553 2023-01-24 00:54:16.916893: step: 384/77, loss: 7.01223143551033e-06 2023-01-24 00:54:18.433529: step: 388/77, loss: 0.03312050551176071 ================================================== Loss: 0.009 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Chinese: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.48484848484848486, 'r': 0.013805004314063849, 'f1': 0.02684563758389262}, 'combined': 0.016436104643199563, 'epoch': 21} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Korean: {'template': {'p': 
0.9206349206349206, 'r': 0.44274809160305345, 'f1': 0.5979381443298969}, 'slot': {'p': 0.5294117647058824, 'r': 0.015530629853321829, 'f1': 0.03017602682313495}, 'combined': 0.018043397481874505, 'epoch': 21} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Russian: {'template': {'p': 0.9206349206349206, 'r': 0.44274809160305345, 'f1': 0.5979381443298969}, 'slot': {'p': 0.48484848484848486, 'r': 0.013805004314063849, 'f1': 0.02684563758389262}, 'combined': 0.01605203072026569, 'epoch': 21} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 21} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 21} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 21} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 22 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:55:53.192339: step: 4/77, loss: 0.005217066500335932 2023-01-24 00:55:54.648055: step: 
8/77, loss: 1.7110400222009048e-05 2023-01-24 00:55:56.044974: step: 12/77, loss: 0.013908019289374352 2023-01-24 00:55:57.491323: step: 16/77, loss: 0.060203030705451965 2023-01-24 00:55:58.968731: step: 20/77, loss: 0.0001018143302644603 2023-01-24 00:56:00.434035: step: 24/77, loss: 0.00033130700467154384 2023-01-24 00:56:01.956461: step: 28/77, loss: 0.0022158983629196882 2023-01-24 00:56:03.449868: step: 32/77, loss: 0.001340493792667985 2023-01-24 00:56:04.862038: step: 36/77, loss: 9.457357373321429e-05 2023-01-24 00:56:06.314666: step: 40/77, loss: 0.007253519259393215 2023-01-24 00:56:07.760195: step: 44/77, loss: 0.01193111203610897 2023-01-24 00:56:09.254444: step: 48/77, loss: 1.947039709193632e-05 2023-01-24 00:56:10.663417: step: 52/77, loss: 0.010435977019369602 2023-01-24 00:56:12.080881: step: 56/77, loss: 0.00014408468268811703 2023-01-24 00:56:13.524029: step: 60/77, loss: 0.01907634176313877 2023-01-24 00:56:14.973934: step: 64/77, loss: 0.048381563276052475 2023-01-24 00:56:16.412772: step: 68/77, loss: 0.0009525776840746403 2023-01-24 00:56:17.867465: step: 72/77, loss: 0.040768206119537354 2023-01-24 00:56:19.347155: step: 76/77, loss: 0.0001661547867115587 2023-01-24 00:56:20.749211: step: 80/77, loss: 0.0006874087848700583 2023-01-24 00:56:22.227549: step: 84/77, loss: 2.096245952998288e-05 2023-01-24 00:56:23.673876: step: 88/77, loss: 0.007326844148337841 2023-01-24 00:56:25.105004: step: 92/77, loss: 0.024552563205361366 2023-01-24 00:56:26.561337: step: 96/77, loss: 0.0018240232020616531 2023-01-24 00:56:28.013849: step: 100/77, loss: 1.764373701007571e-05 2023-01-24 00:56:29.481947: step: 104/77, loss: 0.0007256006938405335 2023-01-24 00:56:30.936767: step: 108/77, loss: 0.0015230000717565417 2023-01-24 00:56:32.462659: step: 112/77, loss: 2.4019012926146388e-05 2023-01-24 00:56:33.859877: step: 116/77, loss: 3.794354779529385e-05 2023-01-24 00:56:35.318298: step: 120/77, loss: 0.0011226541828364134 2023-01-24 00:56:36.737208: step: 124/77, loss: 0.014624684117734432 2023-01-24 00:56:38.182535: step: 128/77, loss: 0.0008393382304348052 2023-01-24 00:56:39.608817: step: 132/77, loss: 3.3838718991319183e-06 2023-01-24 00:56:41.118004: step: 136/77, loss: 0.0315176360309124 2023-01-24 00:56:42.515574: step: 140/77, loss: 0.0003097353910561651 2023-01-24 00:56:43.903358: step: 144/77, loss: 3.469519651844166e-05 2023-01-24 00:56:45.339094: step: 148/77, loss: 8.082071144599468e-05 2023-01-24 00:56:46.823921: step: 152/77, loss: 0.001497301273047924 2023-01-24 00:56:48.321321: step: 156/77, loss: 3.961978109146003e-06 2023-01-24 00:56:49.825221: step: 160/77, loss: 0.004871972370892763 2023-01-24 00:56:51.269528: step: 164/77, loss: 0.0010909107513725758 2023-01-24 00:56:52.686281: step: 168/77, loss: 8.584916213294491e-05 2023-01-24 00:56:54.136924: step: 172/77, loss: 0.010070906020700932 2023-01-24 00:56:55.604477: step: 176/77, loss: 0.0017261668108403683 2023-01-24 00:56:57.112161: step: 180/77, loss: 0.0006850729114376009 2023-01-24 00:56:58.579628: step: 184/77, loss: 0.029952459037303925 2023-01-24 00:57:00.037940: step: 188/77, loss: 1.568080551805906e-05 2023-01-24 00:57:01.580539: step: 192/77, loss: 0.02392057701945305 2023-01-24 00:57:03.072972: step: 196/77, loss: 0.00026258424622938037 2023-01-24 00:57:04.549758: step: 200/77, loss: 0.018205387517809868 2023-01-24 00:57:06.016911: step: 204/77, loss: 3.9935000017976563e-07 2023-01-24 00:57:07.444983: step: 208/77, loss: 0.0023173026274889708 2023-01-24 00:57:08.901960: step: 212/77, loss: 
0.017787277698516846 2023-01-24 00:57:10.322308: step: 216/77, loss: 4.1314960981253535e-06 2023-01-24 00:57:11.790625: step: 220/77, loss: 4.357820944278501e-05 2023-01-24 00:57:13.216695: step: 224/77, loss: 7.372977415798232e-05 2023-01-24 00:57:14.691800: step: 228/77, loss: 0.0021139492746442556 2023-01-24 00:57:16.103412: step: 232/77, loss: 0.00020499885431490839 2023-01-24 00:57:17.533847: step: 236/77, loss: 4.148268635617569e-05 2023-01-24 00:57:18.970096: step: 240/77, loss: 6.6961051743419375e-06 2023-01-24 00:57:20.459343: step: 244/77, loss: 0.11798547953367233 2023-01-24 00:57:21.854098: step: 248/77, loss: 0.0022454713471233845 2023-01-24 00:57:23.378263: step: 252/77, loss: 0.061752669513225555 2023-01-24 00:57:24.824233: step: 256/77, loss: 9.347809827886522e-05 2023-01-24 00:57:26.349640: step: 260/77, loss: 2.206732824561186e-05 2023-01-24 00:57:27.867086: step: 264/77, loss: 0.09364143759012222 2023-01-24 00:57:29.359788: step: 268/77, loss: 1.2169822184660006e-05 2023-01-24 00:57:30.842587: step: 272/77, loss: 6.98716685292311e-05 2023-01-24 00:57:32.341911: step: 276/77, loss: 8.472028639516793e-06 2023-01-24 00:57:33.802386: step: 280/77, loss: 0.028671029955148697 2023-01-24 00:57:35.178516: step: 284/77, loss: 4.439530675881542e-05 2023-01-24 00:57:36.654678: step: 288/77, loss: 3.467196165729547e-06 2023-01-24 00:57:38.131406: step: 292/77, loss: 0.027677757665514946 2023-01-24 00:57:39.577472: step: 296/77, loss: 0.07804442197084427 2023-01-24 00:57:41.012610: step: 300/77, loss: 0.04617539048194885 2023-01-24 00:57:42.493833: step: 304/77, loss: 8.539891132386401e-05 2023-01-24 00:57:43.925383: step: 308/77, loss: 4.726528914034134e-06 2023-01-24 00:57:45.432144: step: 312/77, loss: 0.0016289966879412532 2023-01-24 00:57:46.880794: step: 316/77, loss: 8.53421715873992e-06 2023-01-24 00:57:48.372461: step: 320/77, loss: 0.0001842157798819244 2023-01-24 00:57:49.782887: step: 324/77, loss: 0.013489406555891037 2023-01-24 00:57:51.215294: step: 328/77, loss: 0.0041824206709861755 2023-01-24 00:57:52.685750: step: 332/77, loss: 5.475835860124789e-05 2023-01-24 00:57:54.169233: step: 336/77, loss: 0.0028862636536359787 2023-01-24 00:57:55.701573: step: 340/77, loss: 0.015338265337049961 2023-01-24 00:57:57.221471: step: 344/77, loss: 4.800434908247553e-05 2023-01-24 00:57:58.685866: step: 348/77, loss: 0.003050778992474079 2023-01-24 00:58:00.147206: step: 352/77, loss: 0.00028969853883609176 2023-01-24 00:58:01.614777: step: 356/77, loss: 6.305790066107875e-06 2023-01-24 00:58:03.076046: step: 360/77, loss: 3.759312312467955e-05 2023-01-24 00:58:04.540637: step: 364/77, loss: 0.0016704229637980461 2023-01-24 00:58:05.994954: step: 368/77, loss: 7.791905227350071e-05 2023-01-24 00:58:07.447812: step: 372/77, loss: 0.0015094865811988711 2023-01-24 00:58:08.931535: step: 376/77, loss: 0.0017382418736815453 2023-01-24 00:58:10.377391: step: 380/77, loss: 5.582388894254109e-06 2023-01-24 00:58:11.787133: step: 384/77, loss: 4.877383980783634e-05 2023-01-24 00:58:13.214876: step: 388/77, loss: 5.6000149925239384e-05 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Chinese: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5161290322580645, 'r': 0.013805004314063849, 'f1': 
0.02689075630252101}, 'combined': 0.01638015612336305, 'epoch': 22} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Korean: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5161290322580645, 'r': 0.013805004314063849, 'f1': 0.02689075630252101}, 'combined': 0.01638015612336305, 'epoch': 22} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.5333333333333333, 'r': 0.013805004314063849, 'f1': 0.026913372582001684}, 'combined': 0.016477575050205112, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 22} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 23 command: python 
train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 00:59:48.204243: step: 4/77, loss: 4.509889549808577e-05 2023-01-24 00:59:49.671169: step: 8/77, loss: 0.0002913977950811386 2023-01-24 00:59:51.087475: step: 12/77, loss: 0.01139332540333271 2023-01-24 00:59:52.519205: step: 16/77, loss: 0.004326066002249718 2023-01-24 00:59:53.973949: step: 20/77, loss: 2.7513986424310133e-05 2023-01-24 00:59:55.393239: step: 24/77, loss: 1.3975003639643546e-05 2023-01-24 00:59:56.813971: step: 28/77, loss: 0.007240791339427233 2023-01-24 00:59:58.232345: step: 32/77, loss: 0.00022055639419704676 2023-01-24 00:59:59.662002: step: 36/77, loss: 0.005759728141129017 2023-01-24 01:00:01.075277: step: 40/77, loss: 3.934923824999714e-06 2023-01-24 01:00:02.498640: step: 44/77, loss: 0.00020762631902471185 2023-01-24 01:00:03.983744: step: 48/77, loss: 0.017258862033486366 2023-01-24 01:00:05.493823: step: 52/77, loss: 0.01816752925515175 2023-01-24 01:00:06.925060: step: 56/77, loss: 7.464773807441816e-05 2023-01-24 01:00:08.343411: step: 60/77, loss: 6.470151856774464e-05 2023-01-24 01:00:09.831220: step: 64/77, loss: 0.03022352047264576 2023-01-24 01:00:11.344383: step: 68/77, loss: 0.030080357566475868 2023-01-24 01:00:12.761105: step: 72/77, loss: 0.0005232697003521025 2023-01-24 01:00:14.230478: step: 76/77, loss: 5.357146437745541e-05 2023-01-24 01:00:15.686552: step: 80/77, loss: 3.602316428441554e-05 2023-01-24 01:00:17.100704: step: 84/77, loss: 1.2637017789529637e-05 2023-01-24 01:00:18.568348: step: 88/77, loss: 0.0014475014759227633 2023-01-24 01:00:20.053245: step: 92/77, loss: 0.036904722452163696 2023-01-24 01:00:21.490585: step: 96/77, loss: 3.237287455704063e-05 2023-01-24 01:00:22.984648: step: 100/77, loss: 0.022598525509238243 2023-01-24 01:00:24.406402: step: 104/77, loss: 0.0006198819610290229 2023-01-24 01:00:25.884985: step: 108/77, loss: 0.037170566618442535 2023-01-24 01:00:27.346527: step: 112/77, loss: 2.7292508093523793e-05 2023-01-24 01:00:28.868735: step: 116/77, loss: 0.037723708897829056 2023-01-24 01:00:30.348878: step: 120/77, loss: 0.020332274958491325 2023-01-24 01:00:31.778062: step: 124/77, loss: 0.0104548754170537 2023-01-24 01:00:33.245668: step: 128/77, loss: 0.00011534785153344274 2023-01-24 01:00:34.664158: step: 132/77, loss: 0.018704598769545555 2023-01-24 01:00:36.067680: step: 136/77, loss: 5.230163424130296e-06 2023-01-24 01:00:37.527222: step: 140/77, loss: 0.005852022208273411 2023-01-24 01:00:39.006046: step: 144/77, loss: 0.0001379873719997704 2023-01-24 01:00:40.475740: step: 148/77, loss: 0.0046800170093774796 2023-01-24 01:00:41.897920: step: 152/77, loss: 0.0003418435517232865 2023-01-24 01:00:43.408950: step: 156/77, loss: 0.0001083689639926888 2023-01-24 01:00:44.870573: step: 160/77, loss: 0.00015809066826477647 2023-01-24 01:00:46.385422: step: 164/77, loss: 3.1797608244232833e-06 2023-01-24 01:00:47.817808: step: 168/77, loss: 5.667543882736936e-05 2023-01-24 01:00:49.293006: step: 172/77, loss: 3.554875002009794e-05 2023-01-24 01:00:50.742253: step: 176/77, loss: 0.0007138706278055906 2023-01-24 01:00:52.215042: step: 180/77, loss: 0.0003262606624048203 2023-01-24 01:00:53.742878: step: 184/77, loss: 0.01835954189300537 2023-01-24 01:00:55.169698: step: 188/77, loss: 5.6338943977607414e-05 2023-01-24 01:00:56.628871: step: 192/77, loss: 2.4946128178271465e-05 2023-01-24 
01:00:58.123523: step: 196/77, loss: 8.789340063231066e-05 2023-01-24 01:00:59.569206: step: 200/77, loss: 5.236292054178193e-05 2023-01-24 01:01:00.977111: step: 204/77, loss: 0.001034987042658031 2023-01-24 01:01:02.557841: step: 208/77, loss: 1.9261453417129815e-05 2023-01-24 01:01:04.011329: step: 212/77, loss: 0.00017419336654711515 2023-01-24 01:01:05.518349: step: 216/77, loss: 0.027231387794017792 2023-01-24 01:01:06.939524: step: 220/77, loss: 0.003382456488907337 2023-01-24 01:01:08.332465: step: 224/77, loss: 0.00829127337783575 2023-01-24 01:01:09.757358: step: 228/77, loss: 0.016746779903769493 2023-01-24 01:01:11.172440: step: 232/77, loss: 0.0023354680743068457 2023-01-24 01:01:12.645895: step: 236/77, loss: 0.00019349480862729251 2023-01-24 01:01:14.087586: step: 240/77, loss: 5.5896136473165825e-05 2023-01-24 01:01:15.529517: step: 244/77, loss: 6.264793682930758e-06 2023-01-24 01:01:16.958909: step: 248/77, loss: 0.014695419929921627 2023-01-24 01:01:18.416739: step: 252/77, loss: 0.0002196079440182075 2023-01-24 01:01:19.797029: step: 256/77, loss: 0.00027947252965532243 2023-01-24 01:01:21.233435: step: 260/77, loss: 0.07945749908685684 2023-01-24 01:01:22.705094: step: 264/77, loss: 2.668962588359136e-05 2023-01-24 01:01:24.136593: step: 268/77, loss: 4.155714123044163e-06 2023-01-24 01:01:25.613645: step: 272/77, loss: 1.8812963389791548e-05 2023-01-24 01:01:27.142054: step: 276/77, loss: 0.00014113544602878392 2023-01-24 01:01:28.542516: step: 280/77, loss: 0.002064595464617014 2023-01-24 01:01:30.000480: step: 284/77, loss: 0.0031714034266769886 2023-01-24 01:01:31.473882: step: 288/77, loss: 2.7988609872409143e-05 2023-01-24 01:01:32.974859: step: 292/77, loss: 1.8803808416123502e-05 2023-01-24 01:01:34.434833: step: 296/77, loss: 0.0007328785723075271 2023-01-24 01:01:35.915527: step: 300/77, loss: 0.04188724607229233 2023-01-24 01:01:37.343687: step: 304/77, loss: 2.3720574517938076e-06 2023-01-24 01:01:38.813556: step: 308/77, loss: 0.00018719259242061526 2023-01-24 01:01:40.247684: step: 312/77, loss: 0.0003036497510038316 2023-01-24 01:01:41.711146: step: 316/77, loss: 3.775849108933471e-06 2023-01-24 01:01:43.104056: step: 320/77, loss: 0.00034413387766107917 2023-01-24 01:01:44.620656: step: 324/77, loss: 0.0007906183018349111 2023-01-24 01:01:46.086164: step: 328/77, loss: 2.5849119992926717e-05 2023-01-24 01:01:47.552035: step: 332/77, loss: 0.015596098266541958 2023-01-24 01:01:48.975073: step: 336/77, loss: 4.1424982555327006e-07 2023-01-24 01:01:50.464291: step: 340/77, loss: 4.976780473953113e-06 2023-01-24 01:01:51.971100: step: 344/77, loss: 0.0038793461862951517 2023-01-24 01:01:53.484688: step: 348/77, loss: 0.027320679277181625 2023-01-24 01:01:54.916121: step: 352/77, loss: 8.697208613739349e-06 2023-01-24 01:01:56.314579: step: 356/77, loss: 0.011141828261315823 2023-01-24 01:01:57.726409: step: 360/77, loss: 0.01932971365749836 2023-01-24 01:01:59.194064: step: 364/77, loss: 0.00039088045014068484 2023-01-24 01:02:00.651012: step: 368/77, loss: 1.4817132978350855e-05 2023-01-24 01:02:02.138705: step: 372/77, loss: 0.0013533779419958591 2023-01-24 01:02:03.590837: step: 376/77, loss: 0.000289235933450982 2023-01-24 01:02:05.071813: step: 380/77, loss: 6.28348789177835e-05 2023-01-24 01:02:06.544516: step: 384/77, loss: 0.011985783465206623 2023-01-24 01:02:08.068702: step: 388/77, loss: 9.261792001780123e-06 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Chinese: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.017776966296797325, 'epoch': 23} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Korean: {'template': {'p': 0.9, 'r': 0.48091603053435117, 'f1': 0.6268656716417911}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.017865408915189354, 'epoch': 23} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Russian: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016717249614408677, 'epoch': 23} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 23} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 23} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 23} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 
0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 24 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:03:42.404133: step: 4/77, loss: 7.372148502327036e-06 2023-01-24 01:03:43.867754: step: 8/77, loss: 0.002566393930464983 2023-01-24 01:03:45.322772: step: 12/77, loss: 0.00975698884576559 2023-01-24 01:03:46.719118: step: 16/77, loss: 0.00012423066073097289 2023-01-24 01:03:48.177484: step: 20/77, loss: 7.469201955245808e-05 2023-01-24 01:03:49.618749: step: 24/77, loss: 1.7883849068311974e-05 2023-01-24 01:03:51.100085: step: 28/77, loss: 1.3897247299610171e-05 2023-01-24 01:03:52.544321: step: 32/77, loss: 0.000126511775306426 2023-01-24 01:03:54.003946: step: 36/77, loss: 0.00047933805035427213 2023-01-24 01:03:55.447947: step: 40/77, loss: 0.011026003398001194 2023-01-24 01:03:56.875791: step: 44/77, loss: 0.0017197122797369957 2023-01-24 01:03:58.345631: step: 48/77, loss: 0.011792484670877457 2023-01-24 01:03:59.857693: step: 52/77, loss: 0.02285902015864849 2023-01-24 01:04:01.345548: step: 56/77, loss: 0.0004306872433517128 2023-01-24 01:04:02.860112: step: 60/77, loss: 7.246661698445678e-05 2023-01-24 01:04:04.249255: step: 64/77, loss: 0.0010392410913482308 2023-01-24 01:04:05.724657: step: 68/77, loss: 1.759197584760841e-05 2023-01-24 01:04:07.237861: step: 72/77, loss: 0.00021055589604657143 2023-01-24 01:04:08.755812: step: 76/77, loss: 0.0056786141358315945 2023-01-24 01:04:10.302979: step: 80/77, loss: 2.048878059213166e-06 2023-01-24 01:04:11.670911: step: 84/77, loss: 0.00010283981100656092 2023-01-24 01:04:13.213172: step: 88/77, loss: 6.372314237523824e-05 2023-01-24 01:04:14.671292: step: 92/77, loss: 7.664812437724322e-05 2023-01-24 01:04:16.154667: step: 96/77, loss: 0.001548820873722434 2023-01-24 01:04:17.664535: step: 100/77, loss: 0.0007104914984665811 2023-01-24 01:04:19.148132: step: 104/77, loss: 0.0004165653372183442 2023-01-24 01:04:20.630204: step: 108/77, loss: 0.014097335748374462 2023-01-24 01:04:22.111182: step: 112/77, loss: 0.022480204701423645 2023-01-24 01:04:23.606394: step: 116/77, loss: 0.00048142069135792553 2023-01-24 01:04:25.060043: step: 120/77, loss: 0.002312499564141035 2023-01-24 01:04:26.472222: step: 124/77, loss: 7.586013816762716e-05 2023-01-24 01:04:27.969589: step: 128/77, loss: 1.1175856684531027e-07 2023-01-24 01:04:29.456279: step: 132/77, loss: 0.00231208186596632 2023-01-24 01:04:30.898169: step: 136/77, loss: 0.000274372985586524 2023-01-24 01:04:32.380106: step: 140/77, loss: 6.377603085638839e-07 2023-01-24 01:04:33.817121: step: 144/77, loss: 0.009999795816838741 2023-01-24 01:04:35.326791: step: 148/77, loss: 2.1054979697510134e-06 2023-01-24 01:04:36.802945: step: 152/77, loss: 0.014564106240868568 2023-01-24 01:04:38.299427: step: 156/77, loss: 0.008510801941156387 2023-01-24 01:04:39.721007: step: 160/77, loss: 0.0003925769997294992 2023-01-24 01:04:41.192684: step: 164/77, loss: 0.15949967503547668 2023-01-24 01:04:42.606084: step: 168/77, loss: 0.002451444510370493 2023-01-24 01:04:44.051883: step: 172/77, loss: 2.3691179649176775e-06 2023-01-24 01:04:45.445705: step: 176/77, loss: 
1.0802951919686166e-06 2023-01-24 01:04:46.905584: step: 180/77, loss: 3.7903109841863625e-06 2023-01-24 01:04:48.331245: step: 184/77, loss: 0.03973957896232605 2023-01-24 01:04:49.739102: step: 188/77, loss: 4.1733612306416035e-05 2023-01-24 01:04:51.236733: step: 192/77, loss: 0.04006608948111534 2023-01-24 01:04:52.683365: step: 196/77, loss: 7.700354763073847e-05 2023-01-24 01:04:54.158255: step: 200/77, loss: 9.149542165687308e-05 2023-01-24 01:04:55.626894: step: 204/77, loss: 2.995125214511063e-07 2023-01-24 01:04:57.086129: step: 208/77, loss: 0.011563536711037159 2023-01-24 01:04:58.546672: step: 212/77, loss: 0.0006145972874946892 2023-01-24 01:05:00.092206: step: 216/77, loss: 0.001980900764465332 2023-01-24 01:05:01.594334: step: 220/77, loss: 4.839376560994424e-05 2023-01-24 01:05:02.963566: step: 224/77, loss: 8.927112503442913e-05 2023-01-24 01:05:04.415861: step: 228/77, loss: 7.5996690611646045e-06 2023-01-24 01:05:05.854005: step: 232/77, loss: 1.1799502317444421e-05 2023-01-24 01:05:07.351403: step: 236/77, loss: 0.00021003717847634107 2023-01-24 01:05:08.866871: step: 240/77, loss: 8.29909276944818e-06 2023-01-24 01:05:10.328719: step: 244/77, loss: 1.0683887694540317e-06 2023-01-24 01:05:11.742024: step: 248/77, loss: 0.0002714066649787128 2023-01-24 01:05:13.212260: step: 252/77, loss: 0.015390491113066673 2023-01-24 01:05:14.670828: step: 256/77, loss: 0.027384832501411438 2023-01-24 01:05:16.165505: step: 260/77, loss: 0.02881324291229248 2023-01-24 01:05:17.646927: step: 264/77, loss: 0.00021542828471865505 2023-01-24 01:05:19.079041: step: 268/77, loss: 3.8738376133551355e-06 2023-01-24 01:05:20.534716: step: 272/77, loss: 0.005717065185308456 2023-01-24 01:05:21.999350: step: 276/77, loss: 0.0003914251283276826 2023-01-24 01:05:23.513034: step: 280/77, loss: 0.0007405400392599404 2023-01-24 01:05:24.979982: step: 284/77, loss: 2.542083166190423e-06 2023-01-24 01:05:26.370228: step: 288/77, loss: 1.5962510587996803e-05 2023-01-24 01:05:27.873040: step: 292/77, loss: 5.124079962115502e-06 2023-01-24 01:05:29.409486: step: 296/77, loss: 0.00012356540537439287 2023-01-24 01:05:30.889498: step: 300/77, loss: 0.000246676238020882 2023-01-24 01:05:32.362251: step: 304/77, loss: 0.00015589930990245193 2023-01-24 01:05:33.829842: step: 308/77, loss: 0.0006130424444563687 2023-01-24 01:05:35.310734: step: 312/77, loss: 6.260881491471082e-05 2023-01-24 01:05:36.736942: step: 316/77, loss: 0.0031988155096769333 2023-01-24 01:05:38.243163: step: 320/77, loss: 0.004418224096298218 2023-01-24 01:05:39.660156: step: 324/77, loss: 0.0009965308709070086 2023-01-24 01:05:41.107618: step: 328/77, loss: 7.539941293543961e-07 2023-01-24 01:05:42.532052: step: 332/77, loss: 0.003678370965644717 2023-01-24 01:05:43.977756: step: 336/77, loss: 0.0003343412245158106 2023-01-24 01:05:45.373631: step: 340/77, loss: 0.00017664878396317363 2023-01-24 01:05:46.892624: step: 344/77, loss: 0.04077373445034027 2023-01-24 01:05:48.310829: step: 348/77, loss: 3.8742859942431096e-07 2023-01-24 01:05:49.798829: step: 352/77, loss: 0.001119026681408286 2023-01-24 01:05:51.244450: step: 356/77, loss: 0.04550314322113991 2023-01-24 01:05:52.745134: step: 360/77, loss: 1.2471847412598436e-06 2023-01-24 01:05:54.232283: step: 364/77, loss: 1.2867828445450868e-05 2023-01-24 01:05:55.736143: step: 368/77, loss: 0.008402790874242783 2023-01-24 01:05:57.226192: step: 372/77, loss: 0.0688982605934143 2023-01-24 01:05:58.697180: step: 376/77, loss: 0.011048972606658936 2023-01-24 01:06:00.160320: step: 380/77, 
loss: 0.00045871181646361947 2023-01-24 01:06:01.593432: step: 384/77, loss: 0.0016792448004707694 2023-01-24 01:06:03.063628: step: 388/77, loss: 0.00023599954147357494 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Chinese: {'template': {'p': 0.88, 'r': 0.5038167938931297, 'f1': 0.640776699029126}, 'slot': {'p': 0.47058823529411764, 'r': 0.013805004314063849, 'f1': 0.02682313495389774}, 'combined': 0.017187639873371362, 'epoch': 24} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Korean: {'template': {'p': 0.8783783783783784, 'r': 0.4961832061068702, 'f1': 0.6341463414634146}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.018057768517383666, 'epoch': 24} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Russian: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.017955128681172695, 'epoch': 24} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 24} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 24} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 
0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 25 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:07:38.835009: step: 4/77, loss: 4.059045750182122e-05 2023-01-24 01:07:40.327852: step: 8/77, loss: 0.030835065990686417 2023-01-24 01:07:41.748252: step: 12/77, loss: 5.7991685025626794e-05 2023-01-24 01:07:43.176153: step: 16/77, loss: 4.914428427582607e-05 2023-01-24 01:07:44.679737: step: 20/77, loss: 3.874687899951823e-05 2023-01-24 01:07:46.119442: step: 24/77, loss: 2.7834380489366595e-06 2023-01-24 01:07:47.554693: step: 28/77, loss: 6.2171811805455945e-06 2023-01-24 01:07:49.048065: step: 32/77, loss: 1.473645966143522e-06 2023-01-24 01:07:50.488117: step: 36/77, loss: 6.257622590055689e-05 2023-01-24 01:07:51.926297: step: 40/77, loss: 0.0005334470770321786 2023-01-24 01:07:53.391217: step: 44/77, loss: 0.04825974255800247 2023-01-24 01:07:54.861219: step: 48/77, loss: 6.16902980254963e-07 2023-01-24 01:07:56.264397: step: 52/77, loss: 2.3803200747352093e-05 2023-01-24 01:07:57.742788: step: 56/77, loss: 1.7761865365173435e-06 2023-01-24 01:07:59.264017: step: 60/77, loss: 0.002082192339003086 2023-01-24 01:08:00.698334: step: 64/77, loss: 2.617944664962124e-05 2023-01-24 01:08:02.188456: step: 68/77, loss: 0.021399078890681267 2023-01-24 01:08:03.644716: step: 72/77, loss: 0.036595191806554794 2023-01-24 01:08:05.136402: step: 76/77, loss: 4.119883669773117e-05 2023-01-24 01:08:06.691065: step: 80/77, loss: 0.0001858456089394167 2023-01-24 01:08:08.183050: step: 84/77, loss: 2.6728503144113347e-05 2023-01-24 01:08:09.571005: step: 88/77, loss: 3.4449337817932246e-06 2023-01-24 01:08:11.007952: step: 92/77, loss: 0.00030246900860220194 2023-01-24 01:08:12.523568: step: 96/77, loss: 0.0003325316938571632 2023-01-24 01:08:14.002659: step: 100/77, loss: 9.07470052879944e-07 2023-01-24 01:08:15.440810: step: 104/77, loss: 0.00027373951161280274 2023-01-24 01:08:16.896005: step: 108/77, loss: 3.2495640880370047e-06 2023-01-24 01:08:18.363279: step: 112/77, loss: 0.0010588520672172308 2023-01-24 01:08:19.781364: step: 116/77, loss: 0.02988281659781933 2023-01-24 01:08:21.232476: step: 120/77, loss: 3.7877066461078357e-06 2023-01-24 01:08:22.638005: step: 124/77, loss: 1.281479626413784e-06 2023-01-24 01:08:24.101533: step: 128/77, loss: 0.0009377918904647231 2023-01-24 01:08:25.613196: step: 132/77, loss: 0.00032386762904934585 2023-01-24 01:08:27.067997: step: 136/77, loss: 0.0004764663754031062 2023-01-24 01:08:28.554216: step: 140/77, loss: 0.0033441728446632624 2023-01-24 01:08:29.980348: step: 144/77, loss: 0.008817191235721111 2023-01-24 01:08:31.496557: step: 148/77, loss: 0.0001582540717208758 2023-01-24 01:08:32.964889: step: 152/77, loss: 1.4754637049918529e-05 2023-01-24 01:08:34.385830: step: 156/77, loss: 1.8953408016386675e-06 
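[Note on cadence: within each epoch the loss records advance in steps of four (4, 8, ..., 388), which matches --accumulate_step 4; together with --batch_size 10 that implies one optimizer update per 40 examples. A minimal, self-contained sketch of that accumulation-and-logging pattern follows; the Linear model, MSE loss, and dummy loader are stand-ins, not train.py's actual components.]

    import torch

    # Dummy stand-ins so the sketch runs; train.py's real model, data and
    # loss differ. Only the accumulation/logging cadence is the point here.
    model = torch.nn.Linear(8, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
    loader = [(torch.randn(10, 8), torch.randn(10, 1)) for _ in range(16)]

    accumulate_step = 4
    optimizer.zero_grad()
    for step, (x, y) in enumerate(loader, start=1):
        loss = torch.nn.functional.mse_loss(model(x), y) / accumulate_step
        loss.backward()                     # gradients accumulate across batches
        if step % accumulate_step == 0:     # one optimizer update every 4 batches
            optimizer.step()
            optimizer.zero_grad()
            print(f"step: {step}/{len(loader)}, loss: {loss.item()}")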
2023-01-24 01:08:35.837830: step: 160/77, loss: 0.007105558179318905 2023-01-24 01:08:37.282802: step: 164/77, loss: 0.00016064877854660153 2023-01-24 01:08:38.762763: step: 168/77, loss: 0.006513113155961037 2023-01-24 01:08:40.212101: step: 172/77, loss: 0.004730660002678633 2023-01-24 01:08:41.634838: step: 176/77, loss: 0.012969336472451687 2023-01-24 01:08:43.093942: step: 180/77, loss: 2.9611353966174647e-05 2023-01-24 01:08:44.532676: step: 184/77, loss: 3.134362486889586e-05 2023-01-24 01:08:45.945755: step: 188/77, loss: 4.576904757414013e-06 2023-01-24 01:08:47.391344: step: 192/77, loss: 7.770049705868587e-06 2023-01-24 01:08:48.855096: step: 196/77, loss: 0.002541177673265338 2023-01-24 01:08:50.309283: step: 200/77, loss: 0.0008257463341578841 2023-01-24 01:08:51.708211: step: 204/77, loss: 3.1156530440057395e-06 2023-01-24 01:08:53.164132: step: 208/77, loss: 2.0965369913028553e-05 2023-01-24 01:08:54.641229: step: 212/77, loss: 6.824683964623546e-07 2023-01-24 01:08:56.100841: step: 216/77, loss: 0.01078418642282486 2023-01-24 01:08:57.552504: step: 220/77, loss: 0.0002531272766645998 2023-01-24 01:08:58.900585: step: 224/77, loss: 0.0014299320755526423 2023-01-24 01:09:00.308512: step: 228/77, loss: 0.0007252601790241897 2023-01-24 01:09:01.706024: step: 232/77, loss: 4.472882665140787e-06 2023-01-24 01:09:03.240438: step: 236/77, loss: 0.004823240917176008 2023-01-24 01:09:04.750322: step: 240/77, loss: 0.01945425011217594 2023-01-24 01:09:06.224903: step: 244/77, loss: 0.0003206911205779761 2023-01-24 01:09:07.681438: step: 248/77, loss: 3.069389322263305e-06 2023-01-24 01:09:09.102147: step: 252/77, loss: 0.0044409167021512985 2023-01-24 01:09:10.536317: step: 256/77, loss: 0.0038761033210903406 2023-01-24 01:09:11.953280: step: 260/77, loss: 1.8266971892444417e-05 2023-01-24 01:09:13.423533: step: 264/77, loss: 0.0022358319256454706 2023-01-24 01:09:14.850196: step: 268/77, loss: 3.647561607067473e-05 2023-01-24 01:09:16.335137: step: 272/77, loss: 2.4451919671264477e-06 2023-01-24 01:09:17.783862: step: 276/77, loss: 6.576570740435272e-05 2023-01-24 01:09:19.265890: step: 280/77, loss: 0.0027494183741509914 2023-01-24 01:09:20.689322: step: 284/77, loss: 1.141419488703832e-06 2023-01-24 01:09:22.205118: step: 288/77, loss: 0.0010258633410558105 2023-01-24 01:09:23.617489: step: 292/77, loss: 1.6540182912194723e-07 2023-01-24 01:09:25.128826: step: 296/77, loss: 0.0002123012236552313 2023-01-24 01:09:26.502264: step: 300/77, loss: 3.2169398309633834e-06 2023-01-24 01:09:28.013886: step: 304/77, loss: 0.0015341609250754118 2023-01-24 01:09:29.530117: step: 308/77, loss: 2.290213160449639e-06 2023-01-24 01:09:30.952271: step: 312/77, loss: 2.740030140557792e-06 2023-01-24 01:09:32.384001: step: 316/77, loss: 5.185409827390686e-05 2023-01-24 01:09:33.901545: step: 320/77, loss: 4.31674015999306e-06 2023-01-24 01:09:35.373774: step: 324/77, loss: 2.0086281438125297e-06 2023-01-24 01:09:36.811640: step: 328/77, loss: 2.9443667699524667e-06 2023-01-24 01:09:38.250223: step: 332/77, loss: 1.6882592035472044e-06 2023-01-24 01:09:39.675662: step: 336/77, loss: 3.605149686336517e-05 2023-01-24 01:09:41.060566: step: 340/77, loss: 3.4285606034245575e-06 2023-01-24 01:09:42.432833: step: 344/77, loss: 0.04269392788410187 2023-01-24 01:09:43.851297: step: 348/77, loss: 6.202944405231392e-06 2023-01-24 01:09:45.357919: step: 352/77, loss: 5.522930223378353e-06 2023-01-24 01:09:46.793131: step: 356/77, loss: 3.839743385469774e-06 2023-01-24 01:09:48.232178: step: 360/77, loss: 
6.6556758611113764e-06 2023-01-24 01:09:49.639956: step: 364/77, loss: 1.2227968909428455e-05 2023-01-24 01:09:51.029998: step: 368/77, loss: 1.5571474705211585e-06 2023-01-24 01:09:52.499406: step: 372/77, loss: 1.3326905900612473e-05 2023-01-24 01:09:53.982718: step: 376/77, loss: 1.5645682651665993e-06 2023-01-24 01:09:55.384773: step: 380/77, loss: 0.00039393804036080837 2023-01-24 01:09:56.846116: step: 384/77, loss: 6.189832492964342e-05 2023-01-24 01:09:58.345195: step: 388/77, loss: 0.0002459617971908301 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Chinese: {'template': {'p': 0.9166666666666666, 'r': 0.5038167938931297, 'f1': 0.6502463054187191}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01854729394650709, 'epoch': 25} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Korean: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.53125, 'r': 0.014667817083692839, 'f1': 0.028547439126784216}, 'combined': 0.01817946372252925, 'epoch': 25} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Russian: {'template': {'p': 0.9285714285714286, 'r': 0.4961832061068702, 'f1': 0.6467661691542288}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018479033404406535, 'epoch': 25} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 25} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 
0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 26 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:11:33.197520: step: 4/77, loss: 0.0004410330147948116 2023-01-24 01:11:34.659291: step: 8/77, loss: 6.347841008391697e-07 2023-01-24 01:11:36.138350: step: 12/77, loss: 0.00019786340999417007 2023-01-24 01:11:37.586987: step: 16/77, loss: 1.8058998421111028e-06 2023-01-24 01:11:39.070783: step: 20/77, loss: 1.4379454569279915e-06 2023-01-24 01:11:40.542894: step: 24/77, loss: 0.001322779105976224 2023-01-24 01:11:41.943582: step: 28/77, loss: 8.72594173415564e-05 2023-01-24 01:11:43.413710: step: 32/77, loss: 9.983496966015082e-06 2023-01-24 01:11:44.844926: step: 36/77, loss: 7.26973712517065e-06 2023-01-24 01:11:46.303642: step: 40/77, loss: 4.976939749212761e-07 2023-01-24 01:11:47.754552: step: 44/77, loss: 0.007929427549242973 2023-01-24 01:11:49.285530: step: 48/77, loss: 8.099547812889796e-06 2023-01-24 01:11:50.737731: step: 52/77, loss: 0.0005367292324081063 2023-01-24 01:11:52.270294: step: 56/77, loss: 5.662437985165525e-08 2023-01-24 01:11:53.652204: step: 60/77, loss: 1.0163690603803843e-05 2023-01-24 01:11:55.051996: step: 64/77, loss: 3.014777939824853e-05 2023-01-24 01:11:56.508937: step: 68/77, loss: 0.004237155895680189 2023-01-24 01:11:57.909779: step: 72/77, loss: 1.9191624232917093e-05 2023-01-24 01:11:59.332199: step: 76/77, loss: 0.00026323023485019803 2023-01-24 01:12:00.751713: step: 80/77, loss: 2.7281901111564366e-06 2023-01-24 01:12:02.253948: step: 84/77, loss: 6.288236136242631e-07 2023-01-24 01:12:03.743451: step: 88/77, loss: 0.022111790254712105 2023-01-24 01:12:05.219342: step: 92/77, loss: 1.6018191672628745e-05 2023-01-24 01:12:06.679677: step: 96/77, loss: 2.7297407996229595e-06 2023-01-24 01:12:08.121942: step: 100/77, loss: 3.272101366746938e-06 2023-01-24 01:12:09.638092: step: 104/77, loss: 0.0006118750898167491 2023-01-24 01:12:11.129996: step: 108/77, loss: 0.09048576653003693 2023-01-24 01:12:12.570460: step: 112/77, loss: 7.00344287452026e-07 2023-01-24 01:12:13.975796: step: 116/77, loss: 2.5460256438236684e-05 2023-01-24 01:12:15.417627: step: 120/77, loss: 0.0005740249762311578 2023-01-24 01:12:16.921180: step: 124/77, loss: 1.1066706065321341e-05 2023-01-24 01:12:18.324354: step: 128/77, loss: 0.00012074044934706762 2023-01-24 01:12:19.737150: step: 132/77, loss: 0.0003262482932768762 2023-01-24 01:12:21.146994: step: 136/77, loss: 0.0011029550805687904 2023-01-24 01:12:22.620288: step: 140/77, loss: 3.281430326751433e-05 
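[Aside for post-hoc analysis: every "Dev/Test/Sample Language: {...}" result in this log is printed as a plain Python dict literal, so the scores can be recovered from the raw text with ast.literal_eval. A rough sketch; the regex simply mirrors the result lines seen in this excerpt and may need adjusting for other variants such as the "Dev for Chinese:" lines.]

    import ast
    import re

    # Matches e.g. "Test Chinese: {'template': {...}, ..., 'epoch': 24}"
    PATTERN = re.compile(r"(Dev|Test|Sample) (Chinese|Korean|Russian): (\{.*?'epoch': \d+\})")

    def parse_results(log_text):
        for split, lang, literal in PATTERN.findall(log_text):
            scores = ast.literal_eval(literal)  # the dicts are valid Python literals
            yield split, lang, scores["epoch"], scores["combined"]

    sample = ("Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, "
              "'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24}")
    print(list(parse_results(sample)))  # [('Sample', 'Korean', 24, 0.0)]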
2023-01-24 01:12:24.089229: step: 144/77, loss: 0.015173608437180519 2023-01-24 01:12:25.553425: step: 148/77, loss: 0.011440704576671124 2023-01-24 01:12:27.062157: step: 152/77, loss: 0.00012958000297658145 2023-01-24 01:12:28.527931: step: 156/77, loss: 0.016136083751916885 2023-01-24 01:12:30.029570: step: 160/77, loss: 9.393112850375473e-05 2023-01-24 01:12:31.454610: step: 164/77, loss: 0.002286059781908989 2023-01-24 01:12:32.916746: step: 168/77, loss: 0.009524857625365257 2023-01-24 01:12:34.382496: step: 172/77, loss: 0.027265436947345734 2023-01-24 01:12:35.809095: step: 176/77, loss: 5.697094456991181e-05 2023-01-24 01:12:37.307864: step: 180/77, loss: 0.0001194120486616157 2023-01-24 01:12:38.748943: step: 184/77, loss: 0.0003231299633625895 2023-01-24 01:12:40.204431: step: 188/77, loss: 4.0711223846301436e-05 2023-01-24 01:12:41.662138: step: 192/77, loss: 0.0005879050586372614 2023-01-24 01:12:43.110157: step: 196/77, loss: 0.056800760328769684 2023-01-24 01:12:44.557797: step: 200/77, loss: 4.6171685426088516e-06 2023-01-24 01:12:46.089424: step: 204/77, loss: 1.0742302038124762e-05 2023-01-24 01:12:47.575066: step: 208/77, loss: 0.06644407659769058 2023-01-24 01:12:49.014978: step: 212/77, loss: 0.0002735485613811761 2023-01-24 01:12:50.454431: step: 216/77, loss: 3.1349229629995534e-06 2023-01-24 01:12:51.955555: step: 220/77, loss: 8.984311534732115e-06 2023-01-24 01:12:53.293992: step: 224/77, loss: 0.0023781578056514263 2023-01-24 01:12:54.807837: step: 228/77, loss: 0.02856743521988392 2023-01-24 01:12:56.261474: step: 232/77, loss: 0.03514505922794342 2023-01-24 01:12:57.699518: step: 236/77, loss: 9.761759429238737e-06 2023-01-24 01:12:59.145887: step: 240/77, loss: 7.212029231595807e-07 2023-01-24 01:13:00.632813: step: 244/77, loss: 4.139644079259597e-05 2023-01-24 01:13:02.162876: step: 248/77, loss: 2.2276112758845557e-06 2023-01-24 01:13:03.653974: step: 252/77, loss: 1.6014524589991197e-05 2023-01-24 01:13:05.080165: step: 256/77, loss: 1.4529955478792544e-05 2023-01-24 01:13:06.555146: step: 260/77, loss: 0.001574151567183435 2023-01-24 01:13:08.033666: step: 264/77, loss: 0.003839747281745076 2023-01-24 01:13:09.476002: step: 268/77, loss: 0.0009322985424660146 2023-01-24 01:13:10.881408: step: 272/77, loss: 0.0005584964528679848 2023-01-24 01:13:12.348726: step: 276/77, loss: 0.00011649419320747256 2023-01-24 01:13:13.791927: step: 280/77, loss: 0.0008474696660414338 2023-01-24 01:13:15.273476: step: 284/77, loss: 0.1427866369485855 2023-01-24 01:13:16.742183: step: 288/77, loss: 9.221310028806329e-05 2023-01-24 01:13:18.203669: step: 292/77, loss: 5.444014459499158e-05 2023-01-24 01:13:19.669245: step: 296/77, loss: 3.405211100471206e-05 2023-01-24 01:13:21.159531: step: 300/77, loss: 0.007121166680008173 2023-01-24 01:13:22.589976: step: 304/77, loss: 0.028989192098379135 2023-01-24 01:13:24.096458: step: 308/77, loss: 0.005328961182385683 2023-01-24 01:13:25.519635: step: 312/77, loss: 0.04974348098039627 2023-01-24 01:13:26.987300: step: 316/77, loss: 0.006064072251319885 2023-01-24 01:13:28.459377: step: 320/77, loss: 4.2640358515200205e-06 2023-01-24 01:13:29.905452: step: 324/77, loss: 2.233498526038602e-05 2023-01-24 01:13:31.421173: step: 328/77, loss: 2.7715998385247076e-07 2023-01-24 01:13:32.888267: step: 332/77, loss: 2.317662801942788e-05 2023-01-24 01:13:34.304807: step: 336/77, loss: 0.06426730006933212 2023-01-24 01:13:35.801204: step: 340/77, loss: 0.00011196612467756495 2023-01-24 01:13:37.283232: step: 344/77, loss: 
8.359479579667095e-07 2023-01-24 01:13:38.719719: step: 348/77, loss: 4.4287400669418275e-05 2023-01-24 01:13:40.148415: step: 352/77, loss: 0.05372339114546776 2023-01-24 01:13:41.660878: step: 356/77, loss: 6.985371783230221e-06 2023-01-24 01:13:43.162605: step: 360/77, loss: 0.00024018401745706797 2023-01-24 01:13:44.626945: step: 364/77, loss: 7.439233013428748e-05 2023-01-24 01:13:46.111469: step: 368/77, loss: 0.0001586980652064085 2023-01-24 01:13:47.597400: step: 372/77, loss: 0.00013080626376904547 2023-01-24 01:13:49.049593: step: 376/77, loss: 7.703786764068354e-07 2023-01-24 01:13:50.462958: step: 380/77, loss: 1.3012524505029432e-05 2023-01-24 01:13:51.858298: step: 384/77, loss: 0.008434685878455639 2023-01-24 01:13:53.317517: step: 388/77, loss: 2.2103627998149022e-05 ================================================== Loss: 0.008 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Chinese: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.425, 'r': 0.014667817083692839, 'f1': 0.02835696413678065}, 'combined': 0.018626633305532388, 'epoch': 26} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Korean: {'template': {'p': 0.9166666666666666, 'r': 0.5038167938931297, 'f1': 0.6502463054187191}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.018469819869871718, 'epoch': 26} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018592319482994325, 'epoch': 26} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 26} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 26} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 26} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for 
Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 27 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:15:30.394686: step: 4/77, loss: 1.4326322343549691e-05 2023-01-24 01:15:31.910037: step: 8/77, loss: 0.024663139134645462 2023-01-24 01:15:33.416226: step: 12/77, loss: 0.0004897532053291798 2023-01-24 01:15:34.894549: step: 16/77, loss: 0.0061692120507359505 2023-01-24 01:15:36.332649: step: 20/77, loss: 0.00023279106244444847 2023-01-24 01:15:37.750218: step: 24/77, loss: 6.758703966625035e-05 2023-01-24 01:15:39.207649: step: 28/77, loss: 6.1161990743130445e-06 2023-01-24 01:15:40.631083: step: 32/77, loss: 2.1376172298914753e-05 2023-01-24 01:15:42.071398: step: 36/77, loss: 9.568851965013891e-05 2023-01-24 01:15:43.507891: step: 40/77, loss: 3.833698883681791e-06 2023-01-24 01:15:45.079146: step: 44/77, loss: 0.00012776638322975487 2023-01-24 01:15:46.563216: step: 48/77, loss: 0.0004679126723203808 2023-01-24 01:15:48.008951: step: 52/77, loss: 1.3973161003377754e-05 2023-01-24 01:15:49.490386: step: 56/77, loss: 1.3835771824233234e-05 2023-01-24 01:15:50.984356: step: 60/77, loss: 1.6599069567746483e-06 2023-01-24 01:15:52.472971: step: 64/77, loss: 7.525195542257279e-05 2023-01-24 01:15:53.946245: step: 68/77, loss: 5.3652707720175385e-05 2023-01-24 01:15:55.370747: step: 72/77, loss: 0.000490330159664154 2023-01-24 01:15:56.841798: step: 76/77, loss: 6.186037353472784e-05 2023-01-24 01:15:58.269622: step: 80/77, loss: 9.387572390551213e-07 2023-01-24 01:15:59.684210: step: 84/77, loss: 0.00446065329015255 2023-01-24 01:16:01.145850: step: 88/77, loss: 3.517463119351305e-05 2023-01-24 01:16:02.639116: step: 92/77, loss: 0.0010564837139099836 2023-01-24 01:16:04.075785: step: 96/77, loss: 0.0038536631036549807 2023-01-24 01:16:05.505847: step: 100/77, loss: 0.000469402177259326 2023-01-24 01:16:07.009008: step: 104/77, loss: 8.486769365845248e-05 2023-01-24 01:16:08.472709: step: 108/77, loss: 3.024913439730881e-07 2023-01-24 01:16:09.979211: step: 112/77, loss: 2.229935307695996e-05 2023-01-24 01:16:11.468135: step: 116/77, loss: 8.681615872774273e-05 2023-01-24 01:16:12.897566: step: 120/77, 
loss: 0.00025649185408838093 2023-01-24 01:16:14.329425: step: 124/77, loss: 1.7866010466605076e-06 2023-01-24 01:16:15.822524: step: 128/77, loss: 1.3009529538976494e-05 2023-01-24 01:16:17.301463: step: 132/77, loss: 2.220266424046713e-07 2023-01-24 01:16:18.765612: step: 136/77, loss: 5.0626226766326e-06 2023-01-24 01:16:20.240929: step: 140/77, loss: 3.058028596569784e-05 2023-01-24 01:16:21.732200: step: 144/77, loss: 2.7159321689396165e-05 2023-01-24 01:16:23.260379: step: 148/77, loss: 1.4751408343727235e-06 2023-01-24 01:16:24.725369: step: 152/77, loss: 3.5762763417324095e-08 2023-01-24 01:16:26.217506: step: 156/77, loss: 3.208236739737913e-05 2023-01-24 01:16:27.715931: step: 160/77, loss: 0.0027070590294897556 2023-01-24 01:16:29.192629: step: 164/77, loss: 0.0012007926125079393 2023-01-24 01:16:30.619216: step: 168/77, loss: 7.897610032614466e-08 2023-01-24 01:16:32.139426: step: 172/77, loss: 0.0011402148520573974 2023-01-24 01:16:33.681190: step: 176/77, loss: 1.2129133892813115e-06 2023-01-24 01:16:35.195161: step: 180/77, loss: 6.869350386295991e-07 2023-01-24 01:16:36.639526: step: 184/77, loss: 0.0020278082229197025 2023-01-24 01:16:38.132340: step: 188/77, loss: 2.4167438823496923e-06 2023-01-24 01:16:39.668048: step: 192/77, loss: 0.0004763460601679981 2023-01-24 01:16:41.144974: step: 196/77, loss: 0.00020171045616734773 2023-01-24 01:16:42.636808: step: 200/77, loss: 0.006428330205380917 2023-01-24 01:16:44.077281: step: 204/77, loss: 0.054347168654203415 2023-01-24 01:16:45.545432: step: 208/77, loss: 9.391622006660327e-05 2023-01-24 01:16:47.067036: step: 212/77, loss: 5.589288048213348e-05 2023-01-24 01:16:48.497738: step: 216/77, loss: 1.017721160678775e-06 2023-01-24 01:16:49.927886: step: 220/77, loss: 5.111033942739596e-07 2023-01-24 01:16:51.347563: step: 224/77, loss: 1.4677115132144536e-06 2023-01-24 01:16:52.828307: step: 228/77, loss: 0.040604718029499054 2023-01-24 01:16:54.268130: step: 232/77, loss: 0.00016247703752014786 2023-01-24 01:16:55.745901: step: 236/77, loss: 6.460230360971764e-05 2023-01-24 01:16:57.235835: step: 240/77, loss: 0.00135130959097296 2023-01-24 01:16:58.655746: step: 244/77, loss: 0.00023467946448363364 2023-01-24 01:17:00.188383: step: 248/77, loss: 7.356399873970076e-05 2023-01-24 01:17:01.707752: step: 252/77, loss: 0.006442447658628225 2023-01-24 01:17:03.238449: step: 256/77, loss: 0.01619216613471508 2023-01-24 01:17:04.704263: step: 260/77, loss: 1.4328755241876934e-05 2023-01-24 01:17:06.166868: step: 264/77, loss: 2.5376413759659044e-05 2023-01-24 01:17:07.588285: step: 268/77, loss: 0.12852409482002258 2023-01-24 01:17:09.132576: step: 272/77, loss: 1.3841839063388761e-05 2023-01-24 01:17:10.630827: step: 276/77, loss: 0.00680402759462595 2023-01-24 01:17:12.080965: step: 280/77, loss: 0.00011101227573817596 2023-01-24 01:17:13.582463: step: 284/77, loss: 0.00027165425126440823 2023-01-24 01:17:15.085566: step: 288/77, loss: 0.00021375974756665528 2023-01-24 01:17:16.661299: step: 292/77, loss: 0.027166698127985 2023-01-24 01:17:18.129246: step: 296/77, loss: 0.000678533164318651 2023-01-24 01:17:19.672853: step: 300/77, loss: 8.930165677156765e-06 2023-01-24 01:17:21.225330: step: 304/77, loss: 0.0004659105616156012 2023-01-24 01:17:22.664655: step: 308/77, loss: 2.5572109734639525e-05 2023-01-24 01:17:24.135134: step: 312/77, loss: 0.0006945566856302321 2023-01-24 01:17:25.621529: step: 316/77, loss: 0.0002417838986730203 2023-01-24 01:17:27.091350: step: 320/77, loss: 5.098254405311309e-05 2023-01-24 
01:17:28.588344: step: 324/77, loss: 5.953685104032047e-05 2023-01-24 01:17:30.058302: step: 328/77, loss: 0.013821378350257874 2023-01-24 01:17:31.474197: step: 332/77, loss: 0.0018615383887663484 2023-01-24 01:17:33.018295: step: 336/77, loss: 1.4126595488050953e-05 2023-01-24 01:17:34.464635: step: 340/77, loss: 2.8266451863601105e-06 2023-01-24 01:17:35.998348: step: 344/77, loss: 1.291919033974409e-06 2023-01-24 01:17:37.453562: step: 348/77, loss: 3.385403397260234e-05 2023-01-24 01:17:38.931036: step: 352/77, loss: 4.066002929903334e-06 2023-01-24 01:17:40.384328: step: 356/77, loss: 1.440505002392456e-05 2023-01-24 01:17:41.854290: step: 360/77, loss: 8.49277203087695e-05 2023-01-24 01:17:43.312007: step: 364/77, loss: 4.555952273221919e-06 2023-01-24 01:17:44.811376: step: 368/77, loss: 2.5610373995732516e-05 2023-01-24 01:17:46.232653: step: 372/77, loss: 0.028466036543250084 2023-01-24 01:17:47.685367: step: 376/77, loss: 2.1152529370738193e-05 2023-01-24 01:17:49.171196: step: 380/77, loss: 9.026997759065125e-06 2023-01-24 01:17:50.609657: step: 384/77, loss: 1.4021417200638098e-06 2023-01-24 01:17:52.026254: step: 388/77, loss: 0.00033527766936458647 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Chinese: {'template': {'p': 0.9154929577464789, 'r': 0.4961832061068702, 'f1': 0.6435643564356436}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018341314433203592, 'epoch': 27} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Korean: {'template': {'p': 0.9154929577464789, 'r': 0.4961832061068702, 'f1': 0.6435643564356436}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01835670144195628, 'epoch': 27} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Russian: {'template': {'p': 0.9285714285714286, 'r': 0.4961832061068702, 'f1': 0.6467661691542288}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018479033404406535, 'epoch': 27} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 27} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 27} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 
0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 28 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:19:29.665890: step: 4/77, loss: 6.341078551486135e-05 2023-01-24 01:19:31.210673: step: 8/77, loss: 1.2370970580377616e-05 2023-01-24 01:19:32.652691: step: 12/77, loss: 0.0004744845209643245 2023-01-24 01:19:34.155986: step: 16/77, loss: 6.40781627225806e-06 2023-01-24 01:19:35.681221: step: 20/77, loss: 6.124305400589947e-07 2023-01-24 01:19:37.117000: step: 24/77, loss: 7.749309588689357e-06 2023-01-24 01:19:38.512298: step: 28/77, loss: 1.6838266958529857e-07 2023-01-24 01:19:39.992946: step: 32/77, loss: 0.0009012370137497783 2023-01-24 01:19:41.524885: step: 36/77, loss: 0.00016497840988449752 2023-01-24 01:19:42.951720: step: 40/77, loss: 0.031463008373975754 2023-01-24 01:19:44.418115: step: 44/77, loss: 0.00010549923172220588 2023-01-24 01:19:45.834304: step: 48/77, loss: 0.00022325686586555094 2023-01-24 01:19:47.342355: step: 52/77, loss: 2.074153599096462e-06 2023-01-24 01:19:48.850102: step: 56/77, loss: 9.855855751084164e-05 2023-01-24 01:19:50.268215: step: 60/77, loss: 1.778966179699637e-05 2023-01-24 01:19:51.756338: step: 64/77, loss: 4.362392246548552e-06 2023-01-24 01:19:53.247833: step: 68/77, loss: 1.3663729987456463e-06 2023-01-24 01:19:54.729010: step: 72/77, loss: 0.004756465088576078 2023-01-24 01:19:56.177996: step: 76/77, loss: 4.683442966779694e-05 2023-01-24 01:19:57.661242: step: 80/77, loss: 0.00019577420607674867 2023-01-24 01:19:59.134431: step: 84/77, loss: 0.00035057426430284977 2023-01-24 01:20:00.591138: step: 88/77, loss: 0.0017822480294853449 2023-01-24 01:20:02.095729: step: 92/77, loss: 0.000780368922278285 2023-01-24 01:20:03.615623: step: 96/77, loss: 6.379641126841307e-05 2023-01-24 01:20:05.057789: step: 100/77, loss: 
2.9951229407743085e-07 2023-01-24 01:20:06.513308: step: 104/77, loss: 7.06719310983317e-06 2023-01-24 01:20:08.054763: step: 108/77, loss: 1.3345738807402086e-05 2023-01-24 01:20:09.496503: step: 112/77, loss: 0.04274173080921173 2023-01-24 01:20:10.933483: step: 116/77, loss: 9.51277106651105e-05 2023-01-24 01:20:12.386883: step: 120/77, loss: 2.816310029629676e-07 2023-01-24 01:20:13.805494: step: 124/77, loss: 0.028377747163176537 2023-01-24 01:20:15.286986: step: 128/77, loss: 7.620793894602684e-06 2023-01-24 01:20:16.768028: step: 132/77, loss: 7.897602927187108e-08 2023-01-24 01:20:18.247354: step: 136/77, loss: 8.433857874479145e-05 2023-01-24 01:20:19.767620: step: 140/77, loss: 0.042357299476861954 2023-01-24 01:20:21.192248: step: 144/77, loss: 2.6225916371913627e-07 2023-01-24 01:20:22.617063: step: 148/77, loss: 1.937150173603186e-08 2023-01-24 01:20:24.088054: step: 152/77, loss: 0.00019146957492921501 2023-01-24 01:20:25.563955: step: 156/77, loss: 4.1731591409188695e-06 2023-01-24 01:20:27.066118: step: 160/77, loss: 1.1456423635536339e-05 2023-01-24 01:20:28.530422: step: 164/77, loss: 4.647242349165026e-06 2023-01-24 01:20:30.021116: step: 168/77, loss: 1.8775331511733384e-07 2023-01-24 01:20:31.458764: step: 172/77, loss: 0.000803797913249582 2023-01-24 01:20:32.938067: step: 176/77, loss: 0.00011625502520473674 2023-01-24 01:20:34.403773: step: 180/77, loss: 0.02850656770169735 2023-01-24 01:20:35.868175: step: 184/77, loss: 0.023709582164883614 2023-01-24 01:20:37.384178: step: 188/77, loss: 0.001574437483213842 2023-01-24 01:20:38.876052: step: 192/77, loss: 4.412966518430039e-06 2023-01-24 01:20:40.353528: step: 196/77, loss: 0.028234709054231644 2023-01-24 01:20:41.819394: step: 200/77, loss: 5.9604619906394873e-08 2023-01-24 01:20:43.276386: step: 204/77, loss: 0.007427121512591839 2023-01-24 01:20:44.657665: step: 208/77, loss: 1.28149778788611e-07 2023-01-24 01:20:46.130218: step: 212/77, loss: 1.8853193978429772e-05 2023-01-24 01:20:47.540220: step: 216/77, loss: 0.00016453364514745772 2023-01-24 01:20:48.969191: step: 220/77, loss: 0.04771774262189865 2023-01-24 01:20:50.483764: step: 224/77, loss: 0.00020399382628966123 2023-01-24 01:20:51.874617: step: 228/77, loss: 4.151056145929033e-06 2023-01-24 01:20:53.352042: step: 232/77, loss: 0.0006100233877077699 2023-01-24 01:20:54.871404: step: 236/77, loss: 0.0030997106805443764 2023-01-24 01:20:56.297361: step: 240/77, loss: 0.0016627665609121323 2023-01-24 01:20:57.850400: step: 244/77, loss: 1.3559860008172109e-06 2023-01-24 01:20:59.408928: step: 248/77, loss: 0.00033372087636962533 2023-01-24 01:21:00.903871: step: 252/77, loss: 2.2440113752963953e-05 2023-01-24 01:21:02.432514: step: 256/77, loss: 1.2248431175976293e-06 2023-01-24 01:21:03.884288: step: 260/77, loss: 3.861261939164251e-05 2023-01-24 01:21:05.419191: step: 264/77, loss: 7.700593414483592e-05 2023-01-24 01:21:06.923471: step: 268/77, loss: 6.907177157700062e-05 2023-01-24 01:21:08.410257: step: 272/77, loss: 0.0003736157377716154 2023-01-24 01:21:09.861110: step: 276/77, loss: 0.03906060755252838 2023-01-24 01:21:11.365322: step: 280/77, loss: 1.0371045391366351e-06 2023-01-24 01:21:12.878411: step: 284/77, loss: 2.6862755476031452e-05 2023-01-24 01:21:14.353076: step: 288/77, loss: 0.00017517435480840504 2023-01-24 01:21:15.791084: step: 292/77, loss: 3.055980641875067e-06 2023-01-24 01:21:17.356573: step: 296/77, loss: 0.0014680877793580294 2023-01-24 01:21:18.836418: step: 300/77, loss: 3.172190918121487e-05 2023-01-24 01:21:20.285062: 
step: 304/77, loss: 8.607449854025617e-05 2023-01-24 01:21:21.727973: step: 308/77, loss: 0.01271668728441 2023-01-24 01:21:23.244038: step: 312/77, loss: 4.753455016270891e-07 2023-01-24 01:21:24.648138: step: 316/77, loss: 1.7221045709447935e-05 2023-01-24 01:21:26.103853: step: 320/77, loss: 0.0009752624318934977 2023-01-24 01:21:27.557392: step: 324/77, loss: 1.535144474473782e-05 2023-01-24 01:21:29.049850: step: 328/77, loss: 6.94218761054799e-05 2023-01-24 01:21:30.521396: step: 332/77, loss: 0.00026013870956376195 2023-01-24 01:21:32.058323: step: 336/77, loss: 0.00010290901263942942 2023-01-24 01:21:33.500594: step: 340/77, loss: 1.2418715414241888e-05 2023-01-24 01:21:35.014378: step: 344/77, loss: 3.4568495266285026e-06 2023-01-24 01:21:36.424028: step: 348/77, loss: 0.0007589494343847036 2023-01-24 01:21:37.844667: step: 352/77, loss: 4.990630259271711e-05 2023-01-24 01:21:39.400909: step: 356/77, loss: 1.3291570439832867e-06 2023-01-24 01:21:40.836768: step: 360/77, loss: 4.246799960583303e-07 2023-01-24 01:21:42.320022: step: 364/77, loss: 7.510137152166863e-07 2023-01-24 01:21:43.747601: step: 368/77, loss: 0.010786645114421844 2023-01-24 01:21:45.263941: step: 372/77, loss: 1.5899097434157738e-06 2023-01-24 01:21:46.746648: step: 376/77, loss: 7.986807872839563e-07 2023-01-24 01:21:48.285153: step: 380/77, loss: 8.875988896761555e-06 2023-01-24 01:21:49.793722: step: 384/77, loss: 1.1134283340652473e-05 2023-01-24 01:21:51.279266: step: 388/77, loss: 2.2824442567070946e-05 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Chinese: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.40540540540540543, 'r': 0.012942191544434857, 'f1': 0.02508361204013378}, 'combined': 0.01648005428723765, 'epoch': 28} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Korean: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.4166666666666667, 'r': 0.012942191544434857, 'f1': 0.02510460251046025}, 'combined': 0.01649384512764538, 'epoch': 28} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Russian: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.4411764705882353, 'r': 0.012942191544434857, 'f1': 0.025146689019279123}, 'combined': 0.016517923179330405, 'epoch': 28} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 28} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 28} 
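[How the scores above fit together: each logged f1 is the ordinary harmonic mean 2pr/(p+r) of the listed precision and recall, and the 'combined' value reproduces as template f1 × slot f1 (for the epoch-28 Test Chinese line, 0.6570048 × 0.0250836 ≈ 0.0164801). The sketch below recomputes that line; compute_f1 and combined_score are names inferred from the printed numbers, not taken from train.py.]

    # Recomputes the epoch-28 "Test Chinese" scores printed above. Helper
    # names are inferred from the logged values, not train.py's actual code.

    def compute_f1(p, r):
        """Standard F1: harmonic mean of precision and recall."""
        return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

    def combined_score(template_f1, slot_f1):
        """The logged 'combined' value equals template_f1 * slot_f1."""
        return template_f1 * slot_f1

    template_f1 = compute_f1(0.8947368421052632, 0.5190839694656488)  # ~0.657005
    slot_f1 = compute_f1(0.40540540540540543, 0.012942191544434857)   # ~0.025084
    print(combined_score(template_f1, slot_f1))                       # ~0.016480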
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 29 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-24 01:23:29.316351: step: 4/77, loss: 3.6937315144314198e-06 2023-01-24 01:23:30.885995: step: 8/77, loss: 4.1574102738195506e-07 2023-01-24 01:23:32.377171: step: 12/77, loss: 0.0003356757224537432 2023-01-24 01:23:33.865559: step: 16/77, loss: 0.06527945399284363 2023-01-24 01:23:35.410140: step: 20/77, loss: 0.0002221096510766074 2023-01-24 01:23:36.862287: step: 24/77, loss: 0.00017310661496594548 2023-01-24 01:23:38.316873: step: 28/77, loss: 4.425617419201444e-07 2023-01-24 01:23:39.778526: step: 32/77, loss: 8.567985787522048e-06 2023-01-24 01:23:41.267334: step: 36/77, loss: 1.175110628537368e-05 2023-01-24 01:23:42.788405: step: 40/77, loss: 6.787302118027583e-05 2023-01-24 01:23:44.247123: step: 44/77, loss: 0.0003931297978851944 2023-01-24 01:23:45.733477: step: 48/77, loss: 0.0005336487083695829 2023-01-24 01:23:47.168445: step: 52/77, loss: 0.004889761097729206 2023-01-24 01:23:48.652287: step: 56/77, loss: 1.2963991480319237e-07 2023-01-24 01:23:50.077421: step: 60/77, loss: 0.006010737735778093 2023-01-24 01:23:51.557749: step: 64/77, loss: 0.002367202192544937 2023-01-24 01:23:52.977142: step: 68/77, loss: 0.005203723441809416 2023-01-24 01:23:54.400680: step: 72/77, loss: 0.0011668759398162365 2023-01-24 01:23:55.839346: step: 76/77, loss: 
4.994434675609227e-06 2023-01-24 01:23:57.285370: step: 80/77, loss: 0.0210029948502779 2023-01-24 01:23:58.757502: step: 84/77, loss: 4.144196645938791e-05 2023-01-24 01:24:00.287159: step: 88/77, loss: 0.003072816878557205 2023-01-24 01:24:01.775180: step: 92/77, loss: 8.771360444370657e-05 2023-01-24 01:24:03.227869: step: 96/77, loss: 2.2703750801156275e-05 2023-01-24 01:24:04.650978: step: 100/77, loss: 0.0005149574135430157 2023-01-24 01:24:06.034165: step: 104/77, loss: 6.4182363530562725e-06 2023-01-24 01:24:07.476943: step: 108/77, loss: 2.364353531447705e-05 2023-01-24 01:24:08.917603: step: 112/77, loss: 1.475209217005613e-07 2023-01-24 01:24:10.411730: step: 116/77, loss: 1.2321707799856085e-05 2023-01-24 01:24:11.967853: step: 120/77, loss: 0.0019780858419835567 2023-01-24 01:24:13.466640: step: 124/77, loss: 2.781839384624618e-06 2023-01-24 01:24:14.980245: step: 128/77, loss: 0.002130064181983471 2023-01-24 01:24:16.463109: step: 132/77, loss: 5.453763947116386e-07 2023-01-24 01:24:17.947009: step: 136/77, loss: 0.00013564022083301097 2023-01-24 01:24:19.444442: step: 140/77, loss: 7.915633432276081e-06 2023-01-24 01:24:20.873386: step: 144/77, loss: 8.597708074375987e-07 2023-01-24 01:24:22.393053: step: 148/77, loss: 0.0005095271626487374 2023-01-24 01:24:23.830360: step: 152/77, loss: 5.636212790705031e-06 2023-01-24 01:24:25.330641: step: 156/77, loss: 6.559267785632983e-05 2023-01-24 01:24:26.778285: step: 160/77, loss: 2.820798908942379e-05 2023-01-24 01:24:28.285796: step: 164/77, loss: 1.3960838259663433e-05 2023-01-24 01:24:29.748067: step: 168/77, loss: 1.0728827959383125e-07 2023-01-24 01:24:31.179685: step: 172/77, loss: 1.5794829550941358e-06 2023-01-24 01:24:32.715652: step: 176/77, loss: 0.14907880127429962 2023-01-24 01:24:34.225220: step: 180/77, loss: 2.362069790251553e-05 2023-01-24 01:24:35.681997: step: 184/77, loss: 3.7997801882738713e-07 2023-01-24 01:24:37.135720: step: 188/77, loss: 0.0001900776260299608 2023-01-24 01:24:38.661784: step: 192/77, loss: 5.02228613186162e-06 2023-01-24 01:24:40.156198: step: 196/77, loss: 4.827723842026899e-06 2023-01-24 01:24:41.690980: step: 200/77, loss: 4.217001219330996e-07 2023-01-24 01:24:43.149736: step: 204/77, loss: 0.0013312987284734845 2023-01-24 01:24:44.641662: step: 208/77, loss: 8.674228411109652e-06 2023-01-24 01:24:46.117506: step: 212/77, loss: 2.571580080257263e-05 2023-01-24 01:24:47.578761: step: 216/77, loss: 8.191204688046128e-05 2023-01-24 01:24:49.083641: step: 220/77, loss: 2.35874153986515e-06 2023-01-24 01:24:50.526950: step: 224/77, loss: 0.0007027069223113358 2023-01-24 01:24:51.945202: step: 228/77, loss: 5.2166076784487814e-05 2023-01-24 01:24:53.412270: step: 232/77, loss: 0.00013188117009121925 2023-01-24 01:24:54.856306: step: 236/77, loss: 1.0763304089778103e-05 2023-01-24 01:24:56.374382: step: 240/77, loss: 6.738617230439559e-06 2023-01-24 01:24:57.815040: step: 244/77, loss: 6.242575182113796e-05 2023-01-24 01:24:59.336773: step: 248/77, loss: 7.897473324192106e-07 2023-01-24 01:25:00.780608: step: 252/77, loss: 0.0017958583775907755 2023-01-24 01:25:02.264500: step: 256/77, loss: 6.188504357851343e-06 2023-01-24 01:25:03.759118: step: 260/77, loss: 0.0002860073291230947 2023-01-24 01:25:05.223994: step: 264/77, loss: 1.5450248611159623e-05 2023-01-24 01:25:06.687244: step: 268/77, loss: 2.9504178655770374e-07 2023-01-24 01:25:08.153165: step: 272/77, loss: 4.7683684556432127e-08 2023-01-24 01:25:09.571646: step: 276/77, loss: 2.3002725356491283e-05 2023-01-24 01:25:11.044793: 
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Chinese: {'template': {'p': 0.8933333333333333, 'r': 0.5114503816793893, 'f1': 0.6504854368932039}, 'slot': {'p': 0.4, 'r': 0.012079378774805867, 'f1': 0.023450586264656615}, 'combined': 0.015254264851766926, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Korean: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.42424242424242425, 'r': 0.012079378774805867, 'f1': 0.02348993288590604}, 'combined': 0.0153543951546898, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Russian: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.3888888888888889, 'r': 0.012079378774805867, 'f1': 0.02343096234309623}, 'combined': 0.01539425545246902, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 29}
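
Note: throughout this log, each result dict's 'combined' value is the product of the template F1 and the slot F1; for example 0.7368421052631579 * 0.07029876977152899 = 0.05179909351586346, the dev figure reported for all three languages. A sketch of that scoring arithmetic; the raw counts passed to prf1 are illustrative values chosen to reproduce the dev numbers, since the log does not print the underlying counts:

    def prf1(num_correct, num_predicted, num_gold):
        """Precision, recall, and F1 from raw match counts."""
        p = num_correct / num_predicted if num_predicted else 0.0
        r = num_correct / num_gold if num_gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return {'p': p, 'r': r, 'f1': f1}

    template = prf1(14, 14, 24)  # illustrative counts -> p=1.0, r=0.5833..., f1=0.7368...
    slot = prf1(20, 40, 529)     # illustrative counts -> p=0.5, r=0.03780..., f1=0.07029...
    combined = template['f1'] * slot['f1']  # 'combined' = template F1 * slot F1
    print({'template': template, 'slot': slot, 'combined': combined})
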
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
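
Note: the best result above still points at epoch 3; the dev 'combined' score reached 0.05179909351586346 there and never exceeded it afterwards, so the epoch-3 test and sample numbers are the ones the log keeps reprinting. A sketch of that bookkeeping; evaluate() and all names are stand-ins rather than the repository's code, and the strict greater-than comparison is an inference from the fact that epoch 29 ties the dev score yet epoch 3 remains the best:

    def evaluate(epoch):
        # Stand-in returning per-language result dicts shaped like the log's.
        dev_combined = 0.05179909351586346 if epoch >= 3 else 0.0
        return {lang: {'dev': {'combined': dev_combined, 'epoch': epoch}}
                for lang in ('Chinese', 'Korean', 'Russian')}

    best = {}
    for epoch in range(30):  # --max_epoch 30
        results = evaluate(epoch)
        for lang, res in results.items():
            # Strict ">" keeps the earliest epoch on ties, which is why the
            # block above still reports epoch 3 despite an equal score at 29.
            if lang not in best or res['dev']['combined'] > best[lang]['dev']['combined']:
                best[lang] = res

    print({lang: best[lang]['dev']['epoch'] for lang in best})  # -> all 3
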