Command that produces this log: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
----------------------------------------------------------------------------------------------------
> trainable params:
>>> xlmr.embeddings.word_embeddings.weight: torch.Size([250002, 1024])
>>> xlmr.embeddings.position_embeddings.weight: torch.Size([514, 1024])
>>> xlmr.embeddings.token_type_embeddings.weight: torch.Size([1, 1024])
>>> xlmr.embeddings.LayerNorm.weight: torch.Size([1024])
>>> xlmr.embeddings.LayerNorm.bias: torch.Size([1024])
(all 24 encoder layers, xlmr.encoder.layer.0 through xlmr.encoder.layer.23, list the same 16 parameters; N stands for the layer index)
>>> xlmr.encoder.layer.N.attention.self.query.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.query.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.self.key.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.key.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.self.value.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.self.value.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.dense.weight: torch.Size([1024, 1024])
>>> xlmr.encoder.layer.N.attention.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.N.attention.output.LayerNorm.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.intermediate.dense.weight: torch.Size([4096, 1024])
>>> xlmr.encoder.layer.N.intermediate.dense.bias: torch.Size([4096])
>>> xlmr.encoder.layer.N.output.dense.weight: torch.Size([1024, 4096])
>>> xlmr.encoder.layer.N.output.dense.bias: torch.Size([1024])
>>> xlmr.encoder.layer.N.output.LayerNorm.weight: torch.Size([1024])
>>> xlmr.encoder.layer.N.output.LayerNorm.bias: torch.Size([1024])
>>> xlmr.pooler.dense.weight: torch.Size([1024, 1024])
>>> xlmr.pooler.dense.bias: torch.Size([1024])
>>> trans_rep.weight: torch.Size([1024, 2048])
>>> trans_rep.bias: torch.Size([1024])
(each head line below repeats once per event template T, in the order Corruplate, Cybercrimeplate, Disasterplate, Displacementplate, Epidemiplate, Etiplate, Protestplate, Terrorplate)
>>> hidden_ffns.T.layers.0.weight: torch.Size([768, 1024])
>>> hidden_ffns.T.layers.0.bias: torch.Size([768])
>>> template_classifiers.T.layers.0.weight: torch.Size([450, 768])
>>> template_classifiers.T.layers.0.bias: torch.Size([450])
>>> template_classifiers.T.layers.1.weight: torch.Size([2, 450])
>>> template_classifiers.T.layers.1.bias: torch.Size([2])
>>> type_classifiers.T.layers.0.weight: torch.Size([450, 768])
>>> type_classifiers.T.layers.0.bias: torch.Size([450])
>>> type_classifiers.T.layers.1.weight: torch.Size([6, 450])
>>> type_classifiers.T.layers.1.bias: torch.Size([6])
>>> completion_classifiers.T.layers.0.weight: torch.Size([450, 768])
>>> completion_classifiers.T.layers.0.bias: torch.Size([450])
>>> completion_classifiers.T.layers.1.weight: torch.Size([4, 450])
>>> completion_classifiers.T.layers.1.bias: torch.Size([4])
>>> overtime_classifiers.T.layers.0.weight: torch.Size([450, 768])
>>> overtime_classifiers.T.layers.0.bias: torch.Size([450])
>>> overtime_classifiers.T.layers.1.weight: torch.Size([2, 450])
>>> overtime_classifiers.T.layers.1.bias: torch.Size([2])
>>> coordinated_classifiers.T.layers.0.weight: torch.Size([450, 768])
>>> coordinated_classifiers.T.layers.0.bias: torch.Size([450])
>>> coordinated_classifiers.T.layers.1.weight: torch.Size([2, 450])
>>> coordinated_classifiers.T.layers.1.bias: torch.Size([2])
n_trainable_params: 582185936, n_nontrainable_params: 0
----------------------------------------------------------------------------------------------------
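A parameter dump in this format is typically produced with a short helper like the minimal sketch below. The named_parameters(), requires_grad, and numel() calls are standard PyTorch; the exact helper inside train.py is not shown in the log, so the print formatting here is an assumption.

import torch.nn as nn

def print_trainable_params(model: nn.Module) -> None:
    # One ">>> name: shape" line per trainable tensor, then the totals,
    # mirroring the dump above (assumed formatting, standard PyTorch calls).
    print('> trainable params:')
    n_trainable, n_nontrainable = 0, 0
    for name, p in model.named_parameters():
        if p.requires_grad:
            print(f'>>> {name}: {p.shape}')
            n_trainable += p.numel()
        else:
            n_nontrainable += p.numel()
    print(f'n_trainable_params: {n_trainable}, n_nontrainable_params: {n_nontrainable}')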
******************************
Epoch: 0
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 09:16:56.358065: step: 4/77, loss: 1.0449273586273193
2023-01-22 09:16:57.827825: step: 8/77, loss: 1.0661876201629639
2023-01-22 09:16:59.324630: step: 12/77, loss: 1.0697906017303467
2023-01-22 09:17:00.699488: step: 16/77, loss: 1.0578088760375977
2023-01-22 09:17:02.187594: step: 20/77, loss: 1.0446879863739014
2023-01-22 09:17:03.652851: step: 24/77, loss: 1.0503871440887451
2023-01-22 09:17:05.114233: step: 28/77, loss: 1.04624342918396
2023-01-22 09:17:06.599356: step: 32/77, loss: 1.0361480712890625
2023-01-22 09:17:08.075614: step: 36/77, loss: 1.041550874710083
2023-01-22 09:17:09.568526: step: 40/77, loss: 1.0250319242477417
2023-01-22 09:17:11.019204: step: 44/77, loss: 1.0202736854553223
2023-01-22 09:17:12.507145: step: 48/77, loss: 1.0159680843353271
2023-01-22 09:17:14.090441: step: 52/77, loss: 1.002465844154358
2023-01-22 09:17:15.580382: step: 56/77, loss: 0.977909505367279
2023-01-22 09:17:17.022893: step: 60/77, loss: 0.9835793972015381
2023-01-22 09:17:18.488566: step: 64/77, loss: 0.9763692617416382
2023-01-22 09:17:19.961940: step: 68/77, loss: 0.9826990962028503
2023-01-22 09:17:21.425546: step: 72/77, loss: 0.9448894262313843
2023-01-22 09:17:22.908391: step: 76/77, loss: 0.9651656150817871
2023-01-22 09:17:24.451368: step: 80/77, loss: 0.9191781282424927
2023-01-22 09:17:25.953672: step: 84/77, loss: 0.9262199401855469
2023-01-22 09:17:27.367510: step: 88/77, loss: 0.9188696146011353
2023-01-22 09:17:28.838873: step: 92/77, loss: 0.8682029247283936
2023-01-22 09:17:30.321267: step: 96/77, loss: 0.8716950416564941
2023-01-22 09:17:31.742621: step: 100/77, loss: 0.8641495108604431
2023-01-22 09:17:33.229413: step: 104/77, loss: 0.8422443866729736
2023-01-22 09:17:34.747972: step: 108/77, loss: 0.7932794094085693
2023-01-22 09:17:36.307033: step: 112/77, loss: 0.8285765647888184
2023-01-22 09:17:37.749700: step: 116/77, loss: 0.7687060832977295
2023-01-22 09:17:39.260142: step: 120/77, loss: 0.7670192718505859
2023-01-22 09:17:40.744833: step: 124/77, loss: 0.7816425561904907
2023-01-22 09:17:42.142011: step: 128/77, loss: 0.7292256355285645
2023-01-22 09:17:43.634157: step: 132/77, loss: 0.7331589460372925
2023-01-22 09:17:45.174150: step: 136/77, loss: 0.6675155162811279
2023-01-22 09:17:46.575445: step: 140/77, loss: 0.6493780016899109
2023-01-22 09:17:48.089943: step: 144/77, loss: 0.646944522857666
2023-01-22 09:17:49.591924: step: 148/77, loss: 0.6240701675415039
2023-01-22 09:17:51.172276: step: 152/77, loss: 0.5725189447402954
2023-01-22 09:17:52.690481: step: 156/77, loss: 0.5819498896598816
2023-01-22 09:17:54.174964: step: 160/77, loss: 0.5860099792480469
2023-01-22 09:17:55.688048: step: 164/77, loss: 0.5213608145713806
2023-01-22 09:17:57.209525: step: 168/77, loss: 0.5680379867553711
2023-01-22 09:17:58.686592: step: 172/77, loss: 0.45404040813446045
2023-01-22 09:18:00.140585: step: 176/77, loss: 0.44066500663757324
2023-01-22 09:18:01.626087: step: 180/77, loss: 0.4716237783432007
2023-01-22 09:18:03.090223: step: 184/77, loss: 0.3929561376571655
2023-01-22 09:18:04.621112: step: 188/77, loss: 0.4217242896556854
2023-01-22 09:18:06.057996: step: 192/77, loss: 0.39566516876220703
2023-01-22 09:18:07.523149: step: 196/77, loss: 0.41571173071861267
2023-01-22 09:18:08.982250: step: 200/77, loss: 0.2986573874950409
2023-01-22 09:18:10.428547: step: 204/77, loss: 0.44773268699645996
2023-01-22 09:18:11.899374: step: 208/77, loss: 0.2613578140735626
2023-01-22 09:18:13.367739: step: 212/77, loss: 0.22704322636127472
2023-01-22 09:18:14.807222: step: 216/77, loss: 0.1780972182750702
2023-01-22 09:18:16.282623: step: 220/77, loss: 0.20675839483737946
2023-01-22 09:18:17.823268: step: 224/77, loss: 0.19478635489940643
2023-01-22 09:18:19.196436: step: 228/77, loss: 0.2304016649723053
2023-01-22 09:18:20.678548: step: 232/77, loss: 0.2229248285293579
2023-01-22 09:18:22.188024: step: 236/77, loss: 0.13515512645244598
2023-01-22 09:18:23.681849: step: 240/77, loss: 0.11883608996868134
2023-01-22 09:18:25.188740: step: 244/77, loss: 0.1275760531425476
2023-01-22 09:18:26.724201: step: 248/77, loss: 0.33747437596321106
2023-01-22 09:18:28.215214: step: 252/77, loss: 0.12381209433078766
2023-01-22 09:18:29.744957: step: 256/77, loss: 0.40654462575912476
2023-01-22 09:18:31.292560: step: 260/77, loss: 0.14962953329086304
2023-01-22 09:18:32.725822: step: 264/77, loss: 0.044936202466487885
2023-01-22 09:18:34.170252: step: 268/77, loss: 0.06391699612140656
2023-01-22 09:18:35.653102: step: 272/77, loss: 0.20007681846618652
2023-01-22 09:18:37.188693: step: 276/77, loss: 0.2122235894203186
2023-01-22 09:18:38.578746: step: 280/77, loss: 0.13911424577236176
2023-01-22 09:18:40.072472: step: 284/77, loss: 0.06268332153558731
2023-01-22 09:18:41.582624: step: 288/77, loss: 0.11234302073717117
2023-01-22 09:18:43.025641: step: 292/77, loss: 0.04873437061905861
2023-01-22 09:18:44.481150: step: 296/77, loss: 0.1151905208826065
2023-01-22 09:18:45.982836: step: 300/77, loss: 0.20775069296360016
2023-01-22 09:18:47.467075: step: 304/77, loss: 0.1022971048951149
2023-01-22 09:18:48.955247: step: 308/77, loss: 0.14672891795635223
2023-01-22 09:18:50.381212: step: 312/77, loss: 0.036175355315208435
2023-01-22 09:18:51.828978: step: 316/77, loss: 0.06932605803012848
2023-01-22 09:18:53.282684: step: 320/77, loss: 0.24824608862400055
2023-01-22 09:18:54.804056: step: 324/77, loss: 0.0985107496380806
2023-01-22 09:18:56.238294: step: 328/77, loss: 0.27712494134902954
2023-01-22 09:18:57.683053: step: 332/77, loss: 0.080054372549057
2023-01-22 09:18:59.085779: step: 336/77, loss: 0.24405285716056824
2023-01-22 09:19:00.549558: step: 340/77, loss: 0.11317433416843414
2023-01-22 09:19:02.044472: step: 344/77, loss: 0.13220016658306122
2023-01-22 09:19:03.484371: step: 348/77, loss: 0.04351910203695297
2023-01-22 09:19:05.005198: step: 352/77, loss: 0.08148898184299469
2023-01-22 09:19:06.447395: step: 356/77, loss: 0.25758957862854004
2023-01-22 09:19:07.942433: step: 360/77, loss: 0.19760257005691528
2023-01-22 09:19:09.448521: step: 364/77, loss: 0.08745350688695908
2023-01-22 09:19:11.015004: step: 368/77, loss: 0.20615199208259583
2023-01-22 09:19:12.517086: step: 372/77, loss: 0.04876667261123657
2023-01-22 09:19:13.968204: step: 376/77, loss: 0.1216202825307846
2023-01-22 09:19:15.377747: step: 380/77, loss: 0.16642817854881287
2023-01-22 09:19:16.884625: step: 384/77, loss: 0.06036415696144104
2023-01-22 09:19:18.340494: step: 388/77, loss: 0.1854311227798462
==================================================
Loss: 0.487
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
New best chinese model...
New best korean model...
New best russian model...
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
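A note on the evaluation dicts above: {'p': 1.0, 'r': 0.0, 'f1': 0.0} on the test splits is the usual signature of a scorer that reports precision as 1.0 when the model makes no predictions (there are no false positives), while recall against a non-empty gold set stays 0.0; the dev splits report p = 0.0 instead, so the two splits evidently hit different branches of the scorer. The sketch below reproduces that convention under those assumptions; the actual scorer used by train.py is not shown in the log.

def prf(n_pred: int, n_gold: int, n_correct: int) -> dict:
    # Assumed convention: no predictions means no false positives, so p = 1.0;
    # recall is 0.0 while nothing correct is found; F1 is 0.0 unless both are > 0.
    p = n_correct / n_pred if n_pred > 0 else 1.0
    r = n_correct / n_gold if n_gold > 0 else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return {'p': p, 'r': r, 'f1': f1}

# A model that emits nothing reproduces the epoch-0 test rows (gold size hypothetical):
assert prf(n_pred=0, n_gold=50, n_correct=0) == {'p': 1.0, 'r': 0.0, 'f1': 0.0}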
******************************
Epoch: 1
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 09:21:13.751247: step: 4/77, loss: 0.08808518946170807
2023-01-22 09:21:15.335225: step: 8/77, loss: 0.07598349452018738
2023-01-22 09:21:16.718984: step: 12/77, loss: 0.07537257671356201
2023-01-22 09:21:18.125848: step: 16/77, loss: 0.19350022077560425
2023-01-22 09:21:19.522590: step: 20/77, loss: 0.08845332264900208
2023-01-22 09:21:20.994339: step: 24/77, loss: 0.06127769500017166
2023-01-22 09:21:22.494691: step: 28/77, loss: 0.08885109424591064
2023-01-22 09:21:23.965234: step: 32/77, loss: 0.21233563125133514
2023-01-22 09:21:25.439419: step: 36/77, loss: 0.3040357232093811
2023-01-22 09:21:26.903331: step: 40/77, loss: 0.27482870221138
2023-01-22 09:21:28.425857: step: 44/77, loss: 0.08601843565702438
2023-01-22 09:21:29.869032: step: 48/77, loss: 0.07719548791646957
2023-01-22 09:21:31.343980: step: 52/77, loss: 0.06267912685871124
2023-01-22 09:21:32.809445: step: 56/77, loss: 0.07540792971849442
2023-01-22 09:21:34.248675: step: 60/77, loss: 0.04689452797174454
2023-01-22 09:21:35.712920: step: 64/77, loss: 0.15594510734081268
2023-01-22 09:21:37.232915: step: 68/77, loss: 0.14404773712158203
2023-01-22 09:21:38.711558: step: 72/77, loss: 0.07119323313236237
2023-01-22 09:21:40.214456: step: 76/77, loss: 0.16119280457496643
2023-01-22 09:21:41.674024: step: 80/77, loss: 0.09215322881937027
2023-01-22 09:21:43.234657: step: 84/77, loss: 0.3043191432952881
2023-01-22 09:21:44.708998: step: 88/77, loss: 0.3175913393497467
2023-01-22 09:21:46.174417: step: 92/77, loss: 0.08320172131061554
2023-01-22 09:21:47.649894: step: 96/77, loss: 0.03388125076889992
2023-01-22 09:21:49.048797: step: 100/77, loss: 0.03213505819439888
2023-01-22 09:21:50.522159: step: 104/77, loss: 0.11765853315591812
2023-01-22 09:21:51.961605: step: 108/77, loss: 0.16065680980682373
2023-01-22 09:21:53.445189: step: 112/77, loss: 0.031914323568344116
2023-01-22 09:21:54.858532: step: 116/77, loss: 0.1371314525604248
2023-01-22 09:21:56.368475: step: 120/77, loss: 0.057588137686252594
2023-01-22 09:21:57.806766: step: 124/77, loss: 0.17369845509529114
2023-01-22 09:21:59.245137: step: 128/77, loss: 0.14050480723381042
2023-01-22 09:22:00.746476: step: 132/77, loss: 0.06881752610206604
2023-01-22 09:22:02.173876: step: 136/77, loss: 0.08079089224338531
2023-01-22 09:22:03.591808: step: 140/77, loss: 0.11471383273601532
2023-01-22 09:22:05.034844: step: 144/77, loss: 0.07265495508909225
2023-01-22 09:22:06.488623: step: 148/77, loss: 0.0584573820233345
2023-01-22 09:22:08.043858: step: 152/77, loss: 0.20290088653564453
2023-01-22 09:22:09.500319: step: 156/77, loss: 0.08977239578962326
2023-01-22 09:22:10.974914: step: 160/77, loss: 0.2409203052520752
2023-01-22 09:22:12.422538: step: 164/77, loss: 0.31462815403938293
2023-01-22 09:22:13.874609: step: 168/77, loss: 0.09223669022321701
2023-01-22 09:22:15.392226: step: 172/77, loss: 0.15644097328186035
2023-01-22 09:22:16.873738: step: 176/77, loss: 0.03518672287464142
2023-01-22 09:22:18.366102: step: 180/77, loss: 0.0733434408903122
2023-01-22 09:22:19.871376: step: 184/77, loss: 0.19737818837165833
2023-01-22 09:22:21.332821: step: 188/77, loss: 0.07022371888160706
2023-01-22 09:22:22.834879: step: 192/77, loss: 0.08941012620925903
2023-01-22 09:22:24.313197: step: 196/77, loss: 0.17177169024944305
2023-01-22 09:22:25.768500: step: 200/77, loss: 0.0818762332201004
2023-01-22 09:22:27.255119: step: 204/77, loss: 0.1051383763551712
2023-01-22 09:22:28.623268: step: 208/77, loss: 0.13907712697982788
2023-01-22 09:22:30.048341: step: 212/77, loss: 0.050016190856695175
2023-01-22 09:22:31.477574: step: 216/77, loss: 0.05056443437933922
2023-01-22 09:22:32.937158: step: 220/77, loss: 0.07941845059394836
2023-01-22 09:22:34.412419: step: 224/77, loss: 0.08878391236066818
2023-01-22 09:22:35.914689: step: 228/77, loss: 0.0625339150428772
2023-01-22 09:22:37.298376: step: 232/77, loss: 0.06150152161717415
2023-01-22 09:22:38.765289: step: 236/77, loss: 0.11748947948217392
2023-01-22 09:22:40.263711: step: 240/77, loss: 0.18033303320407867
2023-01-22 09:22:41.759740: step: 244/77, loss: 0.14193198084831238
2023-01-22 09:22:43.279282: step: 248/77, loss: 0.08232773840427399
2023-01-22 09:22:44.743300: step: 252/77, loss: 0.2342931628227234
2023-01-22 09:22:46.198654: step: 256/77, loss: 0.08909574151039124
2023-01-22 09:22:47.679894: step: 260/77, loss: 0.05348915234208107
2023-01-22 09:22:49.126202: step: 264/77, loss: 0.1067211776971817
2023-01-22 09:22:50.556724: step: 268/77, loss: 0.23183506727218628
2023-01-22 09:22:52.030069: step: 272/77, loss: 0.1415650099515915
2023-01-22 09:22:53.509851: step: 276/77, loss: 0.06152913719415665
2023-01-22 09:22:54.956166: step: 280/77, loss: 0.041706740856170654
2023-01-22 09:22:56.445780: step: 284/77, loss: 0.10872595757246017
2023-01-22 09:22:57.946409: step: 288/77, loss: 0.043247997760772705
2023-01-22 09:22:59.409321: step: 292/77, loss: 0.12283715605735779
2023-01-22 09:23:00.872330: step: 296/77, loss: 0.03415264934301376
2023-01-22 09:23:02.308492: step: 300/77, loss: 0.17462123930454254
2023-01-22 09:23:03.790722: step: 304/77, loss: 0.1322343945503235
2023-01-22 09:23:05.317916: step: 308/77, loss: 0.06682181358337402
2023-01-22 09:23:06.866575: step: 312/77, loss: 0.07804687321186066
2023-01-22 09:23:08.285686: step: 316/77, loss: 0.36672520637512207
2023-01-22 09:23:09.726505: step: 320/77, loss: 0.06110672280192375
2023-01-22 09:23:11.201398: step: 324/77, loss: 0.1718875765800476
2023-01-22 09:23:12.714400: step: 328/77, loss: 0.13002590835094452
2023-01-22 09:23:14.166899: step: 332/77, loss: 0.10468481481075287
2023-01-22 09:23:15.662493: step: 336/77, loss: 0.1538790762424469
2023-01-22 09:23:17.050154: step: 340/77, loss: 0.026036838069558144
2023-01-22 09:23:18.552012: step: 344/77, loss: 0.11194135248661041
2023-01-22 09:23:20.054987: step: 348/77, loss: 0.07212279736995697
2023-01-22 09:23:21.572932: step: 352/77, loss: 0.06141982227563858
2023-01-22 09:23:23.073769: step: 356/77, loss: 0.10051367431879044
2023-01-22 09:23:24.592463: step: 360/77, loss: 0.08243393898010254
2023-01-22 09:23:26.041226: step: 364/77, loss: 0.08698487281799316
2023-01-22 09:23:27.518516: step: 368/77, loss: 0.06616237759590149
2023-01-22 09:23:29.007801: step: 372/77, loss: 0.08207230269908905
2023-01-22 09:23:30.513073: step: 376/77, loss: 0.13694001734256744
2023-01-22 09:23:31.942351: step: 380/77, loss: 0.06030358374118805
2023-01-22 09:23:33.452444: step: 384/77, loss: 0.07378076016902924
2023-01-22 09:23:34.930755: step: 388/77, loss: 0.0624019056558609
==================================================
Loss: 0.115
--------------------
Dev Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Dev Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Test Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
Sample Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 1}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Chinese: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Chinese: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
--------------------
Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0}
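Two of the command-line flags deserve a gloss: --batch_size 10 with --accumulate_step 4 gives an effective batch of 40, and --xlmr_learning_rate 2e-5 versus --learning_rate 2e-4 gives the encoder and the task heads separate learning rates. A minimal sketch of the conventional wiring for both, assuming the encoder parameters are the ones whose names start with 'xlmr.' (as in the dump above) and assuming AdamW; train.py's actual implementation is not shown in the log.

import torch

def build_optimizer(model, xlmr_lr=2e-5, head_lr=2e-4):
    # Two parameter groups: encoder weights get the small fine-tuning rate,
    # the template/type/completion/overtime/coordinated heads the larger one.
    xlmr_params = [p for n, p in model.named_parameters() if n.startswith('xlmr.')]
    head_params = [p for n, p in model.named_parameters() if not n.startswith('xlmr.')]
    return torch.optim.AdamW([{'params': xlmr_params, 'lr': xlmr_lr},
                              {'params': head_params, 'lr': head_lr}])

def train_epoch(model, loader, optimizer, accumulate_step=4):
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(**batch)                # assumes the model returns a scalar loss
        (loss / accumulate_step).backward()  # scale so gradients average over the window
        if i % accumulate_step == 0:         # one optimizer step per 4 batches of 10
            optimizer.step()
            optimizer.zero_grad()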
0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Korean: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Korean: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} -------------------- Dev for Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Test for Russian: {'template': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 1.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} Russian: {'template': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 0} ****************************** Epoch: 2 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:25:12.086793: step: 4/77, loss: 0.13258124887943268 2023-01-22 09:25:13.552149: step: 8/77, loss: 0.06252895295619965 2023-01-22 09:25:14.977051: step: 12/77, loss: 0.10674737393856049 2023-01-22 09:25:16.496701: step: 16/77, loss: 0.09641920775175095 2023-01-22 09:25:17.950756: step: 20/77, loss: 0.11784550547599792 2023-01-22 09:25:19.413090: step: 24/77, loss: 0.05828585475683212 2023-01-22 09:25:20.916118: step: 28/77, loss: 0.058067552745342255 2023-01-22 09:25:22.448289: step: 32/77, loss: 0.09018880873918533 2023-01-22 09:25:23.861144: step: 36/77, loss: 0.22561988234519958 2023-01-22 09:25:25.282741: step: 40/77, loss: 0.14366985857486725 2023-01-22 09:25:26.785761: step: 44/77, loss: 0.08453567326068878 2023-01-22 09:25:28.184787: step: 48/77, loss: 0.03233639895915985 2023-01-22 09:25:29.627863: step: 52/77, loss: 0.07216435670852661 2023-01-22 09:25:31.049688: step: 56/77, loss: 0.24645262956619263 2023-01-22 09:25:32.571934: step: 60/77, loss: 0.050122007727622986 2023-01-22 09:25:34.043053: step: 64/77, loss: 0.12721282243728638 2023-01-22 09:25:35.427719: step: 68/77, loss: 0.05591835081577301 2023-01-22 09:25:36.934350: step: 72/77, loss: 0.06773615628480911 2023-01-22 09:25:38.341711: step: 76/77, loss: 0.14394433796405792 2023-01-22 09:25:39.747372: step: 80/77, loss: 0.0904054343700409 2023-01-22 09:25:41.264409: step: 84/77, loss: 0.23689547181129456 2023-01-22 09:25:42.778573: step: 88/77, loss: 0.1101958304643631 2023-01-22 09:25:44.238477: step: 92/77, loss: 0.08307532966136932 2023-01-22 09:25:45.711351: step: 96/77, loss: 0.15602004528045654 2023-01-22 09:25:47.159056: step: 100/77, loss: 0.04147651046514511 2023-01-22 09:25:48.660307: step: 104/77, loss: 0.15626007318496704 2023-01-22 09:25:50.212289: step: 108/77, loss: 0.08543263375759125 2023-01-22 09:25:51.608057: step: 112/77, loss: 0.05249679833650589 2023-01-22 09:25:53.157450: step: 116/77, loss: 0.13839921355247498 2023-01-22 09:25:54.635012: step: 120/77, loss: 0.1184510886669159 2023-01-22 09:25:56.092777: step: 124/77, loss: 0.12318076193332672 2023-01-22 09:25:57.571807: step: 128/77, loss: 0.06913600862026215 2023-01-22 09:25:59.029294: step: 132/77, loss: 0.07552852481603622 2023-01-22 09:26:00.517843: step: 136/77, loss: 0.1511237621307373 2023-01-22 09:26:01.975436: step: 140/77, loss: 0.11312386393547058 2023-01-22 09:26:03.439824: step: 
144/77, loss: 0.07943280786275864 2023-01-22 09:26:04.959098: step: 148/77, loss: 0.03293665871024132 2023-01-22 09:26:06.410537: step: 152/77, loss: 0.06744928658008575 2023-01-22 09:26:07.887959: step: 156/77, loss: 0.08263631165027618 2023-01-22 09:26:09.353786: step: 160/77, loss: 0.079817034304142 2023-01-22 09:26:10.855005: step: 164/77, loss: 0.03467145934700966 2023-01-22 09:26:12.300871: step: 168/77, loss: 0.025395743548870087 2023-01-22 09:26:13.775990: step: 172/77, loss: 0.06069394573569298 2023-01-22 09:26:15.282880: step: 176/77, loss: 0.01779540255665779 2023-01-22 09:26:16.729807: step: 180/77, loss: 0.03509168699383736 2023-01-22 09:26:18.187202: step: 184/77, loss: 0.03671012073755264 2023-01-22 09:26:19.693102: step: 188/77, loss: 0.04279708117246628 2023-01-22 09:26:21.201280: step: 192/77, loss: 0.031519681215286255 2023-01-22 09:26:22.669869: step: 196/77, loss: 0.03178836405277252 2023-01-22 09:26:24.139367: step: 200/77, loss: 0.03438074141740799 2023-01-22 09:26:25.581256: step: 204/77, loss: 0.013872837647795677 2023-01-22 09:26:27.062092: step: 208/77, loss: 0.0239429734647274 2023-01-22 09:26:28.464359: step: 212/77, loss: 0.04659315571188927 2023-01-22 09:26:29.920577: step: 216/77, loss: 0.00971127487719059 2023-01-22 09:26:31.441681: step: 220/77, loss: 0.06695039570331573 2023-01-22 09:26:32.915078: step: 224/77, loss: 0.06293109059333801 2023-01-22 09:26:34.460207: step: 228/77, loss: 0.0710817202925682 2023-01-22 09:26:35.959331: step: 232/77, loss: 0.018637798726558685 2023-01-22 09:26:37.474038: step: 236/77, loss: 0.11581932008266449 2023-01-22 09:26:38.929561: step: 240/77, loss: 0.04097363352775574 2023-01-22 09:26:40.398845: step: 244/77, loss: 0.014920370653271675 2023-01-22 09:26:41.903309: step: 248/77, loss: 0.01983986236155033 2023-01-22 09:26:43.324117: step: 252/77, loss: 0.024595849215984344 2023-01-22 09:26:44.811163: step: 256/77, loss: 0.04437322914600372 2023-01-22 09:26:46.300303: step: 260/77, loss: 0.014827568084001541 2023-01-22 09:26:47.813190: step: 264/77, loss: 0.05651269108057022 2023-01-22 09:26:49.264530: step: 268/77, loss: 0.05368447303771973 2023-01-22 09:26:50.762605: step: 272/77, loss: 0.008875405415892601 2023-01-22 09:26:52.250194: step: 276/77, loss: 0.08118952065706253 2023-01-22 09:26:53.701178: step: 280/77, loss: 0.044285401701927185 2023-01-22 09:26:55.156481: step: 284/77, loss: 0.021020062267780304 2023-01-22 09:26:56.641652: step: 288/77, loss: 0.04102979600429535 2023-01-22 09:26:58.125474: step: 292/77, loss: 0.08634628355503082 2023-01-22 09:26:59.568001: step: 296/77, loss: 0.10369566082954407 2023-01-22 09:27:01.068467: step: 300/77, loss: 0.1591092050075531 2023-01-22 09:27:02.601644: step: 304/77, loss: 0.0445716418325901 2023-01-22 09:27:04.083288: step: 308/77, loss: 0.044002607464790344 2023-01-22 09:27:05.577172: step: 312/77, loss: 0.12348321080207825 2023-01-22 09:27:07.016959: step: 316/77, loss: 0.04513373598456383 2023-01-22 09:27:08.509542: step: 320/77, loss: 0.07740610092878342 2023-01-22 09:27:10.033723: step: 324/77, loss: 0.017783014103770256 2023-01-22 09:27:11.492641: step: 328/77, loss: 0.016946876421570778 2023-01-22 09:27:13.079334: step: 332/77, loss: 0.011803146451711655 2023-01-22 09:27:14.570511: step: 336/77, loss: 0.07661925256252289 2023-01-22 09:27:16.015640: step: 340/77, loss: 0.02661910280585289 2023-01-22 09:27:17.505975: step: 344/77, loss: 0.03145609050989151 2023-01-22 09:27:19.000145: step: 348/77, loss: 0.019618254154920578 2023-01-22 09:27:20.487969: step: 352/77, 
loss: 0.07848714292049408 2023-01-22 09:27:21.958059: step: 356/77, loss: 0.04304105043411255 2023-01-22 09:27:23.437633: step: 360/77, loss: 0.04550347849726677 2023-01-22 09:27:24.941445: step: 364/77, loss: 0.019600503146648407 2023-01-22 09:27:26.457428: step: 368/77, loss: 0.0321541503071785 2023-01-22 09:27:28.056675: step: 372/77, loss: 0.08753525465726852 2023-01-22 09:27:29.581313: step: 376/77, loss: 0.041867706924676895 2023-01-22 09:27:31.066705: step: 380/77, loss: 0.03870141878724098 2023-01-22 09:27:32.497934: step: 384/77, loss: 0.05822906270623207 2023-01-22 09:27:33.954013: step: 388/77, loss: 0.03057796321809292 ================================================== Loss: 0.069 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test Chinese: {'template': {'p': 0.9384615384615385, 'r': 0.46564885496183206, 'f1': 0.6224489795918368}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016808748815982093, 'epoch': 2} Dev Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test Korean: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2} Dev Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} New best chinese model... New best korean model... New best russian model... 
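--------------------
Reading the score dicts: each entry reports precision (p), recall (r) and F1 for the template and slot subtasks, and the logged 'combined' value is consistent with the product of the two F1 scores. A minimal sketch of that arithmetic, checked against the epoch-2 "Dev Chinese" entry above (the helper names are illustrative, not taken from train.py):

def f1(p: float, r: float) -> float:
    # Standard F1: harmonic mean of precision and recall (0.0 when both are 0).
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    # The logged 'combined' value matches template-F1 * slot-F1.
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

template = {'p': 1.0, 'r': 0.5666666666666667}   # f1 -> 0.7234042553191489
slot     = {'p': 0.5, 'r': 0.03780718336483932}  # f1 -> 0.07029876977152899
print(combined_score(template, slot))            # ~0.05085442919642522

This also explains the epoch-1 zeros: with recall 0.0, F1 is 0.0 regardless of precision, so the p=1.0 test entries still combine to 0.0.
--------------------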
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test for Chinese: {'template': {'p': 0.9384615384615385, 'r': 0.46564885496183206, 'f1': 0.6224489795918368}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016808748815982093, 'epoch': 2} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 2} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test for Korean: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 2} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 2} Test for Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.6153846153846154, 'r': 0.013805004314063849, 'f1': 0.027004219409282704}, 'combined': 0.016533195556703698, 'epoch': 2} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 2} ****************************** Epoch: 3 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:29:32.738737: step: 4/77, loss: 0.012912862002849579 2023-01-22 09:29:34.077166: step: 8/77, loss: 0.04591010510921478 2023-01-22 09:29:35.633658: step: 12/77, loss: 0.02694067917764187 2023-01-22 09:29:37.131225: step: 16/77, loss: 0.01571756601333618 2023-01-22 09:29:38.512850: step: 20/77, loss: 0.01355262566357851 2023-01-22 09:29:39.970573: step: 24/77, loss: 0.03795113414525986 2023-01-22 09:29:41.451435: step: 28/77, loss: 0.03535859286785126 2023-01-22 09:29:42.870307: step: 32/77, loss: 0.015470536425709724 2023-01-22 09:29:44.313918: step: 36/77, loss: 0.015092505142092705 2023-01-22 09:29:45.809234: step: 40/77, loss: 0.1292186826467514 2023-01-22 09:29:47.217436: step: 44/77, loss: 0.07205881923437119 2023-01-22 09:29:48.716272: step: 48/77, loss: 0.03737745061516762 2023-01-22 09:29:50.198967: step: 52/77, loss: 0.011360120959579945 2023-01-22 09:29:51.695470: step: 56/77, loss: 0.007079091854393482 2023-01-22 09:29:53.230182: step: 60/77, loss: 0.021989651024341583 2023-01-22 09:29:54.654772: step: 64/77, loss: 0.11524832248687744 2023-01-22 09:29:56.132508: step: 68/77, loss: 0.029087066650390625 2023-01-22 09:29:57.654451: step: 72/77, loss: 0.025555476546287537 2023-01-22 09:29:59.076549: step: 76/77, loss: 0.05321405082941055 
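--------------------
On the step counter: losses are printed every 4 steps (4, 8, ..., 388), matching --accumulate_step 4, while the "/77" denominator does not track the numerator and looks like a cosmetic quirk of the progress string. A minimal gradient-accumulation loop with this logging cadence might look as follows (hypothetical names, not the actual train.py):

import torch

def train_epoch(model, loader, optimizer, accumulate_step=4):
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(loader, start=1):
        loss = model(**batch)                # assumes the model returns a scalar loss
        (loss / accumulate_step).backward()  # scale so the update averages micro-batches
        if step % accumulate_step == 0:      # update weights every 4 micro-batches
            optimizer.step()
            optimizer.zero_grad()
            print(f"step: {step}, loss: {loss.item()}")
--------------------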
2023-01-22 09:30:00.601288: step: 80/77, loss: 0.021694185212254524 2023-01-22 09:30:02.169140: step: 84/77, loss: 0.026616228744387627 2023-01-22 09:30:03.609810: step: 88/77, loss: 0.025496210902929306 2023-01-22 09:30:05.045097: step: 92/77, loss: 0.02640456147491932 2023-01-22 09:30:06.611763: step: 96/77, loss: 0.014219501987099648 2023-01-22 09:30:07.999119: step: 100/77, loss: 0.04648306965827942 2023-01-22 09:30:09.507815: step: 104/77, loss: 0.003235041629523039 2023-01-22 09:30:10.983375: step: 108/77, loss: 0.016798511147499084 2023-01-22 09:30:12.465405: step: 112/77, loss: 0.033600855618715286 2023-01-22 09:30:13.966666: step: 116/77, loss: 0.015783516690135002 2023-01-22 09:30:15.401296: step: 120/77, loss: 0.05280846357345581 2023-01-22 09:30:16.895098: step: 124/77, loss: 0.00289800763130188 2023-01-22 09:30:18.364632: step: 128/77, loss: 0.034957654774188995 2023-01-22 09:30:19.829523: step: 132/77, loss: 0.08969595283269882 2023-01-22 09:30:21.276399: step: 136/77, loss: 0.04882393777370453 2023-01-22 09:30:22.763615: step: 140/77, loss: 0.02632911317050457 2023-01-22 09:30:24.229183: step: 144/77, loss: 0.002386486390605569 2023-01-22 09:30:25.757042: step: 148/77, loss: 0.04905037209391594 2023-01-22 09:30:27.267492: step: 152/77, loss: 0.007842171005904675 2023-01-22 09:30:28.724971: step: 156/77, loss: 0.02815202623605728 2023-01-22 09:30:30.223015: step: 160/77, loss: 0.005176716484129429 2023-01-22 09:30:31.712201: step: 164/77, loss: 0.017684893682599068 2023-01-22 09:30:33.172885: step: 168/77, loss: 0.015900595113635063 2023-01-22 09:30:34.737282: step: 172/77, loss: 0.006918495055288076 2023-01-22 09:30:36.179798: step: 176/77, loss: 0.02818692848086357 2023-01-22 09:30:37.678240: step: 180/77, loss: 0.03534059599041939 2023-01-22 09:30:39.145300: step: 184/77, loss: 0.026743004098534584 2023-01-22 09:30:40.672544: step: 188/77, loss: 0.027218960225582123 2023-01-22 09:30:42.124286: step: 192/77, loss: 0.00607309490442276 2023-01-22 09:30:43.674566: step: 196/77, loss: 0.009721008129417896 2023-01-22 09:30:45.228368: step: 200/77, loss: 0.026713203638792038 2023-01-22 09:30:46.667549: step: 204/77, loss: 0.04019925370812416 2023-01-22 09:30:48.090936: step: 208/77, loss: 0.08135680109262466 2023-01-22 09:30:49.558235: step: 212/77, loss: 0.07656152546405792 2023-01-22 09:30:51.035494: step: 216/77, loss: 0.024162959307432175 2023-01-22 09:30:52.493167: step: 220/77, loss: 0.01666862890124321 2023-01-22 09:30:53.977652: step: 224/77, loss: 0.03890030086040497 2023-01-22 09:30:55.408298: step: 228/77, loss: 0.045673124492168427 2023-01-22 09:30:56.903682: step: 232/77, loss: 0.010527916252613068 2023-01-22 09:30:58.389590: step: 236/77, loss: 0.006743155419826508 2023-01-22 09:30:59.886656: step: 240/77, loss: 0.011435626074671745 2023-01-22 09:31:01.402916: step: 244/77, loss: 0.0381687693297863 2023-01-22 09:31:02.920323: step: 248/77, loss: 0.022242587059736252 2023-01-22 09:31:04.360965: step: 252/77, loss: 0.04305797815322876 2023-01-22 09:31:05.819296: step: 256/77, loss: 0.009767288342118263 2023-01-22 09:31:07.348484: step: 260/77, loss: 0.12234492599964142 2023-01-22 09:31:08.857945: step: 264/77, loss: 0.03097294643521309 2023-01-22 09:31:10.327885: step: 268/77, loss: 0.0573321133852005 2023-01-22 09:31:11.787825: step: 272/77, loss: 0.22089190781116486 2023-01-22 09:31:13.266451: step: 276/77, loss: 0.05211479216814041 2023-01-22 09:31:14.803621: step: 280/77, loss: 0.02089191973209381 2023-01-22 09:31:16.196923: step: 284/77, loss: 0.06037844344973564 
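--------------------
On the "New best ... model..." lines: each language keeps its own best checkpoint, and the "Current best result" blocks are refreshed only when that language's dev 'combined' score strictly improves (epoch 4 below ties epoch 3's dev score and triggers no save; epoch 1's 0.0 does not beat the epoch-0 baseline of 0.0). A minimal sketch of that bookkeeping (hypothetical names and paths, not the actual train.py):

import torch

best = {lang: {'combined': 0.0, 'epoch': 0} for lang in ('chinese', 'korean', 'russian')}

def maybe_save_best(lang, dev_result, model, epoch):
    # dev_result: the dict logged as "Dev <Language>" above
    if dev_result['combined'] > best[lang]['combined']:  # strict improvement only
        best[lang] = {'combined': dev_result['combined'], 'epoch': epoch}
        torch.save(model.state_dict(), f'best_{lang}_model.pt')
        print(f'New best {lang} model...')
--------------------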
2023-01-22 09:31:17.660571: step: 288/77, loss: 0.018641719594597816 2023-01-22 09:31:19.171804: step: 292/77, loss: 0.010771493427455425 2023-01-22 09:31:20.630452: step: 296/77, loss: 0.02911381423473358 2023-01-22 09:31:22.106721: step: 300/77, loss: 0.056641578674316406 2023-01-22 09:31:23.578859: step: 304/77, loss: 0.020282302051782608 2023-01-22 09:31:25.076372: step: 308/77, loss: 0.10549724102020264 2023-01-22 09:31:26.538391: step: 312/77, loss: 0.12837551534175873 2023-01-22 09:31:28.006218: step: 316/77, loss: 0.008430239744484425 2023-01-22 09:31:29.460169: step: 320/77, loss: 0.022032450884580612 2023-01-22 09:31:30.886275: step: 324/77, loss: 0.07047325372695923 2023-01-22 09:31:32.318347: step: 328/77, loss: 0.018198523670434952 2023-01-22 09:31:33.807823: step: 332/77, loss: 0.016754793003201485 2023-01-22 09:31:35.277315: step: 336/77, loss: 0.042492322623729706 2023-01-22 09:31:36.736543: step: 340/77, loss: 0.029950454831123352 2023-01-22 09:31:38.175708: step: 344/77, loss: 0.05285784602165222 2023-01-22 09:31:39.667787: step: 348/77, loss: 0.06085269898176193 2023-01-22 09:31:41.107137: step: 352/77, loss: 0.014680081978440285 2023-01-22 09:31:42.600869: step: 356/77, loss: 0.050214581191539764 2023-01-22 09:31:44.032122: step: 360/77, loss: 0.03656889870762825 2023-01-22 09:31:45.474909: step: 364/77, loss: 0.014042508788406849 2023-01-22 09:31:47.010852: step: 368/77, loss: 0.04989948868751526 2023-01-22 09:31:48.519951: step: 372/77, loss: 0.048607222735881805 2023-01-22 09:31:50.018960: step: 376/77, loss: 0.04302738979458809 2023-01-22 09:31:51.462196: step: 380/77, loss: 0.02379428781569004 2023-01-22 09:31:52.967475: step: 384/77, loss: 0.013878141529858112 2023-01-22 09:31:54.496104: step: 388/77, loss: 0.008634086698293686 ================================================== Loss: 0.036 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} Sample Russian: 
{'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} New best chinese model... New best korean model... New best russian model... ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 4 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:33:52.332313: step: 4/77, loss: 0.01413442101329565 2023-01-22 09:33:53.812819: step: 8/77, loss: 0.03007185272872448 2023-01-22 09:33:55.279400: step: 12/77, loss: 0.03499831259250641 2023-01-22 09:33:56.707476: step: 16/77, loss: 0.05546431988477707 2023-01-22 09:33:58.139104: step: 20/77, loss: 0.002329618204385042 2023-01-22 09:33:59.613308: step: 24/77, loss: 0.03021444007754326 2023-01-22 09:34:01.148891: step: 28/77, loss: 0.002550596371293068 2023-01-22 09:34:02.646327: step: 32/77, loss: 0.06376893818378448 2023-01-22 09:34:04.125986: step: 36/77, loss: 0.013936012983322144 2023-01-22 09:34:05.567508: step: 40/77, loss: 0.024293333292007446 2023-01-22 09:34:07.046401: step: 44/77, loss: 0.003448844887316227 2023-01-22 09:34:08.512476: step: 48/77, loss: 0.04759862273931503 2023-01-22 09:34:09.962432: step: 52/77, loss: 0.02035803720355034 2023-01-22 09:34:11.493917: step: 56/77, loss: 0.006649450398981571 2023-01-22 09:34:12.933830: step: 60/77, loss: 0.010570206679403782 2023-01-22 
09:34:14.379390: step: 64/77, loss: 0.05664711818099022 2023-01-22 09:34:15.836393: step: 68/77, loss: 0.015202310867607594 2023-01-22 09:34:17.362715: step: 72/77, loss: 0.03192824497818947 2023-01-22 09:34:18.806777: step: 76/77, loss: 0.04990002512931824 2023-01-22 09:34:20.299316: step: 80/77, loss: 0.004372069146484137 2023-01-22 09:34:21.792354: step: 84/77, loss: 0.06885036081075668 2023-01-22 09:34:23.301218: step: 88/77, loss: 0.002442281460389495 2023-01-22 09:34:24.776424: step: 92/77, loss: 0.003796561621129513 2023-01-22 09:34:26.228797: step: 96/77, loss: 0.04918007180094719 2023-01-22 09:34:27.726377: step: 100/77, loss: 0.0023976361844688654 2023-01-22 09:34:29.160331: step: 104/77, loss: 0.060220494866371155 2023-01-22 09:34:30.664001: step: 108/77, loss: 0.019232070073485374 2023-01-22 09:34:32.070809: step: 112/77, loss: 0.03376259282231331 2023-01-22 09:34:33.566581: step: 116/77, loss: 0.03330276161432266 2023-01-22 09:34:35.054149: step: 120/77, loss: 0.01623927801847458 2023-01-22 09:34:36.615184: step: 124/77, loss: 0.008382521569728851 2023-01-22 09:34:38.100027: step: 128/77, loss: 0.025504330173134804 2023-01-22 09:34:39.583413: step: 132/77, loss: 0.016718082129955292 2023-01-22 09:34:41.062503: step: 136/77, loss: 0.03711515665054321 2023-01-22 09:34:42.564011: step: 140/77, loss: 0.04266016557812691 2023-01-22 09:34:44.069898: step: 144/77, loss: 0.05167564004659653 2023-01-22 09:34:45.531091: step: 148/77, loss: 0.003722875379025936 2023-01-22 09:34:46.992922: step: 152/77, loss: 0.009192441590130329 2023-01-22 09:34:48.496737: step: 156/77, loss: 0.03469576686620712 2023-01-22 09:34:49.999285: step: 160/77, loss: 0.034946829080581665 2023-01-22 09:34:51.424392: step: 164/77, loss: 0.007201574742794037 2023-01-22 09:34:52.939853: step: 168/77, loss: 0.004851900972425938 2023-01-22 09:34:54.426635: step: 172/77, loss: 0.03253195434808731 2023-01-22 09:34:55.872219: step: 176/77, loss: 0.02325173281133175 2023-01-22 09:34:57.333636: step: 180/77, loss: 0.026802418753504753 2023-01-22 09:34:58.857650: step: 184/77, loss: 0.01016119122505188 2023-01-22 09:35:00.290000: step: 188/77, loss: 0.03971070796251297 2023-01-22 09:35:01.760638: step: 192/77, loss: 0.018513280898332596 2023-01-22 09:35:03.357169: step: 196/77, loss: 0.017288243398070335 2023-01-22 09:35:04.876718: step: 200/77, loss: 0.021497106179594994 2023-01-22 09:35:06.332224: step: 204/77, loss: 0.01803465560078621 2023-01-22 09:35:07.824302: step: 208/77, loss: 0.01829645223915577 2023-01-22 09:35:09.295501: step: 212/77, loss: 0.005914963781833649 2023-01-22 09:35:10.804388: step: 216/77, loss: 0.0278038140386343 2023-01-22 09:35:12.253037: step: 220/77, loss: 0.016222145408391953 2023-01-22 09:35:13.677756: step: 224/77, loss: 0.010350292548537254 2023-01-22 09:35:15.135751: step: 228/77, loss: 0.034529589116573334 2023-01-22 09:35:16.595822: step: 232/77, loss: 0.006905150134116411 2023-01-22 09:35:17.999620: step: 236/77, loss: 0.014340376481413841 2023-01-22 09:35:19.516108: step: 240/77, loss: 0.043415144085884094 2023-01-22 09:35:20.937646: step: 244/77, loss: 0.0488872304558754 2023-01-22 09:35:22.425369: step: 248/77, loss: 0.003966958727687597 2023-01-22 09:35:23.886673: step: 252/77, loss: 0.030302058905363083 2023-01-22 09:35:25.313130: step: 256/77, loss: 0.012804752215743065 2023-01-22 09:35:26.818511: step: 260/77, loss: 0.08690030872821808 2023-01-22 09:35:28.312294: step: 264/77, loss: 0.01370689831674099 2023-01-22 09:35:29.859754: step: 268/77, loss: 0.14590376615524292 2023-01-22 
09:35:31.288090: step: 272/77, loss: 0.003179072868078947 2023-01-22 09:35:32.749230: step: 276/77, loss: 0.0042409347370266914 2023-01-22 09:35:34.229977: step: 280/77, loss: 0.007665436249226332 2023-01-22 09:35:35.664886: step: 284/77, loss: 0.061556752771139145 2023-01-22 09:35:37.198618: step: 288/77, loss: 0.12139555811882019 2023-01-22 09:35:38.728500: step: 292/77, loss: 0.10933557152748108 2023-01-22 09:35:40.189869: step: 296/77, loss: 0.0040985699743032455 2023-01-22 09:35:41.610169: step: 300/77, loss: 0.012230083346366882 2023-01-22 09:35:43.105382: step: 304/77, loss: 0.11386236548423767 2023-01-22 09:35:44.609225: step: 308/77, loss: 0.0003976405132561922 2023-01-22 09:35:46.163889: step: 312/77, loss: 0.043285876512527466 2023-01-22 09:35:47.617039: step: 316/77, loss: 0.001523602637462318 2023-01-22 09:35:49.072575: step: 320/77, loss: 0.1256551891565323 2023-01-22 09:35:50.552235: step: 324/77, loss: 0.07823194563388824 2023-01-22 09:35:52.048896: step: 328/77, loss: 0.0037024905905127525 2023-01-22 09:35:53.575684: step: 332/77, loss: 0.05983438342809677 2023-01-22 09:35:54.995179: step: 336/77, loss: 0.052032772451639175 2023-01-22 09:35:56.506229: step: 340/77, loss: 0.0024174668360501528 2023-01-22 09:35:58.034598: step: 344/77, loss: 0.0021779723465442657 2023-01-22 09:35:59.571981: step: 348/77, loss: 0.13080300390720367 2023-01-22 09:36:01.056501: step: 352/77, loss: 0.0091065289452672 2023-01-22 09:36:02.530961: step: 356/77, loss: 0.030503802001476288 2023-01-22 09:36:04.017390: step: 360/77, loss: 0.00729318056255579 2023-01-22 09:36:05.516894: step: 364/77, loss: 0.05925387516617775 2023-01-22 09:36:07.032660: step: 368/77, loss: 0.015410172753036022 2023-01-22 09:36:08.565224: step: 372/77, loss: 0.013694683089852333 2023-01-22 09:36:10.052369: step: 376/77, loss: 0.09178860485553741 2023-01-22 09:36:11.432571: step: 380/77, loss: 0.022821731865406036 2023-01-22 09:36:12.971834: step: 384/77, loss: 0.10924479365348816 2023-01-22 09:36:14.482621: step: 388/77, loss: 0.020602600648999214 ================================================== Loss: 0.032 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4} Test Chinese: {'template': {'p': 0.8767123287671232, 'r': 0.48854961832061067, 'f1': 0.6274509803921569}, 'slot': {'p': 0.45454545454545453, 'r': 0.012942191544434857, 'f1': 0.025167785234899324}, 'combined': 0.01579155151993683, 'epoch': 4} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4} Test Korean: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 0.46875, 'r': 0.012942191544434857, 'f1': 0.025188916876574305}, 'combined': 0.015882666799022224, 'epoch': 4} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 4} Test Russian: {'template': {'p': 0.8767123287671232, 'r': 0.48854961832061067, 'f1': 0.6274509803921569}, 'slot': {'p': 0.45454545454545453, 'r': 0.012942191544434857, 'f1': 0.025167785234899324}, 'combined': 0.01579155151993683, 'epoch': 4} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': 
{'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 4} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 4} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 4} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 5 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:37:52.167539: step: 4/77, loss: 0.024476751685142517 2023-01-22 09:37:53.699545: step: 8/77, loss: 0.11043231189250946 2023-01-22 09:37:55.213340: step: 12/77, loss: 0.040273018181324005 2023-01-22 09:37:56.709880: step: 16/77, loss: 0.007355161011219025 2023-01-22 09:37:58.192353: step: 20/77, loss: 0.02030954509973526 2023-01-22 09:37:59.635002: step: 24/77, loss: 0.009412623941898346 2023-01-22 09:38:01.130870: step: 28/77, loss: 0.030328869819641113 2023-01-22 09:38:02.582625: step: 32/77, loss: 0.019532401114702225 2023-01-22 09:38:04.050266: step: 36/77, loss: 0.0017457769718021154 2023-01-22 09:38:05.470697: step: 40/77, loss: 0.028483323752880096 2023-01-22 09:38:06.936214: step: 44/77, loss: 0.18267452716827393 2023-01-22 09:38:08.485868: step: 48/77, loss: 0.01519215852022171 2023-01-22 09:38:09.998133: step: 
52/77, loss: 0.023144662380218506 2023-01-22 09:38:11.510273: step: 56/77, loss: 0.02538256347179413 2023-01-22 09:38:13.013008: step: 60/77, loss: 0.008262258023023605 2023-01-22 09:38:14.552869: step: 64/77, loss: 0.019567331299185753 2023-01-22 09:38:15.991053: step: 68/77, loss: 0.004387851804494858 2023-01-22 09:38:17.460406: step: 72/77, loss: 0.03609474003314972 2023-01-22 09:38:18.952640: step: 76/77, loss: 0.027372196316719055 2023-01-22 09:38:20.421826: step: 80/77, loss: 0.08391016721725464 2023-01-22 09:38:21.905361: step: 84/77, loss: 0.05477209761738777 2023-01-22 09:38:23.392817: step: 88/77, loss: 0.018585750833153725 2023-01-22 09:38:24.858682: step: 92/77, loss: 0.03730151802301407 2023-01-22 09:38:26.372738: step: 96/77, loss: 0.04512316733598709 2023-01-22 09:38:27.849968: step: 100/77, loss: 0.057060979306697845 2023-01-22 09:38:29.327867: step: 104/77, loss: 0.032657966017723083 2023-01-22 09:38:30.850536: step: 108/77, loss: 0.016146738082170486 2023-01-22 09:38:32.357769: step: 112/77, loss: 0.0069486647844314575 2023-01-22 09:38:33.843548: step: 116/77, loss: 0.1186465322971344 2023-01-22 09:38:35.302478: step: 120/77, loss: 0.017615636810660362 2023-01-22 09:38:36.778732: step: 124/77, loss: 0.009101053699851036 2023-01-22 09:38:38.257250: step: 128/77, loss: 0.002567005343735218 2023-01-22 09:38:39.745185: step: 132/77, loss: 0.027679648250341415 2023-01-22 09:38:41.205319: step: 136/77, loss: 0.016381870955228806 2023-01-22 09:38:42.670921: step: 140/77, loss: 0.10870490223169327 2023-01-22 09:38:44.183794: step: 144/77, loss: 0.02034395933151245 2023-01-22 09:38:45.704939: step: 148/77, loss: 0.02121562510728836 2023-01-22 09:38:47.214632: step: 152/77, loss: 0.01823749952018261 2023-01-22 09:38:48.778229: step: 156/77, loss: 0.0024375556968152523 2023-01-22 09:38:50.282131: step: 160/77, loss: 0.022238213568925858 2023-01-22 09:38:51.752607: step: 164/77, loss: 0.018150899559259415 2023-01-22 09:38:53.261027: step: 168/77, loss: 0.062192972749471664 2023-01-22 09:38:54.739265: step: 172/77, loss: 0.04013078287243843 2023-01-22 09:38:56.238333: step: 176/77, loss: 0.033303748816251755 2023-01-22 09:38:57.784967: step: 180/77, loss: 0.019885744899511337 2023-01-22 09:38:59.291802: step: 184/77, loss: 0.02491719275712967 2023-01-22 09:39:00.848099: step: 188/77, loss: 0.013048075139522552 2023-01-22 09:39:02.390745: step: 192/77, loss: 0.023345254361629486 2023-01-22 09:39:03.864940: step: 196/77, loss: 0.037435151636600494 2023-01-22 09:39:05.310018: step: 200/77, loss: 0.018141061067581177 2023-01-22 09:39:06.761252: step: 204/77, loss: 0.03752421215176582 2023-01-22 09:39:08.257178: step: 208/77, loss: 0.01851789653301239 2023-01-22 09:39:09.681116: step: 212/77, loss: 0.0015066369669511914 2023-01-22 09:39:11.136651: step: 216/77, loss: 0.050621166825294495 2023-01-22 09:39:12.689363: step: 220/77, loss: 0.08110310137271881 2023-01-22 09:39:14.173810: step: 224/77, loss: 0.04713662341237068 2023-01-22 09:39:15.698794: step: 228/77, loss: 0.017320964485406876 2023-01-22 09:39:17.101465: step: 232/77, loss: 0.025410648435354233 2023-01-22 09:39:18.653441: step: 236/77, loss: 0.011799340136349201 2023-01-22 09:39:20.139742: step: 240/77, loss: 0.10319054871797562 2023-01-22 09:39:21.643238: step: 244/77, loss: 0.01371192466467619 2023-01-22 09:39:23.063764: step: 248/77, loss: 0.011762048117816448 2023-01-22 09:39:24.549827: step: 252/77, loss: 0.07716451585292816 2023-01-22 09:39:26.057806: step: 256/77, loss: 0.006613610312342644 2023-01-22 09:39:27.517482: 
step: 260/77, loss: 0.04181970655918121 2023-01-22 09:39:29.017676: step: 264/77, loss: 0.00952971912920475 2023-01-22 09:39:30.544113: step: 268/77, loss: 0.007909356616437435 2023-01-22 09:39:31.964137: step: 272/77, loss: 0.025044672191143036 2023-01-22 09:39:33.407691: step: 276/77, loss: 0.003298920812085271 2023-01-22 09:39:34.906222: step: 280/77, loss: 0.07326558977365494 2023-01-22 09:39:36.367539: step: 284/77, loss: 0.07228539884090424 2023-01-22 09:39:37.789355: step: 288/77, loss: 0.018365979194641113 2023-01-22 09:39:39.239661: step: 292/77, loss: 0.0013023103820160031 2023-01-22 09:39:40.743076: step: 296/77, loss: 0.01624615490436554 2023-01-22 09:39:42.237625: step: 300/77, loss: 0.03355495631694794 2023-01-22 09:39:43.689084: step: 304/77, loss: 0.0013003923231735826 2023-01-22 09:39:45.181754: step: 308/77, loss: 0.007164771668612957 2023-01-22 09:39:46.632947: step: 312/77, loss: 0.05500699207186699 2023-01-22 09:39:48.103240: step: 316/77, loss: 0.019462941214442253 2023-01-22 09:39:49.544796: step: 320/77, loss: 0.0029178541153669357 2023-01-22 09:39:50.986570: step: 324/77, loss: 0.06747382134199142 2023-01-22 09:39:52.439946: step: 328/77, loss: 0.013694136403501034 2023-01-22 09:39:53.929883: step: 332/77, loss: 0.028184257447719574 2023-01-22 09:39:55.346332: step: 336/77, loss: 0.1709933876991272 2023-01-22 09:39:56.778320: step: 340/77, loss: 0.013454165309667587 2023-01-22 09:39:58.253284: step: 344/77, loss: 0.019300248473882675 2023-01-22 09:39:59.819349: step: 348/77, loss: 0.007079022936522961 2023-01-22 09:40:01.313232: step: 352/77, loss: 0.06331252306699753 2023-01-22 09:40:02.820994: step: 356/77, loss: 0.018890781328082085 2023-01-22 09:40:04.259961: step: 360/77, loss: 0.07067049294710159 2023-01-22 09:40:05.792672: step: 364/77, loss: 0.004420126788318157 2023-01-22 09:40:07.322753: step: 368/77, loss: 0.02071470208466053 2023-01-22 09:40:08.790742: step: 372/77, loss: 0.03721703961491585 2023-01-22 09:40:10.311871: step: 376/77, loss: 0.0021472936496138573 2023-01-22 09:40:11.822050: step: 380/77, loss: 0.0019225344294682145 2023-01-22 09:40:13.297387: step: 384/77, loss: 0.0833422914147377 2023-01-22 09:40:14.759495: step: 388/77, loss: 0.019889041781425476 ================================================== Loss: 0.033 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Chinese: {'template': {'p': 0.9027777777777778, 'r': 0.4961832061068702, 'f1': 0.6403940886699507}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.018189974114267607, 'epoch': 5} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Korean: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.018103462620010315, 'epoch': 5} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 5} Test Russian: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 
0.42105263157894735, 'r': 0.013805004314063849, 'f1': 0.026733500417710946}, 'combined': 0.01685659139638917, 'epoch': 5} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 5} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 5} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 5} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 6 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:41:52.763290: step: 4/77, loss: 0.03039437159895897 2023-01-22 09:41:54.267337: step: 8/77, loss: 0.0026546369772404432 2023-01-22 09:41:55.765624: step: 12/77, loss: 0.028612440451979637 2023-01-22 09:41:57.223525: step: 16/77, loss: 0.030664879828691483 2023-01-22 09:41:58.668361: step: 20/77, loss: 0.0023875534534454346 2023-01-22 09:42:00.156602: step: 24/77, loss: 0.0006607776740565896 2023-01-22 09:42:01.588952: step: 28/77, loss: 0.007618624716997147 2023-01-22 09:42:03.082524: step: 32/77, loss: 0.028325766324996948 2023-01-22 09:42:04.561269: step: 36/77, loss: 0.012421460822224617 2023-01-22 09:42:05.991661: 
step: 40/77, loss: 0.016435008496046066 2023-01-22 09:42:07.437680: step: 44/77, loss: 0.0659736841917038 2023-01-22 09:42:08.956996: step: 48/77, loss: 0.010327223688364029 2023-01-22 09:42:10.457910: step: 52/77, loss: 0.0065464754588902 2023-01-22 09:42:12.001236: step: 56/77, loss: 0.057871297001838684 2023-01-22 09:42:13.403308: step: 60/77, loss: 0.002542986534535885 2023-01-22 09:42:14.953046: step: 64/77, loss: 0.006201583426445723 2023-01-22 09:42:16.465404: step: 68/77, loss: 0.01481205690652132 2023-01-22 09:42:17.920655: step: 72/77, loss: 0.0031775743700563908 2023-01-22 09:42:19.394363: step: 76/77, loss: 0.026160014793276787 2023-01-22 09:42:20.810084: step: 80/77, loss: 0.012620776891708374 2023-01-22 09:42:22.313627: step: 84/77, loss: 0.0677712932229042 2023-01-22 09:42:23.774717: step: 88/77, loss: 0.0394502654671669 2023-01-22 09:42:25.253539: step: 92/77, loss: 0.07249009609222412 2023-01-22 09:42:26.708699: step: 96/77, loss: 0.009052613750100136 2023-01-22 09:42:28.229439: step: 100/77, loss: 0.08241299539804459 2023-01-22 09:42:29.736586: step: 104/77, loss: 0.02526727318763733 2023-01-22 09:42:31.219135: step: 108/77, loss: 0.023446206003427505 2023-01-22 09:42:32.744643: step: 112/77, loss: 0.003657972440123558 2023-01-22 09:42:34.256716: step: 116/77, loss: 0.04917879030108452 2023-01-22 09:42:35.739026: step: 120/77, loss: 0.029167205095291138 2023-01-22 09:42:37.195087: step: 124/77, loss: 0.00030399844399653375 2023-01-22 09:42:38.682291: step: 128/77, loss: 0.02055046521127224 2023-01-22 09:42:40.170703: step: 132/77, loss: 0.03628047555685043 2023-01-22 09:42:41.648732: step: 136/77, loss: 0.000510257261339575 2023-01-22 09:42:43.197198: step: 140/77, loss: 1.6263384168269113e-05 2023-01-22 09:42:44.649178: step: 144/77, loss: 0.04925939813256264 2023-01-22 09:42:46.201747: step: 148/77, loss: 0.01939420960843563 2023-01-22 09:42:47.699406: step: 152/77, loss: 0.023076839745044708 2023-01-22 09:42:49.137325: step: 156/77, loss: 0.019336925819516182 2023-01-22 09:42:50.695589: step: 160/77, loss: 0.052915360778570175 2023-01-22 09:42:52.217082: step: 164/77, loss: 0.010208374820649624 2023-01-22 09:42:53.676094: step: 168/77, loss: 0.010052263736724854 2023-01-22 09:42:55.215614: step: 172/77, loss: 0.018969284370541573 2023-01-22 09:42:56.697677: step: 176/77, loss: 0.011579119600355625 2023-01-22 09:42:58.138638: step: 180/77, loss: 0.0002825237170327455 2023-01-22 09:42:59.684698: step: 184/77, loss: 0.0627252608537674 2023-01-22 09:43:01.180844: step: 188/77, loss: 0.005596342496573925 2023-01-22 09:43:02.655248: step: 192/77, loss: 0.07881352305412292 2023-01-22 09:43:04.109537: step: 196/77, loss: 0.10441139340400696 2023-01-22 09:43:05.551273: step: 200/77, loss: 0.06772737950086594 2023-01-22 09:43:07.121574: step: 204/77, loss: 0.018566543236374855 2023-01-22 09:43:08.620538: step: 208/77, loss: 0.08514667302370071 2023-01-22 09:43:10.150304: step: 212/77, loss: 0.003423793241381645 2023-01-22 09:43:11.576758: step: 216/77, loss: 0.017923181876540184 2023-01-22 09:43:13.081896: step: 220/77, loss: 0.05222643166780472 2023-01-22 09:43:14.510055: step: 224/77, loss: 0.008815684355795383 2023-01-22 09:43:16.013648: step: 228/77, loss: 0.0019420962780714035 2023-01-22 09:43:17.471662: step: 232/77, loss: 0.11282327771186829 2023-01-22 09:43:18.979719: step: 236/77, loss: 0.01948855072259903 2023-01-22 09:43:20.461605: step: 240/77, loss: 0.06035421043634415 2023-01-22 09:43:21.910695: step: 244/77, loss: 0.09459991753101349 2023-01-22 09:43:23.353618: 
step: 248/77, loss: 0.006792946252971888 2023-01-22 09:43:24.786826: step: 252/77, loss: 0.06218063831329346 2023-01-22 09:43:26.222837: step: 256/77, loss: 0.023452896624803543 2023-01-22 09:43:27.738807: step: 260/77, loss: 0.0018449525814503431 2023-01-22 09:43:29.131649: step: 264/77, loss: 0.05207361653447151 2023-01-22 09:43:30.610163: step: 268/77, loss: 0.00742388516664505 2023-01-22 09:43:32.069762: step: 272/77, loss: 0.00440608337521553 2023-01-22 09:43:33.522951: step: 276/77, loss: 0.013198236003518105 2023-01-22 09:43:34.956297: step: 280/77, loss: 0.011632833629846573 2023-01-22 09:43:36.506066: step: 284/77, loss: 0.006646803580224514 2023-01-22 09:43:37.964241: step: 288/77, loss: 0.010615387000143528 2023-01-22 09:43:39.501906: step: 292/77, loss: 0.012356102466583252 2023-01-22 09:43:41.011010: step: 296/77, loss: 0.0011919524986296892 2023-01-22 09:43:42.485725: step: 300/77, loss: 0.04278775304555893 2023-01-22 09:43:43.961113: step: 304/77, loss: 0.05970532447099686 2023-01-22 09:43:45.418934: step: 308/77, loss: 0.03060147911310196 2023-01-22 09:43:46.929888: step: 312/77, loss: 0.01697326824069023 2023-01-22 09:43:48.368199: step: 316/77, loss: 0.040285106748342514 2023-01-22 09:43:49.824165: step: 320/77, loss: 0.020163346081972122 2023-01-22 09:43:51.340802: step: 324/77, loss: 0.007350889965891838 2023-01-22 09:43:52.777368: step: 328/77, loss: 0.01338155660778284 2023-01-22 09:43:54.267877: step: 332/77, loss: 0.022605765610933304 2023-01-22 09:43:55.797414: step: 336/77, loss: 0.03450581803917885 2023-01-22 09:43:57.268696: step: 340/77, loss: 0.03401753678917885 2023-01-22 09:43:58.734102: step: 344/77, loss: 0.011614415794610977 2023-01-22 09:44:00.282116: step: 348/77, loss: 0.04379738122224808 2023-01-22 09:44:01.735091: step: 352/77, loss: 0.16224682331085205 2023-01-22 09:44:03.209994: step: 356/77, loss: 0.00017778918845579028 2023-01-22 09:44:04.678799: step: 360/77, loss: 0.002048301976174116 2023-01-22 09:44:06.093122: step: 364/77, loss: 0.02142212726175785 2023-01-22 09:44:07.580841: step: 368/77, loss: 0.007760262116789818 2023-01-22 09:44:09.126775: step: 372/77, loss: 0.017139358446002007 2023-01-22 09:44:10.589903: step: 376/77, loss: 0.007925149984657764 2023-01-22 09:44:12.054681: step: 380/77, loss: 0.028815656900405884 2023-01-22 09:44:13.481260: step: 384/77, loss: 0.0073576332069933414 2023-01-22 09:44:14.910297: step: 388/77, loss: 0.0014327471144497395 ================================================== Loss: 0.028 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Chinese: {'template': {'p': 0.875, 'r': 0.48091603053435117, 'f1': 0.6206896551724138}, 'slot': {'p': 0.3695652173913043, 'r': 0.014667817083692839, 'f1': 0.028215767634854772}, 'combined': 0.017513235083702963, 'epoch': 6} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Korean: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.391304347826087, 'r': 0.015530629853321829, 'f1': 0.029875518672199168}, 'combined': 0.018452526238711256, 'epoch': 6} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 
0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 6} Test Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.3695652173913043, 'r': 0.014667817083692839, 'f1': 0.028215767634854772}, 'combined': 0.017427385892116187, 'epoch': 6} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 6} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 6} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 6} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 7 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:45:52.200879: step: 4/77, loss: 0.012997529469430447 2023-01-22 09:45:53.673456: step: 8/77, loss: 0.020570045337080956 2023-01-22 09:45:55.132764: step: 12/77, loss: 0.04421515017747879 2023-01-22 09:45:56.534802: step: 16/77, loss: 0.032468900084495544 2023-01-22 09:45:58.054092: step: 20/77, loss: 0.03809467703104019 2023-01-22 09:45:59.587210: step: 24/77, loss: 0.011496270075440407 2023-01-22 09:46:01.055831: step: 28/77, loss: 
0.019316416233778 2023-01-22 09:46:02.542186: step: 32/77, loss: 0.005474533885717392 2023-01-22 09:46:04.046817: step: 36/77, loss: 0.005868466105312109 2023-01-22 09:46:05.483386: step: 40/77, loss: 0.013153919950127602 2023-01-22 09:46:06.981316: step: 44/77, loss: 0.0031642599496990442 2023-01-22 09:46:08.408699: step: 48/77, loss: 0.005004220642149448 2023-01-22 09:46:09.846096: step: 52/77, loss: 0.01386738196015358 2023-01-22 09:46:11.295552: step: 56/77, loss: 0.013531411997973919 2023-01-22 09:46:12.881680: step: 60/77, loss: 0.01102381944656372 2023-01-22 09:46:14.349200: step: 64/77, loss: 0.012573636136949062 2023-01-22 09:46:15.808974: step: 68/77, loss: 0.0019415427232161164 2023-01-22 09:46:17.273744: step: 72/77, loss: 0.12153831124305725 2023-01-22 09:46:18.756851: step: 76/77, loss: 0.01277611218392849 2023-01-22 09:46:20.233346: step: 80/77, loss: 0.02256181091070175 2023-01-22 09:46:21.732317: step: 84/77, loss: 0.0024817378725856543 2023-01-22 09:46:23.211984: step: 88/77, loss: 0.012858950532972813 2023-01-22 09:46:24.738546: step: 92/77, loss: 6.024163303663954e-05 2023-01-22 09:46:26.223733: step: 96/77, loss: 0.012603029608726501 2023-01-22 09:46:27.698466: step: 100/77, loss: 0.00898762233555317 2023-01-22 09:46:29.209340: step: 104/77, loss: 0.01040840707719326 2023-01-22 09:46:30.631623: step: 108/77, loss: 0.009314004331827164 2023-01-22 09:46:32.206720: step: 112/77, loss: 0.002158994786441326 2023-01-22 09:46:33.676430: step: 116/77, loss: 0.001862498465925455 2023-01-22 09:46:35.087096: step: 120/77, loss: 0.004027359187602997 2023-01-22 09:46:36.616310: step: 124/77, loss: 0.0019363107858225703 2023-01-22 09:46:38.054245: step: 128/77, loss: 0.0674593448638916 2023-01-22 09:46:39.473290: step: 132/77, loss: 0.07844030112028122 2023-01-22 09:46:40.981136: step: 136/77, loss: 0.0004071201547048986 2023-01-22 09:46:42.508753: step: 140/77, loss: 0.040196869522333145 2023-01-22 09:46:43.950466: step: 144/77, loss: 0.02462776005268097 2023-01-22 09:46:45.404263: step: 148/77, loss: 0.018356427550315857 2023-01-22 09:46:46.854585: step: 152/77, loss: 0.02162858285009861 2023-01-22 09:46:48.258665: step: 156/77, loss: 0.009433625265955925 2023-01-22 09:46:49.774119: step: 160/77, loss: 0.009734027087688446 2023-01-22 09:46:51.290489: step: 164/77, loss: 0.0023181047290563583 2023-01-22 09:46:52.811548: step: 168/77, loss: 0.017527658492326736 2023-01-22 09:46:54.305023: step: 172/77, loss: 0.00703967921435833 2023-01-22 09:46:55.764501: step: 176/77, loss: 0.008034227415919304 2023-01-22 09:46:57.259859: step: 180/77, loss: 0.06797732412815094 2023-01-22 09:46:58.720809: step: 184/77, loss: 0.028809472918510437 2023-01-22 09:47:00.215185: step: 188/77, loss: 0.01280111912637949 2023-01-22 09:47:01.719529: step: 192/77, loss: 0.006291474215686321 2023-01-22 09:47:03.213739: step: 196/77, loss: 0.019357847049832344 2023-01-22 09:47:04.668193: step: 200/77, loss: 0.01437158789485693 2023-01-22 09:47:06.114775: step: 204/77, loss: 0.004847629461437464 2023-01-22 09:47:07.584621: step: 208/77, loss: 0.04459906369447708 2023-01-22 09:47:09.051986: step: 212/77, loss: 0.0019956021569669247 2023-01-22 09:47:10.527985: step: 216/77, loss: 0.008743159472942352 2023-01-22 09:47:11.975111: step: 220/77, loss: 0.06492185592651367 2023-01-22 09:47:13.417550: step: 224/77, loss: 0.04172681272029877 2023-01-22 09:47:14.943385: step: 228/77, loss: 0.03955743834376335 2023-01-22 09:47:16.346821: step: 232/77, loss: 0.0006398952100425959 2023-01-22 09:47:17.862289: step: 236/77, 
loss: 0.009599082171916962 2023-01-22 09:47:19.408364: step: 240/77, loss: 0.01301715150475502 2023-01-22 09:47:20.910966: step: 244/77, loss: 0.00035458861384540796 2023-01-22 09:47:22.384345: step: 248/77, loss: 0.007781412452459335 2023-01-22 09:47:23.906072: step: 252/77, loss: 0.02035202831029892 2023-01-22 09:47:25.414224: step: 256/77, loss: 0.012641222216188908 2023-01-22 09:47:26.884775: step: 260/77, loss: 0.05043390765786171 2023-01-22 09:47:28.321893: step: 264/77, loss: 0.011204622685909271 2023-01-22 09:47:29.830752: step: 268/77, loss: 0.004016530700027943 2023-01-22 09:47:31.250459: step: 272/77, loss: 0.002317877020686865 2023-01-22 09:47:32.847425: step: 276/77, loss: 0.0009432684746570885 2023-01-22 09:47:34.345160: step: 280/77, loss: 0.0006387169123627245 2023-01-22 09:47:35.880637: step: 284/77, loss: 0.0016859809402376413 2023-01-22 09:47:37.400121: step: 288/77, loss: 0.03399414196610451 2023-01-22 09:47:38.928453: step: 292/77, loss: 0.015151074156165123 2023-01-22 09:47:40.436670: step: 296/77, loss: 0.02047363668680191 2023-01-22 09:47:41.869593: step: 300/77, loss: 0.038063161075115204 2023-01-22 09:47:43.289237: step: 304/77, loss: 0.00330452062189579 2023-01-22 09:47:44.719067: step: 308/77, loss: 0.011480795219540596 2023-01-22 09:47:46.167739: step: 312/77, loss: 0.019310910254716873 2023-01-22 09:47:47.610195: step: 316/77, loss: 0.015391256660223007 2023-01-22 09:47:49.150950: step: 320/77, loss: 0.010779057629406452 2023-01-22 09:47:50.586629: step: 324/77, loss: 0.007522165309637785 2023-01-22 09:47:52.081875: step: 328/77, loss: 0.02136208489537239 2023-01-22 09:47:53.560280: step: 332/77, loss: 0.01059940829873085 2023-01-22 09:47:55.038031: step: 336/77, loss: 0.00015911197988316417 2023-01-22 09:47:56.550828: step: 340/77, loss: 0.050910405814647675 2023-01-22 09:47:57.972476: step: 344/77, loss: 0.00024434231454506516 2023-01-22 09:47:59.436262: step: 348/77, loss: 0.010583743453025818 2023-01-22 09:48:00.938190: step: 352/77, loss: 0.025950804352760315 2023-01-22 09:48:02.370318: step: 356/77, loss: 0.024648727849125862 2023-01-22 09:48:03.827629: step: 360/77, loss: 0.010220241732895374 2023-01-22 09:48:05.320418: step: 364/77, loss: 0.042353857308626175 2023-01-22 09:48:06.750440: step: 368/77, loss: 0.01584051176905632 2023-01-22 09:48:08.235482: step: 372/77, loss: 0.02006104774773121 2023-01-22 09:48:09.778195: step: 376/77, loss: 0.020328937098383904 2023-01-22 09:48:11.275467: step: 380/77, loss: 0.020396513864398003 2023-01-22 09:48:12.767660: step: 384/77, loss: 0.03346937522292137 2023-01-22 09:48:14.300505: step: 388/77, loss: 0.061174191534519196 ================================================== Loss: 0.019 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 7} Test Chinese: {'template': {'p': 0.8955223880597015, 'r': 0.4580152671755725, 'f1': 0.6060606060606061}, 'slot': {'p': 0.45, 'r': 0.015530629853321829, 'f1': 0.030025020850708923}, 'combined': 0.018196982333762983, 'epoch': 7} Dev Korean: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.035916824196597356, 'f1': 0.0670194003527337}, 'combined': 0.04756215508903682, 'epoch': 7} Test Korean: {'template': {'p': 0.8939393939393939, 'r': 0.45038167938931295, 'f1': 0.598984771573604}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 
'combined': 0.017027995178513826, 'epoch': 7} Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 7} Test Russian: {'template': {'p': 0.9076923076923077, 'r': 0.45038167938931295, 'f1': 0.6020408163265306}, 'slot': {'p': 0.4864864864864865, 'r': 0.015530629853321829, 'f1': 0.030100334448160532}, 'combined': 0.018121629922872157, 'epoch': 7} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 7} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 7} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 7} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 8 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:49:51.960733: step: 4/77, loss: 0.0067916191183030605 2023-01-22 09:49:53.405304: step: 8/77, loss: 0.013048367574810982 2023-01-22 09:49:54.886275: step: 12/77, loss: 0.03453684225678444 2023-01-22 09:49:56.337849: step: 16/77, loss: 0.00477053876966238 2023-01-22 
09:49:57.812172: step: 20/77, loss: 0.022574584931135178 2023-01-22 09:49:59.304647: step: 24/77, loss: 0.03624827042222023 2023-01-22 09:50:00.769432: step: 28/77, loss: 0.013998491689562798 2023-01-22 09:50:02.322729: step: 32/77, loss: 0.007175574544817209 2023-01-22 09:50:03.844935: step: 36/77, loss: 0.029678918421268463 2023-01-22 09:50:05.324979: step: 40/77, loss: 0.026228811591863632 2023-01-22 09:50:06.748117: step: 44/77, loss: 0.0012878580018877983 2023-01-22 09:50:08.236983: step: 48/77, loss: 0.010943735018372536 2023-01-22 09:50:09.733571: step: 52/77, loss: 0.0012138265883550048 2023-01-22 09:50:11.197671: step: 56/77, loss: 0.017435984686017036 2023-01-22 09:50:12.699860: step: 60/77, loss: 0.00820908322930336 2023-01-22 09:50:14.210046: step: 64/77, loss: 0.21635589003562927 2023-01-22 09:50:15.768997: step: 68/77, loss: 0.09472758322954178 2023-01-22 09:50:17.214973: step: 72/77, loss: 0.030867278575897217 2023-01-22 09:50:18.708833: step: 76/77, loss: 0.008024647831916809 2023-01-22 09:50:20.182964: step: 80/77, loss: 0.0015849823830649257 2023-01-22 09:50:21.664263: step: 84/77, loss: 0.019332479685544968 2023-01-22 09:50:23.141368: step: 88/77, loss: 0.003665961092337966 2023-01-22 09:50:24.574408: step: 92/77, loss: 0.04837939888238907 2023-01-22 09:50:26.059983: step: 96/77, loss: 0.007896811701357365 2023-01-22 09:50:27.587954: step: 100/77, loss: 0.017443090677261353 2023-01-22 09:50:29.216757: step: 104/77, loss: 0.04705725982785225 2023-01-22 09:50:30.628773: step: 108/77, loss: 0.0034224099945276976 2023-01-22 09:50:32.080309: step: 112/77, loss: 0.0312179122120142 2023-01-22 09:50:33.603576: step: 116/77, loss: 0.029161768034100533 2023-01-22 09:50:35.086350: step: 120/77, loss: 0.005436992272734642 2023-01-22 09:50:36.562274: step: 124/77, loss: 0.008618153631687164 2023-01-22 09:50:38.027396: step: 128/77, loss: 0.0025659631937742233 2023-01-22 09:50:39.509682: step: 132/77, loss: 0.004739833064377308 2023-01-22 09:50:40.977449: step: 136/77, loss: 0.12532849609851837 2023-01-22 09:50:42.429753: step: 140/77, loss: 0.012651849538087845 2023-01-22 09:50:43.846756: step: 144/77, loss: 0.0029987136367708445 2023-01-22 09:50:45.351487: step: 148/77, loss: 0.0017156063113361597 2023-01-22 09:50:46.838590: step: 152/77, loss: 0.011197465471923351 2023-01-22 09:50:48.269256: step: 156/77, loss: 0.05191802978515625 2023-01-22 09:50:49.798160: step: 160/77, loss: 0.01135370321571827 2023-01-22 09:50:51.295069: step: 164/77, loss: 0.03425750881433487 2023-01-22 09:50:52.770447: step: 168/77, loss: 0.009411108680069447 2023-01-22 09:50:54.208750: step: 172/77, loss: 0.010142726823687553 2023-01-22 09:50:55.618565: step: 176/77, loss: 0.0025873896665871143 2023-01-22 09:50:57.163285: step: 180/77, loss: 0.017334995791316032 2023-01-22 09:50:58.660736: step: 184/77, loss: 0.004583724774420261 2023-01-22 09:51:00.131824: step: 188/77, loss: 0.006915468256920576 2023-01-22 09:51:01.503854: step: 192/77, loss: 0.013365627266466618 2023-01-22 09:51:02.945146: step: 196/77, loss: 0.0026318528689444065 2023-01-22 09:51:04.392257: step: 200/77, loss: 0.013050251640379429 2023-01-22 09:51:05.892533: step: 204/77, loss: 0.01102149672806263 2023-01-22 09:51:07.405317: step: 208/77, loss: 0.005182032473385334 2023-01-22 09:51:08.946882: step: 212/77, loss: 0.032337501645088196 2023-01-22 09:51:10.444511: step: 216/77, loss: 0.025232627987861633 2023-01-22 09:51:11.973863: step: 220/77, loss: 0.026668911799788475 2023-01-22 09:51:13.428469: step: 224/77, loss: 0.0707787573337555 
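
Each entry in the surrounding run has the shape "timestamp: step: K/77, loss: V", and K advances by 4, matching --accumulate_step 4 from the command line: one loss is printed per accumulation window of four micro-batches of --batch_size 10 examples, i.e. an effective batch of 40. (The printed /77 denominator evidently tracks a different quantity than the running counter, which reaches 388 each epoch.) Below is a minimal sketch of the update schedule this implies, including the two learning rates from the flags (2e-5 for the XLM-R encoder, 2e-4 for the rest of the model). This is an assumption about train.py's structure, not its actual code; `model(batch)` returning a scalar loss is likewise assumed.

```python
# Hypothetical reconstruction of the update loop implied by the log;
# the real train.py may differ.
import torch

def build_optimizer(model, xlmr_lr=2e-5, head_lr=2e-4):
    # Two parameter groups: the pretrained XLM-R encoder gets the small
    # --xlmr_learning_rate, everything else gets --learning_rate.
    xlmr = [p for n, p in model.named_parameters() if n.startswith("xlmr.")]
    head = [p for n, p in model.named_parameters() if not n.startswith("xlmr.")]
    return torch.optim.AdamW([{"params": xlmr, "lr": xlmr_lr},
                              {"params": head, "lr": head_lr}])

def train_epoch(model, loader, optimizer, accumulate_step=4):
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(batch)                  # 10 examples per micro-batch
        (loss / accumulate_step).backward()  # scale so gradients average
        if i % accumulate_step == 0:
            optimizer.step()                 # one update per 4 micro-batches
            optimizer.zero_grad()
```

Dividing each micro-batch loss by accumulate_step keeps the accumulated gradient on the scale of a single averaged batch of 40 examples, which is the usual reason for this pattern.
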
2023-01-22 09:51:14.920268: step: 228/77, loss: 0.018818672746419907 2023-01-22 09:51:16.324398: step: 232/77, loss: 0.0017937154043465853 2023-01-22 09:51:17.816610: step: 236/77, loss: 0.0008982113795354962 2023-01-22 09:51:19.371800: step: 240/77, loss: 0.08587608486413956 2023-01-22 09:51:20.826555: step: 244/77, loss: 0.021022209897637367 2023-01-22 09:51:22.255904: step: 248/77, loss: 0.05474882945418358 2023-01-22 09:51:23.786696: step: 252/77, loss: 0.004854440223425627 2023-01-22 09:51:25.203631: step: 256/77, loss: 0.013054034672677517 2023-01-22 09:51:26.698021: step: 260/77, loss: 0.022247303277254105 2023-01-22 09:51:28.190891: step: 264/77, loss: 0.010746228508651257 2023-01-22 09:51:29.582031: step: 268/77, loss: 0.06552846729755402 2023-01-22 09:51:31.092205: step: 272/77, loss: 0.005899173207581043 2023-01-22 09:51:32.609725: step: 276/77, loss: 0.006159973330795765 2023-01-22 09:51:34.112797: step: 280/77, loss: 0.12346875667572021 2023-01-22 09:51:35.520368: step: 284/77, loss: 0.011204426176846027 2023-01-22 09:51:36.934682: step: 288/77, loss: 0.018768085166811943 2023-01-22 09:51:38.324323: step: 292/77, loss: 0.008486710488796234 2023-01-22 09:51:39.777787: step: 296/77, loss: 0.036354124546051025 2023-01-22 09:51:41.262850: step: 300/77, loss: 0.0038927635177969933 2023-01-22 09:51:42.664310: step: 304/77, loss: 0.002151659457013011 2023-01-22 09:51:44.158145: step: 308/77, loss: 0.0006010127253830433 2023-01-22 09:51:45.687082: step: 312/77, loss: 0.038890399038791656 2023-01-22 09:51:47.090838: step: 316/77, loss: 0.015262553468346596 2023-01-22 09:51:48.565050: step: 320/77, loss: 0.004743649158626795 2023-01-22 09:51:50.055542: step: 324/77, loss: 0.05304583162069321 2023-01-22 09:51:51.521395: step: 328/77, loss: 0.021255943924188614 2023-01-22 09:51:52.946215: step: 332/77, loss: 0.004364157095551491 2023-01-22 09:51:54.399282: step: 336/77, loss: 0.010830073617398739 2023-01-22 09:51:55.874909: step: 340/77, loss: 0.013078930787742138 2023-01-22 09:51:57.416519: step: 344/77, loss: 0.10247281193733215 2023-01-22 09:51:58.843909: step: 348/77, loss: 0.04736286401748657 2023-01-22 09:52:00.252258: step: 352/77, loss: 0.033875543624162674 2023-01-22 09:52:01.731845: step: 356/77, loss: 0.0046717822551727295 2023-01-22 09:52:03.210232: step: 360/77, loss: 0.011814633384346962 2023-01-22 09:52:04.600323: step: 364/77, loss: 0.0285545215010643 2023-01-22 09:52:05.956716: step: 368/77, loss: 0.015754615887999535 2023-01-22 09:52:07.415678: step: 372/77, loss: 0.05176156014204025 2023-01-22 09:52:08.955564: step: 376/77, loss: 0.01782143861055374 2023-01-22 09:52:10.451119: step: 380/77, loss: 0.011716475710272789 2023-01-22 09:52:11.968754: step: 384/77, loss: 0.0015753823099657893 2023-01-22 09:52:13.452421: step: 388/77, loss: 0.04279303923249245 ================================================== Loss: 0.024 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Chinese: {'template': {'p': 0.8805970149253731, 'r': 0.45038167938931295, 'f1': 0.5959595959595959}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.015972116474629035, 'epoch': 8} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 
'epoch': 8} Test Korean: {'template': {'p': 0.8939393939393939, 'r': 0.45038167938931295, 'f1': 0.598984771573604}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016053193208002785, 'epoch': 8} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 8} Test Russian: {'template': {'p': 0.8823529411764706, 'r': 0.4580152671755725, 'f1': 0.6030150753768845}, 'slot': {'p': 0.43243243243243246, 'r': 0.013805004314063849, 'f1': 0.026755852842809368}, 'combined': 0.01613418261877952, 'epoch': 8} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 8} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 8} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 9 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:53:50.718140: step: 4/77, loss: 0.00931814406067133 2023-01-22 09:53:52.195921: step: 8/77, loss: 
0.016727566719055176 2023-01-22 09:53:53.633580: step: 12/77, loss: 0.02211870439350605 2023-01-22 09:53:55.064600: step: 16/77, loss: 0.011608324013650417 2023-01-22 09:53:56.565966: step: 20/77, loss: 0.010061085224151611 2023-01-22 09:53:58.048904: step: 24/77, loss: 0.010383963584899902 2023-01-22 09:53:59.497414: step: 28/77, loss: 0.03394642099738121 2023-01-22 09:54:00.986490: step: 32/77, loss: 0.0005709524848498404 2023-01-22 09:54:02.521165: step: 36/77, loss: 0.010349091142416 2023-01-22 09:54:03.992882: step: 40/77, loss: 0.002377279568463564 2023-01-22 09:54:05.499991: step: 44/77, loss: 0.009859619662165642 2023-01-22 09:54:07.027526: step: 48/77, loss: 0.010785480029881 2023-01-22 09:54:08.537814: step: 52/77, loss: 0.021909940987825394 2023-01-22 09:54:10.017469: step: 56/77, loss: 0.0010997147765010595 2023-01-22 09:54:11.517140: step: 60/77, loss: 0.005814129486680031 2023-01-22 09:54:13.066943: step: 64/77, loss: 0.0011625312035903335 2023-01-22 09:54:14.522434: step: 68/77, loss: 0.02374696359038353 2023-01-22 09:54:16.045012: step: 72/77, loss: 0.010308654978871346 2023-01-22 09:54:17.539047: step: 76/77, loss: 0.0102377999573946 2023-01-22 09:54:19.046797: step: 80/77, loss: 0.012174428440630436 2023-01-22 09:54:20.579062: step: 84/77, loss: 0.00158389238640666 2023-01-22 09:54:22.057681: step: 88/77, loss: 0.0022693369537591934 2023-01-22 09:54:23.534122: step: 92/77, loss: 0.0003267589781899005 2023-01-22 09:54:24.960117: step: 96/77, loss: 0.03900589048862457 2023-01-22 09:54:26.445860: step: 100/77, loss: 0.01670156605541706 2023-01-22 09:54:27.903905: step: 104/77, loss: 0.001030667801387608 2023-01-22 09:54:29.341706: step: 108/77, loss: 0.00393590796738863 2023-01-22 09:54:30.806573: step: 112/77, loss: 0.04139762371778488 2023-01-22 09:54:32.271962: step: 116/77, loss: 0.0035628098994493484 2023-01-22 09:54:33.732152: step: 120/77, loss: 0.027284199371933937 2023-01-22 09:54:35.185154: step: 124/77, loss: 0.010536571964621544 2023-01-22 09:54:36.632383: step: 128/77, loss: 0.012623676098883152 2023-01-22 09:54:38.102457: step: 132/77, loss: 0.13224123418331146 2023-01-22 09:54:39.543293: step: 136/77, loss: 0.11464852094650269 2023-01-22 09:54:40.951740: step: 140/77, loss: 0.0009070251835510135 2023-01-22 09:54:42.450349: step: 144/77, loss: 0.034760989248752594 2023-01-22 09:54:43.933849: step: 148/77, loss: 0.025749383494257927 2023-01-22 09:54:45.461486: step: 152/77, loss: 0.026630792766809464 2023-01-22 09:54:46.853340: step: 156/77, loss: 0.00897565670311451 2023-01-22 09:54:48.313944: step: 160/77, loss: 0.029134489595890045 2023-01-22 09:54:49.784873: step: 164/77, loss: 0.0009648214327171445 2023-01-22 09:54:51.292238: step: 168/77, loss: 0.0089799165725708 2023-01-22 09:54:52.827704: step: 172/77, loss: 0.06454553455114365 2023-01-22 09:54:54.252409: step: 176/77, loss: 0.0012069199001416564 2023-01-22 09:54:55.741077: step: 180/77, loss: 0.05223080515861511 2023-01-22 09:54:57.194978: step: 184/77, loss: 0.015568587929010391 2023-01-22 09:54:58.654828: step: 188/77, loss: 0.011848854832351208 2023-01-22 09:55:00.115039: step: 192/77, loss: 0.013729307800531387 2023-01-22 09:55:01.578957: step: 196/77, loss: 0.012604327872395515 2023-01-22 09:55:03.045774: step: 200/77, loss: 0.032689254730939865 2023-01-22 09:55:04.488474: step: 204/77, loss: 0.009877309203147888 2023-01-22 09:55:05.961624: step: 208/77, loss: 0.008456312119960785 2023-01-22 09:55:07.407102: step: 212/77, loss: 0.009990466758608818 2023-01-22 09:55:08.890771: step: 216/77, loss: 
0.012410677969455719 2023-01-22 09:55:10.442170: step: 220/77, loss: 0.0003021705197170377 2023-01-22 09:55:11.975159: step: 224/77, loss: 0.07351236045360565 2023-01-22 09:55:13.521195: step: 228/77, loss: 0.0010626473231241107 2023-01-22 09:55:14.959593: step: 232/77, loss: 0.030719848349690437 2023-01-22 09:55:16.389959: step: 236/77, loss: 0.017361463978886604 2023-01-22 09:55:17.842234: step: 240/77, loss: 0.005429819226264954 2023-01-22 09:55:19.340533: step: 244/77, loss: 0.0007189378957264125 2023-01-22 09:55:20.821556: step: 248/77, loss: 0.02332794852554798 2023-01-22 09:55:22.191639: step: 252/77, loss: 6.61729573039338e-05 2023-01-22 09:55:23.677303: step: 256/77, loss: 0.017682623118162155 2023-01-22 09:55:25.207191: step: 260/77, loss: 0.00020931556355208158 2023-01-22 09:55:26.696245: step: 264/77, loss: 0.004470092244446278 2023-01-22 09:55:28.253240: step: 268/77, loss: 0.0027932701632380486 2023-01-22 09:55:29.735751: step: 272/77, loss: 0.00921584665775299 2023-01-22 09:55:31.182254: step: 276/77, loss: 0.0022591277956962585 2023-01-22 09:55:32.675682: step: 280/77, loss: 0.00729965977370739 2023-01-22 09:55:34.174615: step: 284/77, loss: 0.06238330900669098 2023-01-22 09:55:35.658207: step: 288/77, loss: 0.04654904827475548 2023-01-22 09:55:37.196233: step: 292/77, loss: 0.017025865614414215 2023-01-22 09:55:38.759012: step: 296/77, loss: 0.010817298665642738 2023-01-22 09:55:40.214376: step: 300/77, loss: 0.009884810075163841 2023-01-22 09:55:41.669747: step: 304/77, loss: 0.0028474931605160236 2023-01-22 09:55:43.161442: step: 308/77, loss: 0.054710544645786285 2023-01-22 09:55:44.625953: step: 312/77, loss: 0.0037204334512352943 2023-01-22 09:55:46.142732: step: 316/77, loss: 0.014074725098907948 2023-01-22 09:55:47.629417: step: 320/77, loss: 0.014103882946074009 2023-01-22 09:55:49.173900: step: 324/77, loss: 0.01592918299138546 2023-01-22 09:55:50.698053: step: 328/77, loss: 0.018581323325634003 2023-01-22 09:55:52.184437: step: 332/77, loss: 0.015413613058626652 2023-01-22 09:55:53.636265: step: 336/77, loss: 0.011257076635956764 2023-01-22 09:55:55.135176: step: 340/77, loss: 0.006358277052640915 2023-01-22 09:55:56.609555: step: 344/77, loss: 0.00376639561727643 2023-01-22 09:55:58.058986: step: 348/77, loss: 0.0034146499820053577 2023-01-22 09:55:59.591227: step: 352/77, loss: 0.0217830128967762 2023-01-22 09:56:01.080141: step: 356/77, loss: 0.008566494099795818 2023-01-22 09:56:02.543501: step: 360/77, loss: 0.024732094258069992 2023-01-22 09:56:03.998811: step: 364/77, loss: 0.020356018096208572 2023-01-22 09:56:05.461696: step: 368/77, loss: 5.2446688641794026e-05 2023-01-22 09:56:06.783469: step: 372/77, loss: 0.0001511536247562617 2023-01-22 09:56:08.248587: step: 376/77, loss: 0.01291133277118206 2023-01-22 09:56:09.770278: step: 380/77, loss: 0.006426115054637194 2023-01-22 09:56:11.284619: step: 384/77, loss: 0.15697544813156128 2023-01-22 09:56:12.801908: step: 388/77, loss: 0.08485838770866394 ================================================== Loss: 0.019 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Chinese: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5, 'r': 0.013805004314063849, 'f1': 0.026868178001679264}, 'combined': 0.016366402843662493, 'epoch': 9} Dev Korean: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Korean: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.4838709677419355, 'r': 0.012942191544434857, 'f1': 0.025210084033613443}, 'combined': 0.015533486121721413, 'epoch': 9} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 9} Test Russian: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01757507965561657, 'epoch': 9} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 9} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 9} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 9} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 10 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 
--event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 09:57:50.480286: step: 4/77, loss: 0.0037124394439160824 2023-01-22 09:57:51.893403: step: 8/77, loss: 0.018964888527989388 2023-01-22 09:57:53.389318: step: 12/77, loss: 0.010961982421576977 2023-01-22 09:57:54.783029: step: 16/77, loss: 0.00022207949950825423 2023-01-22 09:57:56.278311: step: 20/77, loss: 0.02525768056511879 2023-01-22 09:57:57.805904: step: 24/77, loss: 0.006949287373572588 2023-01-22 09:57:59.257181: step: 28/77, loss: 0.017542913556098938 2023-01-22 09:58:00.733764: step: 32/77, loss: 0.007600500714033842 2023-01-22 09:58:02.251071: step: 36/77, loss: 0.07663308084011078 2023-01-22 09:58:03.740055: step: 40/77, loss: 0.01322341151535511 2023-01-22 09:58:05.223176: step: 44/77, loss: 0.0020390732679516077 2023-01-22 09:58:06.703483: step: 48/77, loss: 0.01980689913034439 2023-01-22 09:58:08.157390: step: 52/77, loss: 0.00036896448000334203 2023-01-22 09:58:09.661692: step: 56/77, loss: 0.012697389349341393 2023-01-22 09:58:11.178892: step: 60/77, loss: 0.04382698982954025 2023-01-22 09:58:12.708092: step: 64/77, loss: 0.009921489283442497 2023-01-22 09:58:14.103880: step: 68/77, loss: 0.0007633566856384277 2023-01-22 09:58:15.536141: step: 72/77, loss: 0.004064435604959726 2023-01-22 09:58:16.970951: step: 76/77, loss: 0.02458122745156288 2023-01-22 09:58:18.471857: step: 80/77, loss: 0.023752877488732338 2023-01-22 09:58:19.907038: step: 84/77, loss: 0.0001711217628326267 2023-01-22 09:58:21.460031: step: 88/77, loss: 0.06998871266841888 2023-01-22 09:58:22.924523: step: 92/77, loss: 0.009247813373804092 2023-01-22 09:58:24.406896: step: 96/77, loss: 0.005743591580539942 2023-01-22 09:58:25.815043: step: 100/77, loss: 0.01699553057551384 2023-01-22 09:58:27.303196: step: 104/77, loss: 0.011396847665309906 2023-01-22 09:58:28.867658: step: 108/77, loss: 0.08990509808063507 2023-01-22 09:58:30.302386: step: 112/77, loss: 0.004237010609358549 2023-01-22 09:58:31.762768: step: 116/77, loss: 0.0958765372633934 2023-01-22 09:58:33.260127: step: 120/77, loss: 0.014597264118492603 2023-01-22 09:58:34.788541: step: 124/77, loss: 0.021818269044160843 2023-01-22 09:58:36.296466: step: 128/77, loss: 0.0006683270330540836 2023-01-22 09:58:37.713604: step: 132/77, loss: 0.016099683940410614 2023-01-22 09:58:39.266722: step: 136/77, loss: 0.004529166035354137 2023-01-22 09:58:40.804574: step: 140/77, loss: 0.007304504048079252 2023-01-22 09:58:42.296308: step: 144/77, loss: 0.0021621109917759895 2023-01-22 09:58:43.798220: step: 148/77, loss: 0.038148678839206696 2023-01-22 09:58:45.305131: step: 152/77, loss: 0.002339947270229459 2023-01-22 09:58:46.813001: step: 156/77, loss: 0.010671212337911129 2023-01-22 09:58:48.228238: step: 160/77, loss: 0.0012018646812066436 2023-01-22 09:58:49.725941: step: 164/77, loss: 0.0007029086700640619 2023-01-22 09:58:51.170508: step: 168/77, loss: 0.05930791795253754 2023-01-22 09:58:52.701873: step: 172/77, loss: 0.06313134729862213 2023-01-22 09:58:54.258031: step: 176/77, loss: 0.0035771233960986137 2023-01-22 09:58:55.788509: step: 180/77, loss: 0.033747103065252304 2023-01-22 09:58:57.292097: step: 184/77, loss: 0.01955796591937542 2023-01-22 09:58:58.778431: step: 188/77, loss: 0.0012469518696889281 2023-01-22 09:59:00.244097: step: 192/77, loss: 0.03592196851968765 2023-01-22 09:59:01.726251: step: 196/77, loss: 0.009159698151051998 2023-01-22 09:59:03.226543: step: 200/77, loss: 0.05379951000213623 2023-01-22 09:59:04.628567: 
step: 204/77, loss: 0.0014175847172737122 2023-01-22 09:59:06.126833: step: 208/77, loss: 0.0003584384103305638 2023-01-22 09:59:07.612814: step: 212/77, loss: 0.0024745934642851353 2023-01-22 09:59:09.082504: step: 216/77, loss: 0.003314611967653036 2023-01-22 09:59:10.666140: step: 220/77, loss: 0.028728632256388664 2023-01-22 09:59:12.070645: step: 224/77, loss: 0.00189878954552114 2023-01-22 09:59:13.549984: step: 228/77, loss: 0.001966055715456605 2023-01-22 09:59:14.974036: step: 232/77, loss: 0.01825111173093319 2023-01-22 09:59:16.480771: step: 236/77, loss: 0.09419666230678558 2023-01-22 09:59:17.874237: step: 240/77, loss: 0.019170410931110382 2023-01-22 09:59:19.348976: step: 244/77, loss: 0.020276600494980812 2023-01-22 09:59:20.754659: step: 248/77, loss: 0.0032611675560474396 2023-01-22 09:59:22.281716: step: 252/77, loss: 0.09802025556564331 2023-01-22 09:59:23.679168: step: 256/77, loss: 0.011103980243206024 2023-01-22 09:59:25.087762: step: 260/77, loss: 0.016340192407369614 2023-01-22 09:59:26.610081: step: 264/77, loss: 0.001242226455360651 2023-01-22 09:59:28.094385: step: 268/77, loss: 0.03747468814253807 2023-01-22 09:59:29.528261: step: 272/77, loss: 0.028045082464814186 2023-01-22 09:59:30.996412: step: 276/77, loss: 0.016944503411650658 2023-01-22 09:59:32.523782: step: 280/77, loss: 0.01631280407309532 2023-01-22 09:59:33.961584: step: 284/77, loss: 0.030796894803643227 2023-01-22 09:59:35.415896: step: 288/77, loss: 0.005280392710119486 2023-01-22 09:59:36.838405: step: 292/77, loss: 0.10407906025648117 2023-01-22 09:59:38.314708: step: 296/77, loss: 0.00925945583730936 2023-01-22 09:59:39.805257: step: 300/77, loss: 0.005232213530689478 2023-01-22 09:59:41.297737: step: 304/77, loss: 0.03128921985626221 2023-01-22 09:59:42.757046: step: 308/77, loss: 0.00042872564517892897 2023-01-22 09:59:44.272718: step: 312/77, loss: 0.06519351154565811 2023-01-22 09:59:45.771789: step: 316/77, loss: 0.0005023244884796441 2023-01-22 09:59:47.199387: step: 320/77, loss: 0.0007901216158643365 2023-01-22 09:59:48.730186: step: 324/77, loss: 0.038823582231998444 2023-01-22 09:59:50.155024: step: 328/77, loss: 0.03915676474571228 2023-01-22 09:59:51.612806: step: 332/77, loss: 0.004828822799026966 2023-01-22 09:59:53.061151: step: 336/77, loss: 0.008032601326704025 2023-01-22 09:59:54.574800: step: 340/77, loss: 0.02326280064880848 2023-01-22 09:59:56.048142: step: 344/77, loss: 0.0006932779215276241 2023-01-22 09:59:57.535496: step: 348/77, loss: 0.0041923802345991135 2023-01-22 09:59:58.988369: step: 352/77, loss: 0.021920569241046906 2023-01-22 10:00:00.503348: step: 356/77, loss: 0.0004425476072356105 2023-01-22 10:00:01.968321: step: 360/77, loss: 0.05860942602157593 2023-01-22 10:00:03.415328: step: 364/77, loss: 0.010078839026391506 2023-01-22 10:00:04.948143: step: 368/77, loss: 0.00035972020123153925 2023-01-22 10:00:06.472823: step: 372/77, loss: 0.02206575870513916 2023-01-22 10:00:07.910077: step: 376/77, loss: 0.06646309792995453 2023-01-22 10:00:09.376970: step: 380/77, loss: 0.07609245181083679 2023-01-22 10:00:10.831565: step: 384/77, loss: 0.05032084882259369 2023-01-22 10:00:12.315884: step: 388/77, loss: 0.009962482377886772 ================================================== Loss: 0.022 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Chinese: {'template': {'p': 
0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01743410932212081, 'epoch': 10} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Korean: {'template': {'p': 0.9242424242424242, 'r': 0.46564885496183206, 'f1': 0.6192893401015228}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.01761994775184249, 'epoch': 10} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 10} Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01743410932212081, 'epoch': 10} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 10} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 10} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 10} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 
0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 11 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:01:49.845722: step: 4/77, loss: 0.018249448388814926 2023-01-22 10:01:51.276802: step: 8/77, loss: 0.02742091938853264 2023-01-22 10:01:52.751145: step: 12/77, loss: 0.008343767374753952 2023-01-22 10:01:54.254411: step: 16/77, loss: 0.024018079042434692 2023-01-22 10:01:55.789957: step: 20/77, loss: 0.025947896763682365 2023-01-22 10:01:57.293754: step: 24/77, loss: 0.02267858199775219 2023-01-22 10:01:58.748271: step: 28/77, loss: 3.0612878617830575e-05 2023-01-22 10:02:00.247058: step: 32/77, loss: 0.0003211384464520961 2023-01-22 10:02:01.724890: step: 36/77, loss: 0.05688820034265518 2023-01-22 10:02:03.265476: step: 40/77, loss: 0.006101340986788273 2023-01-22 10:02:04.663927: step: 44/77, loss: 0.050070710480213165 2023-01-22 10:02:06.132698: step: 48/77, loss: 0.004720164462924004 2023-01-22 10:02:07.617429: step: 52/77, loss: 0.015022655948996544 2023-01-22 10:02:09.105679: step: 56/77, loss: 0.03563885763287544 2023-01-22 10:02:10.577915: step: 60/77, loss: 0.0330636128783226 2023-01-22 10:02:12.039786: step: 64/77, loss: 0.0014286059886217117 2023-01-22 10:02:13.533077: step: 68/77, loss: 0.000816542305983603 2023-01-22 10:02:15.016870: step: 72/77, loss: 0.01442760694772005 2023-01-22 10:02:16.496556: step: 76/77, loss: 9.378043614560738e-05 2023-01-22 10:02:17.973498: step: 80/77, loss: 0.0005923286080360413 2023-01-22 10:02:19.445085: step: 84/77, loss: 0.020105646923184395 2023-01-22 10:02:20.936963: step: 88/77, loss: 0.009484020993113518 2023-01-22 10:02:22.410204: step: 92/77, loss: 0.012684566900134087 2023-01-22 10:02:23.908081: step: 96/77, loss: 0.04148320108652115 2023-01-22 10:02:25.478786: step: 100/77, loss: 7.08875959389843e-05 2023-01-22 10:02:26.926550: step: 104/77, loss: 0.005405884236097336 2023-01-22 10:02:28.434153: step: 108/77, loss: 0.07155608385801315 2023-01-22 10:02:29.932278: step: 112/77, loss: 0.0010453937575221062 2023-01-22 10:02:31.364901: step: 116/77, loss: 0.009059806354343891 2023-01-22 10:02:32.770312: step: 120/77, loss: 0.02621476724743843 2023-01-22 10:02:34.259252: step: 124/77, loss: 0.002740699565038085 2023-01-22 10:02:35.688694: step: 128/77, loss: 0.010486319661140442 2023-01-22 10:02:37.187699: step: 132/77, loss: 0.01720418594777584 2023-01-22 10:02:38.657280: step: 136/77, loss: 0.00957572739571333 2023-01-22 10:02:40.123293: step: 140/77, loss: 0.003676429158076644 2023-01-22 10:02:41.628181: step: 144/77, loss: 0.009357225149869919 2023-01-22 10:02:43.106144: step: 148/77, loss: 0.03653610125184059 2023-01-22 10:02:44.557910: step: 152/77, loss: 0.04308353364467621 2023-01-22 10:02:46.047777: step: 156/77, loss: 0.05171763524413109 2023-01-22 10:02:47.520178: step: 160/77, loss: 0.004753998946398497 2023-01-22 10:02:48.995810: step: 164/77, loss: 3.090988684562035e-05 2023-01-22 10:02:50.499016: step: 168/77, loss: 0.013740334659814835 2023-01-22 10:02:51.999115: step: 172/77, loss: 0.015785954892635345 2023-01-22 10:02:53.457886: step: 176/77, loss: 0.019747678190469742 2023-01-22 10:02:54.983811: step: 180/77, loss: 0.009812500327825546 2023-01-22 10:02:56.520665: step: 184/77, loss: 0.022340651601552963 2023-01-22 10:02:57.989759: step: 188/77, loss: 
0.013165976852178574 2023-01-22 10:02:59.428552: step: 192/77, loss: 0.0026761272456496954 2023-01-22 10:03:00.972122: step: 196/77, loss: 0.0722554549574852 2023-01-22 10:03:02.403410: step: 200/77, loss: 0.014125137589871883 2023-01-22 10:03:03.800592: step: 204/77, loss: 0.00012538768351078033 2023-01-22 10:03:05.217615: step: 208/77, loss: 0.015420390293002129 2023-01-22 10:03:06.614757: step: 212/77, loss: 0.035006795078516006 2023-01-22 10:03:08.124607: step: 216/77, loss: 0.016030533239245415 2023-01-22 10:03:09.608290: step: 220/77, loss: 0.00042447782470844686 2023-01-22 10:03:11.109489: step: 224/77, loss: 0.011341025121510029 2023-01-22 10:03:12.570532: step: 228/77, loss: 0.01798064447939396 2023-01-22 10:03:14.080807: step: 232/77, loss: 0.03821370005607605 2023-01-22 10:03:15.534798: step: 236/77, loss: 0.08387462049722672 2023-01-22 10:03:17.009415: step: 240/77, loss: 0.011262631975114346 2023-01-22 10:03:18.469025: step: 244/77, loss: 0.04161141812801361 2023-01-22 10:03:19.912291: step: 248/77, loss: 0.023375479504466057 2023-01-22 10:03:21.402646: step: 252/77, loss: 0.03386523202061653 2023-01-22 10:03:22.857579: step: 256/77, loss: 0.03562305495142937 2023-01-22 10:03:24.286647: step: 260/77, loss: 0.008851949125528336 2023-01-22 10:03:25.746483: step: 264/77, loss: 0.009420386515557766 2023-01-22 10:03:27.210482: step: 268/77, loss: 0.01851673051714897 2023-01-22 10:03:28.595831: step: 272/77, loss: 0.008063890039920807 2023-01-22 10:03:30.083733: step: 276/77, loss: 0.004955152980983257 2023-01-22 10:03:31.602580: step: 280/77, loss: 0.11314801871776581 2023-01-22 10:03:33.104720: step: 284/77, loss: 0.009548168629407883 2023-01-22 10:03:34.509105: step: 288/77, loss: 0.00019546764087863266 2023-01-22 10:03:36.085923: step: 292/77, loss: 0.09462368488311768 2023-01-22 10:03:37.613334: step: 296/77, loss: 7.842419290682301e-05 2023-01-22 10:03:39.061691: step: 300/77, loss: 0.0013920800993219018 2023-01-22 10:03:40.569423: step: 304/77, loss: 0.04698015749454498 2023-01-22 10:03:42.046486: step: 308/77, loss: 0.002782592084258795 2023-01-22 10:03:43.467607: step: 312/77, loss: 2.5230790924979374e-05 2023-01-22 10:03:44.917193: step: 316/77, loss: 0.007181983441114426 2023-01-22 10:03:46.431552: step: 320/77, loss: 0.022983960807323456 2023-01-22 10:03:47.893345: step: 324/77, loss: 0.00940174050629139 2023-01-22 10:03:49.297438: step: 328/77, loss: 0.02167385257780552 2023-01-22 10:03:50.721154: step: 332/77, loss: 0.009350062347948551 2023-01-22 10:03:52.223083: step: 336/77, loss: 0.063703753054142 2023-01-22 10:03:53.741556: step: 340/77, loss: 0.002021544612944126 2023-01-22 10:03:55.249559: step: 344/77, loss: 0.0050396379083395 2023-01-22 10:03:56.708543: step: 348/77, loss: 0.008820513263344765 2023-01-22 10:03:58.162263: step: 352/77, loss: 0.003946100827306509 2023-01-22 10:03:59.626444: step: 356/77, loss: 0.003359707072377205 2023-01-22 10:04:01.129911: step: 360/77, loss: 0.008270454593002796 2023-01-22 10:04:02.659694: step: 364/77, loss: 9.894469985738397e-05 2023-01-22 10:04:04.168682: step: 368/77, loss: 0.002662702463567257 2023-01-22 10:04:05.688431: step: 372/77, loss: 0.002308095572516322 2023-01-22 10:04:07.135249: step: 376/77, loss: 0.013296185992658138 2023-01-22 10:04:08.535840: step: 380/77, loss: 0.049899984151124954 2023-01-22 10:04:10.059484: step: 384/77, loss: 0.0031746437307447195 2023-01-22 10:04:11.569201: step: 388/77, loss: 0.08022551983594894 ================================================== Loss: 0.020 -------------------- Dev 
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04686584651435266, 'epoch': 11} Test Chinese: {'template': {'p': 0.9104477611940298, 'r': 0.46564885496183206, 'f1': 0.6161616161616161}, 'slot': {'p': 0.42424242424242425, 'r': 0.012079378774805867, 'f1': 0.02348993288590604}, 'combined': 0.014473595010507762, 'epoch': 11} Dev Korean: {'template': {'p': 1.0, 'r': 0.4666666666666667, 'f1': 0.6363636363636364}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04473558076370027, 'epoch': 11} Test Korean: {'template': {'p': 0.9253731343283582, 'r': 0.4732824427480916, 'f1': 0.6262626262626263}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016784257990288143, 'epoch': 11} Dev Russian: {'template': {'p': 1.0, 'r': 0.48333333333333334, 'f1': 0.6516853932584269}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.045812681424142486, 'epoch': 11} Test Russian: {'template': {'p': 0.8970588235294118, 'r': 0.46564885496183206, 'f1': 0.6130653266331658}, 'slot': {'p': 0.46875, 'r': 0.012942191544434857, 'f1': 0.025188916876574305}, 'combined': 0.015442451552472689, 'epoch': 11} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 11} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 11} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 11} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 
0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 12 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:05:50.293554: step: 4/77, loss: 0.04436550661921501 2023-01-22 10:05:51.756363: step: 8/77, loss: 0.004802200943231583 2023-01-22 10:05:53.224326: step: 12/77, loss: 0.11278200894594193 2023-01-22 10:05:54.686470: step: 16/77, loss: 0.004204769618809223 2023-01-22 10:05:56.197999: step: 20/77, loss: 0.002979584503918886 2023-01-22 10:05:57.661551: step: 24/77, loss: 0.011945674195885658 2023-01-22 10:05:59.108090: step: 28/77, loss: 0.004289418924599886 2023-01-22 10:06:00.675161: step: 32/77, loss: 0.005253539886325598 2023-01-22 10:06:02.138712: step: 36/77, loss: 0.0008780001080594957 2023-01-22 10:06:03.587046: step: 40/77, loss: 0.001079891575500369 2023-01-22 10:06:05.104864: step: 44/77, loss: 0.015719100832939148 2023-01-22 10:06:06.563926: step: 48/77, loss: 0.032325148582458496 2023-01-22 10:06:08.054692: step: 52/77, loss: 0.0008553061634302139 2023-01-22 10:06:09.608326: step: 56/77, loss: 0.001794558484107256 2023-01-22 10:06:11.178552: step: 60/77, loss: 0.03624221682548523 2023-01-22 10:06:12.692245: step: 64/77, loss: 0.0045928796753287315 2023-01-22 10:06:14.186706: step: 68/77, loss: 0.05722226947546005 2023-01-22 10:06:15.675574: step: 72/77, loss: 0.0012267765123397112 2023-01-22 10:06:17.216126: step: 76/77, loss: 0.0031581628136336803 2023-01-22 10:06:18.686860: step: 80/77, loss: 0.01436734851449728 2023-01-22 10:06:20.220666: step: 84/77, loss: 0.013682027347385883 2023-01-22 10:06:21.737466: step: 88/77, loss: 0.004203316755592823 2023-01-22 10:06:23.196647: step: 92/77, loss: 0.005488214548677206 2023-01-22 10:06:24.685035: step: 96/77, loss: 0.0018630167469382286 2023-01-22 10:06:26.132051: step: 100/77, loss: 0.00012121098552597687 2023-01-22 10:06:27.588373: step: 104/77, loss: 0.0013276045210659504 2023-01-22 10:06:29.128738: step: 108/77, loss: 0.005005714017897844 2023-01-22 10:06:30.514225: step: 112/77, loss: 0.006716449744999409 2023-01-22 10:06:32.037570: step: 116/77, loss: 0.009621203877031803 2023-01-22 10:06:33.540370: step: 120/77, loss: 0.003736691316589713 2023-01-22 10:06:35.014277: step: 124/77, loss: 0.02548018842935562 2023-01-22 10:06:36.457501: step: 128/77, loss: 0.015529114753007889 2023-01-22 10:06:37.960669: step: 132/77, loss: 0.0020801310893148184 2023-01-22 10:06:39.416937: step: 136/77, loss: 0.0016847447259351611 2023-01-22 10:06:40.959421: step: 140/77, loss: 0.0012095924466848373 2023-01-22 10:06:42.409459: step: 144/77, loss: 0.0006464039324782789 2023-01-22 10:06:43.881697: step: 148/77, loss: 0.0005958712426945567 2023-01-22 10:06:45.404342: step: 152/77, loss: 0.00440417742356658 2023-01-22 10:06:46.868019: step: 156/77, loss: 0.024716690182685852 2023-01-22 10:06:48.332765: step: 160/77, loss: 0.009818075224757195 2023-01-22 10:06:49.825410: step: 164/77, loss: 0.01651581935584545 2023-01-22 10:06:51.251660: step: 168/77, loss: 0.0026334517169743776 2023-01-22 10:06:52.659410: step: 172/77, loss: 7.838180317776278e-05 2023-01-22 10:06:54.122630: 
step: 176/77, loss: 0.0017036596545949578 2023-01-22 10:06:55.546384: step: 180/77, loss: 0.0021797381341457367 2023-01-22 10:06:57.021043: step: 184/77, loss: 0.002597886137664318 2023-01-22 10:06:58.469630: step: 188/77, loss: 0.0011595187243074179 2023-01-22 10:06:59.874721: step: 192/77, loss: 0.0011745744850486517 2023-01-22 10:07:01.366577: step: 196/77, loss: 0.0181453675031662 2023-01-22 10:07:02.860791: step: 200/77, loss: 0.011830981820821762 2023-01-22 10:07:04.427471: step: 204/77, loss: 0.022019436582922935 2023-01-22 10:07:05.918524: step: 208/77, loss: 0.04811529070138931 2023-01-22 10:07:07.357845: step: 212/77, loss: 0.020062996074557304 2023-01-22 10:07:08.831857: step: 216/77, loss: 0.0033560858573764563 2023-01-22 10:07:10.321880: step: 220/77, loss: 0.012346186675131321 2023-01-22 10:07:11.747795: step: 224/77, loss: 0.00652111042290926 2023-01-22 10:07:13.243885: step: 228/77, loss: 3.144413494737819e-05 2023-01-22 10:07:14.729873: step: 232/77, loss: 0.018586190417408943 2023-01-22 10:07:16.145408: step: 236/77, loss: 0.046308234333992004 2023-01-22 10:07:17.612637: step: 240/77, loss: 0.0010744313476607203 2023-01-22 10:07:19.057463: step: 244/77, loss: 0.013715185225009918 2023-01-22 10:07:20.588885: step: 248/77, loss: 0.0011830773437395692 2023-01-22 10:07:22.045289: step: 252/77, loss: 0.035007935017347336 2023-01-22 10:07:23.498621: step: 256/77, loss: 0.00036713789450004697 2023-01-22 10:07:24.963439: step: 260/77, loss: 0.003876642556861043 2023-01-22 10:07:26.396925: step: 264/77, loss: 0.03178451210260391 2023-01-22 10:07:27.877629: step: 268/77, loss: 0.004363094922155142 2023-01-22 10:07:29.327161: step: 272/77, loss: 0.00016332468658220023 2023-01-22 10:07:30.823287: step: 276/77, loss: 2.408213185844943e-05 2023-01-22 10:07:32.247357: step: 280/77, loss: 0.016927000135183334 2023-01-22 10:07:33.713662: step: 284/77, loss: 0.025983309373259544 2023-01-22 10:07:35.147377: step: 288/77, loss: 0.0042741927318274975 2023-01-22 10:07:36.602564: step: 292/77, loss: 0.017417028546333313 2023-01-22 10:07:38.072941: step: 296/77, loss: 0.014634665101766586 2023-01-22 10:07:39.514749: step: 300/77, loss: 0.009720695205032825 2023-01-22 10:07:41.048841: step: 304/77, loss: 0.0002194459520978853 2023-01-22 10:07:42.564146: step: 308/77, loss: 6.72009409754537e-05 2023-01-22 10:07:44.040568: step: 312/77, loss: 0.05025641992688179 2023-01-22 10:07:45.501623: step: 316/77, loss: 0.012442845851182938 2023-01-22 10:07:47.010699: step: 320/77, loss: 0.0018316482892259955 2023-01-22 10:07:48.568495: step: 324/77, loss: 0.0017850275617092848 2023-01-22 10:07:50.084969: step: 328/77, loss: 0.00014143706357572228 2023-01-22 10:07:51.601855: step: 332/77, loss: 0.003508616704493761 2023-01-22 10:07:53.089302: step: 336/77, loss: 0.046058665961027145 2023-01-22 10:07:54.565812: step: 340/77, loss: 0.03320928290486336 2023-01-22 10:07:56.050715: step: 344/77, loss: 0.004275294486433268 2023-01-22 10:07:57.481993: step: 348/77, loss: 0.001723662600852549 2023-01-22 10:07:58.949036: step: 352/77, loss: 0.002751779742538929 2023-01-22 10:08:00.358290: step: 356/77, loss: 0.0015267680864781141 2023-01-22 10:08:01.821457: step: 360/77, loss: 0.009214522317051888 2023-01-22 10:08:03.298337: step: 364/77, loss: 0.0674782544374466 2023-01-22 10:08:04.753718: step: 368/77, loss: 0.01894235797226429 2023-01-22 10:08:06.291555: step: 372/77, loss: 0.00815523136407137 2023-01-22 10:08:07.737528: step: 376/77, loss: 0.010504646226763725 2023-01-22 10:08:09.216818: step: 380/77, loss: 
0.004112462513148785 2023-01-22 10:08:10.716397: step: 384/77, loss: 0.003164243418723345 2023-01-22 10:08:12.256870: step: 388/77, loss: 0.02831357903778553
==================================================
Loss: 0.013
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12}
Test Chinese: {'template': {'p': 0.9402985074626866, 'r': 0.48091603053435117, 'f1': 0.6363636363636365}, 'slot': {'p': 0.5, 'r': 0.013805004314063849, 'f1': 0.026868178001679264}, 'combined': 0.01709793145561408, 'epoch': 12}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12}
Test Korean: {'template': {'p': 0.9402985074626866, 'r': 0.48091603053435117, 'f1': 0.6363636363636365}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018181818181818184, 'epoch': 12}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 12}
Test Russian: {'template': {'p': 0.9411764705882353, 'r': 0.48854961832061067, 'f1': 0.6432160804020101}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.01837760229720029, 'epoch': 12}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 12}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 12}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 12}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 13
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:09:50.392803: step: 4/77, loss: 0.02979923039674759 2023-01-22 10:09:51.886780: step: 8/77, loss: 0.0011581765720620751 2023-01-22 10:09:53.418468: step: 12/77, loss: 0.01394204143434763 2023-01-22 10:09:54.799032: step: 16/77, loss: 0.016473228111863136 2023-01-22 10:09:56.288815: step: 20/77, loss: 0.003298100782558322 2023-01-22 10:09:57.760066: step: 24/77, loss: 0.02194811776280403 2023-01-22 10:09:59.260602: step: 28/77, loss: 0.01043167244642973 2023-01-22 10:10:00.710610: step: 32/77, loss: 0.06836826354265213 2023-01-22 10:10:02.228791: step: 36/77, loss: 0.000960124540142715 2023-01-22 10:10:03.704010: step: 40/77, loss: 0.06304540485143661 2023-01-22 10:10:05.189435: step: 44/77, loss: 0.00409977650269866 2023-01-22 10:10:06.676850: step: 48/77, loss: 0.00018478457059245557 2023-01-22 10:10:08.133856: step: 52/77, loss: 0.046891722828149796 2023-01-22 10:10:09.566001: step: 56/77, loss: 0.01876625046133995 2023-01-22 10:10:11.044139: step: 60/77, loss: 3.593548899516463e-05 2023-01-22 10:10:12.519337: step: 64/77, loss: 6.849043711554259e-05 2023-01-22 10:10:13.995510: step: 68/77, loss: 0.0011330159613862634 2023-01-22 10:10:15.439031: step: 72/77, loss: 0.0002272444253321737 2023-01-22 10:10:16.892032: step: 76/77, loss: 1.507402066636132e-05 2023-01-22 10:10:18.429606: step: 80/77, loss: 0.002101314952597022 2023-01-22 10:10:19.867994: step: 84/77, loss: 0.033360376954078674 2023-01-22 10:10:21.314894: step: 88/77, loss: 0.10507829487323761 2023-01-22 10:10:22.842846: step: 92/77, loss: 0.00254506035707891 2023-01-22 10:10:24.265012: step: 96/77, loss: 0.015117666684091091 2023-01-22 10:10:25.804646: step: 100/77, loss: 0.011987771838903427 2023-01-22 10:10:27.262945: step: 104/77, loss: 0.0020428854040801525 2023-01-22 10:10:28.674799: step: 108/77, loss: 1.1954606634390075e-05 2023-01-22 10:10:30.123953: step: 112/77, loss: 0.0017031385796144605 2023-01-22 10:10:31.593521: step: 116/77, loss: 0.04721391946077347 2023-01-22 10:10:33.036319: step: 120/77, loss: 0.00435302872210741 2023-01-22 10:10:34.516037: step: 124/77, loss: 0.006813234183937311 2023-01-22 10:10:35.943435: step: 128/77, loss: 0.0010583556722849607 2023-01-22 10:10:37.485313: step: 132/77, loss: 0.0035780100151896477 2023-01-22 10:10:38.943901: step: 136/77, loss: 0.00271525327116251 2023-01-22 10:10:40.458880: step: 140/77, loss: 0.0007694312371313572 2023-01-22 10:10:41.971358: step: 144/77, loss: 0.0012259716168045998 2023-01-22 10:10:43.440799: step: 148/77, loss: 0.0017704784404486418 2023-01-22 10:10:44.949362: step: 152/77, loss: 0.00048166397027671337 2023-01-22 10:10:46.481927: step: 156/77, loss: 0.01880849339067936 2023-01-22 10:10:47.899553: step: 160/77, loss:
0.008621509186923504 2023-01-22 10:10:49.303385: step: 164/77, loss: 0.010005577467381954 2023-01-22 10:10:50.793167: step: 168/77, loss: 0.011141312308609486 2023-01-22 10:10:52.278258: step: 172/77, loss: 0.0018681518267840147 2023-01-22 10:10:53.791712: step: 176/77, loss: 0.0004037081089336425 2023-01-22 10:10:55.186121: step: 180/77, loss: 0.009404388256371021 2023-01-22 10:10:56.716509: step: 184/77, loss: 0.0016339367721229792 2023-01-22 10:10:58.192770: step: 188/77, loss: 0.014390097931027412 2023-01-22 10:10:59.685350: step: 192/77, loss: 0.0010596985230222344 2023-01-22 10:11:01.139311: step: 196/77, loss: 0.054352860897779465 2023-01-22 10:11:02.565605: step: 200/77, loss: 0.03959896042943001 2023-01-22 10:11:04.070890: step: 204/77, loss: 9.916317503666505e-05 2023-01-22 10:11:05.520882: step: 208/77, loss: 0.013791847974061966 2023-01-22 10:11:06.974660: step: 212/77, loss: 0.0007248240290209651 2023-01-22 10:11:08.507365: step: 216/77, loss: 0.0002708295651245862 2023-01-22 10:11:09.937528: step: 220/77, loss: 0.0004911398864351213 2023-01-22 10:11:11.453266: step: 224/77, loss: 0.0047632548958063126 2023-01-22 10:11:12.885492: step: 228/77, loss: 0.0011701977346092463 2023-01-22 10:11:14.372773: step: 232/77, loss: 0.00042671553092077374 2023-01-22 10:11:15.894781: step: 236/77, loss: 0.03621303662657738 2023-01-22 10:11:17.419332: step: 240/77, loss: 0.0006495914421975613 2023-01-22 10:11:18.887085: step: 244/77, loss: 0.0029841335490345955 2023-01-22 10:11:20.454433: step: 248/77, loss: 0.02200494334101677 2023-01-22 10:11:21.988895: step: 252/77, loss: 0.0032038982026278973 2023-01-22 10:11:23.428478: step: 256/77, loss: 0.004941218066960573 2023-01-22 10:11:24.799961: step: 260/77, loss: 0.027301592752337456 2023-01-22 10:11:26.333479: step: 264/77, loss: 0.010927285067737103 2023-01-22 10:11:27.854611: step: 268/77, loss: 0.010645708069205284 2023-01-22 10:11:29.362381: step: 272/77, loss: 0.00907566212117672 2023-01-22 10:11:30.910551: step: 276/77, loss: 0.0008718777680769563 2023-01-22 10:11:32.358246: step: 280/77, loss: 0.004993719980120659 2023-01-22 10:11:33.847914: step: 284/77, loss: 0.0007454871665686369 2023-01-22 10:11:35.344616: step: 288/77, loss: 0.05532342940568924 2023-01-22 10:11:36.839460: step: 292/77, loss: 0.00028806107002310455 2023-01-22 10:11:38.359066: step: 296/77, loss: 0.00247851456515491 2023-01-22 10:11:39.866743: step: 300/77, loss: 0.012744927778840065 2023-01-22 10:11:41.376039: step: 304/77, loss: 0.033811114728450775 2023-01-22 10:11:42.839564: step: 308/77, loss: 0.0008661256870254874 2023-01-22 10:11:44.291341: step: 312/77, loss: 0.0006619691848754883 2023-01-22 10:11:45.823172: step: 316/77, loss: 0.006505975965410471 2023-01-22 10:11:47.336111: step: 320/77, loss: 0.03772088512778282 2023-01-22 10:11:48.768166: step: 324/77, loss: 0.015472686849534512 2023-01-22 10:11:50.232932: step: 328/77, loss: 0.006314371712505817 2023-01-22 10:11:51.729305: step: 332/77, loss: 2.2619480660068803e-05 2023-01-22 10:11:53.243337: step: 336/77, loss: 0.021693747490644455 2023-01-22 10:11:54.715002: step: 340/77, loss: 0.00468218931928277 2023-01-22 10:11:56.224063: step: 344/77, loss: 0.001238841563463211 2023-01-22 10:11:57.711243: step: 348/77, loss: 0.0005457285442389548 2023-01-22 10:11:59.255059: step: 352/77, loss: 0.008669009432196617 2023-01-22 10:12:00.733823: step: 356/77, loss: 0.004476443864405155 2023-01-22 10:12:02.176479: step: 360/77, loss: 0.01310722716152668 2023-01-22 10:12:03.710546: step: 364/77, loss: 0.0666121393442154 
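A note between these step records: the counter advances in increments of 4, matching --accumulate_step 4 from the command line, which suggests each record is emitted once per gradient-accumulation cycle over micro-batches (the /77 denominator does not track the final step of 388 and is left here as printed). Below is a minimal sketch of that logging pattern, assuming a PyTorch-style loop; model, loader, optimizer, and total are hypothetical stand-ins, not the actual train.py internals.

```python
# Hedged sketch of the logging pattern above: one record per accumulation
# cycle. `model`, `loader`, `optimizer`, and `total` are illustrative
# stand-ins, not the actual train.py internals.
from datetime import datetime

ACCUMULATE_STEP = 4  # mirrors --accumulate_step 4

def train_one_epoch(model, loader, optimizer, total):
    optimizer.zero_grad()
    for i, batch in enumerate(loader, start=1):
        loss = model(batch)                  # assumed: forward pass returns a scalar loss
        (loss / ACCUMULATE_STEP).backward()  # scale so the update averages the cycle
        if i % ACCUMULATE_STEP == 0:
            optimizer.step()                 # one optimizer update per 4 micro-batches
            optimizer.zero_grad()
            # same shape as the records above: "<timestamp>: step: <i>/<total>, loss: <value>"
            print(f"{datetime.now()}: step: {i}/{total}, loss: {loss.item()}")
```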
2023-01-22 10:12:05.178394: step: 368/77, loss: 0.0067118373699486256 2023-01-22 10:12:06.592582: step: 372/77, loss: 0.00446122232824564 2023-01-22 10:12:08.069960: step: 376/77, loss: 0.0023765151854604483 2023-01-22 10:12:09.575364: step: 380/77, loss: 0.007924961857497692 2023-01-22 10:12:11.057854: step: 384/77, loss: 0.010871789418160915 2023-01-22 10:12:12.510396: step: 388/77, loss: 0.00019569540745578706
==================================================
Loss: 0.012
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Chinese: {'template': {'p': 0.8625, 'r': 0.5267175572519084, 'f1': 0.6540284360189573}, 'slot': {'p': 0.42857142857142855, 'r': 0.015530629853321829, 'f1': 0.029975020815986676}, 'combined': 0.019604515983915455, 'epoch': 13}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Korean: {'template': {'p': 0.875, 'r': 0.5343511450381679, 'f1': 0.6635071090047393}, 'slot': {'p': 0.4318181818181818, 'r': 0.01639344262295082, 'f1': 0.03158769742310889}, 'combined': 0.020958661797323436, 'epoch': 13}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 13}
Test Russian: {'template': {'p': 0.8625, 'r': 0.5267175572519084, 'f1': 0.6540284360189573}, 'slot': {'p': 0.42857142857142855, 'r': 0.015530629853321829, 'f1': 0.029975020815986676}, 'combined': 0.019604515983915455, 'epoch': 13}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 13}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 13}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 13}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 14
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:13:49.945548: step: 4/77, loss: 0.05021757259964943 2023-01-22 10:13:51.449952: step: 8/77, loss: 0.013492027297616005 2023-01-22 10:13:52.856611: step: 12/77, loss: 0.01188771240413189 2023-01-22 10:13:54.296990: step: 16/77, loss: 0.004332687705755234 2023-01-22 10:13:55.775533: step: 20/77, loss: 0.0068397969007492065 2023-01-22 10:13:57.183015: step: 24/77, loss: 0.03639312461018562 2023-01-22 10:13:58.657593: step: 28/77, loss: 0.010193396359682083 2023-01-22 10:14:00.125735: step: 32/77, loss: 0.0002720890333876014 2023-01-22 10:14:01.584382: step: 36/77, loss: 0.0033012309577316046 2023-01-22 10:14:02.994154: step: 40/77, loss: 0.06803090870380402 2023-01-22 10:14:04.447184: step: 44/77, loss: 0.002303973538801074 2023-01-22 10:14:05.892211: step: 48/77, loss: 0.01426534354686737 2023-01-22 10:14:07.326928: step: 52/77, loss: 0.01106639951467514 2023-01-22 10:14:08.812677: step: 56/77, loss: 0.013820632360875607 2023-01-22 10:14:10.270901: step: 60/77, loss: 0.001012542168609798 2023-01-22 10:14:11.761353: step: 64/77, loss: 0.003073731204494834 2023-01-22 10:14:13.269530: step: 68/77, loss: 0.0074550556018948555 2023-01-22 10:14:14.742364: step: 72/77, loss: 0.03673708066344261 2023-01-22 10:14:16.254848: step: 76/77, loss: 0.009585902094841003 2023-01-22 10:14:17.737006: step: 80/77, loss: 0.0049670119769871235 2023-01-22 10:14:19.174801: step: 84/77, loss: 0.0012508973013609648 2023-01-22 10:14:20.644359: step: 88/77, loss: 0.04550894722342491 2023-01-22 10:14:22.192604: step: 92/77, loss: 0.0476241372525692 2023-01-22 10:14:23.653287: step: 96/77, loss: 0.0003463767934590578 2023-01-22 10:14:25.121430: step: 100/77, loss: 0.0013855427969247103 2023-01-22 10:14:26.583017: step: 104/77, loss: 6.948120426386595e-05 2023-01-22 10:14:28.039511: step: 108/77, loss: 0.0025392724201083183 2023-01-22 10:14:29.491099: step: 112/77, loss: 0.0023203599266707897 2023-01-22 10:14:31.023165: step: 116/77, loss: 0.002155708149075508 2023-01-22 10:14:32.508147: step: 120/77, loss: 0.010432385839521885 2023-01-22 10:14:33.977244: step: 124/77, loss: 0.0015561548061668873 2023-01-22 10:14:35.432408: step: 128/77, loss: 0.0022515016607940197 2023-01-22 10:14:36.863586: step: 132/77, loss: 0.0063758837059140205 2023-01-22 10:14:38.327640: step: 136/77, loss: 0.00816910620778799 2023-01-22 10:14:39.772504: step: 140/77, loss: 0.0004892628057859838 2023-01-22 10:14:41.253142: step: 144/77, loss: 0.006460327655076981 2023-01-22 10:14:42.745362: step:
148/77, loss: 0.0022336977999657393 2023-01-22 10:14:44.234862: step: 152/77, loss: 0.04381607472896576 2023-01-22 10:14:45.673018: step: 156/77, loss: 0.01606643944978714 2023-01-22 10:14:47.134622: step: 160/77, loss: 0.01276947371661663 2023-01-22 10:14:48.686265: step: 164/77, loss: 8.410248847212642e-05 2023-01-22 10:14:50.177139: step: 168/77, loss: 0.01344869751483202 2023-01-22 10:14:51.638297: step: 172/77, loss: 0.01566564477980137 2023-01-22 10:14:53.144766: step: 176/77, loss: 3.885049591190182e-05 2023-01-22 10:14:54.644009: step: 180/77, loss: 0.002805543364956975 2023-01-22 10:14:56.181146: step: 184/77, loss: 5.492310265253764e-06 2023-01-22 10:14:57.582725: step: 188/77, loss: 0.004818313755095005 2023-01-22 10:14:59.090250: step: 192/77, loss: 0.00019668742606882006 2023-01-22 10:15:00.545674: step: 196/77, loss: 0.0005715168663300574 2023-01-22 10:15:02.129249: step: 200/77, loss: 0.013614819385111332 2023-01-22 10:15:03.628716: step: 204/77, loss: 0.006921195425093174 2023-01-22 10:15:05.127107: step: 208/77, loss: 0.000296951737254858 2023-01-22 10:15:06.584772: step: 212/77, loss: 0.000808404351118952 2023-01-22 10:15:08.113889: step: 216/77, loss: 0.0011310662375763059 2023-01-22 10:15:09.616881: step: 220/77, loss: 0.1592380404472351 2023-01-22 10:15:11.093149: step: 224/77, loss: 0.0025794643443077803 2023-01-22 10:15:12.529034: step: 228/77, loss: 0.0009930891683325171 2023-01-22 10:15:14.042661: step: 232/77, loss: 0.0015125039499253035 2023-01-22 10:15:15.489108: step: 236/77, loss: 0.005088160280138254 2023-01-22 10:15:17.045895: step: 240/77, loss: 0.0073650190606713295 2023-01-22 10:15:18.517081: step: 244/77, loss: 0.007640082389116287 2023-01-22 10:15:20.015688: step: 248/77, loss: 0.0062585556879639626 2023-01-22 10:15:21.491038: step: 252/77, loss: 0.02538384683430195 2023-01-22 10:15:22.958029: step: 256/77, loss: 0.0007471991702914238 2023-01-22 10:15:24.457241: step: 260/77, loss: 0.000744312594179064 2023-01-22 10:15:25.909292: step: 264/77, loss: 6.348014721879736e-05 2023-01-22 10:15:27.452319: step: 268/77, loss: 0.0054208822548389435 2023-01-22 10:15:28.946148: step: 272/77, loss: 0.004874257370829582 2023-01-22 10:15:30.425065: step: 276/77, loss: 0.0018353211926296353 2023-01-22 10:15:31.862650: step: 280/77, loss: 0.01595592312514782 2023-01-22 10:15:33.429081: step: 284/77, loss: 0.010417342185974121 2023-01-22 10:15:34.905950: step: 288/77, loss: 0.015512584708631039 2023-01-22 10:15:36.366099: step: 292/77, loss: 0.05964883044362068 2023-01-22 10:15:37.807625: step: 296/77, loss: 0.021162962540984154 2023-01-22 10:15:39.273979: step: 300/77, loss: 0.005686973687261343 2023-01-22 10:15:40.694665: step: 304/77, loss: 0.007818142883479595 2023-01-22 10:15:42.169066: step: 308/77, loss: 0.006607674062252045 2023-01-22 10:15:43.612309: step: 312/77, loss: 0.00150693254545331 2023-01-22 10:15:45.084795: step: 316/77, loss: 0.014010453596711159 2023-01-22 10:15:46.567218: step: 320/77, loss: 0.0010493621230125427 2023-01-22 10:15:48.075970: step: 324/77, loss: 0.004597794730216265 2023-01-22 10:15:49.609741: step: 328/77, loss: 0.001323473872616887 2023-01-22 10:15:51.092498: step: 332/77, loss: 0.003567654872313142 2023-01-22 10:15:52.602123: step: 336/77, loss: 0.023809565231204033 2023-01-22 10:15:54.078376: step: 340/77, loss: 0.0202918890863657 2023-01-22 10:15:55.638816: step: 344/77, loss: 0.05709415674209595 2023-01-22 10:15:57.050120: step: 348/77, loss: 0.06244867295026779 2023-01-22 10:15:58.578601: step: 352/77, loss: 
0.0023825361859053373 2023-01-22 10:15:59.996986: step: 356/77, loss: 0.015587534755468369 2023-01-22 10:16:01.505535: step: 360/77, loss: 0.003471288364380598 2023-01-22 10:16:02.992842: step: 364/77, loss: 0.004319621250033379 2023-01-22 10:16:04.491688: step: 368/77, loss: 0.014541847631335258 2023-01-22 10:16:06.014058: step: 372/77, loss: 0.00017313337593805045 2023-01-22 10:16:07.393113: step: 376/77, loss: 0.002016652375459671 2023-01-22 10:16:08.890035: step: 380/77, loss: 0.021287081763148308 2023-01-22 10:16:10.381067: step: 384/77, loss: 0.0034362124279141426 2023-01-22 10:16:11.815830: step: 388/77, loss: 0.002922603627666831
==================================================
Loss: 0.013
--------------------
Dev Chinese: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Chinese: {'template': {'p': 0.9014084507042254, 'r': 0.48854961832061067, 'f1': 0.6336633663366336}, 'slot': {'p': 0.4, 'r': 0.015530629853321829, 'f1': 0.029900332225913623}, 'combined': 0.018946745172856154, 'epoch': 14}
Dev Korean: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Korean: {'template': {'p': 0.9014084507042254, 'r': 0.48854961832061067, 'f1': 0.6336633663366336}, 'slot': {'p': 0.391304347826087, 'r': 0.015530629853321829, 'f1': 0.029875518672199168}, 'combined': 0.018931021732878677, 'epoch': 14}
Dev Russian: {'template': {'p': 0.9459459459459459, 'r': 0.5833333333333334, 'f1': 0.7216494845360825}, 'slot': {'p': 0.43478260869565216, 'r': 0.03780718336483932, 'f1': 0.06956521739130435}, 'combined': 0.0502017032720753, 'epoch': 14}
Test Russian: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01762919434088047, 'epoch': 14}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 14}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 14}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 14}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 15
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:17:49.168404: step: 4/77, loss: 0.008644331246614456 2023-01-22 10:17:50.637435: step: 8/77, loss: 0.011979562230408192 2023-01-22 10:17:52.196476: step: 12/77, loss: 0.014013068750500679 2023-01-22 10:17:53.640997: step: 16/77, loss: 0.008621701039373875 2023-01-22 10:17:55.101687: step: 20/77, loss: 0.021316878497600555 2023-01-22 10:17:56.606946: step: 24/77, loss: 0.051223039627075195 2023-01-22 10:17:58.134426: step: 28/77, loss: 0.002289854222908616 2023-01-22 10:17:59.658207: step: 32/77, loss: 0.004393209703266621 2023-01-22 10:18:01.184267: step: 36/77, loss: 5.282249730953481e-06 2023-01-22 10:18:02.710137: step: 40/77, loss: 0.01075073517858982 2023-01-22 10:18:04.222285: step: 44/77, loss: 0.0020622683223336935 2023-01-22 10:18:05.719334: step: 48/77, loss: 0.03781045228242874 2023-01-22 10:18:07.177167: step: 52/77, loss: 0.0880887433886528 2023-01-22 10:18:08.708472: step: 56/77, loss: 0.03132542222738266 2023-01-22 10:18:10.225183: step: 60/77, loss: 0.00321400398388505 2023-01-22 10:18:11.663565: step: 64/77, loss: 0.004773072432726622 2023-01-22 10:18:13.046192: step: 68/77, loss: 0.017901595681905746 2023-01-22 10:18:14.509130: step: 72/77, loss: 0.038539398461580276 2023-01-22 10:18:16.010595: step: 76/77, loss: 0.133880615234375 2023-01-22 10:18:17.498263: step: 80/77, loss: 0.0034761850256472826 2023-01-22 10:18:18.993961: step: 84/77, loss: 0.013166949152946472 2023-01-22 10:18:20.458079: step: 88/77, loss: 0.00681588239967823 2023-01-22 10:18:21.906843: step: 92/77, loss: 0.009442516602575779 2023-01-22 10:18:23.398944: step: 96/77, loss: 0.023162730038166046 2023-01-22 10:18:24.890911: step: 100/77, loss: 0.017544515430927277 2023-01-22 10:18:26.365257: step: 104/77, loss: 0.016073524951934814 2023-01-22 10:18:27.872523: step: 108/77, loss: 0.03755786269903183 2023-01-22 10:18:29.431881: step: 112/77, loss: 0.0028731636703014374 2023-01-22 10:18:30.919167: step: 116/77, loss: 0.006997175980359316 2023-01-22 10:18:32.413348: step: 120/77, loss: 0.002053946955129504 2023-01-22 10:18:33.904577: step: 124/77, loss: 0.0005283099017105997 2023-01-22 10:18:35.391178: step: 128/77, loss:
0.007918376475572586 2023-01-22 10:18:36.879098: step: 132/77, loss: 0.0013505109818652272 2023-01-22 10:18:38.380710: step: 136/77, loss: 0.006453365087509155 2023-01-22 10:18:39.831287: step: 140/77, loss: 0.008280211128294468 2023-01-22 10:18:41.286749: step: 144/77, loss: 0.02169647254049778 2023-01-22 10:18:42.786768: step: 148/77, loss: 0.008906039409339428 2023-01-22 10:18:44.282604: step: 152/77, loss: 9.285704436479136e-05 2023-01-22 10:18:45.729720: step: 156/77, loss: 0.061494454741477966 2023-01-22 10:18:47.189985: step: 160/77, loss: 0.0007124262629076838 2023-01-22 10:18:48.678368: step: 164/77, loss: 0.0020129838958382607 2023-01-22 10:18:50.129744: step: 168/77, loss: 0.00560009153559804 2023-01-22 10:18:51.624069: step: 172/77, loss: 0.030477937310934067 2023-01-22 10:18:53.072129: step: 176/77, loss: 0.005781487561762333 2023-01-22 10:18:54.495016: step: 180/77, loss: 0.002156471135094762 2023-01-22 10:18:55.965711: step: 184/77, loss: 0.01489571388810873 2023-01-22 10:18:57.467792: step: 188/77, loss: 0.01714280992746353 2023-01-22 10:18:58.941701: step: 192/77, loss: 0.014948004856705666 2023-01-22 10:19:00.462422: step: 196/77, loss: 0.0004343294131103903 2023-01-22 10:19:01.961508: step: 200/77, loss: 0.0468200147151947 2023-01-22 10:19:03.452803: step: 204/77, loss: 0.00364921847358346 2023-01-22 10:19:04.891391: step: 208/77, loss: 0.006669571157544851 2023-01-22 10:19:06.387095: step: 212/77, loss: 0.03793036937713623 2023-01-22 10:19:07.885081: step: 216/77, loss: 0.0024284652899950743 2023-01-22 10:19:09.358408: step: 220/77, loss: 0.002885939320549369 2023-01-22 10:19:10.806824: step: 224/77, loss: 0.05066582188010216 2023-01-22 10:19:12.243875: step: 228/77, loss: 0.0012308210134506226 2023-01-22 10:19:13.757090: step: 232/77, loss: 0.03428790718317032 2023-01-22 10:19:15.219016: step: 236/77, loss: 0.005835465621203184 2023-01-22 10:19:16.673405: step: 240/77, loss: 0.035248078405857086 2023-01-22 10:19:18.140824: step: 244/77, loss: 0.011382810771465302 2023-01-22 10:19:19.589736: step: 248/77, loss: 0.0028067068196833134 2023-01-22 10:19:21.068507: step: 252/77, loss: 0.00026669781072996557 2023-01-22 10:19:22.533886: step: 256/77, loss: 0.0037234588526189327 2023-01-22 10:19:24.050808: step: 260/77, loss: 0.004366753157228231 2023-01-22 10:19:25.499945: step: 264/77, loss: 0.01697009615600109 2023-01-22 10:19:26.949893: step: 268/77, loss: 0.06928707659244537 2023-01-22 10:19:28.400433: step: 272/77, loss: 0.00224512442946434 2023-01-22 10:19:29.857972: step: 276/77, loss: 0.00021704388200305402 2023-01-22 10:19:31.388842: step: 280/77, loss: 0.007994461804628372 2023-01-22 10:19:32.823801: step: 284/77, loss: 0.0010824492201209068 2023-01-22 10:19:34.245772: step: 288/77, loss: 0.005412569735199213 2023-01-22 10:19:35.806700: step: 292/77, loss: 0.014690631069242954 2023-01-22 10:19:37.242870: step: 296/77, loss: 0.07107909023761749 2023-01-22 10:19:38.777966: step: 300/77, loss: 0.003367634490132332 2023-01-22 10:19:40.239352: step: 304/77, loss: 0.0009482861496508121 2023-01-22 10:19:41.703710: step: 308/77, loss: 0.0010064172092825174 2023-01-22 10:19:43.174489: step: 312/77, loss: 0.03311848267912865 2023-01-22 10:19:44.692696: step: 316/77, loss: 0.00782004464417696 2023-01-22 10:19:46.176001: step: 320/77, loss: 0.00015082456229720265 2023-01-22 10:19:47.673799: step: 324/77, loss: 0.00022852692927699536 2023-01-22 10:19:49.112254: step: 328/77, loss: 0.012201538309454918 2023-01-22 10:19:50.646138: step: 332/77, loss: 0.04965365678071976 2023-01-22 
10:19:52.105628: step: 336/77, loss: 0.00021343353728298098 2023-01-22 10:19:53.572261: step: 340/77, loss: 0.0032562892884016037 2023-01-22 10:19:54.963476: step: 344/77, loss: 0.0028177325148135424 2023-01-22 10:19:56.423629: step: 348/77, loss: 0.00022388799698092043 2023-01-22 10:19:57.884664: step: 352/77, loss: 0.00048760930076241493 2023-01-22 10:19:59.369867: step: 356/77, loss: 0.033975739032030106 2023-01-22 10:20:00.831599: step: 360/77, loss: 0.07614095509052277 2023-01-22 10:20:02.312234: step: 364/77, loss: 0.020101221278309822 2023-01-22 10:20:03.838808: step: 368/77, loss: 0.013422614894807339 2023-01-22 10:20:05.239102: step: 372/77, loss: 0.01529695838689804 2023-01-22 10:20:06.760845: step: 376/77, loss: 0.0003953626146540046 2023-01-22 10:20:08.302425: step: 380/77, loss: 0.026907581835985184 2023-01-22 10:20:09.697669: step: 384/77, loss: 0.0025897109881043434 2023-01-22 10:20:11.110393: step: 388/77, loss: 0.020302483811974525
==================================================
Loss: 0.017
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04686584651435266, 'epoch': 15}
Test Chinese: {'template': {'p': 0.8783783783783784, 'r': 0.4961832061068702, 'f1': 0.6341463414634146}, 'slot': {'p': 0.3829787234042553, 'r': 0.015530629853321829, 'f1': 0.029850746268656716}, 'combined': 0.01892974153622133, 'epoch': 15}
Dev Korean: {'template': {'p': 1.0, 'r': 0.4666666666666667, 'f1': 0.6363636363636364}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04473558076370027, 'epoch': 15}
Test Korean: {'template': {'p': 0.8666666666666667, 'r': 0.4961832061068702, 'f1': 0.6310679611650485}, 'slot': {'p': 0.40425531914893614, 'r': 0.01639344262295082, 'f1': 0.03150912106135987}, 'combined': 0.019884396786295062, 'epoch': 15}
Dev Russian: {'template': {'p': 1.0, 'r': 0.48333333333333334, 'f1': 0.6516853932584269}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.045812681424142486, 'epoch': 15}
Test Russian: {'template': {'p': 0.8666666666666667, 'r': 0.4961832061068702, 'f1': 0.6310679611650485}, 'slot': {'p': 0.40425531914893614, 'r': 0.01639344262295082, 'f1': 0.03150912106135987}, 'combined': 0.019884396786295062, 'epoch': 15}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 15}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 15}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 15}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 16
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:21:49.087104: step: 4/77, loss: 0.02240251749753952 2023-01-22 10:21:50.579512: step: 8/77, loss: 0.02771635353565216 2023-01-22 10:21:52.074614: step: 12/77, loss: 0.00028855435084551573 2023-01-22 10:21:53.531972: step: 16/77, loss: 0.0013435601722449064 2023-01-22 10:21:55.029070: step: 20/77, loss: 0.04604283347725868 2023-01-22 10:21:56.453931: step: 24/77, loss: 0.00026520941173657775 2023-01-22 10:21:57.856494: step: 28/77, loss: 0.00022504819207824767 2023-01-22 10:21:59.314397: step: 32/77, loss: 0.0024469781201332808 2023-01-22 10:22:00.808457: step: 36/77, loss: 0.06819023936986923 2023-01-22 10:22:02.278441: step: 40/77, loss: 0.004971928894519806 2023-01-22 10:22:03.759450: step: 44/77, loss: 0.016732260584831238 2023-01-22 10:22:05.227349: step: 48/77, loss: 0.0035637859255075455 2023-01-22 10:22:06.669151: step: 52/77, loss: 2.8777996703865938e-05 2023-01-22 10:22:08.111469: step: 56/77, loss: 0.00130385288503021 2023-01-22 10:22:09.648229: step: 60/77, loss: 0.011374658904969692 2023-01-22 10:22:11.141418: step: 64/77, loss: 0.0029410512652248144 2023-01-22 10:22:12.653778: step: 68/77, loss: 0.023517031222581863 2023-01-22 10:22:14.117974: step: 72/77, loss: 0.029339302331209183 2023-01-22 10:22:15.672121: step: 76/77, loss: 0.01185949519276619 2023-01-22 10:22:17.132855: step: 80/77, loss: 0.05707313492894173 2023-01-22 10:22:18.625866: step: 84/77, loss: 0.06443598121404648 2023-01-22 10:22:20.161964: step: 88/77, loss: 0.000236800653510727 2023-01-22 10:22:21.655372: step: 92/77, loss: 0.0030893548391759396 2023-01-22 10:22:23.103936: step: 96/77, loss: 0.005407319869846106 2023-01-22 10:22:24.528421: step: 100/77, loss: 0.0009325853898189962 2023-01-22 10:22:26.009321: step: 104/77, loss: 0.0016836941940709949 2023-01-22 10:22:27.511433: step: 108/77, loss: 0.0005430675228126347 2023-01-22 10:22:29.024538: step: 112/77, loss: 0.0018001548014581203
2023-01-22 10:22:30.572362: step: 116/77, loss: 0.011139596812427044 2023-01-22 10:22:32.059689: step: 120/77, loss: 0.009026577696204185 2023-01-22 10:22:33.506495: step: 124/77, loss: 0.013764426112174988 2023-01-22 10:22:35.006296: step: 128/77, loss: 0.13721467554569244 2023-01-22 10:22:36.566741: step: 132/77, loss: 0.05707092583179474 2023-01-22 10:22:38.104605: step: 136/77, loss: 0.002023660810664296 2023-01-22 10:22:39.595179: step: 140/77, loss: 0.0009388293838128448 2023-01-22 10:22:41.027507: step: 144/77, loss: 1.7927572116605006e-05 2023-01-22 10:22:42.536172: step: 148/77, loss: 0.007441079709678888 2023-01-22 10:22:44.012024: step: 152/77, loss: 1.1405087207094766e-05 2023-01-22 10:22:45.541252: step: 156/77, loss: 0.01594318449497223 2023-01-22 10:22:47.021604: step: 160/77, loss: 0.009689363650977612 2023-01-22 10:22:48.449929: step: 164/77, loss: 0.00965337734669447 2023-01-22 10:22:49.930536: step: 168/77, loss: 0.00018008879851549864 2023-01-22 10:22:51.448546: step: 172/77, loss: 0.0013620827812701464 2023-01-22 10:22:52.952215: step: 176/77, loss: 0.009706384502351284 2023-01-22 10:22:54.391223: step: 180/77, loss: 0.00011257622099947184 2023-01-22 10:22:55.880072: step: 184/77, loss: 5.58066058147233e-05 2023-01-22 10:22:57.297110: step: 188/77, loss: 0.0010068108094856143 2023-01-22 10:22:58.790821: step: 192/77, loss: 0.0033298954367637634 2023-01-22 10:23:00.289719: step: 196/77, loss: 0.002340085804462433 2023-01-22 10:23:01.745843: step: 200/77, loss: 4.1069710277952254e-05 2023-01-22 10:23:03.282817: step: 204/77, loss: 4.162640470894985e-05 2023-01-22 10:23:04.765075: step: 208/77, loss: 0.0057720597833395 2023-01-22 10:23:06.270424: step: 212/77, loss: 0.045091234147548676 2023-01-22 10:23:07.754255: step: 216/77, loss: 0.0005430678138509393 2023-01-22 10:23:09.288184: step: 220/77, loss: 0.02244877815246582 2023-01-22 10:23:10.710744: step: 224/77, loss: 0.015751518309116364 2023-01-22 10:23:12.235928: step: 228/77, loss: 0.03594440221786499 2023-01-22 10:23:13.829878: step: 232/77, loss: 0.001045037410221994 2023-01-22 10:23:15.372299: step: 236/77, loss: 0.00036588916555047035 2023-01-22 10:23:16.872610: step: 240/77, loss: 0.0008532066131010652 2023-01-22 10:23:18.355323: step: 244/77, loss: 1.4190770343702752e-05 2023-01-22 10:23:19.878832: step: 248/77, loss: 0.002486072713509202 2023-01-22 10:23:21.343072: step: 252/77, loss: 0.0003959749301429838 2023-01-22 10:23:22.844932: step: 256/77, loss: 4.7969617298804224e-05 2023-01-22 10:23:24.320397: step: 260/77, loss: 0.02460763230919838 2023-01-22 10:23:25.820703: step: 264/77, loss: 0.026778748258948326 2023-01-22 10:23:27.336074: step: 268/77, loss: 0.00020856253104284406 2023-01-22 10:23:28.819756: step: 272/77, loss: 0.06297913193702698 2023-01-22 10:23:30.322172: step: 276/77, loss: 0.02976994216442108 2023-01-22 10:23:31.762539: step: 280/77, loss: 0.09210612624883652 2023-01-22 10:23:33.283080: step: 284/77, loss: 0.0030472474172711372 2023-01-22 10:23:34.783501: step: 288/77, loss: 0.0022248579189181328 2023-01-22 10:23:36.235558: step: 292/77, loss: 0.0329422801733017 2023-01-22 10:23:37.700492: step: 296/77, loss: 0.0012546322541311383 2023-01-22 10:23:39.182385: step: 300/77, loss: 0.012546733021736145 2023-01-22 10:23:40.612988: step: 304/77, loss: 0.002945370739325881 2023-01-22 10:23:42.077323: step: 308/77, loss: 0.0017267671646550298 2023-01-22 10:23:43.578848: step: 312/77, loss: 0.0015203878283500671 2023-01-22 10:23:44.993636: step: 316/77, loss: 0.009436001069843769 2023-01-22 10:23:46.399012: step: 320/77, loss: 0.00017988021136261523
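One pattern worth noting at this point in the run: the "Current best result:" blocks still report epoch 3, even though later epochs post higher test scores (epoch 13's Test Chinese combined of 0.0196 beats epoch 3's 0.0174) while only tying its dev combined score of 0.05179909351586346. That is consistent with model selection on the dev combined score under a strictly-greater comparison. A hedged sketch follows; best and update_best are illustrative names only, not the actual train.py internals.

```python
# Hedged sketch of best-result tracking keyed on the dev 'combined' score.
best = {}  # language -> {'dev': ..., 'test': ..., 'sample': ...}

def update_best(language, dev, test, sample):
    prev = best.get(language)
    # Strictly greater: later epochs that merely tie the best dev combined
    # score (as epochs 12-18 do here) leave the epoch-3 snapshot in place.
    if prev is None or dev['combined'] > prev['dev']['combined']:
        best[language] = {'dev': dev, 'test': test, 'sample': sample}
```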
2023-01-22 10:23:47.852022: step: 324/77, loss: 0.0006251703598536551 2023-01-22 10:23:49.338391: step: 328/77, loss: 0.004496883600950241 2023-01-22 10:23:50.853374: step: 332/77, loss: 0.0008476347429677844 2023-01-22 10:23:52.281094: step: 336/77, loss: 0.0237599965184927 2023-01-22 10:23:53.782087: step: 340/77, loss: 0.0004581378889270127 2023-01-22 10:23:55.250676: step: 344/77, loss: 0.0019197032088413835 2023-01-22 10:23:56.754088: step: 348/77, loss: 5.6227523600682616e-05 2023-01-22 10:23:58.239510: step: 352/77, loss: 2.4058379494817927e-05 2023-01-22 10:23:59.711188: step: 356/77, loss: 0.048158254474401474 2023-01-22 10:24:01.135308: step: 360/77, loss: 0.07158225774765015 2023-01-22 10:24:02.685934: step: 364/77, loss: 0.006104794796556234 2023-01-22 10:24:04.195897: step: 368/77, loss: 0.010368650779128075 2023-01-22 10:24:05.641645: step: 372/77, loss: 0.018325114622712135 2023-01-22 10:24:07.120327: step: 376/77, loss: 0.07363635301589966 2023-01-22 10:24:08.596537: step: 380/77, loss: 0.002189134480431676 2023-01-22 10:24:10.007692: step: 384/77, loss: 0.0005988850025460124 2023-01-22 10:24:11.416972: step: 388/77, loss: 0.00013736364780925214
==================================================
Loss: 0.014
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Chinese: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.3953488372093023, 'r': 0.014667817083692839, 'f1': 0.028286189683860232}, 'combined': 0.018489509354328148, 'epoch': 16}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Korean: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.3953488372093023, 'r': 0.014667817083692839, 'f1': 0.028286189683860232}, 'combined': 0.018489509354328148, 'epoch': 16}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 16}
Test Russian: {'template': {'p': 0.8918918918918919, 'r': 0.5038167938931297, 'f1': 0.6439024390243903}, 'slot': {'p': 0.3902439024390244, 'r': 0.013805004314063849, 'f1': 0.02666666666666667}, 'combined': 0.017170731707317075, 'epoch': 16}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 16}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 16}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 16}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 17
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:25:48.928608: step: 4/77, loss: 0.004096877295523882 2023-01-22 10:25:50.382327: step: 8/77, loss: 0.004776291083544493 2023-01-22 10:25:51.851843: step: 12/77, loss: 0.005598829127848148 2023-01-22 10:25:53.397545: step: 16/77, loss: 0.002439585980027914 2023-01-22 10:25:54.830907: step: 20/77, loss: 3.45842563547194e-05 2023-01-22 10:25:56.342019: step: 24/77, loss: 0.0013430280378088355 2023-01-22 10:25:57.838077: step: 28/77, loss: 0.0018386875744909048 2023-01-22 10:25:59.340234: step: 32/77, loss: 0.0015910633374005556 2023-01-22 10:26:00.822160: step: 36/77, loss: 0.0009483025060035288 2023-01-22 10:26:02.307074: step: 40/77, loss: 0.007047180086374283 2023-01-22 10:26:03.775497: step: 44/77, loss: 0.00018717434431891888 2023-01-22 10:26:05.287874: step: 48/77, loss: 0.001299167750403285 2023-01-22 10:26:06.740831: step: 52/77, loss: 0.0006834648665972054 2023-01-22 10:26:08.246690: step: 56/77, loss: 0.002170323161408305 2023-01-22 10:26:09.755994: step: 60/77, loss: 0.002382697071880102 2023-01-22 10:26:11.190824: step: 64/77, loss: 0.000798621098510921 2023-01-22 10:26:12.736417: step: 68/77, loss: 0.003686311189085245 2023-01-22 10:26:14.173148: step: 72/77, loss: 0.06565132737159729 2023-01-22 10:26:15.663639: step: 76/77, loss: 0.005340840667486191 2023-01-22 10:26:17.107722: step: 80/77, loss: 0.0011033548507839441 2023-01-22 10:26:18.625794: step: 84/77, loss: 0.0021340290550142527 2023-01-22 10:26:20.053498: step: 88/77, loss: 0.0003320540417917073 2023-01-22 10:26:21.542824: step: 92/77, loss: 0.001845193444751203 2023-01-22 10:26:22.987302: step: 96/77, loss: 0.0006272225291468203 2023-01-22
10:26:24.466647: step: 100/77, loss: 0.0001363737101200968 2023-01-22 10:26:25.984172: step: 104/77, loss: 5.750182026531547e-05 2023-01-22 10:26:27.467967: step: 108/77, loss: 0.05694460868835449 2023-01-22 10:26:28.920359: step: 112/77, loss: 0.050074946135282516 2023-01-22 10:26:30.341338: step: 116/77, loss: 0.0011947468155995011 2023-01-22 10:26:31.817471: step: 120/77, loss: 0.0027722204104065895 2023-01-22 10:26:33.265920: step: 124/77, loss: 0.0018663202645257115 2023-01-22 10:26:34.800114: step: 128/77, loss: 0.0026401153299957514 2023-01-22 10:26:36.229299: step: 132/77, loss: 0.006284075789153576 2023-01-22 10:26:37.690425: step: 136/77, loss: 0.0007562717655673623 2023-01-22 10:26:39.222600: step: 140/77, loss: 0.035824716091156006 2023-01-22 10:26:40.694185: step: 144/77, loss: 0.026698850095272064 2023-01-22 10:26:42.212932: step: 148/77, loss: 0.00026182507281191647 2023-01-22 10:26:43.723836: step: 152/77, loss: 0.00024711183505132794 2023-01-22 10:26:45.146418: step: 156/77, loss: 0.042546238750219345 2023-01-22 10:26:46.645987: step: 160/77, loss: 0.13341380655765533 2023-01-22 10:26:48.156068: step: 164/77, loss: 0.00846744142472744 2023-01-22 10:26:49.728979: step: 168/77, loss: 0.08409827202558517 2023-01-22 10:26:51.207086: step: 172/77, loss: 0.0067952219396829605 2023-01-22 10:26:52.636013: step: 176/77, loss: 0.0042352499440312386 2023-01-22 10:26:53.992657: step: 180/77, loss: 0.00029177599935792387 2023-01-22 10:26:55.483299: step: 184/77, loss: 0.00332057336345315 2023-01-22 10:26:56.990854: step: 188/77, loss: 0.002562582725659013 2023-01-22 10:26:58.475472: step: 192/77, loss: 0.14900614321231842 2023-01-22 10:26:59.924495: step: 196/77, loss: 0.006481688003987074 2023-01-22 10:27:01.375823: step: 200/77, loss: 0.00015579882892780006 2023-01-22 10:27:02.834500: step: 204/77, loss: 0.019360896199941635 2023-01-22 10:27:04.318172: step: 208/77, loss: 0.004088334273546934 2023-01-22 10:27:05.775189: step: 212/77, loss: 0.00019180560775566846 2023-01-22 10:27:07.130242: step: 216/77, loss: 0.0003879494033753872 2023-01-22 10:27:08.596243: step: 220/77, loss: 0.0012808794854208827 2023-01-22 10:27:10.053527: step: 224/77, loss: 0.00027431672788225114 2023-01-22 10:27:11.567656: step: 228/77, loss: 0.00030285323737189174 2023-01-22 10:27:13.101313: step: 232/77, loss: 9.394896915182471e-05 2023-01-22 10:27:14.576755: step: 236/77, loss: 0.005379479378461838 2023-01-22 10:27:16.052769: step: 240/77, loss: 0.00025858887238427997 2023-01-22 10:27:17.516734: step: 244/77, loss: 0.020550856366753578 2023-01-22 10:27:19.005600: step: 248/77, loss: 0.005961176007986069 2023-01-22 10:27:20.532066: step: 252/77, loss: 0.0003792982315644622 2023-01-22 10:27:22.033793: step: 256/77, loss: 0.00023473313194699585 2023-01-22 10:27:23.505371: step: 260/77, loss: 0.0024014206137508154 2023-01-22 10:27:24.978855: step: 264/77, loss: 2.4127570213750005e-05 2023-01-22 10:27:26.405641: step: 268/77, loss: 0.0009776921942830086 2023-01-22 10:27:27.841492: step: 272/77, loss: 0.010743441991508007 2023-01-22 10:27:29.327677: step: 276/77, loss: 0.007811921648681164 2023-01-22 10:27:30.818547: step: 280/77, loss: 0.002057426143437624 2023-01-22 10:27:32.326106: step: 284/77, loss: 0.0005255554569885135 2023-01-22 10:27:33.849796: step: 288/77, loss: 9.64771315921098e-05 2023-01-22 10:27:35.343539: step: 292/77, loss: 0.0022660386748611927 2023-01-22 10:27:36.888176: step: 296/77, loss: 0.0022366391494870186 2023-01-22 10:27:38.301350: step: 300/77, loss: 0.0016711915377527475 2023-01-22 
10:27:39.763957: step: 304/77, loss: 0.0007188515737652779 2023-01-22 10:27:41.233937: step: 308/77, loss: 0.020181728526949883 2023-01-22 10:27:42.733469: step: 312/77, loss: 0.00023877693456597626 2023-01-22 10:27:44.165251: step: 316/77, loss: 0.002911501796916127 2023-01-22 10:27:45.644958: step: 320/77, loss: 0.0006735376664437354 2023-01-22 10:27:47.087094: step: 324/77, loss: 0.0001844169746618718 2023-01-22 10:27:48.539597: step: 328/77, loss: 0.06680993735790253 2023-01-22 10:27:49.990544: step: 332/77, loss: 0.00025244487915188074 2023-01-22 10:27:51.502300: step: 336/77, loss: 0.0013559891376644373 2023-01-22 10:27:52.981337: step: 340/77, loss: 0.021186305209994316 2023-01-22 10:27:54.457792: step: 344/77, loss: 0.0040461840108036995 2023-01-22 10:27:55.901273: step: 348/77, loss: 0.0058979676105082035 2023-01-22 10:27:57.372716: step: 352/77, loss: 1.0261072020512074e-05 2023-01-22 10:27:58.884580: step: 356/77, loss: 0.002143705729395151 2023-01-22 10:28:00.399434: step: 360/77, loss: 0.06320623308420181 2023-01-22 10:28:01.911329: step: 364/77, loss: 0.00015051935042720288 2023-01-22 10:28:03.373955: step: 368/77, loss: 1.2711274393950589e-05 2023-01-22 10:28:04.806653: step: 372/77, loss: 0.21453514695167542 2023-01-22 10:28:06.278679: step: 376/77, loss: 0.014170726761221886 2023-01-22 10:28:07.680431: step: 380/77, loss: 6.4277210185537115e-06 2023-01-22 10:28:09.185018: step: 384/77, loss: 0.022878451272845268 2023-01-22 10:28:10.661181: step: 388/77, loss: 0.02404708042740822
==================================================
Loss: 0.013
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5666666666666667, 'f1': 0.7234042553191489}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05085442919642522, 'epoch': 17}
Test Chinese: {'template': {'p': 0.8552631578947368, 'r': 0.4961832061068702, 'f1': 0.6280193236714976}, 'slot': {'p': 0.4166666666666667, 'r': 0.012942191544434857, 'f1': 0.02510460251046025}, 'combined': 0.015766175489661027, 'epoch': 17}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5333333333333333, 'f1': 0.6956521739130436}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04890349201497669, 'epoch': 17}
Test Korean: {'template': {'p': 0.8648648648648649, 'r': 0.48854961832061067, 'f1': 0.624390243902439}, 'slot': {'p': 0.43243243243243246, 'r': 0.013805004314063849, 'f1': 0.026755852842809368}, 'combined': 0.016706093482339507, 'epoch': 17}
Dev Russian: {'template': {'p': 1.0, 'r': 0.55, 'f1': 0.7096774193548387}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.04988944951527864, 'epoch': 17}
Test Russian: {'template': {'p': 0.8648648648648649, 'r': 0.48854961832061067, 'f1': 0.624390243902439}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.017750224324985724, 'epoch': 17}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 17}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 17}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 17}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
******************************
Epoch: 18
command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4
2023-01-22 10:29:48.180955: step: 4/77, loss: 0.0028756712563335896 2023-01-22 10:29:49.681624: step: 8/77, loss: 0.045157160609960556 2023-01-22 10:29:51.131110: step: 12/77, loss: 0.012379593215882778 2023-01-22 10:29:52.583507: step: 16/77, loss: 0.02100391685962677 2023-01-22 10:29:54.066888: step: 20/77, loss: 0.0064195566810667515 2023-01-22 10:29:55.536601: step: 24/77, loss: 0.003383409697562456 2023-01-22 10:29:56.956472: step: 28/77, loss: 0.000816413841675967 2023-01-22 10:29:58.392001: step: 32/77, loss: 0.013582956977188587 2023-01-22 10:29:59.903787: step: 36/77, loss: 0.010575263760983944 2023-01-22 10:30:01.331508: step: 40/77, loss: 0.0007761925226077437 2023-01-22 10:30:02.823424: step: 44/77, loss: 0.005285558756440878 2023-01-22 10:30:04.326003: step: 48/77, loss: 0.001977279782295227 2023-01-22 10:30:05.784741: step: 52/77, loss: 0.001641746610403061 2023-01-22 10:30:07.248377: step: 56/77, loss: 6.50401216262253e-06 2023-01-22 10:30:08.736596: step: 60/77, loss: 0.02756490930914879 2023-01-22 10:30:10.193239: step: 64/77, loss: 0.0005078827380202711 2023-01-22 10:30:11.586704: step: 68/77, loss: 0.005699621979147196 2023-01-22 10:30:13.039388: step: 72/77, loss: 0.0027254566084593534 2023-01-22 10:30:14.535446: step: 76/77, loss: 2.5649595045251772e-05 2023-01-22 10:30:16.029067: step: 80/77, loss:
1.5124951460165903e-05 2023-01-22 10:30:17.455929: step: 84/77, loss: 0.03806731477379799 2023-01-22 10:30:18.982063: step: 88/77, loss: 0.001081199967302382 2023-01-22 10:30:20.440525: step: 92/77, loss: 1.783421430445742e-05 2023-01-22 10:30:21.820902: step: 96/77, loss: 0.04326368868350983 2023-01-22 10:30:23.302073: step: 100/77, loss: 0.006358175538480282 2023-01-22 10:30:24.793837: step: 104/77, loss: 0.0016182229155674577 2023-01-22 10:30:26.249851: step: 108/77, loss: 0.007719434332102537 2023-01-22 10:30:27.726569: step: 112/77, loss: 0.08727987110614777 2023-01-22 10:30:29.187950: step: 116/77, loss: 6.309882155619562e-05 2023-01-22 10:30:30.631011: step: 120/77, loss: 0.009384812787175179 2023-01-22 10:30:32.137298: step: 124/77, loss: 0.0004629144095815718 2023-01-22 10:30:33.653093: step: 128/77, loss: 0.0019581338856369257 2023-01-22 10:30:35.099860: step: 132/77, loss: 0.0017693155677989125 2023-01-22 10:30:36.531468: step: 136/77, loss: 5.7300036132801324e-05 2023-01-22 10:30:38.036982: step: 140/77, loss: 0.0002848027506843209 2023-01-22 10:30:39.474825: step: 144/77, loss: 0.0009141655173152685 2023-01-22 10:30:40.988389: step: 148/77, loss: 0.0034261501859873533 2023-01-22 10:30:42.514699: step: 152/77, loss: 0.09311222285032272 2023-01-22 10:30:43.955620: step: 156/77, loss: 0.0011829708237200975 2023-01-22 10:30:45.406166: step: 160/77, loss: 0.0006447265041060746 2023-01-22 10:30:46.934448: step: 164/77, loss: 0.007240463979542255 2023-01-22 10:30:48.416397: step: 168/77, loss: 0.020239055156707764 2023-01-22 10:30:49.862809: step: 172/77, loss: 0.00021344999549910426 2023-01-22 10:30:51.300551: step: 176/77, loss: 0.0005965419695712626 2023-01-22 10:30:52.796396: step: 180/77, loss: 0.0009922974277287722 2023-01-22 10:30:54.344392: step: 184/77, loss: 0.0048019420355558395 2023-01-22 10:30:55.834993: step: 188/77, loss: 0.0009401412680745125 2023-01-22 10:30:57.339817: step: 192/77, loss: 0.0022180110681802034 2023-01-22 10:30:58.901148: step: 196/77, loss: 0.00930736307054758 2023-01-22 10:31:00.376876: step: 200/77, loss: 0.0036351527087390423 2023-01-22 10:31:01.914836: step: 204/77, loss: 0.00014229273074306548 2023-01-22 10:31:03.316731: step: 208/77, loss: 0.0007904646918177605 2023-01-22 10:31:04.782902: step: 212/77, loss: 0.003755400190129876 2023-01-22 10:31:06.248385: step: 216/77, loss: 9.308937296736985e-05 2023-01-22 10:31:07.705802: step: 220/77, loss: 3.324115095892921e-05 2023-01-22 10:31:09.168544: step: 224/77, loss: 0.0007430835394188762 2023-01-22 10:31:10.599090: step: 228/77, loss: 0.01505276933312416 2023-01-22 10:31:12.149587: step: 232/77, loss: 0.0008096634410321712 2023-01-22 10:31:13.577717: step: 236/77, loss: 0.0005306456587277353 2023-01-22 10:31:15.041090: step: 240/77, loss: 0.0006985433283261955 2023-01-22 10:31:16.589508: step: 244/77, loss: 0.0006532074767164886 2023-01-22 10:31:18.052755: step: 248/77, loss: 1.2684857210842893e-05 2023-01-22 10:31:19.513571: step: 252/77, loss: 3.6554571124725044e-05 2023-01-22 10:31:21.011949: step: 256/77, loss: 0.0003153039433527738 2023-01-22 10:31:22.545178: step: 260/77, loss: 0.0004511611768975854 2023-01-22 10:31:24.014108: step: 264/77, loss: 0.04475802928209305 2023-01-22 10:31:25.471703: step: 268/77, loss: 8.685257489560172e-05 2023-01-22 10:31:26.929568: step: 272/77, loss: 0.00020022218814119697 2023-01-22 10:31:28.408586: step: 276/77, loss: 0.004787555430084467 2023-01-22 10:31:29.870544: step: 280/77, loss: 0.0038822393398731947 2023-01-22 10:31:31.382574: step: 284/77, loss: 
0.05055629462003708 2023-01-22 10:31:32.881393: step: 288/77, loss: 0.002044880762696266 2023-01-22 10:31:34.363647: step: 292/77, loss: 0.00030714101740159094 2023-01-22 10:31:35.864789: step: 296/77, loss: 0.0073666879907250404 2023-01-22 10:31:37.398460: step: 300/77, loss: 0.000893754418939352 2023-01-22 10:31:38.853227: step: 304/77, loss: 0.004460779018700123 2023-01-22 10:31:40.371823: step: 308/77, loss: 0.0016178349032998085 2023-01-22 10:31:41.778276: step: 312/77, loss: 0.0005586580373346806 2023-01-22 10:31:43.220256: step: 316/77, loss: 0.0004077023477293551 2023-01-22 10:31:44.751048: step: 320/77, loss: 0.0034647120628505945 2023-01-22 10:31:46.263212: step: 324/77, loss: 0.0008419217774644494 2023-01-22 10:31:47.688296: step: 328/77, loss: 0.00011264411295996979 2023-01-22 10:31:49.213388: step: 332/77, loss: 0.0011876357020810246 2023-01-22 10:31:50.741184: step: 336/77, loss: 0.011139290407299995 2023-01-22 10:31:52.210712: step: 340/77, loss: 0.0006338097155094147 2023-01-22 10:31:53.737739: step: 344/77, loss: 0.00022002437617629766 2023-01-22 10:31:55.198256: step: 348/77, loss: 4.005714799859561e-05 2023-01-22 10:31:56.682209: step: 352/77, loss: 0.0015249974094331264 2023-01-22 10:31:58.153963: step: 356/77, loss: 0.003730625845491886 2023-01-22 10:31:59.581256: step: 360/77, loss: 0.0007901267963461578 2023-01-22 10:32:01.014277: step: 364/77, loss: 0.00011056935181841254 2023-01-22 10:32:02.551118: step: 368/77, loss: 0.005171317607164383 2023-01-22 10:32:04.024005: step: 372/77, loss: 0.005288761109113693 2023-01-22 10:32:05.541735: step: 376/77, loss: 0.006044285371899605 2023-01-22 10:32:07.047853: step: 380/77, loss: 0.0016455070581287146 2023-01-22 10:32:08.542578: step: 384/77, loss: 0.0005599971045739949 2023-01-22 10:32:09.939662: step: 388/77, loss: 0.003552208887413144 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Chinese: {'template': {'p': 0.881578947368421, 'r': 0.5114503816793893, 'f1': 0.6473429951690821}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.01838735324623959, 'epoch': 18} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Korean: {'template': {'p': 0.8933333333333333, 'r': 0.5114503816793893, 'f1': 0.6504854368932039}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018507535442986556, 'epoch': 18} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 18} Test Russian: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01861339216407239, 'epoch': 18} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 18} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 
0.0}, 'combined': 0.0, 'epoch': 18} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 18} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 19 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:33:47.467246: step: 4/77, loss: 0.00014098669635131955 2023-01-22 10:33:48.952491: step: 8/77, loss: 0.037430573254823685 2023-01-22 10:33:50.398166: step: 12/77, loss: 9.992123523261398e-06 2023-01-22 10:33:51.828765: step: 16/77, loss: 0.027012495324015617 2023-01-22 10:33:53.305199: step: 20/77, loss: 0.0012325807474553585 2023-01-22 10:33:54.807292: step: 24/77, loss: 0.00010384414781583473 2023-01-22 10:33:56.272673: step: 28/77, loss: 0.05970339477062225 2023-01-22 10:33:57.817392: step: 32/77, loss: 0.010155356489121914 2023-01-22 10:33:59.332498: step: 36/77, loss: 0.019072813913226128 2023-01-22 10:34:00.802820: step: 40/77, loss: 0.00041514128679409623 2023-01-22 10:34:02.307913: step: 44/77, loss: 0.0005081351846456528 2023-01-22 10:34:03.727666: step: 48/77, loss: 5.959239206276834e-05 2023-01-22 10:34:05.237569: step: 52/77, loss: 0.00889910850673914 2023-01-22 10:34:06.694602: step: 56/77, loss: 0.03261871263384819 2023-01-22 10:34:08.127470: step: 60/77, loss: 0.00019574278849177063 2023-01-22 
10:34:09.586844: step: 64/77, loss: 7.374716369668022e-05 2023-01-22 10:34:11.050907: step: 68/77, loss: 0.001230890746228397 2023-01-22 10:34:12.534082: step: 72/77, loss: 0.03803974390029907 2023-01-22 10:34:13.958492: step: 76/77, loss: 0.00023354985751211643 2023-01-22 10:34:15.491259: step: 80/77, loss: 0.0221809484064579 2023-01-22 10:34:16.897001: step: 84/77, loss: 0.00018142740009352565 2023-01-22 10:34:18.380261: step: 88/77, loss: 0.027372796088457108 2023-01-22 10:34:19.888165: step: 92/77, loss: 0.0009225103422068059 2023-01-22 10:34:21.393646: step: 96/77, loss: 0.0002753528533503413 2023-01-22 10:34:22.842602: step: 100/77, loss: 0.04084169119596481 2023-01-22 10:34:24.296854: step: 104/77, loss: 0.033095426857471466 2023-01-22 10:34:25.771233: step: 108/77, loss: 0.019844137132167816 2023-01-22 10:34:27.194050: step: 112/77, loss: 0.00010307527554687113 2023-01-22 10:34:28.586338: step: 116/77, loss: 0.07973741739988327 2023-01-22 10:34:30.084998: step: 120/77, loss: 0.007755426689982414 2023-01-22 10:34:31.504352: step: 124/77, loss: 0.0020310066174715757 2023-01-22 10:34:32.973322: step: 128/77, loss: 0.0014131821226328611 2023-01-22 10:34:34.428911: step: 132/77, loss: 0.0007622442790307105 2023-01-22 10:34:35.906926: step: 136/77, loss: 0.0013501873472705483 2023-01-22 10:34:37.328943: step: 140/77, loss: 0.003476389916613698 2023-01-22 10:34:38.870499: step: 144/77, loss: 0.004717133939266205 2023-01-22 10:34:40.337320: step: 148/77, loss: 0.034023940563201904 2023-01-22 10:34:41.867982: step: 152/77, loss: 0.0025565975811332464 2023-01-22 10:34:43.480068: step: 156/77, loss: 0.0021379715763032436 2023-01-22 10:34:44.944678: step: 160/77, loss: 0.008405786007642746 2023-01-22 10:34:46.430278: step: 164/77, loss: 0.0002632577088661492 2023-01-22 10:34:47.963457: step: 168/77, loss: 0.007901491597294807 2023-01-22 10:34:49.458001: step: 172/77, loss: 0.0023893998004496098 2023-01-22 10:34:50.982301: step: 176/77, loss: 0.00036748748971149325 2023-01-22 10:34:52.466862: step: 180/77, loss: 6.505564670078456e-06 2023-01-22 10:34:54.001775: step: 184/77, loss: 0.002341366373002529 2023-01-22 10:34:55.457055: step: 188/77, loss: 0.00022118315973784775 2023-01-22 10:34:56.878916: step: 192/77, loss: 1.2627208889171015e-05 2023-01-22 10:34:58.382286: step: 196/77, loss: 0.0001342456671409309 2023-01-22 10:34:59.928471: step: 200/77, loss: 0.028776878491044044 2023-01-22 10:35:01.422118: step: 204/77, loss: 0.00026175566017627716 2023-01-22 10:35:02.935137: step: 208/77, loss: 0.0006376546807587147 2023-01-22 10:35:04.399715: step: 212/77, loss: 0.0004763914621435106 2023-01-22 10:35:05.865642: step: 216/77, loss: 3.821781137958169e-05 2023-01-22 10:35:07.361119: step: 220/77, loss: 0.00015172931307461113 2023-01-22 10:35:08.864488: step: 224/77, loss: 1.782135996108991e-06 2023-01-22 10:35:10.341322: step: 228/77, loss: 6.226929144759197e-06 2023-01-22 10:35:11.830719: step: 232/77, loss: 0.0005442426190711558 2023-01-22 10:35:13.240025: step: 236/77, loss: 6.345907604554668e-05 2023-01-22 10:35:14.729698: step: 240/77, loss: 8.847442222759128e-05 2023-01-22 10:35:16.192416: step: 244/77, loss: 0.0030105954501777887 2023-01-22 10:35:17.646593: step: 248/77, loss: 1.6285044694086537e-05 2023-01-22 10:35:19.202189: step: 252/77, loss: 0.00028308553737588227 2023-01-22 10:35:20.622448: step: 256/77, loss: 7.794688281137496e-05 2023-01-22 10:35:22.062470: step: 260/77, loss: 0.004675406496971846 2023-01-22 10:35:23.504771: step: 264/77, loss: 0.009595395065844059 2023-01-22 
10:35:25.005204: step: 268/77, loss: 0.09094759076833725 2023-01-22 10:35:26.534284: step: 272/77, loss: 0.00025511524290777743 2023-01-22 10:35:28.033300: step: 276/77, loss: 8.626453200122342e-05 2023-01-22 10:35:29.471625: step: 280/77, loss: 2.5752302462933585e-05 2023-01-22 10:35:30.939939: step: 284/77, loss: 0.00047939157229848206 2023-01-22 10:35:32.482471: step: 288/77, loss: 0.0007043445948511362 2023-01-22 10:35:33.957545: step: 292/77, loss: 3.5091293284494895e-06 2023-01-22 10:35:35.416243: step: 296/77, loss: 0.004549146164208651 2023-01-22 10:35:36.947090: step: 300/77, loss: 0.0013034878065809608 2023-01-22 10:35:38.392898: step: 304/77, loss: 7.059721247060224e-05 2023-01-22 10:35:39.790265: step: 308/77, loss: 0.028981227427721024 2023-01-22 10:35:41.234569: step: 312/77, loss: 5.120259811519645e-05 2023-01-22 10:35:42.642097: step: 316/77, loss: 0.0703129917383194 2023-01-22 10:35:44.140421: step: 320/77, loss: 8.140176760207396e-06 2023-01-22 10:35:45.633609: step: 324/77, loss: 0.02896547131240368 2023-01-22 10:35:47.122657: step: 328/77, loss: 1.8530841771280393e-05 2023-01-22 10:35:48.606897: step: 332/77, loss: 0.0004325577465351671 2023-01-22 10:35:50.064352: step: 336/77, loss: 0.0001609406026545912 2023-01-22 10:35:51.573361: step: 340/77, loss: 0.0032217581756412983 2023-01-22 10:35:53.083181: step: 344/77, loss: 0.00014991796342656016 2023-01-22 10:35:54.520923: step: 348/77, loss: 0.041126806288957596 2023-01-22 10:35:55.996219: step: 352/77, loss: 0.00013102588127367198 2023-01-22 10:35:57.517895: step: 356/77, loss: 0.00045040063560009 2023-01-22 10:35:59.039117: step: 360/77, loss: 5.666089418809861e-05 2023-01-22 10:36:00.475913: step: 364/77, loss: 0.0007076942129060626 2023-01-22 10:36:01.911602: step: 368/77, loss: 0.0002744776720646769 2023-01-22 10:36:03.441859: step: 372/77, loss: 1.4699688108521514e-05 2023-01-22 10:36:04.894781: step: 376/77, loss: 0.006560564041137695 2023-01-22 10:36:06.363936: step: 380/77, loss: 0.017250539734959602 2023-01-22 10:36:07.808280: step: 384/77, loss: 0.007945503108203411 2023-01-22 10:36:09.260792: step: 388/77, loss: 0.0006101108156144619 ================================================== Loss: 0.009 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19} Test Chinese: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.018704634282523728, 'epoch': 19} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19} Test Korean: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018623488501406722, 'epoch': 19} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 19} Test Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018623488501406722, 'epoch': 19} 
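
[Note on the metric dicts in this log: in every Dev/Test row, the 'combined' value equals the product of the template F1 and the slot F1 (e.g. 0.7368421052631579 * 0.07029876977152899 = 0.05179909351586346). A minimal sketch of that scoring convention, assuming standard precision/recall/F1; combined_score is a hypothetical helper name, and the actual evaluation code is not part of this log:]

def f1(p: float, r: float) -> float:
    # standard F1: harmonic mean of precision and recall
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def combined_score(template: dict, slot: dict) -> float:
    # 'combined' in the rows above matches template_f1 * slot_f1
    return f1(template['p'], template['r']) * f1(slot['p'], slot['r'])

# Reproducing the Dev rows above:
# combined_score({'p': 1.0, 'r': 0.5833333333333334},
#                {'p': 0.5, 'r': 0.03780718336483932})
# -> 0.0517990935..., matching the logged 'combined' value.
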
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 19} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 19} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 19} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 20 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:37:46.815008: step: 4/77, loss: 8.427352440776303e-05 2023-01-22 10:37:48.261401: step: 8/77, loss: 0.003391423961147666 2023-01-22 10:37:49.808242: step: 12/77, loss: 0.00022610797896049917 2023-01-22 10:37:51.198622: step: 16/77, loss: 0.012872123159468174 2023-01-22 10:37:52.649560: step: 20/77, loss: 7.742694288026541e-05 2023-01-22 10:37:54.112477: step: 24/77, loss: 0.00036254245787858963 2023-01-22 10:37:55.629096: step: 28/77, loss: 0.0004164370766375214 2023-01-22 10:37:57.080144: step: 32/77, loss: 0.0007234312943182886 2023-01-22 10:37:58.611175: step: 36/77, loss: 0.005771125201135874 2023-01-22 10:37:59.992025: step: 40/77, loss: 0.003106000367552042 2023-01-22 10:38:01.405953: step: 44/77, loss: 0.017629720270633698 
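
[The step losses above are logged every 4 steps (4/77, 8/77, ...), which matches --accumulate_step 4 in the command: with --batch_size 10, gradients from 4 micro-batches are accumulated before each optimizer update, for an effective batch size of about 40. Below is a minimal, self-contained sketch of that standard PyTorch pattern; the model, optimizer, and data here are dummies, not the repository's actual train.py:]

import torch
from torch import nn

model = nn.Linear(8, 1)                      # stand-in for the XLM-R model
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
loader = [(torch.randn(10, 8), torch.randn(10, 1)) for _ in range(77)]
accumulate_step = 4                          # --accumulate_step 4

optimizer.zero_grad()
for step, (x, y) in enumerate(loader, start=1):
    loss = nn.functional.mse_loss(model(x), y)
    (loss / accumulate_step).backward()      # scale so the update averages
    if step % accumulate_step == 0:          # one update per 4 micro-batches
        optimizer.step()
        optimizer.zero_grad()
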
2023-01-22 10:38:02.917309: step: 48/77, loss: 0.02136806771159172 2023-01-22 10:38:04.403492: step: 52/77, loss: 0.00046486067003570497 2023-01-22 10:38:05.890750: step: 56/77, loss: 0.001176183228380978 2023-01-22 10:38:07.386876: step: 60/77, loss: 0.006411808542907238 2023-01-22 10:38:08.858107: step: 64/77, loss: 0.0006058060680516064 2023-01-22 10:38:10.397854: step: 68/77, loss: 2.3365575543721206e-05 2023-01-22 10:38:11.887713: step: 72/77, loss: 0.008123316802084446 2023-01-22 10:38:13.376812: step: 76/77, loss: 1.5268948118318804e-05 2023-01-22 10:38:14.873572: step: 80/77, loss: 6.315793143585324e-05 2023-01-22 10:38:16.385091: step: 84/77, loss: 0.00023585847520735115 2023-01-22 10:38:17.865332: step: 88/77, loss: 0.00752745708450675 2023-01-22 10:38:19.324544: step: 92/77, loss: 0.0002199528826167807 2023-01-22 10:38:20.799224: step: 96/77, loss: 0.0004972168244421482 2023-01-22 10:38:22.324815: step: 100/77, loss: 0.0022977537009865046 2023-01-22 10:38:23.818340: step: 104/77, loss: 8.510050975019112e-05 2023-01-22 10:38:25.420014: step: 108/77, loss: 0.000818159431219101 2023-01-22 10:38:26.899654: step: 112/77, loss: 3.944763739127666e-05 2023-01-22 10:38:28.370914: step: 116/77, loss: 0.005611030850559473 2023-01-22 10:38:29.871928: step: 120/77, loss: 0.027019493281841278 2023-01-22 10:38:31.360614: step: 124/77, loss: 0.00016905099619179964 2023-01-22 10:38:32.835751: step: 128/77, loss: 0.000452701176982373 2023-01-22 10:38:34.355525: step: 132/77, loss: 0.001107914256863296 2023-01-22 10:38:35.857712: step: 136/77, loss: 3.4447726648068056e-05 2023-01-22 10:38:37.290596: step: 140/77, loss: 0.0005560338613577187 2023-01-22 10:38:38.803115: step: 144/77, loss: 0.00014413423195946962 2023-01-22 10:38:40.303276: step: 148/77, loss: 0.00014533651119563729 2023-01-22 10:38:41.734176: step: 152/77, loss: 0.005465753376483917 2023-01-22 10:38:43.268672: step: 156/77, loss: 1.3785129340249114e-05 2023-01-22 10:38:44.747951: step: 160/77, loss: 0.0049934606067836285 2023-01-22 10:38:46.255569: step: 164/77, loss: 0.001129616517573595 2023-01-22 10:38:47.728890: step: 168/77, loss: 0.003189380746334791 2023-01-22 10:38:49.140955: step: 172/77, loss: 6.476558610302163e-06 2023-01-22 10:38:50.627731: step: 176/77, loss: 0.02472682110965252 2023-01-22 10:38:52.115509: step: 180/77, loss: 5.072112344350899e-06 2023-01-22 10:38:53.587894: step: 184/77, loss: 0.0020358862821012735 2023-01-22 10:38:55.013436: step: 188/77, loss: 0.0002398270444246009 2023-01-22 10:38:56.521548: step: 192/77, loss: 0.05871470272541046 2023-01-22 10:38:58.090059: step: 196/77, loss: 0.002355430740863085 2023-01-22 10:38:59.608702: step: 200/77, loss: 0.0007327854400500655 2023-01-22 10:39:01.150621: step: 204/77, loss: 0.0006290523451752961 2023-01-22 10:39:02.664318: step: 208/77, loss: 0.00018614571308717132 2023-01-22 10:39:04.156304: step: 212/77, loss: 0.00010260358249070123 2023-01-22 10:39:05.644525: step: 216/77, loss: 0.022568479180336 2023-01-22 10:39:07.082054: step: 220/77, loss: 0.0005458329687826335 2023-01-22 10:39:08.518473: step: 224/77, loss: 0.0008564339368604124 2023-01-22 10:39:09.944923: step: 228/77, loss: 0.0035173415672034025 2023-01-22 10:39:11.369137: step: 232/77, loss: 0.016759483143687248 2023-01-22 10:39:12.891125: step: 236/77, loss: 0.03251827880740166 2023-01-22 10:39:14.285812: step: 240/77, loss: 0.00995706394314766 2023-01-22 10:39:15.706137: step: 244/77, loss: 0.0006280227098613977 2023-01-22 10:39:17.174872: step: 248/77, loss: 0.04479119926691055 2023-01-22 
10:39:18.697670: step: 252/77, loss: 0.00023071595933288336 2023-01-22 10:39:20.169548: step: 256/77, loss: 0.0007097484776750207 2023-01-22 10:39:21.655003: step: 260/77, loss: 0.00010665479203453287 2023-01-22 10:39:23.087072: step: 264/77, loss: 0.0001831296249292791 2023-01-22 10:39:24.547085: step: 268/77, loss: 0.0060518416576087475 2023-01-22 10:39:26.003227: step: 272/77, loss: 0.0017663311446085572 2023-01-22 10:39:27.502967: step: 276/77, loss: 0.0004453969595488161 2023-01-22 10:39:28.935773: step: 280/77, loss: 0.0013352977111935616 2023-01-22 10:39:30.393193: step: 284/77, loss: 0.002567255636677146 2023-01-22 10:39:31.865004: step: 288/77, loss: 1.810535104596056e-05 2023-01-22 10:39:33.351426: step: 292/77, loss: 0.0020445568952709436 2023-01-22 10:39:34.890176: step: 296/77, loss: 1.9323746528243646e-05 2023-01-22 10:39:36.353161: step: 300/77, loss: 0.002380709396675229 2023-01-22 10:39:37.796121: step: 304/77, loss: 0.0007935311878100038 2023-01-22 10:39:39.289001: step: 308/77, loss: 0.0022404869087040424 2023-01-22 10:39:40.768735: step: 312/77, loss: 0.05067255347967148 2023-01-22 10:39:42.233221: step: 316/77, loss: 0.00011900630488526076 2023-01-22 10:39:43.683017: step: 320/77, loss: 0.0002168344653910026 2023-01-22 10:39:45.145736: step: 324/77, loss: 0.0004526789125520736 2023-01-22 10:39:46.640382: step: 328/77, loss: 0.000879471015650779 2023-01-22 10:39:48.141532: step: 332/77, loss: 0.0030491615179926157 2023-01-22 10:39:49.620774: step: 336/77, loss: 0.006456117145717144 2023-01-22 10:39:51.060034: step: 340/77, loss: 0.010428737848997116 2023-01-22 10:39:52.473742: step: 344/77, loss: 0.0005070596816949546 2023-01-22 10:39:54.007572: step: 348/77, loss: 0.00023114164650905877 2023-01-22 10:39:55.514182: step: 352/77, loss: 0.03697359934449196 2023-01-22 10:39:56.950542: step: 356/77, loss: 0.011408147402107716 2023-01-22 10:39:58.480492: step: 360/77, loss: 0.015400312840938568 2023-01-22 10:39:59.952487: step: 364/77, loss: 1.4918558008503169e-05 2023-01-22 10:40:01.397760: step: 368/77, loss: 0.0003141985216643661 2023-01-22 10:40:02.905273: step: 372/77, loss: 0.03331225365400314 2023-01-22 10:40:04.314531: step: 376/77, loss: 0.0009111135732382536 2023-01-22 10:40:05.759317: step: 380/77, loss: 0.0017461779061704874 2023-01-22 10:40:07.192451: step: 384/77, loss: 1.2481615158321802e-05 2023-01-22 10:40:08.654080: step: 388/77, loss: 0.032015111297369 ================================================== Loss: 0.006 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Chinese: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018118611961114927, 'epoch': 20} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Korean: {'template': {'p': 0.9130434782608695, 'r': 0.48091603053435117, 'f1': 0.63}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.01793969849246231, 'epoch': 20} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 
0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 20} Test Russian: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.4594594594594595, 'r': 0.014667817083692839, 'f1': 0.028428093645484948}, 'combined': 0.018103462620010315, 'epoch': 20} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 20} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 20} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 20} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 21 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:41:46.260960: step: 4/77, loss: 8.27891708468087e-05 2023-01-22 10:41:47.713312: step: 8/77, loss: 0.00043526801164261997 2023-01-22 10:41:49.175838: step: 12/77, loss: 9.745121133164503e-06 2023-01-22 10:41:50.681466: step: 16/77, loss: 2.9559068934759125e-05 2023-01-22 10:41:52.188549: step: 20/77, loss: 0.0003522019542288035 2023-01-22 10:41:53.687402: step: 24/77, loss: 0.00022706623713020235 2023-01-22 10:41:55.162911: step: 
28/77, loss: 0.009877260774374008 2023-01-22 10:41:56.648861: step: 32/77, loss: 0.006384177133440971 2023-01-22 10:41:58.168194: step: 36/77, loss: 0.00014239516167435795 2023-01-22 10:41:59.662708: step: 40/77, loss: 0.0005055277142673731 2023-01-22 10:42:01.154604: step: 44/77, loss: 0.05409137159585953 2023-01-22 10:42:02.681036: step: 48/77, loss: 3.8041166590119246e-06 2023-01-22 10:42:04.216185: step: 52/77, loss: 8.18027911009267e-06 2023-01-22 10:42:05.732137: step: 56/77, loss: 9.932726243278012e-05 2023-01-22 10:42:07.216522: step: 60/77, loss: 0.00014925358118489385 2023-01-22 10:42:08.671214: step: 64/77, loss: 0.0002240837930003181 2023-01-22 10:42:10.112030: step: 68/77, loss: 5.057529415353201e-05 2023-01-22 10:42:11.595596: step: 72/77, loss: 9.591747584636323e-06 2023-01-22 10:42:13.187785: step: 76/77, loss: 6.70036433803034e-06 2023-01-22 10:42:14.694819: step: 80/77, loss: 0.0002989994827657938 2023-01-22 10:42:16.163273: step: 84/77, loss: 0.0007450602715834975 2023-01-22 10:42:17.691420: step: 88/77, loss: 0.02040993794798851 2023-01-22 10:42:19.130444: step: 92/77, loss: 1.1034421731892508e-05 2023-01-22 10:42:20.552855: step: 96/77, loss: 6.633730663452297e-05 2023-01-22 10:42:21.984782: step: 100/77, loss: 0.00017172204388771206 2023-01-22 10:42:23.478155: step: 104/77, loss: 2.5287281459895894e-05 2023-01-22 10:42:24.884432: step: 108/77, loss: 0.0019874691497534513 2023-01-22 10:42:26.380672: step: 112/77, loss: 0.0028940557967871428 2023-01-22 10:42:27.900227: step: 116/77, loss: 1.3265988854982425e-05 2023-01-22 10:42:29.356593: step: 120/77, loss: 1.3763572496827692e-05 2023-01-22 10:42:30.805388: step: 124/77, loss: 5.456419967231341e-05 2023-01-22 10:42:32.312310: step: 128/77, loss: 3.199049024260603e-05 2023-01-22 10:42:33.866203: step: 132/77, loss: 0.00015985312347766012 2023-01-22 10:42:35.315153: step: 136/77, loss: 0.0018992533441632986 2023-01-22 10:42:36.799876: step: 140/77, loss: 1.3396806025411934e-05 2023-01-22 10:42:38.306162: step: 144/77, loss: 0.0006888278294354677 2023-01-22 10:42:39.714245: step: 148/77, loss: 0.001850732951425016 2023-01-22 10:42:41.148175: step: 152/77, loss: 0.0010139790829271078 2023-01-22 10:42:42.651792: step: 156/77, loss: 4.4166750740259886e-05 2023-01-22 10:42:44.073134: step: 160/77, loss: 3.8711961678927764e-05 2023-01-22 10:42:45.532675: step: 164/77, loss: 4.7252186050172895e-05 2023-01-22 10:42:47.004631: step: 168/77, loss: 0.0015337056247517467 2023-01-22 10:42:48.424379: step: 172/77, loss: 1.5749639715068042e-05 2023-01-22 10:42:49.938807: step: 176/77, loss: 2.1206040401011705e-05 2023-01-22 10:42:51.384996: step: 180/77, loss: 0.00034399217111058533 2023-01-22 10:42:52.859078: step: 184/77, loss: 5.2080049499636516e-05 2023-01-22 10:42:54.423030: step: 188/77, loss: 0.017733346670866013 2023-01-22 10:42:55.931426: step: 192/77, loss: 0.07597655057907104 2023-01-22 10:42:57.424294: step: 196/77, loss: 0.24772121012210846 2023-01-22 10:42:58.873163: step: 200/77, loss: 9.717977081891149e-05 2023-01-22 10:43:00.381739: step: 204/77, loss: 0.00013973054592497647 2023-01-22 10:43:01.925613: step: 208/77, loss: 0.024061929434537888 2023-01-22 10:43:03.356389: step: 212/77, loss: 0.0001402836642228067 2023-01-22 10:43:04.868165: step: 216/77, loss: 0.00044021164649166167 2023-01-22 10:43:06.338120: step: 220/77, loss: 4.2444888094905764e-05 2023-01-22 10:43:07.803736: step: 224/77, loss: 0.00022847886430099607 2023-01-22 10:43:09.294916: step: 228/77, loss: 0.0002529481425881386 2023-01-22 10:43:10.722973: 
step: 232/77, loss: 2.9905932024121284e-06 2023-01-22 10:43:12.241175: step: 236/77, loss: 0.002287093782797456 2023-01-22 10:43:13.769439: step: 240/77, loss: 5.147743650013581e-06 2023-01-22 10:43:15.242280: step: 244/77, loss: 0.0014095694059506059 2023-01-22 10:43:16.676466: step: 248/77, loss: 2.9975240977364592e-05 2023-01-22 10:43:18.274774: step: 252/77, loss: 0.013966652564704418 2023-01-22 10:43:19.724634: step: 256/77, loss: 0.019837338477373123 2023-01-22 10:43:21.265972: step: 260/77, loss: 0.0011463196715340018 2023-01-22 10:43:22.767287: step: 264/77, loss: 9.620159835321829e-05 2023-01-22 10:43:24.238630: step: 268/77, loss: 0.00427041482180357 2023-01-22 10:43:25.650170: step: 272/77, loss: 0.00024166949151549488 2023-01-22 10:43:27.147406: step: 276/77, loss: 0.0034298989921808243 2023-01-22 10:43:28.602038: step: 280/77, loss: 2.2885571524966508e-05 2023-01-22 10:43:30.096756: step: 284/77, loss: 0.0002919553080573678 2023-01-22 10:43:31.603571: step: 288/77, loss: 0.0018753650365397334 2023-01-22 10:43:33.115641: step: 292/77, loss: 0.0005949947517365217 2023-01-22 10:43:34.563939: step: 296/77, loss: 0.003798791905865073 2023-01-22 10:43:36.038575: step: 300/77, loss: 1.837530180637259e-05 2023-01-22 10:43:37.530346: step: 304/77, loss: 9.359028808830772e-06 2023-01-22 10:43:38.936101: step: 308/77, loss: 0.03410165011882782 2023-01-22 10:43:40.395260: step: 312/77, loss: 0.005747949704527855 2023-01-22 10:43:41.819541: step: 316/77, loss: 0.0071901543997228146 2023-01-22 10:43:43.231232: step: 320/77, loss: 0.014834891073405743 2023-01-22 10:43:44.753605: step: 324/77, loss: 3.8532862163265236e-06 2023-01-22 10:43:46.190024: step: 328/77, loss: 0.0001279481512028724 2023-01-22 10:43:47.613397: step: 332/77, loss: 0.0006787201855331659 2023-01-22 10:43:49.089590: step: 336/77, loss: 0.00013982687960378826 2023-01-22 10:43:50.577316: step: 340/77, loss: 2.8490972908912227e-05 2023-01-22 10:43:52.058746: step: 344/77, loss: 0.014455622062087059 2023-01-22 10:43:53.593250: step: 348/77, loss: 0.13285669684410095 2023-01-22 10:43:55.021899: step: 352/77, loss: 0.00019437828450463712 2023-01-22 10:43:56.524627: step: 356/77, loss: 0.002372733550146222 2023-01-22 10:43:57.999105: step: 360/77, loss: 0.009212651289999485 2023-01-22 10:43:59.485515: step: 364/77, loss: 0.060634076595306396 2023-01-22 10:44:00.938078: step: 368/77, loss: 1.0017057320510503e-05 2023-01-22 10:44:02.422290: step: 372/77, loss: 0.0002461381664033979 2023-01-22 10:44:03.887337: step: 376/77, loss: 2.5817340429057367e-05 2023-01-22 10:44:05.385852: step: 380/77, loss: 0.0008293167338706553 2023-01-22 10:44:06.897802: step: 384/77, loss: 7.01223143551033e-06 2023-01-22 10:44:08.418895: step: 388/77, loss: 0.03312050551176071 ================================================== Loss: 0.009 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Chinese: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.48484848484848486, 'r': 0.013805004314063849, 'f1': 0.02684563758389262}, 'combined': 0.016436104643199563, 'epoch': 21} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Korean: {'template': {'p': 
0.9206349206349206, 'r': 0.44274809160305345, 'f1': 0.5979381443298969}, 'slot': {'p': 0.5294117647058824, 'r': 0.015530629853321829, 'f1': 0.03017602682313495}, 'combined': 0.018043397481874505, 'epoch': 21} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 21} Test Russian: {'template': {'p': 0.9206349206349206, 'r': 0.44274809160305345, 'f1': 0.5979381443298969}, 'slot': {'p': 0.48484848484848486, 'r': 0.013805004314063849, 'f1': 0.02684563758389262}, 'combined': 0.01605203072026569, 'epoch': 21} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 21} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 21} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 21} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 22 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:45:46.035701: step: 4/77, loss: 0.005217066500335932 2023-01-22 10:45:47.521034: step: 
8/77, loss: 1.7110400222009048e-05 2023-01-22 10:45:48.948442: step: 12/77, loss: 0.013908019289374352 2023-01-22 10:45:50.427459: step: 16/77, loss: 0.060203030705451965 2023-01-22 10:45:51.935794: step: 20/77, loss: 0.0001018143302644603 2023-01-22 10:45:53.427188: step: 24/77, loss: 0.00033130700467154384 2023-01-22 10:45:54.952034: step: 28/77, loss: 0.0022158983629196882 2023-01-22 10:45:56.469937: step: 32/77, loss: 0.001340493792667985 2023-01-22 10:45:57.906194: step: 36/77, loss: 9.457357373321429e-05 2023-01-22 10:45:59.383523: step: 40/77, loss: 0.007253519259393215 2023-01-22 10:46:00.849519: step: 44/77, loss: 0.01193111203610897 2023-01-22 10:46:02.372645: step: 48/77, loss: 1.947039709193632e-05 2023-01-22 10:46:03.801868: step: 52/77, loss: 0.010435977019369602 2023-01-22 10:46:05.238874: step: 56/77, loss: 0.00014408468268811703 2023-01-22 10:46:06.700451: step: 60/77, loss: 0.01907634176313877 2023-01-22 10:46:08.168128: step: 64/77, loss: 0.048381563276052475 2023-01-22 10:46:09.631522: step: 68/77, loss: 0.0009525776840746403 2023-01-22 10:46:11.105055: step: 72/77, loss: 0.040768206119537354 2023-01-22 10:46:12.605681: step: 76/77, loss: 0.0001661547867115587 2023-01-22 10:46:14.023985: step: 80/77, loss: 0.0006874087848700583 2023-01-22 10:46:15.520601: step: 84/77, loss: 2.096245952998288e-05 2023-01-22 10:46:16.996651: step: 88/77, loss: 0.007326844148337841 2023-01-22 10:46:18.447909: step: 92/77, loss: 0.024552563205361366 2023-01-22 10:46:19.919325: step: 96/77, loss: 0.0018240232020616531 2023-01-22 10:46:21.378666: step: 100/77, loss: 1.764373701007571e-05 2023-01-22 10:46:22.844928: step: 104/77, loss: 0.0007256006938405335 2023-01-22 10:46:24.306709: step: 108/77, loss: 0.0015230000717565417 2023-01-22 10:46:25.826887: step: 112/77, loss: 2.4019012926146388e-05 2023-01-22 10:46:27.242229: step: 116/77, loss: 3.794354779529385e-05 2023-01-22 10:46:28.717545: step: 120/77, loss: 0.0011226541828364134 2023-01-22 10:46:30.152797: step: 124/77, loss: 0.014624684117734432 2023-01-22 10:46:31.610332: step: 128/77, loss: 0.0008393382304348052 2023-01-22 10:46:33.055361: step: 132/77, loss: 3.3838718991319183e-06 2023-01-22 10:46:34.582570: step: 136/77, loss: 0.0315176360309124 2023-01-22 10:46:35.991403: step: 140/77, loss: 0.0003097353910561651 2023-01-22 10:46:37.404203: step: 144/77, loss: 3.469519651844166e-05 2023-01-22 10:46:38.838823: step: 148/77, loss: 8.082071144599468e-05 2023-01-22 10:46:40.309947: step: 152/77, loss: 0.001497301273047924 2023-01-22 10:46:41.793851: step: 156/77, loss: 3.961978109146003e-06 2023-01-22 10:46:43.296613: step: 160/77, loss: 0.004871972370892763 2023-01-22 10:46:44.748311: step: 164/77, loss: 0.0010909107513725758 2023-01-22 10:46:46.190170: step: 168/77, loss: 8.584916213294491e-05 2023-01-22 10:46:47.657676: step: 172/77, loss: 0.010070906020700932 2023-01-22 10:46:49.110798: step: 176/77, loss: 0.0017261668108403683 2023-01-22 10:46:50.622274: step: 180/77, loss: 0.0006850729114376009 2023-01-22 10:46:52.089231: step: 184/77, loss: 0.029952459037303925 2023-01-22 10:46:53.554849: step: 188/77, loss: 1.568080551805906e-05 2023-01-22 10:46:55.110074: step: 192/77, loss: 0.02392057701945305 2023-01-22 10:46:56.621176: step: 196/77, loss: 0.00026258424622938037 2023-01-22 10:46:58.089191: step: 200/77, loss: 0.018205387517809868 2023-01-22 10:46:59.562583: step: 204/77, loss: 3.9935000017976563e-07 2023-01-22 10:47:01.016711: step: 208/77, loss: 0.0023173026274889708 2023-01-22 10:47:02.509962: step: 212/77, loss: 
0.017787277698516846 2023-01-22 10:47:03.947135: step: 216/77, loss: 4.1314960981253535e-06 2023-01-22 10:47:05.433769: step: 220/77, loss: 4.357820944278501e-05 2023-01-22 10:47:06.868569: step: 224/77, loss: 7.372977415798232e-05 2023-01-22 10:47:08.363146: step: 228/77, loss: 0.0021139492746442556 2023-01-22 10:47:09.778319: step: 232/77, loss: 0.00020499885431490839 2023-01-22 10:47:11.230431: step: 236/77, loss: 4.148268635617569e-05 2023-01-22 10:47:12.688310: step: 240/77, loss: 6.6961051743419375e-06 2023-01-22 10:47:14.192734: step: 244/77, loss: 0.11798547953367233 2023-01-22 10:47:15.606475: step: 248/77, loss: 0.0022454713471233845 2023-01-22 10:47:17.158456: step: 252/77, loss: 0.061752669513225555 2023-01-22 10:47:18.615345: step: 256/77, loss: 9.347809827886522e-05 2023-01-22 10:47:20.137645: step: 260/77, loss: 2.206732824561186e-05 2023-01-22 10:47:21.647235: step: 264/77, loss: 0.09364143759012222 2023-01-22 10:47:23.158667: step: 268/77, loss: 1.2169822184660006e-05 2023-01-22 10:47:24.647298: step: 272/77, loss: 6.98716685292311e-05 2023-01-22 10:47:26.151685: step: 276/77, loss: 8.472028639516793e-06 2023-01-22 10:47:27.635176: step: 280/77, loss: 0.028671029955148697 2023-01-22 10:47:29.021400: step: 284/77, loss: 4.439530675881542e-05 2023-01-22 10:47:30.514335: step: 288/77, loss: 3.467196165729547e-06 2023-01-22 10:47:31.976657: step: 292/77, loss: 0.027677757665514946 2023-01-22 10:47:33.438624: step: 296/77, loss: 0.07804442197084427 2023-01-22 10:47:34.901520: step: 300/77, loss: 0.04617539048194885 2023-01-22 10:47:36.396773: step: 304/77, loss: 8.539891132386401e-05 2023-01-22 10:47:37.839360: step: 308/77, loss: 4.726528914034134e-06 2023-01-22 10:47:39.331385: step: 312/77, loss: 0.0016289966879412532 2023-01-22 10:47:40.769533: step: 316/77, loss: 8.53421715873992e-06 2023-01-22 10:47:42.256643: step: 320/77, loss: 0.0001842157798819244 2023-01-22 10:47:43.692356: step: 324/77, loss: 0.013489406555891037 2023-01-22 10:47:45.146983: step: 328/77, loss: 0.0041824206709861755 2023-01-22 10:47:46.640311: step: 332/77, loss: 5.475835860124789e-05 2023-01-22 10:47:48.152521: step: 336/77, loss: 0.0028862636536359787 2023-01-22 10:47:49.663469: step: 340/77, loss: 0.015338265337049961 2023-01-22 10:47:51.184036: step: 344/77, loss: 4.800434908247553e-05 2023-01-22 10:47:52.654454: step: 348/77, loss: 0.003050778992474079 2023-01-22 10:47:54.133441: step: 352/77, loss: 0.00028969853883609176 2023-01-22 10:47:55.629571: step: 356/77, loss: 6.305790066107875e-06 2023-01-22 10:47:57.111041: step: 360/77, loss: 3.759312312467955e-05 2023-01-22 10:47:58.594796: step: 364/77, loss: 0.0016704229637980461 2023-01-22 10:48:00.069631: step: 368/77, loss: 7.791905227350071e-05 2023-01-22 10:48:01.542572: step: 372/77, loss: 0.0015094865811988711 2023-01-22 10:48:03.044665: step: 376/77, loss: 0.0017382418736815453 2023-01-22 10:48:04.515659: step: 380/77, loss: 5.582388894254109e-06 2023-01-22 10:48:05.951555: step: 384/77, loss: 4.877383980783634e-05 2023-01-22 10:48:07.394619: step: 388/77, loss: 5.6000149925239384e-05 ================================================== Loss: 0.010 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Chinese: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5161290322580645, 'r': 0.013805004314063849, 'f1': 
0.02689075630252101}, 'combined': 0.01638015612336305, 'epoch': 22} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Korean: {'template': {'p': 0.9090909090909091, 'r': 0.4580152671755725, 'f1': 0.6091370558375634}, 'slot': {'p': 0.5161290322580645, 'r': 0.013805004314063849, 'f1': 0.02689075630252101}, 'combined': 0.01638015612336305, 'epoch': 22} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 22} Test Russian: {'template': {'p': 0.9230769230769231, 'r': 0.4580152671755725, 'f1': 0.6122448979591837}, 'slot': {'p': 0.5333333333333333, 'r': 0.013805004314063849, 'f1': 0.026913372582001684}, 'combined': 0.016477575050205112, 'epoch': 22} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 22} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 22} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 22} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 23 command: python 
train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:49:45.541060: step: 4/77, loss: 4.509889549808577e-05 2023-01-22 10:49:47.028959: step: 8/77, loss: 0.0002913977950811386 2023-01-22 10:49:48.476281: step: 12/77, loss: 0.01139332540333271 2023-01-22 10:49:49.934130: step: 16/77, loss: 0.004326066002249718 2023-01-22 10:49:51.415456: step: 20/77, loss: 2.7513986424310133e-05 2023-01-22 10:49:52.855595: step: 24/77, loss: 1.3975003639643546e-05 2023-01-22 10:49:54.297714: step: 28/77, loss: 0.007240791339427233 2023-01-22 10:49:55.736797: step: 32/77, loss: 0.00022055639419704676 2023-01-22 10:49:57.202863: step: 36/77, loss: 0.005759728141129017 2023-01-22 10:49:58.609404: step: 40/77, loss: 3.934923824999714e-06 2023-01-22 10:50:00.017902: step: 44/77, loss: 0.00020762631902471185 2023-01-22 10:50:01.529022: step: 48/77, loss: 0.017258862033486366 2023-01-22 10:50:03.062821: step: 52/77, loss: 0.01816752925515175 2023-01-22 10:50:04.515338: step: 56/77, loss: 7.464773807441816e-05 2023-01-22 10:50:05.949402: step: 60/77, loss: 6.470151856774464e-05 2023-01-22 10:50:07.467569: step: 64/77, loss: 0.03022352047264576 2023-01-22 10:50:09.018127: step: 68/77, loss: 0.030080357566475868 2023-01-22 10:50:10.464031: step: 72/77, loss: 0.0005232697003521025 2023-01-22 10:50:11.964661: step: 76/77, loss: 5.357146437745541e-05 2023-01-22 10:50:13.452707: step: 80/77, loss: 3.602316428441554e-05 2023-01-22 10:50:14.890073: step: 84/77, loss: 1.2637017789529637e-05 2023-01-22 10:50:16.384997: step: 88/77, loss: 0.0014475014759227633 2023-01-22 10:50:17.904243: step: 92/77, loss: 0.036904722452163696 2023-01-22 10:50:19.368423: step: 96/77, loss: 3.237287455704063e-05 2023-01-22 10:50:20.883874: step: 100/77, loss: 0.022598525509238243 2023-01-22 10:50:22.327529: step: 104/77, loss: 0.0006198819610290229 2023-01-22 10:50:23.832179: step: 108/77, loss: 0.037170566618442535 2023-01-22 10:50:25.313949: step: 112/77, loss: 2.7292508093523793e-05 2023-01-22 10:50:26.867147: step: 116/77, loss: 0.037723708897829056 2023-01-22 10:50:28.371742: step: 120/77, loss: 0.020332274958491325 2023-01-22 10:50:29.826203: step: 124/77, loss: 0.0104548754170537 2023-01-22 10:50:31.316564: step: 128/77, loss: 0.00011534785153344274 2023-01-22 10:50:32.784453: step: 132/77, loss: 0.018704598769545555 2023-01-22 10:50:34.204643: step: 136/77, loss: 5.230163424130296e-06 2023-01-22 10:50:35.689731: step: 140/77, loss: 0.005852022208273411 2023-01-22 10:50:37.198352: step: 144/77, loss: 0.0001379873719997704 2023-01-22 10:50:38.696198: step: 148/77, loss: 0.0046800170093774796 2023-01-22 10:50:40.133192: step: 152/77, loss: 0.0003418435517232865 2023-01-22 10:50:41.668592: step: 156/77, loss: 0.0001083689639926888 2023-01-22 10:50:43.154720: step: 160/77, loss: 0.00015809066826477647 2023-01-22 10:50:44.680833: step: 164/77, loss: 3.1797608244232833e-06 2023-01-22 10:50:46.129632: step: 168/77, loss: 5.667543882736936e-05 2023-01-22 10:50:47.640642: step: 172/77, loss: 3.554875002009794e-05 2023-01-22 10:50:49.124033: step: 176/77, loss: 0.0007138706278055906 2023-01-22 10:50:50.627847: step: 180/77, loss: 0.0003262606624048203 2023-01-22 10:50:52.185328: step: 184/77, loss: 0.01835954189300537 2023-01-22 10:50:53.629484: step: 188/77, loss: 5.6338943977607414e-05 2023-01-22 10:50:55.119659: step: 192/77, loss: 2.4946128178271465e-05 2023-01-22 
10:50:56.641287: step: 196/77, loss: 8.789340063231066e-05 2023-01-22 10:50:58.106318: step: 200/77, loss: 5.236292054178193e-05 2023-01-22 10:50:59.539214: step: 204/77, loss: 0.001034987042658031 2023-01-22 10:51:01.120467: step: 208/77, loss: 1.9261453417129815e-05 2023-01-22 10:51:02.619974: step: 212/77, loss: 0.00017419336654711515 2023-01-22 10:51:04.150647: step: 216/77, loss: 0.027231387794017792 2023-01-22 10:51:05.583804: step: 220/77, loss: 0.003382456488907337 2023-01-22 10:51:07.001769: step: 224/77, loss: 0.00829127337783575 2023-01-22 10:51:08.447501: step: 228/77, loss: 0.016746779903769493 2023-01-22 10:51:09.886465: step: 232/77, loss: 0.0023354680743068457 2023-01-22 10:51:11.379341: step: 236/77, loss: 0.00019349480862729251 2023-01-22 10:51:12.835720: step: 240/77, loss: 5.5896136473165825e-05 2023-01-22 10:51:14.296286: step: 244/77, loss: 6.264793682930758e-06 2023-01-22 10:51:15.760290: step: 248/77, loss: 0.014695419929921627 2023-01-22 10:51:17.237997: step: 252/77, loss: 0.0002196079440182075 2023-01-22 10:51:18.641224: step: 256/77, loss: 0.00027947252965532243 2023-01-22 10:51:20.098273: step: 260/77, loss: 0.07945749908685684 2023-01-22 10:51:21.593445: step: 264/77, loss: 2.668962588359136e-05 2023-01-22 10:51:23.046350: step: 268/77, loss: 4.155714123044163e-06 2023-01-22 10:51:24.556253: step: 272/77, loss: 1.8812963389791548e-05 2023-01-22 10:51:26.111419: step: 276/77, loss: 0.00014113544602878392 2023-01-22 10:51:27.537163: step: 280/77, loss: 0.002064595464617014 2023-01-22 10:51:29.013373: step: 284/77, loss: 0.0031714034266769886 2023-01-22 10:51:30.504591: step: 288/77, loss: 2.7988609872409143e-05 2023-01-22 10:51:32.026582: step: 292/77, loss: 1.8803808416123502e-05 2023-01-22 10:51:33.511714: step: 296/77, loss: 0.0007328785723075271 2023-01-22 10:51:35.015364: step: 300/77, loss: 0.04188724607229233 2023-01-22 10:51:36.458758: step: 304/77, loss: 2.3720574517938076e-06 2023-01-22 10:51:37.955140: step: 308/77, loss: 0.00018719259242061526 2023-01-22 10:51:39.414541: step: 312/77, loss: 0.0003036497510038316 2023-01-22 10:51:40.894848: step: 316/77, loss: 3.775849108933471e-06 2023-01-22 10:51:42.312907: step: 320/77, loss: 0.00034413387766107917 2023-01-22 10:51:43.849009: step: 324/77, loss: 0.0007906183018349111 2023-01-22 10:51:45.317146: step: 328/77, loss: 2.5849119992926717e-05 2023-01-22 10:51:46.788342: step: 332/77, loss: 0.015596098266541958 2023-01-22 10:51:48.217962: step: 336/77, loss: 4.1424982555327006e-07 2023-01-22 10:51:49.709524: step: 340/77, loss: 4.976780473953113e-06 2023-01-22 10:51:51.218956: step: 344/77, loss: 0.0038793461862951517 2023-01-22 10:51:52.733992: step: 348/77, loss: 0.027320679277181625 2023-01-22 10:51:54.182817: step: 352/77, loss: 8.697208613739349e-06 2023-01-22 10:51:55.599592: step: 356/77, loss: 0.011141828261315823 2023-01-22 10:51:57.030460: step: 360/77, loss: 0.01932971365749836 2023-01-22 10:51:58.515125: step: 364/77, loss: 0.00039088045014068484 2023-01-22 10:51:59.992252: step: 368/77, loss: 1.4817132978350855e-05 2023-01-22 10:52:01.479802: step: 372/77, loss: 0.0013533779419958591 2023-01-22 10:52:02.984561: step: 376/77, loss: 0.000289235933450982 2023-01-22 10:52:04.485147: step: 380/77, loss: 6.28348789177835e-05 2023-01-22 10:52:05.981436: step: 384/77, loss: 0.011985783465206623 2023-01-22 10:52:07.529244: step: 388/77, loss: 9.261792001780123e-06 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 
0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Chinese: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.017776966296797325, 'epoch': 23} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Korean: {'template': {'p': 0.9, 'r': 0.48091603053435117, 'f1': 0.6268656716417911}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.017865408915189354, 'epoch': 23} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 23} Test Russian: {'template': {'p': 0.8873239436619719, 'r': 0.48091603053435117, 'f1': 0.6237623762376238}, 'slot': {'p': 0.45714285714285713, 'r': 0.013805004314063849, 'f1': 0.02680067001675042}, 'combined': 0.016717249614408677, 'epoch': 23} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 23} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 23} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 23} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 
0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 24 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:53:45.978829: step: 4/77, loss: 7.372148502327036e-06 2023-01-22 10:53:47.472245: step: 8/77, loss: 0.002566393930464983 2023-01-22 10:53:48.957498: step: 12/77, loss: 0.00975698884576559 2023-01-22 10:53:50.375647: step: 16/77, loss: 0.00012423066073097289 2023-01-22 10:53:51.861986: step: 20/77, loss: 7.469201955245808e-05 2023-01-22 10:53:53.321129: step: 24/77, loss: 1.7883849068311974e-05 2023-01-22 10:53:54.823411: step: 28/77, loss: 1.3897247299610171e-05 2023-01-22 10:53:56.280046: step: 32/77, loss: 0.000126511775306426 2023-01-22 10:53:57.759959: step: 36/77, loss: 0.00047933805035427213 2023-01-22 10:53:59.225747: step: 40/77, loss: 0.011026003398001194 2023-01-22 10:54:00.676774: step: 44/77, loss: 0.0017197122797369957 2023-01-22 10:54:02.188366: step: 48/77, loss: 0.011792484670877457 2023-01-22 10:54:03.714936: step: 52/77, loss: 0.02285902015864849 2023-01-22 10:54:05.223389: step: 56/77, loss: 0.0004306872433517128 2023-01-22 10:54:06.728509: step: 60/77, loss: 7.246661698445678e-05 2023-01-22 10:54:08.137028: step: 64/77, loss: 0.0010392410913482308 2023-01-22 10:54:09.637959: step: 68/77, loss: 1.759197584760841e-05 2023-01-22 10:54:11.180834: step: 72/77, loss: 0.00021055589604657143 2023-01-22 10:54:12.696986: step: 76/77, loss: 0.0056786141358315945 2023-01-22 10:54:14.252854: step: 80/77, loss: 2.048878059213166e-06 2023-01-22 10:54:15.644374: step: 84/77, loss: 0.00010283981100656092 2023-01-22 10:54:17.221940: step: 88/77, loss: 6.372314237523824e-05 2023-01-22 10:54:18.712716: step: 92/77, loss: 7.664812437724322e-05 2023-01-22 10:54:20.222564: step: 96/77, loss: 0.001548820873722434 2023-01-22 10:54:21.758978: step: 100/77, loss: 0.0007104914984665811 2023-01-22 10:54:23.261941: step: 104/77, loss: 0.0004165653372183442 2023-01-22 10:54:24.755480: step: 108/77, loss: 0.014097335748374462 2023-01-22 10:54:26.246147: step: 112/77, loss: 0.022480204701423645 2023-01-22 10:54:27.771441: step: 116/77, loss: 0.00048142069135792553 2023-01-22 10:54:29.263006: step: 120/77, loss: 0.002312499564141035 2023-01-22 10:54:30.704163: step: 124/77, loss: 7.586013816762716e-05 2023-01-22 10:54:32.278525: step: 128/77, loss: 1.1175856684531027e-07 2023-01-22 10:54:33.781940: step: 132/77, loss: 0.00231208186596632 2023-01-22 10:54:35.222550: step: 136/77, loss: 0.000274372985586524 2023-01-22 10:54:36.723859: step: 140/77, loss: 6.377603085638839e-07 2023-01-22 10:54:38.188085: step: 144/77, loss: 0.009999795816838741 2023-01-22 10:54:39.728848: step: 148/77, loss: 2.1054979697510134e-06 2023-01-22 10:54:41.229198: step: 152/77, loss: 0.014564106240868568 2023-01-22 10:54:42.741157: step: 156/77, loss: 0.008510801941156387 2023-01-22 10:54:44.216061: step: 160/77, loss: 0.0003925769997294992 2023-01-22 10:54:45.722506: step: 164/77, loss: 0.15949967503547668 2023-01-22 10:54:47.166104: step: 168/77, loss: 0.002451444510370493 2023-01-22 10:54:48.646593: step: 172/77, loss: 2.3691179649176775e-06 2023-01-22 10:54:50.060442: step: 176/77, loss: 
1.0802951919686166e-06 2023-01-22 10:54:51.555227: step: 180/77, loss: 3.7903109841863625e-06 2023-01-22 10:54:53.012432: step: 184/77, loss: 0.03973957896232605 2023-01-22 10:54:54.413510: step: 188/77, loss: 4.1733612306416035e-05 2023-01-22 10:54:55.913148: step: 192/77, loss: 0.04006608948111534 2023-01-22 10:54:57.370670: step: 196/77, loss: 7.700354763073847e-05 2023-01-22 10:54:58.844390: step: 200/77, loss: 9.149542165687308e-05 2023-01-22 10:55:00.308386: step: 204/77, loss: 2.995125214511063e-07 2023-01-22 10:55:01.772532: step: 208/77, loss: 0.011563536711037159 2023-01-22 10:55:03.269413: step: 212/77, loss: 0.0006145972874946892 2023-01-22 10:55:04.815723: step: 216/77, loss: 0.001980900764465332 2023-01-22 10:55:06.318053: step: 220/77, loss: 4.839376560994424e-05 2023-01-22 10:55:07.695791: step: 224/77, loss: 8.927112503442913e-05 2023-01-22 10:55:09.174045: step: 228/77, loss: 7.5996690611646045e-06 2023-01-22 10:55:10.641697: step: 232/77, loss: 1.1799502317444421e-05 2023-01-22 10:55:12.153398: step: 236/77, loss: 0.00021003717847634107 2023-01-22 10:55:13.662539: step: 240/77, loss: 8.29909276944818e-06 2023-01-22 10:55:15.144786: step: 244/77, loss: 1.0683887694540317e-06 2023-01-22 10:55:16.608715: step: 248/77, loss: 0.0002714066649787128 2023-01-22 10:55:18.112268: step: 252/77, loss: 0.015390491113066673 2023-01-22 10:55:19.595485: step: 256/77, loss: 0.027384832501411438 2023-01-22 10:55:21.120096: step: 260/77, loss: 0.02881324291229248 2023-01-22 10:55:22.631092: step: 264/77, loss: 0.00021542828471865505 2023-01-22 10:55:24.049929: step: 268/77, loss: 3.8738376133551355e-06 2023-01-22 10:55:25.539881: step: 272/77, loss: 0.005717065185308456 2023-01-22 10:55:27.040505: step: 276/77, loss: 0.0003914251283276826 2023-01-22 10:55:28.575233: step: 280/77, loss: 0.0007405400392599404 2023-01-22 10:55:30.070789: step: 284/77, loss: 2.542083166190423e-06 2023-01-22 10:55:31.476821: step: 288/77, loss: 1.5962510587996803e-05 2023-01-22 10:55:33.005914: step: 292/77, loss: 5.124079962115502e-06 2023-01-22 10:55:34.529145: step: 296/77, loss: 0.00012356540537439287 2023-01-22 10:55:36.010986: step: 300/77, loss: 0.000246676238020882 2023-01-22 10:55:37.489810: step: 304/77, loss: 0.00015589930990245193 2023-01-22 10:55:38.959947: step: 308/77, loss: 0.0006130424444563687 2023-01-22 10:55:40.466609: step: 312/77, loss: 6.260881491471082e-05 2023-01-22 10:55:41.910317: step: 316/77, loss: 0.0031988155096769333 2023-01-22 10:55:43.436855: step: 320/77, loss: 0.004418224096298218 2023-01-22 10:55:44.870999: step: 324/77, loss: 0.0009965308709070086 2023-01-22 10:55:46.345632: step: 328/77, loss: 7.539941293543961e-07 2023-01-22 10:55:47.800917: step: 332/77, loss: 0.003678370965644717 2023-01-22 10:55:49.274605: step: 336/77, loss: 0.0003343412245158106 2023-01-22 10:55:50.716902: step: 340/77, loss: 0.00017664878396317363 2023-01-22 10:55:52.252531: step: 344/77, loss: 0.04077373445034027 2023-01-22 10:55:53.696154: step: 348/77, loss: 3.8742859942431096e-07 2023-01-22 10:55:55.200112: step: 352/77, loss: 0.001119026681408286 2023-01-22 10:55:56.662061: step: 356/77, loss: 0.04550314322113991 2023-01-22 10:55:58.160807: step: 360/77, loss: 1.2471847412598436e-06 2023-01-22 10:55:59.641002: step: 364/77, loss: 1.2867828445450868e-05 2023-01-22 10:56:01.137948: step: 368/77, loss: 0.008402790874242783 2023-01-22 10:56:02.608289: step: 372/77, loss: 0.0688982605934143 2023-01-22 10:56:04.074924: step: 376/77, loss: 0.011048972606658936 2023-01-22 10:56:05.521351: step: 380/77, 
loss: 0.00045871181646361947 2023-01-22 10:56:06.923182: step: 384/77, loss: 0.0016792448004707694 2023-01-22 10:56:08.383912: step: 388/77, loss: 0.00023599954147357494 ================================================== Loss: 0.007 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Chinese: {'template': {'p': 0.88, 'r': 0.5038167938931297, 'f1': 0.640776699029126}, 'slot': {'p': 0.47058823529411764, 'r': 0.013805004314063849, 'f1': 0.02682313495389774}, 'combined': 0.017187639873371362, 'epoch': 24} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Korean: {'template': {'p': 0.8783783783783784, 'r': 0.4961832061068702, 'f1': 0.6341463414634146}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.018057768517383666, 'epoch': 24} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 24} Test Russian: {'template': {'p': 0.8888888888888888, 'r': 0.48854961832061067, 'f1': 0.6305418719211823}, 'slot': {'p': 0.4857142857142857, 'r': 0.014667817083692839, 'f1': 0.02847571189279732}, 'combined': 0.017955128681172695, 'epoch': 24} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 24} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 24} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 24} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 
0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 25 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 10:57:46.570060: step: 4/77, loss: 4.059045750182122e-05 2023-01-22 10:57:48.090887: step: 8/77, loss: 0.030835065990686417 2023-01-22 10:57:49.526094: step: 12/77, loss: 5.7991685025626794e-05 2023-01-22 10:57:50.971092: step: 16/77, loss: 4.914428427582607e-05 2023-01-22 10:57:52.498213: step: 20/77, loss: 3.874687899951823e-05 2023-01-22 10:57:53.954757: step: 24/77, loss: 2.7834380489366595e-06 2023-01-22 10:57:55.413970: step: 28/77, loss: 6.2171811805455945e-06 2023-01-22 10:57:56.926305: step: 32/77, loss: 1.473645966143522e-06 2023-01-22 10:57:58.380432: step: 36/77, loss: 6.257622590055689e-05 2023-01-22 10:57:59.838114: step: 40/77, loss: 0.0005334470770321786 2023-01-22 10:58:01.326503: step: 44/77, loss: 0.04825974255800247 2023-01-22 10:58:02.842092: step: 48/77, loss: 6.16902980254963e-07 2023-01-22 10:58:04.254427: step: 52/77, loss: 2.3803200747352093e-05 2023-01-22 10:58:05.758212: step: 56/77, loss: 1.7761865365173435e-06 2023-01-22 10:58:07.303382: step: 60/77, loss: 0.002082192339003086 2023-01-22 10:58:08.756601: step: 64/77, loss: 2.617944664962124e-05 2023-01-22 10:58:10.241173: step: 68/77, loss: 0.021399078890681267 2023-01-22 10:58:11.711068: step: 72/77, loss: 0.036595191806554794 2023-01-22 10:58:13.223274: step: 76/77, loss: 4.119883669773117e-05 2023-01-22 10:58:14.794260: step: 80/77, loss: 0.0001858456089394167 2023-01-22 10:58:16.299821: step: 84/77, loss: 2.6728503144113347e-05 2023-01-22 10:58:17.709405: step: 88/77, loss: 3.4449337817932246e-06 2023-01-22 10:58:19.165312: step: 92/77, loss: 0.00030246900860220194 2023-01-22 10:58:20.708151: step: 96/77, loss: 0.0003325316938571632 2023-01-22 10:58:22.207646: step: 100/77, loss: 9.07470052879944e-07 2023-01-22 10:58:23.674648: step: 104/77, loss: 0.00027373951161280274 2023-01-22 10:58:25.153754: step: 108/77, loss: 3.2495640880370047e-06 2023-01-22 10:58:26.642092: step: 112/77, loss: 0.0010588520672172308 2023-01-22 10:58:28.084817: step: 116/77, loss: 0.02988281659781933 2023-01-22 10:58:29.561115: step: 120/77, loss: 3.7877066461078357e-06 2023-01-22 10:58:30.990646: step: 124/77, loss: 1.281479626413784e-06 2023-01-22 10:58:32.501715: step: 128/77, loss: 0.0009377918904647231 2023-01-22 10:58:34.034873: step: 132/77, loss: 0.00032386762904934585 2023-01-22 10:58:35.512860: step: 136/77, loss: 0.0004764663754031062 2023-01-22 10:58:37.027004: step: 140/77, loss: 0.0033441728446632624 2023-01-22 10:58:38.473164: step: 144/77, loss: 0.008817191235721111 2023-01-22 10:58:40.017813: step: 148/77, loss: 0.0001582540717208758 2023-01-22 10:58:41.508879: step: 152/77, loss: 1.4754637049918529e-05 2023-01-22 10:58:42.958158: step: 156/77, loss: 1.8953408016386675e-06 
2023-01-22 10:58:44.430175: step: 160/77, loss: 0.007105558179318905 2023-01-22 10:58:45.899373: step: 164/77, loss: 0.00016064877854660153 2023-01-22 10:58:47.406334: step: 168/77, loss: 0.006513113155961037 2023-01-22 10:58:48.875466: step: 172/77, loss: 0.004730660002678633 2023-01-22 10:58:50.326869: step: 176/77, loss: 0.012969336472451687 2023-01-22 10:58:51.808414: step: 180/77, loss: 2.9611353966174647e-05 2023-01-22 10:58:53.273792: step: 184/77, loss: 3.134362486889586e-05 2023-01-22 10:58:54.718070: step: 188/77, loss: 4.576904757414013e-06 2023-01-22 10:58:56.186714: step: 192/77, loss: 7.770049705868587e-06 2023-01-22 10:58:57.674269: step: 196/77, loss: 0.002541177673265338 2023-01-22 10:58:59.152488: step: 200/77, loss: 0.0008257463341578841 2023-01-22 10:59:00.580364: step: 204/77, loss: 3.1156530440057395e-06 2023-01-22 10:59:02.066928: step: 208/77, loss: 2.0965369913028553e-05 2023-01-22 10:59:03.564792: step: 212/77, loss: 6.824683964623546e-07 2023-01-22 10:59:05.047572: step: 216/77, loss: 0.01078418642282486 2023-01-22 10:59:06.525750: step: 220/77, loss: 0.0002531272766645998 2023-01-22 10:59:07.898559: step: 224/77, loss: 0.0014299320755526423 2023-01-22 10:59:09.333810: step: 228/77, loss: 0.0007252601790241897 2023-01-22 10:59:10.749425: step: 232/77, loss: 4.472882665140787e-06 2023-01-22 10:59:12.286986: step: 236/77, loss: 0.004823240917176008 2023-01-22 10:59:13.825232: step: 240/77, loss: 0.01945425011217594 2023-01-22 10:59:15.330087: step: 244/77, loss: 0.0003206911205779761 2023-01-22 10:59:16.815161: step: 248/77, loss: 3.069389322263305e-06 2023-01-22 10:59:18.260064: step: 252/77, loss: 0.0044409167021512985 2023-01-22 10:59:19.719256: step: 256/77, loss: 0.0038761033210903406 2023-01-22 10:59:21.155603: step: 260/77, loss: 1.8266971892444417e-05 2023-01-22 10:59:22.660416: step: 264/77, loss: 0.0022358319256454706 2023-01-22 10:59:24.119173: step: 268/77, loss: 3.647561607067473e-05 2023-01-22 10:59:25.628300: step: 272/77, loss: 2.4451919671264477e-06 2023-01-22 10:59:27.105466: step: 276/77, loss: 6.576570740435272e-05 2023-01-22 10:59:28.622779: step: 280/77, loss: 0.0027494183741509914 2023-01-22 10:59:30.068400: step: 284/77, loss: 1.141419488703832e-06 2023-01-22 10:59:31.618617: step: 288/77, loss: 0.0010258633410558105 2023-01-22 10:59:33.059241: step: 292/77, loss: 1.6540182912194723e-07 2023-01-22 10:59:34.599113: step: 296/77, loss: 0.0002123012236552313 2023-01-22 10:59:35.999569: step: 300/77, loss: 3.2169398309633834e-06 2023-01-22 10:59:37.543764: step: 304/77, loss: 0.0015341609250754118 2023-01-22 10:59:39.093366: step: 308/77, loss: 2.290213160449639e-06 2023-01-22 10:59:40.544720: step: 312/77, loss: 2.740030140557792e-06 2023-01-22 10:59:42.002432: step: 316/77, loss: 5.185409827390686e-05 2023-01-22 10:59:43.558213: step: 320/77, loss: 4.31674015999306e-06 2023-01-22 10:59:45.059821: step: 324/77, loss: 2.0086281438125297e-06 2023-01-22 10:59:46.518847: step: 328/77, loss: 2.9443667699524667e-06 2023-01-22 10:59:47.990907: step: 332/77, loss: 1.6882592035472044e-06 2023-01-22 10:59:49.447026: step: 336/77, loss: 3.605149686336517e-05 2023-01-22 10:59:50.855848: step: 340/77, loss: 3.4285606034245575e-06 2023-01-22 10:59:52.251584: step: 344/77, loss: 0.04269392788410187 2023-01-22 10:59:53.692887: step: 348/77, loss: 6.202944405231392e-06 2023-01-22 10:59:55.235406: step: 352/77, loss: 5.522930223378353e-06 2023-01-22 10:59:56.704994: step: 356/77, loss: 3.839743385469774e-06 2023-01-22 10:59:58.175773: step: 360/77, loss: 
6.6556758611113764e-06 2023-01-22 10:59:59.607867: step: 364/77, loss: 1.2227968909428455e-05 2023-01-22 11:00:01.030001: step: 368/77, loss: 1.5571474705211585e-06 2023-01-22 11:00:02.560362: step: 372/77, loss: 1.3326905900612473e-05 2023-01-22 11:00:04.083941: step: 376/77, loss: 1.5645682651665993e-06 2023-01-22 11:00:05.509771: step: 380/77, loss: 0.00039393804036080837 2023-01-22 11:00:07.002689: step: 384/77, loss: 6.189832492964342e-05 2023-01-22 11:00:08.532662: step: 388/77, loss: 0.0002459617971908301 ================================================== Loss: 0.003 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Chinese: {'template': {'p': 0.9166666666666666, 'r': 0.5038167938931297, 'f1': 0.6502463054187191}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01854729394650709, 'epoch': 25} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Korean: {'template': {'p': 0.9142857142857143, 'r': 0.48854961832061067, 'f1': 0.6368159203980099}, 'slot': {'p': 0.53125, 'r': 0.014667817083692839, 'f1': 0.028547439126784216}, 'combined': 0.01817946372252925, 'epoch': 25} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 25} Test Russian: {'template': {'p': 0.9285714285714286, 'r': 0.4961832061068702, 'f1': 0.6467661691542288}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018479033404406535, 'epoch': 25} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 25} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} Sample Russian: {'template': {'p': 1.0, 'r': 0.25, 'f1': 0.4}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 25} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 
0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 26 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 11:01:46.677181: step: 4/77, loss: 0.0004410330147948116 2023-01-22 11:01:48.145984: step: 8/77, loss: 6.347841008391697e-07 2023-01-22 11:01:49.631438: step: 12/77, loss: 0.00019786340999417007 2023-01-22 11:01:51.082864: step: 16/77, loss: 1.8058998421111028e-06 2023-01-22 11:01:52.580552: step: 20/77, loss: 1.4379454569279915e-06 2023-01-22 11:01:54.078920: step: 24/77, loss: 0.001322779105976224 2023-01-22 11:01:55.510210: step: 28/77, loss: 8.72594173415564e-05 2023-01-22 11:01:57.008350: step: 32/77, loss: 9.983496966015082e-06 2023-01-22 11:01:58.464643: step: 36/77, loss: 7.26973712517065e-06 2023-01-22 11:01:59.959202: step: 40/77, loss: 4.976939749212761e-07 2023-01-22 11:02:01.450386: step: 44/77, loss: 0.007929427549242973 2023-01-22 11:02:03.020875: step: 48/77, loss: 8.099547812889796e-06 2023-01-22 11:02:04.490036: step: 52/77, loss: 0.0005367292324081063 2023-01-22 11:02:06.036636: step: 56/77, loss: 5.662437985165525e-08 2023-01-22 11:02:07.421473: step: 60/77, loss: 1.0163690603803843e-05 2023-01-22 11:02:08.828532: step: 64/77, loss: 3.014777939824853e-05 2023-01-22 11:02:10.316750: step: 68/77, loss: 0.004237155895680189 2023-01-22 11:02:11.746134: step: 72/77, loss: 1.9191624232917093e-05 2023-01-22 11:02:13.197514: step: 76/77, loss: 0.00026323023485019803 2023-01-22 11:02:14.637918: step: 80/77, loss: 2.7281901111564366e-06 2023-01-22 11:02:16.168794: step: 84/77, loss: 6.288236136242631e-07 2023-01-22 11:02:17.660015: step: 88/77, loss: 0.022111790254712105 2023-01-22 11:02:19.165438: step: 92/77, loss: 1.6018191672628745e-05 2023-01-22 11:02:20.662649: step: 96/77, loss: 2.7297407996229595e-06 2023-01-22 11:02:22.117009: step: 100/77, loss: 3.272101366746938e-06 2023-01-22 11:02:23.677244: step: 104/77, loss: 0.0006118750898167491 2023-01-22 11:02:25.203104: step: 108/77, loss: 0.09048576653003693 2023-01-22 11:02:26.656223: step: 112/77, loss: 7.00344287452026e-07 2023-01-22 11:02:28.076750: step: 116/77, loss: 2.5460256438236684e-05 2023-01-22 11:02:29.535403: step: 120/77, loss: 0.0005740249762311578 2023-01-22 11:02:31.067047: step: 124/77, loss: 1.1066706065321341e-05 2023-01-22 11:02:32.525041: step: 128/77, loss: 0.00012074044934706762 2023-01-22 11:02:33.951275: step: 132/77, loss: 0.0003262482932768762 2023-01-22 11:02:35.372527: step: 136/77, loss: 0.0011029550805687904 2023-01-22 11:02:36.859725: step: 140/77, loss: 3.281430326751433e-05 
2023-01-22 11:02:38.372759: step: 144/77, loss: 0.015173608437180519 2023-01-22 11:02:39.870462: step: 148/77, loss: 0.011440704576671124 2023-01-22 11:02:41.410887: step: 152/77, loss: 0.00012958000297658145 2023-01-22 11:02:42.903377: step: 156/77, loss: 0.016136083751916885 2023-01-22 11:02:44.449917: step: 160/77, loss: 9.393112850375473e-05 2023-01-22 11:02:45.915136: step: 164/77, loss: 0.002286059781908989 2023-01-22 11:02:47.392289: step: 168/77, loss: 0.009524857625365257 2023-01-22 11:02:48.866102: step: 172/77, loss: 0.027265436947345734 2023-01-22 11:02:50.329382: step: 176/77, loss: 5.697094456991181e-05 2023-01-22 11:02:51.862411: step: 180/77, loss: 0.0001194120486616157 2023-01-22 11:02:53.313662: step: 184/77, loss: 0.0003231299633625895 2023-01-22 11:02:54.785808: step: 188/77, loss: 4.0711223846301436e-05 2023-01-22 11:02:56.286643: step: 192/77, loss: 0.0005879050586372614 2023-01-22 11:02:57.755723: step: 196/77, loss: 0.056800760328769684 2023-01-22 11:02:59.233695: step: 200/77, loss: 4.6171685426088516e-06 2023-01-22 11:03:00.804968: step: 204/77, loss: 1.0742302038124762e-05 2023-01-22 11:03:02.328615: step: 208/77, loss: 0.06644407659769058 2023-01-22 11:03:03.787419: step: 212/77, loss: 0.0002735485613811761 2023-01-22 11:03:05.263184: step: 216/77, loss: 3.1349229629995534e-06 2023-01-22 11:03:06.789750: step: 220/77, loss: 8.984311534732115e-06 2023-01-22 11:03:08.149525: step: 224/77, loss: 0.0023781578056514263 2023-01-22 11:03:09.693246: step: 228/77, loss: 0.02856743521988392 2023-01-22 11:03:11.168833: step: 232/77, loss: 0.03514505922794342 2023-01-22 11:03:12.640897: step: 236/77, loss: 9.761759429238737e-06 2023-01-22 11:03:14.114379: step: 240/77, loss: 7.212029231595807e-07 2023-01-22 11:03:15.640936: step: 244/77, loss: 4.139644079259597e-05 2023-01-22 11:03:17.145795: step: 248/77, loss: 2.2276112758845557e-06 2023-01-22 11:03:18.671310: step: 252/77, loss: 1.6014524589991197e-05 2023-01-22 11:03:20.126057: step: 256/77, loss: 1.4529955478792544e-05 2023-01-22 11:03:21.640328: step: 260/77, loss: 0.001574151567183435 2023-01-22 11:03:23.155256: step: 264/77, loss: 0.003839747281745076 2023-01-22 11:03:24.604504: step: 268/77, loss: 0.0009322985424660146 2023-01-22 11:03:26.032360: step: 272/77, loss: 0.0005584964528679848 2023-01-22 11:03:27.524948: step: 276/77, loss: 0.00011649419320747256 2023-01-22 11:03:28.994267: step: 280/77, loss: 0.0008474696660414338 2023-01-22 11:03:30.504697: step: 284/77, loss: 0.1427866369485855 2023-01-22 11:03:31.997850: step: 288/77, loss: 9.221310028806329e-05 2023-01-22 11:03:33.480796: step: 292/77, loss: 5.444014459499158e-05 2023-01-22 11:03:34.969653: step: 296/77, loss: 3.405211100471206e-05 2023-01-22 11:03:36.498033: step: 300/77, loss: 0.007121166680008173 2023-01-22 11:03:37.938951: step: 304/77, loss: 0.028989192098379135 2023-01-22 11:03:39.433868: step: 308/77, loss: 0.005328961182385683 2023-01-22 11:03:40.860426: step: 312/77, loss: 0.04974348098039627 2023-01-22 11:03:42.339034: step: 316/77, loss: 0.006064072251319885 2023-01-22 11:03:43.800574: step: 320/77, loss: 4.2640358515200205e-06 2023-01-22 11:03:45.237915: step: 324/77, loss: 2.233498526038602e-05 2023-01-22 11:03:46.764986: step: 328/77, loss: 2.7715998385247076e-07 2023-01-22 11:03:48.235299: step: 332/77, loss: 2.317662801942788e-05 2023-01-22 11:03:49.669478: step: 336/77, loss: 0.06426730006933212 2023-01-22 11:03:51.189407: step: 340/77, loss: 0.00011196612467756495 2023-01-22 11:03:52.710864: step: 344/77, loss: 
8.359479579667095e-07 2023-01-22 11:03:54.152186: step: 348/77, loss: 4.4287400669418275e-05 2023-01-22 11:03:55.599371: step: 352/77, loss: 0.05372339114546776 2023-01-22 11:03:57.112486: step: 356/77, loss: 6.985371783230221e-06 2023-01-22 11:03:58.634035: step: 360/77, loss: 0.00024018401745706797 2023-01-22 11:04:00.120279: step: 364/77, loss: 7.439233013428748e-05 2023-01-22 11:04:01.631726: step: 368/77, loss: 0.0001586980652064085 2023-01-22 11:04:03.170180: step: 372/77, loss: 0.00013080626376904547 2023-01-22 11:04:04.648001: step: 376/77, loss: 7.703786764068354e-07 2023-01-22 11:04:06.081168: step: 380/77, loss: 1.3012524505029432e-05 2023-01-22 11:04:07.502365: step: 384/77, loss: 0.008434685878455639 2023-01-22 11:04:08.976088: step: 388/77, loss: 2.2103627998149022e-05 ================================================== Loss: 0.008 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Chinese: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.425, 'r': 0.014667817083692839, 'f1': 0.02835696413678065}, 'combined': 0.018626633305532388, 'epoch': 26} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Korean: {'template': {'p': 0.9166666666666666, 'r': 0.5038167938931297, 'f1': 0.6502463054187191}, 'slot': {'p': 0.4473684210526316, 'r': 0.014667817083692839, 'f1': 0.028404344193817876}, 'combined': 0.018469819869871718, 'epoch': 26} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 26} Test Russian: {'template': {'p': 0.9295774647887324, 'r': 0.5038167938931297, 'f1': 0.6534653465346535}, 'slot': {'p': 0.4722222222222222, 'r': 0.014667817083692839, 'f1': 0.028451882845188285}, 'combined': 0.018592319482994325, 'epoch': 26} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 26} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 26} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 26} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for 
Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 27 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 11:05:47.132975: step: 4/77, loss: 1.4326322343549691e-05 2023-01-22 11:05:48.643637: step: 8/77, loss: 0.024663139134645462 2023-01-22 11:05:50.157280: step: 12/77, loss: 0.0004897532053291798 2023-01-22 11:05:51.640498: step: 16/77, loss: 0.0061692120507359505 2023-01-22 11:05:53.079244: step: 20/77, loss: 0.00023279106244444847 2023-01-22 11:05:54.506925: step: 24/77, loss: 6.758703966625035e-05 2023-01-22 11:05:55.963069: step: 28/77, loss: 6.1161990743130445e-06 2023-01-22 11:05:57.391749: step: 32/77, loss: 2.1376172298914753e-05 2023-01-22 11:05:58.832038: step: 36/77, loss: 9.568851965013891e-05 2023-01-22 11:06:00.274771: step: 40/77, loss: 3.833698883681791e-06 2023-01-22 11:06:01.875248: step: 44/77, loss: 0.00012776638322975487 2023-01-22 11:06:03.364579: step: 48/77, loss: 0.0004679126723203808 2023-01-22 11:06:04.815490: step: 52/77, loss: 1.3973161003377754e-05 2023-01-22 11:06:06.291397: step: 56/77, loss: 1.3835771824233234e-05 2023-01-22 11:06:07.787434: step: 60/77, loss: 1.6599069567746483e-06 2023-01-22 11:06:09.276779: step: 64/77, loss: 7.525195542257279e-05 2023-01-22 11:06:10.757539: step: 68/77, loss: 5.3652707720175385e-05 2023-01-22 11:06:12.191890: step: 72/77, loss: 0.000490330159664154 2023-01-22 11:06:13.656879: step: 76/77, loss: 6.186037353472784e-05 2023-01-22 11:06:15.091901: step: 80/77, loss: 9.387572390551213e-07 2023-01-22 11:06:16.503896: step: 84/77, loss: 0.00446065329015255 2023-01-22 11:06:17.974167: step: 88/77, loss: 3.517463119351305e-05 2023-01-22 11:06:19.457682: step: 92/77, loss: 0.0010564837139099836 2023-01-22 11:06:20.893273: step: 96/77, loss: 0.0038536631036549807 2023-01-22 11:06:22.327685: step: 100/77, loss: 0.000469402177259326 2023-01-22 11:06:23.832444: step: 104/77, loss: 8.486769365845248e-05 2023-01-22 11:06:25.299695: step: 108/77, loss: 3.024913439730881e-07 2023-01-22 11:06:26.808913: step: 112/77, loss: 2.229935307695996e-05 2023-01-22 11:06:28.300361: step: 116/77, loss: 8.681615872774273e-05 2023-01-22 11:06:29.732622: step: 120/77, 
loss: 0.00025649185408838093 2023-01-22 11:06:31.169787: step: 124/77, loss: 1.7866010466605076e-06 2023-01-22 11:06:32.666118: step: 128/77, loss: 1.3009529538976494e-05 2023-01-22 11:06:34.143079: step: 132/77, loss: 2.220266424046713e-07 2023-01-22 11:06:35.609571: step: 136/77, loss: 5.0626226766326e-06 2023-01-22 11:06:37.090213: step: 140/77, loss: 3.058028596569784e-05 2023-01-22 11:06:38.584825: step: 144/77, loss: 2.7159321689396165e-05 2023-01-22 11:06:40.108455: step: 148/77, loss: 1.4751408343727235e-06 2023-01-22 11:06:41.572277: step: 152/77, loss: 3.5762763417324095e-08 2023-01-22 11:06:43.066647: step: 156/77, loss: 3.208236739737913e-05 2023-01-22 11:06:44.563711: step: 160/77, loss: 0.0027070590294897556 2023-01-22 11:06:46.037812: step: 164/77, loss: 0.0012007926125079393 2023-01-22 11:06:47.468276: step: 168/77, loss: 7.897610032614466e-08 2023-01-22 11:06:48.969944: step: 172/77, loss: 0.0011402148520573974 2023-01-22 11:06:50.510109: step: 176/77, loss: 1.2129133892813115e-06 2023-01-22 11:06:52.024949: step: 180/77, loss: 6.869350386295991e-07 2023-01-22 11:06:53.482169: step: 184/77, loss: 0.0020278082229197025 2023-01-22 11:06:54.982780: step: 188/77, loss: 2.4167438823496923e-06 2023-01-22 11:06:56.523939: step: 192/77, loss: 0.0004763460601679981 2023-01-22 11:06:58.011770: step: 196/77, loss: 0.00020171045616734773 2023-01-22 11:06:59.513548: step: 200/77, loss: 0.006428330205380917 2023-01-22 11:07:00.959848: step: 204/77, loss: 0.054347168654203415 2023-01-22 11:07:02.437001: step: 208/77, loss: 9.391622006660327e-05 2023-01-22 11:07:03.955286: step: 212/77, loss: 5.589288048213348e-05 2023-01-22 11:07:05.386797: step: 216/77, loss: 1.017721160678775e-06 2023-01-22 11:07:06.816334: step: 220/77, loss: 5.111033942739596e-07 2023-01-22 11:07:08.238202: step: 224/77, loss: 1.4677115132144536e-06 2023-01-22 11:07:09.710302: step: 228/77, loss: 0.040604718029499054 2023-01-22 11:07:11.150943: step: 232/77, loss: 0.00016247703752014786 2023-01-22 11:07:12.632609: step: 236/77, loss: 6.460230360971764e-05 2023-01-22 11:07:14.121462: step: 240/77, loss: 0.00135130959097296 2023-01-22 11:07:15.537983: step: 244/77, loss: 0.00023467946448363364 2023-01-22 11:07:17.072517: step: 248/77, loss: 7.356399873970076e-05 2023-01-22 11:07:18.598994: step: 252/77, loss: 0.006442447658628225 2023-01-22 11:07:20.103184: step: 256/77, loss: 0.01619216613471508 2023-01-22 11:07:21.565089: step: 260/77, loss: 1.4328755241876934e-05 2023-01-22 11:07:23.028298: step: 264/77, loss: 2.5376413759659044e-05 2023-01-22 11:07:24.448717: step: 268/77, loss: 0.12852409482002258 2023-01-22 11:07:25.988972: step: 272/77, loss: 1.3841839063388761e-05 2023-01-22 11:07:27.484245: step: 276/77, loss: 0.00680402759462595 2023-01-22 11:07:28.930546: step: 280/77, loss: 0.00011101227573817596 2023-01-22 11:07:30.429370: step: 284/77, loss: 0.00027165425126440823 2023-01-22 11:07:31.961531: step: 288/77, loss: 0.00021375974756665528 2023-01-22 11:07:33.541686: step: 292/77, loss: 0.027166698127985 2023-01-22 11:07:35.012537: step: 296/77, loss: 0.000678533164318651 2023-01-22 11:07:36.565432: step: 300/77, loss: 8.930165677156765e-06 2023-01-22 11:07:38.121655: step: 304/77, loss: 0.0004659105616156012 2023-01-22 11:07:39.560711: step: 308/77, loss: 2.5572109734639525e-05 2023-01-22 11:07:41.033963: step: 312/77, loss: 0.0006945566856302321 2023-01-22 11:07:42.525520: step: 316/77, loss: 0.0002417838986730203 2023-01-22 11:07:43.990533: step: 320/77, loss: 5.098254405311309e-05 2023-01-22 
11:07:45.483059: step: 324/77, loss: 5.953685104032047e-05 2023-01-22 11:07:46.951132: step: 328/77, loss: 0.013821378350257874 2023-01-22 11:07:48.377939: step: 332/77, loss: 0.0018615383887663484 2023-01-22 11:07:49.906019: step: 336/77, loss: 1.4126595488050953e-05 2023-01-22 11:07:51.358002: step: 340/77, loss: 2.8266451863601105e-06 2023-01-22 11:07:52.889991: step: 344/77, loss: 1.291919033974409e-06 2023-01-22 11:07:54.338444: step: 348/77, loss: 3.385403397260234e-05 2023-01-22 11:07:55.816460: step: 352/77, loss: 4.066002929903334e-06 2023-01-22 11:07:57.276249: step: 356/77, loss: 1.440505002392456e-05 2023-01-22 11:07:58.750510: step: 360/77, loss: 8.49277203087695e-05 2023-01-22 11:08:00.202931: step: 364/77, loss: 4.555952273221919e-06 2023-01-22 11:08:01.716318: step: 368/77, loss: 2.5610373995732516e-05 2023-01-22 11:08:03.172489: step: 372/77, loss: 0.028466036543250084 2023-01-22 11:08:04.638534: step: 376/77, loss: 2.1152529370738193e-05 2023-01-22 11:08:06.127253: step: 380/77, loss: 9.026997759065125e-06 2023-01-22 11:08:07.569717: step: 384/77, loss: 1.4021417200638098e-06 2023-01-22 11:08:08.992306: step: 388/77, loss: 0.00033527766936458647 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Chinese: {'template': {'p': 0.9154929577464789, 'r': 0.4961832061068702, 'f1': 0.6435643564356436}, 'slot': {'p': 0.5, 'r': 0.014667817083692839, 'f1': 0.028499580888516347}, 'combined': 0.018341314433203592, 'epoch': 27} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Korean: {'template': {'p': 0.9154929577464789, 'r': 0.4961832061068702, 'f1': 0.6435643564356436}, 'slot': {'p': 0.5151515151515151, 'r': 0.014667817083692839, 'f1': 0.02852348993288591}, 'combined': 0.01835670144195628, 'epoch': 27} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 27} Test Russian: {'template': {'p': 0.9285714285714286, 'r': 0.4961832061068702, 'f1': 0.6467661691542288}, 'slot': {'p': 0.5483870967741935, 'r': 0.014667817083692839, 'f1': 0.02857142857142857}, 'combined': 0.018479033404406535, 'epoch': 27} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 27} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 27} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 27} ================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 
0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 28 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 11:09:47.101671: step: 4/77, loss: 6.341078551486135e-05 2023-01-22 11:09:48.660225: step: 8/77, loss: 1.2370970580377616e-05 2023-01-22 11:09:50.113248: step: 12/77, loss: 0.0004744845209643245 2023-01-22 11:09:51.606957: step: 16/77, loss: 6.40781627225806e-06 2023-01-22 11:09:53.136636: step: 20/77, loss: 6.124305400589947e-07 2023-01-22 11:09:54.578120: step: 24/77, loss: 7.749309588689357e-06 2023-01-22 11:09:55.995301: step: 28/77, loss: 1.6838266958529857e-07 2023-01-22 11:09:57.493190: step: 32/77, loss: 0.0009012370137497783 2023-01-22 11:09:59.033004: step: 36/77, loss: 0.00016497840988449752 2023-01-22 11:10:00.462959: step: 40/77, loss: 0.031463008373975754 2023-01-22 11:10:01.962739: step: 44/77, loss: 0.00010549923172220588 2023-01-22 11:10:03.385736: step: 48/77, loss: 0.00022325686586555094 2023-01-22 11:10:04.897581: step: 52/77, loss: 2.074153599096462e-06 2023-01-22 11:10:06.405488: step: 56/77, loss: 9.855855751084164e-05 2023-01-22 11:10:07.841913: step: 60/77, loss: 1.778966179699637e-05 2023-01-22 11:10:09.333572: step: 64/77, loss: 4.362392246548552e-06 2023-01-22 11:10:10.823965: step: 68/77, loss: 1.3663729987456463e-06 2023-01-22 11:10:12.288185: step: 72/77, loss: 0.004756465088576078 2023-01-22 11:10:13.735657: step: 76/77, loss: 4.683442966779694e-05 2023-01-22 11:10:15.222662: step: 80/77, loss: 0.00019577420607674867 2023-01-22 11:10:16.704680: step: 84/77, loss: 0.00035057426430284977 2023-01-22 11:10:18.168785: step: 88/77, loss: 0.0017822480294853449 2023-01-22 11:10:19.664632: step: 92/77, loss: 0.000780368922278285 2023-01-22 11:10:21.195955: step: 96/77, loss: 6.379641126841307e-05 2023-01-22 11:10:22.628747: step: 100/77, loss: 
2.9951229407743085e-07 2023-01-22 11:10:24.083698: step: 104/77, loss: 7.06719310983317e-06 2023-01-22 11:10:25.624580: step: 108/77, loss: 1.3345738807402086e-05 2023-01-22 11:10:27.063687: step: 112/77, loss: 0.04274173080921173 2023-01-22 11:10:28.530282: step: 116/77, loss: 9.51277106651105e-05 2023-01-22 11:10:29.988578: step: 120/77, loss: 2.816310029629676e-07 2023-01-22 11:10:31.402837: step: 124/77, loss: 0.028377747163176537 2023-01-22 11:10:32.893047: step: 128/77, loss: 7.620793894602684e-06 2023-01-22 11:10:34.386003: step: 132/77, loss: 7.897602927187108e-08 2023-01-22 11:10:35.864009: step: 136/77, loss: 8.433857874479145e-05 2023-01-22 11:10:37.403463: step: 140/77, loss: 0.042357299476861954 2023-01-22 11:10:38.834772: step: 144/77, loss: 2.6225916371913627e-07 2023-01-22 11:10:40.267685: step: 148/77, loss: 1.937150173603186e-08 2023-01-22 11:10:41.734707: step: 152/77, loss: 0.00019146957492921501 2023-01-22 11:10:43.210738: step: 156/77, loss: 4.1731591409188695e-06 2023-01-22 11:10:44.714557: step: 160/77, loss: 1.1456423635536339e-05 2023-01-22 11:10:46.178714: step: 164/77, loss: 4.647242349165026e-06 2023-01-22 11:10:47.686427: step: 168/77, loss: 1.8775331511733384e-07 2023-01-22 11:10:49.145357: step: 172/77, loss: 0.000803797913249582 2023-01-22 11:10:50.636297: step: 176/77, loss: 0.00011625502520473674 2023-01-22 11:10:52.098547: step: 180/77, loss: 0.02850656770169735 2023-01-22 11:10:53.552489: step: 184/77, loss: 0.023709582164883614 2023-01-22 11:10:55.070211: step: 188/77, loss: 0.001574437483213842 2023-01-22 11:10:56.563676: step: 192/77, loss: 4.412966518430039e-06 2023-01-22 11:10:58.048951: step: 196/77, loss: 0.028234709054231644 2023-01-22 11:10:59.544376: step: 200/77, loss: 5.9604619906394873e-08 2023-01-22 11:11:01.012282: step: 204/77, loss: 0.007427121512591839 2023-01-22 11:11:02.439726: step: 208/77, loss: 1.28149778788611e-07 2023-01-22 11:11:03.916455: step: 212/77, loss: 1.8853193978429772e-05 2023-01-22 11:11:05.329332: step: 216/77, loss: 0.00016453364514745772 2023-01-22 11:11:06.771735: step: 220/77, loss: 0.04771774262189865 2023-01-22 11:11:08.294458: step: 224/77, loss: 0.00020399382628966123 2023-01-22 11:11:09.700602: step: 228/77, loss: 4.151056145929033e-06 2023-01-22 11:11:11.178472: step: 232/77, loss: 0.0006100233877077699 2023-01-22 11:11:12.708107: step: 236/77, loss: 0.0030997106805443764 2023-01-22 11:11:14.139059: step: 240/77, loss: 0.0016627665609121323 2023-01-22 11:11:15.691407: step: 244/77, loss: 1.3559860008172109e-06 2023-01-22 11:11:17.228881: step: 248/77, loss: 0.00033372087636962533 2023-01-22 11:11:18.709326: step: 252/77, loss: 2.2440113752963953e-05 2023-01-22 11:11:20.176387: step: 256/77, loss: 1.2248431175976293e-06 2023-01-22 11:11:21.618512: step: 260/77, loss: 3.861261939164251e-05 2023-01-22 11:11:23.157527: step: 264/77, loss: 7.700593414483592e-05 2023-01-22 11:11:24.662730: step: 268/77, loss: 6.907177157700062e-05 2023-01-22 11:11:26.156161: step: 272/77, loss: 0.0003736157377716154 2023-01-22 11:11:27.627356: step: 276/77, loss: 0.03906060755252838 2023-01-22 11:11:29.149042: step: 280/77, loss: 1.0371045391366351e-06 2023-01-22 11:11:30.672506: step: 284/77, loss: 2.6862755476031452e-05 2023-01-22 11:11:32.177548: step: 288/77, loss: 0.00017517435480840504 2023-01-22 11:11:33.613859: step: 292/77, loss: 3.055980641875067e-06 2023-01-22 11:11:35.193627: step: 296/77, loss: 0.0014680877793580294 2023-01-22 11:11:36.688007: step: 300/77, loss: 3.172190918121487e-05 2023-01-22 11:11:38.140744: 
step: 304/77, loss: 8.607449854025617e-05 2023-01-22 11:11:39.594553: step: 308/77, loss: 0.01271668728441 2023-01-22 11:11:41.119431: step: 312/77, loss: 4.753455016270891e-07 2023-01-22 11:11:42.537634: step: 316/77, loss: 1.7221045709447935e-05 2023-01-22 11:11:43.990422: step: 320/77, loss: 0.0009752624318934977 2023-01-22 11:11:45.454359: step: 324/77, loss: 1.535144474473782e-05 2023-01-22 11:11:46.960308: step: 328/77, loss: 6.94218761054799e-05 2023-01-22 11:11:48.439940: step: 332/77, loss: 0.00026013870956376195 2023-01-22 11:11:49.980853: step: 336/77, loss: 0.00010290901263942942 2023-01-22 11:11:51.428162: step: 340/77, loss: 1.2418715414241888e-05 2023-01-22 11:11:52.941660: step: 344/77, loss: 3.4568495266285026e-06 2023-01-22 11:11:54.351451: step: 348/77, loss: 0.0007589494343847036 2023-01-22 11:11:55.785368: step: 352/77, loss: 4.990630259271711e-05 2023-01-22 11:11:57.337862: step: 356/77, loss: 1.3291570439832867e-06 2023-01-22 11:11:58.777204: step: 360/77, loss: 4.246799960583303e-07 2023-01-22 11:12:00.270042: step: 364/77, loss: 7.510137152166863e-07 2023-01-22 11:12:01.712111: step: 368/77, loss: 0.010786645114421844 2023-01-22 11:12:03.262018: step: 372/77, loss: 1.5899097434157738e-06 2023-01-22 11:12:04.751561: step: 376/77, loss: 7.986807872839563e-07 2023-01-22 11:12:06.291668: step: 380/77, loss: 8.875988896761555e-06 2023-01-22 11:12:07.806124: step: 384/77, loss: 1.1134283340652473e-05 2023-01-22 11:12:09.294735: step: 388/77, loss: 2.2824442567070946e-05 ================================================== Loss: 0.004 -------------------- Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Chinese: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.40540540540540543, 'r': 0.012942191544434857, 'f1': 0.02508361204013378}, 'combined': 0.01648005428723765, 'epoch': 28} Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Korean: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.4166666666666667, 'r': 0.012942191544434857, 'f1': 0.02510460251046025}, 'combined': 0.01649384512764538, 'epoch': 28} Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.4878048780487805, 'r': 0.03780718336483932, 'f1': 0.07017543859649122}, 'combined': 0.051708217913204055, 'epoch': 28} Test Russian: {'template': {'p': 0.9178082191780822, 'r': 0.5114503816793893, 'f1': 0.6568627450980392}, 'slot': {'p': 0.4411764705882353, 'r': 0.012942191544434857, 'f1': 0.025146689019279123}, 'combined': 0.016517923179330405, 'epoch': 28} Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 28} Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 28} Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 28} 
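The f1 and 'combined' numbers in the evaluation blocks above are mutually consistent: each f1 is the harmonic mean of its 'p' and 'r', and 'combined' equals the product of the template f1 and the slot f1. Below is a minimal sketch that reproduces the Sample Russian values logged just above; the formulas are inferred from the logged numbers themselves, not read out of train.py.

def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall; taken as 0 when p + r == 0.
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

# Sample Russian, epoch 28, values copied from the log above:
template_f1 = f1(1.0, 0.5)                   # -> 0.6666666666666666
slot_f1     = f1(0.5, 0.034482758620689655)  # -> 0.06451612903225806
combined    = template_f1 * slot_f1          # -> 0.04301075268817204

The same identity holds for every Dev/Test/Sample block in this log, e.g. Dev Chinese: f1(1.0, 0.5833333333333334) * f1(0.5, 0.03780718336483932) = 0.05179909351586346.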
================================================== Current best result: -------------------- Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3} -------------------- Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3} Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3} -------------------- Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3} Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3} Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3} ****************************** Epoch: 29 command: python train.py --model_name template --xlmr_model_name xlm-roberta-large --batch_size 10 --xlmr_learning_rate 2e-5 --event_hidden_num 450 --accumulate_step 4 --max_epoch 30 --p1_data_weight 0.1 --learning_rate 2e-4 2023-01-22 11:13:47.463719: step: 4/77, loss: 3.6937315144314198e-06 2023-01-22 11:13:49.038194: step: 8/77, loss: 4.1574102738195506e-07 2023-01-22 11:13:50.536261: step: 12/77, loss: 0.0003356757224537432 2023-01-22 11:13:52.028710: step: 16/77, loss: 0.06527945399284363 2023-01-22 11:13:53.579394: step: 20/77, loss: 0.0002221096510766074 2023-01-22 11:13:55.037589: step: 24/77, loss: 0.00017310661496594548 2023-01-22 11:13:56.496983: step: 28/77, loss: 4.425617419201444e-07 2023-01-22 11:13:57.958635: step: 32/77, loss: 8.567985787522048e-06 2023-01-22 11:13:59.448875: step: 36/77, loss: 1.175110628537368e-05 2023-01-22 11:14:00.975583: step: 40/77, loss: 6.787302118027583e-05 2023-01-22 11:14:02.463440: step: 44/77, loss: 0.0003931297978851944 2023-01-22 11:14:03.953644: step: 48/77, loss: 0.0005336487083695829 2023-01-22 11:14:05.395692: step: 52/77, loss: 0.004889761097729206 2023-01-22 11:14:06.886063: step: 56/77, loss: 1.2963991480319237e-07 2023-01-22 11:14:08.308948: step: 60/77, loss: 0.006010737735778093 2023-01-22 11:14:09.795087: step: 64/77, loss: 0.002367202192544937 2023-01-22 11:14:11.222966: step: 68/77, loss: 0.005203723441809416 2023-01-22 11:14:12.653065: step: 72/77, loss: 0.0011668759398162365 2023-01-22 11:14:14.095621: step: 76/77, loss: 
4.994434675609227e-06 2023-01-22 11:14:15.541459: step: 80/77, loss: 0.0210029948502779 2023-01-22 11:14:17.019964: step: 84/77, loss: 4.144196645938791e-05 2023-01-22 11:14:18.557471: step: 88/77, loss: 0.003072816878557205 2023-01-22 11:14:20.048821: step: 92/77, loss: 8.771360444370657e-05 2023-01-22 11:14:21.505615: step: 96/77, loss: 2.2703750801156275e-05 2023-01-22 11:14:22.927685: step: 100/77, loss: 0.0005149574135430157 2023-01-22 11:14:24.319302: step: 104/77, loss: 6.4182363530562725e-06 2023-01-22 11:14:25.765997: step: 108/77, loss: 2.364353531447705e-05 2023-01-22 11:14:27.209328: step: 112/77, loss: 1.475209217005613e-07 2023-01-22 11:14:28.701001: step: 116/77, loss: 1.2321707799856085e-05 2023-01-22 11:14:30.260272: step: 120/77, loss: 0.0019780858419835567 2023-01-22 11:14:31.761061: step: 124/77, loss: 2.781839384624618e-06 2023-01-22 11:14:33.295915: step: 128/77, loss: 0.002130064181983471 2023-01-22 11:14:34.779730: step: 132/77, loss: 5.453763947116386e-07 2023-01-22 11:14:36.270702: step: 136/77, loss: 0.00013564022083301097 2023-01-22 11:14:37.770359: step: 140/77, loss: 7.915633432276081e-06 2023-01-22 11:14:39.204980: step: 144/77, loss: 8.597708074375987e-07 2023-01-22 11:14:40.729496: step: 148/77, loss: 0.0005095271626487374 2023-01-22 11:14:42.171886: step: 152/77, loss: 5.636212790705031e-06 2023-01-22 11:14:43.679889: step: 156/77, loss: 6.559267785632983e-05 2023-01-22 11:14:45.132499: step: 160/77, loss: 2.820798908942379e-05 2023-01-22 11:14:46.643169: step: 164/77, loss: 1.3960838259663433e-05 2023-01-22 11:14:48.110186: step: 168/77, loss: 1.0728827959383125e-07 2023-01-22 11:14:49.546612: step: 172/77, loss: 1.5794829550941358e-06 2023-01-22 11:14:51.084098: step: 176/77, loss: 0.14907880127429962 2023-01-22 11:14:52.594890: step: 180/77, loss: 2.362069790251553e-05 2023-01-22 11:14:54.054611: step: 184/77, loss: 3.7997801882738713e-07 2023-01-22 11:14:55.506647: step: 188/77, loss: 0.0001900776260299608 2023-01-22 11:14:57.033286: step: 192/77, loss: 5.02228613186162e-06 2023-01-22 11:14:58.527483: step: 196/77, loss: 4.827723842026899e-06 2023-01-22 11:15:00.067379: step: 200/77, loss: 4.217001219330996e-07 2023-01-22 11:15:01.523925: step: 204/77, loss: 0.0013312987284734845 2023-01-22 11:15:03.033501: step: 208/77, loss: 8.674228411109652e-06 2023-01-22 11:15:04.514852: step: 212/77, loss: 2.571580080257263e-05 2023-01-22 11:15:05.976966: step: 216/77, loss: 8.191204688046128e-05 2023-01-22 11:15:07.481984: step: 220/77, loss: 2.35874153986515e-06 2023-01-22 11:15:08.921155: step: 224/77, loss: 0.0007027069223113358 2023-01-22 11:15:10.343528: step: 228/77, loss: 5.2166076784487814e-05 2023-01-22 11:15:11.807084: step: 232/77, loss: 0.00013188117009121925 2023-01-22 11:15:13.255526: step: 236/77, loss: 1.0763304089778103e-05 2023-01-22 11:15:14.772144: step: 240/77, loss: 6.738617230439559e-06 2023-01-22 11:15:16.214153: step: 244/77, loss: 6.242575182113796e-05 2023-01-22 11:15:17.733514: step: 248/77, loss: 7.897473324192106e-07 2023-01-22 11:15:19.174710: step: 252/77, loss: 0.0017958583775907755 2023-01-22 11:15:20.642507: step: 256/77, loss: 6.188504357851343e-06 2023-01-22 11:15:22.139107: step: 260/77, loss: 0.0002860073291230947 2023-01-22 11:15:23.604921: step: 264/77, loss: 1.5450248611159623e-05 2023-01-22 11:15:25.064237: step: 268/77, loss: 2.9504178655770374e-07 2023-01-22 11:15:26.533869: step: 272/77, loss: 4.7683684556432127e-08 2023-01-22 11:15:27.945367: step: 276/77, loss: 2.3002725356491283e-05 2023-01-22 11:15:29.413995: 
2023-01-22 11:15:30.890866: step: 284/77, loss: 3.7278171021171147e-06
2023-01-22 11:15:32.362132: step: 288/77, loss: 4.2184408812318e-05
2023-01-22 11:15:33.796735: step: 292/77, loss: 0.011463784612715244
2023-01-22 11:15:35.256568: step: 296/77, loss: 0.02129548415541649
2023-01-22 11:15:36.684507: step: 300/77, loss: 0.02072848007082939
2023-01-22 11:15:38.147155: step: 304/77, loss: 0.03113666921854019
2023-01-22 11:15:39.560217: step: 308/77, loss: 8.225305236919667e-07
2023-01-22 11:15:41.080937: step: 312/77, loss: 3.0233147754188394e-06
2023-01-22 11:15:42.558970: step: 316/77, loss: 5.078814865555614e-06
2023-01-22 11:15:44.017272: step: 320/77, loss: 3.300204525658046e-06
2023-01-22 11:15:45.430785: step: 324/77, loss: 1.2367942758828576e-07
2023-01-22 11:15:46.902058: step: 328/77, loss: 6.798368758609286e-06
2023-01-22 11:15:48.425318: step: 332/77, loss: 2.697095453640941e-07
2023-01-22 11:15:49.928173: step: 336/77, loss: 0.07185610383749008
2023-01-22 11:15:51.505269: step: 340/77, loss: 0.011198935098946095
2023-01-22 11:15:53.010921: step: 344/77, loss: 0.0010905108647421002
2023-01-22 11:15:54.463570: step: 348/77, loss: 3.576277407546513e-08
2023-01-22 11:15:55.902903: step: 352/77, loss: 0.000372204085579142
2023-01-22 11:15:57.352419: step: 356/77, loss: 0.0019713786896318197
2023-01-22 11:15:58.794653: step: 360/77, loss: 0.00993399415165186
2023-01-22 11:16:00.263391: step: 364/77, loss: 0.00010721880971686915
2023-01-22 11:16:01.702834: step: 368/77, loss: 0.00018293378525413573
2023-01-22 11:16:03.158735: step: 372/77, loss: 3.2212028600042686e-05
2023-01-22 11:16:04.631405: step: 376/77, loss: 0.018035752698779106
2023-01-22 11:16:06.176947: step: 380/77, loss: 1.722459273878485e-06
2023-01-22 11:16:07.681280: step: 384/77, loss: 6.50309902994195e-06
2023-01-22 11:16:09.195028: step: 388/77, loss: 3.4866245641751448e-06
==================================================
Loss: 0.005
--------------------
Dev Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Chinese: {'template': {'p': 0.8933333333333333, 'r': 0.5114503816793893, 'f1': 0.6504854368932039}, 'slot': {'p': 0.4, 'r': 0.012079378774805867, 'f1': 0.023450586264656615}, 'combined': 0.015254264851766926, 'epoch': 29}
Dev Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Korean: {'template': {'p': 0.9054054054054054, 'r': 0.5114503816793893, 'f1': 0.6536585365853658}, 'slot': {'p': 0.42424242424242425, 'r': 0.012079378774805867, 'f1': 0.02348993288590604}, 'combined': 0.0153543951546898, 'epoch': 29}
Dev Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 29}
Test Russian: {'template': {'p': 0.8947368421052632, 'r': 0.5190839694656488, 'f1': 0.6570048309178743}, 'slot': {'p': 0.3888888888888889, 'r': 0.012079378774805867, 'f1': 0.02343096234309623}, 'combined': 0.01539425545246902, 'epoch': 29}
Sample Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 29}
Sample Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 29}
Sample Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 29}
==================================================
Current best result:
--------------------
Dev for Chinese: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Chinese: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Chinese: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.02857142857142857, 'f1': 0.05405405405405405}, 'combined': 0.03603603603603603, 'epoch': 3}
--------------------
Dev for Korean: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Korean: {'template': {'p': 0.8513513513513513, 'r': 0.48091603053435117, 'f1': 0.6146341463414634}, 'slot': {'p': 0.37777777777777777, 'r': 0.014667817083692839, 'f1': 0.028239202657807307}, 'combined': 0.01735677821894498, 'epoch': 3}
Korean: {'template': {'p': 0.5, 'r': 0.5, 'f1': 0.5}, 'slot': {'p': 0.0, 'r': 0.0, 'f1': 0.0}, 'combined': 0.0, 'epoch': 3}
--------------------
Dev for Russian: {'template': {'p': 1.0, 'r': 0.5833333333333334, 'f1': 0.7368421052631579}, 'slot': {'p': 0.5, 'r': 0.03780718336483932, 'f1': 0.07029876977152899}, 'combined': 0.05179909351586346, 'epoch': 3}
Test for Russian: {'template': {'p': 0.863013698630137, 'r': 0.48091603053435117, 'f1': 0.6176470588235295}, 'slot': {'p': 0.38636363636363635, 'r': 0.014667817083692839, 'f1': 0.02826267664172901}, 'combined': 0.01745635910224439, 'epoch': 3}
Russian: {'template': {'p': 1.0, 'r': 0.5, 'f1': 0.6666666666666666}, 'slot': {'p': 0.5, 'r': 0.034482758620689655, 'f1': 0.06451612903225806}, 'combined': 0.04301075268817204, 'epoch': 3}
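For reading these dicts: 'f1' is the standard harmonic mean of 'p' and 'r', and the 'combined' field is consistent with the product of the template F1 and slot F1 (e.g. 0.7368421052631579 * 0.07029876977152899 = 0.05179909351586346 above). A small Python check of that interpretation; the function names are illustrative, not the repo's:

    import math

    def f1(p, r):
        """Harmonic mean of precision and recall; 0.0 when both are 0."""
        return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

    def combined(template, slot):
        """Matches the logged 'combined' field: template F1 times slot F1."""
        return template["f1"] * slot["f1"]

    # Verify against the epoch-3 "Dev for Chinese" numbers logged above.
    template = {"p": 1.0, "r": 0.5833333333333334, "f1": 0.7368421052631579}
    slot = {"p": 0.5, "r": 0.03780718336483932, "f1": 0.07029876977152899}
    assert math.isclose(f1(template["p"], template["r"]), template["f1"])
    assert math.isclose(f1(slot["p"], slot["r"]), slot["f1"])
    assert math.isclose(combined(template, slot), 0.05179909351586346)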