CrossEncoder based on BAAI/bge-reranker-v2-m3
This is a Cross Encoder model fine-tuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: BAAI/bge-reranker-v2-m3
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
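from sentence_transformers import CrossEncoder

# Download the model from the Hugging Face Hub
model = CrossEncoder("pujithapsx/finetuned_bge_reranker_m3_merged_fullrecord_v1")
# Score pairs of texts (here, pairs of serialized entity records)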
pairs = [
['Name: Rohan Rao Sharma | First: Rohan | Middle: Rao | Last: Sharma | Gender: M | DOB: 1983-08-04 | Spouse: | Mother: | Father: | Company: ORACLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 551 KORAMANGALA NEAR MALL | City: AHMEDABAD | State: GUJARAT | ZIP: 380071 | Address1: 551 KORAMANGALA NEAR MALL | City1: AHMEDABAD | State1: GUJARAT | ZIP1: 380071 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 8641615633 | Phone1: | Phone2: | Phone3: | Phone4: | Email: rohan.sharma@gmail.com | Email1: | Email2: | Email3: | Email4: ', 'Name: Rohan Rao Sharma | First: Rohan | Middle: Rao | Last: Sharma | Gender: M | DOB: 1983-08-04 | Spouse: | Mother: | Father: | Company: ORACLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 551 KORAMANGALA NEAR MALL | City: AHMEDABAD | State: GUJARAT | ZIP: 380071 | Address1: 551 KORAMANGALA NEAR MALL | City1: AHMEDABAD | State1: GUJARAT | ZIP1: 380071 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: | Phone1: | Phone2: | Phone3: | Phone4: | Email: rohan.sharma@yahoo.com | Email1: | Email2: | Email3: | Email4: '],
['Name: Priya Prasad Reddy | First: Priya | Middle: Prasad | Last: Reddy | Gender: F | DOB: 1983-07-03 | Spouse: | Mother: | Father: | Company: GLOBAL INFOTECH | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: FLAT 401, LAKE APT, 24 MAIN RD | City: LUCKNOW | State: UTTAR PRADESH | ZIP: 226001 | Address1: FLAT 401, LAKE APT, 24 MAIN RD | City1: LUCKNOW | State1: UTTAR PRADESH | ZIP1: 226001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9149203558 | Phone1: | Phone2: | Phone3: | Phone4: | Email: priyareddy@gmail.com | Email1: priya.reddy@globalinfotech.com | Email2: | Email3: | Email4: ', 'Name: Priya Reddy | First: Priya | Middle: | Last: Reddy | Gender: F | DOB: 1983-07-03 | Spouse: | Mother: | Father: | Company: GLOBAL INFOTECH PVT LTD | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: FLAT 401, LAKE APARTMENT, 24 MAIN ROAD | City: LUCKNOW | State: UTTAR PRADESH | ZIP: 226001 | Address1: FLAT 401, LAKE APARTMENT, 24 MAIN ROAD | City1: LUCKNOW | State1: UTTAR PRADESH | ZIP1: 226001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9208449460 | Phone1: | Phone2: | Phone3: | Phone4: | Email: priyareddy@gmail.com | Email1: priya.reddy@globalinfotech.com | Email2: | Email3: | Email4: '],
['Name: Aditya Singh | First: Aditya | Middle: | Last: Singh | Gender: M | DOB: 1982-02-07 | Spouse: | Mother: | Father: | Company: CAPGEMINI INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 753 SECTOR 12 NEAR PARK | City: COIMBATORE | State: TAMIL NADU | ZIP: 641099 | Address1: 753 SECTOR 12 NEAR PARK | City1: COIMBATORE | State1: TAMIL NADU | ZIP1: 641099 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: | Phone1: | Phone2: | Phone3: | Phone4: | Email: adityasingh@company.in | Email1: | Email2: | Email3: | Email4: ', 'Name: Kiran Chopra | First: Kiran | Middle: | Last: Chopra | Gender: M | DOB: 1986-04-13 | Spouse: | Mother: | Father: | Company: GOOGLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 524 MG ROAD NEAR SCHOOL | City: NAGPUR | State: MAHARASHTRA | ZIP: 440013 | Address1: 524 MG ROAD NEAR SCHOOL | City1: NAGPUR | State1: MAHARASHTRA | ZIP1: 440013 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: | Phone1: | Phone2: | Phone3: | Phone4: | Email: kiranchopra@company.in | Email1: | Email2: | Email3: | Email4: '],
['Name: Divya Patel | First: Divya | Middle: | Last: Patel | Gender: F | DOB: 1985-02-01 | Spouse: | Mother: | Father: | Company: INFOSYS TECHNOLOGIES | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 316 SECTOR 12 NEAR PARK | City: HYDERABAD | State: TELANGANA | ZIP: 500048 | Address1: 316 SECTOR 12 NEAR PARK | City1: HYDERABAD | State1: TELANGANA | ZIP1: 500048 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 8883181644 | Phone1: | Phone2: | Phone3: | Phone4: | Email: divyapatel@outlook.com | Email1: | Email2: | Email3: | Email4: ', 'Name: Nitin Rao | First: Nitin | Middle: | Last: Rao | Gender: M | DOB: 1987-12-02 | Spouse: | Mother: | Father: | Company: RELIANCE JIO | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 612 BANJARA HILLS NEAR PARK | City: COIMBATORE | State: TAMIL NADU | ZIP: 641039 | Address1: 612 BANJARA HILLS NEAR PARK | City1: COIMBATORE | State1: TAMIL NADU | ZIP1: 641039 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9358550137 | Phone1: | Phone2: | Phone3: | Phone4: | Email: nitinrao@gmail.com | Email1: | Email2: | Email3: | Email4: '],
['Name: Neha Gupta | First: Neha | Middle: | Last: Gupta | Gender: F | DOB: 1981-04-13 | Spouse: | Mother: | Father: | Company: HEALTHCARE SYSTEMS | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: HOUSE 57, BLOCK A kanpur | City: KANPUR | State: UTTAR PRADESH | ZIP: 208001 | Address1: HOUSE 57, BLOCK A kanpur | City1: KANPUR | State1: UTTAR PRADESH | ZIP1: 208001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9423646570 | Phone1: | Phone2: | Phone3: | Phone4: | Email: nehagupta@gmail.com | Email1: | Email2: | Email3: | Email4: ', 'Name: Vikram Gupta | First: Vikram | Middle: | Last: Gupta | Gender: M | DOB: 1993-06-04 | Spouse: | Mother: | Father: | Company: TECH SOLUTIONS PVT LTD | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: PLOT 510, NEAR TEMPLE delhi | City: DELHI | State: DELHI | ZIP: 110001 | Address1: PLOT 510, NEAR TEMPLE delhi | City1: DELHI | State1: DELHI | ZIP1: 110001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9288650860 | Phone1: | Phone2: | Phone3: | Phone4: | Email: vikramgupta@yahoo.com | Email1: | Email2: | Email3: | Email4: '],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank several candidate records against a single query record
ranks = model.rank(
'Name: Rohan Rao Sharma | First: Rohan | Middle: Rao | Last: Sharma | Gender: M | DOB: 1983-08-04 | Spouse: | Mother: | Father: | Company: ORACLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 551 KORAMANGALA NEAR MALL | City: AHMEDABAD | State: GUJARAT | ZIP: 380071 | Address1: 551 KORAMANGALA NEAR MALL | City1: AHMEDABAD | State1: GUJARAT | ZIP1: 380071 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 8641615633 | Phone1: | Phone2: | Phone3: | Phone4: | Email: rohan.sharma@gmail.com | Email1: | Email2: | Email3: | Email4: ',
[
'Name: Rohan Rao Sharma | First: Rohan | Middle: Rao | Last: Sharma | Gender: M | DOB: 1983-08-04 | Spouse: | Mother: | Father: | Company: ORACLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 551 KORAMANGALA NEAR MALL | City: AHMEDABAD | State: GUJARAT | ZIP: 380071 | Address1: 551 KORAMANGALA NEAR MALL | City1: AHMEDABAD | State1: GUJARAT | ZIP1: 380071 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: | Phone1: | Phone2: | Phone3: | Phone4: | Email: rohan.sharma@yahoo.com | Email1: | Email2: | Email3: | Email4: ',
'Name: Priya Reddy | First: Priya | Middle: | Last: Reddy | Gender: F | DOB: 1983-07-03 | Spouse: | Mother: | Father: | Company: GLOBAL INFOTECH PVT LTD | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: FLAT 401, LAKE APARTMENT, 24 MAIN ROAD | City: LUCKNOW | State: UTTAR PRADESH | ZIP: 226001 | Address1: FLAT 401, LAKE APARTMENT, 24 MAIN ROAD | City1: LUCKNOW | State1: UTTAR PRADESH | ZIP1: 226001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9208449460 | Phone1: | Phone2: | Phone3: | Phone4: | Email: priyareddy@gmail.com | Email1: priya.reddy@globalinfotech.com | Email2: | Email3: | Email4: ',
'Name: Kiran Chopra | First: Kiran | Middle: | Last: Chopra | Gender: M | DOB: 1986-04-13 | Spouse: | Mother: | Father: | Company: GOOGLE INDIA | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 524 MG ROAD NEAR SCHOOL | City: NAGPUR | State: MAHARASHTRA | ZIP: 440013 | Address1: 524 MG ROAD NEAR SCHOOL | City1: NAGPUR | State1: MAHARASHTRA | ZIP1: 440013 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: | Phone1: | Phone2: | Phone3: | Phone4: | Email: kiranchopra@company.in | Email1: | Email2: | Email3: | Email4: ',
'Name: Nitin Rao | First: Nitin | Middle: | Last: Rao | Gender: M | DOB: 1987-12-02 | Spouse: | Mother: | Father: | Company: RELIANCE JIO | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: 612 BANJARA HILLS NEAR PARK | City: COIMBATORE | State: TAMIL NADU | ZIP: 641039 | Address1: 612 BANJARA HILLS NEAR PARK | City1: COIMBATORE | State1: TAMIL NADU | ZIP1: 641039 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9358550137 | Phone1: | Phone2: | Phone3: | Phone4: | Email: nitinrao@gmail.com | Email1: | Email2: | Email3: | Email4: ',
'Name: Vikram Gupta | First: Vikram | Middle: | Last: Gupta | Gender: M | DOB: 1993-06-04 | Spouse: | Mother: | Father: | Company: TECH SOLUTIONS PVT LTD | ParentCompany: | TaxID: | LicenseID: | PassportID: | Address: PLOT 510, NEAR TEMPLE delhi | City: DELHI | State: DELHI | ZIP: 110001 | Address1: PLOT 510, NEAR TEMPLE delhi | City1: DELHI | State1: DELHI | ZIP1: 110001 | Address2: | City2: | State2: | ZIP2: | Address3: | City3: | State3: | ZIP3: | Address4: | City4: | State4: | ZIP4: | Phone: 9288650860 | Phone1: | Phone2: | Phone3: | Phone4: | Email: vikramgupta@yahoo.com | Email1: | Email2: | Email3: | Email4: ',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
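The example inputs above are flat strings of `Key: value` fields joined by ` | `. A minimal sketch of building such a string from a dictionary is given here. `FIELDS` and `format_record` are hypothetical names, not part of this model or of Sentence Transformers, and the field order is simply copied from the examples. The exact whitespace used for empty fields in the training data is an assumption.

```python
# Hypothetical helper: the field order is copied from the example records above.
FIELDS = (
    ["Name", "First", "Middle", "Last", "Gender", "DOB", "Spouse", "Mother", "Father",
     "Company", "ParentCompany", "TaxID", "LicenseID", "PassportID",
     "Address", "City", "State", "ZIP"]
    + [f"{name}{i}" for i in range(1, 5) for name in ("Address", "City", "State", "ZIP")]
    + ["Phone"] + [f"Phone{i}" for i in range(1, 5)]
    + ["Email"] + [f"Email{i}" for i in range(1, 5)]
)


def format_record(record: dict) -> str:
    """Serialize a record dict into one 'Key: value | Key: value' string.

    Missing fields are written with an empty value (assumption: this mirrors
    how empty fields appear in the examples).
    """
    parts = [f"{field}: {record.get(field, '')}" for field in FIELDS]
    return " | ".join(parts)


# Illustrative record with only a few fields filled in
record = {
    "Name": "Rohan Rao Sharma",
    "First": "Rohan",
    "Middle": "Rao",
    "Last": "Sharma",
    "Gender": "M",
    "DOB": "1983-08-04",
    "Company": "ORACLE INDIA",
}
text = format_record(record)
print(text)
```

Two strings produced this way can be passed as one pair to `model.predict`, or as the query and candidate documents to `model.rank`.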
Evaluation
Metrics
Cross Encoder Classification
| Metric | Value |
|:-------|:------|
| accuracy | 1.0 |
| accuracy_threshold | 0.4994 |
| f1 | 1.0 |
| f1_threshold | 0.4994 |
| precision | 1.0 |
| recall | 1.0 |
| average_precision | 1.0 |
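The reported thresholds can be used to turn scores into match / no-match decisions. The sketch below uses made-up scores and labels, not the model's actual outputs. `classify` and `precision_recall_f1` are hypothetical helpers, and treating a score at or above the threshold as a match is an assumption about how the threshold is applied.

```python
def classify(scores, threshold=0.4994):
    """Label a pair as a match (1) when its score reaches the threshold."""
    return [1 if score >= threshold else 0 for score in scores]


def precision_recall_f1(labels, predictions):
    """Compute precision, recall and F1 for binary labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up scores and gold labels, for illustration only
example_scores = [0.98, 0.91, 0.03, 0.01, 0.02]
example_labels = [1, 1, 0, 0, 0]

predictions = classify(example_scores)
accuracy = sum(
    1 for y, p in zip(example_labels, predictions) if y == p
) / len(example_labels)
precision, recall, f1 = precision_recall_f1(example_labels, predictions)
print(predictions, accuracy, precision, recall, f1)
```

In practice the `example_scores` list would be replaced by the output of `model.predict(pairs)`.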
Training Details
Training Dataset
Unnamed Dataset
Evaluation Dataset
Unnamed Dataset
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy: steps
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 3e-05
weight_decay: 0.05
warmup_ratio: 0.2
warmup_steps: 0.2
dataloader_num_workers: 3
load_best_model_at_end: True
dataloader_pin_memory: False
dataloader_persistent_workers: True
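As a rough illustration of what these settings imply, the sketch below derives the effective batch size and a linear warmup / decay learning-rate curve. It assumes a single training device, takes the total of 24 optimizer steps from the Training Logs, and assumes the 0.2 warmup fraction is rounded up to whole steps. `linear_schedule_lr` is a hypothetical helper, not the trainer's own scheduler.

```python
import math

per_device_train_batch_size = 4
gradient_accumulation_steps = 4

# Assumption: one device, so each optimizer step sees 4 * 4 examples
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

base_lr = 3e-05
total_steps = 24  # last step listed in the Training Logs
# Assumption: the warmup fraction (0.2) is rounded up to whole steps
warmup_steps = math.ceil(0.2 * total_steps)


def linear_schedule_lr(step):
    """Linear warmup to base_lr, then linear decay to zero (sketch)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 3e-05 once warmup ends and reaches zero at the final step.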
All Hyperparameters
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 4
per_device_eval_batch_size: 8
gradient_accumulation_steps: 4
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 3e-05
weight_decay: 0.05
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 3
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: 0.2
warmup_steps: 0.2
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 3
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: False
dataloader_persistent_workers: True
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | rich-entity-matching-lora_average_precision |
|:------|:-----|:--------------|:----------------|:--------------------------------------------|
| 0.5 | 4 | 0.0774 | - | - |
| 1.0 | 8 | 0.0427 | - | - |
| 1.5 | 12 | 0.0678 | - | - |
| 1.875 | 15 | - | 0.0067 | 1.0 |
| 2.0 | 16 | 0.0284 | - | - |
| 2.5 | 20 | 0.0091 | - | - |
| 3.0 | 24 | 0.0860 | - | - |
Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.2.3
- Transformers: 5.0.0
- PyTorch: 2.10.0+cpu
- Accelerate: 1.12.0
- Datasets: 4.8.3
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}