Building open-source AI for 500,000+ BPY speakers worldwide
সাম্ভাষা! | Thank you for helping build AI for Bishnupriya Manipuri.
Submit English↔BPY sentence pairs
Training scripts, evaluation, apps
Report translation errors
Data is the most important contribution: more data means better models. We accept data in three ways:
Submit 1-10 sentence pairs via email: usangraha@gmail.com
Subject: BPY Data Submission
English: The sun is hot.
BPY: বেলীগ তপ্তা ইসে।
English: Fifty books.
BPY: য়াংখেইহান লেরিক।
For 50+ pairs, create a CSV and email it to usangraha@gmail.com
CSV Format: bpy_training_data.csv
english,bpy_beng,source,notes
My name is John.,মর নাঙহান জন।,contributor_arunita,
The cat is sleeping.,মেকুরগ ঘুমজার।,wikipedia,
Fifty books.,য়াংখেইহান লেরিক।,book_scan_v1,number pattern
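Before emailing a CSV, it may help to check it locally. The sketch below is one possible check, not an official project tool: the function name `validate_pairs_csv` is hypothetical, and the rules it enforces (exact header, four fields per row, non-empty `english`, `bpy_beng`, and `source`) are assumptions inferred from the sample rows above.

```python
import csv
import io

# Header taken from the CSV format shown above.
EXPECTED_HEADER = ["english", "bpy_beng", "source", "notes"]


def validate_pairs_csv(text):
    """Return a list of problems found in the CSV text; an empty list means none."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        problems.append("header should be " + ",".join(EXPECTED_HEADER))
        return problems
    for row in reader:
        line_no = reader.line_num  # physical line number of the row just read
        if not row:
            continue  # ignore blank lines
        if len(row) != len(EXPECTED_HEADER):
            problems.append(f"line {line_no}: expected 4 fields, got {len(row)}")
            continue
        english, bpy, source, _notes = (field.strip() for field in row)
        # Assumption: both sentences are required, notes may stay empty.
        if not english or not bpy:
            problems.append(f"line {line_no}: english and bpy_beng must not be empty")
        # Assumption: every row should name its source, as the samples do.
        if not source:
            problems.append(f"line {line_no}: source is empty")
    return problems
```

A common cause of a wrong field count is an unquoted comma inside a sentence. Wrapping that sentence in double quotes, for example `"Hello, world.",...`, keeps it in a single field.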
Have BPY books? Help us digitize them.
Base: facebook/nllb-200-distilled-600M | Method: LoRA fine-tuning
Ideas we want: BPY→English model, larger base models, multilingual training, evaluation scripts
Setup:
git clone https://huggingface.co/BishnupriyaManipuri/nllb-bpy-beng-v8-5-3-merged
cd nllb-bpy-beng-v8-5-3-merged
pip install transformers peft datasets accelerate
HF Inference Endpoint:
https://hcurzfqqhq3x21kg.us-east-1.aws.endpoints.huggingface.cloud
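A minimal sketch of calling the endpoint from Python, using only the standard library. Several things here are assumptions rather than documented behavior of this deployment: that it accepts the usual Hugging Face `{"inputs": ...}` JSON payload, that a bearer token may or may not be required, and that the response is JSON. The helper names `build_translation_request` and `translate` are hypothetical.

```python
import json
import urllib.request

# Endpoint URL as listed above.
ENDPOINT_URL = "https://hcurzfqqhq3x21kg.us-east-1.aws.endpoints.huggingface.cloud"


def build_translation_request(text, token=None):
    """Build (but do not send) a POST request carrying the text to translate."""
    # Assumption: the endpoint expects the common {"inputs": ...} payload shape.
    payload = json.dumps({"inputs": text}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if token:
        # Assumption: protected endpoints take a Hugging Face token as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        ENDPOINT_URL, data=payload, headers=headers, method="POST"
    )


def translate(text, token=None, timeout=30):
    """Send the request and return the decoded JSON response as-is."""
    request = build_translation_request(text, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect without network access. The response is returned undecoded beyond JSON parsing because its exact shape is not specified here.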
Found a bad translation? Email usangraha@gmail.com
Subject: BPY Translation Bug
Input: Fifty books
Output: লেরিকহান লেরিকহান ❌
Expected: য়াংখেইহান লেরিক ✅
Version: V8.5.3
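The example output above repeats the same word twice. As a rough aid when screening translations for reports like this, the sketch below flags outputs that contain the same token twice in a row. It is only a heuristic and not the project's evaluation method: the name `has_adjacent_repeat` is hypothetical, tokens are split on whitespace, and legitimate reduplication in BPY would also be flagged, so a human should review every hit.

```python
def has_adjacent_repeat(translation):
    """Return True if any whitespace-separated token is immediately repeated."""
    # Treat the danda (।) as a separator so sentence-final punctuation
    # does not hide a repeated word.
    tokens = translation.replace("।", " ").split()
    return any(first == second for first, second in zip(tokens, tokens[1:]))
```

Running it over the two strings from the report above separates the faulty output from the expected one.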
By contributing, you agree:
We will NOT: sell your data, use it for non-BPY purposes, or share your email without permission