Rahul Gupta
How to get a free GPU and train a spaCy model?

We have all been there: you have an interesting dataset that you want to train your shiny new model on, but no dedicated GPU — in my case, a 2015 MacBook. Unless you regularly run graphics-intensive applications such as games or numerical processing software, buying a dedicated GPU doesn't make much sense.

Luckily, there are plenty of remote GPU options available. Depending on your use case, you can choose the one that fits your needs:

| Option | Pros | Cons |
| --- | --- | --- |
| Cloud providers (GCloud, AWS, Azure) | Flexible; your data persists | Higher ramp-up time |
| Colaboratory notebook | Good documentation | Short runtimes, slow GPU, not good for long training jobs |
| JupyterHub | Open source, multi-language support | No free GPU support |
| Kaggle Notebooks | 43 free hours of GPU compute | Data I/O to the machine is a little inconvenient |

So, today we will talk about how to use a GPU on Kaggle to train a spaCy model for the Hindi language. The biggest challenge in training a model is getting clean data that accurately represents your machine learning problem. Let's do a quick search to get a list of the available datasets.

A quick search on GitHub for "Hindi tagger" yields a handful of repositories.

After browsing through these datasets, you will notice that most of them are relatively small and follow inconsistent tagging schemes that are incompatible with spaCy's input data format. Luckily, there is another dataset we can use: the Hindi Universal Dependencies treebank, used in the CoNLL shared tasks. From its README:

Summary

The Hindi UD treebank is based on the Hindi Dependency Treebank (HDTB) created at IIIT Hyderabad, India.

Introduction

The Hindi Universal Dependency Treebank was automatically converted from Hindi Dependency Treebank (HDTB) which is part of an ongoing effort of creating multi-layered treebanks for Hindi and Urdu. HDTB is developed at IIIT-H India.

Acknowledgments

The project is supported by NSF Grant (Award Number: CNS 0751202; CFDA Number: 47.070).

Any publication reporting the work done using this data should cite the following references:

Riyaz Ahmad Bhat, Rajesh Bhatt, Annahita Farudi, Prescott Klassen, Bhuvana Narasimhan, Martha Palmer, Owen Rambow, Dipti Misra Sharma, Ashwini Vaidya, Sri Ramagurumurthy Vishnu, and Fei Xia. The Hindi/Urdu Treebank Project. In the Handbook of Linguistic Annotation (edited by Nancy Ide and James Pustejovsky), Springer Press

@InCollection{bhathindi,
  Title     = {The Hindi/Urdu Treebank Project},
  Author    = {Bhat, Riyaz Ahmad and Bhatt, Rajesh and Farudi, Annahita and Klassen, Prescott and Narasimhan, Bhuvana and Palmer, Martha and Rambow, Owen and Sharma, Dipti Misra and Vaidya, Ashwini and Vishnu, Sri Ramagurumurthy and Xia, Fei},
  Booktitle = {Handbook of Linguistic Annotation},
  Editor    = {Ide, Nancy and Pustejovsky, James},
  Publisher = {Springer Press}
}

Browsing the stats.xml file in the repository gives us an overview of the different POS tags available in the dataset.

Let's open a Kaggle notebook and enable the GPU for the session from the three-dots menu > Accelerator > GPU. Note that there is a TPU option as well, but TPUs can only be used with Keras and TensorFlow models. spaCy uses neither; it has its own custom neural network library, Thinc.

Let's clone the treebank repository with the command below in the Kaggle notebook. This downloads the data from the repo into the working directory.

! git clone https://github.com/UniversalDependencies/UD_Hindi-HDTB

Let's quickly check that we have access to a GPU:

import tensorflow as tf
tf.test.gpu_device_name()  # prints something like '/device:GPU:0' when a GPU is attached
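TensorFlow just happens to be preinstalled on Kaggle, which makes it a convenient check; spaCy itself goes through Thinc and CuPy instead. A minimal sketch of the same check from spaCy's side (assuming spaCy v2.0.14 or later, where prefer_gpu() is available):

import spacy

# True if Thinc/CuPy can see a CUDA device, False otherwise
print(spacy.prefer_gpu())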

spaCy expects training input data to be in the form of JSON documents, but our downloaded data is in the .conllu format. So, we will use spacy convert to produce JSON:

! mkdir data
! spacy convert UD_Hindi-HDTB/hi_hdtb-ud-dev.conllu data
! spacy convert UD_Hindi-HDTB/hi_hdtb-ud-train.conllu data
! spacy convert UD_Hindi-HDTB/hi_hdtb-ud-test.conllu data
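If you are curious what the converted files look like, you can peek at one (a sketch; the docs > paragraphs > sentences > tokens nesting below is spaCy v2's training JSON layout, and the file name follows from the convert commands above):

import json

with open("data/hi_hdtb-ud-dev.json") as f:
    docs = json.load(f)

# each doc holds paragraphs, which hold sentences, which hold token dicts
# with 'orth', 'tag', 'head', 'dep' and 'ner' fields
first_sentence = docs[0]["paragraphs"][0]["sentences"][0]
print(first_sentence["tokens"][:3])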

Now we are all set up to start training the model:

! spacy train hi model_dir data/hi_hdtb-ud-train.json data/hi_hdtb-ud-dev.json  -g 0

Don't forget to pass the -g 0 argument to enable GPU usage for training. The trained model is saved in the model_dir directory. Training ran about 6x faster on the GPU than on my local machine; there are probably ways to make it faster still, since the job on the Kaggle notebook was CPU-bound. In any case, the whole job finished in about half an hour.
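Since we also converted the test split above, a quick way to sanity-check the final model is spaCy's evaluate command (a sketch in the same CLI style; -g 0 again selects the GPU):

! spacy evaluate model_dir/model-best data/hi_hdtb-ud-test.json -g 0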

Let's load the model and run some inference:

from spacy.lang.hi import Hindi
from spacy.gold import docs_to_json

# build a blank Hindi pipeline with the same components we trained
nlp_hi = Hindi()
nlp_hi.add_pipe(nlp_hi.create_pipe('tagger'))
nlp_hi.add_pipe(nlp_hi.create_pipe('parser'))
nlp_hi.add_pipe(nlp_hi.create_pipe('ner'))

# load the trained weights from disk
nlp_hi = nlp_hi.from_disk("model_dir/model-best/")


sentence = "मैं खाना खा रहा हूँ।"
doc = nlp_hi(sentence)
print(docs_to_json([doc]))
# ...
# {'id': 0, 'orth': 'मैं', 'tag': 'PRP', 'head': 2, 'dep': 'nsubj', 'ner': 'O'}
# ...
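You can also inspect the predictions directly on the Doc object instead of converting to JSON:

for token in doc:
    print(token.text, token.tag_, token.dep_, token.head.text)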

After the run finishes, let's gzip the model and download it locally from the file-viewer pane on the right in the Kaggle notebook.

! tar -cvzf model.tgz model_dir/model-best 
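Back on your local machine, extracting the archive recreates the model directory, which then loads exactly like the inference snippet above (a sketch; the paths follow the tar command):

# hypothetical local session, after downloading model.tgz
#   tar -xvzf model.tgz   # recreates model_dir/model-best/
from spacy.lang.hi import Hindi

nlp = Hindi()
for name in ("tagger", "parser", "ner"):
    nlp.add_pipe(nlp.create_pipe(name))
nlp = nlp.from_disk("model_dir/model-best/")
print([(t.text, t.tag_) for t in nlp("मैं खाना खा रहा हूँ।")])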

Hurray!

Here is the Kaggle notebook link, if you want to play around.
https://www.kaggle.com/rahul1990gupta/training-a-spacy-hindi-model?scriptVersionId=41283884
