# 06-technical-discussion
Yo, where my data engineers at!? Anybody have experience working with WordNet? I'm thinking about synthin' out this dataset (expanding it with better examples) via a well-read OSS model (one trained on lots of literary works), for faster grokking during pretraining (either a new model or continued pretraining of an existing OSS one). I also have some questions on phases: would it be better to format this into a regular print dictionary layout (like a physical Webster's) and pretrain (next token) for p1, then random [MASK] for p2, then instruct with system prompts for generating example sentences or right/wrong usage for p3? Or is there a faster way that has recently been published?
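To make the three phases concrete, here's a rough sketch of what one record could look like at each stage. Everything in it is an assumption for illustration: the toy entries are hand-written placeholders (not real WordNet glosses), and `format_entry`, `mask_tokens`, and `instruct_example` are hypothetical helper names. In practice the entries would probably come from WordNet itself, e.g. NLTK's `synset.definition()` and `synset.examples()`, and masking would happen on tokenizer IDs rather than whitespace tokens.

```python
import random

# Hypothetical toy entries (placeholders, not actual WordNet data).
ENTRIES = [
    {
        "word": "grok",
        "pos": "v",
        "definition": "to understand something thoroughly",
        "examples": ["she grokked the proof after one read"],
    },
    {
        "word": "lexicon",
        "pos": "n",
        "definition": "the vocabulary of a language",
        "examples": [],
    },
]


def format_entry(entry):
    """p1: render one entry as a print-dictionary style line for next-token pretraining."""
    line = f'{entry["word"]} ({entry["pos"]}.) {entry["definition"]}'
    examples = "; ".join(f'"{ex}"' for ex in entry["examples"])
    return f"{line}: {examples}" if examples else line


def mask_tokens(text, rate=0.15, seed=0):
    """p2: replace a random subset of whitespace tokens with [MASK].

    Returns (masked_text, original_tokens) so the originals can serve as labels.
    """
    rng = random.Random(seed)
    tokens = text.split()
    masked = ["[MASK]" if rng.random() < rate else tok for tok in tokens]
    return " ".join(masked), tokens


def instruct_example(entry):
    """p3: build a chat-style record asking for an example sentence."""
    return {
        "system": "You are a dictionary assistant. Give one example sentence for the word.",
        "user": f'Use "{entry["word"]}" ({entry["pos"]}.) in a sentence.',
        "assistant": entry["examples"][0] if entry["examples"] else "",
    }


if __name__ == "__main__":
    for entry in ENTRIES:
        text = format_entry(entry)
        print(text)
        print(mask_tokens(text, rate=0.3, seed=1)[0])
        print(instruct_example(entry))
```

Entries with an empty `examples` list (like the second one) are where the synthetic-expansion step would slot in: the OSS model would fill those in before p3.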