• 4 Posts
  • 35 Comments
Joined 1 year ago
cake
Cake day: July 2nd, 2023

help-circle


















  • We’ve reached far beyond practical necessity in model sizes for Moore’s Law to apply there. That is, model sizes have become so huge that they are performing at 99% of the capability they ever will be able to.

    Context size however, has a lot farther to go. You can think of context size as “working memory” where model sizes are more akin to “long term memory”. The larger the context size, the more a model is able to understand beyond the scope of it’s original model training in one go.