• @[email protected]
    English
    28
    23 days ago

    How many times is this same article going to be written? Model collapse from synthetic data is not a concern at any scale when human data is in the mix. We now have entire series of models trained on mostly synthetic data: https://huggingface.co/docs/transformers/main/model_doc/phi3. When training only on unassisted model outputs, error accumulates with each generation, but this isn’t a concern in any real scenario.
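
    For anyone who wants to see the "error accumulates with each generation" part for themselves, here is a rough toy sketch. It is not an LLM, just a Gaussian that gets refit to its own samples over and over. The names `fit` and `run`, the sample size, and the 50% human fraction are all made up for illustration.

```python
import math
import random


def fit(samples):
    """Fit a Gaussian by maximum likelihood; return (mean, std)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(var)


def run(generations, n, human_fraction, seed=0):
    """Refit a Gaussian each generation on a mix of its own samples and fresh
    'human' samples from the true N(0, 1). Returns the final fitted std."""
    rng = random.Random(seed)
    n_human = int(n * human_fraction)
    # Generation 0 trains on real data only.
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(generations):
        mean, std = fit(data)
        # Synthetic data comes from the current model, not the true source.
        synthetic = [rng.gauss(mean, std) for _ in range(n - n_human)]
        # Human data (if any) is always drawn from the true distribution.
        human = [rng.gauss(0.0, 1.0) for _ in range(n_human)]
        data = synthetic + human
    return fit(data)[1]


if __name__ == "__main__":
    print("synthetic only:", run(2000, 100, human_fraction=0.0))
    print("half human:    ", run(2000, 100, human_fraction=0.5))
```

    With `human_fraction=0.0` the fitted spread should shrink toward zero as the generations pile up, while mixing in fresh human samples should keep it near the true value of 1. That is only a cartoon of the real thing, but it is the same basic mechanism.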

    • Something Burger 🍔
      English
      34
      23 days ago

      As the number of articles about this exact subject increases, so does the likelihood of AI only being able to write about this very subject.