What is prefill and decoding (pre-fill-and-decoding)?

Question

Accepted Answer

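Prefill and decoding are the two main phases of large language model inference. Prefill processes the input text (the prompt), and decoding generates the output tokens one at a time. The two phases stress hardware differently: prefill is typically compute-bound, while decoding is typically limited by memory bandwidth. Running each phase on hardware optimized for it can therefore improve overall efficiency.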
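To make the split concrete, here is a minimal sketch in Python. It is not a real model: `toy_kv`, `toy_next_token`, and `VOCAB_SIZE` are hypothetical stand-ins for the attention keys/values and sampling step of an actual LLM, and the cache is a plain list. It only illustrates the control flow, where prefill handles the whole prompt in one pass and decoding then extends the cache one token per step.

```python
from typing import List, Tuple

VOCAB_SIZE = 50  # hypothetical toy vocabulary size

KVCache = List[Tuple[int, int]]


def toy_kv(token: int, position: int) -> Tuple[int, int]:
    """Stand-in for the key/value pair a real attention layer would compute."""
    return (token * 31 + position) % 97, (token * 17 + position * 3) % 89


def toy_next_token(kv_cache: KVCache) -> int:
    """Stand-in for attention plus sampling: reads the whole cache, returns one token."""
    return sum(k * v for k, v in kv_cache) % VOCAB_SIZE


def prefill(prompt: List[int]) -> Tuple[KVCache, int]:
    """Process every prompt token in a single pass and build the cache.

    Returns the cache together with the first generated token.
    """
    kv_cache = [toy_kv(tok, pos) for pos, tok in enumerate(prompt)]
    return kv_cache, toy_next_token(kv_cache)


def decode(kv_cache: KVCache, first_token: int, max_new_tokens: int) -> List[int]:
    """Generate tokens one at a time; each step adds exactly one cache entry."""
    output = [first_token]
    while len(output) < max_new_tokens:
        # Only the newest token is processed; earlier entries are reused from the cache.
        kv_cache.append(toy_kv(output[-1], len(kv_cache)))
        output.append(toy_next_token(kv_cache))
    return output


def generate(prompt: List[int], max_new_tokens: int) -> List[int]:
    """Run prefill once, then decode until max_new_tokens tokens exist."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    kv_cache, first_token = prefill(prompt)
    return decode(kv_cache, first_token, max_new_tokens)


if __name__ == "__main__":
    print(generate([3, 1, 4, 1, 5], max_new_tokens=8))
```

In this sketch, `prefill` touches every prompt token at once, which is the part that parallelizes well, while each `decode` step does a small amount of new work but reads the entire cache. That asymmetry is the usual motivation for placing the two phases on differently optimized hardware.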
