Conference

Corallo, Giulio; Papotti, Paolo

FINCH: Prompt-guided key-value cache compression for large language models

EMNLP 2024, Conference on Empirical Methods in Natural Language Processing, 12-16 November 2024, Miami, Florida, USA / Also published in Transactions of the Association for Computational Linguistics, Vol. 12, 2024